Tuesday, March 27, 2007

On diligence

Today's class discussion, particularly the points on recording and entering study data in a timely manner, was very interesting and relevant to me. While I haven't done a lot of research, I do keep track of a lot of "data" (some of it numeric, some of it just info that I need) in my daily life. A lot of this stuff is scribbled down onto scrap paper, into a notepad file, or onto my computer's desktop (more on that in a second), and unfortunately, much of it is forgotten. Of course, I know that I should transfer it into whatever its final form will be (a spreadsheet, Quicken, my Amazon list, an email, an assignment, a blog post, my cell phone) as soon as possible, but I rarely do so. It was good to hear this reiterated.

So, I decided to clear up some of the information that's clogging up my computer right now. I have a folder on my desktop devoted to 2.5 years' worth of notepad files, ranging all the way from "$ spent updat" (which not only has records of money spent, but some weightlifting workouts and part of a to-do list??) to "YOU ABSOLUTELY MUST BY 11-05" (in reference to some magazine subscriptions I had to cancel to avoid getting charged). However, there are 318 files. So... I decided on an easier task to tackle first.

I have a program that lets me put virtual post-it notes on my desktop. These notes tend to accumulate, and the problem is exacerbated because whenever I have too much info cluttering up the space, I just stack all the notes on top of each other and forget about the bottom ones. So I decided to clean this up and sort all the important data. After separating all of them, here's the diversity of information I found:
  • Distances to a bunch of landmarks on the levee bike path that I measured with Google Earth.
  • Training paces for various kinds of running (easy, tempo, long, speedwork, etc.) calculated based on my mile time.
  • Data for some stuff I sold on eBay (listing fees, final value fee, PayPal fee, shipping, price charged, profit).
  • No fewer than 48 (!) songs to download. This list has been in progress for a long time.
  • Information about a job on Craigslist.
  • A list of features I was looking for in a heart rate monitor (despite having already bought one that satisfied these requirements nearly two months ago).
  • A to-do list (only one! a good step for me!)
  • Some motivational training quotes.
  • Six identifying phrases/sentences from different song lyrics so that I could google them to find out what the songs are, for eventual download.
  • A library call number. To what? Don't know.
  • Two phone numbers. Whose? Don't know.
  • Some order numbers. For what? Don't know.
  • Random workout information from ages ago, identified in time only by the day of the week since I suppose I expected to have it all entered in my spreadsheet well before I forgot what week it referred to.
  • A few websites I heard/read about in some context and planned to visit. Almost a year ago.
  • The titles of a few studies I wanted to look up.
  • A list of all the categories of information I planned to include in my Excel workout log.
  • The numbers "12" and "6166537". Ok...
Kind of ridiculous. But after a bit of work I have it pared down to four notes: a to-do list, stuff related to exercise, stuff related to music, and a quote that I like to see.

If you stay on top of things - entering data as soon as you get it - you're more likely to keep that habit up, whereas if you make a habit of jotting down quick notes in different places without timely follow-up, that's the habit you're going to stick to instead. The conclusion I've come to is that if the information is important to you, and you're taking the time to monitor it - whether it be for research or just personal stuff - then it should be worth the extra 2 seconds (or even, god forbid, 2 minutes!) to make sure that it's in a format that will be useful to you. Now hopefully I'll be able to abide by this philosophy!

Monday, March 26, 2007

virtue and vice

So, keeping up with a blog is clearly not my forte. I am definitely going to make a concerted effort to stay more on track with this. The stupid thing is, I had tons of ideas for what I was going to make entries about, yet never actually did anything about them because I felt my thoughts weren't entirely fleshed out. The smart thing is, I did write down lots of notes about blog ideas so that I wouldn't forget about everything I'd come up with. The stupid thing is, I put these notes (along with a bunch of other important stuff for school) into my checked luggage over spring break. Which the airline proceeded to lose for EIGHT days. My bag was apparently sent to Birmingham, Alabama, because that's "close enough" to Burlington, Vermont. Thank you, Northwest. The smart thing is - well, less anything smart on my part than dumb luck - I finally got the suitcase back on Saturday (I wonder what portion of people never get their stuff back? Or how your odds of it being returned to you decrease with each passing day?). So I now have all my ideas back! But the stupid thing is, it's a little late for the kind of extended entry I was planning. So it'll have to wait. But I'll leave you with this interesting link that I found while researching some data for the take-home exam:

http://addictedtor.free.fr/graphiques/

It's a collection of graphs made using R, and the diversity and complexity of some (or most) of them are astonishing. I never would have thought it possible to make figures like those with R. Odd how a program that appears so "simple" (as compared to even something like Excel) is so powerful given the right knowledge about how to use it.

Monday, March 5, 2007

improbability

The summer after I graduated from high school I attended a summer program which is known among its alumni for having a number of crazy traditions, including some strange celebratory antics whenever someone has a birthday. The problem during my year? Not one person in the camp had a birthday the entire time. It was a science-oriented camp, so of course somebody had to calculate the probability of this happening. I don't remember exactly what it turned out to be (other than impossibly small), but during Thursday's class I realized that with R, I could figure it out in a matter of seconds.

For simplicity's sake I assumed that every day of the year is equally likely to be someone's birthday (which is not true, but close enough for this estimate). There were 97 delegates, plus 29 staph (not a typo - their enthusiasm is "infectious." ha ha.), for a total of 126 people. The program lasted almost four weeks, or about 26 days.

So, the probability of 0 occurrences of an event when there are 126 independent trials and the probability on each trial is 26/365 is:

pbinom(0,size=126,prob=26/365)

Which turns out to be:

9.041928e-05

Or, less than 1 in 10,000!
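
A quick sanity check on that number, as a minimal sketch under the same assumptions as above (126 people, 26 days, all 365 days equally likely, and everyone's birthday independent of everyone else's). With zero "successes," the binomial probability collapses to (1 - p)^n, so pbinom(0, ...), dbinom(0, ...), and the hand formula should all agree. The variable names here are just made up for illustration:

```r
# Assumptions carried over from the post: 126 people, a 26-day program,
# and every day of a 365-day year equally likely to be a birthday.
n_people <- 126
p_during <- 26 / 365   # chance that one person's birthday falls during the program

# P(nobody has a birthday) = (1 - p)^n: each person independently has to "miss".
p_closed <- (1 - p_during)^n_people

# The same quantity from R's binomial functions.
p_pbinom <- pbinom(0, size = n_people, prob = p_during)  # P(X <= 0)
p_dbinom <- dbinom(0, size = n_people, prob = p_during)  # P(X == 0)

print(c(closed = p_closed, pbinom = p_pbinom, dbinom = p_dbinom))
print(1 / p_closed)   # the "1 in N" form of the same probability
```

All three values should match the 9.041928e-05 above, and flipping it over gives roughly 1 in 11,000.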