Sep 1, 2005
Lies, Damned Lies, and Statistics
Posted in: Miscellaneous
I do understand that human beings are not intuitively good at statistics. Daniel Kahneman won the Nobel Prize in Economics in 2002 for showing, among other things, that people in general are terrible at statistical thinking. From Kahneman's Nobel biography:
The standard example of a framing problem, which was developed quite early, is the 'lives saved, lives lost' question, which offers a choice between two public-health programs proposed to deal with an epidemic that is threatening 600 lives: one program will save 200 lives, the other has a 1/3 chance of saving all 600 lives and a 2/3 chance of saving none. In this version, people prefer the program that will save 200 lives for sure. In the second version, one program will result in 400 deaths, the other has a 2/3 chance of 600 deaths and a 1/3 chance of no deaths. In this formulation most people prefer the gamble. If the same respondents are given the two problems on separate occasions, many give incompatible responses. When confronted with their inconsistency, people are quite embarrassed. They are also quite helpless to resolve the inconsistency.
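The punchline of the framing problem is that the two versions describe numerically identical choices. A quick sanity check (a sketch, with the numbers taken straight from the quote above):

```python
# Kahneman's two framings of the same epidemic problem, 600 lives at stake.
# "Saved" framing:  200 saved for sure  vs. 1/3 chance all 600 are saved.
# "Deaths" framing: 400 die for sure    vs. 2/3 chance of 600 deaths.
total = 600

sure_saved = 200
gamble_saved = (1/3) * 600 + (2/3) * 0          # expected lives saved

sure_deaths = 400
gamble_deaths = (2/3) * 600 + (1/3) * 0         # expected deaths

# Every option has the same expected outcome: 200 saved, 400 lost.
print(sure_saved, gamble_saved)                  # 200 200.0
print(total - sure_deaths, total - gamble_deaths)  # 200 200.0
```

Same numbers, opposite preferences, purely because of the wording.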
Still, just because other people are bad at statistics doesn't mean I'll accept that as an excuse from you as a prospective MIT applicant. Most MIT majors require or suggest a course in probability and/or statistics, so you might as well get a head start in statistical thinking now.
First, a few facts on which to chew:
1. The overall admission rate for the class of 2009 was 14.3%. (From here.)
2. Applicants who interviewed (or had their interview waived) had a 19% admission rate; those who didn't interview had a 7% admission rate. (I don't have a citation for this, which is sketchy, so feel free not to believe me. But although I can't remember where I found the numbers, this is close enough to the truth for the purposes of this entry.)
3. Applicants with SAT scores in the 88th percentile (roughly a 1290 old SAT) have about a 5% admission rate, while those with perfect scores have about a 50% admission rate. (From here -- a very fun read, if you're into this kind of thing. I highly suggest it!)
So does this mean that you can pour all of your personal data into some magic admissions algorithm and have it spit out a number which reflects your chances of getting into MIT?
First of all, no. And even if it could, the number wouldn't matter. For example, if the computer said that you had a 33% chance, that would mean that if you applied to MIT many times, you would expect to get in on approximately 1 in 3 of those tries. (And we're not talking "if you applied 3 times" here. I think applying 500 times would probably give a good estimate, but I don't feel like playing around with Matlab to see if that's true.) Of course, you can't apply to MIT 500 times in a single year, or even in your lifetime, so it's pointless to try to stick a number on your chances at MIT.
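If you do feel like playing around (in Python rather than Matlab), the point is easy to demonstrate: a hypothetical 33% "admission probability" only shows up reliably in the observed success rate once the number of tries gets large. This is just the law of large numbers, with a made-up probability for illustration:

```python
import random

random.seed(0)
p = 1/3  # hypothetical "admission probability" -- an illustration, not a real figure

def admit_fraction(n_apps):
    """Fraction of n_apps independent simulated applications that succeed."""
    return sum(random.random() < p for _ in range(n_apps)) / n_apps

# With only 3 tries, the observed rate swings wildly between repeats...
print([round(admit_fraction(3), 2) for _ in range(5)])
# ...but with 500 tries per repeat it settles close to 1/3.
print([round(admit_fraction(500), 3) for _ in range(5)])
```

With 3 applications you can easily see 0%, 33%, or 100% admitted; with 500 you'll hover near 33%. Since nobody gets 500 independent tries, the "33% chance" never has a chance to mean anything for an individual applicant.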
I guess the moral of the story here is that no one is a shoo-in for MIT, but the opposite is also true -- nobody should think they have no hope. But it's pointless to over-think this issue, because you just can't control for all the variables.
For what it's worth, my Super Getting into MIT Guide goes something like this:
1. Do something that you really care about, and make sure you write about it glowingly on your application.
2. Interview, and don't be lame and fake at said interview.
3. Get good scores on the SAT I and SAT IIs.
4. Take difficult classes at your high school (or even local community college) and get good grades in them.
And, of course, you can get into MIT if you only have three of these four characteristics... you can get in if you only have two... you can get in if you only have one. But even if you have four, you're not a sure thing.
My final statistics lesson has to do with something you may have heard -- that MIT supposedly has a stratospherically high suicide rate. This is a contention supported by the Boston Globe, a group of stellar journalists, I'm sure, but not so good at the statistics thing. (I can't find the original Globe article, but the article here makes all the points the original article made.) The Globe basically looked at the MIT suicide rate between 1990 and 1999, compared it to suicide rates at other schools, and decided it was too high. (Let's just say there's a reason the Globe article wasn't published in a scientific journal. Sweeping conclusions backed up by questionable data like that make scientists -- including me -- want to bang their heads on hard surfaces.)
Now let's look at some problems with the Globe's grandiose conclusions:
1. People who successfully commit suicide are significantly more likely to be young and male. In the 1990s, the average MIT student was both those things; since then, the population has famously evened out. (Source here; relevant quote: "In fact, MIT's suicide rate is below the national average if one adjusts figures for the school's overwhelmingly male student body [during the years of the study].")
2. Moreover, science, engineering, and business students have significantly higher suicide rates than do liberal arts students. MIT undergraduates are almost exclusively science, engineering, and/or business majors. Given that both those things are true, one would expect MIT to have a high suicide rate based on those demographics alone. (Source here; relevant quote: "Based on 10 undergraduate suicides over 11 years, the article concludes that suicide is a greater danger at MIT than elsewhere. When one factors in that science and business students have considerably higher suicide rates than liberal arts students, and that male college students kill themselves five times more often than female college students, the figures quoted prove nothing. MIT is cited as currently being composed of 59 percent male students; that fact alone would make the suicide rate differences with most other colleges understandable; but in the early 1990s an even higher percentage of the students at MIT were male.")
3. The Globe compared MIT to other schools with engineering programs, which is a terrible control. Other schools have engineering programs, yes, but few other schools have 50% of the undergraduate student body majoring in engineering. Without appropriate controls, you can't draw meaningful conclusions (and it's difficult to think of a school that would be a good control -- Caltech is science/engineering focused too, but using only one school as the control population would be pretty sketchy).
4. Statistics like this are terribly vulnerable to small swings in absolute numbers. The absolute number of suicides is very small, and therefore it takes many of them spread over many years to accurately determine whether or not the rate in one place is higher or lower than the rate in another. (Source here; quote: "Because of small number statistics, the 'true' suicide rate -- i.e., that which would be measured by a very large MIT in the limit of an infinite number of students -- is, to 95% confidence, approximately 100,000*(11 +/- 2*sqrt(11))/48,000. At this level, MIT's suicide rate is consistent with the national average... it would take approximately another thirty three years in order to obtain a measurement of the MIT suicide rate that could be distinguished from the national average at 95% confidence.")
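To make the quoted back-of-the-envelope concrete: 11 suicides over roughly 48,000 student-years, with the usual ±2√N normal approximation to Poisson counting error, gives a very wide confidence interval. A quick sketch of that arithmetic:

```python
import math

# Back-of-the-envelope from the quoted source: 11 suicides observed over
# ~48,000 student-years, rate expressed per 100,000 students per year.
# The +/- 2*sqrt(N) term is the ~95% normal approximation to Poisson error.
observed = 11
student_years = 48_000

rate = 100_000 * observed / student_years                     # ~22.9
half_width = 100_000 * 2 * math.sqrt(observed) / student_years  # ~13.8

print(f"{rate:.1f} +/- {half_width:.1f} per 100,000 per year")
# i.e. anywhere from roughly 9 to 37 per 100,000 per year
```

An interval that wide easily overlaps typical national figures for young men, which is exactly why the Globe's comparison proves nothing either way.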
So now you know. Go out, and tell my story to the masses. ;)