GMAT Club Legend
Affiliations: HHonors Diamond, BGS Honor Society
Joined: 05 Apr 2006
Posts: 5881
Schools: Chicago (Booth) - Class of 2009
GMAT 1: 730 Q45 V45
WE: Business Development (Consumer Products)
Followers: 196
Kudos [?]:
1363
[0], given: 7
|
Let the debate begin!!! [#permalink]
15 Nov 2006, 16:06
For GSB, odds of an interview. From 2009 data, source admissions411
Coefficients in question, 1 equating to interview, 0 equating to not.
Intercept -0.460851231
GMAT 0.003873504
VERB 0.0023856
QUANT -0.029936274
TIMES TAKEN 0.066353128
AGE -0.001569024
GPA 0.215000635
Where are nationality and job function, you ask? I'm too lazy to map them to values and use them as dummies. Maybe if I get unlazy I'll do it.
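To make the interview table concrete, here is a minimal sketch of how a fitted linear model like this turns one applicant's numbers into a score. The coefficients are copied from the post; the applicant is hypothetical, and the units (VERB and QUANT as scaled section scores, AGE in years, GPA on a 4.0 scale) are assumptions, since the post does not say how the spreadsheet was coded. A linear fit to a 0/1 outcome is not bounded to the 0..1 range, so the result should not be read as a literal probability.

```python
# Coefficients copied from the GSB interview table in the post.
COEFFS = {
    "intercept": -0.460851231,
    "gmat": 0.003873504,
    "verb": 0.0023856,
    "quant": -0.029936274,
    "times_taken": 0.066353128,
    "age": -0.001569024,
    "gpa": 0.215000635,
}


def predict(applicant, coeffs=COEFFS):
    """Return intercept + sum(coefficient * value) for one applicant."""
    score = coeffs["intercept"]
    for name, value in applicant.items():
        score += coeffs[name] * value
    return score


# Hypothetical applicant; every value and unit here is an assumption.
applicant = {
    "gmat": 730,
    "verb": 45,
    "quant": 45,
    "times_taken": 1,
    "age": 27,
    "gpa": 3.5,
}

print(predict(applicant))
```

Changing one input by one unit moves the score by exactly that variable's coefficient, which is all the table is really saying.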
From 2008 data, odds of acceptance, assuming unknown = ding. Again 1 equating to accept, 0 to not.
Coefficients
Intercept -0.501643016
GMAT 0.000804336
VERB 0.008032123
QUANT -0.005540867
TIMES TAKEN -0.023042997
AGE -0.026421802
GPA 0.260872977
ALUMNI RECS 0.015136345
This is decidedly disconcerting if there's any truth to it. It places my odds at pretty crappy right now.
On the other hand, Adjusted R Square 0.08175456
The whole thing means nothing, cause I did a sloppy job.
Some other tidbits I need to do to add some value:
Set up an over-21 proxy for age, or change it to years of experience assuming a graduation age of 21. (Right now the implication is that being 10 years old increases my odds.)
Set up proxies for industry and country.
Pull in a much larger data set across the top 20 schools and see what happens with a data set on the order of 2000+.
Maybe if I have some time tomorrow I'll do it.
OK, I did the work exp thing. No real change here:
Coefficients
Intercept -1.056500848
SCORE 0.000804336
VERB 0.008032123
QUANT -0.005540867
TIMES TAKEN -0.023042997
WORK EXP -0.026421802
GPA 0.260872977
RECS 0.015136345
Implication is that there is some bias against age. The whole data set blows because all I have is 700+ to begin with, so it's crap. Someone find me data sets in the 600 range.
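The "no real change" result is what the algebra predicts: replacing AGE with AGE minus a constant leaves every slope alone and only shifts the intercept. A quick check using the numbers from the two acceptance tables in the post, with the graduation age of 21 being the assumption stated in the to-do list:

```python
# Values copied from the two acceptance tables in the post.
old_intercept = -0.501643016          # model that uses AGE
age_coeff = -0.026421802              # slope on AGE, and later on WORK EXP
reported_new_intercept = -1.056500848  # model that uses WORK EXP

GRAD_AGE = 21  # assumption from the post: everyone graduates at 21

# Since age = work_exp + 21:
#   intercept + b * age = (intercept + 21 * b) + b * work_exp
implied_new_intercept = old_intercept + GRAD_AGE * age_coeff

print(implied_new_intercept)
print(abs(implied_new_intercept - reported_new_intercept) < 1e-6)
```

The implied intercept agrees with the reported one to within rounding, so the second table is the same fit written in different units.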
|
Ok I'll reply to myself here.
Wharton, GSB, Kellogg, Haas data set, 2008. Removed incomplete entries and all unknowns (as I expect at least some of these are people who just never came back to update). Sample size of about 1000 after cleanup.
Coefficients
Intercept -0.440176724
GMAT 0.000101253
WORK EXP -0.002895856
GPA 0.22269004
ALUM REC 0.144391973
Interesting, too bad that:
Adjusted R Square 0.037007749
It's still crap.
Seriously though, 17 views and NOT ONE REPLY? Does anyone else see what the above is implying? Or am I just overdorkulating here?
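For reference, adjusted R-squared is plain R-squared penalized for the number of predictors. A minimal sketch of the standard formula and its inverse, assuming n = 1000 exactly (the post only says "about 1000") and the four predictors in the table above; the function names are illustrative:

```python
def adjusted_r2(r2, n, p):
    """Adjusted R-squared for n observations and p predictors."""
    return 1 - (1 - r2) * (n - 1) / (n - p - 1)


def r2_from_adjusted(adj, n, p):
    """Invert adjusted_r2 to recover plain R-squared."""
    return 1 - (1 - adj) * (n - p - 1) / (n - 1)


N = 1000  # assumption: the post says "about 1000"
P = 4     # GMAT, WORK EXP, GPA, ALUM REC

r2 = r2_from_adjusted(0.037007749, N, P)
print(r2)
```

With a sample this large relative to four predictors the penalty is small, so either way the model explains only a few percent of the variance in the outcome.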
|
SVP
Joined: 31 Jul 2006
Posts: 2310
Schools: Darden
Followers: 27
Kudos [?]:
402
[0], given: 0
|
I'd like to reply, but I'm just not sure what to make of that stuff. I need to take a closer look at it later.
|
pelihu wrote: I'd like to reply, but I'm just not sure what to make of that cr@p. I need to take a closer look at it later.
Heh. Crap is the right word.
I was trying to run a regression to determine how different variables play into the model of accept or deny.
In short, the data suggests that GMAT is worth nothing and GPA is worth everything. The reason for this is that there simply isn't enough data below the 700 mark. There's plenty above, but little below. In short, the regression is worthless.
|
rhyme wrote: pelihu wrote: I'd like to reply, but I'm just not sure what to make of that cr@p. I need to take a closer look at it later. Heh. Crap is the right word. I was trying to run a regression to determine how different variables play into the model of accept or deny. In short, the data suggests that GMAT is worth nothing and GPA is worth everything. The reason for this is that there simply isn't enough data below the 700 mark. There's plenty above, but little below. In short, the regression is worthless.
That was my first reaction to your message: that the most important factor was GPA, and that there was a negative correlation with GMAT Quant. As you have said, the real problem is with the data set. It is reasonable to believe that GMAT becomes more of a factor the further it drops below the average; by the time GMAT is below 640, it might be the single most important factor of all (nearly impossible to overcome).
On the other hand, if scores are artificially limited to 700+ (as they are here), but other factors are allowed to flow freely (more or less), then the other factors will clearly gain in importance.
I think of it this way. Each of the following could result in an "easy deny":
1. really low GMAT
2. really low GPA
3. really outrageous age
4. really horrendous recs
5. really low grade work experience
If you remove the "easy denies" with low GMATs, but leave the other "easy denies" in place, then clearly they will factor in more obviously.
|
CEO
Joined: 17 Jul 2004
Posts: 3291
Followers: 17
Kudos [?]:
419
[0], given: 0
|
Sounds like a range restriction issue.
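Range restriction is easy to see in a simulation. The sketch below uses entirely made-up data: GMAT scores drawn from a normal distribution and an admit probability that rises with GMAT along a logistic curve. The mean, spread, and curve parameters are illustrative assumptions, not estimates from any school. It then compares the GMAT/admit correlation in the full sample with the correlation among 700+ scorers only.

```python
import math
import random


def pearson(xs, ys):
    """Pearson correlation coefficient of two equal-length sequences."""
    n = len(xs)
    mx = sum(xs) / n
    my = sum(ys) / n
    cov = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    vx = sum((x - mx) ** 2 for x in xs)
    vy = sum((y - my) ** 2 for y in ys)
    return cov / math.sqrt(vx * vy)


def simulate(n=20000, seed=0):
    """Generate (gmat, admitted) pairs from a made-up admissions process."""
    rng = random.Random(seed)
    rows = []
    for _ in range(n):
        gmat = min(800.0, max(400.0, rng.gauss(650, 70)))
        # Hypothetical rule: admit probability rises steeply around 680.
        p_admit = 1 / (1 + math.exp(-(gmat - 680) / 30))
        rows.append((gmat, 1 if rng.random() < p_admit else 0))
    return rows


rows = simulate()
full = pearson([g for g, _ in rows], [a for _, a in rows])

top = [(g, a) for g, a in rows if g >= 700]
restricted = pearson([g for g, _ in top], [a for _, a in top])

print("all applicants:", full)
print("700+ only:     ", restricted)
```

GMAT drives the outcome in the simulated population, yet the correlation shrinks sharply once the sample is cut to 700+, which is the pattern the regressions in this thread would show if the underlying data were similarly truncated.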
|
Hjort wrote: Sounds like a range restriction issue.
Spoken like a true mathematician.
|
Senior Manager
Joined: 23 Jun 2006
Posts: 398
Followers: 1
Kudos [?]:
293
[0], given: 0
|
I also tried to make some sense of the relative importance of the parameters.
I think that linear regression is not suitable here...
There are too many non-parametric variables that are arbitrarily modeled as [0,1]. In general, regression models tend to work better for variables that are parametric, and preferably linear.
Also, the GMAT score in itself is not a linear parameter, i.e. the difference between 620 and 650 has a different meaning (and effect) than the difference between 720 and 750. A linear model cannot capture that difference.
So I'm not surprised that rhyme's results are not so good (and it's not that rhyme's work is crap... actually it seems that you did good work... but you ran into the theoretical limitations of regression).
I'd approach it differently. To check the effect of GMAT score, I'd compare the GMAT scores of those who were accepted with those who were dinged using a t-test. These tests are better at modeling the connection between parametric and non-parametric variables.
To see the multi-dimensional (or multivariate) connections, I'd use factor analysis, which would help explain the source of the variance in the target variable in terms of the variances of the independent variables.
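The t-test comparison suggested above can be sketched with the standard library alone. This uses Welch's version of the two-sample t-test, which does not assume equal variances; that choice is an assumption, since the post does not say which variant was meant. The two score lists are made up purely for illustration.

```python
import math
import statistics


def welch_t(a, b):
    """Return (t statistic, degrees of freedom) for Welch's two-sample t-test."""
    mean_a, mean_b = statistics.mean(a), statistics.mean(b)
    # statistics.variance is the sample variance (n - 1 in the denominator).
    sa = statistics.variance(a) / len(a)
    sb = statistics.variance(b) / len(b)
    t = (mean_a - mean_b) / math.sqrt(sa + sb)
    # Welch-Satterthwaite approximation for the degrees of freedom.
    df = (sa + sb) ** 2 / (sa ** 2 / (len(a) - 1) + sb ** 2 / (len(b) - 1))
    return t, df


# Hypothetical GMAT scores, not taken from any real data set.
accepted = [720, 740, 750, 730, 760, 710]
dinged = [700, 710, 720, 690, 730, 700]

t, df = welch_t(accepted, dinged)
print("t =", t, "df =", df)
```

Turning t and df into a p-value needs the t distribution's CDF, which the standard library does not provide; a statistics package or a printed table would cover that step.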
|
Reading the thread (and my post) again...
It might be that I confused factor analysis with "analysis of variance" (also known as ANOVA)... I can check it if you'd like. It's been a long time since I used them in practice...
Also, if we still want to pursue the linear model, there are 3 things that may help it be more accurate:
a) Normalizing scale. I'm not sure if you did that or not, but if you'd like the coefficients to represent relative importance, you need all parameters to work on the same scale. I.e. if you work with GMAT score, divide it by 800 to have a 0...1 scale (or better, since there is no data on lower scores, subtract 550 from the score and divide by 250). Same with GPA (divide by 4, or subtract something and normalize), etc...
If all variables are normalized to a 0..1 scale the correlation remains the same, but the coefficients can be compared. But again... you might have already done that.
b) To overcome the non-linearity of GMAT score (and quant/verbal score as well), you can use percentiles instead. Percentiles are, by definition, a linear parameter.
c) Instead of a 0/1 target variable (0-dinged, 1-accepted), you can elaborate it further to represent more information. For example: 0-dinged, 1-interviewed but dinged, 2-accepted
or even better (if you have the data): 0-dinged, 1-interviewed but dinged, 2-waitlisted but rejected, 3-waitlisted and accepted, 4-accepted.
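Suggestions (a) and (c) can be sketched in a few lines. The scaling ranges (550 to 800 for GMAT, 0 to 4 for GPA) are the ones proposed above; the raw coefficients are copied from the four-school fit earlier in the thread; the function and variable names are illustrative. When a predictor is divided by its range, its coefficient is multiplied by that range, so the rescaled coefficients can be computed without refitting.

```python
def min_max(value, low, high):
    """Map value from the range [low, high] onto [0, 1]."""
    return (value - low) / (high - low)


# Raw coefficients from the Wharton/GSB/Kellogg/Haas fit in the thread.
RAW_COEFFS = {"gmat": 0.000101253, "gpa": 0.22269004}

# Ranges suggested in point (a); treat them as assumptions.
RANGES = {"gmat": (550, 800), "gpa": (0.0, 4.0)}

# Coefficient on the 0..1 version of each variable = raw coefficient * range.
scaled_coeffs = {
    name: coeff * (RANGES[name][1] - RANGES[name][0])
    for name, coeff in RAW_COEFFS.items()
}

# Point (c): an ordinal target in place of the 0/1 accept flag.
OUTCOME_CODES = {"dinged": 0, "interviewed but dinged": 1, "accepted": 2}

print(min_max(675, *RANGES["gmat"]))
print(scaled_coeffs)
print(OUTCOME_CODES["interviewed but dinged"])
```

On these numbers the rescaled GMAT coefficient is about 0.025 against about 0.89 for GPA, so putting both on a 0..1 scale does not by itself change the picture that GPA dominates in this truncated sample.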