Some interesting (and alarming) articles have been making the rounds lately, following on the heels of an academic study published by professors at the University of Akron and Cleveland State University. The more reputable articles report such sweeping conclusions that I actually wondered whether the journalists got it wrong, so I went to the source (I can link only to the abstract here, but I did read the full study).
When I read the study’s methodology, I knew I had my next article topic. We’re going to test our Critical Reasoning (CR) skills on an actual academic study! You might have to do something similar in business school (admittedly with a business case, not an academic study), so let’s test your b-school readiness now!
(Note: I refer to the “more reputable articles” because some blogs have picked this up and are publishing it under headlines such as “Is the GMAT the root of all evil?” As much as you may hate studying for this test, I think we can agree that this characterization is a bit over the top. : ) )
Correlation vs. Causation
We need to define a couple of terms first. You may already have learned about correlation and causation in your CR studies; here’s a refresher.
Correlation: two phenomena tend to occur or appear at the same time or in conjunction with one another
Causation: one phenomenon causes another phenomenon
Correlation does not imply causation. One of two correlated phenomena could cause the other, but the two could also have absolutely no causal link between them. Alternatively, the two things could both be caused by a third thing. The two things could even cause each other! (Predator-prey dynamics are an example of this kind of two-way dependency.)
For example, have you ever noticed how, when the ground is wet, people often seem to be carrying around umbrellas? Those two phenomena are correlated. Which one causes the other?
Neither, of course. A third thing (rain!) causes both.
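If you like to see this kind of thing in numbers, here is a rough sketch of the rain example in Python. Everything in it is invented for illustration: the chance of rain, the chance that the ground is wet, and the chance that someone carries an umbrella are assumptions I made up, not real data. The idea is simply that rain drives both of the other two, and once you look only at rainy days (or only at dry days), the link between wet ground and umbrellas should mostly disappear.

```python
import random


def pearson(xs, ys):
    """Pearson correlation coefficient of two equal-length sequences."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    return cov / (var_x * var_y) ** 0.5


rng = random.Random(0)  # fixed seed so the run is repeatable

# Simulate 20,000 days. All probabilities are made-up assumptions.
days = []
for _ in range(20000):
    rain = rng.random() < 0.3
    # Wet ground and umbrellas each depend on rain, but not on each other.
    wet = rng.random() < (0.9 if rain else 0.1)
    umbrella = rng.random() < (0.8 if rain else 0.05)
    days.append((rain, wet, umbrella))


def wet_umbrella_corr(rows):
    """Correlation between wet ground and umbrellas over the given days."""
    wet_flags = [int(wet) for _, wet, _ in rows]
    umbrella_flags = [int(umbrella) for _, _, umbrella in rows]
    return pearson(wet_flags, umbrella_flags)


overall = wet_umbrella_corr(days)
rainy_only = wet_umbrella_corr([d for d in days if d[0]])
dry_only = wet_umbrella_corr([d for d in days if not d[0]])

print(f"all days:        {overall:.2f}")
print(f"rainy days only: {rainy_only:.2f}")
print(f"dry days only:   {dry_only:.2f}")
```

Across all days the two should come out clearly correlated; within the rainy days alone, or the dry days alone, the correlation should hover near zero. That is the signature of a third factor causing both.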
Academic studies typically first seek to establish correlation. If correlation can be established, then the next step is to explore—systematically!—every possible causal relationship in order to rule out all but one. Doesn’t that sound like fun? This is why such studies can take years or even decades to draw broad conclusions. In addition, typically one or multiple studies will establish correlation and then other studies will tackle the possible causation.
Assumptions
Unless you haven’t started studying CR yet, you’ve already run across assumptions, which are a part of any argument that contains a conclusion. When drawing a conclusion, the author assumes certain things to be true without stating them outright in the argument.
In an academic study, any such assumptions actually will be stated upfront. If the study makes any significant but unstated assumptions, the authors are likely to be criticized and the study might not pass the peer-review process.
On the GMAT, we have to brainstorm these assumptions ourselves (since they aren’t actually stated) and then ask ourselves what the weak links might be in such assumptions. When reading a business case or an academic study, you will want to analyze any stated assumptions and try to figure out whether any unstated assumptions affect the information presented or the conclusions drawn.
The Argument
I can’t, of course, reproduce the full 20-page study here—and you’re probably not particularly interested in reading it. I’ve picked out a few tidbits, though, to create a GMAT-like argument that we can analyze. (Note: by definition, this is going to leave out many major aspects of the study. GMAT arguments are only a paragraph long. But we’re not trying to recreate the academic study here; we’re just stretching our critical reasoning brain cells.)
Let’s start with two of the stated assumptions in the study, followed by a piece of evidence (cited from an earlier study), and a conclusion of this study. I’m going to number the paragraphs so that we can refer back to these quotes more easily as we discuss. (All quotes are from “Culture, Gender, and GMAT Scores: Implications for Corporate Ethics,” R. Aggarwal et al., Journal of Business Ethics, July 2013.)
1. “In this study we are implicitly assuming that populations of GMAT test takers in respective countries are reflective of their respective national cultures.”
2. “In this paper we start from the view that cultural dimensions that partially determine GMAT scores also partially determine the cultural landscape of cohorts of graduate business.”
3. “Getz and Volkema (2001) conclude that both high-level public officials and members of the underclass are more susceptible to unethical behavior (bribery, extortion) in high power distance cultures.” (Note from Stacey: “high power distance” means a society with a greater incidence of hierarchies; there is a “power distance” between junior and senior members of an organization, for example.)
4. “Power distance [is] negatively significant, suggesting that greater societal hierarchy and greater differences in gender roles are associated with lower GMAT scores.”
In statement 1, the authors appear to be referring to the country in which someone takes a test as opposed to the national origin of that person (though I’m not entirely sure; the language isn’t precise enough).
Is that a solid assumption? If I were to take the test in Germany or China, then I would be counted as part of that “national culture.” That may or may not be appropriate. If I had lived in China for 15 years, then perhaps it’s reasonable to assume that my cultural outlook would be much closer to Chinese culture than to that of my native country. On the other hand, if I’d only moved there a year ago, then presumably my cultural outlook would be much closer to that of my native country (or wherever I’d lived before going to China). What do you think?
Given the ambiguity in the hypothetical China situation, we’ve got a problematic assumption. This raises a broader question. Is it reasonable to assume that all people living in a certain country share the same general cultural and ethical outlook? Likewise, is it reasonable to assume the same about everyone who grew up in a certain country? There are so many variables: regional differences within a country, differences in work and life experience, differences in religious and moral upbringing, and so on.
In short, people are very different! Assumptions that are based on whole-country averages or statistics for something as complex as ethical behavior… well, such assumptions are suspect for good reason.
That brings us to the second assumption. The italics are mine; the original quote does not emphasize that text. Go back up and think about what that text might mean before you read the next paragraph.
The italicized text states that the authors assume that there are cultural dimensions that affect GMAT scores. In other words, the authors have assumed causation at the beginning of the study, not the end of it. The only thing they leave open is which specific cultural traits are correlated with certain GMAT scores—that’s what the study is for. But do cultural dimensions actually affect GMAT scores in the first place?
Remember our correlation and causation discussion? The second assumption seems to have things backwards. Rather than look for correlation first and then try to determine causation, the study assumes causation and then looks for correlation. (In a fun twist, the study later takes those correlations and assumes more causation… but I didn’t actually give you the information you would need to analyze that. You’ll have to take my word for it.)
So far, as with GMAT arguments, we’ve found a decent amount to question in the assumptions. Let’s take a look at the evidence and conclusion.
The third statement, the premise (evidence), tells us that an earlier study, conducted by Getz and Volkema, concluded that high power distance cultures are more likely to have unethical behavior going on. Without reading the actual study, we can’t know whether their conclusion is sound, but the authors of this study are citing the Getz and Volkema study as evidence in support of their position. (In the paper, the authors also cite a later study by Volkema that “suggests” that a greater power distance is also “associated with the use of dubious negotiation practices.”)
Okay, so the premise is that greater power distance is correlated with unethical behavior. That is, the more hierarchical a society is, the more likely it is to display characteristics of unethical behavior in the workplace.
What about the conclusion? I’ll repeat it here:
4. “Power distance [is] negatively significant, suggesting that greater societal hierarchy and greater differences in gender roles are associated with lower GMAT scores.”
What does that mean? Think it through.
A greater power distance (more hierarchy) is correlated with lower GMAT scores. Hmm, and the earlier piece of evidence said that greater power distance is also correlated with unethical behavior… what does that mean?
Let’s recap:
Getz and Volkema: higher power distance is correlated with more unethical behavior
Aggarwal et al.: higher power distance is correlated with lower GMAT scores
With those two pieces of data, you would hypothesize one of two things. Either there is no correlation between ethical behavior and GMAT scores (just because they’re both correlated with power distance doesn’t mean they’re correlated with each other) or the correlation goes along with those two statements: that is, more unethical behavior would be correlated with lower GMAT scores.
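Here is a rough Python sketch of those two possibilities. To be clear, none of this is the study’s data or method: the variables power, unethical, and gmat are synthetic numbers I generate from made-up formulas. In both scenarios, power distance is positively correlated with unethical behavior and negatively correlated with GMAT scores, which matches the recap above. The only thing that differs is how the numbers are built behind the scenes.

```python
import random


def pearson(xs, ys):
    """Pearson correlation coefficient of two equal-length sequences."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    return cov / (var_x * var_y) ** 0.5


rng = random.Random(1)  # fixed seed so the run is repeatable
n = 20000

# Scenario A (hypothetical): power distance has two independent parts.
# Unethical behavior tracks one part; GMAT scores track the other.
part_1 = [rng.gauss(0, 1) for _ in range(n)]
part_2 = [rng.gauss(0, 1) for _ in range(n)]
power_a = [x + y for x, y in zip(part_1, part_2)]
unethical_a = [x + rng.gauss(0, 1) for x in part_1]
gmat_a = [-y + rng.gauss(0, 1) for y in part_2]

# Scenario B (hypothetical): both outcomes track power distance as a whole.
power_b = [rng.gauss(0, 1) for _ in range(n)]
unethical_b = [p + rng.gauss(0, 1) for p in power_b]
gmat_b = [-p + rng.gauss(0, 1) for p in power_b]

r_pu_a = pearson(power_a, unethical_a)
r_pg_a = pearson(power_a, gmat_a)
r_ug_a = pearson(unethical_a, gmat_a)

r_pu_b = pearson(power_b, unethical_b)
r_pg_b = pearson(power_b, gmat_b)
r_ug_b = pearson(unethical_b, gmat_b)

for label, r_pu, r_pg, r_ug in [
    ("A", r_pu_a, r_pg_a, r_ug_a),
    ("B", r_pu_b, r_pg_b, r_ug_b),
]:
    print(f"Scenario {label}: power/unethical {r_pu:+.2f}, "
          f"power/GMAT {r_pg:+.2f}, unethical/GMAT {r_ug:+.2f}")
```

In Scenario A, unethical behavior and GMAT scores should come out essentially uncorrelated even though each is correlated with power distance. In Scenario B, more unethical behavior should go with lower GMAT scores. The two published correlations alone can’t tell you which scenario you’re in.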
The ultimate conclusion of the study, though, is that more unethical behavior is correlated with higher GMAT scores, not lower ones. Now, the study does include its own mathematical analysis that purports to support this view (I haven’t shared it with you here because it’s highly technical).
The problem is that the paper essentially shows that both correlations are potentially valid: that is, that higher GMAT scores may be correlated with both more ethical behavior and less ethical behavior. If that’s the case… well, more work needs to be done to settle the correlation issue before we can even start to think about causation.
I’ll add one more thing: that potential discrepancy isn’t even the most disturbing issue (to me) in the paper. In the Discussion section at the end of the paper, the authors suggest that business schools should implement some changes in the way that they use the GMAT. But the authors haven’t actually established any causation! The study details never addressed causation at all, only correlation (and, as we saw, even that correlation argument has some issues). In fact, I gave the paper to a couple of statisticians to read and they were so distracted by the fact that the study simply assumed causation that they both had to read it a second time in order to delve into the mathematical analysis.
Now, I think there are probably some good changes that can be made with respect to the way schools and employers use the GMAT. I definitely think some are using the GMAT more heavily than they should or in a way that isn’t appropriate given the nature of this test. The GMAT is only one measure of a certain kind of potential, and no single measure can or ever will catch all of the “best” people for a certain job or career. There are just too many variables that can lead to success. (And that’s a great thing, isn’t it? : ) )
The academic study does mention several puzzling aspects that are worth further exploration, such as the fact that women tend to have lower GMAT scores, on average, but women and men perform similarly in business school. In addition, one of the strongest predictors of performance is wealth. The study doesn’t explore these phenomena, though. Rather, the conclusions are focused on the ethics argument.
(By the way, for decades it was also true that boys scored higher than girls on the SAT, even though girls had better grades at university. Today, on average, boys still score higher than girls on the math section, but girls score better on the essay section, and the two groups are about equal on critical reading.)