Wednesday, October 03, 2012
Correlation <--> Causation
A nice response to one of the most common default challenges every empirical researcher must deal with:
The correlation phrase has become so common and so irritating that a minor backlash has now ensued against the rhetoric if not the concept. No, correlation does not imply causation, but it sure as hell provides a hint. Does email make a man depressed? Does sadness make a man send email? Or is something else again to blame for both? A correlation can't tell one from the other; in that sense it's inadequate. Still, if it can frame the question, then our observation sets us down the path toward thinking through the workings of reality, so we might learn new ways to tweak them. It helps us go from seeing things to changing them.
Posted by Orly Lobel on October 3, 2012 at 11:18 PM | Permalink
I read that article and thought it was a terrible response to what you, correctly, call "the most common default challenge[ ] every empirical researcher must deal with." It is the most common default challenge because it is the most common defect that requires challenge. If I had a nickel for every empirical paper that makes unsupported causal claims from the empirical data . . .
To be sure, it is a challenge that can be abused. In other words, there are two possible abuses: (a) making an incorrect causal inference, (b) improperly challenging a correct causal inference. But the two abuses are not symmetrical. An improper challenge to a correct causal inference can be rebutted by explaining why the causal inference is warranted. An incorrect causal inference can only be challenged with the statement "correlation does not prove causation." If you discredit that notion--and I read Engber to be trying to do exactly that by saying it is a "blowhard's favorite phrase"--then you legitimate a lot of blowhard scholarship by leaving no other possible response.
Posted by: TJ | Oct 4, 2012 1:22:03 AM
I largely agree with TJ. It is indeed very important to distinguish causal explanations from correlations and the two often are confused or conflated. And it is true that correlations may suggest a causal relation but no such relation exists until we can specify or speculate about the possible causal mechanism at work that ties the two things or events together. That one event follows another is no reason to infer that the former causes the latter, for it may very well be the case that a third thing accounts for both events, that a third thing, a common mechanism, lies behind both, and thus explains the effects of both events or cases. Elster provides a nice example from child custody cases wherein we find a correlation between children being more “disturbed” in contested custody cases than in those cases in which the parents have reached a private custody agreement. It is natural to infer the correlation in the first instance suggests an explanation: the custody dispute itself is the cause of pain and guilt in the children. However, as Elster explains, it may be the case “that custody disputes are more likely to occur when the parents are bitterly hostile toward each other and that children of two such parents tend to be disturbed.” So the disturbance is accounted for by mechanisms that exist prior to the contested custody event itself and thus the contested custody does not in fact account for the disturbance, even if it might be the occasion in which we are better able to take notice of the effects of the prior cause on the children. Elster says that to identify the true cause of the disturbance we would have to have some way of measuring or assessing the pain and guilt or suffering both before and after a divorce.
He further raises the possibility that we might be mistaken about the direction of our causal arrow: in our case this would mean that it could very well be (i.e., possible even if unlikely) that it is the disturbed child or children who serve as the cause behind the hostility of the parents toward each other and their eventual divorce! One more possibility exists, although perhaps not applicable to our case: a correlation may arise purely by chance and in fact have no causal interpretation. Finally, perhaps an event requires a fortuitous confluence of conditions or mechanisms that together make for a causal explanation. Unlike Elster (who holds to a sophisticated conception of ‘methodological individualism’), Harold Kincaid has argued that we can speak about causation in the terms of explanation and confirmation without precise micro-level identification of mechanisms.
A nice example in philosophy of mind and (the) neuroscience(s) speaks to the importance of not making undue inferences from correlation: Brain scans reveal correlations, and thus neither causal nor (let alone) identity relations. Neural activity and the phenomenological experience of pain, for example, are not at all like one another and we can’t assume or claim the former provides a causal explanation of the latter: “Seeing correlations between event A (neural activity) and event B (say, reported experience) is not the same as seeing event B when you are seeing event A.” Yet scientists, laypersons, and even philosophers have succumbed to assuming, suggesting or positing a causal relation between brain states and mental states without identifying the specific causal mechanism at work. The intuitive belief that the mind might, in the end, turn out to be merely the brain or that the mind is the “product” of the brain, makes such an inference almost too tempting to abjure for some folks, especially those individuals Raymond Tallis christens “neuromaniacs.” Neural correlates may simply be evidence of the necessary but not sufficient conditions that make for a causal explanation of, say, a sensation, experience, or consciousness itself.
Posted by: Patrick S. O'Donnell | Oct 4, 2012 3:25:37 AM
Reading that Slate article caused it to be way past my bedtime.
Posted by: Orin Kerr | Oct 4, 2012 3:44:13 AM
No doubt kindness and discretion kept you from saying the same thing about my comment!
Posted by: Patrick S. O'Donnell | Oct 4, 2012 7:46:56 AM
hmmm...would this be proximate correlation or producing correlation?
Posted by: Fletcher | Oct 4, 2012 8:09:08 AM
Actually, correlation does not provide even a hint unless we think it's probative of possible causal connections of some kind. And that will depend on the robustness of the correlation controlling for other confounding factors. So not sure that this paragraph you quote really advances the ball.
Posted by: Brian | Oct 4, 2012 11:06:28 AM
Many omitted-variables, endogeneity, and related stories for why [correlation doesn't imply causation] also imply that [*lack* of correlation doesn't imply *lack* of causation]. That is, a lack of correlation is often logically consistent with the existence of a causal relationship.
More generally, such stories typically have the twin properties that
(i) the existence of a correlation is logically consistent with both the presence and absence of a causal relationship, and
(ii) the *lack* of a correlation is logically consistent with both the presence and absence of a causal relationship.
In such cases, observing a correlation does nothing to restrict the set of logically possible conclusions concerning the presence of a causal relationship.
The problem in the "sure as hell provides a hint" position is its failure to acknowledge (ii). Suppose we change (ii) as follows
(ii') the *lack* of a correlation is logically consistent with only the absence of a causal relationship.
If (ii') were true, then it would make sense to say that observing a correlation provides a hint of causation, since such an observation would rule out one part of the logical space--the Venn diagram, if you prefer--in which there would be no causation.
Again, though, most stories for why a correlation is logically consistent with the absence of causation also tell us that an absence of correlation is consistent with the presence of causation. Thus when we observe no correlation, we are ruling out parts of the logical space--the Venn diagram--in which there isn't causation *as well as* parts where there is.
Without a more explicit model that lets us quantify the relative likelihoods of the different parts of the logical space in question, we usually have no way to tell whether correlation provides a hint concerning causation.
The moral of this story is simple--drawing causal conclusions from empirical observation requires either some mechanism to assure comparability (randomization being the leading example) or a behavioral model that is powerful (because restrictive) enough to link empirical observations with causal conclusions.
Posted by: Jonah Gelbach | Oct 4, 2012 11:06:45 AM
"That is, a lack of correlation is often logically consistent with the existence of a causal relationship." I'm just a bystander here but I'm having trouble imagining a real-world example of this. (I'm sure it's *logically* consistent.) Let's say condition A causes B; how could you have lots of A around, then, but no B? Or I guess the scenario is, you have just as much B as you do in the absence of A? Is the idea that there might be other, much more frequently occurring causes of B, that hide the correlation with A in a haystack, as it were? I.e., is it that there are cases where the correlation with A is undetectable as a practical matter, or that there are cases where it doesn't exist?
Posted by: Bruce Boyden | Oct 4, 2012 12:06:03 PM
Suppose the question is whether being a skydiver reduces a young person's probability of living to age 70. That is, the question is, Does being a skydiver make each given young person less likely to live to 70?
Common sense suggests that, other things equal, a young person who throws herself from an airplane thousands of feet above the earth is more likely to die before she gets to 70 than she would be if she did not throw herself from an airplane thousands of feet above the earth.
And yet it would not be at all surprising to find that skydivers are just as likely to make it to 70 as non-skydivers, since people who especially like to skydive might well also especially like to exercise and eat well. In this example, the correlation between skydiving and living to 70 could be zero, even though the causal effect of skydiving on the probability of living to 70 is negative for each person in the relevant population. The issue here is that causal effects are but-for in nature, which means they are intra-person, whereas correlation measures are cross-sectional, so that they compare potentially non-comparable people to each other.
Note that there's nothing special about my skydiving example. Any time omitted variables bias or some other confounding influence operates in the direction opposite to a causal effect, it is possible for an absence of correlation and a presence of causation to occur at the same time.
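As a rough illustration of the skydiving example, here is a small Python simulation. Every number in it is a made-up assumption, not data: a hidden "healthy lifestyle" trait raises the chance of reaching 70 by 0.2, skydiving lowers it by 0.1 for everyone, and healthy people take up skydiving 75% of the time versus 25% for everyone else. Those values are chosen so that the confounder exactly offsets the causal effect in the cross-section. The function names `simulate` and `survival_gap` are hypothetical helpers for this sketch.

```python
import random


def simulate(n=200_000, seed=0, randomized=False):
    """Return a list of (skydives, survives_to_70) pairs for n simulated people.

    All probabilities below are illustrative assumptions, not estimates.
    """
    rng = random.Random(seed)
    rows = []
    for _ in range(n):
        # Hidden trait the analyst does not observe (exercise, diet).
        healthy = rng.random() < 0.5
        if randomized:
            # Coin-flip assignment: skydiving is independent of health.
            skydives = rng.random() < 0.5
        else:
            # Self-selection: healthy people are more likely to skydive.
            skydives = rng.random() < (0.75 if healthy else 0.25)
        # Skydiving lowers each person's survival chance by 0.1,
        # whatever their health; being healthy raises it by 0.2.
        p_survive = 0.6 + (0.2 if healthy else 0.0) - (0.1 if skydives else 0.0)
        survives = rng.random() < p_survive
        rows.append((skydives, survives))
    return rows


def survival_gap(rows):
    """Survival rate of skydivers minus survival rate of non-skydivers."""
    divers = [survived for skydives, survived in rows if skydives]
    others = [survived for skydives, survived in rows if not skydives]
    return sum(divers) / len(divers) - sum(others) / len(others)


observational_gap = survival_gap(simulate(randomized=False))
randomized_gap = survival_gap(simulate(randomized=True))
print(f"self-selected skydivers, gap in survival rate: {observational_gap:+.3f}")
print(f"randomly assigned skydivers, gap in survival rate: {randomized_gap:+.3f}")
```

Under these assumptions the self-selected comparison should show a gap close to zero, while the randomized comparison should recover a gap close to the built-in effect of -0.1. That is the sense in which randomization assures comparability: it breaks the link between the hidden trait and who skydives.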
Posted by: Jonah Gelbach | Oct 4, 2012 9:51:13 PM
Ah, I get it, it's washed out by stuff you can't isolate. Thanks Jonah.
Posted by: Bruce Boyden | Oct 4, 2012 11:51:48 PM