
Thursday, February 19, 2015

Crime, Incarceration, and Crack

In my first post on the new Brennan Center report on prison’s impact on crime, I examined its problematic treatment of endogeneity bias. Today I want to look at how it addresses another tricky empirical morass, namely omitted variable bias.*

To the report’s credit, the authors think through a long list of possible causal factors. In the end, they come up with fourteen:

1. Increased incarceration

2. Increased policing

3. Death penalty

4. Concealed-carry laws

5. Unemployment

6. Growth in income

7. Inflation

8. Consumer confidence

9. Decreased alcohol consumption

10. Aging population

11. Decreased crack use

12. Legalized abortion

13. Decreased lead in gasoline

14. Introduction of CompStat

That’s a pretty long list. There are other factors that should be included, and I’ll look at that OVB problem in a future post. Here, I just want to consider the OVB problems with the variables they listed.

While they assembled this long list, only eight of these made it into their state-level analyses. Their major regressions dropped inflation, consumer confidence, decreased crack use, abortion, decreased lead, and CompStat.

For all but CompStat, the rationale was lack of data. Inflation and consumer confidence data are available only at the regional (inflation) or national (confidence) level. Crack data isn’t available before 1990 and the authors claim that there is no state-by-state data for any years (although this is wrong, as we’ll see). There is no publicly available state-level data on lead levels; the famous study by Jessica Reyes apparently relied on data she collected herself, and the authors obliquely state that they “could not recover this data from her.”** The abortion data is available at the state level, but it is missing (because never gathered) for fifteen of the years between 1983 and 2011. CompStat data is dropped from the state regressions but included in the city-level ones on understandable (but debatable) grounds that policing is a city matter, not a state one.

The list of dropped variables is initially fairly concerning. Three of the six are considered by many to be major explanatory variables for the crime drop: see Steve Levitt here for crack and abortion, and this article (same link as above) for lead. 

In all three cases, though, the report’s authors argue that whatever important effect these factors had on crime in the 1980s and 1990s, all three had started to influence crime rates much less by the 2000s and 2010s. If true, that addresses the OVB problem, since as noted in my earlier post, the omission of a variable that does not influence crime can’t bias the estimate of incarceration’s effect on crime.

Moreover, even if any or all of these factors have a strong impact on crime even to this day, their omission won’t bias the estimate of incarceration unless they are also correlated with incarceration. Is that true in these cases?
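Both conditions can be read off the standard omitted variable bias formula. As a sketch, using notation introduced here rather than anything taken from the report, suppose crime depends linearly on incarceration and on one omitted factor such as crack:

```latex
\begin{align*}
\text{crime} &= \beta_0 + \beta_1\,\text{incarceration} + \beta_2\,\text{crack} + \varepsilon \\
\text{crack} &= \delta_0 + \delta_1\,\text{incarceration} + \nu \\
E\big[\hat{\beta}_1^{\text{short}}\big] &= \beta_1 + \beta_2\,\delta_1
\end{align*}
```

Here the "short" estimate is the one from regressing crime on incarceration alone. The bias term, \beta_2 \delta_1, vanishes if the omitted factor has no effect on crime (\beta_2 = 0) or if it is uncorrelated with incarceration (\delta_1 = 0).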

For the rest of this post, I’ll just focus on crack. I’ll come back to the other factors in future posts. (To foreshadow a bit, my feeling right now is that the only other variable besides crack whose omission could be a problem is CompStat, especially for the post-2000 data, but maybe even for the whole time period.)

For crack, let’s start with the correlation issue. It’s plausible that city- or state-level exposure to crack could shape penal outcomes outside of crack’s effect on crime rates: crack-related violence could have spurred police to take more extreme action because of the fear and panic it created. Non-crack laws (such as gun laws) could have toughened in response as well. So increased crack use could lead to increased incarceration, even independently of its effect on crime.***

So as long as crack is increasing crime, omitting crack will bias the estimate of the effect of incarceration, and it will bias it towards zero (i.e., the regression will understate the true effect of incarceration).****
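To make the direction of that bias concrete, here is a minimal simulation sketch. Everything in it is hypothetical: the three parameter values, the variable names, and the simulated "states" are assumptions chosen for illustration, not numbers from the report. It regresses crime on incarceration with and without crack as a control.

```python
import random

# Hypothetical parameter values, chosen only for illustration.
TRUE_EFFECT_OF_INCARCERATION = -0.5  # more incarceration, less crime
EFFECT_OF_CRACK_ON_CRIME = 0.5       # more crack, more crime
CRACK_TO_INCARCERATION = 0.8         # more crack, more incarceration


def slope(x, y):
    """OLS slope from regressing y on x with an intercept."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    sxy = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    sxx = sum((a - mean_x) ** 2 for a in x)
    return sxy / sxx


def residuals(x, y):
    """What is left of y after removing its linear fit on x."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    b = slope(x, y)
    return [(yi - mean_y) - b * (xi - mean_x) for xi, yi in zip(x, y)]


rng = random.Random(0)
n = 20000
crack = [rng.gauss(0, 1) for _ in range(n)]
incarceration = [CRACK_TO_INCARCERATION * c + rng.gauss(0, 1) for c in crack]
crime = [
    TRUE_EFFECT_OF_INCARCERATION * i + EFFECT_OF_CRACK_ON_CRIME * c + rng.gauss(0, 1)
    for i, c in zip(incarceration, crack)
]

# Short regression: crack is omitted.
short_estimate = slope(incarceration, crime)

# Long regression: crack is controlled for by partialling it out of both
# crime and incarceration (the Frisch-Waugh approach).
long_estimate = slope(residuals(crack, incarceration), residuals(crack, crime))

print(f"crack omitted:  {short_estimate:.3f}")
print(f"crack included: {long_estimate:.3f}")
```

Under these assumed parameters the estimate with crack included should land near the true -0.5, while the estimate with crack omitted should land near -0.26, that is, closer to zero.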

But the authors respond, somewhat correctly, that crack-related offending had declined enough by the 2000s that its effect on crime was likely minimal from that point on. And the authors of the one major study that does look at the crack-crime relationship (Roland Fryer, Paul Heaton, Steven Levitt, and Kevin Murphy) do argue that crack-related violence and other crack-related pathologies had declined significantly by the time their study ended, in 2000.

At the same time, the Fryer-Heaton-Levitt-Murphy paper argues that crack consumption remained high in 2000, at about 65% to 70% of its peak levels, even as many of the social ills (like exceptionally high murder rates for young black men) dissipated. So the bias will persist, although likely less strongly than in the 1980s and 1990s, as long as two conditions hold. First, higher crack use must lead to higher incarceration rates outside of its effect on offending: perhaps high-use states continue to adopt or maintain tougher sentencing laws, or deploy more police per unit of crime, or are more urbanized (and urbanization shapes incarceration rates). Second, crack use must continue to contribute to offending, perhaps less through the violent drug-market wars of the past and more through lower-level offenses committed by addicts, as suggested by the recent work by Shawn Bushway and others arguing that the greying of US prisons comes from an older cohort of heavy drug users who continue to offend much later in life than expected.

But there is an additional wrinkle to omitting crack use. Crack actually belongs on both sides of the equation. Crack can lead to more violent and property crime, but crack use and distribution are crimes themselves, though ones not counted in this report. The report just looks at the index offenses gathered by the Uniform Crime Reports (murder, aggravated assault, forcible rape, arson, robbery, burglary, larceny-theft, and motor vehicle theft), thus leaving out all drug offenses and less-serious violent, property, and public-order offenses—for all of these, the FBI just gathers arrest data, not offending data, and variations in arrest data (especially for drug offenses) need not closely track variations in underlying offending.

The problem here is clear. High-crack states will have higher incarceration rates due in part to higher crime rates (i.e., more crack offenses), but those higher crime rates aren’t captured in the crime variable. This could further magnify the bias discussed above. Assume we have two states with identical violent and property crime rates, but one has a bigger crack problem than the other. The high-crack state will have a higher prison population but the same apparent crime rate, making incarceration look less effective.
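The two-state example generalizes. The following sketch isolates this second problem under strong, hypothetical assumptions: prisoners held for index offenses reduce index crime, prisoners held for drug offenses are unrelated to index crime, and both are measured in arbitrary standardized units. None of the numbers come from the report.

```python
import random

# Hypothetical effect of index-offense incarceration on index crime.
TRUE_EFFECT = -0.5


def slope(x, y):
    """OLS slope from regressing y on x with an intercept."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    sxy = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    sxx = sum((a - mean_x) ** 2 for a in x)
    return sxy / sxx


rng = random.Random(1)
n = 20000

# Prisoners held for index offenses and for drug offenses (standardized).
index_prisoners = [rng.gauss(0, 1) for _ in range(n)]
drug_prisoners = [rng.gauss(0, 1) for _ in range(n)]

# The incarceration variable counts everyone in prison.
total_prisoners = [a + b for a, b in zip(index_prisoners, drug_prisoners)]

# The crime variable counts only index offenses, so drug prisoners
# have no measured crime to "explain" in this stylized setup.
index_crime = [TRUE_EFFECT * p + rng.gauss(0, 1) for p in index_prisoners]

estimate_index_only = slope(index_prisoners, index_crime)
estimate_total = slope(total_prisoners, index_crime)

print(f"index-offense prisoners only: {estimate_index_only:.3f}")
print(f"all prisoners:                {estimate_total:.3f}")
```

With equal variances assumed for the two groups of prisoners, the coefficient on the total prison population should come out at about half the true effect: the drug prisoners add variation to the incarceration variable that has no counterpart on the crime side.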

And while only 17% of state prisoners are in prison on drug charges, a large share of those are serving time for crack or cocaine charges. So the numbers being dropped from the crime variable but included in the incarceration term are not trivial.

Finally (at last), the report’s authors actually aren’t entirely right when they say that there is no state-level crack-use data. They are right about the gaps in the official data. As for the Fryer-Heaton-Levitt-Murphy data, the report’s authors state that they “could not secure the data,” even though they are publicly available right here (more here—but don’t ask me why the files are pdfs).

The index, based on a weighting of crime rates, media accounts, and other factors, is certainly not above reproach, but then neither are any of our problematic official accounts of drug use, which is an understandably hard thing to measure. As long as the index is sufficiently correlated with actual crack use/sale, an imperfect proxy is generally better than an omitted variable.
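The proxy claim can be checked in a small simulation. This is a sketch under assumed, hypothetical parameters, with a made-up noisy index standing in for the Fryer-Heaton-Levitt-Murphy measure; it does not use their data.

```python
import random

# Hypothetical parameter values, chosen only for illustration.
TRUE_EFFECT = -0.5            # effect of incarceration on crime
CRACK_ON_CRIME = 0.5          # effect of crack on crime
CRACK_ON_INCARCERATION = 0.8  # effect of crack on incarceration
PROXY_NOISE_SD = 0.5          # measurement error in the crack index


def slope(x, y):
    """OLS slope from regressing y on x with an intercept."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    sxy = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    sxx = sum((a - mean_x) ** 2 for a in x)
    return sxy / sxx


def residuals(x, y):
    """What is left of y after removing its linear fit on x."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    b = slope(x, y)
    return [(yi - mean_y) - b * (xi - mean_x) for xi, yi in zip(x, y)]


rng = random.Random(2)
n = 20000
crack = [rng.gauss(0, 1) for _ in range(n)]
crack_index = [c + rng.gauss(0, PROXY_NOISE_SD) for c in crack]  # noisy proxy
incarceration = [CRACK_ON_INCARCERATION * c + rng.gauss(0, 1) for c in crack]
crime = [
    TRUE_EFFECT * i + CRACK_ON_CRIME * c + rng.gauss(0, 1)
    for i, c in zip(incarceration, crack)
]

# Crack left out entirely.
omitted_estimate = slope(incarceration, crime)

# Noisy index used as a control (partialled out of both variables).
proxy_estimate = slope(
    residuals(crack_index, incarceration), residuals(crack_index, crime)
)

print(f"crack omitted:      {omitted_estimate:.3f}")
print(f"noisy proxy used:   {proxy_estimate:.3f}")
print(f"true effect:        {TRUE_EFFECT:.3f}")
```

In this setup the proxy does not remove the bias, but it shrinks it: the estimate with the noisy index should sit around -0.43, between the omitted-variable estimate (around -0.26) and the true -0.5. A noisier index would recover less.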

Of course, 1980 - 2000 doesn’t match the entire period the authors wish to consider, but there is nothing stopping them from looking at that subperiod, seeing if excluding crack alters the results in any meaningful way, and then using that information to gauge the costs of omitting it more generally. After all, if there is no apparent bias from omission during the 1980 - 2000 period, then there almost certainly isn’t one during the 2000 - 2012 period; and if there is one, then that is likely the upper bound on whatever bias persists into the 2000s. To simply ignore the issue because the time periods don’t align is not a convincing approach.

So, does the omission of crack bias their results? It almost certainly causes them to understate the effect of incarceration in the 1980s and 1990s. But we live today, not then. What does it mean for today? The big questions are (1) how much does higher crack use lead to higher incarceration and crime rates, and (2) how important is the omission of crack offenses on the crime side of the model? Both of these are tough questions, but both also show the need to be careful when interpreting the report’s findings. At the very least, we should continue to be concerned that they are underestimating the effect of incarceration. (But again, we should also be careful not to run too far with that and argue that this means current levels are efficient, which they almost certainly are not.)

 

* For those unfamiliar with OVB and how it skews empirical results, I wrote up a brief primer here.

** I really wonder what this means. Was the data corrupted? In some sort of format that made it hard to share (not all final datasets are neatly and cleanly assembled)? Or did she refuse to share the data?

*** So does a decline in crack lead to a decline in toughness? Maybe not—these sorts of shocks may be asymmetric if it is easier to be tough when things are bad than more-lenient when times are safer, which raises more concerns about how to properly model them statistically.

**** Recall that the true effect is likely negative, the correlation between crack and incarceration is positive, and the effect of crack on crime is positive. So the bias factor is positive (it’s a positive times a positive), and it’ll make the estimated effect less negative.

 

Posted by John Pfaff on February 19, 2015 at 09:40 AM
