Monday, May 15, 2017

Algorithms in Blue

A little later in the month I am going to preview my book, “The Rise of Big Data Policing: Surveillance, Race, and the Future of Law Enforcement” (releasing October 2017).  But, today, I wanted to discuss some new revelations out of Chicago about how predictive policing works in practice.

As some of you may know, certain police departments across America have adopted a “predictive policing” strategy that targets both the places where crime is forecast and the people predicted to be involved in crime.

The Chicago Police Department has been at the forefront of developing a predictive model to identify the individuals most at risk of violence.  The theory – arising from sociological studies – is that proximity to violent acts increases one’s risk of being the victim or perpetrator of violence.  Essentially, if you are a young man involved in Chicago’s gang culture and your friend is killed, you are statistically more likely to be shot yourself or to avenge the killing.  Your risk of violence is elevated due to your personal connection to violence and the cyclical nature of violence.

Police have taken this insight and created a rank-ordered list (scored 10 to 500+) of the high-risk offenders in the city.  They call it the “Strategic Subjects List” or, colloquially, the “heat list,” and it includes 1,400 names (although recent reports include a higher number).  Prior to last week, there was little information about which factors put a person on the list or how the risk scores were calculated.

But, last week, the Chicago Sun-Times released a fascinating story on who exactly gets on “the heat list.”

The following quotes are from the news article. 

The paper reported that “risk scores were based on eight factors, including arrests for gun crimes, violent crimes or drugs, the number of times the person had been assaulted or shot, age at the time of the last arrest, gang membership and a formula that rated whether the person was becoming more actively involved in crime.”

The release of the data ran counter to some beliefs that only those who had been arrested for repeat gun offenses made the list.  As reported by the Chicago Sun-Times, “Of those with the maximum score, nearly half — 48 percent — had never been arrested for unlawful use of a weapon, the charge typically leveled for crimes involving an illegally owned gun. Another 30 percent had been arrested once.”

That said, “87 percent of those with the top score had been arrested for some kind of violent offense” and “63 percent had been shot before.” 

I have written a bit about the constitutional implications of this predictive scoring system and some of the general dangers of these predictive systems, but this is a great revelation about some of the data behind data-driven policing.  The entire Chicago Sun-Times article deserves a read.  For the first time in a while, we have a bit of transparency in the world of predictive policing, with actual figures about who makes these lists and why.

Posted by Andrew Guthrie Ferguson on May 15, 2017 at 12:01 PM


For big data to constitute evidence for, well, anything, it should have to conform to something like a Frye or Daubert standard. That may be achievable in some very controlled environments like statistical process control in a factory, and even then it is hard to do without knowing what data operations you want to do before you collect the data. For data harvested "in the wild", it is essentially impossible to establish repeatability, one of the cornerstones of the scientific method. In situations like credit scoring (the original big data application, before the term was invented) the costs of such inaccuracy are borne in the aggregate: lost business opportunity if the score is too low, defaults if the score is too high. Even for an individual consumer the effect is mostly limited to the price of a financial service, and the scoring method is transparent enough that credit repair is possible.

Even if predictive modeling can be shown to work in a technical sense, that doesn't necessarily make it a good idea. Who is most likely to shoot a police officer? Probably someone who was a victim of police abuse. A predictive model can be offered as a "value-neutral" justification for a vendetta.

That doesn't mean that such big data can't be used for public policy. Crummy big data scraped in from the wild is better than no data at all. I would submit, for example, that the rather low weight attached to previous gun crime (lower than the weight of being a shooting *victim*) undermines the justification for gun control laws.

Posted by: M. Rad. | May 16, 2017 8:37:43 PM
