2016-03-09

On a sweltering Monday in late June 2015, the city council in Charlotte, North Carolina, met to discuss, among other items in a seven-hour marathon, how to carry out a controversial new approach to predicting police misconduct. Opinions were divided, and the discussion was tense. One council member was afraid of “upsetting the troops.” A second called the use of data about individual police officers an invasion of privacy. In response, another said, “I’m always a fan of third parties looking over our shoulder.”

Finally, Kerr Putney, soon to be sworn in as Charlotte’s new police chief, got up to reassure the council. He spoke about the need to “balance public need versus what officers may want.” He seemed to persuade several members.

“So it won’t be used for retribution?” one asked. “Absolutely not,” Putney replied.

Minutes later, the council voted to work with a group of data scientists to develop a sophisticated system for predicting when cops will go bad. These researchers, part of the White House’s Police Data Initiative, say their algorithm can foresee adverse interactions between officers and civilians, ranging from impolite traffic stops to fatal shootings. Their system can suggest preventive measures — an appealing prospect for police departments facing greater scrutiny and calls for accountability. Two other large departments — the Los Angeles County Sheriff’s Department and the Knoxville Police Department — have signed on to use the research to develop new systems, and several other agencies have expressed interest. The scientists hope their method can serve as a template for stopping police misbehavior before it happens.

Many police departments have early warning systems — software that tracks each officer’s performance and aims to forecast potential problems. The systems identify officers with troubling patterns of behavior, allowing superiors to monitor these cops more closely or intervene and send them to counseling.

The researchers, a mixed group of graduate and undergraduate students at the University of Chicago with backgrounds in statistics, programming, economics and related disciplines, are trying to build a better kind of early warning system. They began their task last summer with a request from the Charlotte-Mecklenburg Police Department: Predict when police officers would participate in adverse interactions with civilians.

To build their early warning system, the University of Chicago group first looked for signals in the data that an officer might be going astray. They used a comprehensive data set of interactions between cops and the public gathered by Charlotte police officials over more than a decade. The researchers found that the most potent predictor of adverse interactions in a given year was an officer’s own history.1 Cops with many instances of adverse interactions in one year were the most likely to have them in the next year. Using this and other indicators, the University of Chicago group’s algorithm was better able than Charlotte’s existing system to predict trouble.
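For readers curious about what this kind of modeling looks like in practice, here is a minimal sketch in Python. It is not the team’s code: the data, the column names and the choice of logistic regression are all assumptions made for illustration, with an officer’s prior-year count of adverse interactions standing in for the strongest predictor described above.

```python
# Minimal, hypothetical sketch of an officer-level early warning model.
# The data, column names and model choice are invented for illustration;
# the real system draws on a far richer set of features.
import pandas as pd
from sklearn.linear_model import LogisticRegression

# One row per officer per year: how many adverse interactions occurred.
records = pd.DataFrame({
    "officer_id":    [1, 1, 2, 2, 3, 3, 4, 4],
    "year":          [2013, 2014, 2013, 2014, 2013, 2014, 2013, 2014],
    "adverse_count": [0, 0, 4, 3, 1, 0, 5, 6],
})

# Feature: last year's count. Label: any adverse interaction this year?
records = records.sort_values(["officer_id", "year"])
records["prior_count"] = records.groupby("officer_id")["adverse_count"].shift(1)
records["label"] = (records["adverse_count"] > 0).astype(int)
train = records.dropna(subset=["prior_count"]).copy()

model = LogisticRegression()
model.fit(train[["prior_count"]], train["label"])

# Rank officers by predicted risk for the coming year.
train["risk"] = model.predict_proba(train[["prior_count"]])[:, 1]
print(train[["officer_id", "year", "risk"]].sort_values("risk", ascending=False))
```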

The algorithm holds great promise, thanks both to its cleverness and to its accuracy. But the idea of using statistical models to predict police misconduct is not new, and past efforts have often met with resistance. In fact, the Chicago Police Department — now under intense federal scrutiny in the wake of the Laquan McDonald shooting — constructed such an algorithm more than 20 years ago, only to abandon it under pressure from the officers’ union, the Fraternal Order of Police.

Ironically, the Chicago Police Department crafted the approach partly because of criticism from the union. A previous early warning system, based on the recommendations of departmental supervisors, was itself the subject of a union grievance calling it subjective and inconsistent. Seeking a more objective way to identify troubled officers, the Internal Affairs office bought state-of-the-art software from a company called California Scientific. The software was to use neural-network-based data analysis to build models for predicting which officers would be fired for misconduct.

The department’s forward-thinking approach immediately attracted press attention. Articles in outlets as diverse as Scientific American and Playboy praised the sophistication of the method while raising questions about its moral dimensions: Could any computer algorithm, they asked, take into account the particular circumstances of officers’ behavior and foresee whether they would commit misconduct?

The list of predictive factors Internal Affairs found using the software is consistent with other studies of police misconduct, including my own. Along with each officer’s past history of complaints, Internal Affairs identified personal stressors linked to bad behavior. If an officer had recently divorced or gone into serious debt, for example, he was flagged by the algorithm as more likely to commit misconduct in the future. Like employees of any other kind, cops are likely to see their job performance suffer when there is trouble in their personal lives.

The neural network didn’t last long: about two years from the first announcement to its formal shutdown. (And all its reports and predictions went missing at some point in that period.) Soon after the model produced its first predictions, the union intervened; its president, Bill Nolan, called the system “absolutely ludicrous.” In particular, he objected to the way administrators responded to the predictions: Internal Affairs handed over a list of about 200 officers to Human Resources, which called each one into the office for questioning that the union described as adversarial.2

Human Resources then recommended some officers for a counseling program (about half of the flagged officers were already enrolled in counseling because of previous bad behavior). Nolan said police officers were being punished for crimes they had not yet committed.

At the time, the notion of using predictive analytics to forecast potentially criminal behavior was still quite foreign. Although 27 percent of departments reported using some kind of early warning system in 1999 (according to a Department of Justice study), most existing models were simple, based either on supervisor observations or on an officer’s exceeding a certain number of complaints in a given period. (Both Chicago’s current system and Charlotte’s previous algorithm use such thresholds.) The idea of a more sophisticated algorithm seemed spooky back then, and union leaders called the Chicago PD’s model a “crystal-ball thing.” Mark Lawrence, the CEO of California Scientific, received a handful of inquiries from other police departments, but he said interest in his software dropped off rapidly after the union’s well-publicized objections. (The Fraternal Order of Police did not respond to multiple requests for comment for this story.)

Now, advanced statistical models designed to quantify, measure and predict crime have become common tools in police departments across the country. But the Chicago experience illustrates the difficulty in getting smart early warning systems to stick. By developing their algorithm in partnership with officers, the University of Chicago team hopes to gain acceptance from the Charlotte police force as a whole and not just its administrators.

Instead of identifying and intimidating officers at risk of being fired, the new algorithm seeks to head off troublesome behavior. The team worked in close collaboration with the Charlotte police force, which volunteered for the program to upgrade its early warning system and prevent misconduct, despite having a relatively low rate of officer-involved violence. Over the course of several ridealongs, focus groups and field interviews with officers, the data scientists developed a feel for some of the challenges of modern policing. Officers “deal with a lot of things the typical person on the street doesn’t really think about,” said Joe Walsh, a data scientist and mentor assigned to supervise the project.

In addition to the prognostic value of past complaints, the University of Chicago group uncovered some less obvious factors that may predict police misconduct. Incidents that officers deemed stressful were a major contributor; cops who had taken part in suicide and domestic-violence calls earlier in their shifts were much more likely to be involved in adverse interactions later in the day. Notably, although stressful calls emerged as a powerful predictor, dispatchers currently have no way to take into account the number or kind of calls an officer has already handled during a shift when deciding whom to send to the next one.
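To give a sense of how a shift-level signal like this might be computed, the sketch below derives a “stressful calls so far this shift” feature from a hypothetical table of dispatch records. The table, its column names and the list of call types counted as stressful are all invented for illustration.

```python
# Hypothetical sketch: deriving a within-shift "stressful calls so far"
# feature from a table of dispatch records. All data and names are invented.
import pandas as pd

dispatches = pd.DataFrame({
    "officer_id": [1, 1, 1, 2, 2],
    "shift_id":   ["A", "A", "A", "B", "B"],
    "call_time":  pd.to_datetime([
        "2015-06-01 08:10", "2015-06-01 10:30", "2015-06-01 13:05",
        "2015-06-01 09:00", "2015-06-01 11:45",
    ]),
    "call_type": ["traffic", "suicide", "traffic", "domestic_violence", "noise"],
})

STRESSFUL = {"suicide", "domestic_violence"}
dispatches = dispatches.sort_values(["officer_id", "shift_id", "call_time"])
dispatches["is_stressful"] = dispatches["call_type"].isin(STRESSFUL).astype(int)

# For each call, count the stressful calls the officer handled earlier in
# the same shift (the current call itself is excluded).
dispatches["stressful_so_far"] = (
    dispatches.groupby(["officer_id", "shift_id"])["is_stressful"].cumsum()
    - dispatches["is_stressful"]
)
print(dispatches[["officer_id", "call_time", "call_type", "stressful_so_far"]])
```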

Charlotte’s previous early intervention system was based on a simple threshold formula: Cops who had more than three adverse interactions in the past 180 days were flagged as a risk for problems in the next year. Supervisors then decided what corrective action to take, from changing the officers’ duties to recommending them for counseling.
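That threshold rule is simple enough to express in a few lines of code. The sketch below is illustrative only: the event data and column names are invented, and the real system runs on Charlotte’s internal records.

```python
# Hypothetical sketch of a threshold rule like the one described above:
# flag any officer with more than three adverse interactions in the past
# 180 days. The data and column names are invented for illustration.
import pandas as pd

events = pd.DataFrame({
    "officer_id": [1, 1, 1, 1, 2, 3, 3],
    "event_date": pd.to_datetime([
        "2015-01-05", "2015-02-10", "2015-03-01", "2015-05-20",
        "2015-04-11", "2014-07-01", "2015-06-15",
    ]),
})

as_of = pd.Timestamp("2015-06-30")
recent = events[events["event_date"] > as_of - pd.Timedelta(days=180)]
counts = recent.groupby("officer_id").size()
flagged = counts[counts > 3].index.tolist()
print(flagged)  # officers exceeding the threshold as of the cutoff date
```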

To compare the old system’s effectiveness with that of their new algorithm, the University of Chicago researchers broke their data into yearlong chunks. After training the algorithm to look for variables that predicted adverse interactions in past years, they tested the algorithm on more recent data to see whether it could make accurate predictions for individual officers, taking into account whether they had been disciplined or sent to counseling during the year.
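Here is a minimal sketch of that style of evaluation: fit a model on earlier years, hold out a later year, and measure sensitivity, the share of truly problematic officers the model flags. The data, column names and choice of model are again assumptions for illustration, not the researchers’ actual pipeline.

```python
# Hypothetical sketch of a temporal train/test split: train on earlier
# officer-years, evaluate on a held-out later year. All data are invented.
import pandas as pd
from sklearn.linear_model import LogisticRegression
from sklearn.metrics import recall_score

officer_years = pd.DataFrame({
    "year":        [2013, 2013, 2013, 2013, 2014, 2014, 2014, 2014],
    "prior_count": [0, 4, 1, 5, 0, 3, 0, 6],
    "label":       [0, 1, 0, 1, 0, 1, 0, 1],  # adverse interaction this year?
})

train = officer_years[officer_years["year"] < 2014]
test = officer_years[officer_years["year"] == 2014]

model = LogisticRegression().fit(train[["prior_count"]], train["label"])
flags = model.predict(test[["prior_count"]])

# Sensitivity (recall): of the officers who actually had adverse
# interactions in the test year, what share did the model flag?
print("sensitivity:", recall_score(test["label"], flags))
```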

The researchers say the prior system overestimated the number of officers at risk: 50 percent of the flagged officers in the data set did not go on to participate in an adverse interaction in the next year. And this was not because of successful interventions by the police department, since officers flagged by the system were often not directed to counseling.

The new algorithm flags 15 percent fewer officers than the old one, creating a smaller list for the department to monitor. (The researchers did not provide me with absolute numbers.) Yet it correctly identifies more of the officers who go on to participate in adverse interactions in the following year, meaning the new system is both more precise, flagging fewer officers who won’t cause problems, and more sensitive, catching more of those who will.

Because the algorithm is still in a pilot phase and not in active use, the researchers don’t know whether their predictions will translate into interventions that reduce the probability of adverse interactions. But three separate studies of police agencies that implemented early warning systems (in Miami, Minneapolis and New Orleans) have shown that targeted intervention can reduce citizen complaints against officers by as much as 66 percent over two to three years. The more accurate the predictions, the more powerful those interventions become, and the more adverse interactions can be prevented.