03/01/2024 | Press release | Distributed by Public on 03/01/2024 11:02
One of the first steps in a rape investigation[1] is the responding officer's written report. What the officer includes and how those conclusions are worded can have an impact on the case.
In a study sponsored by the National Institute of Justice (NIJ) that used cross-disciplinary research, data scientists applied machine learning techniques to nearly two decades' worth of police reports on rape cases.
The data scientists used advanced computational power to support social scientists in a study of how evidence of officer sentiment - meaning opinions and subjectivity - toward victims' credibility may affect key procedural decisions down the line, such as whether to prosecute a rape case.
The study - conducted by a team of scholars from Case Western Reserve University, Cleveland State University, and Texas A&M University - aimed to identify linguistic "signaling" of officers' views or biases in the narratives of their rape reports.
The research team evaluated narratives in more than 5,600 police reports of rape filed in one large urban jurisdiction from 1993 to 2011. Using sentiment analysis, a form of natural language processing that screens text for words or phrases carrying hidden emotions - such as dissatisfaction, happiness, or doubt - the researchers detected and interpreted evidence of officer emotion or bias in the narratives. (See "Natural Language Processing: AI Dives into Human Narratives" and "Leveraging the Nuance of Qualitative Analysis on a Larger Scale.")
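To make the idea of screening narratives for emotionally loaded wording more concrete, here is a minimal sketch of lexicon-based scoring. It is an illustration only and not the method the research team used: the word list `DOUBT_TERMS`, the function names, and the sample sentence are all hypothetical, and real sentiment analysis tools rely on much larger, validated lexicons or trained models.

```python
import re

# Hypothetical toy lexicon (an assumption for illustration, not from the study):
# words that may signal a writer's doubt about a statement.
DOUBT_TERMS = {"claimed", "alleged", "allegedly", "supposedly", "uncooperative"}


def tokenize(text):
    """Lowercase the text and split it into simple word tokens."""
    return re.findall(r"[a-z']+", text.lower())


def doubt_score(narrative):
    """Return the share of tokens found in DOUBT_TERMS (0.0 for empty text)."""
    tokens = tokenize(narrative)
    if not tokens:
        return 0.0
    hits = sum(1 for token in tokens if token in DOUBT_TERMS)
    return hits / len(tokens)


def flagged_terms(narrative):
    """Return the distinct lexicon words found in the narrative, sorted."""
    return sorted({token for token in tokenize(narrative) if token in DOUBT_TERMS})


if __name__ == "__main__":
    # Invented sentence, not drawn from any actual police report.
    sample = "The victim claimed she was allegedly followed."
    print(flagged_terms(sample))
    print(round(doubt_score(sample), 3))
```

A score like this only counts matching words. It cannot tell whether a flagged word is used fairly in context, which is why the researchers' interpretation of the flagged language remains essential.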
The study demonstrates the power of algorithm-based technology to help social and behavioral sciences address pressing social issues. Cross-disciplinary research, like this joint effort of data scientists and social scientists, is a key priority of NIJ Director Nancy La Vigne.
The research yielded important new insights on the relationship between police incident reports and case outcomes. In considering these findings, it is important to note that, overall, the reports did not contain high levels of sentiment. Findings from the study include:
The researchers also noted that, overall, reports did not change much in terms of sentiment levels or report length over the two-decade study period.
The researchers identified several important implications of the collaborative study for best practices, including that:
Researchers highlighted some examples of factually unsupported "signaling" statements from officer incident reports, including:
The researchers observed, with respect to these statements from different police incident reports:
The report writer does not provide detail as to why there were no bruises and disheveled clothes, why a victim's prior sexual history or being a "known prostitute" [is] mentioned or relevant. . . However, without that important next statement qualifying why the factual statement is pertinent to the investigation, a human likely reads this as signaling - disbelieving the victim's statements and/or blaming a victim for what happened to them.
This cross-disciplinary research broke new ground by using machine learning analysis to gain insight into the significance of police incident report language in rape investigations. The study identified how certain words could signal officer attitudes regarding victim credibility and possibly foreshadow assault case outcomes.
Importantly, the NIJ-supported research enabled researchers to leverage the nuances of qualitative (or narrative-based) data on a scale previously seen only in quantitative (or numbers-driven) studies.
The research described in this article was funded by NIJ award 2018-VA-CX-0002, awarded to Case Western Reserve University. This article is based on the grantee report "Using Sentiment Analysis and Topic Modeling in Assessing the Impact of Police Signaling on Investigative and Prosecutorial Outcomes in Sexual Assault Reports," by Rachel Lovell, Joanna Klingenstein, Jiaxin Du, Laura Overman, Danielle Sabo, Danielle Flannery, and Xinyue Ye.
Machine learning is an artificial intelligence (AI) application that mimics the human brain's ability to learn from experience. From the criminal justice perspective, a critical function of machine learning is pattern recognition. Self-learning algorithms use datasets to understand how to identify people from images, complete intricate computational and robotics tasks, detect medical conditions from complex scans, and understand online purchasing habits.
Natural language processing is a branch of machine learning designed to enable computers to process language the way that humans do. Although computers have surpassed humans in data-driven calculations, it was long believed that computers could not master qualitative tasks such as analyzing or creating narratives. Recent developments in AI, however, show that machines can perform qualitative tasks.
Quantitative research collects and analyzes numerical data to measure variables. Qualitative research collects non-numerical data to gain insights on a subject. It generally measures views and attributes rather than hard numbers. Qualitative research adds a human voice and narrative to research, creating a human context for research findings.
Natural language processing enabled the investigators to, in the words of their report, "leverage the nuance of qualitative research on a scale previously seen only in quantitative assessments."
Media coverage, public interest, and concern surrounding AI products that perform human cognitive tasks, such as producing academic essays that mimic human composition, underscore how far and how quickly natural language processing has evolved.