Unsupervised identification of crime problems from police free-text data

Birks et al. Crime Sci (2020) 9:18 https://doi.org/10.1186/s40163-020-00127-4

Open Access

RESEARCH

Daniel Birks1,2*, Alex Coleman2 and David Jackson3

Abstract

We present a novel exploratory application of unsupervised machine-learning methods to identify clusters of specific crime problems from unstructured modus operandi free-text data within a single administrative crime classification. To illustrate our proposed approach, we analyse police-recorded free-text narrative descriptions of residential burglaries occurring over a two-year period in a major metropolitan area of the UK. Results of our analyses demonstrate that topic modelling algorithms are capable of clustering substantively different burglary problems without prior knowledge of such groupings. Subsequently, we describe a prototype dashboard that allows replication of our analytical workflow and could be applied to support operational decision making in the identification of specific crime problems. This approach to grouping distinct types of offences within existing offence categories, we argue, has the potential to support crime analysts in proactively analysing large volumes of modus operandi free-text data, with the ultimate aims of developing a greater understanding of crime problems and supporting the design of tailored crime reduction interventions.

Keywords: Policing, Burglary, Unstructured data, Text mining, Machine learning

*Correspondence: [email protected]
1 School of Law, University of Leeds, Leeds, UK
Full list of author information is available at the end of the article

Background

When a crime is recorded by police, situational and behavioural information describing the incident is often captured in a free-text narrative account. These data, commonly referred to as Modus Operandi (MO) notes, are routinely used for both administrative and investigatory purposes. Yet beyond these functions, such accounts also have the potential to increase understanding of specific crime problems in ways that support crime reduction efforts. The key challenge in realising this potential relates to the unstructured nature of MO notes. Put simply, free-text data are more challenging to analyse at scale than structured measurements of crime events such as quantity, location, and time, all of which immediately lend themselves to traditional analytical approaches such as measuring rates of offending, examining spatial distributions across neighbourhoods, or measuring hourly, weekly and monthly variations. These challenges dictate that potentially actionable insights into specific crime problems can be lost in the categorisation of crime events into tractable but restrictively homogeneous groupings.

In response to this problem, this paper proposes a novel method for automatically clustering specific crime problems within existing crime categories based on the application of unsupervised text-mining algorithms to narrative crime report data. This approach, we argue, has the potential
