Spatial Analysis of Crime Point Data in ArcGIS Pro

Spatial analysis consists of "methods to study the location, distribution, and relationship of spatial phenomena" (Bäing 2014).

Because geospatial data is fundamentally quantitative, these methods are often based on complex statistics that can be impenetrable to most GIS users. However, the fundamental ideas behind these methods are often conceptually straightforward, and because the algorithms are implemented in software, most GIS users need only a general understanding of the capabilities and limitations of these methods to use them effectively.

This module demonstrates some spatial analysis methods that can be used with point data in ArcGIS Pro. While this tutorial uses Chicago crime data as an example, these techniques can be applied to any data consisting of points for a single type of phenomenon.

Crime Data Limitations

Some major cities make georeferenced crime location data available for analysis through their open data portals. This data has a number of limitations that you should be aware of as you perform your analysis:

Acquire and Process the Data

Crime Data

The Chicago Police Department (CPD) provides a dataset of reported incidents of crime (with the exception of murders where data exists for each victim) that occurred in the City of Chicago since 2001. This dataset is made available through a dashboard and web map for convenient online analysis, and is also made available as raw CSV point data that can be imported into GIS. The original source of the data is the Chicago Police Department's CLEAR (Citizen Law Enforcement Analysis and Reporting) system.

The Chicago data portal uses the Socrata web app, which permits download of the data as CSV files with columns of latitude and longitude.

The app also provides a facility for filtering data. Because this data covers a variety of crimes over more than 20 years, downloading the millions of points and 1.6 gigabytes of data will result in a dataset that is very slow to analyze in ArcGIS Pro.

Therefore, in this example, a filter is used to download only data for the years 2019 and 2023.

Point data in CSV files with longitudes and latitudes can be imported into ArcGIS Pro and stored in the project database.

  1. Create a new project and a new map.
  2. Download a CSV file of crime data from the Chicago Data Portal, filtered to show only 2019 and 2023.
  3. Under Analysis, Tools, open the XY Table To Point tool to import the points into the project geodatabase.
Downloading 2019 and 2023 Chicago crime data from the Chicago Data Portal and importing it into a project geodatabase
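
The year filter can also be applied to a downloaded CSV outside of ArcGIS Pro. The following is a minimal sketch using only the Python standard library. The sample rows are made up, and the column names (Primary Type, Year, Latitude, Longitude) are assumptions about the portal's CSV export that should be checked against the file you download.

```python
import csv
import io

# Hypothetical sample rows; column names are assumed to match the portal export.
SAMPLE_CSV = """ID,Primary Type,Year,Latitude,Longitude
1,ROBBERY,2019,41.8819,-87.6278
2,THEFT,2021,41.7500,-87.6500
3,ROBBERY,2023,41.9000,-87.7000
4,BATTERY,2023,,
"""

def read_points(csv_text, years):
    """Return rows for the given years that have usable coordinates."""
    points = []
    for row in csv.DictReader(io.StringIO(csv_text)):
        if int(row["Year"]) not in years:
            continue  # outside the years being analyzed
        if not row["Latitude"] or not row["Longitude"]:
            continue  # rows without coordinates cannot be mapped
        row["Latitude"] = float(row["Latitude"])
        row["Longitude"] = float(row["Longitude"])
        points.append(row)
    return points

points = read_points(SAMPLE_CSV, years={2019, 2023})
print([p["ID"] for p in points])
```

To use a real download, replace io.StringIO(csv_text) with an open file handle for the CSV.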

Neighborhoods

While neighborhoods are vernacular areas that often have unclear and contested boundaries, cities commonly create data files for neighborhoods that define unofficial boundaries that are useful for reference and context.

The City of Chicago makes neighborhood boundaries available as shapefiles that can be downloaded, unzipped, and imported into ArcGIS Pro.

Downloading and importing neighborhood boundaries

Demographics

For this example, we use demographic data from the US Census Bureau's 2015-2019 American Community Survey five-year estimates. For convenience, we will use the Minn 2015-2019 ACS Tracts feature service from the University of Illinois ArcGIS Online organization.

Under Analysis, Tools, open the Export Features tool to import the boundaries into the project geodatabase.

Downloading and importing census tract demographics

Analyze the Data

ModelBuilder

ModelBuilder is a visual programming language in ArcGIS Pro that allows you to use a graphical editor to create custom tools. These tools automate complex, tedious, or repetitive tasks that follow a consistent step-by-step sequence of operations.

Using ModelBuilder, you graphically chain together sequences of tools from the toolbox. This will be useful for this example because you will be executing a long sequence of tools. Placing them in a ModelBuilder diagram will make it easier to keep track of what you are doing and will allow you to easily rerun the analysis if you need to modify or fix one step.

A ModelBuilder diagram

To start a new ModelBuilder diagram, on the Analysis ribbon, select ModelBuilder.

Creating a ModelBuilder diagram

Crime Points

The example in this tutorial focuses on robberies. The Chicago Police Department (2024) defines "robbery" as:

The taking or attempting to take anything of value from the care, custody, or control of a person or persons by force or threat of force or violence and/or by putting the victim in fear, including attempted offenses.

To facilitate analysis, we will separate the crime data into separate feature classes of robberies in 2019 and robberies in 2023.

  1. Add the Export Features tool to your diagram.

    • Input Features: Crime_Points (imported above)
    • Output Feature Class: Crime_2019
    • Expression: Add a filter to export only crimes of the type being analyzed (robbery, in this tutorial) that occurred in 2019. Also filter by latitude greater than 41 to exclude incorrectly geocoded points.
  2. Add a second Export Features tool to your diagram.

    • Input Features: Crime_Points (imported above)
    • Output Feature Class: Crime_2023
    • Expression: Add a filter to export only crimes of the type being analyzed (robbery, in this tutorial) that occurred in 2023. Also filter by latitude greater than 41 to exclude incorrectly geocoded points.
  3. Insert a new Map.
Filtering and mapping crime points
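
The logic of the two export expressions can be sketched in plain Python. The records and key names below are hypothetical and only illustrate the three conditions (crime type, year, and latitude greater than 41). In ArcGIS Pro you would build the equivalent expression in the Export Features tool.

```python
# Hypothetical records; the keys are illustrative, not the portal's exact field names.
records = [
    {"id": 1, "primary_type": "ROBBERY", "year": 2019, "latitude": 41.88},
    {"id": 2, "primary_type": "ROBBERY", "year": 2023, "latitude": 41.75},
    {"id": 3, "primary_type": "THEFT", "year": 2019, "latitude": 41.90},
    {"id": 4, "primary_type": "ROBBERY", "year": 2019, "latitude": 36.62},  # badly geocoded
]

def select_crimes(records, crime_type, year, min_latitude=41.0):
    """Keep records of one crime type and year that fall north of min_latitude."""
    return [
        r for r in records
        if r["primary_type"] == crime_type
        and r["year"] == year
        and r["latitude"] > min_latitude
    ]

crime_2019 = select_crimes(records, "ROBBERY", 2019)
crime_2023 = select_crimes(records, "ROBBERY", 2023)
print(len(crime_2019), len(crime_2023))
```

Record 4 is dropped even though it is a 2019 robbery, because its latitude places it far south of the city.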

Viewing Statistics

You can Explore Statistics by selecting columns in the Attribute Table.

Viewing statistics in the attribute table

Save ModelBuilder

You should periodically save your ModelBuilder diagrams from the ModelBuilder ribbon as you work on your project. This will mitigate loss of work if the software crashes.

Note that the ModelBuilder Save button is separate from the Save Project button at the top of the screen for the project as a whole.

Make sure to Save the ModelBuilder diagram before exporting a project package, or your changes may not be saved in the package.

Saving a ModelBuilder diagram

Neighborhood Crime Counts

A spatial join copies data from a join data set into a target data set based on the proximity of features in the two data sets.

Spatial joins can also be used to count the number of points within polygons. In this step, we use a spatial join to get counts of crime points within neighborhood polygons.

Figure: Spatial joins for counts
  1. Add the Join Features tool to your diagram to perform a spatial join to get neighborhood crime counts for 2019.
  2. Add the Alter Field tool to rename the new Count field.
  3. Add the Join Features tool to your diagram to perform a spatial join to get neighborhood crime counts for 2023.
  4. Add the Alter Field tool to rename the new Count field.
  5. Insert a new Map.
  6. View the Attribute Table if you wish to know the neighborhoods with the highest and lowest crime counts.
Neighborhood robbery counts in 2023
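
Conceptually, a spatial join for counts tests each point against each polygon and tallies the matches. The sketch below uses a standard ray-casting point-in-polygon test on made-up square "neighborhoods" in arbitrary planar units. It illustrates the idea only and is not how the Join Features tool is implemented.

```python
def point_in_polygon(x, y, polygon):
    """Ray-casting test: count how many polygon edges a ray to the right crosses."""
    inside = False
    n = len(polygon)
    for i in range(n):
        x1, y1 = polygon[i]
        x2, y2 = polygon[(i + 1) % n]
        if (y1 > y) != (y2 > y):  # edge straddles the horizontal line through y
            x_cross = x1 + (y - y1) * (x2 - x1) / (y2 - y1)
            if x < x_cross:
                inside = not inside
    return inside

def count_points(polygons, points):
    """Return {polygon name: number of points inside it}."""
    counts = {name: 0 for name in polygons}
    for x, y in points:
        for name, polygon in polygons.items():
            if point_in_polygon(x, y, polygon):
                counts[name] += 1
                break  # each point is counted in at most one polygon
    return counts

# Hypothetical neighborhoods and crime points
neighborhoods = {
    "A": [(0, 0), (2, 0), (2, 2), (0, 2)],
    "B": [(2, 0), (4, 0), (4, 2), (2, 2)],
}
crime_points = [(1, 1), (0.5, 1.5), (3, 1), (5, 5)]

counts = count_points(neighborhoods, crime_points)
print(counts)
```

The point at (5, 5) falls outside both polygons and is not counted anywhere, which is also what happens to crime points outside every neighborhood boundary.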

Neighborhood Demographics

Demographics are "the statistical characteristics of human populations (such as age or income)" (Merriam-Webster 2024).

To analyze the relationship of crime to community characteristics, we need demographic data for the neighborhoods.

The neighborhood data from the City of Chicago contains only boundaries, so we will have to use another spatial join to aggregate that information with census tract demographic data from the US Census Bureau's American Community Survey.

Figure: Spatial joins for attributes
  1. Add the Feature to Point tool to your diagram. This will convert the tract data to centroid points so that tracts that overlap neighborhood boundaries are only counted in one neighborhood.
  2. Add the Join Features tool to your diagram.
  3. Insert a new Map.
Neighborhood demographics
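
The reason for converting tracts to centroids can be sketched as follows: each tract is reduced to a single point, so its population is added to exactly one neighborhood even when the tract polygon straddles a boundary. The geometries and populations below are invented, and the centroid is computed with the standard shoelace formula, which may differ from the point placement options in the Feature to Point tool.

```python
def polygon_centroid(polygon):
    """Area-weighted centroid of a simple polygon (shoelace formula)."""
    area2 = cx = cy = 0.0
    n = len(polygon)
    for i in range(n):
        x1, y1 = polygon[i]
        x2, y2 = polygon[(i + 1) % n]
        cross = x1 * y2 - x2 * y1
        area2 += cross
        cx += (x1 + x2) * cross
        cy += (y1 + y2) * cross
    return cx / (3 * area2), cy / (3 * area2)

def point_in_polygon(x, y, polygon):
    """Ray-casting point-in-polygon test."""
    inside = False
    n = len(polygon)
    for i in range(n):
        x1, y1 = polygon[i]
        x2, y2 = polygon[(i + 1) % n]
        if (y1 > y) != (y2 > y) and x < x1 + (y - y1) * (x2 - x1) / (y2 - y1):
            inside = not inside
    return inside

# Hypothetical neighborhoods and census tracts (polygon, population)
neighborhoods = {
    "A": [(0, 0), (2, 0), (2, 2), (0, 2)],
    "B": [(2, 0), (4, 0), (4, 2), (2, 2)],
}
tracts = [
    ([(0, 0), (1, 0), (1, 1), (0, 1)], 1000),
    ([(1.5, 0), (2.3, 0), (2.3, 1), (1.5, 1)], 2000),  # straddles the A/B boundary
    ([(3, 0), (4, 0), (4, 1), (3, 1)], 500),
]

population = {name: 0 for name in neighborhoods}
for tract_polygon, tract_population in tracts:
    x, y = polygon_centroid(tract_polygon)
    for name, polygon in neighborhoods.items():
        if point_in_polygon(x, y, polygon):
            population[name] += tract_population
            break

print(population)
```

The straddling tract has its centroid at x = 1.9, so all 2,000 of its residents are assigned to neighborhood A. This is a simplification: residents living in the part of the tract inside B are attributed to A.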

Neighborhood Rates

Neighborhoods vary by population, and places with more people could be expected to have more crime, so maps of crime counts often become simple maps of population that obscure where crime is actually a more serious problem.

Normalization is the adjustment of variable values to a common scale so they are comparable across space and time (Wikipedia 2023).

In this example we normalize crime counts by dividing by population to get crime rates that are comparable across different neighborhoods.

The small numbers problem occurs when small changes in counts create exceptionally large changes in rates because the population or incidence counts are small (Taklar et al. 2009). In this case, because the number of reported robberies per neighborhood is often small (there was one reported robbery in the Museum Campus neighborhood), the small numbers problem can cause the results to overstate or understate the severity of the condition.

  1. Add an Export Features tool to your diagram. You need to duplicate the Neighborhoods feature class because ModelBuilder will not add the same created feature class to multiple maps.

    • Input Features: Demographics
    • Output Feature Class: Rates_2023
  2. Add the Calculate Field tool to your diagram.
  3. Insert a new Map.
Neighborhood robbery rates in 2023
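
The rate calculation and the small numbers problem can be illustrated with a short sketch. The counts and populations below are hypothetical, and the choice of a rate per 1,000 residents is an assumption made for the example.

```python
def crime_rate(count, population, per=1000):
    """Crimes per `per` residents, or None when the population is zero."""
    if population == 0:
        return None  # a rate is undefined without residents
    return count * per / population

# Hypothetical large neighborhood: one extra robbery barely moves the rate.
print(crime_rate(500, 50000), crime_rate(501, 50000))

# Hypothetical tiny neighborhood: one extra robbery doubles the rate.
print(crime_rate(1, 50), crime_rate(2, 50))
```

In the Calculate Field tool, the same division would be written as an expression on the count and population fields, with similar care taken for neighborhoods that have no residents.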

Neighborhood Change

Changes in the density of crime over time can be useful for assessing the success of past interventions, and identifying emerging areas that may need additional intervention in the future.

Since we have crime counts for two different years, we can map percentage change between those two periods.

  1. Add an Export Features tool to your diagram. You need to duplicate the feature class because ModelBuilder will not add the same created feature class to multiple maps.

    • Input Features: Rates_2023
    • Output Feature Class: Change_2023
  2. Add the Calculate Field tool to your diagram.
  3. Insert a new Map.
Neighborhood crime percent change between 2019 and 2023
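
Percent change is the difference between the two years divided by the earlier value. The sketch below uses hypothetical counts and returns None when the 2019 value is zero, since the change from zero is undefined. How you handle that case in the Calculate Field tool is a design choice.

```python
def percent_change(old, new):
    """Percent change from old to new, or None when old is zero."""
    if old == 0:
        return None  # division by zero: change from nothing is undefined
    return (new - old) * 100 / old

# Hypothetical neighborhood robbery counts as (2019, 2023)
print(percent_change(40, 50))   # an increase
print(percent_change(10, 5))    # a decrease
print(percent_change(0, 3))     # undefined
```

Note that small base values exaggerate percent change in the same way that small populations exaggerate rates.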

Hot Spot Analysis

While the counts and rates displayed above by neighborhood may be adequate for your needs, maps can be deceiving.

With data like this, it can be helpful to analyze areas in the context of their immediate neighbors and the area as a whole.

The Hot Spot Analysis tool calculates the Getis-Ord Gi* statistic (pronounced gee eye star) for each feature in the context of its neighbors and reports the results as z-scores and p-values that indicate whether an area is a statistically significant hot spot or cold spot.

The use of calculated probabilities makes this tool more rigorously defensible in research than simple visual inspection of points.

  1. Add the Hot Spot Analysis tool to your diagram.
  2. Insert a new Map.
Getis-Ord Gi* hot spot analysis
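
To make the statistic less of a black box, the following sketch computes Gi* z-scores for a handful of invented features. It assumes binary weights within a fixed distance band (each feature counts itself and every neighbor within the band equally). The Hot Spot Analysis tool offers other ways of defining neighbors, so its results will not necessarily match this simplified version.

```python
import math

def getis_ord_gi_star(values, coords, distance_band):
    """Gi* z-score for each feature using binary fixed-distance weights."""
    n = len(values)
    mean = sum(values) / n
    s = math.sqrt(sum(v * v for v in values) / n - mean ** 2)
    z_scores = []
    for xi, yi in coords:
        # Weight 1 for every feature within the band, including the feature itself
        weights = [
            1.0 if math.hypot(xi - xj, yi - yj) <= distance_band else 0.0
            for xj, yj in coords
        ]
        sum_w = sum(weights)
        sum_w2 = sum(w * w for w in weights)
        numerator = sum(w * v for w, v in zip(weights, values)) - mean * sum_w
        denominator = s * math.sqrt((n * sum_w2 - sum_w ** 2) / (n - 1))
        # The denominator is zero when all values are equal or all features are neighbors
        z_scores.append(numerator / denominator if denominator else 0.0)
    return z_scores

# Hypothetical features along a line: high values on the left, low on the right
coords = [(0, 0), (1, 0), (2, 0), (3, 0), (4, 0)]
values = [10, 12, 11, 1, 2]

z = getis_ord_gi_star(values, coords, distance_band=1.0)
for (x, _), score in zip(coords, z):
    print(x, round(score, 2))
```

Features surrounded by high values receive positive z-scores (hot spots), and features surrounded by low values receive negative z-scores (cold spots). A feature with a high value but low-valued neighbors does not score as a hot spot, which is the point of analyzing areas in the context of their neighbors.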

View Python

Behind the scenes, ModelBuilder creates Python code that you can view.

On the ModelBuilder ribbon, click the Export options (green arrow) and select Send To Python Window to view or copy the code.

Viewing ModelBuilder Python code

Crime and Social Variables

While having a descriptive understanding of where crime has occurred is useful, often we want to gain some understanding of why it occurred where it occurred (explanatory model), or be able to have some idea about where it could occur in the future (predictive model).

Spatial analysis provides a wide variety of techniques for relating variables to each other in order to build models. While crime, like most social phenomena, is driven by complex chains of causation and elements of randomness, it is possible to build models that offer insights into these processes. The examples below are highly simplified and do not provide a particularly strong fit with crime, but they are provided to offer some insights into the types of modeling that you can do with GIS.

Social Disorganization Theory

Social disorganization theory is a social ecology theory that asserts that crime is the result of an "inability of a community structure to realize the common values of its residents and maintain effective social controls" (Lersch 2004, 46; Sampson and Groves 1989, 177). Social disorganization theory has its roots in research performed in the School of Sociology at the University of Chicago beginning in the 1920s, most notably by Robert Park, Ernest Burgess, Clifford Shaw, and Henry McKay. Accordingly, these ideas are often referred to as Chicago School ideas.

Although the complexity of human society makes analysis of social disorganization similarly complex, a handful of neighborhood characteristics are theorized to lead to breakdowns in formal and informal social controls, and tend to correlate with higher levels of crime in specific areas (Lersch 2004, 50-53; 148; Sampson and Raudenbush 1999; 2001):

Although the American Community Survey tract data used in this example does not contain variables that directly measure the social disorganization theory factors, it does contain potential proxy variables that can be assumed to correlate with those factors. Those variables include:

Bivariate Correlation Charts

Correlation is "a relation existing between phenomena or things or between mathematical or statistical variables which tend to vary, be associated, or occur together in a way not expected on the basis of chance alone" (Merriam-Webster 2020).

While correlation does not prove that one of the measured characteristics causes the other, correlation is a common exploratory data analysis technique used to identify relationships in geospatial data that are deserving of further investigation.

A quick way to look for correlation between variables is to use an x/y scatter chart.

Figure: Example x/y scatter charts

Whether an R² value should be considered to indicate strong correlation depends on the type of phenomenon being studied.
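
For a straight-line fit between two variables, R² equals the square of the Pearson correlation coefficient. The sketch below computes it for invented values and also shows why a log scale can help when one variable is strongly skewed. The data are hypothetical and are not the tutorial's results.

```python
import math

def r_squared(xs, ys):
    """Squared Pearson correlation between two equal-length lists."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    sxx = sum((x - mean_x) ** 2 for x in xs)
    syy = sum((y - mean_y) ** 2 for y in ys)
    return sxy * sxy / (sxx * syy)

# Hypothetical values: y grows by a factor of ten for each step in x
xs = [1, 2, 3, 4]
ys = [10, 100, 1000, 10000]
log_ys = [math.log10(y) for y in ys]

print(round(r_squared(xs, ys), 3))      # linear fit on the raw values
print(round(r_squared(xs, log_ys), 3))  # linear fit after a log transform
```

The raw values give a noticeably lower R² than the log-transformed values, even though the underlying relationship is perfectly regular.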

To create an x/y scatter chart:

  1. Select the layer you want to analyze in the Contents pane and then select Data, Visualize, Create Chart, Scatter Plot.
  2. Set the X and Y variables to examine correlations.
  3. Using a log scale can be useful when the values are skewed to the low end of the distribution.
  4. If desired, you can add the chart to a layout.

As shown in the video, the R² values for all four variables are fairly low (0.13 or less) and the X/Y scatter charts show a spreading pattern (heteroskedasticity) that indicates that high crime areas tend to be low income, but low crime areas can be both low and high income.

This is consistent with crime / adversity mismatch research that shows no consistent relationship between socioeconomic disadvantage and levels of violence (Manguel 2021).

Creating X/Y scatter charts

Ground Truth

In 1982, George L. Kelling and James Q. Wilson published a highly influential article entitled Broken Windows: The police and neighborhood safety that asserted a connection between crime and disorder as typified by the physical care of a neighborhood. If a window on a building is broken and no one fixes it, that is a sign that the people in the neighborhood do not care about their community. This leads to a breakdown in the informal community controls that normally hold crime in check.

Further research has suggested that broken windows is a fallacious confusion of correlation with causation (Thacher 2004), and the policing practices that follow from broken windows theory (like "stop-and-frisk") have been vigorously critiqued as counterproductive and socially unjust. However, the theory does lead us to ask questions about the relationship between crime and the built environment, something more fully fleshed out in urban planning practices like crime prevention through environmental design (CPTED) (Jeffery 1971).

Google Street View allows you to take a virtual walking tour of a neighborhood, and street view can be used to assess whether analysis performed in GIS is consistent with the conditions on the ground (ground truth). Care should be used in such qualitative analysis to mitigate the effects of confirmation bias that can reinforce preconceived notions and stereotypes rather than validate analysis.

You can copy coordinates from a map in ArcGIS Pro by right-clicking on the location in the map and selecting Copy Coordinates, and then search for that location in Google Maps.

Visiting a location virtually in Google Street View
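
If you have many locations to check, a link can be built from decimal-degree coordinates instead of pasting them by hand. The sketch below assumes the Google Maps URLs search format (api=1 with a query parameter) and uses a made-up downtown Chicago coordinate. The text produced by Copy Coordinates depends on the map's display units, so it may need to be converted to decimal degrees first.

```python
from urllib.parse import urlencode

def google_maps_url(latitude, longitude):
    """Build a Google Maps search link for a latitude/longitude pair."""
    if not (-90 <= latitude <= 90 and -180 <= longitude <= 180):
        raise ValueError("coordinates must be decimal degrees (lat, lon)")
    query = urlencode({"api": 1, "query": f"{latitude},{longitude}"})
    return "https://www.google.com/maps/search/?" + query

# Hypothetical location in downtown Chicago
url = google_maps_url(41.8819, -87.6278)
print(url)
```

The range check catches the common mistake of swapping latitude and longitude only when the swapped value falls outside the valid range. For Chicago, a swapped pair still passes, so the order should be checked by eye.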

Communicate the Results

These different types of visualizations can be integrated into a single layout to create an infographic summarizing your analysis.

Figure: Infographic layout

Neat Lines and Marginalia

  1. Insert a Layout sized 11" x 17" in landscape orientation.
  2. Under Properties, General, rename the layout to Infographic.
  3. Add rectangles as neat lines.
  4. Insert a Picture of the logo.
  5. Add Straight text for the title (Robbery in Chicago, 2023)
  6. Add a Rectangle text box for the credits:
Infographic marginalia

Main Map

  1. Map frame: 5 x 7 @ 1 x 2
  2. Hide the base map service credits.
  3. Add a Straight text for the caption (There were 11,054 reported robberies in Chicago in 2023)
  4. Use 24 point Arial to be consistent across the layout.
Main map

Small Maps with Legends

  1. Small map frames: 2.3 x 3
  2. Remove base map
  3. Add a legend
  4. Add caption text (18 pt Arial)
Small maps

Correlation Results

You can add the correlation x/y scatter chart to the layout.

X/Y scatter chart

Export PDF

Export PDF