Point Data Analysis in ArcGIS Pro

Rev. 14 August 2024

Geospatial point data is used to represent things that exist or events that occurred at specific locations on the surface of the earth. Examples of points include crime locations, animal nests, street trees, vehicle charging stations, cellphone towers, WiFi hotspots, GPS waypoints, etc. Areas are sometimes represented with points (centroids) for mapping and navigation, such as with restaurants, houses, stores, or transit stations.

This tutorial demonstrates basic techniques for analysis of point data in ArcGIS Pro.

Loading Point Data

The example point data used for this tutorial is 2022 electrical power plant location data from the US Energy Information Administration.

The oil market disruptions of 1973 resulted in the creation of a wide variety of federal systems and organizations for collecting and managing national energy data, and these functions were centralized in the US Energy Information Administration by the Department of Energy Organization Act of 1977. The US Energy Information Administration (EIA) "collects, analyzes, and disseminates independent and impartial energy information to promote sound policymaking, efficient markets, and public understanding of energy and its interaction with the economy and the environment" (EIA 2023).

Current data is also available directly from the EIA's US Energy Atlas.

An older CSV file and metadata containing field information is also available here.

Figure
Power plants data from the EIA

Loading a Feature Service

The following steps show how to import the power plant data from a feature service into the project geodatabase for visualization and analysis.

  1. Create a new project and give it a meaningful name (Minn 2024 Points).
  2. Go to the US Energy Atlas, find the Power Plants data and copy the Geoservice feature service URL.
  3. Under Analysis and Tools, open the Export Features tool to copy the data from the feature service into a new feature class in the project geodatabase.
  4. Symbolize as Unique Values from the Primary Source attribute to map the plants by fuel source.
Loading the power plant data

You can view the available data and fields in the Attribute Table.

Fields View offers data type information for each field.

Viewing field information

Load CSV Points

Points can also be loaded in from CSV files containing fields with latitude and longitude.

  1. Download the CSV file to your local storage. For this example we use a CSV file of major league baseball parks in 2019.
  2. Under Analysis and Tools, open the XY Table to Point tool to import the data as a new point feature class.
Locations of US baseball parks in 2019 imported from a CSV file with latitudes and longitudes (Wikipedia)
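As a rough illustration of what a CSV import involves, the following Python sketch reads rows with latitude and longitude fields and keeps only those with valid coordinates. The sample rows and the column names (Name, Latitude, Longitude) are hypothetical and are not taken from the tutorial's CSV file. This is not how the XY Table to Point tool is implemented.

```python
import csv
import io

# Hypothetical sample data; column names and values are assumptions for illustration.
SAMPLE_CSV = """Name,Latitude,Longitude
Park A,41.9484,-87.6553
Park B,40.8296,-73.9262
Park C,,-118.2400
"""

def read_points(text_file, lat_field="Latitude", lon_field="Longitude"):
    """Return (name, longitude, latitude) tuples, skipping rows without valid coordinates."""
    points = []
    for row in csv.DictReader(text_file):
        try:
            lat = float(row[lat_field])
            lon = float(row[lon_field])
        except (KeyError, TypeError, ValueError):
            continue  # missing or non-numeric coordinate
        if -90 <= lat <= 90 and -180 <= lon <= 180:
            points.append((row.get("Name", ""), lon, lat))
    return points

points = read_points(io.StringIO(SAMPLE_CSV))
print(points)
```

Note that the points are stored as (longitude, latitude) so that x comes before y, which is a common source of confusion when choosing the X and Y fields for an import.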

Point Maps

Simple Point Map

As shown above, after a point data set is imported into the project geodatabase, it is automatically added to the current map.

You can right-click on the layer in the Contents pane and select Symbology to change the default symbology.

Changing the symbology of a single symbol point map.

Filtering

A filter selects a specified subset of the data.

Data can be filtered into a separate feature class with the Select tool.

Using the select tool

Base Map

A base map is a simple reference map that you place under mapped features to provide geographic context for where those features are located on the surface of the earth.

While area map borders can provide geographic context on their own, point maps usually need base maps so it is possible for the user to see where the points are located relative to other known features.

ArcGIS Pro provides a variety of different base maps and you may want to use a less detailed base map if the default base map symbols make it difficult to see your point symbols.

Choosing alternative base maps

Mapping a Categorical Attribute

If your point data has a categorical attribute, you can map it by changing the Symbology to Unique Values.

You will always need to provide a legend in your layout so readers can know what the different symbols mean.

Mapping points with a categorical field

Mapping a Quantitative Variable

A common method of visualizing points with a quantitative attribute is the graduated symbol (bubble) map that uses differently sized shapes based on the variable being mapped.

Mapping points with a bubble map

Point Autocorrelation

One of the primary ways of analyzing the spatial distribution of phenomena is looking at where points group together, or where high or low values are clustered together.

Spatial autocorrelation is the clustering of similar values together in adjacent or nearby areas.

Spatial data is usually spatially autocorrelated, and this tendency is useful since clusters often provide insights into the characteristics of the clustered phenomena.

While visual observation of spatial autocorrelation is useful when exploring data, more rigorous means of quantifying autocorrelation are helpful when performing serious research.

Ripley's K

B.D. Ripley (1977) developed a collection of lettered techniques for analyzing spatial point correlation.

The Ripley's K function iterates through each point and counts the number of other points within a given distance.

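Autocorrelation is evaluated by comparing the observed line with the theoretical line expected for a random distribution: observed values above the expected line indicate clustering at that distance, and observed values below it indicate dispersion.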

While ArcGIS Pro does not create Ripley's K graphics, it can be used to create a table that can be plotted as a line chart.

  1. Under Analysis and Tools, open the Multi-Distance Spatial Cluster Analysis (Ripley's K Function) tool.
  2. Click on the table in the Contents pane and select Visualize, Create Chart, Line Chart.
Ripley's K
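The counting described above can be sketched in a few lines of Python. This is a minimal version without edge correction, assuming planar coordinates and a known study area size; the function names and the sample points are hypothetical. The L(d) transformation rescales K so that a random pattern gives a value close to the distance d itself.

```python
import math

def ripley_k(points, distance, area):
    """Ripley's K at one distance, with no edge correction."""
    n = len(points)
    pairs = 0
    for i, (xi, yi) in enumerate(points):
        for j, (xj, yj) in enumerate(points):
            if i != j and math.hypot(xi - xj, yi - yj) <= distance:
                pairs += 1  # point j lies within the distance of point i
    return area * pairs / (n * (n - 1))

def ripley_l(points, distance, area):
    """L(d) transformation: roughly equal to d for a random pattern."""
    return math.sqrt(ripley_k(points, distance, area) / math.pi)

# Hypothetical points forming two tight clusters in a 10 x 10 study area
clustered = [(1, 1), (1.2, 1), (1, 1.2), (8, 8), (8.2, 8), (8, 8.2)]
for d in (0.5, 1.0, 2.0):
    print(d, round(ripley_l(clustered, d, area=100.0), 3))
```

For these clustered points, L(d) at short distances is much larger than d, which is the pattern that appears as an observed line above the expected line.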

Kernel Density

A classical method of mapping areas of high spatial autocorrelation of points is kernel density analysis, in which a kernel of a given radius is used to systematically scan an area, and the density of any particular location is the number of points within the kernel surrounding that point. Locations surrounded by clusters of points will have higher values, and areas where there are few or no points will have lower values.

Kernel density demonstration animation (R code)

A kernel density raster can be created using the Kernel Density tool.

The search radius parameter will define how much smoothing will be applied to create the raster. While the search radius can be thought of as an area of influence around each point, the choice is often arbitrary, which reduces the rigor of kernel density analysis.

Kernel density raster
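The scanning idea can be illustrated with a small Python sketch that counts the points within a search radius of each grid cell center and divides by the kernel area. It uses a simple uniform (count-based) kernel as described above, which is an assumption made for clarity; the Kernel Density tool applies its own smoothing function, so its output values will differ. The function name, grid parameters, and sample points are hypothetical.

```python
import math

def kernel_density(points, radius, x_min, y_min, cell_size, n_cols, n_rows):
    """Return a grid (list of rows, row 0 at y_min) of points per unit area."""
    kernel_area = math.pi * radius ** 2
    grid = []
    for row in range(n_rows):
        y = y_min + (row + 0.5) * cell_size  # cell center
        values = []
        for col in range(n_cols):
            x = x_min + (col + 0.5) * cell_size
            count = sum(1 for px, py in points
                        if math.hypot(px - x, py - y) <= radius)
            values.append(count / kernel_area)
        grid.append(values)
    return grid

# Hypothetical points: two close together and one isolated
sample = [(0.5, 0.5), (0.6, 0.5), (2.5, 2.5)]
grid = kernel_density(sample, radius=0.5, x_min=0, y_min=0,
                      cell_size=1.0, n_cols=3, n_rows=3)
for values in grid:
    print([round(v, 2) for v in values])
```

Rerunning the sketch with a larger radius spreads each point's influence over more cells, which is the smoothing effect controlled by the search radius parameter.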

Heat Map

A heat map symbology is similar to a kernel density except that the smoothing radius changes depending on your level of zoom. Accordingly, this is another arbitrary visualization technique that is primarily of value as an interactive exploratory technique.

Heat map symbology

Join Features

In situations where your point data does not contain an attribute that you want to analyze, it may be possible to get that data from another area layer using a spatial join.

A spatial join connects two datasets based on a spatial relationship.

Spatial joins can be used to transfer attribute values from a set of polygons in a join layer to points in a target layer.

Spatial join area attributes (join layer) to points (target layer)

In this example, we spatially join median household income data from ACS census tracts (join layer) to the power plant points (target layer) so we can evaluate whether areas around power plants have a lower income than the national median.

With this data, the median household income of tracts around coal-fired plants is well below the national median, corroborating our hypothesis that these plants tend to be located in lower-income areas.

  1. Under Analysis and Tools, open the Export Features tool to copy the ACS census data from a feature service into a new feature class in the project geodatabase. Leave this off your ModelBuilder diagram so that ArcGIS Pro does not re-read the entire feature service when you save your project package.
  2. Add the Join Features tool.
  3. Modify the Symbology to show the Median Household Income.
  4. Under Data, Visualize, Create Chart create a Histogram to view the distribution and find the median.
  5. Go to data.census.gov to find the national median for comparison to the value for the power plants as a question of environmental justice.
Spatial join census tract (area) attributes to power plant points
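Conceptually, this kind of spatial join tests which polygon contains each point and copies the polygon's attribute to the point. The Python sketch below shows the idea with a ray-casting point-in-polygon test. The field name, coordinates, and income values are hypothetical placeholders, and the sketch assumes simple polygons in planar coordinates; it is not the Join Features implementation.

```python
def point_in_polygon(x, y, polygon):
    """Ray casting: count how many polygon edges a ray to the right crosses."""
    inside = False
    n = len(polygon)
    for i in range(n):
        x1, y1 = polygon[i]
        x2, y2 = polygon[(i + 1) % n]
        if (y1 > y) != (y2 > y):  # edge straddles the ray's y value
            x_cross = x1 + (y - y1) * (x2 - x1) / (y2 - y1)
            if x < x_cross:
                inside = not inside
    return inside

def spatial_join(points, areas, field):
    """Copy `field` from the containing area (join layer) to each point (target layer)."""
    joined = []
    for point in points:
        record = dict(point)
        record[field] = None  # stays None when no area contains the point
        for area in areas:
            if point_in_polygon(point["x"], point["y"], area["polygon"]):
                record[field] = area[field]
                break
        joined.append(record)
    return joined

# Hypothetical tracts and plants; the income values are made up for illustration
tracts = [
    {"polygon": [(0, 0), (2, 0), (2, 2), (0, 2)], "median_income": 42000},
    {"polygon": [(2, 0), (4, 0), (4, 2), (2, 2)], "median_income": 61000},
]
plants = [{"name": "A", "x": 1, "y": 1},
          {"name": "B", "x": 3, "y": 1},
          {"name": "C", "x": 10, "y": 10}]
joined = spatial_join(plants, tracts, "median_income")
print(joined)
```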

Aggregation

Aggregation of points into areas adds attributes to the areas that summarize information about the points within each area.

When analyzing point data, it can be helpful to aggregate counts and attribute values by areas and then analyze the values in those areas for relationships.

Spatial joins can be used to aggregate points in a join layer to area polygons in a target layer.

Spatial join point attributes (join layer) to areas (target layer)

Aggregation States

For this example, we will aggregate by US states using polygons from the Minn 2012-2020 Electoral States feature service in the University of Illinois ArcGIS Online organization.

Under Analysis and Tools, open the Export Features tool to copy the data from the feature service into a new feature class in the project geodatabase. Perform this operation separately from ModelBuilder so that ArcGIS Pro does not re-read the entire feature service when you save your project package.

Importing the states polygons into the project geodatabase

Aggregate by Areas

  1. Add the Join Features tool to your ModelBuilder diagram.
  2. Modify the Symbology to show the Sum_Nameplate_Capacity.
Aggregate by areas
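Once each point has been matched to an area, aggregation amounts to grouping the points by area and summarizing them. The following Python sketch assumes every point record already carries the name of its state; the field names and capacity values are hypothetical.

```python
from collections import defaultdict

def aggregate_by_area(points, area_field, value_field):
    """Return {area: {"count": n, "sum": total}} for the points in each area."""
    summary = defaultdict(lambda: {"count": 0, "sum": 0.0})
    for point in points:
        entry = summary[point[area_field]]
        entry["count"] += 1
        entry["sum"] += point[value_field]
    return dict(summary)

# Hypothetical plant records; the capacities are made up for illustration
plants = [
    {"state": "Illinois", "capacity_mw": 100.0},
    {"state": "Illinois", "capacity_mw": 250.0},
    {"state": "Nevada", "capacity_mw": 40.0},
]
totals = aggregate_by_area(plants, "state", "capacity_mw")
print(totals)
```

The "sum" value plays the same role as a summed capacity field such as Sum_Nameplate_Capacity in the joined output.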

Normalization

States with larger populations will have higher electricity demands and larger amounts of installed power plant capacity. What is often more meaningful for the purposes of analysis is the capacity per resident of the state (per capita).

Normalization is the adjustment of variable values to a common scale so they are comparable across space and time (Wikipedia 2023).

Normalization

If you need a normalized field in your data for further analysis, you can use the Calculate Field tool to create a KW_per_Capita field.

Creating a normalized field
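The per capita calculation itself is a simple ratio. The sketch below shows the arithmetic in Python, assuming capacity is stored in megawatts and that a population field is available; the function name and the input values are hypothetical. A similar expression could be adapted for Calculate Field, which accepts Python expressions, using the actual field names in your data.

```python
def kw_per_capita(capacity_mw, population):
    """Convert megawatts to kilowatts and divide by the number of residents."""
    if not population:
        return None  # avoid dividing by zero or a missing population
    return capacity_mw * 1000.0 / population

# Hypothetical state: 5,000 MW of capacity and 2,000,000 residents
print(kw_per_capita(5000, 2_000_000))
```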

Categorical Comparison

If your area data contains a meaningful categorical variable, you can compare aggregated values by category.

For this example, we spatially join the coal plants with state-level election results from the 2012 presidential election to find installed capacity by political leaning.

Select the Electoral_Plants layer in the Contents pane and select Data, Visualize, Create Chart, Box Plot.

Categorical comparison

Area Autocorrelation

After points have been aggregated into areas, techniques with areas can be used to analyze and visualize autocorrelation.

For these examples we will use total installed coal-fired power plant capacity in megawatts by state.

Neighbors

In order to detect autocorrelation between neighboring areas, we must first answer the ancient question, "Who is my neighbor?"

Neighboring areas can be defined by adjacency or proximity.

Adjacency vs. proximity
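The difference between the two definitions can be sketched in Python. In this simplified version, two areas are adjacent when their boundaries share at least one vertex, and two areas are proximate when their centroids are within a chosen distance. Both rules, the function names, and the sample squares are assumptions for illustration; real polygon layers need a more robust boundary test.

```python
import math

def adjacency_neighbors(polygons, min_shared=1):
    """Neighbors are areas whose boundaries share at least min_shared vertices."""
    neighbors = {name: [] for name in polygons}
    for a, verts_a in polygons.items():
        for b, verts_b in polygons.items():
            if a != b and len(set(verts_a) & set(verts_b)) >= min_shared:
                neighbors[a].append(b)
    return neighbors

def proximity_neighbors(centroids, max_distance):
    """Neighbors are areas whose centroids lie within max_distance."""
    neighbors = {name: [] for name in centroids}
    for a, (xa, ya) in centroids.items():
        for b, (xb, yb) in centroids.items():
            if a != b and math.hypot(xa - xb, ya - yb) <= max_distance:
                neighbors[a].append(b)
    return neighbors

# Three hypothetical unit squares in a row
squares = {
    "A": [(0, 0), (1, 0), (1, 1), (0, 1)],
    "B": [(1, 0), (2, 0), (2, 1), (1, 1)],
    "C": [(2, 0), (3, 0), (3, 1), (2, 1)],
}
centers = {"A": (0.5, 0.5), "B": (1.5, 0.5), "C": (2.5, 0.5)}
print(adjacency_neighbors(squares))
print(proximity_neighbors(centers, max_distance=2.0))
```

With these squares, A and C are not adjacent but do count as neighbors under the proximity rule, so the choice of definition changes which values are compared.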

Global Moran's I

Patrick Alfred Pierce Moran (1950) developed a technique for quantifying autocorrelation within values associated with areas. Moran's I is a method for assessing area autocorrelation analogous to the use of Ripley's K with points.

There are two types of Moran's I statistics: global and local.

The global Moran's I tool in ArcGIS Pro assesses autocorrelation over the entire area of analysis as a z-score (number of standard deviations away from the expected mean if the distribution were random).

To evaluate autocorrelation using Moran's I in ArcGIS Pro:

  1. Add the Spatial Autocorrelation (Global Moran's I) tool to your ModelBuilder diagram.
  2. View the HTML report link to see the graphical report showing the results.
Analyzing global Moran's I
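The index behind the report can be sketched in Python. This minimal version computes only the Moran's I value and its expected value under randomness from a matrix of spatial weights; it does not compute the z-score or p-value that the tool reports. The function name, the binary adjacency weights, and the sample values are hypothetical.

```python
def morans_i(values, weights):
    """Global Moran's I for values and an n x n matrix of spatial weights."""
    n = len(values)
    mean = sum(values) / n
    z = [v - mean for v in values]                 # deviations from the mean
    s0 = sum(sum(row) for row in weights)          # sum of all weights
    cross = sum(weights[i][j] * z[i] * z[j]
                for i in range(n) for j in range(n))
    return (n / s0) * (cross / sum(zi * zi for zi in z))

# Four hypothetical areas in a row, each adjacent to the next
chain = [
    [0, 1, 0, 0],
    [1, 0, 1, 0],
    [0, 1, 0, 1],
    [0, 0, 1, 0],
]
clustered = morans_i([10, 10, 1, 1], chain)    # similar values side by side
alternating = morans_i([10, 1, 10, 1], chain)  # dissimilar values side by side
expected = -1 / (4 - 1)                        # expected value with no autocorrelation
print(clustered, alternating, expected)
```

Values above the expected value suggest clustering of similar values, and values below it suggest that dissimilar values tend to be neighbors.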

Local Moran's I

Local indicators of spatial association (LISA) were initially developed by Luc Anselin (1995) to decompose global indicators of spatial autocorrelation (like global Moran's I) and assess the contribution of each individual area.

LISA analysis with the local Moran's I statistic is used to identify clusters of areas with similar values and outlier areas with values that stand out among the neighbors.

Add the Cluster and Outlier Analysis (Anselin Local Moran's I) tool to your diagram.

Mapping local Moran's I
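A simplified version of the local statistic can be sketched in Python. Each area's deviation from the mean is multiplied by the average deviation of its neighbors, and the signs of the two terms give the familiar cluster and outlier labels. The sketch assumes row-standardized weights and omits significance testing, so it labels every area; the function name, neighbor lists, and values are hypothetical.

```python
def local_morans_i(values, neighbors):
    """Return (local I, label) per area; labels are HH, LL, HL, LH, or 'none'."""
    n = len(values)
    mean = sum(values) / n
    z = [v - mean for v in values]
    m2 = sum(zi * zi for zi in z) / n  # variance of the values
    results = []
    for i in range(n):
        nbrs = neighbors[i]
        lag = sum(z[j] for j in nbrs) / len(nbrs) if nbrs else 0.0
        local_i = z[i] * lag / m2
        if z[i] > 0 and lag > 0:
            label = "HH"   # high value among high neighbors
        elif z[i] < 0 and lag < 0:
            label = "LL"   # low value among low neighbors
        elif z[i] > 0 and lag < 0:
            label = "HL"   # high outlier among low neighbors
        elif z[i] < 0 and lag > 0:
            label = "LH"   # low outlier among high neighbors
        else:
            label = "none"
        results.append((local_i, label))
    return results

# Five hypothetical areas in a row with one low value in the middle
values = [10, 9, 1, 9, 10]
neighbors = [[1], [0, 2], [1, 3], [2, 4], [3]]
local = local_morans_i(values, neighbors)
print([label for _, label in local])
```

A positive local value marks an area that resembles its neighbors, and a negative value marks an outlier. The tool additionally tests whether each value is statistically significant before assigning a cluster or outlier type.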

Getis-Ord GI*

The Getis-Ord GI* statistic was developed by Arthur Getis and J.K. Ord and introduced in 1992. This algorithm locates areas with high values that are also surrounded by high values.

Unlike simple observation or techniques like kernel density analysis, the Getis-Ord GI* algorithm uses statistical comparisons of all areas to create p-values indicating how probable it is that clusters of high values in specific areas could have occurred by chance. This rigorous use of statistics makes it possible to have a higher level of confidence (but not complete confidence) that the patterns are more than just a visual illusion and represent clustering that is worth investigating.

The Getis-Ord GI* statistic is a z-score on a standard normal curve: the larger its magnitude, the less likely it is that the clustering around an area arose by random chance. For a one-tailed test, a z-score of 1.282 corresponds to a 90% confidence level and a z-score of 1.645 corresponds to a 95% confidence level; for a two-tailed test, the corresponding thresholds are about 1.645 and 1.960.
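The relationship between z-scores and confidence levels can be checked with the standard normal cumulative distribution function. The Python sketch below uses only the standard library; the function names are hypothetical. It shows that 1.282 and 1.645 correspond to 90% and 95% when only one tail of the curve is considered, and that a two-tailed test requires larger z-scores for the same confidence levels.

```python
import math

def normal_cdf(z):
    """Cumulative probability of the standard normal distribution."""
    return 0.5 * (1.0 + math.erf(z / math.sqrt(2.0)))

def one_tailed_confidence(z):
    """Confidence level when only high values (one tail) are of interest."""
    return normal_cdf(z)

def two_tailed_confidence(z):
    """Confidence level when both high and low extremes are of interest."""
    return 1.0 - 2.0 * (1.0 - normal_cdf(abs(z)))

for z in (1.282, 1.645, 1.960):
    print(z, round(one_tailed_confidence(z), 3), round(two_tailed_confidence(z), 3))
```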

  1. Add the Hot Spot Analysis (Getis-Ord Gi*) tool to your ModelBuilder diagram.
  2. With this data, Getis-Ord GI* analyzes the clusters as Not Significant but notes the outlier hot spot states with large plants that are not surrounded by other states with large plants.

Getis-Ord GI*
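The statistic can be sketched in Python using the commonly published formula, in which each area's neighborhood includes the area itself. The sketch assumes binary weights, values that are not all identical, and neighborhoods that do not cover every area; the function name, neighbor lists, and values are hypothetical, and the tool's options for conceptualizing spatial relationships are not reproduced.

```python
import math

def getis_ord_gi_star(values, neighbors):
    """Return a Gi* z-score per area using binary weights that include the area itself."""
    n = len(values)
    mean = sum(values) / n
    s = math.sqrt(sum(v * v for v in values) / n - mean ** 2)
    scores = []
    for i in range(n):
        members = set(neighbors[i]) | {i}      # the star: include area i itself
        w_sum = len(members)                   # sum of binary weights
        w_sq_sum = len(members)                # sum of squared binary weights
        local_sum = sum(values[j] for j in members)
        numerator = local_sum - mean * w_sum
        denominator = s * math.sqrt((n * w_sq_sum - w_sum ** 2) / (n - 1))
        scores.append(numerator / denominator)
    return scores

# Five hypothetical areas in a row with high values at one end
values = [10, 9, 1, 1, 1]
neighbors = [[1], [0, 2], [1, 3], [2, 4], [3]]
scores = getis_ord_gi_star(values, neighbors)
print([round(score, 3) for score in scores])
```

Large positive scores indicate high values surrounded by high values (hot spots), and large negative scores indicate low values surrounded by low values (cold spots).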

Points to Polygons

Voronoi Polygons

Voronoi polygons are polygons around points that each contain one point and have edges that evenly divide the areas between adjacent points. Voronoi diagrams are named after Russian mathematician Georgy Feodosievych Voronoy. They are also sometimes called Thiessen polygons after American meteorologist Alfred Thiessen (Wikipedia 2023).

While Voronoi polygons are a useful way of visualizing points as areas, the polygons are mathematical abstractions that may not represent actual areas of influence or jurisdiction in the real world, and care must be taken to communicate that limitation.

  1. Insert a new map and give it a meaningful name (Voronoi Map).
  2. Under Analysis and Tools, open the Select tool to create a feature class with a small enough number of points to have a visible group of polygons.
    • Input Features: Power_Plants
    • Output Feature Class: NV_Geothermal
    • Expression: Where State is equal to Nevada and Geothermal_MW is greater than 0
  3. Under Analysis and Tools, open the Create Thiessen Polygons tool.
    • Input Features: NV_Geothermal
    • Output Feature Class: NV_Geothermal_Polygons
    • Output Fields: All fields
Creating Voronoi polygons
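The defining property of a Voronoi polygon is that every location inside it is closer to that polygon's point than to any other point. The Python sketch below illustrates this property by finding the nearest point for a few locations, rather than constructing the polygon geometry as the Create Thiessen Polygons tool does. The function name, plant names, and coordinates are hypothetical.

```python
import math

def nearest_point(location, seeds):
    """Return the name of the seed point whose Voronoi polygon contains the location."""
    x, y = location
    return min(seeds, key=lambda name: math.hypot(seeds[name][0] - x,
                                                  seeds[name][1] - y))

# Hypothetical plant locations in planar coordinates
seeds = {"Plant A": (0, 0), "Plant B": (4, 0), "Plant C": (0, 4)}
for location in [(1, 1), (3, 1), (1, 3.5)]:
    print(location, nearest_point(location, seeds))
```

Locations that are equally distant from two points lie on the shared edge between their polygons.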

Hulls

Points can also be bound by hulls to more clearly demonstrate the spatial extent of the points, such as when you need to visualize a service area for a utility or transit service.

This example uses the Nevada geothermal points selected above.

Under Analysis and Tools, open the Minimum Bounding Geometry tool.

Creating a convex hull
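A convex hull is the smallest convex polygon that contains all of the points. One well-known way to compute it is Andrew's monotone chain algorithm, sketched below in Python. This is offered as an illustration of the concept and is not necessarily the method used by the Minimum Bounding Geometry tool; the function names and sample points are hypothetical.

```python
def cross(o, a, b):
    """Positive when the turn from o->a to o->b is counterclockwise."""
    return (a[0] - o[0]) * (b[1] - o[1]) - (a[1] - o[1]) * (b[0] - o[0])

def convex_hull(points):
    """Return hull vertices in counterclockwise order (Andrew's monotone chain)."""
    pts = sorted(set(points))
    if len(pts) <= 2:
        return pts
    lower = []
    for p in pts:
        while len(lower) >= 2 and cross(lower[-2], lower[-1], p) <= 0:
            lower.pop()  # drop points that would make a clockwise or straight turn
        lower.append(p)
    upper = []
    for p in reversed(pts):
        while len(upper) >= 2 and cross(upper[-2], upper[-1], p) <= 0:
            upper.pop()
        upper.append(p)
    return lower[:-1] + upper[:-1]  # endpoints are shared, so drop the duplicates

# Hypothetical points: four corners and two interior points
sample = [(0, 0), (4, 0), (4, 4), (0, 4), (2, 2), (1, 3)]
hull = convex_hull(sample)
print(hull)
```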

Colocation

Colocation analysis assesses how closely two sets of points are located to each other. The two sets of points can be in different feature classes or distinguished by a categorical variable in the same feature class.

For this example, we examine colocation between coal-fired power plants and natural gas-fired power plants.

ArcGIS Pro provides a Colocation Analysis tool.

In this output we see few coal plants colocated with natural gas plants, which confirms the visual observation that areas served by coal plants tend to be separate from areas served by natural gas plants.

Colocation analysis
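One way to quantify colocation is a colocation quotient, which compares how often points of one category have a nearest neighbor of another category with how often that would be expected from the category's share of all points. The Python sketch below computes a simplified global, nearest-neighbor version of that quotient. It is an illustration under those assumptions and does not reproduce the local statistics or significance tests of the Colocation Analysis tool; the function name, categories, and coordinates are hypothetical.

```python
import math

def colocation_quotient(points, category_a, category_b):
    """Global quotient: values above 1 suggest A points tend to be nearest to B points."""
    n = len(points)
    a_points = [p for p in points if p[2] == category_a]
    n_b = sum(1 for p in points if p[2] == category_b)
    nearest_is_b = 0
    for a in a_points:
        others = [p for p in points if p is not a]
        nearest = min(others, key=lambda p: math.hypot(p[0] - a[0], p[1] - a[1]))
        if nearest[2] == category_b:
            nearest_is_b += 1
    observed = nearest_is_b / len(a_points)  # share of A points whose nearest neighbor is B
    expected = n_b / (n - 1)                 # share of B among all other points
    return observed / expected

# Hypothetical (x, y, category) points
mixed = [(0, 0, "coal"), (10, 0, "coal"),
         (0, 1, "gas"), (10, 1, "gas"), (5, 5, "gas")]
separate = [(0, 0, "coal"), (0, 1, "coal"), (1, 0, "coal"),
            (10, 10, "gas"), (10, 11, "gas"), (11, 10, "gas")]
print(colocation_quotient(mixed, "coal", "gas"))
print(colocation_quotient(separate, "coal", "gas"))
```

A quotient near zero, as in the second set of points, corresponds to the situation described above in which the two kinds of plants occupy separate areas.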

Centrographics

Centrographics summarize point patterns as a single central feature.

Mean Center

The mean center is the location that minimizes the total squared distance to all points (the location that minimizes the total distance itself is the median center). The mean center can be calculated by taking the mean (average) of all latitudes and the mean of all longitudes.

ArcGIS Pro provides the Mean Center tool.

Mean center
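The calculation can be sketched in Python as the average of the x coordinates (longitudes) and the average of the y coordinates (latitudes). The sketch treats the coordinates as planar, which is an approximation for latitude and longitude over large areas; the function names and sample points are hypothetical. The helper function shows that moving away from the mean center increases the total squared distance to the points.

```python
def mean_center(points):
    """Return the (x, y) mean center of a list of (x, y) points."""
    n = len(points)
    return (sum(x for x, _ in points) / n, sum(y for _, y in points) / n)

def total_squared_distance(location, points):
    """Sum of squared distances from a location to every point."""
    cx, cy = location
    return sum((x - cx) ** 2 + (y - cy) ** 2 for x, y in points)

# Hypothetical points at the corners of a square
sample = [(0, 0), (4, 0), (4, 4), (0, 4)]
center = mean_center(sample)
print(center, total_squared_distance(center, sample))
print(total_squared_distance((2.5, 2.0), sample))
```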

Standard Deviational Ellipse

A standard deviational ellipse is an ellipse formed around the mean center with the distances based on the standard deviation of the X and Y values.

The standard deviational ellipse clarifies the spatial distribution of points more rigorously than simple visual observation.

You can create a standard deviational ellipse with the Directional Distribution tool.

Standard deviational ellipse
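The geometry of the ellipse can be sketched in Python from the variances and covariance of the coordinates. This version returns the mean center, the standard deviations along the major and minor axes, and the orientation of the major axis measured counterclockwise from the x axis. These conventions are assumptions made for the sketch, and the axis lengths and rotation reported by the Directional Distribution tool may follow different scaling and angle conventions. The function name and sample points are hypothetical.

```python
import math

def standard_deviational_ellipse(points):
    """Return the center, axis standard deviations, and orientation of the ellipse."""
    n = len(points)
    cx = sum(x for x, _ in points) / n
    cy = sum(y for _, y in points) / n
    sxx = sum((x - cx) ** 2 for x, _ in points) / n          # variance of x
    syy = sum((y - cy) ** 2 for _, y in points) / n          # variance of y
    sxy = sum((x - cx) * (y - cy) for x, y in points) / n    # covariance
    # Direction of greatest spread, counterclockwise from the x axis
    angle = 0.5 * math.atan2(2.0 * sxy, sxx - syy)
    half_sum = (sxx + syy) / 2.0
    half_diff = math.hypot((sxx - syy) / 2.0, sxy)
    return {
        "center": (cx, cy),
        "major": math.sqrt(half_sum + half_diff),
        "minor": math.sqrt(max(half_sum - half_diff, 0.0)),
        "angle_degrees": math.degrees(angle),
    }

# Hypothetical points spread more widely east-west than north-south
sample = [(0, 0), (4, 0), (2, 1), (2, -1)]
ellipse = standard_deviational_ellipse(sample)
print(ellipse)
```

An elongated ellipse, where the major axis is much longer than the minor axis, indicates that the points have a directional trend.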