Point Data Analysis in ArcGIS Pro

Geospatial point data is used to represent things that exist or events that occurred at specific locations on the surface of the earth. Examples of points include crime locations, animal nests, street trees, vehicle charging stations, cellphone towers, WiFi hotspots, GPS waypoints, etc. Areas are sometimes represented with points (centroids) to simplify mapping and navigation, such as with restaurants, houses, stores, or transit stations.

This tutorial demonstrates basic techniques for analysis of point data in ArcGIS Pro.

Loading Point Data

The example point data used for this tutorial is 2022 electrical power plant location data from the US Energy Information Administration.

The oil market disruptions of 1973 resulted in the creation of a wide variety of federal systems and organizations for collecting and managing national energy data, and these functions were centralized in the US Energy Information Administration by The Department of Energy Organization Act of 1977. The US Energy Information Administration (EIA) "collects, analyzes, and disseminates independent and impartial energy information to promote sound policymaking, efficient markets, and public understanding of energy and its interaction with the economy and the environment" (EIA 2023).

Current data is also available directly from the EIA's US Energy Atlas.

An older CSV file and metadata containing field information is also available here.

Figure
Power plants data from the EIA

Loading Geospatial Data Files

The following steps show how to import the power plant data from a feature service into the project geodatabase for visualization and analysis.

  1. Create a new project and give it a meaningful name (Minn 2023 Points).
  2. Right-click on the map in the Contents pane, and under Properties and General, give it a meaningful name (Power Plant Map).
  3. Go to the US Energy Atlas, find the Power Plants data and copy the Geoservice feature service URL.
  4. Under Analysis and Tools, open the Export Features tool to copy the data from the feature service into a new feature class in the project geodatabase.
  5. Symbolize as Unique Values from the Primary Source attribute to map the plants by fuel source.
Loading the power plant data

You can view the available data and fields in the Attribute Table.

Fields View offers data type information for each field.

Viewing field information

Load CSV Points

Points can also be loaded in from CSV files containing fields with latitude and longitude.

  1. Download the CSV file to your local storage. For this example we use a CSV file of major league baseball parks in 2019.
  2. Under Analysis and Tools, open the XY Table to Point tool to import the data as a new point feature class.
Locations of US baseball parks in 2019 imported from a CSV file with latitudes and longitudes (Wikipedia)
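
Conceptually, importing a CSV file as points means reading each row and pairing its longitude and latitude values. The following is a minimal Python sketch of that idea, not a description of what the XY Table to Point tool does internally. The column names (Name, Latitude, Longitude) and the sample rows are hypothetical.

```python
import csv
import io

# Hypothetical sample rows; column names and values are illustrative only.
SAMPLE_CSV = """Name,Latitude,Longitude
Park A,41.948,-87.655
Park B,40.829,-73.926
Park C,not available,-118.240
"""

def read_points(text, lat_field="Latitude", lon_field="Longitude"):
    """Return (name, longitude, latitude) tuples, skipping rows with bad coordinates."""
    points = []
    for row in csv.DictReader(io.StringIO(text)):
        try:
            lat = float(row[lat_field])
            lon = float(row[lon_field])
        except (KeyError, TypeError, ValueError):
            continue  # missing or non-numeric coordinate
        # Keep only coordinates that fall inside valid latitude/longitude ranges.
        if -90 <= lat <= 90 and -180 <= lon <= 180:
            points.append((row["Name"], lon, lat))
    return points

points = read_points(SAMPLE_CSV)
```

Rows with unusable coordinates (like the third sample row) are dropped rather than guessed at, which is usually the safer choice before mapping.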

Point Maps

Simple Point Map

As shown above, after a point data set is imported into the project geodatabase, it is automatically added to the current map.

You can right-click on the layer in the Contents pane and select Symbology to change the default symbology.

Changing the symbology of a single symbol point map.

Filtering

A filter selects a specified subset of the data.

Data can be filtered into a separate feature class with the Select tool.

Using the select tool

Base Map

A base map is a simple reference map that you place under mapped features to provide geographic context for where those features are located on the surface of the earth.

While area map borders can provide geographic context on their own, point maps usually need base maps so it is possible for the user to see where the points are located relative to other known features.

ArcGIS Pro provides a variety of different base maps and you may want to use a less detailed base map if the default base map symbols make it difficult to see your point symbols.

Choosing alternative base maps

Mapping a Categorical Attribute

If your point data has a categorical attribute, you can map it by changing the Symbology to Unique Values.

You will always need to provide a legend in your layout so readers can know what the different symbols mean.

Mapping points with a categorical field

Mapping a Quantitative Variable

A common method of visualizing points with a quantitative attribute is the graduated symbol (bubble) map that uses differently sized shapes based on the variable being mapped.

Mapping points with a bubble map

Voronoi Polygons

Voronoi polygons are polygons around points that each contain one point and have edges that evenly divide the areas between adjacent points. Voronoi diagrams are named after Russian mathematician Georgy Feodosievych Voronoy. They are also sometimes called Thiessen polygons after American meteorologist Alfred Thiessen (Wikipedia 2023).

While Voronoi polygons are a useful way of visualizing points as areas, the polygons are mathematical abstractions that may not represent actual areas of influence or jurisdiction in the real world, and care must be taken to communicate that limitation.

  1. Insert a new map and give it a meaningful name (Voronoi Map).
  2. Under Analysis and Tools, open the Select tool to create a feature class with a small enough number of points to have a visible group of polygons.
  3. Under Analysis and Tools, open the Create Thiessen Polygons tool.
Creating Voronoi polygons
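
The defining property of a Voronoi polygon is that every location inside it is closer to its own point than to any other point. The sketch below illustrates that property in plain Python by assigning arbitrary locations to their nearest point. The site names and planar coordinates are hypothetical, and this is not how the Create Thiessen Polygons tool constructs its output.

```python
import math

# Hypothetical point locations in arbitrary planar units.
sites = {"A": (0.0, 0.0), "B": (10.0, 0.0), "C": (5.0, 8.0)}

def nearest_site(x, y, sites):
    """Return the name of the site whose Voronoi polygon contains (x, y)."""
    return min(sites, key=lambda name: math.dist((x, y), sites[name]))

first = nearest_site(1.0, 1.0, sites)   # a location close to site A
second = nearest_site(6.0, 6.0, sites)  # a location close to site C
```

Locations that are exactly the same distance from two points lie on a polygon edge. In this sketch such ties simply go to whichever site is listed first.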

Hulls

Points can also be bounded by hulls to demonstrate the spatial extent of the points more clearly, such as when you need to visualize a service area for a utility or transit service.

This example uses the Nevada geothermal points selected above.

Under Analysis and Tools, open the Minimum Bounding Geometry tool.

Creating a convex hull
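
A convex hull is the smallest convex polygon that contains all of the points. As an illustration of the idea, the sketch below uses Andrew's monotone chain algorithm, a common textbook technique that is not necessarily what the Minimum Bounding Geometry tool uses. The coordinates are hypothetical planar values.

```python
def cross(o, a, b):
    """Cross product of vectors o->a and o->b; positive for a counterclockwise turn."""
    return (a[0] - o[0]) * (b[1] - o[1]) - (a[1] - o[1]) * (b[0] - o[0])

def convex_hull(points):
    """Return hull vertices in counterclockwise order (Andrew's monotone chain)."""
    pts = sorted(set(points))
    if len(pts) <= 2:
        return pts
    lower = []
    for p in pts:
        # Drop the last vertex while it would create a clockwise or straight turn.
        while len(lower) >= 2 and cross(lower[-2], lower[-1], p) <= 0:
            lower.pop()
        lower.append(p)
    upper = []
    for p in reversed(pts):
        while len(upper) >= 2 and cross(upper[-2], upper[-1], p) <= 0:
            upper.pop()
        upper.append(p)
    # The last point of each chain is the first point of the other chain.
    return lower[:-1] + upper[:-1]

# Hypothetical points: four corners plus two interior points.
sample = [(0, 0), (4, 0), (4, 3), (0, 3), (2, 1), (1, 2)]
hull = convex_hull(sample)
```

Only the four corner points end up on the hull. The interior points do not affect its shape.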

Point Autocorrelation

One of the primary ways of analyzing the spatial distribution of a phenomenon is looking at where points group together, or where high or low values are clustered together.

Spatial autocorrelation is the clustering of similar values together in adjacent or nearby areas.

Spatial data is usually spatially autocorrelated, and this tendency is useful since clusters often provide insights into the characteristics of the clustered phenomena.

While visual observation of spatial autocorrelation is useful when exploring data, more rigorous means of quantifying autocorrelation are helpful when performing serious research.

Ripley's K

B.D. Ripley (1977) developed a collection of lettered techniques for analyzing spatial point correlation.

The Ripley's K function iterates through each point and counts the number of other points within a given distance.

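Autocorrelation is evaluated by analyzing the position of the observed line relative to the theoretical lines. Observed values above the expected values indicate that points are more clustered than a random distribution at that distance, and observed values below the expected values indicate that they are more dispersed.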

While ArcGIS Pro does not create Ripley's K graphics, it can be used to create a table that can be plotted as a line chart.

  1. Under Analysis and Tools, open the Multi-Distance Spatial Cluster Analysis (Ripley's K Function) tool.
  2. Click on the table in the Contents pane and select Create Chart, Line Chart.
Ripley's K
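
The counting described above can be sketched in a few lines of Python. This is a naive version with no edge correction, so it is an illustration of the idea rather than a reproduction of the tool's output. The L(d) transformation is a common rescaling under which a random pattern is expected to give L(d) close to d. The point coordinates and study area size are hypothetical.

```python
import math

def ripley_k(points, distance, area):
    """Naive Ripley's K: scaled count of ordered point pairs within the distance."""
    n = len(points)
    if n < 2:
        raise ValueError("at least two points are required")
    pairs = 0
    for i in range(n):
        for j in range(n):
            if i != j and math.dist(points[i], points[j]) <= distance:
                pairs += 1
    return area * pairs / (n * (n - 1))

def ripley_l(points, distance, area):
    """L transformation of K; for a random pattern L(d) is expected to be near d."""
    return math.sqrt(ripley_k(points, distance, area) / math.pi)

# Hypothetical tight cluster of four points inside a 10 x 10 study area.
cluster = [(0.0, 0.0), (0.0, 1.0), (1.0, 0.0), (1.0, 1.0)]
study_area = 100.0
k_value = ripley_k(cluster, 1.0, study_area)
l_value = ripley_l(cluster, 1.0, study_area)
```

Here the L value at a distance of 1 is well above 1, which is the pattern expected for clustered points. Repeating the calculation over a range of distances gives the observed line for a chart.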

Kernel Density

A classical method of mapping areas of high spatial autocorrelation of points is kernel density analysis, in which a kernel of a given radius is used to systematically scan an area, and the density at any particular location is the number of points within the kernel surrounding that location. Locations surrounded by clusters of points will have higher values, and areas where there are few or no points will have lower values.

Kernel density demonstration animation (R code)

A kernel density raster can be created using the Kernel Density tool.

The search radius parameter will define how much smoothing will be applied to create the raster. While the search radius can be thought of as an area of influence around each point, the choice is often arbitrary, which reduces the rigor of kernel density analysis.

Kernel density raster
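
The scanning process can be sketched as a grid of cells where each cell value is the number of points within the search radius of the cell center. This follows the simple counting description in this section. Kernel density tools commonly weight nearer points more heavily than distant ones, so this sketch should be read as an assumption-laden simplification. The points and grid dimensions are hypothetical.

```python
import math

def density_grid(points, radius, xmin, ymin, cell_size, n_cols, n_rows):
    """Count the points within `radius` of each cell center; row 0 is the bottom row."""
    grid = []
    for row in range(n_rows):
        y = ymin + (row + 0.5) * cell_size
        values = []
        for col in range(n_cols):
            x = xmin + (col + 0.5) * cell_size
            count = sum(1 for p in points if math.dist((x, y), p) <= radius)
            values.append(count)
        grid.append(values)
    return grid

# Hypothetical points: a small cluster near (1, 1) and one isolated point.
sample = [(1.0, 1.0), (1.2, 1.1), (0.9, 1.3), (4.5, 4.5)]
grid = density_grid(sample, radius=1.0, xmin=0.0, ymin=0.0,
                    cell_size=1.0, n_cols=5, n_rows=5)
```

Increasing the radius spreads each point's influence over more cells, which is the smoothing effect controlled by the search radius parameter.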

A heat map symbology is similar to a kernel density except that the smoothing radius changes depending on your level of zoom. Accordingly, this is another arbitrary visualization technique that is primarily of value as an interactive exploratory technique.

Heat map symbology

Join Features

In situations where your point data does not contain an attribute that you want to analyze, it may be possible to get that data from another area layer using a spatial join.

A spatial join connects two datasets based on a spatial relationship where attribute values are transferred from a set of features in a join layer to a target layer.

Spatial join area attributes (join layer) to points (target layer)

In this example, we spatially join median household income data from ACS census tracts (join layer) to the power plant points (target layer) so we can evaluate whether areas around power plants have a lower income than the national median.

With this data, the median household income of tracts around coal-fired plants is well below the national median, corroborating our hypothesis that these plants tend to be located in lower-income areas.

  1. Insert a new Map and give it a meaningful name (Plant Income Map).
  2. Under Analysis and Tools, open the Export Features tool to copy the ACS census data from a feature service into a new feature class in the project geodatabase.
  3. Under Analysis and Tools, open the Join Features tool.
  4. Modify the Symbology to show the Median Household Income.
  5. Under Data, Visualize, Create Chart, create a Histogram to view the distribution and find the median that you can compare to the national value.
Spatial join census tract (area) attributes to power plant points
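
The logic of joining area attributes to points can be sketched with a point-in-polygon test. The example below uses the standard ray casting technique and copies an income value from the containing area to each point. The tract names, income values, coordinates, and field names are hypothetical, and this is a conceptual sketch rather than the Join Features tool's implementation.

```python
def point_in_polygon(x, y, ring):
    """Ray casting: count how many polygon edges a ray going right from (x, y) crosses."""
    inside = False
    n = len(ring)
    for i in range(n):
        x1, y1 = ring[i]
        x2, y2 = ring[(i + 1) % n]
        if (y1 > y) != (y2 > y):  # edge straddles the horizontal line through y
            x_cross = x1 + (y - y1) * (x2 - x1) / (y2 - y1)
            if x < x_cross:
                inside = not inside
    return inside

# Hypothetical tracts (join layer) and plants (target layer).
tracts = [
    {"tract": "T1", "median_income": 52000, "ring": [(0, 0), (5, 0), (5, 5), (0, 5)]},
    {"tract": "T2", "median_income": 81000, "ring": [(5, 0), (10, 0), (10, 5), (5, 5)]},
]
plants = [
    {"plant": "P1", "x": 2.0, "y": 3.0},
    {"plant": "P2", "x": 7.5, "y": 1.0},
    {"plant": "P3", "x": 12.0, "y": 1.0},  # outside every tract
]

def spatial_join(points, areas):
    """Copy median_income from the containing area to each point (None if no match)."""
    joined = []
    for pt in points:
        match = next((a for a in areas
                      if point_in_polygon(pt["x"], pt["y"], a["ring"])), None)
        joined.append({**pt, "median_income": match["median_income"] if match else None})
    return joined

joined = spatial_join(plants, tracts)
```

Points that fall outside every area receive no value, so it is worth checking for unmatched points before summarizing the joined attribute.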

Attribute Aggregation

When analyzing point data, it can be helpful to aggregate counts and attribute values by areas and then analyze the values in those areas for relationships.

For this example of coal-fired plants, we will aggregate by state based on the State attribute.

If your data does not have an attribute that can be used for aggregating by area, you will need to use spatial aggregation as described below.

Note that because power plants are interconnected by grids that extend across state lines, the aggregated production values may not reflect actual consumption values within state lines.

Attribute aggregation
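
Aggregating by an attribute amounts to grouping the records by a key and summarizing each group. A minimal Python sketch, assuming hypothetical field names (State, Install_MW) and made-up capacity values:

```python
from collections import defaultdict

# Hypothetical plant records; the field names and values are illustrative only.
plants = [
    {"State": "WY", "Install_MW": 2100.0},
    {"State": "WY", "Install_MW": 800.0},
    {"State": "TX", "Install_MW": 1700.0},
]

def aggregate_by(records, key_field, value_field):
    """Return a count and sum of value_field for each distinct key_field value."""
    totals = defaultdict(lambda: {"count": 0, "sum": 0.0})
    for rec in records:
        group = totals[rec[key_field]]
        group["count"] += 1
        group["sum"] += rec[value_field]
    return dict(totals)

by_state = aggregate_by(plants, "State", "Install_MW")
```

The result has one entry per state, which is the form needed for joining the totals to state polygons.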

Attribute Join

Because the point data contains no state polygon information, we need to perform an attribute join to be able to map the aggregated data as a choropleth.

An attribute join connects two datasets based on common key values.

Attribute Join
  1. Under Analysis and Tools, open the Export Features tool to copy the data from the state feature service into a new feature class in the project geodatabase.
  2. Under Analysis and Tools, open the Join Field tool.
  3. Modify the Symbology to show the Sum_Nameplate_Capacity.
Attribute join

Normalization

States with larger populations will have higher electricity demands and larger amounts of installed power plant capacity. What is often more meaningful for the purposes of analysis is the capacity per resident of the state (per capita).

Normalization is the adjustment of variable values to a common scale so they are comparable across space and time (Wikipedia 2023).

Normalization
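
As a worked illustration of per capita normalization, the sketch below divides a total by population and rescales it to a rate per 1,000 residents. The area names, capacity totals, and populations are hypothetical and chosen only to make the arithmetic easy to follow.

```python
def per_capita(total, population, per=1000):
    """Return total per `per` residents, or None when population is not positive."""
    if population <= 0:
        return None
    return total * per / population

# Hypothetical areas: (installed capacity in MW, resident population).
areas = {
    "Small State": (5000.0, 500_000),
    "Large State": (20000.0, 10_000_000),
}

rates = {name: per_capita(mw, pop) for name, (mw, pop) in areas.items()}
```

In this made-up example the large state has four times the total capacity, but the small state has five times the capacity per 1,000 residents, which is the kind of reversal that normalization is meant to reveal.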

Spatial Aggregation

In situations where your point data does not contain an attribute that you can use to aggregate, or if you want to aggregate into areas that are different from any specified in the point attributes, it is possible to aggregate using a spatial join.

A spatial join connects two datasets based on a spatial relationship where attribute values are transferred from a set of features in a join layer to a target layer.

Spatial join point attributes (join layer) to areas (target layer)

For this example, we will aggregate power plants by North American Electric Reliability Corporation (NERC) administrative regions that are used to manage the North American electrical grid. NERC "develops and enforces reliability standards; annually assesses seasonal and long-term reliability; monitors the bulk power system through system awareness; and educates, trains, and certifies industry personnel" (NERC 2023).

  1. Insert a new Map and give it a meaningful name (NERC Capacity Map)
  2. Add Data with the Analysis_Plants layer.
  3. The NERC region polygons are read directly from a feature service provided by the EIA through the US Energy Atlas.

  4. Under Analysis and Tools, open the Export Features tool to copy the data from the feature service into a new feature class in the project geodatabase.
  5. Under Analysis and Tools, open the Join Features tool.
  6. Modify the Symbology to show the Sum_Nameplate_Capacity.
Spatial join point attributes to areas

Normalized Sum

We can use a similar spatial join to find the total installed capacity in each region, and then normalize to get the percentage of installed capacity that comes from coal-fired plants.

  1. Under Analysis and Tools, open the Join Features tool.
  2. Modify the Symbology to show a calculation of percentage with the expression:
    100 * $feature.SUM_Install_MW / $feature.Sum_Install_MW1
Spatial join to get total values for normalization

Categorical Aggregation

After completing a spatial join, aggregation can be performed on attributes other than area.

For this example, we spatially join the coal plants with state-level election results from the 2012 presidential election to find installed capacity by political leaning.

  1. Insert a new map and give it a meaningful name (Electoral Plants Map)
  2. Add Data with the Analysis_Plants layer.
  3. Under Analysis and Tools, open the Export Features tool to copy the data from the state feature service into a new feature class in the project geodatabase.
  4. Under Analysis and Tools, open the Join Features tool
  5. Select the Electoral_Plants layer in the Contents pane and select Data, Visualize, Create Chart, Bar Chart.

    • Category or Date: Winner_2012
    • Aggregation: Sum
    • Numeric field(s): Sum_Install_Capacity
    • Check Label bars
Categorical aggregation

Area Autocorrelation

After points have been aggregated into areas, techniques with areas can be used to analyze and visualize autocorrelation.

For these examples we will use total installed coal-fired power plant capacity in megawatts by state.

Neighbors

In order to detect autocorrelation between neighboring areas, we must first answer the ancient question, "Who is my neighbor?"

Neighboring areas can be defined by adjacency or proximity.

Adjacency vs. proximity
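
The two definitions can give different answers for the same areas. The sketch below treats two areas as adjacent when they share at least two vertices (a shared edge), and as proximate when their centroids are within a chosen distance. Both rules are simplifying assumptions for illustration: real polygon layers may not share identical vertices along a border, and the vertex average used here is only a rough centroid. The area names and coordinates are hypothetical.

```python
import math

# Hypothetical square areas; B touches A, while C is separated from B by a gap.
areas = {
    "A": [(0, 0), (1, 0), (1, 1), (0, 1)],
    "B": [(1, 0), (2, 0), (2, 1), (1, 1)],
    "C": [(3, 0), (4, 0), (4, 1), (3, 1)],
}

def is_adjacent(ring_a, ring_b):
    """Adjacency: the two rings share at least two vertices (a common edge)."""
    return len(set(ring_a) & set(ring_b)) >= 2

def centroid(ring):
    """Average of the vertices; adequate for simple shapes like these squares."""
    xs = [x for x, _ in ring]
    ys = [y for _, y in ring]
    return (sum(xs) / len(xs), sum(ys) / len(ys))

def is_near(ring_a, ring_b, max_distance):
    """Proximity: centroids are no farther apart than max_distance."""
    return math.dist(centroid(ring_a), centroid(ring_b)) <= max_distance

ab_adjacent = is_adjacent(areas["A"], areas["B"])
bc_adjacent = is_adjacent(areas["B"], areas["C"])
bc_near = is_near(areas["B"], areas["C"], max_distance=2.0)
```

Areas B and C do not touch, so they are not neighbors by adjacency, but they are neighbors by proximity with a distance threshold of 2.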

Global Moran's I

Patrick Alfred Pierce Moran (1950) developed a technique for quantifying autocorrelation within values associated with areas. Moran's I is a method for assessing area autocorrelation analogous to the use of Ripley's K with points.

There are two types of Moran's I statistics: global and local.

The global Moran's I tool in ArcGIS Pro assesses autocorrelation over the entire area of analysis as a z-score (number of standard deviations away from the expected mean if the distribution were random).

To evaluate autocorrelation using Moran's I in ArcGIS Pro:

  1. Insert a new map and give it a meaningful name (Moran Plants Map)
  2. Add Data with the State_Capacity layer created above.
  3. Under Analysis and Tools, open the Spatial Autocorrelation (Global Moran's I) tool.
  4. View Details to view the Global Moran's I Summary. In this example, the Z-score of 3.5 indicates strong autocorrelation.
  5. Viewing the HTML link at the bottom of the results displays a graphical report showing the same results.
Calculating global Moran's I
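
The index behind the z-score can be computed directly from the values and a neighbor weights matrix. The sketch below implements the standard global Moran's I formula with binary weights. It does not compute the z-score, which also requires the variance of the index. The values and the chain of four neighboring areas are hypothetical.

```python
def morans_i(values, weights):
    """Global Moran's I for a list of values and a square matrix of spatial weights."""
    n = len(values)
    mean = sum(values) / n
    dev = [v - mean for v in values]
    s0 = sum(sum(row) for row in weights)  # sum of all weights
    numerator = sum(weights[i][j] * dev[i] * dev[j]
                    for i in range(n) for j in range(n))
    denominator = sum(d * d for d in dev)
    return (n / s0) * numerator / denominator

# Hypothetical layout: four areas in a row, each a neighbor of the next.
chain = [
    [0, 1, 0, 0],
    [1, 0, 1, 0],
    [0, 1, 0, 1],
    [0, 0, 1, 0],
]

trend = morans_i([1, 2, 3, 4], chain)        # similar values sit next to each other
alternating = morans_i([1, 4, 1, 4], chain)  # neighbors always differ
```

Values near +1 indicate that similar values cluster together, values near -1 indicate that neighbors tend to differ, and the expected value for a random arrangement is -1/(n-1).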

Local Moran's I

Local indicators of spatial association (LISA) were initially developed by Luc Anselin (1995) to decompose global indicators of spatial autocorrelation (like global Moran's I) and assess the contribution of each individual area.

LISA analysis with the local Moran's I statistic is used to identify clusters of areas with similar values and outlier areas with values that stand out among the neighbors.

Under Analysis and Tools, open the Cluster and Outlier Analysis (Anselin Local Moran's I) tool.

Mapping local Moran's I
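
A simplified sketch of the local statistic is shown below. Each area's deviation from the mean is compared with the average deviation of its neighbors, and the signs of the two determine whether the area resembles a cluster (HH or LL) or an outlier (HL or LH). This follows a common textbook form of local Moran's I and omits the significance testing that the tool performs before labeling an area, so it should be read as an illustration only. The values and neighbor lists are hypothetical.

```python
def local_morans(values, neighbors):
    """Return (local I, type) per area; neighbors[i] lists the indexes of area i's neighbors."""
    n = len(values)
    mean = sum(values) / n
    dev = [v - mean for v in values]
    m2 = sum(d * d for d in dev) / n  # variance of the values
    results = []
    for i, nbrs in enumerate(neighbors):
        # Spatial lag: average deviation of the neighbors (row-standardized weights).
        lag = sum(dev[j] for j in nbrs) / len(nbrs) if nbrs else 0.0
        local_i = dev[i] * lag / m2
        own = "H" if dev[i] > 0 else "L"
        around = "H" if lag > 0 else "L"
        results.append((local_i, own + around))
    return results

# Hypothetical values for five areas in a row, each a neighbor of the next.
values = [10, 9, 1, 2, 10]
neighbors = [[1], [0, 2], [1, 3], [2, 4], [3]]
results = local_morans(values, neighbors)
types = [kind for _, kind in results]
```

A positive local I means the area resembles its neighbors, and a negative local I means it stands out from them, as with the last area here, which has a high value next to a low one.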

Getis-Ord Gi*

The Getis-Ord Gi* statistic was developed by Arthur Getis and J.K. Ord and introduced in 1992. This algorithm locates areas with high values that are also surrounded by high values.

Unlike simple observation or techniques like kernel density analysis, the Getis-Ord Gi* algorithm uses statistical comparisons of all areas to create p-values indicating how probable it is that clusters of high values in specific areas could have occurred by chance. This rigorous use of statistics makes it possible to have a higher level of confidence (but not complete confidence) that the patterns are more than just a visual illusion and represent clustering that is worth investigating.

The Getis-Ord Gi* statistic is a z-score on a normal probability density curve indicating the probability that an area is in a cluster that did not occur by random chance. For a one-tailed test, a z-score of 1.282 represents a 90% probability that the cluster did not occur randomly, and a z-score of 1.645 represents a 95% probability.
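
The z-score values quoted above correspond to one-tailed areas under the standard normal curve. The sketch below converts a z-score to both one-tailed and two-tailed confidence levels using only the Python standard library. The hot spot tool reports confidence bins for both hot and cold spots, which are two-tailed (approximately ±1.65, ±1.96, and ±2.58 for 90, 95, and 99 percent), so the two-tailed function is likely the more relevant one when reading its output.

```python
import math

def normal_cdf(z):
    """Cumulative probability of the standard normal distribution at z."""
    return 0.5 * (1.0 + math.erf(z / math.sqrt(2.0)))

def one_tailed_confidence(z):
    """Share of the normal curve below z (for testing high values only)."""
    return normal_cdf(z)

def two_tailed_confidence(z):
    """Share of the normal curve between -|z| and +|z| (for testing high or low values)."""
    return 2.0 * normal_cdf(abs(z)) - 1.0

one_90 = one_tailed_confidence(1.282)
one_95 = one_tailed_confidence(1.645)
two_95 = two_tailed_confidence(1.96)
```

The same z-score of 1.645 therefore corresponds to about 95 percent confidence in a one-tailed test but only about 90 percent in a two-tailed test.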

  1. Insert a new map and give it a meaningful name (Getis Plants Map)
  2. Add Data with the State_Capacity layer created above.
  3. Under Analysis and Tools, open the Hot Spot Analysis (Getis-Ord Gi*) tool.
Getis-Ord Gi*
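
For readers who want to see how the z-score arises, the sketch below computes the statistic using the commonly published formula with binary weights, where each area's neighborhood includes the area itself. The neighbor definition, the values, and the row of seven areas are hypothetical, and the hot spot tool's results will depend on how it is configured to conceptualize spatial relationships.

```python
import math

def gi_star(values, neighbors):
    """Getis-Ord Gi* z-scores with binary weights; each area counts as its own neighbor."""
    n = len(values)
    mean = sum(values) / n
    s = math.sqrt(sum(v * v for v in values) / n - mean * mean)
    scores = []
    for i, nbrs in enumerate(neighbors):
        members = set(nbrs) | {i}
        w_sum = len(members)     # sum of binary weights
        w_sq_sum = len(members)  # sum of squared binary weights
        local_sum = sum(values[j] for j in members)
        numerator = local_sum - mean * w_sum
        denominator = s * math.sqrt((n * w_sq_sum - w_sum ** 2) / (n - 1))
        # A zero denominator means no variation or a neighborhood covering every area.
        scores.append(numerator / denominator if denominator else 0.0)
    return scores

# Hypothetical values for seven areas in a row, each a neighbor of the next.
values = [10, 9, 8, 1, 2, 1, 1]
neighbors = [[1], [0, 2], [1, 3], [2, 4], [3, 5], [4, 6], [5]]
scores = gi_star(values, neighbors)
```

The second area sits in the middle of the high values and gets a large positive z-score (a hot spot), while areas among the low values get negative z-scores.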