Analyzing Areas of Influence in ArcGIS Online

This tutorial describes one type of analysis of areas of influence that can be performed using ArcGIS Online and Google Sheets.

This example examines potential differences in incidence rates for all cancers in counties surrounding nuclear power plants in the USA versus counties outside those areas.

Our question is, "Are incidence rates for all types of cancer higher near nuclear power plants?"

Our hypothesis is that cancer rates will be higher near nuclear power plants.

Map the Areas

The first step in analyzing the geospatial distribution of a phenomenon is generally to map it and look for patterns. The cancer rate by county data for this analysis comes from the State Cancer Registry maintained by the National Institutes of Health and the Centers for Disease Control (CSV file, Metadata).

The video below demonstrates the creation of a choropleth using a cancer rate feature layer already created in this ArcGIS Online organization. In this case we symbolize by the incidence rate (per 100,000 population) for all cancers.

Although there are hot spots (clusters) of cancer around the country, there is a notable cancer belt that extends from the deep south through Appalachia.

Creating a Choropleth of Cancer Incidence

Map the Points

Areas of influence are centered around specific points. For this example, we use point locations for nuclear power plants from the US Energy Information Administration (CSV file, Metadata). The video below demonstrates adding points from a power plant feature layer already created in this ArcGIS Online organization.

Creating a Point Layer of Power Plants


In some cases you may need to filter your points to isolate a specific type of influence. For example, this feature layer contains points for all types of power plants. The video below demonstrates how to filter the layer so that only the plants with nuclear as their primary fuel source are displayed.

We also color the points black so they will stand out over the completed map.
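The same filter can be sketched outside ArcGIS Online in a few lines of Python. This is only an illustration of the idea: the column names (`Plant_Name`, `PrimSource`) and the sample rows are assumptions made up for the example, and the actual power plant file may label its fields differently.

```python
import csv
import io

# Hypothetical sample rows; the real power plant CSV may use other column names.
SAMPLE_CSV = """Plant_Name,PrimSource,Latitude,Longitude
Plant A,nuclear,35.0,-85.0
Plant B,coal,36.0,-84.0
Plant C,Nuclear,41.0,-88.0
"""


def filter_by_fuel(rows, fuel, column="PrimSource"):
    """Return only the rows whose primary fuel matches `fuel` (case-insensitive)."""
    wanted = fuel.strip().lower()
    return [row for row in rows if row.get(column, "").strip().lower() == wanted]


rows = list(csv.DictReader(io.StringIO(SAMPLE_CSV)))
nuclear_plants = filter_by_fuel(rows, "nuclear")
print([plant["Plant_Name"] for plant in nuclear_plants])
```

Comparing in lower case guards against inconsistent capitalization in the source data, which is a common reason a filter silently drops rows.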

Filtering a Point Layer

Find Nearest Locations

To isolate the counties around the power plants that we hypothesize may have higher cancer rates, we use the Find Locations, Find Existing Locations analysis tool.

  1. For this example, we ADD EXPRESSION to select counties that have at least some area within a distance of 25 miles (arbitrarily chosen) from a plant.
  2. Give the resulting layer a meaningful name. You may want to include your last name in the name to avoid duplicating a name used by someone else in your organization.
  3. Uncheck Use current map extent so that counties outside the currently displayed map area are also included in the selection.
  4. Show credits to make sure you haven't made a mistake that will use up your credit quota. Credits required over 10 generally indicate some problem.
  5. We symbolize them by cancer rate with a red color ramp so they will stand out in the map.

Selecting Counties Within the Areas of Influence

    To isolate counties outside that area, we repeat the operation, but choose not within a distance of. We symbolize them blue so they are visually subdued compared to the red areas of influence.
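The logic behind the two selections can be approximated in Python. Note the simplification: the ArcGIS Online tool tests whether any part of a county polygon lies within the distance, while this sketch only measures from a single representative point per county (assumed here to be a centroid). The field names `lat` and `lon`, the sample coordinates, and the function names are all hypothetical.

```python
import math

EARTH_RADIUS_MILES = 3958.8  # mean Earth radius


def haversine_miles(lat1, lon1, lat2, lon2):
    """Great-circle distance in miles between two latitude/longitude points."""
    phi1, phi2 = math.radians(lat1), math.radians(lat2)
    dphi = math.radians(lat2 - lat1)
    dlam = math.radians(lon2 - lon1)
    a = math.sin(dphi / 2) ** 2 + math.cos(phi1) * math.cos(phi2) * math.sin(dlam / 2) ** 2
    return 2 * EARTH_RADIUS_MILES * math.asin(math.sqrt(a))


def split_by_distance(counties, plants, max_miles=25.0):
    """Split counties into (within, outside) by centroid distance to the nearest plant."""
    within, outside = [], []
    for county in counties:
        near = any(
            haversine_miles(county["lat"], county["lon"], p["lat"], p["lon"]) <= max_miles
            for p in plants
        )
        (within if near else outside).append(county)
    return within, outside


# Made-up coordinates for illustration only.
plants = [{"name": "Plant A", "lat": 35.0, "lon": -85.0}]
counties = [
    {"name": "County X", "lat": 35.1, "lon": -85.1},
    {"name": "County Y", "lat": 36.0, "lon": -85.0},
]

within, outside = split_by_distance(counties, plants)
print([c["name"] for c in within], [c["name"] for c in outside])
```

Because every county lands in exactly one of the two lists, the pair corresponds to the red and blue layers described above.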

    Selecting Counties Outside the Areas of Influence


    Now that we have a layer of areas that should be influenced by the points, and an area that should not be influenced, we can compare the statistics to see if there is any difference in the dependent variable - in this case incidence rates for all cancers.

    Getting Statistics for Comparison

    Share the Map and Layers

    Sharing the Map

    Export to CSV Files

    To perform further analysis on this data, we will need to export both layers to CSV files and import them into a spreadsheet.

    Creating the CSV files requires two steps: exporting the layer data to a file in ArcGIS Online, and then downloading that file to your hard drive.

    Exporting Layer Data to CSV Files

    Import Into Google Sheets

    Once the CSV files are on your hard drive, you can import them into Google Sheets as separate worksheets in a single workbook.

    Importing CSV files into Google Sheets

    Descriptive Statistics Table

    Once you have your data in a spreadsheet, you can perform descriptive statistical analysis. For example:

    Give the sheet a meaningful title.

    Turn off the spreadsheet gridlines and add formatting.
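For readers who want to check the spreadsheet figures independently, the same kind of descriptive table can be sketched with Python's standard `statistics` module. The rate values below are invented placeholders, not results from the cancer data, and the choice of statistics is an assumption about what such a table typically contains.

```python
import statistics


def describe(values):
    """Return basic descriptive statistics for a list of numbers (needs 2+ values)."""
    return {
        "count": len(values),
        "mean": statistics.mean(values),
        "median": statistics.median(values),
        "stdev": statistics.stdev(values),  # sample standard deviation
        "min": min(values),
        "max": max(values),
    }


# Placeholder incidence rates per 100,000; substitute the exported columns.
near_rates = [450.0, 470.0, 490.0]
far_rates = [430.0, 440.0, 460.0, 470.0]

for label, rates in (("near plants", near_rates), ("elsewhere", far_rates)):
    summary = describe(rates)
    print(label, {name: round(value, 1) for name, value in summary.items()})
```

Placing the two summaries side by side mirrors the comparison made in the spreadsheet: one column of statistics per layer.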

    Descriptive Statistics in Google Sheets


    One way of visualizing a distribution is a histogram, which divides the distribution into a set of ranges and then displays the number of values in each range.

    To compare the two distributions, we can add a second data series. While the height of the bars is different due to the larger number of counties outside the nuclear plant areas of influence, the peak values of the nuclear plant counties are slightly to the right of the non-nuclear counties.

    Configuring the histogram to place the outlier 1% in the outside bars keeps the chart from being too wide.

    Setting the bucket size explicitly (10 in this case) controls the number of bars: larger buckets give fewer, wider bars, and a well-chosen size may make the shape of the curve clearer, depending on your distribution.
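The bucketing that a histogram performs can be sketched in Python as follows. This is an approximation for illustration: `bucket_counts` and `clamp_outliers` are hypothetical helper names, the sample values are made up, and the clamping rule is only one plausible reading of how outliers are folded into the outside bars, not a description of how Google Sheets implements it.

```python
import math
from collections import Counter


def bucket_counts(values, bucket_size=10):
    """Count values per bucket; each key is the lower edge of its bucket."""
    counts = Counter(math.floor(v / bucket_size) * bucket_size for v in values)
    return dict(sorted(counts.items()))


def clamp_outliers(values, fraction=0.01):
    """Pull the most extreme `fraction` of values at each end into the outer range.

    Assumes a non-empty list.
    """
    ordered = sorted(values)
    k = int(len(ordered) * fraction)  # values treated as outliers at each end
    low, high = ordered[k], ordered[-1 - k]
    return [min(max(v, low), high) for v in values]


# Made-up rates for illustration only.
rates = [401, 405, 412, 419, 433, 438]
print(bucket_counts(rates, bucket_size=10))
print(clamp_outliers([100, 450, 460, 470, 900], fraction=0.2))
```

Running `bucket_counts` once per layer gives the two data series; comparing which buckets hold the largest counts corresponds to comparing where the two histograms peak.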

    Two-Series Histogram in Google Sheets


    If you are collaborating with others, or wish to include a spreadsheet and chart in a Story Map, you need to publish the chart. Note that you can publish the whole workbook, or just the current spreadsheet or chart. Publish only the chart if you are placing it in a Story Map as a visualization.

    Publishing a Spreadsheet from Google Sheets


    While this comparatively simple type of analysis gives intuitive results with a minimum amount of effort, it is subject to a number of limitations that should be noted when attempting to interpret the results of the analysis.