Geospatial Data from the US Census Bureau

Rev. 21 September 2025

This tutorial covers basic techniques for acquiring and joining US Census Bureau (USCB) data. While focusing on ArcGIS Pro, this tutorial also includes information on using USCB data in the ArcGIS Online Map Viewer, Python, and R.

census.gov

US Census Bureau Data

The US Census Bureau (USCB) is the part of the US federal government responsible for collecting data about people (demographics) and the economy in the United States. The Census Bureau has its roots in Article I, Section 2 of the US Constitution, which mandates an enumeration of the entire US population every ten years (the decennial census). That count sets the number of members from each state in the House of Representatives (the lower house of the US Congress) and in the Electoral College, which selects the US President (USCB 2017). The Census Act of 1840 established a central office for conducting the decennial census, and that office became the Census Bureau under the Department of Commerce and Labor in 1903 (USCB 2021).

The American Community Survey

Among the Census Bureau's many programs is the American Community Survey (ACS), an ongoing survey that provides information on an annual basis about people in the United States beyond the basic information collected in the decennial census. The ACS is commonly used by a wide variety of researchers when they need information about the general public.

Unlike the constitutionally mandated decennial census, which is taken only every ten years, the ACS continuously surveys people in America's communities, so ACS data can be more detailed and current than the decennial census. However, because the ACS is a sample survey rather than a complete count, there is uncertainty about how accurately the sample represents the facts on the ground. That uncertainty is expressed as a statistical margin of error (MOE) on most ACS values (US Census Bureau 2018).
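As a rough illustration of how a margin of error can be read, the sketch below turns an estimate and its MOE into a confidence interval and a standard error. It assumes the MOE is published at the 90 percent confidence level (z = 1.645). The estimate and MOE values are made up for the example, and the function names are hypothetical.

```python
# Sketch: interpreting an ACS estimate and its margin of error (MOE).
# Assumption: the MOE is published at the 90 percent confidence level,
# so the standard error is MOE / 1.645. The numbers below are made up.

Z_90 = 1.645  # z-score for a 90 percent confidence interval


def confidence_interval(estimate, moe):
    """Return the (low, high) bounds implied by an estimate and its MOE."""
    return estimate - moe, estimate + moe


def standard_error(moe, z=Z_90):
    """Convert a published MOE back into a standard error."""
    return moe / z


def coefficient_of_variation(estimate, moe):
    """Relative uncertainty, as a percentage of the estimate."""
    return 100 * standard_error(moe) / estimate


estimate, moe = 60000, 3290  # hypothetical median household income and MOE
low, high = confidence_interval(estimate, moe)
print(low, high)
print(round(standard_error(moe), 1))
print(round(coefficient_of_variation(estimate, moe), 2))
```

A large coefficient of variation suggests that the estimate for that area should be treated with caution or replaced with a five-year estimate.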

Spatial Aggregation

In order to preserve the confidentiality of respondents (and the associated willingness of people to respond to highly-personal questions), the US Census Bureau generally only releases data that has been aggregated (combined) into areas at various geographic scales:

Types of aggregation areas used by the US Census Bureau

Temporal Aggregation

Although ACS data is captured through surveys that are administered on an ongoing basis, it is aggregated into time-periods to improve geographic coverage and reduce margin of error.

ACS data is released annually, aggregated over two different time periods.

One-Year Estimates | Five-Year Estimates
Useful when you need the most current data about a characteristic that changes frequently | Useful when you need the most accurate data about a characteristic that stays fairly stable over time
Useful for areas that are changing rapidly | Useful for areas that are well-established
Often has gaps in sparsely-populated rural areas | Data is more complete
Based on fewer surveys, so it has wider margins of error | Based on more surveys, so it has lower margins of error

Community Profile Pages

If you are looking for quick information on a specific state, county, city or community, the USCB provides profile pages in data.census.gov that include basic demographic information about population, income, education, etc.

You can access a profile page by typing the name of the area of interest into the search bar and waiting for it to autocomplete. If there is a profile page, a link to that page will appear for you to select.

A Profile Page on data.census.gov

FIPS Codes and GEOIDs

FIPS (Federal Information Processing Standards) codes are used to uniquely identify geographies in USCB data. FIPS codes for individual geographies can usually be found with a web search.

A full list of current FIPS codes for different area types is available on the US Census Bureau website.

FIPS codes build left to right from the more general to the more specific.

FIPS codes are commonly used as GEO_ID values in US Census Bureau data.

Because purely numeric FIPS codes are ambiguous about exactly what type of geography they represent, US Census Bureau data often includes fully-qualified GEOIDs (GEOIDFQ) that append a prefix to a FIPS code that clearly indicates what type of area the FIPS code represents.

GEOIDFQ prefixes also have specific subfields:

For example:
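A minimal sketch of how a fully-qualified GEOID can be split into its parts, assuming the common layout of a three-character summary level, a two-character geographic variant, a two-character geographic component, the literal "US", and then the FIPS-based GEOID. The function name and the small summary-level lookup are hypothetical and cover only states, counties, and census tracts. The tract identifier is used only to illustrate the layout.

```python
# Sketch: splitting a fully-qualified GEOID (GEOIDFQ) into its parts.
# Assumed layout: summary level (3) + variant (2) + component (2) + "US" + GEOID.

SUMMARY_LEVELS = {"040": "state", "050": "county", "140": "census tract"}


def parse_geoidfq(geoidfq):
    """Return a dictionary of the subfields in a GEOIDFQ string."""
    prefix, _, geoid = geoidfq.partition("US")
    if len(prefix) != 7 or not geoid:
        raise ValueError(f"Unexpected GEOIDFQ format: {geoidfq!r}")
    parts = {
        "summary_level": prefix[0:3],
        "area_type": SUMMARY_LEVELS.get(prefix[0:3], "other"),
        "variant": prefix[3:5],
        "component": prefix[5:7],
        "geoid": geoid,
        "state": geoid[0:2],  # FIPS codes build left to right: state first
    }
    if parts["area_type"] in ("county", "census tract"):
        parts["county"] = geoid[2:5]  # three-digit county code
    if parts["area_type"] == "census tract":
        parts["tract"] = geoid[5:11]  # six-digit tract code
    return parts


cook = parse_geoidfq("0500000US17031")         # Cook County, Illinois
tract = parse_geoidfq("1400000US17031010100")  # layout example for a tract
print(cook)
print(tract)
```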

USCB Table Data

Data from a variety of different USCB programs is available on data.census.gov for download as table data.

These ACS demographic profile (DP) tables contain useful groups of data:

The video below demonstrates downloading selected variables from the DP03 table with county-level data from data.census.gov.

  1. From the data.census.gov home page, search for the desired table (DP03). The default table shows values for the entire USA.
  2. Click Geos to select the type of geographic area. For this example, we will use County and All counties within the United States and Puerto Rico.
  3. Click Download Table.
  4. Select the appropriate Table Vintages. For this example, we use the 2019 five-year estimates for maximum accuracy in the pre-COVID world.
  5. Download the zipped CSV.
  6. In the Windows File Explorer, open the .zip archive, and open the file with a name containing the word "data" in it.
  7. Remove all unnecessary columns, and rename the columns to meaningful names. For this example, these are the variables we keep.
  8. Remove the 2nd row with the descriptive column information and leave just the top header row and data rows.
  9. Look through the rows and remove any rows with non-numeric data.
  10. Save the spreadsheet as a CSV file under a meaningful name (County_Economics.csv).
Downloading table data from data.census.gov
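The cleanup steps above can also be scripted. The sketch below uses only the Python standard library and works on a small in-memory stand-in for the downloaded file. The column codes, labels, and values in the sample are placeholders, and the function name clean_table is hypothetical.

```python
import csv
import io

# Placeholder excerpt in the shape described above: a header row of column
# codes, a second row of descriptive labels, then data rows.
raw = io.StringIO(
    "GEO_ID,NAME,DP03_0062E,DP03_0062M\n"
    "Geography,Geographic Area Name,Estimate!!Median household income,Margin of Error\n"
    '0500000US00001,"Example County, Example State",50000,1500\n'
    '0500000US00002,"Sample County, Example State",-,**\n'
)

# Columns to keep, mapped to meaningful names
RENAME = {"GEO_ID": "GEO_ID", "NAME": "Name", "DP03_0062E": "Median Household Income"}
NUMERIC = ["Median Household Income"]


def clean_table(handle, rename, numeric):
    """Keep selected columns, skip the label row, drop non-numeric rows."""
    reader = csv.DictReader(handle)  # the first row supplies the column codes
    next(reader)                     # skip the descriptive second row
    cleaned = []
    for record in reader:
        row = {new: record[old] for old, new in rename.items()}
        try:
            for column in numeric:
                row[column] = float(row[column])
        except ValueError:
            continue  # drop rows with non-numeric placeholders such as "-"
        cleaned.append(row)
    return cleaned


rows = clean_table(raw, RENAME, NUMERIC)

# Write the cleaned rows back out as CSV text
output = io.StringIO()
writer = csv.DictWriter(output, fieldnames=list(RENAME.values()), lineterminator="\n")
writer.writeheader()
writer.writerows(rows)
print(output.getvalue())
```

With a real download, the in-memory sample would be replaced by a file opened with open(path, newline="", encoding="utf-8-sig"), and the output would be written to a file such as County_Economics.csv.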

Joining USCB Table Data to TIGER Shapefiles

Although the tables downloaded from data.census.gov contain geographic area identifiers, they do not contain the polygon information needed to map that data as areas in software, so we need to join the table data to area polygons for mapping.

The Topologically Integrated Geographic Encoding and Referencing (TIGER) database is a collection of geospatial polygons maintained by the US Census Bureau.

Shapefiles utilize a file format developed by ESRI in the 1990s that is actually a collection of files that each contain separate information, such as the coordinates, attributes, projection, and metadata.

A join is a database operation where two tables are connected based on common key values.

In GIS, an attribute join attaches a table of attribute data to a feature class of polygons based on a set of key field values common to both the table and feature class.

Attribute joins can be used to connect data from external tables (such as in a CSV file) to geospatial locations defined in feature classes that come from USCB shapefiles or file geodatabases.

Attribute join illustration

To join USCB table data to a TIGER shapefile, we join on the GEO_ID field in the downloaded table data and the GEOIDFQ field in the TIGER/LINE shapefile.

Joining USCB table data with a TIGER shapefile in ArcGIS Pro
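The logic of an attribute join can be sketched outside of ArcGIS Pro with plain Python dictionaries. This is only an illustration of the key-matching idea, not a description of how ArcGIS Pro implements joins. The field names follow the text (GEOIDFQ for the polygons, GEO_ID for the table), the income values are placeholders, and the function name is hypothetical.

```python
# Sketch of an attribute join: attach table rows to features by key value.

polygons = [
    {"GEOIDFQ": "0500000US17019", "NAME": "Champaign"},
    {"GEOIDFQ": "0500000US17031", "NAME": "Cook"},
    {"GEOIDFQ": "0500000US17043", "NAME": "DuPage"},
]

table = [
    {"GEO_ID": "0500000US17019", "Median Household Income": 1.0},  # placeholder
    {"GEO_ID": "0500000US17031", "Median Household Income": 2.0},  # placeholder
]


def attribute_join(features, rows, feature_key, row_key):
    """Keep every feature and add the attributes of the matching row, if any."""
    lookup = {row[row_key]: row for row in rows}
    joined, unmatched = [], []
    for feature in features:
        match = lookup.get(feature[feature_key])
        if match is None:
            unmatched.append(feature[feature_key])
            joined.append(dict(feature))
        else:
            extra = {k: v for k, v in match.items() if k != row_key}
            joined.append({**feature, **extra})
    return joined, unmatched


joined, unmatched = attribute_join(polygons, table, "GEOIDFQ", "GEO_ID")
print(joined)
print("Features with no matching table row:", unmatched)
```

Listing the unmatched keys is a quick way to catch identifier formatting problems before mapping.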

Joining Non-USCB Data on FIPS Codes

Geospatial table data from non-USCB sources occasionally includes FIPS codes for areas. Such tables can be joined on GEOIDFQ with TIGER shapefiles or with secondary-source USCB data for import into ArcGIS Pro.

There are multiple common challenges to joining with FIPS codes from non-USCB sources:

The solution to these problems is to convert your FIPS codes into GEOIDFQ in your spreadsheet program before importing and joining the CSV in ArcGIS Pro.

For example, the downloadable county-level data for the US Religion Census from the Association of Statisticians of American Religious Bodies contains a numeric FIPS code field.

Joining table data on FIPS codes in ArcGIS Pro
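The same conversion can be scripted rather than done in a spreadsheet. The sketch below assumes the county FIPS codes were stored as numbers and so lost their leading zeros, and that the county GEOIDFQ prefix is 0500000US. The function name is hypothetical.

```python
# Sketch: converting numeric county FIPS codes into county GEOIDFQ strings.
# Assumptions: five-digit state+county FIPS codes, county prefix "0500000US".

COUNTY_PREFIX = "0500000US"


def county_fips_to_geoidfq(fips):
    """Zero-pad a county FIPS code to five digits and add the county prefix."""
    digits = str(int(float(fips))).zfill(5)  # accepts 1001, 1001.0, "01001"
    if len(digits) != 5:
        raise ValueError(f"Not a five-digit county FIPS code: {fips!r}")
    return COUNTY_PREFIX + digits


print(county_fips_to_geoidfq(1001))     # Autauga County, Alabama
print(county_fips_to_geoidfq("17031"))  # Cook County, Illinois
```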

Joining Non-USCB Data on County Names

Attribute joins can be used to join county-level table data to TIGER shapefiles based on county names for choropleth mapping.

This technique has numerous pitfalls that can cause joins to make incomplete matches or mismatches:

This example uses county-level breast cancer incidence rates from the Illinois Department of Public Health Cancer in Illinois data portal.

Joining county table data to demographic polygons in ArcGIS Pro
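One way to reduce name mismatches is to normalize the names on both sides into a comparison key before joining. The sketch below shows that idea. The specific rules (ignoring case, spacing, punctuation, a trailing "County", and the "St." abbreviation) are assumptions about typical inconsistencies and would need to be adapted to the actual data. The function name and sample lists are hypothetical.

```python
import re

# Sketch: building comparison keys so that county names with different
# spelling conventions can still be matched.


def normalize_county_name(name):
    """Reduce a county name to a lowercase key without suffixes or punctuation."""
    key = name.strip().lower()
    key = re.sub(r"\s+(county|parish|borough)$", "", key)  # drop area-type suffix
    key = re.sub(r"^st\.?\s+", "saint ", key)              # expand "St." / "St"
    key = re.sub(r"[^a-z]", "", key)                       # drop spaces and punctuation
    return key


table_names = ["DE KALB", "St. Clair", "Jo Daviess", "Cook"]
polygon_names = ["DeKalb County", "Saint Clair County", "Jo Daviess County"]

table_keys = {normalize_county_name(n) for n in table_names}
polygon_keys = {normalize_county_name(n) for n in polygon_names}

print("Matched:", sorted(table_keys & polygon_keys))
print("In table only:", sorted(table_keys - polygon_keys))
```

Names that appear on only one side after normalization usually need to be corrected by hand.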

TIGER Infrastructure Shapefiles

TIGER/Line shapefiles are available for a variety of geographic features. These features can be used for creating base maps.

  1. Go to the USCB's TIGER/Line Shapefiles page, choose the web interface, and download the desired geometries.
  2. Using Windows Explorer, open the file, copy the contents, and paste them into the Downloads directory.
  3. Bring the data into the project geodatabase with the Export Features tool.
Importing a TIGER/Line shapefile of railroad lines

Project Geodatabase

When working with data from shapefiles and tables in ArcGIS Pro, the data should be saved in the project geodatabase to keep the project data together and avoid losing external files. This is why many examples in this tutorial use the Export Features tool to copy the data from shapefiles or feature services into the project geodatabase.

The relationship between geodatabases, feature classes, features, maps, and layouts

A project geodatabase is the default geodatabase used for storing feature classes that are imported or created as part of an ArcGIS Pro project.

Viewing the contents of the project file geodatabase in the Catalog Pane

USCB Data from Secondary Sources

Although data.census.gov is the definitive primary source for US Census Bureau data, the amount of available data is vast, and that data is made available in formats that require additional processing to use in GIS. Accordingly, subsets of that data are sometimes made available from secondary sources in pre-processed formats that facilitate easier use.

Feature Services

Feature services provide clients with access to vector features in a server geodatabase through REST endpoints.

Data from feature service endpoints can be downloaded into your project geodatabase so you can preserve a snapshot of the current data in case the feature service data is modified or deleted.

Client-server architecture
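To illustrate what a REST endpoint request looks like, the sketch below builds a query URL for a feature service layer using the where, outFields, and f parameters of the ArcGIS REST query operation. The layer URL is a hypothetical placeholder, the ST field is an assumed field name, and whether a particular service supports GeoJSON output is also an assumption. The code only constructs the URL and does not contact any server.

```python
from urllib.parse import parse_qs, urlencode, urlsplit

# Sketch: constructing a feature service query URL (no request is sent).


def feature_query_url(layer_url, where="1=1", out_fields="*", fmt="geojson"):
    """Build a query URL for a feature service layer endpoint."""
    params = {"where": where, "outFields": out_fields, "f": fmt}
    return layer_url.rstrip("/") + "/query?" + urlencode(params)


# Hypothetical layer endpoint and field name
layer = "https://example.com/arcgis/rest/services/Example/FeatureServer/0"
url = feature_query_url(layer, where="ST = 'IL'")
print(url)

# Decoding the query string shows the parameters the server would receive
print(parse_qs(urlsplit(url).query))
```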

ArcGIS Online Feature Services

Numerous authors publish subsets of USCB data as feature services in ArcGIS Online. Use caution when accessing data from non-authoritative ArcGIS Online sources, because much of it comes from student projects and is of uncertain vintage and integrity.

The Minn 2019-2023 ACS feature service in the University of Illinois ArcGIS Online organization features a wide variety of commonly-used demographic variables from the 2019-2023 ACS five-year estimates data profile (DP) tables at state, county, and census tract aggregation levels. The data has full metadata and is also available as downloadable GeoJSON for use in R or Python.

Use the Export Features tool to copy the data from the feature service into the project geodatabase.

Importing county ACS data from a feature service

Counties Filtered by State

If you only need counties in a specific state, you can use the Export Features tool with a filter.

Importing county ACS data using a precompiled layer

County Tracts Filtered by GEOID

Tracts for specific counties can be filtered using a GEOIDFQ, which is described above.

Downloading census tract ACS data for Cook County, IL

Living Atlas Features

ESRI's Living Atlas of the World is a collection of geospatial data services that can be accessed in ArcGIS Pro or in ArcGIS Online. These services include data from the USCB and other government agencies.

Although the Living Atlas can be convenient, there are three issues with using Living Atlas data on a regular basis:

You can download features from many Living Atlas feature services into your ArcGIS Pro project geodatabase.

  1. Open the Export Features tool.
  2. Symbolize the new feature class by the desired variable.
Exporting Living Atlas features into a new project database feature class

Public Feature Services

Government agencies often make their geospatial data available through their open data portals.

In this example, we download a feature class of trail center lines from the Lake County, IL Open Data & Records Hub.

  1. Acquire: Find the desired feature service and select I want to use this...
  2. Store: Under Analysis, Tools, open the Export Features tool.
    • Input Features: Copy the endpoint URL and remove the query component of the endpoint URL.
    • Output Features: Browse into the project geodatabase and provide a meaningful name (Trails).
  3. Communicate: Symbolize as needed.
Exporting features from a feature service
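Trimming the endpoint URL by hand is error-prone, so the sketch below shows one way to do it programmatically. It assumes that "removing the query component" means dropping everything from the question mark onward, plus a trailing /query path segment if one is present. The example URL and the function name are hypothetical.

```python
from urllib.parse import urlsplit, urlunsplit

# Sketch: reducing a copied query endpoint URL to the layer URL.


def layer_url_from_endpoint(endpoint):
    """Strip the query string and any trailing /query segment from a URL."""
    parts = urlsplit(endpoint)
    path = parts.path.rstrip("/")
    if path.endswith("/query"):
        path = path[: -len("/query")]
    return urlunsplit((parts.scheme, parts.netloc, path, "", ""))


# Hypothetical endpoint in the style copied from an open data portal
copied = "https://example.com/arcgis/rest/services/Trails/FeatureServer/0/query?outFields=*&where=1%3D1"
print(layer_url_from_endpoint(copied))
```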

Data downloaded from feature services may cause the cryptic Geometry cannot have Z values error when packaging a project, even if the feature class properties do not indicate the presence of Z values. The workaround is to export the feature class to a shapefile, then export the shapefile back over the feature class in the project geodatabase.

Raster and Elevation Data

While the USCB does not provide raster or elevation data, ArcGIS Pro provides tools that can be used to acquire, clip, and summarize raster data within USCB polygons.

Image Service Download

Image services provide clients with the ability to access raster and image data from a server geodatabase.

The National Agricultural Imagery Program (NAIP) is a program begun by the US Department of Agriculture (USDA) in 2002 to collect leaf-on aerial imagery during the agricultural growing season. Aside from research value, the imagery is used to maintain the USDA's Common Land Unit (CLU) database of farm fields across the US (ESRI n.d.).

For this example we demonstrate how to download a portion of the USA NAIP Imagery: Natural Color Living Atlas layer covering Peoria County, Illinois to a raster in the project geodatabase.

  1. Acquire: Add the desired image service layer to your map.
  2. Store: Right-click the image layer and select Data, then Export Raster.
  3. Communicate: Remove the ArcGIS Online tile layer and leave only the new raster layer on the map.
Exporting a portion of a Living Atlas image layer to the project geodatabase

Clipped Image Service Download

If you have boundary polygon(s), you can clip the downloaded raster.

  1. Acquire: Add the desired image service layer to your map.
  2. Store: Right-click the image layer and select Data, then Export Raster.
  3. Communicate: Remove the ArcGIS Online tile layer and leave only the new raster layer on the map.
Exporting a clipped section of a Living Atlas image layer to the project geodatabase

Point Elevation

If your primary interest is getting elevation values for points and you have fewer than 1,000 features, the Summarize Elevation tool can be used to add an elevation field to a point feature class from ESRI's world elevation service.

This example demonstrates adding elevation values to a point feature class of Chicago Transit Authority "L" Stations.

Getting elevation values for points

Area Elevation

As with points, if your primary interest is getting elevation values for areas and you have fewer than 1,000 features, the Summarize Elevation tool can be used to add an elevation field to an area feature class.

This example adds elevation to neighborhood boundaries in the City of Chicago.

Getting elevation values for areas

Spatial Joins

A spatial join connects two datasets based on a spatial relationship where attribute values are transferred from a set of features in a join layer to a target layer.

Spatial joins can be used to aggregate and summarize values to/from USCB polygons for mapping and analysis.

Joining Dissimilar Areas

Spatial joins can be used to aggregate data from polygons into larger or smaller polygons. This example demonstrates use of a spatial join to join demographic census tract data from the Minn 2015-2019 ACS Tracts feature service to neighborhood boundaries in the City of Chicago.

Joining dissimilar areas

Point Attributes from Polygons

You can get attribute data for points from polygons with a spatial join.

In this example we join hospital locations in Illinois with census tracts to get demographic characteristics of the neighborhoods where they are located.

Library demographics

Proximity Joins

Spatial joins can be used to get counts of join features within a specific distance of the target features.

For this example, we join park boundaries from the Chicago Park District with census tracts in Cook County, IL to get the count of parks within 1/2 mile of each census tract.

Counts of parks within 1/2 mile of census tracts in Cook County, IL

Aggregate Points

Spatial joins can be used to get counts of points within polygons.

This example aggregates Chicago Police Department robbery crime points for 2023 from the Chicago Data Portal.

Neighborhood robbery counts in 2019 and 2023
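The idea behind counting points within polygons can be sketched in plain Python with a ray-casting point-in-polygon test. This is a conceptual illustration with made-up coordinates, not the algorithm ArcGIS Pro necessarily uses. It ignores holes, multipart polygons, and points that fall exactly on a boundary.

```python
# Sketch: counting points inside polygons with a ray-casting test.


def point_in_polygon(x, y, ring):
    """Return True if (x, y) falls inside the polygon ring (list of vertices)."""
    inside = False
    n = len(ring)
    for i in range(n):
        x1, y1 = ring[i]
        x2, y2 = ring[(i + 1) % n]
        # Does this edge cross the horizontal line through the point?
        if (y1 > y) != (y2 > y):
            x_cross = x1 + (y - y1) * (x2 - x1) / (y2 - y1)
            if x < x_cross:
                inside = not inside  # each crossing to the right flips the state
    return inside


# Made-up neighborhoods (squares) and incident points
neighborhoods = {
    "A": [(0, 0), (2, 0), (2, 2), (0, 2)],
    "B": [(2, 0), (4, 0), (4, 2), (2, 2)],
}
incidents = [(1, 1), (0.5, 1.5), (3, 1), (5, 5)]

counts = {
    name: sum(point_in_polygon(x, y, ring) for x, y in incidents)
    for name, ring in neighborhoods.items()
}
print(counts)
```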

Aggregate Coverage

Spatial joins can be used to calculate the areas of polygon features that are covered by areas in another feature class.

This example estimates the percentage of census tracts in Milwaukee County, WI (Tracts) that are covered with tree canopy based on a feature class of tree canopy polygons (Tree_Canopy) digitized from 2020 lidar data by the Milwaukee County GIS and Land Information Office.

Aggregating polygon coverage areas

ArcGIS Online

Using Precompiled Data

The video below demonstrates how to add the Minn 2015-2019 ACS Counties feature service available from the University of Illinois ArcGIS Online organization.

  1. Search for the layer in ArcGIS Online.
  2. If needed, resymbolize the layer and select the variable you wish to display. For this example we use median monthly rent.
  3. If needed, change the color ramp and/or adjust the scale to accentuate the differences.
  4. Change the blending mode to Multiply so you can see the base map as geographic context for your data.
  5. Save the map under a meaningful name (Minn County Income).
  6. Share the map.
  7. Copy the URL to get a link.
Creating a county map using a precompiled layer

Living Atlas

The video below shows how to add a Living Atlas layer of median household income to a map in ArcGIS Online. Note that this layer is scale-dependent and changes the types of areas being displayed (states, counties, census tracts) depending on how closely you are zoomed in to the map.

  1. Create a new map in ArcGIS Online.
  2. Select Add and Living Atlas Layers.
  3. Search for the data by name, in this case median household income.
  4. Zoom in on the area you want to display. This particular layer is a scale-dependent layer that changes the types of areas displayed depending on how closely you are zoomed in to an area.
  5. Change the Blending to Multiply so you can see the base map and labels as geographic context for your data.
  6. Save the map under a meaningful name, share it, and copy the URL for a link (Minn_2019_County_Economics).
Creating a Map With a Living Atlas Layer

You can get metadata on the source and year of the information in a layer by opening the Properties panel and clicking on the link below Information.

Getting Metadata For a Living Atlas Layer

Mapping Table Data in ArcGIS Online

For the join key, we use the USCB GEO_ID field that is common to both the table downloaded from data.census.gov and the TIGER/LINE shapefile.

  1. Download table data from data.census.gov as demonstrated above.
  2. Download an area polygon shapefile as demonstrated above.
  3. On your ArcGIS Online Content page, click New Item and upload the zipped shapefile as a hosted feature layer.
  4. Click New Item and upload the CSV as a hosted feature table.
  5. After the layer completes publishing, Open in Map Viewer.
  6. Click the Analysis icon on the right side panel and select Tools, Summarize Data, Join Features.
  7. Update the polygon Symbology based on the newly joined data variable.
  8. Save the map under a meaningful title (Minn_2019_County_Economics).
  9. Back in your Content page, remove the original shapefile and table.
Creating a choropleth in ArcGIS Online

Table Data and TIGER Shapefiles

Zipped shapefiles can be read directly into the ArcGIS Online Map Viewer to create new feature classes for mapping and analysis.

Importing a TIGER shapefile of military installations into a new feature service using the ArcGIS Online Map Viewer

Python

GeoPandas is a Python package for working with geospatial data.

Matplotlib is a Python package for plotting graphs.

Precompiled Data

To load precompiled ACS data into a GeoDataFrame in Python:

  1. Read the geospatial data from the file into a GeoDataFrame object using the read_file() function.
  2. The to_crs() method is used to reproject the data to the North America Lambert Conformal Conic projection suitable for North America (ESRI 102009).
  3. The GeoDataFrame plot() method can be used with the name of the attribute to visualize the variable as a choropleth map.
  4. The set_axis_off() method of the Matplotlib axes returned by plot() turns off the axis scale around the map, which is unnecessary with a projected map.
  5. The pyplot show() function displays the plotted map.

import geopandas

import matplotlib.pyplot as plt

counties = geopandas.read_file("https://michaelminn.net/tutorials/data/2015-2019-acs-counties.geojson")

counties = counties.to_crs("ESRI:102009")

axis = counties.plot("Median Household Income", cmap="coolwarm", legend=True, scheme="quantiles")

axis.set_axis_off()

plt.show()
Choropleth of median household income by county from a precompiled GeoJSON file

We can filter the counties to show only the continental US to more effectively use the mapped area. We can also overlay a map of state outlines over the counties for geographic context.

counties = counties[~counties["ST"].isin(['AK', 'HI', 'PR'])]

states = geopandas.read_file("https://michaelminn.net/tutorials/data/2015-2019-acs-states.geojson")

states = states[~states["ST"].isin(['AK', 'HI', 'PR'])]

states = states.to_crs(counties.crs)

axis = counties.plot("Median Household Income", scheme="naturalbreaks", 
	cmap="coolwarm_r", edgecolor="none", legend=True,
	legend_kwds={"bbox_to_anchor":(0.2, 0.4)})

states.plot(facecolor="none", edgecolor="#808080", ax=axis)

axis.set_axis_off()

plt.show()
Filtered county data with a state base map

Table Data in Python

A conventional technique for acquiring USCB data is to download table data from data.census.gov and join it to TIGER polygons. Downloading is preferred over API access if you wish to preserve a snapshot of the data at a particular time, or you want to avoid the unreliability and download times associated with APIs.

  1. Download and clean up the table data as described above.
  2. Pandas is a Python package for working with tabular data.
  3. GeoPandas is a Python package for working with geospatial data.
  4. Matplotlib is a Python package for plotting graphs.
  5. CSV files can be read into Pandas DataFrames using the read_csv() function.
  6. Find the URL to the appropriate TIGER cartographic boundary zipped shapefile from the link on the page shown above.
  7. Read the cartographic boundary file directly from the USCB website into a GeoDataFrame object using the read_file() function.
  8. Use the to_crs() method to reproject the data to the North America Lambert Conformal Conic projection suitable for North America (ESRI 102009).
  9. Join the table DataFrame with the county polygon GeoDataFrame using the merge() method, matching the AFFGEOID field in the polygons to the GEO_ID field in the table.
  10. To map only the continental 48 states, we exclude Alaska, Hawaii, and Puerto Rico using their FIPS codes.
import pandas

import geopandas

import matplotlib.pyplot as plt

county_data = pandas.read_csv("https://michaelminn.net/tutorials/gis-census/2023_County_Economics.csv")

counties = geopandas.read_file("https://www2.census.gov/geo/tiger/GENZ2019/shp/cb_2019_us_county_5m.zip")

counties = counties.to_crs("ESRI:102009")

counties = counties.merge(county_data, left_on="AFFGEOID", right_on="GEO_ID")

counties = counties[~counties["STATEFP"].isin(['02', '15', '72'])]

axis = counties.plot("Percent Workforce Participation", cmap="coolwarm_r", legend=True, scheme="quantiles")

axis.set_axis_off()

plt.show()
Choropleth of percent workforce participation

TIGER Shapefiles

  1. Find the URL to the appropriate TIGER cartographic boundary zipped shapefile from the link on the page shown above.
  2. Read the cartographic boundary file directly from the USCB website into a GeoDataFrame object using the read_file() function.
  3. The to_crs() method is used to reproject the data to the North America Lambert Conformal Conic projection suitable for North America (ESRI 102009).
  4. The GeoDataFrame plot() method plots the polygons.
  5. The set_axis_off() method of the Matplotlib axes returned by plot() turns off the axis scale, which is unnecessary with a projected map.
  6. The pyplot show() function displays the plotted map.
import geopandas

import matplotlib.pyplot as plt

military = geopandas.read_file("https://www2.census.gov/geo/tiger/TIGER2023/MIL/tl_2023_us_mil.zip")

states = geopandas.read_file("https://www2.census.gov/geo/tiger/GENZ2023/shp/cb_2023_us_state_5m.zip")

states = states.to_crs("ESRI:102009")

military = military.to_crs("ESRI:102009")

axis = states.plot(facecolor='none', edgecolor='gray')

military.plot(facecolor='red', ax=axis)

axis.set_axis_off()

plt.show()
Map of US military installations

R

Functions from the sf (simple features) library are used to work with vector geospatial data in R.

Precompiled Data

  1. Load the GeoJSON polygons into a simple features data.frame using st_read().
  2. Reproject the polygons into a cartographically appropriate projection with st_transform(). For this data we use the North America Lambert Conformal Conic projection suitable for North America (ESRI 102009).
  3. Create an appropriate diverging colorRampPalette.
  4. plot() a choropleth colored by the desired variable.
library(sf)

counties = st_read("https://michaelminn.net/tutorials/data/2015-2019-acs-counties.geojson")

counties = st_transform(counties, "ESRI:102009")

redblue = colorRampPalette(c("red", "lightgray", "navy"))

plot(counties["Median.Household.Income"], breaks="quantile", pal=redblue, lwd=0.1)
Choropleth of median household income by county from a precompiled GeoJSON file

Filtering in R

We can filter the counties to show only the continental US to more effectively use the mapped area. We can also overlay a map of state outlines over the counties for geographic context.

states = st_read("https://michaelminn.net/tutorials/data/2015-2019-acs-states.geojson")

states = states[!(states$ST %in% c('AK', 'HI', 'PR')),]

states = st_transform(states, st_crs(counties))

counties = counties[!(counties$ST %in% c('AK', 'HI', 'PR')),]

plot(counties["Median.Household.Income"], breaks="quantile", pal=redblue, border=NA, reset=FALSE)

plot(states$geometry, col=NA, border="#404040", add=TRUE)
Filtered county data with a state base map

Shapefiles in R

library(sf)

download.file('https://www2.census.gov/geo/tiger/TIGER2023/MIL/tl_2023_us_mil.zip', 'temp.zip')

military = st_read('/vsizip/temp.zip')

download.file('https://www2.census.gov/geo/tiger/GENZ2023/shp/cb_2023_us_state_5m.zip', 'temp.zip')

states = st_read('/vsizip/temp.zip')

military = st_transform(military, "ESRI:102009")

states = st_transform(states, "ESRI:102009")

states = states[!(states$STATEFP %in% c('02', '15', '60', '66','69', '72', '78')),]

plot(states$geometry, col=NA, border='gray')

plot(military$geometry, col='red', add=TRUE)
Map of US military installations

Table Data in R

A conventional technique for acquiring USCB data is to download table data from data.census.gov and join it to TIGER polygons. Downloading is preferred over API access if you wish to preserve a snapshot of the data at a particular time, or you want to avoid the unreliability and download times associated with APIs.

  1. Download and clean up the table data as described above.
  2. Read the table data into an R data.frame using the read.csv() function.

  3. Find the URL to the appropriate TIGER cartographic boundary zipped shapefile from the link on the page shown above.
  4. Use download.file() to save the archive to a temporary file with the .shz extension. This intermediate step gives the file the .shz extension, which tells st_read() that it is a zipped shapefile.
  5. Load the shapefile polygons into a simple features data.frame using st_read().
  6. Reproject the polygons into a cartographically appropriate projection with st_transform(). For this data we use the North America Lambert Conformal Conic projection suitable for North America (ESRI 102009).
  7. merge() the polygons and the table, matching the AFFGEOID field in the polygons to the GEO_ID field in the table.
  8. To map only the continental 48 states, we exclude Alaska, Hawaii, and Puerto Rico using their FIPS codes.
  9. Create an appropriate diverging colorRampPalette.
  10. plot() a choropleth colored by the desired variable.
library(sf)

county_data = read.csv("https://michaelminn.net/tutorials/gis-census/2023_County_Economics.csv")

download.file("https://www2.census.gov/geo/tiger/GENZ2019/shp/cb_2019_us_county_5m.zip", "temp.shz")

counties = st_read("temp.shz")

counties = st_transform(counties, "ESRI:102009")

counties = merge(counties, county_data, by.x="AFFGEOID", by.y="GEO_ID")

counties = counties[!(counties$STATEFP %in% c('02', '15', '72')),]

redblue = colorRampPalette(c("red", "lightgray", "navy"))

plot(counties["Percent.Workforce.Participation"], pal=redblue, breaks="quantile")
Choropleth of percent workforce participation

If you want to save a copy of your processed data for later use, you can use the st_write() function to create a variety of different types of geospatial data files.

st_write(counties, "2019_County_Economics.geojson")

The US Census Bureau API

The USCB makes much of its data available through application programming interfaces (APIs) that permit direct access to current versions of USCB data via web services.

Although APIs have a learning curve, if you are using USCB data in R or Python, access through an API can be a much more flexible way of accessing USCB data than manually downloading and cleaning table data.

ACS Variables

ACS variables are referenced by cryptic variable names that indicate the source table and the number of the variable in that table, along with letters that indicate whether the variable represents estimated values or margins of error. Adding to the complexity, variables representing different types of summarization are stored in different reference files.
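As an illustration of that naming pattern, the sketch below splits a variable name into its table, variable number, and suffix. It assumes the suffixes E (estimate), M (margin of error), PE (percent estimate), and PM (percent margin of error), plus an optional column component such as C01 in subject table variables. The regular expression and the function name are hypothetical and may not cover every variable naming scheme.

```python
import re

# Sketch: decoding ACS variable names such as DP03_0062E.

SUFFIXES = {
    "E": "estimate",
    "M": "margin of error",
    "PE": "percent estimate",
    "PM": "percent margin of error",
}

VARIABLE_PATTERN = re.compile(
    r"^(?P<table>[A-Z]+\d+[A-Z]*)_"  # source table, e.g. DP03
    r"(?:(?P<column>C\d+)_)?"        # optional column component, e.g. C01
    r"(?P<number>\d+)"               # variable number within the table
    r"(?P<suffix>PE|PM|E|M)$"        # what kind of value the variable holds
)


def parse_variable(name):
    """Split an ACS variable name into its components."""
    match = VARIABLE_PATTERN.match(name)
    if match is None:
        raise ValueError(f"Unrecognized variable name: {name!r}")
    parts = match.groupdict()
    parts["meaning"] = SUFFIXES[parts["suffix"]]
    return parts


for name in ["DP03_0062E", "DP03_0062M", "DP03_0062PE", "S1901_C01_012E"]:
    print(name, parse_variable(name))
```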

For this example, we focus on data from the 2015-2019 ACS five-year estimates that represent the final data release reflecting the pre-COVID world.

Typical variables from the profile variable list (DP02 - DP05) include:

Typical variables from the subject variable list include:

API Calls

USCB API calls are URLs with path components and parameters that will return the requested data. This list of available APIs describes the different API options.

For this example, we will download state-level median household income (DP03_0062E) from the 2015-2019 ACS five-year estimates that represent the final data release reflecting the pre-COVID world.

For this API query URL:

https://api.census.gov/data/2019/acs/acs5/profile?get=GEO_ID,DP03_0062E&for=state:*
USCB API JSON viewed in a browser
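The structure of such a URL, and of the JSON it returns, can be sketched with the Python standard library. The helper names below are hypothetical, and the sample response is a placeholder in the expected shape (a list of rows whose first row holds the column names), not real data. No request is sent.

```python
import json
from urllib.parse import urlencode

# Sketch: assembling an ACS profile API URL and reading a response.

BASE = "https://api.census.gov/data"


def acs_profile_url(year, variables, geography, within=None):
    """Build a query URL for the ACS five-year data profile endpoint."""
    params = {"get": ",".join(variables), "for": geography}
    if within:
        params["in"] = within
    # Keep commas, colons, and asterisks readable rather than percent-encoded
    return f"{BASE}/{year}/acs/acs5/profile?" + urlencode(params, safe=",:*")


def rows_to_records(payload):
    """Convert the JSON list of rows into a list of dictionaries."""
    header, *rows = json.loads(payload)
    return [dict(zip(header, row)) for row in rows]


state_url = acs_profile_url(2019, ["GEO_ID", "DP03_0062E"], "state:*")
county_url = acs_profile_url(2019, ["GEO_ID", "DP03_0062E"], "county:*", within="state:17")
print(state_url)
print(county_url)

# Placeholder response: the value "00000" is not a real estimate
sample = '[["GEO_ID","DP03_0062E","state"],["0400000US17","00000","17"]]'
records = rows_to_records(sample)
print(records)
```

Note that the values arrive as text, so numeric columns need to be converted before analysis or mapping.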

Using USCB API Data in Python

To create a GeoDataFrame of ACS data:

  1. Import Pandas for reading the API table and GeoPandas, a Python package for working with geospatial data.
  2. Import Matplotlib, a Python package for plotting graphs.
  3. Load the API data into a DataFrame by passing the API URL to the Pandas read_json() function.
  4. Use iloc to remove the header row and the redundant state column.
  5. Use rename() to give the columns meaningful names.
  6. Use astype(float) to convert the data column from text to numeric.
  7. Find the URL to the appropriate TIGER cartographic boundary zipped shapefile from the link on the page shown above.
  8. Read the cartographic boundary file directly from the USCB website into a GeoDataFrame object using the read_file() function.
  9. Use the to_crs() method to reproject the data to the North America Lambert Conformal Conic projection (ESRI 102009).
  10. Join the table DataFrame with the state polygon GeoDataFrame using the merge() method, matching the AFFGEOID field in the polygons to the GEO_ID field in the table.
  11. To map only the 48 contiguous states, exclude Alaska, Hawaii, and Puerto Rico.
  12. Use the plot() method to draw a choropleth of the income column. The scheme parameter requires the mapclassify package.
import pandas

import geopandas

import matplotlib.pyplot as plt

state_income = pandas.read_json("https://api.census.gov/data/2019/acs/acs5/profile?get=GEO_ID,DP03_0062E&for=state:*")

state_income = state_income.iloc[1:, 0:2]

state_income = state_income.rename(columns={0:"GEO_ID", 1:"Median Household Income"})

state_income["Median Household Income"] = state_income["Median Household Income"].astype(float)

states = geopandas.read_file("https://www2.census.gov/geo/tiger/GENZ2022/shp/cb_2022_us_state_5m.zip")

states = states.to_crs("ESRI:102009")

state_income = states.merge(state_income, left_on="AFFGEOID", right_on="GEO_ID")

state_income = state_income[~state_income["STUSPS"].isin(['AK', 'HI', 'PR'])]

axis = state_income.plot("Median Household Income", scheme="naturalbreaks", 
	cmap="coolwarm_r", edgecolor="#808080", legend=True)

axis.set_axis_off()

plt.show()
Figure
An import of USCB ACS data using the API: median household income by state 2015-2019
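
The table cleanup in the listing above depends on the shape of the API response: a JSON array of arrays in which the first row holds the column names and every value arrives as text. The sketch below walks through the same cleanup offline, using only the Python standard library in place of Pandas. The sample text is hand-written for illustration, and its income values are placeholders rather than actual ACS estimates.

```python
import json

# Hand-written sample shaped like the API response. The income values
# are placeholders for illustration, not actual ACS estimates.
sample = """[["GEO_ID", "DP03_0062E", "state"],
 ["0400000US17", "60000", "17"],
 ["0400000US18", "50000", "18"]]"""

rows = json.loads(sample)

# The first row is the header; the remaining rows are data.
header, body = rows[0], rows[1:]

records = []
for row in body:
    record = dict(zip(header, row))
    records.append({
        "GEO_ID": record["GEO_ID"],
        # Values arrive as text and must be converted to numbers.
        "Median Household Income": float(record["DP03_0062E"]),
    })

print(records)
```

Looking fields up by the names in the header row, rather than by position, keeps the cleanup working if the order of the requested fields changes.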

County-Level Data

County-level data can be loaded by modifying the API parameters and joining with the TIGER counties file.

This example maps median household income (DP03_0062E) by Illinois county (state FIPS 17).
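
The parameter changes between geographic levels follow a regular pattern: the for parameter names the level to return, and one or more in parameters restrict the query to a parent geography. The helper below is a hypothetical convenience function (acs_profile_url() is not part of any USCB or Python library) that assembles the query URLs used in this tutorial.

```python
from urllib.parse import urlencode

BASE = "https://api.census.gov/data/2019/acs/acs5/profile"


def acs_profile_url(variables, geography, within=()):
    """Build a query URL; `within` is a sequence of (level, code) pairs."""
    query = [("get", ",".join(variables)), ("for", f"{geography}:*")]
    query += [("in", f"{level}:{code}") for level, code in within]
    # Leave ":", ",", and "*" unescaped so the URL matches the form
    # used in the listings.
    return BASE + "?" + urlencode(query, safe=":,*")


fields = ["GEO_ID", "DP03_0062E"]

state_url = acs_profile_url(fields, "state")
county_url = acs_profile_url(fields, "county", [("state", "17")])
tract_url = acs_profile_url(fields, "tract", [("state", "17"), ("county", "031")])

print(state_url)
print(county_url)
print(tract_url)
```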

county_income = pandas.read_json("https://api.census.gov/data/2019/acs/acs5/profile?get=GEO_ID,DP03_0062E&for=county:*&in=state:17")

county_income = county_income.iloc[1:, 0:2]

county_income = county_income.rename(columns={0:"GEO_ID", 1:"Median Household Income"})

county_income["Median Household Income"] = county_income["Median Household Income"].astype(float)

counties = geopandas.read_file("https://www2.census.gov/geo/tiger/GENZ2019/shp/cb_2019_us_county_20m.zip")

counties = counties.to_crs("ESRI:102009")

county_income = counties.merge(county_income, left_on="AFFGEOID", right_on="GEO_ID")

axis = county_income.plot("Median Household Income", scheme="naturalbreaks", 
	cmap="coolwarm_r", edgecolor="#808080", legend=True,
	legend_kwds={"bbox_to_anchor":(0.2, 0.4)})

axis.set_axis_off()

plt.show()
Figure
Median household income by county

Tract-Level Data

Census tracts are subdivisions of counties that are drawn based on clearly identifiable features to ideally contain around 4,000 residents, although in practice the range of population is usually between 1,200 and 8,000 (USCB 2019).

This example maps median household income (DP03_0062E) by census tract in Cook County, Illinois (state FIPS 17, county FIPS 031).
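
The joins in this tutorial work because the GEO_ID returned by the API and the AFFGEOID field in the cartographic boundary files share one format: a three-digit summary level code, the characters 0000US, and then the FIPS codes from largest to smallest geography. The sketch below composes identifiers for the levels used here. The function name geo_id() is an assumption for this example, and the tract code 010100 is included only to show the six-digit form.

```python
# Summary level codes for the three geographies used in this tutorial.
SUMMARY_LEVELS = {"state": "040", "county": "050", "tract": "140"}


def geo_id(level, state, county="", tract=""):
    """Compose a GEO_ID such as 0500000US17031 from FIPS codes."""
    return SUMMARY_LEVELS[level] + "0000US" + state + county + tract


print(geo_id("state", "17"))
print(geo_id("county", "17", "031"))
print(geo_id("tract", "17", "031", "010100"))
```

FIPS codes need to stay as zero-padded text (031 rather than 31), or the identifiers will not match the boundary file.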

Note that tract-level estimates are often undisclosed or unavailable, and the API represents such values with the negative number -666666666. This code resets those values to zero to avoid distorting the legend.
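
Resetting to zero is the approach used in this tutorial. One alternative, sketched below under that stated difference, is to treat negative placeholder values as missing so that they are left out of summary statistics and class breaks. The function name clean_estimate() and the sample values are made up for illustration.

```python
import math


def clean_estimate(text):
    """Convert an API value to float, mapping negative placeholders to NaN."""
    value = float(text)
    return math.nan if value < 0 else value


# Illustrative values, not actual ACS estimates.
values = ["52000", "-666666666", "78000"]

cleaned = [clean_estimate(v) for v in values]

# Missing values are skipped when summarizing.
valid = [v for v in cleaned if not math.isnan(v)]

print(cleaned)
print(min(valid), max(valid))
```

If missing values are kept in a GeoDataFrame, the GeoPandas plot() method accepts a missing_kwds argument that controls how those polygons are drawn.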

tract_income = pandas.read_json("https://api.census.gov/data/2019/acs/acs5/profile?get=GEO_ID,DP03_0062E&for=tract:*&in=state:17&in=county:031")

tract_income = tract_income.iloc[1:, 0:2]

tract_income = tract_income.rename(columns={0:"GEO_ID", 1:"Median Household Income"})

tract_income["Median Household Income"] = tract_income["Median Household Income"].astype(float)

tract_income.loc[tract_income["Median Household Income"] < 0, "Median Household Income"] = 0

tracts = geopandas.read_file("https://www2.census.gov/geo/tiger/GENZ2019/shp/cb_2019_17_tract_500k.zip")

tracts = tracts.to_crs("ESRI:102009")

tract_income = tracts.merge(tract_income, left_on="AFFGEOID", right_on="GEO_ID")

axis = tract_income.plot("Median Household Income", scheme="naturalbreaks", 
	cmap="coolwarm_r", edgecolor="none", legend=True,
	legend_kwds={"bbox_to_anchor":(0.2, 0.4)})

axis.set_axis_off()

plt.show()
Figure
Tract-level choropleth of median household income from ACS data

Using USCB API Data in R

To create a data.frame of ACS data:

  1. The sf (simple features) library provides functions for working with vector geospatial data.
  2. The jsonlite library is a JSON parser.
  3. Load API data as a data.frame by passing the API URL to the jsonlite fromJSON() function.
  4. Remove the header row and the redundant state column.
  5. Use as.numeric() to convert the data column from text to numeric.
  6. Use names() to assign meaningful column names.
  7. Find the URL to the appropriate TIGER cartographic boundary zipped shapefile from the link on the page shown above.
  8. Use download.file() to save the archive to a temporary file with the .shz extension. This intermediate step is needed so that st_read() recognizes the file as a zipped shapefile.
  9. Load the shapefile polygons into a simple features data.frame using st_read().
  10. Reproject the polygons into a cartographically appropriate projection with st_transform(). For this data we use the North America Lambert Conformal Conic projection (ESRI 102009).
  11. Use merge() to join the polygons and the table, matching the AFFGEOID field in the polygons to the GEO_ID field in the table.
  12. To map only the 48 contiguous states, exclude Alaska, Hawaii, and Puerto Rico using their postal abbreviations in the STUSPS field.
  13. Create an appropriate diverging colorRampPalette.
  14. plot() a choropleth colored by the desired variable.
library(sf)

library(jsonlite)

state_data = fromJSON("https://api.census.gov/data/2019/acs/acs5/profile?get=GEO_ID,DP03_0062E&for=state:*")

state_data = as.data.frame(state_data[2:nrow(state_data),1:2])

state_data[,2] = as.numeric(state_data[,2])

names(state_data) = c("GEO_ID", "Median Household Income")

download.file("https://www2.census.gov/geo/tiger/GENZ2022/shp/cb_2022_us_state_5m.zip", "temp.shz")

states = st_read("temp.shz")

states = st_transform(states, "ESRI:102009")

state_income = merge(states, state_data, by.x="AFFGEOID", by.y="GEO_ID")

state_income = state_income[!(state_income$STUSPS %in% c('AK', 'HI', 'PR')),]

redblue = colorRampPalette(c("red", "lightgray", "navy"))

plot(state_income["Median Household Income"], pal=redblue, breaks="quantile")
Figure
An import of USCB ACS data using the API: median household income by state 2015-2019

County-Level Data

County-level data can be loaded by modifying the API parameters and joining with the TIGER counties file.

This example maps median household income (DP03_0062E) by Illinois county (state FIPS 17).

county_data = fromJSON("https://api.census.gov/data/2019/acs/acs5/profile?get=GEO_ID,DP03_0062E&for=county:*&in=state:17")

county_data = as.data.frame(county_data[2:nrow(county_data),1:2])

county_data[,2] = as.numeric(county_data[,2])

names(county_data) = c("GEO_ID", "Median Household Income")

download.file("https://www2.census.gov/geo/tiger/GENZ2019/shp/cb_2019_us_county_20m.zip", "temp.shz")

counties = st_read("temp.shz")

counties = st_transform(counties, "ESRI:102009")

county_income = merge(counties, county_data, by.x="AFFGEOID", by.y="GEO_ID")

redblue = colorRampPalette(c("red", "lightgray", "navy"))

plot(county_income["Median Household Income"], pal=redblue, breaks="quantile")
Figure
Median household income by county

Tract-Level Data

Census tracts are subdivisions of counties that are drawn based on clearly identifiable features to ideally contain around 4,000 residents, although in practice the range of population is usually between 1,200 and 8,000 (USCB 2019).

This example maps median household income (DP03_0062E) by census tract in Cook County, Illinois (state FIPS 17, county FIPS 031).

Note that tract-level estimates are often undisclosed or unavailable, and the API represents such values with the negative number -666666666. This code resets those values to zero to avoid distorting the legend.

tract_data = fromJSON("https://api.census.gov/data/2019/acs/acs5/profile?get=GEO_ID,DP03_0062E&for=tract:*&in=state:17&in=county:031")

tract_data = as.data.frame(tract_data[2:nrow(tract_data),1:2])

tract_data[,2] = as.numeric(tract_data[,2])

tract_data[which(tract_data[,2] < 0), 2] = 0

names(tract_data) = c("GEO_ID", "Median Household Income")

download.file("https://www2.census.gov/geo/tiger/GENZ2019/shp/cb_2019_us_tract_500k.zip", "temp.shz")

tracts = st_read("temp.shz")

tracts = st_transform(tracts, "ESRI:102009")

tract_income = merge(tracts, tract_data, by.x="AFFGEOID", by.y="GEO_ID")

redblue = colorRampPalette(c("red", "lightgray", "navy"))

plot(tract_income["Median Household Income"], pal=redblue, breaks="quantile", lwd=0.1)

Figure
Tract-level choropleth of median household income from ACS data