Geospatial Data from the US Census Bureau

The US Census Bureau (USCB) is the US federal government agency responsible for collecting data about people and the economy in the United States. The Census Bureau has its roots in Article I, section 2 of the US Constitution, which mandates an enumeration of the entire US population every ten years (the decennial census) in order to set the number of members from each state in the House of Representatives and Electoral College (USCB 2017). The Census Act of 1840 established a central office for conducting the decennial census, and that office became the Census Bureau under the Department of Commerce and Labor in 1903 (USCB 2021).

This tutorial covers basic techniques for acquiring US Census Bureau data for use in ArcGIS Pro, the ArcGIS Online Map Viewer, Python, and R.

census.gov

The American Community Survey

Among the Census Bureau's many programs is the American Community Survey (ACS), an ongoing survey that provides information on an annual basis about people in the United States beyond the basic information collected in the decennial census. The ACS is commonly used by a wide variety of researchers when they need information about the general public.

Unlike the constitutionally-mandated decennial census which is only taken every ten years, the ACS continuously surveys people in America's communities so that the ACS data can be more detailed and current than the decennial census. However, because the ACS is a survey rather than a complete count like the decennial census, there is uncertainty about how accurately the sampling represents the facts on the ground, and that uncertainty is expressed in a statistical margin of error (MOE) on most ACS values (US Census Bureau 2018).
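As a rough illustration of how a margin of error can be put to work, the sketch below converts an MOE into a confidence interval, a standard error, and a coefficient of variation. It assumes the 90 percent confidence level (z = 1.645) at which ACS margins of error are published. The estimate and MOE values are hypothetical, and the function names are illustrative only.

```python
# ACS margins of error are published at the 90 percent confidence level,
# which corresponds to a z-score of 1.645 (an assumption of this sketch).
Z_90 = 1.645

def confidence_interval(estimate, moe):
    """Return the (low, high) bounds implied by the margin of error."""
    return (estimate - moe, estimate + moe)

def standard_error(moe):
    """Convert a 90 percent margin of error to a standard error."""
    return moe / Z_90

def coefficient_of_variation(estimate, moe):
    """Standard error as a percentage of the estimate (lower is more reliable)."""
    return 100 * standard_error(moe) / estimate

# Hypothetical median household income estimate and its margin of error
estimate, moe = 60000, 3290

print(confidence_interval(estimate, moe))
print(round(standard_error(moe), 1))
print(round(coefficient_of_variation(estimate, moe), 1))
```

A small coefficient of variation suggests the estimate is fairly reliable; large values are a signal to consider a longer aggregation period or a larger geography.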

Spatial Aggregation

In order to preserve the confidentiality of respondents (and the associated willingness of people to respond to highly-personal questions), the US Census Bureau generally only releases data that has been aggregated (combined) into areas at various geographic scales:

Types of aggregation areas used by the US Census Bureau

Temporal Aggregation

Although ACS data is captured through surveys that are administered on an ongoing basis, it is aggregated into time-periods to improve geographic coverage and reduce margin of error.

ACS data is released annually, aggregated over two different time periods.

One-Year Interval | Five-Year Interval
Useful when you need the most current data about a characteristic that changes frequently | Useful when you need the most accurate data about a characteristic that stays fairly stable over time
Useful for areas that are changing rapidly | Useful for areas that are well-established
Often has gaps in sparsely-populated rural areas | Data is more complete
Based on fewer surveys, so it has wider margins of error | Based on more surveys, so it has lower margins of error

FIPS Codes and GEO_IDs

Geographies in USCB data are uniquely identified with FIPS (Federal Information Processing Standards) codes. FIPS codes for specific geographies can be found with a web search.

FIPS codes build left to right from the more general to the more specific.

When FIPS codes are used in US census data, they are commonly included in GEO_ID values.

For example:
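A minimal sketch of how these identifiers nest, using Cook County, Illinois (state FIPS 17, county FIPS 031). The 0400000US and 0500000US prefixes for states and counties follow the same pattern as the 1400000US prefix used for census tracts. The six-digit tract code is hypothetical, and the helper name is illustrative only.

```python
# Summary-level prefixes that precede the FIPS codes in GEO_ID values
PREFIXES = {"state": "0400000US", "county": "0500000US", "tract": "1400000US"}

def build_geo_id(level, state, county="", tract=""):
    """Concatenate FIPS components from general (state) to specific (tract)."""
    return PREFIXES[level] + state + county + tract

state_id = build_geo_id("state", "17")
county_id = build_geo_id("county", "17", "031")
tract_id = build_geo_id("tract", "17", "031", "010100")  # hypothetical tract code

print(state_id)
print(county_id)
print(tract_id)

# Because the codes nest, the FIPS portion of a tract GEO_ID
# starts with the FIPS code of the county that contains it.
print(tract_id.split("US")[1].startswith(county_id.split("US")[1]))
```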

Community Profile Pages

If you are looking for quick information on a specific state, county, city or community, the USCB provides profile pages in data.census.gov that include basic demographic information about population, income, education, etc.

You can access a profile page by typing the name of the area of interest into the search bar and waiting for it to autocomplete. If there is a profile page, a link to that page will appear for you to select.

A Profile Page on data.census.gov

TIGER Shapefiles

The Topologically Integrated Geographic Encoding and Referencing (TIGER) database is a collection of geospatial data maintained by the US Census Bureau.

Shapefiles utilize a file format developed by ESRI in the 1990s that is actually a collection of files that each contain separate information, such as the coordinates, attributes, projection, and metadata.
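To make the "collection of files" idea concrete, the sketch below labels the component files of an unzipped shapefile by extension. The file listing is hypothetical (the base name is borrowed from the military installations download used later in this tutorial), and the helper name is illustrative only.

```python
import os

# Roles of the most common shapefile component files, keyed by extension
ROLES = {
    ".shp": "feature geometry (coordinates)",
    ".shx": "index into the geometry file",
    ".dbf": "attribute table",
    ".prj": "projection (coordinate reference system)",
    ".cpg": "character encoding of the attribute table",
    ".xml": "metadata",
}

def describe_members(names):
    """Map each file name to the role its extension suggests."""
    return {name: ROLES.get(os.path.splitext(name)[1].lower(), "other")
            for name in names}

# Hypothetical listing of an unzipped TIGER download
members = [
    "tl_2023_us_mil.shp",
    "tl_2023_us_mil.shx",
    "tl_2023_us_mil.dbf",
    "tl_2023_us_mil.prj",
    "tl_2023_us_mil.cpg",
    "tl_2023_us_mil.shp.xml",
]

for name, role in describe_members(members).items():
    print(f"{name}: {role}")
```

Because the pieces only work together, all of the files should be kept in the same directory (or left in the zip archive) when moving a shapefile.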

ArcGIS Pro

This video demonstrates how to download, unzip, and import a shapefile into an ArcGIS Pro project geodatabase. This example uses county polygon boundaries compatible with the ACS table data described below.

  1. Go to the USCB's TIGER Cartographic Boundary Files page and download the appropriate type of geography. The lowest or medium resolution files are fine unless you are doing high accuracy mapping.
  2. Using Windows Explorer, open the file, copy the contents, and paste them into the Downloads directory.
  3. Bring the data into the project geodatabase with the Export Features tool (County_Economics).
Importing a TIGER cartographic boundary file into an ArcGIS Pro project geodatabase

A geodatabase is a special type of database containing "a collection of geographic datasets of various types" (ESRI 2020).

A feature class is a geographic dataset within a geodatabase that contains features of the same geometric type (points, lines, polygons) and a common set of attributes (ESRI 2024).

Each individual geographic entity in a feature class is called a feature. For example, in a feature class for roads in a county, each road segment would be a feature (ESRI 2020).

A project geodatabase is the default geodatabase used for storing feature classes that are imported or created as part of an ArcGIS Pro project.

Viewing the contents of the project file geodatabase in the Catalog Pane

ArcGIS Online

Zipped shapefiles can be read directly into the ArcGIS Online Map Viewer to create new feature classes for mapping and analysis.

Importing a TIGER shapefile of military installations into a new feature service using the ArcGIS Online Map Viewer

Python

GeoPandas is a Python package for working with geospatial data.

Matplotlib is a Python package for plotting graphs.

  1. Find the URL to the appropriate TIGER cartographic boundary zipped shapefile from the link on the page shown above.
  2. Read the cartographic boundary file directly from the USCB website into a GeoDataFrame object using the read_file() function.
  3. The to_crs() method is used to reproject the data to the North America Lambert Conformal Conic projection suitable for North America (ESRI 102009).
  4. The GeoDataFrame plot() method plots the polygons.
  5. The Axes set_axis_off() method turns off the axis scale that is unnecessary with a projected map.
  6. The pyplot show() function displays the plotted map.
import geopandas

import matplotlib.pyplot as plt

military = geopandas.read_file("https://www2.census.gov/geo/tiger/TIGER2023/MIL/tl_2023_us_mil.zip")

states = geopandas.read_file("https://www2.census.gov/geo/tiger/GENZ2023/shp/cb_2023_us_state_5m.zip")

states = states.to_crs("ESRI:102009")

military = military.to_crs("ESRI:102009")

axis = states.plot(facecolor='none', edgecolor='gray')

military.plot(facecolor='red', ax=axis)

axis.set_axis_off()

plt.show()
Figure
Map of US military installations

R

Functions from the sf (simple features) library are used to work with vector geospatial data.

library(sf)

download.file('https://www2.census.gov/geo/tiger/TIGER2023/MIL/tl_2023_us_mil.zip', 'temp.zip')

military = st_read('/vsizip/temp.zip')

download.file('https://www2.census.gov/geo/tiger/GENZ2023/shp/cb_2023_us_state_5m.zip', 'temp.zip')

states = st_read('/vsizip/temp.zip')

military = st_transform(military, "ESRI:102009")

states = st_transform(states, "ESRI:102009")

states = states[!(states$STATEFP %in% c('02', '15', '60', '66','69', '72', '78')),]

plot(states$geometry, col=NA, border='gray')

plot(military$geometry, col='red', add=T)
Figure
Map of US military installations

Using Precompiled USCB Data

Although data.census.gov is the definitive source for US Census Bureau data, the amount of available data is vast, and that data is made available in formats that require additional processing to use in GIS. Accordingly, subsets of that data are sometimes made available within organizations in pre-processed forms to facilitate easier use.

The following precompiled layers are available on this website as GeoJSON and as feature services from the University of Illinois ArcGIS Online organization:

ArcGIS Pro

The video below demonstrates how to download data from the Minn 2015-2019 ACS Tracts feature service available from the University of Illinois ArcGIS Online organization.

To avoid the speed, reliability, and feature count limitations of a large feature service like this, it may be advisable to use the Export Features tool to copy the data from the feature service into the project geodatabase.

Creating a census tract map of median age in ArcGIS Pro using a precompiled layer

Filter by State

If you need a subset of the features, you can set a Filter in the Export Features tool.

This precompiled data has an ST field with the USPS state abbreviation that can be used to subset tracts in individual states.

Creating a census tract map of median age in Illinois

Definition Query

Alternatively, if you have already imported the full data set and need a subset of the features, you can use a definition query to limit display and analysis to a specific state.

Creating a census tract map of median age in Illinois

Filter Tracts by County GEO_ID

Filtering by county requires a definition query based on the GEO_ID, which is described above.

The GEO_ID for tracts begins with 1400000US, followed by the five-digit county FIPS code, followed by the tract ID.

For this example, the FIPS code for Cook County, IL is 17031, so tract GEO_IDs in Cook County begin with 1400000US17031.

Creating a census tract map of median age in Cook County
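The prefix logic described above can be sketched in a few lines. The GEO_ID values in the list are hypothetical, the helper names are illustrative, and the LIKE clause is written in generic SQL as an assumption about what the definition query expression would look like.

```python
def tract_prefix(state_fips, county_fips):
    """GEO_ID prefix shared by every census tract in a county."""
    return "1400000US" + state_fips + county_fips

def definition_query(prefix, field="GEO_ID"):
    """Build a SQL-style clause; % matches whatever tract ID follows the prefix."""
    return f"{field} LIKE '{prefix}%'"

# Hypothetical tract GEO_IDs from two different Illinois counties
geo_ids = [
    "1400000US17031010100",
    "1400000US17019000100",
    "1400000US17031839100",
]

prefix = tract_prefix("17", "031")  # Cook County, IL
cook_tracts = [geo_id for geo_id in geo_ids if geo_id.startswith(prefix)]

print(prefix)
print(cook_tracts)
print(definition_query(prefix))
```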

MMUSCB

Customizable GeoJSON files of ACS data at the county or state tract level can be downloaded using MMUSCB and imported using the JSON To Features tool.

Creating a census tract map in Illinois using MMUSCB

ArcGIS Online

The video below demonstrates how to add the Minn 2015-2019 ACS Counties feature service available from the University of Illinois ArcGIS Online organization.

  1. Search for the layer in ArcGIS Online.
  2. If needed, resymbolize the layer and select the variable you wish to display. For this example we use median monthly rent.
  3. If needed, change the color ramp and/or adjust the scale to accentuate the differences.
  4. Change the blending mode to Multiply so you can see the base map as geographic context for your data.
  5. Save the map under a meaningful name (Minn County Income).
  6. Share the map.
  7. Copy the URL to get a link.
Creating a county map using a precompiled layer

Python

To load precompiled ACS data into a GeoDataFrame in Python:

  1. GeoPandas is a Python package for working with geospatial data.
  2. Matplotlib is a Python package for plotting graphs.
  3. Read the geospatial data from the file into a GeoDataFrame object using the read_file() function.
  4. The to_crs() method is used to reproject the data to the North America Lambert Conformal Conic projection suitable for North America (ESRI 102009).
  5. The GeoDataFrame plot() method can be used with the name of an attribute to visualize that variable as a choropleth map.
  6. The Axes set_axis_off() method turns off the axis scale around the map, which is unnecessary with a projected map.
  7. The pyplot show() function displays the plotted map.

import geopandas

import matplotlib.pyplot as plt

counties = geopandas.read_file("https://michaelminn.net/tutorials/data/2015-2019-acs-counties.geojson")

counties = counties.to_crs("ESRI:102009")

axis = counties.plot("Median Household Income", cmap = "coolwarm", legend=True, scheme="quantiles")

axis.set_axis_off()

plt.show()
Figure
Choropleth of median household income by county from a precompiled GeoJSON file
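The scheme="quantiles" option used above assigns areas to classes that each hold roughly the same number of areas. The following is a minimal standard-library sketch of that idea rather than the routine GeoPandas actually calls: the income values are hypothetical, the function names are illustrative, and the classification library used for plotting may compute its cut points somewhat differently.

```python
import bisect
import statistics

def quantile_breaks(values, k=5):
    """Return the k - 1 cut points that split values into k equal-count classes."""
    return statistics.quantiles(values, n=k, method="inclusive")

def classify(value, breaks):
    """Return the zero-based class index for a value (values on a break go low)."""
    return bisect.bisect_left(breaks, value)

# Hypothetical median household incomes in thousands of dollars
incomes = [31, 35, 38, 42, 47, 51, 55, 62, 70, 95]

breaks = quantile_breaks(incomes)
classes = [classify(value, breaks) for value in incomes]

# Count how many areas fall into each of the five classes
counts = [classes.count(index) for index in range(5)]

print(breaks)
print(counts)
```

Quantile classes make full use of the color ramp even when the data is skewed, at the cost of hiding how far apart the values within a class may be.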

We can filter the counties to show only the contiguous US to more effectively use the mapped area. We can also overlay a map of state outlines over the counties for geographic context.

counties = counties[~counties["ST"].isin(['AK', 'HI', 'PR'])]

states = geopandas.read_file("https://michaelminn.net/tutorials/data/2015-2019-acs-states.geojson")

states = states[~states["ST"].isin(['AK', 'HI', 'PR'])]

states = states.to_crs(counties.crs)

axis = counties.plot("Median Household Income", scheme="naturalbreaks", 
	cmap="coolwarm_r", edgecolor="none", legend=True,
	legend_kwds={"bbox_to_anchor":(0.2, 0.4)})

states.plot(facecolor="none", edgecolor="#808080", ax=axis)

axis.set_axis_off()

plt.show()
Figure
Filtered county data with a state base map

R

Functions from the sf (simple features) library are used to work with vector geospatial data.

  1. Load the GeoJSON polygons into a simple features data.frame using st_read().
  2. Reproject the polygons into a cartographically appropriate projection with st_transform(). For this data we use the North America Lambert Conformal Conic projection suitable for North America (ESRI 102009).
  3. Create an appropriate diverging colorRampPalette.
  4. plot() a choropleth colored by the desired variable.
library(sf)

counties = st_read("https://michaelminn.net/tutorials/data/2015-2019-acs-counties.geojson")

counties = st_transform(counties, "ESRI:102009")

redblue = colorRampPalette(c("red", "lightgray", "navy"))

plot(counties["Median.Household.Income"], breaks="quantile", pal=redblue, lwd=0.1)
Figure
Choropleth of median household income by county from a precompiled GeoJSON file

We can filter the counties to show only the contiguous US to more effectively use the mapped area. We can also overlay a map of state outlines over the counties for geographic context.

states = st_read("https://michaelminn.net/tutorials/data/2015-2019-acs-states.geojson")

states = states[!(states$ST %in% c('AK', 'HI', 'PR')),]

states = st_transform(states, st_crs(counties))

counties = counties[!(counties$ST %in% c('AK', 'HI', 'PR')),]

plot(counties["Median.Household.Income"], breaks="quantile", pal=redblue, border=NA, reset=F)

plot(states$geometry, col=NA, border="#404040", add=T)
Figure
Filtered county data with a state base map

ESRI's Living Atlas

If you are using ArcGIS Pro or ArcGIS Online and you are not too particular about the symbology of your map, ESRI's Living Atlas of the World contains a variety of layers of demographic data. Some of this data is from the American Community Survey, although ESRI also makes data available that they collect from other sources.

ArcGIS Pro

The video below shows how to add a Living Atlas layer of median age to a map in ArcGIS Pro. Note that this layer is scale-dependent and changes the types of areas being displayed (states, counties, census tracts) depending on how closely you are zoomed in to the map.

Note that with aggregated income numbers, the median is often used instead of the mean (average) because income is usually not evenly distributed across a population, and a handful of wealthy people can distort averages so they are not representative of the typical economic well-being of people living in a particular area (Yates 2020).
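A small worked example of that distortion, using five hypothetical household incomes in which one household is far wealthier than the rest:

```python
import statistics

# Hypothetical household incomes for a small area; the last is an outlier
incomes = [30000, 40000, 50000, 60000, 1000000]

mean_income = statistics.mean(incomes)
median_income = statistics.median(incomes)

print(mean_income)
print(median_income)
```

The single high income pulls the mean well above what four of the five households earn, while the median stays at the middle household.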

Creating a map using a Living Atlas feature service

If you want to use the Living Atlas data for analysis, use the Export Features tool to copy the data from the Living Atlas into a new feature class in the project database.

Exporting a Living Atlas layer into the project database

ArcGIS Online

The video below shows how to add a Living Atlas layer of median household income to a map in ArcGIS Online. Note that this layer is scale-dependent and changes the types of areas being displayed (states, counties, census tracts) depending on how closely you are zoomed in to the map.

  1. Create a new map in ArcGIS Online.
  2. Select Add and Living Atlas Layers.
  3. Search for the data by name, in this case median household income.
  4. Zoom in on the area you want to display. This particular layer is a scale-dependent layer that changes the types of areas displayed depending on how closely you are zoomed in to an area.
  5. Change the Blending to Multiply so you can see the base map and labels as geographic context for your data.
  6. Save the map under a meaningful name, share it, and copy the URL for a link (Minn_2019_County_Economics).
Creating a Map With a Living Atlas Layer

You can get metadata on the source and year of the information in a layer by opening the Properties panel and clicking on the link below Information.

Getting Metadata For a Living Atlas Layer

Using USCB Table Data

Data from a variety of different programs is available on data.census.gov for download as table data.

These ACS demographic profile (DP) tables contain useful groups of data:

If you need variables that are unavailable from precompiled sources, you can download separate table and polygon data from the USCB and join the data together in ArcGIS Pro.

A join is a database operation where two tables are connected based on common key values. In GIS, an attribute join is used to connect data from external tables (such as in a CSV file) to geospatial locations defined in a feature class that comes from a shapefile or file geodatabase.

Attribute join illustration
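The idea of an attribute join can be sketched without any GIS software. In this minimal example the feature rows stand in for polygon attributes (geometry omitted), all values are hypothetical, and the function name is illustrative only. The key fields are named AFFGEOID and GEO_ID to mirror the Python and R examples later in this tutorial.

```python
# Hypothetical polygon attribute rows (geometry omitted)
features = [
    {"AFFGEOID": "0500000US17019", "NAME": "Champaign"},
    {"AFFGEOID": "0500000US17031", "NAME": "Cook"},
    {"AFFGEOID": "0500000US17113", "NAME": "McLean"},
]

# Hypothetical rows from an external table such as a CSV file
table = [
    {"GEO_ID": "0500000US17019", "Median Household Income": 50000},
    {"GEO_ID": "0500000US17031", "Median Household Income": 60000},
]

def attribute_join(features, table, feature_key, table_key):
    """Attach table columns to features with matching key values.

    Returns the joined rows plus the feature keys that had no match.
    """
    lookup = {row[table_key]: row for row in table}
    joined, unmatched = [], []
    for feature in features:
        match = lookup.get(feature[feature_key])
        if match is None:
            unmatched.append(feature[feature_key])
            continue
        extra = {k: v for k, v in match.items() if k != table_key}
        joined.append({**feature, **extra})
    return joined, unmatched

joined, unmatched = attribute_join(features, table, "AFFGEOID", "GEO_ID")

print(joined)
print(unmatched)
```

Checking the unmatched keys after a join is a quick way to catch identifier mismatches, such as key values stored as numbers in one table and text in the other.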

Downloading Table Data

The video below demonstrates downloading selected variables from the DP03 table with county-level data from data.census.gov.

  1. From the data.census.gov home page, search for the desired table (DP03). The default table shows values for the entire USA.
  2. Click Geos to select the type of geographic area. For this example, we will use County and All counties within the United States and Puerto Rico.
  3. Click Download Table.
  4. Select the appropriate Table Vintages. For this example, we use the 2019 five-year estimates for maximum accuracy in the pre-COVID world.
  5. Download the zipped CSV.
  6. In the Windows File Explorer, open the .zip archive, and open the file with a name containing the word "data" in it.
  7. Remove all unnecessary columns, and rename the columns to meaningful names. For this example, these are the variables we keep.
  8. Remove the 2nd row with the descriptive column information and leave just the top header row and data rows.
  9. Look through the rows and remove any rows with non-numeric data.
  10. Save the spreadsheet as a CSV file under a meaningful name (County_Economics.csv).
Downloading table data from data.census.gov
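The clean-up steps above can also be scripted. The sketch below is a minimal illustration under stated assumptions: the embedded text is a hypothetical, abbreviated stand-in for a downloaded "data" CSV (column codes in the first row, descriptive labels in the second), the values are made up, and the function names are illustrative only.

```python
import csv
import io

# Hypothetical, abbreviated stand-in for a downloaded "data" CSV file
RAW = """GEO_ID,NAME,DP03_0062E,DP03_0062M
Geography,Geographic Area Name,Estimate!!Median household income,Margin of Error!!Median household income
0500000US17019,"Champaign County, Illinois",50000,1500
0500000US17031,"Cook County, Illinois",60000,500
0500000US17113,"McLean County, Illinois",N,N
"""

# Column codes to keep, mapped to meaningful names
KEEP = {"GEO_ID": "GEO_ID", "NAME": "Name", "DP03_0062E": "Median Household Income"}

# Kept columns that are expected to hold text rather than numbers
TEXT_COLUMNS = {"GEO_ID", "NAME"}

def is_number(text):
    """Return True if the text can be read as a number."""
    try:
        float(text)
        return True
    except ValueError:
        return False

def clean_table(raw, keep):
    """Keep selected columns, rename them, and drop rows with non-numeric data."""
    rows = list(csv.reader(io.StringIO(raw)))
    header, data = rows[0], rows[2:]  # rows[1] is the descriptive label row
    positions = {code: header.index(code) for code in keep}
    cleaned = [list(keep.values())]
    for row in data:
        numeric_ok = all(is_number(row[index])
                         for code, index in positions.items()
                         if code not in TEXT_COLUMNS)
        if numeric_ok:
            cleaned.append([row[index] for index in positions.values()])
    return cleaned

cleaned = clean_table(RAW, KEEP)
for row in cleaned:
    print(row)
```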

Mapping Table Data in ArcGIS Pro

Although the tables downloaded from data.census.gov contain geographic area identifiers, they do not contain the polygon information needed to map that data as areas, so we need to join the table data to area polygons for mapping.

For the join key, we use the GEO_ID field in the table downloaded from data.census.gov, which corresponds to the AFFGEOID field in the TIGER cartographic boundary files used in the Python and R examples.

  1. Download table data from data.census.gov as demonstrated above.
  2. Import an area polygon shapefile as demonstrated above.
  3. Under Analysis, Tools, find the Join Fields tool to copy the data from the CSV table into the polygon feature class.
  4. Symbolize the updated layer to verify the new fields have been joined into the data.
Joining in ArcGIS Pro

If using data that covers a broad area, you may need to isolate particular subsets of the data.

For this example, we use a definition query to subset only counties in Illinois.

  1. TIGER files have different identification fields depending on the geography. With county-level data like this, the STATEFP field is a numeric FIPS code that indicates the state.
  2. Examine the fields in a location in the area you want to select to find the appropriate code. In this case, the STATEFP for Illinois is 17.
  3. Right click on the layer, select Properties and select Definition Query.
  4. Add a New Definition Query.
  5. Select the identification field (STATEFP), is equal to, and the identification code (17).
  6. Click OK and only the locations matching the criteria should be visible.
Isolating a subset of features

Mapping Table Data in ArcGIS Online

For the join key, we use the GEO_ID field in the table downloaded from data.census.gov, which corresponds to the AFFGEOID field in the TIGER cartographic boundary files used in the Python and R examples.

  1. Download table data from data.census.gov as demonstrated above.
  2. Download an area polygon shapefile as demonstrated above.
  3. On your ArcGIS Online Content page, click New Item and upload the zipped shapefile as a hosted feature layer.
  4. Click New Item and upload the CSV as a hosted feature table.
  5. After the layer completes publishing, Open in Map Viewer.
  6. Click the Analysis icon on the right side panel and select Tools, Summarize Data, Join Features.
  7. Update the polygon Symbology based on the newly joined data variable.
  8. Save the map under a meaningful title (Minn_2019_County_Economics).
  9. Back in your Content page, remove the original shapefile and table.
Creating a choropleth in ArcGIS Online

Mapping Table Data in Python

A conventional technique for acquiring USCB data is to download table data from data.census.gov and join it to TIGER polygons. Downloading is preferred over API access if you wish to preserve a snapshot of the data at a particular time, or you want to avoid the unreliability and download times associated with APIs.

  1. Download and clean up the table data as described above.
  2. Pandas is a Python package for working with tabular data.
  3. GeoPandas is a Python package for working with geospatial data.
  4. Matplotlib is a Python package for plotting graphs.
  5. CSV files can be read into Pandas DataFrames using the read_csv() function.
  6. Find the URL to the appropriate TIGER cartographic boundary zipped shapefile from the link on the page shown above.
  7. Read the cartographic boundary file directly from the USCB website into a GeoDataFrame object using the read_file() function.
  8. Use the to_crs() method to reproject the data to the North America Lambert Conformal Conic projection suitable for North America (ESRI 102009).
  9. Join the table DataFrame with the county polygon GeoDataFrame using the Pandas merge() method and the GEO_ID fields in the two objects.
  10. To map only the contiguous 48 states, we exclude Alaska, Hawaii, and Puerto Rico using their FIPS codes.
import pandas

import geopandas

import matplotlib.pyplot as plt

county_data = pandas.read_csv("https://michaelminn.net/tutorials/gis-census/2023_County_Economics.csv")

counties = geopandas.read_file("https://www2.census.gov/geo/tiger/GENZ2019/shp/cb_2019_us_county_5m.zip")

counties = counties.to_crs("ESRI:102009")

counties = counties.merge(county_data, left_on="AFFGEOID", right_on="GEO_ID")

counties = counties[~counties["STATEFP"].isin(['02', '15', '72'])]

axis = counties.plot("Percent Workforce Participation", cmap="coolwarm_r", legend=True, scheme="quantiles")

axis.set_axis_off()

plt.show()
Figure
Choropleth of percent workforce participation

Mapping Table Data in R

A conventional technique for acquiring USCB data is to download table data from data.census.gov and join it to TIGER polygons. Downloading is preferred over API access if you wish to preserve a snapshot of the data at a particular time, or you want to avoid the unreliability and download times associated with APIs.

  1. Download and clean up the table data as described above.
  2. Read the table data into an R data.frame using the read.csv() function.

  3. Find the URL to the appropriate TIGER cartographic boundary zipped shapefile from the link on the page shown above.
  4. download.file() to a temporary file with the .shz extension. You must go through this intermediate step so the file has the .shz extension so that st_read() knows this is a zipped shapefile.
  5. Load the shapefile polygons into a simple features data.frame using st_read().
  6. Reproject the polygons into a cartographically appropriate projection with st_transform(). For this data we use the North America Lambert Conformal Conic projection suitable for North America (ESRI 102009).
  7. merge() the polygons and the table on the GEO_ID fields.
  8. To map only the contiguous 48 states, we exclude Alaska, Hawaii, and Puerto Rico using their FIPS codes.
  9. Create an appropriate diverging colorRampPalette.
  10. plot() a choropleth colored by the desired variable.
library(sf)

county_data = read.csv("https://michaelminn.net/tutorials/gis-census/2023_County_Economics.csv")

download.file("https://www2.census.gov/geo/tiger/GENZ2019/shp/cb_2019_us_county_5m.zip", "temp.shz")

counties = st_read("temp.shz")

counties = st_transform(counties, "ESRI:102009")

counties = merge(counties, county_data, by.x="AFFGEOID", by.y="GEO_ID")

counties = counties[!(counties$STATEFP %in% c('02', '15', '72')),]

redblue = colorRampPalette(c("red", "lightgray", "navy"))

plot(counties["Percent.Workforce.Participation"], pal=redblue, breaks="quantile")
Figure
Choropleth of percent workforce participation

If you want to save a copy of your processed data for later use, you can use the st_write() function to create a variety of different types of geospatial data files.

st_write(counties, "2019_County_Economics.geojson")

The US Census Bureau API

The USCB makes much of its data available through application programming interfaces (APIs) that permit direct access to current versions of USCB data via web services.

Although APIs have a learning curve, if you are using USCB data in R or Python, access through an API can be a much more flexible way of accessing USCB data than manually downloading and cleaning table data.

ACS Variables

ACS variables are referenced by cryptic variable names that indicate the source table and the number of the variable in that table, along with letters that indicate whether the variable represents estimated values or margins of error. Adding to the complexity, variables representing different types of summarization are stored in different reference files.

For this example, we focus on data from the 2015-2019 ACS five-year estimates that represent the final data release reflecting the pre-COVID world.

Typical variables from the profile variable list (DP02 - DP05) include:

Typical variables from the subject variable list include:

API Calls

USCB API calls are URLs with path components and parameters that will return the requested data. This list of available APIs describes the different API options.

For this example, we will download state-level median household income (DP03_0062E) from the 2015-2019 ACS five-year estimates that represent the final data release reflecting the pre-COVID world.

For this API query URL:

https://api.census.gov/data/2019/acs/acs5/profile?get=GEO_ID,DP03_0062E&for=state:*
USCB API JSON viewed in a browser
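The parts of the query URL above can be assembled programmatically. This is a minimal sketch using only the Python standard library: the function name and its parameters are illustrative rather than part of any USCB library, and the path is fixed to the ACS five-year profile endpoint used in this tutorial.

```python
from urllib.parse import urlencode

BASE = "https://api.census.gov/data"

def acs_profile_url(year, variables, geography, within=()):
    """Assemble an ACS five-year profile query URL.

    variables: variable names for the get parameter
    geography: value of the for parameter, such as "state:*"
    within:    values for repeated in parameters, such as ["state:17"]
    """
    params = [("get", ",".join(variables)), ("for", geography)]
    params += [("in", area) for area in within]
    # Leave commas, colons, and asterisks unescaped so the result
    # matches the readable form of the URL shown above.
    query = urlencode(params, safe=",:*")
    return f"{BASE}/{year}/acs/acs5/profile?{query}"

# Median household income for all states
state_url = acs_profile_url(2019, ["GEO_ID", "DP03_0062E"], "state:*")

# The same variable for all census tracts in Cook County, Illinois
tract_url = acs_profile_url(2019, ["GEO_ID", "DP03_0062E"], "tract:*",
                            within=["state:17", "county:031"])

print(state_url)
print(tract_url)
```

Building the URL from parts makes it easier to switch variables or geographies without retyping the whole query string.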

Using USCB API Data in Python

To create a GeoDataFrame of ACS data:

  1. GeoPandas is a Python package for working with geospatial data.
  2. Matplotlib is a Python package for plotting graphs.
  3. Load the API data into a DataFrame by passing the API URL to the Pandas read_json() function.

  4. Use iloc to remove the header row and the redundant state column.
  5. rename() the columns with meaningful names.
  6. astype(float) to convert the data column from text to numeric.
  7. Find the URL to the appropriate TIGER cartographic boundary zipped shapefile from the link on the page shown above.
  8. Read the cartographic boundary file directly from the USCB website into a GeoDataFrame object using the read_file() function.
  9. Use the to_crs() method to reproject the data to the North America Lambert Conformal Conic projection suitable for North America (ESRI 102009).
  10. Join the table DataFrame with the state polygon GeoDataFrame using the Pandas merge() method and the GEO_ID fields in the two objects.
  11. To map only the contiguous 48 states, we exclude Alaska, Hawaii, and Puerto Rico.
import pandas

import geopandas

import matplotlib.pyplot as plt

state_income = pandas.read_json("https://api.census.gov/data/2019/acs/acs5/profile?get=GEO_ID,DP03_0062E&for=state:*")

state_income = state_income.iloc[1:, 0:2]

state_income = state_income.rename(columns={0:"GEO_ID", 1:"Median Household Income"})

state_income["Median Household Income"] = state_income["Median Household Income"].astype(float)

states = geopandas.read_file("https://www2.census.gov/geo/tiger/GENZ2022/shp/cb_2022_us_state_5m.zip")

states = states.to_crs("ESRI:102009")

state_income = states.merge(state_income, left_on="AFFGEOID", right_on="GEO_ID")

state_income = state_income[~state_income["STUSPS"].isin(['AK', 'HI', 'PR'])]

axis = state_income.plot("Median Household Income", scheme="naturalbreaks", 
	cmap="coolwarm_r", edgecolor="#808080", legend=True)

axis.set_axis_off()

plt.show()
Figure
An import of USCB ACS data using the API: median household income by state 2015-2019
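The header-row and type-conversion steps above can also be sketched without Pandas, which may help clarify the shape of the API response. The response text below is a hypothetical, abbreviated example with made-up values, and the function name is illustrative only.

```python
import json

# Hypothetical, abbreviated response: a JSON array of rows
# whose first row holds the column names and whose values are text.
RESPONSE = """[["GEO_ID", "DP03_0062E", "state"],
["0400000US17", "60000", "17"],
["0400000US18", "50000", "18"]]"""

def parse_response(text, variable, label):
    """Turn the array-of-rows response into records with numeric values."""
    rows = json.loads(text)
    header, records = rows[0], rows[1:]  # separate the header row from the data
    geo_index = header.index("GEO_ID")
    value_index = header.index(variable)
    return [{"GEO_ID": record[geo_index], label: float(record[value_index])}
            for record in records]

state_income = parse_response(RESPONSE, "DP03_0062E", "Median Household Income")

for record in state_income:
    print(record)
```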

County Level Data

County level data can be loaded by modifying the API parameters and joining with the TIGER counties file.

This example maps median household income (DP03_0062E) by Illinois county (state FIPS 17).

county_income = pandas.read_json("https://api.census.gov/data/2019/acs/acs5/profile?get=GEO_ID,DP03_0062E&for=county:*&in=state:17")

county_income = county_income.iloc[1:, 0:2]

county_income = county_income.rename(columns={0:"GEO_ID", 1:"Median Household Income"})

county_income["Median Household Income"] = county_income["Median Household Income"].astype(float)

counties = geopandas.read_file("https://www2.census.gov/geo/tiger/GENZ2019/shp/cb_2019_us_county_20m.zip")

counties = counties.to_crs("ESRI:102009")

county_income = counties.merge(county_income, left_on="AFFGEOID", right_on="GEO_ID")

axis = county_income.plot("Median Household Income", scheme="naturalbreaks", 
	cmap="coolwarm_r", edgecolor="#808080", legend=True,
	legend_kwds={"bbox_to_anchor":(0.2, 0.4)})

axis.set_axis_off()

plt.show()
Figure
Median household income by county

Tract-Level Data

Census tracts are organizational boundaries used for USCB data collection that are drawn to roughly align with neighborhood borders. Ideally, each tract contains 4,000 residents, although the number of residents can vary depending on area (USCB 2019).

This example maps median household income (DP03_0062E) by census tract in Cook County, Illinois (state FIPS 17, county FIPS 031).

Note that tract-level data is commonly undisclosed or unavailable and is represented with the negative number -666666666. This code resets those values to zero to avoid distorting the legend.

tract_income = pandas.read_json("https://api.census.gov/data/2019/acs/acs5/profile?get=GEO_ID,DP03_0062E&for=tract:*&in=state:17&in=county:031")

tract_income = tract_income.iloc[1:, 0:2]

tract_income = tract_income.rename(columns={0:"GEO_ID", 1:"Median Household Income"})

tract_income["Median Household Income"] = tract_income["Median Household Income"].astype(float)

tract_income.loc[tract_income["Median Household Income"] < 0, "Median Household Income"] = 0

tracts = geopandas.read_file("https://www2.census.gov/geo/tiger/GENZ2019/shp/cb_2019_17_tract_500k.zip")

tracts = tracts.to_crs("ESRI:102009")

tract_income = tracts.merge(tract_income, left_on="AFFGEOID", right_on="GEO_ID")

axis = tract_income.plot("Median Household Income", scheme="naturalbreaks", 
	cmap="coolwarm_r", edgecolor="none", legend=True,
	legend_kwds={"bbox_to_anchor":(0.2, 0.4)})

axis.set_axis_off()

plt.show()
Figure
Tract-level choropleth of median household income from ACS data
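The placeholder handling described above can be sketched in isolation. This minimal example uses hypothetical values and an illustrative function name. It resets negative placeholders to zero as in the example above, and also shows the alternative of marking them as missing with None so they cannot be mistaken for a true zero.

```python
# Placeholder the API returns for undisclosed or unavailable values
UNDISCLOSED = -666666666

def reset_undisclosed(values, replacement=0):
    """Replace negative placeholder values, leaving real estimates unchanged."""
    return [replacement if value < 0 else value for value in values]

# Hypothetical tract-level median household incomes
incomes = [52000.0, float(UNDISCLOSED), 75000.0]

as_zero = reset_undisclosed(incomes)
as_missing = reset_undisclosed(incomes, replacement=None)

print(as_zero)
print(as_missing)
```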

Using USCB API Data in R

To create a data.frame of ACS data:

  1. The sf (simple features) library provides functions for working with vector geospatial data.
  2. The jsonlite library is a JSON parser.
  3. Load API data as a data.frame by passing the API URL to the jsonlite fromJSON() function.
  4. Remove the header row and the redundant state column.
  5. Provide meaningful column names().
  6. Use as.numeric() to convert the data column from text to numeric.
  7. Find the URL to the appropriate TIGER cartographic boundary zipped shapefile from the link on the page shown above.
  8. download.file() to a temporary file with the .shz extension. You must go through this intermediate step so the file has the .shz extension so that st_read() knows this is a zipped shapefile.
  9. Load the shapefile polygons into a simple features data.frame using st_read().
  10. Reproject the polygons into a cartographically appropriate projection with st_transform(). For this data we use the North America Lambert Conformal Conic projection suitable for North America (ESRI 102009).
  11. merge() the polygons and the table on the GEO_ID fields.
  12. To map only the contiguous 48 states, we exclude Alaska, Hawaii, and Puerto Rico using their FIPS codes.
  13. Create an appropriate diverging colorRampPalette.
  14. plot() a choropleth colored by the desired variable.
library(sf)

library(jsonlite)

state_data = fromJSON("https://api.census.gov/data/2019/acs/acs5/profile?get=GEO_ID,DP03_0062E&for=state:*")

state_data = as.data.frame(state_data[2:nrow(state_data),1:2])

state_data[,2] = as.numeric(state_data[,2])

names(state_data) = c("GEO_ID", "Median Household Income")

download.file("https://www2.census.gov/geo/tiger/GENZ2022/shp/cb_2022_us_state_5m.zip", "temp.shz")

states = st_read("temp.shz")

states = st_transform(states, "ESRI:102009")

state_income = merge(states, state_data, by.x="AFFGEOID", by.y="GEO_ID")

state_income = state_income[!(state_income$STUSPS %in% c('AK', 'HI', 'PR')),]

redblue = colorRampPalette(c("red", "lightgray", "navy"))

plot(state_income["Median Household Income"], pal=redblue, breaks="quantile")
Figure
An import of USCB ACS data using the API: median household income by state 2015-2019

County Level Data

County level data can be loaded by modifying the API parameters and joining with the TIGER counties file.

This example maps median household income (DP03_0062E) by Illinois county (state FIPS 17).

county_data = fromJSON("https://api.census.gov/data/2019/acs/acs5/profile?get=GEO_ID,DP03_0062E&for=county:*&in=state:17")

county_data = as.data.frame(county_data[2:nrow(county_data),1:2])

county_data[,2] = as.numeric(county_data[,2])

names(county_data) = c("GEO_ID", "Median Household Income")

download.file("https://www2.census.gov/geo/tiger/GENZ2019/shp/cb_2019_us_county_20m.zip", "temp.shz")

counties = st_read("temp.shz")

counties = st_transform(counties, "ESRI:102009")

county_income = merge(counties, county_data, by.x="AFFGEOID", by.y="GEO_ID")

redblue = colorRampPalette(c("red", "lightgray", "navy"))

plot(county_income["Median Household Income"], pal=redblue, breaks="quantile")
Figure
Median household income by county

Tract-Level Data

Census tracts are subdivisions of counties that are drawn based on clearly identifiable features to ideally contain around 4,000 residents, although in practice the range of population is usually between 1,200 and 8,000 (USCB 2019).

This example maps median household income (DP03_0062E) by census tract in Cook County, Illinois (state FIPS 17, county FIPS 031).

Note that tract-level data is commonly undisclosed or unavailable and is represented with the negative number -666666666. This code resets those values to zero to avoid distorting the legend.

tract_data = fromJSON("https://api.census.gov/data/2019/acs/acs5/profile?get=GEO_ID,DP03_0062E&for=tract:*&in=state:17&in=county:031")

tract_data = as.data.frame(tract_data[2:nrow(tract_data),1:2])

tract_data[,2] = as.numeric(tract_data[,2])

tract_data[which(tract_data[,2] < 0), 2] = 0

names(tract_data) = c("GEO_ID", "Median Household Income")

download.file("https://www2.census.gov/geo/tiger/GENZ2019/shp/cb_2019_us_tract_500k.zip", "temp.shz")

tracts = st_read("temp.shz")

tracts = st_transform(tracts, "ESRI:102009")

tract_income = merge(tracts, tract_data, by.x="AFFGEOID", by.y="GEO_ID")

redblue = colorRampPalette(c("red", "lightgray", "navy"))

plot(tract_income["Median Household Income"], pal=redblue, breaks="quantile", lwd=0.1)

Figure
Tract-level choropleth of median household income from ACS data