Geospatial Data from OpenStreetMap

OpenStreetMap (OSM) is a collection of geospatial data built by a community of mappers who contribute and maintain data about roads, trails, cafés, railway stations, and much more, all over the world. While the site is often thought of as a web map similar to Google Maps, the primary focus of the OSM project is collecting and disseminating the data itself.

The project was started by Steve Coast in 2004, following the lead of Wikipedia as an open-source encyclopedia. As with open-source software, the open data model contrasts with the proprietary data model in the belief that a community is stronger when its members stand on each other's shoulders rather than on each other's toes.

As of 3 February 2024, OSM had around 10.6 million users, and the OSM database contained around 9 billion nodes (lat-long points) and around 1 billion ways (line or polygon features).

The community emphasizes:

Data Organization

A challenge in using OpenStreetMap data is that the data is organized hierarchically rather than in the comparatively simpler tabular structure common in traditional GISystems.
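To make the hierarchy concrete, here is a minimal sketch using a tiny hand-written OSM XML fragment. The ids, coordinates, and tag values are invented for illustration. Nodes carry the coordinates, while a way carries only references to node ids plus its own tags, so rebuilding a way's geometry requires a lookup step that a flat table would not need. Relations, which group nodes and ways, add a further level that is not shown here.

```python
import xml.etree.ElementTree as ET

# Hypothetical OSM XML fragment: ids, coordinates, and tags are made up
SAMPLE_OSM = """
<osm version="0.6">
  <node id="1" lat="39.98" lon="-88.26"/>
  <node id="2" lat="39.99" lon="-88.26"/>
  <node id="3" lat="39.99" lon="-88.25"/>
  <way id="100">
    <nd ref="1"/>
    <nd ref="2"/>
    <nd ref="3"/>
    <tag k="highway" v="residential"/>
  </way>
</osm>
"""

root = ET.fromstring(SAMPLE_OSM.strip())

# Nodes hold the coordinates, keyed here by id for quick lookup
node_coords = {
    node.attrib['id']: (float(node.attrib['lon']), float(node.attrib['lat']))
    for node in root.iter('node')
}

# A way holds only node references and tags; resolve the references to coordinates
way = root.find('way')
way_coords = [node_coords[nd.attrib['ref']] for nd in way.findall('nd')]
way_tags = {tag.attrib['k']: tag.attrib['v'] for tag in way.findall('tag')}

print(way_coords)
print(way_tags)
```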

A number of wrapper libraries exist to access OSM data through the OSM API, although you may find learning to use the wrapper as difficult as learning to use the API directly.
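For comparison, a direct request can be quite short. The sketch below assumes the OSM API v0.6 "map" call, which returns the XML for a bounding box given as left, bottom, right, top in degrees. The helper names osm_map_url and download_osm are hypothetical, and the download function is defined but not called here because it needs network access. The public API is meant mainly for editing, so requests like this should be limited to small areas.

```python
from urllib.parse import urlencode
from urllib.request import urlopen

# Assumed endpoint: the OSM API v0.6 "map" call, which takes a bounding box
OSM_MAP_ENDPOINT = 'https://api.openstreetmap.org/api/0.6/map'

def osm_map_url(left, bottom, right, top):
    """Build the request URL for a bounding box in longitude/latitude degrees."""
    if left >= right or bottom >= top:
        raise ValueError('bounding box must satisfy left < right and bottom < top')
    bbox = ','.join(str(value) for value in (left, bottom, right, top))
    # Keep the commas unescaped so the URL stays readable
    return OSM_MAP_ENDPOINT + '?' + urlencode({'bbox': bbox}, safe=',')

def download_osm(left, bottom, right, top, file_name):
    """Save the XML returned for the bounding box (hypothetical helper, needs network access)."""
    with urlopen(osm_map_url(left, bottom, right, top)) as response:
        with open(file_name, 'wb') as osm_file:
            osm_file.write(response.read())

# Bounding box chosen for illustration
url = osm_map_url(-88.277, 39.979, -88.243, 39.997)
print(url)
```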

OSM Export

Data can be downloaded in XML format from the OpenStreetMap web app using the Export button. While this data cannot be directly imported using standard GeoPandas functions or ArcGIS Pro tools,

Exporting OSM data from the web app

Python

If you simply want the ways as GeoPandas GeoDataFrames without any relations data, the XML exported from OSM can be read using this simple script. Lines and polygons are placed in separate GeoDataFrames.

import pandas
import geopandas
import xml.etree.ElementTree as ET

osm_file_name = '2024-tolono.osm'

osm = ET.parse(osm_file_name).getroot()

# Index the node attributes by id once so each way does not have to search the whole tree
nodes = { node.attrib['id']: node.attrib for node in osm.iter('node') }

ways = []

for way in osm.iter('way'):
	# Look up each referenced node, skipping any that are missing from the export
	nodeids = [ nd.attrib['ref'] for nd in way.findall('nd') ]
	nodewkt = [ nodes[ref]['lon'] + ' ' + nodes[ref]['lat'] for ref in nodeids if ref in nodes ]
	# A way needs at least two points to form a line
	if len(nodewkt) < 2:
		continue
	# Treat a closed way (first point repeated as the last) with at least four points as a polygon
	if (len(nodewkt) >= 4) and (nodewkt[0] == nodewkt[-1]):
		wkt = 'POLYGON((' + ', '.join(nodewkt) + '))'
	else:
		wkt = 'LINESTRING(' + ', '.join(nodewkt) + ')'
	# Tags become 'key:value' strings, with colons in keys replaced by underscores
	tags = [ tag.attrib['k'].replace(':', '_') + ':' + tag.attrib['v'] for tag in way.findall('./tag') ]
	ways.append({'id':way.attrib['id'], 'relation':'', 'tags':'; '.join(tags), 'wkt':wkt})

ways = pandas.DataFrame(ways)

# Split the ways by geometry type and convert the WKT strings to geometries
lines = ways[ways.wkt.str.startswith('LINESTRING')]
geometry = geopandas.GeoSeries.from_wkt(lines.wkt)
lines = geopandas.GeoDataFrame(lines[['id', 'tags']], geometry = geometry, crs="EPSG:4326")

polygons = ways[ways.wkt.str.startswith('POLYGON')]
geometry = geopandas.GeoSeries.from_wkt(polygons.wkt)
polygons = geopandas.GeoDataFrame(polygons[['id', 'tags']], geometry = geometry, crs="EPSG:4326")
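
The script above stores each way's tags as a single string of key:value pairs joined by '; '. If you need individual tag values, that string can be split back into a dictionary. This is a minimal sketch: parse_tags is a hypothetical helper, the example string is invented, and a tag value that itself contains '; ' followed by a colon would be split incorrectly.

```python
def parse_tags(tag_string):
    """Turn 'key:value; key:value' into a dict, splitting each pair on its first colon."""
    tags = {}
    for pair in tag_string.split('; '):
        if ':' not in pair:
            continue  # empty string, or a fragment of a value that contained '; '
        key, value = pair.split(':', 1)
        tags[key] = value
    return tags

# Hypothetical tag string in the format produced by the script above
example = 'highway:residential; name:Main Street; tiger_cfcc:A41'
parsed = parse_tags(example)
print(parsed['highway'])
```

Applied to a whole column, for example with lines.tags.apply(parse_tags), this would give one dictionary per way.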

You can then plot() the ways.

import matplotlib.pyplot as plt

axis = polygons.plot(facecolor='none', edgecolor='#00000040')

lines.plot(ax=axis)

axis.set_xlim(-88.277, -88.243)

axis.set_ylim(39.979, 39.997)

plt.show()

Figure: Tolono, IL from OSM data

The ways can be exported to GeoJSON files using to_file().

line_file_name = '2024-tolono-lines.geojson'

polygon_file_name = '2024-tolono-polygons.geojson'

lines.to_file(line_file_name)

polygons.to_file(polygon_file_name)
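
Because GeoJSON is plain JSON, an exported file can be sanity-checked without any GIS library. The sketch below assumes the file holds a FeatureCollection, which is the usual layout for GeoJSON. The helper names and the miniature sample collection are hypothetical and stand in for a real export.

```python
import json
from collections import Counter

def load_geojson(file_name):
    """Read a GeoJSON file into a Python dictionary."""
    with open(file_name, encoding='utf-8') as geojson_file:
        return json.load(geojson_file)

def count_geometry_types(collection):
    """Count the features in a FeatureCollection by geometry type."""
    return Counter(
        feature['geometry']['type']
        for feature in collection.get('features', [])
        if feature.get('geometry')  # skip features with a null geometry
    )

# Hypothetical miniature FeatureCollection standing in for an exported file
sample = {
    'type': 'FeatureCollection',
    'features': [
        {'type': 'Feature', 'properties': {'id': '1'},
         'geometry': {'type': 'LineString', 'coordinates': [[-88.26, 39.98], [-88.26, 39.99]]}},
        {'type': 'Feature', 'properties': {'id': '2'},
         'geometry': {'type': 'LineString', 'coordinates': [[-88.25, 39.98], [-88.25, 39.99]]}},
        {'type': 'Feature', 'properties': {'id': '3'}, 'geometry': None},
    ],
}

counts = count_geometry_types(sample)
print(counts)
```

For a real export, count_geometry_types(load_geojson(line_file_name)) should report only LineString features, and the polygon file only Polygon features.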