Geospatial Data from OpenStreetMap
OpenStreetMap (OSM) is a collection of geospatial data built by a community of mappers who contribute and maintain data about roads, trails, cafés, railway stations, and much more, all over the world. While the site is often thought of as a web map similar to Google Maps, the primary focus of the OSM project is collecting and disseminating the data itself.
The project was started by Steve Coast in 2004, following the lead of Wikipedia as an open-source encyclopedia. As with open-source software, the open data model contrasts with the proprietary data model in the belief that a community is stronger when its members stand on each other's shoulders rather than on each other's toes.
As of 3 February 2024, OSM had around 10.6 million users, and the OSM database contained around 9 billion nodes (lat-long points) and around 1 billion ways (line or polygon features).
- Community Driven: OpenStreetMap's contributors include enthusiast mappers, GIS professionals, engineers running the OSM servers, humanitarians mapping disaster-affected areas, and others. The community also includes for-profit companies that both use and contribute to OSM, such as Craigslist, MapQuest, JMP (statistical software), Foursquare, Mapbox, and many more.
- Local Knowledge: Contributors use aerial imagery, GPS devices, and low-tech field maps to verify that OSM is accurate and up to date. Data is also initially sourced from government entities such as the US Census Bureau, and is often based on aerial imagery that for-profit companies like Yahoo and Microsoft (Bing) have permitted to be used for reference.
- Open Data: OpenStreetMap is open data that you are free to use for any purpose as long as you credit OpenStreetMap and its contributors. To protect both the data and the project, OpenStreetMap is licensed under the Open Data Commons Open Database License, and if you alter or build upon the data in certain ways, you may distribute the result only under the same licence.
- Foundation Governance: The OSM website and related services are formally operated by the OpenStreetMap Foundation (OSMF) on behalf of the community. Hosting is supported by the UCL VR Centre, Imperial College London, Bytemark Hosting, and other partners.
Data Organization
A challenge in using OpenStreetMap data is that it is organized hierarchically rather than in the comparatively simpler tabular structure common in traditional GISystems. The hierarchy is built from three element types:
- Nodes are points at specific latitudes and longitudes.
- Ways are ordered collections of nodes representing lines or areas, with attributes stored as key/value pairs (tags).
- Relations are collections of nodes, ways, and other relations reflecting logical groupings of features, such as the collection of centerline ways representing the lanes, exit ramps, and service roads of a highway. Relations can also have key/value attribute pairs.
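The hierarchy described above can be seen directly in OSM XML. The following is a minimal sketch that parses a small hand-written fragment with Python's standard library and resolves a relation down to its ways and then to node coordinates. The ids, coordinates, and tags in the fragment are invented for illustration and are not real OSM data.

```python
import xml.etree.ElementTree as ET

# Hand-written fragment in the shape of an OSM XML export.
# All ids, coordinates, and tags are made up for illustration.
SAMPLE_OSM = """<osm version="0.6">
  <node id="1" lat="39.980" lon="-88.260"/>
  <node id="2" lat="39.981" lon="-88.259"/>
  <node id="3" lat="39.982" lon="-88.258"/>
  <way id="10">
    <nd ref="1"/>
    <nd ref="2"/>
    <tag k="highway" v="residential"/>
  </way>
  <way id="11">
    <nd ref="2"/>
    <nd ref="3"/>
    <tag k="highway" v="residential"/>
  </way>
  <relation id="100">
    <member type="way" ref="10" role=""/>
    <member type="way" ref="11" role=""/>
    <tag k="type" v="route"/>
  </relation>
</osm>"""

root = ET.fromstring(SAMPLE_OSM)

# Nodes: id -> (lon, lat)
nodes = {n.get('id'): (float(n.get('lon')), float(n.get('lat')))
         for n in root.iter('node')}

# Ways: id -> ordered list of node ids, and id -> dict of tags
ways = {w.get('id'): [nd.get('ref') for nd in w.findall('nd')]
        for w in root.iter('way')}
way_tags = {w.get('id'): {t.get('k'): t.get('v') for t in w.findall('tag')}
            for w in root.iter('way')}

# Relations: id -> ids of member ways
relations = {r.get('id'): [m.get('ref') for m in r.findall('member')
                           if m.get('type') == 'way']
             for r in root.iter('relation')}

# Walk the hierarchy: relation -> ways -> node coordinates
for way_id in relations['100']:
    coords = [nodes[ref] for ref in ways[way_id]]
    print(way_id, way_tags[way_id], coords)
```

Note that ways store only references to nodes, so coordinates must always be looked up through the node table; this indirection is what makes OSM data harder to load than a flat table of geometries.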
A number of wrapper libraries exist to access OSM data through the OSM API, although you may find learning to use the wrapper as difficult as learning to use the API directly.
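For comparison, a direct call to the API can be quite short. The sketch below assumes the bounding-box `map` call of OSM API version 0.6, which, as far as I understand it, takes `left,bottom,right,top` in decimal degrees and returns the same XML as the Export button. The function names, the output file name, and the User-Agent string are hypothetical, and the download function is defined but not called here. The API is intended mainly for editing, so check the current usage policy before using it for anything beyond small areas.

```python
from urllib.request import Request, urlopen

# Assumed base URL for OSM API v0.6; verify against the current API documentation.
API_BASE = 'https://api.openstreetmap.org/api/0.6'

def map_url(left, bottom, right, top):
    """Build the URL for the bounding-box map call (min lon, min lat, max lon, max lat)."""
    return f'{API_BASE}/map?bbox={left},{bottom},{right},{top}'

def download_map(left, bottom, right, top, file_name):
    """Fetch the OSM XML for a small bounding box and save it to file_name."""
    # Hypothetical User-Agent value; replace it with something identifying your script.
    request = Request(map_url(left, bottom, right, top),
                      headers={'User-Agent': 'osm-tutorial-example'})
    with urlopen(request) as response, open(file_name, 'wb') as out:
        out.write(response.read())

# Bounding box taken from the plot limits used later in this tutorial.
url = map_url(-88.277, 39.979, -88.243, 39.997)
print(url)
```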
OSM Export
Data can be downloaded in XML format from the OpenStreetMap web app using the Export button. While this data cannot be directly imported using standard GeoPandas functions or ArcGIS Pro tools, it can be parsed with a short script.
Python
If you simply want the ways as a GeoPandas GeoDataFrame without any relations data, the XML exported from OSM can be read using this simple script. Lines and polygons are placed in separate GeoDataFrames.

    import pandas
    import geopandas
    import xml.etree.ElementTree as ET

    osm_file_name = '2024-tolono.osm'
    osm = ET.parse(osm_file_name).getroot()

    # Index the nodes by id once so each way can look up its coordinates quickly.
    nodes = { node.attrib['id']: node for node in osm.iter('node') }

    ways = []
    for way in osm.iter('way'):
        nodeids = [ nd.attrib['ref'] for nd in way.findall('nd') ]

        # WKT coordinate order is "longitude latitude".
        # Nodes missing from the export are skipped.
        nodewkt = [ nodes[ref].attrib['lon'] + ' ' + nodes[ref].attrib['lat']
            for ref in nodeids if ref in nodes ]

        # A valid line needs at least two points.
        if len(nodewkt) < 2:
            continue

        # A closed way (first node equals last node) with at least
        # four points is treated as a polygon; anything else is a line.
        if (len(nodewkt) >= 4) and (nodewkt[0] == nodewkt[-1]):
            wkt = 'POLYGON((' + ', '.join(nodewkt) + '))'
        else:
            wkt = 'LINESTRING(' + ', '.join(nodewkt) + ')'

        # Tags are flattened into a single "key:value; key:value" string.
        tags = [ tag.attrib['k'].replace(':', '_') + ':' + tag.attrib['v']
            for tag in way.findall('./tag') ]

        ways.append({'id':way.attrib['id'], 'relation':'', 'tags':'; '.join(tags), 'wkt':wkt})

    ways = pandas.DataFrame(ways)

    lines = ways[ways.wkt.str.contains('LINESTRING')]
    geometry = geopandas.GeoSeries.from_wkt(lines.wkt)
    lines = geopandas.GeoDataFrame(lines[['id', 'tags']], geometry = geometry, crs="EPSG:4326")

    polygons = ways[ways.wkt.str.contains('POLYGON')]
    geometry = geopandas.GeoSeries.from_wkt(polygons.wkt)
    polygons = geopandas.GeoDataFrame(polygons[['id', 'tags']], geometry = geometry, crs="EPSG:4326")
You can then plot() the ways.
    import matplotlib.pyplot as plt

    axis = polygons.plot(facecolor='none', edgecolor='#00000040')
    lines.plot(ax=axis)
    axis.set_xlim(-88.277, -88.243)
    axis.set_ylim(39.979, 39.997)
    plt.show()
The ways can be exported to GeoJSON files using to_file().
    line_file_name = '2024-tolono-lines.geojson'
    polygon_file_name = '2024-tolono-polygons.geojson'

    lines.to_file(line_file_name)
    polygons.to_file(polygon_file_name)
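The tags column produced by the import script stores each way's tags as a single "key:value; key:value" string, which is awkward to filter on. The following is a minimal sketch of a helper that splits such a string back into a dictionary. The function name `parse_tags` and the example string are hypothetical, and the approach assumes that no tag value contains the "; " separator, which will not hold for every OSM tag.

```python
def parse_tags(tag_string):
    """Split a 'key:value; key:value' string into a dict of tags."""
    tags = {}
    for item in tag_string.split('; '):
        if ':' in item:
            # Colons in keys were replaced with underscores by the import
            # script, so the first colon separates the key from the value.
            key, value = item.split(':', 1)
            tags[key] = value
    return tags

# Made-up example in the format written by the import script.
example = 'highway:residential; name:Main Street; website:https://example.com'
parsed = parse_tags(example)
print(parsed['highway'])
print(parsed['website'])

# With the GeoDataFrames from the import script, a filter might look like:
# roads = lines[lines.tags.apply(lambda t: 'highway' in parse_tags(t))]
```

Splitting on only the first colon keeps values such as URLs intact, at the cost of relying on the underscore substitution made when the tags were written.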