Geographic Information Systems Architecture

Geographic information systems consist of a variety of hardware, software, and human components that work together. The architecture of a computer system is "the manner in which the components of a computer or computer system are organized and integrated" (Merriam-Webster 2020). Different types of system architecture are used for different needs. These architectures can be simple, such as a stand-alone home desktop system, or they can be intricately complex, like cloud architectures that rely on layers of developers, engineers, and technicians to satisfy challenging business or governmental requirements.

Understanding the available software and hardware options helps GIS professionals make system design choices that meet the needs of the organization as efficiently as possible.

Types of Architecture

There are four fundamental components to a geographic information system:

Stand-alone Architecture

The simplest GIS architecture is a stand-alone desktop or laptop that houses all components. This architecture is appropriate for users who are working alone and do not need to regularly share data. Examples of this include ArcMap software or, in the non-spatial world, use of Microsoft desktop programs like Word, Excel, or PowerPoint on files stored on your personal hard drive.

Stand-alone architecture

Network Architecture

Since most GIS projects and training involve some measure of collaboration, most architectures separate components in the client-server model. Servers are computers on a network that are dedicated to managing network resources. Servers provide services to other computers on that network, called clients, that need those services (Techopedia 2020).

For an academic classroom lab or small business, the simplest form of client-server architecture is stand-alone computers connected to a central file server or database server through a local network. In such cases, the applications are still on individual desktop computers, but the data is kept on a server and managed by a database management system (DBMS). The users see the data as if it were on their own machines.

Simple network architecture

Enterprise Architecture

As the needs of an organization get larger and the types of clients become more diverse, an enterprise architecture is needed to separate the system components across multiple servers and networks in various configurations. In this usage, the term enterprise refers to large businesses and government organizations.

Enterprise network architecture

Cloud Architecture

As a network continues to grow, the expense and difficulty of maintaining increasing numbers of dedicated physical server computers becomes an issue. This challenge, along with the ubiquitous deployment of high-speed internet connections, led in the 2000s to the development of the cloud architecture.

In a cloud architecture, massive racks of servers run in large data centers, and customers contract with cloud providers for access to services provided across the internet by virtual servers running on that hardware. While this architecture may appear to users as if it were a traditional enterprise architecture, you do not know or care exactly what physical server is providing you with a service at any given time. Server administrators can increase or decrease server capacity as needs dictate, allowing for more flexibility and more economical use of resources.

Cloud architecture

The flexibility and low cost of this architecture for users and companies have made cloud architecture increasingly dominant for both simple private consumer needs and large enterprise needs. The Google apps (Gmail, Google Sheets, Google Docs) and Office 365 (from Microsoft) are examples of cloud-based consumer applications.

Companies needing to build enterprise networks commonly contract with cloud service providers who handle the construction and maintenance of the hardware and networks. Cloud services are available from a number of providers, but as of this writing three major companies dominate cloud computing: Amazon Web Services (AWS), Microsoft (Azure), and Google.

An AWS EC2 cloud services management console

Database Management Systems

Outside of single-user projects, data needs to be shared by groups of people. These groups of people can have a variety of different data needs, use different devices and applications to access the data, and need access from a variety of different geographic locations.

A database management system is "a specialized computer program for organizing and manipulating data" (Bolstad 2019, 331). Following the client-server model, the database management system for a project or organization is located on centralized server(s), and users access the data on the server through a computer network, which can include the internet.

The use of centralized database management systems provides numerous advantages over keeping data on individual machines:

Relational Databases

Most contemporary geospatial databases are extensions of relational databases, which are general-purpose databases first proposed by computer scientist E.F. Codd (1970).

Relational databases are composed of sets of tables. Tables are like spreadsheets in that they are arrays of data arranged in rows and columns.

With geospatial data, the rows are records that represent individual features and the columns represent attributes for each feature. The attribute columns are referred to as fields.

Unlike cells in a spreadsheet, each field must have a specific data type indicating what kind of values it can have (text, integers, real numbers, etc.).
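As a minimal sketch of typed fields, the following uses Python's built-in sqlite3 module to create a table and read back the declared type of each field. The table name Buildings and the Height column are assumptions made for illustration. Note that SQLite enforces declared types more loosely than server-based DBMS such as PostgreSQL, which reject values of the wrong type outright.

```python
import sqlite3

# In-memory database so the example leaves no files behind.
conn = sqlite3.connect(":memory:")

# Each field is declared with a data type (hypothetical schema).
conn.execute("""
    CREATE TABLE Buildings (
        BIN     INTEGER,
        Number  INTEGER,
        Street  TEXT,
        Height  REAL
    )
""")

# PRAGMA table_info returns one row per field: (cid, name, type, ...).
field_types = {row[1]: row[2] for row in conn.execute("PRAGMA table_info(Buildings)")}
print(field_types)
```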

Fields also sometimes have domains, which indicate the range of acceptable values. Setting domains on fields can be useful to prevent accidental insertion of invalid values.
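One way to approximate a domain in plain SQL is a CHECK constraint, which is a swapped-in technique here rather than the domain mechanism of any particular GIS product. The sketch below uses Python's sqlite3 module; the Floors column and its 1 to 200 range are assumptions chosen for illustration.

```python
import sqlite3

conn = sqlite3.connect(":memory:")

# The CHECK constraint plays the role of a domain on the Floors field.
conn.execute("""
    CREATE TABLE Buildings (
        BIN     INTEGER PRIMARY KEY,
        Street  TEXT,
        Floors  INTEGER CHECK (Floors BETWEEN 1 AND 200)
    )
""")

# A value inside the allowed range is accepted.
conn.execute("INSERT INTO Buildings VALUES (?, ?, ?)", (1001, "South Gregory", 4))

# A value outside the allowed range is rejected by the database.
try:
    conn.execute("INSERT INTO Buildings VALUES (?, ?, ?)", (1002, "South Gregory", -3))
    rejected = False
except sqlite3.IntegrityError:
    rejected = True

row_count = conn.execute("SELECT COUNT(*) FROM Buildings").fetchone()[0]
print(rejected, row_count)
```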

As an example, the following is a table that represents building locations.

Example database table

A database table has one or more columns referred to as keys. Primary key values in each row uniquely identify that particular row. In the building table example above, the BIN (building identification number) column is the primary key.

Primary keys, as well as columns of spatial data, can be connected to indexes, which are structures designed to dramatically speed up searches for specific values in the indexed field.
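The following sketch, using Python's sqlite3 module, shows both ideas under assumed names (the Buildings table and the idx_buildings_street index are hypothetical). A duplicate primary key value is rejected, and the query planner reports that it uses the index when searching the Street field. Spatial columns generally rely on specialized spatial indexes, which are not shown here.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE Buildings (BIN INTEGER PRIMARY KEY, Number INTEGER, Street TEXT)"
)
# An index on a non-key field that is searched often.
conn.execute("CREATE INDEX idx_buildings_street ON Buildings (Street)")
conn.execute("INSERT INTO Buildings VALUES (?, ?, ?)", (1001, 701, "South Gregory"))

# A second row with the same primary key value violates uniqueness.
try:
    conn.execute("INSERT INTO Buildings VALUES (?, ?, ?)", (1001, 112, "East John"))
    duplicate_rejected = False
except sqlite3.IntegrityError:
    duplicate_rejected = True

# Ask SQLite how it would run the search; the last column describes the plan.
plan_rows = conn.execute(
    "EXPLAIN QUERY PLAN SELECT * FROM Buildings WHERE Street = ?",
    ("South Gregory",),
).fetchall()
plan = " ".join(row[3] for row in plan_rows)
uses_index = "idx_buildings_street" in plan
print(duplicate_rejected, uses_index)
```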

The relational part of the name relational database means that the tables in the database are related to each other. Keys allow rows in one table to be associated with rows in another table.

For example, below is a table of restaurants where the BIN field is a foreign key that identifies the building for each restaurant.

Example database tables related by keys
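A relationship like this can be sketched with Python's sqlite3 module. The table layouts and the restaurant names below are made up for illustration; only the BIN key and the street values follow the examples in the text. The JOIN uses the foreign key to pull building attributes into each restaurant row.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("PRAGMA foreign_keys = ON")  # SQLite leaves enforcement off by default
conn.executescript("""
    CREATE TABLE Buildings (
        BIN     INTEGER PRIMARY KEY,
        Number  INTEGER,
        Street  TEXT
    );
    CREATE TABLE Restaurants (
        Name  TEXT,
        BIN   INTEGER REFERENCES Buildings(BIN)  -- foreign key
    );
    INSERT INTO Buildings VALUES (1001, 701, 'South Gregory');
    INSERT INTO Buildings VALUES (1002, 112, 'East John');
    INSERT INTO Restaurants VALUES ('Example Cafe', 1001);
    INSERT INTO Restaurants VALUES ('Sample Diner', 1002);
""")

# Match each restaurant to its building through the shared BIN value.
rows = conn.execute("""
    SELECT r.Name, b.Number, b.Street
    FROM Restaurants AS r
    JOIN Buildings AS b ON r.BIN = b.BIN
    ORDER BY r.Name
""").fetchall()
print(rows)
```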

Separating information into multiple tables connected by keys reduces duplicated information and wasted space. It also makes changes simpler, since information only needs to be changed in one record rather than across multiple records. The mathematically ideal structure for a database is called a normal form. The process of structuring a relational database in this way is called normalization.

Example normalized database tables
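The benefit of storing each fact once can be sketched as follows, again with Python's sqlite3 module and hypothetical table contents. Two restaurants share one building, so correcting the address in the single Buildings record is enough for both restaurants to reflect the change.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE Buildings (BIN INTEGER PRIMARY KEY, Address TEXT);
    CREATE TABLE Restaurants (Name TEXT, BIN INTEGER REFERENCES Buildings(BIN));
    INSERT INTO Buildings VALUES (1001, '701 South Gregory');
    INSERT INTO Restaurants VALUES ('Example Cafe', 1001);
    INSERT INTO Restaurants VALUES ('Sample Diner', 1001);
""")

# The address lives in exactly one record, so one UPDATE is sufficient.
conn.execute(
    "UPDATE Buildings SET Address = ? WHERE BIN = ?",
    ("701 S. Gregory St.", 1001),
)

addresses = [
    row[0]
    for row in conn.execute(
        "SELECT b.Address FROM Restaurants AS r JOIN Buildings AS b ON r.BIN = b.BIN"
    )
]
print(addresses)
```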

Queries

Interaction with a relational database is commonly performed with a language called structured query language (SQL). Even when geospatial data is being handled transparently by a consumer app, behind the scenes, the app may be using SQL to extract, add, or modify information in the database.

SQL is a complex and powerful language, and there are variations on that language used by different DBMS. However, giving examples of a few common commands will give you some sense of what can be done with SQL.

The primary command for extracting information from a table is SELECT. For example, to show all rows in the Segments table with a Street value of South Gregory:

SELECT * FROM Segments WHERE Street = 'South Gregory';

+------+--------+---------------+--------------------------+
| BIN  | Number | Street        | Geometry                 |
+------+--------+---------------+--------------------------+
| 1001 | 701    | South Gregory | POINT(40.1064, -88.2217) |
+------+--------+---------------+--------------------------+
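To give a sense of a few other common commands, the sketch below runs the SELECT shown above alongside INSERT, UPDATE, and DELETE using Python's sqlite3 module. The schema is an assumption: the geometry is stored as plain text because SQLite has no built-in geometry type, and the second row is invented for illustration.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE Segments (BIN INTEGER PRIMARY KEY, Number INTEGER, "
    "Street TEXT, Geometry TEXT)"
)

# INSERT adds rows (the second row is made-up sample data).
conn.execute(
    "INSERT INTO Segments VALUES (?, ?, ?, ?)",
    (1001, 701, "South Gregory", "POINT(40.1064, -88.2217)"),
)
conn.execute(
    "INSERT INTO Segments VALUES (?, ?, ?, ?)",
    (1002, 112, "East John", "POINT(40.1090, -88.2370)"),
)

# SELECT extracts the rows that match a condition.
selected = conn.execute(
    "SELECT * FROM Segments WHERE Street = ?", ("South Gregory",)
).fetchall()

# UPDATE changes values in existing rows.
conn.execute("UPDATE Segments SET Number = ? WHERE BIN = ?", (114, 1002))

# DELETE removes rows.
conn.execute("DELETE FROM Segments WHERE BIN = ?", (1001,))

remaining = conn.execute("SELECT BIN, Number FROM Segments").fetchall()
print(selected)
print(remaining)
```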

Software Business Models

Developing, operating, and maintaining software and services costs money. Accordingly, all types of geographic information systems have associated business models that define how income is generated to pay for those costs and, in the case of private companies, how those companies will make a profit from the software and services they provide. Different architectures are conducive to different business models.

There are four general business models that are common in GIS:

Software business models

Proprietary Software and Services

The term proprietary means "something that is used, produced, or marketed under exclusive legal right of the inventor or maker" (Merriam-Webster 2021).

Proprietary software is completely controlled by a single company, and the details of how that software is built (the source code) and how that software operates and shares data are often not shared with users outside the company.

With proprietary GIS software, this monopoly power is the basis of the company's business model that enables it to pay for the continued development of the software and services while making a profit for its shareholders.

ESRI

The dominant company in enterprise GIS is ESRI, which was founded in 1969 by Jack and Laura Dangermond as a land-use consulting firm.

Although the company came to prominence with stand-alone software running on minicomputers and desktops, over the past decade it has increasingly moved to cloud-based architectures. ESRI is probably best known to academics for desktop software like ArcMap and ArcGIS Pro, and for the ArcGIS Online cloud environment.

ArcGIS Pro desktop GIS software
ArcGIS Online Web App
ArcMap desktop GIS software (obsolete)

ESRI also offers server-based architectures built around their ArcGIS Enterprise software that can be configured in a wide variety of ways to suit business requirements.

Architecting the ArcGIS Platform (ESRI 2020)

Google

While Google is, perhaps, best known as a web search company, their geospatial apps, services, and APIs are integral to the geospatial web. The integration of technology from acquired companies into Google Maps in 2005 revolutionized web mapping.

Google Cloud (Google 2020)

Carto

CARTO is a cloud computing platform that provides GIS, web mapping, and spatial data science tools. The company markets itself as a "Location Intelligence" platform that is readily usable for data analysis and visualization without prior GIS experience.

While the web app software is open source, the online service and data sets are available as a subscription service.

Carto

AutoDesk

AutoCAD is popular desktop engineering design software that includes toolsets for integrating and visualizing geospatial data. Design data is commonly exchanged between GIS and CAD, although the process often requires some manual tweaking. Experience moving data between CAD and GIS is a useful job skill for work in government and with consulting firms.

AutoDesk mapping (AutoDesk 2020)

Bentley

Bentley MicroStation is another commonly used engineering CAD software package that incorporates mapping capability.

Bentley mapping (Bentley 2020)

MapInfo

MapInfo is a desktop mapping application first introduced in 1986 that still has a user base, and you may encounter MapInfo files when working with organizations that still use it.

MapInfo

Open Software

Open collaboration is "any system of innovation or production that relies on goal-oriented yet loosely coordinated participants who interact to create a product (or service) of economic value, which they make available to contributors and noncontributors alike" (Jemielniak and Przegalinska 2020).

The open model exists in contrast to the proprietary model, under the belief that a community is stronger when its members stand on each other's shoulders rather than on each other's toes. The open model emerged as an offshoot of the free software movement in the 1990s. Rather than one company having to bear the total burden of development costs, multiple individuals and organizations make smaller contributions that over time add up to robust software.

The open model is manifest in GIS in three ways: open-source software, open standards, and open data.

Open Source Software

Open source software is software where the programming source code can be accessed and modified, although development expertise is needed to actually make such modifications. More importantly to most users, open source software is also usually freely downloadable.

While open source GIS software projects are developed and supported by a variety of community groups, the Open Source Geospatial Foundation (OSGeo) is a not-for-profit organization whose mission is to foster global adoption of open geospatial technology by being an inclusive software foundation devoted to an open philosophy and participatory, community-driven development. While it has no direct control over open projects, OSGeo promotes selected projects and, in some cases, serves as a conduit for development funding.

Some notable open-source GIS projects:

For example, the diagram below compares simplified open and proprietary architectures for publishing web maps.

On the open left, QGIS is used to import and process data stored in a PostGIS database. MapServer is used to render the geospatial data into services that can then be accessed by web or mobile clients over the internet.

On the proprietary right, ESRI provides a fully integrated stack of software. ArcGIS Pro is used to import and process data stored in a SQL Server database. The ArcGIS Enterprise software is at the center of all operations, including rendering the geospatial data into services that can then be accessed by web or mobile clients over the internet.

Open vs. proprietary web map architecture

Open Standards

A major challenge with geospatial data is that it is stored and disseminated in a variety of different and, often, proprietary formats. This creates a situation where GIS professionals often have to spend significant amounts of time reformatting or recreating data so it can be used in their GIS.

An approach to mitigating this wasted effort is the promotion of open standards that "ensure interoperability, enhance collaboration, and create a diverse, interoperable, decentralized software and data ecosystem that benefits all participants." This makes it possible to create a "data ecosystem where diverse data sources and software can easily be combined in novel ways to create value and provide a platform for innovation" (Alameh 2020).

Even proprietary software companies like ESRI or Microsoft develop their software to utilize open standards, both to enable interoperability with software from other vendors (giving the proprietary software a wider potential audience) and to eliminate the burden of maintaining proprietary protocols and formats.

The Open Geospatial Consortium (OGC) is a professional community that creates free, publicly available geospatial standards. Open standards can be used by both open and proprietary systems.

Some notable standards include:

The relationship between clients/servers and OGC protocols (Shadura 2010 via Wikipedia)
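As one concrete illustration of how a client talks to a server through an OGC protocol, the OGC Web Map Service (WMS) standard defines a GetMap request that a client sends as an ordinary web URL and that the server answers with a rendered map image. The sketch below only builds such a URL with Python's standard library. The server address and layer name are hypothetical placeholders, and the parameter names follow WMS version 1.3.0.

```python
from urllib.parse import urlencode


def build_getmap_url(base_url, layer, bbox, width, height):
    """Build a WMS 1.3.0 GetMap URL for one layer.

    bbox is (min_lon, min_lat, max_lon, max_lat); CRS:84 uses
    longitude/latitude axis order.
    """
    params = {
        "SERVICE": "WMS",
        "VERSION": "1.3.0",
        "REQUEST": "GetMap",
        "LAYERS": layer,
        "STYLES": "",  # empty value requests the server's default style
        "CRS": "CRS:84",
        "BBOX": ",".join(str(value) for value in bbox),
        "WIDTH": width,
        "HEIGHT": height,
        "FORMAT": "image/png",
    }
    return base_url + "?" + urlencode(params)


# Hypothetical server and layer, for illustration only.
url = build_getmap_url(
    "https://example.com/wms", "buildings", (-88.25, 40.09, -88.20, 40.12), 800, 480
)
print(url)
```

Because the request is just a URL defined by an open standard, any client, open or proprietary, can request maps from any compliant server.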

Open Data

The open model also applies to data. Many governmental organizations around the world make their data freely available to the general public through open data portals, although this data is collected and maintained by government employees.

NYC Open Data

One effort that represents collaborative creation is OpenStreetMap (OSM), which is a collection of geospatial data built by a community of mappers that contribute and maintain data about roads, trails, cafés, railway stations, and much more, all over the world. While the site is often thought of as the OSM web map that is similar to Google Maps, the primary focus of the OSM project is the collecting and disseminating of the data itself.

The project was initially started by Steve Coast in 2004 following the lead of Wikipedia as an open-source encyclopedia. As of 21 December 2020, OSM had around seven million users, and the OSM database contained around 8.4 billion nodes (lat-long points) and around 726 million ways (line or polygon features).
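The node and way structure can be sketched with a pair of small Python classes. This is a simplified illustration rather than the actual OSM data format or API: the ID numbers and coordinates are made up, and only the general idea (nodes carry coordinates, ways are ordered lists of node IDs, both carry descriptive tags) follows the OSM data model.

```python
from dataclasses import dataclass, field


@dataclass
class Node:
    """A single point with a latitude and longitude."""
    id: int
    lat: float
    lon: float
    tags: dict = field(default_factory=dict)


@dataclass
class Way:
    """An ordered list of node IDs forming a line or polygon."""
    id: int
    node_ids: list
    tags: dict = field(default_factory=dict)

    def is_closed(self):
        # A way that ends on the node where it started forms a ring,
        # which is how polygon features are represented.
        return len(self.node_ids) > 2 and self.node_ids[0] == self.node_ids[-1]


# Made-up sample features.
nodes = [
    Node(1, 40.1064, -88.2217),
    Node(2, 40.1064, -88.2215),
    Node(3, 40.1066, -88.2215),
    Node(4, 40.1066, -88.2217),
]
building = Way(100, [1, 2, 3, 4, 1], {"building": "yes"})
path = Way(101, [1, 3], {"highway": "footway"})

print(building.is_closed(), path.is_closed())
```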

The community emphasizes:

OpenStreetMap

Considerations With the Open Model

There are a number of issues that should be considered when choosing what kind of software business model will be most appropriate for your situation.