Geographic Information Systems and Geographic Information Science
Within the domain of geography, the terms geographic information systems, geographic information science, and geospatial data science are used in a variety of often ambiguous and overlapping ways. This tutorial introduces some perspectives on those terms that can help clarify how geospatial technology can be used to answer different types of questions.
Geography
Geography is a term with a wide variety of meanings and there is no single definition that covers the variety of activities that are performed by people that call themselves geographers or are in some way associated with the discipline of geography. Indeed, there are probably as many definitions of geography as there are geographers and the question of what geography is has triggered many (sometimes vicious and arguably pointless) debates among geographers.
For the purposes of this tutorial, we will use the following three definitions of geography:
What is where, why is it there, and why do we care?
Human, physical, and GIS
Geography is what geographers do.
What is where, why is it there, and why do we care?
In 2002, geographer Charles Gritzner proposed this three part definition of the types of questions explored in geography:
- What is where: This is probably what most Americans think geography is, based on their experience in elementary school memorizing the names of state capitals. The map is an exemplar of this idea as a graphical representation of where things are on the surface of the earth. "What is where" questions are commonly answered with geospatial technologies
- Why is it there: This aspect of geography deals with the processes and causes that put things where they are. These questions can be more interesting than "what is where" questions in that they give an understanding of how the past led to the present, and allow us to build models to anticipate what might happen in the future
- Why do we care: This aspect of geography deals with values and ethics. Values questions are often embedded within and hidden under the representations used in geospatial technology
Human, physical, and GIS
While activities that can be considered geography are performed in a variety of different professions, many (if not most) people that call themselves geographers work in the academy (colleges and universities). Within university geography departments there are three broad domains of study:
- Human geography explores human phenomena and commonly overlaps with sociology, urban planning, anthropology, economics, and history.
- Physical geography is a natural science that commonly overlaps with geology, biology, and ecology.
- Geographic information systems (GIS) focuses on the development and use of geospatial technology, and overlaps with computer science, information technology, and data science.
Although academic geographers commonly focus primarily on only one of these fields, the boundaries are not rigid. Many human and physical geographers use geographic information systems in their research. Physical geography often studies environmental phenomena that are caused or affected by humans, and human geographers often study the interaction between people and their environments. College instructors will commonly teach courses in all three areas.
Geography is what geographers do.
Everything that happens, happens in space and time. Therefore, everything has a history and everything has a geography. There is no Pope in geography that says what can and cannot be studied in geography, and since few people know (or care) what contemporary academic geography is, academic geographers are often free to cross disciplinary boundaries and study what they want. Therefore, almost anything can be and is studied within geography. If you visit the annual meeting of the American Association of Geographers (formerly the Association of American Geographers), you can see presentations on a very wide variety of topics by a very heterogeneous collection of researchers.
Geographic Information Systems
Geographic information systems (GIS or GI Systems) are computer systems used for capturing, storing, processing, analyzing, and communicating geospatial data.
GIS is the use of information technology to store information about what is where. From this perspective, GIS is useful as a tool in a variety of applications:
- Consumer and Business Applications
- Government and Military Applications
- Research Applications
Consumer and Business Applications
GIS is the technology behind a wide variety of consumer applications like Google Maps.
GIS is also used by businesses in ways that customers do not see.
Location is a fundamental aspect of doing business in terms of locating retail outlets, optimizing supply chains, or creating effective marketing campaigns. Businesses use GIS to collect and analyze a wide variety of information on suppliers, customers and competitors that can inform crucial business decisions.
Extractive industries like oil/gas, mining and forestry use GIS to maintain inventories of assets and perform analysis needed to support successful exploration and exploitation of resources.
Precision agriculture uses geospatial data to assess field variability, ensure optimal use of inputs, and maximize the output from a farm. Remotely sensed data, along with GPS-enabled farm equipment and GIS analysis allows farmers to deliver precise amounts of fertilizers, pesticides, herbicides and irrigation needed in different parts of their fields to maximize harvests and minimize costs.
Government and Military Applications
GIS is used extensively in the public sector.
Almost all city governments of any size maintain information on municipal works (water, sewer, gas, etc.) and property ownership in GIS databases. Accordingly, governments frequently employ GIS technicians, analysts, and programmers to maintain these data and systems.
The use of crime mapping as a tool for policing dates back at least to the 19th century (Dent 2000). Contemporary geospatial technology allows the use of increasingly large amounts of location-based data to create actionable intelligence that law enforcement agencies can use to respond to and anticipate criminal activity (ESRI 2023).
Geography in general and geospatial technology in particular have a long historical relationship with military activity. Geospatial intelligence is an indispensable part of modern warfare.
Science
In order to address the terms geographic information science and geospatial data science, we need to address the concept of science.
As with the word geography, the term science is contested and subject to a variety of definitions.
For our purposes we use the dictionary definition of science as "knowledge or a system of knowledge covering general truths or the operation of general laws especially as obtained and tested through scientific method" (Merriam-Webster 2021). This breaks down into three components:
- Science is a systematic process for the production of knowledge, often through some variant of the scientific method.
- Science is the body of knowledge resulting from a systematic process for production of knowledge.
- Science is a political tool in the process of acquiring and maintaining power.
Science as a Process
Science is "knowledge or a system of knowledge covering general truths or the operation of general laws especially as obtained and tested through scientific method." Science is a process for producing new knowledge that both draws on the existing body of knowledge and then adds to or updates that body of knowledge.
A fundamental concept in these definitions is general truths, which are applicable in multiple times and places. This is in contrast to specific truths, which are only applicable to one particular time and place. An example of a specific truth would be the number of influenza deaths so far in this year's flu season, while a general truth would be that influenza is caused by infection by specific types of viruses.
Scientific research considers the relationship between both general and specific knowledge.
- Deductive research seeks to find whether general theories explain what is happening in specific situations.
- Inductive research seeks to find whether knowledge about what is happening in specific situations can be generalized into a general theory.
Science as a Body of Knowledge
Science is "a department of systematized knowledge as an object of study." Science represents a body of knowledge that has been organized and structured in a useful way.
As with other sciences, GIScience embraces:
- Empiricism
- Rationalism
- Skepticism
Science as Politics
Science is a political term (Douthat 2020) that has a wide variety of meanings in common usage. Indeed, science is often elevated in common discourse to the status of a secular religion (Farias et al. 2013).
Geographic Information Science
As with the term geography, the term geographic information science (GIScience or GI Science) has a variety of different uses. Based on the three-fold conception of science given above, we have a taxonomy for the various definitions of GIScience.
- GIScience is the production of knowledge about geographic information systems or the production of knowledge in fields where GIS can be used in scientific inquiry.
- GIScience is the body of knowledge underpinning GIS.
- GIScience is a political term.
GIScience as the Production of General Knowledge About GIS
From that conception of science, one of the two definitions of GIScience proposed by Goodchild (1992) is:
The exploration of scientific questions about GIS that are "generic, rather than specific to particular fields of application and particular contexts"
Activities in this domain include:
- The development of new algorithms for analysis of geospatial data
- The analysis and comparison of algorithms for analysis of geospatial data
- The development of open standards for the exchange and dissemination of geospatial data
- Analysis of the ways users interact with GIS software in order to improve efficiency and productivity
- The analysis of the effects of geospatial technology on society
GIScience as the Application of GIS in Scientific Inquiry
Goodchild (1992) also proposed another definition of GIScience:
The use of geographic information systems (GIS) to provide "insight, explanation, and understanding" in "supporting those sciences for which geography is a significant key"
This definition encompasses GIScience as the application of GIS in scientific research, and in the development of GIS technology that can be used in scientific research. Examples of this type of GIScience include:
- Archaeology: Remote sensing technologies like Lidar have been helpful in discovering and exploring sites like the lost city of La Ciudad Blanca in Honduras that were obscured by vegetation or other landscape changes over time (Daukantas 2014)
- Epidemiology: GIS has been used to detect specific areas on Long Island, NY where there is an above-average incidence of breast cancer, in order to try to determine whether environmental pollution is responsible for those elevated rates (National Cancer Institute 2014)
- Crop Science: GIS has been used to model climate change-induced flood risk in Mediterranean tree crop areas to inform decision-making about the use of flood-tolerant fruit tree species (Kourgialas and Karatzas 2016).
- Geology: GIS has been used to model earthquake risk associated with hydraulic fracturing wells (fracking) (Meng 2015).
- Urban Studies: GIS has been used to analyze the relationship between urban geometry and the urban heat island effect (Nakata-Osaki, Souza, and Rodrigues 2018)
GIScience as the Body of General Knowledge About GIS
Longley et al. (2009, 2) define geographic information science as:
"The general knowledge and important discoveries that have made geographic information systems possible"
Geographic information systems as a technology focuses on the application of knowledge to answer questions about specific situations and locations (Merriam-Webster 2020).
In contrast, geographic information science represents the body of knowledge about GIS that can be applied in a variety of different specific situations.
Specific (GIS) | General (GIScience) |
---|---|
Where is the nearest sushi restaurant? | What data models can be used to represent businesses in GIS? |
What is the median assessed value of homes in my neighborhood? | What are the best system architectures to use for delivering web maps (like property maps) to large public audiences? |
Why are breast cancer rates so high on Long Island? | How can we synthesize disease incidence location data and environmental data to assess potential causes of disease? |
Where are the areas with the highest rates of assault in Chicago? | What algorithms are best suited to analysis of different types of crime data? |
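The first row of the table can be made concrete with a small sketch. The example below is only an illustration under stated assumptions: the restaurant names and coordinates are hypothetical, each business is modeled as a simple point feature (one of several possible data models), and distance is computed with the haversine formula on a spherical earth rather than with the projected or network distances a production GIS might use.

```python
import math

def haversine_km(lat1, lon1, lat2, lon2):
    """Great-circle distance in kilometers between two latitude/longitude points."""
    r = 6371.0  # mean earth radius in km (spherical approximation)
    phi1, phi2 = math.radians(lat1), math.radians(lat2)
    dphi = phi2 - phi1
    dlam = math.radians(lon2 - lon1)
    a = math.sin(dphi / 2) ** 2 + math.cos(phi1) * math.cos(phi2) * math.sin(dlam / 2) ** 2
    return 2 * r * math.asin(math.sqrt(a))

# Hypothetical data: each business is a point feature (location plus attributes).
restaurants = [
    {"name": "Sushi A", "cuisine": "sushi", "lat": 40.7530, "lon": -73.9800},
    {"name": "Pizza B", "cuisine": "pizza", "lat": 40.7485, "lon": -73.9860},
    {"name": "Sushi C", "cuisine": "sushi", "lat": 40.7300, "lon": -73.9950},
]

def nearest(features, lat, lon, cuisine):
    """Return the feature of the given cuisine closest to (lat, lon)."""
    candidates = [f for f in features if f["cuisine"] == cuisine]
    return min(candidates, key=lambda f: haversine_km(lat, lon, f["lat"], f["lon"]))

here = (40.7484, -73.9857)  # hypothetical user location
result = nearest(restaurants, *here, "sushi")
print(result["name"])
```

Answering the question for one user at one location is the specific (GIS) task. Deciding whether a point, a building footprint, or a network address is the right way to represent a business is the general (GIScience) question.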
GIScience as a Political Term
GIScience is often simply a framing of GIS training in a way that makes it seem more sophisticated than simply learning which buttons to push in ArcGIS Pro. In order to fit the academic expectation of research universities as producers of scientific knowledge, departments brand themselves as geographic information science, even if their primary activity is training students and researchers to use the technology of geographic information systems in pursuit of separate aims.
The debate over whether much of what falls under the term GIScience should be grouped with other sciences like chemistry and physics echoes a similar debate in the sister field of computer science (Cerf 2012; Abrahams 2013).
This is also related to a broader issue in the social sciences of physics envy, where social scientists feel the need to emulate the hypothetico-deductive techniques (scientific method) of natural scientists (like physicists), even though that model may not be appropriate for many types of subjects and knowledge dealt with by the social sciences (Clarke and Primo 2012).
Data Science
Data science involves extracting insights from complex datasets and presenting those insights to non-technical audiences in ways that can support decision-making (Singh 2019).
Data science draws on three areas of expertise: statistics, computer science, and knowledge of the domain where it is being applied.
Data science as a distinct paradigm and profession emerged in the early 2000s with the advent of Web 2.0, the subsequent deluge of big data, and the emergence of new techniques for extracting useful information from those massive data sets in ways that could enable data-driven decision making (Anselin 2020).
While the line between big and non-big data is fuzzy and ever-changing, the distinguishing characteristics are velocity, volume, variety, and veracity. Big data comes in quickly, there is a lot of it, it comes in a lot of different forms, and it isn't always rigorously trustworthy.
For example:
- Velocity: Globally around 6,000 tweets are sent every second.
- Volume: That results in a daily total of around 500 million tweets.
- Variety: The text in the tweets contains a lot of different types of information (topics, facts, attitudes, beliefs, etc.)
- Veracity: However, the meanings are often ambiguous and sometimes represent misperceptions or deliberate attempts to deceive.
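The velocity and volume figures above are two views of the same rate, which a quick calculation confirms. This is only a back-of-the-envelope check that assumes the quoted per-second rate holds steady across a full day.

```python
SECONDS_PER_DAY = 24 * 60 * 60  # 86,400 seconds

tweets_per_second = 6_000  # velocity figure quoted in the text
tweets_per_day = tweets_per_second * SECONDS_PER_DAY

print(f"{tweets_per_day:,} tweets per day")  # 518,400,000
```

The result of roughly 518 million is consistent with the figure of around 500 million tweets per day.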
Geospatial Data Science
Geospatial data science is "a subset of data science that takes into account the special characteristics of spatial data in analytical methods and software tools" (Anselin 2020).
Geospatial data science is also referred to as spatial data science or geographic data science. The boundary between geographic information science and geospatial data science is not clear or universally defined, although there are two fuzzy delineators:
- Geospatial data science routinely uses complex statistical and computational techniques that have not been commonly used or needed in traditional technological applications of GIS.
- Geospatial data science often incorporates the analysis of big data versus the cleanly structured data sets of traditional GIS.
Regardless of the nomenclature, the focus in data science is on the data, with geospatial data science adding an additional focus on the importance of the where dimension of geospatial data that has traditionally been overlooked in data science.
Some examples of geospatial data science provided by Anselin (2020):
- Using machine learning with remotely-sensed satellite images to classify areas by types of land use (agriculture, industry, residential, etc.)
- Aggregation of open government data on crime, health, complaints, etc. to identify quality of life issues and inform possible proposals for ways to remediate those issues
- Synthesis of data from smart city sensors (air quality, temperature, traffic, energy use, water use, etc.) to provide a dynamic space-time picture of the health and vitality of urban areas
- Analysis of volunteered geographic information (such as Google searches) to analyze geographic trends (such as Google Flu Trends)
- Data mining social-spatial networks from social media postings to better understand social and political trends
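As a rough illustration of the aggregation example in the list above, the sketch below bins point incidents into a regular latitude/longitude grid and reports the busiest cell. The incident coordinates, the `grid_cell` and `count_by_cell` helpers, and the 0.01 degree cell size are all hypothetical choices made for illustration. Real analyses would more likely aggregate to meaningful units such as neighborhoods or census tracts.

```python
from collections import Counter
import math

def grid_cell(lat, lon, cell_size_deg):
    """Return the (row, col) index of the grid cell containing a point."""
    return (math.floor(lat / cell_size_deg), math.floor(lon / cell_size_deg))

def count_by_cell(points, cell_size_deg=0.01):
    """Count how many (lat, lon) points fall in each grid cell."""
    return Counter(grid_cell(lat, lon, cell_size_deg) for lat, lon in points)

# Hypothetical incident locations as (latitude, longitude) pairs.
incidents = [
    (41.8781, -87.6298),
    (41.8789, -87.6291),
    (41.8802, -87.6305),
    (41.9005, -87.6505),
]

counts = count_by_cell(incidents)
hotspot, n = counts.most_common(1)[0]  # cell with the most incidents
print(hotspot, n)
```

The choice of cell size changes which cells look like hotspots, which is one of the special characteristics of spatial data that analysts need to keep in mind.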
Environmental Data Science
Environmental data science is "the use of data science techniques, methodologies, and techniques to address...problems that are related to the environment" (Pierson 2017, 287).
While data science is generally associated with the volume and velocity aspect of big data, and large data sets are sometimes used in environmental data science, Blair et al. (2019) note that in environmental data science, the primary challenges are variety and veracity (accuracy). Typical data sources for environmental data science include:
- Remotely-sensed data from satellites or unmanned aerial vehicles
- Ground-based monitoring systems like weather or stream-flow monitoring stations
- Field data captured by manual observation
- Historical data
- Modeled data
- Data mined from text and images on the web and social media
Because environmental phenomena are inherently spatial in their distribution across the surface of the earth, geospatial data science techniques are commonly used in environmental data science.
Pierson (2017, 287) defines three categories of environmental data science:
- Environmental intelligence: Providing the most current and accurate information possible to decision-makers in a comprehensible format
- Environmental modeling: Creation of models to help better understand and predict environmental change over time
- Spatial statistics: Creation of models to help better understand environmental differences between different locations
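To give a flavor of the spatial statistics category, the sketch below computes global Moran's I, one widely used measure of whether similar values cluster together in space. It is not drawn from Pierson and is only a minimal illustration: the four locations along a line, their values, and the binary neighbor weights are all hypothetical.

```python
def morans_i(values, neighbors):
    """Global Moran's I with binary (0/1) neighbor weights.

    values: list of observations, one per location.
    neighbors: dict mapping each location index to the indices of its neighbors.
    """
    n = len(values)
    mean = sum(values) / n
    dev = [v - mean for v in values]
    # Sum of all weights: one unit for each (location, neighbor) pair.
    total_weight = sum(len(js) for js in neighbors.values())
    # Cross-products of deviations for every neighboring pair.
    cross = sum(dev[i] * dev[j] for i, js in neighbors.items() for j in js)
    variance_sum = sum(d * d for d in dev)
    return (n / total_weight) * (cross / variance_sum)

# Hypothetical layout: four monitoring sites in a row, each adjacent to the next.
chain = {0: [1], 1: [0, 2], 2: [1, 3], 3: [2]}

clustered = morans_i([1, 2, 3, 4], chain)    # similar values sit next to each other
alternating = morans_i([1, 0, 1, 0], chain)  # neighbors are always dissimilar
print(round(clustered, 3), round(alternating, 3))
```

Positive values indicate that nearby locations tend to have similar values, while negative values indicate that neighbors tend to differ.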
Jobs in GIS, GIScience, and Geospatial Data Science
This hierarchy of GIS, GIScience, and geospatial data science is also reflected in the types of jobs that geographic information professionals can have during their careers.
GIS Technician / Field Technician: Technicians capture geospatial data in the field and/or perform data entry and cleaning. A bachelor's degree is usually required. The job often involves outdoor work with lots of travel and modest pay.
GIS Analyst / Specialist: This is the most common entry-level and early-career GIS job. Analysts perform entry, maintenance, visualization, and publication of geospatial data and databases. This job is usually performed in a cubicle.
Support Analyst / Engineer: GIS consulting firms and software companies like ESRI hire people to provide technical support to existing customers and help both existing and prospective customers more effectively use (and buy) the company's software offerings. Requires good people skills (sales) as well as a strong, practical, high-level understanding of the company's products.
GIS Developer: These are programming jobs in developing software with a geospatial component, usually involving web sites and/or mobile devices. These jobs require strong programming skills and experience.
GIS Director / Coordinator / Manager: People who start as GIS analysts commonly move into these management jobs as their careers progress in a private company or government agency. Different positions have different combinations of technical consulting, outreach, sales, and management of subordinates.
Geospatial Data Scientist / Analyst / Specialist: These positions involve the analysis of both geospatial and non-geospatial data. Strong quantitative skills are essential, including programming.
Transit Data Analyst: Transit agencies hire analysts to maintain operational data and communicate analysis of that data. In addition to GIS skills, the ability to use and maintain specialized transit data software will also be needed.
Urban Planner: Urban planners commonly use GIS to provide analysis and visualization during the planning process. Although geography and urban planning often have a significant overlap in what they cover, urban planning careers usually require a professional master's of urban planning degree from an urban planning department.