Geographic information science (GIScience) has three common definitions. This tutorial will focus on the first of these three definitions:
- GIS as a Scientific Tool: The use of geographic information systems (GIS) to provide "insight, explanation, and understanding" in "supporting those sciences for which geography is a significant key" (Goodchild 1992)
- The Science of GIS: The exploration of scientific questions about GIS that are "generic, rather than specific to particular fields of application and particular contexts" (Goodchild 1992)
- Information Science: A related definition sees GIScience as a subfield of information science, which is "the science and practice dealing with the effective collection, storage, retrieval, and use of information." (Saracevic 2009). This begins to touch on the emerging field of data science.
What Are Geographic Information Systems?
Defining GIScience in terms of GIS requires defining GIS.
Geographic information systems (GIS) are computer systems used for capturing, storing, processing, analyzing, and communicating geospatial data. GIS is a specific type of information technology (IT).
In the broadest sense, GIS has become highly visible in the wide variety of geospatial technology that is now part of modern life, such as Google Maps and GPS navigation.
However, the term GIS is often limited to cover primarily the software used for processing geospatial data, such as ArcMap, the industry-standard GIS program from ESRI.
Science is a political term that has a wide variety of meanings in common usage. However, for the purposes of this tutorial, science will be defined as "knowledge or a system of knowledge covering general truths or the operation of general laws especially as obtained and tested through scientific method" (Merriam-Webster 2017).
The key terms in that definition are scientific method and general truths.
The Scientific Method
Science as an activity is often performed using the scientific method, which consists of principles and procedures for the systematic pursuit of knowledge involving:
- The recognition and formulation of a problem
- The collection of data through observation and experiment
- The formulation and testing of hypotheses
The successful end of this process is a new, updated, or corroborated theory, which is a plausible or scientifically acceptable general principle or body of principles offered to explain phenomena.
General Truths: Inductive vs Deductive Research
Specific knowledge is knowledge restricted to a particular individual, situation, relation, or effect. In contrast, general knowledge is knowledge involving, relating to, or applicable to every member of a class, kind, or group.
Based on the definition of science given above, scientific inquiry seeks general knowledge that can be applied beyond specific situations or locations. For example, specific knowledge might be rates of a disease in a specific year in specific places (such as US states), while general knowledge would be the different risk factors for that disease.
Optimally, we seek to find patterns that can help us understand the reality of the present, assess possibilities for the future, adjust our actions to create a desired future, and prepare for conditions in the future that we cannot change.
While GIS is used to answer specific questions about what is where, GIScience deals with general questions about why it is there, or what general knowledge the spatial distribution of a phenomenon tells us about that phenomenon.
However, the relationship between the general and the specific goes in two directions, and there are two broad research approaches to dealing with general truths and the specifics of particular geographic situations (Trochim 2006):
- Inductive research starts with specific situations and seeks to infer general theories that can be applied in times and places other than the specific situations being studied
- Deductive research starts with existing general theories and investigates whether those theories explain what is happening in a specific situation being studied
An example of deductive research with GIS is a study by Bauer et al. (2017) that used GIS to analyze and compare accessibility to obstetric care in Germany and England. The starting theory from previous research (primarily in the developing world) is that improved access to obstetric care improves neonatal outcomes (reduced miscarriages, stillbirths, and infant mortality). The hypothesis in this case was that this theory would also explain some of the differences in neonatal outcomes in these specific developed countries.
The study found that women in general, as well as women in poor and rural areas, could more easily access obstetric care in Germany than in England. However, the study did not find a clear relationship between obstetric accessibility and neonatal outcomes, leading to rejection of the hypothesis in this specific case.
An example of inductive research is a study by Parry et al. (2017) that used GIS to determine whether accessibility had any relationship to vulnerability to climate shocks in Amazonia. The hypothesis was that marginalized people in isolated and remote locations would be more vulnerable to the increased flooding and droughts that are expected effects of climate change.
The analysis confirmed the hypothesis, creating a new general spatial theory relating reduced accessibility to increased vulnerability.
The logical social justice conclusion of this theory would be that improved accessibility would result in reduced vulnerability. However, the authors noted the paradoxical caveat that "increasing accessibility through road building would be maladaptive, exposing marginalized people to further harm and exacerbating climatic change by driving deforestation." This illustrates the care that should be taken when interpreting and acting on the results of research.
Spatial vs. Non-Spatial Data
In determining whether GIS is the appropriate tool for exploring a specific question, it is important to distinguish between spatial and non-spatial data.
Spatial data contains multiple locations (multiple "wheres"), in contrast to non-spatial data, which has only one location or no location information at all.
The general test to determine if data is spatial is whether you can make a meaningful map from that data. If your data can be represented (modeled) as multiple points, lines, polygons, or as rasters or point clouds, it is spatial data.
For example, flu infections by state during a particular week of the flu season are spatial data because multiple locations are involved and you can map them.
In contrast, a time series of total rates of flu infection over the whole USA during multiple flu seasons is not spatial data because only one location is involved.
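The rule of thumb above can be sketched as a simple check on the number of distinct locations in a dataset. This is only an illustration: the `Record` structure, the function names, and all of the values are hypothetical placeholders, not real flu data.

```python
from collections import namedtuple

# Hypothetical record layout: a location, a time period, and a value.
Record = namedtuple("Record", ["where", "when", "value"])

def count_wheres(records):
    """Return the number of distinct locations in a dataset."""
    return len({r.where for r in records})

def is_spatial(records):
    """Rule of thumb: more than one distinct location means the data can be mapped."""
    return count_wheres(records) > 1

# Made-up flu counts by state for a single week: several locations, one period.
flu_by_state = [
    Record("Illinois", "week 5", 120),
    Record("Indiana", "week 5", 80),
    Record("Ohio", "week 5", 95),
]

# Made-up national flu rates over several seasons: one location, several periods.
flu_by_season = [
    Record("USA", "season 1", 4.1),
    Record("USA", "season 2", 3.7),
    Record("USA", "season 3", 5.0),
]

print(is_spatial(flu_by_state))   # True
print(is_spatial(flu_by_season))  # False
```

A real dataset would also need geometry (points, lines, polygons, or rasters) attached to each location before it could be mapped. The check above only captures the "multiple wheres" idea.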
Exploratory Data Analysis
The idealized world of the scientific method is question-driven, with the collection and analysis of data determined by questions and hypotheses.
However, in some cases, especially at the early stages of research, a data-driven approach may be more appropriate. This reverses the question-driven process, with the data coming first and animating a process of exploratory data analysis.
The ability to create maps with geospatial data facilitates quick visualization to observe spatial patterns, such as clusters, that raise questions and offer options for building hypotheses.
One common technique used with point data like crime or disease is the creation of heat maps. In such cases, there are often too many points to make pattern recognition possible.
However, heat maps show areas where there are high spatial concentrations (clusters) of activity. While observation of these clusters by themselves may not have clear significance, identification of clusters along with other spatial information, or simply from personal knowledge, can inform specific interventions, or general theories of the causes of (and solutions to) the problems behind such clusters.
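One common way to build a heat map is kernel density estimation, where each point spreads a smooth "bump" of influence over nearby grid cells. The following is a minimal sketch using only the Python standard library. The Gaussian kernel, the function name `kernel_density_grid`, the grid parameters, and the point coordinates are all assumptions chosen for illustration. GIS packages offer their own tools with different kernels and defaults.

```python
import math

def kernel_density_grid(points, xmin, ymin, cell_size, ncols, nrows, bandwidth):
    """Return an nrows x ncols grid of unnormalized Gaussian kernel density values.

    Each cell value is the sum of Gaussian weights of all points, measured
    from the cell center. Higher values indicate denser clusters of points.
    """
    grid = []
    for row in range(nrows):
        cy = ymin + (row + 0.5) * cell_size  # y coordinate of the cell center
        grid_row = []
        for col in range(ncols):
            cx = xmin + (col + 0.5) * cell_size  # x coordinate of the cell center
            density = 0.0
            for px, py in points:
                d2 = (px - cx) ** 2 + (py - cy) ** 2  # squared distance to the point
                density += math.exp(-d2 / (2 * bandwidth ** 2))
            grid_row.append(density)
        grid.append(grid_row)
    return grid

# Made-up incident locations: a tight cluster near (1.5, 1.5) and one outlier.
incidents = [(1.5, 1.5), (1.6, 1.4), (1.4, 1.6), (1.3, 1.5), (4.5, 4.5)]

heat = kernel_density_grid(incidents, xmin=0, ymin=0, cell_size=1.0,
                           ncols=5, nrows=5, bandwidth=0.5)

# Find the (row, col) index of the cell with the highest density.
hottest = max(((r, c) for r in range(5) for c in range(5)),
              key=lambda rc: heat[rc[0]][rc[1]])
print(hottest)  # (1, 1), the cell containing the cluster
```

The bandwidth controls how far each point's influence spreads. A small bandwidth produces many small hot areas, while a large one smooths the surface into a few broad ones, so the choice affects which clusters appear.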
There are also statistical mapping techniques like hot-spot analysis, which uses the Getis-Ord Gi* statistic to identify hot spots and cold spots where the concentration of points is higher or lower, respectively, than what might be expected from a purely random distribution of points.
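As a rough sketch of how such a statistic works, the code below computes a Gi* z-score for each cell of a small grid of counts, using the commonly published form of the formula. Several choices here are assumptions made for illustration: the function name `getis_ord_gi_star`, the binary distance-band weights (1 for features within the threshold, including the feature itself, and 0 otherwise), and the made-up counts. GIS software typically offers other weighting schemes and adds significance corrections.

```python
import math

def getis_ord_gi_star(values, coords, threshold):
    """Return a Gi* z-score for each feature.

    values    -- one number per feature (for example, incidents per grid cell)
    coords    -- (x, y) location of each feature
    threshold -- distance band: features within it, including the feature
                 itself, get weight 1; all others get weight 0

    Assumes the values are not all identical and that the distance band
    does not cover every feature (either case gives a zero denominator).
    """
    n = len(values)
    mean = sum(values) / n
    s = math.sqrt(sum(v * v for v in values) / n - mean ** 2)
    scores = []
    for xi, yi in coords:
        weights = [1.0 if math.hypot(xi - xj, yi - yj) <= threshold else 0.0
                   for xj, yj in coords]
        sum_w = sum(weights)
        sum_w2 = sum(w * w for w in weights)
        weighted_sum = sum(w * v for w, v in zip(weights, values))
        # Observed local sum minus the sum expected under spatial randomness.
        numerator = weighted_sum - mean * sum_w
        denominator = s * math.sqrt((n * sum_w2 - sum_w ** 2) / (n - 1))
        scores.append(numerator / denominator)
    return scores

# Made-up 5 x 5 grid of counts: high values in one corner, low values elsewhere.
coords = [(c, r) for r in range(5) for c in range(5)]
counts = [9 if (r < 2 and c < 2) else 1 for r in range(5) for c in range(5)]

z = getis_ord_gi_star(counts, coords, threshold=1.0)

print(z[0] > 1.96)   # True: the high-count corner is a hot spot at the 95% level
print(z[24] < 0)     # True: the opposite corner scores below zero
```

Large positive z-scores indicate hot spots and large negative z-scores indicate cold spots. In this toy example the far corner is negative but not strongly enough to count as a statistically significant cold spot.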
Research Proposals and Reports
Science involves research, and research involves documentation at all stages of the process.
Research proposals describe planned research and research reports describe completed research. Proposals and reports for research involving geospatial data come in a wide variety of different structures and styles, but these documents commonly have some variant on the following sections.
- Introduction: This section gives a general introduction to the research topic and background on the motivations that led to the research.
- Literature Review: A summary of existing research on the topic. Researchers commonly seek to identify gaps in the literature that they can fill with new research.
- Study Area: For research that focuses on specific geographical areas, an overview of the history and significant characteristics of the area is useful for readers unfamiliar with that area.
- Question and Hypothesis: This section poses the question(s) that the research hopes to answer. It then suggests hypotheses about what the answers to those question(s) might be. The research then involves testing those hypotheses to either corroborate or falsify them. These elements are sometimes included in the introduction.
- Methods and Data Sources: A description of the sources of data used in the research and the methods that will be used to analyze that data. This section also includes limitations and caveats, such as issues with data quality, availability, or scale.
- (Preliminary) Results: A summary of the results of the execution of the methods described in the prior section. Description of whether the hypotheses were corroborated or falsified, or whether the results are inconclusive. With research proposals, this is used to summarize any preliminary research done on the topic.
- Conclusions, Significance, Future Research: Interpretation of what the results mean, why those results are important, and what further questions this research raises.
Additional sections that are often also included in papers and reports:
- Abstract: This is a one-paragraph summary of the research topic, questions, and results. The abstract is placed under the title at the beginning of the article and gives readers a quick way to evaluate whether the research is relevant to their work and deserving of further reading.
- Bibliography / References: A list of the literature referenced throughout the document.
- Appendices: Graphs, data tables, computer source code, etc. that can be used to more deeply investigate or validate the results of the research.