GIScience

Geographic information science (GIScience) has three common definitions. This tutorial will focus on the first of these three definitions:

Geographic Information Systems

Defining GIScience in terms of GIS requires defining GIS.

Geographic information systems (GIS) are computer systems used for capturing, storing, processing, analyzing, and communicating geospatial data. GIS is a specific type of information technology (IT).

GIS is the use of information technology to store information about what is where.

The Flow of Geospatial Data in GIS
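
As a minimal sketch of what storing information about "what is where" can look like in code, the following Python example pairs each record's attributes (the what) with a coordinate (the where) and answers a simple location query. The PointFeature class, the features_in_bbox helper, and the sample sites are hypothetical names and values invented for illustration. Real GIS software uses far richer data models and spatial indexes.

```python
from dataclasses import dataclass, field


@dataclass
class PointFeature:
    """One geospatial record: attributes (what) plus a location (where)."""
    name: str
    lon: float  # longitude in decimal degrees
    lat: float  # latitude in decimal degrees
    attributes: dict = field(default_factory=dict)


def features_in_bbox(features, min_lon, min_lat, max_lon, max_lat):
    """Return the features whose point falls inside a bounding box."""
    return [
        f for f in features
        if min_lon <= f.lon <= max_lon and min_lat <= f.lat <= max_lat
    ]


# Hypothetical sample data, not taken from any real dataset.
sites = [
    PointFeature("Site A", -73.5, 40.7, {"type": "clinic"}),
    PointFeature("Site B", -106.6, 35.1, {"type": "clinic"}),
]

# "What is where?": list the features inside a hypothetical study-area box.
for feature in features_in_bbox(sites, -74.0, 40.0, -72.0, 41.5):
    print(feature.name, feature.attributes)
```

A linear scan like this is only reasonable for small datasets. It is shown here to make the idea concrete, not as a recommended design.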

Science

Defining GIScience as the use of GIS in scientific inquiry requires defining what science is.

Science is a political term that has a wide variety of meanings in common usage. However, for the purposes of this tutorial, science will be defined as knowledge or a system of knowledge covering general truths or the operation of general laws especially as obtained and tested through scientific method (Merriam-Webster 2019).

The key terms in that definition are scientific method and general truths.

The Scientific Method

Science as an activity is often performed using the scientific method (more formally known as the hypothetico-deductive model), which consists of principles and procedures for the systematic pursuit of knowledge involving:

The successful end of this process is a new, updated, or corroborated theory, which is a plausible or scientifically acceptable general principle or body of principles offered to explain phenomena.

The Scientific Method

General vs. Specific Knowledge

Specific knowledge is knowledge restricted to a particular individual, situation, relation, or effect.

In contrast, general knowledge is knowledge involving, relating to, or applicable to every member of a class, kind, or group.

While GIS is used to answer specific questions about what is where, GIScience deals with general questions about why it is there, or what general knowledge the spatial distribution of a phenomenon gives us about that phenomenon.

For example, GIS was used in the Long Island Breast Cancer Study Project (LIBCSP) to explore the unusually high breast cancer rates on Long Island, NY. In its simplest usage, GIS was used to create maps of specific information about what is where, like this map of cancer rates relative to the rest of the state:

Breast Cancer Relative Incidence in Long Island, NY (National Cancer Institute 2005)

However, what is more important is general information about why is it there: why are breast cancer rates high, and what can be done to reduce those rates?

Continuing the example above, Long Island is the home of numerous industrial sites from the past and present that have released toxic chemicals into the environment. Again, this is specific information about what is where that can be mapped:

Inactive Hazardous Waste Sites on Long Island (National Cancer Institute 2005)

One hypothesis was that these chemicals were a significant cause of breast cancer. However, using GIS and other statistical techniques with survey data to test that hypothesis, the study did not find a clear link (general knowledge) between levels of hazardous chemicals and breast cancer levels.

What the study did find is that women on Long Island had a higher prevalence of other known risk factors (general knowledge), such as having children later in life, high alcohol consumption, and family history of breast cancer (Maurer Foundation 2012).

Inductive vs Deductive Research

The relationship between the general and the specific goes in two directions, and there are two broad research approaches to dealing with general truths and the specifics of particular geographic situations (Trochim 2006):

Deductive vs Inductive Research

The Long Island breast cancer study described above is an example of deductive research: starting with general information about known possible causes like carcinogenic chemicals and lifestyle risk factors, and testing to see if those explained the specific patterns of breast cancer rates seen on Long Island.

Inactive Hazardous Waste Sites on Long Island (National Cancer Institute 2005)

In contrast, an example of inductive research is a study by Athas et al. (2000), which examined women in New Mexico who had breast cancer surgery (specific information) and used GIS to determine how far the women lived from a radiotherapy center where they could get follow-up radiation treatment. The researchers found that the farther a woman lived from the center, the less likely she was to get treatment (general information).

In a case like this, where there were a large number of points and privacy concerns for the women being studied, a map would not be useful or appropriate. Instead, a graph of a mathematical model comparing distance (x-axis) with likelihood of getting treatment (y-axis) was a more effective visualization.

Distance to Radiotherapy Centers vs Likelihood of Seeking Treatment (Athas 2000)
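
The two steps described above, measuring distance to the nearest center and relating that distance to the likelihood of treatment, can be sketched in Python. This is an illustration under stated assumptions and not the method used by Athas et al.: it assumes great-circle (haversine) distance and a logistic curve, and the coordinates, intercept, and slope below are made-up placeholders rather than values from the study.

```python
import math

EARTH_RADIUS_KM = 6371.0088  # mean Earth radius


def haversine_km(lat1, lon1, lat2, lon2):
    """Great-circle distance in kilometers between two lat/lon points."""
    phi1, phi2 = math.radians(lat1), math.radians(lat2)
    dphi = math.radians(lat2 - lat1)
    dlambda = math.radians(lon2 - lon1)
    a = (math.sin(dphi / 2) ** 2
         + math.cos(phi1) * math.cos(phi2) * math.sin(dlambda / 2) ** 2)
    # min() guards against tiny floating-point overshoot above 1.0.
    return 2 * EARTH_RADIUS_KM * math.asin(min(1.0, math.sqrt(a)))


def distance_to_nearest_km(home, centers):
    """Distance from a (lat, lon) home to the closest (lat, lon) center."""
    return min(haversine_km(home[0], home[1], c[0], c[1]) for c in centers)


def treatment_probability(distance_km, intercept, slope):
    """Logistic curve: a negative slope lowers probability with distance."""
    return 1.0 / (1.0 + math.exp(-(intercept + slope * distance_km)))


# Hypothetical locations (lat, lon); not real residences or facilities.
home = (35.0, -106.0)
centers = [(35.1, -106.6), (32.3, -106.8)]
nearest = distance_to_nearest_km(home, centers)

# Hypothetical coefficients chosen only to show the shape of the curve.
for km in (0, 50, 100, 200):
    print(km, round(treatment_probability(km, intercept=1.0, slope=-0.01), 3))
```

In practice the intercept and slope would be estimated from the data, for example with logistic regression, and the curve plotted with distance on the x-axis.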

Exploratory Data Analysis

The idealized world of the scientific method is question-driven, with the collection and analysis of data determined by questions and hypotheses.

However, in some cases, especially at the early stages of research, a data-driven approach may be more appropriate. This reverses the question-driven process, with the data coming first and animating a process of exploratory data analysis:

Exploratory Data Analysis

The ability to create maps with geospatial data facilitates quick visualization to observe spatial patterns, such as clusters, that raise questions and offer options for building hypotheses.

One common technique used with point data like crime or disease is the creation of heat maps. With such data, there are often too many points for patterns to be recognizable on a simple point map.

Assaults in Spokane, WA in 2015 (City of Spokane)

However, heat maps show areas where there are high spatial concentrations (clusters) of activity. While these clusters may not have clear significance by themselves, identifying them alongside other spatial information, or simply with personal knowledge of the area, can inform specific interventions or general theories of the causes of (and solutions to) the problems behind such clusters.

Heat Map of Assaults in Spokane, WA in 2015 (City of Spokane)
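
Heat maps of point data are commonly built with kernel density estimation, in which each point spreads a smooth "bump" of influence over nearby grid cells. The sketch below assumes a Gaussian kernel and a regular grid. The function name, parameters, and sample points are hypothetical, and it does not claim to reproduce how the Spokane map above was made.

```python
import math


def kernel_density_grid(points, x_min, y_min, cell_size, n_cols, n_rows,
                        bandwidth):
    """Gaussian kernel density at the center of every cell in a grid.

    Returns a list of rows; each value is an estimated density in
    points per square map unit.
    """
    norm = 1.0 / (2.0 * math.pi * bandwidth ** 2)
    grid = []
    for row in range(n_rows):
        cy = y_min + (row + 0.5) * cell_size  # cell-center y
        values = []
        for col in range(n_cols):
            cx = x_min + (col + 0.5) * cell_size  # cell-center x
            total = 0.0
            for px, py in points:
                d2 = (px - cx) ** 2 + (py - cy) ** 2
                total += math.exp(-d2 / (2.0 * bandwidth ** 2))
            values.append(norm * total)
        grid.append(values)
    return grid


# Hypothetical incident locations in arbitrary projected units.
incidents = [(1.0, 1.0), (1.2, 0.8), (0.9, 1.3), (8.5, 8.5)]
density = kernel_density_grid(incidents, x_min=0.0, y_min=0.0, cell_size=1.0,
                              n_cols=10, n_rows=10, bandwidth=1.0)

# Cells near the three clustered points get higher density than the
# cell containing the single isolated point.
print(density[1][1] > density[8][8])
```

The bandwidth controls how smooth the surface is: a larger value merges nearby clusters, while a smaller one produces many small peaks. Choosing it is a judgment call that affects which clusters appear.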

There are also statistical mapping techniques, like hot-spot analysis using the Getis-Ord Gi* statistic, that identify hot spots and cold spots where the concentration of points is higher or lower, respectively, than what might be expected from a purely random distribution of points.

Assault Hot Spots in Spokane, WA in 2015 (City of Spokane)
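
To make the hot-spot idea concrete, here is a minimal sketch of the Getis-Ord Gi* statistic for counts aggregated to grid cells. It assumes binary weights (a cell counts as a neighbor if its center lies within a fixed distance band, and each cell is its own neighbor). The function name and the sample grid are hypothetical, and production tools add refinements such as corrections for multiple testing.

```python
import math


def getis_ord_gi_star(values, coords, band):
    """Gi* z-score for each location, using binary distance-band weights.

    values: one count per location
    coords: matching (x, y) locations
    band:   locations within this distance are neighbors, including the
            location itself (the "star" in Gi*)
    Note: undefined (division by zero) if all values are equal or if the
    band is wide enough that every location neighbors every other.
    """
    n = len(values)
    mean = sum(values) / n
    std = math.sqrt(sum(v * v for v in values) / n - mean ** 2)
    z_scores = []
    for xi, yi in coords:
        weights = [
            1.0 if math.hypot(xi - xj, yi - yj) <= band else 0.0
            for xj, yj in coords
        ]
        w_sum = sum(weights)
        w_sq_sum = sum(w * w for w in weights)
        local_sum = sum(w * v for w, v in zip(weights, values))
        numerator = local_sum - mean * w_sum
        denominator = std * math.sqrt((n * w_sq_sum - w_sum ** 2) / (n - 1))
        z_scores.append(numerator / denominator)
    return z_scores


# Hypothetical 5 x 5 grid of incident counts with a cluster in one corner.
coords = [(x, y) for x in range(5) for y in range(5)]
counts = [10 if x <= 1 and y <= 1 else 1 for x, y in coords]
z = getis_ord_gi_star(counts, coords, band=1.5)

# Large positive z-scores suggest hot spots; large negative ones suggest
# cold spots. Values beyond about +/-1.96 are conventionally treated as
# significant at the 95% level.
print(round(z[0], 2), round(z[-1], 2))
```

Because the result is a z-score, it states how unusual a local concentration is relative to the whole study area, which is what distinguishes this approach from a purely visual heat map.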

Research Proposals and Reports

Science involves research, and research involves documentation at all stages of the process.

Research proposals describe planned research and research reports describe completed research. Proposals and reports for research involving geospatial data come in a wide variety of different structures and styles, but these documents commonly have some variant on the following sections.

  1. Introduction: This section gives a general introduction to the research topic and background on the motivations that led to the research.
  2. Literature Review: A summary of existing research on the topic. Researchers commonly seek to identify gaps in the literature that they can fill with new research.
  3. Study Area: For research that focuses on specific geographical areas, an overview of the history and significant characteristics of the area is useful for readers unfamiliar with that area.
  4. Question and Hypothesis: This section poses the question(s) that the research hopes to answer. It then suggests hypotheses about what the answers to those question(s) might be. The research then involves testing those hypotheses to either corroborate or falsify them. These elements are sometimes included in the introduction.
  5. Methods and Data Sources: A description of the sources of data used in the research and the methods that will be used to analyze that data. This section also includes limitations and caveats, such as issues with data quality, availability, or scale.
  6. (Preliminary) Results: A summary of the results of executing the methods described in the prior section, including whether the hypotheses were corroborated or falsified, or whether the results are inconclusive. In research proposals, this section is used to summarize any preliminary research done on the topic.
  7. Conclusions, Significance, Future Research: Interpretation of what the results mean, why those results are important, and what further questions this research raises.

Additional sections that are often also included in papers and reports: