Introduction to Geospatial Technology and Geographic Information Systems
Geospatial technology is a term used to describe the range of tools contributing to the geographic mapping and analysis of the Earth and human societies. More specifically for this class, the term geospatial technology will be used to describe the various parts of electronic information systems used to capture, store, process, analyze and communicate geospatial information.
What is Geography?
Geospatial technology falls under the larger umbrella of geography.
As with many other terms in this document, geography is a term with a wide variety of meanings and there is no single definition that covers the variety of activities that are performed by people that call themselves geographers or are in some way associated with the discipline of geography. Indeed, there are probably as many definitions of geography as there are geographers and the question of what geography is has triggered many (sometimes vicious and arguably pointless) debates among geographers.
Below are three different definitions of geography that you may find useful. While these definitions are not universally accepted, they will give us a foundation for understanding the relationship between geography and geospatial technology.
What is where, why is it there, and why do we care?
In 2002, geographer Charles Gritzner proposed this three part definition of the types of questions explored in geography:
- What is where: This is probably what most Americans think geography is based on their experience in elementary school geography memorizing the names of state capitals. The map is an exemplar of this idea as a graphical representation of where things are on the surface of the earth. "What is where" questions are commonly answered with geospatial technologies
- Why is it there: This aspect of geography deals with the processes and causes that put things where they are. These questions can be more interesting than "what is where" questions in that they give an understanding of how the past led to the present, and allow us to build models to anticipate what might happen in the future
- Why do we care: This aspect of geography deals with values and ethics. Values questions are often embedded within and hidden under the representations used in geospatial technology
Human, physical, and GIS
While activities that can be considered geography are performed in a variety of different professions, many (if not most) people that call themselves geographers work in the academy (colleges and universities). Within university geography departments there are three broad areas of study:
- Human geography explores human phenomena and commonly overlaps with sociology, anthropology, economics and history
- Physical geography is a natural science that commonly overlaps with geology, biology, and ecology
- Geographic information systems (GIS) focuses on the development and use of geospatial technology
Although academic geographers commonly focus primarily on only one of these fields, the boundaries are not rigid. Many human and physical geographers use geographic information systems in their research. Physical geography often studies environmental phenomena that are caused or affected by humans, and human geographers often study the interaction between people and their environments. College instructors will commonly teach courses in all three areas.
Geography is what geographers do.
Everything that exists happens in space and time. Therefore, everything has a history and everything has a geography. There is no Pope in geography that says what can and cannot be studied in geography, and since few people know (or care) what contemporary academic geography is, academic geographers are often free to cross disciplinary boundaries and study what they want. Therefore, almost anything can be and is studied within geography. If you visit the annual meeting of the Association of American Geographers, you can see presentations on a very wide variety of topics by a very heterogeneous collection of researchers.
What is GIS?
Geographic information systems (GIS) are computer systems used for capturing, storing, processing, analyzing, and communicating geospatial data. GIS is a specific type of information technology (IT).
While in academia the term geographic information systems is commonly used specifically to refer to the manipulation and analysis of geospatial data using software like ESRI's ArcMap, the term can be used to refer to a much wider range of applications of geospatial technology.
There are three broad areas where GIS is used:
- Consumer applications
- Business applications
- Research tool
GIS is the technology behind a wide variety of consumer applications like Google Maps.
The Global Positioning System (GPS) was created in the 1970s for military navigation and adapted in the 1990s for civilian and commercial use. GPS navigation systems for the general public have evolved considerably since the expensive and inaccurate systems that first appeared in 1995. They have now become ubiquitous for both private and corporate use and have made travel faster, safer and less confusing. Route data and route conditions are maintained in GIS.
Similarly, geospatial technology is used in real estate for comparatively simple tasks like maintaining databases of properties and providing information to potential buyers, to more complex tasks like market analysis and determining optimal locations for retail stores.
Location-based services (LBS) are mobile services that combine information about a user's physical location with online connectivity. LBS allows users to access relevant information about their surroundings and inform others of their whereabouts. LBS can also be used by businesses for fleet tracking and inventory management.
GIS is used extensively as part of the operations of a wide variety of businesses and governmental agencies.
GIS used by businesses and other large organizations is referred to as enterprise GIS. While the customers of those businesses often do not see these applications of GIS, the general public benefits from the products and services provided by those businesses.
Almost all city governments of any size maintain information on municipal works (water, sewer, gas, etc.) and property ownership in GIS databases. Accordingly, governments frequently employ GIS technicians, analysts and programmers to maintain these data and systems.
Geospatial technology is used by urban planners to analyze existing urban form, model the effects of changes, and visualize proposed plans and structures.
Extractive industries like oil/gas, mining and forestry use GIS to maintain inventories of assets and perform analysis needed to support successful exploration and exploitation of resources.
Forests are widely dispersed assets and contemporary forestry relies on geospatial technology to assist in a wide variety of forest management tasks:
- Assists in the planning and scheduling of harvesting and planting trees.
- Provides land appraisers with detailed information needed to determine the market value of the land.
- Provides information for herbicide and fertilization treatment specialists and helps them accurately spray and fertilize specific places on the land.
- Assists grading contractors and logging companies in the planning of road systems, bridges, ponds, and land clearing.
- Helps wildlife management specialists with endangered species management. Provides habitat and species maps and data that allow for the study of animal populations.
- Provides forest managers with an inventory of the trees and other vegetation on the land.
Precision agriculture uses geographical information to determine field variability to ensure optimal use of inputs and maximize the output from a farm. Remotely sensed data, along with GPS-enabled farm equipment and GIS analysis allows farmers to deliver precise amounts of fertilizers, pesticides, herbicides and irrigation needed in different parts of their fields to maximize harvests and minimize costs.
Much as consumers use GPS apps to help navigate in unfamiliar areas, logistics companies use GIS to optimize routing of vehicles, and transportation planners use it to evaluate options for changes to transportation infrastructure.
Location is a fundamental aspect of doing business in terms of locating retail outlets, optimizing supply chains, or creating effective marketing campaigns. Businesses use GIS to collect and analyze a wide variety of information on suppliers, customers and competitors that can inform crucial business decisions.
The use of crime mapping as a tool for policing dates back at least to the 19th century. Contemporary geospatial technology allows the use of increasingly large amounts of location-based data to create actionable intelligence that law enforcement agencies can use to respond to and anticipate criminal activity:
- Crime and investigative analysis
- Data fusion and intelligence analysis
- Tracking vehicles and personnel
- Corrections, parole, and probation
- In-vehicle mobile mapping
- Traffic and accident analysis
- Intelligence-led policing
Geography in general and geospatial technology in particular have a long historical relationship with military activity. Geospatial intelligence is an indispensable part of modern warfare.
Research Tool (GIScience)
GIS is commonly used as a tool for research and scientific inquiry.
While geographic information systems maintain specific information about what is where, geographic information science (often referred to as GIScience or GISci) seeks general knowledge about specific phenomena that can be analyzed using GIS.
For example, GIS can be used to detect specific areas where there is an above-average incidence of specific diseases, such as breast cancer on Long Island, NY.
In contrast, GIScience seeks to understand why those hot spots are where they are, or what those hot spots tell us in general about the causes of breast cancer. In the case of Long Island, GIS was one of a variety of tools used in the Long Island Breast Cancer Study Project, a major research project in the early 2000s organized by the National Institutes of Health to explore the causes of above-average breast cancer rates on Long Island. While a number of carcinogenic compounds were suspected as causal factors, the ultimate conclusion was that the high rates are more likely attributable to known social risk factors, such as having children at a later age, a family history of breast cancer, and high alcohol consumption.
Knowledge gained from GIScience can then be used to plan strategies for addressing social problems. In the case of breast cancer on Long Island, knowing that many cancer risk factors are under the control of individuals can encourage the adoption of healthy eating and lifestyle choices.
Facebook posts, Tweets, cellphone messages, blog posts and a vast array of other daily electronic activities often have location information associated with them, making geospatial technology an integral part of the big data revolution. Data analytics is an emerging, exciting area of research and commercial activity seeking to extract useful (and sometimes profitable) information from the vast amount of data generated on the digital earth.
Location data is fundamental to archaeology, making geospatial technology a perfect tool for archaeologists. In recent years, remote sensing technologies like Lidar have been helpful in discovering sites like the lost city La Ciudad Blanca that were obscured by vegetation or other landscape changes over time.
Why Study GIS?
GIS as a Career
Jobs are available in GIS under a variety of titles. GIS is also incorporated as a marketable skill in a variety of other job areas.
Data from the Bureau of Labor Statistics (BLS) show GIS-focused careers distributed across a variety of areas:
| Career Area | 2017 Median Pay | 2016 Number of Jobs | Job Growth Outlook 2016-2026 |
|---|---|---|---|
| Surveying and Mapping Technicians | $43,340 | 60,200 | 11% (faster than average) |
| Cartographers and Photogrammetrists | $63,990 | 12,600 | 19% (much faster than average) |
| Geographers | $76,860 | 1,500 | 7% (as fast as average) |
GIS as a Tool
In addition to careers that explicitly use GIS, the ability to analyze and visualize data in general and geospatial data in particular is useful for a variety of careers. Aside from explicitly quantitative careers like statisticians, epidemiologists, and data scientists, administrators and managers (especially in the public sector) often have questions about what is where, and knowing that GIS is available to help answer those questions (even if they have to bring in a specialist to actually perform the analysis) can be helpful.
Within the academy, GIS fits into the liberal arts tradition of colleges and universities that strives to give students a broad knowledge of the world. This helps students understand their place in that world, and helps students become more-informed citizens of their communities, nation, and planet.
GIS as a Challenge
While we often think of college as vocational training in specific job skills, a college degree is a more-general certification that helps employers sift through the multitude of applicants for each job and find employees that will succeed in their organizations. The economist Bryan Caplan argues that the primary value of a college degree is as a signal to employers of three characteristics:
- Intelligence: Does a potential employee have the ability to learn or understand or to deal with new or trying situations?
- Conscientiousness: Can a potential employee set and keep long-range goals, deliberate over choices, and take seriously obligations to others?
- Conformity: Does a potential employee know how to adhere to organizational and social norms, and do what they are told?
GIS and Geospatial Data
Geospatial data specifies the location of characteristics or objects on the surface of the earth. Characteristics or attributes can be specified as names or numeric values. Locations are usually defined as pairs of numbers (coordinates) that specify longitude (how far east or west) and latitude (how far north or south).
The ultimate objective of geographic information systems and geospatial technology in general is to allow people to understand reality and be able to perform actions based on that understanding. Toward that end, geospatial information flows through different stages and different layers of technology.
Capture: Reality needs to be captured in some way and converted to an electronic abstraction (data) - usually numbers. Geospatial information capture can be performed in a variety of ways: via cameras and sensors in satellites or on airplanes, through GPS receivers in smartphones, or simply by people walking around and notating observations of the physical world or answers to questions that are asked of the people that inhabit that physical world.
Store: One of the aspects of contemporary electronic geospatial technology that gives it power is the ability to inexpensively store massive amounts of geospatial information. However, there are numerous technologies and formats for storing geospatial information, and the choices of storage media in an organization can greatly influence the efficiency and flexibility of the geospatial technologies used by the organization.
Process: Geospatial data in its raw form usually needs additional processing to remove noise and/or give the data a structure that can allow it to be handled more easily. Satellite data must often undergo numeric transformations so that it can be used with geospatial data from other sources or times. Data from surveys often needs to be interpreted and placed into tables. Addresses need to be converted into numeric coordinates (geocoded). When performing analysis with Geographic Information Systems (GIS) it is not uncommon to spend more time gathering and cleaning up the data than actually performing analysis on the data.
Analyze: Much of the analysis performed with GIS involves relating one set of data to another. In health geography, a common analysis technique is to compare the locations of disease outbreaks with the locations of possible sources of biological or chemical pathogens. Analysis can range from simple overlays of two geospatial data sets, to complex multi-variable statistical models that can be used to anticipate future changes.
Communicate: Geospatial information stored inside an information system must be communicated back to people in order for it to be useful. Geospatial data is commonly visualized as maps or charts.
Interpret: Maps and charts are made of signifiers - signs and symbols. These are then interpreted by users as communicating some kind of meaning or knowledge. This process of interpretation (like most human behavior) is complex and dependent on the background of the user and the context in which the data is being used.
Act: Although geospatial technology is sometimes simply used to gain understanding for the sake of understanding, geospatial technology more commonly serves as a tool for accomplishing some larger objective - often commercial. This raises questions of values and ethics as knowledge about reality gained from geospatial technology is then used to make some change in that reality (the why do we care aspect of geography). A couple using a mobile app to find a nearby Thai restaurant may be unaware of the commercial relationships that lead some restaurants to be displayed higher in the list than others. A colorful map used to promote a public policy initiative can hide the assumptions made about the underlying data and the motivations of the map makers. A military drone pilot may need to target a missile strike based on geospatial intelligence that is uncertain, in support of a mission that reflects debatable geopolitical objectives and strategies.
What is Geospatial Information?
The Earth is a spheroid - it is round like a ball or sphere but flattened slightly by the centrifugal force of rotation. The Earth is about 24,900 miles in circumference, but around 27 miles wider than it is tall.
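The two figures above can be checked with a little arithmetic. The sketch below assumes the radii of the WGS 84 reference ellipsoid, which the text does not name, so treat the constants as one reasonable choice rather than the source of the quoted numbers.

```python
import math

# WGS 84 reference ellipsoid radii in meters (an assumption; the text
# does not say which ellipsoid its figures are based on).
EQUATORIAL_RADIUS_M = 6_378_137.0
POLAR_RADIUS_M = 6_356_752.314245
METERS_PER_MILE = 1609.344


def meters_to_miles(meters: float) -> float:
    """Convert a length in meters to statute miles."""
    return meters / METERS_PER_MILE


# Circumference measured around the equator.
equatorial_circumference_mi = meters_to_miles(2 * math.pi * EQUATORIAL_RADIUS_M)

# Equatorial diameter ("width") minus polar diameter ("height").
diameter_difference_mi = meters_to_miles(2 * (EQUATORIAL_RADIUS_M - POLAR_RADIUS_M))

print(f"Equatorial circumference: {equatorial_circumference_mi:,.0f} miles")
print(f"Equatorial minus polar diameter: {diameter_difference_mi:.1f} miles")
```

Both results land close to the rounded values quoted in the text.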
To specify locations on the surface of the earth, we use angles that divide the earth into slices that form lines on the surface.
The horizontal slices tell you how many degrees you are north or south from the equator. This is called latitude. The equator is zero degrees latitude and, by common convention, negative numbers are south and positive numbers are north. The range is from negative 90 degrees at the south pole to positive 90 degrees at the north pole.
The vertical slices tell you how many degrees you are east or west of the prime meridian, which runs through Greenwich, England. This east/west dimension is called LONGitude - think LONG. Negative numbers down to negative 180 are west of Greenwich. Positive numbers up to positive 180 are east of Greenwich.
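The sign convention described above can be captured in a small data type. This is a minimal sketch: the `Coordinate` class and its `describe` method are hypothetical names invented for illustration, and the example coordinates for New York City are approximate.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Coordinate:
    """A location in signed decimal degrees (hypothetical helper type)."""

    latitude: float   # -90 (south pole) to +90 (north pole)
    longitude: float  # -180 (west) to +180 (east)

    def __post_init__(self) -> None:
        # Reject values outside the ranges given in the text.
        if not -90.0 <= self.latitude <= 90.0:
            raise ValueError(f"latitude out of range: {self.latitude}")
        if not -180.0 <= self.longitude <= 180.0:
            raise ValueError(f"longitude out of range: {self.longitude}")

    def describe(self) -> str:
        """Spell out the hemispheres implied by the signs."""
        ns = "N" if self.latitude >= 0 else "S"
        ew = "E" if self.longitude >= 0 else "W"
        return f"{abs(self.latitude):.4f}° {ns}, {abs(self.longitude):.4f}° {ew}"


# Approximate coordinates for New York City, used only for illustration.
nyc = Coordinate(latitude=40.7128, longitude=-74.0060)
print(nyc.describe())
```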
Data Models and Geometry
There are two broad models for storing geospatial information: Vector and Raster. They are called models because they are simplified representations of the objects or phenomena they are used to represent.
Vector data stores locations as discrete geometric objects: points, lines or polygons.
- Points are represented with a single coordinate pair (latitude and longitude). Points are useful for representing objects like vehicles or smartphone locations that occupy little or no area
- Polygons are represented with a collection of coordinate pairs that define the outside boundary of an area. Polygons are useful for representing things like property boundaries or city boundaries that define an area
- Lines are represented by a sequence of coordinate pairs. Lines are useful for representing things like roads or rivers that are long and narrow
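The three vector geometry types can be sketched as plain Python structures. The coordinates below are made up for illustration, and the (longitude, latitude) ordering is a convention chosen for this sketch, not something the text prescribes.

```python
from typing import List, Tuple

# A coordinate pair as (longitude, latitude) in decimal degrees.
Coord = Tuple[float, float]

# Point: one coordinate pair, e.g. a vehicle location.
vehicle: Coord = (-104.99, 39.74)

# Line: an ordered sequence of coordinate pairs, e.g. a road centerline.
road: List[Coord] = [(-105.00, 39.70), (-104.98, 39.72), (-104.95, 39.75)]

# Polygon: coordinate pairs tracing the outer boundary of an area.
# The first pair is repeated at the end to close the ring.
parcel: List[Coord] = [
    (-105.00, 39.70),
    (-104.90, 39.70),
    (-104.90, 39.80),
    (-105.00, 39.80),
    (-105.00, 39.70),
]


def bounding_box(coords: List[Coord]) -> Tuple[float, float, float, float]:
    """Return (min_lon, min_lat, max_lon, max_lat) for any list of pairs."""
    lons = [lon for lon, _ in coords]
    lats = [lat for _, lat in coords]
    return min(lons), min(lats), max(lons), max(lats)


print(bounding_box([vehicle]))
print(bounding_box(road))
print(bounding_box(parcel))
```

Because all three types reduce to lists of coordinate pairs, one helper such as `bounding_box` can serve every geometry.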
Raster data stores information in a grid of regularly-spaced pixels of attribute data that cover an area of interest. Raster data is most useful for representing data about areas where there are unclear boundaries, such as with elevation or levels of vegetation. The best known type of raster data is photographic image data.
Although many different types of data can be stored as rasters, data about discrete objects with clear boundaries is usually more appropriately and accurately stored as vector rather than raster. GIS software allows conversion between raster and vector, although the conversion process between the two models often involves inaccuracy and uncertainty.
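A raster can be sketched as a grid of values plus the information needed to tie cells to locations. In the sketch below the cell size, grid origin, and elevation values are all hypothetical. It also shows one source of the conversion inaccuracy mentioned above: a point converted to a cell and back comes out at the cell center, not at its original position.

```python
from typing import Tuple

CELL_SIZE = 10.0     # cell width and height in map units (hypothetical)
ORIGIN = (0.0, 0.0)  # lower-left corner of the grid (hypothetical)

# A tiny raster of elevation values; row 0 is the southernmost row.
elevation = [
    [120, 125, 131],
    [118, 122, 128],
]


def to_cell(x: float, y: float) -> Tuple[int, int]:
    """Convert a map coordinate to the (row, col) of the cell containing it."""
    col = int((x - ORIGIN[0]) // CELL_SIZE)
    row = int((y - ORIGIN[1]) // CELL_SIZE)
    return row, col


def cell_center(row: int, col: int) -> Tuple[float, float]:
    """Convert a (row, col) back to the map coordinate of the cell center."""
    x = ORIGIN[0] + (col + 0.5) * CELL_SIZE
    y = ORIGIN[1] + (row + 0.5) * CELL_SIZE
    return x, y


point = (14.0, 3.0)
row, col = to_cell(*point)
print("cell:", (row, col), "elevation:", elevation[row][col])
print("original point:", point, "recovered point:", cell_center(row, col))
```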
An emerging third type of model is the point cloud, which stores geospatial information as a collection of points in three-dimensional space (latitude, longitude and elevation). Point clouds are commonly captured using an aerial laser scanning technique called Lidar. Unlike vector and raster data that is analogous to a flat two-dimensional map, point clouds can be used to more faithfully represent structures and topography, albeit at the cost of greater storage and processing demands.
Remotely Sensed Data
Remote sensing is the process of obtaining information about objects or areas from a distance, typically from aircraft or satellites. The age of remote sensing can be said to have started in 1860 with James Wallace Black's photograph of Boston from a balloon.
Aerial photography is still an important source of remotely sensed geospatial information, and the advent of inexpensive drones, or unmanned aerial vehicles (UAVs) such as quadcopters, has made aerial photography increasingly accessible for commercial and non-commercial purposes.
Satellites have been used for capturing geospatial information since the Corona defense intelligence satellite project debuted in 1960. Satellite data is used for weather forecasting, mapping, environmental research, military intelligence and an ever-expanding collection of uses.
Lidar is a fairly recent technique for capturing geospatial data that uses laser scanning to create three-dimensional point clouds of geographic features. Lidar sensors can be mounted on UAVs, airplanes or satellites.
Directly Captured Data
Almost all cities in the developed world use GIS to maintain property records and information on utilities like water and sewers. Designs for large civil engineering projects like roads and bridges often incorporate geospatial information to define the locations of their various components.
Surveyors capture the locations of existing features on the surface of the Earth, delineate areas of ownership or control, and provide accurate guidance for human actions that transform the Earth, such as when building structures. Surveying is an ancient activity that dates back at least to the third millennium BC. Modern surveyors commonly use GPS-enabled devices like total stations to make precise measurements of distances, angles and elevations of infrastructure and property lines.
The Global Positioning System (GPS) is a satellite-based system that enables accurate capture of location anyplace on the surface of the planet. While radio navigation aids for aviation have been used since 1908 and the first military satellite-based navigation system became operational in 1964, the deployment of the Global Positioning System beginning in 1978 has made location tracking an integral part of commercial, military and private life in the developed world. GPS receivers use trigonometry to calculate exact locations on the surface of the earth based on variations in the timing of signals received from GPS satellites.
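The idea of deriving a position from signal timing can be illustrated with a heavily simplified sketch. It assumes a flat two-dimensional plane, three transmitters at known positions, and perfectly synchronized clocks. A real GPS receiver works in three dimensions, needs at least four satellites, and must also solve for its own clock error. The technique shown is usually called trilateration, and the function name `trilaterate_2d` is hypothetical.

```python
import math
from typing import Tuple

SPEED_OF_LIGHT_M_S = 299_792_458.0

Point = Tuple[float, float]


def trilaterate_2d(p1: Point, d1: float, p2: Point, d2: float,
                   p3: Point, d3: float) -> Point:
    """Find the (x, y) whose distances to p1, p2, p3 are d1, d2, d3.

    Subtracting the first circle equation from the other two leaves two
    linear equations in x and y, solved here with Cramer's rule.
    """
    (x1, y1), (x2, y2), (x3, y3) = p1, p2, p3
    a11, a12 = 2 * (x2 - x1), 2 * (y2 - y1)
    a21, a22 = 2 * (x3 - x1), 2 * (y3 - y1)
    b1 = d1**2 - d2**2 + x2**2 - x1**2 + y2**2 - y1**2
    b2 = d1**2 - d3**2 + x3**2 - x1**2 + y3**2 - y1**2
    det = a11 * a22 - a12 * a21
    if det == 0:
        raise ValueError("transmitters are collinear; position is ambiguous")
    x = (b1 * a22 - a12 * b2) / det
    y = (a11 * b2 - a21 * b1) / det
    return x, y


# Hypothetical transmitter positions in meters on a flat plane.
transmitters = [(0.0, 0.0), (10_000.0, 0.0), (0.0, 10_000.0)]
true_position = (3_000.0, 4_000.0)

# Simulate signal travel times, then turn them back into distances.
travel_times = [math.dist(true_position, t) / SPEED_OF_LIGHT_M_S for t in transmitters]
distances = [SPEED_OF_LIGHT_M_S * t for t in travel_times]

estimate = trilaterate_2d(
    transmitters[0], distances[0],
    transmitters[1], distances[1],
    transmitters[2], distances[2],
)
print(estimate)
```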
Global Navigation Satellite System (GNSS) is a generic name used to refer to the growing number of satellite navigation systems around the world. However, the term GPS is common in popular usage to refer to satellite navigation in general.
Government agencies around the world use geospatial information for a variety of administrative purposes. The US Census Bureau, as well as city governments like the City of Denver, the City of Chicago, and New York City make a wide variety of geospatial data available to the public.
Commercial data vendors like DigitalGlobe also make high-quality proprietary data available to the public for a fee.
Volunteered Geographic Information (VGI) is geographic data created, assembled and disseminated voluntarily by individuals (Goodchild 2007). OpenStreetMap is an example of a portal for the sharing of geographic data.
Big data is an emerging area that deals with data that is too big for traditional data processing techniques. Much of that data (like cell phone records, Facebook posts, and tweets) includes locations, making it a form of geospatial information.
While the line between big and non-big data is fuzzy and ever-changing, the distinguishing characteristics are velocity, volume, variety and veracity: it comes in quickly, there is a lot of it, it comes in a lot of different forms, and it isn't always rigorously trustworthy.
For example: Globally around 6,000 tweets are sent every second (velocity) with a daily total of around 500 million tweets (volume). The text in the tweets contains a lot of different types of information (variety). However, the meanings are often ambiguous and sometimes represent misperceptions or deliberate attempts to deceive (veracity).
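The per-second and per-day figures above are consistent with each other, as a quick calculation shows. The only input is the rate quoted in the text.

```python
TWEETS_PER_SECOND = 6_000       # velocity figure quoted in the text
SECONDS_PER_DAY = 24 * 60 * 60  # 86,400 seconds

tweets_per_day = TWEETS_PER_SECOND * SECONDS_PER_DAY
print(f"{tweets_per_day:,} tweets per day")  # prints "518,400,000 tweets per day"
```

The result, a little over 500 million, matches the rounded daily total.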
Almost all data (geospatial or otherwise) requires some additional processing after capture (whether from sensors or surveys) in order to make it usable. Secondary data often must be transformed from the format of its primary use (such as presentation as a map or on a website) to one suitable for further analysis or communication.
This janitorial work is tedious and is often colloquially referred to as massaging. With big data projects, rough estimates are that 50% to 80% of effort is spent massaging data rather than doing the more-interesting tasks of analyzing and communicating. The diverse formats and styles for representing data reflect the diverse sources and uses of data. In many ways, advances in technology only increase this complexity, making processing this data a continual source of challenges.
One common processing technique is orthorectification, stretching an aerial (or satellite) photo so that it is an accurate two-dimensional representation of distances on the ground. Ortho means intersecting or lying at right angles and rectification means to correct by removing errors. Once a remotely sensed image has been orthorectified, it can be used for photogrammetry, or taking measurements of objects or distances on the ground using the photo.
For the purposes of this lesson (and course), the product of processing is considered to be the same data with a different representation. This is distinct from analysis where the intent is to extract new or hidden information from geospatial data.
Geospatial data can be stored on any medium that is used to store other types of data. On mobile devices and workstations, data can be stored as files on hard drives or flash memory drives. File types include:
- KML (used with Google Maps/Earth)
- CSV Files (tables of data)
- Spreadsheets (Excel)
- Shapefiles (an old format commonly used to share data)
- File and Personal Geodatabases (used with ESRI software)
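Of the formats listed, CSV is the simplest to read without specialized software. The sketch below uses Python's standard `csv` module. The file content, place names, and column names (`name`, `latitude`, `longitude`) are hypothetical, since CSV has no fixed convention for storing coordinates.

```python
import csv
import io
from typing import Dict, List, Union

# Hypothetical CSV content; a real file would be opened with open(path).
CSV_TEXT = """name,latitude,longitude
Site A,39.74,-104.99
Site B,39.75,-105.00
"""

Record = Dict[str, Union[str, float]]


def read_points(text: str) -> List[Record]:
    """Parse CSV rows into records with numeric coordinates."""
    records: List[Record] = []
    for row in csv.DictReader(io.StringIO(text)):
        records.append({
            "name": row["name"],
            # CSV stores everything as text, so convert explicitly.
            "latitude": float(row["latitude"]),
            "longitude": float(row["longitude"]),
        })
    return records


points = read_points(CSV_TEXT)
for record in points:
    print(record["name"], record["latitude"], record["longitude"])
```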
Large data sets are usually stored in databases on networked servers. This permits the communal use and editing of geospatial data that must be shared among multiple people.
Increasingly, the cloud is used to store geospatial data. The cloud can be more formally described as server virtualization, where data is accessed across the internet as if it were on a normal server, but the actual physical storage location can change depending on the amount of data, the number of simultaneous users, and desired speed of access.
Examples of cloud data storage are Box, Dropbox, Google Drive, and Amazon Web Services.
Systems administrators with an understanding of both information technology and the unique characteristics of geospatial data are needed to maintain these servers.
One of the most important aspects of GIS is the ability to analyze geospatial data and find patterns that can inform real-world decisions.
The locational characteristic of geospatial data allows an overlay of multiple data sets to determine relationships between those data sets. For example, overlaying a layer of household income data over a layer of crime data will likely show a relationship between high levels of crime and high levels of poverty.
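One basic overlay operation is assigning points from one layer to the polygons of another. The sketch below uses the ray-casting test for point-in-polygon. The district shapes and incident locations are invented for illustration and say nothing about any real relationship between the layers.

```python
from typing import Dict, List, Tuple

Point = Tuple[float, float]


def point_in_polygon(x: float, y: float, polygon: List[Point]) -> bool:
    """Ray casting: count how many edges a ray from (x, y) toward +x crosses."""
    inside = False
    n = len(polygon)
    for i in range(n):
        x1, y1 = polygon[i]
        x2, y2 = polygon[(i + 1) % n]
        # Only edges that straddle the ray's height can be crossed.
        if (y1 > y) != (y2 > y):
            x_cross = x1 + (y - y1) * (x2 - x1) / (y2 - y1)
            if x < x_cross:
                inside = not inside
    return inside


# Hypothetical polygon layer: two rectangular districts.
districts: Dict[str, List[Point]] = {
    "north": [(0, 5), (10, 5), (10, 10), (0, 10)],
    "south": [(0, 0), (10, 0), (10, 5), (0, 5)],
}

# Hypothetical point layer: incident locations.
incidents: List[Point] = [(2, 7), (8, 9), (5, 2), (12, 3)]

# Overlay: count the incidents that fall inside each district.
counts = {name: 0 for name in districts}
for px, py in incidents:
    for name, boundary in districts.items():
        if point_in_polygon(px, py, boundary):
            counts[name] += 1
            break

print(counts)
```

Once each district has a count, it can be compared against any other attribute of the same districts, such as household income.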
A model is a simplified representation of a thing or process that is used to communicate information or anticipate possible future changes. In GIS, models are often created using mathematical formulas to estimate characteristics of locations or anticipate possible future changes. For example, a spatial model can be built that incorporates automobile accessibility, locations of competitors, locations of residential neighborhoods, and other factors to create a map of optimal locations for a new retail store.
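A very simple form of such a model is a weighted sum of factor scores for each candidate location. The sketch below is one possible approach, not a description of any particular GIS tool. The factor names, weights, and scores are all hypothetical.

```python
from typing import Dict

# Hypothetical weights expressing how much each factor matters (sum to 1).
WEIGHTS: Dict[str, float] = {
    "auto_accessibility": 0.5,
    "residential_proximity": 0.3,
    "distance_from_competitors": 0.2,
}

# Hypothetical factor scores per candidate site, scaled 0 (worst) to 1 (best).
SITES: Dict[str, Dict[str, float]] = {
    "site_a": {"auto_accessibility": 0.8, "residential_proximity": 0.6,
               "distance_from_competitors": 0.4},
    "site_b": {"auto_accessibility": 0.5, "residential_proximity": 0.9,
               "distance_from_competitors": 0.9},
    "site_c": {"auto_accessibility": 0.9, "residential_proximity": 0.2,
               "distance_from_competitors": 0.3},
}


def suitability(factors: Dict[str, float]) -> float:
    """Combine factor scores into one number using the weights."""
    return sum(WEIGHTS[name] * factors[name] for name in WEIGHTS)


scores = {site: suitability(factors) for site, factors in SITES.items()}
best_site = max(scores, key=scores.get)

for site, score in sorted(scores.items(), key=lambda item: item[1], reverse=True):
    print(f"{site}: {score:.2f}")
print("best:", best_site)
```

In a raster-based workflow the same calculation would be applied to every cell to produce a suitability map. The choice of weights is a judgment call that strongly shapes the result.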
Network analysis can be applied to transportation networks (sets of lines and connecting node points) to model possible changes when planning for a new network link, such as a new bridge or widened highway.
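The effect of a new link can be sketched by computing the shortest travel time between two nodes before and after adding it. The example below uses Dijkstra's algorithm on a tiny hypothetical network. The node names and travel times are invented for illustration.

```python
import heapq
from typing import Dict, List, Tuple

Graph = Dict[str, List[Tuple[str, float]]]


def add_link(graph: Graph, a: str, b: str, minutes: float) -> None:
    """Add a two-way link between nodes a and b."""
    graph.setdefault(a, []).append((b, minutes))
    graph.setdefault(b, []).append((a, minutes))


def shortest_time(graph: Graph, start: str, goal: str) -> float:
    """Dijkstra's algorithm: minimum total travel time from start to goal."""
    best = {start: 0.0}
    queue = [(0.0, start)]
    while queue:
        cost, node = heapq.heappop(queue)
        if node == goal:
            return cost
        if cost > best.get(node, float("inf")):
            continue  # stale queue entry; a cheaper route was already found
        for neighbor, minutes in graph.get(node, []):
            new_cost = cost + minutes
            if new_cost < best.get(neighbor, float("inf")):
                best[neighbor] = new_cost
                heapq.heappush(queue, (new_cost, neighbor))
    return float("inf")  # goal is unreachable


# Hypothetical existing network (travel times in minutes).
network: Graph = {}
add_link(network, "A", "B", 10)
add_link(network, "B", "C", 15)
add_link(network, "C", "D", 10)
before = shortest_time(network, "A", "D")

# Model a proposed bridge directly linking B and D.
add_link(network, "B", "D", 8)
after = shortest_time(network, "A", "D")

print(f"A to D before: {before} min, after: {after} min")
```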
The most common means of communicating geospatial data is the map. Indeed, humans appear to have been communicating with maps at least 3,000 to 9,000 years before we communicated with written language.
The craft of creating maps is called cartography. Maps are a combination of the functional and the aesthetic: The objective with a map is to be both useful and beautiful, requiring both technical and artistic skill. But the ultimate objective is communication, a process by which information is exchanged between individuals through a common system of symbols, signs, or behavior.
A common type of map is a choropleth, where polygons of data are colored differently based on characteristics of that area. The colors of a choropleth are based on a variable (attribute), usually a number (such as population) or a category (such as dominant political party). The map below is a choropleth of state election results in the 2012 presidential election.
A cartogram is a type of choropleth where the sizes of polygons are distorted to represent a second variable. For example, the cartogram below distorts state size by population, showing that the geographically correct choropleth above visually exaggerates the amount of population in the country that is Republican, since many large, conservative western states have small populations.
Point data is often visualized with one symbol per point, although the visualization may not be meaningful if there are a large number of points.
One way of visualizing large numbers of points is a heat map, which continuously colors the map based on density of points.
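The density behind a heat map can be approximated by counting points per grid cell. This is a simplified sketch: production heat maps usually smooth the counts (for example with kernel density estimation) to get continuous color, which is not shown here. The point coordinates and cell size are hypothetical.

```python
import math
from collections import Counter
from typing import Iterable, Tuple


def density_grid(points: Iterable[Tuple[float, float]],
                 cell_size: float) -> Counter:
    """Count points per grid cell, keyed by (row, col)."""
    counts: Counter = Counter()
    for x, y in points:
        cell = (math.floor(y / cell_size), math.floor(x / cell_size))
        counts[cell] += 1
    return counts


# Hypothetical point locations in map units.
sample_points = [(0.5, 0.5), (0.7, 0.2), (0.9, 0.9), (1.5, 0.5), (3.2, 2.8)]
density = density_grid(sample_points, cell_size=1.0)

# Cells with higher counts would be drawn in a "hotter" color.
for cell, count in density.most_common():
    print(cell, count)
```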
Cartography is part of a larger, ever-changing area called data visualization. While classical techniques for creating beautiful and effective maps on paper are still useful, the ability to create clever, effective visualizations of a variety of different types of data on multiple types of electronic platforms is a valuable skill.
Another aspect of the communication of geospatial information is interactivity, where communication involves user interaction with geospatial data, making the process bi-directional. Accordingly, programming skills are increasingly needed to create web services, interactive web sites, and mobile apps.