An Introduction To The Geography Of The Internet

While we commonly think of the internet as a 24/7 virtual world where time and place don't matter, the internet consists of a vast collection of both physical and social infrastructure that is located in very specific places.

This tutorial introduces some concepts and techniques that can be used to understand the "what is where" of the internet.

What Is The Internet?

A digital data network is a set of digital devices with electronic interconnections that allow them to communicate with each other.

The internet is a global network of networks that was originally developed as a US Defense Department project in the late 1960s to connect large mainframe computers. However, it has grown to provide interconnection of, and access to, digital devices like smartphones, cars, trucks, industrial equipment, infrastructure control centers, remote sensors, and even home appliances (the internet of things).

The internet has become fundamentally important to contemporary life and commerce. Practically all communications and entertainment content now passes through the internet, including e-mail, text messages, telephone calls, and streaming audio/video. In addition, a vast array of non-public data such as financial transactions, scientific information, logistics coordination, etc. passes through the internet, invisibly supporting life in the industrialized world.

Identification on the Internet

Domain Names

A domain name identifies a domain of control on the internet.

Parts of a Domain Name

The top-level domain (TLD) defines major groups of domain names. .com has traditionally been used for commercial entities, while .org is used for non-profits and .edu for educational institutions. There is now a wide collection of TLDs available (such as .tv, .biz, .xxx, etc.), making it difficult to use a TLD to reliably identify what a domain is used for.

Just to the left of the TLD is the domain, which defines a specific domain within the top-level domain. The TLD and domain used together are a unique identifier.

Within a domain there can be one or more subdomains. These are commonly used to clearly separate services available from a domain, such as mail, the library, web apps, etc.

TLDs can also be Country Code Top-Level Domains that associate domain names with specific countries. In the diagram above, the .bw TLD on gov.bw indicates that this website is controlled by an entity in Botswana, in this case the central government. You can view a list of all Country Code Top-Level Domains at https://icannwiki.org/Country_code_top-level_domain.

Google uses TLDs to distinguish between versions of their search engine targeted at specific countries. For example, google.co.uk focuses on the United Kingdom, while google.co.bw focuses on Botswana.

As with non-geographic TLDs, country code TLDs are not used consistently, and servers are not necessarily located in the country indicated by a TLD.
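The parts of a domain name described above can be pulled apart with a few lines of Python. This is a minimal sketch rather than a complete parser: the function name split_domain_name is made up for this example, and it assumes the TLD is always the single right-most label, so a name like google.co.uk is split with "co" as the domain.

```python
def split_domain_name(name):
    """Split a domain name into TLD, domain, and subdomains (naive sketch)."""
    # Domain names are case-insensitive, and a trailing dot is optional.
    labels = name.lower().rstrip(".").split(".")
    if len(labels) < 2 or "" in labels:
        raise ValueError(f"not a usable domain name: {name!r}")
    return {
        "tld": labels[-1],          # right-most label, e.g. "bw"
        "domain": labels[-2],       # label just to the left of the TLD
        "subdomains": labels[:-2],  # anything further left, in written order
    }


print(split_domain_name("gov.bw"))
print(split_domain_name("mail.example.com"))
```

Here mail.example.com is a placeholder name used only to show a subdomain.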

WhoIs

Domain names are bought and sold. Although the business of registering domain names is handled by a wide variety of private companies like GoDaddy and Verisign, the global registration of domain names is coordinated by the Internet Assigned Numbers Authority (IANA), which is a department of the NGO Internet Corporation for Assigned Names and Numbers (ICANN).

When an individual or business registers a domain name, they are required to provide identifying information for a WhoIs database that is accessible to the public. Companies that register domains usually provide web access to WhoIs information so you can determine if a domain you want to register is already owned.

You can get WhoIs data directly from IANA at http://www.iana.org/whois. For example, the listing for gov.bw shows that the .bw domain is managed by an agency of the Botswanan government, located in Gaborone, the capital of the country:

% IANA WHOIS server
% for more information on IANA, visit http://www.iana.org
% This query returned 1 object

refer:        whois.nic.net.bw

domain:       BW

organisation: Botswana Communications Regulatory Authority (BOCRA)
address:      Plot 206/207 Independence Avenue
address:      Private Bag 00495
address:      Gaborone
address:      Botswana

contact:      administrative
name:         Snr Engineer ccTLD
organisation: Botswana Communications Regulatory Authority (BOCRA)
address:      Plot 206/207 Independence Avenue
address:      Private Bag 00495
address:      Gaborone
address:      Botswana
phone:        +267 368 5557
fax-no:       +267 395 7976
e-mail:       kamanga@bocra.org.bw

contact:      technical
name:         Snr Engineer ccTLD
organisation: Botswana Communications Regulatory Authority (BOCRA)
address:      Plot 206/207 Independence Avenue
address:      Private Bag 00495
address:      Gaborone
address:      Botswana
phone:        +267 368 5557
fax-no:       +267 395 7976
e-mail:       kamanga@bocra.org.bw

nserver:      DNS1.NIC.NET.BW 168.167.98.226 2c0f:ff00:1:3:0:0:0:226
nserver:      DNS2.NIC.NET.BW 168.167.98.218 2c0f:ff00:1:5:0:0:0:218
nserver:      MASTER.BTC.NET.BW 168.167.168.37 2c0f:ff00:0:6:0:0:0:3 2c0f:ff00:0:6:0:0:0:5
nserver:      NS-BW.AFRINIC.NET 196.216.168.72 2001:43f8:120:0:0:0:0:72
nserver:      PCH.NIC.NET.BW 2001:500:14:6070:ad:0:0:1 204.61.216.70
ds-rdata:     6919 8 2 2fd5bd844725991e9a7708c3b1134b05a3d2ea216d9e239f71caeea35e4cb928
ds-rdata:     6919 8 1 5100f53e64928adc9ef0fbdfca6299dfb7081edf
ds-rdata:     18880 8 1 a948aff07700c9f18ad356c5159b64cb65a0c487
ds-rdata:     18880 8 2 56b561d20ee04927d24d8a7591c58a22a42e0a18202b4deed03caa5b66d4dd42

whois:        whois.nic.net.bw

status:       ACTIVE
remarks:      Registration information: https://registry.nic.net.bw

created:      1993-03-19
changed:      2017-07-06
source:       IANA
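A listing like the one above can also be fetched and read by a program. The sketch below rests on assumptions that are not in this tutorial: that IANA answers WhoIs queries at whois.iana.org on TCP port 43 (the standard WhoIs port), and the function names query_whois and parse_whois are invented for illustration. Only the parser is run here, on a few lines copied from the listing. The query function needs a network connection.

```python
import socket


def query_whois(query, server="whois.iana.org", port=43, timeout=10):
    """Send one WhoIs query and return the reply as text (needs network access)."""
    with socket.create_connection((server, port), timeout=timeout) as conn:
        conn.sendall(query.encode("ascii") + b"\r\n")
        chunks = []
        while True:
            data = conn.recv(4096)
            if not data:  # the server closes the connection when it is done
                break
            chunks.append(data)
    return b"".join(chunks).decode("utf-8", errors="replace")


def parse_whois(text):
    """Return (field, value) pairs, skipping blank lines and % comments."""
    pairs = []
    for line in text.splitlines():
        line = line.strip()
        if not line or line.startswith("%") or ":" not in line:
            continue
        field, value = line.split(":", 1)  # split only on the first colon
        pairs.append((field.strip(), value.strip()))
    return pairs


sample = """% IANA WHOIS server
refer:        whois.nic.net.bw

domain:       BW

organisation: Botswana Communications Regulatory Authority (BOCRA)
address:      Gaborone
address:      Botswana
"""

for field, value in parse_whois(sample):
    print(f"{field:14}{value}")
```

With a network connection, parse_whois(query_whois("gov.bw")) should give the same kind of pairs for the live record.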

Note that some domain name owners, notably when the domain exists for personal use, pay their domain registrars extra fees to preserve their privacy and list the registrar as the owner in the WhoIs information list. For example, the website for the late guitarist Chris Mello (chrismello.com) lists the domain owner as the registrar rather than the executor of his estate:

% IANA WHOIS server
% for more information on IANA, visit http://www.iana.org
% This query returned 1 object

refer:        whois.verisign-grs.com

domain:       COM

organisation: VeriSign Global Registry Services
address:      12061 Bluemont Way
address:      Reston Virginia 20190
address:      United States

contact:      administrative
name:         Registry Customer Service
organisation: VeriSign Global Registry Services
address:      12061 Bluemont Way
address:      Reston Virginia 20190
address:      United States
phone:        +1 703 925-6999
fax-no:       +1 703 948 3978
e-mail:       info@verisign-grs.com

Uniform Resource Locators

A domain name can be used to create a Uniform Resource Locator (URL) that uniquely identifies a resource (such as a web page) on the internet. For example:

Parts of a URL
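Python's standard urllib.parse module can split a URL into its parts, which gives a quick way to see the structure. The URL below is a made-up placeholder, and the part names are the ones Python uses, which may not match the labels in the diagram exactly.

```python
from urllib.parse import urlsplit

# Hypothetical URL used only for illustration.
url = "https://www.example.com/photos/index.html?year=1966#top"
parts = urlsplit(url)

print("scheme:  ", parts.scheme)    # the protocol used to fetch the resource
print("host:    ", parts.hostname)  # the domain name
print("path:    ", parts.path)      # the resource on that server
print("query:   ", parts.query)     # optional parameters after "?"
print("fragment:", parts.fragment)  # optional position within the page
```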

Internet Protocol (IP) Addresses

While humans identify resources on the internet using domain names and URLs, the internet itself identifies servers using numeric internet protocol (IP) addresses.

These addresses are commonly written as four numbers (each representing one 8-bit byte, and so ranging from 0 to 255) separated by periods. For example, this is the IP address of the web server for michaelminn.com:

173.236.188.143
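To see that the dotted form is just four bytes of one 32-bit number, the address above can be unpacked with Python's standard ipaddress module. This is only an illustrative sketch, and the variable names are arbitrary.

```python
import ipaddress

address = ipaddress.IPv4Address("173.236.188.143")

# The four numbers are the four bytes of the address, most significant first.
octets = list(address.packed)
print(octets)

# The same address as a single 32-bit integer.
as_integer = int(address)
print(as_integer == (173 << 24) + (236 << 16) + (188 << 8) + 143)

# Converting the integer back gives the dotted form again.
print(ipaddress.IPv4Address(as_integer))
```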

Domain names are converted to IP addresses using the Domain Name System (DNS), which itself involves a complex hierarchy of nameservers that keep track of which IP address is associated with which domain.

Large web sites, such as Google, have multiple servers around the world that handle requests, and which one you are actually accessing may vary based on your location in the world.
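A program can ask for the same conversion through the operating system's resolver. The following is a minimal sketch using Python's standard socket module. The helper names lookup_ipv4 and show_lookup are invented here, and the answer you get for a large site may differ from someone else's, for the reason given above.

```python
import socket


def lookup_ipv4(name):
    """Ask the system resolver (and through it, DNS) for one IPv4 address."""
    return socket.gethostbyname(name)


def show_lookup(name):
    """Print the address for a name, or the error if it cannot be resolved."""
    try:
        print(name, "->", lookup_ipv4(name))
    except socket.gaierror as err:
        print(name, "could not be resolved:", err)


# Needs network access, so it is left as a comment:
# show_lookup("michaelminn.com")
```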

Ping

You can identify the IP address associated with a domain name using the ping command, which is often used by technicians to see if a computer is successfully connected to the internet or to see if a web server is running.

On a Windows machine, open the cmd terminal and type: ping <domain_name>. Press Control-C to stop the ping listing:

Windows cmd Terminal
Ping on a Windows Machine

On a Macintosh computer, open the terminal app and type: ping <domain_name>. Press Control-C to stop the ping listing:

Terminal App in Finder on a Mac
Ping michaelminn.com on a Mac

In both examples above, we verify that the server for the michaelminn.com domain has an IP address of 173.236.188.143.
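The same check can be scripted. One wrinkle is that the option for limiting the number of pings differs between systems: -n on Windows and -c on Mac and Linux. The sketch below builds the right command for each. The function names are made up for this example, and the ping function itself is not called here because it needs a network connection.

```python
import platform
import subprocess


def build_ping_command(host, count=4, system=None):
    """Return the ping command line for this (or a named) operating system."""
    system = system or platform.system()
    count_flag = "-n" if system == "Windows" else "-c"
    return ["ping", count_flag, str(count), host]


def ping(host, count=4):
    """Run ping and return True if it exits successfully (needs network access)."""
    result = subprocess.run(
        build_ping_command(host, count), capture_output=True, text=True
    )
    print(result.stdout)
    return result.returncode == 0


print(build_ping_command("michaelminn.com", system="Windows"))
print(build_ping_command("michaelminn.com", system="Darwin"))  # Darwin is macOS
```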

IP Address Blocks

Blocks of IP addresses are assigned to ISPs. You can use an IP WhoIs tool like https://www.ultratools.com/tools/ipWhoisLookup to find the ISP associated with an IP address.

For example, searching for the IP address 173.236.188.143 of michaelminn.com, we get this listing, which indicates that the address is in a block allocated to the web hosting provider DreamHost.com, which has offices in Brea, CA, and controls the block of IP addresses from 173.236.128.0 to 173.236.255.255:

           Source:  whois.arin.net
       IP Address:  173.236.188.143
             Name:  DREAMHOST-BLK10
           Handle:  NET-173-236-128-0-1
Registration Date:  3/30/10
            Range:  173.236.128.0-173.236.255.255
              Org:  New Dream Network, LLC
       Org Handle:  NDN
          Address:  417 Associated Rd.
                    PMB #257
             City:  Brea
   State/Province:  CA
      Postal Code:  92821
          Country:  UNITED STATES

Note that this does not necessarily indicate where the server is located. ISPs often have multiple data centers, and the registration information is a contact address which will usually be a corporate headquarters rather than the location of the server farm(s).
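The range in the listing above can be checked with Python's standard ipaddress module. This short sketch confirms that the server's address falls inside the block, counts the addresses in the block, and rewrites the range in the shorter prefix notation that network tools commonly use.

```python
import ipaddress

first = ipaddress.IPv4Address("173.236.128.0")
last = ipaddress.IPv4Address("173.236.255.255")
server = ipaddress.IPv4Address("173.236.188.143")

# Addresses compare like numbers, so a range check is a simple comparison.
in_block = first <= server <= last
print(in_block)

# Number of addresses in the block.
block_size = int(last) - int(first) + 1
print(block_size)

# The same range written as network/prefix-length.
networks = list(ipaddress.summarize_address_range(first, last))
print(networks)
```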

Connecting Clients and Servers On The Internet

The Client-Server Model

The internet is largely based on the client-server model where clients communicate through the internet to servers, which then either serve information (such as web pages or streaming video), or serve as an intermediary between clients (such as two cellphones).

While a server can be just an ordinary desktop computer sitting under a desk and running a server operating system like Linux or Windows Server, servers are often centralized in vast collections called server farms. The buildings housing server farms are often football-field-sized warehouses that are tightly secured, have backup power systems, and are staffed by small armies of maintenance technicians who ensure reliable service.

The Server Farm at CERN in Switzerland (2009 Florian Hirzinger, Wikimedia)

Clients and servers connect to the internet through Internet Service Providers (ISPs).

ISPs have their own networks, and data that moves on those networks is directed to appropriate clients and servers using routers that keep track of where specific IP addresses are located on a network.

When a connection is being made between a client and server that are not on the same network, routers also connect local ISP networks to high-speed, high-volume backbone networks that transfer data between local networks.
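One way to picture how a router keeps track of where addresses are is a small table of address blocks, each paired with a direction to send traffic, plus a default entry for everything else. The sketch below is a toy model only: the table entries and their descriptions are invented for illustration, and real routers build their tables with routing protocols rather than a hand-written list.

```python
import ipaddress

# Hypothetical routing table: (address block, where to send matching traffic).
ROUTES = [
    (ipaddress.ip_network("192.168.1.0/24"), "local network"),
    (ipaddress.ip_network("173.236.128.0/17"), "link toward hosting provider"),
    (ipaddress.ip_network("0.0.0.0/0"), "default route to upstream router"),
]


def next_hop(destination, routes=ROUTES):
    """Pick the most specific block that contains the destination address."""
    address = ipaddress.ip_address(destination)
    matches = [(network, hop) for network, hop in routes if address in network]
    if not matches:
        return None
    # The longest prefix is the smallest, most specific block.
    return max(matches, key=lambda item: item[0].prefixlen)[1]


print(next_hop("192.168.1.20"))
print(next_hop("173.236.188.143"))
print(next_hop("8.8.8.8"))
```

An address that matches no specific block falls through to the default entry, which is how a request gets handed up toward a backbone network.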

Simplified Diagram Of Internet Access To A Web Site

For example, this is the basic process of how a client (like a desktop computer or smartphone) gets a web page from a web server:

  1. A physical device (the client) is connected to an ISP through a modem, through an access point when on a Wi-Fi network, or through a cell tower when using a smartphone on the cellular telephone network
  2. A person on that client types a URL or clicks on a link in a web browser
  3. The domain name is separated from the URL
  4. A message is sent to a domain name server, which returns an IP address to the client (all the internet knows is numeric IP addresses)
  5. The client sends a request to that IP address via a router
  6. Routers pass the request to higher-level routers until a router is found that knows the network that contains the IP address
  7. The request is routed to the server
  8. The server responds to the request
  9. The response packet(s) make their way back to the client through the routers
  10. The packets of information are received by the client, reassembled, and provided to the client's application (such as a web browser) for use
  11. Repeat as necessary
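The client's side of the steps above can be sketched in Python. This is a simplified illustration, not how a real browser works: the function names are invented, the request is a bare-bones HTTP request, and fetch_plain_http handles only unencrypted http:// addresses. The routing in steps 5 to 9 is done by the network itself and does not appear in the code.

```python
import socket
from urllib.parse import urlsplit


def plan_request(url, resolve=socket.gethostbyname):
    """Steps 3 and 4: separate the domain name, look up its IP address."""
    parts = urlsplit(url)
    host = parts.hostname                # step 3: domain name from the URL
    ip = resolve(host)                   # step 4: domain name -> IP address
    port = parts.port or (443 if parts.scheme == "https" else 80)
    path = parts.path or "/"
    request = f"GET {path} HTTP/1.1\r\nHost: {host}\r\nConnection: close\r\n\r\n"
    return ip, port, request.encode("ascii")


def fetch_plain_http(url):
    """Steps 5 to 10 for an http:// URL (needs network access)."""
    ip, port, request = plan_request(url)
    with socket.create_connection((ip, port), timeout=10) as conn:
        conn.sendall(request)            # step 5: send the request to the IP address
        chunks = []
        while True:                      # step 10: collect the response as it arrives
            data = conn.recv(4096)
            if not data:
                break
            chunks.append(data)
    return b"".join(chunks)


# A stand-in resolver keeps this demonstration off the network.
ip, port, request = plan_request(
    "http://michaelminn.com/", resolve=lambda name: "173.236.188.143"
)
print(ip, port)
print(request.decode("ascii"))
```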

Examining Network Paths Using Traceroute

The connection between a client and a server through the maze of internet routers can be examined using the traceroute utility.

In the cmd terminal on a Windows PC (see ping above) you can type tracert <domain_name>

Tracert to michaelminn.com on Windows

In the terminal app on a Mac (see ping above) you can type traceroute <domain_name>

Traceroute to michaelminn.com on a Mac

On mobile devices, there are apps like inettools, iptools and traceroute that allow you to see similar listings.

Traceroute to michaelminn.com on an Android Device

The following is an example of a traceroute listing to michaelminn.com through a home internet connection:

traceroute to michaelminn.com (173.236.188.143), 30 hops max, 60 byte packets
 1  FIOS_Quantum_Gateway.fios-router.home (192.168.1.1)  0.413 ms  0.512 ms  0.622 ms
 2  lo0-100.NYCMNY-VFTTP-402.verizon-gni.net (72.89.78.1)  8.487 ms  8.531 ms  8.583 ms
 3  B3402.NYCMNY-LCR-21.verizon-gni.net (100.41.138.178)  11.419 ms  11.475 ms 
 4  B3402.NYCMNY-LCR-22.verizon-gni.net (100.41.138.180)  13.603 ms
 5  0.et-10-3-0.BR2.NYC4.ALTER.NET (140.222.1.61)  22.855 ms  22.862 ms  22.915 ms
 6  204.255.169.2 (204.255.169.2)  15.840 ms  9.046 ms  8.963 ms
 7  ae16.cs2.lga5.us.zip.zayo.com (64.125.29.222)  24.351 ms  21.768 ms  21.744 ms
 8  ae4.cs2.dca2.us.eth.zayo.com (64.125.29.31)  18.532 ms  15.967 ms  18.889 ms
 9  ae27.cr2.dca2.us.zip.zayo.com (64.125.30.249)  16.719 ms  15.530 ms  16.601 ms
10  ae15.er5.iad10.us.zip.zayo.com (64.125.31.42)  18.737 ms  15.222 ms  15.195 ms
11  208.185.23.134.t00867-03.above.net (208.185.23.134)  19.714 ms  17.447 ms  17.441 ms
12  ip-208-113-156-4.dreamhost.com (208.113.156.4)  21.032 ms  21.707 ms  20.965 ms
13  ip-208-113-156-73.dreamhost.com (208.113.156.73)  19.445 ms 
14  apache2-hok.halfback.dreamhost.com (173.236.188.143) 19.462 ms  20.941 ms

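Walking through the steps: step 1 is the home router, steps 2-4 are routers at the local ISP (Verizon), steps 5-11 pass through backbone networks (including alter.net, zayo.com and above.net), and steps 12-14 are inside DreamHost's network, ending at the web server.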

International Traceroute

The path to international servers can be more complex and interesting. For example, this is the traceroute to gov.bw, the Botswanan government's website:

 1  FIOS_Quantum_Gateway.fios-router.home (192.168.1.1)  0.378 ms  0.500 ms  0.617 ms
 2  lo0-100.NYCMNY-VFTTP-402.verizon-gni.net (72.89.78.1)  9.465 ms  9.488 ms  10.193 ms
 3  B3402.NYCMNY-LCR-21.verizon-gni.net (100.41.138.178)  14.006 ms  14.058 ms  14.116 ms
 4  * * *
 5  0.et-5-1-0.BR2.NYC4.ALTER.NET (140.222.239.33)  14.105 ms
    0.et-10-3-0.BR2.NYC4.ALTER.NET (140.222.1.61)  14.193 ms  16.140 ms
 6  204.255.168.114 (204.255.168.114)  18.252 ms  11.344 ms  12.095 ms
 7  be2057.ccr42.jfk02.atlas.cogentco.com (154.54.80.177)  12.788 ms 
    be2056.ccr41.jfk02.atlas.cogentco.com (154.54.44.217)  11.112 ms 
    be2057.ccr42.jfk02.atlas.cogentco.com (154.54.80.177)  11.101 ms
 8  be2490.ccr42.lon13.atlas.cogentco.com (154.54.42.86)  85.622 ms  76.448 ms  78.687 ms
 9  be2871.ccr21.lon01.atlas.cogentco.com (154.54.58.186)  78.701 ms 
    be2870.ccr22.lon01.atlas.cogentco.com (154.54.58.174)  83.813 ms 
    be2868.ccr21.lon01.atlas.cogentco.com (154.54.57.154)  78.573 ms
10  te0-0-2-2.rcr11.b015592-1.lon01.atlas.cogentco.com (130.117.51.234)  82.942 ms  83.244 ms  83.228 ms
11  149.14.80.218 (149.14.80.218)  82.401 ms  81.693 ms  82.069 ms
12  41.191.216.56 (41.191.216.56)  264.450 ms  267.330 ms  262.130 ms
13  41.191.216.126 (41.191.216.126)  262.187 ms  262.082 ms  260.020 ms
14  gbe-msu1-pr2-lnk2custr5.btc.net.bw (168.167.252.46)  259.781 ms  261.915 ms  267.307 ms
15  gbe-dit.btc.net.bw (168.167.254.82)  261.807 ms  261.725 ms  264.190 ms

Steps 1-3 are the home router and local ISP (Verizon) as in the previous example.

Steps 5-10 are the backbone ISPs alter.net and cogentco.com.

With steps 11 and 12, traceroute has no host names to display, only IP addresses. Performing an IP WhoIs on the address at step 12 (41.191.216.56), where the delay jumps sharply, we see that this is an undersea cable connection to Africa:

inetnum:        41.191.216.0 - 41.191.216.63
netname:        UnderSeaCables
descr:          Point to Point Links to London through Undersea Cables Eassy and WACS
country:        BW
admin-c:        MK44-AFRINIC
tech-c:         TM25-AFRINIC
status:         ASSIGNED PA
mnt-by:         BOFINET-MNT
source:         AFRINIC # Filtered
parent:         41.191.216.0 - 41.191.219.255

person:         Mpho KOOLESE
address:        Gaborone
address:        BW
phone:          +267 392 3856
nic-hdl:        MK44-AFRINIC
mnt-by:         GENERATED-LWKEYV7AP6LKXDKOYBRBKA7LAPGJDCX9-MNT
source:         AFRINIC # Filtered

Then for the final steps we see routers in the .bw TLD, meaning these probably belong to an ISP in Botswana.
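A traceroute listing is regular enough to read with a short script. The sketch below parses four lines copied from the listing above and finds the pair of consecutive hops with the largest increase in delay, which is often a sign of a long physical link. The function names are invented for this example, and the parser deliberately skips lines it does not understand, such as "* * *" and the indented continuation lines.

```python
import re

# hop number, host name, (IP address), then the timing values
HOP_RE = re.compile(r"^\s*(\d+)\s+(\S+)\s+\((\d+(?:\.\d+){3})\)\s+(.*)$")
RTT_RE = re.compile(r"(\d+(?:\.\d+)?)\s+ms")


def parse_hop(line):
    """Return a dict for one hop line, or None if the line does not match."""
    match = HOP_RE.match(line)
    if match is None:
        return None
    hop, host, ip, rest = match.groups()
    rtts = [float(value) for value in RTT_RE.findall(rest)]
    return {"hop": int(hop), "host": host, "ip": ip, "rtts": rtts}


def largest_jump(hops):
    """Return (hop_before, hop_after, increase_ms) for the biggest delay increase."""
    best = None
    for before, after in zip(hops, hops[1:]):
        # Use the fastest of the timings at each hop to reduce noise.
        increase = min(after["rtts"]) - min(before["rtts"])
        if best is None or increase > best[2]:
            best = (before["hop"], after["hop"], increase)
    return best


listing = """\
10  te0-0-2-2.rcr11.b015592-1.lon01.atlas.cogentco.com (130.117.51.234)  82.942 ms  83.244 ms  83.228 ms
11  149.14.80.218 (149.14.80.218)  82.401 ms  81.693 ms  82.069 ms
12  41.191.216.56 (41.191.216.56)  264.450 ms  267.330 ms  262.130 ms
13  41.191.216.126 (41.191.216.126)  262.187 ms  262.082 ms  260.020 ms
"""

parsed = (parse_hop(line) for line in listing.splitlines())
hops = [hop for hop in parsed if hop and hop["rtts"]]
start, end, increase = largest_jump(hops)
print(f"largest delay increase: hop {start} -> hop {end} (+{increase:.1f} ms)")
```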

The Digital Divide

The level of access to the internet is not uniform around the world. The internet is largely made up of infrastructure owned by corporations, and, as such, this favors wealthy urban areas where revenues from customers justify the significant investment needed to build the infrastructure.

A variety of metrics can be used to assess the level of internet access in countries. The International Telecommunication Union (ITU) collects data from sources of varying reliability to estimate the percentage of individuals who use the internet within countries. As might be expected, wealthy western countries tend to have higher levels of internet use.

An example comparison by numbers:

Access to broadband (high-speed) internet is also a way to assess the internet capabilities of a country. The ITU also publishes estimates of fixed broadband subscriptions (per 100 people), and the patterns are similar to those for internet use.

Comparing GDP per capita (a measure of wealth) to broadband subscriptions on an X/Y scatter chart, the relationship between wealth and internet access is clear.
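The strength of a relationship like this is commonly summarized with a correlation coefficient, which runs from -1 (one value falls as the other rises) through 0 (no linear relationship) to 1 (both rise together). The sketch below is a plain-Python version of the standard Pearson formula. The numbers fed to it are toy values chosen to show the calculation, not GDP or ITU figures.

```python
import math


def pearson_r(xs, ys):
    """Pearson correlation coefficient of two equal-length series."""
    if len(xs) != len(ys) or len(xs) < 2:
        raise ValueError("need two equal-length series with at least two points")
    mean_x = sum(xs) / len(xs)
    mean_y = sum(ys) / len(ys)
    covariance = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    spread_x = sum((x - mean_x) ** 2 for x in xs)
    spread_y = sum((y - mean_y) ** 2 for y in ys)
    if spread_x == 0 or spread_y == 0:
        raise ValueError("correlation is undefined when a series is constant")
    return covariance / math.sqrt(spread_x * spread_y)


# Toy values only: y is exactly twice x, so the points lie on a straight line.
toy_x = [1, 2, 3, 4]
toy_y = [2, 4, 6, 8]
print(pearson_r(toy_x, toy_y))
```

To apply it to real data, the two lists would hold GDP per capita and broadband subscriptions per 100 people, with one entry per country in the same order.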

An example comparison by numbers:

Internet Censorship

Knowledge is power. The internet has thrived on free communication. Accordingly, authoritarian regimes consistently attempt to censor internet content and monitor internet use in order to detect and suppress opposition to their rule.

Freedom On The Net 2016

The non-governmental organization Freedom House works to defend human rights and promote democratic change, with a focus on political rights and civil liberties. Freedom House regularly publishes a wide variety of research on freedom in various facets of life in countries around the world.

Freedom House's 2016 Freedom on the Net report documented a continuing global decline in internet freedom, with 2/3 of all internet users living in countries where criticism of the government is subject to censorship, and 1/4 of internet users living in places where people have been arrested for sharing content on Facebook.

The Freedom on the Net report assigns scores on a scale of 0 (good) to 100 (bad) to countries, and also groups them into categories of free, partly free, or not free.
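Turning a score into a category is a simple threshold test. The sketch below assumes cutoffs of 0-30 for free, 31-60 for partly free, and 61-100 for not free. These cutoffs are an assumption that is not stated in this tutorial and should be checked against the report's methodology, so they are left as adjustable parameters. The function name is invented for this example.

```python
def freedom_category(score, free_max=30, partly_free_max=60):
    """Map a 0 (good) to 100 (bad) score to a category, using assumed cutoffs."""
    if not 0 <= score <= 100:
        raise ValueError("score must be between 0 and 100")
    if score <= free_max:
        return "free"
    if score <= partly_free_max:
        return "partly free"
    return "not free"


# Arbitrary scores, not taken from the report.
for score in (10, 45, 80):
    print(score, freedom_category(score))
```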

For example, comparing by numbers:

The web page for the report linked above contains links to reports describing conditions in individual countries.

The Digital Dark Ages

In the developed world, we capture and store almost everything that can be stored: security video, electronic communications, smartphone photos of events momentous and trivial.

Almost none of that data will survive us.

The internet is a communications medium, not a permanent storage medium. Although storage becomes cheaper every year, technology changes every year. Data must be migrated from old storage media and file formats, or it is lost to physical degradation or technological obsolescence.

Data in The Cloud never has a permanent physical home. The Cloud is a performance and requires constant flows of capital and resources to stay in operation. Changes in the economics of The Cloud will necessitate loss of some of that data. Which data will be lost to time?

Contrast the impermanence of the digital with papyrus text from 2500 BC or clay tablets from as far back as 3300 BC.

Cuneiform tablet, about fishers wage, about 2000 BC, Mesopotamia. Musée de Mariemont (Wikimedia Commons)

While security camera video from an ATM where there has been no criminal activity may not be something that should outlive us, your grandchildren may want to see some of those thousands of baby pictures that you took of your son in the first year of his life. You should plan accordingly.

Me and Dad (1966)