Introduction to Python

This tutorial gives a brief introduction to the syntax of the Python programming language.

About Python

Python is a cross-platform, open-source, general-purpose programming language developed by Dutch programmer Guido van Rossum and first released in 1991.

Van Rossum was solely responsible for the project until he stepped down in 2018, and governance passed to an elected five-member steering council in 2019.

Van Rossum named the language Python because he was reading scripts from the Monty Python's Flying Circus TV show while trying to come up with a name, and he felt that "Python" would be appropriately "short, unique, and slightly mysterious" (Python Software Foundation 2022).

Of the myriad uses for Python, four areas are notable:

Getting Python

Windows and macOS installers can be downloaded for free from Python.org. Linux users can install from the standard Debian and RPM repositories.

Note that unless you are working on old legacy software, you should always use one of the Python 3.x.x versions rather than Python 2.7.

Python.org download page

Python Console

You can directly interact with Python from the Python console, where you can type in commands line-by-line and see results.

The Python console

Scripts

A script is "a sequence of instructions or commands for a computer to execute" (Merriam-Webster 2022).

Scripts allow you to easily repeat complex sequences of operations. And if you find an error in your script, you can fix it and rerun the script without the labor of having to repeat long sequences of button clicks that you would need when using software with a graphical user interface like ArcGIS Pro.

Although programmer-focused integrated development environments (IDEs) like PyCharm are available for download, many users can adequately edit and run Python scripts using the Integrated Development and Learning Environment (IDLE) editor that is included with the standard Python installation. The IDLE editor has a simple, beginner-friendly user interface while providing syntax highlighting and context-sensitive help.

This video demonstrates creating and running a short script in IDLE.

The Python IDLE editor

Notebooks

A notebook is an interactive interface that allows you to integrate programming code with documentation, analysis, and visualizations.

A Jupyter notebook in CyberGISX

Expressions

At its simplest, you can use Python as a calculator and it will display the value of mathematical expressions.

2 + 2
4

Python expressions are similar to traditional mathematical notation and use the same mathematical symbols or operators: + - * /. The double asterisk operator (**) is used for exponents.

Operation       Example   Output
Addition        10 + 2    12
Subtraction     10 - 2    8
Multiplication  10 * 2    20
Division        10 / 2    5.0
Exponents       10 ** 2   100

As with traditional mathematical notation, parentheses can be used to add clarity to expressions, or to override the normal precedence of operators.

3 + 2 * 4
11
(3 + 2) * 4
20
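As a brief sketch of how these operators behave in Python 3: division with / always produces a floating-point number, and exponentiation is evaluated before multiplication. The // and % operators shown here are standard Python operators that are not otherwise covered in this tutorial.

```python
# Division with / always produces a float, even when the result is whole.
quotient = 10 / 2
print(quotient)      # 5.0

# Floor division (//) and remainder (%) keep integer results for integers.
print(10 // 3)       # 3
print(10 % 3)        # 1

# Exponentiation binds more tightly than multiplication.
print(2 * 3 ** 2)    # 18
print((2 * 3) ** 2)  # 36
```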

Objects

To make it possible to use the values of calculations in subsequent formulas, you can assign values to named objects.

The symbolic names used to refer to objects are called variables.

In Python, objects are areas in memory where data is stored, and variables are names that point to those areas in memory (Eubank 2022).

You can use variables in later expressions to save the effort of repeating calculations, or simply to make expressions easier to read.

To display the contents of an object at the console, you simply type in the variable.

x = 10 * 2
x
20
x + 15
35
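The distinction between names and objects can be illustrated with a short sketch. It uses the is operator, which is not covered elsewhere in this tutorial, to test whether two names point to the same object.

```python
x = 10 * 2      # the name x points to an object holding the value 20
y = x           # the name y now points to that same object
print(y is x)   # True

y = y + 15      # y is re-pointed to a new object; x is unaffected
print(x)        # 20
print(y)        # 35
```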

Variable Naming Styles

Variable names must start with a letter or an underscore, and they are case sensitive.

hello = 12

Hello = 15

hello
12
Hello
15

You should always try to make your variable names meaningful so that you and other people can understand what your objects are. Rather than calling an object containing a standard deviation "s" you might call it "stdev". The extra time spent typing now may save you confusion later.

Variable names cannot contain spaces. However, there are techniques for representing multi-word variables that get around this issue:

Note that the Style Guide for Python Code (PEP 8) recommends lowercase words separated by underscores (for example, standard_deviation) for variable and function names and reserves CapWords for class names, although these conventions are far from universally followed.
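As an illustration, here are three widely used ways of writing a multi-word name. The variable names are hypothetical examples chosen for this sketch.

```python
# snake_case: lowercase words joined by underscores
# (commonly used for variables and functions)
standard_deviation = 4.2

# camelCase: each word after the first is capitalized
standardDeviation = 4.2

# CapWords: every word is capitalized (commonly used for class names)
StandardDeviation = 4.2

# Because names are case sensitive, these are three separate variables.
print(standard_deviation, standardDeviation, StandardDeviation)
```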

Strings

One of the most powerful features of Python is that objects can contain many different types of data.

Objects can contain text. Segments of text are called strings of characters. You assign text by enclosing your text in either double or single quotation marks.

x = "Hello"

x
'Hello'

Be aware that text strings and variable names are separate things.

Hello = "Goodbye"

Hello
'Goodbye'

The plus (+) operator can be used to concatenate (combine end to end) multiple strings.

country = "Iceland"

anthem = "Lofsöngur"

print('The national anthem of ' + country + ' is "' + anthem + '."')
The national anthem of Iceland is "Lofsöngur."
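As an alternative to concatenation, the same sentence can be sketched with a formatted string literal (f-string), assuming Python 3.6 or later. An f-string substitutes the value of each expression written inside curly braces.

```python
country = "Iceland"
anthem = "Lofsöngur"

# The f prefix tells Python to replace {country} and {anthem} with their values.
sentence = f'The national anthem of {country} is "{anthem}."'

print(sentence)
```

This avoids long chains of + operators and quotation marks, which can become hard to read.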

Lists

In statistical calculations, we commonly deal with multiple numbers at the same time. One of the most powerful features of Python is that it permits objects to contain multiple numbers at the same time. These collections of numbers are called lists.

Lists can be created by enclosing multiple numbers, strings, or variables in square brackets, and separating the values with commas:

x = [1,3,5,7,10]

x
[1, 3, 5, 7, 10]

You can perform operations on lists using mathematical operators.

The plus sign concatenates (combines) two lists.

x = ['Alpha', 'Beta', 'Gamma']

y = [1, 2, 3]

x + y
['Alpha', 'Beta', 'Gamma', 1, 2, 3]

The multiplication sign repeats the contents of a list the given number of times.

y = [1, 2, 3]

y * 3
[1, 2, 3, 1, 2, 3, 1, 2, 3]
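Individual list elements are accessed with numbered indices in square brackets. The following is a minimal sketch of indexing, slicing, and appending; the variable names are illustrative.

```python
x = [1, 3, 5, 7, 10]

first = x[0]      # indices start at zero, so this is 1
last = x[-1]      # negative indices count from the end, so this is 10
middle = x[1:4]   # a slice from index 1 up to (not including) 4: [3, 5, 7]
count = len(x)    # the number of elements: 5

x.append(12)      # add a value to the end of the list
print(x)          # [1, 3, 5, 7, 10, 12]
```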

Dictionaries

Dictionaries in Python are collections of objects similar to lists, except rather than accessing elements with numbered indices, elements in a dictionary are accessed using values called keys.

Dictionaries can be constructed by specifying key:value pairs within curly braces { }.

Dictionary values can be accessed by specifying the key in square brackets [ ].

anthems = {
'United States': 'The Star-Spangled Banner',
'Canada': 'O Canada',
'Mexico': 'Himno Nacional Mexicano',
'Russia': 'Patrioticheskaya Pesnya' }

anthems['Russia']
'Patrioticheskaya Pesnya'

You can add or change dictionary values using that same square bracket notation:

anthems['Ukraine'] = 'Derzhavnyy Himn Ukrayiny'

anthems['Ukraine']
'Derzhavnyy Himn Ukrayiny'

Dictionary entries can be used in expressions just like other variables.

anthems['Russia'] = 'Госудáрственный гимн Росси́йской Федерáции'

print('The national anthem of Russia is ' + anthems['Russia'] + '.')
The national anthem of Russia is Госудáрственный гимн Росси́йской Федерáции.

Variables can be used as keys.

country = "Ukraine"

print("The national anthem of " + country + " is " + anthems[country] + ".")
The national anthem of Ukraine is Derzhavnyy Himn Ukrayiny.
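Asking for a key that is not in a dictionary raises a KeyError. A minimal sketch of two ways to guard against that, using the in operator and the dictionary get() method with a small dictionary:

```python
anthems = {
    'United States': 'The Star-Spangled Banner',
    'Mexico': 'Himno Nacional Mexicano',
}

# The in operator checks whether a key is present.
print('Mexico' in anthems)     # True
print('Iceland' in anthems)    # False

# get() returns a fallback value instead of raising a KeyError.
missing = anthems.get('Iceland', 'unknown')
print(missing)                 # unknown

# keys() lists the keys; len() counts the entries.
print(list(anthems.keys()))    # ['United States', 'Mexico']
print(len(anthems))            # 2
```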

Functions

A Python function is "a series of statements which returns some value to a caller" (Python Software Foundation 2022).

You call a function with a function name, an open parenthesis, a set of zero or more parameters separated by commas, and then a closing parenthesis. The function then returns an object based on the parameters.

name(parameter1, parameter2, ...)

Python has dozens of built-in functions that are available with a default installation.

The basic descriptive statistical functions are similar to those available in Excel.

x = [2, 5, 13, 24, 35, 40, 35, 24, 13, 5, 2]

max(x)
40

Functions that return numeric values can be used in mathematical expressions just like numbers or variables.

y = sum(x) + 2

y
200
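A few more built-in functions can be combined into simple descriptive statistics. This sketch uses only min(), max(), sum(), and len(); the variable names are illustrative.

```python
x = [2, 5, 13, 24, 35, 40, 35, 24, 13, 5, 2]

smallest = min(x)                 # 2
largest = max(x)                  # 40
count = len(x)                    # 11 values
total = sum(x)                    # 198

mean = total / count              # 18.0
value_range = largest - smallest  # 38

print(smallest, largest, count, total, mean, value_range)
```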

print()

The built-in print() function is often used in scripts to display the value of variables or expressions to the screen. While simply typing in the expression or variable will display the value in the Python console, in scripts simply putting a variable alone on a line causes no action.

x = 45 + 72

print(x)
117

str()

If you wish to append a numeric value to a string, you must first use a type converter like the built-in str() function to convert the number to a string before you can append.

This example also uses the round() function to round the value to two decimal places, which is consistent with display as dollars and cents.

dinner = (18.50 + 4.5 + 22.99 + 4.5) * 1.24

print("The total cost for our dinner with tax and tip was $" + str(round(dinner, 2)) + ".")
The total cost for our dinner with tax and tip was $62.61.

range()

The built-in range() function is useful for generating sequences of values. The first parameter is the starting value, and the second parameter is the value immediately after the last value.

range(1, 10)
range(1, 10)

range() returns a range object, and you can use the list() function to convert the range to a list.

list(range(1,10))
[1, 2, 3, 4, 5, 6, 7, 8, 9]

An optional third parameter gives the spacing between values (defaulting to one).

list(range(1, 10, 3))
[1, 4, 7]

Modules

A module is a set of functions and other objects that you can include in your script.

Specialized functions can be brought in from modules that permit different types of operations to be performed on different types of data.

For example, the statistics module adds functions for calculating descriptive statistics for lists of numeric values.

Modules are loaded with the import command.

import statistics

x = [2, 5, 13, 24, 35, 40, 35, 24, 13, 5, 2]

statistics.mean(x)
18
statistics.median(x)
13
statistics.stdev(x)
14.26184

The math Module

The math module provides a wide variety of mathematical functions, such as trigonometric functions.

For this example, we use sin() to calculate the length of the opposite leg of a right triangle, given the angle and the length of the hypotenuse (radius).

Sine

Note that angles in math functions are specified in radians (2π radians = 360 degrees).

import math

degrees = 75

radius = 5

radians = math.pi * (degrees / 180)

length = math.sin(radians) * radius

print("The length of the opposite side of a right triangle",
"with a radius of", str(radius),
"and angle", str(degrees) ,
"is", str(length) + ".")
The length of the opposite side of a right triangle with a radius of 5 and angle 75 is 4.8296291314453415.
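The degree-to-radian conversion can also be done with the math.radians() function from the standard math module. A short sketch of the same calculation:

```python
import math

degrees = 75
radius = 5

# math.radians() converts degrees to radians; math.degrees() goes the other way.
radians = math.radians(degrees)

length = math.sin(radians) * radius

print(round(length, 3))   # 4.83
```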

Great-Circle Distance

Module functions can facilitate complex calculations.

Great-circle distance is the shortest possible distance across the surface of a sphere, and is used to find the straight-line distance between two points on the surface of the earth.

Great-circle distance (Wikipedia 2016)

The Haversine formula can be used to calculate the distance between two points on the surface of the earth specified with latitudes and longitudes.

Complex formulas like the Haversine formula need to be used when calculating distances across the surface of the earth because the earth is three-dimensional and simple two-dimensional formulas like the Pythagorean theorem are inadequate for making three-dimensional calculations.

The "from math import *" import statement allows you to use imported math functions without having to type the namespace before the function name: sin() instead of math.sin().

from math import *

lat1 = 40.10900
long1 = -88.22699
name1 = "Illini Union"

lat2 = 40.10621
long2 = -88.22719
name2 = "Foellenger Auditorium"

radius = 6378137 # meters
flattening = 1/298.257223563

start_x = long1 * pi / 180
start_y = atan2((1 - flattening) * sin(lat1 * pi / 180), cos(lat1 * pi / 180))
end_x = long2 * pi / 180
end_y = atan2((1 - flattening) * sin(lat2 * pi / 180), cos(lat2 * pi / 180))

arc_distance = (sin((end_y - start_y) / 2) ** 2) + \
	(cos(start_y) * cos(end_y) * (sin((end_x - start_x) / 2) ** 2))

distance = 2 * radius * atan2(sqrt(arc_distance), sqrt(1 - arc_distance))

print("The distance between " + name1 + " and " + name2 + " is " + str(round(distance)) + " meters.")
The distance between Illini Union and Foellenger Auditorium is 310 meters.
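For comparison, here is a minimal sketch of the plain spherical Haversine formula wrapped in a reusable function. It is a simplification rather than a copy of the calculation above: it treats the earth as a perfect sphere, ignores flattening, and assumes a mean earth radius of 6371008.8 meters. The function name haversine_distance is hypothetical.

```python
import math

def haversine_distance(lat1, lon1, lat2, lon2, radius=6371008.8):
    """Great-circle distance in meters between two points given in degrees.

    Assumes a spherical earth; radius defaults to an assumed mean earth radius.
    """
    phi1 = math.radians(lat1)
    phi2 = math.radians(lat2)
    delta_phi = phi2 - phi1
    delta_lambda = math.radians(lon2 - lon1)

    # Haversine of the central angle between the two points
    a = (math.sin(delta_phi / 2) ** 2
         + math.cos(phi1) * math.cos(phi2) * math.sin(delta_lambda / 2) ** 2)

    # Convert the central angle to a distance along the sphere's surface
    return 2 * radius * math.atan2(math.sqrt(a), math.sqrt(1 - a))

distance = haversine_distance(40.10900, -88.22699, 40.10621, -88.22719)
print(round(distance), "meters")
```

Because of the spherical assumption and the different radius, the result should be close to, but not necessarily identical to, the 310 meters reported above.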

User-Defined Functions

Users can create custom functions using the def keyword.

For example, this function calculates the hypotenuse of a right triangle using the Pythagorean theorem.

import math

def hypotenuse(rise, run):
    return math.sqrt((rise ** 2) + (run ** 2))

hypotenuse(3, 4)
5.0
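One detail worth checking when writing a function like this: exponentiation in Python is written with **, while the caret (^) is the bitwise XOR operator and gives a different result. The sketch below compares a hand-written function (hypothetically named pythagorean_length) with math.hypot() from the standard math module.

```python
import math

def pythagorean_length(rise, run):
    """Return the hypotenuse of a right triangle with the given legs."""
    return math.sqrt(rise ** 2 + run ** 2)

print(pythagorean_length(3, 4))   # 5.0
print(math.hypot(3, 4))           # 5.0

# ** squares the value; ^ performs bitwise XOR on the integers.
print(3 ** 2)                     # 9
print(3 ^ 2)                      # 1
```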

Paths: The os Module

A commonly used module for accessing operating system capabilities is the os module.

The os.getcwd() function returns the name of the current working directory.

import os

os.getcwd()
'C:\\Program Files\\Python39'

The os.listdir() function returns a list of the files and subdirectories in the directory passed as a parameter. This can be useful if you need to perform some kind of operation on every file in a directory.

path = os.getcwd()

os.listdir(path)
[ 'Documents',
  'Downloads',
  'Photos',
  'Music']

Directories exist within a hierarchical system of directories used to organize files.

C:\\Program Files\\Python39

The os.listdir() function can list any directory on the system. For example, the Program Files directory on Windows systems contains (as the name indicates) the files for the software installed on the system.

os.listdir("c:\\Program Files")
['7-Zip', 'Agisoft', 'ArcGIS', 'Common Files', 'Dell', 'desktop.ini', 'dotnet',
'Emulex', 'Exelis', 'GDAL', 'GeoDa Software', 'Golden Software', 'Google',
'Gwb', 'Internet Explorer', 'LAPS', 'LAStools', 'Managed Defender', 'MATLAB',
'Microsoft MPI', 'Microsoft Office', 'Microsoft Policy Platform', 
'Microsoft Silverlight', 'MSBuild', 'Notepad++', 'PackageManagement', 'Python310',
'Python37', 'Python38', 'Python39', 'QGIS 2.18', 'QGIS 3.10', 'QGIS 3.16',
'QGIS 3.22.7', 'QGIS 3.4', 'R', 'Reference Assemblies', 'RStudio', 'rtools40',
'SedInConnect', 'SIFT3D 1.4.5', 'TauDEM', 'tempini', 'Uninstall Information',
'VcXsrv', 'Windows Defender', 'Windows Firewall Configuration Provider',
'Windows Mail', 'Windows Media Player', 'Windows Multimedia Platform',
'Windows NT', 'Windows Photo Viewer', 'Windows Portable Devices', 'Windows Sidebar',
'WindowsApps', 'WindowsPowerShell', 'Zabbix']
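Paths can be taken apart and put back together with functions from os.path, which is part of the standard os module. A brief sketch, using the current working directory so that it runs on any system:

```python
import os

path = os.getcwd()

parent = os.path.dirname(path)    # the directory that contains path
name = os.path.basename(path)     # the last component of path

# os.path.join() combines components using the correct separator
# for the operating system (backslash on Windows, slash elsewhere).
rebuilt = os.path.join(parent, name)

print(parent)
print(name)
print(rebuilt == path)            # True
print(os.path.isdir(path))        # True: path refers to a directory
```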

Graphs: The matplotlib Module

The matplotlib library permits visualization of data in Python.

The plot() function draws graphs. By default, when passed a single list, the plot() function draws a line graph.

import matplotlib.pyplot as plt

y = [2, 5, 13, 24, 35, 40, 35, 24, 13, 5, 2]

graph = plt.plot(y)

plt.show()
Example default line plot

Histograms can be plotted with the hist() function.

graph = plt.hist(y)

plt.show()
Histogram

Conditions

Python provides operators for comparing values that return logical values (True or False):
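As a short sketch, the standard comparison operators are ==, !=, <, >, <=, and >=. The value used here is illustrative.

```python
latitude = 45

print(latitude == 45)    # equal to: True
print(latitude != 45)    # not equal to: False
print(latitude < 90)     # less than: True
print(latitude > 90)     # greater than: False
print(latitude <= 45)    # less than or equal to: True
print(latitude >= 46)    # greater than or equal to: False

# Comparisons can be chained to test whether a value lies within a range.
in_range = -90 <= latitude <= 90
print(in_range)          # True
```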

Comparison operators are commonly used with if statements to choose whether to execute a block of code.

Note that the if statement ends with a colon (:) and, as with functions, the block of code controlled by the if statement is indented.

latitude = 89

if latitude > 90:
    print("Invalid latitude:", latitude)

latitude = 91

if latitude > 90:
    print("Invalid latitude:", latitude)
Invalid latitude: 91
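An if statement can be extended with elif (else if) and else branches to handle several cases. The following sketch wraps the latitude check in a hypothetical function named check_latitude; the messages are illustrative.

```python
def check_latitude(latitude):
    """Return a short message describing whether a latitude is valid."""
    if latitude > 90:
        return "Invalid latitude: above 90"
    elif latitude < -90:
        return "Invalid latitude: below -90"
    else:
        return "Valid latitude"

print(check_latitude(89))    # Valid latitude
print(check_latitude(91))    # Invalid latitude: above 90
print(check_latitude(-95))   # Invalid latitude: below -90
```

Only the first branch whose condition is true is executed; the else branch runs when none of the conditions are true.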




Iteration

A for loop is used to run a block of code on all items in a list.

values = [14, 69, 32, 75]

for x in values:
    print(x)
14
69
32
75
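Loops are often used to build up a result one item at a time. A minimal sketch that keeps a running total and collects transformed values in a new list; the variable names are illustrative.

```python
values = [14, 69, 32, 75]

total = 0
doubled = []

for x in values:
    total = total + x        # add each value to the running total
    doubled.append(x * 2)    # store twice each value in a new list

print(total)     # 190
print(doubled)   # [28, 138, 64, 150]
```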

One common use of for loops is to perform some operation on all the files in a given directory.

import os

path = "C:/Documents/ArcGIS Pro/Projects/Merge/Sources"

for file in os.listdir(path):
    print(path + '/' + file)
C:/Documents/ArcGIS Pro/Projects/Merge/Sources/Alpha
C:/Documents/ArcGIS Pro/Projects/Merge/Sources/Beta
C:/Documents/ArcGIS Pro/Projects/Merge/Sources/Gamma
C:/Documents/ArcGIS Pro/Projects/Merge/Sources/Delta

Python has an in operator that can be used to examine whether one object contains another object. With strings, the in operator can be used to check whether a string contains a substring.

In this example, in is used with an if statement to list only the Word document (.docx) files in a directory and ignore all others.

import os

path = "C:/Documents/ArcGIS Pro/Projects/Merge/Sources"

for file in os.listdir(path):
    if ".docx" in file:
        print(path + '/' + file)
C:/Documents/ArcGIS Pro/Projects/Merge/Sources/Alpha
C:/Documents/ArcGIS Pro/Projects/Merge/Sources/Beta
C:/Documents/ArcGIS Pro/Projects/Merge/Sources/Gamma
C:/Documents/ArcGIS Pro/Projects/Merge/Sources/Delta
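The in operator matches ".docx" anywhere in a name, including the middle. A somewhat stricter sketch checks only the end of the name with the string method endswith() and skips subdirectories with os.path.isfile(). The function name list_files_with_extension is hypothetical, and the current working directory stands in for the folder of interest.

```python
import os

def list_files_with_extension(folder, extension):
    """Return full paths of the files in folder whose names end with extension."""
    matches = []
    for name in sorted(os.listdir(folder)):
        full_path = os.path.join(folder, name)
        # Compare in lowercase so that "REPORT.DOCX" also matches ".docx".
        if os.path.isfile(full_path) and name.lower().endswith(extension):
            matches.append(full_path)
    return matches

for path in list_files_with_extension(os.getcwd(), ".docx"):
    print(path)
```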

Comments

Comments in scripts are lines that the program ignores. These lines are used for documenting the authorship of scripts and for adding comments that explain what is going on when you have complex sequences of expressions and function calls.

Comments start with a pound sign (#) and tell the Python interpreter to ignore everything that follows on that line.

# Name of script (date)
# This is a comment that explains what the line after it does

x = 2 + 2
print(x)
4

Packages

Python has a wide variety of modules available beyond those that come with a standard Python installation (like math or statistics).

Because there are so many different modules, and because those modules can be interrelated, there is a hierarchy of structures used to manage modules.

Python modules, packages, and libraries

Modules also sometimes provide bindings. Bindings are specialized Python modules containing functions and methods that can be used to call libraries created to be used with other programming languages, usually C or C++.

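Repositories are collections of packages hosted on the internet. The main repository, the Python Package Index (PyPI), is operated by the Python Software Foundation, and the packages in it are contributed by their developers.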

You can install packages from repositories using PIP, the package installer for Python.

PIP handles installation of dependencies. Dependencies are additional libraries and packages that must also be installed in order to use the modules in a package. Dependencies can get messy and cause confusing installation error messages when they are not carefully configured, or when you are installing a package on a machine with an unusual configuration.

If an import command fails because the module is not installed, you can probably install the needed packages with PIP.

In this example, the numpy module needed by the matplotlib module is not installed, so the import fails.

Open the Windows Command Prompt and run:

pip install <module_name>

Alternatively, if PIP has not been set up with appropriate environment variables, you can run PIP via Python:

py -m pip install <module_name>
Installing a library with pip from the Windows command prompt
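Before running an import, a script can check whether a module is available. This is a minimal sketch using importlib.util.find_spec() from the standard library; the function name is_installed is hypothetical, and the check is intended for top-level module names such as numpy.

```python
import importlib.util

def is_installed(module_name):
    """Return True if a top-level module can be found by the import system."""
    return importlib.util.find_spec(module_name) is not None

for name in ["math", "numpy", "matplotlib"]:
    if is_installed(name):
        print(name, "is available")
    else:
        print(name, "is missing; try: py -m pip install", name)
```

The output depends on which packages are installed on the machine where the script runs.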