Data Structures

In addition to what are called “primitive” data types in Python (integer, float, string, and Boolean), most programming languages also include or support more complex data structures, that is, more complex ways of storing and accessing information. In Python, the one-dimensional (or linear) data structures include strings, lists, tuples, sets, and dictionaries.

| Name | Syntax | Example | Description |
| --- | --- | --- | --- |
| String | str(), "" | "Hello world!" | Sequence of characters |
| List | list(), [] | ["apple", "banana", "pear"], [1, 3, 5, 7] | Sequence of objects/values |
| Dictionary | dict(), {} | {'first_name': 'Knute', 'last_name': 'Rockne', 'class': '1918'} | Key-value pairs |
| Set | set(), {...} | {"apple", "banana", "pear"}, {1, 3, 5, 7} | Unordered group of unique values (empty braces {} create a dictionary, so use set() for an empty set) |
| Tuple | tuple(), () | ("apple", "banana", "pear"), (1, 3, 5, 7) | Ordered group of values that can include duplicates |
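As a minimal sketch of the syntax in the table, the block below builds each structure once with a literal and then shows two constructor calls. The variable names are illustrative assumptions, not part of the lab.

```python
# Literals: the bracket or quote style determines the structure.
greeting = "Hello world!"                    # string
fruits = ["apple", "banana", "pear"]         # list
person = {"first_name": "Knute", "last_name": "Rockne", "class": "1918"}  # dictionary
odd_numbers = {1, 3, 5, 7}                   # set
fruit_tuple = ("apple", "banana", "pear")    # tuple

# Constructors convert another iterable into the target structure.
letters = list("pear")             # ['p', 'e', 'a', 'r']
unique_numbers = set([1, 1, 3])    # duplicates collapse to {1, 3}

# Empty braces create a dictionary, not a set.
empty_braces = {}
empty_set = set()

print(type(empty_braces))  # <class 'dict'>
print(type(empty_set))     # <class 'set'>
```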

Lists

“In computer science, an array is a data structure consisting of a collection of elements (values or variables), each identified by at least one array index or key… The simplest type of data structure is a linear array, also called one-dimensional array” (Wikipedia, “Array (data structure)”).

We can think of arrays as one-dimensional, linear data structures made up of a collection of items. As the definition notes, we can access specific elements in the array by using an index or key value.

“In Python, the built-in array data structure is a list” (Busbee and Braunschweig, “Arrays and Lists”). Python also includes a few other built-in array-like data structures, including sets and tuples. We’ll come back to these later in this lab. We can also think of string objects, which are sequences of characters, as a type of one-dimensional, linear array.
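To make the idea of a string as a linear array concrete, here is a brief sketch. The variable word and its value are assumptions chosen for illustration.

```python
word = "banana"  # illustrative value

# A string has a length and positions, just like a list.
print(len(word))   # 6
print(word[0])     # b
print(word[-1])    # a

# Looping over a string visits one character at a time, left to right.
for position, character in enumerate(word):
    print(position, character)
```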

These one-dimensional or linear array structures have a few key properties that differentiate the structures and shape how we can interact with or manipulate them in a programming environment.

Those properties include:

  • Mutable: Can values in the structure be changed once it has been created or assigned to a variable?

  • Order: Does the order of values in the structure have meaning/significance, or is order not significant?

  • Indexing/Slicing: Can values in the structure be accessed using their position or index? Can we isolate values in the structure using their position?

  • Duplicates: Does the structure allow duplicate values?

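How these properties show up for Python’s built-in data structures:

| Structure | Mutable | Ordered | Indexing/Slicing | Duplicates |
| --- | --- | --- | --- | --- |
| String | No | Yes | Yes | Yes |
| List | Yes | Yes | Yes | Yes |
| Tuple | No | Yes | Yes | Yes |
| Set | Yes | No | No | No |
| Dictionary | Yes | Yes (insertion order, Python 3.7+) | By key, not by position | Keys no, values yes |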
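These properties can be checked directly in code. The following is a minimal sketch, with illustrative variable names, that probes mutability, duplicates, and indexing for a list, a tuple, and a set.

```python
fruit_list = ["apple", "banana", "apple"]     # list: mutable, ordered, duplicates kept
fruit_tuple = ("apple", "banana", "apple")    # tuple: immutable, ordered, duplicates kept
fruit_set = {"apple", "banana", "apple"}      # set: duplicates are dropped

# Mutable: a list accepts item assignment, a tuple raises TypeError.
fruit_list[0] = "pear"
try:
    fruit_tuple[0] = "pear"
    tuple_changed = True
except TypeError:
    tuple_changed = False

# Indexing: a set has no positions, so indexing it raises TypeError.
try:
    fruit_set[0]
    set_indexable = True
except TypeError:
    set_indexable = False

print(fruit_list)       # ['pear', 'banana', 'apple']
print(tuple_changed)    # False
print(len(fruit_set))   # 2
print(set_indexable)    # False
```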

Each structure has its own specific vocabulary and syntax, but some common operations apply across these structures:

  • Getting number of values in the structure (using the len() function)

  • For structures that are mutable, adding, modifying, and removing values

  • Sorting values in the structure

  • Testing for membership, that is, whether specific value(s) are present in the structure (using the in and not in operators)

  • For structures that are ordered or indexed, accessing elements using their position
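The operations listed above can be sketched on a small list of integers. The list numbers and its values are assumptions for illustration; append() and remove() are list methods, so other structures use different method names.

```python
numbers = [7, 3, 5, 1]   # illustrative list

# Number of values
print(len(numbers))      # 4

# Adding, modifying, and removing values (lists are mutable)
numbers.append(9)        # add a value at the end
numbers[0] = 11          # modify the first value
numbers.remove(3)        # remove the first occurrence of 3

# Sorting: sorted() returns a new list and leaves numbers unchanged
print(sorted(numbers))   # [1, 5, 9, 11]

# Membership tests
print(5 in numbers)      # True
print(3 not in numbers)  # True

# Access by position
print(numbers[1])        # 5
```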

An example that uses a list of strings:

# list of string objects
fruits = ["apple", "banana", "blueberry", "cherry"]

# check data type
print(type(fruits))

We can determine the number of elements in the list using the len() function.

print(len(fruits))

Remember that Python indexing starts at 0 and counts left-to-right. We can access specific values using their position.

# access first value
print(fruits[0])

# access second value
print(fruits[1])

# access third value
print(fruits[2])

Python lists also support negative indexing: we can use negative index values to count right-to-left.

  • NOTE: Negative indexing starts counting at -1

# access last value
print(fruits[-1])

# access next to last value
print(fruits[-2])
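To see how positive and negative positions line up, the sketch below prints both for every item and then uses slicing to isolate a range of values. The names negative_position, first_two, and last_two are illustrative assumptions.

```python
fruits = ["apple", "banana", "blueberry", "cherry"]

# Print each item's left-to-right index and its right-to-left equivalent.
for position, fruit in enumerate(fruits):
    negative_position = position - len(fruits)
    print(position, negative_position, fruit)

# Slicing returns a new list; the stop position is not included.
first_two = fruits[0:2]
last_two = fruits[-2:]

print(first_two)  # ['apple', 'banana']
print(last_two)   # ['blueberry', 'cherry']
```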

Dictionaries

The other primary type of array we can encounter is an associative array, “an abstract data type that stores a collection of (key, value) pairs, such that each possible key appears at most once in the collection” (Wikipedia, “Associative Array”).

Python stores associative arrays using the dictionary data structure. Python dictionaries consist of key-value pairs, where the key works as an identifier or index.

A preliminary example in Python:

# create dictionary
english_to_french = {
  'one': 'un',
  'two': 'deux',
  'three': 'trois',
  'four': 'quatre',
  'five': 'cinq'
}

# check data type
print(type(english_to_french))

We can use the index operator ([]) and key values to select specific values in the dictionary.

# access value for one key
print(english_to_french['one'])

# access value for five key
print(english_to_french['five'])

# access value for asdf key
print(english_to_french['asdf'])

The last line will raise a KeyError because 'asdf' is not a key in this dictionary.
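When a key might be missing, there are a few common ways to avoid the error. The sketch below reuses the dictionary from above; the fallback string 'unknown' and the variable translation are assumptions for illustration.

```python
english_to_french = {
  'one': 'un',
  'two': 'deux',
  'three': 'trois',
  'four': 'quatre',
  'five': 'cinq'
}

# Test for the key before using it
print('asdf' in english_to_french)               # False

# get() returns None, or a chosen default, instead of raising KeyError
print(english_to_french.get('asdf'))             # None
print(english_to_french.get('asdf', 'unknown'))  # unknown

# Or catch the KeyError explicitly
try:
    translation = english_to_french['asdf']
except KeyError:
    translation = None

print(translation)  # None
```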

Application

Q2: Create your own list of numbers or strings, using the examples in the lab as a starting point. What is the index position of each item in your list? How would you return the value of the first item? How would you return the value of the last item?

Additional Resources

For more background on arrays and data structures in Python: