Dictionaries#


Questions:#

  • How can I organize data to store values associated with labels?

  • How do I work with such data structures?

Learning Objectives:#

  • Be able to store data in dictionaries

  • Be able to retrieve specific items from dictionaries


Dictionaries are mappings#

  • Python provides a valuable data type called the dictionary (often abbreviated as “dict”)

  • Dictionaries are collections — Python data types that store multiple values — like lists or strings

  • However, dictionaries are mappings. Like a traditional dictionary that contains a word, followed by a definition, dictionaries are pairings of keys and values

  • A dictionary can be defined using curly braces that specify key-value mappings using the colon (:)

  • For example, the following defines a dictionary with four keys (First Name, Last Name, etc.), each with an associated value:

my_info = {'First Name':'Lionel', 
           'Last Name':'Messi', 
           'Age':35, 
           'Height':1.70}
  • Like lists, dictionaries can contain a mixture of data types.

  • Dictionary keys must be an immutable type, such as a string or number (int or float). This is because dictionaries are organized by keys, and once a key is defined, it cannot be changed.

  • Dictionary values can be any type.

The values in a dictionary are accessed using their keys inside square brackets:

my_info['First Name']
'Lionel'
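
The rule about immutable keys can be checked directly. The following is a minimal sketch with made-up data (the names grid_labels, origin_label, and list_key_allowed are hypothetical, not part of the lesson): a tuple is immutable and works as a key, whereas a list is mutable and is rejected.

```python
# Tuples are immutable, so they are allowed as dictionary keys.
# (The coordinates and labels here are made up for illustration.)
grid_labels = {(0, 0): 'origin', (1, 0): 'east'}
origin_label = grid_labels[(0, 0)]
print(origin_label)

# Lists are mutable, so using one as a key raises a TypeError.
try:
    bad_dict = {[0, 0]: 'origin'}
    list_key_allowed = True
except TypeError as err:
    list_key_allowed = False
    print('TypeError:', err)
```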

Why Dictionaries?#

  • Dictionaries provide a convenient way for labelling data. For example, in the previous lesson on lists, we used an example of a list of life expectancies for a country (Canada) for different years:

life_exp = [48.1, 56.6, 64.0, 71.0, 75.2, 79.2]
  • One limitation of using a list like this is that we don’t know what years the values are associated with. For example, in what year was 48.1 the average life expectancy?

  • Dictionaries solve this problem, because we can use the keys to label the data. For example, define the following dictionary in which keys indicate years, and values are life expectancies:

life_exp = {1900:48.1, 1920:56.6, 1940:64.0, 1960:71.0, 1980:75.2, 2000:79.2}
  • In defining a dictionary, we use curly braces {}

  • We associate keys and values with a colon :

  • Now we can see the life expectancy associated with a given year like so:

life_exp[1940]
64.0
  • What happens if we ask for a key that doesn’t exist?

life_exp[2021]
---------------------------------------------------------------------------
KeyError                                  Traceback (most recent call last)
Cell In[5], line 1
----> 1 life_exp[2021]

KeyError: 2021
  • We get a specific type of error called a KeyError, which tells us that the key isn’t defined
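
If a missing key should not stop your program, one option is the dictionary .get() method, which returns None (or a fallback value you supply) instead of raising a KeyError. A minimal sketch, re-creating life_exp so the example runs on its own; the variable names missing, fallback, and found are only for illustration:

```python
life_exp = {1900: 48.1, 1920: 56.6, 1940: 64.0, 1960: 71.0, 1980: 75.2, 2000: 79.2}

# .get() returns None (instead of raising KeyError) when the key is missing
missing = life_exp.get(2021)
print(missing)

# A second argument sets the value to return for a missing key
fallback = life_exp.get(2021, 'no data')
print(fallback)

# For existing keys, .get() returns the same value as square brackets
found = life_exp.get(1940)
print(found)
```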

Dictionaries are mutable#

  • We previously discussed the difference between immutable types like strings — whose contents cannot be changed — and mutable types like lists — which can be changed. Dictionaries are mutable.

  • This means we can:

    • add new key:value pairs after the dictionary is first defined, and

    • modify the values associated with existing keys

  • For example, we can add a new key:value pair by using the dict name plus square brackets to specify a new key, followed by = to assign the value to this key:

life_exp[2020] = 83.2
print(life_exp)
{1900: 48.1, 1920: 56.6, 1940: 64.0, 1960: 71.0, 1980: 75.2, 2000: 79.2, 2020: 83.2}
  • We can also change the value for an existing key in the same way. In the example above we assigned the wrong value to 2020; it should be 82.3 not 83.2. So we can fix it like this:

life_exp[2020] = 82.3
print(life_exp)
{1900: 48.1, 1920: 56.6, 1940: 64.0, 1960: 71.0, 1980: 75.2, 2000: 79.2, 2020: 82.3}
  • Note that Python doesn’t warn you if you’re overwriting a value associated with an existing key.
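
Because overwriting happens silently, you may want to check for a key before assigning to it. The following is one possible sketch, using the in operator (covered later in this lesson) and a shortened copy of life_exp; the variables year and new_value are hypothetical names used for illustration:

```python
life_exp = {1900: 48.1, 2000: 79.2, 2020: 82.3}

year = 2020
new_value = 83.2

# Only assign if the key is not already present
if year in life_exp:
    print(f'{year} already has the value {life_exp[year]}; not overwriting')
else:
    life_exp[year] = new_value
```

Here the existing value for 2020 is left untouched, and a message is printed instead.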

Dictionary keys cannot be renamed#

  • Although dictionaries are mutable, we can’t rename dictionary keys. However, we can delete existing entries and create new ones if we need to.

  • For example, below we add a life expectancy value for the year 2040, but we mistakenly make the key a string instead of an integer like the other, existing keys:

life_exp['2040'] = 85.1
print(life_exp)
{1900: 48.1, 1920: 56.6, 1940: 64.0, 1960: 71.0, 1980: 75.2, 2000: 79.2, 2020: 82.3, '2040': 85.1}
  • We can add another entry using the correct (integer) key, but this doesn’t delete the old entry:

life_exp[2040] = 85.1
print(life_exp)
{1900: 48.1, 1920: 56.6, 1940: 64.0, 1960: 71.0, 1980: 75.2, 2000: 79.2, 2020: 82.3, '2040': 85.1, 2040: 85.1}
  • Alternatively, rather than manually entering the new value, we can copy it from the value corresponding to the original (incorrect) key:

life_exp[2040] = life_exp['2040']
print(life_exp)
{1900: 48.1, 1920: 56.6, 1940: 64.0, 1960: 71.0, 1980: 75.2, 2000: 79.2, 2020: 82.3, '2040': 85.1, 2040: 85.1}
  • Whether we manually enter a new key:value pair, or copy a value from an existing dictionary entry, we still retain the original dictionary entry (in this case, '2040')

Removing dictionary entries#

  • We can remove a dictionary entry with the del statement. del is a Python statement (not a function or method), so it is followed by a space rather than parentheses:

del life_exp['2040']
print(life_exp)
{1900: 48.1, 1920: 56.6, 1940: 64.0, 1960: 71.0, 1980: 75.2, 2000: 79.2, 2020: 82.3, 2040: 85.1}
  • We can alternatively remove a dictionary item using the .pop() method, as we saw last time for lists.

  • The key for the dictionary entry you wish to delete is the argument you pass to .pop()

  • In this example we first create an erroneous entry, then remove it with .pop():

# Create incorrect entry
life_exp[200] = 79.2
print(life_exp)

# Remove incorrect entry
life_exp.pop(200)
print(life_exp)
{1900: 48.1, 1920: 56.6, 1940: 64.0, 1960: 71.0, 1980: 75.2, 2000: 79.2, 2020: 82.3, 2040: 85.1, 200: 79.2}
{1900: 48.1, 1920: 56.6, 1940: 64.0, 1960: 71.0, 1980: 75.2, 2000: 79.2, 2020: 82.3, 2040: 85.1}
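
Because .pop() returns the value it removes, it can also be used to "rename" a key in a single line: remove the entry under the old key and store its value under the new one. A minimal sketch, using a shortened copy of life_exp that still contains the mistaken string key '2040':

```python
life_exp = {1900: 48.1, 2000: 79.2, '2040': 85.1}

# .pop() removes the entry and returns its value,
# which we immediately store under the corrected (integer) key
life_exp[2040] = life_exp.pop('2040')
print(life_exp)
# {1900: 48.1, 2000: 79.2, 2040: 85.1}
```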

Dictionaries are not indexed by position#

  • Both strings and lists are ordered sequences: the items exist in a string or list in a specific order. This is why we can use integers to index string or list items based on their position.

  • Dictionaries, in contrast, are not sequences. Since Python 3.7 a dictionary does remember the order in which its entries were added, but entries have no numerical position, so you can’t access values using numerical indexing, only by their keys. For example, this will fail:

life_exp[0]
---------------------------------------------------------------------------
KeyError                                  Traceback (most recent call last)
/var/folders/x0/pvpjfdzd39sc5rqvsk5lvv4m0000gn/T/ipykernel_12883/1067234015.py in <module>
----> 1 life_exp[0]

KeyError: 0
  • The code above results in a KeyError, because Python interprets what’s in the square brackets as a dictionary key, not a sequential index. This makes sense, since dictionary keys can be integers.
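
If you do need an entry by position, one option is to convert the keys to a list first and index that list. A minimal sketch, assuming Python 3.7 or later (where dictionaries keep their entries in insertion order) and using a shortened copy of life_exp; the names years and first_year are only for illustration:

```python
life_exp = {1900: 48.1, 1920: 56.6, 1940: 64.0}

# Converting a dictionary to a list gives its keys, in insertion order
years = list(life_exp)
first_year = years[0]

# Use the key found by position to look up its value
print(first_year, life_exp[first_year])
# 1900 48.1
```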

Dictionaries have properties#

  • Like other Python collections, dictionaries have a length property, which is the number of key:value pairs:

len(life_exp)
8

Viewing all the keys or values in a dictionary#

  • You can view the entire set of keys in a dictionary like this:

life_exp.keys()
dict_keys([1900, 1920, 1940, 1960, 1980, 2000, 2020, 2040])
  • Likewise, you can view all of the values using .values():

life_exp.values()
dict_values([48.1, 56.6, 64.0, 71.0, 75.2, 79.2, 82.3, 85.1])
  • You can also view both the keys and values at once, using .items():

life_exp.items()
dict_items([(1900, 48.1), (1920, 56.6), (1940, 64.0), (1960, 71.0), (1980, 75.2), (2000, 79.2), (2020, 82.3), (2040, 85.1)])
  • The output of .items() is more complex than just asking Python to print the dictionary, but it’s organized in a way that will be useful in later lessons, for example if you want to systematically do the same thing to each item in a dictionary.
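
As a preview of that kind of use, here is a minimal sketch of a for loop (covered in a later lesson) that walks through .items() and unpacks each pair into a key and a value. It uses a shortened copy of life_exp, and the names year, value, and summary_lines are hypothetical:

```python
life_exp = {1900: 48.1, 1920: 56.6, 1940: 64.0}

summary_lines = []
# Each item is a (key, value) pair, unpacked here into two variables
for year, value in life_exp.items():
    line = f'{year}: {value} years'
    summary_lines.append(line)
    print(line)
```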

Skill-Testing Question: What Python type are the results of the .keys() and .values() methods?

Finding a key in a dictionary#

  • If your dictionary is very large, it may not be feasible to visually scan through all the entries to determine if a particular key is present. You can use the in operator to check whether a key is in a dictionary:

print(1900 in life_exp)
print(1800 in life_exp)
True
False
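
Note that in checks a dictionary's keys, not its values. To test whether something appears among the values, one option is to apply in to the result of .values(). A minimal sketch with a shortened copy of life_exp (the result variable names are only for illustration):

```python
life_exp = {1900: 48.1, 1920: 56.6, 1940: 64.0}

key_found = 1900 in life_exp               # 1900 is a key
value_as_key = 48.1 in life_exp            # 48.1 is a value, not a key
value_found = 48.1 in life_exp.values()    # search the values instead

print(key_found)     # True
print(value_as_key)  # False
print(value_found)   # True
```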

Dictionary values can be any type#

  • While dictionary keys must be an immutable type, such as strings or numbers, values can be any type — including lists, or other dictionaries.

  • For example, here we create a dictionary in which each key is a country name, and the value is a list of life expectancies for different years:

intl_life_exp = {'Canada':[48.1, 56.6, 64.0, 71.0, 75.2, 79.2],
                   'Denmark':[51.3, 57.5, 66.1, 72.0, 74.2, 77.0],
                   'Egypt':[32.7, 32.6, 33.8, 46.9, 58.0, 68.7]
                  }
intl_life_exp['Egypt']
[32.7, 32.6, 33.8, 46.9, 58.0, 68.7]

Nested indexing#

  • intl_life_exp is an example of nesting, which we saw previously in the lesson on lists. Each dictionary entry’s value is a list, which is “nested” inside the dictionary.

  • Since we can index entries in a list, we can use a sequence of specifiers to access a particular element within the list for a specific country’s dictionary entry:

intl_life_exp['Denmark'][1]
57.5

Nested dictionaries#

  • Using lists in the above example has the same limitation we discussed at the start of this lesson: we don’t know what years correspond to the values in each list.

  • Using a dictionary of dictionaries solves this problem:

intl_life_exp = {'Canada':{1900:48.1, 1920:56.6, 1940:64.0, 1960:71.0, 1980:75.2, 2000:79.2},
                   'Denmark':{1900:51.3, 1920:57.5, 1940:66.1},
                   'Egypt':{1900:32.7, 1920:32.6, 1940:33.8, 1980:58.0}
                  }
intl_life_exp['Egypt']
{1900: 32.7, 1920: 32.6, 1940: 33.8, 1980: 58.0}
  • Note that each nested dictionary is independent of the others, so you don’t need to have the same keys in each dictionary.

  • We can now obtain values for specific years, within specific countries, using a sequence of keys

  • The order of keys goes from the outside in as you move from left to right, so 'Denmark' comes before the year we want to access for Denmark

intl_life_exp['Denmark'][1940]
66.1
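
Because the nested dictionaries need not share keys, a chained lookup such as intl_life_exp['Denmark'][1960] would raise a KeyError. One way to guard against this is to check each level with in before indexing. A minimal sketch; the variables country, year, and result are hypothetical names used for illustration:

```python
intl_life_exp = {'Canada': {1900: 48.1, 1920: 56.6, 1940: 64.0, 1960: 71.0, 1980: 75.2, 2000: 79.2},
                 'Denmark': {1900: 51.3, 1920: 57.5, 1940: 66.1},
                 'Egypt': {1900: 32.7, 1920: 32.6, 1940: 33.8, 1980: 58.0}}

country = 'Denmark'
year = 1960

# Check the outer key first, then the inner key, before indexing
if country in intl_life_exp and year in intl_life_exp[country]:
    result = intl_life_exp[country][year]
else:
    result = None  # no value recorded for this country and year

print(result)
# None
```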

Exercises#

Creating a dictionary#

  • Create a dictionary called weekdays in which the keys are the names of the days of the week from Monday to Friday (not including weekends), and the values are the dates of the days of the week for this week (e.g., if today is Monday, Sept 20, then your value for Monday would be 20)

  • Using the weekdays dictionary you created, print the value for Wednesday

Sorting dictionaries#

  • Although dictionaries are not kept in sorted order, it is possible to view a sorted list of dictionary keys using the sorted() function we learned in the previous lesson on lists. Try printing the sorted keys for weekdays


Summary of Key Points#

  • Dictionaries are a special type of collection called mappings

  • Each dictionary entry is specified as a key:value pair

  • A dictionary is defined by listing its key:value pairs within curly braces: {}

  • Keys must be immutable types, such as strings or numbers.

    • However, dictionary values can be any Python type, including lists or other dictionaries

  • Dictionaries are mutable, in that one can add or delete entries, or change the value of an entry

    • You can add a new dictionary entry through assignment

    • However, dictionary keys cannot be renamed. Instead, you must delete the old key:value pair and create a new one

    • You can delete a dictionary entry using either the del statement or the .pop() method

  • Dictionaries are not indexed by position, so you cannot access an entry based on its serial (sequential) position the way you can entries in a list

    • Instead, you access values based on their keys

  • The length of a dictionary is the number of key:value pairs it contains

  • You can view all of a dictionary’s keys or values using the .keys() and .values() methods, respectively


This section was adapted from Aaron J. Newman’s Data Science for Psychology and Neuroscience - in Python and Software Carpentry’s Plotting and Programming in Python workshop.