Python dictionaries

Python lists can store an ordered sequence of items, for example a list of words. But when you want to map items of one kind to items of another, you need a Python dictionary.
Example cases where you could use a Python dictionary:
  • Mapping English words to their German translations
    "dog" -> "Hund"
    "cat" -> "Katze"
    ...

  • Mapping each word to its count in Wilkie Collins’ “The Woman in White”
    "the" -> 10,231
    "a" -> 9,765
    "to" -> 9,190
    ...
  • Mapping each preposition to the verbs with which it occurs
    "up" -> ["stand", "get", "look", "pass"]
    "in" -> ["live", "get", "take"]
    ...

Mapping English words to German translations with a Python dictionary:

>>> mydict = {}
>>> mydict["dog"] = "Hund"
>>> mydict["cat"] = "Katze"
>>> mydict["rhinoceros"] = "Nashorn"
>>> mydict["dog"]
'Hund'

Initialization of an empty dictionary: mydict = {}
Note the curly brackets! Python uses them for dictionaries (and for sets, a related data type), not for lists.
A dictionary maps a key (e.g. “dog”) to a value (e.g. “Hund”).


Comparing Python dictionaries and Python lists

Initializing to an empty data structure:
mylist = []       # empty list: straight brackets
mydict = {}    # empty dictionary: curly brackets
Initializing to a nonempty data structure:
# initializing a list: straight brackets
mylist = ["dog", "cat", "rhinoceros"]
# initializing a dictionary: curly brackets, key-colon-value
mydict = {"dog":"Hund", "cat":"Katze", "rhinoceros":"Nashorn"}

Accessing items on a list: index in straight brackets. A list “maps” indices to items.
>>> mylist = ['dog', 'cat', 'rhinoceros']
>>> mylist[1]
'cat'

Accessing items in a dictionary: key in straight brackets. A dictionary maps keys to values.
>>> mydict = {"dog":"Hund", "cat":"Katze", "rhinoceros":"Nashorn"}
>>> mydict['cat']
'Katze'

You can change an item on a list, and a value for a dictionary key.

>>> mylist = ["cat", "dog"]
>>> mylist[1] = "chien"
>>> mylist
['cat', 'chien']
>>> mydict = {"dog":"Hund", "cat":"Katze"}
>>> mydict['cat'] = "chat"
>>> mydict
{'dog': 'Hund', 'cat': 'chat'}




When you try to assign a list item to an unused index, you get an error:
>>> mylist = ['dog', 'cat', 'rhinoceros']
>>> mylist[3] = 'brontosaurus'
Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
IndexError: list assignment index out of range

In a dictionary, you can add a new key:value pair through assignment:
 
>>> mydict = {'dog':'Hund', 'rhinoceros':'Nashorn'}
>>> mydict['hippo'] = 'Nilpferd'
>>> mydict
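{'dog': 'Hund', 'rhinoceros': 'Nashorn', 'hippo': 'Nilpferd'}

Note that a dictionary has no numbered positions: you always reach a value through its key, never through an index. This, too, is different from lists. Since Python 3.7, dictionaries do keep their items in the order in which they were inserted; in older versions, the order in which items are shown need not reflect the insertion order.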


Dictionary keys and dictionary values

What can be a dictionary key?

Strings can be dictionary keys:
mydict = {"dog":"Hund", "rhinoceros":"Nashorn"}

Integers can be dictionary keys. The following dictionary (somewhat redundantly) maps prime numbers to their rank, e.g. 2 is the first prime number.
prime_nums = {2:1, 3:2, 5:3, 7:4, 11:5}

Floating point numbers can be dictionary keys as well:
>>> mydict = {3.1415 : "pi", 2.71828 : "e" }
>>> mydict[3.1415]
'pi'

Not everything can be a dictionary key; lists, for example, cannot. (If you're interested in the details: It's because lists are "mutable", that is, you can change individual items on a list, and that would mess up the way dictionaries are represented internally in Python.)
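
If you want to see this for yourself, here is a small sketch. It is only an illustration: the dictionary verb_counts and its contents are made up, and it uses try/except to catch the error so that the script keeps running. It also tries a tuple, the fixed relative of a list that this worksheet introduces further down, which Python does accept as a key.

```python
# Hypothetical example: trying a list and a tuple as dictionary keys.
verb_counts = {}

try:
    verb_counts[["stand", "up"]] = 3      # a list is mutable, so Python refuses it
    list_key_worked = True
except TypeError as error:
    list_key_worked = False
    print("A list cannot be a key:", error)

verb_counts[("stand", "up")] = 3          # a tuple is fixed, so it is accepted
print(verb_counts[("stand", "up")])
```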

What can be a dictionary value?

Any data type can be a dictionary value. Here is again our first example, where the dictionary values were strings:
mydict = {"dog":"Hund", "rhinoceros":"Nashorn"}

Here is an example where dictionary values are lists:
>>> preps_and_verbs = {}
>>> preps_and_verbs['to'] = ['see', 'go', 'belong']
>>> preps_and_verbs
{'to': ['see', 'go', 'belong']}

Even a dictionary can be a dictionary value.


Checking whether a key is present

>>> mydict["armadillo"]
Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
KeyError: 'armadillo'
>>> "armadillo" in mydict
False
>>> if "rhinoceros" in mydict: print( "hello" )
...
hello

"in" checks keys, it does not check values:
 
>>> "Katze" in mydict
False


Note: When you try to access the dictionary value for a key that is not in the dictionary, you get a KeyError.
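
If you would rather not get an error for a missing key, there are two common ways around it, sketched below. The first checks with "in" before looking the key up. The second uses the dictionary method get(), which returns a fallback value of your choice when the key is missing. The fallback string "(unknown)" is just an assumption for this example.

```python
mydict = {"dog": "Hund", "cat": "Katze", "rhinoceros": "Nashorn"}

# Option 1: check with "in" before looking the key up.
if "armadillo" in mydict:
    translation = mydict["armadillo"]
else:
    translation = "(unknown)"

# Option 2: get() returns its second argument when the key is missing.
translation_via_get = mydict.get("armadillo", "(unknown)")

print(translation, translation_via_get)
print(mydict.get("dog", "(unknown)"))     # the key exists, so we get its value
```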



A task

Here is a mini German/English dictionary:
mydict = {"befreit":"liberated", "baeche":"brooks", "eise":"ice", "sind":"are", "strom":"river", "und":"and", "vom":"from"}

Can you use it to do a bad translation of the following German sentence?
mysent = "vom eise befreit sind strom und baeche"



































Here's a solution: (Note that this is not how you want your machine translation to work! The translations that you get this way are terrible.)
mydict = {"befreit":"liberated", "baeche":"brooks", "eise":"ice", "sind":"are", "strom":"river", "und":"and", "vom":"from"}
mysent = "vom eise befreit sind strom und baeche"
for german_word in mysent.split():
    print( mydict[german_word], end = " ")
print()
Adding the parameter end = " " puts a space instead of a linebreak at the end of what is printed. That way, multiple "print" outputs land on the same line. The final print() then ends the line.
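
The solution above stops with a KeyError as soon as the sentence contains a word that is not in the dictionary. Here is one possible variant that keeps going. It is a sketch, not the only way to do it: the function name translate and the choice to mark unknown words with angle brackets are made up for this example. It collects the English words in a list and glues them together with " ".join() at the end.

```python
mydict = {"befreit": "liberated", "baeche": "brooks", "eise": "ice",
          "sind": "are", "strom": "river", "und": "and", "vom": "from"}

def translate(sentence, dictionary):
    """Translate word by word; unknown words are kept, wrapped in <...>."""
    english_words = []
    for german_word in sentence.split():
        if german_word in dictionary:
            english_words.append(dictionary[german_word])
        else:
            english_words.append("<" + german_word + ">")
    return " ".join(english_words)

print(translate("vom eise befreit sind strom und baeche", mydict))
# from ice liberated are river and brooks
print(translate("vom eise befreit sind strom und fluesse", mydict))
# from ice liberated are river and <fluesse>
```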


Counting words in a text

Here is how you can count occurrences of just one word (here: "the") in a text:
# paragraph from the Onion, March 04
paragraph = """While dieters are accustomed to exercises of will, a new English translation of Germany's most popular diet book takes the concept to a new philosophical level. The Nietzschean diet, which commands its adherents to eat superhuman amounts of whatever they most fear, is developing a strong following in America."""

count_the = 0
for word in paragraph.split():
    if word == "the":
        count_the = count_the + 1

print( count_the )


Now suppose we want to count occurrences of all words at the same time. How can we do that? Here's a solution that uses Python dictionaries. It uses the words as keys, and their counts as the values. Every time we encounter a word, we add one to its value in the dictionary.

# paragraph from the Onion, March 04
paragraph = """While dieters are accustomed to exercises of will, a new English translation of Germany's most popular diet book takes the concept to a new philosophical level. The Nietzschean diet, which commands its adherents to eat superhuman amounts of whatever they most fear, is developing a strong following in America."""
counts = { }

for word in paragraph.split():
    if word not in counts:
        counts[word] = 0
    counts[word] = counts[word] + 1

print( counts )

The condition "if word not in counts" is true if the content of the variable word is not a key in the dictionary counts.

Note that this is a variant of the "accumulation" code pattern that you have seen before. We initialize counts to an empty dictionary. Then we iterate over the words in the paragraph, adding numbers to the dictionary as we go along. The first time we encounter a word, we initialize its count to zero. We know we encounter it for the first time because there is no dictionary key for it yet.
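
One thing the code above does not do is tidy up the words: "The" and "the" are counted separately, and so are "diet" and "diet,". The sketch below shows one possible way to handle that, on a short made-up text. It lowercases each word and strips some punctuation from its ends (which punctuation characters to strip is an assumption), and it uses get() with a default of 0 in place of the "if word not in counts" test. The last lines use Counter from Python's collections module, which does this kind of counting for you.

```python
from collections import Counter

text = "The cat saw the dog. The dog saw the cat, too."

counts = {}
for word in text.split():
    word = word.lower().strip(".,;:!?")       # "The" -> "the", "dog." -> "dog"
    counts[word] = counts.get(word, 0) + 1    # get() gives 0 for a new word

print(counts)

# Counter builds the same mapping from a list of cleaned-up words
cleaned = [word.lower().strip(".,;:!?") for word in text.split()]
same_counts = Counter(cleaned)
print(same_counts["the"])
```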


Retrieving things from a dictionary

As you have seen above, you can retrieve an individual value by its key:
>>> mydict = {"dog" : "Hund", "cat" : "Katze", "rhinoceros" : "Nashorn"}
>>> mydict["rhinoceros"]
'Nashorn'

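You can also retrieve all the keys, or all the values, of a dictionary. In Python 3 the result is a list-like "view" rather than a real list; wrap it in list() if you need a list:
>>> mydict.keys()
dict_keys(['dog', 'cat', 'rhinoceros'])
>>> mydict.values()
dict_values(['Hund', 'Katze', 'Nashorn'])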

You can use this to do a for-loop over all keys or over all values in a dictionary:
mydict = {"dog" : "Hund", "rhinoceros" : "Nashorn"}
for english in mydict.keys():
    print( "I got the following English animal:", english )

for german in mydict.values():
    print( "I got the following German animal:", german )

The following code has a hypothetical dictionary of word counts for the words "the", "a", "hippo", and "iguana". We can figure out how many words were in the "corpus" by summing up the counts for all four words:

>>> mycounts = {"the" : 4000, "a":3500, "hippo": 2, "iguana": 1}
>>> mysum = 0
>>> for count in mycounts.values():
...     mysum = mysum + count
...
>>> mysum
7503

Or, more simply:
>>> sum(mycounts.values())
7503
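
Once you have the total, you can turn each count into a relative frequency, that is, the share of the corpus that the word makes up. The sketch below reuses the hypothetical counts from above; the names total and relative_freq are made up for this example.

```python
mycounts = {"the": 4000, "a": 3500, "hippo": 2, "iguana": 1}

total = sum(mycounts.values())           # 7503, as computed above

relative_freq = {}
for word in mycounts.keys():
    # share of all word occurrences that belong to this word
    relative_freq[word] = mycounts[word] / total

print(relative_freq["the"])
```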

So, keys() retrieves all keys, and values() retrieves all values. What if you want to retrieve all keys with their values? Here is how:
>>> mydict.items()
dict_items([('dog', 'Hund'), ('cat', 'Katze'), ('rhinoceros', 'Nashorn')])

What is this? ('rhinoceros', 'Nashorn')
It is a new data type, a tuple. A tuple is like a list that is fixed: once it has been created, you cannot change it in any way.
>>> rhino = list(mydict.items())[2]
>>> rhino
('rhinoceros', 'Nashorn')
>>> type(rhino)
<class 'tuple'>

mydict.items() is something like a list, the list of key/value pairs. We can iterate over it:

>>> for pair in mydict.items():
...     print( pair )
...
('dog', 'Hund')
('cat', 'Katze')
('rhinoceros', 'Nashorn')


You can take a tuple or a list apart by assigning multiple variables to it at once, one variable for each item on the list or tuple. (Of course, that only works if you know exactly how long the list or tuple is.)

>>> rhino
('rhinoceros', 'Nashorn')
>>> english, german = rhino
>>> print( english, "translates to", german )
rhinoceros translates to Nashorn

The central line is
english, german = rhino

In the assignments you have seen so far, there was only a single variable on the left-hand side of the "=". But if we know that the right-hand side of the "=" has exactly two components, we can put two variables on the left-hand side. The command above takes the tuple ('rhinoceros', 'Nashorn') apart into two items and assigns the first to the variable english and the second to the variable german.
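
The same kind of assignment works for longer tuples and for lists, as long as the number of variables matches the number of items. Here is a small sketch with made-up data (the three-part tuple entry is hypothetical). The last part shows what happens when the numbers do not match; it uses try/except to catch the error so that the script keeps running.

```python
entry = ("rhinoceros", "Nashorn", 4)      # hypothetical: English, German, count
english, german, count = entry            # three items, three variables
print(english, german, count)

first, second = ["dog", "cat"]            # a list can be taken apart the same way
print(first, second)

try:
    english, german = entry               # three items, but only two variables
    unpacked = True
except ValueError as error:
    unpacked = False
    print("Wrong number of variables:", error)
```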

We can do this for all key/value pairs in mydict:

>>> for pair in mydict.items():
...     english, german = pair
...     print( english, "translates to", german)
...
dog translates to Hund
cat translates to Katze
rhinoceros translates to Nashorn

Or we can take the key/value pairs apart directly in the 'for' loop:

>>> for english, german in mydict.items():
...     print( german, "is German for", english)
...
Hund is German for dog
Katze is German for cat
Nashorn is German for rhinoceros

The central line here is:
for english, german in mydict.items():

This is the same idea as above -- we know that any member of mydict.items() consists of two parts (a key and a value), so we can assign it to two variables at once.
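
One thing you can do with such a loop is turn a dictionary around. The sketch below builds a German-to-English dictionary from the English-to-German one; the name german_to_english is made up for this example. Note the assumption that no two English words share the same German translation: if they did, the later pair would overwrite the earlier one.

```python
mydict = {"dog": "Hund", "cat": "Katze", "rhinoceros": "Nashorn"}

german_to_english = {}
for english, german in mydict.items():
    german_to_english[german] = english   # swap the roles of key and value

print(german_to_english["Katze"])
# cat
```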


Inserting and deleting items in a dictionary

You can insert a key/value pair into a dictionary simply by assigning the value to the key:
>>> mydict['hippo'] = "Nilpferd"
>>> mydict
{'dog': 'Hund', 'cat': 'Katze', 'rhinoceros': 'Nashorn', 'hippo': 'Nilpferd'}

And here is how you can delete an item:
>>> mydict = {"dog" : "Hund", "cat" : "Katze", "rhinoceros" : "Nashorn"}
>>> del mydict["dog"]
>>> mydict
{'cat': 'Katze', 'rhinoceros': 'Nashorn'}

And if you want to delete all items at the same time, here's how:
>>> mydict.clear()
>>> mydict
{}

{ } is an empty dictionary.
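
Like looking up a missing key, deleting a missing key with del gives you a KeyError. The sketch below shows two ways to avoid that: checking with "in" first, or using the dictionary method pop(), which deletes a key and hands back its value. Given a second argument, pop() returns that argument for a missing key instead of raising an error. The variable names removed and missing are made up for this example.

```python
mydict = {"dog": "Hund", "cat": "Katze", "rhinoceros": "Nashorn"}

# del raises a KeyError for a missing key, so check first
if "armadillo" in mydict:
    del mydict["armadillo"]

# pop() deletes the key and returns its value
removed = mydict.pop("cat")

# with a second argument, pop() returns it for a missing key
missing = mydict.pop("armadillo", None)

print(removed, missing, mydict)
```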


Complex data types

A list can be the value in a dictionary. For example, we might want to map a preposition ("on") to a list of all the verbs with which we have seen it. We can then even retrieve the first verb from that list:
>>> mydict = { }
>>> mydict["on"] = ["step", "rely", "wait"]
>>> mydict["on"][0]
'step'

How this works: mydict["on"] is a list, which we can then index using [0].
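
Such a dictionary of lists can be built up step by step with the same pattern as in the word counting code. The sketch below assumes that the data comes as a list of (verb, preposition) pairs; that list and the name preps_and_verbs are made up for this example. The first time we see a preposition, we give it an empty list, and then we append the verb to that list.

```python
# Hypothetical input: verb/preposition pairs observed in some text
pairs = [("stand", "up"), ("live", "in"), ("get", "up"), ("get", "in")]

preps_and_verbs = {}
for verb, prep in pairs:
    if prep not in preps_and_verbs:
        preps_and_verbs[prep] = []        # first time we see this preposition
    preps_and_verbs[prep].append(verb)    # add the verb to the preposition's list

print(preps_and_verbs["up"])
# ['stand', 'get']
```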

The value in a dictionary can even be another dictionary. For example, we may want to map each preposition (for example, "on"), to the verbs with which we have seen it and the counts for the verb/preposition pairs.
>>> mydict = { }
>>> mydict["on"] = {"step":100, "rely":34, "wait":9}

That is, we have seen "step on" 100 times, "rely on" 34 times, and "wait on" 9 times. How can we access the count?

>>> mydict["on"]["rely"]
34

How this works: mydict["on"] is a dictionary, which we can then index using ["rely"].
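
To fill such a dictionary of dictionaries from data, you can combine this with the counting pattern from earlier. The sketch below assumes, purely for illustration, that the observations come as a list of (verb, preposition) pairs; the list and the name prep_verb_counts are made up. Each preposition gets its own inner dictionary, and inside it each verb gets a count.

```python
# Hypothetical input: one pair per time a verb was seen with a preposition
pairs = [("step", "on"), ("rely", "on"), ("step", "on"),
         ("look", "up"), ("step", "on"), ("look", "up")]

prep_verb_counts = {}
for verb, prep in pairs:
    if prep not in prep_verb_counts:
        prep_verb_counts[prep] = {}          # first time we see this preposition
    verb_counts = prep_verb_counts[prep]     # the inner dictionary
    if verb not in verb_counts:
        verb_counts[verb] = 0                # first time we see this verb with it
    verb_counts[verb] = verb_counts[verb] + 1

print(prep_verb_counts["on"]["step"])
# 3
```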