Worksheet: Sorting in Python

Sorting lists

The method sort() sorts a list in place:

>>> mylist = ["armadillo", "zebra", "guppy", "cat"]

>>> mylist.sort()

>>> mylist

['armadillo', 'cat', 'guppy', 'zebra']

The function sorted(), on the other hand, leaves the original list unchanged, but returns a sorted list.

>>> mylist = ["armadillo", "zebra", "guppy", "cat"]

>>> sorted(mylist)

['armadillo', 'cat', 'guppy', 'zebra']

>>> mylist

['armadillo', 'zebra', 'guppy', 'cat']

The parameter reverse=True inverts the sorting order.

>>> sorted(mylist, reverse = True)

['zebra', 'guppy', 'cat', 'armadillo']

>>>

Sorting Python dictionaries by their values

Suppose we have a dictionary of word counts, for example

counts = { "the" : 3, "cat":2, "and":1, "chased":1, "dog":1}

If we try to sort it using the same formulation as for lists above, we get the dictionary keys in alphabetical order:

>>> sorted(counts)

['and', 'cat', 'chased', 'dog', 'the']

If we sort key/value pairs, we still get them in alphabetical order of the keys:

>>> sorted(counts.items())

[('and', 1), ('cat', 2), ('chased', 1), ('dog', 1), ('the', 3)]

We need to change the way that sorted() looks at the items that it is sorting. In particular, we want to map each key/value pair to the value, as this is what sorted() should consider for sorting. To do this, we define a function that maps pairs to the second element, and hand it on to sorted() as the sorting key function:

def second_of_pair(tuple):

    return tuple[1]

 

>>> second_of_pair( ("a", 1) )

1

>>> second_of_pair( ("the", 2) )

2

>>> sorted(counts.items(), key = second_of_pair)

[('and', 1), ('chased', 1), ('dog', 1), ('cat', 2), ('the', 3)]

To get the highest counts first, we do

>>> sorted(counts.items(), key = second_of_pair, reverse = True)

[('the', 3), ('cat', 2), ('and', 1), ('chased', 1), ('dog', 1)]

We only need the function second_of_pair() in one place: as a sorting key for sorted(). For cases like these, Python has "nameless functions". We can write

sorted(counts.items(), key = lambda pair: pair[1], reverse = True)

to get the same result as the previous call to sorted(). Note that while the separate function definition of second_of_pair() had to use the "return" statement, the lambda formulation cannot use it.