Worksheet: A collection of Python constructions and functions

List comprehensions

What do you think this will do?
>>> text = """Pierre Vinken, 61 years old, will join the board as a nonexecutive director Nov. 29."""
>>> words = text.split()
>>> [w.lower() for w in words]

It gives you a list, which contains a lowercased word for each item in the list 'words':
>>> [ w.lower() for w in words ]
['pierre', 'vinken,', '61', 'years', 'old,', 'will', 'join', 'the', 'board', 'as', 'a', 'nonexecutive', 'director', 'nov.', '29.']

You can use a list comprehension to transform each item in a list.
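To see what the list comprehension is doing, it may help to compare it with the longer for-loop that builds the same list one item at a time. This is only a sketch; the variable names lowered_loop and lowered_comp are made up for this illustration.

```python
text = """Pierre Vinken, 61 years old, will join the board as a nonexecutive director Nov. 29."""
words = text.split()

# Long form: start with an empty list and append one transformed item at a time.
lowered_loop = []
for w in words:
    lowered_loop.append(w.lower())

# Short form: the list comprehension produces the same list in one line.
lowered_comp = [w.lower() for w in words]

print(lowered_loop == lowered_comp)  # True
```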

Try it for yourself: Write a function that takes as input a list of strings and returns a list of the same strings, but with one space before and one space after each string:
>>> surround_by_space(["a", "b", "c"])
[' a ', ' b ', ' c ']

Here is the solution:
def surround_by_space(stringlist):
    return [" " + w + " " for w in stringlist]

Addressing punctuation (a first attempt)

Punctuation often gets in our way if we want to process words. For example, above we had the sentence "Pierre Vinken, 61 years old, will join the board as a nonexecutive director Nov. 29." (This sentence is famous among computational linguists because it is the first sentence in the Wall Street Journal corpus, which everyone uses as an example sentence.) If we split this sentence into words to process it, the result will, among other things, contain "old,", which is unfortunately a different string from "old" and won't be counted as the same word. So we will often want to separate punctuation from the words.
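The problem can be checked directly. The following small sketch splits the sentence and tests which strings are actually in the resulting list; the variable name sentence is chosen just for this illustration.

```python
sentence = "Pierre Vinken, 61 years old, will join the board as a nonexecutive director Nov. 29."
words = sentence.split()

print("old," in words)   # True: the comma is still attached to the word
print("old" in words)    # False: the bare word is not in the list
print("old," == "old")   # False: Python treats these as different strings
```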

But what is punctuation? Python has this:

>>> import string
>>> string.punctuation

Now how can we remove punctuation at the end of a word? We can use the Python string function rstrip():
>>> "    hello   ".rstrip()
'    hello'
>>> "hello!?!".rstrip("!?")
'hello'

If given no argument, rstrip() removes all whitespace at the end of a string. If given an argument, for example "!?", it removes every "!" and "?" at the end of the string, in whatever order they occur.
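A few more cases may make the behavior clearer. The argument is treated as a set of characters rather than a fixed ending, and only the right end of the string is touched. The example strings are made up for illustration; lstrip() and strip() are the standard companion methods for the left end and for both ends.

```python
# The argument is a set of characters to strip, not a fixed suffix.
print("hello!?!".rstrip("!?"))   # hello
print("hello!?!".rstrip("?!"))   # hello (the order of the characters does not matter)

# Only the right end is affected; matching characters elsewhere stay.
print("?what?".rstrip("?"))      # ?what
print("wh?at".rstrip("?"))       # wh?at

# lstrip() works on the left end, strip() on both ends.
print("?what?".lstrip("?"))      # what?
print("?what?".strip("?"))       # what
```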

Putting things together: We will use list comprehensions, string.punctuation, and rstrip() to remove punctuation.
def remove_punct(text):
    words = text.split()
    return [w.rstrip(string.punctuation) for w in words]

Here's what this will do:

>>> text = """For a minute he scarcely realised what this meant, and, although the heat was excessive, he clambered down into the pit close to the bulk to see the Thing more clearly. """
>>> remove_punct(text)
['For', 'a', 'minute', 'he', 'scarcely', 'realised', 'what', 'this', 'meant', 'and', 'although', 'the', 'heat', 'was', 'excessive', 'he', 'clambered', 'down', 'into', 'the', 'pit', 'close', 'to', 'the', 'bulk', 'to', 'see', 'the', 'Thing', 'more', 'clearly']
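Because rstrip() only looks at the right end of each word, this first attempt leaves punctuation at the start of a word untouched. The sketch below shows that with a made-up sentence, along with one possible variant that uses strip() instead. The name remove_punct_both_ends is hypothetical and not part of the worksheet's solution.

```python
import string

def remove_punct(text):
    words = text.split()
    return [w.rstrip(string.punctuation) for w in words]

def remove_punct_both_ends(text):
    # Hypothetical variant: strip() removes punctuation from both ends of each word.
    words = text.split()
    return [w.strip(string.punctuation) for w in words]

example = 'He said, "Stop!" (twice).'
print(remove_punct(example))
# ['He', 'said', '"Stop', '(twice']
print(remove_punct_both_ends(example))
# ['He', 'said', 'Stop', 'twice']
```

Neither version touches punctuation inside a word, so a string like "don't" stays as it is.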

More uses of list comprehensions

What do you think this will do?
>>> mylist = ['for', 'a', 'minute', 'he', 'scarcely', 'realised', 'what', 'this', 'meant', 'and', 'although', 'the', 'heat', 'was', 'excessive', 'he', 'clambered', 'down', 'into', 'the', 'pit', 'close', 'to', 'the', 'bulk', 'to', 'see', 'the', 'thing', 'more', 'clearly']
>>> mystopwords = ["the", "a", "to", "for", "he", "she", "it", "what", "and"]
>>> [ w for w in mylist if w not in mystopwords]

It only retains the members of mylist that are not stopwords. So you can use list comprehensions to filter lists:
>>> [ w for w in mylist if w not in mystopwords]
['minute', 'scarcely', 'realised', 'this', 'meant', 'although', 'heat', 'was', 'excessive', 'clambered', 'down', 'into', 'pit', 'close', 'bulk', 'see', 'thing', 'more', 'clearly']
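As with transforming, a filtering list comprehension corresponds to a for-loop, this time with an if inside it. Here is a sketch on a shortened version of the word list; the variable name kept is made up for this illustration.

```python
mylist = ['for', 'a', 'minute', 'he', 'scarcely', 'realised', 'what', 'this', 'meant']
mystopwords = ["the", "a", "to", "for", "he", "she", "it", "what", "and"]

# Long form: append a word only if it passes the test.
kept = []
for w in mylist:
    if w not in mystopwords:
        kept.append(w)

print(kept)  # ['minute', 'scarcely', 'realised', 'this', 'meant']
print(kept == [w for w in mylist if w not in mystopwords])  # True
```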

So we have seen uses of list comprehensions that transform each item on the list, and uses that filter a list. You can do both at the same time. Here's an example task:
Given a list of numbers,
  • If a number is even, drop it
  • If a number is odd, square it
  • And return the result as another list of numbers
First we need to figure out how to test whether a number is even. We will use the operator "%". Try the following expressions to figure out what it does:
>>> 5 % 3
>>> 19 % 5
>>> 3 % 2

The % operator gives you the remainder of a division (the "modulo"). When you divide 5 by 3, the remainder is 2, so 5 % 3 = 2. When you divide 19 by 5, the remainder is 4, so 19 % 5 = 4. How can we use this to test whether a number is even or odd?
>>> 4 % 2
>>> 3 % 2

An even number modulo 2 is 0, and an odd number modulo 2 is 1. Putting things together:

def drop_even_square_odd(intlist):
    return [i * i for i in intlist if i % 2 != 0]

>>> drop_even_square_odd([5, 2018, 2, 9])
[25, 81]
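Written out as a for-loop, the same function makes the order of the two steps visible: the if-test filters first, and only the items that pass are transformed. This is a sketch for comparison; the name drop_even_square_odd_loop is hypothetical.

```python
def drop_even_square_odd(intlist):
    return [i * i for i in intlist if i % 2 != 0]

def drop_even_square_odd_loop(intlist):
    # Hypothetical long-form version, written out for comparison.
    result = []
    for i in intlist:
        if i % 2 != 0:            # filter: keep only odd numbers
            result.append(i * i)  # transform: square what was kept
    return result

print(drop_even_square_odd_loop([5, 2018, 2, 9]))  # [25, 81]
```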

Next, we will make a function that takes an integer as input and returns a list of its digits. So we want
>>> extract_digits(1234)
[1, 2, 3, 4]

We will proceed as follows: Given a number myint,
  • First convert the number to a string:
    mystr = str(myint)

  • Then break up the string into its characters -- see below.
  • And convert the characters back to numbers. Those are the digits we want.
Here is how to break a string up into characters:
>>> list("hippopotamus")
['h', 'i', 'p', 'p', 'o', 'p', 'o', 't', 'a', 'm', 'u', 's']
So if you convert a string to a list, what you get is a list of the characters of the string.

Putting things together:

def extract_digits(myint):
    mystr = str(myint)
    digit_string_list = list(mystr)
    return [ int(s) for s in digit_string_list ]
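Since a for-loop over a string already visits its characters one by one, the intermediate list is not strictly needed. Here is a more compact sketch under the hypothetical name extract_digits_short.

```python
def extract_digits_short(myint):
    # Hypothetical compact variant: iterate over the characters of the string directly.
    return [int(ch) for ch in str(myint)]

print(extract_digits_short(1234))  # [1, 2, 3, 4]
print(extract_digits_short(0))     # [0]
```

Both versions assume a non-negative integer. For a negative number, str() produces a leading "-", which int() cannot convert.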

Counting backwards

Can you make a function that takes a string as input and returns the reverse string?
>>> my_reverse("nemo")
>>> my_reverse("aibohphobia")

aibohphobia is the fear of palindromes… See

Here are two ways to do this. The first uses the familiar idiom of starting with an empty container (here, new_string, a string) and filling it over the course of a for-loop. Note that if we iterate over a string, as in "for letter in my_string", we iterate over the letters in the string. Each new letter is attached to the beginning of new_string, in order to invert the original string.

def my_reverse(my_string):
    new_string = ""
    for letter in my_string:
        new_string = letter + new_string
    return new_string

And here is another version, which iterates over the indices of my_string backwards:

def my_reverse(my_string):
    new_string = ""
    for i in range(len(my_string) - 1, -1, -1):
        new_string += my_string[i]
    return new_string

Note the start and end point of the for-loop: We start at len(my_string) - 1, the index of the last letter. We stop before index -1, that is, the last index we visit is 0, the first letter. And we proceed in steps of -1.
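To check the start point, end point, and step, you can print the indices that range() produces. The sketch below does this for "nemo". It also shows two shorter ways to reverse a string that Python provides, slicing with a step of -1 and the built-in reversed(), included here only for comparison with the loops above.

```python
my_string = "nemo"

# The indices visited by the backwards loop: start at 3, stop before -1.
print(list(range(len(my_string) - 1, -1, -1)))  # [3, 2, 1, 0]

# Two built-in alternatives, shown only for comparison:
print(my_string[::-1])               # omen
print("".join(reversed(my_string)))  # omen
```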