What do you think this will do?

    >>> text = """Pierre Vinken, 61 years old, will join the board as a nonexecutive director Nov. 29."""
    >>> words = text.split()
    >>> [w.lower() for w in words]

It gives you a list that contains a lowercased version of each item in the list 'words':

    >>> [w.lower() for w in words]
    ['pierre', 'vinken,', '61', 'years', 'old,', 'will', 'join', 'the', 'board', 'as',
    'a', 'nonexecutive', 'director', 'nov.', '29.']

You can use a list comprehension to transform each item in a list.

Try it for yourself: Write a function that takes as input a list of strings and returns a list of the same strings, but with one space before and one space after each string:

    >>> surround_by_space(["a", "b", "c"])
    [' a ', ' b ', ' c ']

Here is the solution:

    def surround_by_space(stringlist):
        return [" " + w + " " for w in stringlist]

Addressing punctuation

Punctuation
often gets in our way if we want to process words. For example, above
we had the sentence "Pierre Vinken, 61 years old, will join the board as
a nonexecutive director Nov. 29." (This sentence is famous among
computational linguists because it is the first sentence in the Wall
Street Journal corpus, which everyone uses as an example sentence.) If
we split this sentence into words to process it, it will, among other
things, contain "old,", which is unfortunately a different string from
"old" and won't be counted as the same word. So we will often want to
separate punctuation from the words.

But what is punctuation? Python has this:

    >>> import string
    >>> string.punctuation
    '!"#$%&\'()*+,-./:;<=>?@[\\]^_`{|}~'

Now how can we remove punctuation at the end of a word? Take a look at the Python string methods strip(), lstrip(), and rstrip():

    >>> " hello ".rstrip()
    ' hello'
    >>> "hello!?!".rstrip("!?")
    'hello'

If given no argument, rstrip() removes all whitespace at the end of a string. If given an argument, for example "!?", it removes all "!" and "?" characters at the end of the string.

Putting things together: Can you use for-loops or list comprehensions, string.punctuation, and rstrip() to remove punctuation?

More uses of list comprehensions

What do you think this will do?

    >>> mylist = ['for', 'a', 'minute', 'he', 'scarcely', 'realised', 'what',
    ...     'this', 'meant', 'and', 'although', 'the', 'heat', 'was', 'excessive',
    ...     'he', 'clambered', 'down', 'into', 'the', 'pit', 'close', 'to', 'the',
    ...     'bulk', 'to', 'see', 'the', 'thing', 'more', 'clearly']
    >>> mystopwords = ["the", "a", "to", "for", "he", "she", "it", "what", "and"]
    >>> [w for w in mylist if w not in mystopwords]

It only retains the members of mylist that are not stopwords. So you can use list comprehensions to filter lists:

    >>> [w for w in mylist if w not in mystopwords]
    ['minute', 'scarcely', 'realised', 'this', 'meant', 'although', 'heat', 'was',
    'excessive', 'clambered', 'down', 'into', 'pit', 'close', 'bulk', 'see',
    'thing', 'more', 'clearly']

So we have seen uses of list comprehensions that transform each item in the list, and uses that filter a list. You can do both at the same time. Here's an example task: Given a list of numbers,
drop the even numbers and square the odd ones. First we need a way to tell even from odd. What do you think these will do?

    >>> 5 % 3
    >>> 19 % 5
    >>> 3 % 2

The % operator gives you the "modulo", that is, the remainder of a division. When you divide 5 by 3, the remainder is 2, so 5 % 3 = 2. When you divide 19 by 5, the remainder is 4, so 19 % 5 = 4.

How can we use this to test whether a number is even or odd?

    >>> 4 % 2
    0
    >>> 3 % 2
    1

An even number modulo 2 is 0; an odd number modulo 2 is 1.

Putting things together:

    def drop_even_square_odd(intlist):
        return [i * i for i in intlist if i % 2 != 0]

    >>> drop_even_square_odd([5, 2018, 2, 9])
    [25, 81]

Next, we will make a function that takes an integer as input and returns a list of its digits. So we want

    >>> extract_digits(1234)
    [1, 2, 3, 4]

We will proceed as follows: Given a number myint,
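convert it to a string, turn that string into a list of its characters, and convert each character back to an integer. Calling list() on a string gives you a list of its characters:

    >>> list("hippopotamus")
    ['h', 'i', 'p', 'p', 'o', 'p', 'o', 't', 'a', 'm', 'u', 's']

Putting things together:

    def extract_digits(myint):
        mystr = str(myint)
        digit_string_list = list(mystr)
        return [int(s) for s in digit_string_list]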
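
Going back to the punctuation exercise earlier on this worksheet: here is one possible sketch of a solution, not the only one. It uses a list comprehension to transform each word, and relies on the fact that rstrip() treats its argument as a set of characters to remove. The function name strip_trailing_punctuation is made up for this example.

```python
import string

# Hypothetical helper name, chosen for this sketch only.
def strip_trailing_punctuation(words):
    """Return a new list with punctuation removed from the end of each word."""
    # rstrip() removes any run of the given characters from the right end,
    # so passing string.punctuation strips all trailing punctuation marks.
    return [w.rstrip(string.punctuation) for w in words]

text = """Pierre Vinken, 61 years old, will join the board as a nonexecutive director Nov. 29."""
words = strip_trailing_punctuation(text.lower().split())
print(words)
```

With this, 'old,' becomes 'old', so it now counts as the same word as any other 'old'. Note that the abbreviation 'nov.' also loses its period, and a word that consists only of punctuation would become an empty string, so you may want to add a filter to the comprehension.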