Question 14
Not complete
Marked out of
4.00
Flag question
Write a function print_word_counts2(filename) that takes the name of a file as a parameter and prints a list of all words in the document converted to lowercase plus their occurrence
counts, i.e. how many times each words appears in the file.
Words should be extracted using the following:
input_file = open(filename, 'r')
source_string = input_file.read().lower()
input_file.close()
words = re.findall('[a-zA-Z]+', source_string)
You will, of course, need to import re at the start of your program.
Words should be printed in descending order of occurrence frequency and, within that, in ascending alphabetical order where multiple words have the same occurrence count.
The example output below shows the expected result if you use engltreaty.txt as the filename.
Note: there are various different ways of going about this. You could consider using an inverted dictionaries or you might wish to read up on the key parameter to the list.sort method.
Depending on the approach you choose, you might also want to need to know that by default, tuples are sorted in increasing order of the first item, then, within that, in increasing order of the second item,
and so on.
For example:
Test
Result
print_word_counts2('engltreaty.txt') the: 52
and: 38
of: 37
to: 26
her: 14
in: 13
majesty: 11
chiefs: 10
new: 10
zealand: 10
or: 8
article: 6
tribes: 6
which: 6