3. Material: Loopy Lists¶
Everything is Recurrence¶
The evolution of programming skills is quite rapid in this course. In the first material our programs were straight as arrows. Last time we made them branch based on user input. So far our programs have only processed data as individual values of predetermined quantity. For example, we've used two variables, number_1 and number_2, to refer to exactly two numbers. However, in real life, it's very easy to come up with scenarios where the quantity of data is not predetermined. Even a predetermined quantity is not much help once the number of values grows: if we have a thousand numbers, we would need a thousand code lines to handle them with what we currently know.
This material is very ambitious as we are tackling both lists and loop structures at the same time. These will be the last important pieces of the programming puzzle. Having said that, the ambition will be cut a bit short because we're leaving a few details for the next material. Nevertheless, the tools introduced here allow you to solve almost all kinds of programming problems - in theory. In practice, real world programming involves a lot of topics beyond the basics covered in this and previous materials. Still, everything is built on the same basic principles. What comes after the basics is more often about using tools to avoid reinventing the wheel every time.
Our biggest limitation after this material will be the inability to access files on the computer's hard drive. This puts things like processing measurement data beyond us - for now. We also will be lacking the ability to modify lists. For instance, placing mines in a field stored as lists won't be quite possible yet.
Another Prompt Lesson: Questions Are Forever¶
One of the results from the last material was a program that converted US customary units to SI units. The program had one recurring prompt: "Input value to convert:". We left input checking on our todo list in the name of simplicity. However, a real program should obviously handle invalid input without crashing. Another obvious flaw with the program was having to restart it after each conversion.
Learning goals: In this section you'll learn to implement one type of loop and use it to create programs capable of repeating some code indefinitely. We'll go through the philosophy of this common construct and, naturally, show you how to implement it in Python. After this section you'll know how to make stubborn input functions that prompt for a value until the user provides a valid one.
Forced Repetition with Option to Break Out¶
Loop structures are relatives of conditional structures, and they both belong to a larger family: control structures, or control flow statements. The name has pretty clear implications about their nature: they are structures that control the program's execution in one way or another. Whereas conditional structures create branching paths, loop structures create repetition. Hence the name loop. Two varieties of loops exist in Python (and most other languages). The one we'll introduce first is quite a close relative of conditional structures. Whereas conditional structures use a condition to determine whether code underneath them should be executed or not, the corresponding loop structure uses its condition to determine how many times the code underneath is repeated - the code is repeated as long as the condition is true. The truthfulness is re-evaluated after each iteration. As code:
word = ""
while len(word) < 8:
    word = input("Write a word with at least 8 letters: ")
print("You wrote the word", word)
This condition uses the len function, which returns its argument's length (e.g. a string's length). This code would prompt the user for input until they give one that has at least 8 characters. Note that the condition is the reverse of the end condition: as long as the word string is shorter than 8, the input prompt gets repeated. In most languages, while is the loop to use in these kinds of scenarios where the loop's end condition depends on something that happens inside the loop itself. In this particular example, at the time of implementing the code, we have no way of knowing how many times a user will input a word with fewer than 8 characters. We simply have to define the loop to end exactly on the iteration when the user gives a desired string. In short, a while loop is suitable for situations where it's impossible to predict the number of iterations before executing the loop.
Back to the task at hand. We want to have a program that prompts the user for a number until they actually give a valid one. The word in this sentence that hints at loops - and the while loop in particular - is until. There's a clear need in the program to repeat certain line(s) of code, and at implementation time we cannot say how many iterations are going to be needed. A while loop is ideal for the job, but we do have a roadbump. So far we've been checking validity of numbers with a try structure. We haven't done that with conditions for a very good reason: a condition that would accept all floats Python considers valid would be extremely complex.
Jailbreak¶
In order to solve the problem at hand, we should cover all the different ways to end a loop. So far we're only aware of one: the while loop ends when its condition is no longer true. But that's not the only way. The loop also ends if there's an unhandled exception during it:
In [1]: while True:
   ...:     number = float(input("Input number: "))
   ...:
Input number: 15
Input number: donkey
---------------------------------------------------------------------------
ValueError Traceback (most recent call last)
<ipython-input-1-bd285c9d1f25> in <module>()
1 while True:
----> 2 number = float(input("Input number: "))
3
The condition of this loop is literal True so it clearly cannot end via its condition becoming false. In this example, there's an exception: the user is trying to input something that cannot be interpreted as a number. This causes the loop to end. If, instead, there was a try structure inside the loop, it would not be possible to exit the loop with an invalid input:
In [1]: while True:
   ...:     try:
   ...:         number = float(input("Input number: "))
   ...:     except ValueError:
   ...:         print("This isn't a number")
   ...:
Input number: 15
Input number: aasi
This isn't a number
Input number: 40
Input number:
There are cleaner ways to end a loop than what went down in the task above. If the loop is inside a function it will end when a return statement is encountered inside the loop. Similarly, encountering a break statement inside the loop immediately ends it. This is a new trick, but not particularly complicated. When the break statement is encountered, program execution immediately slips out of the current loop, and resumes from the first line that's after the loop. Let's modify the above example further by adding a break statement in the else branch:
In [1]: while True:
   ...:     try:
   ...:         number = float(input("Input number: "))
   ...:     except ValueError:
   ...:         print("This isn't a number")
   ...:     else:
   ...:         break
   ...:
Input number: aasi
This isn't a number
Input number: 40
The key to understanding what happens above is remembering how the else branch works: it's entered when the try branch is executed entirely without errors. Let's put this new tech into our existing code. Shown here is the length function.
def length():
    print("Select unit of length from the options below using the abbreviations")
    print("Inch (in or \")")
    print("Foot (ft or ')")
    print("Yard (yd)")
    print("Mile (mi)")
    unit = input("Input source unit: ")
    while True:
        try:
            value = float(input("Input value to convert: "))
        except ValueError:
            print("Value must be a plain number")
        else:
            break
    try:
        conversion = LENGTH_FACTORS[unit]
    except KeyError:
        print("The selected unit is not supported.")
    else:
        print(f"{value:.2f} {unit} is {value * conversion:.2f} m")
This works but is starting to look very unpleasant. Even more so when considering we'd need to put the same snippet into three other functions. Not to mention how horrible the code would look if we had to make more than one of these prompts. Luckily we can tidy things up a lot by moving this loop monstrosity to its own function. Let's call it prompt_value:
def prompt_value():
    while True:
        try:
            value = float(input("Input value to convert: "))
        except ValueError:
            print("Value must be a plain number")
        else:
            break
    return value
Of course in order to make this work as a function, we had to add return to the end, so that the value can actually be made available where the function is called. We also very recently learned that it's possible to exit loops with a return statement. Using this knowledge we can just drop the return in place of the break:
def prompt_value():
    while True:
        try:
            value = float(input("Input value to convert: "))
        except ValueError:
            print("Value must be a plain number")
        else:
            return value
Now, if we use this function in the code we had before, the result looks just as neat as it looked before we added input checking. One could claim it's even better because the function name is more descriptive of what's happening.
def length():
    print("Select unit of length from the options below using the abbreviations")
    print("Inch (in or \")")
    print("Foot (ft or ')")
    print("Yard (yd)")
    print("Mile (mi)")
    unit = input("Input source unit: ")
    value = prompt_value()
    try:
        conversion = LENGTH_FACTORS[unit]
    except KeyError:
        print("The selected unit is not supported.")
    else:
        print(f"{value:.2f} {unit} is {value * conversion:.2f} m")
The same change can be applied to every function, and everything works almost exactly as it did before. The only minor difference is having a different prompt in the temperature conversion since it used to say "Input temperature" but now it's just "Input value to convert" like all the others.
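The prompt difference noted above could be avoided by letting the caller supply the prompt text. The following is a minimal sketch, not the course's implementation: the prompt parameter and the optional ask parameter are assumptions added for illustration. The latter defaults to the built-in input and exists only so that the function can be tried with scripted answers instead of typing.

```python
def prompt_value(prompt, ask=input):
    """
    Prompts with the given text until the answer can be converted to a float.
    Returns the converted value.
    """
    while True:
        try:
            # ask is the built-in input unless the caller passes something else
            value = float(ask(prompt))
        except ValueError:
            print("Value must be a plain number")
        else:
            # return ends both the loop and the function
            return value
```

With this version the temperature function could call prompt_value("Input temperature: ") while the others keep using prompt_value("Input value to convert: ").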
Looped for Life¶
Another feature that would vastly enhance the user experience of our program: not exiting after each conversion. The user has gone through some trouble to even get to the point where they can do a conversion. It'd be polite to allow them to do multiple conversions without repeating each step every time. We don't even need any new tricks to achieve this. Let's start with the main program that we left in this state:
print("This program converts US customary units to SI units")
print("Available features:")
print("(L)ength")
print("(M)ass")
print("(V)olume")
print("(T)emperature")
print()
choice = input("Make your choice: ").strip().lower()
if choice == "l" or choice == "length":
    length()
elif choice == "m" or choice == "mass":
    mass()
elif choice == "v" or choice == "volume":
    volume()
elif choice == "t" or choice == "temperature":
    temperature()
else:
    print("The selected feature is not available")
Making the program repeat itself is quite easily achieved with a simple while loop:
print("This program converts US customary units to SI units")
print("Available features:")
print("(L)ength")
print("(M)ass")
print("(V)olume")
print("(T)emperature")
print()
while True:
    choice = input("Make your choice: ").strip().lower()
    if choice == "l" or choice == "length":
        length()
    elif choice == "m" or choice == "mass":
        mass()
    elif choice == "v" or choice == "volume":
        volume()
    elif choice == "t" or choice == "temperature":
        temperature()
    else:
        print("The selected feature is not available")
Unfortunately it's now impossible for the user to exit the program without using software violence (i.e. keyboard interrupt). Previously the program exited by itself, but now exiting has to be programmed into it. We need to add a menu option to quit the program. In line with the previous options we choose a single letter shortcut for it: q for quit. We've already learned that break can be used to exit a loop. In this case it will also exit the program simply because there's nothing left to execute after the loop. Let us add another branch to the conditional structure for quit, and add it to the instructions:
print("This program converts US customary units to SI units")
print("Available features:")
print("(L)ength")
print("(M)ass")
print("(V)olume")
print("(T)emperature")
print("(Q)uit")
print()
while True:
    choice = input("Make your choice: ").strip().lower()
    if choice == "l" or choice == "length":
        length()
    elif choice == "m" or choice == "mass":
        mass()
    elif choice == "v" or choice == "volume":
        volume()
    elif choice == "t" or choice == "temperature":
        temperature()
    elif choice == "q" or choice == "quit":
        break
    else:
        print("The selected feature is not available")
Now the user isn't kicked out of the program after a conversion:
This program converts US customary units to SI units
Available features:
(L)ength
(M)ass
(V)olume
(T)emperature
(Q)uit

Make your choice: l
Select unit of length from the options below using the abbreviations
Inch (in or ")
Foot (ft or ')
Yard (yd)
Mile (mi)
Input source unit: yd
Input value to convert: 12
12.000 yd is 10.973 m
Make your choice:
Returning the user all the way to the main menu is also a bit rude in case they wanted to do another length conversion. A similar loop should be included in every function. We also need to decide which input to use for exiting the function. We're going to choose the unit choice because it's prompted first, and doesn't require changes for the value prompt function. Instead of prompting the user to give a specific letter to return, we're going to use empty input to indicate it. There's no conditional structure to add this option to, so we'll just add one:
def length():
    print("Select unit of length from the options below using the abbreviations")
    print("Inch (in or \")")
    print("Foot (ft or ')")
    print("Yard (yd)")
    print("Mile (mi)")
    while True:
        unit = input("Input source unit: ")
        if not unit:
            break
        value = prompt_value()
        try:
            conversion = LENGTH_FACTORS[unit]
        except KeyError:
            print("The selected unit is not supported.")
        else:
            print(f"{value:.2f} {unit} is {value * conversion:.2f} m")
Note how the break is placed inside a simple "if not unit:" conditional statement above the value prompt. With this simple combination of two lines the loop ends immediately when the user gives an empty input. It works as expected:
This program converts US customary units to SI units
Available features:
(L)ength
(M)ass
(V)olume
(T)emperature
(Q)uit

Make your choice: l
Select unit of length from the options below using the abbreviations
Inch (in or ")
Foot (ft or ')
Yard (yd)
Mile (mi)
Input source unit: mi
Input value to convert: 10
10.000 mi is 16093.440 m
Input source unit:
Make your choice: q
If it wasn't proven before, this example clearly shows how both break and return exit a loop immediately. Even the ongoing iteration is interrupted midway. Therefore the value is never prompted.
Listful Transition¶
Next we're using the ongoing example to transition to the next topic - very smoothly, promise. (Way) earlier we had an idea about inputting the value and the unit at the same time. It could work like this:
Enter value and unit to convert: 12 yd
12.000 yd is 10.973 m
Learning goals: This section teaches you what exactly is a list and how it's related to types we've studied previously. You'll also learn how to create lists by splitting strings and how to handle potential issues in the process.
Splitting Hairs¶
So the big question is: how exactly can we take out two different values from a single string? As usual, the exact solution depends on the context, but the most common way is to use the string split method. It's a method that splits the string using an argument as the separator. The separator can be a string with any number of characters, and it will be left out from the results. For instance, a decimal number could be separated into its integer and decimal parts by splitting at the period:
In [1]: "12.54".split(".")
Out[1]: ['12', '54']
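The point about separators with more than one character can be tried with a short sketch. The string below is made up for this illustration; only the split call itself is the technique described above.

```python
# The separator is the three-character string " - " (space, hyphen, space)
parts = "Alcest - Kodama - 2016".split(" - ")

# The separator is left out of the results entirely
print(parts)       # ['Alcest', 'Kodama', '2016']
print(len(parts))  # 3
```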
Split also supports an optional argument for limiting how many splits will be done (at most). This feature is particularly useful for separating file extensions from the rest of the filename. Periods are perfectly valid characters in filenames: music files can contain them (e.g. P.H.O.B.O.S. - 03 - Wisdoom.ogg), and installation packages often have their version number in the name (e.g. the Python installer python-3.8.1.exe). In these particular scenarios we would use the rsplit method. It's the same as split except it starts splitting from the end.
In [1]: filename = "donkey_bridge_2.5.7.zip"
In [2]: filename.rsplit(".", 1)
Out[2]: ['donkey_bridge_2.5.7', 'zip']
This example only does one split. Because the method is rsplit, splitting starts from the end. This allows the code to separate the file extension from the rest of the name. The value returned by this method, enclosed in square braces, is known as a list. We've actually seen them before, as return values of the dir function:
In [1]: dir("")
Out[1]:
['__add__',
'__class__',
'__contains__',
'__delattr__',
'__dir__',
'__doc__',
'__eq__',
'__format__',
'__ge__',
'__getattribute__',
'__getitem__',
'__getnewargs__',
'__gt__',
'__hash__',
'__init__',
'__iter__',
'__le__',
'__len__',
'__lt__',
'__mod__',
'__mul__',
'__ne__',
'__new__',
'__reduce__',
'__reduce_ex__',
'__repr__',
'__rmod__',
'__rmul__',
'__setattr__',
'__sizeof__',
'__str__',
'__subclasshook__',
'capitalize',
'casefold',
'center',
'count',
'encode',
'endswith',
'expandtabs',
'find',
'format',
'format_map',
'index',
'isalnum',
'isalpha',
'isdecimal',
'isdigit',
'isidentifier',
'islower',
'isnumeric',
'isprintable',
'isspace',
'istitle',
'isupper',
'join',
'ljust',
'lower',
'lstrip',
'maketrans',
'partition',
'replace',
'rfind',
'rindex',
'rjust',
'rpartition',
'rsplit',
'rstrip',
'split',
'splitlines',
'startswith',
'strip',
'swapcase',
'title',
'translate',
'upper',
'zfill']
Just like quotation characters denote a string or curly braces a dictionary, square braces denote a list. Out of all types we know so far, these two are also the closest relatives to lists - both in their own ways. Like strings, lists are also sequences, as in a sequence of individual values. Like dictionaries, lists are mutable data structures. We had the following goal in mind:
Enter value and unit to convert: 12 yd
12.000 yd is 10.973 m
With what we just learned, we should be able to make a list of the user's input by using the split method, using space as the separator:
value_unit = input("Enter value and unit to convert: ").split(" ")
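Splitting with a single space as the separator does not always give a list of exactly two items. The sketch below uses fixed strings in place of real user input (an assumption made so that it runs without typing) to show what kinds of lists can come out.

```python
# split returns a list of strings; the separator itself is left out
normal = "12 yd".split(" ")
no_unit = "12".split(" ")
double_space = "12  yd".split(" ")
extra_item = "12 yd m".split(" ")

print(normal)        # ['12', 'yd']
print(no_unit)       # ['12'] - only one item, the unit is missing
print(double_space)  # ['12', '', 'yd'] - two spaces produce an empty string
print(extra_item)    # ['12', 'yd', 'm'] - one item more than expected
```

Any code that uses the result therefore has to be prepared for a list that is shorter or longer than two items.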
Lists 101¶
A very short crash course on lists can be given with a four word description: ordered collection of values. Each of the three significant words tells something crucial about lists:
- Ordered. A list is an ordered collection of values: things contained in a list are in a determined order and they can be referenced via their position. The reference is defined as the distance from the beginning of the list (we'll soon see how). Lists can be reordered by the program.
- Collection. A list is an ordered collection of values: a list contains a determined set of values; their number can be anything upwards from zero, within limits of memory. Values can be added and removed.
- Values. A list is an ordered collection of values: a list contains values (i.e. objects), in other words anything we've encountered so far can be put into a list - including lists. Values in a list are called items.
In comparison: a dictionary is a (mostly) unordered collection of values - the selling point of lists is the ability to control the ordering of contents. In current versions of Python dictionaries do also have a determined order, but it cannot be changed. From the perspective of syntax, a list is denoted with square braces, containing values separated by commas. And as we've learned with functions and function calls, each comma should be followed by a single space. Lists can contain numbers:
In [1]: results = [12.54, 5.12, 38.14, 9.04]
or strings:
In [2]: members = ["Haruna", "Tomomi", "Mami", "Rina"]
or even functions, as a reminder that even functions are actually objects:
In [3]: functions = [max, abs, min, round]
or you can stuff dictionaries into a list, and a list definition can be split into multiple lines too:
In [4]: measurements = [
   ...:     {"value": 4.64, "unit": "in"},
   ...:     {"value": 13.54, "unit": "yd"}
   ...: ]
and, like we promised, lists can be made of lists:
In [5]: [results, members, functions, measurements]
Out[5]:
[[12.54, 5.12, 38.14, 9.04],
['Haruna', 'Tomomi', 'Mami', 'Rina'],
[<function max>,
<function abs>,
<function min>,
<function round>],
[{'value': 4.64, 'unit': 'in'}, {'value': 13.54, 'unit': 'yd'}]]
The last example shows what lists look like inside a list. Unlike strings that use the same character at the beginning and end, lists use different characters (left and right brace). This makes it possible to start another list definition while inside a previous list definition with a second left brace. Of course it then becomes the programmer's responsibility to also close all the lists they have started. Otherwise the result is the same syntax error as when forgetting to close all braces on a line that contains nested function calls.
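The three properties listed earlier in this section can be checked with a few lines. This is only an illustrative sketch: it reuses the results and members lists from above, while the empty and everything names are made up here.

```python
results = [12.54, 5.12, 38.14, 9.04]
members = ["Haruna", "Tomomi", "Mami", "Rina"]
empty = []
everything = [results, members, empty]

# Collection: the number of items can be anything upwards from zero
print(len(results))     # 4
print(len(empty))       # 0

# Values: a list inside a list counts as one item
print(len(everything))  # 3

# Ordered: the same items in a different order make a different list...
print([1, 2] == [2, 1])                            # False
# ...whereas the order of key-value pairs does not matter when comparing dictionaries
print({"a": 1, "b": 2} == {"b": 2, "a": 1})        # True
```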
Listful Indexing¶
One common error in thinking goes as follows: if there are variables in the vein of number_1, number_2 etc., could variables like this be created dynamically so that the code would generate variables all the way to number_n? The short answer is: NO (long answer: sort of yes, but it involves overly clever tricks that will mess up your code). A similar result is achievable, however; the question is just slightly wrong. The correct question would be: how do we maintain a non-predetermined number of values? The answer to this question is: a list. A list can store N values, and they are kept in an order. Compare this:
In [1]: number_1 = 13
In [2]: number_2 = 6
In [3]: number_3 = 24
to this:
In [4]: numbers = [13, 6, 24]
The latter solution is markedly better because it doesn't tie the number that refers to each value (13, 6 or 24) to a variable name. But how would we access each individual number in the list solution? Items in a list are in a determined order. This order is based on each item's distance from the beginning of the list so that the first item's (value 13) distance is 0. In order to refer to the first item of the list, we would use the number 0. The numbers that indicate this distance have a name: index. It works very similarly to dictionary keys.
Referring with index is done with the following syntax:
In [5]: numbers[0]
Out[5]: 13
In [6]: numbers[1]
Out[6]: 6
In [7]: numbers[2]
Out[7]: 24
The square braces at the end of a list have the same meaning as key lookup: subscription via index. The critical difference is that while dictionary keys can be arbitrary immutable values, list indices are always integers going from 0 to list length - 1 (i.e. 0,...,N-1). This isn't actually exclusive to lists - it works for other sequences too, like strings. For instance, if you want to check the first letter of a word:
In [1]: word = "donkeyswings"
In [2]: word[0]
Out[2]: 'd'
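The index range 0,...,N-1 described above can be tried out with a short sketch. It reuses the numbers list and the word string from the examples; the last_index variable is a name made up for this illustration.

```python
numbers = [13, 6, 24]

# The last valid index is always the length minus one
last_index = len(numbers) - 1
print(numbers[last_index])   # 24

# An index equal to the length is already past the end of the list
try:
    numbers[len(numbers)]
except IndexError:
    print("Index 3 is past the end of a three-item list")

# The same rule holds for strings
word = "donkeyswings"
print(word[len(word) - 1])   # s
```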
Whether we're talking about lists or strings, indices must always be integers. Even a float with a decimal part of zero is not a valid index. This is the first time we absolutely need the int function to convert floats into integers. This could be the case in a scenario where a program is picking items from a list using some form of mathemagics involving floats (could be as simple as the middle value from a list with odd length). In this scenario it would be mandatory to convert the result into an integer before it could be used as a list index. Subscription can be used in achieving what we wanted:
Enter value and unit to convert: 12 yd
12.000 yd is 10.973 m
Using subscription with indices we can pick each item from the list produced by the split method. For example, we could assign them to variables.
value_unit = input("Enter value and unit to convert: ").split(" ")
value = value_unit[0]
unit = value_unit[1]
There's also a more direct way to assign the results of split to variables:
value, unit = input("Enter value and unit to convert: ").split(" ")
This way works best when the values are needed as strings. With this we would need one extra line to convert the value to a float. You should also note that using this form of assignment results in a different exception if the split doesn't result in a list with the expected length:
---------------------------------------------------------------------------
ValueError                                Traceback (most recent call last)
----> 1 value, unit = input("Enter value and unit to convert: ").split(" ")

ValueError: not enough values to unpack (expected 2, got 1)
When strings are partitioned it's always important to check that the split produces the correct number of results. So if we want to ask both value and unit at the same time, the prompt function needs some changes. We need to add one new except branch to the try structure. This isn't very revolutionary - it's very similar to adding more elif branches to a conditional structure. It turns out that try structures can have an indefinite number of excepts. Furthermore, like in conditional structures, only the first except branch that matches the encountered exception is executed. In addition to changing the exception handling, the function must now return two values.
def prompt_value_and_unit():
    while True:
        try:
            value_unit = input("Enter value and unit to convert: ").split(" ")
            value = float(value_unit[0])
            unit = value_unit[1]
        except ValueError:
            print("Value must be a plain number")
        except IndexError:
            print("You must enter the value and its unit separated by a space")
        else:
            break
    return value, unit
Because it now has two return values, the function call on the other end must also be modified. We've also changed the function's name to be more descriptive of what it does. The function call would look like this:
value, unit = prompt_value_and_unit()
We went through the trouble of developing this, but we're actually not going to use it. It would introduce new problems, particularly because we used to use the unit input for determining when to go back to the main menu. We could make it work but there isn't much to learn in the process right now. We'll just leave it here as an example of how to handle exceptions when prompting multiple values from an input with the split method.
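To see which except branch is entered for which kind of input, the same try structure can be tried on fixed strings instead of real prompts. The function below is a hypothetical helper written only for this illustration (its name and the choice to return None on failure are assumptions, not part of the converter program).

```python
def parse_value_and_unit(text):
    """
    Splits a string like "12 yd" into a float value and a unit string.
    Returns (value, unit), or None if the text is not in the expected form.
    """
    parts = text.split(" ")
    try:
        value = float(parts[0])   # may raise ValueError
        unit = parts[1]           # may raise IndexError
    except ValueError:
        print("Value must be a plain number")
        return None
    except IndexError:
        print("You must enter the value and its unit separated by a space")
        return None
    return value, unit


print(parse_value_and_unit("12 yd"))      # (12.0, 'yd')
print(parse_value_and_unit("donkey yd"))  # ValueError branch, then None
print(parse_value_and_unit("12"))         # IndexError branch, then None
```

Only one except branch runs per attempt: the try branch stops at the first line that fails, so an input like "donkey" never reaches the line that could raise IndexError.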
A Compilation of Lists¶
We've come as far as we can with the unit converter. We're moving on to a new example: a command line program that is used for managing collections. The example itself is for managing music album collections but is easily adaptable for some other collectibles. The basic features of the program are adding, removing and modifying albums.
Learning goals: This section teaches you more about lists - particularly how to add things to them. You'll also learn that lists are mutable and what it means in practice. Comment lines are also introduced officially, and we'll show how they can be used to make the code easier to read.
Requesting Comment¶
There are situations - quite often - when real code is not yet available and some sort of replacement is called for. These replacements should look like the real thing from the outside. In our example we should already introduce two functions: load_collection and save_collection. The latter can just do literally nothing with the pass keyword as its sole content. The first one should return a list that represents the collection, with a few sample albums inside it. This way we can implement the rest of the program even though we don't have real data. At this moment we also need to decide what the collection data will look like. We're going to agree on five fields for each album: artist, album title, number of tracks, total length, and release year. Because these fields have distinct names, a dictionary is an ideal fit for representing them. Like this:
{
    "artist": "Monolithe",
    "album": "Nebula Septem",
    "no_tracks": 7,
    "length": "49:00",
    "year": 2018
}
Until we have actual albums in the collection, we can use comments to remind us what keys were agreed upon. Comments are lines in a code file that the Python interpreter ignores entirely when it's reading the code. They can therefore contain additional information for the human who is reading the code, like for your future self. They can also be used in planning: we can write a short description of what unimplemented parts of the code are supposed to do when they're ready. When doing this it's possible to skip some details at the start of the project, and still be able to come back and actually remember what it should eventually do. Comments can also act as reminders about what data structures contain:
def load_collection():
    # keys:
    # artist, album, no_tracks, length, year
    collection = []
    return collection
The lines beginning with # are comments. When Python encounters this character outside a string it immediately stops reading the line and moves on to the next one. The code example above contains just three lines that are executed: function definition, creating an empty list, and returning said list.
Comments are also useful in solving bugs in the program. By temporarily putting the comment character at the beginning of an existing line of code, the line's execution can be skipped without having to entirely delete it. This can also sometimes be used to skip broken lines to see if the rest of the program works as intended.
Comments also have a relative, documentation string, or docstring as we're going to call it. The use of docstrings is more standardized than comments. A docstring is typically attached to a function. It is, in fact, that exact string you will see when using the help function in the Python console. At minimum a docstring should describe what the function does, what arguments it expects/accepts and what it returns. A docstring is usually delimited with triple quotes and it must be immediately below the function definition line before any actual code lines.
def load_collection():
    """
    Creates a test collection. Returns a list that contains dictionaries of
    five key-value pairs.
    Dictionary keys match the following information:
    "artist" - name of the album artist
    "album" - title of the album
    "no_tracks" - number of tracks
    "length" - total length
    "year" - release year
    """
    # keys:
    # artist, album, no_tracks, length, year
    collection = []
    return collection
If we write a main program with one line, help(load_collection), the code will have the following output when it's run:
Help on function load_collection in module __main__:

load_collection()
    Creates a test collection. Returns a list that contains dictionaries of
    five key-value pairs.
    Dictionary keys match the following information:
    "artist" - name of the album artist
    "album" - title of the album
    "no_tracks" - number of tracks
    "length" - total length
    "year" - release year
This looks quite familiar. Using docstrings is an extremely good habit, especially if someone else will look at your code - including future you who may not recall what you were thinking when writing the code. In this course this holds especially for the course projects that are checked by assistants manually by reading your code. In bigger projects docstrings can also be used in combination with specific documentation tools. One such tool is Sphinx, which can create very neat documentation pages from docstrings that use a certain syntax.
Before moving on, let's add a stub function for save_collection, and add some albums to the list returned by the load_collection function. Only part of the test collection definition is shown below - defining a list that contains dictionaries becomes quite a few lines of code after all. The comment about the fields can now be removed since the same information is contained in the docstring.
def load_collection():
    """
    Creates a test collection. Returns a list that contains dictionaries of
    five key-value pairs.
    Dictionary keys match the following information:
    "artist" - name of the album artist
    "album" - title of the album
    "no_tracks" - number of tracks
    "length" - total length
    "year" - release year
    """
    collection = [
        {
            "artist": "Alcest",
            "album": "Kodama",
            "no_tracks": 6,
            "length": "42:15",
            "year": 2016
        },
        {
            "artist": "Canaan",
            "album": "A Calling to Weakness",
            "no_tracks": 17,
            "length": "1:11:17",
            "year": 2002
        },
        {
            "artist": "Deftones",
            "album": "Gore",
            "no_tracks": 11,
            "length": "48:13",
            "year": 2016
        },
        # rest is cut, the code example itself defines 8 more
    ]
    return collection

def save_collection(collection):
    pass
Now we can create a main program draft where the user can select the program's basic functions, and add corresponding stub functions.
def add(collection):
    pass

def remove(collection):
    pass

def show(collection):
    pass

collection = load_collection()
print("This program manages an album collection. You can use the following features:")
print("(A)dd new albums")
print("(R)emove albums")
print("(S)how the collection")
print("(Q)uit")
while True:
    choice = input("Make your choice: ").strip().lower()
    if choice == "a":
        add(collection)
    elif choice == "r":
        remove(collection)
    elif choice == "s":
        show(collection)
    elif choice == "q":
        break
    else:
        print("The chosen feature is not available.")
save_collection(collection)
The collection is loaded at the beginning of the program, and (if the function did anything) saved at the end. We've also added a stub for each function that is called in the main program so that the code can be executed. Docstrings have been left out because we are about to implement the functions.
Lists Growing Before Your Eyes¶
Unlike dictionaries, lists do not support adding new values by assigning to a non-existing index. The example shows adding a new key-value pair to a dictionary, and then what happens if a similar operation is attempted for lists:
In [1]: d = {"a": 1, "b": 2, "c": 3}
In [2]: d["d"] = 4
In [3]: d
Out[3]: {'a': 1, 'b': 2, 'c': 3, 'd': 4}
In [4]: numbers = [213, 12, 45]
In [5]: numbers[3] = 53
---------------------------------------------------------------------------
IndexError Traceback (most recent call last)
<ipython-input-5-8f3b48cbce55> in <module>
----> 1 numbers[3] = 53
IndexError: list assignment index out of range
If this doesn't work, what does? Put very shortly, items are added to lists using the append method. It appends a new item to the end of the list. Short explanation, short example:
In [1]: numbers = [213, 12, 45]
In [2]: numbers.append(34)
In [3]: print(numbers)
[213, 12, 45, 34]
The difference is largely due to how list items exist in an order defined by their indices. The indices are managed by the list itself, so the coder doesn't need to know the exact place where the new item will end up. Meanwhile, in dictionaries, the keys are arbitrary, and therefore each key must be explicitly specified - Python cannot guess what the key should be. On the other hand, these two are similar in the sense that they are both mutable types. This means that appending does not create a new copy of the list. Likewise, if multiple variables refer to the same list, the added item will appear in all of them. This is also the first time in the course where a new reference is created without the assignment operator.
Let's see what the help function tells us about append:
append(...) method of builtins.list instance
    L.append(object) -> None -- append object to end
Typically these help texts show the type of a function's or method's return value on the right side of the arrow. In this case the return value is None, which is a special value that means "nothing". It's an empty object that is only equal to itself, and its boolean equivalent is False. Functions and methods that do not contain a return statement, or have no values in their return statement, will implicitly return None.
Because lists are mutable just like dictionaries, adding values also behaves similarly. The animation illustrates this process for lists.
There does exist a way to add new values to a list so that a new list is created. This can be done with the + operator, putting two lists together to form a new one:
In [1]: numbers_1 = [1, 2, 3]
In [2]: numbers_2 = numbers_1
In [3]: numbers_2 = numbers_2 + [4]
In [4]: print(numbers_1, numbers_2)
[1, 2, 3] [1, 2, 3, 4]
This is not seen very often, partly because needlessly copying lists is a bit expensive, but mostly because in most cases it's far more practical that all parts of the program access the same list. So it is in our collection management program. When the add function adds a new album to the collection, the added albums will be visible in the main program as well as in any other functions, and we don't need to worry about accidentally using an outdated copy of the list. For the same reason, the add function doesn't need to return the list - it manipulates the one and only list we have in memory, the one that was loaded with the load_collection function.
With this information we're able to implement the add function.
def add(collection):
    print("Fill the information for a new album. Leave album title empty to stop.")
    while True:
        title = input("Album name: ")
        if not title:
            break
        artist = input("Artist name: ")
        no_tracks = prompt_number("Number of tracks: ")
        length = prompt_time("Total length: ")
        year = prompt_number("Release year: ")
        collection.append({
            "artist": artist,
            "album": title,
            "no_tracks": no_tracks,
            "length": length,
            "year": year
        })
        print("Album added")
This function adds new albums until the user inputs an empty album name. The new thing we've learned is shown in the append method call that we've split into multiple lines. It adds a new dictionary with the album information to the collection list. Note also the lack of return in the function. It's not needed because, like we just learned, the function modifies an existing list. The code also shows calls to two functions that don't exist yet. Of these two, prompt_number is a close relative of the prompt_value function from earlier:
def prompt_number(prompt):
    while True:
        try:
            number = int(input(prompt))
        except ValueError:
            print("Input an integer")
        else:
            return number
There are two differences: first of all, this is used for number of tracks and release year, so it makes more sense to only accept integers. Second is the bigger change: we've changed the input function's argument from a literal string to a variable. Doing this allows us to use the same function for prompting all integers in the program. The prompt string will now be provided as an argument when this function is called. This is a very useful trick for any program that has several similar prompts (like this one where we prompt integers repeatedly). The try structure that looks out for exceptions is only needed in one place, but at the same time we can still change the prompt that is shown to the user by changing the argument.
We're going to leave the prompt_time function as a stub for now. The validity of time input is pondered later.
def prompt_time(prompt):
    return input(prompt)
With all of this, the function is pretty neat:
This program manages an album collection. You can use the following features:
(A)dd new albums
(R)emove albums
(S)how the collection
(Q)uit
Make your choice: a
Fill the information for a new album. Leave album title empty to stop.
Album name: Lead and Aether
Artist name: Skepticism
Number of tracks: 12.5
Input an integer
Number of tracks: 6
Total length: 47:49
Release year: 1997
Album added
Album name:
Make your choice: q
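The point that add doesn't need to return the list can also be checked in isolation. The sketch below is only an illustration: add_album is a hypothetical helper that is not part of the collection program, and it appends to whatever list it is given.

```python
def add_album(collection, title):
    # Hypothetical helper for illustration only: it appends to the list it
    # receives and has no return statement.
    collection.append({"album": title})


albums = []
same_albums = albums           # a second name for the same list, not a copy
add_album(albums, "Kodama")

print(albums)                  # the caller's list has grown
print(same_albums is albums)   # both names refer to one list object
print(albums.append({"album": "Gore"}))  # append itself returns None
```

Because both names refer to the same list, the main program sees every album that a function appends, which is why the add function can do its job without returning anything.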
Looping with Abandon¶
In our next episode lists get new friends: loops that can iterate through them. Our goal here is to implement one feature: printing of lists.
Learning goals: After this section you know what for loops are and how they are connected to lists. We'll also show you a new string method that can make really pretty prints.
Runthrough¶
We'll start with printing because it is a gentler introduction to for loops. The goal is to implement the contents of the show function in our example program. The very first thing to try is to see what happens if we just print the list.
def show(collection):
    print(collection)
That seems easy. Did we really need a section for this? Well, yes and no, but mostly yes. As you can see the results are not particularly pleasing to read (manual line splitting was done at 80 characters):
This program manages an album collection. You can use the following features:
(A)dd new albums
(R)emove albums
(S)how the collection
(Q)uit
Make your choice: s
[{'artist': 'Alcest', 'album': 'Kodama', 'no_tracks': 6, 'length': '0:42:15', '
year': 2016}, {'artist': 'Canaan', 'album': 'A Calling to Weakness', 'no_tracks
': 17, 'length': '1:11:17', 'year': 2002}, {'artist': 'Deftones', 'album': 'Gor
e', 'no_tracks': 11, 'length': '0:48:13', 'year': 2016}, {'artist': 'Funeralium
', 'album': 'Deceived Idealism', 'no_tracks': 6, 'length': '1:28:22', 'year': 2
013}, {'artist': 'IU', 'album': 'Modern Times', 'no_tracks': 13, 'length': '0:4
7:14', 'year': 2013}, {'artist': 'Mono', 'album': 'You Are There', 'no_tracks':
6, 'length': '1:00:01', 'year': 2006}, {'artist': 'Panopticon', 'album': 'Roads
to the North', 'no_tracks': 8, 'length': '1:11:07', 'year': 2014}, {'artist': '
PassCode', 'album': 'Clarity', 'no_tracks': 13, 'length': '0:49:27', 'year': 20
19}, {'artist': 'Scandal', 'album': 'Hello World', 'no_tracks': 13, 'length': '
0:53:22', 'year': 2014}, {'artist': 'Slipknot', 'album': 'Iowa', 'no_tracks': 1
4, 'length': '1:06:24', 'year': 2001}, {'artist': 'Wolves in the Throne Room',
'album': 'Thrice Woven', 'no_tracks': 5, 'length': '0:42:19', 'year': 2017}]
A more desirable result would look something like this:
1. Alcest - Kodama (2016) [6] [42:15]
2. Canaan - A Calling to Weakness (2002) [17] [1:11:17]
3. Deftones - Gore (2016) [11] [48:13]
4. Funeralium - Deceived Idealism (2013) [6] [1:28:22]
5. IU - Modern Times (2013) [13] [47:14]
6. Mono - You Are There (2006) [6] [1:00:01]
7. Panopticon - Roads to the North (2014) [8] [1:11:07]
8. PassCode - Clarity (2019) [13] [49:27]
9. Scandal - Hello World (2014) [13] [53:22]
10. Slipknot - Iowa (2001) [14] [1:06:24]
11. Wolves in the Throne Room - Thrice Woven (2017) [5] [42:19]
Each album is on its own line without any messy braces or quotes. We know that by default each print call in the code produces one line in the output. The logical conclusion would be to call print for each item in the list separately. This material has also shown us that repetition is usually done with loops. However, so far we only know of while loops, which are generally used when the number of iterations cannot be deduced in advance. In our current case the number of iterations can be deduced: the program can count the number of items in the list before looping through it.
Python uses for loops specifically for these cases. Their expertise is in going through lists and other sequences. These also share a common name: iterable, which hints at how they are objects that can be iterated through. There's a wide variety of these. Lists are a given, but there are also strings and tuples, the latter of which we'll introduce properly later. In addition to these iterable types, there are special iterable objects like generators and enumerate.
Python's for loop executes all of the statements contained within it for each item in a given iterable. A very typical use case is to apply an operation or series of operations to each item in a list. It fits like a glove for printing each item from a list. Before applying it to our more complex example list, let's examine some details. We'll use the simplest possible example in the console:
In [1]: animals = ["dog", "cat", "squirrel", "walrus", "donkey", "llama"]
In [2]: for animal in animals:
   ...:     print(animal)
   ...:
dog
cat
squirrel
walrus
donkey
llama
If nothing else, this example proves that a for loop indeed does repeat the statement(s) contained within it for each item in the target list. The loop itself is declared as follows, by starting a line with the proper keyword:
for animal in animals:
In addition to the for keyword itself, the in operator is also a mandatory part of a for loop declaration. This statement can be read as "for each animal in the animals sequence". Just like functions have their named parameters, this example also has something similar: the animal variable - we'll call it the loop variable - represents the current item of the list on each iteration. It gets the values "dog", "cat", "squirrel", "walrus", "donkey", and "llama", in this order, during the loop's execution. The value changes on each iteration.
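A small sketch of how the loop variable behaves, assuming a shortened animal list. It collects the value of the loop variable on each iteration, prints the variable once more after the loop has ended, and then repeats the idea with a string to show that other iterables work the same way. The names seen and letters are only used for this illustration.

```python
animals = ["dog", "cat", "squirrel"]

seen = []
for animal in animals:
    seen.append(animal)   # animal refers to a different item on each iteration

print(seen)
print(animal)             # after the loop, animal still holds the last item

# A string is iterable too: the loop variable gets one character at a time.
letters = []
for letter in "cat":
    letters.append(letter)

print(letters)
```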
The animation also shows a small side detail: the loop variable does not cease to exist at the end of the loop - it just retains its last value. In this sense it is markedly different from function parameters, which only exist for the function's lifetime. There are no real use cases for this feature; it's just something to keep in mind to avoid nasty surprises. The best practice is to always use loop variable names that do not overlap with other names inside the function's scope or in the global scope.
By now we know how to declare a for loop, and how its loop variable can be used to do things to items in a list. Let's try to put this loop tech into our show function to print each item separately.
def show(collection):
    for album in collection:
        print(album)
With this the printout looks like this:
{'artist': 'Alcest', 'album': 'Kodama', 'no_tracks': 6, 'length': '42:15', 'year': 2016}
{'artist': 'Canaan', 'album': 'A Calling to Weakness', 'no_tracks': 17, 'length': '1:11:17', 'year': 2002}
{'artist': 'Deftones', 'album': 'Gore', 'no_tracks': 11, 'length': '48:13', 'year': 2016}
{'artist': 'Funeralium', 'album': 'Deceived Idealism', 'no_tracks': 6, 'length': '1:28:22', 'year': 2013}
{'artist': 'IU', 'album': 'Modern Times', 'no_tracks': 13, 'length': '47:14', 'year': 2013}
{'artist': 'Mono', 'album': 'You Are There', 'no_tracks': 6, 'length': '1:00:01', 'year': 2006}
{'artist': 'Panopticon', 'album': 'Roads to the North', 'no_tracks': 8, 'length': '1:11:07', 'year': 2014}
{'artist': 'PassCode', 'album': 'Clarity', 'no_tracks': 13, 'length': '49:27', 'year': 2019}
{'artist': 'Scandal', 'album': 'Hello World', 'no_tracks': 13, 'length': '53:22', 'year': 2014}
{'artist': 'Slipknot', 'album': 'Iowa', 'no_tracks': 14, 'length': '1:06:24', 'year': 2001}
{'artist': 'Wolves in the Throne Room', 'album': 'Thrice Woven', 'no_tracks': 5, 'length': '42:19', 'year': 2017}
This is markedly better but still a far cry from the beautiful printout we had planned.
List Exposure¶
Printing lists and other data structures is always a challenge, and the best solution will always depend on what we want the program to do. One factor that limits options is list length. If the length is not fixed, the ways in which it can be printed are more limited. In our case the length of the collection list does vary, which is why we print it with a loop. For the dictionaries contained within the list we can use f-strings.
def show(collection):
    for album in collection:
        print(
            f"{album['artist']}, {album['album']}, {album['no_tracks']}, "
            f"{album['length']}, {album['year']}"
        )
This is one step closer to the desired output but there's still something missing, like the ordinals at the beginning of each line. One way to add them would be to introduce a counter inside the loop:
def show(collection):
    ordinal = 1
    for album in collection:
        print(
            f"{ordinal:2}. "
            f"{album['artist']}, {album['album']}, {album['no_tracks']}, "
            f"{album['length']}, {album['year']}"
        )
        ordinal += 1
In other words we introduce an integer variable with an initial value of 1, and then we add 1 to it on each iteration. However, there is a more convenient way to make indices available inside a for loop. The ordinal is basically the same as the index of an item, + 1. Let's write the code first and wonder about it after:
def show(collection):
    for i, album in enumerate(collection):
        print(
            f"{i + 1:2}. "
            f"{album['artist']}, {album['album']}, {album['no_tracks']}, "
            f"{album['length']}, {album['year']}"
        )
First question: what is the enigmatic enumerate? The answer, as usual, can be figured out in the Python console. Our example has a bit too much stuff per item, so let's examine it using the animal list from earlier:
In [1]: animals = ["dog", "cat", "squirrel", "walrus", "donkey", "llama"]
In [2]: enumerate(animals)
Out[2]: <enumerate at 0x7fe27e0e3168>
That's, uh, useful information. It turns out enumerate produces a generator-like object that can be iterated with a loop but isn't a data structure. Roughly speaking, it behaves like a function that produces the next value in a sequence each time it's asked for one. We don't do much with generators right now, but they - and enumerate objects - can be converted into lists. Take 2:
In [3]: list(enumerate(animals))
Out[3]:
[(0, 'dog'),
(1, 'cat'),
(2, 'squirrel'),
(3, 'walrus'),
(4, 'donkey'),
(5, 'llama')]
The values contained within parentheses inside the list are data structures called tuples. We haven't talked about them yet. Luckily there isn't much to tell either: they are an immutable version of lists. Basically, a tuple can be read like a list but it cannot be changed. If we iterate through a list that contains tuples, on each iteration the loop variable is assigned a tuple as its value. However, it's perfectly legal to have multiple loop variables, and it works just like a function call that stores multiple return values. In other words, when we know with certainty that each item in the list contains two items, they can be split into two variables like we already did in the example. The same with animals:
In [4]: for i, animal in enumerate(animals):
   ...:     print(f"{i}. {animal}")
   ...:
0. dog
1. cat
2. squirrel
3. walrus
4. donkey
5. llama
This shows that both i and animal have new values on each loop iteration. The chosen loop variable name, i, is a typical choice in loops like this in pretty much all programming languages. If there are multiple nested loops, the next picks are j and then k. Of course using longer names is also allowed.
Shown below are two different ways to write the same functionality. This should help in understanding what's going on in the enumerate example. Shown first is splitting the loop variable on a separate line instead of using two loop variables. This adds one extra line:
In [5]: for item in enumerate(animals):
   ...:     i, animal = item
   ...:     print(f"{i}. {animal}")
Shown second is using subscription on the print line itself. This makes the print line itself less readable:
In [6]: for item in enumerate(animals):
   ...:     print(f"{item[0]}. {item[1]}")
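As a side note that goes slightly beyond this material: Python's enumerate also accepts an optional start argument, so the "+ 1" can be moved out of the print line. The sketch below assumes a shortened animal list; the lines list exists only so that the result can be inspected afterwards.

```python
animals = ["dog", "cat", "squirrel"]

lines = []
# start=1 makes the counting begin from 1 instead of the default 0.
for ordinal, animal in enumerate(animals, start=1):
    lines.append(f"{ordinal}. {animal}")

for line in lines:
    print(line)
```

The example program keeps using i + 1, which does the same thing; which one reads better is mostly a matter of taste.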
There's one more technique for printing lists that's particularly neat when we want to print lists of unknown length on one line. For a simple example, let's consider a program where the user inputs some words one by one. The words are put into a list and one of the program's features is to print out all of the words. Because the number of words in the list is not fixed, formatting cannot be used - we wouldn't know how many placeholders to use. Not to worry, there's another method that is perfectly suitable for this scenario: join. Using it looks a bit odd compared to what we've seen so far.
In more specific terms, the join method works by joining together items from a list (or another sequence) by using a string as a "connector". In this sense it's like the reverse of split. Also like split, join is a string method. This is why the syntax when using it can look a bit unintuitive: the connector string is first on the line, and the list itself isn't seen until we get into the method call arguments. The reason they put it this way is most likely the fact that while join can be used for any sequence, the connector must always be a string - therefore the method has been attached to the string type rather than implementing it for each sequence type separately.
The join method is often so handy that it can also be used for lists with a fixed length. In addition to producing generally more compact code than formatting, join is also more dynamic and therefore far less likely to break due to changes elsewhere in the code. Our collection program uses dictionaries, but we could use join if we also used the values method for each dictionary to retrieve the values without the keys. Almost:
---------------------------------------------------------------------------
TypeError Traceback (most recent call last)
/media/sf_virtualshare/OA/M3/collection.py in <module>
148 remove(collection)
149 elif choice == "s":
--> 150 show(collection)
151 elif choice == "q":
152 break
/media/sf_virtualshare/OA/M3/collection.py in show(collection)
133 ordinal = 1
134 for i, album in enumerate(collection):
--> 135 print(f"{i + 1:2}. {', '.join(album.values())}")
136
137 collection = load_collection()
TypeError: sequence item 2: expected str instance, int found
This shows an important restriction regarding join: all items in the list (or other sequence) must be strings. We have two potential solutions to this issue: a) use another way; b) convert all items in the list to strings before joining them. It would also be possible to just include all numbers in the dictionary as strings but then they would no longer be valid for calculations (if we had any planned). Once again it depends on the context what the best option is.
Our example program will go back to the solution that used f-strings:
def show(collection):
    for i, album in enumerate(collection):
        print(
            f"{i + 1:2}. "
            f"{album['artist']}, {album['album']}, {album['no_tracks']}, "
            f"{album['length']}, {album['year']}"
        )
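For reference, here is a minimal sketch of join on its own, followed by one way option b) could look. The words list and the single album dictionary are made up for this illustration, and converting values with str inside a loop is just one possible approach, not the one the example program uses.

```python
# Joining a list of strings: the connector string comes first.
words = ["donkey", "llama", "walrus"]
joined_words = ", ".join(words)
print(joined_words)

album = {
    "artist": "Alcest",
    "album": "Kodama",
    "no_tracks": 6,
    "length": "42:15",
    "year": 2016
}

# Option b): convert every value to a string first, then join them.
values_as_strings = []
for value in album.values():
    values_as_strings.append(str(value))

line = ", ".join(values_as_strings)
print(line)
```

The numbers inside the dictionary stay as integers; only the copies placed in values_as_strings are strings, so the data remains usable for calculations and sorting.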
New World Order¶
This is what we've got so far for our catalog program. If you've followed the code's progress by updating your own version of it when we've made changes, you should have similar code on your computer. You may have defined the functions in a different order, but everything else should be the same.
def prompt_number(prompt):
    while True:
        try:
            number = int(input(prompt))
        except ValueError:
            print("Input an integer")
        else:
            return number

def prompt_time(prompt):
    return input(prompt)

def load_collection():
    """
    Creates a test collection. Returns a list that contains dictionaries of
    five key-value pairs.
    Dictionary keys match the following information:
    "artist" - name of the album artist
    "album" - title of the album
    "no_tracks" - number of tracks
    "length" - total length
    "year" - release year
    """
    collection = [
        {
            "artist": "Alcest",
            "album": "Kodama",
            "no_tracks": 6,
            "length": "42:15",
            "year": 2016
        },
        {
            "artist": "Canaan",
            "album": "A Calling to Weakness",
            "no_tracks": 17,
            "length": "1:11:17",
            "year": 2002
        },
        {
            "artist": "Deftones",
            "album": "Gore",
            "no_tracks": 11,
            "length": "48:13",
            "year": 2016
        },
        # rest is cut, the code example itself defines 8 more
    ]
    return collection

def save_collection(collection):
    pass

def add(collection):
    print("Fill the information for a new album. Leave album title empty to stop.")
    while True:
        title = input("Album name: ")
        if not title:
            break
        artist = input("Artist name: ")
        no_tracks = prompt_number("Number of tracks: ")
        length = prompt_time("Total length: ")
        year = prompt_number("Release year: ")
        collection.append({
            "artist": artist,
            "album": title,
            "no_tracks": no_tracks,
            "length": length,
            "year": year
        })
        print("Album added")

def remove(collection):
    pass

def show(collection):
    for i, album in enumerate(collection):
        print(
            f"{i + 1:2}. "
            f"{album['artist']}, {album['album']}, {album['no_tracks']}, "
            f"{album['length']}, {album['year']}"
        )

collection = load_collection()
print("This program manages an album collection. You can use the following features:")
print("(A)dd new albums")
print("(R)emove albums")
print("(S)how the collection")
print("(Q)uit")
while True:
    choice = input("Make your choice: ").strip().lower()
    if choice == "a":
        add(collection)
    elif choice == "r":
        remove(collection)
    elif choice == "s":
        show(collection)
    elif choice == "q":
        break
    else:
        print("The chosen feature is not available.")
save_collection(collection)
In terms of how much code there is, this is like a third or half of the course project size. Our next mission is to implement a new feature that leads us to new things in the domain of lists: sorting and slicing content. Let's add a new function we can start to work on:
def organize(collection):
    pass
We can also already add this feature to the main menu:
collection = load_collection()
print("This program manages an album collection. You can use the following features:")
print("(A)dd new albums")
print("(R)emove albums")
print("(S)how the collection")
print("(O)rganize the collection")
print("(Q)uit")
while True:
    choice = input("Make your choice: ").strip().lower()
    if choice == "a":
        add(collection)
    elif choice == "r":
        remove(collection)
    elif choice == "s":
        show(collection)
    elif choice == "o":
        organize(collection)
    elif choice == "q":
        break
    else:
        print("The chosen feature is not available.")
save_collection(collection)
Learning goals: We'll investigate how lists can be organized with the sort method and suitable helper functions. We'll also learn some new things about how strings are sorted.
Order Please¶
We need to sort our collection. What's more, we want to make it possible for the user to select the field to sort by and whether they want the sorting in ascending or descending order. All of this can be done with the list method sort.
In [1]: list_1 = [37, 5, 12]
In [2]: list_1.sort()
In [3]: list_1
Out[3]: [5, 12, 37]
By itself this method sorts the list's items in ascending order. In other words, smallest item first, and biggest item last. For numbers it's pretty easy to understand how this works. For strings, ordering is based on alphabetical order:
In [1]: animals = ["walrus", "duck", "donkey", "llama", "koala", "dromedary", "moose"]
In [2]: animals.sort()
In [3]: animals
Out[3]: ['donkey', 'dromedary', 'duck', 'koala', 'llama', 'moose', 'walrus']
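Descending order was mentioned as a goal above. As a hedged aside, the sort method has an optional reverse argument for this; the sketch below uses shortened example lists to show it.

```python
animals = ["walrus", "duck", "donkey", "llama"]
animals.sort(reverse=True)   # biggest item first, smallest last
print(animals)

numbers = [37, 5, 12]
numbers.sort(reverse=True)
print(numbers)
```

Just like a plain sort call, this modifies the existing list in place and returns None.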
When the items being sorted are themselves lists, their ordering is based on the first item of each list by default. If the first items are the same, the second items are looked at, and so on. However, our collection contains dictionaries. Trying to sort them without any additional arguments to sort causes a TypeError exception. For this reason the sort method has an optional argument called key. This argument accepts a function that is used for deriving a comparison value from each item before the sorting itself is done. Let's start by looking at a simple example, using a builtin Python function in this role. Sometimes you get interesting results when sorting lists that contain numbers as strings:
In [1]: numbers = ["2", "12", "5", "43", "48"]
In [2]: numbers.sort()
In [3]: numbers
Out[3]: ['12', '2', '43', '48', '5']
If the goal was to sort numbers based on their numerical value, this is obviously not the desired outcome. This can be fixed by using the int function as the key argument:
In [4]: numbers.sort(key=int)
In [5]: numbers
Out[5]: ['2', '5', '12', '43', '48']
This is the first time in the entire learning material where we use a function's name in real code without it being followed by parentheses to call it. This is a specific mechanism in programming called a callback. In this mechanism, instead of calling a function normally, the function is provided to another part of the program to tell that part which function it should call when the time comes. In order for callbacks to work, the function must always have the number and types of arguments and return values that are expected by the part of the program that needs to call it.
In this case the part of the program that will call the function is the sort method. Its key argument can be used for defining a callback. The method will call whichever function has been assigned to its key parameter when it needs a comparison value for each item in the list. In this case the callback function must be one that receives exactly one argument and returns exactly one value. Python doesn't come with a function that would fit our purposes. We need a function that chooses one field from a dictionary to represent the whole dictionary.
Since the function we provide as the key argument can only have one argument (the item), we cannot create a function that receives the sorting field as an argument. The only way that we currently have for implementing this is to create a function for each separate field. It's not ideal but with what we know it's the best we can do. If you're interested in better solutions, you can always find them in the Python documentation. These helper functions that our code will use as key arguments are rather simple:
def select_artist(album):
    return album["artist"]
The function returns the value of the "artist" key of each dictionary to be used in sorting. The other four are similar:
def select_title(album):
    return album["album"]

def select_no_tracks(album):
    return album["no_tracks"]

def select_length(album):
    return album["length"]

def select_year(album):
    return album["year"]
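To see such a helper working as a callback, here is a minimal sketch with a made-up three-album list that only has two fields. The second half uses itemgetter from Python's operator module, which is one of the ready-made alternatives the documentation offers; it is shown here as an aside and is not used in this material's example program.

```python
from operator import itemgetter


def select_year(album):
    return album["year"]


albums = [
    {"artist": "Mono", "year": 2006},
    {"artist": "Alcest", "year": 2016},
    {"artist": "Slipknot", "year": 2001},
]

# The function name is given without parentheses: sort calls it for each item.
albums.sort(key=select_year)
by_year = []
for album in albums:
    by_year.append(album["artist"])
print(by_year)

# itemgetter("artist") builds a function that behaves like select_artist.
albums.sort(key=itemgetter("artist"))
by_artist = []
for album in albums:
    by_artist.append(album["artist"])
print(by_artist)
```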
The organize function itself follows a pretty familiar pattern where we first ask the user which field they want to sort by, and then use a conditional structure to choose the code that executes it:
def organize(collection):
    print("Choose a field to use for sorting the collection by inputting the corresponding number")
    print("1 - artist")
    print("2 - album title")
    print("3 - number of tracks")
    print("4 - album length")
    print("5 - release year")
    field = input("Choose field (1-5): ")
    if field == "1":
        collection.sort(key=select_artist)
    elif field == "2":
        collection.sort(key=select_title)
    elif field == "3":
        collection.sort(key=select_no_tracks)
    elif field == "4":
        collection.sort(key=select_length)
    elif field == "5":
        collection.sort(key=select_year)
    else:
        print("Field doesn't exist")
Since we had the foresight to use integer as the type for the number of tracks field, the albums already get sorted correctly by it. However, trying to sort by length gives somewhat problematic results:
 1. Mono, You Are There, 6, 1:00:01, 2006
 2. Slipknot, Iowa, 14, 1:06:24, 2001
 3. Panopticon, Roads to the North, 8, 1:11:07, 2014
 4. Canaan, A Calling to Weakness, 17, 1:11:17, 2002
 5. Funeralium, Deceived Idealism, 6, 1:28:22, 2013
 6. Alcest, Kodama, 6, 42:15, 2016
 7. Wolves in the Throne Room, Thrice Woven, 5, 42:19, 2017
 8. IU, Modern Times, 13, 47:14, 2013
 9. Deftones, Gore, 11, 48:13, 2016
10. PassCode, Clarity, 13, 49:27, 2019
11. Scandal, Hello World, 13, 53:22, 2014
While the order is mostly from shortest to longest, any albums over an hour long are considered shorter than the ones under an hour by the sort method. This is a problem that has been visited before: the lengths of albums over an hour start with the "1" character, which sorts before any other digit except zero. For sorting purposes it would make more sense if all lengths included hours even when they'd be zero. For the exact same reason programmers like dates written as year-month-day - as it happens, they sort correctly by default as long as all numbers have leading zeros (e.g. 2015-07-22).
The first step in solving the problem is to add the zero hours into the collection dictionary. At this point we just write them in manually.
collection = [
    {
        "artist": "Alcest",
        "album": "Kodama",
        "no_tracks": 6,
        "length": "0:42:15",
        "year": 2016
    },
    {
        "artist": "Canaan",
        "album": "A Calling to Weakness",
        "no_tracks": 17,
        "length": "1:11:17",
        "year": 2002
    },
    {
        "artist": "Deftones",
        "album": "Gore",
        "no_tracks": 11,
        "length": "0:48:13",
        "year": 2016
    },
    # rest is cut, the code example itself defines 8 more
]
This by itself doesn't guarantee anything though. The user can still input lengths without hours in them. In general it should be the code that fixes issues like this, not the user. Conveniently, we already have a function for prompting the length - it just doesn't do a whole lot yet. Since it exists, we can just add a couple of things:
- Checking that the input is valid
- Adding zero hours if needed
This can be done by splitting the user's input by colons and examining each part individually. We are going to make a relatively sane assumption that all albums are less than 10 hours long. This means one zero is sufficient for hours, and albums longer than an hour don't need leading zeros.
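Before the full function, the underlying problem and the two building blocks can be tried on their own. This is only a sketch with hard-coded example strings: it first shows why the string comparison goes wrong, then splits a length by colons and rebuilds it with zero hours and two-digit padding.

```python
# Strings are compared character by character, so "1..." sorts before "4...".
wrong_order = "1:00:01" < "42:15"
fixed_order = "1:00:01" < "0:42:15"
print(wrong_order, fixed_order)

# Splitting by colons gives the parts as strings.
parts = "42:5".split(":")
print(parts)

# Converting to integers and formatting with :02 pads to two digits.
minutes = int(parts[0])
seconds = int(parts[1])
padded = f"0:{minutes:02}:{seconds:02}"
print(padded)
```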
def prompt_time(prompt):
    while True:
        parts = input(prompt).split(":")
        if len(parts) == 3:
            h, m, s = parts
        elif len(parts) == 2:
            # No hours were given, so zero hours are added
            m, s = parts
            h = "0"
        else:
            print("Input the time as hours:minutes:seconds or minutes:seconds")
            continue
        try:
            h = int(h)
            m = int(m)
            s = int(s)
        except ValueError:
            print("Times must be integers")
            continue
        if not (0 <= m <= 59):
            print("Minutes must be between 0 and 59")
            continue
        if not (0 <= s <= 59):
            print("Seconds must be between 0 and 59")
            continue
        if h < 0:
            print("Hours can't be negative")
            continue
        # All checks passed: minutes and seconds get leading zeros
        return f"{h}:{m:02}:{s:02}"
Well, that wasn't so simple. The function became a bit long because we have to check each part individually: that they are numbers, and that they are within the valid range.
We've also snuck a new keyword, continue, into this example. Whereas break interrupts the execution of the entire loop, continue interrupts only the current iteration and skips to the next one. In this example it means that as soon as we hit a check that fails, the code jumps back to prompt for a new input. We've used continue here because otherwise all of these checks would have to be nested, and it would look rather unpleasant.
Note also how we've placed return at the end of the while loop. When the code gets an input that passes all checks, the result is returned, which also exits the loop entirely.
Generally speaking, continue is not used particularly often in loops. It's only really needed in situations like this where the code makes multiple checks, each of which can cause skipping to the next iteration - whether it's prompting for the next input in a while loop, or moving on to the next item in a for loop. A continue can always be replaced by nested conditional structures. It's just a useful tool that can sometimes lead to prettier code than the alternative.
After all these changes, the user can now add albums that are shorter than an hour, and sorting by length works correctly.
This program manages an album collection. You can use the following features:
(A)dd new albums
(R)emove albums
(S)how the collection
(O)rganize the collection
(Q)uit
Make your choice: a
Fill the information for a new album. Leave album title empty to stop.
Album name: Rotten Tongues
Artist name: Curse Upon a Prayer
Number of tracks: 9
Total length: 43:17
Release year: 2015
Album added
Album name:
Make your choice: o
Choose a field to use for sorting the collection by inputting the corresponding number
1 - artist
2 - album title
3 - number of tracks
4 - album length
5 - release year
Choose field (1-5): 4
Make your choice: s
1. Alcest, Kodama, 6, 0:42:15, 2016
2. Wolves in the Throne Room, Thrice Woven, 5, 0:42:19, 2017
3. Curse Upon a Prayer, Rotten Tongues, 9, 0:43:17, 2015
4. IU, Modern Times, 13, 0:47:14, 2013
5. Deftones, Gore, 11, 0:48:13, 2016
6. PassCode, Clarity, 13, 0:49:27, 2019
7. Scandal, Hello World, 13, 0:53:22, 2014
8. Mono, You Are There, 6, 1:00:01, 2006
9. Slipknot, Iowa, 14, 1:06:24, 2001
10. Panopticon, Roads to the North, 8, 1:11:07, 2014
11. Canaan, A Calling to Weakness, 17, 1:11:17, 2002
12. Funeralium, Deceived Idealism, 6, 1:28:22, 2013
This example also shows that zero hours are correctly added.
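As a side note on continue, the claim that it can always be replaced by nested conditional structures can be tried out in a for loop. The sketch below is not part of the collection program: both function names and the test data are made up for illustration, and the checks are much simpler than the ones in prompt_time.

```python
def valid_lengths_with_continue(candidates):
    valid = []
    for text in candidates:
        parts = text.split(":")
        if len(parts) != 3:
            continue  # wrong number of parts, move on to the next item
        if not (parts[0].isdigit() and parts[1].isdigit() and parts[2].isdigit()):
            continue  # some part is not a number, move on to the next item
        valid.append(text)
    return valid


def valid_lengths_nested(candidates):
    valid = []
    for text in candidates:
        parts = text.split(":")
        # The same checks written as nested conditional structures
        if len(parts) == 3:
            if parts[0].isdigit() and parts[1].isdigit() and parts[2].isdigit():
                valid.append(text)
    return valid


candidates = ["0:42:15", "42:15", "1:xx:07", "1:11:17"]
print(valid_lengths_with_continue(candidates))
print(valid_lengths_nested(candidates))
```

Both functions keep the same two strings. With only two checks the nested version is still readable, but every additional check would push the last line one level deeper, which is where continue starts to pay off.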
The last sorting feature we're going to implement before moving on is the ability to choose ascending or descending order. This can be achieved with the other optional argument of the sort method, reverse, which has the default value of False. If we want, we can change it to True, which reverses the order (from ascending to descending). All we need to do is to add a prompt about the order:
def organize(collection):
    print("Choose a field to use for sorting the collection by inputting the corresponding number")
    print("1 - artist")
    print("2 - album title")
    print("3 - number of tracks")
    print("4 - album length")
    print("5 - release year")
    field = input("Choose field (1-5): ")
    order = input("Order; (a)scending or (d)escending: ").lower()
    if order == "d":
        reverse = True
    else:
        reverse = False
    if field == "1":
        collection.sort(key=select_artist, reverse=reverse)
    elif field == "2":
        collection.sort(key=select_title, reverse=reverse)
    elif field == "3":
        collection.sort(key=select_no_tracks, reverse=reverse)
    elif field == "4":
        collection.sort(key=select_length, reverse=reverse)
    elif field == "5":
        collection.sort(key=select_year, reverse=reverse)
    else:
        print("Field doesn't exist")
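The reverse argument can also be tried on its own, without any prompts. The following is a minimal sketch with a cut-down collection; select_year is written here under the assumption that it looks like the key functions defined earlier in the material, and the lists ascending and descending exist only to make the result easy to see.

```python
def select_year(album):
    # Assumed to look like the key functions defined earlier in the material
    return album["year"]


albums = [
    {"album": "Kodama", "year": 2016},
    {"album": "Iowa", "year": 2001},
    {"album": "Modern Times", "year": 2013},
]

albums.sort(key=select_year)  # reverse is False by default
ascending = []
for album in albums:
    ascending.append(album["album"])
print(ascending)

albums.sort(key=select_year, reverse=True)
descending = []
for album in albums:
    descending.append(album["album"])
print(descending)
```

The first print starts from the oldest album and the second one from the newest.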
So we prompt for another input and make the ordering choice based on that. This is done by setting the reverse variable to either True or False. In this case we've chosen that any input that is not d or D will set the order to ascending because it's the default.
Economizing Printouts¶
The last task in this material is to show how list slicing can be used in limiting how much is printed. If the collection grows to substantial size, browsing the output can become quite an ordeal. We're going to implement a rudimentary solution that shows 20 results at a time, whenever the user presses enter. We'll also do some final touch-ups to the output.
Learning goals: This section should give you some idea about how list slicing is used. We're also going to show you a special type of for loop that is used for repeating the code a certain number of times.
Paginated Lists¶
Our goal is to print 20 items from the collection at a time, and wait until the user presses enter before printing the next 20. Let's start with the easiest part, which is figuring out how to get the first 20 items from a list. In order to keep the first example concise we're going to use a smaller number and a shorter list - time for the animals list to make a comeback. We'll take the first three animals:
In [1]: animals = ["walrus", "duck", "donkey", "llama", "koala", "dromedary", "moose"]
In [2]: top3 = animals[:3]
In [3]: top3
Out[3]: ['walrus', 'duck', 'donkey']
The new thing is on the second line. The [:3] notation indicates slicing from the beginning of the list until the index 3. The first index to be included in the slice is placed on the left side of the colon. If it's not there, 0 is used. On the right is the first index that will not be included in the slice. If it's not given, everything until the end of the list is included. We would get the same result if the slice was written as [0:3]. Slicing is nice in the sense that it doesn't complain about going outside the list:
In [4]: animals[10:15]
Out[4]: []
Conveniently this example also shows how to write a slice that doesn't begin at the start of the list. Slicing can be used to make the show function print only the first 20 albums from the collection:
def show(collection):
    for i, album in enumerate(collection[:20]):
        print(
            f"{i + 1:2}. "
            f"{album['artist']}, {album['album']}, {album['no_tracks']}, "
            f"{album['length']}, {album['year']}"
        )
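The slicing rules described above can be collected into one short experiment with the animals list. The variable names are made up for this sketch; the comments state which indices each slice covers.

```python
animals = ["walrus", "duck", "donkey", "llama", "koala", "dromedary", "moose"]

first_three = animals[:3]   # same as animals[0:3], indices 0, 1 and 2
middle = animals[2:5]       # indices 2, 3 and 4; index 5 is not included
rest = animals[5:]          # from index 5 until the end of the list
past_end = animals[5:20]    # an end index beyond the list is not an error

print(first_three)
print(middle)
print(rest)
print(past_end)
```

The last two slices are identical, which is why the show function doesn't need to check whether the collection actually has 20 albums in it.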
Numbers in a Range¶
In order to print the next 20 items we need two things:
- We have to find out how many 20 item prints have to be done.
- We need to make a loop that gets as many iterations.
Let's start with the first problem. A list's length can be found out with the len function. After that it's a matter of simple division to figure out how many 20 item chunks are in there. The only problem is rounding. If there are exactly 20 albums, one print is sufficient, but if there are 21, two prints are required. The math module has a ceil function that solves the problem rather neatly by always rounding up to the next integer:
import math

def show(collection):
    pages = math.ceil(len(collection) / 20)
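To see the rounding in action, the same calculation can be tried with a few different collection sizes. The function name count_pages and its parameters are hypothetical, they just wrap the one line of math so that it can be called repeatedly.

```python
import math


def count_pages(n_items, per_page):
    # Division gives a float, ceil rounds it up to the next integer
    return math.ceil(n_items / per_page)


for n_items in [0, 1, 20, 21, 40, 41]:
    print(n_items, "albums need", count_pages(n_items, 20), "prints")
```

An empty collection needs zero prints, and every album past a multiple of 20 adds one more.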
The result of this calculation can be used in declaring the new printing loop. The loop needs to have a certain number of iterations, and on each iteration, 20 albums are printed. A loop that's executed a definite number of times is often implemented with a specific type of for loop. In this loop, the right hand operand of the in operator is not an existing list. Instead, it's a range object that's been specifically created for the purpose. It's an object that produces a desired range of numbers:
In [1]: numbers = range(10)
In [2]: for number in numbers:
   ...:     print(number)
   ...:
0
1
2
3
4
5
6
7
8
9
If we go through the numbers from 0 to 9, it naturally leads to 10 iterations. The range object can be put directly in the loop declaration, and usually is:
In [1]: for number in range(10):
   ...:     print(number)
   ...:
0
1
2
3
4
5
6
7
8
9
The argument to range can of course be a variable instead of a literal value. When we apply this knowledge, the new show function is starting to shape up:
PER_PAGE = 20

# other function definitions have been cut

def show(collection):
    pages = math.ceil(len(collection) / PER_PAGE)
    for i in range(pages):
        start = i * PER_PAGE
        end = (i + 1) * PER_PAGE
        format_page(collection[start:end])
The core idea of this code is to print a new slice from the list on each iteration. The slice starts at the iteration index times 20 (the first index is 0) and ends at the iteration index + 1 times 20. This produces slices 0:20, 20:40, 40:60 etc. We assigned the number 20 to a constant to make the code easier to modify later. As for the format_page function that is called in this example, it's actually just a copy of what the show function used to be, just with a more accurate name for the parameter.
def format_page(lines):
    for i, album in enumerate(lines):
        print(
            f"{i + 1:2}. "
            f"{album['artist']}, {album['album']}, {album['no_tracks']}, "
            f"{album['length']}, {album['year']}"
        )
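The slice boundaries are easier to trust after seeing them with real numbers. The sketch below uses a hypothetical helper, page_slices, that follows the same index math as show but collects the pages into a list instead of printing them. PER_PAGE is set to 5 here only to keep the example short.

```python
import math

PER_PAGE = 5  # small page size so the example stays short


def page_slices(items):
    # Hypothetical helper: same index math as in show,
    # but the slices are collected instead of printed
    pages = math.ceil(len(items) / PER_PAGE)
    result = []
    for i in range(pages):
        start = i * PER_PAGE
        end = (i + 1) * PER_PAGE
        result.append(items[start:end])
    return result


numbers = list(range(1, 12))  # 11 items: 1, 2, ..., 11
for page in page_slices(numbers):
    print(page)
```

Eleven items produce two full pages and a third page with just the last item. The last slice is 10:15, which reaches beyond the end of the list, but as we saw earlier, slicing doesn't mind.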
This is all great. The result just isn't any different from before because printing doesn't actually stop after each chunk of 20. Because the input function pauses program execution until the user gives their input, we can use it in a clever way to solve this problem.
def show(collection):
    pages = math.ceil(len(collection) / PER_PAGE)
    for i in range(pages):
        start = i * PER_PAGE
        end = (i + 1) * PER_PAGE
        format_page(collection[start:end])
        if i < pages - 1:
            input(" -- press enter to continue --")
As a small detail, the if statement before the input is used to skip the prompt when the last page is printed. The example below shows how this code works, but we changed the PER_PAGE constant to 5 because we don't actually have over 20 albums in the collection currently.
This program manages an album collection. You can use the following features:
(A)dd new albums
(R)emove albums
(S)how the collection
(O)rganize the collection
(Q)uit
Make your choice: s
1. Alcest, Kodama, 6, 0:42:15, 2016
2. Canaan, A Calling to Weakness, 17, 1:11:17, 2002
3. Deftones, Gore, 11, 0:48:13, 2016
4. Funeralium, Deceived Idealism, 6, 1:28:22, 2013
5. IU, Modern Times, 13, 47:14, 2013
-- press enter to continue --
1. Mono, You Are There, 6, 1:00:01, 2006
2. Panopticon, Roads to the North, 8, 1:11:07, 2014
3. PassCode, Clarity, 13, 49:27, 2019
4. Scandal, Hello World, 13, 53:22, 2014
5. Slipknot, Iowa, 14, 1:06:24, 2001
-- press enter to continue --
1. Wolves in the Throne Room, Thrice Woven, 5, 42:19, 2017
That went pretty well, at least if we ignore the fact that our numbering starts from 1 on every "page". In order to solve this problem we need to smuggle a second argument to the format_page function: the page number. It is used inside the f string to offset the ordinal by the number of albums on the previous pages:
def format_page(lines, page_n):
    for i, album in enumerate(lines):
        print(
            f"{page_n * PER_PAGE + i + 1:2}. "
            f"{album['artist']}, {album['album']}, {album['no_tracks']}, "
            f"{album['length']}, {album['year']}"
        )
Just to show you some new things, there's another perfectly valid place for the index math in this function. It turns out that enumerate accepts a second argument as well, one that can be used to change the starting number from zero to something else:
def format_page(lines, page_n):
    for i, album in enumerate(lines, page_n * PER_PAGE + 1):
        print(
            f"{i:2}. "
            f"{album['artist']}, {album['album']}, {album['no_tracks']}, "
            f"{album['length']}, {album['year']}"
        )
In this example the change doesn't really matter, but if the ordinal was used more than once inside the loop, the advantages would be way more apparent. All that's left is to hand a value to this new parameter from the show function:
def show(collection):
    pages = math.ceil(len(collection) / PER_PAGE)
    for i in range(pages):
        start = i * PER_PAGE
        end = (i + 1) * PER_PAGE
        format_page(collection[start:end], i)
        if i < pages - 1:
            input(" -- press enter to continue --")
There's one thing we didn't really make a fuss about because it should be obvious by now but... Did you notice how we used the same variable i in both functions? This does not give us any trouble because, once again, these are separate variables in separate scopes. Just a reminder in case you've forgotten one of the advantages of using functions.
Now the result is what we wanted:
1. Alcest, Kodama, 6, 0:42:15, 2016
2. Canaan, A Calling to Weakness, 17, 1:11:17, 2002
3. Deftones, Gore, 11, 0:48:13, 2016
4. Funeralium, Deceived Idealism, 6, 1:28:22, 2013
5. IU, Modern Times, 13, 47:14, 2013
-- press enter to continue --
6. Mono, You Are There, 6, 1:00:01, 2006
7. Panopticon, Roads to the North, 8, 1:11:07, 2014
8. PassCode, Clarity, 13, 49:27, 2019
9. Scandal, Hello World, 13, 53:22, 2014
10. Slipknot, Iowa, 14, 1:06:24, 2001
-- press enter to continue --
11. Wolves in the Throne Room, Thrice Woven, 5, 42:19, 2017
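The second argument of enumerate can also be examined in isolation. The helper below, numbered, is hypothetical: it does the same start number math as format_page but returns the numbered lines as a list so that the effect of the page number is easy to see. PER_PAGE is 5 as in the example run.

```python
PER_PAGE = 5


def numbered(lines, page_n):
    # Hypothetical helper: the start value given to enumerate
    # is the ordinal of the first line on the page
    result = []
    for i, line in enumerate(lines, page_n * PER_PAGE + 1):
        result.append(f"{i:2}. {line}")
    return result


print(numbered(["walrus", "duck"], 0))
print(numbered(["llama", "koala"], 1))
```

On page 0 the numbering starts from 1, and on page 1 it starts from 6. Because the start value already includes the + 1, the placeholder uses i as it is.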
Finishing Touches¶
The prints are admittedly still a bit unseemly, and zero hours are visible. Let's close this chapter by modifying the new format_page function. All the changes are applied to the f string:
def format_page(lines, page_no):
    for i, album in enumerate(lines, page_no * PER_PAGE + 1):
        print(
            f"{i:2}. "
            f"{album['artist']} - {album['album']} ({album['year']}) "
            f"[{album['no_tracks']}] [{album['length'].lstrip('0:')}]"
        )
The formatting template has been rewritten. The lstrip used for length removes zero hours from the length if it finds them, and actually also removes zero minutes if the album is very short. This is because its argument is treated as a set of characters to remove from the left end, not as one exact prefix (you should examine how strip works very carefully!). The result is almost beautiful:
1. Alcest - Kodama (2016) [6] [42:15]
2. Canaan - A Calling to Weakness (2002) [17] [1:11:17]
3. Deftones - Gore (2016) [11] [48:13]
4. Funeralium - Deceived Idealism (2013) [6] [1:28:22]
5. IU - Modern Times (2013) [13] [47:14]
-- press enter to continue --
6. Mono - You Are There (2006) [6] [1:00:01]
7. Panopticon - Roads to the North (2014) [8] [1:11:07]
8. PassCode - Clarity (2019) [13] [49:27]
9. Scandal - Hello World (2014) [13] [53:22]
10. Slipknot - Iowa (2001) [14] [1:06:24]
-- press enter to continue --
11. Wolves in the Throne Room - Thrice Woven (2017) [5] [42:19]
That said, the code itself might not look that neat anymore. The placeholders inside the string are starting to look very busy. In this scenario it might be worth considering whether the old format method might provide a cleaner solution. The biggest difference between these two approaches is that while f strings grab the values for placeholders from the program's namespace, the format method takes those values from the arguments given to it. The example below might have more lines of code, but the string itself looks a lot cleaner:
def format_page(lines, page_no):
    for i, album in enumerate(lines, page_no * PER_PAGE + 1):
        print("{i:2}. {artist} - {album} ({year}) [{tracks}] [{length}]".format(
            i=i,
            artist=album["artist"],
            album=album["album"],
            tracks=album["no_tracks"],
            length=album["length"].lstrip("0:"),
            year=album["year"]
        ))
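Since lstrip is used in both versions above, it's worth a short experiment to see what it actually removes. The lengths below are made up for the purpose, and the dictionary stripped exists only so that the results can be looked at afterwards.

```python
lengths = ["0:42:15", "1:11:17", "0:05:30", "0:00:45", "10:00:00"]

stripped = {}
for length in lengths:
    # lstrip removes characters from the left for as long as they are
    # found in the given set, here "0" and ":"
    stripped[length] = length.lstrip("0:")
    print(length, "->", stripped[length])
```

Zero hours disappear as intended, but so does the leading zero of the minutes in 0:05:30, and an album of 45 seconds loses its minutes entirely. Lengths that start with any other character are left alone, because the removal stops at the first character that is not in the set.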
In the Next Episode...¶
In the thrilling season closer of this programming drama we'll perform a couple more tricks with lists: removing items and changing their values. Frankly those just didn't fit in this time, there's no other reason. We'll also learn how to actually save the collection when the program is closed which sounds extremely useful for anyone who isn't prepared to have the program open indefinitely.
Oh, there's also the small detail of giving an introduction to graphical user interfaces in the final episode...
Closing Words¶
Together with lists, loops form a toolkit that, when combined with what we previously learned, can theoretically be used to implement almost anything. Python's lists in particular are extremely useful. The more you program with Python, the more you will notice how the solution you're looking for is a list. Loops often go hand in hand with lists because they're the only reasonable way to go through values inside lists.
The advantage of splitting code to functions should have also become more apparent in this material. Things like universally usable prompt functions make the implementation of core features much more effortless as does putting all prints into their own functions. Likewise the splitting of features into functions made more sense since they were doing completely different things.
In general program design became increasingly important in this material. We even noticed that our current solution could have been implemented better. This happens rather often regardless of how experienced the programmer is or how many iterations they've made of the same programs already.
When the toolkit expands, the most important programming skill is ultimately the ability to work systematically. If you "just start from somewhere" without thinking about the other parts at all, you may find yourself absolutely swamped with no way out. What you learned from this material is already a solid basis for taking a systematic approach to implementing your own course project, and the exercises will give you another big boost.
Image Sources¶
- original license: CC-BY-NC 2.0 (caption added)
- original license: CC-BY 2.0 (caption added)
- original license: CC-BY 2.0 (caption added)
- original license: CC-BY-SA 2.0
- original license: CC-BY-NC 2.0 (caption added)
- original license: CC-BY-NC-SA 2.0
- original license: CC-BY 2.0 (caption added)
- original license: CC-BY 2.0 (caption added)