2013-04-22

The code for this tutorial can be downloaded here: threadworms.py or from GitHub. This code works with Python 3 or Python 2, and you need Pygame installed as well in order to run it.



Click the animated gif to view a larger version.

This is a tutorial on threads and multithreaded programs in Python, aimed at beginning programmers. It helps if you know the basics of classes (what they are, how you define methods, and that methods always have self as the first parameter, what subclasses (i.e. child classes) are and how a method can be inherited from a parent class, etc.) Here’s a more in-depth classes tutorial.

The example used is a “Nibbles” or “Snake” style clone that has multiple worms running around a grid-like field, with each worm running in a separate thread.

What are threads and why are they useful?

You can skip this section if you already know what threads are and just want to see how to use them in Python.

When you run a normal Python program, the program execution starts at the first line and goes down line by line. Loops and function calls may cause the program execution to jump around, but it is fairly easy to see from the code which line will get executed next at any given point. You can put your finger on the first line of code in the .py file on the screen, and then trace through the next lines of code that are executed. This is single-threaded programming.

However, using multiple threads is like putting a second finger down on your code. Each finger still moves the same way, but now they are executing code simultaneously.

Actually, they aren’t executing simultaneously. Your two fingers are taking turns at which one executes code. Computers with multicore CPUs can actually run multiple instructions simultaneously, but there is a feature of Python programs called the GIL (Global Interpreter Lock) that limits a Python program to one core only.

The Python interpreter will run one thread for a while, and then pause it to run another thread for a while. But it does this so fast that it seems like they are running simultaneously.

You can start dozens or hundreds of threads in your program (that’s a lot of fingers). This doesn’t automatically make your programs dozens or hundreds of times faster though (all the threads are still sharing the same CPU) but it can make your program more efficient.

For example, say you write a function that will download a file full of names, then sorts the names, and then writes these names to a file on your computer. If there are hundreds of files your program needs to process, you would put a call to this function in a loop and it would handle each file serially: download, sort, write, download, sort, write, download, sort, write…

Each of these three steps use different resources on your computer: downloading uses the network connection, sorting uses the CPU, writing the file uses the hard drive. Also, there are tiny pauses within each of these steps. For example, the server you are downloading the file from may be slow and your computer’s Internet connection has bandwidth to spare.

It would be better if you could call this function hundreds of times in parallel by using one thread for each file. Not only would this make better use of your bandwidth, but if some files download sooner than others, the CPU can be used to sort them while the network connection continues to work. This makes more efficient use of your computer.

What makes multithreaded programming tricky?

Of course, in the above case, each thread is doing its own separate thing and doesn’t need to communicate or synchronize anything with the other threads. You could just write the simple single-threaded version of the download-sort-write program and run the program hundreds of times separately. (Though it might be a pain to type & click each time to run the program each with a different file to download.)

Many multithreaded programs share access to the same variables, but this is where things can get tricky.



Photo from Brad Montgomery)

Here’s a common metaphor that is used: Say you have two robot ticket sellers. Their tasks are simple:

Ask the customer which seat they want.

Check a list to see if the seat is available.

Get the ticket for that seat.

Cross that seat off the list.

A customer asks Robot A for seat 42. Robot A checks that the seat is available from the list and finds that it is, so it grabs the ticket. But before Robot A can cross the seat off the list, Robot B is asked by a different customer for seat 42. Robot B checks the list and sees that the seat is still available, so it tries to grab the ticket for the seat. But Robot B can’t find the ticket for seat 42. THIS DOES NOT COMPUTE, and Robot B’s electronic brain explodes. Robot A then crosses seat 42 off of the list.

The above problem happens because although the two robots (or rather, two threads) are executing independently, they are both reading and modifying a shared list (or rather, a variable). Your programs can get very hard-to-fix bugs which are also difficult to even reproduce, since Python’s thread execution switching is nondeterministic, that is, done differently each time the program is run. We aren’t used to having the data in variables “magically” change from one line to the next just because a different thread was executed in between them.

When the execution switches from one thread to another, this is known as a context switch.

There is also the problem of deadlocks, which is commonly explained using the metaphor of the Dining Philosophers. Five philosophers are sitting around a circular table eating spaghetti but require two forks to do so. There is one fork between each philosopher (for a total of five forks). The method the philosophers use to eat is this:

Philosophize for a while.

Pick up the fork on your left.

Wait until the fork on your right is available.

Pick up the fork on your right.

Eat.

Put the forks down.

Go back to step 1.



Aside from the fact that they’ll be sharing forks with their neighbors (eww), it seems like this method will work. But sooner or later everyone at the table will end up with the fork on their left in their hand and waiting for the fork on their right. But because everyone is holding on to the fork their neighbor is waiting for and won’t put it down until they’ve eaten, the philosophers are in a deadlock state. They will be holding forks in their left hand but never getting a fork in their right hand, so they never eat and never put down the fork in their left hand. The philosophers all starve to death (except for Voltaire who is actually a robot. Without spaghetti, his electronic brain explodes.)

There is also a similar situation called a livelock. This is when no work gets done because the threads are too generous at making a resource available. The best metaphor of this is when two people are walking towards each other down a hall. They step to the side to let the other person walk past, but end up blocking each other. So they both step back to the other side, but end up blocking each other again. They continue doing this until they starve/electronic-brain-explode.

There are a few other problems that can come up with multithreaded programming such as starvation (no seriously, that’s what it is called) and generally fall under the label of “Concurrency” in computer science. But we will only treat a simplified case.

Locks

One way to prevent bugs with multithreaded programming is by using locks. Before a thread reads or modifies a shared variable, it attempts to “acquire” a lock. If it can acquire the lock, the thread goes on to read or modify the variable. If the thread cannot acquire the lock, it waits until the lock becomes available.

When the thread is done with the shared variable, it will “release” the lock so that some other thread waiting for the lock can acquire it.

Going back to our robot ticket seller metaphor, this is like having a robot pick up the list (the list is a “lock”), and then reading it the ticket is available, grabbing the ticket, and then crossing out the seat on the list. When the robot puts the list back down, it is “releasing the lock”. If another robot needs to pick up the list but it is not there, it will wait until the list is available.

You can cause bugs by writing code that forgets to release a lock. This will cause a deadlock situation since the other threads will hang and do nothing while waiting for a lock to bereleased.

Threads in Python

Okay, let’s write a Python program that demonstrates how to use threads and locks. This program is based off of my “Snake” clone in Chapter 6 of my Making Games with Python & Pygame book. Except instead of a worm running around eating apples, we’ll just have the worm running around the screen. And instead of just one worm, we will have multiple worms. Each worm will be controlled by a separate thread. The shared variable will have the data structure that represents which places on the screen (called “cells” in this program) are occupied by a worm. A worm cannot move forward to occupy a cell if another worm is already there. We will use locks to ensure that the worms don’t occupy the same cell as another worm.

The code for this tutorial can be downloaded here: threadworms.py or from GitHub. This code works with Python 3 or Python 2, and you need Pygame installed as well in order to run it.

Here’s a summary of the thread-related code in our threadworms.py program:

Python’s thread library is in a module named threading, so first import this module.

The class Lock in the threading module has acquire() and release() methods. We will create a new Lock object and store it in a global variable named GRID_LOCK. (Since the state of the grid-like screen and which cells are occupied is stored in a global variable named GRID. The pun was unintended.)

Our WORMS_RUNNING global variable is regularly checked by the worm threads to see if they should quit. Calling sys.exit() will not stop the program because it only quits the thread that made the call. As long as there are other threads still running the program will continue. The main thread in our program (which handles the Pygame drawing and event handling) will set WORMS_RUNNING to False before it calls pygame.quit() and sys.exit(). The next time a thread checks WORMS_RUNNING, it will quit, until eventually the last thread quits and then the program terminates.

The thread’s code must start from a class that is a child of the Thread class (which is in the threading module). Our Thread subclass will be named Worm since it controls You don’t need an __init__() function, but since our Worm classes uses one we need to call the threading.Thread class’s __init__() method first. Also optional is to override the name member. Our __init__() function uses the string 'Worm' by default, but we can supply each thread with a unique name. Python will display the thread’s name in the error message if it crashes.

Before we read or modify the value in the GRID variable, the thread’s code should attempt to acquire the lock. If the lock isn’t available, the method call to acquire() will not return and instead “block” until the lock becomes available. The thread is paused while this happens. This way, we know that the code after the acquire() call will only happen if the thread has acquired the lock.

Acquiring and releasing a lock around a bit of code ensures that another thread does not execute this code while the current thread is. This makes the code atomic because the code is always executed as a single unit.

After the thread’s code is done with the GRID variable, the lock can be released by calling the release() method.

A thread starts when the Worm class (which is a subclass of threading.Thread) has its start() method called. We don’t have to implement start() in the Worm class because it is inherited from the threading.Thread class. When the start() method is called, a new thread is created and the code inside the run() method is executed in this new thread. Do not call the run() method directly, as this won’t create the new thread.

This is important to know: to start the thread call the start() method, but the code that gets run in the new thread is in run(). We don’t have to define start() because it is inherited from threading.Thread. We do need to define run() since that is where our thread’s code will go.

When the run() method returns (or sys.exit() is called in the thread), the thread will be destroyed. All threads in a program must be destroyed before the program terminates. The program will still be running as long as there is one running thread.

So when start() is called, this is when you would place your second finger on the source code in run() to start tracing the code. Your first finger will continue tracing the code after the line that has the start() call.

A Simple Multithreaded Example

Before we go into the Threadworm code, let’s just look at a dead simple multithreaded program:

This program defines a new class called CountThread. When a CountThread object’s start() method is called, a new thread is created which will loop 100 times and increment the TOTAL global variable (which is shared between the variables) by 1 on each iteration of the loop.

Since we are creating two CountThread objects, whichever one finishes last should display 200. Each thread increases TOTAL by 100 and there are two threads. When we run this program, that’s what we see:

Because the first number is 100, we can tell that probably what happened is that one thread ran through the entire loop before a context switch happened.

However, if we change range(100) to range(100000), we would expect the second number to be 200000, since each thread increases TOTAL by 100000 and there are two threads. But when we run the program, something like this appears (your numbers may be slightly different):

That second number is not 200000! It’s quite less than that actually. The reason this happened is because we did not use locks around the code the reads and modifies the TOTAL variable, which is shared among multiple threads.

Look at this line:

If TOTAL was set to 99, then you would expect TOTAL + 1 to evaluate to 99 + 1 and then to 100, and then 100 is stored as the new value in TOTAL. Then on the next iteration, TOTAL + 1 would be 100 + 1 or 101, which is stored as the new value in TOTAL.

But say when TOTAL + 1 gets evaluated as 99 + 1, the execution switches to the other thread, which is also about to execute the TOTAL = TOTAL + 1 line. The value in TOTAL is still 99, so TOTAL + 1 in this second thread gets evaluated to 99 + 1.

Then, another context switch happens back to the first thread where TOTAL = 99 + 1 is in the middle of being executed. The integer 100 is assigned to TOTAL. Now execution switches back to the second thread again.

In this second thread, TOTAL = 99 + 1 is about to be executed. Even though TOTAL is now 100, the TOTAL + 1 in this second thread has already been evaluated as 99 + 1. So the second thread also ends up assigning the integer 100 to TOTAL. Even though this TOTAL = TOTAL + 1 has been executed twice (once by each thread), the value in TOTAL has really only been incremented by 1!

The problem is, the line of code TOTAL = TOTAL + 1 is not atomic. The context switch can happen right in the middle of the line being executed. We need to use locks around this code to make this an atomic operation.

This new code fixes this problem:

When we run this code, this is what is outputted (your first number might be a little different):

That the second number is 200000 tells us that the TOTAL = TOTAL + 1 line was correctly executed each of the 200,000 times it was run.

Explaining the Threadworms Program

I’m going to use the threadworms_nocomments.py version of the program since it doesn’t have the very verbose comments in it. The line numbers have been included at the front of each line (they are not a part of the actual Python source code). I skip a lot of the commented sections because they are self-explanatory. You don’t really need to know Pygame to follow this code. Pygame is only responsible for creating the window and drawing the lines and rectangles on it.

One thing to know is that Pygame uses a tuple of three integers to represent colors. The integers each span from 0 to 255 and represent the RGB (Red-Green-Blue) value of the color. So (0, 0, 0) is black and (255, 255, 255) is white and (255, 0, 0) is red and (255, 0, 255) is purple, etc.

The top part of the code imports some modules our program needs and defines some constant values. Feel free to edit these constant values. Increasing or decreasing the FPS value doesn’t change how fast the worms run around, it just changes how often the screen updates. If you set this value very low, it looks like the worms are teleporting since they move multiple spaces in between screen updates.

CELL_SIZE is how big each square on the screen’s grid is (in pixels). If you want to change the number of cells, modify the CELLS_WIDE and CELLS_HIGH constants.

The GRID global variable will contain data that tracks the state of the grid. It is a simple list of lists so that GRID[x][y] will refer to the cell at the X and Y coordinate. (In programming, the (0, 0) origin is at the top-left corner of the screen. X increases going to the right (just like in mathematics classes) but Y increases going down.)

If GRID[x][y] is set to None, then that cell is unoccupied. Otherwise, GRID[x][y] will be set to an RGB triplet. (This information is used when drawing the grid to the screen.)

Line 24 creates a Lock object which our threads’ code will acquire and release before reading or modifying GRID.

Some more simple constants. I use constants like DOWN and RIGHT instead of strings like 'down' and 'right' because if I make a typo using constants (i.e. DWON) then Python will immediately crash with a NameError exception. This is much better than if I make a typo like 'dwon' which won’t immediately crash the program will cause bugs later on, making it more difficult to track down.

Each worm will be represent by a list of dictionaries like {'x': 42, 'y': 7}. Each of these dictionaries represents a single body segment of the worm. The dictionary at the front of the list (at index 0) is the head and the dictionary at the end of the list (at index -1, using Python’s nice negative indexing which begins counting from the end) is the butt of the worm.

(In computer science, “head” often refers to the first item in a queue or list, and “tail” refers to every item after the head. So I use “butt” to refer to just the last item. Also, I am silly.)

The above worm would be represented with a list that looks like this: [{'x': 7, 'y': 2}, {'x': 7, 'y': 3}, {'x': 7, 'y': 4}, {'x': 8, 'y': 4}, {'x': 9, 'y': 4}, {'x': 10, 'y': 4}, {'x': 11, 'y': 4}, {'x': 11, 'y': 3}, {'x': 11, 'y': 2}]

As long as one thread is running, the program will continue to execute. The main thread that does the screen drawing will also detect when the user has clicked the close button on the window or pressed the Esc key, so it needs a way to tell the worm threads to quit. We will code the worm threads to constantly check WORMS_RUNNING. If WORMS_RUNNING is set to False, then the thread will terminate itself.

Here’s our Worm class. It is a child class of the threading.Thread class. Each worm can have a name (which appears if the thread crashes, helping us identify which thread crashed), and a size, color, and speed. Default values are provided, but we can specify these ourselves if we want.

Since we are overriding the __init__() method, we need to call the parent classes __init__() method so that it can initialize all the thread stuff. (We don’t need to know how it works, just remember to call it.)

The above code sets up a worm with random values for the size, color, and speed unless specific values were specified for the parameters.

We need to determine a random starting location for the worm. To make this easier, all worms begin with a length of one body segment and grow until they reach their full maximum size. But we need to make sure that the random location on the grid we come up with isn’t already occupied. This involves reading and modifying the GRID global variable, so we need to acquire and release the GRID_LOCK lock before doing this.

(As a side note, you might be wondering why we don’t have a “global GRID” line at the beginning of this method. GRID is a global variable and we are modifying it in this method, and without a global statement Python should consider this a local variable that just happens to have the same name as the GRID global variable. But if you look closer, we only change values inside the GRID list of lists, but never the value in GRID itself. That is, we have code that looks like “GRID[startx][starty] = self.color” but never “GRID = someValue“. Because we don’t actually modify GRID itself, Python considers the use of the variable name GRID in this method to refer to the global variable GRID.)

We keep looping until we’ve found an unoccupied cell, and then mark that cell as now occupied. After this, we are done reading and modifying GRID so we release the GRID_LOCK lock.

(Another side note, if there are no free cells on the grid, this loop will continue to loop forever and the thread will “hang”. Since the other threads will continue to run, you might not notice this problem. The new worm will not be created but the rest of the program continues to run normally. However, when you try to quit, since the hanging thread never gets to check WORMS_RUNNING to know it should quit and the program will refuse to terminate. You will have to force the program to shut down through your operating system. Just be sure not to add more worms than you have space for.)

The starting body segment is added to the body member variable. The body member variable will be a list of all the locations of segments of the body. The direction that the worm is heading in is stored in the direction member variable.

Technically, since this worm right now only has one body segment that is both the first and last item in the list, the worm’s head is the same as its butt.

The run() method is the method that is called when the worm’s start() method is called. The code in run() is executed in a brand new thread. We will have an infinite loop that causes the worm to continuously move around the grid. The first thing we do on each iteration of the loop is check if WORMS_RUNNING is set to False, and if so, we should return from this method.

The thread will terminate itself if we either call sys.exit() from the thread or when the run() method returns.

On each move, there’s a 20% chance that the worm randomly changes direction. (Although the new direction could be the same as the current direction. But I wanted to write this code out quickly.)

The above code handles moving the worm one space. Since this involves reading and modifying GRID, we need to acquire the GRID_LOCK lock first. Essentially, the worm will try to move one space in the direction that it’s direction member variable says. If this cell is beyond the border of the grid or is already occupied, then the worm will change its direction. If the worm is blocked on all sides, then the worm reverses itself so that the butt becomes the head and the head becomes the butt. If the worm still can’t move in any direction, then it will just stay put for now.

After the worm has moved one space (or at least tried to), we will put the thread to sleep. Pygame has a function called wait() that does the same thing as time.sleep(), except that the argument to wait() is in integer of milliseconds instead of seconds.

Pygame’s pygame.time.wait() and the Python Standard Library’s time.time() functions (and Pygame’s tick() method) are smart enough to tell the operating system to put the thread to sleep for a while and just run other threads instead. Of course, while the OS could interrupt our thread at any time to hand execution off to a different thread, calling wait() or sleep() is a way we can explicitly say, “Go ahead and don’t run this thread for X milliseconds.”

This wouldn’t happen if we have “wait” code like this:

The above code also implements “waiting”, but to the OS it looks like your thread is still executing code (even though this code is doing nothing but looping until 5 seconds has passed). This is inefficient, because time spent executing the above pointless loop is time that could have been spent executing other thread’s code.

Of course, if ALL worms’ threads are sleeping, then the computer can know it can use the CPU to run other programs besides our Python Threadworms script.

The getNextPosition() figures out where the worm will go next given the position of its head and the direction it is going.

The getNewDirection() method returns a direction (one of the UP, DOWN, LEFT, or RIGHT strings) that is for an unoccupied cell within the grid. If there are no available cells the head could move towards, the method returns None.

The setGridSquares() function can be used to draw static blocks on the grid by passing a multiline string. The period characters represent no change, a space character means “set this to be unoccupied” and any other character will represent a static block to place on the grid. You can uncomment line 207 if you want to see the “Hello worm” text written out in blocks.

This is standard Pygame setup code to create a window for our program.

This code creates the Worm objects and then creates their threads by calling the start() method. The code in each worm’s run() method will begin executing in a separate thread at this point.

The main game loop is pretty simple. The handleEvents() function will be checking if the user is terminating the program and the drawGrid() function will draw the grid lines and cells to the screen. The pygame.display.update() function tells the window to update the screen, after which the tick() method will pause for however long is needed to achieve the framerate specified in FPS.

The Pygame events can tell us when the user has pressed the Esc key or clicked on the close button for the window. In this case we want to set WORMS_RUNNING to False so that the threads will terminate themselves and then the main thread shuts down Pygame and exits.

This code draws the screen based on the values in GRID. But first it draws the grid lines.

Because this code reads the GRID variable, we will first acquire the GRID_LOCK lock. If a cell is occupied (that is, it is set to an RGB tuple value inside the GRID variable) the code draws in the cell.

The setGridSquares() can write static blocks to the grid and was explained previously.

The above is a Python trick. Instead of putting the main code in the global scope, we put it into a function named main() which is called from the bottom. This guarantees that all the functions have been defined before the code in main() runs. The __name__ variable is only set to the string '__main__' if this program was run itself, as opposed to imported as a module by another program.

Summary

That’s it! Multithreaded programming is fairly simple to explain, but it can be tricky to understand how to get your own multithreaded programs to work correctly. The best way to learn is to practice by writing your own programs.

Actually, the way we have our code set up, even if we got rid of the locks it would still run almost perfectly. Nothing would crash, although there would sometimes be the case where two worms are approaching the same cell and end up both occupying it. They would then seemingly move through each other. Using locks ensures that only one worm can occupy a cell at any time.

Good luck!

Show more