Multiple Flows
When we're converting Celsius to Fahrenheit, a single machine with a single thread of execution gets the job done. However, many of the computational problems that will earn us a living demand more of us and more of technology. Perhaps the problem that we are solving is so big that we will have to enlist many computers to find a solution. Perhaps we will need to run a task in the background while the user keeps interacting with the graphical user interface. Perhaps we will have to serve many users from all across the world at once. Whatever the cause, more than one thing happens at a time. In this chapter, we look at how our languages can help us deal with multiple flows of execution.
Computer scientists talk about and often conflate two different but related ideas: concurrency and parallelism. A concurrent program is one that responds interactively to requests from users or other flows of execution, even if it's busy processing other requests. A parallel program is one that breaks up a big computational task into smaller subtasks and uses multiple resources like threads, processes, or entire computers to simultaneously execute the subtasks.
When we make a program concurrent, we acknowledge that the computation is situated in an environment that is bombarding our software with demands, and our software needs to appear to be able to handle them all. When we hit the play button on our media player but then continue to navigate through the catalog as the music plays, we have concurrency. The program does not wait for the music to stop before showing the catalog. When we enter a URL in the web browser that triggers a slow download, and we hit the stop button to cancel it, we have concurrency. The program does not wait for the download to finish before responding to our click.
When we make a program parallel, we acknowledge that the problem is big, and we need to find ways to improve our program's performance. When we use several processor cores to search the file system, we have parallelism. Each core handles a subset of directories. When we distribute the rendering of an animated 3D film across the nodes of a cluster, we have parallelism. Each node handles a region of an image or a subset of the frames.
Concurrency is reacting to what's going on outside the current flow of execution, and parallelism is about speeding up what's going on inside. The two ideas are orthogonal. We can have a program that feels concurrent, even though no flows execute in parallel. We can have a program that runs in parallel but doesn't respond to concurrent requests. Or we can have both. Or neither.
Each concurrent or parallel task in a program is handled by a sequential flow of execution. These tasks may be running on only a single processor with a single core, yet they may appear to all be running at the same time. Really they are taking turns on the processor. In cooperative multitasking, the flows voluntarily give up the processor through a special yield
instruction. In preemptive multitasking, an external scheduler lets a flow run for a bit, pauses it, and then lets another run.
A computer with multiple cores or processors executes flows at the exact same time. Ideally, this parallelism reduces the overall execution time. In some situations, however, the overhead of setting up and synchronizing flows makes a parallel implementation slower than a serial implementation.
Historically, we could expect our programs to speed up with each new generation of processors and RAM. However, Moore's law is more of a socioeconomic trend than a law. Maintaining the trend has become challenging as transistors reach their physical limits. As chips become smaller, the harder it is to dissipate heat and keep electrons following their intended paths. These days, if we want to speed up a data-intensive program, we're better off increasing its parallel execution than waiting around for faster hardware.
In this chapter we examine several ways in which concurrency and parallelism are supported in programming languages. By its end, you'll be able to answer the following questions:
- How does a programmer write code that runs at the same time or in cooperation with other code?
- How do multiple flows communicate with each other?
- What safeguards are available for protecting shared data from being corrupted by simultaneous edits?
- What abstractions do programming languages and their standard libraries provide for making parallelism automatic?
You will not learn from this chapter how to write concurrent and parallel software that scales to many users and is safe. That's what your career and courses on high-performance computing are for. We start here with the more modest goal of learning what features languages provide so that programs can do more than one thing at a time.