Closures
A small miracle is happening in this biggerThan
function, which gives back a function that checks to see if numbers are bigger than some threshold value:
biggerThan :: Ord a => a -> (a -> Bool)
biggerThan threshold = \x -> x > threshold
biggerThan :: Ord a => a -> (a -> Bool) biggerThan threshold = \x -> x > threshold
The biggerThan
function is a generator of other functions. We can use it to crank out predicates for many different thresholds:
biggerThan100 = biggerThan 100
biggerThan18 = biggerThan 18
biggerThan0 = biggerThan 0
biggerThan100 = biggerThan 100 biggerThan18 = biggerThan 18 biggerThan0 = biggerThan 0
But here's the miracle. After biggerThan
returns its lambda, the threshold
parameter goes out of scope. Yet when the returned lambdas are assigned to biggerThan100
, biggerThan18
, or biggerThan0
, they somehow remember their values of threshold
. These functions may not be called for minutes, hours, or even years into the future, long after the biggerThan
function that made them returns.
How does the lambda remember the value of threshold
? When a lambda is created, two things are stored: the function's code and a bundle of any free variables that the code borrows from the surrounding lexical scope. The pairing of code and its bundle of external free variables is a closure. The bundle of variables from the surrounding scope plugs or closes over these holes.
Applications
Closures have many useful applications. They allow us to parameterize callbacks that are triggered by external systems, improve runtime performance, and patch in features that a language may be lacking.
Parameterized Callbacks
We have the power to make the functions that we invent more flexible by adding parameters. For example, we might add a size parameter to a list constructor to preallocate a certain number of elements. However, adding a parameter doesn't work for callbacks in event-driven systems because our callbacks are typically called by code that knows nothing about us or our task. Callbacks must match the interface established by the system. A windowing library written 10 years ago to serve the masses isn't going to send our callback an extra piece of information through a parameter.
Closures subvert the traditional mechanism of passing parameters. If a callback needs data but can't get it from the parameters, it can instead pull the data from its closure. In this JavaScript code, for example, three callbacks are generated to handle clicks on buttons that set a game's difficulty:
function modeCallback(mode) {
return event => {
setTitle(`Fasteroids (${mode})`);
setDifficulty(mode);
};
}
button1.addEventListener('click', modeCallback('easy'));
button2.addEventListener('click', modeCallback('medium'));
button3.addEventListener('click', modeCallback('hard'));
function modeCallback(mode) { return event => { setTitle(`Fasteroids (${mode})`); setDifficulty(mode); }; } button1.addEventListener('click', modeCallback('easy')); button2.addEventListener('click', modeCallback('medium')); button3.addEventListener('click', modeCallback('hard'));
The lambdas returned by modeCallback
close over mode
, setTitle
, and setDifficulty
. The event system doesn't have to pass these values in through parameters.
Avoiding Recomputation
Consider the time complexity of this aboveAverage
function in Haskell, which filters out all the elements of a list that are greater than its mean:
aboveAverage :: [Double] -> [Double]
aboveAverage xs = filter (\x -> x > sum xs / length xs) xs
aboveAverage :: [Double] -> [Double] aboveAverage xs = filter (\x -> x > sum xs / length xs) xs
The filter
call visits each element in the list. For each element, it computes the mean by calling sum
and length
, each of which are themselves linear operations. This combination puts aboveAverage
in \(\Theta(n^2)\). It will perform poorly if passed a long list or run on a slow computer.
A well-written Haskell compiler will discover that it can cache the mean. However, languages that allow side effects may not be able to make this optimization. In any case, we can improve the complexity without relying on the compiler by using a closure. We calculate the mean once and cache it in a local variable, which the lambda may freely access:
aboveAverage :: [Double] -> [Double]
aboveAverage xs = filter (\x -> x > mean) xs
where mean = sum xs / length xs
aboveAverage :: [Double] -> [Double] aboveAverage xs = filter (\x -> x > mean) xs where mean = sum xs / length xs
The lambda closes over the precomputed mean
instead of computing it over and over again. This implementation is \(\Theta(n)\).
Ad Hoc Classes
Not all languages support classes. Even some that do don't provide a way to restrict access to instance variables. In this JavaScript code, for example, the loan balance is wiped to 0 because clients can freely mutate instance variables without having to go through the class itself:
class LoanAccount {
constructor(debt) {
this.balance = debt;
}
}
let account = new LoanAccount(100000);
account.balance = 0;
class LoanAccount { constructor(debt) { this.balance = debt; } } let account = new LoanAccount(100000); account.balance = 0;
Closures can be used to implement objects with private instance variables. Consider this constructor-like function that creates a new instance of a player object, which is made up of two instance variables and three instance methods:
function constructPlayer(name, health) {
const heal = () => {
health += 10;
console.log(`${name} has ${health} hit points.`);
};
const hurt = () => {
health -= 10;
console.log(`${name} has ${health} hit points.`);
}
const rename = newName => name = newName;
return {heal, hurt, rename};
}
function constructPlayer(name, health) { const heal = () => { health += 10; console.log(`${name} has ${health} hit points.`); }; const hurt = () => { health -= 10; console.log(`${name} has ${health} hit points.`); } const rename = newName => name = newName; return {heal, hurt, rename}; }
The constructPlayer
function returns an object containing only the three local functions. The name
and health
variables are not included in this object. However, the functions in the object are really closures over them. A player instance is constructed and its methods are called like any other object:
const player = constructPlayer('Zorn', 99);
player.heal();
player.rename('Slar');
player.hurt();
// player.health = 9999; <- doesn't touch methods' health
const player = constructPlayer('Zorn', 99); player.heal(); player.rename('Slar'); player.hurt(); // player.health = 9999; <- doesn't touch methods' health
The instance variables are completely hidden inside the closures and are therefore private. Only the methods can access them.
Implementation
Closures ease many different programming tasks, but supporting them in a language is no small feat. They complicate how and where a variable is allocated and when it is released from memory.
The variables accessed by a closure are often local to the surrounding function in which the closure is created. Local variables are normally wiped from the stack when the function returns, but this can't be allowed to happen if they are referenced by a closure that outlives the function. The variables must have a lifetime as long as the closures that access them. To keep the variables alive, the language runtime allocates them on the heap rather than the stack.
Closures are fully supported in JavaScript, Ruby, Python, C++, and many others. Java provides only partial support. Consider this Java program that attempts to make an array of click listeners:
ActionListener[] listeners = new ActionListener[3];
for (int i = 0; i < listeners.length; ++i) {
listeners[i] = event -> System.out.println(i);
}
ActionListener[] listeners = new ActionListener[3]; for (int i = 0; i < listeners.length; ++i) { listeners[i] = event -> System.out.println(i); }
This code does not compile because the lambda references variable i
, and Java closures are only allowed to access final variables. The designers of Java imposed this limitation to ease the implementation of closures. In other languages, the free variables are placed in the heap and then each closure gets pointers to the heap memory. If one closure modifies a shared free variable, the other closures see the modification. Java compiler writers implement closures by giving each closure its own copy of any free variable. If a closure could assign a new value to its personal copy, then that copy would diverge from the other copies. That's not what we expect of closed-over variables. To prevent programmers from being surprised, Java requires closure variables to be marked final. In this case, i
is not final and cannot be accessed by a closure.
This limitation of Java can be worked around. We can't make i
final since it's modified by the loop. But we can copy i
into a final variable inside the loop and have each closure close around its personal copy:
ActionListener[] listeners = new ActionListener[3];
for (int i = 0; i < listeners.length; ++i) {
final int j = i;
listeners[i] = event -> System.out.println(j);
}
ActionListener[] listeners = new ActionListener[3]; for (int i = 0; i < listeners.length; ++i) { final int j = i; listeners[i] = event -> System.out.println(j); }
C does not support closures. Functions can't be nested, so there's never an occasion where a function needs access to a surrounding scope. They are always defined at the top-level of a file and draw values only from their own scope and the global scope.