Dear Computer

Chapter 2: Naming Things

Scope

Scope is the region of code in which a variable may be accessed. If a variable's scope is too large, then our precious data may get clobbered by other code that reassigns the variable, or we may expose sensitive data that shouldn't be exposed. If a variable's scope is too small, then subprograms will not be able to share data except through parameters. We want a variable's scope to be as small as possible without making sharing clumsy.

Scope Regions

There are several ways to limit the scope of a variable. To make a variable local to a subprogram, we declare it within that subprogram. The scope starts at the declaration and ends at the close of the subprogram. For example, in this Ruby script, the variable whole may only be accessed from within function fraction:

Ruby
def fraction(x)
  whole = x.truncate
  (x - whole).abs
end
puts fraction(-2.3)  # prints 0.2999999999999998, roughly 0.3
puts whole           # fails with NameError because whole is not in scope

Since whole is local, its memory is reclaimed when the function returns and may even have been overwritten by the time it is unsuccessfully accessed on the last line of the script.

To make a global variable visible to all subprograms, we declare it at the top level. Some languages do not immediately give subprograms access to globals. Consider this Python program:

Python
mode = 'dark'
def foreground():
  return 'green' if mode == 'dark' else 'black'

Calling the foreground function yields green, as we'd expect. However, calling toggle_mode in this code fails:

Python
mode = 'dark'
def toggle_mode():
  mode = 'light' if mode == 'dark' else 'dark'

In Python, an assignment to a variable in a subprogram is by default interpreted as a declaration of a brand new local variable. The mode variable inside toggle_mode has an assignment and is therefore local throughout the function. But it hasn't been assigned a value when it is referenced inside the condition, so the call fails with an UnboundLocalError. To force the interpreter to treat mode as a global, we must declare it as such:

Python
mode = 'dark'
def toggle_mode():
  global mode
  mode = 'light' if mode == 'dark' else 'dark'
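The difference between the two versions of toggle_mode can be checked by running them side by side. The following is a minimal sketch rather than a definitive test. The names toggle_mode_broken, toggle_mode_fixed, and error_name are invented for this illustration.

```python
mode = 'dark'

def toggle_mode_broken():
    # The assignment makes mode local to this function, so the read in
    # the condition happens before the local has a value.
    mode = 'light' if mode == 'dark' else 'dark'

def toggle_mode_fixed():
    global mode  # refer to the top-level mode instead of a new local
    mode = 'light' if mode == 'dark' else 'dark'

try:
    toggle_mode_broken()
    error_name = None
except UnboundLocalError as error:
    error_name = type(error).__name__

toggle_mode_fixed()
print(error_name)  # UnboundLocalError
print(mode)        # light
```

Only the fixed version changes the top-level variable. The broken version fails before it assigns anything.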

JavaScript presents a unique twist on scoping called hoisting. Any function definitions or variables declared with the keyword var are lifted up to the start of their enclosing function or file. Consider this script:

JavaScript
logTime();
function logTime() {
  console.log(label, new Date());
}
var label = 'Time:';
logTime();

The first line calls function logTime, which hasn't been defined yet. This would fail in most languages because logTime is not in scope. But this is legal in JavaScript, as hoisting moves function definitions and var declarations upward:

JavaScript
function logTime() {
  console.log(label, new Date());
}
var label;
logTime();
label = 'Time:';
logTime();

Only the declaration of label is hoisted, not its initialization. When logTime is first called, label hasn't been initialized, so the first call logs undefined in place of the label. Try pasting this code into your browser's developer console and inspecting the results.

Function hoisting is convenient because you don't have to be concerned about the order of definitions in your code. But variable hoisting can lead to surprises, as you see with label. Many developers stopped using var when JavaScript introduced let and const. Variables declared with these keywords cannot be accessed before their declarations run, and they are scoped to the enclosing block, which gives them scoping semantics similar to other mainstream languages.
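For contrast, here is a sketch in Python of the failure that hoisting avoids. Python binds a function's name only when its def statement runs, so an earlier call finds nothing in scope. The names log_time, early_call_failed, and message are invented for this illustration and loosely mirror the JavaScript script.

```python
from datetime import datetime

try:
    log_time()  # log_time has not been defined yet
    early_call_failed = False
except NameError:
    early_call_failed = True

def log_time():
    # label is looked up when the function runs, not when it is defined.
    return f'{label} {datetime.now()}'

label = 'Time:'
message = log_time()

print(early_call_failed)             # True
print(message.startswith('Time: '))  # True
```

The second call succeeds because both log_time and label have been bound by the time it runs.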

A class is made of instance variables and methods. The instance variables should be visible in all the methods but hidden from outside code. If outside code has free access to the internal state of an object, it may put that object into an inconsistent state. In Java, we make an instance variable visible to all instance methods by declaring it outside them, and we hide it from everything else by marking it private:

Java
class Monster {
  private int hitPoints;

  public void cure(int bump) {
    hitPoints += bump;
  }
}

To widen its scope to subclasses, we mark it protected instead. In Java, this also exposes it to other classes in the same package.

Some languages allow us to restrict a variable's scope to just the source file in which it is declared. We impose file scope on a variable in C by declaring it at the top level and marking it static:

C
static char alphabet[] = "abcdefghijklmnopqrstuvwxyz";

void f() {
  // ...
}

If we leave off static, then other files may access alphabet if they include their own declaration and mark it extern:

C
#include <stdio.h>

extern char alphabet[];  // defined in another file
int main() {
  printf("%s\n", alphabet);
  return 0;
}

The keyword static has different semantics if we apply it to a local variable rather than a global. Consider this function that tracks how many times it has been called:

C
void audited_function() {
  static int callCount = 0;
  ++callCount;
  // ...
}

In this context, static means that the variable persists even after the function returns. It has the lifetime of a global variable, but the scope of a local. The initialization only occurs once, and the incremented value is retained between calls.

Shadowing

What happens when an inner scope introduces a variable with the same name as a variable in the outer scope? This doubling up of identifiers is called shadowing. It is legal in some languages and contexts and illegal in others. Consider this Java code:

Java
class Monster {
  private int hitPoints;

  public Monster(int hitPoints) {
    this.hitPoints = hitPoints;
  }
}

The constructor parameter shadows and supersedes the instance variable from the larger class scope. Java allows us to access both but gives preference to the innermost scope. To reach out to the instance variable, we must qualify the variable as this.hitPoints.

On the other hand, shadowing a parameter with a local variable is not allowed in Java:

Java
class Monster {
  private int hitPoints;

  public Monster(int hitPoints) {
    int hitPoints = ...;  // compile error: hitPoints is already defined
  }
}

C is more tolerant. This program has two independent variables named c:

C
#include <stdio.h>

int main(void) {
  char c = 'a';
  for (char c = 'z'; c >= 'w'; --c) {
    printf("%c\n", c);  // the loop's own c: z, y, x, w
  }
  printf("%c\n", c);  // prints 'a'
  return 0;
}

Reusing names from nearby scopes easily confuses the humans who read and write code. Even if shadowing is legal, we should use it with caution. The mere availability of a feature in a programming language is not enough reason to warrant its use.
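To see how shadowing can mislead a reader, consider this small sketch in Python, where shadowing a global with a local is legal. The names hit_points, cure, and healed are invented for this illustration and only loosely echo the Monster examples.

```python
hit_points = 10

def cure(bump):
    # This assignment creates a local hit_points that shadows the global.
    # The global is never read or changed here.
    hit_points = 50 + bump
    return hit_points

healed = cure(5)
print(healed)      # 55
print(hit_points)  # 10, the global is untouched
```

A reader skimming cure might assume it updates the top-level hit_points. Giving the local a different name would remove the ambiguity.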
