Scope
Scope is the region of code in which a variable may be accessed. If a variable's scope is too large, then our precious data may get clobbered by other code that reassigns the variable, or we may expose sensitive data that shouldn't be exposed. If a variable's scope is too small, then subprograms will not be able to share data except through parameters. We want a variable's scope to be as small as possible without making sharing clumsy.
Scope Regions
There are several ways to limit the scope of a variable. To make a variable local to a subprogram, we declare it within that subprogram. The scope starts at the declaration and ends at the close of the subprogram. For example, in this Ruby script, the variable whole may only be accessed from within function fraction:
def fraction(x)
  whole = x.truncate
  (x - whole).abs
end
puts fraction(-2.3) # prints 0.2999999999999998, which is 0.3 up to floating-point rounding
puts whole # fails because whole is not in scope
Since whole is local, it exists only while the function runs. Its memory may be reclaimed when the function returns, and the name is unknown to the rest of the script, so the access on the last line fails with a NameError.
To make a global variable visible to all subprograms, we declare it at the top level. Some languages do not immediately give subprograms access to globals. Consider this Python program:
mode = 'dark'
def foreground():
    return 'green' if mode == 'dark' else 'black'
Calling the foreground function yields green, as we'd expect. However, calling toggle_mode in this code fails:
mode = 'dark'
def toggle_mode():
    mode = 'light' if mode == 'dark' else 'dark'
In Python, an assignment to a variable in a subprogram is by default interpreted as a declaration of a brand new local variable. The mode variable inside toggle_mode has an assignment and is therefore local. But it hasn't been assigned a value when it is referenced inside the condition, so the call fails with an UnboundLocalError. To force the interpreter to treat mode as a global, we must declare it as such:
mode = 'dark'
def toggle_mode():
    global mode
    mode = 'light' if mode == 'dark' else 'dark'
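The three Python fragments above can be combined into one script to observe both behaviors side by side. This is a minimal sketch: the names broken_toggle and broken_toggle_error are hypothetical helpers introduced only for illustration.

```python
mode = 'dark'

def foreground():
    # Reading a global needs no declaration.
    return 'green' if mode == 'dark' else 'black'

def broken_toggle():
    # The assignment makes `mode` local to this function, so the read in
    # the condition happens before the local has been given a value.
    mode = 'light' if mode == 'dark' else 'dark'

def toggle_mode():
    global mode  # bind the name to the top-level variable
    mode = 'light' if mode == 'dark' else 'dark'

def broken_toggle_error():
    # Hypothetical helper: report which exception broken_toggle raises.
    try:
        broken_toggle()
    except UnboundLocalError as error:
        return type(error).__name__
    return None

print(broken_toggle_error())  # prints UnboundLocalError
print(foreground())           # prints green
toggle_mode()
print(foreground())           # prints black
```

Only the version with the global declaration changes what foreground reports.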
JavaScript presents a unique twist on scoping called hoisting. Function declarations and variables declared with the keyword var are lifted up to the start of their enclosing function or script. Consider this script:
logTime();
function logTime() {
  console.log(label, new Date());
}
var label = 'Time:';
logTime();
The first line calls function logTime, which hasn't been defined yet. This would fail in most languages because logTime is not in scope. But this is legal in JavaScript, as hoisting moves function definitions and var declarations upward:
function logTime() {
  console.log(label, new Date());
}
var label;
logTime();
label = 'Time:';
logTime();
Only the declaration of label is hoisted, not its initialization. When logTime is first called, label hasn't been initialized, so the first call logs undefined in place of the label text. Try pasting this code into your browser's developer console and inspecting the results.
Function hoisting is convenient because you don't have to be concerned about the order of definitions in your code. But variable hoisting can lead to surprises, as you see with label. Many developers stopped using var when JavaScript introduced let and const. Variables declared with these keywords are scoped to their enclosing block and may not be accessed before their declaration, which gives them scoping semantics similar to those of other mainstream languages.
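For contrast, the claim that a call placed ahead of the definition fails in most languages can be checked in Python, which does no hoisting. The sketch below is an illustration only: describe_call and the returned strings are hypothetical, and log_time stands in for the JavaScript logTime.

```python
def describe_call():
    # The name log_time is looked up only when this line actually runs.
    try:
        return log_time()
    except NameError:
        return 'log_time is not defined yet'

first = describe_call()   # the def statement below has not executed yet

def log_time():
    return 'Time: logged'

second = describe_call()  # now the name is bound

print(first)   # prints log_time is not defined yet
print(second)  # prints Time: logged
```

A function body may mention a name that is defined later in the file, but the name must be bound by the time the call executes.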
A class is made of instance variables and methods. The instance variables should be visible in all the methods but hidden from outside code. If outside code has free access to the internal state of an object, it may put that object into an inconsistent state. In Java, we make an instance variable visible to all instance methods by declaring it outside them, and we hide it from everything else by marking it private:
class Monster {
  private int hitPoints;
  public void cure(int bump) {
    hitPoints += bump;
  }
}
To widen the variable's scope to include subclasses, we mark it protected instead.
Some languages allow us to restrict a variable's scope to just the source file in which it is declared. We impose file scope on a variable in C by declaring it at the top level and marking it static:
static char alphabet[] = "abcdefghijklmnopqrstuvwxyz";
void f() {
  // ...
}
If we leave off static, then other files may access alphabet if they include their own declaration and mark it extern:
#include <stdio.h>
extern char alphabet[];
int main() {
  printf("%s\n", alphabet);
  return 0;
}
The keyword static has different semantics if we apply it to a local variable rather than a global. Consider this function that tracks how many times it has been called:
void audited_function() {
  static int callCount = 0;
  ++callCount;
  // ...
}
In this context, static means that the variable persists even after the function returns. It has the lifetime of a global variable, but the scope of a local. The initialization only occurs once, and the incremented value is retained between calls.
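Python has no static locals, but a closure gives a rough analog of the same combination of long lifetime and narrow scope. The sketch below is an approximation, not an equivalent of the C code: make_audited_function is a hypothetical factory, and the function returns the count only so the effect can be observed.

```python
def make_audited_function():
    call_count = 0  # initialized once, lives as long as the returned function

    def audited_function():
        nonlocal call_count  # rebind the enclosing variable, not a new local
        call_count += 1
        return call_count

    return audited_function

audited_function = make_audited_function()
audited_function()
audited_function()
print(audited_function())  # prints 3
```

The counter survives between calls, yet no code outside the two functions can name it directly.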
Shadowing
What happens when an inner scope introduces a variable with the same name as a variable in the outer scope? This doubling up of identifiers is called shadowing. It is legal in some languages and contexts and illegal in others. Consider this Java code:
class Monster {
  private int hitPoints;
  public Monster(int hitPoints) {
    this.hitPoints = hitPoints;
  }
}
The constructor parameter shadows and supersedes the instance variable from the larger class scope. Java allows us to access both but gives preference to the innermost scope. To reach out to the instance variable, we must qualify the variable as this.hitPoints.
On the other hand, shadowing a parameter with a local variable is not allowed in Java:
class Monster {
  private int hitPoints;
  public Monster(int hitPoints) {
    int hitPoints = ...; // compile error: hitPoints is already defined
  }
}
C is more tolerant. This program has two independent variables named c:
#include <stdio.h>
int main() {
  char c = 'a';
  for (char c = 'z'; c >= 'w'; --c) {
    printf("%c\n", c);
  }
  printf("%c\n", c); // prints 'a'
  return 0;
}
Reusing names from nearby scopes easily confuses the humans that read and write code. Even if shadowing is legal, we should use it with caution. The mere availability of a feature in a programming language is not enough reason to warrant its use.
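Shadowing can be sketched in Python as well, where a parameter may legally share its name with a top-level variable. The names hit_points and heal below are hypothetical and chosen only to echo the Monster example.

```python
hit_points = 100  # top-level variable

def heal(hit_points):
    # The parameter shadows the top-level hit_points inside this function,
    # so the update below touches only the parameter.
    hit_points += 10
    return hit_points

print(heal(5))     # prints 15
print(hit_points)  # prints 100
```

A reader skimming heal could easily assume it updates the top-level value, which is exactly the confusion that cautious naming avoids.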