Passing Mechanics

Dear Computer

Chapter 4: Functions

Parameters appear to magically arrive at their destination when a program runs. But a real human had to orchestrate how data gets shared between the caller and the callee. Many language developers have faced this problem, and they've come up with a variety of sharing mechanics. We cluster these into three broad categories.

Pass-by-value

With a pass-by-value mechanic, independent copies of the actual parameters are written into the function's formal parameters. Assignments to the formal parameters affect only the local copies in the function's stack frame, so the caller doesn't see these changes. Pass-by-value parameters are strictly in parameters. C, Java, JavaScript, and Ruby are all languages with pass-by-value semantics.
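To make the copying concrete, here is a minimal C++ sketch. The functions bump and shout are hypothetical names invented for this illustration. Each one assigns to its formal parameter, and the caller's variable should be unaffected because the formal parameter is an independent copy.

```cpp
#include <string>

// Hypothetical helper: `count` is an independent copy of the actual parameter.
int bump(int count) {
  count = count + 1;   // changes only the callee's local copy
  return count;
}

// Hypothetical helper: a std::string passed by value is copied as well.
std::string shout(std::string text) {
  text += "!";         // the caller's string is not touched
  return text;
}
```

If a caller holds int tally = 5 and calls bump(tally), the call returns 6 but tally is still 5 afterward. The only way out of the function is the return value.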

Pass-by-value pervades modern languages because it cleanly separates the caller's and callee's memory and it is relatively simple to implement. However, copying values can incur extra runtime costs, especially if the data being copied is a big struct or array. Some languages avoid the costs of pass-by-value by passing primitives differently than non-primitives. To pass a primitive, an independent copy of the value is assigned to the formal parameter. To pass a non-primitive, an independent copy of a pointer to the value is assigned to the formal parameter. Pointers are typically only 4 or 8 bytes, making them as cheap to copy as primitives.

Any changes made to a passed pointer are local and will not affect the caller. However, any changes made through a passed pointer will reach the shared memory at which both the caller's and callee's pointers point. In this way, a callee can still effect changes to the caller's data, even though the language has pass-by-value semantics.
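The difference between changing a pointer and changing through a pointer can be sketched in C++, where the pointer is explicit rather than implicit. The function names budgeThrough and budgeLocally are hypothetical, chosen only for this illustration.

```cpp
#include <string>
#include <vector>

// Hypothetical helper: writes through the pointer, so the caller's vector changes.
void budgeThrough(std::vector<std::string> *names) {
  names->insert(names->begin(), std::string("Veruca"));
}

// Hypothetical helper: reassigns the pointer, which is only a local copy.
void budgeLocally(std::vector<std::string> *names) {
  std::vector<std::string> other;
  names = &other;               // the caller's pointer and vector are unaffected
  names->push_back("Veruca");   // modifies `other`, which vanishes on return
}
```

After budgeThrough(&names), the caller's vector has a new first element. After budgeLocally(&names), the caller's vector is exactly as it was, since only the callee's copy of the pointer was redirected.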

Pass-by-reference

With a pass-by-reference mechanic, an actual parameter and a formal parameter are two different names for the same memory cell. Assignments to a formal parameter will therefore immediately effect changes in the caller's memory.
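One way to check the claim that the two names share a memory cell is to compare addresses. This C++ sketch uses hypothetical helpers sameCell, sameCellByValue, and reset. It assumes nothing beyond the language's reference parameters.

```cpp
// Hypothetical helper: the formal parameter is a reference, an alias for the
// caller's variable, so its address matches the actual parameter's address.
bool sameCell(int &formal, const int *actualAddress) {
  return &formal == actualAddress;
}

// Hypothetical helper: the formal parameter is a copy living in the callee's
// frame, so its address differs from the actual parameter's address.
bool sameCellByValue(int formal, const int *actualAddress) {
  return &formal == actualAddress;
}

// Hypothetical helper: assigning to the alias changes the caller's variable.
void reset(int &n) {
  n = 0;
}
```

Given int score = 42, the call sameCell(score, &score) yields true while sameCellByValue(score, &score) yields false, and reset(score) leaves score at 0.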

C++ and Rust are amongst the few modern mainstream languages that truly support the pass-by-reference mechanic. Some folks will claim that Java and Ruby are pass-by-reference too, arguing that a method can change a parameter, like this:

Java
public void budge(ArrayList<String> names) {
  names.add(0, "Veruca");            // change the list
}

These folks are right that the method is changing memory beyond itself. The memory being changed, however, is not the caller's parameter but the object to which the parameter refers. If we tried to change the parameter itself, that change would be local:

Java
// budge is static so that the static main method can call it directly
public static void budge(ArrayList<String> names) {
  names = new ArrayList<String>();   // change the parameter
  names.add("Veruca");
  names.add("Veruca");
  names.add("Veruca");
  names.add("Veruca");
}

public static void main(String[] args) {
  ArrayList<String> names = new ArrayList<>();
  budge(names);
  System.out.println(names);   // prints []
}

In C++, we see true pass-by-reference at work in this canonical swap routine:

C++
void swap(int &a, int &b) {
  int tmp = a;
  a = b;
  b = tmp;
}

Those assignments to a and b modify the caller's actual parameters. This behavior means that pass-by-reference is an implementation of in-out parameter semantics.

Most of the confusion about pass-by-reference is due to terminology, as reference is an overloaded term. The alias references of C++ and the implicit pointer references of Java, Ruby, and JavaScript have different meanings.

References are often used as an alternative to traditional return values. The Int32.TryParse method in C# converts a string into an integer. It returns a boolean to indicate the success of the parse. It therefore needs a different channel for returning the parsed integer. It uses a reference parameter, which we pass with the out modifier:

C#
int number;
if (Int32.TryParse("1861", out number)) {
  // TryParse returned true, so use number
} else {
  Console.WriteLine("parse failed");
}

Unlike most parameters, out parameters must be variables or some other lvalue expression. Since there's an implied assignment, we can't pass an rvalue expression like 2 * i.
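The same second-channel idea can be sketched in C++ with a reference parameter. The tryParse function below is a hypothetical analog written for this illustration. It is not a standard library function, and it does not mirror every behavior of the C# method. It assumes std::stoi is an acceptable parser and, unlike C#'s out parameters, it leaves the result untouched on failure.

```cpp
#include <cstddef>
#include <stdexcept>
#include <string>

// Hypothetical analog of a TryParse-style function. The boolean return value
// reports success; the parsed integer travels back through `result`.
bool tryParse(const std::string &text, int &result) {
  try {
    std::size_t used = 0;
    int parsed = std::stoi(text, &used);
    if (used != text.size()) {
      return false;    // trailing characters, as in "18x"
    }
    result = parsed;   // this assignment reaches the caller's variable
    return true;
  } catch (const std::invalid_argument &) {
    return false;      // no digits to parse
  } catch (const std::out_of_range &) {
    return false;      // value doesn't fit in an int
  }
}
```

A caller declares int number and passes it as tryParse("1861", number). Passing an rvalue such as 2 * i as the second argument does not compile, because a non-const reference parameter must bind to an lvalue.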

Pass-by-name

With a pass-by-name mechanic, the caller shares an unevaluated form of its actual parameters with the subprogram. Each time the subprogram accesses the parameter, the unevaluated form is evaluated. This mechanic is used to pass around code that the callee will execute later. Not many modern languages support pass-by-name; Scala is a notable exception.

One benefit of pass-by-name semantics is that they allow the user to define their own control structures as functions, perhaps like this definition of a repeat loop:

function repeat(n, body)
  for i to n
    body

repeat(4, print "x")      # prints xxxx

We effectively pass in the AST form of print "x" to the repeat function. Each time the body parameter is referenced, the node gets evaluated.

Even though few languages support pass-by-name, we can achieve something like it by explicitly wrapping the passed code up in a function. This delays its execution. Such wrapping usually introduces a lot of punctuation, but Kotlin reduces this noise. If the function is the last parameter, it may be moved out of the actual parameters list and placed after the call. That makes the call look like for, while, and other builtin control structures.
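The wrapping idea itself is not specific to Kotlin. Here is a minimal sketch in C++, which has no trailing-lambda syntax, so the wrapped code stays inside the parentheses. The names repeat, fourXs, and sumTwice are hypothetical, and std::function is just one of several ways to accept a callable.

```cpp
#include <functional>
#include <string>

// Hypothetical repeat "control structure": body is code wrapped in a function.
void repeat(int n, const std::function<void()> &body) {
  for (int i = 0; i < n; ++i) {
    body();   // the wrapped code runs again on every access
  }
}

// Hypothetical helper: appends to a string instead of printing.
std::string fourXs() {
  std::string out;
  repeat(4, [&out] { out += "x"; });
  return out;
}

// Hypothetical helper: accesses its parameter twice, so the wrapped
// expression is evaluated twice.
int sumTwice(const std::function<int()> &value) {
  return value() + value();
}
```

Calling fourXs() builds the string xxxx. If the function passed to sumTwice increments a counter and returns it, the two accesses produce two different values, which is the hallmark of pass-by-name.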

The term pass-by-name doesn't give any indication that a parameter is passed in unevaluated and then evaluated on each access. But the term has been used since 1960, and changing established vocabulary is difficult. If we were starting over with naming these mechanics, we might choose these more descriptive terms:
