Passing Mechanics

Dear Computer

Chapter 4: Functions

Parameters and return values appear to magically arrive at their destination when a program runs. But a real human had to orchestrate how data gets shared between the caller and the callee. Many language developers have faced this problem, and they've come up with a variety of sharing mechanics. We examine here a few of the most common.

Pass-by-value

With a pass-by-value mechanic, independent copies of the actual parameters are written into a function's formal parameters. Any assignments to the formal parameters affect only the local copies in the function's stack frame. Pass-by-value parameters are strictly in parameters. C, Java, JavaScript, and Ruby are all languages with pass-by-value semantics.
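The isolation between caller and callee can be seen in a small C++ sketch. The names `doubleIt` and `demoPassByValue` are made up for illustration. The assertions state what we expect to hold after the call.

```cpp
#include <cassert>

// The formal parameter n is an independent copy of the actual parameter.
int doubleIt(int n) {
  n = n * 2;   // changes only the callee's local copy
  return n;
}

// Hypothetical driver showing that the caller's variable is untouched.
void demoPassByValue() {
  int count = 21;
  int doubled = doubleIt(count);
  assert(doubled == 42);
  assert(count == 21);   // the caller's variable still holds its old value
}
```

Calling `demoPassByValue()` from `main` runs the checks. The only way the doubled value leaves the function is through the return value.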

Pass-by-value pervades modern languages, perhaps because it cleanly separates the caller's and callee's memory and requires less labor to implement. However, copying values can incur extra runtime costs, especially if the data being copied is a big struct or array. Some languages avoid the costs of pass-by-value by passing primitives differently than non-primitives. To pass a primitive, an independent copy of the value is assigned to the formal parameter. To pass a non-primitive, an independent copy of a pointer to the value is assigned to the formal parameter. Pointers are usually much smaller than the data they point to. Typically they consume only 4 or 8 bytes and are as fast to copy as a primitive.

Any changes made to a passed pointer are local and will not affect the caller. However, any changes made through a passed pointer will reach the shared memory to which both the caller's and callee's pointers point. In this way, a callee can still effect changes to the caller's data, even though the language has pass-by-value semantics.
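The difference between changing a pointer and changing through a pointer might be sketched in C++ like this. The `Point` struct and the functions `moveRight`, `retarget`, and `demoPointerPassing` are hypothetical names invented for the example.

```cpp
#include <cassert>

struct Point {
  int x;
  int y;
};

// p is a copy of the caller's pointer. Writing through it reaches the
// caller's Point.
void moveRight(Point *p) {
  p->x += 10;
}

// Reassigning p changes only the callee's local copy of the pointer.
void retarget(Point *p) {
  static Point other = {-1, -1};
  p = &other;
  p->x = 99;   // modifies other, not the caller's Point
}

void demoPointerPassing() {
  Point origin = {0, 0};
  Point *handle = &origin;

  moveRight(handle);
  assert(origin.x == 10);      // change made through the pointer is visible

  retarget(handle);
  assert(handle == &origin);   // caller's pointer still targets origin
  assert(origin.x == 10);      // and origin was not touched
}
```

Both functions receive their pointer by value. Only `moveRight` affects the caller, because it writes to the memory the pointer targets rather than to the pointer itself.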

Pass-by-result

With a pass-by-result mechanic, the caller passes an out parameter in the list of actual parameters. The callee allocates storage for this parameter but does not initialize it with any incoming value from the caller. The function executes and assigns the parameter a value. When the function finishes, this value is copied into the variable designated by the caller.

Pass-by-result is strictly used to implement return values. In C#, the TryParse method returns a boolean that indicates the success of the parse. The parsed result is saved in a variable passed as an out parameter:

C#
int number;
if (Int32.TryParse("1861", out number)) {
  // use number
} else {
  Console.WriteLine("parse failed");
}

Unlike most parameters, out parameters must be variables or some other lvalue expression. Since there's an implied assignment, you can't pass an rvalue expression like 2 * i.
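C++ has no native out parameters, but the copy-out step can be emulated by hand. In the sketch below, a reference stands in for the variable designated by the caller, and the hypothetical function `tryParseDigits` never reads it. The sketch assumes short strings of decimal digits and does no overflow checking.

```cpp
#include <cassert>
#include <string>

// Emulates pass-by-result. The function works on its own storage and copies
// the value to the caller's variable once, just before returning.
bool tryParseDigits(const std::string &text, int &out) {
  int result = 0;                 // callee-owned storage for the out parameter
  bool ok = !text.empty();
  for (char c : text) {
    if (c < '0' || c > '9') {
      ok = false;
      break;
    }
    result = result * 10 + (c - '0');
  }
  if (!ok) {
    result = 0;                   // report a default value on failure
  }
  out = result;                   // copy-out step when the function finishes
  return ok;
}

void demoPassByResult() {
  int number = -7;                // this incoming value is never read
  assert(tryParseDigits("1861", number));
  assert(number == 1861);
  assert(!tryParseDigits("18x1", number));
  assert(number == 0);
}
```

Because of the assignment to `out`, a call like `tryParseDigits("1861", 2 * i)` would not compile. The second argument must be something that can be assigned to.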

Pass-by-value-result

With a pass-by-value-result mechanic, formal parameters are initialized with copies of the actual parameters, and any changes to these parameters are copied back to the caller when the function finishes. This mechanic is a combination of pass-by-value and pass-by-result. Since the parameters can be both read from and written to, they are in-out parameters.
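C++ does not offer pass-by-value-result directly, so the following sketch emulates the copy-in and copy-out steps by hand. All function names are hypothetical. For contrast, a second function touches the caller's memory directly through references. The two behave differently when the caller passes the same variable twice.

```cpp
#include <cassert>

// Emulated pass-by-value-result: copy in, work on locals, copy out at the end.
void bumpBothValueResult(int &first, int &second) {
  int a = first;    // copy-in
  int b = second;   // copy-in
  a += 1;
  b += 1;
  first = a;        // copy-out
  second = b;       // copy-out
}

// Direct references for comparison: every assignment hits the caller at once.
void bumpBothReference(int &a, int &b) {
  a += 1;
  b += 1;
}

void demoValueResult() {
  int x = 0;
  bumpBothValueResult(x, x);   // both locals start as copies of 0
  assert(x == 1);              // each copy-out writes 1

  int y = 0;
  bumpBothReference(y, y);     // both formals name the same cell
  assert(y == 2);              // the increments accumulate
}
```

The aliased call makes the copying visible. With copies, each local is bumped once and written back, so the caller ends with 1. Without copies, both increments land on the same variable.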

Pass-by-reference

With a pass-by-reference mechanic, an actual parameter and a formal parameter are two different names for the same memory cell. Assignments to a formal parameter will therefore immediately effect changes in the caller's memory. Unlike pass-by-result and pass-by-value-result, there is no copying between callee storage and caller storage.

C++ and Rust are among the few modern mainstream languages that truly support the pass-by-reference mechanic. Some folks will try to tell you Java or Ruby are pass-by-reference, perhaps claiming that a method can change a parameter, like this:

Java
public void budge(ArrayList<String> names) {
  names.add(0, "Thanos");            // change the list
}

They are right that the method is changing memory beyond itself. However, the memory being changed is not the caller's parameter but rather the object to which the parameter refers. If you tried to change the parameter, that change would be local:

Java
public static void budge(ArrayList<String> names) {
  names = new ArrayList<String>();   // change the parameter
  names.add("Thanos");
  names.add("Thanos");
  names.add("Thanos");
  names.add("Thanos");
}

public static void main(String[] args) {
  ArrayList<String> names = new ArrayList<>();
  budge(names);
  System.out.println(names);   // prints []
}

In C++, we can see true pass-by-reference at work in the canonical swap routine:

C++
void swap(int &a, int &b) {
  int tmp = a;
  a = b;
  b = tmp;
}

Those assignments to a and b modify the caller's actual parameters. This behavior means that pass-by-reference is an implementation of in-out parameter semantics.
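A caller's view of this behavior might look like the following sketch. It repeats the swap routine under the hypothetical name `swapInts` so that the example stands alone, and it adds a made-up helper, `sameCell`, that compares addresses.

```cpp
#include <cassert>

// The swap routine from the text, renamed to avoid any clash with std::swap.
void swapInts(int &a, int &b) {
  int tmp = a;
  a = b;
  b = tmp;
}

// Returns true when the formal parameter names the caller's own memory cell.
bool sameCell(const int &formal, const int *actualAddress) {
  return &formal == actualAddress;
}

void demoPassByReference() {
  int left = 1;
  int right = 2;
  swapInts(left, right);
  assert(left == 2 && right == 1);   // caller's variables were exchanged
  assert(sameCell(left, &left));     // formal and actual share one address
}
```

The address check is the heart of the matter: the formal parameter is not a copy that lives elsewhere but another name for the caller's variable.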

Most of the confusion about pass-by-reference is due to terminology, as reference is an overloaded term. C++ references and the implicit pointers of Java, Ruby, and JavaScript have different meanings.

Pass-by-name

With a pass-by-name mechanic, the caller shares an unevaluated form of its actual parameters with the subprogram. Each time the subprogram accesses the parameter, the unevaluated form is evaluated. This mechanic is used to pass around code that the callee will execute later. Not many modern languages support pass-by-name; Scala is a notable exception.

One benefit of pass-by-name semantics is that they allow the user to define their own control structures as functions, perhaps like this definition of a repeat loop:

function repeat(n, body)
  for i from 1 to n
    body

repeat(4, print "x")      # prints xxxx

We want the body parameter to get evaluated each time it is accessed in the repeat function, not just once.
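C++ does not support pass-by-name, but the repeated evaluation can be approximated by wrapping the parameter in a lambda that the callee invokes on each access. This is a simulation under that assumption, not true pass-by-name, and the names `repeat` and `demoPassByName` are illustrative.

```cpp
#include <cassert>
#include <functional>
#include <string>

// body is an unevaluated computation that runs every time it is called.
void repeat(int n, const std::function<void()> &body) {
  for (int i = 0; i < n; ++i) {
    body();   // re-evaluate the "parameter" on each access
  }
}

void demoPassByName() {
  std::string output;
  repeat(4, [&output] { output += "x"; });
  assert(output == "xxxx");   // the body ran four times, not once
}
```

Had the body been evaluated once before the call, as pass-by-value would do, only a single x would have been appended. In a language with real pass-by-name, the caller would not need to write the lambda wrapper explicitly.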

The term pass-by-name doesn't give any indication that a parameter is passed in unevaluated and then evaluated on each access. But the term has been used since 1960, and changing established vocabulary is difficult.
