Lvalues and Rvalues

What a name does within a program depends on its context. For example, in the assignment ncountries = 195, the name ncountries is used only to locate the memory cell that needs to be updated. When a name is used for its location, it is an lvalue.

In the expression ncountries - 1, the name ncountries is used to get at its associated value. In this context, the name is an rvalue.

Historically, an lvalue appeared on the left-hand side of an assignment statement, while an rvalue appeared on the right. These definitions are not adequate, however, as lvalues may appear on the right and rvalues may appear on the left:

numbers[i] = 87;  // rvalue i appears on the left
p = &tmp;         // & treats tmp as an lvalue

The important distinction today is that in an lvalue context, a variable provides its cell's address. We use a cell's address in several contexts: assignments, an explicit address-of operation like & in C, and when offsetting from a struct's or array's base address to reach an element. In an rvalue context, the variable provides its cell's value. Its location is irrelevant. We use a cell's value when we perform logic or arithmetic.
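The contexts listed above can be sketched in C. This is a minimal illustration, not a definitive catalog; the struct `point` and the function name `lvalue_and_rvalue_contexts` are hypothetical names made up for this example.

```c
/* Hypothetical struct used only for illustration. */
struct point {
    int x;
    int y;
};

/* Exercises several lvalue contexts, then reads the results as rvalues. */
int lvalue_and_rvalue_contexts(void) {
    int tmp = 0;
    int numbers[3] = {10, 20, 30};
    struct point pt = {1, 2};
    int i = 1;

    tmp = 5;            /* assignment: tmp supplies the cell to update */
    int *p = &tmp;      /* address-of: tmp supplies its address */
    *p = 6;             /* writing through that address changes tmp */
    numbers[i] = 87;    /* indexing: numbers supplies a base address, i only a value */
    pt.y = 7;           /* member access: pt supplies a base address */

    /* rvalue context: arithmetic needs only the values 6, 87, and 7 */
    return tmp + numbers[i] + pt.y;
}
```

Calling the function yields 100. The locations of tmp, numbers, and pt matter on the lines that write to them and are irrelevant in the final sum.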

Pointers have lvalue and rvalue interpretations consistent with ordinary names. Like any name, a pointer is located at some address. But its value is also an address. If we want to know where the pointer points, we treat it as an rvalue. If we want the pointer's location, we treat it as an lvalue.
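Here is a small C sketch of the two interpretations of a pointer. The function name `pointer_views` and its local variables are assumptions chosen for illustration.

```c
#include <stdbool.h>

/* Returns true if the pointer's value and the pointer's own location
   behave as the two interpretations predict. */
bool pointer_views(void) {
    int tmp = 42;
    int *p = &tmp;     /* the value stored in p is the address of tmp */
    int **pp = &p;     /* &p treats p as an lvalue: the address of p itself */

    /* p as an rvalue: where does it point? */
    bool points_at_tmp = (p == &tmp);

    /* p as an lvalue: its own cell is a different cell from tmp's */
    bool has_own_cell = ((void *)pp != (void *)p);

    /* follow pp to p, then p to tmp */
    bool reaches_value = (**pp == 42);

    return points_at_tmp && has_own_cell && reaches_value;
}
```

The function returns true: p holds one address as its value and lives at another.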

Some programming languages, like Haskell, hide the von Neumann architecture. Haskell's designers prefer a more mathematical view of computation. Mathematics, for example, doesn't have a bank of memory cells that change over time. When mathematicians equate a name with a value, as in \(\pi = 3.14159\), they are declaring a synonym: \(\pi\) is and will always be shorthand for \(3.14159\). When we name a cell in the von Neumann architecture and equate it with a value, we are making a temporary association. We may write the declaration double pi = 3.14159 only to reassign it later with pi = 3.14. Haskell does not permit reassignment.

Mathematicians use the = operator to assert that two expressions have the same value. Many programming languages use the = operator to store a value in a memory cell, which is a subtly different idea. This difference has led the designers of some programming languages to use a different operator for assignment. ALGOL, Pascal, and Ada use :=, which is sometimes called the walrus operator because the colon and equals sign resemble a walrus's eyes and tusks. Here's an assignment statement in Ada:

Opacity := 0.25;

In R, we may assign variables with the traditional = operator championed by Fortran and C, or we may use the arrow operators:

opacity = 0.25
opacity <- 0.25
0.25 -> opacity

When we equate two values in mathematics, we are always comparing rvalues. There are no memory cells in mathematics and therefore no lvalues. In a program, however, we may be able to choose between comparing two names as lvalues or as rvalues. Two names have the same lvalue if they are associated with the same address. The two names are said to have the same identity. Two names have the same rvalue if they are associated with the same value—regardless of where they're stored. The two names are said to be equivalent.

Equivalent names may or may not have the same identity, but two names of the same identity are necessarily equivalent.

Suppose p and q are two names for objects in a Java program. They are compared as lvalues using the == operator, and they are compared as rvalues using the equals method:

if (p == q) {
  System.out.println("same identity, equivalent values");
} else if (p.equals(q)) {
  System.out.println("different identities, equivalent values");
}

Having two ways of comparing values is a source of bugs. Do we really need both of them? Yes.

Imagine we are developing a program for drawing shapes. Suppose a user draws two circles on top of each other. The circles have the same center and radius, so they are equivalent. If the user attempts to delete one of the circles, we won't know which to delete if we only consider equivalence. We might choose one or the other or both. The ambiguity is resolved if we have a notion of identity. We delete the circle with the matching identity.
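The circle scenario might look like the following in C. This is a sketch under assumptions: the `circle` struct, its field names, and the helpers `circle_equivalent` and `delete_by_identity` are hypothetical and stand in for whatever a real drawing program would use.

```c
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical circle record; the field names are assumptions. */
struct circle {
    double cx, cy, radius;
};

/* Equivalence: same center and radius, wherever each circle is stored. */
bool circle_equivalent(const struct circle *a, const struct circle *b) {
    return a->cx == b->cx && a->cy == b->cy && a->radius == b->radius;
}

/* Removes the circle whose identity (address) matches target from an
   array of pointers and returns the new count. Equivalent circles stored
   at other addresses are kept. */
size_t delete_by_identity(struct circle **shapes, size_t count,
                          const struct circle *target) {
    size_t kept = 0;
    for (size_t i = 0; i < count; i++) {
        if (shapes[i] != target) {   /* compare addresses, not fields */
            shapes[kept++] = shapes[i];
        }
    }
    return kept;
}
```

If two equivalent circles a and b are stored in the array and we delete by the address of b, only a remains. Deleting by equivalence instead would have matched both.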

Elsewhere in the program, the account creation dialog prompts the user to enter their password twice. If we only consider identity, the two passwords won't ever match because the entered passwords will be at different memory locations. We need equivalence to compare the characters instead of the addresses.
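The password check can be sketched in C with two helpers. The names `same_buffer` and `same_text` are hypothetical; `strcmp` is the standard library function that compares the characters of two null-terminated strings.

```c
#include <stdbool.h>
#include <string.h>

/* Identity: do both names refer to the same buffer? */
bool same_buffer(const char *first, const char *second) {
    return first == second;
}

/* Equivalence: do the two buffers hold the same characters? */
bool same_text(const char *first, const char *second) {
    return strcmp(first, second) == 0;
}
```

Given two separate character arrays that both hold the same password, same_buffer reports false and same_text reports true, so the dialog needs the second comparison.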

Comparing lvalues is extremely fast since just two addresses are compared. But lvalue comparison is only appropriate when identity is what matters. Comparing rvalues is more versatile, allowing two values in different cells to still be equal, but it's also more expensive since we may have to compare all the bytes associated with the two values.
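One way to combine the two comparisons is to test identity first and fall back to a byte-by-byte comparison only when the addresses differ. The sketch below assumes both buffers hold n plain bytes with no padding; the name `bytes_equivalent` is hypothetical, while `memcmp` is the standard library routine for comparing raw memory.

```c
#include <stdbool.h>
#include <stddef.h>
#include <string.h>

/* Compares two buffers of n bytes. Identity is checked first because a
   single address comparison settles the question when both names refer
   to the same cell. */
bool bytes_equivalent(const void *a, const void *b, size_t n) {
    if (a == b) {
        return true;                 /* same identity implies equivalence */
    }
    return memcmp(a, b, n) == 0;     /* otherwise examine up to n bytes */
}
```

The fast path costs one comparison regardless of n, while the fallback may touch every byte of both buffers.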

Dear Computer
