Namespaces

Dear Computer

Chapter 2: Naming Things

Imagine you have been working on a project for which you've built a robust Matrix class. One day you need a complex routine that you don't have time to research and implement yourself, so you add a dependency on a third-party library that also has a Matrix class. The two Matrix classes collide. They can only coexist if you are using a language that supports namespaces. A namespace is a collection of related classes, functions, types, and variables that are organized into a common scope. Java and Kotlin call them packages. Ruby, Python, and JavaScript call them modules. C++ and C# call them namespaces.

Defining Namespaces

Traditionally, namespaces are defined explicitly by the programmer and are logically independent of the source files. In C++, for instance, a single source file may contain definitions for several namespaces:

C++
namespace math {
  const double PI = 3.14159265358979323846;
  const double TAU = 2 * PI;
}

namespace physics {
  const double GRAVITY = 9.81;
}

A single namespace may also have its definitions spread across multiple files.
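As a minimal sketch of a namespace whose definitions are spread out, the two blocks below both contribute to math. They are shown together here, but each could sit in its own file. The file names in the comments and the circle_area helper are hypothetical and serve only as illustration.

```cpp
// Imagine this block lives in constants.h.
namespace math {
  const double PI = 3.14159265358979323846;
  const double TAU = 2 * PI;
}

// Imagine this block lives in circle.h. It reopens math rather than
// defining a second, unrelated namespace.
namespace math {
  // circle_area is a hypothetical helper, not part of the original example.
  inline double circle_area(double radius) {
    return PI * radius * radius;  // PI is a sibling, so no qualifier is needed
  }
}
```

Code outside the namespace would call math::circle_area(2.0), exactly as if both blocks had been written as one.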

These namespaces help with the clashing Matrix problem. Your use of the name Matrix in your namespace doesn't infringe on someone else using the name Matrix in a different namespace. Though namespaces reduce the probability of a collision, they don't eliminate it entirely. If both namespaces have the same name, the names will still clash. The name math, for example, is probably a poor choice for a namespace because it is too common. The Java community recommends naming packages with a reversed domain name prefix and a series of nested descriptors, as in org.twodee.pong.physics.
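The clashing Matrix scenario might look something like the following sketch. The namespace names myproject and vendor, the origin method, and the describe_both function are all made up for illustration; a real third-party library would pick its own names.

```cpp
#include <string>

// Hypothetical namespace for your own project.
namespace myproject {
  struct Matrix {
    std::string origin() const { return "myproject"; }
  };
}

// Hypothetical namespace for the third-party library.
namespace vendor {
  struct Matrix {
    std::string origin() const { return "vendor"; }
  };
}

// Both types are usable side by side because each name is qualified.
inline std::string describe_both() {
  myproject::Matrix ours;
  vendor::Matrix theirs;
  return ours.origin() + " and " + theirs.origin();
}
```

Had both libraries chosen the same namespace name, the two Matrix definitions would land in the same scope and collide again.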

Elements within a namespace can freely access their sibling elements. Elements outside the namespace may access any elements that have been made public or exported. In some languages, top-level elements are public by default. Any elements that should not be visible must be explicitly made private, as in this Kotlin package:

Kotlin
package org.example

private fun helperMethod() {
  // ...
}

class Matrix {
  // ...
}

In other languages, elements are private by default and must be explicitly exported, as in this JavaScript module:

JavaScript
function helperMethod() {
  // ...
}

export class Matrix {
  // ...
}

JavaScript modules are uncharacteristically implicit. Each separate source file is its own module, and currently there's no special syntax for naming the module.

Ruby modules lack a clean mechanism for making elements of a module private. This module leaves helper_method public:

Ruby
module MyMath
  # Defined on the module itself, so any code can call MyMath.helper_method.
  def self.helper_method
    # ...
  end

  class Matrix
    # ...
  end
end

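Python similarly lacks a mechanism for enforcing privacy on module elements. By convention, a leading underscore marks a name as internal, but nothing stops outside code from accessing it.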

Importing

You have two options for accessing an element of a namespace outside of that namespace. You may qualify its name by prefixing it with the name of the namespace. For example, to access PI in the math namespace in C++, you'd write:

C++
double circumference = diameter * math::PI;

The :: is the scope resolution operator in C++ and Ruby. Many other languages, including Java, Kotlin, and Python, use the dot operator. In Java, the ArrayList class is defined in the java.util package. You could make a new instance using the fully-qualified name:

Java
public class Main {
  public static void main(String[] args) {
    java.util.ArrayList<String> quotations = new java.util.ArrayList<>();
  }
}

But fully-qualified names are verbose. If there is no collision between names, you may import ArrayList so that you can refer to it without qualification:

Java
import java.util.ArrayList;

public class Main {
  public static void main(String[] args) {
    ArrayList<String> quotations = new ArrayList<>();
  }
}

If the final field of the import statement is a wildcard (*), all elements of the package are available without qualification. C++ has a similar using directive:

C++
using namespace math;
double circumference = diameter * PI;

Ruby does not have a facility for importing individual elements from a module. You import the entire module, and then fully qualify each name that you access.
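C++, by contrast, can import a single name. The sketch below uses a using-declaration, which plays roughly the role of a single-class import in Java. The circumference and full_turn functions are hypothetical and exist only to show which names need a qualifier.

```cpp
namespace math {
  const double PI = 3.14159265358979323846;
  const double TAU = 2 * PI;
}

// A using-declaration imports one name, not the whole namespace.
using math::PI;

inline double circumference(double diameter) {
  return diameter * PI;  // unqualified, thanks to the using-declaration
}

inline double full_turn() {
  return math::TAU;  // TAU was not imported, so it stays qualified
}
```

Importing names one at a time keeps the rest of the namespace out of the surrounding scope, which lowers the chance of reintroducing the collisions that namespaces were meant to prevent.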

No Namespaces

Not every language supports namespaces. C does not, despite and perhaps because of its popularity. It is one of the oldest languages still in use, and adding namespaces would likely introduce changes to the binary interface that would break a lot of applications. C's lack of namespaces is a major problem: as projects grow in size and gain dependencies, name collisions become increasingly likely. As a workaround, C programmers often name their types and functions in unique ways, perhaps by prepending their project name:

C
typedef struct {
  /* ... */
} interneat_config;

/* Return types are assumed here; the project prefix is the point. */
int interneat_connect(interneat_config *config);
int interneat_disconnect(interneat_config *config);

Cross your fingers and hope that there's no other project named Interneat.

C's lack of namespaces also presents a problem for C++. While C++ does have namespaces, it is binary compatible with C. This means that compiled C++ code can be linked with compiled C code. Since their common binary interface does not recognize the concept of namespaces, the namespaces are lost upon compilation and the threat of name collisions persists. C++ reduces this threat through mangling. The compiler translates or mangles the fully-qualified name into a unique identifier that also encodes metadata about method parameters. Consider the following C++ class:

C++
namespace threedee {
  class Matrix {
    Matrix(int nrows, int ncolumns) {
      // ...
    }
  };
}

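The Clang++ compiler mangles the constructor's name to __ZN8threedee6MatrixC1Eii. The 8 and 6 give the lengths of threedee and Matrix, and the trailing ii records the two int parameters. Hopefully no C programmer with whom you are collaborating chooses that name for their own structure.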
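One way to see that a mangled name still carries the namespace, class, and parameter types is to reverse the process. The sketch below relies on abi::__cxa_demangle from the cxxabi.h header, which is a GCC and Clang extension rather than standard C++, so it won't build with every compiler. The demangle wrapper is a hypothetical helper. The sketch also assumes that the first of the two leading underscores in the name above is a platform symbol prefix and not part of the mangled name proper, so it passes the name with a single underscore.

```cpp
#include <cxxabi.h>  // GCC/Clang extension, not standard C++
#include <cstdlib>
#include <string>

// Returns the human-readable form of a mangled name, or an empty
// string if the name can't be demangled.
inline std::string demangle(const char *mangled) {
  int status = 0;
  char *readable = abi::__cxa_demangle(mangled, nullptr, nullptr, &status);
  if (status != 0 || readable == nullptr) {
    return "";
  }
  std::string result(readable);
  std::free(readable);  // the buffer was allocated with malloc
  return result;
}
```

Calling demangle("_ZN8threedee6MatrixC1Eii") should yield threedee::Matrix::Matrix(int, int) on toolchains that follow this mangling scheme.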

Summary

Modern computers still have at their core a processor and a bank of mutable memory cells—a von Neumann computer. High-level programming languages make it possible for us to attach names to those memory cells. Good names make code easier for humans to understand and use correctly. As we write code that accesses memory, we have to consider whether we care about a cell's location (its lvalue) or the data inside of it (its rvalue). Languages provide two different mechanisms for locking memory. We can lock the cells themselves so that their values remain constant. Or we can lock the names so that the names can't be transferred to other cells. Memory cells are also protected by limiting their scope to just the parts of a program that need access. To help organize scopes, we employ various encapsulation strategies like functions, classes, and namespaces.
