Namespaces

Imagine we have been working on a project for which we've built a robust Matrix class. One day we need a complex routine that we don't have time to research and implement ourselves, so we add a dependency on a third-party library that also has a Matrix class. The two Matrix classes collide. They can only coexist if we are using a language that supports namespaces. A namespace is a collection of related classes, functions, types, and variables that are organized into a common scope. Java and Kotlin call them packages. Ruby, Python, and JavaScript call them modules. C++ and C# call them namespaces.

Defining Namespaces

Namespaces are traditionally defined explicitly by the programmer and are logically independent of the source files. In C++, for instance, a single source file may contain definitions for several namespaces:

C++

namespace math {
  const double PI = 3.14159265358979323846;
  const double TAU = 2 * PI;
}

namespace physics {
  const double GRAVITY = 9.81;
}

A single namespace may also have its definitions spread across multiple files.

These namespaces help with the clashing Matrix problem. Our use of the name Matrix in our namespace doesn't infringe on someone else using the name Matrix in a different namespace. Though namespaces reduce the probability of a collision, they don't eliminate it entirely. If both namespaces have the same name, the names will still clash. The name math, for example, is probably a poor choice for a namespace because it is too common. The Java community recommends naming packages with a reversed domain name prefix and a series of nested descriptors, as in org.twodee.pong.physics.
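As a small illustration of two namespaces sharing a name, Python's standard library has a math module and a cmath module, and both define a function called sqrt. The sketch below uses Python's dot-qualified names to pick one or the other; the variable names are only for illustration.

```python
import cmath
import math

# Both modules define sqrt, but each definition lives in its own namespace,
# so qualifying the name with the module selects the one we want.
real_root = math.sqrt(16)        # real-valued square root
complex_root = cmath.sqrt(-16)   # complex-valued square root

print(real_root)
print(complex_root)
```

Because each sqrt is reached through its module, neither definition hides the other.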

Elements within a namespace can freely access their sibling elements. Elements outside the namespace may access any elements that have been made public or exported. In some languages, top-level elements are public by default. Any elements that should not be visible must be explicitly made private, as in this Kotlin package:

Kotlin

package org.example

private fun helperMethod() {
  // ...
}

class Matrix {
  // ...
}

In other languages, elements are private by default and must be explicitly exported, as in this JavaScript module:

JavaScript

function helperMethod() {
  // ...
}

export class Matrix {
  // ...
}

JavaScript modules are uncharacteristically implicit. Each separate source file is its own module, and currently there's no special syntax for naming the module.

Ruby modules lack a clean mechanism for making elements of a module private. This module leaves helper_method public:

Ruby

module MyMath
  # Defined on the module itself so it can be called as MyMath.helper_method.
  def self.helper_method
    # ...
  end

  class Matrix
    # ...
  end
end

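Python similarly lacks an enforced mechanism for making module elements private. By convention, a leading underscore marks a name as internal, but nothing prevents outside code from accessing it.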
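What Python offers instead is a naming convention, which can be sketched in a single file. The module name mymath and its two functions are hypothetical, and the module is built in memory with types.ModuleType only so that the example does not need a second source file. A leading underscore marks _helper as internal, yet outside code can still call it. The one effect shown here is that a wildcard import skips the underscore-prefixed name.

```python
import sys
import types

# Source for a hypothetical module named "mymath".
source = """
def _helper(x):
    return x * x

def square_sum(a, b):
    return _helper(a) + _helper(b)
"""

# Build the module in memory and register it so that import can find it.
mymath = types.ModuleType("mymath")
exec(source, mymath.__dict__)
sys.modules["mymath"] = mymath

# Nothing stops outside code from calling the underscore-prefixed helper.
print(mymath._helper(3))
print(mymath.square_sum(3, 4))

# A wildcard import, however, leaves out names that start with an underscore.
imported = {}
exec("from mymath import *", imported)
print("square_sum" in imported)
print("_helper" in imported)
```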

Importing

We have two options for accessing an element of a namespace outside of that namespace. We may qualify its name by prefixing it with the name of the namespace. For example, to access PI in the math namespace in C++, we'd write:

C++

double circumference = diameter * math::PI;

The :: is the scope resolution operator in C++ and Ruby. Many other languages, including Java, Kotlin, and Python, use the dot operator. In Java, the ArrayList class is defined in the java.util package. We could make a new instance using the fully qualified name:

Java

public class Main {
  public static void main(String[] args) {
    java.util.ArrayList<String> quotations = new java.util.ArrayList<>();
  }
}

But fully qualified names are verbose. If there is no collision between names, we may import ArrayList so that we can refer to it without qualification:

Java

import java.util.ArrayList;

public class Main {
  public static void main(String[] args) {
    ArrayList<String> quotations = new ArrayList<>();
  }
}

If the final component of the import statement is a wildcard (*), all elements of the package are available without qualification. C++ has a similar using directive:

C++

using namespace math;
double circumference = diameter * PI;
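Python offers comparable import forms, sketched here with the standard math and cmath modules. The variable names are illustrative. Because both modules define sqrt, the sketch renames one of them on import with the as keyword, which is one way to keep both available without qualification.

```python
import math                      # whole module; names must be qualified
from math import pi, tau         # selected names; usable without qualification
from math import sqrt
from cmath import sqrt as csqrt  # renamed on import to avoid clashing with sqrt

diameter = 2.0
qualified = diameter * math.pi   # qualified name
unqualified = diameter * pi      # imported name, no qualifier needed

print(qualified == unqualified)
print(tau == 2 * pi)
print(sqrt(9.0))
print(csqrt(-9.0))
```

Had the second sqrt been imported without a new name, it would have silently replaced the first.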

Ruby does not have a facility for importing individual elements from a module. We import the entire module, and then fully qualify each name that we access.

No Namespaces

Not every language supports namespaces. C is one that does not, despite and perhaps because of its popularity. It is one of the oldest languages still in use, and adding namespaces would likely introduce changes to the binary interface that would break a lot of applications. C's lack of namespaces is a major problem. As projects grow in size and gain dependencies, name collisions become increasingly likely. As a workaround, C programmers often name their types and functions in unique ways, perhaps by prepending their project name:

C

typedef struct {
  /* ... */
} interneat_config;

int interneat_connect(interneat_config *config);
int interneat_disconnect(interneat_config *config);

Cross your fingers and hope that there's no other project named Interneat.

C's lack of namespaces also presents a problem for C++. While C++ does have namespaces, it is binary compatible with C. This means that compiled C++ code can be linked with compiled C code. Since their common binary interface does not recognize the concept of namespaces, the namespaces are lost upon compilation and the threat of name collisions persists. C++ reduces this threat through mangling. The compiler translates or mangles the fully qualified name into a unique identifier that also encodes metadata about method parameters. Consider the following C++ class:

C++

namespace threedee {
  class Matrix {
    Matrix(int nrows, int ncolumns) {
      // ...
    }
  };
}

The Clang++ compiler mangles the constructor's name to __ZN8threedee6MatrixC1Eii. Hopefully no C programmer with whom we are collaborating chooses that name for their own structure.
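To make the encoding concrete, here is a rough Python sketch that rebuilds the mangled name above from its parts. It assumes the Itanium C++ ABI scheme that Clang follows on most platforms, and it handles only this one shape: a constructor of a class nested in a single namespace, with parameter types supplied as ready-made codes. The function name mangle_constructor is hypothetical and not part of any real tool.

```python
def mangle_constructor(namespace, class_name, param_codes):
    """Build an Itanium-style mangled name for a namespaced class constructor."""
    # Each identifier is written as its length followed by its characters.
    nested = "".join(f"{len(part)}{part}" for part in (namespace, class_name))
    # _Z starts a mangled name, N...E brackets a nested (qualified) name,
    # and C1 marks a complete-object constructor.
    return "_ZN" + nested + "C1" + "E" + "".join(param_codes)


# "i" is the code for an int parameter.
symbol = mangle_constructor("threedee", "Matrix", ["i", "i"])
print(symbol)
```

The sketch produces _ZN8threedee6MatrixC1Eii. The name quoted above carries one more leading underscore, which is consistent with the extra underscore that macOS toolchains prepend to symbol names.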

Summary

Modern computers still have at their core a processor and a bank of mutable memory cells—a von Neumann computer. High-level programming languages make it possible for us to attach names to those memory cells. Good names make code easier for humans to understand and use correctly. As we write code that accesses memory, we have to consider whether we care about a cell's location (its lvalue) or the data inside of it (its rvalue). Languages provide two different mechanisms for preventing changes to variables or their values. We can lock the cells themselves so that their values remain constant. Or we can lock the names so that the names can't be transferred to other cells. Memory cells are also protected by limiting their scope to just the parts of a program that need access. To help organize scopes, we employ various encapsulation strategies like functions, classes, and namespaces.
