Enumerations

Earlier we defined a type as a set of values and their operations. For many types, the set of possible values is implicit. There is, for example, no list of all 4.2 billion legal 4-byte integer values stored anywhere in a computer. However, sometimes we do want to define a new type by listing out its possible values. Such types are enumeration types, or enums.

Enums provide a menu of choices for a value. A meal option might be VEGETARIAN, VEGAN, GLUTENFREE, or OMNIVOROUS. A player class might be ELF, DWARF, WIZARD, ORC, or BALROG. A water temperature might be either SCALDING or FREEZING. Programmers often respond to the different choices with switch statements. Ideally, options not on the menu are forbidden. No one should be able to request ORC for a meal.

Languages differ widely in their support of enums. Ruby and JavaScript don't have them. C and C++ treat them as a thin veneer over integers. Java implements enums atop classes, allowing programmers to add custom behaviors. Haskell and Rust elevate enums to a fundamental custom data type. Let's examine these various treatments in more detail.

C and C++

In C and C++, we define an enumeration for the four classical elements of the ancient world with this syntax:

C++

enum element_t {
  FIRE,
  WATER,
  EARTH,
  AIR
};

The options of an enum are its variants. If we print the variants, we see that they look like integers.
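
As a minimal sketch of that claim, assuming the element_t definition above with its default numbering, the following C++ streams each variant to standard output. The helper names as_int and print_variants are hypothetical and exist only for illustration.

```cpp
#include <iostream>

enum element_t {
  FIRE,
  WATER,
  EARTH,
  AIR
};

// Hypothetical helper: return the integer behind a variant.
int as_int(element_t element) {
  return element;  // an unscoped enum converts to int implicitly
}

// Hypothetical helper: print every variant on one line.
void print_variants() {
  std::cout << FIRE << ' ' << WATER << ' ' << EARTH << ' ' << AIR << '\n';
}
```

Calling print_variants() writes 0 1 2 3, not the variant names.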

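Enums in C and C++ are in fact integers. Because they are integers, enums support integer operations like addition. This declaration compiles as C; C++ accepts the addition but needs a cast to convert the resulting integer back to element_t:

C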

enum element_t element = FIRE + 1;

But what would the value of FIRE + 20 be? Certainly not one of the four elements. Nevertheless, the expression compiles without complaint and yields the value 20.
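
The arithmetic can be sketched in C++ as follows, assuming the default numbering in which FIRE is 0. The helpers offset and is_variant are hypothetical names, not part of any library. The sum is kept as a plain int, so no cast back to element_t is needed.

```cpp
enum element_t {
  FIRE,
  WATER,
  EARTH,
  AIR
};

// Hypothetical helper: add an integer to a variant.
// The variant is promoted to int before the addition, so the result is an int.
int offset(element_t element, int delta) {
  return element + delta;
}

// Hypothetical helper: does an integer correspond to one of the four variants?
// Relies on the default numbering, in which the variants are contiguous.
bool is_variant(int value) {
  return value >= FIRE && value <= AIR;
}
```

Here offset(FIRE, 1) equals WATER, while offset(FIRE, 20) is 20, for which is_variant returns false.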

By default, the numbering starts at 0, and each successive variant is one more than its predecessor. This scheme may be manually overridden with explicit assignments. For example:

C++

enum element_t {
  FIRE = 1,
  WATER = 2,
  EARTH = 4,
  AIR = 8,
};
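
Explicit and default numbering can also be mixed. The sketch below uses a hypothetical renumbering of the elements, chosen only to show that a variant without an explicit value is one more than its predecessor. The helper value_of is likewise hypothetical.

```cpp
// Hypothetical numbering: only FIRE and EARTH are assigned explicitly.
enum element_t {
  FIRE = 10,
  WATER,       // one more than FIRE
  EARTH = 20,
  AIR          // one more than EARTH
};

// Checked at compile time.
static_assert(WATER == 11, "WATER continues from FIRE");
static_assert(AIR == 21, "AIR continues from EARTH");

// Hypothetical helper: return the integer behind a variant.
int value_of(element_t element) {
  return element;
}
```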

Aristotle classified earth and water as cold, and air and fire as hot. Consider this is_cold function that accepts an element_t parameter:

C++

bool is_cold(enum element_t element) {
  return element == EARTH || element == WATER;
}

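In C, enums and integers are interchangeable, so we can call this function with any integer. The call is_cold(156) typechecks, compiles, and executes just fine. C++ rejects the implicit conversion from integer to enum, but an explicit cast still lets arbitrary integers through. Apparently 156 is not cold. But it's not hot either.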

Since enums are open to any integers and not just the defined variants, enums in C and C++ are not typesafe.
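
The gap can be sketched in C++, where an explicit cast is needed to turn an integer into an element_t. Two things here are assumptions rather than part of the definitions above: the underlying type is fixed to int so that the cast is well defined for any int, and the helpers is_hot and from_int are hypothetical names added for illustration.

```cpp
// Assumption: the underlying type is fixed to int so that casting an
// arbitrary int to element_t is well defined.
enum element_t : int {
  FIRE = 1,
  WATER = 2,
  EARTH = 4,
  AIR = 8
};

bool is_cold(element_t element) {
  return element == EARTH || element == WATER;
}

// Hypothetical counterpart to is_cold, following Aristotle's classification.
bool is_hot(element_t element) {
  return element == AIR || element == FIRE;
}

// Hypothetical helper: build an element_t from any integer, variant or not.
element_t from_int(int value) {
  return static_cast<element_t>(value);
}
```

Both is_cold(from_int(156)) and is_hot(from_int(156)) return false. The type system never objects to an element that is neither hot nor cold.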

Java

Java takes a different approach to enums that is typesafe. The syntax is similar to C:

Java

enum Element {
  FIRE,
  WATER,
  EARTH,
  AIR
}

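However, instead of assigning each variant a unique integer, the Java compiler turns them into objects. The Element enum conceptually translates to something like this ordinary class. The listing is a simplification: the real constructor also passes each variant's name and ordinal up to Enum, and the compiler rejects handwritten classes that extend Enum: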

Java

final class Element extends Enum<Element> {
  public static final Element FIRE = new Element();
  public static final Element WATER = new Element();
  public static final Element EARTH = new Element();
  public static final Element AIR = new Element();

  private Element() {}
}

The variants are really instances of the Element class. Its constructor is marked private so that no other Element instances can be made. The class is marked final so that no subclasses can be made. The four variants are the only instances that will ever exist. If a method expects an Element, we won't be able to pass in a rogue element like AETHER or METAL. Java enums are therefore typesafe.

Java provides a handful of built-in operations on enum values, including toString, name, ordinal, and compareTo. But we can also add custom behaviors as regular methods:

Java

enum Element {
  FIRE,
  WATER,
  EARTH,
  AIR;

  public boolean isCold() {
    return this == EARTH || this == WATER;
  }
}

Normally we compare objects with the equals method. Since only four instances of Element will ever exist, each variant is uniquely identified by its reference. A fast identity comparison using == is sufficient.

Haskell

A data declaration in Haskell defines an enum:

Haskell

data Element = Fire | Water | Earth | Air

Unlike object-oriented languages, which organize related data and code into a single syntactic unit called a class, Haskell keeps the data and code separate. To add an operation to an enum, we define a standalone function that accepts a parameter of the enum type:

Haskell

isCold :: Element -> Bool
isCold element = element == Earth || element == Water

This code fails to compile because, by default, values of a newly declared type can't be compared with ==. We could use a case expression instead of a comparison:

Haskell

isCold :: Element -> Bool
isCold element = 
  case element of
    Fire -> False
    Water -> True
    Earth -> True
    Air -> False

But that's a little wordy. A less verbose option is to ask the compiler to define == for us. We announce with a deriving clause which typeclasses the enum belongs to, and the compiler automatically defines the functions needed by each typeclass. This definition has a deriving clause for typeclasses Eq and Show, so the Element type will have both an == function and a show function:

Haskell

data Element = Fire | Water | Earth | Air
  deriving (Eq, Show)

Now consider this enum that lists the three motion states of a vehicle:

Haskell

data Gear = Forward | Reverse | Park
  deriving (Eq, Show)

Suppose we need a pure function that runs on every frame of an animation and moves the vehicle according to its current state. We could write this case expression:

Haskell

tick :: Gear -> Int -> Int
tick gear position =
  case gear of
    Forward -> position + 1
    Reverse -> position - 1
    Park -> position

But Haskell also lets us pattern match on the variants directly in a function's parameters. This definition is semantically equivalent and breaks the logic up into one equation per variant:

Haskell

tick :: Gear -> Int -> Int
tick Forward position = position + 1
tick Reverse position = position - 1
tick Park position = position

Java's enum classes and Haskell's data definitions implement the OR operation of a type algebra. An enum value can be this variant or that variant or that other variant. Both languages also support the AND operation, which we'll see next.
