Enums with Data

Dear Computer

Chapter 9: Types Revisited

A variant in a Java enum can be more than just a name. Since each variant is an object, we may add extra state. Suppose we have a Direction enum that lists four possible directions of movement:

Java
enum Direction {
  NORTH,
  SOUTH,
  EAST,
  WEST
}

We'd like to associate each name with its x- and y-offsets. For instance, NORTH moves 0 units on the x-axis and 1 unit on the y-axis. We do this by giving Direction a couple of instance variables and a private constructor that initializes them:

Java
enum Direction {
  NORTH(0, 1),
  SOUTH(0, -1),
  EAST(1, 0),
  WEST(-1, 0);

  public final int x;
  public final int y;

  private Direction(int x, int y) {
    this.x = x;
    this.y = y;
  }
}

The four implicit instantiations at the top are modified to pass along the actual parameters to the constructor.

These instance variables save us from having to write conditional logic to orchestrate movement. Instead of asking what enum we have and offsetting accordingly, we just apply the offsets that the Direction gives us:

Java
void move(Direction direction) {
  position.x += direction.x;
  position.y += direction.y;
}

All values in a Java enum have the same shape. If we need NORTH to have certain instance variables, then SOUTH will have the same ones. In Haskell, each enum variant is shaped separately. This is an equivalent Direction enum:

Haskell
data Direction =
  North Int Int |
  South Int Int |
  East Int Int |
  West Int Int

Each variant of Direction now carries two Int fields, much like a 2-tuple. To create a Direction, the client has to provide both numbers: North 0 1. This is not as convenient as the Java enum, whose data was baked into the variants at the point of their definition. Since the offsets aren't predetermined like they are in the Java enum, we could write North 5 (-3), which isn't very northy. Giving Direction fields doesn't make sense in Haskell. It would be better to leave the variants without fields and write a function that associates a Direction with an offset, like this one:

Haskell
offset :: Direction -> (Int, Int)
offset North = (0, 1)
offset South = (0, -1)
offset East = (1, 0)
offset West = (-1, 0)
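
The offset function matches on variants that carry no fields, so it presumes a Direction defined without the Int pairs. Here is a minimal sketch of that arrangement. The deriving clause, the move helper, and main are assumptions added for illustration; they are not part of the definitions above.

```haskell
-- Direction with no fields: the offsets live in a function instead.
data Direction = North | South | East | West
  deriving (Show, Eq)

-- Associates each Direction with its (x, y) offset.
offset :: Direction -> (Int, Int)
offset North = (0, 1)
offset South = (0, -1)
offset East = (1, 0)
offset West = (-1, 0)

-- Hypothetical helper: shifts a position one step in the given direction.
move :: (Int, Int) -> Direction -> (Int, Int)
move (x, y) direction = (x + dx, y + dy)
  where (dx, dy) = offset direction

main :: IO ()
main = do
  print (move (0, 0) North)                      -- prints (0,1)
  print (foldl move (0, 0) [North, East, East])  -- prints (2,1)
```

With this arrangement, a client can no longer build a nonsensical value like North 5 (-3), because the offsets are fixed inside offset rather than supplied at construction.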

The fields given to the variants of a Haskell enum may be different in type and arity. Consider this type definition that is used to schedule a calendar event that may or may not recur:

Haskell
data Schedule =
  Day Int Int Int |   -- only on the given Y/M/D
  Daily |             -- recurs every day
  Weekly Int |        -- once a week (Sunday is 0)
  Monthly Int |       -- once a month (day is in 1-31)
  Yearly Int Int      -- once a year on given M/D

We see the algebraic type operations in this definition. A Schedule is either a Day OR a Daily OR a Weekly OR a Monthly OR a Yearly. If it's a Day, it has an Int year AND an Int month AND an Int day. If it's a Daily, it has no fields. And so on.
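
To make the differing arities concrete, this sketch builds one value of each variant. The event names and dates are hypothetical, and the deriving clause is an assumption added so that the values can be printed and compared.

```haskell
data Schedule =
  Day Int Int Int |   -- only on the given Y/M/D
  Daily |             -- recurs every day
  Weekly Int |        -- once a week (Sunday is 0)
  Monthly Int |       -- once a month (day is in 1-31)
  Yearly Int Int      -- once a year on given M/D
  deriving (Show, Eq)

-- Hypothetical events, one per variant. Each supplies exactly as many
-- Ints as its variant declares.
launch, standup, review, rent, birthday :: Schedule
launch = Day 2025 3 14   -- three fields
standup = Daily          -- no fields
review = Weekly 1        -- one field (Monday)
rent = Monthly 1         -- one field
birthday = Yearly 7 4    -- two fields

main :: IO ()
main = mapM_ print [launch, standup, review, rent, birthday]
```

All five values share the type Schedule, so they can sit in the same list even though they hold different amounts of data.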

Once we add fields to a data definition, it doesn't feel much like a traditional enum. In fact, it's more aptly called a tagged union. Recall that a union is a polymorphic container that holds just one value of several possible forms. To know which type of value a union contains, we need a type tag. In a Haskell data definition, the variant name is that type tag.

What do you suppose the types of the variants like Day and Yearly are?

The variant names are called constructors. This same term is used in object-oriented programming to describe a method that initializes the state of an object. The meaning is similar here. Constructors are functions that construct new instances of a tagged union.
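
One way to see that constructors are functions is to bind them to names with explicit function types and to pass them to higher-order functions like map. In this sketch, the names makeDay, makeYearly, everyDay, and weekdays are hypothetical, and the deriving clause is an assumption added for printing and comparison.

```haskell
data Schedule =
  Day Int Int Int |
  Daily |
  Weekly Int |
  Monthly Int |
  Yearly Int Int
  deriving (Show, Eq)

-- A constructor with three fields is a function of three parameters.
makeDay :: Int -> Int -> Int -> Schedule
makeDay = Day

-- A constructor with two fields is a function of two parameters.
makeYearly :: Int -> Int -> Schedule
makeYearly = Yearly

-- A constructor with no fields is just a value of the type.
everyDay :: Schedule
everyDay = Daily

-- Because Weekly is a function, it can be mapped over a list.
weekdays :: [Schedule]
weekdays = map Weekly [1 .. 5]

main :: IO ()
main = do
  print (makeDay 2025 3 14)
  print weekdays
```

If the annotated types were wrong, the compiler would reject these bindings, so the fact that they type check confirms the types of the constructors.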

A tagged union serves as a handy bundle for passing around compound data. But such a bundle is useless unless we have a way to access fields within it.
