Enums with Data
A variant in a Java enum can be more than just a name. Since each variant is an object, we may add extra state. Suppose we have a Direction enum that lists four possible directions of movement:
enum Direction {
    NORTH,
    SOUTH,
    EAST,
    WEST
}
We'd like to associate each name with its x- and y-offsets. For instance, NORTH moves 0 units on the x-axis and 1 unit on the y-axis. We do this by giving Direction a couple of instance variables and a private constructor that initializes them:
enum Direction {
    NORTH(0, 1),
    SOUTH(0, -1),
    EAST(1, 0),
    WEST(-1, 0);

    public final int x;
    public final int y;

    private Direction(int x, int y) {
        this.x = x;
        this.y = y;
    }
}
The four implicit instantiations at the top are modified to pass along the actual parameters to the constructor.
These instance variables save us from having to write conditional logic to orchestrate movement. Instead of asking which variant we have and offsetting accordingly, we just apply the offsets that the Direction gives us:
void move(Direction direction) {
    position.x += direction.x;
    position.y += direction.y;
}
All values in a Java enum have the same shape. If we need NORTH to have certain instance variables, then SOUTH will have the same ones. In Haskell, each enum variant is shaped separately. This is a roughly equivalent Direction enum:
data Direction =
    North Int Int |
    South Int Int |
    East Int Int |
    West Int Int
Each variant of Direction now carries two Int fields, much like a 2-tuple. To create a Direction, the client has to provide the two numbers: North 0 1. This is not as convenient as the Java enum, whose data was baked into the variants at the point of their definition. Since the offsets aren't predetermined like they are in the Java enum, we could write North 5 (-3), which isn't very northy. Giving Direction fields doesn't make sense in Haskell. It would be better to leave the variants without fields and write a function that associates a Direction with an offset, like this one:
offset :: Direction -> (Int, Int)
offset North = (0, 1)
offset South = (0, -1)
offset East = (1, 0)
offset West = (-1, 0)
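As a rough sketch of how offset could stand in for the Java move method, the following pairs it with a fieldless Direction. The deriving clause, the move helper, the use of a plain (Int, Int) pair as a position, and main are illustrative assumptions, not part of the definitions above.

```haskell
-- Fieldless variants: the offsets live in a function, not in the data.
-- The deriving clause is an assumption added so values can be printed.
data Direction = North | South | East | West
  deriving (Show, Eq)

offset :: Direction -> (Int, Int)
offset North = (0, 1)
offset South = (0, -1)
offset East  = (1, 0)
offset West  = (-1, 0)

-- Hypothetical helper: shift an (x, y) position one step in a direction.
-- It returns a new position rather than mutating one as the Java version does.
move :: (Int, Int) -> Direction -> (Int, Int)
move (x, y) direction = (x + dx, y + dy)
  where
    (dx, dy) = offset direction

main :: IO ()
main = do
  print (move (3, 4) North)                      -- prints (3,5)
  print (foldl move (0, 0) [North, East, East])  -- prints (2,1)
```

Because no variant takes arguments, a value like North 5 (-3) can no longer be written, and every North is guaranteed to map to the same offset.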
The fields given to the variants of a Haskell enum may be different in type and arity. Consider this type definition that is used to schedule a calendar event that may or may not recur:
data Schedule =
    Day Int Int Int |  -- only on the given Y/M/D
    Daily |            -- recurs every day
    Weekly Int |       -- once a week (Sunday is 0)
    Monthly Int |      -- once a month (day is in 1-31)
    Yearly Int Int     -- once a year on given M/D
We see the algebraic type operations in this definition. A Schedule is either a Day OR a Daily OR a Weekly OR a Monthly OR a Yearly. If it's a Day, it has an Int year AND an Int month AND an Int day. If it's a Daily, it has no fields. And so on.
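To make the OR and AND reading concrete, here is a small sketch that builds one value of each variant. The deriving clause, the events list, the sample dates, and main are assumptions added for illustration.

```haskell
-- Same Schedule type as above; deriving is an assumption added for printing.
data Schedule =
    Day Int Int Int |  -- only on the given Y/M/D
    Daily |            -- recurs every day
    Weekly Int |       -- once a week (Sunday is 0)
    Monthly Int |      -- once a month (day is in 1-31)
    Yearly Int Int     -- once a year on given M/D
  deriving (Show, Eq)

-- Hypothetical sample data: each element uses a different variant (the OR),
-- and each variant is given all of its fields together (the AND).
events :: [Schedule]
events =
  [ Day 2025 3 14  -- a one-time event
  , Daily          -- no fields to supply
  , Weekly 1       -- Mondays, since Sunday is 0
  , Monthly 15     -- the 15th of each month
  , Yearly 12 25   -- every December 25
  ]

main :: IO ()
main = mapM_ print events
```

All five values share the type Schedule and fit in one list, even though they carry different numbers of fields.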
Once we add fields to a data definition, it doesn't feel much like a traditional enum. In fact, it's more aptly called a tagged union. Recall that a union is a polymorphic container that holds just one value of several possible forms. To know which type of value a union contains, we need a type tag. In a Haskell data definition, the variant name is that type tag.
What do you suppose the types of the variants like Day and Yearly are?
The variant names are called constructors. This same term is used in object-oriented programming to describe a method that initializes the state of an object. The meaning is similar here. Constructors are functions that construct new instances of a tagged union.
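One way to see that constructors behave like ordinary functions is to give them type signatures and pass them around. The sketch below assumes the Schedule type from earlier; the names day, daily, weeklyOptions, and inDecember are hypothetical, and the deriving clause is added only so the results can be printed and compared.

```haskell
data Schedule =
    Day Int Int Int |
    Daily |
    Weekly Int |
    Monthly Int |
    Yearly Int Int
  deriving (Show, Eq)

-- A constructor with three Int fields has a three-argument function type.
day :: Int -> Int -> Int -> Schedule
day = Day

-- A constructor with no fields is already a Schedule value.
daily :: Schedule
daily = Daily

-- Constructors can be handed to higher-order functions like any function.
weeklyOptions :: [Schedule]
weeklyOptions = map Weekly [0 .. 6]

-- Constructors can be partially applied: the month is fixed, the day is not.
inDecember :: Int -> Schedule
inDecember = Yearly 12

main :: IO ()
main = do
  print (day 2025 3 14)
  print daily
  print weeklyOptions
  print (inDecember 25)
```

The signatures compile only because the constructors really do have those types, which can also be checked in GHCi with :type Day.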
A tagged union serves as a handy bundle for passing around compound data. But such a bundle is useless unless we have a way to access fields within it.