Generic Types

Dear Computer

Chapter 9: Types Revisited

Generic Types

The only collection we've examined in Haskell thus far has been the list. Let's try defining a new data structure: a binary search tree for integers. In a binary search tree, each node has a value and two links to its child nodes. All the values in the left child are less than the parent value, and all values in the right child are greater.

One approach we might take to defining the Tree type is to break it up into two variants: one for the empty tree and one for a root with its value and two children:

Haskell
data Tree = Empty | Node Int Tree Tree
  deriving Show
data Tree = Empty | Node Int Tree Tree
  deriving Show

The expression Node 13 Empty Empty yields a leaf node. The expression Node 13 (Node 5 Empty Empty) Empty yields a node with only a left child. The expression Node 13 (Node 5 Empty Empty) (Node 20 Empty Empty) yields a node with two children.

Building complex trees by hand is not fun, so we decide to write an add function that accepts a tree and an item to add. It produces a new tree that includes the item. The function has this type signature:

Haskell
add :: Tree -> Int -> Tree
add :: Tree -> Int -> Tree

Function add is recursive because it must iterate through the tree to find the right spot to place the new node. Let's break the cases up into two separate definitions using pattern matching. The base case of adding an item to an empty tree is handled by matching the Empty variant and calling the Node constructor to create a new leaf node:

Haskell
add Empty newValue = Node newValue Empty Empty
add Empty newValue = Node newValue Empty Empty

In the general case, we must compare the new value with current node's value. If the new value is smaller, we must rebuild the tree with the new value added to the left child. If the new value is bigger, we must rebuild the tree with the new value added to the right child. This logic does the trick:

Haskell
add (Node rootValue left right) newValue =
  if newValue < rootValue then 
    Node rootValue (add left newValue) right
  else
    Node rootValue left (add right newValue)
add (Node rootValue left right) newValue =
  if newValue < rootValue then 
    Node rootValue (add left newValue) right
  else
    Node rootValue left (add right newValue)

Now we can build up the tree one node at a time:

GHCI
> tree1 = Empty
> tree1
Empty
> tree2 = add tree1 13
> tree2
Node 13 Empty Empty
> tree3 = add tree2 5
> tree3
Node 13 (Node 5 Empty Empty) Empty
> tree1 = Empty
> tree1
Empty
> tree2 = add tree1 13
> tree2
Node 13 Empty Empty
> tree3 = add tree2 5
> tree3
Node 13 (Node 5 Empty Empty) Empty

If we had a list of values to add, we could assemble the tree using a fold.

This tree is neat, perhaps, but it only holds Int values. What if we want a tree of String? Or Double? Or Friend? If this were Java, we'd add a generic type parameter to the class header: class Tree<T extends Comparable<T>>. In Haskell, we similarly add a generic type parameter to the data definition:

Haskell
-- data Tree = Empty | Node Int Tree Tree <-- old definition
data Tree a = Empty | Node a (Tree a) (Tree a)
  deriving Show
-- data Tree = Empty | Node Int Tree Tree <-- old definition
data Tree a = Empty | Node a (Tree a) (Tree a)
  deriving Show

Anywhere we mention Tree, we must now attach a type for the type parameter. If the tree stores Int values, then we say Tree Int. If String, then Tree String. The Node constructor uses Tree a to pass along the type parameter to the child trees.

In Java, we qualify the generic type with extends Comparable<T> in order to make sure we can compare values. There's no mention of values being comparable in the Haskell data definition. Instead, we push this qualification onto the function definitions. To make the add function work on trees of any comparable type, we qualify a as being an implementation of the Ord typeclass:

Haskell
add :: Ord a => Tree a -> Int -> Tree a
add :: Ord a => Tree a -> Int -> Tree a
← Record TypesMaybe and Either →