Types
Expressions yield values, and values have types. Since everything in Haskell is an expression, we had better get to know its type system. The Glasgow Haskell Compiler provides two ways of working with Haskell code: we compile a source file with ghc
or we interactively write code with ghci
. The ghci
utility is a REPL similar to Ruby's irb
. It prompts us for an expression, evaluates it, prints the result, and loops back to do it all again. Let's use ghci
to look at some core types available in Haskell.
Haskell is statically typed, which means the compiler can determine the types just by looking at the code—without running it. But the types don't have to be explicitly stated in the source; the compiler can infer them. We see what type is inferred for an expression using the :type
or :t
commands in ghci
. For example, :t 0 == 0
prints the type of the expression 0 == 0
. It does not evaluate the expression.
Typeclasses
Not all the types are so simple. Observe what happens when we inspect the type of an integer literal:
> :t 5
5 :: Num p => p
Num
is not a type, but a typeclass. Typeclasses are an advanced concept that would be better to discuss later, but they are hard to avoid. So we'll talk about them some now and some more later on.
A typeclass is like an interface in Java. It describes a set of functions that any types implementing the typeclass must define. We get a list of a typeclass's operations using :info
or :i
command in ghci
. Here is the list of functions required by Num
:
> :info Num
type Num :: * -> Constraint
class Num a where
(+) :: a -> a -> a
(-) :: a -> a -> a
(*) :: a -> a -> a
negate :: a -> a
abs :: a -> a
signum :: a -> a
fromInteger :: Integer -> a
There are seven functions that Num
imposes on a type. These include addition, subtraction, and multiplication—but not division. The /
function is imposed by a different typeclass named Fractional
.
The type signature 5 :: Num p => p
is a mouthful. The ::
is read as “has type”. The compiler doesn't have enough information to determine the exact type of 5, so it assigns it the wildcard p
, which appears to the right of the =>
. There's nothing special about p
; the wildcard a
or any lowercase letter could have been assigned instead without changing the meaning. This p
is a polymorphic type that can match several types. The arrow and the clause preceding p
limit the possible matches to only those types that conform to the Num
typeclass. Altogether, we read the type signature as, “5 has type p
, where p
is some type that implements the Num
typeclass.”
Numbers
The three most common numeric types in Haskell are Int
, Integer
, and Double
. All three implement the Num
typeclass. Int
is for fixed-size integer primitives, Integer
is for integers of arbitrary magnitude, and Double
is for double-precision floating point numbers. If we want a variable of type Int
, we do not declare the variable as having type Int
. Rather, we shape the type of the expression with which it is initialized using the ::
operator. This code shapes 5 into an Int
and assigns it to a variable:
a = 5 :: Int
However, we rarely need to use ::
when assigning a variable. The type inference system will shape the variable according to its use. If later on in the code we pass a
to a function that expects an Int
or add it to another Int
, then a
will be inferred to have type Int
.
Suppose we want to add an Int
and a Double
. The arithmetic operators do not automatically coerce the Int
to the more general Double
. This expression generates an error:
(5 :: Int) + 1.2
However, if we don't constrain the type of 5, the type inference system will assign it the same type as 1.2 and allow the addition. This expression yields the 6.2 that we expect:
5 + 1.2
Tuples
Haskell provides a lightweight syntax for working with tuples. Values may be grouped in pairs, triplets, and beyond using parentheses:
> turn = ('G', 9, False)
> :t turn
turn :: Num b => (Char, b, Bool)
This reads as “turn
is a 3-tuple whose first element is a Char
, whose second element is of some type b
that implements the Num
interface, and whose third element is a Bool
.”
If a tuple has just two elements, we can use the fst
and snd
functions to pull out the first and second components. That the Haskell designers abbreviated these names to abbreviations with unclear pronunciations is unfortunate. Code should be as easy to talk about with other humans as it is to type.
Try applying fst
to turn
, and we will encounter a type error. We can see why the type error occurs by inspecting the type signature of fst
:
> :t fst
fst :: (a, b) -> a
The ->
shows the action of the function: fst
takes in a tuple parameter and gives back a scalar. The types are polymorphic. The first item has some type a
and the second has some type b
. They don't have to be same type, but they could be. There's no =>
clause, so no typeclass qualifications are imposed on a
and b
. Since fst
returns the first element of this tuple, the return type must be the same a
that appeared in the tuple. This type signature reveals that a 3-tuple cannot be passed as a parameter.
Lists
The basic linear collection of Haskell is a linked list. List literals are formed using comma-separated values sandwiched by square brackets:
friends = []
highs = [106, 103, 104, 99, 95, 95, 92]
facts = [True, True, False]
All elements of a list must be of the same type. Observe the type of facts
reported by :t
:
> :t facts
facts :: [Bool]
This heterogeneous list is rejected:
> :t [5, True]
<interactive>:1:2: error:
• No instance for (Num Bool) arising from the literal ‘5’
• In the expression: 5
In the expression: [5, True]
The types of numeric literals are inferred to match the most restrictive type found in the list. The integer literal 5 is coerced into a Fractional
because of its neighbor:
> [5, 6.2]
[5.0,6.2]
Haskell lists are linked lists. That means we should rarely index into them. Indexing into an array-based list is a constant time operation, but indexing into a linked list is linear time. Instead, we break the list apart into its head and tail:
> head [1, 2, 3, 4, 5]
1
> tail [1, 2, 3, 4, 5]
[2, 3, 4, 5]
New lists may be built from old ones. To build a list with a new element inserted at its front, we use the :
operator:
> 0 : [1, 2, 3, 4, 5]
[0, 1, 2, 3, 4, 5]
> 87 : highs
[87,106,103,104,99,95,95,92]
This :
is the cons operator. The name comes from Lisp. It is an abbreviation for construct. The cons operator does not mutate the parameter list, because Haskell doesn't have mutation. Cons yields a brand new list.
The type signature of :
is a bit different from the other signatures we've seen since this operator expects two operands:
> :t (:)
(:) :: a -> [a] -> [a]
The operator passed to :t
must be surrounded by parentheses to escape it. In the resulting type signature, the first two fields of the arrow sequence describe the two operands. The first operand is a scalar a
and the second operand is a list of a
. The return value is a list of a
.
To join two lists, we use the concatenation operator ++
:
> [1, 10, 100] ++ [1000, 10000]
[1, 10, 100, 1000, 10000]
A string is a list of Char
. The types [Char]
and String
are synonymous. We could assemble a string using lists of Char
literals and the cons and concatenation operators:
fruit = 'p' : ['e', 'a', 'c', 'h']
lots = ['a'] ++ ['b', 'u', 'n'] ++ ['d', 'a', 'n', 'c', 'e']
However, string literals are equivalent to lists of Char
literals and easier to read:
fruit = 'p' : "each"
lots = "a" ++ "bun" ++ "dance"