Error Flow
A program executes step by step. But what happens if one of these steps fails and we don't know how to recover at that point in the code? We have at least three choices on how to proceed:
- Crash the program. We do this by calling the exit function with a non-zero status. The regular program immediately stops executing. The language runtime and operating system clean up the program's resources as best as they can. Files and sockets are closed. Memory is reclaimed.
- Throw an exception. We do this with a throw or raise statement. The current function immediately stops executing and an error value bubbles up through the call stack until we reach a caller that can recover from the error.
- Return a value that signals the error to the caller. For example, if we're searching for an element in a list and can't find it, we return -1. If we're fetching text from an input that hasn't been filled yet, we return the empty string.
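To make the three choices concrete, here is a minimal sketch of each in Haskell. The names crash, failLoudly, and indexOf are hypothetical and exist only for illustration; the library calls exitWith and throwIO come from the standard base package.

```haskell
import Control.Exception (throwIO)
import System.Exit (ExitCode (ExitFailure), exitWith)

-- Choice 1: crash the program with a non-zero exit status.
crash :: IO a
crash = exitWith (ExitFailure 1)

-- Choice 2: throw an exception that a caller further up the stack may catch.
failLoudly :: IO a
failLoudly = throwIO (userError "step failed")

-- Choice 3: return a sentinel value. Here -1 means "not found".
indexOf :: Eq a => a -> [a] -> Int
indexOf target = go 0
  where
    go _ [] = -1
    go i (y : ys)
      | y == target = i
      | otherwise = go (i + 1) ys
```

Only the third choice stays within the ordinary call-and-return flow, which is why the caller must remember to check for the sentinel.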
Crashing is a drastic move that we should only take during development. Production software should never crash. Throwing exceptions introduces a brand new flow of execution to our program. We already have a flow that we understand pretty well: the natural call-and-return of functions. Returning errors with this existing flow is the way errors are dealt with in both Haskell and Rust. However, dealing with returned errors easily leads to messy code. Haskell prevents this mess in two ways: by augmenting types with error information and by factoring out error-propagation from functional pipelines.
Fallible Types
There isn't always a special value that we can return to signal an error. Consider the atoi function in the C standard library. It converts an ASCII string to an integer. Its return type is therefore int. Since the string may contain any integer—positive, negative, or zero—there's no integer left to signal a failed parse.
Haskell fixes this problem with Maybe and Either. These add an error channel to the return type. If something goes wrong, we don't need a special value of the wrapped type to communicate the error. We just return Nothing or Left with an error value. We could write a better atoi function in Haskell with this type signature:
atoi :: String -> Maybe Int
Theoretically, we could make all our types in C fallible by manually wrapping them up in a tagged union. Haskell makes this wrapping easy.
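One way the atoi signature might be filled in is sketched below. It leans on readMaybe from Text.Read, which is an assumption about how we'd choose to parse, not the only option. The atoiEither variant and its error message are hypothetical additions that show the same idea with Either.

```haskell
import Text.Read (readMaybe)

-- A fallible atoi: Nothing signals a failed parse, so no Int is sacrificed
-- as a special error value.
atoi :: String -> Maybe Int
atoi = readMaybe

-- The same idea with Either, whose Left variant carries a description
-- of what went wrong.
atoiEither :: String -> Either String Int
atoiEither s = case readMaybe s of
  Just n -> Right n
  Nothing -> Left ("not an integer: " ++ show s)
```

With these definitions, atoi "42" is Just 42 and atoi "abc" is Nothing. Every integer remains available as a legitimate result.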
Fallible Pipelines
If we are executing statements in a Java program and one statement fails, we can just abandon the sequence and return to the caller. But a Haskell program is a chain of function calls, not statements. Here's a pipeline of six calls:
f $ g $ h $ i $ inc $ get x
What if the call to get fails? Its error result will be fed as a parameter to inc, the next function in the chain. Suppose that inc has this definition:
inc :: Int -> Int
inc x = x + 1
Let's call this version the optimist function. It assumes that errors aren't possible. We must make the function error-ready by having it receive a Maybe Int and return a Maybe Int. Since Maybe has two variants, we match them across two subdefinitions. Let's call this version the pessimist function:
incMaybe :: Maybe Int -> Maybe Int
incMaybe Nothing = Nothing
incMaybe (Just x) = Just $ x + 1
That's real nice, but every stage of a functional pipeline must be made pessimistic. This places a considerable burden on function writers. Our code will drown in error handling logic. However, this is Haskell. We can write a higher-order function that takes in an optimist function and a Maybe. It abstracts away the pessimism:
callMaybe :: (a -> b) -> Maybe a -> Maybe b
callMaybe _ Nothing = Nothing
callMaybe f (Just x) = Just $ f x
From a high-level perspective, callMaybe has the effect of unwrapping the value if it can, applying the function, and wrapping the value back up. With this function defined, we can rewrite our pipeline to handle error values:
callMaybe f $ callMaybe g $ callMaybe h $ callMaybe i $ callMaybe inc $ get x
Our optimist functions remain simple, but calling them has become noisy. We reduce this noise by turning callMaybe into an infix operator. Actually, callMaybe and the infix operator are already available in Haskell. The builtin function is fmap and the operator is <$>—which is meant to mirror the $ of optimist pipelines. The operator makes our pessimistic pipeline readable:
f <$> g <$> h <$> i <$> inc <$> get x
If get fails, each <$> passes the Nothing along untouched, and it safely bubbles up as the final value.
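A small pipeline shows the propagation in action. In this sketch, get is a hypothetical stand-in that looks up a key in a hard-coded table, and double and pipeline are likewise made up for illustration. The chain of <$> operators groups to the left, and it still type checks because functions are functors too: mapping one function over another composes them.

```haskell
-- Hypothetical stand-in for get: look up a key in a small table.
get :: String -> Maybe Int
get key = lookup key [("a", 1), ("b", 2)]

-- Optimist functions that know nothing about errors.
inc :: Int -> Int
inc x = x + 1

double :: Int -> Int
double x = x * 2

-- The optimist functions are glued together with <$>.
pipeline :: String -> Maybe String
pipeline key = show <$> double <$> inc <$> get key

-- The same pipeline written with fmap and explicit composition.
pipeline' :: String -> Maybe String
pipeline' key = fmap (show . double . inc) (get key)
```

Under these definitions, pipeline "a" is Just "4" and pipeline "z" is Nothing. None of the optimist functions ever sees the missing value.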
At this point in our discussion, it looks like Haskell provides a clean way to communicate errors through simple return values. But there's one more issue. Our example pipeline has only a single parameter. What if our functions expect multiple parameters, and each might be an error value? For example, suppose we want to find the maximum of two Maybe values x and y. We might try to write this illegal code:
max <$> x y -- compile error: x is applied to y, but x is not a function
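The error we get from this expression is easier to understand and fix if we examine how it is grouped. Function application binds more tightly than any infix operator, so Haskell reads x y as a call to x with y as its parameter, and x is not a function. The grouping we intended is this: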
(max <$> x) y
We focus our attention on max <$> x.
In the best case, we'll get back a partially applied max function.
The value of max <$> x isn't a function; it's a possible function. We can't pass it y until we unwrap it as a Just variant. There's another builtin infix operator that does this possible unwrapping and calling:
(max <$> x) <*> y
These two operators have the same precedence, and both are left-associative. That means the parentheses are unnecessary. In general, we separate the function from its parameters with <$> and the parameters from each other with <*>.
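The pattern of <$> after the function and <*> between the parameters can be sketched with two small wrappers. The names maxMaybe, volume, and volumeMaybe are hypothetical and chosen only for this illustration.

```haskell
-- Two fallible parameters: <$> after the function, <*> between parameters.
maxMaybe :: Maybe Int -> Maybe Int -> Maybe Int
maxMaybe x y = max <$> x <*> y

-- A hypothetical three-parameter optimist function.
volume :: Int -> Int -> Int -> Int
volume w h d = w * h * d

-- The pattern extends to any number of parameters.
volumeMaybe :: Maybe Int -> Maybe Int -> Maybe Int -> Maybe Int
volumeMaybe w h d = volume <$> w <*> h <*> d
```

Here maxMaybe (Just 3) (Just 7) is Just 7, while a Nothing in either position makes the whole result Nothing.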
What all this means is that we can continue to write optimist functions that don't give a thought to errors, and we can keep on sequencing these optimist functions in pipelines. The only hitch is that we must sprinkle these operators between the pieces of the pipeline. They are the error-handling glue that will catch and propagate the error values.
Further Abstract
These operators might feel strange. The good news is that they are used for more than just errors. They may be used with any pipeline that processes wrapped values. The examples above all dealt with values wrapped in Maybe, but <$> and <*> work equally well with Either and IO and even lists. Earlier we kept pure and impure calls separate in IO functions. With these operators, we can build pipelines that send impure values through a pure pipeline. This program reads a file and sends the resulting IO String on to several pure functions to get the first line:
main = do
firstLine <- head <$> lines <$> readFile "page.md"
putStrLn firstLine
A wrapper type that lets its wrapped values be operated on by the <$> operator is a functor. Maybe, Either, IO, and lists all implement the Functor typeclass. The name was chosen in 1945 by some mathematicians, well before Haskell existed. A type that additionally allows the mapping function itself to be wrapped—by supporting <*>—is an applicative.
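The same operators can be tried on the other wrappers mentioned above. This sketch uses Either and lists; the names sumEither, incremented, and products are hypothetical.

```haskell
-- Either: the first Left encountered becomes the final value.
sumEither :: Either String Int -> Either String Int -> Either String Int
sumEither x y = (+) <$> x <*> y

-- Lists: <$> applies the function to every element.
incremented :: [Int]
incremented = (+ 1) <$> [1, 2, 3]

-- Lists: <*> applies every wrapped function to every element.
products :: [Int]
products = (*) <$> [1, 2] <*> [10, 20]
```

With these definitions, sumEither (Right 2) (Left "bad y") is Left "bad y", incremented is [2, 3, 4], and products is [10, 20, 20, 40]. The operators are the same as before; only the meaning of "wrapped" changes with the type.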
You might never encounter these words again, but understanding how to sequence values of fallible types into fallible pipelines without making a mess of code is a valuable skill. We will see it again in Rust.