Lecture: Error Handling

Dear students:

Today we continue our introduction to Rust, the language that may or may not save us from ourselves. Last class we only had time to examine one program. Today let's examine a few more as we explore how to perform I/O and deal with errors. Often we spend all our time thinking about the “happy path” through a program, in which data is never missing or malformed, internet connections never get disrupted, and we have infinite memory and disk. Happy path code is clean. The unhappy path complicates it.

Bad Choices

When something goes wrong and the current code doesn't know how to continue, we have to make a decision. Here are the possible choices that our languages give us:

Let the program crash or panic, as it's called in Rust. The caller is given no chance to recover or clean up.
Throw an exception. The current function halts, and the exception bubbles or propagates upward through the call stack, giving callers the chance to respond and possibly fix the situation. Exceptions are better than crashing, but they complicate programs by adding a secondary but implicit flow to our programs. We already know how to bail from functions: by returning. We already have a way of communicating results to the caller: by returning something. Why not use the mechanism we already have?
Set some global error value and expect the caller to check it before proceeding.
Return an optional value. Instead of returning an Int, we return an IntOrFailure. But we don't make a special or-failure type for each non-failure type. We make a wrapper type and call it Maybe or Option. The caller chooses its response based on which of the two possible states it gets back.

In C, we either crash or set a global. In Java, we throw exceptions. In Rust, we either crash or return an optional value. Optional types might seem like a sideways move compared to exceptions. They accomplish the same goal but seem to add a lot of code. However, there are some advantages to optional types:

Optional types build failure into the type system. The compiler, through its normal typechecking, verifies that we aren't treating a failure as normal data. We won't get accidental null pointer exceptions.
Every function along the error path acknowledges the possibility of failure. There will be no wild jumping like we see with exceptions.
When an algorithm invokes a sequence of operations that might fail, the success case can indeed become deeply nested inside conditionals. However, languages provide a syntactic shortcut for optional types that turns nesting into sequence.
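
The first advantage can be sketched in a few lines of Rust. The helper names first_even and doubled_first_even are hypothetical, made up for this illustration. The commented-out line is the kind of mistake the typechecker rejects, because an Option<i32> is not an i32:

```rust
// Hypothetical helper: finds the first even number, if there is one.
fn first_even(values: &[i32]) -> Option<i32> {
  values.iter().copied().find(|value| value % 2 == 0)
}

// Doubles the first even number, falling back to 0 when there is none.
fn doubled_first_even(values: &[i32]) -> i32 {
  let found = first_even(values);
  // let doubled = found * 2; // rejected at compile time: found is an Option<i32>, not an i32
  match found {
    Some(value) => value * 2,
    None => 0,
  }
}

fn main() {
  println!("{}", doubled_first_even(&[1, 3, 4])); // prints 8
  println!("{}", doubled_first_even(&[1, 3, 5])); // prints 0
}
```

The caller cannot reach the number without saying what happens when there isn't one.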

Exception Types

Rust has two types for communicating exceptional situations: Option and Result. The Option type distinguishes between a value and a no-value. It's defined as a tagged union like this:

Rust

enum Option<T> {
  Some(T),
  None
}

We use this type when an exception isn't really an error but rather just a signal that we have no data. If we do have an error, we should offer an explanation like “You ran out of memory; more will arrive by Thursday” or “You've already opened that file this month”. The Result type is a union that lets us return either a good value or an error value:

Rust

enum Result<T, E> {
  Ok(T),
  Err(E)
}
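
As a rough sketch of how the two types differ in use, here are two small functions. Both first_word and divide are hypothetical names invented for this example, and the error message is an assumption. The first has nothing to explain when it comes up empty, so Option is enough. The second owes the caller an explanation, so it returns a Result:

```rust
// Absence is natural here: a blank string simply has no first word.
fn first_word(text: &str) -> Option<&str> {
  text.split_whitespace().next()
}

// Failure deserves an explanation, so the Err variant carries a message.
fn divide(numerator: u32, denominator: u32) -> Result<u32, String> {
  if denominator == 0 {
    Err(String::from("cannot divide by zero"))
  } else {
    Ok(numerator / denominator)
  }
}

fn main() {
  println!("{:?}", first_word("happy path")); // prints Some("happy")
  println!("{:?}", first_word("   ")); // prints None
  println!("{:?}", divide(7, 2)); // prints Ok(3)
  println!("{:?}", divide(7, 0)); // prints Err("cannot divide by zero")
}
```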

We have these two choices for optional types plus panicking in Rust. Which one do you suppose is used in the following situations?

Indexing into a Vec with the indexing operator []
Indexing into a Vec with the get method
Slurping up a file with fs::read_to_string
Traversing an Iterator with next
Parsing a string

Whether we use Option or Result depends on two things: how natural the exception is and how much information we have to communicate about the error.
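
One way to check our guesses is to try each operation and print what comes back. This is only a sketch using the standard library. The file path is made up and assumed not to exist:

```rust
use std::fs;

fn main() {
  let values = vec![10, 20, 30];

  // The [] operator panics on a bad index, so only a valid index is used here.
  println!("{}", values[1]);

  // get returns an Option: a bad index is an ordinary no-value.
  println!("{:?}", values.get(1));
  println!("{:?}", values.get(99));

  // fs::read_to_string returns a Result whose Err explains what went wrong.
  // The path is made up and assumed not to exist.
  println!("{:?}", fs::read_to_string("/no/such/file.txt").is_err());

  // next returns an Option: None means the iterator is exhausted.
  let mut iterator = values.iter();
  println!("{:?}", iterator.next());

  // parse returns a Result whose Err describes the malformed text.
  println!("{:?}", "42".parse::<i32>());
  println!("{:?}", "forty-two".parse::<i32>().is_err());
}
```

Running off the end of a collection is natural, so get and next use Option. A missing file or malformed number calls for an explanation, so read_to_string and parse use Result.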

Responding

If a function gives us back an Option or a Result, we can't do much with it until we know which variant it is. We have to bifurcate into the happy and unhappy paths. Let's examine the three choices we have.

Crashing

If a function detects that it's on the unhappy path and knows that there's no way to proceed, it crashes. To crash the program on a no-value, we call unwrap:

Rust

let value = maybe_value.unwrap();

If we want control over the error message printed to stderr, we call expect:

Rust

let value = maybe_value.expect("You can't do that to negative numbers");

Crashing is better than persisting with bad data, but it's severe.

Propagating

If a function discovers that it's on the unhappy path but thinks a caller can get back to the happy path, it propagates the error value back up the stack. We can discover and propagate with a match statement:

Rust

let value = match maybe_value {
  Some(value) => value,
  None => return None,
};

For this to work, the function must be declared to return a failable type. If we have a lot of opportunities for failure, all this conditional logic gets very noisy. Rust has the try operator, which accomplishes the same thing with a single character:

Rust

let value = maybe_value?;
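
To see the two styles side by side in complete functions, here is a sketch that adds the first and last elements of a slice. Either lookup can come back empty. The function names are hypothetical:

```rust
// Nested version: each failable step adds a level of indentation.
fn sum_ends_nested(values: &[i32]) -> Option<i32> {
  match values.first() {
    Some(first) => match values.last() {
      Some(last) => Some(first + last),
      None => None,
    },
    None => None,
  }
}

// Sequenced version: ? returns None to the caller as soon as a step fails.
fn sum_ends_sequenced(values: &[i32]) -> Option<i32> {
  let first = values.first()?;
  let last = values.last()?;
  Some(first + last)
}

fn main() {
  println!("{:?}", sum_ends_nested(&[1, 2, 3])); // prints Some(4)
  println!("{:?}", sum_ends_sequenced(&[1, 2, 3])); // prints Some(4)
  println!("{:?}", sum_ends_sequenced(&[])); // prints None
}
```

Both functions are declared to return Option<i32>, which is what allows the early return of None.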

Recovering

There are a couple of ways the calling function can get back on the happy path. It might provide a fallback value. Here's one way we can fall back to 0 if we don't have a value:

Rust

let value = match maybe_value {
  Some(value) => value,
  None => 0,
};

Again, this is noisy. The unwrap_or function does the same thing:

Rust

let value = maybe_value.unwrap_or(0);

Perhaps we want to repeat the dangerous operation until it works. We can repeatedly call it in a loop until we get back a value. This match statement breaks out of the loop on Some and does nothing on None:

Rust

let value = loop {
  let maybe_value = dangerous_operation();
  match maybe_value {
    Some(value) => break value,
    None => {},
  };
};

Often there's a way to avoid empty bodies. Rust lets us target just the Some case with an if-let statement:

Rust

let value = loop {
  let maybe_value = dangerous_operation();
  if let Some(value) = maybe_value {
    break value;
  }
};
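
A loop like this never ends if the operation never succeeds. Here is a sketch of a bounded retry, assuming we'd rather give up after a fixed number of attempts. The dangerous_operation below is a hypothetical stand-in that fails a set number of times before succeeding, and retry is likewise a made-up name:

```rust
// Hypothetical stand-in for dangerous_operation: it fails until
// failures_left has been counted down to zero.
fn dangerous_operation(failures_left: &mut u32) -> Option<u32> {
  if *failures_left > 0 {
    *failures_left -= 1;
    None
  } else {
    Some(42)
  }
}

// Tries at most max_attempts times, then reports None to the caller.
fn retry(max_attempts: u32, failures_left: &mut u32) -> Option<u32> {
  for _ in 0..max_attempts {
    if let Some(value) = dangerous_operation(failures_left) {
      return Some(value);
    }
  }
  None
}

fn main() {
  let mut failures = 2;
  println!("{:?}", retry(5, &mut failures)); // prints Some(42)

  let mut failures = 10;
  println!("{:?}", retry(5, &mut failures)); // prints None
}
```

The caller of retry gets an Option back and faces the same three choices again: crash, propagate, or recover.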

Rust's Option and Result types provide many other methods for operating on failable data, but this is enough of an overview. Let's see these types in use.
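
Before we do, here is a quick sketch of a few of those other methods, in case they are useful later. All of the methods shown are from the standard library. The halve function is a hypothetical helper:

```rust
// Hypothetical failable step: halving only works on even numbers.
fn halve(value: i32) -> Option<i32> {
  if value % 2 == 0 {
    Some(value / 2)
  } else {
    None
  }
}

fn main() {
  let maybe_value: Option<i32> = Some(20);
  let nothing: Option<i32> = None;

  // map transforms the payload and leaves None alone.
  assert_eq!(maybe_value.map(|value| value + 1), Some(21));
  assert_eq!(nothing.map(|value| value + 1), None);

  // and_then chains on another step that might itself fail.
  assert_eq!(maybe_value.and_then(halve), Some(10));
  assert_eq!(Some(3).and_then(halve), None);

  // unwrap_or_else computes the fallback only when it is needed.
  assert_eq!(nothing.unwrap_or_else(|| 0), 0);

  // ok_or upgrades an Option to a Result by attaching an error value.
  assert_eq!(nothing.ok_or("no value"), Err("no value"));

  // ok downgrades a Result to an Option by discarding the error.
  assert_eq!("12".parse::<i32>().ok(), Some(12));
  assert_eq!("twelve".parse::<i32>().ok(), None);
}
```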

Wc

The Unix utility wc counts characters, words, and lines in a file. We run it from the command line as wc path/to/file.txt. Let's write our own version in Rust. Let's start by pulling out the path from the command-line arguments.

Rust

use std::env;
use std::fs;

fn main() {
  let args = env::args();
  println!("{:?}", args);
}

The output shows that the executable name is the first parameter. We want the second.

When we call env::args we do not get back a collection. Instead we get an iterator. In Java, iterators have two methods, hasNext and next, that we use to drive forward and eventually terminate. In Rust, we have next but no hasNext. Any guesses why? The next method returns Option. It'll give back None when hasNext would give back false.
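
To make that concrete, here is a sketch that drives an iterator by hand, the way a for loop does behind the scenes. The function name collect_by_hand is hypothetical:

```rust
// Pulls items out one at a time until next reports None.
fn collect_by_hand(mut iterator: impl Iterator<Item = i32>) -> Vec<i32> {
  let mut seen = Vec::new();
  while let Some(item) = iterator.next() {
    seen.push(item);
  }
  seen
}

fn main() {
  let mut numbers = vec![1, 2].into_iter();
  println!("{:?}", numbers.next()); // prints Some(1)
  println!("{:?}", numbers.next()); // prints Some(2)
  println!("{:?}", numbers.next()); // prints None

  println!("{:?}", collect_by_hand(vec![3, 4, 5].into_iter())); // prints [3, 4, 5]
}
```

A single call answers both questions at once: is there another item, and what is it?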

To get at the second parameter, we could call next twice. Or we could call skip(1) to jump over an unwanted element. Iterators also have an nth method for fetching the \(n^\textrm{th}\) element, counting from 0. Let's call that instead:

Rust

fn main() {
  let path = env::args().nth(1);
  println!("{:?}", path);
}

The path is wrapped up in an Option variant. Earlier we forced the program to panic if we got a None back. What method did that? unwrap. Instead, we could explicitly handle each variant with a match statement:

Rust

fn main() {
  match env::args().nth(1) {
    Some(path) => println!("{}", path),
    None => panic!("Usage: wc <file>"),
  }
}

Neat, but the path is only valid inside the first arm. Do we need to nest all our code inside that arm? Imagine if we had five more calls that returned Option. The nesting would get out of hand. Sequences are easier to read than nesting, so let's switch to a match expression that yields the path to the outer scope:

Rust

fn main() {
  let path = match env::args().nth(1) {
    Some(path) => path,
    None => panic!("Usage: wc <file>"),
  };
}

The second arm exits the process, so it doesn't return anything.

This pattern of trying to get a value and panicking if it's bad is common enough that there's a helper function that abstracts it away. It's expect:

Rust

fn main() {
  let path = env::args().nth(1).expect("Usage: wc <file>");
}

That's much easier to read. Let's hand the path off to a function that reads in the file and computes its statistics. We'll have that function return an instance of this struct:

Rust

#[derive(Debug)]
struct Statistics {
  character_count: usize,
  word_count: usize,
  line_count: usize,
}

Then we frame our helper function. It takes in a borrowed string and gives back an instance of our struct:

Rust

fn count(path: &str) -> Statistics {
  // ...
}

The main function will call count like this:

Rust

fn main() {
  let path = env::args().nth(1).expect("Usage: wc <file>");
  let statistics = count(&path);
  println!("{:?}", statistics);
}

Inside count we need to read the file. That might fail. For the moment, let's panic if that reading fails:

Rust

fn count(path: &str) -> Statistics {
  let text = fs::read_to_string(path).expect("Couldn't read file.");
  // ...
}

Our next task is to compute the three statistics. At your tables, find answers to at least one of the following questions:

How do we count the characters in a string?

How do we count the words in a string?

How do we count the lines in a string?

Most Rust collections have a len method reporting their number of elements. We can use it to get the number of bytes in the string, which is not quite the same as the number of characters. Function split_whitespace gives us an iterator over words, and function lines gives us an iterator over the lines. Iterators don't have a len method, but they do have count. With these methods, we compute our statistics:

Rust

fn count(path: &str) -> Statistics {
  let text = fs::read_to_string(path).expect("Couldn't read file.");

  Statistics {
    character_count: text.len(),
    word_count: text.split_whitespace().count(),
    line_count: text.lines().count(),
  }
}
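
The gap between bytes and characters only shows up with non-ASCII text. Here is a small sketch of the difference, with a hypothetical helper name. If we wanted true character counts, chars().count() would be one option. By default wc reports bytes, so len is the closer match:

```rust
// Returns the UTF-8 byte count and the character count of the text.
fn byte_and_char_counts(text: &str) -> (usize, usize) {
  (text.len(), text.chars().count())
}

fn main() {
  // Plain ASCII: every character is one byte.
  println!("{:?}", byte_and_char_counts("hello")); // prints (5, 5)

  // The accented e (U+00E9) takes two bytes in UTF-8.
  println!("{:?}", byte_and_char_counts("h\u{e9}llo")); // prints (6, 5)
}
```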

Our program's output matches the builtin wc pretty well. There's one difference, however. The real wc accepts multiple paths. We can too by looping over the args:

Rust

fn main() {
  // skip(1) jumps over the executable name, which is not a file to count.
  for path in env::args().skip(1) {
    let statistics = count(&path);
    println!("{:?}", statistics);
  }
}

But what if we feed in a bad path? Ours crashes. The real wc prints the error but keeps on going through the remaining paths. We want to do that too. We'll make it so count returns its struct wrapped up in a Result:

Rust

use std::io;

fn count(path: &str) -> Result<Statistics, io::Error> {
  match fs::read_to_string(path) {
    Ok(text) => Ok(Statistics {
      character_count: text.len(),
      word_count: text.split_whitespace().count(),
      line_count: text.lines().count(),
    }),
    Err(e) => Err(e),
  }
}

I find this code hard to read. Imagine if we had to call another I/O method that could also fail. That would make the nesting even worse. Good news. Rust has a really nice postfix operator, ?, that we can place after a dangerous method call. If that call gives back None or an Err variant, the enclosing function immediately returns that failure to its caller. Otherwise the operator unwraps the good value and execution goes on to the next statement. It turns nesting into a sequence. This is much more readable:

Rust

fn count(path: &str) -> Result<Statistics, io::Error> {
  let text = fs::read_to_string(path)?;
  Ok(Statistics {
    character_count: text.len(),
    word_count: text.split_whitespace().count(),
    line_count: text.lines().count(),
  })
}
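
The main function still has to respond to the Result that count now returns. Here is a sketch of one complete program that prints the error for a bad path and keeps going. The wc: prefix on the message and the choice of eprintln! to write to stderr are assumptions of this sketch:

```rust
use std::env;
use std::fs;
use std::io;

#[derive(Debug)]
struct Statistics {
  character_count: usize,
  word_count: usize,
  line_count: usize,
}

fn count(path: &str) -> Result<Statistics, io::Error> {
  let text = fs::read_to_string(path)?;
  Ok(Statistics {
    character_count: text.len(),
    word_count: text.split_whitespace().count(),
    line_count: text.lines().count(),
  })
}

fn main() {
  // skip(1) jumps over the executable name.
  for path in env::args().skip(1) {
    match count(&path) {
      Ok(statistics) => println!("{}: {:?}", path, statistics),
      // Report the problem and move on to the next path.
      Err(e) => eprintln!("wc: {}: {}", path, e),
    }
  }
}
```

The decision about what to do with a failure now lives in main, the one place that knows there are more paths to try.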

Flashmod

For our last example, let's make a math flashcard quizzer. We're all good enough at addition and multiplication. Our flashcards will be for modular division, the remainder computed by the % operator. Rust doesn't ship with a library for generating random numbers, so we have to install a package. As soon as a project has dependencies, we are better off using Rust's package manager Cargo than using rustc directly. We build a new Cargo project and add the third-party rand crate (library) with these shell commands. The code below uses the thread_rng and gen_range names from rand 0.8, which later releases rename, so we pin that version:

Shell

cargo new flashmod
cd flashmod
cargo add rand@0.8

In src/main.rs we add code to generate and print two random numbers:

Rust

use rand::Rng;
use std::io::{self,Write};

fn main() {
  let mut generator = rand::thread_rng();
  let stdin = io::stdin();
  let mut stdout = io::stdout();

  let a = generator.gen_range(0..50);
  let b = generator.gen_range(1..10);

  print!("{} % {} = ", a, b);
  stdout.flush().unwrap();
}

Since print! doesn't emit a newline and standard output is line-buffered, we flush the output so the prompt appears right away. Let's give the user ten problems with a for loop:

Rust

fn main() {
  let mut generator = rand::thread_rng();
  let stdin = io::stdin();
  let mut stdout = io::stdout();

  for _ in 0..10 {
    let a = generator.gen_range(0..50);
    let b = generator.gen_range(1..10);

    print!("{} % {} = ", a, b);
    stdout.flush().unwrap();
  }
}

We name the loop variable _ because we never reference it. Next we get input from the user:

Rust

fn main() {
  let mut generator = rand::thread_rng();
  let stdin = io::stdin();
  let mut stdout = io::stdout();
  let mut response = String::new();

  for _ in 0..10 {
    let a = generator.gen_range(0..50);
    let b = generator.gen_range(1..10);

    print!("{} % {} = ", a, b);
    stdout.flush().unwrap();

    response.clear();
    stdin.read_line(&mut response).unwrap();
  }
}

We need to clear the string on each iteration because read_line appends the content. As always, user input comes in as a string. We need to parse out a number. This might fail, so we add a loop around the prompt and user input until the parse succeeds:

Rust

use rand::Rng;
use std::io::{self,Write};

fn main() {
  let mut generator = rand::thread_rng();
  let stdin = io::stdin();
  let mut stdout = io::stdout();
  let mut response = String::new(); 

  for _ in 0..10 {
    let a = generator.gen_range(0..50);
    let b = generator.gen_range(1..10);

    let number = loop {
      print!("{} % {} = ", a, b);
      stdout.flush().unwrap();

      response.clear();
      stdin.read_line(&mut response).unwrap();

      match response.trim().parse::<u32>() {
        Ok(number) => break number,
        Err(_) => println!("That wasn't a number."),
      }
    };
  }
}

This generic loop lets us check the exit condition anywhere instead of just at the beginning or end as in a while or for loop. We call break when we're ready to bail.
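
The parse step inside that loop can be examined on its own. Here is a sketch with a hypothetical helper, parse_answer, that collapses the Result from parse into an Option since we don't report why the parse failed:

```rust
// Hypothetical helper: turns one line of user input into a number, if possible.
fn parse_answer(line: &str) -> Option<u32> {
  // trim drops the trailing newline that read_line leaves in the string.
  line.trim().parse::<u32>().ok()
}

fn main() {
  println!("{:?}", parse_answer("7\n")); // prints Some(7)
  println!("{:?}", parse_answer("seven\n")); // prints None
  println!("{:?}", parse_answer("-3\n")); // prints None, since u32 has no negatives
}
```

Parsing to u32 means negative input is rejected at the parse, which suits us because a % b is never negative for our ranges.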

All that remains is a final conditional to give feedback to the user:

Rust

use rand::Rng;
use std::io::{self,Write};

fn main() {
  let mut generator = rand::thread_rng();
  let stdin = io::stdin();
  let mut stdout = io::stdout();
  let mut response = String::new(); 

  for _ in 0..10 {
    let a = generator.gen_range(0..50);
    let b = generator.gen_range(1..10);

    let number = loop {
      print!("{} % {} = ", a, b);
      stdout.flush().unwrap();

      response.clear();
      stdin.read_line(&mut response).unwrap();

      match response.trim().parse::<u32>() {
        Ok(number) => break number,
        Err(_) => println!("That wasn't a number."),
      }
    };
    
    if a % b == number {
      println!("That's exactly right.");
    } else {
      println!("Well, no.");
    }
  }
}
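
One situation this program doesn't handle is the input ending, as when the user presses Ctrl-D. At end of input read_line reports that it read 0 bytes, the empty response fails to parse, and the inner loop spins forever. Here is a sketch of one way to guard against that. The helper read_number is hypothetical, and it is written against the BufRead trait so that a byte slice can stand in for the keyboard:

```rust
use std::io::{self, BufRead};

// Hypothetical helper: reads lines until one parses as a u32.
// Gives back Ok(None) if the input ends before that happens.
fn read_number<R: BufRead>(input: &mut R) -> io::Result<Option<u32>> {
  let mut response = String::new();
  loop {
    response.clear();
    // read_line reports 0 bytes read at end of input.
    if input.read_line(&mut response)? == 0 {
      return Ok(None);
    }
    if let Ok(number) = response.trim().parse::<u32>() {
      return Ok(Some(number));
    }
    println!("That wasn't a number.");
  }
}

fn main() {
  // A byte slice stands in for the keyboard so the sketch runs unattended.
  let mut input: &[u8] = b"seven\n7\n";
  println!("{:?}", read_number(&mut input)); // prints Ok(Some(7)) after one complaint

  let mut empty: &[u8] = b"";
  println!("{:?}", read_number(&mut empty)); // prints Ok(None)
}
```

In the quizzer we could pass io::stdin().lock() and leave the for loop when we get back None.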

That concludes our first tour of Rust. It sits somewhere between Ruby and Haskell. The compiler knows the types, so it can perform typechecking. Values can be mutable, so we don't have to contort ourselves into recursion like we did in Haskell. It's got a rich standard library and a very friendly package manager and build tool in Cargo. It's hip. It's supposed to make software safer.

TODO

Here's your list of things to do before we meet next:

Complete the middle quiz as desired.

There are two ready dates remaining. These last two effectively are extensions. Further extensions will not be granted.

See you next time.

Sincerely,

P.S. It's time for a haiku!

One road but two signs
Happy Path, Unhappy Path
A box of blindfolds
