Lecture: Everything Revisited
Dear students:
Today we switch to our final language of the semester: Rust. We will see some new ideas that Rust contributes to the programming language landscape, but its primary strength is that it blends together a lot of great ideas that we've already seen without the asceticism of Haskell.
We'll have a look at this upstart of a language. When we learn a new language, we've really got two tasks: learn how to say things and learn what has already been said. In a spoken language, what's already been said includes all the idioms and cultural references that we build on. In a programming language, what's already been said is the API that we build on.
When I was first learning to code, I loved writing lots of little main programs that did silly things. I'm still learning to code, and my love of silly mains persists. So that's how we'll begin. We won't enumerate all the features of the language.
Countdown
Imagine you want a little timer to help you take a break from staring at a screen. Certainly we could call up one of the many apps on your phone or the web, but they pose more opportunities for distraction. We want one that runs in the terminal. Let's write it in Rust. Our main function starts off with a variable marking the current time:
fn main() {
    let time = 60;
}
Notice there's no explicit type. What kind of typing system do you suppose Rust has? Dynamic or static?
What do I need next? Display the time. Tick time downward. Sleep. Repeat. Let's tick first with an arithmetic assignment operator:
fn main() {
    let time = 60;
    time -= 1;
}
Compilation fails. All variables are immutable unless we opt in to mutability. Immutability means we programmers can be certain that a variable is what it was initialized to. We don't need to look anywhere but the initialization for its value. It also means we can share data between tasks with no concern that they'll interfere like two roommates over the microwave. Compilation succeeds once we modify the variable with mut:
fn main() {
    let mut time = 60;
    time -= 1;
}
Now we add a loop:
fn main() {
    let mut time = 60;
    while time >= 0 {
        time -= 1;
    }
}
There are no parentheses around conditions. If you include them, the compiler will warn you to stop being so noisy.
Let's print:
fn main() {
    let mut time = 60;
    while time >= 0 {
        println!("{}", time);
        time -= 1;
    }
}
Rust's format strings are more like Python's than C's. The output looks nice, but we need to slow it down. We call the sleep function, which needs to be imported with a use statement:
use std::thread;
use std::time::Duration;
fn main() {
    let mut time = 60;
    while time >= 0 {
        println!("{}", time);
        thread::sleep(Duration::from_millis(1000));
        time -= 1;
    }
}
There's no reason it needs to take up so much output. This timer shows both the current time and all the old times. Useless. I don't care about the old times. Let's erase them. We could use Curses, or we could just print a carriage return to bring the cursor back to the beginning of the line. We'll need print! instead of println!. Since one-digit numbers won't overwrite the second digit, we need to pad them out with a formatting flag:
use std::thread;
use std::time::Duration;
fn main() {
    let mut time = 60;
    while time >= 0 {
        print!("\r{:<2}", time);
        thread::sleep(Duration::from_millis(1000));
        time -= 1;
    }
}
But this switch causes the output to disappear altogether. The problem, like always, is unflushed buffers. When we write data, it doesn't go immediately to its destination. Instead it goes to a buffer in RAM. Only when that buffer gets full does it get flushed onward. Or sometimes newlines cause it to flush. Or we can explicitly flush it, which requires some more imports and also error handling:
use std::thread;
use std::time::Duration;
use std::io::{self, Write};
fn main() {
    let mut time = 60;
    let mut stdout = io::stdout();
    while time >= 0 {
        print!("\r{:<2}", time);
        stdout.flush().unwrap();
        thread::sleep(Duration::from_millis(1000));
        time -= 1;
    }
}
There's no global variable for standard output as we see in other languages. All I/O functions have the potential to fail. Rust doesn't have exceptions. Instead it uses the types Option and Result, its counterparts to Haskell's Maybe and Either, which add a secondary channel to normal return values. The compiler warns us if we ignore the return value. The unwrap call causes the process to panic if a failure is returned. Panic means exit with a bad status code. We'll see other ways of dealing with optional types in our next examples.
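To make that secondary channel concrete, here is a minimal sketch that inspects the value flush returns with a match instead of calling unwrap. The describe function is a hypothetical helper invented for this illustration; it is not part of the standard library.

```rust
use std::io::{self, Write};

// Hypothetical helper: turns the outcome of an I/O call into a message.
fn describe(outcome: io::Result<()>) -> String {
    match outcome {
        Ok(()) => String::from("flushed"),
        Err(e) => format!("flush failed: {}", e),
    }
}

fn main() {
    let mut stdout = io::stdout();
    print!("tick");
    // flush returns io::Result<()>, an alias for Result<(), io::Error>.
    let outcome = stdout.flush();
    println!(" -> {}", describe(outcome));
}
```

Unlike unwrap, this version keeps running when the flush fails, and the caller decides what to do with the message.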
Handling Errors
When something goes wrong and the current code doesn't know how to continue, we have to make a decision. Here are the possible choices that our languages give us:
- Ignore it and proceed—at who knows what peril.
- Let the program crash or panic, as it's called in Rust. This is a bit fatalistic in that callers are given no chance to recover.
- Throw an exception. The current function halts, and the exception bubbles or propagates upward through the call stack. Exceptions aren't necessarily a bad choice, but they complicate programs by adding a secondary but implicit flow to our programs. We already know how to bail from functions: by returning. We already have a way of communicating results to the caller: by returning something. Why not use the mechanism we already have?
- Return an optional value. Instead of returning an Int, we return an IntOrFailure. But we don't make a special or-failure type for each non-failure type. We make a wrapper type and call it Maybe or Option. We have the caller choose its response to these two possible states.
In C, we take either the first approach or set some global error value. In Java, we throw exceptions. In Rust, we use Option or Result. Optional types might seem like a sidewise move compared to exceptions. They seem to accomplish the same goal as exceptions but seemingly add a lot of code. However, there are some advantages to optional types:
- Optional types build failure into the type system. The compiler, through its normal typechecking, verifies that we aren't treating a failure as normal data. We won't get accidental null pointer exceptions.
- Every function along the error path acknowledges the possibility of failure. There will be no wild jumping like we see with exceptions.
- When an algorithm invokes a sequence of operations that might fail, the success case can become deeply nested inside of conditionals. However, languages provide a syntactic shortcut for optional types that turns nesting into sequence. We'll see Rust's shortcut in the next example.
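The fourth choice in the earlier list, returning an optional value, might be sketched like this in Rust. The safe_divide function is a hypothetical example rather than anything from the standard library. It returns None instead of crashing when the divisor is zero, and the caller has to pick a response for each state.

```rust
// Hypothetical example: division that reports failure through its return type.
fn safe_divide(numerator: i32, denominator: i32) -> Option<i32> {
    if denominator == 0 {
        None
    } else {
        Some(numerator / denominator)
    }
}

fn main() {
    for denominator in [2, 0] {
        // The compiler insists that both variants are handled.
        match safe_divide(7, denominator) {
            Some(quotient) => println!("7 / {} = {}", denominator, quotient),
            None => println!("7 / {} is undefined", denominator),
        }
    }
}
```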
Wc
The Unix utility wc counts characters, words, and lines in a file. We run it from the command line as wc path/to/file.txt. Let's write our own version in Rust. Instead of diving right into code, work with your neighbors to do some reconnaissance. In particular, answer these questions:
Okay, we're ready to write some code. Let's start by pulling out the path from the command-line arguments.
use std::env;
use std::fs;
fn main() {
    let args = env::args();
    println!("{:?}", args);
}
The output shows that the executable name is the first parameter. We want the second.
When we call env::args we do not get back a collection. Instead we get an iterator. In Java, iterators have two methods, hasNext and next, that we use to drive forward and eventually terminate. In Rust, we have next but no hasNext. Any guesses why? The next method returns Option. It'll give back the None variant when hasNext would give back false.
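Here is a small sketch of driving an iterator by hand with next. It assumes a vector of made-up arguments in place of env::args so that the behavior is predictable, and the collect_all function is a hypothetical helper written for this illustration.

```rust
// Hypothetical helper: drains an iterator by calling next until it yields None.
fn collect_all(mut iterator: impl Iterator<Item = String>) -> Vec<String> {
    let mut items = Vec::new();
    // while let keeps looping as long as next returns the Some variant.
    while let Some(item) = iterator.next() {
        items.push(item);
    }
    items
}

fn main() {
    // Stand-in for env::args(): an executable name followed by one path.
    let fake_args = vec![String::from("wc"), String::from("notes.txt")];
    let items = collect_all(fake_args.into_iter());
    println!("{:?}", items);
}
```

The None variant plays the role of hasNext returning false, so a single method is enough to both fetch an element and detect the end.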
To get at the second parameter, we could call next twice. Or we could call skip(1) to jump over an unwanted element. Iterators also have an nth method for fetching the \(n^\textrm{th}\) element, counting from zero. Let's call that instead:
fn main() {
    let path = env::args().nth(1);
    println!("{:?}", path);
}
The path is wrapped up in an Option variant. Earlier we forced the program to panic if we got a failure back. What method did that? unwrap. Instead, we could explicitly handle each variant with a match statement:
fn main() {
    match env::args().nth(1) {
        Some(path) => println!("{}", path),
        None => panic!("Usage: wc <file>"),
    }
}
Neat, but the path is only valid inside the first arm. Do we need to nest all our code inside that arm? Imagine if we had five more calls that returned Option. The nesting would get out of hand. Sequences are easier to read than nesting, so let's switch to a match expression that yields the path to the outer scope:
fn main() {
    let path = match env::args().nth(1) {
        Some(path) => path,
        None => panic!("Usage: wc <file>"),
    };
}
The second arm exits the process, so it doesn't return anything.
This pattern of trying to get a value and panicking if it's bad is common enough that there's a helper function that abstracts it away. It's expect:
fn main() {
    let path = env::args().nth(1).expect("Usage: wc <file>");
}
That's much easier to read. Let's hand the path off to a function that reads in the file and computes its statistics. We'll have that function return an instance of this struct:
#[derive(Debug)]
struct Statistics {
    character_count: usize,
    word_count: usize,
    line_count: usize,
}
Then we frame our helper function. It takes in a borrowed string and gives back an instance of our struct:
fn count(path: &str) -> Statistics {
    // ...
}
The main function will call count like this:
fn main() {
    let path = env::args().nth(1).expect("Usage: wc <file>");
    let statistics = count(&path);
    println!("{:?}", statistics);
}
Inside count we need to read the file. That might fail. For the moment, let's panic if that reading fails:
fn count(path: &str) -> Statistics {
    let text = fs::read_to_string(path).expect("Couldn't read file.");
    // ...
}
Most Rust collections have a len method reporting their number of elements. We can use it to get the number of bytes in the string, which is not quite the same as the number of characters. Function split_whitespace gives us an iterator over words, and function lines gives us an iterator over the lines. Iterators don't have a len method, but they do have count. With these methods, we compute our statistics:
fn count(path: &str) -> Statistics {
    let text = fs::read_to_string(path).expect("Couldn't read file.");
    Statistics {
        character_count: text.len(),
        word_count: text.split_whitespace().count(),
        line_count: text.lines().count(),
    }
}
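To see why bytes and characters can differ, here is a short sketch on a made-up string. Rust strings are UTF-8, so a character outside ASCII occupies more than one byte. The byte_and_char_counts function and the sample text are assumptions chosen for illustration.

```rust
// Hypothetical helper: reports (bytes, characters) for a string slice.
fn byte_and_char_counts(text: &str) -> (usize, usize) {
    // len counts bytes; chars().count() counts Unicode scalar values.
    (text.len(), text.chars().count())
}

fn main() {
    // The é takes two bytes in UTF-8; every other character here takes one.
    let text = "café au lait\n";
    let (bytes, characters) = byte_and_char_counts(text);
    println!("bytes: {}", bytes);
    println!("characters: {}", characters);
    println!("words: {}", text.split_whitespace().count());
    println!("lines: {}", text.lines().count());
}
```

The real wc reports bytes by default, so len is the closer match to it.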
The output matches pretty well to the builtin wc. There's one difference, however. The real wc accepts multiple paths. We can too by looping over the args, skipping the executable name:
fn main() {
    for path in env::args().skip(1) {
        let statistics = count(&path);
        println!("{:?}", statistics);
    }
}
But what if we feed in a bad path? The real wc prints the error but keeps on going through the remaining paths. Ours crashes. We want to do what the real wc does. We'll make it so count returns its struct wrapped up in a Result:
use std::io;
fn count(path: &str) -> Result<Statistics, io::Error> {
    match fs::read_to_string(path) {
        Ok(text) => Ok(Statistics {
            character_count: text.len(),
            word_count: text.split_whitespace().count(),
            line_count: text.lines().count(),
        }),
        Err(e) => Err(e),
    }
}
I find this code hard to read. Imagine if we had to call another I/O method that could also fail. That would make the nesting even worse. Good news. Rust has a really nice postfix operator, ?, that we can place after a dangerous method call. If that call gives back None or an Err variant, the function returns that failure immediately. Otherwise it will go on to the next statement. It turns nesting into a sequence. This is much more readable:
fn count(path: &str) -> Result<Statistics, io::Error> {
    let text = fs::read_to_string(path)?;
    Ok(Statistics {
        character_count: text.len(),
        word_count: text.split_whitespace().count(),
        line_count: text.lines().count(),
    })
}
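The lecture doesn't show how main changes once count returns a Result, so here is one possible sketch, assuming we want to mimic wc by reporting a bad path and moving on to the next one. The report helper and its message format are hypothetical.

```rust
use std::env;
use std::fs;
use std::io;

struct Statistics {
    character_count: usize,
    word_count: usize,
    line_count: usize,
}

fn count(path: &str) -> Result<Statistics, io::Error> {
    // The ? returns early with the error if the file can't be read.
    let text = fs::read_to_string(path)?;
    Ok(Statistics {
        character_count: text.len(),
        word_count: text.split_whitespace().count(),
        line_count: text.lines().count(),
    })
}

// Hypothetical helper: builds one line of output for a path, success or failure.
fn report(path: &str) -> String {
    match count(path) {
        Ok(s) => format!("{} {} {} {}", s.line_count, s.word_count, s.character_count, path),
        Err(e) => format!("wc: {}: {}", path, e),
    }
}

fn main() {
    // Skip the executable name; a bad path yields a message, not a panic.
    for path in env::args().skip(1) {
        println!("{}", report(&path));
    }
}
```

Because the failure travels back as an ordinary return value, the loop in main decides what to do with it, and the remaining paths still get processed.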
Flashmod
For our last example, let's make a math flashcard quizzer. We're all good enough at addition and multiplication. Our flashcards will be for modular division. Rust doesn't ship with a library for generating random numbers, so we have to install a package. As soon as a project has dependencies, we are better off using Rust's package manager Cargo than using rustc directly. We build a new Cargo project and add the third-party rand crate (library) with these shell commands:
cargo new flashmod
cd flashmod
cargo add rand
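The code below targets the rand 0.8 API; rand 0.9 deprecates thread_rng and gen_range in favor of rng and random_range. In src/main.rs we add code to generate and print two random numbers: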
use rand::Rng;
use std::io::{self, Write};
fn main() {
    let mut generator = rand::thread_rng();
    let stdin = io::stdin();
    let mut stdout = io::stdout();
    let a = generator.gen_range(0..50);
    let b = generator.gen_range(1..10);
    print!("{} % {} = ", a, b);
    stdout.flush().unwrap();
}
Since there's no newline, we flush the output. Let's give the user ten problems with a for loop:
fn main() {
    let mut generator = rand::thread_rng();
    let stdin = io::stdin();
    let mut stdout = io::stdout();
    for _ in 0..10 {
        let a = generator.gen_range(0..50);
        let b = generator.gen_range(1..10);
        print!("{} % {} = ", a, b);
        stdout.flush().unwrap();
    }
}
We don't name the element because we never reference it. Next we get input from the user:
fn main() {
    let mut generator = rand::thread_rng();
    let stdin = io::stdin();
    let mut stdout = io::stdout();
    let mut response = String::new();
    for _ in 0..10 {
        let a = generator.gen_range(0..50);
        let b = generator.gen_range(1..10);
        print!("{} % {} = ", a, b);
        stdout.flush().unwrap();
        response.clear();
        stdin.read_line(&mut response).unwrap();
    }
}
We need to clear the string on each iteration because read_line appends the content. As always, user input comes in as a string. We need to parse out a number. This might fail, so we add a loop around the prompt and user input until the parse succeeds:
use rand::Rng;
use std::io::{self, Write};
fn main() {
    let mut generator = rand::thread_rng();
    let stdin = io::stdin();
    let mut stdout = io::stdout();
    let mut response = String::new();
    for _ in 0..10 {
        let a = generator.gen_range(0..50);
        let b = generator.gen_range(1..10);
        let number = loop {
            print!("{} % {} = ", a, b);
            stdout.flush().unwrap();
            response.clear();
            stdin.read_line(&mut response).unwrap();
            match response.trim().parse::<u32>() {
                Ok(number) => break number,
                Err(_) => println!("That wasn't a number."),
            }
        };
    }
}
This generic loop lets us check the exit condition anywhere instead of just at the beginning or end as in a while or for loop. We call break when we're ready to bail.
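To isolate the idea of a loop that yields a value from the I/O around it, here's a sketch that walks a list of canned responses in place of stdin. The first_valid function and the canned responses are hypothetical.

```rust
// Hypothetical stand-in for the prompt loop: tries canned responses in order
// and yields the first one that parses as a number.
fn first_valid(responses: &[&str]) -> u32 {
    let mut remaining = responses.iter();
    loop {
        let response = remaining.next().expect("ran out of responses");
        match response.trim().parse::<u32>() {
            // break hands its operand back as the value of the whole loop.
            Ok(number) => break number,
            Err(_) => println!("That wasn't a number."),
        }
    }
}

fn main() {
    let number = first_valid(&["seven", " 7\n"]);
    println!("Parsed {}", number);
}
```

The exit check sits in the middle of the body, after the parse, which is awkward to express with a while loop.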
All that remains is a final conditional to give feedback to the user:
use rand::Rng;
use std::io::{self, Write};
fn main() {
    let mut generator = rand::thread_rng();
    let stdin = io::stdin();
    let mut stdout = io::stdout();
    let mut response = String::new();
    for _ in 0..10 {
        let a = generator.gen_range(0..50);
        let b = generator.gen_range(1..10);
        let number = loop {
            print!("{} % {} = ", a, b);
            stdout.flush().unwrap();
            response.clear();
            stdin.read_line(&mut response).unwrap();
            match response.trim().parse::<u32>() {
                Ok(number) => break number,
                Err(_) => println!("That wasn't a number."),
            }
        };
        if a % b == number {
            println!("That's exactly right.");
        } else {
            println!("Well, no.");
        }
    }
}
That concludes our first tour of Rust. It sits somewhere between Ruby and Haskell. The compiler knows the types, so it can perform typechecking. Values can be mutable, so we don't have to contort ourselves into recursion like we did in Haskell. It's got a rich standard library and a very friendly package manager and build tool in Cargo. It's hip. It's supposed to make software safer.
TODO
Here's your list of things to do before we meet next:
See you next time.
Sincerely,
P.S. It's time for a haiku!