Lifetimes

Dear Computer

Chapter 11: Managing Memory

Collections own their data, but some abstractions should merely borrow data from some other owner. For example, suppose we have a very large heap string that we need to share with several abstractions. The string can only have one owner, but no single one of the abstractions strikes us as the obvious claimant. We don't want to copy the string; that would be a waste of memory. Instead, we have each abstraction borrow the one string.

Maybe the first abstraction is a lexer. We pass the lexer a reference to the source in this main function that reads the string in from a file:

Rust
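use std::env;
use std::fs;

fn main() {
  // Read the file named by the first command-line argument.
  let path = env::args().nth(1).unwrap();
  let source = fs::read_to_string(path).unwrap();
  // The lexer borrows the text; source remains its owner.
  let lexer = Lexer::new(&source);
}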

Based on this usage, we define a simplified Lexer as a struct whose only field, source, is a string slice declared with the plain type &str.

The compiler rejects this code. The problem is that we have two owners whose fates have become entangled. Variable source is the owner of the source code. Variable lexer is the owner of the Lexer instance. If source were to go out of scope or get reassigned, its original text would be freed, and the lexer would have a dangling pointer that points to memory no longer allocated. Unlike C, Rust doesn't allow dangling pointers.

The Rust compiler expects us to declare this entanglement with a lifetime parameter. A lifetime parameter is a tag of the form 'a. Any identifier can appear after the apostrophe, but the names tend to be short and generic. We attach this tag to the ampersand of a reference type, like the type of the source field:

Rust
struct Lexer {
  source: &'a str,
}

The 'a represents the lifetime of source, which is determined by the compiler. We must also declare the lifetime parameter in the struct header:

Rust
struct Lexer<'a> {
  source: &'a str,
}

The impl block and the constructor function also need lifetime parameters.
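A minimal sketch of how those declarations might look is below. The constructor name new comes from the call in main; the byte_count method and the inline source text are hypothetical, added only so the sketch has something to run.

```rust
// A lexer that borrows its source text rather than owning it.
struct Lexer<'a> {
  source: &'a str,
}

// The impl block declares 'a and then applies it to the Lexer type.
impl<'a> Lexer<'a> {
  // The returned Lexer may live no longer than the string behind source.
  fn new(source: &'a str) -> Self {
    Lexer { source }
  }

  // Hypothetical helper: reports how many bytes of text the lexer borrows.
  fn byte_count(&self) -> usize {
    self.source.len()
  }
}

fn main() {
  // Stand-in for the text read from a file in the earlier main.
  let source = String::from("let x = 5;");
  let lexer = Lexer::new(&source);
  println!("{}", lexer.byte_count()); // prints 10
}
```

Note that 'a appears twice in the impl header: once to declare it and once to hand it to Lexer, just as a generic type parameter would.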

The code now compiles. The lifetime parameter declares that the lexer cannot outlive the string that is passed to it. Let's verify that the compiler uses this lifetime information to identify a memory issue. Suppose we reassign source after creating the lexer and then call a lexer method.

The code is appropriately rejected. The lexer's reference to source is not allowed to go stale. Try swapping the final two statements of main. The compiler accepts this alternative ordering because the lexer's lifetime fits within the lifetime of source.
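The two orderings can be sketched as follows. The first_word method and the source strings are hypothetical stand-ins for whatever the lexer really does. The block shows the accepted ordering; the comment at the end describes the rejected one.

```rust
struct Lexer<'a> {
  source: &'a str,
}

impl<'a> Lexer<'a> {
  fn new(source: &'a str) -> Self {
    Lexer { source }
  }

  // Hypothetical method: returns the first whitespace-delimited word.
  fn first_word(&self) -> &'a str {
    self.source.split_whitespace().next().unwrap_or("")
  }
}

fn main() {
  let mut source = String::from("print 5");
  let lexer = Lexer::new(&source);

  // Accepted: the last use of lexer comes before source is reassigned,
  // so the borrow has already ended when the old text is freed.
  println!("{}", lexer.first_word());
  source = String::from("print 6");
  println!("{}", source);

  // Rejected if the reassignment is moved above the first_word call:
  // error[E0506]: cannot assign to `source` because it is borrowed
}
```

The compiler measures the borrow up to the last use of lexer, not to the end of the enclosing block, which is why the accepted ordering works even though lexer is still in scope when source is reassigned.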

Lifetime parameters are needed whenever borrowed data becomes part of an abstraction's state or a function's return value. If a reference is passed to a function for only temporary calculations, the allocation has no long-term entanglement and the reference does not need a lifetime parameter.
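The contrast can be sketched with two small functions. Both are hypothetical and not part of the lexer. The second takes two references because, with a single reference parameter, Rust's elision rules would fill in the annotation for us.

```rust
// No lifetime parameter: the reference is used only during the call,
// and the returned usize borrows nothing.
fn count_lines(text: &str) -> usize {
  text.lines().count()
}

// Lifetime parameter required: the returned slice borrows from one of the
// two inputs, so the signature ties all three references to the same 'a.
fn longer<'a>(first: &'a str, second: &'a str) -> &'a str {
  if second.len() > first.len() { second } else { first }
}

fn main() {
  let poem = String::from("roses\nviolets\nsugar");
  println!("{}", count_lines(&poem));         // prints 3
  println!("{}", longer("roses", "violets")); // prints violets
}
```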

Can't the compiler figure out the lifetimes without us having to declare them? It could. However, one could ask the same question about types: can't the compiler figure out the types of parameters or fields without us having to declare them? Often it can. However, if we leave it up to the compiler to infer types and lifetimes, then changes to the implementation may result in changes to the inferred types and lifetimes. Imagine relying on a function that yesterday returned an f64 and suddenly today returns an i64. Client code will break if abstractions don't have a stable interface. The designers of Rust adopted this rule to achieve stability: developers decide the interface. Therefore, we must explicitly declare types, lifetimes, constness, and mutability—not to help the compiler, but because we humans are in charge.
