Parametric Polymorphism in Rust
Generics have been in Rust from the very beginning of the language. Unlike Java and C++, its developers didn't have to find a way to squeeze them in to an established language. Rust generics have an expressive power similar to Java generics, but they are implemented like C++ templates.
Formal Type Parameters
Structs and enums declare their type parameters right after the name of the type, as in Java. For example, this Point3
struct can hold either integers or floats, or anything really:
struct Point3<T> {
x: T,
y: T,
z: T,
}
struct Point3<T> { x: T, y: T, z: T, }
The impl
block used to define functions for a generic type must declare any formal type parameters that will be used within. This block adds a generic constructor function to Point3
:
impl<T> Point3<T> {
pub fn new(x: T, y: T, z: T) -> Self {
return Point3 {x, y, z};
}
}
impl<T> Point3<T> { pub fn new(x: T, y: T, z: T) -> Self { return Point3 {x, y, z}; } }
The several instances of T
might seem redundant in this simple example. However, the only formal type parameter declaration is the T
on impl
. The other occurrences of T
are uses of that declared formal.
Functions declare their formal parameters after their name. For example, this clone
function makes a deep copy of a Point3
:
fn clone<T>(p: Point3<T>) -> Point3<T> {
Point3 {x: p.x, y: p.y, z: p.z}
}
fn clone<T>(p: Point3<T>) -> Point3<T> { Point3 {x: p.x, y: p.y, z: p.z} }
A function like this should normally be placed in the impl
block for Point3
, in which case it wouldn't declare its own type parameter but rather use the one declared by the block.
Actual Type Parameters
Actual type parameters are passed after the type or function name, but Rust offers two different syntaxes. If the actual type parameters appear where only a type is expected, then the actuals are attached directly after the name. This variable declaration puts the type parameter u32
in a spot where a type is expected:
let p: Point3<u32> = Point3::new(0, 0, 0);
let p: Point3<u32> = Point3::new(0, 0, 0);
In contexts where arbitrary expressions may occur, the type is attached to the name with the turbofish operator ::<>
, which eliminates the ambiguity of parsing <
. These variable declarations put the type information in their right-hand sides:
let p = Point3::<u32>::new(0, 0, 0);
let q = clone::<u32>(p);
let p = Point3::<u32>::new(0, 0, 0); let q = clone::<u32>(p);
The actual type parameters can be omitted if the compiler has enough clues from the surrounding code to infer them.
Implementation
As in C++, the Rust compiler instantiates a new instance of a generic type or function every time it encounters a new set of actual type parameters that it hasn't encountered before. The C++ community class this template instantiation, but the Rust community calls it monomorphization. Just as code that serves many types is polymorphic, code that serves a single type is monomorphic. Rust's monomorphized generics are subject to the same costs as C++'s templates: they lead to longer build times and bigger executables. But they also execute more quickly than erasure since the compiler knows the types and can wire up method calls to the appropriate code at compile time.
To share C++ templates, we define the templates in header files so that the compiler can instantiate new versions from the original source code. The Rust compiler behaves differently. It lightly compiles generic code down to an intermediate abstract syntax tree. This tree is monomorphized when the compiler encounters a new usage. In theory, we could release a library of these intermediate generic syntax trees without sharing any source code.