Item 12: Prefer generics to trait objects

Item 2 described the use of traits to encapsulate behaviour in the type system, as a collection of related methods, and observed that there are two ways to make use of traits: as trait bounds for generics, or in trait objects. This Item explores the trade-offs between these two possibilities.

Rust's generics are roughly equivalent to C++'s templates: they allow the programmer to write code that works for some arbitrary type T, and specific uses of the generic code are generated at compile time – a process known as monomorphization in Rust, and template instantiation in C++. Unlike C++, Rust explicitly encodes the expectations for the type T in the type system, in the form of trait bounds for the generic.

In comparison, trait objects are fat pointers (Item 9) that combine a pointer to the underlying concrete item with a pointer to a vtable that in turn holds function pointers for all of the trait implementation's methods.

    let square = Square::new(1, 2, 2);
    let draw: &dyn Drawable = &square;
Trait object layout, with pointers to concrete item and vtable
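
The two-pointer layout can be observed from code. The following is a minimal sketch, assuming a pared-down Square and a plain tuple standing in for Bounds (both are illustrative stand-ins, not the definitions used elsewhere in this Item); it compares the size of an ordinary reference with the size of a trait object reference.

```rust
use std::mem::size_of;

// Hypothetical, simplified trait: a tuple stands in for `Bounds`.
trait Drawable {
    fn bounds(&self) -> (i64, i64, i64, i64);
}

// Hypothetical concrete type: top-left corner plus side length.
struct Square {
    x: i64,
    y: i64,
    side: i64,
}

impl Drawable for Square {
    fn bounds(&self) -> (i64, i64, i64, i64) {
        (self.x, self.y, self.x + self.side, self.y + self.side)
    }
}

/// Size in bytes of a plain reference to the concrete type.
fn thin_pointer_size() -> usize {
    size_of::<&Square>()
}

/// Size in bytes of a trait object reference: a pointer to the
/// concrete item plus a pointer to the vtable.
fn fat_pointer_size() -> usize {
    size_of::<&dyn Drawable>()
}
```

Here fat_pointer_size() is twice thin_pointer_size(): the trait object carries one extra pointer-sized field for the vtable.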

These basic facts already allow some immediate comparisons between the two possibilities:

  • Generics are likely to lead to bigger code sizes, because the compiler generates a fresh copy of the code generic::<T>(t: &T) for every type T that gets used; a traitobj(t: &dyn T) method only needs a single instance.
  • Invoking a trait method from a generic will generally be ever-so-slightly faster than from code that uses a trait object, because the latter needs to perform two dereferences to find the location of the code (trait object to vtable, vtable to implementation location).
  • Compile times for generics are likely to be longer, as the compiler is building more code and the linker has more work to do to fold duplicates.
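
To make the comparison concrete, here is a sketch of the same operation written both ways. The Point, Bounds, Square and Circle definitions are assumptions made for illustration (the constructor arguments are taken to be a position followed by a side length or radius); only the shape of the two function signatures matters.

```rust
#[derive(Debug, Clone, Copy, PartialEq)]
struct Point { x: i64, y: i64 }

#[derive(Debug, Clone, Copy, PartialEq)]
struct Bounds { top_left: Point, bottom_right: Point }

trait Drawable {
    fn bounds(&self) -> Bounds;
}

// Assumed layout: top-left corner plus side length.
struct Square { top_left: Point, side: i64 }

impl Square {
    fn new(x: i64, y: i64, side: i64) -> Self {
        Self { top_left: Point { x, y }, side }
    }
}

impl Drawable for Square {
    fn bounds(&self) -> Bounds {
        Bounds {
            top_left: self.top_left,
            bottom_right: Point {
                x: self.top_left.x + self.side,
                y: self.top_left.y + self.side,
            },
        }
    }
}

// Assumed layout: centre point plus radius.
struct Circle { center: Point, radius: i64 }

impl Circle {
    fn new(x: i64, y: i64, radius: i64) -> Self {
        Self { center: Point { x, y }, radius }
    }
}

impl Drawable for Circle {
    fn bounds(&self) -> Bounds {
        Bounds {
            top_left: Point { x: self.center.x - self.radius, y: self.center.y - self.radius },
            bottom_right: Point { x: self.center.x + self.radius, y: self.center.y + self.radius },
        }
    }
}

// Area of a bounding box.
fn area(b: Bounds) -> i64 {
    (b.bottom_right.x - b.top_left.x) * (b.bottom_right.y - b.top_left.y)
}

// Generic version: monomorphized once per concrete `T` that is used,
// with the call to `bounds()` resolved at compile time.
fn area_generic<T: Drawable>(item: &T) -> i64 {
    area(item.bounds())
}

// Trait object version: a single copy of the code, with the call to
// `bounds()` made through the vtable at run time.
fn area_dyn(item: &dyn Drawable) -> i64 {
    area(item.bounds())
}
```

Calling area_generic with both a Square and a Circle causes two copies of that function to be generated, whereas area_dyn exists once and serves both types.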

In most situations, these aren't significant differences; you should only use optimization-related concerns as a primary decision driver if you've measured the impact and found that it has a genuine effect (a speed bottleneck or a problematic increase in code size).

A more significant difference is that generic trait bounds can be used to conditionally make methods available, depending on whether the type parameter implements multiple traits.

    use std::fmt::Debug;

    trait Drawable {
        fn bounds(&self) -> Bounds;
    }

    struct Container<T>(T);

    impl<T: Drawable> Container<T> {
        // The `area` method is available for all `Drawable` containers.
        fn area(&self) -> i64 {
            let bounds = self.0.bounds();
            (bounds.bottom_right.x - bounds.top_left.x)
                * (bounds.bottom_right.y - bounds.top_left.y)
        }
    }

    impl<T: Drawable + Debug> Container<T> {
        // The `show` method is only available if `Debug` is also implemented.
        fn show(&self) {
            println!("{:?} has bounds {:?}", self.0, self.0.bounds());
        }
    }
    let square = Container(Square::new(1, 2, 2)); // Square is not Debug
    let circle = Container(Circle::new(3, 4, 1)); // Circle is Debug

    println!("area(square) = {}", square.area());
    println!("area(circle) = {}", circle.area());
    circle.show();
    // The following line would not compile.
    // square.show();

A trait object only encodes the implementation vtable for a single trait, so doing something equivalent is much more awkward. For example, a combination DebugDrawable trait could be defined for the show() case, together with some conversion operations (Item 6) to make life easier. However, if there are multiple different combinations of distinct traits, it's clear that the combinatorics of this approach rapidly become unwieldy.
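
One possible shape for such a combination trait is sketched below. The blanket implementation is an assumption about how the plumbing might be arranged (it is one convenience among several, and not necessarily the conversion approach mentioned above); the Point, Bounds and Circle definitions are likewise illustrative stand-ins.

```rust
use std::fmt::Debug;

#[derive(Debug)]
struct Point { x: i64, y: i64 }

#[derive(Debug)]
struct Bounds { top_left: Point, bottom_right: Point }

trait Drawable {
    fn bounds(&self) -> Bounds;
}

// Combination trait: anything that is both `Drawable` and `Debug`.
trait DebugDrawable: Drawable + Debug {}

// Blanket implementation, so that individual types need not opt in.
impl<T: Drawable + Debug> DebugDrawable for T {}

// Assumed layout: centre point plus radius.
#[derive(Debug)]
struct Circle { center: Point, radius: i64 }

impl Drawable for Circle {
    fn bounds(&self) -> Bounds {
        Bounds {
            top_left: Point { x: self.center.x - self.radius, y: self.center.y - self.radius },
            bottom_right: Point { x: self.center.x + self.radius, y: self.center.y + self.radius },
        }
    }
}

// Trait object equivalent of `Container::show`, returning the text
// rather than printing it.
fn show(item: &dyn DebugDrawable) -> String {
    format!("{:?} has bounds {:?}", item, item.bounds())
}
```

Every additional combination of traits would need another trait of this kind, which is where the approach starts to become unwieldy.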

Item 2 described the use of trait bounds to restrict what type parameters are acceptable for a generic function. Trait bounds can also be applied to trait definitions themselves:

    trait Shape: Drawable {
        fn render_in(&self, bounds: Bounds);
        fn render(&self) {
            self.render_in(overlap(SCREEN_BOUNDS, self.bounds()));
        }
    }

In this example, the render() method's default implementation (Item 13) makes use of the trait bound, relying on the availability of the bounds() method from Drawable.

Programmers coming from object-oriented languages often confuse trait bounds with inheritance, under the mistaken impression that a trait bound like this means that a Shape is-a Drawable. That's not the case: the relationship between the two traits is better expressed as Shape also-implements Drawable.

Under the covers, trait objects for traits that have trait bounds

    let square = Square::new(1, 2, 2);
    let draw: &dyn Drawable = &square;
    let shape: &dyn Shape = &square;

have a single combined vtable that includes the methods of the top-level trait, plus the methods of all of the trait bounds:

Trait objects for trait bounds, with distinct vtables for Drawable and Shape

This means that, without dedicated language support for trait upcasting (which stable Rust did not have before version 1.86), there is no way to "upcast" from Shape to Drawable, because the (pure) Drawable vtable can't be recovered at runtime (see Item 19 for more on this). In that situation there is no way to convert between related trait objects, which in turn means there is no Liskov substitution.

Repeating the same point in different words, a method that accepts a Shape trait object

  • can make use of methods from Drawable (because Shape also-implements Drawable, and because the relevant function pointers are present in the Shape vtable)
  • cannot pass the trait object on to another method that expects a Drawable trait object (because Shape is-not Drawable, and because the Drawable vtable isn't available).

In contrast, a generic method that accepts items that implement Shape

  • can use methods from Drawable
  • can pass the item on to another generic method that has a Drawable trait bound, because the trait bound is monomorphized at compile time to use the Drawable methods of the concrete type.
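
The generic case can be sketched as follows. The sides() method on Shape is a hypothetical addition that keeps the example small, and the Point, Bounds and Square definitions are illustrative assumptions.

```rust
#[derive(Debug, Clone, Copy, PartialEq)]
struct Point { x: i64, y: i64 }

#[derive(Debug, Clone, Copy, PartialEq)]
struct Bounds { top_left: Point, bottom_right: Point }

trait Drawable {
    fn bounds(&self) -> Bounds;
}

// Hypothetical method, used here in place of `render_in`/`render`.
trait Shape: Drawable {
    fn sides(&self) -> u32;
}

// Assumed layout: top-left corner plus side length.
struct Square { top_left: Point, side: i64 }

impl Drawable for Square {
    fn bounds(&self) -> Bounds {
        Bounds {
            top_left: self.top_left,
            bottom_right: Point {
                x: self.top_left.x + self.side,
                y: self.top_left.y + self.side,
            },
        }
    }
}

impl Shape for Square {
    fn sides(&self) -> u32 {
        4
    }
}

// A generic function that only requires `Drawable`.
fn bounding_area<T: Drawable>(item: &T) -> i64 {
    let b = item.bounds();
    (b.bottom_right.x - b.top_left.x) * (b.bottom_right.y - b.top_left.y)
}

// A generic function that requires `Shape`. Because `Shape` has a
// `Drawable` bound, it can call `bounds()` directly and can also hand
// the item on to `bounding_area`.
fn describe<T: Shape>(item: &T) -> String {
    let b = item.bounds();
    format!(
        "{} sides, width {}, area {}",
        item.sides(),
        b.bottom_right.x - b.top_left.x,
        bounding_area(item)
    )
}
```

The call to bounding_area(item) inside describe is resolved at compile time against the concrete type, so no vtable is involved.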

Another restriction on trait objects is the requirement for object safety: only traits that comply with the following two rules can be used as trait objects.

  • The trait's methods must not be generic.
  • The trait's methods must not return a type that includes Self.

The first restriction is easy to understand: a generic method f is really an infinite set of methods, potentially encompassing f::<i16>, f::<i32>, f::<i64>, f::<u8>, … The trait object's vtable, on the other hand, is very much a finite collection of function pointers, and so it's not possible to fit an infinite quart into a finite pint pot.

The second restriction is a little bit more subtle, but tends to be the restriction that's hit more often in practice – traits that impose Copy or Clone trait bounds (Item 5) immediately fall under this rule. To see why it's disallowed, consider code that has a trait object in its hands; what happens if that code calls (say) let y = x.clone()? The calling code needs to reserve enough space for y on the stack, but it has no idea of the size of y because Self is an arbitrary type. As a result, return types that mention[1] Self lead to a trait that is not object safe.

There is an exception to this second restriction. A method returning some Self-related type does not affect object safety if Self comes with an explicit restriction to types whose size is known at compile time: Self: Sized. This trait bound means that the method can't be used with trait objects anyway, because trait objects are explicitly of unknown size (!Sized), and so the method is irrelevant for object safety.
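
A small sketch of this exception follows, using a hypothetical Stamp trait and Tag type invented purely for illustration. The duplicate() method returns Self, but its Self: Sized bound keeps the trait usable as a trait object.

```rust
// Hypothetical trait for illustration.
trait Stamp {
    fn label(&self) -> String;

    // Returns `Self`, which would normally prevent use as a trait
    // object; the `Self: Sized` bound exempts this method.
    fn duplicate(&self) -> Self
    where
        Self: Sized;
}

// Hypothetical concrete type.
#[derive(Debug, Clone, PartialEq)]
struct Tag(String);

impl Stamp for Tag {
    fn label(&self) -> String {
        self.0.clone()
    }

    fn duplicate(&self) -> Self {
        self.clone()
    }
}

// `Stamp` can still be used as a trait object...
fn label_of(item: &dyn Stamp) -> String {
    // ...but only the methods without a `Self: Sized` bound are
    // callable here; `item.duplicate()` would be rejected.
    item.label()
}
```

Code that holds a concrete Tag can still call duplicate(), because the concrete type has a known size.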

The balance of factors so far leads to the advice to prefer generics to trait objects, but there are situations where trait objects are the right tool for the job.

The first is a practical consideration: if generated code size or compilation time is a concern, then trait objects are likely to be the better choice (as described at the start of this Item).

A more theoretical aspect that leads towards trait objects is that they fundamentally involve type erasure: information about the concrete type is lost in the conversion to a trait object. This can be a downside (see Item 19), but it can also be useful because it allows for collections of heterogeneous objects – because the code just relies on the methods of the trait, it can invoke and combine the methods of differently (concretely) typed items.

The traditional OO example of rendering a list of shapes would be one example of this: the same render() method could be used for squares, circles, ellipses and stars in the same loop.

    let shapes: Vec<&dyn Shape> = vec![&square, &circle];
    for shape in shapes {
        shape.render()
    }
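
The same idea also works when the collection owns its contents. The following is a minimal sketch, assuming a simplified Shape trait whose hypothetical describe() method stands in for render() so that the result can be inspected; the Square and Circle fields are likewise assumptions.

```rust
// Simplified, hypothetical version of `Shape`: `describe()` stands in
// for `render()`.
trait Shape {
    fn describe(&self) -> String;
}

struct Square { side: i64 }
struct Circle { radius: i64 }

impl Shape for Square {
    fn describe(&self) -> String {
        format!("square with side {}", self.side)
    }
}

impl Shape for Circle {
    fn describe(&self) -> String {
        format!("circle with radius {}", self.radius)
    }
}

// The slice holds differently typed items behind `Box<dyn Shape>`;
// each call to `describe()` is dispatched through that item's vtable.
fn describe_all(shapes: &[Box<dyn Shape>]) -> Vec<String> {
    shapes.iter().map(|shape| shape.describe()).collect()
}
```

A Vec<Box<dyn Shape>> built from a Square and a Circle can be passed to describe_all as a slice; a generic Vec<T> could not hold both, because T has to be a single concrete type.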

A much more obscure potential advantage for trait objects is when the available types are not known at compile-time; if new code is dynamically loaded at run-time (e.g. via dlopen(3)), then items that implement traits in the new code can only be invoked via a trait object, because there's no source code to monomorphize over.


1: At present, the restriction on methods that return Self includes types like Box<Self> that could be safely stored on the stack; this restriction might be relaxed in future.