Item 8: Use builders for complex types

Rust insists that all fields in a struct must be filled in when a new instance of that struct is created. This keeps the code safe, but does lead to more verbose boilerplate code than is ideal.

    #[derive(Debug, Default)]
    struct BaseDetails {
        given_name: String,
        preferred_name: Option<String>,
        middle_name: Option<String>,
        family_name: String,
        mobile_phone_e164: Option<String>,
    }

    // ...

    let dizzy = BaseDetails {
        given_name: "Dizzy".to_owned(),
        preferred_name: None,
        middle_name: None,
        family_name: "Mixer".to_owned(),
        mobile_phone_e164: None,
    };

This boilerplate code is also brittle, in the sense that a future change that adds a new field to the struct requires an update to every place that builds the structure.

The boilerplate can be significantly reduced by implementing and using the Default trait, as described in Item 5:

    let dizzy = BaseDetails {
        given_name: "Dizzy".to_owned(),
        family_name: "Mixer".to_owned(),
        ..Default::default()
    };

Using Default also helps reduce the changes needed when a new field is added, provided that the new field is itself of a type that implements Default.
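
As a small illustration of that point, the sketch below adds one extra field to BaseDetails. The `email` field and the `make_dizzy` helper are hypothetical names invented for this example; the construction code is the same as above and still compiles because `Option<String>` implements Default.

```rust
// Sketch: `BaseDetails` with one extra, hypothetical field (`email`) that was
// added after the construction code below had already been written.
#[derive(Debug, Default)]
struct BaseDetails {
    given_name: String,
    preferred_name: Option<String>,
    middle_name: Option<String>,
    family_name: String,
    mobile_phone_e164: Option<String>,
    email: Option<String>, // newly added; `Option<T>` implements `Default`
}

/// Hypothetical construction site that never mentions `email`: the struct
/// update syntax fills every unnamed field from `Default::default()`.
fn make_dizzy() -> BaseDetails {
    BaseDetails {
        given_name: "Dizzy".to_owned(),
        family_name: "Mixer".to_owned(),
        ..Default::default()
    }
}
```

Had the new field been of a type without a Default implementation, the derive on the struct itself would have failed instead.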

That's a more general concern: the automatically derived implementation of Default only works if all of the field types implement the Default trait. If there's a field that doesn't play along, the derive step doesn't work:

    #[derive(Debug, Default)]
    struct Details {
        given_name: String,
        preferred_name: Option<String>,
        middle_name: Option<String>,
        family_name: String,
        mobile_phone_e164: Option<String>,
        dob: chrono::Date<chrono::Utc>,
        last_seen: Option<chrono::DateTime<chrono::Utc>>,
    }

error[E0277]: the trait bound `Date<Utc>: Default` is not satisfied
   --> builders/src/main.rs:176:9
    |
169 |     #[derive(Debug, Default)]
    |                     ------- in this derive macro expansion
...
176 |         dob: chrono::Date<chrono::Utc>,
    |         ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^ the trait `Default` is not implemented for `Date<Utc>`
    |
    = note: this error originates in the derive macro `Default` (in Nightly builds, run with -Z macro-backtrace for more info)

The code can't implement Default for chrono::Date<chrono::Utc> itself because of the orphan rule (both the trait and the type are defined in other crates), so all of the fields have to be filled out manually:

    use chrono::TimeZone;

    let bob = Details {
        given_name: "Robert".to_owned(),
        preferred_name: Some("Bob".to_owned()),
        middle_name: Some("the".to_owned()),
        family_name: "Builder".to_owned(),
        mobile_phone_e164: None,
        dob: chrono::Utc.ymd(1998, 11, 28),
        last_seen: None,
    };

These ergonomics can be improved if you implement the builder pattern for complex data structures.

The simplest variant of the builder pattern is a separate struct that holds the information needed to construct the item. For simplicity, the example will hold an instance of the item itself.

    struct DetailsBuilder(Details);

    impl DetailsBuilder {
        /// Start building a new [`Details`] object.
        fn new(
            given_name: &str,
            family_name: &str,
            dob: chrono::Date<chrono::Utc>,
        ) -> Self {
            DetailsBuilder(Details {
                given_name: given_name.to_owned(),
                preferred_name: None,
                middle_name: None,
                family_name: family_name.to_owned(),
                mobile_phone_e164: None,
                dob,
                last_seen: None,
            })
        }
    }

The builder type can then be equipped with helper methods that fill out the nascent item's fields. Each such method consumes self but emits a new Self, allowing different construction methods to be chained.

    /// Set the preferred name.
    fn preferred_name(mut self, preferred_name: &str) -> Self {
        self.0.preferred_name = Some(preferred_name.to_owned());
        self
    }

These helper methods can do more than simple setters:

    /// Update the `last_seen` field to the current date/time.
    fn just_seen(mut self) -> Self {
        self.0.last_seen = Some(chrono::Utc::now());
        self
    }

The final method to be invoked for the builder consumes the builder and emits the built item.

    /// Consume the builder object and return a fully built [`Details`] object.
    fn build(self) -> Details {
        self.0
    }

Overall, this allows clients of the builder to have a more ergonomic building experience:

    let also_bob =
        DetailsBuilder::new("Robert", "Builder", chrono::Utc.ymd(1998, 11, 28))
            .middle_name("the")
            .preferred_name("Bob")
            .just_seen()
            .build();
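
For reference, the pieces above can be assembled into one self-contained sketch. To keep it free of external dependencies, it makes two substitutions that are assumptions of this example rather than part of the original code: a hypothetical `Date` struct stands in for chrono::Date<chrono::Utc>, and std::time::SystemTime stands in for chrono::DateTime<chrono::Utc>. The `middle_name` setter is written by analogy with `preferred_name`, and `build_bob` is a hypothetical helper.

```rust
use std::time::SystemTime;

/// Hypothetical stand-in for `chrono::Date<chrono::Utc>`, so that this sketch
/// needs only the standard library. It deliberately has no `Default`.
#[derive(Debug, Clone, Copy, PartialEq)]
struct Date {
    year: i32,
    month: u32,
    day: u32,
}

#[derive(Debug)]
struct Details {
    given_name: String,
    preferred_name: Option<String>,
    middle_name: Option<String>,
    family_name: String,
    mobile_phone_e164: Option<String>,
    dob: Date,
    last_seen: Option<SystemTime>, // stand-in for `chrono::DateTime<chrono::Utc>`
}

struct DetailsBuilder(Details);

impl DetailsBuilder {
    /// Start building: only the fields without a sensible default are required.
    fn new(given_name: &str, family_name: &str, dob: Date) -> Self {
        DetailsBuilder(Details {
            given_name: given_name.to_owned(),
            preferred_name: None,
            middle_name: None,
            family_name: family_name.to_owned(),
            mobile_phone_e164: None,
            dob,
            last_seen: None,
        })
    }

    /// Set the preferred name.
    fn preferred_name(mut self, preferred_name: &str) -> Self {
        self.0.preferred_name = Some(preferred_name.to_owned());
        self
    }

    /// Set the middle name (assumed to mirror `preferred_name`).
    fn middle_name(mut self, middle_name: &str) -> Self {
        self.0.middle_name = Some(middle_name.to_owned());
        self
    }

    /// Update the `last_seen` field to the current time.
    fn just_seen(mut self) -> Self {
        self.0.last_seen = Some(SystemTime::now());
        self
    }

    /// Consume the builder and return the finished item.
    fn build(self) -> Details {
        self.0
    }
}

/// Hypothetical helper showing the chained call style.
fn build_bob() -> Details {
    let dob = Date { year: 1998, month: 11, day: 28 };
    DetailsBuilder::new("Robert", "Builder", dob)
        .middle_name("the")
        .preferred_name("Bob")
        .just_seen()
        .build()
}
```

Every setter takes `self` by value and hands it back, which is what allows the calls in `build_bob` to be chained in a single expression.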

The all-consuming nature of this style of builder leads to a couple of wrinkles. The first is that the build process can't simply be split into separate stages:

        let builder = DetailsBuilder::new(
            "Robert",
            "Builder",
            chrono::Utc.ymd(1998, 11, 28),
        );
        if informal {
            builder.preferred_name("Bob");
        }
        let bob = builder.build();

error[E0382]: use of moved value: `builder`
   --> builders/src/main.rs:249:19
    |
241 |         let builder = DetailsBuilder::new(
    |             ------- move occurs because `builder` has type `DetailsBuilder`, which does not implement the `Copy` trait
...
247 |             builder.preferred_name("Bob");
    |                     --------------------- `builder` moved due to this method call
248 |         }
249 |         let bob = builder.build();
    |                   ^^^^^^^ value used here after move
    |
note: this function takes ownership of the receiver `self`, which moves `builder`
   --> builders/src/main.rs:49:27
    |
49  |     fn preferred_name(mut self, preferred_name: &str) -> Self {
    |                           ^^^^

This can be worked around by assigning the consumed builder back to the same variable:

    let mut builder =
        DetailsBuilder::new("Robert", "Builder", chrono::Utc.ymd(1998, 11, 28));
    if informal {
        builder = builder.preferred_name("Bob");
    }
    let bob = builder.build();

The other downside to the all-consuming nature of this builder is that only one item can be built; trying to create multiple instances by repeatedly calling build() on the same builder:

        let smithy =
            DetailsBuilder::new("Agent", "Smith", chrono::Utc.ymd(1999, 6, 11));
        let clones = vec![smithy.build(), smithy.build(), smithy.build()];

falls foul of the borrow checker, as you'd expect:

error[E0382]: use of moved value: `smithy`
   --> builders/src/main.rs:269:43
    |
267 |         let smithy =
    |             ------ move occurs because `smithy` has type `DetailsBuilder`, which does not implement the `Copy` trait
268 |             DetailsBuilder::new("Agent", "Smith", chrono::Utc.ymd(1999, 6, 11));
269 |         let clones = vec![smithy.build(), smithy.build(), smithy.build()];
    |                                  -------  ^^^^^^ value used here after move
    |                                  |
    |                                  `smithy` moved due to this method call

An alternative approach is for the builder's methods to take a &mut self and emit a &mut Self (shown here for a separate DetailsRefBuilder type):

    /// Update the `last_seen` field to the current date/time.
    fn just_seen(&mut self) -> &mut Self {
        self.0.last_seen = Some(chrono::Utc::now());
        self
    }

This removes the need for self-assignment in separate build stages:

    let mut builder = DetailsRefBuilder::new(
        "Robert",
        "Builder",
        chrono::Utc.ymd(1998, 11, 28),
    );
    if informal {
        builder.preferred_name("Bob"); // no `builder = ...`
    }
    let bob = builder.build();

However, this version makes it impossible to chain the construction of the builder together with invocations of its setter methods when the result is stored for later use:

        let builder = DetailsRefBuilder::new(
            "Robert",
            "Builder",
            chrono::Utc.ymd(1998, 11, 28),
        )
        .middle_name("the")
        .just_seen();
        let bob = builder.build();

error[E0716]: temporary value dropped while borrowed
   --> builders/src/main.rs:289:23
    |
289 |           let builder = DetailsRefBuilder::new(
    |  _______________________^
290 | |             "Robert",
291 | |             "Builder",
292 | |             chrono::Utc.ymd(1998, 11, 28),
293 | |         )
    | |_________^ creates a temporary which is freed while still in use
294 |           .middle_name("the")
295 |           .just_seen();
    |                       - temporary value is freed at the end of this statement
296 |           let bob = builder.build();
    |                     --------------- borrow later used here
    |
    = note: consider using a `let` binding to create a longer lived value

As indicated by the compiler error, this can be worked around by binding the builder to a named variable:

    let mut builder = DetailsRefBuilder::new(
        "Robert",
        "Builder",
        chrono::Utc.ymd(1998, 11, 28),
    );
    builder.middle_name("the").just_seen();
    if informal {
        builder.preferred_name("Bob");
    }
    let bob = builder.build();

This mutating builder variant also allows for building multiple items. For that to work, the build() method must not consume self, so its signature has to be:

    /// Construct a fully built [`Details`] object.
    fn build(&self) -> Details {
        // ...
    }

The implementation of this repeatable build() method then has to construct a fresh item on each invocation. If the underlying item implements Clone, this is easy – the builder can hold a template and clone() it for each build. If the underlying item doesn't implement Clone, then the builder needs to have enough state to be able to manually construct an instance of the underlying item on each call to build().
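
A minimal sketch of the Clone-based variant might look like the following. It assumes a trimmed-down `Details` with only name fields (so that it needs nothing beyond the standard library), and the helper functions `three_smiths` and `one_liner` are hypothetical names used only for illustration.

```rust
/// Trimmed-down, hypothetical version of `Details`. Deriving `Clone` lets the
/// builder keep a template and copy it on every `build()`.
#[derive(Debug, Clone, PartialEq)]
struct Details {
    given_name: String,
    preferred_name: Option<String>,
    middle_name: Option<String>,
    family_name: String,
}

/// Builder whose setters borrow mutably instead of consuming `self`.
struct DetailsRefBuilder(Details);

impl DetailsRefBuilder {
    fn new(given_name: &str, family_name: &str) -> Self {
        DetailsRefBuilder(Details {
            given_name: given_name.to_owned(),
            preferred_name: None,
            middle_name: None,
            family_name: family_name.to_owned(),
        })
    }

    fn preferred_name(&mut self, preferred_name: &str) -> &mut Self {
        self.0.preferred_name = Some(preferred_name.to_owned());
        self
    }

    fn middle_name(&mut self, middle_name: &str) -> &mut Self {
        self.0.middle_name = Some(middle_name.to_owned());
        self
    }

    /// Borrow the builder and return a fresh copy of the template.
    fn build(&self) -> Details {
        self.0.clone()
    }
}

/// Build several identical items from one named builder.
fn three_smiths() -> Vec<Details> {
    let mut smithy = DetailsRefBuilder::new("Agent", "Smith");
    smithy.preferred_name("Smith");
    vec![smithy.build(), smithy.build(), smithy.build()]
}

/// Because `build()` only borrows and returns an owned value, a chain that
/// ends in `build()` is fine: the temporary builder lives until the whole
/// chained expression has been evaluated.
fn one_liner() -> Details {
    DetailsRefBuilder::new("Robert", "Builder")
        .middle_name("the")
        .build()
}
```

The `one_liner` case works because nothing borrowed from the temporary builder outlives the expression. The earlier error arises only when the `&mut` reference returned by a setter is itself kept around.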

With any style of builder pattern, the boilerplate code is now confined to one place – the builder – rather than being needed at every place that uses the underlying type.

The boilerplate that remains can potentially be reduced still further by use of a macro (Item 28), but if you go down this road you should also check whether there's an existing crate (such as the derive_builder crate in particular) that provides what's needed – assuming that you're happy to take a dependency on it (Item 25).
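
To give a flavor of the macro route, here is a rough sketch using a declarative macro to generate the setters for the `Option<String>` fields of BaseDetails. The macro name `option_string_setters`, the `BaseDetailsBuilder` type and its `item` field are all hypothetical names made up for this example; an existing crate is likely to be more complete and better tested than anything along these lines.

```rust
/// Hypothetical helper macro: generates one consuming setter per listed
/// `Option<String>` field of the wrapped item.
macro_rules! option_string_setters {
    ($($field:ident),* $(,)?) => {
        $(
            fn $field(mut self, value: &str) -> Self {
                self.item.$field = Some(value.to_owned());
                self
            }
        )*
    };
}

#[derive(Debug, Default)]
struct BaseDetails {
    given_name: String,
    preferred_name: Option<String>,
    middle_name: Option<String>,
    family_name: String,
    mobile_phone_e164: Option<String>,
}

/// Hypothetical builder that holds the item under construction.
struct BaseDetailsBuilder {
    item: BaseDetails,
}

impl BaseDetailsBuilder {
    fn new(given_name: &str, family_name: &str) -> Self {
        BaseDetailsBuilder {
            item: BaseDetails {
                given_name: given_name.to_owned(),
                family_name: family_name.to_owned(),
                ..Default::default()
            },
        }
    }

    // Expands to three setter methods, one per named field.
    option_string_setters!(preferred_name, middle_name, mobile_phone_e164);

    /// Consume the builder and return the finished item.
    fn build(self) -> BaseDetails {
        self.item
    }
}
```

With this in place, adding a setter for a further `Option<String>` field means adding one identifier to the macro invocation rather than writing another method by hand.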