Item 25: Manage your dependency graph

Like most modern programming languages, Rust makes it easy to pull in external libraries, in the form of crates. By default, Cargo will:

  • download any crates named in the [dependencies] section of your Cargo.toml file from the default registry (crates.io), and
  • find versions of those crates that match the preferences configured in Cargo.toml.

There are a few subtleties lurking underneath this simple statement. The first thing to notice is that crate names form a single flat namespace (and this global namespace also overlaps with the names of features in a crate, see Item 26). Names are generally allocated on a first-come, first-served basis, so you may find that your preferred name for a public crate is already taken. (However, name-squatting – reserving a crate name by pre-registering an empty crate – is frowned upon, unless you really are going to release in the near future.)

As a minor wrinkle, there's also a slight difference between what's allowed as a crate name in this namespace, and what's allowed as an identifier in code: a crate can be named some-crate but it will appear in code as some_crate (with an underscore). To put it another way: if you see some_crate in code, the corresponding crate name may be either some-crate or some_crate.
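The name mapping can be sketched in a few lines. The helper name crate_ident is hypothetical, chosen purely for illustration; it only mirrors the hyphen-to-underscore rule described above.

```rust
/// Convert a crate name as written in Cargo.toml into the identifier
/// form that appears in Rust source code (hyphens become underscores).
/// `crate_ident` is a hypothetical helper, not part of any Cargo API.
fn crate_ident(crate_name: &str) -> String {
    crate_name.replace('-', "_")
}

fn main() {
    // Both spellings of the crate name map to the same identifier.
    assert_eq!(crate_ident("some-crate"), "some_crate");
    assert_eq!(crate_ident("some_crate"), "some_crate");
    println!("use {}::Thing;", crate_ident("some-crate"));
}
```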

The second aspect to be aware of is Cargo's version selection algorithm. Each Cargo.toml dependency line specifies an acceptable range of versions, according to semver (Item 21) rules, and Cargo takes this into account when the same crate appears in multiple places in the dependency graph. If the acceptable ranges overlap and are semver-compatible, then Cargo will pick the most recent version of the crate within the overlap.
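As a rough illustration of the overlap rule, the following sketch models caret-style requirements on plain (major, minor, patch) triples and picks the newest version that satisfies all of them. It is a simplification under stated assumptions: the function names are hypothetical, pre-release tags are ignored, and Cargo's real resolver handles many more cases.

```rust
/// A (major, minor, patch) version triple; pre-release tags are ignored.
type Version = (u64, u64, u64);

/// Exclusive upper bound of a caret requirement such as "^1.4.23":
/// the leftmost non-zero component is the one that must not change.
fn caret_upper_bound(min: Version) -> Version {
    match min {
        (0, 0, patch) => (0, 0, patch + 1),
        (0, minor, _) => (0, minor + 1, 0),
        (major, _, _) => (major + 1, 0, 0),
    }
}

/// Does `v` satisfy the caret requirement whose minimum version is `min`?
fn caret_matches(min: Version, v: Version) -> bool {
    v >= min && v < caret_upper_bound(min)
}

/// Pick the newest available version that satisfies every requirement,
/// or `None` if the requirements have no overlap.
fn pick_shared(available: &[Version], mins: &[Version]) -> Option<Version> {
    available
        .iter()
        .copied()
        .filter(|v| mins.iter().all(|min| caret_matches(*min, *v)))
        .max()
}

fn main() {
    let available = [(0, 7, 3), (0, 8, 3), (1, 2, 0), (1, 4, 23), (1, 5, 1), (2, 0, 0)];
    // "^1.2.0" and "^1.4.23" overlap, so one copy can serve both users.
    assert_eq!(pick_shared(&available, &[(1, 2, 0), (1, 4, 23)]), Some((1, 5, 1)));
    // "^0.7.3" and "^0.8.3" have no version in common.
    assert_eq!(pick_shared(&available, &[(0, 7, 3), (0, 8, 3)]), None);
}
```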

However, if there is no semver-compatible overlap, then Cargo will build multiple copies of the dependency at different versions. This can lead to confusion if the dependency is exposed in some way rather than just being used internally (Item 24) – the compiler will treat the two versions as being distinct crates, but its error messages won't necessarily make that clear.
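The confusion is easier to see in miniature. In this sketch, two modules stand in for two semver-incompatible versions of the same dependency (the module and type names are hypothetical). The types have the same name and the same layout, but the compiler treats them as unrelated.

```rust
use std::any::TypeId;

// Stand-ins for two semver-incompatible versions of one dependency.
mod dep_v0_7 {
    pub struct Seed(pub u64);
}
mod dep_v0_8 {
    pub struct Seed(pub u64);
}

/// A library built against the older version exposes its type in its API.
fn takes_old_seed(seed: dep_v0_7::Seed) -> u64 {
    seed.0
}

/// True if the compiler considers the two `Seed` types to be the same type.
fn seeds_are_same_type() -> bool {
    TypeId::of::<dep_v0_7::Seed>() == TypeId::of::<dep_v0_8::Seed>()
}

fn main() {
    let new_seed = dep_v0_8::Seed(42);
    assert_eq!(takes_old_seed(dep_v0_7::Seed(42)), 42);
    // takes_old_seed(new_seed); // does not compile: mismatched types
    assert_eq!(new_seed.0, 42);
    assert!(!seeds_are_same_type());
}
```

With real crates both types would be printed with the same path in an error message, which is why the mismatch can be hard to spot.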

Allowing multiple versions of a crate can also go wrong if the crate includes C/C++ code accessed via Rust's FFI mechanisms (Item 34). The Rust toolchain can internally disambiguate distinct versions of Rust code, but any included C/C++ code is subject to the one definition rule: there can only be a single version of any function, constant or global variable. This is most commonly encountered with the ring cryptographic library, because it includes parts of the BoringSSL library (which is written in C and assembler).

The third subtlety of Cargo's resolution process to be aware of is feature unification: the features that get activated for a dependent crate are the union of the features selected by different places in the dependency graph; see Item 26 for more details.
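Feature unification amounts to a set union, which the following sketch models directly. The function name and the feature names are hypothetical; this illustrates the idea and is not Cargo's implementation.

```rust
use std::collections::BTreeSet;

/// Union the feature lists requested for one crate by every dependent.
fn unify_features(requests: &[&[&str]]) -> BTreeSet<String> {
    requests
        .iter()
        .flat_map(|features| features.iter())
        .map(|f| (*f).to_owned())
        .collect()
}

fn main() {
    // Two places in the graph depend on the same crate with different features.
    let from_app: &[&str] = &["std"];
    let from_lib: &[&str] = &["std", "serde"];
    let unified = unify_features(&[from_app, from_lib]);
    // The crate is built once, with everything that anyone asked for.
    let names: Vec<&str> = unified.iter().map(String::as_str).collect();
    assert_eq!(names, ["serde", "std"]);
}
```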

Once Cargo has picked acceptable versions for all dependencies, its choices are recorded in the Cargo.lock file. Subsequent builds will then re-use the choices encoded in Cargo.lock, so that the build is stable and no new downloads are needed.

This leaves you with a choice: should you commit your Cargo.lock files into version control or not?

The advice from the Cargo developers is that:

  • Things that produce a final product, namely applications and binaries, should commit Cargo.lock to ensure a deterministic build.
  • Library crates should not commit a Cargo.lock file, because it's irrelevant to any downstream consumers of the library: they will have their own Cargo.lock file, and the Cargo.lock file of a library crate is ignored by its users.

Even for a library crate, it can be helpful to have a checked-in Cargo.lock file to ensure that regular builds and continuous integration (Item 32) don't have a moving target. Although the promises of semantic versioning (Item 21) should prevent failures in theory, mistakes happen in practice and it's frustrating to have builds that fail because someone somewhere recently changed a dependency of a dependency.

However, if you version control Cargo.lock, set up a process to handle upgrades (such as GitHub's Dependabot). If you don't, your dependencies will stay pinned to versions that get older, outdated and potentially insecure.

Pinning versions with a checked-in Cargo.lock file doesn't avoid the pain of handling dependency upgrades, but it does mean that you can handle them at a time of your own choosing, rather than immediately when the upstream crate changes. There's also some fraction of dependency upgrade problems that go away on their own: a crate that's released with a problem often gets a second, fixed, version released in a short space of time, and a batched upgrade process might only see the latter version.

Version Specification

The version specification clause for a dependency defines a range of allowed versions, according to the rules explained in the Cargo book.

  • Avoid too-specific a version dependency: pinning to a specific version ("=1.2.3") is usually a bad idea, because you don't see newer versions (potentially including security fixes), and you dramatically narrow the potential overlap range with other crates in the graph that rely on the same dependency (recall that Cargo only allows a single version of a crate to be used within a semver-compatible range).
  • Avoid too-general a version dependency: it's possible to specify a version dependency ("*") that allows non-semver compatible versions to be included, but it's a bad idea: do you really mean that the crate can completely change every aspect of its API and your code will still work? Thought not.

The most common Goldilocks specification is to allow semver-compatible versions ("1.*") of a crate, possibly with a specific minimum version that includes a feature or fix that you require ("^1.4.23").
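Side by side, the specifications discussed above might look like this in a Cargo.toml file. The crate names are hypothetical placeholders; only the version strings matter.

```toml
[dependencies]
# Too specific: only exactly 1.2.3 is acceptable.
pinned-crate = "=1.2.3"
# Too general: any version at all, including incompatible ones.
anything-crate = "*"
# Semver-compatible with 1.x: at least 1.0.0, below 2.0.0.
major-crate = "1.*"
# At least 1.4.23 (which has the required fix), below 2.0.0.
fixed-crate = "^1.4.23"
```

A bare version string such as "1.4.23" is interpreted by Cargo in the same way as "^1.4.23", so the caret is usually omitted.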

Solving Problems with Tooling

Item 31 recommends that you take advantage of the range of tools that are available within the Rust ecosystem; this section describes some dependency graph problems where tools can help.

The compiler will tell you pretty quickly if you use a dependency in your code, but don't include that dependency in Cargo.toml. But what about the other way around? If there's a dependency in Cargo.toml that you don't use in your code – or more likely, no longer use in your code – then Cargo will go on with its business. The cargo-udeps tool is designed to solve exactly this problem: it warns you when your Cargo.toml includes an unused dependency ("udep").

A more versatile tool is cargo-deny, which analyzes your dependency graph to detect a variety of potential problems across the full set of transitive dependencies:

  • Dependencies that have known security problems in the included version.
  • Dependencies that are covered by an unacceptable license.
  • Dependencies that are just unacceptable.
  • Dependencies that are included in multiple different versions across the dependency tree.

Each of these features can be configured and can have exceptions specified; the latter inevitably becomes necessary for large projects, particularly for the multiple-version warning.
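A configuration for these checks might look something like the following deny.toml fragment. Treat it as a sketch under assumptions: the table and key names follow the cargo-deny documentation as best recalled and may differ between cargo-deny versions, and the license list and crate names are illustrative choices only.

```toml
# deny.toml (sketch; check key names against your cargo-deny version)

[licenses]
# Licenses that are acceptable for this project (illustrative list).
allow = ["MIT", "Apache-2.0"]

[bans]
# Report crates that appear at more than one version.
multiple-versions = "warn"
# Crates that are just unacceptable (hypothetical example).
deny = [{ name = "openssl" }]
# Exceptions to the multiple-version check (hypothetical example).
skip = [{ name = "rand", version = "0.7" }]
```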

These tools can be run as a one-off, but it's better to ensure they're executed regularly and reliably by including them in your continuous integration system (Item 32). This helps to catch newly-introduced problems – including problems that may have been introduced outside of your code, in an upstream dependency (for example, a newly reported vulnerability).

If one of these tools does report a problem, it can be difficult to figure out exactly where in the dependency graph the problem arises. The cargo tree command that's included with cargo helps here; it shows the dependency graph as a tree structure.

dep-graph v0.1.0
├── dep-lib v0.1.0
│   └── rand v0.7.3
│       ├── getrandom v0.1.16
│       │   ├── cfg-if v1.0.0
│       │   └── libc v0.2.94
│       ├── libc v0.2.94
│       ├── rand_chacha v0.2.2
│       │   ├── ppv-lite86 v0.2.10
│       │   └── rand_core v0.5.1
│       │       └── getrandom v0.1.16 (*)
│       └── rand_core v0.5.1 (*)
└── rand v0.8.3
    ├── libc v0.2.94
    ├── rand_chacha v0.3.0
    │   ├── ppv-lite86 v0.2.10
    │   └── rand_core v0.6.2
    │       └── getrandom v0.2.3
    │           ├── cfg-if v1.0.0
    │           └── libc v0.2.94
    └── rand_core v0.6.2 (*)

cargo tree includes a variety of options that can help to solve specific problems, including:

  • --invert shows what depends on a specific package, helping you to focus on a particular problematic dependency.
  • --edges features shows what crate features are activated by a dependency link, which helps you figure out what's going on with feature unification (Item 26).
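The tree shown earlier contains several crates at two versions each. The following sketch finds them from a flat list of (name, version) pairs copied from that output. The duplicates function is a hypothetical helper written for illustration; in practice cargo tree has a --duplicates option for this job.

```rust
use std::collections::{BTreeMap, BTreeSet};

/// Return every crate name that appears with more than one version.
fn duplicates<'a>(packages: &[(&'a str, &'a str)]) -> BTreeMap<&'a str, BTreeSet<&'a str>> {
    let mut versions: BTreeMap<&'a str, BTreeSet<&'a str>> = BTreeMap::new();
    for (name, version) in packages {
        versions.entry(*name).or_default().insert(*version);
    }
    // Keep only the crates that were seen at two or more versions.
    versions.retain(|_, seen| seen.len() > 1);
    versions
}

fn main() {
    // Packages taken from the dependency tree shown above.
    let packages = [
        ("rand", "0.7.3"), ("getrandom", "0.1.16"), ("cfg-if", "1.0.0"),
        ("libc", "0.2.94"), ("rand_chacha", "0.2.2"), ("ppv-lite86", "0.2.10"),
        ("rand_core", "0.5.1"), ("rand", "0.8.3"), ("rand_chacha", "0.3.0"),
        ("rand_core", "0.6.2"), ("getrandom", "0.2.3"),
    ];
    let dups = duplicates(&packages);
    let names: Vec<&str> = dups.keys().copied().collect();
    assert_eq!(names, ["getrandom", "rand", "rand_chacha", "rand_core"]);
}
```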

What To Depend On

The previous sections have covered the more mechanical aspect of working with dependencies, but there's a more philosophical (and therefore harder to answer) question: when should you take on a dependency?

Most of the time, there's not much of a decision involved: if you need the functionality of a crate, then you need that crate, and the only alternative would be to write the functionality yourself [1].

But every new dependency has a cost: partly in terms of longer builds and bigger binaries, but mostly in terms of the developer effort involved in fixing problems with dependencies when they arise.

The bigger your dependency graph, the more likely you are to be exposed to these kinds of problems. The Rust crate ecosystem is just as vulnerable to accidental dependency problems as other package ecosystems, where history has shown that one developer yanking a package, or a team fixing the licensing for their package, can have widespread knock-on effects.

More worrying still are supply chain attacks, where a malicious actor deliberately tries to subvert commonly-used dependencies, whether by typo-squatting or by more sophisticated attacks [2].

This kind of attack doesn't just affect your compiled code – be aware that a dependency can run arbitrary code at build time, via build scripts (build.rs). That means that a compromised dependency could end up running a cryptocurrency miner as part of your continuous integration system!
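To make the build-time risk concrete, here is a harmless sketch of a build script. A build.rs file is an ordinary Rust program that Cargo compiles and runs before building the crate, with the same file system access as the user running the build. The side_effect function and the marker file name are hypothetical.

```rust
// build.rs: compiled and run by Cargo before the crate itself is built.
use std::path::{Path, PathBuf};
use std::{env, fs, io};

/// Writes a marker file; a stand-in for "anything the current user can do".
fn side_effect(dir: &Path) -> io::Result<PathBuf> {
    let marker = dir.join("ran-at-build-time.txt");
    fs::write(&marker, "a build script ran here\n")?;
    Ok(marker)
}

fn main() -> io::Result<()> {
    // Cargo sets OUT_DIR for build scripts; fall back to the temp dir so
    // that the sketch also runs as a standalone program.
    let dir = env::var_os("OUT_DIR")
        .map(PathBuf::from)
        .unwrap_or_else(env::temp_dir);
    let marker = side_effect(&dir)?;
    // Lines starting with "cargo:" are instructions to Cargo; this one
    // surfaces a warning in the build output.
    println!("cargo:warning=build script wrote {}", marker.display());
    Ok(())
}
```

Nothing here is specific to writing a file: a build script could just as easily read credentials or open a network connection.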

So for dependencies that are more "cosmetic", it's sometimes worth considering whether adding the dependency is worth the cost.

The answer is usually "yes", though – in the end, the amount of time spent dealing with dependency problems ends up being much less than the time it would take to write equivalent functionality from scratch.

1: If you are targeting a no_std environment, this choice may be made for you: many crates are not compatible with no_std, particularly if alloc is also unavailable (Item 33).

2: Go's module transparency takes a step towards addressing this, by ensuring that all clients see the same code for a given version of a package.