Global Registration

You might not have considered this before, but tests in Rust are rather magical. Anywhere in your project, you can slap #[test] on a function and the compiler makes sure that all of them are run automatically. This pattern of wanting access to items that are distributed over a crate, and possibly even multiple crates, is something that projects like bevy, tracing, and dioxus have all expressed interest in, but it's not something that Rust supports except for tests specifically.

Criterion is a custom benchmarking framework which provides great statistical analyses of performance. If you’ve ever used it before, you may have seen this pattern:

fn criterion_benchmark(c: &mut Criterion) {
    c.bench_function("benchmark_1", benchmark_1);
    c.bench_function("benchmark_2", benchmark_2);
    c.bench_function("benchmark_3", benchmark_3);
    // ...
    c.bench_function("benchmark_n", benchmark_n);
}

criterion_group!(benches, criterion_benchmark);
criterion_main!(benches);

Being explicit has upsides, but it would be very cool if you could instead write:

#[bench]
fn benchmark_1(b: &mut Bencher) {}

#[bench]
fn benchmark_2(b: &mut Bencher) {}

#[bench]
fn benchmark_3(b: &mut Bencher) {}

// ...

#[bench]
fn benchmark_n(b: &mut Bencher) {}

// magically finds all the #[bench] functions
// even when they're spread over multiple files
criterion_main!();

I call this pattern global registration: collecting marked items across a crate, or even across all the crates in an executable.

We see it in many places:

Personally, I needed this functionality for implementing unit tests in embedded code. There, you don’t have access to Rust’s built-in test framework. Interestingly, Rust does have a weird kind of support for this through #![feature(custom_test_frameworks)]. I’ve even recommended it once or twice to students who wanted to test their embedded software, but it always felt a bit painful, as this feature will, at least in its current form, never be stable.

Various people have expressed interest in having a generic system built into the compiler that can provide this behavior. At RustNL, I talked to epage, who wants this for the Testing Devex team, exactly to make tests less magical and to provide users with the option to define their own test frameworks. Since then, I’ve been thinking about this feature. It started with a pre-rfc, and at this point I’ve written most of an implementation. I’m just no longer 100% sure that it’s the implementation we should want.

Why? Well that’s what the rest of this blog post is about.

Library Solutions

While direct support from the compiler is the ideal way to implement this feature, as always, people have come up with interesting workarounds. There are currently two libraries that can help you achieve global registration, both of which are maintained by dtolnay. Let’s discuss them briefly.

Inventory

Inventory works using global constructors. These are special functions that are called before main by the operating system - the exact mechanism differs a bit from platform to platform. C++ uses these to run constructors for global variables, and in Rust you can use them through the ctor crate. If you would like to know more, I rather liked this blog post about them.

What happens is that before main, a small bit of code runs for each element that needs to be registered, to atomically add itself to a global linked list. Collecting happens based on the type of the item that’s being collected. I adapted the example from their docs to show what’s going on:

pub struct Flag {
    short: char,
    name: &'static str,
}

impl Flag {
    pub const fn new(short: char, name: &'static str) -> Self {
        Flag { short, name }
    }
}

inventory::submit! {
    Flag::new('v', "verbose")
}
// ====== generates roughly ========
// linked list node
static NODE: ... = Node::new(/* the flag */);
// runs when the program starts, before main
#[link_section = ".text.startup"]
unsafe extern "C" fn __ctor() {
    // where T: Collect
    unsafe { add_to_registry::<T>(NODE); }
}
// ================================

inventory::collect!(Flag);
// ====== generates roughly ========
impl Collect for Flag {
    fn registry() -> &'static Registry {
        // A registry is a linked list;
        // this is what the add_to_registry function adds to
        static REGISTRY: ... = Registry::new();
        &REGISTRY
    }
}
// ================================

fn main() {
    // iterate through the linked list
    // which was built just before main started
    for flag in inventory::iter::<Flag> {
        println!("-{}, --{}", flag.short, flag.name);
    }
}

Inventory works as promised, and can be quite useful. It also deals well with dynamic library loading. However, it requires running code before main, and it doesn’t work on all platforms; notably, it doesn’t really work on embedded platforms.

Using global constructors in Rust code is quite tricky. The standard library does a lot of work before main(), such as preparing the threading infrastructure, configuring standard input and output, and collecting the command-line arguments. When global constructors are run, none of these are initialized yet. Thus, Inventory does the bare minimum within its constructor functions.
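To make the linked-list part concrete, here is a minimal sketch of that registration step using only the standard library. It is not inventory's actual code: Node, HEAD, submit, and registered are hypothetical names, and the pre-main constructors are replaced by explicit calls in main.

```rust
use std::ptr;
use std::sync::atomic::{AtomicPtr, Ordering};

/// One registered element. Nodes live in statics, so they are never freed.
pub struct Node {
    value: &'static str,
    next: AtomicPtr<Node>,
}

impl Node {
    pub const fn new(value: &'static str) -> Self {
        Node { value, next: AtomicPtr::new(ptr::null_mut()) }
    }
}

/// Head of the global, intrusive, singly linked list (hypothetical name).
static HEAD: AtomicPtr<Node> = AtomicPtr::new(ptr::null_mut());

/// Lock-free push. In an inventory-like design, a constructor function
/// would call this before main; each node must be submitted only once.
pub fn submit(node: &'static Node) {
    let new = node as *const Node as *mut Node;
    let mut head = HEAD.load(Ordering::Acquire);
    loop {
        // Point the new node at the current head, then try to swing HEAD to it.
        node.next.store(head, Ordering::Relaxed);
        match HEAD.compare_exchange_weak(head, new, Ordering::AcqRel, Ordering::Acquire) {
            Ok(_) => return,
            Err(current) => head = current,
        }
    }
}

/// Walks the list from the head; the most recently submitted node comes first.
pub fn registered() -> Vec<&'static str> {
    let mut out = Vec::new();
    let mut cur = HEAD.load(Ordering::Acquire);
    while !cur.is_null() {
        // SAFETY: every non-null pointer in the list came from a `&'static Node`.
        let node: &'static Node = unsafe { &*cur };
        out.push(node.value);
        cur = node.next.load(Ordering::Acquire);
    }
    out
}

static VERBOSE: Node = Node::new("verbose");
static QUIET: Node = Node::new("quiet");

fn main() {
    // Stand-in for the pre-main constructors: register by hand.
    submit(&VERBOSE);
    submit(&QUIET);
    for name in registered() {
        println!("--{name}");
    }
}
```

The push touches nothing but a couple of atomics, which fits the constraint that constructor code should do the bare minimum.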

Linkme

An alternative to inventory is linkme. It comes with slightly different platform support (there are even some tests for a cortex-m target) and does not involve running code before main. Everything happens at compile time, though most of the magic happens during linking.

Object files contain different kinds of data (code, global variables, etc.), organized into sections. When a linker processes object files, it collects the data for each section together. Linkme uses some tricks that instruct the linker to create a section that contains all the registered elements. Again, adapting the example from the docs:

use linkme::distributed_slice;

// all the elements ultimately "appear" here
#[distributed_slice]
pub static BENCHMARKS: [fn(&mut Bencher)];

// adds to the static above
// by placing this in a specific linker section
#[distributed_slice(BENCHMARKS)]
static BENCH_DESERIALIZE: fn(&mut Bencher) = bench_deserialize;
fn bench_deserialize(b: &mut Bencher) {}

// tries to generate a linker section that contains:
// __SPECIAL_START_SYMBOL
// &BENCH_DESERIALIZE
// ... more elements
// __SPECIAL_END_SYMBOL

Now, the memory between the special start and end symbols forms a contiguous range, a slice, containing each of the elements that were added. This works without running code before main, though dynamic library loading isn’t really supported.

At this point, an obvious reaction is:

Why don’t we put this in the compiler?

If the compiler could generate this kind of distributed slice, the issue of platform support disappears. This is basically how libtest works. So, after talking about it on internals.rust-lang.org, I enthusiastically added what is essentially linkme to the compiler. The implementation even considers the possibility of some day supporting dynamic linking, if Rust ever starts properly doing that.

Note: in past discussions, this feature was often called “distributed slice”. That’s also the name of the version from the linkme crate. I renamed it because calling it a slice exposes too much of the feature’s internal workings and removes a lot of flexibility to change the design later.

#![feature(global_registration)]

use core::global_registration::{global_registry, register, Registry};

#[global_registry]
static ERROR_MSGS: Registry<&str>;

register!(ERROR_MSGS, "a");
register!(ERROR_MSGS, "b");

fn main() {
    for msg in ERROR_MSGS {
        println!("{}", msg);
    }
}

One crate defines a registry, with #[global_registry]; anyone can add to it and in the final binary a static appears that, when iterated over, contains all the elements. Neat!

Of course, I acted first and only then thought about it properly… After lots of incredibly helpful discussion with m-ou-se, we realised that this has a lot of implications. Let’s do some thought experiments, and discuss what they mean.

Visibility

If a registry is public, should anyone be able to add to it? Does pub mean read or write access? On a related note, does pub use forward this read and/or write access?

An option that we considered is somehow applying two visibilities to a registry definition, or splitting it up in two parts:

#[global_registry(pub REGISTRY_ADDER)]
static ERROR_MSGS: Registry<&str>;

// in another crate:

// the adder is public so this is ok
register!(REGISTRY_ADDER, "b");

I think the only design that makes sense is that pub forwards both read and write access. If you want a read-only registry, you can make it private and provide a public getter function that returns the elements:

// private
#[global_registry]
static ERROR_MSGS: Registry<&str>;

// public getter
pub fn get_error_msgs() -> &'static Registry<&'static str> {
    &ERROR_MSGS
}

If you somehow want a write-only registry, you’re out of luck.

Versioning

What happens if there are two different versions of a crate that defines a global registry in the dependency tree?

Imagine a crate graph like this:

cargo tree
a v0.1.0
├── b v0.1.0
│   └── c v0.1.0
└── c v0.2.0

c defines a global registry ERROR_MSGS. b adds some messages, and a reads them, getting access to the registry by importing a different version of c. The only possibility is that a does not see the items added by b.

You can already encounter this problem with the log crate right now. They did employ the semver trick to make it somewhat complicated to construct.
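The effect can be imitated in a single file. The sketch below is only an analogy: the two modules stand in for the two versions of c, a Mutex<Vec<_>> stands in for the registry, and all names are hypothetical. The point is that each version carries its own static, so anything written through one is invisible through the other.

```rust
/// Stand-in for `c v0.1.0`.
mod c_v0_1 {
    use std::sync::Mutex;
    pub static ERROR_MSGS: Mutex<Vec<&'static str>> = Mutex::new(Vec::new());
}

/// Stand-in for `c v0.2.0`: same source, but a separate static.
mod c_v0_2 {
    use std::sync::Mutex;
    pub static ERROR_MSGS: Mutex<Vec<&'static str>> = Mutex::new(Vec::new());
}

/// Stand-in for crate `b`, which depends on `c v0.1.0`.
mod b {
    pub fn register() {
        super::c_v0_1::ERROR_MSGS.lock().unwrap().push("error from b");
    }
}

/// What crate `a` sees through its own dependency on `c v0.2.0`.
fn messages_seen_by_a() -> Vec<&'static str> {
    c_v0_2::ERROR_MSGS.lock().unwrap().clone()
}

fn main() {
    b::register();
    // `a` reads the v0.2.0 registry, which `b` never wrote to.
    println!("a sees {} message(s)", messages_seen_by_a().len());
    println!("v0.1.0 holds {} message(s)", c_v0_1::ERROR_MSGS.lock().unwrap().len());
}
```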

Semver

You’ve published a crate with a global registry definition collecting u32. You’ve made a mistake though, and would actually like it to collect u64 instead. Is there any way you can upgrade without it being a breaking change?

With global registration, everyone in the dependency tree of a crate has to agree on the type of the elements in the registry. You could define a second registry, deprecate the old one, and iterate over both the new and the old registry whenever you need the elements. However, existing crates that are already reading the global registry won’t know that they now have to read two registries to make sure they get all the elements.
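A minimal sketch of that two-registry workaround, with plain static slices standing in for the registries (OLD_IDS, NEW_IDS, and all_ids are hypothetical names):

```rust
/// Stand-in for the deprecated registry that collects `u32`.
static OLD_IDS: &[u32] = &[1, 2];

/// Stand-in for the replacement registry that collects `u64`.
static NEW_IDS: &[u64] = &[3_000_000_000_000];

/// Reads both registries, widening the old elements on the fly.
fn all_ids() -> impl Iterator<Item = u64> {
    OLD_IDS
        .iter()
        .map(|&id| u64::from(id))
        .chain(NEW_IDS.iter().copied())
}

fn main() {
    for id in all_ids() {
        println!("{id}");
    }
}
```

Only readers that know to call something like all_ids get the complete picture; a reader that still iterates the old registry directly silently misses the new elements.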

What you’d want is some way to communicate that all the elements in the old registry should be converted using some function into elements in the new registry. Later on in this blog post I briefly touch on Externally Implementable Items, for which m-ou-se and I think we have solved this problem, but it won’t easily transfer to global registration.

Compile time access

If the compiler implements this feature, why is the information only available at runtime? What stops us from collecting registered elements to a const like this?

// private
#[global_registry]
const ERROR_MSGS: Registry<&str>;

register!(ERROR_MSGS, "a");
register!(ERROR_MSGS, "b");

It would be rather convenient if you could for example sort the elements at compile time (ignoring problems with us not yet having traits in const fns). In fact, it seems rather likely that most uses of global registries will be hidden in macros. Like a #[test] attribute, or #[get("/")] on a server route. So why couldn’t registries be a part of the macro machinery?

The answer to all this is quite simple: we can only know all the items in a registry once all crates in a dependency tree are compiled. At that point we can’t go back and rerun const fns in other crates with the final list of items, and we definitely can’t expand any macros in dependent crates anymore.

Registration in dependencies

Let’s think of a usecase. A custom testing framework. The framework defines a registry of tests, and in your crate you add to it. Some dependency of yours uses the same version of the same test framework. When you run your tests, should the tests of the dependency also run?

If the registry is truly global, that’s exactly what’d happen, but it’s not at all how tests behave right now. Obviously, each crate should get its own crate-local registry of tests. In fact, I’m not sure there’s ever a usecase where you genuinely want global registration. A benchmarking framework will also be crate-local, and the routes for a webserver probably as well. It can even be pretty confusing if some far far dependency accidentally (or even maliciously) injects some extra routes in your webserver.

So why did I initially implement inter-crate global registration? I thought that was the thing people wanted. For a little while I was afraid I’d misunderstood what people thought this feature meant, or implemented the wrong thing. But this is the design I sketched in my pre-rfc, where everyone generally agreed with it. Maybe that’s because of the way I framed it. However, it’s also just the version of global registration people are used to. That’s what linkme and inventory provide.

In any case, I now believe that actually global registration is not something we should want.

An alternative design

A fact is that testing devex wants to make custom test frameworks a thing, and I think it’s safe to conclude that this same pattern is useful in other places too. Let’s start from the beginning.

To support custom test frameworks, we at least want a way to register items from various modules within a single crate. So let’s start with single-crate registries. That also resolves some problems with registry visibility: within a single crate it’s less important who’s allowed to add to a registry, since you control all the code. In fact, now a registry could actually be a const and available during const evaluation, since within a crate that shouldn’t really be an issue. Neat!

At this point we can also drop any issues related to registries and dynamic linking. Everything happens before and during const evaluation and dynamic linking isn’t yet relevant.
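As an illustration of what const access could enable, here is a sketch of sorting registry elements during const evaluation. The registry itself is faked: PRIORITIES is a hand-written const array standing in for whatever a crate-local registry would collect, and sorted is a hypothetical helper. It sticks to plain integers and while loops, which sidesteps the missing trait support in const fns.

```rust
/// Insertion sort that can run during const evaluation.
const fn sorted<const N: usize>(mut items: [u32; N]) -> [u32; N] {
    let mut i = 1;
    while i < N {
        let mut j = i;
        // Shift the element at `i` left until it is in place.
        while j > 0 && items[j - 1] > items[j] {
            let tmp = items[j - 1];
            items[j - 1] = items[j];
            items[j] = tmp;
            j -= 1;
        }
        i += 1;
    }
    items
}

/// Hypothetical: pretend these were collected by a crate-local const registry.
const PRIORITIES: [u32; 4] = [30, 10, 40, 20];

/// Computed entirely at compile time.
const SORTED_PRIORITIES: [u32; 4] = sorted(PRIORITIES);

fn main() {
    println!("{:?}", SORTED_PRIORITIES);
}
```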

Intercrate sometimes?

It’s not true that there are no applications for inter-crate registries. Talking to some friends, we came up with some examples

Still, you want to limit the number of crates these items can come from. You don’t want a random dependency’s metrics, or server routes, or tests. One way to solve this is to explicitly register them.

use other_crate::OTHER_ROUTES;

#[global_registry]
const MY_ROUTES: &[Route];

register_many!(MY_ROUTES, OTHER_ROUTES);

I think having this is pretty neat, as you can now register any const slice of elements.
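Part of this can already be approximated by hand in today's Rust. The sketch below concatenates two const slices at compile time; it says nothing about how register_many! would be implemented, and concat, LOCAL_ROUTES, and the route strings are hypothetical. Note that the combined length has to be spelled out in the type, which is exactly the bookkeeping a registry would hide.

```rust
/// Concatenates two const slices into one array during const evaluation.
const fn concat<const N: usize>(a: &[&'static str], b: &[&'static str]) -> [&'static str; N] {
    assert!(a.len() + b.len() == N, "N must equal the combined length");
    let mut out = [""; N];
    let mut i = 0;
    while i < a.len() {
        out[i] = a[i];
        i += 1;
    }
    let mut j = 0;
    while j < b.len() {
        out[a.len() + j] = b[j];
        j += 1;
    }
    out
}

/// Stand-in for a const slice exported by another crate.
const OTHER_ROUTES: &[&str] = &["/health", "/metrics"];

/// Stand-in for the routes registered in the current crate.
const LOCAL_ROUTES: &[&str] = &["/", "/login"];

/// The merged list, built at compile time.
const MY_ROUTES: [&str; LOCAL_ROUTES.len() + OTHER_ROUTES.len()] =
    concat(LOCAL_ROUTES, OTHER_ROUTES);

fn main() {
    for route in MY_ROUTES {
        println!("{route}");
    }
}
```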

Unsolved questions

I think that designing registries like this solves most of the complicated questions associated with the magic intercrate global registration that I implemented. Adding elements from different crates is possible, but much more explicit now. However, there are also some things I haven’t completely worked out yet.

In truth, that’s why I began writing this blogpost. This is roughly where I got to, and maybe someone else has a brilliant idea.

Stable identifier

I think it will be common to hide global registries in macros. From a user’s perspective, it looks like tests are somewhat magically collected, and I actually think that’s alright. So let’s look at an example of what a custom test framework could look like:

use custom_test_framework::test_main;

fn main() {
    #[cfg(test)]
    test_main!();
}

#[cfg(test)]
mod tests {
    use custom_test_framework::custom_test;

    #[custom_test]
    fn test_a() {}

    #[custom_test]
    fn test_b() {}
}

I think this looks pretty neat, if this worked. Using #![feature(custom_test_frameworks)] I’ve written tests for embedded systems that look pretty much exactly like this. But where, in this example, is the global registry defined? If test_main! defines it, with some name, how does #[custom_test] refer to it? test_main! is invoked in a function scope. Maybe you’d have a third macro that you’d have to put above main to define the registry.

use custom_test_framework::{test_main, init};

init!();

fn main() {
    #[cfg(test)]
    test_main!();
}

#[cfg(test)]
mod tests {
    use custom_test_framework::custom_test;

    #[custom_test]
    fn test_a() {}

    #[custom_test]
    fn test_b() {}
}

Actually, nowhere here do we use the name of the global registry. That’s fine hygiene-wise, because top-level items don’t really have hygiene in Rust, but it’d probably give some weird errors when you call init! twice, or not at all. It’s all a little unfortunate.
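The naming trick can be tried out with macro_rules! today. In this sketch a Mutex<Vec<_>> filled at runtime stands in for the registry, register_test! stands in for the #[custom_test] attribute, and CUSTOM_TEST_REGISTRY is a hypothetical agreed-upon name. The three macros never pass the name to each other; they simply all spell it the same way.

```rust
/// Hypothetical `init!`: defines the registry under a fixed, well-known name.
macro_rules! init {
    () => {
        static CUSTOM_TEST_REGISTRY: ::std::sync::Mutex<Vec<(&'static str, fn())>> =
            ::std::sync::Mutex::new(Vec::new());
    };
}

/// Stand-in for `#[custom_test]`: refers to the registry by that same name.
macro_rules! register_test {
    ($test:ident) => {
        CUSTOM_TEST_REGISTRY
            .lock()
            .unwrap()
            .push((stringify!($test), $test as fn()))
    };
}

/// Hypothetical `test_main!`: runs every registered test, evaluates to the count.
macro_rules! test_main {
    () => {{
        let tests = CUSTOM_TEST_REGISTRY.lock().unwrap();
        for (name, test) in tests.iter() {
            test();
            println!("test {name} ... ok");
        }
        tests.len()
    }};
}

// Items defined by macro_rules! are not hygienic, so the other macros can see
// this static. Without this line they fail with a "cannot find value" error,
// and writing it twice is a duplicate-definition error.
init!();

fn test_a() {}
fn test_b() {}

fn main() {
    register_test!(test_a);
    register_test!(test_b);
    let ran = test_main!();
    println!("ran {ran} tests");
}
```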

NonLocal DefIDs

Under this crate-local version of global registration, an implementation would probably look like a hashmap that maps from the DefId of the registry definition to a list of all the elements. This DefId key could in theory be any DefId, including one that lives outside the current crate. In theory, this could be valid:

register!(custom_test_framework, some_test);

The identifier custom_test_framework, the name of our custom test framework crate, is easy to refer to from anywhere in the program. Alternatively, you could refer to some identifier in custom_test_framework. I think this would be incredibly complicated to teach the community. It can also be confusing: if you bind a registry to an identifier in a different crate, that crate itself won’t get access to the elements, as registration will be crate-local.
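A toy model of such a table, to show why a foreign key is technically unproblematic. This is not compiler code: DefId here is a simplified, hypothetical stand-in for the compiler's type, and Registrations is an invented name.

```rust
use std::collections::HashMap;

/// Simplified, hypothetical stand-in for the compiler's `DefId`.
#[derive(Clone, Copy, Debug, PartialEq, Eq, Hash)]
struct DefId {
    krate: u32,
    index: u32,
}

/// Hypothetical crate-local table: registry definition -> registered elements.
#[derive(Default)]
struct Registrations {
    by_registry: HashMap<DefId, Vec<&'static str>>,
}

impl Registrations {
    /// The key is just a `DefId`; nothing requires it to be local.
    fn register(&mut self, registry: DefId, element: &'static str) {
        self.by_registry.entry(registry).or_default().push(element);
    }

    /// Elements registered under `registry` while compiling the current crate.
    fn elements(&self, registry: DefId) -> &[&'static str] {
        self.by_registry
            .get(&registry)
            .map(Vec::as_slice)
            .unwrap_or(&[])
    }
}

fn main() {
    let local_registry = DefId { krate: 0, index: 7 };
    let foreign_crate = DefId { krate: 3, index: 0 };

    let mut table = Registrations::default();
    table.register(local_registry, "test_a");
    table.register(foreign_crate, "some_test");

    println!("{:?} -> {:?}", local_registry, table.elements(local_registry));
    println!("{:?} -> {:?}", foreign_crate, table.elements(foreign_crate));
}
```

The table belongs to the crate being compiled, so the crate that owns the foreign DefId never sees the entries filed under it.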

Compile time collections

At this point, we’re inventing a lot of new features that together give users what’s essentially a compile-time growable vector. You can add single const elements, concat const slices from different crates into one larger slice, and iterate over them. However, that’s a completely new thing to Rust. Rust doesn’t really have a compile-time only collection type, nor the ability to express that. Maybe that’s an abstraction that we’d need first in Rust before implementing this feature, because right now there’s a lot of compiler magic going on that’s likely hard to teach to users of the language.

Externally Implementable Items

Many of the things I’ve written about here come from discussions with m-ou-se, as we thought that a feature she was working on (Externally Implementable Items; EII) would have a very similar implementation to global registration.

Basically, externally implementable items are like registering a single element in a global registry. Or worded differently, a global registry is like having the possibility to provide more than one externally implemented element.

In the end I think we worked out a rather neat design for EII (more info probably coming soon), but one that does not help implement global registration.

Conclusion

Well, that’s where global registration stands now. I’m pretty sure we should not want the kind of inter-crate registration that I originally implemented. A more explicit importing of elements from other crates leads to fewer surprises. However, as you can see, there are still some details that I’m not entirely sure how to solve. Let me know if you have any ideas! Also, if you do have strong arguments for usecases which need truly global registration, I’d love to hear them.