How to instrument a Rust application with OpenTelemetry

Let’s talk about troubleshooting your Rust code in production. Whenever we deploy Rust code to a production environment, we want to troubleshoot issues quickly and confidently.
Let’s explore OpenTelemetry. OpenTelemetry provides us with the tools to collect telemetry signals, such as traces, from applications running in production.

Create a new Rust project

Let’s set up a new Rust project by running cargo new:

$ cargo new otel_instrumentation
     Created binary (application) `otel_instrumentation` package

This command provides us with a new binary project. Next, we add a few dependencies to our Cargo.toml:

[package]
name = "otel_instrumentation"
version = "0.1.0"
edition = "2021"

# See more keys and their definitions at https://doc.rust-lang.org/cargo/reference/manifest.html

[dependencies]
opentelemetry = { version="0.16", features = ["serialize", "rt-tokio"] }
opentelemetry-jaeger = { version="0.15", features = ["rt-tokio"] }
tokio = { version = "1.0", features = ["full"] }

First of all, we need to add the opentelemetry and opentelemetry-jaeger crates. opentelemetry provides us with the structs and functionality we need to instrument our code. We also add tokio because we want to export spans concurrently in a background task.

opentelemetry-jaeger allows us to send trace data to a backend. OpenTelemetry is backend-agnostic, meaning it does not specify how telemetry data should be post-processed. There are several OpenTelemetry backends available, both commercial and Open Source. For this use case, we’re using Jaeger.
Download the Jaeger-All-in-One release from the Binaries download page. It includes a receiver, processor, and Web UI. It provides us with a convenient way of inspecting traces in a development setting. Please note, Jaeger-All-in-One is not suitable for production usage because it stores data in memory only.

How to set up OpenTelemetry within Rust

Before we can start instrumenting our source code, we need to set up a pipeline. Open main.rs, and above fn main(), paste the following code:

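use opentelemetry::sdk::trace::{self as sdktrace, Config};
use opentelemetry::sdk::Resource;
use opentelemetry::trace::TraceError;
use opentelemetry::KeyValue;

// Builds the Jaeger pipeline and installs it as the global tracer provider.
fn init_tracer() -> Result<sdktrace::Tracer, TraceError> {
    opentelemetry_jaeger::new_pipeline()
        .with_service_name("trace-demo")
        .with_trace_config(Config::default().with_resource(Resource::new(vec![
            KeyValue::new("exporter", "otlp-jaeger"),
        ])))
        // Export spans in batches on the Tokio runtime.
        .install_batch(opentelemetry::runtime::Tokio)
}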

Here, we’re creating a new Jaeger pipeline and configuring it with a few global values. First of all, we’re setting a service_name, which identifies the service that emitted the traces and is the name we will later search for in Jaeger. We also add a Resource with an exporter attribute so that every span records which exporter produced it.
OpenTelemetry allows sending traces in a few different ways. In our example above, we don’t configure an endpoint, so we’re assuming jaeger-all-in-one runs on localhost.

If Jaeger runs on a different host or port, please check out this GitHub Example.
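
As a rough sketch of how the target address could be made configurable, the helper below builds a host:port string from optional overrides and falls back to 127.0.0.1:6831, the conventional UDP port of the Jaeger agent. The environment variable names JAEGER_AGENT_HOST and JAEGER_AGENT_PORT and both function names are assumptions made for this example; the crates do not define them. The code uses only the standard library.

```rust
use std::env;

// Conventional Jaeger agent address when nothing else is configured.
const DEFAULT_AGENT_HOST: &str = "127.0.0.1";
const DEFAULT_AGENT_PORT: u16 = 6831;

/// Builds a "host:port" string from optional overrides.
/// Empty or whitespace-only values are treated as "not set".
/// Returns an error message if the port is not a valid u16.
fn agent_endpoint(host: Option<&str>, port: Option<&str>) -> Result<String, String> {
    let host = host
        .map(str::trim)
        .filter(|h| !h.is_empty())
        .unwrap_or(DEFAULT_AGENT_HOST);
    let port = match port.map(str::trim).filter(|p| !p.is_empty()) {
        Some(raw) => raw
            .parse::<u16>()
            .map_err(|e| format!("invalid port {:?}: {}", raw, e))?,
        None => DEFAULT_AGENT_PORT,
    };
    Ok(format!("{}:{}", host, port))
}

/// Reads the overrides from two hypothetical environment variables.
fn agent_endpoint_from_env() -> Result<String, String> {
    let host = env::var("JAEGER_AGENT_HOST").ok();
    let port = env::var("JAEGER_AGENT_PORT").ok();
    agent_endpoint(host.as_deref(), port.as_deref())
}
```

The resulting string could then be handed to the pipeline builder inside init_tracer, for example through with_agent_endpoint on opentelemetry_jaeger::new_pipeline(). Check the documentation for the crate version you use, because the builder API has changed between releases.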

Initialize a Tracer within your main Rust program

Now that we have a setup function for the tracer, let’s take the next step and call it from main. Behind the scenes, we’re using Tokio to batch traces and send them out.


use opentelemetry::global;
use std::error::Error;

#[tokio::main]
async fn main() -> Result<(), Box<dyn Error + Send + Sync + 'static>> {
    // By binding the result to an unused variable, the lifetime of the variable
    // matches the containing block, reporting traces and metrics during the whole
    // execution.
    let _tracer = init_tracer()?;
    let tracer = global::tracer("ex.com/basic");
    //...

    // Flush spans still waiting in the batch exporter before the process exits.
    global::shutdown_tracer_provider();

    Ok(())
}

You see that our main function is configured to use Tokio. Initially, we’re calling init_tracer to set up the pipeline.

Whenever we want to generate a new trace, we first need a tracer. We obtain a new instance by calling global::tracer. global::tracer requires an argument specifying the tracing library name. Replace "ex.com/basic" with your custom identifier.

Generate traces

Now we’re ready to generate traces.

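// Both traits must be in scope: Tracer for in_span, TraceContextExt for cx.span().
use opentelemetry::{trace::{TraceContextExt, Tracer}, Key};

// in_span starts a span named "operation" and ends it when the closure returns.
tracer.in_span("operation", |cx| {
    let span = cx.span();
    // Attach a timestamped event with one integer attribute to the span.
    span.add_event(
        "Nice operation!".to_string(),
        vec![Key::new("bogons").i64(100)],
    );
});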

We call in_span to generate a new span and provide a closure as an argument. In that closure, we can customize the span, for instance, by adding an event.

In general, we want to generate a span for each critical (business) transaction within the application. For instance, we’d want to create a span for User Checkout, describing the overall checkout process.
We’d add the user’s id plus any other meta-information we might need during troubleshooting. Using span.add_event, we can add more context to the checkout, such as Credit Card Charged Amount or Number of Items in Cart.
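
To make the checkout example more concrete, here is a minimal sketch of collecting that context in one place before attaching it to a span. The Checkout and Attr types, the field names, and the attribute keys are all hypothetical and chosen only for illustration. The code deliberately uses only the standard library, so it shows the shape of the data rather than the OpenTelemetry calls themselves.

```rust
/// Hypothetical domain type describing one checkout.
struct Checkout {
    user_id: i64,
    items_in_cart: i64,
    // Amounts are kept in cents to avoid floating-point rounding.
    charged_cents: i64,
}

/// One integer key/value pair destined for a span event.
#[derive(Debug, PartialEq)]
struct Attr {
    key: &'static str,
    value: i64,
}

/// Collects the troubleshooting context for a checkout in a fixed order.
fn checkout_event_attrs(checkout: &Checkout) -> Vec<Attr> {
    vec![
        Attr { key: "user.id", value: checkout.user_id },
        Attr { key: "checkout.items_in_cart", value: checkout.items_in_cart },
        Attr { key: "checkout.charged_cents", value: checkout.charged_cents },
    ]
}
```

Each Attr could then be turned into an event attribute with Key::new(attr.key).i64(attr.value), in the same way as the "bogons" attribute in the snippet above, and passed to span.add_event inside a span named after the checkout transaction.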

Viewing Traces

Once you have finished instrumenting your application, it is time to test it. If you haven’t yet, go and install Jaeger-All-in-One and start it. In a separate terminal window, run:

$ cd jaeger-1.26.0-darwin-amd64
$ ./jaeger-all-in-one

This single binary performs a few different jobs. Most importantly, it listens on several ports to accept telemetry data. It also provides a web interface at http://localhost:16686.
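
Spans sent to a Jaeger instance that is not running tend to disappear without an obvious error, so it can be useful to confirm that Jaeger is up first. The sketch below, which uses only the standard library, checks whether a TCP port accepts connections. The function name is_reachable is an assumption made for this example. Pointing it at the web interface port is only a proxy for "Jaeger is running"; it does not verify the ports that receive telemetry data.

```rust
use std::net::{TcpStream, ToSocketAddrs};
use std::time::Duration;

/// Returns true if a TCP connection to `addr` ("host:port") succeeds
/// within `timeout`. The timeout must be greater than zero.
fn is_reachable(addr: &str, timeout: Duration) -> bool {
    match addr.to_socket_addrs() {
        // Try every resolved address and stop at the first that connects.
        Ok(mut addrs) => addrs.any(|a| TcpStream::connect_timeout(&a, timeout).is_ok()),
        // The string could not be parsed or resolved.
        Err(_) => false,
    }
}
```

For instance, is_reachable("localhost:16686", Duration::from_millis(500)) should return true while Jaeger-All-in-One is running with its default settings.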

Now that Jaeger is running, let’s get back to our small application. In the Rust application directory, run:

$ cargo run
   Compiling otel_instrumentation v0.1.0 (~/projects/otel_instrumentation)
    Finished dev [unoptimized + debuginfo] target(s) in 3.32s
     Running `target/debug/otel_instrumentation`

Open Jaeger in your browser, refresh the page, and you’ll see our service trace-demo show up in the Service dropdown.

Select trace-demo in Service and click on Find Traces. You will see a trace with the name trace-demo: operation show up in the list.

This trace does not provide much information right now. It’s up to you to enrich the trace’s spans with additional metadata relevant to the specific operations your program performs.

Summary

To collect telemetry data from Rust applications in production, we’re using the OpenTelemetry project. It provides building blocks such as a Rust SDK and exporters for backends like Jaeger, which we used here to store and inspect traces.
Once we start generating span data within our applications, we get better insight into runtime operations.

Find the example source code on GitHub.