In a previous blog post, I talked about using OpenTelemetry to gather telemetry data from your production applications. Having this data available can be the difference between a few minutes of trace data analysis and a few hours of debugging.
If you’d like to start instrumenting your applications but want to take it one step at a time, why don’t you start with the tracing crate? The tracing crate allows you to collect structured event data from your application, even across thread boundaries. Using it in your code is the first step towards achieving Observability.
What is Observability?
Observability allows you to ask arbitrary questions about a system based on its output data. Observability is a superset of monitoring, as it expands the focus beyond metrics and logs. Metrics and logs work well for predefined use cases, whereas observability extends that horizon to unknown error scenarios by combining metrics, logs, and trace data.
Why is Observability critical?
In today’s world, many engineering teams are shifting to a microservice-based architecture. The reasons for that are clear: it’s easier to fix a system when one of 20 services fails than when a large monolith does. It’s analogous to a car’s engine. An engine consists of several parts, and if one of them fails, you only need to replace that one, or perhaps a few, instead of the entire engine.
But how do you know when it’s time to replace the spark plugs or the battery? Today’s cars have many sensors to determine if a part is healthy or not. The onboard computer will let you know if a part needs to be replaced.
The same is true for our modern and cloud-native web applications. Observability helps us in various ways. For one, we learn early on when a service needs attention because it handles too much traffic or fails too often.
On the other hand, we can also troubleshoot unforeseen issues more effectively. Observability allows us to ask questions based on the system’s output data. We don’t need to add extra debug output first, deploy, and hope the issue comes up again. Instead, we can troubleshoot with the data we already have, because our services are instrumented and sending telemetry.
How do I instrument my service?
There are different ways to instrument a specific service. A good starting point would be adding log output to your code, generating structured logs at runtime. Logs are one of the most commonly used data signals for troubleshooting.
Many applications also report metrics. Metrics consist of aggregated data points such as “Number of customers per hour” or “Number of 4xx errors served”.
Last but not least, there are traces. Traces paint a picture of transactions across service boundaries. For instance, during checkout, you see how the API service calls the payments, then the logistics, and finally the notifications service. Traces record these interactions, including metadata such as payloads.
For Rust applications, there are several options when it comes to instrumentation. Today, we’ll also be using the tracing crate to generate trace data, but in a less intrusive way.
What does the tracing crate do?
The tracing crate is part of the Tokio project and provides convenient ways to collect telemetry data from within our application: macros that generate structured log events, as well as an attribute macro that records function parameters.
Getting started with tracing
In your application’s Cargo.toml, add these two crates:
[dependencies]
tracing = "0.1.29"
tracing-subscriber = "0.2.0"
tracing and tracing-subscriber will allow us to generate structured telemetry data locally, without sending it anywhere.
Once you’ve installed the crates, let’s get into instrumenting our code. The examples below are from tracing’s README file:
use tracing::{info, Level};
use tracing_subscriber::FmtSubscriber;

mod yak_shave;

fn main() {
    // a builder for `FmtSubscriber`.
    let subscriber = FmtSubscriber::builder() //(1)
        // TRACE is the most verbose level, so all spans/events
        // (trace, debug, info, warn, error) will be written to stdout.
        .with_max_level(Level::TRACE)
        // completes the builder.
        .finish();

    tracing::subscriber::set_global_default(subscriber) //(2)
        .expect("setting default subscriber failed");

    let number_of_yaks = 3;
    // this creates a new event, outside of any spans.
    info!(number_of_yaks, "preparing to shave yaks"); //(3)

    let number_shaved = yak_shave::shave_all(number_of_yaks);
    info!(
        all_yaks_shaved = number_shaved == number_of_yaks,
        "yak shaving completed."
    ); //(4)
}
We start by building a new subscriber (1). This subscriber will receive all our outputs and print them to the console. Next, we configure the subscriber as the default target for telemetry data (2). These two steps complete our setup. We’re now ready to generate data, as in (3) and (4), using the info! macro to create a new event.
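By the way, if you prefer a more compact setup, tracing-subscriber also exposes a tracing_subscriber::fmt() builder that configures and installs the subscriber in one go. A minimal sketch that should be roughly equivalent to the snippet above:
use tracing::{info, Level};

fn main() {
    // configure a formatting subscriber and install it as the global default
    tracing_subscriber::fmt()
        .with_max_level(Level::TRACE)
        .init();

    info!("subscriber is set up");
}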
Inspecting the output on the console
Let’s try out our code and inspect the output.
$ cargo run
Finished dev [unoptimized + debuginfo] target(s) in 0.02s
Running `target/debug/tokio-tracing-demo`
Dec 11 15:14:07.166 INFO tokio_tracing_demo: preparing to shave yaks number_of_yaks=3
Dec 11 15:14:07.167 INFO tokio_tracing_demo: yak shaving completed. all_yaks_shaved=true
Once we run our code, we see that it starts to produce log output, featuring the log message (such as preparing to shave yaks) and additional parameters (like number_of_yaks=3).
We’re currently only using the info! macro in main. What would it look like if we instrumented more functions?
Using the instrument attribute
If we’re building a library, or we’d like to get information about called functions (including their arguments), that’s where the instrument attribute macro comes into play. Part of our demo application is a separate module called yak_shave. Currently, the module holds a placeholder implementation for shaving yaks:
// src/yak_shave.rs
pub(crate) fn shave_all(number_of_yaks: i32) -> i32 {
    number_of_yaks
}
It receives an i32 and returns the same number. We want to instrument this function so that we record whenever it is called, and with which arguments.
As a first step, we’re annotating shave_all with the instrument macro:
// src/yak_shave.rs
#[tracing::instrument]
pub(crate) fn shave_all(number_of_yaks: i32) -> i32 {
    number_of_yaks
}
You won’t see additional log output when you run your code now. Behind the scenes, instrument creates a new span. A span holds the events generated within it over time, such as those emitted by calling info!. We don’t see additional log output yet because we haven’t emitted any events inside the function.
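To make the relationship between spans and events a bit more concrete, here is roughly what the attribute does for us, sketched by hand with the span! macro (not the exact generated code; the function name is made up for illustration):
use tracing::{info, span, Level};

pub(crate) fn shave_all_by_hand(number_of_yaks: i32) -> i32 {
    // create a span that records the function argument as a field ...
    let span = span!(Level::INFO, "shave_all", number_of_yaks);
    // ... and keep it entered for the rest of the function body
    let _enter = span.enter();

    // any event emitted here is attached to the `shave_all` span
    info!("shaving inside the span");
    number_of_yaks
}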
Let’s enhance shave_all to generate meaningful log output:
// src/yak_shave.rs
use tracing::info;
#[tracing::instrument]
pub(crate) fn shave_all(number_of_yaks: i32) -> i32 {
    for yak_index in 0..number_of_yaks {
        info!(current_yak = yak_index + 1, "Shaving in progress");
    }
    number_of_yaks
}
We added a for loop to shave all the provided yaks. While we’re doing that, we’re emitting an event using info!:
- "Shaving in progress" is our status message
- current_yak=yak_index+1 indicates which yak we’re currently shaving
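As an aside, field values aren’t limited to numbers: prefixing a value with ? records it via its Debug implementation, while % uses Display. A small sketch, with a Yak struct made up for illustration:
use tracing::info;

#[derive(Debug)]
struct Yak {
    name: String,
    shaved: bool,
}

fn inspect(yak: &Yak) {
    // `?yak` records the whole struct via Debug, `%yak.name` records the name via Display
    info!(yak = ?yak, name = %yak.name, "inspecting a yak");
}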
Let’s run the program again and have a look at the output:
$ cargo run
Compiling tokio-tracing-demo v0.1.0 (projects/tokio-tracing-demo)
Finished dev [unoptimized + debuginfo] target(s) in 0.55s
Running `target/debug/tokio-tracing-demo`
Dec 11 15:32:18.000 INFO tokio_tracing_demo: preparing to shave yaks number_of_yaks=3
Dec 11 15:32:18.001 INFO shave_all{number_of_yaks=3}: tokio_tracing_demo::yak_shave: Shaving in progress current_yak=1
Dec 11 15:32:18.001 INFO shave_all{number_of_yaks=3}: tokio_tracing_demo::yak_shave: Shaving in progress current_yak=2
Dec 11 15:32:18.001 INFO shave_all{number_of_yaks=3}: tokio_tracing_demo::yak_shave: Shaving in progress current_yak=3
Dec 11 15:32:18.001 INFO tokio_tracing_demo: yak shaving completed. all_yaks_shaved=true
Besides "preparing to shave yaks" and "yak shaving completed.", we now have three additional lines belonging to the span shave_all. They also contain additional context, namely the number of yaks (number_of_yaks=3). Also note that the first and last messages do not belong to a specific span.
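The span fields come straight from the function arguments, which isn’t always what you want. If an argument is large, sensitive, or doesn’t implement Debug, you can exclude it with skip and record hand-picked values via fields instead. A hedged sketch; the config parameter and ShaveConfig type are made up for illustration:
use tracing::info;

pub(crate) struct ShaveConfig {
    pub dry_run: bool,
}

// `skip(config)` keeps the whole struct out of the span fields,
// while `fields(...)` records only the value we care about
#[tracing::instrument(skip(config), fields(dry_run = config.dry_run))]
pub(crate) fn shave_all_with_config(config: &ShaveConfig, number_of_yaks: i32) -> i32 {
    info!("shaving with a custom set of span fields");
    number_of_yaks
}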
Conclusion
Introducing the tracing crate to your application is a great way to better understand its behavior in production. Writing structured logs to the console is a fantastic starting point.
Once you’re ready to take the next step, check out the follow-up blog post on shipping your traces to Jaeger.
You can find the source code for this blog post here. Also follow me on Twitter for more Rust content.