Title: Lightweight data monitoring using RRDtool
Author: Solène
Date: 16 February 2023
Tags: monitoring nocloud
Description: In this article, I will introduce you to RRDtool, a robust
piece of software that keeps track of data and renders graphs from it
# Introduction
I like my servers to run as little code and as few services as
possible; this eases maintenance and leaves room for other things to
run. I recently wrote about monitoring software that gathers metrics
and renders them, but those tools are all overkill if you just want to
keep track of a single value over time and graph it.
Fortunately, there is an old and robust tool that does the job well.
It is thoroughly documented, and it's called RRDtool.
RRDtool stands for "Round Robin Database Tool". It's a set of programs
and a specific file format for gathering metrics. The trick with RRD
files is that they have a fixed size: when you create one, you define
how many values it stores, at which frequency, and for how long. This
can't easily be changed after the file is created.
In addition, RRD files let you define derived time series that keep
track of computed values over a longer timespan, at a lower
resolution. Consider the following use case: you want to monitor your
home temperature every 10 minutes for the past 48 hours, but you also
want to keep some information for the past year. You can tell RRD to
compute the average temperature per hour for a week, the average per
four hours for a month, and the average per day for a year. All of
this is stored in a fixed-size file.
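As a rough sketch, the temperature use case above could translate into the following command. The file name `temperature.rrd`, the data source name, the 20-minute heartbeat and the unbounded `U:U` range are assumptions made for the illustration, and a month is rounded to 30 days.

```shell
step=600                         # one measurement every 10 minutes
per_hour=$(( 3600 / step ))      # 6 measurements per hour

raw_rows=$(( 48 * per_hour ))    # every measurement for 48 hours
hourly_rows=$(( 7 * 24 ))        # hourly averages for a week
four_hour_rows=$(( 30 * 6 ))     # 4-hour averages for a 30-day month
daily_rows=365                   # daily averages for a year

# Each RRA reads as AVERAGE:xff:measurements-per-row:rows.
rrdtool create temperature.rrd --step "$step" \
    DS:temperature:GAUGE:1200:U:U \
    RRA:AVERAGE:0.5:1:"$raw_rows" \
    RRA:AVERAGE:0.5:"$per_hour":"$hourly_rows" \
    RRA:AVERAGE:0.5:$(( 4 * per_hour )):"$four_hour_rows" \
    RRA:AVERAGE:0.5:$(( 24 * per_hour )):"$daily_rows"
```

The four archives add up to 288 + 168 + 180 + 365 = 1001 rows, and that count never grows, whatever the age of the file.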
# Anatomy of an RRD file
RRD files can be dumped as XML, which gives you a glimpse that may
help in understanding this special file format.
Let's create a file to monitor the battery level of your computer every
10 seconds, keeping the last 5 values. Don't focus on understanding
the whole command line for now:
```shell
rrdtool create test.rrd --step 10 DS:battery:GAUGE:20:0:100 RRA:AVERAGE:0.5:1:5
```
If we dump the created file with `rrdtool dump test.rrd`, we get this
result (stripped a bit to make it fit better):
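```xml
<rrd>
  <version>0003</version>
  <step>10</step>
  <lastupdate>1676569107</lastupdate>
  <ds>
    <name>battery</name>
    <type>GAUGE</type>
    <minimal_heartbeat>20</minimal_heartbeat>
    <min>0.0000000000e+00</min>
    <max>1.0000000000e+02</max>
    <last_ds>U</last_ds>
    <value>NaN</value>
    <unknown_sec>7</unknown_sec>
  </ds>
  <rra>
    <cf>AVERAGE</cf>
    <pdp_per_row>1</pdp_per_row>
    <params>
      <xff>5.0000000000e-01</xff>
    </params>
    <cdp_prep>
      <ds>
        <primary_value>0.0000000000e+00</primary_value>
        <secondary_value>0.0000000000e+00</secondary_value>
        <value>NaN</value>
        <unknown_datapoints>0</unknown_datapoints>
      </ds>
    </cdp_prep>
    <database>
      <row><v>NaN</v></row>
      <row><v>NaN</v></row>
      <row><v>NaN</v></row>
      <row><v>NaN</v></row>
      <row><v>NaN</v></row>
    </database>
  </rra>
</rrd>
```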
The most important thing to understand here is that we have a "ds"
(data source) named battery of type GAUGE with no last value (I never
updated it), and an "RRA" (Round Robin Archive) for our average value,
whose rows each map to a timestamp but have no value yet. You can see
that internally the 5 slots already exist, each holding an empty (NaN)
value. If I update the file, the oldest empty slot will disappear, and
a new record holding the actual value will be added at the end.
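To watch this happen, here is a small sketch that uses a separate, hypothetical `demo.rrd` file with the same layout. It passes explicit timestamps (close to the one in the dump above) instead of the current time, so the result does not depend on when it is run; the battery levels are made up.

```shell
start=1676569100                 # arbitrary start time, aligned on the 10 s step
step=10

rrdtool create demo.rrd --start "$start" --step "$step" \
    DS:battery:GAUGE:20:0:100 RRA:AVERAGE:0.5:1:5

# Feed three made-up battery levels, one per step (timestamp:value).
t1=$(( start + step ))
t2=$(( start + 2 * step ))
t3=$(( start + 3 * step ))
rrdtool update demo.rrd "$t1:87" "$t2:86" "$t3:85"

# Dump the file again: three of the five rows should now hold a value.
rrdtool dump demo.rrd
```

When updating from a real script, `N` can be used in place of the timestamp to mean "now".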
# Monitoring a value
In this guide, I would like to share my experience of using rrdtool to
monitor my solar panel power output over the last few hours, so it can
be easily displayed on my local dashboard. The data is also collected
and sent to a Grafana server, but that server is not local, and
querying it just to see the latest values wastes resources and
bandwidth.
First, you need `rrdtool` installed; you don't need anything else to
work with RRD files.
## Create the RRD file
Creating the RRD file is the trickiest part, because you can't easily
change it afterward.
I want to collect a value every 5 minutes (300 seconds). It's an
absolute value between 0 and 4000, so we define a step of 300 seconds
to tell the file it must receive a value every 300 seconds. The type
of the value is GAUGE, because it's just a value that doesn't depend
on the previous one. If we were monitoring how the power changes over
time, we would use DERIVE, because it computes the difference between
consecutive values (stored as a per-second rate).
Furthermore, we configure the file to give up on a slot and mark it as
unknown if it's not updated within 600 seconds; this is the heartbeat.
Finally, we want to be able to graph each measurement. This is done by
adding an AVERAGE archive to the file, with a resolution of 1 value
and 240 measurements stored. This means that each time we add a value
to the RRD file, the AVERAGE field is calculated with only that last
value as input, and we keep 240 of them, allowing us to graph up to
240 * 5 minutes (20 hours) of data back in time.
```shell
rrdtool create solar-power.rrd --step 300 DS:value:GAUGE:600:0:4000 RRA:AVERAGE:0.5:1:240

# DS:value:GAUGE:600:0:4000
#   value    variable name
#   GAUGE    measurement type
#   600      time before null (heartbeat, in seconds)
#   0        min value
#   4000     max value
#
# RRA:AVERAGE:0.5:1:240
#   AVERAGE  function to apply, can be AVERAGE, MAX, MIN or LAST, among others
#   0.5      (xfiles factor) fraction of unknown values we accept when calculating a value
#   1        how many previous values are used in the function, 1 means a single value, so averaging itself
#   240      number of values to keep
```
And then, you have your `solar-power.rrd` file created. You can
inspect it with `rrdtool info solar-power.rrd` or dump its content with
`rrdtool dump solar-power.rrd`.
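The file then needs to be fed on a regular basis. Here is a minimal sketch of an update script, assuming a hypothetical `get_solar_power` function that prints the current output as a number; replace it with whatever reads your own equipment. Running it every 5 minutes, from cron for instance, matches the step chosen above.

```shell
#!/bin/sh
# Hypothetical reader: stands in for whatever returns the current power output.
get_solar_power() {
    echo 1234
}

# Create the file on first run only, with the definition explained above.
[ -f solar-power.rrd ] || rrdtool create solar-power.rrd --step 300 \
    DS:value:GAUGE:600:0:4000 RRA:AVERAGE:0.5:1:240

# N stands for "now": the value is recorded with the current time.
value=$(get_solar_power)
rrdtool update solar-power.rrd "N:${value}"
```

`rrdtool lastupdate solar-power.rrd` prints the last value recorded, which is handy to check that the script works.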
## Graph the content of the RRD file
The trickiest part, though a less risky one, is generating a usable
graph from the data. The operation is not destructive, as it doesn't
modify the file, so we can experiment freely without affecting the
content.
We will generate something simple like the picture below. Of course,
you can add a lot more information, colors, axes, legends, etc., but I
need my dashboard to stay simple and clean.
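A minimal sketch of a graph command, assuming the `solar-power.rrd` file created earlier already exists in the current directory. The output file name, size, color, title and the "Watts" label are illustrative choices.

```shell
# The archive holds 240 values of 5 minutes each, so 20 hours of data.
hours=$(( 240 * 300 / 3600 ))

# DEF loads the "value" data source from the AVERAGE archive under the name "power".
# LINE2 draws it as a 2-pixel-wide line with a legend.
rrdtool graph solar-power.png \
    --start "-${hours}h" --end now \
    --width 600 --height 200 \
    --title "Solar power" \
    --vertical-label "Watts" \
    DEF:power=solar-power.rrd:value:AVERAGE \
    LINE2:power#ff8800:"Power output"
```

Since the command only reads the RRD file, it can be re-run with different options as many times as needed.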
# Conclusion
RRDtool is very nice. It's the storage engine behind monitoring
software such as collectd or munin, but RRD files can also be used on
the spot with simple scripts. However, they have drawbacks: with many
files it doesn't scale well, generating a lot of I/O and consuming CPU
if you need to render hundreds of pictures. That's why a daemon named
`rrdcached` was created; it mitigates the load by batching updates to
many RRD files and writing them in a more sequential way.
# Going further
I encourage you to look at the official project website. All the other
commands can be very useful, and rrdtool can also export data as XML or
JSON if needed, which is perfect for plugging into other software.
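As an illustration of the JSON export, here is a short sketch using `rrdtool xport`. It assumes the `solar-power.rrd` file from this article exists; the output file name and the legend are arbitrary.

```shell
out=solar-power.json

# Export the last 20 hours of the AVERAGE archive as JSON instead of XML.
rrdtool xport --json --start -20h --end now \
    DEF:power=solar-power.rrd:value:AVERAGE \
    XPORT:power:"Power output" > "$out"
```

Dropping `--json` produces the same data as XML.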