Though Wavefront has a remarkable ability to relentlessly consume huge amounts of metrics from agents like Telegraf or CollectD, it is sometimes useful to be able to send data in a more ad-hoc way. The Wavefront CLI can be useful for that.
This article looks at data ingestion from the command line. All the examples
work with version 2.16.2 of the wavefront-cli gem, going into a 2018-46-44
cluster, via a 4.29 proxy.
Single, Arbitrary Points
Sometimes you want to poke just the odd point into Wavefront. The write point
sub-command does just that. Here’s the syntax, with the write-specific options.
wf write point [-DnViq] [-c file] [-P profile] [-E proxy] [-t time]
         [-p port] [-H host] [-T tag...] [-u method] [-S socket] <metric>
         <value>
Options:
  -E, --proxy=URI            proxy endpoint
  -t, --time=TIME            time of data point (omit to use current
                             time)
  -H, --host=STRING          source host
  -p, --port=INT             Wavefront proxy port
  -T, --tag=TAG              point tag in key=value form
  -F, --infileformat=STRING  format of input file or stdin
  -m, --metric=STRING        the metric path to which contents of a file
                             will be assigned. If the file contains a metric
                             name, the two will be dot-concatenated, with
                             this value first
  -i, --delta                increment metric by given value
  -I, --interval=INTERVAL    interval of distribution (default 'm')
  -u, --using=METHOD         method by which to send points
  -S, --socket=FILE          Unix datagram socket
  -q, --quiet                don't report the points sent summary (unless
                             there were errors)
Pretty simple, I hope.
$ wf write point dev.cli.example 10
sent 1
rejected 0
unsent 0
$ wf write -u api point dev.cli.example 20
sent 1
rejected 0
unsent 0
$ wf query --start=-1m "ts(dev.cli.example)"
name ts(dev.cli.example)
query ts(dev.cli.example)
timeseries
label dev.cli.example
sparkline > <
host box
data 2019-02-13 15:51:18 20.0
---------------------------------------------------------------
label dev.cli.example
sparkline > <
host box
tags
env lab
data 2019-02-13 15:51:13 10.0
Note that the query results are in two sections, even though we sent two values
on the same metric path. The directly ingested point (value 20) has no tags,
but the one we sent via the proxy is tagged with env=lab. This is because my
lab proxy has a preprocessor rule which tags everything going through it, and it
shows that sending points directly, though useful in many cases, is not a
straight substitute for using a proxy.
Most obviously, API calls are far more expensive. Sending a metric via -u api
takes about a second for me, because that metric’s got to get from England to
us-west-1, over HTTPS. Sending to a proxy listening on a Unix socket on the
same subnet is pretty much instantaneous. (The proxy batches points and
compresses the bundle before sending to your cluster, so the actual delay may be
longer, though it will “feel” faster to your client.)
The Wavefront proxy is not a thing you want to avoid. Even a modestly sized one can handle insane amounts of metrics, and it gives you great reliability through its buffering and retrying, and efficiency through batching and compressing of the data it sends. It lets you manipulate, mangle, or block points based on sophisticated rules. It can extract metrics from log files, tag things on the fly, understand all kinds of different formats. And it generates rich metrics of everything it does, which can be very useful for debugging and tuning.
That’s not to say direct ingestion is without value. For instance, when we made an IoT biscuit tin, we wanted its metrics to go to Wavefront. That turned out to be a huge pain, because all our proxies, and indeed, all our hosts were in AWS, and our biscuit tin was in the office. Direct ingestion – which didn’t exist at the time – would have been perfect for a little job like that. We’ve also, at times, wanted to put a small amount of data from Lambda functions into Wavefront, but the Lambdas were running in VPCs without proxy access. We could peer, or stand up proxies, but direct ingestion would have been easier all round. Now direct ingestion is available, we’re starting to use it all over the place.
Note that we didn’t specify a timestamp for the point, so the CLI assumed “now”.
When you set a timestamp, as in all wf subcommands, you can use epoch seconds
or anything naturally parseable by Ruby’s strptime() method.
$ wf write point -t 14:20:33 dev.cli.example 98.76
sent 1
rejected 0
unsent 0
If you find yourself wondering whether, or how, wf will parse a time you
enter, open up irb and find out.
$ irb -r time
irb(main):003:0> Time.parse('12:00')
=> 2018-10-22 12:00:00 +0100
irb(main):004:0> Time.parse('13/03/2016')
=> 2016-03-13 00:00:00 +0000
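If you want the same check in a script rather than in irb, here is a minimal sketch of the idea. It assumes that an all-digit argument is taken as epoch seconds and that anything else goes through Time.parse, as in the irb session above. The to_epoch helper is a name I made up for illustration; it is not the CLI's own code.

```ruby
require 'time'

# Hypothetical helper (not part of wavefront-cli): turn a -t style
# argument into epoch seconds.
def to_epoch(arg)
  # A string made only of digits is assumed to be epoch seconds already.
  return arg.to_i if arg.match?(/\A\d+\z/)

  # Anything else is handed to Time.parse, which raises ArgumentError
  # if it cannot make sense of the input.
  Time.parse(arg).to_i
end

puts to_epoch('1550067633')              # prints 1550067633
puts to_epoch('2019-02-13 14:20:33 UTC') # prints 1550067633
```

Without an explicit zone, Time.parse uses the local timezone of the machine running it, which is why the irb output above shows +0100 for one date and +0000 for another.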
Note that when you send points, you get a summary of how many were sent,
rejected, or unsent. Depending on your viewpoint this is useful and reassuring,
or irritating, so you have the option to make the write command quiet with
-q. If anything goes wrong, even with -q specified, wf will exit nonzero
and print the summary anyway.
If you are not irritated by summaries, and demand EVEN MORE verbosity when
writing points, you’re in luck. --verbose (AKA -V) will make wf print out
every point it sends in native Wavefront wire format. (Or as an HTTP POST if
you’re going over the API.) You can combine it with -q to only get the detail.
$ wf write point -t 14:20:33 dev.cli.example 98.76 --verbose
SDK INFO: dev.cli.example 98.76 1550067633 source=box
sent 1
rejected 0
unsent 0
$ wf write point -t 14:21:33 dev.cli.example 54.321 -Vq
SDK INFO: dev.cli.example 54.321 1550067693 source=box
$ wf write point -t 14:20:33 -u api dev.cli.example 98.76 --verbose
SDK INFO: dev.cli.example 98.76 1550067633 source=box
SDK INFO: uri: POST https://metrics.wavefront.com/report
SDK INFO: body: dev.cli.example 98.76 1550067633 source=box
sent 1
rejected 0
unsent 0
$ wf write point -u api -t 14:21:33 dev.cli.example 54.321 -Vq
SDK INFO: dev.cli.example 54.321 1550067693 source=box
SDK INFO: uri: POST https://metrics.wavefront.com/report
SDK INFO: body: dev.cli.example 54.321 1550067693 source=box
There’s --debug too, but that will take you into the innards of wf.
Hopefully everything will work all the time and you’ll never need it.
You can write points with tags, using -T. Multiples are allowed.
$ wf write point -q -t 14:25 -T cmd=wf -T subcmd=write dev.cli.example 99.999
If I don’t specify a source (or “host”), the CLI will use what it thinks is the
hostname of my machine. Up to now that’s been box.
$ wf write point -q -H made-up-host dev.cli.example 99
Here are the points we just sent. Hover over them and you’ll see the tags. (If you can’t see the chart, you’ll have to enable third-party cookies for this page, because the embedded graphs use Typekit.)
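The lines that -V prints are simple enough to build by hand, which is handy if you want to generate points from your own code. This is a rough sketch rather than the SDK's implementation: to_wire is a hypothetical helper, and it assumes the usual Wavefront layout of metric path, value, optional epoch timestamp, source, then point tags with quoted values.

```ruby
# Hypothetical helper: assemble one point in Wavefront wire format.
# Layout assumed: <path> <value> [<timestamp>] source=<source> [key="value" ...]
def to_wire(path:, value:, source:, ts: nil, tags: {})
  fields = [path, value.to_f, ts, "source=#{source}"]
  # Tag values are wrapped in double quotes.
  fields += tags.map { |key, val| "#{key}=\"#{val}\"" }
  # compact drops the timestamp when none was given.
  fields.compact.join(' ')
end

puts to_wire(path: 'dev.cli.example', value: 98.76, ts: 1550067633, source: 'box')
# prints: dev.cli.example 98.76 1550067633 source=box

puts to_wire(path: 'dev.cli.example', value: 99.999, source: 'box',
             tags: { cmd: 'wf', subcmd: 'write' })
# prints: dev.cli.example 99.999 source=box cmd="wf" subcmd="write"
```

Leaving the timestamp out means the receiving end stamps the point on arrival.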
I did not specify a proxy endpoint or port in any of the above examples. The
write command respects the .wavefront config file, so I have my proxy stowed
away in there:
$ grep -v token ~/.wavefront
[default]
endpoint = metrics.wavefront.com
format = human
proxy = wavefront.localnet
write -u api uses the endpoint and token just like any other wf command.
api isn’t the only thing you can -u. As of late 2018, proxies accept HTTP
POSTed points, on the same port the socket uses. This is simple HTTP - there’s
no authentication or authorization yet, but it works, and the CLI supports it.
$ wf write point dev.cli.example 123 --verbose --using http
SDK INFO: dev.cli.example 123.0 source=box
SDK INFO: uri: POST http://wavefront:2878/
SDK INFO: body: dev.cli.example 123.0 source=box
sent 1
rejected 0
unsent 0
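Judging by the verbose output, the http method is just a plain POST with the wire-format point as the body. Here is a hedged sketch of doing the same with Ruby's standard library. The host and port come from the output above; build_request and send_point are names I made up, and whether your proxy wants anything more (a trailing newline, particular headers) is something I have not verified.

```ruby
require 'net/http'
require 'uri'

# Hypothetical helper: build (but do not send) a POST whose body is one
# wire-format point.
def build_request(proxy_uri, line)
  request = Net::HTTP::Post.new(proxy_uri)
  request.body = line
  request
end

# Hypothetical helper: actually send the point. This needs a reachable
# proxy, so it is defined here but not called.
def send_point(proxy_uri, line)
  Net::HTTP.start(proxy_uri.host, proxy_uri.port) do |http|
    http.request(build_request(proxy_uri, line))
  end
end

uri = URI('http://wavefront:2878/')
request = build_request(uri, 'dev.cli.example 123.0 source=box')
puts "#{request.method} #{uri}" # prints: POST http://wavefront:2878/
puts request.body               # prints: dev.cli.example 123.0 source=box
```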
The final (for now) -u mechanism is unix. This writes points to a Unix
socket. I added this because I wanted a very fast local mechanism to write
points, which could handle a proxy temporarily being unavailable. I was already
using Telegraf, so I added a socket listener by putting this in telegraf.conf.
[[inputs.socket_listener]]
service_address = "unix:///tmp/telegraf.sock"
data_format = "wavefront"
Then I could write points straight to the socket. (I do this from inside Ruby,
using the SDK, but once it was in the SDK I thought I might as well put a CLI
front-end on it.) You have to specify the path to the socket with -S (or
--socket). No other credentials are required, and everything looks exactly the
same as the other methods.
$ wf write point -u unix -S /tmp/telegraf.sock dev.cli.socket 1 --verbose
SDK INFO: dev.cli.socket 1.0 source=www-blue
sent 1
rejected 0
unsent 0
As other transport mechanisms appear, they will be supported by the CLI.
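For the curious, writing to a Unix socket from Ruby takes very little code. This is a minimal sketch, not the SDK's implementation, and it assumes a stream socket, which is what the unix:// address in the Telegraf config above gives you. write_to_socket is a hypothetical helper name.

```ruby
require 'socket'

# Hypothetical helper: write one wire-format point to a Unix stream socket.
# puts appends the newline that terminates the point.
def write_to_socket(socket_path, line)
  UNIXSocket.open(socket_path) { |sock| sock.puts(line) }
end

# Usage, assuming something is listening on the socket:
#   write_to_socket('/tmp/telegraf.sock', 'dev.cli.socket 1.0 source=www-blue')
```

If nothing is listening, UNIXSocket.open raises an Errno error (ENOENT or ECONNREFUSED), so a caller that wants to ride out a restart needs its own rescue and retry.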
Multiple Points, From a File
Writing one point at a time is fine, and may well be just what you need, but
it’s more likely that you want to push in a batch of points. Enter write file.
wf write file [-DnViq] [-c file] [-P profile] [-E proxy] [-H host]
[-p port] [-F infileformat] [-m metric] [-T tag...]
[-u method] [-S socket] <file>
A while ago, I needed to push retrospective data into Wavefront, and had to hack together some Ruby to generate and push the points. Now I could use the CLI.
Here’s an example file.
$ cat file1
1744075043 dev.cli.file1 144
1744075167 dev.cli.file1 185
1744075253 dev.cli.file1 157
1744075350 dev.cli.file1 129
1744075384 dev.cli.file1 48
1744075540 dev.cli.file1 67
1744075549 dev.cli.file1 172
Clearly the three fields are epoch timestamp, metric path, and value. I can load
in that file with the following ‘write file’ command. Supplying -V will show
me the points in Wavefront wire format, as they go in.
$ wf write file -V -F tmv file1
SDK INFO: uri: POST http://wavefront:2878/
SDK INFO: body: dev.cli.file1 144.0 1744075043 source=slim
SDK INFO: uri: POST http://wavefront:2878/
SDK INFO: body: dev.cli.file1 185.0 1744075167 source=slim
SDK INFO: uri: POST http://wavefront:2878/
SDK INFO: body: dev.cli.file1 157.0 1744075253 source=slim
SDK INFO: uri: POST http://wavefront:2878/
SDK INFO: body: dev.cli.file1 129.0 1744075350 source=slim
SDK INFO: uri: POST http://wavefront:2878/
SDK INFO: body: dev.cli.file1 48.0 1744075384 source=slim
SDK INFO: uri: POST http://wavefront:2878/
SDK INFO: body: dev.cli.file1 67.0 1744075540 source=slim
SDK INFO: uri: POST http://wavefront:2878/
SDK INFO: body: dev.cli.file1 172.0 1744075549 source=slim
sent 7
rejected 0
unsent 0
And here’s the chart. Hover over the points and you’ll see the values from the file.
The key part of the wf write file command is the -F option. This lets the
user describe the format of the file they wish wf to parse. t stands for
timestamp; m for metric, and v for value. So, tmv describes the format of
file1.
The v column is mandatory, but the time and metric path can be set in other
ways. For instance, the -m option allows you to define a metric path which
will be applied to all data points in the file. So, the following file and
command would be an identical data load to the example above.
$ cat file1
1744075043 144
1744075167 185
1744075253 157
1744075350 129
1744075384 48
1744075540 67
1744075549 172
$ wf write file -F tv -m dev.cli.file1 file1
sent 7
rejected 0
unsent 0
You can also use -m to set a metric prefix, and have the final portion of the
metric in your file. If you do that, the two parts will be concatenated. I’ll
show you that later.
If you wish, you can even add point tags to a data load. For fine-grained
control, put them at the end of each line to which they apply. To tag everything
uniformly, use the -T key=val option. If you do both, you get both sets of
tags. Tags have to be at the end of the line because there can be arbitrarily
many for each data point, and the number may not be constant.
All this, of course, works exactly the same for any ingestion method.
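To make the -F rules concrete, here is a rough sketch of how a line could be picked apart using a format string. It is my own illustration, not the CLI's parser: parse_line is a hypothetical name, it only knows the t, m and v fields described above, and it treats any trailing key=value fields as point tags.

```ruby
# Hypothetical helper: split one input line according to a format string
# such as 'tmv', optionally prepending a metric prefix (the job of -m).
def parse_line(line, format, prefix: nil)
  fields = line.split
  keys = format.chars
  raise ArgumentError, "too few fields in #{line.inspect}" if fields.size < keys.size

  point = { tags: {} }
  keys.zip(fields).each do |key, field|
    case key
    when 't' then point[:ts] = Integer(field)
    when 'm' then point[:path] = field
    when 'v' then point[:value] = Float(field)
    else raise ArgumentError, "unknown format character #{key.inspect}"
    end
  end

  # Anything after the described fields is assumed to be a key=value tag.
  fields.drop(keys.size).each do |tag|
    key, val = tag.split('=', 2)
    raise ArgumentError, "bad tag #{tag.inspect}" if val.nil?
    point[:tags][key] = val
  end

  # Dot-concatenate the prefix and the in-file metric name, prefix first.
  point[:path] = [prefix, point[:path]].compact.join('.')
  raise ArgumentError, 'no metric path given' if point[:path].empty?
  point
end

p parse_line('1744075043 dev.cli.file1 144', 'tmv')
p parse_line('1744075043 144', 'tv', prefix: 'dev.cli.file1')
p parse_line('1 1029 env=lab', 'mv', prefix: 'dev.cli.d1')
```

The first two calls describe the same point, which is the equivalence between the two versions of file1 shown above. The third ends up on the path dev.cli.d1.1 with an env tag.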
Multiple Points, from a Live Source
Though it’s more useful than sending a single point, I still think loading data
in from a static file is something most people would use rarely, if ever. Far
more useful to, in proper Unix style, set the input file to -, and read from
standard in.
Maybe the simplest illustration is to generate some (pseudo) random data.
(Ignoring the fact that Wavefront has a perfectly capable random() function.)
$ while true; do echo $RANDOM; sleep 1; done | wf write file -V -m dev.cli.demo -Fv -
SDK INFO: uri: POST http://wavefront:2878/
SDK INFO: body: dev.cli.demo 21397.0 source=slim
SDK INFO: uri: POST http://wavefront:2878/
SDK INFO: body: dev.cli.demo 29321.0 source=slim
SDK INFO: uri: POST http://wavefront:2878/
SDK INFO: body: dev.cli.demo 9168.0 source=slim
SDK INFO: uri: POST http://wavefront:2878/
SDK INFO: body: dev.cli.demo 26475.0 source=slim
SDK INFO: uri: POST http://wavefront:2878/
SDK INFO: body: dev.cli.demo 6167.0 source=slim
...
Producing:
That’s fine, but you’re more likely to want to plot the output of a command, so to illustrate that, here’s a little script which generates the points for a parabola. You can see it outputs pairs of numbers: the first is the abscissa, as a timestamp, and the second is the ordinate.
#!/usr/bin/env ruby
h, k, a = 25, 1000, 10
1.upto(49) do |x|
  $stdout.puts "#{Time.now.to_i} #{a * (x - h) ** 2 + k}"
  $stdout.flush
  sleep 1
end
The $stdout stuff is necessary because otherwise the script will flush all its
output when it exits, and I wanted to use wf’s -V option to watch the points
flowing through when I was testing. (wf has a --noop flag which will not
make a connection to the proxy, and will show you the points in Wavefront wire
format, in real-time.)
Anyway, run the script, and pipe its output into wf, supplying a metric path
and a description of the file format.
$ ./parabola.rb | wf write file -m dev.cli.demo -V -F tv -
Back in a previous article, I wrote some Ruby to wire DTrace into
Wavefront. Now, I can use the write file command for simple D scripts.
Revisiting intr.d, I can describe the field format I expect. The old version
of the CLI would ignore lines which don’t match the field definition, but in the
rewrite I chose to make that throw an error. So now we have to run the output
through awk to reject anything without two fields. The first field is the CPU
ID, which I want as the final part of the metric path, and the second is the
value to send (in this case, the total number of interrupts handled by that
CPU). Because I am not supplying any timestamps, wf will use the current UTC
time whenever it sends a point. -V is for verbosity.
# ./intr.d | awk 'NF == 2 { print $0 }' | wf write file -V -m dev.cli.d1 -F mv -
dtrace: script './intr.d' matched 4 probes
SDK INFO: dev.cli.d1.1 1029.0 source=cube
SDK INFO: dev.cli.d1.0 1214.0 source=cube
SDK INFO: dev.cli.d1.2 1188.0 source=cube
SDK INFO: dev.cli.d1.3 1239.0 source=cube
SDK INFO: dev.cli.d1.1 1620.0 source=cube
SDK INFO: dev.cli.d1.0 1910.0 source=cube
SDK INFO: dev.cli.d1.2 1775.0 source=cube
SDK INFO: dev.cli.d1.3 1867.0 source=cube
SDK INFO: dev.cli.d1.0 2705.0 source=cube
SDK INFO: dev.cli.d1.1 3394.0 source=cube
SDK INFO: dev.cli.d1.2 2146.0 source=cube
...
and, with the whole thing wrapped in a deriv() expression, to turn a counter
into a gauge, I see:
How about kstats? Say I’d like to see a chart of network throughput when I do an NFS copy between a couple of machines. That’s now a one-liner. (Or it would be if I didn’t have to break it because of formatting issues!) Let’s use direct ingestion, just to show that it works the same.
# while true; do kstat link:0:net0:obytes64 | grep obytes; sleep 1; \
done | wf write file -u api -V -Fmv -m dev.cli.network -
SDK INFO: dev.cli.network.obytes64 12747937388.0 source=cube
SDK INFO: uri: POST https://metrics.wavefront.com/report
SDK INFO: body: dev.cli.network.obytes64 12747937388.0 source=cube
SDK INFO: dev.cli.network.obytes64 12747937554.0 source=cube
SDK INFO: uri: POST https://metrics.wavefront.com/report
SDK INFO: body: dev.cli.network.obytes64 12747937554.0 source=cube
SDK INFO: dev.cli.network.obytes64 12747941720.0 source=cube
SDK INFO: uri: POST https://metrics.wavefront.com/report
...
That required no setting up, and nothing beyond a local installation of the
wavefront-cli gem. Now you have no excuse for not putting everything in
Wavefront!
Histograms
Wavefront now has an add-on histogram feature. For this to work you need to have a histogram-enabled endpoint. Speak to your sales person.
Histograms are a way around Wavefront’s one-second resolution limit, and a way of interpreting millions of points per second without it costing the earth. They work like a global statsd. You send points to a proxy, which buckets them all, and flushes a mathematical description of said bucket up to your cluster at a predefined interval. These intervals are every minute, hour, and day.
You must configure your proxy to allow histogram ingestion, and each of the
intervals I mentioned has its own port. By default the “minute” bucket listens
on 40001, the hourly one on 40002, and the daily on 40003. Sending metrics with
the CLI and having them bucketed in one minute intervals works exactly as I
described above, but pop -p 40001 in the command. Watch.
$ while true
> do
> wf write point -qV -p 40001 demo.cli.histogram_1 $RANDOM
> sleep 0.1
> done
SDK INFO: demo.cli.histogram_1 1028.0 source=box
SDK INFO: demo.cli.histogram_1 11952.0 source=box
SDK INFO: demo.cli.histogram_1 12442.0 source=box
SDK INFO: demo.cli.histogram_1 26243.0 source=box
SDK INFO: demo.cli.histogram_1 17687.0 source=box
...
produces:
Once the results are in Wavefront, you can view them with an hs() (as opposed
to ts()) expression, and apply various statistical functions. The chart above
uses max(), median(), and min(), and uses percentile() to show the 95th
percentile. As this analysis is performed on data from all hosts, it’s a true
95th percentile, not an average view of the 95th percentile from each host.
There is another way of writing histogram data to Wavefront, which is to use a
“distribution”. A distribution assigns multiple values to a single metric over a
given time range. So, if you were recording web server response codes, and had
150 “200”s and 6 “404”s in a minute, you could send a distribution which looked
like #150 200 #6 404.
The CLI lets you send distributions just like normal points, using
write distribution.
wf write distribution [-DnViq] [-c file] [-P profile] [-E proxy]
[-H host] [-p port] [-T tag...] [-u method] [-S socket] [-I interval]
<metric> <val>...
Wavefront describes distributions in the way I just showed you, with #a b
where a is the number of times b occurred during the time range. To save you
the trouble of counting your individual values, the CLI lets you describe a
distribution “in the raw”.
$ wf write distribution -V demo.dist 3 1 4 1 1 2 3 6 4 1 3 2
SDK INFO: !M 1539780323 #3 3.0 #4 1.0 #2 4.0 #2 2.0 #1 6.0 demo.dist source=box
sent 1
rejected 0
unsent 0
But if you have gone to the trouble of counting up the values, it would be rude of me to expect you to break them up again. So this will work too.
$ wf write distribution -Vq test.dist 3x3 4x1 2x4 2x2 1x6
SDK INFO: !M 1539781868 #3 3.0 #4 1.0 #2 4.0 #2 2.0 #1 6.0 test.dist source=box
I chose 3x1 rather than Wavefront’s #3 1 format to save you having to escape
the hash. You can even mix and match, so 3x1 2 3 is fine.
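The counting itself is easy to reproduce. Here is a small sketch, under my own naming and not lifted from the CLI, which expands 3x1-style arguments, groups identical values, and prints the same kind of line that -V showed above. The helper names and keyword arguments are all hypothetical.

```ruby
# Hypothetical helper: expand CLI-style arguments into a flat list of
# values. '3x1' means "three ones"; a bare '2' means "one two".
def expand_values(args)
  args.flat_map do |arg|
    count, value = arg.include?('x') ? arg.split('x', 2) : [1, arg]
    [Float(value)] * Integer(count)
  end
end

# Group identical values and describe each group as "#<count> <value>",
# in order of first appearance.
def to_distribution(values)
  values.group_by(&:itself).map { |value, hits| "##{hits.size} #{value}" }.join(' ')
end

# Hypothetical helper: build the whole distribution line.
def distribution_line(path, args, ts:, source:, interval: 'm')
  "!#{interval.upcase} #{ts} #{to_distribution(expand_values(args))} #{path} source=#{source}"
end

puts distribution_line('demo.dist', %w[3 1 4 1 1 2 3 6 4 1 3 2],
                       ts: 1539780323, source: 'box')
# prints: !M 1539780323 #3 3.0 #4 1.0 #2 4.0 #2 2.0 #1 6.0 demo.dist source=box

puts to_distribution(expand_values(%w[3x1 2 3]))
# prints: #3 1.0 #1 2.0 #1 3.0
```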
When you send a distribution, you must define the time interval it covers. The
-I option lets you do this, and its value can be m, h or d. If you don’t
specify, m is chosen. When the CLI detects a distribution it will
automatically send it to port 40000. If you need to use a different port, -p
will help you out.
You can even take distributions from a file, as we saw above. When you describe
the input file format, just use d for distribution instead of v for value.
And instead of a single value in the file, use a comma-separated list of values.
Values can be straight numbers, or they can be duplicated with an x in the way
you already saw. All the other rules of write file apply.
Note that distribution and histogram data cannot be sent via the API. They must go through a proxy. This is a design decision of Wavefront itself, not of the CLI. They also don’t currently appear to work if you send them to the proxy over HTTP.
I hope you find the CLI a useful way of getting data into Wavefront. If you find any bugs, or wish any enhancements, please open an issue.