From YouTube: Lightning Talk: SNMP done quick - tuning JunOS for metrics extraction - Ben Kochie, GitLab
Description
Lightning Talk: SNMP done quick - tuning JunOS for metrics extraction - Ben Kochie, GitLab
A
Welcome
Hi, my name is Ben Kochie. I'm one of the maintainers of the Prometheus SNMP exporter. SNMP is a networking protocol that's used to manage and gather data from network devices, typically routers, switches, that kind of thing. It's very old, but fortunately the data model that it uses maps very well into Prometheus metrics.
A
The metric trees can be mapped into metrics, and they're indexed in tables, and the indexes can be mapped to labels. This works out really well.
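The mapping described above can be illustrated with a minimal sketch. It is not the exporter's actual code: the walk data is made up, and the helper name `to_prometheus` is hypothetical. It only shows how a table column becomes a metric name and the row index becomes a label.

```python
# Hypothetical walk result: (OID, value) pairs from the IF-MIB ifXTable.
# 1.3.6.1.2.1.31.1.1.1.6 is the ifHCInOctets column; the trailing number
# is the table index (ifIndex) of the row.
WALK = [
    ("1.3.6.1.2.1.31.1.1.1.6.501", 1200),
    ("1.3.6.1.2.1.31.1.1.1.6.502", 3400),
]

# Column OID prefix -> metric name.
COLUMNS = {"1.3.6.1.2.1.31.1.1.1.6": "ifHCInOctets"}


def to_prometheus(walk, columns):
    """Turn table cells into Prometheus text-format sample lines."""
    lines = []
    for oid, value in walk:
        # Split "column.index" at the last dot (single-number index assumed).
        prefix, _, index = oid.rpartition(".")
        name = columns.get(prefix)
        if name is None:
            continue  # not a column we know about
        lines.append(f'{name}{{ifIndex="{index}"}} {value}')
    return lines


if __name__ == "__main__":
    print("\n".join(to_prometheus(WALK, COLUMNS)))
```

Real tables can have multi-part indexes; this sketch assumes a single integer index to keep the idea visible.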
So I've got a couple of old Juniper switches. They're in a switch stack, and there's quite a lot of ports and a lot of data to gather. So let me start up a quick scrape.
A
Let's take a look at the SNMP configuration that I've added to my Juniper switch. There's some stuff that I've left out, but this is the interesting bit that helps improve performance. The first thing I did to improve performance was add an interface filter, and this drops some of the data that I don't actually need to gather from the device. There's a number of sub-interfaces, and it's a little bit cryptic.
A
But basically this drops the sub-interface data from the output of the switch.
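The effect of such a filter can be sketched in a few lines. The filter itself runs on the switch and its exact pattern is not shown in the talk, so the regular expression below is an assumption: it treats any name ending in `.<unit>` (for example `ge-0/0/1.0`) as a sub-interface. The interface names are made up.

```python
import re

# Assumed pattern: a logical unit looks like "<physical>.<unit>".
SUBINTERFACE = re.compile(r"\.\d+$")


def drop_subinterfaces(names):
    """Keep only interface names that do not end in a unit number."""
    return [name for name in names if not SUBINTERFACE.search(name)]


if __name__ == "__main__":
    names = ["ge-0/0/0", "ge-0/0/0.0", "ae0", "ae0.0", "me0"]
    print(drop_subinterfaces(names))
```

Fewer rows in the interface tables means fewer values for the device to produce on every walk.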
The second thing I've done is add the stats cache, which caches the data for 29 seconds, and this is designed to match with the scrape interval. So if I hit the device twice from two different Prometheus instances, it'll produce cached data, which should be much faster than pulling the raw data from the switch. But I wanted to make sure that I didn't cache longer than one scrape interval.
A
So I've made it one second shorter than the actual scrape interval.
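To see why the cache lifetime is kept just under the scrape interval, here is a small simulation. It is a sketch under stated assumptions, not a model of the switch's real cache: it assumes a 30-second scrape interval, two Prometheus instances scraping one second apart, and that a cached entry expires once its age reaches the lifetime. The function name `data_seen` is hypothetical.

```python
def data_seen(lifetime, scrape_times):
    """Return, for each scrape, the time at which its data was read.

    A scrape reads fresh counters only when the cache is empty or expired;
    otherwise it is served the previously cached data.
    """
    fetched_at = None
    seen = []
    for now in scrape_times:
        if fetched_at is None or now - fetched_at >= lifetime:
            fetched_at = now  # cache expired: read fresh counters
        seen.append(fetched_at)
    return seen


if __name__ == "__main__":
    # Instance 1 scrapes at 0, 30, 60; instance 2 at 1, 31, 61.
    scrapes = [0, 1, 30, 31, 60, 61]
    print("lifetime 29:", data_seen(29, scrapes))
    print("lifetime 31:", data_seen(31, scrapes))
```

With a lifetime of 29, the first scrape in each interval reads fresh data and the second one reuses it. With a lifetime of 31, the scrape at t=30 is still served the data read at t=0, so that instance records the same counters twice in a row.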
Let's see how that walk is doing. Okay, so that walk completed and it took 22 seconds. Well, it's not bad, but it's not great. So let's see if we can figure out why, or how to improve this. Well, we've got two subtree walks here in the debug log. One of them took 12 seconds.
A
One of them took 9.8 seconds. Well, that pretty much matches up with the default IF-MIB.
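Reading per-subtree timings out of the debug log can be automated. The sketch below is illustrative only: the log lines are invented, and the field names `oid` and `duration_seconds` are an assumption about the log format, so adjust the pattern to whatever your exporter version prints. The talk does not say which subtree took 12 seconds and which took 9.8, so the assignment here is arbitrary.

```python
import re

# Made-up sample lines; real debug output carries more fields.
SAMPLE_LOG = """\
level=debug msg="Walk of subtree completed" oid=1.3.6.1.2.1.2 duration_seconds=12.0
level=debug msg="Walk of subtree completed" oid=1.3.6.1.2.1.31.1.1 duration_seconds=9.8
"""

# Assumed key=value fields: an OID and a duration in plain seconds.
WALK_DONE = re.compile(
    r"oid=(?P<oid>\S+).*?duration_seconds=(?P<secs>\d+(?:\.\d+)?)"
)


def slowest_subtrees(log_text):
    """Return (oid, seconds) pairs, slowest subtree first."""
    walks = []
    for line in log_text.splitlines():
        match = WALK_DONE.search(line)
        if match:
            walks.append((match["oid"], float(match["secs"])))
    return sorted(walks, key=lambda walk: walk[1], reverse=True)


if __name__ == "__main__":
    for oid, secs in slowest_subtrees(SAMPLE_LOG):
        print(f"{secs:6.1f}s  {oid}")
```

Adding up the two durations gives about 21.8 seconds, which lines up with the 22-second total scrape.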
And so this is the walk configuration that I've asked the device to produce data for. The interfaces table and the ifXTable come from this IF-MIB and, as you can see here, these two tables, the ifTable and the ifXTable, contain a lot of subtrees. So the first thing we can do is, well, let's see what happens if we split that out. I've built an expanded tree that expands all of these subtrees, and let's run that scrape. So here's the "if expanded" config, and let's see what happens if we try and load this, and we'll wait for those logs to finish. All right.
A
We don't need all of that data, so here's a generator config that I've created that only gathers exactly what I need from the device, which is the high-capacity counters for all the basics. Then I've created a second config that gathers all the error counters and a couple of other things like admin status, oper status, and port speed.
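The split into two configs can be sketched as follows. The talk does not list the exact objects, so the selection below is an assumption based on standard IF-MIB object names, and the module names `if_mini` and `if_extras` are hypothetical. The script only renders a `modules:` / `walk:` layout in the style of a generator config.

```python
# Assumed object selection; the talk does not show the real lists.
MODULES = {
    # High-capacity (64-bit) counters for the basics.
    "if_mini": [
        "ifHCInOctets",
        "ifHCOutOctets",
        "ifHCInUcastPkts",
        "ifHCOutUcastPkts",
    ],
    # Error counters plus admin status, oper status and port speed.
    "if_extras": [
        "ifInErrors",
        "ifOutErrors",
        "ifInDiscards",
        "ifOutDiscards",
        "ifAdminStatus",
        "ifOperStatus",
        "ifHighSpeed",
    ],
}


def render_generator_config(modules):
    """Render a modules/walk mapping as YAML-style text."""
    lines = ["modules:"]
    for name, objects in modules.items():
        lines.append(f"  {name}:")
        lines.append("    walk:")
        lines.extend(f"      - {obj}" for obj in objects)
    return "\n".join(lines) + "\n"


if __name__ == "__main__":
    print(render_generator_config(MODULES), end="")
```

Keeping the two lists disjoint means each module can be scraped and timed on its own.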
And so once this is done producing data... Yeah, so that still took 24 seconds. It definitely wasn't any faster. So let's see what happens if I do the same thing
A
and I only gather my mini config. Well, let's take a look. Let's wait for that walk to run
A
and see how long that takes. That looks like it completed. Well, that was much, much faster. I wonder why the system log seems to be a little bit lagged, but let's see if we can get that to produce more data. There we go. Yeah, so that walk only took six seconds. So the big trick, if your data gathering is too slow, is to turn on SNMP exporter debug logging, examine all of the subtrees to find out if any specific subtree is fast or slow, and then.