From YouTube: Opstrace Tracing demo e2e
Description
Demo of the current state of the new ClickHouse-backed tracing in Opstrace
Hi, this is Nick Parker on the Monitor Observability team, and I'm going to do a quick, relatively unscripted demo of the current state of tracing in Opstrace. Tracing itself is a relatively recently added feature and it's still in the proof-of-concept stage, but I figured it's at a point now where we can do a quick demo and show you what works and what's left to do. I figured I'd start off with this diagram showing how things flow.
This looks way more complicated than it is. The main thing to get from it is that you can have multiple tenants in Opstrace (that's already a thing), and against each of those tenants you can send traces in; the traces then get stored, and you can get them back by going to a UI. So the idea here is that some end user would send traces, probably using the OpenTelemetry Collector, which you can think of as a Swiss Army knife agent for managing traces. It would send data to an authenticated endpoint using the OTLP span format. The first thing that endpoint does is obviously check the authentication header, but assuming that passes, it takes the data and internally converts it to a Jaeger format so that an internal Jaeger pod will accept it.
The traces are streamed in through the system into ClickHouse, and then when an end user wants to look at something, they go to the Jaeger UI, which is literally just a stock Jaeger interface, and view it for the tenant they're interested in.
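For reference, here is a minimal sketch of what sending spans to one of those per-tenant endpoints can look like with the OpenTelemetry Python SDK. The endpoint URL, header name, and token value are placeholder assumptions for illustration, not the exact values an Opstrace tenant uses.

```python
# Minimal sketch: emit one span over OTLP/HTTP to a tenant's tracing endpoint.
# The endpoint URL, header name, and token below are illustrative assumptions.
from opentelemetry import trace
from opentelemetry.sdk.resources import Resource
from opentelemetry.sdk.trace import TracerProvider
from opentelemetry.sdk.trace.export import BatchSpanProcessor
from opentelemetry.exporter.otlp.proto.http.trace_exporter import OTLPSpanExporter

exporter = OTLPSpanExporter(
    endpoint="https://default.tenant.example.opstrace.net/tracing/v1/traces",  # assumed URL
    headers={"Authorization": "Bearer <tenant-auth-token>"},                    # assumed header/token
)

provider = TracerProvider(resource=Resource.create({"service.name": "demo-client"}))
provider.add_span_processor(BatchSpanProcessor(exporter))
trace.set_tracer_provider(provider)

with trace.get_tracer(__name__).start_as_current_span("demo-span"):
    pass  # the span ends here and is queued for export

provider.shutdown()  # flush the batch processor before exiting
```

Those same two pieces of configuration, the endpoint and the auth token, are exactly what the collector gets pointed at later in the demo.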
So this is sort of the state of the system at this point: you can get data in and you can see the data coming out. We now have regular CI runs against all merge requests, as well as periodically against the main branch, that check that you can get data stored into the system and then query it back out via the Jaeger API. We've basically been running it long enough that it looks like things in their current MVP state are relatively stable.
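Roughly the same store-then-query check that CI performs can be done by hand against the Jaeger query API. The sketch below assumes the `requests` library and a made-up cluster domain; the `/api/...` paths are the internal ones the stock Jaeger UI itself calls, and authentication against the Opstrace instance is elided here.

```python
# Minimal sketch: query traces back out through the HTTP API used by the stock
# Jaeger UI. The host name is an assumption; auth against the instance is elided.
import requests

base = "https://system.example.opstrace.net/jaeger"  # assumed: <tenant>.<cluster-domain>/jaeger

# List the services that have reported spans for this tenant.
services = requests.get(f"{base}/api/services", timeout=10).json()["data"]
print(services)  # e.g. ["cortex", "jaeger-operator", "jaeger-query"]

# Fetch recent traces for one of those services.
traces = requests.get(
    f"{base}/api/traces",
    params={"service": "jaeger-operator", "limit": 20},
    timeout=10,
).json()["data"]
print(f"got {len(traces)} traces")
```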
If we look at the system tenant on this example Opstrace instance that I've got up and running (so I'm at the system tenant, then the cluster domain, then /jaeger), we can see we've just got a stock Jaeger UI with a little bit of customization over here, with some links related to Opstrace itself. Otherwise, anyone who's used Jaeger has probably seen this before.
In this instance, the system instance, we can see we've already got some internal Opstrace components sending traces into this. At the moment it's just Cortex, the metrics management system that Opstrace uses internally for metrics, as well as the Jaeger Operator, which is deploying all the per-tenant Jaeger instances, plus Jaeger itself reporting on itself.
The jaeger-query one is basically on by default, so we're getting that for free. But, for example, with the Cortex traces, I can hit Find and then I see a bunch of Cortex spans, or traces, for storing data into the system. You can see distributors are accepting data and then picking an ingester, and then the ingester is actually storing it, I guess locally.
And then, if we go into the Jaeger Operator and query for spans against there, we can see it technically has some errors. If we go and look at one of those, we can see it's complaining about "no matches for kind Ingress in version networking.k8s.io/v1". That's okay, that's kind of expected; it's a side effect of deprecated Ingress versions between different versions of Kubernetes.
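As a side note, checking which versions of that API group a cluster actually serves is straightforward; here is a minimal sketch, assuming the official `kubernetes` Python client and a reachable kubeconfig.

```python
# Minimal sketch: list which versions of the networking.k8s.io API group the
# cluster serves, which is what the "no matches for kind Ingress" error is about.
from kubernetes import client, config

config.load_kube_config()  # or config.load_incluster_config() inside a pod

for group in client.ApisApi().get_api_versions().groups:
    if group.name == "networking.k8s.io":
        print([v.version for v in group.versions])  # e.g. ['v1'] or ['v1beta1', 'v1']
```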
So that's expected, for the record, but you can see that we can very quickly diagnose problems in the Opstrace instance itself. For the other half of this demo, we've looked at some examples of the system tenant getting data basically from the system itself, in a snake-eating-its-own-tail form, but let's try actually sending in some data from the outside.
So I have a Kubernetes cluster that's just running in my house here, and I've configured the API server on the Raspberry Pi to send metrics, or sorry, send traces I should say, to an OpenTelemetry Collector. This is the definition of what that collector is running.
To be clear, the OpenTelemetry Collector is just a third-party stock agent that you can run, and we're just configuring it with (a) the auth token and (b) the endpoint where the tenant tracing endpoint is located, i.e. where the traces should actually be sent. Once we've got those two things, we can check up on our pod here and see that, yeah, it's basically sending some spans every five seconds or so. And then, if we go... whoops, I didn't mean to click on that.
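As a sketch of what that home collector is effectively doing, the snippet below reuses the same OpenTelemetry Python SDK setup as the earlier example, adds the resource tags that make the source identifiable in the UI (the Raspberry Pi host and the cluster name that show up a bit later in the demo), and emits a span on the five-second cadence described here. All concrete values are placeholders, not the demo's actual configuration.

```python
# Minimal sketch of what the home collector is standing in for: tag spans with
# the host and cluster (these surface as process tags in the Jaeger UI) and emit
# them on a loop. Endpoint, header, token, and tag values are all placeholders.
import time

from opentelemetry import trace
from opentelemetry.sdk.resources import Resource
from opentelemetry.sdk.trace import TracerProvider
from opentelemetry.sdk.trace.export import BatchSpanProcessor
from opentelemetry.exporter.otlp.proto.http.trace_exporter import OTLPSpanExporter

resource = Resource.create({
    "service.name": "apiserver-demo",
    "host.name": "raspberrypi",          # placeholder host tag
    "k8s.cluster.name": "home-cluster",  # placeholder cluster tag
})

provider = TracerProvider(resource=resource)
provider.add_span_processor(BatchSpanProcessor(OTLPSpanExporter(
    endpoint="https://default.tenant.example.opstrace.net/tracing/v1/traces",  # assumed
    headers={"Authorization": "Bearer <tenant-auth-token>"},                    # assumed
)))
trace.set_tracer_provider(provider)
tracer = trace.get_tracer("home-cluster-demo")

while True:
    with tracer.start_as_current_span("heartbeat"):
        pass
    time.sleep(5)  # roughly the cadence visible in the collector pod's output
```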
Wherever that's going, let's go back. If we go into the default tenant here (I can just go back to the root), we can see we've got apiserver, whose traces are coming from my OpenTelemetry Collector pod here in the house and getting sent to this remote Opstrace instance, and then we've also got jaeger-query, which again is just Jaeger reporting on itself by default.
So if I do a quick Find Traces, it comes back very quickly and we can see there are some events going on. I don't know anything about the internals of the Kubernetes API server, but you can see, okay, there's my Raspberry Pi, there's the Kubernetes cluster name. So we've got traces coming in from an arbitrary source and getting into the Opstrace instance against this default tenant.
So anyway, I guess that's kind of it. Those are just some examples of sending data into the system and being able to see it in the UI; again, we're just running a stock Jaeger UI for now. As far as things that are left to do, the obvious ones are, for example, that the ClickHouse instance right now is running in unreplicated mode, which means there's basically one pod that all of this data is being stored against. Joe Shaw is currently working on setting up replication for that ClickHouse instance, so that it's a bit less prone to single-point-of-failure issues.
I am working on setting up some quotas and limits. A lot of these individual components have limits around, for example, the total throughput of data coming in, the total number of spans that you can store, that sort of thing, so that tenants are not taking up too much of the system. Mostly it's just a matter of exposing those options that already exist, but anyway, it's, it's...