Description
Don’t miss out! Join us at our upcoming event: KubeCon + CloudNativeCon North America 2021 in Los Angeles, CA, from October 12-15. Learn more at https://kubecon.io. The conference features presentations from developers and end users of Kubernetes, Prometheus, Envoy, and all of the other CNCF-hosted projects.
Keynote: Fireside chat with Martin Mao, CEO of Chronosphere - Anurag Gupta, Production Manager, Calyptia; Martin Mao, CEO/Co-Founder, Chronosphere
In this session, join Martin Mao, CEO of Chronosphere, in a fireside chat with Fluent maintainer Anurag to discuss the evolution of logging in relation to observability and answer questions about the broader Fluent ecosystem.
A
Okay, well, hey Martin, thanks so much for joining me on this fireside chat. You know, I'd love maybe a quick intro for all the folks who are joining us at FluentCon: maybe who we are, what Chronosphere is, yeah, just a little bit about yourself.
B
Yeah, for sure, thanks for having me, Anurag. I'm really excited to be here today chatting with you. A little bit about myself: my name is Martin, and I'm currently the CEO and co-founder of Chronosphere. We provide a hosted monitoring solution to companies adopting cloud native. We help these companies monitor their infrastructure, so primarily Kubernetes.
B
These days that means monitoring their applications, which are generally microservices-oriented, and monitoring their business as well, in real time. The core technology of our product really came out of Uber, and a lot of the open source observability projects came out of Uber too. That's actually where I spent four years of my career before founding Chronosphere. I led a core part of the observability team there, where we created projects such as M3, which is a distributed and scalable metric storage engine that's compatible with Prometheus as long-term storage for metrics. We also created Jaeger, which is the CNCF-graduated distributed tracing project.
B
We actually completed the trifecta and created a logging platform internally as well, but unfortunately we never open sourced it. So yeah, I've spent a lot of time in my career in the observability space, both solving problems for Uber with these direct solutions, but also for the broader community via the open source channels. So that's a little bit about myself, and again, I'm really excited to be here today and looking forward to our chat.
A
Awesome, yeah. And you know, Jaeger sits alongside Fluentd as a graduated CNCF project, so it's awesome to hear about all this observability journey at Uber; what a small app, right? As probably one of the leading folks in the observability space, with Jaeger and all these other projects, I'm curious: what should companies, or really what should users, actually be thinking about when solving for observability?
B
Yeah, that's a great question, and I love the fact that you put users in there, because that is probably where the focus should be: what should the users be thinking about? Even the question of who the users of observability are has been changing fairly rapidly, I'd say. Historically, perhaps, the users or practitioners of observability and monitoring were isolated to the SRE department or, you know, perhaps a core infrastructure team.
B
However, if you think about modern development and the application developer, they not only have to write and develop their application. They also have to test it, then they deploy it, they have to monitor it in production, and they have to remediate issues when it goes wrong. So really, for us, observability is more than just a practice.
B
It's probably more like a cultural mindset, much like how DevOps is, and it really is something we're seeing all developers embrace and adopt, this culture and this mindset, more and more. So really, the end users of observability are all the developers out there, and if you look at it from that perspective, they're really trying to optimize for one outcome: to know when something is wrong and to remediate that issue as quickly as possible, ideally before end customers find out.
B
Or before other engineering teams do, perhaps. And I think, optimizing for that outcome of remediating particular issues in their applications, there are really three questions we're trying to answer here. The first is: can I get notified, and how quickly can I get notified, when something goes wrong? Because if you don't even know when something goes wrong, or your customers find out before you, that's really not a great place to be. I'd say the second one is triage.
B
So once you do get notified that something is wrong, figuring out: what is the impact? Is it impacting all of my customers or just a subset of my customers? Is it one cluster or another? You know, if you get woken up in the middle of the night, is this something I have to deal with now, or can it wait until the morning?
B
So triaging the issue and knowing the impact is a fairly important question that we need to answer. And then the third step, or phase, or question that we need to answer is: can I root-cause this issue? Can I find the underlying root cause of the issue and really provide a fix for it?
B
So I think those are probably the three steps or phases, I'd say, that developers go through in achieving their outcome, which is to remediate the issue as quickly as possible. For myself, and for the team here at Chronosphere, that's what we think about when we think about observability, and that's what we think end users should focus on. We do hear a lot of definitions out there that are, I would say, concentrated more around other things.
B
Perhaps the data types, like metrics, traces, and logs, the three pillars per se. Those data types are definitely important, for sure, and they are the types of data we need to answer the questions and arrive at our outcome.
B
But the data types by themselves don't really give you observability, or better observability. Just ticking the three of them off and saying "hey, I have logs, I have metrics, I have traces" doesn't necessarily mean you have observability, or great observability, and producing more of each of those data types doesn't lead to greater observability either. So we do think that an outcome-based approach for the end user, which is the developer, is perhaps a better way to think about observability as a whole, as opposed to a data approach.
A
Yeah, it makes a ton of sense. I think everyone's getting hammered with these three pillars, logs, metrics, traces, you've got to check the box on all of the above, and sometimes we just forget about the user in those cases. And you mentioned the three steps or phases; are there things you'd recommend to users for going about meeting those? Or maybe you can help clarify those pieces a little bit, yeah.
B
You do have to get notified that something is wrong before you can go and fix the issue, for sure. I'll say that the best way to think about these three steps or phases is that it's still, again, about optimizing for the outcome, which is remediation, and it's not necessary to go through all three phases to remediate an issue. So if you think about it: if you're mid-deploy of your service and you get notified that something is wrong...
B
The first course of action is probably to roll back that deploy instantly, and that could be your way of remediating the issue. You don't know the root cause there, but you've remediated the issue and you've avoided customer impact, I'd say, and that's really what you're trying to optimize for. So perhaps at the notification phase you can remediate instantly, and there's no need to go triage and root-cause the issue.
B
You can sort of do that after the fact, and I think that's important as well; you don't really want to be doing root cause analysis live during the incident, with the pressure of knowing that the business is down or impacted or anything like that. I'd also say that, generally, issues are introduced when we change a system; when we leave a system alone there generally aren't as many issues, and I'd say the highest percentage of causes of issues is when we introduce change to a particular system.
B
So perhaps when you get notified there, that resolution, or step to remediation, is fairly quick. The second one is triage. There are definitely a bunch of situations where just being notified isn't enough; you're not actively introducing a change to the system, so you do want to triage the issue in the sense of knowing what is impacted. What is the impact to, you know, subsets of my customer base, or perhaps all of my customer base?
B
Perhaps you can isolate the issue down to one cluster, or one availability zone, or one region, and I think that helps you know how bad the issue is and how much urgency you need to put into resolving it. But often we find that at that step you can also remediate fairly easily, without root cause analysis. And you can imagine, you know...
B
Most of our modern architectures are spread across multiple clusters, multiple availability zones, multiple regions, so a quick path to remediation, perhaps, is to route your requests around those impacted zones or clusters. If you know the issue is isolated to cluster A, or to zone A, route your requests away from it, such that, again, you've remediated the issue and your customers are not impacted.
B
Yet you have time to really figure things out, not under that time pressure. And then there are definitely, occasionally, those issues where you can't do either of the first two and you really have to get dug in and dig at what the root cause is in production. Again, this is probably not the preferable one.
B
I'm sure developers would prefer not to have to debug things live in production, under the time pressure of actual impact to the business, but that does need to happen sometimes, and when it does, you've really got to get dug in there, figure out what the root cause is, roll out a fix, and remediate the issue that way. So the phases are sort of sequential and dependent on each other.
A
Yeah, and I think that helps a lot. Now, as you talk about remediation, and going back a little bit to the three pillars and the three data types: are there certain data types that lend themselves to making it easier to remediate? Like, are metrics or traces going to help you move faster? How do the data types relate back to all of this?
B
That's a great question. I think the short answer is yes, though there's obviously a lot of detail in that. When we think about the data types, as I mentioned earlier, just having all three checked off doesn't lead to better observability in any way, and just having more of each data type doesn't really accomplish or achieve that either.
B
I think, if you look at the outcome, which is remediation, and at the phases, there are particular data types that are better suited to particular phases. So if you think about notification: generally, when you're trying to be notified of something, that notification is done on an aggregate view of all of your data, or all of your requests, perhaps. You're looking at: how many requests am I getting per second? How many errors am I getting per second? What's the aggregate latency of those particular requests?
B
So I think, to help with the notification phase of the problem, metric data is generally a more optimal type of data to go and solve that problem. It doesn't necessarily mean it's the only one, or the necessary one, but if you think about what you're trying to measure there, it's really an aggregate view of numerical data: you're counting things, or you're measuring latency, and that's generally metrics, which is, you know, values over time.
B
And if you think about notification and alerting, you're generally checking those numerical values against a particular threshold. So I think metrics are perhaps better suited for the notification phase of things, and perhaps triage a little bit as well.
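The threshold check he describes can be sketched in a few lines. This is an illustrative stand-in, not any particular product's alerting engine; the metric names and the 5% threshold are assumptions for the example:

```python
# Minimal sketch of the "notification" phase: compare an aggregate
# metric value against a fixed threshold and decide whether to alert.
# The rate inputs and the 5% threshold are illustrative assumptions.

def error_rate(errors_per_sec: float, requests_per_sec: float) -> float:
    """Fraction of requests that failed in the sampling window."""
    if requests_per_sec == 0:
        return 0.0
    return errors_per_sec / requests_per_sec

def should_alert(errors_per_sec: float, requests_per_sec: float,
                 threshold: float = 0.05) -> bool:
    """Fire a notification when the error rate crosses the threshold."""
    return error_rate(errors_per_sec, requests_per_sec) > threshold
```

For example, 12 errors per second out of 100 requests per second is a 12% error rate, which crosses the 5% threshold and would fire the notification.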
B
If you think about triage, you're really trying to dig one level deeper into the error count, or into the latency, a little bit, and then perhaps having labels or tags on your metric data, to better slice and dice by a particular cluster or an AZ or a region, can help you with triage for sure. The actual individual request itself is perhaps not as required for those phases, but as you shift later into the phases in the process, into deeper triage and root cause analysis...
B
You want data types that are by default a little bit more verbose, that have a little bit more information there, and I think as that transition happens, logs and traces are perhaps, I would say, more efficient at solving those particular phases of the problem. So, you know, I think all three types exist for a particular reason.
B
You know, there was the announcement earlier today about Fluent Bit sort of extracting metric data off of logs, because quite often you don't need to go and instrument for all three types of data; converting between one type and another to optimize for the use case is a great advantage, I guess. And, you know, again, I'm really happy to see Fluent Bit and Fluentd go down that path in enabling those use cases.
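The conversion he describes, turning a stream of log lines into an aggregate counter metric, can be sketched roughly as follows. This is an illustrative stand-in, not the Fluent Bit implementation; the `level=error` log convention, the `service` field, and the metric name are all assumptions:

```python
# Sketch: derive a counter metric from a stream of log lines, so the
# aggregate value can feed an alerting pipeline instead of raw logs.
# The key=value log format and "log_errors_total" name are assumptions.
from collections import Counter

def logs_to_error_counter(log_lines):
    """Count error-level log lines per service, yielding metric samples."""
    counts = Counter()
    for line in log_lines:
        # Parse whitespace-separated key=value pairs.
        fields = dict(kv.split("=", 1) for kv in line.split() if "=" in kv)
        if fields.get("level") == "error":
            counts[fields.get("service", "unknown")] += 1
    # Emit one (metric_name, labels, value) sample per service.
    return [("log_errors_total", {"service": svc}, n)
            for svc, n in sorted(counts.items())]
```

The point of the design: once the logs are reduced to a counter, notification becomes the cheap threshold check discussed earlier, rather than a scan over raw log volume.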
A
Yeah, let's talk a little bit about that. You know, we have a couple of sessions at FluentCon where you're going to hear a little bit more on the tracing side with Fluent Bit, as well as the metrics side. I'd love to just get your thoughts about metrics and logs, how these things all stem together, and Fluent Bit's new announcement there.
B
Yeah, for sure. So, you know, I think there will be a session later today from Mike. If you look at that session, and I don't want to ruin his session by any means, he was already generating metrics off of logs via a custom plugin, right?
B
So this is already happening. Even though there is first-class support now in Fluent Bit, which I think is great, users were already doing this out of necessity. And if you look at Mike's use case, it really is to extract metric data off of the logs so that he can alert off of it and get faster notifications, because again, metrics are perhaps a more optimal data type for that phase than logs are, right?
B
So, you know, I think this is already something that end users are leveraging; the need is already there. I think what's great about the announcement, and correct me if I'm wrong here, Anurag, is that I believe the capability of extracting metrics from logs has existed in Fluentd and Fluent Bit for quite a while, but the big announcement today is what's happening in the metric extraction component.
B
It's happening in Prometheus format, and I think that is also really great for the industry as a whole. If you take a step back and look at the monitoring and observability industry, there's been a huge shift towards open source standards. So, you know, Fluentd is the graduated CNCF project for logging, and the same goes for Prometheus for metrics, and I think the best part of that isn't just that there is a solution.
B
If we take metrics as an example: emitting metric data in the Prometheus format, I think, is hugely advantageous for the industry as a whole, because it means, from an end user perspective, from a developer perspective, you're not locked into one storage solution, whether that's a storage solution you're hosting yourself or, say, a vended storage solution. You can instrument in one way, in one protocol, and every solution out there sort of supports it, right? And again, that's from the metrics perspective.
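For context, the Prometheus text exposition format being discussed is a simple line-oriented standard; a small renderer sketches what it looks like (the metric name and labels below are invented for illustration):

```python
# Sketch: render counter samples in the Prometheus text exposition
# format, the de facto standard the discussion refers to.
# The metric name and label values used in examples are illustrative.

def to_prometheus_text(name, help_text, samples):
    """samples: list of (labels_dict, value) pairs for one counter."""
    lines = [f"# HELP {name} {help_text}",
             f"# TYPE {name} counter"]
    for labels, value in samples:
        label_str = ",".join(f'{k}="{v}"' for k, v in sorted(labels.items()))
        lines.append(f"{name}{{{label_str}}} {value}")
    return "\n".join(lines)
```

Because every Prometheus-compatible backend (Prometheus itself, M3, Cortex, Thanos, or a vendor) can scrape this same text, instrumenting once in this format is what avoids the lock-in he describes.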
B
If you look at the back ends, there are a lot of different solutions out there in addition to Prometheus. I mentioned M3, which is one that we open sourced out of Uber, but Cortex and Thanos are there and available as well, and I think the sort of movement to these standards, and the power of that movement, is also seen in all the vendors that are providing monitoring and metrics-based solutions, in the sense that they all have to support Prometheus as a protocol now as well.
B
Right, and again, all of this, I think, is great for the end user and the industry at large, because you're not locked into one technology as a back end, and you're not locked into one vendor as a back end, which I think is great. So it's great to see that that ability is supported as first class now in Fluent Bit, and also that the exposition format is the industry standard, which is Prometheus. That's, you know, great to see, I would say.
A
Awesome, awesome, yeah. I think that's how we were thinking about it from the Fluentd side and the Fluent Bit side: how do we just conform with the standards? And I think a big upcoming project in this space is definitely OpenTelemetry, and we announced some earlier stuff today where we're saying, hey, we're going to have some integrations going on with the protocols they're building. I'd love to get your take on the approach of that project, these projects together, and just maybe your take on OpenTelemetry.
B
I assume most folks watching this are somewhat familiar with OpenTelemetry, but if not: it is a collection of APIs and SDKs, again with the goal of standardizing the protocols and the clients that generate all of this observability data. So overall, as a whole, as a project, I love it, because it's pushing the industry towards more open source standards, for sure. If you look at OpenTelemetry, the project really started off around having sort of standard client libraries for disparate trace data; it expanded over time to include metric data, and I believe the natural progression would be to expand it even further over time to include log data as well.
B
If you look down that path, they have, again, similar APIs and SDKs across all the major programming languages, which standardizes the instrumentation and the production of the data, which I think is great. If you look at it as a project, it has support for tracing right now, metric support is being added actively right now, and I think log support may be coming soon, and I think that's going to be great for the industry moving forward.
B
I think, perhaps in a year or two, or perhaps even sooner than that, you'll see a lot of the applications we are writing instrumented in OpenTelemetry from the beginning, and that's great. In fact, it's not only just the three protocols there; it's a single client for all three types of data, or at least two types of data right now, and perhaps a third type down the line.
B
I think that's great for new applications moving forward, but if you look at it from a practical lens, at the things that we need to monitor today, there is so much existing instrumentation that it is pretty impractical, I would say, from a company's perspective, to go back and re-instrument existing applications customers have written themselves. Sometimes it's impossible, because you're pulling in, you know, a dependent library or an upstream project that you're using, and you don't really even have control over how those things are instrumented.
B
So I do think that, you know, hopefully OpenTelemetry is the future and eventually the standard there. But looking at it from a practical perspective, I think projects like Fluentd and Fluent Bit are great here, because there is a lot of sort of backwards support for existing protocols and existing instrumentation that exists today, and they're tackling the problem not from a client perspective but from a processing perspective, outside of the application itself.
B
So I think that is a very different way of solving the problem, and I think it's one that's going to be required as we handle a transition that's going to take multiple years. And I think this design of Fluent Bit and Fluentd, where you're processing outside of the application itself, lends itself to other advantages as well. So, one of the companies we work with, called Tecton: when we talked to them about their use of Fluent Bit...
B
What they were using it for was to actually augment the stream of log data coming out of the application itself with additional metadata about the environment it's running in. So they were augmenting it with the cluster and the namespace of the Kubernetes cluster that they were running in, and I think that is a hugely powerful thing to be able to do. I can't remember the exact feature name they were using.
B
I believe it's called the rewrite tag feature, or something like that, in Fluent Bit, but I think that adds a bunch of fairly powerful additional value as well, in the sense that now you can sort of standardize the additional metadata that you add to the streams, which is always a hard problem to solve. You can imagine, if you ask every developer to emit the environment or the cluster name...
B
Who knows which way they're going to go? Are they going to do it in the standard format? There's going to be weird camel casing and all sorts of other things in there, right? So I think being able to do it in one centralized location is important. And again, sometimes, if you look at it from the end user perspective, if you're an application developer, when you're writing and instrumenting your application, it's actually really hard to even know: hey, which cluster am I going to be running in?
B
How am I going to go get that data? It's actually something that may not even be possible from the application developer's perspective. So I think this approach from Fluentd and Fluent Bit, of sort of processing all of the existing streams of data coming out, adds additional value there and unlocks a bunch of use cases for sure. And as you mentioned, I believe there will be support for all the protocols that OpenTelemetry is going to be crafting and standardizing as well, so it's not an either-or thing.
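The centralized enrichment pattern described above, attaching cluster and namespace metadata in the pipeline rather than in every application, can be sketched roughly like this. It is a stand-in for what a log processor does, not Fluent Bit's implementation; the field names and environment variables are assumptions:

```python
# Sketch: enrich every log record with deployment metadata in one
# central place, instead of asking each application to emit it.
# The "cluster"/"namespace" keys and env var names are illustrative.
import os

def enrich(record: dict, env=os.environ) -> dict:
    """Attach metadata known only to the runtime environment."""
    enriched = dict(record)  # leave the original record untouched
    enriched["cluster"] = env.get("CLUSTER_NAME", "unknown")
    enriched["namespace"] = env.get("POD_NAMESPACE", "unknown")
    return enriched
```

Because the processor runs next to the workload, it can answer "which cluster am I in?" uniformly, which the application developer often cannot.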
A
Awesome, yeah. And I think both of us would probably agree that observability has changed significantly in the last three years: new projects, new protocols, new standards. As someone who's at the forefront of this, what do you think the next three years hold? What is, maybe, the future in Martin Mao's mind?
B
You know, I think, if you look at the future, what I believe is that this trend I talked about at the beginning, where every developer adopts this observability mindset, will continue. And I think there will be a huge transfer of both knowledge and skill set from that core SRE team, from the experts in these practices today, to all developers everywhere, and I really hope that that transition continues to happen.
B
I think, as that transition happens, hopefully there is also a focus on the outcome, as opposed to the inputs, all the data types, as well. So I do see that happening over the next three years: having the developers optimize for the outcome, which is remediation as quickly as possible. And if you assume that that is the direction things are moving, I think there are a few implications, or a few outputs, of that.
B
One of which is, I think, that you're going to see a lot more of what we're seeing already, where there is conversion between the three data types to optimize for the various phases, because really the developers are going to be optimizing for the various phases here. So I do think you'll see a lot more of what we're seeing today already, of transferring between the data types to solve a particular phase and to remediate as quickly as possible. But not just that.
B
I do also think that, as part of this shift, there is also going to be, I think, better context being passed between each of the phases there as well: going from notification to triage to root cause analysis, going through the three phases there.
B
I think there'll be a focus, and sort of innovation, on passing more context throughout each of those. One example I can give here: my co-founder Rob Skillington gave a talk a couple of years ago at KubeCon where we showed how you could jump from a metric data point on a dashboard, which you would use for notification and triage, straight into the underlying request in the distributed tracing system, which you would use for root cause analysis. So really trying not to begin your search again as you go through the phases, but to take the effort you put in at each phase and use it more effectively in the next phase.
B
I do think we'll see more things there, and actually I think we'll also see better integration between the tiers as well, and by tiers I mean the infrastructure and the application tiers. So you can imagine, and I think you may have alluded to this earlier today, that there are some plans for Fluent Bit to also sort of collect infrastructure stats, or infrastructure metrics, from the hardware itself.
B
We see this in big data as well, and in other industries as well. And I'd say that has implications for the central observability team, or the SRE team, that is managing and running all of the infrastructure and all the observability tooling that the rest of the developers use and depend on, and there are probably two large implications there.
B
The first of which is, I think, that as the observability tooling becomes more of an important tool in the tool set of developers, the reliability of that system is going to become more important. And this is coming from my experience at Uber, where we built a hugely powerful metrics backend storage, yet we couldn't prevent a single developer from writing a single line of code that inadvertently emitted high-cardinality metrics.
B
Just because there's going to be a larger dependence on those tools. And the second of which is, you know, I do think that the monitoring data, as I mentioned earlier, is going to outpace and grow at a much faster rate than our spend on, or our use of, infrastructure, and I think, at a certain point, the central observability team or the SRE team is going to have to focus on how to implement best practices for the developers, so they understand the implications of the instrumentation, and sort of optimize.
B
This data that's being produced should still, you know, solve the problem and optimize the outcome, but perhaps not in a "hey, just produce as much data as you can and sort of hope for the best" way. I think there will be a lot of focus on how to deal with that side of the problem as well, looking forward. But yeah, that's probably my best guess at what we're going to see in the next three years here.
A
Awesome, awesome, yeah. I think everyone's going to be a part of it, right? If you're watching this, you are probably at the forefront of observability, so I really appreciate your answer and your honesty; it's maybe an expensive future. So with that, I think, yeah, we can go ahead and close up. Thank you so much again, Martin, for your time and your insights, and we'll chat again soon.