From YouTube: Ceph Developer Monthly 2022-03-02
Description
Every month the Ceph Developer Community meets to discuss current work in the Ceph codebase, and coordinate efforts to minimize collisions and issues.
This monthly Ceph Dev Meeting will occur on the first Wed of every month via our BlueJeans teleconferencing system. Each month we alternate meeting times to ensure that all time zones have the opportunity to participate.
https://tracker.ceph.com/projects/ceph/wiki/Planning
A: Hello folks, looks like we've got a good crowd already. I think for tonight we had something on the agenda around discussing the scaling of metrics, but it looks like the folks who were going to present it won't be able to make it tonight. Matt, I think you had some thoughts on this topic, though. Are there some aspects of it that you wanted to bring up?
B: What's going on from the RGW side: Paul Cuzner was involved, and also some folks working with him, a couple of whom are here, Avan Thakkar and others. But I believed, and Casey believed, since he's unable to be here, that we're kind of focused on one corner of it. There are two parts to it. One, we want to be able to widen the interface to get alerts and parametric counters out of RGW. I'm not quite sure about the alerts, but something similar about the counter information is, as I understand it, something that CephFS and everybody else also needs to do, so there's a commonality there.

I thought maybe we had a conversation going last month where there were a few different themes, and there's a bunch of themes in the doc that people have been editing. I'm sorry that's not visible here; we could find a way to make it visible to the group, because I think it's all upstream-relevant stuff. One theme was from Jason Dillaman, I think, originally.

Last month there were themes about doing the things I described. There were also some quasi-similar counter-proposals to give up the manager in different ways. There was a proposal just to put counters into RADOS using indexed storage. I think that has problematic aspects, but it was proposed.
A: Thanks, Matt, that's a good summary of what the discussion was last time and what the different thoughts were at the time. I'm a little bit confused about the current state of things as well; I'm not sure precisely what's what.
B: One more thing that's in the document: whether or not we can rely on Prometheus as a primary endpoint, or a sink. We have to think about different stuff that kind of feeds into that, things Ernesto has said, and also maybe a point that Patrick made last time, you know, that it'd be something more general. But I'm afraid of that.
A: All right. Well, let me try to summarize some of these things in the etherpad that I just linked in the chat, so they're more accessible to everybody, and we can circulate it a little more widely too. It looks like we have Avan here. Avan, maybe you have some context for what you've been thinking about or working on so far.
C: Okay, the actual problem which we found out does not only apply to Prometheus; that's why we're planning to solve it at a lower level, I mean in the C++ code itself. The thing is, the mgr is collecting too many metrics right now and updating them according to the mgr stats period, which is mostly around five seconds or so. It collects and it sends.

So, coming to that: basically the plan is to have a Prometheus exporter. What we are proposing right now is having a C++ daemon which does this. For now we have these admin socket files in /var/run/ceph on each host, at least with cephadm, and we can do the same for Rook as well. So it's all about the sockets being present on the host, and that we can do.

From that, we're basically trying to have a daemon which gathers all those socket files, so we can have the output of all the perf counters on a per-daemon basis, and then we can just expose that as an HTTP endpoint. So that daemon is also responsible for starting the Prometheus HTTP server.

So you can assume here that there is no Prometheus module, or you can say the Prometheus module is no longer fetching the perf counters: no more get_all_perf_counters call, which used to pull the perf counters. Prometheus can just scrape the addresses of the exporters running on each host. So yeah, that's our exporter.
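For illustration, here is a minimal sketch of the scrape path Avan describes. The actual proposal is a C++ daemon, so this Python mock-up only shows the mechanism: enumerate the admin sockets on the host, run the existing "ceph --admin-daemon <socket> perf dump" command against each one, and serve the flattened result in Prometheus text exposition format. The socket glob, metric naming scheme, and port are assumptions made for the sketch.

#!/usr/bin/env python3
# Sketch only: a per-host exporter that reads daemon admin sockets and
# serves perf counters over HTTP for Prometheus to scrape.
import glob
import json
import subprocess
from http.server import BaseHTTPRequestHandler, HTTPServer

SOCKET_GLOB = "/var/run/ceph/*.asok"  # with cephadm the sockets sit under a per-fsid directory

def scrape_daemon(sock_path):
    # "perf dump" returns nested JSON: {"<subsystem>": {"<counter>": <value>, ...}, ...}
    out = subprocess.check_output(["ceph", "--admin-daemon", sock_path, "perf", "dump"])
    return json.loads(out)

def to_exposition(samples):
    # Flatten {daemon: {subsystem: {counter: value}}} into Prometheus text format,
    # e.g.  ceph_osd_op_r{ceph_daemon="osd.0"} 1234
    lines = []
    for daemon, dump in samples.items():
        for subsystem, counters in dump.items():
            for name, value in counters.items():
                if isinstance(value, dict):      # long-running averages
                    value = value.get("sum", 0)  # expose just the sum component here
                metric = f"ceph_{subsystem}_{name}".replace(".", "_").replace("-", "_")
                lines.append(f'{metric}{{ceph_daemon="{daemon}"}} {value}')
    return "\n".join(lines) + "\n"

class Handler(BaseHTTPRequestHandler):
    def do_GET(self):
        samples = {}
        for sock in glob.glob(SOCKET_GLOB):
            name = sock.rsplit("/", 1)[-1][:-len(".asok")]
            try:
                samples[name] = scrape_daemon(sock)
            except Exception:
                pass  # daemon restarting or socket stale; skip it this cycle
        body = to_exposition(samples).encode()
        self.send_response(200)
        self.send_header("Content-Type", "text/plain; version=0.0.4")
        self.send_header("Content-Length", str(len(body)))
        self.end_headers()
        self.wfile.write(body)

if __name__ == "__main__":
    HTTPServer(("", 9926), Handler).serve_forever()  # port chosen arbitrarily for the sketch

Prometheus would then scrape each host's exporter directly, instead of one big endpoint served by the mgr's Python module.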
B: Certainly. You know, RGW, and I think RBD and CephFS, want to widen the interface of what counters are as well. I mean, we could use help from people who understand how Prometheus could consume it, right. But, for example, the specific thing we think we need to be able to do is this: in addition to the fixed sort of array of counters that we have, sensible but flat, we think we and CephFS are going to need to be able to say, for this volume, this array of counters; or, for a bounded, fixed number of volumes that are currently interesting, these counters. RBD, I see, and RGW have a similar need. And there are, I think, also alerts.

You know, temporal, node-local information like that would be desirable to get to an endpoint. We aren't planning on log mining for that, for some particularly interesting information. Is that something you guys know how to do? I was kind of relying on Paul's knowledge of Prometheus for how to do that, but other people might know it as well.
C: Yeah, the thing is, the issue comes in as soon as you have the Python bindings, basically the locking, or the GIL you can say. That's why we proposed this, to avoid those things. With this approach there's no need to rely on any of the mgr modules to get those perf counters; we can have the daemon itself, in C++, doing that.
B
Oh,
I
I
think
I
get
that
part,
but
I
mean
I'm
just
saying:
if,
if
no
decks,
let's,
let's
pretend
let's
propose,
I
mean
I
don't
know
if
other
people
have
are
objecting.
But
let's
say
let's
say
that
we
all
accepted
the
note
exporter
idea
as
as
cogent.
The
only
only
point
I
was
raising
is
right.
Now,
perf
counters
are
a
thing.
That's
you
know
legacy
seth
concept,
they're
they're,
an
array
of
named
slots
in
an
array.
B
We'd
we'd
want
to
be
able
to
widen
that
somehow,
so
that
we
can
send
a
bit
more
data
more
essentially,
I
used
to
call
it
sparse
data,
but,
like
you
know,
we
might
have
tens
of
thousands
of
buckets
in
an
environment,
but
for
some
sub
for
some
subset
of
those
we'd
like
to
be
able
to
send
the
activity
on
those
buckets.
So
we
might
have
those
buckets
might
be
replicating
right
now
or
are
they
or
they
might
be
ingesting
right
now?
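To make the shape of that concrete, here is a tiny hypothetical sketch contrasting today's flat counter array with the labeled, sparse counters being asked for. The names are invented for illustration and are not an existing Ceph interface.

# Today: a fixed, flat array of named slots, one value per counter.
fixed_counters = {
    "rgw_put_ops": 10412,
    "rgw_get_ops": 98231,
}

# Wanted: the same kind of counters parameterized by a label (bucket,
# volume, ...), populated only for the currently interesting subset.
labeled_counters = {
    ("rgw_put_ops", ("bucket", "images")): 9312,
    ("rgw_put_ops", ("bucket", "backups")): 77,
    ("rgw_replication_lag", ("bucket", "images")): 4.2,  # hypothetical name
}

def to_exposition(labeled):
    # Render labeled counters the way Prometheus would consume them.
    for (name, (key, val)), sample in labeled.items():
        print(f'ceph_{name}{{{key}="{val}"}} {sample}')

to_exposition(labeled_counters)
# ceph_rgw_put_ops{bucket="images"} 9312
# ceph_rgw_put_ops{bucket="backups"} 77
# ceph_rgw_replication_lag{bucket="images"} 4.2
# With tens of thousands of buckets, only the active subset is emitted.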
C: But for the mgr modules we still require the CLI commands to run; we still need to send those.
B: Oh, sure, I get that. But for this additional stuff there wouldn't be anything like that. For one thing, in terms of RBD and maybe other things, there actually is this backward-compatibility angle; I understand that. But RGW doesn't really have it, right. And even if it did, any new interfaces we'd be proposing wouldn't have to be backward compatible. So whatever the CLI and the manager are doing, it needs to keep doing.
A: Yeah, there are some purposes we use the existing metrics in the manager for, for which you'd still want those metrics. For example, telemetry: we've just added a number of those metrics to the performance channel. That's kind of a once-per-day thing; it doesn't need to be collected every 60 seconds or whatever.
B
I
don't
know
if
anyone
if
anyone
from
rbd
or
gregory
or
someone
is
here
craig,
is
here
like
like.
Let's
say
we
had,
you
know
replication
counters
per
volume,
where
our
view
had
replication
counters
for
bucket
or
something
or
top
10
buckets
strained
version
of
that
that
I
wouldn't
propose
that
I
I
don't
know
of
a
reason
why
that
would
have
to
be
available
in
a
cli
or
a
certain
legacy,
interface
and
manager.
You
know
if
that
that
that
only
showed
up
in
in
a
new
substrate
that
that
would
be
okay.
You
know
for
us.
A: Yeah, I imagine that for at least RBD that's true. I mean, the point of these kinds of metrics is to be able to look at them through some sort of metrics system, which could be Prometheus, could potentially be something else. I'm not sure exactly how generic that sort of interface is, how generalizable to different sorts of monitoring tools.
B: Well, the question was just about whether RBD and CephFS also, whether you have a concept, similar to RGW, where you would like to see metrics for some subset of active volumes, or something along those lines.
D: I mean, maybe. I haven't really thought about it. We discussed it briefly internally, for downstream reasons, how we might design the replication metrics, and at that time we were sort of like, well, either we can do it on demand for a volume, and we would probably just hack something together with watch-notify in that case, where we go ask it; or we do something that's sort of like a time lapse, where things send in updates and we show the most recent status we have from any given thing, the way we do pg stats right now. But it is not top of my mind right now, given the other projects we've been working on and how important this one is to us, in that status.
E: Would this be a good candidate for a Summer of Code thing? We've got some of those coming up, haven't we?
B: Well, sure, and I've got two interns who I can donate to helping with it, and there are other people too who can help do prototyping work. So if we can come up with proposals, or you can propose architectures and things to experiment with or attempt to implement, we can find a variety of ways to resource it.
A
That
sounds
like
we
may
need
to
talk
to
folks
who
know
more
about
prometheus
and
other
immune
systems
to
understand
what
is
feasible.
There.
B: Well, Paul Cuzner seems to know a lot about that, and I'm not sure quite what has happened there. He's convinced that he's kind of stepping on toes, or not able to keep track of what the dashboard is doing. It feels like, unfortunately, people couldn't come together here and talk it out, which I think we kind of need to do. It seems like there's just a lot of crosstalk.
E: I was just going to propose the standard proposal of drafting an email to the dev list, maybe looping in the people that you see as crucial, and seeing what sort of feedback we can get from that. Maybe with a link to the pad that Josh just created as well.
C: Yeah, and the thing is, they are all at priority greater than or equal to five. So are they all really needed or used? And could that priority threshold be raised, or some of the metrics be reduced?
A
So
if
I
understand
your
question,
you're
asking
those
these
are
those
he's
reporting
lots
of
metrics
right
now
are
all
of
them
necessary
to
be
reported
in
monitoring
systems.
A
F: I think I raised the same question with Paul at one time, when he was talking about some of these bottlenecks, and I'd definitely like to see which metrics those are and how they're being used at this point. If there is scope for reducing them, we'd be happy to do that; if nobody's looking at something, there's no point capturing it. But longer term we do need a more solid solution, right? We might just scale up to that many meaningful metrics at some point.

As a short-term solution we can definitely look at which of those are actually useful, but longer term 170 shouldn't be a problem, I guess.

Yeah, absolutely, that is the sort of stuff. You know, it's almost like going ground-up over that period: we needed something somewhere, so we've tried to fit things in at the right priority to get them in. But over time, I understand, this has been happening a lot, so maybe some of them are not so useful and some of them, as you said, can be summarized. I'm happy to, you know, drive that effort.
B: Granular, probably. And I found it pretty compelling, the idea that if we just push them to this node-local thing, and it even summarizes things, we can make it less costly to take snapshots of that on a 10-minute, 20-minute, whatever basis; then it becomes pretty cheap to do time-series databases.

Anyway, I'm happy to hear that the exporter idea is still in play. I'll talk with you, Josh, or anybody who knows RGW or RBD, Paul, whoever's looped in, and we can turn Paul's doc into an email. But then.
A: Anyway, I tried to take some notes during the discussion in the etherpad I linked, so feel free to add more to that too. All right, are there any other topics folks want to discuss today?
H: I wanted to discuss a topic from a centralized-logging perspective, I mean even in upstream: which solution do we want? I was having a discussion with Ernesto and Ashish on a separate private thread, and Ashish has recently submitted a PR for incorporating Loki as a centralized logging solution. But we are exploring it.

We're exploring, you know, three different centralized logging solutions. One is Graylog; I've experimented with that, and we have native built-in support for Graylog already in Ceph. The other logging stacks that we are currently exploring and experimenting with are Loki and Elasticsearch.

So we would certainly like to get more input on which one is the best, I mean, if anybody, any users, are also using Elasticsearch, Loki, or any external logging stacks on their own. I will certainly bring this up again in the next user + dev monthly meeting, but I wanted to get feedback from everyone here in the CDM, to get a thought process started on which one could be the best and most helpful solution for us.

Currently we can integrate with any of the centralized logging solutions that users want; it's as per their choice. But from a Ceph perspective, what would be recommended as the best and standard one? Every centralized logging solution has some pros and cons of its own.

We are even coming up with that list because, for example, for Graylog we'll have to configure inputs ourselves; it doesn't have any dynamic way of, I'll say, picking up those logs. So we have to tell Graylog that... just a second, I will share one limitation of Graylog.

Yes, so, meanwhile, I'm looking for the specifics. Definitely there are certain drawbacks to each and every centralized logging solution, but Elasticsearch provides a lot of customization and better visualization, whereas Loki is much simpler on its own.

I think you have worked on the centralized logging part as well, right, along with Ashish?

One of the main problems I noticed with Graylog was that we have to create or configure inputs before ingesting data. But if we want to push logs to a centralized location from, let's say, multiple random clusters, it's difficult to differentiate them, so we would need to include some metadata.
A: So, looking at this PR, it looks like it's focusing on Loki and Promtail. I'm not really very familiar with these centralized logging systems myself.
C: Yeah, it's basically starting two containers, the Loki and the Promtail ones, and Promtail is the one that fetches the logs from where, I guess, cephadm stores them on the host, under the fsid: it's like /var/log/ceph and the corresponding fsid, and that's where it stores the daemon logs. So that's where Promtail looks. You can look at the PR as well. Promtail is basically responsible for fetching those logs and exposing them on the HTTP server.
A: You were breaking up a little bit there, Avan, but I think I got the idea from looking at the PR a little and from what you were saying: Promtail is effectively watching for updates to the local log files and exposing those over HTTP, and then Loki is scraping those and storing them in its own local database.
C: Yeah, the thing is, in this pipeline Loki doesn't have to pull it; it's a push-based strategy. Promtail pushes it to the Loki host, and that is basically the flow.
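For the curious, here is a toy sketch of that push-based flow: tailing a Ceph log file roughly the way Promtail does, and pushing each line to Loki's HTTP push API (POST /loki/api/v1/push). The URL, labels, and file path are assumptions made for illustration.

import json
import time
import urllib.request

LOKI_URL = "http://localhost:3100/loki/api/v1/push"
LOG_FILE = "/var/log/ceph/ceph-osd.0.log"  # with cephadm: /var/log/ceph/<fsid>/...

def push_line(line, labels):
    # Loki's push API takes streams of (nanosecond timestamp, line) pairs
    # plus a label set that identifies the stream.
    payload = {"streams": [{
        "stream": labels,
        "values": [[str(time.time_ns()), line.rstrip("\n")]],
    }]}
    req = urllib.request.Request(
        LOKI_URL, data=json.dumps(payload).encode(),
        headers={"Content-Type": "application/json"})
    urllib.request.urlopen(req)

def tail(path):
    # Follow the file like "tail -f", yielding new lines as they appear.
    with open(path) as f:
        f.seek(0, 2)  # start at the current end of the file
        while True:
            line = f.readline()
            if line:
                yield line
            else:
                time.sleep(0.5)

for line in tail(LOG_FILE):
    push_line(line, {"job": "ceph", "daemon": "osd.0"})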
H: And I can share a brief overview of the Elasticsearch architecture as well. Can I share my screen, just to give a quick overview of the architecture?
H: So, basically, if we are using Elasticsearch and Logstash, there would be a need to run the Beats daemon on every node from which we are trying to ship and collect logs. Those logs are sent to a centralized logging or Elasticsearch server, and then Kibana can be used to visualize them. That's just from a single-cluster perspective, but we can have multiple clusters as well.

We just have to send a metadata field, and there's huge customization that can be done, and we can display these logs very easily, in a nice way. Normally, you know, in a Pacific environment, we can just point Beats at certain directory locations, and it will send all the log files to the Elasticsearch server to visualize. One of the initial problems I noticed was that it was basically flooding the wire with log messages.
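As a rough illustration of what the Beats agent does in that setup, here is a barebones sketch: read log lines and index each one as a document in a central Elasticsearch server through the standard document index API, tagging it with cluster metadata so multiple clusters can be told apart. The host, index name, field names, and fsid are placeholders, not a recommended schema.

import json
import urllib.request

ES_URL = "http://es.example.com:9200/ceph-logs/_doc"  # standard index API

def ship(line, cluster_fsid, host):
    doc = {
        "message": line.rstrip("\n"),
        "cluster": cluster_fsid,  # the differentiating metadata field discussed above
        "host": host,
    }
    req = urllib.request.Request(
        ES_URL, data=json.dumps(doc).encode(),
        headers={"Content-Type": "application/json"})
    urllib.request.urlopen(req)

with open("/var/log/ceph/ceph-mon.a.log") as f:
    for line in f:
        ship(line, cluster_fsid="<cluster fsid>", host="node1")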
H: I just wanted to share a brief idea, to get the discussion started and to get, you know, recommendations: what would be the best one suited for Ceph? And if anyone is actually using any of these, or has tried any, it would be great to have their opinion as well regarding the same.
A: Yes, this might be an interesting question to ask on the ceph-users list as well, to get some better feedback from folks who have been using these systems in the past. Clearly Graylog was used by someone, since someone went to the trouble of integrating it with the existing logging infrastructure already, and I'm sure other folks are probably using the ELK stack and others. But I haven't really heard much about the pros and cons of each myself, so I think it would be interesting to get more user feedback there.
A: Yeah, I guess for each of these things I'm curious: I know Elasticsearch has pretty advanced query capabilities; do similar kinds of capabilities exist in Loki and Graylog? And how do you then deal with the volume of logs, which could be essentially quite high?
H: Yeah, certainly. So I'll work with Ashish, and we'll send out an email to both the dev list and the user list, to get more feedback across the wider audience as well regarding the same. That will be useful and helpful in understanding whether anybody is currently using any of the solutions we are exploring.
H: Yes, and even in terms of scale, right: if a centralized logging solution is configured, we can certainly debug things in a much faster way as well.
A: I also wonder how much this kind of overlaps with the tracing and observability type of work. Like, a lot of our existing logging is really geared more towards developers, but I think we've talked about the cluster log being more of a central store for human consumption, for end-user and administrator understanding. So it may be interesting to see how useful these kinds of low-level logs are for folks in the field, versus the cluster log and potentially the tracing ideas that we've talked about and are implementing now.
B: It's definitely a concern that we really need conditional flows to reduce the volume, because, as you point out, you just get an astronomical quantity of mostly redundant data, but you still need the atomic bits to understand what's going on. Tracing gives you the possibility of really reducing it to specific flows, and turning on traces, or some traces, when the conditions are observed. That's what seems like the holy grail to me: trying to find ways to get that into something we can use.
I: Some of the things I came across while I was looking into this in the past were ideas for connecting traces with logs. In particular, if we have a trace point, then it can, like, point to where the logs for that particular process are situated. That might help reduce the overhead of processing the logs in a central location, which might be a problem if we are going to look into centralized tracing, since we have a large amount of log aggregation in Ceph itself. So yeah, maybe something like that; but I'm not familiar with anything that's happening for C++ right now.
B: Oh, I only understand parts of that, I think; I'm not sure if you're saying what I think you're saying. But, like, Adam and I at least have, of course, begun to accept that we might have to work on the tracing client side.
A: I'll probably just try to put a little bit of notes there, but I'm sure you guys can fill in more details, if you like.
H: Certainly, we will do that.
A: Right, yeah. And it seems like, in general, one of the advantages of the ELK (Elasticsearch, Logstash, Kibana) and the Loki-and-Promtail approaches is that they don't need integration with the system; they're loosely coupled, just reading text files and sending them elsewhere, essentially. Whereas Graylog is kind of more tightly integrated with the C++ code.
H: Yes, Graylog is more tightly integrated with our current code, and the other two don't need to be tightly coupled. From the Elasticsearch point of view: just like we have a node exporter running on every node, we would have another daemon, Beats, running on every node like that, and it can be done at the user level. It's just that, plus some configuration we have to do in order to send those logs to a centralized Elasticsearch logging server, and some patterns on that particular node, the centralized logging server.
A: Yeah, I think for alerting I'd be kind of hesitant to add more on top of the existing logging infrastructure. Anything that we could identify there, in theory we should be able to surface more directly, as a health error and such. I think if we add too much more, trying to interpret the logs, it's easy to end up with a mismatch in the understanding of the health of the system, if you're trying to interpret it at more than one level.

I think it's going to be valuable for things that are maybe short-term, or where it's harder to actually add a warning about something into Ceph itself. But longer term, I think, we might try to centralize those kinds of warnings into the Ceph health system and not try to kind of bolt them on top.
H: You know, centralized logging has an advantage for multi-site as well, because we have, I mean...
B: Multi-site, you know. So we've got the idea of serializing traces: we've got a prototype that does all these multipart uploads, because they have a series of transactions that can then recover, and there's an idea of doing something similar, where parts of traces could be serialized, actually, into the logs, meaning into their replication logs, and then we can recover them to build spans, to export spans later. But, like, you could do this with the logging system too.

There's no reason you couldn't, but right now the logging system's switches are fairly fine-grained, though they could be more so. But no matter how much you do with the config parameters, you still have a lot of unrelated stuff flowing through. What interests me is: if a set of conditions comes in, and we're sampling them, and we see that particular thing we're looking for, like we're...

...storing into Cassandra, and we didn't have anything like the resources we would have needed to debug that. I'm really excited by the fact that Jaeger seems to have a solid sort of infrastructure for collecting things, but even our logs could do it, if we have a bunch of the trace events and send them wherever we want.
A: Sort of attribute-based filtering, rather than per section of code that the logging happens to live in.
B: Exactly, exactly. As we go forward, that might be a way to really dramatically reduce the volume. It takes away something that logs have, though. The thing about logging is that, at a certain level, you've collected everything, so you can then mine the past for what happened.

We're usually not able to do that, unless we're collecting core files, which is even more immense, and there's still some condition that gates that, right: it crashed, the crash took the machine, or we triggered the core. So we must have figured out that we wanted it, or else it blew up; so even that is conditional. But in debugging, you know, something went wrong, somebody reported it, and we're coming in and trying to fix it.
A: Yeah, and being able to filter the debugging at a much more fine-grained level would certainly help with that being even feasible in a production environment, because today, in a lot of places, turning on the level of debug you need to really diagnose something might be too much to leave on for a sustained period of time.
H: I think, in terms of logging, I mean: is it a good idea to put certain conditions in the code where, if we notice some events, debug can be turned on automatically at the daemon level for a certain time, based on a certain condition? And then, as soon as the situation satisfies that condition, it dumps, say, debug-level-20 logs and then goes back to normal.
B: Yeah, I think that's convergent; I think it's the same idea, just with different outputs and different trace points. Our log points are analogous to trace points in that scenario. Yeah, as we work on this, we could try to unify them.
A: Yeah, it's another thing that we could potentially do more of. Like, today we have several levels of logging in memory versus what's going to disk, and in certain exceptional cases, when we run into some problem, we might be able to do more of, like, dumping what's in our in-memory version of the logs, which is more verbose and has more information than we would normally be storing; kind of a baby step to getting a little more detail about a problem.

There are two levels of logging for each debug setting in Ceph, right: there's the level that's going to the log file, and the level that's being stored in memory, which today is only dumped by an admin command or when the daemon crashes. Interesting.
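A sketch of the shape being discussed here, mirroring Ceph's memory/file debug split (where a setting like debug_osd = 1/20 keeps level-20 detail in memory while writing only level 1 to disk) as a generic illustration, not Ceph's actual implementation:

from collections import deque

class TwoLevelLog:
    def __init__(self, file_level=1, memory_level=20, capacity=10000):
        self.file_level = file_level        # what normally reaches the log file
        self.memory_level = memory_level    # more verbose, kept in memory only
        self.ring = deque(maxlen=capacity)  # bounded: old entries fall off

    def log(self, level, msg):
        if level <= self.memory_level:
            self.ring.append((level, msg))
        if level <= self.file_level:
            print(msg)  # stand-in for the on-disk log

    def dump_recent(self, reason):
        # Today this fires on a crash or an admin command; the idea floated
        # above is to also fire it automatically when a condition is observed.
        print(f"--- dumping recent in-memory events ({reason}) ---")
        for level, msg in self.ring:
            print(f"[{level}] {msg}")
        self.ring.clear()

log = TwoLevelLog()
log.log(20, "op 123 enqueued")         # memory only
log.log(1, "slow request detected")    # reaches the disk log too
log.dump_recent("slow request threshold exceeded")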
B: If you have the ability to do some dynamic code, you know, then you can inject new conditions at various points, and that might be another way to make it very flexible. I mean, it has some scary aspects, but I sort of feel like it's the complexity of some of the debugging scenarios we get pulled into that makes it interesting, just as it did with dtrace.

Yeah, I mean, that could be true too. I would have guessed that that's more like a sink, and some sort of layer above it decides what gets there; but it could be both.

In this case, we looked at Zipkin and LTTng and other stuff for the trace points, and, well, other things before that, kprobes and whatnot. But eager tracing in the C++ environment isn't exactly cheap; it doesn't seem like it's really been thought through completely.
A: All right. Are you familiar with dtrace and how it does its dynamic filtering?
B: Well, yeah, it's just so exciting. I mean, dtrace is the precursor to LTTng, essentially, in open source, and they invented CTF. So it has low-cost trace points that generate CTF events, but then the engine on top is a scripting language, and you attach it to a group of trace points.
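As a toy sketch of that pattern (hypothetical names, nothing Ceph-specific): cheap trace points that record nothing unless a predicate has been attached to their group, so collection can be switched on conditionally rather than per section of code.

_predicates = {}   # group name -> predicate function
_events = []       # collected trace events

def attach(group, predicate):
    _predicates[group] = predicate

def tracepoint(group, **fields):
    # Fast path: with no predicate attached, this is near-zero cost.
    pred = _predicates.get(group)
    if pred and pred(fields):
        _events.append((group, fields))

# Attach a condition, dtrace-style: only sample ops slower than 100 ms.
attach("osd_op", lambda f: f.get("latency_ms", 0) > 100)

tracepoint("osd_op", oid="obj1", latency_ms=3)    # dropped
tracepoint("osd_op", oid="obj2", latency_ms=250)  # recorded
print(_events)  # [('osd_op', {'oid': 'obj2', 'latency_ms': 250})]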
A: That's very interesting, but yeah, maybe something to think about in the future, once we have tracing more in place.
B: So it forces us to solve some distributed tracing, and maybe some freeze/thaw things. And then we've learned this year that we want to try to get some conditionality in as well. So if we can figure out, okay, we're looking for these events to happen, and that turns on trace flows, then we figure out what to do with them.
A: Yeah, absolutely, especially if you're constraining the types of conditionals and not trying to make a generic scripting engine that's super fast and all that. Yeah, exactly.
H: Nothing further at the moment, but yeah, this was a really interesting discussion, and we'll move forward to discuss this in further forums and mailing lists, as discussed.