From YouTube: 2022-07-05 CNCF TAG Observability Meeting
Description
Metrics at Scale: Issues in OTEL Collector, Prometheus by Kevin w/Google
C: Good. Today might be a really quiet day, we'll see; there was no TOC meeting today and we're coming off the July 4th holiday. So I wouldn't be surprised if there's like you, me, and one other person. I haven't heard from Alolita either. But how was your Fourth?

A: It was good. We actually... our fireworks got canceled, unfortunately, for unforeseen reasons. So... oh, really?

A: No, no, no, no rain. I don't actually know what it was. We actually went to like an amusement park near us, and they were supposed to have fireworks on site, and then like an hour before they're like, due to technical reasons we have to cancel them. I'm like, well, that kind of stinks, because that's why we came. How about you?
C: So my girlfriend and her three kids are in Colorado for almost two weeks, so they come back on Friday, or the third Friday, and they're like having a vacation with family. And my oldest, he's 17 and will be a high school senior next year. He was accepted to Rhode Island School of Design, RISD's pre-college program. Oh wow, yeah, that's awesome, there you go. On Saturday, I think, and I pick them up the first week of August, so it's like me and the dog.

C: Yeah, I'm pretty psyched about it. I'm actually plowing back through the notes from like a year and a half ago, maybe, because I had identified way back then some SIG roles or some TAG roles that we really need to fill. You know, both tech leads, as I talked to you about, a chair, but there's a bunch of other stuff; you know, TAGs are free to make their own roles.

C: So, for example, we need someone to work on social media and communications, like in an intentional way. It's an actual job, it takes time. You know, we need artists and/or design folks to work on a TAG website and/or other, you know, collateral and artifacts and things like that. So I remember I captured a lot of this a long time ago, and so I'm being a little bit lazy here and hoping to copy-paste, because none of it's changed. But a couple others too, yeah.
C: Well, we are now two minutes over, so I'll give people another minute or two to filter in. But like I said, given the holiday and the proximity, I wouldn't be surprised if we have a really short meeting today.

C: Let's see what else has been new. But I should say, for the recording: this is the first-Tuesday meeting of the TAG Observability, the CNCF-sponsored group and channel and stream. So please don't do anything that would be a violation of the code of conduct.
C: How are you? I was telling Steve that, as we're, you know, one day after July 4th and no TOC meeting today, which always happens just the hour prior to this meeting, yeah, we might have very light attendance.

D: Yes, I think most folks, in the US at least, are still out on break.

E: Hey, does it work? Yes? Okay, good.

E: Yeah, so my team specifically is GKE metrics, so we collect metrics in our Kubernetes clusters, mostly system metrics, in like Prometheus format, and from like the kubelet, container metrics, and then send them to Google monitoring for customers to see. And right now, yeah, we do that using OpenTelemetry, and we're slowly moving away from OpenTelemetry to do more optimizations on memory efficiency, because we noticed that parsing the Prometheus format is expensive, and we're doing a few experiments on how to handle metrics more efficiently.
D: No, that's cool, that's very interesting. I mean, you said that you're moving away from OTel, or... yeah, I see. Is that mainly because... yeah.

E: Right now, on every GKE node there is a container called gke-metrics-agent, and it's a slightly customized OpenTelemetry Collector that finds all pods on the node that export Prometheus metrics, like all system pods.

E: It scrapes every one of them using OpenTelemetry and the Prometheus receiver, and sends the metrics. And the problem is, it's like this one single thing and it scrapes a lot of metrics, and with the amount of metrics and the scrapes involved, we're getting to a point where it consistently goes out of memory. At some point it doesn't scale well enough, and we've had other issues where, like, OpenTelemetry seems to be still changing a lot. So like, we updated OpenTelemetry...
E: No, the reason the metrics were gone was changes in OpenTelemetry, like in the Collector specifically, I think the exporter. And like, we use batching to send 200 metrics at once, and that doesn't always work correctly. So we ran into a few problems, and mostly that, like, the Collector is great for specific use cases. But in our case we're collecting so many metrics in one process, and it's no longer...
E: Not... whatever, yeah. And we want to split it up into a mix of sidecars and other options, and right now we're working on the sidecar to scrape Prometheus metrics, convert them into the monitoring format, and send them away. And that's where we came across OpenMetrics, because Google Cloud Monitoring relies on the start time to be there for cumulative metrics, and right now OpenTelemetry... well, not in the current version.
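For readers following along: Google Cloud Monitoring's cumulative (counter) metrics need an explicit start timestamp, which the plain Prometheus text exposition does not carry. A minimal sketch of the usual workaround is below, tracking the first-seen time per series and resetting it when the counter goes backwards; the types and names are illustrative, not the GKE agent's actual code.

```go
package main

import (
	"fmt"
	"time"
)

// cumulativeTracker assigns a start time to each cumulative (counter) series.
// The first scrape of a series sets the start time; if the value ever drops,
// we assume the target restarted and reset the start time to "now".
type cumulativeTracker struct {
	start map[string]time.Time // keyed by series identity (name + labels)
	last  map[string]float64
}

func newCumulativeTracker() *cumulativeTracker {
	return &cumulativeTracker{start: map[string]time.Time{}, last: map[string]float64{}}
}

// observe returns the start time to attach to this data point.
func (t *cumulativeTracker) observe(series string, value float64, now time.Time) time.Time {
	st, seen := t.start[series]
	if !seen || value < t.last[series] { // first point, or counter reset
		st = now
		t.start[series] = st
	}
	t.last[series] = value
	return st
}

func main() {
	tr := newCumulativeTracker()
	t0 := time.Now()
	fmt.Println(tr.observe(`http_requests_total{code="200"}`, 10, t0))                   // start = t0
	fmt.Println(tr.observe(`http_requests_total{code="200"}`, 15, t0.Add(time.Minute)))  // start unchanged
	fmt.Println(tr.observe(`http_requests_total{code="200"}`, 2, t0.Add(2*time.Minute))) // reset detected, new start
}
```

OpenMetrics can avoid this guesswork by exposing a _created timestamp alongside each counter, which is presumably part of the interest in the format mentioned here.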
D: That would be super interesting, because, you know, when... I mean, I worked on the Prometheus interop, you know, for the OTel Collector and OTLP specifically, and, you know, one of the issues, as you said, is, you know, handling scaling up the Collector to be able to handle...

D: You know, sharding, as well as, you know... not sidecars, sidecars are used typically, but, you know, really StatefulSet support. So that was something that we built out, but we did actually make sure that it was OpenMetrics compliant, so from a, you know, format and compatibility standpoint.

D: That certainly exists, but... I think, in the scaling that we did, I did see that, you know, there was scaling of metrics; especially when you have like a stream of metrics coming in in a really large number, there's still work to be done there for the Collector, so...
E: That's like... I'm also joining the Prometheus maintainer sessions now, because we want to, like, maybe help there and implement things. But like, what we noticed is, for example, re-labeling is very expensive. Like, we just... yeah, we recently forked the Prometheus library, because, like, we know what metrics we want, so we can already...

E: If we see the metric name, we can already stop parsing and re-labeling if we don't care about the rest; that's just, like, an optimization. And like, in our case, we collect metrics from many containers, and we noticed that if a single container has like a cardinality explosion or sends too many metrics, that's enough to lose metrics for everybody else. And so that's why we're kind of splitting things up a little, to reduce the blast radius we have.
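A rough illustration of the "stop parsing early" idea: if the scraper already knows which metric names it wants, whole lines of the exposition text can be skipped before any label parsing or relabeling work is done. This is only a sketch over the plain text format using the standard library, not the fork described above; the allow-list contents are invented.

```go
package main

import (
	"bufio"
	"fmt"
	"strings"
)

// allowed is the set of metric names we actually keep; everything else is
// skipped before any label parsing or relabeling work is spent on it.
var allowed = map[string]bool{
	"container_cpu_usage_seconds_total":  true,
	"container_memory_working_set_bytes": true,
}

// metricName cuts a sample line at the first '{' or space,
// e.g. `name{label="x"} 1.5` -> `name`.
func metricName(line string) string {
	if i := strings.IndexAny(line, "{ "); i >= 0 {
		return line[:i]
	}
	return line
}

func filterExposition(payload string) []string {
	var kept []string
	sc := bufio.NewScanner(strings.NewReader(payload))
	for sc.Scan() {
		line := sc.Text()
		if line == "" || strings.HasPrefix(line, "#") {
			continue // HELP/TYPE comments ignored in this sketch
		}
		if allowed[metricName(line)] {
			kept = append(kept, line) // only now would full parsing be worth paying for
		}
	}
	return kept
}

func main() {
	payload := `container_cpu_usage_seconds_total{pod="a"} 12.5
some_noisy_metric{pod="a",id="0001"} 1
container_memory_working_set_bytes{pod="a"} 4096`
	for _, l := range filterExposition(payload) {
		fmt.Println(l)
	}
}
```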
D: Have you proposed some of these changes back to the Collector SIG in OTel?

E: We have a regular sync with David Ashpole, yeah, and we're proposing some of those changes. But in general, I think... well, I'm not sure if what we're doing is actually the target use case for the Collector. So that's why we're kind of branching off in this other direction. Also because, for the Prometheus metrics we scrape, like, we don't need most of the logic from the receivers; we only need their parsing for the text format, and hopefully proto soon, we'll see, with OpenMetrics and so on.

D: Have you looked at the operator in OTel, for the OTel Collector? Because that's where that scaling work was done. And, you know, one of the areas, as you rightly said, is that there's a lot of stuff you do not need from the Prometheus, you know, dependencies, and optimizing and making a lower, you know, smaller footprint there would actually be ideal, because it's actually carrying a lot of baggage in that whole process, to my understanding.
E: Right now, though, like, that's why we have a fork of OTel, OTel Collector contrib, and Prometheus, because we had to make optimizations in the Prometheus library and then in the receiver library, and it's a whole chain. And the optimizations we did were kind of... like, OTel right now takes the Prometheus library, so you pass it the Prometheus config, and any optimization we're doing on the Prometheus side you can't really do, because it's nested in three libraries, yeah.
D: I think there was another issue, and this was something that has been ongoing, which is how to ensure, from a compatibility... a configuration-compatibility standpoint, full compatibility between the two formats of configuration, right.

D: So there is a proposal for a, you know, standardized or more sophisticated configuration manager, which is actually handled with the remote... there's a remote agent work group in the OTel SIGs, and that's where, you know, this agent management, especially configuration management specifically, is, you know, being discussed. Right, that said, again, there were... there is...

D: There are actually a few design proposals that were made in terms of improving the configuration management for OTel and making it more sophisticated and fully compatible, with, you know, users not having to change an existing Prometheus configuration, for example, and the format, and it being interoperable with the OTel configuration injection. So again, I think David's also aware of it, but yeah.

D: And Steve, it might be worth, you know, kind of having a more detailed discussion, because I think that that work got a bit less prioritized by the maintainers. So we should definitely, you know, figure out how we can get some of these requirements aligned, because these are known issues, they're not new. And Kevin, we'd love to, you know, make sure that OTel does address them, because it's not only you, it's actually many other users who are having the same issues, and it's the same...
C: You've also touched on sort of an operational, pragmatic, real-world kind of issue. That was my experience as well, running in particular multi-tenanted clusters, you know, where you're using namespaces to get sort of pseudo-isolation. You know, ultimately, you know, the purpose-built clusters worked better for us, but even in that scenario we found the need, you know, to have many Prometheuses.

C: You know, one for the cluster itself, and then, you know, teams, or whatever... whatever self-service, you know, or team-ownership scheme is at play. You know, that was the granularity that we needed, to give everybody their own Prometheus, because cardinality bombs would take out not only...

C: Well, there's actually something we used. I don't know if it's still called this, but it's called the cardinality bomb detector, and it literally looks at the number of series and some other things as a rate, and then sets up Alertmanager configs, you know, if something is blowing up. But, but again, in practice, if you don't have those blast radii, you know, then, you know, it blew up, that's great, and... that's...
E: Something we're kind of... that is one part of the list of things we're building into this new scraper, whatever it is, yeah. We can, like, measure how many points we get for a single metric, and we know how many to expect. So if we get too many, we're like, okay, hey, this is thousands... a thousand, like, I don't know. We had, like, bugs where there was, like, one metric stream that was like 250,000 lines, and usually it's like 10...
E
and
we're
like.
We
can't
tell
like
okay,
let's
start
dropping
metrics
instead,
because
that's
we
alert
the
exporter
and
like
say,
hey,
you're,
sending
too
many
metrics
and
there's
a
bug
and
we
start
drop
and
fail
gracefully,
which,
right
now,
with
the
current
architecture
of
open
telemetry,
I
think,
would
be
very
hard
to
get
into
and,
like
our
team,
developed
kind
of
the
stance
of
open
telemetry
is
great,
but
it's
way
too
general
for
the
optimizations.
We
want
to
have.
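To make the "drop and fail gracefully" behavior concrete, here is a hedged sketch of a per-metric sample budget: if a single scrape returns far more series for one metric than expected, the excess is dropped and the offending exporter is flagged, instead of the whole pipeline falling over. The metric names, the 10x rule, and the budget values are invented for the example.

```go
package main

import "fmt"

// sample is one parsed series from a scrape: metric name plus rendered labels.
type sample struct {
	metric string
	labels string
}

// budget is how many series we expect per metric name; anything beyond
// expected*10 is treated as a cardinality bug in the exporter (made-up rule).
var budget = map[string]int{"workqueue_depth": 10}

// applyBudget keeps samples within budget and counts what it drops per metric,
// so the caller can alert on the exporter instead of losing everyone's data.
func applyBudget(samples []sample) (kept []sample, dropped map[string]int) {
	counts := map[string]int{}
	dropped = map[string]int{}
	for _, s := range samples {
		counts[s.metric]++
		if exp, ok := budget[s.metric]; ok && counts[s.metric] > exp*10 {
			dropped[s.metric]++ // fail gracefully: drop the excess, keep the rest flowing
			continue
		}
		kept = append(kept, s)
	}
	return kept, dropped
}

func main() {
	var in []sample
	for i := 0; i < 250; i++ { // a runaway exporter: 250 series for one metric
		in = append(in, sample{metric: "workqueue_depth", labels: fmt.Sprintf(`{id="%d"}`, i)})
	}
	in = append(in, sample{metric: "up", labels: `{job="node"}`})
	kept, dropped := applyBudget(in)
	fmt.Printf("kept %d samples, dropped %v\n", len(kept), dropped)
}
```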
D: I think, I think to that point, and sorry, Steve, I just wanted to complete this thought: Kevin, there was a discussion, and there is actually an open discussion, on having a lighter-weight, you know, optimized Collector, and I think that, again, I'd invite you to, you know, kind of chat with us about it. Because, again, we totally understand that, you know, not all use cases fit one...

D: You know, snapshot, and there are obviously... that's the reason why distributions also exist downstream, because, you know, you can actually shed the number of processors, or tune the, you know, components which are part of a general release, if you will, of the Collector, and that's one of the ways it's been handled. But, that said, again, I really would love to see a, you know, high-cardinality-optimized version of the Collector, with a smaller footprint.
C: And there's really two scenarios there, right? There's the high cardinality all by itself, which can be challenging, but, you know, it was a straightforward optimization problem and a resourcing problem. But the sudden cardinality bombs or spikes... you know, I found myself wishing... maybe this is already in OTel, I don't know; at the time we were running, you know, Prometheus directly and using remote write, the OTel Collector wasn't quite as grown up.

C: This was like three years ago, but I found myself almost wishing for like a circuit breaker pattern inside the scrapers, right, so that for a particular target, and there's different ways you can implement it, you know, you could do it based on deltas or absolutes or, you know, all manner of rules. But, you know, if you could have a per-target, you know, "this is just too many, I'm going to just stop and lose metrics for this one target," rather than try to drink from the fire hose and tank everything, by the way, too.
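The per-target circuit breaker being wished for here could be sketched roughly as below: once a target's recent scrapes keep exceeding a series limit, stop scraping or ingesting that one target for a cool-down period rather than letting it take the whole collector down. Prometheus does have per-scrape limits such as sample_limit, but the richer delta- or rate-based behavior described above is the part this illustrates; the thresholds, field names, and cool-down are invented.

```go
package main

import (
	"fmt"
	"time"
)

// breaker trips a single scrape target when it repeatedly exceeds a series
// limit, and stays open for a cool-down period so one noisy target cannot
// starve everyone else. All numbers here are purely illustrative.
type breaker struct {
	seriesLimit int
	maxStrikes  int
	coolDown    time.Duration

	strikes   int
	openUntil time.Time
}

// allow reports whether the target should be scraped/ingested right now.
func (b *breaker) allow(now time.Time) bool { return now.After(b.openUntil) }

// record feeds the series count of the latest scrape into the breaker.
func (b *breaker) record(seriesCount int, now time.Time) {
	if seriesCount <= b.seriesLimit {
		b.strikes = 0 // a healthy scrape resets the strike count
		return
	}
	b.strikes++
	if b.strikes >= b.maxStrikes {
		b.openUntil = now.Add(b.coolDown) // lose this one target, keep the rest healthy
		b.strikes = 0
	}
}

func main() {
	b := &breaker{seriesLimit: 10000, maxStrikes: 3, coolDown: 5 * time.Minute}
	now := time.Now()
	for i := 0; i < 3; i++ {
		b.record(250000, now) // a runaway target, scrape after scrape
	}
	fmt.Println("scrape allowed?", b.allow(now)) // false: the breaker is open
}
```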
C: What do I do, right? Right. You know, and if we did have that, with some sane defaults, you know, as well, or at least the potential for them, then the complexity of, like, what we had to do, which is like every namespace gets its own Prometheus, you know, the cluster gets its own Prometheus... But then people might want to do rules that branch across Promethei, so now you need like a multi-tenant, cross-tenant...

C: ...you know, query capability and/or scraping capability, if you wanted to do recording rules to record rates, for example, of, like, you know, something in your app combined with something from the cluster, right. So it very quickly gets complicated, and in a sea of painful RBACs. So I think, like, an elegant fix a little bit upstream on the collector side might really have a lot of knock-on benefit. But, Kevin, definitely...
A: I think there's several things, right? Like, there's definitely some missing features. I totally agree with the circuit breaking, by the way; like, we should totally implement something like that to offer kind of flexibility. But you're spot on, Kevin: like, today the OTel Collector is kind of generic, like, just trying to support a bunch of use cases.

A: We are definitely going to head down the path of needing to optimize, like we're going to have like an edge instance or like a high-cardinality instance or what have you, because these use cases... sure, I mean, Google has massive scale, totally appreciate that, but some other users of this system are also going to have some similar type behaviors that need to be handled, and just having a generic solution, the Collector's not going to be great.

A: I want to just second what Alolita was saying, which is: we would love to chat, understand those use cases better, figure out how we can better support you. And even if it's not today, even if it's like in the future, how do we get there? Because some of this, I think, is applicable to many users out there.
E: And, like, one more thing I think I can share is... we've spent... I think, like, I joined Google in October, and since I joined I've been mainly working on the collector and optimizing memory stuff, and one thing we noticed is that we're getting to a point where we've optimized a lot and now we're mostly fighting the garbage collector, and so...

E: We're looking into, for metrics, like collecting, parsing, sending, whether we would be better off using a non-garbage-collected language to do that. Because, like, the metrics-collecting part, it's a bunch of strings and numbers, and then we throw them away and collect them again. And that's kind of an experiment I'm into right now, because, like, at a certain scale you notice that after a scrape, even if you use 60 seconds, it takes longer for the garbage collector to clean up behind the metrics it just collected.
E: So that's kind of an experiment. I don't know if that was ever discussed with OTel, to do the collector in... like, yes.
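On the garbage-collection point: one common mitigation in Go, short of moving to a non-GC language, is to reuse scrape buffers and parsed-sample slices across scrape cycles, so that each 60-second scrape does not allocate and immediately discard the same memory. A minimal sketch with sync.Pool follows; this is just the standard-library idiom, not the team's actual approach.

```go
package main

import (
	"bytes"
	"fmt"
	"sync"
)

// bufPool hands out reusable byte buffers for scrape payloads, so each scrape
// cycle does not allocate a fresh multi-megabyte buffer for the GC to reclaim.
var bufPool = sync.Pool{
	New: func() any { return new(bytes.Buffer) },
}

// handleScrape copies the payload into a pooled buffer, stands in for parsing,
// and returns the buffer to the pool for the next scrape cycle.
func handleScrape(payload []byte) int {
	buf := bufPool.Get().(*bytes.Buffer)
	defer func() {
		buf.Reset()      // keep the capacity, drop the contents
		bufPool.Put(buf) // hand it back for the next scrape
	}()
	buf.Write(payload)
	// ... parse buf.Bytes() into samples here ...
	return buf.Len()
}

func main() {
	fmt.Println(handleScrape([]byte("container_cpu_usage_seconds_total 12.5\n")))
}
```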
D: Yeah, yeah, actually, Kevin, I mean, these are all known issues. It's more that, I think, there are a couple of things here. One is, again, it's on the list for the OTel Collector, you know, to address all of these areas, but it's also, you know...

D: We've worked with Dave Ashpole here, you know, pretty closely, as well as some of the other engineers who are involved. But I think there's also... your input there, as well as other feedback, would be useful, because, I mean, this is something that even we are seeing, and, you know, so it's something that I definitely would love to see more prioritization on, and... please.
E: Yeah, and one thing about that, very quick: we found something surprising in a profile a while ago. I think it's Prometheus in the OTel... the Prometheus library imports, for its discovery stuff, a lot of third parties, and in this case I think it was an AWS library that always allocated, I think, one or two megabytes of memory just by being imported as Go code. Which, like, we now do, like, magic to get rid of dependencies.

D: Because there's a lot of dependencies on the Prometheus side that automatically, you know, you... your...

C: You mentioned that you had forked a few of the pieces of OTel. I was curious what your experience was, you know, in running with those forks. Like, for example, was it easy for you to replicate CI? Was that straightforward? And, you know... and also, like, what's your predicate for pushing these changes back up, so that you're not having to work off a fork? So, you know, I'm kind of curious, like, for someone new who wants to make a change like what you've done.
E: Well, the forks are not on GitHub; they are in an internal system, because of, like, build security, when you have the code around. But I mean, the forking was clone, push; the CI part we didn't replicate.

E: I don't know if the OTel Collector fork has a CI; we don't maintain that. But the forks are really there, mostly, especially the Prometheus and the OTel ones, to give us a way to quickly test and develop fixes and build images internally, until we know that it works, and then contribute it back upstream, because importing the dev versions into our builds is a pain.
E: For the Collector, we have a separate repo and we build the Collector ourselves; we have our own CI and testing and stuff, and yeah, so we don't use the OTel Collector binary, we have built our own binary, because we add stuff to it. And for the other parts, like, my process right now is: I have all the forks I need checked out locally, and when I test something I overwrite the go.mod file to point to local paths instead of a version, and that works.
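The go.mod trick described here is the standard replace directive: while a fix is being tested across the forked repos, the module file points at local checkouts instead of released versions, and the replace lines are dropped once the change is upstreamed. A hedged example follows; the module name, versions, and local paths are placeholders, not the actual build setup.

```go
// go.mod (excerpt) -- point the OTel and Prometheus dependencies at local
// checkouts while developing a fix; remove the replace lines afterwards.
module example.com/metrics-agent

go 1.18

require (
	github.com/prometheus/prometheus v0.37.0 // placeholder version
	go.opentelemetry.io/collector v0.55.0    // placeholder version
)

// Local working copies sitting next to this repo (hypothetical paths).
replace github.com/prometheus/prometheus => ../prometheus

replace go.opentelemetry.io/collector => ../opentelemetry-collector
```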
E: You know, I think what was more challenging was whenever we updated the OTel libraries. There's a lot of connected libraries, like there's the Collector and then there's the contrib, and for me, like... a few times we had it where we updated the OpenTelemetry Collector, I think the one was from 0.35 to 0.37 or something, and a lot of stuff went weird, and we didn't, like, understand why, and then you have to go through a lot of diffs and understand, like, across three, four repositories.

E: In that case, no, because it was a very subtle change, and we... but it, like, it's also because we relied on implementation, probably: we relied on labels, and they were no longer there at some point, or renamed in some way, and we didn't notice.
E: One of the custom things in our collector is, we define in our config what metrics we accept and, based on that, what labels we accept. And if we see a metric with a label we don't know, we just drop it. So there was a label added to every metric, and as it's not on our allow list, we just dropped them all. And, like... but it's hard to debug, and we didn't know, and, like, David helped us a little and we found it out, but it took a while.
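As a concrete illustration of how a newly added label can silently wipe out every series under a strict allow-list, here is a tiny sketch; the label names, including the one "added by an upgrade," are invented.

```go
package main

import "fmt"

// allowedLabels is the per-metric label allow-list described above: a series
// carrying any label name we do not recognize is dropped entirely.
var allowedLabels = map[string]bool{"pod": true, "namespace": true, "code": true}

// keep reports whether a series (represented by its label set) survives the filter.
func keep(series map[string]string) bool {
	for name := range series {
		if !allowedLabels[name] {
			return false // one unknown label drops the whole series
		}
	}
	return true
}

func main() {
	before := map[string]string{"pod": "a", "code": "200"}
	// After a library upgrade, every series suddenly carries one extra label:
	after := map[string]string{"pod": "a", "code": "200", "http_scheme": "http"}
	fmt.Println(keep(before), keep(after)) // true false: everything gets dropped
}
```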
D: But I mean, again, you know, if you run into issues like this in the future, just ask, because, I can... you know, again, there is a collector maintainers channel and you can just ask.

E: Right now we ask David; I don't know if we should also ask somewhere else.

D: Yeah, yeah, I mean, typically...

D: I mean, if you're on CNCF Slack, I can bring you the names, but you can reach out to... you can reach out to Alex Boten, you can reach out to Bogdan, you can reach out to Juraci. There are a few maintainers that you can reach out to, so, in case you ever, you know, have an issue where you are seeing dependency changes or any kind of breaking changes.

C: I think, I think you had mentioned that you and/or your team were looking for ways to contribute to the technical advisory group, you know, this body itself. Do you have any idea about the kind of things you'd like to do? I mean, I think it is the case that we have a pile of work streams defined but not resourced, and, you know, the way that TAGs work, like, if there's something that you or your team want to work on...

C: ...that's in scope for the TAG, and there isn't an issue for it or something like that, then we'll just make one, right? It's a... it's a community-driven group.
D: Yeah, and then especially, I think, Kevin, as you called out, you know, there are specific use cases, even for the OTel Collector, that actually can be defined in this forum, for example, and, you know, then worked on collaboratively with OTel, with the, you know, maintainers, right. So again, leverage it, because that reiterates, you know, the necessity, and...

C: ...if they're pragmatic, you know, anyone-would-hit-this-at-scale, or even not at scale, kind of things. I mean, I've noticed over the last year, you know, there's a dramatic disparity between, like, you know, the 900-plus people in our channel and the, you know, generally less than a dozen that come every couple of weeks, right. So I feel like there's a huge untapped reservoir of people who are lurking, right, and so if we engage in work that meets people where they are, we might find ourselves...

C: ...you know, with more resourcing and more contribution from the community, as well as more feedback to vet some of these, you know, ideas, you know, in this case for OTel, but just generally for observability. So...

D: Right, right, and, I mean, going back to, Kevin, your point of, you know, when you're really looking at high-volume streams of metrics, you know, what architecture actually works. Because even Prometheus has, you know, issues, and again that needs to be addressed for a, you know, better user experience; that needs to be addressed in all of these stacks, no matter what, right. So those discussions, you know, actually need to be had across projects and, you know, really addressed, to scale to, you know, what is needed today, right. Because again, these stacks have evolved over time.
E: Yeah, and, like, I think that's mainly my... well, mainly my focus. We are only four people, so we split things up a little; mine is, yeah, performance and efficiency... performance, efficiency, scalability.

E: So I, or we, noticed this, well, ongoing push for Prometheus as a standard, which I'm mostly on board with; I'm excited for OpenMetrics and would like to work on that, because I think it's an opportunity. Because one thing is, Prometheus, like, I've been using it for I don't know how long, and it never seemed scalable; like, that's why we see products like Mimir or Thanos or whatever. And I, like, I wanna see how this TAG is an opportunity to...

E: ...multi-dimensional... like, my biggest issue right now is, for example, the Prometheus format is the text-based format, where, like, you have a lot of strings, like the metric names alone. It's a lot of overhead that might be worth optimizing away if possible. Like, that's...

E: ...where I'm curious about the ups and downs of the proto format; I know it was dropped and now it's coming back. And, yeah, like, optimizing the processes that collect, scrape, send, store metrics, yeah, because that's where my pain is right now. That's kind of where, over the last few months, I built my opinion, and now I'm looking at how to use it somewhere.
D: I think that feedback, as well as, you know, how we actually look at it at an overall scale, where some of these issues are standardized... because performance benchmarks, for example, can be standardized, right; I mean, that's depending on, you know, what the definition of different categories of scalability is, right. And those are, you know, those are some of the areas that both Prometheus as a project, as well as OTel as a project, have been interested in, and...

E: Oh, and, like, just for me personally, I see, like, with the ecosystem also narrowing down on Prometheus, I just want to make sure we still keep thinking about, I don't know, the, like... what comes after. Like, yeah, Prometheus is awesome, I really like it, but I don't want to stay stubborn on, okay, it's Prometheus, and we stop thinking about the next innovation for metrics, or no?

D: No, and, and absolutely, because I think that, you know, as the platforms that we are pulling data from for telemetry, for observability, change and they evolve, right, the formats need to be compatible, the scalability, you know, requirements and stability guarantees need to accommodate that, right. So it's something that's always a work in progress.
C: Also the protocols: like, some of the things you said are analogous to some of the things that have been coming out of the OTel profiling discussions, particularly around, and here again Google has a stateful protocol, versus a stateless protocol, so you lower the overhead of, like, these repeated strings over and over and over. So, you know, as we do look at this ecosystem...
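The overhead of "repeated strings over and over" can be made concrete with a tiny dictionary-encoding sketch: send each distinct string once, then refer to it by index, which is roughly what stateful or dictionary-based protocols do. This illustrates the general idea only, not any particular protocol discussed here.

```go
package main

import "fmt"

// dictionary assigns a small integer to each distinct string so that repeated
// metric/label strings can be sent as indexes instead of full strings.
type dictionary struct {
	index map[string]uint32
	words []string
}

func newDictionary() *dictionary {
	return &dictionary{index: map[string]uint32{}}
}

// ref returns the index for s, adding it to the dictionary on first use.
func (d *dictionary) ref(s string) uint32 {
	if i, ok := d.index[s]; ok {
		return i // already transmitted once; reuse the index
	}
	i := uint32(len(d.words))
	d.index[s] = i
	d.words = append(d.words, s)
	return i
}

func main() {
	d := newDictionary()
	names := []string{
		"container_cpu_usage_seconds_total",
		"container_cpu_usage_seconds_total",
		"container_cpu_usage_seconds_total",
	}
	var encoded []uint32
	for _, n := range names {
		encoded = append(encoded, d.ref(n))
	}
	fmt.Println(encoded, "with", len(d.words), "distinct string(s) actually transmitted")
}
```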
C: I couldn't agree more that, you know, we want to be looking forward, not, you know, saying, well, we've got this one thing and it's good forever, because then you get surprised. But I think inclusive to that discussion is also, you know, the protocols and their overhead and the nature of them, and how to do that in a way that doesn't have a one-size-fits-all kind of...

D: ...approach. Here we...

C: We do need to transition to a couple of bookkeeping items, but I can do them really quick; I just need to make some announcements before we hit the 50-minute mark, so we've got another 10 minutes. There are a couple people who joined while we were talking. Do you guys want to say hi, or was there anything specific you wanted to put on the agenda, either for this meeting or a future one?
C: All right, well, I'll just do... do we want to... is it okay to transition here? I don't want to be too abrupt, because this is actually, yeah...

D: I think, Kevin, again, just to wrap up: Kevin, thanks again for, you know, kind of discussing these different areas. As an action item, Steve and I'll follow up from, you know, the OTel side. We have to make sure that we, you know, have not missed any of these requirements, as well as, you know, work towards figuring out how we can accelerate some of these changes that, you know, are already on our radar but should be made...

D: ...you know, sooner than later. And also, we would love to, you know, have a little more time from you, if possible, on the OTel side, you know, to actually review design proposals as well as provide feedback on, you know, some of the code that's been developed already. The Collector SIG may be a bit noisy, but I do welcome you to, you know, kind of join in there; it's on Wednesdays at nine Pacific time. And, you know, happy to also help pull in the Collector leads to be able to...

D: ...you know, discuss with you. We work very closely with David Ashpole, but, you know, he also has a lot of different projects he's working on, and with Josh Suereth also, but, you know, Josh is also kind of spread thin, right. So please get more involved yourself, and, you know, ping me on Slack; let's, you know, work with Steve to figure this out.

D: Yes, there's the Prometheus work group, right, that we have, and we work with OpenMetrics very closely there; most of the, you know, active maintainers on OpenMetrics join in there. It's on Wednesdays at eight a.m. Pacific time; there's a work group meeting every week. So, you know, if that's a time that works for you, please, you know, please do join, tomorrow or next week, whatever works for you.
C: Cool, cool. I wanted to make just a few brief sort of announcements, because, again, looking at the YouTube video download and viewing numbers, you know, who's actually watching these, it's quite a bit larger than the folks that can be present presently.

C: So one piece of that is, our meeting times are almost... are always, you know, the first and third Tuesdays at noon Eastern, which is great for the various folks in the Americas, but we probably should move to a cadence where, you know, every other meeting is a little bit more suitable for Asia-Pacific and Europe. So, so that's just one thing.

C: You know, I think maybe next... in two weeks we can kind of make something more formal around that, but I'm going to put that out there. As far as nominations go, we've kind of been running lean, if you will, in terms of roles, so I intend to nominate Steve Flanders as a co-chair; co-chairs have a two-year duration. And as some folks might not know Steve, and he's here today, I want to give him an opportunity to introduce himself and just say who you are.

C: Maybe for those that are not familiar. And I'll say I met Steve close to a decade ago, virtually; I don't know if he remembers, but when he was working on Log Insight, we were both in a similar, in a similar part of VMware, but...
A: Yeah, so I'm Steve. I've kind of been in the monitoring and observability space for more than a decade now. I've worked in the logging space at VMware; I did APM, or distributed tracing, at a stealth startup called Omnition that was acquired by Splunk three years ago. Now I'm at Splunk, and I'm also working on the metric side of the house, so Splunk Infrastructure Monitoring, which was previously SignalFx.

A: At Omnition I was working on the OpenCensus project and the OpenCensus Service, which is now the OpenTelemetry Collector, and I've been involved in basically OpenCensus and OpenTelemetry, both projects, since very early days, since the very beginning. So I'm very, very passionate about what's happening in this space; I think it's really innovative and necessary.

A: I have lots of customer conversations across many different companies that I've been at, and kind of feel the pain of how hard it is to actually do observability in general, but also how to do it in a way that's kind of, like, vendor-agnostic, because you don't really want to rip and replace; that's a very heavy lift. People are just looking for a generic solution, and then kind of flexibility through configuration, so they can leverage whatever back ends they want, whether it's open source, proprietary, local, SaaS, whatever, right.

A: So I really think that the OpenTelemetry project has a lot to offer. There's a long way to go, still quite early days, but I'm really proud of the work that everyone's been kind of pitching in, and given that we see broad adoption across cloud providers, end users, observability companies and the like, I think it really points out a pain point. So it's an area that I'm very passionate about. I used to be a pretty extensive blogger...

A: I haven't been doing that as of late, because life has been busy, but I'm hoping to get back to it here soon; that would be great. But, yeah, definitely passionate about observability in general. I love the CNCF and I would love to be helping out in this TAG. So thanks for your consideration.
D: Awesome, that's, that's great, Steve, thank you again. And I think, you know, going forward, I think, you know, our meetings, as you know, Steve, and I know you've joined...

D: ...are twice... every first and third Tuesday, you know, at 9:00 a.m., but I think, as Matt was suggesting, we might also... we should at least address one meeting in APAC times, once every month if not, you know, so alternating, if you will. And I think 4 p.m. Pacific might be a good time; although it's 7 p.m. on the East Coast, it's still, still doable, right? Because again, it seems to work out.
C: I'm on the East Coast, I believe, as is Steven, and, you know, we've enjoyed, I think, two and a half years of noon meetings. So, you know, I think...

C: And the second one, he's not here today, he had a conflict, but Henrik Rexed: he runs a couple of podcasts, one is called Is It Observable, that's been running for some time now, and he covers a lot of issues around observability and OpenTelemetry. And also there's something that he and Michael Hausenblas have been doing a little bit, called, I believe, open source news.

C: He's been vocal and helpful, and is a great example of someone from the community that's just persistently been here and is helping out in a variety of forums. So he should be here, I think, in two weeks; so that was the other one. And then, lastly, I put in a couple of links; we talked about it before, about other sorts of roles that would make a lot of sense, I think. So...

C: If anyone knows anyone looking for an excuse to contribute, we either have one or can make one. So that's really all I have for today, and it's 12:49; I think for the first time in about a year, at least, the agenda items have been covered by the 50-minute mark, so I'm going to take that as a win. But if there's anything else folks want to chat about, I think the CNCF Zoom doesn't cut off for another 12 minutes or so.
D: No, I think, I think this is a good discussion, and Kevin, thank you again for, you know, kind of bringing in some of these areas. We definitely love to see, you know, OTel and OpenMetrics and Prometheus continue to work closely in terms of metrics support, you know, and handling the different use cases. So let's, let's continue to work together, and I'll also reach out to the Prometheus and OpenMetrics communities.

D: Typically Richard joins in from OpenMetrics, but I would love to see other folks also joining in for the larger discussions, yeah.

D: I pinged you on Slack, Kevin, so let me know if you saw it. Thank you. Sorry, thanks, Matt, thanks for setting this up. Thank you, Steve. Thank you. Chat later, folks, take care.