From YouTube: Policies and Telemetry WG Meeting - 2020-07-01
A
A
B
So, okay, can you present the doc? Yeah.
B
Yeah, okay, thanks. So David and I will present the audit policy. David is an intern, and he started working on this as his intern project. The proposal is to have an audit policy that allows users to control when a log is created and what the log contains. The motivation is pretty clear: customers have to log, so they want some control over what they are going to log, and they want to avoid logging everything.
B
On the one hand, logging everything incurs a lot of storage cost and, on the other hand, it is very hard to identify any useful information in a large volume of logs. So we propose to extend the authorization policy with a log action. Today the authorization policy has allow and deny, and the proposal is to add one more action, called log. Basically, we want to use the matching conditions of the authorization policy to express under what conditions a log should be generated.
B
B
Yeah, okay. Because an audit policy is basically an authorization policy with the log action, it shares all the properties that an authorization policy has. For example, it can be defined at namespace scope or at mesh scope, and the policies are additive, which means a request is logged as long as any log policy matches it. A policy takes effect only within its target scope. If there is no audit policy applied to a given workload, the default behavior is basically to log everything.
B
B
In this case, you can look at the domain claim in the JWT, so that only requests coming from acme.com are logged. That is also a very common use case. You can also write a policy to log everything or to log nothing. This is actually the same as how authorization policy works today. To log everything, you create a policy that applies to the given namespace, in this case ns-one, and you define a rule that has no restrictions.
B
So basically you are saying everything will match this rule, so the policy logs everything. Log-nothing works by creating a policy that applies to the dev namespace but has no rule allowing anything to be logged. So that's the log-nothing policy. And if you set the namespace to the root namespace, you basically apply the policy at mesh scope.
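The matching behavior described in this part of the talk (additive policies, an unrestricted rule that matches everything, a policy with no rules that matches nothing, and the root namespace acting as mesh scope) can be sketched roughly as follows. This is a minimal illustration, not Istio's implementation: the `Rule`, `LogPolicy`, and `should_log` names, the flat attribute dictionary, and the choice of `istio-system` as root namespace are all assumptions made for the example.

```python
from dataclasses import dataclass, field
from typing import Dict, List

ROOT_NAMESPACE = "istio-system"  # assumed root namespace for mesh-scope policies


@dataclass
class Rule:
    # Hypothetical rule shape: attribute name -> required value.
    # An empty dict means "no restriction", so the rule matches every request.
    conditions: Dict[str, str] = field(default_factory=dict)

    def matches(self, request: Dict[str, str]) -> bool:
        return all(request.get(k) == v for k, v in self.conditions.items())


@dataclass
class LogPolicy:
    namespace: str
    # A policy with no rules matches nothing ("log nothing").
    rules: List[Rule] = field(default_factory=list)


def should_log(policies: List[LogPolicy], request: Dict[str, str],
               default_when_unset: bool = True) -> bool:
    """Return True if any in-scope log policy matches the request."""
    in_scope = [p for p in policies
                if p.namespace in (ROOT_NAMESPACE, request["namespace"])]
    if not in_scope:
        # Default stated in the talk: with no audit policy, log everything.
        return default_when_unset
    # Policies are additive: one matching rule in any policy is enough.
    return any(rule.matches(request) for p in in_scope for rule in p.rules)


log_everything = LogPolicy("ns-one", [Rule()])
log_nothing = LogPolicy("dev", [])
log_acme = LogPolicy("ns-one", [Rule({"domain": "acme.com"})])
```

With these definitions, `should_log([log_nothing], {"namespace": "dev"})` is false, while adding `log_acme` next to an empty policy in `ns-one` still logs requests whose `domain` is `acme.com`, because matches are combined across policies.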
B
Yeah. The log entry, as I said, is actually very similar to the previous Mixer log entry. We also have a log entry to define the content of the log, so we can have a source section, a destination section, a request section, and other context. The syntax is mostly copied from the previous Mixer policy log entry. Now I'm going to hand over to David, who is going to talk about how this audit policy is going to be implemented.
C
B
B
E
E
E
The reason we're deploying three filters, and not one or two, is that the RBAC filter currently doesn't really support multiple actions, and in Istio we convert each authorization policy config into a single RBAC config, so there's no easy way to do multiple actions in one RBAC filter. That's why we deploy the log action as a third filter. This is the simplest change, and one action per config is reasonable.
E
The issue is that we need three filters to run in Envoy, but there is potential in the future: Envoy is looking at supporting multiple actions in the RBAC filter. For now this makes sense. The way we're transmitting this decision to the telemetry backends is by setting an access log policy key that each individual backend can read. The key is set based on the matching permission and principal information, and the telemetry backend can then determine whether it wants to log or not.
E
In this doc it says it's a filter state key. We've changed this to a dynamic metadata key, simply because that seems to be more convenient for use in Wasm extensions. One of the other solutions we considered was deploying all actions in one filter and having allow-log or deny-log actions, but that would require a much bigger API change to both RBAC and Istio, and with allow-log and deny-log it's not clear which of the filters we would pair the log action with.
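The hand-off described here, where a separate log filter records its decision under a metadata key and a telemetry backend reads that key later, can be sketched as a small simulation. This is only an illustration under stated assumptions: the key name `access_log_policy`, the filter functions, their matching conditions, and the filter order are all hypothetical and do not reflect Envoy's actual RBAC filter API.

```python
from typing import Callable, Dict, List, Tuple

Request = Dict[str, str]
Metadata = Dict[str, bool]
Filter = Callable[[Request, Metadata], bool]

LOG_KEY = "access_log_policy"  # hypothetical dynamic metadata key name


def log_filter(request: Request, metadata: Metadata) -> bool:
    # Never rejects; only records whether the request matched a log policy.
    metadata[LOG_KEY] = request.get("domain") == "acme.com"
    return True


def deny_filter(request: Request, metadata: Metadata) -> bool:
    # Hypothetical deny rule: block the /admin path.
    return request.get("path") != "/admin"


def allow_filter(request: Request, metadata: Metadata) -> bool:
    # Hypothetical allow rule: only GET requests pass.
    return request.get("method") == "GET"


def run_chain(filters: List[Filter], request: Request) -> Tuple[bool, Metadata]:
    """Run one single-action filter after another; stop at the first rejection."""
    metadata: Metadata = {}
    for f in filters:
        if not f(request, metadata):
            return False, metadata
    return True, metadata


def backend_should_log(metadata: Metadata) -> bool:
    # A telemetry backend reads the key and decides for itself whether to log.
    return metadata.get(LOG_KEY, False)


# Assumed order: the log filter runs first so its decision is recorded
# even when a later filter rejects the request.
CHAIN: List[Filter] = [log_filter, deny_filter, allow_filter]
```

Each filter carries exactly one action, which mirrors the "one action per config" constraint mentioned in the talk; the backend stays free to ignore the key.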
E
So that's about it in terms of the implementation. It's pretty simple, especially on the Istio side: the only thing we're doing in Istio is adding another filter. The Envoy side is a little more complicated because we're adding a whole new action, but Envoy is generally supportive of this. They are already considering adding more actions besides the regular allow and deny actions. So, yeah.
A
So I like the idea of passing in a sort of "should audit" flag, or an "audit this" flag. I'm wondering, and I think I see comments in here from John and some of mine: do we need to define the log entry as part of this, or is this something that makes more sense in a general-purpose telemetry API?
A
We could define it there and, you know, the inside of the authorization policy would reference it somehow. I'm worried about fragmentation of the places where we define telemetry scopes. I don't know how other people feel, but I'm just curious what your thoughts are on that.
B
Actually, when we designed the authorization policy, we were considering the different actions, and the log action was one we had in mind. I actually sent an email to a couple of folks, including Louie and others, asking whether we should use a more general name than "authorization", since we were also going to add audit. The feedback was that, because the main use case is authorization, it is okay to keep this name. I also mentioned this in the comment replying to John.
B
D
B
Yeah, so in Calico the action field can be allow, deny, log, and pass, where pass means it does not do anything. Similarly, at Google we also have the RTCC policy, which has a very similar structure to this. It also has log, and I'm not sure whether it has deny, but log is part of it.
B
A
I guess I wasn't quite clear. I think having something that says "log" makes a lot of sense to me. I was wondering about the log entry CRD part of this. Should that be something we need here, or should it be part of a more general-purpose telemetry API that we reference somehow through this API? Or do we want this to be a separate thing? I'm sort of asking the group and you at the same time.
B
If you are planning a telemetry API that covers the log entry, then of course this could also be part of the telemetry log configuration. Actually, even in the previous API version, the one for Mixer, it was configured that way. Because I'm not sure where this API should belong, for now I placed it in the security API. So, yeah.
C
Right. So I think when to log and what to log are somewhat separate concerns. We already have, well, not this exactly, but something similar: we have a stateful filter which decides when to log, and then the Stackdriver filter decides what to log. The first filter doesn't decide what exactly to log; it just says whether to log.
C
The first filter says, yes, you need to log this entry, and the actual filter that does the logging decides how it is configured and what the configuration is. You're clearly combining those two things, at least at the API level, and saying that you will decide both when to log and what to log. To play nicely with all the ways we can generate a log, it's probably best to leave that to the backends. For example, you can use Stackdriver logging to log like we do today.
B
F
F
F
B
F
So there is a global option where you can say "I want Envoy access logs". It will just be using stdout, and you can configure the format of that log and also the fields in it. Additionally, you can use EnvoyFilter to configure it on a per-listener basis. I am trying to understand whether this works with that, or whether these are two orthogonal things. I see.
B
F
The issue, in my opinion at least, is that logging is very broad. Access logging is used for lots of purposes, whereas audit logging, in the security sense, is for logging whenever you take some security-related action. So I'm a little confused about combining them. How should I express this: there's the access log, which Envoy already does, and I'm thinking you're not stopping it, or that it's orthogonal to this. But then there is the audit logging that is described in this document.
F
B
A
C
So, actually, the mechanism for creating the logs is again separate from the mechanism that decides when a log should be created. We already have two broad mechanisms to create logs: one is Envoy's native access log, and we also have Stackdriver logs, and other logs as well. Those individual mechanisms have more configurability. With Envoy's native access log, you can send it to a gRPC service, or you can log it to standard out, or you can do many other things.
B
Yes. Previously I also had a motivation to distinguish these two. For the audit log, we want to make it more reliable, so that no message can be lost, and we want to make sure all the fields required for audit purposes are logged. You cannot skip a log message in the log information.
B
Yes, so if we want to make this distinction, I think we need strict control. Even for the log entry, we need very strict access control over who can edit it. It has to be a mesh-wide singleton owned by the audit admin; it's not something any user can modify.
F
A
B
C
Just to continue that further, with the separated streams: in order to actually realize, or materialize, them, we have right now, let's say, two different streams. I'm just using the example of Stackdriver, because I use it often, and then the standard-out stream with Envoy. Even though audit logging and access logging have different uses, it looks like we have just one means of producing both. Even in those two different paths, there is just one.
C
F
If we are trying to do this separation of streams by persona, we need to make sure those personas can reliably configure what the reliability and durability of each of those streams is, and where they go, because those will vary widely. But this is a good start. I mean, authorization feels like a good housing for the log action. I think we have to go a few steps further on where to send it and how to send it.
C
I think that makes perfect sense, in which case I'm actually rethinking what I said earlier. If there is a single persona or a single role that controls all of audit logging, then it makes sense for that authority to configure both when and what to log. Then it's not a separation of concerns. Okay. So that's, yeah.
B
G
Because the structure is so similar, the idea is to just reuse authorization policy, when, you know, everyone's talking about the person managing the audit logs being a different persona. More than likely it's not the someone who's saying, "Hey, I only want these people to have access to these endpoints on my services, or these methods", which to me seems like the other persona we should be talking about. It seems like we're sort of commingling these: auditing versus who can access it.
B
I think, to Rob's point: auditing belongs to security, so basically it's the audit admin who does the auditing. You can also consider the audit admin as part of, or a kind of, security admin. They control what content should be audited and when to audit. With ACLs, if you apply access control to decide who can edit this configuration, you would normally assign it to the audit admin, which is part of the security admin role.
B
F
So, Rob, I think there are two things happening here. If you look at other traditional authorization products, or regular security products, whether it's network policies or iptables or even traditional firewalls, that is the place where you configure allow, deny, and whether to log or not. So in that sense, this location feels right.
F
It feels natural. What you are saying is that we don't have a separation of the developer persona from the security persona to begin with, and I don't think that can be fixed by this particular API. It makes it worse, but I think we have a broader problem: we don't have a separately defined developer persona, because authz policies right now are tied to applications but also tied to security. So we have already mingled them.
G
I mean, not necessarily, right? I can go in and create policies for my application in a namespace that expose only particular services outside of my namespace. I can say my application resides in this namespace, and the only people that can access the public endpoints are the ones I set through a JWT policy or something like that. I can control who's coming in and what methods they can see, and I can keep all the other services internal, so that others don't have access.
A
G
B
B
On the next page, you can still separate them. You can say which resources each role is allowed to edit. You can say only the security admin can edit the Istio authentication CRD or authorization CRD; those are only allowed to be edited by the security admin. But if you have namespace superuser permission, of course, you can do anything.
C
Wait, I thought Rob was saying something slightly different. It's that we would be forced to assume that audit is a separate persona, like a cluster-wide superuser admin who decides when to audit and what to audit. But then Rob pointed out that we have actually delegated the CR that you will use to configure this down to the namespace, and now those two are already in conflict, because now in every namespace I can have whatever it is.
C
C
You could have a check that says: if your policy doesn't conform to these fields, then even the namespace admin cannot write the CR. You can do that through CI/CD as well, just through composition, and then it's never an issue; it's correct by construction. So it is possible. I think we do generally need to decide, though, to what extent we want to model everything in the Istio API, and what we want to leave to other tools outside.
C
F
Mandar, I think I like where you're going. So there's only one gap with this API then: as a mesh admin, how do I configure a log policy cluster-wide, so that at the namespace level they are either allowed to override it or not? Currently it looks like this will always be tied to a workload, so you can't configure things at the cluster level if I am a security admin. Is that correct?
B
B
C
C
This, by the way, is a very, very general problem, and in fact it's part of the extension API and its requirements: what are the goals of an extension API? This is another extension that we're now trying to make first class, but the exact same issue is there: to what extent do we go in modeling these use cases in the Istio API, and to what extent do we say that anything more sophisticated than this needs to be handled outside? So this is a very general question.
F
I can make one suggestion here which might be a reasonable middle ground: for the mesh-level permissions, or the mesh-level audit log capabilities, maybe we can embed some of this inside MeshConfig to start with, which gives you the cluster-wide defaults. Then we can still keep the authorization policy, where we can add the log action. And then the idea is, like Mandar was saying, you will have to have either admission controllers or CI/CD where, depending on your policy, the namespace-level policies are allowed or not.
F
F
Yes. I mean, I'm not opposed to a new CRD for the mesh one, but I know you will get a lot of pushback from, I think, the TOC members. So try, but your easiest path here might be: let's embed it within MeshConfig and, if needed, then progress it to a CRD. I can help you with that.
F
B
MeshConfig currently already has too much stuff, and you are not able to separate the fields for different purposes.
F
B
C
But I think this goes back again to the same question. If changing MeshConfig is not a very frequent thing, which hopefully it is not, then you can actually go through the cluster admin and say, "Hey, make this change", and it shouldn't be too bad. And then there is the second question, again about modeling.
B
C
Right, so my suggestion is that MeshConfig or ProxyConfig, even though it doesn't look the best, is actually okay, because we have been doing that. Then, if there is actually pressure to say, "No, take these defaults out of there and put them in some other place", you can always grow the API. But once we add the API, it's difficult to go back.
B
Okay, yeah, but in this case it's not a single field; we are basically adding a bit of structure to MeshConfig. That's the part I'm worried about. You're basically saying the mesh admin can say, "For the admin namespace, apply this auditing condition, and for this dev namespace do something else", and then can say, "For that namespace, delegate to the namespace owner, I don't care". That logic is pretty complicated compared to just adding an additional field.
C
Okay, so I'm actually suggesting we don't do any of that. What I'm suggesting is that there is a default in ProxyConfig or MeshConfig and, just like today, the API object in a namespace can decide completely what it wants to do in that namespace.
C
As far as Istio is concerned, that is still the API. If, as a deployer, you want to add more controls, then you add admission control that says, "Namespace admin, you cannot do this", or "you cannot do this under some conditions". We don't need to do that in our API.
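The deployer-side admission control suggested here could look roughly like the check below. It is a sketch only: the role names, the dictionary shape of the policy, and the specific restriction (namespace admins may not write a log policy with no rules, which would turn auditing off) are hypothetical choices for illustration, and it is not tied to any real admission webhook API.

```python
from typing import Dict, Set, Tuple


def admit(policy: Dict[str, object], roles: Set[str]) -> Tuple[bool, str]:
    """Decide whether a user holding `roles` may write `policy`.

    Hypothetical organizational rule: only a security admin may create a
    log policy with no rules, because that policy logs nothing.
    """
    if "security-admin" in roles:
        return True, "security admin may write any policy"
    if policy.get("action") == "LOG" and not policy.get("rules"):
        return False, "namespace admins may not disable audit logging"
    return True, "ok"


log_nothing = {"action": "LOG", "namespace": "dev", "rules": []}
log_acme = {"action": "LOG", "namespace": "dev",
            "rules": [{"domain": "acme.com"}]}
```

The point of the design is that the restriction lives outside the policy API: the API object in the namespace stays fully expressive, and each deployment decides which writes to reject.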
B
F
So there's one more thing; let's think about that. Also, we are talking about MeshConfig, and for the sake of this argument we should not add it to ProxyConfig, because that would make it a boot-time configuration, and we want it to be dynamic. So, MeshConfig. Or, instead of adding it to MeshConfig, can we go down the route of the PeerAuthentication API, where in istio-system there is a default one that we create, which is mesh-wide? Is that better than embedding it inside MeshConfig?
B
Yeah, actually I was thinking of creating a mesh-level security config. Because peer authentication has just one field, previously we were reluctant to add it as a single API. So we can probably have a security configuration at mesh level, which includes the peer authentication config, like mTLS for the whole mesh. This is a similar consideration: the mesh audit setting for the whole mesh. That's one possibility. Also things like the trust domain: I was also thinking of putting that into the mesh-level security configuration.
B
F
A
H
Yeah, so this is not a major concern really, but I've been testing Pilot at larger scales, on the order of 10,000 proxies connected, and one thing I noticed was that as we scale up, the OpenCensus libraries start to take up more and more CPU. At that scale, I usually see around 12 to 15% of the CPU spent on recording metrics. What I was wondering is: is this normal?
H
Are there things we're doing that we shouldn't be doing that are causing this, and what can we do to improve it, if anything? Specifically, we see certain metrics; the two main ones are these. For every cluster, we record how many endpoints it has, and we do it on every push, so it's kind of like N squared to N cubed, almost, in the number of times you record it. And then we also have one that's recorded on every push.
H
H
Yeah, so that was my next point. The EDS one, I don't think it's really useful, so that one would probably be the first to go. The other one, the push triggers one, I'm actually kind of surprised it uses so much, because we have similar metrics that we record for every push, for every proxy.
H
Like, we record the latency, the convergence time, that sort of thing, so I don't know why this one actually uses more. I feel like there may just be a bug somewhere in that code, or something that's somehow triggering it more than we expect. That one I think is useful, so I would like to investigate it more. But the proper answer could just be, "Yeah, let's just remove the EDS one". I don't know how we feel about removing metrics, that sort of thing.
A
H
So I just tried a large-scale cluster with 10,000 proxies, and it does go down by about 10%. It's not a huge difference; honestly, it's maybe less than 10%. The flame graph shows over 10%, but the actual difference is slightly less. So it is noticeable, but we're not talking about a two-times performance increase or anything like that.
C
I think there is one more place where we can change things. For example, the cluster and number-of-endpoints metric has got nothing to do with pushes. It is actually measuring inputs to Pilot, so it should not be recorded at the output. That way we just remove the "times proxies" factor sitting there. That seems like it should definitely help.
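The saving from recording the per-cluster endpoint count at Pilot's input, rather than inside every per-proxy push, can be estimated with simple arithmetic. The sketch below assumes the worst case in which every push reaches every proxy and every proxy sees every cluster; the function names and the example numbers (other than the 10,000 proxies mentioned in the meeting) are made up for illustration.

```python
def record_calls_at_output(pushes: int, proxies: int, clusters: int) -> int:
    """Metric recorded once per cluster for every proxy on every push."""
    return pushes * proxies * clusters


def record_calls_at_input(pushes: int, clusters: int) -> int:
    """Metric recorded once per cluster per push, independent of proxy count."""
    return pushes * clusters


# Illustrative numbers: 10 pushes and 500 clusters are assumptions.
PUSHES, PROXIES, CLUSTERS = 10, 10_000, 500

at_output = record_calls_at_output(PUSHES, PROXIES, CLUSTERS)
at_input = record_calls_at_input(PUSHES, CLUSTERS)
reduction_factor = at_output // at_input  # equals the number of proxies
```

Under these assumptions the number of recording calls drops by a factor equal to the proxy count, which is the "times proxies" factor referred to in the discussion.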
F
C
H
C
C
F
The current implementation of this metric is completely broken. I looked at it a long time back, and when the Sidecar resource was being added, I wasn't sure what was going to happen. Now this explains it: it's not correct. The second thing is that we can add what Mandar is asking for. I am still not sure how useful that is, Mandar, just because you can get that information very easily from Kubernetes.
H
Yeah, I think that makes sense. I think you guys are familiar with Pilot, but there's the push context at the start of every push, which caches everything we have, and at that point we can easily record the number of destination rules, services, virtual services, et cetera. I think we actually do that with virtual services, although that's probably the wrong place; you probably want that in the actual config reading code. But it is what it is.
F
F
H
F
C
C
And I just added, for reference, what percentage we found: Envoy was taking 17%. However, most of that was on the scraping side. If you look at the link I posted to a flame graph there, you'll see that on Envoy, when you enable all metrics, it takes about 17% CPU, and that's why 10% seemed okay to me.
H
C
H
C
The equivalence was also that I had measured these at ingress, where we get all the clusters in the mesh, and if you enable all telemetry, that's where the explosion is. So again there is some equivalence, but clearly Pilot and Envoy are different, and you're not seeing it in the scrapes. Oh, okay.
C
C
A
So it sounds like we have some action items out of this: we're going to add a new metric, get rid of the one that's not helpful, and look at maybe flag-protecting the ones that are problematic, so they can be turned off in performance-critical situations. Does that sound right? Yeah.
C
One quick question related to that: if we enable and disable metric collection, Prometheus would be okay with it, right? I just want to make sure. It would think that the counter hasn't gone up for a long time, and gauges don't matter anyway, because they're gauges. So it should be fine: if we suddenly stop collecting and then restart, the Prometheus data would still make sense.
A
Okay, well, there's not much time left. There is a community member who was looking at alerting based on Istio metrics and was having trouble creating a doc, so I created the doc for them. I just wanted everyone to take a look at this as we try to beef up alerting. I don't know if anyone wants to add comments or discuss something now, but this is something I think we want to flesh out over time. Let me pull it up. Yeah.
A
I basically cut and pasted this from another Google Doc so that it would be in the drive, so I'm an owner of it, but only because I copied it over. I need to go through it as well, but I just wanted to bring it up in case anyone has expertise in alerting that they want to share or contribute. I think there is an effort building around getting a good doc on alerting and following up with alerts.
C
A
A
No? Okay. The other thing I wanted to bring up is that we had some discussions about the RFC for the Istio telemetry API, so I took a quick stab at merging all the existing configuration into one bigger proposal. I just wanted to bring that to everyone's attention: take another look at what that looks like and add comments and questions, so we can keep iterating on this and hopefully get to a spot from which we can build a larger design.
A
The aim is a format that is more acceptable than the tracing config proposal that was originally there, and to see whether it is heading in the right direction or not. I just want to keep moving that forward as this development cycle moves on, so I'm just bringing it up to mention it. I'd appreciate any and all comments there. Is there anything else anyone wants to discuss in the last minute?
H
A
A good question. I think at a bare minimum we would add them to the best-practices documentation that we have on the website, saying, "Hey, here's some alerting configuration that we recommend; tailor it to your needs, obviously, but this is a good starting set", just to have something we can point people at to say this is how we think Istio should be monitored. I don't know about including it by default anywhere.
C
C
And, by the way, we will be using it. For example, in the release qualification testing and monitoring effort that Chignon is working on, he is also working on alerts, and he will probably consume this and add to it. We will be using it right away internally, so that these alerts are actually used in our own clusters for testing.