From YouTube: Kuma Community Call - May 11, 2022
Description
During this call, we are discussing the following:
- Graceful shutdown https://github.com/kumahq/kuma/pull/4229
- Sidecar customization https://github.com/kumahq/kuma/pull/4241
- Policy matching refinement
- Identify / socialize some 'good first issues'
A
Okay, hello everyone, welcome to the Kuma community call. Please add your name to the list, and feel free to submit any agenda item you would like to discuss today. I think everyone has a link to the doc, so I think we can start. We didn't release anything since the last community call, that's why there are no release updates, but we implemented some cool features, and I think Jacob can give more info on the first one: graceful shutdown.
B
So, with the next release, we'll be fixing graceful shutdown in kuma-dp. Right now there is, well, there was an option to set a drain time in the Envoy sidecar configuration, globally for every proxy, but the problem is that we were not really respecting it in a proper way.
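For reference, a minimal sketch of where that global drain time can be set; the exact key path is my assumption from the kuma-cp configuration, so treat it as illustrative:

```yaml
# kuma-cp configuration (sketch; assumed key path)
runtime:
  kubernetes:
    injector:
      sidecarContainer:
        drainTime: 30s   # how long Envoy drains open connections on shutdown
```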
So, just to show you, I have a Kubernetes cluster on GCP with our counter demo app.
B
So let's set it up with Redis, and let me port-forward the control plane, and also the demo app on the left.
B
Let's watch the pods. So there is a second Redis pod which is offline; going back to this, because that's the most important one. You can see that now we are kind of alternating between two instances already; you can see that the second instance is this one, because its counter is really low. So that was not a problem so far, because we were only taking this endpoint into account for clients whenever the pod was ready.
B
The problem was if we scaled this down from two instances to one instance. Previously, we were kind of relying on the retry policy, on the default retry policy, which was not ideal. So let's see what happens now: if I scale down to one replica, as you can see, almost immediately we can no longer see that other instance.
B
If we go to the list of data planes, we see that this one is offline, and it says that the pod is not ready because it's marked as unhealthy, because the Envoy health check is not passing anymore. So, to give you an overview, what is happening with this change?
B
The flow when a pod is going down on Kubernetes is that Kubernetes first marks the pod as terminating, via the annotation on the pod. Whenever we see this, we immediately mark the Dataplane object, which is like an equivalent of the pod, as unhealthy, which triggers all the configuration changes for the clients to exclude it from all the potential endpoints. Then Kubernetes, concurrently for every container, executes any pre-stop hook, if there is one; I will get to this in a second.
B
Then it sends a SIGTERM signal to every container, and then it waits until the container is no longer running. If the container is still running after the termination grace period, which is set on the pod, then kubelet just sends SIGKILL and gets rid of the container. Then the pod is removed from the system, and that's it. So, the problem with this one, from the kuma-dp side:
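For reference, a minimal sketch of the pod-level setting involved in that flow; the pod name and image here are hypothetical:

```yaml
apiVersion: v1
kind: Pod
metadata:
  name: demo-app                      # hypothetical pod name
spec:
  terminationGracePeriodSeconds: 30   # the grace period kubelet waits before SIGKILL
  containers:
    - name: app
      image: example/app:latest       # hypothetical image
```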
B
What we've done is we've implemented a logic where, whenever we receive the SIGTERM signal, instead of just exiting the kuma-dp container, we mark it as unhealthy but keep running, which means that we can still accept connections, and we will send back the 'Connection: close' header on HTTP, which means that the client can just reconnect to another instance.
B
But what is also important is how we handle the application in this case, because kubelet concurrently executes a SIGTERM on every container, and if the application goes down immediately then, well, Envoy is alive but the application is not, so there is a problem. So there are two options. One option is to just teach the application, so implement the graceful shutdown in the application itself: the application receives a SIGTERM.
B
It waits for a couple of seconds, and that's it. But with Kubernetes there is a second option, which is a pre-stop hook. This is what I implemented for Redis, because Redis just stops after receiving the first SIGTERM, and obviously it's hard to modify Redis to wait. Maybe there is such an option, but the alternative is... let's see the pod.
B
And the Redis container is here and, as you can see, there is a post-start hook up there that's called when the container is started, but I added the pre-stop hook, which means that it's executed before the SIGTERM, so the application will also be sleeping for 30 seconds. This should be in line with the drain time that is set on Envoy.
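A minimal sketch of what that looks like on the Redis container; the exact hook commands are my assumption, and the 30-second sleep is meant to match the drain time mentioned above:

```yaml
containers:
  - name: redis
    image: redis:6                                 # hypothetical image tag
    lifecycle:
      postStart:
        exec:
          command: ["sh", "-c", "echo started"]    # placeholder for the existing post-start hook
      preStop:
        exec:
          # keep the container alive while Envoy drains connections;
          # the 30s should match the drain time set on Envoy
          command: ["sh", "-c", "sleep 30"]
```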
B
So in this case we have time to exclude this endpoint from all the clients. So this is how you do a graceful shutdown.
B
I was considering adding this pre-stop hook by default to every container if we are injecting Kuma's sidecar, which would be really cool, but I feel like that's probably dangerous, because, well, there is no guarantee that there is a sleep binary in the container and so on. So that's Kubernetes!
B
I can also show you quickly how it works with Universal. So let me run the control plane with in-memory storage. Okay, you see a beautiful animation.
B
Setup. Okay, so I have no data plane proxies. Let me start my data plane proxy. I have a very simple script that generates a dataplane token and starts the data plane with an inbound defined, okay.
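For context, a minimal sketch of such a Dataplane resource for Universal mode; the name, address, and ports here are hypothetical:

```yaml
type: Dataplane
mesh: default
name: demo-app-1            # hypothetical name
networking:
  address: 127.0.0.1
  inbound:
    - port: 8000            # port the sidecar exposes
      servicePort: 8080     # port the application listens on
      tags:
        kuma.io/service: demo-app
```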
So if I start this, I can now see a data plane proxy here, and it's not yet healthy, because the status is refreshed every 10 seconds or so.
B
Okay, now it's online! So now I'm sending a SIGTERM by pressing Ctrl-C, and in the logs I can see that connections are being drained, and here I can see that the data plane is offline, which means that it won't be included for the clients. Of course, on Universal you need to take care of the graceful shutdown of your application yourself; there is probably no hook equivalent you can rely on.
B
Maybe there is, but it depends on the kind of environment you are running this on. And there you go: it stopped, and so did the data plane.
B
Yeah, as you could see with the counter demo, I already scaled down, and you could not see any connection problem or, like, you know, waiting for a connection like you could see before. We even have an end-to-end test where I first remove the retry policy, then send requests in the background, scale up and down, and check that we don't drop any requests.
C
I had a really quick question about the last bit, where the data plane didn't disappear and just showed offline, because I've seen that recently as well. Do we just keep them for a certain amount of time? Because, I mean, it is useful to see that there was one and it's now offline. I just couldn't find where that was implemented.
B
Oh, no, sorry, sorry, it's probably in... well, either way, we listen for different signals, so it can be either one of them. So.
E
Yeah, can I click that, or do you want me to share my screen?
E
Yes, it's, like Ilya said, not a big, not a long proposal, but it generated a lot of discussion. So yeah, if you just want to click on the Files tab, and you can just view... and then click on those three dots all the way to the right (I'm pointing with my finger), yeah, all the way to the right, next to 'Viewed', yes, and then you can 'View file': it'll actually render the markdown.
E
So this is what we settled on; we started implementation on it yesterday. It should get into this release. If you scroll down a little bit... it probably makes the most sense to talk about the example. So, yes, for anyone who's not familiar with the discussions we've had and what this is all about: people want to be able to modify the init container and the sidecar container that we inject into a pod.
E
So what we settled on was a JSON Patch format of doing this. So we have a CRD, which will be a new CRD, called ContainerPatch.
E
That CRD will be matched to workloads via an annotation on the workload. So, you know, either on your deployment or your pod or whatever, you'll set an annotation and you'll reference the name of this ContainerPatch. And then, for the sidecar container and the init container, there is, as you can see here, a YAML version of JSON Patch. JSON Patch actually has an RFC; you can go to the RFC and see exactly how it works, and it's actually really simple.
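A sketch of what such a ContainerPatch could look like, based on the proposal as described here; the field names are my reading of the draft, not a final API:

```yaml
apiVersion: kuma.io/v1alpha1
kind: ContainerPatch
metadata:
  name: container-patch-1        # referenced from the workload annotation
  namespace: kuma-system
spec:
  sidecarPatch:                  # JSON Patch ops applied to the injected sidecar
    - op: add
      path: /securityContext/privileged
      value: '"true"'            # values may need to be JSON-encoded strings
  initPatch:                     # JSON Patch ops applied to the init container
    - op: remove
      path: /securityContext/runAsUser
```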
E
You have a path to what you want to change, and then you have an operation: you can add, remove, modify, copy. I think 'add' is what we've gotten the most requests for. In fact, what you see right here is what we've seen the most of: people want to be able to modify the security context, so this should be a pretty simple way to modify the security context.
E
You just say: for the security context, you want this value in the security context, and then you can say what you want to do with it. What else is there to say about this? Oh yeah, some of the discussion that we had was: in the past we had other ways of matching this, and we were thinking about what you do if there are conflicts or multiple matches. So what now?
E
What we actually settled on, and you can see down here, is an example of what it looks like on a deployment. We have this annotation, and you just have the patches you want to apply, in order. So I think this is pretty easy to reason about: you just say, well, I know what my init container looks like, and I know what this patch and this patch do. I think anyone who's used, you know, git or diff or anything, they know.
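And a sketch of the workload side, again assuming the annotation key from the proposal; the workload name and image are hypothetical, and patches are applied in the order they are listed:

```yaml
apiVersion: apps/v1
kind: Deployment
metadata:
  name: demo-app                       # hypothetical workload
spec:
  selector:
    matchLabels:
      app: demo-app
  template:
    metadata:
      labels:
        app: demo-app
      annotations:
        # assumed annotation key from the proposal; patches apply in order
        kuma.io/container-patches: container-patch-1,container-patch-2
    spec:
      containers:
        - name: demo-app
          image: example/demo-app:latest   # hypothetical image
```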
E
I think it's a pretty common way to think about applying patches. And then we'll also have a default reference in the Kuma config to a ContainerPatch, which will allow you to apply one by default too. And I guess one other thing, yeah...
E
Basically, we're doing nothing special for multi-zone. We're not going to validate that your configuration is semantically the same everywhere, but we should be able to validate that you've generated a legit container. I think that makes sense.
E
That was part of this question as well, yeah. So what we settled on was that we want to log the error. So let's say something goes wrong, like, I'm sorry, like the patch doesn't exist, right? So what we're going to do is log the error, and we're actually going to abort the injection, and the reasoning is that you could have...
E
You could be applying security features on top of your container, and if we can't do that, it seems foolish to fail open, because you might fail open by incorporating something into the mesh that someone was not ready to have incorporated without the security features. So you get a log error, and the pod does not get the sidecar and init container injected.
A
Yeah, okay, thank you, Paul. Okay, next one; I won't take much time, I just wanted to update everyone. We had a proposal for new policy matching, for how we want to change this, and essentially we decided to slightly change what we thought. So this is the first version of the proposal, which you can check if you're still not familiar with it, but maybe it's already outdated. Still, we decided that we need to support routes better, especially because we have the experimental MeshGateway and it has MeshGatewayRoutes.
A
So we want to treat routes as first-class citizens in our policies. That's why we decided to think about policy matching one more time. Essentially, I have a small draft here, and it was mostly inspired by the new Gateway API in Kubernetes, which has the policy attachment concept with a target reference, and we want to utilize this.
A
We now want to have this idea of direction, because some policies have 'from'. I mean, if you configure them on the inbound, it means that something comes from somewhere; for example, with traffic permission you deny or allow traffic from all other proxies in the mesh. Some other policies, like circuit breaker, will have 'to' instead of 'from'. So for traffic permission, you can put another traffic permission for the service 'backend', and essentially we will merge this all together.
A
This concept is similar to what we already had, so we'll merge this into a list of rules: this rule comes from the mesh-wide policy, this one comes from that policy, and then we will generate configuration based on these rules. The available targetRefs are Mesh and so on; essentially you can attach your policy to the whole mesh, to a service, to a group of proxies, to just one proxy, to a MeshGatewayRoute, and to an HTTP route. So these are the main reasons why we're rethinking this.
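A sketch of what a policy could look like under this draft, using a mesh traffic permission as an example; the field names follow my reading of the proposal and may have changed since:

```yaml
type: MeshTrafficPermission
mesh: default
name: allow-all-to-backend
spec:
  targetRef:              # the proxies whose configuration this policy modifies
    kind: MeshService
    name: backend
  from:                   # 'from' direction: who may reach those proxies
    - targetRef:
        kind: Mesh        # i.e. all other proxies in the mesh
      default:
        action: Allow
```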
D
Yeah, one point that is an improvement on this side is that the targetRef at the top of the policy always identifies, like, it's a selector of the set of data plane proxies whose configuration you're actually modifying, whereas before the selectors would sometimes select things, but the configuration would actually be applied somewhere else.
D
It's applied on the data planes that are reaching out to you, right, which was not very intuitive for figuring out what is being modified by your policy, whereas now it's very explicit: it's whatever is in this top targetRef.
A
Okay, let's move forward. I think, John, you wanted to bring up the topic of good first issues.
F
All right, this is the wonder of Zoom on Linux, right? It just keeps flipping my mic. Anyway, first world problems, right? Yes, so yeah, hi, I'm from D2iQ. We build, like, multi-cluster stuff, and one of the things I'm exploring is how we build this around a mesh. We actually have kind of a home-built networking topology at the moment, and yeah.
F
I want to move to something a bit more all-purpose, so that we can adapt it to one of our use cases. So I've got this all set up already using Kuma, in nice, simple networking topologies, where there's connectivity between egress and ingress everywhere. And yet a number of our use cases, a number of our customers, have places where they can't connect between clusters, or only have a one-way connection: it can go from cluster A to cluster B, but not back the other way, in terms of initiating connections.
F
We use something called chisel, actually, to do like a SOCKS5 proxy in the middle, to reverse tunnel and run TCP through that. And I was thinking: can we do this with the service mesh the way it is? So, do you need any more info on that kind of use case? It's effectively to allow traffic from the egress in cluster A to the ingress in cluster B, when the only way for connectivity to be initiated is from cluster B to cluster A.
F
So, in our use cases it's literally just TCP, so that would satisfy it. What I was kind of thinking was how you could initiate, like, a reverse tunnel, by connecting something in cluster B to, effectively, the egress in cluster A, or maybe some other service in cluster A, and changing the ingress address for cluster B in cluster A, or zone, you know, as we call it.
F
Route to this tunnel endpoint, which would then route it through, and then you'd effectively be ingressing into zone B. I have no experience with Envoy, and I have no thoughts about how we'd do this, so yeah: any ideas, anything we could play with, I could hack on, any pointers.
C
Actually, I think the only thing is, because we probably have to use Envoy, whether Envoy allows some kind of tunneling by default. So the only point I would start with is to see if there are any existing tunneling protocols or tunneling solutions working with Envoy at this point, because I have not heard about any yet, but...
F
Yeah, so for our particular case, we have a centralized cluster that is kind of the management cluster for a federation of clusters. So in our particular use case it is actually only all these clusters connecting to one cluster. You're right that when you get into more of the... you could work with a hub-and-spoke model in that scenario, you're right.
F
If you wanted to do this with more, then there would need to be connections to all the ones that you can't connect to. But I'd expect that to be... maybe we could introduce another kind of routing cluster in the middle, so another proxy layer in there. So if you're going into a particular different data center or subnet where you don't have routing, you could route it through a particular single cluster, and there's like a...
D
Right, so maybe the next step on that is, if you could start an issue about what this would look like, what the requirements are and stuff like that, and then, you know, maybe we can all try hashing out ideas together. I think in itself it's a feature that makes a lot of sense, and that's one of the places where Kuma is supposed to be good. So yeah, I think, yeah.
F
Yeah, nice, okay, I'll open an issue and detail that, yeah, because with what we're kind of working on, I think this could be a pretty awesome generic multi-cluster management kind of scenario. This would actually change the multi-cluster management plane for Kubernetes, which is an unsolved problem at the moment, and I think meshes are the answer. So I think this could be a really big thing, to be honest.
D
Nice, yeah, cool, yeah. I guess we had a similar thing with multi-zone, right? As in, with global, we started out having the two-way connection: the zone would reach out to global and global would reach out to the zone, and that was a mess for multiple reasons. And I guess it's very similar for you, right? It's exactly the same case.