From YouTube: Kubernetes SIG Apps 20220627
A: Okay, hi everyone, welcome to the June 27th SIG Apps meeting. I'm your host, Janet, and with me on the call as co-host is Maciej. Today we have two things on the agenda, and the first one is that Mitch is going to talk about Kubernetes progressive rollouts. I made you a co-host, so you should be able to present.
B: Thank you. I just noticed that I have to exit Zoom in order to give it sharing permissions, so I will be back in 30 seconds.
B: Awesome, thanks. So, quick intro: hi everybody, I don't think most of us have met. My name is Mitch Connors; I'm a software engineer at Google. Most days I work on the Istio project, which has been a lot of fun, but I've been playing around with progressive delivery in Kubernetes for quite some time as part of my experiments in how best to support Istio and how to enable safe rollouts, and I wanted to share some findings, and a recommendation, here with SIG Apps around progressive rollouts.
B
Let's
go
to
the
next,
so
if
you're
not
familiar
with
progressive
rollouts,
they
are
similar
to
the
existing
deployment
concept
in
kubernetes
you're,
going
from
one
version
of
software
to
another,
but
rather
than
shifting
as
soon
as
a
new
instance
becomes
available
spin
up
a
new
instance
tear
down
an
old
instance,
you
spin
up
some
instances
and
intentionally
move
slowly.
B
You
move
a
little
bit
of
the
work
onto
that
set
of
new
instances
and
rather
than
simply
waiting
for
a
health
or
liveness
signal,
you
watch
how
well
those
instances
do
the
work
and
I'm
being
intentionally
vague
with
that
word
work
for
many
kubernetes
apps.
The
work
is
handling
network
traffic
for
others.
B
It
might
be
processing
jobs
out
of
a
queue
or
something
along
those
lines,
but
at
a
high
level,
what
we're
doing
is
we're
shifting
progressively
shifting
work
onto
new
instances
and
then
deciding
whether
to
proceed
not
based
on
some
arbitrary
health
signal
from
the
app
itself,
but
on
how
well
it
does
the
work
on
an
http
service
you're
going
to
be
looking
at
latency
at
error
rates
at
success
codes.
If
those
things
are
improved
or
better
than
the
previous
version,
or
not
worse
than
some
arbitrary
threshold.
You
continue
rolling
out.
B
You
shift
more
work
onto
the
new
workloads
over
time
until
eventually,
hopefully,
you've
either
got
100
of
your
work
on
the
new
workload
or
you
found
that
there
was
something
bad
with
this
rollout.
Some
reason
that
it's
not
doing
work
as
well
as
the
old
one
and
you've
rolled
back.
That
rollout
has
effectively
failed.
B: This is roughly what the workflow of a progressive rollout for a service looks like. Here the work is traffic. We're going to run two versions of an application concurrently. Well, sorry: we run one in a stable state, and when a new version is specified, we schedule the new version. We don't deschedule the old version yet; they're running concurrently.

We increment traffic to the new version, then check some measure of the health of the work that has been shifted. If it's healthy, we check to see if we're done, whether we're at 100% or not. If we're done, then we unschedule the old version, and now the new version is latest. If we're not done, then we just increment again: we're going to keep incrementing until we get to done, or until our is-healthy signal says no.
B: Kubernetes does have a concept of rollouts today that is not entirely dissimilar to a progressive rollout. The biggest difference is that the signal for progressing a rollout today in Kubernetes is liveness and readiness of the pod, which is a very simple on-or-off flag set by the pod. Usually we're asking: is the pod able to respond HTTP 200 to a request at port 80 /status, or something along those lines? At that point, the pod hasn't done any work.
B: So you can sort of see the progression today: you get a pod scheduled, its startup probe succeeds, then its liveness probe succeeds. At that point, we are going to start tearing down the old version.

Eventually, later, its readiness probe will succeed, at which point it starts serving traffic. So the "doing work" is two full steps after we've made the decision to progress with the rollout. What we'd like to do is delay that decision to progress with a rollout until traffic has actually been served, work has been done, and we can get a better idea of whether the pod is doing work well or not.
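For reference, these are the three probes being described, as a minimal sketch (the image, port, and /status path are placeholders):

```yaml
apiVersion: v1
kind: Pod
metadata:
  name: probe-demo
spec:
  containers:
  - name: app
    image: nginx:1.25            # placeholder image
    ports:
    - containerPort: 80
    startupProbe:                # gates the other probes until the app has booted
      httpGet: {path: /status, port: 80}
      failureThreshold: 30
      periodSeconds: 2
    livenessProbe:               # kubelet restarts the container if this fails
      httpGet: {path: /status, port: 80}
      periodSeconds: 10
    readinessProbe:              # adds/removes the pod from Service endpoints
      httpGet: {path: /status, port: 80}
      periodSeconds: 5
```

Note that none of these probes says anything about how well the pod is doing real work, which is exactly the gap the talk is pointing at.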
B: There's a lot of prior art. I see Alexis has joined us today; his team at Weaveworks has done a great deal to move the state of the art with their Flagger tool, and that's where most of my experience with progressive delivery comes from. There's a lot to like about Flagger: it uses existing Deployments, creating two side-by-side Deployments in order to get both versions executing concurrently.
B: It uses the horizontal pod autoscaler, and then creates this new CRD called a Canary that gives you all the information you need to check on the progress of a rollout. You can define what healthy looks like, how quickly it's going to take steps, and how much traffic is shifted in each step. Actually, the way that I got involved with this is that one of the ways Flagger can work is built on top of Istio for traffic shifting.
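A minimal Flagger Canary looks roughly like this, sketched from Flagger's public docs (the resource names and numbers are illustrative):

```yaml
apiVersion: flagger.app/v1beta1
kind: Canary
metadata:
  name: my-app
spec:
  targetRef:                      # the existing Deployment Flagger manages
    apiVersion: apps/v1
    kind: Deployment
    name: my-app
  service:
    port: 8080
  analysis:
    interval: 1m                  # how often to evaluate and take a step
    threshold: 5                  # failed checks tolerated before rollback
    maxWeight: 50                 # stop shifting at 50%, then promote
    stepWeight: 2                 # shift 2% of traffic per step
    metrics:
    - name: request-success-rate  # built-in metric check
      thresholdRange: {min: 99}
      interval: 1m
    - name: request-duration
      thresholdRange: {max: 500}
      interval: 1m
```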
B: So, the Canary CRD is a little bit undesirable in that Flagger supports a lot of traffic-shifting implementations under the hood. It's incredibly flexible, which is fantastic, but that does mean that the CRD has a lot of fields that are dedicated to a single implementation of traffic shifting. We'll come back later to why SIG Apps would not need to implement so many different underlying support mechanisms for the traffic shifting, which explains why I think the timing is right for a progressive delivery API.

Intuit has also developed a progressive rollout thing, sort of a service, called Argo Rollouts.
B: The downside there is that because it doesn't use Deployments, if you want to onboard to Intuit's Argo Rollouts, you're going to have to update all of your Deployments to now be Rollouts. It's a pretty small update, because they share pretty much identical schemas, but it is a pretty big change in terms of onboarding. Whereas with Weaveworks you don't need to change any of your existing Deployments: you just add a Canary CRD on top, and you magically get those progressive rollouts.
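For comparison, converting a Deployment to an Argo Rollout is mostly a matter of changing the kind and adding steps; a sketch (the image and weights are illustrative):

```yaml
apiVersion: argoproj.io/v1alpha1
kind: Rollout                      # was: apps/v1 Deployment
metadata:
  name: my-app
spec:
  replicas: 5
  selector:
    matchLabels: {app: my-app}
  template:
    metadata:
      labels: {app: my-app}
    spec:
      containers:
      - name: my-app
        image: example/my-app:v2   # hypothetical image
  strategy:
    canary:                        # replaces the Deployment strategy field
      steps:
      - setWeight: 20
      - pause: {duration: 1m}
      - setWeight: 50
      - pause: {duration: 1m}
```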
B: However, the fact that Argo has copied the Deployment API for Rollouts signals to me that the Deployment API is actually the right place for this to live. They were able to take a pretty much off-the-shelf Deployment, add a field or two, call it something new, and achieve progressive rollouts.

I'm not sure that anyone's using this other tool; I haven't seen any development in the last three years. But it is one of the other available Kubernetes tools in this space, and it clones the Deployment API with two specs within the spec field. I can't remember what it's called, but it's like an old spec and a new spec, to define what you're coming from and what you're going to. That's not especially declarative, right?
B: When you use an API, you want to say what you want the state to be, and ideally Kubernetes would take care of annealing towards that state. You don't need to specify the from-state with a top-level declarative API. So there are a few reasons that's not desirable, but I think we can take the best from all of these solutions, learn from what we see in them, and come up with something that is excellent for Kubernetes users.
B: This is what that might look like. We already have a Deployment spec strategy field. There are two values currently supported: Recreate, which does what you would think (it deletes one ReplicaSet and then creates a new ReplicaSet with the new version), and then, actually, what is the other one? I can't remember the other value. The default for strategy spins up a number of new instances and then tears down a number of old instances once those new instances are marked as ready. This would add a third value to the strategy field, and I've named it here "progressive rollout". I'm really bad at naming things.
B: So please don't get hung up on the particulars there, but the idea is to introduce this third enum value that allows us, within the strategy field, to specify that we want a progressive rollout. Similar to other values in the strategy field, it has its own settings under the progressive rollout heading, and the settings should look fairly familiar: maxSurge and maxUnavailable are carry-overs from existing strategies. I really should have written down the name of that default existing strategy.
B: But then there are also these new fields: interval, threshold, maxWeight, step. These define how to take steps towards the new version: every minute we're going to shift two percent of traffic, up until we get to fifty percent of traffic, and then we consider it a success and roll it to a hundred.
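Put together, the proposal sketched on the slide might look something like this. To be clear, this is hypothetical: the enum value and field names are the speaker's proposal, not an existing Kubernetes API.

```yaml
apiVersion: apps/v1
kind: Deployment
metadata:
  name: my-app
spec:
  selector:
    matchLabels: {app: my-app}
  template:
    metadata:
      labels: {app: my-app}
    spec:
      containers:
      - name: my-app
        image: example/my-app:v2   # hypothetical image
  strategy:
    type: ProgressiveRollout       # hypothetical third enum value
    progressiveRollout:
      maxSurge: 25%                # carry-over from RollingUpdate
      maxUnavailable: 0            # carry-over from RollingUpdate
      interval: 1m                 # evaluate and step once per minute
      threshold: 5                 # failed health checks before rollback
      maxWeight: 50                # at 50% of traffic, promote to 100%
      stepWeight: 2                # shift 2% of traffic per step
```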
B: RollingUpdate, I think, is the strategy that I was unable to think of, so thank you. For what it's worth, there are three already, so this should be a fourth. Oh, is there? Okay, I missed that.
B: So these fields here are describing the steps that will be taken in order to get to production, and then there's another set of fields, which we'll cover in a minute, that define health: what does it look like for this new version to be doing work in a healthy way? And again, that's configurable by the user.

Before we look at those fields, though, there are a couple of advantages here. One is that the user is only declaring the desired end state and the steps, or how quickly to anneal and reconcile to get to that desired end state. They're not defining what the current state is, and they're not responsible for shifting anything from a new spec to an old spec, or anything along those lines.
B: Also, if you're using Deployments today, you could onboard to this API just by updating your rollout strategy type (sorry, your Deployment strategy type) to be progressive. There's no need to install a new CRD. There's no need to write to a new rollouts API or canary API. It's all packaged up in the Deployment API that our users are already familiar and fairly happy with.
B: So those are some of the advantages here. The other bits of API that we need, again, are the ways to define what success looks like in a progressive rollout. So in this case, we're going to look at request success every minute and look for 99 percent success. We're also going to look at request duration every minute and make sure that we're never seeing a value over 500. If either of these standards is violated during the rollout by the new workload (by the traffic on the new workload), then the rollout fails.
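As YAML, those health criteria might extend the hypothetical strategy above roughly like this (the field names are illustrative and deliberately echo Flagger's metric checks):

```yaml
  strategy:                        # under the Deployment's spec
    type: ProgressiveRollout       # hypothetical, as above
    progressiveRollout:
      metrics:
      - name: request-success-rate
        interval: 1m
        minValue: 99               # percent of requests that must succeed
      - name: request-duration
        interval: 1m
        maxValue: 500              # milliseconds; exceeding this fails the rollout
```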
B: It's going to roll back and mark itself as in a failed state. This is fairly heavy in terms of the amount of information that we would need to stuff into the strategy field. Ideally, there would be a CRD, or sorry, a resource, specific to service health, say an SLO resource that we could write to. I've talked to SIG Instrumentation; it doesn't sound like anything like that is on the horizon for them, but they did point me to horizontal pod autoscalers, which already support custom metrics as a decision driver for autoscaling.

I would think we could more or less take the implementation from horizontal pod autoscaling custom metrics and, with a few modifications for how thresholds work, use that for progressive rollouts. Again, it's still a lot of data to get stuffed into your Deployment; it might be ideal to have this defined elsewhere.
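The HPA precedent being referenced: autoscaling/v2 can already drive decisions from a named custom metric served by a metrics adapter (the metric name and target below are illustrative):

```yaml
apiVersion: autoscaling/v2
kind: HorizontalPodAutoscaler
metadata:
  name: my-app
spec:
  scaleTargetRef:
    apiVersion: apps/v1
    kind: Deployment
    name: my-app
  minReplicas: 2
  maxReplicas: 10
  metrics:
  - type: Pods
    pods:
      metric:
        name: http_requests_per_second  # served by e.g. a Prometheus adapter
      target:
        type: AverageValue
        averageValue: "100"
```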
B: Let's see, other rough edges. There are multiple types of progressive rollouts. What I've been talking about and describing is mostly the canary process, where you have a new version of your software that you want to roll to as long as it doesn't violate some basic foundational rules you've set out. There are also A/B deployments, where you might want to move to the new version, or you might want to stay on the old version.

Another rough edge here is that this is very specific to applications whose work is traffic. If you're looking at jobs that pull from a queue, or something along those lines, the API would really need to be modified pretty drastically to support shifting their work in that direction.
B: There are other Envoy-based service mesh products that have supported this, but each of them came with their own APIs, and it really wouldn't have been suitable for the Kubernetes Deployment API to take on dependencies on 17 different kinds of CRDs for traffic shifting. But today, with the development of the Kubernetes Gateway API, we now have effectively one abstraction for traffic shifting that most or all of those implementations support.
B: This is what a given HTTPRoute looks like in the Gateway API, if you haven't played with it. You can define a particular hostname and say it's going to be backed by these two services at a weight of 20 and 80, respectively. And you'll notice that within this you don't see anything about Istio; you don't see anything about Envoy or nginx. That's because the implementation is defined in a completely different resource, the Gateway resource, which is user-defined and specifies exactly what implementation is going to be executing the shift of this traffic.
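A weighted HTTPRoute of the kind being described, sketched against the v1beta1 Gateway API current at the time (names are illustrative):

```yaml
apiVersion: gateway.networking.k8s.io/v1beta1
kind: HTTPRoute
metadata:
  name: my-app
spec:
  parentRefs:
  - name: my-gateway        # the user-supplied Gateway that actuates the shift
  hostnames:
  - "app.example.com"
  rules:
  - backendRefs:
    - name: my-app-canary   # new version
      port: 8080
      weight: 20
    - name: my-app-stable   # old version
      port: 8080
      weight: 80
```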
B: You could build a gateway on Istio. You could build a gateway on nginx. Google Cloud Load Balancer supports the Gateway API, and I think AWS does as well. So the implementation is really up to the user. What the Deployment would create is this here: we would create this HTTPRoute and allow the user to create a Gateway object that selects it, which would actuate the traffic shift, this 80/20 split that we've got right here.
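The user-supplied half might be as small as this; the class name is whatever implementation the cluster operator has installed:

```yaml
apiVersion: gateway.networking.k8s.io/v1beta1
kind: Gateway
metadata:
  name: my-gateway
spec:
  gatewayClassName: istio   # could equally be an nginx or cloud-LB class
  listeners:
  - name: http
    protocol: HTTP
    port: 80
```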
B: So it's nice timing, in that this dependency is finally available and finally at a level of readiness that makes it appropriate. We did have the Ingress API before, but traffic shifting was not a part of it, and the amount of break-glass configuration that you had to do with implementation-specific annotations, etc., was pretty heavy. So this is a pretty big improvement coming from SIG Network.

The cluster operator defines a Gateway, which refers to a GatewayClass that might be Istio, Envoy, nginx, etc., and those three resources together separate concerns, so that each of us is not stepping on the other's toes and we can actuate this traffic shift in a pretty well-defined and isolated manner. And that is the last slide. So, with that: questions.
B: What I'm looking for at this point is: is this a great idea? Does this belong in the Deployment API? I certainly wouldn't want to take something as big a change as this and send it as a drive-by pull request.
F: All right, the slide beside the previous one, this one here. Yeah, I'm curious: does that match the Prometheus metrics?
B: If you look into the horizontal pod autoscaler: by default, if you've used one of these, it scales on, like, memory or CPU consumption of the pod, right? That's the standard way of defining an HPA. But it also supports custom and external metrics, where you can spin up what SIG Instrumentation calls a metric server within your Kubernetes cluster. That could be Prometheus, which is probably the most common metric server to spin up in Kubernetes, but there are others, and then these will pull those named metrics.
F: So, interesting. I mean, that's a little weird, putting the Prometheus metrics into the Deployment API, since the Deployment API is a core resource of Kubernetes. How about adding a basic canary update API into Deployment, and then we could have a new sub-project under SIG Apps to do the progressive rollout on top of the Deployment canary update?

I think the progressive rollout API may change in future, since it is a new API and some users may have their own specific requests, and I don't think, if the API is put into the Deployment, we can change it easily in the future.
B: That's a great point. I don't really know how the API maturity model would work here, if you were considering, in the abstract, a large change to Deployment.
F: I mean, if Deployment supports the canary update, then we can control the deployment and canary update on top of the Deployment, such as with a new CRD in a sub-project. The CRD can control the steps of the rollout: as a first step, it can tell the Deployment to update a percentage of pods, then it can check some metrics, or some custom hooks defined by users, and it can decide whether to block there or to go to the second step.
E: We always said that we would allow for the expansion of update strategies without migrating the entire API version to a new version, so you could release this update strategy, like RollingUpdate, as a feature. I think I would echo some of the points that were brought up, the major one being about pulling the metrics API into the Deployment API: having the deployment controller interact directly with the metric server, when a metric server isn't actually a requirement for having a Kubernetes cluster, as far as I'm concerned.
E: You can't mandate that. Putting in an update strategy that requires the cluster to have a metric server installed might be a bit controversial, and having that as part of the API would probably be a non-goal too, just because it's failing the test of separation of concerns: now the deployment controller is going to have to do a whole bunch of things that really are outside of its design constraints. But one thing I've noticed: you brought up Argo.

These projects don't seem to be saying, "God, this is painful, I really wish you guys would take this in-tree." But if there's some value we can add to make it easier for them to offer it, that's super interesting to me: if there's something I can do to make those projects more successful and their users happier, as opposed to saying, "well, do it this way, because this is what is built into the system, and forget all the other ones" that have been in production for many clusters, for many users, for a couple of years now.
B: They will use the Kubernetes APIs; they see those as the standard. If there are best practices that are not specified in the Kubernetes APIs, very few of our users will pick them up. And so adoption of progressive rollout, whether with any of these providers, across all of our customers is incredibly low. Very few of them have done this, and I think it's because they see the Kubernetes API as the standard for how to safely roll out software, and as soon as they start looking beyond the core Kubernetes API, there's an overwhelming number of choices, and it's not clear which one is best or preferable. From a customer perspective, I've heard a lot of them say "we're waiting to see which one wins," which may not be the best model to be thinking about in terms of providers, but from a customer perspective, they just want to know the right way to roll out software safely.

So my motivation in presenting it this way (and I would totally support spinning some of this stuff out into separate resources if it made sense) was that the default Kubernetes way of rolling out software ought to be basically best in class.
G: Yeah, maybe to jump in here. Sorry, Alois here; I'm with app delivery in the CNCF, and also working on Keptn. I think a lot of what you've built here is what we've dealt with as well, more or less orchestrating tools like Argo and others.

I think one key point in the design of the API is also the separation of concerns, and who builds what. The Deployment is usually built by someone who doesn't necessarily know which rollout controller or deployment controller, like Argo or Flux, is going to be used. That's why the Flux version is a bit more convenient, because they reproduce the Deployment, versus relying on a dedicated resource as in Argo.
G: The SRE team might decide how they want to roll out some applications in different stages, and we see differences between stages, so that would more or less mean you would have to adjust this across different stages. And you don't even want to give other people control over how this is supposed to happen: if you're responsible for the environment, the developer obviously will tell you what they want to deploy, but the actual process is kind of separated.

So I think there are some separations of concerns there. The same with validation; I could share it afterwards, we have worked on some of the SLO and SLI work as well. The SLIs, again, might be something coming from the developer, but some of the rules for SLOs might actually be provided by somebody else, like the owner of the environment, for example, along with historic values. So that would be a good reason to spin these out into a separate resource.
E: The reason, in my opinion at least, that you see a proliferation of tooling for CI/CD, both in the CNCF and outside of the CNCF, is because the software release process tends to be very close to an organization's heart, right? It's how you get product into production, and finding one universal right way that fits all customers, or even a large fraction, is just very tricky to do right. So it would be different if there was something that the ecosystem had already developed that was the universal way to do it: okay, we have an industry-standard best practice for how to do this, and it's worth looking at it, adopting it, and potentially bringing it in-tree, right? But that doesn't seem to be the case when I look more broadly across the industry today. And I get that you want to be able to advise your customers, "this is the right way to do it," but depending on how the customer has deployed their software, how many regions it's in, how many availability zones, whether you're on public cloud, the right rollout strategy is going to differ. And it also depends on the nature of the software and the level of risk tolerance of the business that's rolling that software out. You know, if your app is, like, yo.com, use an exponential rollout strategy; who cares if it's down, the user is not affected. If you're a financial institution or a bank, you're probably going to be a little bit more conservative, right? And maybe you do blue-green, or red/black as some call it, because you want to do a fast rollback, as opposed to doing canary analysis on a progressive rollout.

So, working toward the goal of supporting, out of the box, a great way to roll out software sounds like a goal. It's just, you know, I would definitely want to see it tested out-of-tree before starting work on bringing it in and shipping it as the de facto standard.
B: So what would that out-of-tree work look like? How would that differ from, say, the Argo Rollouts and the Flagger from Weaveworks?
E: None of those other organizations that have demonstrated a large degree of success have even seen a need to come back to SIG Apps and say, "we'd like to bring it back in-tree as a built-in." Which raises the question: what is the goal of the built-in, really, if you can be successful doing it out-of-tree and allow your users to adopt it, or not adopt it, as they see fit?

They should just be resources, right? So the idea of a built-in resource has actually become a non-goal of the community at large. Which is why I was asking: what can we do to make this easy for anyone who wants to build a resource that encodes the business logic of the custom release strategy that's best fitted to their organization, or the group of organizations that they represent? What can we, as SIG Apps, do to help expedite that work and make it easier?

The complexity there would be very high. And yes, it can be done without violating v1: you could do this as an alpha/beta release in-tree, with an adoption strategy that didn't violate the existing compliance of the v1 API. But the question there is, with all that heavy lifting, where is the value of doing it in-tree, as opposed to offering it as something on top? And I wonder if that's not why Argo, or Flagger, or Spinnaker haven't come back and said:
E: "This is so great that we want to offer it to the community by patching Deployment to implement it directly." The complexity is high, and is the value actually there to just build it in-tree? And if you're working for Google and you're working on GKE, there'd be nothing stopping you; actually, anyone who is releasing a Kubernetes cluster is in charge of the CRDs that they ship by default with that cluster. I mean, if you look at OpenShift, for instance: they ship Kubernetes, but they ship a bunch of other resources along with it. You can just do that. So if it's "I want my customers to have a sane default, and I'm Google," nothing's stopping you from doing that today; you don't need us to help you do that.
B: It's the ease of use of installing a third-party library (installing CRDs, familiarizing yourself with their particular APIs, their particular support policy, etc.) versus the ease of use of going into your existing Deployments that you're already making use of and changing your strategy type from, say, Recreate to progressive, and then you've got some sane defaults right off the bat. So it's really about ease of use, in my opinion.

If we were to do this as GKE, we would necessarily be producing a new API, not the Deployment API that our users are already familiar with, that we've already got tons of integration baked in around, across all of the UI and the various systems that support GKE. Instead, we would have to implement something new and ask all of our users to migrate, which is a very different story in terms of onboarding, but...
E: Is it, though? Because if you implemented a new API and asked people to migrate, you could at least find a set of lighthouse customers that would work with you and give you some feedback. If you implement it as alpha: number one, when it goes alpha, nobody uses it, right? Except people who are willing to turn up alpha clusters to test with; it's not going to get any production utilization at all. Then it goes beta, and a lot of times people are kind of unsure of it.

So the path to get to GA is just harder to do in-tree. Which is kind of why, you know, I don't want to seem like I'm opposed to doing it. It's just that if I wanted to get this to my customers inside of my company as fast as possible, I would use a custom resource, because I can get it out immediately, I can get early feedback from product teams, I can iterate on it rapidly, and then, when it's mature enough, I can release it to GA and say, okay, this is the supported version. And then, if I really wanted to bring it back in-tree after I had evidence that this suits a large number of organizations, okay, then I would do that, right? But, I mean, yeah, that's my feedback, I guess.
B: If it were to have taken off, let's say hypothetically that one of these implementations gained dominance, say Flagger got 80% adoption compared to even Deployment, at that point there's really no reason to bring it back in-tree, right? It's already successful independently; there's not really a value-add. If everyone's already done the onboarding work of installing CRDs and installing a new controller, just leave it as it is; everybody's happy, and there's no reason for Kubernetes to own it.

In this case, I think we've seen broad consensus that this is a great way, a very safe way, to operate software, that a lot of users want. But we haven't seen that broad adoption, specifically because of some of those barriers involved with installing a third-party controller and CRD.
C: Yeah, so I think one of the things that I want to think about, and this is what I have been discussing with Mitch as well, is that the API that we have currently for the workloads is pretty much tied to the controllers that we have in the core. And part of the problem is: say we can separate that out, like have some sort of hook mechanisms for the various stages of the workloads, say at the rollout time or at a particular phase.

So if we can have some sort of mechanism where people can say, "this is a phase within the workload controller, and I would like to outsource it to another controller that I already have," and if you can provide that hook mechanism in the core controllers, perhaps it's going to solve the problem, because I think most of us agree with the API.

It's the implementation, for example in this case the health check, right? And this is a recurring theme that I'm noticing in the past two to three calls, where people are telling us: I would like to have this new API field with this particular implementation within the controller. But as a community, we have been telling them: no, do not do it here; go ahead and work on it outside.
D: Yeah, I just linked the issue that was discussed very early in the project, and, if I remember correctly, Thomas even put together a proposal, but we never moved it forward. But to also illustrate a slightly different point: we did approach this a couple of years back, specifically when we finished writing jobs, and you did mention that you're not sure how to support the work shifting for non-service resources.

After further discussions with Eric Tune, with Brian Grant, Clayton, myself, and Daria at the time, we figured out that forcing users into one particular way of approaching workloads doesn't work. As you're probably aware, with workloads there are, as you mentioned, several different approaches, and the thing is that each of those external resources is significantly better at delivering features, very different variations of the features, and it's also clear that it's not just one solution that exists.

So I tend to agree with what Ken said earlier. Having very tight control over the lifecycle of the application, I know that it is problematic, but the ability to build on top gives a lot of flexibility to everyone, and we had similarly tough discussions around various different pieces of Kubernetes itself.

Take export: people just wanted to have the entire resource as-is, but what happened is that each and every single person that looked at the export resource requested different fields being removed from the retrieved resources. For one, export was being used as a backup mechanism, so they wanted a full JSON or YAML of the resource. Someone else was using the export functionality as a templating mechanism, so they would want the metadata removed.
D: The rest, the status and so forth, should be removed. Each and every single person was giving a very different use case. The same situation is with this particular approach that you're presenting: the fact that you face this particular problem and you approach it this way, it's not that it's bad in any way, it just suits you in your particular case. It's very possible that next month or next year, when you are struggling with a different problem, you will start looking back at the stuff that you wrote and you will realize that, oh, it's very limiting, because now I don't want to do it this way, I would prefer to do it that way. And that's pretty normal from what we've been seeing across Kubernetes.
D: I'm fully aware that we have limitations inside of the core, but at the same time we have certain responsibilities to ensure interoperability for various approaches, and adding features is always hard. It always goes through a lot of scrutiny, because we need to make sure that the current use cases are not broken in any way and are maintainable. At the same time, adding additional update strategies makes the controller harder and harder to maintain, and the group of people maintaining it is slowly shrinking, because everyone is chasing the next new thing, while the maintenance burden on the people left behind is growing.

So we, as the maintainers, need to weigh the options between one and the other: sometimes not accepting a particular approach, waiting to see how it can be solved outside, and, if it gains enough popularity outside of the core, then bringing it back is a viable approach.
D: We did that with the Job API. A lot of the recent work that has been happening around jobs: we said we completed jobs three or four years ago, and we didn't do any development around it, and we said, everyone, go and try things on the side. There are solutions such as Volcano and Kubeflow, and across all of those different approaches, we are looking at what they are trying to bring back.

There is a separate working group that was devoted to looking at how we can improve things in a more general way, and have a general approach towards what we can do and improve to make any kind of HPC-related workload simpler in Kubernetes, rather than solving one particular case.

I hope that me talking for the past 10 minutes makes at least a little bit of sense.
E: You know, I think I understand. I do have one more question. The presentation was given as a proposed API change that doesn't have prior art other than from Argo and so forth. Is this something that's actually been implemented in GKE, that you put in front of a large number of customers, where it's "this has been great for us and we want to offer it back as a built-in"? Or is it something that we're proposing to kind of collaborate on and build from scratch?
B: This would be more the latter, a collaboration proposal. To the best of my knowledge, GKE is not interested in forking the Deployment API, right?

Istio runs as a sidecar to every pod, so in order to be able to safely upgrade the proxy, I need better signals around: is the app still healthy? Should this rollout be proceeding? Should we be rolling back?
B: I can do all of that by driving all of my users to a third-party library like Flagger or Argo Rollouts, but it is a harder sell to get users to adopt something that is third-party, that has less clear support, where it's less clear that that's the correct direction, than when it becomes a core Kubernetes API.
E: I think one challenge, thinking of it from the perspective of the workload controllers (and those are all real problems), is that you can't laser-focus, because, okay, you're thinking about Istio, but there's also Linkerd. You mentioned nginx as the gateway, but nginx also has a service mesh implementation, right?
B: The model is appropriate for anything that uses sidecars, which is the vast majority of service meshes today. The one standout that comes to mind is that Cilium is working on a service mesh product that is not sidecar-based, so it's not clear to me how this would help with that. But if you're talking about Linkerd, nginx service mesh, Open Service Mesh, Gloo Mesh, all of those might benefit from this model.
E: And more still, I think it's hard to figure out what the best way to integrate something like this in the core would be, in a way that would both serve the immediate customer, working out of the box with a built-in resource, and enable users of mature third-party software that's been around for many years to leverage it as well. I think Ravi's point that he brought up about lifecycle hooks is one thing we can do now: maybe revisit that and try to figure it out.
E: The challenge with readiness gates is that, contrary to what you might take from looking at the slide, he wants the pod to get added to the network but the rollout not to progress until you have a better measure of healthiness. And my understanding of readiness gates is that they actually block readiness, which would have the effect of keeping the pod outside of the load balancer, right? So the pod will be marked as unready and won't receive traffic, which would not allow you to do the canary analysis.
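For context, a readiness gate is an extra pod condition that an external controller must set before the pod counts as Ready; a minimal sketch (the condition type is hypothetical):

```yaml
apiVersion: v1
kind: Pod
metadata:
  name: gated-pod
spec:
  readinessGates:
  - conditionType: "example.com/rollout-approved"  # hypothetical; set in the
    # pod's status.conditions by an external controller
  containers:
  - name: app
    image: nginx:1.25
```

Until that condition is set to True, the pod stays NotReady and out of Service endpoints, which is exactly the objection here: an unready pod receives no traffic to analyze.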
E: It's more than that, right? The problem that a lot of people have is that our notion of what health is, is basically "do you pass your health check?", but the health check just means "don't kill me," right? Please don't shoot me down; let me continue to run. When you're doing canary analysis, what you really want to understand is: how good is this new version versus the previous version of the product, and should I progress with the rollout? So there's the idea of pausing deployments at a particular gate, and then there are different strategies you might use; you might want to do a geometric rollout with an operator in a loop, where it pauses after each step. It's a very complicated space of things, which is why it's hard, for me at least, to wrap my head around this.
B: Yeah, the overall idea is that the way you know something is good is that it does good work. It's a skepticism that is baked into the Google SRE concept, at least: you're never sure that a change you're deploying is good until it's done good work, and done good work for a while.
G: So it would be hard to measure this. A pod can be unhealthy but the deployment can still be healthy, and, to the other point, the entire deployment can be unhealthy but you can't nail it down to a specific pod. Assuming one of your "is this release good?" metrics is whether users are able to log in, there can be multiple pods causing an issue, or you're even taking other metrics into account. So this would relate more or less to the entire deployment, and not to a single pod.
B: I would think of it at the ReplicaSet level rather than the Deployment, because you want to compare old versus new. So if the old version, version A, is misbehaving and not meeting your SLOs, and you're not in a rollout, there's really nothing to do about that. But when you start rolling out version B, version B will need to meet those SLOs. Does that make sense?
G: Yes, if you still think of a service, and not of an end-to-end application use case. We have some of our project users, for example, that really use things like cart-value metrics and other things that you can't really tie to even a specific ReplicaSet; it's really about one deployment as a whole versus another.
C: So what I'm understanding is: if we had something similar to the pod readiness check, but at the workload spec level, would that solve the problem? Having that custom check, where, again, the custom check has to be done externally, not within the core controllers.
E: One way to think of it: if you leveraged Deployments, or ReplicaSets, or whatever (let's say you leverage Deployments directly), imagine if the progressive rollout was a top-level resource, as opposed to a rollout strategy, and we had something like workload hooks that were able to pause the progress at particular points, right?

You could have this controller that sat outside of the Deployment, the HPA, and the metric service, that interfaced with all three of them and managed the orchestration across all of them, in order to control the progress of rollouts based on external signals. And you could add things like metrics; you could even add things like black-box tests. You're talking about user logins: is that dropping? Are my synthetics going down? You could do arbitrary things in this controller, outside of the core.
E: It's not that SIG Apps, or any other SIG to my knowledge, is averse to taking more things in-tree. We're trying to enable people to build a larger and better ecosystem leveraging what's in-tree, and to focus our efforts on making the built-in pieces as good as they can be, to enable those extensions and the growth of the ecosystem and the CNCF as a whole. So that would be one approach I could see where it would really be something you can do.

You can leverage the existing ecosystem around it; you can have your own release schedule outside of the in-tree one. Because, again, remember: whatever we do in-tree, it's alpha, and there are fairly few users in alpha. Then it's beta, and you can keep messing around with beta, but once it's beta, it's never going away, right? The only path is towards stable. So getting that good signal to make sure is just a lot.
B: Okay, well, if there are no other questions: I really appreciate everyone's comments and time. There's a lot of feedback for me to go through here, and a lot of different directions to consider and evaluate, so I really appreciate everything.
A: Cool, thanks everyone for joining the call today, and I'll end the call now. Thanks, bye. Thank you.