From YouTube: Kubernetes SIG Apps 20221128
A: Good morning, good evening, good afternoon, depending on where you are. Today is November 28th, and this is another of our SIG Apps bi-weekly calls. My name is Maciej and I'll be your host. Today I have one announcement: the 1.26 release is due next Tuesday. If I'm looking correctly, and based on the emails that I saw earlier today, there have been no problems, so it looks like we are on schedule to release on time on December 6th.
B: Sure. Hi, I'm Drew, I'm from Shopify, and I've been working with Peter from Google to basically propose these improvements to stateful sets. Apologies, my Wi-Fi kind of dropped and I'm recovering it, so there might be a few hiccups. I'm going to be talking about adding zonal rollouts for stateful sets, and general zone awareness for PDBs as well.
B: One key thing we rely on when we deploy is the readiness probe, which essentially is a command which checks if a pod is caught up, caught up enough rather, on the replication of data. I think different databases and things have different kinds of commands which are kind of similar. And one key thing when you deploy this as a stateful set is: if three or more pods are down at random at any point, we run the risk of data loss and unavailable partitions, because it could be pods from three different zones, and then we could lose all three replicas for something.
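(For reference, the readiness-probe mechanism described above is standard pod-spec API. Below is a minimal sketch; the image and the catch-up script are hypothetical placeholders, not Shopify's actual configuration.)

```yaml
apiVersion: v1
kind: Pod
metadata:
  name: kafka-0
spec:
  containers:
    - name: kafka
      image: example/kafka:latest            # hypothetical image
      readinessProbe:
        exec:
          # Hypothetical check: exits 0 only once this broker has caught up
          # on partition replication, so the pod reports Ready only then.
          command: ["/bin/sh", "-c", "/opt/checks/caught-up.sh"]
        periodSeconds: 10
        failureThreshold: 3
```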
C: So, is your data loss because you don't... what's the durability that you configure your Kafka with? Is that the reason you would lose data?
B: Oh yeah, no, I'm just saying that we have a replication factor of three for basically every partition, which is the lowest denominator at which Kafka replicates things. So basically, if there's ever a disruption and more than three pods are down at the same time, you technically may be at risk of losing the data.
C: Well, okay. So it makes sense if you're saying that you basically have a very long time between the data being committed to Kafka and being synchronized with the storage media. So if all three go down, you lose partitions and they're unrecoverable, because, you know, Kafka is configurable to allow you to write much faster than the disk moves. Oh, I see.
B: I think... yeah, I think I've worded it a little incorrectly, then. I think it's more about unavailability than necessarily data loss. It's a loss in the sense that, if you're trying to reach and write to a certain partition for which there are no replicas available, then you're losing data in that regard. But yeah, other than that, Kafka is pretty resilient in the way you describe, where it does frequently flush to the disk and write whatever is available.
B: Yeah, sorry, I should have been clearer. So the issue we were facing is that for some of our largest Kafka clusters, which can be in the order of two to three hundred brokers, or pods, it can take in the order of 20 hours, or even more than that, to deploy, because we have to rely on the default strategy to roll out one pod at a time.
B
These
pods
cannot
be
a
bit
at
random,
as
I
mentioned.
If
you
set
that
to
something
like
maximum
25,
you
could
accidentally
take
down
all
three
replicas
for
something.
That's
the
three-part
restriction.
I
was
talking
about,
and
so
we
want
these
deployments
to
be
faster,
not
just
from
the
perspective
of
Ops
that
you
know,
I
would
like
Motorola
to
happen
within
working
hours,
but
also
at
least
for
the
kafka's
use
case.
B
We
we
prefer
shorter
disruptions
because
of
clients
who
connect
and
consume
from
Kafka
the
longer
a
rollout
might
take
the
more
time
they
might
perform,
rebalances,
where
basically,
they
do
no
work.
B
So
the
key
Insight
we
realize
is
that
we
can
disrupt
all
pods
in
a
given
Zone
and
still
have
the
other
two
replicas
for
data
being
available,
and
so
we
can
basically
update
our
stateful
stats
by
updating
all
parts
of
the
zone
at
once,
and
it
doesn't
have
to
be
all
pods
I
think
the
AWS
use
case
was
they
were
going
exponentially.
They
would
start
one
part
at
a
time,
then
two
and
four
and
so
on
to
whatever
marks
and
available
is.
B
But
the
point
is
we:
can
these
kind
of
apps
can
tolerate
a
lot
of
disruption
in
a
given
Zone
and
we
can
leverage
that
for
faster
deploys,
be
it
for
your
app
or
even
node
upgrades,
and
things
like
that.
B
So
one
thing:
a
lot
of
people
when
we
talk
about
this
thing
is
like
bot
disruption.
Budgets
could
help
with
this.
They
don't
because
they're,
not
limited
deploys
are
not
limited
by
pdbs
when
you're
doing
rolling
upgrades,
but
regardless
of
that,
we
also
want
pdbs
to
be
aware
of
zones
as
well.
B
So
the
way
we
solve
this
essentially
is
by
writing
our
own
controller
or
operator.
We
call
it
the
zonal,
deploy
controller,
it
uses
the
popular
controller,
runtime
Library
and
basically,
we
now
deploy
our
SQL
set
with
the
on
delete
strategy
and
on
relevant
updates,
which
for
us
is
mostly
a
change
in
revision
or
a
change
in
anode
certain
annotations.
The
controller
takes
over
the
deploy.
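(The OnDelete strategy itself is existing StatefulSet API: it tells the built-in controller not to roll pods on updates, leaving deletions to an external controller like the one described. A minimal sketch, with illustrative names:)

```yaml
apiVersion: apps/v1
kind: StatefulSet
metadata:
  name: kafka
spec:
  serviceName: kafka
  replicas: 300
  updateStrategy:
    type: OnDelete          # pods only move to the new revision when something deletes them
  selector:
    matchLabels:
      app: kafka
  template:
    metadata:
      labels:
        app: kafka
    spec:
      containers:
        - name: kafka
          image: example/kafka:latest   # hypothetical image
```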
B
We
update
all
pods
in
the
same
Zone.
At
the
same
time,
you
could
also
configure
it
to
be
like
200
time
or
something
if
you
want
and
as
I
mentioned
before,
we
heavily
rely
on
the
Readiness
probe
for
the
stateful
sector.
Tell
us
when
the
Pod
is
up
and
running
and
if
pods
are
unhealthy
after
resetting
a
Zone,
then
the
controller
stalls
and
doesn't
make
any
progress.
B
We
also
have
a
pause
field
which
I
think
mirrors.
What
deploys
have
for
some
reason
we
can
pause,
deploys
in
kubernetes,
but
not
stateful
sets.
So
ideally
we
can
add
that
too.
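(The Deployment field being referred to is spec.paused; a minimal sketch:)

```yaml
apiVersion: apps/v1
kind: Deployment
metadata:
  name: web
spec:
  paused: true              # existing Deployment API; StatefulSet has no equivalent today
  replicas: 3
  selector:
    matchLabels:
      app: web
  template:
    metadata:
      labels:
        app: web
    spec:
      containers:
        - name: web
          image: nginx
```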
B
Our
client
disruptions
are
minimized
for
the
shutter
duration
of
time,
as
I
mentioned,
there's
a
certain
thing
called
consumer
groups
and
if
any
kind
of
outage
happens
to
what
they're
looking
for
they
basically
stop
the
world
rebalance
and
you're,
basically,
they're
not
consuming
anything
in
that
time,
so
the
shorter
the
disruption
window,
the
better
it
is
for
them
and
the
way
we
designed
this
controller,
it
was
felt
like
it
was
a
negative
feature
and
it
just
worked
without
deployed
pipelines
as
well,
and
we
designed
it
to
be
generic
and
for
now
we're
only
using
it
to
deploy
a
Kafka.
B
But
there's
another
team
which
uses
elasticsearch,
which
can
basically
use
the
exact
same
solution.
B
So
I
guess
what
we're
really
proposing
is
to
Upstream.
This
kind
of
zonal
awareness
for
staple
sets
just
like
Kafka
there's
a
lot
of
other
apps
which
can
tolerate
disruptions
in
many
pods.
If
limited
the
same,
Zone
I
think
kubernetes
would
benefit
by
offering
these
kind
of
features
and
I
think
it
should
be
native
to
kubernetes.
B
So
the
two
things
we
think
that
will
have
high
impacts
is
the
ability
to
roll
out
pods
by
zone
for
a
stateful
set
and
to
allow
pdbs
to
have
a
budget
of
disruption
per
Zone
as
well.
So
basically,
rollouts
and
pdbs
should
be
a
zoner
topology
player
and
again.
The
goal
for
both
these
features
is
to
essentially
speed
up
deployment
times
without
having
an
impact
on
our
any
kind
of
availability.
B
So
we
kind
of
prototype
what
this
might
look
like.
There's
a
picture
on
the
right
there
of
what
it
might
be
where
the
main
thing
is.
Maybe
you
can
come
up
with
a
new
update
strategy
type
like
zonal
update,
where
you
can
define
a
key.
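(A rough sketch of what such a strategy could look like. The ZonalUpdate type and its fields are hypothetical, reconstructed from the description of the slide; they do not exist in the Kubernetes API today.)

```yaml
apiVersion: apps/v1
kind: StatefulSet
metadata:
  name: kafka
spec:
  updateStrategy:
    type: ZonalUpdate                            # hypothetical strategy type
    zonalUpdate:
      topologyKey: topology.kubernetes.io/zone   # the key that defines the rollout groups
  # selector and pod template as usual
```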
B: We can have a discussion on that. And the reason that's important is, for example, if you deploy a bad change and you need to roll back, or maybe have a fix-forward change or something like that, it's useful to update the same pods which are down right now, rather than starting from some other place. And all pods of a zone should be updated before moving on to the next one. Apps can use the readiness probe to really control how fast the deploy is happening and to see if they're caught up; for us, each pod reports its own health, but honestly you could be reporting the general health of the entire cluster as well, or whatever. I think that's where people can customize how the rollout strategy would interact with their app a bit. And if pods are unhealthy after resetting a zone, then the controller should just stop, as I mentioned before.
B
These
are
some
other
examples
we
came
up
with
the
image
on
the
top
left
is
the
one
I
just
showed,
which
is
you
can
have
a
new
update
strategy
called
zonal
update
the
one
below
that
bottom
bottom
left
is.
You
could
keep
rolling
update
but
perhaps
update
the
partition
type
to
not
only
support
something
like
an
ordinal,
but
maybe
a
label
selector,
and
then
you
can
have
the
partition
key,
be
something
like
the
hosting
and
that's
actually
incorrect.
B
That
needs
to
be
the
topology
key
there
or
you
could
have
something
we
have
on
the
right,
which
is
again.
You
have
a
new
update
strategy.
Like
Journal
update,
you
can
Define
how
what
is
the
max
of
unavailability
per
Zone
like
20
or
10
pods,
and
you
can
Define
what
order
it
might
be
or
what
key
it
might
be,
for
example,
alphabetical
here
or
any
mix
and
match
of
these.
This
is
the
part
where
we
don't
know
what
it
really
should
look
like
here
and
people
with
more
experience.
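(Hypothetical sketches of the two alternatives just described; none of these fields exist in the StatefulSet API today.)

```yaml
# Bottom-left variant: keep RollingUpdate, but let the partition be expressed
# through a topology/label key instead of only an integer ordinal.
updateStrategy:
  type: RollingUpdate
  rollingUpdate:
    partitionKey: topology.kubernetes.io/zone    # hypothetical
---
# Right-hand variant: a dedicated zonal strategy with per-zone limits and ordering.
updateStrategy:
  type: ZonalUpdate                              # hypothetical
  zonalUpdate:
    topologyKey: topology.kubernetes.io/zone
    maxUnavailablePerZone: 20%                   # e.g. a percentage or an absolute pod count
    zoneOrder: Alphabetical                      # hypothetical ordering knob
```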
B: But we just wanted to give a couple of different ideas we had for this, yeah. And it's not just us: I think there are other people on the call. There's Mariana from AWS, who implemented essentially the same controller we did at Shopify to speed up deploys for, I think, Prometheus. And then we have Peter from Google, I think, who is working on a feature to migrate stateful sets, where having zonal pod disruption budgets would also be very useful for speeding up those migrations. And we actually do do that: when we do cross-cluster migrations for Kafka, we are trying to take zone into account as well. And I don't think I mentioned it, but it's not just Kafka; there are a lot of stateful apps out there which would benefit from this.
C: Okay, so that would be good. It would be good to get something in the open source community that people can iterate on and give feedback on before proposing to, like, take in a KEP. That gives us an opportunity to kind of iterate on that API and lock it down before you try to upstream it or build it into an alpha feature, even.
C: So, getting some, you know, in-the-wild feedback on an API that's being used across, like, Shopify, AWS, and Google would be a strong motivator to upstream something. Then my other thing would be: very many users use Kubernetes on top of a cloud provider, and the failure domains there... most failure domains there are almost always specified in terms of regions and zones, right? But many users also use it on top of do-it-yourself data centers, right, and their failure domains are not necessarily going to be described in the same way. So, in terms of the topology spread constraints which we have already introduced in the pod spec for being able to spread out: instead of calling it zonal, if we want to do this, we need to be able to handle arbitrary failure domains in terms of the topology spread, right?
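(For comparison, this is the existing topologySpreadConstraints pod-spec API being referred to; topologyKey is an arbitrary node label, so the same mechanism can describe zones, racks, power domains, and so on.)

```yaml
apiVersion: v1
kind: Pod
metadata:
  name: example
  labels:
    app: kafka
spec:
  topologySpreadConstraints:
    - maxSkew: 1
      topologyKey: topology.kubernetes.io/zone   # could equally be a rack or power-domain label
      whenUnsatisfiable: DoNotSchedule
      labelSelector:
        matchLabels:
          app: kafka
  containers:
    - name: app
      image: registry.k8s.io/pause:3.9
```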
C
Well,
how
do
we
make
sure
that
we
can
generalize
this
to
the
use
case
of
somebody
who,
instead
of
going
across
Zone,
just
going
across
power
domains,
Network
domains,
racks
so
forth,
and
so
on,
but
yeah
other
than
that
it
seems
like
cool
like
I
mean
I
I,
don't
I
have
no
like
opposition
to
the
to
it
in
general
practice,
but
those
those
would
be
the
two
things
like
getting
some
more
real
world
hit
on
like
a
a
common
Epi
that
multiple
people
are
using,
as
opposed
to
like
we
have
one
from
AWS
and
one
from
Shopify
and
we're
gonna
try
to
select
best
to
read
with
like
help
push
us
in
the
right
direction.
C
So
we
know
we're
doing
the
right
thing
and
being
able
to
handle
arbitrary
topologies
would
be
more
inclusive
of
users
who
are
not
running
it.
On
top
of
the
cloud
which
you
know
over
time,
that
maybe
may
not
be
as
many
people,
we
expect
more
and
more
workloads
to
move
out
of
do-it-yourself
data
centers,
but
today,
there's
still
a
lot
to
run
on
it.
B
So
just
one
thing
with
that:
I
think
a
common
pattern
we've
seen
is
it's
not
just
us
writing
these
operators,
but
for
a
lot
of
stickable
sets,
people
would
just
deploy
them
in
general,
using
operators,
and
this
is
kind
of
baked
in
there.
They
have
a
lot
of
logic.
How
do
you
recommend
We
Gather
these
use
cases
I
suppose,
because
we
have
one
from
AWS,
we
can
come
up
with
two
from
Shopify.
C
Do
I
go
about
that?
What
would
be
awesome
is
if
like,
if
we
have,
if
there
are
multiple
people
that
are
all
working
so
okay,
in
the
same
way
that
you
said
this
is
great,
and
the
reason
this
is
great
is
because
we
believe
that
this
controller
that
we
have
will
work
for
multiple
workloads.
All
doing
the
same
thing
right
I
would
assume
that
the
same
thing
is
true
about
what
AWS
is
building.
C
If
we
could
convert
like
I
mean
we
could
form
a
working
group,
if
that
would
help
I
I'm
I'm,
very
I'm,
open
to
any
suggestion
about
how
we
can
help
gather
our
community
around
it
and
convert
it
to
one
open
source
thing.
I
would
really
love
to
do
that
prior
to
start
to
baking
it
into
the
API
like
outright.
A: In parallel to the zonal rollout: the pausing of the workload controllers has been mentioned a couple of times in the past during the batch working group meetings, just like it is currently implemented for Deployments, and for CronJobs and Jobs. Currently there is also a request to have the identical functionality for the remaining controllers.
C
The
feedback
that
we
were
given
by
the
architecture
Sig
was
that
the
pause,
primitive
and
deployment
wasn't
declarative
and
was
not
compatible
with
GitHub
style,
workflows
and
automated
orchestration
at
a
higher
level,
which
is
why
we
didn't
like,
because
we
looked
at
including
it
in
the
rest
of
the
workload
Primitives
resources
prior
to
B18.
C
But
there
was
push
back
more
broadly
and
I.
Don't
that's
not
to
say
that,
like
we
can't
revisit
it,
we
can
always
revisit
it,
but
it
wasn't
done
like
capriciously
or
because
we
thought
that
it
was
like,
like
let's
rewind
it
without
this
feature
and
add
it
later,
it
was,
let's
not
add
it,
but
maybe
it's
time
to
revisit
that
position.
E: I have a quick question, if it's a good moment. It wasn't apparent: what are the changes to PDBs required by this proposal? It seems like none. Could you please clarify that? Because you...
F: ...mentioned? Yes. I think for PDBs, the workaround, or I guess the solution we want... the problem we want to solve is that for many apps there is a discrepancy between the tolerable unavailability cross-zone and intra-zone. So you may be able to take down, to have more pod disruption in, a single zone, but if you do have a significant disruption event in one zone, you don't want your budget to be affected across the other zones.
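(A hypothetical sketch of what a per-zone budget could look like; the topologyKey field below does not exist in today's policy/v1 PDB API and is only meant to make the idea concrete.)

```yaml
apiVersion: policy/v1
kind: PodDisruptionBudget
metadata:
  name: kafka-per-zone
spec:
  maxUnavailable: 20%
  topologyKey: topology.kubernetes.io/zone   # hypothetical: account the budget per zone
  selector:
    matchLabels:
      app: kafka
```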
F: I think the challenge that Drew was pointing out was that, with the replication factor of three in Kafka, you can't take down more than two. Like, assuming, you know, you have partitions sharded fairly uniformly, you may have a case where you can't take down more than two pods, even if you have a Kafka cluster that's very large, and it's really dependent on the scale of your zone replication.
D: And in our case at AWS, EKS has this managed node groups feature, and during node group upgrades, when you need to, for example, update your OS with security patches, it uses PDBs so that, in case the pods don't become healthy after a group of nodes is updated, it can halt the node group upgrade.
D
So
in
our
our
case,
why
we
created
the
Zona
where
a
PDP
was
because
we
wanted
to
have
a
way
to
safely
start
shopping
date,
multiple
nodes
at
the
same
time,
and
have
these
safe
word
in
case
something
goes
badly.
We
can
pause
the
rollout.
D
What
it
did
was
like
you
saw,
we
created
a
new
admission
web
hook
so
with
that.
That's
basically
watched
for
the
pods
status,
States,
a
status
between
between
the
zones
and
then
basically
using
that
information
allows
or
not
to
request
to
go
to
the
eviction
API.
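(Eviction requests arrive as CREATEs on the pods/eviction subresource, which admission webhooks can match; a minimal sketch of such a registration, with hypothetical names and the CA bundle details elided.)

```yaml
apiVersion: admissionregistration.k8s.io/v1
kind: ValidatingWebhookConfiguration
metadata:
  name: zonal-eviction-guard            # hypothetical name
webhooks:
  - name: evictions.example.com         # hypothetical
    admissionReviewVersions: ["v1"]
    sideEffects: None
    rules:
      - apiGroups: [""]
        apiVersions: ["v1"]
        operations: ["CREATE"]
        resources: ["pods/eviction"]    # evictions are CREATEs on this subresource
    clientConfig:
      service:
        name: zonal-eviction-guard      # hypothetical service backing the webhook
        namespace: kube-system
        path: /validate
```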
D
So
in
our
case,
we
kind
of
create
a
new
object
and
I'm
not
sure
on
the.
If
and
to
be
honest,
we
copied
a
lot
of
the
pdb
code,
but
we
had
to
like
create
a
new
like
the
statuses
you
need
to
to
keep
monitoring
how
many
parts
we
have
per
easy,
how
many
different
options
we
have
per
Z
kind
of
for
the
status
that
we
used
was
pretty
different
from
from
the
what
we
have
in
the
PDP
one.
D
C: I would say this: I can at least understand... I can understand the use case for making stateful sets, or other workload orchestrators, aware of topology constraints during pod termination, right. But if you're going to do it... like, I don't know what that looks like for PDBs. We've just been offered a couple of one-off APIs that are very targeted towards zones, but I can't see how you could extend that to work with an arbitrary topology constraint.
C: PDBs are designed to work on label selectors, right? Like... yeah, I mean, is that even compatible? Would label selection be compatible with topology constraints? Because you're saying, specifically, 'these pods are grouped together, and I only want X number of them to be disrupted.'
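(Today's PDB shape, for reference: the budget is defined purely over a label selector, with no notion of topology grouping.)

```yaml
apiVersion: policy/v1
kind: PodDisruptionBudget
metadata:
  name: kafka
spec:
  maxUnavailable: 2
  selector:
    matchLabels:
      app: kafka
```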
C: Then, that is kind of the way, like, the zonal constraints are specified for scheduling; you have 'preferred'...
C
Right
I
could
see
that
but
yeah
I,
just
like
I'm
I'm,
not
opposed
to
the
idea
but
I'm
not
seeing
no
I
haven't
seen
so
far
today.
Anybody
offer
like
this
is
what
we
think
an
API
would
look
like
and
handle
those
cases
which
I
mean
sounds
like
a
good
project
to
work
on.
If
we
can
come
up
with
how
that
works.
E: You know, my main concern about using the admission API to reject the eviction calls is that eviction essentially comes in from two sources, scheduler preemption and drain, right? So essentially, if you reject an eviction, and it's actually the scheduler who drives the preemption, that may lead to unexpected results, because the scheduler actually wants to preempt that pod, because it needs to find room for a high-priority one, but it's protected by your own custom PDB logic. But ultimately the scheduler needs to find room for the high-priority pod. So I think that could be a little bit concerning.
C
Wouldn't
I
wouldn't
do
that
in
the
general
case
right
and
then,
but
that
is
the
kind
of
thing
like
if
we're
going
to
solve
something
that
allows
pdb
to
be
a
bit
more
expressive
in
terms
of
like
preference
toward
Pi
termination.
It
does
affect
scheduling,
it
affects
eviction,
it
affects
draining
so
I'm,
not
saying
that's
not
in
it's
possible
to
do
that
potentially,
but
we
have
to
be
thoughtful.
F: I just want to throw a thought out there. So, I know with PDBs today, if you have overlapping PDBs, eviction is not allowed. Like, a potential option is allowing overlapping PDBs and kind of relaxing that constraint. I don't know how complicated that is, though, but that is a...
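(The restriction being referred to: when a pod matches more than one PDB, the eviction API refuses to evict it, since it cannot decide which budget to charge. For example, a pod labeled both app=kafka and tier=gold matches both budgets below and cannot be evicted today.)

```yaml
apiVersion: policy/v1
kind: PodDisruptionBudget
metadata:
  name: pdb-by-app
spec:
  maxUnavailable: 1
  selector:
    matchLabels:
      app: kafka
---
apiVersion: policy/v1
kind: PodDisruptionBudget
metadata:
  name: pdb-by-tier
spec:
  maxUnavailable: 1
  selector:
    matchLabels:
      tier: gold
```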
E: Fantastic segue: I have the follow-up item. I have the requested proposal, and I think I'm going to link it again: a request for review for supporting multiple, overlapping PDBs. It was assigned to SIG Apps a couple of weeks ago; I proposed it roughly a month ago. So I would love to get feedback on that proposal. Okay.
C: If you allow them to be too aggressive, then it's hard for people who are providing Kubernetes as a service to respect those PDBs when they're doing managed node upgrades, right; it can basically take forever. But again, I mean, like, yeah, if the community really wants it, I'm not super opposed to it, personally.
F: Back to the topic of the stateful set updates: one thought that occurred to me was that today, with stateful set updates, your updates are declarative. So you say: okay, I want to be able to update to this partition, X number of replicas updated out of N.
F
If
we're
doing
this
along,
say,
topology
aware
boundaries,
the
state
that
you
update
to
is
dependent
on
the
current
state
of
a
staple
set
and
the
current
state
of
those
pods
were
which
zone
they're
scheduled
to
I
think
for
say
a
staple
set
that
has
PVCs
that
are
assigned
to
specific
topologies
that
doesn't
matter
as
much,
because
those
topologies
are
fairly
static.
F
But
we
can
get
into
some
weird
scenarios
where
maybe
you're
rescheduling
and
you're
rescheduling
two
different
topology
group,
so
you're
not
necessarily
deterministically
doing
an
update
but
I
kind
of
wanted
to
get
like
as
a
general
principle.
The
city
I
have
70
thoughts
about
like
risks
there
in
terms
of
using
pod
state
for
these
updates.
I.
C: I guess here's my question. So, one thing... okay, so it depends on what your PVC is backed by, in terms of whether it actually is zonally locked or not, right. So if you're using, like, Elastic File Store in AWS, then you can move it all over the place; if you're using EBS... and then with Google, it depends on which version of PD you're using, right, like they do have multiple PDs. So the storage topology is a much longer conversation, which has been going on between SIG Apps, Scheduling, and Storage for a while.
C: Well, I guess I'd ask: why is it not declarative? It's clearly non-deterministic, right, which is definitely a change. But Deployments, when they're updating ReplicaSets, that behavior is non-deterministic at intermediate states, right? Like, eventually you converge toward having X number of replicas, but at any given point, regardless of what you specified as max surge or max unavailable... like, if you specify a max surge of five, you may end up at some point surging to six, just due to network partitions, right?
C
If
you
specify
A
Min
unavailable
two
or
three
you,
you
may
end
up
with
four
again,
just
because
of
network
partitions,
so
like
for
most
of
the
controllers,
the
behavior
in
turn,
it's
never
been
completely
deterministic
like
a
state
machine,
you're
guaranteed
to
go
from
X
Y
to
Z.
It
just
converges
toward
the
declarative.
The
users
declared
state
so
like
yeah
to
me
it
seems
declarative,
but
I'm
definitely
open
to
hearing
why
it's
not.
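(The Deployment analogy from above: maxSurge and maxUnavailable bound the rollout, but intermediate states are only eventually convergent.)

```yaml
apiVersion: apps/v1
kind: Deployment
metadata:
  name: web
spec:
  replicas: 10
  strategy:
    type: RollingUpdate
    rollingUpdate:
      maxSurge: 5          # may transiently be exceeded, e.g. under network partitions
      maxUnavailable: 3
  selector:
    matchLabels:
      app: web
  template:
    metadata:
      labels:
        app: web
    spec:
      containers:
        - name: web
          image: nginx
```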
F: Yeah, I guess the non-declarative nature I'm thinking of is, like: the zonal spread may adjust, and that may have effects on the application.
F
So,
like
maybe
from
the
like
staple
set
perspective,
you
know
we're
still
declaring
what
the
eventual
outcome
is
going
to
be.
But,
like
Drew
pointed
out,
you
know
there
are
some.
Maybe
some
rebalancing
risks
that
you
have
if
update
takes
too
long
and
that
sort
of
thing
so
there's
these
side
effects
that
may
result
if
we're
not
necessarily
rescheduling
to
the
same
Zone,
that
a
pod
was
updated
from
foreign.
C: I don't necessarily think that would change, if you mean... because really, what you're looking at in the proposal is basically: what we want to do is control termination so that we can burst it more rapidly across a particular failure domain, right? And because, in the case where you're on a cloud and you have zonal storage, you're still always going to get rescheduled to the same zone. That's just the way the scheduler is going to look at it: they're not going to put you on a node where you can't get the storage that you want.
F: Yeah, I think... unless you have, like, a capacity constraint during an update. Like, maybe you have a very aggressive autoscaler that scales down once resources are not being consumed.
C
Like
if
you
use,
if
you
use
cluster
Auto
scalar
on
AWS
with
EBS,
just
as
a
for
instance,
then
you
scale
down
to
zero
and
you
have
staple
sets
that
are
provisioned
and
looking
to
try
to
mount
volumes
in
the
zone
where,
where
it's
already
been
scaled
down
to
zero,
you,
you
can
get
unschedulable
pods,
that
just
kind
of
hang
and
break
so
that
that
is
already
a
thing.
H: I have a question. Okay, I just wanted to clarify the expectations about the next steps for the stateful set update strategy proposal. Specifically, I heard that we want to solidify the API in public, but I'm wondering what the bar is, like, what we're looking for specifically before we would accept a KEP as the forum for discussion on what that API looks like. Because what I'm hearing is that there are already a bunch of different companies who have either open source or closed source implementations of this, and they have some API that they're using; and whether or not they're able to share the code and have, like, competing, essentially out-of-tree, solutions to this, they can probably share the API.
C: It's really up to the contributors, right? Like, if you guys want to move forward with a KEP right now, we'll review it, and if we think we can be confident in it, and you're willing to contribute the code to make it run, we will shepherd it through. I'm just suggesting what you might want, if you have working code right now, right...
C
One
thing
that,
as
in
the
past,
help
motivate
contributions
to
move
around
rapidly
Intrigue
with
a
large
degree
of
success
is
to
do
it
as
an
open
source
thing
and
start
a
working
group
and
then
just
merge
it
and
after,
if
you
don't
want
to
do
that,
I
mean
like.
If
you
want
to
open
a
cap,
that's
fine
too,.
B: My remaining question with that is: I've not been part of the process where you start with open source contributions, like, I'm not used to that part. Like, we can, for example... I'm sure it's very easy for me to add what I need to Mariana's open source project, and then it's a team of two companies who, you know, have this one open source solution, and maybe we can use it for the other Elasticsearch use case for us, right. But then how do we reach out to other people?
B
How
do
we
say
Hey?
Listen,
this
exists.
Maybe
you
want
to
use
this
right
because
we're
not
advertisers
for
these
features.
We
don't
I,
don't
know
if
there's
a
forum
to
talk
about
this
or
or
people
engage
about
this,
so
my
concern
is
it'll
just
stall
at
that
point,
where,
like
it's
just
us,
we
figured
to
figure
this
out
and
then,
where
do
we
go
from
there?
We.
C: Specifically on, like, topology-aware updates: I would love to see that evolve to also respect the existing constraints. We could involve SIG Scheduling, because I'm sure they would be interested as well, and see if we can get some contributors from there; that would also kind of blend in, so that would be one avenue. If you look at, for instance, snapshots, right: like, that's a feature that has broad utilization across the community, and it's not in-tree at all, right? It is...
C: You know, if you feel like... like, I am hearing that people are saying 'maybe we have enough data.' If you feel like there is enough data from various users, you can also go and open up a KEP. My one pushback would be, again, from what I've seen presented today: the data you have seems very cloud-provider focused, and even the type of topology that you're talking about respecting is native to cloud providers, right, which isn't representative of the entire community of users.
C
So
just
looking
at
the
API,
you
have
I'm
like
that's.
That
may
not
be
ready
and
the
first
feedback
you're,
probably
going
to
get
from
the
broader
Community,
is
that
well.
This
is
very
different
from
the
topology
constraints
that
we
use
for
zonal
spreading,
which
are
aware
of
or
potentially
support,
arbitrary
failure
domains,
and
this
is
very
focused
on
cloud
provider
zones.
C
In
order
to
use
this
feature,
especially
when
the
scheduling
constraints
that
we
use
for
anti-affinity
and
affinity,
don't
work
that
way.
B: So, just two things on that: can we not use the same idea of just having, letting people specify a generic label, like we do for affinity and anti-affinity? Can we not just adopt that?
C: Bring that back in-house, use it at AWS, use it at Shopify, demonstrate that you've got the right thing, and then that's a strong motivator for, like... well, and it's Google too. I mean, you have two out of, like, three of the largest public clouds already using it, on top of, you know, some major tech companies like Shopify already using it. It doesn't make sense to not take in a KEP at that point, right?
C: That's kind of my two cents. Now, the other thing would be, like: if we find it's too difficult to do it out of tree, because if you want to use stateful set as a primitive, if you're not trying to fork stateful set and you want it to be a rolling update mechanism that's embedded in the controller itself, it may prove that, like, okay, the only way to do this is an alpha feature inside of stateful set, in which case, you know, we do a KEP and do it that way too.
C: That's also not a bad path. But it seems like we were able to do this out of tree already, at least in several cases. Having one method that, you know, is like 'this is great, it works for all three of us,' as opposed to three different things, and then saying 'well, this is what we would like to take as an API and offer to the public in general to utilize,' would be a very strong case to me.
A: Okay, thank you.
A: A different venue that you might want to reach out to: Michelle from SIG Storage reached out to us last time; there is a link to the Data on Kubernetes community. They are specifically talking about running storage-related applications on Kube, so it may be worth syncing with them, reaching out to them as a group, and gathering their additional feedback.
A: Maybe they will be able to either confirm your current approaches, or add additional use cases that you might not have thought about before, or basically give more info.
A: Okay, cool. The two other items I see, from Raul and Ilya: both of them have been brought up a couple of times before. I remember, Ken, that you were requested to look at the PVC recreation from last time; I'm not sure if you had any chance, but it would be nice for you to have a look. And probably similar questions for Ilya, about the multiple PDBs on a pod.
C: Yeah, we... I think... look, I think I was aware of the multiple PDB one; the... it's the PVC recreation, or PVC deletion and recreation.
A: Thanks a lot. Does anyone else have any other topics that they want to bring up with the group?