From YouTube: Kubernetes kops office hours 20200410
Description
Recording of the kops office hours meeting held on 20200410
A: Hello everybody, today is Friday, April 10th, 2020. This is kops office hours. I am your moderator/facilitator, Justin Santa Barbara; I work at Google. A reminder: this meeting is being recorded and will be put on the internet, so please be mindful of our code of conduct, which boils down to: be a good person. I have pasted a link to the agenda in the Zoom chat. Please feel free to add your name and any items you'd like to cover to that agenda.
A: Permissions for that, right. And this is actually probably a Kubernetes org-level permission on GitHub, so it probably isn't something that, for example, I could do. We should probably talk to either ContribEx or test-infra. I think we do have a use case for continuing to use Travis, which is: it is the only way that I know of that we can test on macOS (formerly known as OS X, or whatever it's called these days), and so I think we want to keep that going.
A: Yeah, the thing we're trying to check with Travis is: can the kops CLI be built on macOS with effectively go build, like, without a bunch of options? And that's what I think we can only get from Travis, but maybe we can make do with other things. I suggest we keep an eye on this one. If it happens again, can you ping me on the PR? You probably did and I just missed it.
A: I apologize, but if you ping me on the PR... I think we're getting closer on the PRs in terms of catching up, and so I should be able to be a little more responsive, and I can see what I can figure out as well. And we can maybe loop in ContribEx, who would be the people that would approve the change to a GitHub app. And hopefully this is just temporary, with everything being sort of overloaded right now in terms of all things computers and networks. So.
A: Alright, well, those are some good links, thank you; we can have a look and see what we can figure out. It sounds like the apps thing is something that might fix it, but I'm not getting a sense of overwhelming confidence either. You know, let's see. Okay, let's keep going. Mazzy89... who is that? Is Mazzy89 here?
A: The big one is around an "enabled" field, but I'm gonna send a follow-on PR unless someone beats me to it. It's just a bool versus pointer-to-bool problem — or a potential problem — that I would be more comfortable if we didn't have to face. But I think that's one of the last... that was one of the last blockers for 1.18; I have a sort of burn-down list for 1.18 alpha. Okay, so I assume Lee Fela dealt with that. Hakman: arm64 support for worker nodes.
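For anyone following along, here is a minimal sketch of the bool-versus-pointer-to-bool issue being discussed, with hypothetical field names rather than the actual kops API:

```go
package main

import "fmt"

// With a plain bool, an omitted field is indistinguishable from an
// explicit "enabled: false": both decode to the zero value, so
// defaulting logic can't help but clobber user intent.
type SpecPlain struct {
	Enabled bool `json:"enabled,omitempty"`
}

// With *bool, nil means "not set", which lets defaulting logic tell
// "omitted" apart from "explicitly disabled".
type SpecPointer struct {
	Enabled *bool `json:"enabled,omitempty"`
}

// effectiveEnabled applies a default only when the field was omitted.
func effectiveEnabled(s SpecPointer, defaultValue bool) bool {
	if s.Enabled == nil {
		return defaultValue
	}
	return *s.Enabled
}

func main() {
	explicitFalse := false
	fmt.Println(effectiveEnabled(SpecPointer{}, true))                        // true: unset, defaulted on
	fmt.Println(effectiveEnabled(SpecPointer{Enabled: &explicitFalse}, true)) // false: explicitly off
}
```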
A: Alright, yeah, must be a change of... such a change of roles. The primary value I can think of for the legacy etcd provider is that it more closely matches the way we would want to structure a future etcd manager which doesn't have everything baked in. But we can easily get that back from the code, and that is the least of the problems around it. So I would... I'm inclined — I'm always like this — to do things a little more gradually.
A: I think I do agree with you, John. The argument for tying things to Kubernetes versions rather than kops versions — I'll post it on the PR, but the argument is that we don't want to have to support older kops versions. We don't want people to say, "well, I'm gonna use kops 1.14 because it was the last version that had feature X."
E: It proceeds on rolling updates until you pass cluster validation some user-specified number of times, and it also removes the ten-second wait between those successful validations. But I mean, I would rather, you know, investigate and fix the underlying issue. As I see it so far, the underlying issue is that nodes mark that they are ready when they are, in fact, not ready. We have a workaround, which is cluster validation, but cluster validation is marking the cluster as validated when the cluster is not ready.
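As a rough sketch of the "pass validation some user-specified number of times" behavior being described — hypothetical names, not the actual kops rolling-update code:

```go
package main

import (
	"errors"
	"fmt"
	"time"
)

// validateCluster is a stand-in for kops cluster validation: it would
// check that nodes and system pods are ready and return an error if not.
func validateCluster() error {
	return nil
}

// waitForConsecutiveValidations returns once validation has succeeded
// `required` times in a row; any single failure resets the streak.
func waitForConsecutiveValidations(required int, interval, timeout time.Duration) error {
	deadline := time.Now().Add(timeout)
	successes := 0
	for time.Now().Before(deadline) {
		if err := validateCluster(); err != nil {
			successes = 0 // flapping readiness starts the count over
			fmt.Printf("cluster not yet valid, will retry: %v\n", err)
		} else if successes++; successes >= required {
			return nil
		}
		time.Sleep(interval)
	}
	return errors.New("timed out waiting for cluster to validate")
}

func main() {
	// e.g. require 3 consecutive passes, checking every 10s, for up to 15m
	if err := waitForConsecutiveValidations(3, 10*time.Second, 15*time.Minute); err != nil {
		fmt.Println(err)
	}
}
```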
B: Yes, but that was different — that was validate cluster, a different command, which we made fail if it didn't happen like that. But in this case, with all due respect, people don't use rolling update for testing; they use it for production. So I don't really want to see it fail so that I can report the bug. I want it to work: to roll the update, you know, wait until it's ready and then move over, and so on.
A: Can we make sure that that feedback is coming... I think Cochrane was saying, like, can we get that type of feedback in our tests? And I think that's a good way. I think it would be nice to surface... I don't know if there's a way we can surface the error — like saying "please report this" — without hurting someone's cluster.
E: It only fails when you go to a new instance group, and that is because I fixed the bug where we currently do a check before we start rolling an instance group, and that particular check doesn't retry. If that check fails, the whole thing stops. And the bug was that it was previously ignoring validation failures — so it would validate, ignore the failures, and then proceed. Okay. So now...
E: But then we would still have the problem of, you know, if it's flapping and returning success incorrectly, we're going to be rolling to the next thing before the cluster is ready. The two failures we've seen: the first one was that the controller manager was going pending, and so I added a check to make sure that every master had a controller-manager pod. Now it's the API server. So I think there's a problem where a node can validate successfully before the API server even knows that its static pods exist.
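For illustration, the shape of the "every master has a controller-manager pod" check being described might look like this with client-go — a simplified sketch, not the actual kops validation code:

```go
package validation

import (
	"context"
	"strings"

	metav1 "k8s.io/apimachinery/pkg/apis/meta/v1"
	"k8s.io/client-go/kubernetes"
)

// mastersMissingControllerManager lists masters that have no
// kube-controller-manager pod visible in the API server yet. Static
// pods only show up here once the kubelet has mirrored them, which is
// why a node can report Ready before this check passes.
func mastersMissingControllerManager(ctx context.Context, client kubernetes.Interface) ([]string, error) {
	nodes, err := client.CoreV1().Nodes().List(ctx, metav1.ListOptions{
		LabelSelector: "node-role.kubernetes.io/master",
	})
	if err != nil {
		return nil, err
	}

	pods, err := client.CoreV1().Pods("kube-system").List(ctx, metav1.ListOptions{})
	if err != nil {
		return nil, err
	}

	// Index which nodes are running a controller-manager pod.
	hasCM := map[string]bool{}
	for _, pod := range pods.Items {
		if strings.HasPrefix(pod.Name, "kube-controller-manager") {
			hasCM[pod.Spec.NodeName] = true
		}
	}

	var missing []string
	for _, node := range nodes.Items {
		if !hasCM[node.Name] {
			missing = append(missing, node.Name)
		}
	}
	return missing, nil
}
```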
G: So we're hitting that case of the initial validation failing because, right before we start the rolling update on our masters, we're scaling down the cluster autoscaler, which is running in kube-system. And because that pod happens to be going pending — like, it tries to do the validation, sees that it's pending, and fails. And so we put a retry in for that.
A: I think that was a great point about, like, let's make sure our e2e does not paper over it, no matter what. And then what I'm wondering is: can we surface it without... can we do the right thing and still surface it? Can we do the best thing we can for the user and still surface it in a way that we get the report — like some form of message — while still doing the right thing? I don't know. I mean, yeah.
A: Yeah, I mean, Kubernetes has a number of known issues like this. The other one that really bothers me is that network readiness is not always accurately reported, so a node can be considered ready before — for example, if you're using the AWS route mapping, there's no guarantee that the route mapping has been set up, because it's a separate controller. Yeah, well...
A: It'll say it's ready. So yes, this is one of those ones that has descended into, like, upstream finger-pointing and not a lot of progress, which is disappointing. And I guess that is why we have the validation, because ideally we would just be able to look at node status. I think there is slow progress being made — I don't know that it's great — but yeah, we need to figure it out.
A: Yeah... kube-proxy, yeah. But yeah, ideally one day we'll get kube-proxy to a daemonset, which has its own challenges because of, like, skew and architecture, but yeah.
A: No, my phone runs Kubernetes, yeah. Yes, this one was pretty deep. So yes, as the action item, I put this on the list as a 1.18 blocker: to figure out what we want the behavior to be here. I don't know... yeah, it sounds like we should address that, but I don't know that there's much more we can figure out here. That's fair.
A: I think that's reasonable. I think we need to figure out a good balance between not papering over the cracks, giving the right user experience, getting the data, and being a more accurate validate — right, actually, like, checking some of this stuff — and also being mindful of what's coming from upstream in terms of node readiness, which may negate some of the need for this. But that's not happening anytime soon, I expect. We've spent a long time on this one.
B: So, while cleaning up the removed Docker versions — the ones we no longer support — I noticed that there is quite a mess in the older versions, so my proposal was to remove some of them, like the duplicates. We have duplicates — as I wrote there, there's 18.06.1, .2, and .3. People who want to use 18.06 should use the latest available; I don't see any reason to have all of them.
A: Because I guess my concern is: suppose someone is happily using 18.06.1 and has, for whatever reason, tested that 18.06.3 breaks their workloads, and then we come along and say "you can't do that". I mean, I know they shouldn't be doing any of this, but that's what I'm sort of trying to balance.
H: There was a regression for kube-router in 1.16, and I believe we fixed the test coverage here as well, so we can start detecting these. But there's a cherry-pick now to backport this to 1.16, and unless we cut a patch here, it's gonna require anybody using kube-router to manually intervene for an upgrade to succeed.
A: I mean, I think that's great. And so there is an open cherry-pick that we need to approve — sorry, the cherry-pick has been approved; we just need to do the release to actually, like, ship it. "That's correct." Perfect. I think we should definitely do that — sorry, and I think we should do that.
A: Any objections? Alright. Any other things people want in 1.16? I guess... okay, so I'll do that. I think we're actually very close to cutting 1.18 — the next 1.18 alpha, or the first one, I thought, but the next one. And I just didn't see the point of doing it fast, like, you know, two minutes before the end of the meeting, so I will do that, I guess. Absolutely. More on that topic: I have also been catching up on YouTube...
A: ...uploads. I am able to do, I think, five a day, because I am sticking to using the API — because I refuse to click around in a GUI — and what I've written only allows me to upload five a day. But I will get there, at five a day. So I think we're... I think we got to the end of January, so we are two days behind.
G: The first validation check during the rolling update does just a single try right now. I did an update to make it use the same wait function, so the validation does multiple checks. The question is — I guess two questions. One: you know, is that the right approach, is that what we want to do, which is what we were discussing. And two: if we do merge this... the static test failed before because it was the only thing still using the validateCluster function, it looks like. So.
E: Yeah, well, the logging is a little different between the two, and you might... because the one that retries is a little bit noisy, and you don't necessarily want to go all out logging on the first validation. So I think you might want a slight difference between those two modes. The downside of retrying on that one is, you know, papering over the problems.
A: I just wanna say, like, thank you to everyone that merges PRs. It can be... I make mistakes, people make mistakes, and it's always a balance between, like, merging and trying to strike the right balance. If we never made a mistake, we'd be going too slowly. So I think, you know, it's great. So thank you to everyone that merges PRs and gives comments when we do make mistakes.
A: So, as background: this is a PR I put up which updated our Kubernetes version to 1.18 — sorry, the version of our Kubernetes libraries, in particular apimachinery and friends; we thankfully no longer depend on k/k. It updates them to 1.18. The gotcha, which everyone will notice very shortly, is that 1.18 client-go adds a context — it basically changes the signature of all the calls. It adds a context.
A: It also requires options as the final-ish parameter on all methods, whereas previously that was only on maybe update and list — and now there is a CreateOptions, there is a PatchOptions, all these things. So there is a ton of context threading that happens, and I don't think it's a bad thing; it's just icky. Now it's done. It was tedious to do, but it's fine.
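For illustration, the mechanical change this implies at every call site looks roughly like this — a generic client-go example, not a specific kops call:

```go
package upgrade

import (
	"context"

	corev1 "k8s.io/api/core/v1"
	metav1 "k8s.io/apimachinery/pkg/apis/meta/v1"
	"k8s.io/client-go/kubernetes"
)

func examples(client kubernetes.Interface, pod *corev1.Pod) error {
	ctx := context.TODO()

	// Before client-go 0.18 (Kubernetes 1.18):
	//   client.CoreV1().Pods("kube-system").Get("foo", metav1.GetOptions{})
	//   client.CoreV1().Pods("kube-system").Create(pod)
	//
	// After 0.18: every call takes a context, and every verb takes an
	// options struct (CreateOptions, UpdateOptions, PatchOptions, ...).
	if _, err := client.CoreV1().Pods("kube-system").Get(ctx, "foo", metav1.GetOptions{}); err != nil {
		return err
	}
	if _, err := client.CoreV1().Pods("kube-system").Create(ctx, pod, metav1.CreateOptions{}); err != nil {
		return err
	}
	return nil
}
```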
A: And then I think Hakman raised a good point, which is: at the time I started — before I threaded through the context — it was 1.18.0, and in the interim they released 1.18.1. So yes, I will probably update to 1.18.1. In general we don't follow it patch by patch unless we actually see something we need, but in this case there's no point going with 1.18.0 when 1.18.1 is right there; it's just a rebase to do. So I will.
A: Okay, any other topics before we go through the release plan? Okay, I'm gonna move "create the cluster API" beneath the other two. It remains on my agenda, but I feel like it's becoming embarrassing at this point, so I'm going to deprioritize it from the list so I can feel better about myself. So, as discussed, we will do a 1.16 release, including — I guess that's 8864, thank you, whoever put that in — and I presume there'll be some other deltas.
A: I mean, I've started tagging things for 1.19, but it's more like "not a 1.18 blocker". Like, I actually started on the reflection thing, but I put the work-in-progress reflection setter in 1.19, and I therefore also tagged "set instance group" for 1.19. And, like, all my work-in-progress went into 1.19, as in: these are not gonna make 1.18, so let's just get them off my screen.
A: Actually, when we do that, we are below 50 PRs, which is something I've never seen — if we exclude work-in-progress and anything already triaged to 1.19. So that's really good; thank you to everyone that drove that. Are there any other things that people... are there any things that people want, as, you know, a last chance to get in?
A
You
I
hoped
you
have
time
I
think
thank
you.
Everyone
actually
has
done
a
wonderful
job
of
clearing
the
most
of
the
backlog
of
PRS.
That
has
been
super
helpful
to
everyone.
That's
been
doing
that
we
have
three
minutes
remaining
in
our
scheduled
time.
I
don't
know
if
there
are
any
final
issues
or
final
topics.
If
you
want
to
discuss.