From YouTube: 20201021 Cluster API Office Hours
A
Okay, so welcome, everyone. Today is Wednesday, October 21st, and this is the Cluster API office hours. Cluster API is a subproject of SIG Cluster Lifecycle.
A
During this meeting, please make sure you follow the CNCF code of conduct, which you can find linked at the top of this document if you haven't read it. Please be sure to raise your hand if you'd like to speak, and add any agenda items to the agenda right here. So, to kick it off, I think Vince has a PSA, so I'll hand it off to Vince.
B
Sure, thank you. This is something I just wanted to point out: v1alpha4 already has a lot of changes merged in, and during the alpha2-to-alpha3 cycle we wrote a migration document, linked here, which was extremely useful. I also gathered some feedback, so here is probably what we want to do going forward.
B
We should, one, backfill the changes that have already been merged into a similar document, and two, require that all new breaking changes are documented in this document. So this is more of a call-out, both for reviewers, to watch out for breaking changes, and for provider implementers, because the changes that are coming in could be extensive, and probably will be. So take a look at that document, and if you have any questions, reach out.
A
Thanks, Vince. Yes, and you have a question?
C
Yeah, it's more of a note. For our provider, we also started documenting these breaking changes, or at least the ones we're planning, and the goal is to make it a living document for anyone consuming the provider. So it might be something good to adopt across providers as well, so that end users know what is going to change.
A
Yeah, great point. And the one in CAPI, I assume, is more for providers? Who's the target audience, I mean: is it more the users, or the providers who are going to be adopting these changes?
B
The document I wanted to point out is probably more for providers. For users, hopefully clusterctl upgrade will take care of most of those things in terms of just upgrading, and if there are API changes, those should definitely be documented.
A
Yeah. And is there an existing living doc for v1alpha4, or is that to be created?
B
It's to be created. We have merged a few PRs in, and I have to collect the changes that have already been merged, but I'll probably start at the end of next week or the week after that.
A
Okay, maybe it would be a good idea to just create a blank document, even if it doesn't have all the changes so far, just so people can add their stuff in the next few weeks as they create changes.
D
Okay, so I have a cluster with three control plane machine nodes, and basically what I'm trying to do in this demo is first make one of these nodes fail and show how the remediation works out.
D
In order to do so, and make it visible, I have a slightly modified version of the code in my PR, and basically I'm supporting two annotations. One annotation blocks the remediation: the machine can enter the remediation path, but it blocks before the remediation is issued, which will allow us to see that step. And the second annotation blocks the machine from being deleted, so we will see the remediation being started but not completed. This will allow us to follow along a little bit. I'm applying those annotations to one of the three machines, and now, in order to kick off the remediation, I'm applying a label, so this machine will be checked by an MHC which is running in the cluster.
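As a rough picture of that setup, the targeted control plane Machine might look something like the sketch below. The annotation keys are hypothetical stand-ins for the demo-only blocking annotations in the PR, and the extra label is just an example of something the MHC selector could match; required spec fields are omitted for brevity.

```yaml
apiVersion: cluster.x-k8s.io/v1alpha3
kind: Machine
metadata:
  name: my-cluster-control-plane-abc12                 # illustrative machine name
  labels:
    cluster.x-k8s.io/control-plane: ""                 # standard control plane machine label
    demo/mhc-target: "true"                            # hypothetical label matched by the MHC selector
  annotations:
    demo.cluster.x-k8s.io/block-remediation: "true"    # hypothetical: pause before remediation is issued
    demo.cluster.x-k8s.io/block-deletion: "true"       # hypothetical: pause before the machine is deleted
spec:
  clusterName: my-cluster                              # bootstrap and infrastructureRef omitted for brevity
```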
D
So now that I've applied this, the MHC, which is configured to consider the machine unhealthy immediately, kicks in. Basically what happened is that the MHC applied a condition to the machine saying that the node is unhealthy, because the condition has been reporting False for more than 15 minutes.
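For reference, a MachineHealthCheck targeting control plane machines with an aggressive timeout, roughly in the spirit of the one used in this demo, might look like the following; the cluster name, selector, and timeout values here are illustrative rather than the exact ones from the demo.

```yaml
apiVersion: cluster.x-k8s.io/v1alpha3
kind: MachineHealthCheck
metadata:
  name: control-plane-mhc
  namespace: default
spec:
  clusterName: my-cluster                    # illustrative cluster name
  selector:
    matchLabels:
      cluster.x-k8s.io/control-plane: ""     # target the control plane machines
  unhealthyConditions:
  - type: Ready
    status: "False"
    timeout: 10s                             # very short timeout, for demo purposes only
  - type: Ready
    status: Unknown
    timeout: 10s
```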
D
Once it is unblocked, KCP will start taking care of the remediation, and now KCP is taking care of it. Basically what happened is that KCP started processing the remediation, so the remediation is in progress, and in fact the remediation in KCP is deleting the node. As you can see, the node is being deleted, and the infrastructure machine is also being deleted at this stage. And so, if I now unblock deletion, I let the process continue.
D
Basically, the machine is being deleted, right, okay. And then, when the machine is deleted, the normal scale-up process kicks in, and so we have KCP restoring the third control plane node. I'll pause here, because I know it is not easy to follow; so I'll stop here if there are questions, and then I will show another use case where remediation cannot happen because there are failing etcd members.
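Roughly speaking, the handoff you can watch for on the unhealthy Machine looks like the condition snippet below. The condition types and reasons shown are approximate and only meant to illustrate the flow from the MHC flagging the machine to KCP owning the remediation; the exact names depend on the Cluster API version and the PR being demoed.

```yaml
status:
  conditions:
  - type: HealthCheckSucceeded      # set to False by the MachineHealthCheck
    status: "False"
    reason: UnhealthyNode           # illustrative reason
  - type: OwnerRemediated           # stays False until the owner (KCP here) finishes remediation
    status: "False"
    reason: WaitingForRemediation   # illustrative reason
```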
A
Very cool. I don't see any hands raised right now. Anyone have any questions?
D
Okay, so I'll move on to the second example of the remediation. Now, basically what I'm doing is creating a critical situation: I'm going into this control plane machine and making etcd fail, and then I will try to remediate another machine. But this will not be possible, because if I do remediate that machine, I will lose quorum.
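The reasoning here is just etcd's majority requirement; a sketch of the arithmetic, assuming a three-member etcd cluster as in this demo:

```latex
% etcd requires a majority (quorum) of members to stay available:
\[
  \mathrm{quorum}(n) = \left\lfloor \frac{n}{2} \right\rfloor + 1,
  \qquad
  \mathrm{quorum}(3) = 2 .
\]
% With one of the three members already failing, only two healthy members remain,
% which is exactly quorum. Deleting (remediating) a second machine would leave a
% single member, below quorum, so KCP refuses to remediate.
```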
Okay,
there
is
a
pr
out
that
now,
as
you
can
see,
it
is
not
visible
in
condition,
but
there
is
a
pr
out
from
cedar
that
will
make
basically
the
tcd
member
feeling
visible
in
condition
as
well,
but
now
I'm
going
to
so
this
one
is
the
machine
with
this,
defending
I'm
trying
to
remediate
this
one,
so
I'm
basically
forcing
this
machine
being
being
remediated
and
what
what
happened.
It
happens
that.
A
This is awesome. I think we have a question from...
A
Joe, I don't know if you're muted, Joe, but we can't hear you.
A
Oh, and sorry, I thought you were raising your hand.
A
Something... okay, maybe having technical difficulties. Yeah, this is really cool. I had a question: did you install a MachineHealthCheck on your cluster in order to be able to do this?
D
Yes, I started a MachineHealthCheck which is configured to make a node be marked as failed immediately, but this is only for testing. You can have your checks targeting the control plane machines and configure them as you do for the nodes.
A
Okay, sounds good, cool. And also, a very nice UI, or presentation of the overview of the cluster; I'm sure I'm not the only one who can't wait to be able to use this as well. Very polished. All right, any other questions?
F
Hello, can you hear me?
F
Awesome. I had a question: is this quorum check pluggable? Like, for example, if I had a Rook cluster and I wanted to make sure that I didn't lose quorum of my monitors?
F
So, you know, I could host a Rook cluster on Kubernetes nodes, right? And so how would I use the same process there?
A
Jason is saying something in the chat: you could use the upcoming external remediation if your Rook cluster is managed by a MachineDeployment.
A
But yeah, I think the proposal for that is merged, or I think so. Yeah, Andy, did you have something?
E
Yeah, so KCP is the kubeadm control plane, and it specifically uses kubeadm to manage machines that represent a Kubernetes control plane. So I would second what Jason suggested in chat: if you have Rook deployed on your cluster, that is managed totally separately from the Kubernetes control plane, so you would need external remediation or some other way to deal with that, sure.
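For context, the KubeadmControlPlane object that KCP reconciles looks roughly like the sketch below; the names, version, replica count, and infrastructure template kind are illustrative.

```yaml
apiVersion: controlplane.cluster.x-k8s.io/v1alpha3
kind: KubeadmControlPlane
metadata:
  name: my-cluster-control-plane           # illustrative name
spec:
  replicas: 3                              # three control plane machines, as in the demo
  version: v1.19.3                         # illustrative Kubernetes version
  infrastructureTemplate:
    apiVersion: infrastructure.cluster.x-k8s.io/v1alpha3
    kind: DockerMachineTemplate            # provider-specific template; Docker used as an example
    name: my-cluster-control-plane
  kubeadmConfigSpec:
    clusterConfiguration: {}
    initConfiguration: {}
    joinConfiguration: {}
```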
E
Yes, so your question, then, is: I have multiple things running on my control plane, and I want to consider not only etcd health, I want to consider other things before remediating. I think pod disruption budgets would probably be useful there, to prevent draining a node until you've got the right number of replicas elsewhere. I don't really think this is a kubeadm control plane problem.
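As a sketch of that suggestion, a PodDisruptionBudget protecting Rook's monitors might look like the following; the namespace and pod labels follow common Rook conventions but are assumptions that should be checked against the actual deployment.

```yaml
apiVersion: policy/v1beta1
kind: PodDisruptionBudget
metadata:
  name: rook-ceph-mon-pdb
  namespace: rook-ceph              # assumes the default Rook namespace
spec:
  maxUnavailable: 1                 # never allow more than one monitor to be drained at a time
  selector:
    matchLabels:
      app: rook-ceph-mon            # assumed label on the Rook monitor pods
```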
F
It just seems like it's a common pattern of wanting to check a quorum at the higher application layer before remediating.
A
So, for what it's worth, the CAPZ conformance periodic jobs are still passing, if that helps. But I think it would be a good idea to have either a periodic triage or maybe alerting in some way, so we can pay more attention to those failures.
G
For those specific jobs there is alerting; we are getting the message in the release email. And especially for CAPG, they are quite different from the other providers: they build the conformance testing using the Kubernetes k/k repo, using Bazel and a lot of other stuff behind the scenes, and it looks like the way to build the conformance tests changed a little bit and the script was out of date. There's a PR in place to fix CAPG.
G
As for CAPA, I made a comment in this channel; I'm just waiting, maybe for Andy, if he can comment on that. There's something missing, a missing template; that's why it's failing.
H
Yeah, I was just going to say, with Fabrizio's work to create a more unified dashboard of the various conformance jobs that we have out there, that would probably be a good tool that we can use to triage any issues with those tests during this meeting on a regular basis.
A
Yes, definitely. And then I think we've also talked about using those as release-informing for CAPI itself at some point, which would also be a natural step. And then we have a suggestion in the chat to add the SIG Cluster Lifecycle mailing list to the alerts, if they don't have it yet. That's a good point, because if it's just alerting the release list, the CAPI maintainers might not be aware of it.
A
Thank you. Any other comments, thoughts, or concerns on this topic?
A
Okay, thanks again, Carlos, for stepping up and investigating, and let us know if you need any help on that front. All right, Jason, you have the next one.
H
Yeah, so, following up from the meeting last week, I'm trying to schedule a kickoff meeting for anybody who's interested in working on the load balancer provider proposal.
H
Yeah, so the basic idea is that we currently have the idea of a load balancer provider within the vSphere provider, and there are other kinds of provider implementations that can't rely on a default cloud-managed load balancer. So the idea is to bring the load balancer up to being a first-class provider within Cluster API itself, and then that gives other providers, similar to vSphere, that don't have a default built-in load balancer, the ability to share common implementations.
H
It also will open up the ability to more easily swap out load balancers, even for cloud providers. So, for example, in AWS right now we use a classic ELB; it would potentially give us the ability to create an NLB equivalent that could be swapped out relatively easily. So yeah, that's that.
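Purely as an illustration of the idea, and not the proposal's actual API, a first-class load balancer object that a cluster could reference, and later swap for a different implementation, might be sketched like this:

```yaml
# Hypothetical sketch only; no such resource exists in Cluster API today.
apiVersion: infrastructure.cluster.x-k8s.io/v1alpha4
kind: AWSLoadBalancer
metadata:
  name: my-cluster-apiserver-lb
spec:
  type: nlb            # e.g. swapped from "classic" to "nlb"
  # A Cluster or control plane object could then reference this by name, for
  # example through a hypothetical loadBalancerRef field, and swapping
  # implementations would mean pointing that reference at a different object.
```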
I
Yes, last week I brought this up, that I opened the Google Doc for this feature maybe three weeks ago. This is more of a question, because, like I said, I asked people to comment on the Google Doc last week, and it isn't collecting that many comments anymore. So what is the current policy: how long should we keep this open, and what are we expecting from these Google Docs? That's what I'm kind of asking.
A
Yes, so I'm not familiar with the proposal itself, but is it targeting v1alpha4? Yes? Okay, great. So there is a proposal process documented in the project, and I think the general guidance is to leave it open as a Google Doc for a while, unless there are any big blocking comments, and then to open it as a PR; I think Vince just shared it. And it should be in the implementable state, but other than that...
I
Yeah, I thought so. So, if we don't have any other thing on the agenda, I would like to quickly ask, because I got a comment about in-place upgrades.
I
Obviously it is a different thing, but have you ever discussed anything about whether we could have an in-place option while upgrading? For example, in bare metal we might have certain disks attached to the servers that we might want to reuse during the upgrade, so that we won't take a new, fresh server; we probably want this new machine to be the same node as the earlier one, but upgraded.
A
Yes, so it is a topic that comes back every once in a while. If you search through the doc, I just found the notes from last time; there was a pretty extensive discussion. I think you can find the recording; it was from September 2nd. But I think Jason has a comment, so I'll let him speak.
H
That said, I would expect right now it would take another control plane implementation to be able to implement in-place upgrades. There shouldn't be anything that would prevent that, but it hasn't been something that we've talked about supporting, because of various reasons that I'm sure are linked in that discussion Cecile mentioned.
I
Yeah, yeah, that's actually good to know. So we probably need to find another way to implement this, maybe in Metal3, in how we select the nodes. And yeah, this is good to know, even if it's not even being thought of, so I'm not going to wait for the discussion to go further, at least right now. Thanks, Jason.
A
All right. So, unless anyone has any topics, I think this is the end of the agenda. Just a reminder that if you have any current proposals that are open, please make sure they're in this list if they're still in Google Doc form, so we can track them. And if you are looking at this list and something is interesting to you, please make sure you review it. And yeah, until next time. Oh, Andy, did you want to add something?
E
Yes, one quick follow-up on the alerting around the test failures. We do have a Google group that's linked in the notes right now, and there are a few of us who have ownership rights on that. So if you are interested in receiving alerts any time there are failures from Prow related to Cluster API jobs, please feel free to reach out to me; basically, I need your name and your email address, and I'll be happy to add you.
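For anyone curious how that kind of alerting is usually wired up, TestGrid alert emails are configured through annotations on the Prow job definitions in kubernetes/test-infra. A rough sketch, assuming a periodic Cluster API job; the job name, image, command, and alert address are placeholders and should be checked against the real job configs.

```yaml
periodics:
- name: periodic-cluster-api-e2e                            # placeholder job name
  interval: 2h
  annotations:
    testgrid-dashboards: sig-cluster-lifecycle-cluster-api  # TestGrid dashboard for CAPI jobs
    testgrid-alert-email: capi-alerts@googlegroups.com      # e.g. the Google group from the notes
    testgrid-num-failures-to-alert: "2"                     # alert after two consecutive failures
  spec:
    containers:
    - image: gcr.io/k8s-staging-cluster-api/capi-e2e:latest # placeholder image
      command: ["./scripts/ci-e2e.sh"]                      # placeholder entrypoint
```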