From YouTube: Kubernetes SIG Cluster Lifecycle 20180117 - Cluster API
Description
Meeting Notes: https://docs.google.com/document/d/16ils69KImmE94RlmzjWDrkmFZysgB2J4lGnYMRN89WM/edit#heading=h.gltopy3z23bu
Highlights:
- Design review for the node-controller-manager is next Monday
- New Slack channel for cluster-api discussions
- Reducing duplicate code (generated clients) in the git repo
- PRs for Machine / MachineSet
- Namespacing for Machines
- States for managed machines
A: Hello, and welcome to the January 17th, 2018 edition of the SIG Cluster Lifecycle Cluster API breakout meeting. I want to start today's meeting by doing follow-ups from last week's meeting. There was one action item that was assigned to myself, which was to schedule a design review meeting with the SAP folks that presented the node controller manager; that has been scheduled for next Monday at 10 a.m.
A: This was after Joe Beda did his TGI Kubernetes talk last Friday on the Cluster API. During that talk, Kris Nova and I were chatting offline, and she asked a couple of questions, like where people get together and talk about the Cluster API. I realized I didn't really have a good answer for her, so I created a Slack channel so that we could get together there and talk about the Cluster API.
A: But, in response to that, I know there was an issue filed and emails sent around, and I think it was unclear, if you didn't work at Google, how that was being addressed or fixed. So I'm hoping that, going forward, we can start doing those sorts of things in the SIG channels, to provide greater visibility and also to have more people be able to participate in finding and fixing problems.
B: So, right now in the repository there are multiple generated clients: one from the extension API, which was just recently added, and the one that I previously generated. The extension APIs that were added also include, for example, blank types for the API. So I suggest we split things up; I mean, just make the folders more understandable, with one folder only for the API, containing only the types, so that later on...
A: Yeah, that was also my understanding; this is just a transient state as we're spinning up the aggregated API server. But you do bring up some good questions about what should go in the cluster-api folder. I think there's pretty general agreement that the types should go in there, but I anticipate us having some shared libraries, or maybe even a shared controller-manager skeleton or something that we want to reuse across different implementations, and also the extension API server, where we're going to have other buckets of shared code. We should talk about whether we want that shared code to go in the cluster-api folder, so that it holds both types and shared code, or whether it should hold just the types, to keep it really small, and we use other top-level folders to put whatever we think is reusable into.
B: I mean, we could mimic what core Kubernetes is trying to do right now: having repositories, or let's say folders, for different things, so one for the API, one for the API server, and so on. They are trying to split things up, because if we just bundle everything together, we'll have really bad dependency hell in the future. And if you try to import only the API, you might get other things that you don't want to use, and so on.
A: Yeah, I do think you had a really good point, though, about importing the API types. If we bundle the aggregated API server with the API types, and another project wants to be a client of the Cluster API, then they end up importing all of our serving infrastructure for the API, when all they really care about is having a client that can talk to the API.
A: So keeping those separate, like maybe the cluster-api directory is effectively just the types, so it's really easy to vendor elsewhere. Then we can put the entire API server, which we do believe will be reused by pretty much all the implementations, somewhere where everyone trying to implement the Cluster API can easily get it, but where it isn't necessarily imported by people who are just trying to use the Cluster API. That makes it easy to build tools on top.
B: Everyone will need the client, which actually depends only on the types. So I think, in the end, what we care about is the client. If you don't want to modify the API server, you only need the client, and the client will get you the types. So this is more or less it.
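To make the vendoring point concrete, here is a minimal Go sketch of the consumer-side view being discussed. The import paths and the generated clientset's shape are illustrative placeholders (the real layout is exactly what was still being decided); the point is that a client of the Cluster API should only need the types package and a generated client, never the API server's serving machinery.

```go
package main

import (
	"fmt"

	metav1 "k8s.io/apimachinery/pkg/apis/meta/v1"

	// Illustrative import paths; the actual cluster-api package layout
	// was still under discussion at the time of this meeting.
	clusterv1 "example.com/cluster-api/pkg/apis/cluster/v1alpha1"
	clientset "example.com/cluster-api/pkg/client/clientset_generated/clientset"
)

// listMachines only depends on the types package and the generated client,
// not on the aggregated API server's serving infrastructure.
func listMachines(cs clientset.Interface) ([]clusterv1.Machine, error) {
	machines, err := cs.ClusterV1alpha1().Machines("default").List(metav1.ListOptions{})
	if err != nil {
		return nil, err
	}
	fmt.Printf("found %d machines\n", len(machines.Items))
	return machines.Items, nil
}
```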
A: Okay, so looking at where we are today: Fang is in the middle of transitioning stuff out of CRDs into the extension API server. So I think what we should do is wait for that transition to finish; maybe, as part of that, he can also do this split, or we can loop back, see what the directory structure looks like once that's finished, and then make sure we get it to the desired state, which should make it really easy to just grab the generated clients and the types. So what I would suggest is that we circle back on this in a week, or maybe two weeks, depending on how long it takes to extract the CRD code, and make sure this is still headed in the right direction. Sounds good.
A: Alright, so next I put an agenda item in for the Machines pull request. Jacob sent out this pull request quite a while back. At roughly the same time he merged a version of the pull request, so he could start writing code, and left the pull request open for comments. It has gathered a whole bunch of comments in that time, which at this point we should resolve, for a couple of reasons.
A: Jacob put a nice summary on the pull request of where we were at and the steps forward from there, and he also had a couple of things that I created new issues for. So we have two new issues that we should consider fixing, if you will, in the Machines API. His proposal is basically three steps: someone goes through and sanity-checks the delta between what's checked in and what's in the pull request, and then we get this thing merged.
A: So I just wanted to bring that up here and see if anybody wanted to volunteer to go through the types files and do that sanity check. I've been planning to do it, but I haven't had time yet, so if someone wants to do that now, that would be awesome. And also, see if anybody else has any outstanding comments on this pull request that they want to raise in this forum, versus filing issues or sending pull requests to modify the API once this pull request merges.
A: He notes that there are some differences between what's checked in and what's in the pull request; for example, the pull request uses an enum-like type for the machine role instead of a string, and there are a couple of other little ones. So, Martin, if that's something you're interested in, I linked to the comment that he put on that PR describing the steps he proposed, which I think make sense.
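For context, the enum-versus-string distinction mentioned above usually looks something like the following in Kubernetes API types. This is a hedged illustration of the pattern, not the exact code from the PR:

```go
package v1alpha1

// MachineRole is a typed string (the closest thing Go has to an enum), so
// the set of valid roles is declared in one place and can be validated,
// rather than accepting any free-form string.
type MachineRole string

const (
	MasterRole MachineRole = "Master"
	NodeRole   MachineRole = "Node"
)

// MachineSpec shows how the typed role would be used in a spec; the field
// name here is illustrative.
type MachineSpec struct {
	// Roles the machine should take on, e.g. Master or Node.
	Roles []MachineRole `json:"roles,omitempty"`
}
```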
A: So, the next thing on the agenda was the MachineSet pull request. This was sent by the Loodse folks to add a MachineSet sitting on top of Machines. I'm not sure if people have seen this; it was sent out right before what was a long holiday weekend in the U.S., so I want to bring it up for people's attention.
B: I have a comment. I already commented on the pull request, but let's discuss it here. If you look at the Kubernetes implementation of StatefulSet or Deployment, since version... actually from v1beta2 on, I believe, the selectors are immutable, so you cannot change them after you create them. Are we going to enforce this by default?
A: Yeah, I think one of our goals here should be to try to have consistency with other parts of the Kubernetes API, because then you get the principle of least astonishment: for people that are used to working with Deployments and ReplicaSets, when they start working with Machines, they behave very similarly. So I think if there's a pattern that's been established in ReplicaSet or Deployment for how we handle label selectors, we should follow it.
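As a hedged sketch of what "enforce this by default" could mean for a MachineSet, the following Go fragment mirrors the immutable-selector rule that apps/v1beta2 applies to Deployments, ReplicaSets, and StatefulSets. The MachineSet types and the function name are illustrative stand-ins for whatever the pull request lands on:

```go
package validation

import (
	"k8s.io/apimachinery/pkg/api/equality"
	metav1 "k8s.io/apimachinery/pkg/apis/meta/v1"
	"k8s.io/apimachinery/pkg/util/validation/field"
)

// Minimal stand-ins for the MachineSet type under review.
type MachineSetSpec struct {
	Selector *metav1.LabelSelector
}

type MachineSet struct {
	Spec MachineSetSpec
}

// validateMachineSetUpdate rejects any change to spec.selector on update,
// which is how apps/v1beta2 makes Deployment/ReplicaSet/StatefulSet
// selectors immutable after creation.
func validateMachineSetUpdate(oldSet, newSet *MachineSet) field.ErrorList {
	var errs field.ErrorList
	if !equality.Semantic.DeepEqual(oldSet.Spec.Selector, newSet.Spec.Selector) {
		errs = append(errs, field.Invalid(
			field.NewPath("spec", "selector"),
			newSet.Spec.Selector,
			"field is immutable"))
	}
	return errs
}
```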
A: So, as I recall, this issue was created because we were talking about whether Machines should be a non-namespaced resource, which means they would implicitly only apply to the cluster that they live in, or whether we should put Machines in namespaces, which would allow the flexibility to have machines in a namespace represent machines in a different cluster, as opposed to the cluster that you're in; or multiple namespaces could represent multiple other clusters.
E: And this also goes into a little more detail about whether we should adopt some conventions, like saying the kube-system namespace and the machines in it are reserved for locally managed clusters, whereas every other namespace can be assumed to be remote. Anyway, read through the issue; there are three options, and you can propose more, but I just want to get consensus on how we should approach this and close it out, so we can move forward.
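A minimal sketch of the convention being floated here, purely illustrative since none of the three options has been decided: a controller could treat kube-system Machines as belonging to the local cluster and Machines in any other namespace as describing a remote cluster.

```go
package main

import "fmt"

// isLocalMachine sketches the proposed convention: Machines in the
// kube-system namespace are managed as part of the local cluster, while
// Machines in any other namespace are assumed to describe a remote cluster.
func isLocalMachine(namespace string) bool {
	return namespace == "kube-system"
}

func main() {
	for _, ns := range []string{"kube-system", "team-a-cluster"} {
		fmt.Printf("machines in %q are local: %v\n", ns, isLocalMachine(ns))
	}
}
```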
F: On our side, we were looking at this this week. We would love the flexibility to potentially have these namespaced. Certainly they could be used just to manage local clusters as well, but it would be great to have the door open for both. I think, for our purposes, we're definitely going to need that sort of remote-cluster approach that we talked about before, but we're also talking about use cases where we might be using local clusters as well.
A: So the next thing I want to talk about is states for machines. One thing that became clear as we were having some discussions last week is that we're implicitly defining a model for the lifecycle of a machine as we build the Machines API and start implementing it. I think it's important that anyone using the API knows that if they're using it on AWS and they switch to using it on GCP too, and they've written tooling around it, then, for example, when they delete a machine and expect it to go through a certain set of states before it's gone, that behavior is consistent across environments, and that tooling will be portable across environments. Because I see things right now where you have two choices: if you delete a machine, the Machine object can disappear immediately, and the reconciler can say, "oh, it's not there, let me delete the cloud resource." Or it could go into something like a terminating state, which I see in some other resources in Kubernetes, where it stays in the terminating state until the underlying resource has been removed, and then the Machine resource gets removed. I think it's going to be important for us to start codifying what the states are and exposing some of those states in the API itself.
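In core Kubernetes, the "stay in Terminating until the underlying resource is gone" behavior described above is typically built with finalizers: an object with a deletion timestamp and a non-empty finalizer list stays visible until every finalizer is removed. Here is a hedged sketch of how a machine reconciler could use that pattern; the types and names are illustrative stand-ins, not the project's code:

```go
package main

import (
	"fmt"
	"time"
)

// Minimal stand-in for the real API type; purely illustrative.
type Machine struct {
	Name              string
	DeletionTimestamp *time.Time
	Finalizers        []string
}

const machineFinalizer = "machine.cluster.k8s.io" // illustrative name

// reconcileDelete sketches the finalizer pattern: while the finalizer is
// present, the Machine stays visible in a Terminating state, giving the
// reconciler a chance to tear down the underlying cloud resource first.
func reconcileDelete(m *Machine, deleteCloudResource func(*Machine) error) error {
	if m.DeletionTimestamp == nil {
		return nil // not being deleted
	}
	if err := deleteCloudResource(m); err != nil {
		return err // retry later; the finalizer keeps the object around
	}
	// Cloud resource is gone: drop our finalizer so deletion can complete.
	kept := m.Finalizers[:0]
	for _, f := range m.Finalizers {
		if f != machineFinalizer {
			kept = append(kept, f)
		}
	}
	m.Finalizers = kept
	return nil
}

func main() {
	now := time.Now()
	m := &Machine{Name: "node-1", DeletionTimestamp: &now, Finalizers: []string{machineFinalizer}}
	_ = reconcileDelete(m, func(*Machine) error { return nil })
	fmt.Printf("finalizers after reconcile: %v\n", m.Finalizers)
}
```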
A: So first, I guess: comments, concerns, questions about the high-level view here? I've got more details I can talk about, in terms of some proposed states and state transitions, but I want to see what people think about the idea of having a consistent state machine that we apply across environments first.
A: All right, well, I can jump into that, unless other people have comments. Or, Phillip, do you want to say a couple of words about this? Phillip is from our Warsaw office and actually drew up a list of states and sent it to me this morning, which was awesome. So I can talk through it, or I can just let him do it, since he's here.
D: Yes. Sorry, can you hear me okay? So, for context: I was previously working on the managed instance group team, and the current deployment of Kubernetes clusters in GKE runs on managed instance groups, so that's the place I'm coming from. The states that we found useful from our point of view might be a good starting point for a deeper discussion, so that we try not to reinvent solutions to similar problems, but instead learn from those experiences.
A: So, if you think about the state diagram, you start at a new state and you end up at a tombstone state, and you go through a number of transient states and steady states in between. The most common place where we expect machines to be is in a running or serving state, which is basically: things are working normally, the machine is part of the cluster, and we're running pods on the machine. Effectively, we are able to serve traffic from this machine, and everything is running properly.
A: Another steady state that we expect to have is something like standby or drained. You can imagine I have a cluster and I want to cordon a machine; we explicitly represent the state of a machine being cordoned as a different state than serving or running, because this machine, while it's still part of the cluster, still has a kubelet reporting status.
A
We
are
no
longer
able
to
schedule
new
things
onto
this
machine
using
the
default
scheduler
and
so
I
think
that
that
is
may
be
useful
to
explicitly
have
be
a
different
state
in
the
system,
and
then
you
can
say
you
know
if
I
want
to
get
machines
and
see
all
the
machines
that
are
cordoned.
That
makes
that
really
easy.
We
have
a
couple
of
states
proposed
that
are,
in
my
mind,
maybe
a
little
bit
overlapping,
which
is
stopped
and
suspended,
and
so
this
is,
if
you
imagine,
a
virtual
machine,
you
can.
A
You
know,
press
a
button
in
the
console
to
stop
the
machine
from
running
or
if
you
have
a
physical
machine,
you
could
do
the
power
off
on
that
machine.
The
machine
is
still
represented
in
the
API.
The
fact
that
it
is
not
actively
reporting
health
status
means
doesn't
mean
that
we
should
necessarily
delete
it
from
the
cluster
right,
which
is
something
that
we
do
today,
the,
but
that
is
still
sort
of
represented,
and
we
know
that
the
the
Machine
still
exists,
like
the
physical
virtual
machine
is
still
there.
A: So those are the steady states that we could think of from the start: you start without a machine, and you end up, hopefully, in the serving or running state, where everything's working. Maybe you cordon it, you uncordon it, you restart your machine, but generally you're hopefully in this serving state where everything is working properly, and then, throughout the lifecycle, you go through these different states.
A: There are lots of transitionary states that you end up in between these steady-state places where we expect machines to be most of the time. In particular, if you want to go from running to cordoned or drained, we have a state transition there, where you go from running to draining to drained, so we can explicitly represent the fact that we are actively draining a machine.
A: Draining means that you cordon the machine, you make it unschedulable, and then you start removing the pods that are running on that machine, and this can often take a while. If you think about it, pods are able to set graceful termination timeouts, so when you want to delete a pod from a machine, it can take quite a long time to do that.
A: So you can end up in this draining state for a while; we don't expect that you go to draining and then, a second later, you're in the drained state. You could be in draining for hours. There's also a possibility to put a state sort of between running and draining, where we can ask the system, "I want to drain this node," and the system can say no.
A: We can rely on things like pod disruption budgets, where we can say: sorry, you can't actually drain that node right now, because if you were to drain that node, you would not be respecting the disruption budget for this pod that's running on that node. So we can basically bounce you back into the serving state by rejecting the request to start draining the node.
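The "system can say no" behavior already exists in core Kubernetes via the eviction subresource, which refuses evictions that would violate a PodDisruptionBudget. A drain implementation along these lines might look like the following sketch; client-go's Evict helper on pods is real, while the surrounding plumbing is illustrative:

```go
package drain

import (
	"fmt"

	policyv1beta1 "k8s.io/api/policy/v1beta1"
	metav1 "k8s.io/apimachinery/pkg/apis/meta/v1"
	"k8s.io/client-go/kubernetes"
)

// evictPod asks the API server to evict a single pod. Unlike a plain
// delete, the eviction subresource checks PodDisruptionBudgets and returns
// an error when the eviction would violate one; that rejection is the
// signal that could bounce a machine from draining back to serving.
func evictPod(client kubernetes.Interface, namespace, name string) error {
	eviction := &policyv1beta1.Eviction{
		ObjectMeta: metav1.ObjectMeta{Name: name, Namespace: namespace},
	}
	if err := client.CoreV1().Pods(namespace).Evict(eviction); err != nil {
		return fmt.Errorf("eviction of %s/%s rejected: %v", namespace, name, err)
	}
	return nil
}
```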
A: So, if we have this notion of a stopped or suspended state, we can imagine a transition from serving to stopped going through a stopping phase. You might drain your node and then start stopping your node before it ends up in a stopped state; and, vice versa, going from stopped to serving, you might start your node again, uncordon your node, verify that things are all working correctly, and then end up back in the serving state.
A: Obviously, there's also a deleting state, where you go from "my machine is either running or drained" to "I don't want it anymore," and so we put it into a terminating state before it's actually removed. And then I think there are some other states that we've been thinking about, where we might be trying to reconfigure a machine: you might take a machine that's running, reconfigure it dynamically, and it comes back into the running state.
A: Or maybe you drain it first, then reconfigure the machine, and then it goes back to the running state. So you can imagine draining a machine, doing an operating system update, undraining the machine, and then it's back in the serving state. Or, if you're able to do in-place upgrades, for the kubelet or for the Docker engine, you might have a machine in the running state, update it in place, and it stays in the running state the whole time.
A: So that's a high-level overview. We've started trying to draw this up in a picture; I didn't have time to render it before the meeting started, but I will try to post it publicly and link it in Slack later today, hopefully, and we can definitely talk through it pictorially next week. But I did want to start the conversation, talk about some of the states that we've started thinking about, and see if there are big gaps people can identify, in terms of other states that you think the system might end up in that are missing. So I think the impetus here is that we want to agree on what the state machine looks like: what the states are, what the states mean, and what the state transitions are.
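Pulling the states from this walkthrough together, here is a hedged sketch of what codifying the state machine could look like in Go. The state names and allowed transitions are taken from the discussion above, but nothing here is settled; it is a starting point, not a proposal:

```go
package main

import "fmt"

// MachineState enumerates the lifecycle states discussed above. Steady
// states (New, Running, Drained, Stopped, Tombstone) sit between transient
// ones (Creating, Draining, Stopping, Starting, Terminating).
type MachineState string

const (
	StateNew         MachineState = "New"
	StateCreating    MachineState = "Creating"
	StateRunning     MachineState = "Running" // a.k.a. Serving
	StateDraining    MachineState = "Draining"
	StateDrained     MachineState = "Drained" // a.k.a. Standby; cordoned
	StateStopping    MachineState = "Stopping"
	StateStopped     MachineState = "Stopped" // possibly merged with Suspended
	StateStarting    MachineState = "Starting"
	StateTerminating MachineState = "Terminating"
	StateTombstone   MachineState = "Tombstone"
)

// validTransitions encodes the walkthrough from the meeting, including the
// "system says no" edge (Draining back to Running when a disruption budget
// rejects the drain) and in-place updates that never leave Running.
var validTransitions = map[MachineState][]MachineState{
	StateNew:         {StateCreating},
	StateCreating:    {StateRunning},
	StateRunning:     {StateRunning, StateDraining, StateTerminating},
	StateDraining:    {StateDrained, StateRunning},
	StateDrained:     {StateRunning, StateStopping, StateTerminating},
	StateStopping:    {StateStopped},
	StateStopped:     {StateStarting},
	StateStarting:    {StateRunning},
	StateTerminating: {StateTombstone},
}

func canTransition(from, to MachineState) bool {
	for _, s := range validTransitions[from] {
		if s == to {
			return true
		}
	}
	return false
}

func main() {
	fmt.Println(canTransition(StateRunning, StateDraining)) // true
	fmt.Println(canTransition(StateStopped, StateDraining)) // false
}
```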
A: Yeah, I think some part of that comes from a little bit of dissonance that we have: the node and the machine are not the same object; they're two different objects. In an ideal world, if they were the same object, we'd have the desired state in the node object, and then, if you said "I want to drain this node," you'd put that in the desired state for the node, and it would be reflected in the actual state.
A: So one thing we should think about is whether we want to maintain that: "I desire my node to be drained, therefore I put it in the Machine object and it gets reflected in the Node object." Or whether we think people are going to just reach around and run kubectl cordon, and it might be difficult to reflect the fact that a node is being cordoned in the Machines API.
D: So maybe one argument for putting it at the machine level: once we have higher-level controllers, like a machine deployment, we'll also want to track the number of machines that are not serving, to be able to respect a disruption budget for the machines in the cluster, for example, "do not drain more than two machines at once." So it might be more natural to be able to figure this out just from looking at machines, instead of also going and looking at nodes.
A: That's part of why we want to talk about it here, and, like I said, I apologize for not having a picture to present right now. I think that would make this a lot easier to talk through, and I will send it out, and we can definitely talk over it more next week. But I also wanted to start the conversation, and not wait for the picture before putting this in people's minds, so you can start thinking about it.
A: Okay. So, as Chris mentioned, please go take a look at the issue that he linked in the notes; we'd like to wrap that discussion up in the next week or so. We also want to wrap up the Machines PR here pretty quickly and get a version of the MachineSet PR merged. So if you have some bandwidth (you have 25 minutes back, since we're ending early), please go take a look at those issues and pull requests, and join us on Slack, and we'll continue discussing there between meetings. Thanks.