From YouTube: Kubernetes SIG Cluster Lifecycle 20180228 - cluster api
Description
Link to doc: https://docs.google.com/document/d/16ils69KImmE94RlmzjWDrkmFZysgB2J4lGnYMRN89WM/edit#
A: Hello, and welcome to the Wednesday, February 28th edition of the Cluster API working group, a working group of SIG Cluster Lifecycle. We have a few agenda items today, and it looks like folks are just now signing into the doc. The first agenda item we have is from Robert, who is out for the next few weeks and says he won't be responding by email or Slack or otherwise be reachable. So if anybody needs him or anything else, contact us or the rest of the SIG, and we can probably help you.
B: Right now, this PR only updates to 1.9. It does not have a fix; there's a dependency on the apiserver-builder to actually have that fix. I can probably link the PR from the apiserver-builder side as well; there's a problem in how they generate strategies for merging, which doesn't seem to work. That PR needs to be in before we can actually... we'll need to do another update to get the apiserver-builder, but for now this only updates all the machinery to 1.9, I think. And yeah, that's pretty much it for this.
A: Cool, so I'll take a look at the MachineDeployment PR. Anything else on your two items?

B: No.
C: Yeah, Friday before Robert went out for a while, he posted the MachineClass prototype, and I just want to point it out. It's there for you; I've already provided some feedback on it, particularly with how we specify available capacity versus actual capacity, and how that might actually feed back into changes in the machine spec instead of the MachineClass. But I would like to get more eyes on it, and I will try to ping the autoscaler team to make sure that what we have satisfies what they need.
D: My question is: would it be better to put those parameters, the capacities and all those different things, into the provider config, the complete configuration or whatever it was called, in the parent object, and only reference the provider config raw extension as a reference? I mean, the point is, with the current APIs that Robert is proposing, you are only able to reference that cloud config. My question was about this.
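A minimal Go sketch of the two shapes being weighed in this question; the type and field names here are illustrative, not the prototype's actual API:

```go
// Illustrative only: two ways a Machine could pick up provider parameters.
package api

import "k8s.io/apimachinery/pkg/runtime"

// Option 1: parameters (capacity, instance type, etc.) embedded directly
// in the Machine as an opaque provider config raw extension.
type MachineSpecInline struct {
	ProviderConfig runtime.RawExtension `json:"providerConfig"`
}

// Option 2: parameters live in a shared parent object (a MachineClass),
// and the Machine only holds a reference to it.
type MachineSpecByRef struct {
	ClassRef MachineClassRef `json:"classRef"`
}

// MachineClassRef names the parent object carrying the parameters.
type MachineClassRef struct {
	Namespace string `json:"namespace"`
	Name      string `json:"name"`
}
```

The trade-off under discussion: option 2 lets many Machines share one set of parameters, but it also means a Machine alone can only change which class it references, not the individual values.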
C: I think that's good feedback. If you could put that on the PR itself, we could discuss it in long form there, yeah.
A: Okay, anything else on Robert's PR? The one thing that I would want to call out about it is, if we are trying to push to a stable release of the API, whatever that means, this would implicitly need to be solved before we get to that stable release. So I'm going to flag that as well, and we can talk more about what we're going to do with this as we talk about migrating to a new repository, which is conveniently coming up, I guess.
A: My next thing is, in the new repository I updated the OWNERS file and used my almighty Git privileges to merge that in, so hopefully other people can start to review PRs as well, and they can start to work. Has anybody had a chance to look at that, or to test whether they have a merge button and the bots are working as expected?
C: Some testing, for we need to check some things in to enable the bots for certain things in the test-infra repo, and I'm not sure the auto-merge pool has been set up yet. But before we even get to that, I'd want to get the unit tests over there and the CI testing, so that we have tests before merge.
A: Okay, cool. The one thing that I've been kind of thinking about... I watched the call from last week. This is relevant to the migration, the bigger migration effort in general. One of the issues I brought up, and again I'm sorry I missed the call, was: what do we want to migrate, and what do we not want to migrate? And it looks like just moving the API definition and the common code is going to be a little more tricky than we thought.
A: And I think the big difference between what I was just proposing and what's actually written here... in the proposal it's more of a "we should only migrate these subsets of features," and what I'm saying is it might be easier just to migrate everything. I don't really have strong opinions either way; it just seems easier.
D: Just one thing from my side: during the steering committee, Brian Grant, for those of you who were not present, explained the role of sub-projects in the new project structure and how SIGs own code. So what are we going to do about that? Right now the way it's proposed, this is a project under SIG Cluster Lifecycle, and are we going to get a sub-project of our own, or some ideas on the future work, I think.
A: There are certain expectations of us as far as, like, defining a charter and our scope of work and things like that. I think it's either we go out of our way and do the items needed ourselves as a sub-project, or we wait for that mandate to come down from SIG Cluster Lifecycle.
E: Just one comment on that: it seems like right now most of this working group's work is going into the kube-deploy repo, and it's harder for people to discover that repo. It's a little bit of an odd place, it seems. So I guess what I'm thinking is that a sub-project...
A: I think it would be cool to be at the top level, personally, because I really feel like the infrastructure layers are super important to Kubernetes and often get... what's the word I'm trying to say here... overlooked. I don't know. Should we open up a proposal for that, or what are our thoughts here? I don't know how hard of an uphill battle this would be, to potentially get, you know, kubernetes/cluster-api, or machines-api, whatever.
A: Okay, this is going to be the most meta thing I say all day: I would propose that we write a proposal... a proposal that would go into the new repository.
A: The one thing I wanted to bring up is, I volunteered to raise the working-group-versus-sub-project question at next Tuesday's SIG Cluster Lifecycle meeting, but I'm going to be out next week. I added it to the agenda; does anybody want to volunteer to bring that up, track that work, and plug it back into these calls the following week? If not, it'll just get pushed back, which would be fine too.
G: Orphan machines, cases like that. So, you know, you have types defined, but we expect users to write the controller, and we also write the controller, which might have bugs inside it, right? So there is still a possibility that a MachineSet, let's say, ends up creating more machines than it should. Or, I would say, from what we have seen of the way cloud providers behave, sometimes the cloud provider SDK gives a response that it has actually created the machine, but the machine is not created.
G: In that case, the Machine object becomes an orphan; or, the other way around, if it happens, the machine itself becomes an orphan. So, have we even thought about this, or do we have any strategy for how we'll deal with these orphan machines, just to prevent a MachineSet or MachineDeployment from exploding, in terms of the number of machines, I mean?
A: I don't know if this is necessarily a Cluster API mandated concern, but it's definitely something that happens a lot. I've seen it in almost every Kubernetes deployment tool I've worked with, where some infrastructure can get orphaned somehow; usually that happens through a failed deployment or a partially completed deployment.
A: I've seen a lot of different tips and tricks. I know the Go team has something like a scraper that will go in every night and just destroy all the infrastructure in a given account. So there's a lot of different avenues here, but I guess the higher-level question for me would be whether this is something that we want to prescribe.
G: We went through this in our machine controller manager recently, and we ended up writing a safety controller. What we try to do is, for every object that we create, we put a specific label which refers to the cluster name. All providers support something like that: AWS supports tags, and Google supports that with tags, and so on, right? So we put these kinds of tags, and then, separately in parallel...
G: The safety controller will basically keep an eye on the machines with the same tags; it has a map across the actual machines and the Machine objects which were created, and if it finds that the machines are orphaned, it can easily delete them. But the interesting use case is the second part, where at any point, because of a bug or some other reason, if the MachineSet tries to create more Machine objects or more machines, then what the safety controller does is basically freeze them.
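A rough Go sketch of the orphan-detection half of that safety controller: list cloud instances carrying the cluster tag, compare against the instances that Machine objects actually claim, and delete the rest. The CloudClient interface and tag key are hypothetical stand-ins for a provider SDK.

```go
package safety

import "context"

// CloudClient is a hypothetical abstraction over a cloud provider SDK.
type CloudClient interface {
	// ListInstancesByTag returns instance IDs carrying the given tag.
	ListInstancesByTag(ctx context.Context, key, value string) ([]string, error)
	DeleteInstance(ctx context.Context, id string) error
}

// DeleteOrphans deletes cloud instances tagged for this cluster that no
// Machine object references. machineInstanceIDs holds the instance IDs
// recorded on the known Machine objects.
func DeleteOrphans(ctx context.Context, cloud CloudClient, clusterName string, machineInstanceIDs map[string]bool) error {
	ids, err := cloud.ListInstancesByTag(ctx, "cluster-name", clusterName)
	if err != nil {
		return err
	}
	for _, id := range ids {
		if !machineInstanceIDs[id] {
			// No Machine claims this instance, so treat it as an orphan.
			if err := cloud.DeleteInstance(ctx, id); err != nil {
				return err
			}
		}
	}
	return nil
}
```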
G: If you look at each MachineDeployment or MachineSet, there is some kind of sync handler inside it, right? So we basically skip that loop for as long as the freeze label is there on the machine. And that logic is actually turning out to be really useful in our case, because OpenStack and some other providers do behave weirdly sometimes; you don't know, when they respond, what they're saying. We get a response that the machine is created, but it's not there, and sometimes errors occur, and so on.
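As a sketch, that freeze check could sit at the top of the MachineSet sync handler, so a frozen object is skipped on every loop until someone intervenes. The label key and types here are made up for illustration:

```go
package controller

// MachineSet is a minimal stand-in for the real object.
type MachineSet struct {
	Labels map[string]string
}

type MachineSetController struct{}

// freezeLabel is a hypothetical label a safety controller sets when it
// detects runaway machine creation.
const freezeLabel = "safety.example.com/frozen"

func (c *MachineSetController) syncMachineSet(ms *MachineSet) error {
	// Skip reconciliation entirely while the safety controller has the
	// object frozen; this stops a buggy loop from creating more machines.
	if ms.Labels[freezeLabel] == "true" {
		return nil
	}
	// ... normal scale-up/scale-down reconciliation would follow here ...
	return nil
}
```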
G: Because, in our case, if you look at the types, they are defined, but if a user writes the controller, then the controller is the only component which is talking to the cloud provider, and this can actually cause huge problems. This is something which can actually leak resources because of minor bugs leaving orphaned VMs.
A: Yeah, I knew it was just going to be a matter of time before we brought up tagging resources in a cloud provider account. Justin's giving us the thumbs up. It's a well-known problem, especially because a lot of the cloud providers kind of follow the "it's okay to be eventually consistent" mentality. So even with a tag, you can create a tag and then there's some delta in time before you might be able to read or render that tag. So it's a hard problem to solve.
A: I think probably the best pattern that I've seen, and this is my opinion, for what it's worth, is: every time you create infrastructure, it's sort of like an atomic creation, where you hang until you're also able to recognize the infrastructure from the cloud. That would be like a controller-level primitive that would say: if you are making a mutation, you don't want to consider that mutation valid until you're also able to read back from the cloud that your mutation has succeeded.
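A sketch of that controller-level primitive, assuming a hypothetical provider client: the create only counts as done once the instance can be read back, which papers over eventual consistency. The caller is expected to bound the wait with a context timeout.

```go
package provider

import (
	"context"
	"fmt"
	"time"
)

// Client is a hypothetical cloud provider client.
type Client interface {
	CreateInstance(ctx context.Context, name string) (id string, err error)
	// GetInstance returns found=false while the instance is not yet visible.
	GetInstance(ctx context.Context, id string) (found bool, err error)
}

// CreateAndConfirm treats creation as atomic: the mutation is only
// considered to have succeeded once it can also be read back.
func CreateAndConfirm(ctx context.Context, c Client, name string) (string, error) {
	id, err := c.CreateInstance(ctx, name)
	if err != nil {
		return "", err
	}
	// Poll until the eventually-consistent API reflects the mutation.
	for {
		found, err := c.GetInstance(ctx, id)
		if err == nil && found {
			return id, nil
		}
		select {
		case <-ctx.Done():
			return "", fmt.Errorf("instance %s not visible before timeout: %w", id, ctx.Err())
		case <-time.After(5 * time.Second):
		}
	}
}
```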
G: That's correct. Ideally the controllers themselves should be mature enough to take care of this part, but it does so happen that sometimes it becomes impossible for the controllers themselves to understand at the moment, because, for example, the provider is timing out or is not reachable at all. For some moment the creation call might internally have gone through and the VM is created, but then you are not able to reach the cloud, and the controller cannot tell. So maybe some other process running in parallel can later on garbage collect what was produced.
F: I think the machine controller should be responsible for the underlying instances, and for making sure that they're all bound to a Machine, as it were. If you have an extra instance that you know you're responsible for, that you created but that isn't backed by a Machine, I think you should delete it, and hopefully that will just be part of the sort of normal reconcile loop. How would we do that? So in AWS, for example, we would tag every machine that we create, so that we know it's not someone...
F: ...you know, some other machine that just happens to be running. And then, I personally haven't done this yet, but if I'm watching instances directly, I would probably also tag each instance with the UID of the Machine object, and then I think it would be safe to delete any instances that are completely orphaned.
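A small Go sketch of that UID binding, including the adopt-or-delete decision it enables after a controller restart. Everything here, the tag key, the Cloud interface, the Machine fields, is hypothetical:

```go
package reconcile

import "context"

// Instance is a hypothetical view of a cloud instance and its tags.
type Instance struct {
	ID   string
	Tags map[string]string
}

// Cloud is a hypothetical provider abstraction.
type Cloud interface {
	TerminateInstance(ctx context.Context, id string) error
}

// Machine is a minimal stand-in for the Machine object.
type Machine struct {
	Status MachineStatus
}

type MachineStatus struct {
	InstanceID string
}

// machineUIDTag is an illustrative tag key recording which Machine object
// an instance belongs to.
const machineUIDTag = "machine-uid"

// ReconcileInstance decides what to do with one tagged instance: re-bind
// it to its Machine if that Machine still exists, otherwise treat it as
// orphaned and terminate it.
func ReconcileInstance(ctx context.Context, cloud Cloud, inst Instance, machinesByUID map[string]*Machine) error {
	if m, ok := machinesByUID[inst.Tags[machineUIDTag]]; ok {
		// The Machine still exists: record the instance ID on it so the
		// binding survives controller restarts.
		m.Status.InstanceID = inst.ID
		return nil
	}
	// No live Machine carries this UID: the instance is orphaned.
	return cloud.TerminateInstance(ctx, inst.ID)
}
```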
F: Then, even if you restart at just the wrong time, you can still reconcile it to a Machine if that Machine still exists, so you can sort of recover from most scenarios. The other case, I think, is you have a machine that's running in the infrastructure, but it doesn't join as a node. That one, I think, should be generic.
F: The logic is basically the same, where after 10 minutes, or whatever you configure, you decide that the machine is not joining and you delete the Machine, I guess, and expect that the cloud provider, or the machine provider, the machine controller, would then delete the actual infrastructure. And I saw the other day in SIG AWS, where we were talking about nodes that go NotReady: after about 13 minutes people want to delete them and have them terminated, and we don't have that. Yes, it's a similar thing where we see infrastructure.
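The generic "never joined as a node" check could be as small as the sketch below; the type, the flag, and the 10-minute default are illustrative, echoing the "or whatever you configure" caveat above.

```go
package health

import "time"

// Machine is a minimal stand-in for the cluster-api Machine object.
type Machine struct {
	Name       string
	CreatedAt  time.Time
	NodeJoined bool // whether a matching Node has registered
}

// joinTimeout mirrors the configurable join window from the discussion;
// the value here is only an example default.
const joinTimeout = 10 * time.Minute

// ShouldDelete reports whether a Machine has failed to join as a node
// within the configured window. Deleting the Machine is then expected to
// make the machine controller tear down the underlying infrastructure.
func ShouldDelete(m Machine, now time.Time) bool {
	return !m.NodeJoined && now.Sub(m.CreatedAt) > joinTimeout
}
```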
H: For example, if you want to hot-swap a VM and have it represent the same instance... separating those two concepts might help: Machine right now is playing both of those roles, the logical instance and the physical machine. Having two separate concepts may be helpful, I agree.
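One way to read that suggestion, sketched as two hypothetical Go types: the logical Machine keeps a stable identity while the physical instance behind it can be swapped out.

```go
package types

// Machine is the logical member of the cluster: its identity is stable
// even if the VM backing it is replaced.
type Machine struct {
	Name string
	// InstanceRef points at whichever physical instance currently backs
	// this Machine; hot-swapping a VM only updates this reference.
	InstanceRef string
}

// Instance is the physical VM as the cloud provider sees it.
type Instance struct {
	ID   string
	Zone string
}
```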
A: Another pattern that I've seen be pretty successful is: every time your controller goes through an iteration, for all of the infrastructure it does know about, it updates a timestamp, and that way you can define TTLs. Where you store that timestamp doesn't matter; it could be in a database or a convenient tag, and there are pros and cons to each of those. But when you get to a piece of infrastructure that hasn't had an updated, like, timestamp, you can probably say, according to some policy, that it's safe to delete as well.
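A minimal sketch of that last-seen/TTL pattern in Go, with an in-memory map standing in for whichever store (database, tag) a real controller would use:

```go
package gc

import "time"

// lastSeen maps infrastructure IDs to the last time a controller
// iteration observed them; where this actually lives (database, cloud
// tag) is a separate trade-off, as noted above.
type lastSeen map[string]time.Time

// Touch records that an iteration saw this piece of infrastructure.
func (l lastSeen) Touch(id string, now time.Time) { l[id] = now }

// Expired returns the IDs whose timestamp is older than the TTL; by
// policy these are considered safe to delete.
func (l lastSeen) Expired(ttl time.Duration, now time.Time) []string {
	var ids []string
	for id, t := range l {
		if now.Sub(t) > ttl {
			ids = append(ids, id)
		}
	}
	return ids
}
```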
G: That's correct. So I think the machine controller can take care of this part, except in the cases where, say, for example, the deletion call for some moment cannot go through, but we still want to maintain a number of machines, right? At that point we might want to create one more new machine, add it to the existing cluster, and let some other collector get rid of the older machine which we were not able to delete previously, yeah. Of course, this logic could be baked in inside the controller itself.
D: One more thing: the kubelet right now actually does garbage collection of orphaned pods. In some cases, for example, the Machine might have been forcefully deleted while the controller is down, and we need to garbage collect in that case as well: the Machine resource is deleted but the node is up, and then you need to do some magic in that case, either delete the machine if it is not, for example, bound, or try to create a new Machine object and then put it back together.