From YouTube: Kubernetes SIG Cluster Lifecycle 20180214 - Cluster API
Description
Meeting Notes: https://docs.google.com/document/d/16ils69KImmE94RlmzjWDrkmFZysgB2J4lGnYMRN89WM/edit#heading=h.xvr33m5suu00
Highlights:
- CRDs vs. aggregated APIs with @erictune
- Splitting API groups
- Type for provider config
- Terminal vs. transient errors
- Container runtime in machine spec
A: Hello, and welcome to this Valentine's Day edition, February 14th, 2018, of the SIG Cluster Lifecycle Cluster API breakout working group. Today we are lucky enough to have Eric Tune joining us. After our conversation last week about CRDs versus API aggregation, we looped Eric in to come and continue that conversation and hopefully get to a good conclusion, with both short and long term answers for the future of the project. So, I don't know, Eric...
B: Okay, so my pitch is: use CRDs whenever possible. CRDs have way more users than aggregated APIs. Here's a bunch of people that are using them and seem to like them. I've talked to most of these people, I work with them very closely and talk to them at least weekly about their needs around CRDs, and I am highly motivated to keep them happy, as they are close peers. I think other people in the Kubernetes space outside of Google are also excited about CRDs and share a motivation to support that use case, things that easily install on your Kubernetes cluster. Also, a lot of people are building operators, and there's much other peripheral tooling. So a lot of people are succeeding with CRDs; that's the momentum I just talked about. The next point is the amount of code that you need to write.
B: If you just want to do basic validation and define a schema, then you write no code to use CRDs, except for your controller, which you're going to write either way. If you want to do fancy things with validation, then you end up writing some code for that, and it's similar to an aggregated API server.
B: Then you're going to be in the noise in terms of the API server and etcd load from your resource, so you shouldn't worry about that, and there's one less moving part in your cluster in that case. So they can do a lot; currently they can do quite a bit of validation. Some examples are min and max values for fields, pattern matches for strings, uniqueness requirements, oneOf. If you follow that link to OpenAPI v3, you'll see there's a ton of things that you can do without writing any code, just a schema.
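For reference, the schema-only validation Eric describes can be expressed entirely declaratively. Below is a minimal sketch using the apiextensions v1beta1 Go types of that era; the machines.cluster.example.com group and the replicas/version fields are illustrative, not the project's actual definitions.

```go
package example

import (
	apiextensionsv1beta1 "k8s.io/apiextensions-apiserver/pkg/apis/apiextensions/v1beta1"
	metav1 "k8s.io/apimachinery/pkg/apis/meta/v1"
)

// float64Ptr is a small helper for the *float64 fields in JSONSchemaProps.
func float64Ptr(f float64) *float64 { return &f }

// machineCRD declares a CRD whose validation is pure OpenAPI v3 schema:
// min/max on a numeric field and a pattern match on a string field,
// with no admission code written at all.
var machineCRD = apiextensionsv1beta1.CustomResourceDefinition{
	ObjectMeta: metav1.ObjectMeta{Name: "machines.cluster.example.com"},
	Spec: apiextensionsv1beta1.CustomResourceDefinitionSpec{
		Group:   "cluster.example.com", // illustrative group
		Version: "v1alpha1",
		Scope:   apiextensionsv1beta1.NamespaceScoped,
		Names: apiextensionsv1beta1.CustomResourceDefinitionNames{
			Plural: "machines", Singular: "machine", Kind: "Machine",
		},
		Validation: &apiextensionsv1beta1.CustomResourceValidation{
			OpenAPIV3Schema: &apiextensionsv1beta1.JSONSchemaProps{
				Properties: map[string]apiextensionsv1beta1.JSONSchemaProps{
					"spec": {
						Properties: map[string]apiextensionsv1beta1.JSONSchemaProps{
							"replicas": {Type: "integer", Minimum: float64Ptr(0), Maximum: float64Ptr(100)},
							"version":  {Type: "string", Pattern: `^v\d+\.\d+\.\d+$`},
						},
					},
				},
			},
		},
	},
}
```

Everything above is data rather than code: the API server enforces the min/max and pattern constraints itself, which is the "write no code" point being made.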
B: So the biggest gap we know people have asked for is multi-version support. We're trying to squeak something out for 1.10 that will at least allow you to promote from, say, v1alpha1 to v1beta1, or to v1, whatever, when there are no changes in your schema, just communicating "this is more production ready." And then, probably not in 1.10 but by 1.11, we're going to have at least the ability for you to rename fields as you go from version to version, and then we definitely will eventually have an escape hatch where you can do arbitrary transformations. It's probably worth having a conversation, if you guys want, about how versioning and storage sharing works in Kubernetes, if that's not clear to everyone.
A: No, why don't you finish the presentation? Then you can go back and I'll start questions.
A: Right. So if we go back to slide 3, we were talking about benefits and trying to figure out how those apply to us in particular. On the forking and rebasing code point: we found that there's an apiserver-builder, which allows you to auto-generate most of those, you know, six thousand lines of code, which I think significantly decreases the pain of forking and maintaining your own code base and gets rid of the rebasing pain entirely. And some people last week mentioned that rebasing can be rather difficult with the API server.
A: Yeah, that's good to know. So, for running a separate etcd: you talked about how, if the number of clusters or machines is small compared to pods, that's in the noise in terms of storage. But one thing we've talked about is that, for reliability and availability, we actually would prefer to have it on a separate storage environment, so that being able to figure out the desired state of your cluster is always possible.
A: And I think that the main Kubernetes API server is more likely to break, especially as everybody else starts using CRDs, which increases the chances that it becomes fragile in unpredictable ways. Having the machines and cluster API in a separate etcd, one that's less likely to fall over if we're not doing proper garbage collection of leaking Job objects or something, seems like a good reliability story for us.
B: That does make sense; I think that is a strong, good reason to want to have separate storage. I'll point out another thing you can do, which I maybe should have talked about in these slides: you can run the latest release of the Kubernetes API server, turn off the other APIs you don't need, basically have no nodes, and install your resources as CRDs inside that API server. So again, you're actually not owning any code.
A: OK. And the last one, I think, lower net memory requirements, is tightly coupled to that same deployment scenario, where that's not necessarily a benefit to us: regardless of whether it's a forked API server or our own with all of the default APIs turned off, it's roughly the same footprint in terms of resource usage. Yeah.
A: And I think that's the conclusion we came to last week: that short-term, tactically, for a number of reasons, including the fact that a lot of stuff that's on the CRD roadmap isn't there today, and we want to have stuff working today, it makes sense to continue with an aggregated API server in the short term.
A: I think what we're really looking at is: where is the long-term convergence? Do we want to always have a separate aggregated API server, or do we want to converge back with the desire to have people using CRDs? Even if we have a deployment model where we run our own API server binary, if it's not a forked API server, if it's just an API server with CRDs installed...
A: It's sort of both, right? So there's a discussion on a GitHub issue, and I think Chris was driving this, about whether machines and clusters should be namespaced. What we decided on was that we would allow them to be namespaced, and that, by convention although not enforced, if they're in the default namespace, that means they're the local cluster's resources, and if they're in other namespaces, they could represent remote clusters.
A: So "kubectl get machines" hits the same aggregated API endpoint; it'll go to two different API servers and give you back the right resources. That's the most common case, but we wanted to leave the door open for the case where you'd be running controllers to manage things remotely, because it sounded like there are a number of people in the working group for whom that would be very useful down the road. Okay.
A: The other thing I wanted to follow up on: it sounds like, since we're pushing most people to use CRDs, that's where a lot of the community momentum is going to be. Since, at least for the short term, we're going to be using API aggregation, I want to make sure that that is a well supported path. If everybody else jumps off of it and we are the only people still using it, I worry that we're going to end up becoming the de facto owners, and I really don't want to have to own, you know, the apiserver-builder, or the whole use case of running your own aggregated API server. So I want to make sure that remains a first-class, supported deployment model, at least until CRDs have feature parity with aggregated APIs, I think.
B: It will be. I think your larger risk, from my standpoint, is that, given the additional freedom that owning your own code gives you, you decide to set off in your own direction, and your API deviates from the conventions that are possible in CRDs; and then we layer on additional structure in the project, and then you're left figuring out how to rebase onto those new conventions, and the newer clients don't like to talk to you, and you can't take advantage of, you know, a lot of stuff in the future.
B: I mean, when I was talking about bringing the apiextensions API to GA this year, that would be an even stronger commitment to it being possible to use the extension API server mechanism. But that's not quite what you're saying, no: you really want the API server libraries to continue to be usable to build your own binary.
E: First of all, I very much appreciate the work you are doing to make adding API objects easier. But why are we doing CRDs, and not making the API machinery make API extension servers easier, right? I mean, could we not get most of those benefits by having, you know, a super smooth build, or by having, as we talked about, that etcd being able to inject into an existing etcd for the GKE scenario, that sort of thing?
B: So, two reasons, I guess three. One is: you've been in the community for a long time, Justin, so you understand a lot of the nuances of Kubernetes APIs. Versioning, style, apply, all that stuff takes a long time to build an understanding of, and the way that we keep people on that golden path is by giving them fewer choices, and CRDs give them significantly fewer choices. We want the Kubernetes API platform to be cohesive, and we want to tell people to start on the easiest path. You guys are pros, so I'm comfortable that you're going to wander off and use the hard thing and not screw it up, but with most people I would not be comfortable. And then the third reason is that we want to be able to bring new features without you having to rebase. So, for example, we're going to move apply to the server side; we can't do that if, you know, you guys don't rebase to pick up those changes. Or, like, the chunked APIs: you'd have to rebase to pick that up, while with CRDs you'll just get chunking for free, because you're just installing a declarative definition. So those are the three reasons why we like that path. But we realize you need an escape hatch, and I'm committed to there being some kind of escape hatch. There may be some rebase pain on that escape hatch, but it'll always be possible to do that.
B: A trick you can do: if you start on a CRD and you want to move to aggregation, you can install a facade that then goes back and, you know, talks both to its own storage and to the existing CRDs, and then manages the migration itself. I haven't figured out all the details, but there's no reason your aggregated API server can't go back and look at the old resources; you might have to bump the version or something like that.
C: That kind of approaches a topic I was curious about: when and if CRDs become feature-sufficient for what we need, and we just want to run the standard API server with all the other API groups turned off, is there any sort of defined transition plan from aggregated APIs, as long as they don't violate certain things, to CRDs?
B: We don't have a way... like, we're not sure we'll ever have arbitrary subresources for CRDs, or do weird things in, you know, the storage layer, or try to guarantee atomicity across, like, adding multi-object transactions, which you can do because you've modified the code. Those are things I can't help you with if you do that. I'm not sure that's answering your question.
B: I'm trying to keep people on CRDs. I can't anticipate all the things that could go wrong that would prevent you from moving, so I can't write that doc, which is why I'm trying to discourage people unless they have a strong reason. Most people are using CRDs, so mostly my effort goes into keeping them there and understanding why you aren't there. Okay.
B: No... Phil Wittrock, who's one of the people on that, has a thing called Kubebuilder that he's working on. He hasn't announced it yet, so I shouldn't have stolen his fire, but he's actively working on the next generation of it; I don't know what his plan is for it.
A: OK, it's reassuring to know that the guy working on it is still working on it. That was my point before: I don't want us to become the owners of it just because it's open source, right? That's not the primary mission of our SIG, and it's not really our value-add for the community. And so we'd really like, maybe not just Phil but the API machinery SIG, to sort of commit to owning that going forward.
A: What we're hoping is that they own it long enough, and give us a transition path off, so that we can be on the happy path once CRDs are feature complete, like Chris was saying. I don't think we're planning on doing anything that will prevent us from doing that; we just need new features that don't exist there today.
A: Thanks, Eric, I think this is really useful, and it helps bridge our connection, or at least starts to bridge our connection, with API machinery. So, like you said, we're going to have to become friends with them going forward, to make sure that this stuff continues working as we expect. Yeah.
C: That's a great idea, because currently the machines depend on input from the cluster type, and splitting it out may turn into a versioning nightmare where, say, machines v1beta1 needs at least cluster v1alpha2 or something like that. I was just wanting to understand that; happy to take the discussion to the issue if it was discussed ad nauseam last week.
C: It is a different type, yes, but it is closely related to the cluster, and I think you need... there are basically some cluster-level configurations, ones that need to be known cluster-wide, that the machine has to pull from. Like, the pod CIDR has to be known cluster-wide; so, you know, if you're using kubeadm, it's the tenth offset of that pod CIDR for DNS, and every node needs to know that. Yeah.
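As an aside, the "tenth offset" convention is simple arithmetic over the CIDR base address. Below is a rough sketch; note that kubeadm actually derives the cluster DNS address from the service CIDR, and nthIPInCIDR is a hypothetical helper, not kubeadm's code.

```go
package example

import (
	"fmt"
	"net"
)

// nthIPInCIDR returns the nth address inside the given CIDR block,
// e.g. nthIPInCIDR("10.96.0.0/12", 10) -> 10.96.0.10, the usual
// cluster-DNS address kubeadm derives from the service CIDR.
func nthIPInCIDR(cidr string, n int) (net.IP, error) {
	_, ipnet, err := net.ParseCIDR(cidr)
	if err != nil {
		return nil, err
	}
	ip := ipnet.IP.To4()
	if ip == nil {
		return nil, fmt.Errorf("only IPv4 is handled in this sketch")
	}
	// Convert the base address to an integer, add the offset, convert back.
	base := uint32(ip[0])<<24 | uint32(ip[1])<<16 | uint32(ip[2])<<8 | uint32(ip[3])
	base += uint32(n)
	out := net.IPv4(byte(base>>24), byte(base>>16), byte(base>>8), byte(base))
	if !ipnet.Contains(out) {
		return nil, fmt.Errorf("offset %d falls outside %s", n, cidr)
	}
	return out, nil
}
```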
G: In our project we currently have, more or less, the notion of a project, which is nothing less than a namespace, and inside of this project, inside of this namespace, you could have multiple clusters. So, at least from the way we actually do our deployments of Kubernetes clusters, we really want to keep this notion.
A: I think that makes sense. I think we were trying to keep them very loosely coupled, but it sounds like they're not semantically loosely coupled anyway, so making the connection explicit makes a lot of sense, because then you can easily tie them together. And it gives you the flexibility to have more than one, which right now you could do but it just doesn't make sense; if you tie them together explicitly, then you can have more than one and it actually does make sense.
A: So I think in that case we should probably close the issue that's linked here, that was added to the alpha milestone, and replace it with an issue to explicitly tie the resources together, and mark that for the alpha milestone. I think, Devon, you volunteered to create the new issue, yes? Once you do, if you could ping either Rodrigo or myself; if you don't have permission to add milestones, we will do that. Okay, sure.
A: Right. So next, there was an issue that was extracted from the initial machines PR, about changing the type of providerConfig. I just wanted to mention this briefly and ask if there are any objections. I think so far I've only seen people saying yes, both on the initial PR and on this issue, but I wanted to put it in front of folks to make sure nobody was saying no before we just went ahead and did it.
G: And I'm not sure that the client automatically handles runtime.RawExtension. It's like a struct with the raw data inside of it, and there's also an optional Object field, and I'm not sure that the client, even the automatically generated ones, populates this Object. So we have to do some magic with the callers.
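For readers following along: runtime.RawExtension carries raw JSON bytes plus an optional typed Object that generated clients generally leave nil, which is the concern here. A minimal sketch of the pattern under discussion follows; MachineSpec and GCEProviderConfig are illustrative shapes, not the project's final types.

```go
package example

import (
	"encoding/json"

	"k8s.io/apimachinery/pkg/runtime"
)

// MachineSpec embeds opaque, provider-specific configuration.
// Only RawExtension.Raw is reliably populated by generated clients;
// the typed Object field is generally left nil, which is the concern
// raised in the discussion above.
type MachineSpec struct {
	ProviderConfig runtime.RawExtension `json:"providerConfig"`
}

// GCEProviderConfig is a hypothetical provider-specific payload.
type GCEProviderConfig struct {
	Zone        string `json:"zone"`
	MachineType string `json:"machineType"`
}

// decodeGCEConfig shows the "magic with the callers": each provider's
// controller unmarshals the raw bytes into its own typed struct, so
// the generic API server never needs the provider types registered.
func decodeGCEConfig(spec MachineSpec) (*GCEProviderConfig, error) {
	var cfg GCEProviderConfig
	if err := json.Unmarshal(spec.ProviderConfig.Raw, &cfg); err != nil {
		return nil, err
	}
	return &cfg, nil
}
```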
G: Yeah, I mean, internal versus external versions only matter for the API server, more or less, because otherwise there's nothing to do the conversion. And I don't think the external version... I mean, by default the converter, the decoder, actually complained, I think, that it cannot be coerced to a runtime.Object interface, more or less.
C: Yeah, I'm a little concerned with having to register all the provider config types with a generic API server. I'm not against it; I just need to be convinced that that's not going to be an issue. Like, if I'm just deploying, say, to an AWS cluster, I don't want to have to have the Azure types, the GCE types, and all the other types registered with that API server.
A: If we're just registering objects, it's probably okay. If we're passing extra flags, that seems a little bit less flexible, because I shouldn't have to restart the API server if I want to manage something different: you basically install a different machine controller into the cluster, and that shouldn't require restarting the API server.
D: Right, yeah. I have limited experience on this, but I feel like the aggregation is done by a plain resource path registered with the main API server, and it doesn't necessarily know all the details about the object; it just forwards all the requests for anything in front of it, from the aggregator API server to the extension.
A: Excellent. All right, if there's nothing else there, we'll move on. I was going through the API definitions last night and came across a number of things that I wanted to bring up during the call today; some of these, hopefully, will be somewhat quick, hopefully most of them if not all of them. So the first one is terminal versus transient errors. Looking at the machines API, the documentation basically reflects that we decided not to use conditions, because Eric Tune and Brian Grant suggested that conditions were on their way out and we shouldn't be using them for anything new, and the new way you're supposed to do things is rolling errors up into the top level of your status objects. And so we have two error fields.
A: We have an error reason and an error message, and the documentation says they should not be set for transient errors that you're expected to recover from; they should only be set for terminal errors, which is a case that you're expected to never, ever get out of. And then, if you look at the documentation for the actual reasons there could be errors, many of them say "this is a transient error," which seems contradictory with the documentation of what they're supposed to be set for, which is a terminal error that is not actually transient.
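The two fields in question look roughly like this. The sketch below approximates the early machines API as described on the call; names and doc comments are paraphrased, not copied verbatim.

```go
package example

// MachineStatusError is the machine-readable "reason" enumerating why a
// machine entered an error state.
type MachineStatusError string

const (
	// Documented with examples like "timeout connecting to GCE", which the
	// discussion points out reads as transient rather than terminal.
	CreateMachineError MachineStatusError = "CreateError"
	UpdateMachineError MachineStatusError = "UpdateError"
	DeleteMachineError MachineStatusError = "DeleteError"
)

// MachineStatus rolls errors up to the top level instead of using
// conditions. Both fields are pointers: nil means "no terminal error".
type MachineStatus struct {
	// ErrorReason is meant to be set only for terminal problems the
	// controller does not expect to recover from.
	ErrorReason *MachineStatusError `json:"errorReason,omitempty"`
	// ErrorMessage is the human-readable counterpart of ErrorReason.
	ErrorMessage *string `json:"errorMessage,omitempty"`
}
```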
A: In addition, in the proposed machine state diagram, which we've talked about a couple of times on this call, we purposely did not put in any sort of terminal error state. That's something we learned from GKE, where we do have a terminal error state in our internal state diagram for clusters, and we've found that it can be kind of a painful situation to be in: because there is a terminal error state, you can go there and there's really no way out, and in a lot of the cases, if you look at the different types of errors, they are actually potentially recoverable, right? So, for a create machine error, the example given is "timeout connecting to GCE," which to me sounds like: if you try it again, maybe the service was just down, GCE would come back up, and you actually would be able to create a machine at some point in the future.
A: So it's not clear to me, if we don't have conditions, which seem to represent states in a state machine, how we want to use these error fields: what information do we want the error fields to convey back to clients of the machines API? I'm going to stop talking and let somebody else jump in here.
A: Oh, another thing that I had on my list of things to talk about, maybe I'll bring it up right now: should we expose explicit state machines? Daniel Smith's feedback, in the document that I've linked to, is that they decided not to do that with some of the other Kubernetes types, because they thought that once those states were exposed, and the transitions between those states were exposed, trying to change the flow between states would constitute a breaking API change.
A: I know that if you look at some of the other machine management tools, like the ones that you folks have, they've got two fields that represent state: one field that represents the steady states, and another field that represents the most recent transition. I don't know if you have tried to change what those states are, and whether you found that it caused any sort of compatibility problems with clients or not.
A: Oh, I mean, we have a sort of implicit notion of a node being ready, right? You have a machine whose create causes the node to be created, and the node goes into a ready state. And if you look at the MachineSet API that went in, kind of like ReplicaSet, it's got fields for a number of ready pods and a number of available pods, which is slightly different, right?
A: So we can sort of take that implicit state, from the kubelet reporting status and saying that the node is ready, as a state machine. But in the machine state diagram, that is one of the potential states, and it actually represents more than one potential state, right? Because if a node is saying it's ready, it could be in the serving state or it could be in the drained state, and the kubelet will still report ready either way.
A: If it's in the drained state, you sort of infer that, because it'll tell you that it's been cordoned, but it doesn't actually tell you whether it's been drained or not; it just tells you that, having been cordoned, it's not schedulable. And so I think we have the potential to have a more refined state machine that we could expose, and we'd have to make sure we reconciled that with the actual state on the node. Oh yeah, and right now it's not explicitly exposed, it's sort of implicitly exposed.
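Concretely, cordoning is the only one of these states the core API records directly: it sets the node's unschedulable bit, and draining leaves no additional marker. A minimal sketch:

```go
package example

import corev1 "k8s.io/api/core/v1"

// isCordoned reports whether the node has been cordoned. Cordoning
// (kubectl cordon) sets Spec.Unschedulable; draining additionally
// evicts pods but leaves no dedicated marker on the Node object, so
// "drained" cannot be distinguished from "merely cordoned" here.
func isCordoned(node *corev1.Node) bool {
	return node.Spec.Unschedulable
}
```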
E: One way to solve this would be to drive it by the use case of the MachineSet: say what exactly does a MachineSet need, and that is what we should make sure we put on the Machine. And then maybe we find that the error text, or whatever, is not even used, so it's purely informational. Yeah.
A: I mean, it was put there because we needed a place to surface errors initially, right? So take create machine: if you try to create a machine that is just completely invalid, some of those cases you can catch during input validation stages and some you can't, right? You go out to the cloud provider and it says "sorry, this doesn't work," and that was sort of a way to surface it back to the user.
A: So you'd say, like, we aren't going to bother trying again. But I think some of the error types that crept in there are things where we should be trying again, right? If you're out of resources, that could be a transient state; if there's a stockout, that sounds like a transient state to me. Those things do get resolved, or could get resolved, and you need to bubble up the fact that it's not working right now somewhere, right? I think that's what conditions were used for. Events?
A: Yeah, the documentation says we'll produce events, but events are not really reliable for higher-level controllers to be built on top of, right? Events are more transient; they don't have the same guaranteed storage as your status field does, so it's easy to miss events. Right, yeah.
E: I guess the point is that the ReplicaSet doesn't do anything with the knowledge that a pod's image can't be pulled, right? It won't behave differently, and so the information is currently only of value to humans who are looking at the events, and maybe to our system in the future. But I guess the question is: could a ReplicaSet do something smarter with that information, and, more pertinently, could our MachineSets do something smarter with it?
J: OK. So, on the discussion about conditions: I added a link to the meeting notes as well, with some history on conditions and why or why not to use them. The thought process, after reading quite a bit yesterday, is that you would use events for anything that is transient to the controller, where the controller is retrying and we just want to surface something about what it's actually doing when failures are happening.
J: That actually is right, although, yeah, if you try again later you might be able to fix it. The same goes for any internal errors, you know, internal cloud errors or stockouts, things that might not be expected. If they keep happening, eventually the controller gives up, and that's kind of the thought process that was followed with the MachineSet, and we can revisit it and make it consistent across everything as needed.
A: Should it give up, though, or should it just back off and retry at low frequency? I think, in both those cases you mentioned, we shouldn't give up and say we're never going to try to create this machine as part of a machine set again. Maybe the difference is that on machines we say "we tried to create the machine and we gave up," and that allows the machine set to try again later, and it can do the backoff. And I think that was Justin's point: what fields do we need for our automation to work?
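If the machine controller backs off rather than parking in a terminal state, the stock apimachinery helper is one way to express that. A sketch of the behavior being proposed; createMachine is a hypothetical stand-in for the actuator call:

```go
package example

import (
	"time"

	"k8s.io/apimachinery/pkg/util/wait"
)

// reconcileWithBackoff retries a hypothetical createMachine call with
// exponential backoff instead of marking the machine terminally failed.
// If the budget is exhausted it returns wait.ErrWaitTimeout, and a
// higher-level controller (e.g. the machine set) can try again later.
func reconcileWithBackoff(createMachine func() error) error {
	backoff := wait.Backoff{
		Duration: 5 * time.Second, // first retry delay
		Factor:   2.0,             // double the delay each attempt
		Steps:    5,               // give up (for now) after 5 tries
	}
	return wait.ExponentialBackoff(backoff, func() (bool, error) {
		if err := createMachine(); err != nil {
			// Treat the failure as transient: swallow it and retry.
			return false, nil
		}
		return true, nil
	})
}
```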
A: There are two things. One is: how do we get error messages out to users so that they can manually fix things if necessary, like if there's a quota problem? In the same way that, if you try to pull an image that doesn't exist on a pod, your replica set won't scale up, and you can look at it and figure out why and fix it. So we need a place to bubble up the things that people might need to go fix out-of-band, and then we need to have enough fields there to build the higher-level tools. Maybe the fields there are sufficient for both of those, and we just need to update the documentation to make it less confusing; but what I found reading through it was that it sounded sort of self-contradictory. So maybe the answer is: we keep building the machine sets, start looking at machine deployments, and figure out if the fields are correct, and if so, let's just make sure the documentation is clean. Yeah.
A: Cool. So, next thing: we have the container runtime in the machine spec. It's used in a field that's also in the machine status, and we probably don't want to change that part, because I think it is useful to have in the machine status. But in some recent discussions I've had with folks from SIG Node, it sounds like the future direction for container runtimes is that they are going to be tightly coupled to, and bundled with, the underlying operating system.
A: So, for example, if you're running RHEL, you might get CRI-O as your container runtime; if you're running Ubuntu, you might get Docker; and if you're running CoreOS, maybe you get rkt. But the intention wouldn't be to go and install Docker on CoreOS; it would be: we get CoreOS, and it has rkt.
A: That's been validated as a functional pair of OS plus container runtime. All the container runtimes implement the CRI interface, so the kubelet doesn't really care too much what's running underneath it, as long as it passes the node validation. Right now we have, in our declarative API, a way to specify "I want this container runtime with this kubelet," and it's starting to sound...
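The declarative knob being referred to looks roughly like this; a sketch approximating the shape of the early cluster-api version fields (names illustrative, not a verbatim copy of the project's types):

```go
package example

// ContainerRuntimeInfo pins a specific runtime; the discussion above
// suggests most users would leave this empty and accept whatever the
// OS image bundles (CRI-O on RHEL, Docker on Ubuntu, rkt on CoreOS).
type ContainerRuntimeInfo struct {
	Name    string `json:"name"`    // e.g. "docker", "crio", "rkt"
	Version string `json:"version"` // e.g. "17.03"
}

// MachineVersionInfo is the per-machine version request in the spec;
// the kubelet version is what users mostly care about, while the
// runtime is increasingly an OS-level detail behind the CRI.
type MachineVersionInfo struct {
	Kubelet          string                `json:"kubelet"`
	ContainerRuntime *ContainerRuntimeInfo `json:"containerRuntime,omitempty"`
}
```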
J: How pervasive, you know, are images with the runtime baked into them? If you go to any cloud provider, or if I have, like, an on-premises setup, and anyone started using our image... do we have a majority of images with the container runtime today, such that if I were to pick a random one, I would get one? That's kind of where I'm going with this question.
E: I can answer from my experience on AWS, which has a pretty broad range: most of the stock images, Debian, Ubuntu, RHEL, CentOS, do not include a runtime at all. Some of the ones which are, like... kops, I mean, builds one specifically, which bakes in the relevant version of Docker, but that is something we do ourselves, and it's purely an optimization to speed up boot.
E: Most of them do not have Docker built in. The complexity is that, I think, Docker's speed of versioning doesn't match up with the lifespan of OS releases very well, so it is a pain for people to bundle it. I think there are exceptions: I think Red Hat has Atomic, right? I'm sure their image does. CoreOS includes Docker, because you basically can't install software on CoreOS, or whatever they're shipping these days, I don't know. But in general, yeah.
A: I think that's true: the state of the world today is that, if you ask for just an OS from Amazon, you're not going to get one that has a container runtime, so you have to install one; and maybe for something like CI it's important to be able to specify a version. I guess what I'm saying is, looking forward, I think what we're going to see, as projects like wardroom from Heptio get spun up, and as people build more things like Atomic or CoreOS that are container-optimized operating systems, is that we'll start to see a tight coupling. And if you say "give me an Atomic image and install rkt on it," it's just not going to make any sense, because Atomic is going to come bundled with a version of Docker, and that's the version of Docker that works.
A: That's interesting. So I think there is a provision in at least one of those fields now; I mean, you can leave them all blank and you basically just get the defaults, right? So maybe we say, by convention, we expect most people to leave these things blank and just take the defaults, but there are some escape hatches where you might want to specify a specific version, and some machine controllers will just reject some or all of those requests if it's not supported. Yeah.
A: So the comment for that field basically says we are copying it over from the node, because that puts it in the same structure and format in the spec and status, which makes it really easy for controllers; and the field names are all different in the node object, so if you follow the reference, from an end user's point of view, someone trying to introspect machines and nodes, it makes things difficult, because the fields are inconsistently named. That basically puts the burden on the controller to do the mapping: "I understand what's reported in the node object, and I'll map that over to the machine object," to make it easier for consumers of the machine API. And that's another decision we could make: let's change the burden and put it on the user, because the information is already surfaced in the API elsewhere and people should just follow the reference.
A: So, like Justin was saying, if you're on Amazon and you say "give me a stock Ubuntu and I want this version of Docker installed," you can put that in the provider spec, or, you know, the machine controller could just pick the right version based on that OS. You can encode that logic of how to choose it at runtime in a couple of different places, and the provider spec would be one place you can put it.
J: Yeah, I think, just from the user's point of view of the API: if this is something where, like, I'm doing my own image, or I have this image I've been using, and I need to create a new image and bake in the Docker version or any other software just to satisfy the API, that's one extra step and more work for them, and that's just another consideration, right? If the majority of users, you know, could do that, or wouldn't need to do that, then that's something to consider.
J: I think we're converging, then: at least supporting an image that comes with the runtime is something we certainly want to do, and probably have some sort of node spec that we're going to build against, to make sure the runtime is in place, and so on. I think we're not sure if we're going to keep the installation code that installs onto the images without all that, right?
A: Yeah, I think that's the other thing I was kind of getting at here: most people just say "give me a machine with this version of the kubelet," and because we are abstracted from the container runtime through the CRI interface, and because it looks like we're moving towards having it tightly coupled with the OS, that's something that users probably shouldn't care about.
A: Chris also points out in chat that the users probably don't care, but ops folks probably do care about this spec. And to be clear, the users I'm talking about are the ops folks, right? The primary users of the cluster API are the ops folks, not the application developers. So if we think that the ops folks do care, then maybe it is worth leaving in. Yeah.
A: We're trying to leave the API flexible enough to do both, and let machine controllers decide if they want to try to implement both. Certainly blue/green is easy, because you can always create new machines and delete old machines; you can orchestrate that at the higher level. In-place is more difficult, and I think most people have not implemented in-place upgrades for Kubernetes today, but I don't want to close the door to doing that later; I think there are some valid use cases where we want to do that. Okay, thanks.
A: All right, I had a couple of other things to discuss, but it looks like we're just about out of time, so I'm going to punt those to next week, and thank everyone for coming; we will see you all again soon. A few people had action items to go and open issues: please make sure you follow up on those. And certainly, if people want to keep chatting, we've got Slack and email and so forth. So thank you, everyone, for coming, and we'll see you again soon.