From YouTube: Kubernetes Sig Docs 20181016
Description
Meeting notes: https://docs.google.com/document/d/1Ds87eRiNZeXwRBEbFr6Z7ukjbTow5RQcNZLaSvWWQsE/
The Kubernetes special interest group for documentation (SIG Docs) meets weekly to discuss improving Kubernetes documentation. This video is the meeting for 16 October 2018.
https://github.com/kubernetes/website
A: And so I don't, I hope, have to deal with uploading, and we are now recording. This is the weekly meeting for SIG Docs, October 16, 2018, and this is Jennifer Rondeau hosting, because Zach Corleissen is out sick today. So let's get started. Do we have anybody new on the call? I'm not looking at the full list at the moment, but it doesn't look as though we do. We do have new reviewers and approvers. I think, Jim, I saw you on the call, right?
B: Dominik, we'll... so last week we asked, like, you know, what topic people want to see next, and it sounded like high availability was the winner, and Dominik has the presentation prepared. Go ahead and take it away, Dominik. Okay.
C: Thank you. So today, continuing with modeling, we want to talk about my favorite topic, actually: high availability. And once again, just as last time, this is interactive, so please don't be shy, ask questions at any time, and I may ask a question or two during the presentation. So first off, when we talk about high availability we want to set the stage, and there are usually two concepts surrounding responsive systems: one is scalability and one is reliability.
C: So scalability is generally defined as responsiveness in the presence of load on the system, and reliability is defined as responsiveness in the presence of failure of system components. So in this presentation, for this talk, we want to focus on reliability only, that is, the ability of the system to be responsive in the presence of failure. So the first fundamental distinction that I want to make is that the reliability of Kubernetes does not imply the reliability of applications hosted on Kubernetes. Kubernetes may be able to sustain node failure or pod failure.
C: That does not immediately translate into your application being able to withstand that failure. A simple example: if I publish my workload, or deploy my workload, as bare pods and the node fails, then Kubernetes will keep being responsive, but my application will not be responsive. So in that case I would have to deploy my workloads as replica sets or deployments in order to take advantage of Kubernetes high availability and have Kubernetes reschedule my application workload that just failed. However, that still doesn't protect me if my application has inherent bottlenecks or inherent single points of failure.
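
To make that distinction concrete, here is a minimal sketch, assuming the Kubernetes Go API types (k8s.io/api, k8s.io/apimachinery) and hypothetical names such as "web" and nginx:1.15: it builds the Deployment you would submit instead of a bare Pod, with three replicas so the controller can reschedule pods lost to a node failure.

```go
package main

import (
	"encoding/json"
	"fmt"

	appsv1 "k8s.io/api/apps/v1"
	corev1 "k8s.io/api/core/v1"
	metav1 "k8s.io/apimachinery/pkg/apis/meta/v1"
)

func int32Ptr(i int32) *int32 { return &i }

func main() {
	// A Deployment rather than a bare Pod: the controller keeps 3 replicas
	// running and reschedules them when a node fails. A bare Pod would simply
	// be gone together with its node.
	labels := map[string]string{"app": "web"} // hypothetical label
	dep := appsv1.Deployment{
		ObjectMeta: metav1.ObjectMeta{Name: "web", Namespace: "default"},
		Spec: appsv1.DeploymentSpec{
			Replicas: int32Ptr(3),
			Selector: &metav1.LabelSelector{MatchLabels: labels},
			Template: corev1.PodTemplateSpec{
				ObjectMeta: metav1.ObjectMeta{Labels: labels},
				Spec: corev1.PodSpec{
					Containers: []corev1.Container{{Name: "web", Image: "nginx:1.15"}},
				},
			},
		},
	}
	out, _ := json.MarshalIndent(dep, "", "  ")
	fmt.Println(string(out))
}
```

Submitting an object like this gives Kubernetes something it can reschedule after a node failure; it still does nothing about bottlenecks or single points of failure inside the application itself.
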
C: So we need to make this distinction. So first, let's look at the high-level architecture that is important in this conversation. We have two sets of components: we have the master components and the node components. The master components host the kube-controller-manager, a process that hosts all core controllers; the kube-scheduler, the process that hosts the scheduler; the kube-apiserver, the API server; and an etcd node. Whereas on the node components we have the usual suspects, the kubelet and the container runtime. Of course we have other components like kube-proxy and cAdvisor, but for this discussion we can leave them aside.
C: Now, if we want to talk about high availability, and actually that is also true for scalability, the one ring to rule them all is redundant deployment. So in order to withstand the outage of a component, we deploy multiple components, or rather the same component: we actually deploy the same component redundantly. So in the case of Kubernetes that would result in this architecture: we have multiple nodes and we have multiple masters. And because a node cannot connect to a master directly, because the master is the one that may fail,
C: we have to add an indirection in that case, and that is the load balancer, so that the node connects to the load balancer and the load balancer will then forward traffic to an available master.
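
As a rough illustration only of what that indirection does (a real cluster puts HAProxy, keepalived, or a cloud load balancer here, with proper TLS against the cluster CA), a toy forwarder in Go could probe each API server's /healthz endpoint and send the request to the first healthy one; the addresses below are hypothetical.

```go
package main

import (
	"log"
	"net/http"
	"net/http/httputil"
	"net/url"
	"time"
)

// Hypothetical API server endpoints sitting behind this forwarder.
var apiServers = []string{
	"https://10.0.0.10:6443",
	"https://10.0.0.11:6443",
	"https://10.0.0.12:6443",
}

// healthy reports whether an API server still answers its /healthz endpoint.
// A real probe would verify the server against the cluster CA; that is
// omitted in this sketch.
func healthy(base string) bool {
	client := &http.Client{Timeout: 2 * time.Second}
	resp, err := client.Get(base + "/healthz")
	if err != nil {
		return false
	}
	defer resp.Body.Close()
	return resp.StatusCode == http.StatusOK
}

func main() {
	// Forward each incoming request to the first master that is still healthy:
	// the nodes only ever see this one stable address, which survives the
	// failure of any single master.
	http.HandleFunc("/", func(w http.ResponseWriter, r *http.Request) {
		for _, s := range apiServers {
			if !healthy(s) {
				continue
			}
			target, err := url.Parse(s)
			if err != nil {
				continue
			}
			httputil.NewSingleHostReverseProxy(target).ServeHTTP(w, r)
			return
		}
		http.Error(w, "no healthy master", http.StatusBadGateway)
	})
	log.Fatal(http.ListenAndServe(":6443", nil))
}
```

The kubelets and kube-proxies then point at this one address instead of at any individual master.
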
C: Now, a quick show of hands. As I said, this is all about reliability, but since we have a load balancer in there, does this setup also increase the scalability of the master components, or does it not have an effect on the master components?
C: Since there is a load balancer in the picture, and a load balancer is a common component when we talk about scalability: who believes that the load balancer can help us scale the master components if our cluster grows, or does the load balancer not have an effect on the scalability of our master components?
C: However, in the lower half of the picture you see that in this case the load balancer chooses another API server, one that is not co-located with the current etcd leader but with an etcd follower. Therefore the etcd follower will not answer the request directly, but will redirect that request to the leader and then reply on the leader's behalf. So our bottleneck, no matter how far we scale our API servers, is etcd.
A: Well, you know what they say about guessing; I'm really not sure of the answer. So you can set up a highly available cluster either with stacked masters, so that the control plane node is running on the same machine as the etcd node, or you can set it up with a separate etcd cluster. Does that make a difference here?
C: Not much. So you're right, there are multiple deployment scenarios; however, it doesn't make much of a difference, because the etcd cluster will always be your bottleneck. You cannot scale etcd horizontally; you have to scale the individual nodes vertically, so give it a bigger node, give it a bigger machine, give it faster hard drives. You can scale up, but you cannot scale out, right? However, now that we are deep into the topic, what I'm showing here is actually an oversimplification. It only always holds true for write requests,
C: anything that is state-modifying. For reads, you do have the choice to start Kubernetes with either quorum reads or not. If quorum read is true, any follower will always redirect read requests to the leader. However, if it's not true, the follower may respond to requests straight from its local cache, and then you may actually get stale information, so you may read old data. And if you do that rapidly in a row and the load balancer picks random etcd followers, the same query may actually lead to different results over time.
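
What Dominik describes here corresponds to the API server's old --etcd-quorum-read flag; the underlying etcd behavior is easy to see with the etcd Go client, where a plain Get is a linearizable (quorum) read and WithSerializable() lets the contacted member answer from its own, possibly stale, data. The endpoint and key below are hypothetical, the client import path differs between etcd versions, and this is a sketch of the etcd semantics, not of how the API server itself issues reads.

```go
package main

import (
	"context"
	"fmt"
	"log"
	"time"

	clientv3 "go.etcd.io/etcd/client/v3"
)

func main() {
	// Hypothetical member address; in a stacked control plane this would be
	// one of the etcd members sitting next to the API servers.
	cli, err := clientv3.New(clientv3.Config{
		Endpoints:   []string{"https://10.0.0.10:2379"},
		DialTimeout: 5 * time.Second,
	})
	if err != nil {
		log.Fatal(err)
	}
	defer cli.Close()

	ctx, cancel := context.WithTimeout(context.Background(), 2*time.Second)
	defer cancel()

	key := "/registry/pods/default/web" // hypothetical key

	// Linearizable read (the default): the request is confirmed through the
	// raft leader, so it can never return stale data.
	quorum, err := cli.Get(ctx, key)
	if err != nil {
		log.Fatal(err)
	}

	// Serializable read: the contacted member answers from its local state,
	// which may lag behind the leader, so repeated reads through a load
	// balancer can briefly disagree with each other.
	local, err := cli.Get(ctx, key, clientv3.WithSerializable())
	if err != nil {
		log.Fatal(err)
	}

	fmt.Println("quorum revision:", quorum.Header.Revision)
	fmt.Println("local revision: ", local.Header.Revision)
}
```

Run in quick succession against different followers, the serializable read is the one that can return different revisions for the same key, which is exactly the jitter described above.
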
C: We have the kube-controller-manager and the kube-scheduler, just as before, but in this case you also see that Kubernetes needs to select master components... I'm sorry, needs to select leader components. The kube-controller-manager and the kube-scheduler work on global shared state that is in etcd. If you have multiple kube-controller-managers active at the same time, or multiple kube-schedulers active at the same time, they may actually be competing.
C: This is not true for the node components. The nodes, or the kubelets, actually work on inherently partitioned data, since a pod is assigned to exactly one node, and there will be no competing updates in this case. Since the master components are working on global state, there may be competing updates, so Kubernetes has a leader election for the kube-controller-manager and the kube-scheduler. As you see here with the kube-controller-manager and the kube-scheduler, the leaders can actually end up on separate hosts.
C: However, etcd guarantees you that at any point in time there is only one of them. The leader information is stored as a Kubernetes Endpoints object in the kube-system namespace, and there is an annotation that carries the holder identity, and the holder identity points to the kube-controller-manager or kube-scheduler that is the current leader. Kubernetes implements leader election on top of Endpoints, and thus on top of etcd, and etcd guarantees the consistency of it. And this is simply a depiction of how the leader election works; it is a fairly standard leader election.
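
A small sketch of how to look at that record, assuming a recent client-go and a kubeconfig in the default location: the leader election record (holder identity, lease duration, renew time) is stored in the control-plane.alpha.kubernetes.io/leader annotation on the kube-scheduler and kube-controller-manager Endpoints objects in kube-system; newer releases keep the same record in Lease objects instead.

```go
package main

import (
	"context"
	"fmt"
	"log"

	metav1 "k8s.io/apimachinery/pkg/apis/meta/v1"
	"k8s.io/client-go/kubernetes"
	"k8s.io/client-go/tools/clientcmd"
)

func main() {
	// Load the kubeconfig from its default location (~/.kube/config).
	config, err := clientcmd.BuildConfigFromFlags("", clientcmd.RecommendedHomeFile)
	if err != nil {
		log.Fatal(err)
	}
	client, err := kubernetes.NewForConfig(config)
	if err != nil {
		log.Fatal(err)
	}

	// Each of these Endpoints objects carries the current leader election
	// record (holder identity, lease duration, renew time) as an annotation.
	for _, name := range []string{"kube-scheduler", "kube-controller-manager"} {
		ep, err := client.CoreV1().Endpoints("kube-system").Get(
			context.TODO(), name, metav1.GetOptions{})
		if err != nil {
			log.Fatal(err)
		}
		fmt.Printf("%s: %s\n", name,
			ep.Annotations["control-plane.alpha.kubernetes.io/leader"])
	}
}
```

The holderIdentity field in that JSON record names whichever instance currently holds the lease.
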
C: It will host the scheduler and it will keep trying to renew the lease. As long as it still has the lease, it is still the leader, and once it loses the lease it actually does a hard kill of the process; for example, the kube-controller-manager hard-kills itself, and then operating system mechanisms like, for example, systemd restart the process, and it starts again and enters this loop again.
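
That acquire/renew/lose-and-exit loop is the same one client-go exposes to any component through its leaderelection package. A minimal sketch, assuming a recent client-go, a hypothetical lock name my-controller, and the hostname as identity (the durations mirror the defaults the core components use):

```go
package main

import (
	"context"
	"log"
	"os"
	"time"

	metav1 "k8s.io/apimachinery/pkg/apis/meta/v1"
	"k8s.io/client-go/kubernetes"
	"k8s.io/client-go/tools/clientcmd"
	"k8s.io/client-go/tools/leaderelection"
	"k8s.io/client-go/tools/leaderelection/resourcelock"
)

func main() {
	config, err := clientcmd.BuildConfigFromFlags("", clientcmd.RecommendedHomeFile)
	if err != nil {
		log.Fatal(err)
	}
	client := kubernetes.NewForConfigOrDie(config)

	id, _ := os.Hostname() // identity recorded as the lease holder

	// The lock is a Lease object in kube-system; older setups used the
	// Endpoints annotation described above.
	lock := &resourcelock.LeaseLock{
		LeaseMeta:  metav1.ObjectMeta{Name: "my-controller", Namespace: "kube-system"},
		Client:     client.CoordinationV1(),
		LockConfig: resourcelock.ResourceLockConfig{Identity: id},
	}

	leaderelection.RunOrDie(context.Background(), leaderelection.LeaderElectionConfig{
		Lock:          lock,
		LeaseDuration: 15 * time.Second, // how long an acquired lease stays valid
		RenewDeadline: 10 * time.Second, // give up leading if renewal takes longer
		RetryPeriod:   2 * time.Second,  // how often to try to acquire or renew
		Callbacks: leaderelection.LeaderCallbacks{
			OnStartedLeading: func(ctx context.Context) {
				log.Println("became leader, starting reconciliation loops")
				<-ctx.Done() // do leader-only work until leadership is lost
			},
			OnStoppedLeading: func() {
				// Mirrors the hard-exit behavior described above: stop
				// leader-only work immediately once the lease is lost.
				log.Fatal("lost leadership, exiting")
			},
		},
	})
}
```

RunOrDie blocks: leader-only work happens inside OnStartedLeading, and OnStoppedLeading exits the process so that a supervisor such as systemd restarts it and it re-enters the loop, just as described above.
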
C: An interesting side note is that etcd guarantees you that at any point in time there is at most one leader controller, either at most one leader kube-controller-manager or at most one leader kube-scheduler. However, that does not prevent the individual components from falsely believing that they are still the leader component while they are not. That is the situation where, at one point in time, you can actually have two competing leaders, and this is also referred to as split brain. Typically, split brain is an unavoidable condition.
C: However, you can guard against it, to begin with, for example, with fencing tokens. However, Kubernetes does not apply any of those guarding mechanisms. So for a brief period of time you may have two competing masters. That may lead to situations where you have a replica set that specifies three replicas; at time T you have zero, two of the masters are online, each believing itself to be the leader, resulting in six replicas of the pod. However, down the road the reconciliation loop will take care of that.
C: This is an interesting side note, since Kubernetes actually doesn't give you any guarantees, you know, with, for example, replica sets, that it will only ever create up to three pods. Basically, it only gives you the guarantee that over its lifetime it does its best to keep it at a steady state of three pods. So while high availability actually helps to keep that guarantee, there may be situations where you see jitter. Now, let me see... we do have, yeah, unfortunately, this one.
C: Not much. If you look again at this picture, you take the load balancer away or you take etcd away, and the entire architecture crumbles on you. So the load balancer needs to be a highly available load balancer; Kubernetes does not provide that, it just requires the presence of it. And etcd is the one that actually provides the consistency of the data, and in this case also of the leader. And yes, Kubernetes does have its own leader election, but, as I said, it doesn't guard against split-brain situations.
C: It is, what's the right word, you could argue this is suboptimal, and in a situation that requires consistency that would actually be unacceptable. However, due to its reconciling nature, Kubernetes can actually deal with these situations down the road, but you see a certain jitter, the jitter that we talked about, for example, with the replica sets. So actually, out of the box, Kubernetes itself doesn't provide much.
C: It stands on the shoulders of giants, that is, the load balancer and etcd, and it uses their capabilities sufficiently, however not to the maximum, yeah. But since it doesn't give you any guarantees besides "I do my best to try to reconcile the current state with the desired state", it is actually a fair architecture. But in and of itself there is not much that Kubernetes adds to the high availability, yeah.
C: Since the API server is, you could say, a proxy on top of etcd, there is no need to elect a leader for the API server. Whatever API server the load balancer chooses to forward the request to will do just fine. It will then eventually forward it, especially if it's a write request; etcd will forward that request to the current etcd leader, no matter which one was selected.
C: Custom controllers or custom schedulers do not cleanly fit into this view. So the various controllers are a concept, or an extension point: you have a replica set, you have a deployment, you can have anything else; they're an extension point. Kubernetes makes a difference in how it hosts core controllers and how it hosts custom controllers, and therefore you will run into these problems, because custom controllers cannot take advantage of the entire already existing machinery.
D: Now I remember now. I didn't get to that between last week and this week, so...
A: Yeah, we've been kind of random, we've been unsystematic in how we use the projects in the website repo, so I think the idea is to start figuring out more system there and to think of it in terms of, like, subprojects of the master project that lives in the website repo. That's my recollection anyway, yeah.
A: And Steve, I'm happy to take a look at things if you want to rubber-duck or brainstorm, since Zach and I have been talking about the larger content cleanup. Yeah, okay, that also sounds good.
F: If the group has any input on what exactly we should do with it: thus far we've mostly just been kind of having, you know, ad-hoc conversations on Slack about various things, and we've responded to a couple of small action items, but I guess I'm here to get input from the group, not just about logistics and how our subgroup should run, but also what our priorities should be, if anybody has any action items that they'd like to request, and so on and so forth. So I'm just opening it up, opening it up to you guys.
A: Anybody? Our group has shrunk, I think, significantly here, but Jim, Lana, got anything? I'm sorry, I was multitasking here.
A: Like most of the agendas headed that direction, which is fine with me, and I'll give it a little bit of thought too. Not because I want to be part of the working group; I don't, I need to sit on my hands here. I'm interested, but too many other things. But I think it might also be useful for those of us who aren't planning to be part of the group to ponder a bit more how some of this stuff is already laid out.
A: I think the kinds of tooling and infra issues that we have been thrashing around as a larger group, to make sure that those are clearly defined for your group. Look, I mean, some of that is sort of legacy Hugo-migration pain, and that's largely cleaned up. That's why I need to go away and think about it and can't just brainstorm on the spot. But does that make sense to the rest of you?
F: Okay, yeah, because I think, you know, it's one of those things where we have all these different platforms and sources of information and nothing yet that really centralizes that information. So I think maybe in the next week or so I can give that a little bit of thought. You know, I wonder if maybe, like, issue tags could be the primary means of coordinating this information, or maybe we should set up a separate project, so on and so forth, but yeah.
A: Somebody else needs to wrangle this one, but I was curious about it, because historically the Travis build has done other things from what this particular PR was suggesting that I do. And if we had a place, like a way of tracking related issues, so that PR wranglers who aren't necessarily going to step into those PRs and wrangle them, but who need some context to figure out how to sort of point them at people... yeah. I guess that's really a long-winded +10 for your issue labeling, Luke.
F: That's something that Karen Bradshaw and, I think, a couple of others have expressed interest in. Yeah, I mean, basically I do think that that very much falls under our umbrella, which at this point is basically everything: anything that's not writing and thinking about content kind of falls under our umbrella. So yes, yeah, that's the long-winded answer. Okay.
F: Yeah, I would love to, I don't know, designate a point person on that, or at least maybe, you know, get some kind of a Google Doc or something from somebody, because, you know, I don't even know where to begin comprehending the scope of the problem or understanding the trade-offs and challenges that have been considered thus far. Let's have an information source on that, yeah.
A: And another, another person to bring into that conversation is... because he's done the most work recently to rewrite stuff and sort of corral and wrangle it. And there's still, I know, around the bit that I know the best, and I don't know it well: I only know the issues on the kubeadm docs. For 1.9.6, SIG Cluster Lifecycle generated them, and for 1.10 I'm not sure what happened, when the 1.11 things were...
A: I am still not sure what happened. For 1.12, for sure, SIG Docs generated them, and it appears that we've taken that over, and that's now part of the generated docs responsibilities and part of the scripts. But that story alone suggests that we need to pull more bits together in terms of the story we tell to other... to other things, also.