From YouTube: Kubernetes Office Hours 20210616 (EU Edition)
Description
Office Hours is a live stream where we answer live questions about Kubernetes from users on the YouTube channel. Office hours are a regularly scheduled meeting where people can bring topics to discuss with the greater community. They are great for answering questions, getting feedback on how you’re using Kubernetes, or to just passively learn by following along.
For more info: https://k8s.dev/events/office-hours
A: All right, welcome everyone to today's Kubernetes Office Hours, where we will answer your questions live on air with our esteemed panel of experts. You can find us in the office hours channel on Slack; please check the topic for the URL for more information. Before we begin, we just want to take a moment for everyone on our panel to introduce themselves, say hello, and share a little bit about themselves. We'll start from left to right. Please take it away.

C: Hi, Marcus Johansson at Equinix Metal, principal engineer on our developer relations integrations team. That's working on Kubernetes integrations, Terraform integrations, and every kind of SDK and cloud integration imaginable. We're also hiring, looking for people who want to join; it's a great opportunity.

A: All right, my name is David McKay. I am also at Equinix Metal. I'm a developer advocate and YouTube streamer.

A: Awesome, thank you, everybody. All right, now, before we get started, here are the ground rules. This is a Kubernetes event, so the code of conduct is in effect. In short, please be excellent to one another. This is also a judgment-free zone; everybody has to start from somewhere, so please help out your buddy by keeping a supportive environment in the channel.

A: Normally we do provide t-shirts; however, the CNCF store is being replenished at the moment, but we will give you a shout-out and our undying devotion. Panelists, you are encouraged to expand on answers with your experience and pro tips. And audience, you can help by pasting URLs to docs, blogs, and anything that might be relevant to the topic at hand in the channel, and you can also post your questions to discuss.kubernetes.io.

A: All right, you can also help us by tweeting, spreading the word, and paying it forward. This panel is made entirely of volunteers; if you want to join us for Kubernetes Office Hours, sit in one of these chairs, and help the community by answering questions, reach out. We're always happy to have fresh faces join us on this show. Also, a new thing we do at Kubernetes Office Hours is our community shout-out each month.

A: Our first question: I want to know why, when I remove the command, and the command here is "bin sleep 3650d", so a lot of days, the pod won't show as running. This is a CentOS image.

C: Okay, I'm muted. Well, if there's no command, there's nothing for the pod to run. I kind of wonder what error message a person would receive without a command; it would get the default entrypoint for the image, perhaps, which I guess would be the reason why nothing would be running. Maybe whatever's in the image is exiting as soon as it runs.

D: If you just want to start an OS like in a VM and work in it, that's okay for testing, maybe, and then you need the sleep. But that's usually not how you would use a container. You would try to define directly which process you are running, and try to stick to a single process, because in a pod you can join a few containers, one for each of your processes, and you don't need to run it like a VM where you just go in and run a bunch of things.

B: Yeah, I just want to say briefly, for people that are looking for more advanced use cases, building out containers from scratch, building up their Dockerfile, or maybe having multiple things running in a container: there are two or three solutions out there that are base images or very thin init processes, things like tini and a few others, that you can set as your entrypoint and that can keep the containers running. So there are some options there, depending on what you're trying to do.

A: Thank you very much. I'll just add one thing that may seem obvious to probably everyone here: Dockerfiles use the entrypoint as the command to run by default, where the command is the arguments to the entrypoint, unless the entrypoint doesn't exist, in which case the command is the command. And both of those can be overridden as well.
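As a sketch of that override behavior (image and names are illustrative), a Kubernetes manifest can replace both the image's ENTRYPOINT and CMD, which is why removing the command from a CentOS pod leaves it with nothing long-running to do:

```yaml
# Illustrative only: a plain CentOS image exits immediately because its
# default entrypoint (a shell) has nothing to keep running. In Kubernetes,
# `command` overrides the image's ENTRYPOINT and `args` overrides its CMD.
apiVersion: v1
kind: Pod
metadata:
  name: sleeper
spec:
  containers:
    - name: centos
      image: centos:7
      command: ["/bin/sleep"]   # replaces the image's ENTRYPOINT
      args: ["3650d"]           # replaces the image's CMD
```

Removing `command` and `args` here falls back to the image's own entrypoint, which for a base OS image exits straight away, as the panel describes.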
D: Yeah, is it from the nodes, or is it also the registry cleanup? Because for the node cleanup, if I remember correctly, there is a garbage collection in the kubelet that, if set up correctly, should do it for you, as long as you don't have more requirements. There's something in the thread with some details, yeah.

A: Yeah, I think, because of how I said it, in my head I was assuming it was going to be the node side, and the garbage collection in the kubelet is a really good point. Does anyone have any specifics on what that garbage collection looks like and how old the images need to be? Anyone familiar with that off the top of their heads?

E: Well, some registries, I think, let you have a policy to expire images that are older than some date, so they will be cleaned up automatically. You probably need to check whether your registry has such a feature to do that kind of garbage collection, like a policy to clean up older images.

D: Yeah, and I guess from experience, I've used Azure Container Registry. I think it has a concept of tasks, which you can use to automatically clean up images or tags that you may not be using anymore. So that might be something that can work for that specific solution.

C: I know you said off the top of your head, but I like to cheat when I can, and the first thing that came up when I googled is "garbage collection for container images". It talks about cAdvisor and how it has thresholds for when images will be deleted, and the end of this URL, which I'll add in the HackMD, says that some of these features will be replaced by kubelet eviction in the future.
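For reference, the kubelet's image garbage collection is driven by disk-usage thresholds rather than image age; a sketch of the relevant KubeletConfiguration fields (the values shown are the usual defaults, so check the docs for your kubelet version):

```yaml
# Illustrative KubeletConfiguration fragment: the kubelet starts deleting
# unused images once disk usage crosses the high threshold and keeps
# deleting until usage is back under the low threshold.
apiVersion: kubelet.config.k8s.io/v1beta1
kind: KubeletConfiguration
imageGCHighThresholdPercent: 85  # start image GC above 85% disk usage
imageGCLowThresholdPercent: 80   # free space until usage drops below 80%
imageMinimumGCAge: 2m            # never delete images younger than this
```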
D: I did cheat a bit right now and checked out the way it's installed in those docs, and it seems they're basically just pulling down the images beforehand. So in the "download and install Kubernetes images" step, instead of downloading v1.18.1 as the docs do now, you would download the newer version, for example the latest 1.19. The installation, or the deployment, of Kubernetes is done using kubeadm there, but for an upgrade, what's the correct kubeadm command? I think it's just kubeadm upgrade, right? So there you would use the same Kubernetes version that you just downloaded.

D: You would most probably also want to update your flannel and ingress controllers to more recent versions, but for Kubernetes itself, I would just redo the items in those docs around "download and install the kubelet" or "download and install Kubernetes images", which is basically docker pull, save, and load, and then follow the documentation for kubeadm upgrade. I guess we can link that.
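A hedged sketch of that pull/save/load flow for an air-gapped upgrade (the image list and version are illustrative, not a complete inventory); the script only prints the commands so the flow can be reviewed before running it on real hosts:

```shell
# Sketch of an offline image workflow: pull and save on a connected host,
# copy the tarballs across, then load and upgrade on the air-gapped node.
# run() only echoes each command here; swap it for real execution once
# the image list matches your cluster.
VERSION="v1.19.11"                      # illustrative target version
IMAGES="kube-apiserver kube-controller-manager kube-scheduler kube-proxy"
run() { echo "$@"; }

for img in $IMAGES; do
  run docker pull "k8s.gcr.io/${img}:${VERSION}"
  run docker save -o "${img}.tar" "k8s.gcr.io/${img}:${VERSION}"
done

# ...copy the .tar files to the offline node (USB stick, internal mirror)...
for img in $IMAGES; do
  run docker load -i "${img}.tar"
done
run kubeadm upgrade apply "${VERSION}"
```

Because everything goes through `run`, the same script doubles as a dry-run checklist for the panel's "pull, save, load, then kubeadm upgrade" advice.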
E
I've
done
previously,
like
installation
upgrades
for
the
offline
mode
like
when
the
company's
behind
firewall,
so
you
can't
like
pull
the
images
from
internet
and
stuff
like
that,
that
time,
the
the
customer
that
I
work,
they
had
a
jfrog
artifactory
and
they
have
a
kind
of
a
mirror
feature.
E
So
on
on
the
registry
itself,
you
will
be
able
to
mirror
your
images
from
the
that
is
available
on
other
registries,
so
you
can
pull
it
like
internally
in
your
organization,
if
you're
behind
firewall
and
then,
if,
like,
I
think
the
extreme
case,
if
you
don't
have
even
that
there
is
like
you
know,
you
can
always
use
docker
save
docker
load
command
like
it's
docker
save,
will
create
a
tar
file
for
your
image
and
you
can
copy
it
like
on
your
usb
stick
and
then
bring
it
up
and
then
docker
load
from
turbo.
E
It's
gonna
create
your
image.
Obviously
you
need
to
do
some
research.
What
images
you
need
to
have
for
your
kubernetes
installation,
like
a
q,
proxy
image
or,
like
you,
know,
cubelet
image
or,
like
all
the
like
cube
api
server,
docker
image
like
so
you'll,
have
to
do
research.
It's
probably
a
little
bit
of
pain
process
for
you,
so
maybe
look
for
also
some
some
vendors
or
some
distribution
of
kubernetes.
They
already
have
pre-loaded
images
that
can
be
reused
and
stuff
like
that.
E
A
Yeah,
I
guess
if
these
are
qbdm
clusters
and
I'll
just
assume
they
are
because
I
think
that's
where
a
lot
of
us
are
these
days
that
you
know
as
far
as
the
the
host
goes,
the
only
binaries
you'll
need
are
the
upgraded
cubelet
and
qbdm
itself,
and
then
the
images
like
everyone
else
has
mentioned,
with
the
api
server,
the
controller
manager,
the
scheduler,
etc,
pull
them
down
load
them
onto
the
hosts
and
should
be
a
large
chunk
of
it
done
so.
Best
of
luck,
great.
E: All right, I haven't used it for a while, but we used to have a project in the Kubernetes community called Kubespray. Kubespray was an Ansible deployment of kubeadm, essentially, with a lot of different features. I just checked the repository; I don't know how up to date it is, since I haven't used it for a while, but apparently they have offline environment installations. I'll share the link, maybe in the Slack channel or in the HackMD document.

A: Or on the discuss forums. Awesome. Okay, let's jump back over to our Slack channel. I see we have a question from Long. I reckon we may need some more details there, so I'll read it out, and if you can get back to us with some more, that would be great. Long asks: what is recommended if we know the node is shut down and the pods don't reschedule? Anyone feel confident enough answering that as is, or do we want more details?
D: I had one of these issues on my last on-call shift, so I can tell you what I did. Mainly, you need to check why the pods are not rescheduling. Often, at least for me, it was that a volume, a persistent volume, was stuck and was not being moved to the new node, and that was a cloud issue, basically, that I had to resolve. It wasn't very easy to resolve, but there can be very different reasons why a pod is stuck and not getting rescheduled.

D: If you have a non-autoscaling cluster, it could also just be that there is not enough space. Sometimes it will tell you: if it's a hostPort service, say, and that host port is not available on any other host, it will usually tell you why it cannot be rescheduled. So on each pod you need to run the describe command and then see why it's not being rescheduled.

C: I like to take advantage of autoscalers and add a node before I remove a node. You know, you might end up with the same problem when you try to remove that node, but it's kind of like cycling forward: I'll just add a new node and remove the oldest one. And definitely check the health of everything that you've moved to the new node before you go killing your old node. I've lost some Longhorn clusters that way.

A: The question that popped into my mind when I read your question, Long, was: define a node shutdown for me. Was it a clean shutdown? Was it drained and cordoned, or did the node disappear? If the node disappears, those pods won't necessarily be rescheduled until a timeout has been satisfied, just in case the node magically comes back online.

D: On the timeout, I'll add a couple of things. Definitely the volume point; I ran into that before. The other things to check are your node selectors, or taints and tolerations, or affinity, like node affinity, or anything like that: perhaps the old node had certain of these properties and the new ones don't, so the pods aren't being scheduled.

D: And on what David was saying around the eviction time: there's also a timeout, because the node might come back. Maybe it comes back, maybe it's a network split, so it will wait, I think by default at least 15 minutes, or depending on what has been set up there.
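That wait is visible on the pod itself: an admission plugin normally injects tolerations for the not-ready and unreachable node taints with a tolerationSeconds window (300 seconds by default, and configurable). A sketch of what they look like in a pod spec:

```yaml
# Illustrative fragment: these tolerations are usually added automatically
# (DefaultTolerationSeconds admission plugin). Lowering tolerationSeconds
# makes pods evict sooner after a node goes unreachable; the values shown
# are the usual defaults, not a recommendation.
tolerations:
  - key: node.kubernetes.io/not-ready
    operator: Exists
    effect: NoExecute
    tolerationSeconds: 300
  - key: node.kubernetes.io/unreachable
    operator: Exists
    effect: NoExecute
    tolerationSeconds: 300
```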
A: Yeah, it's a lot longer than I think most people expect. Again, Long came back and said someone tested disaster recovery and just shut down the worker, so yeah, that's not going to reschedule immediately; it's going to take a substantial amount of time unless you nudge it and encourage it along. Plus, of course, all the answers we have from the panel here: are you using local host-path stuff, taints and tolerations, and so on. There's a whole bunch of things, and it's quite a challenge, but hopefully we've given you enough information that you can move that forward.

A: All right, let's move on to the next Slack question. This one comes from Mustafa, who is asking: does anyone have any suggestions for GPU partitioning other than the GPU share scheduler extender? I'm not sure what the first one is, Alibaba Container Service maybe, or whether it's set up on containerd. And any recommendations on switching from Docker to containerd for a bare-metal cluster?
B: I don't really know; I haven't done that conversion firsthand. But I've got to imagine that you install containerd and then you tell Kubernetes to leverage containerd, or rather you tell the kubelet to leverage containerd, so this would probably be a kubelet config update after you've installed the containerd binaries on whatever distribution you're using. That's what my guess would be.
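For reference, and hedged since the exact mechanism varies by kubelet version and distribution: at the time of this stream, pointing the kubelet at containerd was typically done through its container-runtime flags, for example in a systemd drop-in:

```ini
# Illustrative systemd drop-in, e.g.
# /etc/systemd/system/kubelet.service.d/0-containerd.conf
# These flags match kubelet versions current at the time of this stream;
# later releases fold the runtime endpoint into KubeletConfiguration.
[Service]
Environment="KUBELET_EXTRA_ARGS=--container-runtime=remote --container-runtime-endpoint=unix:///run/containerd/containerd.sock"
```

After reloading systemd and restarting the kubelet, the node reports containerd as its runtime.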
B: I did bare metal years ago, and it's very much like you're going node to node, upgrading and adding the things you need, the dependencies that are needed for the kubelet to use. So that should be about it; there shouldn't be anything control-plane-ish that you should really need to do. But I could be completely wrong.

A: No, I think my experience is the same there. You do need to do it node to node; you'll want to drain and cordon while you do the upgrade, then reconfigure the kubelet, restart it, and bring it back into action. You can do that node by node, and every node can run a different CRI if you want, so I would recommend that approach for the upgrade.

E: Alternatively, if it's a small footprint, you can just deploy a new cluster with the new CRI and do a kind of blue-green shift: push your CD towards the new cluster and then do a cutover. I would say once it's appropriate, that should be fine.
D: Just another note: I guess you do still have to take care of any workloads that may, for some reason, be using the Docker socket, so you may have to migrate those workloads to use something else.

D: Yeah, one thing I've also seen is that some people keep the Docker daemon up: they just move Kubernetes to containerd but still have Docker on the system, for, I don't know, a legacy CI/CD pipeline where they need to mount the Docker socket, or, I think in some versions of the AWS CSI or CNI plugins, you still need the Docker socket. I've seen some cases where they just keep it on the OS and don't completely remove Docker, as long as you don't have big security concerns around it.

C: I like the ephemeral cluster approach, where you just move everything to a new cluster, but if you don't want to try that, maybe you could try doing the same thing with nodes: again, just adding new nodes that use the new container engine and then ditching the old Docker ones.

A: Yeah, I guess it depends how bare metal their bare-metal cluster is and what flexibility they've got in capacity, but those are all really, really good options. What about the GPU partitioning? Does anyone have any advice on that? I don't think I've ever touched a GPU workload on Kubernetes, so I'm really not the one to ask.
D: I've never done it myself, but I actually know the people, or some of the people, behind the Alibaba Cloud GPU share scheduler, and that one is actually kind of the one that is used most these days. I've heard there is another one, which is a fork of the NVIDIA Kubernetes device plugin; I'll post the link in a bit. I found it at some point, but it has also not been touched for a long time.

D: However, if the Alibaba one is not working on containerd yet, then most probably they will be working on that too, because Kubernetes is deprecating anything that still relies on the dockershim, so I would expect the Alibaba people to also move away from that. Otherwise, I guess they would be open to an issue and to talking about that, because the plugin looks good and looks pretty widely used; it has quite a few stars and seems very well maintained.

A: All right, thank you for that. I guess that also shows my GPU experience, because I thought that was Alibaba Container Service, so thank you for correcting me there. Today I learned. Okay, let's move on with the Slack questions. We've got one here from Vamber, who says: hi everyone, I'm currently working for a company that has around 50,000 workloads running on Kubernetes. Recently, the main objective is to improve CPU efficiency.
A: We ran into a pretty interesting problem: some of our software engineers actually used cpuset within their code, which we believe would cause the corresponding container to monopolize the CPU, and this is bad for sharing the CPU between pods. Is there a more efficient way to quickly filter out the pods which exhibit this cpuset behavior? I'm not sure what cpuset behavior is; does anyone here know what that is and want to give us a TL;DR?

B: Okay, yeah, I just didn't want to overshadow somebody, but yeah. cpuset, and I'm pretty sure I used this back in college briefly when learning about these things, is basically a mechanism in the Linux kernel that lets you set the CPUs to leverage for an application. So in Kubernetes land this is not something you want to be using in your code at all; that's not going to work.

B: Well, I think really the thing you should start doing is asking: how can we get developers to think differently about the environment that their application runs in? And the other part of this is that there is an option, I forget what you pass now, but you can actually have specific pods take dedicated CPU cores.

B: Someone, like me, will link the documentation to do this in the office hours chat, but maybe that's a short-term thing that you're able to do, where you actually grant their services dedicated CPU cores, and then they can cpuset on those cores.

B: If that's something that might work. But I think to really get the proper multi-tenancy and shared resources you're looking for, you're going to have to get developers to stop using cpuset. I'm interested to know what's actually happening with the things that are using cpuset: is the runtime's cgroup configuration actually allowing them to take certain CPUs, or is it just a facade and they're not actually able to pin or have affinity to any actual cores? It just looks like they can, so the developers aren't getting what they want with that being set, and you're not getting what you want. So there's a lot here to work through.
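The "dedicated cores" option mentioned above is the kubelet's CPU manager; a hedged sketch (field names per the kubelet documentation, values illustrative). With the static policy, a Guaranteed pod that requests a whole number of CPUs gets exclusive cores:

```yaml
# Illustrative: enable the static CPU manager policy on the node.
apiVersion: kubelet.config.k8s.io/v1beta1
kind: KubeletConfiguration
cpuManagerPolicy: static
---
# A container only gets exclusive cores if its pod is in the Guaranteed
# QoS class (requests == limits) and it requests an integer CPU count.
apiVersion: v1
kind: Pod
metadata:
  name: pinned-worker
spec:
  containers:
    - name: app
      image: example.com/app:latest   # illustrative image
      resources:
        requests:
          cpu: "2"
          memory: 1Gi
        limits:
          cpu: "2"
          memory: 1Gi
```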
D: They also mentioned they're not sure how to find all these containers that are using cpuset, because it's in the code, or it's on the kernel. Would something like Falco maybe help there? I think with Falco you could define a rule that would find all of these for you in an audit.

E: I just want to say that 50,000 workloads seems like a Twitter-level, I don't know, it's an allegedly large deployment, right? So obviously they need to figure out some rules around how to use this large cluster in a multi-tenant environment. So definitely policies, and Falco would be great, but I also want to call out that Kubernetes has the quality-of-service functionality, right?

E: So potentially, if those are important workloads, you need to make sure that your QoS class is set to Guaranteed, which means making sure that your requests and limits are equal to each other. And then, in terms of finding good values and benchmarking all of these things, I don't know, maybe you all have more experience, but I have recently played with the VPA, the Vertical Pod Autoscaler, and it has a feature called the recommender, so it doesn't actually change your CPU requests and limits.

E: But it recommends good values based on the history of your container runs, so you can actually have an out-of-the-box benchmark for your containers, and then you can still use HPA, but use VPA just to recommend good values for requests and limits. Hopefully that helps, because with such a large number of containers, at least this process will be semi-automatic.
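A hedged sketch of running the VPA purely as a recommender (API version per the autoscaler project; the target name is illustrative). With updateMode "Off" it records recommendations in its status without evicting or resizing anything:

```yaml
# Illustrative VerticalPodAutoscaler in recommendation-only mode.
# Read the suggested requests from `kubectl describe vpa app-vpa`
# (status.recommendation) instead of letting the VPA apply them.
apiVersion: autoscaling.k8s.io/v1
kind: VerticalPodAutoscaler
metadata:
  name: app-vpa
spec:
  targetRef:
    apiVersion: apps/v1
    kind: Deployment
    name: app            # illustrative target deployment
  updatePolicy:
    updateMode: "Off"    # recommend only; never evict or mutate pods
```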
D: It's been a while since I've looked into it, but as far as I know it runs VPA internally, and it runs certain experiments: I think it tries to set a certain value, try it out a bit to get more metrics, and then it gives a recommendation, like "this is what we found would be a perfect setting for your workload". Maybe someone else is more up to date on that; I haven't looked at it for almost a year.

E: Yeah, I haven't checked, to be honest, how Goldilocks works, but I found it very interesting. I think it provides a UI in addition, so you can have better visibility in terms of how the CPU and memory get used.

E: Obviously there are some other options too: if this person is running on Google Cloud, there is new functionality called GKE Autopilot that is automatically going to size your workloads and your nodes based on the workloads you're running, so that is a fully automated mode; you don't really need to worry about it.

D: One more thing to maybe consider: if you can use seccomp profiles to just block this functionality for all of the workloads, then, if that's what you're trying to do, you don't have to go back and identify all the code and workloads and change the code. So that might be another, cleaner option to investigate.
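A hedged sketch of that seccomp idea (assumption: the pinning goes through the sched_setaffinity syscall; the profile name is illustrative). A custom profile can make that one call fail while allowing everything else:

```json
{
  "defaultAction": "SCMP_ACT_ALLOW",
  "syscalls": [
    {
      "names": ["sched_setaffinity"],
      "action": "SCMP_ACT_ERRNO"
    }
  ]
}
```

Placed under the kubelet's seccomp directory (for example /var/lib/kubelet/seccomp/deny-affinity.json), it can be referenced per pod via securityContext.seccompProfile with type Localhost and localhostProfile deny-affinity.json; test carefully, since some runtimes and languages call sched_setaffinity during normal startup.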
C: So even if this user is able to jump to different CPUs, they should still be limited to the same overall CPU percentage. But yeah, Barco linked to that seccomp profile configuration, which is at a pod level; I do wonder if there's a way to set that at a namespace level.

D: Valid also just mentioned that cpuset is usually used to pin CPUs, and to do that within Kubernetes you would need a specific HPC scheduler. He also mentioned one by Univa, now part of Altair; they might have something there, but I've seen other HPC scheduling solutions for Kubernetes too.

A: All right, thank you. Well, we would love to have you join the panel one month; I'm going to reach out to you, because you've got a lot of experience there to share. And just because there was an acronym, and I don't like acronyms without at least one person saying what they mean: HPC is high-performance computing, for anyone that isn't familiar.
A
Okay,
let's
see
what
else
we've
got.
We've
got.
Oh
another
burmel
question,
so
ashish
is
asked
and
bare
metal,
multi-master
h,
a
cluster
setup.
Aj
proxy
is
distributing
load
between
all
of
the
masters.
How
shall
traffic
be
rooted
for
deployments
on
multiple
nodes?
A
A
A
All
right,
I
guess
I'll-
have
a
go
at
this
one,
so
there
are
a
couple
of
ways
to
do
a
highly
available
control
plane
on
kubernetes
on
bare
metal.
You
need
to
either
use
gratuitous,
arp
or
bgp.
I
would
encourage
you
to
go
down
the
bgp
route
and
advertise
the
highly
available
control
plan
ip
from
each
of
your
control
plane
nodes.
A
When
you're
moving
on
to
workloads
running
on
the
cluster,
you
probably
want
to
go
down
the
same
route
again.
Bgp
is
a
pretty
solid
option
here
you
can
use
metal
lb
for
this
or
cube
vip
and
both
which
can
advertise
the
addresses
of
your
load.
Balancer
ips,
how
you
get
those
ips.
It
comes
down
to
your
setup.
I
can't
really
give
you
a
lot
of
advice
there
and
if
you
want
to
provide
more
details,
just
feel
free
to
reach
out
to
me
on
the
kubernetes
slack
and
I'm
happy
to
chat
about
that
all
right.
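As a sketch of the MetalLB side (addresses and peer details are illustrative, and this is the ConfigMap format MetalLB used at the time of this stream; newer releases moved to CRDs):

```yaml
# Illustrative MetalLB BGP configuration: peer with the top-of-rack
# router and hand out LoadBalancer IPs from a routable pool.
apiVersion: v1
kind: ConfigMap
metadata:
  namespace: metallb-system
  name: config
data:
  config: |
    peers:
      - peer-address: 10.0.0.1     # illustrative ToR router address
        peer-asn: 64501
        my-asn: 64500
    address-pools:
      - name: default
        protocol: bgp
        addresses:
          - 192.0.2.0/24           # illustrative LoadBalancer pool
```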
A: All right, Karim asks: hello, I have a bare-metal cluster with eight nodes, three control planes and five workers. If I shut down one worker node, this StatefulSet never recreates its pods on another worker node. If I query the Kubernetes API, it shows that the pod is running. They've waited more than 30 minutes, and the pods never go Unknown or Terminating. So this is a little bit similar to, I think, the second question we tackled today. Did someone else want to add anything to that? Well, I see Long has shared something.

D: I mean, StatefulSet definitely leads me to think it's something related to volumes, yeah. So I would definitely dig deeper into that and see if that's the issue.

D: I've seen this with GPU-style home clusters. There was this orange Ubuntu box back in the day, like a year or two ago; not sure if those are still a thing these days. NUCs are an option, as they support GPUs, although for some machine learning workloads you won't need GPUs, so most probably the more recent Pis, like the Pi 4s with eight gigs, and enough of them, should be fine.

D: I think both Alex Ellis and Lucas Käldström have very good Kubernetes and Raspberry Pi cluster tutorials, and then on top, depending on what kind of machine learning you're running, Kubeflow might make sense. However, Kubeflow is also not trivial to install, let's say; I've heard they're making it easier.

A: Yeah, I think you're right. I think with their latest release they started providing a Helm chart for deploying Kubeflow, so it should be a lot easier. Whether the workloads will run well on a Raspberry Pi I'm not entirely convinced, but that doesn't mean it's not possible, and, as was mentioned, you may not need a GPU depending on what kind of machine learning stuff is happening. And there are some more links for you in the chat. All right.
A: Our next question here is: please consider the below scenario. There is a cluster with three worker nodes, and there's one deployment with three replicas, which have access to a PVC, a persistent volume claim. Suppose, and I'm trying to translate as I read, the deployment is an nginx web server, and I have manually added a virtual host configuration to the pod on worker one, and only worker one.

D: Let's say using a persistent volume is not needed there, and most probably it wouldn't work anyway, because most persistent volumes cannot be used in a shared read mode; it depends on your storage. The easiest, and most probably the best, way to solve this would be to put the site conf in a ConfigMap and then use that in your deployment, because then you get it automatically mounted by Kubernetes into each of your pods, and you don't need to worry about any volume mounting or remounting.

E: I think the confusion is coming in here because in Docker we're used to using a volume for this, but in Kubernetes it maps to a ConfigMap, which you can mount as a volume as well. So look into the ConfigMap documentation; your configuration will be added to each container, and it will be automatically reloaded if you modify the configuration. So I don't think a persistent volume is a good fit for that; I think volumes are good for other things.
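A hedged sketch of that pattern (names and vhost contents are illustrative): the vhost file lives in a ConfigMap and is mounted into every replica, so no per-node editing is needed:

```yaml
# Illustrative: ship the nginx vhost to all replicas via a ConfigMap.
apiVersion: v1
kind: ConfigMap
metadata:
  name: nginx-vhost
data:
  domain1.conf: |
    server {
      listen 80;
      server_name domain1.com;
      root /usr/share/nginx/html;
    }
---
apiVersion: apps/v1
kind: Deployment
metadata:
  name: web
spec:
  replicas: 3
  selector:
    matchLabels: { app: web }
  template:
    metadata:
      labels: { app: web }
    spec:
      containers:
        - name: nginx
          image: nginx:1.21
          volumeMounts:
            - name: vhost
              mountPath: /etc/nginx/conf.d   # nginx includes *.conf here
      volumes:
        - name: vhost
          configMap:
            name: nginx-vhost
```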
B: Yeah, it should just be the standard. So for people that don't know, a Service object provides standard, basic load balancing, and when we say that, what we actually mean is a very simple round-robin, very kind of dumb: if you have a list of endpoints, each new request just goes to the next one, and it keeps looping. That's the simplest type of load balancing; there are other options with other ingress controllers and things like that, but the default is using iptables.

B: I think IPVS provides some other options for load balancing as well, but that's kind of the default. So once things get into the cluster, something comes into the ingress controller, and the ingress controller uses the Service object to contact the service; that's all just round-robin. A service mesh can help you do many other different patterns with that as well, which could be useful depending on your application. So, yep.

D: So just a quick follow-up: yes, Kubernetes will perform load balancing. There is that last question here, "what if three simultaneous users type domain1.com": I'm not sure what exactly is meant, but perhaps they're looking toward things like preserving sessions. So there is, I believe, an externalTrafficPolicy field that you can specify in your service specification, and, if I remember correctly, you set it to Local and that will preserve the source client IPs, so that may be related to that question, to look into.

A: Yep, there's also a sticky sessions flag on the Service, which will allow you to get routed to the same pod for future requests. All right, and the last part: if they ever shut down one of the workers, is the state still accessible via the other nodes? The short answer is yes, it should be, and we hope so. Maybe reach out with more comments on that so we can make sure we tackle it correctly.
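A hedged sketch of both knobs mentioned above (the service name is illustrative): sessionAffinity pins a client to one pod, and externalTrafficPolicy Local preserves the client source IP by only routing to pods on the node that received the traffic:

```yaml
# Illustrative Service: ClientIP affinity makes repeat requests from the
# same client land on the same pod; externalTrafficPolicy: Local keeps
# the client source IP, at the cost of skipping nodes with no local pod.
apiVersion: v1
kind: Service
metadata:
  name: web
spec:
  type: LoadBalancer
  selector:
    app: web
  ports:
    - port: 80
      targetPort: 80
  sessionAffinity: ClientIP
  sessionAffinityConfig:
    clientIP:
      timeoutSeconds: 10800   # default affinity window, 3 hours
  externalTrafficPolicy: Local
```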
C: Yeah, I wonder how you determine that the container is still running. If you're on the node and you see that the container exists, that doesn't mean it's still running. I do wonder how you delete a pod and end up with a container still existing but not running.

D: I'm wondering, maybe it's an RBAC thing, and perhaps they think it is deleted, but it hasn't been. I mean, you would get feedback saying that you have no permission, but perhaps that's it; perhaps you don't have permission to delete them. I don't know, I'm just throwing out ideas.

D: I mean, if the pod really went away completely but it's still running in Docker, it might also be an issue of an old version and a bug that you're hitting, because I would guess that if you're running Kubernetes 1.13, you're most probably also running pretty old OS and Docker versions below that, so these might just be known issues you're hitting there. Upgrading, or moving to a cluster that is still within support, might help a lot; most issues are really just things that we've been hitting in the community for a long time, and hopefully, most of the time, they have been addressed in newer versions.
A
All right, well, we didn't have a lot to work with, but I think we gave a lot of different possibilities there, so we hope that really helps. Okay, let's move on to our next question. This one is an EKS cluster, and this person is asking: they want to be able to mount an EFS volume to the pod, and they want to be able to download certificates to that pod. They've mentioned their use case is: I want to connect to an external system, and I need a certificate to do so.
C
Yeah, an init container, an init container that just pulls down the certificate. I was trying to figure out how the certificate is being used and where it's being used. If you want the certificate to be part of a ConfigMap that this pod is dependent on, then maybe you need something external to this pod to do the work.
A
Yeah, it looks like they're pulling the certificate from JFrog as a JKS certificate. I don't know if anyone's familiar with that toolchain or not; it's not something I'm familiar with, so I'm not entirely sure what they're trying to do. But yeah, it sounds to me like an init container to pull down the certificate and stick it into a volume that's shared across into the main pod would be their best bet. I'm not sure.
D
I guess, I mean, I don't have experience doing something like this with EFS, but I'm not sure you need the EFS here. I guess what I've done previously that sounds similar to this is: I've connected to an external system to download the certificate as part of my workload's startup.
D
I doubt it's serving it in JKS format, so you would then convert the certificates as part of your startup to JKS format and then import them into your Java keystore. That's what I've done for some workloads a while ago, and I didn't have to use any sort of persistent volumes or anything such as Elastic File System or EBS or anything like that.
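A rough sketch of the init-container-plus-keytool approach described above, assuming the certificate is fetched over HTTPS and the app reads a JKS truststore. The URL, image names, alias, and password here are placeholders, not details from the question:

```yaml
# Hedged sketch: an init container downloads a PEM certificate, converts it
# to JKS with keytool, and shares it with the main container via an emptyDir,
# so no EFS volume is needed. All names below are illustrative placeholders.
apiVersion: v1
kind: Pod
metadata:
  name: app-with-jks
spec:
  volumes:
    - name: certs
      emptyDir: {}        # scratch volume shared between the containers
  initContainers:
    - name: fetch-cert
      image: eclipse-temurin:17   # any image that ships curl and keytool
      command: ["sh", "-c"]
      args:
        - |
          curl -fsSL https://certs.example.com/service.pem -o /certs/service.pem
          keytool -importcert -noprompt \
            -alias external-service \
            -file /certs/service.pem \
            -keystore /certs/truststore.jks \
            -storepass changeit
      volumeMounts:
        - name: certs
          mountPath: /certs
  containers:
    - name: app
      image: my-java-app:latest   # placeholder application image
      volumeMounts:
        - name: certs
          mountPath: /etc/certs
          readOnly: true
```

The main container could then point its JVM at the generated file, for example via `-Djavax.net.ssl.trustStore=/etc/certs/truststore.jks`.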
D
It was all essentially just connecting to an external service, getting the certificate, doing a quick script to convert it to JKS, and then importing it into the certificate store. So I'm guessing if you're using JKS, you're probably already using something like keytool to import those into your truststore and keystore. That's my suggestion: I would check if you really need to complicate things with EFS.
E
This looks in general like a typical service mesh case, so maybe they also need to read up on service meshes and see how they can use them here. It's connecting to services, essentially.
D
It does sound like it, but I'm wondering if this is a third-party Java application that requires something like node-to-node encryption between its own instances, because that's exactly the scenario I've run into: it was a clustered application that had its own ZooKeeper and cluster management and all of that, and each node had to have mutual TLS, but it wasn't individual services. So it may not be a perfect fit for a service mesh; at least to me, it seemed like a service mesh wouldn't be the ideal solution for that.
A
All right, thank you both. Well, someone says in chat: can't the TLS certificates live inside of a Secret? That is another good option, but depending on the rotation velocity, if it's a short-lifetime certificate, that may be slightly more cumbersome or require an additional operator, I would imagine. So yeah, lots of options.
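For completeness, the Secret option from chat would look roughly like this. This is a minimal sketch with placeholder names, and the PEM payloads are stand-ins for real material:

```yaml
# Hedged sketch: a kubernetes.io/tls Secret mounted into the pod instead of
# using EFS. Names and PEM contents are illustrative placeholders.
apiVersion: v1
kind: Secret
metadata:
  name: external-system-tls
type: kubernetes.io/tls
stringData:
  tls.crt: |
    -----BEGIN CERTIFICATE-----
    (PEM certificate data goes here)
    -----END CERTIFICATE-----
  tls.key: |
    -----BEGIN PRIVATE KEY-----
    (PEM key data goes here)
    -----END PRIVATE KEY-----
---
apiVersion: v1
kind: Pod
metadata:
  name: app
spec:
  containers:
    - name: app
      image: my-java-app:latest   # placeholder application image
      volumeMounts:
        - name: tls
          mountPath: /etc/tls     # files appear as /etc/tls/tls.crt and tls.key
          readOnly: true
  volumes:
    - name: tls
      secret:
        secretName: external-system-tls
```

As noted above, short-lived certificates would still need something, such as an operator or cert-manager, to keep the Secret refreshed.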
C
I wasn't familiar with EFS, and I'm still not, but a two-second Google says it's more like an NFS, where it offers shared access. So I can also imagine there are race conditions where all of your pods are kicking off and starting up and looking for the certificate, and only one of them is going to have the container that primes this TLS cert. And then you're also going to have this renewal problem, where some pods are trying to read from it while others are trying to update it.
A
All right, I think we'll move on from this one. EFS is NFS managed by Amazon, but we won't talk about using NFS in production. The chat also throws out SSM as an option, and I can't remember exactly what SSM stands for, so I'm going to say it's "super secret secure management" or something. I know it's the key manager on Amazon; I can't remember exactly what it is, but it would also be another option for something like this. Does anyone know what it means? It's secrets management.
A
Systems Manager, there we go. I prefer "super secret manager"; let's go with that. Okay, let's move on. We've got time for maybe just one last question. So thank you to everyone that did submit a question; it's been good fun.
D
I think we implemented our own with some hooks on AWS, for example; on premises you're most probably doing it yourself. So yeah, I'm not sure how they're really rebooting the node or how much control there is. But if there's some kind of manual control, or automated control that you have access to, then doing the drain before that will help.
C
I also wonder how that run volume is being attached to the pod, and whether possibly it's not.
D
So it's usually an init container of the CNI that does it: the CNI DaemonSet runs an init container that puts the CNI file in that folder, and then it works, if I remember that correctly from Calico, and flannel should work similarly.
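The pattern just described, very loosely sketched. This is not the actual flannel or Calico manifest; every name here is illustrative:

```yaml
# Hedged sketch of a CNI DaemonSet whose init container drops the CNI config
# onto each host. Real plugins (Calico, flannel) do roughly this, with
# different images, extra permissions, and more config. Names are placeholders.
apiVersion: apps/v1
kind: DaemonSet
metadata:
  name: example-cni
  namespace: kube-system
spec:
  selector:
    matchLabels:
      app: example-cni
  template:
    metadata:
      labels:
        app: example-cni
    spec:
      hostNetwork: true
      initContainers:
        - name: install-cni
          image: example/cni:latest          # placeholder image
          command:
            - sh
            - -c
            - cp /opt/cni-conf/10-example.conflist /host/etc/cni/net.d/
          volumeMounts:
            - name: cni-conf-src
              mountPath: /opt/cni-conf
            - name: cni-net-dir
              mountPath: /host/etc/cni/net.d
      containers:
        - name: agent
          image: example/cni:latest          # placeholder image
      volumes:
        - name: cni-conf-src
          configMap:
            name: example-cni-config         # would hold 10-example.conflist
        - name: cni-net-dir
          hostPath:
            path: /etc/cni/net.d             # where the kubelet looks for CNI config
```

Because the config lands on the host via a hostPath mount, it survives pod restarts, and the DaemonSet re-primes it after a node reboot.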
A
All right, well, there's a link there in Kubernetes Discuss to our panelists that I'll share in the Kubernetes office hours channel. I know we've just slightly gone over by a minute, so I'm keen to get you all back to your day; I'm sure you've all got other calls. But thank you so much to everyone here for joining us today, and for bringing your expertise and knowledge and sharing that with everyone in the channel and watching on YouTube.