From YouTube: Kubernetes Office Hours 20200617 (EU Edition)
Description
Office Hours is a live stream where we answer live questions about Kubernetes from users on the YouTube channel. Office hours are a regularly scheduled meeting where people can bring topics to discuss with the greater community. They are great for answering questions, getting feedback on how you’re using Kubernetes, or to just passively learn by following along.
For more info: https://github.com/kubernetes/community/blob/master/events/office-hours.md
A
The great white north! What do you say we get started? Welcome, everybody. It is the third Wednesday of every month, and that means it's time for the Kubernetes Office Hours. If you're in the channel listening, please let us know how the audio sounds; we always like to make sure that we sound and look good. I'm Jorge Castro, I'm your host. Let's start with some introductions: let's go Chris, Mario, Pavel, Oz, Dave, and Marco. As the new person, you can go last.
B
Hopefully I won't trip over my tongue here. My name is Chris, I'm a customer engineer with Google Cloud's public sector up here in Canada. My background has been primarily running on-prem Kubernetes installations until I joined Google, so hopefully I'll be able to impart some of that knowledge on you, and I'm looking forward to helping any way I can today.
C
Hi everybody, my name is Mario Lauria. I am a senior SRE for StockX in Detroit, Michigan. We are a retail e-commerce platform. I own our Kubernetes infrastructure on EKS, as well as a lot of developer-centric things like CI/CD, Helm, and other things like that. So my niche really lies in the EKS parts: configuring and operating clusters, lifecycle management, and helping developers understand horizontal pod autoscaling and things like that.
D
And I'm Marko Chappie, the director of engineering at Vapor IO, which is a small startup that's building edge colocation data centers co-located on cell tower sites. We leverage Kubernetes throughout our infrastructure, both in the cloud as well as inside each data center for control plane services.
A
And I'm your host today, Jorge Castro. I started this program to learn about Kubernetes, and I still am; I figured we would do it as a community program together. I'm a community manager at VMware. So we're gonna go over how this works; feel free.
A
Here we go. So before we begin, let's start by introducing... oh, we did that. Here are some ground rules. This is a Kubernetes event, so the code of conduct is in effect, so please be excellent to each other; we've never really had an issue with that. This is also a judgment-free zone. Everyone had to start from somewhere, so please help out everybody. There are no dumb questions, and all skill levels are welcome to participate.
A
So,
let's,
let's
try
to
be
positive
there
when
it
comes
to
all
of
our
levels
of
expertise,
and
while
we
will
do
our
best
to
answer
your
question
as
a
panel
doesn't
have
access
to
your
cluster,
so
live
debugging
is
off
topic.
We
can't
like
really
ssh
to
your
stuff
and,
like
fix
your
control
plane
that
kind
of
thing,
but
what
we
can
do
is
try
to
help
you
get
into
the
right
mindset
or
where
to
look
to
find
your
problem,
so
you
can
at
least
unblock
whatever
thing.
A
You're
stuck
on
and
at
least
hopefully
give
you
a
direction
on
where
to
go.
So
you
can
like
progress
in
fixing
your
problem
here,
panelist
you're
encouraged
to
expand
on
your
answers
with
your
experiences
and
pro
tips.
Part
of
the
reason
we
asked
you
to
come
on
is
because
of
your
expertise
and
your
production
experience
things
like
that.
I'm
audience.
This
is
a
participatory
sport,
so
you
can
help
by
pacing
in
URL,
so
the
official
Doc's
blogs
or
anything
that
might
be
relevant
to
the
topic
at
hand.
A
I
know
that
we
have
a
lot
of
experience
out
in
the
audience
as
well.
So
please
feel
free
to
you
know,
help
out
by
just
typing
in
the
chat.
I
know
every
time,
if
someone's
asking
for
tools
or
something
there's,
always
a
new
github
link
that
someone
drops
and
we
learn
about
a
new
tool
and
things
like
that
and
then
what
I
do
at
the
end
of
the
show.
I
have
all
the
URLs
and
I
whack
them
into
the
show
notes.
A
So
we
have
some
reference
material
for
those
of
you
who
can't
come
live
feel
free
to
post
your
questions
just
directly
in
chat
just
but
question
colon
or
something
that
makes
it
obvious.
So
we
can
see
it
and
then
what
we
do
we'll
stick
that
in
our
notes
and
then
we'll
get
to
the
questions
in
the
order
that
we
received
them.
If
you
have
a
premade
question
from
discussed,
kubernetes
dot,
IO
or
Stack
Overflow,
you
can
just
link
the
pace
to
pace
the
link
to
that.
A
So
we
could
read
that,
so
you
don't
have
to
rewrite
your
question.
If
you
already
have
it,
you
can
also
help
us
by
tweeting
spreading
the
word
paying
it
forward.
Anything
that
might
help
people
who
are
using
kubernetes,
you
know
be
exposed
to
this
content
will
be
appreciated
and,
as
always,
if
you
stick
around
to
the
end,
we
do
a
raffle
where
we
give
away
a
snazzy,
kubernetes
t-shirt
which
I'm
wearing
today
it's
one
of
these.
The
way
it
works
is
we'll
pick
two.
A
If
we've
addressed
your
question
on
the
air,
all
you
have
to
do
is
ask
it.
We
will
raffle
out
two
t-shirts,
so
we're
giving
out
two
today
and
two
this
afternoon.
There
is
a
West
Coast
session
in
about
two
hours
after
this
livestream
that
will
cover
that
part
of
the
world.
So
this
is
the
EU
session.
So
with
that
how's
everyone
feeling
today,
let
me
check
out
the
notes.
Joe
is
here:
Konstantinos
is
here
awesome
Ahri's,
here's
got
a
question.
I
see
it
Vishnu.
A
thanks for your question. It looks like Demetri has a question too. Okay, let's get started. Marko, I want you to kick it off real quick, because we ask for questions and you pasted an entire paragraph. I know a lot of it is background information, but we do get questions like this, so let's just spend a minute to talk about it. So what's your question?

Oh yeah, free stuff. After I pressed Enter in the Slack I was like, you know what, I probably should have asked this elsewhere, but yeah. We run...
A
We run a bunch of edge data centers. They're unmanned units, and we have a bunch of interesting hardware that we run there that taps into all of our OT systems, like power management and all that stuff. So it's this crazy hardware platform that is basically a motherboard, some RAM, some CPU, and some disk, and we have like six to ten of those. We run Kubernetes on it, and it works really well for us.
A
Kubernetes helps to keep our 400-plus pods that do all the control plane stuff running there, but we have two pods in particular that need storage. One's a StatefulSet, it's a Postgres database, and one's a Deployment, it's a Prometheus database. The problem we have is that these hardware nodes will power down occasionally, and they have an unreliable BMC, so I can't reboot them remotely, there's no IPMI; a tech has to go and reboot them.
A
The problem we've encountered is when it comes to persistent storage, whether it's a pod or a StatefulSet: when that node goes away, Kubernetes thinks that the storage is still attached. I've read through the code, and in the comments this is expected behavior, because Kubernetes doesn't know when that node will return, so it doesn't want to remove the storage and in effect deadlock, or do terrible things like corrupting file systems.
A
So our problem is, we need these StatefulSets to be running more often than not. We don't care as much about the persistence of that storage, the validity or the safeguards there, because we know the node is powered down; there's nothing mounted there anymore. We've found a way to have Kubernetes short-circuit and remount volumes by deleting the node from our cluster. The biggest problem that we have is that even when we do that, Kubernetes will still sometimes say no, no,
A
this volume is attached somewhere else, it's already attached to a pod, I'm not gonna reallocate it, until we do a bunch of forced deletions and some other things from GitHub threads and issues, which have gotten us to the point where we can somewhat reliably, very manually, recover storage. My question to the panel, and in particular to the community, is: have people encountered this? I know cloud providers are very good about it: when a node goes off, the volumes are moved, because of that really, really good integration with those cloud providers.
A
Remapping
storage,
this
stuff
is
all
possible
and
most
saw
providers
provide
readwrite
many
where
we
have
rewrite
single
or
rewrite
one
in
ours.
What's
the
best
way
for
us
to
kind
of
move
forward
until
we
can
retrofit
our
hardware
to
be
more
reliable
with
a
reliable
out-of-band
controller,
how
can
we
make
sure
that
these
workloads
come
back
online
and
storage
attaches
appropriately
in
a
reasonable
time,
other
than
kind
of
codifying
the
hacks
that
we've
done
today
to
recover
the
cluster.
A
We are using Rook and Ceph. We've used other mediums, and we settled on Ceph because we have Ceph experience, but any storage driver that is ReadWriteOnce has the same effect, where Kubernetes won't unmount or remount the volume storage because it sees it as being mounted elsewhere. I'm not sure if it's something with the CSI driver for Ceph or with the other ones we've used.
A
But
when
we
use
something
like
NFS
and
read/write
many,
it's
not
a
problem
because
we
can
explicitly
mount
multiple
pods
to
a
single
data
stored.
So
we
haven't
found,
and
maybe
the
answer
is:
here's
a
better
storage
provider
but
other
than
like
NFS
and
a
few
others.
We
haven't
found
as
equally
performance
of
storage
compared
to
SEFs
our
bb's
to
provide
us
with
the
kind
of
we
do
a
lot
of
it's
like
a
lot
of
writes
on
these
devices.
A
D
D
Because
Amazon
and
we
basically
we
can
shut
down
the
node,
then
we
can
still
like
Amazon
EBS
actually
does
this,
and
it's
still
like
that
pod
would
actually
unmount
and
go
to
another
node
and
actually
mount
there.
So
I
think
this
is
more
like
a
problem
with
diversity.
Acai
drivers
are
yeah,
it's
a
fork
in
particular.
D
One
of
my
suggestions
is:
you
can
actually
just
run
things
on
local
modes,
for
example
for
Prometheus.
You
definitely
the
one
that
be
running
this
son,
rook
or
itself,
because
you
can
just
have
like
move
the
pope
prometheus
replicas,
each
writing
in
into
local
disk,
and
then
basically,
if
one
note
goes
down,
you
still
have
another
replica.
We
just
constantly
step
scraping
for
metrics
the
problem.
A
That would definitely eliminate that issue for us. We still have a Postgres database that's critical to our flow of data, from our devices to our Prometheus, which is still a stickler. But I suppose another thing we could do is similarly just run Postgres in a clustered configuration, removing the need for block storage directly.
A
The
controller
manager
and
storage,
but
it
removes
our
need
to
worry
about
it
for
now,
which
is
a
good
stepping
stone
for
us,
because
we
have
a
little
ways
to
go
before
we
get
better
hardware
in
these
sites.
So
the
real
problem
is
the
hardware
we're
just
yeah
attacking
it
from
a
software
perspective,
yeah
all
right,
well,
mark
I,
hope
that
gets.
You
started
in
the
meantime
feel
free
to
hang
out
on
the
panel.
In
the
background,
if
there's
there's
a
question,
you
want
to
help
answer.
I
feel
free,
I,
know
you're
babysitting
today,
alright.
A
My
question
is:
what
about
disk
I/o
and
networking
the
underlying
node
storage
and
networking
will
have
a
real
limit
to
IAP
set
cetera?
Are
there
plans
to
get
kubernetes
to
allow
positive
defined
storage
network
requirements
and
not
over
scheduled
storage
and
networking
requirements
on
nodes
which
cannot
handle
their
requests
panel.
D
So I remember, basically, Kubernetes has an annotation which allows you to limit network throughput, but it doesn't allow you to request it or that kind of thing. So if your CNI plugin supports it, you currently can limit one pod to say: please use up to 10 megabytes per second, or something like that. So that's already basically an annotation in Kubernetes. I don't have anything beyond that.
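The annotation being described here is read by CNI chains that include the bandwidth plugin; a minimal sketch (the pod name is made up), keeping in mind these are caps the scheduler ignores, not schedulable requests:

```yaml
# Traffic-shaping annotations honored by the CNI bandwidth plugin,
# if the cluster's CNI configuration includes it. These throttle the
# pod's traffic; they do not influence scheduling decisions.
apiVersion: v1
kind: Pod
metadata:
  name: throttled-pod              # hypothetical name
  annotations:
    kubernetes.io/ingress-bandwidth: 10M
    kubernetes.io/egress-bandwidth: 10M
spec:
  containers:
  - name: app
    image: nginx
```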
A
Okay, moving on. Demetri (hope I got your name right) says: good morning, I have a question about cluster utilization. I am setting resource requests and limits for our services, and I was curious how to approach these. Should I set requests based on peak load, or should I allow limits to handle those? My current request setup leaves my cluster at less than 20 percent utilization, and I was wondering how I should approach increasing my utilization. Interesting question.
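For context, the requests and limits being discussed live in the container spec; a sketch (names and numbers are illustrative, not recommendations):

```yaml
# Requests drive scheduling (and HPA percentages); limits cap usage.
# Sizing requests at peak load reserves capacity that mostly sits
# idle, which is one way a cluster ends up ~20% utilized.
apiVersion: apps/v1
kind: Deployment
metadata:
  name: example-service            # hypothetical
spec:
  replicas: 2
  selector:
    matchLabels: {app: example-service}
  template:
    metadata:
      labels: {app: example-service}
    spec:
      containers:
      - name: app
        image: example/app:1.0     # hypothetical image
        resources:
          requests:
            cpu: 100m              # typical load, not peak
            memory: 128Mi
          limits:
            cpu: 500m              # headroom for bursts
            memory: 256Mi
```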
C
Testing, can you guys hear me? Yep, awesome, cool. I would love to respond to this one; we're going through the same thing right now. Resource requests and limits are incredibly hard to really understand without actually analyzing and spending a lot of time effectively looking at your workload over time, especially from a development perspective. Asking developers to do this is a little bit hard; they really have other things,
other priorities. There are some tools out there that help with this, one of which I think we mentioned before, called Goldilocks, from the Fairwinds ops team. Really, that just provides a fancy UI on top of the Vertical Pod Autoscaler, which sits and observes and then makes recommendations for possible resource requests that you could set.
There's also the QoS class that is Guaranteed, where you actually have your request and limit exactly the same, and that basically gives your application, if it does need to inflate to those resources, a reservation that's already guaranteed. That helps with your nodes having issues as well, when you're trying to oversubscribe the allocation that you will have for your nodes. We've run into all of those sorts of issues, including working with developers to help get this right.
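The Guaranteed QoS class mentioned here falls out of setting requests equal to limits for every container; a minimal sketch:

```yaml
# A pod is assigned the Guaranteed QoS class only when every
# container sets CPU and memory limits, with requests equal to
# those limits. It is evicted last under node pressure.
apiVersion: v1
kind: Pod
metadata:
  name: guaranteed-example         # hypothetical
spec:
  containers:
  - name: app
    image: nginx
    resources:
      requests:
        cpu: 250m
        memory: 256Mi
      limits:                      # identical to requests => Guaranteed
        cpu: 250m
        memory: 256Mi
```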
C
A lot of it's just trying, and I think the big thing that I tell people is: have something set, especially a limit, especially a request, even if it's a little low. The other thing too, and this is, I believe... I know there are startup probes; I think startup resources might be a thing, or a KEP, I'm not sure. But we have issues with some of our Node apps where they start up and take over a CPU core, then come back down to a normal state within 30 seconds, and they're back to 200 millicores. Put a limit on that? Because what's the point of a limit that's 1.5 CPUs, right? That just doesn't make sense.
C
So
there
again,
this
is
not
perfect.
I
would
say,
look
into
Goldilocks
see
if
you
can
get
you
at
least
a
UX
perspective
of
some
suggested
values
and
really
look
at
your
your
critical
workloads,
the
the
bigger
workloads
on
your
cluster,
especially
daemons
that
since
April
sets
as
well.
So
we
just
had
an
issue
with
data
value
agents
that
we're
just
kept
inflating
kept
inflating,
no
limits
on
them.
E
I would just reiterate what Mario said: resource management is something I see a lot of people struggle with, because it has a lot of impact on other things, like if you're using the horizontal pod autoscaler, how do requests work with that? So it's something you do want to invest a lot of time in, really understanding resource management in Kubernetes,
just because I see a lot of people struggle with that. There are a lot of little things you kind of have to know about it, like what Mario said about the different classes, Burstable, Guaranteed, those types of things. You just really have to invest time to try to understand it as best as you can. You'll never have it perfect on day one, or even day 365, but just invest that time to try to understand it.
C
It's kind of a waste to even define it. I mean, you could still do it, definitely, and if the application still inflated and kept going and going... but really at that point you're just setting it for the sake of setting it; you're not providing much value. And it's so high that, you know, if it's at 200 millicores and it jumps to 1.5 CPUs, that's kind of taxing that node a little bit, depending on whether you set your allocatable for the node as well.
A
D
Yeah
also
like
where
I
work,
we
also
struggle
a
lot
with
resource
management
like
a
lot
of
teams.
Just
over
provisioned
resources
are
under
provision.
It's
just
a
super
complex
problem.
So
right
now
we
have
started
looking
into
a
vertical
paddle
together.
Basically,
that's
software
you
can
run
and
which
would
basically
automatically
pick
correct
resources.
It
is
ours
request
for
you,
so
yeah
I'm
right
now
investigating
this
approach.
Maybe
it
will
work.
I
I
really
hope
it
will
I
know.
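The Vertical Pod Autoscaler being investigated here is configured per workload with a `VerticalPodAutoscaler` object (the VPA components have to be installed in the cluster separately; the target name is made up):

```yaml
# VPA watches actual usage and, in "Auto" mode, evicts pods so they
# are re-created with the recommended requests. "Off" only surfaces
# recommendations, which is the mode Goldilocks builds its UI on.
apiVersion: autoscaling.k8s.io/v1
kind: VerticalPodAutoscaler
metadata:
  name: example-vpa                # hypothetical
spec:
  targetRef:
    apiVersion: apps/v1
    kind: Deployment
    name: example-service          # hypothetical workload
  updatePolicy:
    updateMode: "Off"              # recommend only; "Auto" applies them
```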
A
All right, the questions keep coming; keep on asking them. If I've missed your question, please just let me know in Slack. The next question comes from Joel Davis, who says: is there anything in RBAC that allows you to select against certain namespaces? For example, if I want to give someone the ability to create new namespaces, of which they have access to star verbs on star resources, but not interact with a given set of namespaces. Is that possible to do through RBAC?
A
Let
me
see
the
replies.
People
are
asking
c10
wants
to
point
out
that
over
chip
does
this
using
an
operator
that
creates
Auerbach
resources
on
the
creation
of
a
new
namespace,
and
it
has
a
link
to
that
which
I
will
put
in
the
thing
here
into
the
main
channel
any
option.
Any
ideas
on
this
max
guy
says
OPA
might
help
you.
A
A
A
A
Wow, we got stumped. All right, let's keep this one open here. Oh, Joel has some information. He says there's nothing built in, because you have to bind the permissions within the namespace after creating it; a webhook or operator is about the only solution. So I think, if you're not using OpenShift, is there a non-OpenShift version of an operator like this?
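The constraint Joel describes comes from RBAC being purely additive and namespace-scoped: a ClusterRole can grant namespace creation cluster-wide, but full access inside a namespace needs a RoleBinding in that namespace, and there is no "all namespaces except these" selector. That is why an operator or webhook has to stamp bindings into each new namespace. A sketch of the two pieces (names are made up):

```yaml
# Cluster-wide: allow creating namespaces.
apiVersion: rbac.authorization.k8s.io/v1
kind: ClusterRole
metadata:
  name: namespace-creator          # hypothetical
rules:
- apiGroups: [""]
  resources: ["namespaces"]
  verbs: ["create"]
---
# Per-namespace: full access, but only where this RoleBinding exists.
# An operator would create one of these in each newly made namespace,
# skipping the namespaces the user must not touch.
apiVersion: rbac.authorization.k8s.io/v1
kind: RoleBinding
metadata:
  name: team-admin                 # hypothetical
  namespace: team-a                # hypothetical namespace
roleRef:
  apiGroup: rbac.authorization.k8s.io
  kind: ClusterRole
  name: admin                      # built-in aggregated role
subjects:
- kind: User
  name: jane                       # hypothetical user
```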
So it sounds like an operator is gonna be the way to go there. Hopefully that helps you out, Joel; feel free to post follow-up questions. A lot of people are responding to each question in threads, so please keep that up. Joel Speed (welcome back) says: something that's come up at my work recently, when hosting metrics endpoints in an application: should people require authn and authz to access them? I.e., should something like kube-rbac-proxy be put in front, or should the functionality be implemented into the application?
D
So I believe right now the Prometheus folks are actually working on this. For example, node_exporter recently added, I think, basic auth or something like that for the metrics endpoint, so I believe the client libraries will also have a similar feature if they don't already. So basically, I don't think you really need proxies, because it will be supported natively. I'll try to find some good links. Sure.
B
The other thing: if it's for storage, on the PVCs, your storage provider might have a separate solution to back those up individually. For cluster state, there's Velero; I've heard nothing but amazing things about it. Or if you're following GitOps here, you have your state in Git, but for storage, yeah, that might be something to look to the provider for.
A
All
right
next
question
comes
from
Andre,
says
high
volume
question
on
the
volumes
documentation
page
we
can
read.
Kubernetes
supports
several
types
of
volumes,
generic.
What
does
it
mean?
Kubernetes
supports
particular
seems,
glossary,
FS
client
comes
with
height
width,
hypercube
kubernetes
node
also
seems
hypercube
is
going
to
be
deprecated.
How
about
kubernetes
continue
to
support?
A
Okay, so types of volumes: the usual ones, right? Like Azure Disk, CephFS, Cinder, you know, Portworx volumes. So the first question is what you mean by "support". And also, I'm gonna add my own little thing here: how does this differ from... I thought everything just supported CSI, and then you would just get that, right? Or is it just a native support thing? I think I've confused myself. Pavel, you wanna take it?
D
Yeah, so basically, one of those types of volumes is CSI. Right now Kubernetes has a bunch of integrated clients, for example AWS EBS or Azure Disks. So generally, "supports" means that the kubelet, the Kubernetes node agent, can actually connect to that type of volume and mount it to your pod, right? So yeah.
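With CSI, that kubelet-side integration moves behind a standard interface: you reference a CSI driver through a StorageClass, and the kubelet calls the driver's node plugin instead of in-tree code. A sketch using the AWS EBS CSI driver as an example (the class and claim names are made up):

```yaml
# StorageClass pointing at an out-of-tree CSI driver...
apiVersion: storage.k8s.io/v1
kind: StorageClass
metadata:
  name: ebs-csi                    # hypothetical name
provisioner: ebs.csi.aws.com       # the AWS EBS CSI driver
volumeBindingMode: WaitForFirstConsumer
---
# ...consumed by an ordinary PVC; pods reference the PVC and never
# need to know which driver sits underneath.
apiVersion: v1
kind: PersistentVolumeClaim
metadata:
  name: data                       # hypothetical
spec:
  accessModes: ["ReadWriteOnce"]
  storageClassName: ebs-csi
  resources:
    requests:
      storage: 10Gi
```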
A
Okay, so it seems the GlusterFS client kind of comes with hyperkube, the Kubernetes node image. Let's look at the link that they sent here. I'm confused; I'm totally confused here, just kind of showing my lack of knowledge in this area. I thought everything just talked through CSI, and that nothing really talked directly to the storage; I thought this was being moved out of core.
It gives more configurability. Is it safe to say that probably most current and future development is going towards the CSI side of things, and not these, yeah, what do people call these native drivers, what's the name? There's a page on the CSI site that has the supported drivers, and there is like an order of magnitude more supported drivers
A
Then,
in
like
the
traditional
flex,
volume
entry
supported
driver,
so
yeah
there's
definitely
a
lot
of
concerted
efforts.
I
think
storage,
vendors
that
want
to
be
a
part
of
kubernetes
no
CSI
is
the
way
to
put
their
driver
abstraction
into
Kate.
So
a
lot
of
development
is
going
there
if
it
hasn't
already
reached
they're
relatively
mature
state
yeah
I
was
just
I
was
just
surprised
when
I
when
I.
You
know,
when
I
read
this
I
thought
everything
had
moved
to
CSI
by
now,
but.
E
Yeah, I would say it's going to be very dependent on your storage provider and whether their CSI driver is stable or not. Just for example, in Azure we have a CSI driver, but it's maturing and getting to a stable state; when that's supported, that's when you would use CSI. So it depends a lot on your storage provider, whether you're using a cloud-hosted storage provider or a different type of storage, and whether it supports CSI.
D
Just quickly after that: some of the in-tree drivers are actually a bit more resilient than CSI drivers. At least, I had some experience with CephFS and basically FUSE types of drivers, where you just kill one driver pod on a node and then a bunch of pods actually lose connection to their data. So it really depends on the technology you're using, yeah.
C
I cannot answer that, because I'm using cloud, where they manage it for me. I would say everything, especially critical services, should have limits, and you should have monitoring in place so that you get warnings when things are getting hot, 70, 80 percent, something like that, right? Those are operational things that you should just be on top of. But they should have limits, because they can again impact other workloads, and there's a problem there as well. So, better handled sooner rather than later.
A
Anybody
else
yeah
that
seemed
pretty
straightforward,
alright
and
Ray
I
hope
that
helps
you
out.
Let
us
know
how
you
get
on
with
that.
Can
Abbott
asks
hi.
Is
there
an
official
slash
recommended
graph
on
a
template
for
monitoring
the
entire
kubernetes
ecosystem
metrics
on
API
server,
cubelets
scheduler?
What
other
tools
are
people
to
get
a
big
picture
of
it?
That's
the
I
want
a
really
cool
dashboard.
What
do
I
use
so
yeah.
A
That's awesome, sure. I didn't even know that; I didn't even know kubernetes-monitoring was like a namespace, so I've got to dig through there. That's always a good one, awesome. So let us know how that goes, and everybody, thank that repo by giving it a star when you get to it, if you use it. Next, Mahir Sha asks how to set custom metrics for HPA. Also, whoever answers this question, tell me what HPA is.
C
Yes, the horizontal pod autoscaler, another top-level API object. What it does is monitor, across your instances, the average usage of either CPU or memory, or other metrics as well. You set thresholds on it, and it measures usage against them; specifically for CPU and memory, against what you requested. So going back to those resource requests we talked about: let's say you set 100 millicores, and your application, the average of all the instances that are part of that Deployment, is up to 80 percent
of that, and you have an 80 percent threshold set on your HPA; then the HPA will start to take action. That's either scaling in or scaling out, depending: either killing instances that aren't needed, or adding instances to meet the threshold that you set. So the question here is specifically around external metrics or custom metrics. Usually what happens is there's a controller that can provide those for you.
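The CPU-threshold behavior described above, plus a custom per-pod metric served by such a controller, looks roughly like this in the autoscaling/v2 API (autoscaling/v2beta2 on clusters from the era of this recording; the workload and metric names are made up and assume a metrics adapter is exposing the custom metric):

```yaml
# HPA comparing average CPU (as a % of the container *request*) and
# a custom per-pod metric against targets, scaling the Deployment.
apiVersion: autoscaling/v2
kind: HorizontalPodAutoscaler
metadata:
  name: example-hpa                # hypothetical
spec:
  scaleTargetRef:
    apiVersion: apps/v1
    kind: Deployment
    name: example-service          # hypothetical workload
  minReplicas: 2
  maxReplicas: 10
  metrics:
  - type: Resource
    resource:
      name: cpu
      target:
        type: Utilization
        averageUtilization: 80     # % of requested CPU
  - type: Pods
    pods:
      metric:
        name: requests_per_second  # hypothetical custom metric
      target:
        type: AverageValue
        averageValue: "100"
```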
C
So, for instance, we use Datadog, and they have a cluster agent which can provide custom metrics, and that cluster agent taps into all the metrics that the agents are collecting. So basically, anything that we have that gets reported, any metric at all that's available there (whether it's out of the box or we're reporting it up through a service, etc.), we can set thresholds on and have HPA scale on that, and that makes things super, super nice.
C
We haven't done a lot of the custom metrics stuff yet; we've really honed in on CPU and memory. But again, this is gonna be a rabbit hole if you know some of your key metrics. So maybe if error rate goes up you want to add some instances, but more so latency and other sensitive metrics around your application; you can do that. You should also look at things like monitoring and alerts for those as well.
C
It's
not
a
hundred
percent,
so
also
I
also
want
to
point
out
I'll
link
it
here
in
the
channel
there's
another
project,
that's
in
the
CN
CN
CF
health
kita
and
that
actually
taps
into
other
constructs
like,
for
instance,
an
alias
s.
Qsq
size
can
be
something
that
you
you
set
a
threshold
on
and
say:
okay
well
for
over
ten
thousand
entries
and
ten
thousand
cute
entries,
then
you
know
we
need
to
scale
up
or
something
like
that,
so
that
that
makes
things
a
lot
easier
and
I
think
this
is.
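The SQS-driven scaling mentioned here is what KEDA expresses with a ScaledObject; a sketch (the queue URL and workload name are made up):

```yaml
# KEDA watches the queue and drives an HPA for the target workload;
# replicas grow as the backlog exceeds queueLength per replica.
apiVersion: keda.sh/v1alpha1
kind: ScaledObject
metadata:
  name: sqs-worker-scaler          # hypothetical
spec:
  scaleTargetRef:
    name: sqs-worker               # hypothetical Deployment
  minReplicaCount: 1
  maxReplicaCount: 20
  triggers:
  - type: aws-sqs-queue
    metadata:
      queueURL: https://sqs.us-east-1.amazonaws.com/0000/example  # hypothetical
      queueLength: "100"           # target messages per replica
      awsRegion: us-east-1
```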
A
Alright,
we
got
about
15
minutes
left
and
about
two
or
three
questions
in
the
queue
so
keep
on
asking
audience.
If
you
have
questions,
meanwhile,
we
will
get
to
the
next
question
which
Mario
answered
with
a
bunch
of
links
and
Mario
I'm
gonna
ask
you
to
take
those
links
and
toss
them
in
the
main
channel
here,
but
Jojo
Perez.
Is
there
a
tool
for
testing
kubernetes
service?
Latency
Mario
responded
with
a
bunch
of
links,
but
let
me
give
the
pant
the
other
panelists
a
chance
to
feel
this,
that
they
they
wish.
C
I'm Mario, he's Marko, I know, good one. And pasted... I can't seem to find that thread right now. Yeah, I got 'em.
C
Actually, yeah, I was gonna say: I mentioned before that we were kind of in service mesh research mode, and me and my coworker put together a Google Doc, just vomited a bunch of things from our brains into it. I had starred GitHub repos and all that, and these kind of were the frontrunners when it came to performance testing. Specifically, I
C
Think
the
big
ones
for
me
are
the
blue,
shot
the
Kate,
CNI
benchmark
and
probably
ripsaw
in
terms
of
I,
want
to
see
both
like
service
latency
latency
is
out
of
the
cluster
in
the
cluster
services
service.
Things
like
that
those
are
the
sorts
of
metrics
that
we
were
gonna,
try
to
kind
of
observe
when
it
came
to
you
know,
certainty
and
eyes,
ie,
link
or
de
or
something
used,
convoy,
etc.
So,
there's
no
like
perfect.
It
really
depends
on
your
needs
somewhere
before
declarative
than
others.
C
I
really
like
what
still
has
been
doing
as
well.
Ks
smashes
again
just
and
everyone
knows
the
chaos
stuff
that
came
urging
kind
of
on
Netflix
and
others
18
different
solutions
for
that,
so
that
that
does
that
for
a
service
mesh,
artilleries
kind
of
a
go
to
first
over
our
front-end
teams,
as
well
I,
just
from
like
very
outside
I'm
on
my
laptop.
How
is
performance
so
again
be
careful
with
these
tools:
I
don't
DDoS
yourself,
but
yet
they
they
can
they're
pretty
configurable
and
they
provide
a
lot
of
a
lot
of
rope.
C
We were still exploratory. I think we've narrowed it down, mostly to probably not doing Kuma, which was Kong's; we're probably focused more on Linkerd and Consul right now. Istio, you know, we're a small team, we're still kind of a small company, and I think it was just a little bit more complex. I know there's Aspen Mesh, which is really making all of that easier and providing support services for it as well, but I think it feels like a little bit of overhead in terms of what we really need.
A
Caroline, we'll give you some time there to give us a follow-up; feel free to just keep typing. There are some questions that I appear to have missed, so let me go back here while we let that one stew for a minute. Sivan Kenapulli (hope I got that right) says: hi, I have a question. Is there a way to specify certain pods in a Deployment to get killed when the horizontal autoscaler scales down the Deployment? Mario, it looks like you've answered this one, but I wanted to get it on the video. Sure.
C
Yeah, just really quick: the HPA references the Deployment, and everything that's part of that Deployment, every instance, is impacted. So in that case you'd have to make a separate Deployment or something like that. Deployments track pods through labels, so if you wanted to pull pods out, you could remove the labels from those pods, etc. So yeah.
A
Alright
and
another
follow
up
here:
Vishnu
Prasad
ass.
Is
there
a
project
or
tools
that
would
help
us
configure
how
and
when
to
auto
scale
nodes
up
and
down
like
in
eks,
node
groups
went
to
scale
down
the
nodes,
especially
mainly
because
certain
loads
can't
use
them
on
those
metrics
like
cpu
memory
are
all
the
time
scaling
up
and
down.
It
looks
like
you've
been
answering
all
the
questions
in
chat
before
we
get
to
them
so.
C
The
easy
ones
man
I've
lived
in
auto-scaling
yeah,
the
cluster
autoscaler,
is
great
for
that.
It's
it's
I
wouldn't
say
it's
the
the
most
stable,
perfect
production
piece
of
software,
but
it
gets
the
job
done
it
logs,
I
think
as
AI
config
map
you
can
reference
for
status
and
kind
of
it's
looking
to
make
and
I've
never
had
an
issue
with
it
talking
to
a
toes
API
to
change,
ASG
sighs,
which
is
effectively
what
it's
doing
to
scale
the
entirety
of
the
cluster.
C
So
the
automation,
where
it
kind
of
senses,
everything
through
labels
and
tags
and
whatnot
is,
is
really
good
and
the
home
chart
is
great.
We
use
it
right
now
on
all
of
our
monsters.
So
one
thing
is
multi.
A-Z
is
a
little
tricky.
You
might
be
out
of
balance
in
some
cases,
but
I
won't
get
into
that.
So
there's
docks
there.
Okay.
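For reference, the Helm chart praised here is typically configured with auto-discovery against tagged ASGs; a hedged sketch of chart values (the region and cluster name are made up):

```yaml
# values.yaml for the cluster-autoscaler Helm chart. With
# auto-discovery, the autoscaler finds ASGs tagged
#   k8s.io/cluster-autoscaler/enabled
#   k8s.io/cluster-autoscaler/<clusterName>
# and grows or shrinks them to fit pending pods.
cloudProvider: aws
awsRegion: us-east-1               # hypothetical region
autoDiscovery:
  clusterName: example-cluster     # hypothetical cluster name
extraArgs:
  balance-similar-node-groups: true  # helps with the multi-AZ case
```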
A
So,
let's,
let's
circle
back
to
Caroline's
question:
if
she,
if
they
weren't
listening,
given
a
currently
update
in
deployment
rolling
in
a
new
replica
set
and
rolling
out
the
previous
one
which
replica
set
up
odds,
are
taken
away
from
if
a
user
scalar
decreases
I
think
we
asked
her
follow-up
info
on
this
one
right,
yeah.
C
We
yester
problem
I
actually
am
looking
at
our
live
production
cluster
right
now
and
I
actually
went
to
the
real
flickers
that
view
in
my
k-9s
interface
and
I
only
see
for
any
given
deployment.
Let's
call
it
edge
platform,
it's
got
I
see
like
10
replicas
sets
here,
and
all
of
them
are
zeroed
out,
except
for
one
which
actually
is
active
and
has
the
active
number
of
instances.
So
in
that
case,
I
think
that
HPA,
let's
say
that's
editing
your
deployment.
C
The change in the number of replicas should probably be applied in just that ReplicaSet, because you're not draining that ReplicaSet entirely; you're not doing a deployment or anything like that, so there's no need for it to be killed at all. That's my understanding, so maybe there are better docs or something around that. Okay.
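The bookkeeping described above can be sketched with a toy model. This is not the Deployment controller's code, just an illustration of the behavior: each rollout creates a new ReplicaSet, old ones are kept around at zero replicas (up to the default revisionHistoryLimit of 10, which matches the roughly ten zeroed-out ReplicaSets observed in k9s), and a replica change from a user or the HPA only touches the active one.

```python
# Toy model of how a Deployment manages its ReplicaSets.
REVISION_HISTORY_LIMIT = 10  # Kubernetes default

class Deployment:
    def __init__(self, replicas):
        self.replica_sets = []  # oldest first; the last one is active
        self.rollout(replicas)

    def rollout(self, replicas):
        for rs in self.replica_sets:
            rs["replicas"] = 0           # old ReplicaSets are scaled to zero
        self.replica_sets.append({"replicas": replicas})
        # prune zeroed-out ReplicaSets beyond the history limit
        while len(self.replica_sets) - 1 > REVISION_HISTORY_LIMIT:
            self.replica_sets.pop(0)

    def scale(self, replicas):
        # a user or HPA scaling decision only touches the active ReplicaSet
        self.replica_sets[-1]["replicas"] = replicas

dep = Deployment(replicas=3)
for _ in range(20):          # twenty rollouts over the deployment's life
    dep.rollout(3)
dep.scale(5)                 # e.g. an HPA decision
print(len(dep.replica_sets))                             # 11: 10 old + 1 active
print([rs["replicas"] for rs in dep.replica_sets[:-1]])  # all zero
print(dep.replica_sets[-1]["replicas"])                  # 5
```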
A
Awesome, and with that we've reached the end of the queue of questions. So if you have questions, feel free to ask them. We're gonna get to Long's question next, which is the last one, so we probably have time for one or two more, so get them in. metalmatze, welcome. He says he just saw that monitoring mixins are being discussed here; there's even a Slack channel dedicated to them on the Kubernetes Slack. So thanks for that link, that's always useful. Long?
C
I was just going to say, there's a Helm chart for the metrics-server, which is also what I would base it on. To close out the question: I'm not sure what they do there or what the defaults are. It's always worked for us, even from 1.12 through 1.15, where we are right now, so I'm guessing it does. I would look at the templates in there and see if that's an option they pass by default. That would be my two cents.
C
So in our case, we have a Datadog cluster agent that provides an external metrics object, I guess, that has metrics as part of it. So, okay, we can leverage those in the HPA and say: HPA, scale on the number of requests going to each instance; we want it to be 100, so balance it out, kind of thing.
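The requests-per-instance target described above maps onto the scaling rule documented for the HorizontalPodAutoscaler: desiredReplicas = ceil(currentReplicas * currentMetricValue / targetMetricValue). A minimal sketch of that arithmetic:

```python
import math

def hpa_desired_replicas(current_replicas: int,
                         current_metric: float,
                         target_metric: float) -> int:
    """The HPA scaling rule from the Kubernetes docs:
    desired = ceil(current * currentMetric / targetMetric)."""
    return math.ceil(current_replicas * current_metric / target_metric)

# e.g. 4 instances each seeing 150 requests against a target of 100
print(hpa_desired_replicas(4, 150, 100))  # 6
# once each instance sees ~100 requests, the count is stable
print(hpa_desired_replicas(6, 100, 100))  # 6
```

The metric source (Datadog cluster agent, Prometheus adapter, metrics-server) only changes where currentMetricValue comes from; the replica math is the same.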
C
What is the go-to for Prometheus right now, the standard, kind of ultimate deployment? I know in the Helm official charts there's a Prometheus Operator chart; I think that seems to include pretty much everything that you would need, including the node exporter, Alertmanager, kube-state-metrics, Grafana, etc. Is that kind of still the go-to, or is anyone using anything else?
A
We
use-
oh,
my
goodness.
We
used
two
Prometheus
operated,
mom
charts
that
we
also
wrote
a
secondary
helm
chart
that
we
use
internally
that
standardizes
our
Prometheus
CRD
definition,
so
that
we
have
a
pretty
consistent
setup
of
what
we
do
for
Prometheus,
whether
it's
in
our
cloud
or
hedge
sites,
but.
A
So they say kube-prometheus is the standard, and the Helm chart is based on that as well. Yeah, it looks like metalmatze is a maintainer and works on Prometheus, so yes, confirmation from a maintainer gives it extra weight. He says: yeah, we're actively maintaining that repository, lots of things going on daily. So thanks very much for that; that helps.
A
I think the Prometheus Operator should be a good one. That's what I was writing down; that's why I wasn't paying attention. I was like, we need to have a session on this, so good to know. All right, so we are really running out of time. I really appreciate everyone for showing up, especially those of you sharing your recommendations in chat. I'm gonna give away two t-shirts here today. Here's what's gonna happen: I'm going to tell you that you won a t-shirt, and then I will PM you with the store details.
A
You
can
always
get
all
the
goodies
from
the
store
from
the
CN
CF
big
shout
out
to
Google
stock
X,
Microsoft,
VMware
and
Parv.
Where
do
you
work
at
again,
eww
eww
and
vapor
dot
IO
for
letting
their
engineers
sit
on
this
panel.
If
you're
interested
in
sitting
on
this
panel,
we're
always
looking
for
volunteers,
so
we
will
have
to
have
like
a
rotating
set
of
people,
so
we
can
cover
different
levels
of
expertise
and
different
areas
of
the
project
and
with
that
there
is
a
Prometheus
ecosystem
every
call
every
month.
A
If
someone
could
drop
a
link
to
that
in
chat.
I
will
make
sure
that
gets
to
the
show
notes.
The
winners
are
Vishnu
Prasad
you've
won
a
Kira,
Nerys
t-shirt
and
a
Sivan
Ken
pulley.
Thank
you
for
your
questions.
I
will
follow
up
with
you.
We
are
gonna,
go
live
in
another
two
hours.
Geoffrey
Sica
will
be
grabbing
a
bunch
of
West
Coasters
and
we
will
go
live
again
in
this
channel.
So
if
you're
listening-
and
we
do
these
a
third
Wednesday
of
every
month-
it's
always
a
third
Wednesday.
A
We
try
to
have
as
many
sessions
as
possible
if
you're,
interesting
and
helping
out
yeah,
just
let
us
know,
feel
free
to
hang
out
in
the
channel.
We
like
to
keep
it's
like
a
nice.
Much
smaller
group
than
trying
to
you
know,
get
help
in
a
channel
with
a
hundred
thousand
people
panel.
Any
last
thoughts
you.