From YouTube: Kubernetes UG VMware 20220203
Description
February 3, 2022 meeting of the Kubernetes VMware User Group with discussion of recent updates and patches, resource limit declaration and admission controllers, monitoring of resource usage.
A
On the agenda today, I added coverage of recent updates to software, which includes both the VMware infrastructure products like vSphere, as well as the open source Kubernetes pieces like the cloud provider and the CSI storage plugin.
A
We had an informal chat before the meeting started, and I think Miles will go on the agenda in a bit and talk about possibly amending our recurring meeting time for maybe better convenience to the people who tend to attend. I also noticed recently that there were a number of conversations scattered about Slack, not all in the user group channel; some of them were in the cloud provider channel, and I think some might have been in the Tanzu Community Edition one, but all related to mapping Kubernetes zone features to running on things like vSphere clusters.
A
Okay, I assume you can see the agenda notes document. Okay, so recent updates: both ESXi and vCenter were recently updated.
A
Vmware
did
conclude
an
investigation
and
found
that
at
least
a
couple
aspects
of
this
weren't
exploitable
in
vmware
products,
but
the
authoritative
links
are
there
and
there
are
a
lot
of
vmware
products,
but
this
one
is
pretty
much
just
talking
about
the
infrastructure
related
to
kubernetes.
Those
links
go
the
full
span.
So
go
look
over
there.
If
you're
interested
the
csi
storage
plug-in
was
updated.
In
mid-january,
there
were
features
added
for
those
still
ringing
on
vsphere,
6.7
and.
A
Go look at that link to go get it and read about it. I believe there were also features related to supporting upgrades from the old in-tree storage plug-in to the CSI version.
A
There were bug fixes if you're on CSI 2.3. For the cloud provider, the Helm chart version one was released, and there are docs over there on how to use it. The cloud provider itself had an update related to being able to configure IP subnets a little better. I know the person who asked for this feature in particular was using kube-vip and had asked for some enhancements related to being able to use it a little more flexibly, so that might be of interest to people. The alpha of version 1.23 was released in mid-January.
A
This, principally, would be useful if you want to move up to the newer version of Kubernetes. At the cloud provider meeting yesterday, the devs announced that they expect this release by February 11. As usual, that's just an estimate; it could come in a little earlier, it could be later because things might happen, but I just wanted to let you know what's going on. Then, Miles, if you want to bring up the subject of a new meeting time, you can go for it. I'll stop my screen share.
B
Sure. So whenever we originally formed this group, the Kubernetes VMware user group, maybe a year and a half ago or so now, maybe a year, I don't remember, we had a poll on what time slot to hold it in. At the time, a lot of the folks that were coming to the user group regularly were based on the West Coast in the US, and the vote came out to be, I think, 11 a.m. Pacific. Right, Steve? That's correct!
B
Okay! So in the last while it appears to be more and more predominantly EMEA-based folks, so maybe it makes sense for us to have another poll with some alternate time slots and see how that comes out, just for the folks that are regular attenders that are based outside the US. If it doesn't change, that's fine, but I figure it's worth having another poll, because it's been some time now since we asked.
A
That sounds good to me; I'd second that. Maybe, it suggests, you'd probably better add a few more selections, but we might as well ask the people on the call now for their suggested times. I don't think you want it free-form, so maybe just have an assortment out there.
B
Yeah, yeah. I mean, do people prefer, like, during work hours, or like morning time, early afternoon, or is after work usually the best?
D
B
A little earlier than 7 p.m., okay. But yeah, I'm not precious about it. If it works for everyone else, that's cool; I'm okay with it.
A
Okay then, let me see if anybody tacked on anything else in the agenda. No, that was it. We do have an opportunity in this user group to submit a maintainer track talk for KubeCon, and that will be due in around a week or so.
A
If
people
have
them
on
things,
they
might
like
to
see
at
the
kubecon
event,
we
could
maybe
have
the
event
system
and
add
a
second
thing
or
just
go
with
that,
but
just
putting
it
on
the
table
for
discussion.
A
Okay, I'll just mark that down as no objections and no other ideas, so we'll probably go forward with that. And I've talked to Michael, and he seemed amenable to attending physically to give a presentation on that subject, so that sounded pretty good to me too. For those who weren't aware, the conference is slated to be physical, in Spain, in mid-May.
A
No, it was Valencia.
C
That would be nice; it might make discussions here a bit easier. I'm sorry, I don't know. I'd kind of like to go, but things are still a bit wacky around COVID. And yeah, same here, and it's hard to justify going there at the moment within ITQ.
A
It seems to change on a week-by-week basis where I live. You could even alter your view of the world by news source and get a completely different impression of what's going on in your own city.
D
A
And then the other thing, which I didn't put in the agenda but, like I said, I don't know if the rest of you noticed it: there have been a fair number of questions recently, some of them maybe by newbies, but people trying to run Kubernetes on top of vSphere, taking into account clusters and even moving things around.
A
I mean, some of these people have been enabled by recent open source releases. Looking at the descriptions of the questions they're asking, I'm almost thinking they're more on home-lab-like environments rather than production-grade commercial ones. But still, we'd like to support all of these, because, particularly when you go out to edge locations, the fact is, even in enterprises, sometimes these edge locations end up looking closer to a home lab than a big data center.
A
So I think some of these issues are pretty generic to a lot of people, and I gathered that some of these people even had situations where they have multiple ESXi hosts but don't even potentially have shared storage readily available to them. My own attitude is, as long as you're not running persistent workloads on Kubernetes, it probably is just fine to not have any shared storage.
A
If you need to support a persistent workload, like a database or something, there might be hope for hosting that on Kubernetes if you use the right type of storage solution and maybe had three available physical nodes. But going down to two, I'm not so sure that you're not better off just running those persistent workloads in a VM, with vSphere features to attempt to achieve availability, rather than adding a Kubernetes layer on top of that mix.
A
B
Yeah, when it comes to edge, it gets really hairy, especially whenever you're talking three hosts or less, because you have no room for any maintenance whatsoever. So this is actually what Robert and Scott and I were talking about: you know, maybe doing some work on, or doing some kind of presentation on at some point, to do with storage and Kubernetes, how they work or, more critically, don't work together, why not to stretch storage, and yeah.
B
Basically, just why not to do things rather than why to do things. But yeah, whenever it comes to K8s on vSphere: the simpler, the better.
C
I think we could do a whole session just on storage, easily.
B
I have done lots of them.
C
So it would maybe be nice for, you know, Steve and anyone who's watching, probably me, Scott, Miles, and some other guy from ITQ.
C
That's also thinking about some of these problems. We're going to do a Tanzu Tuesday session where we continue that conversation about how to, yeah, kind of make these two worlds match up when it comes to resource and storage use. Thinking of resource use in particular, it's funny, because I speak to a lot of guys from infra backgrounds, and they're always concerned, and it's not Kubernetes related per se. It's that whenever other people are running high-workload, high-resource-use things in their ESX clusters...
C
They
get
a
bit
worried.
You
know
they
don't
like
the
black
boxiness
of
a
whole
bunch
of
vms,
pulling
a
huge
amount
of
resource
x
and
there's
this
kind
of
this
strange
contention
around
responsibility
or
a
feeling
of
responsibility.
C
I've yet to meet a vSphere admin, or a team responsible for vSphere, that was happy leaving an ESX cluster alone, just letting it go and saying: well, apparently today it's running at 90% RAM; we don't care, we're not responsible, whatever's running there is the problem of the people running it. Even if there are very well delineated agreements about who's responsible for what.
C
Because these VI admins have, for 15 years, been trained to not let that happen, yeah, and to take responsibility and to make sure that vSphere stays healthy, ESX stays healthy, all the stuff running on it stays healthy. And, of course, partly they have to, because if the cluster is going to go to 100% CPU, vSAN is going to fall over, and then all these things are going to fall over. So you...
D
...scheduler-wise, you just don't ever go over 2:1 in your ratio of vCPU to CPU and RAM to RAM, and then you're usually fine in those cases. But that's also a waste of resources, right? You don't...
C
D
There was one put up on OpenShift that mentioned this, and there was one on PKS, in 1.0 of PKS, that hasn't been updated since 1.0, about best practices of running on vSphere, and then just from the overall community, talking with people and seeing what organizations are doing and things like that. But there's nothing official out there that says this, because no one wants to come out and say: yeah, get half the resource utilization out of your environment.
D
Only
do
two
to
one
ratio,
doesn't
sound
good,
but
when
it's
written
down,
when
you
explain
it,
it
makes
sense
in
terms
of
how
the
scheduler
works
and
that
you
actually
have
over
provisioning
in
kubernetes
with
requests
and
limits.
If
you
set
them
correctly
and
things
like
that,
but
it's
the
double
level
of
over
provisioning,
that's
always
wrong.
I.
A
D
For sure. No, but it's the same issue that we had with datastores that are thin provisioned, where you thin provision your disks and your NetApp is thin provisioning volumes. It's like: great job, you have no idea how much is actually used, or how much is allocated, or what's going to happen tomorrow.
D
It's
this
double
level
of
management
has
always
been
an
issue,
and
it's
been
known
that
it's
an
issue
in
storage
and
there's
best
practices
kubernetes,
it's
just
not
official
that
there
are
best
practices
on
this,
but
it's
the
exact
same
issue.
Here,
it's
compute
there
is
storage,
but
it's
the
same
idea
over
provision
at
one
level.
B
I
think
it's
there
seems
to
be
an
element
of
have
your
cake
and
eat
it
too,
from
people
that
manage
a
vi
infrastructure
that
happens
to
have
kates
on
it
as
well,
though,
which
is
they
want
to
get
the
absolute
maximum
efficiency
out
of
the
platforms?
That
means
when
stuff
is
run
in
steady
state.
B
You
know
they're
running
80,
cpu
utilization
ram
utilization,
it
all
comes
along
nicely,
but
you've
got
absolutely
no
room
for
burst
or
you
know
workloads
rolling
out
like
if
you
suddenly,
you
know
nuka
cluster,
and
then
you
run
argo
cd
and
you
set
up
the
entire
stack
all
over
again.
It's
gonna
have
a
bad
time
doing
that
so
yeah
there's
there's.
Definitely
an
element
of
people
want
to
be
too
heavy
on
the
resource
consolidation.
Just
because
of
that's
how
you've
always
done
it,
and
it
was
really
easy
because
it
was
steady
state
before.
A
Yeah,
the
other
thing
that
could
put
you
over
the
edge
is,
of
course,
a
failure,
and
it
could
even
be
a
software
failure,
not
necessarily
hardware
but
I've.
I've
heard
horror
stories
of
people
running
cluster,
aware
persistent
apps
like
cassandra
that
go
across
clusters
and
you
can
host
those
on
kubernetes,
but
if
they
ever
trigger
themselves
to
think
they
need
a
rebuild
or
a
node
replacement.
A
Those
things
can
get
pretty
ugly,
often
evidence
this
hasn't
come
up
yet
but
store.
Networking
is
another
aspect
that
traditionally
in
on
the
vsphere
platform,
you
could
hear
mark
and
carve
out
resource
allocations
in
your
you
know
allocating
the
physical
network
bandwidth
to
particular
workloads
or
particular
designated
traffic,
like
storage
versus
whatever
compute
you're
hosting,
and
it
was
always
views
as
pretty
wise
not
to
have
a
big
burst
in
compute.
A
Take
out
your
underlying
storage,
because
that
could
lead
to
a
positive
feedback
loop,
and
I
think
that
layering
kubernetes
with
these
cluster
aware
persistent
apps
can
do
that
exact
same
thing.
Some
of
those
things
on
a
rebuild
can
really
get
ugly
like
demanding
1020x
the
normal
baseline
amount
of
network
connectivity.
C
I
also
think
there's
something
there's
something
new
here,
because
I
mean
the
the
fact
that
whatever
vms
are
doing,
you
know,
kill
your
clusters.
We've
known
this
for
a
while.
The
thing
is
the
chance
of
many
vms
all
doing
something.
At
the
same
time,
there
are
only
you
know,
a
bunch
of
scenarios
where
that
can
happen
either
you're
already
running
some
kind
of
distributed
workload.
C
Or
you're
having
like
some
kind
of
reboot
storm
or
you
know
the
the
virus
scanner
you
know
go
went
crazy
on
on.
You
know,
like
you
know,
a
thousand
horizon
vms
at
the
same
time
or
or
you
get
a
you
know,
some
kind
of
failure
which
causes
a
massive
hay
storm
we've
seen
that
before
so
or
drs
storms,
you
know
all
kinds
of
stuff,
but
the
kubernetes
I
mean
I've,
never
seen
I
mean
the
kinetic
is.
C
Is
this
this
amazing
control
plane
to
run
any
kind
of
distributed
architecture
on
top
of
and
the
chance
of
getting
these
kind
of
effects
are
so
much
larger
with
this
just
during
normal
operations,
because
the
whole
thing
is
a
distributed
model
and
everything
you
do
with
it.
Has
that
aspect
to
it.
B
If,
if
it
were
me
that
was
running
kubernetes
in
prod,
I
would
just
turn
them
off
on
vsphere
and
try
and
make
the
vsphere
layer
simply
run.
My
vms
leave
them
alone.
Let
them
sit
in
place
and
like
kubernetes,
do
everything
else,
because
you
you're
going
to
have
two
control
planes
fighting
with
each
other,
the
entire
time,
otherwise,.
C
Yeah, it's a conclusion I've reached very slowly, you know, over the last two years. I'm still quite new to this stuff, and this is just with TKGI, but it's the same: you reach the same conclusion, just start turning stuff off, because you don't need it; it gets in the way. But there's another aspect to this, and I saw a tweet come around, I think earlier this week, which was very interesting, which I think I retweeted. It was...
C
You
can
simplify
the
infrastructure
layer
and
then
do
kubernetes
bunch
of
clusters
and
you
leave
the
intelligence
to
that
control,
plane
and
hopefully
to
the
developers
deploying
apps
if
they
are
intelligently
deploying
apps
and
building
apps,
and
there
was
this
great
tweet.
It
was
like
the
top
five
things
that
most
kubernetes
developers
are
still
not
doing
properly
like
basic
stuff.
Give
you
know,
like
a
part,
you
know
a
limit
live
in
this
channel.
It's
resources,
yeah
start
basic
things
like
this.
Don't.
A
D
A
Yeah, well, I'm active in the Kubernetes IoT Edge working group, and I have to say that my observation there is that for people who grow up in a massively elastic public cloud, it's almost like the platform isn't actually going to coach you on putting these constraints there. It just assumes you sprawl out and gobble down a bunch of extra VMs and run up your bill.
A
People
who
evolve
using
kubernetes
in
that
environment
and
then
in
later
years,
trying
to
move
to
edge
with
having
gotten
away
with
never
declaring
any
resource
allocations
or
limits
are
in
for
a
pretty
bad
year
of
learning
the
hard
way
why
that
stuff
is
important,
and
I
think
that
that
yeah,
some
of
these
same
lessons
apply,
I
think,
when
you
potentially
move
to
vsphere
and
you're
paying
for
your
own
hosting
and
kind
of
have
a
finite
amount
of
capital.
A
You
you
want
to
invest
in
your
compute
resource
and
not
have
it
be
open-ended.
I
mean
once
you
go
on-prem,
it
isn't
a
public
cloud
where
it's
just
a
matter
of
getting
worrying
about
it.
When
the
bill
comes
a
month
later,
you
can
just
always
be
guaranteed.
You
get
what
you
want
more
or
less
instantly
and
yeah.
B
D
C
D
"What's wrong with the system?" I said: nothing, watch. And I bring up an nginx box, I set requests and limits, and everything is great. They say: well, why is that? I said: well, you're not setting requests and limits. They don't know there's a validating webhook in the back end that's doing this; all they see is their pod, and it's getting 10 megabytes and one millicore.
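For reference, Kubernetes can produce exactly this kind of tiny-default behavior without a custom webhook, via a LimitRange. This is just an illustrative sketch; the name and namespace are made up, and the one-millicore / 10 Mi values simply mirror the anecdote above:

```yaml
apiVersion: v1
kind: LimitRange
metadata:
  name: tiny-defaults        # hypothetical name
  namespace: team-sandbox    # hypothetical namespace
spec:
  limits:
    - type: Container
      # applied to any container that declares no requests of its own
      defaultRequest:
        cpu: 1m
        memory: 10Mi
      # applied to any container that declares no limits of its own
      default:
        cpu: 1m
        memory: 10Mi
```

With this in place, a pod created without requests and limits gets the tiny values injected at admission time, which is exactly the "why is my pod only getting 10 megabytes and one millicore" experience described.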
A

D
So if someone were to have a specific workload that needs more than what a normal workload should need, they talk to the platform team, who would add that into the exclusion list, allowing them a certain amount. But otherwise, for a standard workload: you want self-service, deploy to the cluster, awesome; you're limited to whatever is decided in that organization, based off the types of apps they need.
D
It's the easiest policy manager for Kubernetes by far, because it's all YAML-based, so you don't need to know a programming language. And the real benefit is that it has both the validating webhooks, like OPA (OPA just added mutating webhooks, so it can mutate requests as well), but Kyverno also has what's called generate policies, that allow, any time, let's say, a namespace is created, automatically creating these other objects.
D
That's what I use, and they actually have a way where you can generate an object, or you can create a clone of an object from another namespace, and it will keep them synced. So that's what I do for image pull secrets. I have an image pull secret in kube-system; for any new namespace that gets created, it creates it in that namespace and then automatically syncs it. So if I need to change that password, I just change it in the kube-system namespace and it gets replicated within two minutes to all other namespaces.
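The clone-and-sync behavior described above can be expressed as a Kyverno generate policy. This is a sketch based on the Kyverno ClusterPolicy schema of that era; the secret name `regcred` is a stand-in for whatever the image pull secret is actually called:

```yaml
apiVersion: kyverno.io/v1
kind: ClusterPolicy
metadata:
  name: sync-image-pull-secret
spec:
  rules:
    - name: clone-pull-secret-to-new-namespaces
      match:
        any:
          - resources:
              kinds:
                - Namespace
      generate:
        apiVersion: v1
        kind: Secret
        name: regcred                                  # stand-in secret name
        namespace: "{{request.object.metadata.name}}"  # the newly created namespace
        synchronize: true   # keep the copy in sync with the source secret
        clone:
          namespace: kube-system
          name: regcred
```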
B

D

B
So, essentially, could you total up the number of resources from each of the nodes in the cluster, and then have an admission controller and an admission webhook that then says: as long as the resource that's being requested by this pod is less than what's left in the cluster, admit the workload?
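The check being proposed here is easy to state as code. Below is a minimal sketch of the decision logic only, not a webhook server; in the real thing the node allocatable figures and existing requests would come from the Kubernetes API. Units are millicores and bytes, and all names are made up:

```python
def admit(pod_request, node_allocatable, existing_requests):
    """Admit a pod only if its requested CPU and memory still fit in the
    cluster's total allocatable resources minus what is already requested."""
    def total(items, key):
        return sum(item.get(key, 0) for item in items)

    for key in ("cpu_m", "mem_bytes"):  # millicores, bytes
        remaining = total(node_allocatable, key) - total(existing_requests, key)
        if pod_request.get(key, 0) > remaining:
            return False
    return True

# Two 4-core / 8 GiB nodes, with 6 cores and 12 GiB already requested:
nodes = [{"cpu_m": 4000, "mem_bytes": 8 * 2**30}] * 2
used = [{"cpu_m": 6000, "mem_bytes": 12 * 2**30}]
print(admit({"cpu_m": 1000, "mem_bytes": 2**30}, nodes, used))  # True: fits
print(admit({"cpu_m": 3000, "mem_bytes": 2**30}, nodes, used))  # False: only 2000m left
```

As the discussion goes on to note, a check like this is inherently non-deterministic from the workload's point of view, since the answer depends on whatever else was already running.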
D
You should be able to do that with JMESPath in Kyverno as well, because it has API access to any Kubernetes object that you can pull in, and then use that and save it in variables and use it in next steps and things like that. So you could do that, would be my guess. I've never done anything like that, but you probably could.
A
B
Well, no, I mean, if it's an admission webhook, then I guess you could have it fail, or you could get it to default, like what Scott was saying, to some ludicrously low value, so it admits it, but then it won't start up, and then they'll have to go debug why it won't start up. So either of those would be okay, as far as it...
A
Just
strikes
me
that
it's
a
situation
where
it's
based
on
things
not
having
to
do
with
this
particular
workload.
You
know
what
was
already
there
before
I
tried
to
run
so
it
wouldn't
be
repeatable
or
deterministic
and
you'd,
without
leaving
some
breadcrumbs
to
explain
what
this
happened
and
when
somebody
might
be
clueless
as
to
the
behavior
going
on.
D
Right
so
what
the
validating
web
hook
does
is
you
can
set
it
into
warn
mode
or
fail
mode,
and
then
it
will
in
either
case
even
in
warn
mode.
It
will
print
out
to
the
console
of
the
user.
What
the
error
that
you're
throwing
in
the
policy
is,
so
this
policy
failed
pod
created
if
it
was
in
worn
if
it
wasn't
and
it
was
in
failed
mode,
it
would
just
throw
that
error
and
wouldn't
create
the
object.
So
that's
the
benefit
of
doing
it
through
a
in
mission
controller.
D
Doing
it
through
mutating
can
be
very
beneficial
in
terms
of
the
user
experience,
but
it's
less
visible
to
the
developer,
what's
actually
happening.
So
it's
kind
of
a
way
of
don't
make
your
developer
do
things
because
you
just
mutate
it
with
sane
defaults.
D
On
the
other
hand,
you
know
it
is
kind
of
behind
the
scenes
and
they
don't
know
the
magic
that's
happening,
but
that's
actually
one
of
the
someone
who
used
to
come
a
lot
to
these
meetings.
Chip
zoller
from
dell
he's
actually
one
of
the
maintainers
of
kaiverno
and
has
been
doing
amazing
work
on
all
of
the
policies
out
there
there's,
I
think,
over
a
hundred
policies.
Example
policies
on
their
website
for
everything
and
like
really
good
use
cases
as
well
and
they're
active.
A
You
know
I'm
just
brainstorming
here.
This
isn't
really
an
admission
controller,
but
my
thoughts
are
that
looking
ahead,
people
might
have
a
tendency
to
exaggerate
their
resource
demands
and
as
a
policy,
you
might
want
to
police
that
and
it's
a
sort
of
a
situation
where
you
have
to
let
them
run,
but
leave
a
reminder
that
hey
in
five
minutes
or
an
hour,
I'm
gonna
check
on
this
and
just
compare
kind
of
what
they
declared.
They
were
going
to
use
to
what
the
historical
actual
usage
is
to
go
catch
liars
and
you
might.
A
This
might
be
a
great
basis
for
something
that
I
think
would
be
really
popular.
You
know
the
whole
green
movement
is
popular
of
eliminating
waste
and
if
you
were
to
couch
this
as
something
that
would
support
a
greener
planet
by
re,
reducing
wasted
resource
like
over
provisioning,
compute
and
power
usage
to
more
match,
you
know
I
don't
know
a
target
80
loading
of
what
you
spent
money
on
this
could
catch
on,
and
I'm
wondering
what
tools
might
be
available
to
build.
Something
like
that
out
by
putting
together
existing
projects.
B
You
could
do
that
with
prometheus.
I
would
imagine
so
not
the
modification
part,
but
you
could
pull
back
the
metrics
for
a
given
pod
say
over.
I
don't
know
a
week
and
then
just
say
we'll
take
the
90,
90th,
percentile
or
95th
percentile
and
we'll
just
modify
all
the
objects
to
be
95th
percentile
for
their
resource
quotas.
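In PromQL terms this would be something along the lines of `quantile_over_time(0.95, container_memory_working_set_bytes[7d])`; the percentile arithmetic itself is simple. A small sketch, with made-up sample data standing in for a week of scrapes:

```python
import math

def p95(samples):
    """95th percentile of usage samples (nearest-rank method)."""
    ordered = sorted(samples)
    rank = math.ceil(0.95 * len(ordered))  # 1-indexed rank
    return ordered[rank - 1]

# Hypothetical CPU usage samples for one pod, in millicores; a real week of
# 15-second scrapes would be roughly 40,000 points.
cpu_millicores = [100, 120, 110, 90, 400, 115, 105, 95, 130, 125]
print(f"suggested CPU request: {p95(cpu_millicores)}m")
```

Note that with a short series like this, a single burst (the 400 here) lands at the 95th percentile, which is arguably the point: a request sized this way covers almost all observed behavior.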
B

A

B
D
The one issue is, it's the same issue that we have with tools that try to monitor the traffic of an application and then generate network policies off of it, things like what vRNI, vRealize Network Insight, can do on VMs. It doesn't work well in Kubernetes, from my experience, because the average lifetime of a container in Kubernetes is so low, and the tag changes because they made a small change to the code. Anything like this really needs a learning period; it needs to do inference, and it can't do that.
D
It can't infer anything in the amount of time, because by the time it's collected enough data, it's irrelevant: they're already four versions ahead in the container in terms of production. So it works well, possibly, at the beginning, or for people that are just starting with Kubernetes and that may still be on a release cycle of two, three times a year, but not for the people that are really moving into the cloud native world, in organizations that are doing that and doing rapid iterations on their deployments.
B
D
...just did this, and it lowered their memory usage by, like, 20%, just by updating the Go version, right, because there was actually something wrong in the Go version they were using. But any of these things are so specific that you could have a version that also jumps up, or goes down all of a sudden, because they brought in some new library.
D
It
just
becomes
very
difficult.
It's
not
that
it's
impossible.
It's
just
the
numbers
you
get
here
are
not
nearly
as
accurate
as
things
like.
Vr
ops
is
or
vr,
and
I
are
for,
like
compute
and
networking
in
the
traditional
monolithic
world.
It's
just
unfortunate,
it's
unfortunate,
but
it's
a
fact
of
how
you
know.
We've
kind
of
moved
along
in
this
world
where
things
are
just
so
short-lived.
A
D
It would be great if you could catch even these things; you'd need a really good AI model, yeah. If you had a good AI model that Miles built, that would basically correlate things: if it understood the versions of a deployment, and then understood what the standard deviation is between each of those versions, going up and down between versions of the container being used, it's possible that it could predict what the general number of versions you create is, and what the standard deviation changes are between each of those versions, and then build accordingly.
B
Problem
like
say,
you
run
the
python
app
and
someone
just
does
traditional
python
library
management,
import
tensorflow.
All
of
it
just
give
me
everything,
and
you
know
suddenly
it's
five
six
hundred
meg
heavier
in
ram
usage
for
that
one
little
container
I
think
it'll
catch
it
would
catch
outliers,
which
might
be
useful.
You
know
if
there's
anomalies,
but
it's
definitely.
A
Yeah,
I
think
you
know
it's
not
just
things
like
your
example
of
drawing
in
too
much
through
the
python
tensorflow,
but
man
if
you
could
flag
things
like
container
images
that
came
off
docker
hub
that
got
corrupted.
You
know,
there's
been
a
sad
history
of
bitcoin
miners
managed
managing
to
slip
things
into
databases
and
things,
and
if
you
have
this
kind
of
monitoring
going
on,
you
should
be
able
to
catch
those
anomalies
where
yeah
the
version
went
up
but
gee.
A
It's
sure
peculiar
that
when
the
version
bump
happened,
it's
burning
twice
as
much
resource
for
for
some
reason,
and
maybe
you
don't
shut
it
down.
But
if
you
had
a
system
to
report
those
things
as
warnings
or
notifications
that
might
be
really
useful,
not
just
for
greenness
and
preventing
resource
waste,
but
maybe
catching
things
like
security
risks
that
are,
I
don't
know.
Ransomware
would
be
another
thing.
Maybe
you
could
flag
by
this.
D
I
think
that
the
one
interesting
thing
also
that
kiverno
does
is
that
it
has
they
just
added.
I
don't
think
one
six
has
come
out
yet
I
think
it's
still
in
the
release
candidate
phase,
but
when
it
comes
out,
they
added
image.
D
...signing validating webhooks, to make sure that things are signed with, like, a cosign certificate, to make sure that no one intervened in the middle. And that's something that's big with, like, Docker Hub: if you're using public registries, sign your images and make sure that they are actually the same when they come down, because that's usually where a lot of these people are getting in. They're not necessarily getting into the image; they're faking the image on the way and doing things through that.
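For reference, the feature being described landed as `verifyImages` rules. The sketch below follows the syntax Kyverno documented around the 1.6 timeframe; the registry pattern and public key are obviously placeholders:

```yaml
apiVersion: kyverno.io/v1
kind: ClusterPolicy
metadata:
  name: verify-image-signatures
spec:
  validationFailureAction: enforce
  rules:
    - name: require-cosign-signature
      match:
        any:
          - resources:
              kinds:
                - Pod
      verifyImages:
        - image: "ghcr.io/example/*"   # placeholder registry pattern
          key: |-                      # placeholder cosign public key
            -----BEGIN PUBLIC KEY-----
            ...
            -----END PUBLIC KEY-----
```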
D
A
Okay, well, I think we kicked around a whole bunch of ideas here related to this that maybe could spawn off future activity. If what you were talking about, doing this whole storage thing, is something you need an extra person on, I just think it would be a learning experience for me.
A
So
if
you
need
an
extra
body
in
there
or
even
want
to
run
by
a
draft
of
a
presentation
or
something,
let
me
know,
and
also
in
terms
of
a
forum
to
present
it,
it
could
be
at
these
user
group
meetings
or
something
else.
I
don't
know
what
you
had
in
mind
there,
or
even
you
know
that
this
is
almost
such
a
broad
concept,
that
it
could
be
a
whole
series
of
presentations
blogs,
whatever
I
think
and
could
go
in
user
group
meetings,
com,
physical
conferences,
online
conferences
or
whatever.
B
Yeah,
that's
why
we
were
looking
at
the
what
we
are
doing
tanzu
tuesday
march
sometime.
Isn't
it
robert
march
22nd
or
something
yeah.
C

B
C
But I think there's a huge need for this kind of guidance to be out there, because everyone's struggling with this.
A
C
Well, I have a question that maybe you guys can help me with; it's related to the CSI.
C
So I'm still quite unfamiliar with how you do things like metric collection in Kubernetes, and how you can scrape things with Prometheus. And one thing I noticed recently was that the CSI apparently now exposes a load balancer service. So you, with a Prometheus... what's the word for it... instance.
C
Yeah
well
a
thing
that
produces
prometheus
an
no.
I
think
it's
that
yeah
and-
and
I
thought
well
that's
interesting,
and
then
I
thought
I
thought
just
today.
I
think
I
saw
like
this
was
popping
up
in
other
places
as
well
now,
up
until
now,
the
various
kubernetes
distributions
that
I've
seen,
which
are
obviously
vmware
ones.
They
don't
do
this
yet
and
but
now
tkgs
does
because
it
incorporates
version
of
the
csi.
C
Does
this
now
the
problem
is
it's
not
documented
anywhere
at
all,
except
in
the
csi
project,
and
it's
kind
of
weird,
because
you
know:
do
kubernetes
and
there's
this
low
balance
entry
and
it
doesn't
even
work
with
default.
So
I'm
like
what
is
it?
What
can
you
do
with
this?
So
that
that's
kind
of
my
question
like
what's
that?
What's
it
actually
for
how
are
you
supposed
to
use
endpoints
like
that?
C
B
So it doesn't need to be a server-side load balancer, but it is service type LoadBalancer in TKGS, because the thought is that people want to run external Prometheus instances somewhere else on the network, and it would not be in that cluster. Because where you're seeing service type LoadBalancer is in Supervisor or in TKGS, and you would generally not have your Prometheus instance installed there. So it can't be ClusterIP, and there's no point making it NodePort, so it's service type LoadBalancer.
B
What
it
is
is
essentially
key
value
pairs
right.
That's
that's!
How
prometheus
does
its
metrics
collection
it?
It's
a
fetch
based
system.
It
runs
every
15
seconds
by
default
at
prometheus
instance.
That
is,
and
it
will
search
whatever
endpoints
it
is
given.
B
Now
you
can
give
it
endpoints
in
a
number
of
ways
by
a
crd
with
service
with
a
type
of
service
monitor,
and
you
give
it
essentially
the
service
name,
the
name
space
it's
in,
and
it
will
go
scrape
that
it'll,
discover
and
scrape
that
endpoint
right
and
the
what
is
exposed
from
csi
is
just
you
know,
like
g
c
underscore
memory
underscore
alloc
equals
and
then
the
number
right,
that's
that's
kind
of
the
level
of
stuff
that
you're
getting
there
might
be
number
of
pvcs
and
the
amount
of
storage
used.
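A ServiceMonitor is a small object. This sketch shows the shape; the namespace, label selector, and port name are placeholders, since the actual labels on the CSI metrics service would need to be checked against the deployment:

```yaml
apiVersion: monitoring.coreos.com/v1
kind: ServiceMonitor
metadata:
  name: vsphere-csi-metrics
  namespace: monitoring
spec:
  namespaceSelector:
    matchNames:
      - vmware-system-csi          # placeholder: wherever the CSI runs
  selector:
    matchLabels:
      app: vsphere-csi-controller  # placeholder: the metrics service's labels
  endpoints:
    - port: metrics                # placeholder: the service's named port
      interval: 15s
      path: /metrics
```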
B
I don't know, I haven't looked at what metrics are in there, but it's just plain text, and then it gets scraped every 15 seconds by Prometheus. So you would have to set up a ServiceMonitor, or else a static target, and this is what Prometheus calls them: targets. So you're setting up a target to scrape, and depending on how you install Prometheus, you could do that through the UI, you can do it through K8s itself, because there are API objects to do that, or use their operator.
B
In
any
case,
you
set
up
a
target
that
says
I
want
you
to
scrape
this
ip
at
this
port
at
this
path
and
the
default
passes
path
is
slash
metrics,
so
if
you
go
to
slash,
metrics
you'll
see
all
of
it
and
that
will
just
pull
those
back
and
they
will
then
be
stored
in
prometheus,
and
you
can
run
queries
on
them
over
time.
B
That's
essentially
it
it's
it's
it's
quite
simple,
but
getting
into
the
prometheus
stuff
and
figuring
out
how
to
get
like
new
targets
into
it
is
a
bit
mind-bending
to
begin
with,
especially
if
you've
never
played
with
it.
Before,
like
I
did
the
ml
demo.
I
think
here
at
one
point
where
it
was
like
recording
the
number
of
frames
per
second
that
were
being
processed
by
the
gpus,
and
it
is
it's
very
non-trivial
to
get
that
kind
of
stuff
set
up
if
you've
never
worked
with
it
before.
B
D
And
the
only
other
thing
that
is
very
important
is
that,
if
you're
using
the
tanzu
prometheus
just
because
whether
it's
tensor
community
edition
or
a
commercial
have
fun
it's
in
a
config
map,
so
you
have
to
edit
a
manual
config
map
and
add
it
in
the
prometheus
method.
D
You
can't
use
a
service
monitor,
that's
part
of
prometheus
operator,
which
is
the
the
fact
a
way
that
most
people
are
installing
it
today
in
the
community,
because
it
gives
you
the
ability
to
set
up
alert
rules
for
alert
manager
and
service
monitors
and
all
and
scrape
jobs,
and
things
like
that
all
through
crds.
So
I
would
suggest
going
down
that
approach,
no
matter
what,
because
it's
very
hard
to
manage
through
a
config
map.
Even
if
you
know
prometheus,
it
is
not
trivial
and
have
fun
reloading
the
pod.
Every
time.
B
All
right,
you're
gonna,
have
to
have
a
config
map
based
reloader
for
the
pods
and
all
kinds
of
stuff.
So
just
just
use
the
prometheus
operator
if
you're
interested
and
if
you're
gonna
stand
it
up,
I
would
suggest
the
easiest
way
to
do
that
is
use.
The
helm
chart
called
cube,
prometheus
stack.
That
is
absolutely
one
to
get
started
with.
D
The Bitnami Prometheus Operator, okay. Because I actually just had that with a customer who needed to keep their metrics for a year and a half. Why? Regulations. I said: it's not like it's logging. They said: yeah, metrics need to be kept for a year and a half. Never understood why.
B
And
robert
yeah,
that
is,
the
right
helm
chart
and
I
just
sent
you
my
values
file
for
that
config
map
or
that
helm
chart
that
I
use
live.
So
it
is
run
on
my
arm
cluster.
So
there's
some
image
customizations
just
whip
all
those
out
but
use
the
rest
as
a
reference.
B
A
So we're at noon Pacific time, so last chance if somebody has a very short topic. Otherwise, let's close this one and resume in a month.
A
Okay,
well
bye
everybody.
It
was
a
great
conversation
as
usual
and
we'll
look
forward
to
the
next
one.