Description
During this talk we will demonstrate how your applications can benefit from the Vertical Pod Autoscaler, improving the responsiveness and performance of your workloads in Kubernetes environments.
Website: https://www.redhat.com/de
Organized by @Microsoft @kubermatic7173 @SysEleven
Thanks to our sponsors @CapgeminiGlobal, @gardenio, @sysdig, @SUSE, @anynines, @redhat, nginx, serve-u
If your application is consuming more and more memory, the kubelet will terminate and restart your pod. On the other hand, you can also suffer poor workload performance: if your application is using more memory and CPU than expected, it will affect the other containers and pods running on the same node, because that node is allocating and running different applications that compete for the same resources.
So you also need to guard against this poor workload performance, which can come from wrong resource allocation. You can, for example, give your Java application eight gigs, but that is not an optimized setup. You need to deploy your applications with the proper requests and limits, matched to their actual resource consumption, in order to better utilize and optimize the resources in your Kubernetes clusters, and to avoid these and the many other problems you can have with a wrong requests-and-limits definition.
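For reference, requests and limits are declared per container in the pod spec. A minimal sketch, where the deployment name, image and values are illustrative and not from the talk:

```yaml
# Illustrative pod spec fragment: requests are what the scheduler
# reserves for the pod; limits are the hard cap enforced at runtime.
apiVersion: apps/v1
kind: Deployment
metadata:
  name: java-app                # hypothetical name
spec:
  replicas: 1
  selector:
    matchLabels:
      app: java-app
  template:
    metadata:
      labels:
        app: java-app
    spec:
      containers:
      - name: java-app
        image: example.com/java-app:latest   # placeholder image
        resources:
          requests:
            cpu: 100m
            memory: 256Mi
          limits:
            cpu: 500m
            memory: 512Mi
```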
So what's a possible solution for avoiding the different issues we described? Enter the Vertical Pod Autoscaler. The Vertical Pod Autoscaler, or VPA, is a Kubernetes tool in the CNCF, fully open source, that frees users from the necessity of setting up-to-date resource requests and limits manually for the containers in their pods. It will set the requests and limits automatically, based on observed metrics, and define the proper thresholds that allow correct scheduling.
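The VPA is configured through a VerticalPodAutoscaler custom resource applied with `kubectl apply -f`. A minimal sketch, where the object and target names are assumptions:

```yaml
apiVersion: autoscaling.k8s.io/v1
kind: VerticalPodAutoscaler
metadata:
  name: java-app-vpa        # hypothetical name
spec:
  targetRef:                # which controller's pods to manage
    apiVersion: apps/v1
    kind: Deployment
    name: java-app          # hypothetical deployment
  updatePolicy:
    updateMode: "Auto"      # apply recommendations automatically
```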
The VPA will watch over the cluster resources, for example preventing pods from reserving more memory and CPU than they need, and for this reason optimizing the different resources in your Kubernetes clusters. The VPA will monitor the resources that the workloads are actually using, querying the API and gathering the Kubernetes metrics, and adjust the resource requirements, so that the spare capacity is available for the other workloads running in your Kubernetes clusters as well.
This is the VPA architecture, which has three major components. The recommender monitors current and past resource consumption and, based on these metrics gathered from the Kubernetes metrics pipeline, provides recommended values for the containers' CPU and memory requests. The updater, on the other hand, checks which of the managed pods have the correct values set, and if the recommendations are not applied to your pods, it will automatically kill those pods and recreate them with the updated requests, based on the recommended resources that Kubernetes is gathering in real time. Finally, the admission plugin sets the correct resource requests on new pods, for example pods just created or recreated by the updater we saw. This is a diagram that shows the VPA architecture.
We have the recommender, and also the updater in the middle, watching the pods through a VPA object, and we have the VPA admission controller that reacts, for example, when a new pod is created. On the other hand, the VPA controller is always pulling different metrics from the metrics server through the API server, gathering utilization and events in real time.
So it will monitor your application in real time and will adjust and provide recommendations, predicting the proper requests and limits to avoid, for example, situations where you are using more memory than expected, or situations where a pod is evicted or OOM-killed. Sometimes you want these VPA recommendations applied automatically, and sometimes not, so the VPA has three different modes to run in.
The first is the Auto mode, which automatically applies the recommended resources to the pods associated with the controller: the VPA terminates existing pods and creates new pods with the recommended resource requests and limits. On the other hand, we have the Initial mode, which acts more or less like Auto, but only on certain events, for example when you create a new pod; it will never touch existing, running pods. And last but not least is the Off mode, which only generates resource recommendations but will never touch a pod, neither a running one nor a newly created one.
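The three modes correspond to the `updateMode` field of the VPA object. A sketch of the fragment:

```yaml
# updatePolicy fragment of a VerticalPodAutoscaler object.
updatePolicy:
  # One of:
  #   "Auto"    - evict and recreate pods with the recommended values
  #   "Initial" - set values only when pods are (re)created
  #   "Off"     - only publish recommendations, never act on pods
  updateMode: "Off"
```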
In that mode it just looks at your different pods and suggests recommended values. But sometimes VPA recommendations can cause trouble. Imagine that your application is consuming more and more memory, and the VPA sets a recommendation that might exceed the available resources, for example the node size. If it is requesting more than the available capacity or quota allows, the VPA recommendation can leave pods stuck in the Pending state. This is where the Cluster Autoscaler comes in, combining the Cluster Autoscaler with the VPA. Combining these two, VPA plus Cluster Autoscaler, can be a very good combination in order to scale your different nodes up and down, based on pod capacity and on the utilization metrics. So you can combine the VPA with the Cluster Autoscaler in order to grow the pods and also the number of nodes in your different clusters. Hopefully you can see my screen. Perfect, so we have our Kubernetes cluster.
Our Kubernetes cluster here has different nodes, and to avoid typing manually I will run this demo script that effectively drives the demo. We will have two different scenarios: the first without VPA, so we can see what happens if my application uses more memory than its limits definition. Hopefully you can see the screen and the font is okay. Perfect, so I will deploy an application that will use a stress image.
You can see that I allocated 250 as the memory limit, and the application is requesting more than its limits. So what happened? It was OOM-killed. For what reason? Because my application is using more memory, and keeps trying to use more memory, the kubelet detects this and restarts my application, and my application is not in good shape, is it? If I describe this application and grep for the reason, the reason is OOMKilled, effectively because my application is using more memory than its limits.
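A sketch of what this first scenario can look like (the image, names and exact values are assumptions, not taken from the demo): a container whose workload tries to allocate more memory than its limit is OOM-killed by the kernel, and `kubectl describe pod` then shows `Reason: OOMKilled` in the container's last state.

```yaml
# Hypothetical reconstruction of the demo workload: a stress
# container asked to consume more memory than its 250Mi limit.
apiVersion: v1
kind: Pod
metadata:
  name: stress-demo              # hypothetical name
spec:
  containers:
  - name: stress
    image: polinux/stress        # assumed stress image
    resources:
      requests:
        memory: 100Mi
      limits:
        memory: 250Mi
    command: ["stress"]
    # --vm-bytes above the 250Mi limit triggers the OOM kill
    args: ["--vm", "1", "--vm-bytes", "300M", "--vm-hang", "1"]
```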
I did not calculate the limits properly, and so I end up in this state. So how can VPA help us? The first thing is that you need to deploy VPA into your Kubernetes clusters. I used an OpenShift cluster, and I used OLM, the Operator Lifecycle Manager, in order to deploy it. As you can see, we have the admission plugin, the recommender, the updater and the Vertical Pod Autoscaler operator.
I will create a namespace, and in this namespace we will deploy the exact same application, so we can check how VPA adjusts the different requests and limits automatically. We define the exact same application, same image, everything the same, but this time we will set 150 of allocated memory and a fraction of a core for CPU.
That is, a CPU limit of 200 millicores and a request of 100 millicores. The VPA will then just use the metrics available within the Kubernetes metrics pipeline, and you can check these metrics yourself using kubectl top pod, which shows the different metrics for the pods in this specific namespace. As you can check, for example, the consumption is close to the request we provided, almost one core, and a little bit below the memory expected, and we can use these metrics as well.
With these metrics in place, we need to create a VPA object. We have the controller, and we also have the recommender and the updater, but we need to create an object for the Vertical Pod Autoscaler itself. We can check it here: effectively we create a VerticalPodAutoscaler that targets my deployment in my specific namespace and that also has the minimum value that is allowed and the maximum value.
This is the maximum value that the VPA will allow to be set, because even if my application keeps requesting more and more memory and CPU, I will put a cap on it in order to prevent bad things in my cluster, such as running out of memory. My controlled resources will be CPU and memory. Then, after this is applied, the VPA will watch my application and check the different metrics, and if my application has a request and is, for example, using more memory than before, the VPA will look at these metrics and define different recommendations.
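The minimum and maximum bounds described here map to the `resourcePolicy` section of the VPA object. A sketch using the one-core / one-gigabyte cap mentioned in the talk (the names and the minimums are assumptions):

```yaml
apiVersion: autoscaling.k8s.io/v1
kind: VerticalPodAutoscaler
metadata:
  name: stress-vpa              # hypothetical name
spec:
  targetRef:
    apiVersion: apps/v1
    kind: Deployment
    name: stress                # hypothetical deployment
  updatePolicy:
    updateMode: "Auto"
  resourcePolicy:
    containerPolicies:
    - containerName: "*"        # apply to all containers
      minAllowed:
        cpu: 100m               # assumed minimum
        memory: 50Mi
      maxAllowed:
        cpu: 1                  # never recommend more than one core...
        memory: 1Gi             # ...or one gigabyte of memory
      controlledResources: ["cpu", "memory"]
```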
So, for example, with the minimum and the maximum values set, at most it will put one core and one gigabyte of memory, never more, because we are capped at the maximum level, and only for this application. Remember that you can have as many VPA objects as you want in a namespace, each controlling different objects. So if we check the VPA status, the VPA is looking at these metrics in order to adjust the different resource requests and limits, checking CPU and memory.
And then, if we check the VPA status, okay, we have here that it's recommending one CPU and roughly the memory that it's consuming at this moment. As we can see, we have the target, which defines this recommendation, and we also have the upper bound, which is the maximum value that we can have, in bytes.
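The target and upper bound live in the VPA object's status, visible via `kubectl describe vpa` or `kubectl get vpa -o yaml`. An illustrative status fragment (the numbers are made up, not the ones from the demo):

```yaml
# Illustrative shape of a VPA recommendation status.
status:
  recommendation:
    containerRecommendations:
    - containerName: stress
      lowerBound:          # minimum sensible request
        cpu: 100m
        memory: 262144k
      target:              # the recommended request
        cpu: 1
        memory: 262144k
      upperBound:          # maximum the VPA will recommend
        cpu: 1
        memory: 1Gi
```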
As we can see, the application is limited to 200, and now we can simulate how the VPA adjusts the different resources as the application requires more of them. How do we do that? By patching our application so that it requires more, for example more memory than its limits. In a real situation without VPA, this would end up OOM-killed, but the VPA will save the day.
It will adjust the different requests and limits automatically, based on the usage. So it checks the different Kubernetes metrics, and look: we have an event saying the pod resources were updated by the stress VPA, the container's memory request was evaluated, and the change was applied automatically. So the next time the pod is created, the limits are automatically increased: we have a memory value that is adjusted, and the requests and limits that are defined are a little bit higher.
For this reason my application survives and does not enter this OOM-killed, out-of-memory state. So the VPA saved the day. As you can see, the pod is running, we have my application here that we can check, and my application is safe, up and running, and we can verify that its requests and limits were effectively adjusted by the VPA.