A
Now, let's get going. Thank you for joining us. Everyone, welcome to today's CNCF live webinar, Dynamic Right-Sizing of Kubernetes for Cloud Cost Savings. I'm Libby Schultz and I'll be moderating today's webinar. I'm going to read our code of conduct and then I will hand over to Varsha Nayak, DevOps engineer at Tryg, and Chip Huang, technical product marketing manager at OCI. A few housekeeping items before we get started: during the webinar you are not able to talk as an attendee, but you can leave all your questions in our chat box. We will get to as many as we can at the end. This is an official webinar of CNCF and, as such, is subject to the CNCF code of conduct. Please do not add anything to the chat or questions that would be in violation of that code of conduct. Please be respectful of all of your fellow participants and presenters. Please also note that the recording and slides will be posted later today to the CNCF online programs page at community.cncf.io, under Online Programs, and they're also available via your registration link.

B
Hey, thanks Libby. My name is Chip Huang and I'm with Oracle Cloud Infrastructure. I'm joined by Varsha and Piotr from Tryg, one of the largest insurers in all of Scandinavia, to discuss a topic that's top of mind for many of you when it comes to running your Kubernetes applications in the cloud, and that is achieving cost optimization for a cluster without compromising application performance. That is why right-sizing your Kubernetes clusters is so important: when it's done right, you can effectively achieve both of those objectives. Next slide, please.

So when we're talking about right-sizing a Kubernetes cluster, we're really talking about two things. The first is allocating the right amount of resources for the cluster, and this means memory and CPU. It's important that pods are not resource-starved, so the application can run smoothly, but you also don't want to over-provision and waste resources. So by right-sizing a Kubernetes cluster, you are saving money because you're only paying for the resources you utilize.

The second aspect is really selecting the right type of hardware and node types. Not all applications perform the same: some require more CPU, others require more memory, or are I/O-intensive, or require specialized hardware in order to run effectively. So by providing the right node type and the right hardware to your application, you allow it to perform optimally.

The third aspect is really what happens when you right-size a Kubernetes cluster. If you right-size the cluster effectively, essentially your cluster can operate more smoothly as well as become more stable. A key aspect of right-sizing a Kubernetes cluster is really looking at the best way to scale the application. Depending on the application, that might mean using the Vertical Pod Autoscaler, the Horizontal Pod Autoscaler, or a combination of both, but you're also looking for the right metrics in order to be able to scale your application. So just by going through the exercise of right-sizing a Kubernetes cluster, you essentially make your applications more scalable.

It also allows them to run more efficiently. Next slide, please.

But when you right-size a Kubernetes cluster for the cloud, it does come with some challenges. The first of which: the workload in Kubernetes is dynamic, and the amount of resources you may need to allocate for your application to run effectively will change depending on the load on the application. And because of those changes, the way that you right-size your Kubernetes cluster needs to be dynamic as well, so you need to adjust along with the load on the application.

And before you are even able to start right-sizing your Kubernetes cluster, you really have to understand how your resources are being used by the applications on the cluster, and that means you need to find the right way to monitor your cluster. But what are the right tools, and what is the right methodology? These are some of the things that are kind of hard to determine, and right-sizing in the cloud in general is complicated.

First of all, you do need to understand how your application behaves, but even knowing how your apps behave, you still need to understand, for a given cloud provider, what hardware and what node types are available, so you can match them up correctly. And finally, when you're right-sizing a cluster, it can affect the performance of your application. So you have to keep that in mind when you're doing dynamic right-sizing of a cluster, so that the methodology you use does not interfere with the performance of the application.

C
Hello, this is Varsha Nayak, a DevOps engineer from Tryg. I also have my platform manager present here, Piotr Haikovski, again from Tryg; he's one of our panelists today. So let's get started. Actually, before we get started, let's talk a little about who we are. Tryg is Scandinavia's largest non-life insurance company. We are headquartered in Ballerup, Denmark. We have over 5.3 million customers and over 7,000 employees.

Where do we stand in terms of market position? In the whole of Scandinavia we hold top-three market positions, spread across Denmark, Norway and Sweden, and in Denmark we hold the highest position. We have a broad variety of insurance products made available to our customers, spread across various business sectors. For example, in the private sector we have accident insurance, home insurance, pet insurance, health insurance and various others.

We have a commercial sector, where we have insurance for small and medium-scale businesses: workers' liability insurance, property insurance, motor insurance and the like. And when it comes to the corporate sector, we also have group life insurance, along with property insurance, transport insurance and so on. So what I'm trying to stress is that we have a huge variety of insurance products made available across various different business sectors, which means we have a huge amount of data flowing in, in real time, both structured and unstructured. We have to collect it from various different sources, and many of these sources hold data in the terabytes.

We have to structure that data, model it as per the ACORD standard for insurance, centralize and streamline it, and make it available in one place in real time, so that we can feed our analytical and business intelligence services and also our STPs, straight-through processes. This is to ensure that we improve the customer experience and also speed up the insurance process itself.

So the agenda for this session: we'll first have a quick overview of how we do a deployment, and then we'll talk about the challenges that we initially faced. There were quite a few; I'll try to touch upon the major ones. Then we'll talk about the solutions, as in what we did to actually right-size our Kubernetes cluster. These are spread across two stages: one, we had to right-size the worker nodes involved in the Kubernetes cluster, and two, we leveraged various autoscaling techniques in order to optimize utilization and save cost. And finally, we would be glad to share the statistics and the results that we achieved before and after we did these optimizations, and a quick summary.

Thank you. Now, talking about the challenges that we faced with this architecture: we had a few CPU-hungry workloads, we had a few memory-hungry workloads, and we had a few performance-intensive workloads, and we understood that we could not put all of these application pods under one umbrella and have the same host node for all of these applications.

The second issue: we have a huge scale of deployment, and every deployment on every Kubernetes cluster comprises at least two thousand-odd pods. This basically means that we were stressing the underlying hosts quite a lot, and we hit a few edge cases. To get around them, we had to use a few customization scripts, and this we were able to achieve using something called cloud-init.

One example of how we used cloud-init: we wanted to change a few arguments in the kubelet service on all the worker nodes. For example, we wanted to increase the system-reserved resources of the node itself on every worker node, and this we were able to achieve using the cloud-init script; a sketch of such a snippet follows.
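
As a rough illustration of what this kind of customization can look like, here is a minimal cloud-init sketch that drops a systemd override raising the kubelet's system-reserved resources. The file path, the KUBELET_EXTRA_ARGS convention and the reservation values are assumptions for illustration; the exact bootstrap hook differs per provider and node image.

```yaml
#cloud-config
# Hypothetical sketch: raise the kubelet's system-reserved resources via a
# systemd drop-in. Paths, env-var convention and values are illustrative;
# real node images (OKE included) have their own bootstrap hooks.
write_files:
  - path: /etc/systemd/system/kubelet.service.d/20-system-reserved.conf
    content: |
      [Service]
      Environment="KUBELET_EXTRA_ARGS=--system-reserved=cpu=500m,memory=1Gi"
runcmd:
  - systemctl daemon-reload
  - systemctl restart kubelet
```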

Then, coming to the third point: we had a diversified workload, as in there were busy hours and there were idle times in our workload, and we had over-committed resources to cater to the busy hours alone. This meant that when the load was basically idle, we were still paying for the same amount of resources regardless, and that was costing us too much. And of course, like every other business, we also had budget constraints and we had to bring down the cost somehow.

So I keep stressing that we have a very large deployment and a huge scale of deployment. This basically means that for every Kubernetes cluster that we have in our production environment, we have approximately 5,000 vCPUs, approximately 13 terabytes of memory, 300-odd terabytes of storage, and 2,200 to 2,500 pods. And this is for one Kubernetes cluster and one production deployment, and we have at least four of them running at any given instance. So there was a dire need to optimize the cost and the resources here.

So apart from just selecting the host OS type or flavor, we also thought about the processors used by the compute instances. For example, at least in our project, we had a few Java applications which were amd64-based, and we had Kafka clusters also deployed as pods within the cluster, using the Strimzi operator, and we used amd64-based compute nodes for these. And we had a few other Java applications, like in the second box, for which arm64-based images were created, and we could easily deploy them onto the arm64 compute nodes. The advantage of this is that they are highly performant and also fairly cheap in comparison with the amd64 ones, and these arm64-based compute instances are available on most major cloud platforms, be it Azure, Google, Oracle of course, and Amazon.

So we could leverage that wherever possible. And then we had a special requirement for running a database as a pod inside the Kubernetes cluster. Now you might ask me why: this is because we wanted our applications to be able to talk to our database as much and as frequently as possible, without incurring much latency, and Oracle, of course, provides a VM type or VM shape which has a local disk attached to it, an NVMe-based SSD.

Furthermore, now that we have chosen the nodes to deploy on, we have to make sure that every application pod goes to the right, stipulated worker node, and to make sure this happens we use Kubernetes node affinity, node selectors, taints and tolerations; a sketch follows. And talking about arm64 again, we were able to build arm64-based images using the Docker multi-architecture builder called buildx, in case you're interested.
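
As a rough sketch of those scheduling constraints (all names, labels and images here are made up for illustration, not Tryg's actual manifests), a pod pinned to a tainted arm64 node pool might look like this:

```yaml
# Illustrative only: pin a pod to a dedicated, tainted arm64 node pool
# using node affinity plus a matching toleration.
apiVersion: v1
kind: Pod
metadata:
  name: kafka-broker-example         # hypothetical name
spec:
  affinity:
    nodeAffinity:
      requiredDuringSchedulingIgnoredDuringExecution:
        nodeSelectorTerms:
          - matchExpressions:
              - key: kubernetes.io/arch
                operator: In
                values: ["arm64"]
              - key: pool              # hypothetical node-pool label
                operator: In
                values: ["kafka"]
  tolerations:
    - key: dedicated                   # hypothetical taint on the pool
      operator: Equal
      value: kafka
      effect: NoSchedule
  containers:
    - name: broker
      image: registry.example.com/kafka:latest   # placeholder image
```

The multi-arch images themselves can be produced with something along the lines of `docker buildx build --platform linux/amd64,linux/arm64`, which builds and tags one image per architecture behind a single manifest.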

Furthermore, now that we've decided the flavor is such-and-such and we have this type of VM to select from, we also have to consider how we architect our Kubernetes cluster itself. From our learnings, we suggest discrete node pool planning, basically meaning that every node pool should serve just one purpose and one kind of workload. It will be easier to manage and also makes more sense.

Consider this: say I choose a very big node with a huge amount of memory and CPU, and I decide I will deploy all my pods onto this node. There is a disadvantage to this, which is that if we have block volumes attached to the pods on these nodes, there's a limitation on the number of volumes that can be attached to every compute instance, and this is true for all the cloud platforms, so we'll have to watch out for that.

Of course, not all cloud providers have this flexibility of selecting the memory-to-CPU ratio. Oracle provides it in the form of flex virtual machines, but a few other cloud providers, for example Azure, provide an exhaustive list of standard VM shapes and sizes to choose from, so that also helps in choosing the right size for your kind of workload.

Of course, we suggest having a limited set of node pools, so that it is easier for us to manage every Kubernetes cluster. All right, so now that we have decided what hosts to deploy our pods on, let's venture into autoscaling. Just before we start with autoscaling, I would like to cover a prerequisite for the scaling, which is called the metrics server.

The metrics server basically collects the container resource metrics from all the kubelets on the worker nodes and then sends them to the Kubernetes API server, and they then become available to all our autoscalers, be it horizontal, vertical or the cluster autoscaler. So that's kind of a prerequisite. It's also what backs the kubectl top command, which reports how much CPU and memory every pod is utilizing; you get to see all those stats once you install the metrics server.

Now, let's venture into the first kind of autoscaling, which is the Horizontal Pod Autoscaler. As the name suggests, it scales out the number of replicas of a particular controller horizontally: when the load is high, it tries to increase the number of pod replicas belonging to a particular Deployment or StatefulSet, and when the load decreases, it tries to scale down again. There are two ways you can do this.

One way is using CPU and memory: based on how much CPU or memory every pod in a deployment is using, it scales out or scales in. The other way is to use custom metrics. Custom metrics, as in: your application pod exports some metrics, which make sense as a signal to the Horizontal Pod Autoscaler, to a Prometheus server, which is installed maybe on the Kubernetes cluster itself or somewhere outside.

Then, using these metrics, the Prometheus adapter makes them available to the Horizontal Pod Autoscaler as a feedback loop, and the Horizontal Pod Autoscaler then decides if it has to scale out or scale in. So there are two ways of using the Horizontal Pod Autoscaler, and there's just one note you have to consider: if you're using the Horizontal Pod Autoscaler with CPU and memory, you will not be able to use the Vertical Pod Autoscaler. Both modes are sketched below.
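
As a sketch of the two modes (target names and numbers are illustrative, not Tryg's actual configuration):

```yaml
# Mode 1: scale on average CPU utilization (cannot be combined with VPA).
apiVersion: autoscaling/v2
kind: HorizontalPodAutoscaler
metadata:
  name: my-app-cpu-hpa               # hypothetical
spec:
  scaleTargetRef:
    apiVersion: apps/v1
    kind: Deployment
    name: my-app                     # hypothetical target
  minReplicas: 2
  maxReplicas: 20
  metrics:
    - type: Resource
      resource:
        name: cpu
        target:
          type: Utilization
          averageUtilization: 70
---
# Mode 2: scale on a custom, app-exported metric served through the
# Prometheus adapter (this mode can coexist with VPA).
apiVersion: autoscaling/v2
kind: HorizontalPodAutoscaler
metadata:
  name: my-app-custom-hpa            # hypothetical
spec:
  scaleTargetRef:
    apiVersion: apps/v1
    kind: Deployment
    name: my-app
  minReplicas: 2
  maxReplicas: 20
  metrics:
    - type: Pods
      pods:
        metric:
          name: work_queue_lag       # hypothetical app metric
        target:
          type: AverageValue
          averageValue: "100"
```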

Okay, before we talk about this graph, I'll just give a little introduction to the graph itself. This comes from a custom tool which derives all the metrics and stats from our OKE, Oracle Kubernetes Engine, cluster, and this is collected from our production environment.

I will be showing you many such graphs, and they are all collected from our production environment; this is in congruence with the deployment chart that we showed initially, with 2,000-odd pods and 300 terabytes of storage. Here you can see time on the x-axis and the number of pods on the y-axis, and over time you can see the number of pods varying, because the workload demands it.

You can see that at one point it went below 350, and at one point it even went beyond 400. So we see that this varies with time, as opposed to a line parallel to the x-axis, as it was earlier, before the Horizontal Pod Autoscaler was introduced. Now, with this, are we saving anything?

No, unfortunately, because only varying the number of pods in the cluster doesn't mean that we are changing anything with regard to nodes. The nodes are still static, and nodes are what we pay for. So, unfortunately, with just the Horizontal Pod Autoscaler we're not achieving cost savings. So let's bring in the cluster autoscaler. To explain this I'll take an example: let's consider node 1 and node 2, of the same size for our convenience.

Let's say each can accommodate a maximum of three pods of the same type, and both node 1 and node 2 are fully occupied. Now let's say there are two more pods trying to get scheduled on the same cluster. The cluster autoscaler senses this, starts, or provisions, a new node, and then deploys these two pods, pod 7 and pod 8, onto that node. This is how upscaling works in the cluster autoscaler.

For downscaling, again consider the same node 1, node 2 and node 3. Let's say node 2 and node 3 are not optimally used and have enough capacity between them to accommodate two more pods of the same type. The cluster autoscaler senses this, marks one of the nodes for deletion, and then tries to move the pods scheduled on that node over to node 2.

As you can see in this figure, it will then try to delete node 3. This is provided the constraints, like pod disruption budgets, are all respected; there are a few constraints under which the cluster autoscaler acts. But if there are no constraints, it will try to reschedule the pods based on which node can accommodate them, and then try to reduce the number of nodes.

On a few cloud platforms, like Oracle, we have to install the cluster autoscaler as an add-on ourselves, and this gives us the liberty to choose and fine-tune the cluster autoscaler for our needs. I have tried to highlight a few of the flags that we configured and tweaked to make sure they fit our needs better, so I'll quickly take a few examples here.

For example, you can see this scale-down-utilization-threshold, which tells the cluster autoscaler how low a node's utilization should be before it considers scaling down that particular node. And you can also specify how much time you give a particular node to get provisioned and come up to the Ready state; here we have kept it at 15 minutes. There are many such parameters that you can play around with, and you can find them on the GitHub repository; a sketch follows.
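
The flags mentioned correspond to real cluster-autoscaler options; a sketch of how they might appear in the add-on's container args (the image tag, provider string and values are illustrative, not Tryg's exact settings):

```yaml
# Fragment of a cluster-autoscaler Deployment spec; values illustrative.
containers:
  - name: cluster-autoscaler
    image: registry.k8s.io/autoscaling/cluster-autoscaler:v1.26.2  # example tag
    command:
      - ./cluster-autoscaler
      - --cloud-provider=oci                     # assumed value when run as an OKE add-on
      - --scale-down-utilization-threshold=0.5   # node becomes a scale-down candidate below 50% requested
      - --max-node-provision-time=15m            # how long a new node may take to reach Ready
      - --scale-down-unneeded-time=10m           # how long a node must be unneeded before removal
```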

Now, this graph shows what we achieve with the Horizontal Pod Autoscaler and the cluster autoscaler together. It's a similar graph provided by the same tool, but on the y-axis you now see the node count, the number of nodes in the node pool. You can see that initially the node count went up to 100, and then it gradually reduced and stabilized at around 60.

So that's how you can see the node count go up and down. And are we saving anything here, cost-wise? The answer is yes, because nodes are what we pay for, and if we have a smaller number of nodes, of course we are paying for fewer of them, as opposed to having a straight line parallel to the x-axis, where we would have to pay for all 100 of them at all times. So this is an advantage. But is this enough?

Across the Kubernetes cluster, we have a huge difference between the utilization and the requests, and we are paying for the requests, by the way. The Vertical Pod Autoscaler comes to our rescue here. Again, the Vertical Pod Autoscaler together with the Horizontal Pod Autoscaler can work only when the Horizontal Pod Autoscaler works on custom metrics, and not CPU or memory.

Now, let's consider the Vertical Pod Autoscaler. As opposed to the Horizontal Pod Autoscaler, which increases or decreases the number of replicas of a particular pod, it increases the size of the pod itself. When I say the size, it basically means the CPU and the memory requested by a pod are increased based on the actual pod utilization.

This is done at a regular interval, of course, which is configurable, and in steps, which are also configurable in the Vertical Pod Autoscaler. So in this diagram, what I try to show is: first, we start off with a minimum CPU and memory that is configured by the user for the Vertical Pod Autoscaler. You can also configure a limit. Sorry about that.

So, to restart with the Vertical Pod Autoscaler: I'm hoping we got through the slide where I discussed why we need VPA. We saw that the utilization and the requested resources were way off, and we have to bridge this gap, and VPA comes to our rescue.

VPA together with HPA can be used only when HPA is used with custom metrics, and not with CPU or memory. And VPA, as opposed to HPA, where the number of pod replicas gets scaled up and scaled down, tries to increase the size of the pod itself. When I say the size: the CPU and memory requested by the pod are increased based on the actual utilization of the pod, and this is done at regular intervals, which are configurable, and also in steps, which are also configurable.

Just for the sake of simplicity, I have chosen some random numbers to depict how this works. Initially you have to set a minimum and a maximum CPU and memory that VPA can play with, and then it will start with the minimum configured CPU and memory on the pod, and step up as and when it senses that the pod is utilizing more than this. A sketch of such a VPA object follows.
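
A sketch of what such a VPA object can look like, with illustrative bounds (not Tryg's production numbers):

```yaml
apiVersion: autoscaling.k8s.io/v1
kind: VerticalPodAutoscaler
metadata:
  name: my-app-vpa                   # hypothetical
spec:
  targetRef:
    apiVersion: apps/v1
    kind: Deployment
    name: my-app                     # hypothetical target
  updatePolicy:
    updateMode: Auto                 # evict pods and recreate them with updated requests
  resourcePolicy:
    containerPolicies:
      - containerName: "*"
        minAllowed:                  # floor the recommender starts from
          cpu: 250m
          memory: 512Mi
        maxAllowed:                  # ceiling the recommender may not exceed
          cpu: "4"
          memory: 8Gi
```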

Of course, there are many other flags that can be found in the GitHub repo, but these are the few we would like to highlight, and you can ignore the exact numbers that we have configured, because we had to do a few iterations to get to these numbers, and so will you: if you have to tailor the VPA to your workload, you'll have to run a few iterations before you get to the ideal values for your project.

So to start with, I will highlight a few flags here. One is this recommendation-margin-fraction, which, as we use it, is basically the amount by which the CPU or memory recommendation gets stepped up or down. We have chosen 30 percent, so every step would be a 30 percent increase in CPU and memory. There are also parameters like how long you want to retain the memory and CPU history before you discard it.

You can also set which namespaces you want VPA to act upon, and we have used Prometheus for our storage here, so we provide the Prometheus URL, and so on and so forth. This is for the VPA recommender component. We also have something called the VPA updater component, where you can provide settings like the interval at which you want the updater to run; we have set that to 10 minutes, you could choose yours. We can also provide the minimum number of replicas of every controller that are required
for the VPA to act on a particular controller at all. And if you see too many evictions of the pods and you want to reduce that, we for example use this eviction-rate-limit and also eviction-tolerance. This just tells the Vertical Pod Autoscaler that a certain percentage of the replicas must be running at all times when it considers evicting one of the pods in the controller. The flags we touched on are sketched below.
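
Those names map onto real flags of the VPA recommender and updater binaries; a sketch with illustrative values (the Prometheus URL and namespace are placeholders):

```yaml
# Recommender container args (sketch):
- --storage=prometheus
- --prometheus-address=http://prometheus.monitoring.svc:9090  # placeholder URL
- --history-length=8d                    # how much metric history to load
- --recommendation-margin-fraction=0.30  # ~30% margin applied to recommendations
- --vpa-object-namespace=my-namespace    # placeholder namespace
# Updater container args (sketch):
- --updater-interval=10m                 # how often the updater runs
- --min-replicas=2                       # controller must have at least this many replicas
- --eviction-rate-limit=10               # cap on pods evicted per second
- --eviction-tolerance=0.5               # fraction of replicas that may be evicted at once
```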

We see that the red line, which is the actual utilization of the resource, is very close to the blue line, which is the requested resource. You might point out that there is still a huge difference initially, at least in the initial few days, and this is because we have configured it to be such, entirely because our workload behaves this way.

We have huge historical data loads in the beginning, and we want to reduce the number of evictions caused by the Vertical Pod Autoscaler, so we have configured it this way; you can refrain from doing this, of course. And are we saving here? With all of this combined, we are saving a lot, because, firstly, our pods are optimized, our pods are scaling, and the nodes are also scaling, and the nodes are what we pay for, and that's also optimized, so we save hugely on the cost.

This table that you see on screen is for the same workload in two different production environments, one without any of these optimization techniques, the one at the bottom. You can see that we had used 4K vCPUs, and after optimization we've come down to 1.7K vCPUs; and with memory, we had used 12 terabytes and we came down to 6.5 terabytes, almost a 50 percent cost saving. Of course, you can find a few differences between the tables.

This is because we tried to get wiser, so to say: we increased the node pool planning, increased the number of node pools with discretely defined nodes, and we also changed the shapes here and there. That's the reason you see those differences, but the load is the same. You can see these DenseIO, A1 Flex and E4 Flex shapes; these are standard VMs provided for the Oracle Kubernetes cluster. DenseIO is the one with the locally attached NVMe SSD, A1 Flex are the arm64-based compute instances, and E4 Flex is the standard amd64-based compute instance.

So with this we come to the summary, and to summarize, we first need to explore our cloud providers.

We are not doing GitOps as yet, for the first question. Okay, let me get to the question: if you do a new deployment, the latest value will get overwritten by the value that was in the manifest, so do you do anything special to sync to Git? When I say deployments in production: we do fresh deployments from the GitLab CI/CD every time, so the values are in the GitLab repo, and once we update them there, that's when it goes to master and that's what gets deployed.

That's because we tried to start the VPA a little later, and we configured the resource request parameters to very high values initially, because initially, at least for our workload, that's when most of the loading and the busy time happens. We have a huge amount of historical data, and until we get to the stable CDC state we don't want too many evictions of the pods.

That would just delay the loading process, and that's the reason we deliberately leave a gap there and start applying or patching the VPA a little later during the deployment.

Next question: given the Horizontal Pod Autoscaler can't use memory or CPU in this setup, what metrics are you using to know when to scale out to multiple pods? Yeah, we use custom metrics. We have applications which export a particular kind of metric which is the deciding factor, because it basically indicates how much we are lagging in time. I won't go into the details, and it could differ in your own project as well, but this value should basically be the deciding factor for you to know whether your workload is heavy, or whether your applications are busy or not.

So if you have some metrics in your application, you could of course export those metrics to the Prometheus server and, via the Prometheus adapter, make them available to the Horizontal Pod Autoscaler.

Next: how does the cluster autoscaler handle it when there are one or two pods preventing it from removing a node? Is there any way to flag some pods as not important, or as reschedulable if needed? Yeah, this is one thing that we also faced with the cluster autoscaler. We had pod disruption budgets, and that meant that when we had one instance of a pod and the pod disruption budget was saying that I should have a minimum of one, the cluster autoscaler refrains.

It raises its hand and says: okay, now I can't do anything with this node, it will have to exist, because the pod is still there and there's no second replica of that pod. That is, so to say, a wrong configuration, because a pod disruption budget should go with at least two replicas, in my opinion; a sketch follows.
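
To make the failure mode concrete, a sketch (names and labels hypothetical): with a one-replica Deployment, this PDB permits zero voluntary evictions, so the cluster autoscaler can never drain the node the pod sits on.

```yaml
apiVersion: policy/v1
kind: PodDisruptionBudget
metadata:
  name: my-app-pdb                   # hypothetical
spec:
  minAvailable: 1                    # with replicas: 1, no eviction is ever allowed
  selector:
    matchLabels:
      app: my-app                    # hypothetical pod label
```

Running the target at two or more replicas, or expressing the budget as `maxUnavailable: 1`, gives the autoscaler room to move the pod.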

And to get around this, I would say you will need a bit of manual intervention if you really want the cluster autoscaler to run, or you have to make sure you decide wisely when you configure pod disruption budgets. Next: do you have KEDA along with VPA, is it possible? I am not aware of KEDA, to be frank here. Piotr, do you have any inputs on this? Yeah.

D

C
Okay, so: does each app have to push metrics to Prometheus at some interval, or does Prometheus pull? Yeah, there are two ways Prometheus works, one is push and one is pull, and we are doing push, wherein we push the metrics to Prometheus. And there is configuration in Prometheus itself where you can say how frequently it should pull, or receive, the metrics from the various apps. That's a configuration in Prometheus: when you install Prometheus onto the cluster, you have those configuration parameters; a sketch follows.
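
For the pull side, that cadence lives in Prometheus's own configuration; a minimal sketch (the job name and discovery settings are illustrative):

```yaml
# prometheus.yml fragment (sketch): Prometheus scrapes each target on
# this schedule; in push mode, apps instead send metrics to a Pushgateway
# that Prometheus then scrapes the same way.
global:
  scrape_interval: 30s               # how frequently targets are scraped
scrape_configs:
  - job_name: my-app                 # hypothetical job
    kubernetes_sd_configs:
      - role: pod                    # discover application pods in-cluster
```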

Next: if the HPA and VPA methods are used together, what is the difference between their metrics? Okay, if HPA is used together with VPA, I'm presuming this is because HPA was used with custom metrics, and custom metrics basically means your application is trying to export some metrics very specific to your application. VPA, on the other hand, uses the actual resource utilization: when you run a kubectl top command, you actually get to see how much a pod is really utilizing, regardless of how you've configured the pod's resource requests.

So that is what the Vertical Pod Autoscaler considers, and for HPA you have the custom metrics, which is what your application is saying: I need a threshold to scale out and scale in the number of replicas of a pod. Hope that answered it, and I hope I didn't miss any questions.

Hope the session was of some help to you. If you have any further questions, you can of course leave them with the organizers and they will communicate them to us over mail. Thank you for your patient listening.

A
All right, well, excellent job. Thank you, Varsha and Chip, and thank you everyone for joining us. I think you know how to reach our speakers if you have any additional questions, but thanks for joining our CNCF live webinar today. As a reminder, everything will be online later today or early tomorrow. So just let us know if you have any questions, and we'll see you again next time. Thanks so much.