Description
A very special edition from the Red Hat APAC Managed OpenShift Black Belt team. Join our Red Hat experts Nethali Zoysa and Suresh Gaikwad for a LIVE Q&A chat accompanying a replay of their recent presentation on resiliency on ROSA and ARO. You'll learn about the key concepts behind scaling and high availability - from types of scaling to load balancing, they've got it all. There are technical examples and real-world recommendations to make your clusters amazing. And our presenters will be standing by LIVE in the chat to answer questions and share experiences!
A
Welcome to Hybrid Cloud happy hour! Today we'll be bringing you some really cool content from our managed cloud services team, the APAC Managed OpenShift Black Belts. Internally we refer to them by an acronym, of course, for their name: MOBB, or the Mobbs. The Mobbs work closely with our managed OpenShift customers and field teams, supporting real-world installs of ROSA, ARO and more. They get to play in this space every day and see all the ways that a managed OpenShift service is proving helpful to our customers in reaching their goals.
A
And with this experience they have gained a massive amount of knowledge on how to do real-world deployments the right way. So today I'm excited to share with you their recent session on scaling and high availability in ROSA and ARO, and as an added bonus, both our presenters are standing by live in the chat to answer your questions and share their stories and experiences, so be sure to say hi. Today we're featuring two Mobbs.
A
First up is Nethali, covering scaling in OpenShift. Nethali dives into the different types of scaling, resource management and more. You'll learn the concepts behind how it works, with sections on optimizing and best practices. After that, Suresh presents on high availability. He deep dives into everything from application probes to load balancing capabilities and controlling pod placement, everything to make your clusters safe, and you won't want to miss the best practices section, where Suresh gives you a rock-solid foundation for making your OpenShift cluster always available.
B
Scaling refers to the ability of OpenShift to automatically adjust the number of running instances of an application based on demand. This can be achieved manually or automatically. When it comes to manual scaling, to instantly change the number of replicas an administrator can use the oc scale command to alter the size of a job, deployment, replication controller, and so on.
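For illustration, a minimal sketch of manual scaling with the oc CLI (the workload names here are placeholders, not from the session):

  # Scale a deployment to 5 replicas (deployment name is illustrative)
  oc scale deployment/frontend --replicas=5

  # The same command works for stateful sets and replication controllers
  oc scale statefulset/db --replicas=3

  # Verify the result
  oc get deployment/frontend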
B
A vertical scaling mechanism involves the dynamic provisioning of resources, such as RAM or CPU of cluster nodes, to match the application requirements. This is essentially achieved by tweaking the pod resources based on the workload's consumption. This scaling technique automatically adjusts the pod resources based on usage over time, thereby minimizing resource wastage and facilitating optimum cluster utilization.
B
Works
best
with
long
running,
homogeneous
workloads
such
as
databases,
it
has
few
limitations
also,
for
example,
vertical
perotto
scaling
is
not
ready
to
use
with
jvm-based
workloads
due
to
limited
visibility
into
actual
memory
usage
of
the
workload
other
than
horizontal
scaling
and
vertical
scaling.
We
have
even
a
multi
-dimensional
scaling.
It
combines
horizontal
and
vertical
scaling
simultaneously.
This
is
a
less
frequent
and
not
recommended
more
complex
to
manage,
because
defined
in
the
when
to
scale
horizontally
or
vertically
depends
on
many
parameters
which
are
sometimes
hard
to
predict.
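As a sketch of what vertical pod autoscaling looks like in practice, here is the upstream VerticalPodAutoscaler resource; the target name is illustrative and this assumes the VPA operator is installed:

  apiVersion: autoscaling.k8s.io/v1
  kind: VerticalPodAutoscaler
  metadata:
    name: frontend-vpa
  spec:
    targetRef:
      apiVersion: apps/v1
      kind: Deployment
      name: frontend          # workload to autoscale (illustrative)
    updatePolicy:
      updateMode: "Auto"      # VPA may evict pods to apply the new requests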
B
The third type is cluster scaling. Cluster scaling involves increasing or reducing the number of nodes in the cluster based on node utilization metrics and the existence of pending pods. The cluster autoscaler typically interfaces with the chosen cloud provider, but in the OpenShift Container Platform implementation, cluster autoscaling is integrated with the Machine API by extending the compute machine set API. What this really means, as you can see on the screen, is that OpenShift has custom resources called machine pools, machine sets and machines, specifically on AWS infrastructure.
B
You can consider machine sets and machines similar to a ReplicaSet and the replicas of pods. You can change the number of replicas in a ReplicaSet to change the pods; similarly, you can change the replicas of a machine set to change the number of machines, which is controlled by the machine set. But when it comes to AWS, you have another higher-level abstraction called a machine pool, which basically controls even the machine sets.
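A rough sketch of the two levels (the machine set, cluster, and machine pool names are placeholders), assuming the standard oc and rosa CLIs:

  # OpenShift Container Platform: scale a compute machine set directly
  oc scale machineset my-cluster-worker-us-east-1a --replicas=3 -n openshift-machine-api

  # ROSA: adjust the machine pool, which manages the underlying machine sets for you
  rosa edit machinepool --cluster=my-cluster --replicas=3 worker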
B
So let's try to understand some key concepts under the hood of scaling. The first one is resource requests and limits. The amount of resources a container needs can be optionally specified using resource requests and limits. So what is a resource request? When a resource request is specified for a container, the kube-scheduler uses this information to decide on which node the pod is placed.
B
Then what is a resource limit? When a resource limit is specified for a container, the kubelet and container runtime enforce that limit so that the running container is not allowed to use more of the resource than the limit. This is used to prevent a pod from using up all the compute resources of a node.
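A minimal sketch of requests and limits on a container (the image and values are illustrative):

  apiVersion: v1
  kind: Pod
  metadata:
    name: example-app
  spec:
    containers:
    - name: app
      image: quay.io/example/app:latest   # placeholder image
      resources:
        requests:
          cpu: 250m        # used by the scheduler to pick a node
          memory: 256Mi
        limits:
          cpu: 500m        # enforced by the kubelet and container runtime
          memory: 512Mi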
B
In some cases, the kubelet even reserves the limit, which is greater than or equal to the request. Because of this reservation, the scheduler may refuse to place a pod on a node even though the actual memory or CPU usage on that node is very low, since the capacity check fails. This protects against a resource shortage on the node when resource usage later increases towards the peak request rate.
B
Kube-reserved: this reserves the resources necessary to run the Kubernetes agents, such as the kubelet, the container runtime, etc. System-reserved, as its name says, is the resource needed to run the operating system and system daemons. And last, there is the eviction threshold.
B
Memory pressure at the node level sometimes leads to what we call OOMs, out-of-memory events. This affects the entire node and all the pods running on it; the node can go offline temporarily until memory has been reclaimed. To avoid or reduce the probability of system OOMs, or system out-of-memory events, Kubernetes reserves some memory via a specific flag called eviction-hard: the kubelet attempts to evict pods whenever the memory available on the node drops below the reserved value. For this reason, resources reserved for eviction are not available for pods.
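On self-managed OpenShift these reservations would typically be tuned through a KubeletConfig resource, following the upstream kubelet configuration field names; on managed offerings like ROSA they are handled for you, and on recent OpenShift versions some values are auto-sized. A rough, illustrative sketch only:

  # Label the worker MachineConfigPool so the KubeletConfig below can select it
  oc label machineconfigpool worker custom-kubelet=reserved-resources

  apiVersion: machineconfiguration.openshift.io/v1
  kind: KubeletConfig
  metadata:
    name: reserved-resources
  spec:
    machineConfigPoolSelector:
      matchLabels:
        custom-kubelet: reserved-resources
    kubeletConfig:
      systemReserved:            # kept for the OS and system daemons
        cpu: 500m
        memory: 1Gi
      kubeReserved:              # kept for the kubelet, container runtime, etc.
        cpu: 500m
        memory: 1Gi
      evictionHard:              # kubelet evicts pods below this free-memory threshold
        memory.available: "200Mi"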
B
In the worst-case scenario, the horizontal pod autoscaler can take up to about a minute and a half to trigger the autoscaling: you can add up the 10 seconds, 60 seconds, plus 15 seconds, which is roughly one and a half minutes. Okay, now let's say the horizontal pod autoscaler wants to create another pod, but your nodes do not have the resources. In that case the cluster autoscaler comes into the picture, because you have to create another node. The cluster autoscaler has a reaction time of 10 seconds.
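For reference, a minimal HorizontalPodAutoscaler that would drive the kind of scaling described above might look like this (the target name and thresholds are illustrative):

  apiVersion: autoscaling/v2
  kind: HorizontalPodAutoscaler
  metadata:
    name: frontend-hpa
  spec:
    scaleTargetRef:
      apiVersion: apps/v1
      kind: Deployment
      name: frontend           # workload to scale (illustrative)
    minReplicas: 2
    maxReplicas: 10
    metrics:
    - type: Resource
      resource:
        name: cpu
        target:
          type: Utilization
          averageUtilization: 70   # scale out when average CPU crosses 70%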
B
The cluster autoscaler checks for unschedulable pods in the cluster every 10 seconds. Once one or more pods are detected, it will run an algorithm to decide how many nodes are necessary to deploy all the pending pods and what type of node groups should be created. The entire process takes no more than 60 seconds: on a cluster with fewer than 100 nodes the average latency should be about 5 seconds, and if your cluster has around 100 to 1,000 nodes, it is said that this time takes no more than 60 seconds.
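In OpenShift the cluster autoscaler is configured declaratively; a rough sketch, where the resource limits and machine set name are placeholders:

  apiVersion: autoscaling.openshift.io/v1
  kind: ClusterAutoscaler
  metadata:
    name: default
  spec:
    resourceLimits:
      maxNodesTotal: 20        # never grow the cluster beyond 20 nodes
  ---
  apiVersion: autoscaling.openshift.io/v1beta1
  kind: MachineAutoscaler
  metadata:
    name: worker-us-east-1a
    namespace: openshift-machine-api
  spec:
    minReplicas: 1
    maxReplicas: 6
    scaleTargetRef:
      apiVersion: machine.openshift.io/v1beta1
      kind: MachineSet
      name: my-cluster-worker-us-east-1a   # machine set to scale (illustrative)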
B
Let's imagine that our cluster has fewer than 100 nodes, so this becomes 30 seconds. Okay, the next step is node provisioning. It depends mainly on the cloud provider, but it's pretty standard for new compute resources to be provisioned in three to five minutes, so let's say it takes three to five minutes. And last is the pod creation time: launching a container generally happens in a couple of milliseconds, but some applications may take time to boot, depending on the kind of application, for example JVM-based ones.
B
Those take some time, and there may also be some operations to perform; for example, some data injection might take time. More importantly, downloading the container image could take from a couple of seconds up to minutes, depending on the size and the number of layers of the image. So that's how horizontal autoscaling works, and the time involved in each and every step.
B
So with that explanation we have a good understanding of the pod autoscaling lead time, as shown in the picture with the default timings: for the horizontal pod autoscaler reaction, 10 seconds for pod CPU and memory scraping, 60 seconds for metrics aggregation, and 15 seconds for the HPA to check the metrics. And if you need to create another new node, the cluster autoscaler reaction time adds 10 seconds plus an additional 30 seconds, node provisioning takes around three to five minutes, and then there is pod creation.
B
Am I happy with this? Are you happy to wait six or seven minutes to handle a sudden surge in demand? No, right? So in that case we have to see how to optimize this autoscaling. Let's go through it from top to bottom. OpenShift uses kube-controller-manager flags tuned for optimal HPA performance. These flags have already been optimized; they are fixed and you have no control over them. So you have no control over the HPA side, you cannot reduce that time.
B
Also, the OpenShift cluster autoscaler reaction time is 10 seconds, so you have no control over that either. As for node provisioning time and pod creation time, you have no direct control there: you cannot reduce the node provisioning time, and you cannot reduce the pod image download time itself, but you can have some workarounds to optimize them. So let's have a look. The first one is node provisioning time.
B
Minimize the creation of new nodes if possible. Choosing the right instance type for your cluster has dramatic consequences for your scaling strategy. For example, if your node only has space for a few pods, as you can see on the left side, you have to provision new nodes for additional replicas, incurring another six to seven minutes of delay: the lead time to trigger the horizontal pod autoscaler, the cluster autoscaler, and the provisioning of the compute resources on the cloud provider.
B
But if your node has space for a large number of pods, as you can see on the right side, this is reduced dramatically because you have enough space. Choosing a large instance type also has another benefit: the ratio between the resources reserved for the kubelet, the operating system and the eviction thresholds, which we discussed earlier, and the resources available for your pods is greater. So select the right instance, but not necessarily the biggest instance.
B
In all cases you have to do some research and try to select the optimum, large-enough node size. There is a big efficiency factor dictated by how many pods you can have on a node: cloud providers limit the number of pods, and sometimes it's dictated by the underlying network on a per-instance basis. You have to consider all of this. You should also consider that if you have only a few nodes, the impact of a failure of one node is very high.
B
Another approach is to keep a spare, empty node ready in advance. The cluster autoscaler does not have this functionality built in, but we can have a nice workaround for it. So let's see how it works: run a low-priority deployment with requests large enough to reserve an entire node. As you can see in this picture, on the right side there is a low-priority placeholder pod which always reserves the spare node.
B
Consider this low-priority pod as a placeholder, and as soon as a real pod needs the resources, you can evict the placeholder pod and deploy the real, high-priority pod. Let's see how it works: the new application pod comes in and evicts the placeholder pod, and now, since there is no node left for the placeholder pod, the cluster autoscaler will create another new spare node and your low-priority placeholder pod will appear there.
B
So pay extra attention to the memory and CPU requests, because they are what reserve the space of your new spare node. You may provision a single large pod whose requests roughly match the available node resources. To make sure that the placeholder is evicted as soon as a real pod, which is the application pod, is created, you can use pod priorities.
B
Pod priority indicates the importance of a pod relative to other pods. When a pod cannot be scheduled, the scheduler tries to evict lower-priority pods in order to schedule the pending pods. You can configure pod priorities in your cluster with a pod priority class. So this is the third option that you can try, and it is the one we are proposing. These are some workarounds you can use to optimize the node provisioning time.
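Tying these ideas together, a sketch of the placeholder approach (class name, sizes, and pause image are illustrative, not from the session): a negative-priority class plus a deployment whose requests hold a node's worth of capacity.

  apiVersion: scheduling.k8s.io/v1
  kind: PriorityClass
  metadata:
    name: placeholder
  value: -1                        # below the default of 0, so real workloads preempt it
  globalDefault: false
  description: "Low-priority placeholder pods that hold spare capacity"
  ---
  apiVersion: apps/v1
  kind: Deployment
  metadata:
    name: capacity-placeholder
  spec:
    replicas: 1
    selector:
      matchLabels:
        app: capacity-placeholder
    template:
      metadata:
        labels:
          app: capacity-placeholder
      spec:
        priorityClassName: placeholder
        containers:
        - name: pause
          image: registry.k8s.io/pause:3.9     # does nothing; its requests reserve the node
          resources:
            requests:
              cpu: "3"                         # size these close to one node's allocatable
              memory: 12Gi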
B
The next one is pod creation time. If your company allows caching the image in advance, then always try to cache the image in advance. There are some companies which have the image pull policy set to Always for some reason, for example security; in that case you have no chance with this, because the image always has to be pulled from the registry. But if your company allows it, you can optimize pod creation time by downloading the image in advance.
B
There are some mechanisms you can use to cache the image onto the nodes in advance; for example, an image puller using a DaemonSet. This uses a DaemonSet to download the images onto each and every node. There are also some operators, as you can see, such as the Kubernetes image puller, kube-fledged, etc., and you can even use the placeholder pod on the spare node to download the image in advance.
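A bare-bones version of the DaemonSet approach (the image names are placeholders, and the init container assumes the image ships a shell) simply pulls the image on every node and then idles:

  apiVersion: apps/v1
  kind: DaemonSet
  metadata:
    name: prepull-app-image
  spec:
    selector:
      matchLabels:
        app: prepull-app-image
    template:
      metadata:
        labels:
          app: prepull-app-image
      spec:
        initContainers:
        - name: pull
          image: quay.io/example/app:latest    # the image you want cached (placeholder)
          command: ["sh", "-c", "true"]        # exits immediately; the pull is the point
        containers:
        - name: sleep
          image: registry.k8s.io/pause:3.9     # keeps the DaemonSet pod alive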
B
Now for some best practices. There is a lot happening with every new release; the performance and compatibility of different Kubernetes versions with the autoscaler releases are thoroughly tested and documented. It's highly recommended to use only an autoscaler version compatible with the Kubernetes control plane version, to ensure the cluster autoscaler appropriately simulates the Kubernetes scheduler.
B
The second one is: be specific on resource requests and limits. It is crucial to ensure that pod resource requests are comparable to their actual consumption. As a recommended best practice, cluster administrators should leverage historical consumption statistics and ensure each pod requests resources close to its actual usage.
B
In some cases this is problematic, since your nodes will often be over-committed. Over-committed nodes can lead to excessive evictions, more work for the kubelet, and a lot of pressure on the node, so it's maybe not a good practice. The second option is that you can set the request and the limit to the same value; in Kubernetes this is often referred to as the Guaranteed quality-of-service class, and it refers to the fact that it is improbable that the pod will be terminated and evicted.
B
The Kubernetes scheduler will reserve the entire CPU and memory for the pod on the assigned node. Pods with the Guaranteed quality of service are stable, but also inefficient. For example, if your app uses 256 megabytes of memory on average but you reserve two gigabytes, roughly 1.75 gigabytes sit unused most of the time. So we have to ask the question: is it worth it if you want that extra stability?
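For completeness, a Guaranteed-QoS container is simply one where requests and limits are set to the same values (the image and numbers are illustrative):

  apiVersion: v1
  kind: Pod
  metadata:
    name: guaranteed-example
  spec:
    containers:
    - name: app
      image: quay.io/example/app:latest   # placeholder image
      resources:
        requests:
          cpu: "1"
          memory: 2Gi
        limits:
          cpu: "1"       # equal to the request
          memory: 2Gi    # equal to the request, so the pod gets the Guaranteed QoS class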
B
This is one of the practices you can use to increase efficiency: when your requests match the app's actual usage, the scheduler will pack your pods onto your nodes efficiently. Use this mechanism, where the requests match the actual average usage, when you want to optimize your cluster and use the resources wisely; for example, you can use this kind of mechanism for a web application.
B
The fact is, the autoscaler relies on each node's utilization and on the pod scheduling status to trigger scaling changes. Calculating node utilization relies on a simple formula: dividing the sum of all resources requested by the capacity of the node. As a result, missing resource requests for one or many pods affect the calculation of node utilization.
B
To achieve this, a template node is created by the autoscaler, on which all cluster-wide scaling operations are performed. To ensure the autoscaler and the template node perform accurately, it is ideal to have node groups in which the nodes have the same resource footprints. And the fifth point is: specify disruption budgets for all pods. Kubernetes supports defining a pod disruption budget as a cluster object for voluntary or involuntary disruptions of workload replicas. To prevent incurring losses, cluster administrators should define pod disruption budgets to ensure the autoscaler maintains the required minimum set of pods.
C
Hello everyone. Today we will talk about how to achieve high availability of applications when deployed on OpenShift clusters. We will cover how high availability can be achieved with the help of application probes, load balancing techniques, scheduling strategies, pod disruption budgets, and pod priority and preemption, and then we will discuss some of the best practices to be followed to achieve high availability.
C
When we talk about high availability, it's important to consider cluster high availability along with application high availability, though we will be focusing on application high availability today. When deploying ROSA, ARO or OSD clusters, always ensure the clusters are deployed across multiple availability zones.
C
OpenShift includes two types of probes for evaluating the status of applications. A liveness probe examines whether an application is functioning properly and, if not, causes it to be restarted. A readiness probe examines whether an application is ready to receive traffic and, if not, causes it to be removed from the service endpoint list.
C
The TCP socket probe makes a pure socket connection to the identified port and is only considered successful if the connection can complete. Neither HTTP GET nor TCP socket probes require anything special of the pods or containers being probed. The most powerful probe is exec: this probe actually results in the specified executable being run inside the designated pod's container.
C
If a readiness probe fails a sufficient number of times, the pod is marked as not ready, and this causes it to be removed from the endpoint list of any service that maps to it. In this way, additional traffic going to the service is no longer balanced across pods that are not ready. Once the pod passes the readiness checks again, it is marked ready, and this results in the pod being re-added to the endpoint list, and the traffic will again be distributed to that pod.
C
Now, there are some other important settings when configuring these application probes: initialDelaySeconds, which is how long to wait after the pod is launched before beginning to check; timeoutSeconds, which is how long to wait for a successful connection (this applies to HTTP GET and TCP socket probes only); periodSeconds, how frequently to re-run the probe; and failureThreshold, which states how many consecutive failed checks are needed before the probe is considered failed.
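Putting those settings together, liveness and readiness probes on a container might look roughly like this (the image, paths, port, and timings are illustrative):

  apiVersion: v1
  kind: Pod
  metadata:
    name: probed-app
  spec:
    containers:
    - name: app
      image: quay.io/example/app:latest   # placeholder image
      ports:
      - containerPort: 8080
      readinessProbe:
        httpGet:
          path: /ready              # illustrative endpoint
          port: 8080
        initialDelaySeconds: 5      # wait after launch before the first check
        periodSeconds: 10           # how often to re-check
        timeoutSeconds: 2           # how long to wait for a response
        failureThreshold: 3         # consecutive failures before marking the pod not-ready
      livenessProbe:
        tcpSocket:
          port: 8080
        initialDelaySeconds: 15
        periodSeconds: 20
        failureThreshold: 3         # consecutive failures before the container is restarted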
C
Now let's take a look at the load balancing capabilities we have. OpenShift provides built-in load balancing capabilities which can distribute the traffic across multiple replicas of your application. This helps to ensure that no single replica is overloaded and that traffic can be automatically rerouted to healthy replicas if one fails. You can even split the traffic between multiple services for A/B testing, blue-green and canary deployments.
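As a sketch of traffic splitting with an OpenShift route (the service names and weights are illustrative; the same split can be set with oc set route-backends):

  apiVersion: route.openshift.io/v1
  kind: Route
  metadata:
    name: frontend
  spec:
    to:
      kind: Service
      name: frontend-v1
      weight: 90                 # 90% of traffic to the current version
    alternateBackends:
    - kind: Service
      name: frontend-v2
      weight: 10                 # 10% canary traffic to the new version
    port:
      targetPort: 8080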
C
We do not want to have all our pods on the same node, because that would make having multiple pods irrelevant when nodes must be taken offline. OpenShift already does a good job at this: the scheduler by default will try to spread your pods across the nodes that can run them. First of all, we can use node selectors on pods and labels on the nodes to control where a pod is scheduled; with node selectors, OpenShift schedules the pods on the nodes that carry matching labels.
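For example (the node name and label key/value are illustrative), labeling a node and targeting it from a pod spec:

  # Label the nodes you want to target (placeholders)
  oc label node worker-1 workload=frontend

  apiVersion: v1
  kind: Pod
  metadata:
    name: frontend-pod
  spec:
    nodeSelector:
      workload: frontend        # only nodes carrying this label are candidates
    containers:
    - name: app
      image: quay.io/example/app:latest   # placeholder image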
C
It's a good practice to dedicate some nodes to your mission-critical applications to avoid impacts, such as a resource crunch, on these application pods, which may be caused by other pods running on the same node. When we want more granular control over scheduling, affinity and anti-affinity rules can be used. Using these rules, you can spread the pods of a service across nodes, availability zones or availability sets to reduce correlated failures, or co-locate them when latency or performance matters.
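A sketch of an anti-affinity rule that spreads replicas of the same app across zones (the app label and image are illustrative; topology.kubernetes.io/zone is a standard node label):

  apiVersion: v1
  kind: Pod
  metadata:
    name: frontend-replica
    labels:
      app: frontend
  spec:
    affinity:
      podAntiAffinity:
        preferredDuringSchedulingIgnoredDuringExecution:   # soft rule; required... makes it strict
        - weight: 100
          podAffinityTerm:
            labelSelector:
              matchLabels:
                app: frontend                              # avoid zones already running this app
            topologyKey: topology.kubernetes.io/zone       # standard zone label on nodes
    containers:
    - name: app
      image: quay.io/example/app:latest   # placeholder image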
C
For application pods to get scheduled on the specific nodes dedicated to your mission-critical applications, taints and tolerations can be used. Taints and tolerations allow the node to control which pods should or should not be scheduled on it.
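A rough sketch of a taint on a node and a matching toleration on a pod (the node name, key and value are placeholders):

  # Taint the dedicated nodes; pods without a matching toleration will not land there
  oc adm taint nodes worker-1 dedicated=critical:NoSchedule

  apiVersion: v1
  kind: Pod
  metadata:
    name: critical-app
  spec:
    tolerations:
    - key: "dedicated"
      operator: "Equal"
      value: "critical"
      effect: "NoSchedule"      # tolerates the taint applied above
    containers:
    - name: app
      image: quay.io/example/app:latest   # placeholder image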
C
Now, even using affinity or anti-affinity rules does not guarantee an even spread of your replicas: with the preferred mode, an even spread is not guaranteed, and with the strict anti-affinity mode you may end up running a single replica per node, which presents a fault-tolerance problem and may not be an efficient use of the resources available on the nodes. This is where topology spread constraints come into the picture. Topology spread constraints give you more granular control during scheduling of the pods: you can use pod topology spread constraints to control the placement of your pods across nodes, zones or other user-defined topology domains.
C
By using a pod topology spread constraint, you get fine-grained control over the distribution of pods across failure domains, to help achieve high availability and more efficient resource utilization. Let's take a look at what a pod topology spread constraint definition looks like. Looking at this example, let's consider that you want to run seven replicas of your application across three availability zones.
C
The next field in this example is topologyKey, which is a key of one of the node labels. The next field is whenUnsatisfiable; it is the action the scheduler should take when it can't satisfy the conditions specified in the topology spread constraint. It could be ScheduleAnyway, which means that even if the condition isn't matched the pods will still be scheduled, or DoNotSchedule, which means that if the condition isn't matched the pods are not scheduled.
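Following that example, a constraint that spreads the seven replicas across the three zones might look like this (the app label and image are illustrative):

  apiVersion: v1
  kind: Pod
  metadata:
    name: frontend-replica
    labels:
      app: frontend
  spec:
    topologySpreadConstraints:
    - maxSkew: 1                                 # zone replica counts may differ by at most one
      topologyKey: topology.kubernetes.io/zone   # node label that defines each domain
      whenUnsatisfiable: ScheduleAnyway          # or DoNotSchedule to reject placement instead
      labelSelector:
        matchLabels:
          app: frontend                          # which pods are counted toward the spread
    containers:
    - name: app
      image: quay.io/example/app:latest          # placeholder image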
C
You can also benefit from the descheduler in situations where your nodes are over-utilized or under-utilized, where new nodes are added to the cluster, or where other scheduling conditions have changed. Now let's take a look at different disruptions and how we can deal with them. Before discussing the pod disruption budget, let's understand what a disruption is. Pods do not disappear until someone, a person or a controller, destroys them, or an unavoidable situation occurs.
C
Even if a node goes down for some reason or an application pod fails, the scheduler should be able to schedule the pods on other available nodes. For fault tolerance, always run multiple pods, at least two, for your application, such that the application can tolerate a single pod failure. Also spread these pods across availability zones using the scheduling techniques, so that even in the case of an AZ failure the application can still run and efficiently serve clients.
C
Let's take a look at how we can deal with voluntary disruptions, as these are disruptions initiated by the application owner or an admin. Always use GitOps practices for your Kubernetes manifest files; with GitOps, Git will be the source of truth, even for your application manifest files.
C
Another way to deal with these voluntary disruptions is to use a deployment strategy. A deployment strategy is a way to change or upgrade an application. The aim is to make the change without any downtime, in a way that the user barely notices the improvements. By default in OpenShift we use the rolling deployment strategy. A rolling deployment slowly replaces instances of the previous version of your application with instances of the new version of the application, and it waits for the new pods to become ready via a readiness check.
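A rolling strategy on a Deployment is sketched below (the names, replica counts and surge values are illustrative):

  apiVersion: apps/v1
  kind: Deployment
  metadata:
    name: frontend
  spec:
    replicas: 4
    strategy:
      type: RollingUpdate
      rollingUpdate:
        maxSurge: 1          # at most one extra pod above the desired count during the rollout
        maxUnavailable: 0    # never drop below the desired count
    selector:
      matchLabels:
        app: frontend
    template:
      metadata:
        labels:
          app: frontend
      spec:
        containers:
        - name: app
          image: quay.io/example/app:v2   # placeholder image and version
          readinessProbe:                 # the rollout waits for this to pass
            httpGet:
              path: /ready
              port: 8080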
C
This is a disruption budget which states that, no matter what, we will always have a minimum set of application pods up and running at all times in the case of voluntary disruptions. A pod disruption budget specifies the minimum number or percentage of replicas that must be up at a time. If you specify pod disruption budgets, OpenShift respects them when preempting pods, at a best-effort level. It limits the number of pods of a replicated application that are down simultaneously from voluntary disruptions.
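A minimal pod disruption budget might look like this (the selector and count are illustrative):

  apiVersion: policy/v1
  kind: PodDisruptionBudget
  metadata:
    name: frontend-pdb
  spec:
    minAvailable: 2          # or use maxUnavailable, e.g. "25%"
    selector:
      matchLabels:
        app: frontend        # the replicas this budget protects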
C
Now, even with all this in place, consider a disastrous situation where we don't have enough resources available to run the entire workload in the cluster. This is where pod priority can help to some extent. Let's talk about pod priority and preemption.
C
This is specifically useful when you have unforeseen failures, like AZ failures, or have minimal resources available in the cluster. Pod priority ensures your pod gets the highest priority when the scheduler tries to schedule the pods. The default priority is zero, and priorities range from 0 up to around 2 billion. Pod preemption allows the cluster to evict, or preempt, lower-priority pods so that higher-priority pods can be scheduled and run if there is no available space on a suitable node. Pod priority also affects the scheduling order of pods and the out-of-resource eviction ordering on the node.
C
The scheduler schedules a higher-priority pod sooner than pods with lower priority if its scheduling requirements are met. You can apply pod priority and preemption by creating a PriorityClass object and associating pods with that priority using priorityClassName in your pod specifications. By default in OpenShift we have three priority classes, and you can also have one global default priority class in the cluster, which means that if a pod doesn't specify a priority class, this default priority class will be used.
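Roughly, a custom priority class and its use from a pod spec look like this (the class name, value and image are illustrative):

  apiVersion: scheduling.k8s.io/v1
  kind: PriorityClass
  metadata:
    name: mission-critical
  value: 1000000             # higher value means higher priority (pods default to 0)
  globalDefault: false       # set true on at most one class to make it the cluster default
  description: "For business-critical application pods"
  ---
  apiVersion: v1
  kind: Pod
  metadata:
    name: critical-app
  spec:
    priorityClassName: mission-critical   # associates the pod with the class above
    containers:
    - name: app
      image: quay.io/example/app:latest   # placeholder image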
C
Now, when we talk about best practices: always ensure you run multiple replicas of your application pods; you can use a ReplicaSet or replication controller to ensure we have X number of replicas running at all times. Always spread your application pods evenly across nodes; you can use the different scheduling strategies we discussed to spread these application pods evenly across the nodes. Then specify a disruption budget for your pods; use PDBs so that applications will have a minimum set of replicas running at all times.
C
That holds even during voluntary disruptions, but be aware that pod disruption budgets cannot prevent involuntary disruptions from occurring. Then define a descheduler policy: even with all the scheduling techniques we have, once a maintenance event has been completed, pods may be scheduled in an unbalanced manner, so to be better prepared for the next maintenance event we may have to make sure pods are scheduled evenly again.
C
The next one is: design your application so that it can tolerate losing pods. You need to instrument your application to do any necessary cleanup when it receives a SIGTERM, and this phase has to be quick; with OpenShift, by default, a pod gets 30 seconds from when it receives the SIGTERM to clear all its in-flight connections. In the case of a web application, there is no time at this point to wait for the existing client sessions to conclude.
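One way to sketch this in the pod spec (the image, drain delay and timings are illustrative; the application itself must still handle SIGTERM) is a preStop hook plus the termination grace period:

  apiVersion: v1
  kind: Pod
  metadata:
    name: graceful-app
  spec:
    terminationGracePeriodSeconds: 30     # the default; time between SIGTERM and SIGKILL
    containers:
    - name: app
      image: quay.io/example/app:latest   # placeholder image
      lifecycle:
        preStop:
          exec:
            command: ["sh", "-c", "sleep 5"]   # brief pause so endpoints update before shutdown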
C
This leads us to the next principle: it shouldn't matter which pod receives a request. Because there is no way to cleanly finish in-flight sessions when a pod gets killed, it may happen that subsequent requests of an in-flight session go to a different pod after the pod that was managing the session is killed. Your application needs to be designed around this. It might be okay to lose the session in some cases, but in most circumstances you will want to give your customers the best user experience, which means not losing the session; for that, the application should be stateless.
C
The most important thing is to share nothing across multiple clusters. The clusters should be completely independent, without sharing any resources. You might be tempted to share the storage layer or some services outside the cluster; unless those services provide their own high-availability solution, a cluster should not share any service with another cluster.
C
Now, even in the case of a cluster failure, or an application failure in one of the clusters, your application will still be available to your end users. Now consider a situation wherein your application is writing some data to, or reading some data from, an Oracle database which is outside your OpenShift cluster.