Description
Making Ceph Fast in the Face of Failure - Neha Ojha, Red Hat
Ceph has made a lot of improvements to reduce the impact of recovery and background activities on client I/O. In this talk, we'll discuss the key features that affect this, and how Ceph users can take advantage of them.
About Neha Ojha
Senior Software Engineer, Red Hat
Neha is a Senior Software Engineer at Red Hat. She is the project technical lead for the core team focusing on RADOS. Neha holds a Master's degree in Computer Science from the University of California, Santa Cruz.
So, having said that, I want to understand how many of us in this room actually know how Ceph deals with failures, or how recovery in general happens. All right, almost 50%, let's say 60%. I'll still spend a couple of minutes going over the basics of how Ceph deals with failures. A failure in general is something like an OSD failing or a node failing; what does Ceph do? There are two basic mechanisms that Ceph uses to deal with failures: one is called log-based recovery and the other is called backfill. Each of these mechanisms has its pros and cons, and I will go into that while I explain each of them.
So let's start with log-based recovery. As the name indicates, there is some kind of log involved, and the log that we are interested in here is the PG log. Every OSD maintains a PG log, which is basically a history of the operations that it sees happening on itself. The idea of log-based recovery is that when a node fails, there is obviously some information missing from its PG log. So it consults an up-to-date copy of another OSD's PG log, tries to figure out what it's missing, and then copies the delta of information that it requires to become up-to-date.
Obviously, as I said, the PG log is something that the OSD needs to maintain, and it is an in-memory data structure. That has the advantage that recovery is fast; when I explain backfill, you'll see that backfill definitely takes a longer time than log-based recovery. The disadvantage is that, under certain circumstances, when there's a lot of recovery happening, these PG logs can keep growing out of bounds and end up causing out-of-memory conditions on OSDs.
So, having discussed log-based recovery a little bit, let's see what backfill is.
Inherently, or at least when we started out, there was a basic problem with the way recovery worked in Ceph. What we focused on was: when an OSD died, we wanted to bring it back up to normal as quickly as possible. Obviously that sounds good, but there are other implications: client I/O takes a backseat, and performance during recovery is really poor.
The people dying in this picture are like client operations that were not able to keep up with recovery and backfill, which is like a plague taking over the entire city. This picture translates nicely into this graph, which compares baseline performance versus performance during recovery in Hammer. I don't even want to say it: it's less than 10% of baseline performance. So obviously there was a problem, and we did address it.
A few things that did help were the defaults. Options like osd_max_backfills and osd_recovery_max_active, the kind of parameters that were available for tuning, weren't tuned well by default. So if you manually tweaked them, you could get better performance, but there were still problems. In Infernalis, which was the next release, we made those two knobs default to better values, so that you could get better performance than in Hammer.
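For example, these two knobs can be tightened at runtime; the option names are real Ceph options, but the values below are only illustrative, not the defaults the talk refers to:

    # limit concurrent backfills and active recovery ops per OSD
    ceph tell 'osd.*' injectargs '--osd_max_backfills 1'
    ceph tell 'osd.*' injectargs '--osd_recovery_max_active 1'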
But still, if you look at it, it's not doing very well. So in Luminous, what we did was introduce a way to throttle recovery, and the option that does that is called osd_recovery_sleep. As the name clearly indicates, it's the amount of sleep that you induce between recovery operations, and that way you can control the rate of recovery versus client I/O. This was our immediate solution, and it helped a lot with controlling recovery, especially when you wanted client I/O to take priority over recovery.
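A minimal sketch, using the same runtime-injection style as above; the sleep is expressed in seconds between recovery operations:

    # induce a 0.1 s pause between recovery operations on all OSDs
    ceph tell 'osd.*' injectargs '--osd_recovery_sleep 0.1'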
So we did some experimentation, because we had our past learning that we were not good at coming up with good defaults. We thought that, since we have this option, we should come up with good defaults from day one. So we did a bunch of experiments to see what reasonable defaults for these would be, and here is what I have for you.
What I have for you is a graph: the one on the left is average latency versus the sleep value, meaning the osd_recovery_sleep value, and the one on the right is the total time to recover versus the sleep value. As you can clearly see, increasing the sleep did help us reduce the average latency, but on the right you can also notice that the more sleep you induce, the longer the entire recovery takes to complete.
So we wanted to come up with an optimal value, so that we could take advantage of the reduced average latency without paying too much of a cost in total time to recover. We came up with 0.1 seconds as the default value for hard disks. Just for information's sake, these tests were done on BlueStore, with FIO 4K random writes.
We did similar experiments with SSDs but, as we know, SSDs are very fast; we didn't want recovery to get throttled at any stage, so we just kept 0 as the default. Then we also have the hybrid option, which is one of the most common use cases for Ceph, where your data is on hard disks and metadata is on faster devices. Even here, the trends for average latency and total time to recover were pretty much the same as what we saw for hard disks.
We came up with a sleep value of 0.025 for hybrid setups. While I showed you these graphs, they made sense when we did these experiments; the idea I want you to take away from this is that we have those knobs now. These results may or may not hold tomorrow, or in the next five releases, but you can always re-perform these experiments, figure out what defaults work for you, and Ceph will use that to control the rate of recovery.
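A sketch of the per-device-class knobs behind these defaults; the option names and values match recent Ceph releases as far as I can tell, but verify them on your version (ceph config set itself exists only from Mimic onward; on Luminous you would use injectargs or ceph.conf instead):

    # defaults arrived at in the experiments above
    ceph config set osd osd_recovery_sleep_hdd 0.1
    ceph config set osd osd_recovery_sleep_ssd 0.0
    ceph config set osd osd_recovery_sleep_hybrid 0.025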
There were cases when users wanted to recover some of their data as quickly as possible, so we added an option to force recovery at a PG level, where you could just run a command, ceph pg force-recovery, and give it a list of PGs, or just one PG, and it would recover that PG ahead of any other PGs in your system. And in case you change your mind, or you incorrectly chose the PG, you can cancel it as well.
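For example, with hypothetical PG IDs:

    # jump these PGs to the front of the recovery queue
    ceph pg force-recovery 2.1 2.5
    # changed your mind? undo it
    ceph pg cancel-force-recovery 2.1 2.5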
Okay, then came Mimic. With Mimic, we thought further about improving latency during recovery, and what came to us (something that was available for us to know, but that we had not addressed) is that recovery is a synchronous process, so it blocked writes. So clearly it impacted write latencies and affected availability. We wanted to solve this problem by introducing async recovery, and here is the way we did it.
The idea is that we do not block writes on objects that are only missing on non-acting OSDs. When I say non-acting OSDs: the whole idea is that you have a bunch of OSDs, and as long as you can pick some OSDs while ensuring that you have enough of them to process I/O, you can afford to postpone recovery on the remaining non-acting OSDs. For the purposes of this presentation, they are called async recovery targets.
We will postpone recovery on those objects and not block I/O if the objects that are not up to date exist only on those OSDs. The whole idea is that we use log-based recovery to eventually recover those OSDs; just by postponing, you are getting better performance and not blocking writes immediately.
So when can we perform async recovery? There are a few conditions that need to be met for async recovery to happen, and the process starts with selecting those async recovery targets. How do you select them? As I mentioned, if you have a pool of OSDs, how can you say which of these should potentially become async recovery targets? We use the concept of the difference in length of the PG logs.
So, as I mentioned earlier for log-based recovery, every PG has a log that it maintains of the operations that have happened on it. During recovery, a node that failed does not have an up-to-date PG log, and when it tries to compare it with an up-to-date copy of that PG, it is going to figure out that it is missing some information. This delta between the PG logs can give us a rough idea of how far behind an OSD is from the most up-to-date copy.
The larger the difference, the more time it is going to take to recover. So we try to choose as async recovery targets those OSDs which are potentially going to take longer to recover, and postpone recovery on those. The other thing is that you need a way to control async recovery, in the sense of when you want async recovery to happen, or at what level.
You say what the difference in logs should be before you perform async recovery, and that parameter is called osd_async_recovery_min_pg_log_entries. The default for now, at least in Mimic, is 100. There is no evidence that 100 is the best value to use; we can definitely come up with a better default here as well. But the idea is that this lets you control how much async recovery happens, and if you want, let's say, async recovery to never happen,
you can set this parameter to a very high value; the difference in logs is never going to be that high, so it's never going to choose an OSD for async recovery. Beyond all those things, the most important thing to highlight is that when we are taking OSDs out for async recovery, we need to ensure that we have min_size replicas available. That's the part where I said "as long as you can process I/O": it is important that you have min_size copies available for I/O to proceed.
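A sketch of that knob; the option name is as in Mimic (the talk notes it was later renamed), and the second value is just an illustrative way of making the threshold unreachable:

    # default: an OSD more than 100 PG log entries behind becomes an async target
    ceph config set osd osd_async_recovery_min_pg_log_entries 100
    # effectively disable async recovery
    ceph config set osd osd_async_recovery_min_pg_log_entries 1000000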
These are more details about how it works in general, but the overview is that it works for both replicated and erasure-coded setups. The whole idea is that earlier, what we used to do was see if an object is missing; if so, we want to first recover that object and then process the I/O that has come in on that object.
So now the idea is that if an OSD is an async recovery target, we send only the log entries to that OSD, and not the entire transaction. When log-based recovery happens, the history of that log is present, so the OSD compares it and finds the right objects in an up-to-date copy, just by looking at its PG logs and seeing what its missing set looks like.
We also did some experiments to validate that async recovery worked better than regular recovery, and these results seem to indicate that it does. This is a graph of throughput where we are comparing the baseline performance, where there are no failures (the blue bar), against the throughput when an OSD dies (the red bar). Basically, we are running COSBench and generating RGW workloads: a mixed workload with read, list, write, and delete, and in order to induce recovery we kill an OSD. This shows that when async recovery is happening, throughput is definitely falling, but it's not falling as much; and if you look at the list and delete cases, or even the write cases, the throughput is pretty much comparable.
Okay, this is the average processing time, which shows similar trends. The average processing time is definitely going to increase in a failure scenario, and we see that it has increased; but surprisingly, for delete it hasn't. In fact, it is lower. So in general, the graphs and our validation looked good.
This is a more detailed graph; people who have actually run COSBench should understand it. The whole idea is just to show a time-series graph of how the throughput and latency behaved for every operation. If you just want the overview, look at the bubbles: for example, in the orange one we have 61 to compare with 54.
So you're comparing the no-failure case, which is 61, against 54 during async recovery, which is not bad. Then there's the exciting stuff which is now available in Nautilus. As I mentioned earlier, log-based recovery has, or at least had, the inherent problem that PG logs can grow out of bounds, even though we had a parameter that was supposed to bound them.
So in Nautilus, what we did is implement a hard limit for the PG log length, so that when you say this is the max I want my PG log to grow to, it will just stick to it. We understand that log-based recovery is important and can be faster, so we want you to be able to use it, but you can decide how much of it you want to happen. And by the way, this has been backported to Luminous.
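If I recall correctly, the hard limit is gated behind an OSD map flag that you set once every OSD runs a release that supports it; verify against your release notes:

    # opt in to the PG log hard limit cluster-wide
    ceph osd set pglog_hardlimit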
If you are running Luminous, you can still use it, just like on Mimic and Nautilus. Then came the other feature. I talked about forced recovery and forced backfill, and I also mentioned that they were introduced in Luminous at a PG level. One piece of feedback that we got from a lot of users was that it is not very intuitive or easy to map which pools are mapped to which PGs; you have to go and figure out, okay, this PG 2.1 is the one I want to run forced recovery on.
So, as Sage mentioned this morning, usability has been a theme for us, and even in this scenario, what we have now is that you can run a forced recovery at the pool level. You can just indicate: okay, I have a CephFS metadata pool, which is the highest-priority thing for me right now, and run a force recovery, and of course force backfill, at the pool level. And again, we always let you go back: if you change your mind, you can cancel at the pool level as well.
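A sketch of the pool-level commands described here; the pool name is only an example:

    # prioritize recovery/backfill for everything in one pool
    ceph osd pool force-recovery cephfs_metadata
    ceph osd pool force-backfill cephfs_metadata
    # and the way back
    ceph osd pool cancel-force-recovery cephfs_metadata
    ceph osd pool cancel-force-backfill cephfs_metadata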
The next thing is improved async recovery. Now, why do I say improved? As I discussed earlier, the way we decided which OSDs are going to get selected as async recovery targets was by just looking at the length of the logs. But in Nautilus we improved the accounting of missing objects a lot, so we felt that a good cost parameter would be a combination of this difference in PG log length and the number of missing objects on a particular OSD, and we renamed that cost parameter to osd_async_recovery_min_cost.
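The renamed knob, as a sketch; treat the value as illustrative rather than a recommendation:

    # cost threshold combining PG log delta and missing-object count
    ceph config set osd osd_async_recovery_min_cost 100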
So far, no validation has been done on what a good default value is here; I'm pretty sure it's going to change with different scenarios, but I think this will at least allow you to control async recovery when you use it. Also, I think it's more realistic now that our accounting has been fixed: just looking at the PG log length is not enough, and missing objects can give you a more realistic picture of how much time it is actually going to take to recover.
About backfill: there were improvements around backfill as well. The whole idea was that we calculate the tentative amount of space that is required for a backfill operation to complete, and if we see that there isn't enough space available on an OSD, we do not even accept reservations on that OSD for backfill. We basically deny the reservation and put it back into the queue until space gets freed up.
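Related, though it is an existing cluster-wide knob rather than the new reservation logic itself; the ratio here is illustrative:

    # refuse backfill to OSDs above 90% full
    ceph osd set-backfillfull-ratio 0.90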
Moving on, along the lines of recovery sleep, this is another thing that I think was motivated by some of our user experiences, which is basically introducing an option to throttle deletes of PGs as well. There can be scenarios where PG deletion just hogs all the bandwidth in your cluster and client I/O might not get enough bandwidth, so you can now control that using a similar osd_delete_sleep option. This also has different default values that auto-tune based on the underlying hardware.
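A sketch of the per-device-class variants; the option names exist in recent releases, but the values below are illustrative, so check your version's defaults:

    # pause between PG delete operations, in seconds
    ceph config set osd osd_delete_sleep_hdd 5
    ceph config set osd osd_delete_sleep_ssd 1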
This next piece is the work of a contributor who is, I think, a PhD research candidate, and it is more than a year of her work that we just merged for Nautilus. I'm very proud to say that we have academic research translating into open source software, which I think will be running in production very soon. This has been marked experimental for the right reasons: we haven't done enough validation at our end, under all kinds of scenarios. But if users and our community can give us feedback on it, that will be really useful to us. It has really promising advantages. The whole idea is that you increase the number of OSDs that you reach out to during erasure-coded recovery, so that you are getting a smaller amount of data from each individual OSD, rather than having a smaller number of OSDs and getting a larger amount of data from each of them.
So the granularity at which the copying happens is going to be finer, and you save on the extra copying around that was being done earlier. This merged just a couple of weeks ago, and it is something really exciting; I think a lot of users are going to benefit from it. The other thing that's almost ready, and I think is getting reviewed and tested, is that we had a restriction in erasure-coded pools that we did not perform recovery below min_size; recovery below min_size was not allowed.
So that was a restriction, and we feel that some of the reasons on which we had based this decision do not hold anymore. I think we are confident now that, as long as we can recover the data in an erasure-coded scenario, we can afford to go below min_size. That is the whole idea, and I think it is again going to be really useful in real-world scenarios.
That's going to land in October, so watch out for that. Other than that, I think one more thing, along the same lines of letting you prioritize the right kind of stuff: we also thought of going to the next level, where we prioritize stuff that we think is already important, like the CephFS metadata example earlier.
I think the next thing that we have identified, and that we think is going to be useful, is this: I mentioned osd_recovery_sleep being an option by which you can throttle recovery, but it has a clear downside in that it's a static value. So if you use a one-second value, it's just going to keep a one-second gap between recovery operations, no matter what.
So either you don't have enough client I/O and you could be wasting a lot of time in recovery when you shouldn't, or you have to manually go and change it and remember to put it back to the right defaults, or you just pay the extra cost in total time to recover. So what we have now thought of doing is adaptive recovery throttling.
The whole idea is that we will sense how much client I/O is going on in the cluster, and based on that we are going to raise or lower this sleep option, so that there is no manual intervention: nobody waking up in the night, setting an alarm to go change their osd_recovery_sleep value. You wouldn't have to do that.
So that's something that we hope is going to land in the next release or so. The other one is a full-blown QoS project that we have been working on as a research project, and I think now we have realized it's time for it to get even more focus from us. So we as a team in RADOS are trying to muster up our resources and focus on a full-blown QoS project, with a plan and timelines as to when we want to deliver it and what we want to do.
There is a bunch of work that we have associated with UC Santa Cruz; we have a PhD student contributing on a weekly basis, and we meet with them. But now I think we are also going to take the research side of it, use our expertise in the RADOS team, and get this project to the next level, I would say. That's also something to watch out for. All right, so I think I'm on my last slide, which summarizes the talk, and the summary word is: upgrade.
I think if you're running anything like Jewel or Hammer, you already know that you are in a bad place. There are a lot more exciting things, a lot more feature additions that have happened, that can make Ceph, you know, hands-free, so you don't have to bother with a lot of tuning and hands-on operations. So just upgrade to Luminous or Mimic, or even Nautilus. I mean, Nautilus is the best; as you can see, it's the smartest, and it has the best kind of auto-tuning.
It has better performance in terms of async recovery, and I think the prioritization stuff has been reworked all over again, so recovery priorities are going to work much better in Nautilus than in any older release. And yeah, watch out for new features in Octopus, which should probably be discussed at the next Cephalocon. Yes, that's it. Thank you.
Audience member: Cool, I just wanted to give the feedback that the osd_delete_sleep is actually making a huge impact; it's really, really working. Because if anyone turns on the balancer or PG merging, then the delete sleep is exercised all the time. Before, I think it was 12.2.11 or 12.2.12, this balancing was kind of disruptive and you had to really throttle it back, and when we enabled the sleep, everything became much more transparent.