From YouTube: CDS Jewel -- Cache Tiering
A: Fine, hey. I'm trying to browse my way through the blueprints really quick, but, oh right, you were on the same blueprint as Allen, also the whole SanDisk group, yeah. Do you want to kick yours off first? I think yours fell top to bottom; you're the first one. Yes, then, what was next? It was yours and the Intel one, then the SanDisk one, so we'll run through them in that order, and then we can beat each other up on cache tiering as we go along.
C: My understanding is that this pretty much allows them to measure the cache-miss cost associated with reading directly from the cache tier or from the base tier, rather, for reading proxied from the base tier. And what they found was that, for replicated pools at least, on random reads the performance was actually pretty good in that case, because it turns out that a lot of the poor performance we were seeing was because of extraneous promotions.
C: So that would suggest that what we really want to do is focus on when we want to do promotes. So what they're suggesting is remembering more explicitly the most recent operations, devoting some memory to remembering the most recent N operations, so that we can focus our efforts on promoting objects that really have been accessed a lot of times recently. The hit sets have limited granularity, since their main purpose is to detect cold objects, but they don't do a great job at differentiating among the hotness of hot objects. So they suggest we add a queue, an MRU queue, and trigger asynchronous promotions from the head of that list. We basically want to throttle the number of promotions such that we're devoting a certain percentage of the cache tier's write throughput to promoting from the base tier.
C: So it occurred to me that one way we could do this is we could tune the promotion threshold, or non-promotion threshold, based on how many promotions we've done recently. So if we've used a lot of I/O recently on promotions, we could become more selective, and vice versa.
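
A minimal sketch in Python of the two mechanisms being discussed: an MRU window of recent operations to identify genuinely hot objects, plus a promotion budget expressed as a fraction of the observed cache-tier write throughput. All names (PromoteThrottle, RECENT_N, budget_ratio) are illustrative assumptions, not Ceph's actual implementation:

```python
from collections import OrderedDict

RECENT_N = 1024          # size of the recent-operations window (illustrative)
MIN_RECENT_HITS = 3      # hits within the window before we consider promoting

class PromoteThrottle:
    def __init__(self, budget_ratio=0.25):
        self.recent = OrderedDict()       # object id -> hit count, MRU at the end
        self.budget_ratio = budget_ratio  # share of cache-tier write bandwidth
        self.promoted_bytes = 0
        self.cache_write_bytes = 0

    def note_op(self, obj):
        self.recent[obj] = self.recent.pop(obj, 0) + 1   # re-insert at MRU end
        if len(self.recent) > RECENT_N:
            self.recent.popitem(last=False)              # forget the oldest entry

    def record_cache_write(self, nbytes):
        self.cache_write_bytes += nbytes

    def record_promote(self, nbytes):
        self.promoted_bytes += nbytes

    def should_promote(self, obj):
        hot = self.recent.get(obj, 0) >= MIN_RECENT_HITS
        # Become more selective once promotions have consumed their share of
        # the observed cache-tier write throughput, and vice versa.
        within_budget = (self.promoted_bytes
                         <= self.budget_ratio * max(self.cache_write_bytes, 1))
        return hot and within_budget
```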
C: Another thing that came through is that, with this approach, the sequential I/O from the erasure-coded tier, or if you have an erasure-coded tier as a base tier, was much, much worse, which is not terribly surprising, since a lot of reads had to go through the EC base tier before we were willing to promote. So that further suggests that, even in that case, we want to do a good job of detecting obviously sequential I/O.
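
A minimal sketch of the kind of sequential-read detection suggested here: remember where the previous read on each object ended and treat a read that continues from there as sequential, skipping promotion for it. The structure is a hypothetical illustration, not Ceph's readahead logic:

```python
class SeqDetector:
    def __init__(self):
        self.last_end = {}   # object id -> end offset of the previous read

    def is_sequential(self, obj, offset, length):
        # Sequential if this read starts exactly where the last one ended.
        seq = self.last_end.get(obj) == offset
        self.last_end[obj] = offset + length
        return seq

det = SeqDetector()
det.is_sequential("o1", 0, 4096)      # False: first read of the object
det.is_sequential("o1", 4096, 4096)   # True: continues the previous read
```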
F: Okay, some time ago we did some testing on the cache tiering using the fio Zipf distributions. We have 4-megabyte objects and the total data size is 400 gigabytes, and for the Zipf distribution fio has a tool to estimate how much of the data will be hot, a tool called genzipf, and for this tool we used the parameter 1.2. Then we can get a line.
F: Ninety percent of the data will be hit in... no, no, I'm sorry: over ten percent of the data will be hit over eighty percent of the time. So from this point of view, if we have the cache size set accordingly... because our data size is 400 gigabytes, if we set the cache size to be 40 gigabytes, then we will have good performance from the cache tiering. And in our testing, we set the cache size to be 80 gigabytes.
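
For reference, a rough Python stand-in for the estimate described above (fio's genzipf tool does this properly; this is only an illustrative approximation): for a Zipf(theta) access pattern over N objects, compute what fraction of accesses lands on the hottest X% of the data:

```python
def zipf_hot_fraction(n_objects, theta, hot_pct):
    # Zipf: probability of accessing the i-th hottest object ~ 1 / i**theta.
    weights = [1.0 / (i ** theta) for i in range(1, n_objects + 1)]
    total = sum(weights)
    hot = sum(weights[: max(1, int(n_objects * hot_pct))])
    return hot / total

# 400 GB of 4 MB objects = 102400 objects; theta = 1.2 as in the test above.
print(zipf_hot_fraction(102400, 1.2, 0.10))  # fraction of hits on hottest 10%
```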
F: In the current implementation, we do the eviction based on the age of the object. That is, we have several hit sets for each PG to cover the time, and for those objects which are in a hit set, we use the hit set's timestamp to calculate the object's age. But there is a problem with this: since a hit set will cover a lot of objects, all of the objects in this hit set will get the same...
F
Get
the
same
time.
I
mean
the
same
age,
yep
same
age
and
then
I
when
we
use
this
age
to
calculate
the
to
calculate
which
one
should
be
built,
then
I,
we
didn't
get
get
a
good
result
from
this
because
I
we
have
many
objects
in
the
hills
and
he
said
then
we
have
same
age,
and
then
we
have
the
same
I
this
that
this
object
will
be
averted
at
the
same
time,
I
mean
and
the
photos
object
which
are
not
in
the
headset
I.
F: Currently we use the mtime as the age. But as we know, the mtime is the modify time; it doesn't change when we do read operations. When we do read operations on an object, we do not change the mtime, so this means that read operations don't have an effect on the age.
F: And also, when we do evictions, we calculate something like this: we use the power-of-two histogram to locate the object age. We can get an upper bound and a lower bound, and then we compare the upper bound with something called the evict effort to decide if we should evict this object. And this evict effort is a global value for a PG.
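
A small sketch of the eviction test as described: ages bucketed into a power-of-two histogram give only an upper and a lower bound, and the upper bound is compared against a PG-wide effort value. Names such as evict_effort_age follow the description above rather than Ceph's exact code:

```python
import math
import time

def age_bounds(last_access, now=None):
    # Bucket the age into a power-of-two histogram bin: the bin index only
    # tells us the age lies somewhere between 2**b and 2**(b+1) seconds.
    age = max(1.0, (now or time.time()) - last_access)
    b = int(math.log2(age))
    return 2 ** b, 2 ** (b + 1)   # lower bound, upper bound (seconds)

def should_evict(last_access, evict_effort_age):
    # Compare the bin's upper bound against a single PG-wide age threshold.
    lower, upper = age_bounds(last_access)
    return upper >= evict_effort_age
```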
F: So from these calculations, some object which is accessed later than some other object, the later object, may be evicted before the earlier object. I don't think this is quite reasonable. I think both of these issues lead to the suboptimal performance of the cache tiering.
F: So the first idea is to improve the way we calculate the age. Okay, when we estimate the age, we are actually also using the recency; we have a config value called recency, and we use the recency to make the decisions.
F: But to keep this in memory... it's not good to keep it in memory, because we may need a lot of memory to keep all that info. But persisting it together with every update is also not good, because when we do read operations, we aren't going to persist anything to the disk. So maybe persisting the atime into the...
F: Into the key-value store, so maybe that's a good idea. If we have this atime, then we can use this atime to calculate the age, that is, the recency of the object, and then we can make more accurate decisions to evict the object.
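
A minimal sketch of the idea, with hypothetical interfaces: buffer recent atimes in memory and flush them to the key-value store in batches, so plain reads don't each pay a synchronous persistence write:

```python
import time

class DictKV:
    # Trivial stand-in for a key-value store with a put(key, value) interface.
    def __init__(self):
        self.data = {}
    def put(self, key, value):
        self.data[key] = value

class AtimeBuffer:
    def __init__(self, kv_store, flush_every=1024):
        self.kv = kv_store
        self.pending = {}             # object id -> last access time
        self.flush_every = flush_every

    def on_read(self, obj):
        # Reads only touch memory; persistence happens in batches.
        self.pending[obj] = time.time()
        if len(self.pending) >= self.flush_every:
            self.flush()

    def flush(self):
        for obj, atime in self.pending.items():
            self.kv.put(f"atime/{obj}", str(atime))
        self.pending.clear()
```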
F: The second idea is to record something called the reuse distance. The reuse distance is defined as: we access the object, then later we access some other objects, and after some time we access the first object again; we call the reuse distance the number of accesses between these two accesses of the object.
F: The idea here is that we have many hit sets; we can set the number of hit sets per PG, and each hit set for a PG covers some time, or covers a number of objects. And if we have enough hit sets, then we can calculate the reuse distance. Say the object is accessed in hit set one and hit set six...
F: If we can find the object only once in all of the hit sets, that means that the reuse distance of this object is infinite. If we can't find it in any of the hit sets, it is infinite also. So we can use this to calculate the reuse distance, and then we can also use the power-of-two histogram to make the eviction decision.
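
A sketch of deriving a reuse distance from the per-PG hit sets as described: with hit sets ordered newest-first, the reuse distance can be approximated by the number of hit sets separating the two most recent appearances of the object, and treated as infinite when the object appears in at most one of them. The structures are simplified assumptions:

```python
def reuse_distance(obj, hit_sets):
    # hit_sets: ordered newest-first; each is a set of object ids.
    appearances = [i for i, hs in enumerate(hit_sets) if obj in hs]
    if len(appearances) < 2:
        return float("inf")   # seen once or never: treat as infinite
    # Distance between the two most recent appearances, in hit sets.
    return appearances[1] - appearances[0]

# Example: four hit sets, newest first; "a" appears in sets 0 and 2.
hit_sets = [{"a", "b"}, {"c"}, {"a"}, {"d"}]
print(reuse_distance("a", hit_sets))   # -> 2
print(reuse_distance("d", hit_sets))   # -> inf (seen only once)
```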
F
F
There
are
some
of
the
some
of
the
annual
list,
which
are
impractical
use.
What
is
called
a
RC,
that
is,
this
one
is,
is
used
in
the
IBM
high
in
a
storage
system,
and
there
are
some
others
called
some,
such
as
AI
is
and
they're
also
using
some
doing
some
systems.
I
think
we
can
also
make
use
of
this
these
algorithms
to
make
a
better
eviction
division
I
in
the
linux
kernel.
They
also
used
to
some.
They
also
use
the
tool
list
to
check
the
the
informations
and
then
make
the
eviction
tradition.
B: Do you have a good feel for how much of the problem is caused by poor handling of the information that's there (you talked about using atime, maybe having deeper hit sets with the different distances) versus how much of the problem is being caused by individual bits in the hit set essentially merging the behavior of multiple objects, because of the resolution?
F
For
the
first
two
ideas,
I
think
it's
it
would
be
not
that
complicated
the
the
overhead
would
be
not
that
too
much
I
for
each
operations
and,
let's
say
further
a
time
I
wanted
to
avoid
operations.
We
need
to
update
the
a
time
all
right.
This
is
a
case
since
we
need
to
to
write
to
persistent
the
object
in
for
two
entities
and
a
ladder
we
put
if
we
put
Adam
in
two
key
value
stores:
I,
don't
think
that
is
too
much
overhead.
F: All right, yeah, for reads, right, yeah, sorry: for reads we do need to pay. There is some additional overhead, since we need to persist the atime into the key-value store. That depends, but because this information is much smaller, I think the time is affordable. It's doable.
C: We would need to issue an actual write op, so that would be difficult to batch. Another approach would be to write them to a write-aside atime buffer, maybe of the most recent N transactions, but I wonder if that's really different from using something like an LRU variant, or (I briefly started reading the LIRS paper) whatever that is combined with a hit count and an atime, and periodically snapshotting that object.
C: ...to the replicas. I'm skeptical about the value of keeping it only in memory, because I worry about, after an interval change, having drastically different cache behavior, which would be bad. So it would be nice if, at least periodically, we were able to snapshot our current caching, or current recency, information.
F: By the way, I read something on the internet: the latest kernel also has two LRU lists, one to hold those pages which are accessed once recently, and another to hold those pages which are accessed more than once recently. And they still have some ideas to improve this two-LRU-list design; they are also considering using other algorithms like these.
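
A simplified sketch of that two-list scheme (modeled loosely on the Linux kernel's inactive/active page lists, not the kernel code itself): the first access puts a page on the inactive list, and a second access while it is still there promotes it to the active list, so one-shot scans cannot flush the genuinely hot pages:

```python
from collections import OrderedDict

class TwoListLRU:
    def __init__(self, capacity):
        self.capacity = capacity
        self.inactive = OrderedDict()   # pages accessed once recently
        self.active = OrderedDict()     # pages re-accessed while inactive

    def access(self, page):
        if page in self.active:
            self.active.move_to_end(page)   # refresh position on active list
        elif page in self.inactive:
            del self.inactive[page]
            self.active[page] = True        # second recent access: promote
        else:
            self.inactive[page] = True      # first access: inactive list
        self._balance()

    def _balance(self):
        # Evict from the inactive list first; hot pages on the active
        # list survive a one-shot sequential scan.
        while len(self.inactive) + len(self.active) > self.capacity:
            victim = self.inactive or self.active
            victim.popitem(last=False)      # drop the oldest entry
```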
D: The primary goal would be to support eviction and promotion scheduling for objects, and also to support certain objects bypassing the cache tiers if required. This would coexist with Yehuda's bucket-expiration work that he proposed in the previous blueprints. What we would basically be coming up with is a new policy engine which ties all these together. Okay, the policy engine is what would manage the rules. RGW would talk to the policy engine and figure out...
D: ...what is the policy to be enforced, and these policies would be set on the objects through xattrs; we would call this stamping, or tagging with policies. We could also come up with different loadable policies, and the RADOS classes would have support for that too, but the default policy engine that we would come up with should be sufficient to support most of the workloads. Any questions, please, please feel free to interrupt me.
D: The work in the tiering agent would be to understand these new extended attributes of objects, and to make sure that the tiering agent would be configurable to invoke a crawl on the cache tier to pass over all the objects which require these rules to be enforced. Also, the tiering agent would be enhanced to evict an object not just to the base tier, but to named tiers, in case there's more than one layer of tiering at the end.
D: If the user would want eviction not to the base tier but to a named tier, we would support that too, identifying the pools either by names or through the pool IDs. In the future, promotion of objects should also be possible, for the simple reason that there might be workloads where you would need certain objects to be present in the cache tier to improve the performance, so we would try to tackle that too in the next installments. The rules basically would be set for objects, buckets, pools, and a global one. The object...
D: If you want to specify a particular rule for an object, it would be through HTTP headers in RGW when you do a PUT. If the headers don't specify rules, then the rules set on a bucket, or a pool, or a global one, in this hierarchy, would come into effect, and the policy engine tags or stamps the objects. Now, again, pools are identified by name or ID, and if lookup of a particular pool name or ID fails, then currently we would resort to erroring out the request. In the future...
D: ...you could also define a default pool to which the object would go in case the pool name is not specified. Right, we would identify rules based on pools; there is something like, you know, the pool name, followed by the rule and the duration, which would say what's the maximum duration an object can live in.
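
A hypothetical illustration of the kind of rule and lookup hierarchy being proposed (object, then bucket, then pool, then global), with a target pool name and a maximum duration in the cache tier. The rule format and all names here are invented for illustration, not a committed design:

```python
RULES = {
    "object": {},   # would be populated from HTTP headers on PUT
    "bucket": {"photos": {"pool": "ssd-cache", "max_duration_s": 3600}},
    "pool":   {"rgw-data": {"pool": "base", "max_duration_s": 86400}},
    "global": {"pool": "base", "max_duration_s": 604800},
}

def resolve_rule(obj=None, bucket=None, pool=None):
    # Walk the hierarchy from most to least specific; the global rule is
    # the fallback when nothing narrower matches.
    for scope, key in (("object", obj), ("bucket", bucket), ("pool", pool)):
        rule = RULES[scope].get(key) if key else None
        if rule:
            return rule
    return RULES["global"]

print(resolve_rule(bucket="photos"))   # bucket rule wins over pool/global
```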
D: So that's kind of what we are proposing, where we would give the user much more fine-grained control: making sure that the objects exist in the cache layer for a certain particular duration, and after that you would move it to a particular pool, or, you know, even mark it for deferred deletion. That's our overall idea. Any questions, suggestions?
B: No, I think you did a pretty good job of explaining that. You know, the goal is to let people administratively control where the objects go. I think the two principal things that people would want to do would be to send a subset of their objects directly to the base tier, skipping the cache tier (you could see that being done for large objects, or objects with certain filename patterns; that was in the proposal), and the other thing would be to administratively expire objects out of the tier.
C: Yeah, so, well, that's another thing. I would like it if this stuff were also queryable around scrub, because it may be that we want to trigger other operations as well. Although, come to think of it, I guess object expiration specifically would be a scrub-level concept rather than a cache-tiering process, but only because you might well do object expiration on a pool that does not have a base tier and therefore no cache-tiering agent scanning in the background. That's not a big divergence.
C: Yeah, I'm not sure, but to the important part: there's no reason in principle why the OSD couldn't simply remove it, or the PG could initiate an asynchronous operation which the OSD does on its behalf that simply runs whatever RGW would normally do, that is, a registered RGW class operation that's capable of making object recalls. I don't think there's an inherent problem with having the OSD do that.
C: ...enough to get around. Oh, so, you're right, none of that exists yet. It's not really a blocker that the base tier, or that the cache tier, is always full. It's just a matter of: we set up an operation where the base tier notifies the cache tier that it wants it to pull in this object, and the cache tier does it. The other piece, that part, would be done in scrub.
C: Yes, there is, because checking the metadata involves talking to the replicas. That's why the cache tiering agent is separate from scrub. Which is not to say that you wouldn't necessarily also want to scrub, but it means that if you're just trying to apply policy decisions to the objects, you don't need to talk to the replicas, so you can just have the primary scan. So with a 3x replica pool, you do one-third the work, yeah.
C: Yeah, okay, there are a few other differences. Scrub goes to some effort to ensure consistency of the result, so it locks the extent of objects that it's scrubbing at any particular point, which you definitely don't want to do for a tiering agent, or for a policy agent, come to think of it. So that's an argument for not using scrub, yeah.
A: Sounds like a fair bit of quiet there, so, excellent. Did you guys want to run through Narendra's blueprint at all? Is there anyone here that would like to represent that, or chat through the cache-tiering efficiency of read-miss operations, or should we just let him catch up with Sage on the back end?
A: I guess that concludes it. So thank you, everybody, for another great Ceph Developer Summit. Obviously we'll continue these discussions on the lists and in IRC, and then obviously tickets and pull requests as normal. So if you have any questions about the summit, let me know; otherwise, the videos should be posted sometime next week, and we look forward to seeing you for the K release summit in a few months. Thanks, everybody.