From YouTube: 2020-02-11 :: Ceph Crimson Meeting
A
On the classic OSDMap handling path, I spotted that we are actually caching the map twice: the OSDMap itself and the bufferlist containing the raw OSDMap data. What is the reason for having two levels of caches, even though there's the extra cache level in ObjectStore?
D
I went through, like, the whole thing, but apparently I was muted. I'm pretty sure the bufferlists are incrementals. You can check me on that, but the answer is that if OSD 10 is on map 120 and it gets a message from OSD 5, who's on map 115, you can just go back in the incremental bufferlists and send those 5 encoded OSDMap incrementals instead of sending the full maps. I'm like 80% sure that's what it is, likely.
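A minimal sketch of that catch-up scheme, assuming a per-epoch cache of encoded incrementals; all names and structures here are illustrative, not the actual Ceph OSD code:

```cpp
// Hypothetical sketch: keep the encoded OSDMap incrementals per epoch,
// so a peer on an older epoch (e.g. 115) can be brought up to ours
// (e.g. 120) by sending five encoded incrementals instead of full maps.
#include <cstdint>
#include <map>
#include <vector>

using epoch_t = uint32_t;
using encoded_map_t = std::vector<uint8_t>;  // stands in for a bufferlist

struct IncrementalCache {
  std::map<epoch_t, encoded_map_t> incrementals;  // epoch -> encoded incremental

  // Collect the incrementals needed to move a peer from peer_epoch up
  // to our_epoch; a short result means falling back to a full map.
  std::vector<encoded_map_t> catch_up(epoch_t peer_epoch,
                                      epoch_t our_epoch) const {
    std::vector<encoded_map_t> out;
    for (epoch_t e = peer_epoch + 1; e <= our_epoch; ++e) {
      auto it = incrementals.find(e);
      if (it == incrementals.end())
        break;  // gap in the cache: caller sends a full map instead
      out.push_back(it->second);
    }
    return out;
  }
};
```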
D
Makes sense, but obviously nothing below that layer actually works. So if you wanted to test your implementation, we'd have to talk about what it's supposed to do and then probably build an in-memory mock for it. It's up to you, though. It'll get rapidly more complete over the next couple of weeks, I think.
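As a rough illustration of what such an in-memory mock could look like, here is a sketch behind a hypothetical object-store-like interface (not Crimson's actual store API):

```cpp
// Hypothetical in-memory mock of an object-store-like interface, in the
// spirit of the suggestion above; it only shows the testing shape.
#include <algorithm>
#include <cstdint>
#include <map>
#include <string>
#include <vector>

struct MockStore {
  using data_t = std::vector<uint8_t>;
  std::map<std::string, data_t> objects;  // object name -> contents

  void write(const std::string& oid, uint64_t off, const data_t& data) {
    auto& obj = objects[oid];
    if (obj.size() < off + data.size())
      obj.resize(off + data.size());
    std::copy(data.begin(), data.end(), obj.begin() + off);
  }

  data_t read(const std::string& oid, uint64_t off, uint64_t len) const {
    auto it = objects.find(oid);
    if (it == objects.end() || off >= it->second.size())
      return {};
    uint64_t end = std::min<uint64_t>(off + len, it->second.size());
    return data_t(it->second.begin() + off, it->second.begin() + end);
  }
};
```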
D
For one of the cases, it's because the interface that we created for the object store in Crimson doesn't have the flush mechanism that the classic one has, which is fine. So instead I just sent an empty transaction with the flush callback on it, which is semantically identical if you think about it, right: because you can't complete a transaction before the other transactions on the same sequencer complete, sending an empty transaction on the same sequencer with the callback you want is the same thing as a flush.
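A small sketch of that equivalence, assuming a sequencer that completes transactions strictly in submission order; the names are hypothetical rather than the real interface:

```cpp
// On an in-order sequencer, an empty transaction with a completion
// callback acts as a flush: the callback cannot fire before everything
// queued earlier has completed.
#include <deque>
#include <functional>
#include <utility>

struct Transaction {
  std::function<void()> on_complete;  // empty ops list == no-op transaction
};

struct Sequencer {
  std::deque<Transaction> queue;

  void submit(Transaction t) { queue.push_back(std::move(t)); }

  // The backend completes transactions in the order they were submitted.
  void complete_next() {
    if (queue.empty()) return;
    Transaction t = std::move(queue.front());
    queue.pop_front();
    if (t.on_complete) t.on_complete();
  }

  // "Flush": an empty transaction whose callback runs only after every
  // previously submitted transaction on this sequencer has completed.
  void flush(std::function<void()> done) {
    submit(Transaction{std::move(done)});
  }
};
```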
F
So the problem with the empty transaction is that, on the sequencer, the one that came in later was sometimes the first to come out of the queue. So I added a mutex there to guarantee first in, first out, and then it works. But whenever I rebase it to the latest Crimson master branch, I find the buffer free has some problem.
A
I was focused almost exclusively on the classic OSD, working on the memory corruption bug. My theory is that we have a synchronization issue with the OSD's osdmap field: it's a shared pointer accessed by multiple threads, very likely without any synchronization. I made a fix and am going through Teuthology testing to verify whether this was the root cause or not.
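To make that class of bug concrete, here is a minimal sketch of guarding a shared_ptr field that multiple threads read and update; it is illustrative only, not the actual OSD code or fix:

```cpp
// A shared_ptr field read and written by multiple threads is a data
// race unless access is synchronized.
#include <memory>
#include <mutex>

struct OSDMapStub { /* decoded map state would live here */ };

class MapHolder {
  std::shared_ptr<OSDMapStub> map_;
  std::mutex lock_;

public:
  std::shared_ptr<OSDMapStub> get() {
    std::lock_guard<std::mutex> l(lock_);
    return map_;  // take a reference under the lock, then use it freely
  }

  void set(std::shared_ptr<OSDMapStub> m) {
    std::lock_guard<std::mutex> l(lock_);
    map_ = std::move(m);
  }
};
```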
D
Okay. If you want to create a Google Doc, you can put your questions in there, and then we can do the sort of chat thing on the side. That way other people can see the question-and-answer process, and we can kind of turn it into a bit of documentation as we go. It doesn't have to be a lot; it's not like we're trying to write a lot down. It's just a way to make it easy for people to track the progress.
D
Just one suggestion: you might want to simply try commenting out the part where you add the PG log entries to the message and see what that does to the overall time.
E
I also suggest you try rerunning the perf test, because recently the perf test has not been quite stable. For example, if we add some non-functional code in the path which is not critical to the I/O, that could also fail the perf test. So normally I just rerun the perf test and see if it makes a difference.
C
Just to try to see what we can get: whether we can get better performance by removing another copy in the native stack, because currently our native stack performance numbers include copies in both read and write to the network layer. So I'm trying to see how much better we can get if we disable those two copies.
D
There are two problems with that. One is that we obviously need to commit all of the writes associated with a transaction atomically, so during a replay we can't replay half a transaction; we have to replay the whole thing. But the second part is that we interleave reads and writes. Imagine updating a whole bunch of different omap values in an RGW bucket index, which is a real thing that really happens: that might involve a bunch of temporally separated reads across a bunch of B-tree pages.
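A toy sketch of the atomicity half of that, under the assumption that mutations are staged and applied together; the in-memory types are stand-ins, not the real store:

```cpp
// Every mutation in a transaction is staged and applied together, so
// replay can only ever observe a whole transaction, never half of one.
// In the real system the reads feeding these writes may be temporally
// separated B-tree page fetches.
#include <map>
#include <string>
#include <utility>
#include <vector>

using Store = std::map<std::string, std::string>;

struct Transaction {
  std::vector<std::pair<std::string, std::string>> staged;

  void set(std::string k, std::string v) {
    staged.emplace_back(std::move(k), std::move(v));
  }
};

// Apply everything or nothing: a record that was only partially written
// is discarded on replay rather than half-applied.
void commit(Store& store, Transaction&& t) {
  for (auto& [k, v] : t.staged)
    store[k] = std::move(v);
}
```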
D
It's a classic concurrency problem, and it's true in Crimson even though we don't have multiple threads. The reason is that we need to send the disk a lot of reads in parallel, and in the common case you don't really have reads touching the same B-tree pages, because the OSD already prevents concurrent writes on the same object. So it won't normally happen. But if you caused enough merges or splits propagating up the tree, hypothetically it could, because everyone has the same root, right.
D
I'm not entirely sure that actually avoids the problem, because you may already have performed the read at a leaf; you don't necessarily know which insertions you're doing at what stage. There are many ways to slice this problem, but this is the fundamental problem.
D
Well, it's not a lock. It means that the pending transaction we're doing right now has its own copy of this physical block. When we go to commit the transaction, we'll do a bunch of things. One of them is: we will actually write the block, or rather the delta for the block, out to disk, right, but we'll also insert the block back into the cache. And while it's just ours, it'll be in state pending, which means that if we try to mutate this block again in the same transaction, we won't do a mutation; we'll just directly change it, because we already have a copy. When we put it into the cache, it'll be in a new state.
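A compact sketch of that copy-on-first-write "pending" idea; the state names and helper are hypothetical, not Crimson's actual cache code:

```cpp
// First mutation in a transaction copies the block and marks it
// PENDING; later mutations in the same transaction edit that copy
// directly. Commit would write the block (or its delta) out and
// re-insert it into the cache.
#include <cstdint>
#include <memory>
#include <vector>

enum class extent_state_t {
  CLEAN,    // matches what is on disk
  PENDING,  // private copy owned by an uncommitted transaction
};

struct CachedExtent {
  extent_state_t state = extent_state_t::CLEAN;
  std::vector<uint8_t> data;
};

std::shared_ptr<CachedExtent> get_mutable(std::shared_ptr<CachedExtent> ext) {
  if (ext->state == extent_state_t::PENDING)
    return ext;  // already our private copy; mutate in place
  auto copy = std::make_shared<CachedExtent>(*ext);
  copy->state = extent_state_t::PENDING;
  return copy;
}
```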
D
Not necessarily, no; we might just be writing a delta down. So if I'm changing block address 20,000, I could write a new copy of it out and then change everything that points to it, or I could just write a delta that says: 20,000 now has a different value for whatever B-tree key, right. Which means I have a dirty copy in memory, but the value that it now has in memory will be wrong.
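A sketch of what such a delta could look like, as opposed to rewriting the block and everything that points at it; the layout is hypothetical:

```cpp
// Instead of relocating block 20000 and updating everything that
// points at it, journal a small record saying which bytes changed.
#include <cstddef>
#include <cstdint>
#include <vector>

struct Delta {
  uint64_t block_addr;             // e.g. 20000: the block being amended
  uint32_t offset;                 // where in the block the change lands
  std::vector<uint8_t> new_bytes;  // replacement bytes for that range
};

// Replay mutates the block in place, so no pointer to block_addr moves.
// Assumes the delta fits within the block.
void apply_delta(std::vector<uint8_t>& block, const Delta& d) {
  for (size_t i = 0; i < d.new_bytes.size(); ++i)
    block[d.offset + i] = d.new_bytes[i];
}
```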
D
It won't be correct until the transaction completes and I know what address it's at. And that's actually what I was working through this week.
D
Until last week, I had been assuming that I'd be able to predict where a journal append would show up if I controlled which open transactions existed. But for one thing, that turned out to be shockingly hard to do from a concurrency standpoint, and the other problem is that I was talking to Myoungwon Oh at Samsung, and he was saying that really, for their drives at least, you really want to use the append primitive, but then...
D
So I don't know the address when I'm sending a thing down to disk. The way I'm fixing that is: if you think about it, there are like two basic kinds of things I could be sending to disk. There are things that come from above the transaction manager layer; those do not point to actual physical addresses, so those are fine. I don't care where they go, because nothing else can possibly know what their physical address is anyway.
D
But the internal B-tree nodes for the LBA tree are physically addressed, so I care very much what their internal pointers are. So the way this works is: any B-tree node that gets written to a record will have two kinds of values. It will have values that are absolute, which point to physical addresses that are already stable, or it will have values that are relative, which point to a relative offset from that block.
D
In other words, I can't assume where it will land, but if I send down a megabyte of data, the pointers within it can be relative offsets to any bytes within it, because that megabyte of data stays together. So if I send down ten blocks that are 4K each, and I know that the fourth block refers to the fifth block, I can just let the value in that block be one, or "the next block", or 4K, whatever, right.
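A tiny sketch of those two kinds of values, assuming relative pointers are resolved once the record's final position is known; the representation is hypothetical:

```cpp
#include <cstdint>

struct paddr_t {
  bool relative;   // true: offset from the referring block's own address
  uint64_t value;  // absolute address, or the relative offset

  // Once the record lands and the referring block's address is known,
  // relative values can be fixed up into absolute ones.
  uint64_t resolve(uint64_t block_base) const {
    return relative ? block_base + value : value;
  }
};

// Example: if ten 4K blocks go down in one record and the fourth block
// refers to the fifth, its pointer is just paddr_t{true, 4096}, no
// matter where the record eventually lands.
```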
D
So that's why the cache keeps getting more complicated: I'm trying to give it enough states to express this. It's kind of what I was saying earlier about why I didn't think the B-tree for the physical layer would look much like the B-tree for the logical layer. The problem space is just... it's a giant headache. You can kind of see it in the design...
D
...because of the way the physical addresses work. Does that make sense to anyone else, or did anyone want me to go over something differently? I will commit this to a document of some sort once I have at least three things that all make sense together, hopefully in the next week or two. That basically means I'm going to write the LBA tree implementation in terms of the cache interface, and that should tell me what the cache has to do; once those two things agree, I think everything else will also agree.
A
We actually have the PRs for the buffer factory: one commit for classic and one commit for Crimson. Maybe, if you want to focus on the performance of things like this from the very ground up, we should consider merging those without waiting, and get the buffer factory modifications committed upstream.
D
I think they're independent, except for the fact that they both involve profiling, and I think we're going to be profiling a lot of stuff over time. I really don't want to block the PG log stuff at all. I think the patch looks about right, so I kind of just want to figure out why it's slow.
D
Oh, I think it is correct. I don't know if the exact numbers are accurate, but it doesn't really shock me that it's significantly slower. There are a lot of opportunities that we're inheriting from the classic OSD to burn some CPU, particularly when we're encoding the PG log entries. So I think it's more just: we need to identify why it's happening.
A
It might be okay, but still, I'm not sure whether these numbers would be meaningful. I mean, if you have plenty of syscalls on a write path, they could have side effects far away from the place where they are issued. That concerns not only the direct cost, but also indirect costs, affecting caching efficiency and things similar to that, I mean.