From YouTube: Ceph Crimson/SeaStore 2021-07-07
A
Okay, it seems we have a quorum here. Let's start. Last week I've been working on some random cleanups in Crimson, and I also had some discussion about the extent allocation manager with Jihan on his PR. And that's me. Aaron?
B
Hey, hi. I was working on the next version of the patch for ioctl and fcntl support in Seastar, on the seastar-dev group. The next patch version has been submitted and is receiving some review comments. I have yet to start on the corresponding interfaces in the segment manager in SeaStore, so that work is still to be started. I'll just paste the link here as well. Thank you. That will be it from me.
C
I have a workaround for the add_pin problem, the LBA tree add_pin assert. That means one operation does the LBA tree split, and that split should happen without influencing the others. I mean, the find_hole operation gets the old extent content. At the time it does the find_hole, the extent has changed, and in the continuation, the find_hole continuation, the extent content is updated. That's very weird to me.
C
So the extent content is updated in the continuation, and sometimes, in the continuation chain, the other operations in the continuation get the updated extent content.
C
So the find_hole gets a wrong address. The address already exists in the LBA tree, since two of the find_hole operations on the LBA tree get the same find_hole address, and...
C
So that is the concurrency issue. There's a conflict when the concurrency increases: the current extent contents may be changed by other operations, and that influences this operation.
C
Yeah, when handling find_hole, they get the last iterator. The last iterator's max address is L_ADDR_MAX, but at that time the child node has changed and they insert another iterator after the end. So the first iterator's max is no longer L_ADDR_MAX, but it still gets the older content.
C
So why does the continuation get the updated extent? That seems weird to me, because the extent address we have already got is the older extent, yet in the continuation it gets the updated, new extent. And I found that when I added a log print in the find_hole continuation, that log was printed twice. But the continuation before this one is not re-entered; only this continuation and the following one print two times.
C
It does the repeat, but it should repeat from the beginning, right? I also added a print in the continuations above. There are many continuations chained together, and I added the print in the continuations above, but that print is not printed twice. Only the last two continuations print two times; the continuations above them print only once. So it does repeat, but not from the beginning. That seems weird to me.
C
Yeah, so I still need some time to consider how to solve the conflict.
C
This is a test result. This is a release build without log output, and it used a raw disk.
C
And with the ceph-osd pinned to one CPU, in cycles per operation Crimson is worse than the classic ceph-osd. Write latency is better and read latency is still worse. So you can take a look at the test result.
D
In our code, yeah. So a lot of CPU cycles are wasted because of the retries.
D
You mean visualization? No, there are transactions created, but in the middle they are invalidated by another conflicting transaction.
C
Yeah, and when I set the job number to one and the iodepth to two and ran the I/O for four minutes, it works, according to the result. When I increase the iodepth to 64,
C
I still hit the conflict, because the root, the LBA root, is split two times very close together. It means that the two operations both got the old extent and both think that the root is full, so they both do the split. So two operations do the split very close together, and there's a conflict in add_pin.
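The conflict described here, two operations reading the same full root and both deciding to split it, can be illustrated with a toy optimistic-concurrency model. This is only a sketch under stated assumptions: `Store`, `Conflict`, and `insert_with_retry` are hypothetical names, not SeaStore classes, and the "split" is reduced to resetting a list. It shows one way a stale reader can be detected at commit time and restarted from the beginning, so that only one split takes effect.

```python
class Conflict(Exception):
    """Raised when a transaction read data that changed before commit."""


class Store:
    """Toy versioned store; `version` bumps on every committed change."""

    def __init__(self, capacity):
        self.capacity = capacity      # max entries the toy "root" holds
        self.entries = []             # contents of the toy root node
        self.splits = 0               # how many times the root was split
        self.version = 0

    def begin(self):
        # A transaction remembers the version and contents it read.
        return {"read_version": self.version, "snapshot": list(self.entries)}

    def commit_insert(self, txn, key):
        # Reject a transaction whose view of the root is stale.
        if txn["read_version"] != self.version:
            raise Conflict("root changed since it was read")
        if len(txn["snapshot"]) >= self.capacity:
            self.splits += 1          # the root looked full: split it
            self.entries = []         # toy split: start a fresh root
        self.entries.append(key)
        self.version += 1


def insert_with_retry(store, key, stale_txn=None):
    """Insert `key`, restarting from the beginning on every conflict."""
    attempts = 0
    txn = stale_txn or store.begin()
    while True:
        attempts += 1
        try:
            store.commit_insert(txn, key)
            return attempts
        except Conflict:
            txn = store.begin()       # re-read everything, not just the tail


store = Store(capacity=2)
store.commit_insert(store.begin(), "a")
store.commit_insert(store.begin(), "b")   # the root is now full

t1 = store.begin()                        # both transactions see the
t2 = store.begin()                        # same full root
first = insert_with_retry(store, "c", stale_txn=t1)
second = insert_with_retry(store, "d", stale_txn=t2)
```

In this model the second operation pays for one wasted attempt, which is the kind of cost that grows as the queue depth increases.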
A
So, in other words, when you increase the concurrency, you still hit the problem which you tried to...
C
The concurrent operations got the same old extent, and they all think the extent is full and they do the split. So there are two new root extents.
C
They conflict, so I still need a simple method to solve the split, that is, to solve the conflict, because they influence each other. And then I'll do some PR. That's all.
A
Thank you, Jeremy. Redec?
E
I'm working on the MLease case. After more runs and more analysis, it turns out it's not only happening in the Stray state. It happens in almost every possible state of peering, which makes me think it's actually due to the message handling.
E
Here it happened in Reset, but what is interesting is the sequence of events that caused this particular crash. I extracted them from the large teuthology log.
E
It seems that the primary... okay, I believe there is some kind of race happening between the primary and two different instances of the same osd.3. Two instances because of the OSD restart, the respawn, that happens in the middle.
E
It's about the primary sending the PG lease, but the epoch is still 50, 55, the same as when the other instance of the PG lease message was sent to the first instance of the OSD. The same epoch is also present in the log when sending MLease to the second instance of the replica. In my understanding, if an OSD reboots, the cluster needs to agree on that.
E
But somehow this doesn't happen, and I'm looking for a possible explanation. One of the paths of investigation I'm following right now is about the handling of the MOSDBoot message, where there is a substantial difference between classic and Crimson: Crimson is perfectly unaware of something like a boot epoch.
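To make the suspected race concrete, here is a minimal sketch of how a boot epoch could let a peer tell two instances of the same OSD apart when the map epoch attached to a message is unchanged. Everything in it is an assumption made for illustration: `Instance`, `PeerTable`, and their methods are hypothetical names, and the real Ceph peering and messaging code is considerably more involved.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Instance:
    """Hypothetical identity of one incarnation of an OSD."""
    osd_id: int
    boot_epoch: int   # map epoch at which this incarnation booted


class PeerTable:
    """Tracks the latest known incarnation of each OSD (illustrative only)."""

    def __init__(self):
        self.latest = {}

    def on_boot(self, inst):
        # A newer boot epoch replaces the older incarnation.
        known = self.latest.get(inst.osd_id)
        if known is None or inst.boot_epoch > known.boot_epoch:
            self.latest[inst.osd_id] = inst

    def is_current(self, inst):
        # A message tied to an older incarnation can be recognised and dropped.
        return self.latest.get(inst.osd_id) == inst


peers = PeerTable()
old = Instance(osd_id=3, boot_epoch=50)
new = Instance(osd_id=3, boot_epoch=55)
peers.on_boot(old)
peers.on_boot(new)            # osd.3 respawned in the middle

old_is_current = peers.is_current(old)
new_is_current = peers.is_current(new)
```

If both incarnations reported the same value in place of a boot epoch, the two `Instance` objects would compare equal and this check could not separate them.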
E
But please take a look at the very last section, at the parameters that we are passing to the constructor of MOSDBoot. In Crimson, we call osdmap->get_epoch() twice, but in classic the second...
E
That's my guess, but, you know, it's pure speculation at the moment. It's a long shot.
A
I see, thank you. I'm wondering if we also need to be defensive about messages which might...
E
If you would like to harden the code and be ready for a misbehaving cluster, then yes. But my impression is that we trust the cluster, that other OSDs are...
E
And from the security point of view, we only allow the peering messages when talking inside the cluster with an authorized OSD, so a client shouldn't be able to reach that path.
A
Yeah, so crashing the application is probably the best way. Another approach is to add an assert to make it clear, but that's basically the same thing.
E
I see. We are considering adding such an assert, maybe to an OSD other than the one that actually sees the crash.
F
Last week I modified the extent placement manager PR to address the review concerns. Right now, the extent placement manager works this way: first, it writes all the extents to segments other than the ones used by the journal, and then
F
the journal record is constructed and persisted to the journal. That's the general approach by which the extent placement manager works right now. And the absolute address of the extents being written is determined at the time of writing, not at allocation time as before.
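The ordering described here (extents written out first, their addresses fixed only at write time, then the journal record built and persisted) can be sketched as follows. This is a simplified model under stated assumptions: `Segment` and `commit` are hypothetical names, addresses are reduced to `(segment_id, offset)` pairs, and nothing here reflects the actual SeaStore interfaces.

```python
class Segment:
    """Toy append-only segment; an address is (segment_id, offset)."""

    def __init__(self, segment_id):
        self.segment_id = segment_id
        self.blocks = []

    def append(self, data):
        offset = len(self.blocks)
        self.blocks.append(data)
        return (self.segment_id, offset)


def commit(extents, data_segment, journal_segment):
    """Write extents first, then persist a journal record naming them."""
    record = []
    for name, data in extents:
        # The address becomes known only when the extent is written.
        addr = data_segment.append(data)
        record.append((name, addr))
    # The journal record is built after all extent addresses are fixed.
    journal_addr = journal_segment.append(record)
    return record, journal_addr


data_segment = Segment(segment_id=1)
journal_segment = Segment(segment_id=0)
record, journal_addr = commit(
    [("x", b"aa"), ("y", b"bb")], data_segment, journal_segment
)
```

Because the record is assembled from the addresses returned by the writes, no extent needs a final address before it is actually written.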
D
Last week I identified a corruption issue in the write set, which was seen by Shahin, and I will add an issue about it to the Ceph tracker. I don't yet know how to fix that. There are multiple fixes for transaction isolation issues; I think there are three PRs and they are all merged. I'm currently working on performance profiling of SeaStore and trying to add metrics to SeaStore to help diagnose the performance. I'm still working on it.
A
Regarding the wrong behavior in the write set, I think you already recorded and fixed it, right?
F
Oh, I think it'll work just fine if we just don't modify the paddr field of the cached extent while it's still in the transaction.
F
Actually, this happens only in my PR, my extent placement manager PR, in which I modified the paddr field before the transaction commit. I've corrected it now, and it works fine.
A
Cool. Anything else?