From YouTube: Ceph Developer Monthly 2022-12-07
Description
Join us on the first Wednesday of every month for the Ceph Developer Monthly meeting
https://tracker.ceph.com/projects/cep...
B
Okay, so I am a graduate student from Northeastern University, studying information systems. Today I'll be talking about Dependabot. So what is Dependabot? I'll just briefly explain: it is basically a tool which helps our project by watching for updates, to help our project have secure dependencies.
B
So why do we need it? What is the current situation in our Ceph project? For example, say a developer is working through an error and spends the entire day on it, and at the end of the day he understands that the error was basically due to a version not being updated: the project was working on an older version of a dependency, which was eventually giving that error.
B
So the developer has spent the entire day, and it was not needed. What if this process of keeping updated dependencies in our project was automated? How do we go about this? We'll basically need a list of the current dependencies and their versions. How can we get that? It can be through a spec file.
B
For Python, we can get it from requirements.txt, for the package manager pip. In the same manner, for npm we can get it using package.json, for JavaScript, TypeScript or Angular. Also, for Java-related dependencies we have pom.xml, or go modules for the Go language, and the listing also includes the dependencies' origins.
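For example, a pinned requirements.txt simply lists each dependency with its version; the packages and versions here are just illustrative:

    # requirements.txt -- illustrative packages and versions
    requests==2.28.1
    flask==2.2.2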
So why would we need to list the dependencies?
B
These listings help us understand the current version that we are using, and we can also check whether there is a newer version of a particular dependency. This helps the developer save a lot of time. Also, if there is any current dependency which is not secure to use, we can check that too. So here comes Dependabot.
B
So what does Dependabot do? It basically keeps your dependencies updated, to keep our software secure.
B
How does it do that? It provides two features: version updates and security alerts. It checks packages for these features in Ruby, JavaScript, Python, PHP, Elixir, Go, Rust, Java and .NET.
So what are the features of Dependabot? What does it provide? It basically provides version updates for our dependencies.
B
It also provides security alerts for any vulnerable dependencies that are in our project. So let's go with the first feature, and that is version updates. What does Dependabot do in its version update feature?
B
Under normal circumstances, if a developer comes across a dependency which needs to be updated to a newer version, the developer will manually go and correct the dependency, create a PR, and go through the whole process. So Dependabot comes into the picture: it will check the list of dependencies that are in a project and, if there is a new version for a particular dependency, it will update it and also create a PR for that project.
B
So all that effort is saved for the developer. Here's a visual representation of the PR that Dependabot creates for a version update. We basically have a bump for this particular dependency, node, from 12.12.21 to 18.8, and it also has reviewers for the particular PR.
B
So this is the actual PR. We have the commit message, and the commit message can be configured in a Dependabot configuration file: the dependency node being bumped from such-and-such version, and also where it is located. If any developer needs to go and verify all these changes, he can check it through the location. Also, through the description we can understand
B
what additional features are in the new version and whether they are needed for our project. There are also reviewers, so we can add the reviewers for the particular ecosystems, package managers or language-related projects ourselves.
B
So moving forward, we have the next feature, and that is security alerts.
B
So if there is any dependency in our project which is vulnerable and not secure, our project might be prone to a malicious attack because of that dependency. What does Dependabot do here? It scans all the dependencies in our project, checks if there are any dependencies which are vulnerable or not secure, and sends an alert in GitHub.
B
It also suggests a fix for that: it will create a PR for the closest non-vulnerable version of that particular dependency, so that we can move our dependency to a safer version.
B
So here's the visual representation of the alerts that Dependabot creates. We can basically sort these by severity: critical, high, moderate. So when a developer goes through these lists, he can sort them and give preference by severity.
B
So moving forward, we look at the implementation. The implementation is simple and easy. We need to enable Dependabot for GitHub, which is already done for the npm package ecosystem and for GitHub Actions. Then we configure Dependabot for all the dependency ecosystems that are required for version updates. An ecosystem is basically a package manager: for example, your package ecosystem is npm, or the package ecosystem is GitHub Actions.
B
So in Ceph, currently we have already implemented the ecosystems for npm and GitHub Actions. Also, for getting the security alerts, we enable it in our GitHub organization. Moving forward, we configure Dependabot using dependabot.yml in the .github directory.
B
So the configuration consists of the package ecosystem, that is pip, and the directory where we want these checks to be done. We can also schedule this, for example on a daily basis, or give a specific time; the intervals can be daily, weekly or monthly. As said previously, for the version update PR we can even configure the commit message and add a prefix, like the mgr/dashboard prefix we had for the dashboard-related project. Then we can add reviewers as well.
B
We can also limit the number of pull requests for that particular ecosystem. So we can have 20 open pull requests for pip for this particular directory location.
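Putting those pieces together, a dependabot.yml along these lines might look like the following sketch; the directory path and reviewer team are placeholders, not the actual Ceph entries:

    version: 2
    updates:
      - package-ecosystem: "pip"
        directory: "/src/pybind/mgr/dashboard"   # placeholder path
        schedule:
          interval: "daily"
        commit-message:
          prefix: "mgr/dashboard"
        reviewers:
          - "example-org/dashboard-team"         # placeholder team
        open-pull-requests-limit: 20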
B
So
with
all
these
fascinating
features,
what
what
can
be
the
challenge
in
implementing
this
in
our
project?
So,
as
you
see
like,
there
are
56
open,
PR's
and
153
closed
PRS,
so
any
developer
work
working
on
depend
about
would
get
like
on
daily
basis.
B
B
So the solution is that we can basically add reviewers. Currently in our project we have code owners, so if we are making any changes in the dashboard directory, then through code owners only the dashboard-related team gets the alerts about these and can review the PRs and approve them. Also, if needed, we can add reviewers explicitly: we can add a team, we can add a username, or we can add the organization.
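As a sketch, a CODEOWNERS entry of that shape might look like this; the path and team name are illustrative, not Ceph's actual file:

    # .github/CODEOWNERS -- illustrative entry
    /src/pybind/mgr/dashboard/  @example-org/dashboard-team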
B
Thank you, and if there are any questions, I'm open to answering them. Also, the mentors in my project who helped me through this are Chris, my manager, Christina, Justin and Gabriella.
C
I have a question for you. When you were working on your internship, what was the hardest part about it, and what was the part that you liked the most?
B
At the beginning, I would say, working on Dependabot was one of the interesting things, as I didn't know much: I just had the problem statement, that I needed to figure out how I can update the versions, and I went at it from scratch, like, how can I go about it?
B
How do I get the dependencies, know the version numbers, and how can I compare them? In understanding this, I understood how I can go about it, and I looked around on GitHub at other contributors and what they are doing about it, to see if I could understand what they are doing and apply it in my project. So that was the most interesting part.
C
That's great to hear, and I think in the project we've got, we had Dependabot in some areas and not others.
B
Yeah, we had Dependabot for the dashboard-related project, that is, basically for npm, but we can add it for others too, like for Python. There was specifically one PR that I saw where there was an update for Fedora, and if we could get that kind of update, like any OS-related major updates, it could be really helpful for our developers.
D
Regarding the scrub scheduling: the idea is to dedicate some 10-15 minutes to describing how scrub scheduling is handled in the existing code, and why the existing code is built the way it is, and then what the changes are that I'm trying to push for approval and what the benefits of those changes will be. Most of what I'll be describing is now in PR 49237.
D
So let's start with a few minutes of perspective: this is the way scrub scheduling is implemented today. Everything is actually driven by the OSD tick. Every tick, the OSD queries the scrub queue and looks for the first PG that is scheduled to be scrubbed. The PG holds, as part of the PG class, the scrub-related data that controls the scheduling and controls the behavior of the scrub; most important is a set of flags.
D
There
was
probably
you
know.
The
names
there
were
there
was
a
set
of
a
large
set
of
flags
like
wreck
scrub.
Mask
scrub,
must
deep
scrub
Etc,
which
controlled
when
when
the
scrub
will
be
performed
to
some
extent
and
what
type
of
scrub,
what
level
whether
it
will
be
a
shallow
or
a
deep
scrub
and
controlled
some
other
parameters.
We
talked
about
it
later
in
more
details,
Okay,
so
I
started
working
on
scrub.
D
This was one of the heavy issues that I encountered. The first step I did was to separate those flags into two groups. One group was the set of flags that controlled the next scrub, the planned scrub, and the others were made part of the PgScrubber and were affecting the currently running scrub, the current scrub session. And there is a point in time, when the scrub starts, when this set of flags is frozen.
D
Now
this
this
is
the
one
side
of
the
change
regarding
the
scrap
scheduling
a
in
the
first
phase,
nothing
was
changed
and
then
I
took
up.
I
took
out
the
scrap
queue
from
within
the
OSD
made
it
separate
entity,
or
is
it
each
of
his
dinner
owns
a
scrap
queue
but-
and
the
idea
is
mostly
the
same-
there
is
a
queue
of
on
pgs,
but
there
was
an
added
a
structure
that
was
added,
which
is
the
scrub
job
in
the
middle.
D
The main points of the existing implementation, two points to note. Take a look at what happens when trying to schedule a scrub. Like we said, the OSD, once every second, or almost every second (there is some randomness), asks the scrub queue for the topmost PG to scrub, and then makes a just-in-time decision: what level of scrub is required?
D
The level determination is based on the specific flags in the PG, that planned-scrub set we discussed, and some environment variables; for example, are we allowed to perform a deep scrub, are we allowed to perform a shallow scrub, what time it is, and so on. All these decisions are made when we initiate the scrubbing of a specific PG. Now, this by itself is a problem, because it means that we can never, or at least not always, tell the user what will happen next.
D
So, every tick, or every few ticks, we are selecting the first PG in the queue as a candidate. We then perform some validation checks: is the PG still there, is the PG still active, do we have the environment and the configuration, etc., to start scrubbing this PG? If all is okay, the PG then tries to initiate the scrub by first reserving the replicas, and this is an important step: the primary OSD requests all the replicas to assign resources, and a resource here is simply a counter within the active OSD.
D
That's the basic idea, and everything is fine unless there is an error. Now, there is a problem here: suppose a PG is the topmost in the queue and we try to scrub it, and then it fails to reserve the replicas; we'll discuss some reasons in a minute. But what will happen?
D
We will not be trying to initiate a scrub on any more PGs in that specific time frame, and because trying to reserve the replicas and being rejected might take time, might take seconds, we are "wasting", in quotation marks, more than one tick before we understand that this PG has failed in securing resources. Okay, suppose that happened. What happens the next tick? If that PG is still the topmost, we will again try to initiate the scrub on that specific PG.
D
Okay, so this is the state as it was, and mostly this is how things are today, and it's pretty good; it works pretty well, apart from some issues. It isn't perfect.
D
An OSD can have one of its resources locked for a long period of time, and during that time that OSD has reduced capacity to answer other requests for scrubbing; for example, for those PGs for which it serves as a replica, not as a primary.
D
Just to mention some of the problems: for example, if we have a group of PGs that cannot be scrubbed but are high in the queue, we are wasting effort in trying to initiate the scrub on them. Now, it's not that we are wasting a lot of CPU, but mostly, I think, it's a lot of logs that we are creating, especially in debug modes, and this is wasteful and disturbing.
D
Each OSD makes its own decisions regarding those PGs for which it is the primary, and there is no central authority which decides which PG should be scrubbed at any specific point in time. Now, we wish to keep it this way, to keep the scheduling distributed.
D
I know we have customers, Ceph users, that build their own central scrub management on top of what we are doing here, and one of my goals is to create a system which will make this central system unnecessary. I'm trying to cover most, 90 or more percent, of what might be achieved with central scheduling, while still maintaining the distributed scheduling.
D
Another is the issue of cluster-wide effects of problems. Like I said, I gave an example of what happens when one scrub on one PG, possibly because of one object, has a problem. It is an issue, and there is an issue that is not fully investigated but happens: we have clients, Ceph users, that see a long tail of PGs that are never scrubbed.
D
I have some theories about that, and even some reasons that are already known, but you can see what might happen here: suppose you have a group of PGs that, for some reason, constantly fail. If this group is large enough, it might mean that some PGs are starved and never get scrubbed. And the last two issues: observability.
D
If you see my pointer: a scrub target has an urgency, which we'll talk about in a second; a target time and a deadline, which are the same as what we have now; and a not-before time, which we'll describe in a minute. Each scrub job, which means each PG, through its PgScrubber, holds two scheduling targets: a shallow target and a deep target. And the main change is that the OSD scrub queue is now composed of those targets.
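As a rough sketch of the shape being described; the type and field names here are illustrative stand-ins, not the actual code in the PR:

    // Illustrative sketch only; names do not match the real Ceph code.
    #include <cstdint>
    #include <string>

    using utime = std::int64_t;    // stand-in for a timestamp type

    enum class scrub_level_t { shallow, deep };

    struct SchedTarget {
      scrub_level_t level;
      int urgency;        // see the urgency hierarchy discussed below
      utime not_before;   // earliest time we may try this target
      utime target_time;  // when we would like the scrub to happen
      utime deadline;     // past this, the target counts as overdue
    };

    struct ScrubJob {
      std::string pgid;     // e.g. "1.7"
      SchedTarget shallow;  // one target per scrub level
      SchedTarget deep;
    };
    // The OSD scrub queue now holds these targets (two per PG), not PGs.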
D
Okay, take a look at the left of this slide: this is how a scrub queue might look. You have entries which specify the PG and the level. So in this specific example, PG 1.7 appears twice: the shallow scrub is first, because of whatever parameters and timestamps are relevant for that PG, and the deep scrub for that specific PG will be later in the queue.
D
This change by itself helps observability, because now, when you look at the scrub queue, and we will see how it looks in the logs and in the dump commands, you can clearly see what types of scrubs are planned. And we'll see in a minute what other types of information will be available to make the scrub queue easier to parse for the user.
D
We talked about the flags. To remind you: there is a set of flags, now in two groups. One is the requested or planned next scrub, which defines how we want the next scrub on a specific PG to behave; for example, should it be a deep scrub, must it be a deep scrub, should it be a high-priority scrub, etc.
D
Among other things, it's a way to specify how fast or how urgently we want this specific scrub. And because we have two targets, we have separate urgencies for the shallow and the deep targets of each scrub job; so we can specify how urgent the shallow or the deep target is for each scrub job.
D
Okay, but we also have some other urgencies. For example, "penalized" is the replacement for the implementation of the penalized queue which we saw earlier; now we don't need it, it's just one more urgency, with some specific logic around it. In the same hierarchy of urgencies we have "overdue", which is another behavior, another piece of functionality, that is currently handled with flags. Overdue means that we are beyond the deadline for a specific target and, again, because we have two different targets, the shallow and the deep, we can track that per target.
D
We
can
do
that
and
then
we
have.
The
higher
priority
agencies
operator
requested
must,
which
is
specific
to
something
that
we
have
now,
which
mean
which
is
a
after
a
request
from
the
operator
with
referral.
Mostly.
This
is
mostly
why
we
have
mass
and
we
have
the
after
repair,
which
is
an
immediate
type
of
scrub,
of
deep
scrub.
That
is
both
of
which
is
performed
after
a
repair
in
some
instances
and
should
be,
and
should
be
scheduled
immediately
after
the
repair,
and
that
is
why
it's
given
the
highest
priority.
D
Now we don't need those flags, because the functionality is encompassed in the "must" and "after-repair" urgencies in the urgency enum that I showed you, and the same holds for the other flags that you see: none of them is needed anymore, which means that a lot of code and a lot of areas of ambiguity were removed and improved.
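A minimal sketch of such an urgency hierarchy, with illustrative names and ordering; the real enum in the PR may differ:

    // Illustrative only; ordered from lowest to highest priority.
    enum class urgency_t {
      penalized,           // replaces the old penalized queue
      periodic_regular,    // a normal periodic scrub
      overdue,             // past the target's deadline
      operator_requested,  // an operator-initiated scrub
      must,                // the old must_scrub / must_deep_scrub flags
      after_repair,        // deep scrub right after a repair; highest
    };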
D
This is what we said earlier: suppose a PG has failed, failed to start scrubbing or failed during the scrub, for whatever reason. This enables us to set a new not-before some time, say 6 or 10 seconds, in the future, with the delay depending on various factors; for example, what was the reason for the failure?
D
How many other PGs are waiting, etc. But it at least enables us to set a time and make sure that we will not retry this scheduling target before it; and, at the same time, it enables us to keep the urgency, the target time, the deadline, all those parameters that say how badly we need this scrub, while the not-before says when we will be allowed to retry it.
D
Now, a few points that I want to make if anyone will be looking at the code and trying to understand it. Let's take a look at the scrub queue. The scrub queue is ordered first by those targets that are ripe, which means their not-before has arrived; the ripe ones are sorted by urgency, then deadline, then target time, etc., and all the rest of the targets, those that are not ripe, are sorted first by their not-before.
D
A point just to mention here: the next-shallow and the next-deep. There are many instances where a change is made to a target; for example, when a scrub terminates or ends, if a configuration changes, or if the operator issues a command. In all those cases a target is modified; but which specific target is modified depends on whether we are currently scrubbing, currently using that specific target for scrubbing. To allow this, we have some kind of double buffering, with next-shallow and next-deep entries managed by the scrub job. Most of the code is not aware of this: most of the code just asks "give me the shallow target that is modifiable" and will usually get back the main shallow target; but in one case, the only case, which is if we are currently performing a shallow scrub of that specific PG, the modifiable target will be the next-shallow.
D
Okay, I don't see an option to show it, but anyway: if you look at, for example, the PG, you can see the shallow target parameters in the left columns and the deep target in the right columns, and you see a reference to the nearest of the two.
D
This is one example of a listing, and here is an example of how things might look in the logs. Look at the first line: this is how a scrub queue might appear in the log.
D
It's not ripe; the closest target is on the 7th of December, etc. And we have the closest target here; it is a shallow one. The not-before is whatever it will be; it is a periodic, regular urgency; and here are the rest of the parameters. And there is a "last issue" field, which might say why we failed the last time, if we failed scheduling this specific target.
D
Below that, it used to be the old flags, and once those flags were segregated into two sets, like I said earlier, it wasn't very easy to understand what's going on with them. Now I added a clear depiction of what the current scrub or the next scrub will be. For example, take a look at the first and second lines: PG 2.5 is currently active and clean.
D
Now, this is in the final stages of testing; I hope to merge it in the next couple of days, so this specific issue is halfway to being handled. And we had the issue of scrub resources locked by jobs that are blocked, which I mentioned earlier. What we already have, and it is a quite recent change, is a warning in the cluster log; beyond that there isn't a lot.
D
There
is
a
fpr
in
the
works
which
I
I'm
not
sure
what
will
be
I
will
be
able
to
make
get
ready
in
in
time
for
with
which
will
react
to
those
to
such
a
locked
or
would
have
blocked
for
whatever
reason,
scrubs.
A
I think the question I'm trying to get to is: you earlier mentioned that you're wasting a tick cycle, or more than one cycle, if something gets preempted, right? How is the urgency class, or whatever the enum is, solving that problem?
D
Okay, the urgency by itself does not; I mentioned that just as one of the problems, not the only problem we have. So, to give the longer version of what I'm solving here: by using the not-before, I'm solving some of the problem of the long tail that is caused by such failures. This does have an effect on what the chances of starvation of PGs will be.
D
I am hoping to use the same mechanism, the same enum, to handle other types of runtime or scrub-time failures, but yeah, the main change in solving this problem is the not-before.
D
I'm not adding data to this, okay, in one second. And before that, there is one thing, you know, which I didn't include in this presentation. Greg, regarding what you asked earlier: one of the next PRs that I'm working on was specifically regarding the issue of replica reservation.
D
I am exposing the data, because it seems that the clients want to know why things are scheduled or not; there are a few knobs that are added.
D
It is exposed in the query: wherever I can send JSON, I can add data, if you consider the query a user or operator option, okay.
D
I can't give you a full answer. The idea was just to add it, but I wasn't able to fully keep what you had in a query request.
H
Well, okay, so let me... I guess what I'm getting at is that it seems like you've added several new data structures, but the important parts, as I understand it, are that it's got the not-before field, which is used in ordering the PGs that get chosen to scrub, and that all the flags, which were mostly exclusive of each other, although I'm not sure if they all were, got collapsed into a single enum. And that's, you know, that's fine, if that makes it easier to reason about and improves the scheduling.
H
But our more advanced users have a lot of knowledge that they've, you know, developed about how scrubbing works and, as you referenced, how to control it; people have built their own scrub scheduling engines that run on outside stuff. And so if this is, at the moment, just these little things, then that's one thing; but if we're expecting to massively change the way scrubbing works, then all their learned knowledge and encoded knowledge, and their algorithmic engines, break.
D
Priority in reserving: that one is in this round, but it would be hard to get it in.
C
I jotted down some notes in the Etherpad, but if there's anything else we want to remember, here's the Etherpad to add to. I wrote a note that we should add a release note about anything that has changed meaning.
C
Thanks, Ronen. The next topic on the agenda is PID assumptions. There have been some ongoing conversations in the RADOS team about what we should assume in terms of PIDs. I don't know if there's any one person who wants to kick off the conversation, but I'll leave it up to whoever wants to start.
A
So yeah, this is just a general topic for discussion, based on something we saw in a recent case that came in. Probably Sridhar, you have the background on what was going on; maybe you can set the context and then we can open it up for discussion.
L
Yeah, sure. Can you guys see me? Yes? Yeah. So, like I mentioned, this specific issue surfaced when one of our customers was trying to run a scenario involving restarts of all the Ceph daemons. The scenario basically involved graceful and ungraceful restarts of all the Ceph daemons, and the expectation was that, after the restarts, everything comes up fine and then the cluster connectivity and all the rest happens properly.
L
But during these tests, it was noticed that a bunch of OSDs were not coming up fine: they transitioned into the booting state. As you know, the OSD essentially transitions from init to booting and finally to the active state after a restart. A bunch of OSDs were noticed to be stuck in the booting state forever, leading to the mons eventually marking all those specific OSDs down after a time period of around 900 seconds.
L
So this is the observation, and this issue was consistently reproducible in the customer's environment, so we requested QE to reproduce it locally and they were able to do that as well. Essentially, what was happening was: this is a containerized environment, and the customer was running ODF 4.1, if I'm not mistaken. From some analysis of the logs it was pretty clear what was wrong with the OSDs that were not coming up.
L
We essentially expect that the OSDs would have a PID value of one; that essentially tells Ceph that the OSD is running in a containerized environment, and then it goes ahead and generates a random 64-bit nonce value. That nonce actually helps Ceph to figure out the incarnation of an OSD that goes down and comes up. So essentially, in this case, what was happening was that for a bunch of OSDs the nonce value didn't change across reboots.
L
As a result, the OSDs that came up were marked as dup boots and essentially never really transitioned into the active state. It was through Greg's input that we finally figured out the issue with the PID values, and we essentially got back to the customer to recommend that, I think there is an environment variable called CEPH_USE_RANDOM_NONCE, either that should be enabled, or the PID of the Ceph daemon should be set to 1.
L
So
essentially
that
was
the
issue
and
right
now
we
don't
understand
how
this
PID
value
is
getting
changed
to
the
PID
of
the
OSD.
That
is
not
expected,
but
at
least
we
have
a
workaround
where
we
can
set
specific
environment.
Variable
called
refuse
random
nodes
that
should
help
the
osts
to
come
up
with
a
new
new
nonce
and
and
stuff
identity,
ideally
identify
that
the
this
is
a
new
incarnation
of
an
exist
University
and
that
should
eventually
help
the
USD
go
into
the
active
state.
A
Actually, I pasted a couple of things in the chat to provide context in terms of code, around what assumptions are present and what assumptions were broken. It seems like the assumption in Rook is that the PID is always going to be one, but that clearly was broken in this case, because of which we encountered other issues. But I think, Radek, you also had an experience of something like this in Crimson; maybe you can elaborate on that and then we can open it up.
J
Yes, I encountered this issue when porting Crimson to Rook. The main symptom was problems with a lot of misdirected, mishandled ping messages.
J
This was because of the nonce selection. Basically, one of the responsibilities of a messenger is to translate each identity into a couple of network parameters, like IP address and port, and also the nonce; we use nonces to distinguish between different instances of the same daemon, basically to detect restarts. And in Crimson the nonce selection logic was depending on the PID.
J
If I recall correctly, the assumption for containerized environments was that we randomly pick a huge integer if our PID is equal to one, in other words, if we are init; but in the Rook environment that assumption failed.
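A sketch of the selection logic being described; this is just the gist, not the exact Crimson or Ceph code:

    // Illustrative sketch of the nonce-selection assumption discussed here.
    #include <cstdint>
    #include <cstdlib>
    #include <random>
    #include <unistd.h>

    uint64_t pick_nonce()
    {
      const bool containerized = (getpid() == 1);  // the broken assumption
      const bool forced = std::getenv("CEPH_USE_RANDOM_NONCE") != nullptr;
      if (containerized || forced) {
        // Random 64-bit nonce: every restart gets a fresh identity.
        std::random_device rd;
        return (uint64_t{rd()} << 32) | rd();
      }
      // Otherwise reuse the PID; this fails when the PID repeats
      // across restarts, as happened in the Rook environment.
      return uint64_t(getpid());
    }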
A
So I guess the general idea is, with this new discovery: earlier, cephadm, in the PR that I just pasted, was always setting this environment variable, and now Rook is also doing the same. Now the question is: is there any other improvement that we can do in this area? And Greg, you had some thoughts around this, right?
H
You know, every invocation of a daemon, on, you know, respawn or whatever, gets a nonce. Once upon a time we used the PID, because, you know, in 2007 PIDs were mostly random; and then most of the daemons, except the OSDs, moved to just using a random nonce, because we started seeing cases where PIDs weren't sufficiently random and so we were getting these conflicts, which obviously come up all the time in containers now.
H
But
it
also
historically
has
happened
a
little
bit
in
just
like
you
know,
systems
where
you're
starting
a
lot
of
processes
and
the
and
the
number
of
IDs
are
constrained
or
something
and
I
I.
Don't
remember
why
we
tried
to
just
use
the
kept
trying
to
just
use
the
PID
for
the
OSD,
I,
really
I
think
it
was
just
so
that
it
was
easier
for
developers
to
map
the
like
messages.
They
were
seeing
on.
H
Other
servers
to
which
OSD
demon
they
needed
to
go,
run
GDB
on
or
something
elsewhere,
in
which
case
I
think
we
should
just
get
rid
of
the
PID
mapping
into
always
randomize
every
time
so
that
we
never
ever
see
this
again,
but
maybe
there's
some
other
constraint
I've
forgotten
about
that.
We
need
to
account
for
right.
H
I think one of these PRs that Sage did is actually where it moved into the messenger layer from the invoker, and I'm pretty sure the OSD is the only one that tries to use the PID version; the MDS for sure, and I think the monitor, are just random, and they've always been random for many, many years.
I
PIDs have pretty much one property that makes them attractive for this, and that's that the operating system goes to quite a bit of effort to make sure that it doesn't reuse PIDs for a while. That's the reason why we use PIDs. So I'm with Greg: I don't think there's any reason we actually have to do this. If we randomize in a sufficiently large space, we'll get a stronger guarantee, there won't be any possibility of failing to set the environment variable, and I've actually never used the nonce to map to a PID.
J
Hi. I found the very old gist about the case in Crimson, and one of the comments is actually about the nonce selection logic; let me pinpoint it directly. Here is the link to the comment with the nonce-selection snippets.
J
Here somebody could point out: okay, they are random, which means that there is a very, very unlikely, but still existent, case of running into nonce clashes.
I
That's actually handled when the daemon starts up: it checks the OSDMap to see if it actually grabbed the same nonce.
A
Cool. That's pretty much it; anything else on this, or do we want the next topic?
G
Hi guys, I would like to very shortly talk about an issue that we got. Basically, we encountered customers who, due to some sequence of events, ended up running multiple OSDs against the same device, and that of course led to all sorts of strange and initially difficult-to-understand corruptions.
G
It
mostly
happens
when
we
use
containerized
environment
and
the
reason
the
reason
why
it
didn't
in
general,
we
do
have
in
Blue
Store
mechanism
to
protect
against
running
objects
for
multiple
times
we
just
take
a
logs
for
each
block,
each
device
that
will
be
used
for
data
and,
in
addition,
an
extra
file
called
fsid.
G
The
problem
that
we
had
was
that
two
different
runs
of
the
container
recreated
all
that
access
files,
meaning
they
used
Seth,
Object
Store
tool
or
equivalent
to
recreate
Blue
Store
data
path,
deal
and
also
create
all
the
links
to
devices
and
the
Locking
we
were
taking
was
a
inode
based,
meaning
when
files
were
recreated,
we
no
longer
actually
were
locking
against
the
other
other
osds.
That
could
still
be
running.
G
Make
it
work
that
the
links
in
our
Blue
Store
path
were
to
block
devices
and
our
Deluxe
were
executing
were
against
a
I
notes
that
were
block
devices.
But
if
you,
if,
if
there
would
be
some
mechanism
and
I
assume
there
might
must
have
been,
and
that
will
also
recreate
a
block
device,
I
node,
then
we
got
completely
different
set
of
inodes
and
two
osds
can
run.
G
Ing
on
the
other
OSD
was
to
open
a
block
device
in
an
exclusive
mode.
There
is
a
it
seems,
a
bit
extra
implementation
in
Linux
kernel
to
handle
blog
devices
differently,
and
if
you
open
a
block
device
inode,
then
you
really
get
a
locking
for
a
device
itself,
not
only
the
inode.
Hence
the
VRI
I
created.
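A minimal sketch of that kind of exclusive open on Linux (illustrative, with error handling trimmed). On Linux, O_EXCL without O_CREAT is meaningful for block devices: the open fails with EBUSY if the device is already in use, for example mounted or held by another exclusive opener, regardless of which inode is used to reach it:

    #include <fcntl.h>
    #include <cerrno>
    #include <cstdio>

    int open_bdev_exclusive(const char* path)
    {
      // O_EXCL on a block device claims the device itself, not the inode.
      int fd = ::open(path, O_RDWR | O_EXCL);
      if (fd < 0 && errno == EBUSY) {
        std::fprintf(stderr, "%s is already in use\n", path);
      }
      return fd;  // the claim lasts until the descriptor is closed
    }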
F
Thanks, Adam. I can relate this one to a tracker as well, where we are seeing a similar issue in the ODF environment: BlueFS got corrupted because multiple Ceph pods were running against the same device.
F
They are handling it a bit differently in Rook, just to prevent multiple OSDs, sorry, multiple daemons, from using the same device.
G
Can you shortly summarize what technique is used in that PR? Because I don't have time to reread it.
F
So what they're doing, at least when I went through it yesterday: it's basically that the OSD pod uses the hostPath; hostPath, I think, means it's actually using the... what do you call it...
E
I think the fact that it is a Rook PR makes it more or less irrelevant here, because the point of this exercise is really to protect from something like this happening at the lowest layer possible, yeah, by opening the block device with O_EXCL, which is a Linux-only thing. But then we don't really care about anything else here; at least for OSDs that should be fine, so whatever Rook is doing...
E
First of all, that clearly didn't work in this case; but also, you know, it's a thing that orchestration tools come and go, things change, and those changes aren't always acked by, you know, Ceph maintainers. Sometimes they're, you know, considered to be trivial enough and no one asks; sometimes it's just that, you know, people don't understand how dire the consequences can be of, for example, disabling PID files.
E
As was the case here. So all of this really points at the need, you know, to have this implemented at the lowest layer. So whatever Rook is doing probably shouldn't be a factor here, simply because we need to make this work independent of Rook or any other orchestration tool. Yeah.
I
So have we tested this? If Linux always respects the exclusive flag, even in a container, that would be great.
I
The other thing is: when we create these containers and run them, we could be mounting the var directory we use in a mode that actually shares it between containers. That would help. I actually think we should do Ilya's suggestion anyway; I think the more ways to prevent multiple accesses to a block device the better. But we could also be doing that; it would mean PID files and all that other stuff would still work, which would be nice.
E
Yeah, well, that's actually, I think, what I commented on the PR; that was one of the questions: if the O_EXCL thing works, and it really seems to be the thing that actually works and can't be, you know, fooled around with, then do we really need the existing open file description locks, the flock thing?
E
Is it still needed? Because, you know, it's just additional code that tries the OFD lock and then, if that doesn't work, falls back to flock. And, you know, if none of that is going to actually bring any value with the O_EXCL change in, then I think we should consider getting rid of it, instead of, like you said, you know, trying all possible locks and, you know, locking everything that we can get our hands on.
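For reference, the OFD-lock-then-flock pattern being weighed looks roughly like this; a sketch of the generic mechanism, not the actual Ceph code:

    #include <fcntl.h>
    #include <sys/file.h>
    #include <cerrno>

    // Try an open-file-description (OFD) lock first; fall back to
    // flock() where F_OFD_SETLK is not supported.
    bool lock_file_exclusive(int fd)
    {
      struct flock fl{};
      fl.l_type = F_WRLCK;     // exclusive write lock
      fl.l_whence = SEEK_SET;  // whole file (l_start = l_len = 0)
      if (::fcntl(fd, F_OFD_SETLK, &fl) == 0) {
        return true;
      }
      if (errno == EINVAL) {   // no OFD lock support: fall back
        return ::flock(fd, LOCK_EX | LOCK_NB) == 0;
      }
      return false;
    }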
E
Yeah, so, I mean, we're certainly not going to... I don't think anyone is planning to run Ceph clusters on Windows, and we could research the BSD question; I think it would take just a basic grep of the FreeBSD kernel code to determine that, if it's not documented in their man pages, that is.
E
So that's a question I'm not sure about. If we say we care, then yes; but then again, the flock mechanism can be bypassed. It takes some effort, but the container tooling makes it very easy. And again, I'm not sure what the state of that is, because the container tools that we use on Linux that make this very easy are probably not supported on the BSDs these days, and that would mean that flock-based protection, you know, probably carries more weight there, because it's going to be harder to bypass.
E
Then
obviously
we
should
just
we
should
just
you
know
claim
that
it's
supported
and,
in
my
opinion,
just
dropped.
The
Vlog
stuff.
I
We don't have any unit tests that apply uniformly to all ObjectStore implementations, come to think of it; we have two ObjectStore implementations right now, BlueStore and SeaStore, plus FileStore. Right, like, these things don't come up that often; we don't need to be that prophylactic about it.
E
Well, I think we agreed that if we care about potentially supporting FreeBSD, then we need to check whether O_EXCL for block devices is a thing there. Because, you mentioned... like, I'm not sure, did you check the man pages? For some reason I always thought this was a Linux-only addition, because...
A
Yeah, but people are working around the issues they're having with BlueStore, so with FileStore deprecation on the horizon they might want to use other stores, so we should cater for that case as well, right?
E
Yeah. And just to clarify: I was specifically commenting on the KernelDevice implementation, with this O_EXCL PR doing two types of locking, and that seems to be not desirable to me. So leaving just O_EXCL there is probably the way to go; but that doesn't say that we can't continue flocking one of the metadata files, like, for example, whatever it is, the fsid or something else in, you know, the OSD directory, and, you know, have that be common for any I/O stack with a user-space block device, essentially. So my comment only extended to the implementation of the block device interface, not to BlueStore, not to the ObjectStore implementation. The ObjectStore implementation is free to use whatever locking mechanism, to lock whatever metadata file it sees fit, in my opinion.
G
Okay, you mean you would like to have locking be a separate thing from the block device implementation?
E
You know, two locks on the same block device, right, on the same thing; so that seemed excessive to me. But an ObjectStore implementation could still lock, for example, an fsid file or a superblock file, which is not a block device, with conventional locking mechanisms, which are portable and supported, you know, just generally on POSIX.
E
The
one
I'm
aware
of
is
the
persistent
client-side
cache
in
in
RBD,
so
it
has
two
two
backends
it
can.
It
can
use
pmem
device
like
a
like
a
pmem
card
or
just
a
standard
log
device.
So
with
the
intentive
you
know
in
being
an
SSD.
So
that's
that's
a
second
user.
There
might
be
a
third
one,
I'm,
not
sure.
E
I think it actually uses locking already, right? Because the lock_exclusive flag, or, it's not actually a flag, it's a field in the BlockDevice class, or maybe in the KernelDevice implementation, I'm not sure; but it's a member field and it defaults to true, and any open method, like when you call BlockDevice create and then block device open...
E
If
you,
if
you
don't
do
anything
in
between
then
open,
would
do
an
exclusive
open
by
default.
So
this
means
that
this
second
user,
this
SSD
cache
already
uses
exclus.
You
know
exclusive
mode.
It
just
doesn't
rely
on
it
because
there
is
higher
level
locking
within
lib
IBD
to
the
best
of
my
knowledge.
The
only
reason
to
request
a
non-exclusive
open
is
to
actually
call
a
method
in
between
create
and
open.
There
is
something
named
said:
no
exclusive
lock.
E
So
that's
an
actual
method
that
that
you
call
and
that
method
onsets
the
the
member
field.
So
it's
that's.
You
know
it
changes
it
from
True
to
false.
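So the sequence under discussion looks roughly like this; a self-contained sketch assembled from the description above, where "BlockDevice" is a stand-in and the exact names should be treated as approximate:

    #include <fcntl.h>
    #include <string>

    // Stand-in sketch, not the real Ceph BlockDevice class.
    struct BlockDevice {
      bool lock_exclusive = true;              // member defaults to true
      void set_no_exclusive_lock() { lock_exclusive = false; }
      int open(const std::string& path) {
        int flags = O_RDWR | (lock_exclusive ? O_EXCL : 0);
        return ::open(path.c_str(), flags);    // exclusive unless opted out
      }
    };

    // BlueFS-style path: two handles share one device, so one handle
    // opts out of exclusivity before opening:
    //   BlockDevice bdev;
    //   bdev.set_no_exclusive_lock();
    //   bdev.open("/dev/sdX");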
G
The reason why we have set_no_exclusive_lock is that we have two components in BlueStore: one is the BlueStore core that serves object data, and the companion BlueFS that keeps the metadata, and they partially use the same device, in cooperation, but through different block device handles, for architectural simplicity reasons. That's why we have that weird open in non-exclusive mode.
E
But that's exactly what I'm saying: we don't need that. We're probably, like, 99 percent sure that we don't need the existing locking, and we can just switch to O_EXCL locking and not expose this implementation detail at all; the BlueFS code would still invoke this set_no_exclusive_lock method when it needs it. But that was actually another concern that I brought up in your PR: are we sure? Because, like, I didn't follow the entire call chain, but lock_exclusive is unset, so set_no_exclusive_lock is called, in _minimal_open_bluefs.
E
About
that
yeah,
so
that
method
is
called
through
from
from
you
know,
from
a
bunch
of
places
and
I
just
wanted
to
you
know
you
or
someone
else
to
verify
that,
like
all
those
like
that
there
isn't
a
case
there,
where
you
know
that
that
is,
you
know
where
things
could
go
wrong
and
we
could
still
end
up
with
the
same
old
device
opened
open
twice
in
in
you
know,.
E
So
in
in
the
end
of
the
day,
one
of
these
opens
needs
to
be
exclusive
and
the
other
then
can
be
non-exclusive.
E
That
would
work
for
the
case
of
you
know.
Opening
the
blog
device
in
you
know
two
different
ways.
As
long
as
one
of
them
is
is
exclusive,
then
we
are
protected
and
and
I
just
wanted
us
to
verify
that
that
is
the
case
in
in
all
cases,
and
there
isn't
the
corner
case
where
we
we
are.
You
know,
where
said,
no
exclusive
luck
is
called
on.
All
you
know
on
all
the
ways
that
the
blog
device
has
opened
for
that
particular
OSD.
E
Okay, so it sounds like the O_EXCL change kind of stands on its own, but there would be an additional PR needed to fix this _minimal_open_bluefs stuff, right?
G
Yes, possibly in two PRs, because we should also maybe fix the previous versions; I don't know if we want to backport the same change or just fix the tooling. I would be mostly for replacing the existing flocks with O_EXCL; I don't really want to track another customer's problems related to suddenly running multiple OSDs on the same data set.
E
Right
but
once
again,
I
mean
going
back
to
my
point,
that
and
and
to
what
you
seem
to
be
confirmed,
that
there
are
cases
where
the
existing
the
existing
support
for
exclusivity,
whether
it
works
in
all
cases
or
not
like
whether
the
container
environment
can
fool
it
or
not.
E
So
the
change
to
or
exclusive
is
not
going
to
take
care
of
that
because
you
would
still
not
be
using
all
exclusive
in
those
places
yeah.
So
that's
what
I'm
saying
it's
either
two
different
pris
or
at
least
two
different
commits,
because
these
are
clearly
different.
Different
changes
right,
one
is
a
change
of
the
underlying
login
mechanism
and
the
other
is
using
the
locking
mechanism
in
more
places.
A
All right, we're already out of time. Any last thoughts on this topic?