From YouTube: CDS Reef: RADOS
Description
The Ceph Developer Summit for Reef is a series of planning meetings around the next release and some community planning.
Schedule: https://ceph.io/en/news/blog/2022/ceph-developer-summit-reef/
A
Hello and welcome, everyone, to CDS RADOS for the Reef release. Today we'll be talking about features and things that we want to do for the next release that is coming up, which is the Reef release.
A
There are topics that are already there in the etherpad. We will try to follow that order and just ensure that we are not taking too much time on any one topic, so that there is fairness across them. To kick it off, I think we are going to be starting with the telemetry topics, so I guess Yaarit and Laura, do you want to kick it off?
B
Sure, thanks Neha. So first, just a quick announcement that I'll be giving a walkthrough of our telemetry crashes, probably in a couple of weeks, so I encourage everyone to join. I put a link to the public dashboards on the etherpad. I'm not aware, I don't know, if everyone is aware of our public dashboards.
B
This is where we present aggregated data from our telemetry, so I encourage everyone to check it out. So yeah, let's kick it off with the list of subjects that we have. Is everybody...?
B
Perfect, yeah. This is for the general session, so we'll be looking at the one for telemetry. Sure.
B
Yeah, so the first topic that we want to cover is new metrics collection. Right now we collect data in five different channels. We have the basic channel, where we collect general information about the deployment; we have a device channel, where we collect health metrics, mostly SMART metrics; and then we have a crash channel, where we collect all the crash dumps that happened in the cluster.
B
Then there is data that is needed in order to identify Rook deployments. Apparently we do not have this data collected yet, and we started a discussion offline about it. But I just wanted to make sure: does anyone have any ideas about what else needs to be collected, specifically for Rook and, generally speaking, for other components as well?
B
I'll just say that for the Rook part, Blaine already said that they will probably need to add some flag indicating that the cluster was deployed via Rook, and this way we can fetch this information and collect it in telemetry. And Radek also suggested one configuration option that might indicate it is a Rook cluster.
B
This is on the etherpad under the first topic; it says ms_learn_addr_from_peer. Thanks, yeah.
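As a side note, here is a minimal sketch of how such a "deployed by Rook" signal could be probed from outside the telemetry module. The ms_learn_addr_from_peer option is only the heuristic mentioned above, and the explicit "deployed_by" key is invented for illustration; it does not exist today:

    #!/usr/bin/env python3
    # Rough sketch only: guess whether a cluster is Rook-managed from the
    # signals discussed above. Both signals are heuristics, and the
    # "deployed_by" option is hypothetical.
    import json
    import subprocess

    def ceph(*args):
        """Run a ceph CLI command and return stdout as text."""
        return subprocess.check_output(("ceph",) + args, text=True).strip()

    def looks_like_rook():
        # Heuristic from the discussion: Rook tends to disable this messenger option.
        learn_addr = ceph("config", "get", "mon", "ms_learn_addr_from_peer")
        # Hypothetical explicit flag an orchestrator could set one day.
        try:
            deployed_by = ceph("config", "get", "mgr", "mgr/telemetry/deployed_by")
        except subprocess.CalledProcessError:
            deployed_by = ""
        return deployed_by == "rook" or learn_addr == "false"

    if __name__ == "__main__":
        print(json.dumps({"rook_deployment_guess": looks_like_rook()}))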
B
So this is not super urgent and I don't want to take too much time from other topics, but if anyone has any new metrics, new data that we're not collecting in telemetry that you would like to see collected, please let us know. Just take into consideration that it cannot be anything that is user-defined; we're not collecting any sensitive information like hostnames, full names, or anything that can identify the user.
C
Well, maybe in the long run it would be worth considering introducing some kind of extra option that would be set by the deployment machinery to indicate who was responsible for deploying such a cluster, whether it was cephadm, whether it was Rook. You know, because ms_learn_addr_from_peer is far from being an explicit solution; it's a very implicit one. And if we really need such information, maybe we should, in the long term, not now, have something better.
D
And I think we can also talk to the Rook community about what they are exposing from their side, right? I mean, something we can figure out from upstream. We can showcase this, that this is what we have in telemetry, and in the community there is a Rook community meeting; we can present there and ask for their feedback, like whether there is any other way we can, you know, get adoption and publicize it more in the Rook community.
B
Right, sure, sure, this would be great. Thanks, I appreciate it. So if no one else has ideas (we're just really short on time), I think we can move on to the next topic. And please, if anyone has any ideas, please feel free to update the etherpad or just bring them to us offline. Laura, do you want to go to the next topic?
E
Hi everyone. So the next topic about telemetry is about collecting a data availability score and including this score in the basic channel.
E
A while ago Sage authored a Trello card, which is linked at the top there, about collecting or tracking data availability over time, and essentially we have this information in the ceph -s command, which provides a snapshot of availability.
E
But what we are interested in doing is collecting this information over time and tracking it over time, and telemetry can be used to do that, since we collect telemetry reports once every 24 hours. So we were thinking that it would be a good tool to collect this information, and the newly added perf channel (performance channel) already collects a lot of information that's available in pg dump, so we have access to the states that certain PGs were in, and when they were last active, when they were last peering, timestamps like that.
E
So there are several questions to answer with this information. Question one would be: how often, and in what quantity, has data been unavailable in the Ceph cluster? We were thinking that one way we could answer this question is by looking at the last-active timestamps on a cluster's PGs, tracking that, and sending it to the telemetry public dashboards, and we have several ideas of how to do that.
E
But that's one way we could decide data availability. Another question we're answering is: how does the frequency of an individual state relate to the cluster as a whole? In the telemetry report we have access to the states that PGs are in, and certain states are really indicative of unavailable data, such as anything that's not active, or anything that's incomplete or creating.
E
So we would be interested in collecting data about, you know, what states PGs are in daily, whether that indicates unavailability, and tracking this over time. The main concern we have with doing this in telemetry is that collecting this and calculating an unavailability score would only happen once every 24 hours, because that's the default configuration for telemetry reports; they're sent to the dashboards once every 24 hours. So this would really provide only a snapshot of data availability. But maybe that's enough.
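To make the idea concrete, here is a minimal sketch of the kind of daily calculation being described, assuming the JSON layout of ceph pg dump (a list of per-PG entries with a state string); the scoring rule itself is a placeholder, not an agreed design:

    #!/usr/bin/env python3
    # Illustrative only: a naive "availability score" derived from pg dump,
    # counting PGs whose state suggests unavailable data.
    import json
    import subprocess

    def pg_stats():
        out = subprocess.check_output(
            ["ceph", "pg", "dump", "pgs_brief", "--format", "json"], text=True)
        dump = json.loads(out)
        # Some releases wrap the list, others return it bare.
        return dump.get("pg_stats", dump) if isinstance(dump, dict) else dump

    def availability_score():
        pgs = pg_stats()
        if not pgs:
            return 1.0
        bad = sum(1 for pg in pgs
                  if "active" not in pg["state"]
                  or "incomplete" in pg["state"]
                  or "creating" in pg["state"])
        return 1.0 - bad / len(pgs)

    if __name__ == "__main__":
        print(f"share of PGs currently active: {availability_score():.4f}")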
E
Maybe that would still tell us something. And the main thing is also, in addition to tracking this in telemetry, we would also want this data availability score to be available to developers on their clusters, so we would consider maybe adding it to the ceph -s command or some kind of existing command; but essentially we would want developers to have access to this score.
E
But if you have any comments about the frequency with which this data should be collected, or if there are any data points that we haven't considered, you're welcome to add those comments on the etherpad, or just any questions you might have about what we're looking to collect here. That's it for that topic; I'll pause for any kind of comments before moving on to the next one. Oh, and Yaarit, if you wanted to add anything too.
B
Yeah, you covered everything. Just, yeah, that nuance about whether calculating it from a daily snapshot would be enough, or whether we really need better resolution for this data availability score.
E
Okay, if there aren't any questions: again, if you remember something that you wanted to ask and we've moved past it, just please write any comments you have; there's a comment section at the bottom of this etherpad, or, you know, wherever you end up putting it on the schedule. The last topic I'll cover is... oh.
F
When you say a snapshot, are you saying that you would only look at the data every 24 hours to calculate scores? Are you saying we would look at it every 24 hours, and whatever the data availability happened to be, that is all we would have visibility into? Because if we're going to miss every 15-minute outage, then yeah, that seems not very useful for what we're looking to understand about the clusters, and we definitely need more resolution than "it was down so long that we happened to catch it". Yeah.
F
That's good for an urgency of updates, but yeah, I don't know exactly what our options are in the manager or whatever, but this probably needs to be a push model rather than a polling model: like, noticing certain events happen, taking a notification from certain events happening in the cluster, and then being like, okay, something was gone for this long.
B
Yes, that's a very good point. It could be totally separated from telemetry, and telemetry can just use whatever reports that tool can provide. So yeah, the user can use that orthogonally to telemetry.
E
So, you know, if we're thinking about calculating the score in the telemetry report, we don't have to do it that way. We can just have the telemetry report take whatever score has already been calculated on the cluster side, which maybe has more resolution, you know, has been collecting it every minute or something, and the telemetry report would include that score rather than calculating it right before the report is sent.
E
Okay, I'll go to the next one, just since we're short on time. The final topic for telemetry is identifying OSD performance outliers in the manager.
E
There's a Trello card, which is linked here, for Reef, expressing interest in collecting or identifying OSD performance outliers on the manager side, and Neha had a thought that the perf channel in telemetry is already collecting a lot of performance information about each OSD, so it would be a good idea to use that information to identify outliers. This is a fairly new topic, so we haven't really had any discussions on what it means to identify outliers, but yeah.
A
I think it's written... I think this is more like, you know, just a thought that now that we have all this information, we'd better use it for, you know, something that we've been thinking we would do. But yeah, this needs to be thought through and discussed in, you know, the telemetry huddle or something.
E
Yeah, I just wrote down some information about what the perf channel is collecting per OSD: we're collecting perf counters, histograms, mempools and heap stats, so those data points might help us identify outliers, and of course more conversation needs to happen about it. But it was just an idea for Reef, and that's really all I have to say. Were there any final questions? I don't want to take away from the next topic, but if there are any comments, you know, again, just add them wherever.
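For a flavor of what "identifying outliers" over those per-OSD data points could mean, here is a toy, self-contained sketch using a median-absolute-deviation rule; the input dict and threshold are placeholders for whatever a mgr-side implementation would actually use:

    # Toy outlier detection over a per-OSD metric (e.g. one latency perf counter).
    from statistics import median

    def perf_outliers(per_osd_metric, threshold=3.5):
        """Return OSD ids whose metric deviates strongly from the cluster median
        (modified z-score based on the median absolute deviation)."""
        values = list(per_osd_metric.values())
        med = median(values)
        mad = median(abs(v - med) for v in values) or 1e-9
        return [osd for osd, v in per_osd_metric.items()
                if 0.6745 * abs(v - med) / mad > threshold]

    # Example: osd.7 is much slower than its peers.
    latencies_ms = {0: 4.1, 1: 3.9, 2: 4.4, 3: 4.0, 4: 3.8, 5: 4.2, 6: 4.1, 7: 55.0}
    print(perf_outliers(latencies_ms))   # -> [7]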
C
Actually, two of them. Is providing such a guarantee a relevant thing today? And an even more general question: do we really, honestly, have any users of the C++ API of the userspace RADOS API? The C API, for sure, is widely used, but I'm doubtful when it comes to the C++ parts. So maybe we could actually remove some of the mental burden put on us just for the sake of providing a thing that is not so widely used and necessary.
I
It was my understanding that the C++ API was internal, and that sort of started with conversations which we had with Jason Dillaman three or four years ago, and then we created... and I guess I have to ask: I'm also looking at your GitHub, or your tracker issue on this. Is this...?
C
Well, my understanding is that we have only... as you said, my understanding is we only have internal clients, and we don't need to worry about external ones. But, that said, I would really love to confirm before dropping the guarantee.
A
Okay, I think it doesn't make sense to discuss this any more here. I think following up on the mailing list would be a good idea, and then go ahead with it.
C
Yes, I linked the tracker in the chat. Basically, in the buffer header there was a patch that introduced a specific construct, and because of that we violated the C++11 guarantee we had enforced. We have one test, in teuthology, very correctly, that tries to build an example program using C++11.
C
Got it. Anyway, my understanding is that there are no objections here to stripping down this guarantee, and we can move forward by basically sending an email to the dev list.
A
Cool, all right. Thanks, Radek. Moving on: block devices with compression. Josh, do we know if the relevant person is here, or...?
H
Yeah, I think someone who brought this up...
D
Yeah, hi Neha. Martin is going to present that; Martin is online.
K
Hi, yes. So I don't have any slides or anything for that, sorry about that. This is about a patch that we propose in order to support block devices that have internal compression capability.
K
IBM has developed, and actually even sells, one of these devices. It's an NVMe device, and its usage model is very similar to the VDO device that is already supported within Ceph. It provides a very large logical block space, which is backed by a smaller physical space, and it provides information about the state of the physical block space via a sideband interface, which is basically some special NVMe commands.
K
This is very similar to what the VDO device does. The VDO device is basically an inline compression module that is part of, well, can be inserted into, the Linux kernel, and that uses a standard NVMe device, or any block device, at the back end.
K
There is already support inside Ceph for that, but this support is buried inside the block driver and it's referenced by the kernel device. What we're proposing is to encapsulate this interface between the kernel device and the block driver into a plug-in API, and this patch basically contains this plug-in system. The plug-in system itself is based on the erasure code plug-in system that already exists in there, so the structure is basically identical, and it defines an interface that allows querying the physical block state of the compression device.
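To visualize the shape of what is being proposed, purely as an illustration (the actual patch is C++, modeled on the erasure-code plugin loader), here is a sketch with all names invented, of an interface that would let the OSD query the physical state of a self-compressing device:

    from abc import ABC, abstractmethod

    class CompressionDevicePlugin(ABC):
        """Illustrative only: the sideband interface a plugin would wrap so the
        OSD can ask a compressing device about its real physical usage."""

        @abstractmethod
        def physical_capacity_bytes(self) -> int:
            """Physical space backing the (much larger) logical address space."""

        @abstractmethod
        def physical_used_bytes(self) -> int:
            """Physical space currently consumed after inline compression."""

        def physical_utilization(self) -> float:
            return self.physical_used_bytes() / self.physical_capacity_bytes()

    class VDOLikePlugin(CompressionDevicePlugin):
        """Stand-in backend; a real plugin would issue vendor NVMe commands or
        read VDO statistics instead of returning constants."""
        def physical_capacity_bytes(self) -> int:
            return 4 * 1024**4        # 4 TiB physical
        def physical_used_bytes(self) -> int:
            return 1 * 1024**4        # 1 TiB used after compression

    dev = VDOLikePlugin()
    print(f"physical utilization: {dev.physical_utilization():.0%}")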
K
This basically allows adding multiple different plugins for different hardware later on. So the patch for this is, yeah, provided in this...
K
...in my public repo; it actually consists of three commits, sorry about this.
K
The last commit that you just showed is one that basically supports some additional things that we encountered when running tests with it. For example, running out of the build tree needed some patches, and there was one dependency required that would force the block device driver, which actually instantiates or references the plugin system, to use common, because the plugin system is actually pushed into the common library. But essentially the two main commits of this... I will...
K
Okay, so the main two commits are: one that implements the entire plugin system and wraps the VDO driver into a plug-in (so basically I took the code for the VDO support and put it into a plug-in so that it's used via that), and there's actually a second commit that I separated out, which adds a keep-caps bit to the state of the OSD.
K
We need this for our particular plug-in in order to do certain NVMe pass-through calls, ioctl pass-through calls, that require certain kernel capabilities, and when switching from the root user to the ceph user you usually lose them. This patch basically maintains them, and for capability-aware plug-ins you can activate the capabilities that you need.
K
So you're basically on the wrong side of this merge; I have to rebase my patch to make it more clear, sorry about that.
A
No worries, no worries. I think this sounds exciting, and it's worth discussing at a CDM, where, you know, maybe you can more formally prepare and share it with the broader group. What do you think? The CDM is essentially a once-a-month session where we have a developer meeting, so maybe we can choose whichever one works for you.
F
Oh yeah, yeah. I just want... so I guess I know that there's some integration with VDO that already exists. Have you looked at how well it works for your block device? Because I never got the impression the VDO integration was really fully baked.
F
It was sort of a technology that people wanted to put together, but you know, VDO trades off CPU usage for lower storage utilization, and that's sort of the wrong optimization for Ceph. Having it in the device is a lot more interesting, but I'm just wondering what happens when we start running low on space.
K
Yeah, so again, as you said, in this case you don't trade off CPU cycles, because it's the device that actually does the inline compression, and it's doing it at full line rate, at full bandwidth; there's no impact whatsoever. And yes, I agree, the integration does not seem complete.
K
You at least get the feedback to the upper layers of Ceph, so you will get warning messages that your cluster is running full if it really runs out of physical space, and you can take measures at that point, similar to when your non-compressed device basically runs low. But yes, the displays are not that nice, because the ratios do not consider the compression one hundred percent; you basically get a certain starting capacity, which is your physical capacity.
A
Let's move on to the next topic, which is QoS. I believe we've got Aishwarya and Sridhar; however you want to drive this topic.
G
Yeah, so if you could just open that etherpad and I'll just walk you through it. Yeah, that is being done, yeah. Can you guys see it? Yes, I can see it. Yeah, so I thought I'd just give some background on the current work that has been done for the client-versus-client QoS. This has been in the works for quite some time in the past, and folks like Eric, Sam and others have been working on this. So I basically based the current changes that I have made on the original PR that the folks had worked on (I highlighted it there, 20235, that's the one), and adapted it to the current implementation of the mClock scheduler.
G
So the PR that I'm working on is currently still a work in progress, and I have provided a link to that as well.
G
There are still quite a few things to work on there, and I'll just present the current state and the next steps that we can take. So, the current state of this: I'll just enumerate the changes that have been made as part of this new PR. The first item talks about the service tracker.
G
Basically, we incorporated the dmClock service tracker in the Objecter code. The service tracker essentially tracks the response from the dmClock server and then proceeds to calculate the request parameters, like delta and rho, which are essential for the rest of the dmClock algorithm to work. So the service tracker object essentially does this work for us, and, based on the implementation of the original PR, I have added the changes, which essentially follow the same thing that the original PR intended.
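As a heavily simplified illustration of what such a client-side service tracker does (the real logic lives in the dmclock library and the Objecter; this toy only shows the delta/rho bookkeeping idea):

    # Toy dmClock-style client tracking: count responses received since the
    # last request to a given server and report (delta, rho) with the next one.
    from collections import defaultdict

    class ServiceTracker:
        def __init__(self):
            self.total = 0         # responses received from all servers
            self.reserved = 0      # responses served in the reservation phase
            self.seen = defaultdict(lambda: (0, 0))   # per-server snapshot at last send

        def note_response(self, served_by_reservation: bool):
            self.total += 1
            if served_by_reservation:
                self.reserved += 1

        def request_params(self, server_id):
            """(delta, rho) to attach to the next request sent to server_id."""
            last_total, last_reserved = self.seen[server_id]
            delta, rho = self.total - last_total, self.reserved - last_reserved
            self.seen[server_id] = (self.total, self.reserved)
            return delta, rho

    t = ServiceTracker()
    t.note_response(served_by_reservation=True)
    t.note_response(served_by_reservation=False)
    print(t.request_params("osd.3"))   # -> (2, 1)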
G
The other change is the messaging changes to allow, you know, clients to send the request parameters like delta and rho, and then, on the response path, receive the QoS response, like the cost, in the interface that the service tracker uses to calculate the request parameters for the subsequent requests from the clients.
G
The other bit that I have integrated, looking into the original PR, is the QoS profile manager, which essentially encapsulates the client QoS profile and the service tracker. This is essentially used by the Objecter in librados, and it basically acts as a conduit to pass the QoS request parameters and then receive the QoS response.
G
The other bit that I have added, which is not there in the original PR, is the client registry. This is again based on the implementation in the main dmClock code. The client registry essentially tracks the map of the external clients and the QoS parameters associated with those external clients, and it also introduces cleanup logic to remove stale clients from this registry.
G
So, for example, if a client becomes silent for quite some time, it doesn't make sense for us to keep that entry around for so long. The cleanup logic essentially uses an age-based, idle-time-based cleanup mechanism and cleans up this registry periodically. And this PR also implements quite a lot of unit tests to test all the above functionality that I have mentioned. Yeah, so this is the current state of the PR.
G
It's in line with whatever folks had thought of earlier, and feedback would be highly valuable. And then there are a few things about the QoS profile manager that I need to really look at. So essentially, what happens now is that the profile manager is implemented like a static, process-wide kind of variable, and I saw a few changes that were made by Sam to make it per-Objecter. So that's something that I'm trying to take a look at; those are things I need to look at and discuss, and then proceed to make changes there.
G
Apart from that, yeah, based on the feedback we can get into testing with the actual librados client software after incorporating the new APIs, so that's the ultimate goal. I was not sure about the support for other types of clients, like RGW and CephFS; maybe that could be taken up at a later point, but right now this PR essentially focuses on librados.
G
So that's the current state of this client QoS work.
M
Certainly that sounds very impressive, and I'm very appreciative that you're picking up the ball here and going with it, because yeah, it was a lot of work, and at some point RGW needed my attention, so I left it, and Sam pushed it forward quite a bit. But thank you for doing all of this.
G
Yeah, sure, a lot of interesting work. So I really hope that you guys take a look at this new PR and then give me feedback so that I can take this forward, yeah.
G
If there are no other questions, we can go ahead with the next topic, which is handling high-priority operations. This kind of came up on a Trello board just...
A
One second, before we go into that: maybe it's worth giving, like, a two-minute overview of what currently exists in Quincy and why this handling of high-priority ops is required. I don't know, whoever wants to do that, maybe just give a quick summary of what exists at the moment, for folks who are not aware.
G
So, you could say we essentially added the mClock capability only on the OSDs, to provide QoS for operations like client ops, recovery operations, and other background operations like scrub, snap trim and PG deletions. The testing for this is currently in progress on large clusters; on a small scale, of course, we have been able to test a few of these, and the fine-tuning of the profiles for the background operations is still in progress.
G
I think we have pretty much narrowed down the things that we want for the config profiles, but largely it looks good for the client ops and the recovery operations, and a bit of fine-tuning is necessary for the background operations. So that's where we are currently with the changes in the OSD. So this came up...
G
Yeah, so this came up because we saw an issue where, in some cases, PGs in a premerge state were stuck in, you know, backfill_wait state for a long time, and there was a need... we failed to handle such PGs stuck in this state at a higher priority. So essentially the requirement is for the...
G
So this is still in the conceptualizing stage, but some solutions that I thought of were: one, which involves using a queue which has a slightly lower priority than the current immediate queue that we use in the mClock code. Now, the immediate queue is not handled by the mClock scheduler; rather, it's used for very high-priority operations, like replying to replication operations and things like that.
G
That's one solution that I was thinking of. The other one was to introduce some kind of logic where, after looking into the high-priority field, we could manipulate the cost of such items, basically lower the cost, so that, once it gets put into the mClock queue, mClock is able to dequeue these kinds of operations sooner.
A
Yeah, I don't think we need any other, third solution; one of these should work. I think we discussed solution one, I believe, in the QoS call, right? Yeah. I think either should be fine; let's probably, you know, look at the feasibility in the code and see which one we can get merged faster, because this kind of completes the picture of background QoS, and then, you know, we can completely focus on client versus client, which I believe is going to be the tougher nut.
A
Cool. Any questions, anything else? Yeah, somebody had a question.
A
All right, cool. So if there's nothing else on this one, the other piece of QoS I wanted to just quickly mention is the CoDel PR, which is actively being worked on by folks from UC Santa Cruz, and there have already been CDM sessions on it. I'm not going to go into too much detail, but we are pretty close to getting this merged, and the basic idea is admission control at the BlueStore layer. Sam, anything else you want to add to this?
N
Maybe I will jump in for a second. For some time I was very picky about this PR, I did not want to merge it, but recently I understood that, while its algorithm might be a bit deficient, it still provides valuable infrastructure to actually improve upon the rate control algorithm. So I will quickly review this, and because it was already in a very good shape technically, from a technical point of view, long ago, I think we'll merge it rather sooner than later.
L
Cool. This is Sam; I think it's pretty well isolated from the rest of BlueStore, so I think it's pretty safe to do, and we'll want to make further improvements to it, so merging it will make it easier for people to do performance testing.
A
Yep, I agree. All right, for folks who are interested in further details, in the PR itself there's a bunch of links to presentations and the code walkthrough, so feel free to look at that.
A
Okay, I think with that all the QoS stuff is complete. Next, I just wanted to touch upon a few recovery improvements and changes that I believe we have in the pipeline, and it's worth considering them for the Reef release.
A
It is a new erasure-coded plugin, but it has been marked experimental since then. Last year there was some discussion on the users email list about removing the experimental flag, and in the process we found out that there have been users who have been using this successfully since we released it. But the only... I think the common concern was that it made some changes to the EC back-end logic, which made us think, you know, twice before removing the experimental flag.
A
So in general, I think the question I want to raise is: do we think we should just start testing it more and give it, like, a full cycle, not make it a default or anything (not even suggesting that), but remove the experimental flag, if we think that it's not breaking our tests, or the additional test coverage around EC that we could add in Reef.
A
Rather than the normal plugins, it optimizes for network bandwidth.
A
That is a very good question. So there are two PRs that went in when this got added; the follow-up PR is where there were some EC changes that were made by Sage. Those definitely need to be revisited. I don't remember off the top of my head. Josh, do you remember?
A
A project... it was like a project, an outside contribution, which we definitely welcomed, and it looked very promising. Yeah, you had something?
D
Yeah, so I think there was one thing we wanted to ask this set of users, and also to run some extensive tests, right, at scale: like how the EC pools will behave with the RGW workload especially, because the larger use case of EC pools is with the RGW workload, right? So...
A
Yep, yep, that's exactly what I'm saying when I'm saying investment: those are the kinds of things we'll probably want to do before. And yeah, I think, I'm again suggesting removing the experimental flag, so it's not breaking anything badly; I think I'm more concerned about the corruption aspect of things. So all right, I don't think there's much more discussion required; I think we all agree: revisiting the EC code...
A
...changes and further testing is what we need. The next one is another PR that we've discussed at a CDM, and there is a very detailed... is Mykola on the call, by any chance?
A
Yes. Mykola, this is your PR that we held off merging for Quincy, but I think we definitely want to go ahead with it for Reef. Do you want to briefly cover what the purpose of this PR was?
O
...persist this missing information on the secondary with this too. And here the patch, it kind of worked, but when I was testing it in teuthology I found some edge cases where it still didn't work, and yeah, right now I don't remember the details. I haven't had a chance to open it for a long time, but probably, yes, it's a good idea. So yeah, I still have plans to look at it, analyze the results of testing, and probably come up with a better solution.
A
The missing sets, because I remember that there were some corner cases that you ran into while fixing some other problem, where we were actually losing this information, and in Nautilus it was leading to a crash or something, if I recall correctly. But all the context is in this PR; there's a bunch of discussion we had around it. I think the important piece here that I want to focus on was the testing piece.
A
I remember that, I think, the additional piece that we discussed at that CDM was that we lack enough tests that are...
A
All right, if not, let's move on. There are a couple of items that I added here, again mostly for everybody's information, but these are PG-log-related improvements. There is a well-known issue that was reported around the accumulation of dups in the PG log, which led to out-of-memory conditions, and we now have a PR that fixes this from Nitzan.
A
We just need to get this done. And similarly, there's another PR which is essentially going to be logging the size of the PG log entries, because in the past we've run into issues where one PG log entry, for whatever reason, has been huge. It's not something, I would say, that is fixing a problem or anything; it's more like developer knowledge. It saves us the trouble of going back and dumping the PG log and all kinds of objectstore-tool activities. That is also another PR that Nitzan from the RADOS team is working on.
A
Yeah, it was that the primary would send a message to the peers, yes.
L
Not immediately. I would need to review what the problem was in the first place, but I'm just going to say that any change that involves changing the message protocol, or adding messages between the OSDs, is a vastly...
A
...yeah, go ahead.
O
From my point of view, adding a message would complicate things very much, because, from what I remember, what I saw, even after adding it I still had some edge cases where it didn't work as expected in the tests. Probably it was just a problem with my implementation, but still, yeah. So if we think about another solution that does not require sending those messages at all, and instead acts differently, then yes.
O
When you restart an OSD, this information is lost and it just enters the active+clean state; but if you run a deep scrub, it will find this problem again. So the problem with our customer was that they didn't even run scrub, let alone deep scrub (they had this disabled), and were not aware of this problem. But they were still very unhappy about it and wanted us to fix it.
A
Cool, all right. And just to make sure: except for the condition that you just described, Mykola, there weren't any other upsides to doing this, if I recall correctly, right? Like, is it going to be handling any other weird edge condition?
O
No. Basically, this is about fixing this problem, which for us, I mean for me, the developer, does not look very critical; it's rather a minor issue, that the information about... well, it was...
L
Yes, but I'm saying the general version is the one where it gets detected during scrub; that's a lot more common. We don't currently write that down; we don't propagate it back to the replica or write it into the primary's missing set. So when you reset the acting set, that information gets lost until you re-perform a scrub. There's a special-case version of this where, if backfill is operating and it expects to find an object that isn't there, because again the PG is corrupted in the first place.
A
No, I'm all for, like, you know, reducing complexity. I still think the error injection tests and all that can still be improved, given that there are certain edge cases and things that users sometimes report and we are still not catching them enough.
N
Yeah, I mean, I will maybe quickly go through my points. The first one is: I want to make a continuation of some allocator tests that were done very long ago; basically, I didn't properly finish them. It stems from the recent tests Mark was running, where we had performance problems, a difference between Quincy and Pacific that we couldn't explain. Basically, to cut it short, we found that the allocator had a deficiency in some situations.
N
The overall problem with BlueStore allocators' behavior is that we basically pick one issue at a time: when we find some problem with an allocator, we do a modification, either to reduce fragmentation or to improve its speed, and in some conditions that is actually where we were having problems.
N
...fragment the data, put objects, basically do some simulation of a real-life workload.
N
With some data points, like testing RBD with a highly variable, high-rate random write, with the OSD filled to like 80 or 90 percent, and on such created states then performing allocator tests. The goal is to actually make a comprehensive table of our allocators' performance in different, almost corner-case scenarios, so we could actually validate them. And the usage for this would be, like: recently Mark made a very good change for the AVL allocator to keep it always moving its last-allocation search position, even if a proper place was not found, just so it doesn't repeat the same search next time. But currently we don't know how this will behave if our OSD is severely fragmented and also very full, and whether it wouldn't be better to actually pay more time now to find some better placement, even though we are just breaking some requirements. And yeah, basically, that's it. It's something where I would just make infrastructure for testing our new allocators, and the existing ones also, so we could know.
P
Adam, I think one of the tricky things with this is that we're definitely finding that different hardware behaves differently, not even so much just, you know, hard drives versus flash, but within, you know, the range of different flash hardware we're seeing different behavior. So it's a tricky problem.
N
Yeah, okay, I might have omitted that such testing would maybe be an analog of RADOS bench, so it could actually be run on a deployed (maybe empty, but deployed) OSD on actual hardware, not a synthetic test run on some validation hardware.
N
But if we suspect that something is fishy in some actual hardware environment, then we could use that to verify.
P
I think one of the things that can be really difficult for us is figuring out, like, what is the cost/benefit of spending time now to reduce fragmentation that will hurt us later, right? Like, you know, do you really need this one IO to be guaranteed to finish fast, or is it okay for this IO to take a little longer if it reduces fragmentation on the drive? So it's a hard question to answer, and then to test for.
N
Yeah, Mark, I totally agree, and to actually find out those boundaries we should have tests that push us very far, like completely filling the drives, because we know that for high...
P
Yeah, I mean, the whole reason why Kefu's PR went in last summer for the AVL allocator was because, I guess, you know, we were spending excessive time in near-fit searches, to the point where it was having a huge impact on customer clusters.
Q
Just... almost a side note about fragmentation.
Q
So I just want to say that I'm not sure this...
Q
Perhaps, yes. So, well, my point is: we need some means to fight the fragmentation. Is it defragmentation, or any other means? Like, yeah, probably defragmentation; that's the only thing which comes to my mind. But yeah, so the idea is we need something at a higher level than the allocator to fight it.
P
I agree with you, Igor. I think it's probably good for us to try to optimize for fast turnaround on allocations, maybe not ridiculously fast, given everything else that we deal with, so some, you know, "don't fragment everything immediately", but yeah, for cleanup, especially of really full disks.
N
Okay, so my take from this would be that pushing down such an endeavor will be better: just not making excessive testing infrastructure for the allocator, and instead thinking more about reclaiming and merging free space in some way, some defragmentation.
N
In actually spending the budget there, maybe I should focus more on making the defragmentation first, and then trying to improve and find some corner cases for allocators; that might be more gain.
L
Maybe, but a background defragmenter is going to require a lot of empirical evaluation; there's a lot of freedom in how you implement something like that. So I don't know, testing the fragmentation behavior seems pretty important to me.
N
Okay, in that sense, having an infrastructure that can test behavior could be a valuable input to also evaluate the defragmenter. I agree, yeah, I get that.
N
Go ahead. So, next topic: I guess it should be mentioned after Igor talks about his work on a custom write-ahead log for RocksDB, so I will postpone that. And the third topic is: we seem to have, I mean it's basically confirmed, in some cases a situation with deferred writes done to objects, meaning, instead of actually writing to the drive, we only mark it in the write-ahead log, in RocksDB, in the deferred operations table.
N
This is something we basically introduced when, in Nautilus, we dropped the mechanism that moved data blocks between BlueFS and the main BlueStore device, and that change caused us to open a window for data corruption. So it's possible that after a restart we will corrupt some BlueFS data. There are two possible solutions for this issue: the bad and quick one is to not execute a deferred write when we detect that it would destroy active BlueFS data.
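A toy illustration of that "quick" mitigation: skipping replay of any deferred write whose target extent now belongs to BlueFS. The data structures here are invented; the real check would live in BlueStore's deferred-replay path:

    # Toy model: on replay, drop deferred writes that overlap space BlueFS now owns.
    def overlaps(a_off, a_len, b_off, b_len):
        return a_off < b_off + b_len and b_off < a_off + a_len

    def safe_deferred_replays(deferred_ops, bluefs_extents):
        """deferred_ops: [(offset, length, payload)]; bluefs_extents: [(offset, length)]."""
        for off, length, payload in deferred_ops:
            if any(overlaps(off, length, e_off, e_len)
                   for e_off, e_len in bluefs_extents):
                continue   # replaying this would clobber live BlueFS data: skip it
            yield off, length, payload

    ops = [(0x1000, 0x1000, b"a"), (0x8000, 0x2000, b"b")]
    bluefs = [(0x8000, 0x10000)]          # BlueFS now owns this region
    print([hex(op[0]) for op in safe_deferred_replays(ops, bluefs)])   # -> ['0x1000']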
Q
Because we keep the deferred writes in the database after, okay, the chunks they are attached to are actually released, and hence on a non-graceful shutdown we might want to replay these deferred writes again, and they get into...
Q
So on one hand, we don't want to overcomplicate the deferred write processing.
Q
We are not that limited in CPU cycles, for instance, to do these checks, so it doesn't matter much if we need a bit more time during deferred replay on startup, rather than performing such checking on each write during regular operation.
Q
And just a couple more comments: I managed to reproduce that using a vstart cluster, and I saw that in at least one cluster in the field, and also, from time to time, we are getting some RocksDB corruptions with unknown causes, which might be caused by this issue. So I can't say it's very seldom; most probably we just haven't identified it properly all the time.
N
Yeah, actually my thinking was to first make a quick mitigation, as much as we can without any change, and then, on a longer timescale...
Q
And just another comment on that: this actually happens when the DB uses the main device, so either it's collocated with the main volume, or it spills over; it doesn't happen if you have a standalone one.
L
It's true that it won't corrupt RocksDB, but it could still overwrite another object's data, right?
Q
Well, what I'm trying to say: I think it's not possible, it's not corrupting user data. I can't prove that, but my current understanding is that it causes corruption to... well, it might only cause corruption to BlueFS.
Q
So, which means that if you use a standalone BlueFS volume, you are safe.
Q
Well, I'm not saying it's perfect, but just to mention it.
Q
But I completely agree that we need to fix that issue. The question is how to do that in a simple and straightforward way which wouldn't bring new issues, and yeah, again, trying to avoid this during regular operation looks not that easy.
Q
I was quite surprised when I realized that I could see that when using SSD drives only. I need to double-check that, but I have a concern that it might affect all the configurations as well.
Q
Yeah, I can say that I was planning some extensive discussion on that, just to try to recap the features you might want in the upcoming, in the next major release. The first one is this new write-ahead log, which I mentioned and which I'm currently working on.
Q
The idea is to remove the write-ahead log, that is, to replace the RocksDB-embedded write-ahead log with a standalone one residing at the BlueStore level, which will allow us to parallelize access to it from multiple threads. That is not the case with the RocksDB write-ahead log currently, as we access RocksDB from a single kv-sync thread.
Q
For the PG log: instead of using RocksDB, we can have a more natural way to keep the PG log using this write-ahead log, since we get more control over it; when we have it within BlueStore, we get more chances to do that.
Q
I still need to think about how to ensure consistency between DB transactions and PG log updates, so I just need some more time to proceed and think about it. And also there are at least a couple of new issues or tasks which we might want to fix or implement before moving the write-ahead log to production.
Q
The first, well, actually the second one I mentioned here, is that we need to refactor our BlueStore statfs tracking mechanism, which currently updates statfs on each metadata update transaction. Similarly to the allocation-map work, which avoids allocation map updates on each transaction but provides means to recover that on startup from the database, we can do the same for BlueStore statfs in absolutely the same manner.
Q
What it would be doing is enumerating all the objects, retrieving the chunks which are allocated for them, and building the allocation map. This happens on non-graceful shutdown only, and during the same process we can learn how much space each is keeping. So again, I'm on that at the moment, hopefully to publish this piece soon.
Q
Another thing that might be in front of us, and this applies both to the new write-ahead log and, to some degree, to Gabi's allocation map recovery, is, let's say, circular dependencies between the database, allocation maps, and some additional entities, like the write-ahead log recovery procedure. And it looks like, well, for instance, for the recovery procedure...
Q
...but this happens before the allocation map is retrieved, and hence BlueFS is unable to allocate space from the main device. So the workaround for this issue is, again, to use a standalone one: it tracks allocation maps independently of the database, and hence we are able to allocate chunks from it at any time. But for a shared device...
N
...our metadata on allocations from, okay. Again, my thinking is: maybe if we had the write-ahead log, your new one, maybe we could revert what we did with allocations, meaning we would not apply modifications to RocksDB, because we would have an allocation file with the allocations, and any updates that we did not do would reside in the write-ahead log, just piggybacking.
Q
That is orthogonal to the allocation map, and actually introducing this log makes this circular-dependency issue more visible, so I definitely need to avoid these circular dependencies for the new write-ahead log, and the only way I can do that right now is to use a standalone BlueFS volume for the DB.
Q
All right, but for the existing write-ahead log, we...
Q
There are probably some other issues with the new write-ahead log which make this more important, but it's hard to explain during the call; probably better offline.
Q
The major reason behind that is to improve BlueFS robustness, and to do that we might want to implement some redundant superblocks, for instance, and start using 4K allocation units. And maybe, well, not maybe, it would definitely be required for 4K allocation units: we will need expandable superblocks, because a smaller allocation unit sometimes results in a lack of... so they might suffer from the lack of enough space.
Q
...hence gaining performance at the cost of data safety. So for some use cases users might want to get performance but not care about data safety. And I did some PoC on that, and indeed the performance gain is pretty large, and the implementation is actually completely independent of the other OSD stuff, so why not, as a simple method.
A
Yeah, I think we discussed it in the performance call and we agreed that it's a great idea. Maybe once your PoC is ready it'll be useful to send it out to the mailing list and get more feedback; I'm curious to know what those use cases look like.
A
I like it. All right, any questions, anything we want to discuss on that topic, or should we move ahead?
A
All right, not hearing anything, so I will gloss over the next couple of topics. These are features we wanted to get into Quincy but that have spilled over; there are outstanding PRs for both. The first one is about configuration profiles that can be used for several purposes, on a pool level or... and there is a PR linked in this; I'm not going to go into further detail.
A
The next one is about automatic key rotation. There is also an outstanding PR that I think Radek has taken over from Sage; we are pretty close to getting that done for Reef. With that, I am going to hand it to Laura to talk about the balancer improvements that are in the pipeline.
E
Yes, so I'll go through this quickly. This is targeted for Reef, but it will be able to be backported to Quincy. Josh Salomon and I have been collaborating on a workload, or primary, balancer. So, in Ceph's current balancer implementation, it's important to balance write and read requests across OSDs for optimal performance, and the current capacity balancer works well to handle write requests, but there's still a need to balance read requests based on pool workloads. So, targeted for Reef, Josh Salomon and I have been working on implementing a workload balancer that will balance read requests on a pool-by-pool basis, and there was some work done for Quincy that made it possible for us to backport this.
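A rough illustration of the imbalance this targets: reads are served by each PG's primary, so an uneven spread of primaries skews read load even when capacity is balanced. The sketch below just tallies primaries per OSD from pg dump; the JSON field names are assumptions about the pgs_brief output:

    #!/usr/bin/env python3
    # Count how many PGs each OSD is primary for; a skewed distribution here is
    # the read-load imbalance a primary/workload balancer would smooth out.
    import json
    import subprocess
    from collections import Counter

    out = subprocess.check_output(
        ["ceph", "pg", "dump", "pgs_brief", "--format", "json"], text=True)
    dump = json.loads(out)
    pgs = dump.get("pg_stats", dump) if isinstance(dump, dict) else dump

    primaries = Counter(pg["acting_primary"] for pg in pgs)
    for osd, count in sorted(primaries.items()):
        print(f"osd.{osd}: primary for {count} PGs")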
E
I've also linked an open PR that will be included once the implementation has been merged, but it explains what the existing capacity balancer does, why there's a need for the workload balancer, how they would work in conjunction, and how they sometimes contradict each other for some use cases. This work is also set to be talked about at Cephalocon, so if you're planning on attending Cephalocon you'll hear more about the progress there. And then another thing I want to mention is something we have been working on but will carry into Reef: we want to improve balancer testing, and to do that we need more accurate examples of what... So this was brought up originally during the user+dev monthly meeting, but we created a tracker issue (it's at the bottom, the third thing listed in the appendix there) and we opened this up to the Ceph community so that people could share their osdmaps with us. This is, of course, voluntary.
E
We would never ask for osdmaps, you know, against people's will; this is all volunteered by Ceph community members. And the goal here is to improve balancer testing moving forward and to account for more realistic scenarios, and we hope to continue collecting during the development of Reef. That's pretty much all I have for that, and of course a lot of it is about the workload balancer especially, but hopefully, with the links I've included, if you have more questions or want to know more details, you can check those out. That's all I had, but if there are any questions, feel free to ask or comment.
A
Cool, thanks Laura. And, as you rightly pointed out, if you want to hear more about it or stay tuned on the progress, this will be discussed in the Cephalocon talk. So I think, with that, I'm going to move to the next topic, which is about the autoscaler. This is a very open-ended topic, I would say; there's nothing in particular that we have planned or that we want to do, but we are... is Junior on the call?
A
Yeah, so I guess, Junior, maybe you briefly want to talk about, like, you know, the checks and balances that were added in Quincy, and some recent issues that were brought to our notice, because of which we probably want to revisit what else we want to do for the autoscaler.
J
Sure. So for Quincy, I think the big thing that we added is the --bulk flag. Essentially, any pool that is a data pool that we anticipate will need a lot of PGs, you would create the pool with the --bulk flag, and that will essentially start the pool with the maximum amount of PGs that it can be given, and any pool that doesn't have, like, the --bulk flag...
J
The other feature, I think, is the noautoscale global flag, where you can turn off the autoscaler globally with just one command, rather than going into each pool and manually, like, turning it off. And the issue we just faced with this: there's a set of users, and that user basically upgraded from, I think, 14.something to, like, 16.2.7, which is, I believe, Pacific.
J
And there was some rebalancing issue where it takes, like, many days for, I think, the PGs to decrease, and basically, I think, the motivation behind this topic is: should we make the autoscaler not, like, do things behind the user's back, like, if the scaling of the PGs will, you know, really impact the performance of the cluster, or it takes too long for it to change the number of PGs.
A
Yeah, I think the general questions that we have on our minds are around visibility into what the autoscaler is doing, and especially this one; it seems to be a weird case where the autoscaler tried to scale the pools down by, you know, four times or something. So why did that happen, and, you know, what extra logging or improvements can we add to the autoscaler to be able to easily catch these kinds of cases?
A
In general, we have a max-misplaced-objects metric that the autoscaler uses to not create too many misplaced objects at a time. But given this particular case, it seemed like this is one case where the autoscaler was on in a large cluster, so even the number of misplaced objects, based on the total number of objects, did make sense initially; but there are still some unknowns that we are trying to figure out in this case. And in line with this, we want to, you know, in general see what else the autoscaler needs in order to perform well at all kinds of scale.
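For reference, a small sketch of exercising the knobs mentioned above from a script. The command spellings (--bulk on pool create, the per-pool bulk property, the global noautoscale flag, and autoscale-status) are recalled from the Quincy-era docs, so double-check them against your release:

    # Sketch: drive the autoscaler knobs discussed above via the CLI.
    import subprocess

    def ceph(*args):
        print("+ ceph " + " ".join(args))
        subprocess.run(("ceph",) + args, check=True)

    # A data pool expected to need many PGs: start it with the bulk hint so the
    # autoscaler gives it the maximum PG count up front.
    ceph("osd", "pool", "create", "mybigpool", "--bulk")

    # The hint can be flipped later.
    ceph("osd", "pool", "set", "mybigpool", "bulk", "false")

    # Turn the autoscaler off globally with one command instead of per pool.
    ceph("osd", "pool", "set", "noautoscale")

    # Inspect what the autoscaler is doing / would do.
    ceph("osd", "pool", "autoscale-status")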
A
All right, if not: there are two more topics. There's one about osdmap trimming on the OSD that was brought to my attention by Prashant yesterday. I do not have too much context on it, but it seems like there's a case that has come up where there are millions of osdmaps on the OSD that are not getting trimmed, because a trigger of osdmap generation is required.
H
Yeah, I added that one. It's been on the backlog for a while, since we haven't really done... we can look at it. I think it might be becoming more relevant again, with more analytics workloads being run on top of RGW, and other sorts of processing engines that end up doing smaller reads than the whole object.
H
...and then I went into the cost, basically, for a k+m erasure code. So basically, the read case: the read case is relatively simple compared to the write case, and I think it seems like it would provide significant benefit in CPU load for these kinds of cases. Another application would potentially be a log-structured format for RBD, which could be stored in erasure-coded pools much more efficiently, but where reads, especially small reads, would be expensive with the current striping strategy.
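A back-of-the-envelope illustration of why small reads are costly under the current striping (all numbers invented, and the "whole stripe from k shards" cost model is a simplification of today's EC read path):

    # Toy arithmetic: shards touched by a small read, with and without
    # sub-stripe ("partial") reads, for a k+m erasure-coded pool.
    import math

    def shards_read(read_bytes, k, stripe_unit, partial_reads):
        if not partial_reads:
            # Simplification: today a read fetches the covering stripe from k shards.
            return k
        return min(k, max(1, math.ceil(read_bytes / stripe_unit)))

    k, m, stripe_unit = 4, 2, 4096
    for size in (4096, 16384, 65536):
        print(f"{size // 1024:>3} KiB read: "
              f"full-stripe -> {shards_read(size, k, stripe_unit, False)} shards, "
              f"partial -> {shards_read(size, k, stripe_unit, True)} shard(s)")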
P
A
A
A
Okay
with
that,
I
think
we've
got
a
couple
more
broader
things
to
discuss
in
terms
of
testing
improvements
and
board
cleanup
and
deprecation.
I
know
we
are
over
time,
but
maybe
next
five
minutes.
We
can
quickly
touch
upon
these.
A
So
I
guess
the
goal
here
is
to
talk
about
testing
coverage
improvement
for
things
like
stretch
mode
that
got
implemented
a
couple
of
years
ago,
but
we
are
starting
to
find
bugs
when
our
downstream
folks
are
testing
it.
So
it
seems
like
we
need
more
coverage
across
stretch
mode
testing
in
topology
other
than
we
have
some
facets,
which
change
the
election
mode
and
I
also
believe,
there's
a
net
split
subsuite.
F
Yeah,
I
don't
think
the
netsplit
is
actually
run
anywhere.
I
started
on
it
and
it
sort
of
was
like
okay,
here's
here's
the
tasks,
but
actually
integrating
with
anything
useful
is,
is
not
done,
and
I
know
it's
confused
like
I
implemented
stretch
mode
and
there
was
no
pretense
that
the
technology
testing
was
adequate.
The.
H
F
Goes
back
some
to
some
of
the
well,
maybe
it
doesn't.
There
was
a
discussion
earlier
that
was
making
me
think
of
this.
So
there's
the
election
mode
testing
is
pretty
good
because
we
can
just
switch
it
for
the
monitors
and
run
it
everywhere
and
because
I
you
know,
wrote
a
new
rope
wrote
a
bunch
of
new
monitor
election
logic.
That
was
easy
for
me
to
architect
so
that
I
could
write
unit
tests
for
it.
F
But
setting
up
netsplit
is
a
pain
and
at
the
time
that
I
was
working
on
it,
then
we
just
couldn't
really
count
on
and
then
I
know
this
is
fixed
now
we
just
couldn't
really
count
on
one
test
that
took
more
than
two
nodes
really
ever
running,
and
you
know
we
just
mostly
can't
unit
test
the
peering
state,
which
is
in
lsds,
which
is
a
bummer,
but
basically
someone
with
you
know
more
time
for
development
than
me
needs
to
sit
down
and
make
right
tests
for
this.
F
I'm
doing
it
or
a
definitely
contact.
That's
basically
like
here's.
What's
written,
here's
what's
not
written
and
by
the
way
please
write
tests
because
we
need
them.
A
So how about we... I mean, at least one of the problems that you mentioned, like, you know, a test with more than two nodes: we don't have any kind of constraints at this point with the new dispatcher stuff in teuthology. So I think the best starting point would be to look at what the netsplit suite is doing, and maybe you can talk about that in your talk, and start running it and add more coverage there.
F
And then, since it's now possible to run things with three or five nodes or whatever, we could use that to start doing exactly the netsplit testing. But all those things need to be written.
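As a rough illustration of what the missing pieces amount to (this is not the existing teuthology task; the helper, host handling and iptables rules below are assumptions), the core of a netsplit test is just dropping traffic between two groups of nodes and later healing it:

```python
# Rough illustration only -- the real teuthology netsplit task will differ.
import subprocess

def _iptables(host: str, args: list[str]) -> None:
    # Hypothetical helper: run an iptables command on a remote test node via ssh.
    subprocess.check_call(["ssh", host, "sudo", "iptables", *args])

def start_netsplit(group_a: list[str], group_b: list[str]) -> None:
    """Drop all traffic between every host in group_a and every host in group_b."""
    for a in group_a:
        for b in group_b:
            _iptables(a, ["-A", "INPUT", "-s", b, "-j", "DROP"])
            _iptables(b, ["-A", "INPUT", "-s", a, "-j", "DROP"])

def heal_netsplit(group_a: list[str], group_b: list[str]) -> None:
    """Remove the DROP rules added by start_netsplit."""
    for a in group_a:
        for b in group_b:
            _iptables(a, ["-D", "INPUT", "-s", b, "-j", "DROP"])
            _iptables(b, ["-D", "INPUT", "-s", a, "-j", "DROP"])
```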
A
All right, I think we at least have a path forward for this. The next one, and maybe the next few topics, are not just RADOS specific; I think the whole Ceph project can benefit from these. The first one here is about large logical scale testing.
A
This
is
not
not
much
discussion
to
be
done,
but
work
to
be
done
any
other
any
other
thoughts.
Anybody
else
has
on
this
and
why
I
said
this
is
a
crucial
not
just
for
raiders
I
mean
any
other
component
can
start
doing.
You
know
larger
scale
testing
you,
given
that
this
piece
has
already
merged,
so
we
are
not
literally
filling
up
the
the
devices
in
the
smithy
machines,
etc.
So
we
can't
afford
to
run
tests
with
larger
number
of
osds,
even
for
a
longer
amount
of
time.
If
you
want
to.
A
Yep, yep, that's a good point, and it kind of ties in with the config profile thing we discussed: we could have a config profile that would be, like, a low resource usage config profile that we could set, and the test would just, you know, tune the cluster based on that.
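A sketch of what such a profile could boil down to; the "low resource" profile itself is hypothetical, but the options it sets (osd_memory_target, osd_max_backfills) are real Ceph settings that it might tune down so many OSDs can share one test node:

```python
# Hypothetical "low resource" profile applied through real `ceph config set` calls.
import subprocess

LOW_RESOURCE_PROFILE = {
    ("osd", "osd_memory_target"): str(1024 * 1024 * 1024),  # 1 GiB instead of the 4 GiB default
    ("osd", "osd_max_backfills"): "1",
}

def apply_profile(profile: dict) -> None:
    """Push every (section, option) -> value pair into the cluster configuration."""
    for (section, option), value in profile.items():
        subprocess.check_call(["ceph", "config", "set", section, option, value])

if __name__ == "__main__":
    apply_profile(LOW_RESOURCE_PROFILE)
```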
A
All right, with that, the next topic is everybody's favorite topic and one of our pain points: upgrades. I think it's high time we improve our upgrade test coverage, and we had some discussion around how do we ensure that all the PRs that are getting backported, and even, like, new PRs that are coming in, are going through adequate upgrade tests or upgrade test runs. I feel this is going to be an ongoing discussion, but some of the basic ideas that came in were like:
A
If
there
are,
let
us
say,
encoding
changes,
or
there
are
pieces
of
code
that
are
prone
to
backward
compatibility
breakages,
we
could
have
things
like
github
hooks,
which
would
raise
some
sort
of
indication
in
the
pr
to
a
increase.
The
awareness
of
the
reviewer
be
kind
of
make
the
upgrade
test
run
a
requirement,
and
not
just
you
know,
leave
it
to
to
the
p
person
who's
testing
it
to
identify
whether
an
upgrade
risk
is
needed
or
not.
So
I
guess
there
are
a
few
ideas.
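A hedged sketch of what such a hook could check; the diff source and patterns below are assumptions for illustration, not an existing Ceph CI job, though ENCODE_START/DECODE_START are the real encoding macros:

```python
# Sketch: scan a PR's diff for touched encode/decode paths and warn that the
# change should go through an upgrade-suite run before merging.
import re
import subprocess

ENCODING_PATTERN = re.compile(r"ENCODE_START|DECODE_START|::encode\(|::decode\(")

def pr_touches_encoding(base: str = "origin/main", head: str = "HEAD") -> bool:
    diff = subprocess.check_output(
        ["git", "diff", f"{base}...{head}", "--", "src/"], text=True
    )
    # Only look at added/removed lines, not unchanged context lines.
    changed = (line for line in diff.splitlines() if line.startswith(("+", "-")))
    return any(ENCODING_PATTERN.search(line) for line in changed)

if __name__ == "__main__":
    if pr_touches_encoding():
        print("WARNING: this PR touches encode/decode paths -- "
              "schedule an upgrade suite run before merging.")
```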
A
We
need
to
start
implementing
some
of
this
github
hooks
thing
or
even
like
bring
it
to
our
day-to-day
when
we
are
reviewing
pr
things
that
need
upgrade
testing,
how
do
we
ensure
that
they
don't
get
merged
without
that,
and
the
other
thing
is
that
we
are
also
doing
baseline
baseline
tests
for
for,
like
almost
all
suites,
for
the
master
branch
every
week.
So
upgrades
is
going
to
be
something
that
we
are
going
to
do.
You
know
test
with
that.
A
If
not,
I
am
just
going
to
leave
the
last.
I
mean
this
is
just
a
general
topic
that
we
we
have
like
things
like
trello
tracker
tracker
in
which
we
track
features,
but
we
don't
have
anything
where
wherein
we
can
actually
track
test
improvements,
and
one
of
the
use
cases
was
that
we
had
a
contributor
who
who
had
a
valid
patch,
but
we
wanted
them
to
write
a
test
and
they
because
of
their
lack
of
familiarity
or
whatever
it
is
weren't
able
to
contribute
that
particular
test.
A
So it's a general question about what mechanism we want to use to track test improvements. We could do it in the form of, like, you know, a separate category in our Redmine component: like we already have bugs and features, we could have something like this that we're looking at at a regular cadence, or we could do Trello. But I believe this is also another topic where we want to get more developer feedback. Any other thoughts or strong opinions on this?
E
I just wrote a comment: I like the idea of using Redmine for that, because that's what a lot of users are familiar with, and it's a good idea overall, I think.
A
Yeah, I think Radek and I were talking about this, and we were like, okay, the obvious place would be Redmine. Adding a new category there shouldn't be a big task, but if you think that's a good idea, we can just go ahead and get that done.
A
And
yeah,
I
think
the
next
one
is
it's
again,
something
that
the
raiders
team
has
seen
so
currently,
the
entire
time
for
to
complete
a
raiders
sweet
run
is
like
five
hours.
I
would
say
not
less
than
that.
Of
course.
What
can
we
do
to
reduce
that?
So
I
think
this
is
again
a
common
topic
we
will
be
discussing
in
different
forums,
but
just
wanted
to
raise
awareness
on
it.
C
Sure. Well, actually, one of the ways to reduce the run time of our tests is to deprecate components. For instance, I think we're still testing FileStore pretty extensively. If we deprecate it, if we finally remove it, we could actually get back some of our resources, and we could either spend them on more, let's say, BlueStore testing, or just squeeze the time and the resources spent on a particular single run. So it would also be beneficial. However, I haven't... FileStore anyway, I'm...
C
I
really
I
would
really
vote
for
it.
A
Yeah,
so
I
think
we
can
that
prick.
We
have
already
sent
out
a
deprecation
warning
in
quincy
for
file
store,
so
I
would
say
most
of
our
test
cases
can
now
afford
to
not
run
with
file
store.
Maybe
there
are
things
like
upgrade
tests
and
stuff
which
still
should
use
file
store,
but
I
think
in
general
we
can
get
rid
of
it.
C
Easy... it's not a no-brainer. Another point is about MemDB: it's an outstanding PR.
C
And
the
question
is
whether
do
we
really
have
an
active
user
of
it,
because
if
not,
we
could
not
only
kill
one
of
the
kv
backends,
but
we
also
could
eradicate
some
complexity
from
buffer
list.
Actually,
there
is
batteries.
Buffer
raw
has
the
interface
for
cloning,
which
means
a
bunch
of
code
and
even
putting
some
extra
data
for
into
each
of
our
buffers
just
for
the
sake
of
cloning
for
the
cloning
responsibility
which
is
being
used
solely
inside
mdb.
So
I
believe
there
are
a
few
rabbits
to
shoot
the
same
ballot
here.
C
Being
unsure,
I
think
we
should
I
I
think
we
should
at
least
make
an
announcement
to
to
the
mailing
list.
Ask
ask
all
developers
I
mean
outside
whether
it's
still
useful
for
them
or
not.
I
would
expect
a
silence
to
be
honest,
but.
C
No,
I
think
it's
for
k
for
testing
the
the
kv
interface,
so
it's
not
used
by
memster.
F
...that down, because there's no way something like that is exposed to any external users; it's just about what we're...
A
Okay,
I
think
it
doesn't
hurt
to
ask
in
the
user
thread
if
there
are
existing
users,
they'll,
probably
chime,
and
then.
C
Yep. Well, another question it spawns is whether we need to go through the full-blown deprecation and removal process, or whether maybe we could just remove it immediately in Reef.
H
Like we said, I think it's a developer feature, not a user feature, so I think it's really a removal question, not a deprecation question.
C
I think we have this point addressed, so moving to another one. The next one is the legacy watch op, a very old one. However, there is one catch here: krbd, and actually kernels, kernel versions around 4.6.
C
There
is
a
last
comment
from
from
ilya
droimov
who,
who
made
us
aware
that
this
this
old,
this
legacy
app
is
still
in
still
is
being
used
by
by
on
very
old
cardinals,
okay,
everything
below
4.7
from
july
2016.,
six
years
coming
from
now,
six
years
after
after
the
release,
I
think
it
would
be
rather
safe
to
finally
kill
it.
C
Still,
I'm
not
entirely
sure-
and
I
I
will
need
to
I
really.
I
would
really
appreciate
feedback
on
that
battle.
The
benefit
the
potential
benefit
from
eradication
is
that
actually
the
watch
notify
machinery
it
got
into
now
it
got
in
the
classical
osd.
It
got
two
variants,
two
tastes,
two
flavors
one
is
assuming
that
we
can
reliably.
C
Discover
a
connection
reset,
which
is
the
legacy
one
well
internet
wrong.
The
assumption
is
wrong
and
later
implemented,
a
full
blown,
pink
punk
machinery
on
our
own
just
to
detect
just
detect
connection
issues
and
this
one
in
the
new
flavor.
Only
the
new
flavor
got
implemented
in
crimson.
C
...at this stage of Crimson, because it requires basically some interaction with ms_handle_reset of a messenger. In other words, it makes it a necessity for our sessions to be aware of the watch connection state.
C
If we can remove that, it will for sure decrease the complexity of the watch-notify stuff pretty significantly.
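As a toy illustration of the distinction being drawn (this is not Ceph code): the new flavor decides watcher liveness from explicit, periodic pings rather than trusting the transport to report a broken connection:

```python
# Toy sketch: a watch is considered alive only while pings keep arriving
# within a timeout window, independently of connection-reset events.
import time

class WatchState:
    def __init__(self, timeout: float = 30.0):
        self.timeout = timeout
        self.last_ping = time.monotonic()

    def on_ping(self) -> None:
        """Called whenever the watcher's ping reaches the OSD."""
        self.last_ping = time.monotonic()

    def is_alive(self) -> bool:
        """Liveness is decided by ping recency, not by transport state."""
        return (time.monotonic() - self.last_ping) < self.timeout
```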
F
Those are two different questions, I think. I can't imagine there's any reasonable argument that we need to implement the old version in Crimson. For removing it, though, what we need to do is look at what distros are shipping which kernels, and what the LTS releases are, probably, and then make a call, because if, you know, there's a still-active Ubuntu LTS that doesn't support the new watch-notify, we probably don't want to kill it until that's done.
C
Agreed, agreed. But actually, yes, there are two cases. The first one is Crimson implementing it: we don't need to, of course we don't need to do that, unless somebody would mandate the support of very old kernels, right, or not. The second one is whether it's removable, so yeah, I think you're also right: we need to start by taking a look at the current kernel versions.
C
Once we have the answer, okay; no need to spend more time on that. The last thing is inside the MGR module... I don't know whether that...
A
Alright, I agree, and that is pretty much everything that's on the agenda. We are 27 minutes over time, but I guess, is there anything else which is not on the agenda that you want to talk about?
A
If not, thank you for joining the RADOS session for Reef, and thank you for staying over time as well. Have a good rest of your day, see you later, bye. Thank you, everyone. Bye, thanks.