From YouTube: Ceph RGW Refactoring Meeting 2023-06-14
Description
Join us every Wednesday for the Ceph RGW Refactoring meeting: https://ceph.io/en/community/meetups
Ceph website: https://ceph.io
Ceph blog: https://ceph.io/en/news/blog/
Contribute to Ceph: https://ceph.io/en/developers/contribute
What is Ceph: https://ceph.io/en/discover/
A
I also opened a Tracker issue to track this. The earlier pull request tried to offload the md5 work to background threads, and that was a little too expensive in thread synchronization. But other projects are using SIMD and vectorization for this instead, and have shown speedups of up to 5x, which sounds really useful.
A
I did a search on GitHub and found a C++ library that has an AVX2 implementation that we might try to experiment with. I'm hoping that somebody is interested in taking a look at this, because I think it could give us some good wins and be beneficial in reducing our CPU usage.
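A minimal sketch of how an AVX2 implementation like that is typically wired in behind a runtime check, so older CPUs still take a portable path. This is not Ceph code; md5_avx2 and md5_portable are hypothetical stand-ins for whatever library implementation gets chosen.

```cpp
#include <cstddef>
#include <cstdint>

// Hypothetical stand-ins for a chosen md5 library's entry points.
void md5_portable(const uint8_t* data, size_t len, uint8_t digest[16]);
void md5_avx2(const uint8_t* data, size_t len, uint8_t digest[16]);

void md5(const uint8_t* data, size_t len, uint8_t digest[16]) {
  // GCC/Clang builtin: checks CPUID for AVX2 support at runtime.
  if (__builtin_cpu_supports("avx2")) {
    md5_avx2(data, len, digest);      // vectorized fast path
  } else {
    md5_portable(data, len, digest);  // fallback for older CPUs
  }
}
```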
A
I actually don't have much experience with that instruction set. I don't know how common it is or when it was added.
B
AVX2 is widely available. Let me check exactly when it came in, but Intel as well as AMD introduced it a while ago; it's not new.
A
Okay, yeah, I'll send an email and CC him, and we'll just track this feature in the tracker linked in the agenda.
C
So I started to take a look at the multiple Zipper backends, or multiple Zipper filters, work. I know you sent an email about this last August, Casey, and I read that in detail and had a few questions.
C
The questions I had were, I guess: so, I understand that the email you sent out outlined multiple backends. How is that the same as, or different from, using multiple filters with just one backend to start with?
A
So I mean, you would just need to use the nesting in the JSON. Sorry, I've got to answer the door; be right back, yeah.
A
As we process it, we would essentially just use factory functions for each type of filter. They would parse that JSON, and then, if they need to instantiate other backends or other stores, they would call into the factories for those, in a nested, recursive fashion.
E
No, so, if I'm understanding correctly, and Casey, correct me if I'm wrong, because I would like to understand this: you would have a JSON object that describes something, a driver, and you would call the factory function on that JSON object, right? If that JSON object is just RadosStore, it's done.
E
If that JSON object is a filter, then it would contain within it a nested JSON object that describes another driver, right? And that filter would then call the factory function on that nested JSON object. It could conceivably have a list of JSON objects that it calls the factory on for each of them, but that would be specific to that driver; that driver's factory function would have to be able to understand the JSON with the list, right?
E
So if you tried to use a filter that could only have one thing below it and you tried to pass it a list, it would fail to parse the JSON and cause an error. And ultimately, at the leaf of every one of these, the last non-nested JSON object would be a store in every case.
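A minimal sketch of the recursive factory scheme described above, assuming a hypothetical Driver base and using nlohmann::json purely for illustration; the real zipper code would define its own types and parser.

```cpp
#include <memory>
#include <stdexcept>
#include <string>
#include <nlohmann/json.hpp>

struct Driver {                  // illustrative base for stores and filters
  virtual ~Driver() = default;
};

std::unique_ptr<Driver> make_driver(const nlohmann::json& cfg);

struct RadosStore : Driver {     // leaf: a store never wraps another driver
  explicit RadosStore(const nlohmann::json&) {}
};

struct TraceFilter : Driver {    // filter: wraps exactly one nested driver
  std::unique_ptr<Driver> next;
  explicit TraceFilter(const nlohmann::json& cfg)
    : next(make_driver(cfg.at("next"))) {}  // recurse into the nested JSON
};

std::unique_ptr<Driver> make_driver(const nlohmann::json& cfg) {
  const std::string type = cfg.at("type").get<std::string>();
  if (type == "rados") return std::make_unique<RadosStore>(cfg);
  if (type == "trace") return std::make_unique<TraceFilter>(cfg);
  throw std::runtime_error("unknown driver type: " + type);
}
```

If a config hands TraceFilter a list where it expects a single nested object, the recursive call throws while parsing, which is exactly the fail-at-startup behavior described here.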
E
Okay, and so this way this protects us from, you know, trying to pass branching config into something that can't handle branching config, protects us from dangling filters, and protects us from trying to have a store be a filter. In each of those cases, its factory function only knows how to parse the stuff that's valid for it, and if it tries to parse something that's not valid for it, it'll just fail, causing the config to fail out and presumably RGW to fail to start.
A
But yeah, the idea was that all RGW instances in the same zone would read the same configuration from the zone, and so we wouldn't have different stacks trying to serve the same data, right.
C
And then, well, the last two questions I had were: number one, I see in the email that in your prototype for loading a store, you're doing a dlopen and then a dlsym for a shared object library. Was that assuming the loadable-modules work would land? Since it's not currently how we initialize, okay.
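For reference, the dlopen/dlsym pattern being asked about looks roughly like this (POSIX dlfcn API; the "create_store" symbol name and the returned type are hypothetical, since a real loadable-module scheme would define its own entry-point ABI):

```cpp
#include <dlfcn.h>
#include <cstdio>

using create_store_fn = void* (*)();

void* load_store(const char* path) {
  // Load the shared object holding the store implementation.
  void* handle = dlopen(path, RTLD_NOW | RTLD_LOCAL);
  if (!handle) {
    std::fprintf(stderr, "dlopen failed: %s\n", dlerror());
    return nullptr;
  }
  // Look up the module's factory entry point by symbol name.
  auto create = reinterpret_cast<create_store_fn>(
      dlsym(handle, "create_store"));
  if (!create) {
    std::fprintf(stderr, "dlsym failed: %s\n", dlerror());
    dlclose(handle);
    return nullptr;
  }
  return create();  // instantiate the store provided by the module
}
```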
F
Well, you can ask. Before I left on PTO I had successful builds on Focal that did not get the undefined refs that I'd been getting previously; builds on f36 were always successful, or yeah, after a certain point were always successful. So I just need to rebase, run some local tests, and update the PR, thank goodness, and possibly figure out, you know, why.
C
The last question I had around this is: how can I make some incremental progress, or work towards getting this started? Just by working on, I guess, having that factory?
E
I would start with one backend, and then one filter and one backend, and then move from there. Okay, we do have the base filter, which is strictly pass-through, so when you want to do a bigger stack, you can always stack up, you know, two or more of those if you need to, because they're no-ops; that would be useful for testing.
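A sketch of why stacked pass-through filters are handy for testing: every operation simply delegates to the wrapped driver, so any number of layers behave like none. This reuses the illustrative Driver base from the factory sketch above, not the actual zipper API.

```cpp
#include <memory>
#include <string>
#include <utility>

struct Driver {                        // same illustrative base as above
  virtual ~Driver() = default;
  virtual std::string get_name() const = 0;
};

struct PassThroughFilter : Driver {
  std::unique_ptr<Driver> next;        // the wrapped store or filter
  explicit PassThroughFilter(std::unique_ptr<Driver> n)
    : next(std::move(n)) {}
  // Every call just forwards, so stacking two (or ten) of these on a
  // store exercises the plumbing without changing behavior.
  std::string get_name() const override { return next->get_name(); }
};
```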
A
For each type of driver, you're going to have to define the JSON format. So I would start with the RadosStore and make a factory function for that, one that knows how to parse the JSON that we need for the RadosStore. Then you can pick a type of filter and create a format for that, and where it needs a next backend, you could plug in the existing rados stuff and expect that to work.
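As a concrete illustration of the kind of nested format being described, a config might look something like this; the field names are hypothetical, since defining the actual format is exactly the task outlined above:

```json
{
  "type": "trace",
  "next": {
    "type": "rados",
    "pool": "default.rgw.buckets.data"
  }
}
```

The outer object would be parsed by the filter's factory, which recurses into "next" and hands that object to the RadosStore factory.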
H
…radosgw-admin commands to clean things up in the cases where they do. What it essentially boils down to is that there are concurrency issues with clients submitting requests for the same key at almost the same time, and the timing creates race conditions that a lot of times end up leaving behind index entries, and sometimes also instance objects, that never get cleaned up. We found that in production it's a big problem for us.
H
I mean, we've cleaned up billions of these entries and millions of these leftover objects over the past couple weeks, and it affected a lot of customers. They've affected things in multiple ways, including performance issues that result from having a lot of extra index entries and iterating over them, and also just complete failure of lifecycle processing.
H
Due to those, because there's a limit to the number of these entries that the OSD CLS method will iterate over before it returns an error, and it ends up being unhandled. So, long story short, I think the PR is covering pretty much all the cases that we know about causing issues for us at this point. There is a slight difference in some of the older releases: I had initially indicated on the tracker that there was a commit on main and reef that had fixed the issue, and I updated the tracker this morning as well.
H
But I was slightly wrong about what I originally thought was going on. It turned out that the reason it was fixed on main and on reef was that there was a different commit that changed the way resharding worked, such that it didn't change the instance ID of the bucket. That was actually the reason the PUT 404 part of it, specifically, wasn't a problem on main and reef. Some of the other issues are still an issue on main and reef and are in the main PR.
A
That one specifically we had already been tracking, as "OLH returning -2 when resharding"; I attached that as a related issue to number 61359, okay.
H
Yeah, I'm going to go ahead and open a PR for the Pacific and Quincy fix for that, then. I've had that for a little while; we've kind of been playing with it in some of our environments, so I think that's good. And we've also been using the radosgw-admin components of this PR for a couple weeks in production as well, and that has been going really well to clean things up.
H
So I guess at this point I'm looking for feedback. Casey, I'm wondering where you stand on the PR, and then after that I'd like to talk a little bit more about just how we can do a better job of testing these kinds of scenarios in the future, where it's related to concurrency issues or errors that are thrown, not happy-path kinds of scenarios.
A
Yep, I've still been working on review. I'm focusing on your radosgw-admin commands; initial feedback is that I think coroutines would be a better fit than the threading stuff, but the actual repair stuff looks okay.
H
So yeah, as I was alluding to, the more general conversation, I think, is this: I think there are three or four separate scenarios here where there can be race conditions between different threads, or different radosgw instances, when requests are made for the same key. We've had a couple in the past as well, and I guess I'm still concerned there may be more cases, because it's pretty hard to reason about what's possible with all the different steps within those transactions.
H
I have a couple of those, actually, in this PR, but that's pretty unwieldy for thoroughly testing this kind of thing, since there are so many different places where you might want to inject delays or errors in order to tell whether something is happening. And then also, once you fix the issue, of course, with the new implementation the test often becomes irrelevant in a sense, when you're trying to inject specific errors or delays and stuff.
A
Yeah, so most of our test coverage is coming from s3-tests, which doesn't have visibility of, you know, the xattrs or leftover index entries, so I'm glad that you're adding some test coverage outside of s3-tests that specifically looks for those things; I think that's a good strategy. I agree that it's not great adding a bunch of config variables to inject errors, but I can't really think of a better way to do that.
H
Yeah, my initial thought, without having put much time into it, from a high level, is potentially having an alternate version of the Objecter, something that can do some random latency injection on different calls, or error injection or something, just randomly while we're running s3-tests. Maybe just to kind of exercise some different cases there, where things can happen that aren't quite right, but that we actually see quite a bit in production, apparently. Maybe I'll look into that more and we can talk about it on a future call.
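A minimal sketch of the random fault injection being proposed here, written as a generic wrapper rather than an actual Objecter variant; the probability knobs and names are made up for illustration:

```cpp
#include <chrono>
#include <random>
#include <stdexcept>
#include <thread>

struct FaultInjector {
  double delay_prob = 0.01;   // hypothetical config: chance of added latency
  double error_prob = 0.001;  // hypothetical config: chance of a failure
  std::mt19937 rng{std::random_device{}()};

  // Wrap any operation; most calls pass through untouched, so a normal
  // s3-tests run keeps working while race windows occasionally widen.
  template <typename Fn>
  auto call(Fn&& fn) {
    std::uniform_real_distribution<double> dist(0.0, 1.0);
    if (dist(rng) < delay_prob) {
      // Random stall to widen race windows between concurrent requests.
      std::this_thread::sleep_for(std::chrono::milliseconds(rng() % 200));
    }
    if (dist(rng) < error_prob) {
      throw std::runtime_error("injected failure");
    }
    return fn();
  }
};
```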
H
I'll see if I can come up with some proposals of more specific things that we might do to gain some more coverage, covering some of those chaotic cases for the future.
A
Yeah, that would be great. The way I see it is that you've essentially just discovered ways that our consistency model was broken, and you have good fixes for those. But I think at a higher level it would be really valuable if we could document exactly what our model is for versioning, and I would be happy to work with you on trying to create some documentation that describes the different steps and how the OLH head object corresponds with the bucket index transactions to keep things consistent.
H
Not a draft of it, but I do have a lot of notes and a lot of stuff put together, because it took me a long time to go through and figure out how everything worked together, and I don't want to lose that on my own end either. But I can see where that would be helpful to share with everyone, to kind of give a human-level understanding of what the whole process is without having to step through all the code.
A
Yeah, that would be great for docs. I think the place to start would just be a very high-level outline, and then we could fill in the details kind of incrementally.
A
Yeah, it would be good to combine that with a step at the end, maybe, that scans for orphaned instance objects or index entries in testing, I think.
A
Yeah, some kind of automation to set things up and start running them would be great, and then we could build a process around releases, or some way to make sure that we're running the stuff regularly and actually looking at it.
H
Yeah, I mean, I think that would probably go a long way. My other thought is just handcrafting, like, really abusive clients, the sort that you end up seeing in real production deploys, and yeah.
H
Then do some sanity checks after running some of those kinds of things, just to make sure that there aren't leftover index entries and stuff like that, which we can use some of these new radosgw-admin commands to check for pretty easily at this point. But I think some of these things would have been fairly easy to expose if you just had a client that was doing a bunch of things, and a bunch of concurrency kinds of things.
B
Is that better? Yes, much better, thanks. Okay, I just moved my microphone, sorry. No, no, I mean: do you think it relies on clients performing illegal operations, or illegal operation sequences? Because otherwise, a good test tool that can generate a lot of activity can mostly do what's needed. Maybe you might inject latency and things like that.
H
Well, I mostly mean, like, we just have particular clients that submit a lot of requests, and a lot of concurrent requests. They might do a few deletes for the same key within a few milliseconds of each other, or something, and that has exposed issues; or just really frequent updates to a particular key has caused some stuff, apparently.
B
It can run, it can. It has some real issues too, but they're probably fixable. It can issue a lot of requests in parallel, and it can gang up a bunch of clients, automating them from different hosts, and consolidate the results. It has sort of built-in canned workloads that are, you know, statistically arranged.
B
For multiple clients, work was started on injecting latency with tc, and for the things we've tested it with, you know, it worked well.
A
Yeah, Corey, you made a good point. I had mentioned Eric's work to scan for orphans, but these tools that you're writing would look for these very specific issues, and I guess that is a case where it would be useful to run it without the fix, just to flag that there are issues.
H
Yeah, and I'd say we've been using it without the fix intermittently too, because we have customer supports, or whatever, where they're saying their bucket listing is slow, and knowing that that can be a side effect of these particular issues, that's just been our initial check: are they affected by this same issue, kind of thing. So I think there is value in that from the user standpoint as well, just to identify whether that's a problem somebody's experiencing.
A
All right, and I will follow up on review, hopefully today, but certainly this week. Any other thoughts on this?
D
So we discussed this a couple of weeks ago, about the notifications. There are multiple options for improving them; you opened that tracker, and there are a bunch of options we wanted to try, in terms of defining a retry time, and converting from using the existing rados object to a FIFO object.
D
We haven't got any solid consensus on the retries or the time limit, but at least in terms of converting to using FIFO, what I'm saying is: can we start working on that first part, converting from the rados object to the FIFO thing? That's kind of a clear-cut requirement. We have use cases where we have seen these issues, where the rados object is getting full when the broker is down.
A
So I think an important part of the design was having a way to push back against clients if we can't keep queuing up all of the notifications. I do think that the existing cls queue limitation is too small, so it makes sense to go to cls FIFO, but I think we could also use some way to limit how big the FIFO can grow, definitely.
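A sketch of what that pushback could look like at enqueue time; the size accounting and the ENOSPC choice are illustrative assumptions, not the actual RGW or cls FIFO behavior:

```cpp
#include <cstdint>
#include <system_error>

struct BoundedFifo {
  uint64_t size = 0;              // bytes currently queued
  uint64_t max_size = 128 << 20;  // hypothetical per-queue cap

  std::error_code enqueue(uint64_t entry_size) {
    if (size + entry_size > max_size) {
      // Push back: return an error the frontend can surface to the
      // client instead of letting the queue grow without bound.
      return std::make_error_code(std::errc::no_space_on_device);
    }
    size += entry_size;           // accept the entry
    return {};
  }
};
```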
B
I think we've learned, you know, the reasons we needed FIFO, and I think we may well want to apply it in more places. I think we have reason to believe that it's got to be faster, and you should be able to get parallel enqueue and dequeue, which you maybe can't get from cls queue.
D
I'm also hoping that if we get consensus on the other two as well, the size would probably become irrelevant. Because if we have some way of defining the number of retries, or the time an entry needs to stay there before it gets trimmed or removed from the queue, then enforcing that strictness on the FIFO size probably becomes irrelevant, if we decide to go with the other two.
A
So, I'm just remembering the design process for persistent notifications specifically, but I thought that it was a requirement that they be reliable, so I don't think that a max-retry or time limit is what we want there. And the other requirement was that there was some push back to clients, preventing the queue from growing without bound.
D
This, of course, can still be configurable: by default we can say there are no retries, but then we could have a default configuration variable, while creating a topic, which would decide the number of retries. Of course, there could be a scenario where it's completely down and an entry just stays there forever, so wouldn't it make sense to just go and trim them, delete them or something, I mean, today?
D
If you look at the AWS documentation, it's not clearly mentioned, but it says that they do their best effort on trying to send the notification; it doesn't say it's guaranteed to be delivered no matter what, if you're trying to be compliant with the AWS notifications here.
D
I mean, yeah, today, while creating a topic, there is no way we verify whether the broker is online or anything; create topic does not verify that. It's only while sending a notification that the connection is established. So I'm kind of coming to a point where there could be a scenario where a client is writing but their broker is down, and we need to verify that.
D
I mean, we could discuss all those details. At least converting from the queue to FIFO seems logical. If we come to a hard-coded size that we need for that FIFO object, we can at least start work on that. Does that make sense?
A
I do think so, yeah. The existing implementation is kind of broken up into two steps; there's a thing…
D
So can we take up this tracker and try to work on at least the first part, converting from the queue to FIFO? Then during this journey we can decide if, and how, we want to enforce the size limits.
A
Adam, you did a ton of work on FIFO. Can you think of any easy way to track something like the total size? I think we do have a head object somewhere, but…
E
I mean, we could do something like that. We'd basically just have to do another rendezvous and update the head object every time we did.
A
I wonder, so there's already a kind of transaction on that each time that we add a new tail object; I wonder…
D
Sure, all right, that'll be the tracker, and yeah, assign it to us, assign it to me, and then we'll start work on the first part at least.
D
Another thing, about the testing for the data sync fairness: based on the last comment, it says there are still some issues going on. Is it good enough to take the current version and do the scale testing?
D
There were a few crashes before; I see they've been fixed, but there are still some issues seen. Is it good to go for scale testing, is what I was asking, so that we can just take this up as well.
G
There are issues when we go over a few generations and then we perform data sync in it, so I'm still trying to iron out some cases like that, and also some corner cases that might still be lurking. But I think we should start doing some scale testing there, and some early feedback would help.