From YouTube: Ceph RGW Refactoring Meeting 2022-10-19
Description
Join us every Wednesday for the Ceph RGW Refactoring meeting: https://ceph.io/en/community/meetups
Ceph website: https://ceph.io
Ceph blog: https://ceph.io/en/news/blog/
Contribute to Ceph: https://ceph.io/en/developers/contribute/
What is Ceph: https://ceph.io/en/discover/
B
Okay, yeah, so I want to present a small project that I did with some students. It was part of a university course, pretty small, but I think really nice.
B
So the project targets the IoT/industrial space, where MQTT is the main protocol. Some of this information is needed for the short-term management of data, but some of it needs to be stored for later: for data scientists, for machine-learning training, or for anything else that needs long-term access to the data. S3 sounds like a good option for those kinds of use cases.
B
The problem is that MQTT is really designed for very small messages. Messages in MQTT come from, you know, probes and sensors measuring voltage, temperature, all kinds of things like that, and they usually have a very small payload. It's a very efficient protocol with very low overhead, and if we just naively take those messages and put them as objects in S3, that becomes an extremely inefficient process.
B
So the idea was to write a converter that aggregates those MQTT messages into objects and, according to some configuration policies, stores them as S3 objects.
B
So this was the project, and I have a quick demo showing how it works. Let's first look at the configuration of this converter. In the configuration you have a couple of things. You need to define the MQTT broker, and here the Mosquitto project has a very nice service: whoever wants to try it out doesn't need to install a Mosquitto broker on their laptop or their machine.
B
You can just use the one that they open up in the cloud for access. Of course, don't send anything sensitive there, because everybody can see it. So this is one leg of the converter; the other leg is the S3 endpoint.
B
The students actually didn't want to compile Ceph, so they tested it with AWS S3, whereas I compile Ceph every now and then, so I just tested it with Ceph and the RADOS Gateway. Then there's the aggregation logic: it could be based on the number of messages, the size in bytes, or time.
B
I,
don't
know,
there's
a
couple
of
conditions
there
that
you
can,
you
can
put
together
to
to
create
how
those
entity
messages
are
being
aggregated
and-
and
the
last
piece
here
is
the
mapping
between
buckets
and
topics.
So
here
my
topic,
one
in
one,
my
topic-
2
both
go
to
my
bucket.
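The pieces of configuration described above (broker, S3 endpoint, aggregation conditions, topic-to-bucket mapping) can be sketched as a small structure. All field names here are illustrative assumptions, not the project's actual schema:

```python
# Hypothetical sketch of the converter's configuration. Every key name here is
# an assumption for illustration; the real project may use a different layout.
config = {
    # Leg 1: the MQTT broker. test.mosquitto.org is Mosquitto's public test broker.
    "mqtt_broker": {"host": "test.mosquitto.org", "port": 1883},
    # Leg 2: the S3 endpoint (a local RADOS Gateway, for example).
    "s3_endpoint": {"url": "http://localhost:8000"},
    # Aggregation policy: flush an object when any condition is met.
    "aggregation": {"max_messages": 5, "max_bytes": 65536, "max_seconds": 60},
    # Mapping from MQTT topics to S3 buckets; several topics may share a bucket.
    "topic_to_bucket": {"my-topic-1": "my-bucket", "my-topic-2": "my-bucket"},
}

def bucket_for_topic(topic: str) -> str:
    """Resolve which bucket an incoming message should be aggregated into."""
    return config["topic_to_bucket"][topic]
```

With this shape, both demo topics resolve to the same bucket, matching the mapping described in the talk.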
B
These are the configurations that are needed. As a side note: because we want to later containerize everything and make sure it runs nicely in Kubernetes, the broker and the S3 endpoint are probably going to be removed from the config file; they're going to be extracted automatically from the Kubernetes configuration, i.e. from their operators. There's a Mosquitto operator and, of course, there's Rook for the Ceph operator, so that doesn't need to be configured in an OpenShift or Kubernetes environment.
B
Now I'm going to use the Mosquitto project's client that can do publishing. I'm going to publish to the test Mosquitto server on the web, to a topic called "my-topic-1", because this is one of the topics that my converter is subscribed to, and I'm going to send a "hello world" message. I also have to remember to create the bucket, because the bridge doesn't create the bucket, so I have to create a bucket called "my-bucket".
B
Again, this is what the configuration said. And then I'm going to send the messages. You can see that it received the message, because it's subscribed to this topic, but it doesn't do anything with it, because the configuration says that you need to get five messages before they're aggregated into one object. So, after five messages, it created the object and wrote the object into the bucket.
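The flush-after-five behavior in the demo can be sketched as a minimal buffering loop: payloads accumulate until the message-count condition is met, then one object is written. This is a sketch, not the project's code; the S3 write is stubbed out and all names are illustrative:

```python
# Minimal sketch of the demo's aggregation behavior: buffer incoming MQTT
# payloads and write a single S3 object once five messages have accumulated.
class Aggregator:
    def __init__(self, max_messages=5, write_object=None):
        self.max_messages = max_messages
        # Stand-in for the S3 put; a real bridge would upload to the bucket here.
        self.write_object = write_object or (lambda data: None)
        self.buffer = []
        self.objects_written = 0

    def on_message(self, payload: bytes) -> None:
        self.buffer.append(payload)
        if len(self.buffer) >= self.max_messages:
            # Concatenate the buffered messages into one object and flush.
            self.write_object(b"".join(self.buffer))
            self.buffer.clear()
            self.objects_written += 1
```

Feeding it four "hello world" messages writes nothing; the fifth triggers exactly one object, matching what the demo shows.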
B
Another small thing that was also part of this project is the "opener" (not the best name). We do need some kind of logic that helps whoever wants to read those objects to extract them back into the original messages, because the person who writes the application that does the analysis, or presents it in a graph or something, doesn't care that we put everything in one object. That's just for our efficiency.
B
They
want
to
see
the
original
messages
so,
as
part
of
the
project,
we've
written,
like
a
small
python
code
that,
based
on
the
way
that
we
we've
written
the
object,
do
we
do
the
extraction?
The
the
thing
that
we
do
is
that
we
create
a
small
header
in
the
object
and
we
put
the
offsets
of
the
different
messages
together
with
their
timestamps
in
this
header.
So
you
know
with
five
messages.
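The header scheme described above can be sketched as follows. The exact layout (a length-prefixed JSON header of offsets and timestamps, followed by the concatenated payloads) is an assumption for illustration; the project's real format may differ:

```python
import json
import struct

def pack(messages):
    """Pack (timestamp, payload-bytes) pairs into one object body with a
    small header recording each message's offset, length, and timestamp."""
    entries = []
    body = b""
    offset = 0
    for ts, payload in messages:
        entries.append({"timestamp": ts, "offset": offset, "length": len(payload)})
        body += payload
        offset += len(payload)
    header = json.dumps(entries).encode()
    # 4-byte big-endian header length, then the header, then the payloads.
    return struct.pack(">I", len(header)) + header + body

def unpack(blob):
    """Invert pack(): recover the original (timestamp, payload) pairs."""
    (hlen,) = struct.unpack(">I", blob[:4])
    entries = json.loads(blob[4:4 + hlen])
    base = 4 + hlen
    return [(e["timestamp"], blob[base + e["offset"]: base + e["offset"] + e["length"]])
            for e in entries]
```

A round trip through pack/unpack returns exactly the original messages, which is the property the "opener" needs.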
B
It's
really
trivial,
but
I
mean
let's
say
that
for
efficiency
results,
we
want
to
create
very
large
objects
with
thousands
of
messages,
and
the
reader
doesn't
want
to
read
the
entire
object.
Then
they
can
use
the
they
can
read
the
header
of
the
object
and
figure
out
what
offsets
they're
interested
at
and
then
use
the
range
command
to
fetch
only
part
of
the
object
to
get
to
be
more
efficient
in
what
they're
looking
for
yeah.
This
is
pretty
much
it
in
a
nutshell,
this
small
small
student
project,
if
you
have
questions.
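The ranged-read pattern just described can be sketched end to end: fetch only the header, pick the message you want, then fetch just that byte range, as an S3 GET with a `Range` header would. The object layout below (length-prefixed JSON header, then concatenated payloads) is an assumed format for illustration:

```python
import json
import struct

def build_object(payloads):
    """Pack payloads with a header of {offset, length} entries."""
    entries, body, off = [], b"", 0
    for p in payloads:
        entries.append({"offset": off, "length": len(p)})
        body += p
        off += len(p)
    header = json.dumps(entries).encode()
    return struct.pack(">I", len(header)) + header + body

def ranged_get(blob, start, end):
    """Stand-in for an S3 ranged GET: bytes [start, end] inclusive,
    like an HTTP 'Range: bytes=start-end' request."""
    return blob[start:end + 1]

def read_message(blob, index):
    """Read one message without fetching the rest of the object body."""
    (hlen,) = struct.unpack(">I", ranged_get(blob, 0, 3))    # header length
    entries = json.loads(ranged_get(blob, 4, 4 + hlen - 1))  # header only
    e = entries[index]
    base = 4 + hlen
    return ranged_get(blob, base + e["offset"],
                      base + e["offset"] + e["length"] - 1)
```

Against a real S3 endpoint, `ranged_get` would become a GetObject call with a `Range` header, so only the header plus the selected message travel over the wire.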
A
Cool. The subject of how to do bulk uploads of small objects, or figure out a way to pack them efficiently, has come up a couple of other times, but I think this is interesting in that it's kind of an application-layer thing that's doing the packing.
B
Maybe with other, more general packing you would need some other kind of logic, so I think it does make sense to do it here. The original thinking was not to build our own small header in the object, but to use the actual head-object attributes. But I don't think we had enough space there in the attributes to put enough information, assuming we want to be able to have a long list of offsets of all the small messages in the object.
C
It appears to me that this is somewhat similar to multipart upload, which is what I've been playing with in the last two weeks. That has a dedicated meta file, which records which objects are there, similar to what you said about where the offset should be for each message, and it also has a list of objects for each chunk being uploaded, which may correspond to the aggregated object.
C
You were talking about grouping multiple messages into a single object, and in such a structure you achieve basically one layer of hierarchy, from the metadata, from the manifest, to the actual objects, from which you can retrieve the individual messages.
C
Of course, many improvements and variants can be derived from there, but I find some similarity between those two.
B
Yeah, I think multipart, though, is kind of built into the fact that you can't handle, or don't want to handle, one huge object, and it's pretty generic. Later on you kind of forget about all the parts in the multipart.
A
Yeah, and in the Swift API there's kind of explicit support for manifests of smaller pieces of objects.
D
So, you know, the init would define the structure of the data, then each of the puts would fill in that structure, and then you could list these things, get the header, which would have the structure defined in it, and then read out the structured data from that. There might be a generic way of doing effectively this kind of thing.
D
Yeah, I mean, I guess you'd probably have to tag the parts with a header that describes which of the pieces of the structure it fills, I guess, or something like that.
E
Hey, thanks for having us. I'll admit I have some experience with Ceph, but not a tremendous amount, so take anything I say with a grain of salt. But I was sort of helping to facilitate some contact between the Ceph team and the folks at Akamai/Linode, so obviously we've been looking at a possible mitigation for a problem.
E
I'm assuming everybody is probably familiar with it. In the case of Linode, probably the biggest challenge we have related to multipart orphan objects is that our accounting for billing purposes is incorrect. I think in general there are a few different scenarios whereby clients inadvertently upload multipart parts multiple times; I think in many cases it's buggy scripts or that sort of thing.
E
So for some time we've been trying to figure out how to deal with both culling the redundant parts in a time- and space-effective way, and also trying to implement some sort of fix.
E
That would prevent the issue from occurring. But one of the things we're very sensitive to: we have sort of a long history of taking third-party, off-the-shelf software and patching it in ways that are difficult or impossible to upstream later. So, to try to avoid falling back into that trap again, our aim here was to be able to contribute back a fix that would be generally usable, and we are sort of in the hacking phase on that now.
E
But I think when I reached out to Daniel, the real thing I was looking for was some guidance on what sort of fix might be upstreamable. And then we later learned in the email thread that there was a parallel effort underway. I haven't seen that code yet, so I don't know much about it. But that's my introductory spiel; I'm not sure where to go from there.
A
This is not about incomplete multipart uploads that just never get cleaned up, right? Okay, yeah. So this does sound like what Matt is working on in that PR, but I think it just got complicated and hasn't been finished. So I pinged him, but he has a conflict at the moment.
E
All right. So it seems like maybe a good next step would be to see if he and you can compare notes, because I think it would be helpful for us to know whether the strategy that we're thinking of has potentially been ruled out already.
E
Or, you know, if it's similar enough to what Matt has been working on, we could at least assist with the testing.
A
So, yeah, I think I can ask Matt on the PR itself to just give a high-level description of his design, and maybe you guys can take it from there.
C
Actually, yeah, I have a GitHub account. It's Iman77, the Y-I-M-A-77.
B
No, no, nothing too important. Just maybe a small note: we recently added the ability to nicely trace multipart uploads. So if the different parts go to different RGWs and so on, we added tracing abilities using Jaeger, and there's a specific mechanism there that is geared toward multipart, because this is usually a problem for many people.
B
So
instead
of
digging
lots
of
lots
of
log
files
on
multiple
rgw's,
you
can
see
the
the
nice
tracing.
Okay,
all
the
parts
and
maybe
it'd
be
easy
for
you
to
see.
Okay,
the
same
part
twice
or
a
cancel
or
something
that
didn't
finish
with
it
so
forth.
Oh
awesome,
yeah.
E
That would be nice. The question I had: I knew, sort of based on the name of this call and then doing a little bit of reading around, that at the head of the repo right now it looks like RADOS GW is getting some refactoring related to pluggable backends. So if we were to devise a fix, I think we'd be targeting like 16.x now. Does that code look radically different following the refactoring, or has the refactoring already occurred, or...?
D
That,
but
it's
it's
now
just
the
general
Upstream
planning
and
discussion
meeting
for
rgw,
okay,
stuff
stuff
is
generally
not
difficult
to
backboard.
Okay,
there
are,
there
are
some
specific
areas
in
which
it
will
be
more
difficult
than
others,
but
this
kind
of
thing
probably
will
not
be
particularly
difficult.
All
right.
E
I don't think I have any more specific questions, but, you know, if anybody at Akamai does, feel free.
C
I'm good. Having a PR is wonderfully helpful; I'll look into that.