From YouTube: Ceph RGW Refactoring Meeting 2022-11-23
Description
Join us every Wednesday for the Ceph RGW Refactoring meeting: https://ceph.io/en/community/meetups
Ceph website: https://ceph.io
Ceph blog: https://ceph.io/en/news/blog/
Contribute to Ceph: https://ceph.io/en/developers/contrib...
What is Ceph: https://ceph.io/en/discover/
B: One of them is that they're not restricted to Lua. They have all kinds of language support; they're just manipulating object GETs, which makes sense, because this is what you have to manipulate inline. You can also do it offline, but the one difference that I've seen there, and this could kind of make me think about it.
B: So if it's just, you know, switching some metadata fields, then it's probably not too bad. But if it comes to actually reading the object, or reading chunks from the object and doing some heavy processing on them, then it's kind of: okay, we don't want those scripts to hog all the CPU of the RGW and affect all the rest of the people using the RGW to upload and download objects.
B: We don't want to slow them down; scaling would be a problem, and estimating the capacity of the system would be a problem. What they did in AWS is that if you want to fetch the object and you want to run the Lambda function against it, then you use a different URL.
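The two-URL split described here can be sketched with a toy in-memory store. This is an illustration of the routing idea only, not the AWS API; a real setup would go through an S3 Object Lambda access point, and all names below are invented:

```python
# Toy sketch (not the AWS API): the routing idea behind S3 Object Lambda.
# A plain URL returns the raw bytes; a second, separate URL runs a
# transform on the bytes before they reach the caller.

store = {"logs/app.txt": b"hello WORLD"}  # stand-in for a bucket

def get_object(key):
    """Plain endpoint: returns the stored bytes untouched."""
    return store[key]

def get_object_via_lambda(key, transform):
    """'Lambda' endpoint: same object, but a function runs on the bytes
    before they are returned. Because it is a separate entry point, the
    plain data path above is never slowed down by it."""
    return transform(store[key])

raw = get_object("logs/app.txt")                          # b"hello WORLD"
shouty = get_object_via_lambda("logs/app.txt", bytes.upper)
```

Keeping the transformed fetch behind its own entry point is what lets the provider scale and account for the two paths independently, which is the concern raised above.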
C: This is a whole line of development there, and it's been there a while. We've been in conversations, been in meetings with people about different things: serverless functions, and supporting them from other stores or with object stores. It's a whole area where you proliferate an entire suite of software to run such serverless functions, if that's the case, and all of those would be out of process.
C: Well, there have also been conversations about having a converged one, and I think I can say here that we had meetings with Ronen's team, and Gal was present, so I was a little confused why he doesn't remember it. If you remember, maybe you were looped into some of that too. One of the things that Ronen's team did was develop the prototype of what IBM calls Fabric, and Fabric is its own sort of technology.
C: They've been doing a bunch of things with it, but that's fine, because the core intended functionality of Fabric was that it would be a gateway with an interface to a variety of back-end systems. But I think the consumer interface for it actually uses S3 in some way, and you can insert it into various workflow pipelines or application data pipelines.
C: Data passes through it, and it enforces policy-based security of various kinds on the data, or on the access. That could include restricting how much of the data can be seen: which columns in columnar data sets can be seen.
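A minimal sketch of that column-restriction idea, assuming a simple role-to-columns policy. The policy format and all names here are invented for illustration; Fabric's actual mechanism is not shown:

```python
# Sketch of column-level, policy-based filtering at a gateway: each role
# may only see an allowed subset of columns in a columnar record.

POLICY = {
    "analyst": {"order_id", "amount"},                       # no PII
    "admin":   {"order_id", "amount", "customer_email"},
}

def filter_columns(rows, role):
    """Return rows with only the columns the role's policy allows."""
    allowed = POLICY[role]
    return [{k: v for k, v in row.items() if k in allowed} for row in rows]

rows = [{"order_id": 1, "amount": 9.5, "customer_email": "a@example.com"}]
analyst_view = filter_columns(rows, "analyst")   # email column stripped
```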
C: It can anonymize or transform things for the process asking for them. They wrote all this, and we talked to them about whether implementing an object Lambda using a technology like this would be interesting. One of the reasons is that part of Fabric was a wasm runtime that they could accelerate, and they had already written all of this.
C: All these transforms to allow wasm32 code to interact with 64-bit data, and other stuff. But then there's a whole bunch of things out there, right, that I haven't really been able to explore; there is a whole suite of software that does serverless functions as an infrastructure component, which you could link to S3. So these are all things that could be done.
C: Right, so there's a question of whether each of these wants its own system. But I think if we start combining it with RGW, and it shouldn't be instead of RGW, then I think those premises fight with each other. But I think if you then say: hey, community version, there's a lot of off-the-shelf stuff.
B: Okay, I mean, my kind of question was: the whole idea here is about inline processing, right? Because if this is offline processing, then we have a...
C: If it's inline, then I would say there are interesting projects there, but I wouldn't be focusing on how to get it outside and inside at the same time. I'd be focusing on how to get it inline, but very efficient and safe. So sandboxing and other ideas come up at that point; hence maybe the wasm runtime stuff is interesting.
B: Yes, yes; sure, we don't know what they do, but I'm saying I think the idea is that if you just want to fetch the object, you use one URL, and if you want to fetch the object so that it goes through something, you use another URL. Maybe this is not the gateway; maybe this is something that's done before that.
C: Yeah, the flexibility. So they can do that, and they probably are doing it out of process, but we wouldn't have to make that choice, right?
A: So I guess I'm curious what the advantages are of doing it inline, instead of as some other client that just issues a GET request to S3 and processes the data that it gets back.
B: Well, when you do a GET, then you have to do that inline, whether it's in the RGW or in some other layer, but it has to be inline. When you do a PUT, you can later on kick off something that will do the processing, because you just uploaded something, and it doesn't really matter whether it becomes the right thing immediately or later on. When you do a GET, you have to get it right then. So...
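That asymmetry (a PUT can defer its processing, a GET cannot) can be sketched with a small in-memory store. This is a minimal illustration with invented names, not RGW code:

```python
# Sketch: PUT-time processing can be queued and run later by a worker,
# but GET-time processing must happen inline, on the request path.

from collections import deque

store, pending = {}, deque()

def put(key, data):
    store[key] = data
    pending.append(key)          # defer processing; the caller returns now

def process_pending(transform):
    """E.g. run from a background worker, any time after the PUT."""
    while pending:
        key = pending.popleft()
        store[key] = transform(store[key])

def get(key, transform=None):
    data = store[key]
    # No "later" here: whatever the caller must see has to be computed now.
    return transform(data) if transform else data
```

After `put("k", b"abc")`, a later `process_pending(bytes.upper)` can fix up the stored copy at leisure, while `get("k", bytes.upper)` pays the transform cost on the request path every time.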
C: Another thing, and I don't know if this is true of Object Lambda per se (I don't know everything about Object Lambda), but another thing that I remember from when NooBaa came along: one of the things NooBaa said it could do, and had examples for, was that you could add to the processing pipeline. This is more like the Lua integration now, in some ways, and some things you could do with it. Like, they had an example, I don't know...
C: I don't know whether it's an exemplar of good practice, but one of the demos they had was credit card anonymization, data anonymization, which is something that, yeah, could be inline and permanent on a PUT; other ideas like this. So there are concrete reasons why you would. You know, who knows what their Object Lambda does, but there are...
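A minimal sketch of that kind of anonymization transform, assuming 16-digit card numbers with optional spaces or hyphens. This is illustrative only, not NooBaa's or AWS's actual implementation:

```python
# Sketch: mask credit-card-like digit runs in an object's bytes (e.g. on
# PUT, so the stored copy is permanently anonymized), keeping only the
# last four digits.

import re

# 16 digits, each of the first 12 optionally followed by a space or hyphen
PAN = re.compile(rb"\b(?:\d[ -]?){12}(\d{4})\b")

def anonymize(data: bytes) -> bytes:
    return PAN.sub(rb"****-****-****-\1", data)

masked = anonymize(b"card 4111 1111 1111 1234 charged")
```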
C: There are hypotheticals where you would do something that maybe isn't the same as most client processing, but where you do want to sort of converge the processing together with the operation, either for performance reasons or, I guess, for non-repudiation: by the time the data is at rest, it's where you want it. Okay.
B: Right, right. Even if you're limited to the GET, to Casey's point, then sure, the client can do everything with the data. But sometimes the owner of the data is not the client that gets the data, right? So it could be that the owner of the data wants you to get something that goes through some filters or processing, and they don't want to count on the implementation of the application or the client to actually do that. I mean, this is what the Object Lambda is for.
C: I think the argument for doing it inside of RGW is not as strong. At least, the bigger picture is that so much of this sort of area would be about building a resilient ecosystem that developers can use to organize the different codes and outputs and credentials and all that stuff, and that's a whole different sort of area of activity, and it can always just access S3.
C: It certainly seems to me like you'd have a hard time winning a race with well-developed infrastructure that does that; for example, with the wasm ecosystem you've got multi-language runtimes and all those things. So probably that would be the way. But it's possible that the wasm work by Ronen's group is useful, and it's possible that people are doing similar things as part of other serverless function suites, and all of those should be able to consume us via S3.
B: Yeah, yeah, I mean, sure, you can put that in front of the RGW and do whatever processing, and if it just speaks S3 at the other end, then it does the same thing. Yeah.
B: I think this is why AWS separated that: so, you know, they'll have better control over how things need to scale. Yeah.
C: I feel differently about the data integration piece, like the Arrow Flight stuff, and I've been pushing for that to be there, I think, for routing pipelines and things like that. The kind of purpose of that is that there's a likely advantage to being close to the storage; at least that's kind of our pitch. But yeah, I'm sort of thinking that the best way to write these things is as some sort of elastic facility that's not part of the S3 service.
B: Yeah, I agree, that makes sense. I mean, what we've seen in Ceph is that doing those small manipulations on the metadata is useful inline.
C: Sorry. Well, I think the idea is that the Spark connectors are evolving interfaces that we'll just implement. So Eric has been working on this; Eric's prototyping is aimed at getting the basic Arrow Flight interface there, which is relatively narrow. It's super generic, what he's doing right now, but the Spark communities are already layering stuff on top of Flight.
C: That is more interesting. Like, there's something called Flight SQL, which has capabilities like a superset of our S3 Select in some ways. And the idea would be that the generic optimization of Flight SQL, integrated into the Spark connector, or into the Spark Catalyst with the connector enabled, would let us just be a first-class provider for data sets, or they could send pushdowns similar to how S3 Select does, and maybe other stuff after that.
C: There's a lot to it, but this is sort of the base level, where we show: okay, we've got an S3, we've got an Arrow Flight there, you can do some simple operations, and we can admit some conventions there. But I think the gold we're after is above that, and it's probably in the Flight SQL space, or some other sub-protocols that end up getting layered on. But there should be free money here.
B: By the way, if we are going to develop the skills to write stuff in the Catalyst, this could be useful for the S3 Select offering as well. Because, at least from what I know from Gal, there are some limitations: some things that we support are not supported by AWS, and the Catalyst implementation is very, very prudent whenever it is not 100% sure that the server is going to ingest it.
B: So that could be very useful. I mean, this would be good because it would demonstrate the real power of S3 Select, because currently, because the client is kind of cautious, quite often they're not pushing down the query; they're just doing everything themselves.
C: Does this apply mostly to the... this is actually, I mean, I don't know which things Gal knows. I mean, we spent a long time sort of fighting with this. In addition to S3 Select... well, S3 Select hasn't seemed to be a big theme in Spark, though for cold data it could always show up there.
C: They have S3 there, you know, but for S3 Select, there are some applications that are starting to use it; it took a long time to get all the way there. One is Trino / PrestoDB.
C: So Gal did work to get PrestoDB to work, although, just as you said, their optimizer isn't Catalyst; but it also tended not to be using us when he expected it to. And then we tried to use Trino, because we had people testing with that, and Trino didn't even have S3 Select working. We've since read a couple of white papers that suggest that some parts of Trino now have it.
C: So each of these application ecosystems has some notion of an optimizer that, as you say, has to decide whether to push down something and whether it would be useful; what Spark calls its Catalyst. So yeah, each of these things has to have somewhere where it's deducing whatever it should, and I don't think this is a well-evolved area in S3 Select, really, at this point. But the problem there might be simpler. I think it's a very important topic for the hot integration over on the Spark side, though: it's going to need some sort of way of inferring when it could be a good idea, and it's going to have to have information from us.
C: This prefetching, you know, this was work that was done by the students at Mass Open Cloud. They have this engine that they moved into Spark, and it included a couple of other things too; one was called Pig, but Spark was one. When Spark queries, or other jobs in Spark, are set up, Spark creates a DAG, a graph description of the work, and at the point where it does that and inserts it into its collective work plans, this glue stole the DAG and sent it over...
C: ...to this prefetching engine, and then that engine would look at all the DAGs that it had seen recently, do a critical-path analysis on them, and decide which objects were going to be needed at what point, and it would use that to prefetch stuff into the cache, and it could use it to discard stuff we weren't going to look at anymore.
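The DAG-driven prefetch idea described here can be sketched with Python's standard graphlib. The job structure and all names are invented, and this shows only the ordering step that a cache would consume, not the cache itself:

```python
# Sketch: given a DAG of job stages, each reading some objects, walk the
# stages in topological (execution) order and emit the objects in the
# order they will be needed, so a cache can prefetch ahead of the job.

from graphlib import TopologicalSorter

def prefetch_order(deps, reads):
    """deps:  stage -> set of stages it depends on
       reads: stage -> list of objects that stage reads"""
    order, seen = [], set()
    for stage in TopologicalSorter(deps).static_order():
        for obj in reads.get(stage, []):
            if obj not in seen:          # each object fetched once
                seen.add(obj)
                order.append(obj)
    return order

deps = {"scan": set(), "join": {"scan"}, "agg": {"join"}}
reads = {"scan": ["s3://b/part1", "s3://b/part2"], "join": ["s3://b/dim"]}
order = prefetch_order(deps, reads)
```

A critical-path analysis, as described in the talk, would additionally weight stages by estimated runtime to decide how far ahead to fetch; the topological order above is the simplest version of "which objects are needed at what point".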