http://goo.gl/U4b70r
28 October 2014
Ceph Developer Summit: Hammer
Day 1
RGW: Object Versioning (Mike Bryant)
RGW: Snapshots (Craig Lewis)
C: Yeah, with regard to object versioning, I've been implementing it for a few months now. There's been some discussion internally and externally, and there is a wiki page that describes the design we came up with. Originally, when we looked at the feature, there were two different approaches that had previously been implemented: one is the S3 object versioning and the other is the Swift object versioning, and the two are quite distinct. Historically, when we implemented a feature, we made it pretty much agnostic to the actual RESTful API.
C: So there was the internal implementation, the core functionality, and then there was the RESTful API on top. But for this specific feature, the Swift API and the S3 API are so different that we just chose one; we went with the S3 one, and that's what I've been implementing ever since. There were a few requirements we needed to take care of, because the S3 API is quite different from everything we'd been doing up until now. So, first of all:
C
You
need
to
obviously
be
able
to
read
a
specific
object
version
and
remove
a
specific
object
version
now.
One
more
thing
that
we
want
to
avoid
is
to
to
need
to
be
required
to
access
the
bucket
index
for
each
object
trade.
We,
because
if
we
do
that,
then
the
packet
index
becomes
a
bottleneck
which
we
wanted
to
avoid,
and
now
there
is
the
the
entire
f3
API.
C: When you delete an object, it isn't really deleted; an object deletion marker is created instead. And you can suspend versioning on a bucket; once you do that, the versions still exist, but new objects are created with what they call a null version, and if you remove one, a delete marker is created. So all of that had to be supported. So we discussed this.
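The delete-marker and suspended-versioning behavior described here can be sketched as a small in-memory model. This is illustrative only, not RGW code; the "null" version id follows the S3 convention, and everything else (class and method names) is made up:

```python
import itertools

class VersionedBucket:
    """Toy model of the S3 bucket-versioning semantics described above."""
    def __init__(self):
        self.versions = {}   # name -> [(version_id, data, is_delete_marker)], newest first
        self.state = "off"   # "off" | "enabled" | "suspended"
        self._ids = itertools.count(1)

    def _next_vid(self):
        # Under suspended (or never-enabled) versioning, writes land on the
        # single "null" version instead of getting a fresh id.
        return "v%d" % next(self._ids) if self.state == "enabled" else "null"

    def put(self, name, data):
        vid = self._next_vid()
        vers = self.versions.get(name, [])
        # a null-version write overwrites the previous null version in place
        self.versions[name] = [(vid, data, False)] + [v for v in vers if v[0] != vid]
        return vid

    def delete(self, name):
        # A DELETE never erases data outright: it prepends a delete marker.
        vid = self._next_vid()
        vers = self.versions.get(name, [])
        self.versions[name] = [(vid, None, True)] + [v for v in vers if v[0] != vid]
        return vid

    def get(self, name, version_id=None):
        # Reading a specific version bypasses newer delete markers.
        for vid, data, marker in self.versions.get(name, []):
            if version_id in (None, vid):
                if marker:
                    raise KeyError("NoSuchKey (delete marker)")
                return data
        raise KeyError("NoSuchKey")

b = VersionedBucket()
b.state = "enabled"
v1 = b.put("doc", "first")
v2 = b.put("doc", "second")
b.delete("doc")                        # adds a delete marker; old versions survive
assert b.get("doc", v1) == "first"
b.state = "suspended"
b.put("doc", "third")                  # written as the "null" version
assert b.get("doc", "null") == "third"
assert b.get("doc", v2) == "second"    # versions still exist after suspending
```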
C
Didn't
fit
our
current
design
because
we
usually
don't
have
a
cross
object
interaction,
because,
basically,
what
you
need
to
do
is
have
kind
of
a
pointer
to
another
object,
and
if
you
remove
an
object,
you
need
to
change
the
pointer
and
to
to
set
it
on
a
different
object.
But
what
what
happens
if
one
of
the
the
the
operation
failed?
What?
While
we
were
doing
it,
how
to
handle
it?
And
what,
if
you
have
multiple
operations
coming
in
parallel
to
the
same
object
of?
C
How
would
the
architecture
work
work?
With
this?
We
ended
up
coming
up
with
a
solution
that
in
which
we
have
the
bucket
index,
serves
as
as
the
decision
maker.
So
whenever
you
create
create
a
versioned
object,
you
first
go
to
the
packet
index
and
and
basically
say:
ok
I'm
now
create
creating
this
object,
and,
and
the
bucket
index
will
make
sure
that
everything
is
happening
at
the
correct
or
order
and
provided
a
log
board
for
the
Raiders
gateway
to
to
to
to
reclaim
of
the
play.
C
So
it's
so
very
quickly
go
to
to
the
object
this
now
we
call
the
olh
the
object,
logical
hand,
which
is
kind
of
like
soft
link
and
say
ok,
I'm
about
to
check
to
change,
to
make
a
change
to
the
to
this
subject,
and
then
we
go
to
the
back
in
the
mix
say:
okay,
now
we
create
a
new
new
version
of
the
the
object
and
and
and
the
back
of
the
neck
would
say:
okay
go
to
the
o,
LH
and
now
weird
it's
at
this
Pacific
version
make
sure
the
we
are
at
this
specific
version
before
changing
anything
and
and
and
update
the
the
LH
and
32.8
are
at
at
the
object
version
that
was
just
created
and
basically
that
how
it
works
so
create.
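The flow just described can be sketched roughly as follows. This is a toy model, not RGW code: the epoch counter stands in for the bucket index's ordering decision, and all names here are assumptions:

```python
import itertools

class BucketIndex:
    """Toy decision-maker: assigns a total order (epoch) to OLH operations
    and keeps a replayable log, as in the design described above."""
    def __init__(self):
        self._epoch = itertools.count(1)
        self.log = []                       # (epoch, key, op, version_id)

    def prepare_olh_op(self, key, op, version_id):
        epoch = next(self._epoch)           # the index decides the order
        self.log.append((epoch, key, op, version_id))
        return epoch

class OLH:
    """Object logical head: a soft-link-like head pointing at the current
    version, with pending entries for in-flight operations."""
    def __init__(self):
        self.current = None
        self.pending = {}                   # epoch -> (op, version_id)

    def mark_pending(self, epoch, op, version_id):
        self.pending[epoch] = (op, version_id)

    def apply(self, epoch):
        op, vid = self.pending.pop(epoch)
        if op == "link":
            self.current = vid              # point the head at the new version

def put_version(index, olh, key, version_id):
    # 1. tell the bucket index we're creating this version; it picks the order
    epoch = index.prepare_olh_op(key, "link", version_id)
    # 2. record the in-flight change on the OLH
    olh.mark_pending(epoch, "link", version_id)
    # 3. update the OLH to point at the newly written version
    olh.apply(epoch)

idx, head = BucketIndex(), OLH()
put_version(idx, head, "doc", "v1")
put_version(idx, head, "doc", "v2")
assert head.current == "v2"
assert idx.log[0][0] < idx.log[1][0]   # the index imposed a total order
```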
C: Now, about the object versions themselves: these are basically regular objects that are just put in a dedicated namespace, so there is some kind of naming convention marking them as versions of the object. Whenever versioning is turned on on a bucket, every object is created with some kind of version instance appended to its name internally.
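A minimal illustration of such a naming convention; the namespace prefix and separator here are invented for the example, not the actual RGW encoding:

```python
# Hypothetical illustration: the version instance is appended to the object
# name inside a dedicated namespace, so version objects never collide with
# plain object names.
NS = "_versions_"   # the namespace prefix is an assumption

def version_oid(bucket, name, instance):
    # e.g. photos/_versions_/cat.jpg__v17
    return "%s/%s/%s__%s" % (bucket, NS, name, instance)

def parse_version_oid(oid):
    bucket, ns, rest = oid.split("/", 2)
    assert ns == NS
    name, instance = rest.rsplit("__", 1)
    return bucket, name, instance

oid = version_oid("photos", "cat.jpg", "v17")
assert parse_version_oid(oid) == ("photos", "cat.jpg", "v17")
```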
B: A quick question on the S3 versus Swift versioning: as I understand it, the S3 one has a pretty rich API as far as listing versions, rolling back to versions, and all that stuff. On the Swift side it's basically: when you overwrite an object, it first copies it to a different container, and then... yes, yeah.
C
Right
right,
yeah,
I,
honestly,
don't
see
the
value
as
a
splint
and
supporting
any
I
like
it's
more
of
whatever
they're
the
user
requirement,
whether
they
need
versioning.
If
you
have
the
s3
one
and
it
kind
of
encapsulate
everything
you
need,
the
suite
provides,
which
provides
a
specific
API
like
you
need
to
list.
C: We would be able to extend our Swift API to support the S3 object-versioning semantics, right? Yeah, actually. There's not much difference: you just add a field saying, that's the object version I want to read, and it should work fine. OK.
B
And
I
guess
the
second
thing
is
the
I
mean
slipped:
it's
used
as
a
it's,
it's
sort
of
wrapped
up
in
the
way
that
you
do
the
object,
retention
and
object
expiration.
The
same
thing
happens
on
a
spree
right.
So
once
you
have
object
versions
and
eventually
you'll
be
able
to
define
a
policy
it
controls.
I
come
in
and
keep
is.
C: The way I see it, there should be some kind of external agent that would handle everything, probably using some logs that we'll generate, and whether we're dealing with versioned objects or with regular objects isn't going to change much. By the way, saying "external agent" doesn't mean that it's really external;
C: it can run inside the RADOS gateway. OK, yeah. Now, I made some changes to the bucket index. Originally the bucket index was just a plain list of the objects that existed in the bucket; at one point we added some logs into it, all residing in different namespaces. The thing was that, for this specific feature, we needed to be able to list versions embedded in the object listing; we needed to put them in the list of objects.
C
In
in
the
in
the
in
the
same
namespace
on
the
list
of
object
versions
in
the
same
namespace
were
where
the
regular
object
tree
side
we
which
what
wasn't
really
hard
obviously,
but
there
there
is
a
new
notion.
Now
now
there
are
two
types
of
giving
in
the
back
index
ones
that
are
just
use
being
used
as
for
object,
listing
and
in
new
keys
that
are
used
for
actual
object
data
so
for
for
an
object.
Perversion
object
for
each
instance.
C
We'd
have
two
entries
14
just
for
being
being
listed
in
one
for
its
actual
data
and
the
reasons
we
have
to.
If
that
the
version
objects
need
to
be
listed
in
in
order
or
from
the
newer
from
the
newest
to
the
oldest.
So
we
need
to
make
sure
that
that
they're
kind
of
sorted
in
correct
way
and
there's
a
way
to
generate
that.
But
that
makes
it
us.
C
That
we
we
need
to
to
index
them
somehow,
so
we
needed
to
create
a
second
entry
for
each
in
order
to
be
able
to
look,
look
them
up,
and
so
there's
them
the
listing
index
and
the
data
index.
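One way to picture the two-entry scheme: a listing key built so that a plain lexicographic scan returns versions newest-first, plus a data key for direct lookup. The encodings below are illustrative assumptions, not RGW's actual key format:

```python
# Each version instance gets a *listing* key, crafted so a lexicographic
# scan yields newest-to-oldest, and a *data* key for direct access.
MAX = 10**12

def listing_key(name, mtime):
    # invert the timestamp so later versions sort first lexicographically
    return "list/%s/%012d" % (name, MAX - mtime)

def data_key(name, instance):
    return "data/%s/%s" % (name, instance)

index = {}
for instance, mtime in [("v1", 100), ("v2", 250), ("v3", 400)]:
    index[listing_key("doc", mtime)] = instance            # ordered listing
    index[data_key("doc", instance)] = {"mtime": mtime}    # direct lookup

listed = [index[k] for k in sorted(index) if k.startswith("list/doc/")]
assert listed == ["v3", "v2", "v1"]        # newest first, as required
assert index[data_key("doc", "v2")]["mtime"] == 250
```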
C: And then there's the null-version object, which is the one created in a bucket that has versioning suspended. There is a solution for that, but what it means is that for buckets that had objects created before versioning was turned on, once we enable versioning we convert the old entries into new entries, and we mark them as such: where we used to have the original key, we put an entry saying this is not a key anymore.
C
And
then
we
we
return
it,
we
convert
it
into
nu
and
a
new
version,
the
object,
because
we
need
to
keep
it
sorted
in
reverse
order.
So
that's
about
it
think
any
questions.
C: Object versions can be created; object versions get listed in the correct ordering; you can turn versioning on and off, or suspend and enable versioning on a bucket (there are some bugs in that area, but it's mostly working correctly); and you can remove versions and it will roll back to the previous version, which is kind of the main theme here. It mostly works. What's missing: there are two major things. First, there is no timing out...
C
Out
of
you
know
when
marking
olh
and
say:
ok
we're
now
about
to
modify
it
and
then
later
on.
If
you
read
the
old
age,
we
need
to
look
at
it
and
say:
if
there
is
some
kind
of
an
operation
in
progress,
we
need
to
go
to
the
back
into
index
and
in
quick
query
eight
and
potentially
play
the
changes,
keeping
because
the
original
client
that
made
the
changes
might
have
crashed.
C
So
we
do
that
that
we
don't
timeout.
We
don't
like.
We
need
to
make
sure
that
if
it's
an
it's
an
oil
change,
we
remove
it
so
I
didn't
do
that
and
there's
the
whole
multi-region
multi-zone
stuff
that
I
haven't
looked
at
it
yet
so
yeah.
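The missing timeout logic could look something like this sketch; the threshold, class names, and repair rules are assumptions, not the eventual RGW implementation:

```python
STALE_AFTER = 60.0   # seconds; an assumed threshold, not a real RGW constant

class Index:
    """Stub bucket index: knows which epochs it actually committed."""
    def __init__(self, committed):
        self._committed = committed
    def completed(self, epoch):
        return epoch in self._committed

class OLH:
    def __init__(self):
        self.current = None
        self.pending = {}            # epoch -> (op, version_id, start_time)

def read_olh(olh, index, now):
    # Reader-side repair: replay pending entries the bucket index committed,
    # and drop entries that sat pending too long (the writer likely crashed).
    for epoch in sorted(list(olh.pending)):
        op, vid, started = olh.pending[epoch]
        if index.completed(epoch):
            if op == "link":
                olh.current = vid    # replay the committed change
            del olh.pending[epoch]
        elif now - started > STALE_AFTER:
            del olh.pending[epoch]   # stale entry: discard it
    return olh.current

olh = OLH()
olh.pending = {1: ("link", "v1", 0.0), 2: ("link", "v2", 5.0)}
# epoch 1 was committed in the index; epoch 2's writer crashed before logging
assert read_olh(olh, Index({1}), now=120.0) == "v1"
assert olh.pending == {}             # the stale epoch-2 entry was dropped
```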
C: Yeah, at this moment I'm testing everything manually, but I need to create some kind of test suite. I created a new S3 client just for the sake of versioning, because I wasn't very happy with the other clients. It's a Python client, pretty simple; it mainly reflects what boto provides. You can do most of the basic things: create objects, remove objects, list objects, create buckets, enable versioning...
C
Listing
versions
and
everything
not
sure
if
that's
going
to
be
the
as
the
base
for
a
new
test
suite
or
will
be,
I
will
be
using
the
will
be
using
the
s3
tests,
which
the
existing
one
the
problem
with
existing
one
is,
it
doesn't
handle
versioning
nicely.
Oh,
if
you
create
version
bucket,
it
will
know
how
to
how
to
remove
it,
because
there
might
be
some
object,
fighting
their
little
version,
but
that
that
can
probably
fix
easily,
but
in
a
more
higher
level.
C
Question
is
whether
to
create
how
to
create
the
to
generate
all
those
test
cases.
Whether
doing
it
again
is
some
kind
of
a
Python
code
or
doing
it
in
the
more
higher
level
or,
like
you
know,
creating
scrapes
the
trend
that
rut
run
run
the
the
nuestra,
client
or
I'm,
not
sure
prob,
probably
going
with
yes,
we
test.
This
is
a
way
to
live,
though
yeah.
C: I never figured... yeah, but we asked him to reopen it and we discussed it, because there was originally a pull request sent for a branch where it was implemented.
B
Will
just
I
guess
a
stupid
question?
Does
it
doesn't
even
make
sense
to
have
like
a
bucket
snapshot
contact
concept
if
we're
doing
sort
of
fine-grained
object,
versioning
or
would
it
make
more
sense
to
to
if
there
is
like
a
idea
of
snapshotting
a
whole
bucket
set
its
it's
integrated,
with
the
way
that
the
versioning
is
happening.
C
Quite
well,
you
know
snapshot
can
give
you
a
few
specific
view.
It's
one
point
on
the
bucket
yeah
all
right
object.
You
can
give
you
a
at
an
object
level,
not
not
in
the
entire
back
to
travel
right.
One
point
where
packet
snapshots
may
be
useful:
this
for
doing
a
point
in
time,
Ripley
zone
the
application
mm-hmm.
Basically
you
take
a
snapshot.
C
A
C
Correctly
think
about
it
for
a
second
yeah.
B
C
Yeah
yeah,
it's
it's
doing
that
already
just
a
in
it'll
constrain
the
thief.
So
basically,
currently
what?
What
happens
is
that
if
you
have,
if
they
think
agent
needs
a
bit
behind
and
there
are
newer
objects,
so
it
might
be
that
some
of
the
objects
will
will
be
really
new
and
some
will
be
much
older
because
these
it
hasn't
gone
through
the
cycle,
and-
and
this
is
gonna
kind
of
limit
limited
to
all.
The
objects
are
not
going
to
be
newer
than
this
point.
C: At the zone, when doing the replication. I'm not sure if that's what Craig had in mind when he wrote this feature up, and I know that originally we thought about snapshots on buckets for implementing some kind of versioning, but that was a long time ago. Sorry.
C
Yeah
and
well
yeah
I
mean.
A
B
B: So, OK, if we did this with self-managed snaps, it's leveraging the RADOS snapshot stuff. When you take a cluster-wide snapshot, a zone-wide snapshot, I guess the key is that it would work the same way that RBD does, where you would have to send a notify to all the gateways that says: there is now this snapshot, and you start tagging all of your writes accordingly, so yeah.
B: I mean, kind of. You would allocate the snap IDs that you're going to use for the snapshot, and then you would have to sort of tell the gateways to start using them all at once. Right, though a snapshot isn't anything until the client starts using it. Right, right. Oh, oh, it's like a...
C: Yeah, well, it's a solvable issue, yeah, OK.
C: You look at it later on: if the bucket is visited later on and there's some kind of pending entry there, then...
C
But
but
II
you'll
you'll
need
to
disable
all
those
things
for
be
cut
because
then
you
go
and
update
the
packet
index
and
you
don't
wanna
go
up
update
the
backing
index
when
you
in
a
snapshot.
C
A
B
C
Think
we'd
learn
you'd
want
to
do
anything:
okay,
yeah
at
the
time
we
spoke
about
upper
bracket
for
for
implementing,
versioning
and
doing
all
sorts
of
crazy
stuff,
but
it
wouldn't
have
failed.
But
if
you
take
the
global
approach
and.
B: Yeah, OK. Let's see, this is coming after object versioning, probably.
C: We kind of tied it to the next release, I think, or I don't remember how we set it up, but you can control per-object expiration, and then there is some kind of async process that goes and looks at the objects that need to be expired and sends them to the trash, or something like that. And probably, for objects that have been expired, when you try to read them you either succeed or don't succeed, with some kind of logic.
C
There
we
got
to
s3
bucket
lifecycle,
that's
how
it's
cold,
you
do
it
the
bucket.
So
you
say:
okay
on
this
bucket
or
objects
will
be
expired
after
this
amount
of
time
or
all
objects,
its
corresponding
to
this
filter.
All
objects
start
with
the
letter,
A
can
be
can
be.
Expired,
will
be
expired
or
will
be
deleted
that
the
bucket
is
version,
so
the
version
is
going
to
be
created
for
them.
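A toy version of such a lifecycle rule, a prefix filter plus an expiration age. This is purely illustrative; the real S3 lifecycle configuration is an XML document with richer options:

```python
# Sketch of an S3-style lifecycle rule as described: objects matching a
# name prefix expire once they are older than a given number of days.
# On a versioned bucket, "expiring" would stack a delete marker rather
# than erase data.
def expired_keys(objects, prefix, days, today):
    """objects: dict of name -> creation day (an illustrative representation)."""
    return sorted(name for name, created in objects.items()
                  if name.startswith(prefix) and today - created >= days)

objects = {"a-report": 1, "a-draft": 9, "b-notes": 1}
# rule: objects whose name starts with "a" expire after 7 days
assert expired_keys(objects, "a", 7, today=10) == ["a-report"]
```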
C
Oh
they're
gonna
be
trans,
this
transition
into
a
secondary
storage,
but
that's
a
per
bucket,
so
there
needs
to
be
some
kind
of
a
a
whole
bucket
process,
something
that
looks
it
at
the
bucket
and
knows
how
to
to
to
do
those
things
on
the
entire
packet
in
and
to
it
to
to
add
some
complexity.
You
can
turn
it
off
and
on
again
and
once
you
turn
it
off,
it
shouldn't
work
and
if
you
turn
it
on
or
or
it
puts
put
a
new
configuration,
then
then
the
new
stuff
needs
to
work.
C
On
a
bucket
so
yeah,
so
it
sounds
like
a
great
fun.
C
C
C: I'm not sure if it's a different bucket; I think it is a different bucket, but yeah. Right, it's either reduced redundancy, or Amazon has that Glacier thing, Amazon Glacier, you know.
C: Yeah, I think it really... I think a bucket, but I'm not a hundred percent sure; I need to check that. OK, it would be impossible for us to implement something that's not moving it to a different bucket... oh, maybe not, because now we have the OLH, so yeah.
B: Wiring up the reduced-redundancy S3 APIs, though: we can already, obviously, have multiple RADOS pools that back the zone, and you can select which one, sort of, I guess, similar to the way the zone does placement policies, yeah. Well, we could map the reduced-redundancy one to a specific different RADOS pool or whatever, right, to provide that same API.
C
Yeah
we
at
the
moment
you
cannot,
you
can
already
create
multiple
bucket
placements
or
possible,
Iceland,
don't
remember
ice
cold
and
you
can
create
a
bucket
in
a
specific,
with
a
specific
placement
target
using
the
s3
API
other
though
we
have
a
way
to
specify
it,
but
it's
not
not
produced
to
redundancy
API.
It's
just
like
we
say
all
right,
depending
on
how
how
the
admin
named
named
the
bat
bucket
place
on
target,
but
yeah,
we
can
write
up
to
the
reduced
redundancy
API
and
should
just
work.
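Such a wiring could be as simple as keying off the standard `x-amz-storage-class` header. The placement-target names below are hypothetical admin-chosen names, not defaults shipped by RGW:

```python
# Sketch of mapping the S3 storage-class header onto RGW placement targets,
# as suggested above. Only the mapping idea is the point; the target names
# are made up.
PLACEMENT_BY_STORAGE_CLASS = {
    "STANDARD": "default-placement",
    "REDUCED_REDUNDANCY": "rr-placement",   # backed by a cheaper RADOS pool
}

def placement_for(headers):
    # S3 clients send x-amz-storage-class on PUT; absent means STANDARD
    sc = headers.get("x-amz-storage-class", "STANDARD")
    return PLACEMENT_BY_STORAGE_CLASS.get(sc, "default-placement")

assert placement_for({}) == "default-placement"
assert placement_for({"x-amz-storage-class": "REDUCED_REDUNDANCY"}) == "rr-placement"
```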
C: We already do that, OK, OK. And we can have... we already have that: we have a way to give a user a default placement policy, so that each user's buckets will be created in a different policy, and you can limit a user to only use specific policies.
C: At the time I sent a few emails where there were questions, and I explained how to use it, but...
C: Well, currently we just... we don't use a header; it's part of the create-bucket request, there's some kind of field there. But yeah, you can wire that to the bucket placement target and it will just work. For reduced redundancy, we need to make it so that if you specify reduced redundancy, it goes to a specific one. Yeah, we'd just need to define it somehow, nicely, and say which one is the reduced-redundancy one.