Description
Meeting of Kubernetes Storage Special-Interest-Group (SIG) Object Bucket API Review - 17 September 2020
Meeting Notes/Agenda: -
Find out more about the Storage SIG here: https://github.com/kubernetes/community/tree/master/sig-storage
A
Okay, can I share my screen?
B
Okay, so we've started implementing, and as we implement, some new questions have come up. Before I jump into that, I want to quickly recap what we've been doing so far.
B
As of about two weeks ago, we agreed upon the overall design of the system, and we've updated the KEP to reflect the latest designs that we've all agreed on.
B
I think some of you have already reviewed the KEP. For those of you who haven't had a chance yet, I humbly request that you take some time to review the document and leave your feedback. As of last week or so, we started discussing the implementation of the project. Since we have an overall understanding of the design, we can now move forward in terms of implementation.
B
As we started writing code, we ran into a few decision points that still need some clarification. That's what I want to bring up today, starting with the API group. First, I wanted to quickly review what the API group for CSI volumes looks like right now.
B
In the case of CSI volumes, or CSI-specific objects, the API group is storage.k8s.io.
B
This group does not have any implementation-specific details in the group name itself. It doesn't say csi.k8s.io, and it's pretty neutral in terms of what it does. It's just "storage," rather than some specific implementation of it like CSI or Flex or any of the other things we've built before. What we came up with initially for COSI was cosi.sigs.k8s.io.
B
Ideally, when we create a new group like this, it sounds like we're almost creating a new standard that's different from CSI, almost competing with CSI. Ideally, though, we want to be aligned with it and not compete with it.
B
In the sense that it raises the question of whether COSI is something entirely separate. The best way for me to put it is that, the way I see it, object storage is just another form of storage. I really don't know if it should be a separate standard that users of Kubernetes will have to adopt.
B
So I'm thinking, and I'm open to suggestions too: COSI should ideally, in the long run, go into storage.k8s.io, or whatever that API group ends up being, and I can imagine this being a bottleneck for us to move forward with getting the API in. However, if the process of getting the API in is going to be a huge bottleneck, there is another option for the short run.
B
While we're waiting for the API approval to happen, we could use objectstorage.k8s.io as the API group for all the code that we write now. I want to make this decision now rather than later, because all the code we write imports by this package name, and I wanted to get your feedback on how to proceed with this.
B
Okay. Now, while we're implementing, we will be dependent on getting code into upstream.
C
Right, and the nice thing is, the API groups are versioned. You have v1alpha1, and so on.
C
If you introduce it in v1alpha1, I think API reviewers understand that this is not final, and you could potentially deprecate it without concern in the future if a mistake was made. So the bar is a little bit lower.
C
Yeah, I mean, we can run it by the API reviewers. I wouldn't bring it up as a major concern unless they do. I think from the SIG's perspective, I'm okay with storage.k8s.io.
B
Okay, sounds good. So now that that is resolved, there's another concern that we've been dealing with, and I don't have slides for it; I just started creating them. It is deletion.
B
A delete bucket call returns an error if there's data in the bucket, and so a deletion operation will constitute listing the objects in the bucket and then calling delete on either a group of objects or one object at a time, depending on the implementation of the backing store.
B
Now, there's a problem that comes up here, which is: how do we design the deadline for the gRPC call? That is, what if I have an across-the-board deadline for an operation, say something like 30 seconds, or a minute, for any given operation?
D
So I have a suggestion here. I mean, you're correct that some backends will definitely have a very long-running version of this, because they have to do a lot of work to delete a bucket. We should acknowledge that some will be able to do this efficiently and instantaneously, because of some garbage-collection mechanism that they use to actually clean it up after the fact. I think the best way to proceed is to have a return code that can be returned quickly: if the implementation knows that it will probably take a long time, it can just say, "you know what, I got the request, I'm working on it, it's not done," rather than sitting around blocking on some thread. I mean, we have this with snapshots, where you can take a snapshot and it'll return success, but say "I'm still working."
D
The successful response will include a flag that says, "I'm still working; you need to poll me to find out when it's really done." You could do it either through a successful return code plus a flag in the response, or through an error code that is just defined to mean "try again later, I'm working on it."
B
Okay, okay. And the behavior should then be: if you get that "try again later, I'm working on it," should we just put the item back into the queue and then just let it retry?
D
Yeah. I mean, what's the alternative? You wouldn't want to put it back at the front of the queue.
D
I think you have to transition into a "deleting" state, and as long as it's in the deleting state, you periodically try to delete it. As long as it says, "yes, we're still working on that," you just stay in the deleting state and remember to keep polling, maybe with some exponential backoff up to some maximum interval.
D
Yeah, and you could just re-queue it with backoff in the work queue and let the work queue do the backing off.
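The re-queue-with-backoff idea can be sketched in a few lines. In a real controller, client-go's rate-limited workqueue would compute these delays; this standalone version uses illustrative names and constants, not values anyone agreed on:

```python
def requeue_delay(attempt: int, base: float = 2.0, cap: float = 300.0) -> float:
    """Seconds to wait before the next retry of an in-progress deletion.

    Doubles on every attempt (exponential backoff) and is clamped to `cap`
    so polling never falls completely silent, matching the "some maximum
    time" mentioned above.
    """
    return min(base * (2 ** attempt), cap)

print([requeue_delay(n) for n in range(5)])  # [2.0, 4.0, 8.0, 16.0, 32.0]
print(requeue_delay(20))                     # 300.0 (capped)
```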
D
You could also have a mode where you explicitly don't retry until a certain amount of time has passed, if you get back a "still working on it" response. Because if the plugin says, "this is going to take a while," you don't want to try again after two seconds, and after four seconds, and after eight seconds; you probably just want to immediately go to some long wait before asking again.
D
That is, before asking how it's going. But this pushes all of the decision making about whether to block or whether to return quickly down into the plugin, and the plugin might guess wrong. It might return "this is going to take a while" when in fact it's only going to take five seconds, and then it might have been better to just block.
D
But
I
don't
know
you
need
a
heuristic,
I
guess
and
and
the
the
plug-in
implementer
will
have
way
more
information
at
their
level
about
what
the
right
thing
to
do.
B
Understood. Now, do we want to have a deadline on operations at the plugin level? In the sense that we expect every gRPC call to always finish within a minute, or something like that?
D
Well, I don't view that as a deadline. I mean, CSI has a timeout, but it's not a requirement that everything completes within the timeout. It's just how long the caller will block, and if you take longer than that, we're going to call you again later. That's how it's interpreted. And it should probably be similar here; I think it's 30 seconds in CSI. Does anyone know? I don't actually remember.
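That interpretation, where the timeout bounds a single attempt rather than the whole operation, can be sketched like this. The helper name and the 30-second default are placeholders echoing the figure mentioned above, not a confirmed constant:

```python
import concurrent.futures
import time

CALL_TIMEOUT_SECONDS = 30.0  # illustrative; the CSI value was not confirmed

def call_with_timeout(rpc, timeout=CALL_TIMEOUT_SECONDS):
    """Invoke `rpc`, blocking at most `timeout` seconds.

    Returns (result, done). done=False does NOT mean failure: the operation
    may still be running server-side, and the caller simply re-queues the
    item and makes the same idempotent call again later.
    """
    pool = concurrent.futures.ThreadPoolExecutor(max_workers=1)
    future = pool.submit(rpc)
    try:
        return future.result(timeout=timeout), True
    except concurrent.futures.TimeoutError:
        return None, False  # re-queue and retry; don't treat as an error
    finally:
        pool.shutdown(wait=False)

result, done = call_with_timeout(lambda: "deleted", timeout=1.0)
print(result, done)  # deleted True

slow = lambda: time.sleep(0.2) or "deleted"
_, done = call_with_timeout(slow, timeout=0.05)
print(done)  # False: operation outlived this attempt, call again later
```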
G
Right, so to segue: I'm just curious, does CSI expect the drivers, or plugins as you're calling them, to be not only idempotent but also multi-threaded?
D
Okay, yeah. The idea is that you can call multiple RPCs in parallel, and if that causes problems on the plugin side, they can use locking to block out the other calls. They can figure out how to handle locking internally if they don't want to get multiple calls in parallel.
C
There was something about that in the spec. I forget which way we put it: whether it was on the CO side to not do that, or whether we put it on the driver's side to make them more robust. Read the spec.
C
It might have been something like best effort on both sides: make sure your driver is tolerant of cases that would cause it to behave badly, and on the CO side, try to behave appropriately and not make the same call on two different threads.
H
All right. It looks like, for S3, we should expect our driver to be reentrant, right? Because while it's working on a deletion, we may ask it to delete again.
B
Yeah, that's the expectation. Well, the driver has to implement that. The expectation is that how they implement it is left completely to the driver. As was just said, some drivers could be very efficient at it and might have support for it in the actual provider itself.
D
But
like
in
particular,
you
know
if
there's
state,
that's
used
to
ensure
item
potency
inside
the
driver,
you
probably
won't
need
locks
around
that
in
case
you
get
the
multiple
threads
coming
in
but,
like
you,
shouldn't,
be
holding
the
lock
for
the
whole
delete
operation,
because
if
it's
going
to
take
like
an
hour
to
delete
everything
and
you
hold
a
lock,
so
you
can't
do
any
other
work.
That's
that's
also
very
bad,
so
yeah.
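The locking guidance can be sketched as follows: guard the shared idempotency state with a short critical section, but never hold that lock across the long-running delete itself. Everything here (function names, the status strings) is hypothetical:

```python
import threading
import time

_state_lock = threading.Lock()
_deleting = set()  # bucket ids with a delete already in flight

def delete_bucket(bucket_id, do_delete):
    """Idempotent delete: a second concurrent call just reports status."""
    with _state_lock:                # short critical section only
        if bucket_id in _deleting:
            return "IN_PROGRESS"     # someone else is already deleting it
        _deleting.add(bucket_id)
    try:
        do_delete(bucket_id)         # potentially takes hours: no lock held
    finally:
        with _state_lock:
            _deleting.discard(bucket_id)
    return "DONE"

print(delete_bucket("b1", lambda b: None))  # DONE
```

A re-entrant call while the first delete is still running sees `IN_PROGRESS` immediately instead of blocking behind the hour-long operation.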
D
The next worker might make that create call again while the original one is still hanging, because the socket maybe didn't go away. So the driver is still processing that create call, and then another create call comes in with the exact same parameters while you're still processing the first one, just because of the sidecar. You can't avoid situations like that, so best effort is probably right.
H
...the object storage infrastructure, because they also need to handle that situation, right? So we can just rely on that. With deletion, it's different, because, as I understand it, our S3 driver is going to go through all the objects and delete them one by one, or something like that. In that case, we probably cannot rely on the underlying object storage, and we have to have some mechanism of our own.
D
In ours, to ensure idempotency, you're going to need some kind of locking, because you have a name that comes in and you return an ID, and that ID should always be the same for a given set of inputs. You have to be careful about generating that ID, making sure that two different calls coming in with the same parameters at the same time couldn't generate two different IDs, because then you don't know which one is which.
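The race being described can be closed by serializing ID generation on the request name, as in this sketch. The helper is purely illustrative, not part of any agreed COSI interface:

```python
import threading
import uuid

_lock = threading.Lock()
_ids = {}  # bucket name -> generated id

def bucket_id_for(name):
    """Return the id for `name`, generating it at most once.

    The lock ensures two concurrent calls with the same name can never
    each mint a fresh id; the second caller always sees the first's.
    """
    with _lock:
        if name not in _ids:
            _ids[name] = str(uuid.uuid4())
        return _ids[name]

print(bucket_id_for("photos") == bucket_id_for("photos"))  # True
```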
H
Yeah, right. Correct me if I'm wrong. I can talk about GCS, for example: in GCS, when you create a bucket, the bucket name is actually an idempotency token. So if you try to create the same bucket with the same name, the backend will just properly return you some code, an error code or whatever. So you can actually rely on the backend there.
H
Yep. For S3, I'm not sure, but I believe in Amazon you can use something like an idempotency token parameter, and we can pass, for example, some sort of UID, and this UID will be the same each time. It could be taken from, I don't know, the UID of our object bucket resource, and we can pass it every time, so it will correspond to a specific underlying bucket.
B
Yeah, we're definitely going to pass that. Now, the question becomes... well, for creation, it's pretty straightforward.
B
Yeah, mind you, we don't do any sort of update bucket right now. Okay, I think... I can't find a flaw in this. I think this sounds good.
H
I think deletion is the tricky part, but it's a good question you raised, because deletion might be implemented in different ways by different object storage providers. And if some of these storage providers require all the objects to be deleted before the actual bucket is deleted, then we just need to architect our gRPC API, which is going to talk to these plugins, so that if you ask it to delete, sometimes the response will be something like "in progress," and you need to try again later.
B
Yeah, okay. So this clears up a lot; we know how we want to proceed with the design of this. Are there any other questions anyone wanted to bring up?
B
Yeah, I just read them before this meeting. I think that makes total sense. I'll quickly open it up, the kubernetes/enhancements repository, yeah. So the comment was that an AlreadyExists error is returned when you have different parameters sent for the same resource.
J
This way, you get true idempotency, because basically, if you call CreateBucket with exactly the same set of parameters, you always get the same result. So if it's a new CreateBucket request, after the bucket is created you get OK. But then if you call it again, it's more like a query, and that also returns OK, not AlreadyExists, because otherwise you're returning two different codes for the same input.
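The convention being described can be sketched like this: an identical repeat of CreateBucket acts as a query and returns OK, while the same name with different parameters gets AlreadyExists. The names and response shape are illustrative only:

```python
_buckets = {}  # bucket name -> parameters it was created with

def create_bucket(name, params):
    """Idempotent create: same inputs always produce the same output."""
    existing = _buckets.get(name)
    if existing is None:
        _buckets[name] = dict(params)
        return "OK", dict(params)       # newly created
    if existing == params:
        return "OK", dict(existing)     # identical repeat: acts like a query
    return "ALREADY_EXISTS", None       # same name, conflicting parameters

print(create_bucket("b", {"region": "us-east-1"})[0])  # OK
print(create_bucket("b", {"region": "us-east-1"})[0])  # OK
print(create_bucket("b", {"region": "eu-west-1"})[0])  # ALREADY_EXISTS
```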
J
The error code used by CSI is exactly the same, AlreadyExists. It's a little...
B
Doesn't Conflict also apply? We have that one as well.
B
Yeah, Conflict in Kubernetes generally means the resource version doesn't match when you're updating, yeah.
J
Now, that's the pure gRPC error code, but if you look at the CSI spec, the meaning is not exactly the same. If you go look at the CSI spec, it basically just says...
C
In CSI, the idea is that every call has to be idempotent, and idempotent means that the same response should be received for the same request, which is exactly what Sheng is saying. So if you have an existing object, then just return success and give us the fields of that object; that's it. This error code is reserved for specific cases where we want to point out that we can't complete an operation because it already exists.
C
Yeah, yeah. You depend on the storage vendor to tell you that; the storage vendor either relies on its own backend...
B
Nope, nobody knows. Okay, we can figure it out. Okay, that's it, that's it from my side today. I wanted to go over these questions that were on our minds, and I think we cleared them up for the most part.
H
Something like CreateSnapshot is considered non-blocking, right? And having CreateBucket be like that...
J
CreateSnapshot is blocking, actually. It is blocking, yeah. Well, it blocks until the snapshot is cut, but then you call it again. So it's blocking until the snapshot is cut, and then the driver can still be doing work in the backend.
J
I'm saying we actually call CreateSnapshot multiple times, so CreateSnapshot is also like a GetSnapshot, actually. Basically, it's supposed to block until the snapshot is cut, and then there's a second step, which is the upload for some cloud providers. In that case, basically, the central controller will call it again, effectively as a status query, until it is ready, when the upload is finished.
C
True, yeah. Basically, it comes down to this: we look at it on an operation-by-operation basis. If we believe that an operation is going to take a long time to complete, and CreateSnapshot was an example of that, and delete bucket is an example of that, then we follow this model. Otherwise, we just do blocking.
J
Delete bucket could be a two-step as well. Like I said, you have several steps; you could return in the middle. How do you decide when to?
C
I think it's up to the driver to decide what it wants to do. It can make it one step or two steps. If it's going to complete fairly quickly, it can complete it in one step: block, then return the success code and say, "I'm done, everything is successful." Or, if it knows it's a long-running operation, it can convert itself into a two-step process by returning immediately and saying, "hey, I've started this thing, but it's not complete."
C
"Please follow up and verify that I've completed it; this will be a long-running process." And then it is up to the caller, the sidecars, to continuously poll and get that status until it's finally deleted.
C
Yeah, so from that perspective it is slightly different, completely agreed, but I think we can follow a very similar model.