From YouTube: OCI Weekly Discussion - 2021-03-10
Description
OCI weekly developer's call recording from 10 Mar 2021. Notes/agenda here: https://hackmd.io/El8Dd2xrTlCaCG59ns5cwg#March-10-2021
A: On that note, John... we do have a number of items, so let's try to be fair, because we try to leave room for the presentations. Discussions tend to run longer, and if we don't have a longer discussion, then what's the point of having a discussion? So, base image: what we actually can do is the base image annotation proposal, because that one, I think, will be short, and we can time-box it at maybe seven minutes.
A: All I see is -20... holy crap, Mike's dead, maybe. One sec, okay, hold on. First of all, here's our HackMD, so everybody please sign in, okay? I was trying to capture more. So with that, why don't we get to content encoding? Well, John can't do that one either. John, if you can't talk, we're just going to punt all your stuff to next week.
B: Sure, so yeah, I mean, I was the one who originally made the issue, and this is my first time petitioning or coming to the OCI for spec stuff, so yeah, I'm a little new to this, I guess. But we basically have a...
B: ...what specific formats would entail in terms of CPU overhead, and how much benefit that would have in terms of container start times, or container push times for builders. And what we found was that different users get significantly different benefits from different compression algorithms. For example, in the data center, where networking is super fast, having no compression is ideal, and the OCI image spec added the ability to have layers without compression in them.
B: So you can just upload a tarball layer as opposed to a tar+gzip layer. And for uploaders, for example, or, excuse me, for home users, doing a very high level of zstd compression is actually beneficial, even though it's very CPU intensive. It turns out, now that everyone's working from home, you know, people have like 10-meg internet.
B
You
can
spend
all
the
cpu
time
in
the
data
center
and
it
pays
off
so
based
on,
go
ahead,
sure,
pros
and
cons,
so
yeah
you're
never
going
to
make
everybody
happy
right
in
the
current
distribution,
spec
and
image
format
don't
allow
for
different
users
to
get
different
re-encodings
of
the
same
layer
or
different
compressed
versions
of
the
same
layer
unless
you
d
duplicate
that
in
the
image
itself
and
that
just
becomes
wasteful
in
terms
of
storage
server
side
and
it
doesn't
allow
for
for
upgrades
in
flight.
B: So the proposal primarily talks about adopting the HTTP content-encoding spec, which allows the server to serve up different encodings of the same file format, or same file, excuse me, to different users, depending on, you know, how they decide to set their Accept headers.
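The negotiation described here is plain HTTP content negotiation applied to blob pulls. A minimal sketch of the client side, assuming a hypothetical registry host and a placeholder digest (neither is from the proposal text):

```go
package main

import (
	"fmt"
	"net/http"
)

func main() {
	// Placeholder registry URL and digest, for illustration only.
	url := "https://registry.example.com/v2/myapp/blobs/sha256:0000000000000000000000000000000000000000000000000000000000000000"

	req, err := http.NewRequest("GET", url, nil)
	if err != nil {
		panic(err)
	}
	// A data-center puller asks for the raw tarball; a home user on a
	// slow link might send "gzip" or "zstd" here instead.
	req.Header.Set("Accept-Encoding", "identity")

	resp, err := http.DefaultClient.Do(req)
	if err != nil {
		panic(err)
	}
	defer resp.Body.Close()

	// The layer digest always refers to the identity (stored) bytes; the
	// Content-Encoding header only describes the transfer representation
	// the server chose from what the client said it accepts.
	fmt.Println("Content-Encoding:", resp.Header.Get("Content-Encoding"))
}
```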
B: There's a small part of the issue that describes a mechanism to determine which content encodings are available for uploaders, because HTTP doesn't allow the client to interrogate the server in a trivial way. So that's really the only extension there, but the big benefit is on download.
B: ...how that would work with the content encoding on the download or the upload side.

C: ...
B: Like, in the data center, we would configure all of our Docker daemons to pull and say, you know, accept identity, or Accept-Encoding: identity. So: don't ask for a compressed version of the data. And it would serve the tarballs as-is, or, you know, 302 to S3 with the tarballs as-is. For home users who are doing, like, a docker pull...
B
It
would
be
able
to
do
on-the-fly
compression
for
images
that
have
not
been
fetched
before
and
then
the
first
time
you
fetch
a
layer,
it
would
be
able
to
re-upload
it
to
the
s3
store,
with
after
doing
compression
with
that
level
and
subsequent
polls
would
be
302
to
that
new
poll.
That's
like
the
very
specific
of
how
we're
planning
on
doing
this,
and
we
do
this
with
our
like
java,
tarballs
or
java
jars,
for
example.
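A minimal sketch of the pull flow just described, assuming a hypothetical BlobStore interface (all three methods are placeholders, not a real API); gzip stands in for whatever encoding was negotiated, and auth and error handling are omitted:

```go
package registry

import (
	"compress/gzip"
	"io"
	"net/http"
	"strings"
)

// BlobStore stands in for the backing store (e.g. S3).
type BlobStore interface {
	Get(digest string) io.ReadCloser            // identity (uncompressed) bytes
	CompressedURL(digest string) (string, bool) // pre-compressed copy, if one exists
	CompressInBackground(digest string)         // store a compressed copy for next time
}

func serveBlob(store BlobStore, w http.ResponseWriter, r *http.Request, digest string) {
	// Data-center pullers send Accept-Encoding: identity and get the
	// tarball as-is (a real deployment might 302 to S3 here as well).
	if !strings.Contains(r.Header.Get("Accept-Encoding"), "gzip") {
		blob := store.Get(digest)
		defer blob.Close()
		io.Copy(w, blob)
		return
	}

	// Later pulls: a compressed copy already exists, so redirect to it.
	if u, ok := store.CompressedURL(digest); ok {
		http.Redirect(w, r, u, http.StatusFound) // 302
		return
	}

	// First pull of this layer: compress on the fly for this client and
	// kick off a background job so subsequent pulls hit the redirect.
	store.CompressInBackground(digest)

	blob := store.Get(digest)
	defer blob.Close()
	w.Header().Set("Content-Encoding", "gzip")
	gz := gzip.NewWriter(w)
	defer gz.Close()
	io.Copy(gz, blob)
}
```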
B: They would have to reference uncompressed tar layers in order for this to work. Okay, yeah, that makes more sense.
E: That was going to be my, like, concern also: yeah, it breaks the content addressability if you're somehow expecting the registry to re-encode the layers. But if it's just, yeah, if it's just on the content transfer, I guess that makes sense.
B: Yeah, unfortunately, there's not really a way yet to tell people that they should upload their manifests without compression, but that's something that... like, at least we own some of the build infrastructure.
B: So we can start to migrate people to doing that, and then add a rejection policy when we have enough people moved over. But right now, if they upload a tarball, it's super slow, because their upload time is just, like, abysmal: they're uploading 10x the amount of data, because it's uncompressed, because there's no way to say, you know, "upload this in a compressed fashion". And then in the data center, like, they want it uncompressed when we're pulling it...
D: ...
B: The one other thing that's not in the proposal as of today, because the standard is not standardized upstream yet, is zstd custom dictionary support. So this was a case where we found doing offline dictionary generation can get us almost... like, we already have, like, a 90-something percent compression factor, and it can get us from, like, 93-94 to, like, 97-98 percent compression factor. So it's a pretty significant win you can get from doing offline compression.
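To make the dictionary point concrete, here is a sketch of zstd compression with a pre-trained dictionary. The github.com/klauspost/compress/zstd package and the offline `zstd --train` step are our illustration choices, not tooling named in the call, and the file names are placeholders:

```go
package main

import (
	"bytes"
	"os"

	"github.com/klauspost/compress/zstd"
)

// compressWithDict compresses a layer using a shared dictionary that
// was generated offline over representative sample layers.
func compressWithDict(layer, dict []byte) ([]byte, error) {
	var buf bytes.Buffer
	enc, err := zstd.NewWriter(&buf,
		zstd.WithEncoderLevel(zstd.SpeedBestCompression),
		zstd.WithEncoderDict(dict), // dictionary trained offline
	)
	if err != nil {
		return nil, err
	}
	if _, err := enc.Write(layer); err != nil {
		return nil, err
	}
	if err := enc.Close(); err != nil {
		return nil, err
	}
	return buf.Bytes(), nil
}

func main() {
	// Dictionary produced offline, e.g.: zstd --train samples/* -o layers.dict
	dict, _ := os.ReadFile("layers.dict")
	layer, _ := os.ReadFile("layer.tar")
	out, err := compressWithDict(layer, dict)
	if err != nil {
		panic(err)
	}
	os.WriteFile("layer.tar.zst", out, 0o644)
}
```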
A: I think we all recognize the, like... I was watching, I had container day things today, and there were three different sessions on compression formats and Nydus and different approaches and so forth. So I think we're all seeing that. And to Mike's point, like, there's always trade-offs: how do you support both, whether we're all working from home and how long that lasts, or what does a product like Docker Hub do, which, for the most part, never has somebody close to it?
A
So,
and
then
how
much
does
the
author
need
to
play
a
role
in
that
difference
like
this
is
yeah
we're
experimenting
this
with
a
teleport.
We
see,
you
know
we
try
to
make
it
completely
transparent
to
the
user,
the
user
uploads
it
we
expand
it.
There's
teleport
nodes
that
know
how
to
say
they
negotiate
say:
hey,
I'm
teleport
enabled.
Are
you
in
the
same
region?
Do
you
is
that
expanded?
Yes,
yes,
then
bang
it's
done
and
it's
much
faster.
I
think
the
problem
we're
facing
is:
where
does
that
happen?
A: The other thing I'll just add is that one of the reasons I've been pushing back and concerned about this is that we keep on assuming that registries, which are, generally speaking, dumb storage devices, right: there's a "put this in" and there's a "give us this", and we promised that what comes back out is the same thing that went in. Between the TUF stuff looking for upstream updates and timestamps, and some of these conversations where we're doing conversion on the fly...
A: Not only is it going to be compute-intensive, potentially, but it's also a matter of: should we be changing content on the fly? So I just wanted to raise those two concerns for us to think about: how do we incorporate these?
B: So, on your first point about digests: you do the digest after doing the encoding/decoding step... or, I guess, before the encoding step and after the decoding step. So the content encoding and the, like, content-addressable nature of the store have nothing to do with each other. The content encoding is just, like, an artifact of the fact that the registry protocol is over HTTP, and storage on disk will all be the same. And then the other thing is, like, it's totally...
B
Opt-In,
there's
no
demand
that
the
user
has
to,
or
the
registry
has
to
do,
on
the
fly
mutation.
You
can
still
allow
users
to
upload
with
manifest
that
have
tar
balls
in
them
that
are
compressed
with
gzip.
I
would
say
that,
like
it
would
become
it.
It
is
becoming
very
exorbitant
to
do
z,
state
plus
g,
zip,
plus
uncompressed,
and
storing
that
at
rest
and
that's
becoming
more
expensive
compared
to
the
compute,
at
least
from
the
economics
in
aws.
A
Have
you
done
the
cost
in
the
compute?
Like
that's,
that's
part
of
the
storage
is
not
free
right.
We
don't
necessarily
charge
as
much
as
we
should,
and
it's
not
just
the
size
of
the
storage.
It's
also
there's
just
a
bunch
of
overhead
related
to
how
many
list
apis
can
support
how
many
objects,
the
deletion
management,
then
the
the
in.
F: ...
A: I... yeah, I don't know if you got another mic or something; it's hard to... if you have another option... so, this...
E: ...

G: ...
B: It allows for them to do this. It does not require them to do this.
G: Right, right. And then, I mean, there's also the opportunity of doing lazy pulls. I think we've got a stargz little thing that we could show off, right, in containerd.
G: I'm not sure how that would affect this; we'd have to take a look at that. You don't want to have these compressed packets sitting in a cache on the server for a very long time period, right?
B
So,
like
I,
as
far
as
the
star,
gz
format
would
not
be
significantly
affected
by
this
unless
the
server
is
trying
to
do
on-the-fly
compression
because
of
the
way
that,
like
some
of
the
compression
algorithms,
the
streaming
compression
algorithms
require
that
you
see
through
the
entire
file
before
you
do
compression
you're.
B
Yeah,
like
the
the
economics
going
back
to
steve's
point
on
upload,
like
we
storage,
is
cheap
for
a
day
and
then
overnight
compute
gets
really
cheap.
You
run
compression
overnight
at
like
z,
sid
13,
and
you
compress
overnight
and
the
next
day
your
like
your
compute,
was
basically
free
and
your
storage
cost
has
now
gone
down
and
you've
cut
it
by
fifty
percent
and
yeah.
That's
basically
the.
B
Yeah
and
also
that
we
have
no
way
of
upgrading
right
now
like
if,
if
we
go
from
z
stage,
six
as
the
default,
which
I
think
is
the
what's
in
the
image,
spec
there's
no
way
for
the
server
to
be
like.
I
want
to
use
zested
extreme,
which
can
have
significant
benefits
compared
to
the
whatever's
in
the
image
spec.
G: ...
B: It requires that you use a specific level, because the image spec does the content digest after the compression.
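A small demonstration of that point: because the descriptor digest is computed over the compressed bytes, the same tar compressed at two gzip levels yields two different digests, so a registry cannot silently re-encode a layer at another level without breaking references. (The sample data is a placeholder.)

```go
package main

import (
	"bytes"
	"compress/gzip"
	"crypto/sha256"
	"fmt"
)

// gzipDigest compresses data at the given gzip level and returns the
// sha256 of the *compressed* bytes, as the image spec does for layers.
func gzipDigest(data []byte, level int) string {
	var buf bytes.Buffer
	gz, _ := gzip.NewWriterLevel(&buf, level)
	gz.Write(data)
	gz.Close()
	return fmt.Sprintf("sha256:%x", sha256.Sum256(buf.Bytes()))
}

func main() {
	layer := bytes.Repeat([]byte("example layer content "), 1024)
	// Same input bytes, different compression levels => different digests.
	fmt.Println("level 1:", gzipDigest(layer, gzip.BestSpeed))
	fmt.Println("level 9:", gzipDigest(layer, gzip.BestCompression))
}
```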
A: ...
B: Yeah, so, I mean... there's two parts of it. I can turn it into more formal language and break up the two parts, if that's helpful, but yeah. Or I can wait for feedback on the issue before doing that.
B
I
believe
that
things
go
very
pear-shaped
if
you
start
to
mix
and
match
gzip
levels
within
a
given
registry,
in
your
your
cost
kind
of
explode
as
well.
G: ...

A: ...

H: Yeah, only Justin and vbatts commented, that I can see.
A: All right, why don't we give this some bake time, just because there's a few people that commented; it's certainly a meaty one, and go from there. And then, because this encoding one is very much a similar conversation...
C
I
think
simpler
for
us
to
resolve
like
that's
it's
just
it's
just
like
it's
just
adding
functionality.
That's
in
hdb
today,
just
formalizing
it,
the
duplicate
ones,
someone
wrapped
my
head
around
with
what
that
one's
asking.
A: ...

C: I would say formalize 235. Did we even discuss 236? Because I had that one open as well, but we haven't talked about 236 at all. Yeah, so yeah: I think formalizing 235, like you mentioned, makes sense. I'd like to see what, like, the formalized version of that would be, so we can run through where the client edge cases might be. Did I break up the...
B: ...

C: Gotcha. So, with... without that today, the clients would have to assume identity, because if they start sending compressed content up... I... yeah.
B
Yeah
or
our
experience
with
playing
with
this
with
registries,
is
that
they
just
will
take
the
content,
encoding
and
store
the
like
double
encoded
version
or
the
the
encoded
version,
and
then
when
they
serve
it,
they
don't
didn't
like
record
that
header,
so
things
go
really
weird
right
now,
so
you
need
some
way
to
tell
like
have
the
registry
say
that
I
support
this
feature.
B
Yeah
we
played
with
them,
I
think
so.
We,
the
the
standard
distribution
we
used
and
I
feel
like
the
other
one
we
used,
but
both
basically
just
took
the
blob
and
stored
it
and
ignored
the
content
and
coding
header
completely.
C: I...
B: So, where we are: we're decompressing the blob on upload and storing it in S3 decompressed, until the, like, cleanup process, the janitor, comes along, and then the janitor turns it into a zstd version. That's how we're going about it. And then our registry can see if this zstd version exists and will serve that up instead.
C
Yeah,
I'm
just
trying
to
think
of
how
a
registry
would
like
just
any
generic
registry
would
implement
something
like
this,
because
they
don't
necessarily
know
the
content,
the
content
type,
that's
being
uploaded.
They
only
know
the
digest
in
the
end.
A: ...

C: If you have a somewhat eventually consistent back end, right... for the most part, like, the manifests are uploaded... I guess the manifest is usually uploaded afterwards anyway, in the flows. So you don't even know: you just start getting bytes coming up. You'd have to look for compression headers.
B
The
I
mean
the
generic
way
to
implement.
This
is
say
that
you
only
accept
the
identity
in
coding
and
only
store
identity,
blobs.
B
So
that
that's
so
what
we
want
to
do,
at
least
for
our
use
case,
is
that
people
who
are
building
and
pushing
on
their
laptops
will
compress
with
zstead
people
who
are
building
in
the
data
center
will
upload
identity,
blobs
and
then
any
generic
registry
that
wants
to
implement
this
would
say.
I
only
accept
the
identity,
encoding
and
store
the
identity,
encoded
stuff
in
its
blob
store.
B
So
if
the
content
encoding
flag
is
missing
or
not
present,
it
is
implicitly
identity,
encoding.
C
I
guess
the
promise,
so
the
identity
coding
just
means
that
it's
completely
opaque
to
the
registry,
but
the
content
that
comes
up
could
still
be
compressed,
so
you
could
still
end
up
double
compressing
stuff
unless
you
actually
look
at
the
content
right,
yeah
but
like
so
yeah.
That's
that's
why?
I
think,
like
you,
you
really,
if
you're
an
environment
where
you're
controlling
the
build
side
and
the
registry,
it
makes
sense.
But
if
you're,
just
like
a
generic
registry
service
like
it
seems
hard
to
to
handle
that.
B: ...

D: ...

B: Right. Or, I mean, if either the client or the server had some level of intelligence, you could get benefit out of...
B: ...

D: Right, but that digest is on whatever compressed bits you upload. So either the client needs to know in advance that the registry is going to handle the compression, and thus provide it the identity digest, or the uncompressed digest, and let the registry handle the compression; or it needs to know that it should compress the blob and provide a digest that is of the compressed blob.
B: ...

C: Yeah, obviously you can't touch the content if it's part of the digest. I think that's kind of what it's getting at, though: that unless the builders are building with uncompressed data, you don't get any benefit from it. And you wouldn't necessarily automatically do uncompressed unless you knew you were using a registry that was taking advantage of this. So yeah, it makes sense to me. I would like to see, if you want to formalize that, what it would actually look like. It doesn't do any harm.
C: ...

A: I think there's an interesting point around: what do you do about gzip? And, wait, did I read that right?
B
I
mean
I
I
I
can
add
this
to
the
language
but
like
if
a
client
is
uploading,
data
entered
as
a
smarter
client
and
it
sees
that
the
server
side,
content
or
accepted
encoding
is
identity,
gzip
bz2,
let's
say,
but
it
only
wants
to
upload
a
targey
z
blob,
because
it's
builder
built
only
a
tarjay
zz
blob
and
the
manifest
that
it
has
is
only
a
charge,
easy
blob.
It
would
then
have
to
upload
a
targey
z
format
with
the
identity,
encoding.
A
I'm
just
trying
to
wrap
my
head
around
the
sequence
of
events
is:
when
does
it
get
known
right
because
the
blobs
get
uploaded,
they
get
uploaded
through
the
rest
api.
So
it's
not
like.
It's
go
straight
to
storage,
so
there's
a
chance
to
to
look
at
it,
but
the
manifest
is
a
separate
asynchronous
put
which
has
to
be
after
the
blob
is
set,
so
it
can
validate
it.
Where
is
the
understanding
and
correlation
of
the
two.
B: ...

A: ...

G: Gzip wouldn't be the only solution. The intention was that you could extend the image spec to, you know, have other formats as well, but this makes a lot more sense: to just use identity, you know, basic tar, when you're doing a push. Pushes aren't the primary case for using a registry, right? It's the pulls. So this solution, I think, makes more sense than the other prior discussions that we've had, right, where we could just push an identity and then have that be...
C
Good
intentions,
but
I
think
we
realized
pretty
early
on
that
that
wasn't
the
best
idea
to
just
ease
up
everything
and
it
it
was.
We
actually
thought
of
pulling
that
back
a
few
times
and
the
clients,
but
the
reason
we
didn't
is
because
yeah
the
havoc
it
would
reach
on
havoc
for
registries
which
aren't
handling
compression
at
all,
just
bloat
storage,
so
yeah,
it's
kind
of
a
difficult
problem,
but
the
client
I
mean
there's
nothing
stopping
builders
today
from
doing
uncompressed.
B: The really common use case is that someone updates the Ubuntu base image, and then everybody in the company builds on top of that new Ubuntu base image, and there's a docker push. And when they do the docker push, they end up having to pay for upload time on those lower layers that the registry already has.
B
We
don't
have
any
way
for
one
trusted
clients
to
get
a
deduplication
upload
unless
they
do
unless
they're
able
to
figure
out
which
other
repository
has
that
store
in
it
and
a
lot
of
clients.
Don't
keep
track
of,
like
repository
name
to
blob.
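For context, the mechanism that exists today is the distribution spec's cross-repository blob mount, whose `from` parameter is exactly what forces clients to remember which repository already holds the blob. A sketch (host, repository names, and digest are placeholders):

```go
package main

import (
	"fmt"
	"net/http"
	"net/url"
)

// mountBlob issues the existing cross-repository blob mount request.
// 201 Created means the blob was mounted with no data transfer;
// 202 Accepted means the mount failed and a normal upload session began.
func mountBlob(registry, targetRepo, sourceRepo, digest string) (bool, error) {
	u := fmt.Sprintf("https://%s/v2/%s/blobs/uploads/?mount=%s&from=%s",
		registry, targetRepo, url.QueryEscape(digest), url.QueryEscape(sourceRepo))
	resp, err := http.Post(u, "", nil)
	if err != nil {
		return false, err
	}
	defer resp.Body.Close()
	return resp.StatusCode == http.StatusCreated, nil
}

func main() {
	ok, err := mountBlob("registry.example.com", "team-b/app", "team-a/base",
		"sha256:0000000000000000000000000000000000000000000000000000000000000000")
	fmt.Println("mounted:", ok, "err:", err)
}
```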
B
So
this
basically
allows
the
client
to
say
I
am
trying
to
upload
file
with
blob
descriptor
sha,
256
foo,
and
the
registry
then,
can
do
a
proof
of
data
possession
challenge
against
the
client
and
allow
the
client
to
prove
that
it
has.
This
data
to
securely
do
deduplicate
at
upload
time.
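One plausible shape for such a challenge, as an illustration only (the call does not specify the proposal's exact scheme): the registry returns a fresh random nonce, and the client proves possession by hashing nonce plus blob, which the registry can verify against its own copy.

```go
package main

import (
	"crypto/rand"
	"crypto/sha256"
	"fmt"
)

// newChallenge is the registry side: a fresh random nonce per upload attempt.
func newChallenge() []byte {
	nonce := make([]byte, 32)
	if _, err := rand.Read(nonce); err != nil {
		panic(err)
	}
	return nonce
}

// prove hashes nonce || blob. Because the nonce is fresh, the proof
// cannot be replayed or derived from the public sha256 digest alone;
// the verifier recomputes it from the bytes it already stores.
func prove(nonce, blob []byte) [sha256.Size]byte {
	h := sha256.New()
	h.Write(nonce)
	h.Write(blob)
	var out [sha256.Size]byte
	copy(out[:], h.Sum(nil))
	return out
}

func main() {
	blob := []byte("layer bytes held by both client and registry")
	nonce := newChallenge()           // registry -> client
	clientProof := prove(nonce, blob) // client -> registry
	fmt.Println("verified:", clientProof == prove(nonce, blob))
}
```

Note that a construction like this forces the verifier to re-read the whole blob per challenge, which is the "ad hoc generation of proof points" trade-off raised later in the call against schemes that store precomputed proofs or use tree hashes.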
A: So I love the idea, because... I've always hated that when I push an image that the registry should surely know of, I have to wait for the upload just for it to say: yep, I already got it. But it had to upload it first, and then it just tosses it on the server. Like, for me, the user, the expensive part was already paid.
A
The
thing
that
I
just
always
get
nervous
about
is
the
disclosure
challenge,
so
I
think
we
just
have
to
somehow
specify
that
and
I'm
just
going
to
look
at
mike,
because
I
can
see
him
standing
there
if
mike
and
I
have
repos
next
to
each
other
on
the
same
registry
and
that
registry,
you
know,
has
decided
that
it
doesn't
allow
sharing
of
layers
across
two
security
boundaries,
then
that
it
shouldn't
it
shouldn't
basically
acknowledge
that
that
layer
exists,
because
that
basically
tells
me
that
that
content
already
is
in
the
registry,
and
I
can
somehow
circumvent
it
now.
A
If
mike
and
I
have
the
same
permission,
boundaries
and
I
happen
to
have
probably
push
in
addition
to
pull
rights,
then
it
makes
perfect
sense
for
it
to
be
smarter
and
not
upload
and
to
be,
you
know,
just
say,
don't
even
bother
uploading.
I
already
have
it.
B
Yeah
so
there's
this
is
addressed
in
the
proposal
where
the
registry
can
do
one
of
two
things:
it
can
either
completely
ignore
issuing
challenges
and
never
issue
a
challenge
allowing
the
user
to
deduplicate,
or
it
can
always
issue
a
false
challenge.
The
issue
with
with
this
is
that
at
push
time,
if
the
user
tries
to
fulfill
that
challenge,
you
are
open
to
a
bunch
of
cryptography,
related
timing
attacks
and
that's
called
out
explicitly
that
this
does
not
try
to
be
resistant
to
this
proof
of
data.
B
Possession
protocol
does
not
try
to
be
resistant
to
timing
attacks
and
that
trust
boundary
evaluation
would
have
to
be
done
at
evaluation
of
the
proof
of
data
possession.
B: On the other hand, registries like Docker Hub: if a Docker Hub node, or the Docker Hub store, has a public image, it can issue challenges against images that refer to blobs in the public, but you never want to refer to someone else's blob in the private. Now, that's complicated logic. So, like, you know, for Docker Hub, it would probably never issue challenges, but for corporate registries, or other registries... like, you know, if you have an Amazon ECR, you might want to be able to do this.
B: ...

A: So I think it's just: if you address the security-boundary access issue and allow that negotiation to be done. Because we've had this debate before here, where some registries consider it perfectly fine to share images across org boundaries, because there's more beneficial savings in storage than the concern for, you know, hacking somebody else's layer or somebody else's image. So I think there's just a trade-off. So, as long as the registry and the client can negotiate that, and let the registry decide what the boundary is for them.
A
That
point:
that's
why
I
kind
of
use
mike
and
I
as
two
repos
like
it-
should
allow
cross
repo
mounting
not
just
in
the
same
repo,
but
it
must
somewhere.
There
must
be
a
determination,
it
says,
but
I
actually
do
have
the
rights
to
mike's
repo,
for
instance.
So
it's
allowed
to
to
do
that.
I
think
we're
going
to
see
this
more
and
more.
It's
not
just
working
from
home,
we're
pushing
more
and
more
for
ephemeral,
client
build
environments
because
it's
the
only
way
to
have
a
safe.
A
You
know
build,
so
does
that
mean
that
that
every
one
of
those
clients
isn't
going
to
know
about
the
images
that
it
already
pushed
or
the
layers
that
already
pushed
so
that's
kind.
G: ...

B: I mean, the thing that you don't want, like... even if you say that your challenge protocol is resistant enough to attacks that you're okay with, like, full sharing... so, like, in a corporate environment, we're basically fine with disclosing that we have stored a specific blob, but we don't want, like, HR to be able to...
B
You
know
fetch
it's
blogs,
or
vice
versa,
if
they're
uploading
across
different
parts
of
the
repository
different
parts
of
the
the
registry,
so
we
need
some
way
for
them
to
prove
that
they
actually
have
initial
access
to
this
blob.
B
Somehow,
our
initial
possession
of
this
blob
and
that's
what
the
proof
of
data
possession
protocol
is
about-
and
I
think
justin
was
the
one
that
brought
up
a
concern
around
the
proof
of
data
possession
protocol,
and
you
know
whether
we
would
have
a
preference
for
a
proof-
data
possession
protocol
that
requires
a
ad
hoc
generation
of
proof
points
or
just
allows
people
to
store
an
unfinished
digest.
B: So you need to have a Merkle-tree construct for a hash: something like SHA-3, or BLAKE3, or BLAKE2, or similar.
A: ...

B: Yeah, I mean, both proof-of-data-possession protocols are in there. It turns out that, like, cryptography language is very complicated, and it turns out we have one cryptographer at work, so I wouldn't want to bring a data-possession protocol to them that turns out to not be sufficient for most people's security needs.
B
It
doesn't
satisfy
the
trade-off.
So
I
think
what
I
would
like
is
a
an
under
like
to
understand
what
the
community
thinks
in
terms
of
minimal,
viable
security,
and
then
you
know
talk
with
folks
internally
to
kind
of
get
some
formal
language
and
get
a
formal
definition
of
this.
The
spec.
A: ...

B: We don't really have that problem... there's two things. One: our clients don't necessarily know where they got those layers from, because those layers were given out via... I think we're using BuildKit and build, or something like that, and it doesn't keep track of this information. Or people are pulling from Docker Hub and then pushing to our internal registry.
B
So
so
that's
like,
like
let's
say,
mike's
registry
is
on
docker
hub
and
everyone's
building
on
top
of
him,
his
image.
When
people
build
and
push
to
internal
registries,
the
internal
registry
can't
dedupe.
Now,
if
you
relax
the
guarantees
to
say
that,
like
any
repository
can
share
blobs
with
any
repository,
you
can
totally
do
what
you're
saying
we
don't
want
to
completely
throw
security
to
the
wolves.
B
We
want
a
little
bit
of
security
and
being
like
you
know,
you
have
to
prove
that
you
had
mike's
blob
at
one
point
and
you
can't
just
probe
the
registry
for
random
hashes
to
see
if
it
has
that
in
there
or
not
the
specific
risk.
Kind
of.
Is
that,
like,
let's
say
that
there's
a
known
vulnerable
layer,
these
security
people
publishes?
B: ...

A: Yep. So I pull an image from Docker Hub, it has layers A, B, and C, and Mike pulls an image that has layers A, D, and E. The first one of us that pushes to that registry will get layer A in there. And then, let's say Mike goes first and then I push: does layer A get tossed because it got pushed to the registry and it knows it's deduped, or could the client tell the registry: by the way, I have layer... I have digest A, do you know of it?
B: ...

A: I don't have access to yours; you're in a third repo in the same registry. The way we actually do it in Azure is per-registry sharing, and we're discussing doing more granular. But if Mike and I have two different registries in Azure, another ACR, we'll never share layers between each other, right?
B: But in the case that I'm talking about, we have lots of different teams that cannot pull each other's images, right? But we want to be able to dedupe across those, and do this without allowing people to steal other people's layers arbitrarily.
D: ...

A: I guess I'm being... I didn't really think about it. "If I can't pull it, then I shouldn't dedupe it", I guess, was the thought, so I was tying the permission boundary closer. And as long as it's defined, the registry can define what... like, what I guess I'm trying to say is: I know registries try to do different optimizations, and we should allow that.
A: ...

B: In the proposal, that's also talked about: no proof of data possession, and totally relying on the trust boundaries of the registry. That's a doable thing as well, and I think, if that's where we want to start, I am fine writing up the language for that. And then, if we wanted to get into more complicated use cases, like soft multi-tenancy, then there's extensions, and I want to make sure that we leave a specification that's open to such extensions.
G: ...

A: But that's what I'm kind of getting at: to know that the layer digest exists in the registry... if I could have pulled it, does it matter that I found out that it has it? Like, what's the... and I'm just asking; I'm really trying to think: what is the problem if I could pull that digest, versus when you can't?
C: ...

A: ...

B: And the problem kind of comes in any time you need to do an authz evaluation: you're typically scanning through a big list, and you do that in two steps. The first thing you do is determine that this blob exists, and then you look at everyone who can reference it, and you go through everyone who has access to those references and validate this. That becomes, like, a big timing vulnerability, because, at least with something like S3, it's super slow: like, you have seconds between those steps.
B
So
if
you
know
it
takes
zero
seconds,
you
can
say
this
registry
doesn't
have
this
blob
if
it
takes
one.
Second,
you
can
say
this
registry
has
this
blob,
but
at
two
seconds,
if
it
rejects
you,
then
you
know
that
this
registry
has
this
blob
and
you
just
don't,
have
access
to
it
and
in
the
reason
why
you
don't
want
this
beyond
like.
B: ...

C: I guess, here, though: when does a registry actually return this proof of data possession, or this request for it, in the first place?
B
So
when
the
user
initiates
the
upload,
they
would
say
I
am
uploading
blob,
descriptor,
foo
and
the
registry
at
that
point
can
issue
a
challenge.
But
why
would
a
registry
issue
the
challenge
like
it?
Can
you
you
have
to
you?
Have
you're
gonna?
You
have
three
choices,
you
can
say
I
will
never
issue
challenges.
B
I
will
issue
challenges
on
potentials
of
d-dupes
or
I
will
always
issue
challenges
and
just
invalidate
them
from
from
a
security
perspective,
and
this
is
where
I
need
to
go
talk
to
our
like
crypto
people
and
see,
if,
like
those
are
all
viable
options
before
I
like
write
language
up
around
that,
but
I
think
that
if
you
just
say
like,
if
you
just
do
the
check
of
existence
and
don't
provide
a
challenge,
if
it
doesn't
exist,
you
might
be
able
to
get
around
the
timing
problem.
C
Well,
I
guess
yeah.
The
point
is
usually
the
way
we
do.
These
fallbacks
is
yeah.
If
the
registry
is
basically
giving
up
the
information
that
it
has
that
blob
already
by
issuing
the
challenge,
and
if
it
doesn't,
then
it
really
changes
the
protocol
like
if
it
always
issues
the
challenge,
whether
or
not
it
has
the
blob
or
not,
then
that
kind
of
messes
up
the
protocol
from
the
client
perspective,
because
now
they're
always
trying
to
prove
they
have
something
that
the
registry
will
make
them
upload.
The
data
anyways.
B
Yes,
I
guess
you
have
to
do
the
authy
check.
I
guess
let
me
think
about
this
a
little
bit
more
then
and
see
if
there's
a
more
secure
way
to
do
this.
Otherwise
we
we
have
to
yeah
yeah.
A
I
mean
we're
already
tossing
dupes
today
what
I
just
I
just,
never
really
poked
to
figure
out
when
that
happens
like
I
know
it
gets
uploaded
and
then
eventually
we
say
yep
we've
got
it
ready
and
we
toss
it.
What
I
don't
know
if
that's
done
asynchronously
later
on
or
when
that
operation
happens.
So
this
whole
auth
check
is
happening.
It's
just.
I
haven't
really
poked
deep
enough
to
know
where
that
check
happens,
to
know
like
you're
bringing
up
something.
C
Know
from
the
client's
perspective
they
don't,
they
shouldn't
know
like
yeah
there's,
probably
you
could
probably
poke
into
any
registry
and
there's
gonna
be
some
sort
of
timing
attack.
To
tell
like
oh,
I
uploaded
something
that
already
existed
but
based
on
the
response,
but
there
shouldn't
be
anything
in
the
protocol
that
leaks
that
information.
A: I've got a hard stop, so until next... there was one agenda item left; we'll just move it to next week. The template's at the bottom, and we'll pick up next week.