From YouTube: Retrieval Incentives Intro - Marina Kostioutchenko
First, I will give a quick introduction to the problem itself: where we're seeing the drop-offs and pain points, and we'll learn about existing efforts. I'll do my best to cover everything that's going on in the retrieval incentives world, and then Jake will navigate us through the brainstorming. Hopefully, we will come out of today's session with a joint understanding of what the highest-priority items are for us all to work on together in the next six months or so, that is, in the first half of next year.
So when we think about successful retrievals, I would classify the challenges into two big buckets. First is the reliability of retrievals, the technical process of serving them, and in that regard we've made huge progress in the past couple of months. First of all, data-transfer success has more than doubled in the past three months.
So thank you to the Bedrock team for all the great effort they've put in there, and we're continuing to work towards making the failure rate even lower than what we have right now. In particular, there are some UX improvements needed just for getting visibility into Lotus node processes.
But then there is the second bucket where we see challenges, and this is basically storage providers' willingness to serve retrievals. What we see right now, and we'll dive into this a bit more in the next few slides, is that storage providers either choose not to serve retrievals at all, or they limit access to the data, and this is where we want to focus today. On the willingness of storage providers to serve retrievals, we have gathered some data in the past couple of months.
We see that, if you take out everything that's cached in the auto-retrieval bridge, there is a first big drop-off, from 203 million to 12.5 million requests; this is the data that we can find with storage providers. That is kind of expected, but the area we want to focus on is the second drop-off, from 12.5 million down to 475,000. The first drop-off is basically the retrieval request, or any request at all, being unsuccessful. But the interesting part is the drop from 1.77 million to 475,000, where we see errors with retrievals. If you look at the breakdown of all the errors, it turns out that more than 50 percent of these retrieval requests, for public data served through the GraphSync protocol to IPFS nodes, failed due to restrictions introduced by storage providers.
Roughly half of those are rate-limited retrieval requests, storage providers basically saying "we can only serve, say, three requests per day". The other half, roughly, is access-control restrictions: storage providers, usually through the CIDgravity tool, saying that they don't want to serve this content.
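To make the funnel concrete, here is a quick pass-through computation over the numbers above. The stage labels are my own shorthand for the slide's categories, not the slide's exact wording:

```python
# Retrieval-request funnel from the talk (approximate request counts).
funnel = [
    ("requests after cache", 203_000_000),
    ("findable with storage providers", 12_500_000),
    ("retrievals attempted", 1_770_000),
    ("retrievals successful", 475_000),
]

# Print the stage-to-stage pass-through rate for each drop-off.
for (prev_name, prev), (name, count) in zip(funnel, funnel[1:]):
    print(f"{prev_name} -> {name}: {count / prev:.1%} pass through")
```

Even at the final stage, only about a quarter of attempted retrievals succeed, which is why the error breakdown above matters so much.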
So how do we solve this issue? First of all, let's try to understand why storage providers are exhibiting this behavior; given the current state of the world, it's pretty natural. First, there are no rewards for serving retrievals. Currently you don't earn Filecoin, or, frankly speaking, even reputational standing, by serving retrievals, because such systems don't exist yet; nor are you punished for not serving retrievals.
Second, there's opportunity cost: you can use that same capacity to mine Filecoin through CC (committed capacity) sectors or storage deals. There's time and effort, both in setting things up and in debugging, because there are still operational inefficiencies. And finally, there might be some legal liability if you accidentally or unknowingly serve data that's illegal, for example. So storage providers choose to just not do it. Before we start diving into incentive structures, let's take a look at what data is on chain right now. What data do we want to retrieve?
Potentially, I've gathered, or tried to estimate as best we can, a breakdown of the 258 petabytes of data stored on chain. First of all, in terms of who pays for the data being stored: currently it's mostly the Fil+ program.
We see four petabytes that are legacy Slingshot deals, but the majority of the data stored on chain is sponsored by the network, so that's the Fil+ program. In terms of data privacy, this is a highly estimated number, so please don't quote me on it, but if you look at the amount of data being indexed, roughly 30 percent of the data is currently provided to the indexer; maybe it's more, maybe there's some additional public data.
But what we can deduce from some of the programs being run is that about 18 petabytes out of the 258 are Slingshot data; there are 3.5 in another two programs, which are Discover and Evergreen; the aggregators, dag.house and Estuary, only store about one petabyte of data; and the rest, 236, is this bucket of different kinds of Fil+ clients. That's both public and private data, research institutions and enterprises. So that's the best we can tell about data clients.
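These rough per-program estimates should approximately sum back to the on-chain total. A quick check (program names reconstructed from the audio, figures as stated in the talk, all rounded, so the sum is only approximate):

```python
# Estimated breakdown of on-chain data, in petabytes (figures from the talk).
breakdown = {
    "Slingshot": 18,
    "Discover + Evergreen": 3.5,
    "aggregators (dag.house, Estuary)": 1,
    "other Fil+ clients": 236,
}

total_claimed = 258  # petabytes stored on chain, per the talk
total_estimated = sum(breakdown.values())
print(total_estimated)  # 258.5, close to the claimed 258 PB
```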
With that in mind, let's put these clients, or programs, on a two-by-two: the latency requirement on one axis, and how open the data is on the other. The yellow ovals are current use cases; you can see that they're mostly concentrated around cold storage. We have both publicly accessible and access-controlled data, and the publicly accessible, cold quadrant is mostly all the data programs that we have, like Slingshot.
There are some aggregators that are also doing cold storage and store public data. We also see, in green, the emerging use cases, and they're very highly concentrated in publicly accessible data that requires non-archival storage: the IPFS nodes that want to request and retrieve data from storage providers, the emerging Saturn network, and the compute nodes that will serve compute over data. So this is the area we want to focus on today.
So when we break into the workshop and start brainstorming, we want to talk about incentive systems for publicly accessible, non-archival storage. In terms of the systems that we've spoken about, or that some teams may have started working on, here's the lay of the land. We don't assume that there will be one incentive system that fits all the use cases, and that's how it should be. So consider, for example, access-controlled data.
We anticipate it's going to be, and continue to be, legal contracts that incent retrieval: storage providers will serve retrievals because that is what they agreed with the client for data that only they can access. For publicly accessible data in the archival world, we will also see legal contracts, but on top of that there are some reputation systems emerging, mostly through word of mouth right now, so clients would come to our team and ask around.
They would ask team members which storage providers have been trustworthy and good to work with. There is also an emerging set of dashboards, which I'll show you later, that provide visibility into which storage providers are reliable and will serve retrievals. And then there are, of course, enforced requirements: for example, the Slingshot program requires that data is served and retrievals are successful.
With that segment and focus in mind: today we've seen a lot of interesting ideas emerging for how to make retrievals attractive to storage providers. There's reimbursement at the end of the month, for example; I think this is what the Saturn network is planning to do. Other ideas include clients paying per byte, so pay-as-you-use, so to say, and staking at deal time, which is more of an insurance system.
Teams have spoken about prepaid coupons, where, for example, the user of the data is not the person who is interested in the data being retrieved, so there's a separation between payment and usage of the data. So this is the initial set; later today we'll break into groups and brainstorm some more, but this is the area where we want to ideate today. And very quickly, I want to give you an overview of what's going on right now in terms of retrieval incentives. I would categorize the efforts into three different buckets.
The first is generating the metrics themselves: getting some visibility into whether retrievals are even successful or not, plus all the detailed characteristics of retrievals. The second bucket of efforts is surfacing these metrics to users in a way that's useful to them and their goals. And finally, there are the systems that align incentives between data clients and storage providers in a way that makes data easily retrievable.
So in the first bucket, the generating-metrics bucket, there are three projects that were or are ongoing. The most prominent and interesting one is the validation bot. Currently it's built just for the Slingshot program: as I mentioned before, Slingshot requires data to be retrievable, so we want to be able to observe and enforce that rule. But the vision is to grow it further, make the solution open source, and open it up to more use cases than Slingshot. Slingshot used to use Dealbot, which is currently inactive and was semi-manual.
That solution required a lot of manual effort from the team's perspective, but it was sort of a previous version of the validation bot, so to say. And the data that we reviewed earlier is the auto-retrieval bridge data that Bedrock is collecting: these are IPFS gateway requests for content that we were able to find stored with storage providers.
Estuary also ran their own auto-retrieval bridge; based on what I know it's currently paused, but it also exists. So these are the systems that gather metrics.
Now, there are a few dashboards that present these metrics and serve them back to data clients. The first one is a project called Filgram, developed by the Filmine team.
The interesting part about this dashboard is that it allows clients to actually leave qualitative and quantitative reviews of storage providers. So, for example, if I stored data with a storage provider and had a great experience, I can rate them with a five-star rating and write a quick review.
We see some friction with wallet authentication, which is not an easy UX, so currently we don't see a ton of reviews, but the capability is there, and it's a very interesting idea in terms of developing this sort of reputation metric based on Yelp-like reviews. Retrieval metrics are not included yet, but once we have at-scale metrics collection, there's the capability to include those metrics here. Another dashboard that I'm sure a lot of people have seen, because it's one of the oldest, is filrep.io.
Unfortunately, we again don't see high adoption, partially because retrieval metrics are not included, and it's currently not actively supported. But again, there's an API and there's the capability to include more data points here. And finally, there is the Starboard dashboard.
There is also room for including retrieval metrics here; currently they're not being served and the work has been paused, but again, there's the capability to plug retrieval metrics in and make this dashboard a place where clients can go and learn retrieval details about this or that storage provider. And then, finally, there are a lot of exciting projects in the incentive-systems bucket right now. I would categorize the mechanisms as either a carrot or a stick, and I've tried to draw that distinction here.
First is a stick: if you don't serve retrievals, you're excluded from the data programs and excluded from the opportunity to earn Fil+ rewards within the Slingshot program. Then there is one that's more in planning: a project that is hoping to set up a storage DAO around these metrics, which would allow storage providers to earn DAO tokens, for example, for good retrieval behavior. And I say "hoping" because Luca or Irene will cover that today. Oh, you're not going to cover it? All right.
We can share more about this project if you're interested, but at a high level it's basically rewarding storage providers for exhibiting good storage and retrieval behavior. The next one that will be covered, I hope, is retriev.org, and this is more of a stick mechanism, where the provider in this insurance scheme will get slashed if retrievals are not served: you stake some amount of assets at the beginning, and then, based on whether you serve retrievals or not, you either get the stake back or you get slashed.
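The stake-and-slash flow just described can be sketched as follows. This is a minimal toy model; the `RetrievalInsurance` class, the all-or-nothing slashing, and the numbers are my own illustration, not retriev.org's actual protocol or API:

```python
class RetrievalInsurance:
    """Toy model of a stake/slash retrieval-insurance deal."""

    def __init__(self, stake: float):
        self.stake = stake   # assets locked by the provider at deal start
        self.settled = False

    def settle(self, retrieval_served: bool) -> float:
        """Return the amount paid back to the provider.

        If the retrieval was served, the full stake is returned;
        otherwise the stake is slashed (here: lost entirely).
        """
        self.settled = True
        return self.stake if retrieval_served else 0.0

# A provider stakes 10 FIL on two deals; one retrieval succeeds, one fails.
good = RetrievalInsurance(stake=10.0)
bad = RetrievalInsurance(stake=10.0)
print(good.settle(retrieval_served=True))   # 10.0: stake returned
print(bad.settle(retrieval_served=False))   # 0.0: stake slashed
```

In a real system the slash would likely be partial and dispute-mediated, but the incentive shape is the same: not serving retrievals now costs the provider something.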
And finally, we do have one aggregator that is currently using a notion of a replication score to decide who deals will go to, and that's FilSwan, developed by Charles's team. Again, this is more of a carrot: if you serve retrievals, you're rewarded with more deals in the future. And this is an active system that exists already.
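The carrot mechanism here, routing future deals toward providers with good retrieval track records, can be illustrated with a toy allocator. The scoring formula and the provider records below are invented for illustration and are not FilSwan's actual algorithm:

```python
# Hypothetical retrieval track records: (provider, successful, attempted).
records = [
    ("sp-alice", 95, 100),
    ("sp-bob",   10, 100),
    ("sp-carol", 60,  80),
]

def replication_score(successful: int, attempted: int) -> float:
    """Toy score: fraction of retrieval requests served successfully."""
    return successful / attempted if attempted else 0.0

# Rank providers; new deals go to the best retrievers first.
ranked = sorted(records, key=lambda r: replication_score(r[1], r[2]),
                reverse=True)
print([name for name, *_ in ranked])  # ['sp-alice', 'sp-carol', 'sp-bob']
```

The design point is that the reward is future business rather than a direct payment, so no new token or escrow mechanism is needed.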
And Jake will help us go through the phases today; I'll let him cover the plan.