From YouTube: RETMKT Builders - Retrieval Market Indexing
Description
Will talks about content indexers at Retrieval Market Builders Mini-Summit in April 2021.
Rowell just showed us a lot about what we're seeing happen to Lotus as a process, and a lot of what motivates that is that there are new capabilities we want to add around making a Filecoin miner more valuable, in terms of, you know, enabling this retrieval market. One of those is indexing. What we mean when we say indexing is that it's important for a miner to know what content it has and to be able to get that content back out.
So right now, content is stored in Filecoin in pieces and deals. These are, you know, an eight-gig or a 32-gig CAR archive: a bunch of individual CID-addressed data items that we refer to by the hash of the overall thing. What's hard to know is which individual CIDs, which pieces of content, are inside those archives, and how to map an individual piece of content that someone may want later back to which miner has it, and which piece, which archive, that individual piece of content is in.
So this problem of content routing, routing from a CID to the entity that has it, is sort of what IPFS is already solving, right: it is the content routing problem. But one of the things we think is the useful next step to enabling this in the context of the Filecoin network is to have some interaction between an indexer, some external node that is going to help us with content routing, and the miners, the nodes that have the content. We think this extension, where we define an interface for a node with content that accumulates new content and notifies some external party that can then learn what the index and the available content are, is one that both isn't and doesn't need to be specific to Filecoin.
We can make this interface more general, so that existing services that hold a lot of content-addressed data may also fit the same model: in some sort of checkpointed way, they get more content that they want to make available in a content-addressed network setting. That interface probably looks pretty similar to a miner's lifecycle, with content coming and going in batches.
So we've got two sides of this interface and interaction that we need to think about: what we're going to ask for on the miner side, and then what this indexing process is that flows up to clients. The clients we're thinking about here specifically are IPFS clients: how do we make this content available to them, and served more generally in a useful, low-latency manner, enabling the CDN-style use case?
(Oh, that's a little bit small; it really doesn't want to make that sidebar smaller.) So we're imagining that there is some sort of indexing service that lives on a miner, and that as the market completes a deal, that completion is probably the trigger for the indexing service to make the new content available.
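As a sketch of the hook just described, where deal completion triggers the indexing service, the shape might look like the following. This is illustrative, not the actual Lotus API: `Cid`, `CompletedDeal`, and `IndexingService` are invented stand-ins (a real implementation would use `github.com/ipfs/go-cid` and the markets module's own types).

```go
package main

import "fmt"

// Cid is a stand-in for a real content identifier; a plain string
// keeps this sketch self-contained.
type Cid string

// CompletedDeal is a hypothetical summary of a storage deal the
// markets process has just finished.
type CompletedDeal struct {
	PieceCid Cid   // hash of the overall piece (the CAR archive)
	Blocks   []Cid // the individual CIDs stored inside it
}

// IndexingService is the hook the talk describes: deal completion
// triggers making the new content available for indexing.
type IndexingService interface {
	OnDealComplete(d CompletedDeal) error
}

// memIndexer records which piece each CID lives in.
type memIndexer struct {
	byCid map[Cid]Cid // block CID -> piece CID
}

func (m *memIndexer) OnDealComplete(d CompletedDeal) error {
	for _, c := range d.Blocks {
		m.byCid[c] = d.PieceCid
	}
	return nil
}

func main() {
	idx := &memIndexer{byCid: map[Cid]Cid{}}
	idx.OnDealComplete(CompletedDeal{
		PieceCid: "piece-1",
		Blocks:   []Cid{"bafy-a", "bafy-b"},
	})
	fmt.Println(idx.byCid["bafy-a"]) // which piece holds this block
}
```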
What does the indexing service need to do? Well, it needs to generate an index. And what is an index? At some level, it's just the list of all of the CIDs that are in that deal, and we have tools that can go over a CAR archive, which is how a deal is currently stored. We could imagine that with a different proof of replication the storage looks different, but it is still, you know, a bunch of content-addressable pieces referenced by CIDs.
So if an indexer, or the rest of the network, wants to ask "hey, what content do you have?", is the answer really that list of CIDs? The motivating example here is that a lot of the data we have is in UnixFS format, and UnixFS has two properties that are sort of interesting.
One is that if you've got a large file, that file is spread across multiple items, right: individual blocks of data are limited in size, so a large contiguous file, anything more than a few megs, is split into many blocks.
And so right now what we're imagining is: there's probably some baseline of just the undifferentiated list of CIDs. That may turn out to be too big, and so we may ask the miner, who knows this structure or can apply some heuristics, to also be able to provide alternative lists or sub-lists, like a semantic index: the list of files, in case it's UnixFS, or for other data formats that we find to be commonly used where it's valuable to do that, where you do need to know some semantics about the actual data in order to do efficient pruning.
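To make that pruning concrete: if the miner knows the UnixFS structure, it can offer a semantic sub-list of file roots instead of the undifferentiated list of every block. A toy sketch, with all structure and names invented for illustration:

```go
package main

import "fmt"

type Cid string

// file is a hypothetical view of one UnixFS file in a deal: a root
// CID plus the leaf blocks the large file was chunked into.
type file struct {
	Root   Cid
	Blocks []Cid
}

// flatList is the baseline answer: every CID in the deal.
func flatList(files []file) []Cid {
	var out []Cid
	for _, f := range files {
		out = append(out, f.Root)
		out = append(out, f.Blocks...)
	}
	return out
}

// semanticList prunes to just the file roots, which is what a client
// asking "what files do you have?" actually needs.
func semanticList(files []file) []Cid {
	var out []Cid
	for _, f := range files {
		out = append(out, f.Root)
	}
	return out
}

func main() {
	files := []file{
		{Root: "file-1", Blocks: []Cid{"b1", "b2", "b3"}},
		{Root: "file-2", Blocks: []Cid{"b4"}},
	}
	fmt.Println(len(flatList(files)), len(semanticList(files))) // 6 2
}
```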
Perhaps that needs to live on the miner, because it's unclear, given either the CID or the DAG of CIDs, that there's a good balance that can be provided at this interface that lets an indexer node do useful sub-selection on that list alone; it may need to be done someplace that has the data, which would be the miner. Okay, so you've got some service that now has this list of data. How do you get that data to an indexing node?
That node is then going to do the work of generating an overall index and being able to say: "oh, this CID, that's on this miner." Probably what that looks like is two pieces. First, there's a send of an advisory that new data is available: whenever a new deal finishes, or there's a new chunk of data items that have become available on the miner, it'll publish, probably on a gossipsub topic that we agree upon.
There's a message with a new root and a new list of CIDs that have become available, and that will trigger indexers who are watching that miner to then go and pull that set of CIDs from it. We can imagine that pull happening over, you know, an IPFS-like thing, where these are all IPLD data: a new CID gets put out, and the indexers fetch that CID over a connection to the miner, or a connection to this index process that maybe exports itself over IPFS, and get back the list of new CIDs that need to be indexed. And so we started writing up that interface a little bit in terms of how we think about these.
So what is the advertisement that happens over pubsub? You have a CID that points to sort of a manifest of this new list of data that exists, along with the previous advertisement and along with how to connect to you, and then you sign it. This means that miners get one global view of what content they are making available for indexing.
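A rough shape for that signed advertisement, as described: a link to the manifest of new entries, a link to the previous advertisement (forming a chain that gives everyone the same global view), and the provider's addresses, all signed. The field names and the trivial "signature" here are placeholders; a real version would use IPLD encoding and the miner's key.

```go
package main

import (
	"crypto/sha256"
	"fmt"
)

type Cid string

// Advertisement chains each batch of newly available content to the
// previous batch, so indexers can walk the whole history.
type Advertisement struct {
	Entries   Cid      // manifest of the new list of CIDs
	Previous  Cid      // previous advertisement ("" for the first)
	Addresses []string // how to connect to the provider
	Signature [32]byte // placeholder: real ads are signed with the miner key
}

// sign is a stand-in for a real signature: just a hash over the links.
func sign(a *Advertisement) {
	a.Signature = sha256.Sum256([]byte(string(a.Entries) + string(a.Previous)))
}

// chainLength walks Previous links back to the first advertisement.
func chainLength(head Cid, ads map[Cid]Advertisement) int {
	n := 0
	for head != "" {
		n++
		head = ads[head].Previous
	}
	return n
}

func main() {
	ads := map[Cid]Advertisement{}
	a1 := Advertisement{Entries: "batch-1"}
	sign(&a1)
	ads["ad-1"] = a1
	a2 := Advertisement{Entries: "batch-2", Previous: "ad-1"}
	sign(&a2)
	ads["ad-2"] = a2
	fmt.Println(chainLength("ad-2", ads)) // 2
}
```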
That helps us have a consistent view that we are able to enforce, and to make sure that if miners advertise data as available, they have to advertise it as available to everyone. Once we have those semantics, we can ensure the availability of an open marketplace: you don't have, at least through this process, miners that claim they'll only make their data available to their partnering CDN; rather, any indexer node would be able to find miners and see and validate the same index. Okay.
So this hopefully starts to sound like a plausible interface for something we might ask for as a way to now have these lists of what content is available, and we could imagine aggregating that. I've sort of been purposefully vague about what an indexer node actually is doing, and whether an indexer node is even a single logical entity, or whether this is, you know, multiple decentralized ones, or sharded.
Likely, the first version is that we make something logically centralized, wait for it to fall over in terms of its ability to scale, and then figure out how to shard it, as a way to get to an MVP. And you could also imagine that there are multiple of these that different people run, as a secondary thing.
Okay, but then there's also the subsequent question of the second side of this interface. You've got these indexer nodes that now have big indexes of a lot of CIDs that are available within Filecoin, and of which miners and which pieces they reside in, and we still have the problem of: you've got a bunch of clients that want to do that lookup of "I've got a CID and I would like to know who has it." This is a problem that IPFS already solves via an API called content routing, and what that actually looks like right now is the method FindProviders, where you give a CID and you get back a peer's AddrInfo, so you learn which peer has that content. This is the interface that currently is exported by the DHT and by other content routing modules within IPFS.
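The shape of that call can be sketched as below. The real libp2p interface is `routing.ContentRouting` with `FindProvidersAsync` returning a channel of `peer.AddrInfo`; the placeholder types here keep the sketch self-contained.

```go
package main

import "fmt"

type Cid string

// AddrInfo stands in for libp2p's peer.AddrInfo: a peer ID plus the
// multiaddrs it can be reached at.
type AddrInfo struct {
	ID    string
	Addrs []string
}

// ContentRouting mirrors the find-provider half of content routing:
// give a CID, get back who has it.
type ContentRouting interface {
	FindProviders(c Cid) ([]AddrInfo, error)
}

// tableRouter answers from a static table, the way a DHT or an
// indexer node would answer from its records.
type tableRouter struct {
	providers map[Cid][]AddrInfo
}

func (t tableRouter) FindProviders(c Cid) ([]AddrInfo, error) {
	return t.providers[c], nil
}

func main() {
	var r ContentRouting = tableRouter{providers: map[Cid][]AddrInfo{
		"bafy-x": {{ID: "12D3Koo-miner", Addrs: []string{"/ip4/1.2.3.4/tcp/1347"}}},
	}}
	peers, _ := r.FindProviders("bafy-x")
	fmt.Println(peers[0].ID) // 12D3Koo-miner
}
```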
So the relevant thing here is that what we actually probably want to return is not just a peer AddrInfo, but something that's a little bit more extensible, so that we can do things like provide a record that says: well, this miner has this data, and here is some other metadata that we start to really care about, like which piece it's in, and maybe other metadata that's relevant so that the miner can efficiently get it for you. These start to look a lot like the smart records that Petar described to us yesterday, which is what we want to be able to have: the shape that we get back is not simply a peer, but a record.
That record is this sort of extensible view of what we know about the CID, right, because the record that I get back is actually likely to have multiple miners, multiple options of miners that all have that piece. So you've got multiple miner-piece records, and then you've got some extensibility of the protocol, where you're able to say: and you should get to it via this protocol.
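That more extensible record could look like the following sketch: several miner/piece options for one CID, plus metadata naming the protocol the client should use to fetch. Every field name and value here is illustrative, not a defined format.

```go
package main

import "fmt"

type Cid string

// MinerPiece says which miner holds the CID and inside which piece,
// plus metadata that helps the miner serve it efficiently.
type MinerPiece struct {
	Miner    string
	Piece    Cid
	Metadata map[string]string
}

// ProviderRecord is the extensible answer: possibly several miners
// all holding the content, and how a client should retrieve it.
type ProviderRecord struct {
	Target    Cid
	Providers []MinerPiece
	Protocol  string // e.g. a graphsync/filecoin identifier (illustrative)
}

func main() {
	rec := ProviderRecord{
		Target: "bafy-x",
		Providers: []MinerPiece{
			{Miner: "f01234", Piece: "piece-7"},
			{Miner: "f05678", Piece: "piece-9"},
		},
		Protocol: "graphsync-filecoin-v1",
	}
	fmt.Println(len(rec.Providers)) // 2 miners to choose between
}
```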
There's a bunch of interop, sort of path-of-least-resistance stuff, that's still getting figured out in terms of how we enable that actual subsequent fetch. Okay, so an IPFS node gets this record, potentially from an indexer node, and then how does it know how to actually get the data out of a miner? There are a few different ways, and we'll see how hard they end up being relative to each other.
With the current GraphSync / go-data-transfer process that exists in the markets part of Filecoin miners, the process happens in a few steps: first, you do a retrieval deal on chain and ask for that data, and that results in a voucher; you use that voucher to then initiate the go-data-transfer session, over which you transfer the data, or some subset of the data that's stored.
You can pass in an empty voucher here and just go straight ahead to data transfer and see if it works, and for miners that allow free deals that may be sufficient, so you may be able to do this without popping back up. Once it is a paid deal, then you need to, you know, do this Filecoin-specific process next of actually paying for it, or delegating to someone else, like your content provider, the app, the dapp, or something that's going to be willing to pay for it, and that starts to get somewhat specific.
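The step-down just described, trying data transfer with an empty voucher first and only falling back to the paid-deal flow when the miner refuses, can be sketched as control flow. Everything below is a stand-in: real code goes through go-data-transfer and the retrieval market client, and these function names are invented.

```go
package main

import (
	"errors"
	"fmt"
)

type Cid string

var errPaymentRequired = errors.New("payment required")

// transfer stands in for a data-transfer pull with the given voucher;
// freeOK models whether this miner allows free retrievals.
func transfer(c Cid, voucher string, freeOK bool) ([]byte, error) {
	if voucher == "" && !freeOK {
		return nil, errPaymentRequired
	}
	return []byte("data for " + string(c)), nil
}

// makeRetrievalDeal stands in for the on-chain retrieval deal that
// yields a voucher to pay with.
func makeRetrievalDeal(c Cid) string { return "voucher-" + string(c) }

// fetch tries the free path first, then falls back to a paid deal.
func fetch(c Cid, freeOK bool) ([]byte, error) {
	if data, err := transfer(c, "", freeOK); err == nil {
		return data, nil
	}
	return transfer(c, makeRetrievalDeal(c), freeOK)
}

func main() {
	free, _ := fetch("bafy-a", true)
	paid, _ := fetch("bafy-b", false)
	fmt.Println(string(free), "|", string(paid))
}
```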
And so we need to think about how we allow that to get bubbled up in an IPFS client, so that it can delegate out to, you know, a Lotus plug-in that can make the deal, and provide a way to allow for extensible protocol handlers at the IPFS layer.
But if we have some point of extensibility, both in what records look like and also in protocols, where, if it isn't built into IPFS, there's a place to plug in on the client, we think we can get from the record, and from this different interface of finding providers, to getting the data and making it available on gateways, on go-ipfs nodes, on embedded nodes, and to the bunch of people who want CIDs.
The other thing that we're going to have to figure out is how the IPFS node knows to ask an indexer node. If we start with a limited number of these indexer nodes initially, what does that initial arrow look like, where we find the right indexer nodes and make use of them? That also likely goes through a few iterations. We could imagine that you do sort of an initial delegated content routing.
This is something that JavaScript IPFS nodes already do, where they delegate their find-provider requests over to a companion go node that they know, but you could imagine writing that same implementation of content routing in go-ipfs so that, for instance, if there is a known indexer, the gateway could just delegate to an indexer that sits very close to it, so that it's low latency. So that's one option for an initial version. You could also imagine that the thing that happens afterwards, as you begin to decentralize this, is that you have an advertisement of indexers, so indexing services.
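Delegated content routing, as JS-IPFS does today, just means implementing the find-providers call by forwarding the query to a known remote node. A sketch, with the actual RPC to the indexer elided behind a function value (the URL and names are made up):

```go
package main

import "fmt"

type Cid string

type AddrInfo struct {
	ID string
}

// delegatedRouter satisfies find-providers by forwarding the query to
// a nearby indexer instead of walking the DHT itself.
type delegatedRouter struct {
	indexerURL string
	// query stands in for the actual RPC call to the indexer.
	query func(url string, c Cid) []AddrInfo
}

func (d delegatedRouter) FindProviders(c Cid) []AddrInfo {
	return d.query(d.indexerURL, c)
}

func main() {
	r := delegatedRouter{
		indexerURL: "https://indexer.example",
		query: func(url string, c Cid) []AddrInfo {
			// fake indexer: everything is on one miner
			return []AddrInfo{{ID: "f01234"}}
		},
	}
	fmt.Println(r.FindProviders("bafy-x")[0].ID) // f01234
}
```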
They can, you know, announce or provide themselves as service providers of indexing, and then nodes keep track of who they've seen as potential indexers, keep track of how fast and how reliable these indexers are, and use that as a way of prioritizing who they ask for indexing queries. So there is, I think, naturally going to be a market for this indexing: you want an indexer who is close to you, and there's a trade-off within the indexing itself.
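Prioritizing indexers by observed speed and reliability, as described, might be as simple as keeping a score per indexer and asking the best-scoring one first. An illustrative sketch; the scoring formula is an assumption, not anything specified in the talk:

```go
package main

import (
	"fmt"
	"sort"
)

// indexerStats tracks what a node has observed about one indexer.
type indexerStats struct {
	name      string
	latencyMs float64 // smoothed response latency
	successes int
	failures  int
}

// score rewards high reliability and low latency.
func (s indexerStats) score() float64 {
	total := s.successes + s.failures
	if total == 0 {
		return 0
	}
	reliability := float64(s.successes) / float64(total)
	return reliability * 1000 / (s.latencyMs + 1)
}

// rank sorts indexers best-first by score.
func rank(stats []indexerStats) []indexerStats {
	sort.Slice(stats, func(i, j int) bool { return stats[i].score() > stats[j].score() })
	return stats
}

func main() {
	ranked := rank([]indexerStats{
		{name: "far-but-reliable", latencyMs: 200, successes: 99, failures: 1},
		{name: "near-and-reliable", latencyMs: 20, successes: 98, failures: 2},
		{name: "flaky", latencyMs: 15, successes: 50, failures: 50},
	})
	fmt.Println(ranked[0].name) // near-and-reliable
}
```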
Are you going to prune down to a smaller index of head queries, the popular CIDs that are getting asked for a lot, where you can respond to 80% of the queries in a relatively small number of milliseconds? Or are you going to keep the much larger list of all possible CIDs, so that you can answer the longer tail of rarer queries, which will be a more expensive lookup with higher latency?
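That head-versus-tail trade-off can be sketched as a two-tier lookup: a small hot cache of popular CIDs answered immediately, backed by the full, slower index. The structure is illustrative:

```go
package main

import "fmt"

type Cid string

// twoTierIndex answers popular ("head") queries from a small cache
// and falls back to the full index for the long tail.
type twoTierIndex struct {
	hot  map[Cid]string // popular CIDs -> miner, fast path
	full map[Cid]string // everything, modeling the expensive lookup
}

// lookup reports the miner and whether the fast path answered.
func (t twoTierIndex) lookup(c Cid) (miner string, fromHot bool) {
	if m, ok := t.hot[c]; ok {
		return m, true
	}
	return t.full[c], false
}

func main() {
	idx := twoTierIndex{
		hot:  map[Cid]string{"popular": "f01234"},
		full: map[Cid]string{"popular": "f01234", "rare": "f09999"},
	}
	m1, fast1 := idx.lookup("popular")
	m2, fast2 := idx.lookup("rare")
	fmt.Println(m1, fast1, m2, fast2) // f01234 true f09999 false
}
```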
So there's a couple of places of potential differentiation there in indexing. I think we also still need to figure out, you know: is the story going to be that these indexing nodes get remunerated directly in this protocol? There are two places where you could imagine retrieval mining and indexing playing into the overall economic game.
One is that a client could potentially, you know, pay somehow, or, just based on usage, on the number of queries that go into the network, resources could somehow get allocated to the indexing nodes. The other is for deals that are made off of an index pull.
The indexer could get some small commission on those, so that it goes within the existing sort of payment channel mechanism. Basically, there's like a finder's-fee type thing, where the indexer node that did help make that deal happen somehow gets tagged in the record that it returns, so that when the miner does its payment channel settlement, the miner sort of tips the indexer for bringing in the traffic. [Moderator: Will, quick time check, can you wrap it up in one minute?] Yeah.
I think I'm basically done talking about these two protocols. This is, you know, sort of a bit more future-looking, but I think it gives you a sense of what the hopefully first concrete steps look like in terms of getting to something that works, and then a bunch of the things we're imagining. We would certainly like feedback in terms of other constraints we need to be thinking about, and other ways that you all in particular are imagining plugging in, or would have to interact with this, or things that sound unrealistic.
So there is a breakout session in about, I don't know, an hour, something like that, where I'm happy to talk with you all more about the details of what will actually happen here. And with that, I will turn it over to Hannah.