From YouTube: CI WG demo Enhanced Robust Persistent Identification of Data (ERPID) & FAIR Digital Object Framework
Description
Persistent identifiers are commonly used for long-term identification of publications (DOIs), published data sets (DataCite), and even people (ORCIDs). However, PIDs could have more utility throughout the data lifecycle. ERPID and the FDOF are looking at ways to track workflow and provenance information with PIDs that could enable universal data interoperability and full reproducibility of computational workflows.
Date: 04/03/20
Presenter: Rob Quick
Institution: Indiana University
Midwest Big Data Innovation Hub
F
Hi, Christine Kirkpatrick, a proud co-chair of this working group and deputy director at the West Big Data Hub, also at the San Diego Supercomputer Center, where I work on other fun projects that came out of this working group with John and Melissa. Fun to see all of you, and I'm glad we can still be connected at this strange time, struggling to be a contributing member of society — and very glad it's Friday.
H
That's me — I was not able to unmute. My name is Ge Peng, and I'm a researcher at NC State University. I'm always interested in data quality and the FAIR digital object framework. Rob had put in a note about this particular working group meeting a couple of days ago at a FAIR seminar series, so I'm just kind of listening to see what's going on, and it sounds like a very exciting working group with very different stakeholders.
D
Sure, that sounds great — thank you. So one of the big things we've been working on: a week ago tonight, NSF reached out and said that they would like the hubs to work on developing an NSF RAPID COVID Commons. This would be an information source highlighting the NSF RAPID grants, which are fast-moving, as the name suggests. There are currently 22 NSF RAPID grants that have been announced regarding COVID, and they're planning on more, including in the Convergence Accelerators.
D
So we've been working with all the hubs — we're very grateful that we're such a solid community — to put together a proposal to develop this NSF COVID Commons, starting with the RAPID grants that are announced and then looking at connections to the Open Knowledge Network, the Convergence Accelerators, and maybe some other analytics and tools. So that's what's been keeping us super busy all this weekend — it's making me dizzy — and the beautiful thing is we have letters of collaboration from all four hubs, from OSN, and John —
D
Thank you for helping on that as well — and Christine as well, and everybody, and Renata and the whole family. So that's what we're up to; we're hoping to get that submitted today or Monday. We're still swinging for a couple of things — a couple of facilities statements here and there — and sizing what we think is going to be needed for this. But it's very exciting, and a great opportunity for us to work together in a unique and valuable way, which we've been looking for in this COVID area ever since it started.
D
I'm sure someone will talk about the All-Hubs Summit that was moved from May to October; as a concept, now we're trying to figure out the right venue, the right way to do that. You know, I was going to go on vacation next month to South Carolina, and the people with the condo said, here's the letter from the governor saying that if you come here you have to quarantine for two weeks — so you can't go inside the condo. I'm like, well, that doesn't seem like it makes a lot of sense.
B
I'm happy to give an update on the South Hub — what Florence was talking about was very salient; that's just this week. We've also walked a long road for the last year now working with NSF Harnessing the Data Revolution: the projects that are funded through that Big Idea, and all of the PIs involved in those, looking forward to a development meeting.
B
So we're switching that meeting very rapidly. We did a survey of the community, as all of the hubs have, about a virtual meeting. We were going to have a three-day, in-person meeting that was going to be partially a PI meeting and partially a forward-looking meeting, with recommendations around collaboration and coordination in the data science space. Because of everything that's happening in the world — and it is happening this month — we've switched it to a fully virtual meeting.
B
We're going to be running it with the same collaborative feel, but with the three days being fully online. So that's going to be an interesting update on how the community may synergize, and this could play into some of the work that's going into COVID down the line as well.
A
Actually, we were working on rescheduling a meeting earlier this week — it seems to be going around — and we realized, oh man, if everybody doesn't have to be in the same place on the same day, we don't actually have to have the meeting all on the same day. So we're going to work on spreading it out; that's the change we're making.
F
I could do that, and then maybe Melissa won't mind — I'll do this, and she could do OSN later. That's great. So for the West Big Data Hub, Meredith's put together a list of just the favorite resources for people, at westbigdatahub.org — hope I got that right — so you can go there and you can see things like — oh shoot, I'm going to get all the nouns wrong — the US Digital Response.
F
We put this up real quick; it's a clearinghouse for if you need volunteers — yes, the US Digital Response, pardon my mangling the name. I'm just going to paste this one into the chat for people who haven't seen it. We've been thinking as a hub: what can we do in this time? It's probably not to spin up new efforts, but to highlight, and to try to sift through, everything that is bombarding our community — everyone's an expert, everyone's doing stuff.
F
So we've been very involved with GO FAIR — we host the US coordination office — and starting this Wednesday, with our first Zoom-bombing experience for many of us, we launched our four-part webinar series. We actually only started 10 minutes late; we got into a new room and it all worked out. You get the introduction to the Virus Outbreak Data Network this coming Wednesday at 9 a.m. Pacific, 12:00 Eastern. We have Dr. Mirjam van Reisen —
F
I hope I said that right — who will be presenting on how they're building training capacity in Africa, especially as they try to gear up and confront COVID-19 in a much different and under-resourced situation — even more under-resourced than what we're experiencing, of course — and trying to make sure that the data is born FAIR, so that it can be quickly aggregated and insights mined from it.
F
For part four we're going to have three different people — including Microsoft, Natalie Meyers at Notre Dame, and then a couple of people here at the Supercomputer Center — looking at various ways that they're mining heterogeneous data: so not just from journals, but also geospatial and all kinds of omics data, and doing things like building knowledge graphs or even word clouds to look at trends in the response. So we're pretty excited about it. I'm also part of the Virus Outbreak Data Network, and, along with Rob, part of the RDA COVID-19 working group, which is immense.
F
We have, I think, 300 experts signed up at the moment. It's spawned off five different working groups, from omics to social science to community outreach and some other things I can't think of at the moment. Rob and I, along with one or two other people, are helping the co-chairs navigate the very aggressive timeline that the European Commission has given for some policy input they need. And then, last but not least, Florence mentioned the RAPID that she's working on.
F
We also have gotten an inquiry from NSF asking: are there federal data sets out there that would help researchers if they were open? As we find them, if we could let NSF know, they can do what they can to try to help facilitate all of the data that researchers need for modeling what is happening, or for doing the research — but I don't want to take too much time, then.
D
You can click through our homepage to our COVID site, and on there there's an event, as an example, being led by Bari — I'm Italian, so I like to say it like that, but it's B-A-R-I — and they're having a "what are the other effects from COVID" webinar on April 24th. Now that this stuff is virtual, anyone can participate — I asked them, is it okay if anyone participates? They said yeah — so I think this is an opportunity for us, as all the hubs working together, to highlight these opportunities with each other.
C
There we go — sorry, still having — it's okay, I'm getting the hang of this spacebar thing. Yeah, so the OSN: we're having great success with moving toward automated — or centralized, and some automated — management of the hardware, and managing the software: when we have updates to the software, pushing those things out. Really great headway. We're also working on our Trusted CI assessment for cybersecurity, and we've got a number of new use cases that we're onboarding as well. The next big push, in addition — so, the leadership team met this week.
C
I can come back and do a presentation for this group on our use cases, because one of the efforts we're doing for late summer — and it's really necessary — is getting some more information out about the use and users of the OSN, and actually being able to articulate them as case studies, so people can start to say, hey, we know of a project that really could use the OSN for X purpose. So again, a handful of new use cases.
C
We've got a whole bunch now, with allocations anywhere from ten to a hundred terabytes and larger. We actually have a group now working on spinning up a petabyte worth of data. That's going to take some time, because they're going to get funding to recruit — to pay for — a whole new pod for that. So it is exciting times: we're open and ready for business, we're looking for new users, and, in fact, Tyler —
A
This morning, one of the neat things he brought up was that, like any project, you have a set of expectations about how people are going to use what you've got, and we're now in that period where people are coming to us and saying, "we want to do this," and it's like, whoa, we never thought of that. The user community is inventing new uses as we go, which is kind of fun.
H
This is Peng — can I make one point?
A
Certainly.
H
I'm still closely connected with our data centers, so we've been actually looking at how users can use our data if they need to — especially meteorological data, climate data. If they find the data are hard to get or hard to use, we'd like to hear about it. So along those lines, I was just wondering whether you are aware of situations like that and would like to contact me; I will be more than happy to forward the request to the data center management.
H
No — I work at the NOAA data center. I do some research on data stewardship and FAIR data, things like that, but currently the data center management actually are looking for feedback in terms of, you know, if and how users — especially with COVID-19 — can use our data if they need to: whether they have difficulty finding it, or don't understand how to use it or how to integrate it into their systems.
A
Hearing none, I'll do a quick introduction. Rob Quick is the Associate Director of the Cyberinfrastructure Integration Research Center at Indiana University. He can talk a little bit more about who he is and what he does, and he is going to talk to us about persistent identifiers. You've seen the abstract, so I'm going to just turn it over to Rob and let him do a better job of introducing himself.
E
Very good — and you can hear me okay? Yes? Okay, very good. So thank you for the introduction. I will talk a little bit more about my position and what I'm doing, but I wanted to start with this short story and picture — and this is a real picture of me and my daughter. As I was preparing slides for this presentation yesterday, she asked me what I was doing.
E
But really, I think this is the reality for a lot of people now: they're working from home and have various different interruptions. So please do excuse me if you hear some laughter in the background or such, because I am working from home, as many of us are, and have interruptions occasionally. If there's an interruption, I'll just say "science" and stop and go play. Ooh — but how do I forward this next slide?
E
There we go. I'm going to talk a lot about persistent identifiers. One form of persistent identifier that everybody's familiar with at this point is the DOI. "PID" is just kind of a general description, a DOI being one form of PID with its own specific standards and metadata behind it. The PIDs that I use are going to be very generalized — just persistent identifiers.
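The DOI-as-one-kind-of-PID relationship can be seen directly: every DOI is a handle, and the public handle proxy exposes a documented JSON REST endpoint that resolves a handle to its typed values. A minimal sketch — the `hdl.handle.net` API endpoint is real; the sample record below just mirrors the documented response shape for illustration:

```python
import json
import urllib.request

HANDLE_API = "https://hdl.handle.net/api/handles/"

def pid_url(handle: str) -> str:
    """Build the handle proxy's JSON REST URL for a handle or DOI."""
    return HANDLE_API + handle

def extract_values(record: dict) -> dict:
    """Flatten a handle record's typed values into {type: data} pairs."""
    return {v["type"]: v["data"]["value"] for v in record.get("values", [])}

def resolve_pid(handle: str) -> dict:
    """Resolve a handle/DOI over the network (requires connectivity)."""
    with urllib.request.urlopen(pid_url(handle)) as resp:
        return extract_values(json.load(resp))

# Offline illustration using the documented response shape:
sample = {"responseCode": 1, "handle": "10.1000/1",
          "values": [{"index": 1, "type": "URL",
                      "data": {"format": "string",
                               "value": "http://www.doi.org/"}}]}
print(extract_values(sample))  # {'URL': 'http://www.doi.org/'}
```

The same call works for any handle-based PID, which is exactly the "generalized PID" point: the resolution machinery does not care whether the identifier is a DOI or some other handle.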
E
So when I first started with RDA — it was back, oh, it was in Amsterdam, and I think it was P4, the fourth plenary; it may have been a bit earlier — somebody said in one of the presentations, "Data, data everywhere, nor any drop to drink," and everybody recognizes that from Samuel Coleridge's Rime of the Ancient Mariner, with the insertion of data in place of water. Now, the question we've been asking ourselves in the RAPID project — RAPID again being Robust Persistent Identification of Data — is not only —
E
So let me say first of all that there are only 15 slides here, and it looks like about half that many names contributed to this presentation, so you can see what a collaboration this is. They all have contributed both intellectually and with actual slides, so they deserve as much of the credit for this work as I do myself. I am at Indiana University; I am the Associate Director of the Cyberinfrastructure Integration Research Center.
E
This was previously known as the Science Gateways Research Center; we've really changed our view to look at all of cyberinfrastructure, and at integrating the cyberinfrastructure that's available to researchers into a usable format — something people have heard many times now — so that scientists can do their research and not worry about the technology, not become IT experts along the way. I'm also the principal investigator for the NSF project called Enhanced Robust Persistent Identification of Data.
E
— universal interoperability, I mean. I'm also with XSEDE; I run the Extended Collaborative Support Services science gateways portion, so I have a connection to the cyberinfrastructure there. On the RDA side, I'm part of the Technical Advisory Board, along with Christine, and I am the co-chair of the RDA Data Fabric Interest Group, which is really where a lot of these ideas were fleshed out. In fact, the Data Fabric Interest Group had all these puzzle pieces, and we said, well —
E
Can we make this a real fabric that is then useful for the community? So I'm going to start really big here and make a suggestion: that there are three main eras of IT, and that we are in the middle of the second era and moving towards the third. That first era was basically from the invention of computing and transistors to about 1995, and in this era there were really many computers and many data sets.
E
Occasionally, a single computer was connected to a single data set, usually via a mounted drive, but for the most part all computers operated heterogeneously, and all data sets the same way. Of course, with 1995 and the proliferation of the internet, we went into this new era where there was now a single computer and many data sets — the "single computer" being that all computers could talk to each other —
E
— with a single communication protocol. And you may recall Sun had a marketing slogan: "The Network is the Computer." So in this era, from 1995 until sometime, hopefully, in the near future, datasets are still heterogeneous, but really there's a homogeneous computing structure. Now you can probably guess, from era one and era two, what the third era may be — and again, 2025 is just a projection; I think it may be —
E
There will be schemes for it sooner; whether they'll be widely accepted — that might come after. It is one single computer and one single data set, and by that I mean there will be interoperability of all heterogeneous data, meaning that you can interact with all heterogeneous data the same way. This is actually from a small white paper by George Strawn. For those of you who don't know George, make it a point to meet him: he's with the National Academy of Sciences and just a wonderful person to talk to.
E
He was part of NSFNET and really the formation of that initial networking technology that came to be the internet over time, and I hope that everybody here gets a chance to meet George. We have kind of a motivation for making all that data into a single heterogeneous data set, and that motivation is really the Internet of Things and machine learning. The Internet of Things has made sensors inexpensive, so you now have all this data; machine learning requires — will require — a better data infrastructure.
E
— if we're really going to see how far we can go, where we can push these new machine learning techniques. And even when data remains in silos, global FAIR data infrastructures could automate that data-wrangling step, which, as anyone who has spent time working through the pre-processing leading up to an analysis knows, is the majority of a data scientist's time. Whether this 80% figure is realistic or not, I'm not sure.
E
But that's the estimate in some areas. And we really see this emergence of open science — an emergence that at least some advocates are saying will have the same impact as, and will rival, the original scientific revolution. Making science open and available to all can really have a massive impact, as everyone starts using the data available and, in fact, the resources and instruments available.
E
Also, it's easy to see that if we have those three eras, and the data infrastructure coalesces as the internet and the web did, that data infrastructure will be revolutionary. Interestingly enough, the two capitalized words here, Internet and Web, coalesced around two specific things, and those were protocols: the internet around TCP/IP, which basically meant all devices could talk to all other devices without having any specialized software — all you have to understand is TCP/IP — and the web around HTTP —
E
— the protocol that allowed every web browser to understand bits in a certain sequence and to make them into something readable by a human. So data sharing and interoperability has kind of a long history, and this history — and why we're moving towards FAIR and open data — combines several things. One is the technology advances: we've gone from thousands of transistors on a chip in the 1970s to now a billion transistors on a chip, and the networking technology — fiber-optic and laser communications — has gone from —
E
— well, I wrote down megabits as the first figure, but I remember having a 300-baud modem, which means hundreds of bits per second, to now experimental petabit-per-second networks. Disk prices have dropped tremendously: it was half a million dollars for a gigabyte of data in 1981, and now it's about three cents per gigabyte, as you can get a four-terabyte drive for about $100. These great performance increases have enabled data-intensive science, and things like machine learning and its complex algorithms are now realistic where they weren't before.
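The storage-price figures quoted here imply a drop of more than seven orders of magnitude; a quick arithmetic check of the numbers as stated in the talk:

```python
# Storage cost per gigabyte, using the figures quoted in the talk
price_1981 = 500_000.00   # ~half a million dollars per GB in 1981
price_now = 0.03          # ~three cents per GB today

drop_factor = price_1981 / price_now
print(f"price per GB fell by a factor of about {drop_factor:,.0f}")
# roughly 17 million-fold

# Sanity check against the $100 four-terabyte drive figure
per_gb = 100 / 4000       # 4 TB is ~4000 GB
print(f"$100 / 4 TB = ${per_gb:.3f} per GB")  # $0.025, close to three cents
```

The two figures in the talk are consistent with each other: $100 for 4 TB works out to 2.5 cents per gigabyte.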
E
At the same time, society and government have been moving forward. In 2011, a US federal interagency committee said, basically, that we can now store more data than we can effectively process. In 2013, the US presidential science adviser signed an executive order requiring that all federally funded research be openly available — leading, again, to open science. Then, in January 2014, there was a workshop at Leiden University.
E
That was led by Professor Barend Mons, who was actually part of the VODAN meeting earlier this week, and the result of that meeting was really the definition of FAIR data. I think everybody in this room probably understands what FAIR data is, but just to rehash — and I actually have some details on the next slide — that's Findable, Accessible, Interoperable, and Reusable. And then this one is a little less well known: there was a National Academy of Sciences paper last year —
E
— it was in 2018, so I guess two years ago now — that included a recommendation that all research products be made available according to FAIR principles. So we're really pushing along the lines of this open science, with these FAIR principles as a way to allow open science to happen. And how do you reach interoperability?
E
Interoperability has for a long time been a tool used by computer scientists to create new levels of abstraction. You have high-level languages and interpreters that solve the interoperability problem for heterogeneous computers; you have the internet, which solves the interoperability problem for heterogeneous networks. And the question is: can this digital object architecture, which is at the base of these persistent identifier schemes —
E
— can that solve the interoperability problem for heterogeneous data? And just to reiterate the FAIR principles: what I've done on this slide — in fact, I did this with Luiz Bonino from Leiden — is separate out what's in green in the FAIR principles; these are the FAIR principles word by word. What's in green is what can be accomplished purely at a technical level, and then, in black —
E
— what needs the community to provide some solutions, though the technology can still aid in those things. You'll see things here like the first one: "metadata is assigned a globally unique identifier." We can do that with technology; that technology has existed for a long time. The persistence part, though, is not a technical solution: the persistence part requires a community to say that they will house these registries of persistent identifiers for long periods of time.
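The green/black split can be made concrete. "Globally unique" is the technically solved half — for example, a random UUID needs no central coordination to avoid collisions — while "persistent" is the community half: nothing technical keeps the identifier-to-object mapping alive over decades. A sketch (the registry entry and URL below are hypothetical, purely for illustration):

```python
import uuid

# "Globally unique" is solvable with technology alone: a version-4 UUID's
# collision probability is negligible, so no central authority is needed.
metadata_id = uuid.uuid4()
print(f"assigned identifier: {metadata_id}")

# "Persistent" is not a technical property. This mapping only survives if
# some community commits to hosting the registry long-term; the entry
# below is a hypothetical example, not a real record.
registry = {str(metadata_id): "https://example.org/dataset/42"}
print(registry[str(metadata_id)])
```

The same division runs through the rest of the FAIR principles: identifiers, protocols, and metadata schemas are technical; longevity, governance, and vocabulary agreement are social.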
E
I'm aware of Named Data Networking, and I did a compare-and-contrast for the ERPID proposal without digging into it too far. The one thing that is the major difference is that Named Data Networking actually changes the base-level networking protocols, right? So you need a change in the networking. What the digital object architecture does is operate within the —
E
— within the existing networking framework. Now, I would say that the goal is the same for many projects — that all water be drinkable — and it's more likely going to be a combination of good ideas from many of them. I can't say enough — I don't know enough about Named Data Networking to know what part it plays, or whether it solves some of these issues.
E
However, I think that some convergence of several of these technologies is going to be the right answer, and we know what the solution is: the solution is that all water is drinkable. How we get there — I think the devil is in the details, right? Thank you. What I know of Named Data Networking is that it's very powerful; the downside is that it takes changes at the networking layer, and whether that is palatable to the community remains to be seen.
D
So, just a little more I want to share on that. I used to work with Christos Papadopoulos on this a little bit when I was at Internet2 — and now he's at DHS, I think; he was in Colorado. The interesting thing, when we did a meeting at NIST about it, was that they had some posters with tanks on them, and of course, when you ask what the use case is, they say, oh, you know, tactical.
D
You know, they can't tell you anything, but I think what they're looking to do is see if it actually provides a more secure, more valuable data networking and finding opportunity in an environment that they can manage — because, you know, it's the military. So there may be some interesting things coming out of that.
D
So I just wanted to share that; we'll see what comes out in public, but it might provide some interesting new opportunities. For those who don't know NDN, I think the idea is that you put the identifiers on the front of the data packet, so you look for the data packet rather than, you know, the IP addresses, and, as it's being shared, you have to change how you're doing networking. But it's interesting, so it may be a future innovation that could be valuable, yeah.
E
— the network is going to need to be involved; it's going to need to be aware of what's happening with the data, if that change takes. So again, I won't go too in depth, because what I know of Named Data Networking is not enough to say too much, but I do think that the network is going to have to be aware of what's happening with data and be part of the solution.
H
I have a comment. So you mentioned a number of times that all waters are drinkable. I'm wondering — because there's always a balance between resources and results — whether we should make all water available, but only some water drinkable: some water maybe we don't need to drink, we just need it to flush the toilet. Do you think of it like that?
E
Sure. In making all water drinkable — and another important thing here — it doesn't mean that anyone can drink any water, right? I think it's here, in point A1.2: there needs to be authentication and authorization also, meaning, basically, that you still have to live behind that authentication and authorization scheme, because there's a difference between open data and data that anyone can use at any time without some authorization and authentication. Did that answer your question?
H
In a way. I guess I'm wondering whether, philosophically, it's too much of a goal to require all data to be interoperable at the same level. Does that make sense to you?
E
Oh yes, but I look at this with, say, my idea of what happened with the internet, right? The power of the Internet and its networking protocol, TCP/IP, is that all devices connect the same way. The network doesn't need to know whether you're on a cell phone or a laptop or an Internet-enabled refrigerator; the network is the same.
E
Yes, so there's a lot to be done, both on the technical side and the community side, to get to that stage, and some groups will come along faster than others. You can see, again, when we went to that one-computer model: as the internet developed, it became too expensive not to be a part of it, right?
E
Okay, so I've gone on here already 25 minutes, I see, and I'm only about halfway through my slides, so I'll try to go through this quickly. The good thing is that the people who are really thinking about this — the brains behind this — are the people who were involved with the internet. I mentioned George Strawn, but also Robert Kahn, who is the executive director of CNRI, which created the Handle System. The Handle System is what DOIs use as their technical underpinning.
E
PIDs — persistent identifiers — point to digital objects and provide a kernel of metadata: some state data about the object. This can be resolved very quickly: you get information, then the kernel of metadata, and that kernel metadata is just enough metadata to do some basic operations. Much like HTTP enables only a few very basic operations — things like GET, HEAD, POST, and DELETE — there's a kernel of metadata that allows you to interact with, and learn more about, each of the digital objects.
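The "kernel of metadata" idea can be sketched as a small typed record that a client inspects before deciding whether to fetch the object at all. The field names below are illustrative — loosely in the spirit of the RDA PID Kernel Information work mentioned later, not its exact schema — and the handle and URL are hypothetical:

```python
# A hypothetical kernel-of-metadata record for one digital object: just
# enough typed state data for a machine to decide what to do next.
kernel = {
    "pid": "21.T11998/0000-001A-3905-F",           # illustrative handle
    "digitalObjectType": "text/csv",
    "dateCreated": "2020-04-03T17:00:00",          # typed, parseable date
    "location": "https://repo.example.org/obj/42", # hypothetical endpoint
}

def is_actionable(record: dict) -> bool:
    """A client can act on the object only if the kernel fields it
    relies on are all present in the record."""
    required = {"pid", "digitalObjectType", "dateCreated", "location"}
    return required.issubset(record)

print(is_actionable(kernel))  # True
print(is_actionable({"pid": "21.T11998/xyz"}))  # False: kernel too thin
```

The point mirrors the HTTP analogy in the talk: a tiny fixed vocabulary of state data is what makes generic, object-agnostic operations possible.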
E
A digital object can be anything. We have the Digital Object Interface Protocol, which plays the same role as HTTP does for webpages: it facilitates those operations. And DOs can be anything that you represent digitally: as people know, there are PIDs for publications; DataCite does data now; ORCID puts PIDs on people, basically. So anything that has a digital representation can have a PID. The PIDs involved in our testbed —
E
— don't have to come from the testbed itself; they can be DOIs, or PIDs issued by other agencies, as long as they're persistent, globally unique, and resolvable. So I think that covers that. One thing to say about the digital object architecture here: it's a services-based infrastructure only — basically the technical components. It doesn't say anything about the modeling of the data objects themselves; for that we look to things like the FAIR principles and the RDA working group on PID Kernel Information.
E
These relationships are defined outside of the technical realm. You can say that "location" represents the location of the digital object, for instance, as part of the PID, but to say what other objects or what other metadata is available — that's left up to things like the FAIR principles and the PID Kernel Information.
E
Now, the ERPID testbed, which is entering its second year of production, has the basic components, and really what we did was take puzzle pieces from various different organizations and stitch them together. It works with PIDs — a handle service. A handle service, as I said earlier, is the same handle-based software that issues DOIs; the DOI is the big one, but there are a few others that are used outside of academia. It's the same software; the handle server is exactly the same.
E
We had a data type registry. Data typing is important because, if you're having this machine-to-machine communication, you have to know the form of the data if you're going to act on it. So, for example, if you have a "created on" date in the state data of the PID, you need to know that it's coming in some standard form that the machine can then parse and act on. So the data type registry is where you register the types of the state data that you're going to get from resolving the PID.
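The "created on" example can be made concrete with a toy registry. In a real data type registry the types themselves get PIDs; here the type identifiers and entries are purely illustrative, just to show why registered types make kernel values machine-actionable:

```python
from datetime import datetime

# A toy data type registry: maps a type identifier to a parser.
# Real registries assign PIDs to the types themselves; these two
# entries are illustrative only.
TYPE_REGISTRY = {
    "date-iso8601": datetime.fromisoformat,
    "integer": int,
}

def parse_typed(value: str, type_id: str):
    """Parse a state-data value according to its registered type, so a
    machine can act on it instead of guessing its form."""
    try:
        parser = TYPE_REGISTRY[type_id]
    except KeyError:
        raise ValueError(f"unregistered data type: {type_id}")
    return parser(value)

created = parse_typed("2020-04-03T17:00:00", "date-iso8601")
print(created.year)  # 2020 -- now comparable, sortable, filterable
```

Without the registered type, "2020-04-03T17:00:00" is just a string; with it, any client that can resolve the type knows how to act on the value.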
E
This helps you make decisions, then, as to whether that is useful. So the components of RAPID are really: a handle service for issuing and resolving persistent identifiers; a data type registry for recording what those data types are, for machine actionability; and a mapping service we're working on that prevents the refactoring of repositories. We understand that we're dead in the water if we go in and say, okay, you've got to change your repository schema to do this, that, and the other thing before it can be part of this.
E
Before it can be part of this structure, we really have to have something that maps an existing repository to the digital object architecture, and we're working on that with DSpace — that's the one we're working on right now. And then there's the operations protocol, which is really the hinge of it all — what I call the Digital Object Interface Protocol — and this allows both basic and extended operations.
E
The
basic
operations
are
much
like
HTTP
they're,
a
in
probably
the
one
that's
most
used
will
be
get
because
you
you
find
out
some
metadata
about
an
object.
You
want
to
bring
that
object
to
your
computing
source
or
you
know
your
screen,
so
you
can
review
it
whatever
it
may
be,
but
there
are
the
basic
crud
operations
which
we've
enabled
in
a
rapid
Craig
to
the
create,
create,
update,
delete
and
I.
Always
a
read
I
always
forget
are
so
yeah,
so
so,
basically,
this
are
two
services
that
would
be
globally
network.
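The basic operations just described can be sketched as a toy in-memory stand-in for a DOIP-style service offering create, read, update, and delete on digital objects. The prefix is hypothetical; a real service would mint handles and speak the digital object interface protocol over the network.

```python
import itertools

class DigitalObjectService:
    """Toy in-memory sketch of a service exposing the basic CRUD
    operations on digital objects identified by PIDs."""

    def __init__(self, prefix: str):
        self.prefix = prefix
        self._counter = itertools.count(1)
        self._objects = {}

    def create(self, metadata: dict) -> str:
        # Mint a new PID under our prefix and store the object record.
        pid = "{}/{}".format(self.prefix, next(self._counter))
        self._objects[pid] = dict(metadata)
        return pid

    def read(self, pid: str) -> dict:
        return dict(self._objects[pid])

    def update(self, pid: str, metadata: dict) -> None:
        self._objects[pid].update(metadata)

    def delete(self, pid: str) -> None:
        del self._objects[pid]

service = DigitalObjectService("20.500.12345")  # hypothetical prefix
pid = service.create({"name": "raw-data.csv"})
service.update(pid, {"checksum": "abc123"})
```

The point of keeping the operation set this small is exactly the HTTP analogy: many independent services can implement the same few verbs, and any client can talk to all of them.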
E
There would be many of these, not just the E-RPID service, which is a testbed running on two servers sitting in Indiana. It would be a networked, or federated, series of these handle and data type services, along with a protocol that functions with them. And let me talk about our use cases: we have a wide variety of them. They include some weather modeling, some climate surveys in the Taiwan area, and some genomics projects; we actually had Jetstream as part of it, with its virtual machine images, and we had Galaxy as part of the project, where we were looking at their workflows. But the one I want to center on and just mention here is the science and engineering applications gateway, or SEAGrid. If you go to erpid.seagrid.org you'll find our testbed enabled in a science gateway.
E
What it does, basically, is assign PIDs for every aspect of the workflow. So each input, the raw data, is assigned a PID. At this point it's assigned when the workflow has started; however, you could imagine a PID being assigned to data as it comes out of the instrumentation. There's some data preparation software, or some pre-processing software, that is also assigned a PID; the intermediate data products, again, get PIDs; and on down the line, until you have your output and visualization, which are also assigned PIDs. This, in the end, could all be wrapped in a DOI, which could then be published externally. You wouldn't want to use DOIs at every step of the way, because there are some requirements there that get pretty heavy. And you can imagine following a series of PIDs, and that allowing you to totally reproduce an experiment; in fact, even the resource that it runs on can be given a PID.
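The workflow chain just described can be sketched as a small provenance graph: every stage gets its own PID, linked to the stages it was derived from. All PID values here are hypothetical.

```python
# Each stage of the workflow (raw data, pre-processing software,
# intermediate products, outputs) gets its own PID and records which
# PIDs it was derived from.
workflow = {}

def add_stage(pid, role, derived_from=()):
    workflow[pid] = {"role": role, "derivedFrom": list(derived_from)}

add_stage("hdl:20.500.12345/raw-1", "raw data")
add_stage("hdl:20.500.12345/prep-1", "pre-processing software")
add_stage("hdl:20.500.12345/inter-1", "intermediate data",
          ["hdl:20.500.12345/raw-1", "hdl:20.500.12345/prep-1"])
add_stage("hdl:20.500.12345/out-1", "output and visualization",
          ["hdl:20.500.12345/inter-1"])

def lineage(pid):
    # Walk the derivedFrom links back to the raw inputs: this chain of
    # PIDs is the path to reproducing the computation.
    found = []
    for parent in workflow[pid]["derivedFrom"]:
        found.extend(lineage(parent))
        found.append(parent)
    return found
```

Following the lineage of the final output recovers every input, tool, and intermediate product, which is what makes full reproduction of the run possible.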
E
This is all behind the scenes for the first run, so the researcher coming in for the first time doesn't even realize this is happening. Where this becomes important is when they publish: they have a path to reproducibility, to show basically the entire computational workflow and how they got the results. And I know I'm running long on time, but I did want to also talk about the FAIR digital object framework. This is what several people in the US and Europe are working towards.
E
There's a FAIR digital object working group; I have the link on my final slide here. But this combines the digital object architecture, which is over here on the left, even closer with FAIR interoperability. It pulls in some linked data concepts for semantic relationships between the metadata, and then you have a digital object that is FAIR. I would suggest looking back over Barend's presentation for more on FAIR digital objects, but the idea is basically that everything I just talked about is given
E
some semantic relationship, pulling it from the digital object architecture and combining it with FAIR. It also has a specific format, an RDF format, that again is relatively available and usable. The general idea here is that you have a FAIR digital object, wrapped in a PID, that an agent can act upon once they decide that that digital object is fit for the purpose they're looking for, and they can get that from the metadata.
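As a rough sketch of the shape such a record might take, here is a minimal JSON-LD-flavoured FAIR digital object: the PID, some typed state data, and linked-data-style relationships pointing at richer metadata. Every identifier and property name here is hypothetical, chosen only to illustrate the structure, not taken from the FDO specification.

```python
import json

# Hypothetical FAIR digital object record: identifier, typed state
# data, and semantic links an agent can follow before deciding the
# object is fit for purpose.
fdo = {
    "@id": "hdl:20.500.12345/object-7",
    "@type": "FairDigitalObject",
    "createdOn": "2020-04-03T12:00:00Z",
    "checksum": "9e107d9d372bb6826bd81d3542a419d6",
    "hasMetadata": {"@id": "https://example.org/metadata/object-7"},
    "isPartOf": {"@id": "hdl:20.500.12345/collection-1"},
}

# Serialise to the wire format an agent would fetch and act on.
record = json.dumps(fdo, indent=2, sort_keys=True)
```

An agent resolving the PID gets this record, checks the typed fields against the data type registry, and follows `hasMetadata` only when it needs the full description.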
E
That metadata is part of the state data that you can get from resolving the persistent identifier. So, going back, and realizing that I'm going 20 minutes long here: on the next era of IT that we talked about, I will just say that I think this final era is coming, whether through the technology we're working on, or a hybrid technology, as I mentioned, that includes networking, or some commercial group is going to develop something before we get this out the door.
E
It remains to be seen, but I think we are moving towards that third era, where we have basically a single data set: the technologist who is writing the client can interact with all data through one method, or with one protocol, basically. So, realizing that I hurried through the last half of my presentation, I will, if I have time, take questions, I think.
A
Just being respectful of people's time, what I'm going to do is give a quick note on upcoming presentations, and then people can hang around the virtual podium afterwards but are also free to leave. So I'll just make a note: we'll be meeting again on the 1st of May. Stanton Martin is going to be talking about the National Microbiome Data Collaborative; Stan's at Oak Ridge. And then you heard earlier from Christine Kirkpatrick, who's...
A
E
F
Just wanted to say, Rob, I really appreciated your presentation. I especially appreciated the last couple of slides, which really wrap up some not only complicated concepts; to be experiencing this as it develops in real time, there's so much to sift through about, you know, what is worth following and what's going to emerge as the way to do it. It's really nice to have a recent retrospective of what has emerged as the things that we should be building around, so I appreciated that a lot.
F
Also, I really loved the drawing at the beginning, and I think that one was just a timely way to frame the discussion. But I also think the more that we can support each other in our varying working styles, and acknowledge that it's a different time, and that, yes, your kids might be in the room, or you might be sharing a home office with a spouse, and things are different; I think that's just really good to interject and be real about with each other.
E
A
We'll get more virtual, right? Yeah. Here's a thing that wove in and out of your talk, and I also loved the diagrams, especially in the last couple; a nice way to present it conceptually. But, you know, kind of forever there's been a worldview that says we've got to organize the data better and better, and then there's a different worldview, especially with the rise of machine learning, that says, yeah, you know, just put it in a pile and we'll figure it out.
E
So I would point you towards more of the details of the PID kernel information. In my opinion, you want the minimum set, available for all data, that allows you to do those base operations, and the base operations are your first-year computer science ones: you want to be able to retrieve an object, and you want to be able to put some basic trust in it and determine if it's the object you're looking for. So you need something like an MD5 checksum.
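A minimal sketch of such a deliberately small kernel record: just enough to retrieve the object, check that it is what you asked for, and find the heavier metadata elsewhere. The field names and URLs are hypothetical.

```python
import hashlib
from datetime import datetime, timezone

def make_kernel(location: str, payload: bytes, metadata_url: str) -> dict:
    # The kernel stays tiny so billions of records can be resolved fast.
    return {
        "digitalObjectLocation": location,
        "checksum": hashlib.md5(payload).hexdigest(),
        "createdOn": datetime.now(timezone.utc).strftime("%Y-%m-%dT%H:%M:%SZ"),
        # Pointer to the full metadata, which lives on its own server.
        "metadataPointer": metadata_url,
    }

def verify(kernel: dict, payload: bytes) -> bool:
    # Basic trust: does the object we fetched match the kernel record?
    return hashlib.md5(payload).hexdigest() == kernel["checksum"]

kernel = make_kernel("https://example.org/data/1", b"example bytes",
                     "https://example.org/metadata/1")
```

The checksum gives the basic trust check, and the pointer keeps the heavyweight metadata out of the record that has to operate at internet scale.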
E
You need a created-on date. And it has to be very small; it has to be very small for a very technical reason, which is that this has to operate at internet scales, right? So you're looking at billions of these objects. If you try to put all of your metadata into what's associated with a PID, then all your time is going to be spent churning, trying to get through tons of metadata. So there are really two sets of operations.
E
Then, for all of these similar data objects, you're going to have to get some more in-depth profile, and that metadata still probably lives on a metadata server, not within the greater environment that allows you to quickly operate on billions of objects. But you need a pointer in that PID kernel as to where you can find more, right? So when you're searching for an object, you want to know if that object is suitable, or fit for use, yeah.
E
So it comes down to what is searchable, what is reasonably searchable, and then what it takes to build the necessary level of trust before you're going to do some extended operation or some analysis with an object. So from a purely technical point, you want to keep that kernel as small as possible, simply as a performance issue; but you also have to have that greater detail, and that greater detail is going to tell you,
E
you know, the units on the number you just picked up, so you know what you're actually calculating. All that needs to be recorded too, but you don't necessarily need it for every object. What you need for every object is where it's located, where you can get more data about it, and some very base metadata that allows you to establish trust that the object you're getting is the object you asked for.
A
E
The internet, I will say, absolutely did exist before that, though the genius of it really became widely used at that point, right? Which is, as I said: the network doesn't care if you're using your cell phone, your laptop, your server in your data center, or your kids' toys that have an IP address. It talks to them all the same way; the network itself treats them the same way. And that's what we really need: clients to be able to treat data all the same way.
E
A
E
E
I'd say, anyway: you've had protocols that allowed communication between like devices for as long as you've had computing; well, not quite as long, but nearly as long as you've had computing. The genius is that the network doesn't care; it can talk to you no matter what operating system or what hardware you're running, right, yeah.
G
Can I ask a question about why? What's your justification for really wanting resolvable identifiers, as opposed to just global identifiers, for such fine-grained objects? You've got data sets that have a hundred thousand files, and now you've got a resolvable ID for some intermediate data product that you'd really be hard-pressed to understand without understanding the whole data set. So why do we want to have a resolver service to be able to get to every single one of those individual things, without having to find the main data set?
E
Let me say that you can do that. That doesn't necessarily mean that it's useful to do that, or that you have to do that, right? At this point we're still showing that you can: PIDs themselves are very inexpensive, and the resolution is very quick, as long as the kernel metadata doesn't get big. It's going to be useful in some cases and not useful in others. What we found in the SEAGrid project is that you may want to start at the intermediate product, all right?
E
So you may want to start at the pre-processed data, you may want to start from the raw data, or you may want to actually change just a component of the input file. So you don't want to change the molecular model, for instance, that you're submitting, but you do want to change the parameters that are given to the application to run it. So you divide them in two: this is the instructions, or the analysis, and this is the actual data.
E
Then you can change one without changing the other.
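The split just described can be sketched as two runs that share a model PID but swap the parameters PID, so only one component is re-identified. All PIDs here are hypothetical.

```python
# First run: the analysis instructions and the input data each carry
# their own PID, so they can change independently.
run_a = {
    "model": "hdl:20.500.12345/model-1",        # the molecular model
    "parameters": "hdl:20.500.12345/params-1",  # the run parameters
}

# Second run: new parameters, same model.
run_b = dict(run_a, parameters="hdl:20.500.12345/params-2")

# Only the swapped component differs between the two runs.
changed = sorted(key for key in run_a if run_a[key] != run_b[key])
```

Comparing the two PID sets immediately shows what changed between runs, which is exactly the provenance question a reviewer would ask.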
So, again, this is useful in some cases and not useful in others, and the granularity is really going to be up to the researcher, or the community, in terms of what is useful. In fact, one of the projects that we looked at early on was called the Perseus Digital Library, and the question they had, with text analysis, is: do you want to put a PID on a chapter? Do you want to put it on a page?
E
Do you want to put it on a letter, or a word? What is the right granularity? And I think, and this is not what they came to, but I think the applications are going to define what granularity is useful, right? So the applications that will use this input data either exist or they will exist, and you're going to solve the granularity problem based on the workflow and the application that you have running.
E
So if you're looking at billions of PIDs for $50, it's inexpensive to put PIDs on there. So the usefulness isn't going to be defined by a technologist; the usefulness is going to be defined by the research community and, in fact, I think in the end by the applications developers, because you're going to want to feed the application what the application needs.
G
I'm really questioning just the resolvable part. Putting global identifiers on really small things so that you can reference them, you know, I think is something we can do, right? You don't want to do that with a DOI, but things that don't need a resolver service, just having those PIDs that can be connected to a DOI level or to a higher level, I think has a place in this game too. So I'm curious about that; you know, this is sort of... there are two sides.
G
E
These run on a very minimal schema and, yes, I think there are still a lot of open questions as to where the usefulness, or what level of usefulness, this will be. Will it be that in the end we decide it's really only useful to put PIDs on data when it's published and put somewhere like DataCite, or is it useful to use PIDs at this workflow level, which is what we're doing?
E
I do encourage anybody who has more in-depth questions to get hold of me. I've put, and I think this is still up, I think I'm still sharing, links to the E-RPID project itself, the testbed, and the FAIR digital object group; and then, if you're really interested and want to listen to me talk about this for an hour, with a little bit more of the technical detail instead of this shortened, high-level overview, I've put in a videotaped presentation there.