Description
NERSC Data Seminars Series: https://github.com/NERSC/data-seminars
Abstract: This talk will present an update on the features and future of HDF5 for exascale HPC. Currently, our work focuses on asynchronous I/O and node-local storage caches, but future work will include GPU direct I/O and data movement across the deeper memory hierarchy anticipated on future systems.
A: Okay, so I can waste one minute on an introduction, not that anyone really needs one. Quincey is now with us at NERSC, and Suren is also in CRD, and they've been working on, I guess, the future. So it will be very exciting to hear about where HDF5 can go from here. So hopefully... once they find the password... but yeah, I guess take it away, and we'll see.
B: Sorry, didn't mean to cut you off. Sure thing. All right, thanks everyone for coming; it's great to have you all here. I want to acknowledge the many, many team members we have on this project. It's fairly long-running; we've been renewed by ECP at least once.
We've had a sequence of people in some cases. Primarily it's a collaboration between LBNL, Argonne, and The HDF Group, but we've had some really great interns, particularly this summer; Kaiyuan and John have contributed quite a bit to the effort.
So it's a broad effort across a lot of people. I'll talk briefly about HDF5 itself; many people are familiar with it, so I've tried to pack it down into a reasonably short set of slides. Then I'll talk about how that applies to the ECP applications and the features we have been working on to help those app teams now and in the future, and then think about: well, what does this mean in the longer run? Where are we going in the next few years?
Once we get to the end of our current milestones, how can we continue to carry those forward? So: why use HDF5?
You know, if you ask yourself: how do I deal with I/O at exascale? Do I need to understand the specifics of the MPI standard for I/O? And, by the way, why is my checkpoint taking so long? HDF5 is designed to hide all that I/O complexity so that you can concentrate on your science. That's our goal, right?
We want to help you out by hiding all those moving parts behind really nice, easy abstractions, so that science application teams can just work with the HDF5 API and trust that we'll do the right thing: they'll get their data back, and it'll perform well. So: HDF5 stands for Hierarchical Data Format version 5.
You do I/O on data, of course, according to the data model, but it's built on top of lots of different kinds of backing stores: POSIX, object stores, the cloud, memory hierarchies, whatever you want. And the last thing, which I won't emphasize here, and which we're actually gradually migrating away from, is that we have a high-volume, complex-data-friendly file format that's very well defined. People have written third-party readers and writers for our file format; we're pretty confident that we've described it, and even if the software went away, you could get your data back.
So the ecosystem is, as I say, quite broad. This is just a very small and, in effect, somewhat old subsampling of all the different teams and tools that are working with HDF5 data today and over the last 20 years. I think our first release was in 1997, so we've been doing this for more than 20 years.
Okay, conceptually speaking, HDF5 is a lot like XML: it's self-describing, it has an extensible type system, and it has a lot of rich metadata that users and the software can apply to the data you create. It's also designed to be high performance, compact, and scalable, like binary flat files.
Sometimes we call HDF5 the PDF of science: it's a standard interchange format, it contains a lot of different kinds of data in one container, and it's hierarchical in a lot of ways.
You can also make it a true graph, if you really feel like building your links in whatever way works for you; but mostly it's a lot like the directories in a file system. And we provide lots of random-access, subsetting kinds of capabilities, so it's similar in some ways to databases, although it's much friendlier to science data. So it's a broad intersection of all of these.
It's not a true superset of any of them, and it has capabilities that are not included in any of these related concepts and technologies.
So, from a data model perspective, files are containers. We're gradually moving away from "a file is a file on the file system" and opening that concept up: it could be an object store system with many, many objects inside a container, and you treat those as if they were a single container for the objects within it. But conceptually speaking it's a file; it has a set of objects that belong to that file.
The core objects inside HDF5 are datasets: basically, a multi-dimensional array of homogeneous data elements. In order to understand how that works in the file, we need to store a description of it, so we have to have some specification of the data elements themselves, and of how big the array is.
The first component, for the data elements, is what we call datatypes. In this case I'm just naming one: a 32-bit little-endian integer. You can make these arbitrarily complex: nested compounds, variable-length sequences, array fields, the whole shebang, quite complex if you'd like complex data elements. But it's also very efficient to store floats and ints, and we describe those and let you do I/O on them.
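A minimal sketch of what such a datatype looks like through the C API; the particle_t struct, file name, and sizes here are illustrative, not from the talk:

```c
#include "hdf5.h"

/* A hypothetical record type, used only for illustration. */
typedef struct {
    int    id;
    double pos[3];
} particle_t;

int main(void)
{
    hsize_t three = 3;
    hid_t   arr_t = H5Tarray_create2(H5T_NATIVE_DOUBLE, 1, &three);

    /* Compound datatype mirroring the in-memory layout of particle_t.
     * A plain 32-bit little-endian integer would just be H5T_STD_I32LE;
     * HDF5 converts between memory and file representations for you. */
    hid_t cmp_t = H5Tcreate(H5T_COMPOUND, sizeof(particle_t));
    H5Tinsert(cmp_t, "id",  HOFFSET(particle_t, id),  H5T_NATIVE_INT);
    H5Tinsert(cmp_t, "pos", HOFFSET(particle_t, pos), arr_t);

    hsize_t dim   = 10;
    hid_t   space = H5Screate_simple(1, &dim, NULL);
    hid_t   file  = H5Fcreate("particles.h5", H5F_ACC_TRUNC,
                              H5P_DEFAULT, H5P_DEFAULT);
    hid_t   dset  = H5Dcreate2(file, "particles", cmp_t, space,
                               H5P_DEFAULT, H5P_DEFAULT, H5P_DEFAULT);

    particle_t buf[10] = {0};
    H5Dwrite(dset, cmp_t, H5S_ALL, H5S_ALL, H5P_DEFAULT, buf);

    H5Dclose(dset); H5Sclose(space); H5Fclose(file);
    H5Tclose(cmp_t); H5Tclose(arr_t);
    return 0;
}
```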
For the array-ness of the array, we need to store how many dimensions it has (we frequently call that the rank) and the sizes of each dimension. In fact, HDF5 allows any dimension to be unlimited in size, so you can extend an array in HDF5 in any dimension you'd like, not just the slowest-changing one. That one is the typical "append images to a movie" notion of things, but you can actually extend in all the other dimensions too.
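For example, an extendible dataset pairs a maximum-dimensions array with chunked storage; a sketch (names and sizes illustrative):

```c
#include "hdf5.h"

int main(void)
{
    /* Start with zero "frames"; any dimension may be H5S_UNLIMITED,
     * but here only the slowest-changing one grows. */
    hsize_t dims[2]    = {0, 1024};
    hsize_t maxdims[2] = {H5S_UNLIMITED, 1024};
    hid_t   space = H5Screate_simple(2, dims, maxdims);

    /* Unlimited dimensions require chunked storage. */
    hid_t   dcpl = H5Pcreate(H5P_DATASET_CREATE);
    hsize_t chunk[2] = {1, 1024};
    H5Pset_chunk(dcpl, 2, chunk);

    hid_t file = H5Fcreate("movie.h5", H5F_ACC_TRUNC, H5P_DEFAULT, H5P_DEFAULT);
    hid_t dset = H5Dcreate2(file, "frames", H5T_NATIVE_FLOAT, space,
                            H5P_DEFAULT, dcpl, H5P_DEFAULT);

    /* Append one frame: grow the dataset, then write into the new slab. */
    hsize_t newdims[2] = {1, 1024};
    H5Dset_extent(dset, newdims);

    hid_t   fspace  = H5Dget_space(dset);
    hsize_t start[2] = {0, 0}, count[2] = {1, 1024};
    H5Sselect_hyperslab(fspace, H5S_SELECT_SET, start, NULL, count, NULL);
    hid_t   mspace  = H5Screate_simple(2, count, NULL);

    static float frame[1024];
    H5Dwrite(dset, H5T_NATIVE_FLOAT, mspace, fspace, H5P_DEFAULT, frame);

    H5Sclose(mspace); H5Sclose(fspace);
    H5Dclose(dset); H5Pclose(dcpl); H5Sclose(space); H5Fclose(file);
    return 0;
}
```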
Finally, to organize these concepts, you put the datasets in a group. We don't just want a pile of them lying around in the file; we want some structure and hierarchy that means something semantic to the users. So we provide groups (the folders here) and links (the arrows), so that users can build a semantically meaningful, usually science-meaningful, structure out of the objects in the file. And every file has a root group.
It's very much like a file system. Just as you can add hard links to a file, you can have links to an object from more than one place in an HDF5 file. You can also have something like soft links, and links that refer to objects in other HDF5 files. But unlike normal file systems, in HDF5 files you can create graphs and cycles, do whatever you want. I don't necessarily recommend that, because it can become confusing for users parsing the file, wondering what happened and why it's all tangled up.
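A sketch of the three link flavors in the C API; the file and path names are illustrative:

```c
#include "hdf5.h"

int main(void)
{
    hid_t file = H5Fcreate("links.h5", H5F_ACC_TRUNC, H5P_DEFAULT, H5P_DEFAULT);
    hid_t grp  = H5Gcreate2(file, "/results", H5P_DEFAULT, H5P_DEFAULT, H5P_DEFAULT);

    /* Hard link: a second name for the same object, as in a filesystem. */
    H5Lcreate_hard(file, "/results", file, "/alias", H5P_DEFAULT, H5P_DEFAULT);

    /* Soft link: a name that resolves by path and may dangle. */
    H5Lcreate_soft("/results", file, "/latest", H5P_DEFAULT, H5P_DEFAULT);

    /* External link: refers to an object in another HDF5 file. */
    H5Lcreate_external("other.h5", "/data", file, "/remote",
                       H5P_DEFAULT, H5P_DEFAULT);

    H5Gclose(grp); H5Fclose(file);
    return 0;
}
```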
But it is possible to create custom graphs if you have a need for that in some way. So all of these things together lay out something that, hopefully, is semantically meaningful and kind of standardized for an application. To augment the basic objects, the groups and the datasets, we provide attributes. They carry user metadata to decorate, or add information to, those baseline objects. They're similar to key-value pairs, in that each attribute has a name that's unique for that object, so you can have multiple attributes on an object.
If you find yourself doing one of those kinds of things with an attribute, it's probably better to create a dataset or some other structure in the file, and then use one of the reference datatypes in HDF5 for an attribute that points at, refers to, that other object or group hierarchy you're trying to work with.
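A sketch of that pattern with the HDF5 1.12+ reference API; the object and attribute names are illustrative:

```c
#include "hdf5.h"

int main(void)
{
    hid_t file  = H5Fcreate("refs.h5", H5F_ACC_TRUNC, H5P_DEFAULT, H5P_DEFAULT);
    hid_t space = H5Screate(H5S_SCALAR);
    hid_t table = H5Dcreate2(file, "table", H5T_NATIVE_DOUBLE, space,
                             H5P_DEFAULT, H5P_DEFAULT, H5P_DEFAULT);

    /* Make an object reference to the dataset... */
    H5R_ref_t ref;
    H5Rcreate_object(file, "table", H5P_DEFAULT, &ref);

    /* ...and store it in a small attribute instead of duplicating data. */
    hid_t attr = H5Acreate2(file, "points_to", H5T_STD_REF, space,
                            H5P_DEFAULT, H5P_DEFAULT);
    H5Awrite(attr, H5T_STD_REF, &ref);

    H5Rdestroy(&ref);
    H5Aclose(attr); H5Dclose(table); H5Sclose(space); H5Fclose(file);
    return 0;
}
```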
So, in a nutshell, really fast: this is the HDF5 data model. There's a lot more depth in here; I did not tell you about all the different varieties of datatypes, or some of the more obscure things you can do with links and whatnot. But going forward, you can at least apply these four basic objects, files, datasets, groups, and attributes, to problems that you hit, or when people talk to you about HDF5.
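For reference, all four objects show up in a few lines of C; the file, group, dataset, and attribute names here are illustrative:

```c
#include "hdf5.h"

int main(void)
{
    /* File: the container. */
    hid_t file = H5Fcreate("example.h5", H5F_ACC_TRUNC, H5P_DEFAULT, H5P_DEFAULT);

    /* Group: hierarchy under the root group. */
    hid_t grp = H5Gcreate2(file, "/timestep_0", H5P_DEFAULT,
                           H5P_DEFAULT, H5P_DEFAULT);

    /* Dataset: a 2-D array of doubles. */
    hsize_t dims[2] = {100, 100};
    hid_t space = H5Screate_simple(2, dims, NULL);
    hid_t dset  = H5Dcreate2(grp, "temperature", H5T_NATIVE_DOUBLE, space,
                             H5P_DEFAULT, H5P_DEFAULT, H5P_DEFAULT);

    /* Attribute: small key-value metadata decorating the dataset. */
    hid_t ascl = H5Screate(H5S_SCALAR);
    hid_t attr = H5Acreate2(dset, "units_kelvin", H5T_NATIVE_INT, ascl,
                            H5P_DEFAULT, H5P_DEFAULT);
    int flag = 1;
    H5Awrite(attr, H5T_NATIVE_INT, &flag);

    H5Aclose(attr); H5Sclose(ascl);
    H5Dclose(dset); H5Sclose(space);
    H5Gclose(grp);  H5Fclose(file);
    return 0;
}
```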
We plan to productize a set of HDF5 features that are appropriate for that time frame and set of machines; to support, maintain, and release HDF5; and then also to do some planning for the future. We don't want to just run out at the end of our funding and then go, "well, sorry, guys." So hopefully we'll get to that.
Here are two slides of these: there are a lot of teams that work with HDF5 and rely on it to one degree or another. Some of them are completely reliant, and others are more like, "well, we have several different output formats and HDF5 is one of them; can you guys help us out?" So: lots of different teams, lots of different locations in the DOE.
Many different aspects, too: not just simulation, but machine learning as well.
We also work with a bunch of the ST (software technology) teams, to support them, build infrastructure, and collaborate kind of horizontally. It's not necessarily that they will always use us, but they're building tools, and we should leverage them, or they're going to leverage us in some way.
So if the apps are so focused on the performance aspects of things, well, why are they using us, then? We always tell people: you're going to lose a little bit, hopefully not a lot, of performance when you use HDF5 or some other I/O middleware. And they really don't want to play with your I/O middleware; they really want to do their science. They're not middleware developers, and I/O just doesn't produce results; it's not compute in that sense, it just preserves the results.
So this kind of thing looks unexciting. We like it, but it's not exciting to them. And, realistically speaking, application teams shouldn't need to know the details of all this I/O middleware. It's like saying, "oh, MPICH didn't perform well; go over there and optimize it." You don't tell that to the app teams; you try to do it for them.
So that's our goal. The app teams just want someone knowledgeable to fix it, and sometimes they're not quite certain exactly what would best help them. They say, "we trust you guys, you're smart people." We work hard to build those relationships and build up that trust, in order to make intelligent decisions, and when we say, "hey, it would be really good if you guys did this," they go, "oh, okay, sure, we'll try that." They don't have to come up with all the ideas.
We spend a lot of time trying to talk to app teams and learn about what it is they're trying to do, and why, and then say: "okay, here's how we could help you," or "I see you have a problem; maybe you guys should change your code a little, and we'll add some tweaks into HDF5, and together we move forward." And part of our responsibility, really, is to look not just at today but five to ten years out.
They want their data back. They're not going to run their binaries on new machines; they're going to recompile or update their software. But they want their data back. Some teams especially: some of the nuclear weapons labs have data from before the test ban, so they plan to keep data, in certain circumstances, for quite a long time.
So as part of ECP HDF5 we said: okay, fine, great. We will go out and build a certain set of features (we'll talk about those); we're going to spend time talking to the app teams and tuning our software to meet their needs; and, as a side effect, we decided it would be really smart to have a performance test suite for HDF5, some set of small I/O kernels and benchmarks that we can run on current and new systems, so that we can tell: are we doing okay here?
Did the performance fall off? What is this special case? Is there a reason why this got slower or faster? You know, do some decent software engineering on the performance regression side of things. And we also spend a lot of time thinking about what's going to happen in the future, talking to software and hardware teams, and thinking on our own about what's coming on new systems. So I'll hit the first four of these in more detail and then skip out to the future.
So this first one finally rolled out at the beginning of this year: it's called the Virtual Object Layer. It's a nice abstraction layer within HDF5 that redirects the I/O operations, the things that touch a file (or a container, as we say today), into what we call connectors: virtual object layer connectors, VOL connectors. It sits right underneath the API level, so immediately when the app calls dataset create, we jump into the VOL connector and ask it to do the dataset create operation.
It happens at kind of an object-oriented interface, so connectors implement these methods for the various kinds of objects in the data model and the operations on them: reading and writing data elements, and other things. They're very nice for apps, because they can be transparently invoked from shared libraries. You just have to set the environment variable; app code doesn't have to change and doesn't have to be recompiled. It's great, as long as the app is linked with the newest version of HDF5, the one that has support for the Virtual Object Layer.
They can just pass all their data directly through something completely new, retargeted onto a completely new storage system or a new mode of operation, without rebuilding, without recompiling at all. These are nice; they allow you to stack them and build up chains of connectors.
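In practice the two environment variables involved are HDF5_PLUGIN_PATH and HDF5_VOL_CONNECTOR; they're usually exported in the job script, but a self-contained C sketch can set them before the first HDF5 call. The connector string below follows the async VOL connector's documented form and should be treated as an assumption, and the plugin path is a placeholder:

```c
#include <stdlib.h>
#include "hdf5.h"

int main(void)
{
    /* Normally exported in the job script; setting them here, before any
     * HDF5 call, has the same effect. Path and connector string are
     * illustrative assumptions, not values from the talk. */
    setenv("HDF5_PLUGIN_PATH", "/path/to/vol-async/lib", 1);
    setenv("HDF5_VOL_CONNECTOR", "async under_vol=0;under_info={}", 1);

    /* Unmodified application code: this open now routes through the
     * dynamically loaded VOL connector. */
    hid_t file = H5Fcreate("app.h5", H5F_ACC_TRUNC, H5P_DEFAULT, H5P_DEFAULT);
    H5Fclose(file);
    return 0;
}
```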
Pass-throughs are optional, and you can stack as many of them as you like ("zero or more", in regex terms). There's exactly one terminal VOL connector, and it stores the data either in the native, traditional file format, or maybe in an object store system, or in the cloud, or in whatever file format you invent; this is all very pluggable.
There's a nice, well-defined public interface for people to write VOL connectors, and probably 10 or 20 folks have written connectors that they're using today with this interface. And this is one of the foundational building blocks for the next three features: without the Virtual Object Layer we couldn't do them, at least not the way we've done it.
So this is a core infrastructure upgrade, a huge one, that allows us to add capabilities to HDF5 that had no planning when it was designed. You can post facto change a lot of HDF5's behavior without rebuilding HDF5 and without rebuilding your app: just retarget to a completely different storage system and keep going.
So with that foundation in mind, we added support for asynchronous I/O, and this is a pass-through connector. It uses background threads (we use the Argobots threading package from Argonne to help with the scheduling and to organize those threads), and again it's totally transparent to the app: you don't have to recompile, and you don't have to make any code changes if you don't want to. It just executes those I/O operations in the background on a thread. There are no servers, nothing; it all just runs inside your app, and as long as you've got a spare core, or you can spare a little bit of time on a thread, this works out great.
So it just says: "oh, great, I will go open that file for you; here's a placeholder so you can keep going." Then, create an object: "that's all great, I'll go do that; here's a placeholder so you can keep going." Go write some data: "okay, sure, great." We add all these things into the task queue, and then we monitor to see if the app is idle.
Idle in the sense that it's not making HDF5 calls; it's gone off to compute. Then we start dequeuing and executing the I/O operations in the background. So hopefully we've decoupled the I/O from the compute cycle, and we can hide it as much as possible.
Hopefully we can eliminate some, or maybe all, of that I/O time with asynchronous execution and really speed up the app's view of what's going on with I/O. At the very end of each compute cycle it starts some I/O, and that's probably a little bit of overhead; sometimes you can get it down very close to zero, but there's still a little bit of overhead for I/O. Then it comes back and starts its next compute cycle, which is ideally overlapped with all the I/O from the end.
You still see this one I/O block at the very end: eventually we have to close the file and flush the buffers and everything else before the app terminates; we can't see into the future. But you still get a significant time savings, and it grows: the more iterations through the compute-I/O cycle you make, the more opportunity we have to save I/O time for your application.
There are certain operations in HDF5 that are essentially read-modify-write, or that involve a sequence of operations that have to happen in order to perform your action: update some metadata over here, update some metadata over there, and then come back to you with a new object. So we're trying to decouple anything that could potentially touch the disk, reading or writing, from the app. It doesn't always win; sometimes synchronous is fine.
Sometimes you have enough memory to buffer your data, and the OS would have done it just as well.
Okay, so this async VOL connector has, effectively, two modes of operation. One is what we've been calling implicit, for when you don't want to modify your app at all; that's what I've been describing. You can just transparently (dynamically) link to the async VOL connector with the environment variable, and it has a fairly conservative async behavior. It understands that you're going to expect your buffer to be reusable when the dataset write returns, and it will block to make certain we get that done; but any of the metadata operations can happen asynchronously. Same with reads: we'll execute those effectively synchronously, so that the app can read the buffer when it comes back from the dataset read.
So here is what it looks like. On the left-hand side is the implicit mode: existing HDF5 calls, and the user doesn't do anything different with their code; they just point at the async VOL connector. On the right-hand side, if they really want to manage this in a more explicit way, they can create a new event set object (this es_id) and then pass it in to all the same operations. It's the same on both sides here, except that we're aggregating those asynchronous operations into this event set as the user proceeds along.
Maybe this is a checkpoint: they create a file, create a group for the checkpoint, dump a bunch of datasets and data in there, and then, at the end, either they compute for a while longer and allow all the data to be written out, and then wait on that event set at the end of their compute; or, if for whatever reason they're trying to guarantee that the data is on disk at a certain point, they can wait earlier on that event set.
C: So, you know, parallel, or massively parallel, actions against data are interesting. Does an event set come with a descriptor as to what type of metadata transactions might be required?
B: No, it's an in-memory object, and it's really kind of boring: it's just a bag full of tokens for the operations that you executed asynchronously. It just sits there managing all those tokens for you in a nice, programmable, easily manageable way.
The really big advantage, we feel, for application developers is that there's a single token, this event set ID, that they have to manage. Put as many things in there as you want to happen during this set of asynchronous operations; you only have to touch and keep track of one ID, and internally, within the VOL connector, we manage the dependencies. So we'll guarantee that the file gets created before you use the file to create the group, and likewise the group must get created before the dataset gets created and the data gets written to it. In some cases we can parallelize things: if there were 10 datasets in the group, we could fan those out, because they only depend on the group getting created; they don't depend on each other.
Some things are more sequential, but at least it's asynchronous and offloaded into the background one way or the other. We manage all these dependencies; we correctly handle collective parallel metadata I/O, all the goodness there; that's all fine. And this set of code will execute and produce results identical to the one on the left, the implicit sequence, which is identical to what you would get if you ran it serially, synchronously, without the async connector.
Okay, so moving on: the other aspect that I mentioned earlier is system- and topology-aware I/O. I'm certain I've missed some locations where data could be, as well as some connections between them, but this is already gnarly enough. We've got all these different places where there's a memory buffer, effectively: RAM or disk or tape or however you want to think of it. They're all connected together, and they're getting deeper and deeper over time.
Ten years ago we just had CPUs and a parallel file system, and it was pretty straightforward; users knew what they were doing. Today we've got all of this running around, and tomorrow there might be more or less of it. Not every system has a burst buffer or node-local storage or is connected to the outside world, but conceptually there are a lot of pieces moving around here.
So with that in mind, and building on all the technologies we've talked about so far, we have this caching VOL connector. It's primarily focused on node-local storage today, but with a pluggable design we can evolve it toward caching at any level in that hierarchy. We could say: this I/O gets performed to node-local storage and then returned to the app as it's occurring. And we're just in the process of implementing an update where we can stack the caching connector on top of an async connector, so that, in the background, it can be evicting things from, or prefetching things into, the cache it keeps on node-local storage. That's this part here in the bottom left: you can stack these VOL connectors.
You can build anything you'd like, and the app is completely unaware of all of it. You can sidestep modifying HDF5 and modifying the app, and build up stackable connections across the memory in your system, the different locations where data could reside. We think that's where we're going in the future; I'll talk about it a little more in the last few slides.
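A sketch of what a stacked configuration can look like, again expressed in the connector string. The key names and the async connector's registered ID (512) follow the cache and async VOL connector READMEs and are assumptions here, as are the paths:

```c
#include <stdlib.h>
#include "hdf5.h"

int main(void)
{
    /* Stacking is expressed in the connector string itself: the caching
     * connector on top, the async connector (ID 512) underneath, and the
     * native connector (0) at the bottom. All values below are
     * illustrative assumptions; check your connectors' documentation. */
    setenv("HDF5_PLUGIN_PATH",
           "/path/to/vol-cache/lib:/path/to/vol-async/lib", 1);
    setenv("HDF5_VOL_CONNECTOR",
           "cache_ext config=cache.cfg;under_vol=512;"
           "under_info={under_vol=0;under_info={}}", 1);

    hid_t file = H5Fcreate("stacked.h5", H5F_ACC_TRUNC, H5P_DEFAULT, H5P_DEFAULT);
    H5Fclose(file);
    return 0;
}
```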
So another kind of twist on this is what we call subfiling. A single shared file is traditional for HDF5, but it's sometimes slow, with lock contention and other I/O bandwidth difficulties. So what happens instead: rather than storing a single shared file, behind the scenes (underneath the covers, from the user's point of view) we shard it up into a set of pieces, subfiles, and then create another metadata file that describes how it all fits together.
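The prototype described here later shipped as the subfiling virtual file driver in HDF5 1.14; a minimal sketch of enabling it under that released API, where passing NULL selects the default configuration:

```c
#include <mpi.h>
#include "hdf5.h"
#include "H5FDsubfiling.h"

int main(int argc, char **argv)
{
    /* The subfiling VFD needs MPI_THREAD_MULTIPLE. */
    int provided;
    MPI_Init_thread(&argc, &argv, MPI_THREAD_MULTIPLE, &provided);

    hid_t fapl = H5Pcreate(H5P_FILE_ACCESS);
    H5Pset_mpi_params(fapl, MPI_COMM_WORLD, MPI_INFO_NULL);
    /* NULL takes the default subfiling configuration. */
    H5Pset_fapl_subfiling(fapl, NULL);

    /* To the application this is still one logical file; on disk it is a
     * set of subfiles plus a small metadata file describing them. */
    hid_t file = H5Fcreate("big.h5", H5F_ACC_TRUNC, H5P_DEFAULT, fapl);

    H5Fclose(file);
    H5Pclose(fapl);
    MPI_Finalize();
    return 0;
}
```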
We get better use of the parallel file system and hopefully reduce the lock contention issues, to improve performance. In theory, anyway; this is very much prototype, hacked-together code. But on Cori we can see some moderately significant speedups with this very prototype code: 2x, 3x, and, not quite 10x there, 6x. So we have some pretty high hopes that this could work out.
So, finally: okay, fine, GPUs are coming. This part works fine, because we've got CUDA data transfers and other similar technologies with HIP or oneAPI or whatever; but this part doesn't work at all yet. If you've got GPU-private memory, you pretty much have to send the data back over to the CPU and then get it out through the CPU's memory into some file system. But NVIDIA and other vendors are working on how to change that.
We have a virtual file driver that speaks GPU Direct Storage, GDS, and it has worked out really, really well; it's a drop-in replacement for the POSIX VFD.
It's a nice single call to enable from the app, and it works perfectly: it passes all the HDF5 regression test suites, and it's ready for beta testing if you feel like trying it out. You have to have a GDS-capable machine and all that goodness, but it is there and available for people to work with, and the performance can be pretty good. The green bars over here are the GDS read and write rates; we're still working on why the read is not quite so good, but this is very early.
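A sketch of enabling the driver. H5Pset_driver_by_name is the generic HDF5 1.14+ mechanism for loading a VFD plugin; the driver name string "gds", like the CUDA buffer handling, is an assumption based on the vfd-gds prototype rather than something stated in the talk:

```c
#include <cuda_runtime.h>
#include "hdf5.h"

int main(void)
{
    /* Ask for the GDS virtual file driver by name; the plugin must be on
     * HDF5_PLUGIN_PATH. The name "gds" is an assumption here. */
    hid_t fapl = H5Pcreate(H5P_FILE_ACCESS);
    H5Pset_driver_by_name(fapl, "gds", NULL);

    hid_t file = H5Fcreate("gpu.h5", H5F_ACC_TRUNC, H5P_DEFAULT, fapl);

    hsize_t dims[1] = {1 << 20};
    hid_t space = H5Screate_simple(1, dims, NULL);
    hid_t dset  = H5Dcreate2(file, "field", H5T_NATIVE_FLOAT, space,
                             H5P_DEFAULT, H5P_DEFAULT, H5P_DEFAULT);

    /* With GDS the write buffer can live in GPU memory: no staging copy
     * through host RAM. */
    float *dbuf;
    cudaMalloc((void **)&dbuf, (1 << 20) * sizeof(float));
    H5Dwrite(dset, H5T_NATIVE_FLOAT, H5S_ALL, H5S_ALL, H5P_DEFAULT, dbuf);

    cudaFree(dbuf);
    H5Dclose(dset); H5Sclose(space); H5Fclose(file); H5Pclose(fapl);
    return 0;
}
```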
Very preliminary: a single thread and one GPU. But we are showing that GPU Direct Storage can outperform staging through the CPU.
The unfortunate part is that it only works in serial HDF5 right now, because we're replacing the POSIX driver inside HDF5; and in HDF5, when we want to do parallel I/O, we rely on the MPI library to do that for us. We just make MPI-IO calls and, boom, magic happens. So right now the developers on both the OpenMPI and the MPICH teams are making progress on supporting GDS I/O in the MPI libraries, which will in turn enable HDF5 to do parallel I/O from GPU-native, private GPU memory in the future. As soon as those get stable and roll this capability out for MPI, HDF5 won't have to change; we'll just invoke the MPI-IO call, and it will correctly take the buffer directly from GPU memory out to disk.
Again, playing with these ideas: the system topology keeps changing and updating, and in the future we've got this gnarly diagram. We really don't think that application teams and application developers have enough time, or the knowledge and desire, to go write custom data movement pieces for their apps; they're going to have to port them from one machine to another, and they're going to be constantly dealing with this. So this is a real opportunity for us.
So we've been discussing over the last few weeks (we have to find a good name for it) a data movement DSL. You'd like to come up with some high-level description: where is the data in this system? How are those locations connected? What are the various properties of those locations? This one is high bandwidth; this is how big the memory is, or the burst buffer, or whatever. And then, okay...
We want the apps to be able to create some very high-level description of what they'd like to have happen, and then we build up a nice stackable set of VOL connectors inside HDF5 and apply those policies and descriptions to that stack of VOL connectors, where you've got the components; I've kind of been showing you the components along the way. We think in the next year or two maybe we'll be able to implement some good pieces of this and really allow the apps to do this stuff. Okay.
C: Alongside those four, which are very good... could there be resource expectations, or... well, "expectations" could be either the addresses of specific resources that are required, or performance expectations.
B: Yeah, sure, okay. I mean, like I said, we're very new at this; we don't even have a concrete sketch of the language yet. So yeah, sure, resources would be good to apply in there too.
You could add in a few more connections, like connecting CPU-private memory to another CPU's private memory (MPI communication, there). We could sit here and draw arrows in here and talk, but the notion still stands: the app teams don't want to have to think about this craziness.
They just want some, hopefully good, default. The idea we're trying to play with is: when you install HDF5 on a system, there should be some default description that the system folks install for that machine, saying "this is how our system operates"; it's not going to change really rapidly. And there should be a way for an app team to override that, or to emphasize certain aspects over others, but basically the default behavior should be a whole lot better.
So this is where I finish up. We're funded by the DOE, with lots of great teams and lots of hard work by teams at Argonne, Berkeley, and The HDF Group, as well as interns over the summers from North Carolina and Northwestern. Any more thoughts, comments, questions? I can flip back over here to the fancy diagram.
F: I think, over the last five years at least, for every problem I've seen specified, a computer scientist has come along and said that a DSL is the answer to it. So I've just been a little bit wary of signing on to yet another DSL.
B: Yeah, you have to use this guy, whoops, this guy: the GDS virtual file driver, at the bottom level. Virtual file drivers are below, effectively within, the native VOL connector. But yeah, you have to choose that driver.
The nice thing about it is that (we're still working with NVIDIA on exactly how to make this work out in detail) it seems possible to auto-detect whether the buffer is actually a GPU buffer or a CPU buffer, and we might be able to just make GDS the default. That seems wild, so I'm not certain about that; but at least it's possible to tell which pool of memory your buffer is in.
I mean, we could let you do it; it's not our problem, man. No, we've got to convince Jack and Doug and, you know, all the people, to update the OS on Cori's GPU nodes to be GDS-compatible.
We've been talking about it, yeah. I don't think it's a really near-term idea for them yet; but if we had more users like you, then we could say: hey, we have real users; do you think you can prioritize this a little bit higher?
C: Hey Quincey, I have a different question, more on the attributes side of things. You know, HDF is, luckily, a multi-stakeholder undertaking, with lots and lots of people interested in it. But I'm wondering whether or not HDF has thought about bolting in provenance at a more fundamental level.
One way to do that (I'm just throwing out ideas) would be providing a foothold for ORCID or other identifiers, so that when an action or an event set happens, it could be connected to either who made it happen or why.
B: Yeah, I would volunteer that we've done some work in that area; we have a prototype. Flipping back here to the VOL connector diagram: we have a prototype provenance pass-through VOL connector. It's designed to record, I guess just to record, the operations that occur on an HDF5 container, a file, and then log that however you'd like.
C: Using, like, GUIDs, and POSIX, or...?
B: Right now it's actually more of a baseline implementation that does plain-text logging. We were figuring out how to enable Darshan logging; Suren, I don't know if we ever finished that?
D: Yeah, we hadn't finished that, because we kind of asked for future funding for it. One of the things we want to do is use more standard provenance libraries and formats, such as RDF and related standards. So that's a work in progress, or somewhat near-term future work, yeah.
C: I mean, there's some simplicity, and reason, in treating key-value pairs just in the abstract; but, recognizing how people actually wield datasets, you could potentially bolt on some easy features.
D: Yeah, there is a lot of provenance work already out there, so we can take advantage of it: RDF, and SPARQL-type querying made available on these. Oh yeah.
B: Well, thank you all again. If you want any more information, contacting Suren and me, or anyone else on the team, is more than welcome; we'd love to talk and hear about other use cases and interesting ideas.