From YouTube: CI WG demo: HydroShare
Description
Date: 04/06/18
Presenter: Dave Tarboton
Institution: Utah State University
West Big Data Innovation Hub
A
While we're going through all this, something to keep in your minds: if there's anything you want to think about or contribute on that, I'll just put that out there. Rather than take up too much time, I will go ahead and introduce Dave, who's here from Utah State and is part of the HydroShare program, which is a project that's collecting hydrological information across the whole country, I believe.
A
I think we've used some of it even during the Harvey situation down here. He's the principal investigator for HydroShare and a lead in civil engineering for water up at Utah State University. So rather than read all the stuff in the agenda notes, I'll go ahead and turn it over to you to tell us about it from here. Yeah.
B
Great, thanks. Thanks for the chance to talk; I'm really happy to be sharing this with you. HydroShare is a web-based system for sharing hydrologic data and models, with specific functionality aimed at making collaboration easier for hydrologists. It's been developed over the last six years now, in collaboration with CUAHSI, to support, initially,
B
the data management and publication needs, and then growing into model sharing, of the hydrologic research community. We can think of it as the hydrology community's contribution to the transparency and research reproducibility movement. It's funded by the National Science Foundation under its SI2 program, Software Infrastructure for Sustained Innovation. That program has changed names now, but it was funded under the old name.
B
So RENCI knows this quite well. I wanted, in the next slide, to talk a bit about what CUAHSI is: it's a non-profit consortium of about 130 US universities whose mission is to shape the future of water science by strengthening interdisciplinary collaboration, and this data sharing activity of HydroShare is part of that broader community effort. So that's effectively the audience that this cyberinfrastructure is targeting. Our motivation is really collaboration, so this slide is intended to emphasize the collaborative nature of hydrologic research.
B
There's the need to combine information from multiple sources to do analyses that may be data- and computationally intensive, though you still need collaboration even if they're not, in order to address the grand challenges: avoiding flooding, avoiding water shortages, and things like that. On the right you see a screenshot of the HydroShare website, which I'll be talking through a little bit. It's open access: anybody can post data in it and use it for collaboration. So it's really...
B
...in what we refer to as geographic features. You may have multi-dimensional space-time data, and then you may have that aggregated together into model programs and model instances. We distinguish between the model program, which is the code that actually implements the computations, and the model instance, which would be that program plus the data for applying it at a specific watershed, for example. So we've designed HydroShare to hold a wide variety of the data of interest, in the formats that hydrologists like to use.
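The distinction between a model program and a model instance can be illustrated with a minimal sketch. The class and field names below are hypothetical and chosen only for illustration; they are not HydroShare's actual schema.

```python
from dataclasses import dataclass, field
from typing import List


@dataclass(frozen=True)
class ModelProgram:
    """The code that implements the computations (hypothetical fields)."""
    name: str
    version: str
    code_files: List[str]


@dataclass
class ModelInstance:
    """A model program plus the data for applying it at one watershed."""
    program: ModelProgram
    watershed: str
    input_files: List[str] = field(default_factory=list)

    def all_files(self) -> List[str]:
        # Everything needed to rerun this particular application of the model.
        return list(self.program.code_files) + list(self.input_files)


# One program can back several instances at different watersheds.
program = ModelProgram("example-snowmelt", "1.0", ["model.py"])
site_a = ModelInstance(program, "Watershed A", ["a_forcing.csv"])
site_b = ModelInstance(program, "Watershed B", ["b_forcing.csv"])
print(site_a.all_files())
```

Keeping the program separate from the instance means the same code can be referenced by many site-specific applications without being copied.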
B
So first, let me do this slide. This will step through what HydroShare is. First, it's a platform for sharing and collaborating: really exchanging information in terms of computer files, so file storage with Dropbox-ish functionality, and hopefully as easy for sharing information as Dropbox. But then we want to add information: metadata descriptions; access to that metadata through an API; the capability for web apps and social functions; and formal publication of the data with digital object identifiers, so that it can be identified in citations and enhance trust in the findings.
B
So that's some of the value-added functionality that we're building on, with the goal really being to advance the science by enabling the community to easily and freely share the products resulting from the research: not just the scientific publication, but also the data and the models used to create them. It's based on a fairly carefully designed resource data model that uses the Open Archives Initiative Object Reuse and Exchange standards.
B
So the pattern with that is that everything is referred to as a resource, and we use the word resource because that can describe quite generally an object that one might be sharing, comprised of computer files. It may be data, it may be a model, it may be a combination of both, and that can get grouped into an aggregation.
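As a rough illustration of the "everything is a resource" pattern, the sketch below models a resource as a set of files with a small amount of metadata, and a collection as a grouping of resources. All names here are hypothetical; the dictionary returned by `resource_map` only mimics the idea of an aggregation listing what it aggregates and is not an actual OAI-ORE serialization.

```python
from dataclasses import dataclass, field
from typing import Dict, List


@dataclass
class Resource:
    """A shareable object made of computer files (hypothetical fields)."""
    resource_id: str
    title: str
    resource_type: str  # e.g. "data", "model", or "composite"
    files: List[str] = field(default_factory=list)

    def resource_map(self) -> Dict[str, object]:
        # Aggregation-style view: one object that lists the files it groups.
        return {"aggregation": self.resource_id, "aggregates": sorted(self.files)}


@dataclass
class Collection:
    """A grouping of resources."""
    title: str
    members: List[Resource] = field(default_factory=list)

    def file_count(self) -> int:
        return sum(len(r.files) for r in self.members)


flow = Resource("res-1", "Streamflow record", "data", ["flow.csv"])
model = Resource("res-2", "Runoff model", "model", ["run.py", "params.json"])
study = Collection("Example study", [flow, model])
print(model.resource_map())
```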
B
That would be a bit of a challenge to type in. I've just got a couple of screenshots of some of the interface of HydroShare. I recognize these are lightning talks, so I can't do too much of it, but an individual user, after they've logged in, can go to the My Resources page and create new resources; that's where they get to post information into the system. iRODS is used as the underlying storage layer.
B
So then, after you've created a resource, it's got a number of features. The landing page for the resource shows the authors, the owner, the type, when it was created, and citation information; for example, if it's been published with a DOI, that'll be given there. It also shows the abstract that the user who created the resource wrote. And then for each resource you can manage the access, so information can be private or public.
B
You can give people permission to just view or to edit. There is commenting and rating; that's been somewhat underused, but we built that in with the idea of trying to promote social value around resources. And then you can do things like organize resources into collections, and you can also create different versions. But one of the interesting things is that you can open them with compatible web apps. The concept here is that apps are effectively any web-based system that can act on resources through the application programming
B
interface, to support both visualization and analysis. Anybody can establish an app and then register it in HydroShare. If it gets approved by CUAHSI, it'll appear on the apps landing page, but even if it doesn't get approved by CUAHSI, or it's still in the process of being evaluated, it's still available for people to use.
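The registration and approval behavior described here can be sketched as a small registry in which every registered app is usable but only approved apps are listed on the landing page. The class, method names, and URLs are hypothetical and do not reflect how HydroShare implements this.

```python
from dataclasses import dataclass
from typing import Dict, List


@dataclass
class WebApp:
    name: str
    url: str
    approved: bool = False  # approval is granted separately from registration


class AppRegistry:
    """Hypothetical registry: all apps are usable, approved ones are listed."""

    def __init__(self) -> None:
        self._apps: Dict[str, WebApp] = {}

    def register(self, name: str, url: str) -> WebApp:
        app = WebApp(name, url)
        self._apps[name] = app
        return app

    def approve(self, name: str) -> None:
        self._apps[name].approved = True

    def landing_page(self) -> List[str]:
        # Only approved apps appear on the apps landing page.
        return sorted(a.name for a in self._apps.values() if a.approved)

    def available(self) -> List[str]:
        # Registered apps stay usable whether or not they are approved.
        return sorted(self._apps)


registry = AppRegistry()
registry.register("notebook-app", "https://example.org/notebook")
registry.register("map-viewer", "https://example.org/map")
registry.approve("notebook-app")
print(registry.landing_page(), registry.available())
```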
B
One of the apps that we're putting quite a lot of energy into is a deployment of JupyterHub with a Jupyter Python notebook, because that gives a really general capability to have, let's say, entry-level programmers write and execute code in the system, where all of the libraries and dependencies are effectively resolved for them. They can extract data from HydroShare, do their work, and then save it back into HydroShare, including the notebook itself, and then let other people pick up on working on the notebook in a collaboration.
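The notebook round trip described here (pull data out, compute, save results and the notebook back) might look roughly like the sketch below. `InMemoryRepository` is a hypothetical stand-in so the example runs on its own; it is not the HydroShare API, and the file names and column name are invented for illustration.

```python
import csv
import io
import statistics
from typing import Dict


class InMemoryRepository:
    """Hypothetical stand-in for a data-sharing service (not a real API)."""

    def __init__(self) -> None:
        self._resources: Dict[str, Dict[str, str]] = {}
        self._next_id = 1

    def upload(self, files: Dict[str, str]) -> str:
        resource_id = f"res-{self._next_id}"
        self._next_id += 1
        self._resources[resource_id] = dict(files)
        return resource_id

    def download(self, resource_id: str) -> Dict[str, str]:
        return dict(self._resources[resource_id])


def mean_flow(csv_text: str) -> float:
    # Average the "flow_cms" column of a small CSV table.
    rows = csv.DictReader(io.StringIO(csv_text))
    return statistics.mean(float(row["flow_cms"]) for row in rows)


repo = InMemoryRepository()
source_id = repo.upload(
    {"flow.csv": "date,flow_cms\n2018-04-01,2.0\n2018-04-02,4.0\n"}
)

# Step 1: extract the data. Step 2: do the work.
data = repo.download(source_id)
result = mean_flow(data["flow.csv"])

# Step 3: save the result back, together with the notebook itself.
result_id = repo.upload(
    {"summary.txt": f"mean_flow_cms={result}", "analysis.ipynb": "{}"}
)
print(result_id, result)
```

Saving the notebook alongside the result is what lets a collaborator pick up the same analysis later.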
B
So I know this is a fairly technical crowd, so I wanted to go a bit into how the system works. At a high level there are really three parts to it. The main entry point is a Django website; the technology we've used is a software stack built on Django, and that's used to support the loading of information, the editing of metadata, and the discovery of resources, and to organize and annotate your content and manage access.
B
So if you want to think of this as parallel to the way a PC works, you can think of this as a file explorer. Then we've got iRODS as, effectively, the interface to the storage layer, and that's to allow data to be held in a federated data store. So while HydroShare provides some capacity, there's also capacity for other, perhaps heavy or big-data, users to establish their own federated iRODS server.
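One way to picture the federated storage idea is a router that places a user's files in a default store unless that user has registered their own federated store. This is a plain-Python sketch under that assumption; the zone names and path layout are hypothetical, and no iRODS client calls are used.

```python
from dataclasses import dataclass
from typing import Dict

DEFAULT_ZONE = "defaultZone"  # hypothetical name for the built-in store


@dataclass
class StoredFile:
    resource_id: str
    path: str
    zone: str

    @property
    def logical_path(self) -> str:
        return f"/{self.zone}/{self.resource_id}/{self.path}"


class StorageRouter:
    """Chooses which storage zone holds a user's files (illustrative only)."""

    def __init__(self) -> None:
        self._user_zones: Dict[str, str] = {}

    def register_user_zone(self, user: str, zone: str) -> None:
        # A heavy-data user brings their own federated store.
        self._user_zones[user] = zone

    def zone_for(self, user: str) -> str:
        return self._user_zones.get(user, DEFAULT_ZONE)

    def store(self, user: str, resource_id: str, path: str) -> StoredFile:
        return StoredFile(resource_id, path, self.zone_for(user))


router = StorageRouter()
router.register_user_zone("alice", "aliceZone")
big = router.store("alice", "res-1", "big.nc")
small = router.store("bob", "res-2", "small.csv")
print(big.logical_path, small.logical_path)
```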
B
So there's a number of examples of those already: the SWATShare app here, which is actually at Purdue running with HubZero, and there are apps that CyberGIS has at the University of Illinois. Now those happen to be offline right now, because they were on the ROGER system, which is going through a rebuild. And you can have apps that take advantage of standard systems that come out of, say, Unidata in the atmospheric sciences for accessing multi-dimensional data.
B
So, just a couple of slides to end here. This is a bit about our statistics. We keep track of the users that we have sign up, and we also keep track of how frequently they log in and whether they're active or not. The primary audience is the US hydrologic research community, but it's open to international use, and we're also trying to keep track of who the people are, in order
B
to report to the National Science Foundation and other organizations. Then we're also looking at the number of resources that have been added to the system, and their types, to understand how people are looking at things. This is just a fairly small snapshot from the metric tracking system that we have to be able to understand what's going on. So this just summarizes some of the points that I've made: it's a web-based system for data and model sharing.
B
You can share models and, to the degree that the models can be executed by apps, you can execute them. It facilitates ease of access to high-performance computing, and that really comes from the data being considered to go into a system. I'm traveling to EGU in Vienna next week, and a group from the HydroShare team is going to be connecting HydroShare to the Cheyenne supercomputer, which is part of NCAR, to try to further
B
the collaboration around a model that's running there. So we're really seeking to frame the data as social objects that people can use for collaboration, and trying to be interoperable with other data and modeling systems, with the ultimate goal being to advance hydrologic understanding more rapidly. And that's a picture of some of the team outside at one of our events, and a lot of credit goes to all the people who've done all the work.
B
So, I mean, there's always a danger when you make a system free and easy about whether it's going to get overwhelmed by a use that wasn't necessarily the primary one that you intended it for. But we've got a strategy where we basically give each user a free quota of 20 gigabytes, and then if somebody needs more than that, they just need to talk to CUAHSI about it. And there are sort of NSF's [unclear] on the Hydrologic Sciences program.
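The free-quota strategy amounts to a simple check before accepting an upload. The sketch below assumes a decimal gigabyte (10^9 bytes) and a hypothetical function name; how the real system measures usage or enforces the limit is not stated in the talk.

```python
GB = 10 ** 9  # assumption: decimal gigabytes
DEFAULT_QUOTA_BYTES = 20 * GB  # the 20 GB free quota mentioned in the talk


def can_upload(used_bytes: int, new_file_bytes: int,
               quota_bytes: int = DEFAULT_QUOTA_BYTES) -> bool:
    """Return True if the upload keeps the user within their quota."""
    return used_bytes + new_file_bytes <= quota_bytes


# A user at 19 GB can add 1 GB but not 2 GB under the default quota.
print(can_upload(19 * GB, 1 * GB), can_upload(19 * GB, 2 * GB))

# A user granted a larger quota on request can go further.
print(can_upload(19 * GB, 2 * GB, quota_bytes=100 * GB))
```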
B
We're not using any formal ontologies, or, well, only at a rudimentary level. Our resource data model describes each metadata element formally from a namespace, and where we've got terms that we can pull from Dublin Core or things like that, we are using them.
B
We found that for quite a lot of concepts we've defined our own terms, so that may not necessarily be all that helpful if everybody effectively defines all the words they're going to be using themselves, but that's definitely an issue that we're trying to be sensitive to. We would like it all to be effectively machine readable.
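Mixing standard and locally defined terms in namespaced, machine-readable metadata can be sketched with Python's standard XML library. The Dublin Core element namespace below is the published one; the `ex` namespace, the `watershedName` term, and the root element name are hypothetical placeholders, not HydroShare's actual vocabulary.

```python
import xml.etree.ElementTree as ET

DC = "http://purl.org/dc/elements/1.1/"  # Dublin Core element set
EX = "http://example.org/hydro-terms/"   # hypothetical custom namespace

ET.register_namespace("dc", DC)
ET.register_namespace("ex", EX)


def build_metadata(title: str, creator: str, watershed: str) -> ET.Element:
    root = ET.Element("metadata")
    # Reuse standard terms where they exist.
    ET.SubElement(root, f"{{{DC}}}title").text = title
    ET.SubElement(root, f"{{{DC}}}creator").text = creator
    # Fall back to a locally defined term where no standard one fits.
    ET.SubElement(root, f"{{{EX}}}watershedName").text = watershed
    return root


doc = build_metadata("Example streamflow record", "A. Researcher", "Watershed A")
xml_text = ET.tostring(doc, encoding="unicode")
print(xml_text)
```

Because every element carries its namespace, a consumer can tell which terms are shared vocabulary and which are local definitions.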
A
Right, I don't see anything else popping up in the chat window on that, but go ahead. Oh yeah, actually my question was for you. I really enjoyed the presentation, by the way, and I have a couple of follow-up things that I'll be emailing David about, but I was wondering if you have any thoughts on how HydroShare and Whole Tale, maybe even our Workbench, have commonality and maybe a place to converge. I think that would be interesting. The ability to migrate... I mean, you know, so Whole Tale is,
A
I think many people here know about it, but essentially we're using containers as a way to have reproducible computation and publication of data, results, and the computations. And so I think with this, I mean, we have the ability to subscribe to iRODS data. It would be, again, data discovery and bringing those things in, but it would be an interesting thing to do, as well as perhaps even possibly getting some of those apps into containers so that we can run with those in an environment that will capture that.
B
Right, I would like to learn a lot more about that. I haven't heard about Whole Tale before, but we're using Docker containers quite widely. The system itself is split up amongst a bunch of Docker containers, and also in our JupyterHub environment some of the models are being put in Docker. So that's one level at which we're using it. The other one is that Tanu Malik at DePaul University in Chicago has an EarthCube project
B
that's developing what she calls sciunits, which is a sort of containerization procedure where you can go through a sequence of steps executing programs. It will record all of those, as well as record all of the dependencies, and allow you to put all of those in a container that she calls a sciunit. You can then push that container into HydroShare; somebody else could download it to a different platform and re-execute and reproduce the results, I think.
A
Sounds like it's ripe for the picking on that, because all those things I think are possible, and it would just be interesting to see how we could federate across things like this. I had an additional question; we've probably run over, so rather than going on, the one that I had was: this is mostly users bringing in data. Do you also support data streaming coming in from sensors and other sources that are then leveraged by both the modeling and data integration portions?
B
We do, to a limited extent. We've got a couple of what we refer to as community high-value data sets, such as outputs from the National Water Model, that we are actually supporting on separate THREDDS servers at RENCI, and we provide access to that to apps that can get launched from HydroShare. So that's one sort of connection to high-value data. The other is, well, HydroShare...
A
Alright, well, thank you very much for a good talk and lots of information. I should get in touch with you a little bit about Christine's suggestion, as well as, I don't know if you know about the National Data Service stuff that we have in the Workbench there, which might also be an interesting area to look into. I did some...