From YouTube: CI WG demo: iRODS (An Overview)
Description
Date: 1/5/18
Presenter: Jason Coposky
Institution: Renaissance Computing Institute
South Big Data Hub
A
List here on the agenda, so I think we'll start off with the iRODS overview. We've got the intro slide there. Jason is joining us. He's been working for 21 years in a variety of areas, ranging from virtual reality to EDA management. He was technical director at a startup, where he was developing projection and distortion correction technologies.
A
He became the first member of the visualization team at RENCI, creating novel large-format multi-touch systems, and has now moved on to become the project technical lead and chief technologist of the iRODS Consortium, where he provides management and oversight for the consortium. So I'll go ahead and let you kick off, Jason.
B
All right, thank you very much. I'm just going to give a high-level overview of what iRODS is, what it can do for you, and how it does those things. First of all, iRODS is an open source, distributed, metadata-driven, and data-centric piece of technology, and we like to talk about it as a flexible framework for the abstraction of your infrastructure.
B
An array of different capabilities, and insulate the users and clients above that from all of that change. So we'll talk about data virtualization, where iRODS can service a number of different storage technologies, such as object and tape. You may want to talk about how you can take that data out of one storage system and put it into another storage system, such as scratch for high-performance computing or high-throughput computing. And then, how do we find that data?
B
You have, you know, petabytes of data everywhere, so you're going to want some sort of technology to allow you to discover that, and so on. So iRODS provides all of these different capabilities and, as I said, it insulates the users and the different clients, be that the command line, all the way out to web applications, or, say, the Discovery Environment, which is a much richer browser-based interface. All these things can change under the covers without causing any break for your users.
B
So the first thing we'll talk about is data virtualization, and this speaks to the ability of iRODS to provide a unified namespace on any kind of storage technology. This could be a file system, such as NetApp. This could be cloud storage in Amazon or Microsoft. You could have on-premises object storage.
B
How are you going to get that data from the on-premises object storage to, say, an off-premises object storage, and still have your users be able to discover and utilize their data? We can also speak to various archival storage systems, so that could be Amazon's Glacier. That could be a tape system such as HPSS, and so on. So the takeaway here is that iRODS provides a logical view into this complex physical representation of your data, which could be geographically distributed, or could simply be in a cluster in your basement.
B
Now, this data object is, once again, a logical representation of your data. That data object can represent one or more physical instantiations of that data, which we call replicas, and these replicas could be stored anywhere. They could be stored locally, or you can have a replica that is also geographically distributed, say, for locality of reference. You want to give a user fast access to their data while that data is geographically distributed, and iRODS will give you the ability to do that. That also provides durability.
B
So if one server is down for whatever reason, you still have high availability to access your data. What this means is that, while we have, say, a logical path of /tempZone/home/rods/the_file, that logical path may map to one or more physical locations of that data. So the physical path of that data would be, you know, a vault path on disk for the file, and then we have two other replicas, in the u2 vault and the u1 vault. So this one logical path maps to any of these physical instantiations of the data.
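The mapping described here, one logical path resolving to several physical replicas, can be sketched as a small in-memory catalog. This is only an illustrative model and not the iRODS API: the `Replica` and `Catalog` names, the resource names, and the physical paths are all hypothetical.

```python
from dataclasses import dataclass, field


@dataclass
class Replica:
    resource: str        # hypothetical storage resource name
    physical_path: str   # where the bytes actually live (made-up paths)
    available: bool = True


@dataclass
class Catalog:
    # logical path -> list of physical replicas (toy stand-in for a catalog)
    entries: dict = field(default_factory=dict)

    def register(self, logical_path, replica):
        self.entries.setdefault(logical_path, []).append(replica)

    def resolve(self, logical_path):
        """Return the first replica whose server is reachable."""
        for replica in self.entries.get(logical_path, []):
            if replica.available:
                return replica
        raise LookupError(f"no available replica for {logical_path}")


catalog = Catalog()
logical = "/tempZone/home/rods/the_file"
catalog.register(logical, Replica("local", "/vault/home/rods/the_file"))
catalog.register(logical, Replica("remote1", "/u1/vault/home/rods/the_file"))
catalog.register(logical, Replica("remote2", "/u2/vault/home/rods/the_file"))

# Simulate the first server going down; the logical path still resolves.
catalog.entries[logical][0].available = False
print(catalog.resolve(logical).resource)
```

The client only ever names the logical path; which replica serves the request is decided behind that name, which is the point being made about availability.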
B
Now, let's say that you had one of these replicas stored in S3. There would be a bucket and a key associated with that. Or, say you had it in DDN's WOS. The physical path would actually be an object ID, some sort of hash of the data itself. iRODS provides this logical view on top of all of this complex storage infrastructure. Now, since we have a catalog which manages this unified namespace, we have the ability to write things down about the data itself.
B
We can start reasoning about these. So within iRODS itself, we can ask ourselves: does this user have the appropriate role to access this blue server here, because that server may be HIPAA compliant? So now the metadata attached to these storage resources actually gives not only the storage resources but also the servers themselves an identity, a thing about which we can reason. That also gives us the ability to implement a wide array of other use cases, not just simply whether or not a user is authorized to interact with a particular storage resource.
B
And since we have this metadata, and I talked about how iRODS has an integrated scripting engine, we can talk about workflow automation. This is an integrated scripting language that is built into iRODS, and it is a language of your choice. So this could be Python. This could be JavaScript. You can also build very performant C++ rule engine plugins that perform one specific task. The scripting engine responds to what we call dynamic policy enforcement points.
B
If a user puts some data into iRODS, that will trigger the pre-put hook. Your code injected there can reason about whether or not the user is allowed. Say, does the metadata attribute on the user match up with the metadata attributes on the particular storage resource with which you're trying to interact? Is that okay? If it is, you get a check mark, and then the operation is triggered.
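The pre-put check described here can be illustrated with a toy model in plain Python. This is a sketch under assumptions and not the iRODS rule engine interface: the function names, the `compliance` attribute, and the dictionary-based store are all hypothetical.

```python
def pre_put_allowed(user_metadata, resource_metadata, attribute="compliance"):
    """Toy pre-put check: allow the put only if the user carries the same
    value for `attribute` that the target storage resource requires."""
    required = resource_metadata.get(attribute)
    if required is None:
        return True  # the resource has no requirement for this attribute
    return user_metadata.get(attribute) == required


def put(user_metadata, resource_metadata, store, name, data):
    # The "pre" hook runs first and can veto the operation.
    if not pre_put_allowed(user_metadata, resource_metadata):
        raise PermissionError("user metadata does not match resource policy")
    store[name] = data  # the operation itself


store = {}
hipaa_resource = {"compliance": "hipaa"}  # hypothetical resource metadata

put({"compliance": "hipaa"}, hipaa_resource, store, "scan.dat", b"payload")
try:
    put({}, hipaa_resource, store, "other.dat", b"payload")
except PermissionError as err:
    print("rejected:", err)
print(sorted(store))
```

Only the first put reaches the store; the second is stopped before any data is written, which is what firing the check before the operation buys you.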
B
Assuming that operation is successful, the post hook will fire. On a put, this is typically where our users will want to, say, automatically extract metadata from data that's at rest and apply it to the catalog, or, say, if you're putting in an image, you can generate thumbnails, and so on. There are many different use cases there.
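As a rough illustration of the post hook, the sketch below extracts simple metadata (size and a checksum) after a successful put and records it in a catalog. It is a toy model in plain Python, not iRODS code: the function names and the two dictionaries standing in for storage and the catalog are assumptions made for the example.

```python
import hashlib


def post_put_extract(data: bytes) -> dict:
    """Toy post-put hook: derive metadata from the data now at rest."""
    return {
        "size_bytes": str(len(data)),
        "sha256": hashlib.sha256(data).hexdigest(),
    }


def put_with_post_hook(store, metadata_catalog, logical_path, data):
    store[logical_path] = data  # the operation itself
    # The "post" hook fires only after the operation has succeeded.
    metadata_catalog[logical_path] = post_put_extract(data)


store, metadata_catalog = {}, {}
put_with_post_hook(store, metadata_catalog, "/tempZone/home/rods/hello.txt", b"hello")
print(metadata_catalog["/tempZone/home/rods/hello.txt"]["size_bytes"])
```

A thumbnail generator would slot into the same place: any derived product that depends on the data having landed successfully.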
B
You can have an opinion about what happens in the system. So coupling that metadata with the integrated scripting engine gives you the ability to implement any number of use cases. As I said earlier, we like to say that iRODS is data centric and metadata driven. The last thing that we can talk about is secure collaboration. This speaks to iRODS's ability to federate different unified namespaces, or what we call different zones.
B
The idea here is that, since we have a catalog and a network protocol, since effectively iRODS is a distributed technology, not only can different iRODS servers within a single zone speak to each other, but different iRODS servers in different zones can speak to each other. So the idea here is that you share
B
a couple of keys, which represent your... you create some users and grant some access, and then you can immediately start collaborating. You don't need common infrastructure anymore. You don't need to share or try to buy common infrastructure, and this affords temporary collaboration. You don't have to stand up a monolithic piece of technology to collaborate with your users. You create the federation, you create users and grant them access, and you can immediately start collaborating. When you are done, you can delete the keys, delete the users, and continue about your business.
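The lifecycle described here (share keys, create users, grant access, then tear it all down) can be sketched as a toy model. This is not iRODS code or its administrative interface: the `Zone` class, its methods, and the key value are hypothetical, and the `user#zone` strings only loosely mirror how a remote user might be named.

```python
class Zone:
    """Toy model of a zone that can federate with another zone."""

    def __init__(self, name):
        self.name = name
        self.federation_keys = {}  # remote zone name -> shared key
        self.remote_users = set()  # "user#zone" strings
        self.grants = {}           # collection path -> set of users

    def federate(self, remote_zone, key):
        self.federation_keys[remote_zone] = key

    def add_remote_user(self, user, remote_zone):
        if remote_zone not in self.federation_keys:
            raise ValueError(f"not federated with {remote_zone}")
        qualified = f"{user}#{remote_zone}"
        self.remote_users.add(qualified)
        return qualified

    def grant(self, path, qualified_user):
        if qualified_user not in self.remote_users:
            raise ValueError(f"unknown user {qualified_user}")
        self.grants.setdefault(path, set()).add(qualified_user)

    def can_read(self, path, qualified_user):
        return qualified_user in self.grants.get(path, set())

    def end_federation(self, remote_zone):
        # Delete the key, the remote users, and their grants.
        self.federation_keys.pop(remote_zone, None)
        suffix = f"#{remote_zone}"
        self.remote_users = {u for u in self.remote_users if not u.endswith(suffix)}
        for users in self.grants.values():
            users.difference_update({u for u in users if u.endswith(suffix)})


ours = Zone("zoneA")
ours.federate("zoneB", key="example-shared-key")
alice = ours.add_remote_user("alice", "zoneB")
ours.grant("/zoneA/home/shared", alice)
print(ours.can_read("/zoneA/home/shared", alice))

ours.end_federation("zoneB")
print(ours.can_read("/zoneA/home/shared", alice))
```

Nothing outside the key, the user entries, and the grants had to be created, so nothing else is left behind when the collaboration ends.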
B
So if you think about the second slide, how iRODS was sitting in between the infrastructure and the clients, and you wrap that around all that infrastructure in a circle, you effectively now have a service interface between your technology and a collaborator's technology. The idea here is that you are no longer only sharing data. You are sharing infrastructure.
B
If your collaborator has data that is not allowed to leave their data center, you can launch jobs in their data center in order to gather your results and just simply share the results of that analysis. They can do the same thing with the other users as well, and vice versa. So the idea here is that these collaborations can be dynamic, and they can be quite powerful, because you are leveraging more than just simply the data itself. And I believe my time is up.
B
One of the plugin interfaces to iRODS is authentication. Since we have the catalog and we can attach metadata to users, users exist within the iRODS catalog itself. Typically, what happens is that we end up synchronizing the iRODS user catalog against something like LDAP or Active Directory, and so on, so that is completely up to the users. We handle authentication via GSI, of course. We also reach out through PAM, and we can reach a number of different technologies there. Thanks. Sure. Thank you.
C
This is Christine. I just want to say I've, you know, been following iRODS and, of course, we've used iRODS for many projects, and I'm always impressed, every time I see a presentation, by all the neat new things that you've added, and I know it's very difficult to make this all work together. I hope it's not too indelicate to ask this question, and you can punt if you want, but do you have any people that you work with who are using iRODS and Globus together? I know that Globus has some things that
B
Overlap a little bit, but we don't overlap terribly much. If you look at EUDAT, there are 24, there are 25 sites that are based on iRODS, and they use Globus for a lot of data movement. Through the NIH Data Commons project that we're working on, we will be much more tightly integrating with Globus, which means that we will actually be able to register data objects that are Globus endpoints into the catalog.
B
So we're going to write a resource plugin to interact with Globus in that way, as well as authentication as another means. And when our next plugin interface is released for our multi-part transfer, we're going to consider Globus as a means by which to actually move the data under the covers. So iRODS negotiates the connections and then...
A
So this is Niall. I'll jump on that bandwagon as well. I know we're using Globus Transfer and Globus Auth in the portal, and, you know, the power behind having Globus Auth and being able to use your local identity has been fantastic for that project so far, and melding that with iRODS, which we run quite a bit of here at TACC as well, would really, really help. Are you working with them on actually getting down to the user management level at that point? Or is it just the authentication portion?