From YouTube: CI WG demo: IMaD (Integrative Materials and Design)
Description
Date: 11/2/2018
Presenter: Ben Blaiszik
Institution: University of Chicago
Midwest Big Data Hub
A
So just diving in here: as I mentioned, the high-level things we're looking to do are to connect researchers, so connecting people first, whether those are academics or industrial researchers, along with data services and software and tooling across materials science, starting in the Midwest and then branching out.
A
So some of the connections that we're building are between researchers. We see that as very critical, making sure that people are talking to each other in this little ecosystem. The first is with the Materials Informatics Skunkworks at the University of Wisconsin. The Chicago and Argonne team is holding weekly meetings with this group, mostly around the discovery of metallic glasses.
A
But it's also to coordinate software development and machine learning projects and make sure we're not duplicating work. As a result, we actually submitted a joint CSSI proposal that we are still waiting for word on. And, as Jim mentioned, it's a great opportunity to start training the next generation. The Skunkworks is a group of about 20 undergraduates at Wisconsin who are interested in materials informatics, and we see quite a good opportunity there to train them. We're working with Citrine Informatics as well.
A
So we have bimonthly meetings with this group to discuss data service integrations and joint machine learning projects. We've had two joint papers over the last year with this group, and we now have a project funded by Citrine as a result of this collaboration. Just recently, actually in the last two weeks, we had a materials microscopy data workshop held at Northwestern, basically working through challenges toward capturing data from various instrumentation, especially around microscopy data.
A
So we're really finding that to be important here. Another aspect is the community outreach that we're working on. Through the IMaD project we have had two different interns, Stephanie Fox and Austin Keating, who are part of the Medill School of Journalism science communication program. We've had them visit each partner site and create videos showcasing the people at those institutions, the facilities, and the data services, to really drive awareness of all of those. You can see them here on the right.
A
So the other piece that we've been working on is leveraging the Materials Data Facility effort to connect data services. Just to give you a really brief overview: the Materials Data Facility is building data services to allow researchers to publish data regardless of size, so maybe terabytes of data; data type, so it could be heterogeneous data; and location, so it could be distributed.
A
And the concept that we've really latched on to is this MDF Connect flow, where we have many inputs being piped through the MDF Connect service and sent to many different outputs. Really the goal there is to make it easier for users to deposit their data from where they're collaborating into many services on the other side, all from one location. We've heard many times that there are, you know, 10 or 15 different data services that are important to the materials community.
A
Researchers don't really want to have to deposit their dataset in 15 different ways, so we're trying to help allay that problem. As you see, the inputs we've been able to focus on through IMaD are largely centered around the partners. The 4CeeD service is built at Illinois, Materials Commons is at Michigan, and we've also been working with NIST. And of course we have other integrations that make it very simple for users to get data into this pipeline. The data is then sent through the pipeline.
A
A series of extractions and transformations is performed on that data. Extraction means trying to pull out things like crystal structure or other material properties; transformation means we transform the data into a form that is amenable to deposit in other services. So we could deposit into other MDF services, like Publish to get a DOI, or our search service to allow querying and aggregating of that data. But beyond that, we can also deposit into services like the NIST Materials Resource Registry, the Citrine Informatics Citrination platform, NanoMine, and others that are in development.
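[Editor's sketch] The extract, transform, and deposit steps described above can be pictured roughly as follows. This is a minimal illustration under stated assumptions, not the MDF Connect implementation: the `Record` class, the extractor and transformer functions, and the destination names are all hypothetical and do not come from the talk or from any real MDF API.

```python
from dataclasses import dataclass, field
from typing import Callable, Dict, List


@dataclass
class Record:
    """One submitted dataset plus the metadata that extractors recover (hypothetical)."""
    source: str                       # label for where the data came from
    raw: Dict[str, str]               # raw key/value content of the submission
    metadata: Dict[str, str] = field(default_factory=dict)


def extract_crystal_structure(record: Record) -> None:
    # Extraction: pull a material property out of the raw content, if present.
    if "structure" in record.raw:
        record.metadata["crystal_structure"] = record.raw["structure"].lower()


def to_search_document(record: Record) -> Dict[str, str]:
    # Transformation: flatten into a form a search index could ingest.
    return {"source": record.source, **record.metadata}


def to_registry_entry(record: Record) -> Dict[str, str]:
    # Transformation: a minimal entry for a registry-style service.
    return {"origin": record.source, "fields": ",".join(sorted(record.metadata))}


def run_pipeline(
    record: Record,
    extractors: List[Callable[[Record], None]],
    destinations: Dict[str, Callable[[Record], Dict[str, str]]],
) -> Dict[str, Dict[str, str]]:
    """Run every extractor once, then build one payload per destination."""
    for extract in extractors:
        extract(record)
    return {name: transform(record) for name, transform in destinations.items()}


if __name__ == "__main__":
    rec = Record(source="example-lab", raw={"structure": "Wurtzite"})
    payloads = run_pipeline(
        rec,
        [extract_crystal_structure],
        {"search": to_search_document, "registry": to_registry_entry},
    )
    for name, payload in payloads.items():
        print(name, payload)
```

The point of the sketch is the shape: extraction happens once per dataset, while each output service gets its own transformation of the same extracted metadata.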
A
So if you have a large collaboration, you often have hundreds of different datasets that are going to be deposited. You can use this type of functionality to automate that, and you see that you get all the functionality that was shown in the web form through a Python client.
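[Editor's sketch] To illustrate what automating many deposits from Python might look like, here is a small self-contained example that builds one submission payload per dataset. It deliberately does not call the real MDF Connect client: the function names, the payload schema, and the example.org locations are assumptions made up for illustration, loosely mirroring the metadata fields mentioned in the talk (title, authors, data location).

```python
import json
from typing import Dict, List


def build_submission(title: str, authors: List[str], data_location: str) -> Dict:
    """Assemble one submission as a plain dict (hypothetical schema)."""
    if not title or not authors or not data_location:
        raise ValueError("title, authors, and data_location are all required")
    return {
        "title": title,
        "authors": list(authors),
        "data_location": data_location,
    }


def build_batch(datasets: List[Dict]) -> List[Dict]:
    """Build payloads for many datasets in one pass instead of one web form each."""
    return [
        build_submission(d["title"], d["authors"], d["data_location"])
        for d in datasets
    ]


if __name__ == "__main__":
    # Placeholder datasets; a real collaboration might generate hundreds of these.
    datasets = [
        {
            "title": f"Example dataset {i}",
            "authors": ["A. Researcher"],
            "data_location": f"https://example.org/data/run-{i:03d}",
        }
        for i in range(1, 4)
    ]
    for payload in build_batch(datasets):
        # A real client would send each payload to the service here.
        print(json.dumps(payload))
```

In practice each payload would be handed to whatever submit call the actual client library provides; that step is left out here because its exact interface is not described in the talk.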
A
The other thing that we've seen is important is integrations with basically the places where researchers are doing their collaboration right now. So if a researcher has their data in Google Drive, it's very easy to send it through MDF Connect. We also have integrations with Box, Dropbox, figshare, and others.
A
The one I do want to highlight specifically is our connection to 4CeeD, since they're a partner of our spoke. You see here on the left a screenshot of their project space, where you have, in this case, gallium nitride etching at a given pressure, and you may have some files associated with that. Basically, the 4CeeD project space is an active space that you share mainly among local collaborators.
A
And what we're seeing is that those researchers also want a way to export that data outside of 4CeeD for the community to have access to later. So we were able to work with the 4CeeD team and Ben Gillespie at Illinois to build a very simple publication flow from 4CeeD to MDF Connect. You see that the user can select a repository to send to; they select Materials Data Facility, then fill out a little bit of that same metadata and click.
A
And then everything is sent out to the community as the user wants. I'm just going to show you a very quick video of how this works, just to show you how easy it is. You log in with your institutional credentials; you see there are hundreds of different institutions you can log in with. I'm going to use Chicago. You'd then select to become a contributor, and then you fill in a little bit of the metadata, like the title, the authors, institutions, and the data location.
A
Now you can follow along as our service is parsing through it, and here in a second you'll see that it's sent to Globus Publish. So now you get a DOI, which you can cite in your papers and such, and that is a persistent location. And then down here you see that it was sent to Citrination. We pop over to Citrination, and the nice part here is that it's not like you get the same functionality. You don't get another DOI, you don't get...
B
A
Yeah, I'm trying to share some slides here, but I think I'll just give up for now. I think I heard the question as: is there hope that we could apply this to a different domain at some point? Certainly we are interested in applying this to various domains. As we're building these data services, there are only a few layers that are really domain-specific.
A
You know, the underlying cyberinfrastructure is largely generalizable. The pieces that are not generalizable are largely within the MDF Connect service, the ones I labeled as converters or transformers. Those pieces would need to be changed for different domains, to understand different file types and service integrations in different areas. But everything else is really very domain-agnostic: handling things like large data transfer, publishing datasets, getting a DOI. A lot of those things are very generalizable, so we're definitely looking for collaborators in other spaces to build.
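[Editor's sketch] One way to picture the split between a domain-agnostic core and swappable, domain-specific converters is a small registry keyed by file extension. This is an illustrative assumption about how such a design could look, not a description of how MDF Connect is built; the `register` decorator, the converter functions, and the choice of `.cif` and `.fasta` as example file types are all hypothetical.

```python
from pathlib import PurePosixPath
from typing import Callable, Dict

Converter = Callable[[str], Dict[str, str]]

# Registry of domain-specific converters, keyed by lowercase file extension.
_CONVERTERS: Dict[str, Converter] = {}


def register(extension: str) -> Callable[[Converter], Converter]:
    """Decorator that plugs a converter into the registry for one extension."""
    def decorator(func: Converter) -> Converter:
        _CONVERTERS[extension.lower()] = func
        return func
    return decorator


@register(".cif")
def convert_cif(text: str) -> Dict[str, str]:
    # Hypothetical materials converter; a real one would parse the file contents.
    return {"domain": "materials", "kind": "crystal structure file"}


@register(".fasta")
def convert_fasta(text: str) -> Dict[str, str]:
    # Hypothetical converter showing how another domain could plug in.
    return {"domain": "biology", "kind": "sequence file"}


def process(filename: str, text: str) -> Dict[str, str]:
    """Domain-agnostic core: pick a converter, then add generic bookkeeping."""
    suffix = PurePosixPath(filename).suffix.lower()
    converter = _CONVERTERS.get(suffix)
    if converter is not None:
        meta = converter(text)
    else:
        meta = {"domain": "unknown", "kind": "unrecognized"}
    # These fields do not depend on the domain at all.
    meta["filename"] = filename
    meta["size_bytes"] = str(len(text.encode("utf-8")))
    return meta


if __name__ == "__main__":
    print(process("sample.cif", "data_x"))
    print(process("reads.fasta", ">seq1\nACGT"))
    print(process("notes.txt", "hello"))
```

Under this kind of design, moving to a new domain means registering new converters; `process` and everything around it stays unchanged.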