A: He's working on a signature facility which has recently been awarded a CD-0 by the DOE. He has led the computational materials work for ExMatEx, the national labs' exascale co-design effort, and leads all of LANL's work on materials informatics. He's a fellow of the American Physical Society, a recipient of the Japan Society for the Promotion of Science award in 2010, and also a Fellow for outstanding research in science and engineering at Los Alamos National Lab. So today, I think, he's talking about data challenges and opportunities for the next generation of materials innovation.
B: Thank you very much indeed. So let me try to tell you the scope of what I will focus on. There are essentially two issues, two sets of problems, that I will tell you something about. The first one relates to innovation in experimental materials science, because in many ways that really will be the source of the data, the increasing sizes of data that we will have to wrestle with, and that will then take me to MaRIE, Matter-Radiation Interactions in Extremes.
B: This is a very revolutionary concept that Los Alamos is working on. It is a decadal challenge, in that this is an in situ facility with an XFEL, a free electron laser, designed to actually look at materials behavior. In other words, if you've got a material that's being subjected to something or other, a shock, you want to be able to see inside it to actually see what's happening. So imagine the amount of data that you're going to collect.
B: So that's the first part of my talk, and then, once you have all the nice data, clearly the challenge is to learn from the data. This will bring me to the whole aspect of materials informatics and design that we've been looking at, insofar as: how do we do statistical design? How do we do informatics? And what is it that we can learn? I'll show you that uncertainties are very, very important; they really allow us to explore the search space.
B: It's all a matter of exploitation and exploration, and that's really how one can find materials with targeted response. So I'll give you some examples. I'll focus on an alloy system, nickel-titanium; I'll show you how we can find nickel-titanium alloys with very, very small thermal dissipation by using this strategy. There are some other examples that I have, dielectrics and light-emitting diodes, but I won't have time to tell you about those. Okay, so that's really the scope of what I want to tell you about.
B
Okay,
so
the
first
question
that
arises
is:
do
we
really
have
a
big
data
problem
in
material
science?
If
you
talk
to
experimentalist,
my
friends,
tell
me,
look
I
basically
have
you
know.
Ten
data
sets
I,
have
10
phase
diagrams
of
10
samples.
I
have
synthesized
10,
solid
solutions.
This
is
not
a
big
data
problem
right,
and
so
what
you
have
to
do
is
to
start
from
there.
However,
if
we
look
at
what's
been
happening,
this
is
really
the
landscape.
As
far
as
the
data
is
concerned,
what
you
have
is
the
in
many
ways.
B: In many ways, the closest to a lot of these activities is the LHC, the Large Hadron Collider. That really is the closest, because that's also an experimental facility, and in many ways that's really the benchmark. They've got a very, very nice data portal, they've got a lot of experience, and if you talk to them, they will tell you where the challenge really is: they process a lot of data, but processing is one aspect; another aspect is collecting the data. Okay, so what they see are hundreds of millions of collisions.
B
They
use
certain
criteria,
dirty
criteria
to
essentially
look
at
one
event
in
a
thousand,
and
it's
really
from
that
one
L
event
in
a
thousand
that
they
make
certain
conclusions.
So
this
in
many
ways
provide
some
measure
of
what
the
landscape
is
like.
It's
estimated
that
in
the
year
by
2025
in
the
next
upgrade
that
they
actually
foresee
the
amount
of
data
will
increase
by
24
right
now.
For
example,
the
velocity
of
the
data
approaches
something
like
twenty
five
gigabytes
per
second.
So
that's
the
scope.
That's
really
the
big
data
problem
now
in
material
science.
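To put that quoted velocity in perspective, here is a back-of-the-envelope sketch in Python; the 25 GB/s figure is from the talk, while the assumption that it is sustained around the clock is mine.

```python
# Rough scale implied by the LHC data velocity quoted in the talk.
# Assumes the 25 GB/s rate were sustained continuously, which is an
# idealization; real beam time is far more intermittent.
rate_gb_per_s = 25
seconds_per_day = 24 * 60 * 60
daily_pb = rate_gb_per_s * seconds_per_day / 1e6  # GB -> PB
print(f"~{daily_pb:.1f} PB per day at a sustained 25 GB/s")  # ~2.2 PB/day
```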
B: Now, in materials science, here is sort of my rendition of the materials data landscape. The kind of problem that I will tell you about, insofar as being able to do informatics and design, is that little speck right there: one gigabyte, basically just a small amount of data. I'll tell you something about APS HEDM, high-energy diffraction microscopy, which is of the order of a terabyte; if you take a lot of data, several samples, it's of the order of 10 terabytes. EBSD, you're not really using a light source there.
B
It's
all
basically
done
on
site
of
the
order
of
two
gigabytes
per
sample.
Really
the
future
in
terms
of
collecting
data
lies
in
new
experimental
facilities,
light
sources.
Okay,
so
that's
really
where
the
action
is
so
80s
is
an
example
currently.
So
this
is
really
the
current
state
of
the
art,
but
then
lcls.
This
is
at
flack.
So
this
is
the
XA
fel
facility,
coherent,
x-ray
diffraction,
where
you
can
actually
get
beautiful
spatially
as
well
as
time-resolved
data.
That
really
is
sort
of
of
the
order
right
now,
five
terabytes
per
beam
time.
B
So
this
is
some
guy
going.
You've
got
be
in
time;
he
basically
takes
one
sample
essentially
can
sort
of
take
collect
data.
You
know
for
five
different
fields
or
stresses
and
I'll
give
you
an
example
of
what.
Actually
you
can
learn
by
doing
that
LCLs
too.
So
that's
the
upgrade
in
the
next
five
years,
100
terabytes
per
beam
time.
This
is
Marie,
so
that's
the
facility
I'm
going
to
tell
you
something
about
that
really
is
of
your
sort
of
tenfold
okay.
B
So
that's
in
the
next
decade,
/
beam
time,
so
this,
in
our
view,
is
sort
of
the
landscape.
Now
this
is
by
the
way,
a
logarithmic
scale.
So
that's
why
you
see
this
sort
of
slight
discrepancy
in
the
sizes
of
these
things.
So
that's
la
see
there
there's
Google
there,
so
this
gives
you
the
scope
of
where
material
science
life.
So,
yes,
we
are
going
to
be
moving
to
an
area
where
we
will
have
large
amounts
of
data,
but
the
contention
is
that
a
lot
of
this
data
will
come
from
experimental
facility
such
as
these.
B: New compounds get added, you know, every week to this kind of data set, but it's essentially a stationary data set. You've got, for example, OQMD, where you've got of the order of 300,000 to 400,000 compounds; they're all sort of part of the ICSD anyway. And so that's all well, but it's a very useful thing to do, because you can learn something from the data, and the way you learn from this data is really by screening. Okay.
B
So
here
the
emphasis
is
on
generating
data
and
then
screening
to
learn
something
as
this
thing
from
another
strategy
that
I'll
tell
you
something
about
where
I
actively
ask
the
question:
what
are
the
next
experiments?
I
need
to
be
able
to
do
to
find
a
material
with
a
targeted
property?
Okay.
So
that's
high
throughput
calculations,
but
high
throughput
measurements
are
also
very
important
and
so
there's
some
very
beautiful
work
being
done
in
this
area
by
by
ichiro
kikuchi
in
particular,
and
so
here
the
idea
is
that
you
may
have
a
parameter
space.
B
Okay,
I
bet
you
want
to
span
and
get
enough
data
on,
and
then
you
want
to
hone
in
on
the
region
that
is
of
interest.
This
is
a
very
good
way
to
screen
it's
a
first
cut
and
then
you
go
and
typically
what
what's
done
is
that
you
essentially
have
some
sputtering
guns.
You
may
have
three
species,
sputtering
guns.
You
will
have
a
thin
film
and
its
natural
to
do
that
and
its
rapid
rapid
sort
of
characterization.
So
when
11
zap
you
can
actually
get
out,
you
can
do
diffraction
get
out
the
lattice
parameters.
B
B: But then you have a big jump, a jump in that the data sets now are of the order of terabytes, tens to hundreds of terabytes, and here the whole idea, and it's a challenging problem, is that what you're doing is reconstructing microstructure. Okay, so you basically shine photons from the light source at a given layer, and layer by layer you build, you reconstruct, the microstructure. It's very, very slow.
B
If
I
look,
if
I
show
you
the
work
flow
associated
with
that
problem,
it
shows
you
that
complexity
that
you're
dealing
with
you
know
you've
got
360
angles.
This
is
only
50
layers.
You
got
three
distances,
far
field
near
field
somewhere
in
between
54,000
measurements,
okay,
obvious
diffraction
patterns.
So
that's
your
data
set
and
then,
of
course,
what
you
have
to
do
is
you
have
to
calibrate
the
model.
You've
got
to
do
some
forward.
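That 54,000 is just the product of the three factors quoted; a one-line check:

```python
# Measurement count for the HEDM reconstruction workflow quoted above.
angles = 360      # rotation angles
layers = 50       # sample layers ("this is only 50 layers")
distances = 3     # detector distances: far field, near field, in between
print(angles * layers * distances)  # 54000 diffraction patterns
```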
B: Here you've got to calibrate the model, and then you've got to be able to infer the crystal orientations that correspond to the diffraction pattern that you have. So this is a very time-consuming problem, and some of the informatics tools that I will tell you something about, we're actually using those to alleviate some of the bottlenecks in the analysis. We want to be able to show that we can actually speed this process up, rather than doing it the way it's currently being done, by brute force.
B: You should see it coming; so the wheel is moving. You've got a wheel, it's moving around at a certain rate, and you're laying down material. Okay, so there is a torch there that's laying down material, and what you've got is a light source that is going to interrogate what's happening to the material. It's all in situ. So what you're doing, this is the sort of deposition, just showing you: this is the wheel that's moving. There should have been something at the bottom that didn't come out.
B
Okay,
that's
fine!
So
so
this
is
sitting
on
a
wheel,
that's
actually
moving
and
what
you
what
this
is.
The
this
is
a
torch.
So
what
you're
doing
is
you've
got
a
diffraction
spot
here.
You've
got
another
diffraction
spot
here,
a
diffraction
spot
here.
These
are
the
diffraction
patterns
associated
with
the
would
be
with
the
process,
and
so
you
can
see
here
what's
happening.
Is
it's
essentially
molten
liquid?
B
It's
very
diffused,
as
you
go
a
little
bit
here,
you're
starting
to
see
some
of
the
peaks,
the
signatures
of
the
crystallization
process,
and
then
thereafter
you
see
well-defined
peaks,
okay.
So
it's
in
situ
monitoring
of
all
three
of
the
diffraction
patterns
from
which
you
can
infer
things
like
residual
stresses
and
so
forth.
So
this
is
a
nice
example
of
what
we
can
do
and
clearly
the
data
sets
are
fairly
large,
their
of
the
order
of
hundreds
of
terabytes-
and
this
is
just
being
done
right
now
by
my
colleague,
Don
Brown,
at
Los,
Alamos.
Ok!
B
So
now
we
come
to
what
is
what
are
the
sort
of
next
generation
facility
is
going
to
look
like,
so
we
just
saw
that
you
want
to
learn
about
processing
and
how
the
processing
effects
structure,
which
is
what
you
learn,
is
by
doing
the
diffraction.
Clearly,
the
next
step
is
going
from
the
process
to
a
more
product
based
set
up,
whereby
you
really
want
to
learn
about
the
properties
that
you're
interested
in
in
controlling,
and
so
this
is
where
Marie
comes
in.
So
this
is
Marie,
which
is
a
facility.
B
It's
of
the
order
of
two
and
a
half
billion
dollars
over
the
next
decade.
That
Los
Alamos
wants
to
build
to
be
able
to
monitor
materials
behavior
in
situ,
ok
and
it
uses
a
free
electron
laser
accessory
here.
Why?
Because
really,
it's
only
the
coherence
of
the
x-rays
which
give
you
the
brilliance
and
the
high
repetition
rate
that
will
allow
you
to
have
the
kind
of
time
resolution
that
you
need
to
be
able
to
monitor,
what's
happening
as
a
function
of
time.
We
imagine
snapshots
at
picosecond
nano
second
intervals.
You
can
control
that
over
time.
B
Ok,
so
you
can
sort
of
in
one
millisecond.
You
can
sort
of
get
a
whole
bunch
of
different
snapshots,
separated
by
say,
100
picoseconds,
that's
what
you
want
to
be
able
to
do
to
monitor.
Well,
you
know
what's
happening,
how
the
material
is
behaving,
how
it's
transforming
etc-
and
let's
say
it's
a
multi
probe.
So
you've
got
information
at
different
scales.
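For scale, the 1 ms window and the 100 ps spacing quoted above bound how many snapshot slots a single experiment could in principle have; nothing beyond those two numbers is assumed here.

```python
# Snapshot slots in one shot window at the quoted spacing.
window_s = 1e-3       # one millisecond
spacing_s = 100e-12   # 100 picoseconds between snapshots
print(f"{round(window_s / spacing_s):,} slots")  # 10,000,000
```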
B
Ok,
so
you've
got
protons,
you
got
x-rays
electrons,
and
so
you
can
sort
of
take
a
continuum
image
using
protons
and
the
x
fe
l
will
give
you
a
lot
of
fine
structure,
so
multi
probe
through
to
be
able
to
do
that.
Let
me
tell
you
now
what
we
can
actually
do
today,
so
lcls
a
track
is
an
example
of
a
free
electron
laser.
Now,
it's
very,
very
good
for
thin
films,
molecules,
etc.
B: It's about 19 nanometers and it's a ferroelectric, okay. Ferroelectrics are very important because you're dealing with polarization switching; they're very important for FeRAMs, ferroelectric RAM, and what you want is high density. Okay, so the way you get high density is by taking nanoparticles and essentially building arrays, and so the idea is to study a nanoparticle like this. Now, because of its polarization, it's a vortex. So this is an example of a vortex; it's got a dislocation line.
B
It's
got
a
core,
that's
like
a
dislocation
line
in
the
inside,
so
you're,
seeing
a
sort
of
vortex,
nanoparticle
being
image,
and
now
you
can
take
slices.
So
these
are
slices
inside
this
guy.
Ok,
and
you
can
see
that
under
the
action
of
a
field,
this
vortex
Center
actually
is
sort
of
you
know,
shifting
in
the
medium
itself.
So
this
is
beautiful,
because
I
can
exactly
see
inside
its
3d
imaging
and
you've
got
all
the
beautiful
data.
This
is
what
we
can
do
now.
This
requires
about
this.
B
Will
this
sort
of
generates
about
one
and
a
half
terabytes
of
data
per
sample?
So
so
so
this
is
the
state
of
the
art,
ok,
and
so,
where
we're
going
is
so
we've
been
working
on
Murray
at
least
I've
been
working
on
Murray
since
2008
we've
got
it
to
the
point
where
the
critical
decision
zero
has
been
has
been
approved,
and
it's
now
going
through.
You
know,
there's
a
whole
process
when
you
build
facilities
has
to
go
through
CD,
1,
CD,
2,
etc,
and
it's
going
to
be
on
the
Mesa.
It's
in
fact
really
become.
B: So that's where we're going. In terms of data, where we are now is that we basically get one VISAR plot, that's a velocity profile, when you do a shock experiment; okay, you know, you basically get one every few days. With MaRIE we will be able to get hundreds of these, okay, and that's where the large data starts to enter this game. So that's what we're heading towards, and of course the integration, this co-design loop, is very, very critical.
B: So that's what I will address, and then I will just show you how we can use very similar methods to look at data from facilities, okay, like for example APS here, right. And so, you know, the kind of data that I'm talking about here is very, very small, and so the best way to show you how we've been doing this is by an example. So here's my example: I want to find an alloy, it happens to be a nickel-titanium alloy, with the lowest hysteresis, thermal hysteresis. Okay.
B
So
now,
as
you
probably
know,
nickel
titanium
is
a
shape-memory
alloy,
so
that
simply
means
that
I
essentially
prepare
it
in
some
shape,
give
it
some
shape.
I
I
sort
of
look
at
it
in
the
martensite
phase,
low
symmetry
phase,
and
then
I
can
deform
it
to
my
heart's
content.
But
when
I
then
heat
it
up
across
the
transition,
the
structural
transition,
it
recovers
the
shape
that
was
given
to
it.
It
recovers
the
screens.
B
Okay,
the
way
you
monitor,
that
is
through
differential
scanning
calorimetry,
and
so
you
basically
look
at
the
heat
flow.
So
you
heat,
you
get
a
Heaton
and
you
cool
you
get
essentially
a
peak.
The
hysteresis
is
the
interval
between
those
two
right
so
for
a
material.
What
you!
What
you
want
is
to
minimize
that
hysteresis.
Why
do
you
want
to
minimize
that
it's
theresa's,
because
that's
what
affects
fatigue
and
you
want
something
that
can
go
through
many
many
cycles
without
fatigue?
That's
your
target!
That's
what
you
want.
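In code, the quantity being minimized is just the separation between the two DSC peaks. A minimal sketch, assuming synthetic Gaussian heat-flow curves in place of real calorimetry data (the peak positions below are made up for illustration):

```python
import numpy as np

# Temperatures scanned and synthetic heat-flow curves for the heating
# and cooling runs (illustrative Gaussians, not real DSC data).
T = np.linspace(250, 350, 1001)                     # kelvin
heat_flow_heating = np.exp(-((T - 310) / 4) ** 2)   # peak on heating
heat_flow_cooling = np.exp(-((T - 285) / 4) ** 2)   # peak on cooling

# Thermal hysteresis = interval between the two peak temperatures.
delta_T = T[np.argmax(heat_flow_heating)] - T[np.argmax(heat_flow_cooling)]
print(f"delta T = {delta_T:.1f} K")  # 25.0 K here, about the NiTi spread quoted next
```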
B
How
are
you
going
to
do
this
so
so
here,
for
example,
is
nickel,
titanium,
okay,
which
is
which
is
used
in
industry
a
lot,
and
you
can
see
that
the
spread
is
of
the
order
of
25
k,
okay
with
cycles
and
also
the
interval.
This
delta
T
is
also
of
the
order
of
25
30
k,
so
our
strategy
was
ok,
we're
looking
for
a
multi-component
alloy
with
very
very
small
sum
of
hysteresis,
and
so
our
domain
knowledge
told
us
that
we're
going
to
restrict
ourselves
to
this
family.
B: Now, the problem is very simple: I want to know what are the x, y and z which will minimize the hysteresis. That's my problem, okay, and clearly the space is very large. Our experimental friends had the ability to control the composition to 0.1 percent, so if I use that, the search space, the number of possibilities of x, y and z, is of the order of 800,000. So now what you're doing is, you know, you've got this vast search space and you want to find which particular composition is going to minimize the thermal hysteresis.
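A hedged sketch of where a number of that magnitude comes from: with composition controlled to 0.1%, each dopant fraction lives on a 0.1% grid, and you count the grid points allowed by the alloy family's constraints. Only the 0.1% step and the roughly 800,000 total are from the talk; the ranges and dopant budget below are hypothetical placeholders chosen to show the counting logic.

```python
# Count candidate compositions on a 0.1% grid, working in integer
# tenths of a percent to avoid floating-point edge cases.
count = 0
for x in range(0, 201):          # hypothetical dopant 1: 0.0 .. 20.0 %
    for y in range(0, 51):       # hypothetical dopant 2: 0.0 .. 5.0 %
        for z in range(0, 201):  # hypothetical dopant 3: 0.0 .. 20.0 %
            if x + y + z <= 200:  # hypothetical 20% total dopant budget
                count += 1
print(f"{count:,} candidate alloys")  # order 8e5, the scale quoted in the talk
```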
B
That's
the
problem,
and
how
are
you
going
to
do
it?
The
key
point
is
that
uncertainties
are
very,
very
important.
They
will
allow
us
to
in
many
ways
search
that
vast
search
space.
So
the
strategy
that
we
came
up
with
is
this.
We
essentially
starts
with
our
compositions.
We
identified
the
material
descriptors
or
the
features,
and
here
domain
knowledge
was
very
important.
We
know
if
we
knew
from
the
literature
that
things
like
the
valence
electron
number
per
atom
is
important.
It
affects
the
thermal
hysteresis.
It
affects
the
transition
temperatures,
a
lot
of
work
published
atomic.
B
The
radii
of
the
of
the
of
the
chemistry
the
species
is
also
important,
so
we've
essentially
down
selected
a
set
of
features.
We
didn't
want
the
featureless
to
be
too
large
because
that
really
explodes
the
sort
of
degree
of
difficulty
in
terms
of
adamant
high
dimensionality
of
the
problem.
You
want
it
to
be
small,
yet
you
wanted
to
it
to
be
able
to
say
something,
then
what
we
did
was
to
essentially
do
what
everybody
else
does
you
do
inference?
So
this
is
when
people
talk
about
materials
informatics,
this
is
really
what
they
mean.
B
I'm
going
to
do
some
kind
of
regression.
Okay,
you
can
use
your
favorite
tool
box
off
the
web
scikit-learn
and
you
will
do
inference.
That's
what
I
mean
there,
but
the
key
point
that
we
realized
was
that
it's
really
the
design,
that's
critical.
How
do
we
choose
the
next
experiment,
the
next
sort
of
experiment
that
has
to
be
done?
Okay,
this
is
not
a
one-shot
deal.
I
make
a
prediction
here:
it
is
good
for
you,
you
really
have
to
iterate,
so
you
make
you
make
certain
predictions.
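As a concrete, minimal version of that inference step: this is my sketch rather than the speaker's actual pipeline; the features and measured values are random placeholders, and a Gaussian process is used because the design step that follows needs predictive uncertainties.

```python
import numpy as np
from sklearn.gaussian_process import GaussianProcessRegressor
from sklearn.gaussian_process.kernels import RBF, WhiteKernel

# Toy training set: rows are synthesized alloys, columns are down-selected
# descriptors of the kind mentioned in the talk (valence electron number
# per atom, atomic radii, ...). Values here are made up for illustration.
X_train = np.random.default_rng(0).uniform(size=(22, 3))    # 22 alloys, 3 features
y_train = np.random.default_rng(1).uniform(5, 40, size=22)  # measured delta T (K)

# A GP gives both a prediction and an uncertainty, which the design step
# below needs in order to trade off exploration against exploitation.
gp = GaussianProcessRegressor(kernel=RBF() + WhiteKernel(), normalize_y=True)
gp.fit(X_train, y_train)

X_candidates = np.random.default_rng(2).uniform(size=(1000, 3))  # unexplored alloys
mu, sigma = gp.predict(X_candidates, return_std=True)  # mean and "error bar"
```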
B
You
suggest
certain
alloys
for
the
experimentalist
to
do,
and
we
suggested
for
so
that
in
itself
is
a
nice
informatics
problem.
What
are
the
best
for
that?
You
should
suggest,
and
then
the
experimentalist
goes
and
makes
them.
We
then
put
that
back
it
augments
the
data
set
the
training
data
and
we
keep
going.
This
is
how
we
iterated
to
actually
come
up
with
the
solution
to
this
problem.
Okay,
now
before
I
give
you
the
solution
to
this
problem.
You
know
there
I
want
to
make
this
point.
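Continuing the sketch above (it reuses gp, X_train, y_train, and X_candidates from it), this is the shape of that suggest-measure-augment loop. The nine loops and the batch of four are from the talk; the expected-improvement selector and the measure() stand-in for the experimentalist are my assumptions, since the talk doesn't name the exact selection rule used.

```python
import numpy as np
from scipy.stats import norm

def expected_improvement(mu, sigma, best_y):
    # EI for minimization: rewards low predicted delta T (exploitation)
    # and large predictive uncertainty (exploration).
    z = (best_y - mu) / np.maximum(sigma, 1e-12)
    return (best_y - mu) * norm.cdf(z) + sigma * norm.pdf(z)

def measure(x):
    # Stand-in for the experimentalist synthesizing and measuring an alloy.
    return float(np.random.default_rng(int(1e6 * x[0])).uniform(5, 40))

for loop in range(9):                       # the study ran nine loops
    mu, sigma = gp.predict(X_candidates, return_std=True)
    ei = expected_improvement(mu, sigma, y_train.min())
    picks = np.argsort(ei)[-4:]             # suggest four alloys per iteration
    new_y = np.array([measure(x) for x in X_candidates[picks]])
    X_train = np.vstack([X_train, X_candidates[picks]])  # augment training data
    y_train = np.concatenate([y_train, new_y])
    gp.fit(X_train, y_train)                # refit, and keep going
print("best delta T found:", round(y_train.min(), 1))
```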
B
Come
back
to
this
point,
so
inference
is
really
not
adequate
by
itself.
You
really
need
to
explore.
So,
let's
see
what
of
what
is
when
doing
when
one
does
influence
regression.
Ok,
you're
in
you've
got
it
you've
got
some
data
set
somebody's
given
you
and
what
your
empirically
doing
is
constructing
some
function.
F
of
X
can
be
least
square.
For
example.
You
know
you
all
done
be
squares
right,
so
so
on,
though,
on
the
left
there
I
have
a
plot
of
exactly
something
like
that
where
I'm
showing
you
for
the
22
compounds.
B
That
are
that
I
showed
you.
I
have
the
predicted
delta
T
against
the
measure
delta
T,
and
you
feel
that's
pretty
good.
I
already
have
a
nice
model
and
really
what
I'm
looking
for
is
small
delta
T.
So
what
may
occur
to
you
is
that
hey
I
have
an
outlier.
You
see
right
there,
large
uncertainty
and
anyway
I'm
looking
for
a
small
delta
T,
that's
not
so
important.
I
don't
want
to
sample
their
your.
B
You
will
be
tempted
to
basically
throw
that
away
bad
okay,
because
there's
a
large
uncertainty
associated
with
it
and
you
don't
know,
what's
going
to
happen
in
the
next
step
me
tell
you
what
actually
happens
in
the
next
step
we
sort
of
we
predicted
for
and
so
the
one
this
one
still
allows
for
a
small
delta
T.
It
still
allows
for
it,
but
I,
don't
know
what
the
result
is.
So
we
went
in
and
and
and
and
synthesized
the
compound
three
of
them
very
nicely
made
the
model,
but
the
fourth
one.
B
Essentially
we
be
predictably
measured.
One
was
quite
large
okay,
so
that
tells
you
in
subsequently
that
that's
not
going
to
be
important.
But
a
priori,
you
don't
know
that
what
this
is
telling
you
is
that
there
is
a
landscape
in
feature
space
which
has
local
minima,
and
it's
very
important
for
you
to
explore
this
to
be
able
to
get
the
sort
of
best
global
minimum.
B
If
you
can
and
you
mustn't
pro
stuff
away
because
then
you're
not
exploring
you're,
not
getting
the
best
results,
it
is
suboptimal
and
we
can
actually
show
that
because
what
we
did
was
to
give
ourselves
a
pest
problem.
I
have
a
data
set.
This
happened
to
be
max
phases
of
220
compounds
and
then
I
basically
asked
myself
and
I
know
the
elastic
moduli
I
asked
myself.
I
want
the
compound
with
the
largest
elastic
modulus.
Let's
do
it,
and
so
this
is
number
of
new
measurements
against
the
initial
number
of
measurements
in
your
training
data.
B
But
if
you
do
them
using
inference
what
usually
people
do
in
materials
informatics,
then
you
can
see
that
what
you
will
get.
So
that's
pure
exploitation.
You
will
get
something
like
this,
but
the
best
result
is
when
you
actually
do
statistical
design.
So
you
can
see
it
doesn't
matter
which
strategy
you
use
in
statistical
design.
They
all
work
reasonably
well
and
they
give
you
the
best
results
within
35
new
measurements.
I
can
get
the
best.
I
can
get
the
compound
with
the
best
modulus
okay.
B
So
that
basically
brings
me
to
this
slide,
which
really
is
not
new
industry
has
really
known
about
this
for
a
long
time
and
the
Operations
community
has
known
about
this
is
for
a
long
time.
It's
really
a
matter
of
using
uncertainties
to
sort
of
balance,
the
trade-off
between
exploration
and
exploitation.
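That trade-off is visible in the standard expected-improvement selector used in the loop sketch earlier; the talk doesn't write this formula out, so this is the textbook form for minimizing delta T, with GP mean mu(x), standard deviation sigma(x), and current best measurement y*:

```latex
\mathrm{EI}(x) =
  \underbrace{\bigl(y^* - \mu(x)\bigr)\,
      \Phi\!\Bigl(\tfrac{y^* - \mu(x)}{\sigma(x)}\Bigr)}_{\text{exploitation: low predicted } \Delta T}
  \;+\;
  \underbrace{\sigma(x)\,
      \phi\!\Bigl(\tfrac{y^* - \mu(x)}{\sigma(x)}\Bigr)}_{\text{exploration: large uncertainty}}
```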
B
This
happens
to
be
a
gaussian
process
model,
and
what
you
see
here
are
the
data
points
where
you
don't
have
uncertainty.
You
know
the
stuff,
but
where
you
don't
have
data
points,
you
have
these
footballs
of
uncertainty
and
that's
where
you
need
to
go
and
explore.
Okay,
very
important
and,
as
I
said,
a
lot
of
these
ideas
are
not
new
they've
been
used
in
the
aerospace
industry.
Also
in
the
auto
industry.
B
They
come
the
classic
ideas
that
go
back
to
Howard
and
Kushner
30
40
years
ago
on
the
value
of
information,
and
so
they
recognize
that
what's
important,
if
you
want
to,
if
you
have
complex
calculations
that
are
going
to
take
a
look
many
many
days,
you
really
need
to
choose
the
best
infill
points.
You
really
need
to
address
the
issue
of
what
are
the
best
response
surfaces.
B
So
a
lot
of
this
goes
under
the
heading
of
surrogate
based
modeling,
and
all
we
did
was
to
take
those
lessons
from
these
guys
and
actually
implement
them
on
materials
data
sets.
So
this
is
the
result
that
we
got
from
our
study.
The
alloy
that
we
found,
which
had
the
smallest
form
of
dissipation,
is
right.
There
I,
certainly
wouldn't
have
been
able
to
find
that
otherwise
and
we
got
it
in
the
sixth
iteration.
B
So
in
our
actual
exercise
we
went
through
nine
loops.
We
made
36
predictions
because
you
were
giving
for
each
time.
So
we
come.
We
synthesize
36
compounds,
14
of
them
were
better
than
the
best
in
our
training
set,
and
so
the
p-value
is
very
small.
There's
no
way
that
we
could
have
found
this
compound
on
a
random
basis.
If
you
want
to
know
how
good
it
is.
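A hedged sketch of the kind of significance check behind that claim. The 36 compounds and 14 successes are from the talk; the base rate, the chance that a randomly chosen alloy beats the best training alloy, is a made-up placeholder, since the talk doesn't state it.

```python
from scipy.stats import binom

n_synthesized = 36  # from the talk: nine loops, four suggestions each
n_better = 14       # from the talk: better than the best training alloy
p_random = 0.01     # ASSUMED base rate for a random alloy beating the best

# P(at least 14 successes out of 36) under random selection.
p_value = binom.sf(n_better - 1, n_synthesized, p_random)
print(f"p-value ~ {p_value:.1e}")  # vanishingly small under this assumption
```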
B
This
shows
you
that
the
shift
in
in
in
the
temperature
and
in
delta
T
you
over
60
cycles
is
very,
very
small
compared
to
something
like
nickel
titanium,
and
one
of
the
things
to
point
out
is
that
our
compound
is
very
competitive
in
the
in
the
landscape
of
compounds
but
notice
that
the
the
transition
temperature
is
also
in
the
right
window.
Now
we
didn't
design
for
that.
B
That
was
fortuitous,
but
it
shows
you
that
the
right
way
to
do
these
things
is
through
multi-objective,
optimization,
ok,
so
this
strategy,
we
have
subsequently
been
using
to
address
the
problem
of
that
the
facilities
care
about
the
whole
problem
of
reconstruction.
Ok,
where
I
want
to
sort
of
choose
very
very
fast.
I
want
to
get
very,
very
fast
the
orientations
of
crystal
orientations
which
will
match
the
detector
pattern,
because
that's
what
you
care
about
in
the
microstructure
I
want.
The
crystal
orientations-
and
so
we've
actually
implemented
this
and
it
works
very,
very
competitively.
B: The big warning here is that there is no free lunch, okay. Materials informatics is really fraught with a lot of issues, and there's a very famous theorem called the no-free-lunch theorem, which basically says that there is no universal optimizer. Something that I do on a given data set, a model that I come up with for a given data set, there's no assurance that that's going to work on a slightly different data set; okay, no assurance at all. So you have to be exceedingly careful.
B
There
are
no
results
to
guide
you
here
in
classification
for
binary
classification.
You
have
a
result
that
can
actually
guide
you,
but
here
there
are
no
results.
You
really
have
to
be
very,
very
careful
things
like
when
you
have
small
data
sets
things
like
cross-validation.
Don't
work
very
well.
The
bioinformatics
people
really
know
this
well,
because
they've
got
few
patients
and
they've
got.
You
know
thousands
of
genes,
a
very
large
feature
space
and
that's
fraught
with
a
lot
of
difficulties,
and
so
you
have
to
use
these
methods
very
carefully.
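A minimal illustration of that cross-validation caveat, entirely my construction: a tiny random regression problem, scored with 5-fold CV under different shuffles to show how widely the estimates scatter on small data.

```python
import numpy as np
from sklearn.linear_model import LinearRegression
from sklearn.model_selection import KFold, cross_val_score

rng = np.random.default_rng(0)
X = rng.normal(size=(20, 5))  # 20 samples, like a small alloy data set
y = X @ rng.normal(size=5) + rng.normal(scale=0.5, size=20)

# Re-run 5-fold CV with different shuffles: on 20 points the R^2
# estimate moves a lot from split to split.
scores = [
    cross_val_score(LinearRegression(), X, y,
                    cv=KFold(5, shuffle=True, random_state=seed)).mean()
    for seed in range(10)
]
print(f"CV R^2 spread: {min(scores):.2f} .. {max(scores):.2f}")
```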
B
But
one
thing
we
have
found
is
that
the
design,
this
exploration,
exploitation
start
strategy
really
in
some
sense
make
makes
amends
for
the
lack
of
an
adequate
inference
model.
That's
a
very
interesting
thing.
Usually
people
just
want
a
good
regression
model.
We've
discovered
that
the
designer
limit
actually
is
quite
forgiving
of
the
paucity
of
the
inference
model.
B
You
know
what
I've
talked
about
is
a
data-driven
approach,
but
I
think
that's
not
adequate.
We
really
need
to
bring
in
theory.
We
need
to
bring
in
theory
and
relationships,
constitutive
relationship
scaling
relationships
to
constrain
the
search
base
and
how
we
do
that
is
outstanding
challenge,
so
I
think
I
think
it's
data-driven,
plus
knowledge
that
should
give
better
predictions
rather
than
just
data
driven
by
itself.