From YouTube: Week 10 - Hidden Physics Models - Maziar Raissi
Description
More about this lecture: https://dl4sci-school.lbl.gov/maziar-raissi
Lecture slides: https://drive.google.com/file/d/1pfPs-ll_ffq7SYMZVWPISnfwTG6oLJHC/view?usp=sharing
The Deep Learning for Science School: https://dl4sci-school.lbl.gov/
And in simple terms, the hydrofoil is going to change into an airfoil, and it's going to help that vessel lift from the water and then start flying.
And you can imagine that you are doing 4D MRI: we are trying to simulate, for the MRI, basically space-and-time imaging, and it's common practice.
From that, it's going to be an inverse problem, and a very complicated inverse problem. We want to reconstruct what the flow looks like. That's the exact dynamics, because we are simulating it, so we know how the exact dynamics is going to look: these are the streamlines, and the streamlines are color-coded by the pressure. And that's what the algorithm learns.
What am I not going to be talking about today? These topics. I'm not going to be talking about image classification, where you have an image, and it's a high-dimensional object, because you're going to have a lot of pixels, and the pixels are basically your dimensions, and every pixel has three channels: red, green, blue. So that's a very high-dimensional object.
It's very expensive computationally, because of each data point that you're going to collect; for us, it was taking six hours. We had this Reynolds-averaged turbulence model, and it was really expensive.
Take that geometry and feed it into your favorite CFD solver.
That's training: we want to maximize the likelihood, and the normal distribution has an exponential term. If you write down the formulation, it has an exponential, and we want to get rid of the exponential; that's why we take a log. And in machine learning we usually like to minimize rather than maximize, so we are going to minimize the negative of the log of the likelihood.
And that's what you're going to get. That's our objective function, and there is nothing complicated about it. It's just the normal distribution: when you take the log of the exponential, these are the terms that are going to be left, plus some constants, and the constants are not going to affect your optimization, so we're going to drop them.
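As a concrete sketch (my own, not the lecture's code), this is the negative log marginal likelihood of a zero-mean Gaussian process; the squared-exponential kernel, the log-parameterization, and the small `noise_var` jitter are assumptions I'm adding:

```python
import numpy as np
from scipy.linalg import cho_factor, cho_solve

def rbf_kernel(X1, X2, signal_var, length_scale):
    """Squared-exponential (RBF) kernel matrix between two sets of points."""
    d2 = np.sum(X1**2, 1)[:, None] + np.sum(X2**2, 1)[None, :] - 2 * X1 @ X2.T
    return signal_var * np.exp(-0.5 * d2 / length_scale**2)

def negative_log_likelihood(log_params, X, y, noise_var=1e-6):
    """NLL = 0.5 * y^T K^{-1} y + 0.5 * log|K|, constants dropped."""
    signal_var, length_scale = np.exp(log_params)  # optimize in log space
    K = rbf_kernel(X, X, signal_var, length_scale) + noise_var * np.eye(len(X))
    L, lower = cho_factor(K, lower=True)
    # 0.5 * log|K| equals the sum of the log-diagonal of the Cholesky factor.
    return 0.5 * y @ cho_solve((L, lower), y) + np.sum(np.log(np.diag(L)))
```

The exponential is gone: only the quadratic data-fit term and the log-determinant term remain, which are exactly the terms left after taking the log and dropping the constants.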
And the catch is the training step. For kriging, you usually assume that you know these parameters, these two parameters, and then everything is going to be fast and easy. But during training with Gaussian processes, you have to find the best hyperparameters, and that's going to give you the best basis function: you are trying to adapt your basis to your data, and that's the role of the training. And this step is missing from kriging.
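That training step, continuing the sketch above (this reuses `negative_log_likelihood` from the previous block; the toy data and the choice of optimizer are mine), is just a minimization over the hyperparameters:

```python
import numpy as np
from scipy.optimize import minimize

# Toy data: noisy observations of a smooth function.
rng = np.random.default_rng(0)
X = rng.uniform(-3.0, 3.0, size=(20, 1))
y = np.sin(X[:, 0]) + 0.05 * rng.standard_normal(20)

# The "training step": adapt the hyperparameters (and hence the basis) to the data.
res = minimize(negative_log_likelihood, x0=np.zeros(2), args=(X, y))
signal_var, length_scale = np.exp(res.x)
```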
Again, that's a prior. That's a choice that you have: you don't have to assume the mean is zero; you can assume something else. But even if you assume the mean is zero, you're going to get a mean after conditioning on the data, and this is going to be a non-zero mean, and you have a variance that's updated. What are the formulas for that?
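These are the standard Gaussian process regression formulas: for a zero-mean prior with kernel $k$, training inputs $X$, targets $\mathbf{y}$, noise variance $\sigma_n^2$, and a test point $x_*$,

```latex
\begin{aligned}
\bar{f}(x_*) &= k(x_*, X)\,\bigl[K(X,X) + \sigma_n^2 I\bigr]^{-1}\mathbf{y},\\
\operatorname{var}\!\bigl[f(x_*)\bigr] &= k(x_*, x_*) - k(x_*, X)\,\bigl[K(X,X) + \sigma_n^2 I\bigr]^{-1} k(X, x_*).
\end{aligned}
```

The posterior mean is non-zero even though the prior mean was zero, and the posterior variance is the prior variance reduced by what the data explains.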
And we don't see the blue curve. What are we going to do? We're going to take the mean and add 2 standard deviations to it from our prediction model. That's going to be f-bar of x, the mean, plus 2 standard deviations, and that's going to give you an upper confidence bound, this curve you see now. The blue curve you couldn't see, but the red curve you can see, and you can evaluate it. And if you can see it and evaluate it, we can maximize it.
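A minimal sketch of that acquisition rule (the helper name `gp_mean_and_std` and the grid search are my assumptions; the factor of 2 is from the talk):

```python
import numpy as np

def next_evaluation_point(x_grid, gp_mean_and_std):
    """Maximize the upper confidence bound f_bar(x) + 2*sigma(x) on a grid.

    gp_mean_and_std(x_grid) should return the posterior mean and standard
    deviation at each grid point, e.g., from the formulas above.
    """
    mean, std = gp_mean_and_std(x_grid)
    ucb = mean + 2.0 * std  # the "red curve": visible and cheap to evaluate
    return x_grid[np.argmax(ucb)]
```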
You would probably need to put a point here and do another simulation there; that's six more hours. So let's see how many hours we are saving: six hours here, another six hours here, that's 12 hours, and some more hours here, because you want to match the curve. These are the costs that we are avoiding by using the Bayesian optimization framework. And for more information, this is a very good paper to refer to.
You try to write down the correlation between the two functions now, and the model that we wrote, that's the prior assumption that we made. It might be correct, it might be wrong, but once it sees the data, it's going to fix it. And that's a very simple model: you say our high-fidelity model is a linear combination of our low-fidelity models (this could be a vector; we can have multiple models), plus a noise model.
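In symbols, this is the classic autoregressive multi-fidelity prior (the notation is mine: $\rho$ is the linear-combination coefficient and $\delta$ the discrepancy/noise term, each modeled as a Gaussian process):

```latex
f_{\mathrm{high}}(x) = \rho\, f_{\mathrm{low}}(x) + \delta(x),
\qquad
f_{\mathrm{low}} \sim \mathcal{GP}\bigl(0, k_{\mathrm{low}}\bigr),
\quad
\delta \sim \mathcal{GP}\bigl(0, k_{\delta}\bigr).
```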
So whatever I'm going to tell you started from this simple observation: the derivative of a Gaussian process is another Gaussian process. So the derivative of the Gaussian process is a Gaussian process, of course with a different kernel; the derivatives are going to go on the kernels. But that's a crucial, a crucial observation.
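Concretely, if $u \sim \mathcal{GP}(0, k(x, x'))$, the derivatives land on the kernel:

```latex
\operatorname{cov}\bigl(u'(x),\, u(x')\bigr) = \frac{\partial k(x, x')}{\partial x},
\qquad
\operatorname{cov}\bigl(u'(x),\, u'(x')\bigr) = \frac{\partial^2 k(x, x')}{\partial x\, \partial x'}.
```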
So now that this is very similar to multi-fidelity, it means that you can use it to do cool stuff with differential equations. For instance, you can try to solve them; that's the first thing that comes to mind. Let's try to solve our differential equation, but now you can solve it in a data-driven fashion.
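For a linear operator this works exactly like the derivative rule above; in my notation, placing a GP prior on the solution induces a GP on the forcing, and the cross-covariance ties the two sets of observations together:

```latex
u \sim \mathcal{GP}(0, k),
\quad f = \mathcal{L}_x u
\;\;\Longrightarrow\;\;
f \sim \mathcal{GP}\bigl(0,\, \mathcal{L}_x \mathcal{L}_{x'} k\bigr),
\qquad
\operatorname{cov}\bigl(u(x),\, f(x')\bigr) = \mathcal{L}_{x'} k(x, x').
```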
These are our observations; we start with observations, and we also have some observations on the boundary, and we use those two sets of observations. You have a lot of observations on yesterday (that's like your low fidelity), and you have very few observations on your high fidelity, and we know that the low fidelity is going to help the high fidelity. We're going to use that: we're going to condition on this data.
First, do your training, do your conditioning, and once you do your conditioning, you can write down your posterior, and that's going to give you the prediction (getting a little bit ahead of ourselves) basically one time step in the future. Perfect. So now, what did we do? We took one time step, and now we have a problem.
So it's noiseless. There is no noise; you are generating it artificially, but the points are uncertain, and that's the uncertainty. You are not sure, because your data point could be here or could be here; it has a distribution. But there is this tiny difference between being noisy and being uncertain: for uncertainty, you have a distribution; for being noisy, things are deterministic, but there is a gap between the observation and the actual truth, while the location is deterministic.
Hey, sorry, I just want to give you a quick time check. Just so you're aware, there's about 27 minutes left.
Low-dimensional systems. Actually, it could be high dimensional, but it's finite dimensional; the dimension is finite. It's an ODE, an ordinary differential equation. What we have here is for infinite-dimensional stuff: it's for functions, for function spaces; it's for partial differential equations.
So it's not like you take a Gaussian process, push it through a nonlinear differential equation, and get a non-Gaussian distribution out. No, the uncertainty is being propagated outside of your system, outside of your differential equation, and that's how the uncertainty is propagated. This uncertainty is the uncertainty of our artificially generated data points.
So there is a great question. It says: what if the assumption that u_n conditioned on u_{n-1} is normal is violated? That's common with the Kalman filter; we could use the unscented Kalman filter. What happens with GPs? You can do the same thing here.
Because if you want to measure the pressure, you are interfering with the dynamics, and the velocity is going to change. And we were trying to estimate these two parameters, basically the Reynolds number, which has something to do with the shape of the object in front of you, etc. And these are pretty good estimates, even in the presence of noise.
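A minimal sketch of that kind of parameter estimation, in the spirit of physics-informed neural networks but not the paper's code: a toy problem u_t = λ u_xx, where the physical parameter λ (standing in for something like 1/Re) is trained jointly with the network on noisy measurements of u. All names and hyperparameters here are my choices.

```python
import torch

# Network approximating u(t, x); lambda_ is the unknown physical parameter,
# learned jointly with the network weights from the data.
net = torch.nn.Sequential(
    torch.nn.Linear(2, 50), torch.nn.Tanh(),
    torch.nn.Linear(50, 50), torch.nn.Tanh(),
    torch.nn.Linear(50, 1),
)
lambda_ = torch.nn.Parameter(torch.tensor(0.0))
opt = torch.optim.Adam(list(net.parameters()) + [lambda_], lr=1e-3)

def pde_residual(t, x):
    """Residual of u_t - lambda * u_xx, computed with automatic differentiation."""
    u = net(torch.cat([t, x], dim=1))
    ones = torch.ones_like(u)
    u_t = torch.autograd.grad(u, t, ones, create_graph=True)[0]
    u_x = torch.autograd.grad(u, x, ones, create_graph=True)[0]
    u_xx = torch.autograd.grad(u_x, x, torch.ones_like(u_x), create_graph=True)[0]
    return u_t - lambda_ * u_xx

def training_step(t_data, x_data, u_data):
    """One optimizer step on the combined loss: data misfit + PDE residual."""
    t = t_data.clone().requires_grad_(True)
    x = x_data.clone().requires_grad_(True)
    loss = ((net(torch.cat([t, x], dim=1)) - u_data) ** 2).mean() \
           + (pde_residual(t, x) ** 2).mean()
    opt.zero_grad()
    loss.backward()
    opt.step()
    return loss.item()
```

The key design choice is that the PDE is enforced as a penalty in the loss, so the unknown parameter gets a gradient through the residual term and is fitted alongside the network.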
If you have a big data set, many of those data points are going to be redundant, and if that's the case, you can reduce the number of data points through our framework and then do a Gaussian process on the reduced data. It's similar to dimensionality-reduction ideas. Sometimes, yes, your problem is high dimensional, like images, but there is an underlying low-dimensional manifold in a high-dimensional space. Similarly, if you have a very huge data set, maybe most of it is redundant. If that assumption is correct, you can reduce the dimension.
Using noisy data, initial and boundary, and that's nice: you got a solution. I'm now going to a question from the audience in the Q&A. The question is: you can solve differential equations using other methods, so why would you use Gaussian processes, and why would you use neural networks or Gaussian processes? For Gaussian processes, the answer is clear: you get the uncertainty; you get a nice, visually speaking, quantitative as well as qualitative uncertainty bound around your solutions. As for neural networks:
Neural networks are going to shine in high dimensions, because they are grid-less. In 1D, yes, you can put 10 grid points and solve your differential equation using finite differencing. In 2D, you need to put 10 to the power 2 grid points; in 3D, it's going to be 10 to the power 3; in 4D, it's going to be 10 to the power 4; and in 100D, forget about it, forget about grids. So in high dimensions, neural networks shine.
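The arithmetic behind that, with the talk's 10 points per axis:

```python
# Grid points needed with 10 points per axis: 10**d explodes with dimension d.
for d in (1, 2, 3, 4, 100):
    print(f"{d}D: 10^{d} = {10**d:.3g} grid points")
```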
We can collect data behind the cylinder, on the velocity only. And then, to be honest, when we were writing that paper, we were trying to find the Reynolds number, this number and this number here, but then something interesting happened: the pressure popped out of nowhere. Well, it's popping not out of nowhere; it's popping out of our equations, out of enforcing these equations in our loss function. But we had absolutely no data on it.
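For reference, the equations being enforced in that loss are the 2D incompressible Navier-Stokes equations (written here in a standard nondimensional form, which is my choice of presentation); the pressure $p$ appears in the momentum equations even though no pressure data is observed, which is why it can be recovered:

```latex
\begin{aligned}
u_t + u u_x + v u_y &= -p_x + \tfrac{1}{Re}\,(u_{xx} + u_{yy}),\\
v_t + u v_x + v v_y &= -p_y + \tfrac{1}{Re}\,(v_{xx} + v_{yy}),\\
u_x + v_y &= 0.
\end{aligned}
```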
Thank you very much. Yeah, perfect timing. I guess if you still have time, you can stick around to answer some of the other questions on the call. If anybody needs to drop off, though, thank you very much for coming, and we'll see you for the next one.
The cool thing about hidden physics models is that you are not going to go through your differential equation, because the differential equation is there inside your prior; it's inside your loss. In the end: how would you suggest tackling PDEs when certain components of them are small compared to others?
Can you elaborate a bit more on how PINNs differ from running a neural network on simulated data? That's exactly why I spent so much time on the first part of the talk: to tell you that there are two ways to go about physics-informing things. One is: you do your simulation, get some data, and then apply and deploy a deep learning framework, a usual deep learning framework, on your data.
Okay, so it looks like you answered all the questions, Maziar. Thank you so much for going through all of those. I guess we can close this session now. Thanks for sticking around to answer those again, and thanks to everybody for coming and asking all these great questions. That was a great presentation; I think folks learned a lot, once again.