From YouTube: CESM Workshop: Machine Learning, CESM-related Efforts
Description
The 26th Annual CESM Workshop will be a virtual workshop with a modified schedule on its already scheduled dates. The virtual workshop will begin with a full-day schedule on 14 June 2021, with morning presentations on the state of the CESM, by the award recipients, and by three invited speakers, followed in the afternoon by roughly 15-minute highlight and progress presentations from each of the CESM Working Groups (WGs).
On 15-17 June 2021, working groups and cross-working groups hold half-day sessions, some with presentations and some that are discussion only.
A: Welcome to the machine learning session. Let me just clear my screen. Katie Dagon from NCAR and I, Christiane Jablonowski from the University of Michigan, are hosting the session. Before we get started: we have a slate of speakers and then also discussion periods, and I would like to remind you (oops, now I'm going too fast) of the code of conduct.
A: Okay, before we get started, we would like to advertise a few events, and this actually includes past events. These are opportunities for people who are maybe new to machine learning. Some are upcoming opportunities, like this one: here we are looking at an opportunity this summer for grad students and postdocs (but really everybody) to attend and participate in an artificial intelligence summer school, organized by NCAR and partners.
A: This is an opportunity for newcomers to get started with machine learning, with a focus on trustworthy AI. If you are really new to machine learning, maybe before you attend this summer school, take a look at last year's summer school from NCAR. That was a virtual summer school held at NCAR last year, and all recordings and lecture notes are online. It really provides a fundamental background in AI and machine learning for the Earth system sciences. I highly recommend it.
A: Last but not least, or at least a few more opportunities: we have now twice had AGU tutorials on machine learning. Here you see links to the events that happened in 2019 and 2020, organized by Karthik Kashinath from Lawrence Berkeley National Laboratory. It's a hands-on tutorial, so you have access to Google Colab notebooks, for example, and the presentations are online, including a recording. Again, a wonderful opportunity to get started.
A: There is an upcoming workshop in a NOAA series; this is now the third instance of that series, a workshop on leveraging AI in the environmental sciences. Note the deadline: if you want to contribute a talk, it is tomorrow. So there's still time, but the deadline is tomorrow. The workshop will be in September this year, partly in person and partly virtual. There are additional opportunities as well; these are past events, but notes and recordings are all online.
A: These include the second NOAA workshop on leveraging AI in the environmental sciences. There is also a U.S. climate working group with a webinar series that just ended, this May I believe, and then there were also a few webinars from the European Centre for Medium-Range Weather Forecasts.
A: Again, I recommend you check these out. Okay: today's goal of this cross-working-group session is to bring people together. As we know, machine learning is a rapidly growing field, and often machine learning efforts are pretty scattered; that's even true within NCAR, and Katie will give us an overview of the NCAR activities. The goals of this cross-working-group session are to network, to inform each other about ongoing or planned ML activities related to CESM, and to provide a discussion forum.
B: Thanks, Christiane. I'll also say that the slides we were just presenting should be posted on the CESM workshop website. If you want to follow some of those links, just check back after the workshop and you should be able to find the slides. So, we have a great set of talks this afternoon.
C: Thanks everybody for joining us, and thanks for organizing the session. It's always pretty exciting to see such a large community, both within CESM and within machine learning, coming together and thinking about these problems. So thanks again. All right: what I want to show you are pieces of work we've been up to in the last few years, tackling the problem of ocean process parameterizations. Then I'll give you a little update on M2LInES, a large international collaboration using scientific machine learning to improve parameterizations in the ocean, sea-ice, and atmospheric components of climate simulations, with NCAR of course being a big partner, and CESM front and center.
C: Of course I'm going to focus on the ocean component, because I'm an oceanographer. I'm the one presenting and getting the invite, but all the credit goes to members of my group: Tom Bolton, who was a PhD student with me back when I was at Oxford (that work is on GitHub), and Arthur Guillaumin, who was a postdoc at NYU and is starting a faculty job back in the UK in September.
C: Okay, I don't need to convince you; I'm preaching to the choir. We all love climate models, and they're great: we use them for prediction and we use them to understand processes. But we also know they have limitations. Here I'm showing dynamic sea level in a one-percent-CO2 experiment. Up here is the CMIP5 ensemble, and down here is the same for the CMIP6 ensemble, and what we see, of course, is a strong signal in dynamic sea level.
C: You can think of it as integrated ocean heat content, with a very large signal in the North Atlantic and in the Southern Ocean, for example, in either ensemble; those basins respond very strongly to one percent CO2 in terms of sea level. What we see here on the right is the model spread in CMIP5 and CMIP6, and unfortunately, again in the North Atlantic and Southern Ocean, the spread is actually as large as the signal.
C
So
in
many
regions
of
the
ocean,
the
uncertainty
or
the
you
know.
Basically,
the
the
response
of
the
climate
models
can
be
very
different
in
regions
where
actually
the
the
responses
might
be
very
strong
and
we
were
able
to
actually
understand
the
spread
or
at
least
pin
it
down
to
the
ocean
component
to
the
ocean
models
and
the
parameterization
within
them
in
particular.
Mixing
and
eddies
are
are
kind
of
critical
pieces
here
that
actually
set
the
spread
and
set
the
uncertainty.
C: Of course, we've been dealing with parameterizations in climate models for a very long time, and they're usually based on physical understanding of the processes, written down as mathematical expressions. So here, again preaching to the choir, our idea is: can we use machine learning to parameterize those processes? Let's not assume we know what the physics is going to look like mathematically.
C: Let's just give the algorithm a choice, given the data. The idea is to take data, either from observations (unfortunately in the ocean we're a little limited, so we focus mostly on high-resolution simulations), and ask the algorithm to pick out the missing terms: what is the missing forcing at coarse resolution that we would need to add to faithfully represent the physics that is unresolved? Everything I'm going to show you tries to blend physics and machine learning together.
C: So we're not going in completely blind, but I'll give you a couple of examples. As I said, we're going to come up with parameterizations of small-scale processes. For example, that would be a 100-kilometer grid box, which is the resolution of a typical climate model, and all the physics underneath it, the ocean turbulence, is unresolved and must be parameterized. That's what my group has really been focused on: ocean turbulence parameterization at the mesoscale, so roughly 10 to 100 kilometers.
C: There are a few other papers tackling different aspects. There's one paper on vertical mixing, which is a very nice way to use machine learning, a neural network, and of course some of our colleagues here at NCAR have been working on parameterizing mesoscale kinetic energy; I think that paper might be out already.
C: Here I'm going to show you a couple of examples from our group: what we found, what works, what doesn't, and where we're going with it. In the first example we take high-resolution simulations, filter and coarse-grain them, and extract the missing forcing that the coarse-resolution model should have. So basically we're trying to diagnose a parameterization from a high-resolution model, and we're going to ask a convolutional neural network to learn that missing forcing given the large-scale, resolved velocity. Again, you can think of it as looking for some function, given the data, that depends only on the resolved scales, and in our case the resolved scales are the resolved velocity fields.
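The filter-and-coarse-grain step described here can be sketched in a few lines. This is a minimal illustration rather than the speaker's actual pipeline: it block-averages a high-resolution velocity field onto a coarse grid and diagnoses the missing (eddy) zonal-momentum forcing as the difference between the coarse-grained advection and the advection computed from the coarse-grained fields. The grid sizes, filter, and function names are all assumptions.

```python
import numpy as np

def coarse_grain(field, factor):
    """Block-average a 2D field onto a grid `factor` times coarser."""
    ny, nx = field.shape
    return field.reshape(ny // factor, factor, nx // factor, factor).mean(axis=(1, 3))

def subgrid_forcing(u, v, factor, dx=1.0):
    """Missing zonal-momentum forcing the coarse model should see:
    S = coarse(u du/dx + v du/dy) - (u_bar du_bar/dx + v_bar du_bar/dy)."""
    def adv(u_, v_, d):
        dudx = np.gradient(u_, d, axis=1)
        dudy = np.gradient(u_, d, axis=0)
        return u_ * dudx + v_ * dudy
    ub, vb = coarse_grain(u, factor), coarse_grain(v, factor)
    return coarse_grain(adv(u, v, dx), factor) - adv(ub, vb, dx * factor)
```

A network would then be trained to predict `subgrid_forcing` from the coarse-grained velocities alone.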
C: Now, what is the neural net doing? I'm not going to go into the details of the architecture; I just want to say one thing. It will certainly optimize over this data set to find the best function, but we have to trick it, because the neural net doesn't know about conservation principles. It has no idea about the physics: it just sees images, tries to match those images, and finds the best fit.
C: What we've done in this work: we know we need to conserve momentum, because at the end of the day I need to take that parameterization and put it in a coarse-resolution model. If I have a net input or a net sink of momentum, nothing good is going to happen, as you can imagine. So what we do is learn the different components of the tensor, and at the end we have a fixed layer that takes the divergence of that tensor.
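The momentum-conserving trick (learn tensor components, then apply a fixed divergence as the final layer) can be illustrated with a hedged numpy sketch. The real model is a convolutional network; the centered periodic differencing below is an assumption, chosen because it makes the conservation property easy to see: the forcing sums to zero over the domain, so no net momentum is injected.

```python
import numpy as np

def divergence(Txx, Txy, dx=1.0):
    """Fixed (non-trainable) final layer: the zonal momentum forcing is the
    divergence of the learned stress-tensor components. With periodic
    centered differences, every entry of Txx and Txy appears once with a
    plus sign and once with a minus sign, so the forcing sums to zero."""
    ddx = (np.roll(Txx, -1, axis=1) - np.roll(Txx, 1, axis=1)) / (2 * dx)
    ddy = (np.roll(Txy, -1, axis=0) - np.roll(Txy, 1, axis=0)) / (2 * dx)
    return ddx + ddy
```

However the network fills in `Txx` and `Txy`, the resulting forcing conserves momentum by construction.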
C: The second example is a little closer to what we like as climate scientists and as physicists. Let's take a step back and, rather than asking the algorithm to learn a humongous function, try to constrain it a bit. We're still going to do the same exercise: take a high-resolution model, diagnose the missing forcing for a coarse-resolution model, but now use a different machine learning algorithm, called a relevance vector machine.
C: It's a sparse Bayesian regression, and what the regression does is this: we give it a library of functions, and the library of functions is based on data. So we give it, basically, images of velocities, gradients of velocity, higher-order derivatives, and so on, and we have the algorithm prune through that library of functions and select the ones that best match the forcing. The beauty of it is that you don't end up with thousands of weights multiplying things you can't interpret.
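As a sketch of this idea (not the group's actual code), scikit-learn's `ARDRegression`, a sparse Bayesian method closely related to the relevance vector machine, can prune a small library of candidate terms built from a velocity field. The library and the synthetic "missing forcing" target below are invented for illustration.

```python
import numpy as np
from sklearn.linear_model import ARDRegression

rng = np.random.default_rng(0)
u = rng.standard_normal((64, 64))        # a stand-in "velocity" snapshot

# Candidate library: gradients, a Laplacian, and a nonlinear product.
dudx = np.gradient(u, axis=1)
dudy = np.gradient(u, axis=0)
lap = np.gradient(dudx, axis=1) + np.gradient(dudy, axis=0)
library = np.column_stack([f.ravel() for f in (dudx, dudy, lap, dudx * dudy)])

# Synthetic target built from only two library terms.
target = (2.0 * lap - 0.5 * dudx * dudy).ravel()

# Sparse Bayesian regression prunes the library to the relevant terms.
model = ARDRegression().fit(library, target)
```

The fitted `model.coef_` stays interpretable: each weight multiplies a named physical term, unlike the thousands of weights in a deep network.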
C: I'll show you the good and the bad about those two approaches. The first step, of course, is to see whether we're doing a good job, so we do some testing offline. As I said, what we're trying to do in these idealized simulations is learn the true missing forcing. So we have x here and y here; that's the mean missing mesoscale forcing in the momentum equation, and that's the standard deviation. That's what we're trying to learn.
C: This is what the neural net does, the mean and standard deviation; this is what the equation discovery gave us, again mean and standard deviation; and those are the correlations between what the machine learning algorithm has learned and the truth. In both instances you can see that both machine learning algorithms do a great job overall. The neural net does better in many regions of this domain and does worse in others, where there's actually very low turbulence.
C: So maybe it doesn't matter there, but it can't predict the waves very well. The relevance vector machine ended up with a tensor; I'm not explaining the physics here, because that would take me more than the five minutes I have, but it really depends on the stretching and shearing of the fluid, which we know are important ingredients for parameterizing mesoscale turbulence. So in both instances the networks do a great job.
C: The advantage of the equation discovery, even though it has lower skill offline, is that we were able to understand all of its pieces. With the neural net it was much harder, because I still have no idea why it does such a great job (both of them generalize very well, by the way). But this is just to give you a flavor of the type of methods we can use, working directly with data, to extract information for parameterizations of the missing forcing.
C: All of that is offline; at the end of the day I need to plug it into a coarse-resolution model. So, of course, we started with something very simple: a barotropic double-gyre model. If you're an oceanographer, it's the simplest possible model you could come up with. We have a jet in the middle, a bit as before, and this is a coarse-resolution model with no parameterization. That's the zonal velocity, again with x and y over here.
C: That's the standard deviation of the zonal velocity, and that's the higher resolution: this was 30 kilometers and that's less than four kilometers. So there is a lot more turbulence; the flow is a lot more energetic. If we add the parameterization, this is what the flow field looks like when we plug the equation discovery into the coarse resolution, and you can see we're basically recovering a lot of the turbulence in the flow field.
C: And that's if we implement the neural network; actually the neural network does a better job. In both instances we had to tune down the parameterization. With the equation discovery the model blew up: even though it was an equation, the model became unstable very quickly, so we had to halve the coefficients.
C: The neural network never became unstable, but rather gave us a solution that was highly unphysical. It basically completely forgot that there was wind forcing and gave us a gigantic eddy that took up essentially the entire domain. There was no jet anymore; there was just a massive inverse cascade happening, sweeping up all the pieces of turbulence.
C: Again, I'm telling you the good and the bad: there are many things that can happen. But nonetheless, in both instances, with some tuning, we're able to recover the properties of a high-resolution simulation without the cost of the high resolution.
C: All of that is very idealized. Next up is implementing it in more complex models: we started in baroclinic models, but of course MOM6 is our next step, as part of M2LInES, which, as I mentioned at the beginning, is a new international collaboration. I was asked to say specifically what we would implement, so: we're going to implement some of the equation-discovery parameterizations.
C: Both the momentum and buoyancy ones that we discovered in those pre-trained models. We're also going to implement the deep-learned parameterizations; those are a little trickier because of the form of the parameterization. I talked about this one, and there's a new one that we trained using data from coupled climate models, CM2.6, and that's actually a stochastic parameterization, where we learn the mean and the standard deviation of the missing forcing.
C: So those are the learned parameterizations that we're going to directly implement in MOM6. But of course we're doing a lot more in M2LInES. We're going to tackle mesoscale and submesoscale parameterizations for momentum and tracers; we're going to look at vertical mixing as well, again momentum and tracers; and parameterization ideas at the interfaces, both ocean-atmosphere and ice-atmosphere. Marika Holland is tackling that problem at NCAR. In the atmosphere, both the boundary layer and momentum transport by clouds: Judith Berner, also at NCAR, will tackle that. For the ocean, many people are involved, both at GFDL and IPSL and across many universities. So there is a wide range of parameterizations and processes that we're going to tackle, and I just want to close with this: it's a very large project with many, many partners, and it involves many components, using high-resolution simulations and data-assimilation products, together with ML and theory, to come up with new parameterizations that we can plug into the GFDL, NCAR, and IPSL models to improve prediction. This is an exciting project, supported by Schmidt Futures, with many colleagues who are absolutely essential to it. So thanks again for having me, and hopefully there's time for questions.
B: Thanks very much for that excellent talk, Laure. Yeah, let's take a few questions; we can also push into the discussion time at the end. I forgot to mention that if you have a question, you can use the raise-hand feature in Zoom, or you can type it in the chat.
D: Thank you very much. Could you tell me whether there's any effort, or advantage, in what you do with your inputs: do you normalize them, or do they keep their units?
C: They have units, yeah, so that's a great question. Usually we try to normalize them, because you want to weight by the standard deviation, so that helps you. There's a lot of pre-processing that goes into it, by the way; here I went through the 12-minute version, but what you do with the data at the beginning makes a big, big difference.
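A minimal sketch of the normalization being described: standardizing each physical input (which carries units) to zero mean and unit variance before training. The speaker's actual pre-processing is more involved; this just shows the basic idea.

```python
import numpy as np

def standardize(x, eps=1e-12):
    """Strip the units and scale from a physical input field by removing
    its mean and dividing by its standard deviation, so that all inputs
    enter the network with comparable magnitudes."""
    return (x - x.mean()) / (x.std() + eps)
```

In practice the mean and standard deviation are computed on the training set only and reused, unchanged, on validation and test data.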
B: Thanks. Yeah, I have a question about how we might be able to understand a little bit more why the neural network is doing such a great job. Do you have thoughts on interpretation of that, or explainable techniques?
C: Yeah, absolutely, we're definitely tackling this as well. There are many techniques, as you know, since you're doing a lot of them now. We're using feature maps and we're calculating Jacobians, and what we're finding right now, as a first step, is that a lot of the gradient and strain and stress terms that we had with the equation discovery are what the neural net is learning as well. But it does the higher-order moments better, and that part I still don't understand. So right now what we're doing is giving the algorithm the equation, then learning the residual from it and seeing whether we can interpret the residual from the neural net, trying to go at it that way. But it's hard to understand what the neural net is doing, especially for hierarchies. Yeah.
E: Okay, yeah, thanks for having me. Today I'll be talking about some of the work that myself and Christiane Jablonowski are doing on trying to emulate simplified physical processes in CAM. So it's similar to Laure's talk, but now we're talking about the atmosphere. First, some overall motivation.
E: As we all know, machine learning has become a very intriguing and useful tool for us as scientists, particularly in atmospheric and Earth science, and one of the biggest applications for us has been to see whether or not we can emulate or improve physical parameterizations in our climate models. That's where we began this work, when I came to the University of Michigan new to atmospheric science and new to data science.
E: So it's been a long time coming, but we're finally excited about some of these results.
E: We care about choices like the number of neurons if you're using a neural network, or the number of trees in your random forest, whatever the case may be, and in order to address these kinds of fundamental questions we've been implementing this in a hierarchy of extremely simplified atmosphere models. If you look at a diagram of model complexity, the simplest climate model you can make is 2D-type deterministic tests, all the way up to an AMIP model with realistic, state-of-the-art physics.
E: We fall here in the middle: we're in this 3D dynamical core, but we're keeping it very, very simple, with a dry Held-Suarez test, which some of you may be familiar with, and then a moist version of that as well, where we also allow it to rain. In the dry setup we have two forcings. There is the horizontal-velocity forcing, which is a very linear function; we're not going to tackle that with machine learning, because you can emulate it with just linear regression if you want. But in the temperature forcing, this Newtonian temperature relaxation, you actually introduce a little bit of nonlinearity through the latitude dependence of k_T and T_eq, which puts it just outside the range of linear regression, though it's still extremely simple. So we're going to try and emulate that. Then, in the moist setup, you start to introduce condensation, boundary-layer mixing, and heat fluxes, and of course you allow it to rain; all three of these are nonlinear.
E: The temperature tendency is definitely dominated by the Newtonian relaxation, so essentially that's going to be the easiest, but it's still highly nonlinear compared to the dry case. Then for q you have only the nonlinear terms, slightly more complicated terms, and then, of course, precipitation is an integration over the vertical.
E: So that is also a much more complicated problem, and it's what I've been spending the majority of the last year trying to emulate. Machine learning (we're all here because we have some kind of interest in it) is fundamentally just determining functional relationships between the inputs and outputs of a data set, and for most of the modern machine learning techniques, not all of them but most, the more data you have, the better. The ones we focus on are random forests for the majority of this work, though I do have a neural network in here as well.
E: I hope we can get to it; we'll see how time goes. The models are built in Python using established libraries: scikit-learn for the random forests and Keras for the neural networks. We've also incorporated Sherpa, a library out of UC Irvine that helps us tune the hyperparameters of our machine learning models. It can tell me the best choice for the number of trees in my forest, or the number of neurons in my neural network, and many more things; there's a pretty much countless number of options, so it becomes a bit taxing if you don't have some kind of optimizer. Diagram-wise, I want to introduce what a random forest and a neural network are.
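Sherpa's own API isn't shown in the talk, so as a hedged stand-in, here is the same idea with scikit-learn's `GridSearchCV`: searching a small, made-up grid of random-forest hyperparameters and keeping the best-scoring combination. Sherpa automates this kind of search (plus smarter strategies) for Keras and scikit-learn models.

```python
from sklearn.datasets import make_regression
from sklearn.ensemble import RandomForestRegressor
from sklearn.model_selection import GridSearchCV

# Synthetic regression data standing in for the model-output training set.
X, y = make_regression(n_samples=200, n_features=5, noise=0.1, random_state=0)

# Exhaustively try each hyperparameter combination with 3-fold cross-validation.
search = GridSearchCV(
    RandomForestRegressor(random_state=0),
    param_grid={"n_estimators": [10, 50], "max_depth": [3, None]},
    cv=3,
)
search.fit(X, y)
```

After fitting, `search.best_params_` holds the winning hyperparameters and `search.best_estimator_` the refit model.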
E: A random forest is an ensemble of decision trees that are randomly initialized; each decision tree is fit individually to come up with a prediction based on your data, and your final prediction is an average of all the trees' predictions. Something unique, and useful, about a random forest is that quantities like precipitation, which we know cannot be negative, inherently can't be predicted as negative values; that's part of why they're so powerful. Neural networks, on the other hand, are a system of interconnected neurons with nonlinear activation functions, which introduces the nonlinearity, and computationally they're very efficient once you've implemented them to run either in parallel or on a GPU.
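The non-negativity property follows from how a random forest predicts: each tree returns an average of training targets in a leaf, and the forest averages the trees, so predictions can never leave the range of the training targets. A small synthetic check (the "precipitation" here is invented):

```python
import numpy as np
from sklearn.ensemble import RandomForestRegressor

rng = np.random.default_rng(0)
X = rng.standard_normal((500, 3))
# A non-negative, precipitation-like target.
precip = np.maximum(0.0, X[:, 0] + 0.5 * X[:, 1])

rf = RandomForestRegressor(n_estimators=50, random_state=0).fit(X, precip)
# Predict on unseen inputs: averages of non-negative training targets
# stay non-negative, so the emulated precipitation cannot go below zero.
pred = rf.predict(rng.standard_normal((200, 3)))
```

The flip side is that a forest also cannot extrapolate beyond the largest value it saw in training.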
E: I'll talk quickly about the model configuration and where we get our data sets. We're using two configurations. Most of the work with our random forests is done on what we're calling the two-degree grid with the CESM 2.1 finite-volume dynamical core. Both configurations are run with 30 vertical levels, but the two-degree grid is, of course, 1.9 by 2.5 degrees lat-lon. We run that one for 60 years and collect weekly output; we're not collecting every time step or every physics time step, but sporadically throughout this 60-year atmosphere run. This is mostly used for our tendencies, dT/dt and dq/dt, and also for precipitation, with random forests only.
E: We also have an older one-degree-resolution run, a three-year run with hourly outputs, which gives the same order of magnitude in the total number of data points. We used that for a precipitation emulation about a year ago, which gave promising results, and I'm hoping to talk quickly about that; but most of the work is going to be with the random forests on the two-degree data set. As far as splitting up our data: we use about 50 years for training and validation, we leave a gap, and then for all my results we use what's called the test data, which is about six years of that weekly output for the two-degree case. So we're running all of this offline, but on data the model has never seen before, so that we don't have any overfitting issues.
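The train/gap/test split described here can be sketched as a chronological split over time-ordered samples. The counts below (weekly samples over 60 years, a 6-year test set, a 4-year gap) follow the talk; the helper itself is illustrative.

```python
import numpy as np

def chronological_split(n_samples, n_test, gap):
    """Split time-ordered sample indices into train and test sets,
    leaving `gap` samples between them so that autocorrelated weather
    states cannot leak from training into testing."""
    test_start = n_samples - n_test
    train = np.arange(0, test_start - gap)
    test = np.arange(test_start, n_samples)
    return train, test

# 60 years of weekly samples, last 6 years for testing, 4-year gap before them.
train, test = chronological_split(60 * 52, n_test=6 * 52, gap=4 * 52)
```

Shuffled random splits would be inappropriate here, since adjacent weekly samples are correlated.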
E: If it does poorly, then we know we overfit; if it does well, then we're roughly where we want to be. I also want to introduce the idea of an R-squared. A lot of you are probably familiar with R-squared, but essentially, the closer to one, the better our machine learning has learned whatever we're trying to emulate; the lower, around zero, the worse it is at emulating; and you can even have something negative, which is definitely possible.
E: A negative value essentially means that the unexplained variance of your machine learning model is more than the actual variance of your data, so it didn't learn anything. So keep an eye out for basically white space on my R-squared plots; that just means that in those regions it didn't work.
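This R-squared convention (1 is perfect, near 0 is no skill, and negative means the emulator is worse than just predicting the mean) can be checked directly with scikit-learn on a toy example:

```python
import numpy as np
from sklearn.metrics import r2_score

y_true = np.array([1.0, 2.0, 3.0, 4.0])

# A close emulation explains most of the variance: R^2 near 1.
good = r2_score(y_true, [1.1, 1.9, 3.0, 4.1])

# An anti-correlated "emulator" does worse than predicting the mean,
# giving a negative R^2 (the white space on the R^2 plots).
bad = r2_score(y_true, [4.0, 3.0, 2.0, 1.0])
```

`r2_score` computes 1 - SS_res / SS_tot, so nothing prevents the ratio from exceeding one when the residuals are larger than the data's own variance.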
E: So, some results for the dry case. This is the very simple case where we have the Newtonian temperature relaxation. On the left is a zonal-mean, time-mean plot of the actual CAM output over the testing data; in the middle we have the machine learning, a random forest, and we see it's very close. This is what we should expect to see.
E: It's a very simple problem, though it was something we weren't seeing for quite a long time; we recently got it working, finally. And then (oops, clicked away from something) the R-squared for that reinforces how skillful the predictor is: 0.96, I think, was the minimum. So we're looking at something that learns very well, but it's learning something very easy. It's easy to get excited, but also, you know, take it with a grain of salt.
E: That was a very easy problem, so the more interesting results start with the moist case, for that dT/dt. This is the same CAM model on the left, the random forest in the middle, and then the difference on the right, and we see that, overall, the structure is there.
E: We're hovering right around that minus 0.1 to 0.1 Kelvin per day, which is encouraging, and it's further reinforced in our R-squared: over pretty much the majority of the domain it's around where we want to see it, 0.9 and above, with a little bit of negative R-squared here at the poles and in the lower levels.
E: We do have this poor performance close to the surface, and I think I know where that's coming from; I have a new run going right now to try to address it. Essentially, it's because I'm hyperparameter-tuning with Sherpa on a level around 800 to 850 hectopascals, so it may be tuned well for everything above that level, but not below it.
E: So I'm trying something a little closer to the surface right now; we'll see if that improves the low-level difficulties. For dq/dt, this is probably the most difficult, and also the most challenging for machine learning to emulate. Overall, structurally, in the zonal-mean time-mean it looks similar, but we're overshooting, I should say, in the equatorial region with the moisture. Even worse, in the R-squared we're seeing significant negative values, which basically means it didn't learn well, and where it did learn, it's not anything to be too excited about: the 0.2 to 0.6 range. Definitely something that needs to be improved upon. This was a first attempt, just last week, so I do expect I can improve it somewhat in the next couple of weeks.
E: The precipitation, however, was a little more encouraging, kind of in between the dT/dt and the dq/dt. On the left here, this is now just a time mean, since it's a surface field, so you have latitude on one axis and longitude on the other, and we see pretty good agreement between the equatorial region and the mid-latitudes, where most of the excitement is actually happening.
E: The difference plots also look good. I think if I had done a zonal mean on this, the line plot would have been a little more interesting, but I do like the panels too. These results actually just finished this morning, so I didn't have a great amount of time to mess around with them, but the R-squared is also encouraging.
E: We have 0.6 and above for most of the regime, and then these pockets of low, sometimes negative, R-squared in the transition region between where the activity is, at the equator and in the mid-latitudes; just outside that equatorial region seems to be a little difficult for the machine learning to predict. But we do see an overall skill of around 0.85, and that's encouraging for sure.
E: Also, this was the first attempt with a random forest, so I'm excited for a first attempt to be this skillful. And just to go back: we don't see any negative precipitation on these mean plots for the machine learning predictor, which, as I mentioned, is part of the benefit of using a random forest.
E: I will quickly play this video, which was generated by Christiane from our old one-degree simulation, using a neural network to parameterize the large-scale condensation, or precipitation, and we see there's a lot of skill: the flow of the precipitation over the test data is very similar between the two. This is not a mean, just an eye test in a movie. And here I show the time mean, just like the one before, but we see these little pockets: on the right we see the skill in the peaks, in the equatorial region and the mid-latitudes, but we also see pockets that accumulate negative precipitation in the mean. Not just a little bit, but enough to be noticeable, and this could cause instabilities or other problems if we were to couple it back to the GCM.
E: So, you know, neural networks are good, but they are difficult to tune, at least in my experience, and they do come with their negatives as well. That's something we're interested in diagnosing and addressing in the future. I'll just leave the summary up, because I think I'm running late, though I don't know how late we started. So hopefully we can answer some questions, but yeah, I'll conclude there.
B
Thank you, Garrett, thanks very much, really interesting work, and thanks for including those late-breaking results. Let's take one quick question; we do have one that just came in the chat, so we'll read that: thanks for your talk, what is the reason behind leaving a four-year gap between the training and test data?
E
Of course, four years is well beyond what I actually need as far as a gap there; a few weeks to a month would be fine. But I used four years just because I wanted to take the last part of my data for testing, and I didn't want to test on too much, because you're just overloading your computational resources; let's say I'm running it on about six years.
E
If I ran it on nine years, you're not gaining a whole lot more insight into the predictability of your machine learning model. So I leave four years; for the three-year case I left only about three to four weeks, which is more in line with where you want that gap to be, to avoid autocorrelations with the climate signals. In these simplified test cases it's not as important, but it's good practice for when you do go into more complicated, more realistic things.
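The train/test split with a buffer described above can be sketched in a few lines. This is a generic illustration, not the speaker's code; the series length and gap size are invented for the example:

```python
import numpy as np

# Toy daily "climate" series: 10 model years, 365 samples per year.
n_years, per_year = 10, 365
t = np.arange(n_years * per_year)

# Hold out the final years for testing, and discard a buffer between the
# training and test periods so autocorrelated samples don't leak across.
gap = 28                      # ~ a month; the speaker used up to 4 years
test_len = 2 * per_year       # last two years for testing

test_idx = t[-test_len:]
train_idx = t[: -(test_len + gap)]

# No training sample falls within `gap` steps of the test period.
assert train_idx.max() + gap < test_idx.min()
```

The point of the gap is that adjacent climate samples are correlated, so a contiguous split without a buffer lets information leak from training into testing.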
B
F
Okay, so hopefully you can all see my slides. Hello, I'm a postdoc at Colorado State University working with Dr. Elizabeth Barnes, and today I'm going to be talking about a slightly different aspect of machine learning that we haven't discussed quite yet: the idea of explainable machine learning.
F
We have this machine learning model that makes a prediction, but machine learning is often regarded as a black box: how is the model using its internals, like its neural networks, to actually make its prediction? To get everyone started: my background is in climate science, so something I often think about is maps, and in particular, let's think of a map of surface temperature.
F
One way we can think about maps of temperature is to calculate the global mean temperature. So right here you're seeing a time series from observations; I'm using the third generation of the NOAA 20th Century Reanalysis dataset. You see your classic climate change and variability signal going forward until 2015.
F
Another way we can think about global temperature, which many of you are familiar with, is running climate models. So here I'm adding the CESM1 Large Ensemble. As all of you are aware, large ensembles allow us to really think about natural variability and noise through each ensemble member, in addition to averaging across members to understand the forced response and the patterns of regional climate change and variability.
F
And of course, then we have everything else that's in the model, things like internal variability, which those ensemble members really allow us to capture. But it's really challenging, if we're thinking about climate change attribution, to disentangle what is being affected by things like greenhouse gases versus industrial aerosols.
F
What you're doing here is taking the CESM1 Large Ensemble and, for different simulations, fixing one of the forcings. In my case, let's think about AER+: it's your fully forced CESM Large Ensemble, but with greenhouse gases fixed at 1920 levels, so the predominant, or dominant, forcing in this case will be aerosols.
F
As I've already mentioned, these different external forcings can affect regional climate variability, and aerosols in particular remain a big uncertainty, even for understanding 20th-century historical climate change. So the question is: using explainable AI, can we gain new insight into understanding forced climate signals from these different forcings?
F
So I'll return to that idea of a surface temperature map, and I'm going to set up a very simple problem for a neural network; the problem itself is really not interesting per se. To explain what I mean by that: I'm going to take a temperature map, with every latitude and longitude point as an input, as one sample, and feed it into an artificial neural network.
F
This is a very shallow neural network; adding hidden layers doesn't really affect its accuracy. The prediction from this neural network is some metadata about the file: in particular, what year is that map coming from? Now you could argue: well, I already know that; if I read in, let's say, a NetCDF file of temperature.
F
I already know the year of the temperature map, but what's really interesting is that we can now use these explainable AI methods to understand how the neural network is making its decision. There are many different methods for explainable AI; I'm going to focus on one called layer-wise relevance propagation, or LRP, in case you're not familiar with it.
F
Essentially the concept, through a simple example: let's say you're doing an image classification problem. I'm inputting an image of a wolf into the neural network, and I'm hoping it classifies it as a wolf. In this case it does.
F
But now I can use the explainable AI method called LRP, and it's going to provide me a heat map of where the neural network is looking, what contributed to its decision that it was a wolf. In this case, you're going to get a heat-map value for every point of that image, and you can see that it resembles a wolf. We can do that for other examples.
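The core idea of LRP can be shown exactly in its simplest special case: for a single linear layer, the basic LRP-0 rule attributes the output back to each input in proportion to that input's contribution, and the relevances sum to the explained output (the conservation property). This toy example is only that special case, not a full multi-layer LRP implementation; the numbers are invented:

```python
import numpy as np

# For a single linear layer z = sum_i x_i * w_i, the LRP-0 rule assigns
# each input the relevance x_i * w_i -- its share of the output.
x = np.array([1.0, 0.5, -2.0, 0.0])     # inputs (think: pixels)
w = np.array([0.3, -0.8, 0.1, 5.0])     # learned weights

z = x @ w                               # network output
relevance = x * w                       # per-input relevance ("heat map")

# Conservation: relevances sum to the output being explained.
assert np.isclose(relevance.sum(), z)
```

In a deep network this redistribution is applied layer by layer from the output back to the input, which is what produces the heat maps shown in the talk.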
F
Here's an image of a volcano, and now we can see an outline of a volcano from the LRP, showing which points helped the neural network make an accurate decision.
F
And lastly, I like sharks: here's an image inputted into this neural network, which correctly classifies it as a great white shark, and the heat map from this explainable AI shows where the neural network looked to make its decision. I'll add that layer-wise relevance propagation is, of course, not the only method of explainable AI, and it's not necessarily perfect; it's subject to interpretation.
F
So again, I'll return to my problem: the idea of taking a temperature map from climate models and observations and predicting the year, which of course is not that interesting. But now we can produce these heat maps of where on the temperature map the neural network looked to be able to predict the year. This allows us to no longer think of machine learning models as black boxes. So, to get to the data, and just to show what the raw data looks like for these different large-ensemble simulations:
F
In the first simulation, where aerosols are dominating and greenhouse gases are fixed, we have cooling during the late 20th century due to increasing aerosols. In the simulation with fixed aerosols, we see even greater warming across the world due to greenhouse gases evolving through time. And in the ALL simulation, where both aerosols and greenhouse gases evolve over time, the temperature trends are somewhat less warm, due to that aerosol interaction, when compared to GHG+.
F
So now I'm finally going to get to the prediction of our simple neural network, again the less interesting aspect of it. What you're seeing here is the prediction output. How to read these plots: again, I'm predicting the year, and the white line is the one-to-one line. We want our data to follow the one-to-one line, which would indicate that the model correctly predicts the year of the maps.
F
And now you can see where the differences really emerge. In the case where there are no time-evolving greenhouse gases, our model does not correctly predict the year in AER+. We can also see it does pretty well in GHG+ and in ALL, the typical CESM1 Large Ensemble. One proxy for how well the observations are being predicted is the slope, or the R-squared, of these predictions, and in this case, what's really interesting is the following.
F
We can then make sure that this wasn't just by chance, by trying different combinations of training and testing data; in this case, I'm drawing training and testing data from different ensemble members in each of these three simulations. So we can test different combinations of random initialization seeds and of training and testing data; these are histograms of the slope for the observations.
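The robustness check described here, repeating the fit over many random seeds or splits and collecting the slope of predicted versus true year into a histogram, can be sketched as follows. The noisy "predicted year" series below is synthetic; it only stands in for the retrained model's output under each seed:

```python
import numpy as np

# Toy check: for many random "retrainings", fit predicted year against
# true year and collect the slope, so one lucky split can't mislead us.
true_year = np.arange(1950, 2016).astype(float)

slopes = []
for seed in range(200):
    local = np.random.default_rng(seed)
    # Stand-in for the model's prediction under this seed: truth + noise.
    pred_year = true_year + local.normal(0, 5, len(true_year))
    slopes.append(np.polyfit(true_year, pred_year, 1)[0])

slopes = np.asarray(slopes)
# A histogram of `slopes` (np.histogram(slopes, bins=20)) summarizes the
# spread; values clustered near 1 mean the prediction tracks the year.
assert abs(slopes.mean() - 1.0) < 0.1
```

If the distribution of slopes sits well away from zero across all seeds, the skill is not a fluke of one particular train/test combination.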
F
What's nice about LRP is that we get a heat map for each year over time. Instead of a simple regression problem, where we would get one map of regression weights, here we're actually getting LRP maps over time. How to read these: the whiter, or higher, relevance values are more important for the network in making its decisions. I'll point out that one aspect of LRP is that it leverages nonlinear patterns, or correlations, across space.
F
So we can't directly infer that surface forcing over a particular region is related to higher relevance. One realistic example of that is the Arctic: we know there's a big climate change signal there, even with the large internal variability, but in our simulations the neural network is almost never using the Arctic to make its predictions.
F
I realize there are a lot of maps on this slide. If we break down this period, from the mid-20th century into the mid-21st century, we can look at our three different simulations and compare them: where is the network looking in the one that's aerosol-driven, versus the greenhouse-gas-driven one, versus the realistic forcings on the right?
F
So now we can run our network many different times, to see whether that was just a fluke of the hyperparameters, and make a histogram of the relevance over different regions. Here I'm outlining the North Atlantic: a histogram that sits further right indicates there was more relevance in the North Atlantic for GHG+ in making its prediction for observations. We can then compare different regions; if you focus on Southeast Asia, we can see the same for GHG+, where there are fixed aerosols.
F
I'll just end with my key points. I really believe there is a lot of potential for explainable AI to reveal patterns of climate change and variability, as a pattern-recognition method across space that leverages those nonlinear relationships. An interesting result is that, at least for how our artificial neural network is being trained, it's producing a higher correlation with the actual observations without time-evolving aerosols.
F
Now, the interpretation of why that is occurring is difficult, but potentially these patterns of forced climate change, such as over the North Atlantic, may be closer to the forced signals in observations, suggesting again the importance of looking at our climate models and how sensitive they are to very small changes in aerosol forcing. A paper was recently released about this, and I'm happy to take any questions. Thank you.
B
G
Hey Zack, nice talk, as always. I have a quick question about LRP: there's a paper submitted by someone in your group about how LRP may not be the best explainable AI method.
G
F
So what we've done, in the paper you're referencing, is take a look at the pros and cons of the interpretation of the different backpropagation rules. We've repeated these results with different LRP propagation rules that get rid of some of the issues that have arisen, and we actually see that the patterns, at least the spatial patterns in our setup, remain the same.
F
One other opportunity for this type of work, particularly for comparing different climate models, is an explainable AI method called backwards optimization, which would really allow us to look at the differences between climate models, maybe useful for, say, CMIP5 versus CMIP6. So there are a lot of really exciting explainable AI methods that let us dive in, and they're especially useful for climate science.
B
H
Oh okay, I actually don't know how to swap displays.
H
So the people involved: I'm Kevin Raeder, working in the Data Assimilation group at NCAR. I had a lot of help putting together this dataset from the team; Jeff Anderson is the PI, plus lots of other helpers, whom I have listed there. I'd like to thank Ian Grooms for making the first query about using this dataset for machine learning.
H
It hadn't occurred to me yet, but I think it may have promise. I'd also like to thank Katie and Maria for exploratory discussions of using machine learning in the CESM context.
H
Labeled datasets are usually difficult and expensive to produce because of the large amount of time needed to label the data. In my case, we'll see that the difficulty was not labeling the data; it was producing the data, and the labeling came along with it. Context for this data: a reanalysis is a picture of the state of the atmosphere, or any other system you have, which uses the information in both the model hindcasts and the observations, and takes account of the uncertainties in both of them.
H
Some of it is labeled, or highly labeled; I'm hoping you all can shed some light on that. As I mentioned, we know about data assimilation and these datasets, but we don't know much about machine learning.
H
The atmosphere is CAM6 at one-degree resolution with 32 levels; that's called the workhorse model for CESM. It had an active land model, active sea ice, and an active river model, so those are time-evolving and creating their own data, and it was forced by data SSTs (sea surface temperatures) from AVHRR.
H
Consistent here means that it's a balance of the information in the observations and the model hindcast, and it explicitly accounts for the uncertainties in those two main sources of information. Those uncertainties are represented by the observational errors that come along with the observations and by the model ensemble spread.
H
So we feel like this might be an unprecedented dataset, but I'd like your input on that. It's 80-member ensemble output, which is pretty big by most standards in the atmospheric sciences; the Large Ensemble experiments and others like them, and ensemble data assimilation, have shown the value of having a large ensemble.
H
So the first thing is the instrument that took the measurements and the quantity observed, like the temperature. It also includes the actual observation value, for instance 290 kelvin. It includes an observation error estimate, which we use in the assimilation process but which might be useful in machine learning also. Then it also has an ensemble of model estimates of the observation; that's another crucial part of the assimilation process, but maybe a unique form of data for machine learning.
H
Other highlights: it comes with quality-control labels for both the input and the output observation types, and it also contains the observation locations and times. There are up to a million observations in each of these assimilation time windows, and we have 13,000 assimilation windows, so I think that's a lot of data to work with.
H
One thing I wanted to mention, too, is that combining the observation error with this ensemble spread gives something we call the total spread, which is a better measure of the consistency of the observations with the model. So that's something that can be derived and used.
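The total spread can be derived from the dataset exactly as described: the observation error and the ensemble spread combined in quadrature. A small sketch, with invented numbers (the variable names are mine, not the dataset's field names):

```python
import numpy as np

# Total spread: observation error and ensemble spread in quadrature,
# used to judge how consistent an observation is with the model.
obs_error = 1.2          # observation error std dev (e.g. kelvin)
ensemble = np.array([288.9, 290.4, 289.7, 290.1, 289.4])  # model estimates

ensemble_spread = ensemble.std(ddof=1)
total_spread = np.sqrt(obs_error**2 + ensemble_spread**2)

obs_value = 290.0
# Innovation measured in units of total spread (a consistency check):
z = (obs_value - ensemble.mean()) / total_spread

assert total_spread > max(obs_error, ensemble_spread)
```

Because both uncertainty sources are stored with each observation, this quantity can be recomputed for any subset of the data.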
H
Another dataset we have is generated as follows; I'll give you the background. CESM can be configured to run surface components such as CLM or POP using a data atmosphere read from a file instead of calculated in CAM. So here's the data atmosphere: it feeds data to the coupler, and the coupler gives it to whichever components want it. These are surface forcing files, usually fluxes, sometimes just the state at the surface.
H
These are two-dimensional gridded data; there's very little metadata in them, but the data values themselves are very useful for running these surface models. I'd also like to point out that it's expensive to produce these, or to reproduce them if you wanted to try that. These data forcing files can be used to force all of these surface components.
H
A third form of the output data: we have an ensemble mean of the CAM model state, which, as I mentioned before, is the surface pressure, temperature, etc., and that's available every six hours. At the same time, we have the 80-member ensemble, which is available weekly. I've sketched it out here graphically: each column is one time, and we've got our 0-hour, 6-hour, 12-hour, and so on.
H
So this is my final slide; I'd like to open it up for discussion. The questions I'm particularly interested in are: which datasets seem most, or least, useful for machine learning? Here's a summary of the four of them. How would they be useful?
B
Thanks, Kevin, for summarizing here. I think these are really helpful questions to think about, along with points like cloud computing, which you brought up. Any questions or responses to these discussion points? Feel free to raise your hand or put something in the chat. We can also be thinking about these throughout the rest of the session and come back to them at the end, but I'll pause for a moment here and see if there are any questions.
G
Sorry, I was going to see if someone else raised their hand; if someone has a question I can definitely yield, but I had a quick question, Kevin. I was wondering if you are also making available the observations (oh yeah, I see it says observation files) that were used to create the reanalysis product. I'm just thinking more broadly about areas where there are very limited observations and what that means for training a machine learning model, etc.
H
Yeah, the complete observation set is available within this dataset, and you can make maps of the distribution, you can analyze them statistically; they're all available for use.
H
Great. I lost my cursor; there we go. That's what the instrument is, and the quantity observed. So it may say satellite cloud-top wind velocity, or it may say a radiosonde balloon moisture measurement; that's all part of the metadata that's in the observations.
G
B
Thanks so much. Yeah, certainly having such a large data size is definitely a benefit for machine learning and the amount of data that we need for training. I think it probably just depends on the question, and hopefully people can be thinking about that during the rest of the session. There's a comment in the chat that I'll just read before we move on.
H
B
I would think that frequency distributions of observed and modeled variables would be important for testing the veracity of machine learning methods. So, similar to what Maria was saying, we can use datasets like this for validation of machine learning algorithms, which I think is another potentially powerful use for this dataset.
B
Great, thanks, Kevin. I think we should move on to our next speaker, who is Chris Fletcher from the University of Waterloo.
H
I
Okay, great, thank you very much. I will try and share mine; give me a thumbs up or something if you can see my title. Looks great? Okay, perfect. Well, thanks very much for the introduction, thanks for having me here today, and thanks to everyone who's watching for coming along. I'd like to begin this talk by showing this beautiful picture of a very high-resolution global simulation; this is from a paper in JAMES last year.
I
This is a 1.4-kilometer global climate simulation, and I think it's fair to say that if we all had our top-10 list of things that we would like as climate modelers, a very high-resolution global climate model or Earth system model would be right at the top, because to support decision making and adaptation efforts we really need that high-resolution simulation, particularly for hydrologic change.
I
Unfortunately, computational resource limitations mean that we're far from this right now, so the work I'm going to describe today is an attempt to use machine learning as a way to augment, or improve the efficiency of, model development for higher-resolution ESMs. This is a proof of concept, or an early study, and it's work in conjunction with a PhD student in computer vision at the University of Waterloo.
I
That's Will McNally, along with Jack Virgin, a climate scientist and a PhD student in my group in geography as well, and I'd like to acknowledge Microsoft and the AI for Earth program, through the AI Institute at Waterloo, for funding.
I
So I've already said we need high resolution; this is pretty clear. But in terms of CMIP6, we're basically at the hundred-kilometer grid-spacing scale, and so to get to that high resolution, various methods that I'm lumping into the general term "downscaling" are required. We've heard many talks this week at the workshop about variable resolution, where you have a global model operating mostly at lower resolution and you focus in on a particular region.
I
At high resolution, you can do dynamical downscaling with a limited-area model, or high-resolution statistical downscaling. All of these, in my opinion, are sub-optimal: the optimal solution would be a higher-resolution global ESM, but we're prevented from having that by computational limitations, and one of the places where there's a barrier is the following.
I
I think one barrier to getting these high-resolution ESMs even off the ground is in tuning and calibration. So what I'm going to show today is a way that we believe we can use a machine learning technique from computer vision, namely convolutional neural networks, to support this calibration effort and reduce the amount of CPU time required to run simulations with the higher-resolution models.
I
We heard a nice talk earlier in the week from Cecile Hannay about tuning and calibration of CESM, and Cecile outlined the multi-step calibration and tuning process really nicely. What I'm going to focus on today is the way we believe we can insert machine learning right into this tuning.
I
The tuning and calibration process, at its heart, is about finding the optimal values of uncertain parameters, and I believe we can use machine learning to help us do that; I'll show some results about that. I'm going to present results that are purely in the atmosphere-land-only, uncoupled framework, but I do believe there is potential to extend this methodology to incorporate the fully coupled, multi-component modeling framework.
I
That's what's required to find energy balance and optimize the calibration for transient simulations, for example, with CESM. So: a set of simulations that we ran with a fairly old version of CESM, in particular the atmosphere component CAM4, with fixed prescribed pre-industrial SSTs and sea ice.
I
What we're trying to do is investigate the impact on the atmospheric model simulation of a number of uncertain atmospheric parameters, so we're running a perturbed physics ensemble, or PPE; in fact we're running three of them, each of a hundred members, at different resolutions. So we have a one-degree, or f09, PPE; that's our highest resolution.
I
In this example, we also have a two-degree version and a four-degree version, the f19 and the f45, and the nine parameters are shown here, along with the ranges we vary them across. I'm not expecting you to take this information home; the actual details of the parameters we're varying are not that important for this presentation. It's more an example of how the calibration of uncertain parameters can be incorporated and improved in this machine learning architecture.
I
We've run each of the members in the PPE for three years, so 36 months, and all of the results I'm showing you today will be an analysis of the annual-mean, three-year mean, so the climatological mean. We have to upscale the lower-resolution outputs to the higher-resolution grid, 192 by 288, just to make the numerics of the CNN work. The convolutional neural network, or CNN, is a technique borrowed from computer vision, and normally in computer vision
I
you go from right to left in terms of the order of operations. In computer vision, the task is to start with a complex image and progressively simplify it to identify the key features, similar to Zack's talk, where he was showing the identification of a type of animal or a kitchen item or whatever. Now, Will McNally, the grad student who's working with me on this project,
I
had the insight that if you invert this process, you can actually go from a very low-dimensional, in fact nine-dimensional, input of parameter values. So we have those nine parameters, and then through a series of convolutions and reshapings you can increase the complexity right up to the point where the model outputs, as its predictions,
I
this fully resolved global array of seven output variables: things like low cloud fraction, total precipitation, net radiative flux at the top of the atmosphere, all that kind of thing. All of these outputs come from a single iteration through this CNN architecture.
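The "inverted CNN" dataflow, going from a 9-vector of parameter values up to a full lat-lon map, can be sketched with shapes only. This is not the authors' architecture: the real model uses learned transposed convolutions, while here the weights are random and the upsampling is nearest-neighbour, purely to show how nine numbers can be expanded to the f09 grid:

```python
import numpy as np

rng = np.random.default_rng(3)

params = rng.uniform(size=9)                    # the 9 perturbed parameters

# Dense layer to a coarse 12 x 18 "latent map" (weights random here).
W1 = rng.normal(size=(9, 12 * 18))
coarse = np.tanh(params @ W1).reshape(12, 18)

def upsample2x(a):
    """Nearest-neighbour 2x upsampling (stand-in for a transposed conv)."""
    return np.kron(a, np.ones((2, 2)))

x = coarse
for _ in range(4):                              # 12x18 -> 192x288
    x = upsample2x(x)

assert x.shape == (192, 288)                    # the 1-degree output grid
```

In the trained model there would be one such output map per predicted variable (seven in the talk), all produced in a single forward pass.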
I
So we provide as input the nine parameters, and the CNN, once it's fully trained, will output this array of global maps of our output variables. We use a cross-validated training and testing methodology to assess the accuracy of this CNN at predicting outputs from CAM4, and just to show you what it looks like in the one-degree mode:
I
We have a 100-member ensemble; we randomly sample 80 cases, then try to predict the unseen 20 cases, and we repeat that whole thing 40 times so we can get a sampling distribution and assess the skill. We assess the skill using not a sum of squares but a skill-score metric from Pierce et al. (2009).
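The skill metric is described as similar in spirit to the Kling-Gupta efficiency: it combines pattern correlation, variance ratio, and bias. The sketch below implements the standard KGE form as a hedged stand-in, not the Pierce (2009) score actually used in the talk:

```python
import numpy as np

def kge(pred, obs):
    """Kling-Gupta efficiency: 1 is a perfect match."""
    r = np.corrcoef(pred, obs)[0, 1]            # pattern correlation
    alpha = pred.std() / obs.std()              # variance ratio
    beta = pred.mean() / obs.mean()             # bias ratio
    return 1 - np.sqrt((r - 1)**2 + (alpha - 1)**2 + (beta - 1)**2)

obs = np.array([2.0, 3.0, 5.0, 4.0, 6.0])
assert np.isclose(kge(obs, obs), 1.0)           # perfect match -> 1
assert kge(obs + 1.0, obs) < 1.0                # bias is penalized
```

A metric of this shape is appropriate when matching a spatial map, because it rewards the right pattern and the right amplitude, not just a small squared error.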
I
This is a very common metric, very similar to the Kling-Gupta efficiency; it takes into account bias and RMSE, but also the spatial pattern of the data, the correlation and the variance ratio.
I
So it's appropriate when you're trying to match a spatial map. Another important detail to mention before I move on is that we're using the CNN not to predict the raw output from CAM4 but to predict the differences in the output that are due to the parametric changes. In each ensemble member we're perturbing nine parameters, and what we want to predict is the impact those nine parameters have on a particular output field.
I
Okay, so here's the first results slide. The left column shows the output from CAM4 for a single realization in the ensemble; again, these are differences due to the parameters being perturbed. We've got the RESTOM, the top-of-atmosphere (or top-of-model) radiative balance field, which is a fairly smooth, fairly flat field, and then we've got the total precipitation, which of course has a lot of spatial structure.
I
This is the one-degree case, and then the two columns here are the CNN-MSE and the CNN-SS columns.
I
So we can look at skill scores and we can look at MSE, and overall we can see that the accuracy of this CNN is pretty good for an individual realization. Zooming in on the precipitation field, and averaging over all realizations, the cross-validated skill score is about 0.8 for this CNN using the skill-score loss function. For precipitation in particular, which is the hardest target for the neural net,
I
it's at 0.73, and it's producing maps that look like this. If you compare the maps on the right and the left, there's a lot of detail here in the Asian monsoon regions, the northern part of South America, the separation of the ITCZ; there's a lot of detail that the CNN is able to capture. Just to remind you: the only information this model has been given is those nine parameter values at the beginning.
I
Okay, now for the final part of my talk: a slightly more applied example of how we think this might be used operationally down the road to make calibration and tuning more efficient. I mentioned at the beginning that we actually have three PPEs.
I
We train different versions of the CNN, gradually increasing the number of high-resolution examples that we show it, so we're progressively giving it more and more of the data that it really needs to make predictions of high-resolution outputs. Once again we hold out 20 unseen cases and repeat the whole thing 40 times to assess the skill, and we can compare our CNN output to the baseline we would obtain by simply predicting the climatological mean difference due to the parametric changes.
I
So here's the main result slide for the talk. Starting in the panel on the left: on the y-axis is the skill score, and on the x-axis the skill is shown as a function of the number of higher-resolution cases incorporated in the training, so a value of zero means there are no high-resolution cases.
I
Overall, we see a plateauing in the skill, for all seven outputs we predict, around 40 cases. So basically, if you can run your higher-resolution model about 40 times, you've seen about as much skill gain as you're going to see; there are diminishing returns above 40 cases. This is where the efficiency comes from.
I
You can run the lower-resolution versions of the model, and they provide a fair amount of information about what the higher-resolution outputs are going to look like, and then you can feed the CNN smaller numbers of higher-resolution runs. In that way you can save yourself a lot of time, because you don't have to run as many of those combinations of cases at high resolution at the beginning.
I
Okay, I'm going to conclude with a few thoughts. We would argue that this method, using a convolutional neural network, could potentially support the calibration of higher-resolution models; the prediction skill from the CNN is reasonably good.
I
I think that is a rather subjective measure, though, and it would be really interesting to have discussions with model developers as to whether the precipitation field, for example, that we end up predicting from the CNN would be useful information, or whether the finer-scale details in the original CAM4 simulation, which aren't predicted by the CNN, would be required.
I
I've
made
a
statement
here
that
you
know
in
in
this
in
this
setup
the
having
the
cnn
in
place
of
just
running
the
higher
resolution
model.
100
times
you
know,
reduces
the
amount
of
cpu
time
we
need
by
about
20
to
40
percent.
So
there
is
a
considerable
cost
saving
and
you
know
you
might
argue
that
if
those
relationships
across
scales
were
to
hold
that
that
saving
might
be
even
more
profound
as
you
go
to
to
get
higher
resolutions.
I
The highest resolution in this example is just one degree, the standard CMIP6 resolution; it's not particularly high. So the key question for extending this work is whether or not we can push this to 0.25 degrees, 0.1 degrees, or further, and maybe to time-evolving and multi-component situations. I'm going to leave it there, just to acknowledge my co-authors, Will McNally from computer vision and Jack Virgin; we have a manuscript in preparation. Thanks a lot for your attention.
B
Thanks, Chris, for that really nice talk. We did have a question come in via the chat during the talk, so we'll read it. From Chuck: just as there are multiple methods of downscaling, are there multiple methods of upscaling, and does the choice of upscaling method significantly influence the end results?
I
Yeah,
that's
a
great
question
thanks,
so
the
upscaling
happens
basically
to
get
our
data
sort
of
con
conforming.
So
I
forget
where
I
mentioned
it
here
yeah.
So
we
have
sort
of
different
different
resolutions
of
our
training
data.
We
have
to
get
them
all
kind
of
conforming,
and
so
you
can
either
sort
of
degrade
the
high
resolution
or
you
can
upscale
the
low
resolution.
We
we
decided
to
sort
just
use
a
bilinear
interpolation
to
get
these
lower
resolution
fields
onto
the
one
degree
grid.
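The bilinear regridding step described here can be sketched in a few lines: interpolate along one axis, then the other. This is a minimal NumPy illustration on a synthetic lat/lon field, not the study's actual regridding code (which would typically use a dedicated package such as xESMF).

```python
import numpy as np

def bilinear_regrid(field, lat, lon, new_lat, new_lon):
    """Bilinearly interpolate a 2-D (lat, lon) field onto a new grid."""
    # first interpolate every source-latitude row along longitude
    tmp = np.array([np.interp(new_lon, lon, row) for row in field])
    # then interpolate every target-longitude column along latitude
    return np.array([np.interp(new_lat, lat, tmp[:, j])
                     for j in range(len(new_lon))]).T

lat = np.linspace(-90, 90, 91)        # 2-degree source grid
lon = np.linspace(0, 358, 180)
coarse = np.outer(np.sin(np.radians(lat)), np.cos(np.radians(lon)))
new_lat = np.linspace(-90, 90, 181)   # 1-degree target grid
new_lon = np.linspace(0, 359, 360)
fine = bilinear_regrid(coarse, lat, lon, new_lat, new_lon)
print(fine.shape)  # (181, 360)
```

At grid points shared by both grids, the interpolated field reproduces the source values exactly, which makes this kind of regridding easy to sanity-check.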
I
We tested a few other upscaling methods, such as cubic interpolation, and it didn't seem to make a material difference to the conclusions; our CNN skill was fairly robust to those changes. But we haven't explored any more advanced upscaling methods, and it would be interesting to know whether those might have an effect.
B
Yeah, related to that: it's kind of unique that you have the different resolutions for the PPEs, so I was wondering if you see significant differences in the PPE spreads at different resolutions, since you do have consistent parameters. Does that impact the spread of the resulting ensemble members?
I
Yeah,
that's
a
really
good
question
actually
off
the
top
of
my
head.
I
I
don't
know
I
need.
I
need
to
look
at
that.
That
would
be
something
very
interesting
to
look
at.
I
do
recall
from
the
anderson
and
lucas
paper,
which
is
the
the
the
previous
example
that
I
I'm
aware
of
where
I
have
this
multi-resolution
ppe-
that
they
did
show
that
the
distributions
the
pdfs
of
the
the
outputs
for
the
different
resolutions.
I
think
that
was
a
cam
5
study.
B
Great, thanks. I think there are a bunch more questions in the chat, but I'm wondering if we should move on to the last talk, which will be given by me.
B
Okay, I hope you can see that. Yes? Great. So, thanks, everybody, for the talks so far. I've been tasked with trying to give an overview of all the CESM-related machine learning projects here at NCAR, as well as some collaborations with external folks. So let's jump right in; I don't think I have to motivate this too much.
B
I
thought
I'd
just
break
it
down
by
cesm
components
and
we
actually
do
have
a
pretty
good
sampling
of
different
projects
thanks
to
input
that
different
folks
at
ncar
sent
me
on
what
they're
working
on,
and
so
I'm
going
to
start
with
a
couple
of
projects
that
I've
been
involved
in
and
the
first
one
is
actually
from
the
land
side,
so
very
similar
to
the
work
that
chris
was
just
talking
about.
I've
done
some
emulation
and
calibration
with
ppes
and
clm5,
using
machine
learning
to
emulate
clm
and
then
I'm
moving
to
the
atmosphere
side.
B
We'll also highlight some work by Andrew Gettelman and others looking at parameterization in CAM6. Then, staying in the atmosphere, Maria Molina is leading work on using machine learning for Earth system prediction, with a number of us here at NCAR and also a few external folks.
B
We'll
also
highlight
some
work
by
alice
duvivier,
looking
at
moving
into
polar
modeling.
So
looking
at
process
understanding
for
sea
ice
and
then
finally,
we'll
highlight
some
work
by
scott
bachmann
and
marcus,
using
sort
of
a
framework
to
couple
hpc
with
machine
learning
and
that's
with
the
ocean
model.
That's
with
mom
6.
B
Okay, so starting with the projects I'm involved in. Again, it's really nice to follow Chris's talk, because what I've done with the land model, with CLM5, is to train a series of artificial neural networks to emulate CLM5 output, given different sensitive parameters as input. What this does is allow many fast computations with an emulator instead of running the full model.
B
So
you
can
test
lots
of
different
parameter
values,
and
this
is
work
with
ben
sanderson,
rosie
fisher
and
dave
lawrence,
and
so
really,
what
you
can
do
here
is
do
a
series
of
emulation,
calibration
and
testing
procedures
so
on
the
left,
I'm
showing
sort
of
assessing
the
skill
of
the
clm
emulator
for
a
particular
output
variable.
B
That's
then
run
with
those
optimized
parameter
values
of
relative
to
observations,
and
you
can
compare
that
with
the
default
model
bias
on
the
right.
So
what
we're
finding
is
that
we're
getting
improvement
in
some
regions
and
degradation
in
others.
This
is
for
again
for
gpp,
looking
spatially
kind
of
highlighting
the
difficulty
of
calibrating
globally
with
these
different
parameters,
but
there's
more
detail
in
this
paper
that
came
out
last
year.
B
So
what
we'd
like
to
do
is
use
machine
learning,
based
detection
to
automate
the
classification
of
different
synoptic
scale,
weather
features,
and
so
what
we're
showing
here
is
that
we've
been
able
to
apply
some
existing
algorithms
that
are
both
based
on
pre-trained,
convolutional,
neural
networks
or
cnns,
and
on
the
left.
I'm
showing
detection
of
atmospheric
rivers
and
tropical
cyclones-
and
this
is
using
the
pre-trained
climate
net
algorithm.
B
There's
a
paper
here
per
bottle
that
has
more
details
on
that
algorithm
to
detect
ars
and
tcs
globally
in
high
resolution
or
quarter
degree
coupled
cesm
simulations
and
then
on
the
right.
I'm
showing
a
different
machine
learning
algorithm
called
dl
front,
and
this
is
to
detect
different
front
types,
and
this
was
developed
by
jim
bart
and
ken
kunkel
at
nc
state.
So
there's
a
reference
here
for
more
details
on
that
algorithm.
B
Some
detection
of
you
know
different
type
of
extreme
weather
event,
but
then
our
overall,
our
overarching
goal,
is
to
connect
the
identified
features
with
extreme
precipitation
events,
and
so
I'm
just
showing
an
example
of
this
here
where
we
have
a
snapshot
of
the
detected
fronts
and
then
the
90th
percentile
precipitation
plotted
on
top
of
that.
So
what
you
can
see
is
there
are
some
areas
where
the
extreme
precipitation
lines
up
nicely
with
the
frontal
systems.
B
Okay,
so
moving
right
along
to
the
work
I
mentioned
by
andrew
dettleman,
also
with
jack
chen
and
david
john
gagne,
and
so
this
the
goal
of
this
work
is
to
machine,
learn
the
warm
brain
processes
so
emulating
the
cloud
microphysics
in
cam
using
a
neural
network.
So
the
question
is:
can
we
do
the
warm
rain
processes
better?
I'm
not
going
to
go
through
all
the
details
of
the
emulation
here,
but
I
would
direct
you
to
andrew's
recent
paper
in
james.
B
But
if
you
look
at
the
top
right
figure,
they're
showing
here
the
emulator
performance,
which
is
quite
good
so
comparing
the
emulator
on
the
y-axis
with
the
bin
model,
this
is
tel
aviv
university
bin
microphysical
model
on
the
x-axis
and
so
showing
very
good
agreement
between
those.
And
this
is
plotting.
A
rain
mass
tendency
and
the
colors
are
frequency,
but
then
he
also
notes
that
the
bin
code
is
different
than
the
original
model,
which
is
the
I
believe,
the
morrison
and
gentleman
cloud
microphysics
scheme.
B
So
there
are
differences
there
and
then
on
the
bottom
here,
they're
showing
the
onset
of
precipitation,
which
does
look
different
between
the
control
and
the
emulator
and
so
andrew
circled
sort
of
the
region
here
of
lower
effective
radiuses,
where
you
can
see
differences
in
the
rain
rate,
which
is
the
colors
here
as
you
increase
the
liquid
water
path.
But
another
important
takeaway
is
that
this
is
an
opportunity
for
a
large
speed
up
and
calculations,
because
the
neural
network
is
quite
fast.
B
So
the
next
project
I
want
to
highlight
is
led
by
maria
molina,
and
this
is
looking
at
machine
learning
for
earth
system
predictability.
This
is
with
yaga
richter,
sasha
glanville,
myself,
kirsten
mayer,
zane,
martin
from
csu
julie,
crown,
ichihu
and
jeremiah.
B
So
the
idea
here
is
to
use
actually
an
unsupervised
learning
technique
which
we
haven't
heard
too
much
about
so
far.
Today,
we've
heard
a
lot
of
different
examples
of
supervised
learning
and
the
specific
technique
here
is
psalms
or
self-organizing
maps,
and
so
the
idea
is
to
use
this
technique
to
group
synoptic
scale
patterns
without
the
need
for
any
pre-existing
labels.
B
So
the
way
this
could
work
is
you
know,
providing
an
input
vector,
for
example,
containing
week
three
mean
winds,
geopotential,
height
and
precipitation
over
the
us,
and
then
letting
the
sun,
letting
the
self-organizing
map
determine
different
groupings
of
these
climatological
patterns,
and
so
maria's
provided
an
example
here
of
the
week
three
mean
precipitation
anomaly:
these
are
from
the
cesm2
s2s
simulations
sub,
seasonal,
reforecasts,
and
so
again,
what
the
song
is
doing
here
is
kind
of
grouping
different
precipitation,
anomaly
patterns.
You
can
see
you
know,
patterns
where
you've
got
drying
in
the
southeastern
u.s
versus
wedding.
B
I
believe
the
arrows
are
winds,
but
I'm
not
100
sure
and
there's
nine
groupings
here.
The
number
of
categorizations
from
the
som
is
somewhat
of
a
user-defined
parameter,
but
you
can
see
how
it
it
separates
into
different
patterns
with
different
sample
sizes
across
those,
and
then
you
can
also
look
sort
of
upstream
of
those
week,
three
precipitation
anomalies.
B
So
here's
just
two
very
different
precipitation
anomaly
patterns
week:
three,
but
then
also
the
the
preceding
outgoing
long
wave
radiation
patterns
week,
one-
and
so
you
know
you
can
think
about
possible,
teleconnections
and
so
teleconnections
here,
if
you
look
at
the
olr
and
the
tropical
pacific,
so
the
next
steps
would
involve
predicting
the
synoptic
scale
patterns
of
u.s
precipitation
on
sub-seasonal
time
scales,
so,
for
example,
starting
from
olr
patterns
and
then
sort
of
switching
over
to
more
supervised
learning
technique.
B
For
example,
cnn,
to
predict
these
different
precipitation
anomalies
and
some
of
the
questions
are
you
know
how
robust
are
the
patterns
delineated
by
the
sum?
Can
we
leverage
these
methods
to
think
more
about
extreme
events
and
also
improving,
potentially
improving
s2s
prediction
skill
so
moving
right
along
to
polar
modeling
alice
to
vivier,
maria
molina
and
marika?
B
So what is SmartSim? It's a development library dedicated to converging AI and numerical simulation models, as shown here. A lot of our climate models are written in C and Fortran, and SmartSim works to connect those models with the modern data science stack, thereby enabling inference, training and analysis through that connection. By the modern data science stack, we mean things like JupyterLab and Dask. Scott has provided an example simulation.
B
I'm
going
to
start
this
here,
where
they're
using
smart,
sim
with
mom,
6
and
so
what's
plotted
here
is
eddie
kinetic
energy,
so
the
use
case
is
predicting
eddy,
kinetic
energy,
using
the
machine
learning
to
hopefully
help
improve
parameterizations
of
mesoscale
turbulence.
So
I'll
start
the
simulation
again
because
it's
pretty
cool,
so
this
is
looks
like
a
pre-industrial
control.
Daily
output
run
over
the
course
of
a
year,
so
very
fine
scale,
sort
of
turbulence
features
that
you
can
see
here.
B
Okay
and
then
just
a
few
more
details
on
their
end
car
collaboration.
I
believe
there
is
a
paper
that's
been
submitted.
A
I don't see any specific question, just a comment from Maria: "great overview," with which I agree. One of our motivations was that we have a pretty scattered landscape, even at NCAR, as you just heard, of machine learning scientists and data scientists who work with us, the Earth system scientists; even within NCAR it's not quite, in my view, an organized effort.
A
So
I
think
we
could
transition
into
the
discussion
period,
and
this
is
one
chat
question,
but
I
think
it's
for
for
for
a
different
person.
Katie,
do
you
want
to
bring
up
the
discussion
questions.
A
Yep,
let
me
I
can
share
my
screen.
Okay,.
A
Okay, we only have maybe 14 more minutes, but potential discussion points are listed here. We don't need to stick to these; this is really an open-ended discussion and an opportunity for us as a community to exchange ideas. One question I put up here is whether NCAR can help facilitate communication among us, the people participating in this cross-working-group session, and machine learning scientists who are maybe also interested in getting into CESM-related machine learning research.
A
So
a
question,
and-
and
we
might
not
have
the
answer
today-
but
can
ankara
really
help
facilitate
that
discussion?
Can
encar
connect
us
scientists
that
that
you
know
who
do
machine
learning,
related
research
with
csm
or
even
other
models
that
encore
like
wolf
or
other
models?
And
if,
yes,
how
would
we
do
that?
And
this
is
an
open-ended
question,
of
course,
but
other
discussion
points
as
you
see
them
here,
and
this
came
up
in
many
of
the
presentations
we
talked
about
the
machine
learning
workflows,
even
the
tools
like
sherpa
or
others.
A
These
are
hyper
parameter,
optimization
tools.
We
also
just
heard
from
katie
about
tools
to
link
python
and
fortran
codes,
and
I
think
this
is
actually
a
very
important
topic,
as
we
are
looking
at
emulators
for
csm
or
other
models
or
the
month
six
months,
even
that
we
heard
from
from
xena
there
may
be
questions
concerning
the
computational
resources
needs.
Encode
does
provide,
of
course,
gpu
capabilities
with
casper
and,
of
course,
the
next
system
that's
coming.
A
We heard about physical constraints and how we integrate them into machine learning algorithms, and about improving parameterizations, although I guess we heard that's a mixed bag: emulation of physical parameterizations is not always an improvement. Maybe we didn't concentrate much here on uncertainty quantification, but these are all points of interest to this group.
G
Yeah, I'm happy to go first. I love the first bullet point: how could we work to connect everyone, particularly at NCAR, where we have scientists working across so many different components of CESM and with so many different areas of expertise? But I'm curious to hear from everyone here, or whoever would like to speak up: how can that be done?
G
Or
can
one
of
these
efforts
be
done
to
connect
everyone
without
it
being
more
like
imposing
on
others
time,
because
I
think
in
this
virtual
world
a
lot?
Maybe
people
are
having
more
and
more
meetings
or
you
know
just
finding
a
struggle
for
time
balance.
So
yeah
are
there
efforts
that
others
are
part
of,
and
that
have
seen
success
or
any
suggestions
for
how
something
like
that
could
be
done
or
led.
A
Maybe
I
can,
you
know,
provide
my
own
thoughts,
but
I
of
course
invite
everybody
to
contribute,
so
a
first
step
could
be
and-
and
it
would
of
course
require
some
resources
to
build
up-
maybe
even
just
a
simple
web
page
on
one
of
the
end
car
servers
to
facilitate
information
exchange,
for
example,
let's
post
our
references,
can
we
maybe
list
the
csm
related
machine
learning
references
in
one
location
that
we
have
a
kind
of
a
go-to
place
to
see?
What's
going
on,
at
least
in
the
csm
world?
A
Of
course
we
have
many
other
activities
in
the
community,
but
that
could
be
a
starting
point.
Maybe
with
a
few
you
know
highlights
from
recent
papers,
maybe
updated
once
a
month,
or
so
you
know
not
too
frequently,
but
a
place
where
we
at
least
would
go
to
to
find.
Maybe
even
collaborators
from
the
end
card
team
and
see
what
the
activity
is.
D
Yeah
so
well,
first
of
all,
thanks
katie
for
sharing
some
of
the
work
that
we've
done
with
hpe.
D
So,
as
katie
said,
I'm
sort
of
mostly
interested
in
doing
the
improvement
of
parameterizations,
and
I
was
sort
of
wondering
if
ncar
could
maybe
act
as
a
host
for
some
kind
of
repository
for
like
transfer
learning,
so
that
we
can
share
our
you
know:
trained
neural
networks
or
parameterizations
with
each
other.
You
know
thinking
about
a
conventional
turbulence,
closure
parameterization
right.
It's
like
we
want
to
share
that
idea
with
other
researchers.
D
You
know
we
either
share
the
equations
or
maybe
share
numerical
algorithms,
but
with
machine
learning,
it's
a
little
bit
different
right,
we're
sharing
the
results
of
training.
You
know
a
neural
network
or
something
so
you
know
if
we
wanted
to
share
our
training
that
produced
that
mom
6
movie,
that
katie
showed
you
know
another
another
scientist
or
another
modeling
center
may
not
have
the
ability
to
run
the
very
expensive
model
that
we
use
for
the
training,
but
we'd
still
like
to
be
able
to
you
know
share.
D
You
know
share
the
results
that
we
have
with
them
right,
because
I
think
that
would
be
quite
nice
and
I
think
ncar
is
perhaps
uniquely
situated
to
do
that,
since
we
have
the
computational
resources-
and
we
have
you-
know
we're
hosting
these
gigantic
data
sets.
So
why
not
hope
be
like
a
host
for
these
kinds
of
model?
Improvements
too.
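Sharing "the results of training" rather than the training itself boils down to serializing trained weights so that anyone can reload them and reproduce the predictions without rerunning the expensive model. A minimal sketch with a hypothetical two-layer network and NumPy's `.npz` format (real repositories would more likely use framework-native formats such as ONNX or PyTorch checkpoints):

```python
import numpy as np
import os
import tempfile

# hypothetical trained network weights (one hidden layer, 10 inputs, 1 output)
rng = np.random.default_rng(0)
W1, b1 = rng.normal(size=(10, 32)), np.zeros(32)
W2, b2 = rng.normal(size=(32, 1)), np.zeros(1)

def predict(x, W1, b1, W2, b2):
    """Forward pass of the small network."""
    return np.tanh(x @ W1 + b1) @ W2 + b2

# "publish" the trained result to a shared location
path = os.path.join(tempfile.mkdtemp(), "shared_param.npz")
np.savez(path, W1=W1, b1=b1, W2=W2, b2=b2)

# another center "downloads" and reloads it
loaded = np.load(path)
x = rng.normal(size=(5, 10))
same = np.allclose(predict(x, W1, b1, W2, b2),
                   predict(x, loaded["W1"], loaded["b1"],
                           loaded["W2"], loaded["b2"]))
print(same)  # True: identical predictions without retraining
```

The point is that the downstream user needs only the weight file and the architecture definition, not the training data or the compute that produced them.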
B
Yeah, I think that's a great point, Scott. I know Gokhan has his hand up, but I'll just mention that in our extreme weather detection collaboration, we've been able to leverage models that were trained elsewhere and then do that sort of transfer learning, taking those models and applying them to CESM output. It's less computationally heavy on our end, because we don't have to do the training from zero.
D
My comment was along similar lines, and I think Bailey just commented similarly. Listening to all the talks, well, not that many actually, it looked like many people are using different approaches, and I guess I don't have a sense of how common these algorithms, scripts, or codes are.
D
I see this field as more or less in its early stages, almost in its infancy. People are probably trying different methods and different techniques, but in five years' time we may end up with a whole suite of these algorithms. Is it possible for CESM to host some of the algorithms that are accepted by the community, or identified as robust? I can see that this could diverge quite a lot, so what is the plan?
B
Yeah,
I
guess
I'll
I'll
just
also
mention
that
some
of
the
early
machine
learning
for
earth
science
tutorials
have
played
a
similar
role,
and
this
is
also
where
we
can
leverage
the
work.
That's
been
done
in
sizzle
because
to
prepare
for
those
tutorials
they've
created
a
set
of
no
jupiter
notebooks
or
python
scripts,
using
some
of
the
common
machine
learning
libraries
that
we've
heard
you've
heard
about
today.
B
So
I
think
maybe
that's
happening
a
little
bit,
maybe
where
cesm
can
be
more
specific
to
some
of
the
techniques
we
would
want
for
looking
at
different
climate,
modeling
questions
and
there's
certainly
some
best
practices
and,
as
dave
said,
maybe
some
sort
of
standard
ml
toolkit,
because
there
are
a
lot
of
choices
to
be
made
along
the
way.
I
think
garrett
kind
of
hinted
at
that
in
his
talk,
so
there's
not
really
one
size
fits
all,
but
yeah.
D
I
mean
an
interesting
thing,
is
that
I
don't
know
in
late,
1990s
machine
learning
came
up
again
and
many
people
essentially
were
working
on
it
and
nothing
came
out
of
it.
I
don't
know
whether
there
was
actually
lack
of
momentum
behind
it
or
not,
but
it
was
actually
it
comes
in
cycles
almost
and
we
need
to
make
sure
that
this
one
sticks
around.
I
guess.
G
Yeah
thanks
mom
katie
gokan.
I
wanted
to
reply
to
your
suggestion
or
idea.
I
think
it's
a
great
point
it'd
be
interesting
to
see
if
maybe
like,
while
katie
mentioned,
that
the
models
so
far
that
are
being
built
and
all
these
different
projects
and
things
like
that
are
highly
specialized
for
a
specific
question
and
the
architectures
can
vary
greatly.
G
It
would
be
interesting
to
maybe
try
an
effort
where
we
have
some
sort
of
like
vanilla,
that's
what
they
call
like
the
simpler
cnns,
but
some
sort
of
vanilla,
cesm
applied
neural
network
that
maybe
can
be
provided
to
the
community
specific
for
cesm
and
maybe
and
have
that
be
untrained
and
others
can
train
it
for
their
own
application.
G
I
don't
know
just
a
totally
random
thought
that
I
had,
as
you
were,
explaining
that
to
try
to
maybe
motivate
more
people
to
train,
but
something
that's
like
very
standard
like
like
we
just
kind
of
packaged
it
and
provided
it
I
don't
know,
that's
just
a
random
thought.
I
had.
E
One thing that I've learned, and this could just be a product of my workflow, is that it's definitely not one-size-fits-all by any means. And I did want to commend CISL and their staff for getting things implemented that I've been interested in using, whether it's Sherpa or the XGBoost library for boosted forests, things that weren't in the Python libraries on Casper; they've been really helpful in getting those installed.
E
I
know
it's
not
what
you're
talking
about
with
some
kind
of
a
robust
everybody
can
can
kind
of
use
this
machine
learning
model,
but
it
is.
It
is
very
helpful
to
be
able
to
use
those
resources
and
have
that
and
have
that
you
know
ability
to
to
kind
of
do
do
what
whatever
you
you
as
long
as
you
can
justify
that
you're
gonna
get
use
out
of
it.
You
know
they're
really
helpful
at
getting
those
things
together
for
you.
So.
G
Hi,
I
also
want
to
say
something
about
what
maria
and
gokun
was
saying,
and
actually
those
are
great
points.
One
thing
that
I
think
our
community
should
stay
vigilant
about
is
like
the
the
basically
progresses
in
the
like
the
new
algorithms
that
come
come
up
and
basically
are
very
successful
and
can
be
used
like
in
the
in
the
transfer
learning.
G
So
there
are
like
a
lot
of
like
competitions,
and
there
are
like
a
lot
of
like
newer
algorithms
coming
up
that
are
like
a
basically
a
version
of,
for
example,
see
a
convolutional
neural
network
cnns,
but
they
are
like
kind
of
like,
for
example,
something
like
unit
or
alex
net,
or
something
like
that.
G
Those
things
I
think
it
would
be
important
for
us
to
stay
vigilant
after
mostly
some
like
progresses
on
those,
because
there
are
like.
I
think
there
are
a
lot
of
like
newer
algorithms
or
newer,
basically
neural
network
structures
that
are
like
successful
and
can
be
used
in
in
like
other
fields,
and
we
can
use
it
with
like
transfer
learning
a
bit
like
csm
or
with
climate
data.
G
Can I reply to that? Yeah, that's a great point. I think we could find a way to get the community more involved in the computer science conferences on machine learning, like NeurIPS, and there's that NeurIPS Earth science workshop. We already have AGU, AMS, et cetera, but getting the CESM community involved in some of these large machine learning conferences would be great for staying up to date on some of these newer models that are coming.
A
So we are past our allotted time. We can keep this channel open for interested participants, if you still have a little bit of time and there is more interest in this discussion. Of course, this was brief, and we couldn't address all the points we could have potentially discussed, so please view this as the start of the conversation, because I think there's a lot of potential for machine learning and CESM.
A
So
the
action
item
for
us
as
organizers
will
be
to
think
about.
Maybe
can
anchor
facilitate
communication,
as
this
can
be
at
first
pretty
simple.
I
am
not
envisioning
a
huge
effort
right
now,
but
you
know
I
I
suggest
we
give
it
a
go
and
see
whether
we
have
a
better
platform
to
use
with,
and
this
is
really
specific
to
csm.
Of
course
we
have
access
to
all
these
other
resources
ago,
ams
and
so
forth,
but
I
think
it
would
be
good
place
to
start.
You
know
a
discussion
about
csm
related.