From YouTube: Day 1 AI for Science at NERSC
A
Thanks, Troy. Thanks to the organizers at NVIDIA and my colleagues at NERSC for putting this on, and thanks to all of you for being here and listening. I'm Steve Farrell, a machine learning engineer at NERSC in the Data and Analytics Services Group. Broadly, my job is to support machine learning workloads on our NERSC high performance computing systems.
A
I will talk a little bit about the kinds of things that we do at NERSC. I won't go too much into introductory material; I'll say a little bit about AI for science and our perspective from an HPC center. Of course, there's going to be more introductory content coming up later, so I apologize if I gloss over things that may be of interest to you, but I'm happy to supplement with questions afterwards or discussions on Slack. And I think I'm ready to go here.
A
Yeah. Unfortunately, we can't actually use a NERSC system today. We were really hoping to be able to use Perlmutter resources for the hands-on material today, and in fact that's why we delayed this event from last year. But these kinds of systems are complicated, and I'll mention shortly that we're in the process of upgrading the system. It's very hard to predict how things are going to be.
A
We probably could have used Perlmutter today in the end, but there was a lot of uncertainty yesterday and we had to make a decision; I had to pull the plug. So we're very grateful that NVIDIA has resources they can spin up so quickly for events like this. So I'll say a little bit about AI for science, and then I'll talk about NERSC and about our AI strategy.
A
All right, so we're all here presumably because we're interested in science. We're working on science, we're probably all working on interesting problems, and we're aware of AI's potential to enhance our research and really transform the kinds of science that we're doing. In fact, as AI is being rapidly adopted across many domains of science, we're seeing that it can be applied in almost any science domain.
A
I don't really know of any domain where it has not yet been considered potentially transformative. But even within specific science domains, AI broadly can be applied to a lot of different aspects of our research workflows, including, but not limited to, the things that I have here, such as analysis of large datasets. Of course, AI is not limited only to large datasets, but that's where the modern techniques in AI, with deep learning and deep neural networks, really shine.
A
When I say analysis of large datasets, that can also mean a few things. We know that AI gives us methods that can learn directly from data, and in many cases these learned models can actually get more out of our data than we can with hand-engineered features, by learning the complex features that are needed to solve a specific problem.
A
AI can also help us in cases where maybe we don't really have a great traditional solution to a problem. Maybe instead we rely on hand labeling data or scanning through data, which is tedious and limits how much we can do, even with our grad student armies, right? But with AI we can automate a lot of that. So that's just a couple of things so far. Another big one is acceleration of expensive simulations, and this is especially relevant from the HPC facility perspective, but also broadly in science.
A
We rely a lot on having physical models of the world, on having simulations that can go from initial conditions to final conditions, or from first principles to some observed quantities. And very often the amount of science we can do is actually limited by the computational resources that we can commit to that.
A
Sometimes for these computations, like performing density functional theory on a very large system of atoms, the computational need just explodes, and that limits what we can actually do. Or consider our ability to model the climate of the Earth: we can do things at low resolution, and maybe we even have good physical models for the smaller-scale physics, but trying to model the entire Earth at the resolution needed is pretty much impossible with today's resources. So again, science is limited by that.
A
You may have seen a paper from DeepMind not too long ago where they had really great results on controlling a tokamak fusion reactor with AI, showing that they could, I think, even go beyond what an expert engineer could do. So AI is really being enthusiastically adopted by the science communities, both in the DOE and the NSF and beyond, and we see a recent AI wave here.
A
There are a lot of science domains, I think, that are still waking up to the capabilities of AI, but luckily we're also seeing, in a lot of areas, research moving from proof of concept to maturity. Things are actually getting sophisticated enough, mature enough, that they can be used to do scientific discovery or be used in scientific production. Of course, that doesn't mean the story is done here.
A
There's still a lot of work needed, which is why we're all here: so that we can learn more about AI and how to apply it to our problems. And as things keep growing, as things keep getting more sophisticated, as we tackle more and more complex problems, the computational needs of AI become quite demanding, and they're still growing.
A
So HPC centers like NERSC can play a really important role, not only because they provide those needed computational resources with large-scale high performance computing systems, but also because they provide the expertise for how to deploy those workloads. It turns out that it's still non-trivial to deploy, let's say, a massively parallel training run of one of the biggest cutting-edge, state-of-the-art deep learning models out there today. Hopefully that will get better over time, but that's the situation now. So, an introduction to NERSC: NERSC is the National Energy Research Scientific Computing Center.
A
We are located at Lawrence Berkeley National Lab, and by mission HPC center I mean that we cover the whole mission of the Department of Energy Office of Science: all the science domains that the Department of Energy funds and cares about can potentially get time on our systems. In fact, the DOE allocates most of the hours on our systems. So we have a very large and diverse user base, with lots of different kinds of science being done on our systems. And that brings me to the systems we have.
A
Okay, I'll get a little bit more now into our strategy, what we're doing to support and enable cutting-edge AI methods for science. There are roughly three categories here that should be fairly digestible. First, deployment: we try to deploy optimized systems for AI for science, both hardware and software. But we've found that that's really not enough. You can't just have a well-optimized system; we also have to be there in the weeds.
A
We have to make sure that we have the expertise, and also be on the front lines to push on methods and tools. So we do engage a bit with the community and with scientists. We have postdocs that we hire at NERSC to work on research problems applying AI for science. And then the third thing is empowerment: we do a bit of outreach, with seminars, workshops, training events, and schools, which I'll say a little bit more about, and of course this event today is one example.
A
In the deployment category, I'll say a little bit more about Perlmutter now. Sorry again that you're not able to use it today, but hopefully there's enough material in the presentation here that you'll be able to go back and try it out later if you already have an account. If you don't, then maybe you can request one; I'm happy to talk with people about how to do that if they need it. Perlmutter is a system from HPE.
A
Actually, it's a Cray Shasta system: when we first started procuring it, it was just from Cray, and then Cray was bought by HPE. Last year we deployed the phase one system, which was all of the GPU nodes, 12 GPU cabinets. Each node has four NVIDIA Ampere A100 GPUs, and in total we have over 6,000 of these GPUs, so it's pretty sizable. There's also a fairly substantial all-flash Lustre storage system.
A
That storage isn't available right now because of the upgrade, which is one of the problems. The phase two upgrade is what's happening now: it brings a whole CPU-only partition to Perlmutter, in addition to the GPU partition, that is, nodes without GPUs for workloads that either don't yet use GPUs or don't need them. It also brings an upgrade to the network, and this is actually the part that's really impacting the GPU nodes as well.
A
And one other thing to say here: NVIDIA was kind enough to call this the world's fastest AI supercomputer when we turned it on. All right, so part of our strategy is to track what's going on in the community and with our users. We do see a growing scientific AI workload at NERSC, and of course we anticipate that it will keep growing as people put their workloads onto Perlmutter, which is particularly well suited for these kinds of workloads.
A
We track these things in a few ways. One is that we can actually track machine learning software usage, at least to some extent, on our systems. Some of this is not yet working on Perlmutter, but that's still in progress. In principle, if somebody does module load pytorch or tensorflow, we can log that, and we have a way to log Python imports, so we can see what Python packages people are using.
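As a purely illustrative aside, here is a minimal sketch of one way such import logging could be done in Python. This is not NERSC's actual monitoring mechanism, and the log file name is hypothetical.

```python
import builtins
import logging

# Illustrative only: one possible way to record top-level package imports.
# Not NERSC's actual tooling; the log destination is a hypothetical placeholder.
logging.basicConfig(filename="python_imports.log", level=logging.INFO)

_original_import = builtins.__import__

def _logging_import(name, *args, **kwargs):
    # Record just the top-level package name, e.g. "torch" or "tensorflow".
    logging.info("import %s", name.split(".")[0])
    return _original_import(name, *args, **kwargs)

builtins.__import__ = _logging_import
```

Dropping something along these lines into a site-wide startup hook (for example, a sitecustomize module) would record which ML packages jobs pull in.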
A
For example, on the bar chart on the right you can see roughly six-times growth from 2018 to 2021 in TensorFlow and PyTorch usage, and hopefully we'll soon have another, much larger extension of that. We also put out a survey; we've been doing this about every two years, and there's one going on right now, where we ask the scientific community, including NERSC users, who I think make up probably most of it, what they're doing, what kinds of problems they're working on, and what their computational needs are.
A
We ask what kinds of software they're using, the tools they need, how they're using the systems, and things like that. As I said, there's one ongoing right now, and it would be really great, if you're applying machine learning to science, if you would help us out by filling out that survey; there's a link at the bottom. I did share these slides on the Slack presentation channel, and I can also put them in the Zoom chat or share them in any way that you need. I'll show some plots that come out of those surveys.
A
I don't have the preliminary ones from this year; the conclusions are not too different, but I can point out ways in which the trends are changing. One thing we see from the users out there, from the community, is a real need for large-scale resources and for parallelization, which basically motivates the need for HPC systems.
A
Our users can sometimes take days or weeks to train their machine learning models, and they can have large datasets of hundreds of gigabytes, terabytes, and these days even getting into petabytes. I don't have anything here on the different ways to parallelize machine learning workloads, like training workloads.
A
We actually have a whole tutorial on that, and I'll share a link later on. But just to say briefly, there are various ways to parallelize machine learning workloads on these systems, and we also ask our users about the kinds of things they're doing.
A
Data parallelism is the one that's most prevalent today, but as models get bigger we're already seeing a need for more kinds of parallelism, like model parallelism and things like that. And that comes back to the point I made before: it can still be non-trivial, still challenging, to deploy these kinds of sophisticated parallel workloads on HPC systems. So we do what we can to try to educate the community and make it easier.
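As a rough illustration of the data-parallel case, here is a minimal PyTorch DistributedDataParallel sketch. The model and data are stand-ins, and it assumes a launcher such as torchrun (or srun with equivalent settings) has set the usual process-group environment variables.

```python
import os
import torch
import torch.distributed as dist
from torch.nn.parallel import DistributedDataParallel as DDP

def main():
    # Assumes RANK, WORLD_SIZE, MASTER_ADDR, MASTER_PORT are set by the launcher.
    dist.init_process_group(backend="nccl")
    local_rank = int(os.environ.get("LOCAL_RANK", 0))
    torch.cuda.set_device(local_rank)

    model = torch.nn.Linear(32, 1).cuda()        # stand-in for a real model
    model = DDP(model, device_ids=[local_rank])  # wrap for gradient all-reduce
    optimizer = torch.optim.SGD(model.parameters(), lr=1e-3)

    for _ in range(10):
        # Each rank would normally read its own shard of the dataset.
        x = torch.randn(64, 32, device="cuda")
        y = torch.randn(64, 1, device="cuda")
        loss = torch.nn.functional.mse_loss(model(x), y)
        optimizer.zero_grad()
        loss.backward()   # gradients are averaged across all ranks here
        optimizer.step()

    dist.destroy_process_group()

if __name__ == "__main__":
    main()
```

Model parallelism, by contrast, splits the model itself across GPUs and generally needs more specialized libraries.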
A
We know that our scientists need performant and flexible software that enables their productivity. They need to be able to iterate quickly and try things out; you don't want to be bottlenecked because the software is slow. So they not only need things that run fast, but they also need flexibility: people need to be able to add whatever packages are relevant to their domain or their application area. At NERSC we deploy these things in a few different ways.
A
We enable our users to use either software we provide or to install their own. We do provide custom-built modules: users can do, for example, module load pytorch and get an installation that they know is built and optimized for our systems. But people can also build their own custom conda environments, and they can use containers. We support containers through our Shifter runtime, and a really important thing for this on Perlmutter is NVIDIA's offerings, the NGC containers, which tend to be very cutting edge.
A
They always have the latest NVIDIA GPU software stack, CUDA and cuDNN and NCCL and things like that, so we increasingly rely a lot on these containers and encourage our users to use them.
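For example, after loading a PyTorch module or launching one of those containers, a quick sanity check along these lines (just a sketch, assuming PyTorch is present in the environment) confirms that the GPU stack is wired up:

```python
import torch

# Quick sanity check of the GPU software stack in the current environment.
print("PyTorch version:", torch.__version__)
print("CUDA available: ", torch.cuda.is_available())
if torch.cuda.is_available():
    print("Device:", torch.cuda.get_device_name(0))
    x = torch.randn(1024, 1024, device="cuda")
    y = x @ x  # exercises the CUDA math libraries with a matrix multiply
    print("Matmul result shape:", tuple(y.shape))
```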
A
I'll move quickly to catch back up a little bit on time here, but we know that scientists also need productive interfaces. Jupyter is a very popular service at NERSC; we have over 2,000 users, and users are actually able to use Jupyter on Perlmutter for their machine learning workloads.
A
You can request a GPU node, you can use software kernels we provide, or you can bring your own. On top of that, users also need systems and platforms for managing all their experimentation and exploration, to find which models are best for their problems, things like Ray Tune and Weights & Biases.
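As a tiny illustration of the kind of experiment management Ray Tune offers, here is a hedged sketch using the classic tune.run API; the objective is a toy stand-in for a real training loop, and the exact API differs across Ray versions.

```python
from ray import tune

def train_fn(config):
    # Toy objective standing in for a real model training loop.
    loss = (config["lr"] - 0.01) ** 2
    tune.report(loss=loss)  # report the metric back to Tune

analysis = tune.run(
    train_fn,
    config={"lr": tune.loguniform(1e-4, 1e-1)},  # search space for the learning rate
    num_samples=8,                               # number of trials
    metric="loss",
    mode="min",
)
print("Best config found:", analysis.best_config)
```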
A
We don't pick and choose any specific offerings here, but we do like Weights & Biases and Ray Tune for our own usage, and we try to make sure these things work, encourage people to try them out, and ask them to let us know if they have problems. And then, what do we do to make sure our systems and software are optimized? Well, one important aspect of that is benchmarking, and this is something that I personally spend a lot of time on: MLPerf.
A
MLPerf is the standard machine learning performance benchmarking effort from MLCommons; it's the industry standard these days. Working with a bunch of sites, we put together an MLPerf HPC benchmark suite that actually brings in scientific applications with the kinds of attributes that we think are important for pushing on HPC systems, things like 3D volumetric cosmology data, high-resolution climate images, or atomic systems for graph neural networks. This has been a really valuable effort for us.
A
It's also been pretty successful, with a couple of submission rounds. We have measurements from systems all over the world, 31 submissions I think in the last round, and we present results at Supercomputing. For us personally, participating has been great for helping us shake out the issues in Perlmutter, understand its performance characteristics and what it takes to get performance out of it, and then we can pass that knowledge on to our users.
A
Now I'll switch a little bit to the application side of things. This is mainly highlighting work that some of our awesome postdocs are doing right now, work that has interesting, sophisticated aspects. I'll just skip that slide, but this first one is the self-supervised sky survey work; my colleague Peter Harrington works with George Stein and some others on this. It looks at images of galaxies from sky surveys, where in this case we have a lot of data but not a lot of labeled data, and so self-supervised learning is a technique suited to that.
A
They're actually looking for, I can say this, strong gravitational lens candidates, where a galaxy gets distorted by gravity and can look like this sort of ring pattern here; it's just pretty cool. This next work is called FourCastNet. It's led by some of our postdocs: Jaideep was a former postdoc, now at NVIDIA, Shashank is a current postdoc, and Peter works on this as well. We work a lot with NVIDIA folks on this one.
A
This is taking atmospheric modeling with deep learning to the next level. It uses an interesting Fourier-transform-based operator, basically with an attention mechanism, to be able to do this at higher resolution than was done before with deep learning models, bringing the precision up to the level of numerical models while being much, much faster.
A
So again, this is a case that will potentially open the door to letting us really do better science in modeling the climate of the Earth. One other thing to mention about this: if you watch the GTC keynotes from Jensen Huang, he talks about this work. This last one is another interesting case; it's also one of our benchmarks in that MLPerf HPC suite.
A
This is from the Open Catalyst project, where they're trying to find new catalysts for energy storage, to help combat climate change. This is a case where you would use density functional theory, which is very expensive and slow, but you can replace that with graph neural networks to model the system and get a good speedup. We had a postdoc, Brandon, working on this, who's now at Meta, and it's a collaboration with CMU and Meta.
A
They put out a very large and diverse dataset for this, there's a NeurIPS challenge, and a lot of cool work is coming out of there. One thing that Brandon was able to show in this work is that larger models in this case do better, so they're working on scaling up to large systems. I'm basically out of time, so I'll just say really quickly: we do a lot, again, in empowerment and training.
A
We have done a Deep Learning for Science school at Berkeley Lab; we've had two iterations of this, one in 2019, which was in person, and then one in 2020, which was an online webinar series. You can get videos and slides and everything on these web pages. We also do a Deep Learning at Scale tutorial. The focus there is really on performance and how to scale up the training of a neural network model to a large system, and all the tricks you might need to use there. All the materials are available.
A
We have videos; you can check those out. It's accepted again for Supercomputing this year, so if you're there, please check us out. And then there are things like this boot camp, which we're doing right now. I think I'll just, very quickly, mostly skip the conclusions here. What I'm trying to say is that AI for science needs supercomputers.
A
We see that the field of scientific AI is growing and becoming more sophisticated, which is great to see. There's still work to do, though, and we're doing our best to contribute to that. Feel free to reach out to me if you have questions or want to ask about collaborations. We're also hiring, and there's a link down here with some openings for postdocs and engineers. That's all, thank you, and sorry for running a little over time.
B
Nope, that was perfect. Yeah, great to see what's happening. So now let's go ahead and dive into our boot camp; I'm going to go ahead and hand this over to Caleb.
B
Hey, good morning, evening, afternoon, everybody. I just wanted to say, Steven, that was awesome. I could have listened and looked at way more projects. So if you have a...