From YouTube: 11. Python and Jupyter
Description
Learn about using Python and the ever-popular Jupyter at NERSC.
Slides for all sessions can be downloaded from here: https://www.nersc.gov/users/training/events/new-user-training-june-21-2019/
I'm Rollin Thomas, a data architect from the Data and Analytics Services group—that's Prabhat's group—and I'm going to give two talks. The first talk is about Python and Jupyter at NERSC.
So the idea here is to give everybody a set of high-level takeaways about do's and don'ts with Python and Jupyter at NERSC. But first of all, before I get started, I want to know how many people in here are Python users. Who uses it for just scripting?
Okay, how about people who use Python as a platform for machine learning? I guess nobody is really doing that. All right, okay! Well, thanks for letting me know.
So presumably you all want to do science, and you want to do science with Python at NERSC. This slide is just an example of some of the things that people do in terms of science through Python at NERSC. There's the Materials Project—in fact, I saw a few Materials Project stickers on people's laptops.
So you probably know what that's all about: workflows being managed with FireWorks here at NERSC. People also do data analysis, data flow, and workflow management, like the LHC data-processing workflow. There's also processing for sky surveys—cosmology, cosmic frontier stuff is a big deal here at NERSC; those are here on this slide. Of course, it's a platform for machine learning and deep learning.
Really, it's the way to go. And then there are actually some simulation codes, like Warp or nbodykit, which are written mostly in Python, or Python with C extensions or Fortran linked into them. Those are simulation codes, and some of those run here at NERSC as well. The most important thing to know about getting started with Python at NERSC is that we have really awesome documentation. It's awesome because I wrote it, and we have a lot of really good stuff in there. We try to keep it up to date pretty continuously.
First of all, this morning I think you learned about the software module system at NERSC—if Rebecca was giving a good talk, then she talked about modules, or somebody did. The way to get Python working at NERSC is through modules. You must always be sure to load a Python module. Do not ever, ever, ever use /usr/bin/python!
That's the system Python that came with the Cray, so it's a little bit old. But you can do the standard /usr/bin/env python thing once you have a module loaded, so that you can get the Python interpreter running in your script. If you don't know what versions of Python modules we currently have on Cori, you can do module avail python.
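As a sketch, the commands described above look something like this (exact module names and versions vary; check `module avail` on the system):

```shell
# Never use the system /usr/bin/python that came with the Cray.
# Load a NERSC-provided Python module first:
module load python

# See which Python modules are currently available on Cori:
module avail python

# Once a module is loaded, scripts can use the usual shebang line:
#   #!/usr/bin/env python
```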
Now, this is not the only way you can use Python at NERSC. You can install your own Python if you want; you can compile it from source if that's your thing. But there are other, more recommended ways of doing that, and I'll talk about that in a little bit.
NERSC Python is Anaconda Python. How many people have used Anaconda Python, say, on their laptop? Yeah, most people, right. It's kind of the distribution of Python of choice, especially because it's really good at providing tools for data analytics and scientific computing and getting you going. It has this handy package tool called conda that lets you build environments that are customized: you have your own set of libraries that you like to work with.
You can completely destroy that whole environment and build it all over again if you like, in a matter of minutes. It's very popular. Conda environments replace virtualenvs—who has ever used virtualenv? It's kind of the older tool; conda environments replace virtualenv and do a lot more. And there are a few other packaging tools out there, like pipenv—has anybody heard of pipenv? Yeah, you can use pipenv if you want. Of course, Anaconda Python has many hundreds of very useful packages.
If you are part of a community that does, I don't know, cosmic microwave background stuff, and you have your own way of compiling and your own set of packages that you like to put together, you can use channels to get those. As for the modules: the Anaconda modules are monolithic—there's a whole Anaconda distribution that comes inside that module. There are a few add-on modules, like h5py-parallel, so if you need to use parallel h5py, then you need to add that module.
These are the modules you should be using on Cori right now: the 2.7 Anaconda module and the 3.6 Anaconda module. So whether you're a 2.7 person or a 3.6 person, those are the ones that are there. There are a few other modules you can mix in; you can try different ones out. I've decided that what we'll do is keep 2.7 as the default module. So if you do just module load python, it will load a default module that will be Python 2.7, until the end of this year, when 2.7 is done.
So, how many people have switched to Python 3? All right—those of you who haven't, you're on borrowed time; you have six months, okay. And in fact there's this handy website that will tell you exactly when Python 2.7 will retire. It's a countdown; I think there's supposed to be a party at PyCon 2020, too, if you want to go to that. So if you forget exactly when it is, you can look there.
All right, okay, so: switch to Python 3 in about six months. Okay, conda environments. It seems like everybody knows how to do conda environments, right? So conda create -n, and then this python= whatever is pretty important, because conda might pick for you, and you want to make sure it's the Python that you want—probably Python 3. And then you activate the environment, and you can go on and install whatever you want. You can use pip; actually, I tend to prefer to use pip instead of going to find conda-forge stuff.
I mention that because a lot of users kind of get stuck and they don't know what's wrong, and then I go in and try to just build it from the beginning, and it looks like I didn't do anything and it all just works. Also, I've decided: don't try to do this pip install --user thing quite so much. Just go ahead and stick things straight into your conda environment. Don't put them over in the Python user base unless you really have a good reason for doing that.
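Put together, the workflow described above might look like this (environment and package names are examples):

```shell
# Create a named environment, pinning the Python version explicitly
# so conda doesn't pick one for you:
conda create -n myenv python=3.6

# Activate it (older conda installations use "source activate"):
source activate myenv

# Install whatever you want; both conda and pip now install into the
# active environment rather than into the per-user site-packages:
conda install numpy
pip install astropy
```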
If you don't really know what I'm talking about, that's fine—just stick to conda. All right: doing things yourself. If you don't like our modules for whatever reason, or you don't even like to do the module load thing, you can install your own Anaconda installation if you like. Just a couple of tips there: make sure you don't have a Python module loaded, and unset this PYTHONSTARTUP thing. But this is how you do it: you just grab the installer from Anaconda.
It seems like probably everybody knows how to do that, but you can do this just fine on Cori, and then you just set it up so you can do source activate.
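A sketch of that do-it-yourself installation (installer file name and install prefix are illustrative; check Anaconda's download page for the current installer):

```shell
# Make sure no Python module is loaded and PYTHONSTARTUP is unset:
module unload python
unset PYTHONSTARTUP

# Grab the installer from Anaconda and run it:
wget https://repo.anaconda.com/miniconda/Miniconda3-latest-Linux-x86_64.sh
bash Miniconda3-latest-Linux-x86_64.sh -b -p $HOME/miniconda3

# Set it up so "source activate" works in this shell:
source $HOME/miniconda3/bin/activate
```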
All right, a couple of little things that are special about Cori that you need to know. You should not ever conda install mpi4py. If you want to use MPI from Python, you use mpi4py—but don't ever do conda install mpi4py. It won't work right. It might look like it works on one node or something like that, but then you go to two nodes and it will make no sense.
What you need to do is compile mpi4py against the Cray MPICH, and it's very easy to do. Here I just have, like, five lines. The first one is just downloading the package; unpack it, cd in there, swap a module so you're using the GNU compiler—I think you can use the Intel compiler too, it probably doesn't matter, but I usually do this. The only thing you have to do is python setup.py build, and then tell it where the mpicc compiler is—and that's just the compiler wrapper that we talked about this morning, right, cc. If you do that, then you'll have built your own mpi4py. And you only need to do this if you create a conda environment and you want to use mpi4py from it; if you're using my Python module, mpi4py is there. I have a couple of slides about parallelism with Python.
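Those five-or-so lines might look like this (the version number and programming-environment module names are illustrative, not taken from the slide):

```shell
# Download the mpi4py source tarball from its release page, then:
tar zxvf mpi4py-3.0.0.tar.gz        # version is illustrative
cd mpi4py-3.0.0

# Swap to the GNU programming environment (Intel likely works too):
module swap PrgEnv-intel PrgEnv-gnu

# Build against the Cray compiler wrapper "cc", then install into
# the active conda environment:
python setup.py build --mpicc=$(which cc)
python setup.py install
```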
Just generally, people tend to use process-level parallelism in Python a lot, just because it's easier, okay. What I mean by that is: thread-level parallelism you can really only get by using a compiled library, like in C or whatever—there are a few other things you can do—but mostly people use MKL from NumPy, and it's threaded and vectorized and all of that, and that works pretty well.
But usually, when people are writing Python, they kind of use mpi4py or multiprocessing or Dask or PySpark to get parallelism going. So you'll see these jobs that are, like, flat MPI jobs—by that I mean, like, 68 MPI ranks on a KNL node—and that's kind of how people roll. But you can do both: you can do hybrid parallelism. It's just the same as submitting any other kind of job.
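As a minimal, non-NERSC-specific illustration of process-level parallelism, here is a standard-library `multiprocessing` sketch; `square` and `parallel_squares` are made-up example names:

```python
# Process-level parallelism with the standard library's multiprocessing
# module -- one of the options mentioned above alongside mpi4py, Dask,
# and PySpark. Each worker process applies square() to items of the input.
from multiprocessing import Pool


def square(x):
    return x * x


def parallel_squares(values, nworkers=4):
    # Pool.map distributes the work across nworkers processes and
    # returns the results in input order.
    with Pool(processes=nworkers) as pool:
        return pool.map(square, values)


if __name__ == "__main__":
    print(parallel_squares(range(8)))  # [0, 1, 4, 9, 16, 25, 36, 49]
```

mpi4py follows the same "many independent processes" model, but with explicit ranks and message passing, and it is launched with srun rather than forked from a parent process.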
That's hybrid parallelism—I think I went over that one already. One bit of information is that the only one of these parallel libraries that really could scale to the whole machine is going to be mpi4py, okay. But if you're going to try to scale to a significant portion of the machine—whether it's a Python application or a hybrid Python/C/C++ kind of application—you're going to want to do something besides just launch it out of your home directory. And that's because of some characteristics of our file system; namely, Python's import mechanism is really metadata-intensive.
Basically, any time you do import numpy, it goes through the whole file system, through your whole Python path and all of that stuff, trying to open libraries everywhere. And so if you have a hundred thousand MPI ranks, or even 100 MPI ranks, they're all going to do that more or less at the same time—they're all doing import numpy. What happens is those requests go to a single metadata server, and it says: hey, you guys, get in line; I'll get to all of your requests in order.
Okay, so your application is going to spend a half hour doing import numpy, right. So you don't want to do that. That's bad! So this is actually a benchmark where I do import numpy and then this package called Astropy, which has lots and lots of little submodules in it, and measure how long it takes to import that at that 4800-rank scale. There are different file systems, and the one with the best performance, on the right, is with Shifter, which is a container technology—which just so happens to be the next talk.
But the second-best performance you can get, besides building a Shifter container and running from there, is to use the global common file system that was mentioned earlier. So generally, don't, like, do a big MPI numpy import from your home directory. It's so bad I don't even benchmark it. Okay: scratch is okay, project is not that great, but global common is kind of your second-best one. All right: how to profile and debug Python applications. There's, of course, good old print statements.
If that's the thing that you do—a lot of the time, that's how people get started. You just have to remember to unbuffer the output from both srun and Python.
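That unbuffering trick, as a one-line sketch (the rank count and script name are placeholders):

```shell
# -u on srun unbuffers the job-step output; -u on python unbuffers
# Python's own stdout/stderr, so prints appear promptly in order:
srun -u -n 64 python -u ./myscript.py
```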
But we have, as I mentioned, a whole page about how to profile applications, going kind of from easier-to-use tools to more difficult, professional-grade tools. Python comes with cProfile, and that works just fine on Cori.
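A minimal, self-contained sketch of using the built-in cProfile; `slow_sum` is just a stand-in workload, not anything from the talk:

```python
# Profile a function with the standard-library cProfile/pstats modules.
import cProfile
import io
import pstats


def slow_sum(n):
    # Deliberately naive loop so there is something to measure.
    total = 0
    for i in range(n):
        total += i * i
    return total


def profile_call():
    profiler = cProfile.Profile()
    profiler.enable()
    result = slow_sum(100000)
    profiler.disable()
    # Render the top entries of the profile into a string:
    buf = io.StringIO()
    stats = pstats.Stats(profiler, stream=buf).sort_stats("cumulative")
    stats.print_stats(5)
    return result, buf.getvalue()


if __name__ == "__main__":
    result, report = profile_call()
    print(report)
```

For a whole script, `python -m cProfile -o out.prof myscript.py` writes a profile file you can load into a viewer such as SnakeViz, which is the kind of visualization mentioned next.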
You can use a tool like SnakeViz or gprof2dot—which is what this visualization here is—to see where your code has been spending its time, and then you can work on the bottlenecks there. There's even a way to do this with MPI processes; just follow that link. line_profiler is a tool you can use to study where your bottleneck is, line by line. We've also developed a package here called timemory, which you can instrument into your code.
You put little decorators on functions, and it tells you how much time it spends in that function, how much memory is being used—all kinds of neat stuff. It works with MPI; it works if you've got a Python and C++ application. And, of course, there's VTune for Intel Python, and TAU, which both work on Python.
Okay, so are there any questions about Python that I could handle maybe right now? It's all pretty clear? What's nice, I think, is that we've set it up so that it's kind of not a big deal, right?
Okay, how many people use Jupyter at all? Okay. How about at NERSC? Okay, cool, all right. So Jupyter is, you know, this really powerful platform for data analytics, for creating documents that have code, text, equations, visualizations, widgets—all kinds of nifty stuff in them. Our default Jupyter deployment is JupyterLab, and it has been basically since they said they weren't in beta anymore. Today is the release of JupyterLab 1.0, I think.
So we'll probably be upgrading this in the next few weeks. To use Jupyter at NERSC, we've set up a hub, which is a place where you log in and then you can launch from, and that's the URL you can go to: jupyter.nersc.gov.
So if you were a long-time Jupyter user, or a recent Jupyter user, you might have used jupyter-dev or jupyter. They're the same thing now; we've smushed them together into one thing. And what you can do there is pick where you want your notebook to start up.
You can have it start up on Cori, or you can have it start up in this container environment called Spin, but mostly people are going to want to start up their notebooks on Cori. We have not one node now, not two—we have three nodes set aside that are kind of like login nodes for all of the notebooks that people run, and at any given time there are about 150 or 200 notebooks running across those three nodes. Why would you want to run on Cori?
Well, of course, your notebooks would then be on Cori. They can see the Cori scratch file system; it's the same kind of Python environment as if you SSH in; and you can also submit jobs there. We have some handy little tools for submitting jobs from cells, called slurm magics. The Spin shared-node configuration is external to Cori, so it's not on Cori, it can't see scratch, and you can't submit jobs from it.
What that's for is: you have a paper deadline and you need to get to your data that's on project, so you can make that last plot for your paper. Okay, so let's back up. I'll say that, I think, last time I looked, there were 200 notebooks running on Cori and then, like, two in Spin. Okay, so it's kind of a backup: if Cori is down for maintenance, you can maybe use Spin. All right.
The most common Jupyter question I get is: how do I take a conda environment that I created and use it from inside a Jupyter notebook? There are a few different ways to do this, but here's the way that I recommend. You log in to Cori over SSH and you create your conda environment, and you have to add one package called ipykernel. Okay, if you do that, then the next thing you can do is run this Python command.
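A sketch of those steps ("myenv" is an example environment name):

```shell
# Inside the conda environment you created on Cori:
conda install ipykernel

# Register the environment as a Jupyter kernel; the --name you pick
# is what shows up in the kernel list on jupyter.nersc.gov:
python -m ipykernel install --user --name myenv
```

This writes a kernel spec file under your home directory, which is the file discussed next.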
So if you want to go look at it, you can. Okay, once you've done that, you point your browser to jupyter.nersc.gov. You may need to restart your notebook server, but once you do that, you should see that kernel show up, and then you should be able to click it, and then you have that conda environment from your notebook. This is what the kernel spec file looks like.
So it's just JSON, but basically all it does is take an argument, which is: run Python, and then launch my kernel, and then connection-file stuff—don't worry about that, that's Jupyter stuff, okay. Now, why am I showing you this? Because you can actually do more than just this with it. You can customize the environment: you can add this environment stanza—here in red. It's basically there to let you set the PATH, or the LD_LIBRARY_PATH, or all that stuff that people like to customize.
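A kernel spec along those lines—the paths, names, and the values in the env stanza here are illustrative, not copied from the slide:

```json
{
  "argv": [
    "/global/homes/y/you/.conda/envs/myenv/bin/python",
    "-m", "ipykernel_launcher",
    "-f", "{connection_file}"
  ],
  "display_name": "myenv",
  "language": "python",
  "env": {
    "PATH": "/custom/bin:/usr/bin:/bin",
    "LD_LIBRARY_PATH": "/custom/lib"
  }
}
```

The `argv` list is the "run Python and launch my kernel" part, `{connection_file}` is the Jupyter connection-file stuff to not worry about, and `env` is the customization stanza.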
I don't actually like this quite so much. The way that I like to do this kind of customization—like if you want to add a module or something like that—is: don't run Python; run a script that wraps Python, okay. So the way that you do that is you change that kernel spec file so that, instead of the first argument being "run Python," it's "run this shell script" at some path. Okay, and then inside that shell script you can export whatever you want.
This is, like, a real common one: people want to make matplotlib plots with LaTeX labels or whatever. This is the way you can do it: you do module load this, and then what the script actually does is just run the ipykernel piece, okay.
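Such a wrapper script might look like this (the script name, module name, and variable are example placeholders):

```shell
#!/bin/bash
# kernel-helper.sh -- pointed to by "argv" in the kernel spec file
# in place of the python executable itself.
module load texlive          # e.g. so matplotlib can render LaTeX labels
export SOME_VAR=whatever     # any other environment customization

# Finally run the ipykernel piece, passing through the Jupyter
# arguments (including the connection file):
exec python -m ipykernel_launcher "$@"
```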
So if you have other modules that you want to be able to talk to from Jupyter, you can load them this way. And then Shifter is a container technology I'm going to talk about next.
This is how you could run a kernel from inside a Shifter container. That's also documented on the website, I think; if not, it's on the slides here, and we should add it. And then, before you write me a ticket and say something's wrong with Jupyter, what you should do is look at your notebook server's log file. The place where that's found is in your home directory, at .jupyter.log. It used to just be called jupyter.log, but people told us they didn't like seeing it.
So we put a dot in front of it. Now they don't see it, but we, the staff, know where it all is. What it's got is all the stuff that your server says it's doing, and if you see an error in there, that might give you a hint, okay. We're working on ways to expand support for Jupyter. You can run things like Dask or Spark on compute nodes and talk to them from notebooks, and I can tell you all about that.
If you want to know: we are going to have a way for people to launch notebooks on compute nodes, so that you don't have to share with, you know, sixty-six other people—but you have to pay, okay. And then we're also working on interfaces inside JupyterLab that kind of expose Slurm and things like that, so maybe you don't need to ever SSH in ever again. So these are kind of the key takeaways: it's basically use conda, the stuff about mpi4py, and you should use Shifter.
Okay, so, all right—so there were questions about Dask. Who knows what Dask is? Yeah. So Dask is one of these kind of newer frameworks for starting up little clusters that you submit work to in the form of a directed acyclic graph: you have tasks, they depend on each other, and you say, just go do that. The architecture for Dask distributed is that there's a scheduler—and that's who you submit work to—and then there are workers, and those are the ones that get the stuff from the scheduler and do it, okay.
So how do you run Dask distributed at NERSC? There are a few different ways. One would be to set up a job where you start the Dask scheduler—the thing that drops the scheduler file—with an ampersand, and then the next thing is that you srun all the Dask workers. Okay, so those are the things that get the srun. The scheduler runs on the head node, but the workers run across all of the nodes that are in the job. And then you start your client script up after that.
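A sketch of that batch-job pattern (node counts, file locations, and the client script name are placeholders):

```shell
#!/bin/bash
#SBATCH -N 4
#SBATCH -t 30

# Start the scheduler on the head node, in the background; it drops a
# scheduler file that workers and clients use to find it:
dask-scheduler --scheduler-file $SCRATCH/scheduler.json &

# srun launches the workers across all nodes in the allocation:
srun dask-worker --scheduler-file $SCRATCH/scheduler.json &

# Then start the client script, which connects via the same file:
python client.py --scheduler-file $SCRATCH/scheduler.json
```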
Okay. If it's Jupyter, you have to figure out a way to wire up the connection between the notebook and the scheduler; I can tell you more about how to do that in a minute. But generally, this would be a way for you to start up a Dask cluster inside of a job and submit work to it from a client script. So that's kind of the way to go. Now, when you do that, though, what we want you to do is make sure that you turn on the SSL…
SSL.