Description
July 11, 2019 Jupyter Community Workshop lightning talk by Shreyas Cholia, Lawrence Berkeley National Laboratory
This came out of an LHC detector project, where they wanted to do some distributed deep learning, using convolutional neural networks to classify some of these images coming out of the detector, so without getting into the details, yeah.
The idea is to be able to use Jupyter to solve a couple of these classes of problems. So why interactive distributed deep learning? I think for a lot of projects this is kind of the next frontier in terms of being able to enable scientific discovery. It typically takes a while to train these networks, and you're...
You know, doing a lot of tuning and figuring out all the parameters and hyperparameters that go into a model. There's a lot of brute-force scans and automated optimization, and then our batch HPC systems have their own wait times and slow iteration cycles. Combine this with the fact that a lot of the new deep learning frameworks are Python-based, things like Keras and TensorFlow, and I think using Jupyter notebooks, Jupyter as a whole, as the environment to manage these things made a lot of sense.
So for this demo: this was part of an LDRD, and we presented this work at the ISC interactive computing workshop. For the things to get all this to work together, we ended up using IPyParallel to manage the tasks on the back end, and we're using qgrid from Quantopian to render an interactive table that you could use to flip through, and you'll see...
...a couple of movies in a second. bqplot from Bloomberg was really useful for doing visualization, and then we wrote this little thing called Kale that would let you have fine-grained control over the tasks themselves. So if you wanted to issue starts and stops and change the parameters, Kale would just wrap your individual tasks and then you could basically control those through the service. All right, so here's a little bit about how we set all this stuff up.
So at NERSC we have a JupyterHub web server that basically lets you spin up a notebook. There are other talks that will go into a lot more detail on the various other ways you can do Jupyter notebooks at NERSC, but for this particular effort we're spinning up on the equivalent of a login node. So it's got a lot of memory, a lot of CPU, and it's a shared resource, but we can spin up a lot of notebooks on there. So you spin up the notebook server process on here, you start up...
...the kernel, which runs the IPyParallel client, and you bring up a bunch of back-end nodes. On the compute side we had a little magic called %ipcluster that would let you do that. You just give it a few parameters and it spins everything up for you, and it lets you set up all of these nodes, which you can then control using IPyParallel. And because we're using IPyParallel, that also gave us the ability to use MPI on the back end.
Dask is probably better supported these days, and if there's a way to do this in Dask moving forward, that would be interesting. So yeah, this is basically just a couple of screenshots of how we set this thing up: you just describe your job, pass it into this magic, and bring up an IPyParallel client, which connects to the cluster on the back end.
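The pattern being described, fanning tasks out from a notebook to a pool of workers and collecting results, can be sketched with the standard library. This is only an analogy, not the NERSC setup: IPyParallel's `Client` and `view.map` play the role that `executor.map` plays here, except the workers are compute nodes brought up by the `%ipcluster` magic rather than local threads.

```python
from concurrent.futures import ThreadPoolExecutor

def task(x):
    # Stand-in for a training or analysis task that would run on a
    # back-end worker in the real setup.
    return x * x

# Fan the tasks out to the workers and gather results in order.
with ThreadPoolExecutor(max_workers=4) as ex:
    results = list(ex.map(task, range(8)))

print(results)  # [0, 1, 4, 9, 16, 25, 36, 49]
```

With IPyParallel the shape of the code is similar: you obtain a view over the engines and map your task function across the inputs, with the scheduling handled for you.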
So it's just connecting to these workers and you're off and running. And so we did two kinds of things. There's this distributed training, which was basically just "go off and do the training," and for that we used a tool called Horovod, which is out of Uber. They basically give you a bunch of primitives to do distributed deep learning, and they actually use MPI under the covers, so if you look at their primitives you'll see things like hvd.rank and whatnot. So you can...
...actually, you know, combine this MPI world with a more deep-learning, training-model world. And then you'll notice that we could actually just use IPyParallel to start the workers and then use Horovod to do all the communication between those, and there was really no overhead in terms of the infrastructure. All right, so that was maybe not quite interactive; there you're mostly using the existing stuff.
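The hvd.rank primitive mentioned above is how each worker learns its identity so the dataset can be split across workers. A pure-Python sketch of that sharding idea (no Horovod or MPI required; `shard` is a hypothetical helper standing in for what `hvd.rank()` and `hvd.size()` enable):

```python
def shard(dataset, rank, size):
    """Return the slice of `dataset` owned by worker `rank` out of `size`
    workers: every size-th sample starting at `rank`."""
    return dataset[rank::size]

dataset = list(range(10))
size = 4  # pretend hvd.size() == 4
shards = [shard(dataset, r, size) for r in range(size)]

# Every sample lands on exactly one worker, with no duplicates.
assert sorted(x for sh in shards for x in sh) == dataset
print(shards)  # [[0, 4, 8], [1, 5, 9], [2, 6], [3, 7]]
```

In real Horovod training, each rank trains on its shard and the gradients are averaged across ranks with MPI collectives under the covers.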
The more interactive piece was for parameter optimization. This actually involves setting up these workers and then trying to optimize the hyperparameters across a bunch of different possible models that you're trying to use. And so what we're doing is running each task separately and then seeing which tasks are doing better. You can get the loss and the accuracy, and you can sort through what's going on, as you'll see from this short little movie we have here.
All right, so the idea here is that you're basically running this across a space of hyperparameters, which you can see down over here. You've got a bunch of different values that you're trying out, you can flip through and see which models are doing better and which ones are not, and you can sort based on these things. So if you want to sort for the best model based on validation loss, or accuracy, or loss, you can do that.
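The kind of record the interactive table displays, one row per hyperparameter combination with its metrics, sortable by validation loss, can be sketched like this. Everything here is hypothetical: `train` is a toy stand-in producing fake deterministic metrics, not the talk's actual training task.

```python
from itertools import product

def train(lr, batch_size):
    # Toy stand-in: pretend the model does best near lr = 0.01 and
    # small batches, so the "metrics" are deterministic fakes.
    val_loss = abs(lr - 0.01) * 10 + batch_size / 1000
    return {"lr": lr, "batch_size": batch_size,
            "val_loss": round(val_loss, 4)}

# Run one "task" per point in the hyperparameter grid.
grid = product([0.1, 0.01, 0.001], [32, 64])
results = [train(lr, bs) for lr, bs in grid]

# Sort the table by validation loss to find the most promising model.
best = sorted(results, key=lambda r: r["val_loss"])[0]
print(best)  # {'lr': 0.01, 'batch_size': 32, 'val_loss': 0.032}
```

In the demo, each of these tasks runs on its own back-end worker, and qgrid renders the `results` rows as a live, sortable table in the notebook.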
So it's a nice way of running stuff in real time, getting those results, and then actually being able to do things like stop and start things in case some are more promising than others. And I actually have a second movie here. Okay, yeah. So this is the second little video, where it's the same thing. This is...
...a little bit more of a toy problem, but here we're actually starting and stopping jobs. So here you're actually stopping something that didn't look promising, and you can go and do things like tweak the parameters that you're running against. You can also get resource monitoring under the covers, and...
...yeah, you can change the parameters that you pass, the hyperparameters that you're running the job with. So you're stopping a job, redefining those hyperparameters, and starting it up again. It's a nice way of doing sort of interactive training, and all the setup and the pre- and post-analysis happens as a Jupyter notebook, so it's not just a one-off widget thing: it actually fits into a larger workflow. All right, so we took the same approach with the National Center for Electron Microscopy, where we're looking at a bunch of these images.
There's a thing called py4DSTEM, which takes these two-dimensional images, and then you can explore each pixel in that 2D image, and that gives you another two dimensions, and that's where the 4D comes from. They had all of this in a Jupyter notebook to do their analyses, and then we basically just put these hooks on the back end and allowed them to spread their tasks across an HPC cluster, and we got a really nice speedup there.
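The "4D" layout described above, a 2D grid of scan positions where each pixel itself holds a 2D diffraction pattern, can be sketched with toy dimensions. This is just an illustration of the data shape py4DSTEM works with, not its actual API:

```python
# Tiny toy dimensions: a 2x2 scan grid, each pixel holding a 4x4
# diffraction pattern. Real datasets are vastly larger, which is why
# distributing the per-pixel analysis across a cluster pays off.
scan_h, scan_w, det_h, det_w = 2, 2, 4, 4

data = [[[[0.0] * det_w for _ in range(det_h)]
         for _ in range(scan_w)] for _ in range(scan_h)]

# Exploring one scan pixel gives you its full 2D diffraction pattern.
pattern = data[1][0]
assert len(pattern) == det_h and len(pattern[0]) == det_w
```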
All right, so this was an extra slide; since we actually have some time I'll kind of walk through some of this stuff, and it might feed into other topics later in this workshop. This is also work with Dan Allan. So we're talking about doing curated notebook environments, where the idea is that you can browse these curated notebook environments and clone them into the user's workspace with the appropriate conda environment, so you might have reproducible notebooks.
In some sense it's a lot like Binder, but it's in this HPC world where you don't have that kind of back end, and really all you're trying to do is copy a notebook over, send it off with the appropriate environment. You want to be able to look at things easily and then create a copy in your workspace with the appropriate kernel. I think that's kind of the request we've been getting from users, and I think we're still very much in the prototyping and experimenting phase of that.
But it'll be useful to see what other people are thinking in this space as well. We're also playing with papermill from Netflix to do this parameterized notebook thing, where people want to run against different datasets and they just want to capture everything as a notebook, but also capture the parameters, so we're playing around with that.
We've got a couple of JupyterLab extensions that we have some students looking at. The Slurm extension lets you manage things: you can basically bring up Slurm as an extension in JupyterLab and submit jobs, release them, kill them, do things like that.
I think somebody had a request for something like this on the Discourse. And then we also have a resource usage monitoring extension, which is basically the nbresuse thing that we talked about, with a couple of graphs that let you display that. All right, that's all I have.