From YouTube: NERSC User Group Webinar 2019-12-12
Description: Community File System; Dask and Jupyter at NERSC
C: I didn't click "present" before, so I couldn't get to that. I'm here with a bunch of members from the storage team, Kristy Kallback-Rose, Greg Butler, Kirill Lozinskiy, and a few others remote, to talk to you about the new Community File System that's going to be deployed. We've got some slides, but feel free to jump in with questions as we go.
C: Okay, so why are we deploying the Community File System? The Community File System is the central part of NERSC's long-term storage plan. Our users here at NERSC need to store scientific data for the long term: data gets accessed by lots of people over multiple years, and we need a place where it can be quickly accessed and staged to our more performant tiers for analysis. This also serves the need to share your scientific data.
C: It'll be mounted on all our computational systems, and it will be used by our science portals and in Spin. This is scheduled to roll out with the new allocation year: on January 14th the transition will begin, and we'll have some more details on that. Ultimately it will replace the project file system and also our sponsored-storage projecta file system, but not for some time.
C: So, what's new with the Community File System? The main new thing here is space. It's almost an order of magnitude increase in the space that's available for users: we'll have roughly 60 petabytes available for users, and the plan is to increase that to about 200 petabytes over the next five years, in order to scale with the space needs of our users. The default quota for directories on the Community File System will go to 20 terabytes.
C: It's one terabyte now on the project file system; by default, repos will get 20 terabytes and 20 million inodes. We're also offering better quota management for sub-projects. We've gotten a lot of requests from PIs over the years to be able to have sub-projects inside their directory on the file system with separate quotas, so that they can manage the different requirements of their project that way. This will be possible on the Community File System.
C: You can have separate directories that are all owned by your repo, each with its own quota, and you can split your total quota amongst these directories as you like. There's also a new allocation model: the quotas that repos get on the Community File System will be granted by the DOE allocation managers as part of the ERCAP process.
C: So in ERCAP you'll ask for the space that you need, as you did this year, and then the DOE allocation managers will consider that and grant some or all of it, whatever they choose, and NERSC will follow that. It also comes with a lot of nice new file system features. We'll have faster rebuilds with distributed RAID; what that means for you is that if there's a file system issue, performance won't be impacted for as long.
C: Just like with the project file system, every repo will have a directory of the same name by default. Everyone gets at least one: when your repo is created, a directory is automatically created on the file system. Like I said before, if you need multiple directories, you can write to us and request them, and that can be done.
C: We have an example down here of what this looks like on the file system. The new path will be /global/cfs/cdirs and then your repo name, so /project/projectdirs is replaced by /global/cfs/cdirs, and these are the permissions it'll have. This example is the directory for nstaff, owned by that repo's group.
C: It's group-readable and writable and the sticky bit is set, so this is all the same as on the project file system, with the exception of the new path; the group permissions are the same as on project. The Community File System will be mounted on every system: it'll be mounted on Cori, on our other systems, and on our DTNs. It will have the same retirement policy that we have now for project: inactive repos are migrated to HPSS after a year.
C: So what we're doing is syncing the data over, but in order to make sure that we get a final and complete sync of the data, we need a period of time where the project file system is read-only, where it's not being changed. That's going to start January 14th, the beginning of the new allocation year: it will become read-only. The project file system will remain read-only for about a week, until January 21st, and we're going to make every effort we can to return the file system to you earlier.
C: But this is how long we estimate it will take. So if you feel this will cause a major hardship for you, please reach out to NERSC consulting and we'll work with you to try and mitigate it. Once this is deployed, the old path at /project/projectdirs will actually stay around: it will point at the new Community File System until sometime in mid-2020, when we retire it. That's so that you don't have to rush to update your scripts.
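For illustration, a minimal sketch of how a script could handle the path change during the transition, using a hypothetical repo name "myrepo" (not from the talk):

```python
# Minimal sketch, hypothetical repo name "myrepo": prefer the new CFS path
# and fall back to the old /project path while the compatibility link exists.
import os

OLD_DIR = "/project/projectdirs/myrepo"
NEW_DIR = "/global/cfs/cdirs/myrepo"

data_dir = NEW_DIR if os.path.isdir(NEW_DIR) else OLD_DIR
print(f"using data directory: {data_dir}")
```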
C: Just /project; not projectb or any of the others. /project is the only one that will be read-only, and it's the only one we're migrating data from. For projecta I have a slide a little later, and we'll talk about that in more detail for those of you who have sponsored storage. So the old path will stick around; however, once the data migration is done, we encourage you to go through and update your paths.
C: Okay, so this is a little sketch of where we are and where we're going. Right now we've got the project file system: it's read-write, and it's visible at both /project and /global/project. Right now we're doing a month of data transfer, so the Community File System is present on the systems, but it can't be written to by anyone but root, and it'll stay like that until January 14th. Then, during the Cori maintenance, project will become read-only.
C: Now just a quick slide on sponsored storage. For those of you who don't know, NERSC allows some groups to purchase large blocks of storage on a separate file system. Existing sponsored storage purchases will be honored until the end of their contract; at a later date these spaces may be migrated onto the Community File System, but the space will still be yours for the time that you have purchased. Going forward, we'd like the Community File System to handle most of the volume of these sponsored-storage requests.
C: We're also only going to consider purchases of additional sponsored storage twice a year, to try and help people consolidate purchases if they need to in order to get to the one-petabyte level. In general, if you have a large storage need, it should be communicated to your DOE allocation manager and should flow through them, so that we can better serve the priorities and get you the storage that you need. I think that's all I have; I'll just also say we've sort of had an unbalanced file system hierarchy for a while.
E: What should the expectations be for metadata performance and throughput compared to project?
F: Okay, I'm Rollin Thomas, from the Data and Analytics Services group here at NERSC. I take care of Python and Jupyter on Cori, and I thought it would be a good time for us to talk about how to use Dask through Jupyter on Cori here at NERSC. If you're not familiar with what Dask does or what it is, these first couple of slides are for you. The story starts with Python.
F: Python is now the dominant language for data analytics and for general programming, and it's also a major platform for machine learning and deep learning; that's definitely true here at NERSC. This growth has been fueled by a number of factors that include easy-to-use computational libraries like NumPy, SciPy, pandas, and scikit-learn (the scientific Python stack), libraries for visualization like Matplotlib, Seaborn, and Bokeh, and tools for interactivity and sharing that are becoming ever more popular: Jupyter notebooks and the Jupyter ecosystem in particular.
F: Now, the problem is that these tools were not really designed to scale beyond a single machine, and that's where Dask comes in. Dask is a scalable analytics platform that is supposed to work well with these Python-based tools; it's developed specifically to scale up with the Python ecosystem and to serve multi-core machines, distributed clusters, university clusters, and now high-performance computing. I'll also say that this slide and the next slide are adapted from the Dask documentation.
F: I didn't just write them myself; there's a very nice website about Dask with lots of documentation that you can read later. So here's the big picture of how Dask works; if you've used Spark before, it will be pretty familiar to you. With Dask, what you do is start a cluster of some processes on some hardware.
F: Now, there are many ways to start one of these Dask clusters. We're going to focus on one today, especially because it leverages Cori compute nodes, but there are other ways to do it, and you may even be able to just do it in Jupyter without using compute nodes. That's fine; just observe the usual recommendations and remember that it's a shared resource. It's like a login node: don't take all the cores for more than a couple of minutes, and be a good neighbor. So then, what do you do with this cluster once you've started it up?
F: What I mean by "talk to the client" is that we're going to send tasks, work that we want done, through the client to the scheduler, and these are going to be scheduled onto the cluster dynamically by the scheduler. There are modes of operation that can be quite immediate and very highly interactive.
F: Evaluation of the tasks is lazy, returning futures and promises that you can feed into other tasks, so your notebook isn't blocked while these things are running. Dask also provides a number of very handy big-data collections, like Dask dataframes, bags, the delayed interface, and Dask arrays, and these are useful for different purposes.
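As a minimal illustration of that futures-based, non-blocking style (using only a small in-process client for demonstration, not the Cori setup described later):

```python
# Minimal sketch of Dask's futures-based, non-blocking style, using a small
# in-process client purely for illustration.
from dask.distributed import Client

client = Client(processes=False)  # tiny local cluster, just for the example

def square(x):
    return x * x

# submit() returns a future immediately; the notebook is not blocked
a = client.submit(square, 10)
b = client.submit(square, 20)

# futures can be fed into other tasks before they have finished
total = client.submit(sum, [a, b])
print(total.result())  # 500
```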
F: It's important to know that Dask doesn't use MPI for communication, and this has an effect on how big it's going to scale on our system, but there's no reason why, in the future, it couldn't do something like that. It supports TCP, InfiniBand, and UCX, which is fairly new. So why are we talking about Dask at NERSC right now?
F: The fact is, more and more users have been asking about whether or not they can actually use it, so we're trying to respond to those inquiries. In particular, people want to use it from Jupyter, because it's a data analytics platform for interactive use, and we also think that the Dask ecosystem has matured to the point that it interacts with HPC better than it did in the past. So in this talk we're going to talk a little bit about best practices for using Dask on Cori compute nodes.
F: There are two main ways that we're advocating users try to do this right now. One is a package called dask-mpi, which just uses MPI to launch the cluster for you; the communication of tasks and doing all the work is still done over TCP, but this seems to work really well for starting up a cluster really fast. There's also dask-jobqueue, which we're not going to talk about today.
F: That is a mechanism for starting workers by submitting batch jobs, so your scheduler runs outside of the compute nodes, and you connect to workers as they start up. You can kind of build the cluster up, the cluster can kind of tear down, and the scheduler handles this coming and going of workers very nicely.
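A hedged sketch of what that dask-jobqueue approach can look like (it was not demoed in the talk); the queue name, resources, and walltime below are placeholders, not a NERSC-recommended configuration:

```python
# Hypothetical dask-jobqueue example: the scheduler runs where this code is
# executed, and workers join as their batch jobs start on compute nodes.
from dask_jobqueue import SLURMCluster
from dask.distributed import Client

cluster = SLURMCluster(
    queue="regular",        # placeholder queue name
    cores=32,               # cores per worker job
    memory="64GB",          # memory per worker job
    walltime="00:30:00",
)
cluster.scale(4)            # submit enough batch jobs for 4 workers
client = Client(cluster)    # workers can come and go; the scheduler handles it
```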
F: Another best practice we want to highlight is that you should probably use containers in order to scale up to launching larger clusters, around the size of 50 to 100 workers. So these are kind of the high-level recommendations we're making right now about how to use Dask on the Cori compute nodes.
F: In addition to figuring this stuff out, we had to do some work of our own: we had to do some work on the networking between the Jupyter nodes and the compute nodes on Cori, and we also had to contribute some code to some of the infrastructure, in particular to make the dashboard work, and have conversations with the developers.
F: There are three notebooks I'm going to follow. The first is a very, very simple way of starting up a Dask cluster on Cori compute nodes using dask-mpi, how to actually connect to the cluster, and then how to start up and see the dashboard. The second is going to be connecting to that cluster and doing a very simple MapReduce-style example calculation. The final one is a little bit more complicated: a Dask-dataframes-based calculation that actually uses some real data that I've reprocessed. I think that's a handy way for people to learn, and of course, if you get stuck doing any of this stuff, file a ticket and we'll help. These notebooks are going to be posted as links from docs.nersc.gov; you can get to them from the Dask page there. Okay, so let's do the live demo.
F: Okay, so like I said, there's a whole bunch of documentation inside these notebooks. Here is basically a one-liner, even though it's spread across many lines, of how you can actually submit a single dask-mpi job. If you want, you can put this into a script, which I have actually done, to kind of streamline my demo launch today.
F: I'm going to ask for 10 nodes in a reservation, and then inside of that allocation I'm going to run dask-mpi. I'm not going to go over every line in the allocation or in the job submission here, but the notebook has a line-by-line explanation of everything that I'm doing in the job submission. In the terminal at the bottom, you can see that the cluster has started itself up and that we are going to be able to connect to it from a notebook here.
F: There are a couple of things we need to do right before we set up here. I've recommended some settings that you should put into a configuration if you want to: mainly it's about avoiding spilling to disk, just because we don't have local disks on our compute nodes, and also setting up a link that allows you to actually connect to the dashboard. That's another thing you can stick into the configuration file if you don't want to have it here in the notebook.
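A sketch of the kind of settings being described; the exact keys and values here are assumptions about a reasonable configuration, not necessarily the presenter's. For the worker memory settings to take effect they normally belong in a YAML file under ~/.config/dask/ so the environment that launches the workers picks them up; the dashboard link matters on the client side.

```python
# Hypothetical configuration sketch: disable spilling to local disk (Cori
# compute nodes have none) and template the dashboard link so that it
# proxies through the Jupyter server.
import dask

dask.config.set({
    "distributed.worker.memory.target": False,   # don't start moving data at a threshold
    "distributed.worker.memory.spill": False,    # never spill to disk
    "distributed.dashboard.link":
        "{JUPYTERHUB_SERVICE_PREFIX}proxy/{port}/status",
})
```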
F: I'll say I'm doing something that I didn't really recommend, which is that I'm launching this out of a Jupyter kernel that is a conda environment in my home directory. So if things are bad on the home directory right now, I could be having a bad time. But why don't we go ahead and at least do something, and see if we can get this to come back up? Okay, all right.
F: All right, that worked. Then this is the part where we actually start the client, and I've printed out the little widget that shows you what the client and the cluster look like. There's this link here which, if you've used Dask before, I may have shown as something that kind of didn't work; but now it works, because what it's doing is proxying through to the dashboard on the compute node.
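The connection step itself is small; a minimal sketch, with a placeholder path for the scheduler file that dask-mpi wrote:

```python
# Connect the notebook to the already-running dask-mpi cluster via the
# scheduler file it wrote; the path here is a placeholder.
from dask.distributed import Client

client = Client(scheduler_file="/path/to/scheduler.json")
client   # in a notebook, this displays the widget with workers, cores, and dashboard link
```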
F: Let me not take up all the real estate there. All right, so we've started up a cluster and we've connected to it; let's do something with it. The first thing I'm going to do, with this other notebook, is use that same cluster: I'm going to connect to it, and we're going to calculate pi. The way we're going to do this is using the dartboard method: we're going to use about a hundred billion darts to estimate pi on these 320 workers that I've started up. So let's get our connection here.
F: All right, perfect: 319 workers. We're going to do this as a MapReduce kind of calculation. This is the map part, which just says: give me a random number generator and some number of darts to throw, and I'll throw them and figure out how many of them land inside the unit circle in the first quadrant.
F: Here's the part that actually does something. Above, we just wrote a function; here we're going to use the client.map function to actually tell all of the workers: hey, do this however many times it needs to be done to get 200 billion of these. So we'll get that started, and it looks like nothing's happening.
F: But what actually happens is that the scheduler is trying to figure out how it's going to schedule all of the tasks that I requested, and then when they get started, they show up there in the task stream. Each of these tasks is taking about 700 milliseconds or so, and the overhead for scheduling each one of these is about one millisecond. That's handy to keep in mind: it's not a good idea to have a million tasks that each take like half a millisecond or something, because that's a lot of scheduling overhead.
F: And then estimating pi does the reduce part: that goes out to the cluster, gets those tallies summed up, and then we're good to four parts in ten to the minus seven. Okay, all right.
right!
So
that's
really
simple!
So
that's
just
MapReduce
and
we
didn't
use
any
kind
of
fancy
desk
data
structure.
Is
there
we
just
wrote
some
Python
stuff
and
we
shipped
it
off
to
the
cluster
to
do
some
stuff
and
we
fry
it
back.
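A hedged reconstruction of that dart-throwing calculation; the function and variable names, and the exact task and dart counts, are illustrative rather than exactly what the notebook uses:

```python
# Monte Carlo estimate of pi as a map/reduce over Dask futures.
import numpy as np
from dask.distributed import Client

client = Client(scheduler_file="/path/to/scheduler.json")  # placeholder path

def count_hits(seed, darts=10_000_000):
    """Map step: throw `darts` darts in the unit square, count hits in the quarter circle."""
    rng = np.random.default_rng(seed)
    x, y = rng.random(darts), rng.random(darts)
    return int(np.sum(x * x + y * y < 1.0))

n_tasks = 10_000                              # ~1e11 darts total across the cluster
futures = client.map(count_hits, range(n_tasks))

# Reduce step: sum the per-task counts on the cluster, then scale to estimate pi
hits = client.submit(sum, futures).result()
print(4.0 * hits / (n_tasks * 10_000_000))
```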
F: Let's do another notebook. This is the last demo, and out of an abundance of caution, let's just restart things preemptively. What we're going to do here is make a thing called a color-color diagram: we're actually going to make a two-dimensional histogram of two colors, basically the ratio of two colors. The way we're going to do this is we're going to load up some data, about a billion rows of data I think, we're going to select based on a signal-to-noise cut, and we're going to select a particular type of object from the DECam Legacy Survey, which is a public dataset.
F: Now, ahead of time, to speed up the demo, I've reformatted some of the data into HDF5, so we don't have to sit here for fifteen minutes to load up a whole bunch of files, and we're going to use Dask dataframes to do this. So let's connect; you can see that the dashboard reacted there, and restarting the scheduler cleared out everything that we had done before. Here's a dataframe call now: if you've used pandas before, this should look pretty familiar.
F: The API is supposed to remind you of pandas, but it's got some extra arguments, like this chunksize argument, which tells it how much data to put on each partition when it loads up. Then, to actually do the load and redistribute the data across the workers (because the initial layout is probably not optimal), we do this repartition-and-persist business here. So here the data is being loaded from HDF5, and it's being loaded from scratch.
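A sketch of that loading pattern; the file glob, HDF5 key, and partition counts below are placeholders rather than the actual survey files used in the demo:

```python
# Load many HDF5 files into a Dask dataframe, then redistribute and pin the
# data in worker memory so later operations don't re-read from disk.
import dask.dataframe as dd

df = dd.read_hdf(
    "/path/to/scratch/sweep-*.h5",   # placeholder glob on $SCRATCH
    key="/data",                     # placeholder HDF5 key
    chunksize=1_000_000,             # rows per partition at load time
)

df = df.repartition(npartitions=320).persist()   # spread across the workers
```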
F: I've noticed that when you try to load data directly off of project, the loads don't all kind of start at the same time, whereas they all seem to start much closer together in time on scratch. I thought that was interesting. Okay.
So now let's do some data science. We've loaded up about a billion and a half rows, I think. We've got to compute our signal-to-noise estimate, and you'll see the cells are coming back immediately.
F: Each time I hit them, because they're all lazy, they're not actually returning anything through the scheduler; they're just telling the scheduler: hey, build the links between all the tasks that need to be done to do this. Let's compute. This is the part that computes colors: it converts from flux into magnitude and then does the subtraction. And then we apply a signal-to-noise cut.
F: So we only want good data. Since that persist call, nothing has happened on the cluster yet; we're just telling the scheduler: hey, this is the order in which we want things to be done. Here's my function for doing a two-dimensional histogram. There isn't a two-dimensional histogram function in Dask, but it's pretty easy to write one yourself. The thing that comes in is a dataframe; actually it's a future of a dataframe, but you can operate on it like it's a regular old dataframe. And then down here, this is the part that actually sends the heatmap calculation out to all the partitions to be done, and still nothing has happened yet, until I do compute. That's the part that brings them all together, so we're basically doing a similar kind of reduce operation over all of the data, over all of the partitions. This should be pretty fast, and when it's done, we're going to have the sum over all partitions in this two-dimensional histogram.
F: And that's a color-color diagram.
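A hedged sketch of the per-partition histogram plus reduce that this demo describes; the column names, magnitude zero point, and bin ranges are assumptions for illustration, not the survey's actual schema:

```python
# Map: 2D-histogram each partition's colors. Reduce: sum the per-partition
# histograms. Nothing executes on the cluster until compute() is called.
import numpy as np
import dask
import dask.dataframe as dd

df = dd.read_hdf("/path/to/scratch/sweep-*.h5", key="/data", chunksize=1_000_000)

def hist2d(part, bins=200, lims=((-1.0, 3.0), (-1.0, 3.0))):
    mag = lambda flux: 22.5 - 2.5 * np.log10(flux)        # flux -> magnitude (assumed zero point)
    g_r = mag(part["flux_g"]) - mag(part["flux_r"])       # g - r color
    r_z = mag(part["flux_r"]) - mag(part["flux_z"])       # r - z color
    counts, _, _ = np.histogram2d(g_r, r_z, bins=bins, range=lims)
    return counts

partial = [dask.delayed(hist2d)(p) for p in df.to_delayed()]
heatmap = dask.delayed(sum)(partial).compute()            # the color-color diagram
```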
F: All right, so those are the live demos. Again, they're going to be shared on the web, in fact in a way that makes it so you can start them off yourself in a Jupyter notebook, and we'll be adding other notebooks as well to the documentation to try to help people out. A couple of questions that might be on people's minds at this point: you know, why would I use this and not, say, MPI? I think that, you know, MPI is great.
F: I've run Dask clusters at NERSC reliably up to about a thousand workers; that works pretty well. When I've tried to push it to maybe seven or eight thousand workers, the cluster never quite fully comes into being, but interestingly enough, when you submit work to it, it all gets done. I think people who use MPI would consider that a highly inefficient way to try to run that kind of work. Another thing to think about is that it has a central scheduler (Spark kind of has a similar architecture) with about one millisecond of overhead per task, and that's going to influence how much work you do, how big of a task you actually want to have, and how many tasks you want to submit to a cluster. So it's not a good idea to just submit a million one-millisecond tasks.
And Dask versus Spark: I've mentioned Spark a few times, and Dask is a lot more like Spark than like MPI. There are reasons why you might use one and not the other; there's a whole page on Spark versus Dask in the Dask documentation, and I think it's fairly fair. So that's about it.
F: If you have any questions, please file a ticket, either a Python ticket or a Jupyter ticket, at help.nersc.gov, and we'll figure it out. And of course we're working on our documentation at docs.nersc.gov; if you discover things that are valuable to other users and you want to add them, you can contribute to our documentation as well. That's all I've got.
E: Hi, this is Steven. In this case you launched the cluster from the command line and then separately went over to the notebook to connect to it. Is there a reason for doing that, versus launching it directly from the notebook by spawning the command and then shutting it down from the notebook as part of a self-contained ecosystem, or is it necessary to do it separately?
F: It is not necessary to do that; you could run it as a separate process from within the notebook. One reason that I like to run it in a separate terminal, in the experimentation that I've been doing with Dask, is that I can more easily start and stop things, just with Control-C, and I can also kind of follow what's going on when things are going awry. As you can imagine, I'm probably starting and stopping these all the time, trying to experiment with ways to make this work. I would say that making it integrated with the notebook, or integrated with the lab extension, which is the direction I think we really want to go, is probably something I would worry about getting right later on, personally.
F: But yeah, it's just kind of a convenience I find while I'm hacking to be doing it that way; in production I'd probably want it to be integrated into the flow of the notebook. There is one thing that I have noticed about dask-mpi, and it's part of the reason why I run it in a terminal, which is that if I crash the cluster, like if I intentionally crash the scheduler or something like that, the scheduler coordination file seems to get left over. If I start up again, it looks like the workers actually seek to read that file, and they try to call home to a scheduler that's dead, and so the cluster kind of just sits there. So I want to be able to see, kind of in the same window, whether I have that scheduler file around. We're working on fixing that and contributing it back to Dask.