From YouTube: NUG Monthly Webinar, September 24, 2020
Description
Hear about some user successes and lessons learned, and the topic of the month -- Public Safety Power Shutdowns (PSPS) and how to reduce the risk of these events to your scientific progress.
A
Okay, it's now a couple of minutes past 11, so we'll get started; people can still join. This is aiming to be a somewhat informal discussion. First up, can people hear me okay? Right, cool, I'm getting nods.
It's been a little while since we had these monthly webinars, and we're shifting the format a little. Previously they were very much oriented towards a presentation of some sort. We're still going to have a presentation, but it'll be a relatively small portion, around 15 minutes, and our aim is for this to be a very interactive forum for NERSC to communicate with our users and for our users to communicate with NERSC and with each other. Because we're looking for that interactivity and a slightly less formal approach, I highly encourage you to join the NERSC Users Slack if you haven't already, and to do most of the text chatting there in preference to the Zoom chat facility.
A
A
So, the plan for today. I think I posted the agenda earlier. We're going to start out with a win of the month. The idea here is open discussion: share something you achieved. It can be something small, like solving a bug or getting a paper published, or something big, like achieving a major run or a milestone in a project. We're looking for shared success stories. You can also nominate somebody else who achieved a big win. This could also be a good place to learn about ...
A
A
We'll come on to the details of that shortly. After that, we'll go into Today I Learned, which is kind of the other side: not everything is a success all of the time. Sometimes we get burnt or get stuck, and it's a little painful at the time, but usually, in the long run, there's something to learn from it. At the very least there's something we can teach each other from it: give each other a heads-up about how to avoid a particular problem, or identify something that is not as trivial as it might sound, or not what it looks like. This can also help us find places where we can tweak our documentation, for instance, and hopefully improve everybody's experience. We'll have a space to make announcements and calls for participation, and this is not a one-way thing.
A
The intention is, again, that if there's something you know about that would be good to have more NERSC users involved in, we'll announce it there. Then we'll spend a while on our topic of the day, and today our topic of the day is PSPS preparedness. PSPS, for those not in California, is Public Safety Power Shutdowns. Come wildfire season, and this is the second year, there is a significant likelihood of these that we need to prepare for. PG&E, which is the utility that provides NERSC's electricity, aims on high-risk days to minimize that risk by cutting power to some regions, and that can impact us. So we'll talk a little about that, and then finish off with a few moments to think about topics for upcoming meetings and a quick look at our operational numbers for the last month.
A
So, let's start out with win of the month. Please unmute yourself and chime in with whatever you'd like to say. Tell us all about an achievement, either yours or somebody else's.
C
Hi, I was pleased that my postdoc, Anthony Kremin, just got an Urgent HPC paper accepted on our use of the NERSC realtime queue for processing our data from the telescope throughout the night. Using the realtime queue at NERSC has been very helpful for us at DESI, and it's nice to see that turning into a conference paper. So we were excited about that this morning.
A
D
Yeah, so I had a paper accepted by the Journal of Plasma Physics. We run a code at NERSC, actually written by a group in Finland, which computes the orbits of energetic alpha particles in tokamaks. The worry is that, under certain circumstances, they can be lost from the plasma and overheat the wall.
I'm part of a team that's designing a roughly 400-million-dollar tokamak that will, for the first time, produce more power out than power in. So we have to worry about possibly having 100 megawatts of these alpha particles running around.
A
That's quite impressive.
B
D
A
... sharing, yeah, that's great. What I should have been doing was taking notes, but that actually reminded me: I noticed a little indicator that we are in fact recording this meeting, so please be aware of that. If you have any objections and have spoken so far, drop us a line afterwards. We'll put the recording on the webinar page, this one over here, after the meeting.
A
A
So I have what feels like a much smaller-scale win, but I think it still counts as a win: we've got these meetings up and going again. It's been several months, and I'm very optimistic that an interactive format will bring something new to the NERSC User Group and encourage a lot of collaboration and mutual support.
E
So this is Sayan from PNNL. I would like to report a minor win as well. Recently our work on graph triangle counting won the Graph Challenge champion award at the IEEE HPEC conference. They run a yearly Graph Challenge competition, and this year we competed on graph triangle counting. Although it's not the best algorithm out there, it's a distributed one, so it won the Graph Challenge championship. We are quite excited about that.
E
A
Yeah, so I guess the key insight there is that with a distributed algorithm, even if it's not the best serial algorithm, you can do things that you can't necessarily do with a serial algorithm. You can actually target ...
E
A larger graph. And my interest was mainly in stressing the network and testing the communication pipeline. So it's also a very nice mini-application or benchmark that helps you assess MPI and interconnect performance. In that way it's also kind of nice, and I'm planning to use it for some other benchmarking efforts as well.
E
E
So it's a very irregular kind of algorithm. As you can imagine, the vertices of a graph have varied degree, so the messages sent out are proportional to the number of edges attached, and that varies a lot. Collectives are very good as long as you have an equivalent amount of sends and receives; then you get good bandwidth. But if it's varied, you get the effect of synchronization: some of the processes finish quickly and have to wait, and that is one of the major bottlenecks with graph workloads. MPI has these new neighborhood collective operations, but unfortunately they are not very optimal either. So graph workloads usually have this problem because of their inherent irregularity.
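The synchronization effect described here can be illustrated with a toy model. This is only a sketch, not the speaker's code: it assumes each rank's communication time is simply proportional to the volume it sends, and that a synchronizing collective makes every rank wait for the slowest one. The function name and the example volumes are hypothetical.

```python
def idle_fraction(send_volumes):
    """Fraction of aggregate rank-time spent waiting at a synchronizing step.

    Toy model (assumption): each rank is busy for a time proportional to the
    volume it sends, and no rank proceeds until the slowest rank is done.
    """
    if not send_volumes:
        return 0.0
    slowest = max(send_volumes)
    if slowest == 0:
        return 0.0
    total_rank_time = slowest * len(send_volumes)  # everyone waits for the slowest
    busy_time = sum(send_volumes)                  # time actually spent sending
    return (total_rank_time - busy_time) / total_rank_time


if __name__ == "__main__":
    balanced = [100, 100, 100, 100]  # every rank sends the same amount
    skewed = [10, 20, 30, 340]       # same total, one rank holds a high-degree vertex
    print(f"balanced: {idle_fraction(balanced):.2f}")
    print(f"skewed:   {idle_fraction(skewed):.2f}")
```

In this model the total volume is identical in both cases; only the distribution across ranks changes, which is why irregular graphs can waste most of the aggregate time in waiting.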
A
A
A
A
So then the other side of the coin, though it doesn't have to be the other side of the coin, is Today I Learned. We're interested here in stories people have: something that surprised you, that other users might benefit from hearing about, and that might help us identify things we can either document better or add some pointers to. For instance, something you got stuck on, hit a dead end with, or turned out to be wrong about, or some tip.
It doesn't have to be a negative thing. It can be, "I discovered that if I adjust the timing of my jobs a little, make them shorter and wider, I save some amount of queue time and get a better end-to-end time," or even just something relevant to NERSC users that you learned and that's an interesting pointer.
A
A
So anyway, there have been some really interesting talks in this series, and I often try to either attend or log in afterwards and have a look at the talks. About a week ago I was catching up on one that I had missed, which was called "Color Mapping Strategies for Large Multivariate Data in Scientific Applications", and it turned out to be one of those "hey, this is amazing, and I wouldn't have thought of this" talks, so I want to call it out. Go take a look.
The speaker actually came into scientific computing from an art background, and she talks about how to choose and use a color palette to better communicate your findings in scientific visualizations. It turns out that the naive way of doing visualizations that I've always used, where you put in lots of bright colors because they stand out, isn't as effective as using a much more muted color palette and reserving brighter colors, strong hues, or strong saturations for specific information. She tells it really well. So that's my Today I Learned. There's a link in the Slack to the list of webinars, and it's a fairly recent one, the second or third on the list. Take a look.
A
A
... a hard lesson, or just a lesson that they're interested in passing on.
F
A
Yes, and this is the sort of thing we're looking for, because there are a lot of little tidbits like that, and just being aware of them is great. So thanks for pointing that out. Hopefully you can see this on the screen. Here's where it is: it's under my.nersc.gov, under the Jobs tab.
G
Yes, thank you. By the way, I also really use this job script generator. Whenever we get a new member who has started using NERSC, we recommend that he or she go to that script generator first and then run the models. So I really appreciate that. What I learned recently is ...
G
The first is very simple, and the result may not be statistically significant yet, but I noticed it. The default build changed from static to dynamic linking. We use a climate model, and atmosphere or climate models basically run over many nodes. Currently I'm running a global simulation with a grid spacing of 30 kilometers, so it's fairly high resolution and takes quite a long time, and I noticed that the initialization takes some time, 15 to 20 minutes.
That's the case even at lower resolutions. So I tried setting this environment variable to build the model with static linking instead of dynamic, and it does change the initialization time. Before, it was around 15 minutes, and sometimes a debug-queue job didn't even finish initialization. After I introduced this environment variable and built statically, it takes just a few minutes. I still need more statistics, but from the several cases I ran as a test, it does make a difference.
G
G
I wonder how many people are actually aware of this and using it. I'm just starting to use it. It's so cool; it's just like conda for Python. I can clearly see a list of those libraries, netCDF or whatever. However, I'm currently having trouble, because the library I wanted to install is ESMF.
It provides lots of nice tools, for example for remapping between different grids. I just wanted to use it offline for myself and tried to install it with Spack, but I haven't been quite successful yet. So now I'm deciding whether to keep trying to install it with Spack, or maybe just install it myself, or use the copy of the library that is installed as part of other packages like NCO or NCL, which are also installed by the NERSC folks. I'm trying different paths, but I might just file a ticket for that.
I'm more curious about the Spack way. So yeah, that's just something I'm still learning.
B
A
A
So we're getting pretty interested in Spack as well; actually, we have been for a little while. Something you may or may not have discovered yet is that you can do "module load spack" on Cori.
B
G
A
Yeah, if you do get stuck, definitely drop us a line, because we're hoping to increase our usage of that as a way to make it easier for people to install software, especially when there are complicated combinations of versions and dependencies that people need. So that's good to know. By all means open a ticket, and we'll see if we can help.
D
A
A
... and so they are also interested to hear your experiences, perhaps offline or in the Slack channel later, about static versus dynamic linking with the climate code. If I remember rightly, the environment variable you're talking about is export CRAY_LINK_TYPE=static, or possibly CRAYPE_LINK_TYPE=static.
A
Let's stop at that one and step on to announcements and calls for participation. There are a number of things. Our email is a very long one, so it's pretty easy to miss things, but there are a couple of important upcoming things that are good to know about, and you can go back and check your email for the details. This first one is a big one.
A
We had announced the power upgrade happening October 7-12, and the great news is that the facilities people have managed to arrange for the power upgrade to happen without impacting Cori. Some of the auxiliary things, like Cori GPU, will still be affected, but the great news is that we won't have to take an outage for that.
A
HPSS has a new file-system-like interface; take a look at that. Very important: the ERCAP allocations process closes in less than two weeks, so make sure you get your request in. We have office hours coming up on October 1 and October 5 if you've got questions about the process.
A couple of other announcements: the Better Scientific Software Fellowship closes in less than a week, and if you would like to be involved in the SC21 conference, it's looking for volunteers now. That's a great way to get to know a broader range of the supercomputing industry and the other participants and users in it. Now, Johannes mentioned he has an announcement about Julia. That sounds interesting.
H
Hi, yes, just a quick announcement: we have a Julia module available on Cori now. This module currently has version 1.4, and we'll be adding a 1.5 version soon.
This module also manages the Julia depot path, so some packages that many users might need and that need to be configured for Cori specifically, such as MPI, are managed in that path. You don't have to rebuild them; they automatically get included. I'll be expanding the available modules based on popular demand, essentially. So that's everything regarding Julia.
I can also quickly remind everyone: if you still want to answer the office hours survey that was sent out a couple of weeks ago, I'm looking at the data now, so if you want to get your opinion in there, sooner is better than later. Thank you very much.
A
Cool, thanks, Johannes. If you haven't encountered Julia yet, take a look: it's a really interesting and promising language. I've started using and learning it a little just recently, and it fills some of the same niche that Python fills, but being a compiled language, it can get you past some of Python's performance challenges. So that one looks really promising, and it's good to see that it's available now. Actually, a year or two ago ...
A
H
A
G
H
H
G
Yeah, I was just asking because we have a small group of scientists talking about writing a new numerical scheme to solve flow, and we are also discussing which language it should be written in. We are discussing whether to stick with Fortran and use OpenACC to use GPUs, or maybe use Python or some more modern language. So I'll probably quickly learn about Julia and think about it as well. Thank you. I might ask on the Slack sometime later.
H
Yes. I think a very good time to consider whether you want to try out Julia is when you're starting a new project. For applied math work, the way it does multidimensional indexing is very similar to Fortran, so that would be a bonus in favor of Julia.
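To make the Fortran comparison concrete: both Julia and Fortran use 1-based indices and store arrays in column-major order, so the first index varies fastest in memory. The sketch below, written in Python purely for illustration, computes the memory offset such a layout implies; the function name is hypothetical and not part of either language.

```python
def column_major_offset(index, shape):
    """0-based memory offset of a 1-based multidimensional index,
    using the column-major layout shared by Fortran and Julia."""
    if len(index) != len(shape):
        raise ValueError("index and shape must have the same number of dimensions")
    offset = 0
    stride = 1
    for i, n in zip(index, shape):
        if not 1 <= i <= n:
            raise IndexError(f"index {i} out of range 1..{n}")
        offset += (i - 1) * stride
        stride *= n  # the next dimension advances by a whole "column"
    return offset


if __name__ == "__main__":
    shape = (3, 4)  # 3 rows, 4 columns
    for idx in [(1, 1), (2, 1), (1, 2), (3, 4)]:
        print(idx, "->", column_major_offset(idx, shape))
```

A loop nest ported from Fortran to Julia can therefore keep the same index order, with the innermost loop over the first index, and retain contiguous memory access.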
A
That sounds great. If nobody else has any announcements, we'll move on to our topic of the day. Our topic of the day came worryingly close to fruition relatively recently with the wildfires: it's PSPS season. Unfortunately, the wildfires in California seem to have been getting worse from year to year.
A
A
So that was a pretty interesting learning experience, and I think a lot of people here probably experienced it either directly, from being in California, or indirectly, via using NERSC.
A
A
If it's a shorter job, say a six-hour job, we can stop it before it starts, or prevent it from starting until after everything's back up. That way you don't lose your job partway through. But longer-running jobs are particularly at risk, so one good solution, and possibly the best solution we've got at the moment, is to use checkpointing to break long jobs into shorter jobs, and we've got a few options for that.
A
So in a few minutes, Zhengji, who's been leading our checkpoint/restart efforts here at NERSC, will give a little bit of an overview of what the options are at NERSC, so you can get some ideas of what you can do. Then we'll go into a more general Q&A about PSPS season and checkpoint/restart.
A
Before we start: we have Jeff Broughton, who is NERSC's operations lead, waiting, and Rebecca, who is part of the communication team for these sorts of events. Would either of you like to say something about PSPS season before we go into Zhengji's talk?
I
I could give you just a brief note. As you mentioned, from September to December California is in fire season, and you've all probably heard about all the fires we had recently. Those were started largely by lightning strikes, but PSPS events are somewhat different. Basically, they happen when we have dry conditions and high winds in the area. The issue is that high winds can cause the high-tension lines from our utility to swing wildly and sometimes detach (it can also be more conventional lines), or can bring tree branches or other things down and take out electrical equipment.
F
I
F
I
And PG&E has basically been lax in its maintenance for many years, and that has led to the events that we've had over the last several years.
So in response, PG&E is actually turning off power when there is a threat of these conditions causing a failure of the electrical grid.
I
Basically, we can have a PSPS event if any of the power lines that feed us, not just the local ones but also the big high-tension lines that feed all of California, are threatened and are taken down. As Steve said, we get different kinds of warnings. They try to give us about 72 hours' warning, but it can be as short as just a few hours to take the systems down. So that's the background of what happens. This is a garage.
J
I
Just closed, sorry; I've got to use my phone on this one. So that's the background of what PSPS events are about. They are things done in response to weather conditions that can cause electrical ...
I
A
Thanks, Jeff. We'll go into a general Q&A fairly shortly, but first, as a "what you can actually do about it" kind of mitigation for this sort of thing, with other benefits as well, Zhengji has a few notes and tips around checkpointing. Zhengji, would you like to share a screen, or shall I present the slides and you say "next"?
J
J
So can you see my screen? Now it's good? It looks good, okay. I think I have five minutes to talk about the C/R options on Cori. I'm Zhengji from NERSC's User Engagement Group, and I want to first list all our collaborators and my colleagues who worked to get C/R to work at NERSC.
J
So just to start simple: what is checkpointing? Checkpointing is the action of saving the state of a running process to a file, which we call a checkpoint image file. The process can later be restarted from that file and continue from where it left off.
J
There are two approaches to checkpointing: one is application-internal checkpointing, and the other is transparent checkpointing using an external tool. In general, internal checkpointing is quite limited. I believe all the applications that run on NERSC systems have some sort of checkpoint/restart capability, but transparent checkpointing is much more desirable, because you can stop the code at any time and then restart from exactly that point.
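As a rough illustration of the application-internal approach, the following Python sketch saves a small state dictionary after every step and resumes from the saved file if one exists. It is a minimal sketch under stated assumptions, not how any particular NERSC application does it: the state layout, the function names, and the use of pickle are all hypothetical choices made for the example.

```python
import os
import pickle
import tempfile


def save_checkpoint(state, path):
    """Write to a temporary file and rename it into place, so that an
    interruption mid-write never leaves a truncated checkpoint behind."""
    tmp_path = path + ".tmp"
    with open(tmp_path, "wb") as f:
        pickle.dump(state, f)
    os.replace(tmp_path, path)


def load_checkpoint(path):
    """Return the saved state, or a fresh initial state if no checkpoint exists."""
    if os.path.exists(path):
        with open(path, "rb") as f:
            return pickle.load(f)
    return {"step": 0, "total": 0}


def run(n_steps, path, max_steps_this_run=None):
    """Advance a toy computation, checkpointing after every step.

    max_steps_this_run mimics a job hitting its wall-clock limit."""
    state = load_checkpoint(path)
    done = 0
    while state["step"] < n_steps:
        if max_steps_this_run is not None and done >= max_steps_this_run:
            break
        state["total"] += state["step"]  # the "work" of this step
        state["step"] += 1
        done += 1
        save_checkpoint(state, path)
    return state


if __name__ == "__main__":
    with tempfile.TemporaryDirectory() as workdir:
        ckpt = os.path.join(workdir, "state.pkl")
        print(run(10, ckpt, max_steps_this_run=4))  # first job stops early
        print(run(10, ckpt))                        # second job resumes and finishes
```

The limitation mentioned in the talk shows up here too: the application can only resume at points where it chose to write its state, whereas a transparent tool saves the whole process image without changes to the code.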
J
J
J
We are aware that C/R has a runtime overhead and also imposes extra work on you, so we created a queue, the flex QOS, to provide a charging discount, and we also developed variable-time job scripts to make your life easier when you use checkpoint/restart. A variable-time job script is a job script with additional sbatch directives and bash functions. You can use it with your applications if they can do internal checkpointing or if they can be checkpointed with external tools. It allows longer jobs to run as multiple shorter ones.
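The arithmetic behind splitting a long job into shorter ones can be sketched as follows. This is a simplified model, not the logic of the actual variable-time job scripts: it assumes every segment pays a fixed overhead for restarting and writing a checkpoint, and all names and numbers are hypothetical.

```python
import math


def plan_segments(work_hours, segment_limit_hours, overhead_hours):
    """Return (number_of_segments, total_wallclock_hours) for a long job
    that is run as a chain of shorter, checkpointed jobs.

    Assumption: each segment loses overhead_hours to restart plus checkpoint,
    leaving the rest of its time limit for useful work.
    """
    useful_per_segment = segment_limit_hours - overhead_hours
    if useful_per_segment <= 0:
        raise ValueError("segment limit must exceed the per-segment overhead")
    n_segments = math.ceil(work_hours / useful_per_segment)
    total_wallclock = work_hours + n_segments * overhead_hours
    return n_segments, total_wallclock


if __name__ == "__main__":
    # A 48-hour computation run as 6-hour jobs, each losing half an hour to C/R.
    segments, wallclock = plan_segments(48, 6, 0.5)
    print(f"{segments} segments, {wallclock} hours of wall-clock time in total")
```

The extra wall-clock time is the price of the overhead; the claimed benefit is that short segments fit more easily into gaps in the schedule and put less work at risk if the system goes down.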
J
J
So from the user's perspective, you just need to submit one job script and then check the result. The direct benefit is improved queue turnaround, because the shorter jobs can make better use of the backfill opportunities on the system.
Now for the available C/R options; we can list a couple. For applications with internal C/R support, we recommend that you adopt variable-time jobs. Also, if your job can generate useful results within a short time limit, say two hours, then you can use the flex queue; that way you can get a great charging discount. That's for applications with internal checkpoint/restart support. But for applications without internal checkpoint/restart ...
J
J
J
Using it is pretty simple: you just need to load the module, define a few bash functions, and then start the coordinator. I forgot to mention that DMTCP is a coordinated checkpoint/restart tool, which means it has one coordinator that oversees the checkpointing and restart activities and coordinates between them. So you need to start a coordinator, optionally specify the checkpoint interval, and then launch the application, in this case a.out, under DMTCP.
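The sequence just described (start a coordinator, optionally set an interval, launch the application under DMTCP) might look roughly like the commands assembled below. This is a hedged sketch: it only builds and prints the command lines rather than running them, the helper names are hypothetical, and the exact options should be checked against the DMTCP version installed on the system. It assumes the standard dmtcp_coordinator and dmtcp_launch programs and their --daemon, --exit-on-last and --interval options.

```python
import shlex


def coordinator_command():
    """Command line to start a background coordinator that exits once the
    last checkpointed process has gone away (assumed options)."""
    return ["dmtcp_coordinator", "--daemon", "--exit-on-last"]


def launch_command(app_argv, interval_seconds=None):
    """Command line to run an application under DMTCP control.

    interval_seconds, if given, requests periodic checkpoints."""
    cmd = ["dmtcp_launch"]
    if interval_seconds is not None:
        if interval_seconds <= 0:
            raise ValueError("checkpoint interval must be positive")
        cmd += ["--interval", str(interval_seconds)]
    return cmd + list(app_argv)


def as_shell(cmd):
    """Render an argument list as a single shell-quoted string."""
    return " ".join(shlex.quote(arg) for arg in cmd)


if __name__ == "__main__":
    print(as_shell(coordinator_command()))
    print(as_shell(launch_command(["./a.out"], interval_seconds=300)))
```

On a real system these lines would normally live in the batch script itself, after the module load step mentioned in the talk.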
J
At NERSC we have been collaborating with the DMTCP team for a couple of years now, trying to get DMTCP to work with MPI workloads. Cray MPICH and the Aries network in particular pose extra difficulty for DMTCP. So far DMTCP doesn't work with Cray MPICH over the Aries network, but there is an implementation in DMTCP called MANA.
MANA stands for MPI-Agnostic, Network-Agnostic transparent checkpointing. It is a plugin implemented in DMTCP. We have been working with the developers and also NERSC interns, and have already gotten it to work with VASP and GROMACS, and also used HPCC to do some overhead evaluation. VASP is our number one application at NERSC and consumes more than 15 percent of the computing cycles, so we are very excited to have this capability working for us.
J
J
While getting C/R to work for NERSC workloads, we realized that a strong and active C/R community is very important for having transparent C/R tools that work for production workloads. So we are hosting a symposium on checkpointing next February to bring researchers, practitioners, developers, and end users together. This workshop will feature the latest work in checkpoint/restart research, tools development, and production use. We highly encourage participation from NERSC users, and we are especially interested in having you share your experience adopting C/R tools in your production workflows. Some details are here. We are still working out the details and are planning to release the CFP soon, likely this week.
A
Thanks, Zhengji. There's a link to Zhengji's slides in the webinars channel of the Slack, so you can jump back in and see them again. What we'll do next, let me just share this screen, is an open Q&A session.
A
Here we are. On the call we have Jeff, whom you heard from before, who is the deputy of operations at NERSC, and Rebecca is on the call as well. You've heard from her in many NERSC communications; she leads the NERSC team that makes sure our users are kept updated when events like this happen. And Zhengji, of course, is with us, who is our checkpointing expert. So I'll open the floor: please raise your hand or unmute yourself and speak up with any questions or comments around PSPS, checkpointing, and being prepared for this sort of event.
J
I mean the time minimum requirement could be closer to some of the referrals, so anyway, that's not to require the previous verb.
C
You can always trust me to have a question. What are the current expectations for keeping up edge services during a PSPS?
I
I
The common file system: the hardware that it was running on was not super robust when we had to run on backup power. Since that has been taken out of service and replaced with CFS, we're actually in, I think, a much better situation, so we're expecting that we'll be able to keep most of the auxiliary services up during events this year.
C
A
Any other questions? Well, if you do have questions, you can also either send us a ticket or keep discussing it. We will make a very strong point of making sure that people are updated when these events happen. Often things are unfolding in real time, and we don't know exactly what's going on either. Those of you who were NERSC users last year probably remember quite a lot of regular messages from Rebecca during the PSPS outages and alert times with "here's what we know", which hopefully made it a little easier for you to make decisions about what jobs to submit and how to arrange your workflow when there are disruptions going on.
B
Yeah, so one of the things last year was that I was on the emergency management team, so I had all the inside scoop, because I was there when things were happening. I am still on the team. However, I think it'll probably be somewhat more reduced in scope this year if we have a PSPS, but I will still try to get as much info as I can and keep you all informed.
A
A
The first one, and most of this conversation can probably happen offline in Slack: typically, in the past, these meetings and webinars have been on a third-Thursday-of-the-month schedule. This month we pushed it to the fourth Thursday to work around some calendar clashes, but next month we'll return to the third-Thursday schedule, so it should be a fairly predictable time frame.
A
We are open to, and always looking for, topic requests and suggestions. If there's an element of NERSC that you would like to learn more about, this is a great opportunity for it. Another is: we had these really fascinating success stories at the beginning of the meeting, and I think any or all of those would actually make a great topic of the day. What we're looking for there is something like what Zhengji just presented: just a few slides, five or ten minutes, telling us about something interesting that you're doing that may be interesting or beneficial for other NERSC users to learn about.
So this is an opportunity to show off your work as well as to request topics of interest. If you'd like to request a topic, nominate someone, or self-nominate to present a topic of the day at an upcoming meeting, let us know, either by Slack or via a ticket, or wave a hand now if something is already on your mind.
A
And our final agenda item is a quick look at last month's numbers. I attempted a little bit of a visualization of our availability, but I think the lines show the events and not necessarily the duration. We did unfortunately have a few outages during August, but overall we had 97.3% scheduled availability, which means that, apart from scheduled maintenance outages, the system was available according to spec 97.3% of the time. The storage systems were at 100% the whole way through August, so that was good.
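For readers who want to see how a scheduled-availability figure like this can arise, here is a small sketch. The definition used (uptime as a fraction of the hours that were not set aside for planned maintenance) follows the description in the talk, but the outage hours in the example are invented to reproduce a 97.3% figure and are not NERSC's actual data.

```python
def scheduled_availability(total_hours, scheduled_outage_hours, unscheduled_outage_hours):
    """Fraction of the scheduled (non-maintenance) hours the system was up."""
    scheduled_hours = total_hours - scheduled_outage_hours
    if scheduled_hours <= 0:
        raise ValueError("no scheduled hours left after maintenance")
    return (scheduled_hours - unscheduled_outage_hours) / scheduled_hours


if __name__ == "__main__":
    hours_in_august = 31 * 24  # 744
    # Hypothetical inputs: 24 h of planned maintenance, 19.44 h of unplanned outages.
    value = scheduled_availability(hours_in_august, 24, 19.44)
    print(f"{value:.1%}")
```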
A
One of the metrics that we report to the Department of Energy is about capability jobs, that is, jobs that use a large fraction of the machine; the threshold is 1024 nodes. We have a target to spend 25 percent or more of our NERSC hours on these jobs, which really can only run at a facility like NERSC. We've been getting quite a few of those lately, and in August 35.1% of Cori's hours were from large jobs.
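The capability-job metric can be computed from job records along the lines of the sketch below. The 1024-node threshold and the 25 percent target come from the talk; the record format, the function name, and the sample jobs are hypothetical.

```python
CAPABILITY_THRESHOLD_NODES = 1024  # threshold stated in the talk
TARGET_FRACTION = 0.25             # target stated in the talk


def capability_fraction(jobs, threshold=CAPABILITY_THRESHOLD_NODES):
    """Share of all charged hours that came from jobs at or above the threshold.

    jobs is an iterable of (nodes, hours) pairs (an assumed record format)."""
    total = 0.0
    large = 0.0
    for nodes, hours in jobs:
        total += hours
        if nodes >= threshold:
            large += hours
    if total == 0:
        return 0.0
    return large / total


if __name__ == "__main__":
    sample_jobs = [(2048, 50000), (64, 30000), (1024, 20000)]  # invented numbers
    fraction = capability_fraction(sample_jobs)
    print(f"{fraction:.1%} of hours from capability jobs; "
          f"target met: {fraction >= TARGET_FRACTION}")
```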
Tickets, incoming and outgoing: over the last few years we have tended to get on the order of 500 tickets a month, and that trend seems to be holding. In August, if I calculated correctly, we got just shy of 500 new tickets.
We closed just shy of 600 tickets, which is great, because we've always got a bit of a backlog of tickets that aren't easy to answer instantly. We've got a current backlog of 485. I forgot to put the number here, but I think we're at somewhere around 92 to 93% of tickets on our SLA: we have an SLA that we report to the DOE to address tickets within three business days, with a target of eighty percent.
A
So that's everything that we had on our agenda, and we've gone a couple of minutes over, but again, we can keep talking on Slack. Before we finalize, is there anything else that we didn't think of for this meeting that you'd like to bring up, either an agenda item we should add, or just a general comment or announcement that was missed earlier?