From YouTube: NUG Monthly Meeting 16 Dec 2021
Description: Recording of the NUG Monthly meeting on Dec 16, 2021
A
Welcome to the December monthly meeting. Just a heads up and reminder that we're recording the session; we'll post the recording afterwards, and there'll be a link to it on the web page that's associated with the meeting, the one that was posted in the Slack.
A
All right, so we'll follow the same format that we've been using for a little while now. This is intended to be a pretty interactive session, so please participate; we don't have too big a crowd at the moment.
A
We've got about 20 people, so I think when you've got something to say, just unmute and say it, and if it gets too noisy we'll go to a hands-up system or something, but let's keep it fairly informal and free-flowing for the moment. If you're not already on it, please join the NERSC Users Slack. It's a really good forum for discussing things with other users, and with NERSC staff, who drop in on an occasional basis.
A
So our agenda today will follow the normal pattern. We'll start out talking about wins of the month and "today I learned", two sides of the same coin: what interesting news do people here have? We also have a whole bunch of announcements and calls for participation at the moment. Then we'll go into our topic of the day, which today is going to be preparing for allocation year 2022. Helen's online and she'll walk us through a lot of the stuff that's useful to know, and what to expect as the transition comes up. We'll finish up with a bit of a heads up on what's coming, and a quick look at what's been going on on the system in the last month.
A
So the first section is win of the month. The aim here is to give an opportunity to show off an achievement, or to shout out somebody else's achievement that you know of. It can be at any kind of level, really: from solving a bug that had been giving you grief for a little while, to having a paper accepted somewhere. And if it's something significant that you achieved in your normal work, it might be a candidate for an award at NERSC, such as the High Impact Scientific Achievement award or the Innovative Use of High Performance Computing award.
B
Hey Steven, it's David. Yeah, I've already shared it with Brandon and Jack, but: up to max nodes on Perlmutter, I'm achieving the best scaling I've ever achieved on any machine, including asymptotic behavior at the higher node counts.
A
Phase two... you know, I've forgotten the number of nodes off the top of my head. I think I might have to look that up. We do have it written somewhere on one of the web pages, and somebody here might be able to remind me what it is.
A
Is it 6,000 CPUs? 6,000 nodes? I need to look that up and get back to you. But of course, they'll be CPU nodes: each node will have two sockets of AMD Milan CPUs, and they won't have GPUs.
A
Yeah, if it can make good use of the full core. Well, yes, some workflows can make good use of hyperthreads, because each task isn't fully using the core, so there are lots of opportunities there. Okay.
B
Yeah, the application code doesn't... strong scaling is not practical. Yeah, actually, strong...

A
Scaling to 960 nodes, that was quite challenging. You need a massive problem for that.
B
It's a pretty big problem too. So even though it's 960 nodes, it's not like... this problem would scale up on Cori to about, well, all the nodes on Cori: 8,192 nodes, a factor of 2, and 524,288 cores with no threading.
B
Yes, I mean, PETSc has to be PETSc plus hypre with CUDA, so that's kind of the main thing. I worked with Mark Adams; he and I have been working on that for quite a bit, along with some people on the hypre team. So yep, it's using the GPUs, and using them pretty efficiently, I might add. And that's for these initial scaling tests; that's like the first initial scaling test, without any optimization on Perlmutter itself.
A
That's sounding really promising; we're seeing some pretty good early results, I think, with scaling and performance. This would be the Chombo code, right?
A
Yeah, that sounds good. And I just noticed in the chat that Nick posted that phase two will have 3,072 nodes, so a little over six thousand AMD sockets, each one having...

Yeah, they're on the PCI... yeah, they're on the board, so to access the GPU on a different node, a code would pretty much have to be doing, what would you call it, a kind of heterogeneous MPI computing, yeah.
C
Okay, wait. So are you saying that we will be able to run jobs on both phase one and phase two?
A
Yeah, I'm hoping that that's going to be possible, but I don't know for certain if it's been established yet. There are some challenges to being able to do that. Okay.
A
And
I
say
in
the
chat,
then
ask
the
question:
do
we
know
when
the
cpu
nodes
will
become
available
and
the
answer
is
and
not
yet
I
don't
think
a
date
has
been
announced
for
phase
two.
Yet.
A
I think we do have some information about that, actually. In last month's... no, the month before, in October's monthly meeting, Clayton did a bit of a walkthrough, sort of preparing for next year, and one of the slides that he had talked about estimating and translating NERSC hours for the various resources available next year. So that slide deck should be available on the web page for that meeting, and I think it's also somewhere in our docs, although I don't remember exactly where to look it up. But yeah, there is some guidance about that; taking a look at October's meeting is probably a good starting point.
D
Great, yeah. I know for the ERCAP they had charging-factor type things, but in the specific award we're looking at, they're asking for node hours, not the charging factor, so I just wasn't sure what I should use.
F
Oh yeah, I will mention something about the charging factor. Basically, there's a factor of 400: what was 400 original NERSC hours is now called one node hour. The original NERSC hours were based on Hopper; now we're basing charges on Perlmutter.
A
So
before
we
move
on
to
the
next
thing,
that
was,
that
was
quite
a
good
discussion.
Anybody
else
have
anything
they'd
like
to
shout
out.
There's
a
win
of
the
month.
A
And
if
not,
we
can
hop
across
to
the
next
side
of
the
the
other
side
of
the
coin,
which
is
you
know
today.
I
learned
and
here
we're
interested
in
hearing
stories
of
something
that
surprised
you
that
might
benefit
others
to
learn
about,
and
this
can
be.
This
can
be
either
something
that
you
you
stumbled
across
or,
or
you
know,
learned
about
your
ways
of
using
the
system
that
improve
things,
but
it
can
also
be
you
know
something
that
tripped
you
up.
A
You know, we do research; we're discovering new things and learning things the hard way all the time. Rather than being put off, or wanting to hide the things that didn't work, there's a lot of benefit in actually making noise about the things that didn't work: here's what didn't work and here's how I fixed it, or, I don't know how to fix it yet; maybe somebody else does.
A
So I can kick us off with one that is a mixture of "today I learned" and, actually, a little bit of a win as well, I thought.
A
It was quite an enjoyable challenge to work on, with the help of one of our users, who had a code that was running a bit slower than it needed to, with inconsistent performance. We have some tools on the system that I hadn't actually dug too far into myself, and one of them is Darshan. With Darshan we were able to see that this code was doing a surprisingly high number of I/O calls. It wasn't... well, it was moving a lot of data, but it was moving a lot more data than it needed to.
A
So it was reading, you know, a few hundred megabytes of file, but moving several gigabytes backwards and forwards, and doing millions of I/O calls. With a little bit of digging around we were able to establish that it was using list-directed I/O in Fortran to read in a text file.
When Fortran does list-directed I/O, it was reading 132 bytes at a time, fetching that in many, many small operations, and of course each operation actually fetches a bigger block than what it's reading, so there were a whole lot of operations there. By just setting an environment variable (the Intel compiler has some environment variables that control runtime behavior) we were able to greatly reduce the number of reads.
That greatly reduced the overall amount of time spent on I/O in this job, and the whole thing ran a whole lot faster. So it was a really interesting thing to learn, using Darshan to discover what was going on, and we have some notes about that on our web page as well, and there's kind of a little script. It's enabled by default, so most code that you build, unless you've explicitly switched it off, will be collecting this I/O data. It uses the MPI layer, so I think if your code isn't MPI, for serial codes, it may not work; or maybe it does, but you need to run it under srun. But in any case, take a look at it. It turned out to be really, really helpful.
A
...or reuse, or what? Yeah, so in this case it was about buffering. The file was being read from the community file system, which is like a network-attached file system; it's not on the node. So each time it does a read, it's fetching, at the lower level, a bigger block. It's probably fetching a disk block of, you know, four or eight K or something like that, but it was only actually reading 132 bytes at a time. What I'm not sure of is why it kept on fetching the larger size, or maybe... yeah.
A
Oh, so this is actually buffering in memory, in the Fortran runtime library. Okay, so it comes down to the POSIX read calls that it's doing: at the Fortran level you're writing Fortran code, but underneath it's calling these system POSIX calls, and each list-directed I/O was essentially calling read with a 132-byte buffer to read into. By changing this environment variable, we turned that into something like a 10-megabyte buffer.
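[Editor's note: the fix described above was a runtime environment variable read by the Intel Fortran library (the buffering controls in that family include variables such as FORT_BUFFERED; check Intel's documentation for the exact names and values). The underlying effect is generic, though. Here is a minimal Python sketch, with a hypothetical input file, counting how many read calls it takes to consume a file with a 132-byte buffer versus a 10 MB one.]

```python
# Count how many read() calls are needed to consume a file at two
# buffer sizes. buffering=0 gives an unbuffered handle, so every
# f.read() maps to one underlying read call.
def count_reads(path, buffer_size):
    calls = 0
    with open(path, "rb", buffering=0) as f:
        while f.read(buffer_size):
            calls += 1
    return calls

path = "input.txt"  # hypothetical: the few-hundred-megabyte text file
print("132-byte buffer:", count_reads(path, 132))
print("10 MB buffer:   ", count_reads(path, 10 * 1024 * 1024))
```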
A
So then, yeah, there were fewer of these POSIX read calls. I see Ivan's asked in the chat: what was the name of the tool with which we debugged it? The tool is called Darshan; I'll write it in the chat for spelling.
A
If you do a module list when you log in to Cori, you'll see one of the default modules is darshan, and what having that module loaded does is: when you compile a code, the Cray compiler wrappers automatically link in the Darshan library, and then at runtime, if the module is loaded, it will collect the I/O data. Basically, each time the code does a POSIX read or write call, or an MPI-IO read or write call, Darshan collects information about the call, the time spent in it, and the amount of data moved, and it puts that in a location on scratch that you can read. In our docs, if you do a search for Darshan, you'll find some notes on it, and there's a script there with which you can basically generate a PDF report; it's a high-level summary report of things like the overall I/O.
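[Editor's note: besides the PDF summary script (darshan-job-summary.pl in the Darshan distribution), the logs can also be read programmatically. A sketch using the PyDarshan package ("pip install darshan"); the log path is hypothetical, and exact attribute names vary with the PyDarshan version.]

```python
import darshan

# Load a Darshan log; read_all pulls in records from every
# instrumented module (POSIX, MPI-IO, STDIO, ...).
report = darshan.DarshanReport("myjob.darshan", read_all=True)
print(report.modules.keys())         # which modules captured data
for rec in report.records["POSIX"]:  # per-file counters: call counts,
    print(rec)                       # bytes moved, time in I/O, etc.
```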
A
Do you mean Darshan, or the Fortran one? Yeah, Darshan itself doesn't... Darshan is just purely measurement and reporting, okay, but then you can use those measurements. I guess it's kind of the I/O version of a profiling tool.
A
And if not, we'll move on to our next block of things, which is announcements and calls for participation, and this month we have quite a few. There are several in your weekly emails; you can go back and look those up. Some reminders of important ones: ALCC pre-proposals are due this week, this Friday, which is to say tomorrow. And you may have noticed, particularly if you're logging in from LBL, we now have a federated ID pilot.
A
Allow
you
to
use
your
own
institution
login
to
log
in
to
nurse.
You
can
link
them
together.
So
so
the
pilot
is
just
for
lbl
users.
So
if
you're,
if
you're
a
part
of
berkeley
lab,
you
should
be
able
to
or
when
you,
when
you
log
into
certain
nurse
things
like
you
know,
help.nurse
get
help.nurse.gov,
irs
and
so
on.
Have
the
opportunity
to
kind
of
link
your
if
your
lbl
account
to
your
nurse
account
and
then
you'll
be
able
to
log
into
those
nurse
services
with
your
lbl
login.
A
Heads
up
that
the
winter
holiday
is
coming
up
and
nurse
services
will
be
particularly
consulting
service
and
so
on
will
be
shut
down
during
it.
So
so
there'll
be
no
consulting
between
december
24
and
january
3
and
there'll
be
much
more
limited
account
support
than
usual.
The
systems
will
still
be
up
just
that
getting
getting
help
may
take
a
little
longer
and
of
course,
the
big
one
that
helen's
going
to
tell
us
about
in
in
much
more
detail
in
a
few
moments
is
that
the
allocation
year
transition
is
happening
on
january
19th.
A
I
see
there's
a
question
from
g
in
the
chat.
When
would
the
award
decision
of
2022
allocation
be
announced?
Oh,
I
think
that's
this
week
as
well
check
check
the
weekly
email
in
your
inbox.
I
think
there's
a
note
about
it,
but
I
think
that
that
one
is
planned
for
this
week
or,
if
not
then
next
week,
so
that's
coming
up
very
soon.
A
Hello
might
know
in
more
detail
a
few
announcements
about
goal
matter,
so
we
have
a
user
training
for
perlmutter
on
january
five
to
seven
there's
a
link
to
a
web
page
here,
we'll
post
these
slides
on
the
on
the
meeting
web
page.
So
you
can
just
click
on
that
link,
but
you'll
also
be
able
to
find
them
fairly
easily.
A
Another
training
coming
up
just
after
that
january,
12
and
13,
is
one
on
the
nvidia
hpc
sdk,
which
is
to
say
that
the
nvidia
compiler
suite
that's
our
default
compiler
on
perlmatter
and
the
one
that
we
recommend
for
using
the
gpus.
So
this
is
going
to
be
a
very
useful
training
to
to
join.
A
Other
people
might
have
news:
if
you're,
if
you
have
a
gpu
ready
workload,
you
can
get
early
access
to
palmata,
so
it's
still
not
open
in
a
general
sense,
it's
still
in
its
sort
of
early
access
phase,
but
we
have
a
access
request,
form
that
you
can
get
to
by
clicking
on
this
link.
Put
the
slides
up
shortly
for
gpu
ready
codes.
Also,
there's
a
few
kind
of
things
to
remember
when
you're,
using
both
palmata
and
cory
is
that
home
is
shared
on
both
systems.
A
Yet
more
announcements,
the
annual
nurse
annual
user
survey
is
currently
open.
I
know
we've
had
quite
a
good
number
of
responses,
but
not
as
many
as
we
would
like.
So
please,
if
you
haven't
participated
in
the
survey,
yet
please
do
so.
You
should
find
a
personalized
link
in
your
email
within
a
couple
of
weeks
ago.
Now
it
will
have
come
from
an
address
nurse
mbi,
mbri
research,
dot,
com
and
yeah.
Following
that
link
should
take
you
through
to
the
survey.
A
I
guess,
if
you
can't
find
it
in
your
email,
there's
a
chance
it
might
be
in
spam.
It's
worth
checking
that,
but
the
annual
survey
is
kind
of
really
valuable
for
nurse
on
on
sort
of
two
fronts.
One
is
that
it
helps
us
to
identify
areas
that
you
know
what
we're
doing
well
and
and
what
we
want
to
focus
on
for
improving
for
the
next
year
and
also
it's
very
valuable
in
our
reporting
back
to
doe,
which
you
know
affects
our
funding,
which
affects
resources
available
to
everybody.
A
So
it's
a
really
important
one
to
participate
in
if
you
have
a
or
if
you
are
or
no
a
postdoc
or
sorry
a
post
grade
who
is
looking
for
interested
in
a
fellowship
and
a
good
candidate
for
a
fellowship
applications
for
the
doe
computational
science
graduate
fellowship
are
now
open.
A
This
is
for
first
and
second
year
phd
students,
it's
a
great
program,
the
james
corona's
award
and
leadership
community
building
and
communication
nominations
are
open
at
the
moment
and
this
one's
aimed
at
mid-career
scientists
and
engineers,
which
is
kind
of
an
important
area.
A
And,
if
not
wrong,
if
not,
I
will
hand
the
screen
across
to
helen
who
will
walk
us
through
what
to
prepare
or
what
to
expect
in
preparing
for
the
transition
to
allocation
year.
2022.
F
Hi everyone. Today I'm going to walk through the AY 2021 to AY 2022 transition.
F
So here's a brief outline. I will talk about the allocations for the new year, the transition process, what happens on the start date of the new allocation year, what new changes users can expect, and what happens for discontinued users.
F
Allocation year 2022 starts on January 19th; we normally do this on the third Wednesday of January, and it runs until the corresponding Tuesday of January next year. Make sure you understand that last year's allocation hours do not carry over. The award emails will go out this week, and this year we have separate CPU and GPU awards: the CPU awards can be used for Cori and the Perlmutter CPU nodes, and the GPU allocation can be used for the Perlmutter GPU nodes.
F
Charging on Cori starts on January 20th, the second day of the new allocation year, and Perlmutter will remain free of charge until further notice, maybe sometime around mid-year, after the phase one and phase two integration is completed.
F
Then the computational systems need to sync up with the Iris database, and we need to clean up old batch jobs that do not have a continuing allocation. We sometimes do system maintenance on the day as well, and we will have some new policies and new software at the AY transition, so there's a whole web page where you can read more about it, and I will also talk briefly about it today.
F
This
is
the
time
now
it's
shortly
before
it
ay
2022
starts
so
some
of
the
things
already
we
start
processing,
for
example,
the
new
ay,
the
ay
2021
project
requests.
We
are
not
in
not
accepting
those
as
of
october
14th
and
the
week
before
the
ay
starts.
We
will
not
process
any
new
user
account
now.
F
Doe
also
needs
to
validate
them,
so
no
user,
new
user
account
creation
or
validation
for
that
week
and
after
shortly
after
you
receive
the
your
awarded
projects
allocation,
then
the
pis
need
to
do
something
you
have
a
month
or
so
to
do
that
and
last
deadline
is
the
last
day
of
the
ay21,
the
pis
or
proxies
you
have
to
nominate
which
users
in
your
project
will
continue,
and
you
also
have
to
decide
which
user
will
have
premium.
Qos
access,
there's
an
api
in
iris
in
your
project
and
then
go
to
the
rows.
F
Tab
you'll
see
two
additional
columns.
It'll
only
exist
for
this
muscle,
so
duration
in
irs,
and
then
pis
can
go
in
and
check
box
for
of
your
users.
You
want
to
continue
and
enable
premium
and
there's
also
a
recommended
recommendation
for
pi's
users
to
check
your
premium
jobs
currently
in
the
queue,
and
maybe
you
want
to
update
them
to
regular
since
there's
a
chance
that
those
jobs,
if
they
didn't
run
now
and
they
start
to
run
next
year,
they
might
be.
F
And
now
you
have
to
pay
2x
of
the
charge
and
if
the
users,
if
the
api
forget
to
your
checkbox
the
premium,
then
that
job
will
be
deleted.
If
it's
not
changed
to
to
regular
as
well.
F
On
the
start,
date
of
the
ay
2022,
it's
january
19th
in
the
morning
seven
o'clock,
the
irs
database
will
be
doing
the
new
transition
of
of
updating
to
all
the
new
allocation,
your
data,
so
if
you
usually
already
logged
into
that
to
ios
during
that
time,
you
need
to
log
out
and
back
log
back
in
to
see
the
new
data
we'll
have
a
scheduled
maintenance
for
corey.
On
that
day
and
for
perimeter
we
have
decided
not
to
do
a
downtime
this
week.
F
There
is
actually
one
maintenance
with
this
two
maintenance
one
week
before
and
one
week
after
for
more
unsubstantial
system
upgrades,
but
during
this
transition
day
on
this
day,
and
the
only
thing
we
need
to
do
is
these
learn.
This
learn
database,
sync
with
iris
and
while
doing
this
live,
you
may
experience
a
very
short
period
of
slowness
other
than
that
system
actually
will
appear
to
be
on
up
and
the
other
system
services
will
be
up
on
that
day
as
well.
F
Also
on
this
ay
start
day,
we
will
process
some
and
delete
some
old
jobs
that
are
no
longer
no
longer
valid
for
the
new
year,
for
example,
jobs
associated
with
non-continuing
projects,
jobs
with
with
the
continuing
project,
but
the
user
is
no
longer
a
member
for
the
project.
This
can
happen.
If
your
pi
forget
to
renew
this
users,
they
have
premium
jobs,
you
not
have
access
to
the
premium
qrs
or
the
overrun
jobs.
Is.
F
It
usually
happens
for
users
who
have
already
exhausted
the
allocation
year
so
for
the
new
year
start
starting
the
new
year,
our
overrun
jobs
doesn't
make
sense
and
all
the
user
head
jobs
at
older
than
12
weeks
per
policy
will
also
be
deleted.
F
So
here
is
the
information
about
new
allocation
units
and
new
charging
factors.
As
I
mentioned,
each
project
has
a
separate
cpu
and
gpu
allocations
all
based
on
permanent
node
hours,
so
charging
factor
for
permanent
cpu
is
1.0.
F
Charging
factor
for
promoter
gpu
is
also
one,
and
if
you
want
to
convert
a
new
and
old
hours,
we
used
to
do
cpu
we
used
on
cpu
only
for
corey,
for
example.
Here
we
have
unit
of
nurse
hours,
which
is
based
on
hopper
cpu
hours
now
to
do
the
based
on
the
capacity
and
performance
we
decided
this.
The
equivalent
factor
is
one
new
parameter.
Cpu
node
hour
is
the
equivalent
of
400
node
hours
on
hopper.
F
So
here
is
a
table
that
you
can
see
for
for
first
charging
unit.
You
can
see
the
the
unit
here
and
a
house
wall
in
ay
2021,
the
charting
factor
was,
is
140
divided
by
400,
and
the
new
charging
factor
would
be
0.34
and
k.
L,
a
the
original
80,
the
chinese
factor
of
80
divided
by
400,
is
0.2,
so
you
might
see
a
small
number
of
allocations
and
don't
don't
be
panic,
don't
panic
it's
times
400!
It's
about
equivalence
of
you
are
getting
for
the
next
year.
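[Editor's note: a minimal sketch of the conversion Helen describes, using only the numbers given in the talk.]

```python
# One Perlmutter CPU node-hour is defined as 400 old (Hopper-based)
# NERSC hours, so old awards divide by 400, and the per-system
# charge factors divide by 400 too.
OLD_HOURS_PER_NODE_HOUR = 400

def to_node_hours(old_nersc_hours):
    return old_nersc_hours / OLD_HOURS_PER_NODE_HOUR

haswell_factor = 140 / OLD_HOURS_PER_NODE_HOUR  # 0.35
knl_factor = 80 / OLD_HOURS_PER_NODE_HOUR       # 0.20

print(to_node_hours(1_000_000))  # a 1M NERSC-hour award -> 2500.0 node-hours
```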
F
For the GPU, it's not listed on the table; there's only one system, Perlmutter, and its factor is one, and future systems will be based on the Perlmutter factor. As for the changes in AY 2022: we have a new default Python module; it'll be changed to python/3.8-anaconda-2021.05, which is Python 3, with no more Python 2 support.
F
This version already exists on Cori, where it's not the default, and it is the default on Perlmutter, so you can use it now, and if you find any issues you can already report them to us via our help portal, which means filing a NERSC consulting ticket.
F
We
will
also,
we
also
plan
to
do
a
big
query.
Os
upgrade
to
accommodate
the
the
the
system
up
system
need
for
for
supporting
matching.
You
know,
for
example,
the
security
requirement
from
from
our
hpe
from
doe
management
so,
and
we
promise
this
is
going
to
be
our
last
major
planned
major
os
upgrade
unless
there's
another
critical
search
security
concerns.
F
The CDT includes all the Cray-supported packages, such as MPI, LibSci, NetCDF, HDF5, compilers, performance tools; a lot of things are in it. So if we upgrade the CDT version, meaning all these packages get a new software default, then we're also going to upgrade, or rather change, our Intel compiler default to a newer version. We'll provide a web page later with all the details of this version change.
F
And
it'll
be
announced
in
our
weekly
email,
so
one
thing
you
have
to
do
if
your
application
is
statistically
compiled,
you
have
to
re-link
because
of
os
upgrade,
and
we
do
recommend
you
to
rebuild
all
applications
because
of
the
newer
software
default
and
also
the
os
upgrade
nurse
plans
to
rebuild
all
our
supported
software.
F
The
last
slide
about
discontinued
users,
if
users
with
no
active
project
for
new
year,
they're,
considered
discontinued
as
effective
on
the
first
day,
but
they
will
have
a
month
to
access
the
systems.
They
cannot
run
batch
jobs,
but
they
can
log
in
access
their
files
and
bring
them
back
to
their
system
for
longer,
storage
hpss.
They
still
have
write
access
for
one
month
and
then
five
more
months
to
read
only
and
then
they'll
have
no
access
to
hpss.
F
So
this
is
all
I
have.
Thank
you
very
much
and
if
there
any
questions
I
can
answer,
I
will
also
check
the
chat
messages.
E
Hey Helen, you mentioned that Cori's update schedule will be three times a year. Do you have any calendar dates targeted?
F
That doesn't mean Cori will be updating three times a year. I was talking about the Cray PE, called CDT, the Cray Developer Toolkit; its release is three times a year. We don't have to install them, and we don't have to change our default; that's just their release schedule. In the past they did a release every month, and we were doing roughly quarterly installations of newer versions, but we were taking a year or so to change our default.
F
It's not a delay; I mean, I was talking about how often we change our default. We promise not to change it more than once a year, unless an OS upgrade requires a change because the older versions won't be compatible.
F
So
the
last
version
we
had
is
like
we
had
it
in
2020,
it's
late,
so
it'll
be
almost
two
years
before
we
change
default
this
time
and
we
plan
not
to
change
it
anymore
before
corey
retires.
Okay,.
A
Yeah, thanks a lot; lots of good information there, and quite a lot to take in.
F
And on the question about allocation limits for the ALCC: I actually don't know about the limits. If you could submit a ticket, we could forward it to the allocations team to answer your question.
D
Actually, just for clarification, it's not the limit. It's that the wording of the call talks about the Perlmutter hours that are available, but it says not to use a charging factor, so I'm just a little confused about the wording.
F
Yeah, just go ahead and ask that question in a ticket. I didn't mention it explicitly in the talk today, but the ALCC actually goes six months off-cycle from ERCAP, so the ALCC allocations run from July to June, and the allocation request obviously also goes through a different cycle.
A
Thanks
again
that
one
dude
any
final
questions
before
we
move
on
to
our
next
segment.
A
Okay,
yeah
thanks
again
for
lots
of
very
useful
information
there.
A
So,
coming
up
next,
we
are
always
looking
for
interesting
topics
for
these
monthly
meetings.
Tentatively
next
month,
one
of
our
regular
participants
will
present
some
work
that
they
have
done.
So
that
should
be
should
be
really
good.
We're
also
planning
a
topic
around
nurse
documentation
and
yeah
what's
there
and
how
you
can
join
the
effort
and
contribute
yeah.
As
I
was
saying,
we're
always
very
interested
to
hear
what
our
users
are
doing.
A
So,
if
you've
got
some
work
that
you'd
like
to
show
off
the
monthly
meeting
is
a
a
good
opportunity
for
it,
and
you
know
very
interested
to
hear
you
can
either
send
a
ticket
or
dm
me
on
slack.
If
you
have
something
that
you
would
be
interested
in,
presenting.
A
So I was looking through a few other samples of numbers that tell interesting things about the state of the system, and one, if you haven't already found it, is on my.nersc.gov.
A
Under
I
think,
it's
under
jobs
and
there's
one
of
the
the
pages
is
called
cue
backlogs
and
what
the
backlog
is
is
especially
the
sum
of
the
amount
of
work
that's
currently
in
the
queue
in
units
of
like
one
whole
system.
A
So
what
this
chart
is
showing
the
texture
is
a
little
bit
small.
The
top
chart
is
for
haswell
and
near
the
the
end
of
november,
beginning
of
december,
we're
kind
of
hovering
a
little
over
7.5
kind
of
eight
q
backlog
of
7.5
or
8,
and
what
that
means
is
that
there's
7.5
or
8
days
worth
of
work
using
all
of
the
resources
of
the
query
as
well
nodes.
A
So
if
all
of
the
queued
jobs
ran
for
the
entire
time
that
they
requested-
and
you
know
often
jobs-
you
know
request
more
than
what
they
need
is
a
little
bit
of
a
buffer.
So
this
can
be
a
slight
over
estimate,
but
it
would
take
seven
and
a
half
days
for
all
that
work
to
get
through
the
queue
and
which
can
sort
of
give
some
hints
about
how
long
you
can
expect
a
job
to
wait,
particularly
if
it's
a
if
it's
a
long
job
short
jobs,
can
usually
jump
the
queue
and
start
earlier.
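[Editor's note: the backlog metric is straightforward to reproduce. A sketch with hypothetical queued jobs; the partition size is an assumption (Cori has roughly 2,388 Haswell nodes).]

```python
# Backlog = total queued work divided by the whole partition's
# capacity, expressed in days: how long it would take to drain the
# queue if every job ran for its full requested walltime.
queued_jobs = [
    # (nodes requested, requested walltime in hours) -- hypothetical
    (512, 12.0),
    (64, 48.0),
    (1024, 24.0),
]

TOTAL_NODES = 2388  # assumed Haswell partition size

node_hours = sum(nodes * hours for nodes, hours in queued_jobs)
print(f"backlog: {node_hours / (TOTAL_NODES * 24):.2f} system-days")
```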
A
The bottom chart is KNL; we have a lot more KNL nodes than Haswell nodes, and the backlog on that one, this is actually quite difficult to read, but I think it's hovering around the four mark. So it's a useful chart to look at if you're wondering where your job is in the queue.
E
So, Steve, that's the queue size, right, the queue waiting size. Do you have an average of how long the queue wait time normally is?
A
Yes,
there
is,
there
is
another
page
on
my
dirtnurse.gov,
I
think
it's
under
it's
either
under
jobs
or
under
center
status.
There
are
two
sections
that
overlap
a
little
bit
there
and
it's
about
average
queue
wait
times
and
that
shows
sort
of
for
different
sizes
of
jobs
in
terms
of
number
of
nodes
and
length
of
jobs
in
terms
of
number
of
wall
clock
hours
by
taking
a
sample
of
recent
jobs.
A
I thought this was kind of an interesting chart. We normally have just a few numbers here on the screen about new tickets coming in and tickets closed, so this is actually showing tickets coming in; the colors are for different resources.
A
Essentially
different
types
of
requests,
but
you
can
see
the
the
number
of
tickets
coming
in
over
is
actually
over
over
a
couple
of
years
on
a
on
a
month
by
month,
basis
can
be
quite
high,
several
several
hundred
per
month
in
some
cases,
and
the
the
bottom
line
here
is
the
number
of
kind
of
active.
You
know
currently
open
tickets,
this
this
one's
only
over
a
month.
A
You
can
see
that's
a
lot
more
consistent,
which
pretty
much
tells
us
that
nurse
support
is
fielding
tickets
at
at
roughly
the
same
rate
that
they
come
in,
but
we
we
still
do
have
a
bit
of
a
backlog.
E
Sorry,
steve
yeah,
it
seems
to
me
that
I
have
most
of
the
questions
but
anyway,
since
I
don't
have
a
legend
here,
so
what
is
the
biggest?
You
know
breakdown
for
the
tickets,
what
categories.
A
We get a lot of questions about Cori, a lot of questions about allocations and Iris, and requests for access to things. This is quite a broad swathe of tickets, and it also includes things that are more off to the edge. We're seeing more tickets about Perlmutter; a good number of tickets about software, either using or installing it; and another one is understanding what happens with jobs.
A
Okay,
a
popular,
a
frequent
query
that
people
have
but
yeah.
Actually,
that
would
probably
be
an
interesting
topic
itself.
There's
a
breakdown
of
the
types
of
questions
that
people
ask:
that'd
be
a
good
topic
of
the
month
for.
A
There are certain definitions of what constitutes an outage. So, for instance, if something causes the fraction of nodes in use on Cori to drop below some threshold, that triggers an alarm, because it often suggests that something's wrong, although it can just mean that there's a very large job in the queue and a lot of other jobs need to be drained for it to start.
A
So
it's
one
of
our
you
know
ongoing
challenges
with
getting
ensuring
that
there's
good
utilization
as
well
as
sort
of
you
know,
fair
access
and,
and
so
on,
like
that,
so
the
the
second
one
here
c-scratch
is
a
luster
file
system.
Lustre
is
a
very
high-performing.
Lustre is a very high-performing, very high-scaling file system, but it's like running a Formula One car versus running your own regular personal car: it requires a little bit more care and feeding than a lot of other file systems, and occasionally things kind of jam up in some way or another. Because cscratch is fairly critical to the system, that can cause the system to become effectively unavailable for a while; so that's what happened here.
A
The
problem
with
the
service
nodes
was
essentially
one
of
the
one
of
the
management
sort
of
services
that
keeps
different
aspects
of
quarry
running.
I
think
something
went
wrong
with
one
of
them
and
it
took
a
little
while
to
clear-
and
you
know.
A
And
so
so
we
do
get
a
mixture
of
hardware
failures
and
software
issues,
and
I
don't
actually
know
what
the
breakdown
is
last
month,
for
instance,.
A
That
was
due
to
a
a
power
unit
like
a
a
rectifier
or
something
like
that
hardware
unit
failed
and
it
took
down
the
cabinet
and
when
it
took
down
the
cabinet,
it
took
down
some
service
nodes
and
so
on,
and
so
yeah
that
had
some
knock
on
effect.
So.
A
Usually, when it's an unscheduled outage, it means that something has caused the system to be effectively unusable: it might be stopping jobs from starting, or stopping people from being able to log in, or the file system is so frozen up that people's sessions freeze. When there's a scheduled outage, it's usually a much more controlled shutdown of the system, and a bring-up afterwards, for maintenance.
A
Okay, so we're right at the top of the hour. Thank you all for joining; we'll post the slides and recording shortly on the web page for the meeting, and we'll see you after the new year. Enjoy the holidays, for everybody who will be celebrating them, and hopefully get a bit of a break.