From YouTube: NUG meeting 15july2021
Description
Video recording of the NUG monthly meeting, July 15, 2021
A
Okay, I think we've probably got around about a quorum, so let's begin. A heads-up and reminder first that we are recording the meeting; we'll make the video and slides available on the meeting webpage at www.nersc.gov shortly afterwards.
A
Yeah, okay, so we'll follow our normal format, which I think people are fairly familiar with now. The idea of this is to be an interactive meeting, which is to say: please participate. You can either raise your hand or just unmute yourself and speak up when you've got a question or a comment to make; there'll be lots of opportunities for that. We've got around about 20 people at the moment, so that's a comfortable enough size, I think, that we can just speak up.
A
So that's a good place to post comments and to continue the conversation; we tend to use the webinars channel just to keep it as a separate place from the general channel.
A
So our agenda will follow our normal pattern. We'll start out with a Win of the Month and its flip side, Today I Learned; we've got a few announcements to make, and there's an opportunity for participants to make announcements there too. Then, for our topic of today, we have Norm Bourassa from NERSC's Building Infrastructure Group, who's an energy efficiency expert and has done a lot around NERSC's setup for energy efficiency in its computer room.

They'll talk us through a little bit about some of the things that we're doing, and we'll have an opportunity for discussion there, and then we'll finish up with some looking ahead to what's coming up and a quick run through our numbers for last month. So, starting out with Win of the Month: the aim of this segment is to basically share and celebrate the achievements in our community, and they can be big or small: getting a paper accepted, solving a challenging bug.
A
On the NERSC side, and I'm pretty sure this happened since our last gathering: the ISC June Top500 list came out, and Perlmutter came in at number five, which is the same spot at which Cori debuted, actually, also number five on the list.
A
A bit of a quiet month, so we can step along, and maybe also combine this with the flip side of that, which is Today I Learned.
A
Of course it's great when something works, but a lot of the path to getting something working is finding a lot of things that didn't work first. So this is kind of an opportunity to swap ideas and notes and share stories about things that were difficult, things we got stuck on, dead ends that we hit, things that seemed like they ought to work but didn't.
A
The idea here is that we can learn from each other and bounce ideas off each other about how to solve things as well. It's also an opportunity to talk about a new tip for using NERSC systems that you might have come across recently, or just something interesting that you learned or read recently that might interest others.
A
For me, a lot of the learning in the last month has been experiences using Spack: using Spack to set up a bunch of software environments, which we have both on Cori as well as what we're setting up on Perlmutter. It's a very powerful tool; it does a lot of clever things. There can also be some challenges, like working out what didn't work when a build failed.

It's an interesting experience diving in and looking at how it does things, but something that's been really helpful there is the community on Slack. So if you're using it to install software, there's a Spack Slack, which has quite a helpful group of users.
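For anyone new to it, a minimal generic Spack session looks something like this (the package name here is just an example, not a NERSC-specific recommendation):

```bash
# show the resolved spec (compiler, variants, dependencies) before building
spack spec -I hdf5

# build the package and everything it depends on
spack install hdf5

# list what is already installed in this Spack instance
spack find
```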
A
You can go back and check that for links to things and so on. An important thing that will affect people: in the latest maintenance we updated Slurm, and there is a slight change in Slurm's behavior, which is a --overlap flag for when you're running multiple sruns on the same node. Most typical, straightforward use isn't impacted by this.

The most common use case, of course, is srun -N <nodes> -n <tasks> my_executable, but there is a reasonably common case where sometimes you want to run multiple programs on the same node: you're starting one srun on half of the CPUs, for instance, and another srun on the other half of the CPUs, either as part of a workflow or as two things that are working together. For those, it used to be fairly simple: do one srun, put it into the background, do the other srun, put it in the background, wait for them all. You now need to let Slurm know that it's allowed to start these runs on the same node, and so there's a new --overlap flag for that. We have some examples in our docs of how to use that; this is just a short form of what's changed there.
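As a minimal sketch of the pattern described above (the program names, node count, and task counts are placeholders; see the NERSC docs for the canonical examples):

```bash
#!/bin/bash
#SBATCH --nodes=1
#SBATCH --time=00:30:00

# Two job steps sharing one node: since the Slurm update, each step
# needs --overlap so Slurm will let them run on the same resources.
srun --overlap -n 16 ./program_a &
srun --overlap -n 16 ./program_b &
wait   # block until both backgrounded steps have finished
```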
A
The other important announcement is that we have a Cori OS update planned for September. It's a minor update, but we will be changing the PE, which is the programming environment, kind of the default set of modules that you get when you log in. One expected impact of this is that statically linked things will need to be re-linked. Dynamically linked things should be fine, because with dynamic linking they'll get linked at load time to the appropriate updated version of a library; but statically linked things will need to be rebuilt.
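If you're not sure which category a given binary falls into, the standard Linux tools will tell you (the binary name is a placeholder):

```bash
file ./my_app   # reports "statically linked" or "dynamically linked"
ldd ./my_app    # lists shared-library dependencies, or prints
                # "not a dynamic executable" for a static binary
```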
A
There are some CFPs coming up that we know about; the links to these are in the weekly email. There's a workshop on accelerator programming using directives, and a parallel applications workshop on alternatives to MPI+X, which I think covers things like GASNet and global arrays sort of stuff, and, also at SC21, the SuperCheck workshop on checkpointing.

There are a few training things coming up. The ECP webinar series, if you haven't seen it: there's some really interesting stuff there, and some good little tips for using HPC and for scientific computing, and after the webinars are complete they upload links to the recordings onto the website.
A
Links to the website are in the weekly email, so take a look. The next one coming up is on multi-institutional scientific software development, and some lessons learned, best practices and so on; that's in August.
A
Tomorrow, I believe, there's a training hosted by NVIDIA, I think about CUDA multithreading with streams; this is useful for preparing for Perlmutter. There's also, in about another month, in late August, a four-day CMake training where we're partnering with Kitware. So if you develop, or even if you just build and install applications, that could be worthwhile.
A
And one other kind of interesting announcement that we have, if you haven't seen it already: the E4S, which is the Extreme-scale Scientific Software Stack (part of the ECP project, I think), has an updated version, 21.02, that is to say February 2021, now available on Cori. I think everything's built for Cori Haswell.
A
It may or may not be available for KNL as yet, but you can use the specs there as a starting point for that. There's quite a thorough set of packages and libraries: ADIOS, HDF5...
A
I think PETSc might be part of it, SLEPc, so there are some good libraries available there. To get at it, it's sort of a two-step process: you'll need to module load the E4S stack first, and that will put other modules in your path with the specific packages. It also sets up a Spack environment for building things on top of it.
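Something along these lines (the exact module name and version string are assumptions; check module avail e4s on Cori for the real one):

```bash
module load e4s/21.02   # step 1: load the E4S stack metamodule
module avail            # step 2: the stack's packages now show up as modules
module load hdf5        # then load whichever package you need
spack env list          # the stack also provides a Spack environment
                        # you can build your own packages on top of
```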
A
That's the announcements that I have from NERSC's side. Does anybody have any other calls for participation that they'd like to bring up?
A
Yeah, we're racing through today's meeting; normally we don't arrive here for another 10 minutes, but I think I saw that Norm has got some time to bounce through additional topics.
C
Well, yeah, we can go below the fold; I kept the other slides in there for below the fold if needed. Here we go. Oh, hang on a sec here, I've got to show you.
A
By way of introduction: Norm's in the Building Infrastructure Group and has done a lot of work around energy efficiency, and has some quite interesting stories and clever tricks, though tricks might not be the right word.
C
So I will say, I tried to find that Slack channel, webinars, but I didn't see it, so if any questions end up coming through there, I'll have to rely on you to relay them.
C
So, as I said, my name is Norm Bourassa. A little word on my history: I've been an employee at Lawrence Berkeley National Lab since 2000, so 21 years now. I initially came to Lawrence Berkeley Lab after having worked in an area called energy engineering: energy efficiency for commercial buildings.
C
In the wake of us coming to the new CRT building, now called Shyh Wang Hall, here on campus, I joined NERSC at 50% time in 2017 to help with the energy efficiency and energy performance of the building; I'll go over those reasons later. And in early 2019, just before the pandemic (well, actually before that it was already 100% time), I transitioned fully over to the division, and now focus my time on making sure that the building performs well.
C
I'll go over the reasons for some of that later, but I first want to talk about something that all of the users here in our community are probably pretty well aware of: the first level of energy improvement just comes down to the processing capability of these scientific computing platforms. This is an aspect of the generational improvements of our systems that users may not have appreciated. But if you look, let's just go as far back as Edison.
C
If you look at its power consumption for its compute throughput: when we deployed Cori, we got roughly five Edisons with really only a doubling of the power. So if you think of that, this is Moore's law stuff, but that's energy efficiency right there: that computational throughput for less power consumption is energy efficiency at the first order, and we're getting the same thing with Perlmutter.
C
So it's important not to lose track of that. One thing you will notice: we are getting a doubling of power but a five times improvement, and it's actually going to come in slightly under five times, so we can see the softening of Moore's law. And, believe it or not, we're actually looking at NERSC-10 now, and it looks like there will be even more of a softening of that. But this is an important aspect: our computational technology is providing core energy efficiency improvement.
C
That's not to negate the need to make sure that the infrastructure and other support aspects of our operational services are paid attention to. So I've got three basic levels of energy efficiency at the support and infrastructure level. The next level that I always like to emphasize is that once you start a compute job,
C
it should complete. Because if it doesn't complete, if it gets to 70 percent or to 30 percent, anywhere before it gets to 100, and it crashes, that's just pure waste out the window. This is one of the reasons why we put so much emphasis on helping our users with their code, and we do the best we can to ensure that once jobs are deployed they have a high probability of succeeding.
C
Then site-specific facility design is next. NERSC is located here in Berkeley, and we happen to have a very mild climate, and we're able to produce an HVAC system that doesn't use chillers, basically the vapor-compression air conditioning which we're all very familiar with. (Oh, the lights turned off on me; I'm the only one on the floor right here.)
C
So this is an important aspect: being able to just dissipate the heat to the environment without very energy-intensive HVAC equipment. I'll talk about that later. And then the last thing is, once you have that equipment deployed, having high-resolution monitoring tools and data analysis tools to adequately determine whether those systems are performing optimally is a perennial challenge.
C
NERSC has invested a lot, both in staff and in systems deployment, to be able to monitor how our systems are performing, and to analyze and improve them: this positive feedback loop. And indeed we're helping to set the standard for the state of the art in scientific computing. A few quick words on our building: we're a four-story, 150,000 square foot building, and basically 40,000 square feet of that is offices.
C
We don't count that in our efficiency metrics. We basically have a power supply capability of 21 and a half megawatts, which is usually about double what our expected draw is, and our two systems, Perlmutter and Cori, are capable of drawing a peak of 10 megawatts.
C
As I said, we run year-round compressor-free air and water cooling systems. We're LEED Gold rated. We have an annual average PUE of 1.08; I'll talk about that further.
C
Right now in Berkeley, NERSC, or Shyh Wang Hall, is approximately 40% of the campus energy demand. So we are basically designated here as the significant energy user on the Berkeley Lab campus. That gives us a lot of extra attention and help from the lab directorate in identifying energy efficiency measures, and I'll talk a little bit more about that.
C
We collaborate deeply with my former division, the Energy Technologies Area, where there is a data center efficiency center for general IT for the private sector, and their experts come in and help me. We also have a specialist energy engineering consultant called kW Engineering.
C
This gentleman right here comes in and does consulting for all of the buildings on campus, with special attention for NERSC because of our high energy demand. What are the targets that we use to know that we are actually operating efficiently? We have two metrics: the power usage effectiveness (PUE), and a subset of it, which is the IT power usage effectiveness. I'll talk a little bit more about that; basically, PUE is the facility-wide one.
C
So, for example, if the compute energy is, say, six megawatts and our total facility is 6.8 megawatts, that 0.8 represents the facility overhead: that's how much electricity we consume for all of the services over and above the compute. IT power usage effectiveness eliminates some of the non-support stuff that doesn't matter as much, looks just at the IT inside the cabinet, and peels out the HVAC; it's a way of understanding the efficiency of the HPC directly. Another one that's becoming much more important is water usage effectiveness (WUE): our water cooling facilities use these large cooling towers, which you can see a picture of down here.
C
Currently, with Perlmutter, we're projected to evaporate somewhere around 60 million gallons of water a year, so it's a lot of water. That was a large increase from the Cori-only, or the Cori-and-Edison, era, when we were down around 12 to 15 million, and that's because Perlmutter, while it's a more efficient system, is using 100 percent liquid cooling, and so it hits the cooling towers harder and we're evaporating a lot of water.
C
The preliminary analysis numbers for NERSC-10, on the other hand, have that water use ballooning to around 180 million. So we are in the process of really seriously looking at our water usage effectiveness, and we're starting to evaluate new technologies for the NERSC-10 era that will use a lot less water.
C
Water usage effectiveness is a different type of metric: it's not unitless like the other two. We look at the amount of water and the cooling plant energy, and you end up with liters per kilowatt-hour.
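Written out, the two headline metrics are roughly (a loose formalization using the example numbers from a moment ago; in the usual definition WUE's denominator is the IT equipment energy):

```latex
\mathrm{PUE} = \frac{E_{\text{total facility}}}{E_{\text{IT}}}
  \qquad \text{e.g. } \frac{6.8\ \mathrm{MW}}{6.0\ \mathrm{MW}} \approx 1.13

\mathrm{WUE} = \frac{\text{water consumed [liters]}}{E_{\text{IT}}\ [\mathrm{kWh}]}
```

PUE and its IT-focused variant are dimensionless ratios with an ideal value of 1.0, while WUE keeps units of liters per kilowatt-hour, which is part of why it's so site-specific.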
C
We have been monitoring that metric for a little bit over a year now, and this is site-specific, so you really can't compare one site to another. We're in the process right now of determining, for our current cooling plant with Perlmutter, what our efficiency point is, and we will be developing energy efficiency measures and improvements on that. Any questions on this, by the way? Don't feel shy about just interjecting with questions; I actually prefer to have more of a conversational approach to these topics.
C
How do we, as I mentioned in that third bullet, pay attention to how the systems operate? Well, we have deployed this system we call OMNI. This is an old flow chart of everything that's happening; it's actually not very current anymore, but it's still illustrative of how detailed it is. Out of all of the hyperscale and DOE data centers, we have the most high-resolution instrumentation system. Most of the time, in most other facilities and historically, when a new project is occurring, or when we know that there's a deficient area of performance in a data center, a project plan is put together, a data monitoring plan is put in, the systems are deployed, there's a period of gathering data, then we decide what to do, then a project is designed, the work is done, and then, post-work, we look at the data again and see how well we've improved things. Well, that's a lot of time delays, and in the period when you're gathering the data you've got inefficient operation and wasted energy.
C
We operate on a different ethic, where we say: we don't know in advance what we will need to measure, and by the time we notice we need to get more eyes on the performance of a piece of equipment, it's too late. So we have decided that we have the capability of just gathering everything, and when we see that we have a problem, we will have that data already.
C
We can go into implementing corrections and adjustments immediately, and thus OMNI was born. That's what we have, and we keep the performance data indefinitely.
C
We use the OMNI system, and the data that we have becomes this triumvirate of support for our optimization of the entire facility, together with Sustainable Berkeley Lab (that's the LBNL directorate that helps leverage their resources to help us improve NERSC), and it's a positive collaboration which is rapidly becoming a template for energy efficiency improvements in the entire DOE, and indeed the hyperscale scientific community at large. One example of an area that's emerging in all of the large top-50-type data centers is operational data analytics (ODA). What our OMNI platform does is co-mingle HPC telemetry and HVAC infrastructure monitoring data into a common database that we can then analyze together, time-synchronized, and be able to deploy solutions.
C
This is a very, very powerful tool. This tool right here, SkySpark, allows us to look at all of our... it basically has the Cray blower fan performance data from Cori in this same platform along with the HVAC, and it allows us to establish a baseline. This is showing a scatter plot of the cooling plant performance in both the baseline period and a targeted analysis period. So what we have here is the baseline fan power and pumping power.
C
The scatters are showing where we should be performing. In this example we've done settings changes, and we're looking at that analysis period versus the baseline, and this plot is showing that we are actually burning a lot more energy here in the cooling tower fans. So this helps us zero in on the settings that are in the cooling plant, and we can do this in real time and iterate and dial things in. Yes?
A
I noticed, just this past weekend, this being used in real life, in fact, and probably Norm can comment with more detail. We had a power outage over the weekend for power maintenance work, and when that happens, when we don't have the main feed of power, we can actually keep a fair bit of stuff running on backup power, but Cori compute is just a bit too much for it.
A
So you might have noticed that you could still use things like the DTNs to move data around, and I was watching the internal Slack channel a little bit, because the one time that people were a little bit concerned about was the middle of Saturday afternoon, when the temperature was forecast to be at its peak, and the big question was: do we think, if the forecast is right, that we have enough cooling capacity that can be driven by the backup generator to keep those things running and cooled? There was a bit of chatter during that time on the internal Slack channel as the operators were watching what I imagine were these charts that Norm's showing right now.
C
This is the chart exactly, and I happen to have it up here. These are the environmentals in the racks that you were talking about, and this is what we were chattering about. These are deployed sensors that show us the air intake temperatures. As a matter of fact, we're going to go to the last seven days, and we can show everyone the outage that was over the weekend; so, this period right here.
C
These were the temperatures right in this period here, where we were just operating on that one air handler, and we were talking about these temperatures as we were down to just the two backup air handlers, and we were able to make sure that this common-area, air-cooled equipment was operating correctly. That's exactly right, and this is the OMNI system that was still operating during the power outage.
C
Yes, exactly right. That's an example, and this Grafana here is an example of one of the ODA tools that we use. For example, I think we've got this one right here: this is where I can view the actual performance of all my air handlers, and this one shows me the cooling distribution units for Perlmutter; I'm able to see the inlet water temperature and the outlet water temperature.
C
So we have a whole host of instrumentation that allows us to watch the performance of these systems, and this is one of the other ones. Getting back to the slide deck: I mentioned that we have the capability of HVAC high-resolution telemetry as well as HPC high-resolution telemetry, and I've got one example that I'm going to work through that you might be interested in. This is Cori, an XC40 Cray system, part of the Cascade line.
C
This is something that users may not have known about Cori: it's a combination, a hybrid system. It uses cooling water for 70 to 80 percent of the heat extraction from the processors, and then the balance of that, 20 to 30 percent, is from air that is blown through all of the compute cabinets in a cascade fashion. This is why, if you've ever been in the room with Cori, a lot of people like to use earplugs: it's a honking noisy machine.
C
Another blower fan brings the velocity back up; rinse and repeat, all the way down through the something like 15 cabinets, until it exhausts out into the room.
C
Well, we interacted with Cray way back during the Edison phase to tell them: hey, these fans consume a lot of energy. It was a significant portion of Edison's total energy consumption at the time, roughly 12 percent, and for Cori it was somewhere in the range of 350, almost 400 kilowatts of Cori's energy use, which at that time was around 2.75 megawatts. But anyway, they only gave us three fan speeds in Edison: idle, nominal (as I've called it), and maximum, basically 2500, 3200, and then 4000 rpm. From interacting with us, they developed a dynamic fan speed control feature, where it would monitor the processor temperatures in the row, and if there was one hot-spot node in the row, with a processor running hotter than the rest, then for every five degrees C of processor temperature change up, it would modulate the blower fans up by 150 rpm, until it got to the 4000 rpm maximum. This allowed some power responsiveness on these blowers, up and down. That was great.
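A sketch of that control rule as described (the reference temperature and the exact mapping are illustrative assumptions, not Cray's actual firmware logic):

```python
MIN_RPM, MAX_RPM, STEP_RPM = 2500, 4000, 150
REF_TEMP_C = 60.0   # assumed temperature at which fans sit at minimum
STEP_TEMP_C = 5.0   # each 5 degC above reference adds one rpm step

def blower_rpm(hottest_processor_temp_c: float) -> int:
    """Fan speed for a row, driven by its hottest processor."""
    steps = max(0, int((hottest_processor_temp_c - REF_TEMP_C) // STEP_TEMP_C))
    return min(MIN_RPM + steps * STEP_RPM, MAX_RPM)
```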
C
It provided us roughly seven percent energy savings just out of the box. But because we're compressor-less, our water temperature kind of fluctuates with the outside air temperature; what we call the wet bulb dictates how cold the water the cooling towers can make is. We found the problem was this static cooling-coil exiting-air temperature (basically, the servo control loop that says: okay, this cabinet air temperature should be 22 degrees C), if that's a static set point. So we decided to develop an active script on Cori that we call dynamic setpoint.py. It's a system management workstation script which looks at the cooling water temperature and actively adjusts that cabinet cooling temperature set point, to make sure that we are not just driving that valve wide open trying to chase an unobtainable cabinet air temperature.
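Very roughly, the idea looks like this (all names, offsets, and bounds here are illustrative; the real script runs on the system management workstation against live sensor and control points):

```python
APPROACH_C = 3.0                    # assumed achievable air-over-water approach
MIN_SET_C, MAX_SET_C = 20.0, 27.0   # assumed safe bounds for cabinet air

def cabinet_air_setpoint(cooling_water_supply_c: float) -> float:
    """Choose a cabinet-air setpoint the cooling coil can actually reach,
    instead of chasing a fixed target the water temperature can't support."""
    target = cooling_water_supply_c + APPROACH_C
    return min(max(target, MIN_SET_C), MAX_SET_C)
```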
C
This way we have been able to get a much more agile, seasonal performance out of the dynamic fan speed control feature in Cori, and we're shaving off these points here. These represent cooling water pump energy, and these are really, really big cooling water pumps (they're 125 horsepower each), so when we're circulating all this water with those cooling water pumps, if we can knock those pump speeds down by a couple of percentage points, it translates into some energy savings, in a small way.
C
Power delivery demands from jobs starting up on these huge exascale systems are probably going to demand even more interactive communication between the cooling plants, the power system deliveries, and the HPC system. An exascale system that starts up a job that's going to use, say, 75 percent of the system may all of a sudden, in the blink of an eye, demand five megawatts more of power, and indeed that translates into cooling demand right away. A cooling plant can't respond to that instantaneously.
C
It is going to need some sort of pre-warning, where the job scheduler says: okay, cooling plant, get ready, in 10 minutes we're going to need five megawatts' worth of cooling capability. This, in a small way, represents the future world we're moving towards of power-responsive cooling plants. Just in closing, a couple of words on what the Building Infrastructure Group does for energy efficiency in the future, for Perlmutter and now NERSC-10.
C
We interact heavily with the design teams to make sure that we incorporate energy efficiency concepts in the actual design, right down to the owner's project requirements, as well as all of the review during construction and commissioning, as well as engaging with equipment vendors, where we see that some new technologies coming from manufacturers, which might be four or five years out, could benefit us.
C
I mentioned earlier in the talk that NERSC-10 could potentially raise our water evaporation up to 180 million gallons a year, which actually exceeds the capacity of the pumps feeding Lawrence Berkeley Lab, so we have to find alternatives. We've started engaging with what we call dry coolers. These are basically just huge automotive radiators: big fans that blow through radiator arrays to help cool down a closed loop, which then goes into the cooling plant. Now, they take more surface area, but they don't evaporate water, so we're engaging with those manufacturers to see if we can use that technology for NERSC-10 to help with our water evaporation. There are other NERSC activities in planning, too.
C
We are looking at machine learning methods to help us optimize our settings, especially the balance between the cooling tower fans (our most energy-intensive component in our cooling systems) and the pumps, which are a little bit more efficient. Right now we are kind of hitting the fans more, and the settings between the two are difficult to tune manually; they're kind of seasonal.
C
They need to be set one way for one cooling or heating season versus the other, and it's difficult to find an algorithm that is fully agile across all of the different types of conditions. So we're in the process of planning out the data that we need in order to deploy some machine learning models that will be much more agile in that regard.
C
HPE also has some products that we're evaluating, and we regularly do outreach and collaboration with the various centers around the world (like I said earlier, the top-50 centers); we exchange ideas, they help us, we help them, and we try to stay on the cutting edge of everything. Then, in closing, I always like to put the word out for everybody that's on the team, and for this summer we also have two summer students, engineering students, Gabriel O'Reilly and Nicholas Ventura, who are both helping us in various corners as well. So with that, this time I'm going to put you guys on the spot for some questions.
A
That's really interesting, thanks Norm. So, a couple of questions and comments that your presentation brought to mind.
A
I thought it was really interesting that you found almost a law of unintended consequences there, where adjusting the fan speed to improve the efficiency of Cori triggered an interaction with the cooling system that then kind of undid the good that the fan speed change was doing; and so by coupling the information together, you're able to get them to cooperate instead of stepping on each other's work.
C
Yeah, it's like we have these offsetting savings: we got savings in Cori, but then we looked at it holistically, with the second-order effects that might be elsewhere (this is very common in building science energy efficiency; there are second-order and third-order interactive effects all the time), and it looked like they were roughly offsetting. Now, this is a unique situation in our facility, and every facility is a prototype.
C
We are also site-specific. A lot of the other Cray XC deployments in other centers will have chillers, meaning their chilled water loop has a set point, and the air conditioning equipment makes sure that set point stays tightly within a narrow window, so they wouldn't have had this issue, because that chiller is just going to use whatever energy it takes to maintain that cooling water temperature.
C
So in that situation, the dynamic fan speed control out of the box with the Cray XC system actually works very, very well. But in a situation where the cooling water temperature diverges a lot due to the outside conditions, that static set-point cabinet air temperature does get into trouble. This was a very agile solution, and the CSG group here worked on it.
C
I've got to send all sorts of thanks to them: to Owen James in the OTG group, who developed the initial Python script, and then several people; Adida Gower is now the expert in the CSG group that helps with it. I actually presented this way back at NUG at Supercomputing 19, I think it was, in Denver, or in Dallas, and presented it as, you know, something that could potentially be a future improvement. Cray opted not to go that route, but that's because very shortly thereafter pretty much all of their capabilities were focused in on Shasta. But yeah, that is not uncommon in energy efficiency.
A
Yeah, so it's never quite that simple. The other thing that you reminded me of, and either Norm or possibly somebody else who's on as well can clarify: for NERSC users, the sacct command has an output option where you can get consumed energy, and for completed jobs it shows a value. I think that's getting the energy from somewhere in the OMNI system; do you know?
B
Yeah, I think it uses the Cray CAPMC or something like that.
C
So, while you're looking for that: for our power numbers, we've got several flavors of them. I don't think my slide deck actually says this, but we've got master meters, which we call ION meters, that we use for the campus numbers for our total power consumed. They are revenue-grade and highly accurate, and they're at the substation level. And then the SEDC meters that look at Cori alone are on-board power-sensing meters that go through the SEDC channel and into OMNI; I believe those use the Modbus protocol, which we use to pull them into OMNI. They're not revenue-grade, but they're very highly accurate. And for Perlmutter we deployed new meters, which are called trend-point interval meters: highly accurate, just a tiny bit less accurate than the ION revenue-grade ones, and very, very high resolution, so much so that we decided to submit our Top500 measurement as a Level 3 because of the high sampling rate.
A
We might be pushing on time, you're right. So, for people interested in seeing energy use via sacct: sacct -e (lowercase e) shows you the list of fields that it can display, and the fields are called ConsumedEnergy and ConsumedEnergyRaw.
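Concretely (the job ID here is a placeholder):

```bash
# list every field sacct can report, and pick out the energy ones
sacct -e | tr -s ' ' '\n' | grep -i energy

# per-step energy for a completed job
sacct -j 12345678 --format=JobID,JobName,Elapsed,ConsumedEnergy,ConsumedEnergyRaw
```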
B
One thing I just want to make a point of: if you want to do in-depth analysis, that number may not always be accurate. It's okay for getting a rough estimate of what you are using, but if you need more information, I think you can get in touch with us and one of us will be able to help you get exact job usage values, with more certainty.
C
Yeah, one of us can help you with that, and in fact we helped another NERSC user, a graduate student at UC Berkeley, recently in that regard. The number does not capture all of the line losses and stuff; those are SEDC numbers, on-board numbers from Cori, right. So there's a...
A
A degree of approximation, but as a first approximation it's what you can get.
C
Oh, sorry, I will point you to where a PDF of this deck is, just in case anyone wants it. Okay.
A
Sounds great, and we'll post that on the website fairly soon. So we are at the top of the hour, and people probably need to head to the next thing, so I'll just very quickly rush through the next couple of items. Coming up: ERCAP season is coming up, so we'll probably aim to have a topic of the day around ERCAP, perhaps for the August webinar. We're always interested in topic suggestions and requests, especially if participants would like to show off their work.
A
Let us know. Last month's numbers: we didn't have a regular scheduled maintenance in June. We had a couple of brief, I think not even complete, outages, but system degraded events.