From YouTube: NUG webinar recording, Dec 2020
Description
Nick from NERSC's storage team tells us about NERSC's archival system HPSS, including the challenges of smoothly moving over 150PB of data from Oakland to our new data center at Berkeley
A
So, a quick plan for the day. We've been doing this for a few months now, so hopefully people are starting to get familiar with the new format we're using: we're aiming to be quite interactive here. So please speak up, participate, type things in. The #webinars channel of the NERSC Users Slack is probably the easier and longer-lasting place to chat, but the Zoom chat works too, if you wish.

Our agenda: we'll go through Win of the Month and Today I Learned, which are opportunities to talk about things that have happened in the last month, to celebrate our successes and learn from our challenges. We have a few announcements, and we'll also open the floor for announcements that any of our users here would like to make; it's a good opportunity for calls for participation as well. One particular thing we're going to talk about a little bit is preparation for the allocation year (AY) transition, which is coming up in January.

Then our topic of the day is HPSS, NERSC's archival system. Nick from the NERSC storage team is here, and he's going to tell us a little bit about HPSS: some of its tips and topics and history. We'll also talk a little about using it, and then we'll finish up with last month's numbers and what's coming up.
A
So the first thing is Win of the Month. The intent of this segment is to show off an achievement, or to shout out somebody else's achievement that you know about. For instance, if you've had a paper accepted, or solved a bug that was proving a challenge for a little while; scientific achievements are really good.

These are also opportunities to make yourself and your work known for potential nomination for science highlights, or for the High Impact Scientific Achievement awards, or the Innovative Use of High Performance Computing awards; those are awards that NERSC gives to users each year.

Does anybody have anything that they'd like to kick off with? Something interesting that's happened in the last month or so?

Lots of silence. Nobody's done anything, or it's just been a challenging month.
A
I have some quite encouraging news. It's also a bit of a thank-you to the users who have answered so far. Our annual NERSC survey is currently open, and, you know, there are a lot of surveys nowadays, so it's quite a challenge getting a good response rate, but it's quite important to us to get our users' feedback on what's going on.

I was looking at the numbers on the responses we've had coming in just the other day, and so far we've had a really good participation rate. That's really encouraging, so thank you to the users who have participated. If you haven't, there's still time: please send in your feedback. There is a link in your emails.
A
We can go on to the flip side of the same coin, which is Today I Learned. In Win of the Month we're talking about achievements, and in Today I Learned, well, there's also a degree of achievement here.

These are just achievements that might have come in a slightly more painful way. The idea is to recognize that things don't necessarily come easily, and there's plenty that we can learn from challenges. It's helpful to each other and to other users to know about the things that we tripped up on, even if we didn't solve them; challenges and ideas that others might be able to help us through.

This might also lead to improvements that we can make to our documentation, or ideas for further training.

It's also a good time to talk about something new and interesting that you learned, or an interesting seminar that you saw, for instance. Anybody have any tips that they'd like to tell us about, or things that they're stuck on that they'd like to bounce off the room?
A
So I, and I suspect others at NERSC, have been learning in the last couple of weeks about the degree of complexity of all the different systems that make up NERSC. We've been planning for this power maintenance that's currently going on, and it's quite a complex sequence of operations across all the different systems and services that use some element of infrastructure in the NERSC building, since for this maintenance we need to actually turn off all power to the building. It's a step beyond even what we've had to do in the past for Public Safety Power Shutoffs: in a PSPS we can usually keep something going on a backup generator, but because this time it's the actual power infrastructure that's being upgraded, everything needs to be carefully shut down and brought back up.

Okay, no other new tips or tricks. We will have a few of them coming up, actually, in our topic of the day. Oops. All right, we'll move on then to announcements and calls for participation.
A
As always, please check the weekly email; there is news in there. As I mentioned before, the NERSC user survey is currently open; thanks to those who have responded, and if not, please do. And, as I think everybody is quite aware, we currently have a power maintenance, which means that all systems and services are unavailable this week, and that even includes the help system: one of the services that's unavailable is our authentication service, which prevents logging in to help.nersc.gov.

Just a tip on that: you might have noticed that the help.nersc.gov site changed before the maintenance. We have a new service portal, which we think is an easier and clearer interface for asking questions, and for helping to identify and categorize the issues.
A
Okay. So I think probably the biggest announcement coming up is that the allocation year transition will happen before the next NUG meeting; in fact, it will be the day before. Tuesday, January 19 is when the current allocation year will finish; we'll have our monthly scheduled maintenance on the Wednesday, and when we return from maintenance we'll be in the new allocation year. Helen, I know you've done a lot of the preparation and planning for this. Would you like to talk a little about the allocation year plans?
B
Oh, sure. So yeah, as was mentioned, that's the last day of the AY. One of the main changes for this year concerns the premium QOS.
B
Premium is normally not used for your regular computing at NERSC, and it's not for when you're trying to use up all your allocation; it's for an emergency, such as a publication deadline or an experiment, so it should be used infrequently. We have therefore decided that we need a sort of threshold on the charging factor: once a project is over 20 percent of its allocation high-water mark, premium will be charged at double its normal premium rate. Normally, premium is charged at two times the regular QOS charge.

So when a project is over twenty percent of its allocation, premium will be charged at four times the regular QOS charge. This is temporary, and we may decide to change it depending on how the usage goes; we will be observing this. Another new policy, and a new setup, is that we are not going to have premium enabled by default for all users: PIs have to decide which users in their project can use the premium QOS.
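As a sketch of the charging policy as just described (the 2x base factor and the 20 percent threshold come from the talk; the function and its names are illustrative, not a NERSC API):

```python
def premium_charge_factor(usage_fraction: float, threshold: float = 0.20) -> float:
    """Charge multiplier for premium QOS relative to the regular QOS charge.

    Premium is normally 2x the regular charge; once a project has used more
    than `threshold` (20%) of its allocation, the premium factor doubles to 4x.
    """
    base = 2.0
    return base * 2.0 if usage_fraction > threshold else base

print(premium_charge_factor(0.10))  # below the 20% high-water mark
print(premium_charge_factor(0.50))  # above the 20% high-water mark
```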
B
There are instructions linked on this page, and during the year PIs can toggle a user's access to premium on and off.

Then, before the allocation year starts, we are asking all PIs to do two things. One is to decide which users will continue in the project; we're asking this of PIs who currently have an AY 2020 allocation and whose project continues into AY 2021.
B
For those PIs, we are asking you to decide which users you will ask to continue in your next year's allocation, and, at the same time, to select which users you are allowing to use the premium QOS.

The instructions are already published, but the ability to do this in Iris will be available soon, by early next week, when the power maintenance is over. Rebecca is going to send a separate email to all the PIs and proxies to emphasize that PIs need to do this; otherwise, jobs in the queue belonging to users who are not continued, or not allowed to use premium, may be deleted.
B
Another recommendation concerns premium jobs that are in the queue right now. Those jobs may start in AY 21 instead of now, and if you want to avoid a surprise charge, and you no longer consider a job to be premium priority, you can change it to regular now. Alternatively, if you do still want those jobs to run in the premium QOS, make sure the users who have them in the queue have access to premium. Next slide, please.
B
So, on the day AY 21 starts, which is Wednesday, January 20th, we'll take Iris down first, then do the new allocation year data switchover; Cori's downtime starts at the same time. When the Iris transition is over, Cori's Slurm scheduler will have access to the new Iris data, and then we will do some processing.
B
The jobs listed on this page are the jobs in the queue that we will delete; after Cori comes back, those jobs will be gone. Those are the jobs where either the project is not continued for this year's allocation, in which case all its jobs will be deleted, or the allocation is continued but the PI didn't select the user to continue into the next allocation year.

Those jobs will have an invalid Slurm association and would not be allowed to run, so they will be deleted. So, PIs, please make sure to select the users that you want to continue, and enable premium QOS for them as well, since it's not on by default anymore: if a user is not enabled for premium, their premium jobs will be deleted.
B
I also want to call out that normally during the startup of a new allocation year we change our software defaults: the Intel compilers, the Cray PE software. This year we have decided to keep those the same as they are. We do have a potential OS upgrade in the middle of the year, and we may change the defaults at that time.
A
Yes. While people are thinking of any questions: there's a link to the slides in the Zoom chat and also in the #webinars channel of the NERSC Slack, which means you can click directly on the links to get to the web pages; it saves a little bit of typing.
A
Cool, yeah, that's very helpful. Thanks.
E
No LBL background, I'm sorry; I'm on a MacBook Air that doesn't support virtual backgrounds, so you get to see my messy room. I hope everybody's okay with that.
E
Can everybody see that okay, the title slide? That looks good? Okay, great. So thanks again for the opportunity to speak about HPSS. Steve contacted Owen and myself, I think last Friday night, about a quick presentation at NUG on HPSS, with a suggested topic of the move out of our former Oakland data center.

Owen is frantically powering stuff up at the data center as we speak, so that left me kind of on the hook to give the talk, but that's okay.
E
So this is about our move out of the Oakland Scientific Facility, which happened mostly between 2018 and 2019, and into early 2020. I'll introduce the team really quickly. Again, I'm Nick Balthaser; I'm an HPSS admin in the Storage Systems Group. I deal mostly with day-to-day operations; before shelter-in-place happened I was mostly on the hardware and implementation side, and now I kind of do a little bit of everything. Wayne Hurlbert is our team lead; he's responsible for what the system is going to look like in five years, as opposed to maybe this week or tomorrow.
E
Melinda Jacobson is an HPSS developer. She spends half of her time writing HPSS code and contributing it back to the HPSS collaboration, which is a collaboration between IBM and five DOE labs, including NERSC, and the other half of her time helping NERSC with software process improvements and our own internal HPSS code and modifications. Owen James, who I mentioned, is a member of OTG, the Operations Technology Group; he is our man in the field.

He deals with a lot of the on-site day-to-day operations to keep HPSS running, and he's also responsible for knowing a lot of fun facts about HPSS, like how many MP3s you could hold in three tape libraries and stuff like that, which, unfortunately, I don't know, but he's always fun to talk to about interesting HPSS and storage trivia. Kristy Kallback-Rose is our group lead, and she is also on the HPSS technical committee, driving future HPSS features. And finally, Kirill Lozinskiy works on the team as well.
E
Usually when I give a presentation about NERSC, I have a few introductory slides about what sort of work we do, how many users we have, and their distribution across the country and the world. But since everybody here is a NERSC user, I'm going to go ahead and skip that. I did want to mention a bit about NERSC's compute infrastructure, though, in particular the storage tiers.
E
As you may know, storage is arranged in tiers that form a hierarchy. The top of the hierarchy is typically very fast storage with low capacity, and the bottom tier is slower but higher capacity. Of course, everybody wants the top tier to have more capacity and be faster, and everybody wants the lower tiers to have more speed and be easier to use. These tiers are arranged mostly due to cost: the higher tiers of storage just cost more.
E
As you all know, the compute platform is Cori, a Cray XC40 that's been in production for a number of years. The top tier of our storage is the burst buffer, which is 1.8 petabytes of flash that can stream data at a terabyte and a half per second.
E
The second tier of storage is Cori's scratch file system, which is Lustre on spinning disk; it is 30 petabytes and can move data at 700 gigabytes a second. The third tier of storage is also a file system on spinning disk: a GPFS file system called the community file system, running on IBM ESS hardware, which can move data at 100 gigabytes a second. And finally, last but not least, the bottom tier of storage, with the least convenience but the most capacity, is the archive system that I work on: 230 petabytes of tape in three IBM TS4500 libraries. We can do an aggregate transfer rate of between 30 and 100 gigabytes a second, but single transfers between HPSS and the file system are typically between one and two gigabytes a second.
E
We've been running the HPSS software product, which, as I mentioned, is a collaborative product between IBM and five DOE labs, since 1998. We have two HPSS systems in production.

The user-facing system is called Archive. It has 200 petabytes of tape, and it is where NERSC users store their results, raw data, and so on. We have a smaller system that is NERSC-internal, mostly for center backups, called Regent, which has about 30 petabytes of tape.
E
All of our data transfers so far are via client interfaces: HTAR, PFTP, and Globus. We don't have any direct file system interface yet, but that is a topic we're actively exploring. For instance, there's a project right now looking at GHI, the GPFS/HPSS integration, and we're looking at other interfaces as well, such as the HPSS FUSE interface, which is kind of like NFS.
E
Historically, we've grown the archive at 1.4 times per year. This is a growth chart: when I joined the group in 2007 we had, I think, a petabyte, and now we have 230 petabytes, so it just keeps growing. Archive is in orange; this is a stacked chart, so Regent doesn't actually have 230 petabytes, only the 30 petabytes represented in green. So, the topic of the day is moving out of OSF, which we did between 2018 and, actually, early 2020.
E
We had a multiple-year evaluation of where to put the tape. I should also mention, which isn't on the slide, that Oracle, our principal tape and drive vendor at the time, decided to stop supporting Oracle enterprise media, which is the high-performance, high-speed tape media that we used for the archive. So we were also stuck with a decision as to what technology to move the archive to, because our principal vendor of drives and media had decided to end-of-life their enterprise product.
E
This is just to show the operating range for a tape system. Basically, what it shows is that 90 degrees Fahrenheit is pretty much the limit; the blue area is where tape can operate safely. So 90 degrees Fahrenheit and about 50 percent relative humidity is where we want tape to be, and we don't want any rapid fluctuations; it's supposed to be a very consistent environment. The Berkeley data center could just get too hot, or too humid, or fluctuate too rapidly to be safe for tape. Another issue that we've been encountering since 2017 is airborne particulate: tape does not like dirty air.
E
The media is basically aramid or mylar, and there's a tape head, just like an old cassette deck; if there's dirt in the air, it can interfere with reading and writing the media. And every year during fire season, the air quality in the Bay Area has gotten really bad.

So that's another issue for running tape that we've been having, and it's an issue in the CRT data center because of the open-air cooling environment and the relative lack of filtration that the data center has.
E
Although we're doing a lot of remote management now that we didn't expect to be doing, putting all the data in the cloud comes up from time to time; but users want the data faster than a cloud provider SLA, and in our own analysis it costs more than running the archive on premises.
E
We looked at building out a room in CRT, but that was both costly and would constrain data center space for other systems, like the upcoming Perlmutter system, so we didn't want to do that.

The integrated-cooling libraries addressed two issues: one, the ambient temperature fluctuation in CRT; and two, IBM is an enterprise drive and media vendor, so we could use their drives and media and switch off of the old Oracle media. So at least two of the three problems were solved with that solution.
E
So here's a look at one of the integrated-cooling libraries that we have at CRT.

As I said, these support enterprise drives and media, so that solved our enterprise media issue. We bought three of them; they're 16 frames each, with 13,000 slots, and at least two of the three have 64 tape drives each, running IBM JD media at 15 terabytes per cartridge.
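As a rough back-of-envelope check on those library numbers (a sketch using only the figures from the talk, ignoring slots lost to drives or cleaning cartridges), the slot capacity gives plenty of headroom over the 230 PB currently archived:

```python
# Slot capacity per library and for the whole complex,
# using the figures from the talk: 13,000 slots, 15 TB JD cartridges, 3 libraries.
slots_per_library = 13_000
tb_per_cartridge = 15
libraries = 3

per_library_pb = slots_per_library * tb_per_cartridge / 1000  # TB -> PB
total_pb = per_library_pb * libraries

print(f"per library: {per_library_pb:.0f} PB, complex: {total_pb:.0f} PB")
```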
E
Airborne particulate is still an issue for us; we're looking for a solution for that, but nothing has come up yet.
E
So, a quick look at the process of the move. This is obviously a condensed view of it; there were an awful lot of steps and it took a long time. One of the issues is that the IBM media is incompatible with the Oracle media, so we couldn't just load it all up in a truck and move it from Oakland to Berkeley. What we did instead was move in stages.
E
We had 3,000 cartridges that were packed up by OTG, mostly Owen and some interns, and moved every day for two weeks by courier, in small batches of a few hundred cartridges; over the course of two weeks we moved all three thousand cartridges. This was mostly done to reduce user impact, so that the chance that a read request would come in for a cartridge in flight between Oakland and Berkeley was minimized. We did get a few requests, and that kind of jammed things up.
E
The hardware was physically unracked, moved by truck, and then re-racked and recabled at CRT over the course of nine hours: we brought HPSS down and then brought it back up at CRT.
E
Then there was an ongoing data copy out of the Oracle libraries for almost a year, where we copied out 120 petabytes of data over a 400-gigabit dedicated link between Oakland and Berkeley. Some days we exceeded half a petabyte a day, so that worked very well, and with the exception of a couple of damaged cartridges, it all made it to Berkeley without a hitch and with almost no downtime.
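For a sense of scale (a sketch assuming decimal petabytes and a full wall-clock day; these inputs are not stated in the talk), half a petabyte a day corresponds to a sustained average of roughly 46 Gbit/s, comfortably within the 400-gigabit link:

```python
# Back-of-envelope: what "half a petabyte a day" means for a 400 Gb/s link.
PB = 1e15                 # bytes, decimal petabyte
LINK_BPS = 400e9          # 400 gigabit dedicated link, bits per second

bytes_per_day = 0.5 * PB
avg_bps = bytes_per_day * 8 / 86_400   # 86,400 seconds in a day

avg_gbits = avg_bps / 1e9
utilization = avg_bps / LINK_BPS

print(f"average rate: {avg_gbits:.1f} Gbit/s")
print(f"link utilization: {utilization:.0%}")
```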
E
This is a visualization of the data copy operation, done by our team lead, Wayne. I know the numbers are probably way too small to read on Zoom, but basically you can see that on some of our peak days we sustained about half a petabyte a day for the large files, which stream better than small files, and that went on for quite a while. We were also dealing with regular user ingest, which is between 150 and 300 terabytes a day.

So we had some days where the TS4500 complex in Berkeley was taking in three-quarters of a petabyte a day, and we were able to sustain that.
E
The smaller system was a lot easier to move, and we just did a forklift move: we packed up all the cartridges into hundreds of boxes, moved them all by truck, shut the system down, and moved it in one single 14-hour downtime. Again, Owen and his team were critical in packing up the media: ejecting tapes, packing them up early in the morning and late at night, and arranging all the media movement.
E
Wayne led that effort, and it took a good couple of months to get all of the old equipment out of there. I believe the Oakland Scientific Facility is now like a hole in the ground; I think it's been demolished, but Glenn can probably speak to that, since I think he has a view of it from his window.
D
Oh yeah, sorry, this is Koichi from Vienna. Thank you for a very interesting presentation; I really love to know about these behind-the-scenes efforts, and it's really a lot of work, I can see that. Just a quick question: you mentioned that the current system is equipped with this kind of air conditioning that keeps the tape in good condition, but I'm just wondering what's going on right now, because there's no power at the facility this week.
E
Thank you for the question; that's a good question. The tape libraries are down, so they're unpowered, and you're right, there's no air conditioning going on. But the main issue that we have is when I/O is being done to tape. As long as there are no reads or writes being done to the tape system, it should be fairly safe if the temperature fluctuates, as long as the overall temperature stays within the blue area of the psychrometric chart I showed you. As long as it remains there and we're not doing I/O to the tape system, it should be okay; and we will probably have the libraries powered up for a few hours before we start the system, to let the temperature inside the libraries become optimal.

Thanks for the question; that was a good one. Thank you.
A
So there were a couple of things in your slides, and that you talked about, that jumped out at me. One was that you talked about the size of the data on HPSS increasing by 1.4 times per year, which sort of just sounds like a number. But then I noticed that the insideHPC article published in February 2019 said 120 petabytes of data, and a little less than two years later we're up at 230 petabytes, nearly twice as much.
E
It's a lot of data. It's an average over many years, so we do have fluctuations; actually, in spite of what the numbers in insideHPC say, I believe Wayne has observed that our ingest rate is down a little bit over the last couple of years, so maybe the last few years have been 1.3 or 1.2 times. But yeah, we handle a lot of data.
E
Thanks for the question; that's a good one as well. Like disk drive technology, tapes are always increasing in capacity, and we have a constant effort going on to upgrade the tape drives and media to meet the capacity demand while keeping the floor space essentially constant.

Right now we're running 15-terabyte cartridges, and I believe sitting on pallets at CRT right now is the next generation of tape drives, which will run 20-terabyte cartridges. Wayne and I will go ahead and implement the new drive technology and then copy all of the 15-terabyte media onto the 20-terabyte media. That's just a constant thing that happens with HPSS: every couple of years we change the underlying media technology.

So that's the magic of how that works. Thank you for the question.
E
1970... was it '74? I believe 1974 is the oldest data in the archive, yeah. Sadly, I'm older than that. But we have, to date, never had to purge anything.
A
Cool, thanks. And the one other thing that jumped out at me while you were talking was when you were describing the sneakernet and moving cartridges by truck. A quick back-of-the-envelope calculation suggests that the truck had a bandwidth of about 600 gigabytes per second.
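One way that figure can be reproduced (a sketch; the talk doesn't give the exact inputs, so using the roughly 30 PB Regent capacity and its 14-hour move downtime as assumptions):

```python
# Hypothetical reconstruction of the "truck bandwidth" estimate:
# if ~30 PB of cartridges crossed town within the 14-hour downtime,
# the effective transfer rate of the truck is enormous.
PB = 1e15                   # bytes, decimal petabyte
data_moved = 30 * PB        # assumed: roughly the Regent system's capacity
window_s = 14 * 3600        # assumed: the 14-hour forklift-move downtime

gb_per_s = data_moved / window_s / 1e9
print(f"effective bandwidth: {gb_per_s:.0f} GB/s")
```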
E
That's another one of those fun facts that Owen figured out. He actually did figure out how much bandwidth a FedEx truck has, and I can't remember the number, but I think it was much greater than the actual bandwidth of HPSS over the network.

That's right; that's actually very true. It moved data faster than we can over the network. That may be embarrassing, but you can just pile a huge number of tapes into a truck.
A
And even the network, that 400-gigabit link, was basically a special-purpose network that was uncommonly fast for its time, right?
E
Yeah, not many sites have a 400-gigabit link; that is quite fast. We have 100-gigabit networking to the HPSS movers, and they're dual-connected, so a single I/O mover can push maybe 200 gigabits a second; that's, you know, divided by eight for gigabytes. And we spin up multiple tape drives; each tape drive can do maybe 300 or 350 megabytes a second.
A
It's a huge data move, yeah. Thanks again, Nick. Does anybody else have any questions or comments they'd like to make?
D
Maybe I can go ahead. So, from your presentation, you mentioned that the old system from Oracle and the new system from IBM were not compatible, and I just didn't get how you guys solved the problem. Did you have to copy from Oracle to something and then copy again, or did you do anything more sophisticated?
E
That's essentially the gist of it, thanks; that's another really good question. That's right: the Oracle media and tape drive technology is not compatible with IBM; they won't read each other's media or cartridges. So what we did was what you might do in any file system, if you were, say, copying from your Mac desktop to a Windows machine or something: we just copied the files over the network.

We read them out of the Oracle infrastructure, over the network, to the IBM infrastructure, and rewrote them onto the IBM tapes. The metadata is all handled by HPSS, so the files, the bits, all of that ends up intact in the metadata database. It was basically just a big copy, or an rsync, if you want to think of it that way.
D
Thank you. I have two more questions, actually. The second question is a kind of related follow-up, about the security of this HPSS system. I mean security both in terms of any potential data loss, or, I don't know, some confusion between metadata and actual data, and then also whether any outside hackers attack the HPSS system. For my field, climate science, nothing is really sensitive, but maybe some fields are different.

You know, space physics or some other national-security fields may have sensitive data, and some hackers may just want to give trouble to other people, so they might try to get access to storage. So, both accidental erasure or loss of data, and the more intentional-attack kind of security: do you have to do something against those issues to maintain HPSS?
E
Well, those are also very good questions. One is sort of about data integrity, how we protect the metadata, and the other is more about cyber security, how we protect HPSS against cyber security attacks, right? Okay. As far as the metadata goes, we back it up every day, and we use disk arrays that are RAID six plus two, so we can lose disks, and we have metadata backups.

But to your point, if an asteroid were to come and wipe CRT off the face of the map, that would be it for HPSS. We don't have off-site copies of the data. We do have off-site copies of the metadata, but of course, if your data is gone, the metadata doesn't do much good.
E
So we do what we can to make sure the metadata is protected, backed up, and not corrupt, and we do have off-site copies of that, but not off-site copies of the data. As far as the cyber security question: we're prone to stolen credentials, just like any other system. It's more difficult to steal your credential these days because we're using one-time passwords, so we're a little bit more protected at NERSC than we used to be.

But if somebody got in, they could mess things up, and we don't have backup copies of the data. So it is a risk: if somebody were to steal your credential and hack into Cori or something, they could potentially delete your data.
E
They wouldn't be able to read other users' files if they stole, say, my identity or your identity, but I think the identity theft risk is a lot lower these days with one-time passwords. And of course, if we do find out about a breach, we can revoke somebody's HPSS token. But yeah, if somebody were to go in there and erase all your data, we couldn't get it back. So we do advise, if there's something that is super critical, keeping another copy, for multiple reasons.
D
Yeah, thanks for the suggestion; I'll pass that to my project managers, I think.
E
To be fair, we have a pretty good reputation for our HPSS system being reliable, and the data integrity has a good record. That's not to say nothing could happen, but we are super careful.
A
Two minutes left; we've got one more slide after this with a few tips for HPSS.
E
Sure, you can contact Steve and we can exchange email addresses; I'm happy to answer your questions offline.
E
Oh yeah, okay, thank you. I can hang out for a few minutes after the meeting; that's fine. So anyway, thank you again, Steve, for the opportunity to give a little good press to HPSS.
A
Cool, and thanks Nick, that was a really interesting talk. Thank you. Let me share my screen. So, while we're on the HPSS topic, we have a couple of user-facing tips here, a lot of which are on our docs page.
A
At
this
address,
I
think,
there's
a
couple
of
people
from
das,
lisa
and
albert
at
least
online.
Do
you
want
to
say
anything
to
users
about
tips
for
using
hpss.
G
I think the main thing that I would emphasize to users is to remember what Nick pointed out: at its heart, it's a tape system. There's a disk cache at the top that makes it respond like a file system, but underneath it's tapes. So that affects things like putting a hundred thousand small files into it, which would normally be kind of okay on a file system.
G
Although a hundred thousand files isn't always great even on a file system. Anyway, it would be kind of okay on a file system, but it's not great on a tape system, because the files end up spread all over the place. And I think that is, right now, the most common issue that users run into when storing files into HPSS.
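The advice above can be made concrete: bundle the many small files into one large archive member before storing it, so they land on tape as a single contiguous object rather than scattered across cartridges. Below is a minimal, hedged sketch using plain `tar` so it runs anywhere; on NERSC systems the `htar` utility plays this role and writes the tar file directly into HPSS. The directory and file names here are made up for illustration.

```shell
# Simulate a run that produced many small output files.
mkdir -p results
for i in $(seq 1 100); do
    echo "sample data $i" > "results/file_$i.dat"
done

# Bundle them into ONE archive member before sending to the tape
# archive, instead of storing 100 tiny files individually.
# (On NERSC, `htar -cvf results.tar results/` would write this
# directly into HPSS.)
tar -cf results.tar results/

# Sanity-check the bundle before deleting the originals.
tar -tf results.tar | wc -l
```

A few large tar files keep tape drives streaming and keep the archive's metadata load low, which is exactly the failure mode Lisa describes for 100,000 loose files.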
A
Cool, thanks Lisa. And yeah, if you are having trouble using it, or working out how to use it, or best practices, the docs are hopefully helpful, but also drop us a line via help.nersc.gov.
A
So, in the last couple of minutes: coming up, we're always interested in topic requests and suggestions.
A
Maybe we'll keep that one on the Slack channel. And just to finish up, a quick run over last month's numbers. Cori's availability was quite high, 98.7%; scheduled availability was a little lower when you include the scheduled maintenance. Normally I have a little graphic here showing a timeline of when, and for how long, the various outages and so on were, but my script for producing it is on Cori, which is currently unavailable.
A
So
it's
just
a
little
bit
of
text
here
we
had
very
high
utilization
in
that's.
The
type
of
that
should
be
november
of
over
95,
and
almost
half
of
them
were
large
jobs
using
more
than
a
thousand
nodes.
A
So
that
was
good
to
see
that
that
corey
is
being
used
for
for
large-scale
challenging
science.
That
really
needs
these
sort
of
systems.
A
And
that
brings
us
to
the
top
of
the
hour.
Thank
you.
All
for
participating,
it
sounds
like
nick
might
have
a
few
minutes
before
needing
to
run
off
to
the
next
thing.
So
so
we
can
continue
sort
of
chatting
briefly
about
hpss.
A
I
forgot
to
mention
we're
recording
this,
so
you
will
be
able
to.
You
know,
see
the
presentation
again,
we'll
post
a
link
to
that
and
post
the
recording
on
the
on
the
web
page
afterwards.
D
Actually, if you go back to the slide for HPSS: I just wanted to be reminded of more detail about how to use Globus to HPSS. I got a question here about, you know, a Globus warning, or the file system not appending. I think I asked this question before, but I was also asked by one of the members of my division's project.
D
I
recommended
them,
you
know,
or
maybe
we
should
do
the
two-step
you
know
first,
so
they
are
trying
to
move
a
lot
of
data
from
piano
computing
to
the
nas
hp
ss,
but
I
advise
them
to
do
two-step,
first
from
scratch
and
then
to
hpss,
but
they
simply
ask
why-
and
I
couldn't
answer
I
I
remember
we
briefly
discussed,
but
I
didn't
totally
understand
the
underlying
underlying
reason
so
just
want
to
know
some
details
about
why
yeah
it's
it's!
A
No, no, go ahead. I was just going to embarrass myself.
E
Well, not necessarily. But was the transfer out of NERSC HPSS, or was it storing data to NERSC HPSS? Storing?
D
Storing data to HPSS, so using Globus to move data from outside to NERSC.
E
Okay, HPSS. And was it from one HPSS system to another, and that was the issue? And on the other side, they wanted you to move it to a file system first?
A
I believe, and Nick, you can probably clarify this, that it's to do with Globus splitting the data stream into multiple fragments and doing them all at once.
E
That's part of the reason for wide-area network transfers, yes. But to HPSS, Globus is only single-stream, so the only way to move data effectively from an external file system to HPSS is to fire up multiple jobs, you know, to move data concurrently, because on the HPSS side it's only a serial protocol. An enhancement is coming, but it's going to be a while.
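The "fire up multiple jobs" pattern Nick describes can be sketched roughly as follows. Since each session into HPSS is a single serial stream, aggregate bandwidth comes from running several transfers side by side. Here `do_transfer` is a hypothetical stand-in for the real transfer command (an `hsi`/`htar` put, a Globus submission, etc.), and the `chunk_*.tar` names are purely illustrative.

```shell
# Placeholder for the actual per-file transfer command.
do_transfer() {
    echo "transferring $1" >> transfer.log   # pretend to move one file
}

# A list of (already tarred-up) files to send to the archive.
printf 'chunk_%d.tar\n' 1 2 3 4 5 6 7 8 > filelist.txt

NJOBS=4   # number of concurrent transfer streams
i=0
while read -r f; do
    do_transfer "$f" &                       # launch transfer in background
    i=$((i + 1))
    [ $((i % NJOBS)) -eq 0 ] && wait         # keep at most NJOBS in flight
done < filelist.txt
wait                                         # let the final batch finish
```

This is only a sketch of the batching idea; in practice you would also cap concurrency to whatever the site recommends, since too many simultaneous sessions can contend for tape drives.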
E
So we prefer people to tar up their files on the remote side and then send us large tar files, as opposed to doing the sort of rsync-like thing that Globus allows you to do, where you just drag and drop a whole directory full of files. That's another issue we've seen with Globus: sort of deluging the tape archive with lots of small files, which is both kind of inefficient on the tape side, and also the metadata in the archive is kind of a bottleneck. So if we have a lot of small-file movement on tape, and a lot of small-file I/O in the metadata, it can really slow things down.
D
Yes,
I
think
so
you
said
particularly
grommas
is
taking
advantage
of
dividing
data
into
chunks
and
then
send
in
as
multiple
I
mean
paths,
but
hpss
is
is
basically
single
stream.
E
All right, I guess I will sign off. So thanks again for the opportunity to plug HPSS. Thanks again for the.