From YouTube: 9. File Systems
Description
Learn about the NERSC filesystems and the best way to use the different filesystems.
Slides for all sessions can be downloaded from here: https://www.nersc.gov/users/training/events/new-user-training-june-21-2019/
A
Hello, everyone, my name is Johnny. You have probably seen my name in your email during the past few days, and I guarantee you'll see a few more emails. Those will be a short survey, and I would appreciate it if you could reply, because that gives us some feedback. I'm going to talk about the NERSC file systems. There are different file systems at NERSC, and they are designed with different purposes and different motivations.
A
So, for example, the burst buffer is currently the fastest one at NERSC. The backend hardware is SSD, and the software is Cray DataWarp. So if you want to accelerate your application's I/O performance, you should try to use the burst buffer as your first choice. Then we have scratch, which is a Lustre file system, and its maximum aggregate performance is around 700 gigabytes per second. I don't think you can get a number this high anywhere else.
A
So the hardware and the software at NERSC, as our colleague Ashley has mentioned, are highly optimized for HPC users and for your applications. Then we have project. The big difference is that project tends to be permanent and tends to hold your data for a longer time period compared to the burst buffer and scratch. Scratch keeps data relatively longer than the burst buffer, with a 12-week purging period; the burst buffer is just temporary.
A
Those are the main file systems, and we have another two: global common and global home. Those are not designed for your application's I/O, so you shouldn't run your application while it talks directly to these file systems. Those file systems are designed, for example, to keep your source code, compile your code, and so on. They are SSD based and have a limited quota, for example, global home.
A
That's what you see when you first log on to Cori: your home directory, where you get a 40-gigabyte quota. So this is a very simplified diagram of the different file systems, and in the following slides I will walk through the different file systems in more detail. The first one I want to talk about is the scratch file system.
A
So at NERSC, don't get confused: scratch is just the name that we use to describe a configuration based on Lustre and HDDs. So we refer to scratch as this Lustre file system. Lustre is one of the most successful HPC file systems, and it has 16 years of research and development behind it. Many optimizations and research ideas have been put into production and implemented as real product features in the file system.
A
And if you look at the Top500, the fastest supercomputers in the world, you will find that most of the supercomputers are using Lustre as the file system. The current version at NERSC is 2.7. The latest version is 2.12, which has more features, and I think we will get upgraded on the next machine.
A
So in order to understand scratch, or Lustre (we use the two terms interchangeably here), there are a few important concepts. First is the metadata server, which holds the metadata for your files, like file names and directory names. Then you have the OSS, which is the object storage server, and this manages a bunch of OSTs. An OST is an object storage target, which you can think of as a bunch of disks.
A
You can think of those as HDDs, hard disk drives. When we talk about the aggregate I/O performance, we are actually talking about the maximum performance observed in the initial phase of the system. So, for example, if you run MPI_File_write_all, then in order to get maximum performance it is better to leverage Lustre striping on the file system. And this is another diagram to give you an idea of how the scratch file system is hooked up to the Cori computer.
A
So on the top are the Cori compute nodes. We have 130 LNET routers, and the routers connect the compute nodes with the scratch file system. In terms of the number of OSTs, we have 248. That means you can stripe your data across at most 248 OSTs, that is, object storage targets.
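As a rough illustration of what striping across OSTs means, the sketch below assumes the usual round-robin layout, where a file is cut into fixed-size stripes that are dealt out to the file's OSTs in turn. The function names, the 1 MiB stripe size, and the stripe count of 4 are assumptions chosen for illustration; only the 248-OST limit comes from the talk.

```python
MAX_OSTS = 248  # number of scratch OSTs quoted in the talk


def clamp_stripe_count(requested, max_osts=MAX_OSTS):
    """Limit a requested stripe count to the range 1..max_osts."""
    return max(1, min(requested, max_osts))


def ost_slot_for_offset(offset, stripe_size=1 << 20, stripe_count=4):
    """Return which of the file's OSTs (0..stripe_count-1) holds byte `offset`.

    Assumes round-robin striping: stripe k of the file is stored on
    slot k % stripe_count. The defaults are illustrative, not NERSC settings.
    """
    if offset < 0:
        raise ValueError("offset must be non-negative")
    if stripe_size <= 0 or stripe_count <= 0:
        raise ValueError("stripe_size and stripe_count must be positive")
    return (offset // stripe_size) % stripe_count


if __name__ == "__main__":
    mib = 1 << 20
    for off in (0, mib, 3 * mib, 4 * mib):
        print(f"offset {off} -> OST slot {ost_slot_for_offset(off)}")
```

With more slots, consecutive stripes of a large file land on more servers, which is why a higher stripe count can help large files.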
And then there is the metadata server. Currently we have five metadata servers: one is called the primary metadata server, and we also have four additional metadata servers.
A
So you could talk to us and send us email if you observe slow performance in terms of metadata. Another thing is: if you have a very large file, like 100 gigabytes or one terabyte, you may consider striping to get the optimal I/O performance. We will talk about striping later; how to do that is by using a very simple command.
A
So here's a very quick demo of this striping command. This is using lfs, and it only works on the scratch file system; it doesn't work on any non-Lustre file system. For example, this is what you see when you log on to Cori. If you then type this command, lfs getstripe, to find out how many OSTs are being used by your file, you will see this error.
A
So basically, you cannot run this command on a non-Lustre file system, and we know that the home directory underneath is based on GPFS, not Lustre. That's the reason you will see the error. Then, if you cd to scratch and make sure you see /global/cscratch1/sd/ and your username, you are guaranteed to be on the Lustre file system. And then you run this command, lfs getstripe, given any existing file name.
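A minimal sketch of the check described above: it only looks at whether a path sits under /global/cscratch1/sd before building an `lfs getstripe` command line. The helper names and the example username are hypothetical, and a path-prefix test is only a stand-in for actually asking the system what file system a path lives on.

```python
from pathlib import PurePosixPath

# Scratch root as named in the talk; user directories live below it.
CSCRATCH_ROOT = PurePosixPath("/global/cscratch1/sd")


def is_on_cscratch(path):
    """Return True if the absolute `path` lies below the scratch root."""
    return CSCRATCH_ROOT in PurePosixPath(path).parents


def getstripe_command(path):
    """Build (but do not run) the `lfs getstripe` command for `path`."""
    if not is_on_cscratch(path):
        raise ValueError(f"{path} is not under {CSCRATCH_ROOT}; lfs needs a Lustre path")
    return ["lfs", "getstripe", str(path)]


if __name__ == "__main__":
    # "alice" is a hypothetical username.
    print(" ".join(getstripe_command("/global/cscratch1/sd/alice/out.h5")))
```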
A
So, why change the striping? You can imagine that having more OSTs will potentially improve your I/O performance, because you have concurrent servers to serve your I/O requests. In order to change the striping for your data, you have to create a new directory and manually move your existing file into this newly created directory, copying it so that the data is rewritten with the new layout. You cannot change the stripe configuration, in terms of the number of OSTs and the stripe size, directly on an existing file.
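The restriping steps above can be sketched as a short command plan. This is an illustration under assumptions: `lfs setstripe -c` is used to set the stripe count on the new directory so that files created in it inherit the layout, the paths and the count of 72 are hypothetical, and the commands are only printed here, not executed.

```python
import shlex
from pathlib import PurePosixPath


def restripe_plan(src_file, new_dir, stripe_count):
    """Return the commands to copy `src_file` into a directory with a new stripe count."""
    src = PurePosixPath(src_file)
    dst = PurePosixPath(new_dir)
    return [
        ["mkdir", "-p", str(dst)],                                # 1. create the new directory
        ["lfs", "setstripe", "-c", str(stripe_count), str(dst)],  # 2. set its stripe count
        ["cp", str(src), str(dst / src.name)],                    # 3. copy so data is rewritten
    ]


if __name__ == "__main__":
    plan = restripe_plan(
        "/global/cscratch1/sd/alice/big.h5",   # hypothetical existing file
        "/global/cscratch1/sd/alice/striped",  # hypothetical new directory
        72,
    )
    for cmd in plan:
        print(shlex.join(cmd))
```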
A
There are some striping recommendations, so you can check out this table. It depends on your file size. For example, if the file is less than 1 gigabyte, probably just use the default striping. If it's very large, like a hundred gigabytes or even one terabyte, we recommend you use this one, stripe_large. You could also manually change the stripe settings like I demonstrated before, because that command will just use 72 OSTs for the stripe count, but you can definitely increase the stripe count to 200 or higher.
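The striping recommendations above could be encoded as a small lookup. Only two points come from the talk: files under 1 gigabyte keep the default striping, and files of 100 gigabytes or more get 72 OSTs. The two intermediate tiers (8 and 24 OSTs) are assumptions added to make the sketch complete and should be checked against the table on the slide.

```python
GB = 10**9

# (exclusive upper bound in bytes, stripe count); None means "keep the default".
TIERS = [
    (1 * GB, None),   # from the talk: under 1 GB, use default striping
    (10 * GB, 8),     # assumption: small tier
    (100 * GB, 24),   # assumption: medium tier
]
LARGE_STRIPE_COUNT = 72  # from the talk: stripe count used for very large files


def recommended_stripe_count(size_bytes):
    """Return a suggested stripe count, or None to keep the default striping."""
    if size_bytes < 0:
        raise ValueError("size_bytes must be non-negative")
    for upper_bound, count in TIERS:
        if size_bytes < upper_bound:
            return count
    return LARGE_STRIPE_COUNT


if __name__ == "__main__":
    for size in (500 * 10**6, 5 * GB, 50 * GB, 1000 * GB):
        print(size, "->", recommended_stripe_count(size))
```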
And next, the burst buffer.
A
So why the burst buffer? The burst buffer is designed to accelerate your I/O and also to absorb bursty I/O requests. In these two pictures, the one on the left is without the burst buffer, or DataWarp, as the I/O accelerator. You'll see that this is a very typical situation on an HPC file system: if you observe the I/O activity, you will see these kinds of spikes, this kind of bursty pattern. But with the burst buffer we are able to absorb those bursty patterns and in turn dramatically improve the I/O performance.
A
So basically, the burst buffer is designed for high-IOPS and high-bandwidth applications, and it's very easy to use. Currently you just need to add a few lines to your existing batch script. There are a few important things to notice. When you specify that you want to use the burst buffer, you need to tell your job how much resource you want to allocate. This is the capacity parameter. So, for example:
A
If my job produces about 900 gigabytes, I will probably request 1 terabyte, slightly more than what the job will produce. So if I request a capacity of 1,000 gigabytes, I will get a bunch of burst buffer nodes during the run time. How many nodes will I get? You can simply calculate that by dividing this number by 20, because 20 gigabytes is the granularity on the burst buffer.
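The arithmetic above can be written out as a small sketch. The 20-gigabyte granularity and the 900-to-1,000 example come from the talk; the 10 percent headroom and the function names are assumptions, and the division follows the talk's rule of thumb rather than any exact scheduler behaviour.

```python
GRANULARITY_GB = 20  # burst buffer allocation granularity quoted in the talk


def ceil_div(a, b):
    """Integer division rounded up, avoiding floating point."""
    return -(-a // b)


def allocation_units(capacity_gb, granularity_gb=GRANULARITY_GB):
    """Number of granularity-sized units a capacity request works out to."""
    return ceil_div(capacity_gb, granularity_gb)


def padded_request_gb(expected_output_gb, headroom_percent=10, granularity_gb=GRANULARITY_GB):
    """Add headroom (an assumed 10%) and round up to a whole number of units."""
    needed_times_100 = expected_output_gb * (100 + headroom_percent)
    units = ceil_div(needed_times_100, 100 * granularity_gb)
    return units * granularity_gb


if __name__ == "__main__":
    request = padded_request_gb(900)
    print(f"request {request} GB, which is {allocation_units(request)} units")
```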
Then there are a few more commands which are useful. One is staging, assuming your data is on the scratch file system.
A
You use this command before you start your job: you stage in the file from scratch onto the burst buffer, and then when the job runs, your compute nodes can access that data directly on the burst buffer. You can also use stage out. That assumes that you have produced some new data, or you may have modified the data, and you want to keep the new result.
A
So you want to stage out from the burst buffer down to some relatively permanent space like scratch, which will not be purged for 12 weeks, so it is safe and relatively permanent. Also, the burst buffer space disappears when your job exits.
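To make the capacity request and the staging steps above concrete, here is a sketch that assembles the batch-script lines as text. The `#DW jobdw`, `#DW stage_in`, and `#DW stage_out` directive forms and the `$DW_JOB_STRIPED` variable are written as I understand the DataWarp integration with Slurm and should be verified against the current NERSC documentation; the file paths and username are hypothetical.

```python
def job_burst_buffer_directives(capacity_gb, stage_in=(), stage_out=()):
    """Return batch-script directive lines for a per-job burst buffer allocation.

    stage_in and stage_out are sequences of (source, destination) pairs for single files.
    """
    lines = [f"#DW jobdw capacity={capacity_gb}GB access_mode=striped type=scratch"]
    for source, destination in stage_in:
        lines.append(f"#DW stage_in source={source} destination={destination} type=file")
    for source, destination in stage_out:
        lines.append(f"#DW stage_out source={source} destination={destination} type=file")
    return lines


if __name__ == "__main__":
    scratch = "/global/cscratch1/sd/alice"  # hypothetical user directory
    directives = job_burst_buffer_directives(
        1000,
        stage_in=[(f"{scratch}/input.dat", "$DW_JOB_STRIPED/input.dat")],
        stage_out=[("$DW_JOB_STRIPED/result.dat", f"{scratch}/result.dat")],
    )
    print("\n".join(directives))
```

The printed lines would sit near the top of the batch script, alongside the scheduler's own directives.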
If you want burst buffer space for a longer period, you want to create a persistent reservation.
A
So in order to create this reservation, which is owned only by you or by a group of your users, you need to submit a few jobs. One job is to create the reservation. This is job 0: you just submit it as a batch script, specifying the capacity, the access mode, and the type of space, and also giving it a name. Later you use this name as your burst buffer tag.
A
This is how you use the persistent reservation in your job. And don't forget to delete the reservation afterwards; six weeks is how long we can guarantee the data is safe.
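The persistent reservation lifecycle described above (create it in one job, refer to it by name in later jobs, delete it at the end) might be sketched as three directive builders. The `#BB create_persistent`, `#DW persistentdw`, and `#BB destroy_persistent` forms are written from my recollection of the DataWarp and Slurm integration and should be checked against the NERSC documentation; the reservation name and capacity are hypothetical.

```python
def create_persistent(name, capacity_gb):
    """Directive for the job that creates the named reservation."""
    return (
        f"#BB create_persistent name={name} capacity={capacity_gb}GB "
        "access_mode=striped type=scratch"
    )


def use_persistent(name):
    """Directive for any later job that attaches to the reservation by name."""
    return f"#DW persistentdw name={name}"


def destroy_persistent(name):
    """Directive for the final job that deletes the reservation."""
    return f"#BB destroy_persistent name={name}"


if __name__ == "__main__":
    name = "myBBname"  # hypothetical reservation name
    for line in (create_persistent(name, 1000), use_persistent(name), destroy_persistent(name)):
        print(line)
```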
And third is the project file system. You can see that for running jobs or for doing data analytics, the burst buffer and scratch are good to use, but for sharing large data, project is recommended.
A
Okay, so this is project, and then there's HPSS. If you want to keep data forever, for example the data from a paper, or some raw data that you want to reuse later to reproduce your science, you should archive the important data. For HPSS there are some best practices on the website. For example, you should archive the data in the way that you later intend to retrieve it.
A
So if you build some software that you want to use and that is used by your group, you may consider requesting a global common space. Here's a performance comparison showing that the library loading time is faster on the common space. And then finally there's global home. Like I said, it is designed for hosting your source files, and you may compile code in this space, but it's not intended for I/O-heavy operations.
A
So this is just a summary of what we have covered. Like I said, there are different file systems: the burst buffer, scratch, project, and HPSS, and then global common and global home. They are really designed for different purposes, and you should understand what you are going to do. If you are going to archive some data, then go to HPSS; or, if you are going to share some data, use project.
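The summary above can be condensed into a small lookup from intended use to file system. The pairings follow the talk; the purpose keys and the function name are hypothetical labels invented for this sketch.

```python
# Purpose -> file system, as summarized in the talk. Keys are illustrative labels.
RECOMMENDED_FILE_SYSTEM = {
    "accelerate_io": "burst buffer",   # fastest tier, temporary
    "run_jobs": "scratch",             # Lustre, purged after 12 weeks
    "share_data": "project",           # longer-lived, shared with collaborators
    "archive": "HPSS",                 # long-term archive
    "group_software": "global common", # software builds used by a group
    "source_code": "global home",      # source files and compiling, limited quota
}


def recommend(purpose):
    """Return the file system suggested for `purpose`."""
    try:
        return RECOMMENDED_FILE_SYSTEM[purpose]
    except KeyError:
        known = ", ".join(sorted(RECOMMENDED_FILE_SYSTEM))
        raise ValueError(f"unknown purpose {purpose!r}; expected one of: {known}") from None


if __name__ == "__main__":
    for purpose in ("archive", "share_data", "run_jobs"):
        print(purpose, "->", recommend(purpose))
```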
A
Okay, and finally, there's a nice thing designed here at NERSC, which is the data dashboard. If you go to the my.nersc.gov website, you can get a very clear picture of your data on the NERSC file systems, including home, cscratch, and project. Then there is more specifically for the data on project.
A
You can get more insight by clicking this data dashboard, where you can view all the projects that you belong to. You can also click this button to check the detailed usage in terms of the percentage of the space allocation and the inode allocation, and the per-group percentage of the space allocation.