From YouTube: Operating OpenZFS at Scale by Satabdi Das
Description
From the 2022 OpenZFS Developer Summit: https://openzfs.org/wiki/OpenZFS_Developer_Summit_2022
Slides: https://docs.google.com/presentation/d/1fZcVvGNtG5V4JyKG7IlpzZR37YtZdI0-/edit?usp=sharing&ouid=112595186103367032517&rtpof=true&sd=true
I'm a software development engineer at AWS, on the FSx for OpenZFS team, and today I'm going to give you a peek into what AWS FSx for OpenZFS is. Before I start, just a quick show of hands: how many of you have heard about AWS FSx? Oh, that's a good number! So I don't have to spend a lot of time. For those who do not know it yet, AWS FSx offers file systems in the cloud. The x is literally your choice for your workload.
What is FSx for OpenZFS? It's a storage service that lets you launch, run, and scale fully managed OpenZFS file systems on AWS. It provides the familiar features, performance, and capabilities of OpenZFS file systems, with the agility, scalability, and simplicity that you can expect from a fully managed AWS service.
Before I dive deeper into our service, I just wanted to call out the resources that we manage. We have something called the FSx volume, which you may know as an OpenZFS dataset or file system. We call it a volume because many of our customers do not actually know about OpenZFS. They are not OpenZFS-savvy users; all they care about is a cost-effective file system. They also care about the uniformity of the terms that we use across AWS services, and we do use FSx volumes for other file engines.
Why do customers use FSx for OpenZFS? This is just a summary of the reasons, and I'm going to dive deeper into each of them: it's high performance, it offers advanced ZFS capabilities, and it is cost effective.
Now, before we talk about performance, I would like to explain how data is accessed from an FSx for OpenZFS file system. Clients access the file system from inside the AWS cloud, and each FSx file system consists of a file server attached to storage disks. FSx for OpenZFS uses the ARC cache, which improves access to the portion of data that can be served from the in-memory cache, and FSx for OpenZFS file systems can serve that data at network speed.
For data accessed from the persistent disk storage, we offer up to four gigabytes per second of throughput and 160K disk IOPS. This level of performance is possible for two reasons: one, of course, because the underlying hardware supports this level of throughput and latency, but also because of the advanced ZFS capabilities that we use, for example the ARC or compression.
Customers also use FSx for OpenZFS file systems because of the ease of use. A customer can create an OpenZFS file system in a matter of minutes. They can use the CLI, the API, or the SDK to create a file system, scale up its storage, and scale up its throughput, all in a matter of minutes.
Now compare that experience with creating an OpenZFS file system yourself, where you have to provision your hardware, install the software, upgrade it regularly, patch against vulnerabilities, and manage your own backups. All of that complexity is taken care of by the service.
I wanted to give you an example of how a customer can create a file system, just to give an idea of how easy it is. This is the CLI, using which a customer can create a 400-gigabyte file system. They give the deployment type, which is an internal way of saying which kind of OpenZFS file system we are going to provision; right now we only support Single-AZ 1. And then they can create a file system with 128 MB/s of throughput capacity.
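As an illustration, a request of the shape described above might be built like this for the AWS SDK (e.g. boto3's `fsx.create_file_system(**request)`). The field names follow the public FSx API, but treat this as a sketch rather than a reproduction of the exact command shown on the slide.

```python
# Illustrative FSx for OpenZFS CreateFileSystem request body, matching the
# talk's example: a 400 GiB, Single-AZ 1 file system with 128 MB/s of
# throughput capacity. Field names are taken from the public FSx API.
def create_file_system_request(storage_gib=400, throughput_mbps=128):
    return {
        "FileSystemType": "OPENZFS",
        "StorageCapacity": storage_gib,  # in GiB
        "OpenZFSConfiguration": {
            "DeploymentType": "SINGLE_AZ_1",       # only type at launch
            "ThroughputCapacity": throughput_mbps,  # in MB/s
        },
    }

request = create_file_system_request()
```

In practice this dictionary would be passed to `boto3.client("fsx").create_file_system(**request)` with credentials configured.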
We also offer APIs for creating FSx volumes, creating snapshots, and creating backups. Customers can update the file system's storage, update its throughput capacity, and update volume configurations; for example, they can update the quota, the compression, the NFS exports, and the record size, all using native FSx APIs.
Many customers prefer to automate their scaling operations, scaling up when they cross a certain threshold, and they can use native AWS APIs to achieve that.
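A minimal sketch of such a policy might look like the following; the 80% trigger and the 20% growth step are assumptions chosen for the example, not values recommended by FSx.

```python
# Illustrative auto-scaling decision: once utilization crosses a threshold,
# grow the file system's storage capacity by a fixed step. In practice the
# new capacity would be sent to the FSx UpdateFileSystem API.
def next_storage_capacity(used_gib, capacity_gib, threshold=0.8, step=1.2):
    if used_gib / capacity_gib < threshold:
        return capacity_gib          # below the threshold: no change
    return int(capacity_gib * step)  # scale up by 20%
```

For example, a 400 GiB file system that is 350 GiB full (87.5% utilized) would be grown to 480 GiB, while one at 100 GiB used would be left alone.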
It's also fully managed. What do I mean by fully managed? We patch the file systems regularly, we take periodic incremental backups, and we also offer data encryption. Let's go a little deeper into each of those.
We patch file systems regularly during maintenance windows. What do we patch? We upgrade the software running on the file server. The day of the week and the time can be decided by the customer, and they can also update those maintenance windows. We take OpenZFS release versions and review them within our team; we have a process around that, and based on our review we decide which version to pick up and patch into our file systems.
What is an FSx for OpenZFS backup? It's incremental in nature; we use internal AWS infrastructure to take the backup, and it is highly durable: we store those snapshots in Amazon S3. It is file-system consistent, meaning you can create a file system from any of those backups. When do we take it? We have an automated job that takes daily backups during a backup window. Again, we give the customer the choice of changing the backup window or not, and they can also modify the retention period; the default is seven days.
We also offer data encryption. There are two types of encryption that we offer: encryption at rest and encryption in transit. Encryption at rest means customer data is protected and encrypted. If customers provide a key through AWS Key Management Service, which is another AWS service for key management, we encrypt the data using the customer's key. Data and metadata are encrypted before being written to the file system, and decrypted before being presented back to the applications.
We share metrics with the customers so that they can monitor their file system performance themselves.
We retain these metrics for 15 months, so that customers can have a historical view of how their file system has performed over time. We expose read operations and write operations, the amount of storage they have used, the compression ratio they have achieved so far, and the CPU and memory usage of the file server.
We augment the file server metrics with metrics that we source from ZFS. Along with these metrics, we continuously monitor the file systems and generate internal alarms for ourselves, and we generate dashboards, which we review regularly; by regularly I mean we review them daily. Among many other things, we monitor file server health and the state of the software running on the server. And we operate what we build: by that I mean our team writes the code, deploys it in production, and operates it.
So we put a lot of effort into how we monitor our file system health, how we monitor the overall health of the service, and how we can automate mitigations.
Customers can tune a few aspects of their file system. For example, they can tune the record size. We offer a 128 KiB record size, and our public documentation maintains that that is good enough for most workloads, but customers can choose to update their record size based on their workload.
They can also change the user and group quotas based on their requirements, and all of this they can access from the API.
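The volume-level tunables mentioned here could be combined into a single update request along the following lines; the field names follow the public FSx OpenZFS API (as used by boto3's `fsx.update_volume`), but the specific values are made up for the sketch, so verify against the documentation before relying on them.

```python
# Illustrative FSx UpdateVolume request covering the tunables from the talk:
# record size, compression, NFS exports, and user/group quotas. Field names
# follow the public FSx OpenZFS API; values are example assumptions.
def update_volume_request(volume_id):
    return {
        "VolumeId": volume_id,
        "OpenZFSConfiguration": {
            "RecordSizeKiB": 128,            # the default record size
            "DataCompressionType": "ZSTD",
            "NfsExports": [{
                "ClientConfigurations": [
                    {"Clients": "*", "Options": ["rw", "crossmnt"]}
                ]
            }],
            "UserAndGroupQuotas": [
                # hypothetical quota: 50 GiB for user id 1001
                {"Type": "USER", "Id": 1001, "StorageCapacityQuotaGiB": 50}
            ],
        },
    }

req = update_volume_request("fsvol-example")
```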
So what did we learn so far? We launched the service, as I mentioned, in December 2021, so it's been almost a year that we have been operating, and we have a few learnings that we can share with you. One is: we know compression is good for saving disk space, but we also noticed that for read-heavy workloads, compression can significantly improve the overall throughput performance of the file system, because it reduces the amount of data that needs to be accessed.
It reduces the amount of data that needs to be sent between the underlying storage and the file server. The effective throughput is actually roughly equivalent to the product of the provisioned disk throughput and the compression ratio. For example, at 4 GB/s of disk throughput, which is the highest throughput level that we offer currently, if the zstd compression ratio is around 2 to 3x, the effective read throughput is around 8 to 12 gigabytes per second.
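The arithmetic behind those numbers is just a product, which can be written out as:

```python
# Effective read throughput is roughly the provisioned disk throughput
# multiplied by the compression ratio, since compressed blocks move less
# data between the disks and the file server.
def effective_read_throughput(disk_gbps, compression_ratio):
    return disk_gbps * compression_ratio

# 4 GB/s of disk throughput with a 2-3x zstd compression ratio:
low = effective_read_throughput(4, 2)   # 8 GB/s
high = effective_read_throughput(4, 3)  # 12 GB/s
```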
We also learned that there's a relationship between throughput and the ARC. This was by design, but it is something which we sort of learned as we operated, and I wanted to share it with you. The file system's throughput actually determines the size of the ARC. So when a customer increases their throughput, it improves the file system performance in two ways: one, it increases the throughput and the IOPS customers can drive from the disks; but since the ARC size also increases, it also becomes a better fit for workloads with larger working sets.
We leverage ARC tunings to improve file system performance and stability, and since the ARC works so well for customers, we want to maximize the amount of memory we give to the ARC. But on smaller servers we noticed that certain metadata-intensive workloads can push the ARC size well above what we configured for zfs_arc_max, and the amount of free memory also gets pushed below zfs_arc_sys_free. So we experimented with a few tunings around ARC pruning and the ARC eviction strategy, and we used arcstats.
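On Linux, the ARC statistics referred to here are exposed through `/proc/spl/kstat/zfs/arcstats`, which tools like `arcstat` read. A minimal parsing sketch, run over a simplified embedded sample (the real file carries an extra kstat header line and many more rows), might look like:

```python
# Simplified sample of the arcstats kstat table: name / type / value rows.
# Values here are made up for the sketch.
SAMPLE = """\
name                            type data
size                            4    8589934592
c_max                           4    17179869184
arc_meta_used                   4    2147483648
"""

def parse_arcstats(text):
    """Parse arcstats-style rows into a dict of name -> integer value."""
    stats = {}
    for line in text.splitlines()[1:]:  # skip the column-header row
        name, _type, value = line.split()
        stats[name] = int(value)
    return stats

stats = parse_arcstats(SAMPLE)
# e.g. how close the ARC currently is to its configured maximum:
fill = stats["size"] / stats["c_max"]
```

Watching ratios like `size / c_max` over time is one way to spot the kind of metadata-driven ARC growth described above.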
So thank you for writing the tool that helped us debug this and work out how we could alleviate the memory pressure. We made tunings to zfs_arc_sys_free and to the ARC metadata strategy, basically to balance performance versus stability in a way that helps improve file system health.
We also learned that many of our customers are, as I mentioned before, actually not ZFS-heavy users. They all need a file system which is cheap and highly performant, and since we pre-tune their file systems, they do not have to worry about how to tune different ZFS components. They can just use it as it is, out of the box, and it works for most of them. Many customers use it as general-purpose storage, not even for HPC workloads; they use it as home directories and for storing their binaries, and it is cost effective.
We also learned that many customers with metadata-intensive workloads would be greatly helped if they had access to metadata statistics, and right now we do not have a clear path to source those metadata statistics from ZFS.
What are we looking for? We're looking for metadata statistics on open, close, mknod, mkdir, getattr, setattr, and so on, plus link and unlink. We have a list of operations for which we want, if possible, to source metadata statistics from ZFS, and we hope we can contribute in some way to the community if we can get those statistics from ZFS natively.
So many customers use OpenZFS as high-performance storage for latency-sensitive and IOPS-intensive use cases: many customers in financial analysis, in front-end EDA (by front-end EDA they mean simulation of designs), and in genomics research. Basically, wherever they care about the low-latency, high-throughput features of a file system, they use OpenZFS.
Many customers also use it for streaming video processing, for the same reasons mentioned on the right-hand side of the slide: the latencies, the IOPS, the throughput, and of course the low cost.
Where was I? Okay, so many other customer use cases: it can be used as a drop-in replacement for self-managed NFS. As I mentioned before, customers use it as a user share or home directory, or as simple storage, because of the cost, the flexible storage, and the throughput provisioning. What do I mean by flexible storage?
So basically, as I mentioned, it's very easy to increase the storage of your file system and to increase its throughput, without thinking about how to reconfigure ZFS when you increase the size, because all of that is taken care of by workflows that FSx runs in the background.
Some customers are really ZFS-savvy users; they like to use ZFS features like snapshots and cloning. Some customers use cloned storage for their test environments: they take point-in-time snapshots, and they tend to come to ZFS because of those features.
AWS claims that we are the most customer-obsessed company on planet Earth, so no presentation from AWS is complete if I don't share a few customer stories. One of our customers is Vela Games. They are a game studio, and they improved their build times by 60% by using OpenZFS. The challenge was that they were using BuildGraph, which automated their build tasks, and what they did using OpenZFS is they started checkpointing.
They persisted the output of intermediate build tasks and passed those outputs to the next stage of the automated build, so that they do not have to build from scratch every time those next steps run, and they used cloning to achieve that, which was a very fast and easy way to get a copy of the output of the last stages.
Rev.com is another customer. They were looking for a fully managed, high-performance file system, and they wanted to reduce operational complexity. They accelerated their ML training workloads with low-latency storage, and they reduced their operating costs by nearly 30 percent. Also, as I mentioned, we publish metrics to Amazon CloudWatch, which gives them visibility to monitor their file system performance.
To recap what I have covered so far: customers use FSx for OpenZFS because of its high-performance storage for latency-sensitive and IOPS-intensive use cases; they use it because of advanced ZFS capabilities, which have been made simple and easy; and it's a cost-effective, fully managed drop-in replacement for self-managed NFS.
Going forward: this is the first time AWS is attending the OpenZFS Developer Summit, and we'd like to continue our engagement with the community. We run OpenZFS on the Arm Graviton architecture, so whatever bugs we see, we'd like to report back to the community, and wherever we can fix them, we'd like to upstream those changes. It's been only a year that we have been operating OpenZFS, although…
Got it, yeah. I can talk to you later after the conference, and then you can get to know more about it. Yes, sir?
So the question was: there was a slide on how we have tuned some of the ZFS configurations to get performance, and the question was whether we did those as live tunings and what kind of tunings we applied. Without going too much into implementation details:
First, we monitor, using arcstats and other tools, the health of the file server: how the file server is performing, the IOPS and all that. And we first try to tune the file systems in our development environment and try to drive the workload; we cannot do those tunings on production customer file systems. So first we try them out in our test environment, and we conduct experiments.
My colleague Patrick, who recently did some of the ARC tunings that I mentioned, is here; you can certainly walk up to him and he can explain what all we did. So basically, our main goal is an optimal balance between performance and stability, and that is sort of our guiding tenet in deciding what kind of tunings we use.
So that's a good question. Oh sorry, the question is: what encryption strategy do we use? And I said that's a good question because it's not at the top of my head, but I can get back to you on that; we have public documentation available for that.
You talked about backup, and you said that you were using internal AWS technology to back up the file system. I was curious whether you considered using zfs send.
So the question is: I mentioned that we use AWS infrastructure for backing up the file system, and the question is whether we have considered using ZFS snapshots for backup. Yes, we did consider ZFS snapshots, but we decided to go with our internal AWS infrastructure, because that gives us uniformity across multiple engines, and it also let us design the feature the way we wanted for our customers.
So the question is: what kind of internal infrastructure do we use? AWS doesn't disclose that information; we keep it as an abstraction layer, because we feel that information doesn't help customers in any way. What customers care about is access to their file system over NFS, since we offer OpenZFS, and how, and from what kind of clients, they are going to access it. So that's why we do not disclose that information.
The question is how much data is stored in Amazon FSx for OpenZFS. Unfortunately, I'm not the best person to answer that question, because we do not publish those figures openly.
So the question is: have we considered using L2ARC or SLOG devices, or do we simply use the ZIL? We did consider them, and our guiding tenet there is to offer something that is simple to use for the customer and gives the best performance. Because ZFS is so feature-rich, we are still looking into multiple features we could use to drive even better performance for the customers, but so far we do not offer an L2ARC device.
Again, that is something I am not the best person to answer, but we launched in December 2021, last year. AWS has a conference called re:Invent; some of you may know it is the largest conference for the community, and we announced the general availability of FSx for OpenZFS at that conference.
Got it. So the question is: I mentioned that a single FSx file system can have multiple FSx volumes, so what is a file system, and how do customers use the volumes? Customers can access those volumes using NFS exports, and they can create multiple child FSx volumes under the root volume. Since we allow customers to change the NFS exports, they can access those different child volumes through their NFS exports.
An FSx volume is something that we expose to the customer, so it's visible to the customers over NFS.
Yes, currently we offer the NFS protocol, so any Windows client has to access it over NFS. I think that was the last question. Thank you; I hope I could answer everything.
If you have any questions, please feel free to reach out to me. We have several other colleagues from AWS attending this conference; please say hi to them. Thank you.