From YouTube: Illumos Brings the SAS by Kody Kantor
Description
From the 2019 OpenZFS Developer Summit
slides: https://drive.google.com/open?id=14KbyfOcf23rhatgypG3ZxIGSfBGbHNlp
So I'm going to tell you what everybody's favorite subject, SAS topology, is. We already touched on it a little bit during Alan's talk, when he mentioned maybe storing the initiator in a vdev property, so we know what that thing is actually connected to. We talked about racing device names on Linux, and we don't worry about that on illumos, which is cool, so I'm glad that's one of your problems. But we'll talk about our problems instead.
We all know that there are ways to tolerate failures, to forecast failures (to see them coming), and then to avoid failures. Tolerance is pretty well understood: we have things like RAID in ZFS. For forecasting, we talked a little bit about SMART data earlier: a disk will predict when a failure might be coming and proactively notify the operating system, or you can use a CLI tool to see that it might happen.
We'll also talk about the really bad stuff, which is operator error, and how things can go totally wrong. Let's take a situation where ZFS identifies a bunch of checksum errors on a bunch of disks. ZFS, using the ZED, or FMA on illumos, will automatically swap in a hot spare, and the operator is going to be notified: hey, you should replace this disk, because we found a bunch of checksum errors on it.
You should replace these disks, and if an operator doesn't know any better, they'll replace the disks. But sometimes an operator does know better, and sometimes it's even impossible to know what the right thing to do is, because hardware failure is really complicated once you take into account where things are in the box.
So here's just a picture of one of our boxes, a system that we use; the BOM is online, so if you want to check that out later, you can. I don't know how many folks have looked at hardware diagrams, but this is an HBA here, and then we have a pair of backplanes that the disks actually plug into. In our BOM, what we do is we have two cables that go to one expander, and then two cables from that expander go to the other expander. So the HBA in that architecture is a single point of failure.
If you lose the HBA, all your disks basically just disappear. Or, if there are errors that only occur sometimes because the HBA is slowly failing, then it'll look like every disk in the system is throwing errors. If you think that doesn't actually happen: it totally does happen. This is an eye chart here, and even if you could see it, it's basically unreadable, but the system started out with just two hot spares and somehow we have like ten of them.
These are all worldwide names, and you sort of have to do this mapping of, okay, well, the one ending in five-zero had a bunch of errors, so I think that's this one here. You're not really sure; you're just matching up numbers at this point. And this is a really simple topology: we have two HBAs and, I don't know, sixteen disks or something. You can imagine doing this on a 124-disk system with a fan-out expander and two edge expanders.
So really we need better tools, because this is hard to get right. The multipathing people, I don't know how you live, but you probably have the most crazy SAS topologies on the planet. We use pretty simple ones, but it's really hard to get this mapping of hardware topologies right, because there are just so many different ways that are valid according to the SAS documentation, and it's really confusing even to picture in your mind.
So we started some work in illumos to try to solve this problem. The first thing we're trying to do is make things better for operators. The first thing that Rob Johnson, a co-worker of mine, did was implement prototype support for directed graphs in FMA, because not all SAS topologies look like a tree; with multipathing, things can get really complicated. And then we found this thing called the SM-HBA API to go along with it.
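To make the multipathing point concrete, here is a rough Python sketch (all node names are invented; this is not the illumos FMA code) of why a directed graph is needed: a dual-ported disk is reachable through more than one expander, which a tree cannot represent.

```python
from collections import defaultdict

# A toy SAS topology as a directed graph: node -> downstream nodes.
edges = defaultdict(list)

def link(a, b):
    edges[a].append(b)

link("initiator0", "expander0")
link("initiator0", "expander1")
link("expander0", "disk0")
link("expander1", "disk0")  # second path to the same dual-ported disk

def paths(src, dst, seen=()):
    """Enumerate every path from src to dst through the graph."""
    if src == dst:
        return [[dst]]
    return [[src] + p
            for nxt in edges[src] if nxt not in seen
            for p in paths(nxt, dst, seen + (src,))]

# Two distinct paths reach disk0; a tree could only represent one.
print(len(paths("initiator0", "disk0")))  # -> 2
```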
So we can get some information from the SM-HBA API, and then what we did is we wrote a utility to run a bunch of SMP commands (SMP is the SCSI Management Protocol) against any expanders that have SMP ports available. By doing that, we can sort of figure out the worldwide names of the disks that are behind an expander, and then what those are attached to, using worldwide names. We can also get phy link error state counters through SMP.
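As a hedged illustration of what those counters enable: the counter names below follow the SAS spec (the kind of data SMP's REPORT PHY ERROR LOG returns), but the expander names and values are made up.

```python
# Per-phy link error counters, keyed by (expander, phy number).
# Counter names are from the SAS spec; the values are invented.
phy_counters = {
    ("exp0", 4): {"invalid_dword": 950, "running_disparity": 40},
    ("exp0", 5): {"invalid_dword": 3,   "running_disparity": 0},
}

def port_errors(counters):
    """Sum all link error counters per (expander, phy)."""
    return {phy: sum(c.values()) for phy, c in counters.items()}

# A flaky phy stands out immediately once the counters are totaled.
totals = port_errors(phy_counters)
worst = max(totals, key=totals.get)
print(worst, totals[worst])  # -> ('exp0', 4) 990
```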
So, hopefully... oh yeah. This first tool, sastopo, is like fmtopo, if you've ever used that, crossed with the LSI utils. It'll print out paths from initiators to targets in the SAS topology, and you can tell it to print out specific properties, like maybe the host device name or the chassis location, that sort of stuff; maybe the chassis location is "front disk zero".
You can tell a datacenter operator: yeah, it's front disk zero that we need to replace. And you can optionally serialize this directed graph into an XML document, because XML is the format of the future, and it can handle 64-bit numbers, which things like JSON can't really.
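The 64-bit point can be demonstrated: a SAS WWN is larger than the 53-bit integer range of an IEEE-754 double, which is how JavaScript-style JSON consumers store numbers. A small Python sketch, with a made-up WWN:

```python
# A made-up 64-bit SAS WWN, well above 2**53.
wwn = 0x5000C500A1B2C3D4

# Round-tripping through a double (what a JavaScript JSON parser
# does to every number) silently corrupts the low bits.
assert wwn > 2**53
assert int(float(wwn)) != wwn

# Encoding as a string, which is effectively what an XML attribute
# does, is lossless.
assert int(f"{wwn:016x}", 16) == wwn
print(f"{wwn:016x}")  # -> 5000c500a1b2c3d4
```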
So here's just a sample of it. This is on that system; the first invocation was just with no arguments. You can see that we've created a bunch of nodes in this directed graph.
We have an initiator, which is connected to an expander; each of those has a port (this is a wide port), and then the expander port is connected to a target. If we wanted some more detail, we could just run it with the verbose flag and get some details on the port. In the future this is going to include phy link state error counters, so maybe we'll see a thousand errors on this target phy (this is phy 4, and this is phy 5), and then this is the FMRI.
You can use that to look up the resource in the SAS scheme. And then here on the bottom we have a target. This long block here is the hardware component FMRI, which is used to look up the actual physical device in FMA on illumos. And then you can see that we discovered some information about the device: where it is (slot 0), the manufacturer, the model number, the serial number, all that sort of stuff.
The second thing is that we wrote a tool to convert the XML output from the previous tool; this one is written in Rust, and it produces a website bundle. So you can actually open up this topology in a web browser, and it makes it really easy to see how things are connected in a system, no matter how complex it is.
So we can see each HBA is attached to eight disks, and you can click on these and get more information about them. Here we clicked on the HBA, and we have the hardware component FMRI. We can look that up, and it'll give us the model number, the serial number, the device label, that sort of stuff. And we'll be able to put things like phy link state error counters in these port nodes as well.
So in the future, what we'll be able to do is, if we find a port that's throwing a bunch of errors, we can automatically color a box red or something, so that an operator can quickly look at this and be like: oh okay, the HBA is throwing a thousand errors, this is red, and my ZFS is also identifying checksum errors on all eight of these disks, but these eight disks are totally okay. So then we know that the HBA has gone wrong, or a cable has gone bad.
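That diagnosis idea can be sketched in a few lines. This is only an illustration of the reasoning, not the FMA retire logic: if every disk behind one shared component reports checksum errors, the common parent is a more likely culprit than all of its disks at once. The topology and names are invented.

```python
def diagnose(topology, failing_disks):
    """topology: {component: [disk, ...]}; failing_disks: set of disks.
    Returns components whose entire set of child disks is erroring."""
    suspects = []
    for component, disks in topology.items():
        behind = set(disks)
        if behind and behind <= failing_disks:
            # Every disk on this path is erroring: suspect the shared
            # parent (HBA, cable, expander) rather than the disks.
            suspects.append(component)
    return suspects

topology = {
    "hba0": ["disk0", "disk1", "disk2", "disk3"],
    "hba1": ["disk4", "disk5", "disk6", "disk7"],
}
print(diagnose(topology, {"disk0", "disk1", "disk2", "disk3"}))
# -> ['hba0']
```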
So it's super useful. Here's a slightly more complicated picture, which is even harder to see. In this case we have a single HBA which has a fan-out expander. We clicked on it, and we have the same hardware-specific information over there in the corner. It has a bunch of disks attached to it, and then there's also another expander here that's attached to a few more disks.
So maybe this expander is going bad, and ZFS has identified checksum errors on all of these disks, but all the other disks are okay. Even having trivial tools like this, if you've ever had to dive into lsiutil for this kind of thing, is a game changer; it's saved me a lot of pain. Alright, so.
Like I said, our short-term goals are just to have better tooling for our operators. Longer term, though, since illumos has FMA, which is really good fault management for hardware, we would like to enhance FMA to actually provide more targeted diagnoses when things start going wrong in the chassis. So what might that look like?
Maybe ZFS stops swapping in disks all the time when it sees these checksum errors. And we also want to be able to make better pools, because, I don't know about you guys, but when I make a pool I'll just go "zpool create mirror" or "raidz1" followed by a list of disks, without really taking into account where those disks actually are, or what the fault domains are. So that's really what we'd like to do with this work.
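A minimal sketch of what fault-domain-aware pool layout could mean, assuming we already know which HBA each disk sits behind (the disk and HBA names here are invented): pair mirror halves across HBAs, so losing one HBA leaves every mirror with a surviving side.

```python
def mirror_pairs(disks_by_hba):
    """disks_by_hba: {hba: [disk, ...]} -> list of cross-HBA pairs."""
    hbas = sorted(disks_by_hba)
    if len(hbas) < 2:
        raise ValueError("need at least two fault domains")
    a, b = (list(disks_by_hba[h]) for h in hbas[:2])
    # Each pair takes one disk from each HBA, so no mirror has
    # both halves behind the same single point of failure.
    return list(zip(a, b))

pairs = mirror_pairs({
    "hba0": ["c0t0d0", "c0t1d0"],
    "hba1": ["c1t0d0", "c1t1d0"],
})
print(pairs)
# -> [('c0t0d0', 'c1t0d0'), ('c0t1d0', 'c1t1d0')]
```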
So the question was: how can we improve the retire agent with this information, and do we have any prototype? We don't have any prototype code for that, but one thing that we were thinking is that, in these device-specific properties, we could mark this disk as being part of pool GUID whatever, with vdev GUID whatever.
Question: do you think that you could take the logic of drawing this and pare it down to the error case, and make an ASCII art output that could eventually be part of some sort of command line diagnostic? Right now ZFS tells you that the disk has checksum errors, but you could roll that up and show the errors on the full path, because you're generating that long path string. I'm just wondering, how feasible is that?
I mean, I think that we could certainly do stuff like that. You can look up these individual nodes just by using the FMRI, so you could conceivably do something like that, sort of like what fmadm faulty gives you. I don't know, I'm not the right person to ask, but you could probably do something like that.