From YouTube: [Linux.conf.au 2013] - The grand distributed storage debate: GlusterFS and Ceph going head to head
Florian: So let me just start by introducing the most important people on the stage here today, which are my two debaters. To be fair, in alphabetical order by surname: John Mark Walker in the Gluster camp, and Sage Weil in the Ceph camp. Now let me introduce these guys to you real quick. John Mark is, I think it's fair to say, a veteran community person in the free and open source software space. I know that's dating you to a certain extent, but you're just going to have to live with that.
He has relatively strong opinions on US economic policy, which are quite interesting to follow, and he is also, and I hope this is not wrong, an avid follower of what one half of the English-speaking world calls association football, and the other half calls soccer.
Sage Weil, to my right and your left, is the technical lead of the Ceph project. He is one of the founders of DreamHost. He has a gigantic whiteboard behind him to draw out his design and implementation ideas, and rumor has it that this whiteboard has never been empty.
As for myself, I am Florian. I run hastexo, which is a company that does projects with both of these storage technologies, and today I will be doing what is the absolutely hardest thing for me to do: have no opinion and shut up. A few words about the format that we're doing this afternoon.
This is a debate, which means it's not a Q&A session and it's also not a fight, but something halfway in between. We're going to start by giving each of our presenters three minutes, and no slides, to give us their summary of the project that they represent, and after that we will start with the questions.
And whenever you have a question, please just raise your hand and wait for Tim to race up to you with the microphone. And of course, when you have questions, please try to be as precise as possible and put our speakers on the spot. We will try to alternate who gets to be the first person to answer a question, and the other person will always get a chance to answer with a rebuttal, or whatever they choose to do.
John Mark: It started, as half of these things start: we said, yes, of course, without really understanding at all what was on the table. So we thought we would just pick something off the shelf, install it, and everything would be fine. Turns out there wasn't anything off the shelf to be found. We discovered that the proprietary software was extremely expensive and difficult to use, especially for that use case. Oh, and one great part about the performance requirement: it had to be faster than tape backup, so as long as it was better than tape backup, that was good enough.
We went through a bunch of different solutions that didn't work; I could get into each of them, but that can come later, it's not important now. But we discovered that, okay, we'll do this thing ourselves, and we found that the best way we could do this was with what we implemented and included: we created what we call a LEGO-like toolkit of building blocks for storage systems.
That's not something I'd list as a positive, it was a negative, but it really did inform our early design decisions, because we decided it was all going to run in userspace, and that's where we placed our bets: message-passing interfaces between translators in user space, and around that, an elastic hashing scheme to determine distribution at runtime.
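To make that concrete, here is a minimal sketch of the hash-based distribution idea: every client can compute a file's location from its name alone, with no metadata server. The brick names are hypothetical, and the real GlusterFS DHT is considerably more elaborate (it hashes within per-directory ranges stored in extended attributes):

```python
import hashlib

# Hypothetical bricks; in GlusterFS these would be server:/export paths.
BRICKS = ["server1:/export/brick", "server2:/export/brick", "server3:/export/brick"]

def brick_for(name: str) -> str:
    """Pick a brick purely from a hash of the file name, so every client
    computes the same placement without asking a metadata server."""
    digest = hashlib.md5(name.encode()).digest()
    return BRICKS[int.from_bytes(digest[:4], "big") % len(BRICKS)]

print(brick_for("debate-recording.webm"))  # same answer on every client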
Sage: Now, Ceph actually started as a part of my graduate work at UC Santa Cruz, and the motivation was that the national labs were deploying these huge supercomputers, you know, thousands, tens of thousands, hundreds of thousands of processors writing to the same file system at the same time, dumping these huge, you know, computations into the system. The problem was that the Lustre metadata server just did not scale past a single node.
It was a single point of failure and it limited performance, so we were sort of tasked with: how do you build a petabyte-scale file system? At the time, a petabyte felt like a mythic number. How do you build a petabyte-scale file system that will actually scale to those workloads? And as we sort of took a step back and looked at the larger design problem, we realized that we needed to solve a couple of different problems. One is just the scalability aspect: how do you make something that will actually distribute that workload across lots of nodes...
...while still giving you the right answer. Another was that, when you're dealing with systems at scale, other things become important: fault tolerance needs to be dealt with as a first-class concept, because in any system that has that many moving parts, some of those parts are going to be failing at any point, and you need to be continuously available and working even in the face of those failures.
And finally, of course, performance is important. And so out of that sort of effort, building initially the metadata server and then wanting to implement it and needing something for it to sit on top of, we ended up building an entire distributed object layer that we call RADOS, which will replicate objects across thousands of nodes, and then, on top of that, building the parallel file system with the totally distributed metadata server, in order to actually solve that problem.
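The key property of that object layer is that placement is computed, not looked up. As a rough illustration only, here is a toy rendezvous-hashing placement function; Ceph's actual algorithm, CRUSH, additionally understands failure domains and weighted hierarchies, and the OSD names below are hypothetical:

```python
import hashlib

OSDS = ["osd.0", "osd.1", "osd.2", "osd.3", "osd.4"]  # stand-in cluster map
REPLICAS = 3

def _score(obj: str, osd: str) -> int:
    h = hashlib.sha1(f"{obj}/{osd}".encode()).digest()
    return int.from_bytes(h[:8], "big")

def place(obj: str) -> list[str]:
    """Rank OSDs by a deterministic per-object score and keep the top
    REPLICAS. Every client holding the same cluster map computes the
    same answer, so no central directory is consulted on each I/O."""
    return sorted(OSDS, key=lambda osd: _score(obj, osd), reverse=True)[:REPLICAS]

print(place("some-object-name"))
```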
Sage: And then, when I finished my dissertation work, I certainly thought that, you know, all I had to do was open-source this and people would start, you know, hacking on it, and then I'd be able to go do something else. And that's not exactly how it worked. What I learned in grad school is that all of my peers who worked on these great graduate-student projects, as soon as they finished, would go get a job at a NetApp or an EMC, and pretty soon they would stop working on these new, interesting systems that were open source by virtue of being research projects, and, you know, development effectively stopped.
There were no open source products that would compete with what you get from a NetApp appliance, which was prohibitively expensive for most people. And so I decided, instead of going to work for EMC and, you know, drawing a salary, to take the open source route and try to make this into something real from the open source community that would actually compete with the enterprise options, both on feature set and price and, of course, on principle. And so that's sort of where it sits today.
Since then, it's evolved from something that was purely a distributed file system to something that has, you know, a file system client in the Linux kernel, a FUSE client, a library client, also a fully featured distributed block layer, and also a distributed object storage layer. So, everything all in one, and it rules. That's all.
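For a sense of what the "library client" looks like in practice, here is a minimal sketch using the Python librados bindings; the config path and pool name are examples, and it assumes a reachable cluster:

```python
import rados

# Connect using a local ceph.conf (path and pool name are examples).
cluster = rados.Rados(conffile="/etc/ceph/ceph.conf")
cluster.connect()

ioctx = cluster.open_ioctx("rbd")          # open an I/O context on a pool
ioctx.write_full("greeting", b"hello")     # store an object by name
print(ioctx.read("greeting"))              # read it straight back

ioctx.close()
cluster.shutdown()
```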
Florian: So, to give our speakers a bit of an idea of how well each of the technologies is already known to the audience, we're going to quickly conduct a poll. Okay, here's how this works. This is something fantastic, because it's completely anonymous and private, because we don't want our speakers to know who is in which camp. It's called the humming poll. Okay: I'm going to ask you a yes-or-no question, and if your answer is yes, you hum, and if it's no, you stay silent. Okay!
So, let's test this real quick: who in here is currently physically in this room? Okay, universal agreement. Okay, who in here has traveled here from someplace outside the Australian mainland? Okay, that's less. All right, so that's how this works. So, question number one: who among you has previously used, in any fashion, whether in production or testing, whatever... who of you has used GlusterFS before? Okay. Who of you has used Ceph before? Who has used both? And who has used none?
Okay, there you go. Okay, so we have a rough understanding of who has used what, and how familiar you guys are with these technologies. Thank you for that. Okay, and since John Mark started with the introduction to the project, Sage now gets the first question. Now, would you please read out your t-shirt?
Sage: I'm kind of a nice person, so this is going to be a little challenging, I think. Part of that... I think one of the key differences you'll see between what GlusterFS offers and what Ceph offers is a design that is sort of born out of a long-term design effort, where you're taking a step back and asking: how would you actually build something that actually makes sense and solves the problem? Versus: how do I build something in less than six months that I can immediately move into production?
John Mark: We're big admirers of what they've built, and we are very happy for their success.
The reason to choose GlusterFS, as opposed to anything else, would be because of its simplicity, and because of the unified data fabric that we're building: whether it's the generic object-based interface that's based on the Swift API, or whether it's the new KVM integration. Doing that well allows people to host VMs on a GlusterFS cluster in a way that performs up to expectations and needs, with scale-out, and with the file system that we all know. These were designed to be, you know, a unified way to store your data in one place.
Sage: Yeah, so I think there are two things I want to say. First is that I think there are a lot of different use cases, and GlusterFS and Ceph are better at different use cases; they certainly have their strengths. So if I weren't using Ceph, I would probably be using GlusterFS, because I think the key thing that I don't want...
So, one of the things that GlusterFS does that Ceph doesn't is geo-replication, replicating across multiple data centers, and I've heard that it's based on rsync, and I have a hard time believing that you can have a geo-replication solution that actually gives you a consistent disaster-recovery backup if it's based on rsync. But maybe that's okay, I don't know.
John Mark: Rsync is sort of the actual copy mechanism, the device used to get the data over to the replica volume, but it's not the brains of it. In fact, we only use the delta-transfer part of rsync. Behind that we have a marker framework that keeps track of data changes and manages which changes are replicated over next.
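A minimal sketch of that division of labor, with hypothetical paths: a change-tracking layer decides what to ship (GlusterFS actually records this in extended-attribute markers, not mtimes as below), and rsync is only invoked to move the bytes:

```python
import os
import subprocess

MASTER = "/bricks/master"               # hypothetical source volume
REPLICA = "rsync://dr-site/replica"     # hypothetical remote target
last_checkpoint = 0.0                   # persisted after each successful run

def changed_since(root: str, since: float):
    """Stand-in for the marker framework: find files touched since the
    last checkpoint instead of re-scanning and re-hashing everything."""
    for dirpath, _, files in os.walk(root):
        for name in files:
            path = os.path.join(dirpath, name)
            if os.path.getmtime(path) > since:
                yield os.path.relpath(path, root)

def replicate():
    todo = [f"{MASTER}/./{p}" for p in changed_since(MASTER, last_checkpoint)]
    if todo:
        # rsync just moves the bytes; the selection above was the "brains".
        subprocess.run(["rsync", "-aR", *todo, REPLICA], check=True)
```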
Florian: I have one question here that is probably going to spark a little bit of disagreement. One of Linus's rather memorable LKML quotes is that you shouldn't be doing a file system in userspace for anything except toys. But we also know that Linus is, admittedly, not an authority on storage, so we shouldn't necessarily take his opinion as dogma.
B
So
and,
conversely,
there's
just
been
some
recent
benchmarks
from
the
glacier
camp
that
have
shown
that,
at
least
in
some
workloads,
cluster
Fest
is
actually
faster
than
Seth,
so
John
Mark.
How
would
you
dispel
such
bud
from
admitted
non
authorities
likeliness
and
sage?
How
would
you
shoot
down?
Who
would
be
an
outrageous
performance
data
reading
attempt
evil?
Enos
is
right.
Sage: All that being said, I think the whole in-kernel versus FUSE question is really just a red herring there. You can make perfectly usable systems that are based on FUSE, and you can make perfectly usable systems, obviously, that are based in the kernel. It really depends on what your use case is.
So there are weaknesses in the FUSE API, and there are certain corner cases where there are limitations, but FUSE has come a really long way since Linus made that comment, and even previous to that, so it's much less of an issue today than it was before. With regard to the performance information that Florian was alluding to, it's funny that the FUSE topic even came up in that discussion, because the performance numbers that were out of those baselines had absolutely nothing to do with the FUSE implementation.
Ceph today can't get anywhere close, and the reason for that is that Lustre has been used exclusively in the HPC community, where they cared about nothing except for the numbers, and so they've been relentlessly tuning it on both the network side and on the storage side for, you know, almost a decade, and we haven't really done any of that. So we have a long way to go. That being said, it's really the architecture: it's aimed at a completely different use case. In the Lustre architecture, there's no fault tolerance at all.
C
So
in
order
to
make
a
highly
available
system,
you
actually
have
to
buy
expensive,
like
raid,
arrays
or
sans.
It
will
make
them
to
multiple
hosts
and
then
have
some
like
IP
failover
hacked
make
them,
and
it's
just
and
then
you
have
the
best
expensive
networking
layer
and
it's
you
know
it's
a
it's.
A
very
expensive
project
actually
make
it
go
that
fast.
But
you
know
if
they
do
it
I
think
both
Seth
and
cluster
have
taken
the
alternative
approach
where
you
actually
want
to
build
something.
For certain applications, the file-system people have beaten the application developers over the head long enough that they actually write into large files, but most of the codes that those HPC folks are running are, like, decades old. They're written in, like, Fortran 77 or whatever came before that, and they actually do horrible things to the metadata server.
No one on Ceph right now is a networking expert, so we don't really have any business trying to write that right now. There are a number of people who are interested and keep talking about it, and I keep waiting for them to pony up a developer and actually do it. But I think this will be the year.
John Mark: The benefit to the community is that we can both make each other better. I mean, I know that there are things where I see Ceph, like, doing my thing, you know, and I think we push each other. And so that performance question: that came about because we had an engineer who got tired of hearing about how great Ceph was, and so he decided to, you know, push.
Sage: In order to get sort of strong coherence, you know, strong consistency, in the metadata layer... that's a huge investment in engineering and testing, and it's now in stabilization, and we have to actually make it, you know, reliable, so that people can be using it in production. I actually forgot what the question was.
John Mark: Originally we didn't rely on DHT; we had about nine different schedulers, some of them written by different developers, because the original developers were very much file system hackers, and we didn't push people toward any one scheduling system. It was over time that we solidified around the DHT solution, because it's the one that serves best for that kind of scale, in this case.
Okay, and just to the point here: there's a developer in Spain actually doing a new way of doing distributed replicated volumes, totally independent of our efforts, and so I'm curious about next year, when we have this discussion again, because it could be something special, especially for the network. It will be interesting to see what comes out of that.
Sage: If you use RAID... you can put Ceph either on top of raw disks, where you rely on Ceph's replication to deal with failure, or you can put it on top of a RAID array, which is essentially just a more reliable, bigger disk, but then you pay the twenty percent or whatever overhead for the RAID code. Currently we recommend just running on top of raw disks, as it's simpler to deploy, simpler to manage, and the performance tends to be better, but people do it both ways.
With Ceph, you can run the whole system on top of SSDs; a lot of people do that and they get great performance. If you can afford it, that's certainly a preferred solution. Some people run on disks only, and that's good, but the sort of sweet spot, as far as price/performance for Ceph goes, is to use a disk for the data and an SSD for the journal: that gives you fast, low-latency writes, but it also gives you the capacity of the disk.
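As a sketch of that layout in the ceph.conf style of that era (pre-BlueStore, so each OSD had an explicit journal; the device paths and sizes below are examples only):

```ini
[osd]
    ; object data lives on the spinning disk
    osd data = /var/lib/ceph/osd/ceph-$id
    ; journal on a small partition of a shared SSD for low-latency writes
    osd journal = /dev/ssd0-part$id
    osd journal size = 10240   ; MB
```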
Sage: ...the other system is the backup, and that's sort of how all of these systems are designed. Now, the dirty little secret of the whole distributed-systems community is that, you know, we all talk about eliminating single points of failure and designing around that and so forth, but there's one, you know, common element across the entire system, and that's the software. And so a single bug on any node can wipe out all your data.
Question: So on the one hand we have a small startup that determines the fate of the open source project in play, and on the other hand we have Red Hat, which, you know... I know that Gluster has invested in a formal governance model and everything, yet there are some conflicts of interest there, where Red Hat has been throwing its weight into OpenStack, and it has its own products, its own hypervisor, whereas Ceph is much more neutral. Could each of you defend your governance?
John Mark: That's such a really good point; I'm glad you brought it up. Because when I look at the things about Ceph that I admire, one of the things that they do is that their position really buys them the ability to do outreach with all these different communities and do the necessary integrations with all of them. I wish we were in a similar position. Having said that, yeah, I will defend the fact that we have a very lively, pretty open governance model, that we do a lot of...
E
We
make
sure
that
the
community
interests
are
our
part
of
the
process
and
defining
what
happens
in
the
project,
and
I
can
tell
you
that
you
know
when
it
comes
to
defining
features
and
the
next
version
of
muster
for
us,
we're
not.
You
know
working
out
over
the
Pyrenees
or
come
from
college
management.
It's
the
instigators,
didn't
get
a
deciding.
Sage: So, governance and licensing is something that I care a lot about. Part of the reason why I started Ceph was because I was very frustrated with the proprietary storage industry, and so I wanted there to be a viable open source alternative to that. And so the license for Ceph is LGPL, so it's copyleft.
C
You
have
to
contribute
your
changes
and
change,
change
them
and
so
forth,
but
at
the
same
time
that
it's
lgpl,
so
that
you
can
still
integrate
it
with
other
stacks
and
the
reason
for
that
is
because
we
view
the
storage
layer
as
a
as
one
piece
of
a
much
larger
stack.
You
might
be
integrating
with
something
like
a
dupe.
You
might
be
running
it
with
kvm
or
key
me,
or
something
like
that,
and
you
want
to
be
able
to
work
in
all
those
environments.
C
As
far
as
governance
code
goes,
we
don't
have
any
board
or
anything
I'm,
essentially
the
benevolent
dictator
for
the
for
the
project
I'm.
Currently
the
cake
eager
for
you
know
my
version
of
the
Seth
tree.
Obviously,
no
one
can
fork
it,
but
at
this
point
most
of
the
most
active
developers
on
Seth
all
work
for
ink
tank,
our
company,
and
so
are
our
development.
Efforts
are
somewhat
guided
by
sort
of
ink
tanks.
Gold
bring
it
to
market.
C
You
know,
OpenStack
gets
all
the
all
the
press
and
all
the
buzz
and
all
the
love,
but
it's
funny
that
CloudStack
is
actually
the
one
that's
been
deployed
in
all
the
largest
environments
and
then
they're
all
these
other
ones.
To
you,
there's
things
like
an
eddy
and
eucalyptus
and
all
these
other
projects.
So
I
don't.
I
don't
think
it
makes
any
sense
as
an
open
source
project
to
sort
of
choose
of
who
you're
going
to
appear
and
integrate
with
and
so
forth.
C
In
fact,
one
of
the
key
advantages
of
being
an
open
project
and
not
being
sort
of
a
close
source,
something
offered
by
vmware
or
something
is
that
you
actually
can
integrate
with
all
these
things
and
there's
sort
of
a
frictionless
integration.
You
know
the
the
classic
integration
first
set,
for
example,
was
contributed
by
somebody
in
other
ones.
They
didn't
even
doesn't
work
for
us
or,
and
he's
been
a
South
user
for
a
long
time,
but
he's
not
associated
with
ink
tank
at
all.
So.
Sage: It depends on your workload, yeah. What we've been typically deploying are nodes that have maybe, you know, 8 to 12 disks a node, you know, four to eight cores, maybe 16 gigs of RAM, something like that, and that seems to work well. But we have sort of limited data points as far as what workloads people are running on what hardware and, you know, what performance people are going to need. I think, as the project develops, it'll become more efficient, and so it'll run with less CPU as we continue to optimize it.
C
We've
we've
done
a
lot
of
work,
trying
to
get
it
to
run
on
some
of
these.
These
big
boxes
that
having
of
36
48
drives
and
then
and
trying
to
you,
know,
get
the
full
throughput
of
all
the
disks
and
the
problems.
We're
hitting
right
now
are
I
think
related
to
like
a
new
my
issues
and
asymmetric
memory
and
all
that
annoying
stuff.
That
I
don't
really
want
to
think
about.
John Mark: Are we recommending more than 16 disks per server? No, we don't recommend more than that, just because once you add more than that into the mix, I/O tends to get squelched. But you'd have to test it for your use case: it depends on whether, for your particular use case, you want to scale out or scale up. With scale-up we tend not to do as well, because we tried to build it to be as scale-out as possible.
We have some pretty good getting-started documentation, but once you start getting into the internals of GlusterFS and how to actually do more than just, you know, your standard deployment, we kind of fall short. So we definitely need to build that up. And if you know of any good documentation writers, send them our way.
Sage: The documentation for Ceph used to be horrible; now I think it's actually okay. I don't know, maybe other people will disagree. It's very hard to find good technical writers who actually understand what they're writing about, and it's also very hard to motivate developers to write documentation instead of code, and so documentation is actually one of the easiest ways to contribute to projects like GlusterFS and Ceph, yeah.
So that's sort of the raw level, the nuts and bolts of what you're doing: you're running a few commands and adding it in. One of the things that we're working on right now is sort of an automated provisioning infrastructure, so you would literally... essentially, you'd have a pile of spare disks in your data center.
That's sort of the high-level vision for how you manage, you know, an entire data center worth of spinning disks in sort of a scalable, efficient fashion, and we're sort of polishing the tools right now to make that the default way that people use it. Right now, people are still using sort of the old, previous generation of tools; we haven't quite transitioned to the new goodness yet.
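For reference, the "few commands" level he mentions looked roughly like this with the ceph-deploy tooling of that era (hostname and device names are examples, and exact syntax varied between versions):

```
# prepare and activate an OSD on a fresh disk
ceph-deploy osd create node1:sdb

# same, but with the journal placed on a shared SSD partition
ceph-deploy osd create node1:sdb:/dev/ssd1
```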
John Mark: The largest deployment... I think the largest deployment that I know of is somewhere between 10 and 20 petabytes. There's a life-sciences group at a major university; they've deployed it for genomic sequencing, and the last I heard it was over 12 petabytes and pushing 15 soon. Server-wise, there was a company at one time that had five to six hundred servers, but I think they had made significant modifications that got pushed back.
Sage: The largest deployment that I've worked with directly is the DreamHost DreamObjects one, and that's three petabytes. I've heard talk of larger clusters secondhand, but I haven't actually interacted with those clusters, so I don't know the actual numbers. What was the first question? Gotchas.
As you push the scale of these systems, regardless of how you architect them, you always run up against things that you didn't think of in terms of the implementation, and as you sort of continually push the line, you have to sort of iron out these wrinkles. And so we fixed a number of things as we built that initial three-petabyte cluster. I imagine when we build the next one, that's five or ten petabytes, we'll find more issues, but there's nothing that I'm terribly concerned about, I guess.
As far as gotchas, I mean, there are many ways to sort of misconfigure the system, I guess. If you don't choose your initial number of placement groups well in Ceph, it can be problematic, although the new version finally has PG splitting, and so you can fix it if you chose a number that's too small.
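For context, the rule of thumb in the Ceph documentation of that era was on the order of 100 placement groups per OSD, divided by the replica count and rounded up to a power of two; the pool name here is an example:

```
# e.g. 40 OSDs with 3-way replication:
#   40 * 100 / 3 = 1333  ->  round up to 2048
ceph osd pool create mypool 2048
```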
Sage: Ceph is made up of the RESTful object storage, the block device, and then also a distributed file system, and, ironically, the file system is the thing that we worked on first, that was where I began my sort of research, and it's the least stable, and the one part of Ceph that we don't recommend running in production just yet. That said, all the other parts are awesome and you should run them today; especially next to something like Swift, it's just so much better.
John Mark: We're all in on creating... we love the world of new technologies: with Swift, we like to push the object API, we're pushing the KVM integration, and we're doing live migration of VMs with GlusterFS. So we understand that to really be the storage of the future, we have to move beyond the scale-out-NAS bit, and we're working very hard to get there as soon as possible, I think.
Sage: I think the one thing to say, as a general point about file systems in general, is that when you look at sort of the marketplace, you have sort of the open source options and then you have the proprietary options. The problem that the proprietary solutions have is that they're stuck with standard protocols like NFS and CIFS, which are decades old and are designed around this basic client-server paradigm, and it's very hard to make something that is scale-out...
C
That
is
using
a
client-server
protocol,
because
the
client
thinks
it's
talking
to
one
person,
a
hundred
different
servers,
and
so
this
is
one
place
where
open
source
solutions
can
excel,
because
they
can
innovate
on
the
protocol
level
and
they
can
innovate
on
the
client
side,
particularly
in
the
linux
kernel.
Where
the,
where
the
proprietary
solutions
can't
we.
Florian: You guys have partially touched on this question just now. Just to step back and remind you of the point of the format: the idea is for you to identify, at quite a high feature level, a gap in the other technology, and then the rebuttal is to kind of come back and say where in your roadmap you are going to address that gap in your technology. So I...
Sage: So, for the RESTful APIs, for example: you know, we want to be compatible with Amazon S3, and listing a directory there gives you things in alphanumeric order. That's not how people interact with file systems, and I don't think we would ever want to do that, because we need to do things like hashing and so forth, and so it doesn't really make sense to enforce sort of the lowest common denominator of both APIs, yeah.
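To illustrate the S3-compatible side he's describing, a minimal sketch against a RADOS Gateway endpoint using the boto S3 bindings of that era (host, keys, and bucket name are placeholders):

```python
import boto
import boto.s3.connection

conn = boto.connect_s3(
    aws_access_key_id="ACCESS_KEY",          # placeholder credentials
    aws_secret_access_key="SECRET_KEY",
    host="rgw.example.com",                  # your RADOS Gateway host
    is_secure=False,
    calling_format=boto.s3.connection.OrdinaryCallingFormat(),
)

bucket = conn.create_bucket("demo-bucket")
key = bucket.new_key("hello.txt")
key.set_contents_from_string("hello world")

for k in bucket.list():                      # listed in lexical order,
    print(k.name)                            # as the S3 API specifies
```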
Florian: Thanks to Thomas, and then to John and Jenny and Stephen and Daniel, and to our two debaters for all their talking and everything. I'd like to thank Richard and everyone else involved. There's a wiki page; I'll get the slides from the people who want to give them to me and put them up there as well, so they'll be there soonish.