From YouTube: RH InkTank Ceph Day Sessions Kamesh Pemmaraju DELL
Description
Ceph Day Boston 2014
http://www.inktank.com/cephdays/boston/
All right, our next speaker up here comes from Dell. We've actually been working with Kamesh and his folks over at Dell for quite a while now, so we've had some really good interactions. What was our first big one, University of Alabama? Yeah. So there have been a lot of really cool things; these guys have been boots on the ground and done some really awesome stuff with customers on Dell hardware. So without further ado, I'll let Kamesh take it away.

Perfect. Thank you very much.
I also host and organize the OpenStack meetup in the Boston area. How many of you are from the Boston area? Local, great, perfect. In fact, every month we meet to talk about all the open source projects, OpenStack being the main focus for us. As a matter of fact, tomorrow is our next OpenStack meetup; we're going to be talking about Hyper-V. Peter from Microsoft is here, and he's going to talk about that.
So what am I going to talk about today? I spend a lot of time talking to customers; that's my job. I talk about open source cloud implementations, open source storage, and scale-out storage. I talk to enterprise IT and to service providers, so I've got a broad background of customer interactions, and based on that I've distilled some of the things I'm seeing in the industry. It'll give you an idea of how Ceph is getting adopted and in what areas.
And what use cases are still not there yet in terms of customer perceptions? The reference architecture work we've been doing with Inktank over the past couple of years helps you narrow down all the different things you can do with Ceph. I like to think of Ceph as a Swiss army knife: you can do a lot of different things with it.
B
It
can
fit
a
lot
of
different
use
cases,
but
then
you
can
quickly
get
lost
in
the
forest,
so
you
need
to
really
zero
in
on
what
use
cases
make
the
most
sense
and
then
build
reference
architectures
for
that
and
that's
kind
of
what
we've
been
doing
with
inktank
and
that's
not
enough.
You
also
need
to
come
up
with
okay.
Is
this
reliable
enough?
Does
it
have
the
performance
characteristics?
You know, we need to be able to use this stuff, right? And then at some point the organization, the CIO or director of IT or whoever, gets involved and says: okay, what's the budget for this? Why do we want to use this? Why do we care? What are the use cases we want to use this for? What applications do we care about? All those questions come up, and at some point you have to answer them.
B
You
have
to
answer
to
a
cio
or
to
your
director
of
I.t
or
your
executive,
whoever
it
is
you
need
to
make
those
cases,
and
I
have
some
advice
for
you
along
those
lines
and
then
there's
a
survey
by
the
way.
Ceph
is
the
second
most
I
believe,
the
most
popular
backend
for
openstack,
according
to
use
user
service,
lvm
being
the
the
number
one,
the
the
linux
volume
manager.
So
again,
what
what
are
you
looking
to
build
out
of
this?
You
know
what
is
the
initial
storage
capacity?
B
Are
you
looking
at
steady
state
data,
or
are
you
looking
at
spike
data
usage
at
certain
times
of
the
year?
What
is
the
expected
growth
rate,
so
I
I
guess
you
have
to
at
some
point
once
you
feel
comfortable
with
the
technology.
you start thinking through your pilot and pre-production environments, and ultimately it's all about workloads, use cases. Is it more capacity focused? Are you looking at archival or backup use cases, or are you looking at high performance? You want to put a database on Ceph; is that a good idea?
Well, I'm going to talk about some of that. What type of data is involved? Is it ephemeral data in a scale-out, multi-tenanted cloud environment, or is it persistent data that you want to keep as VMs come and go? Is it object, block, or file? Lots of considerations, so we're going to talk about some of that. Like I said earlier, Ceph is like a Swiss army knife: it can be tuned to a wide variety of use cases.
Let's look at some of them. This is a breakdown that I use. You have two different target users, if you will: traditional IT, which uses traditional SAN and NAS devices, virtualization, and private clouds built on Microsoft or VMware; and then you have cloud applications. Think of them as legacy and modern, in one way of looking at it. Another way of looking at it: you have XaaS (infrastructure as a service, storage as a service, platform as a service),
or you're looking for a compute cloud like Amazon or OpenStack. So you've got both of those types of users. Now, are you looking at capacity or performance? Those are the two things you need to think through. Ideally, Ceph fits right there in the middle, although I can make a stronger case, because I've seen some very compelling performance numbers where you can even replace a traditional SAN with Ceph. In fact, within Dell this is a big, big argument
going on within our storage group. We have Compellent and EqualLogic, which are traditional SANs, just like NetApp and EMC have. So they're saying: hey, what's going to happen to our technologies? Is Ceph going to replace them in the long term? We're having those discussions today, and it's not an easy answer right now. Ceph is maturing; it's good for certain use cases, like dev/test, and I'll talk about what those are. But I think that middle ground is a nice sweet spot for Ceph.
B
I
wouldn't
really
recommend
ceph
for
very
high
performance.
You
know
emc
and
netapp
type
of
storage
devices
where
you
put
mission,
critical,
high
performance,
databases
and
stuff,
like
that,
not
right
now,
but
that's
sort
of
a
nice
target
because
it
fits
a
lot
of
different
use
cases
all
right.
So
the
use
cases
I'm
going
to
talk
more
about
openstack,
because
that's
kind
of
where
my
my
main
focus
has
been
and
where
I'm
seeing
a
lot
of
customer
demand.
Okay, there we go. If you look at this one, we have the content store. You can look at Ceph as an object store similar to Swift, and you can also look at it as a backend for OpenStack.
So let me talk about some of the specific use cases for Ceph that we're seeing, and some that I think Ceph is good for. The main one is OpenStack, obviously, and Ceph has both Swift API compatibility and Cinder API compatibility. Cinder, by the way... how many of you are familiar with OpenStack?
B
Okay,
quite
a
few
of
you
great.
So
the
sender
is
the
component
for
these
for
up
for
volumes
and
block
storage.
So
the
ceph
block
device
is
what
you
would
use
to
interface
with
with
the
ceph
cluster
through
the
sender
api.
So
we've
got
the
volume
interface
as
well
as
the
swift
object
interface.
So
these
we're
seeing
a
lot
of
object
right
now
with
in
terms
of
customer
traction
and
they're,
using
it
for
both
ephemeral
storage,
as
well
as
snapshots
copy
and
write
and
volumes.
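As a concrete illustration of what the Cinder RBD driver does under the hood, here is a minimal sketch using the python-rbd bindings. The config path, pool name, volume name, and snapshot name are assumptions for illustration, not from the talk:

```python
import rados
import rbd

# Connect to the cluster; config path and 'volumes' pool are assumptions.
cluster = rados.Rados(conffile='/etc/ceph/ceph.conf')
cluster.connect()
ioctx = cluster.open_ioctx('volumes')

# Create a 10 GiB format-2 image with layering enabled, roughly what the
# Cinder RBD driver does for a volume (format 2 is required for cloning).
rbd.RBD().create(ioctx, 'vol-0001', 10 * 1024**3,
                 old_format=False, features=rbd.RBD_FEATURE_LAYERING)

# Snapshot the image and protect the snapshot so it can be cloned later.
image = rbd.Image(ioctx, 'vol-0001')
image.create_snap('snap-0001')
image.protect_snap('snap-0001')
image.close()

ioctx.close()
cluster.shutdown()
```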
Now, an interesting benchmark that our storage team did: it turns out that if you put ephemeral storage on the Ceph cluster instead of on local disk, it's actually faster in terms of throughput, which was kind of a surprise to me. But it turns out it's actually faster if you just use ephemeral storage on Ceph. The other reason you would want to use Ceph for Cinder is that you can also have your images hosted on the Ceph cluster.
B
What
happens
typically
in
in
openstack
is,
if
you
don't
have
a
a
surf
cluster
like
this.
The
image
gets
downloaded
from
glance
onto
the
local
node
before
it
boots
up.
So
there's
a
there's,
a
overhead
in
terms
of
moving
the
image
over,
but
if
it's
on
ceph
it's
right
there
in
the
cef
cluster,
so
your
boot
up
times
are
a
lot
faster,
so
you'll
get
some
benefits
from
from
using
ceph
for
both
glance
and
and
and
nova
as
well
as
cinder.
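The fast-boot behavior comes from copy-on-write cloning: instead of copying the image over, the new boot volume is cloned from a protected snapshot of it. A minimal sketch with python-rbd, where the 'images' and 'volumes' pools and the image/snapshot names are hypothetical:

```python
import rados
import rbd

cluster = rados.Rados(conffile='/etc/ceph/ceph.conf')
cluster.connect()
images = cluster.open_ioctx('images')    # where Glance keeps images (assumed)
volumes = cluster.open_ioctx('volumes')  # where the bootable volume lands (assumed)

# Clone a new volume from a protected snapshot of the Glance image.
# No data is copied up front, which is why boot times drop so sharply.
rbd.RBD().clone(images, 'fedora-20', 'glance-snap',
                volumes, 'vol-boot-0001',
                features=rbd.RBD_FEATURE_LAYERING)

volumes.close()
images.close()
cluster.shutdown()
```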
In the case of OpenStack, the Ceph platform is obviously certified against the Red Hat stack, which is an advantage for you. Now, the other use case is pure cloud storage. If you want to use Ceph as a pure object store, like a Swift cluster, you can use the object gateway with the RADOS cluster as the backend object storage.
The third use case is web-scale applications, and here's an interesting thing. With the first two, you go through the gateways and the APIs to get to the actual cluster. But if you want very, very high performance (in fact, I think the performance difference between the gateway route and the direct route is an order of magnitude), you go directly to the Ceph RADOS cluster using the native protocol. That gives you an enormous amount of flexibility in terms of performance, scale, and multi-tenancy.
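"Native protocol" here means librados. A minimal sketch with the python-rados bindings, assuming a hypothetical 'app-data' pool and the default config path:

```python
import rados

# Connect directly to RADOS: no gateway or HTTP layer in the data path.
cluster = rados.Rados(conffile='/etc/ceph/ceph.conf')
cluster.connect()
ioctx = cluster.open_ioctx('app-data')

ioctx.write_full('greeting', b'hello ceph')   # store an object
print(ioctx.read('greeting'))                 # read it back
ioctx.set_xattr('greeting', 'lang', b'en')    # attach application metadata

ioctx.close()
cluster.shutdown()
```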
So now we can set up a performance block configuration just like that: you can use a cache pool and a backing pool for your volumes. For read/write workloads you can run the cache in writeback mode, and for reads you can use the cache in read-only mode, so both modes are available for the performance block. And as Sage mentioned earlier, the erasure coding feature, which is already there right now in ICE 1.2, can be used for cold storage. First of all, that'll save you money, because you don't have to spend as much on storage: it cuts down the number of storage nodes you need for the same amount of capacity.
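For reference, here is roughly what wiring a cache pool over an erasure-coded backing pool looks like with the Firefly-era ceph CLI, driven from Python. The pool names, placement-group counts, and k/m values are illustrative assumptions:

```python
import subprocess

def ceph(*args):
    """Run a ceph CLI command, raising if it fails."""
    subprocess.check_call(('ceph',) + args)

# Erasure-coded backing pool: 4 data chunks + 2 coding chunks per object.
ceph('osd', 'erasure-code-profile', 'set', 'cold-profile', 'k=4', 'm=2')
ceph('osd', 'pool', 'create', 'cold-pool', '128', '128', 'erasure', 'cold-profile')

# Replicated cache pool layered in front of it.
ceph('osd', 'pool', 'create', 'cache-pool', '128')
ceph('osd', 'tier', 'add', 'cold-pool', 'cache-pool')
ceph('osd', 'tier', 'cache-mode', 'cache-pool', 'writeback')  # or 'readonly' for read-mostly data
ceph('osd', 'tier', 'set-overlay', 'cold-pool', 'cache-pool')
```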
And you can use the cache pool to basically increase your read performance, because cold storage is a good read-mostly use case. And then, like I said, for databases, again go native protocol, because the way to get performance is using the Ceph block device and the native protocol.
You can in fact put databases on Ceph. Now, I wouldn't recommend this yet; I would highly recommend you do your own performance testing against your own reference architecture before you go down that road. But the thing I'm trying to point out is that there are ways you can use Ceph for very different use cases, because it supports them through these different protocols. And then finally, Hadoop: as Sage mentioned, once the Ceph file system is ready for production, it's a great replacement for HDFS.
If you have 2x, 3x, 4x redundancy, there's a cost trade-off, so it's all use-case dependent. That's one trade-off you need to look at. The second one is all about failure domains. There are lots of different failure domains to think about: disks can fail, your SSD journals can fail, the entire node can go down, your entire rack can go down, or a complete site can just disappear because of a natural calamity.
Each of these failure zones is something you can map out when you're designing your Ceph cluster, through your CRUSH configs, and that way you can start to place disks and nodes in different racks, so in case one rack goes down, you still have another rack to take care of it.
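As a sketch of what that CRUSH mapping looks like in practice (the bucket, host, and rule names here are assumptions), you declare racks in the hierarchy, move hosts under them, and create a placement rule whose failure domain is the rack:

```python
import subprocess

def ceph(*args):
    """Run a ceph CLI command, raising if it fails."""
    subprocess.check_call(('ceph',) + args)

# Declare two rack buckets and hang them off the default root.
ceph('osd', 'crush', 'add-bucket', 'rack1', 'rack')
ceph('osd', 'crush', 'add-bucket', 'rack2', 'rack')
ceph('osd', 'crush', 'move', 'rack1', 'root=default')
ceph('osd', 'crush', 'move', 'rack2', 'root=default')

# Move storage hosts under their racks.
ceph('osd', 'crush', 'move', 'node1', 'rack=rack1')
ceph('osd', 'crush', 'move', 'node2', 'rack=rack2')

# Placement rule that spreads replicas across racks, so losing a whole
# rack never takes out every copy of an object.
ceph('osd', 'crush', 'rule', 'create-simple', 'replicate-per-rack', 'default', 'rack')
```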
And then think about the storage pools. Do you want a high-performance SSD pool, which will give you the block performance I mentioned earlier, or is it just capacity you're looking for: large-scale, low-cost? We call it cheap-and-deep storage, just large archival capacity for your object store. Then you need to look at a capacity pool, with erasure coding as a potential way of reducing costs. And the monitor nodes: you have to think about their failure domains too. You don't want them all in the same failure domain, because if that goes down, all your monitor nodes go down, in which case you don't have access to your cluster. So consider all these failure domains.
That's kind of the rule of thumb for that. If you can, use SSDs for journaling, which will speed up your write performance, and then of course you have tiering, which is in Firefly, which allows you to use both SSD pools and backing pools, so you can have both of those things. But think about what happens when your SSD fails. We were having a big, big discussion internally about that.
Do you want to make your SSDs redundant, meaning do you want to use some kind of RAID 1 or mirrored configuration for them? At the end of the day, we came to a nice compromise, in which we said you need a five-to-one ratio between your actual data disks and your SSD journal devices. That way, if you have ten disks, then you effectively need two SSDs in a single node.
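A trivial sketch of that sizing guideline (the function and constant names are mine, not Dell's):

```python
import math

DATA_DISKS_PER_JOURNAL_SSD = 5  # the 5:1 rule of thumb from the talk

def journal_ssds_needed(data_disks: int) -> int:
    """Journal SSDs a node needs under the 5:1 guideline."""
    return math.ceil(data_disks / DATA_DISKS_PER_JOURNAL_SSD)

print(journal_ssds_needed(10))  # -> 2, matching the example above
```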
That gives you enough capacity to take care of those things. With erasure coding, as I mentioned already, you can get a lot of additional storage benefit, but at the expense of compute. Sage had a slide showing what happens to recovery times when you use erasure coding: they go up dramatically, and it has a tremendous impact on compute. So you need to think about that.
For extra capacity, you can think about JBOD expanders. Dell, for example, has these MD3060-series chassis: 90 disks in a chassis. At four terabytes per disk, 90 times four, you're talking about almost half a petabyte in a single chassis full of JBODs, just a bunch of disks, which you can hang off of a storage server to get this extra capacity. Huge capacity.
If all you're interested in is cheap storage, that's the way to go, but you have to be careful, because you will be adding extra latency: you're oversubscribing your SAS lanes. So we have come up with an RA that takes that into consideration; we've done some testing along those lines. Monitor nodes: you need an odd number of them for quorum, and the monitor services can be hosted on a storage node itself.
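The odd-number guideline falls out of the monitors' Paxos quorum, which needs a strict majority to stay up; an even count tolerates no more failures than the odd count just below it. A small sketch:

```python
def tolerable_monitor_failures(n_monitors: int) -> int:
    """Monitors that can fail while a strict majority (quorum) survives."""
    return (n_monitors - 1) // 2

for n in (1, 2, 3, 4, 5):
    print(n, '->', tolerable_monitor_failures(n))
# 1 -> 0, 2 -> 0, 3 -> 1, 4 -> 1, 5 -> 2: even counts add no fault tolerance.
```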
B
You
don't
need
dedicated
nodes
for
for
monitor
nodes
if
it's
not
a
very
large
cluster,
but
a
fairly
you
know,
between
10
to
20
is
sort
of
the
guideline.
If
you
have
20
nodes,
let's
say
each
node
is
about
30
to
40
terabytes,
then
you're,
okay,
you'll
still
be
fine.
Having
your
monitor
nodes
hosted
on
the
storage
node
itself,
but
if
you're
going
beyond
that,
we
we
recommend
having
dedicated
nodes
right.
So
a
dedicated
node
just
for
running,
monitor
services,
keep
in
mind
that
monitors
are
relatively
quiet
for
most
of
the
time.
If it's a stable cluster, your monitor node is not really doing anything; it's sitting there idle. So keep that in mind. And if you're doing geo-distribution and you want replication across sites, then what you want is dedicated RADOS Gateway nodes for large object stores that span multiple sites; you use dedicated RADOS Gateway nodes and federation between the sites to keep them synchronized.
Networking is a huge, huge thing you have to think through right from the beginning, because guess what: we have had several customer installations where we had all these conversations with the customer, we were ready to go deploy the thing, and all of a sudden, at the last minute, the networking and security teams come up and say: what are you guys doing? So we had to go back to the drawing board, effectively, and say: okay, now we're going to connect this to your back end.
B
Do
you
want
network
redundancy,
meaning
multiple
switches,
multiple
links
going
to
your
servers,
or
are
you
okay,
with
with
one
single
switch?
So
that's
another
thing
to
think
about.
Do
you
want
dedicated
client
nodes,
client
networks
and
dedicated
data
networks?
Again,
it
depends
on
your
use
case
how
much
traffic
you
think
you're
going
to
push
through
your
data
network
so
think
through
that
one
gig
versus
10
gig
versus
40
gig,
some
customers,
in
fact
a
big
telecom
customer
based
in
the
boston
area.
They have a video-on-demand application, and they were talking about tens of thousands of users on all kinds of different form factors of end devices streaming video, with the system doing translation and transcoding and all kinds of things on the fly. They were doing some performance testing of that. It just goes to show that Ceph can be used in all kinds of different environments.
Now, the first use case, the case study with UAB: this was back before Dumpling, by the way. Their main data center was in Birmingham (the University of Alabama at Birmingham), and they wanted to have a backup facility in Huntsville, which is about 100 miles from there. They had a dedicated WAN link, but when we did the actual testing we found the latency was not good enough, and at the time Inktank didn't have the replication feature. Because, as you know, Ceph is consistent; it's not eventually consistent.
B
It
is
actually
consistent.
So
it
requires
your
all
of
your
notes
to
be
on
a
pretty
high
network
with
low
latency.
So
keep
that
in
mind.
I
think
the
newer
things
that
that
are
coming
out
in
ice,
2.0
and
3.0
will
solve
some
of
these
problems.
But,
right
now
our
recommendation
is
stay
within
a
data
center.
B
So
what
we've
been
doing?
At
dell
we've
been
working
with
a
number
of
distributions.
As
you
probably
know,
we
started
off
with
canonical
distribution.
We've
been
working
with
susa
and,
more
recently,
with
with
red
hat
we've,
been
building
reference.
Architectures
testing
out
these
solutions
on
on
these
different
distributions.
For
a
while.
This
is
the
latest
version.
That's
coming
out.
In
fact,
our
first
dell
red
hat
openstack
solution
was
announced
at
the
red
hat
summit
in
april,
so
this
one's
coming
out
in
in
summer
so
effectively.
Effectively, what it is is a pilot configuration. If you want to start with OpenStack and Ceph, this is the best way to get started, because it's effectively OpenStack and Ceph in a box, completely pre-configured and pre-tested. It's got all the servers you need: Dell PowerEdge R720xd machines as the storage servers, 10-gig networking, Red Hat Enterprise Linux OpenStack Platform, and ICE 1.2, all built in, ready to go, pre-tested. You get up to 36 terabytes of raw storage in this box.
B
In
this
it's
it's
actually
three
quarters
of
a
rack.
It's
not
even
a
full
rack.
So
if
you,
if
you
want
more
storage,
you
can
throw
in
more
more
of
these
storage
bundles,
which
is
kind
of
an
expansion
feature
for
for
this
pilot
bundle.
So
this
gives
you
228
virtual
machines,
36
terabytes
of
raw
storage,
which
you
can
use
as
either
as
volumes
back-end
volumes
to
openstack
vms
or
just
as
a
object
store.
You
can
do
that.
B
It's
got
an
openstack
manager
and
openstack
controller,
all
built
in
with
so
two
additional
nodes
for
so
this
is
coming
out
in
summer.
This
comes
with
a
reference
architecture,
so
all
the
considerations
that
I
just
mentioned
are
have
we've
been
having
lots
of
internal
debates
and
discussions
both
with
inktank
and
red
hat
and
and
the
dell
storage
team
and
we've
finally
come
up
with
a
solution
that
we
think
is
is
optimal
for
a
wide
variety
of
use
cases,
and
that's
what
that
is
right.
If you're looking at a performance use case, say you want to put a database on this, or you want to use it as a test/dev environment for builds, testing, and scale-out testing, the performance server would be your best choice. It's got SSDs. SSD pooling isn't introduced yet (that's coming in version two), but this gives you SSD journals and 40 terabytes of data in a single server, 120 terabytes across three servers. So it's pretty dense; you get a lot of stuff there.
B
If
you
want
to
go
capacity,
then
we
have
these
md
j
bar
chassis.
I
was
talking
about
so
this
md
3060
or
md
1200
they're.
Basically,
these
large
j
bar
chassis
with
lots
of
discs
in
it
right.
So
you
can,
you
can
build
them
out
again
think
through
and
the
the
configurations
will
make
sure
that
you're
not
over
subscribing
your
sas
lanes
or
your
latencies
are
getting
affected,
etc.
But
we
are,
we
have
done
some
internal
testing
to
make
sure
it
fits
the
needs
of
that
particular
use
case.
B
So
what
are
we
doing
to
enable
dell
and
red
hat
specifically?
This
is
the
red
hat
solution.
We
have
a
similar
solution
with
susa
and
susa
has
been
testing
ceph,
as
well
with
with
our
solution
with
dell
hardware.
So
that's
also
available,
but
specifically
with
red
hat.
It's
a
validated
ra
co-engineered
between
red
hat
inktank,
all
three
parties
effectively
we've
been
working
with
inktank
for
over
two
years
and
then
the
red
hat
acquisition
happened
and
we've
been
working
with
red
hat
too
for
the
last
six
months.
So
it's
a
nice.
B
You
know
marriage
made
in
heaven
so
to
speak,
so
we
have
pre-configured
storage,
bundles
storage,
enhancements,
certification,
professional
services
all
nicely
bundled
together.
So
you
get
everything
out
of
the
box
ready
to
go
so
with
that,
I'm
going
to
jump
into
the
case
study
so
university
of
alabama.
What
was
their
biggest
challenge?
So
they
do
a
lot
of
great
research
in
cancer
and
genomic
research
right.
This
is
kind
of
their
focus.
They
have
close
to
900
researchers.
B
Their
biggest
challenge
was
data
sets
right.
So
they
had
this
research
data
effectively
scattered
across
all
kinds
of
devices.
Usb
sticks,
you
know
hard
disks,
desktops
laptops.
It
was
all
over
the
place,
they
had
a
compliance
issue
and
they
had
this
problem
of
managing
the
demand
of
these
researchers
right.
So
so
the
data
was
at
risk.
Productivity
was
going
down
because
people
are
not
finding
the
data
that
they
were
looking
for.
B
They
badly
and
desperately
needed
a
centralized
repository
for
the
data,
mainly
for
compliance
and
a
for
the
demand
that
was
coming
up,
and
this
is
how
their
system
looked
like.
It
was
a
mess
right.
They
had
all
kinds
of
grids,
they
had
local
servers,
they
had
prototype
cloud
set
up
with
all
kinds
of
you
know:
open
source
clouds.
They
were
using
cloud
stack,
they
had
an
openstack
thing
going
on,
they
had
virtualization.
B
I
mean
this,
I'm
sure
this
a
lot
of
typical
enterprise
data.
Centers,
look
somewhat
like
this.
It's
not
very
unusual,
in
fact,
when
we
talk
to
the
other
other
universities,
I
I
can
draw
exactly
the
same
picture
and
it'll
look
the
same
across
the
board
right.
So
this
was
their
situation
and
they
were
using
hpc.
I
mean
at
the
end
of
the
day,
this
is
an
hpc
application.
B
They
were
doing
genomics,
you
know
large
compute
performance
type
stuff
and
they
were
pushing
all
the
stuff
into
the
hpc
cluster,
which
is
of
course
connected
on
infiniband
and
they've
got
all
that
stuff.
They
do
all
their
processing,
put
all
the
storage
in
hpc
storage
and
then
pull
it
back
into
their
laptops
for
doing
some
local
processing.
So
this
is
kind
of
how
it
looked,
and
it
was
all
on
one
gig
network
right.
The
the
interface
to
the
hpc
was
one
gig
network,
which
is
slowing
them
down
lots
of
challenges.
B
So
the
solution
was
a
scale
out
storage
cloud
and
openstack
ceph
and
crowbar,
which
was
our,
which
was
dell's.
Deployment
tool
really
came
to
the
rescue,
so
they
were
able
to
house
and
manage
a
centrally
accessible
across
the
campus
network
storage
cloud
file
system
clusters
can
grow
as
big
as
they
can,
because
it's
all
centralized
it
can
be
provisioned
from
this
one
single
massive
pool,
as
opposed
to
being
distributed
all
over
the
place.
So
it
helps
a
lot
with
compliance.
B
400
terabytes
they
put
400
terabytes
into
production
when
they
started
off
and
they
did
the
math
and
it
came
out
to
be
41
cents
per
gigabyte
per
month,
which
is
actually
pretty
darn
good,
because
if
you
look
at
it's
comparable-
and
I
don't
know
if
it's
comparable
with
amazon,
but
it's
it's
pretty
darn
good,
very
nice
cost
structure
there
and
they're
looking
to
scale
up
to
five
petabytes
over
the
next
year
or
two.
So
the
researchers
were
very
happy
because
now
they
have
much,
they
can
work
with
much
bigger
data
sets.
At the end of the day, they were so happy with this initial deployment that they're now looking to expand it. They were using it for other things too, like research storage and CrashPlan backups, and they were doing GitHub hosting on POCs, etc.
So this is how it looks today. There's a cloud services layer; it's all Ceph right now. They have this virtualized server and storage computing cloud based on OpenStack, Ceph, and Crowbar, which is a deployment tool (we can talk offline about that) that helps you deploy these massive clusters automatically. Doing all this by hand is not easy, trust me, because there are a number of considerations, all the way from configuring
B
Your
servers,
your
networks,
getting
all
your
openstack
services
up
and
running
all
your
ceph
services
up
and
running
tying
it
all
together.
It's
it's
not
an
easy
thing
to
do
so.
Crowbar
is
incredible.
Other
tools
too,
like
foreman
and
a
number
of
other
tools
can
do
similar
stuff,
but
that's
a
key
component
of
any
deployment
infrastructure.
So
this
is
what
it
looks
like
they
have
a
number
of
ceph
nodes.
So that's what it looks like today, and they want to expand this beyond just data management. They want to do scientific computing; in fact, they want to host HPC on OpenStack, which is awesome, but I'm kind of scared of that. We'll see how it goes; we're working with them on that. They do want individualized, customized dev/test environments on OpenStack, which is a great use case.
B
It
works
well
all
of
most
of
our
customers
do
that
and
they
want
to
start
integrating
shareware,
open
source
and
other
commercial
software
into
this
it'll
it'll,
be
it's
an
ongoing
exercise.
It's
probably
going
to
take
them
two
or
three
years
before
they
get
all
this
integrated.
But
their
vision
looks
somewhat
like
this,
so
eventually
they
want
to
have
this
cloud
services,
layer,
openstack
and
openstack,
doing
all
their
infrastructure
as
a
service
in
terms
of
providing
virtualized
server
and
server
resources
to
their
researchers,
as
well
as
an
enterprise.
B
I
t
and
then
the
ceph
nodes
effectively
playing
two
roles.
One
is
the
storage
as
a
service.
This
is
the
first
thing
they
started
off
with
and
then
being
a
backup
store
or
a
volume
store
for
for
the
openstack
nodes,
so
both
of
those
and
eventually
also
connecting
all
to
their
hpc
cluster.
So
this
is
the
vision
it's
going
to
take
them
a
couple
of
years,
we're
working
very
closely
with,
with
the
customer
over
the
last
two
years,
they're
very
happy.
In
fact,
we
did
a
couple
of
sessions
at
the
openstack
summit.