From YouTube: C* Summit 2013: Hindsight is 20/20. MySQL to Cassandra
Description
Speaker: Michael Kjellman, Software Engineer at Barracuda Networks
Slides: http://www.slideshare.net/planetcassandra/c-summit-2013-hindsight-is-2020-mysql-to-cassandra-by-michael-kjellman
A brief intro to how Barracuda Networks uses Cassandra and the ways in which they are replacing their MySQL infrastructure with Cassandra. This presentation includes the lessons they've learned along the way during this migration.
A: People in the back, come on. Oh, how are you? How far are you going to spread out here? So, if you notice that in the back, did you notice that from yesterday? That's right. So who took it? Nobody here? Okay! Well, if I find out who did, there's a case of beer for you in the back. Actually, there's two cases: you want blonde or brown? We're going to have a contest. I don't know if you saw it on Twitter, but to get a case of beer... because we love beer, right? Speaking of beer, I think we need to get Michael one. So, Michael Kjellman: he's been a friend of mine for a while here in the Cassandra community; he's always on IRC. He has the notoriety of being the only guy who went to 1.2.0 in production the day it was released. He's the man. So he's going to give us a really, I think, poignant talk here on switching from MySQL to Cassandra. There's a lot of perils, a lot of good stuff.
B: Alright, well, I'm here today to talk to you about our transition from MySQL to Cassandra. I work for Barracuda Networks, and I'm glad you've all made the decision to listen to me blather for the next hour; hopefully I keep you all awake. So anyways, if you don't know what Barracuda is: we make, excuse me, network security appliances, and we also do cloud solutions. We are moving towards storage with backup products, and we've always been a MySQL house.
We have a lot of boxes that are single, monolithic MySQL, you know, waiting disasters, and I'm sure there's many people here in the same boat. I guess the point I want to make is that I don't want you to think that I stand up here claiming to be any better than many of you in the audience. I want to be honest about our problems, about what problems we had and how Cassandra actually made us a better company. So: I work primarily in Java, Perl, and C. You can laugh at Perl.
It seems to be the new thing to do. And I wrote perlcassa. We have a lot of legacy Perl code, and so it's pretty important for us to make sure that older engineers, maybe, who are still comfortable in Perl and don't want to learn other languages, are able to work with our new database. I have binary protocol support implemented, but not load balancing yet; it's committed, and hopefully I'll get that done. I just couldn't get it done between the launch and preparing for this talk.
So anyways, here's our Cassandra cluster. One of the advantages of being an appliance company is that we can actually make our own hardware. So we've, you know, gone through a lot of iterations on this, and we found something that I think actually works really well. We've got five different models that we can use depending on the amount of I/O we need, IOPS, etc., and it's actually pretty cool what we can do. This is one of our clusters; as you can see, they're all 1U boxes, nothing special.
We are running 1.2.5, which is the current release, plus patches. As Patrick said, I tend to live on the dangerous side. We have 24 nodes in two data centers, 12 in each, and each node has two two-terabyte hard disks. One time, back in the day, I had some problems where some nodes accidentally got a little overloaded with capacity during some repair operations and I almost just ran out of space, so I pledged I would never run out of space again.
So I put two-terabyte disks in just for that reason. We have one small SSD for hot column families; after 1.1, you can now pin hot column families to an SSD. 64 gigs of RAM. We use, excuse me, Puppet for management, and Cobbler, which was a Red Hat project that the Ubuntu guys actually now work on as well, and it does sort of PXE management.
So we can actually take our bare metal in our data center and go all the way through, remotely, to bootstrapping into Cassandra without me actually being in the data center, which is a huge, huge, huge, you know, change from the way you operate operationally with MySQL. And I target load at about 600 gigs a node.
So what is real time? I would make the argument that there's no such thing as real time. Obviously there's, you know, music processing and stuff like that, but what is your company's definition of real time? How fast do you need to be? And so we had a problem where we weren't real time enough. You know, is it one...?
This one's got a nice older couple and free diabetes supplies. So when this gets through, you know, I feel like I failed as an engineer, right? And spam's hard, right, because they're always one step ahead of me, always, of our whole team, and after years you sort of have to accept that fact or not. But you can see there's a fake unsubscribe link at the bottom that they want you to click on, which now validates your email. They've got this nice link at the top.
So there's a lot of attributes in here that we can use to figure out if it's good or bad. And so, you know: how fast can we take this piece of email, how fast can we determine if it's good or bad, and how fast can we then block it for our customers around the world? And so, you know, we had to rewrite. It got to the point that MySQL was a huge point of pain. And here's some numbers: we actually decreased our latency to about 2.41 milliseconds.
This morning I woke up and we were now averaging under two milliseconds, which I think is pretty freaking good, and we have about 33 million elements in the database. That compares to about 4 million in MySQL; we actually started hitting pain points with MySQL around 2 million. And I wouldn't say that with MySQL you can name the actual number of rows where it breaks; it's really your write pattern, and unfortunately past decisions sort of made our write pattern...
...you know, pretty high. And so we did what every company does, right? We put up a bunch of read slaves. All right, well, now we've got replication lag, because we're doing so many inserts per second that, you know, the slaves can't catch up, and now everything's miserable, and we had problems with insertions, etc. And then another important point here, too, is that our old code, because of the way it was written, could only serve about 314,000 elements in real time to our customers.
Out of the potential 33 million. And, you know, now we can serve all 33 million to our customers, so that's a pretty strong proposition there as well. Tracking: we have all these thresholds built in, and I'm sure many other people do this as well, where, you know, maybe we'd have to have 15 appliances in the field...
...ask us about a particular domain or, you know, piece of email before we'd even start looking at it, simply because MySQL could not deal with the number of transactions if we opened the floodgates to every single possible email. We're in two data centers now, so obviously that's a huge benefit: we can be better to our customers if we lose a data center. Everyone here knows you're going to lose a data center; it's just a matter of when. And then this is the huge one.
We went from eight minutes on average for classifying a domain to three seconds; we take an action now in three seconds. And so this is actually an article that Jonathan Ellis tweeted: "How to survive a ground-up rewrite without losing your sanity." It's about Joel Spolsky, the Stack Overflow co-founder, who was also at Microsoft, and I'd really recommend reading this article. It was an article I wish I had read before I started writing. Pretty much the proposition was that they had two rewrites they were doing.
They were a .NET, MS SQL house, and one of the rewrites went really well, but there was a business proposition behind it. Sales had actually turned around to them and said: hey, look, if a customer is over this number of transactions per second... they did sort of a New Relic type thing with application monitoring... if that customer's application was doing a certain number of transactions, you can't sell to them, because it's actually going to bring down the entire system for everyone. And so they couldn't sell anymore, and so they actually had to rewrite that.
They were saying that, you know, just to put out new MS SQL clusters they would need to rewrite code, and they needed to push it, and then regressions were coming in. And for the other one, engineering thought that they had to rewrite the code; it was the kind you look back on and say, hey, what the heck were we thinking? The one where engineering drove it, that one failed: the features that everyone said they absolutely didn't need all of a sudden now needed to be implemented yesterday. And the other one...
...now they were able to sell, and their product was a lot better. And so, you know, past engineering decisions may have been right at the time, and I don't want to bash on MySQL and I don't want to bash on those decisions. It's just that now, with the data load that we have and the world that we live in...
...you need to be a little more agile. And in our particular case, right, threats are becoming a lot smarter, they're becoming more targeted, and, more importantly, people, I think, are a little less willing to accept that false positive. I'm sure it's happened to everyone in here, even on Gmail. It was funny: I was sending these slides to someone to take a look at, and they had hosted Gmail, and it ended up in their spam folder. So I was like...
...oh, thanks for, you know, not looking at my slides. But it went to their spam folder, right? So it's a similar case with us: you don't want that piece of mail to get lost. So how are we accurate all the time without letting spam through? And obviously it comes down to legacy systems, right? Everyone can write sloppy code, I guess. Show of hands: who's never written a bug? That's what I thought, right?
Who in here has looked at their code and never said, wow, why the heck did I write that? Well, come work for me then, please. You know, I look at some code even six months after, and I'm thinking to myself, why the heck did I do that? And obviously there was a reason at the time, but I think everyone, because of time constraints, getting the project out... you just need to ship it. You start sort of making compromises.
Compromises that may not be great for engineering in the long run. And so you start putting all this duct tape around the database, right? You start throwing memcached in, you maybe put Redis in, whatever the cache of the month is, and you try to make your application perform. But, you know, fundamentally your database is now limiting your application, and so you need to hit the reset button.
So nowadays... can you imagine, five years ago, someone standing up here saying that you should hope that the nodes in your database are failing constantly? Absolutely not, right? And so instead, you know, you can engineer these things with these cheap little boxes that you expect are constantly failing, but you're resilient enough that you can keep adding them in, right? You can scale, you can replace them really easily, and you don't have to get woken up at 4am because your database went down. It's easily scalable.
There's no single point of failure, and I say that, you know, with air quotes, right, because there's always that single point of failure that brings everything down. And then obviously many smaller boxes versus one monolithic box. So we had an interesting case where we said, hey, look, MySQL is not going to cut it for us anymore; we need to rewrite. And we went to the product guy and we had to get buy-in, and we had to get technical buy-in from a lot of people who really liked MySQL.
You know, who thought it was the best database product. And unfortunately they wouldn't let us go architect the, you know, the underlying solution without producing anything along the way; they wanted something produced every three months. And this actually created something interesting for us: we needed to come up with these intermediary stages of giving actual production-quality results to other teams.
...as we, you know, backfilled data. So I guess this is the point in the presentation where I now put up this really cool graph showing how awesome our latency is now. I conveniently took this, though, on a Saturday, and I took a 12-hour period. And this is really what happened, right? And I guess this is to point out that Cassandra is not a MySQL replacement, a direct replacement, excuse me, and it's not a magic bullet to solve everything. So in v1 here...
...that was me having a crappy threading model on our logging, and I pushed the code, much to my boss's dismay, probably three times, as you can see, in a very short amount of time, to all of our nodes. V2: I didn't realize that there was a datacenter-aware load balancing policy in the new DataStax Java driver for Cassandra, and so what had happened was we were actually making half of our requests across data centers, which was clearly increasing the overall latency in our application. And so then, v3:
You can see here that magically now our latency numbers are down around 2.41 milliseconds. I have no idea what happened at that spike right there. You know, I don't know; I got an email saying that they saw some weird network connectivity during the same time, so maybe I can blame it on the network, I guess. So: migrating is painful. It's really painful. I hope I have mentioned it's painful; there's tons of regressions and there's tons of rewriting. So why should you do it?
Cassandra is the best option for your persistence layer right now, bar none. If you can come up with a better option right now, I would love to talk and chat with you afterwards. It comes down to how, as a business, you are going to bring your company to the next phase, right? And for us, MySQL wasn't going to bring us there. We couldn't shard our data set very easily.
Read slaves weren't going to work, and even if you were crazy enough to go master-master, that wasn't something that was going to fix the problem; it was just going to delay the inevitable. And so don't let your database hold you back, right? You've got a great option. Why are you stuck? So: lessons learned. The good.
We spent a lot of time, my boss and I... in fact, we actually did it four times, because we didn't take notes and we kept forgetting, so we would sit down and we would do it again: what is our data model going to look like? How are we going to, you know, make this thing so that years down the line I don't have to stand here again and say, what the heck was I thinking? And I think we did a really good job. We found a way to really denormalize our data.
It's working really well, and, more importantly, because we had those targets we had to hit every three months, we were able to really, you know, scale our data and change the way that we were going to include stuff. So: measure twice, cut once. Just because it's NoSQL, or "big data" as everyone likes to call it, that doesn't mean you don't have to take all the precautions that we were all taught to take. The bad.
We hit the point where we knew we weren't going to be able to continue with MySQL, I would say, a year after we first knew we couldn't, you know, when the first signs started to hit. And I didn't expect to need to rely on our legacy systems for as long as we did. And so then you've got this year-long...
...gap, right, where you're like: hey, look, I've already committed all this code into trunk and it's super awesome and it does what you want. But at the end of the day, tech support is still getting all these calls saying, you know, hey, this piece is wrong, or this is, you know, XYZ. And so you really need to manage those, and you also need to think about, as an engineer, how you are going to make minor improvements to your legacy system.
You can't just let it die, because you've got hundreds of thousands of customers still using it, and I think that's really important. And then syncing: you have to take that really seriously, because you can write the best code in the world, but if you migrate your data and you do a really crappy job syncing, now your code sucks as well. So I've got eight tips here. One: define your requirements early. Two: start with queries. Three: think differently.
We have sort of this weird requirement where we need real-time, ordered results of all rows in all of our column families. This actually presents a pretty big problem for a distributed system, because now you've got keys all over the place and they're in random order. So how do we do that? I'll actually come to our solution. And then: what is your read load and write load, right?
You need to plan your Cassandra cluster. So, to start with the queries: Cassandra doesn't mean that you can just ignore your queries, and I think CQL is actually good, because I think it sort of forces and brings back that mentality: I need to think about, hey, this architecture is distributed; how am I going to make it work? Counters and composites: so, counters are cool, but they're a rough guesstimate.
So, you know, there's a couple of tickets that you can look at for counters. Composites are your friend; in CQL, that would be primary keys. How are you going to partition your data? That's really important to think about. And optimize for your use case: don't be afraid of writes, right? Disk space is really cheap, memory is pretty cheap, and nodes are pretty cheap right now.
SSTables are immutable: when they get written, they're append-only. So when a delete happens, right, a flag gets written in there saying, hey, look, we actually want to delete this. And then you've got gc_grace_seconds, and you don't actually remove that data until gc_grace has expired, right? So if you're making all these deletes all over the place and then you're also not doing any maintenance...
...now you've got all that data still on disk, and it's sort of like the Postgres problem, right? And so it's not like, you know, there aren't solutions there, but you need to be aware of it, and you should think about whether maybe you don't need to delete that data. Oops... all right, sorry. Three: think differently regarding reads. I think, because MySQL's performance is less than stellar, many times we have, or I did, gone into the mentality...
...of: I need all the data now, so pull it all and put it somewhere fast. And this is a pretty poor use case for Cassandra, and in fact, because you've now fixed your read and write problem, you don't need to do this, right? If you do SELECT * FROM really_big_column_family WHERE foo = bar on around 10,000 rows, you will get an RPC timeout, and, you know, the fact is it has to iterate over every single, you know, place all over the place, and maybe there are further optimizations that could be made later.
So, thinking of migrating data, I'll give you a few battle stories here. The first one: I wrote our first sync script and I thought everything was great, and we went to production with our new web filter categories, and it turned out I had blocked ups.com, I blocked German Google, and on the day of the Samsung S4 release, I blocked samsung.com.
We've had a couple of other things, too, like, you know, the priority of your sync. I would argue you may have different data that should get to your new system faster, right? And there's the, you know, whole-enchilada script. And so what we've actually done is written four sync scripts that we run at all times. We've got the one that's really super fast, that runs and just immediately dumps stuff over, and we've got the one that takes a million hours that just has to go through all the stuff...
...that's constantly changing. And so really think about your sync script, and take it as seriously as any production code you put in, because of what a mistake here costs. You know, everyone has to migrate their data, and in our case, too, we were taking stuff from flat files, we were taking stuff from other MySQL databases, we were taking some from all different sources, and inevitably you will make a mistake. So how do you, you know, take all those different pieces together and then run them over and over again? Oops, wrong button again.
Don't use Cassandra as a queue. So, it's really easy to want to do this use case, and there's a nice blog article, linked at the bottom, that an engineer at DataStax wrote on Cassandra anti-patterns. I just talked about tombstones and read performance, right? So it makes sense that you'd want this distributed, durable queue; it turns out it's not the best use case. There's plenty of other great use cases for Cassandra, so don't use it for this one.
You know, you can use your cool new cluster for many, many, many other things. And so instead, what we did is we're using Kafka. It's a multi-publisher, multi-consumer durable queue; it came out of LinkedIn. When you get that little email that says, you know, this many people saw your profile and you should sign up for pro and stuff, that, I believe, still goes through Kafka, and it's pretty awesome. 0.8 is a really cool release, and so our combination of using Cassandra with Kafka is working...
...awesome. I mean, really, really, really, really well. And, you know, you throw Elasticsearch in there and I'm having a really great day as an engineer. Six: estimate capacity. So don't forget about the Java heap, right? You want about eight gigs max, and if you go any higher than that, you're probably going to run into GC pause issues. That being said, we have 64 gigs of RAM on these boxes. You know, how do we use that?
Well, there's obviously a lot of things that are off-heap: there's off-heap caching, key caching, row caching, so you can still use that. And then also remember, too, that the kernel, the Linux kernel, is pretty smart about caching, so you can let the Linux kernel do part of the job as well, and there's no reason not to allow that. Plan for capacity.
Don't let your nodes get to the point where they're at 1.2 terabytes of load, because then you're going to need to bootstrap a new node and, you know, bring that in, and then you're still going to need to do a cleanup operation, you're still going to move tokens, etc. Unless, you know, you can double the size... or, I guess, with vnodes it's not as big a deal anymore. And so stay ahead of it, I guess, is the main lesson here.
There were plenty of times where I let it get too big, and I strongly regret it, and that's why I've probably started balding at my young age. The stress tool: you can use this; it comes with Cassandra, vanilla Cassandra. Run it against your nodes, right? Get one node, get a piece of hardware config that you're really happy with, figure out how many writes and reads you can do, and then obviously multiply that out and figure out what your replication factor is going to be and how many data centers you want.
And then you can pretty easily get a good idea of what your capacity should be. But the idea that, oh, well, Cassandra is super powerful and I don't need to ever worry about I/O anymore, is kind of going to hurt you. I/O is just as important. So use the stress tool, figure it out, and if you plan ahead of time, things should probably be a lot better for you in the long run.
MySQL hardware is not Cassandra hardware. So if you've got a really big node... we have some of my super boxes there: RAID 6, sixteen-drive boxes that have 256 gigs of RAM and 32 cores, and they're pretty cool boxes, right? But that's probably going to be a little overkill, considering, you know, you can still only use eight gigs max on the heap.
So if you get lots of small boxes that are cheap, you know, you can probably get three boxes for the cost of that one massive box, and now you've also built in some redundancy, right? You can kill one or two of those nodes and you haven't lost any data, and hopefully, too, if you planned your capacity, the performance of your application hasn't gone down. And then another one, too, because Cassandra is so awesome: are your old code and your other prior architecture
decisions that you've brought over now going to become a bottleneck? So in our case, we did find, you know, some code that was less than optimal, and now we have to think about: do we need to rewrite that? Is this going to become a bottleneck? Automate, automate, automate.
He accidentally didn't put the config in; he put in the stock config, and so it had come up in the test cluster and he had lost data. And we went back and forth and back and forth and back and forth about how he could not ever have made a mistake and that Cassandra must have this bug. So let's put it in a test cluster, and guess what... Like, I make mistakes; we all make mistakes, it's human nature. Automate it and the problem won't happen, right? There's this really cool tool that Sylvain wrote called CCM, right?
You can make an instance of a cluster on your MacBook Pro, right? So you stick it there, you've got three nodes, and then you run your Puppet config on it, and boom: oh hey, look, yeah, I'm magically on 1.2, right? And, you know, you're going to make mistakes in your automation, but, more importantly, now you've figured out what those mistakes are ahead of time. You know, I cannot stress this enough: we came from a company that maybe didn't automate enough. We are now really trying to automate everything; everything we do is through Puppet.
Everything is repeatable. And the more important thing is, right, if you've now got this really big distributed system, you've now moved all that complexity to operations. So if you didn't plan for your operations, and people don't understand that, then when that node does die, you know, how are you going to deal with that failure? You still have to. So, you know, if you're going to buy into this idea of continuous failure, you need to make sure that you've taken care of it from an ops side.
The thing is, we didn't do it fast enough, and in fact I still have two nodes in our cluster that we did an apt-get install on, and so now they're this sort of Frankenstein where I'm manually patching jars on them as I make builds, and I still need to rebuild them. You know, everything else is automated, but those two nodes I've actually still screwed up, because, once again, they're manual. So, you know, I wish I'd never apt-get installed those two nodes. And then we actually did a cool thing.
My coworker wrote a script which creates a CCM cluster on your dev box, and it goes out to our production cluster and grabs n number of keys from all the column families and makes them locally, so you can deal with actual data that's coming in. You may not have the read and write load, right, but you can deal with real data from that day against your production code. So, you know, you don't need to operate on your production cluster; you just run this command, and boom...
...there's a three-node CCM cluster with real data on it. So that was actually something we did early on, and it's really cool: from a dev environment, run Hadoop against it. We can do builds on Hadoop, we can check all that stuff and not worry about needing to, you know, ruin our production cluster. Eight: some maintenance required. I was not a Java developer before starting with Cassandra. I obviously went to school, and, you know, you get taught Java and you go through all that.
B
But
Barracuda
is
not
a
Java
company.
I
guess
maybe
we
are
a
bit
now,
but
I
did
not
know
what
Jay
console
was
I
didn't
know
what
Jay
map
was
I
didn't
know
what
Jay
stack
was
and
I'm
willing
to
fully
admit
that
I
didn't
know
what
all
this
stuff
in
the
JDK
did.
I
just
knew
that
Java
was
required
and
you
run
this
jar
so
I
think.
If
you
want
to
run
Cassandra,
you
need
to
know
a
little
something
about
Java
development.
B
Otherwise, I think it's going to bite you, because it bit me, because I was sort of trying to take the Linux mentality, right? If you strace the Java process, I don't know if anyone's done that in here, you're just going to see a bunch of futexes and stuff going by, and it means nothing. So that's why these tools were written; it's not like they, you know, just wanted to make pretty pictures. You have to use them for Java development, right? And so what we ended up doing was Jolokia. It's a free client.
You can hook it in and you can do HTTP requests to JMX, and so it's cool, because then you can write a script in whatever your favorite language is, and now you can automate, you know, all these actions. Almost every important thing in Cassandra is in JMX. jconsole is sort of a cool way to do that: you can see all the MBeans that are in Cassandra and you can go through them.
So I actually have a little bash alias I made: I just do "jc", a space, and then a node, and then it just magically comes up with jconsole for that Cassandra node, and that was one of the better things I did. Repairs: so, I was sitting in a talk, Jason Brown's talk yesterday, and he was talking about sort of consistency and repairs, etc., and Cassandra at a technical level, and there were tons of questions about, well...
...my cluster is doubling when I run a repair, and then I've got to run this and that. And I think it's a pain point, because, you know, everyone's run into some type of repair problem. But if you sort of figure it out, and you can use these tools, you can start figuring out, you know, why this has happened. So when the Merkle tree gets calculated, right, if any one mutation didn't get applied, now the hash for that range is not going to match. So as soon as that one doesn't match...
...now it's going to stream all of that data, and then the node goes through that data and figures out what, you know, is missing. And so if you don't run a repair ever... maybe you have never run a repair for a year, and then you go through and you do, you know, that first repair: there could potentially be tons of ranges that are out of sync, and it's probably going to take a really long time.
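The "one missed mutation streams a whole range" behavior can be sketched with a toy Merkle-tree comparison (far simpler than Cassandra's actual trees; the range count and hashing scheme here are made up for illustration):

```python
import hashlib

def range_hashes(data, num_ranges):
    """Split (key, value) pairs into token ranges and hash each range."""
    hashes = []
    for r in range(num_ranges):
        chunk = [kv for kv in data if kv[0] % num_ranges == r]
        h = hashlib.sha256(repr(sorted(chunk)).encode()).hexdigest()
        hashes.append(h)
    return hashes

replica_a = [(k, "v") for k in range(1000)]
replica_b = [(k, "v") for k in range(1000)]
replica_b[137] = (137, "stale")  # one mutation never reached replica B

diffs = [r for r, (ha, hb) in
         enumerate(zip(range_hashes(replica_a, 8), range_hashes(replica_b, 8)))
         if ha != hb]
print(diffs)  # exactly one range mismatches -- and every key in it gets re-synced
```

Only one key differs, but the replicas can't tell which one from the hashes alone, so the entire mismatched range (about 125 of the 1000 keys here) is streamed and re-examined. That is why a first repair after a long gap moves so much data.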
But here's another battle story, one I guess I wasn't going to tell, but it's a good one.
So, the first time that I decided to add nodes, we were going to double our cluster from three to six nodes, and I was super pumped, and I went to do the move operation. I fired up a screen session on all three nodes... all six nodes, actually, and I ran nodetool move with the token that they were each going to go to, and I went to bed. And I woke up, and things were really bad, right? Because what does a move do?
It's actually, literally, moving data that needs to be on other nodes to redistribute the ring, right? It's not some magical process that just happens all of a sudden, and I guess I didn't think about what was happening. And so, anyways, I was in a complete world of hurt, right? I've got data that shouldn't be on nodes here and here and all over the place, and it was probably about two weeks to get our cluster stable again. And so now I understand, right?
B
When you do this kind of maintenance, you're moving a lot of data. If you've got terabytes of data on your cluster and you have to move any of it, or you have to do a repair, you're going through a lot of data. And then, you know, you use these slightly underpowered, low-I/O nodes with the idea that you're going to distribute load across them.
B
Yet then you put 1.2 terabytes of data on them, like I did. Now you've got to go through 1.2 terabytes of data, and you're not magically going to get more SSDs. You can't pretend you've got a 16-disk RAID 10 array of SSDs in a box that's got two spinning disks, right? And I guess once I came to grips with the fact that, yes, I/O is I/O, even in Java, things started getting better, because then I realized that it's not Cassandra's fault.
B
Where is Barracuda today? So, we are two years in production with Cassandra. I still love it; after two years I think maybe I've gone through the hate phase, and I still love it. It was definitely the right choice for us. The numbers speak for themselves at this point, and I have zero regrets about making the decision; I think it's probably one of the most important things I've ever done in my life. We have two product lines whose data is now one hundred percent powered by Cassandra.
B
Our spam firewall is in beta for anyone to join, and hopefully next week it will be going out to every single one of our customers. And more importantly, anecdotally, I talk to people who have switched over to the new code, and they actually tell me: yes, my spam filtering is better. And that's really what's important. I could talk about how many transactions per second we can do, and how fast our response time is, and so on.
B
But more importantly, there's the human aspect: people like my product better, and I think that is directly related to Cassandra. And we have real-time response; I think three seconds from the first time we've ever seen a particular domain in the field to classifying it, to making an initial decision, which is really good.
B
I know there are some people who are outspoken in the community who will probably hate me for saying this, and I'll be on their blacklist, but the team has really done an awesome job with CQL. It's great, it's awesome. And importantly, we've got engineers who don't understand what a slice predicate is, or what a get
B
multi-range slice is, right, and all this stuff, and what my start and stop columns are. They don't, and more importantly, I think they don't necessarily want to have to learn it; they just want the power of Cassandra. And everyone understands SELECT * FROM table. Everyone gets it. So if you spend all that time up front making a really great data model, really making sure you've thought it through, you can now hand that to people to tap into that power.
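The contrast he's drawing can be sketched with a toy wide row: the Thrift-era read is a get_slice over a start/finish slice predicate, while CQL states the same read declaratively. This is a hypothetical simplified model, not the real Thrift API:

```python
from bisect import bisect_left, bisect_right

# A wide row modeled as sorted (column_name, value) pairs.
row = sorted({"c1": 1, "c2": 2, "c3": 3, "c5": 5}.items())

def get_slice(start, finish):
    """Thrift-style slice predicate: return columns in [start, finish]."""
    names = [name for name, _ in row]
    return row[bisect_left(names, start):bisect_right(names, finish)]

# CQL expresses the same read declaratively, roughly:
#   SELECT * FROM table WHERE key = ? AND column >= 'c2' AND column <= 'c4';
print(get_slice("c2", "c4"))  # [('c2', 2), ('c3', 3)]
```

The point of the CQL version is that the engineer never has to think about start/stop column names or slice ranges; the data model is designed once, and SELECT does the rest.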
B
That's pretty powerful, right? And that's why I think it's awesome. And you can say, or the 10gen CEO can say, what he wants about how he's now won because of CQL; it's really powerful, it's great. So I'm pretty pumped about the direction of the database, and the fact that now I can go to people and say: hey, look, we've got this really great tool, it's going to make your code better. CAS, which Jonathan briefly spoke about, and the other 2.0 features,
B
I think, too, are going to continue to make Cassandra the best solution. And so that brings me to the Cassandra community. There are obviously a lot of other competitors in this space, a lot of commercial competitors: Riak, HBase, Oracle. I was told by a friend to remove the M database from the list, because it was a cheap shot. So the bigger question is: how's their development community? You cannot beat the developer community around Cassandra. There are so many dedicated people.
B
I mean, you can see them here, people who just love working on this stuff; they live and breathe it. I can't personally say the same about the others. And if you've got something that performs really well, that's going to make your application great, and you've got a lot of awesome success stories, how can you go wrong? So there's obviously the IRC channel. I had never really used IRC before.
B
Well, I'd used it once or twice, but nothing serious. So I guess it's maybe a little bit intimidating at first to figure out what the freenode servers are and how to log on, but it's worth it. It takes five minutes, you can figure it out, and then you get access to this wealth of people who want to help you, who are just doing it.
B
You know, out of the kindness of their heart. And then obviously there is the mailing list, user@cassandra.apache.org; there's tons of awesome information on there, and I'd really recommend you check that out as well. So that's all I have. I've obviously got some time for questions and answers, if anyone has any.
C
B
We can't put it here; oh, we're going to export and delete this data, and these rows here, and this constant juggling. So certainly we did, and that's why I say: take your sync script as seriously as your production code. Because it wasn't a bug on my end in the actual code, it was a bug in the sync script, and that led to us blocking German Google, which leads to really unhappy people. Thank you. Yep.
B
So we don't religiously do snapshots in Cassandra, and we're not actively doing backup to tape or anything like that. We have replication factor 3, I'm doing constant repairs, and our really important data we write at quorum. We're in two data centers as well, so our application can fully service our entire customer base from either one of the data centers. So you're not backing it up; are you using the snapshot feature?
B
Yes, but I wouldn't say it's a consistent thing; I'm not doing the equivalent of a mysqldump or a Percona backup every night. But I think the consistency features, and the fact that we're in two data centers, are good enough for us. Yep.
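The setup he describes leans on standard quorum arithmetic: with replication factor 3, a quorum is 2, so quorum writes plus quorum reads always overlap on at least one up-to-date replica, and one replica per range can be down without losing availability. A quick check of that arithmetic (the generic formula, not Barracuda-specific code):

```python
def quorum(rf):
    """Smallest majority of rf replicas."""
    return rf // 2 + 1

rf = 3
w = r = quorum(rf)      # write and read both at QUORUM
print(w, r)             # 2 2

# W + R > RF: every quorum read overlaps every quorum write on at least
# one replica, so reads observe the latest acknowledged write.
print(w + r > rf)       # True

# With RF=3, one replica per range can be down and quorum still succeeds.
print(rf - quorum(rf))  # 1
```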
B
Absolutely not, absolutely not. You know, the amount of data per node with vnodes is the same, and with the initial bootstrapping with vnodes, you still have to move all that data, as I said. It just makes certain things better and spaces the data out across the ring.
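His vnodes point, sketched: each node claims many random tokens instead of one, so the ring is split into many small ranges and bootstrap streaming fans out across more peers, but the total volume a new node must pull is unchanged. An illustrative toy only, using 256 tokens per node to match Cassandra's default num_tokens:

```python
import random

random.seed(13)
RING = 2 ** 127

def vnode_tokens(num_nodes, num_tokens=256):
    """Each node owns num_tokens random points on the ring (vnode style)."""
    return {n: sorted(random.randrange(RING) for _ in range(num_tokens))
            for n in range(num_nodes)}

tokens = vnode_tokens(6)
total = sum(len(t) for t in tokens.values())
print(total)  # 1536 small ranges instead of 6 big ones

# Many smaller ranges mean a bootstrapping node streams from many peers at
# once -- but the volume of data it must pull is exactly the same.
```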
E
B
So I think EC2 is a bit different; we make our own hardware, so I can benchmark it. I think you sort of get whatever Amazon installed; you're not really sure. And we don't run in EC2, so I'm not really the best person to talk about that. But I do know I/O is a pretty big concern for Amazon customers, so they may have just needed to go to a beefier instance to get the performance that they felt was necessary.
B
How much are you, or your company, willing to pay? If you want n nodes, or twenty-four nodes with 16 SSDs each, go for it; I mean, that's awesome. But in our case, we had to figure out the cost-benefit ratio we wanted: the redundancy that we want, the performance that we want per node, and obviously the ability to have nodes die and still perform. Does that answer your question? I think there's a question back there.
F
Right now we are planning to migrate from SQL Server to Cassandra. A lot of business users are really comfortable doing ad-hoc queries; we build our primary indexes and all those things, and then a bunch of business customers come and ask for a bunch of reports, and we just go and write some ad-hoc queries and give them a report. How do you convince your business customers when you migrate to Cassandra? If you want to create some ad-hoc reports, what ecosystem do you use? Do you run into those kinds of problems?
B
So, I think I understand your question; let me rephrase it so I make sure I've got it: your internal customers need reports on your data. Yeah, we have that problem. So, as I said, if there is a use case where you need to iterate through every single piece of your data, that's where we bring in Pig and Hadoop. You can write a Pig script, you can get all your data out in the format you need, and then you can format it back into your reports.
B
G
So if you don't want to store a terabyte on each node, what is the sort of soft limit you're using? Is it like 50 gigs, or 100 gigs? What...
G
D
B
Totally, and that was something maybe I didn't make clear enough about the technical buy-in. Certainly, we were a MySQL shop, as I said, and I don't want to just keep bashing on MySQL. I think that it has a point and a place, and there's a reason that MySQL is what it is. And so I think having that conversation internally matters; in particular, for what we did, unbeknownst to me, someone was assigned as the MySQL advocate.
B
So when I did the initial pitch as a team to the rest of our engineering leadership, this guy pretty much did every single thing he could to make sure that we would stay with MySQL: well, you can do this; well, you can do this; well, you can do this. And it was really good, because it actually did bring up a few things that were easily worked out, that we came up with technical solutions to, but maybe weren't things I had thought about. So I would actually advocate for that.
B
You know, get an advocate, an advocate for MySQL or Postgres or whatever you're moving from, and have them go in there and say: well, you can do this; you can do this. And if you can't come up with great answers for all that stuff, maybe you don't have a real business decision or a reason or the logic to actually move, right? I mean, as I said, it's painful.
B
I'm personally upset that I made mistakes with that sync, because it inconvenienced our customers, and that's a pretty big bummer. But that's going to happen if you're moving all this data. And, as I said too, we couldn't go away and just do the architecture and then come back and say: okay, now we're ready to go. It turned out, because we were pulling all the data from all these separate places, that was what we had to do.
B
We had to implement it, but we had to come up with different little hurdles along the way where we would go: okay, here's something, here's proof this is going to work. And so I think, too, if you think that you can just go away for a year, you're probably not going to get the business buy-in either. Jim? Oh, she's got the mic, so...
D
B
I mean, I think naively I thought that it was going to be possible that one day I would just switch off MySQL and move all of the associated infrastructure over. It's not going to happen. So, totally: prove that, yes, this is a great solution for us; prove that, hey, look, I'm able to produce business-driving stuff for my company; and then you can sort of build from there.
H
B
So there are Hadoop classes in Cassandra: there's ColumnFamilyInputFormat, and there's CQL compatibility coming in 1.2.6, excuse me. And there are examples in the examples folder at the root of the project that you can take a look at, on how to use Cassandra as both an input and an output for Hadoop.