From YouTube: NYC* 2013 - "Building a Scalable Time-Series Database with Cassandra" at BlueMountain Capital
Description
Speakers: Jake Luciani and Carl Yeksigian, BlueMountain Capital
SlideShare: http://www.slideshare.net/planetcassandra/nyc-tech-day
This talk will focus on our approach to building a scalable time-series database for financial data using Cassandra 1.2 and CQL3. We will discuss how we deal with a heavy mix of reads and writes, as well as how we monitor and track the performance of the system.
A: Alright, so the data that we store is financial time-series data. It can be tick-level data, or it can be data that the traders actually put in, so it can be very sparse — but it's time-series data, so it looks kind of like this: a bunch of discrete points at many intervals of time. You want to be able to store both of these efficiently, and you want to be able to query them efficiently.
So what are the queries that our users ask us? There's a time-series query and there's a cross-section query. The time-series query is basically: here's a start, here's an end, here's a periodicity, and between the start and the end, at certain intervals, I want data. So here the start is 10 a.m., the end is 2 p.m., and we want it at every one minute — the start, the end, and the periodicity define the query.
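The start/end/periodicity query described above can be sketched as expanding the query into the sample timestamps it asks for — a minimal illustration, not their implementation; the function name and dates are mine:

```python
from datetime import datetime, timedelta

def sample_times(start, end, period):
    """Expand a (start, end, periodicity) query into the sample
    timestamps the service must answer for, endpoints inclusive."""
    t, out = start, []
    while t <= end:
        out.append(t)
        t += period
    return out

# 10 a.m. to 2 p.m. at one-minute periodicity -> 241 sample points
times = sample_times(datetime(2013, 3, 20, 10, 0),
                     datetime(2013, 3, 20, 14, 0),
                     timedelta(minutes=1))
print(len(times))  # 241
```

The service then has to produce one value per sample timestamp, which is where the filtering discussed later comes in.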
In this one there are two different data points, the Microsoft and the Apple price, and we're asking for the data as of 11 a.m. — that as-of time is the only component of the query. Cross-sections are for random data: we don't know ahead of time what all the components of a cross-section query are going to be. So if we optimized for the cross-section, that means we're storing thousands of writes, and we can get inconsistent queries across the writes — and we also need bitemporality.
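The cross-section query — latest value per series at or before a single as-of time — can be sketched like this. The series names and toy data are hypothetical, and timestamps are minutes-since-midnight for brevity:

```python
import bisect

# Toy cross-section input: per series, a sorted list of (timestamp, value).
series = {
    "MSFT.last": [(600, 28.1), (655, 28.2), (700, 28.3)],
    "AAPL.last": [(601, 455.0), (659, 454.2)],
}

def cross_section(series, as_of):
    """For each series, the latest point at or before `as_of` --
    the as-of time is the only component of the query."""
    out = {}
    for name, points in series.items():
        i = bisect.bisect_right([t for t, _ in points], as_of)
        if i:
            out[name] = points[i - 1][1]
    return out

print(cross_section(series, 660))  # as of 11:00
# {'MSFT.last': 28.2, 'AAPL.last': 454.2}
```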
So we can't optimize for both cases. Let's optimize for the time series — it's a really simple data model to put into Cassandra. This is how it gets stored; this is how you would represent it in Cassandra 1.1. Basically you have the ticker, you have the name, the two times, and then the value — so, Apple's last price for two days ago, as of yesterday, or when we found out about yesterday's last price.
In CQL3 it's a pretty simple table — it was a pretty simple mapping from the data that we want to store to the CQL3 that we want to write. Here we have the ID, which we store as a binary; the property, which would be last price; we store ticks, because we have to represent dates somehow; and then we have the value, which is just bytes.
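The table they describe behaves like a sorted map keyed by (id, property) with rows clustered by ticks. A toy stand-in — the class and exact key shape are my sketch, not their schema:

```python
import bisect

class TickTable:
    """Toy stand-in for a CQL3 table shaped like
    PRIMARY KEY ((id, property), ticks) with a bytes value.
    Clustering by ticks keeps each partition sorted, which is
    what makes the start/end slice query cheap."""
    def __init__(self):
        self.partitions = {}  # (id, property) -> sorted [(ticks, value)]

    def put(self, id_, prop, ticks, value):
        row = self.partitions.setdefault((id_, prop), [])
        bisect.insort(row, (ticks, value))

    def slice(self, id_, prop, start, end):
        """Like: SELECT ... WHERE id=? AND property=?
                 AND ticks >= ? AND ticks <= ?"""
        row = self.partitions.get((id_, prop), [])
        lo = bisect.bisect_left(row, (start, b""))
        hi = bisect.bisect_right(row, (end, b"\xff" * 8))
        return row[lo:hi]

t = TickTable()
t.put(b"AAPL", "last", 100, b"\x01")
t.put(b"AAPL", "last", 200, b"\x02")
t.put(b"AAPL", "last", 300, b"\x03")
print(t.slice(b"AAPL", "last", 100, 200))  # the first two points
```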
So this is the time-series query that we want to write. It doesn't even include periodicity — it's just a start and an end: give me all the data in between those. And then here's the cross-section that we want to write: given some as-of point, give me the last knowledge time that we know. So we're reading way too much data back. We get every single point between the start and the end, even if there are, you know, a million knowledge times — if we rewrote the data a million times and we only care about it at every one minute, we would get all million points back and we have to filter it down to one, and we have to filter on the knowledge time as well.
We want to be able to know what we knew — we want to know the last point that we knew about. And we're building a service, not an app: our users are running their applications on a huge grid and they're hammering us very quickly. Basically they want to go as quickly as possible, and we're the limiting factor. So we have our service, Olympus, on top of Cassandra, and that's what does the filtering for us.
Here's the type of filtering that we have to do: we filter everything by knowledge time, and we filter the time-series queries by periodicity. Cassandra gives us back a ton more data than we actually want to use, so we're filtering — right now we filter 200,000 points down to the 300 that we actually return.
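The two filters described above — knowledge time and periodicity — can be sketched together. This is my illustration of the idea, not their service code; per periodicity bucket it keeps the latest value time, at its latest knowledge time at or before the as-of point:

```python
def filter_points(points, start, end, period, as_of):
    """Service-side filtering sketch. `points` are
    (value_time, knowledge_time, value) tuples as returned raw by
    the store; keep one point per periodicity bucket."""
    best = {}
    for vt, kt, val in points:
        if not (start <= vt <= end) or kt > as_of:
            continue  # knowledge-time filter
        bucket = (vt - start) // period
        # within a bucket, prefer later value time, then later knowledge time
        if bucket not in best or (vt, kt) > best[bucket][:2]:
            best[bucket] = (vt, kt, val)
    return [best[b] for b in sorted(best)]

raw = [
    (0, 1, "a"), (0, 5, "a'"),   # same value time, corrected later
    (30, 2, "b"), (61, 3, "c"),
]
# one-minute periodicity (60s), everything known as of kt=10
print(filter_points(raw, 0, 120, 60, 10))  # [(30, 2, 'b'), (61, 3, 'c')]
```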
One thing that we've been discussing is push-down filters: basically, instead of having the service layer do the filtering, we would push that down into the Cassandra layer. We do downsamples on write. We would also have the periodicity handled on the coordinator, so rather than having it go back to the service to be done, the coordinator would be able to do it with its local cache. And the values that we're storing aren't all doubles — we want to store blobs, basically.
Some values belong together; sometimes we have some complex value that we want to store. So we use Thrift for this: Thrift gives us a typed, extensible schema, and the union types give us an easy way to deserialize. This is an example of a Thrift union — we have, say, two ints, and then a double. A value can be one of those three, but it can't be all three, so getting it back out is unambiguous.
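The Thrift IDL itself isn't reproduced in the transcript, so as a sketch of the one-of-N semantics a Thrift union gives you — field names here are hypothetical:

```python
class Value:
    """Sketch of Thrift-union semantics: a value is exactly one of
    i32 / i64 / double, never more than one at a time, and reading
    the wrong arm is an error rather than a silent coercion."""
    __slots__ = ("kind", "value")
    _KINDS = ("i32", "i64", "dbl")

    def __init__(self, kind, value):
        if kind not in self._KINDS:
            raise ValueError(f"unknown union field {kind!r}")
        self.kind, self.value = kind, value

    def get(self, kind):
        if kind != self.kind:
            raise TypeError(f"union holds {self.kind}, not {kind}")
        return self.value

v = Value("dbl", 455.72)
print(v.get("dbl"))  # 455.72
```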
B: Thanks. But that was the easy part, right — all this data modeling for our consumers. The really hard part, as anyone who's used this knows, is scaling: figuring out how this is going to support our business requirements. And the first rule of scaling — the one that everyone always tries and that never works — is: "I'm just going to make everything go as fast as possible; I'm going to buy the biggest machines, and I'm going to set all the settings to 200 trillion, and I'll never have a problem at any scale." That never really works. The real key to scaling this type of system is that you have to think about what is the right hardware for this workload.
How can we deal with the JVM? It ends up being ninety-five percent of the unknowns, because for every different workload you're going to have different kinds of heap fragmentation and all sorts of other problems that you may not have on other workloads. So you have to think about: how do I read this data? How do I write it? How frequently does that happen, and what are the typical queries that I can tune for?
The next thing is that you have to tune Cassandra for your workload. Cassandra comes out of the box with some really sensible defaults, but you can't really go to production with those default settings, because there are so many factors — and so many knobs that Cassandra gives you.
So the first rule is: you can't fix what you can't measure. What we do is use this great open source project called Riemann — if you haven't heard of it, you should definitely check it out. You can push metrics into Riemann, and it's basically an event processing system: you can build alerts, you can build aggregations, you can proxy off to systems like Graphite, and you can build some really useful things.
Every metric that's tracked inside of Cassandra gets pushed out to Riemann, and what's cool about Riemann is that you get two different views right out of the box. The first is that you can push metrics into its dashboard: you can build these real-time dashboards of what the latest metric is.
It uses WebSockets, so it all runs in real time in the browser, and you can configure these dashboards and save them down, and set up dashboards for all sorts of things — you can see what the current load is, what the current everything is. And at the same time, Riemann will push everything off to Graphite after it goes through the streams.
When you configure streams, you can basically say: any metric that has the word "cassandra" in it, send to this Graphite server; any metric that has "myapp" in it, send to that Graphite server — or send it to both. And as you can see in Graphite, having these two together means you can answer questions like: what's going on right now?
We wrote a C# driver; there are Java drivers — drivers for every language — and there's protocol buffers support, so you can write your own little application metrics. If you're writing an application that queries Cassandra, does something with the result, and sends it back out, you build a metric around that task as well, and it gets pushed out alongside everything else. So you can build your own dashboards in one single integrated system. It's really, really valuable — it's helped us figure out a lot of issues as we hit them.
The other big tool I want to push is VisualVM. I don't know how I didn't know about this, being a Java developer, but it's the tool that comes with Sun's JDK, along with a process called jstatd — and if you start that up, you get all these real-time metrics out of the tool. You can basically connect to any Java process, and you can see things like
the Eden space filling up; you can watch objects get promoted, watch the old gen fill up, and watch all of the garbage collections happening. It gives you the ability to tweak some garbage collection settings, watch what happens, and come back — you don't necessarily have to have it all worked out in advance. So it's a really useful tool. You can also profile the application live.
You can do all sorts of really cool things, and it has plugins built in — highly recommended. So, on to actual scaling. Those are the tools that we use to help figure out what's going on and how we can make it better. The next thing is some of our machine setup: we're using SSDs for all the hot data.
Currently all of our data is hot, but over time, as the years go on, we're going to be able to move data off of SSD. We have a JBOD config — which means "just a bunch of disks" — instead of having something like a RAID 0. If you have RAID 0, then basically if you lose one disk, the whole node is gone.
If you do something like RAID 5, then you're losing, you know, thirty percent of your disk space. JBOD is just the idea that you have a bunch of different mount points and you randomly throw data onto random drives, so if one of those drives dies, all the other drives are still working. Cassandra 1.2 has this built in now. It works really well — it's actually helped us a bunch of times; we've lost more than one.
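The JBOD placement idea can be reduced to a few lines — a toy model of the behavior described, not Cassandra's actual disk allocator; mount names are made up:

```python
import random

class Jbod:
    """Toy JBOD placement: each new SSTable goes to a random healthy
    mount point, so losing one disk loses only that disk's files,
    not the whole node (as RAID 0 striping would)."""
    def __init__(self, mounts):
        self.disks = {m: [] for m in mounts}

    def write(self, sstable):
        mount = random.choice(list(self.disks))
        self.disks[mount].append(sstable)
        return mount

    def fail(self, mount):
        # everything on the other mounts stays readable
        return self.disks.pop(mount)

jbod = Jbod(["/data1", "/data2", "/data3"])
for i in range(9):
    jbod.write(f"sstable-{i}")
lost = jbod.fail("/data2")
surviving = sum(len(v) for v in jbod.disks.values())
print(len(lost) + surviving)  # always 9: a disk failure, not a node failure
```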
Another thing I recommend if you're using SSDs: you need as many cores as you can get your hands on, because it's very CPU-limited. The I/O falls away, because you have all these really fast seeks, and all of a sudden the bottleneck becomes CPU. Cassandra's built to run with very high concurrency — the JVM does a great job with it — so I would definitely put in as many cores as you can get.
We use 10-gig network cards and jumbo frames, which is the idea that you try to shove as much information as you can into each TCP packet. Yeah — so JBOD's been a lifesaver. I actually wanted to put some information in here: there was a BIOS update from our hardware vendor that we didn't know about ahead of time. They pushed that BIOS update and basically said: this fixes a problem where, when you do lots of sequential reads for a long period of time,
you could end up losing the drive — which is exactly what happens in a compaction, right. So all of a sudden, we were doing compactions over a weekend and these drives started failing, and we were like, what's going on? It turns out it was this bug — but JBOD worked around it, so at least it got most of the way through.
We installed the BIOS update, and then everything came back and was working fine. The black magic of the JVM — this is the next trick. I put down the sort of tweaks that we've done. We run a 12-gig heap, which is about as high as we could push it. The Eden space is 1.6 gigs. The survivor ratio we actually shrank, because we didn't really have a promotion problem, so it gives us a little more room for heap.
We use compressed oops, which means that if your heap is less than 32 gigs, the JVM doesn't have to use a long for each pointer. And then there's this other thing, which we actually just opened a Cassandra ticket on: there's a feature that should be the default if you use the server flag, but it's not — and it gives us about a 15-percent boost on reads just by adding this TLAB flag. What it is is thread-local allocation: in Java, when you're creating new objects,
if you don't have this turned on, the JVM basically uses one shared system to allocate new things onto the heap; with it turned on, allocation is done per thread. So you get a lot more throughput — things don't get bogged down in locks. Now, configuration changes. This is all very detail-oriented; we wanted to go through and list all the little things that we've set, because I think it can be pretty useful for everyone. Hinted handoff
we set to a single thread, versus the default, with the hundred-kilobyte throttle. This is sort of just what we figured out: when it ran with multiple threads and a larger throttle limit, there was too much CPU being spent and we couldn't really keep up with reads and writes. The memtable size we set to 2048 — this is for a 12-gig heap, so it leaves enough room for the memtables. And reads are what we really wanted to focus on, which we really tried to tune for.
Even though Cassandra is well known as a write-heavy system — or rather, writes work really well — it is really good for reads too; you just have to be very careful and tune things. And in the Cassandra community, we're really focused on trying to make this a top priority: we want reads to be just as fast as writes.
On the server side, for the Thrift service, we use the half-sync/half-async (HsHa) server, since we have this giant compute cluster coming in and hitting these nodes like crazy — they kind of hit it and then stay open, so if you have one thread per request, obviously that's not going to scale. Compaction we've set to four threads for multithreaded compaction. That's a good balance, because we have 16 cores currently, and it leaves four cores for compaction, and
the rest are available to do reads and writes. And we turned off the internode compression, which is new, because it was causing too much GC: basically, each message it gets off the wire, it has to decompress into a buffer — and since we were running on a 10-gig network, what's the point of having compression? Everything's going to be fast; at least, we haven't hit any limits yet. So this is the point of the talk that Jonathan referred to earlier this morning.
Leveled compaction has been adopted by a bunch of systems. What it does is create a certain number of levels, where each level is ten times the size of the previous level, and you fix the size of your SSTables for each level — so each SSTable is going to be,
I think the default is five megabytes; we set it to something like 64 megabytes. In level 1 there will be 10 SSTables, in level 2 there will be one hundred, in level 3 there'll be a thousand, and so on.
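That level arithmetic — fixed SSTable size, each level ten times the previous — works out as follows; a small sketch using the 64 MB setting they mention:

```python
def leveled_layout(levels, sstable_mb=64, fanout=10):
    """Per-level SSTable count and total size under leveled
    compaction: SSTable size is fixed, and every level holds ten
    times as many SSTables as the one before (L1 = 10, L2 = 100, ...)."""
    return [(n, fanout ** n, fanout ** n * sstable_mb)
            for n in range(1, levels + 1)]

for level, count, mb in leveled_layout(3):
    print(f"L{level}: {count} sstables, {mb} MB")
# L1: 10 sstables, 640 MB
# L2: 100 sstables, 6400 MB
# L3: 1000 sstables, 64000 MB
```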
And what that allows you to do is — well, it's not randomly ordered: each level is sorted by row key, so you can actually just ask the leveled manifest, "hey, which SSTables does this row belong to?", and it'll say "these n SSTables should have this row," and then you go check them. That's versus size-tiered, which is kind of exponential — it keeps growing and growing over time, and there have been talks where people have had SSTables of like 300 gigs or something like that.
So in order to work around that, you can use leveled compaction; it's a good workaround for wide rows. It allows us to handle our use case, which is that sometimes we want just a particular point, and sometimes we want a time slice. Now, the problem with leveled compaction — and you can see it in here, in the yellow — is that in levels 1 through 5 we only have to check a couple of SSTables, but level 0, which is sort of the raw, freshly flushed SSTables —
you always have to check all of them. And what ends up happening is that it breaks badly: under high write load, you can't keep up with the compactions in leveled, so you end up with this gigantic effect — you keep writing out these SSTables, the compaction can't keep up, and you end up having more and more and more of these SSTables. It almost defeats the purpose of leveled, because the point of it is that you're trying to limit the number of SSTables,
but in this scenario you'll have to check a huge number every time. So what ends up happening under a high-read, high-write load — which is what we have — is that your reads go way, way down.
So what we decided to do — and it's pretty conceptually simple — is combine the two compaction strategies, so that for level 0 we use size-tiered. As long as an SSTable is in level 0, the leveled compaction hasn't picked it up yet; it's just sort of sitting there. We know all those SSTables are relatively small, because they were just flushed, so we can size-tier them together. It's really not a huge burden on the system, and we end up cutting down on the number of SSTables that we have to read from in level 0. And in the meantime, once one of those larger SSTables does get picked up,
it actually goes faster, because the SSTables are now much larger. To get into level 1, the compaction strategy picks a random 32 level-0 SSTables, so if those are all larger, you end up getting more throughput into the system anyway. So that was the compaction stuff.
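The hybrid strategy reduces to one decision plus a size-tiering pass over level 0. A sketch of the idea, with a deliberately minimal bucketing rule (sizes within a factor of two get merged) — not the actual patch:

```python
def pick_strategy(level):
    """The hybrid described above, reduced to its decision: freshly
    flushed level-0 SSTables get size-tiered together (they're all
    small), while leveled compaction handles every level above."""
    return "size-tiered" if level == 0 else "leveled"

def size_tier(sstable_sizes_mb, bucket_factor=2):
    """Minimal size-tiering: group SSTables whose sizes are within a
    factor of each other, then merge each bucket into one table."""
    buckets = {}
    for size in sorted(sstable_sizes_mb):
        for key in buckets:
            if size <= key * bucket_factor:
                buckets[key].append(size)
                break
        else:
            buckets[size] = [size]
    return [sum(b) for b in buckets.values()]  # merged table sizes

flushed = [60, 64, 62, 61, 500]  # MB; one straggler
print(pick_strategy(0), size_tier(flushed))  # size-tiered [247, 500]
```

Fewer, larger level-0 tables means fewer SSTables per read, and bigger inputs each time leveled promotes a batch into level 1.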
Now, this other issue that we had was compression: a lot of the CPU time is spent on compression. With SSDs you really want to get the most bang for your buck, so we wanted to keep as much data compressed as possible — but it's very CPU-intensive, because you keep rereading the same compressed block over and over, even if it's in the page cache. So there's a faster compression that came out relatively soon after Snappy, called LZ4 — in benchmarks it's forty percent faster than Snappy — and there's a guy who works on Solr who wrote a Java implementation of it. It's really nice.
It has basically a pure Java implementation; it has a Java Unsafe one — Unsafe is this magic API that you're not supposed to know about, but everyone uses — and there's also a pure C version. You can see in this benchmark that Snappy's over on the left, and the Unsafe one, which doesn't require any native hooks, runs at the same speed as Snappy. That's a huge win right there, because then you can run on a lot more platforms — and, at the same time, the JNI one is faster.
We didn't see that in practice, because the blocks that Cassandra compresses are so small and so much time is spent going back and forth through JNI — but it did cut down on the 95th-percentile latency. The overall throughput stayed the same, but our latency dropped a bit, so that's good. And finally, the CRC check is another huge area where a lot of time is spent. When you're profiling and looking at this, the CRC check for each compressed block
currently uses just a pure Java method, which is really slow — it ends up causing a 2x performance hit when you do the CRC check. There's a chance — a percentage chance that you can set per column family — where you can say: I only want ten percent of the block reads to do the CRC check. Now, in Hadoop they actually did a benchmark, and they found that a pure JNI version of the CRC check runs 30x faster than the Java version, so it might make sense to move that into JNI.
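The per-column-family CRC chance is easy to picture — verify the checksum on only a fraction of block reads, trading a little corruption-detection coverage for CPU. A sketch of the idea (the function and error message are mine), using `zlib.crc32`:

```python
import random
import zlib

def checked_read(block, stored_crc, crc_chance=0.1):
    """Sketch of a probabilistic CRC check: only `crc_chance` of
    compressed-block reads pay for checksum verification."""
    if random.random() < crc_chance:
        if zlib.crc32(block) != stored_crc:
            raise IOError("corrupt compressed block")
    return block

data = b"compressed bytes"
crc = zlib.crc32(data)
checked_read(data, crc, crc_chance=1.0)          # always verify: passes
try:
    checked_read(b"flipped bits!", crc, crc_chance=1.0)
except IOError as e:
    print(e)  # corrupt compressed block
```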
I want to throw up our current stats. We currently have 12 nodes in two data centers; we're running at RF 6, which means we have six copies. It's kind of wasteful, but it gives us a guarantee that we can either lose a data center, or lose a node in each data center.
We can do 150,000 writes per second for our data size; we can also do 100,000 reads per second for the cross-section queries. We have over 6 billion points, and uncompressed it's about 2 terabytes. Our latencies are down there below — those are actually our Olympus-level latencies, not even Cassandra's: that's after it comes out of Cassandra and then goes through our service. And that's it. I don't know — probably blew through that, didn't I? All right.
B: We've got 15 minutes, so I want to open it up to questions — or if you want to go back over anything you didn't understand, or anything else. Yeah — so, the patch for the hybrid compaction: the only thing is that it's not, you know, pretty enough, because it makes assumptions — our leveled compaction is set to a 64-megabyte thing, so it actually looks for a fixed number of megabytes available to do the size-tiering.
A: So if we're providing one minute — the last value that we have for that minute — Olympus will take the raw results that Cassandra has, which is every single data point (microsecond-level data points), and it'll roll that up into a single data point for that minute. Then it also does the knowledge-time filtering: we can go back and update a value many times, but we really only care about the last value. So Olympus does both of those components.
B: And the main idea is that all the writes come through the service layer and go out through the service layer, so we can downsample the data as it comes in. Based on the query, we figure out which downsample to talk to — but within that, you might ask for something like every third Tuesday that happens to be at the end of a month, right. So you can't plan for all of those downsample points.
B: Well, yeah — one of the other motivations for LZ4 is that the Snappy implementation we use in Cassandra doesn't work in Java 7. It may end up getting fixed, but in the meantime it's sort of a hard barrier for making Cassandra go to Java 7. I think Jonathan probably mentioned that — Java 7 is going to be the de facto version; I mean, Cassandra already works in Java 7.
But one of the barriers is that all the SSTables — all the column families — have compression turned on by default, using Snappy. Now that this was committed, the new default is going to be LZ4, which works with Java 7. So if you're upgrading from Java 6 to Java 7, you have to recompact all your data with the new compression scheme, and then everything will work great in Java 7.
A: Yeah — so, can moving averages be calculated in Olympus? It's not something that we support right now, but it is something that we will add — and again, it's something where it goes through all this data that we have and provides the filter in the service.
A: Yes — those are the points we went through on Java tuning and tuning Cassandra. It's because when you read and write at the same time, the performance hurts: you're doing in-memory operations — not locks, exactly, but you're reading data that could be being written by another client.
B: If you go back to the earlier slide — this is this whole bitemporal time series; there are actually two dimensions, right. We never actually update points in place; we create a new point with a later knowledge time. What that allows us to do is — because what you want is to be able to go back in time and say: when something happened yesterday, when we had the bad data, this is what happened. So we want to be able to simulate that, or, you know, we also want to have the ability to fix it.
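The append-only bitemporal idea can be sketched in a few lines — corrections add a new point with a later knowledge time instead of overwriting, so you can replay exactly what was known at any earlier moment. The class and the prices are illustrative only:

```python
class BitemporalSeries:
    """Append-only bitemporal sketch: each point carries both a value
    time (when it happened) and a knowledge time (when we learned it)."""
    def __init__(self):
        self.points = []  # (value_time, knowledge_time, value)

    def record(self, value_time, knowledge_time, value):
        self.points.append((value_time, knowledge_time, value))

    def as_of(self, value_time, knowledge_time):
        """The value for `value_time` as it was known at `knowledge_time`."""
        known = [(kt, v) for vt, kt, v in self.points
                 if vt == value_time and kt <= knowledge_time]
        return max(known)[1] if known else None

s = BitemporalSeries()
s.record("2013-03-19", "2013-03-19", 451.0)   # yesterday's (bad) price
s.record("2013-03-19", "2013-03-20", 454.5)   # corrected today, not overwritten
print(s.as_of("2013-03-19", "2013-03-19"))    # 451.0 -- what we knew then
print(s.as_of("2013-03-19", "2013-03-20"))    # 454.5 -- after the fix
```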
B: Yeah — our consumers are much, much larger and greedier than our Cassandra cluster. So what we try to do is optimize: the functional use case works, right — it does what it's supposed to do — but it doesn't always do it the best way.
...one data point or something like that, so we're trying to be smart about when we pull the data. But I think the longer-term goal is — you know, the fact that Cassandra's open source, and we're a strong development shop — we really want to make sure that Cassandra has these functionalities. We're working with the Cassandra committers and everyone else to try to make sure that the idea of being able to push a filter down into Cassandra is something that everyone could use.
Q: I was kind of curious about the hybrid compaction strategy. If you take one of those really super-huge SSTables from level zero and push it up to level one — because of the size limit on SSTables in leveled compaction, I guess you would take that one SSTable and suddenly there are, like, a hundred and fifty or whatever in level one. Is the leveled compaction able to handle that?
A: Yeah, we did — oh, sorry: the question is, did we do a comparison against other tick-level databases? The problem with a lot of them is that they don't answer the other query, the cross-section. They're optimized for the time series but not the cross-section, and we need to be able to do both.