From YouTube: C* Summit 2013: Real World, Real Time Data Modeling
Description
Speaker: Tim Moreton, CTO at Acunu Ltd
Slides: http://www.slideshare.net/planetcassandra/tim-moreton
Data modeling for Cassandra presents a new set of challenges, especially for developers with a background in relational data modeling. And there are added complexities in modeling for analytic applications which need to enable statistical functions over the data, but a good data model, exploiting Cassandra's strengths, can make all the difference to a successful project. This tutorial will examine a number of real-world customer data modeling examples and draw out some hints and tips that will benefit not just the Cassandra newbie, but also the more experienced data modeler.
Okay, good afternoon everybody. I'm Tim Moreton, I'm the founder and CTO of Acunu. I'm going to talk to you this afternoon about real-world, real-time data modelling. So I know it's the last session of the day and I'm only a small way between you and Cassandra-branded ale, so I'll try and keep it to the point.
But what I'm going to talk about, I guess, is the story of a couple of customers that we worked with, who were building real-time analytical applications on Cassandra, and some of the data modeling techniques that they used and that we helped them with to build their applications. I think they're pretty generally applicable and will be valuable to you guys as well.
So to start with: who the hell are Acunu? Well, we love Cassandra; we've been using Cassandra for a long time, since 2010. In fact, we've been working on and using Cassandra and helping customers with it ever since. We contributed a lot of the design and implementation work around virtual nodes. We originally built that for one of the customers that we were working with in the UK, Telefónica Digital, sorry, Telefónica O2, one of the UK's largest mobile providers.
They used Cassandra and Acunu together to collect, at peak, between 300 million and a billion event detail records a day; these are routing messages from SMS messages. They had a large amount of data on each of their machines, and they had a particular problem with that. And that sort of approach, where we work with customers and end up building product on the back of it, is exactly the same story I'll talk about for the rest of this presentation, mostly focused on data modeling. We also did a lot of work on the Cassandra Query Language in its earlier incarnations, and it's great to see how far CQL3 has come.
But as I say, this talk is really looking at the experiences that a few of the customers we engaged with on a support arrangement had, and the lessons that we learned through that from a data modeling perspective. So how do you build applications on top of Cassandra, and how do you get to grips, coming from a relational world, with the fact that CQL and the Cassandra engine are just a fundamentally different beast? So before I dive into that,
I think it's worth pulling out a bit of the experience that we have about how people are using Cassandra, just to set the context here. So on the one hand, one broad category of use case that we see is around session storage.
So that's where you perhaps have user profile information; perhaps you're in a social gaming organization and you're keeping the state associated with each user, and maybe you have 50 million of these. You push the state into Cassandra, you pull it out, you look at it, maybe you do an update on it, and you push it back. That's a common use case that we see, certainly not unique to Cassandra, but it's certainly a major aspect of usage there. And then the other side, I think, is real-time analytics.
So this is certainly a different beast in terms of the workload and the fit of Cassandra to this problem space. Here you're dealing with a stream of events being collected by Cassandra, and you're looking to summarize or aggregate or in some way pull information out of Cassandra that is a reflection of the data being put in, but isn't the data itself. You very rarely do updates in this situation.
So the characteristics of these two data sets are pretty different, and I think it's worth looking at that a little. If you're looking to build an application on Cassandra, or perhaps looking at various NoSQL or NewSQL technologies and thinking about whether Cassandra is the right fit for you, it's good to look at these characteristics and fit your problem into this space.
So with session storage, quite often your workload is going to be very read-heavy. You often want atomicity, or some reasonably strong guarantees about what is going to happen to the updates that you're pushing into Cassandra, or about the situation in which you're making those updates.
Also, if you look at it: even if you have 50 million users, if you have a kilobyte of data for each of those users, you're still only talking about less than the memory capacity of a single modern machine. "Big data" is a term that gets thrown around quite loosely, but I would describe many of these session storage use cases as probably not big data. With real-time analytics it's often quite different.
We've seen workloads where people are collecting clickstream data, telemetry data from telcos, infrastructure data, log messages, financial market data, and a range of other streams of data, typically coming at you with high velocity, and there the balance of reads to writes is often very, very different.
We typically see a hundred times more writes than reads, and all the reads are to the summaries or the results or the analytics that you're getting out the other side. You rarely want to read the original events that have actually gone in, and for any real system this really isn't going to fit in RAM.
So it's worth looking at this. Jonathan set out this morning a very similar list of Cassandra's strong points, and I absolutely agree: you've got scalability; high performance, especially write performance; and high availability, taking advantage of the fact that a cluster can span multiple data centers or racks, and be aware of that topology and work around it.
Those three facets are really strong points of Cassandra, and one of the challenges that still remains, certainly common to many distributed systems, is being able to get strong transactional semantics.
So for session storage, if you look at it, the scalability is not so much of an issue; the write performance is not particularly helpful for this workload; and often high availability can be useful, but you're still missing some of the ACID semantics. It's worth noting that the compare-and-swap operations coming in Cassandra 2.0 will greatly ameliorate this, that's for sure.
On the real-time analytics side, however, the picture's quite different. Scalability is really important here: you're typically going to be collecting large volumes of data. I gave you one example with Telefónica, but earlier this week I was working with a customer who was collecting one and a half billion events a day, and they want to keep all of these; and that's far from the highest that we see.
The imbalance in reads to writes means that write performance is a really useful characteristic of Cassandra, and high availability, similarly, is still useful here; but you don't need strong transactional semantics in this workload, and I'll come on to show you why.
So one of the, I guess, slightly contentious things I'm saying here is that there are many systems you can go to for session storage; lots of users use a number of different solutions here. If you want high availability, Cassandra is pretty much the only solution out there. But for real-time analytics it is a very, very good fit, and building applications that need to get insight out of high-velocity streams of events is a really common and natural use case for Cassandra, and I suspect that many of you guys in this room actually have that problem.
So the rest of the talk is going to focus on data modeling around real-time analytics applications, but these techniques are probably going to be useful as well if you have a session storage application. In particular, though, I'm going to focus on one example use case, which is not Telefónica O2 but another telco, where we were asked to help with a system that was collecting tens of thousands of call detail records a second, and they wanted to do network monitoring. There were several different use cases here: looking at drop rates, being able to track outages as they emerged, and understanding how, and which, customers were involved in and implicated in these problems.
So what we're really talking about here is operational analytics. It's not needle-in-a-haystack type processing, where you're collecting different data sets together and hoping to come up with some insight which is going to transform your business in some abstract way. It's very much to the point: you use your domain knowledge to apply the metrics that are important to your business, and then you want to be able to know what is happening to those metrics within a small time period.
There are many use cases to which this applies. If you're doing advertising analytics, clearly it doesn't take a Hadoop cluster to tell you that click-through rate is the metric you should be tracking; likewise latency on API requests. So these guys were looking to turn call detail records into real-time dashboards, so they could take corrective action when issues arose, and just check that the changes they were making to the setup were working fine; and they were using Cassandra to do this.
So I'm going to talk about how they did that. To start with, I'm going to just quickly summarize some basics of Cassandra data modeling. This may have been covered in a couple of other talks today, but I think it's well worth driving home.
The first thing is that you need to denormalize. Cassandra is not Oracle; it's not a relational database. You need to insert data in every arrangement in which you wish to read it back.
That gives you a lot of power, but with great power comes responsibility, and it's also a bit of a challenge, as we'll see later, to maintain agility when this is the case.
The reason you can do this is that Cassandra has a great storage layout, which means that random writes (or, more correctly, writes to random keys) are unlikely to incur disk seeks.
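To make that tenet concrete, here's a minimal CQL sketch; the table and column names are invented for illustration, not taken from the talk. The same event is written once per read pattern, which is affordable precisely because writes to random keys are cheap:

```sql
-- Two copies of the same events, one per query you want to serve.
CREATE TABLE events_by_subscriber (
    subscriber_id text,
    event_time    timestamp,
    payload       text,
    PRIMARY KEY (subscriber_id, event_time)
);

CREATE TABLE events_by_cell (
    cell_id    text,
    event_time timestamp,
    payload    text,
    PRIMARY KEY (cell_id, event_time)
);

-- Every incoming event is inserted into both tables at write time.
INSERT INTO events_by_subscriber (subscriber_id, event_time, payload)
VALUES ('sub-42', '2013-06-11 14:23:00', '...');

INSERT INTO events_by_cell (cell_id, event_time, payload)
VALUES ('cell-7', '2013-06-11 14:23:00', '...');
```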
The converse is also true, right? Sets of items that you're not likely to access together you should not put in the same row; there's no point incurring the cost of having to maintain them in sorted order. In particular, you're probably going to end up with enough stuff in each row that you need to somehow distribute your load across your cluster. And the final basic tenet here, really, is that atomic counters are a really useful building block for real-time applications.
A
So
cassandra
can
allow
you
to
insert
what
are
essentially
plus
ones
into
the
system
and
when
you
read
them
back,
you
get
the
actual
value,
and
this
is
this
is
a
great
building
block
for
not
all
real-time
analytics
use
cases,
but
pretty
much
pretty
much
most
so
just
to
highlight
this
one
event
being
one
event,
update
is
likely
to
result
in
potentially
many
updates
across
rows.
Don't
worry
about
that?
It's
absolutely
fine!
That's
that's!
To
be
expected.
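As a rough illustration of that building block (again a sketch; the table and metric names are made up), a counter column lets every writer blindly send a "plus one" without a read-modify-write cycle:

```sql
-- A counter table: the counter is the only non-key column.
CREATE TABLE metric_counts (
    metric text PRIMARY KEY,
    total  counter
);

-- Each incoming event contributes a "plus one".
UPDATE metric_counts SET total = total + 1
 WHERE metric = 'dropped_calls';

-- Reading back returns the accumulated value.
SELECT total FROM metric_counts WHERE metric = 'dropped_calls';
```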
So what follows, I guess, is a cookbook of several techniques that we helped this customer adopt, and that's only the first part of the story; I'll tell you what happens afterwards.
The first thing they wanted to do was to be able to track the occurrences of certain metrics, and to be able to do so over a variety of different time hierarchies. So they wanted to be able to count occurrences by day, hour, minute and second, and while that's not rocket science, it does require a little bit of thinking. What they do here is use a row for each level in the hierarchy, and within that row use the columns to maintain the subcomponents at that level.
So for every day you maintain a row, for every hour you maintain a row, and for every minute you maintain a row; and inside each of those rows, say inside an hour row, the actual minutes are encoded as columns. This is pretty convenient, because you'll be able to pre-build what's basically a graph of these occurrences over time, at any granularity, by just doing a single slice operation on a particular row.
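A hedged sketch of what one level of that hierarchy could look like in CQL3; the names are illustrative and this is a reconstruction, not the customer's actual schema:

```sql
-- One row per (metric, hour); the minutes inside that hour are the
-- columns of the row.
CREATE TABLE counts_by_hour (
    metric text,
    hour   timestamp,   -- truncated to the hour: identifies the row
    minute int,         -- 0-59: the column within the row
    count  counter,
    PRIMARY KEY ((metric, hour), minute)
);

-- Every incoming event does a "plus one" at each granularity.
UPDATE counts_by_hour SET count = count + 1
 WHERE metric = 'dropped_calls'
   AND hour = '2013-06-11 14:00:00' AND minute = 23;

-- A single slice of one row reads back a ready-made minute-by-minute graph.
SELECT minute, count FROM counts_by_hour
 WHERE metric = 'dropped_calls' AND hour = '2013-06-11 14:00:00';
```

The day and second levels follow the same pattern, each with its own table (or a granularity component folded into the row key).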
The second thing was: how do you add WHEREs to this? They were looking to filter these call detail records, these alerts and these occurrences, by a number of different fields: things like device type, model, carrier, and particular characteristics of the network.
Now, there are a couple of different ways of encoding WHEREs, but basically what you want to do is include the field in your row key, your partition key in CQL terms. The reason for doing that is that when you're filtering, you're trying to do something like "select the number of occurrences, grouped by time, where this thing happened", so you're unlikely to need to touch multiple different WHERE values at once.
If you are, fine, there are potentially other ways of doing it, but usually it's a filtering-type operation. So here you're augmenting the row key, and you have to augment every single combination of the row key that you've already got with each respective value of that filtering field.
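For instance, sticking with the same invented schema, the filter field becomes part of the partition key, and each event increments one counter row per filter-value combination:

```sql
-- device_type is baked into the row key, so a WHERE on it is just
-- a choice of partition, not a scan.
CREATE TABLE counts_by_hour_device (
    metric      text,
    device_type text,
    hour        timestamp,
    minute      int,
    count       counter,
    PRIMARY KEY ((metric, device_type, hour), minute)
);

UPDATE counts_by_hour_device SET count = count + 1
 WHERE metric = 'dropped_calls' AND device_type = 'handset-x'
   AND hour = '2013-06-11 14:00:00' AND minute = 23;
```

The cost, as the talk notes, is combinatorial: every filterable field multiplies the number of rows you have to update on ingest.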
GROUP BY is a sort of similar operation. What you're doing there is augmenting the column key that you have in the data model with the actual value of the GROUP BY field; then you can slice across multiple different components and pull out all those values at once. Remember, each row is going to be co-located on disk on a single machine, or a small number of machines, in the cluster.
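A sketch of that encoding, again with invented names: the grouping field goes into the column key (the clustering key, in CQL3 terms), so one slice of a single row returns the whole per-group breakdown:

```sql
-- carrier is part of the column key, so each minute fans out into
-- one column per carrier within the same row.
CREATE TABLE counts_by_hour_per_carrier (
    metric  text,
    hour    timestamp,
    minute  int,
    carrier text,
    count   counter,
    PRIMARY KEY ((metric, hour), minute, carrier)
);

-- One contiguous read returns minute-by-minute counts, broken out by carrier.
SELECT minute, carrier, count FROM counts_by_hour_per_carrier
 WHERE metric = 'dropped_calls' AND hour = '2013-06-11 14:00:00';
```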
So to go and get that data back, you're only going to need to go to one place on disk, and that's what you're looking to do: you're looking to minimize the work for every operation that you do. Then the next challenge these guys set us was: okay, we've got all this, this is nice, but we've seen some anomalies.
So: we'd like to find out what the constituent events were. We want to find out why, in this latency histogram, say, that you've just built using the techniques you outlined, there are these outliers. Why are there twice as many dropped calls in this particular area, or for this particular device type, at this time? I want to understand what those call detail records were; I want to understand what subscribers they affected.
What you're doing here is storing the original event. Some of that data is not going to be useful for grouping, some of it is not going to be useful for filtering; it's just going to be useful for finding out more detail about the underlying causes of a particular piece of analytics. So we maintain basically an ID mapping.
It's like a sort of manual secondary index, if you like, but managed by yourselves inside the Cassandra data model. So you see now that what you're getting is quite a lot of different colors in this data model, and remember these diagrams are simplified, because for every WHERE or GROUP BY you add, and every drill-down you maintain, you're going to need to manage all of this through CQL.
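A sketch of that ID mapping, once more with invented names: raw events are stored once by ID, and each aggregate bucket keeps the IDs of the events that fed it, so an anomalous point on a graph can be drilled down to its constituent records:

```sql
-- Raw events, stored once, keyed by a unique id.
CREATE TABLE raw_events (
    event_id timeuuid PRIMARY KEY,
    payload  text        -- the full call detail record
);

-- Manual "secondary index": which events contributed to each bucket.
CREATE TABLE events_by_bucket (
    metric   text,
    hour     timestamp,
    minute   int,
    event_id timeuuid,
    PRIMARY KEY ((metric, hour), minute, event_id)
);

-- Drill down from one anomalous minute to its constituent events.
SELECT event_id FROM events_by_bucket
 WHERE metric = 'dropped_calls'
   AND hour = '2013-06-11 14:00:00' AND minute = 23;
```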
So we got to a point where these guys were asking us to do more and more, and we had ended up with quite a lot of color in the data model; this was getting pretty tricky. At this point the system was working nicely, but the customer wanted to change things: they demoed it to their managers.
The managers saw how much potential it had and said: actually, please can we filter by these other characteristics? Please can we do this other form of analytics here? And I'd like a drill-down from this particular aggregate to this particular event. And then, of course, they also asked, very reasonably:
"Actually, could you just hand the system over to me, so I can do that myself?" And suddenly what you're looking at is needing to offer what you're currently doing in CQL, or in Thrift, through to your business owners, through to managers. In fact, the big problem really is that you can never anticipate the complete requirements, and this is a big part of the challenge.
A
So,
after
a
little
bit
of
iteration
on
this
a
little
bit
of
the
customer,
having
learned
some
of
these
techniques
and
built
this,
they
said
actually.
Is
there
a
better
way,
and
so,
from
that
experience,
what
we
ended
up
doing
was
productizing
some
of
this.
So
we
we.
We
noticed
that
there
was
much
commonality
in
the
data
models
that
people
were
building
and
wanted
to
help
people
build
something
that
was
a
level
higher.
So
we
put
together
a
framework
which
we
call
kuno
analytics
and
it's
for
cassandra
users.
A
A
I'll talk to you a little bit more about those things; all of this really started out as the data model that we worked on with a number of customers.
Acunu Analytics allows you to collect data via a RESTful HTTP API: you can just fire JSON objects or log lines at it. We have Flume integration, Storm integration and various message queue integrations. What it's doing is basically building continuous, OLAP-style cubes on ingest: rather than waiting overnight and coming back in the morning for your data store to have built aggregate cubes across a range of dimensions (which is essentially what you're doing in Cassandra anyway),
it builds them in real time. We allow you to do that using a high-level language, using concepts familiar from SQL, like WHEREs and GROUP BYs, LIMITs, HAVINGs, JOINs and so on, but without having to go into the detail or the depth of managing CQL.
So data comes in, we use Cassandra to store the raw events and the aggregates, and you can issue queries through this JSON API, which actually touch the aggregates in Cassandra.
So what I'm going to do here is prove Eric right, and see whether we can make something happen. What I'm going to show you is basically how you can achieve all of those Cassandra data modeling techniques in about three minutes, hopefully before my time runs out.
So the first thing I'm going to show you is that we have a UI here.
Basically, Acunu Analytics maintains a set of dimensions and a set of cubes on top of a table, and that table has an endpoint which you can fire JSON events at. So I'm just going to kick off a load generator, which is firing latitude, longitude, timestamp, duration and other data at it.
What you can see here is that we've basically said: treat the timestamp as a time and aggregate it up this hierarchy; treat latitude and longitude as a hierarchy aggregated at these levels; and maintain these cubes for me, because basically that was what I was doing in my Cassandra data model. So you can head over to here and say
something like: show me the count between five minutes ago and now, grouped by second, and you get a bunch of results. I don't think this would be classified as big data either, because we're only doing about 40 a second, since I set it going particularly slowly. But then, when your boss asks you to also compute the average duration of the calls coming in from those call detail records, I can do that, and then you can go and add that to a dashboard
which I will call "demo", and then you have real-time analytics powered by Cassandra. We've deployed this on use cases with, I think, up to about 5 billion events a day, so Analytics scales out linearly over Cassandra. And you can then also go and run queries like this one that I made earlier
to do exactly the sort of analytics that this customer was aiming to do with Cassandra directly; you can see all the SQL-like concepts there. So I'll make myself a new dashboard.
So there I have a real-time geo heat map, updating and aggregating call detail records, that's basically powered by Cassandra, and you didn't need to write a single line of CQL to do this.
In fact, it's pretty much all just set up through the UI. These queries are at a significantly higher level than what you would have to write if you were using Cassandra directly. So it gives you the power of Cassandra, but also helps you get to value more quickly.
What I did just there was change the latitude and longitude granularity, because we have those two buckets, and now we're going to aggregate at a different level of the hierarchy. So you can see I'm now collecting results at a different granularity.
But this isn't just a standalone tool: there's an API behind this. It's designed to help you build real-time analytics applications, and one of the things you can do is very easily embed these widgets in your own sites. So I think that's all I had to say. Thank you.
Yep, so: how does this relate to Storm? Where would you use this alongside Storm? Well, we actually have a number of customers who use Storm at the front end of what we do. Storm, for those who don't know, is a sort of distributed stream computing setup where you fire streams of events through and do processing on them; Storm is like a CEP system. We see Storm as a sort of upstream setup for processing data on the way into Analytics.
So quite a lot of our users do things like raising an alert when the volume of trades for a particular symbol has fallen below the mean minus one standard deviation of the usual level that that stock trades at, for ten o'clock to eleven o'clock on Tuesdays, historically over the last six weeks; or: just show me that graph, and show me another line of what happened last week.