From YouTube: How Email Met Apache Cassandra with Narmal 2014
Description
Stephen Portanova- Narmal is an iOS app that saves you time (and sanity) by making email more chat-like. This talk is about why a startup chose Cassandra as its primary data store, demonstrated with real-life examples. In particular:
• Cassandra's crazy powerful data model
• Cassandra's operational flexibility
• The Cassandra Community
Stephen Portanova speaks Haskell to God, Scala to women, Objective-C to men, and Javascript to his horse.
A
Cool, thank you. So I'm not the biggest Cassandra expert out there. I used it in production at my last company, but didn't really get a really good handle on it. So I wanted to come out and do kind of a personal project, which has kind of morphed into something a little bigger. Narmal is another email app, and if you're wondering about the logo and the name, it's a narwhalrus. No, those don't exist out in the wild. I wish. And Narmal is just like "nar," as in narwhalrus, and kind of like "mail," email. Just an easy domain. It was only $9.99. So that's the explanation there.
If you read even like TechCrunch or PandoDaily or anything, you probably get at least like one article a week talking about how email's broken and some new app is gonna come and fix it all and change everything and revolutionize the world. And that's me too, but I don't necessarily think that email's broken. I mean, it's kind of like an old '80s classic movie. It's probably about time for a reboot or an update, but there's a lot of good things about it too.
So I'm not necessarily bashing on email all the time, but there are problems for sure. If you guys can see this, this is actually my home screen, and I want to point out three things. First, that I have 16 Tinder notifications. I'm kind of a big deal. Not bragging or anything, but I'm bragging. Second, and less importantly, is that I have almost 14,000, to be exact 13,717, unread emails. That's across a couple of Gmail accounts and iCloud, and that's just unread. I probably have like 20,000 or 25,000 read ones, and I'm guessing that's not that uncommon. And the third thing, which is connected, is that if you look at my messages, I have zero unread messages. And I do have a few friends, I promise.
So it's not that I'm a total nerd, but I think the main thing is that why I have almost 14,000 unread emails and zero unread text messages is in large part because messaging, chatting, is just fun. It's a better interface. It's a little easier to work with, whereas with email you just get overwhelmed with it, you get swamped by it, and then every little thing just kind of builds on it, and then you don't even want to check it. Because once I got to, I think, like a thousand emails, I was never going to go through all of them. It just wasn't going to happen. It was too much work already, and now times that by 13, it's never going to happen. But I started thinking: what if we could kind of combine the best features of texting along with emailing? Because emailing does have a lot of good features, and we don't want to throw, what's that saying, the baby out with the bathwater, if you're into that kind of thing. But to back up a little.
So this is an email app. That's the bottom line: it's going to make emailing more like texting, initially for iOS. On the back end, I'm using Scala, with Cassandra as the primary data store. I'm probably going to throw some other things in there too soon, but I'm really liking Cassandra so far. And I'm doing it on AWS, which has a lot of advantages and fits pretty well with Cassandra, I think. So let me explain the overall concept a little more, and then we're gonna get into it. This isn't in production yet, so I haven't been dealing with a lot of those issues, so I'm pretty much going to be talking about mostly my data model. And so, if you know nothing about Cassandra, this may not make a lot of sense. If you know too much about Cassandra, then this may be boring, but you have to listen to me anyway. So let's do it.
So the app basically comes down to three views, at least initially, and the first view is what I'm calling the conversations view. Say I am emailing you. You and I are emailing back and forth, and say we're emailing several times over a few years. Basically, a conversation is just a group of people, that's all it is, and we can email back and forth. We have different subjects, and all those subjects, all those emails, are going to be underneath this one conversation, no matter how many times we email over a period of time. And it can work with you, me, and you, all three of us as a conversation.
So, actually, once you click into that subject, then you go down and you can see all the emails that are part of that subject. And so right now, I'm most familiar with Gmail, but I kind of feel like what we have right now is more or less the second and third screen. So if you and I were emailing, and say we email all the time, we're really good friends and we have like a thousand different subjects, you would see all these thousand subjects just in your list, in your main Gmail app once you log in. But one of the advantages is now all those thousand are gonna be underneath our conversation, so we can just go in and I'm not going to be kind of spammed with the thousands of things. It's going to kind of consolidate email, make it look a little more reasonable to deal with, without really making it that much different.
And so, basically, the bottom line is that, and I didn't put this in, but one user can have many conversations, a conversation can have multiple topics, and a topic can have multiple emails.
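
As a rough illustration of that hierarchy (not the speaker's code, which does not appear in the transcript), the relationships could be sketched with plain Python dataclasses. Every class and field name below is an assumption made for the example.

```python
from dataclasses import dataclass, field
from typing import List


@dataclass
class Email:
    timestamp: int  # epoch seconds (assumed unit)
    body: str


@dataclass
class Topic:
    subject: str
    emails: List[Email] = field(default_factory=list)


@dataclass
class Conversation:
    participants: frozenset  # a conversation is just a group of people
    topics: List[Topic] = field(default_factory=list)


@dataclass
class User:
    user_id: str
    conversations: List[Conversation] = field(default_factory=list)


# One user, one conversation, one topic, two emails (all values hypothetical).
alice = User("alice")
convo = Conversation(frozenset({"alice@example.com", "bob@example.com"}))
topic = Topic("Lunch?")
topic.emails.append(Email(1400000000, "Tacos at noon?"))
topic.emails.append(Email(1400000600, "Sure."))
convo.topics.append(topic)
alice.conversations.append(convo)

# Walk the hierarchy: user -> conversations -> topics -> emails.
email_count = sum(len(t.emails) for c in alice.conversations for t in c.topics)
```
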
And the thing that I really love about Cassandra is the data model. I think it's a ton of fun to use. I could be using it wrong, so if I am, tell me, but it's been so much fun to play around with so far, so I don't feel too bad. So this is actually pretty close to what I'm using. As far as a conversation, you have a user ID and you have a timestamp, so the partition key is the user ID and the clustering key is the timestamp. So let's go ahead and insert a few things.
The first one, and all these are going to be the same user, and so really the main difference is they're all going in the same row, which means the same node, which makes queries pretty quick, because we're only going to one node and they're all contiguous by timestamp. And so we have three things here, and the recipient hash is kind of what I'm using to differentiate it. It's not actually in the primary key.
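
The slide with the actual table definition is not part of the transcript, so the following is only a minimal in-memory sketch of the layout as described: rows are grouped by user ID (the partition key), ordered by timestamp (the clustering key), and the recipient hash is an ordinary column. The function names, the `hash-x`/`hash-y` values, and the short timestamps are all hypothetical.

```python
from collections import defaultdict

# Hypothetical stand-in for the conversations table:
# partition key (user_id) -> {clustering key (timestamp): recipient_hash}
conversations = defaultdict(dict)


def insert_conversation(user_id, timestamp, recipient_hash):
    # Writing the same primary key (user_id, timestamp) again overwrites the row.
    conversations[user_id][timestamp] = recipient_hash


def latest_conversations(user_id, n):
    """Return the n most recent (timestamp, recipient_hash) rows for one user."""
    partition = conversations[user_id]
    newest_first = sorted(partition.items(), reverse=True)
    return newest_first[:n]


# Three inserts for the same user; two of them belong to the same group of people.
insert_conversation("user-1", 145, "hash-x")
insert_conversation("user-1", 146, "hash-y")
insert_conversation("user-1", 147, "hash-x")

recent = latest_conversations("user-1", 3)
```

Because the recipient hash is not part of the primary key in this sketch, `recent` contains `hash-x` twice, once per timestamp.
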
So the problem is, if I wanted to get just the last n most recent conversations, I'm gonna have two of the same conversations. I'm gonna have the first one, the one for x and 147. That's kind of my abbreviation for timestamp; I didn't want to type out ten or eleven digits for epoch time. And so we're gonna have that come up as a double, or not a double, but come up twice, and I'm more or less fine with that right now, because I can filter that out application side. I thought about using something like Redis sorted sets to kind of actually store them and then do a query to Redis to find out the exact contiguous spaces I need, but this is way easier, way easier operationally.
This is confusing and it's not gonna make sense, because I didn't define the type class for you guys, but this is just my method where I group things by the recipient hash and then take the head of that list. So it comes into a map of lists, a map of string to lists, and for each recipient hash, which is the key, it's going to take the most recent recipient, or rather, it's gonna take the most recent conversation. So I'm not gonna be showing the user the actual kind of duplicate conversations. I'm only gonna show the most recent conversation.
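
The speaker's Scala method is not captured in the transcript. A minimal Python sketch of the same idea (group by recipient hash, then take the head of each group) might look like this, assuming the rows arrive newest first and using hypothetical field names:

```python
def latest_per_recipient_hash(rows):
    """Group rows by recipient hash and keep the head (newest row) of each group.

    Assumes `rows` is already ordered newest first.
    """
    grouped = {}
    for row in rows:
        grouped.setdefault(row["recipient_hash"], []).append(row)
    # Dicts preserve insertion order, so groups come out newest first too.
    return [group[0] for group in grouped.values()]


# Hypothetical query result: the conversation with hash-x appears twice.
rows = [
    {"timestamp": 147, "recipient_hash": "hash-x"},
    {"timestamp": 146, "recipient_hash": "hash-y"},
    {"timestamp": 145, "recipient_hash": "hash-x"},
]

deduped = latest_per_recipient_hash(rows)
```
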
And really, the main reason I brought up Scala is because, in my opinion, it works really well with Cassandra, especially when you start using the asynchronous functionality of Cassandra, which I think you probably should be doing. Scala and the futures and the actor model make it so easy to do non-blocking code, and this is kind of just showing off how nonsensical but badass functional programming can be.
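
The talk does this with Scala futures, and that code is not in the transcript. As a loose analogue only, here is what issuing two queries concurrently without blocking could look like in Python with `asyncio`. The two fetch functions are stand-ins that sleep instead of calling a database driver; their names and return values are assumptions.

```python
import asyncio


async def fetch_conversations(user_id):
    # Stand-in for a non-blocking driver call; no real database is contacted.
    await asyncio.sleep(0.01)
    return [f"{user_id}-conversation"]


async def fetch_topics(user_id):
    # Second stand-in query, independent of the first.
    await asyncio.sleep(0.01)
    return [f"{user_id}-topic"]


async def load_screen(user_id):
    # Both queries are in flight at the same time; neither blocks the other.
    conversations, topics = await asyncio.gather(
        fetch_conversations(user_id),
        fetch_topics(user_id),
    )
    return {"conversations": conversations, "topics": topics}


screen = asyncio.run(load_screen("user-1"))
```
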
So now, we talked about conversations. Moving down to topics, which is just the same thing as a subject, and it's pretty similar, except now the link to the conversation is the recipient hash.
Yeah, exactly. Good. The timestamp is just the epoch timestamp; it's of the latest email. Yep, yep, exactly. And yeah, so the recipient hash, it's just a tree set, or sorted set, and then stringify and then MD5 hash it. So nothing too crazy there.
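
A minimal sketch of that recipe (sorted set, stringify, MD5), using Python's standard `hashlib`. The lowercasing, whitespace stripping, and comma separator are assumptions added for the example; the talk does not say how the set is turned into a string.

```python
import hashlib


def recipient_hash(addresses):
    """Hash a group of recipients so that order and duplicates do not matter."""
    # Sorted set: deduplicate, then put the addresses in a stable order.
    normalized = sorted({address.strip().lower() for address in addresses})
    # Stringify, then MD5 the result.
    joined = ",".join(normalized)
    return hashlib.md5(joined.encode("utf-8")).hexdigest()


# The same people in a different order produce the same hash.
first = recipient_hash(["bob@example.com", "alice@example.com"])
second = recipient_hash(["alice@example.com", "bob@example.com"])
# Adding a third person produces a different conversation.
third = recipient_hash(["alice@example.com", "bob@example.com", "carol@example.com"])
```
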
But again, as you can probably see, timestamp's the second one, and now, kind of taking the place of recipient hash as far as the column value, is thread ID. And I've only done this on Gmail, so maybe Yahoo and iCloud have different things, but each email and each thread obviously has a thread ID. And so now we're going to run into the same problem where, if we do queries, we're going to have basically the same topic, same thread ID, but just at different times, because you can email someone in the same topic, obviously same subject. But we only want to show each topic in the second screen, or in this screen, once.
In reality, oh, and this is just kind of how that ends up looking, because they have different recipient hashes, they're different people. It's gonna be actually on two different Cassandra rows, so you can start getting the wide columns, or I guess we had wide columns before, but it's a little different way to look at it. And we can do that same exact thing we did before with the Scala type class to filter out the duplicates. Instead of recipient hashes, this time I'm going to filter out the duplicate thread IDs, so the user only sees one of each subject.
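
The topics table itself is not shown in the transcript. The sketch below assumes that "two different Cassandra rows" means the recipient hash is part of the partition key alongside the user ID, with the timestamp as clustering key and the thread ID as the column value. All names and values are hypothetical.

```python
# Hypothetical stand-in for the topics table:
# (user_id, recipient_hash) -> {timestamp: thread_id}
topics = {}


def insert_topic(user_id, recipient_hash, timestamp, thread_id):
    topics.setdefault((user_id, recipient_hash), {})[timestamp] = thread_id


def topics_for_conversation(user_id, recipient_hash):
    """Return (timestamp, thread_id) pairs newest first, one per thread ID."""
    partition = topics.get((user_id, recipient_hash), {})
    seen = set()
    result = []
    for timestamp in sorted(partition, reverse=True):
        thread_id = partition[timestamp]
        if thread_id not in seen:  # keep only the newest entry per thread
            seen.add(thread_id)
            result.append((timestamp, thread_id))
    return result


# Same thread emailed twice with one group; a separate group lives in its own partition.
insert_topic("user-1", "hash-x", 145, "thread-a")
insert_topic("user-1", "hash-x", 146, "thread-b")
insert_topic("user-1", "hash-x", 147, "thread-a")
insert_topic("user-1", "hash-y", 146, "thread-c")

visible_topics = topics_for_conversation("user-1", "hash-x")
```
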
And so now we get down to email, and the email side of things is pretty straightforward. Now we have user ID and we have a thread ID to kind of integrate back into the subject, the topic, and this is going to be probably the widest part of the Cassandra row. And just kind of examples of what's going on: I really don't care about sportsball at all, but I thought I'd troll you guys, so no worries. Go 49ers in the World Cup.
But so, pretty straightforward: now the emails are going to be organized by the thread ID, as opposed to topics organized by the recipient hash. And so that's kind of interesting, I feel like, and some people may like that, other people may not. Other people may actually want to make emailing almost entirely like texting. And so what we can do there, which is really easy to do with Cassandra too, and I'm doing both of these, by the way, so whichever view you like you'll be able to go to, is just get rid of the topics entirely and have the emails correspond directly to the conversations. That'll make it pretty much like texting, plus some UI improvements, but I think you guys can more or less get the idea. And we actually only need to make a really small change. So again, a conversation now, skipping topics, has many emails, and pretty much all we're going to do is substitute.
Then I can just do a range slice query on the timestamp to get the next 20 results. So the wide rows part isn't a big deal, and entropy should still be okay.
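
To make the paging idea concrete, here is a small in-memory sketch of a range slice on the timestamp: given rows sorted by timestamp, fetch up to 20 that are strictly older than the last one already shown. It uses Python's standard `bisect` module on a list; the function name and sample timestamps are assumptions, and no Cassandra driver is involved.

```python
import bisect


def next_page(timestamps, before, page_size=20):
    """Return up to page_size timestamps strictly older than `before`, newest first.

    Assumes `timestamps` is sorted in ascending order.
    """
    end = bisect.bisect_left(timestamps, before)
    start = max(0, end - page_size)
    return timestamps[start:end][::-1]


# Fifty hypothetical emails in one wide row, keyed by timestamp.
email_timestamps = list(range(1000, 1050))

# First page: the 20 newest. Next page: the 20 just before the oldest one shown.
first_page = next_page(email_timestamps, before=float("inf"))
second_page = next_page(email_timestamps, before=first_page[-1])
```

Each page touches only a contiguous slice of the row, which is why the total width of the row matters little for this access pattern.
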
Yeah, okay. So again, I'm using EC2, and one of the really cool things I found about EC2, a brief story: about a year ago I got a thousand dollars in Amazon credit, and I totally forgot about it for like 11 months, then logged on and saw it and it was expiring. I called them; they couldn't replace it or anything. So, rather than just losing it...
First time took maybe like 10, 15 minutes. Again, it's really easy, and I'm sure you can do it. And so another point, as far as my data model goes: it's still relatively relational, and I think the range queries work well, but I mean, you could definitely still implement this in MySQL or Postgres or something, but either sharding it or maybe MySQL Cluster would be easier.
Some features that I haven't used, but which definitely are like a big poster child for Cassandra, are the multiple data centers. So when Red Dawn happens and the Russians and Nicaraguans invade and up to Colorado is occupied, your service is still going to be good, because you have a data center in Virginia with EC2. And another thing I'm actually really excited about with Cassandra is Spark. It's a first-class citizen there, and I'm really excited about the streaming.
Obviously, you can do more MapReduce-type stuff as well, but Cassandra's a first-class citizen there, and it seems like Spark's kind of taking over the world. But I think that's about all I have. So yeah, definitely not an expert in Cassandra, but it's been a lot of fun, and if you guys have any advice, I'd love to hear it, about things I screwed up on.