Description
In this session, you'll learn how Apache Cassandra is used with Python in the NY Times ⨍aбrik messaging platform. Michael will start his talk off by diving into an overview of the NYT⨍aбrik global message bus platform and its "memory" features, and then discuss their use of the open source Apache Cassandra Python driver by DataStax. A progressive benchmark to test features and performance will be presented: from naive and synchronous to asynchronous with multiple IO loops, with the benchmarks tailored to usage at the NY Times. Code snippets, followed by beer, for those who survive. All code available on GitHub!
I'm the token New York Times guy at this event, but we do use Cassandra, and I'm going to talk a bit about that and about this project I'm working on. We're primarily a Python shop these days, and this is mostly new Python. OK, so I'll try not to be too boring to the Java guys. I've been in computing a long time; this is actually my 50th year of work.
That would be radically simpler than the enormous hodgepodge of stuff that we have, because we're an old company, and an old company in the tech space might be five years old. We're considerably older than that, and we have a lot of vestiges of ancient technologies, from Linotype to you name it. They all show up in our systems, and they all show up in one way or another in the stuff we have to maintain. So could we really make it simple?
Could we really make it flat, and could we really make it global, all at the same time? I think of our systems as trying to build a system for the ninety-nine percent. There's the Googles and the Facebooks and those guys, but we aren't those guys. We don't have the money and we don't have the manpower to do that.
So we started to think first about communications, about building a communications matrix, and I think of the fabric as primarily this communications matrix that extends through data centers around the world to support our communications. As we thought about it, we were doing the DevOps side too, so we were going to build the infrastructure, the application, everything, all from the ground up. By the way, when I say "we," I'm talking about me and my partner, so it's not a big crew.
So we thought, well, what if we just sort of scatter nodes out there? These are small nodes, and we're going to spread them through this matrix, and what we want to do is arm them with a way to self-organize and to register their capabilities, because they have different capabilities. Typically we run two or three different sizes of nodes, just along that simplification line; we run with, like, one AMI (we're running on Amazon, by the way).
Really, the computing problems and network problems are not the big problems. We run into the same problems you run into, which is that people make mistakes. One of our team (the team has grown to three by now) introduced a little bit of buggy code into Oregon, and because things are tightly connected, it ended up bringing down every node in Oregon: all the Cassandra nodes, all the core nodes, all of our brokers, all of our gateways.
As for the zones, we spread everything out across those zones pretty evenly, and we auto-scale the gateway nodes. Just to dive down into a zone, this is kind of what it looks like inside a region. This is really just a subset, and I'm not going to go through what these things are, but just to give you an idea of the connections and the potential connections: wherever it's marked that way, it's a potential connection.
We're scaling up and down; we're always adding and subtracting to do that. We really only speak WebSocket and AMQP. SockJS is a transition protocol for us that makes use of XHR streaming and so forth to emulate a WebSocket interface, so we can support virtually every browser, every device that connects to us for the news.
We picked up some things that are similar to Cassandra, which is interesting to me coming from a messaging background, namely that important things are replicated. We replicate our messages and we let them race to resolution. So, for example, if there's breaking news, it gets injected at one point in fabric. The first thing we do is replicate it to every region and zone; we process it in parallel, and then we exchange and resolve the duplicates and try to send only a single copy to the consumer.
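A minimal sketch of that race-to-resolution idea, with invented names (the talk doesn't show this code): replicas of a message carry the same producer-generated ID, and the consumer-facing edge only delivers the first copy that wins the race.

```python
import time

class Deduplicator:
    """Drop duplicate copies of a replicated message before delivery.

    Each message carries a producer-generated message ID; replicas of the
    same message race in from several regions, and only the first one
    through is forwarded to the consumer.
    """

    def __init__(self, ttl_seconds=300):
        self.ttl = ttl_seconds
        self._seen = {}  # message_id -> time first seen

    def first_copy(self, message_id):
        now = time.time()
        # Forget IDs older than the TTL so the set stays bounded.
        self._seen = {m: t for m, t in self._seen.items() if now - t < self.ttl}
        if message_id in self._seen:
            return False          # a replica already won the race
        self._seen[message_id] = now
        return True

dedupe = Deduplicator()
for msg_id in ["breaking-123", "breaking-123", "sports-456"]:
    if dedupe.first_copy(msg_id):
        print("deliver", msg_id)  # only one copy reaches the consumer
```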
We give back a message ID that's been generated by our message producers so that they can check; they can do an end-to-end check to see that everything got there, and when in doubt, they resend. If you get disconnected, you reconnect and we'll redirect you to something live. I'm not going to go into it today, but I'm really proud of our WebSocket server.
We think we can go substantially higher than that, and we'll actually stay with small machines. I've been really impressed here at the Cassandra conference by the size of the machines people run. We try to stay small, so we're trying to run Cassandra on the smallest machines we can, and we try to wring everything we can out of them. And that kind of brings me to the Python stuff.
Okay, so this is a snapshot I took this morning, just to see what yesterday looked like, and you see that big ugly spike there in the middle? That's what a breaking news alert looks like. If you're familiar at all with RabbitMQ, which is what we use for our message broker, you can see these rabbits are just loafing here, and we like to keep the rabbits loafing, because we are using clusters.
A cluster in the RabbitMQ world is different from a Cassandra cluster in that consistency is at the top of the list, so the nodes need to be chatting with each other all the time. The other thing we do is that we process everything in memory; we never persist. The reason we never have to persist in our message brokers is that we use Cassandra for that, so we've really separated the problem in a way that takes advantage of both of these platforms. Okay, so we do messages, and we have a very simple data model. This is actually a little too simple, but pretty much all my message is, is some metadata and a blob.
We don't care what's in the blob, because we're a platform, so we just push blobs around. We do care about the metadata, and in fact we have a very nice cube, as was discussed in a previous talk, of about eight items that we use in our metadata. We track everything by product, by project, and so on, so we can do internal accounting, keep track of our users that way, and provide them with nice reports and so forth. So that's what a message looks like; that is, of course, the Cassandra version of a message.
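As a hedged illustration of that data model (the keyspace, table, and column names here are assumptions, not the actual NYT⨍aбrik schema), a Cassandra message table created through the DataStax Python driver might look like:

```python
from cassandra.cluster import Cluster

# Connect to a local node; in production you'd list real contact points.
cluster = Cluster(["127.0.0.1"])
session = cluster.connect()

session.execute("""
    CREATE KEYSPACE IF NOT EXISTS fabrik
    WITH replication = {'class': 'NetworkTopologyStrategy',
                        'us-west-2': 3, 'eu-west-1': 3}
""")

# A message is just some metadata plus an opaque blob payload.
session.execute("""
    CREATE TABLE IF NOT EXISTS fabrik.messages (
        message_id text PRIMARY KEY,   -- producer-generated ID
        product    text,               -- metadata used for accounting
        project    text,
        created_at timestamp,
        body       blob                -- the platform never looks inside
    )
""")
```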
This is what a message looks like in Rabbit, and you can see how the metadata has kind of migrated down into the headers, and then some of those items have been promoted, in a way that is useful for the rabbit, up into the properties. The idea is that the message structure is general enough that we can represent it in Amazon S3, and we can represent it in a lot of different environments.
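A sketch of the Rabbit-side shape using the pika client, with invented exchange and field names: the metadata rides in the AMQP headers, and a few items are promoted into the standard properties.

```python
import json
import pika

connection = pika.BlockingConnection(pika.ConnectionParameters("localhost"))
channel = connection.channel()
channel.exchange_declare(exchange="fabrik", exchange_type="topic")

metadata = {"product": "news-alerts", "project": "breaking", "region": "us-west-2"}
body = json.dumps({"headline": "..."}).encode()

channel.basic_publish(
    exchange="fabrik",
    routing_key="breaking.news",
    body=body,  # the opaque blob
    properties=pika.BasicProperties(
        message_id="breaking-123",     # promoted into a standard property
        content_type="application/json",
        headers=metadata,              # the rest rides in the headers
    ),
)
connection.close()
```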
So let's think a little bit about the benchmarks I'm going to tell you about. I've got some timelines: the rabbit, the client, and Cassandra. Of course there's more than one rabbit, there's more than one client, and there's more than one Cassandra, but we're going to simplify a little bit, and this is a pretty simple diagram of how the benchmarks work. The client subscribes to the message broker and gets an event which has data, a message, and then does something with Cassandra. Cassandra acknowledges it, and we acknowledge back to the rabbit.
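A naive, synchronous sketch of that loop (queue, keyspace, and table names are assumptions carried over from the earlier sketches): consume from Rabbit, write to Cassandra, and acknowledge back to the broker only after Cassandra acknowledges the write.

```python
import pika
from cassandra.cluster import Cluster

cluster = Cluster(["127.0.0.1"])
session = cluster.connect("fabrik")
insert = session.prepare(
    "INSERT INTO messages (message_id, body) VALUES (?, ?)"
)

connection = pika.BlockingConnection(pika.ConnectionParameters("localhost"))
channel = connection.channel()
channel.queue_declare(queue="messages", durable=True)

def on_message(ch, method, properties, body):
    # Naive/synchronous version: block until Cassandra acknowledges
    # the write, then ack back to the rabbit.
    session.execute(insert, (properties.message_id, body))
    ch.basic_ack(delivery_tag=method.delivery_tag)

channel.basic_consume(queue="messages", on_message_callback=on_message)
channel.start_consuming()
```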
Okay, so we're using CQL native now, and we use the libev event loop. Actually, a lot of what I do is write the drivers, so I wrote the driver to Rabbit and got it to work together with the driver to Cassandra. I don't know if you've ever worked with an event loop like libev, but it's like having a wild animal in a cage: you feed raw meat in one end, stand back, and it spits it out the other end. It's really fast, but it's not that easy to control, and when you have two of them running and you're trying to run them synchronously together, you can run into problems. So Tyler Hobbs and I have gone back and forth a little bit, making the changes to both drivers that actually made that possible.
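Selecting the libev event loop in the DataStax Python driver is done through the connection class; this uses the driver's actual LibevConnection, which requires the libev library and the driver's C extension to be installed:

```python
from cassandra.cluster import Cluster
from cassandra.io.libevreactor import LibevConnection

# Use the libev event loop instead of the driver's default reactor.
cluster = Cluster(
    ["127.0.0.1"],
    connection_class=LibevConnection,
)
session = cluster.connect()
```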
So this is concurrency degree 3, and we're getting some overlap in processing. That's really the key thing to note, but we're doing it through event processing; we're not using threads here, and we're getting more concurrency. Now, this is a little misleading, because all those arrows are so neat. In fact, the order of things is not deterministic, and they will come back in orders that you don't expect. They will all eventually come back, but that means you have to kind of keep track, and the driver helps you do that.
So let's build a message. This is just some simple Python code to build a message, and it creates messages that have roughly the dimensions typical of our application. Then, when you push, you just keep pushing into the pipeline; we've got a pipeline going out and a pipe coming back. So I push up to my level of concurrency, then I'm done, and then basically this program waits around until something happens. The thing that happens is that something comes back in the pipeline.
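The slide's code isn't reproduced in this transcript, but a hedged reconstruction of a message builder with roughly those characteristics (the payload size and metadata fields are assumptions) might be:

```python
import os
import uuid

def build_message(payload_bytes=1024):
    """Build one test message with dimensions roughly typical of the app:
    a small metadata dict plus an opaque blob of about a kilobyte."""
    metadata = {
        "message_id": str(uuid.uuid4()),  # producer-generated ID
        "product": "benchmark",
        "project": "cassandra-driver-test",
    }
    body = os.urandom(payload_bytes)  # we never look inside the blob
    return metadata, body
```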
Okay, so I've built a message and I've submitted the query. What does the actual query submit look like? I grab the body and I do an execute_async in Python, and then I add my callbacks. My callback is where I'm going to get something back. Now, you'll notice that I pushed my limit of concurrency into the pipeline, so every time one comes back, all I need to do is ask: am I done yet? If not, push another one. If I'm done, then finish.
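Here is a minimal sketch of that pattern using the driver's execute_async and add_callbacks (those two calls are the driver's real API; the refill bookkeeping is a reconstruction, not the talk's exact code):

```python
from threading import Event
from cassandra.cluster import Cluster

CONCURRENCY = 50      # keep this many queries in flight
TOTAL = 22_000        # size of one breaking-news-style burst

cluster = Cluster(["127.0.0.1"])
session = cluster.connect("fabrik")
insert = session.prepare("INSERT INTO messages (message_id, body) VALUES (?, ?)")

finished = Event()
state = {"submitted": 0, "completed": 0}

def submit_one():
    metadata, body = build_message()          # from the sketch above
    state["submitted"] += 1
    future = session.execute_async(insert, (metadata["message_id"], body))
    # Results come back in no particular order; callbacks fire on the
    # driver's event-loop thread and keep track for us.
    future.add_callbacks(on_success, on_error)

def on_success(_result):
    state["completed"] += 1
    if state["submitted"] < TOTAL:
        submit_one()                           # refill the pipeline
    elif state["completed"] == TOTAL:
        finished.set()                         # everything came back

def on_error(exc):
    print("query failed:", exc)
    finished.set()

for _ in range(CONCURRENCY):                   # prime the pipeline
    submit_one()
finished.wait()                                # wait for it to wind down
cluster.shutdown()
```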
Okay, so that's basically how it works: I keep 50 in the pipeline all the time until it winds down at the end. So that's an example of a concurrent program. Now let's look at the push. This is my disappointing first graph of performance, and what's wrong with this picture? What's really wrong is that I didn't use DC-aware routing: we're running with six nodes in Dublin and six nodes in Oregon, and 170 milliseconds of round-trip latency.
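The fix is the driver's DCAwareRoundRobinPolicy, pinned to the local data center so queries stop paying the 170 ms Dublin-Oregon round trip (the DC name below is an assumption about how the cluster's snitch names the region):

```python
from cassandra.cluster import Cluster
from cassandra.policies import DCAwareRoundRobinPolicy, TokenAwarePolicy

# Route queries to the local DC; wrap in TokenAwarePolicy to also prefer
# replica nodes (the talk saw no measurable benefit from token awareness).
policy = TokenAwarePolicy(DCAwareRoundRobinPolicy(local_dc="us-west-2"))

cluster = Cluster(["127.0.0.1"], load_balancing_policy=policy)
session = cluster.connect()
```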
But let's look at a little more detail here; I'm going to be relatively quick going through these, I think. Basically, we're looking at how many workers. This Python program will multi-process to any degree, which is how I'm doing this, and it'll also take levels of concurrency, which I'll show you a little bit later. So I add workers. There are a couple of things to look at here. The concurrencies are down at the bottom. I'm doing LOCAL_QUORUM, because I found that that's how we run, and there wasn't that much difference between consistency level ONE and LOCAL_QUORUM in my benchmarks, so I focused on that. The percentages are interesting because I'm running on an eight-way processor here (it's a c1.xlarge), so there's a limit to how much I can allow any one of my services to take, and it's probably around 400 of the 800 percent that's available.
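A hedged sketch of that worker fan-out: each subprocess creates its own Cluster and session, since a driver session and its event loop shouldn't be shared across forked processes. The table and workload here are carried over from the earlier sketches.

```python
import multiprocessing

def worker(worker_id, total):
    # Each subprocess builds its own Cluster/session so it owns its own
    # event loop; sessions must not be shared across forks.
    from cassandra.cluster import Cluster
    cluster = Cluster(["127.0.0.1"])
    session = cluster.connect("fabrik")
    for i in range(total):
        session.execute(
            "INSERT INTO messages (message_id, body) VALUES (%s, %s)",
            ("worker-%d-msg-%d" % (worker_id, i), b""),
        )
    cluster.shutdown()

if __name__ == "__main__":
    workers = 4                                # roughly one per core
    procs = [
        multiprocessing.Process(target=worker, args=(i, 22_000 // workers))
        for i in range(workers)
    ]
    for p in procs:
        p.start()
    for p in procs:
        p.join()
```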
Okay, and this is what they all look like together. Basically, when you look at something like this, I like the ones where the slopes are rising rapidly; that's kind of where I want to be, one of these guys up here. But it's also got to end well, and these top four don't really end that well. Those ranges are an indication that I only ran three iterations here; they show the high, the low, and the median. Of those, for push in particular, I'd probably pick number four, because most of the time I'd operate it with concurrency up around 100 or 120, and most of the time I would expect it to be operating in this range. The benchmarks are tailored to our environment, so these are bursts of around twenty-two thousand messages.
When I get a burst, it'd be okay for it to go ahead and rise up the graph. At that rate, we're talking under 10 seconds of processing to take a burst, and a burst like that would be typical of what I get on a breaking news alert. So I guess the message here is that tailoring to your environment is important; this benchmark is really tailored for us.
This is the Python program. You can pretty much play with everything: how many remote DC hosts, the prefetch (prefetch is the concurrency), how many workers, whether it's DC-aware, and whether it's token-aware. I didn't talk about token awareness; I haven't found any benefit to it, and I don't know whether that's my code or the driver code, so I'm going to have to dig into it a little more deeply to figure out just why. We're running replication factor 3 on six nodes in each region.
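A sketch of that kind of command-line surface (the flag names are invented; the actual script is the one on GitHub):

```python
import argparse

parser = argparse.ArgumentParser(description="fabrik Cassandra benchmark")
parser.add_argument("--hosts", nargs="+", default=["127.0.0.1"],
                    help="contact points, including any remote-DC hosts")
parser.add_argument("--prefetch", type=int, default=50,
                    help="concurrency: queries kept in flight per worker")
parser.add_argument("--workers", type=int, default=1,
                    help="number of subprocesses to fork")
parser.add_argument("--dc-aware", metavar="LOCAL_DC",
                    help="pin the load balancer to this data center")
parser.add_argument("--token-aware", action="store_true",
                    help="wrap the policy in TokenAwarePolicy")
args = parser.parse_args()
```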
Just to give you an idea, this is kind of what it looks like when it's running: it builds up its 22,000 messages and then runs them. If you look at the unacknowledged count (unacknowledged in rabbit terms), that's my concurrency, so here I'm running with an overall concurrency of 3; I've only got one process, one consumer. If I look at one of the later trials, I'm running with 165: three consumers, which is a concurrency level of 55 per sub-process. Now, in Python these are not particularly expensive in terms of memory.
We're talking about 30 megabytes for each sub-process that you run, so it's not expensive that way. You do have to be careful about how much CPU is being consumed overall to stay efficient. Okay. The polls are similar, except there's a lot more data, so I would expect my graphs to be flatter, and the reason they're flatter is that I'm using a lot more CPU to accomplish the same thing; I just run out of CPU much more rapidly and don't get near the same throughput, but it's still pretty good throughput.
So my overall graphs look like this, a lot flatter, and this helped me with my planning. We do get bursts that are looking for reads, but they're nowhere near as common as the bursts of pushes that we get. Where I get a read burst like this is typically when a WebSocket gateway node goes down: if it goes down, all of its clients have to move at once.