From YouTube: Activision: How to turn Massive Amounts of Streamed Data into Real Time Personalized Experiences
Description
Speaker: Darryl Kanouse, Senior Director - Consumer Technology
In 2014, with the launch of Call of Duty: Advanced Warfare, Activision released a system for messaging its users with highly personalized and contextually relevant communications designed to enhance the user experience and deepen user engagement. Now, in year two, the system is being extended to serve all Activision titles and to deliver these experiences in reaction to player behaviors. The key to success is Activision's ability to process massive amounts of data in real-time using a data center built around Cassandra as the primary user profile store.
So, hello. I'm here to talk about how to turn massive amounts of streamed data into real-time personalized experiences for millions of users. Just quickly about me: my name's Darryl Kanouse. I am a Senior Director of Consumer Technology at Activision. Before that I was a Principal Solution Architect at a marketing agency called Rosetta. Activision was our client at that time, so I've been with Activision for a couple of years, but I've worked with them for about four or five years.

I also do wire art and music, and I have kids that I love. This is my wire art, and I have to put it up, because I do one of them every day. Today is day 181, and that was the one that I did. If you're like me and you google your presenters, you'll see a lot of this stuff. But anyway.

So let's talk about massive amounts of streamed data for real-time personalized experiences. I have sort of a subtitle of "architect for success." I think it's probably a bit of a cliché to say that.
But that's definitely my perspective: there are a lot of things you can do with optimizations, low-level tuning, that kind of thing, but the decisions that you make up front about how you're going to build your system tend to be the ones that determine your success or failure in these kinds of scenarios. So we'll talk a little bit about how, at Activision, we went about the process of architecting our solution for streaming data, and the role Cassandra plays in it.
So there are some important things to think about when you're talking architecture. There are technical considerations, and then there are non-technical considerations. I think the non-technical considerations are almost as important as the technical ones. The technical ones you can do something about; generally speaking, the non-technical ones you cannot do anything about, and they sort of define the constraints around the system that you're trying to build. So we're talking about stuff like business timelines. At a company like Activision, we release video games on an annual cycle, and November is the big release month for us.
Whatever solution we come up with, and however we architect things, they need to fit within the constraints of the business timelines. You can't have your solution deployed in December; that just doesn't work, and I'm sure most businesses have a similar type of constraint around product releases and how stuff works. There's also the financial environment that you're working within. Again, this is often stuff we don't have a lot of control over: budgets. I think that's pretty normal; I think we're all sort of familiar with that.
We also have legacy systems, something else we consider a lot. When we're talking about taking a personalization platform and adding it to a video game, we're talking about bringing a new system into a pre-existing ecosystem of a bunch of other applications. Understanding what they do and how they work is important in helping us decide what decisions we're going to make and how we're going to approach our solution. And then finally, there's organizational politics. I'm sure everybody is familiar with some level of that; at Activision it's not really that bad.
So I guess that's all to say that, when we talk about how we're going to do things like large-scale personalization in real time, the context under which we're building this application is very important. So with respect to Activision and the context there, let's talk a little bit about the history of Activision and how we got to the things we're doing this year. I don't know, maybe you guys are familiar with the company. We started in 1979 as the first third-party developer for any games at all.
It was for the Atari console cartridge system. Up until Activision got started, all of the games that you got were made by the manufacturers of the consoles, so that's a legacy that sort of continues on today: we are still the largest third-party game developer. As far as the industry as a whole, just to give you a sense when we talk about large scale: Activision is a big player in this industry, and this industry is very, very big. 155 million Americans play video games.
51 percent of all US households own a dedicated console, whether it's an Xbox or a PlayStation. Forty-two percent of Americans play more than three hours a week. Twenty-two billion dollars is spent on the video game industry. It's big, and Activision is one of the biggest players in that industry. We currently have over 4,000 employees and 38 locations; that's a mix of offices, development studios, and support teams. It kind of depends on the day that you're asking, since it's something in flux, but that's about where we stand today.
We release a lot of video games, a lot, but there are three at the moment that are probably the ones most people have heard of. They are blockbusters, the kind of tent-pole titles in the organization. Call of Duty, you've probably heard of that one. Destiny was released last year, one of the biggest new IP releases ever in the video game world, which is not hard to do.
I don't think that's very controversial; I'm sure there are lots of statistics that can support it. But I think at this point we all sort of acknowledge that that's the expectation: you just have personalized experiences, that's the thing. But we don't just do personalized experiences for nothing. Obviously, the reason they're valuable is that they increase player engagement. They make people happy, and in the video game world we can measure that engagement through a couple of key metrics, session frequency and duration: basically, getting people to play more.
Presumably, because they're having a good time and they're enjoying it. And also, we are a business, so average revenue per user is a metric that we also look at. Just as a side note, the acronym for average revenue per user is ARPU, which still elicits giggles when we talk about it. There is another metric that we look at, which is average revenue per paying user, which has the unfortunate "ARP-poo" acronym; people still giggle when we talk about it in the office.
Historically, we've done personalization across a lot of channels. It started with email; we do website stuff; the mobile app has personalization. Most of what we'll talk about architecturally is the in-game, on-console stuff, but I think it's important to understand that the platform we built is multi-channel and supports a lot of different stuff. So here's some background on those channels and how we got, from a development team standpoint, to where we are today.
That's kind of the story of CRM. If you guys are familiar with CRM as a traditional discipline, customer relationship management, outreach, that's the type of organization you would expect to be responsible for personalized experiences and driving engagement, making people repeat customers. At Activision, about four or five years ago, there were the beginnings of the CRM group that has now evolved into the group that I'm a part of today. At that time, this was 2011, with Modern Warfare.
It was very basic. We were just doing basic segmentation strategies, like: do you play a lot or do you play a little? Are you good, by your KDR, or are you bad? And then we would send you emails related to that. It was kind of putting our toe in the water. The technologies we used at that time: Hadoop for all the basic data collection, Infobright, and Oracle. Oracle is kind of a legacy system that the IT group had, but we used it.
We had an email service provider that would actually send out the emails, and we had a lot of guys who would write SQL code against Oracle to produce CSV outputs that we would FTP off to the email service provider to deliver emails. That's where we started, and oh, those were the days. So now, moving on to the next year, Black Ops 2: we realized it had become important to stabilize the data environment.
This was the first time we'd actually looked at millions of users' data all coming in at the same time and tried to deal with it. Now, at this point we're still not talking real time; we're talking about capturing the data and analyzing it offline. Automation clearly was becoming a problem: having people writing SQL code to generate ad hoc lists to send emails is not an ideal scenario. So this was the beginning of an automation process, and we refined our targeting so that we weren't using the same basic segmentation around whether you're good or not.
We actually started to look at stuff like: well, what are you good at, and maybe that's something we could use to talk to you. Are you really good at a particular mode of play versus some other mode? Our technology platforms evolved only really to the extent that we ditched Oracle, not a year too soon, and we used Infobright as the basic data mart for all of the data points we were using to drive this.
So, moving on to 2013, we learned a lot from the previous year. One of the lessons was that Infobright was terrible. I hope there's nobody from Infobright here, but it did not work for us; we had lots of problems keeping it up. What we did want to do was continue on the path of advancing our segmentation and advancing the personalization efforts by looking at more and more data, and we also began to introduce some of these other technologies to increase the automation. This was a progressive step along the path.
So one of the key things that happened in 2013 was the development of a personalization engine that, as a modular component, was the type of thing we could throw data at, throw a set of rules and some content at, and it could smash it all together and produce an experience. That experience was email at the time, but even now we still use the same basic principles in how we do the stuff in game.
Those are our mandates for personalization, and now we're starting to see some hints at how our architecture is going to need to conform to some of the constraints we have. We have to maximize relevance and user value. We can't just talk to people about nothing; we can't, you know, put suggestions in front of them to buy stuff they're not interested in. We really need to make sure that people feel there's a value add from this personalization effort.
This is in email form, and I think this is a good representation, more or less, of the general approach to personalization that we still have today, even in the game. One of the things that's worth looking at (look, I got my little pointer yesterday, it works): if you compare the guy over here on the left to the guy over there on the right, you can see that the way we've stacked content is different. As a rules process, there are perhaps stats we want to give to people.
There are achievements we want to congratulate them about, and obviously those things are going to have some degree of customization; they'll have the relevant data points. But the question for us is: do we show them stats at all? So if you look at the person over here on the left, that guy would be somebody we might consider to be new, and for him we would say: he had a good week, let's show him some stats. He might get kind of a kick out of that and think that's great. Then there's the guy over there on the right.
Perhaps a vet, perhaps not as impressed with his own stats; however, he would like to know that he prestiged, because he's spent a lot of time playing the game. So these are the types of ways that we manifest experiences, and this is a good visual for how that works. It all kind of starts with having a lot of content. The kind of personalization we talk about doesn't really work if you only have three or four things to talk about. We actually have thousands.
I think ten thousand by last count, of individual types of things we can talk to people about. So, as you can imagine, there's a pretty massive production machine behind all this stuff that generates this content. You can see some of the types of things we're talking about: stats, congratulating people, giving them tips, all geared towards making them feel good and want to play more. Okay, so last year was 2014. This was the big shift out of the emails and into the game.
I can't really overstate what a significant evolutionary step this was for us, because it meant that our systems now had to be production ready. It's one thing when you have stuff that's working offline and you're writing applications that maybe send an email, maybe don't. But when you're talking about putting something in the game at Activision, you need to tread very carefully, and this was the year.
Incidentally, this was the year that Cassandra came into play for us. We started to realize that in game you're talking about communicating with everybody, whereas in email you've really only got a small subset. You're also talking about the need for reliability, the need for performance; everything starts to get a lot trickier. And so: Cassandra. We still use Hadoop and Greenplum as an analytics source of sorts.
But the key here is Cassandra, and I'll talk more about how that Cassandra stuff worked. This is what it looked like inside the game. Like I mentioned before, you can see a lot of the recurring themes: here's somebody who got an achievement for something; there's a tip on how to use some of the scorestreaks. Again, these are all targeted specifically. I think these were mine.
I think I used the missile very poorly, and so it was telling me how I could use it better. You can start to see it, and we got very positive feedback on this as well. So now we're talking about this year, and this is where the real-time piece came into play: making it happen in real time. If getting into the game was an incremental step forward in terms of general stress and anxiety about the application, making it happen in real time really ratchets things up, as you can see.
We have slightly different technology stacks that we're using now. You'll recognize, and I probably don't need to say it, that Amazon is playing a role in what we're doing now that it wasn't before, but we still have Cassandra. One thing I will note, if you were paying attention across all of the slides: this is the first year that we actually stuck with a data platform for profile management two years in a row, and it was Cassandra, and that's why we're here. It's great; it works out for us. Okay, so I think that's it.
Excuse me. Okay, so we have a thing that we want to build: personalization, streaming, real time, and we need to put it into the game environment. We are putting it into a very, very noisy place that has a lot of stuff happening. In order to keep a game like Call of Duty active and online, there are systems and services that are constantly running that we need to be aware of, so that we don't step on them. This is a bit of the legacy-system consideration: authentication, matchmaking, marketplace.
We have peak concurrent users somewhere around two to three million, give or take, depending on, you know, whether a game just launched or whether it's April. We have a service gateway that acts as kind of a proxy and caching tier between all the backend services and everything the games interact with. So all games talk to the gateway, and behind the gateway is a whole set of services. I don't think this is a particularly unique or different kind of architecture.
I think you'll see this in a lot of places; this is sort of how it works. Also, just like in a lot of other places, the mess is behind the service tier, where you have a complete hodgepodge of a bunch of different things that have developed like barnacles over the years: core applications that need to run, but that haven't been touched in a long time, all kind of mixed and matched.
But the truth is, there's actually some rhyme or reason to it. There are owners of the systems; they can tell you why they are the way they are and why they need to be there. And as it happens, we are basically building one of those things; we're another one of the barnacled bits in there. Okay.
"Will it scale?" is the question about our application. Making sure that the answer can be yes is where we start looking at the requirements in a little bit more depth, and one of those is understanding what personalization means. For us, I think it's important to lay out up front what personalization is not. In a game like Call of Duty, or any of the games we do, it is not messing with progression systems, or changing weapon performance on a personal basis, or monkeying with maps, or anything like that.
As much as we would like to make the game easier for people who are bad, because we know they will play more, that sort of violates the notion of the integrity of the game and the level playing field that we need to support. However, we do have a lot of options and things we can talk about that will result in people having a better time, getting better, and enjoying it. I don't know how many of you guys play Call of Duty; I know my first experience of it was brutal.
It was terrible and it was frustrating, but after many hours of play and lots of tips, I feel like I'm a little bit better now. Mainly it's the tips and the congratulatory, you know, attaboy type of stuff that keeps people coming back, despite the fact that it can be a relatively brutal experience. So the other thing to consider.
That was what personalization kind of is: it's sort of the in-game manifestation of stuff that we talk about in email. Then there's also the consideration of what real time actually means, and there is nothing that is instant in anything in the world. There's the speed-of-light constraint, but in addition to that, you've got to move data from one place to another, and you've got to act on it.
So the window that we have to work with is what happens at the end of the match. When you're playing a game, there's data that gets packaged up and sent off, and that starts the clock ticking. After that, from a user experience standpoint, there's a killcam, which is sort of a replay from the last guy who killed somebody; it's fun if you're that guy, not as fun if you're the guy that got killed. Then there's an after-action report in the game.
That's kind of a list of stats. All this stuff is happening in the game while we are taking that data, processing it, and trying to decide what we're going to do next. Then they drop back into a lobby; the lobby is kind of the waiting ground where people sit and wait for the next match to start, and that's really the opportunity to communicate with them that we want to take the most advantage of.
What is too long for us to wait? For us, it's 29 seconds. That 29 seconds may seem like a long way from real time, but in any system like this, when you talk about real time, you're measuring it by the experience of the user. In essence, the experience will feel real time, because they played a match, and at the first opportunity we had to talk to them, we have something that's relevant and related to that data.
So, the pipeline: parse the inbound match stats; determine whether or not a profile update needs to happen; archive the inbound data for analysis and for replay; deliver the user to the business rules engine that can assign the content. That rules engine will then load the user profile, including all the stats that are generated offline, apply the updated segmentation, determine the appropriate messaging treatment and a bunch of other stuff, and then eventually send all of that back out to the game. And we do that at peak concurrent volumes.
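The loop just described can be sketched end to end. This is a minimal, self-contained illustration of the steps (parse, archive, update, load profile, apply rules, respond); every class, function, and message string here is a hypothetical stand-in, not Activision's actual code.

```python
# Minimal sketch of the per-match real-time loop described above.
# All names and rules are illustrative assumptions.

class ProfileStore:
    """In-memory stand-in for the Cassandra-backed user profile store."""
    def __init__(self):
        self.profiles = {}  # user_id -> {attribute: value}

    def apply(self, user_id, deltas):
        prof = self.profiles.setdefault(user_id, {})
        for attr, delta in deltas.items():
            prof[attr] = prof.get(attr, 0) + delta

    def load(self, user_id):
        return dict(self.profiles.get(user_id, {}))

def pick_message(profile):
    """Toy rules engine: congratulate big matches, otherwise send a tip."""
    if profile.get("kills", 0) >= 100:
        return "Congrats on 100+ kills!"
    return "Tip: try a different scorestreak."

def handle_match_end(match_stats, store, archive):
    """Process one inbound match summary within the ~29-second window."""
    archive.append(match_stats)                 # archive for analysis/replay
    results = {}
    for player in match_stats["players"]:
        uid, deltas = player["user_id"], player["deltas"]
        if deltas:                              # profile update only if needed
            store.apply(uid, deltas)
        profile = store.load(uid)               # full profile, incl. offline stats
        results[uid] = pick_message(profile)    # back out toward the game
    return results

store, archive = ProfileStore(), []
out = handle_match_end(
    {"players": [{"user_id": "u1", "deltas": {"kills": 120}},
                 {"user_id": "u2", "deltas": {"kills": 3}}]},
    store, archive)
```

In the real system each of these steps is a separate, independently scaled component; here they are collapsed into one function purely to show the data flow.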
That's really what our challenge is, and when the question is "will it scale," it's: whatever you can do to make this happen, is it something you can still do when there are two or three million people trying to do the same thing? The answer is yes, and we can do that through three core architectural principles. The first one is a low-latency, high-volume datastore, and I don't think there's any mystery about what that is. Then there's stateless, bespoke processing nodes; we'll talk a little bit about that. And then queues. Queues! Total side note: "queueing" is an eight-letter word with five vowels in a row, the only one in the English language. I take any opportunity to write it, so there it is. So first, let's talk about the low-latency, high-volume data store.
I think it's pretty self-evident that the more you know about somebody, the better you can be at communicating with them. So let's just say, hypothetically, that what we want to do is recommend or talk to somebody about a particular treat, a cupcake, for example. Knowing a little bit about them, we can narrow down that, in this case, they like frosting. What we're talking about from a game standpoint is just understanding: what console do you play on? How long have you played? What's your XP?
This is not representative of all of the data that we capture, you know, the match-level stuff for purposes of analysis and replay; this is really only the profile for an individual. So when you're talking about 30 million users and you've got a thousand attributes, you're talking about a profile store of sorts that's going to need to hold 30 billion records.
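The arithmetic behind that record count is straightforward, with one record per (user, attribute) pair:

```python
# One record per (user, attribute) pair in the profile store.
users = 30_000_000            # ~30 million users
attributes_per_user = 1_000   # ~1,000 attributes each
records = users * attributes_per_user
print(records)                # 30,000,000,000 rows
```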
So one of the things that is important, when we try to understand how we're going to deal with this volume and how we're going to scale, is to be judicious about where we get the data: which things really have to be real time versus which things don't. The fact is, there's a lot of stuff that really doesn't have to be real time in order to provide the right experience.
So we look at our data sources. The real-time data source, the stream that comes in off the gateway while everybody is playing, provides us with this kind of stuff: session starts; equipment setup, the weapons and things you choose when you start a match; the match summary data, which is a critical piece, the actual metrics for an individual; clickstream stuff; marketplace bids. Contained within that real-time stream is the source data for everything else that we're trying to do.
In addition to that, there are things like: hey, somebody just got 100 kills in a match. That's awesome; we need to talk to them about that. That is a real-time trigger; we need to do something about it right then. But there's other stuff that does not happen in that same sort of time frame: churn propensity models, anything that's model-oriented, or trending analysis.
So those are the two bits. We take those two parts, and we have to smash them into a profile that we can access very quickly. As a database design principle, what informed how we set up our Cassandra instance is that we took the notion of decoupling the schema design from the business model, which is not an unusual approach when you're talking about NoSQL databases versus relational databases.
For some people it is a shift in how they do things. They hear requirements like, say, for a video game: you have things like maps, you have things like weapons, you have users. So you would have a maps table and a users table, then a user-maps table that would have your stats about that, and you have a lot of joins and stuff you'd have to do to assemble this master profile. Rather, what you can do is just call anything that's an attribute a thing, give it an ID, and say: this thing represents your kills in Domination on the map called Turbine, and so on. You can just stack all of those together, and what that does is create flexibility for adding new attributes. Things can happen in the game, we can track them, and we can put them into our infrastructure without the need to change anything in our data model.
It gives us consistency on the interfaces with the data access layer, so how we access the profiles doesn't have to change. There are no new SQL queries, none of that kind of work to do. We're basically just striping out a ton of attributes keyed on a user ID, and that optimizes the select operations.
This is essentially the case for NoSQL databases, more or less: the flexibility, the ability to just kind of dump things in and do what you want with them. So what you end up with is a user profile that might look something like this, where you have the user ID and then a bunch of rows.
This kind of defines the write profile. In our case it's probably no surprise that our writes outnumber our reads by about ten to one. Our writes stripe in a full line of data for an individual user. Our access to the profiles has one query type, and one query type only: give me all the attributes for a particular user.
Our writes are obviously a lot more complicated, and this is why we're here, because this is why, in our minds, Cassandra was the only thing that has really worked for us to handle the volume of this balance of reads and writes. Like I said earlier, I come from the perspective of architecture. There are certainly optimizations that make this stuff go faster, and there's always adding more nodes.
You know, that kind of stuff, but essentially you're still talking about a data schema that is going to support it in a way other setups wouldn't. So this is an example of what an actual schema looks like. The previous one was sort of an oversimplification; in the real business case we have our player fact table. We use the sort of legacy terms of fact and dimension relationships from warehousing.
This is a representation of the kind of legacy and the context that we came from, where warehousing was the thing, so we've still adopted the same naming convention. Although, if you think of fact and dimension tables in a traditional warehousing way, that is not at all the way we actually store data in Cassandra. But it is representative of the fact that, basically, all we're storing is integers of some type or another, because the applications don't really need to know specifically what each value is; they're all sort of the same.
We have a particular user ID and network ID, because this is the only way we ever access profiles; all the reads are very, very simple. Then we have the clustering key that represents what the actual attribute itself is. That gives us a level of uniqueness and distinction around each of those things, which lets us take shortcuts and just absorb whatever comes in. It gives our application developers a lot of leeway; it's sort of forgiving. And then all that they're really updating is the value itself. So that was the low-latency, high-volume datastore: it's Cassandra, and it's awesome.
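A rough sketch of that wide-row layout: the partition key is (user ID, network ID), the clustering key identifies the attribute, and the value is an integer. The table and column names below are assumptions based on the description, not the actual Activision schema, and the in-memory class only models the access pattern, not Cassandra itself.

```python
# Hypothetical model of the attribute-striped profile table described above.
# Partition key: (user_id, network_id); clustering key: attribute_id.

PLAYER_FACT_DDL = """
CREATE TABLE player_fact (
    user_id      bigint,
    network_id   int,
    attribute_id int,     -- e.g. "kills in Domination on Turbine"
    value        bigint,
    PRIMARY KEY ((user_id, network_id), attribute_id)
);
"""

class PlayerFact:
    """In-memory stand-in for the wide-row profile store."""
    def __init__(self):
        self.rows = {}  # (user_id, network_id) -> {attribute_id: value}

    def upsert(self, user_id, network_id, attribute_id, value):
        # Writes are blind upserts: a brand-new attribute_id needs
        # no schema change and no new query.
        self.rows.setdefault((user_id, network_id), {})[attribute_id] = value

    def read_profile(self, user_id, network_id):
        # The one and only query type: all attributes for one user.
        return dict(self.rows.get((user_id, network_id), {}))

store = PlayerFact()
store.upsert(42, 1, 1001, 37)   # 1001 might mean "kills on Turbine"
store.upsert(42, 1, 2002, 5)    # 2002: some newly added attribute
profile = store.read_profile(42, 1)
```

The point of the layout is that the single-partition read stays identical no matter how many attribute types the games invent later.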
Stateless, bespoke processing nodes: I'll try to talk about this stuff quickly, because I'm already running low on time. In a traditional architecture, or not traditional, but the way you see this a lot, the stand-alone application model: we are building a thing, the thing has an application code base, we have a lot of shared modules, and it all kind of runs together.
You can sequence all the things that are supposed to happen in those 29 seconds in one continuous, procedural set of code, but the problem there is that you don't have isolated scaling controls. When you have performance impacts that are, say, the result of too much data coming in, you can scale horizontally, yes, but the efficiency of your resource usage suffers.
You're now standing up your rules engine across all those additional nodes, and you're not really maximizing the opportunity to separate that stuff out. The bottom line is that the impact of load isn't equal in all parts of the system. The things that might make the rules engine run slower are stuff like seasonality, you know, Christmas, when we want to talk to people more, whereas an event reader is going to be impacted by the number of concurrent users, which goes up and down.
So these are the three basic parts of the application, the real-time processing bits. I think this would be a pretty standard delineation of how you would set this type of thing up: inbound event streaming and the parsing of that, which is to some extent self-contained; real-time rules execution; and then the offline data and messaging processing. So the architecture gets updated and it looks kind of like this: inbound event streaming reads off of the gateway and updates Cassandra and its profiles.
There's the rules engine that now knows: hey, I've got a user and I've got to process this user, so I will read from the user profiles and update back into the gateway to get back out to the client. And then there's an offline data processing bit that is completely decoupled from the rest of the application, still doing core and important stuff like updating the profiles with all the offline data.
Then, just quickly on the last point here: queues. This may go without saying, but we would have a very hard time scaling our infrastructure to support the maximum load, because the peaks and valleys in the load that we see are extreme. What we see on November 7th, the day after a Call of Duty title gets launched, versus June 7th, after people have been playing it for six months, is on the order of ten to one.
We can't support an infrastructure that is sized for the spikes one hundred percent of the time; it's just not cost-effective. And this is not even accounting for time of day, which has a huge impact on how data volumes come in. So we use queues. Queues let us set up the infrastructure and all these individual components in a way that gives them a little bit of wiggle room.
We can plan capacity around what we think is ninety-five percent of the spike and let the queues kind of squish the rest, so that we don't lose any data, nothing gets overheated, and we don't end up spending a lot of money on infrastructure that we don't really need. So when we add queues, the updated picture looks a little bit like this.
We
can
use
canisius
to
start
breaking
us
stuff
up
into
categories
and
to
give
it
the
kind
of
springiness
that,
let's
just
kind
of
cue
through
it,
well
we're
going
to
eat
up
some
of
our
29
seconds,
but
at
least
we're
still
going
to
be
we're
not
to
be
losing
any
data.
Nothing's
gonna
get
overheated
and
we're
not
going
to
pay
for
a
massive
infrastructure
unnecessarily.
RabbitMQ is good for that in part because, while the data volume that Kinesis needs to support is massive, the data profile in Rabbit is very small: we're just sending in batches of user IDs without any other data, and we're letting the rules engine query Cassandra itself to figure out what needs to go where.
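Keeping the queue messages tiny, as described above, might look something like this: only small batches of user IDs travel through the queue, serialized as compact payloads, and the consumer re-reads full profiles from the store. The batch size and payload shape are assumptions for illustration.

```python
import json

# Sketch of keeping queue messages small: only batches of user IDs go
# through the queue; the consumer re-reads full profiles from Cassandra.
# Batch size and payload shape here are illustrative assumptions.
def make_batches(user_ids, batch_size=100):
    """Split user IDs into small JSON payloads suitable for a queue."""
    return [
        json.dumps({"user_ids": user_ids[i:i + batch_size]})
        for i in range(0, len(user_ids), batch_size)
    ]

payloads = make_batches(["user-%d" % n for n in range(250)], batch_size=100)
print(len(payloads))                              # 3 messages
print(len(json.loads(payloads[-1])["user_ids"]))  # 50 IDs in the last batch
```

Because the payload carries no profile data, a message stays small no matter how rich the profiles get, which is what keeps the queue cheap at this volume.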
And this is what our real-time streaming data-processing architecture for personalized communication looks like. We have game clients that talk to a service gateway. The data that comes in off the game goes into Kinesis.
That data goes into the warehouse, where it gets archived. We update our user profiles, and then the event reader hands off the set of users that we need to process into a Rabbit queue. The rules engine, reading the queue, says: I've got a new set of users I need to process. It queries Cassandra and gets new sets of profiles, all striped-out attributes, ready to go.
It processes some rules, marries them with some content, and sends it back out to the service gateway to go back into the game. And this whole loop can happen because of the springiness, because we split the stuff out, and, in no small part, because we have Cassandra as a back-end. We can do all this stuff really fast at really high volumes, and so, yay for us.
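The whole loop just described can be compressed into a toy end-to-end version: event stream in, profile update, user-ID queue, rules engine, outbound gateway. Every component here is an in-memory stand-in, and the rule and message names are invented for illustration.

```python
from collections import deque

# Toy end-to-end version of the loop: ingest -> profile update -> user-ID
# queue -> rules engine -> outbound gateway. All components are stand-ins.
profiles = {}
user_queue = deque()
outbound = []  # stands in for messages sent back through the service gateway

def ingest(event):
    """Inbound path: update the profile, enqueue just the user ID."""
    profile = profiles.setdefault(event["user_id"], {"wins": 0})
    profile["wins"] += event.get("wins", 0)
    user_queue.append(event["user_id"])

def run_rules():
    """Rules engine: drain the queue, re-read profiles, emit messages."""
    seen = set()
    while user_queue:
        uid = user_queue.popleft()
        if uid in seen:                    # dedupe IDs repeated within a batch
            continue
        seen.add(uid)
        if profiles[uid]["wins"] >= 2:     # example rule, purely illustrative
            outbound.append({"user_id": uid, "message": "win_streak"})

ingest({"user_id": "u1", "wins": 1})
ingest({"user_id": "u1", "wins": 1})
run_rules()
print(outbound)  # [{'user_id': 'u1', 'message': 'win_streak'}]
```

Note that the rules engine never receives event data directly; it re-reads the profile, so it always acts on the latest state even when a user appears in the queue more than once.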
So again, just sort of a review. The low-latency, high-volume data store: I could just replace that phrase with "Cassandra." I was talking to somebody earlier today, and they were asking me, so, what are the benefits of Cassandra versus some of the other stuff that you had used to try to do some of these bits? I was kind of giving her some of the history, and I couldn't really think of a good answer to the question of what the benefits are.
What does Cassandra bring versus something else? Mostly I couldn't think of anything else that we could have fit into this particular use case. So I don't know what the benefits are; I don't know that there even is an alternative. I ended up telling her that the benefit is we could do it, and with the alternatives, we couldn't. Scalable, fault-tolerant processing nodes and queues: we just talked about that. And then also, perhaps if there's some interest, some of the other technologies we use, just to run down the list: Cassandra, obviously, and Kinesis.
B
We
talked
about
RabbitMQ
all
the
court
application
development,
the
rules
engine
the
event
parser
this,
how
a
Python
37
we
may
move
to
three
eventually,
when
everybody
else
catches
up.
I
think
it's
only
been
out
for
like
five
years
or
something
so
we're
sticking
with
27
for
now
s3
red
chip
EMR.
This
is
our
Amazon
replacement
for
Hadoop.
It could be anything else, but we have a Django interface that connects to MySQL. The knobs that I talked about earlier, in terms of implementing strategy, setting up content, who to target, all of that happens in that interface. There's tooling for server monitoring, and Ansible for infrastructure deployment. So I guess the last point to make here is that we're in a good place. 2016 is coming up fast, and with every new year you can be sure there will be another Call of Duty title.
There will be more Skylanders; there will be more of everything. And so, what are we going to do? I don't know that we can continue this pace of innovation. Maybe we can, you know, predictively guess what your stats are going to be in the next game. I'm not really sure. But one thing I can say for sure: Cassandra is going to play a role in it. We're pretty well embedded; our use case is so well coupled to what Cassandra does for us, it's hard to really imagine getting outside of that.
(Audience question: can you say more about the personalization in the rules engine?) Yes, sure. Just at a high level, I think one of the things that differentiates us from some of the other approaches to personalization is a notion of segmentation and event-driven mixing and matching. So there are lots of vendors that offer personalization solutions and stuff, but we have kind of a brain trust from marketing agencies, who've worked in this space for a long time, that have put this stuff together. It's a bit of secret sauce.
But I think, if you consider those emails that I was showing, it's really important to us that we add the extra dimension of not just the customization of a stat, or the choice of a color, or a user-defined type of personalization, or even the right recommendation in the right spot, but the question of: do we even show a recommendation at all?
Do we even show stats at all, and what's the order in which we place those? And to me, you know, in all of my experience in general communications, in CRM and marketing, all of that, that feels like a big differentiator. It's also resource-intensive on the back end to process rules like that, so a large part of that 29 seconds is evaluating users against content and working through the matrix to make sure that you end up with an experience that works.
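That "do we show it at all, and in what order" idea can be sketched as an eligibility-plus-scoring pass: each candidate module is either dropped entirely or ranked, rather than always shown with default content. The modules, rules, and scores below are invented for illustration, not the actual rules-engine logic.

```python
# Sketch of "do we show this at all, and in what order": each candidate
# content module gets an eligibility check plus a score, and ineligible
# modules are dropped entirely rather than filled with default content.
# Rules and scores here are illustrative assumptions.
def assemble(profile, modules):
    eligible = [m for m in modules if m["eligible"](profile)]
    ranked = sorted(eligible, key=lambda m: -m["score"](profile))
    return [m["name"] for m in ranked]

modules = [
    {"name": "stats",
     "eligible": lambda p: p["matches"] > 0,       # only show stats if any exist
     "score": lambda p: p["matches"]},
    {"name": "recommendation",
     "eligible": lambda p: p["matches"] >= 10,     # no recs for brand-new players
     "score": lambda p: 50},
]

print(assemble({"matches": 3}, modules))    # ['stats'] -- no recommendation shown
print(assemble({"matches": 100}, modules))  # ['stats', 'recommendation']
```

Evaluating every user against every module like this is what makes the step resource-intensive at scale, which is why it dominates that 29-second budget.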