Description
Patrick McFadin discusses what's new in C* 1.2 -- vnodes, collections, and CQL3.
C*ollege Credit is the bi-weekly educational webinar series for all things Apache Cassandra™ (C*) related. Each webinar features an MVP or member from the community that has hands-on experience building the next generation of applications on Apache Cassandra. The series is designed to provide a baseline of education across an array of topics, from "what is NoSQL" to "how to scale out Apache Cassandra across multiple datacenters". http://www.datastax.com/resources/webinars/collegecredit
Hello, everyone, and welcome to this edition of our DataStax Cassandra community webinar series. I'm very excited today; this one has been on my calendar for a long time. It's Valentine's Day, and I'm delighted to have with me today Patrick McFadin, the Barry White of Cassandra, who is going to entertain and educate us today. So hopefully you've got the lights dimmed. A couple of housekeeping items first.
All right, let me flip this over to Keynote here and get going. So, how's everyone doing today? It's Valentine's Day, and here we are talking about vnodes. I couldn't think of a better thing, other than maybe doing something that has something to do with people. But it's only going to take a little while, so you can go out and do something after this. So this is all about virtual nodes, and I'm going to start things out here with just a quick introduction.
So how do I go from regular nodes to virtual nodes? That's actually a pretty easy process, so we'll go through that, and then I'm going to talk about some of the benefits of switching to virtual nodes and why it was done. The Cassandra codebase is pretty much all vnodes as of 1.2, so let's talk about why. Since the beginning, Cassandra has always had clusters, and a cluster, of course, is what you get when you set everything up; let's say you have a cluster.
It's full of nodes, and those nodes all contain keyspaces, which all have column families, which hold your data. And every single column family has a row key, and this is where things really start. This is a fundamental piece that you need to know about how Cassandra works and why vnodes are important. So row keys are all about identifying information inside a column family.
Those are the unique part of a column family: you have one row key and a really wide row with lots of columns, and the row key is that fundamental unique part. It can be up to 64K in size in Cassandra, which is probably big enough for anything you need. And there are two ways to sort those. The first one is to sort them using the ByteOrderedPartitioner, or BOP as the cool kids call it, and if you notice, there's a little exclamation point there.
That's because it comes with a warning: don't do this unless you really know why you want to do it. You'll hear plenty of people, when they're talking about Cassandra and the ByteOrderedPartitioner, freak out about it. There are good times to use it, but what you'll probably see far more often is the RandomPartitioner, and that means that when you place a row key into your cluster, it's randomly put onto one node.
So it tries to distribute the data. What about the RandomPartitioner, how does that work? As you set row keys into your database, how do you create some random numbers? That's an important piece: we want to randomize it so that it distributes your data around the cluster. So how do you make that number big enough that we can use it, and also make it reproducible? That's where MD5 comes in. MD5 is a cryptographic hash, and basically what you do is take a string and put it through MD5; you run it through this algorithm and you get out a 128-bit number. And what's really magic about it is that it always comes out as a 128-bit number. So if I put in my Twitter handle, boom.
I get a 128-bit number. Now, what's interesting is that every time I put in that string, @PatrickMcFadin, I'm always going to get that particular number, and that's what's called consistent hashing. It's a very commonly used algorithm: memcached uses it, other NoSQL databases use it. It's a very handy way to get a consistent hash. And MD5, even though it's used for doing some crypto, like encrypting passwords and things like that, is really better suited for this.
It's not very strong encryption; it's really better for the rest of us, just creating these big numbers, and Cassandra has always relied on that. So whenever you create a row key, it hashes it into this 128-bit number, and that's really important, because we need a really big number, because you can have a lot of data, right? It's big data. So how big is this number? Well, the hash range is zero through 2^128 minus one.
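As a concrete sketch of what he's describing (my own illustration using Python's standard hashlib, not Cassandra's code): the RandomPartitioner's trick is just that MD5 always turns a row key into a 128-bit number, and the same key always produces the same number.

```python
import hashlib

def md5_token(row_key: str) -> int:
    """Hash a row key into a 128-bit integer, roughly what the
    RandomPartitioner does to place a key on the ring."""
    digest = hashlib.md5(row_key.encode("utf-8")).digest()  # 16 bytes = 128 bits
    return int.from_bytes(digest, byteorder="big")

token = md5_token("@PatrickMcFadin")
print(token == md5_token("@PatrickMcFadin"))  # True: same key, same number, every time
print(0 <= token < 2**128)                    # True: always fits in 128 bits
```

The determinism is the whole point: any node in the cluster can hash the same key and agree on where it lives.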
For those of you not so mathy: I put out that number, and there are a lot of commas in there and a lot of digits. I don't even know, it's like a gazillion or something, but it's big. So Cassandra actually uses 2^127, and the bonus reason, so you can tell people about this, is that we take out one bit for fun. But still, I mean, tell me.
If you can tell the difference between a 2^127 number and a 2^128 number, let me know; that would be really cool. So that's a huge number, and that gives us a lot of keys. Incidentally, a little trivia: 2^128 is actually the size of the IPv6 address range as well. Coincidence? I don't know. So this is a really, really big number, perfect for whatever we need, to fit this massive amount of keys into our cluster.
So what does that have to do with nodes? We were talking about row keys; what about nodes? Let's talk about why this makes sense, and this is going to make a lot of sense in a minute. Each Cassandra node is assigned a token, and that token is just a number inside of that range from zero to 2^127. It's a lot of numbers, but what are we trying to do?
What we're trying to do is break up the ranges across all these different nodes so that when you set data in there, it's being spread out neatly. Now, those of you who have set up a Cassandra cluster know that you create tokens; that's part of the setup. And that token is a really, really big number. It looks like a string of digits, but it's actually a number: it's a divisor of that big range.
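The manual version of that calculation, pre-vnodes, looked roughly like this (a sketch of the usual token-generator arithmetic, not an official tool): divide the 2^127 range evenly by the number of nodes.

```python
def initial_tokens(node_count: int) -> list:
    """Evenly spaced initial tokens across the RandomPartitioner's
    0 .. 2**127 range: node i gets i * (range_size / node_count)."""
    step = 2**127 // node_count
    return [i * step for i in range(node_count)]

# The by-hand homework every operator used to do for a new cluster:
for i, token in enumerate(initial_tokens(4)):
    print("node %d: initial_token = %d" % (i, token))
```

This is exactly the "divisor of that big number" he mentions: each node's token is a fraction of the full range.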
So, whenever you put a token in there, it marks the ownership: you'll see it goes from token zero all the way up to, well, I won't even say that number. But that means that everything on that node is going to be inside that range. So what happens whenever I take a row key and say, I'd like to put this into my database? You run it through MD5, so you can get that 128-bit number consistently, right?
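Putting those two pieces together, the placement he's describing can be sketched like this (my own toy model, not Cassandra internals): sort the node tokens, hash the key, and the owner is the node with the first token at or above the hash, wrapping around the ring.

```python
import bisect
import hashlib

def key_hash(row_key: str) -> int:
    """MD5 the row key and fold it into the 0 .. 2**127 token range."""
    digest = hashlib.md5(row_key.encode("utf-8")).digest()
    return int.from_bytes(digest, "big") % 2**127

def owner_token(tokens: list, row_key: str) -> int:
    """Return the token of the node that owns row_key: the first node
    token at or above the key's hash, wrapping back to the lowest token."""
    ring = sorted(tokens)
    i = bisect.bisect_left(ring, key_hash(row_key))
    return ring[i % len(ring)]  # i == len(ring) means wrap around to ring[0]

tokens = [i * (2**127 // 4) for i in range(4)]  # four evenly spaced nodes
print(owner_token(tokens, "@PatrickMcFadin") in tokens)  # True
```

Every key hashes to exactly one owning range, which is why the size and spacing of those ranges matters so much.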
This is what everyone's been using up to this point, and that means a node is responsible for all of those keys, and it could be a ton of keys. If you have a three-server cluster, you're going to have a lot of keys on one server, and that's how the single-token setup works: you have one server, one token, one node, and that can get kind of out of hand. So what the Cassandra community has always talked about is these commodity nodes. You'll also hear people talk about it this way.
We want more of less: really small, tiny servers, cheap 1U boxes, not a lot of disk. That has really been kind of the way we want Cassandra to operate; it works better that way, right? Well, that's not the way people want to do it anymore, because, let's face it, you're racking up a few hundred nodes, and I know companies that are doing this. You don't want to put in 150 servers to do this. You want to try to pack it down and get a little more density.
So what you really want is density. And how do you go from a commodity node, this tiny little Velociraptor, to these big Tyrannosaurus rex server racks? That's a big problem, so we need a plan, and that plan is packing more data into a single server. Fortunately, we have a solution; that's why we're here.
So we want to make it so that one node is responsible for more data. Well, how do we do that? This is where virtual nodes are going to come in. The other thing, and this is really important, is probably one of the most challenging aspects of starting with Cassandra, and it's funny whenever I get into this conversation with people, because they just look at me like I'm crazy: that's assigning tokens. You know, you have to actually plan this out.
So let's be serious: token assignment sucks. I've been a long-time Cassandra user, and this is one thing I've never really liked, so it's hopefully something you'll never have to do again. The reason it's so bad is because you have to do this business of: all right, we need to evenly distribute all the tokens across however many servers you have. And then, when you get it evenly distributed and you go to grow it, the guidance has always been: well, double your ring. Well, doubling your ring isn't very practical.
Sometimes you only need 10% more capacity. You know, Cassandra is a linear scaler, but if you're doubling every time, that's not very linear. And if you want to put in just, say, one server or two servers, then you have to do a rebalance operation, and that sucks. And then, if you shrink a ring and take out one server, you have to rebalance again, and that's always kind of sucked. So this is not good. We don't want that anymore.
The other thing that's always been kind of a pain is that whole business of having to add the token into the Cassandra config file, and I've seen some pretty crazy Chef scripts out there that do this. It's really because that has to be done as the server is coming online, so it can insert itself properly.
And if it isn't done, if the token is randomly assigned, you're going to have to do a rebalance anyway, so all of this has not been fun for anyone in operations. So let's kick that out; we have relief: virtual nodes. The whole premise of virtual nodes is that these big servers, these Tyrannosaurus rex servers, should have many nodes, and these servers can take it. But what we should do is make each one of those nodes very small, and make them not as expansive in their range of keys and tokens.
And why are we assigning tokens at all? I mean, this is kind of silly. It's like assigning a MAC address to your network card by hand: why are we doing that? This is the 21st century. So we're going to have a whole new plan here with version 1.2. Let's see how this works. Virtual node features, here we go. This is probably why you're here: what is this all about?
Each node, or rather each server... see, we're going to have to get away from this; I'm saying this a lot. We usually say a node is like a server, but now the server and the node are different things. So you're going to have 256 nodes per server by default. That's kind of interesting to watch; I mean, you can see this.
You can see your key range get diced up into 256 different vnodes on one piece of hardware. That's a lot, but if you think about it, that's a really good idea, because each one is a very small range, and a lot of algorithms that work on key ranges, such as doing repairs and doing cleanups, are now operating over a smaller key range. That's good.
You start up a server and you bring it into the ring, and you don't have to assign a token to it. It does it itself: it figures out where you're at, creates all the tokens, and then evenly distributes them. It sounds magical and kind of unicorny, but hey, it's Valentine's: feel the love. I mean, this is good. That's what we wanted all along anyway, right? It's going to make operations' life so much easier.
I'm so glad we have this now, because I don't want to explain to people that they have to go figure out divisors of a 2^127 number anymore. The other thing that we're going to look at improving, now that we are creating these smaller key ranges, is rebuild. Look at the way rebuilds are done now, like when you lose a whole server.
Yes, you're losing a lot of key ranges, but when you bring a new server back online and it has a lot of smaller key ranges, we can do things like take smaller chunks and parallel-stream them in from all the other smaller chunks out there, and that makes it a lot faster to bring a node back online.
And when you're bringing a new server online, let's say you do need 10 percent more capacity, or 20 percent more capacity: you just add 10 or 20 percent more servers, and whenever they come online, they're auto-assigning their tokens and they're evenly distributing themselves. You don't have to do a rebalance operation. And yes, there's a new partitioner beyond MD5 and the RandomPartitioner; I'll cover that in a minute. But these bullet points are really where vnodes are at.
We had to assign our tokens back in the day, so you know, you can brag it up; you've got that going for you. So how do you transition? I'm happy to say this is really easy. You don't have to bring down your cluster: you have a running 1.1 cluster, and you can transition to 1.2. So here's how you do it. When you upgrade to 1.2, you can leave it using the RandomPartitioner and having a token assigned to each box; that, guys, is totally cool.
That's probably how you're going to start: you do the in-place upgrade, and everything's still sitting there as if it were a 1.1, non-virtual-node cluster. So, to change it to a virtual node cluster, you go into the yaml file, and you'll see there are two lines: num_tokens and initial_token. The initial_token line is probably going to have your token, that big old fat huge number that you had to assign to it when you initially created the cluster.
Don't worry about that right now; there will probably be some blog posts on that number and how it relates to your hardware, but for now just go with 256. Then restart the node, you know, a Cassandra stop and restart on the command line, and when it comes back online, if you look in the system log, you'll see it stop for a second, and also you'll see it create a bunch of contiguous vnodes in that same key range.
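In cassandra.yaml, the change he's describing looks roughly like this (the exact surrounding comments vary by version; 256 is the default he mentions):

```yaml
# cassandra.yaml on an upgraded 1.1 node, switching it to virtual nodes
num_tokens: 256     # this server will now own 256 small token ranges
# initial_token:    # the old single-token assignment is no longer needed
```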
That is the key range the older node was responsible for, and then it'll just take it up and keep going. All of a sudden that node is back online and it has 256 vnodes, and it'll operate with all the other non-vnode servers just fine. So you go ahead and just do that on each server inside your cluster, and when you're done, you'll have a nice setup. So now, once that's done and you have all these vnodes out there, there's one more operation that needs to be done, and this is pretty critical.
When transitioning from non-vnodes to vnodes, what's going on, like I said, is that it's creating a contiguous range of tokens: the range that's inside that server is going to get busted up into 256 chunks, but they're all going to be next to each other. That's not good! We want them spread out. So what we're going to do is run this shuffle operation, and this is a one-time deal. You only have to do this once, when you're upgrading.
No, seriously, the first time I ran it, I wondered whether it was doing anything. But it is, and it's meant to be as close to zero impact as it can be, and it's going to take a while. I mean, it could take a few days if you have a very, very large cluster, but that's okay. It's just moving one little piece at a time, here and there, and it'll take care of it. If you want to keep track of what it's doing, you can use the cassandra-shuffle ls command. So that's it.
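A toy model of why the shuffle matters (purely illustrative, nothing like the real implementation): right after the upgrade, each server's vnodes are contiguous slices of its old range; the shuffle trades ranges around so each server keeps the same number of vnodes, but they end up scattered across the ring.

```python
import random

def upgrade_in_place(servers: int, vnodes_per_server: int) -> dict:
    """After the 1.1 -> 1.2 upgrade, each server's old range is split into
    vnodes that are still contiguous: all next to each other on the ring."""
    return {s: list(range(s * vnodes_per_server, (s + 1) * vnodes_per_server))
            for s in range(servers)}

def shuffle(ownership: dict, seed: int = 0) -> dict:
    """One-time shuffle: redistribute ranges randomly around the ring,
    keeping the per-server count the same."""
    rng = random.Random(seed)
    all_ranges = [r for ranges in ownership.values() for r in ranges]
    rng.shuffle(all_ranges)
    per = len(all_ranges) // len(ownership)
    return {s: sorted(all_ranges[s * per:(s + 1) * per]) for s in ownership}

before = upgrade_in_place(servers=4, vnodes_per_server=4)
after = shuffle(before)
print("before:", before[0])  # contiguous: [0, 1, 2, 3]
print("after: ", after[0])   # the same number of ranges, drawn from all over the ring
```

In the toy model, every range still has exactly one owner afterwards; only the adjacency changes, which is the whole point.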
So, let's walk through it. I've got really cool graphics here; I've learned how to do graphics with Keynote, so I'm feeling pretty cool. Let's do this one step at a time. So here's our existing 1.1 cluster, and every server has a range of keys. I kept my key space really, really small, because it's got to look good.
So I have four keys on each server; this is my existing 1.1 cluster. Now I'm going to upgrade, so I'm going to change num_tokens to four, and it's going to split these up once I restart. Now, this is what I was talking about: I have the contiguous range of keys still sitting on each single server, as it was before, just broken up into four different chunks.
And after the shuffle, my data is now spread very evenly around every single server, so that is pretty much how it works. So what does this mean for operations? From what I can tell, the group that's going to benefit the most from virtual nodes is operations. I was just recently talking to Jason Brown at Netflix, who's one of the guys behind Priam, and Priam was built around the idea of assigning tokens.
But now, like half of that code is just going to have to get ripped out, because none of it is going to be needed anymore, and that's pretty amazing, because that's always been the problem. But now? Not a problem. So when it comes to your ops life, you can just add one node, two nodes, three nodes: no problem, no rebalance, no shuffling or anything going on, no token assignments. I think tokens are going to become one of those academic things about Cassandra.
It's the kind of thing people will know about when they really want to dig into the engine and how it works, but most people really won't know or care, and that's okay; it should be in the background. And now we're looking at building bigger servers, and this is another operations topic, but it's also good for everyone else.
Anyone who's got to spend money on servers benefits: if we can start building bigger servers, that's going to be good. So we're looking at how many tokens should be defined on these bigger servers. Again, you know, look for a blog post in the future; I'm actually working on this topic right now, and I want to know more about it too.
I mean, what is a good number for however big a server you have? But it makes sense that you can start adding more and more tokens to a bigger and bigger box, and you can have dissimilar sizes too: you can have fewer token ranges on one and more on another if you have dissimilar hardware. Also, the decommissioning of nodes is just like the adding of nodes.
Another thing to mention for the ops folks is that there is a new nodetool command. Now, if you have 256 vnodes sitting on a single server, the potential is that you're going to have a lot of nodes. If you do nodetool ring, and those of you who know that command know it just shows all of the different token assignments across your entire cluster, think about a 4-node cluster, or 4, I keep saying that, a 4-server cluster with 256 vnodes per server.
You're looking at 1,024 tokens that have to get displayed. So now we have a new command called status, and it just shows you each server, each JVM that's running Cassandra, and then, of course, how many tokens are on it. So it's a much, much nicer way of looking at your ring without having to, you know, pipe it to more, or worse, just watch it stream past your screen. So, one more for the baby dinos: vnodes for the win.
So I mentioned that we do have a new partitioner. Along the same lines of "hey, how do I pick a really big number, make it consistent, and hopefully fast," we have a new partitioner called the Murmur3Partitioner. Murmur3 is an algorithm much like MD5, but it's not a cryptographic one. There's a little bit to that: cryptographic means it's secret-spy stuff, you know; there are a lot of password systems that use MD5, unfortunately, and Murmur3 is not for that. It's not cryptographic.
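For flavor, here's the small 32-bit variant of MurmurHash3 in plain Python (a sketch for illustration only; Cassandra's Murmur3Partitioner actually uses the 128-bit x64 variant of the algorithm). Like MD5 it's deterministic, but it's built purely for speed and distribution, not secrecy.

```python
def murmur3_32(data: bytes, seed: int = 0) -> int:
    """MurmurHash3, x86 32-bit variant: fast, non-cryptographic hashing."""
    c1, c2 = 0xcc9e2d51, 0x1b873593
    h = seed
    # Mix each full 4-byte block into the running hash.
    rounded = len(data) & ~3
    for i in range(0, rounded, 4):
        k = int.from_bytes(data[i:i + 4], "little")
        k = (k * c1) & 0xffffffff
        k = ((k << 15) | (k >> 17)) & 0xffffffff
        k = (k * c2) & 0xffffffff
        h ^= k
        h = ((h << 13) | (h >> 19)) & 0xffffffff
        h = (h * 5 + 0xe6546b64) & 0xffffffff
    # Mix in the 1-3 leftover tail bytes, if any.
    tail, k = data[rounded:], 0
    if len(tail) >= 3: k ^= tail[2] << 16
    if len(tail) >= 2: k ^= tail[1] << 8
    if len(tail) >= 1:
        k = ((k ^ tail[0]) * c1) & 0xffffffff
        k = ((k << 15) | (k >> 17)) & 0xffffffff
        h ^= (k * c2) & 0xffffffff
    # Finalization: avalanche the bits so similar inputs diverge.
    h ^= len(data)
    h ^= h >> 16
    h = (h * 0x85ebca6b) & 0xffffffff
    h ^= h >> 13
    h = (h * 0xc2b2ae35) & 0xffffffff
    return h ^ (h >> 16)

print(murmur3_32(b"@PatrickMcFadin") == murmur3_32(b"@PatrickMcFadin"))  # True: consistent, like MD5
```

Notice there's no cryptographic machinery at all, just multiplies, rotates, and XORs, which is where the speed comes from.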
It's just meant to be a way to create hashes, and in the testing that has been done, it's slightly faster than MD5. Not ridiculously faster, but as Cassandra moves forward, we're getting, like I said, bigger, bigger, bigger, more, more, MORE, and it's good to get those incremental changes while we can. So starting with version 1.2, that will be the default partitioner. Now, I've already gotten this question, and I'm going to answer it right now.
No, you do not need to convert your 1.2 cluster from the RandomPartitioner to the Murmur3Partitioner. First of all, it's almost impossible to do that with any kind of ease, and the second thing is, it doesn't give you that much of a performance increase. It's really not going to be that big of a difference. It's really for the future: as we create new clusters and new setups going forward, those will use Murmur3. But don't feel like you're now in the lower performance tier.
It's not that big of a difference yet. And if you don't believe me, well, this is open source, right? We can go look at exactly what happened. I have the JIRA right here, CASSANDRA-3772; that has the entire details, from the proposal all the way down to when it was committed. And, you know, if you've never looked at a Cassandra ticket, or any open source ticket in the Apache JIRA, I suggest you go do it. It's really interesting.
I think I've even had some non-technical users look at one, just to see and get some insight into the process, and it's very open. As it's proposed, different people talk about it; the debate goes back and forth. I mean, it really is all out there. There are no backroom conversations where someone then comes out and says, "Okay! We decided something, and here's what we're going to do." No, it's all done right there.
So it's pretty interesting, and you know, you find out what the motivations were, and you'll see, like in this particular ticket, how at first it wasn't so performant; then there were some people trying to figure out why, and then they figured it out. So I would just suggest going and checking it out, just so you can understand how the process works. So, the conclusion: you can go do this today. This is in Cassandra 1.2. And here's my one gratuitous Valentine's thingy with a cupid, or, as my four-year-old says, the baby that shoots people.
So you can go get it. You can download it: go to datastax.com for the community version, which is going to be your RPM and deb files, if you want to use those, or you can just go directly to cassandra.apache.org and download the tarball. We're currently on version 1.2.1. If you want to try a test upgrade of one of your 1.1 clusters, that's awesome; go out and give it a shot.
It's really cool. I put these references at the end so you can read some of them; there are these two blog posts that we had. I'm going to put my slides up on SlideShare, and that's more of why I put this slide up: if you come back and look at it later, you can go look at these blog posts, and there's a lot more detail about some of the motivations.
I am going to read you some questions. Just a reminder to those of you watching: please ask your questions in the WebEx Q&A tab; I will go through those and pose them to Patrick. And just as a follow-up to what he said about posting to SlideShare, we will be emailing out the video archive of today's presentation. If you had such an amazing Valentine's Day that you want to relive it, you will be able to do that tomorrow. Okay, so this one is from Steve Brawner; let's get straight into it, Patrick. It's about the Murmur3 partitioner.
Interesting. Well, that's where the optimization was found: within the micro-benchmarks, and so we're talking about nanoseconds here. I know that Murmur3 has a lot more optimizations, especially for multiple processors. I couldn't say exactly, "yes, it's going to be better than that," but I know that, from what I've read about both MD5 and Murmur3, Murmur3 in the long run will win.
Okay, thank you very much. Jack Schmitt and Mike both have the same question, which I think I can answer. The questions are along the lines of: any idea when we'll see 1.2 in DataStax Enterprise? So, it will not be included in DataStax Enterprise 3.0, which is slated to drop on February 25th.
The big theme around that release is all around security, but we are scheduling a drop of DataStax Enterprise later in the year which will include all the greatness of 1.2. There's a little lag, as we do a lot of testing on it, making sure it's ready for production. Patrick, anything to add on that?
That is definitely the DataStax Enterprise team making sure it's baked in. So, I mean, you can go out and get the community version if you really want to play around; you know it's going to take a while for it to get fully baked for the enterprise, and for DataStax Enterprise especially, there's a lot more than Cassandra going on there.
It's interesting, for those digging through this: initially, a lot of the existing rules still apply, so the adjacency rules are still there. So, for instance, one token, then the next adjacent one, and the next adjacent one after that are where the replica pairs live. The only difference is in how, and where, those replicas live, and of course you don't want them together, and this is why the shuffle is such an important operation.
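What he's describing can be sketched as a simple clockwise walk (my own toy model of SimpleStrategy-style placement, not Cassandra's code): replicas go on the next tokens around the ring, but skipping vnodes that belong to a server already holding a copy, which is exactly why a contiguous block of one server's vnodes needs shuffling.

```python
import bisect

def replicas(ring: list, key_token: int, rf: int) -> list:
    """Walk the ring clockwise from key_token and pick the next rf tokens
    that belong to distinct servers, skipping adjacent vnodes that live
    on a server we've already chosen."""
    ring = sorted(ring)  # list of (token, server) pairs
    start = bisect.bisect_left([t for t, _ in ring], key_token)
    chosen = []
    for step in range(len(ring)):
        _, server = ring[(start + step) % len(ring)]
        if server not in chosen:
            chosen.append(server)
        if len(chosen) == rf:
            break
    return chosen

# Contiguous vnodes (pre-shuffle): tokens 0-3 all on server A, 4-7 on B, ...
ring = [(t, "ABCD"[t // 4]) for t in range(16)]
print(replicas(ring, key_token=1, rf=3))  # ['A', 'B', 'C'] despite the adjacent A vnodes
```

The distinct-server rule is what keeps two replicas off the same physical box, matching what he says about the algorithms being built to avoid that.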
No, it's not even a best practice; it won't even be allowed to happen. These were some of the questions I had initially, but that wouldn't make any sense. So that's one of the reasons the shuffle is important to do: it calculates where all the tokens and the replicas are, and then makes sure that those are not sitting next to each other.
I love that question, and that's actually a feature that I've proposed. So right now, no: the JBOD configuration is balanced by just the keyspace, and not the actual virtual node itself. I think an interesting thing would be, now that we're talking about virtual nodes and having these smaller ranges, how it would play into larger hardware. Again, one of the points of virtual nodes is to use larger boxes.
So let's say you have 8 or 16 or 24 drives in a JBOD configuration. I think it does make sense in some of those configurations, especially with SSDs, to pin certain virtual nodes to a single SSD. Now, I might get some argument from some of the developers on the Cassandra side, but I could see that, from an operations side, making a lot of sense, because then you're, you know, really sticking to that one disk.
Another topic that's been floated around is pinning, on large-core boxes, a virtual node to a particular CPU or core, and that also has some benefit as boxes get bigger. I mean, let's face it, a 128-core box is not that far away, and do you want to go through the cost, on the CPU side, of having all this context switching going on as you take the same thread for the same virtual node and move it around to all these different cores?
Sounds like you definitely got a plus-one from Chris on that feature. This one is from Steve Brawner; I will take this one. So Steve asked: any early-release or dev versions of Cassandra 1.2 DataStax Enterprise we can sign up for or volunteer for? And then the hashtag: guinea pigs welcome.
Steve, the answer is yes. Email me, christian at datastax dot com, and I will put you in contact with the right person to enroll you in our early access program.
You only run the shuffle once. When changing the replication factor, it's going to change things: it's going to put replicas on different nodes, that is, on different physical servers. So yeah, this is another one of those questions I asked early on too, and I've been assured that that is not the case: the algorithms are built so that they will not put replicas on the same physical piece of hardware.
Well, that's where I think we're looking at creating these different counts of vnodes. On your older box, maybe you only have 256 virtual nodes on that one, but on the bigger box, creating maybe, say, 512 makes sense, because you're using more hardware: more memory, more disk, more CPU. So you want to try to keep the smaller vnode size if you can, and you're just going to put more data on there anyway, right?
And a couple of links here for everyone. You know, lots of training events, probably one in your area; check it out at datastax.com. Check out the new community resource, which is Planet Cassandra: lots of good information on different use cases out there, and also lots of blog posts and good info. And then, if you are on the East Coast, on March 20th we have our NYC* Big Data Tech Day. We have great speakers presenting, including eBay and Comcast and Instagram.