Description
This topic will introduce the Cassandra native protocol, native drivers and Cassandra Query Language (CQL). It is important for developers to be aware of this new way of integrating with and querying Cassandra, without using Thrift or RPC. There are various ways of tuning that integration and modeling your data, all intended to make it easier and more productive to build against Cassandra, with some additional performance benefits. This is a technical session with code excerpts using the Java driver.
Hello everybody, thanks for coming. Just in case you're thinking you're in the wrong room: this is what we're talking about today, going native with Apache Cassandra. About me: my name is Johnny, I'm a solutions architect at a company called DataStax. If you want to stalk me, there's all my contact information.
DataStax is a company that was set up by Jonathan Ellis and Matt Pfeil a few years ago, really to provide commercial support and enhanced products around Cassandra. And it's been a pretty interesting couple of years. We have quite a few customers now, we contribute roughly eighty percent of the code to the Apache project, and we're headquartered out of San Francisco. Our European office is in Stockley Park, which is definitely not as nice as San Francisco. I wish. And obviously, we are hiring.
So if anybody is very keen, please have a look at our site. So what do I mean by going native with Cassandra?
Traditionally, when Cassandra was first starting out, really the main way of integrating with it and using it was through Thrift, and there were a bunch of Thrift clients around: Hector, Astyanax, etc. What happened with Cassandra 1.2 is that CQL and the native protocol got to a point where I think they were pretty good, and they're becoming pretty much the de facto way of developing against Cassandra.
But why did we move from one to the other? Not that there's anything wrong with Thrift: if you're still using Thrift and you're comfortable with it, do use it. But what we found from the community and from developers was, first, the tooling around it, and second, from a product point of view, adding new stuff to Cassandra meant keeping the different Thrift clients compatible with what was happening in Cassandra and maintaining that going forward. It was getting hard. So we introduced CQL and the native protocol.
I should point out as well that Netflix are introducing the native protocol into their Astyanax driver too, because they want to make sure that people know about this.
So just to start, what I'll do is give you a quick run-through of CQL. How familiar is everybody here with Cassandra? Have you used it before? Great.
So hopefully you know what Cassandra is: it's a distributed database. CQL is a standard query language. It was really intended to make it much simpler for developers and other people to query and model, and you probably already know it if you've come from an SQL background.
There are simple statements, like SELECT * FROM users; this is the same whichever way you're doing it. It has all the usual statements in there: creating, dropping, altering, user permissions, all that kind of stuff. So it should give you a very comfortable way of getting working with Cassandra.
So, an example here: this is creating a keyspace in CQL. What you're doing there is saying: create my keyspace, 'johnny', give it the NetworkTopologyStrategy, and replicate my data between different data centers. Simple as that.
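As a minimal sketch of that statement, run here through the Java driver that this talk covers later; the keyspace name, data center names and replication factors are all illustrative:

    import com.datastax.driver.core.Session;

    public class KeyspaceExample {
        // Create a keyspace replicated across two data centers.
        // 'DC1' and 'DC2' must match the names your snitch reports.
        static void createKeyspace(Session session) {
            session.execute(
                "CREATE KEYSPACE IF NOT EXISTS johnny WITH replication = " +
                "{'class': 'NetworkTopologyStrategy', 'DC1': 3, 'DC2': 2}");
        }
    }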
I'm going to rush through this, because there is a lot to cover. From the basics point of view, what you have here is a statement to create a table, which should be quite easy to understand, a SELECT query, which is very similar, and an INSERT statement.
The only thing to really point out here is that it's worth understanding what that primary key definition means. There is the concept of the partition key and the clustering columns: the blue bit is the partition key, the orange bits are your clustering columns. Now the partition key, if you know Cassandra you will know this, really is the value that's used to give affinity to certain nodes and replicas in your cluster, and the clustering columns are really about ordering your data on that node. So it's very straightforward stuff. It's also worth pointing out that you can have more than one clustering column in there, and you can have composite partition keys as well, so you could have team name and player name together as your partition key.
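To make that concrete, here is a hedged sketch of a table with a composite partition key and two clustering columns; the schema is invented for illustration, not taken from the slide:

    import com.datastax.driver.core.Session;

    public class TableExample {
        static void createScoresTable(Session session) {
            // (team_name, player_name) is the composite partition key: it
            // decides which nodes own the row. year and month are clustering
            // columns: they order the data within that partition on the node.
            session.execute(
                "CREATE TABLE IF NOT EXISTS scores (" +
                "  team_name text, player_name text," +
                "  year int, month int, score int," +
                "  PRIMARY KEY ((team_name, player_name), year, month))");
        }
    }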
So you can get quite clever with it. It's also worth pointing out that there are some data types you might not be aware of. One of the new ones that came in recently was collections as a data type. What you get with collections is sets, lists and maps, and if you wanted to create a table using those columns, it's very simple to set up. You've got a set example, a list example and a map example in there.
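As a sketch, a table using all three collection types might look like this; the column names are illustrative:

    import com.datastax.driver.core.Session;

    public class CollectionsExample {
        static void createUsersTable(Session session) {
            session.execute(
                "CREATE TABLE IF NOT EXISTS users (" +
                "  username text PRIMARY KEY," +
                "  emails set<text>," +           // set: unique, unordered
                "  top_places list<text>," +      // list: ordered, allows duplicates
                "  todo map<timestamp, text>)");  // map: key/value pairs
        }
    }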
There are some performance considerations around using collections, so just be conscious of what you're doing: sometimes it is still more efficient to denormalize your data than to use a collection column. Also, do favor sets over lists; they are more performant. And I'd tell you to look out for Cassandra 2.1: it's still in beta, so not necessarily for production yet, but they are putting a lot of indexing in around collections as well. So this is quite a nice way of modeling stuff in Cassandra.
The other thing you can do is query tracing. You can turn this on and off if you're using the command line, cqlsh: you turn it on in your session, and when you execute a query you get back a bunch of diagnostic information on what it did, so which nodes it went to, the time it took, etc. This is your friend. Query tracing is going to help you a lot if a query is running slow; it will tell you why it's running slow. And I'd always advise using it, even when you're just wanting to learn; it's a very helpful tool. You can actually go through all the steps that query took, what nodes it went out to, how long it took to come back, etc. So when you're looking at tuning your queries, optimizing, debugging, this will help you a lot.
There's also a lot more that you can do with CQL. We have lightweight transactions. Lightweight transactions are, on a per operation basis, a way of essentially doing a compare-and-set on a statement: you're saying, insert this data if this condition matches. You've got counters; it's probably pretty obvious what counters do: they give you the ability to increment and decrement values in columns. We also have time-to-live: on a per-insert basis you can set a TTL for that data, and once that TTL has expired, that data will no longer be returned.
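Hedged sketches of all three features; the table and column names are illustrative:

    import com.datastax.driver.core.Session;

    public class CqlFeaturesExample {
        static void examples(Session session) {
            // Lightweight transaction: compare-and-set, only applied
            // if the row does not already exist.
            session.execute(
                "INSERT INTO users (username, email) " +
                "VALUES ('johnny', 'johnny@example.com') IF NOT EXISTS");

            // Counter column: can only be incremented or decremented.
            session.execute(
                "UPDATE page_views SET views = views + 1 WHERE page = '/home'");

            // Time-to-live: this row stops being returned after 24 hours.
            session.execute(
                "INSERT INTO login_tokens (username, token) " +
                "VALUES ('johnny', 'abc123') USING TTL 86400");
        }
    }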
So that's a very quick overview of CQL, but really the point to get across is that it gives you a very familiar, mature tool to connect up, query and model your data. There's a lot of information out there which I suggest you look at, and a bunch of talks out there that would help, but I could talk about this alone for an hour rather than everything else. So what's probably quite interesting is how we've built this.
You've got CQL, which gives you the query language, and then we've built a whole different way of connecting into Cassandra. So part of what you get when you're using CQL is that you can use the native protocol, as opposed to the traditional Thrift RPC protocol, for connecting to Cassandra. What this gives you, essentially, is that we support request pipelining. So your client connects up to your Cassandra cluster.
It opens up a certain number of persistent connections, and what we're doing there is sending a multitude of requests concurrently down the same connection and getting them back asynchronously on the same connection as well. And this is a two-way conversation: we're not just getting request-reply, we're also getting push events back from Cassandra.
Just to be quite clear on the type of notifications you get back from Cassandra: previously, if you wanted awareness of the topology, or of what's going on, your client would be polling your Cassandra cluster, going out with a request and getting a reply. What we do now, when you're using the native protocol, is that any changes, any kind of topology events that happen on your cluster, your clients are made aware of them, and this type of data informs the client driver as to what it needs to be doing around load balancing, etc. Now, the types of change events you get back are topology changes: a node has gone down.
A node has come up, changes to your schema: these are the types of things that get notified to the client driver, not data mutations. You don't get told when a row changes; it's just the technical stuff for the driver, but it's quite a handy thing to have.
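As a rough sketch of how a client can observe those pushed events with the Java driver (driver 2.0 era API; the exact listener methods vary slightly between driver versions, and the logging here is illustrative):

    import com.datastax.driver.core.Cluster;
    import com.datastax.driver.core.Host;

    public class EventsExample {
        public static void main(String[] args) {
            Cluster cluster = Cluster.builder().addContactPoint("127.0.0.1").build();
            // Topology and status changes are pushed to the driver;
            // this listener just prints them as they arrive.
            cluster.register(new Host.StateListener() {
                public void onAdd(Host host)    { System.out.println("added: " + host); }
                public void onUp(Host host)     { System.out.println("up: " + host); }
                public void onDown(Host host)   { System.out.println("down: " + host); }
                public void onRemove(Host host) { System.out.println("removed: " + host); }
            });
        }
    }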
And then the driver itself is built from the ground up to be a completely asynchronous architecture. In the case of the Java driver, we've used Netty extensively to give you non-blocking I/O, and it's really handy, because, as I'll get into when we start looking at some code examples, given that we now have this very asynchronous way of multiplexing requests over the sockets, we're not blocking any threads. It really does change your ability to scale, whether that's internally within your clients, and also the amount of resources it takes to work with Cassandra is quite different.
So that's it very much in a nutshell. There's a lot more to this, but what I'm going to do now is talk about it more from a development point of view: talk about the drivers and how they work.
So, the native drivers: at DataStax we've produced a bunch of them, the ones there in bold. We have DataStax native drivers written in Java, C# and Python; you've got C++ in beta, and I'm not quite sure when that's going to go final; and we've got ODBC as well. And there's also a bunch of community drivers built in addition to this: Clojure, Erlang, Node.js, Ruby, etc.
There are lots and lots of different ones out there; it depends on the language you're using, but the tools are there for you to connect up with. So what I'll do now is focus on the Java driver. I'll take you through some ways of using it and the cool things in there, and then at the end we can have some questions.
So, the first thing, from a client point of view, when you decide to connect up to Cassandra, is that you build a Cluster. What's quite important there, the first thing you'll observe, is that it's using the builder pattern; it's a very fluent interface for building this stuff up.
The first thing you have there is these contact points. The contact points are really telling the client how to discover the cluster: it says, connect up and tell me everything about this Cassandra cluster, I need to use it; very much in the same way that seed nodes work when you're configuring Cassandra itself. We're not saying connect to these nodes and query them; it's just: I need to go and discover my cluster, I need to know the topology of my cluster, the partitioning, everything you want to know about your cluster. And then you build that. You then create a Session off that Cluster, and you've optionally got the ability to create the session on a keyspace, or just against the cluster itself. Obviously, if you don't specify the keyspace, you can qualify it in your CQL statements, exactly as you would do in SQL. And then, once you've got this session back:
You can then execute queries against your data, and here's an example, just doing a fairly straightforward insert. And no, that's not my password for anything. It's also worth noting that your Cluster object and your Session object are long-lived objects in your application. These aren't things you would be instantiating frequently, so reuse them and keep them alive for a long time. In the same way, there are shutdown methods, like you would have on a connection pool, for when you are finished with them.
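Putting that together, a minimal sketch of the whole lifecycle; the contact points, keyspace, table and values are illustrative:

    import com.datastax.driver.core.Cluster;
    import com.datastax.driver.core.Session;

    public class ConnectExample {
        public static void main(String[] args) {
            // Long-lived objects: build once, reuse everywhere.
            Cluster cluster = Cluster.builder()
                    .addContactPoints("10.0.0.1", "10.0.0.2") // discovery seeds, not the only nodes used
                    .build();
            Session session = cluster.connect("johnny"); // optionally bind a keyspace

            session.execute(
                "INSERT INTO users (username, password) " +
                "VALUES ('johnny', 'notmyrealpassword')");

            cluster.shutdown(); // driver 1.x/2.0 name; newer versions use close()
        }
    }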
Okay, so the next example here is reading from a table. We call session.execute with a bit of CQL, SELECT * FROM users, and you get back a result set. I don't think there's anything we need to explain about this one: you get back some rows, you iterate through them, and you do stuff with them. Very simple, very straightforward.
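In code, that read path is roughly this; the column name is illustrative:

    import com.datastax.driver.core.ResultSet;
    import com.datastax.driver.core.Row;
    import com.datastax.driver.core.Session;

    public class ReadExample {
        static void readUsers(Session session) {
            ResultSet results = session.execute("SELECT * FROM users");
            // ResultSet is iterable over its rows.
            for (Row row : results) {
                System.out.println(row.getString("username"));
            }
        }
    }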
However, there are cooler things you can do with this, and this comes down to the way we've adopted this asynchronous architecture.
What you can do is use that executeAsync method up there. This gives you the ability, as the name implies, to asynchronously execute queries. It's really handy, and I'll show you some examples why: what this does is give you back futures, and these futures implement Guava's ListenableFuture interface, which, as a consequence, means all of the very cool things you can do with Guava's future methods:
You can now do with your queries. This is very nice stuff, and the reason it's quite nice is that it lends itself to working at large scale without blocking your client. You can essentially execute your method, and if it's a horribly big, long query, or a big bunch of inserts, you say: come back and tell me when you're done.
What you can do there is register callbacks. So in the example here, imagine that SELECT * FROM users is a very heavy query that gives you back a large amount of data. What I can do from a client point of view is say: when you come back, execute my listener, that Runnable there, call its run method and do something. And it's really anything you want to do; the example here is just, when you get back the rows, iterate through them and do your stuff. But this is a very, very handy thing to be able to do, because it also enables you to parallelize your calls. So, a very simple example, but what it's trying to show is that for a similar query I've created a bunch of calls and had the session execute them asynchronously.
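A hedged sketch of that callback pattern, using Guava's Futures helper; the two-argument addCallback shown here is the form from the Guava versions of this era:

    import com.datastax.driver.core.ResultSet;
    import com.datastax.driver.core.ResultSetFuture;
    import com.datastax.driver.core.Row;
    import com.datastax.driver.core.Session;
    import com.google.common.util.concurrent.FutureCallback;
    import com.google.common.util.concurrent.Futures;

    public class AsyncExample {
        static void readUsersAsync(Session session) {
            // Returns immediately; no thread blocks waiting on Cassandra.
            ResultSetFuture future = session.executeAsync("SELECT * FROM users");
            Futures.addCallback(future, new FutureCallback<ResultSet>() {
                public void onSuccess(ResultSet results) {
                    for (Row row : results) {
                        System.out.println(row.getString("username"));
                    }
                }
                public void onFailure(Throwable t) {
                    System.err.println("query failed: " + t);
                }
            });
        }
    }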
This lets you really optimize how you're doing things. If you are having to do lots and lots of big queries, where you want to insert lots of data or select lots and lots of rows, then rather than issuing one big select, break it down into smaller queries, because this gives you a lot of advantages. First of all, it's very easy to do with executeAsync and the futures. But there is also a problem with doing large queries, which is that they can result in your coordinator node doing a lot of work, and that introduces a certain amount of hotspotting on that node, which can have an effect on general throughput. So breaking things down into smaller queries is often better than having just one big query. Also, if you're using latency awareness for your load balancing, which I'll talk about, a big query can often skew it: because you're giving one node a very expensive operation, your 99th percentile is now thinking, oh, this node is slower, when actually it's just doing a lot of work. And the other big advantage of breaking things down into smaller queries is that if one query fails, it's fine, I can retry it; it's not an expensive operation.
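A sketch of that fan-out pattern: many small per-partition queries fired concurrently instead of one big query. Here selectByUser is a hypothetical prepared statement (prepared statements are covered next):

    import java.util.ArrayList;
    import java.util.List;
    import com.datastax.driver.core.PreparedStatement;
    import com.datastax.driver.core.ResultSetFuture;
    import com.datastax.driver.core.Row;
    import com.datastax.driver.core.Session;

    public class FanOutExample {
        static void readEach(Session session, PreparedStatement selectByUser,
                             List<String> usernames) {
            // Fire all the small queries without blocking; each one hits the
            // replicas for its own partition instead of one hot coordinator.
            List<ResultSetFuture> futures = new ArrayList<ResultSetFuture>();
            for (String username : usernames) {
                futures.add(session.executeAsync(selectByUser.bind(username)));
            }
            // Collect the results; any single failure is cheap to retry.
            for (ResultSetFuture future : futures) {
                for (Row row : future.getUninterruptibly()) {
                    System.out.println(row.getString("username"));
                }
            }
        }
    }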
The other thing you've got with the native protocol is prepared statements, and these are just like prepared statements in your traditional JDBC world. These statements get compiled, they're intended for multiple executions, and they're really straightforward to use: if anybody has ever written any Java code connecting to a database, you will understand exactly what's happening here. There are some considerations around prepared statements; I mean, don't prepare a statement unless you are going to be reusing it. I think that's really obvious, but yeah.
These are your friends; this is something you will predominantly use. And there's more than one way of binding your variables in there: you can do it by index, you can do it by name, and I think you can also define named variables now and substitute stuff in. So it's fairly easy to use.
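A minimal sketch; the statement and values are illustrative:

    import com.datastax.driver.core.BoundStatement;
    import com.datastax.driver.core.PreparedStatement;
    import com.datastax.driver.core.Session;

    public class PreparedExample {
        static void insertUsers(Session session) {
            // Prepare once...
            PreparedStatement insertUser = session.prepare(
                "INSERT INTO users (username, email) VALUES (?, ?)");

            // ...then bind and execute many times, positionally...
            session.execute(insertUser.bind("johnny", "johnny@example.com"));

            // ...or via explicit setters on a BoundStatement.
            BoundStatement bound = insertUser.bind();
            bound.setString(0, "alice");
            bound.setString(1, "alice@example.com");
            session.execute(bound);
        }
    }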
And then the other thing we have: whether it's with the previous examples or not, if you want to set your consistency level, you just call setConsistencyLevel on that query and pass in whatever you want. Worth pointing out: the default is a consistency level of ONE, so in that case the call is somewhat redundant, but just be aware that if you don't set it, it will be using a consistency level of ONE. And then, in the same way, you build up that query, you call session.execute, and you get your results back.
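For example (QUORUM is just an illustrative choice):

    import com.datastax.driver.core.ConsistencyLevel;
    import com.datastax.driver.core.Session;
    import com.datastax.driver.core.SimpleStatement;

    public class ConsistencyExample {
        static void readAtQuorum(Session session) {
            SimpleStatement statement =
                new SimpleStatement("SELECT * FROM users WHERE username = 'johnny'");
            // Per-statement override; without this the driver defaults to ONE.
            statement.setConsistencyLevel(ConsistencyLevel.QUORUM);
            session.execute(statement);
        }
    }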
So really, if you're going to take away anything from these previous slides, it probably should be the asynchronous nature of how you can use the driver. It is really cool and very efficient, and it lends itself to a lot of use cases. It's all non-blocking I/O, it's very nice, and you will see a lot of performance differences when you start breaking things up: bulk reads, bulk writes. It's a very handy way of splitting the workload more evenly around your cluster, so use it and have fun.
Alright. So what you've also got with the drivers and the native protocol are a bunch of policies, and the policies are for load balancing, for reconnection, for retries, and there are a bunch of pre-written policies for you. The first one here is a multi data center load balancing policy, and what this is saying to your client is:
Route your requests to your local data center, and only fail over to the remote one when you need to. And that's quite nice: it gives you a lot of tolerance to failure, to nodes flapping, stuff like that. Or if you're basically hammering that local data center and it's having problems: fine, let's start switching some of my requests over to my other data center.
Another one you've got is the token aware load balancing policy. With a token aware load balancing policy, what it's doing is this: when your client connects to Cassandra, it chooses a coordinator node, and as you all know, Cassandra is entirely masterless, so it doesn't matter which node it chooses. What the token aware policy means is: I will connect to one of the nodes that owns this partition of data, and what that means is I don't have to incur, for instance:
A network hop from the coordinator to the other nodes, because my data is already residing on the node I'm connecting to. And that's quite nice, and what makes it even nicer is that, because the client driver is getting these events pushed to it, when you add nodes or the partition ranges change across your nodes, your client is told about this.
It's always aware of which nodes own which token ranges. Really, the way to use this is to instantiate it with a child policy; you have to instantiate it with a child policy, and the example here is actually a very common way of setting up your cluster: you have your token aware policy, and then you say, once I'm token aware, also make me DC aware, so I'll connect to my local data center.
I will choose a coordinator, but one that has an affinity to that partition of data, and when I connect to a remote data center I'll do the same thing. It's very nice. And I probably should say that there is a default policy as well, as a lower bound, which is just round robin.
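That common token-aware-wrapping-DC-aware setup looks roughly like this; the local data center name is illustrative:

    import com.datastax.driver.core.Cluster;
    import com.datastax.driver.core.policies.DCAwareRoundRobinPolicy;
    import com.datastax.driver.core.policies.TokenAwarePolicy;

    public class LoadBalancingExample {
        public static void main(String[] args) {
            Cluster cluster = Cluster.builder()
                    .addContactPoint("10.0.0.1")
                    // Route to a replica that owns the partition, and within
                    // that, prefer nodes in the local data center.
                    .withLoadBalancingPolicy(
                        new TokenAwarePolicy(new DCAwareRoundRobinPolicy("DC1")))
                    .build();
        }
    }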
Then, obviously, there's the default retry policy, which is quite conservative: it will only retry in the narrow cases where that's safe; it just doesn't do much. The next one there, which we're using in that example, is the downgrading consistency retry policy. You should use this with caution; don't use it unless you genuinely know what you're doing. The idea here is this:
Imagine, for example, I've asked for a consistency level of ALL and I'm executing my query against Cassandra, and, say I'm in a multi data center deployment, the connectivity to the other data center is unavailable. I can't honor that consistency level, because I can't get the acknowledgements back from all of my data centers. So what do you do? You can fail the request and say, well, I can't honor ALL; or I can say, you know what, okay, I'll drop down to a weaker level of consistency.
So, for instance, I go from ALL down to, say, LOCAL_QUORUM, and then, when the connectivity comes back up, it will go back to honoring the original level. It's a very handy tool in your box, but I'd probably argue that if you've gone for ALL, you probably have a really good reason to go for ALL. So don't use it just thinking, by default I'll just use this downgrading retry policy. Understand what it means for your application, understand what it means for your data; don't just use it out of the box.
You've got the fall-through retry policy, which basically doesn't do anything: it just lets the failure bubble up to your business logic, and you decide what to do in that situation. And then you've got the logging retry policy. With the logging retry policy, as with the token aware policy, you nest a child policy within it, and all it's doing is basically logging out:
When am I retrying. I can't see why you would ever have a retry policy and not want to know when you're retrying, because something's really going wrong. So use the logging retry policy; it will be fun. And there's a link down there to the Javadocs on this, which you can have a look at.
retry
policies
are.
You
know
it's
important
to
really
really
I'll
say
it
is.
It
is
quite
important
to
understand
what
you're,
what
you're
doing
there
you're.
You
know
you're
retrying
on
failure,
so
something
has
gone
wrong.
A
Hence
your
you're
doing
this.
This
isn't
like
on
load
balancing
this,
isn't
saying:
I'm
going
to
be
routing
requests
where
it's
efficient,
something
that
something's
gone
wrong
and
you're
now
retrying.
So
no,
what's
the
right
thing
to
do
when
that
happens,
then
it
don't
just
a
particularly
downgrading
consistency
policy.
That's
you
know
that
that
that's
a
choice
you've
made
for
a
functional
reason
for
a
requirement
to
choose
that
level
of
consistency.
So
downgrading
is
not
not
might
not
necessarily
be
the
right
thing
to
do.
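If you do decide downgrading is right for your application, a sketch of pairing it with logging, per the advice above:

    import com.datastax.driver.core.Cluster;
    import com.datastax.driver.core.policies.DowngradingConsistencyRetryPolicy;
    import com.datastax.driver.core.policies.LoggingRetryPolicy;

    public class RetryPolicyExample {
        public static void main(String[] args) {
            Cluster cluster = Cluster.builder()
                    .addContactPoint("10.0.0.1")
                    // Log every retry decision the downgrading policy makes,
                    // so a consistency downgrade never happens silently.
                    .withRetryPolicy(
                        new LoggingRetryPolicy(DowngradingConsistencyRetryPolicy.INSTANCE))
                    .build();
        }
    }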
We then have a bunch of reconnection policies, and these govern how often we will attempt to reconnect to a dead node. It's fairly simple; there are really two choices there. You've got a constant reconnection policy, which says: if my node is marked as down, try again every X number of milliseconds, again and again. And then you've got an exponential reconnection policy, which backs off between attempts, up to a maximum delay.
The constant attempts would worry me in a system, because of the type of workload it creates: when a node goes down, it's quite likely that lots of your clients have noticed it, or are experiencing an issue and retrying, and they're all going to start hitting it at once. So having a bit of variability in those retry attempts is nice, and I personally usually go for the exponential one. I'll also point out that for all of these policies you can write your own; it's quite extensible.
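A sketch of the exponential option; the delays are illustrative, starting at one second and capping at ten minutes:

    import com.datastax.driver.core.Cluster;
    import com.datastax.driver.core.policies.ExponentialReconnectionPolicy;

    public class ReconnectionExample {
        public static void main(String[] args) {
            Cluster cluster = Cluster.builder()
                    .addContactPoint("10.0.0.1")
                    // Back off between reconnection attempts to a downed node:
                    // 1s, 2s, 4s, ... up to a 10 minute ceiling.
                    .withReconnectionPolicy(
                        new ExponentialReconnectionPolicy(1000L, 10L * 60L * 1000L))
                    .build();
        }
    }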
This is all up on our GitHub, github.com/datastax. It's very extensible, and people do write their own policies, for instance maybe a time-based one: at midnight, don't use this data center, use this one, stuff like that. There are lots and lots of ways you can extend this for whatever your use cases and needs are. Another thing, and the reason this next one is in here is because it was probably one of the most painful points with early versions of Cassandra:
Paging. What you've got here isn't anything fancy, but in comparison to what we had before, when paging wasn't built in, this is very, very nice. Historically, getting large data sets out of Cassandra into your client was a problem, because Cassandra would load that result set into memory and then drop the whole thing over to your client. So you'd get out-of-memory exceptions a lot; you'd have problems.
The slide here is just showing you how it works, but the one thing that's quite cool about this is the concept of state as you're paging your data. Think about it: in that example, the client queries that node and gets back the first page of the data. Well, what happens if that node goes down? Do I have to start again from the other nodes? You don't. What you actually have is, essentially, a cookie:
If you will, one that tells you what point you've reached. So if that node goes down, the next page request can simply go to another node in your cluster without having to start again. So it is tolerant to failure, and from a personal point of view this is, for me, one of the nicest things. There are loads of nice things, but this is a very nice one, because I hated how ugly it used to be to get large result sets back. So this is your friend.
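A sketch of automatic paging as it appeared with driver 2.0 and native protocol v2; the fetch size is illustrative:

    import com.datastax.driver.core.ResultSet;
    import com.datastax.driver.core.Row;
    import com.datastax.driver.core.Session;
    import com.datastax.driver.core.SimpleStatement;

    public class PagingExample {
        static void scanUsers(Session session) {
            SimpleStatement statement = new SimpleStatement("SELECT * FROM users");
            statement.setFetchSize(100); // rows per page
            ResultSet results = session.execute(statement);
            for (Row row : results) {
                // The driver fetches the next page transparently as the
                // iterator advances, carrying the paging state with it.
                System.out.println(row.getString("username"));
            }
        }
    }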
You won't even notice it's there. The other thing, which we discussed, and I showed you the example before, is tracing. So yes, in exactly the same way as from the CQL command line, you can enable tracing and get results back. You can also do this within the driver, and all you simply do, whether you're using the query builder or just your own statements, is call enableTracing on the statement, and when you execute that query:
You get back an ExecutionInfo object, and from there you can get a QueryTrace object. This basically gives you all the diagnostic information that you would get if you'd gone through the command line: which nodes were attempted, the latency, whether the coordinator had to go out to the other nodes to find the data.
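A sketch of per-statement tracing; the query is illustrative:

    import com.datastax.driver.core.ExecutionInfo;
    import com.datastax.driver.core.QueryTrace;
    import com.datastax.driver.core.ResultSet;
    import com.datastax.driver.core.Session;
    import com.datastax.driver.core.SimpleStatement;

    public class TracingExample {
        static void traceQuery(Session session) {
            SimpleStatement statement =
                new SimpleStatement("SELECT * FROM users WHERE username = 'johnny'");
            statement.enableTracing(); // per statement, not global

            ResultSet results = session.execute(statement);
            ExecutionInfo info = results.getExecutionInfo();
            QueryTrace trace = info.getQueryTrace();

            System.out.println("coordinator: " + trace.getCoordinator());
            System.out.println("duration (us): " + trace.getDurationMicros());
            for (QueryTrace.Event event : trace.getEvents()) {
                System.out.println(event.getDescription() + " @ " + event.getSource());
            }
        }
    }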
You get a lot of diagnostic information in there. But, rather crucially, this is something you are doing programmatically: you're not doing it at a global level, you're not going in and saying, okay, now enable tracing on all my queries; you're doing it on a per statement basis. That can be a challenge, because by the time you know something's gone wrong, you need to have already enabled tracing, so think carefully about that. The other thing to think carefully about with tracing is that there is a performance cost to enabling it: it is persisting data in Cassandra, it is consuming resources. So use it when you need to diagnose stuff; don't run it in production, and obviously don't release code with tracing enabled, because you're going to be hammering everything. I've seen people add some randomness to turning it on, so that, say, five percent of calls enable tracing.
Now, the tools you have around CQL and the native protocol: you've obviously got the command-line utility, cqlsh, that will let you connect up and do what you like. We've also built DevCenter. This is completely free: go download it and have a play with it. You get all the kinds of things you'd want from a tool like that: syntax highlighting, auto-completion, pretty colors, connecting up to clusters and managing all that stuff. It's using the native drivers as well, and it's pretty new.
We launched it back in, I think, September last year, so it's at version one at the moment, and we're very keen for people to try it out and tell us what they think: anything you don't like, anything you'd like to see changed, or any extra things you want to see in there.
So, finally, there's a bunch of links here that will help you a lot. The first one there is obviously datastax.com; that's where I'd say to go for all your Cassandra stuff.
The second link up there has a bunch of stuff to help you if you've not used Cassandra before: downloadable VMs, getting yourself up and running. We also have quite a comprehensive amount of training, all online and completely free; just go off and start using it and learn. It is pretty geared up to the native driver, Java and CQL, so it's quite good for this, and from our downloads page you'll be able to download everything.
The developer blog has lots of interesting things that I would suggest reading; loads of cool stuff. And our community site for Cassandra is Planet Cassandra, planetcassandra.org. My favorite thing on there is that we have a bunch of use cases and interviews with people: why they've used Cassandra, how they've used Cassandra, stuff like that, and just tons of webinars over there. So that's me. Thank you.
There was a lot to cover and it went quite quick; I could have stood here for a day and talked about this, so it's hard to fit it all in. Oh, yes: I'm here all week. Great, we've got questions coming in already. We're on the fifth floor; come over, speak to me, speak to any of my colleagues, with any questions you have, anything you want to know. I'm here to help.
Cool, so we have some questions. "Are you planning to introduce data update notifications?" No, and arguably that's not really the driver's responsibility. If you're asking me how I would do data update notifications: the only thing you've got in your toolbox at the moment is triggers, and I will definitely say this, triggers are experimental. We put them out with Cassandra really to see how people are going to use them, how they want to use them.
"Can Thrift and CQL be mixed?" I believe they can. I've not tried it myself, and I have seen a couple of community drivers where they are using CQL over the Thrift RPC transport, but I wouldn't advocate it. Really, CQL and the native protocol are made to work together. I wouldn't advocate it personally; if it works for you and it's going to do what you want to do, go ahead, but really, CQL over the native protocol:
That's the way to go, if you want to go down that route. Next question: "Is there a Scala client supporting CQL with an async API that you would recommend?" I'd recommend looking at the client drivers download page on our site; there are some Scala ones in there. Planet Cassandra also links to some Scala libraries. Is there a specific DataStax supported Scala driver we have built? No, but:
I think the demand is there, and it wouldn't actually be a difficult thing to do, because our Java driver, being all asynchronous, all non-blocking I/O, would fit very simply into this. Really, all you'd be doing is:
I think, if you built a Scala driver, making it a little bit more friendly to use; I think you'd ultimately just be delegating out to the Java driver anyway, but you'd make it nicer. So yeah, Scala: I do know lots of people using Scala with the native drivers already. And you know, this is all open source, it's all up on GitHub; we'd be quite happy:
If you guys want to build a Scala driver, build a Scala driver. Another one there: "Usually when using Cassandra, you start from the initial requirements and build the keyspaces, etc., but adding other info usually implies creating other keyspaces. This can result in a performance problem, querying multiple keyspaces sequentially. What is the power of Cassandra with CQL instead of using relational DBs?" That's quite a big question.
Let's start with the first part: usually, when you're using Cassandra, you start with an initial requirement and you build your keyspaces. I'm not quite sure I understand the question there; I mean, if you're talking about data modeling... who asked that question? Yeah, I don't quite know what you're asking.
But if you are asking about the approach to modeling data in CQL, understand: it is a denormalized, query-based approach to modeling your data, and the implication of that is that you need to know how you are actually going to be querying, and you're really optimizing for that. There is no secret special sauce that's going to make this any less challenging than it would otherwise be. You're going to have issues if your requirements change or you get new requirements: how do I adapt my model to meet these new requirements? And really, to read into the question:
What people always ask me, if this is the question you are asking, is: I've got this massive table with a bunch of data in there that I've optimized around how I'm going to be querying it, and I've now got a new requirement to query it a different way. What do I do?
I create the new table, but I've got all this data sitting over here; how do I backfill it into my new table so that my data is in there and can be queried? It's doable, but, not going to lie, I don't think it's a particularly easy thing to do; there's no click-a-button for it. But there are a bunch of approaches. You've got a very low level one:
You can start writing SSTables yourself: you read in the SSTables from the other table, parse them, transform them, write them out in the new table's SSTable format, and then you just stream them back into the cluster. That's one approach. You would also probably want to start writing to both tables in parallel before you do that, so you're just catching up from the point when you took the SSTables. And you've also got, you know, brute force:
You can basically write something that reads stuff in and writes stuff back out; it depends on the type of data. This is the nice thing with the async stuff: from a client point of view it's pretty quick. On my laptop I can pump millions of writes into Cassandra, just purely by pumping them through. But that might not suit you.
There's also a whole ecosystem of ETL tools around this that's evolving a lot. We've partnered with Jaspersoft and Pentaho, building these nice kinds of ETL load-and-transform tools; certainly Kettle is CQL compliant and will let you do this kind of stuff. And the other approach you have for doing:
This population is, certainly with DataStax Enterprise, which is our uber product around Cassandra: it gives you an analytics part, and what that is, essentially, is an HDFS compliant file system sitting on top of your Cassandra column families, which means you can start writing Hive queries, Pig queries, MapReduce jobs. So you could write something on there to read, transform and write to the new table, and this would be a batch thing: it would just go off and do it and get through it.
So, if that was the question, there's a bunch of approaches to it. And I think I've actually finished a bit early, which is... I was worried about overrunning, so yeah. Thank you very much. If there are no other questions, we'll call it a day. As I said, I'm down at the booth; come and speak to me, ask anything you want to know, and you've got my Twitter and email.