From YouTube: This Week in Cassandra: Operations in the Cloud 3/4/2016
Description
Link to blog referenced in video: http://www.planetcassandra.org/blog/this-week-in-cassandra-operations-in-the-cloud-342016/
A
All right, welcome everyone to another edition of This Week in Cassandra. Good times this week: we've got Ben Bromhead with us from Instaclustr. He's going to be talking about operations in the cloud with us in the second half. First, we're going to take a look at some of the news from the week. What have we got here today, guys? The first thing, I think, on our agenda: the SSL vulnerability. Pretty big deal.
B
B
C
Yeah, so this is interesting. This has been making the news just the last couple of days. They released a paper; I think March 1st is when this was disclosed. It was actually kind of interesting, because they did a scan and found that almost a quarter of the top million domains were vulnerable to this. Basically, the vulnerability has to do with SSLv2.
C
So this is a really old protocol from the 1990s. Most modern clients and servers don't even use it to communicate anymore; it's been replaced by TLS and other things like that. But what they found when they went out and looked at this vulnerability was that a lot of web servers, and a lot of servers that are publicly accessible, still have SSLv2 enabled, even though clients aren't using it.
C
It's still enabled on the server, and simply by having it enabled on your server, it's possible to expose yourself to a sort of man-in-the-middle attack, where somebody can probe your SSLv2 server, which is not secure, and then be able to intercept traffic using other protocols that are more secure, like TLS. So it's kind of an interesting attack, and it was kind of shocking how many websites out there are vulnerable to this.
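Editor's sketch, not from the speakers: the mitigation described above is to stop offering the old protocol on the server at all. The snippet below is a minimal illustration using Python's standard `ssl` module, assuming a Python 3.7+ runtime; it is not Cassandra configuration, and the helper name `make_server_context` is hypothetical.

```python
import ssl


def make_server_context() -> ssl.SSLContext:
    """Build a server-side TLS context that refuses legacy protocol versions."""
    ctx = ssl.SSLContext(ssl.PROTOCOL_TLS_SERVER)
    # Refuse anything older than TLS 1.2, which rules out SSLv2 and SSLv3.
    ctx.minimum_version = ssl.TLSVersion.TLSv1_2
    return ctx


context = make_server_context()
print(context.minimum_version.name)
```

A real server would also load a certificate and key into the context before accepting connections.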
B
Well, why I think this really matters to the Cassandra community is that, of course, you can use TLS and SSL to secure connections from one place to another, both client communication and node-to-node communication. And yeah, you could be using a very outdated certificate. I always hated people putting infinity on the expiration of their SSL certificates, because they probably created it in, like, 1995, right? And they're still using the same certificates.
C
E
E
Yeah, I think one of the issues with it is that if you're using the same certificates on your Cassandra cluster and in another place that is vulnerable to it, someone can actually use that other service as an oracle to decrypt connections to Cassandra, even if Cassandra is using more modern SSL or TLS families. I'm pretty certain that SSLv2 has not been in any modern JDK for some time.
E
B
Yeah, I think that was my point too. I mean, SSL has been on more than just websites, of course. So we want to make sure you're aware of that. I think if you're using Apache Cassandra, you're probably using it to serve some sort of website or mobile app or something. So there you go. Do your thing.
A
Speaking of upgrading your stuff, interesting segue: Cassandra 3.4 feature freeze, how about that? We've got some new stuff coming out, don't we? Hey DuyHai, you've been looking at SASI a lot recently. We talked a little bit about it a couple of weeks ago. What do you think the state of this is?
D
So it's interesting, because currently I'm staying in Riga, in Latvia, and just yesterday I gave a talk and showed SASI in action, and people were quite excited. And yes, it is really, really nice, because until now we used to tell people: okay, guys, you cannot search your data easily; you need to design your schema to be able to query your data. That is still true, and we will continue to say that to people, but now with SASI we add another level of flexibility for querying.
D
So we have different kinds of queries. To go a little bit into detail: you can query text data, so you can create an index on your column. If it is a text column, you can choose either to use a tokenizer, so you split your text into tokens, which is a kind of full-text search, or just a simple text query. And if you have a numeric value, you can choose another mode to perform range queries.
E
D
A
I think it's important to talk a little bit about why we even need this, right? I don't know if you've looked at the old implementation, or rather the existing implementation, of secondary indexes, but they're pretty limited. I think one of the big problems is that they're based on tables themselves, and Cassandra tables really don't make good data structures for search indexes. No, not at all. Right.
A
You have to know exactly what the primary key is, so you can only look things up if you know the exact term that you're looking for. It makes a really terrible flexible index. You can't really compare it to anything else that's available, let's say something that's built on Lucene, or something that does, you know, some sort of...
A
And then there are other problems that you have with the existing secondary index implementation. Each term is basically a partition with a list of things that it points to, but it can't point to locations in files because of the compaction process. You have indexes that are being compacted totally independently from the data they point to, so they can only point to keys, and you still have to go through the entire internal process of looking up your data.
A
You still have to check bloom filters, you still have to scan SSTables, you have to merge a bunch of stuff together, and that's not particularly fast, right? So if you have a ton of data that's going to be coming back, it's going to take forever. Secondary indexes don't perform very well, I think.
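Editor's sketch, not from the speakers: a toy model of the limitation described above, where the index can only hand back partition keys and every key then goes through the normal read path. All names and data here (`sstables`, `city_index`, `read_partition`) are hypothetical, and plain dictionaries stand in for Cassandra's real on-disk structures.

```python
from typing import Dict, List, Optional

# Toy "SSTables": each maps partition key -> row; later tables are newer.
sstables: List[Dict[str, dict]] = [
    {"u1": {"city": "Riga"}, "u2": {"city": "Austin"}},
    {"u3": {"city": "Riga"}},
]

# Toy legacy index: exact term -> partition keys (keys, not file offsets).
city_index: Dict[str, List[str]] = {"Riga": ["u1", "u3"], "Austin": ["u2"]}


def read_partition(key: str) -> Optional[dict]:
    """Normal read path: consult every SSTable and merge, newest wins."""
    merged: Optional[dict] = None
    for table in sstables:
        if key in table:  # stands in for the bloom filter check
            merged = {**(merged or {}), **table[key]}
    return merged


def query_by_city(city: str) -> List[Optional[dict]]:
    """Exact-match only: fetch the keys, then do one full read per key."""
    return [read_partition(key) for key in city_index.get(city, [])]


rows = query_by_city("Riga")
```

Note that a partial term such as `"Rig"` finds nothing, which is the lack of flexibility being described.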
A
So you end up with compaction problems, you have a lack of flexibility, and then you have performance problems on top of that. I think the interesting thing about SASI is that it addresses all of these issues. You've got additional flexibility with the ability to tokenize text, and you can actually do prefix searching, which is amazing. We've got a LIKE clause in Cassandra now, something I never would have thought we would have. And you end up with, I think, queries that are actually fast.
A
So the short version of the architectural details: a SASI index is built with an SSTable. Every time an SSTable is flushed to disk, there is an index that goes along with it, and instead of pointing just to keys, it points to file offsets. It's a B+ tree that's mapped into memory, so it's a data structure that's very efficient for indexes, without the usual drawback of B+ trees, which is that they're terrible for updates and deletes. So you actually avoid a lot of the nonsense.
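Editor's sketch, not from the speakers: a toy version of the idea just described, an index written once alongside one SSTable that maps terms to file offsets and supports prefix search. A sorted list with binary search stands in for the memory-mapped B+ tree, and the class name `SSTableIndex` and the sample offsets are hypothetical.

```python
from bisect import bisect_left
from typing import List, Tuple


class SSTableIndex:
    """Immutable sorted index built once alongside a single SSTable."""

    def __init__(self, entries: List[Tuple[str, int]]):
        # Each entry is (term, file_offset); sorted once and never updated,
        # so the update/delete cost of a tree structure never comes up.
        self._entries = sorted(entries)
        self._terms = [term for term, _ in self._entries]

    def prefix(self, p: str) -> List[int]:
        """Return the file offsets of every term that starts with p."""
        i = bisect_left(self._terms, p)
        offsets = []
        while i < len(self._terms) and self._terms[i].startswith(p):
            offsets.append(self._entries[i][1])
            i += 1
        return offsets


index = SSTableIndex(
    [("jon", 0), ("john", 128), ("patrick", 256), ("jonathan", 384)]
)
hits = index.prefix("jon")
```

Because the result is a list of offsets rather than keys, a reader could seek straight into the data file instead of repeating the full key lookup.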
A
You get a ton of extra features, and they'll be more performant, assuming that we're going to keep them in memory. So that's something that is important about SASI: you can actually start to take advantage of the extra memory on your machine. Memory is so cheap. Every year you see the Amazon instances at the same price with more RAM, and you're like, well...
A
B
A
B
A
little plug here: we are going to be doing a meetup at the Apple campus on April 4th, and that's in Cupertino. It will be those guys talking about it; they're going to be unveiling this to the world, so this will be a pretty monumental event. There are 200 seats available, and we will be posting that next week on meetup.com. If you're in the Valley and you want to come see this, I would just watch the Planet Cassandra Twitter for the announcement, but that's going to be a huge event. Yeah.
A
I can't even tell you how excited I am for this feature. I mean, this is going to make everyday querying faster and more flexible. And also, if you've got Spark jobs where you want to do real predicate pushdown, you'll be able to take advantage of these indexes and get much faster Spark jobs for your analytics purposes. So in my opinion, this is one of the biggest features to come into open source Cassandra in a while.
C
I do too. I think one other point that's important to make is that people who are new, who are coming from the relational database world to Cassandra, see that Cassandra has secondary indexes and they think, oh, it must work like in a relational database. And we've spent a lot of time telling people: don't do that.
C
Don't do that; they don't work how you think they do. And then we spend a lot of time explaining how they work. I think you touched on it, John: with that added flexibility that is coming, people with that traditional database background are going to have that much more of the functionality they're used to from the Oracles and MySQLs of the world. So I think it's going to make it easier for people to transition over to Cassandra.
A
One extra point, though. I mean, Luke, this is a really, really good point that you touched on here. It's going to make it easier for people to transition, but it's also important to emphasize that it's still better, whenever possible, to create multiple views of your data and be able to query a single partition. This is not just a free, like, hey...
B
The first rules still apply there. And I think, just to kind of cap this topic: stay tuned, because as this feature is developed, and it is continuing to be developed, we'll learn more about the performance and we'll learn more about how you should deploy it. Just stay tuned, watch Planet Cassandra, watch our Twitter feeds. This will be a hot topic. It will be heavily talked about for the next few months at meetups, you name it.
B
Planet Cassandra, of course, is a great place, but also, I think, DataStax Academy. We're probably going to have some courses on there, some tutorials on using, well, materialized views along with SASI. I mean, these are all features that you need to know how to use before you deploy them in production. Yep. Absolutely.
D
I'm going to do a technical deep dive and write a blog post to explain to people how it works internally, because there are still some tricks. For example, as John said, people should not use it blindly. If you put an index on user email, we know that for one email address you get at most one person, so it means that even with a SASI index it may scan your whole cluster. So there is no magic here. We are not selling magic. We are selling software.
E
That actually brings up a really good point that Patrick mentioned there, about how this feature kind of stabilizes and matures. It's one of the things that I've been struggling a little bit to wrap my head around, just as someone who consumes the open source.
E
A
A
So I think it can be a little bit tricky if you're not familiar with the tick-tock release cycle. With 3.0, open source Cassandra switched to a tick-tock release cycle. It's a monthly release cycle, and if the version number ends with an odd number, then you can consider it a bug-fix release. If it ends with an even number, then it's a feature release. So version 3.4, which we mentioned, that's a feature release.
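Editor's sketch, not from the speakers: the odd/even rule just described, written out as a small function. It assumes two-part 3.x version strings from 3.1 onward; the function name `release_kind` is hypothetical, and 3.0 and its 3.0.x patch releases are deliberately rejected because the rule as stated does not cover them.

```python
def release_kind(version: str) -> str:
    """Classify a 3.x tick-tock version by the parity of its minor number."""
    major, minor = (int(part) for part in version.split(".")[:2])
    if major != 3 or minor == 0:
        # Assumption: the parity rule is only applied to 3.1 and later 3.x.
        raise ValueError("parity rule covers 3.1 onward in the 3.x series only")
    return "feature" if minor % 2 == 0 else "bug-fix"


print(release_kind("3.4"))
print(release_kind("3.3"))
```

Under this rule 3.4 is classified as a feature release and 3.3 as a bug-fix release, matching the examples in the conversation.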
A
B
If you think about it, Ben, I think what you're talking about is a matter of stability and what you feel like you can put into production. I don't think it would be wise to put an even release into production. I mean, this almost goes back to the Linux kernel guidelines, right, even and odd numbers on the kernel. You wouldn't put an even number in production, because it is a new feature release.
B
But if you put an odd number in, you're not expecting anything new, so these should be rolling up any bug fixes in the whole 3.0 and 3.x line up to that point. That's going to give you some assurance that you're not getting something that's got new things to worry about, but something that's rolling up bug fixes that will hopefully cover things that were released previous to that. That's a good thing, I think.
B
What's a good number? You tell me. Well, you know, I think the old adage was that you would wait until dot six with Cassandra, right?
E
D
E
B
I know that we have people that are either in production or pre-production with 3.3. So if you think about that, that's almost five months, almost six months, of Cassandra 3.x out in the wild, and that covers that six-month period. So you could probably think of that as a good starting place, and when 3.5 comes out, it's probably going to be the next iteration up. But I would say my guidance right now is looking at three dot...
E
I see people doing that, yeah. And I know people on the dev mailing list might not like this, because I understand the purpose of tick-tock was to start getting features out the door a little bit quicker and in a safer manner, so you don't have these big, massive releases that are riskier to adopt. But I feel like we need some sort of middle ground, particularly in the 3.x transition series.
A
B
That's like an LTS, yeah. So it will be held up through the 3.x lifespan, and we will probably reevaluate when it comes to 4.0 whether that's a good idea or not. But just to bridge the old with the new, that was the decision made. So the 3.0.x releases, the three dot zero dot number ones, those are just backported bug fixes into 3.0.
A
A
E
So I guess where our excitement comes from is that sometimes you've been running Cassandra in production for a year and you're getting pretty good at it. You know enough to be dangerous, and you've saved yourself a couple of times.
E
Your data model's humming along, you're growing, you're scaling, it's all going great, and then all of a sudden one or two servers start throwing massive GC pauses, seemingly at random, and you ask what's going on here. This kind of tracing becomes really great for finding outliers within your data model, when you've got that one user who's got 10 million followers on your amazing social network.
E
E
It just gives you that end-to-end visibility, so you can really do a good deep dive when things go wrong. It's really good for that one percent class of problems that just leave you scratching your head for a while before you eventually figure them out. So we're excited.
A
Ben, since you're one of the founders of Instaclustr, and you talk about the one percent: do you end up seeing this type of problem manifest frequently enough with your customers that this is the type of thing that is going to be able to save people a lot of time or money?
E
Definitely. And when I say one percent, that's not one percent of all Cassandra users; that's one percent of your queries or of what you're doing, and that's spread over a lot of people. That happens a lot. So I'm looking at this and thinking about Aaron and our engineers helping our customers figure out and diagnose this stuff, and this is going to help our customers out.
E
This is going to help the community out massively, and it's also going to save us a lot of time, which, to be completely selfish, is what I'm most excited about. But I know it's going to help a lot of people out. It really is about your mean time to recovery, or to fixing a particular issue, particularly with data model issues, and seeing what's happening under the hood, like the number of SSTables it's reading for a particular query. I'm excited.
A
E
E
Getting the default JVM settings to a great place... look, there are still a few outdated things in cassandra-env, but there are lots of great resources to improve that. And for just getting it set up from the operations side, you've got services like ours to take care of it, and you've got products like DSE that help you out with that.
E
So what we tend to find is that, for most people, it's the data model that really kills them. In particular, you can get into a situation where you go live with Cassandra, having migrated from some other database or built the project fresh on it, and everything's going great for about the first six months. Then you start to add capacity, and all of a sudden you're like, whoa, this is not happening, and you realize those secondary indexes weren't a particularly great idea, or, you know, all of a sudden...
E
D
E
Yeah, so one of the things that we do when we onboard someone is that we review their data model and schema, completely for free, and try to identify any low-hanging fruit. I wish we had an automated process, but really, there are some schemas and data models that might not be appropriate for a service that requires super-low latency. But if someone doesn't really care too much whether it's a 200 or 300 millisecond read from a very wide row,
E
you know, they're not too worried about that. So by having a look at it and helping them out, just identifying any red flags or ways they might shoot themselves in the foot, they get a great experience with Cassandra up front. Certainly for us, the data model is the thing that we really focus on in helping people out and making sure it's ready to scale for the next couple of years.
A
Right on. All right, well, I think it's time to wrap this up; we're about at our limit. We've got a lot more information on the Planet Cassandra This Week in Cassandra blog, so definitely check that out. Ben, thank you very much for coming on. It was awesome talking about operations in the cloud. And don't forget to update your SSL, people. This is very, very important. So thank you very much. All right, thanks.