Description
Speaker: Carlos Alonso, Software Engineer
This talk will be a step by step walkthrough of a developer troubleshooting a real performance issue we had at MyDrive, from the very first steps diagnosing the symptoms, through looking at metric charts down to CQL queries, the Ruby CQL driver, and Ruby code profiling.
First of all, thank you all for being here. I hope you are really enjoying the show. As was just introduced, we are going to see a case study of how I, as a developer, have been troubleshooting production issues during the last couple of months. So hopefully you'll get some insights on what to do, or what not to do, as well.
We are going to see a real-life problem we had a couple of months ago, and we are going to do it live. You will see what the problem was and the exact steps I followed to try to work out what was happening. All of this is going to be live, so we will see some code and stuff, and maybe something will break, as tends to happen in live demos. Then we will wrap up with some conclusions and questions.
So first of all, let me introduce myself. I'm Carlos Alonso. I'm a Spanish Londoner, which means I'm Spanish, which you have probably guessed by my accent, and a Londoner because I'm living in London. And from my photo you can see I really like the beach and the sun, not a very common thing in London, if you happen to go there. I got my degree and my master's at Salamanca University. Salamanca is a small city in Spain, very nice, highly recommended if you want to go. Nowadays I'm a software engineer at MyDrive Solutions; I will speak a bit more about the company later. I have been enjoying Cassandra since 2014, which is when I joined my current company, MyDrive, and since two days ago I'm a certified Cassandra developer.
At MyDrive Solutions, what we do for a living is driver profiling. Our product is basically this: we install black boxes in cars and applications on smartphones, and we collect GPS and accelerometer data for as long as the person is driving the car. All of this is time series; I don't know if any of you have heard about this concept in the last couple of days. We use this data to come up with a score on how the person drives, with some advice like "if you continue braking on corners, you're probably going to die", and this kind of stuff, and also a psychological profile. The idea is that this information is given to insurers to adjust the price of your policy if you are a good driver. This is actually objective risk assessment.
It is based on your facts, not on the fact that your car is blue or yellow, or that you are whatever age, or that your car has 1,000 horsepower, or whatever. At MyDrive we have been doing Cassandra since 2012, and the company has recently been acquired by the Generali Group, which maybe you know and maybe you don't. The Generali Group is one of the biggest insurers in Europe, and I think it has very little or no presence at all here in the US. We are also starting negotiations with Progressive, which you guys probably know better than me because it operates here.
If you want to know more, that's the Twitter handle and the website, and, pretty much like everyone, we are hiring. So if you fancy what you have heard, you can go to the website, where we are advertising the roles we have, or you can speak to me later. And that's about it.
Let's introduce the problem a little more. As the title of the presentation says, we are going to troubleshoot performance issues in production as developers, so from the point of view of developers. We are a very small company, we don't have any Cassandra administrators on board, and outsourcing that is extremely expensive. So we are a couple of developers looking after Cassandra. Maybe it's not the best approach, but it is what we have right now.
We have had lots of interesting stuff and a lot of banging our heads against walls. And the good thing about all the performance issues is that they have happened in production, not in the test environment. We do have a test environment, but for some reason it doesn't reproduce the load properly. And if you want to know a little bit more about this particular problem, it happened on a Friday afternoon. I don't know what's wrong with Friday afternoons, but these kinds of things always happen on a Friday afternoon. I guess that is so that you can then go out, get completely drunk and forget about something like this.
So the program we are going to see is an oversimplification of what we have and of what happened, but it will more or less do the trick. It's basically an import script. As I told you, we base our business on people driving: people make trips, and then we run them through an import script to store them in Cassandra as a time series. Pretty simple: it reads the CSV, so the number of queries it generates, as you are going to see, is very normal, nothing strange. So everything should run smoothly, as we expect. This is our setup.
We are going to see two environments: one is the production one and the other one is the test one. Each of them is a three-node cluster. It is not the biggest thing in the world, but so far it is enough for us. And we are using Cassandra 1.2.9, which I suppose is a bit old, but essentially we have limited resources, and now that we are so far behind we are really scared of upgrading. It is something that we will have to do slowly and smoothly, in short steps. But the fact is there; I cannot avoid it.
The end time is a timestamp with the end time of the trip; there is a text column with the end location, another text column with the start location of the trip, and a timestamp with the start time of the trip. For the recently certified developers, I guess that primary key makes sense: all the trips from one driver are co-located on the same node, and furthermore they are sorted by the end time, which makes sense given our use case. The rest is pretty much boilerplate.
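To make the effect of that primary key concrete, here is a minimal Ruby sketch. The table name, the column names and the `driver_id` type are assumptions, since the talk does not show the real schema on the page; the `layout` helper only mimics, in memory, what a partition key plus a clustering column do to the data.

```ruby
# Hypothetical schema reconstructed from the spoken description; the real
# table and column names are not given in the talk.
TRIPS_SCHEMA = <<~CQL
  CREATE TABLE trips (
    driver_id      text,
    end_time       timestamp,
    start_time     timestamp,
    start_location text,
    end_location   text,
    PRIMARY KEY (driver_id, end_time)
  );
CQL

Trip = Struct.new(:driver_id, :end_time, :start_time, :start_location, :end_location)

# Mimics the on-disk layout implied by the primary key: one partition per
# driver_id, and rows inside each partition ordered by end_time.
def layout(trips)
  trips.group_by(&:driver_id)
       .transform_values { |rows| rows.sort_by(&:end_time) }
end

trips = [
  Trip.new("driver-1", Time.utc(2015, 9, 2, 18, 0), Time.utc(2015, 9, 2, 17, 30), "A", "B"),
  Trip.new("driver-2", Time.utc(2015, 9, 1, 12, 0), Time.utc(2015, 9, 1, 11, 45), "C", "D"),
  Trip.new("driver-1", Time.utc(2015, 9, 1, 9, 0),  Time.utc(2015, 9, 1, 8, 40),  "B", "A")
]

layout(trips).each do |driver, rows|
  puts "#{driver}: #{rows.map { |t| t.end_time.strftime('%F %R') }.join(', ')}"
end
```

Reading all the trips of one driver then touches a single partition, already in end-time order, which is the property the speaker is pointing at.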
The code I'm going to show is Ruby, which is a very readable language, and I think it is intuitive enough to be able to read it and follow it. I will give explanations as we go through it. Hopefully no one will get lost on the way. Apart from Ruby, I want to describe a couple of tools that we will use in the code.
One is ruby-prof, which is the standard Ruby profiler. It's an open source tool; you can see the link there to its GitHub repository. The other is Cassanity, which is a gem that John Nunemaker started and of which I then became a core committer. It is a layer built on top of the Ruby driver that DataStax provides, basically to avoid having to write CQL strings all around the place. With this we reduce the complexity, and how easy it is to make a mistake, by using symbols and data structures like arrays and hashes. It is quite interesting, and if you guys are interested you can head to its GitHub repository. And yeah, I think that's enough.
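The idea of describing a query with symbols and hashes instead of hand-written strings can be sketched in a few lines of Ruby. This is an illustration of the concept only: `build_insert` is a hypothetical helper and not Cassanity's actual API, and the table and column names are made up.

```ruby
# Hypothetical helper, NOT Cassanity's API: turns a table name and a hash of
# column => value into a parameterised INSERT plus its bound arguments.
def build_insert(table, data)
  raise ArgumentError, "no columns given" if data.empty?

  columns      = data.keys.map(&:to_s)
  placeholders = Array.new(columns.size, "?")
  cql = "INSERT INTO #{table} (#{columns.join(', ')}) VALUES (#{placeholders.join(', ')})"
  [cql, data.values]
end

cql, args = build_insert(:trips, { driver_id: "driver-1", end_location: "London" })
puts cql  # INSERT INTO trips (driver_id, end_location) VALUES (?, ?)
p args    # ["driver-1", "London"]
```

Because the statement text is generated from the hash keys, a misspelled or missing comma in a hand-written string is no longer possible, which is the kind of mistake the speaker says the gem helps to avoid.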
Let's dive into the code. We have it here; it is readable, I hope. This is a Ruby script. The first part is the requires and includes and some constants here, and then the meat starts here, on line 15. We are using StatsD to monitor all our stuff. I hope everyone monitors everything they can, because this is basically what helped us discover that we were having a problem.
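As a rough illustration of what timing a script and reporting it to StatsD involves, here is a sketch using only Ruby's standard library. The metric name, host and port are assumptions (8125 is the conventional StatsD UDP port), and the talk's script uses a StatsD client gem rather than a raw socket.

```ruby
require "socket"

# Formats a StatsD timing datagram: "<metric>:<milliseconds>|ms".
def statsd_timing(metric, milliseconds)
  "#{metric}:#{milliseconds.round}|ms"
end

# Runs the block and returns [result, elapsed_ms] using a monotonic clock,
# so the measurement is not affected by wall-clock adjustments.
def timed
  started = Process.clock_gettime(Process::CLOCK_MONOTONIC)
  result = yield
  elapsed_ms = (Process.clock_gettime(Process::CLOCK_MONOTONIC) - started) * 1000.0
  [result, elapsed_ms]
end

# Fire-and-forget UDP send. Host and port are assumptions; errors are
# swallowed because a metrics failure should never break the import.
def report(datagram, host: "127.0.0.1", port: 8125)
  socket = UDPSocket.new
  socket.send(datagram, 0, host, port)
rescue SystemCallError
  nil
ensure
  socket&.close
end

_, elapsed_ms = timed { sleep 0.05 } # stand-in for the real import
datagram = statsd_timing("trips.import.duration", elapsed_ms)
puts datagram
report(datagram)
```

A chart of that one timing metric is enough to spot a run that suddenly takes three times longer than usual.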
So here we are pointing it at my localhost StatsD server. I'm doing everything locally because, I don't know about you, but at ninety percent of the conferences I've been to, the internet just doesn't work. I think this is the first one where the internet really works, so a big applause for the people who put it in place. Here I take the timestamp at which we are starting, and we create a Cassanity client with the host we are connecting to and the port. We connect and open the column family we are using. As you can see, we are not using any strings here whatsoever.
Like this one here: five point something, almost six seconds. We are monitoring with OpsCenter as well, a great tool. Here we have around 94 writes per second; this is probably going to go up in the next update. And here, in the request latency, we are below three milliseconds on average and three milliseconds at the 90th percentile, which is totally acceptable, I think. So I think we are good to go. We see this, and we are good to go back to work.
So we are now going to stop our test environment, start up the production environment, and point all the StatsD and DataStax OpsCenter monitoring at the new one. Let's see that everything is working properly, then reload this. No panic yet. There you go. Now you see that we are in the production cluster, you can see it here, and the data will come soon. Good.
Well, that finished, but look at how long it took. That's just one time, so let's run this again, just in case. You know, maybe... five, six, seven, eight... and this is getting complicated. As I told you, it is Friday afternoon, and I can see the boss there looking at me, because he doesn't see this, but he has access to this, and Graphite shows that we have gone up about three times.
Well, OpsCenter shows around 100 per second here, and also the latency here: the mean value is below three milliseconds, which won't add up to the 15 or 16 seconds we were seeing. So I would discard Cassandra as the one causing this problem. So I don't know, I'm a bit lost, and it's Friday.
The boss keeps looking at me, so I'm going to just wrap this code in a profiling block and try to understand where we are losing our time. For this we are using the Ruby profiler, ruby-prof, and, to make things quicker, I have a small snippet here to copy and paste into my script.
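The speaker's snippet uses ruby-prof and is not shown in the transcript. As a much coarser stand-in that needs only the standard library, the sketch below times named phases of a script and reports what percentage of the total each one takes. `PhaseTimer` and the phase names are hypothetical, and the `sleep` calls merely simulate work; this is not a replacement for a real profiler's per-method report.

```ruby
# Collects wall-clock time per named phase. A swapped-in, stdlib-only
# technique for illustration; the talk itself uses ruby-prof.
class PhaseTimer
  def initialize
    @totals = Hash.new(0.0)
  end

  # Times the block and adds the elapsed seconds to the given phase,
  # even if the block raises.
  def measure(phase)
    started = Process.clock_gettime(Process::CLOCK_MONOTONIC)
    yield
  ensure
    @totals[phase] += Process.clock_gettime(Process::CLOCK_MONOTONIC) - started
  end

  # Returns phase => percentage of the total measured time, largest first.
  def percentages
    total = @totals.values.sum
    return {} if total.zero?

    @totals.sort_by { |_, seconds| -seconds }
           .to_h { |phase, seconds| [phase, (100.0 * seconds / total).round(1)] }
  end
end

timer = PhaseTimer.new
timer.measure(:connect) { sleep 0.05 }  # stand-in for connecting to the cluster
timer.measure(:insert)  { sleep 0.005 } # stand-in for the inserts
timer.percentages.each { |phase, pct| puts format("%-8s %5.1f%%", phase, pct) }
```

Splitting the measurement by phase up front is one way to see early whether the time goes into connecting or into the queries themselves.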
Here we are. We scroll to the top, which is normally where the most interesting parts are. So it is spending a hundred percent of the time in the global method, which makes sense because we are in a script. Fifty-six percent of the time is in MonitorMixin#mon_synchronize: not very helpful, I haven't seen such a line in my code. Fifty-five percent is in Cassandra's Future#get, which makes sense because we are running synchronously. Some promises here.
More promises here, MonitorMixin, a thread condition variable wait... not very helpful. I don't know. So, as this script is very simple, and all I'm doing is parsing a CSV, which is far from slow, and then inserting into Cassandra, I'm just going to say that it is Cassandra's fault. So I'm going to build a new, smaller script with just one insert into this Cassandra column family and see what I can get from that.
What a developer instinctively does is just try to isolate the problem; that makes sense, right? But bear in mind that we are now running this against our production cluster. When this happened, we just made a copy of the column family and ran it against that copy, so we were not inserting data into our production column family just for testing purposes. Again, I have this script here, and, to go as bare-metal as possible, I even removed the Cassanity gem I told you about before. Now you see how the code becomes a little bit more dirty, maybe, because I have to write the literal string that I want to execute. So, analyzing the script: the StatsD stuff, and then just create the connection, open the session to the keyspace, and insert; not even a single loop.
This should be very quick because it's just one insert, and it is not being very quick, which at this moment is a big win, because it means that I've been able to reproduce the problem. Otherwise I would have to start looking in other places. So now I'm going to go through the same steps: I'm going to add the profiling block to this one.
So, a hundred percent again in the global method. 99.88 percent in MonitorMixin: very useful. 99 percent in Cassandra's Future#get. Well, this makes sense because we are going synchronous, but under the hood the driver goes asynchronous and then waits synchronously for it. So that makes sense; we are not crazy yet. Some promises, promises.
Look at this line, guys. You see, we are wasting 99.4 percent of our time connecting to the cluster, so it has nothing to do with the inserts, which makes sense because we were inserting 20,000 before, now we are inserting just one, and we are seeing the same problem. So the problem is actually when connecting, not when inserting. And now what?
I can maybe add some debugging here, some logging, because the driver is going to tell me how it is connecting to the cluster. So maybe the logging information can help me fix this problem. Maybe this is also something I should have done at the very beginning of the whole thing.
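A debug logger for this kind of investigation can be built with Ruby's standard `Logger`. The sketch below is an assumption about how one might set it up, not the speaker's code: `build_debug_logger` is a hypothetical helper, and the commented-out driver call reflects my understanding that the DataStax Ruby driver accepts a `logger` option, which should be checked against the driver documentation for the version in use.

```ruby
require "logger"

# Builds a DEBUG-level logger with millisecond timestamps, so that long gaps
# between consecutive lines (for example a connect timeout) stand out.
def build_debug_logger(io = $stderr)
  logger = Logger.new(io)
  logger.level = Logger::DEBUG
  logger.formatter = proc do |severity, time, _progname, message|
    "#{time.strftime('%H:%M:%S.%L')} #{severity} #{message}\n"
  end
  logger
end

logger = build_debug_logger
logger.debug("about to connect") # illustrative message, not real driver output

# Assumption: the driver takes the logger when the cluster object is built,
# along the lines of:
#   cluster = Cassandra.cluster(hosts: hosts, logger: logger)
```

With timestamps down to the millisecond, a ten-second pause between two log lines is easy to spot by eye.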
Okay, let's see what was happening just after that line where things got stuck. Well, it doesn't matter. I now have a very verbose log here, and actually I can see something very interesting, which is that host 93 refused the connection: it could not connect to that host within 10 seconds. Wait a moment, because the 16.6 seconds we saw actually has those 10 seconds in it, so this may be pointing to what's happening. What's happening is that this bad guy here is refusing all the connections I'm trying to make. So why is this? I told you that we were running a three-node cluster, and I have here IPs one, two and three. Why are we connecting to this one? Well, we don't know about that.
Looking at the ring: OK, our ring is evenly distributed, and it has exactly three nodes. So where is that other one coming from? Reading a little bit more of the documentation, I saw that the driver queries the system.peers table for the IPs it is going to connect to. So I'm going to connect to that node and select all from system.peers, and there it is: that IP is the bad guy. But why is this guy sitting here?
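The check the speaker performs by eye, comparing what `system.peers` lists against the nodes that are really in the ring, can be written as a small set difference. The IP addresses below are made up (documentation-range addresses), and the rows are hard-coded; in a real check they would come from `SELECT peer FROM system.peers` on a node and from the ring output.

```ruby
require "set"

# Returns the peers that system.peers still lists but the ring does not know.
def stale_peers(peer_rows, ring_nodes)
  listed = peer_rows.map { |row| row.fetch("peer") }.to_set
  (listed - ring_nodes.to_set).to_a.sort
end

# Hypothetical data. ring_nodes includes the node being queried, which never
# lists itself in its own system.peers table.
ring_nodes = ["192.0.2.1", "192.0.2.2", "192.0.2.3"]
peer_rows  = [
  { "peer" => "192.0.2.2" },
  { "peer" => "192.0.2.3" },
  { "peer" => "192.0.2.93" } # decommissioned node that was never cleaned up
]

p stale_peers(peer_rows, ring_nodes) # ["192.0.2.93"]
```

Any address returned here is a host the driver may try to connect to even though it is no longer part of the cluster.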
Well, after having a look around, I saw that this is actually a bug in Cassandra: the system.peers table not updating after decommissioning nodes. The ticket is for a different release, but it has been reproduced on the one we run as well. So what happened here is that at some point in the past our cluster had four nodes, we decommissioned one, and this bug left it behind.
Yes, actually, that's something I tried. Well, it's three seconds... there you go. Here we are, and our profile should be reporting a good time. There you are: back to what we were expecting. And actually, what made the problem even harder is that, as you know, the driver spreads the load across all of the nodes, and only one node had this problem. So this problem was happening only when that node was the coordinator. So it was even funnier, but I thought it was enough for this presentation to do it like this.
So, before we go, the conclusions. Measure everything, because metrics and monitoring let us know that we were facing an unexplained performance issue. Very important.
This one even more: keep calm and read to the end, because in our first profiling, if I had gone deeper at that time, I would have seen that the big percentage of the time was going into connecting to the cluster, and I could have avoided writing the second script and all this stuff.
So yeah, thanks to the community, and thanks especially to those who have been involved in this process from the very beginning. And that's it. Thank you very much.