From YouTube: Webinar | Oracle to Cassandra Core Concepts Pt. 1
Description
Oracle is a database. So is Cassandra. And that’s about it as far as how they are similar. How they are different though… well, that’s when things get good. Have you tried to scale your Oracle database to handle 1M simultaneous users (without ruining your mental health and close personal relationships)? How about creating 100% uptime with active-active datacenters? You don’t even want to think about it, do you? Spend some time with us to learn how Cassandra can make you into the database rockstar you know you are.
C
A
Yeah, a little bit, you're right. So we're going to talk about Oracle and how you can go from Oracle to Cassandra. This is probably the number one topic at a lot of places I go. I don't know about you, but there's just a lot of Oracle out there, right? They've been around for 30 years now, something like that, and it's an established database for people.
A
B
A
For 15 years, yes, pretty actively. I made a lot of money doing it, I'm pretty happy to say, and I can't say anything bad about it. I mean, from a usability standpoint it gets pretty complex, but for the workloads that I had given it, it was a solid choice. You know, there were a lot of choices out there. But why don't we just jump into why we make these choices? How about that? Sounds good.
A
A
Love our data. Yeah, look at my phone, it's full of it. I have to keep buying a bigger one. But the data problem, as we call it: we love to collect. Data goes back to ancient Egypt, and this is kind of setting the Wayback Machine. It was way before any databases, but we collected data. We collected data about crop yields and we collected data about financial transactions. This is a ledger.
A
It's pretty bad. And, of course, we collected things into tables. Train tables have been around for as long as trains, probably, and that was the tabular format, the columns and rows. It's interesting how we came upon that beforehand. Collecting data has always been a problem with humans. We are collectors, and we really started hitting this first data plateau. Now Rachel, I'm going to go through these data plateaus as a concept, but the plateau that I'm talking about is where we started to limit ourselves out. Now, do you know where this is?
A
I know, it's like, well, just write on the deer tongue. And this was an interesting place, because this is the only place you could get a reservation for an airline in the 1950s. Now, in the 1950s in the United States there was a lot of extra income, people started flying, and they were taking thousands and thousands of phone calls. Do you see, in the foreground, there's a guy here with a little oxygen delivered.
A
B
A
C
A
Sabre, right. And Sabre was set up by IBM. It is still here today, folks: if you book an airline, you're probably going through Sabre. And it was amazing, because what they were able to do from the airport or from a central call office was collect information from people for a flight, and people got on the plane and they had a seat, and it was all very organized and centralized. Really, that was the first solution, and it was awesome, and it started this whole industry.
A
B
A
A
Probably many people on the phone have. So IBM was the one who made the big database, and if you needed a database, you called Big Blue. Now, there were competitors, but let's face it, IBM had that old adage: no one ever got fired for buying IBM. And so that was the 1960s and 1970s. For 20 years they dominated the database industry, probably even 30 years, but things do change. We hit this data plateau again, and here we are. This is actually a picture from 1977, I believe, and IT workers of the world unite.
A
You did have to wear a white shirt with a black tie, that was like the rule, and your room-size computer is ready. And the plateau was that not everyone could afford a room-size computer. And that's okay, that's really interesting, so we have a new solution. Now we have this microcomputer revolution that's kicking up. Here come all these new players.
A
Okay, one, though, that stands out above the rest is Oracle. They built a relational database that works on a variety of systems and more or less supplanted what IBM was doing, pretty easily. This is where I come along. I started using Oracle in the 1990s, because I ended up working with a database that was relational, because I learned that in university. So it dominated from the '80s. Well, I couldn't say the '80s, so the '90s is when it really started dominating, and I...
C
A
Of course, this happened, right: the world got all Internet-y. But anyway, I mean, the Internet became a thing and the whole world became data connected, and when you look at the scaling problems that could bring, it dwarfed anything you could pull on a microcomputer. But we have this problem, right? I mean, what was your impression of the Internet at certain times?
B
Well, besides... yeah, we sat and waited a lot. Is it done yet? Is it downloaded yet? I mean, I remember, like, going off and, you know...
A
Exactly, and so we have this problem, right, which is we put too many people on there and we get this website-is-slow problem. And it wasn't just because it was a dial-up modem. It was because we were building bigger and bigger databases, but the database was the problem. This was one I dealt with a lot in the late '90s, when I worked at dot-coms and did a lot of consulting. If you were trying to make money being an Oracle consultant, my first answer was: buy a bigger machine. And I love this one.
A
This is a Sun Enterprise 4500, I think. The 450, loved that, it had wheels. Portable, awesome. But you know, that was the thing: you got money from a VC and you bought a bigger machine. Awesome. Then 2005, that is really, I put that in my mind as a moment when we had this problem, the thundering herds, and it was just because everybody and their brother was getting on the Internet and you never knew when things were going to come.
A
I worked in education, and the thing that we had to deal with was last-minute deadlines, and, of course, everybody showed up at the last second, right? And what happens when we have thundering herds? This problem, right? And this is getting pretty close to where we are now. Now, this is a big moment. Everyone knows where they were when this happened, right? You remember where you were, Rachel?
A
A
B
A
A
You shouldn't be on this webinar at all, but you came around, right? You came around. Yeah, exactly. So this iPhone thing: well, now every phone out there has a screen that looks somewhat like that, with an app on it that probably has a computer listening to it somewhere, and without a doubt, zero chance...
A
You will not have a database involved. And that has created a problem, because now everyone expects it to be online all the time, and you can have the next cool app that just takes off, and that thundering herd is going to stampede you. And your competitors: you go on the iPhone App Store or you go on Google Play and you look for an app, and you can find three or four competitors instantly. That means, to be relevant, you have to have something that works. So slow is as good as dead, right?
A
So this is what we did, right? Because it was like, we always used Oracle. I used Oracle, you used Oracle, and you would use Oracle.
D
A
B
A
C
A
The choice at that particular time was: okay, you want to be safe, go with Oracle. And that went for everything that connected to the Internet and needed to talk to something. And look at all these applications: we have mobile applications, we have web applications, we have gaming, and then telemetry and stock markets. None of these really fit well anymore because of the problem that's inside, which is that it's a single-server relational database. Now, we can build it out and do more with it. But is that really the case? Well...
B
A
A
Nobody believed her. So look, look at this potential. I will propose to you, the listening audience, a third database solution, and that is Apache Cassandra. It is born and bred around this new problem set. It was built to tackle this issue that we created for ourselves, which is: hey, let's just put a billion people on a particular application and make them happy.
C
A
You know, the thing is, though, it wasn't as if Cassandra just came out of nothing in 2008. I mean, that was when it was conceived, but there was a lot of thinking before then. It goes back to the Dynamo paper, which was 2007, and the Google Bigtable paper, which was 2006, and those things had a life inside of Amazon and Google before then. So these problems were being solved with computer science.
A
Those companies that I highlighted earlier, Google and Amazon, there they are, looking down the barrel of the gun of a billion users, and they are solving those problems quickly with computer science. And just like a centrifuge, those things are spinning out and flying out and finding their way into open-source projects.
A
A
A
But that's an option, and you have to manage, probably, you know, 32 CPUs in a box. You have to manage that. OK, so now I have all the CPU and memory, and I'm going to add more disks, which probably means that I'm going to run out of slots on the server, so I'm going to put in a SAN. Awesome. But if I'm running a SAN, I'm probably going to use ASM, because I'm going to manage that storage properly, and managing storage on an Oracle server is really a dark art, and a good way to make money too.
A
A
For caching, well, Coherence is another product that Oracle sells, but it is not integrated as much, and it is a key-value store. Some people compare it to, or use it as a replacement for, memcached. TimesTen is more integrated, so that's going to give you the speed you want. And when you start getting into real scale problems, you're going to need to add more servers, more of that, and there's only a certain size you can get.
A
D
A
Like, for those of you in data centers, there's the little lifter thing, because humans can't lift these things up and not lose a finger. So you have to put this little forklift thing underneath this huge server and lift it up into the rack, and it's a huge operation. At that point you're kind of laughing to yourself: maybe that room-size computer wasn't a bad idea, because at least it was in the room. So what was the other way?
A
Or do this by adding more servers instead of the bigger server, because you run out of space, which means you're probably going to use a SAN. Actually, you will use a SAN, and if you're using a SAN and multiple servers with Oracle, you're going to be in RAC territory. That's Real Application Clusters, and this is about the most viable way of running several servers. Now, you can do a sharded architecture as well, but RAC is how you get more fluid failover and things like that.
A
But to use RAC you're going to have to use Clusterware, Fast Application Notification in case things go bad, and definitely Cache Fusion, so that your data is up to date and working well and fast, and everything else. Pretty complex. I can tell you right now, I've never seen a RAC system work flawlessly or perfectly out of the box. I've always had interesting failover issues. But I think that's definitely a way to scale. Interesting. But...
A
Well, anybody can buy one if you have a huge budget. The problem is probably the budgeting constraints: you have to pay for licenses, and you get a little SAN out of the deal, and that's going to cost you a lot of money. Plus, well, the thing that keeps it affordable: you can only go up to a hundred nodes with RAC. After that, forget about it.
A
A
Don't do that. So the little step down from that is using Data Guard. Data Guard is a way to manage the transactions going to your secondary databases, so you can have failover, which is nice in those cases. But with the failure, there is a mean time between failures. You can go with active or standby. Active costs more money, which means you have to pay for reading that data. Now, all of this data is now protected and scaling and everything, up to a certain point. What if you want to analyze it?
B
D
B
D
A
I know you're good at that. Sure, you can use any of those other data warehousing technologies, but you're going to be doing ETL. There's nothing in place.
B
OK, like, what about doing it in line? In a business, you've got all your data in that database. Shouldn't you just be doing the analysis there? I mean, I hate it. So, you know, just to be completely upfront: I spent years of my life writing ETL jobs to move data from source systems to data marts and data warehouses for...
A
B
A
D
B
Can you go back a slide? I want to point out something that you may notice: Patrick put some nice little boxes around stuff and put in some words like uptime and scale and stuff like that. You know, we were trying to group stuff together to give you an idea of why you're going to be doing certain things. I think you can go ahead to the next slide. We...
B
...can simplify everything back down to a single server again, and now we're going to talk about a single server. So the whole point of this webinar is to introduce people who are familiar with Oracle to the concepts behind Cassandra. So what is similar in Cassandra? What's different? Starting with a single server: Cassandra was designed from the get-go to be distributed, but you can run Cassandra on a single server. Most of us engineers here at DataStax have it running on our laptops.
A
B
And how big does a single server have to be? Are we talking about more CPU, more, you know, DL380s or Superdomes or whatever? No, we're talking about commodity hardware, because Cassandra was built to be scaled out, to provide continuous uptime and scale, but by using hardware that wasn't going to break the bank, or by using clouds.
D
B
B
One of those nodes is, you know, a minimum of about eight CPUs, 32 gigs of RAM, about a terabyte of spinning disk, maybe, you know, up to terabytes of SSD. So we're not talking about massive machines. Of course you can put in more, but for the most part that's a pretty typical Cassandra node. So, as we were saying, Cassandra is designed from the get-go to be distributed. So now your database is distributed around a ring, and all those nodes in there are peers. There's no master node, there's no slave node.
B
B
B
D
B
Each node is responsible for a certain range or ranges of data. Next slide. And if you want to add more nodes to this cluster, so if you need to add capacity, if you need to scale, you don't have to just scale your individual servers, though you can. You can also just easily add more servers to your...
D
D
B
Those nodes are going to get pieces of data from the other nodes. Next. And the cluster will automatically reconfigure itself so that, instead of each node being in charge of ten tokens, it's going to be in charge of eight tokens, and all those nodes are going to contribute data to the new ones to bring them online. So now...
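The "ten tokens down to eight" arithmetic above can be sketched in a few lines of Python. This is only a toy illustration, not Cassandra's actual token allocation: the four-node starting cluster, the node names and the `add_node` helper are all assumptions made for the example.

```python
def add_node(ownership, new_node):
    """Return a new token map in which new_node takes an even share.

    ownership maps a node name to the list of tokens it is in charge of.
    Hypothetical helper: assumes the token count divides evenly among
    the enlarged cluster.
    """
    total = sum(len(tokens) for tokens in ownership.values())
    target = total // (len(ownership) + 1)
    rebalanced, handed_over = {}, []
    for node, tokens in ownership.items():
        rebalanced[node] = tokens[:target]    # tokens the node keeps
        handed_over.extend(tokens[target:])   # tokens streamed to the newcomer
    rebalanced[new_node] = handed_over
    return rebalanced


# Assumed starting point: four nodes in charge of ten tokens each.
before = {f"node{i}": list(range(i * 10, (i + 1) * 10)) for i in range(4)}
after = add_node(before, "node4")
print({node: len(tokens) for node, tokens in after.items()})
```

Every existing node gives up a couple of tokens, so the new node is fed by all of its peers rather than by a single donor.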
A
D
B
D
B
We have to buy SANs because SANs are best for databases, and then you hit your head on the table for a while, and you try to explain IOPS and you explain physics, and they're like, we're sorry, this is all that we can buy. So if you're in that situation, don't despair, we can help you. There is a way out of the SAN nightmare. Yeah.
A
B
B
Just keep in mind that Cassandra is designed to be always on, never down. So as you add new nodes to the system, the streaming of data from the other nodes is actually a background process. Everything is a background process in Cassandra, because it's designed to never, ever, ever have to be brought down. So you add a new node to the system, you upgrade your system.
B
D
D
B
Okay, so the application comes down, and the application is going to write to the Cassandra ring. Now, as I mentioned earlier, all these nodes are peers, so any one of these nodes can act as the coordinator. It doesn't have to be just the one at the top. There's a driver that sits on the application that has a number of different policies for how you round-robin or retry any of those particular nodes. So...
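As a rough sketch of what a round-robin policy with retries means on the client side, here is a toy version in Python. It is not the real driver API: `ToyRoundRobin`, `execute`, the node names and the `send` callback are hypothetical names invented for this illustration.

```python
from itertools import cycle


class ToyRoundRobin:
    """Hand out peers in turn; any of them may act as coordinator."""

    def __init__(self, nodes):
        self._cycle = cycle(list(nodes))

    def next_coordinator(self):
        return next(self._cycle)


def execute(policy, send, max_attempts=3):
    """Try the request on successive coordinators until one answers."""
    last_error = None
    for _ in range(max_attempts):
        node = policy.next_coordinator()
        try:
            return send(node)
        except ConnectionError as exc:  # node unreachable: move to the next peer
            last_error = exc
    raise last_error


down = {"node1"}  # pretend this peer is offline


def send(node):
    if node in down:
        raise ConnectionError(f"{node} is unreachable")
    return f"ok via {node}"


policy = ToyRoundRobin(["node1", "node2", "node3"])
result = execute(policy, send)
print(result)
```

Because every node is a peer, the retry simply moves on to the next one; there is no special master to fail over to.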
C
B
Just using an example of one here, but keep in mind that any of these nodes can and will be the coordinator. So the application writes a row of data. The primary key of that data, the partition key, is hashed and assigned a token value. So go ahead and go to the next slide. That token value can actually be written in any number of places, and you, as a database administrator, decide how many copies of that data you want around the ring. The most common number is three.
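The hash-to-token step and the choice of three replicas can be sketched as follows. This is a simplified model under stated assumptions: MD5 stands in for the partitioner's hash, each node owns a single token, and replicas are simply the next nodes clockwise on the ring. `token_for`, `replicas_for` and the six-node ring are hypothetical.

```python
import hashlib
from bisect import bisect_left

RING_SIZE = 2**32  # toy token space, chosen for the example


def token_for(partition_key):
    """Hash a partition key to a token (MD5 is a stand-in hash here)."""
    digest = hashlib.md5(partition_key.encode("utf-8")).digest()
    return int.from_bytes(digest[:4], "big")


def replicas_for(partition_key, ring, replication_factor=3):
    """Pick the owning node plus the next nodes clockwise around the ring.

    ring is a list of (token, node) pairs sorted by token.
    """
    tokens = [token for token, _ in ring]
    # First node whose token is >= the key's token, wrapping past the end.
    start = bisect_left(tokens, token_for(partition_key)) % len(ring)
    return [ring[(start + k) % len(ring)][1] for k in range(replication_factor)]


# Six peers spaced evenly around the ring (an assumption for the sketch).
ring = [(i * RING_SIZE // 6, f"node{i}") for i in range(6)]
owners = replicas_for("customer:42", ring)
print(owners)
```

The same key always hashes to the same token, so any coordinator can work out which nodes hold the copies without asking a central directory.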
B
Mostly because that way, if one of your nodes goes down, you can fix whatever was wrong with the first one, have something happen to a second one, and still have a third copy available to serve requests. So I like three. You know, there's also a Monty Python joke somewhere in there, but three's a good number. Yes.
A
And so I think, really, because it's been proven time and time again, and there's some math behind this, a replication factor of three is a good trade-off for space, because these are going to be replicated three times, and also for uptime. Another thing is how you can ensure your data is consistent at quorum: because you have three nodes, you're going to need 51%, and that means two nodes online. So that's some of the reasoning behind it.
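The "51% of three means two nodes" arithmetic is just a majority calculation. A minimal sketch, with `quorum` and `tolerated_failures` as hypothetical helper names:

```python
def quorum(replication_factor):
    """Smallest number of replicas that forms a strict majority."""
    return replication_factor // 2 + 1


def tolerated_failures(replication_factor):
    """Replicas that can be offline while a quorum is still reachable."""
    return replication_factor - quorum(replication_factor)


for rf in (1, 2, 3, 5):
    print(rf, quorum(rf), tolerated_failures(rf))
```

With a replication factor of three, quorum is two, so quorum reads and writes keep working with one replica down. A replication factor of two gives no such slack, which is part of why three is the usual starting point.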
D
C
B
So now that the data is distributed to three nodes, we can lose two of the nodes and still maintain uptime. But what happens if our data center goes down? And data centers do go down. There are chupacabras being let loose every day in all data centers, it's a scourge on society, or there are natural disasters, or, you know, actually real reasons that a data center might go down. Well...
C
C
B
A
No, no, I mean, you know those. So Netflix is a great user. They talk a lot about these two concepts. They run active-active, of course, because they're on Amazon a hundred percent. They know that there's going to be downtime, so they run active-active, and they have a great webinar or discussion on that.
A
You can go look it up. But they also talked a lot about the individual nodes, right, because when the AWS reboot happened, thank you, Amazon, they didn't have any downtime, because they were configured correctly and they had a good replication factor and a good consistency level. As nodes were blinking out, I think they lost like 300 nodes in that reboot process, and they didn't have a second of downtime, because they were ready and the database was built to withstand that. Naturally.
B
D
A
Exactly, not a good thing. So I think that's why Netflix is a good example of those who hit that next data plateau, right? They knew they couldn't get where they needed to go. They are increasing stock value for their shareholders, and they're also making their users happier by keeping things online. They're like the cable company now: they have to be on all the time. They're better than the cable company.
B
C
B
D
B
B
Only because these are slides and I want them to look pretty do we have the same number of nodes in both data centers and the same replication factor. That's because symmetry looks nice, but symmetry isn't real life. These data centers actually do not need to be symmetrical. They do not need to have the same number of nodes, nor the same replication factor. They also don't need to be in the same place. They don't need to be in actual physical data centers: one could be in the cloud and one could be on-premise.
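To make the asymmetry concrete, here is a small sketch of a per-data-center replication map. The data center names and numbers are made up for the example, and `local_quorum` is a hypothetical helper that applies the majority rule within one data center (the idea behind Cassandra's LOCAL_QUORUM consistency level, which the talk itself does not name).

```python
# Hypothetical layout: data center name -> (node count, replication factor).
# Neither the node counts nor the replication factors need to match.
datacenters = {
    "on_prem": (12, 3),
    "cloud": (6, 2),
}


def total_copies(layout):
    """Copies of each row kept across every data center."""
    return sum(rf for _, rf in layout.values())


def local_quorum(layout, name):
    """Majority of the replicas inside a single data center."""
    _, rf = layout[name]
    return rf // 2 + 1


print(total_copies(datacenters))
print({name: local_quorum(datacenters, name) for name in datacenters})
```

Each data center keeps its own number of copies, so a smaller cloud footprint can sit alongside a larger on-premise ring.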
B
A
It's a typical arrangement. I've seen plenty of on-prem setups that have another data center in a cloud environment because they ran out of room, and so this will enable that for sure. We had a discussion, who was it, about how they moved from on-prem to the cloud and then back to on-prem without any downtime, based on this replication strategy. So there are some interesting things you can do with this, and...
D
B
B
Uptime and scale, yeah. Cassandra, again, is designed to be always available, always on: multiple data centers, with various copies of the data within each data center. But we also talked about caching, so this thing also does have to be fast, because we talked about down being down, but slow is also down. So remember those thundering herds back in the '90s, you know, your website being slow? That's not going to cut it anymore either. If you're on an app and you can't get to where you want, it's so easy to go...
B
...to other apps. You can probably download another app in that time. So speed is very, very important as well. But if you put in a caching layer, again, you are introducing more complexity, you're introducing another single point of failure, and that's not what we're looking for. So Cassandra is designed to do incredibly fast transactional read and write work.
B
So, first off, we have the application. It talks to the coordinator node, we saw that earlier, and it sends out a write request. That write request is going to hit a commit log that lives on disk. This commit log provides durability, and it is also append-only and written sequentially. So you've got one disk that sits there and writes sequentially and always appends, and then when the file is filled up, it starts another one. Very, very simple. As soon as it hits the commit log, it acknowledges back to the coordinator and the write is good.
B
At the same time, it's also going to write to memory, to what's called a memtable. Once the data is in the memtable, it is now queryable, and anybody coming from another node or another application process can read the data. As for memory, there's only so much of it, so eventually those memtables flush to disk, into something called sorted string tables, or SSTables.
B
Those SSTables are also immutable, and the data in there is sorted and written sequentially. The memtable memory is cleared and the data lives on disk. Those disks can be SSDs or they can be spinning disks. Cassandra was actually designed to provide fast reads and writes on spinning disks, but it also works very, very well on SSDs, of course. Eventually those SSTables are merged together in a process called compaction, and compaction is a way of cleaning up.
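The write path described above (commit log, memtable, flush to SSTables, compaction) can be sketched as a toy model in Python. Everything here is an illustrative assumption: `ToyNode` keeps its "disk" structures in memory, flushes after a fixed number of keys, and ignores timestamps, deletes and everything else a real node handles.

```python
class ToyNode:
    """Toy model of one node's write path; not Cassandra's implementation."""

    def __init__(self, memtable_limit=2):
        self.commit_log = []   # stand-in for the append-only log on disk
        self.memtable = {}     # in-memory table, queryable right away
        self.sstables = []     # immutable, sorted tuples of (key, value)
        self.memtable_limit = memtable_limit

    def write(self, key, value):
        self.commit_log.append((key, value))  # durable, sequential append first
        self.memtable[key] = value
        if len(self.memtable) >= self.memtable_limit:
            self.flush()
        return "ack"

    def flush(self):
        # Write the memtable out sorted, then clear memory and the log
        # entries that the new SSTable now covers.
        self.sstables.append(tuple(sorted(self.memtable.items())))
        self.memtable = {}
        self.commit_log.clear()

    def read(self, key):
        if key in self.memtable:
            return self.memtable[key]
        for table in reversed(self.sstables):  # newest SSTable first
            for k, v in table:
                if k == key:
                    return v
        return None

    def compact(self):
        merged = {}
        for table in self.sstables:  # oldest to newest, so newer values win
            merged.update(table)
        self.sstables = [tuple(sorted(merged.items()))]


node = ToyNode(memtable_limit=2)
for key, value in [("b", 1), ("a", 2), ("a", 3), ("c", 4)]:
    node.write(key, value)
print(len(node.sstables), node.read("a"))
node.compact()
print(node.sstables)
```

Writes only ever append or create new files; the old value of a key is not updated in place but is left behind until compaction merges the tables.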
B
A
So I really feel like this is a critical thing. This is not an in-memory database, even though it does go into a memtable. I hear people ask, "Is this an in-memory database?" No, because it writes to a commit log, and that is a durable write, so it is on disk. And then once it's in an SSTable, it no longer needs to be in the commit log. So we're still using disk a hundred percent, across the board.
A
A
A
D
B
A
We yanked it, but I think it's just as relevant. I mean, it has to do with getting data, user security, keeping your data safe, that kind of thing. I mean, these are all basic deals. There's nothing complicated in here. It's not like how to set up a RAC cluster or anything like that. Really, you could almost plug in any database, MySQL, anything. But what I think would be interesting...
B
Luckily, we have tools for this. We have tools to make things easier for people who are Oracle or MySQL or SQL Server DBAs to get used to how things work in Cassandra. So what's currently being highlighted here in pink or red, or whatever color shows up on your screen, are tasks that are handled by OpsCenter. OpsCenter is available for download, and you can use it against Apache Cassandra or DataStax Enterprise.
B
So here's, like, a best practice service, which will actually go through your system and give you some ideas on whether or not your replication is set up correctly, or your performance is good, or your security is set up right, or your backups. So, next one. This will take a look at your ring. Right now we have a Cassandra cluster, also with Solr, our search, and analytics, and we're seeing what is green, what is red and what is yellow.
B
So all those tell you the health of your individual nodes, and then we also just have the basic dashboard view: your cluster health, your utilization, your load. I mean, pretty much, this is completely customizable and will give you all the stats of your system, just like you're used to with Enterprise Manager. So...
A
Coming from the world of Oracle and using Enterprise Manager, I think this is a pretty similar tool, where it gives me an all-in-one look and lets me monitor things. Now, I say this a lot: you put a thousand nodes in a cluster, you'd better have something that can watch those. It's not something you want to spin on your own.
A
A
B
C
B
This is a little doc slide just to show you, hey, yep, there's documentation, just like OTN, all available on datastax.com, but also that all backups can be managed through OpsCenter or the command line, whichever one you prefer. Something that people talk a lot about with Cassandra is, well, if you have all the replication, do you need backups? And of course you do. Because, I mean, your data's not going to go away, but there might be a mistake in your application.
B
A
D
A
B
A
It's the last thing you want to think about, or the last thing anyone thinks about when they're building something. You're checking boxes at the end, and it really should be integrated stuff. Okay, so we're back to my duties here. So we checked off a few boxes here, that's cool, got all that. What more do we need to look at? Now, this seems more in line with what maybe a developer would want to do, or an architect: designing a database, creating a database, managing objects. Managing objects, there's a DBA role thing, and even looking at new features.
A
D
C
B
B
It will give you the ability to write queries, manage your updates and your inserts, or create tables. And just a quick look, for those who are not already familiar with Cassandra, at how you actually interact with it. This looks vaguely familiar, doesn't it? It's not quite SQL, but it's called CQL, and it's designed specifically so people who know SQL can interact well with Cassandra.
A
B
A
So the last thing was, of course, and I totally blew it, the ETL situation that we landed on last time. What got to me was, I just don't like ETL at all, because it's costly and it's not efficient, and for a lot of reasons it can also lose data. It's just something that I would rather not do. So what do we get with Cassandra in this case? Yeah.
B
C
A
C
A
That one is, unfortunately, a little less integrated, but I think it's fine for what it is. It's bring your own Hadoop: you can have your own Hadoop cluster, and we provide connectors so that it will grab data and pull and push it to the Hadoop cluster from the Cassandra cluster. So...
B
The idea here is that you don't have ETL. The data will be automatically replicated in real time to these clusters. So you can integrate search into your application. You can put a BI tool, Tableau or anything out there, on it with an ODBC or JDBC connection, to allow your users to do ad hoc queries without interrupting your OLTP transaction processing. Right.
A
And so I think we've seen some really interesting applications, and this is what it comes down to. So let's kind of wrap this up into a neat ball, and let's take this back to the plateau that we're dealing with. Cassandra in itself is a database that can respond to what we've created, right, which is mobile apps and IoT and web. Those are all creating a lot more demand and scaling problems that we haven't yet seen.
A
That's what it comes down to. And if you look at the users that are using Cassandra, they're making money. This is a part of their bottom line, and they are relying on Cassandra to make sure that it's up, online, ready to rock while their users are around with their wallets out, ready to go. And, more importantly, I think it just gives everyone a better experience, because, let's face it, slow is as good as down anymore.
C
B
But there is a catch. There are some things that you need to change about the way you think in order to get the most out of the system, and that is going to be the topic of our next two webinars. Next week's topic is going to be data modeling: how do you data model for Cassandra? And the following week we're going to talk about how you change your development methodology.
B
A
Super excited about part 2. Now, I've been doing data modeling talks for three or four years now. Actually, four years, I checked. And the concept of data modeling from relational to Cassandra is getting more baked, I think, if that's a word. I think this is where you're probably going to hear some of the same things again, but hopefully some new things, and we'll highlight some of the newer features that are in Cassandra and probably what's coming in 3.0.
A
But we need you to understand that this isn't going to kill you. Now, if you're going to watch one of these webinars, watch that one. Well, you already watched this one, so, if you like. But number two is going to be really critical, especially for developers, because understanding your data model is the first thing you need to get whenever you're building a successful application. All right. Well, we want to see you.
A
We want to see you up close and personal, and that's going to be at the Cassandra Summit 2015 in Santa Clara, two days of fun and excitement. We are also going to be doing a certification with O'Reilly Media, so this is a big, big deal. So there's a certification test. We have training, and you can sign up for that. That is the paid-for part of this.
A
Now, it is free to go to, and if you want to get a priority pass, that basically guarantees that you can get into certain sessions, because I'm going to tell you, there are going to be thousands of people here. It will be a big event, so getting that priority pass is pretty important if you want to guarantee spots, because there are going to be some really hot talks, and it gives you that guarantee. So Rachel and I both have priority passes, so you can pick a winner there. It doesn't matter, you're the winner.
A
A
I mean, do all that now if you can, because you don't want to wait. Last year people waited to the last second, and it was really tough, because a lot of people didn't get to go, because it was just full, and we don't want to have that. We're also doing it online; a lot of it is going to be online. So if you can't make it for some reason, you won't miss out. And of course all of the talks will be videoed, and we will have all of that available on our YouTube channel as well.
A
So if you can't make it, don't despair. You will eventually be able to see some of the talks, actually almost all of them. But really, the important thing is that when you're there you get to talk to people, and I think this is the most important thing: talking to people, relating experiences, finding out how they're doing it. It's really something. So I think that is it. Oh, we're going to have to take some questions, I think. Here, let's see, who has the Q&A? That's the question. The first question I have...
A
I actually saw a comment just recently that Exadata is shared-nothing, and that is true. I have run Exadata in production, and I should be very clear: yes, that is a shared-nothing architecture, although it is a very specific architecture. It's built in; you buy a box and it's huge. If you want to do multi-data-center, then you have to use GoldenGate on that. But I will make that clarification. Okay, here's...
A
For that one: every node is the same. So if you're running in multi-data-center with analytics and search, those nodes in that data center will all be the same, where they're running Cassandra and whatever extra. So a search node would be running Solr on top of Cassandra, and analytics gets Cassandra and Spark. But if you're running Cassandra only, then each Cassandra node is independent.
A
There's no difference between those other than what data they're primarily responsible for, but if you make a request to any node, it will find your data. So there are no specific nodes that are set aside for just queries. I know that this is a very common pattern with a lot of databases, like, well, here are the query nodes, here are the data nodes. There's none of that; there's no master/slave architecture or anything like that. It's all peer to peer. So the short answer is no.
B
I can go ahead and take that. So they're very, very different beasts. Hadoop is designed to do collection: take tons and tons and tons of machines, throw a ton of data at it, and, you know, do some plodding through it with MapReduce in order to figure out answers to, you know, deep questions. It is much more of an OLAP system, you know, designed to do data lakes and basically process the data of the universe.
B
That's what it was designed to do, just to hack through the Internet. Cassandra, on the other hand, is an OLTP system. It is designed for high-speed reads and writes: little tiny bits of data, short requests, data coming in and out at all times. They work on completely different file systems. They are completely separate projects, but they run in conjunction with each other. So you can, you know, take your Cassandra data and move it into Hadoop.
A
A
I kind of answered that, but let me be very clear. When you're running DataStax Enterprise, you run them separately. So whenever you run Cassandra only, that will be in one data center; when you run Spark and Cassandra together, they are in another data center; and Solr is in a different data center. This is for workload isolation and also, you know, just for task isolation.
A
We want to make sure that these nodes are there for a specific reason. Using data center segregation gives you some options, and so that's why it's done that way. You know, Spark is going to use so much CPU, so much memory, etc. That is going to be a different type of node, potentially, so you'll want to make sure that those nodes are there. Remember, what we keep taking advantage of here is the basic part of Cassandra, which is replication.
A
It will replicate your data no matter what, and so that's what makes this work. We just take advantage of that basic fact. So I think that's it for now. We will be collecting these questions as well, so if we didn't get to your question, we'll try to follow up with a blog post and do the Q&A there as well. Just keep your eye out for an email when it gets posted. We'll try to post the Q&A blog post as well.
A
Just so we make sure we get to all of your questions. I know you have a lot. Of course, hit us up on Twitter. I see a few people already have on my Twitter account right now, great. We love to hear from people. And if you see a Cassandra Day coming to a town near you, our next one's next month in New York, just come on by. We have a lot to talk about there. You can ask your questions there as well. Lots of experts.