From YouTube: Data consistency across cloud native systems
Why is it important? Then I'll walk you through an example, and then we're going to look at a couple of different systems that are ubiquitous components in cloud native architectures. We'll talk about some aspects of them that are interesting, some that may be surprising, the history behind the ideas they embody, and what they've contributed to the overall landscape of architecting software, with an eye towards the consistency of the data being used.

Jumping into that: I am Jimmy Zelinskie. I am the co-founder of a company called AuthZed. We are the creators of a database system called SpiceDB. SpiceDB is a data store specifically designed for storing and computing authorization data, so it is basically the core engine that your business would use to compute whether a person has access to perform an action or not. I like to use the term permission system rather than authorization system; I think it's far more approachable and defines the problem much better.
You can see how consistency is a core part of this: fundamentally, permission systems for software have to be correct, or else there is a security flaw. If some software allows someone to perform an action that they otherwise shouldn't, that is absolutely mission critical in most software. And this work has to happen at scale and at low latency, because absolutely every action that takes place in your software system has to check whether access is allowed to perform that action.

That puts us in the critical path, so data consistency is super important for SpiceDB. Before this, I was working in the cloud native space. I worked at a company called Red Hat by way of the CoreOS acquisition, so I've been working in this space since before the CNCF was even created; CoreOS was building distributed systems and container technologies.
That is basically the foundation of cloud native software, since before the CNCF and this whole Kubernetes ecosystem emerged, and in that time I have contributed to a bunch of cloud native projects.

I've co-created some. I am also a maintainer of OCI, which is the standards body for container images. This all folds back into my passion for distributed systems. Even before working at CoreOS I always had an eye towards distributed systems, as an early adopter of a project called etcd, which ultimately became the data store used by Kubernetes. I'm going to talk a bit about etcd later in this talk. As part of building large-scale SaaS systems on cloud native software, I have also run MySQL and Postgres, these types of relational databases, at scale. I've seen where they fall over; I know the sharp edges.
When you build enterprise software, for example, you try to do things without introducing new dependencies on other systems, so your customers don't have to set up yet another software dependency, and you start to bend a lot of these products to your will, in ways they should not bend. You end up trying to get database properties out of databases that were never designed to do certain things.
I also left my contact information on this slide, so if at any point you want to reach out to me, feel free to just shoot me an email with a question. On the final slide I'm also going to link to a Discord community that you can join to discuss distributed systems in general, or data consistency, but I prefer email.

You might also see me around on Twitter or GitHub under these handles. But enough about me; it's time to move on to the actual primary subject: you may have seen these terms thrown around in your software development career.
Fundamentally, these concepts are not unique to databases. So many software systems all store data; eventually they punt it off to a database that is then responsible for maintaining it, but the systems themselves are still offering views of data, and potentially modifying data, before they pass it all around. So whether we're talking about databases or microservices, the consistency of the data you're working with is always going to be relevant.
It's actually really interesting: ACID, one of the super popular acronyms thrown around in this space, stands for atomicity, consistency, isolation, and durability, and that acronym has a story around it. Some folks believe the C was just made up to make the acronym work. That C is consistency, which is the topic of this whole presentation. So I hope by the end of it I can explain, at least somewhat formally, how I think about consistency, and why, even if that was the initial intent, it is definitely not the case that consistency is a made-up concept invented just to make an acronym work; you'll find it used in many other places and discussions outside of just the term ACID.
So it clearly has relevance on its own. I'm going to use some of these terms here, in different contexts, so that their definitions become more clear, rather than just trying to describe them abstractly in a vacuum.
I've talked a lot around these things, but I still haven't covered the very basics yet: what actually is consistency? I didn't use the Wikipedia definition that you'd find by Googling; instead I defined it the way I like to think about it, and the way I feel most engineers colloquially use it. I think that's really important, because you can go look up a lot of this terminology and read a very dense article, or read research papers that talk about these concepts, but that doesn't matter if you're just trying to communicate something to your fellow engineer. What actually matters is that you have an effective communication tool and you both have a shared understanding of the topic.
So I tried to define it in my own words, rather than the mathematical terms you might find elsewhere. How I define it is strictly around the contract of how data can be observed in a system. I often talk about freshness here: how fresh the data you're working with is becomes part of that equation. But fundamentally, the core concept, and the way people most often use this term, is largely around quote-unquote correctness.

And correctness is context dependent, which makes it tricky. It depends on what type of system you're trying to build, and when you build these systems, you're going to first talk about the problem and then work backwards to find the consistency model that is going to work for your solution.
So why does consistency matter, and why are we working backwards to arrive at it? Because when you're building applications, you fundamentally have a contract between your expectations of the data you're going to use in the application, the data in the database, and the data that users will see in the application you've built. If that contract is broken, systems can explode in catastrophic ways.

Silent errors can occur, data corruption can occur, and if you want to fix these inconsistency problems later, certain things will actually just be impossible for you to do without totally re-architecting your software around something more consistent. The door closes behind you when you move to a less consistent system: you don't have the capability of adding consistency in retroactively, and that's the scary part about consistency.
You really need to understand your problem and your domain first, because if you pick something that is not going to jive with the system in the future, you are going to be in a world of pain, probably re-architecting, or carving out some subsection of your application that has to be treated specially, with completely isolated data that works at a higher consistency level. All of that might not mean much yet, but I'm going to go through a concrete example now, and then eventually we'll talk about some real-world systems and how this all plays out in those components.
So here I've got this example. It's hypothetical, but it's a real problem that's actually faced by everyone designing e-commerce systems in the world, unless they're building on top of a pre-existing system from someone who has already solved this problem for them. But even then, as you extend those systems with your own, you still have to continually think about consistency and how it plays out. But I digress.

Here is the hypothetical scenario. There are two humans involved: a child and a parent. The parent is supervising the purchase of an item by the child online.
The child is going to review the orders on their account and see if the item has been purchased yet; then they're going to purchase the item; and then the parent is going to double check, just to make sure the child did the correct thing. So we have this flow over time: the child first reads the orders and sees that an order hasn't taken place yet, so they purchase the item, and then the parent reads the orders and finds that the child successfully purchased the item. The child was good; everything acted accordingly.

I just want to make this concrete one more time: nowhere have I mentioned servers, databases, microservices, none of that. This is all purely from the external-facing side of the system, the user. At the end of the day, sometimes your users are real people.
Sometimes your users are other services in your microservice architecture; sometimes they are the actual service and you are the database. But the point is that the types of problems I'm going to describe in this scenario play out regardless. It does not matter what those actors actually are; this is equally capable of happening in all of these cases. So here is the problem. This is another way the order of events can take place.
The child reads the orders, the parent reads the orders, the child buys the item, and then the parent buys the item. Why did this happen? Because the parent checked the orders in the window right before the child actually purchased the item. At that point in time the parent looked at the order list and said: oh, my child didn't purchase this, so now, as the failover, I have to go and purchase this item, because the shopping trip was not successful. But actually the child was successful.

The parent just checked too early, and these events got interleaved. This is a problem because the parent fundamentally made their decision based on stale data.
By the time the parent made their purchase, the read they had made was technically invalid, because the child had already purchased the item. They would have had to re-read before finally making that purchase to do this correctly, but they had absolutely no signal to tell them that they needed to re-read. Now, computer scientists love to sound really smart, and they like to use words from math and physics, so there's actually a term for the relationship between these two events: causality, or causal ordering, or causal dependency, because the purchase of the item is dependent on the read.
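The interleaving above can be sketched as a toy simulation. This is not any real e-commerce API; the item name and function are purely illustrative stand-ins for "read the orders, then decide whether to buy":

```python
# Toy simulation of the parent/child race: each actor decides whether
# to buy based on a snapshot of the shared order list, but nothing
# forces the parent's read to happen after the child's write.
orders = []  # shared state: the account's order history

def buy_if_missing(who, snapshot):
    # The decision is made against a possibly stale snapshot.
    if "toy" not in snapshot:
        orders.append("toy")
        return f"{who} bought the item"
    return f"{who} saw it was already bought"

# Bad interleaving: both reads happen before either write.
child_view = list(orders)   # child reads: no order yet
parent_view = list(orders)  # parent reads: still no order (stale!)

print(buy_if_missing("child", child_view))
print(buy_if_missing("parent", parent_view))
print(orders)  # the item ends up purchased twice
```

Because the parent's read is causally ordered before the child's write, its later purchase is based on data that was already invalid.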
Moving on from here, there is a really obvious way a lot of people think about solving this problem, and it truly does solve it: just combine the things with causal dependencies. Why can't they happen at one point in time? When folks think about this, they typically think about transactions in relational databases, or atomic operations in the sync libraries of their programming languages.

This does solve the problem, and it leads to a pretty good segue, which is that so far I've really been describing atomicity in this example. That's the A in ACID, if you'll recall, but not the C.
That's because, while we have grouped all these things together, we've been working under the assumption that every single time an action takes place, if we follow the flow of time, it is immediately visible to all of the outside actors in the system. This is where we start to really get deep into the physics and relativity analogies that exist in distributed systems.

We can imagine scenarios where, when you actually perform these actions, the visibility happens later. So, a child and a parent: the exact same story plays out, atomically still. The child performs the atomic operation, but the parent's read happens before the atomic operation is visible to all participants in the system.
This happens far more in distributed systems because, for example, you might have a read replica that is getting changes streamed to it asynchronously, making a best effort to stay up to date with the most recent information, so that folks in another geographic region get fast performance on stale data, or at least data not at the most consistent level. That's quite a common real-life scenario, largely in read-heavy systems.

This is the difference between atomicity and consistency: this visibility aspect that shows up in relativistic systems, and it is what will play out time after time in any distributed system.
It may seem far-fetched looking at this example: how can visibility actually be delayed if it's not a distributed system? But even in a normal database, running on a single machine, there is a window between the time at which a transaction commits and the time at which it is actually propagated to the visible data being queried, and in that window you can race against visibility.
So I have this consistency spectrum, where I lay out the problem in two different dimensions for talking about the types of consistency systems can have. Across the bottom, on the x-axis, I have immediate consistency, which is what I was describing for most of the example when I was talking about atomicity: once a change happens, it is immediately visible to everyone in the system. Then, on the right-hand side of the x-axis, I have eventual consistency: time passes and eventually folks will receive the update; they'll eventually see the change that has occurred, and it could be a basically arbitrary amount of time until that happens. This is what's most commonly discussed, I feel, when folks talk about consistency: do we need immediate or eventual consistency?
The other dimension is ordering: if something occurs first, it is guaranteed to be ordered before the thing that happens after it. That sounds a little bit specious, but I'll get into why it's relevant, and the systems that benefit from it, later. I've dropped in some examples of these types of systems; for instance, linearizability is the most immediate and the most strictly ordered.

That is the strongest guarantee you can find in a system. What it actually means is that there is a total global ordering across all of the nodes for each change in the system, and when those changes are applied, they are immediately visible to everyone.
This is the strongest guarantee you could possibly get. On the polar opposite end, I have eventual consistency that is also weakly ordered. There is an interesting bit of technology there called a conflict-free replicated data type. CRDTs are a building block that a lot of systems are exploring right now, and when you have a scenario where it doesn't matter in what order changes are applied, they let you propagate changes as a very effective way to synchronize data. They basically rely on the property of commutativity.
You might remember this from an abstract algebra class, or maybe just from learning addition as a child. Addition is commutative: one plus two equals three, and two plus one equals three. It doesn't matter in what order you receive the changes; when you sum together a bunch of numbers, they're always going to converge to the same result.

So if you're performing operations on your data that, regardless of the order you apply them in, always converge to the same result, that's great. That means you're going to be fine: you can use one of these systems and still get correct answers.
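As a minimal sketch of that commutativity property in action, here is a grow-only counter (a G-Counter, one of the simplest CRDTs). The replica names are illustrative; the point is that merging replica states in either order converges to the same result:

```python
# Grow-only counter CRDT sketch: each replica only increments its own
# slot, and merging takes the element-wise max, which is commutative
# and idempotent, so merge order never changes the converged value.
def increment(state, replica):
    new = dict(state)
    new[replica] = new.get(replica, 0) + 1
    return new

def merge(a, b):
    # Element-wise max over the union of keys.
    return {k: max(a.get(k, 0), b.get(k, 0)) for k in a.keys() | b.keys()}

def value(state):
    return sum(state.values())

r1 = increment({}, "replica-1")                          # one event on replica 1
r2 = increment(increment({}, "replica-2"), "replica-2")  # two events on replica 2

# Merging in either order converges to the same state and total.
assert merge(r1, r2) == merge(r2, r1)
assert value(merge(r1, r2)) == 3
```

Note the counter can only grow; supporting decrements or arbitrary writes requires richer CRDT designs, which is exactly why this technique only fits data whose operations commute.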
There are other variants in the space. I use serializability as an example: there is an order, but unlike an immediately consistent system, there's no total global ordering; it just means everything happens independently and isolated, in some order. And then we also have eventual consistency that is still strongly ordered, which is more similar to what you see out of a lot of the NewSQL systems. I just wanted to show that there are varying levels along the spectrum; it's not just the opposite corners and a middle ground. But I want to dig a little deeper into this, because there are super important properties to understand here.
If you look at the bottom, on the x-axis I've replaced immediate and eventual with slow and fast, because in the real world this is the implication of choosing one of these. It is way, way less performant to choose something that is immediately consistent, because before you write something, you have to make sure it is going to be visible to everyone. That means it probably has to be replicated everywhere before it becomes visible and is accepted as a write. With CRDTs, by contrast, you can basically dump out a stream of changes, hope that eventually someone gets all of them, and then they're good.

What is really interesting here is that we have this middle ground, this box, and I call this box cleverness, because this is where you're going to find a lot of the stuff in the real world: compromising, and trying to make a lot of systems viable.
If you're stuck in one of those camps, you're stuck in either corner of the spectrum, but actually most systems will not have those extreme problems; instead they're going to live somewhere in this middle ground. And this middle ground is where there are a lot of interesting tricks that you can partially apply to gain a lot of benefits in your system, without necessarily paying the costs globally across all of the data you're working with. Apologies for some typos in those slides.
S3. I wanted a really good example, and this is actually kind of funny. I wanted a good example of a ubiquitous, eventually consistent system that a lot of folks are using, and I immediately thought of S3. I built a product in the past on top of S3, and you would submit blobs to S3; it would tell you, hey, I wrote it, that's great. But if someone else immediately tried to pull down that blob, it would not be available yet.

So there was no guarantee that after you had written something, it was immediately viewable to external actors. And actually, as I went to make this presentation, I found an article: AWS fixed this. They changed it a couple of years ago. So I don't know if this still holds across all the implementations of blob storage.
Eventual consistency was the status quo there, and what is interesting, and I'm going to dive into it deeper later, is that the system backing S3's metadata storage was given additional consistency capabilities, which is what made it possible for the developers of S3 to change this and make it more consistent.

That is very typical for systems, and it's actually an example rebutting my conjecture that it's impossible, or prohibitively hard, to add consistency after you've designed a system that doesn't have it. But S3 is sufficiently simple that it was actually not much of a hurdle once the underlying dependency offered that capability. So remember to take everything I say with a grain of salt.
You know the system you're working with, and I have to speak in generalities about the systems I think are most commonplace, the ones I've seen most often; maybe you're building something that's not exactly that.
So here's one that's going to be really fun: relational databases. This is a system a lot of people are familiar with, and I think in the general case a lot of developers believe that by simply adopting transactions in their usage of a relational database, they have solved their consistency problems. I'm here to tell you that transactions are not a silver bullet, not nearly. What dictates your consistency in these systems, far more than transactions, is basically the isolation level set in the database. Even if you don't use transactions at all, the statements you send the server are implicitly wrapped in transactions. Everything is a transaction, basically an atomic unit, inside a relational database, whether you use the keyword in the SQL you're writing or not.
So what are isolation levels? Isolation levels are an aspect of the database, usually set at the session or transaction level, and what an isolation level effectively says is: this is the consistency of the data that you're working with in that context. There's no strict standard for this, but MySQL and Postgres have roughly agreed on the same basic isolation levels. In MySQL, the default isolation level is REPEATABLE READ; you'll see that's second from the top. In Postgres it's actually READ COMMITTED, third from the top. So Postgres is by default more lenient: it can give you less consistent responses by default, if you don't go in and clarify what isolation level you need in the SQL you're writing. So I want to run through the different types of scenarios outlined here.
Dirty reads are when, in your transaction, you read a row and see changes from another transaction that has not been committed yet. It hasn't been written to the database, but you'll see that data; you'll see changes that have occurred. Basically: I open a transaction and try to modify something; you open a transaction and go to read that thing; you'll see my change. That is a dirty read, and unless you're really explicitly choosing to go inconsistent, it's unlikely to ever be a scenario you'll see when using a relational database, unless you specifically ask for READ UNCOMMITTED.

So that's not a super common problem, but it's interesting to note that it's even possible. Allowing it effectively eliminates some of the benefits you get from MVCC, multi-version concurrency control, so it is super atypical unless you're working with a database that is not MVCC.
Then we have the non-repeatable read, which is when you re-read data and find it has been modified by another committed transaction. This means that within your transaction, another transaction modifies some data you've already read, and if you go to read that data again, you'll see it updated inside your transaction. This breaks the atomicity that a lot of people consider transactions to provide them, but you'll find that non-repeatable reads are actually possible under READ COMMITTED.
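To make the anomaly concrete, here is a toy in-memory sketch, not a real database engine, of READ COMMITTED-style behavior: reads always return the latest committed value rather than a transaction-start snapshot, so a concurrent commit changes the answer between two reads in the same transaction:

```python
# Toy model of a non-repeatable read under READ COMMITTED-like rules.
committed = {"balance": 100}  # the committed store every read consults

class Txn:
    def __init__(self, store):
        self.store = store
        self.pending = {}  # this transaction's uncommitted writes
    def read(self, key):
        # No snapshot: reads see the latest committed value.
        return self.pending.get(key, self.store[key])
    def write(self, key, value):
        self.pending[key] = value
    def commit(self):
        self.store.update(self.pending)

t1 = Txn(committed)
t2 = Txn(committed)

first = t1.read("balance")   # t1 sees 100
t2.write("balance", 50)
t2.commit()                  # t2 commits concurrently
second = t1.read("balance")  # t1 re-reads and now sees 50

assert first == 100 and second == 50  # same txn, two different answers
```

Under a snapshot-based level like REPEATABLE READ, `second` would still be 100, which is exactly the guarantee the stricter level buys you.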
So by default in Postgres, this is totally a scenario that can happen to you, even though that's probably shocking to people who believe: hey, I'm supposed to just be reading from one particular snapshot; I'm not supposed to see these types of changes. But it's quite possible. And then finally we have the phantom read, which is more commonly the quietest one, because you're not going to see it, but it's also the thing that's going to actually corrupt your data. Well...
It's not going to corrupt your data by itself, but it is definitely going to be the most surprising way your data could get corrupted. Basically, you read some data in your transaction, another transaction modifies it, and when the database goes to apply them, it just happily applies both. It doesn't try to rerun your transaction. Your transaction performs a read and then, depending on the value of that read, performs its write; if another transaction comes in before it and totally swaps that value out, it doesn't matter, yours is just going to proceed anyway.

If you had re-read the value, as in the non-repeatable read scenario, it would not have mattered: it would have lied to you and said the value hadn't changed. But when these transactions get committed, the value will have changed, and you're going to be out of luck. Phantom reads are actually possible even at the default isolation level in MySQL.
So unless you have explicitly configured your data store to be SERIALIZABLE, there is no way to avoid all of these anomalies. Now, the interesting part comes in that cleverness box I showed off in the diagram earlier.
There is a SELECT FOR UPDATE clause that you can write that says: I'm going to read this row, because I am going to causally update some other row based on its value. That lets you describe this causal dependency, this causal ordering, in terms the database can understand. What actually happens internally when you use SELECT FOR UPDATE is a row-level lock on that data, which prevents any other transactions from modifying it for the duration of your transaction, until you can actually commit. This is what gives you the guarantee that no one else changed that value out from underneath you, and it is kind of having your cake and eating it too.
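To sketch why pinning the row fixes the parent/child race from earlier, here is a pure-Python stand-in where an ordinary mutex plays the role of the row-level lock that SELECT ... FOR UPDATE takes; it is a simulation of the locking idea, not real SQL:

```python
# Hold a per-row lock across the read AND the causally dependent
# write, the way SELECT ... FOR UPDATE pins a row until commit.
import threading

orders = []
row_lock = threading.Lock()  # stand-in for the database's row-level lock

def buy_if_missing(who):
    with row_lock:                # "SELECT ... FOR UPDATE": read is pinned
        if "toy" not in orders:   # the read
            orders.append("toy")  # the write that depends on the read
            return f"{who} bought the item"
    return f"{who} saw it was already bought"

threads = [threading.Thread(target=buy_if_missing, args=(who,))
           for who in ("child", "parent")]
for t in threads:
    t.start()
for t in threads:
    t.join()

assert orders == ["toy"]  # whichever actor runs second sees the purchase
```

Because the read and the dependent write happen inside one critical section, the stale-read interleaving is impossible, and only for this one row; nothing else in the "database" pays for the lock.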
You don't have to turn on full serializability to get that guarantee; you can do it in any of these modes. That is the cleverness box again: you're selectively choosing, I need consistency for this operation right here, locally, but I don't need consistency across the board everywhere.
There's a lot more, and there are a lot of other tricks deep inside these relational databases for working with this data, but I think this is the high-level, most important thing. If I had to teach someone about consistency and relational databases, if I had ten minutes to tell you something, this is it. Walk away knowing that isolation levels are a thing you should constantly make sure you're reminded of and familiar with when you're writing schemas for relational databases, and also that if you need to read then write in a relational database, you should most likely be using SELECT FOR UPDATE. Cool.
So let's talk about a less commonly used, but equally interesting and very relevant, class of system: lock services. That is what I call this class of software; although they were originally designed to be lock services, they have larger scopes these days. These are projects like etcd and ZooKeeper. So what are the guarantees here?
Or actually, let's first talk a little bit about what lock services are. In probably the mid-to-late 2000s, Google wrote a paper about a system they had built internally called Chubby, a distributed lock service. The point of the system was: we have distributed systems, we have a bunch of different applications, and they all need to coordinate, so they need some mechanism by which one of them can safely acquire exclusive access to some resource.

They need a lock, a distributed lock. That's very tricky, and it turns out, formally, that to solve that problem you need a linearizable system. So they wrote a paper on how they designed this distributed lock service, and ultimately we see projects inspired by it. ZooKeeper, I believe, is definitely inspired by that paper, though it doesn't implement it directly.
ZooKeeper implements its own unique algorithm called Zab, but then we also see systems like etcd, which is inspired by later computer science research around the same consensus problems in distributed systems. Ultimately these are linearizable systems, but as they got more and more mature, their authors decided: hey, linearizability is a really useful property, so we always have guarantees about whatever critical data we have in our distributed system. Let's save it all over there. We don't just need it for locks; we need it for more things.
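The core primitive a lock service exposes to clients can be sketched very small: a linearizable compare-and-swap on a key, out of which mutual exclusion falls out naturally. This is a toy, single-process stand-in for the replicated state machine that etcd or ZooKeeper actually maintain; the key and client names are illustrative:

```python
# Toy lock-service primitive: atomic compare-and-swap on a key.
import threading

class LockService:
    def __init__(self):
        self._kv = {}
        self._mu = threading.Lock()  # models the single serialized log

    def compare_and_swap(self, key, expect, new):
        # Atomically write `new` only if the current value equals `expect`.
        with self._mu:
            if self._kv.get(key) == expect:
                self._kv[key] = new
                return True
            return False

svc = LockService()

# Two clients race to acquire the same named lock.
first = svc.compare_and_swap("locks/leader", None, "client-a")
second = svc.compare_and_swap("locks/leader", None, "client-b")

assert first is True    # client-a wins the lock
assert second is False  # client-b must retry or watch for release
```

Real services add leases, watches, and replication via consensus, but the reason they must be linearizable is visible even here: if two clients could both observe the key as unset, both CAS operations would succeed and mutual exclusion would be broken.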
etcd is actually the core data store used for Kubernetes, and what's really interesting is that, while it is linearizable, there are lots of tricks you can use in the protocol and under the hood to optimize the consistency and the performance of such a system without impacting the external, user-facing appearance, the freshness of the data users are seeing. There are a lot of really interesting distributed systems tricks here; I'm just going to call them tricks because I don't want to dive too deep into them, and there are lots of variations of algorithms under the hood that are shortcutting things to make this really fast. There are also capabilities in a lot of these systems to relax
A
The
consistency
that
you
can
actually
use
with
the
system.
If
you
would
like
to
trade
that
in
order
to
get
higher
performance-
and
the
important
thing
to
note
here-
is
that
when
you
are
kind
of
like
relaxing
this
consistency
and
like
really
kind
of
playing
with
these
things,
these
are
the
authors
of
these
systems
and
they're
doing
it
kind
of
at
the
protocol
level
and
like
in
the
API
level.
A
So it's not really exposed to anyone on the outside consuming the system, so much as used to optimize around the guarantees that they can provide. There are some that provide APIs, so you can choose whether this needs to be a quorum read or not, for example, but by and large, lots of these tricks are internal to these systems. So the most critical, most strictly required strong-consistency data gets stored in lock services.
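That quorum-read-or-not choice can be illustrated with a toy model. This is just an illustrative sketch in plain Python, not etcd's or any lock service's actual implementation: writes reach only a majority of replicas immediately, so a read that consults any majority always sees the newest value (the two majorities must overlap), while a read served from a single replica can be stale.

```python
import random

class Cluster:
    """Toy 3-replica store: writes are applied to a majority immediately;
    the remaining replica lags, simulating replication delay."""

    def __init__(self, n=3):
        self.replicas = [{} for _ in range(n)]  # each: key -> (version, value)
        self.version = 0
        self.majority = n // 2 + 1

    def write(self, key, value):
        self.version += 1
        # Only a majority applies the write right away; the rest lag behind.
        for replica in self.replicas[:self.majority]:
            replica[key] = (self.version, value)

    def quorum_read(self, key):
        """Strong read: ask any majority, keep the newest answer. Any read
        majority intersects the write majority, so this is never stale."""
        sampled = random.sample(self.replicas, self.majority)
        return max(r.get(key, (0, None)) for r in sampled)[1]

    def local_read(self, key, i):
        """Relaxed read: serve from one replica; cheap, but possibly stale."""
        return self.replicas[i].get(key, (0, None))[1]

cluster = Cluster()
cluster.write("config", "v1")
cluster.write("config", "v2")
print(cluster.quorum_read("config"))    # always "v2"
print(cluster.local_read("config", 2))  # lagging replica: still None here
```

The point of the sketch is the trade: the quorum read pays for extra round trips to guarantee freshness, while the local read is fast but makes no such promise.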
A
So then, basically, what happened was the NewSQL revolution, right? And that eventually turned into what we all now refer to as distributed SQL. I think of databases in this space as CockroachDB and TiDB, but there are a couple more.
A
This era is really interesting because they basically looked at solving the problem of scaling out a relational database in a horizontal fashion: you can keep spinning up individual nodes and scale up the system without any replication lag, and you don't have to direct writes to one particular node. They're solving these traditional problems in scaling relational databases, and doing so by applying some of the research from the lock-service optimizations. So what happened was, the creators of these databases.
A
Looked at the research that had gone into the lock services and went: well, hey, the internal data in our database, the bookkeeping data we need to make sure we can actually scale these systems, we can basically store that using these lock-service techniques, using the consistency tricks those systems have developed, and that is going to make it so we can actually scale and provide our SQL systems.
A
You'll notice that, as part of this process, they're not actually passing any of that on to the end users. They're still providing the same isolation levels, transaction logic, select-for-update logic, all the things I mentioned previously about relational databases. Those guarantees are still here in these systems; they've just been made possible to expand into this new kind of auto-scaling world. But that's not to say that magically these distributed SQL systems are linearizable.
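Those preserved transactional guarantees can be shown in miniature. A minimal sketch using stdlib sqlite3 (a single-node database, not a distributed SQL engine, but the guarantee being preserved is the same one): a write inside an open transaction is invisible to a second session until COMMIT.

```python
import os
import sqlite3
import tempfile

# Two sessions against the same database file; autocommit mode so we
# control transaction boundaries explicitly.
path = os.path.join(tempfile.mkdtemp(), "demo.db")
writer = sqlite3.connect(path, isolation_level=None)
reader = sqlite3.connect(path, isolation_level=None)

writer.execute("CREATE TABLE accounts (id INTEGER PRIMARY KEY, balance INTEGER)")
writer.execute("INSERT INTO accounts VALUES (1, 100)")

writer.execute("BEGIN")
writer.execute("UPDATE accounts SET balance = 50 WHERE id = 1")

# The reader still sees the pre-transaction value: the update isn't committed.
before = reader.execute("SELECT balance FROM accounts WHERE id = 1").fetchone()[0]

writer.execute("COMMIT")
after = reader.execute("SELECT balance FROM accounts WHERE id = 1").fetchone()[0]

print(before, after)
```

Distributed SQL systems keep exactly this kind of contract while spreading the data across nodes; the consensus machinery underneath is what makes that possible.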
A
In fact, of the examples I've given here, CockroachDB and TiDB, neither is linearizable. So really, the major beneficiaries of the consistency being used here.
A
Are the database admins, the operators, the SREs running these systems, because they're able to scale out these relational databases, using the more effective kind of consistency only for the core data that needs it, but not for any of the end-user, application-developer data. So, cool: that unlocked a bunch of new capabilities for the SQL databases, but the end users themselves didn't necessarily get anything better. And this is kind of where it gets less well-defined.
A
What I would say is the interesting new era of databases, which I'm calling "ad hoc" here, though I think of these as flexible-consistency systems. I use two examples here: Cosmos DB and DynamoDB, which are foundational data stores at Azure and AWS respectively.
A
Dynamo is actually notorious for having been a paper published many, many years ago, probably mid-to-late 2000s, and what has actually happened is that the Dynamo described in that paper is not at all like the DynamoDB that currently runs at Amazon. Which is why they're able to add capabilities to S3, right, to actually add more consistency: because now systems like this expose consistency in the end-user API.
A
So the folks consuming these databases can, on the fly, choose what level of consistency they want for the data in the response.
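DynamoDB is a concrete example of that per-request choice: its documented GetItem API takes a ConsistentRead flag. A hedged sketch that only builds the request shape as plain dicts, with no AWS client involved, and with a made-up table and key for illustration:

```python
def get_item_request(table, key, strongly_consistent=False):
    """Build a DynamoDB-style GetItem request. Consistency is chosen per
    request: ConsistentRead=False (the default) allows an eventually
    consistent read, which is cheaper and faster but may return stale data;
    ConsistentRead=True forces a strongly consistent read."""
    return {
        "TableName": table,
        "Key": key,
        "ConsistentRead": strongly_consistent,
    }

# Hypothetical table and key, for illustration only.
fast = get_item_request("Orders", {"OrderId": {"S": "42"}})
fresh = get_item_request("Orders", {"OrderId": {"S": "42"}},
                         strongly_consistent=True)
```

The same two requests against the same item can legitimately return different answers right after a write, and that is the caller's explicit choice, not a surprise.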
So if we touch back on the other systems: lock services are very strict, strong consistency no matter what you do. And then relational databases, you don't have any options there beyond the isolation level and the select-for-update mechanisms, but that is all very, very domain-specific, and it's still non-obvious.
A
There's no way in those systems to say: treat this particular glob of things, this whole operation I'm going to perform, with this amount of consistency. So it's very, very different from what we previously had. This is actually what I would describe as unifying all the benefits I just talked about, because now the end users are more in control of the consistency of their data. They get to pay for exactly what they use in terms of the performance cost.
A
When things need to be consistent, they can actually slow things down just for that part, and then for the things that don't need to be consistent, they can take advantage of the optimizations and really go for it. And what's interesting here is that SpiceDB, the database that my company builds, is an example of one of these ad hoc systems: users, per request, can specify the consistency they need, and there is a default should they not specify one.
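To make that concrete: in SpiceDB's v1 API, each CheckPermission request carries a Consistency message, one of minimize_latency, at_least_as_fresh (relative to a ZedToken returned by a prior write), at_exact_snapshot, or fully_consistent. The sketch below just builds request shapes as plain dicts rather than using the real client library, so treat the exact structure as illustrative:

```python
def check_permission_request(resource, permission, subject, consistency=None):
    """Build a SpiceDB-style CheckPermission request. If the caller doesn't
    choose a consistency, fall back to a default (minimize_latency),
    mirroring the per-request model described in the talk."""
    return {
        "resource": resource,
        "permission": permission,
        "subject": subject,
        "consistency": consistency or {"minimize_latency": True},
    }

# Fast path: the answer may be computed from slightly stale relationships.
fast = check_permission_request("document:readme", "view", "user:alice")

# Read-after-write: at least as fresh as the ZedToken from a prior write
# (the token value here is a placeholder).
fresh = check_permission_request(
    "document:readme", "view", "user:alice",
    consistency={"at_least_as_fresh": {"token": "zedtoken-from-write"}},
)
```

The ZedToken variant is the interesting one for permissions: it lets a caller say "at least as fresh as my own write" without forcing a fully consistent, and therefore slower, check on every request.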
A
But what is really, really interesting is that this is an opportunity for a lot of user-experience research. Because, at least in our domain, since we're specifically handling authorization data, we can actually tell users what is going to happen at the different levels of consistency. Because we know more about their data, because we know the domain they're operating in, we don't have to explain these distributed-systems research concepts to every user of the system.
A
Instead, we can make it obvious that, say, if you check this permission with this option, it's going to be based on a time where the time is rounded for performance reasons. So we can actually draw on analogies that folks understand about our domain, rather than trying to teach them very low-level.
A
Concepts. I'm actually super excited for more of this. I would really like to see APIs where you can sneak the concepts of consistency into them, so that folks, by choosing the proper APIs while thinking about their use case, no longer think about consistency: they think about their use case, and then, by virtue of picking that, they have just chosen the API with the proper consistency for what they need.
A
So it's very, very cool stuff. I'm really excited about this space generally, and it wouldn't be possible if not for an approach where we start with a very consistent world view and then allow folks to relax that over time. Because I feel like when you work with the relational database model, where it is very relaxed and then you layer on more and more strictness, when you do that, you just have to know too much about both the domain details and the internal knowledge of the data store you're working with, or the internal knowledge of the system.
A
A
This is a very deep subject, and I just want folks to know that I have only scratched the surface here. People spend their whole professional careers researching this stuff. That's a cuckoo clock! That's how I know I hit my number for the time for this.
A
This webinar. But yeah, consistency is a super, super important subject. I still see very seasoned developers overlooking it or downplaying it in their systems, especially when they make architectural decisions for the overall design of their system, and it really, really shouldn't be overlooked. It's one of the most important things you need to discuss and decide when you're designing something, because, excluding very few occasions, we have previously not worked with a lot of systems that give you the flexibility to actually adopt more consistency if you need it.
A
There are many large companies that folks in industry think are the bastions, the pinnacle, of engineering, that get to hire experts from all around the world. They pay well because they're a giant company building a really cool product. But a lot of these companies have huge problems because of the way they grew: they were incapable of dealing with consistency.
A
They didn't have the technology at the time to deal with the consistency of their systems, and they have outages as a result, they have data loss as a result, and they are stuck in a state where they're forever trying to migrate critical data, or data that has different consistency requirements, into new systems. I do not want to see more developers in that scenario, especially since we've made so much progress on this subject as an industry. And with that, I'd like to thank everyone for watching.
A
The link is right in front of you, and it's not just exclusively for SpiceDB users; it's an open-source community where folks that care about distributed systems can talk about all kinds of research and practical usage in industry. So thanks for your time.