From YouTube: Modeling and Partitioning in Cosmos DB (2021-05-20)
Description
Mark Brown, Principal Program Manager Azure Cosmos DB, will be presenting on Modeling and Partitioning in Cosmos DB. A great introduction to using a distributed NoSQL database.
B
A
Thank you very much. Welcome, everybody, to the May 20th edition of the San Antonio Cloud Computing Group meetup, virtual as it has been. I'd like to thank everyone from the San Antonio Cloud Computing Group and everyone from the .NET Virtual User Group for being here, and I'd like to thank Dave and the .NET Foundation for hosting and running this thing.
A
I appreciate the great work they do in running and hosting these things, as well as their great work educating, advocating, and kind of bringing in the new generation of developers. I'll post that link in the chat here in a moment. We have one announcement: one week from today, on the 27th.
A
A
I will... I don't have a link for that one yet, but I'll post that to our page as soon as I do. And now to our main event of the evening: I'd like to introduce our guest, Mark Brown. Mark is a principal product manager for Azure Cosmos DB, and I'll let him intro his talk, since he knows all about it. But I'm very excited, because I don't know how often it is we get to hear first-hand, you know, from the source material. So.
A
Here, Mark, and I'll hand it over to you.
B
Thank you, Kevin. I've known Mike Benkovich for, god, well over a decade, probably, so he's a great guy. Good to have him on next... next, was it month? Do you guys meet monthly?
A
B
All right, well, thanks for having me, and welcome, everyone. My name is Mark Brown. I'm a principal program manager on the Cosmos DB team. I've been on the Cosmos DB team for probably, I guess, a little over three years, or around three years. Prior to that, I was on the Azure networking team, and prior to that,
B
I think... and I've been at Microsoft now just about 20 years, all told, so quite a long time. Worked on lots of different products and services. I have to say this is probably the most fun I've had. NoSQL distributed databases is a heck of a lot of fun. Lots of good tech in here. My day job for Cosmos is twofold: I'm the program manager for what's called our resource provider. That's basically our control plane back end.
B
So when you provision an account using, say, an ARM template or PowerShell or the CLI or even REST, all that API stuff, the management API, is what I PM. And then I also run a team of other PMs as well, and we're just basically trying to help spread love and awareness around Cosmos DB. So today I'm gonna talk about data modeling and partitioning in Cosmos DB.
B
This is probably, I would say, the most important topic for anyone new to this type of database to understand. So let me kind of go through objectives here. First, I want you to get familiar with Cosmos DB's core concepts, kind of our API in there, and understanding data modeling best practices.
B
And then we're going to apply that to a real-world scenario here today, and then I'll help you understand throughout all this how Cosmos differs from a relational database when designing a data model, because we're going to start with a relational data model.
B
B
So let's talk about the horizontal scale aspects of Cosmos DB. Unless your workload is small, with just a low amount of data or requests, your data is likely going to be stored on a bunch of different physical servers, or partitions, within something we call a container. Now, in Cosmos we abstract all this away, but under the hood you're reading and writing data in and out of Cosmos across this cluster of servers. This is better known as scale-out, and it's how we achieve horizontal scalability, enabling both unlimited storage and unlimited throughput.
B
So when you need more storage, we just simply add more servers to your cluster. This also provides unlimited throughput, because each of these computers has its own CPU, memory, and I/O running through it, so it adds additional capacity to handle the requests when you need it. Now, Cosmos DB is also non-relational.
B
So when working with relational databases, you have the ability to do things like define constraints between the different entities you're storing. This lets you do things like create foreign keys and perform join operations, those types of things. Non-relational databases like Cosmos DB don't implement any of these relational constraints, and the reason why is because Cosmos is a horizontally scalable database and your data is likely going to be spread across multiple servers. This could be tens or even hundreds of servers, depending on the storage or throughput needs you have. Now,
B
I don't want to suggest that it's not technically possible to enforce relational constraints across a cluster of servers, but doing so would actually have an enormous impact on the performance and availability of your database. And because Cosmos is designed to provide predictable performance, we don't really provide a way to declare these types of relational constraints. So you may be asking yourself: well, is Cosmos okay to use for relational workloads? It most definitely is, and in fact most workloads on Cosmos are relational in nature.
B
You just need to learn and use different techniques to implement the relationships between your entities, and this approach is very different than designing a data model for a relational database. For those that are new to this type of database: you should not follow your intuitions. Best practices in the relational world don't translate very well to this type of database and may even be anti-patterns. Okay.
B
So let's put this into practice. I'm going to spend the rest of this session taking a relational database that anyone who's familiar with SQL Server should know, the AdventureWorks 2017 database. Well, actually, we'll just take a small part of it. These tables represent the canonical e-commerce workload, if you will. We've got customers, with the customer table, customer address, and customer password. For the products, we've got, of course, the product table, and then there are your product categories and product tags. And then, finally, we have our sales orders.
B
So we've got our sales order header here and then also our sales order detail. So with a non-relational document database like Cosmos DB, there's not much we can do with the tables the way they are. You would never use these tables as is, because the performance cost of trying to do operations on them would be prohibitive.
B
Another thing I want to point out, too, is that in a relational database the relationships between the data are important for modeling the database. For a NoSQL database, this is only part of the story. What's most important when designing a data model for a NoSQL database is to look at the access patterns for your application.
B
Okay, so a real-world e-commerce platform would obviously need to implement a lot more functions than this, but this subset really is enough to guide and illustrate the different design techniques that we want to showcase. So here's what our little mini e-commerce site is going to do. We're going to create and edit a customer, and then we want to retrieve a customer.
B
Then we want to list all product categories, and then we'll do the same for product tags: we'll create, edit, and list those. We want to create a product and edit a product, and then we want to list all products from a category, and we also want to include the name of the category and the tags in there, because if you look here in product, we don't have the category name in here, and the tags are sitting in a completely different set of tables with a many-to-many relationship.
B
So we want a query that's going to provide all of that. We also need to create a sales order, and then we want to list all sales for a customer. Okay, so that's a different one. And then we have another one, in which we want to query the top 10 customers by the number of sales orders they have. So this is kind of like a light analytics scenario,
B
if you will. Maybe I want to send out a nice gift to them, or, I don't know, a coupon or whatever, saying thanks for being a customer, here's a little something for you. Okay, so let's get started remodeling this relational database in Cosmos DB, and we'll start here with the customer entities. So we have three tables here: I've got customer, customer address, and customer password. First, what we need to do is translate these into their corresponding JSON documents.
B
Now, we could keep these in separate containers in Cosmos, but let's look at the operations we need to support. That includes both creating a customer and editing a customer. Now, given that Cosmos DB stores data as JSON, another approach we could take is to embed the address and the password tables into our customer table. Now, when I create a new customer, I only need to insert the data into a single container.
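The embedding step described above can be sketched in a few lines of Python. This is a minimal illustration, not the speaker's code: the row values, field names (`firstName`, `addresses`, `hash`, and so on), and the `embed_customer` helper are all hypothetical stand-ins for the three relational tables.

```python
import json

# Hypothetical rows standing in for the customer, customer address,
# and customer password tables.
customer_row = {"id": "c-1001", "firstName": "Ana", "lastName": "Lopez"}
address_rows = [
    {"customerId": "c-1001", "city": "San Antonio", "state": "TX"},
    {"customerId": "c-1001", "city": "Austin", "state": "TX"},
]
password_row = {"customerId": "c-1001", "hash": "<hash>", "salt": "<salt>"}


def embed_customer(customer, addresses, password):
    """Fold the address and password rows into one customer document."""
    doc = dict(customer)
    # One-to-few: a small, bounded array of addresses.
    doc["addresses"] = [
        {k: v for k, v in a.items() if k != "customerId"}
        for a in addresses
        if a["customerId"] == customer["id"]
    ]
    # One-to-one: a single embedded object.
    doc["password"] = {k: v for k, v in password.items() if k != "customerId"}
    return doc


customer_doc = embed_customer(customer_row, address_rows, password_row)
print(json.dumps(customer_doc, indent=2))
```

With this shape, creating a customer is a single insert of `customer_doc`, and the foreign-key columns are no longer needed because the nesting carries the relationship.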
B
Generally, you want to embed data or entities together when there's a one-to-one relationship for the data. So for customer and customer password, you would have that one-to-one relationship, so that makes a good candidate for that. Or if you have what we call one-to-few relationships. Now, this is different than one-to-many. One-to-few is actually a little more precise, in that it helps to quantify whether it is an unbounded relationship, in that sense, right?
B
So if you have one-to-few, there's just a handful of things; it's typically some kind of bounded amount. This would be like address, right? Address is strictly a one-to-many relationship to a customer, but you don't have an unlimited number of addresses, or you don't have an insane number of addresses. You just have a small number, like maybe one or two: you've got yourself, your wife, your kids, or your mom or your dad or whoever.
B
So it's a one-to-few relationship, as we like to call it. And then the other aspect is that the related items are queried or updated together; that's another reason why you would want to embed. Now, you want to reference when the relationship is one-to-many, and especially if it's unbounded, and I'll explain why that's important in a little bit.
B
Also, another good option for referencing the data within a database like Cosmos. Okay, so because we have this one-to-one and one-to-few relationship between our entities, and because we usually retrieve all the data for a customer at once, it makes sense to embed everything inside a single JSON document here. Now that we've defined our first entity, we need to store our customer data in a container, and we'll call that customer. Now, when creating a new container, we also have to define its partition key within Cosmos DB. And you may be asking, well, what's a partition key? Well, remember, earlier I said that a Cosmos DB container is an abstraction over a cluster of physical servers. When storing documents in a Cosmos DB container,
B
B
Instead, documents are being written to what we call logical partitions, and it's these logical partitions that sit on different physical servers. So from a user's perspective, you don't really need to care about physical servers when designing your data model, just that your data gets written to these different logical partitions.
B
B
Well, that's where the partition key comes in. So when creating a new Cosmos DB container, you define the container's partition key, which is the name of a property that Cosmos will use to decide which logical partition your data should go to. Think of it as like an address to route your data to within your database. Now, using the example of a container partitioned by username,
B
First is each document has a maximum size of two megabytes. Now, remember before, when I talked about when to embed or when to reference? It's because each document can be no larger than two megabytes, which is why you don't want unbounded arrays inside your documents.
B
Another restriction, too, is each logical partition has a maximum size of 20 gigabytes, so you can't have more than 20 gigabytes of documents where, say, username equals Andrew. And when working with data, what we want to achieve, or strive to achieve at least, is an even distribution of data across our different logical partitions. So if one partition gets a lot more data stored on it than another, this is what we call a hot partition, and we want to try to avoid that. Same goes for requests as well.
B
B
Let's take that example again of a container partitioned by username. Now, if we were to issue this query, you'll notice that it filters on the username property, which, thankfully, is our partition key, and because Cosmos DB understands that, it's going to send this query to a single logical, or physical, partition.
B
Now, if our query is filtered on something else, say favorite color here, we would have no idea where the results are. So in this case, Cosmos fans out the query to each and every logical partition, and thus every physical partition, and, as you've probably already guessed, this is going to hit every physical server. This is what we call a cross-partition, or fan-out, query. More specifically, a fan-out means it's going to hit every single physical partition in your database.
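The difference between a single-partition query and a fan-out can be illustrated with a toy router. This is only a sketch under stated assumptions: the SHA-256 hash, the fixed count of four physical partitions, and both function names are hypothetical, and the real service uses its own hashing and placement scheme.

```python
import hashlib

PHYSICAL_PARTITIONS = 4  # hypothetical cluster size


def physical_partition(partition_key_value: str) -> int:
    """Toy routing: hash the partition key value onto a physical partition."""
    digest = hashlib.sha256(partition_key_value.encode("utf-8")).digest()
    return int.from_bytes(digest[:8], "big") % PHYSICAL_PARTITIONS


def partitions_to_visit(filters: dict, partition_key: str) -> list:
    """Return the physical partitions a query with these filters must touch."""
    if partition_key in filters:
        # The filter names the partition key, so one partition is enough.
        return [physical_partition(filters[partition_key])]
    # No partition key in the filter: fan out to every physical partition.
    return list(range(PHYSICAL_PARTITIONS))


single = partitions_to_visit({"username": "andrew"}, "username")
fan_out = partitions_to_visit({"favoriteColor": "blue"}, "username")
print(len(single), len(fan_out))
```

In this model the cost of the second query grows with `PHYSICAL_PARTITIONS`, while the first stays constant, which is the scaling behavior the talk warns about.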
B
Now, such a query will work, but it does have an impact on latency, throughput, cost, and ultimately scalability. Now, whoops, sorry, I need to go back. Now, for small containers, this performance impact is not that bad. In fact, for a container with just a single physical partition, you won't notice any impact on performance at all. However, as your database grows larger, the impact becomes worse and worse, as you have to hit more physical servers
B
to answer your query. And this is precisely the scenario that traps so many people new to this type of database: typically, developers will do dev and test on a small data set and then conclude that their design is good because the performance they measured was acceptable. In a NoSQL database, you don't truly know if your design is scalable until you actually measure it under heavy load, with a good amount of data in there and realistic concurrency for your operations.
B
Okay, now, with all that knowledge about partitioning, how do we choose the right partition key for our customers? Remember, before, I showed the operations we need to do for our customers. We had three things we needed to do: we needed to create a customer, we have to edit a customer, and we also need to retrieve a customer. In this case, we're going to retrieve them by their id.
B
So the id of the JSON documents we're going to store in this container should make for a pretty suitable partition key. Keep in mind, too, the other thing that's important is the concurrency of these operations. We're not often creating or editing a customer, but we are retrieving them quite often, right? Every time they log into the e-commerce application.
B
B
Okay, so the id of the JSON documents is going to make for a suitable partition key here. Note that when using the id as the partition key, we end up with as many logical partitions as there are documents in the container, with each partition containing only a single document, and that's perfectly fine. Many users are concerned about this,
B
having this high number of logical partitions, but there's no need to worry. Logical partitions are a virtual concept, so Cosmos co-locates logical partitions on the same physical servers and then moves them to different physical servers when needed, and there's no upper limit to how many logical partitions you can have in a container. In fact, functionally, a single document per logical partition is a key-value store in Cosmos DB, and you can have key-value stores of basically infinite size in Cosmos, with an infinite number of logical partitions.
B
Okay. So let's do a demo or two here. In this demo, I'm just going to show you how to query for a customer by id and then return that customer's data, and I'll show you some code here as well. So I've got a little demo app all set up here, and we're gonna run this one, query customer, right here. So let me open that up. Here you can see I'm getting a reference to my database and my container. I'm going to look up this customer id.
B
So this is the id of the customer I'm going to get in there. I've got a SQL statement, so we're using Cosmos DB's SQL API, which has a very SQL-like syntax. It's not ANSI SQL. Keep in mind we're a JSON store, so our flavor of SQL is meant to work with JSON, not with fixed columnar data like you'd find in, like, SQL Server or Postgres or anything. Okay.
B
So next, then, I'm going to create what's called a query iterator, GetItemQueryIterator, and I'm going to pass in my new SQL definition there, and then I'm going to pass in a parameter, and then I'm going to create some request options here, and then I'm going to specify my partition key. Now, for Cosmos DB, you can do one of two things. You can...
B
You don't have to do this: if you pass in the value of the partition key in the WHERE clause, it will kind of suss it out and then use that to route the query. However, I'm calling it out here because it's kind of a best practice to put this in the request options, and this functionally is what is going to route the query to the correct physical partition in your database, so it knows where to go find that data.
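The demo itself is written against the .NET SDK, and its code is not part of this transcript. As a rough Python sketch of the same idea, the function below passes a parameterized query plus an explicit partition key. The keyword names `query`, `parameters`, and `partition_key` mirror my understanding of the `azure-cosmos` Python package's `query_items` method and should be treated as an assumption; `FakeContainer` is a hypothetical in-memory stand-in so the sketch runs without a database.

```python
def find_customer(container, customer_id):
    """Query one customer, scoping the query to a single partition."""
    items = container.query_items(
        query="SELECT * FROM c WHERE c.id = @id",
        parameters=[{"name": "@id", "value": customer_id}],
        partition_key=customer_id,  # routes the query to one partition
    )
    return list(items)


class FakeContainer:
    """Hypothetical in-memory stand-in for a container partitioned by id."""

    def __init__(self, docs):
        self.docs = docs

    def query_items(self, query, parameters, partition_key):
        wanted = {p["name"]: p["value"] for p in parameters}["@id"]
        # Partitioned by id, so the partition key and the id coincide.
        return [d for d in self.docs if d["id"] == wanted == partition_key]


container = FakeContainer([
    {"id": "c-1001", "firstName": "Ana"},
    {"id": "c-1002", "firstName": "Raj"},
])
result = find_customer(container, "c-1001")
print(result)
```

Passing the partition key explicitly, rather than relying on the WHERE clause alone, keeps the routing decision visible at the call site.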
B
So that's my query definition here, and then I'm just going to put this in a while loop and loop through the results here and call ReadNextAsync. That's going to fetch the first record, and then just print out my customer using a handy little print function with the Newtonsoft library there. And that's it: just loop through all. There's only one customer, but I'm still going to put this in a foreach just to show that function there.
B
Okay, so over here in my app, I've got my app right here, and I'm going to run menu item A, and I'm going to query for a single customer.
B
This will take a second because it's just starting the app. It hasn't connected to the database yet.
B
There we go. Okay, so there's our customer record. You can see there's the original customer data there: the title, first name, last name, email address, phone number, and creation date. Here's address in an array. Now, there's only one address in here, but there it is. And here's the password object that we had in our other table. And here you can see I returned the request charge of 2.83.
B
Now, this may be new to you as well, but Cosmos DB uses what we call request units per second, or RUs. This is how you measure throughput in Cosmos DB, and it's essentially a proxy for compute, I/O, and memory within there. So when you provision a new container in Cosmos, you have to specify some level of throughput, and the more throughput you provision, the more compute power you get, the more you can do, the more operations you can handle, the more you can
B
do. You also get more storage with that, because there's an implied level of storage for a certain amount of throughput within there. So knowing how much your operations cost is a way to figure out: okay, I'm going to run this operation 10 times a second, I know it costs 2.83 RU, and then add that up with other operations that I know I'm going to run at some level of concurrency.
B
Add all those up to figure out, just roughly, how much RU I need to provision in total for my database. Another thing I'll point out, too, is because throughput here is measured, or governed, per second, the longer you can amortize requests over time, the lower overall throughput you need. I often hear from customers, like: I want to run a batch job that's going to do a whole bunch of work, and I've got to provision an insane amount of throughput to do it. Why? Why do I...
B
Why is it so much to do that? And usually my answer is: can you stream the data rather than batch it? Because if you can stream it, you can process it over a longer period of time, and that's going to require less throughput, and you're going to pay less for it.
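The budgeting arithmetic described here is simple enough to sketch. In the example below, the 2.83 RU query at 10 operations per second comes from the talk; the second operation, the one-million-RU job, and the one-minute and one-hour windows are made-up numbers for illustration only.

```python
def required_rus(operations):
    """Sum rate (ops/second) times cost (RU/op) across a workload."""
    return sum(rate * cost for rate, cost in operations)


# (operations per second, RU per operation)
workload = [
    (10, 2.83),  # the customer query from the demo, run 10 times a second
    (2, 6.0),    # hypothetical second operation
]
total = required_rus(workload)
print(round(total, 2))


def rus_per_second_for_job(total_rus, seconds):
    """Throughput needed to finish a fixed amount of work in a time window."""
    return total_rus / seconds


job_rus = 1_000_000  # hypothetical total cost of a bulk job
batch = rus_per_second_for_job(job_rus, 60)     # crammed into one minute
stream = rus_per_second_for_job(job_rus, 3600)  # spread over an hour
print(round(batch, 2), round(stream, 2))
```

The same total work needs sixty times less provisioned throughput when spread over the longer window, which is the point of streaming instead of batching.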
So, anyway, that's just talking about throughput, something to know there, but we want to measure that here. So that's 2.83 for that request of that query. Now,
B
another thing I want to show you is, if you're doing a query where you know the partition key and the id for the data, you can do something that we call a point read, and it's part of the non-query operations for, say, inserting, updating, deleting, and reading. Let me show you that. So here I have a function called GetCustomer, and same thing: get my database and my container, same customer id, and I'm going to, right,
B
let's use this function here, which is called ReadItemAsync, and I'm going to pass it precisely two things: the id for the data I'm looking for, in this case customer id, and the partition key, which is the same; in this case it's customer id. So I'm going to go and get the same amount of data using a point read, and let's see how that runs. So run option B, and there it is, back. And look here at the bottom.
B
The request charge for this was just a single RU, and this is something we guarantee: any point read operation that you make for a kilobyte of data or less is always going to cost you a single RU. The reason is that this goes straight to our back end. We know the cost for a kilobyte of data, and so that's why we can guarantee it at a single RU. It's going to be much faster, much cheaper.
B
So if you have a lot of high-concurrency reads, if you can structure your data, or model your data and partition it, such that you can fetch those single items with a partition key and an id, that's what you should do, because it's going to give you the best performance, the best bang for the buck for your application.
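One way to picture why a point read is cheaper than a query is a toy key-value model: with both the partition key and the id in hand, the lookup is a direct fetch rather than a query that has to be evaluated. The store layout and function names below are hypothetical; the 1 RU and 2.83 RU charges and the rate of 10 reads per second are the figures quoted in the talk.

```python
# Toy store keyed by (partition key, id). For the customer container
# in the talk, both values are the customer id.
store = {
    ("c-1001", "c-1001"): {"id": "c-1001", "firstName": "Ana"},
    ("c-1002", "c-1002"): {"id": "c-1002", "firstName": "Raj"},
}


def point_read(store, item_id, partition_key):
    """Direct lookup: both parts of the address are known."""
    return store[(partition_key, item_id)]


def query_by_id(store, item_id):
    """Query-style access: evaluate a predicate against candidate documents."""
    return [doc for doc in store.values() if doc["id"] == item_id]


POINT_READ_RU = 1.0  # charge reported in the demo for the point read
QUERY_RU = 2.83      # charge reported in the demo for the equivalent query
reads_per_second = 10

saved = round(reads_per_second * (QUERY_RU - POINT_READ_RU), 2)
print(point_read(store, "c-1001", "c-1001"), saved)
```

At that rate the point read saves roughly 18 RU every second for the same data, and the gap widens as read concurrency grows.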
B
Are there questions as I get through this, Kellen? Or do people just hang out and then do it at the end?
A
Yeah, if anybody wants to ask any questions, that's pretty good. We only had one question, and it had to do with the stream: he was asking if there's a way to up the size of the text in your editor there, it seemed like, or the quality of the stream. But yeah, I'm monitoring for questions. I...
B
Oh yeah, this is... it's funny. I host a podcast, and I'm always telling people when I do tech checks, I'm like, you gotta make your stuff bigger. We'll make this bigger and, whoops, where'd you go? And then let me zoom up Visual Studio here, and we'll get that to, like, say, 120 maybe. How does that look? That should be better, yeah? That's right. Cool, cool, cool. All right, yeah, thank you.
B
Folks, sorry for running small text; I'll make it better. And I'm going at 1080p right now, so I can't go any higher. Okay, so that was our query customer demo. Let's move on to products, and we'll look at the product tables next. First up is the product category table. So we're going to do the same thing we did earlier, which is translating this into a JSON document, and we need to store that document in a container. We'll call that product category. And then next, we need to figure out its partition key here.
B
B
Product categories. And again, I want to point out, when you're looking at the operations you need to support, it's important to know the volume of those operations. For instance, we probably aren't creating or editing product categories very often, but just like with customer,
B
we need to list, or query, product categories quite frequently within our e-commerce application. But there's a problem we've got here: there's no WHERE filter on this query. So how do I make this thing a single-partition query? Well, again, we're gonna... oops, where are we? Okay. So what we're gonna do is use a little trick here: I'm gonna create a new property and give it a constant value in our document.
B
So here I've created a property called type, and I've given it a value of category. Now I can partition this container by type and then just set this value, category, in every document. Now, look, I know this looks a bit weird, but it actually makes sense. If you just stick with me, I'm going to iterate on this design through the rest of my talk here, and you'll see how this actually makes a whole lot of sense, and it's actually a really smart thing to do.
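The effect of the constant `type` property can be seen by grouping a few sample documents by their partition key value. The category names, ids, and the `group_by_partition_key` helper are hypothetical; only the `type` property and its value `category` come from the talk.

```python
from collections import defaultdict

# Hypothetical category documents; every one carries the same "type" value.
categories = [
    {"id": "cat-1", "name": "Bikes", "type": "category"},
    {"id": "cat-2", "name": "Helmets", "type": "category"},
    {"id": "cat-3", "name": "Tires and Tubes", "type": "category"},
]


def group_by_partition_key(docs, partition_key):
    """Group documents into logical partitions by their partition key value."""
    partitions = defaultdict(list)
    for doc in docs:
        partitions[doc[partition_key]].append(doc)
    return dict(partitions)


by_type = group_by_partition_key(categories, "type")
by_id = group_by_partition_key(categories, "id")
print(len(by_type), len(by_id))
```

Partitioned by `type`, the whole list sits in one logical partition, so listing all categories is a single-partition query. Partitioned by `id`, the same list would be spread over one logical partition per document and listing would fan out.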
B
Okay, let's do a demo here. In this demo, I'm going to show you a query for product categories, and then we'll measure the RU charge in our app. So back to my app here. I'm going to query product category... query products by category id, sorry. The thing I'm doing is listing all product categories; this is the one I'm going to run. Okay, so this should look familiar. I'm going to get here and scroll
B
this thing out of the way. Yeah, that's okay. So the top of this looks just the same as the others. I've got my database, my container, my query, and here you're going to see it's just basically hard-coded: SELECT * FROM c WHERE c.type = 'category'. And then I'm going to create my query iterator here, pass in the partition key, guess what, hard-coded category, and then iterate through the results here. I'm just going to print these out, and I'm going to print out the request charge.
B
So, let's list all product categories here. That's option C, and here you can see a bunch of product categories. I think there's 37 in this database, and there's my query request charge: 4.04. So not bad! That's, you know, a pretty good cost there. Just four RU to run that query.
B
So in a real-world scenario, you know, it's not like you would run this query every time, right? You would run this at startup and then cache it in memory or something, right? So you wouldn't worry too much about the cost of this query. But certainly you wouldn't want to be running it over and over and...
B
Okay, so next we're going to look at the product tags here, and I'm going to translate that into, of course, its JSON document format here, and we'll store it in its own container. We'll call that product tag. Pretty unique. Now, it turns out tags share the exact same access pattern as categories do, so we're just going to apply the same strategy here: I'm going to add a new property called type, give it a value of tag, and set that for each document. All right.
B
Next, moving on to the product table here. I'm going to translate it into its JSON, and next I want to look at the relationship from product to product tags. Our product table here has a many-to-many relationship
B
with product tags, and I need to access the product tags in my application, meaning when I display a product, I need to display the tags for it as well. I also would want to query for a product using its tags as well in here. Now, I could do this in one of two ways: I could store the product info in the product tags table, or I could materialize tags in my product table. Now,
B
given that there are far fewer tags per product than products per tag, it makes sense to materialize product tags and embed them in my product table, right? Because that's going to be a bounded array: you're not going to have a million product tags for a product, or hopefully you won't. So remember that one-to-few relationship: it's a good candidate for embedding tags into my product table here. And then next, we're gonna store products in its own container; we'll call that product. And then now we need to figure out a good partition key.
B
So again, we're gonna look at the operations here and decide on a partition key. We, of course, need to create and edit a product, but the interesting operation here is querying for a product by category, because this is likely, at least in our design, how customers are gonna search for products, or at least one primary way. So we need to list all products that match a specific category. So this corresponding query, SELECT * FROM c WHERE c.categoryId = 'category A', say, will return all the products for that category.
B
A
B
We're going to use category id as the partition key, so now every time I run that query, it's gonna be within a single partition. So I had another problem here, and that is that every time I query for products by category, I get a category id and I get a bunch of tag ids. But what I really wanna display is the category name and then a list of tag names for each product, as I render that out to the page.
B
B
B
[Cosmos DB doesn't] support joins across containers. Data that is modeled in this type of data store is optimized such that it can be served in a single request. So to our products table we're going to add additional properties, including the name of the category and also the name for each of the tags that are in there. By doing this, we make sure that we can retrieve all the data we eventually are going to need and return that to the client to render on the page, and do it in just a single request.
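The denormalization step can be sketched as a small function that copies the category name and tag names into the product document at write time. The lookup tables, ids, tag names, product fields, and the `denormalize_product` helper are all hypothetical; only the idea of storing names next to the ids comes from the talk.

```python
# Hypothetical lookup data standing in for the category and tag containers.
category_names = {"cat-7": "Components, Headsets"}
tag_names = {"tag-1": "Tag-A", "tag-2": "Tag-B", "tag-3": "Tag-C"}

product_row = {
    "id": "p-100",
    "categoryId": "cat-7",
    "sku": "HS-0296",
    "name": "Headset",
    "price": 34.2,
    "tagIds": ["tag-1", "tag-3"],
}


def denormalize_product(product, category_names, tag_names):
    """Copy category and tag names into the product document."""
    doc = {k: v for k, v in product.items() if k != "tagIds"}
    doc["categoryName"] = category_names[product["categoryId"]]
    # Bounded one-to-few array: each tag carries both its id and its name.
    doc["tags"] = [{"id": t, "name": tag_names[t]} for t in product["tagIds"]]
    return doc


product_doc = denormalize_product(product_row, category_names, tag_names)
print(product_doc)
```

A page that lists products for `cat-7` can now render category and tag names from the product documents alone, without a second lookup.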
B
Okay, so another demo here. We're gonna see what this looks like when we add the category and the tag names into our product container. So let's run this here. Go back to... so here's the query I'm gonna run: I'm gonna query products by category id here. So get the reference to my database and container, and then here's the category name
B
I'm going to query on; this is Components, Headsets. And then here's my query: SELECT * FROM c WHERE c.categoryId = @categoryId. And again, you're gonna create your query definition here, pass in your partition key, and then we'll loop through the results, and then I'll show the request charge for that. So let's run that one. That's option D here: query products by
B
B
And here you can see I've got one, two, three products. So here's the category id and category name that came back here, and then the SKU, the name, the description, the price, and then here's an array of tags, where I've got the tag id and the name for each of them in there. So this one's got three tags, this one has two, this one's got five in there, and our request charge for that was a little less than three: 2.91. So that's a pretty efficient query in there. And Enter, go back to there.
B
Okay, so now, when we create a new product, we need to populate these additional properties. But what if we rename a category or a tag? How do we manage that referential integrity between the containers?
B
Well, guess what: in a NoSQL database, you still have to maintain referential integrity between data, right? Data can change, and you need to be able, or wanna be able, to reflect that in other places in there. And it turns out Cosmos actually has a way to handle this, and it's called change feed. Now, change feed is an API that lives within every Cosmos DB container. Actually, technically it lives within every physical partition, but just kind of ignore that, for...
B
[as far as you] are concerned, you access it through the container reference. So whenever data is written to Cosmos DB, such as an insert or an update, change feed streams these to a delegate that you can listen to and then use that event to respond to data that was changed.
B
Okay. So in our case we have to listen to changes that occurred to our product category container, as well as our product tags container, and every time that data is updated, we'll also propagate those changes to the product container accordingly.
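The propagation logic a change feed delegate would run can be modeled without the service at all. In the sketch below, `on_category_changes` is a hypothetical handler that receives a batch of changed category documents and copies the new name into every matching product; the ids and product names are invented, and the rename to an ampersand follows the demo described later in the talk.

```python
# Hypothetical product documents holding a denormalized category name.
products = [
    {"id": "p-1", "categoryId": "cat-9", "categoryName": "Accessories, Tires and Tubes"},
    {"id": "p-2", "categoryId": "cat-9", "categoryName": "Accessories, Tires and Tubes"},
    {"id": "p-3", "categoryId": "cat-7", "categoryName": "Components, Headsets"},
]


def on_category_changes(changes, products):
    """Hypothetical change handler: push renamed categories into products."""
    updated = 0
    for change in changes:
        for product in products:
            same_category = product["categoryId"] == change["id"]
            if same_category and product["categoryName"] != change["name"]:
                product["categoryName"] = change["name"]
                updated += 1
    return updated


# A batch as the handler might receive it after a category is renamed.
changes = [{"id": "cat-9", "name": "Accessories, Tires & Tubes", "type": "category"}]
updated = on_category_changes(changes, products)
print(updated)
```

Because the handler skips products whose name already matches, replaying the same batch is harmless, which is a useful property for any listener that may see a change more than once.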
Okay, so I'm going to do a demo on this. So in this demo,
B
...to use change feed to do this. First, I'm going to query the product container for a specific category, and then we'll see how many products are in that category. I'll then update that category's name in the product category container, and then we'll show how change feed picks up those changes and then propagates that to every product in the category. All right, so back into here. So let me show you some code. So I'm going to query products for category here. All right.
B
So
here's
my
category,
Accessories,
Tires
and
Tubes,
and
then
I'm
going
to
do
a
count
right
in
my
product
container
and
pass
in
that
category
id
and
then
do
a
group
by
on
the
category
name
in
there.
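A count with a group by, as described, might look like the query text below. The property names `categoryId` and `categoryName` are assumptions, and the Python function is only an in-memory equivalent for illustration, not a call into any SDK.

```python
from collections import Counter
from typing import Dict, List

# Query text along the lines described in the talk; property names are assumed.
COUNT_BY_CATEGORY = (
    "SELECT COUNT(1) AS productCount, c.categoryName "
    "FROM c WHERE c.categoryId = @categoryId "
    "GROUP BY c.categoryName"
)


def count_by_category(products: List[Dict[str, str]], category_id: str) -> Dict[str, int]:
    """In-memory equivalent of the query above: filter by id, group by name, count."""
    names = (p["categoryName"] for p in products if p["categoryId"] == category_id)
    return dict(Counter(names))


products = [
    {"id": "p1", "categoryId": "cat-1", "categoryName": "Accessories, Tires and Tubes"},
    {"id": "p2", "categoryId": "cat-1", "categoryName": "Accessories, Tires and Tubes"},
    {"id": "p3", "categoryId": "cat-2", "categoryName": "Bikes, Road Bikes"},
]
counts = count_by_category(products, "cat-1")
```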
Okay
and
then
that's
the
first
thing,
I'm
gonna
run.
So
let's
do
that
so
query
products
by
category
id.
That's
option
d.
A
B
Whoops
wrong
one:
oh
that's
going
to
run
that
okay!
This
is
all
going
to
run
as
a
single
function.
Next
thing
I
want
to
do
is
I
want
to
update
the
product
category
name,
so
I
have
a
new
thing
here.
A
new
function
here
same
category
name,
and
what
I'm
going
to
do
is
I'm
going
to
just
make
a
small
change.
I'm
going
to
replace
the
word
"and"
and
put
an
ampersand
into
the
category
name.
B
Category
id
I'm
then
going
to
call
a
function
called
ReplaceItemAsync.
This
is
functionally
an
update
for
cosmos
and
it
takes
three
things:
it
takes
the
partition
key
which,
by
the
way,
all
of
these
functions
take
a
partition
key.
So,
like
the
read
item
async,
insert
item,
replace
item
and
delete,
yes,
and
then
it's
gonna
take
the
id,
which
is
the
category
here
and
then
the
item
itself
right.
So
that's
the
updated
product
category
object
that
I
created
here.
B
Change
feed
and
what
that
looks
like
so
here's
another
little
project
I
have,
and
it's
sitting
here
running
just
listening,
and
this
is
the
code
for
change
feed
in
here.
So
let
me
walk
you
through
this,
so
there's
a
couple
of
things
you
need
to
know.
First
is
change
feed
uses
this
thing
we
call
a
leases
container.
It's
basically
a
checkpoint
for
changes
that
have
been
read
off
of
the
container.
So
when
change
feed
runs
and
it
polls
about
every
second,
it
goes
to
the
leases
container.
B
It
says,
hey
tell
me
the
last
thing
I
read
in
this
container
and
if
it
says
okay,
you
read
this
and
there's
a
difference
between
that
and
what's
in
there
it'll
say:
okay,
here's
some
more
changes,
and
it
then
sends
them
to
the
change
feed
to
this
delegate
that
we've
called
input
here
and
that's
basically
going
to
send
it
as
a
read-only
collection
and
we're
casting
that
or
deserializing
that
to
a
specific
type.
The
reason
it's
a
collection
is
because
there
may
be
more
than
one
change.
B
That's
occurred
since
the
last
time
you
read,
so
it
just
calls
a
delegate
with
this
collection
full
of
changes
in
there
and
then
you
iterate
or
loop
through
all
those
changes
and
then
do
something
with
them.
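The checkpoint behaviour described above can be sketched as a poll that reads past a stored position and then advances it. This is a toy model under stated assumptions: `poll_once`, the `leases` dictionary and the lease id are hypothetical, and the real leases container stores more than a single integer.

```python
from typing import Callable, Dict, List


def poll_once(change_log: List[dict], leases: Dict[str, int], lease_id: str,
              delegate: Callable[[List[dict]], None]) -> int:
    """Read changes past the stored checkpoint, hand them over as one batch, then advance it."""
    checkpoint = leases.get(lease_id, 0)
    batch = change_log[checkpoint:]
    if batch:
        delegate(batch)
        leases[lease_id] = checkpoint + len(batch)
    return len(batch)


change_log: List[dict] = [{"id": "c1", "name": "Tires and Tubes"}]
leases: Dict[str, int] = {}
batches: List[List[dict]] = []

first = poll_once(change_log, leases, "worker-1", batches.append)
second = poll_once(change_log, leases, "worker-1", batches.append)  # nothing new yet
change_log.append({"id": "c1", "name": "Tires & Tubes"})
change_log.append({"id": "c2", "name": "Helmets"})
third = poll_once(change_log, leases, "worker-1", batches.append)  # two changes in one batch
```

The third poll shows why the delegate receives a collection: more than one change can accumulate between reads.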
So
basically,
this
is
kind
of
the
boilerplate
for
a
change
feed
in
here
and
what
I'm
gonna
do.
B
Is
I'm
going
to
create
a
new
list
of
task
objects
here
and
then,
as
I
foreach
through
every
product
item
or
product
category
item
in
this
collection,
I'm
going
to
grab
its
category
id
and
its
name
right.
So
this
is
going
to
be
the
new
name
when
this
thing
comes
in
and
then
I'm
just
going
to
add
to
that
task
list
another
function
here,
that's
going
to
update
the
product
category
name.
B
So
let's
look
at
that
and
then
I'm
gonna
just
go
call
WhenAll
on
this,
right,
because
the
change
feed
you
can
process
lots
of
different
stuff
at
the
same
time.
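The talk uses a list of tasks and WhenAll in .NET. A rough Python analogue, offered only as a sketch, is `asyncio.gather`. The function `update_category_name` here is a hypothetical stand-in that does no real I/O.

```python
import asyncio
from typing import List, Tuple


async def update_category_name(category_id: str, name: str) -> Tuple[str, str]:
    """Hypothetical stand-in for the per-category update; it only yields control and echoes its input."""
    await asyncio.sleep(0)
    return category_id, name


async def handle_changes(changes: List[dict]) -> List[Tuple[str, str]]:
    # Build one awaitable per changed category, then wait for all of them together.
    tasks = [update_category_name(c["id"], c["name"]) for c in changes]
    return await asyncio.gather(*tasks)


results = asyncio.run(handle_changes([
    {"id": "c1", "name": "Tires & Tubes"},
    {"id": "c2", "name": "Helmets"},
]))
```

`asyncio.gather` returns results in the order the awaitables were passed, regardless of which finishes first.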
So
you
just
basically
set
it
on
a
task
and
let
it
go
all
right
so
here
below
I've
got
another
function
here.
This
is
update
product
category
name.
Now
I
need
to
update
all
the
products
with
that
category
name.
B
So
the
first
thing
I'm
going
to
do
is
I'm
going
to
write
a
query:
SELECT
*
FROM
c
WHERE
c.categoryId
equals
that
category
id
I've
passed
in
and
then
I
have
another
reference
to
a
product
container
here.
So
let
me
go
to
the
top.
The
very
top
I've
got
two
container
references.
I've
got
my
product
category
container,
which
is
what
I'm
listening
to
here
right,
so
GetChangeFeedProcessorBuilder
is
called
off
the
container
that
you
want
to
listen
to,
and
then
I
have
another
container
here.
B
This
is
my
product
container.
This
is
the
thing
I'm
going
to
go
do
something
with.
Okay,
so
down
here.
I'm
going
to
first
do
a
query
against
that
container,
because
I
want
to
retrieve
all
the
products
for
that
category
and
then
I'm
going
to
loop
through
them
all
here
right
so
just
like
I
was
using
this
foreach
to
loop
through
to
print
them
out.
I'm
now
going
to
use
it
to
update
the
name
of
the
product
category
for
each
product
that
gets
returned
in
my
query.
B
Right,
so
I'm
going
to
count
this
too,
so
you
can
see
how
many,
how
many
times
it
does
it
so
here
I'm
going
to
change
in
my
return
product
doc
object
here.
Category
name
is
going
to
be
the
category
name
that
gets
passed
into
the
function
and
then
I'm
going
to
call
ReplaceItemAsync
and
I'm
going
to
update
every
product
in
my
collection
or
my
container
with
that
new
category
name:
okay,
all
right!
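The propagation step just walked through (query the products in the category, set the new name, replace each one) can be sketched in memory as follows. The dictionary stands in for the product container, and the property names are assumptions.

```python
from typing import Dict


def update_product_category_name(products: Dict[str, dict], category_id: str,
                                 category_name: str) -> int:
    """Rewrite categoryName on every product in the category; return how many were updated."""
    # Equivalent of: SELECT * FROM c WHERE c.categoryId = @categoryId
    matches = [p for p in products.values() if p["categoryId"] == category_id]
    for product in matches:
        updated = {**product, "categoryName": category_name}
        products[updated["id"]] = updated  # stands in for a replace call
    return len(matches)


product_container = {
    "p1": {"id": "p1", "categoryId": "cat-1", "categoryName": "Accessories, Tires and Tubes"},
    "p2": {"id": "p2", "categoryId": "cat-1", "categoryName": "Accessories, Tires and Tubes"},
    "p3": {"id": "p3", "categoryId": "cat-2", "categoryName": "Bikes, Road Bikes"},
}
updated_count = update_product_category_name(
    product_container, "cat-1", "Accessories, Tires & Tubes")
```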
So
let's
close
everything
up
here.
Actually,
let's
just
go
back
to
here.
Okay,
so
here
we
go.
B
Referential
integrity
maintained.
Now,
let's
change
it
back,
so
you
can
see
it
one
more
time
here
you
can
see
it
picked
it
up
right
back
to
accessories,
tires
and
tubes,
and
that's
it.
Okay.
Let's
go
back
to
slides.
Any
questions,
I
guess,
at
this
point?
A
A
I
I
think
I
I
think
it's
looking
really
really
streamlined
and
really
cool,
so
I'm
definitely.
B
Talk
and
I've,
given
it
a
bunch
of
times
because
a
lot
of
customers
don't
understand
these
concepts
and
it's
absolutely
critical
to
be
successful
using
this
database
or
this
type
of
database.
Understanding
the
concepts
and
the
techniques
for
modeling
data
is
the
only
way
to
get
what
you
need
out
of
this
thing
I
mean
the
promise
for
a
nosql
distributed
database
is
functionally.
B
You
can
get
theoretically,
unlimited
scale
right,
it's
just
it's
a
scale-out
database,
so
you
just
keep
adding
servers
to
it
in
a
relational
world
like
a
sql
server.
There's
a
limit,
there's
a
four
terabyte
limit
for
an
unsharded
database
in
sql
right
by
the
way.
Sharding
is
the
same
exact
thing
right.
We
call
it
partitioning,
but
you're.
B
Basically,
it's
a
scale
out
right
and
then
you
have
to
figure
out
which
data
goes
on
which
shard
or
partition.
We're
doing
that
already
here
right,
we
try
to
make
it
as
simple
as
possible
by
saying
pick
a
property
in
your
data
and
use
that
to
distribute
your
data
to
these
different
physical
partitions
or
shards
within
there.
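The idea of using one property to decide where data lives can be sketched with a stable hash. This is illustrative only and assumes a simple modulo scheme; it is not how Cosmos DB actually assigns hash ranges to physical partitions, and `physical_partition` is a hypothetical name.

```python
import hashlib


def physical_partition(partition_key_value: str, partition_count: int) -> int:
    """Map a partition key value to a partition index with a stable hash (illustrative only)."""
    digest = hashlib.sha256(partition_key_value.encode("utf-8")).digest()
    return int.from_bytes(digest[:8], "big") % partition_count


placements = {
    key: physical_partition(key, 4)
    for key in ("customer-a", "customer-b", "customer-c")
}
```

Because the hash is deterministic, knowing the partition key value is enough to know which partition to ask, which is what makes a point read cheap.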
Right
and
because
it's
distributed
out
like
that
accessing
that
data
is
always
going
to
be
fast
if
you're
doing
something
like
a
point
read
right,
because
it's
just
you
need
to
know
the
partition
key.
B
So
I
know
the
address
of
that
server,
so
I
can
go
get
it
so
that
you
know
we're
unique
in
the
fact
that
we're
the
only
database
in
azure
that
has
an
sla
on
latency.
So
for
a
read
or
write
of
a
kilobyte
of
data
or
less
using
our
direct
mode
api
using
our
sql
api,
we
have
a
sla
of
less
than
10
milliseconds
at
p99,
so
99
out
of
every
100
requests.
We
guarantee
are
going
to
be
less
than
10
milliseconds
and
the
p50
of
that
is
generally
about
four
or
five
milliseconds.
B
So
it's
generally
quite
quite
fast,
and
that's
true
whether
or
not
your
database
is
a
megabyte
in
size
or
a
petabyte
in
size.
It
doesn't
matter
because
we're
scale
out
right,
your
data
is
just
spread
out
amongst
all
these
different
servers,
you
provide
the
partition
key
and
the
id
to
get
your
data
and
we'll
get
it
for
you
and
return
it
in
less
than
10
milliseconds.
So.
A
B
Yep,
absolutely,
in
fact,
this
is
what
we
tell
customers:
they
need
to
prove
this
out
when
they
go
from
dev
test
to
prod.
You
need
to
put
serious
load
on
your
database
because
you
need
to
know
if
it's
actually
truly
going
to
scale
right.
This
is
the
trap,
so
people
that
don't
understand
these
concepts
and
modeling
and
partitioning
they
go
and
they
dev
test
their
thing.
B
They
go
into
prod
and
then
a
month,
six
months
a
year
later,
they
realize
that
they
made
a
poor
choice
and
the
problem
is
that
you
can't
go
back
and
change
the
partition
key,
because
it's
actually
physically
assigning
data
to
where
it's
physically
stored.
So
you
need
to
go
and
create
a
new
container
pick
a
better
partition
key,
hopefully,
and
then
copy
your
data
from
one
container
to
the
other
and
then
switch
it
over.
B
It
can
be
very
painful
for
customers,
so
having
this
knowledge
up
front
makes
you
a
winner
in
the
end.
So,
okay,
let's
move
on
to
our
last
set
of
entities
here
with
the
sales
order,
do
the
same
exact
thing
here:
we're
going
to
create
json
documents.
B
Out
of
these
guys-
and
of
course
this
is
also
a
good
candidate
for
embedding,
because
it's
a
one-to-few
relationship
right
you're
not
going
to
have
a
sales
order
with
a
billion
items
in
there
unless
you're
crazy
and
can't
stop
buying.
Next,
store
it
in
its
own
container;
we'll
call
that
sales
order
and
let's
look
at
the
operations,
because
we
need
to
decide
on
a
partition
key
here.
B
So,
of
course
we
need
to
create
a
sales
order
and
then
we
need
to
list
all
sales
orders
for
a
customer
in
here.
So
that's
what
this
query
looks
like
here:
we're
going
to
select
star
from
c,
where
c
dot
customer
id
equals
customer
a.
So
if
we
partitioned
by
customer
id
here,
this
would
make
it
a
single
partition
query,
which
is
good.
I
think
that
makes
sense
right.
So
this
would
be
quite
a
frequently
run
operation.
You
know
listing
all
the
sales
orders
for
your
customer
as
well
in
there.
B
It
doesn't
impact
creating
a
sales
order
and
there
that
kind
of
makes
sense
as
well.
So
I
think
for
this
thing
here
we're
going
to
use
customer
id
as
our
partition
key
here.
Okay,
before
we
go
any
further,
I
want
to
take
a
step
back
when
looking
at
the
containers
we've
designed
so
far.
It's
interesting
to
note
that
we
already
have
a
container
that's
partitioned
by
the
customer's
id
and
that's
the
customer
container
itself.
So
could
we
store
say
the
customer
record
and
the
sales
orders
in
the
same
container?
B
Yes,
absolutely
we
can
do
that.
Not
only
is
this
technically
possible
in
a
database
like
cosmos
db,
this.
B
For
this
type
of
database,
cosmos
is
schema
agnostic
right,
it's
a
nosql
database,
so
it
does
not
enforce
schema
at
the
database
level.
So
this
is
something
that's
totally
supported
and
it's
also
quite
suitable
when
data
shares
similar
access
patterns
and,
of
course,
shares
the
partition
key,
and
that's
true
here
as
well.
You
know,
the
customer's
gonna
log
in
so
you're
gonna
access
the
customer
data
and
then,
of
course,
the
customer
is
going
to
access
their
sales
orders
or
create
a
new
sales
order.
B
So
in
the
case
here
it
makes
sense
to
store
these
things
in
the
same
collection.
So
what
we'll
do
is,
instead
of
storing
these
in
different
containers,
we're
going
to
store
everything
in
a
customer
container
here
now
I've
got
to
make
some
other
changes
here
as
well.
I'm
going
to
change
the
partition
key
from
id
to
customer
id
and
I'm
going
to
add
customer
id
to
the
customer
object
or
the
customer
document
itself.
B
Another
thing
I
need
to
do
is:
I
need
a
way
to
be
able
to
distinguish
between
a
sales
order
and
a
customer
within
the
container,
so
I'm
going
to
add
what
we
call
a
discriminator
property
or
this
type
property,
and
I'm
going
to
give
it
a
value
of
customer
or
sales
order
for
each
of
those
different
entities,
and
this
will
allow
me
to
query
for
them
individually.
B
If
I
want
here,
okay,
so
now,
what
we
have
is
a
customer's
container,
where
each
logical
partition
will
contain
exactly
one
customer
row
and
then
all
that
customer's
sales
orders
within
here
and
so
now
to
get
all
the
sales
orders
for
our
customer.
I
just
simply
run
with
this
updated
query
here,
so
this
is
select
star
from
c,
where
c
customer
id
equals
customer
a
and
c
type
equals
sales
order.
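The discriminator approach can be sketched with a few documents sharing one container. Property names (`customerId`, `type`) and the value `salesOrder` are assumptions, and the Python filter is only an in-memory equivalent of the query text.

```python
from typing import Dict, List

# Query text along the lines described; property names and values are assumed.
SALES_ORDERS_FOR_CUSTOMER = (
    "SELECT * FROM c "
    "WHERE c.customerId = @customerId AND c.type = 'salesOrder'"
)

customer_container: List[Dict[str, str]] = [
    {"id": "A", "customerId": "A", "type": "customer"},
    {"id": "so-1", "customerId": "A", "type": "salesOrder"},
    {"id": "so-2", "customerId": "A", "type": "salesOrder"},
    {"id": "B", "customerId": "B", "type": "customer"},
]


def sales_orders_for(docs: List[Dict[str, str]], customer_id: str) -> List[Dict[str, str]]:
    """In-memory equivalent: same partition key value, filtered by the type discriminator."""
    return [d for d in docs
            if d["customerId"] == customer_id and d["type"] == "salesOrder"]


orders = sales_orders_for(customer_container, "A")
```

Dropping the `type` filter would return the customer document and its orders together, which is the single-partition read used later in the talk.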
A
B
For
a
number
of
reasons,
speed
is
one,
so
we're
the
only
database
that
provides
a
latency
SLA.
We're
also
the
only
database
that
has
five
nines
of
availability,
and
that
is
because
we're
distributed
in
that
we
run
in
every
single
region.
So
we
can
survive
regional
outages
with
very
low
rpo
factor,
minimum
rpo
of
five
minutes
or
you
can
have
an
rpo
of
zero.
B
This
is
good
in
a
couple
of
ways:
one,
not
only
do
you
get
rto0,
because
what
happens
is
if,
if
the
primary
region
stops
responding,
the
SDK
client
will
automatically
redirect
the
request
to
a
secondary
region,
and
it
does
it
within
30
seconds
and,
frankly,
probably
does
it
even
faster
than
that.
B
So
you
have
zero
downtime
or
rto0
with
that.
Other
reasons
you
would
use
it
as
well.
I
mean
just
because
your
data
may
not
be
structured,
doesn't
mean
that
this
is
a
bad,
or
this
is
a
database
that
you
couldn't
use.
There's
plenty
of
customers
that
have
very
structured
data,
but
they
have
higher
requirements
for
the
latency,
the
availability,
those
sorts
of
things.
So
that's,
I
think,
one
of
the
primary
reasons
that
we
see
customers
using.
A
B
I
mean
consider
that
cosmos
was
born
in
azure
right,
I
mean
SQL
as
a
database
has
been
around
since
1970,
so
it's
51
years
old
now.
Now,
granted,
you
know,
back
then,
when
sql
was
created,
you
know
the
cost
of
a
megabyte
of
storage
was
like
a
hundred
thousand
dollars
or
something
like
that.
Right
now,
storage
is
cheap.
I
mean
I
got
a
terabyte
of
storage
on
my
phone
and
that
just
cost
me
a
few
hundred
bucks.
What's
really
expensive
B
Now
is
the
cost
of
compute
relative
to
storage,
it's
very
expensive
and
cosmos
being
a
relatively
new
database,
it's
only
been
around
since
2015
or
2016,
is
a
NoSQL
database
designed
to
optimize
around
request
or
the
compute
end
of
it,
which
is
why
Cosmos
is
fine
with
duplicating
data
in
your
account,
because
the
cost
of
storage
fundamentally
is
cheap
in
there.
What
we
want
to
do
is
design
our
database
such
that
it
serves
data
exactly
as
it's
needed
by
your
application
with
as
minimal
changes
to
it
as
possible.
B
So
you
want
to
optimize
around
the
request
with
a
database
like
cosmos,
and
you
do
that
by
taking
advantage
of
the
fact
that
it's
schemaless
or
schema
agnostic.
You
have
schema
right
like
if
I
go
into
my
class
here
models.
Here's
my
schema
for
my
database
right.
I've
got
a
customer
object,
customer
address,
location
object,
password,
product,
right.
This
is
where
you
enforce
your
schema
is
you
do
it
at
the
application
level,
and
this,
of
course
gives
you
flexibility,
and
if
you
want
to
change
it,
you
certainly
can.
B
As
well
in
there.
With
something
like
sql
database
you're
going
to
have
downtime
as
you
go
into
ALTER
TABLE
and
add
additional
properties
to
it.
So
there's
lots
of
reasons;
there's
no
one
reason,
but
you
don't
need
to
have
unstructured
data
to
use
a
database
like
cosmos
like
I
said,
if
you
have
insane
needs
around
availability
or
latency
or
something
else.
This
is
also
a
good
choice.
B
Okay,
any
more
questions.
A
That's
the
only
one
yeah
that
was
that's
pretty
good.
I
could
see
where
you'd
have,
that
that
massive
performance
gain
and
and
of
course
weighing
the
cost
of
the
storage
versus
the
compute
so
yeah
it
definitely
makes
sense.
B
That
and
you
know
the
other
thing
too-
is
it's
conscious
to
not
have
kind
of
enforced
relational
constraints
right.
We
want
to
be
write-optimized
so
that
you
know
there's
nothing
blocking
when
you're
trying
to
write
data
to
the
database
within
there.
So,
I
mean,
we
physically
could
enforce
relational
constraints
across
physical
partitions,
but
here's
the
problem
is,
if
say,
a
physical
partition
went
down
and
by
the
way,
there's
four
physical
partitions.
For
every
time
you
write
data,
we
store
data
in
four
different
replicas
and
four
different
pieces
of
compute.
B
So
we
keep
multiple
copies
of
your
data
for
availability,
because
if
one
of
those
replicas
goes
down,
you
still
got
three
more
and
every
time
you
do
a
write
in
Cosmos,
you
write
into
three
replicas
and
then
it
copies
over
to
the
fourth
one.
So
that's
just
additional
bulletproofing,
if
you
will
and
within
region
right
so
forget
about
replicating
to
another
region
where
we
do
the
same
thing
again.
Every
time
you
write
data
into
cosmos,
it's
stored
four
different
times,
okay
and
then
that,
and
that
just
gives
you
additional
availability.
B
So
what
I'm
saying
is
that
if
you
had
to
enforce
relational
constraints
across
physical
servers
and
one
of
those
servers
failed
well,
then
your
availability-
all
up-
is
gone
in
that
sense.
So
it's
a
conscious
decision
to
not
enforce
that
type
of
thing,
but,
like
I'm
showing,
to
maintain
referential
integrity,
you
can
totally
do
it.
You
just
need
to
know
what
kind
of
technologies
to
use
and
techniques
around
that
so,
okay,
let's
keep
going
I'm
getting
close
to
the
end
here.
Finally,
I
want
to
look
at
this
last
request.
B
We
need
to
serve
so
what
we
want
to
do
is
we
want
to
query
our
top
10
customers
by
the
number
of
sales
orders
they've
got
so
this
request
means
you
have
to
count
the
number
of
sales
orders
for
each
customer
and
then
sort
those
in
descending
order.
Then
return
the
first
10
that
come
back
out
of
that
now,
even
though
customers
and
sales
orders
sit
in
the
same
logical
partition,
this
isn't
actually
a
query
I
could
do
with
cosmos
db,
at
least
today.
B
Aggregate
and
we're
going
to
store
that
in
the
customer
entity
within
our
customer
container,
so
I've
got
this
new
property
here,
sales
order
count,
and
I'm
going
to
store
it
in
there.
B
So
what
we
want
to
achieve
is
that
every
time
I
add
a
new
sales
order
into
my
customer
container,
I'm
going
to
increment
this
sales
order
count
on
my
customer
object,
and
here
we
can
benefit
from
the
fact
that,
because
customers
and
sales
orders
sit
in
the
same
logical
partition,
we
can
use
transactions
actually,
so
cosmos
does
support
transactions,
which
is
a
very
relational
concept,
because
the
data
sits
in
the
same
logical
partition.
B
So
remember,
it's
a
conscious
decision
not
to
do
these
types
of
relational
constraints
across
partitions,
but
because
it's
in
the
same
partition
we
don't
have
the
same
issues
with
regards
to
the
loss
of
availability
or
other
kinds
of
weirdness.
That
can
happen
when
you're
trying
to
do
these
kind
of
distributed
transactions
across
these
different
pieces
of
compute,
and
so
we
can
do
this
in
a
transaction.
Now
in
cosmos,
you
could
do
transactions
one
of
two
ways.
B
You
can
use
a
stored
procedure
which
is
written
in
javascript
and
I'm
not
a
huge
fan
of
that,
but
we
also
have
a
way
to
support
this
through
our
sdks
in
both
the
Java
and
.NET
SDKs,
using
this
feature
called
transactional
batch.
Okay.
So
now
what
I
can
do
is
I
can
write
a
query
that
looks
like
this,
so
I'm
going
to
select
top
10
from
c,
where
c
dot
type
equals
customer
and
I'm
going
to
do
order
by
on
my
sales
order,
count
in
descending
order.
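With the count stored on each customer document, the top-10 request becomes a plain sort. The query text below is a sketch with assumed property names (`firstName`, `lastName`, `salesOrderCount`, `type`), and the Python function is an in-memory equivalent only.

```python
from typing import Any, Dict, List

# Query text along the lines described; property names are assumed.
TOP_CUSTOMERS = (
    "SELECT TOP 10 c.firstName, c.lastName, c.salesOrderCount "
    "FROM c WHERE c.type = 'customer' "
    "ORDER BY c.salesOrderCount DESC"
)


def top_customers(docs: List[Dict[str, Any]], limit: int = 10) -> List[Dict[str, Any]]:
    """In-memory equivalent: keep customer documents, sort by the stored count, take the first `limit`."""
    customers = [d for d in docs if d["type"] == "customer"]
    return sorted(customers, key=lambda d: d["salesOrderCount"], reverse=True)[:limit]


docs = [
    {"id": "a", "type": "customer", "salesOrderCount": 2},
    {"id": "b", "type": "customer", "salesOrderCount": 5},
    {"id": "so-1", "type": "salesOrder", "customerId": "b"},
    {"id": "c", "type": "customer", "salesOrderCount": 3},
]
top_two = [d["id"] for d in top_customers(docs, limit=2)]
```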
B
I'm
going
to
query
my
customer
and
sales
order
id
I'm
going
to
call
this
function
here.
First,
all
right-
and
I
got
my
customer
id
here.
This
is
the
same
one
I
was
querying
earlier
and
then
notice
in
here,
in
my
query,
I'm
not
using
the
type
property.
So
I'm
going
to
get
back
the
customer
record
and
I'm
going
to
get
back.
B
Now
before
I
was
deserializing
these
queries
into
specific
classes,
because.
A
B
To
use
dynamic
type
when
I
deserialize
this
data,
because
I'm
getting
different
types
of
objects
in
here,
so
the
rest
of
this
all
looks
the
same
right
and
then
I've
got
a
customer
object
here
and
then
I'm
going
to
create
a
list
of
sales
orders
and
then,
as
I
iterate
through
each
of
the
results
here,
I'm
going
to
inspect
the
type
property
and
if
it's
of
type
customer,
then
I'll
deserialize
that
into
a
customer
object
here
and
if
it's
type
sales
order,
I'm
going
to
do
orders.Add
and
then
deserialize
it
into
that
there,
okay
and
then
I'm
going
to
print
this
out.
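Inspecting the type property before deserializing can be sketched as below. The `Customer` and `SalesOrder` classes and the property names are hypothetical, standing in for whatever model classes the demo project uses.

```python
from dataclasses import dataclass
from typing import Any, Dict, List, Optional, Tuple


@dataclass
class Customer:
    id: str
    sales_order_count: int


@dataclass
class SalesOrder:
    id: str
    customer_id: str


def split_results(rows: List[Dict[str, Any]]) -> Tuple[Optional[Customer], List[SalesOrder]]:
    """Route each mixed-type row to the right class based on its discriminator."""
    customer: Optional[Customer] = None
    orders: List[SalesOrder] = []
    for row in rows:
        if row["type"] == "customer":
            customer = Customer(id=row["id"], sales_order_count=row["salesOrderCount"])
        elif row["type"] == "salesOrder":
            orders.append(SalesOrder(id=row["id"], customer_id=row["customerId"]))
    return customer, orders


customer, orders = split_results([
    {"id": "A", "type": "customer", "customerId": "A", "salesOrderCount": 2},
    {"id": "so-1", "type": "salesOrder", "customerId": "A"},
    {"id": "so-2", "type": "salesOrder", "customerId": "A"},
])
```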
B
That
door's
been
sitting
there
a
while
okay.
So
let
me
query
for
the
customer
and
all
their
orders.
That's
number
g
here,
okay,
so
I've
got
product
here
a
product
here
and
then
what
I'm
supposed
to
show
you
is
sales
order
count
of
two
right
here:
okay,
so
there's
my
denormalized
aggregate
that
I've
got
so
that's
that
query
now,
I'm
going
to
create
a
new
order
and
update
the
customer
item
total.
So
let
me
show
you
that
code
here
get
this
back
to
normal.
B
Okay,
so
here's
my
customer
here
now-
the
first
thing
I'm
going
to
do
is
I
need
to
fetch
my
customer
and
I'm
going
to
do
that
using
my
point
read
that
I
showed
earlier
right,
so
I'm
going
to
pass
the
id
is
the
customer
id
and
then
the
partition
key
is
the
customer
id.
Remember,
before
we
created
a
new
property
customer
id.
That's
a
partition
key,
but
for
the
customer
record
it's
the
id
right.
So
I
don't
need
to
specify
something
here.
B
I
know
that
id
is
exactly
the
same
as
the
partition
key
within
there
and
I'm
going
to
save
that
into
my
customer
object
or
deserialize
that
out
of
the
response
object
here
and
then
I'm
going
to
increment
sales
order
count
right
here.
Okay,
so
just
salesOrderCount++.
Now
I'm
going
to
create
a
new
dummy
order
here.
So
I've
got
a
GUID;
I'd
normally
use
NewGuid
here
and
then
here's
a
new
sales
order.
B
So
I
need
to
create
sales
order
and
I
need
to
specify
the
type
right,
because
I
need
that
discriminator
property
pass
in
my
customer
id.
I
got
an
order
date
and
then,
of
course,
a
blank
ship
date
because
we
haven't
shipped
yet
and
then
I've
got
a
couple
of
products
in
here.
I've
got
a
new
mountain
bike
frame,
that's
black
and
38
inches.
I
guess,
and
then
some
racing
socks
as
well
to
complete
my
order
here
and
then
below
that
I'm
going
to
use
this
thing
called
transactional
batch.
B
I
have
to
pass
in
my
partition
key
and
that
here,
of
course,
is
going
to
be
our
customer
id
and
then
I'm
going
to
call
a
couple
of
functions
here,
I'm
going
to
call
CreateItem
and
then
I'm
going
to
pass
the
sales
order
we
just
created
above
and
then
I'm
going
to
call
ReplaceItem,
which
is
the
update
and
I'm
going
to
pass
in
my
customer
id
and
then
the
customer
object
in
there
and
then
call
ExecuteAsync.
B
Okay.
So
here
we're
going
to
create
your
order
and
update
order,
total
that's
option
h
here
and
all
successful.
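The all-or-nothing behaviour of a batch scoped to one logical partition can be sketched with a snapshot and rollback. This toy model makes no claim about how the SDK implements transactional batch; `ToyPartition` and `execute_batch` are hypothetical names.

```python
import copy
from typing import Callable, Dict, List


class ToyPartition:
    """One logical partition held in memory; this is not the Cosmos DB SDK."""

    def __init__(self) -> None:
        self.items: Dict[str, dict] = {}

    def create(self, item: dict) -> None:
        if item["id"] in self.items:
            raise ValueError("item already exists: " + item["id"])
        self.items[item["id"]] = item

    def replace(self, item_id: str, item: dict) -> None:
        if item_id not in self.items:
            raise KeyError(item_id)
        self.items[item_id] = item


def execute_batch(partition: ToyPartition,
                  operations: List[Callable[[ToyPartition], None]]) -> bool:
    """Apply every operation, or none of them if any one fails."""
    snapshot = copy.deepcopy(partition.items)
    try:
        for operation in operations:
            operation(partition)
    except (KeyError, ValueError):
        partition.items = snapshot  # roll back every operation in the batch
        return False
    return True


partition = ToyPartition()
partition.create({"id": "A", "type": "customer", "salesOrderCount": 2})
order = {"id": "so-3", "type": "salesOrder", "customerId": "A"}
customer = {"id": "A", "type": "customer", "salesOrderCount": 3}

committed = execute_batch(partition, [
    lambda p: p.create(order),
    lambda p: p.replace("A", customer),
])
# Reusing the same order id fails, so the count bump in the same batch is undone.
retried = execute_batch(partition, [
    lambda p: p.replace("A", {**customer, "salesOrderCount": 4}),
    lambda p: p.create(order),
])
```

The second batch shows why the count and the order belong in one transaction: the customer's total never drifts away from the orders actually stored.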
Now
I'm
going
to
query
for
my
customer
and
all
their
orders,
so
option
g
and
here's
my
new
sales
order
made
just
now
with
my
new
hl
mountain
bike
frame,
that's
black
and
then
a
pair
of
racing
socks.
And
if
I
scroll
up
here,
you
go
sales
order
count
is
three.
Okay,
so
awesome
sauce
and
just
like
creating
orders,
you
can
do
that
in
a
transaction.
B
You
can
also
delete
an
order
and
also
do
that
in
a
transaction.
So
here
I've
got
my
customer
id
and
order
id.
I'm
gonna
call
read
item
async
on
my
customer
object
and
then
I'm
gonna
decrement
sales
order
count
down
by
one
and
then
I'll
call
transactional
batch
again
and
this
time
I'm
gonna
call
DeleteItem
and
just
pass
in
the
order
id
and
then
replace
item
on
the
customer
object
again
and
then
it'll
update
the
customer
within
there.
So
go
ahead
and
I'm
going
to
delete.
A
B
B
So
here's
my
function
for
that
and
here's
my
query:
select
top
10
first
name,
last
name,
sales
order
count
from
my
container
here
customer
where
c
type
equals
customer
right,
because
I
have
to
distinguish
that
I'm
not
pulling
sales
orders
in
there
and
then
do
an
ORDER
BY
on
sales
order
count
in
descending
order
and
then
I'm
just
going
to
print
all
that
out
and
we'll
run
it,
and
this
will
take
a
second
to
run.
B
So
I
know
before
I
said
you
should
try
to
avoid
these,
but
in
reality
it's
kind
of
hard
to
avoid
them
for
every
different
situation
and
in
scenarios
where
you're
just
going
to
run
it
say
once
a
month
or
something
like
that:
it's
okay.
It's
okay
for
operations
that
aren't
run
very
frequently.
It's
when
they're
high
concurrency
queries
that
you
want
to
avoid
that
type
of
thing
at
some
point,
and
also
of
course,
if
they're
in
small
containers
as
well.
B
At
some
point,
it
may
make
sense
that
a
cross-partition
query
that
was
once
okay
has
gotten
to
a
point
where
it's
gotten
too
expensive
to
run.
This
can
happen
when
you
get
into
containers
that
are
maybe
thousands
of
partitions
in
size.
In
that
case,
what
you
want
to
do
is
you
want
to
denormalize
that
aggregate,
basically
you're,
creating
a
materialized
view
of
that
data
and
you're
going
to
store
that
in
another
container?
B
So
what
would
happen
is
you
don't
do
that
in
a
transaction
there;
you're
basically
going
to
create
an
upsert
statement
and
it's
going
to
upsert
the
value
for
that
sales
order
count
into
another
collection.
And
then
you
run
the
query
and
you
use
that
collection
or
container,
excuse
me,
to
serve
that
query
that
you
then
run.
The
materialized
view
B
pattern
is
another
very
frequently
used,
operation
or
type
of
trick
where
you're
essentially-
and
it's
just
like
the
name-
explains
right
and
anybody
who's
used
SQL
and
knows
what
a
materialized
view
is
will
understand.
In
this
case
here
we're
basically
materializing
an
aggregate
and
then
using
that
to
serve
queries,
and
this
is
common.
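The materialized view idea can be sketched as a change handler that upserts a small row per customer into a second container, which then serves the query. All names here are hypothetical, and dictionaries stand in for both containers.

```python
from typing import Dict, List


def apply_customer_changes(changes: List[dict], view: Dict[str, dict]) -> None:
    """Upsert one small row per changed customer into the view container."""
    for doc in changes:
        if doc.get("type") != "customer":
            continue  # sales orders arriving in the same feed are ignored
        view[doc["id"]] = {"id": doc["id"], "salesOrderCount": doc["salesOrderCount"]}


def top_customers_from_view(view: Dict[str, dict], limit: int) -> List[str]:
    """Serve the top-N query from the view instead of fanning out across partitions."""
    rows = sorted(view.values(), key=lambda r: r["salesOrderCount"], reverse=True)
    return [r["id"] for r in rows[:limit]]


view: Dict[str, dict] = {}
apply_customer_changes([
    {"id": "A", "type": "customer", "salesOrderCount": 2},
    {"id": "so-1", "type": "salesOrder", "customerId": "A"},
    {"id": "B", "type": "customer", "salesOrderCount": 5},
], view)
apply_customer_changes([{"id": "A", "type": "customer", "salesOrderCount": 3}], view)
top_one = top_customers_from_view(view, 1)
```

Because the handler upserts rather than increments, processing the same change twice leaves the view unchanged.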
A
B
Workloads
where
you
have
say
high
write
throughput
and
high
read
throughput,
and
this
is
also
another
thing-
that's
not
uncommon
is
customers
are
often
kind
of
frozen
by
the
fact
that
they
may
need
to
have,
or
they
may
need
to
optimize
around
different
partition
key
values.
It's
quite
common
that
data
gets
written
into
one
container
and
then
they
use
change
feed
to
write
it
into
another
container,
with
a
completely
different
partition
key
and
that
each
of
the
containers
kind
of
serves
different
purposes
or
maybe
serves
different
queries.
B
It
takes
a
little
bit
of
math
and
figuring
to
get
to
knowing
whether
you
need
to
do
that.
But
that's
why
again
understanding
the
operations
that
you're
running
and
the
concurrency
of
those
operations
is
so
important
to
be
able
to
do
a
good
design,
a
scalable
design
for
a
database
like
Cosmos
DB.
B
So,
okay,
our
final
design,
here
we've
got
a
customer
container
with
customers
and
sales
orders.
I
got
a
product
container
here
with
my
products
and
I
got
a
product
tag
container
with
tags
and
a
category
container
with
categories.
There's
one
more
optimization
we
can
actually
do
here,
and
that
is,
I
can
create
another
container
and
we'll
call
it
product
meta,
because
they
both
use
type
as
a
partition
key
with
their
own
unique
values.
B
So
this
is
another
way
you
can
optimize
data
like
this
and
just
store
it
in
a
single
container,
because
every
container,
unless
you're
using
shared
throughput,
needs
its
own
throughput.
But
if
I
store
them
in
the
same
container,
I
can
do
that.
This
is
actually
a
best
practice
for
master
data
or
reference
data:
just
store
it
all
in
the
same
container
and
then
use
the
type
of
data
as
its
partition
B
key
because
you're
never
going
to
run
into
20
gigs
of
product
categories
or
20
gigs
of
product
tags.
And
if
you
do,
you
can
also
use
a
composite
key
to
get
a
higher
level
of
cardinality
within
there,
so
that
you
can
easily
keep
that
data
within
your
20gb.
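Storing reference data in one container keyed by its type might look like the sketch below. The document shapes are assumptions, and `composite_key` is a hypothetical helper showing one way a key with higher cardinality could be formed; the talk does not specify how the composite key is built.

```python
from typing import Dict, List

# Categories and tags share one container; `type` doubles as the partition key.
product_meta: List[Dict[str, str]] = [
    {"id": "cat-1", "type": "category", "name": "Accessories, Tires and Tubes"},
    {"id": "cat-2", "type": "category", "name": "Bikes, Road Bikes"},
    {"id": "tag-1", "type": "tag", "name": "Tag-1"},
]


def items_of_type(docs: List[Dict[str, str]], doc_type: str) -> List[Dict[str, str]]:
    # Equivalent of: SELECT * FROM c WHERE c.type = @type (a single-partition query)
    return [d for d in docs if d["type"] == doc_type]


def composite_key(doc_type: str, bucket: str) -> str:
    """Hypothetical composite partition key value, for when one type would outgrow a logical partition."""
    return f"{doc_type}-{bucket}"


categories = items_of_type(product_meta, "category")
```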
So
here's
our
final
design,
our
three
containers
from
nine
original
relational
tables
that
we
had
from
adventure
works
and
this
database
will
scale
and
perform
to
essentially
petabytes
in
size
unlimited
size
within
our
application.
B
So
that's
it
for
my
talk.
Everything
you
saw
here
is
available
on
github,
so
you
go
to
github.com/AzureCosmosDB/CosmicWorks.
All
the
code
I
showed
is
there.
Actually,
all
the
data
is
there
as
well.
I
am
working
on
writing
a
data
loader;
it's
actually
written.
I
just
haven't
merged
it
into
this
repo,
but
you
can
take
a
look
at
this
and
look
at
all
the
code
in
there
and
look
at
the
data
and
get
it
set
up.
B
I've
got
an
ARM
template
in
there
that'll
set
it
up.
You
can
also
run
a
cli
bash
script.
I
wrote
in
there
as
well
that'll
also
set
it
up
for
you
there's
a
good
article,
practical
cosmos
db.
This
is
basically
showing
modeling
and
partitioning
using
like
say
a
wordpress
or
a
blog
platform,
lots
of
great
videos
on
youtube
and
then,
of
course,
we've
got.
B
A
little
micro
site
called
gotcosmos
that
I
run
and
on
there
we've
got
a
weekly
podcast
that
I
host
every
thursday
at
1
p.m.
Pacific,
which,
I
think,
you
guys
are
Central.
So
what
would
that
be?
Like
2
p.m?
I
guess
for
you
guys
in
austin
3
p.m.
For
us,
oh,
is
it?
Are
you
guys
two
hour
service,
okay
or
two
hours,
yeah?
Okay,
so
here's
gotcosmos.com/tv.
You
can
come
see
me
every
week.
B
If
I
didn't
burn
you
out
watching
me
now
lots
of
great
stuff
next
week,
we're
gonna
recap
all
of
our
build
announcements.
We've
got
a
lot
of
cool
stuff
coming
for
build,
we
always
do
and
then
just
more
great
episodes
coming
up.
You
can
check
that
out
as
well.
We
just
ran
our
own
first
Cosmos
DB
Conf.
This
is
kind
of
like
a
.NET
Conf,
but
much,
much
lower
level
than
the
.NET
Conf.
B
We
can't
quite
do
that
yet,
but
lots
of.
B
There's
live
sessions
you
can
go
and
see
in
here
keynote.
There's
some
really
great
content
here.
Lots
of
good
on-demand
sessions
in
here
as
well
so
come
and
check
that
out.
Here's
the
repo
with
all
the
stuff
in
here
and
you
can
come
and
check
that
out
as
well.
It's
even
got
a
deployed
azure
thing
in
there
as
well,
so
any
questions
from
anyone.
B
Let's
see,
I
see
why,
okay,
that
you
asked
that
one
earlier
from.
B
A
Yeah,
okay,
the
the
only
question
I
I
might
have
is,
kind
of,
are
there
security
benefits
because
I
know
it's
it's
not
strictly
relational,
but
you
demonstrated
that
you
could,
you
know,
use
it
relationally.
B
Security
benefits,
so
let
me
talk
about
security,
so
we
use
these
master
keys
in
our
sdk
to
access
cosmos
db.
So
you
secure
the
database
using
a
pair
of
master
keys,
there's
a
read-write
and
a
read-only.
We
just
recently
announced
support
for
Azure
AD
and
RBAC.
B
So
now
what
you
can
do
is
you
can
actually
authenticate
using
the
service
principal
id
from
an
AAD
token
and
then
pass
that
when
you
do
create
a
new
cosmos
client
using
our
sdk,
and
you
can
get
all
the
RBAC
goodness
out
of
that
as
well.
So
I
can
actually
now
go
and
create
aad
groups
and
then
give
them
permissions
using
the
new
rbac
model.
Like
read
from
this
container
write
to
this
container.
I
can
query
this.
B
We
have
support
for
service
endpoints,
so
you
can
restrict
access
to
people
on
a
VNet
or
a
subnet
within
there,
and
we
also
support
private
endpoint,
which
essentially
removes
the
public
ip
address
right,
because
cosmos
is
on
the
public
internet
right.
B
The
endpoint
is
out
there;
the
URI
for
your
endpoint,
your
you
know,
myaccount.documents.azure.com
resolves
to
a
public
ip
address,
so
you
can
use
private
endpoints
to
remove
that
and
then
you
basically
get
a
ten
dot
address
and
you
connect
to
that
or
it
resolves
to
that
right
so
and
that's
basically
using
also
the
private
dns
as
well,
because
you
need
to
have
a
way
to
resolve
that
fqdn
so
that
you
can
access
that
10
dot
address
in
there,
so
so
lots
of
options
for
security,
of
course,
in
there
and
authentication.
B
So
we've
got
RBAC
and
your
auth
then
covered
from
the
network
all
the
way
down
to
the
app.
Oh,
what
else?
Customer-managed
keys.
So
I
guess
we're
in
the
security
realm
you
can.
You
can
encrypt
your
data
as
well.
I
mean
we
already
encrypt
it
using
a
Microsoft-managed
key.
If
you
want
to
encrypt
it
again,
you
can
go
and
create
a
key,
store
it
in
Key
Vault,
pass
that
key,
B
the
resource
URI
for
the
key
in
Key
Vault,
to
us,
and
you
put
in
your
database
account
and
then
we
will
encrypt
your
data
again
on
there
so
and
then
at
Build
we've
got
some
announcements,
cool
announcements
coming
up
around
security,
but
I'm
not
gonna,
no
spoilers.
No,
no!
Sorry,
no
spoilers
here
so
I
gotta
fair
enough.
Yeah.
A
B
Yeah
definitely
come
check
us
out,
I
mean
Build
for
us
is
the
big
event,
because,
you
know,
Cosmos
is
a
developer's
database
right?
You
find
SQL
is,
you
know
big
in
the
dba
community,
because
you
need
dbas
to
run
it.
Cosmos
is
fully
managed.
It's
you
know
it's
it's
and
the
only
way
to
access
it
really
is
I
mean
you
have
a
portal,
but
you
don't
really
use
portal
to
do
much
of
anything
just
other
than
just
getting
things
kind
of
started.
A
B
Read
and
write
data
in
and
out
of
your
database
right,
so
it's
very
developer
focused
very
developer
friendly
in
there,
so
we're
kind
of
a
different
database,
if
you
will,
from
the
others.
A
No,
I
guess
we'll
we'll
call
it
a
show.
Thank
you
very
much
mark
for
coming
out.
You
really
highlighted
some
some
great
strengths
and
some
some
great
reasons
why
you
would
make
the
decision
you
know,
make
the
informed
decision
to
go
with
cosmos
db
and
and
really
why
why
you
know
why
you
would
consider
it
versus
some
some
alternatives,
so.
B
Well,
what
I
want
you
to
walk
away
with
is
you
know,
use
the
right
database
for
the
job.
You
can't
use
cosmos.
It
doesn't
make
sense
in
every
scenario,
although
you
can
use
it
for
a
lot
of
different
workloads.
But
know,
be
smart
about
how
you
use
a
database.
How
do
you
model
how
you
design
for
it?
Why
is
it
built
the
way
B
it
is?
There's
a
reason
why
cosmos
is
built
like
this
and
it's
to
give
you
that
basically
unlimited
scalability
in
there
and
just
insane
fast
performance,
but
you
can't
get
the
promise
unless
you
understand
the
concepts
and
design
for
it,
and
that's
that's
what
I
hope
that
you
all
got
today
so.
A
A
B
Cool
well
follow
us
on
twitter,
@AzureCosmosDB.
I'm
on
twitter
all
day
as
well.
In
fact,
I
monitor
our
twitter
account.
So
if
you've
got
questions,
you
can
ask
on
twitter,
I'm
also
quite
frequently
on
stack
overflow,
pretty
much
answering
questions
there
and
on
our
Microsoft
Q&A,
so
you
know
my
job
is
to
try
and
help
developers
be
successful
on
cosmos.
That's
kind
of
my
you
know
my
other
job
too.
A
Perfect
well
yeah.
It
was
a
great
demo,
a
great
presentation,
so
we
appreciate
you
gracing
us
with
with
your
your
knowledge
and
your
presence
here
with
the
san
antonio
group
and
I'm
sure
with
the
.NET
group
as
well.
So
great,
oh,
okay,
we'll
wrap
it
up.
Thank
you
very
much
everybody
for
coming
out.
We
appreciate
it
and
please
join
us
next
time.