Description
Canary deployment is a helpful tool that allows companies to put multiple versions of their products into production and control flow and access based on different sets of rules, clients, amounts, and operations. In this Kong Summit 2020 session, we discuss best practices for canary deployment and share our experience using Kong Enterprise — supported by Kuma capabilities — in achieving this.
Learn more about Kong: https://bit.ly/2I2DypS
A: We have different clients, from big banks to small startups, some of them in healthcare, all from different fields, but all of them have something in common. What they have in common is that they have a product in production that doesn't accept any downtime, and they need to adapt really fast, because everything changes. Whenever they need to do that, they have different approaches, but it needs to change fast, it needs to change easily, and the user shouldn't realize that something has changed. So what are the business cases for this?
A: First of all, they need to release experimental features to users and see if those increase sales, because someone says, OK, let's try out a new dashboard, let's try out something new, and see whether that increases sales or not. Or maybe what they are doing is just changing the infrastructure behind the product while trying to keep users from noticing that something changed. So the user shouldn't see any change, but the infrastructure should be better.
A: Maybe they are releasing a new feature for beta testing, or maybe they are just releasing a new feature to a small set of users. So what kind of deployment comes in here? That's where canary deployments come in. As you can see in this picture, a canary deployment is a pattern for rolling out releases to just a subset of users or servers, and letting them be the ones that try out the new version.
A: So, as you can see here, there are a lot of users, but those users can be either in treatment A or in treatment B. Whenever they go to treatment A, the 90 percent, they are just served the real application: the application that's been working, that we know works, where everything is OK. But whenever they go to treatment B, that's where we are trying out the new features.
A: Maybe you can just create an early-adopter program, or do what's called dogfooding, which is: OK, we are the ones that are going to try out our own product and see if everything works. If it works well for us, it should work well for the rest of the people. So, coming up next: what are the benefits of this? First of all, as you can see in the next slide, there is A/B testing: you can use canary deployments to do A/B testing.
A: Let's try out these two versions and see which one performs best. The next benefit is capacity testing: maybe you're rolling out something that might change a lot and might change response times, so when you do canary, you can stress a small part of it and see whether it works as expected or not.
A: The next one is feedback, because whenever you roll out something new, you expect feedback from real users, not just a UAT user or a testing user. You want real people to give you feedback, and you can roll it out right away.
A: Then we have no cold starts, because you are already running two versions: whenever you are confident in the new one, you just switch to the new one and start using it. You don't have to stop everything and roll everything out again; you just keep working with that one. We have two more: no downtime, as I just said, and the last one, which is about what canary is not.
A: Canary is not a blue/green deployment, which is something meant for a load balancer or similar: you just create two instances and move from one to the other. It's also not a rolling deployment; you don't just say, OK, let's start destroying the instances running the old version and moving to the new one. What you are doing with canary is you have two versions and you're trying things out on both of them.
A: So that's just the theory; let's jump into our show time. What we have here is a real-life example. FoodX is a mobile ordering solution that allows you to download the application or enter the web app, create an order for a restaurant, and pay through the application or the web application; the order goes straight to the restaurant, and after that you can retrieve it and share it with your friends.
A: So, as you can see in the two images here, you have different places in the places-nearby view, and when you provide an address, you can also get the places nearby that address. As Fede is going to show in the next few slides and in the demo, this is something we were having trouble with in one version.
A: We created a whole new version of it, but it was such an internal change that we needed to try it out not only in testing but also with real-case scenarios of what was going to happen, and that's where canary comes in. So, Fede?
B: Thanks, Nico. OK, based on what Nico just mentioned, here we have an architecture diagram covering all the FoodX business needs. As we can see, it's pretty straightforward. We have a backend on managed Kubernetes on a cloud provider, in this case Google, and then we have two kinds of applications: one is web-based and the other is the mobile version, and they are consuming services through a Kong Ingress Controller. The Kong Ingress Controller is the one responsible for exposing those functionalities to the outside world.
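To make that exposure concrete, here is a minimal sketch of how a service could be published through the Kong Ingress Controller on a 2020-era Kubernetes cluster; the `bff` service name, port, and path are illustrative assumptions, not taken from the actual FoodX deployment:

```yaml
# Hypothetical Ingress exposing the BFF through Kong.
# The kubernetes.io/ingress.class annotation tells the
# Kong Ingress Controller to handle this resource.
apiVersion: networking.k8s.io/v1beta1
kind: Ingress
metadata:
  name: bff
  annotations:
    kubernetes.io/ingress.class: kong
spec:
  rules:
    - http:
        paths:
          - path: /
            backend:
              serviceName: bff   # assumed Service name
              servicePort: 80    # assumed Service port
```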
Then we have what is called the BFF. BFF is a pattern; the letters stand for backend for frontend. It's a pattern which allows you to have a tailored experience for each of the frontends you have in your architecture. And then we have the backends. The backends are also built in Python: the BFF uses the Flask framework, and the backend is built upon Django, storing the main data in a SQL database — a relational database as a service.
B: All of this architecture is being observed with Grafana, so we're going to see live metrics in Grafana of what's going on with these components. Then we have the problem with the first backend: as Nico mentioned, it was really slow. That was because all the processing before saving the data was being done on the backend side, and that brings a lot of issues.
B: So now we are going to set up a whole new version, backend v2. It is going to be the same stack, the same technology stack, but it does the processing near the database layer, in PostgreSQL itself. To accomplish this shifting of the traffic — sending ninety percent to one version and ten percent to the other version — we are going to use Kuma.
B: What is Kuma? I think you all should know about Kuma by now: Kuma is the Kong service mesh solution. As in any other service mesh technology, Kuma has the concept of a control plane, which is responsible not only for managing the data planes, but also for configuring them and observing what's going on at all times; it's like the brain of our service mesh. And then we have the data planes, which we can see here in the image at the bottom.
B: The data planes are the Envoy proxies that have been deployed as sidecar containers next to our application containers. Given this configuration, this sidecar setup allows us to observe all the incoming and outgoing traffic of our applications, and therefore we can configure security policies and routing policies. We can do canary, as in this demonstration; we can also do blue/green deployments; and we can observe all the applications — I mean, we can grab all the telemetry.
B: I mean logs, metrics, and traces, and centralize them in another backend storage. What happens if we don't have a service mesh to accomplish this need? Well, in that case we would need to put some business logic inside the application in order to know: OK, this traffic has been sent to me.
B: So it's not my turn to respond, so I have to drive the traffic to the other box. And that's not the ideal scenario, because it doesn't scale when I have to implement a lot of particular rules there; it's not the best way, not a good approach, to accomplish that. So, given this scenario, we're going to show the demo now. Here, in our cluster, we have — let me check.
B: We have the applications in this namespace. We have the BFF, as we told you, and we have the two versions of the backend. Of course, the second version of the backend is not receiving traffic at all at first, because, you know, the traffic isn't there: we didn't configure anything yet on the mesh side, on the mesh layer. So all the traffic is going to the v1 version.
B: We also have the ingress here, the ingress controller, and now we are ready to show you what the application looks like. The main idea of this whole demonstration is to show you that it is a seamless transition for the end user: users are never going to realize the changes on the application side. Here we can see what happens if I reload the application.
B: Now we are going to show you how, in Kuma, we can configure that traffic routing. It's a YAML — like a Kubernetes object, but from the Kuma API — and it is called TrafficRoute. What it says here is: OK, all the traffic that matches — that is coming from this source component observed by the Kuma service mesh and going to this destination component — send it all to the first version. We have here the version label that specifically says it's version one: send all the traffic, with a weight-based, probabilistic traffic setting, to the first version, and zero to the other version.
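As a rough sketch of what such a TrafficRoute can look like, assuming a Kuma 1.x-era schema; the mesh, service names, and version tags (`bff_default_svc_80`, `backend_default_svc_80`, `v1`, `v2`) are illustrative stand-ins, not the actual resource from the demo:

```yaml
# Hypothetical Kuma TrafficRoute: send 100% of the traffic
# from the BFF to version v1 of the backend, 0% to v2.
apiVersion: kuma.io/v1alpha1
kind: TrafficRoute
mesh: default
metadata:
  name: backend-canary
spec:
  sources:
    - match:
        kuma.io/service: bff_default_svc_80      # assumed source service
  destinations:
    - match:
        kuma.io/service: backend_default_svc_80  # assumed destination service
  conf:
    split:
      - weight: 100
        destination:
          kuma.io/service: backend_default_svc_80
          version: v1   # assumed version tag on the v1 data planes
      - weight: 0
        destination:
          kuma.io/service: backend_default_svc_80
          version: v2   # assumed version tag on the v2 data planes
```

Starting the canary is then just a matter of editing the two weights and re-applying the resource.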
So what happens if we change this? I'm going to run a script here to call the API, and we can see the result now in Grafana.
B: We are seeing two dashboards here. We configured a dashboard with the total requests, as a rate over 30 seconds, and then we configured the request time per service, on the BFF side. I mean, if I look at the BFF and I ping the BFF every couple of seconds, I'm going to see: OK, it's taking about nine seconds to bring all the data together to the view, right? This is based on percentiles, and the worst percentile, the p99, is showing that it's taking almost 10 seconds to accomplish that request.
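For reference, here is a hedged sketch of those two panels expressed as Prometheus recording rules; it assumes the mesh's Envoy sidecars are scraped in Prometheus format, and the metric and label names (`envoy_cluster_upstream_rq_total`, `envoy_cluster_upstream_rq_time_bucket`, `envoy_cluster_name`) are standard Envoy stats names that may differ from the demo's actual dashboards:

```yaml
# Hypothetical recording rules approximating the two Grafana panels.
groups:
  - name: canary-demo
    rules:
      # Total requests per service, as a rate over a 30s window.
      - record: service:request_rate_30s
        expr: sum(rate(envoy_cluster_upstream_rq_total[30s])) by (envoy_cluster_name)
      # p99 request time per service, from Envoy's request-time histogram.
      - record: service:request_time_p99
        expr: >
          histogram_quantile(0.99,
            sum(rate(envoy_cluster_upstream_rq_time_bucket[30s]))
            by (le, envoy_cluster_name))
```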
A: So, Fede, what you just did there: you changed the rule to say, OK, start sending some of the traffic to the old version and some to the new version. Almost all of our users right now are still going to the old version, but 30 percent of them, or something like that, are going to the new one — and it's not exactly 30 percent, because it's probabilistic.
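Under the same assumptions as the TrafficRoute sketch above, shifting that share of traffic is just a change to the split weights, for example:

```yaml
# Hypothetical updated split: roughly 70% of requests stay on v1,
# roughly 30% go to the v2 canary. The weights are probabilistic,
# so the observed ratio will only approximate 70/30.
conf:
  split:
    - weight: 70
      destination:
        kuma.io/service: backend_default_svc_80
        version: v1
    - weight: 30
      destination:
        kuma.io/service: backend_default_svc_80
        version: v2
```

Rolling back later is the same operation in reverse: set the weights back to 100/0 and re-apply.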
B: That's right. Because we moved to the new version which, as Nico just told us, has that massive processing layer being done in another layer, which is the better approach. So in this case we see that the times are getting better, and we see here that there are requests coming to both versions.
A: So none of the users realized anything, apart from getting a performance improvement. The data is still the same; everything is still the same. The only thing that changed is that everything works faster now, right? And if something was not working right, we could just roll back and go to the previous version, because it's still there, still running in the backend, and if something comes up, you can just roll it back.
B: OK, that delay of a few seconds is because it's going to a database in another region to grab the data, and we have latency between regions — acceptable for the demo, not for production, of course. And then we should see here that, OK, the time decreased considerably, and the yellow line started to receive all the traffic; I mean, all of the request rate is going to v2.
A: Awesome, great. Thank you, Fede, that's really cool. So Fede has shown us how canary works, and canary works almost every time, but there are a few things that you have to keep in mind; let's go through them. The first one is: do not over-rely on this, because it does not effectively mitigate the risk of silent defects.
A: What this means is, for instance, if Fede just introduced a new defect into version two, you would be trying that one out in production. So don't over-rely on this; also try it out in testing, try it out everywhere, because you are pushing code to production, right?
A: This is only possible when there are no contract changes, because if something changed and the UI needed a different response, it couldn't be as easy as it was here: you cannot just switch from one version to the other. It also increases the complexity, because, as Fede just showed us, he was just pushing something to the automated deployment mechanism and it was being deployed.
A: But you have to keep in mind that, apart from the routes, the ingresses, and everything else, you have something else: the route between the services. You have to keep that in mind too; if not, something moves and you don't know why. Apart from that — and this is really important, as Fede has shown us — you have to be looking at what's being changed, you have to be measuring the changes, you have to have a lot of metrics on this.
A: If not, it's just something that's been changed and you don't have the ability to see whether everything is working OK or not, if that makes sense.
A: This is one of the most important ones: database changes can present a problem. For instance, if you have a version that's faulty, then the database could end up with changes that might be corrupt — and that data is corrupt even for the latest version and the previous version. Every single version is looking at the same database, and if anything works in a way I don't expect, the database ends up corrupted.
And the last one, but not least: make sure you're routing the traffic through the mesh, not straight to the load balancer. What this means is that you are sending all the traffic through Kuma in this case, so if Kuma is the one taking care of this, it will be able to apply the canary. If the traffic instead goes straight to the service, you are not going through Kuma, and none of this will work.
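As a minimal sketch of what "going through the mesh" means on Kubernetes: Kuma injects its Envoy sidecars into annotated namespaces, so traffic between in-mesh services traverses the data planes where the TrafficRoute is enforced. The namespace name below is an illustrative assumption:

```yaml
# Hypothetical namespace with Kuma sidecar injection enabled.
# Pods deployed here get an Envoy data plane, so service-to-service
# traffic flows through the mesh and canary routing can apply.
apiVersion: v1
kind: Namespace
metadata:
  name: foodx          # assumed application namespace
  annotations:
    kuma.io/sidecar-injection: enabled
```

A client that bypasses the mesh — for example, by hitting a pod IP or an external load balancer directly — never reaches those sidecars, so the weights have no effect.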
A: One last thing that's also important is that you have to have stages: you have to keep in mind the duration of this, you have to keep metrics for this, as we said before, and you have to keep evaluating it. What this means is you have to plan. You can't just say, OK, let's go, spin things up, do a canary deployment, and everything will work, right? It might not.
A: We would like to thank you all for joining us. Thank you, Fede, also for the whole demo. We would like you to stay tuned with us for the questions right after this session. So thank you all for coming; I hope you enjoyed this as much as we did creating it.