Description
SoloCon 2022:
[Lightning Talk] Business Continuity with Gloo Mesh
Speaker:
Marino Wijay
Developer Advocacy and Relations, Solo.io
Abstract:
Disaster Recovery and Avoidance are critical to ensuring that applications continue to be available. In this lightning talk, we discuss how Gloo Mesh leverages Traffic Policies to mitigate disasters and downtime.
Track:
Service Mesh and Application Networking
Samantha Kim:
Hi everyone, and welcome to our next lightning talk. I'm Samantha Kim, and I'm part of the marketing team here at Solo.io. I'm excited to introduce our next lightning talk speaker. Please help me welcome Marino back to the stage to talk about Business Continuity with Gloo Mesh. Marino, over to you.
Marino Wijay:
Thank you very much, Samantha. Hey everyone, again: welcome to my talk on Business Continuity with Gloo Mesh.
Let's understand what could contribute to an outage. Outages can be caused by a variety of factors, both technical and non-technical in nature. Some can be internal, some external; at the end of the day it doesn't matter, because either way it contributes to an outage, whether that affects a small part of your environment or the entire thing. DNS, for example, if not available, can cause substantial outages, as we are so reliant on things like name resolution.
So if we're trying to communicate using IPs and we don't have name resolution in place, things tend to break. What we tend to do with DNS servers is deploy them in a high-availability configuration, so that if one goes down, we have another one available to keep processing those name resolution requests.
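As a minimal sketch of that idea, assuming a Kubernetes environment, you might run a DNS service with several replicas so a single failing instance does not take name resolution down with it. The names, namespace, and image tag below are illustrative assumptions, not from the talk:

```yaml
# Illustrative sketch: multiple DNS replicas for high availability.
# Names, namespace, and image tag are assumptions for illustration;
# the config (Corefile) is omitted for brevity.
apiVersion: apps/v1
kind: Deployment
metadata:
  name: coredns-ha
  namespace: kube-system
spec:
  replicas: 3                  # several instances; one can fail safely
  selector:
    matchLabels:
      app: coredns-ha
  template:
    metadata:
      labels:
        app: coredns-ha
    spec:
      containers:
      - name: coredns
        image: coredns/coredns:1.9.1
        ports:
        - containerPort: 53
          protocol: UDP
```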
In the case of storage, that system holds all the data that we consume in a company. All the data that we store and generate lives in some sort of repository, or it could be spread out, but it's storage: we're talking about storage arrays, storage networks, and even disks. So we have the possibility of running out of storage. We run out of capacity, and when we do, guess what: we have nowhere to write to, so the system comes to a halt.
Sometimes the network is the source of a potential outage. Network equipment can age and fail. Network cables can fail, transceivers for fiber optics can fail, someone can cut a fiber, and then your network has failed. There could also be some level of oversaturation to and from a particular environment, causing huge spikes in latency and, furthermore, contributing to a level of inaccessibility or a lack of availability.
And then this moves us on to the concept of availability, which, when you think about it, really motivates us to construct things like service level indicators, service level objectives, and even service level agreements, to ensure that we're always maintaining a level of availability and resiliency for our business and for the applications that run inside it.
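To make that concrete, an availability SLI is often expressed as the ratio of successful requests to total requests. Here is a minimal sketch as a Prometheus recording rule, where the metric name and the 30-day window are assumptions for illustration:

```yaml
# Illustrative availability SLI: share of non-5xx requests over 30 days.
# The metric name http_requests_total is an assumption, not from the talk.
groups:
- name: availability-sli
  rules:
  - record: sli:availability:ratio_30d
    expr: |
      sum(rate(http_requests_total{code!~"5.."}[30d]))
      /
      sum(rate(http_requests_total[30d]))
```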
And then finally, the last part is people. How do we ensure that we are enabling our workforce, our people, to recover from these failures, and that they have the right systems, the right tools, and the right technology to solve a lot of these failover problems? So what technology can I think of, or can anyone think of, that can actually solve a lot of these challenges?
Service mesh, all right? So let's dig into service mesh and what we're doing here. Up in this diagram, I have a single Kubernetes cluster, which can be running anywhere. It could be running on EKS, it could be something that you built on premises, or something that you built in another cloud.
But this load balancer is there to protect that control plane. If you lose a node inside of Kubernetes, or if you lose a control plane node, you still have a control plane that is functioning, one that can still pass instructions down to the worker nodes so you can schedule pods, run containers, and still do that level of networking. It allows you to keep shifting traffic or moving your services around as needed, but this is a function of the Kubernetes cluster itself, wherever you've deployed it.
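The diagram itself isn't reproduced in this transcript, but as a minimal sketch of the idea, a kubeadm-built cluster can point its API endpoint at a load balancer that fronts several control plane nodes; the hostname and version below are illustrative:

```yaml
# Illustrative sketch: the API server endpoint is a load balancer VIP,
# so losing one control-plane node leaves the control plane reachable.
# The hostname is an assumption for illustration.
apiVersion: kubeadm.k8s.io/v1beta3
kind: ClusterConfiguration
kubernetesVersion: v1.23.0
controlPlaneEndpoint: "k8s-api.example.internal:6443"
```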
Now, this is all great and well: we have recovery for our applications, and we have ways to route to multiple parts of our application within a cluster. But what about multi-cluster? In the case of multi-cluster recovery and failover, any single service or endpoint can fail. An entire cluster or environment can fail. You might even have cross-cluster traffic going on.
How do we actually achieve some cross-cluster resiliency in the event of a failure of, let's say, a service, or maybe a node in a cluster, or even the entire cluster itself? Since we've distributed our application, parts of it might end up in different Kubernetes clusters. So how do we go about routing to all of these different locations?
So we could leverage, let's say, a service mesh and Istio-specific objects like virtual services, destination rules, and even service entries to create the necessary policies to get to where we need to go.
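For context, here is a minimal sketch of two of those Istio objects, splitting traffic between two subsets of a service; the hostnames, subset names, and weights are illustrative, not from the talk:

```yaml
# Illustrative Istio VirtualService and DestinationRule.
# Host, subsets, and weights are assumptions for illustration.
apiVersion: networking.istio.io/v1beta1
kind: VirtualService
metadata:
  name: service-b
spec:
  hosts:
  - service-b.default.svc.cluster.local
  http:
  - route:
    - destination:
        host: service-b.default.svc.cluster.local
        subset: v1
      weight: 90
    - destination:
        host: service-b.default.svc.cluster.local
        subset: v2
      weight: 10
---
apiVersion: networking.istio.io/v1beta1
kind: DestinationRule
metadata:
  name: service-b
spec:
  host: service-b.default.svc.cluster.local
  subsets:
  - name: v1
    labels:
      version: v1
  - name: v2
    labels:
      version: v2
```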
But having to do this for so many services at scale is impossible. So what do we do now? How about a CI/CD system? A CI/CD system might be able to solve part of that problem for us, but not entirely, because it doesn't have the complete awareness of all objects in all locations. And that is what brings us to something like Gloo Mesh, so I'll talk about Gloo Mesh in a second.
But what if we could further abstract, let's say, the Istio service mesh and treat all of our Kubernetes clusters as if they were all part of the same network fabric? We actually can, and we do so with Gloo Mesh.
So with Gloo Mesh, we can leverage something called a route table resource that basically tells us where all of our services exist: whether service A is in cluster one, or service B is in cluster two, or where all the copies of service B are across all clusters. The route table is going to tell us that.
And this route table is actually a direct translation of Istio's resources, specifically those virtual services, destination rules, and even service entries, but we're simplifying that configuration, because now all you need to do is configure the route table resource to specify where things are, much like a routing table in the networking world: you configure your route table to say, here's where this network is, and there's where that network is.
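Here is a minimal sketch of what such a route table resource might look like, assuming a Gloo Mesh 2.x-style RouteTable API; the exact fields, hostnames, and cluster names are illustrative and may differ from the shipped CRDs:

```yaml
# Illustrative Gloo Mesh-style RouteTable; field names, hostnames,
# and cluster names are assumptions for illustration.
apiVersion: networking.gloo.solo.io/v2
kind: RouteTable
metadata:
  name: service-b-routes
  namespace: gloo-mesh
spec:
  hosts:
  - service-b.global            # one hostname for all copies of service B
  http:
  - name: service-b
    forwardTo:
      destinations:
      - ref:
          name: service-b
          namespace: default
          cluster: cluster-2    # where this copy of service B lives
```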
This is the same idea. It actually allows us to effectively traffic-engineer where requests to our applications go, and even to specify alternative paths. So if there is a failure, let's say cluster one or cluster two goes down, and cluster one needs to access something in cluster two that no longer exists, cluster one can route to cluster three without a problem. And this is all made possible using Gloo Mesh and its management plane, which provides the management and abstractions on top of the Istio service mesh.
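The talk doesn't show the exact policy, but underneath an abstraction like this, Istio can express that alternative-path behavior with locality-based failover. A minimal sketch, where the host, regions, and thresholds are illustrative:

```yaml
# Illustrative Istio locality failover: if endpoints in us-east become
# unhealthy, traffic shifts to us-west. Outlier detection is required
# for failover to trigger. Host, regions, and thresholds are assumptions.
apiVersion: networking.istio.io/v1beta1
kind: DestinationRule
metadata:
  name: service-b-failover
spec:
  host: service-b.default.svc.cluster.local
  trafficPolicy:
    loadBalancer:
      localityLbSetting:
        enabled: true
        failover:
        - from: us-east
          to: us-west
    outlierDetection:
      consecutive5xxErrors: 5
      interval: 30s
      baseEjectionTime: 120s
```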
The other thing that I need to mention is that Gloo Mesh actually unifies the root CA amongst all of your Istio instances. So now all of your different environments here, all of your different clusters, are sharing that same root certificate, or root CA certificate, which enables them to trust all services amongst each other.
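As a hypothetical sketch of how that shared trust might be declared, assuming a Gloo Mesh-style root trust resource; the kind, fields, and the generated-CA option shown here are assumptions and may differ from the actual API:

```yaml
# Hypothetical Gloo Mesh-style root trust resource: the management
# plane issues one shared root CA that every cluster's Istio uses.
# Kind and field names are assumptions, not confirmed by the talk.
apiVersion: admin.gloo.solo.io/v2
kind: RootTrustPolicy
metadata:
  name: shared-root
  namespace: gloo-mesh
spec:
  config:
    autoRestartPods: true
    mgmtServerCa:
      generated: {}             # generate and distribute a shared root CA
```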
What we aim to solve here with our Gloo Mesh technology is to provide you with a way to continue your business and its functions when you have some sort of outage, or when a small portion of your environment goes down. Istio provides a very powerful service mesh that gives us things like traffic management and other capabilities to connect our endpoints and services together. It also allows us to circumvent failures locally, and if we take that to the next step and leverage Gloo Mesh, we can streamline and simplify our configurations while also circumventing failures across all the different locations and sites.