Description
For more great content, visit https://solocon.io
SoloCon 2022:
[Lightning Talk] CARFAX: Gloo Edge at Scale
Speakers:
Mark Portofe
Director, Platform Engineering, CARFAX
Sebastian Chacko
Solutions Architect, CARFAX
Abstract:
In this lightning talk, join CARFAX Director of Platform Engineering Mark Portofe and Solutions Architect Sebastian Chacko to learn more about using Gloo Edge at scale.
Track:
Edge and API Gateway
Samantha Kim: Hi everyone, and welcome to today's lightning talks. I'm Samantha Kim, part of the marketing team here at Solo.io, and I'm excited to welcome our lightning talk speakers. Please help me extend a warm welcome to Mark and Sebastian from CARFAX, who will be sharing how they are using Gloo Edge at scale. Over to you, Mark and Sebastian.
Mark Portofe: Hey everyone, my name is Mark, and we'll be talking about Gloo Edge at scale. Just some quick introductions: I'm Mark Portofe, Director of Platform Engineering at CARFAX, and with me today is Sebastian Chacko, the Solutions Architect of our platform engineering department at CARFAX.
A little bit about the agenda. I'll tell you about CARFAX and the carfax.com domain, which is our main consumer-facing domain, plus a bit about how we moved up to AWS and our usage of Gloo. Sebastian will cover our high-level architecture and how we went about choosing an ingress controller, and then we'll talk about our lessons learned and what's next. So first, a little bit about CARFAX. Some of you may have heard of us.
B
You
might
see
the
car
fox
on
tv
and
heard
our
slogan
show
me
that
the
carfax
we
have
a
mission
of
providing
millions
of
millions
of
people
with
information
on
how
to
shop,
buy,
sell
maintenance,
their
car
with
a
lot
more
confidence.
To
do
that,
we
actually
have
a
ton
of
data
that
stands
behind
our
products
and
services
over
a
hundred
thousand
hundred
thirty
thousand
data
sources,
as
well
as
over
120
billion
records
regarding
vehicles
and
automobiles,
our
customers.
They
can
be
general
consumers,
dealers,
automotive
manufacturers,
banks,
insurance
companies.
So we have a wide array of customer types. Today we'll be focusing mostly on carfax.com, which targets the general consumer. We operate in various locations as well, in the US, Canada, and Europe; carfax.com predominantly serves the US market, with a little bit of traffic coming in from Canada as well.
A little bit about carfax.com: you can see our header and footer there, and some of the products and services we support, such as used car listings, the Vehicle History Report, and information on how to maintain or service your car. One thing that's unique about the consumer space on our customer-facing side is that, with some of the television ads we run, traffic can get a little spiky depending on when that TV ad runs.
Depending on the viewership, it might be during a big NFL playoff game, it might be during the Olympics, and so on, and we can see noticeable spikes in our traffic coming in, so we have to make sure we can scale to those needs. Right now we're ranked 353rd in the US by traffic volume per SEMrush, and that equates to a little over 2 billion requests coming into Gloo Edge on our architecture.
As you can tell, being a consumer-facing app, we really have to focus on speed, reliability, availability, and scalability, all the standard terms you might associate with a high-traffic website.
A little bit about how we got up to AWS and started leveraging Gloo for our needs. We started back in 2017.
The first application we migrated up to AWS was actually our CARFAX blog. Originally that was an Elastic Beanstalk application, and we migrated it to Kubernetes recently. We started back then in 2017 and progressed by migrating other applications up to carfax.com as well, such as our used car listings, our home page, and our car research pages. One thing to note: at that point in time we were in a hybrid state. We had on-prem data centers, and we were migrating up to AWS.
So we had traffic split between AWS and our on-prem data centers. In September of 2019 we began rolling out microservices leveraging Gloo Edge and started to convert over to it; we were on Traefik 1.0 at that point in time. As of September 2020 we had 100% of our traffic routing through AWS, and ultimately, in June of 2021, we had fully migrated over to Gloo Edge from Traefik. So we completed that migration last year.
Sebastian Chacko: Thank you, Mark. So once again, I'm Sebastian, Solutions Architect for the platform engineering department here at CARFAX. I want to apologize in advance for my voice; I'm fighting off a little bit of a cold, and you might hear me coughing, so sorry about that.
On this slide I'm going to talk a little bit about the architecture we have for carfax.com. Our Kubernetes environment, which encompasses Gloo Edge, is spread across two AWS regions, us-east-1 and us-west-2, and we have one EKS cluster in each of those regions. When a user wants to get to carfax.com, the first entry point they land on is CloudFront, which is AWS's CDN product. Apps that have caching configured at that layer get an immediate response, and for apps configured with S3 as an origin, CloudFront loads the content from S3 and serves it back to the user.

The piece where Gloo Edge comes in is dynamic content. CloudFront makes that call to one of our clusters, in us-east or Oregon depending on where you are, and that lands the customer on our external-facing load balancer, behind which Gloo Edge is running in our EKS environment, and that takes care of all the routing from there. Next slide, please, Mark.
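For illustration only (not CARFAX's actual manifests), the Gloo Edge gateway proxy is typically exposed to an external load balancer through a Kubernetes Service of type LoadBalancer; on EKS, an annotation can request an AWS NLB. The names, ports, and annotation below are assumptions for this sketch.

```yaml
# Hypothetical sketch: exposing the Gloo Edge gateway proxy (Envoy) through an
# AWS load balancer on EKS. Names and annotations are illustrative.
apiVersion: v1
kind: Service
metadata:
  name: gateway-proxy
  namespace: gloo-system
  annotations:
    # Ask the AWS cloud provider for a Network Load Balancer (assumption; the
    # talk does not specify which load balancer type CARFAX uses).
    service.beta.kubernetes.io/aws-load-balancer-type: "nlb"
spec:
  type: LoadBalancer
  selector:
    gloo: gateway-proxy        # default label on the Gloo Edge proxy pods
  ports:
    - name: https
      port: 443
      targetPort: 8443         # Gloo Edge's default HTTPS listener port
```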
I know Mark talked a little bit about our journey, how we landed on AWS and how we migrated over the past two to three years. The first piece we had to solve was which compute environment or provider we were going to use. We evaluated the options available at the time (Beanstalk, EC2, ECS, and so on) and ended up landing on Kubernetes because we felt that gave us the best value.
Once we were done with that, the next and most important decision was choosing an ingress controller, because that is ultimately what gets the customer to where they want to go. That was our biggest decision at the time. We evaluated the choices that were available and landed on Traefik, which is another ingress controller, like Gloo Edge. It provided simple path-based routing at the time, fit our initial use case, and we were good with that. But as our Kubernetes footprint grew, we quickly outgrew the capabilities that Traefik had at the time.
We had apps coming on board that were SPAs, SSR apps, APIs, and so on, so we needed a controller that could provide granular path-based routing as well as redirection and, most importantly, API gateway functionality like auth, rate limiting, firewalling, and the ability to hit non-Kubernetes targets, and that could handle the load we were throwing at it. We had been using AWS API Gateway (we still are), but we wanted something that integrates a little bit better with our environment than that.
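To make the routing model concrete, here is a minimal sketch of the kind of granular, path-based routing and redirection a Gloo Edge VirtualService can express. The domain, paths, and upstream names are hypothetical, not CARFAX's actual routes.

```yaml
# Hypothetical sketch of granular, path-based routing and a redirect with a
# Gloo Edge VirtualService. Domains, paths, and upstream names are made up.
apiVersion: gateway.solo.io/v1
kind: VirtualService
metadata:
  name: consumer-site
  namespace: gloo-system
spec:
  virtualHost:
    domains:
      - "www.example.com"
    routes:
      # Send an API prefix to a Kubernetes-backed upstream.
      - matchers:
          - prefix: /api/listings
        routeAction:
          single:
            upstream:
              name: listings-svc
              namespace: gloo-system
      # Redirect a legacy path to its new location.
      - matchers:
          - prefix: /old-reports
        redirectAction:
          pathRedirect: /vehicle-history
          httpsRedirect: true
```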
That's where we landed on Gloo Edge, which was the most feature-rich, technologically advanced option we evaluated at the time. Fast-forwarding to today, we're serving two billion plus requests per month through Gloo Edge, like Mark mentioned before. Mark, next slide, please.
On this slide we're going to talk a little bit about the lessons learned. The journey has taken a while, so we definitely had some positives and some lessons learned along the way. First, the positives: the Solo support is super awesome. They provide us a dedicated Slack channel and workspace where we can not only ask for help with debugging issues but also ask implementation-level questions, like when we're adding a new feature we can go ahead and ask them.
How do we add this? What is the Helm value for this? And they're super helpful with that, which is awesome. Then there's the traffic scale and chaos testing. I know we've mentioned the traffic scale a couple of times before: this domain handles a lot of traffic, and Gloo was able to handle everything we threw at it, with slightly better performance than we had with our Traefik 1.0 implementation, our previous ingress controller. And then chaos testing.
That's something we've tried to incorporate into our deployment workflow for big infrastructure releases; it's manual at this time. Basically, we go into our cluster, pick out a component, and try to fail different portions of it. With Gloo Edge, we went in and failed different components of it, and it was able to handle almost every scenario we threw at it with no issues. Moving over to the lessons learned: discovery. Gloo Edge discovery is an awesome tool to get you going.
It automatically discovers all the Kubernetes resources within your cluster and creates Upstreams for them. But what we found was that it's not super production-ready. All of our other resources are created using Argo CD, and when we had these automatically discovered Upstreams alongside other resources created through Argo, there were conflicts, and there was also a resource-usage bug in discovery. So Solo recommended that we not use it in production.
So we switched over to creating the Gloo Edge Upstreams ourselves, again using Argo CD and YAML, and that's worked well for us.
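As a sketch of what that looks like, a statically defined Upstream committed to Git and applied by Argo CD can stand in for a discovered one. The service and namespace names below are hypothetical.

```yaml
# Hypothetical static Upstream, committed to Git and synced by Argo CD instead
# of being auto-created by Gloo Edge discovery. Names are illustrative.
apiVersion: gloo.solo.io/v1
kind: Upstream
metadata:
  name: listings-svc
  namespace: gloo-system
spec:
  kube:
    serviceName: listings        # the Kubernetes Service to route to
    serviceNamespace: listings
    servicePort: 8080
```

With Upstreams managed this way, discovery can be scaled back or disabled; the exact Helm or Settings values for doing so should be checked against the installed Gloo Edge version.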
The next lesson was Deployments versus DaemonSets. Gloo can be deployed in a couple of different configurations: as a Kubernetes Deployment or as a DaemonSet.
C
We
were
running
it
as
a
payment
set,
particularly
for
so
that
we
could
run
it
on
specific
instances
and
did
not
have
that
bogged
down
by
other
resources
running
audit
so
that
the
the
the
ingress
gateways
or
the
ingress
pieces
isolated
from
the
rest
of
the
cluster
and
and
won't
be
a
cause
of
failure.
So
we
we
went
down
the
same
route
with
blue
edge
and
reviewed
it
solo
and
they
they
recommended.
We
stick
with
daemon
sets
with
our
current
architecture.
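For reference, the Gloo Edge Helm chart lets you choose how the proxy is deployed. A values sketch along these lines would pin the proxies to dedicated ingress nodes as a DaemonSet rather than a Deployment; the key names and node label here are assumptions to verify against the chart version in use.

```yaml
# Hypothetical Helm values sketch: run the gateway proxy as a DaemonSet pinned
# to dedicated ingress nodes. Key names are assumptions; verify against the
# Gloo Edge chart docs for the version in use.
gatewayProxies:
  gatewayProxy:
    kind:
      daemonSet:
        hostPort: true               # bind on each ingress node (assumption)
    podTemplate:
      nodeSelector:
        node-role/ingress: "true"    # illustrative label for dedicated nodes
```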
C
Next
live
view,
smart
and
so
what's
next
so
glue
ads
like
I
talked
before,
we
went
to
where
we
started
using
blue
edge
for
all
of
these
api
gateway
type
features
which
we
are
starting
to
explore
right
now,
like
one
of
the
things
that
our
teams
are
really
excited
about,
is
the
blue
edge
and
the
lambda
integration,
so
that
teams
don't
are
not
isolated
into
only
running
kubernetes
services
behind
lued.
C
So
we
can
just
use
the
same
api
gateway
to
also
front
lambda
applications
and
team
teams
have
that
flexibility
so
which
is
awesome
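A rough sketch of what that integration looks like, with made-up function, secret, and region values: an AWS-type Upstream lists Lambda functions that routes can then target.

```yaml
# Hypothetical sketch of the Gloo Edge AWS Lambda integration. Function, secret,
# and region values are placeholders, not CARFAX's configuration.
apiVersion: gloo.solo.io/v1
kind: Upstream
metadata:
  name: lambda-us-east-1
  namespace: gloo-system
spec:
  aws:
    region: us-east-1
    secretRef:                       # Kubernetes secret holding AWS credentials
      name: aws-creds
      namespace: gloo-system
    lambdaFunctions:
      - logicalName: report-preview
        lambdaFunctionName: report-preview
        qualifier: "$LATEST"
```

A VirtualService route can then point at this upstream with an AWS destinationSpec that names the logical function, so the same gateway that fronts Kubernetes services also fronts Lambda.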
Another thing we're really excited about Gloo Edge having is the OIDC integration, where we can add authentication to any page we want without really having to make any app changes. You just configure OIDC with an SSO provider, and voila, your page is authenticated.
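That capability comes from Gloo Edge Enterprise's external auth. A minimal sketch, with a hypothetical issuer, client, and app URL rather than CARFAX's real settings, looks roughly like this AuthConfig:

```yaml
# Hypothetical sketch of OIDC authentication with Gloo Edge Enterprise ext-auth.
# Issuer, client, and URLs are placeholders, not CARFAX's real configuration.
apiVersion: enterprise.gloo.solo.io/v1
kind: AuthConfig
metadata:
  name: sso
  namespace: gloo-system
spec:
  configs:
    - oauth2:
        oidcAuthorizationCode:
          appUrl: https://www.example.com
          callbackPath: /callback
          clientId: my-client-id
          clientSecretRef:           # Kubernetes secret with the client secret
            name: oidc-client-secret
            namespace: gloo-system
          issuerUrl: https://sso.example.com/
          scopes:
            - email
```

A VirtualHost or individual route then opts in by referencing this AuthConfig in its ext-auth options, which is why no application changes are needed.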
So that's awesome. The last thing we're working on, or hoping to work on, this year is Gloo Mesh. One of the things that's been missing from our environment is a proper service mesh. Towards the end of last year we evaluated a few different options; we looked at Linkerd, AWS App Mesh, and Istio, and we felt Istio was our best choice, but we definitely felt that Istio is a big beast to tackle by itself.
So when Gloo Mesh came out, that was really exciting for us, because it abstracts away a lot of the management piece for Istio, so we're really excited to try and get that out this year.