Description
CacheFilter: Flexible HTTP Caching in Envoy - Josiah Kiehl, Todd Greer
Web traffic relies extensively on caching proxies, and Envoy needs robust HTTP caching support to perform that role, but scaling and feature requirements vary too much for a "one size fits all" implementation. CacheFilter is an Envoy filter that handles the many caching-related request and response headers and directives, with the customizability and extensibility to support anything from single-server deployments to planetary-scale caching systems with extensive bespoke needs.
A
Good morning, and thank you for your interest in Envoy-based caching. I'm Todd Greer, and today I'll be describing the implementation of Envoy's HTTP caching filter. But first I've asked my colleague Josiah Kiehl to say why you want caching and how to enable it. Josiah, why does Envoy need a caching filter?
B
We have all of these clients out on the wide internet connecting to our infrastructure through an Envoy, which picks a backend service and returns the content from those services to the clients. We want to reduce the load on those backend services, so we can scale them further, and to reduce the latency of retrieving the content in the first place.
B
So we want that Envoy to cache the content where possible. Whenever cacheable content comes back through the Envoy in response to a client request, we insert it into the cache via the cache filter as well as proxying it back to the client. That way, subsequent requests will go to the cache filter, get a cache hit, and go straight back out to the client without incurring the backend service cost.
B
This is particularly useful in widely distributed architectures, where the services could be in different data centers, different cloud regions, or however you might imagine. We want the content to be as close to the requesting client as possible, so we can deploy Envoy instances way out in satellite locations, which may or may not have instances of the requested service deployed there.
B
That Envoy would then route the traffic to the data center where the services exist. The request would be processed, the content retrieved and sent back through the internal infrastructure to the Envoy where the client requested it, and the content returned. At that point the content gets cached locally, as close to the clients as possible, making all future requests substantially faster, because we don't have to make those long-distance remote service calls.
B
Another situation where this might be useful is if you have Envoy deployed in a service mesh, where Envoy handles the inter-service communication within your backend infrastructure. That isn't the first architecture we're considering when designing the cache filter, but I can imagine, especially with an in-memory cache, it could be useful to cache the content that one service requests from another, to reduce the traffic passing between the services.
B
So how do I use this cache filter? That all sounds great; now we can see how it will help. The simplest way is to take a look at the cache filter sandbox, which exists so that cache filter developers can spin up a quick Envoy instance with caching enabled. The config that turns caching on simply adds the cache filter to the HTTP filter chain; cache lookups happen at the place where the cache filter is inserted into the chain.
B
A request coming through will make a lookup to the cache that's configured there (in this case the simple HTTP cache) and retrieve the content from it. Anything else that affects cache behavior, such as which Vary headers we respect from the backends, that is, how the content may differ from request to request, also gets configured here, and that config will very likely grow as feature development continues.
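As a rough illustration of that config (a sketch only; the exact type URLs and fields vary across Envoy versions, and the sandbox config in the Envoy repo is authoritative), enabling the filter looks something like this:

```yaml
http_filters:
# Insert the cache filter ahead of the router so lookups happen
# before the request is forwarded upstream.
- name: envoy.filters.http.cache
  typed_config:
    "@type": type.googleapis.com/envoy.extensions.filters.http.cache.v3.CacheConfig
    # The storage backend is itself a typed extension; here, the
    # example in-memory SimpleHttpCache.
    typed_config:
      "@type": type.googleapis.com/envoy.extensions.http.cache.simple_http_cache.v3.SimpleHttpCacheConfig
- name: envoy.filters.http.router
```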
A
Thank you, Josiah. So how does the cache filter work? If you're watching this presentation, you probably have some familiarity with how Envoy manages its HTTP filters. Envoy has a chain of filters: when a request comes in, the filter manager iterates through the chain in order, notifying each filter. When the response comes back, it goes through the chain in the opposite order.
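That ordering can be sketched in a few lines (a toy model in Python with made-up names; the real filter manager lives in Envoy's C++ core):

```python
class Filter:
    """Toy stand-in for an Envoy HTTP filter."""
    def __init__(self, name, log):
        self.name = name
        self.log = log

    def decode_headers(self):  # called on the request path
        self.log.append(f"decode:{self.name}")

    def encode_headers(self):  # called on the response path
        self.log.append(f"encode:{self.name}")


def run_chain(filters):
    # Requests walk the chain front to back...
    for f in filters:
        f.decode_headers()
    # ...responses walk it back to front.
    for f in reversed(filters):
        f.encode_headers()


log = []
run_chain([Filter("cache", log), Filter("router", log)])
# The cache filter sees the request first and the response last.
```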
A
This allows HTTP cache plugin implementers to focus only on storage, or whatever other value-added behavior their plugin needs to provide, which enables a wide variety of plugins for divergent needs. Those plugins can be HTTP-aware if needed, but they can also be simple key-value stores.
A
We have an example, the SimpleHttpCache, that is in fact just a wrapper around a hash map. When Envoy has parsed an HTTP request's headers, it calls the decodeHeaders method of each filter. When it gets to the cache filter, if it's a GET request, we look in the cache for a matching response.
A
While the lookup runs, we tell Envoy to pause filter iteration; otherwise the request would get sent upstream while we're busy checking the cache, which would cause a problem if we got a hit. When the cache plugin completes the lookup, it invokes our callback with the results. In the case of a hit, those results include the cached response's headers, which we pass on to the filter manager by calling encodeHeaders.
A
We certainly intend for most lookups to be cache hits, but those that aren't are referred to as cache misses. A miss can happen because the entry is literally not in the cache, or because something was found in the cache but is too stale to serve, or for some other reason the entry can't actually be used. Either way it counts as a miss.
A
If we back up to the point where we're getting the response from the lookup context: in the previous scenario we got a result that said, this is a cache hit, here are the headers. In this scenario we get a result that says, sorry, this is a miss. When that happens, instead of calling encodeHeaders and giving Envoy headers to send to the client, we simply call continueDecoding, which tells Envoy: you know how we had you pause earlier? Sorry about that.
A
Just keep on going, nothing to see here, proceed as usual. And of course Envoy does: it iterates through the remaining filters, and on we go. When that happens, the request will presumably generate a response that comes back into the cache filter in the other direction, and we'll see those headers in the encodeHeaders call from the filter manager. In encodeHeaders we've actually got quite a bit of logic to do.
A
To
figure
out,
we've
got
to
look
for.
You
know,
look
at
the
different
rules
for
whether
something
is
cacheable
you
know.
Is
there
an
authorization
header?
Is
there
a
cache
control
header?
What
are
the?
What
are
the
directives
all
these
different?
Is
it
a
response
to
conditional
headers,
all
the
all
these
sorts
of
different
things
that
need
to
be
evaluated?
We
evaluate
them
and
once
we've
done
that,
if
we
determine
that,
in
fact
this
is
a
cachable
response,
we
will
of
course,
then
cache
it.
A
Now, we don't really care what the results of the insertion are in terms of affecting our behavior. We probably need to report some stats (stats are one of the outstanding items), but we're going to respond the same way regardless; we let the response pass through unchanged, so we don't actually wait for a result when inserting headers.
A
We call the insert, fire and forget, and keep going. When Envoy eventually tells us "here's a body" (assuming there is in fact a body in this response), via the encodeData callback from the filter manager, we then, as you'd expect, turn around and insert that body into the insert context. We fully expect that it should be able to deal with it, and if it can't, that again won't affect this response, because the primary thing happening here is passing the response along.
A
Perhaps a server is overloaded, or there's some non-standard header that the cache looks at; for whatever reason, if it wants to, it can simply refuse to insert, and that is fine. See the comments on the InsertContext class for more details.
A
We are going to be making a few changes there in the near future to better report statistics. So, to write a plugin for the cache filter, these are the four classes you need to implement: HttpCache, along with HttpCacheFactory, the LookupContext, and the InsertContext, which is the analog on the insert side.
A
So you don't need to worry about anything beyond those for now. With that, I'll hand it back to Josiah to talk about the current state of development on this project. Josiah?
B
So, is the cache filter production ready? From a cache-semantics standpoint, as in, is the cache filter RFC compliant: in many cases, yes. Basic cache requests, including Cache-Control and Vary, and validation request flows with ETags and Last-Modified, are all implemented and ready to go. Some of the more advanced validation logic, like If-None-Match and so on, is another matter.
B
Like
those
listed
there,
that's
not
yet
implemented
and
we'll
actually
just
skip
caching,
if
those
are
present
and
the
cache
control
extensions
like
immutable
and
these
others,
those
are
also
not
yet
implemented,
but
they're
not
as
commonly
used.
B
If
you're
asking
will
it
work
with
the
cash
that
I
have
in
my
infrastructure
today,
the
answer
is
no.
We
do
not
have
any
production-ready
implementations
of
http
cache.
The
only
cache
implementation
that
exists
today
is
the
example.
B
One
simple
http
cache,
and
that's
really
just
there
so
that
if
you
wanted
the
envoy
cache
filter
to
work
with
ignite
or
with
memcached
or
whichever
then
you
would
have
to
write
an
implementation
of
http
cache
so
that
the
cache
filter
could
use
it
and
serve
content
from
that
remote
from
that
remote
cache,
there's
a
whole
list
of
issues
on
github
that
we
know
we
need
to
have
done
before.
We
can
declare
this
thing.
Production
ready.
B
One
of
the
most
important
is
that
the
in-memory
cache,
which
I
mentioned,
the
simple
http
cache-
is
not
scalable.
It
currently
doesn't
do
any
memory
management
it
will.
You
can
spin
up
envoy,
have
it
cache
your
content
and
it
will
very
quickly
run
out
of
memory
because
it
doesn't
do
any
sort
of
management
on
the
back
end.
B
There's also some other basic functionality, like serving HEAD requests, and important things like gathering stats on cache requests, plus a whole list of other things that need to be done, all filed under the area/cache label in GitHub. If all that sounds great and you're ready to dive in and help, one of the most important things we need people to contribute is plugins for the various caches.
B
So
if,
if
you
have
expertise
in
any
of
these
caches
and
want
envoy
to
work
with
them,
please
write
an
implementation
for
the
http
cache
interface,
so
that
envoy
can
talk
to
it.
The
interface
is
ready
to
go
and
it
would
be
great
to
have
these
implementations
to
test
the
cache
filter
itself
against,
and
so
we
would
happily
support
that
effort.
B
We are almost always logged in there, because this is part of our day job, and the list of issues that we know need to be done is currently filed under that label I mentioned. If any of those catch your interest, you can either post some comments in the issues or tag us on Slack, and we'll get you started.
B
So that does it for our presentation. Thanks for following along, and thank you even more if you're looking to get started contributing to the cache filter. Our contact information is right there, and we will take questions from here.
A
Okay. I think you may have said something, but I didn't hear anything.
A
Okay,
I
wanted
to
mention
something
about.
There
was
a
question
earlier
about
cash
purge.
One
of
the
things
that
we
need
to
figure
out
is
the
is
the
approach
is
used
for
catchphrase
because
different
style
caches
have
different
needs,
you
know
so
for
some,
you
can
do
it.
What
is
literally
cashback
you
go
and
delete
the
entries
that
you
want
to
be
gone
for
others.
You
do
a
invalidation
approach
where
you,
where
you
record
entries
that
say
hey
if
you
find
the
thing
in
the
cache,
don't
serve
it.
B
Just another mic check: can you hear me now? Yes? Excellent. I think one of the other questions we haven't addressed in the chat is: is it possible to cache just one route match from the list? I'm assuming that means something like cache key configuration: deciding what parts of the path contribute to the cache key, whether or not to include query params, whether to include the protocol, those sorts of things.
B
Okay, so to answer: if that's the question, then that is a planned feature. It's not currently supported, and it would be one of those things, I might have mentioned it in the slide about things we would add to the config, like how the cache decides whether to split entries or not.
A
Yeah-
and
some
of
that
is
already
in
the
config,
just
not
it
doesn't
have
any
effect
yet.
A
Yeah
another
thing,
so
so
what
I
think
the
question
was
was
talking
about.
You
know
the
fact
that
filter
config
is
per
listener,
and
that
tells
you
what
filters
are
in
stack
and
then
you
wanted
to
have
different
routes
have
different
config.
That
is
something
we
definitely
need
to
add,
is
per
route
configuration
and
that
just
that's
just
a
matter
of
getting
that
done.
B
So
we
say,
like
does
the
interface
to
http
cache
plug-in
allow
for
coalescing?
I
believe
it
does,
but
I
think
todd
you
would
probably
have
a
bit
more
insight
on
that.
A
Yes,
it
absolutely
does
so
all
you
need
to
do
for,
for
coalescing
is
basically
have
multiple
things
come
in
and
if
they're
misses
just
don't
respond
to
the
the
second
third,
whatever
one
telling
us
that
that
it's
a
miss
just
just
let
it
go,
I
just
let
it
sit
and
wait
and
that
works
fine.
Now,
probably,
we
would
need
some
configuration
around
like
you
know,
maximum
delays
and
stuff
like
that,
but
fundamentally
yeah
you
could
do
it
in
a
plug-in.
B
Yeah, the next question is: how do items get pushed out of the cache? The short answer is that's up to the plugin, and the plugin currently implemented, the SimpleHttpCache, just doesn't do it. Depending on how the cache we're talking to works, whether it's a remote cache like Redis or something else, or an in-memory cache written completely within Envoy, how that's managed is going to be plugin-specific. But the SimpleHttpCache just doesn't do it.
A
Yeah
and
just
to
be
clear,
that's
just
because
we
haven't
gotten
around
to
it.
B
Yeah
sure
I
mean
we
are
not
going
to
go
into
production
without
a
feature
like
that,
like
that
this
is
just
like
the
simple
http
cache
is
good
for
development
and,
it's
good
to
say,
hey
my
caching
semantics
work,
but
it
is
not
good
to
put
in
front
of
live
traffic.
A
Yeah,
I
do
think
that
we
are
going
to
need.
There
are
more
configuration
options
that
will
need
to
be
added.
So,
like
I
assume,
any
cash
plug-in
is
gonna
need
a
mac
space
option
a
max
time,
yeah,
probably
so
as
well,
and
you
know
so.
There
are
probably
some
other
things
that
are
universal.
We
also
in
in
standard
envoy
fashion.
Have
you
can
specify
you
know,
opaque
to
cat
stuff,
that's
okay,
to
catch
filter
that
is
just
handed
to
the
plugin,
for
whatever
configuration
you
need,
yeah.
A
One thing I wanted to explicitly mention, because I don't think we've mentioned it before, is cache admittance policy. We might expand the cache filter so you can have different policies there, but another option is this: just because we call your plugin and say "here's something, please insert it", you don't have to actually insert it. You can say "gee, thanks, nope, I'm going to pass" and not insert it. So you can do whatever you want there.
B
Yeah
to
answer
shakti's
question,
so
the
plan
is
to
to
say
yes
that
you
can
use
redis
as
a
remote
cache
with
http
cache
filter
or
with
the
cache
filter,
but
there
is
not
currently
a
plug-in
implementation
that
implements
redis's
api.
So
once
somebody
gets
inspired
and
says:
hey
I'd
really
like
to
use
redis
with
envoy
and
writes
the
plugin
for
it
then
envoy
will
support
talking
to
it,
because
the
interfaces
are
all
there.
A
The idea is to make it a plugin, so that whatever's special lives in the plugin. I don't think we're going to be contributing the Redis cache ourselves, just because that doesn't happen to be relevant to Google's business needs, but we are absolutely going to do anything we can to hold your hand while you add it.
B
Right
and
like
to
build
on
that,
we
are
not
red
as
experts,
we
don't
use
redis,
and
so
it's
probably
not
good
for
us
to
be
writing
that
plug-in
anyway.
B
We
wouldn't
be
very
good
at
keeping
up
with
releases
and
making
sure
that
it
and
that
sort
of
thing
like
it's-
it's
not
good
for
us
to
own
things
that
we
don't
use,
but
it
is
in
our
best
interest
to
have
somebody
else
contributing
those
so
that
we
have
users
adding
requirements
to
both
the
cache
filter
itself
and
the
http
cache
interface
like.
If
we're
missing
a
piece
of
the
interface
that
something
like
redis
or
memcached
and
stuff
need.
B
Then
we
want
to
be
extending
the
generic
portions
of
of
the
code
to
to
support
those
things,
and
so,
if
somebody
comes
along
with
the
specific
needs
that
redis
has
we're
happy
to
support
those
needs,
we
just
don't
want
to
own
the
redis
http
cache
implementation
itself.
A
Yeah
so
yeah,
please,
you
know
file
bugs
prs
questions.
I
think,
which
I
mentioned
earlier.
You
know
a
lot,
often
on
we're
routinely
on
on
slack.
You
know
we
answer
email,
all
that
stuff,
so
we
we
are
motivated
to
help
any
efforts
on.
B
In the main repo, yeah. And in fact, if you just add the cache config that I mentioned earlier in the presentation, then it'll load into your filter chain, because it's merged into mainline right now.
B
Yeah. We definitely don't think it should be used in production, but if you have the time and ability to bulletproof it, then by all means.
B
We plan to get it to production ready, but it's not there yet. What is missing to run it in production is an HttpCache implementation that is scalable; the only implementation we have right now is not production ready. That's the primary thing. The basic cache semantics are ready: it supports Cache-Control and all of the basic TTL-type headers. There's some more advanced stuff it doesn't support yet, like some of the more unique validation flows, but for basic caching it'll work.
B
To describe "scalable" in this context: the clearest way to point out that the SimpleHttpCache, the only HttpCache implementation, is not ready for production is that it does absolutely no memory management. It will keep adding entries to the cache until you run out of memory and Envoy probably crashes. That's the most obvious flaw, but it also doesn't do sharding or other things that impact performance; you're probably going to get lock contention. It's just written as an example.
A
Yeah
and
we
think
we
can
turn
it
into
a
production
quality
thing,
while
still
being
a
good
example.
You
know
if
that
person's
wrong,
maybe
we'll
split
it,
but
that's
the
plan.
B
We
actually
had
a
slide
on
that.
How
many?
How
many
like
it's
it's,
a
relatively
simple
interface.
I
wonder
if
we
have
this
an
easy
way
to.
A
Show
that
less
than
a
dozen
much
less
than
it
doesn't,
where
is
this
yeah?
If
you,
if
you
check
with
slides
it'll,
be
in
there,
we
don't
have
time.
B
Yeah
we've
got
about
30
seconds
left,
but
actually,
if
you,
if
no
it's
fine,
it's
like.
If,
if
you
look
at
the
http
cache
class
in
the
in
the
github
like
search,
then
then
you
should
be
able
to
see
it
like.