Description
Deep dive into Calico's new eBPF dataplane, which is now GA. You'll learn about its advanced features, including its high-performance Kube-proxy replacement that preserves source IP all the way to the pod. The talk will also touch on Calico's other dataplanes; Calico now supports Windows nodes (including in open source!) and there's a fast-maturing VPP port in the works.
A: Oh, there we go. Yep, so I'm going to talk about Calico's eBPF data plane, and I have a couple of slides at the end about its friends. We have more than one data plane, and some of them you may not have heard of before, so I thought I'd drop a couple of slides in at the end to tell you about the other data planes that Calico has and why you may or may not want to use them.

A: That's the agenda for the talk. Mostly I'm going to talk about the eBPF data plane; that's what I've been working on for the last year or so, and I'm going to take you through what it's all about, why we did it, and how fast it is (everyone's always interested in that), and then there are just a couple of slides at the end about the others.
A: So, without further ado, I'll dive in. What is Calico's eBPF data plane? It's an alternative data plane for Calico. Calico is the most widely used networking and security solution for Kubernetes; there are hundreds of thousands of clusters out there using it. There are a few parts to Calico: we have our data model, which is stored up in the Kubernetes API server.

A: We've got our calculation logic that takes all the policy and distils it for every host, and then we have the actual implementation: how do we get packets around through the Calico network, how do we secure them, how do we drop the bad ones and let the right ones through? That's the data plane, and we've had pluggable data planes in Calico for a while.
A: I'll talk about a couple of the other ones later on. eBPF is the one that we've recently added, and it's what it sounds like: we use eBPF instead of the standard Linux networking technologies that the standard Calico data plane is based on, mainly iptables, Linux routing and those kinds of things.

A: Instead of that we're using eBPF, and I've got a slide coming up that explains eBPF if you're not super familiar with it. So why eBPF? Well, we can do things with eBPF that we can't do in the old world, in the standard Linux data plane. We've got some eBPF-only features, which I'm going to cover in more detail.
A: We can also give you great performance, so I'm going to cover the sorts of trade-offs that we make there and what we can do, later on. So: whizzy new features that we can't do in the old data plane, great performance, and, as a word of warning, a little caveat.

A: It has some new features that we don't have in the iptables world, but it also lacks a few features that we do have in the iptables world. Some of those are due to fundamental differences between the two. I have a list of them later, but for example the iptables LOG action is not available from eBPF, because it's an iptables-specific feature, so we can't use that.
A: So eBPF: what's it all about? You've probably heard of eBPF already, but just to recap, it's a virtual machine that runs inside the Linux kernel, a bit like the Java virtual machine, and it runs its own type of bytecode. The name means extended Berkeley Packet Filter, but it's not only used for packet filtering these days, so the name is a bit of an anachronism; it just so happens that Calico is using it for packet filtering.

A: So it can be a little bit confusing when you hear of other uses, like monitoring and that kind of thing.

A: A key thing to know about eBPF is that the mini programs we can put in the kernel, which run on this virtual machine, are event-driven: they're triggered by something happening in the kernel. It's not like a user process that just runs and runs and can do things, set timers, display an animation or whatever; it has to be triggered by something happening.
A: Some examples: a packet arriving, which is the key one for Calico. A packet arrives, the eBPF program runs, and maybe it drops the packet; that's one of the things that that particular eBPF hook can do. Maybe it allows the packet through. Maybe it decides to turn the packet around, swap its headers and respond with an ICMP message; there are a couple of places where we do that in the Calico eBPF data plane.

A: So it has some flexibility in what it can do. It can mangle the packet in basically any arbitrary way, it can drop it, or it can allow it through for normal processing, but what it can do is constrained to the packet and the particular hook it's attached to. Similarly for a packet being sent: that hook could drop it, allow it, or encapsulate it and send it down a tunnel or something like that.
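To make that concrete, here is a minimal sketch of a tc-attached eBPF program (deliberately simplified: IPv4 and TCP only, no IP options, and not Calico's actual code). It runs when a packet hits the hook and returns a verdict: drop, or carry on through the normal stack.

    #include <linux/bpf.h>
    #include <linux/if_ether.h>
    #include <linux/in.h>
    #include <linux/ip.h>
    #include <linux/tcp.h>
    #include <linux/pkt_cls.h>
    #include <bpf/bpf_helpers.h>
    #include <bpf/bpf_endian.h>

    SEC("tc")
    int ingress_filter(struct __sk_buff *skb)
    {
        void *data     = (void *)(long)skb->data;
        void *data_end = (void *)(long)skb->data_end;

        /* Every packet access has to be bounds-checked, or the verifier
         * rejects the program at load time. */
        struct ethhdr *eth = data;
        if ((void *)(eth + 1) > data_end || eth->h_proto != bpf_htons(ETH_P_IP))
            return TC_ACT_OK;                  /* not IPv4: leave it alone */

        struct iphdr *ip = (void *)(eth + 1);
        if ((void *)(ip + 1) > data_end || ip->protocol != IPPROTO_TCP)
            return TC_ACT_OK;

        struct tcphdr *tcp = (void *)(ip + 1); /* assumes no IP options */
        if ((void *)(tcp + 1) > data_end)
            return TC_ACT_OK;

        /* The verdict is simply the return value: drop TCP/8080 here,
         * let everything else continue into the normal stack. */
        if (tcp->dest == bpf_htons(8080))
            return TC_ACT_SHOT;
        return TC_ACT_OK;
    }

    char _license[] SEC("license") = "GPL";

A program like this gets compiled to BPF bytecode and attached to an interface's tc hook (with the tc tool or libbpf); the same structure extends to rewriting headers or redirecting the packet somewhere else entirely.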
A: There's a hook for every syscall that a program makes, and people are using that to police syscalls and generate audit logs of what every program does, and maybe block programs from accessing certain files, that kind of thing. There are hooks for deciding which socket a packet goes to when it arrives at the host, for choosing the source IP when a packet is leaving the host, and all kinds of things across all the different subsystems.

A: Since running arbitrary code in the kernel would be dangerous (if you load a kernel module or something like that, it runs in kernel space and can do anything), the eBPF programs that we load are all subjected to a verification process. The kernel has a very rigid verifier.
A: It ensures that your BPF program cannot access memory that it's not allowed to, and that it cannot run forever: it's not allowed to tight-loop or run more than a certain number of instructions. That's how they ensure it terminates. It's also just generally quite locked down: the functions it can call within the kernel are limited to a specific allow list. So it's fairly safe, although of course if we're dropping packets and we drop essential ones, then obviously that could cause problems.
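As a rough illustration of what the verifier does and doesn't accept (a sketch, not an exhaustive list): packet memory can only be read after an explicit bounds check, loops need a bound the verifier can prove, and only the helpers allowed for that hook can be called.

    #include <linux/bpf.h>
    #include <linux/pkt_cls.h>
    #include <bpf/bpf_helpers.h>

    SEC("tc")
    int verifier_friendly(struct __sk_buff *skb)
    {
        void *data     = (void *)(long)skb->data;
        void *data_end = (void *)(long)skb->data_end;

        /* Rejected at load time: reading *(__u8 *)data without first
         * proving data + 1 <= data_end. Accepted: the checked version. */
        __u8 first_byte = 0;
        if (data + 1 <= data_end)
            first_byte = *(__u8 *)data;

        /* Accepted: a loop with a fixed, provable bound. An unbounded
         * loop (e.g. while (1)) is rejected, as is blowing through the
         * overall instruction budget. */
        __u32 sum = first_byte;
    #pragma unroll
        for (int i = 0; i < 8; i++)
            sum += i;

        /* Only allow-listed helpers for this hook may be called; you
         * cannot call arbitrary kernel functions from here. */
        bpf_printk("first byte %u, sum %u", first_byte, sum);

        return TC_ACT_OK;
    }

    char _license[] SEC("license") = "GPL";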
A: So why choose eBPF? I'm going to contrast it with the iptables data plane and explain why it's different. The iptables data plane is baked into the kernel as the netfilter subsystem. It does get a lot of active development, but because it's C code that's part of the kernel, its development cycle is quite slow. If they're adding a new feature there, they have to worry about backwards compatibility with absolutely everything that's out there.

A: A feature may go in today, but that kernel will not be available in, say, Ubuntu until two, three, four years down the line, when that particular kernel gets rolled out. The great thing about that is that it's generally very stable and battle-hardened and has been around for years.
A: Even nftables, which is kind of the version two of iptables, has been around three years at this point. So: stable, but slow-moving, with a slow development process. It's also very general, so it can handle all the things that the Linux kernel is capable of handling: tunnelling packets, IPsec, bridging, routing; all that sort of stuff is integrated into it.

A: That's what this diagram on the right-hand side shows: it's a diagram of netfilter and the whole networking stack, but there's only one path through it for any particular packet. It comes in on the left-hand side and goes through these stages one by one; some of them make choices and send it up into a different layer, and some of them loop it round, but you're very constrained in that world.
A: You can't do anything that's super creative. You can't bypass a big chunk of it if you can spot early on that this packet doesn't need all of that extra processing, and if you want to jump from one place to another to do something interesting, you don't have that capability without patching the kernel, which is a non-starter for most products.

A: With eBPF you can do things like add a custom encapsulation header, switch the MAC address and make the packet go straight out of an interface. That's the kind of thing you can do in BPF land, and that's great if you're trying to squeeze the most performance out of a system. That's one of the ways I like to think about eBPF: it lets you trade generality and compatibility for performance.
A: A lot of the performance that we get in the BPF data plane comes from doing exactly what this red line is doing. We take a packet that's come in on the left-hand side, and when it hits the qdisc box on the left (don't worry about the eye chart), we can pick it up and send it directly to a local Kubernetes pod, or, if we're load balancing at ingress for a node port or something in Kubernetes, handle that too.

A: We can turn it straight around and send it back out of the same interface, and we don't have to go through all of the blocks in the diagram and pay the price that they carry. The counterpoint is that some of the blocks in the diagram may be useful in your particular scenario, like the boxes in the top right of the diagram where it loops around on itself.
A: Those are the boxes that handle IPsec traffic. So if you do this bypass, you can't do IPsec, because you've bypassed the IPsec subsystem. That's fine for a lot of use cases, but it's something you need to be aware of. There's no free lunch: the C code in the kernel is pretty fast for what it does.

A: But if you bypass a big chunk of it, you can get some good wins, and you can do some much more flexible, creative things that solve problems you're not able to solve in the other world. That's my take on it. When other folks have done micro-benchmarks, and when I've done the same, if you run really tight micro-benchmarks for certain operations in iptables and BPF, sometimes the BPF one is slower.
A: Sometimes it's faster. They're both pretty well optimized for what they do, but it's really the flexibility, and the ability to make these interesting trade-offs, that you get a lot from. So it's good for different users and different use cases. Obviously BPF is a feature of newer kernels, so if you're on a nice stable older kernel, you can just stick with iptables with Calico.

A: That's no problem. So let's talk a bit more about some of the flexibility that it brings.
A: One of the pain points in Kubernetes networking for a long time: when you're using kube-proxy and you have some external traffic, then in order for kube-proxy to take in traffic from a node port and send it on to a backing pod, it ends up needing to SNAT the traffic. The traffic arrives at the first host, let's call it the ingress host, and the host detects that it's a node port using iptables rules, or IPVS if you have that turned on.

A: So the packet comes in and the host does a DNAT, which is what it really wants to do, and then it has to do an SNAT as well, changing the source IP to be the host's IP. That means that, for all of your web server logs and for network policy, the source IP you see is really not very useful: you see the host IP where the packet arrived rather than the original source IP from outside.
A: With BPF we can deal with this problem, kind of break some of the rules that are in place in the standard Linux data plane, and do something a little bit different. For example, a packet comes in to a node port and the BPF program handles it; this is how Calico works.

A: We replace kube-proxy with code in our BPF programs. We do the load balancing there, and then, rather than doing an SNAT, we encapsulate the packet, keeping it exactly as it is inside, but we stick a VXLAN header on it. We send it to the correct node, the backing node that has the backing pod on it. We have a BPF program there that catches that packet, decapsulates it, and does the DNAT.
A: The packet then goes to the pod, and the pod will be expecting packets with that IP address. The packet arrives at the pod still carrying the original source IP address; we just didn't change it. And the way we've made sure that we're on the reverse path is that we did the DNAT on the backing host rather than the ingress host.

A: So when the pod responds, we have a BPF program there that can catch the response packet and reverse it all the way back along the chain. That's the kind of creative thing that you can do in BPF land that you couldn't really do in iptables land.
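To make the shape of this more concrete, here is an illustrative sketch of the kind of map that could drive it (the names and layouts are hypothetical, not Calico's real data structures): the ingress node resolves the node port with a single hash-map lookup, picks a backend, and encapsulates the still-unmodified packet to the backend's node instead of SNATing it.

    #include <linux/bpf.h>
    #include <bpf/bpf_helpers.h>

    /* Key: the node port the client hit. Value: the chosen backend. */
    struct svc_key {
        __be32 dst_ip;     /* node/service IP the packet was addressed to */
        __be16 dst_port;   /* node port */
        __u8   proto;      /* IPPROTO_TCP or IPPROTO_UDP */
        __u8   pad;
    };

    struct svc_backend {
        __be32 pod_ip;     /* backing pod to DNAT to (on another node) */
        __be16 pod_port;
        __u16  pad;
        __be32 node_ip;    /* node hosting the pod: outer VXLAN destination */
    };

    struct {
        __uint(type, BPF_MAP_TYPE_HASH);
        __uint(max_entries, 65536);
        __type(key, struct svc_key);
        __type(value, struct svc_backend);
    } nodeport_map SEC(".maps");

    /* Conceptual flow (backend selection and the VXLAN encap/decap elided):
     *
     *   ingress node:  backend = bpf_map_lookup_elem(&nodeport_map, &key);
     *                  if (backend) vxlan_encap_to(backend->node_ip);
     *                  // the inner packet keeps the client's source IP
     *
     *   backing node:  decap, DNAT to backend->pod_ip/pod_port, and record
     *                  the reverse mapping so the pod's reply can be
     *                  un-DNATed and sent back along the same path.
     */

Because the per-packet work is essentially one hash lookup, the cost also doesn't grow with the number of services in the way a long iptables chain does.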
A: I believe IPVS does allow this kind of thing, but it's subtly wrong for Kubernetes, just because of how it assumes the backing pod would receive the traffic: the encapsulated packet would have to go all the way to the backing pod, which means your backing pod has to be rewritten, and it kind of changes the networking model.

A: At least, I think so, if I understand that correctly. So yeah, that's the sort of thing we can do there, and I guess I should do a demo and show some of this stuff.
A: Hopefully that means you can all see a couple of windows. On the left I have the Google Cloud microservices demo, which is a great test app for a new data plane because it runs lots of services. It uses Kubernetes services to load balance between them all, and each of them is written in a different language, so you get to see all the different networking quirks of the different languages and all of their different DNS behaviour and things like that. I started this a while ago.

A: So hopefully it's still running. I can page around, I can buy a vintage typewriter and place my order, and it will create an order in its database. My cluster right now is running with kube-proxy and Calico in iptables mode. So if I do this...
A: There we go. These are the services running in the default namespace, all the pods that make up the microservices demo. We've got calico-node running in iptables mode, we have kube-proxy running, and that's all working. I've got an nginx running as well, and the reason I've got that is that I want to show some access logs.
A: I'm about to disable kube-proxy, which means that Calico takes over from kube-proxy. In order to bootstrap the whole system, we have to tell Calico the piece of information that kube-proxy normally knows, which is how to reach the API server directly, not through the Kubernetes service IP. I applied that already and restarted the Calico pods, and now I'll take a look at the nginx log.
A: Do I have the right notebook? OK, I gave it a refresh, and now it's working. The thing to note is that I'm hitting one of the nodes, and the IP address that ends up in the logs is 10.128.136, which is the IP of the node that I'm coming through, and that's not really very useful: it's not useful for the log and it's not useful for policy either.
A: So what am I doing here? I'm doing a kubectl patch of the kube-proxy DaemonSet (this is all in our docs), adding a node selector to it that makes it only run on nodes that are explicitly tagged with non-calico.
A: I haven't tagged any nodes with non-calico, so kube-proxy just won't run anywhere. It's a nice simple way to disable kube-proxy temporarily in this case. So if we do that... I think, because I set this cluster up on GCP, it has some auth going on in the background. There we go. Now everything should still work, because I haven't churned anything, so kube-proxy's rules will still be in iptables (this one's flaking a bit); kube-proxy's rules should still be in iptables and nothing has deleted them yet.
A: So if I turn that on, then the Felix configuration is patched, and now everything should still work if I refresh... hopefully.
A: What... earlier? So, in theory, with the latest Calico you shouldn't even need to do a hard refresh there; it should carry on working. But I've done a hard refresh and it has come back, so maybe I'll investigate that later. But yeah, that's still working, and we should be in BPF mode now.
A: Apart from that little bit of disruption we saw, which I'd say we shouldn't really have seen, the source IP that nginx is seeing now is my real source IP (so don't try and hack me or anything). Now, when I access it, we're seeing the real source IP from all the way outside the cluster, and if I refresh a few times it should just stay consistent. OK, yeah, I think that's all I have for a demo; let me switch my share back to the other one.
A: So that's the demo, but if I go to the next slide I can tell you how we build on this even further. The next step after this, if your network supports it, is that we can implement a feature called DSR, or direct server return. This is another option that you can turn on. It all starts the same way: the packet comes in to the node port and we encapsulate it.
A: We send it to the correct backing node, it gets decapsulated, and the pod sees exactly the same packet as it did before. But then, when the pod responds, the BPF program running on the backing node is able to just respond directly, rather than doing encapsulation to get the reply back to the original node and sending it back along that safe path where it's guaranteed to work.

A: Instead, we can essentially spoof the packet, pretend that we are the first node, and send it directly back to the client.
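Below is a very rough sketch of what that reply path might look like (illustrative only: hypothetical map and field names, IPv4/TCP only, no IP options, and not Calico's actual code). It looks up reverse-NAT state recorded when the request was DNATed, restores the original node-port address and port as the reply's source, fixes the checksums, and pushes the packet straight out towards the client.

    #include <stddef.h>
    #include <linux/bpf.h>
    #include <linux/if_ether.h>
    #include <linux/in.h>
    #include <linux/ip.h>
    #include <linux/tcp.h>
    #include <linux/pkt_cls.h>
    #include <bpf/bpf_helpers.h>
    #include <bpf/bpf_endian.h>

    /* Reverse-NAT state, recorded when the request was DNATed (illustrative). */
    struct flow_key {
        __be32 src_ip;
        __be32 dst_ip;
        __be16 src_port;
        __be16 dst_port;
    };

    struct rev_nat {
        __be32 nodeport_ip;    /* address the client originally targeted */
        __be16 nodeport_port;  /* the node port itself */
        __u16  pad;
        __u32  out_ifindex;    /* client-facing interface on this node */
    };

    struct {
        __uint(type, BPF_MAP_TYPE_LRU_HASH);
        __uint(max_entries, 65536);
        __type(key, struct flow_key);
        __type(value, struct rev_nat);
    } rev_nat_map SEC(".maps");

    SEC("tc")
    int dsr_reply(struct __sk_buff *skb)
    {
        void *data     = (void *)(long)skb->data;
        void *data_end = (void *)(long)skb->data_end;

        struct ethhdr *eth = data;
        if ((void *)(eth + 1) > data_end || eth->h_proto != bpf_htons(ETH_P_IP))
            return TC_ACT_OK;

        struct iphdr *ip = (void *)(eth + 1);
        if ((void *)(ip + 1) > data_end || ip->protocol != IPPROTO_TCP)
            return TC_ACT_OK;

        struct tcphdr *tcp = (void *)(ip + 1);    /* assumes no IP options */
        if ((void *)(tcp + 1) > data_end)
            return TC_ACT_OK;

        struct flow_key key = {
            .src_ip = ip->saddr, .dst_ip = ip->daddr,
            .src_port = tcp->source, .dst_port = tcp->dest,
        };
        struct rev_nat *rn = bpf_map_lookup_elem(&rev_nat_map, &key);
        if (!rn)
            return TC_ACT_OK;                     /* not a DSR reply flow */

        /* Restore the original node-port address and port as the source. */
        __be32 old_ip = ip->saddr, new_ip = rn->nodeport_ip;
        __be16 old_port = tcp->source, new_port = rn->nodeport_port;
        __u32 l3_off = ETH_HLEN + offsetof(struct iphdr, check);
        __u32 l4_off = ETH_HLEN + sizeof(struct iphdr) + offsetof(struct tcphdr, check);

        bpf_l4_csum_replace(skb, l4_off, old_ip, new_ip,
                            BPF_F_PSEUDO_HDR | sizeof(new_ip));
        bpf_l4_csum_replace(skb, l4_off, old_port, new_port, sizeof(new_port));
        bpf_l3_csum_replace(skb, l3_off, old_ip, new_ip, sizeof(new_ip));
        bpf_skb_store_bytes(skb, ETH_HLEN + offsetof(struct iphdr, saddr),
                            &new_ip, sizeof(new_ip), 0);
        bpf_skb_store_bytes(skb, ETH_HLEN + sizeof(struct iphdr) + offsetof(struct tcphdr, source),
                            &new_port, sizeof(new_port), 0);

        /* Send the reply straight towards the client, skipping the trip
         * back through the ingress node. */
        return bpf_redirect(rn->out_ifindex, 0);
    }

    char _license[] SEC("license") = "GPL";

The interesting part is the last line: instead of re-encapsulating and bouncing the reply via the ingress node, the program redirects it directly out of the client-facing interface.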
A: Your network has to allow this exact type of spoofing. If you're on-prem, in a particular layer 2 network where this works quite nicely, you can arrange for it to work well and you cut off that extra hop. If you're in the cloud, it works within the same subnet on AWS and GCP, but it doesn't work with load balancers, so there is a little bit of a limitation there. But that's just to give you a feel for the kinds of things you can do with BPF; you wouldn't be able to tell the iptables data plane to do that.
A: So you just flip on the WireGuard switch and you have encryption, but you trade away some performance. And if you use our IPIP encapsulation option, it drops down to six gigabits per second. What I think has happened is that since we originally developed Calico a few years ago, when we picked IPIP as our standard out-of-the-box encapsulation because it was the fastest at the time, VXLAN has had a lot of love in the kernel.
A: WireGuard, too, is highly tuned, and they've just overtaken IPIP. So if you want to use encapsulation, I recommend VXLAN, especially with eBPF mode, because there are some specific incompatibilities with IPIP that slow it down; they don't break it, but they slow it down a bit.

A: Slicing that same data a different way, taking the data from the same test, we can measure the CPU per packet instead. If you look at it that way, we're saving something like 50% of the CPU at the smaller MTU size, and if you bump up to a 9k MTU we still save a little bit. But overall, the CPU used to send a 9k packet is mostly spent shifting the 9k of data, and the bits we're doing are a small part of it.
A: So you see a much bigger difference at smaller packet sizes, but you do see a CPU improvement in both cases. One reason I like to slice it this way is that not everybody cares about 27 or 40 gigabits of traffic, but most people would rather use less CPU. So if you're moving any significant amount of traffic, it should reduce the amount of CPU you use.
A: One of the things that comes with the Calico eBPF data plane is the kube-proxy replacement, and this isn't optional in our data plane. Some of the features that the Calico data plane has, like host endpoint protection and that kind of stuff, meant that we really had to take this over to make sure everything happened in the right order inside the kernel. So we've taken over from kube-proxy when you're in eBPF mode, and our implementation is faster.

A: It's faster than iptables mode all the time, and it's faster than IPVS mode, although with IPVS it's kind of splitting hairs, you know, a fraction of a millisecond. With iptables the performance varies a lot depending on how many services you have; as the number of services increases, iptables really slows down.
A: So if you're talking about 10,000 services, you really want to be using IPVS or eBPF, because those both scale roughly O(1) in the number of services and keep their performance even with really high numbers of services.
A: We measured the real time-to-first-content in that setup, and we saw iptables mode at just above 1.5 milliseconds in our test, IPVS mode in kube-proxy at about 1.5 milliseconds, and BPF with the non-direct return, where the reply goes back via the first node, beats that by a little, down to about 1.3 milliseconds, and then with DSR...
A: Yeah, so that's the performance section of the talk. A little word on limitations: it's IPv4 only at the moment. We wanted to get the data plane out there, get it into people's hands and implement a broad set of Calico's features before we tackled things like IPv6. One of the key pieces of advice we had about building an eBPF data plane was that you have to cover a broad set of the features you want.

A: Otherwise you can micro-optimize it and end up going down a blind alley, where it's very fast for the one feature you implemented but ends up slow because you've made some poor choices elsewhere. So we wanted to do a broad base and get it out there. We're x86-64 only at the moment; the main reason for that is just doing the cross-builds.
A: Our infrastructure wasn't quite set up for it: we can cross-build Go binaries quite easily, but cross-building the BPF binaries is a little more fiddly. I think arm64 would be fairly straightforward to do, but for the other architectures that we have some support for, like PowerPC, we might need to flip the endianness and so on, and we're working on that right now.
A: All nodes in the cluster must run the BPF data plane, and that's because the creative, encap-based external traffic solution that I talked about requires a BPF program to catch the packet on the other end. I think over time we'll add support for running hybrid clusters; we'll basically add SNAT support to the BPF data plane so it can interwork with the other types of cluster.
A: I believe the BPF support for updating packet checksums doesn't cover SCTP, and that makes it very difficult to do the kube-proxy function, the NAT, for SCTP. And we don't support the log action; I mentioned that right at the start. The log action in our policy is implemented by an iptables LOG action, and the iptables LOG action isn't available from BPF because it's just a totally different point in the kernel.
A: So we'd need to do something else to get logs if we implement that feature, or just accept that it's a difference between the two data planes that can't easily be resolved. In version 3.18, which came out just a few weeks ago, host endpoints are now supported; before that we released with workload endpoint support only, so that's pods and not hosts.
A: We've now added host endpoint support; that was quite a big piece of work. Host endpoints is the feature that intersects with kube-proxy, and with being able to interwork with the vanilla kube-proxy versus needing to do our own. If we had to do our own, we wanted to make it better, but we sort of had to do our own because we planned for host endpoints down the line.
A: I think we are going to do it, but I think it will end up meaning something a bit different in BPF land. I think it will go towards being implemented in XDP, as a sort of really early pre-filter, like the pre-filter we have in iptables mode.
A: We have a little sprinkling of XDP to do some of that function today, but I think we can do a more thorough implementation of it on the base of the new BPF data plane. For a long time the eBPF data plane wasn't available in Calico Enterprise, but I'll do a little plug: we've introduced it in Calico Enterprise 3.5, and we've added a bunch of our enterprise features there, like flow logs and enhanced types of policy such as policy tiers. So that's a tech preview in Calico Enterprise 3.5.
A: That's all I have about BPF, so here are my couple of slides on the other data planes that Calico has. From the beginning of Calico, or certainly from quite early on, we made the data plane pluggable, and a big driver for that was that we started off in Python. We were part of the OpenStack ecosystem, in Python, and when we saw Kubernetes coming along, we thought that was the thing to really engage with.
A: We decided that was the time to rewrite it in Go, so we split the product into two parts, all the brains separate from the data plane, and we rewrote one in Go and then the other. We ended up with quite a nice split between the two: you could run the Golang backend with the Python data plane and then switch to the Go data plane, and we could test them against each other and make sure they were really robust.
A: So we ended up with this API, and that was really convenient when Microsoft came along and contributed a Windows port of Calico. The first data plane we added from outside was the Windows data plane, and it was open sourced in 3.16.
A: In the latest version we now have BGP support, and we're adding support for a bunch of platforms: OpenShift, EKS, AKS and Rancher are all on the supported list now. On AKS it's kind of being baked in, so you can try it as a tech preview, where it's a tick-box option and you can just enable it on your Windows nodes in AKS.
A: I've got the docs links at the end. So yeah, the Windows data plane is there, and we have an enterprise version of the Windows data plane as well that supports a bunch of our enterprise features. And the new kid on the block is the VPP data plane. VPP is an open source project from Cisco, part of the FD.io project (which is why that's the logo on the right there), and the VPP team have contributed a data plane implementation for Calico.
A: It's based on the same API that we have, and it recently passed its first round of conformance tests, so we're moving it to a Calico-owned repo. They're working on it in there as part of the official Calico release, and they're working towards a tech preview release where you'll be able to enable it very easily.
A: It's all very clever: it runs in user space, and they have various ways of getting packets up into it. It runs all the protocols in user space, implements policy and everything, and then fires the packet on into your application.
A: I think the next milestone for it is tech preview; it's passed some conformance tests, but it's not thoroughly baked yet. But it's really interesting, and it's great to have such a big contribution from outside the team, so I'm interested to see how this one pans out and how it competes with the BPF data plane. I'm sure we'll find that out.
A: That's the end of my talk. I've put some links up here for the eBPF docs, getting started with Windows, and VPP; they have a how-to for turning it on in your cluster. Thanks very much.
B: Yeah, hi, thanks for the talk, it was great. It seems like we do have some questions, if you want to take a look in the chat as well. One seems to have been answered already, but I think it's OK if we go over it again: does eBPF give better performance than IPVS, and does Calico eBPF skip conntrack?
A: Yes. It gives better latency than IPVS; I'm not sure we've measured the throughput, and performance is a multi-faceted thing, but the latency versus IPVS is certainly better. That was the graph that I put up. IPVS is pretty fast, though, and our data plane is, you know, a fraction of a millisecond faster in our tests in terms of latency, so the big win for either of those is versus iptables.
A: It does skip the kernel's conntrack for workload flows, but, and this is the thing I sort of alluded to without really mentioning it, in version 3.18, when you go from iptables mode to BPF mode, we cooperate with Linux conntrack in order to make sure the upgrade isn't disruptive: existing flows we just kick out to Linux conntrack and let it handle them, but new flows we handle in our own conntrack table. Which has me thinking about why that might not have worked earlier.
A: I flipped my cluster backwards and forwards from BPF mode today, and one guess is that the flip from BPF mode back to iptables mode is disruptive, because we can't do anything on the iptables side to make it less disruptive; iptables isn't flexible enough to handle it going that way. So it's possible that I messed it up by flipping back and forth, but I'll have to dig into it.
B
A: OK, so iptables is the Linux kernel's built-in firewall and load balancing solution, and the main thing to know about it is that it's structured into chains of rules. A rule might be something like: if the packet is going to this IP address and this port, then drop it, or then rewrite the packet and send it to this service instead. The rules are structured as one big long list, so you have a chain of rules.
The
first
rule
is
processed.
If
it
matches
it
wins
and
it
does
its
thing.
Otherwise
it
goes
to
the
next
one.
Otherwise
it
goes
to
the
next
one
and
when
q
proxy
programs,
the
services
into
that,
if
you
have
10
000
services,
then
you
get
10
000
rules
in
a
row
and
if
yours
is
right,
if
your
service
that
you're
accessing
is
right
at
the
bottom,
that's
10
000
rules
you
have
to
go
to
and
each
rule
costs
about,
0.5
microseconds.
A: So when you've got 10,000 of them, they add up to milliseconds. That's iptables, and that's how it implements, say, NAT for kube-proxy. IPVS is a separate subsystem, the IP Virtual Server subsystem, and it's basically a faster way of doing that. It has efficient ways of grabbing the traffic before iptables gets a look at it, and then it uses a more efficient load balancing technique than having a thousand rules in a row.
A: It does a hash lookup to figure out what the right backend is, and away it goes. eBPF is this virtual machine that's very flexible, and one of the things we've implemented is a load balancing solution that's very similar to what IPVS is doing there: we take the incoming packet and do a hash lookup in a table, which is very fast, to figure out: is this a node port? Is this a Kubernetes service?
B: OK, thanks. And then I guess we have another one, and that is: can this be enabled with a CNI multiplexer?
A: You could multiplex between them. Suppose you use Multus to add two interfaces to every pod, and one of them was the Calico interface, which would run in BPF mode; you could then have a second interface using some other CNI, say some DPDK special thing or something. You could do that, and I don't think the Calico side would have a problem with there being a second CNI on there.
B: OK, thank you again for the great talk and for all the answers. If there are no other... I guess there is another one, it just came in: what are the performance benchmarks when using IPVS and eBPF?
A: I think I covered that already: the latency with eBPF is lower. That was one of the graphs that I put up. I think IPVS was around 0.5 milliseconds and, if I remember the graph correctly, the BPF data plane was 0.4 milliseconds. But I don't have numbers on hand for service throughput, like how much throughput you get through a service, so I don't have numbers for that.