From YouTube: IETF111-RTGWG-20210729-2200
RTGWG meeting session at IETF 111, 2021/07/29 22:00
https://datatracker.ietf.org/meeting/111/proceedings/
A: Note Well: please make yourself familiar with the IETF rules, and note that everything you share becomes a contribution to the IETF. The really short agenda for today is token cell routing by Stewart, addressing by Toerless, and self-healing networking with the flow label. There is no draft for that last talk yet; the authors are working on the draft, so we made an exception here. The topic has been discussed in 6man and a number of other venues, and it is actually being tested and deployed, so it's a really exciting topic. Over to Stewart.
D: Right, so I was asked to share slides; I'm going to share the TCR slides. Right, thank you. First off, there's a small error that crept into the agenda: the draft is draft-bcx-rtgwg-tcr-00.
D: So what was our motivation for doing this? New network demands stress the existing data plane protocols: things like collection of telemetry, path guidance, reroute protection, the incoming need for SLOs, things like proof of transit, and extensibility and programmability, including parameterized functions.
D: So we put together a concept for how we might design a protocol to address these needs, in particular to support multiple features concurrently. It's easy to add one feature; it's more interesting if you want to add multiples. There is also the need to authenticate metadata at intermediate nodes, and the need for extensibility in the design.
D: The basic idea is to construct a packet as a lightly structured set of tokens (tokens as in computer science, not token ring) and cells, as in: we make up a packet from a number of small components, in the same way that the human body is made up of cells. We apply longest-prefix-matching engines to invoke code points from the tokens.

Tokens can be combined to create more elaborate and interesting functionality, and we explain some of this in the use cases. You can stack the tokens for per-segment and per-node behavior where needed.
D: The key differentiator from packets as we've designed them in the past is that we deliberately allow structured, non-linear parsing (non-linear as in streaming media). You don't have to go from one component of the packet to the next, and you don't have to deduce what you're going to do: the packet tells you where in the packet you need to go next for processing at this hop. So we support programmable behavior at multiple levels. In this approach we construct a packet from the tokens, then combine them to get the defined behavior, and then parameterize that behavior.
D: This will become quite clear and obvious in the next slide. We have some tokens that specialize in things like security, and tokens that specialize in scratchpad. Another thing of interest: although we have presented this as a packet design in its own right, I believe it's also an interesting method of describing advanced functions for extending existing protocols, so it's quite a good ancillary data or metadata design structure.
D: One of the things we thought was that the payload may actually live in a token cell. We can clearly put the payload on the end of the packet if we want to, but why would we want to put it in a token cell? Well, there are some thoughts in the research community that you might want to modify the payload as the packet goes through, for example for congestion management.

Instead of throwing the whole packet away, it may be that in certain applications it's acceptable and desirable to throw some components of the payload away. Think about elements of a video system, where you may be able to dispense with some bits, but other bits are important.
D: This is an optional approach, and of course we may have packets without payloads anyway, for OAM purposes. Right, this is one of the key slides. What does a token cell look like? It has a length, because, unlike MPLS for example, each cell will be variable length. It has a pointer to the next token to process at this hop, after you've processed this token.
D: The idea is that we chain these tokens together to get the functionality we need, and then we have the match zone. The match zone is the part of the token that we put into a longest-match engine. It consists of two components: the cell type, and perhaps a sub-ID, and then the prefix, the prefix of the token cell blob. This is where the parameters go. So think about this:
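One way to picture the cell layout just described is as a variable-length record carrying a length, a next-token pointer, and a match zone. This is a hedged sketch only: the field names and one-byte widths here are invented for illustration, and the real format is defined in draft-bcx-rtgwg-tcr-00.

```python
# Hypothetical sketch of a TCR token cell: [length][next_ptr][cell_type][prefix...]
# Field sizes are illustrative only, not the draft's wire format.

def encode_token(next_ptr, cell_type, prefix):
    """Serialize one variable-length token cell."""
    body = bytes([next_ptr, cell_type]) + prefix
    return bytes([1 + len(body)]) + body

def decode_token(buf, offset):
    """Parse the token cell starting at `offset`; return (token, next_offset)."""
    length = buf[offset]
    next_ptr = buf[offset + 1]
    cell_type = buf[offset + 2]
    prefix = buf[offset + 3 : offset + length]  # the parameters of the cell
    return {"next": next_ptr, "type": cell_type, "prefix": prefix}, offset + length

# Two chained cells: a forwarding token whose next-pointer leads to a second token.
pkt = encode_token(next_ptr=1, cell_type=0x01, prefix=b"\x20\x01\x0d\xb8") \
    + encode_token(next_ptr=0, cell_type=0x02, prefix=b"\x00\x0a")
tok0, off = decode_token(pkt, 0)
tok1, _ = decode_token(pkt, off)
print(tok0["type"], tok1["type"])  # the match zone begins with the cell type
```

The point of the exercise is only that a longest-match engine can key on the cell type plus the front of the prefix, while the length and pointer fields keep the chain walkable.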
D
For
example,
this
might
be
specifying
an
ipv6
address
and
then
the
prefix
might
be
the
ipv6
address
itself
or
it
could
be.
The
ipv6
address
itself
concatenated,
with
a
more
sophisticated
programming
parameter
in
order
to
do
something.
A
little
more
sophisticated
than
ipv6
network
programming
is
doing
by
having
a
more
more
sophisticated
parameter
in
there
that
you
look
up
in
the
lookup
engine.
Of
course,
you
you,
you
may
discover
that
it's
adequate
to
simply
look
at
the
token
cell
type
and
in
all
cases,
once
you've
done
the
lookup.
D
You
know
they
now
know
the
structure
of
the
suffix.
So
what's
happening
is
the
front
of
the
this
middle
piece
of
the
token
here
is
vector,
urine
vectoring
you
into
some
code
like
an
mpls
label
would
and
but,
unlike
an
mpls
label,
the
token
carries
a
set
of
parameters
that
assist
in
the
processing
of
the
packet,
the
forwarding
of
the
packet.
D
So
it
would
be
obvious
from
that
what
the
lookup
engine
does
so
the
lookup
engine
looks
up
the
token
match
zone
retrieves
the
forwarding
parameters,
which
was
what
goes
on
in
any
forwarder
and
then
vectors
to
a
piece
of
code
which
is
actually
what
goes
on
in
a
forwarder
but
and
sucks
in
the
parameters
it
needs
to
create
the
the
effect
the
effect
may
result
in
storing
in
some
information
in
the
pipeline.
For
the
next
token,
so
what
sort
of
token
cells
might
we
have?
D
We
might
have
a
forwarding
one
with
the
addressing
type
we
might
have
metadata
scratch
pad.
These
are
writable
areas
of
the
packet,
some
security
tokens.
We
we
think
we
can
do
some
quite
interesting,
parallel
processing
if
the
hardware
can
support
it.
So
we
have
this
concept
of
a
manifest,
which
is
the
set
of
things
to
process
in
parallel,
and
the
inverse
of
the
manifest
is
the
rendezvous
where
we
can
bring
them
back
together.
D: Disposition is what you do when the packet needs to leave this part, this zone if you like, of TCR; this segment, if you like, in segment routing terms. At the moment a lot of this is deduced, but we think we can be a lot more specific with some parameters: directives, for example specifying some latency objectives, conditionals, and basically anything else that you want to program into the system.
D
So
let's
look
at
the
parallelization
thing,
which
is
novel.
This
looks
complicated,
but
what
we're
really
doing
is
exploring
the
properties
of
this
concept.
Whether
you
would
build
this
or
not
would
would
depend
on
your
application
and
the
capabilities
of
your
forwarding
hardware,
but
it's
always
interesting
to
look
at
what
the
natural
consequences
of
the
design
are.
So
a
the
the
first
token
is
takes
you
to
a
manifest
manifest
says
there
are
three
processing
streams
that
you
can
do
in
parallel.
D: If you have the capability: we process one; we go to two, in parallel with three, in parallel with four. Two completes and takes us to five. Three completes, and that's the end of that action series. Four takes us to another manifest with three more parallel actions and then a terminator at nine, and you're terminated when five, three and nine are finished; sorry, and six and eight. So this is conceptual.
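That walk, a first token fanning out through manifests into branches that run to their own ends, can be sketched as a pointer chase. The numbering below loosely follows the numbers mentioned in the talk, but the branch contents are invented to match the shape of the slide, not taken from it.

```python
# Illustrative-only walk of a chained-token structure with manifests.
# A "manifest" token fans out to branches that could run in parallel;
# ordinary tokens point at a single successor (None ends an action series).

tokens = {
    1: {"kind": "manifest", "branches": [2, 3, 4]},
    2: {"kind": "action", "next": 5},
    3: {"kind": "action", "next": None},
    4: {"kind": "manifest", "branches": [6, 7, 8]},
    5: {"kind": "action", "next": None},
    6: {"kind": "action", "next": None},
    7: {"kind": "action", "next": 9},
    8: {"kind": "action", "next": None},
    9: {"kind": "terminator", "next": None},
}

def walk(tid, done):
    """Follow pointers from token `tid`, recording every token touched."""
    if tid is None or tid in done:
        return
    done.add(tid)
    tok = tokens[tid]
    if tok["kind"] == "manifest":
        for branch in tok["branches"]:  # conceptually processed in parallel
            walk(branch, done)
    else:
        walk(tok["next"], done)

done = set()
walk(1, done)
print(sorted(done))  # every token in the structure is reached exactly once
```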
D
This
is
just
to
show
you
what
you
can
do
once
you
start
putting
pointers
into
packets
and
building
some
of
the
some
of
the
structures,
and
if
you
look
below
I'm
not
going
to
go
through
the
detail,
but
if
you
look
below,
we
can
see
how
we
can
chain
together.
The
structure
we
have
here
by
putting
pointers
from
one
token
to
another.
D
Well,
this
is
the
inverse
function
where
we're
rendezvouing,
because
it's
possible
that
you
want
to
do
that.
It's
quite
good
to
do
two
and
three
in
parallel,
but
you
can't
proceed
anymore
until
they've,
both
completed,
for
example,
to
get
to
node.
Four
again,
I'm
not
going
to
go
into
the
details.
The
the
the
slides
are
fairly
straightforward
and
there's
quite
a
good
description.
I
hope
in
the
in
the
draft
disposition.
D
So
this
is
what
a
package
is
to
do
when
it
leaves
the
network
and
we
we
know
we
already
have
this
and
we
have
a
need
for
this.
It's
the
things
like
the
next
header
in
ipv6.
That
says
you
know
what
follows
the
mpls
bottom
label
for
things
like
vpn
and
pseudo
wires
and
more
recently,
the
network
programming
ip
suffix,
which
is
being
used
to
specify
what
you
do
when
the
packet
leaves
the
the
sr
domain
now.
D: So let's look at some packets. Here the first active token says: forward towards this IPv6 address. Notice that the source addresses are optional, because we know that, for example, in transport networks you don't often need them, and you can consider them a payload parameter.
D: So we have the first token, the IPv6 address towards which we're going to forward, but there is a pointer to the second token, which tells us to look at some SLO parameters: this packet must not arrive before this time; this packet must arrive in a window; this packet must arrive at a precise time.
D: This is a way of doing latency-based forwarding. When the packet arrives at the destination address, we have the disposition token, and that tells us what we're to do next with this packet: how we're to process the payload and dispatch the packet out of the TCR system. Right, that's simple and straightforward. Now let's apply some fast reroute.
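The three arrival constraints just listed can be expressed as simple predicates. This is a hedged sketch only; the encoding, the field names, and the tolerance parameter for the precise-time case are assumptions, not from the draft.

```python
# Sketch of the three SLO variants mentioned for latency-based forwarding:
# not-before, within-a-window, and at-a-precise-time (with an assumed
# tolerance). Times are in arbitrary units.

def slo_ok(kind, arrival, t0, t1=None, tol=0.0):
    if kind == "not_before":     # must not arrive before t0
        return arrival >= t0
    if kind == "window":         # must arrive inside [t0, t1]
        return t0 <= arrival <= t1
    if kind == "precise":        # must arrive at t0, within tol
        return abs(arrival - t0) <= tol
    raise ValueError(kind)

print(slo_ok("not_before", 10.0, 8.0))         # True
print(slo_ok("window", 10.0, 8.0, 9.5))        # False: missed the window
print(slo_ok("precise", 10.0, 10.2, tol=0.5))  # True
```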
D: The first thing you'll notice, just for the fun of it, is that on the right-hand side, in light blue, is the packet you saw above. We discovered there was a failure, and we needed to push a fast-reroute token on the front; that's in dark blue. You'll notice that, if it's more convenient to us, we can use a different address family.
D
For
this
I
mean
I've
arbitrarily
picked
an
ipv4
one,
but
the
important
important
thing
here
is
that
we're
not
casting
the
address
family
directly
into
the
packet
design.
We
are
allowing
the
packet
designer
to
consult
with
the
network
operator
and
use
the
addressing
family.
That's
most
convenient
for
this
function.
D
Now,
normally
we
would
just
push
this
on
and
do
our
do
our
best
with
fast
reroute.
But
of
course,
if
we're
doing
latency
based
forwarding
of
some
sort,
some
sort
of
slo
affording.
We
would
really
like
that.
The
fast
reroute
system
took
advantage
of
this
and
knew
the
the
history,
and
we
can
do
this
by
pointing
the
next
token,
the
rednecks
token,
to
the
exactly
the
same
slo
characteristics
as
was
used
on
the
main
path.
So
now,
we've
added
a
controlled
latency
system
to
fast
reroute.
D: Supposing we want to do latency-based forwarding with segment routing: we can do this by having each of the forwarding segment tokens take, as a second action, the latency-based forwarding parameters. You'll notice that these first two segments are using exactly the same SLO parameters, so you can accumulate the same behavior and the same sort of timekeeping across each of the segments, rather than doing it individually. Equally well, if we want to, we can have different SLO characteristics for different segments: these first two here point to this second SLO characteristic, and this token n points to one of its own.
D: Telemetry works simply by pointing the next-token pointer to the telemetry information, and similarly we can do the same thing with a manifest, so it could be that you want to do some quite sophisticated and complex data collection. You can do your forwarding lookup, which in most forwarders is a DMA action, in parallel with doing your telemetry action. It will be a simple exercise for the reader to see that you can clearly put all of these together, which we do here.
D
I'm
not
not
going
to
go
through
that
in
detail
here,
but
I
think
you'll
find
all
the
pointers
are
right
and
I
I
would
ask
you
to
look
at
the
the
draft
and
consultant
slides
and
show
see
how
the
pointers
allow
you
to
construct
these
sorts
of
structures.
D: What you can do here, and there are some slides that I'm going to incorporate into the next version of the draft that show this, is break up the signature domains, so that you sign the bits that are associated with a group of tokens, or a set of parameters associated with a token. You don't have to sign the whole of it; you can sign the bits that are relevant for this set of hops and this particular action.
D: So what we've got with TCR is a general-purpose network data plane protocol with serializable and parallelizable characteristics. There's the serialization and parallelization; there's the ability to introduce scratchpads and metadata. Scratchpads are information that the packet forwarders write as the packet goes through the network; metadata are additional parameters or ancillary data that are needed to qualify how you forward the packet. It's an extensible approach, and it has differentiated security.
D: You can provide parameters in the token cell, and the understanding of those parameters is implicit in understanding the cell and being vectored to the code that executes that cell, in the same way that an MPLS label vectors you to some code. And of course we can introduce new token cell types without needing to rewrite and redesign the protocol.
G:

D: Well, we think that at medium speeds an existing forwarder could do this. There's nothing in here that you can't do with existing hardware; it depends how many tokens you're going to have. But fundamentally this is a sort of hybrid between what goes on in MPLS and IP in terms of basic forwarding, and looking at the follow-on tokens is not particularly different from parsing an ACH, and everyone is now looking at ACHs and metadata and ancillary data. I think it is more powerful than some of the other techniques we're looking at.
G: All right. While I agree with you, I'm going to leave you with two comments. The first one is that this so much resembles the work that firewalls have to do, with a given firewall programmed to implement its intent for a given set of instructions, and we know that long chains can have negative impacts on forwarding rates. The second comment is that, much like the firewall problem, you're now introducing a programming path that has a lot of exception paths that need to be considered.
D: I kind of assume that you shouldn't be putting a packet through an LSP, or a token-switched path I suppose, unless you know it's going to get cleanly through there, because you know from the routing system what the capabilities of the path are. As long as you design the thing right in the first place, and we have a lot of experience in MPLS of designing these things right in the first place, it will work; and if you screw up, the usual thing is just to dump the packet and increment a counter.
D: So I'm not too terrified of this. It's certainly very different from things we've tried in the past, I think, but we're sort of getting close to it with the ancillary data and the extension headers that we're doing, because those have lots of undefined states, and they're even less defined, because you're doing implicit ordering rather than explicit ordering of what you need to do to the packet.

A:

D: I will do my best to answer them, if you can capture them in some way that I won't lose them.
D: And this is a concept, right. What I'm trying to do is to get people to understand the sort of power you get if you put pointers in a packet, rather than relying entirely on implicit parsing, because no one actually just looks at the front of the packet anyway, do they? They now look inside all sorts of things to try and figure out what to do. So this is moving from implicit parameters to explicit parameters.
A: Okay, so we've got one minute. If there are any short questions or comments, please go ahead; otherwise we'll move on.
F: Following both of them in parallel... well, maybe just one marketing comment for you. Given, I guess it's fair to say, a little bit of the history and the adjacency with MPLS, this is also discussed in the open MPLS design team, which is meeting regularly. So there is another great chance to join that effort and maybe have the discussion over there, if you haven't been watching that space yet here in the routing working group.
D: Just to answer Robert's question: there is a TTL. I didn't show it; it's in the preamble. There's a bunch of stuff on the left that I didn't show, which has a small, tiny number of parameters that are always pushed onto the front of the packet, and one of them is TTL.
F: Right, so this is an idea born from similar intentions to what Stewart was talking about, with a terrible name.
F: Maybe the document will also get split up between the problem and the solution. You can see from the slide the core issues: efficient traffic steering, like SR does, should be very simple with this, and more efficient than maybe many of the SPRING and CRH variations; it should support equally flexible programming, in the way that SRv6 adopted with the SRH; and then there's also the value of having variable-length addresses.
F: It would be easily feasible to introduce new semantics that need longer addresses, for example. But let me start with the new problem space that I think we've mostly ignored with IPv6, which is the fact that the iceberg of IETF networks is pretty much: ten percent is what everybody talks about, which is the internet, and ninety percent is what, since RFC 8799, is called limited domains, or what people call private networks. Many of them are in the IoT space: manufacturing, energy, oil and gas, transportation, constrained networks. But equally, a service provider's infrastructure network with IP/MPLS or IP/SR is not, quote, the internet.
F: It just has the internet running on top of it. If you counted all the devices on the planet, 90% of them are not on the internet, or connected to the internet, but really just within these private networks.

So there's really a lot to be said about the lack of IPv6 addressing to better support this. The example I wanted to bring up is from industrial, but it applies to other spaces as well: you want to be able to build embedded constrained networks whenever you like, without having to bother about a single global address space, just about your network-local address space, and then compose and interconnect them in any arbitrary hierarchy and topology that you want.
F: You may want to start with some form of machinery that you're selling, which has an internal ethernet network and a router to the outside. You assemble these to form some larger machinery and, ultimately, an assembly line, so there are multiple hierarchies to even get to the single building block that's shown here, and then you can easily imagine how many of these building blocks on that picture might need to be interconnected, in a way that's certainly not the "oh, I just need a flat everybody-can-reach-anybody-else network".

We have already done good standardization with things like MUD and other mechanisms to really ensure the security of these types of networks, and as it turns out, what's going to be proposed goes very nicely along with that, I think. Here is a typical, classical example of these instances that you can even see in industry standards; I've worked in transportation, with trains and so on.
F: If you look at some of the standards for how to build networks in a train car, or any other example machinery, what you typically do is use the wonderful RFC 1918 10-net space: every instance of a product you're building has its devices given exactly the same 10-net addresses, and then you have a gateway with NAT, like any industrial ethernet switch.

If you look at that, you'll ask: well, why does it have this strange form of NAT? Exactly because of this type of use case, where, for example, you NAT the third byte of the address so that it is unique on the next layer of LAN, which is the orange level, and then you have the second byte, where you can do one more level of aggregation, all within the existing IPv4 address space. And voila.
F: That's basically a lot of what you do for security, and in combination you're constrained to two levels; you may go up to another IPv6 level on top of that. But if you really compare ULA IPv6 with this stuff in IPv4, you have the same 16 bits really available, and you have additional problems with IPv6 ULA, because supposedly NAT is evil and you shouldn't use it, and then you run into hash collisions.
F: So: hex-string addresses, like an IPv6 address but with boundaries on any 4-bit nibble, so that I don't need to cut inside a single digit, and the dots are just structural, optional things to visualize where the structure of an address ends. So the address allocation within a single network is kind of the one core new thing that we haven't...
F: ...expected in normal networks and an IGP, which is that every assigned prefix has to be non-overlapping with any other, so that anybody who owns a prefix owns any longer address within it, which is basically crucial to allowing this to work. And yeah, when we get to an actual routing plane for this, there might be an interesting question about the consistency requirements that we want to raise for the distributed control plane. And then we just expect that, let's say for now, everything is routed in the IGP like host addresses, so everybody can route to anybody else's unique prefix.
F: So here is the example. We have one network, network one, and a couple of devices interconnected, each one showing its own prefix. Then basically somebody else builds a second network, probably following the same industry standards: a slightly different, mirrored topology in this case, but certainly, intentionally or unintentionally, with overlapping addresses. And now the question comes: okay, how do you start connecting this and allowing traffic to flow between them?
F: And here is basically the simple summary of what you do for the interconnection, which is that one network wants to get connected to the other one. On that network, which is the blue one, network one, you start to establish some connection into the other network, which is shown here with these dotted boxes. So you take RA and you connect it into LAN one on the orange network, and you do receive a particular prefix, the 45.
F: Then there is the very simple address processing that we're doing when passing the traffic on, and, given the time, I won't be so slow that everybody gets it on the first run; it took me a while as well. You go from the source: the destination address is 2.2.1.35, and the source address is your own 52. The first 2 will route it to RA, which is the address of that node in the blue network.
F: The second 2 invokes a function on the remainder of the address, which in this case is: put it into a different network, locally network connection number, let's say, one, which is the parameter, the third part. What ultimately happens is that RA knows: okay, I need to put this out on the link into network two; I'm going to strip this whole prefix, and I'm recirculating the packet into network two's routing and forwarding.
F: So it's going to be forwarded to number 35, and then, optionally, the reverse thing happens with the source address, where we're basically prepending the return path, so that when the packet ultimately arrives at its destination, it knows how to send return packets back to the 52 in network one.
F: So what is this? In the forwarding plane it's exceptionally simple, I think, when we want to start having this type of interconnection, with every network having its own address space independently. It's just a normal prefix lookup, like we've done forever; the only novel things are stripping and prepending address prefixes and, when you're doing this function, recirculating after that operation. So it's a stateless prefix address rewrite. If we're looking into any type of address rewrite option to connect networks, this should be the most scalable, performant option. And, as should not be too difficult to see, it would be possible to make this work for an arbitrary topology interconnect between these networks: hierarchical, or any type of mesh.
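The strip-and-prepend step can be sketched in a few lines. This is a hedged illustration only: addresses are modeled as lists of components, and the concrete prefix values (including the return prefix) are invented rather than taken from the slides.

```python
# Minimal sketch of the stateless strip-and-prepend interconnect idea.
# Addresses are lists of variable-length components; the gateway owns a
# prefix in each network. All concrete values are invented for illustration.

def gateway_forward(dst, src, gw_prefix_len, return_prefix):
    """At the gateway: strip the interconnect prefix from the destination,
    prepend the return path to the source, and hand the packet back to
    the next network's routing (the "recirculation")."""
    inner_dst = dst[gw_prefix_len:]   # e.g. [2, 2, 1, 35] -> [35]
    outer_src = return_prefix + src   # so replies can be routed back
    return inner_dst, outer_src

# Host 52 in network one sends to node 35 behind gateway RA (prefix 2.2.1).
dst, src = [2, 2, 1, 35], [52]
inner_dst, outer_src = gateway_forward(dst, src, gw_prefix_len=3,
                                       return_prefix=[1, 1, 2])
print(inner_dst)   # [35]: looked up in network two's routing
print(outer_src)   # [1, 1, 2, 52]: return path back into network one
```

The rewrite keeps no per-flow state: everything the gateway needs is in the packet's own address components.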
F: Now, that was kind of the most complex thing. When you think about the logic of an address being a sequence of functions, with each function being, let's say, either a semantic prefix or a node prefix, followed by a function code and a parameter, you can see that this simple "I get a packet, I do a prefix lookup, and depending on the prefix lookup I have an adjacency that does different things"...
F: ...can map into the most fundamental functions that we already need in routers and would like to have be more flexible. Let's say function number zero, followed by a value for the next protocol, is simply the whole stack thing: it would allow me to eliminate the next-protocol field from an IPv6 header, because it's simply in the address. Number one could simply be steering, which is then followed by a node prefix, so that would be a replacement for what we're doing with MPLS or SRv6; it's all in the address.
F: So it would be a very compact encoding for the steering. Number two was the instruction for this internetworking function, where you're stripping and adding addresses. And of course you can come up with any other functions and parameters, which is, for example, how you could equally well map any of the programmability that SRv6 has with the SRH into the address; but given that it's all variable length, you don't need to waste 64 bits when you don't even need them, like in most cases in the SRH that we have now. And if, instead of a node prefix, we start with a semantic prefix, like we have in IPv6 as well, for example for multicast addresses, you could equally map any future semantics that you want into the same address space; just make sure it doesn't overlap.
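Such a function-coded address could be read left to right as (function, parameter) pairs. A speculative sketch, with the function numbers borrowed from the talk but every encoding detail invented:

```python
# Speculative reading of the "address as a sequence of functions" idea.
# Made-up codes mirroring the talk's examples: 0 = next protocol,
# 1 = steer towards a node prefix, 2 = internetwork strip/prepend.
# Nothing here is a defined format.

FUNC_NAMES = {0: "next_protocol", 1: "steer", 2: "internetwork"}

def parse_address(elems):
    """elems: flat list of (function_code, parameter) pairs."""
    return [(FUNC_NAMES[code], param) for code, param in elems]

addr = [(1, "node-A"),   # steer towards node-A (like an SR segment)
        (2, "net-2"),    # cross into network 2, stripping the prefix
        (0, "udp")]      # final function: next protocol is UDP
for func, param in parse_address(addr):
    print(func, param)
```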
F: We've done that so far with standards-based prefix allocations; I'm saying that, in general, this could just as well be done through configuration programming of the forwarding plane from the control plane.

Yeah, okay, right, so the control plane. I think there is a lot to be said about how we learn how to get to a destination.
F: I'd make the case that many private networks don't really need other naming, because within the network these addresses are fixed for a lifetime, so they wouldn't be location-dependent within the network, and the paths to other networks can equally come from PCE controllers. But otherwise there are obviously a lot of interesting extensions for path routing that we already know how to do with BGP, for example.
F: Here is an example of how simple a base header could be. This is how you take IPv6, strip everything that we don't think we need, and arrive at a much simpler header: destination address and source address, with a length for each of them, and I think we need ECN and hop limit. Everything else can go into extension headers.
F: Obviously this would require a newer version, not four and not six, but it could basically be done in a backward-compatible way, so that it's a superset of v6; those details haven't been talked about. So, to come to a funny end here: I think what we're really talking about is whether we want to think more about addressing, and not only tag on more and more extension headers, but think about the basic addressing that we have.
F: We have gone through a long evolution, which started with NAT and the IPv4-to-IPv6 transition mechanisms. Most of them, I think, people would find not very good, but some of them could actually be domesticated and become really useful functionalities, which is, I think, what we're doing here. We also have functional structures in IPv6: scope zones, unicast-prefix multicast. A lot of the things we have done were ad hoc things on top of IPv6, not very structural.
F: We have done a lot more structural stuff on the address processing in MPLS stacks and in source routing and SR, and those things also flow nicely into this. So I think, when we take all these things together, this proposal really should give us a very nice multi-purpose, functional address processing architecture. And that's it.
H: I have a small question about redundancy and resiliency. If a node, or the link to a gateway, is encoded inside the header as part of the address, and a particular node happens to be down, then we need to find some other route. But if it's encoded in the IP address, does that mean that we need to change the IP address? Yeah.
C: Can you hear me? Yeah, it's not great. I'm Alexander Azimov, I'm working for Yandex, and since we are running out of time, I'll try to do it fast. We will first discuss the opportunity to enrich TCP with self-healing capabilities.
C: I will start my talk by focusing on the data center environment but, as you will see, it's not only about data centers. Here is a typical topology of a data center: there are top-of-rack switches connected to the first tier of spines, which are represented here with the letter S.
C: Of course, load balancing is widely adopted in such a topology; normally it's equal-cost multipath with a hash function using the five-tuple. So modern data centers provide multiple paths between most hosts; a single path between two hosts exists only inside the rack. Inside one pod, the number of paths is equal to the number of planes in the data center. In the case of interaction between pods, the number of paths will be the product of the number of planes and the number of super-spines in each plane, just to give you a feel for the numbers.
C: One may expect that this number of paths should provide fault tolerance out of the box, but real outages in the data center are way more complicated.
C
There
are
two
main
options
in
cp
for
loss,
recovery,
selected,
acknowledging
acknowledgement
and
retransmission
triggered
by
rto
timeout,
but
all
these
retransmissions
will
have
the
same
five
tuple,
so
they
will
all
travel
the
same
path
resulting
in
service
degradation
and
as
the
congestion
window
will
be
shrinking,
it
will
increase
the
chances
to
meet
rto
event
and
continuous
events.
In
is
a
disaster
inside
the
data
center.
C
The RTO is calculated from the measured RTT, bounded below by RTO_min; the default Linux value for RTO_min is 200 milliseconds. If we were unfortunate enough to lose a SYN packet, the base timeout is even higher: it's one second, while the real RTT in the data center is about one millisecond. To make it even worse, it doubles after each unsuccessful attempt.
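The SYN backoff just described can be sketched as a quick calculation (the 1-second initial value is the Linux default mentioned in the talk):

```python
# SYN retransmission backoff sketch: Linux uses a 1-second initial
# timeout for a lost SYN and doubles it after each unsuccessful
# attempt, while the in-DC RTT is on the order of one millisecond.
def total_syn_wait(initial_rto_s: float, lost_attempts: int) -> float:
    """Seconds spent waiting after `lost_attempts` consecutive lost SYNs."""
    return sum(initial_rto_s * 2 ** i for i in range(lost_attempts))

print(total_syn_wait(1.0, 1))  # 1.0
print(total_syn_wait(1.0, 3))  # 7.0  (1 + 2 + 4 seconds)
```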
C
The question is: is it possible to enrich TCP with the ability to jump from a failing path? To answer this question we'll need to take a tour through the Linux kernel mailing list. It was 2011 when RFC 6438 was published, where it was stated that the flow label can be used to balance encapsulated traffic. The idea is simple: since the transport layer may not be available, we can improve load-balancing quality if we put a hash of the TCP socket in the flow label field. In 2014 this was introduced in the Linux kernel.
C
The next year there was another patch in Linux development which added TCP hash recalculation upon a negative routing event. What is a negative routing event? It's the RTO timeout that we just discussed. And the year after, this behavior was further strengthened.
C
Since then, the hash is recalculated on both RTO and SYN-RTO events. And as we discussed, the RTO timeout affects not only the flow label value; it also affects all kinds of encapsulations. In the case of GRE it affects the key field; in the case of UDP encapsulation it changes the UDP source port. And now, a surprise.
C
Once again, let's have a partial outage at X1.1. This time we will also add the flow label to our hash function at the top-of-rack switch. In the case of selective acknowledgement nothing really changes, but if we have an RTO event, the TCP hash at the socket will be recalculated, and so is the flow label.
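The mechanism can be pictured with a minimal sketch of flow-label-aware ECMP next-hop selection. Real switches use vendor-specific hash functions; SHA-256 here is purely illustrative, and the addresses and ports are made up:

```python
# Toy ECMP: pick an uplink from a hash over the five-tuple plus the
# IPv6 flow label. A new flow label after an RTO can steer the same
# five-tuple to a different uplink.
import hashlib

def ecmp_next_hop(five_tuple, flow_label: int, n_links: int) -> int:
    key = repr((five_tuple, flow_label)).encode()
    digest = hashlib.sha256(key).digest()
    return int.from_bytes(digest[:4], "big") % n_links

flow = ("2001:db8::1", "2001:db8::2", 6, 40000, 443)  # made-up flow
before = ecmp_next_hop(flow, flow_label=0x12345, n_links=4)
# After an RTO the host recalculates the socket hash, so packets carry
# a new flow label and may be hashed onto a different uplink:
after = ecmp_next_hop(flow, flow_label=0x54321, n_links=4)
print(before, after)
```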
C
So after each RTO event we have a 50 percent probability that the traffic will jump to another plane. The more planes we have, the higher the probability that the jump will move the TCP flow to an unaffected part of the network, and it is fully transparent to the application. In addition to the flow label in the hash function of the top-of-rack switch, we also deployed eBPF agents at the hosts to change the RTO_min values according to the real RTT in the data center. And here is the result of our experiments.
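The "more planes means better odds" intuition can be made concrete. Assuming each rehash picks a plane uniformly and independently (an idealization, not a claim from the talk):

```python
# With n equal planes and one of them failing, a uniform rehash lands
# the flow on a healthy plane with probability (n - 1) / n, so after k
# independent RTO-triggered rehashes it is still stuck on the failed
# plane with probability (1 / n) ** k.
def p_still_on_failed_plane(n_planes: int, rehashes: int) -> float:
    return (1.0 / n_planes) ** rehashes

print(p_still_on_failed_plane(2, 1))  # 0.5 (the "50 percent" case)
print(p_still_on_failed_plane(4, 3))  # 0.015625
```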
C
These were run on real production services. We took a top-of-rack switch and created a constant packet loss on one of its uplinks; the switch had four uplinks in total. On the left you can see the outcome of the data-plane monitoring; it is UDP-based and, as predicted, it shows 25 percent packet loss. On the right side is the traffic volume from the service, from the hosts behind this top-of-rack switch: the volume of the traffic dropped four times.

C
Now the same experiment, but with the flow label enabled in the hash function of the top-of-rack switch. The UDP-based data-plane monitoring shows the same result, since it's just UDP ping without any retransmissions, but the TCP flows of our services are jumping from the failing path and the service traffic is preserved unaffected.
C
So, what one can learn from these slides: using the flow label in the hash at the level of the top-of-rack switch, and maybe the first-tier spines, gives TCP over IPv6 a self-healing capability. To make this jump of traffic quick enough, you should also use eBPF to change the RTO and SYN-RTO values according to your latency. And this comes for free, though with poor documentation. And one may note that an environment with multiple paths is not limited to the data center.
C
One may say that the internet itself consists of multiple paths between the majority of its points, and all this jumping from a failing path to another may improve user experience in general.
C
As you can see, two of the five best paths were not affected by the outage, so it leaves the door open for jumping from the failing path. So, does everything work that great, and do we only need to properly document what is implemented in the Linux kernel?
C
And this trap is called anycast. In the DC environment, stateful anycast services are not rare, while in the wild internet, anycast-based proxies represent a significant portion of the traffic volume. And unfortunately, this kind of service doesn't perform very well with the Linux flow label.
C
As a result, subsequent packets can be redirected to another instance, which won't have any appropriate state and will drop these packets accordingly. There might be following RTO events that return the traffic to the original instance, but anyway, it won't improve user experience. And this is no theory.
C
So after the default SYN-RTO timeout, which is normally one second, the client will send another SYN packet with a new flow label value, and it may reach another instance. But now we will have one client and two servers trying to establish one connection.
C
The client needs to respond with an ACK to finish the connection procedure, but it has a flow label that directs packets to the second proxy and an acknowledgement number related to the first proxy. Of course, such a scenario will end up with a connection timeout, and such a race condition is likely to happen.
C
It's
really
hard
to
say
at
the
moment,
because
it's
really
hard
to
check
how
many
such
connections
are
broken,
but
both
issues
with
rto
and
scene
adder
may
happen
only
if
the
hash
is
recalculated
at
the
client
side,
and
this
provides
an
opportunity
for
a
quick
fix.
This
will
still
have
save
the
improvements
in
case
of
the
outage
and
resolve
the
issues
with
tcp
session
timeouts.
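One way to picture the quick fix hinted at here is as a rekeying policy. This is a purely hypothetical sketch; none of these names or conditions come from the actual Linux implementation or from a published patch:

```python
# Hypothetical policy: only recalculate the flow label for client-side
# unicast flows; anycast-bound flows keep a stable label so stateful
# anycast instances are not hit by the race described above.
def should_rekey_flow_label(is_client_side: bool,
                            dst_is_anycast: bool,
                            event: str) -> bool:
    if not is_client_side:
        return False   # servers never recalculate the hash
    if dst_is_anycast:
        return False   # keep anycast flows pinned to one instance
    return event in ("rto", "syn_rto")

print(should_rekey_flow_label(True, False, "rto"))     # True
print(should_rekey_flow_label(True, True, "syn_rto"))  # False
```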
C
We can, because in the controlled environment we can assume that we can distinguish where the anycast services are and where the unicast services are. So.
C
In the IPv6 world, we have an opportunity to enrich the TCP transport with a self-healing capability. Most of the mechanics are already in place, and today it works in the data center environment, though we need to change the Linux implementation to guarantee robust behavior in general. And this time, as a community, we need to properly document it. There are several open questions: for example, should we focus on TCP only, or try to provide general guidance?
C

B

C
Can you repeat the beginning of the question. So, are you asking about where the hashing is implemented, or who is changing the flow label? Who's changing?

B
C
Yeah, so today it is changed on the host upon RTO events, both for established sessions and upon SYN retransmission.
B

A

B
Did you, did Alex write a draft on this, or is this just a presentation?
C
At the moment this is just a presentation. We are still at the beginning of the road, discussing a proper way to fix it, and then there is a lot of work that must be done on the IETF documentation side, as I said, because we need to update the documentation for the flow label itself and also write a spec that will describe how the hash recalculation should affect...
A
Okay, ready. So the plan is to provide RTGWG as a home for this work, and Tom and Alexander are already working on it, and more people are more than welcome here. The problem is real; it provides a real solution to serious issues. And I would like to thank Alexander for the presentation, and again, please reach out and contribute.
A
We are completely out of time, so I would like to thank everybody for attendance, and I really, really hope to see you face to face in Madrid. Take care, everyone.