From YouTube: IRTF Open session
Description
The Internet Research Task Force (IRTF) Open session at IETF 104 will be held from 12:50 to 14:50 UTC on 25 March 2019. The session includes Applied Network Research Prize presentations.
A: We didn't mean to put up "ARTF" — we really are still the Internet Research Task Force — and with any luck I have now updated the slides, so welcome. I am not going to show you the Note Well slide; I know you've memorized it, so we're not going to show it to you, but there will be a test at the end of the session. And just so you know, the IRTF abides by the IETF Note Well, so if you've seen it multiple times, you've seen it for us as well; we just changed it so it says IRTF.
There are normally two Applied Network Research Prize talks, but one of our presenters was unable to attend last time, and these are going to be terrific talks about large-scale, hard problems. So we'll start out with Brandon Schlinker's talk. Brandon is with Facebook and the University of Southern California, and his award paper is called "Engineering Egress with Edge Fabric: Steering Oceans of Content to the World". Take it away, Brandon.
B: Thank you for the introduction. Good afternoon, everyone. My name is Brandon Schlinker, from the University of Southern California. Today I'm going to be talking about Edge Fabric, a system we built at Facebook to deliver traffic to end users around the world. So let's start off with a brief overview of Facebook's network. Facebook has dozens of points of presence around the world and interconnects with thousands of networks.
Next, we use BGP, the Border Gateway Protocol, to exchange reachability information with those networks. So in this example, from the end-user ISP we receive routes to their end users across the interconnection that we've established with them, and we also receive a route from that tier-1 transit provider.
So what are the challenges to using all this rich interconnectivity? Well, our key objective here is to deliver traffic with the best performance possible, but the challenge in doing that is that BGP doesn't consider demand, capacity, or performance in its decision process. So let's take a look at what problems that creates. We have here a simple example: Facebook on the left is trying to deliver five gigabits per second of traffic to the end users in the ISP on the right.
Now, our router is configured to use those short, direct paths that we prefer, and so, as a result, it puts all of that load onto that upper path, and everything's fine — until later on in the day. Now demand has risen, we're at 12 gigabits per second of demand, and again, BGP at that router can't adapt to demand or capacity in real time. It's simply not possible to express that with BGP's policy terms.
Likewise, BGP doesn't consider performance in its decision process. A simple example of that can be seen here: that upper, preferred route now has a circuitous route on it, so it has added 50 milliseconds of latency, and also some piece of equipment downstream is malfunctioning, adding loss. So in this scenario, the second route, through that transit provider, would actually be preferred.
Now, despite all these problems with BGP and how it doesn't account for capacity or performance, it's still fundamental to interconnection, and it's not going away anytime soon. The thousands of networks that Facebook and other large content providers connect with all expect us to use the BGP protocol.
So I've briefly gone over Facebook's network and given an overview of the challenges. Next, I'm going to dive deeper into our connectivity and the challenges, I'm going to talk about how we sidestep BGP's limitations with Edge Fabric, I'll then talk about Edge Fabric's behavior in production, and finally I'll talk about the evolution of Edge Fabric and some ongoing work.
So, back to those points of presence that we have around the world: at each of those, we have three types of connectivity. First, we have transit providers, and transit providers can deliver traffic to the entire internet. At each PoP we typically have two or more of these for redundancy, and we connect with them through a private circuit, sometimes known as a private network interconnection.
Then we have peers, and we separate peers into two different categories; I'm going to go into detail on why we do that a little later. But in general we have private peers, of which there are on the order of tens per PoP, and again we connect with them through circuits; and we have IXP or public peers, which we interconnect with via internet exchange points. Those are on the order of hundreds per PoP, and we interconnect with them through a shared fabric, which means we don't have a direct circuit between our routers and theirs.
So how do we prefer across these different routes — what is our router configured to do? In general, we apply this very simple policy: we prefer routes from private peers over internet exchange point peers over transit providers. We prefer peers over transit because peers provide a short, direct path to end users, and we prefer private peers over internet exchange point peers because those circuits have dedicated capacity between Facebook and the peer.
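As a minimal sketch (not Facebook's actual router configuration), a preference like that can be modeled by mapping each peer type to a BGP local-preference value and picking the route with the highest value; the peer-type names and numeric values below are assumptions for illustration only.

    # Sketch: prefer private peers over IXP peers over transit via local-preference.
    # The values and peer types are illustrative, not a real configuration.
    LOCAL_PREF = {"private_peer": 300, "ixp_peer": 200, "transit": 100}

    def best_route(routes):
        # Highest local-pref wins; ties fall back to shortest AS path,
        # mimicking part of the BGP decision process.
        return max(routes, key=lambda r: (LOCAL_PREF[r["peer_type"]], -len(r["as_path"])))

    routes = [
        {"peer_type": "transit", "as_path": [64496, 64511], "nexthop": "transit-nh"},
        {"peer_type": "private_peer", "as_path": [64511], "nexthop": "private-nh"},
        {"peer_type": "ixp_peer", "as_path": [64511], "nexthop": "ixp-nh"},
    ]
    print(best_route(routes)["nexthop"])  # -> "private-nh"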
Let's take a look at the ratio of a circuit's peak demand to its capacity, for circuits where we predicted that demand was going to be greater than capacity at least once. What I have here on the y-axis is a CDF of circuits where demand exceeded capacity, and on the x-axis is their peak demand relative to their capacity. So a peak demand here of two indicates a circuit had twice as much demand as its actual capacity.
Second, we wanted ease of deployment, which means we wanted to interoperate with our existing infrastructure and tooling. We have BGP routers at the edge of our network, like most network operators do, and we already have existing tooling for interacting with BGP, so we wanted a system that could interact with that existing infrastructure.
On the right-hand side I have the other extreme, which is host-based routing. That's where each host makes a decision on what route each packet is going to take and then uses some signaling method, such as MPLS or GRE, to signal to the routers at the edge of the network how to handle that packet. Edge Fabric's approach is balanced between these two extremes.
We have a controller that overrides BGP decisions at the router, and our hosts provide hints on packet priority but don't precisely specify how the packet should egress our network. So what does this approach look like? Well, first, routers at the edge of our network keep selecting routes like they do today using BGP. We still have all of our BGP sessions with other networks terminated at those routers. So in this case our router, based on all the information it has received, has selected route A.
Edge Fabric also selects ideal routes, but in addition to all that BGP routing information, it also has access to other inputs. In this case, that means advanced policy information — for instance, preferences we configure based on business reasons or reasons provided to us by a peer — as well as per-prefix traffic rates, circuit capacities, and route performance measurements. Edge Fabric takes all that additional input and also makes a decision, and in this case it has decided to use route B.
So Edge Fabric can perform two types of overrides. It can override the BGP decision in order to move traffic for a set of end users: for instance, we can say, on a per-destination basis, override what BGP would typically do — which is perhaps to send that traffic via a peering link — and instead send it via a transit link.
So let's take a look at how all of this comes together to prevent congestion in our network. We're going back to that example I showed earlier, where we have Facebook on the left trying to deliver 12 gigabits per second of traffic to this ISP on the right, and BGP by default is going to put all of that traffic onto that upper link, because we always prefer those short, direct paths from peers. As a result, that link is going to become overloaded.
So what Edge Fabric does is it understands that this 12 gigabits per second of demand is actually composed of two prefixes, and in this case it understands that if it shifts one of these prefixes away, shifting that traffic to an alternate link — in this case the path via the transit provider — it's going to prevent congestion on the peering link without causing congestion anywhere else.
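The shift just described can be sketched as a simple greedy loop: given per-prefix demand on the preferred link, detour prefixes onto an alternate route until the projected load fits under a target utilization. This is only a minimal illustration of the idea, not Edge Fabric's actual algorithm; the 95% target, the largest-first order, and the data structures are assumptions.

    # Sketch: detour just enough prefixes from an overloaded preferred link.
    def plan_overrides(demand_gbps, capacity_gbps, target_util=0.95):
        # demand_gbps: {prefix: Gbps} currently mapped to the preferred link.
        # Returns the prefixes to move to the alternate route.
        limit = capacity_gbps * target_util
        total = sum(demand_gbps.values())
        overrides = []
        # Move the largest contributors first until the remaining load fits.
        for prefix, rate in sorted(demand_gbps.items(), key=lambda kv: -kv[1]):
            if total <= limit:
                break
            overrides.append(prefix)
            total -= rate
        return overrides

    demand = {"203.0.113.0/24": 7.0, "198.51.100.0/24": 5.0}   # illustrative rates
    print(plan_overrides(demand, capacity_gbps=10))  # -> ['203.0.113.0/24']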
So how does this work at the BGP level? Well, we take that transit route that we've selected and we inject it via BGP, and BGP at all of our routers is configured to prefer routes from Edge Fabric. We do that by configuring local-pref on the BGP sessions from Edge Fabric such that the local-pref of its routes is always the highest and therefore most preferred.
So Edge Fabric monitors BGP decisions and overrides them as needed to prevent congestion in our network. Edge Fabric is able to support a variety of traffic engineering policies, because it operates over a variety of inputs and it can perform overrides at a variety of granularities, and, more importantly, it's compatible with our existing BGP infrastructure. What we've truly achieved with Edge Fabric is centralized control over the traditionally distributed BGP decision process.
Going back to those design priorities I introduced earlier: Edge Fabric meets our goal of operational simplicity because we can always fall back to BGP at the routers if Edge Fabric fails. It allows operators to continue to use our existing tools because routes are injected into those routers via BGP, and synchronization is only required between Edge Fabric and the routers.
Likewise, if I move a significant amount of traffic and now I'm at fifty percent utilization, I'm getting poor utilization of those short, direct links and I'm not making good use of my capacity. So in general what we strive for, based on operational experience, is achieving 95 percent utilization, and this allows us to have high utilization with tolerance for bursty traffic. Now, the key question here is: can we maintain that utilization without any packet loss?
What we did here is we measured across our network during that two-day measurement period, and what we found is that when Edge Fabric is shifting traffic away — meaning that it believes a link would be overloaded if it didn't intervene — 99.99 percent of the time there were no packet drops on that link.
Anything to the left means that the utilization is lower; anything to the right means that it's higher and we end up with potential loss during bursts. So what we find here is that the vast majority of the time we're able to keep the utilization of these interfaces, or these circuits, within 2 percent of that threshold.
So I talked earlier about those two extremes of how you can have routing decisions made at the edge of your network, at routers, or you can have routing decisions made at your hosts. When we actually started off with Edge Fabric, we were using the other extreme: routing decisions made at our hosts. That's called host-based routing. In this model, what Edge Fabric would do is inject its decisions directly into our servers, and then our servers would use MPLS, DSCP, or GRE, depending on the generation of Edge Fabric.
This was to signal to the routers at the edge of our network: send this packet through circuit X. Now, a key challenge there is synchronization: you have to keep routing state maintained across all of your hosts, and if, let's say, circuit X disappears, my servers need to know that it's no longer a valid option for them to route traffic via. In comparison, what we do today — this edge-based routing approach I described — has Edge Fabric inject its decisions into routers at the edge of our network, and overrides are enacted by those routers.
Hosts don't signal the precise path that they want their packet to take. Instead, they just signal to the router information about that packet's traffic class, such as: this is a video packet. This means we don't have any host synchronization, which, in our network, drastically reduces the complexity of a system like Edge Fabric. Further, we have flexibility with DSCP signaling, because we can account for different classes of traffic, and we can always fall back to BGP at our edge routers.
The next thing I want to briefly go over is congestion beyond the edge of our network, and for this example I'm going to talk about internet exchange points. Internet exchange points allow networks to interconnect through a shared switch. So in this case Facebook and another content provider may both connect to this big IXP shared switch, and downstream end-user networks may connect as well. Internet exchange points are often seen as removing barriers to interconnection.
I don't have to provision cross-connects between me and all these other networks I want to interconnect with, but they also create a key challenge, and to see why, let's take a look at this example. In this case, both Facebook and this other content provider have hundreds of gigabits per second of capacity to this internet exchange point, and Facebook wants to send 8 gigabits per second of traffic to those end users while the other content provider wants to send 6 gigabits per second. Now, the problem here is that this end-user ISP...
...they only have 10 gigabits per second of capacity. As a result, we end up with the same problem that I illustrated earlier: demand here is greater than the available capacity, so we end up with congestion and packet loss. Now, the key problem here is that these networks on the left — Facebook and this other content provider — have no visibility past their network edge. They have no understanding of what that other network's circuit capacity is downstream, and even if they did, they can't see each other's traffic.
So what can we do to identify congestion beyond the edge of our network? Well, we've looked at a few different signals. Before, we were looking at per-prefix traffic rates, so I could figure out how much of Facebook's traffic is going to go onto a circuit. That doesn't work here, because cross traffic beyond our edge, from other content providers, is being mixed in and we don't know how much traffic they have. Circuit capacities?
Oftentimes you aren't going to know, downstream, how much capacity my transit has with the end-user network — I have no idea. And what that means is you have to instead use route performance measurements: you have to infer congestion from these performance measurements. But that can be particularly challenging, because you can see things such as latency increases and you aren't sure whether that's due to a path change, or a change in client population, or due to actual congestion.
Likewise, you don't know how much traffic to shift. You have to continuously probe for capacity, as downstream a failure may occur, reduce capacity for 20 minutes, and then be resolved, so it requires a trial-and-error discovery process. Likewise, those interactions with other networks also create complexity.
They may also respond to congestion signals and thereby reduce the amount of traffic they're putting on those links, and you may increase your traffic, and you may oscillate together. So it's very difficult to get a signal here as to how much traffic I should put on this link. Even if you know the current status, congested or not, that doesn't mean that five minutes from now it's going to be in that same status. So, stepping back from all of this: what's really new here? These problems in general have been known for quite some time.
In this case, what we actually have at each router is multiple routing instances, and the DSCP-marked packets arrive at each instance based on the DSCP value. So, for instance, DSCP value 50 will arrive at routing instance 50, and we inject routes into each of those instances. If there's no route injected, the router will fall back to the default routing instance. So this allows us to customize, on a per-destination, per-traffic-class basis, whether or not we're going to override the route.
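As a rough illustration of that per-class override mechanism, the sketch below models a router holding one routing table per DSCP value plus a default table, falling back to the default when no override route was injected for a destination. The table contents and DSCP values are assumptions, not the actual router configuration.

    # Sketch: per-DSCP routing instances with fallback to the default instance.
    import ipaddress

    default_table = {"203.0.113.0/24": "peer_link"}
    # Routes injected by the controller, keyed by DSCP value (illustrative).
    override_tables = {50: {"203.0.113.0/24": "transit_link"}}

    def lookup(dscp, dst_ip):
        dst = ipaddress.ip_address(dst_ip)
        for table in (override_tables.get(dscp, {}), default_table):
            for prefix, nexthop in table.items():
                if dst in ipaddress.ip_network(prefix):
                    return nexthop
        return None  # no route at all

    print(lookup(50, "203.0.113.7"))  # overridden -> "transit_link"
    print(lookup(0, "203.0.113.7"))   # no override for DSCP 0 -> "peer_link"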
D: Aaron Falk. So it seems like one of the effects of this mechanism is that it increases the dynamics of the route changes that packets experience, and I'm wondering if you've looked at the impact that this has on individual flows. I mean, my experience with Facebook is that most objects are pretty small, but it's unlikely that both paths are going to have the same latency, and so for a particular flow...
B: So the way the decision process works today is that we're going to continue to select the same routes, or the same destinations, to shift as the volume increases. So let's say I'm 100 megabits per second over my capacity: I'll choose X to shift. Now I'm 200 megabits per second over: I choose X and Y. What that means is that once we've shifted something over, we're likely to continue to shift it. It's not always the case that we will — there is some amount of optimization there.
E: Hi Brandon, I'm Dave Plonka. Neat idea about injecting the BGP prefixes, and I guess the failure mode, then, is that if Edge Fabric doesn't work, it falls back to BGP. I was wondering: you gave an example where you showed two non-adjacent v4 prefixes aggregating to more than the bandwidth on the 10 gig, and you selectively chose one to offload, say, two and a half gigs of traffic or something, right? Where did you get those prefixes from? Do you synthesize them from what you know is downstream, or are they pre-configured?
B: The general aggregation here is: we get samples from IPFIX or sFlow, we aggregate them up to the most specific prefix advertised via BGP, and then we break those prefixes apart again, further. So let's say I have a /20 which is one gigabit per second of traffic: we'll break that /20 up into smaller prefixes, /21 or /22, until we get down to a certain granularity.
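That splitting step can be illustrated with Python's ipaddress module: take a covering prefix and split it into longer prefixes down to a target length. How Edge Fabric apportions the measured traffic rate across the resulting sub-prefixes is not shown here; the prefix and lengths are just examples.

    # Sketch: break a covering prefix into more-specific prefixes of a target length.
    import ipaddress

    def split_prefix(prefix, target_len):
        net = ipaddress.ip_network(prefix)
        if net.prefixlen >= target_len:
            return [net]
        return list(net.subnets(new_prefix=target_len))

    print(split_prefix("198.51.100.0/22", 24))  # four /24s covering the /22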
B: The decision processes are independent. We actually prefer to move v4 before v6, and that's because with v6 we've seen cases where you shift it to a different route, that route is actually black-holing the traffic, and then you end up oscillating, because you shift the prefix away and back each time. This is likely just because of v6 routes being less well groomed than v4.
F: Yes, my name is Stuart Cheshire, from Apple. Throughout the presentation you talked about demand as being a fixed thing, like we have 12 megabits of demand, or 12 gigabits of demand, going into a 10-gigabit pipe. But all the transport protocols I know, like TCP and QUIC, adapt their throughput, and if you send a sustained 12 gigabits into a 10-gig pipe and lose 20%, it's not going to continue losing 20%: the senders are going to slow down their rate.
So I didn't understand why the normal congestion control algorithms that adjust rate did not slow down when they're too fast and, conversely, speed up when they're too slow. If there's excess capacity, TCP will speed up until it uses all the capacity, because there's no such thing, when I'm looking at Facebook, as loading a picture too fast, right? I want it to load as fast as it can, which should be all the capacity that's available.
B: To be clear here, when I say 12 gigabits per second of demand, that's what our controller sees as the demand to that prefix at that moment in time. The reason that it can be greater than that link's capacity is likely because we shifted traffic away from that link on a previous iteration. So let's say I have a single prefix that, if I was to send it all through this link, would congest it — I would have shifted it away on a previous iteration.
G: It's not a question, it's more of a remark. You are struggling with the good old problem of a congested link and information about congestion, and you stopped just one step before re-inventing frame relay and its means of signaling congestion. I hope a result of your ongoing work will be that you propose something like BECN for BGP. Thank you.
H: This might be in the paper — I have read it — but I think you implied that when you are looking for congestion off-net, through exchange points or in remote networks, there is the act of probing to look for congestion conditions. So I was wondering if you'd consider pulling those kinds of insights directly from TCP, where you already kind of have, in passive observation, some indication of whether transport protocols are being throttled even before packet loss exists.
I: The primary and first control that you have for directing your traffic seems not to be mentioned — well, okay, not explained — and the first thing that you, I guess, are doing is the server selection: deciding to which of your server clusters, at which location, you direct the queries of these customers.
I guess, essentially, the predictions of how much traffic will be generated this way from each of the server clusters go into Edge Fabric as the estimation of the required, or of the generated, demand for the traffic volume. But I wonder: is there no feedback that actually feeds back?
B: To be clear, there are two controllers here; I don't talk about the other one. There is a global controller, which decides which point of presence around the world an end user's traffic will be sent to, and then there's this local controller, which at each point of presence decides how we're going to egress that traffic.
Those two systems do have some cohesion between them, and the interactions that you describe do exist. In terms of how we decide what the demand is at each point of presence: that's not based on the global load balancer, that's based on IPFIX or sFlow measurements at that local PoP. So that allows us to get, in near real time, every 30 seconds, exactly how much load there is right now at that location.
J: So we were looking at BGP data that we collected, and if you follow this development, you see a large increase in BGP communities being used over the last eight years: we have seen an increase of more than 296 percent in BGP communities being used, that is, in individual values in BGP communities. And I looked it up yesterday: it has further increased, so last year around five thousand ASes were using BGP communities and now it's up to ten thousand, and we see seventy-four thousand individual values for short communities.
So for me as a researcher, this means I should probably take a look at what's actually happening there. What are we talking about? We are talking about the short BGP communities. As you probably know, they're defined in RFC 1997: they are a 32-bit value, usually split in half, with the first 16 bits being an AS number and the latter 16 bits being a value, where each AS agrees with its peers upon values — what they should mean and what they are being used for. So there are no strict semantics.
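For reference, a classic community is a single 32-bit value conventionally written and interpreted as ASN:value (16 bits each), while a large community is three 32-bit fields. A minimal sketch of that encoding, with illustrative values:

    # Sketch: encode/decode classic (RFC 1997) and large (RFC 8092) communities.
    def decode_classic(value):
        # Split a 32-bit community into the conventional ASN:value halves.
        return value >> 16, value & 0xFFFF

    def encode_classic(asn, value):
        assert asn < 2**16 and value < 2**16  # classic communities cannot carry 4-byte ASNs
        return (asn << 16) | value

    print(decode_classic(encode_classic(64500, 666)))  # -> (64500, 666)

    # A large community is simply three 32-bit fields:
    # (global administrator = 4-byte ASN, local data part 1, local data part 2).
    large = (4200000000, 1, 2)
    print(":".join(str(field) for field in large))     # -> "4200000000:1:2"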
Peers have to agree upon it among themselves, and, as you have noticed, it's only 16 bits, and we now have AS numbers which are larger than 16 bits. Finally, we got large communities, defined in RFC 8092, and they are 12 bytes. You have three fields, each with significant space to use, so 4-byte-ASN ASes can actually use communities now. Here the first four bytes are defined to be the global administrator.
Besides the confusion of the naming — whether it's "long" or "large" communities — we spotted other problems when we tried to do our measurements. Large communities were not really used in 2018; we only found fifty-one global administrators actually using them, so nothing we could actually measure at internet scale. This has become better, and if you're interested in the uptake of large communities, Emile from RIPE has published an article where he looked into the development of large communities and their uptake.
So now we have around 120 global administrators that are using large communities. But how are they being used at all? In general, communities can be split into two groups. We have informational communities that have passive semantics; they are used for location tagging — where was this prefix learned, in which PoP — and RTT tagging we have seen. And on the other side we have action communities that carry active semantics; they are used for triggering blackholing, or actions in other ASes, for example path prepending.
The problem here is that without documentation of these values you cannot see whether the semantics of a community are active or passive, because, as already mentioned, the peers decide themselves what these community values mean, and there is no bit indicating whether it's an informational or an action community. And this leads to several sorts of problems.
Communities are transitive optional attributes, so they should be forwarded to your peers. RFC 7454 says you should scrub communities you are using inside your network, so you cannot be manipulated from outside, but forward other ASes' communities. So it should be expected that they actually propagate through the internet, but still a lot of people do not expect this, and a lot of transit providers don't actually forward them.
We only found 14% of transit providers propagating received communities, and yes, this value seems small, but the internet graph, or the AS graph, is highly connected, so you actually end up with communities traveling quite a lot. Still, many people do not expect them to be propagated widely, and the problem is that this leads to some potential for misuse, as they propagate through the internet and can trigger actions multiple hops away, and there is no way for an operator to find out whether this is intended or not.
This leads to a problem: you cannot say, well, this is traffic management and this is legitimate, or this is an attack. And we asked ourselves whether there are also unintended consequences in this combination of BGP communities being transitive, forwarded, and used for actually changing routing decisions — and our assessment in the end is yes, there is a high risk of attacks, and we already see some attacks as well.
So what we were looking at: of course, we took all of the publicly available BGP data we could find, and in the end we find that 75% of the BGP announcements that we looked at have at least one BGP community set. In 2018 it was five to six thousand ASes, and now it's more than ten thousand ASes, that make use of these short communities. Now, taking a step back and looking at the propagation again: what can we actually measure, and what can we not measure?
Finally it reaches AS4, so AS4 is recording this community in its routing decision, in its RIB. So AS2 has added this informational community. Now, AS2 is also adding a community for signaling, or triggering an action, in AS3, its upstream, and this is also forwarded to AS4. So both of these communities are now present, or visible, in AS4, but AS4 cannot know who actually added these communities — and neither can we, but we needed this for our measurements.
So we had to come up with a solution. We can only infer which AS is adding a specific community by assuming that the AS number present in the community is actually the AS adding the community; that way we get a lower bound on the travel distance, or on the AS hop count.
For the community added by AS2 this gives the correct travel distance of 2 AS hops, and for the other community, 3:123, a wrongly assumed travel distance of 1, because it gets attributed to AS3, which is just one AS hop away, although correctly it would be 2. But for us this lower bound on the distance is sufficient for our work.
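That inference can be sketched as follows: for each community seen at a collector, assume the AS named in its first field attached it, find that AS on the recorded AS path, and count the hops back towards the collector; if the AS is not on the path, the community is "off-path" and no on-path distance exists. This is a simplified re-implementation of the idea, not the authors' code, and the path and community values are illustrative.

    # Sketch: lower-bound AS-hop distance a community travelled, assuming the AS
    # named in its first field is the AS that added it.
    def community_distance(as_path, community):
        # as_path: ASNs as recorded at the collector, nearest AS first.
        # community: "asn:value" string. Returns hop count, or None if off-path.
        asn = int(community.split(":")[0])
        if asn not in as_path:
            return None            # off-path: the named AS is not on the AS path
        return as_path.index(asn)  # hops from the collector-facing AS to that AS

    path = [4, 3, 2, 1]            # collector peers with AS4; origin is AS1
    print(community_distance(path, "2:23"))    # -> 2 (added by AS2: correct distance)
    print(community_distance(path, "3:123"))   # -> 1 (lower bound; actually added by AS2)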
So if we plot these values — which, again, are a lower bound on the travel distance — we end up with this CDF. On the x-axis you see the AS hop count, and we find that 10% of communities have an AS hop count of more than 6, so they traverse more than 6 different ASes from where we assumed them to have been added, and more than 50% of communities still traverse more than 4 ASes.
And if you compare this with the mean length of AS paths that we have observed, which is around 4.5 or 4.7, this actually means they travel almost through the whole internet, and the longest community propagation we have observed was 11 AS hops — so they do propagate through the internet. Now, looking at another very complex AS topology: AS1, again announcing a prefix to AS2 and adding a community 3:123 to inform AS3, or to have it execute path prepending there.
You will notice that this community value is also propagated to AS4 again, and although it's only intended for signaling something towards AS3, AS4 is also receiving an announcement with this community. So we end up with two different AS paths, and in the first case, for our research, we call this community "on-path", because the AS value from the community is present on the AS path that we record in AS3.
In AS4 we call this community "off-path", because the AS number 3 is not present on the AS path. It could also be that the AS being signaled is further hops away, behind AS4, but in both cases this would be called off-path, because the AS number is not present on the recorded AS path. And if we now take the right part of these community values, separate them by on-path and off-path, and plot them, we end up with this distribution.
On the left side, in the off-path communities, you see quite a number of community values that are related to blackholing — remote-triggered blackholing — and on the right-hand side you see very even numbers that look operator-assigned and easy to remember. We think this comes from the fact that ASes that are not implementing blackholing will just forward blackholing communities, compared to ASes that do blackholing, which follow the blackholing RFC and do not further propagate these communities.
Now, coming to the experiments that we did to show that there actually are some problems out there on the internet: all of the experiments were done first in a lab environment and then validated on the internet, with, of course, operator consent. I will show two different scenarios in this talk; there are more in the paper, and the configurations of our routers are publicly available. So first, going back again and giving an intro: how is remote-triggered blackholing supposed to work? So, AS1 is announcing...
...the prefix to its upstream AS2 and then receiving traffic. This is expected behavior. Sometimes you have the problem that you receive more traffic than you actually want to attract; we call this a denial-of-service attack, and one mitigation is AS1 signaling to AS2 that it wants to blackhole a prefix. Usually this is done in band, in the same BGP peering session in which the normal BGP announcements are being sent, but there are also cases where it's a special BGP session, which has other problems, but not the ones mentioned here.
So AS1 announces the prefix P tagged with the blackholing community to signal to AS2 that it should drop traffic. AS2 is, of course, still announcing the prefix P to all its peers, but without the blackholing community. Now what happens is that AS2 is dropping traffic towards P at its border routers, and the link between AS1 and AS2 is relieved of the attack traffic and is usually usable again.
So you sacrifice parts of your network, or parts of the prefix's IP addresses, to still keep all of the other prefixes and servers reachable. This is how it should work. What we noticed is that for this to be used securely, you need to employ some safeguards. Of course, the provider that is offering blackholing has to check whether the customer is actually allowed to blackhole these prefixes.
So whether these prefixes are owned by the customer, or the customer has permission to blackhole them. This leads to the fact that you need different policies for customers and peers, different access control lists, and it leads to a lot of configuration overhead for a secured usage of remote-triggered blackholing. And, of course, when receiving such communities you have to add NO_ADVERTISE or NO_EXPORT to the announcement so you don't propagate it further. We also noticed some providers translating blackholing to the blackholing communities of their other upstreams.
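The safeguards described above can be sketched as an acceptance check: only honor a blackhole request if the announcing customer is authorized for that prefix, and if the announcement is a sufficiently specific route (blackhole announcements are typically host routes up to a /32) covered by one of the customer's allowed prefixes. This is a simplified illustration with assumed data structures, not a real router policy; the "65535:666" value is the well-known BLACKHOLE community from RFC 7999.

    # Sketch: safeguard checks before honoring a remote-triggered blackhole request.
    import ipaddress

    BLACKHOLE_COMMUNITY = "65535:666"           # well-known BLACKHOLE community (RFC 7999)
    customer_prefixes = {                        # prefixes each customer AS may blackhole
        64500: [ipaddress.ip_network("203.0.113.0/24")],
    }

    def accept_blackhole(customer_asn, announced_prefix, communities):
        if BLACKHOLE_COMMUNITY not in communities:
            return False
        net = ipaddress.ip_network(announced_prefix)
        allowed = customer_prefixes.get(customer_asn, [])
        # The blackholed prefix is typically a more-specific, up to a /32,
        # so check coverage rather than an exact match.
        return any(net.subnet_of(parent) for parent in allowed)

    print(accept_blackhole(64500, "203.0.113.7/32", ["65535:666"]))  # True
    print(accept_blackhole(64511, "203.0.113.7/32", ["65535:666"]))  # False: not authorized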
So you were not even able to do selective blackholing, because they were translating it and announcing it to their peers as well, translating the actual value. Now, what should not be possible is depicted here. We have the same topology, but now AS2 is in the role of an attacker, and AS2 should just be a backup path to the prefix of AS1.
But AS2 is able to actually add the blackholing community, although it's not on the best path. So AS2 announces to AS3 that prefix P should be blackholed, and we noticed AS3 is actually doing that, although the best path is through AS1, and AS1, as the origin for P, is not actually requesting any blackholing. And the other problem that we noticed is that this is even possible if AS2 is not involved in any connection to AS1 at all.
So AS2 can just hijack the prefix P and announce the prefix P with the blackholing community set, and we noticed that in some cases we are able to circumvent ACLs and prefix filter lists, because the blackholing community is checked before any prefix filter lists are applied. We were able to confirm this on the internet; it works multi-hop, and it's hard to spot, because community values are usually unmonitored.
Reasons for that that we found: the blackholed prefix is a more-specific, so you need exception rules in your configuration to accept up to a /32 — essentially everything smaller than a /24 — and some providers check the blackholing community before applying any prefix filters. We even found some configuration guides on the internet which had this problem in the example configuration provided. And the problem here is that there is no validation of the origin of the community: every AS on the path can add a blackholing community for the upstream provider.
Now, yesterday Job Snijders gave a talk at the IEPG where he presented a mitigation for this: if you check that the peer that is announcing the prefix with the blackholing community is on the best path, and only then accept the blackholing, that is one possible mitigation for this attack. If you are interested in that, you should check the recordings of that talk. So you only accept a blackholing if the peer announcing the blackholing for a prefix is on your current best path to that prefix.
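A minimal sketch of that mitigation, under assumed data structures: before acting on a blackhole community received from a peer, check that this peer is the neighbor on your current best path toward the prefix being blackholed.

    # Sketch: only honor a blackhole community if the announcing peer is on the
    # current best path toward the covering prefix (mitigation discussed above).
    import ipaddress

    best_paths = {  # covering prefix -> ASN of the neighbor on the current best path
        ipaddress.ip_network("203.0.113.0/24"): 64500,
    }

    def blackhole_allowed(announcing_peer_asn, blackholed_prefix):
        target = ipaddress.ip_network(blackholed_prefix)
        for prefix, best_neighbor in best_paths.items():
            if target.subnet_of(prefix):
                return best_neighbor == announcing_peer_asn
        return False  # no covering route at all

    print(blackhole_allowed(64500, "203.0.113.7/32"))  # True: peer is on the best path
    print(blackhole_allowed(64511, "203.0.113.7/32"))  # False: off-best-path request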
Now AS2 is our attacker, announcing prefix P with a community to do path prepending in AS3, which leads to the longer path over AS4 and AS5 becoming the preferred path for AS6. Why could this attack be interesting? Well, one thing could be that there is a network tap between AS4 and AS5, and even if you identify that AS2 is your attacker and you screen the network of AS2...
...you will not find any network tap there, because they redirected the traffic on purpose towards AS4 and AS5, where the actual network tap is — or it could be that AS2 is being forced to cooperate here. And the other thing is, it could just be a denial-of-service attack, because it's known that the link between AS4 and AS5 is a very thin link, with not as much bandwidth as would be needed, so by redirecting traffic there you could actually fill up that link. And after I gave this presentation at the RIPE meeting...
...we were actually approached by Dyn, and they pointed us to an article where they found that attackers today are actually already using communities to foster the propagation of hijacks. So the attackers found out that by setting specific community values, their hijack would actually be propagated further in the BGP network. So we already see attacks using communities.
The problems we found are in authenticity, transitivity, standards and documentation, and monitoring of community usage. Starting with authenticity: I mentioned several times that every AS that is on the AS path is able to modify, add, or remove community values on announcements in BGP, and there is no attribution possible. That means even if you find out that there was an incident, you cannot find out who is actually responsible for it. We all know RPKI, but it does not cover communities.
On the other hand, we also see that operators rely on the correctness of community values, because they are basing policy decisions on them, for example on where a route has been learned. Large communities are there, but they only partially improve the situation, because all of these points still apply to large communities; they only fix the problem of the first part being an AS number. So the question is: how can we achieve authenticity, or at least attribution, so that after an incident you know who you have to talk to to prevent further problems in the future?
Another thing that could probably lead to a big discussion is transitivity. We know communities are very helpful in debugging, because you know what is happening in the network and why certain networks are forwarding traffic in a certain way, and they are indeed a very easy, low-overhead communication channel and widely used. But we still only see them being used one or two hops away: you usually do not signal blackholing five or six hops away, and you do not usually need to inform peers six or seven hops away.
So, on the other hand, you have a high risk of abuse in communities being transitive. The question is: do we have a higher risk here, or do we have more benefit? We need a discussion weighing benefit against risk of allowing communities to be fully transitive. Monitoring is another field full of misunderstandings.
We do not know what has happened on the path between peers; even if you are able to look into looking glasses manually, it's very hard to spot differences. So inferring modifications between the origin, or the AS setting communities, and the collector is almost impossible, and even if you were able to record all of these changes, you still have the problem that you do not know what these community values actually mean, and there is no general way of attributing changes, or recording who actually changed anything.
During our research we found another very large problem, and this is with documentation. Because all ASes can define their communities themselves, there is no need for documentation and there is no central point of documentation. We found that some ASes are documenting them in whois or IRR databases, or on their websites; some are only providing community documentation in customer portals, or not at all.
So if you see a community, you cannot find out in an easy way what it means, and even if there is documentation, it's often only in natural language, and parsing this is impossible — we tried, we failed. If you have a very limited scope, for example trying to find out which communities are used for geolocation, or for geo-tagging of prefixes, you can of course look for city names, airport codes, things like that, but parsing community documentation for a general purpose...
They are only using string representations instead of community values internally, and these representations are then translated to short and large community values by their auto-configuration. An example would be something like "tag origin country de", where "de" is a parameter for the community definition "tag origin country". So you see it's a hierarchical system: it says, well, this community is a tagging community, it has passive semantics.
It is tagging an origin at the country level, and the country is Germany. Their system allows the definition of parameters for communities, and these parameters, together with the communities, are documented in one system. They have working code and they are using this in production already; right now they have an internal internet-draft-like document, and if you're interested in that, you should probably talk to Whittaker, who is sitting there and laughing.
So I think this is a great way to actually start documenting communities in a sensible way, because you don't have to operate with magic numbers, and you can actually distribute this documentation and talk about policies and filters with your peers, because you can talk in strings and not magic numbers. And even if you have other router configurations, you can still use these string representations, and you know what you're talking about.
we
came
up
with
some
recommendations
for
operators
based
on
our
work,
of
course,
as
the
RFC
already
States,
you
should
filter
all
informational
comment,
community
values
that
you
are
using,
that
carry
your
a
s
number.
So
if
you
are
using
communities
to
check
where
a
prefix
as
we
learned,
we
should
scrub
these
communities
when
you
receive
them
from
your
peers
because
they
are
defined
by
you
internally
and
used
by
you
internally.
J
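A sketch of that recommendation: on routes received from peers, strip any community whose first field is your own AS number, so externally injected values cannot masquerade as your internal tags. The parsing helpers and values here are illustrative.

    # Sketch: scrub inbound communities that carry our own AS number (per the advice above).
    MY_ASN = 64500

    def scrub_own_communities(communities):
        # communities: iterable of "asn:value" strings received from a peer.
        return [c for c in communities if int(c.split(":")[0]) != MY_ASN]

    received = ["64500:100", "64496:20", "64500:666"]
    print(scrub_own_communities(received))   # -> ['64496:20']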
It might be useful to come up with agreements with your downstreams, to define what they are allowed to do with your upstreams — whether they are actually allowed to do path prepending at your upstream for their prefixes. Of course, publicly documenting the communities you are using is key to enabling other ASes to filter action communities towards you. So if I have a customer where I know he might be playing with BGP, I might want to filter things so he cannot trigger things in my upstream — but you need agreements for that.
So, coming back to the general problem: BGP communities are currently the only feasible way to realize signaling between ASes, but the problem is that secure usage requires good operational knowledge and diligence. We do not think that a very over-complex system is really suitable to secure the shortcomings of BGP communities, but we have to be aware that there is a problem, and while everybody in this room is probably able to handle this and do everything correctly all the time, we cannot rely on that on a global scale.
There are a bunch of people out there who do not know what they are doing, and there will never be a world in which everybody is doing everything correctly. So the question is: can we still rely on, and do we still want, protocols that allow people to make mistakes that will break other people's networks, or do we need an evolution in protocols here — protocols that are less fragile and more usable, or with other safeguards...
...to prevent people from shooting themselves and others in the foot. So, wrapping up: communities are widely in use, they are used to realize policies, and they are needed, but they heavily rely on mutual trust between the peers, because there is no authenticity and no security in place, there is no attribution, and attacks are very hard to detect. One takeaway from our experiments: we did some prefix hijacking that was reported on Twitter, but nobody actually spotted our redirection.
I: But in the end you have to see that, yes, you always have bilateral relations that are mapped into BGP neighborship relations, and what is exchanged there should be seen as something that is essentially just bilateral. And yes, if you want to be a responsible actor in the whole system, you have to really control what you are doing with your neighbors. And if you really take that understanding, you can actually start to build stuff that says: well, okay, you and me are peering, I'm a responsible person, I make an agreement on what we are doing on our relation. And for that, if you and me are making a decent effort at controlling, at doing the right policy for implementing our agreement, we actually have a chance of using that as fairly trustworthy. I might even go out and offer you an agreement in which I promise you that I am doing stuff, where I am relating to you, in a controlled manner. The communities that Randy is sending to me — this is something that does not work recursively, however.
L: Very beautiful work, thank you for bringing it. I mean, this is well known, and I've been facing this for like 15 years probably, since we started doing remote blackholing. The mitigation today is mostly basic hygiene: you take care of what you accept and you take care of issues. And there are some telltale consequences in very beautiful stuff — things like bandwidth communities that are very useful in data centers: they were made non-transitive just to avoid those issues, but since most data centers use eBGP, we can't propagate those communities. So there's definitely work needed.
M: ...to bring up to speed those folks who might not be super familiar with QUIC. Perfect. So if you are an expert on QUIC and you know the background material, I apologize in advance; please bear with me as I go through it. All right, so I'm going to be talking about taking a long look at QUIC. This was measurement work that appeared at IMC 2017.
I don't think I need to convince anyone in this room that Internet connectivity is important, but just to set the stage and put things in perspective: in 2015, 3.2 billion people had access to the Internet. Obviously that number has increased over the past four years, but in that same year the number of people with running water was less.
These two numbers next to each other — I find them depressing, for reasons that are out of the scope of this talk — but it emphasizes the importance of Internet connectivity. We use it in our personal life and in our professional life; virtually every business depends on the internet, and their viability is tied to the performance of the networks that they're operating.
Naturally, there's a lot of effort to try to improve these networks and make them more reliable and more performant. The IETF is one of those efforts, and we do a lot of things: we come up with new protocols, we use traffic management techniques to make sure that our networks are utilized in a way such that everyone's demands are met, and we even design our applications to adapt themselves to the underlying network, so we improve the user experience. QUIC is one of those efforts; it's a transport protocol.
It stands for Quick UDP Internet Connections, it started at Google, and it was basically a transport protocol designed with today's needs in mind. QUIC was designed for a bunch of main reasons. The first one was to facilitate rapid deployment. What does that mean? If you think about HTTP: you have HTTP, hopefully you have TLS underneath, and it's running on TCP, which is your transport protocol. As you all know, TCP is implemented in the kernel, so it's in kernel space. What does that mean?
QUIC, by contrast, runs in user space on top of UDP, typically inside the application. So what this means is that whenever you have a new version of QUIC — let's say you're browsing the web and you're using a browser — all the users need to do is update their browser, and then they have the new version of QUIC. Obviously, this also means that a lot of the guarantees that TCP provides, like reliable delivery, have to be provided by QUIC itself on top of UDP.
Another main reason for QUIC, which Google never shied away from pointing out, was to avoid ossification by middleboxes. We all know there are many middleboxes in networks; these could be NATs, or security firewalls, or web caches, and many other appliances. A lot of them do claim that they improve performance — perhaps in some cases they do — but there's also a lot of evidence that they actually do more harm than good.
One of the examples that I find very interesting: this was joint work by Google and T-Mobile a few years ago, presented at the Velocity conference, where they basically looked at YouTube traffic over the T-Mobile network and how it interacts with their web proxies, and this is a summary of the findings from their slides. They basically found that it's better if the YouTube traffic does not go through their proxies, because they were hurting performance — and I don't want to point any fingers at T-Mobile or YouTube.
This is not an issue isolated to them. Another example, taken from a Cloudflare blog post, where they were basically saying: we had TLS 1.3 enabled for a while, but no one was using it, because the browsers were not supporting it, and they were not turning it on because middleboxes were breaking it. To be fair, it wasn't just middleboxes — there were other issues that prevented TLS 1.3 from being deployed at scale — but middleboxes were not helping.
TCP Fast Open is another example that a lot of folks believed never got deployed at scale because of middleboxes, and the list goes on. All of these things can happen because in TCP all of your headers are in the clear, so middleboxes can see them and act upon them: they can modify them, drop them, add headers, or break your connections in two — all the things that you're familiar with. Whereas in QUIC pretty much everything is encrypted, so you take all of that away from middleboxes.
They can't do any optimizations or meddling. And finally, QUIC was proposed to improve performance — just a side note here: performance for HTTP traffic. I should mention that QUIC is eventually going to be a general-purpose transport protocol, but it started with HTTP in mind and that's its biggest use case right now; it's very integrated with HTTP. So throughout this talk, whenever I say QUIC, we are basically going to focus on HTTP over QUIC.
So whenever I say QUIC, I mean HTTP over QUIC. QUIC improves performance through a number of optimizations. The most famous one is zero-RTT connection establishment. If you're familiar with TCP, you have that three-way handshake to establish a connection before you can send any data; if you have TLS on top of TCP, as you should, well, there are going to be more RTTs. QUIC tries to achieve zero-RTT connection establishment, and what that means is that you can start sending data from the very first packet. Obviously that doesn't always work.
You should have contacted the server before and have valid keys for zero-RTT to work; if you don't, it's going to be one or two RTTs, but after that everything else is going to be zero-RTT. QUIC also prevents head-of-line blocking. What is that? If you have an HTTP stream — if it's HTTP/1 — you have a stream, you have to open a TCP connection.
If you have more than one stream, then you have to open more TCP connections, and we all know that all those connections have overhead and compete over bandwidth, so it's not a great use of resources. HTTP/2 solves this by multiplexing HTTP streams into a single TCP connection. This is great — it gets rid of a lot of overhead. However, if any of these streams is blocked for whatever reason, then all of the streams are blocked, and the reason for this is that TCP is agnostic to the HTTP streams.
As far as TCP is concerned, you have a stream of bytes that needs to go from one end to the other end. QUIC solves this by basically mapping HTTP streams onto QUIC streams. Now, having that logic of streams in QUIC, if one of the streams is blocked, the rest of them are not going to be blocked and can proceed normally.
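The difference can be illustrated with a toy delivery model: with a single in-order byte stream (TCP-like), one missing packet holds back data for every multiplexed stream behind it, whereas with per-stream ordering (QUIC-like) only the stream that actually lost data has to wait. This is a conceptual sketch, not either protocol's real reassembly logic.

    # Toy model of head-of-line blocking. Each packet carries one chunk of one stream.
    # (stream, offset_within_stream, payload); the network loses one of stream B's chunks.
    packets = [("A", 0, "a0"), ("B", 0, "b0"), ("A", 1, "a1"), ("B", 1, "b1")]
    lost = {("B", 0)}
    arrived = [p for p in packets if (p[0], p[1]) not in lost]

    def tcp_like(sent, arrived):
        # Single in-order byte stream: nothing after the first missing chunk is
        # delivered to *any* stream, because TCP sees only one sequence space.
        delivered = []
        for pkt in sent:                       # original send order = TCP sequence order
            if pkt not in arrived:
                break                          # gap blocks everything behind it
            delivered.append((pkt[0], pkt[2]))
        return delivered

    def quic_like(arrived):
        # Per-stream sequence spaces: a loss only blocks the stream it hit.
        delivered, expected = [], {}
        for stream, off, data in sorted(arrived, key=lambda p: (p[0], p[1])):
            if off == expected.get(stream, 0):
                delivered.append((stream, data))
                expected[stream] = off + 1
        return delivered

    print(tcp_like(packets, arrived))  # [('A', 'a0')]              -- A blocked by B's loss
    print(quic_like(arrived))          # [('A', 'a0'), ('A', 'a1')] -- only B waits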
QUIC has improved loss recovery: it mitigates the ACK ambiguity problem that TCP has, and it has better RTT and bandwidth estimation. A lot of this good loss recovery comes from the fact that you can easily change the congestion control as well. So, for example, if you have BBR, a new congestion control, you can easily replace your old one with the new one, and that comes back to the first point I talked about: you can easily deploy changes.
A little bit of history: QUIC started in early 2010 at Google, as I said; I think it was in 2013 that it was publicly announced, and Google started using it soon after. There was a spec draft, and towards the end of 2016 the IETF working group started, and the working group has been very active. There are many implementations of QUIC around; Google's QUIC is at version 47 now, and the working group is working fast, and hopefully soon we're going to have a standard version of QUIC and everyone's going to be using that.
So that's why QUIC started, and a little bit of its history. But, as I said, one of the main reasons for QUIC was improved performance, so Google has been reporting on QUIC's performance: they've been using it heavily and they've been putting out reports that it helps with page load time, with YouTube rebuffering — all these great numbers saying that it's perfect and it's very promising.
However, the issue with these is that they're all aggregated statistics, and not really reproducible by anyone else unless you're Google and you have access to that data, and they don't really report any controlled tests — again, everything is aggregated statistics. At the time we started our work there were other evaluations of QUIC in research venues; however, most of them were limited to certain environments and networks, with limited tests, and they used old, untuned versions of QUIC — I will get into what that means in a bit — and the results they provided were not necessarily statistically sound.
Neither did they provide good root-cause analysis for the performance they observed. So we basically wanted to fill those gaps and provide a more comprehensive evaluation of QUIC and how it compares to TCP. As I said, we're going to look at HTTP performance, and we're going to compare QUIC and TCP.
Our servers host a bunch of web pages and objects with different sizes, and pages with different object sizes and different numbers of objects, and we fetch them using QUIC and TCP and compare the performance. I must point out that, even though I'm not going to go into the details, once we get all the results we run a statistical test to make sure any difference that we see is not due to noise, or network variations, or things that are not really differences between the protocols.
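The talk does not spell out which test is used, so as a generic stand-in (not the authors' exact procedure), here is how one could check whether two sets of download-time samples differ significantly, using Welch's t-test from SciPy; the sample values and the 0.05 threshold are illustrative assumptions.

    # Generic illustration only: significance check on two sets of download times.
    from scipy import stats

    quic_times = [1.21, 1.18, 1.25, 1.19, 1.22]   # seconds, illustrative samples
    tcp_times  = [1.40, 1.38, 1.44, 1.41, 1.37]

    t_stat, p_value = stats.ttest_ind(quic_times, tcp_times, equal_var=False)
    if p_value < 0.05:
        print(f"significant difference (p={p_value:.3g})")
    else:
        print("no statistically significant difference")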
So whenever we report a difference between the two protocols, we are confident that this is a difference in performance and not noise or anything else. So the setup is pretty simple, but in 2016, when we were doing these tests, we had this big issue of finding a server that supports QUIC. It's not like TCP — there wasn't a QUIC module for Apache servers, and there weren't many options. Basically, our two real options were either to use...
...Google's servers, because Google at the time had QUIC — basically host our stuff on Google servers and run our tests against Google — or to use a server that comes within the Chromium code base. Well, the first option, Google servers, didn't really work for us, for the first obvious reason that we had no control over it.
It's half a second! So basically one third of our download time is wait time. We did some tests, and we realized this wait kind of exists in Google App Engine. We weren't sure why it's happening; obviously we didn't have any access to the server to investigate this more, and this was not good for us, because if we're checking performance and comparing millisecond times, half a second of wait time is not okay. So we decided to use the server in Chromium.
So the bar on the left is doing the exact same experiment, but with the Chromium server, the server that is part of Chromium. Now you can see that that huge wait time is gone — that's great — but now our download time is much bigger compared to QUIC to Google, sorry, and this is problematic, because these two plots next to each other are telling me that the server in Chromium cannot provide the performance that QUIC is able to provide — we clearly see that Google is doing better.
So we had to try to infer what configuration Google's servers were using and fine-tune our Chromium server to make sure it matched the performance that Google gives. We did that; along the way we found some bugs and fixed them. I'm not going to go into the details, but I'm happy to talk about it offline. After we did that, the bar on the right is basically the same experiment using our Chromium server after adjusting it, and not only is the wait time gone, the download time is now in line with what Google's servers give.
M
So now that we have our setup complete, our test bed complete, we did some tests. Let me start by showing you results from a desktop client. I'm going to show you some simple results where we're downloading objects of different sizes, from five kilobytes to ten megabytes, downloading them at different bottleneck bandwidths, and comparing how QUIC and TCP perform relative to each other.
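(As a rough sketch of how such a controlled download test could be reproduced, the snippet below shapes an emulated bottleneck with Linux tc/netem and times a fetch; the interface name eth0, the URL, and all rates and delays are assumptions rather than the paper's actual test-bed parameters, and the same netem knobs cover the loss and reordering used in the later experiments.)

    # Sketch: emulate a bottleneck link, then time one object download.
    # Assumes root privileges and that "eth0" is the interface to shape.
    import subprocess, time, urllib.request

    def set_link(rate="10mbit", delay="18ms", loss="0%"):
        # One-way delay of 18 ms applied on each direction gives roughly a 36 ms RTT.
        subprocess.run(["tc", "qdisc", "replace", "dev", "eth0", "root",
                        "netem", "rate", rate, "delay", delay, "loss", loss],
                       check=True)

    def timed_fetch(url):
        start = time.monotonic()
        urllib.request.urlopen(url).read()     # plain HTTPS/TCP fetch; the QUIC
        return time.monotonic() - start        # runs used a Chromium-based client

    set_link()
    print("download time:", timed_fetch("https://server.example/10MB.bin"), "s")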
M
So in this case the RTT is 36 milliseconds and the loss is negligible, and those numbers, 45 and 44 percent, mean that when we download that 5 kilobyte object using QUIC and TCP, the download time for QUIC is 45 percent better than for TCP. Now, to avoid bombarding you with a lot of numbers, I'm going to replace that with a heatmap. Just think of it as: red means QUIC is doing better, blue means TCP is doing better, and white means there's no statistically significant difference between the two protocols.
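(To make the reading of these cells concrete: each cell is the relative difference in download time, (TCP − QUIC) / TCP. A minimal plotting sketch is below; the matrix values are random placeholders, not the paper's measurements, and the real heatmap additionally leaves cells white when the difference is not statistically significant.)

    # Sketch: render percentage improvement of QUIC over TCP as a red/blue heatmap.
    import numpy as np
    import matplotlib.pyplot as plt

    sizes = ["5KB", "10KB", "100KB", "1MB", "10MB"]      # object sizes (columns)
    rates = ["5M", "10M", "50M", "100M"]                 # bottleneck bandwidths (rows)

    quic = np.random.uniform(0.1, 1.0, (len(rates), len(sizes)))   # placeholder times
    tcp  = quic * np.random.uniform(0.9, 1.6, quic.shape)

    pct = (tcp - quic) / tcp * 100     # > 0: QUIC faster (red); < 0: TCP faster (blue)
    plt.imshow(pct, cmap="RdBu_r", vmin=-50, vmax=50)
    plt.xticks(range(len(sizes)), sizes)
    plt.yticks(range(len(rates)), rates)
    plt.colorbar(label="% by which QUIC beats TCP")
    plt.show()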
M
So if I complete this plot, you can see that at pretty much every bottleneck bandwidth and for every object size QUIC is doing better than TCP; we're able to download the object faster. So this is great. We then threw some loss into the picture, and we saw that QUIC is still doing better than TCP in pretty much all cases. We increased the RTT and worked with different RTTs; in this example the RTT is 112 milliseconds, and again QUIC was doing way better than TCP.
M
So far everything was great and we were very excited, and then we did this experiment where we added some packet reordering, and as soon as we added packet reordering things started to change. We actually saw cases, that label is covering the plot, but the blue cells on the right side of the plot are the big objects, the last column is a 10 megabyte object, where with packet reordering QUIC is doing worse than TCP. So we wanted to see why this is happening.
M
We looked at QUIC's code, instrumented the code, and looked at TCP to see how it copes with packet reordering, and basically what we found is that TCP has this mechanism: when packets get reordered, it increases its reordering threshold and can cope with that reordering. QUIC didn't have that mechanism in place, and when packets were reordered deeper than its NACK threshold, it basically assumed those packets were lost, so it went into loss recovery, and we all know what that means: performance was going down.
M
Sorry, I skipped ahead. All right, so we looked at the NACK threshold; the default NACK threshold for QUIC was 3. So we wanted to see, and I'm looking at the example where we're downloading a 10 megabyte object, so it's a big object, a sizable transfer, whether QUIC can benefit from the same mechanism as TCP. So we started playing with the NACK threshold, and (there's a big latency between my clicker and the slide) we saw that as we increase the NACK threshold,
M
QUIC's performance actually gets better, and when we let the NACK threshold increase up to 300, which is actually the upper bound that TCP is allowed to increase its threshold to, then QUIC is able to recover: it's able to cope with the packet reordering and actually starts performing better than TCP.
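(A toy model of the loss-detection behaviour being described, where a packet is declared lost once enough later packets have been acknowledged ahead of it; this is not QUIC's or TCP's actual code, just the idea, and the thresholds 3 and 300 mirror the numbers from the talk.)

    # Toy model: packet-threshold loss detection under reordering. Every packet
    # eventually arrives, so anything "declared lost" here is a spurious loss
    # that triggers needless retransmission and congestion-window backoff.
    def spurious_losses(arrival_order, threshold):
        declared_lost = set()
        pending = set(range(len(arrival_order)))   # sent, not yet acknowledged
        highest = -1
        for pkt in arrival_order:                  # packet numbers in arrival order
            pending.discard(pkt)
            highest = max(highest, pkt)
            for p in list(pending):
                if highest - p >= threshold:       # reordered deeper than threshold
                    declared_lost.add(p)
                    pending.discard(p)
        return len(declared_lost)

    # Packet 0 delayed behind the following nine packets:
    order = list(range(1, 10)) + [0]
    print(spurious_losses(order, threshold=3))     # 1 -> misread as loss
    print(spurious_losses(order, threshold=300))   # 0 -> reordering tolerated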
M
All right, so the next thing we wanted to look at was 0-RTT, because that's one of the big advertised improvements in QUIC. So we want to see how much 0-RTT helps, and I'm going to go back to our base example, where there's no loss and we have a 36 millisecond RTT. As I talked about, QUIC is doing much better than TCP; this is QUIC versus TCP.
M
You can really sense the benefit when the object size is small. When your object is big, naturally, because your transfer is longer and your connection setup time is a very small fraction of your transaction, it doesn't have a big effect. That is still great, because if you think about the web, most of the time you're actually requesting very small objects, so 0-RTT can help a lot in those scenarios.
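(A back-of-the-envelope model of why the 0-RTT benefit shrinks with object size: the handshake cost is a fixed number of round trips, so it dominates short transfers and vanishes in long ones. The sketch ignores slow start and server processing, and the RTT and bandwidth values are assumptions.)

    # Rough model: fetch_time ≈ handshake_rtts * RTT + size / bandwidth
    RTT = 0.036                 # 36 ms, as in the base scenario
    BW  = 10e6 / 8              # assumed 10 Mbit/s bottleneck, in bytes/second

    def fetch_time(size_bytes, handshake_rtts):
        return handshake_rtts * RTT + size_bytes / BW

    for size in (5e3, 100e3, 10e6):                      # 5 KB, 100 KB, 10 MB
        full = fetch_time(size, handshake_rtts=2)        # setup before data flows
        zero = fetch_time(size, handshake_rtts=0)        # 0-RTT resumption
        print(f"{size/1e3:8.0f} KB: 0-RTT saves {(full - zero) / full:.0%}")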
M
Sorry. So, comparing these two plots together, as we said, 0-RTT only helps for smaller objects, but we can see that QUIC is doing better for bigger objects as well. So we wanted to see what it is that QUIC does that helps it perform better. I have an experiment here which is a little bit extreme, but I like it because it helps visualize things a little bit better.
M
So basically the takeaway is that QUIC is way more aggressive and better at adapting itself to changes in the available bandwidth, which is great, but it also made us think: if QUIC is so aggressive in adapting itself to the available bandwidth, how is it going to play with fairness to other traffic? Because, as we know, we want different flows to be fair to each other, so that no flow shuts down other flows. So we made TCP and QUIC compete with each other over a bottleneck bandwidth, and we actually found out that QUIC is not fair to TCP.
M
We found out that QUIC is taking more than its fair share of the bandwidth. We repeated that experiment with QUIC competing with multiple TCP flows and we still got the same results, and to make sure this was not our environment, we made QUIC compete with QUIC, and things were fair, and TCP compete with TCP, and everything was fair; but when the two protocols were competing with each other, QUIC was not being fair to TCP. We wanted to dig in a little bit deeper, so here I have the congestion window size for the two protocols.
M
In this example they're both using Cubic, and as you can see, they start from the same congestion window size, but QUIC quickly increases its congestion window and takes an unfair share of the bandwidth, causing TCP to basically slow down. If you zoom in, you can actually see that QUIC is increasing its congestion window much more aggressively.
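(A minimal sketch of how this unfairness can be quantified from measured per-flow throughputs using Jain's fairness index, where 1.0 is a perfectly even split; the throughput numbers are placeholders, not the measured values.)

    # Jain's fairness index over per-flow throughputs.
    def jain_index(throughputs):
        n = len(throughputs)
        return sum(throughputs) ** 2 / (n * sum(x * x for x in throughputs))

    quic_vs_tcp = [7.0, 3.0]    # hypothetical Mbit/s split on a 10 Mbit/s bottleneck
    tcp_vs_tcp  = [5.1, 4.9]
    print("QUIC vs TCP:", round(jain_index(quic_vs_tcp), 3))   # noticeably below 1.0
    print("TCP  vs TCP:", round(jain_index(tcp_vs_tcp), 3))    # essentially 1.0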
M
All right, so I have one last thing to talk about before I run out of time, and that is mobile devices. In everything I talked about so far, the client is a desktop device. Again, going to my base example of no loss and a 36 millisecond RTT, we saw that QUIC is doing better than TCP in most cases. However, we repeated the same exact experiment, but this time the client is a mobile phone, and what we saw is that, well, one, QUIC is still doing at least as well as TCP.
M
You don't see any blue cells in there, but the performance gains of QUIC started to diminish: QUIC is doing better than TCP, but the gap is not as big as for a desktop client. So we wanted to see why this is happening, and what we did was instrument the QUIC code to try to infer a state machine and see what's happening inside QUIC and what state the protocol is in at every point in time.
M
So I'm going to show you this state machine for the case where we're downloading a 10 megabyte object at 50 megabits per second, and it looks something like this: it's a classical state machine. You have different states, the percentage of time that you spend in every state, and the transition probabilities. This is a little bit difficult to read, so I'm going to replace it with a table, and as soon as I do that,
M
hopefully things are going to become clearer. As you can see, when we're using a desktop machine, QUIC is in the application-limited state for only 7% of the time, and that's the state where the client is receiving data faster than it can consume it. But as soon as you go to a mobile device, where resources are more scarce, QUIC is in the application-limited state for 60% of the time, and this is exactly the price that QUIC is paying for being implemented in user
M
space: you're constantly context switching between user space and kernel space, which is fine on a resourceful device, but when you're on a mobile device, things are not that great.
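(As an illustration of how time-in-state fractions such as the 7% versus 60% figures can be computed from an instrumented connection log; the log format and state names below are assumptions about what such instrumentation might emit, not the actual tooling used in the paper.)

    # Sketch: fraction of connection time spent in each sender state,
    # computed from (timestamp_seconds, state) transition events.
    from collections import defaultdict

    events = [(0.00, "SLOW_START"), (0.20, "CONG_AVOIDANCE"),
              (0.45, "APPLICATION_LIMITED"), (0.95, "CONG_AVOIDANCE"),
              (1.60, "APPLICATION_LIMITED"), (2.00, "DONE")]   # hypothetical log

    time_in_state = defaultdict(float)
    for (t0, state), (t1, _next) in zip(events, events[1:]):
        time_in_state[state] += t1 - t0

    total = sum(time_in_state.values())
    for state, t in sorted(time_in_state.items(), key=lambda kv: -kv[1]):
        print(f"{state:<20} {t / total:6.1%}")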
So that's all I had to talk about. To sum it up: we looked at a protocol that was rapidly evolving, and honestly, sometimes it felt like measuring shifting sands.
M
There are a bunch of other tests that I didn't have time to talk about, but I encourage you to read the paper if you're interested. We instrumented the code and extracted a state machine, and that helped us provide some root-cause analysis for the performance that we were seeing. Finally, I just want to point out that this work was done two years ago, so at the time QUIC was at version 36; as I said, Google's QUIC is now at version 47.
M
However, nothing stops us from doing the exact same measurements on the new versions. We actually did that: in the paper we looked at QUIC from version 25 to 36, so we have that evolution of QUIC performance, and we can do the same thing for newer and future versions. And with that, I'm happy to take questions.
N
Thank you. And one follow-up; I believe it was two slides forward, perhaps three, the fairness question.
N
About the fairness: what I wanted to ask about was, you noticed the difference in the fairness, and I assume you mean that QUIC was consuming a higher proportion of the bandwidth. Did you compare that to sort of the expectation, as observed here, that QUIC normally performs better than TCP? Presumably that must mean TCP will leave some of the bandwidth underutilized, or less utilized, and is it the same proportion here, or how different is it?
N
So maybe this is too complicated to ask at the mic, but what I was trying to get at is that we expect QUIC to perform better than TCP based on the prior observations, even when they're not competing, right? Which means that on the same kind of link TCP must be leaving some bandwidth unutilized in order for QUIC to be able to beat it, right? So how?
N
M
It's definitely a fairness issue. I don't know if this answers your question, but we ran this for a very long time, so we let them both get to that equilibrium, and we could see that when there is no competition, TCP is able to utilize the bandwidth fully, or almost fully. Okay, okay, that makes sense, thank you. Yeah.
N
D
That was one of my questions. So I have a clarification question and a larger question. The clarification question is: what's the queueing discipline you were running on your bottleneck link? Are you running BBR? Were you running RED, some AQM, drop tail? Oh, I...
D
The larger question is sort of going back to your very initial remarks. First, really: great work, very interesting, nicely presented, thank you; this is a good paper, thank you for coming here and presenting it. I'm interested in the 200 million users who have internet and no electricity, right? I think there's a lot of attention being paid to QUIC as, you know, higher performance and better utilization of congested resources, but I rarely see performance numbers.
D
M
The reason I didn't put it in here is because the things that I put in here I wanted to be cases that I can isolate and then show where the difference is coming from. But, as I said, we found in 3G networks and in poor networks that QUIC is still doing better than TCP. Most of our experiments, though, were in controlled environments that we set up by hand, right.
M
But it's a good start. Yeah.
O
M
O
P
Montenegro, Microsoft. So thank you very much for this work, and I think I heard you say that there may be some ongoing work, some more research. If that's the case, could I add a suggestion: gQUIC, or Google QUIC, is all fine and good, but the whole focus of the IETF effort is iQUIC, right, the IETF variant of QUIC?
P
If you could include that in your findings, it would potentially be more relevant for the future than gQUIC, because possibly everybody at some point will be on IETF QUIC, so that's one suggestion. The other one is more of a comment: you indicated that QUIC is implemented in userspace, but that's one implementation; ours, for example, runs in kernel or user space, it doesn't matter, so you could run it in the kernel or you could run it in user space. It's not part of the protocol itself, right? So I understand that for the tests you needed to do...
M
A
So, a pitch for the remainder of the year: there are four more great ANRP talks to come. If you want the links and can't find them for some reason, I did put up an agenda slide set that is in the tracker, so you can find that, and also a humorous, somewhat related picture. But in any event, thank you for being here, thanks for the great questions and to our speakers, and that's the end of IRTF Open.