Description
The Internet Research Task Force (IRTF) Open session, including Applied Networking Research Prize (ANRP) presentations, will be held during IETF 114 at 19:00-21:00 UTC on 25 July 2022.
A: Okay, I could have a few minutes now, so you should probably get started. It's, I don't know; so, you may want to come up and get ready at the microphone.

A: Okay, I will assume that you can hear me in the room. If that's not true, please say so in the chat.

A: So, welcome everybody. This is the IRTF open meeting at IETF 114. I'm Colin Perkins, the IRTF chair. Hopefully you can all see and hear me in the room; I'm remote.

A: So, I would like to begin with the usual note. Well, slides.

A: There's a bit of echo, but hopefully it's not too bad. So, this is the IRTF open meeting. The IRTF follows the IETF's intellectual property rights disclosure rules, and a reminder that, by participating in this meeting and by commenting on the presentations, you agree to follow the IRTF processes and procedures, including disclosing any intellectual property relating to the contributions that you make.

A: I'm sure most of you have seen these slides before; the details are in the documents linked. But essentially, if you have IPR on the documents you're talking about, you need to disclose that if you're commenting at a microphone.

A: In addition, a reminder that the IRTF routinely makes recordings of these meetings available, both the online and the in-person meetings, including this one, and this meeting is being streamed live on YouTube as well as via the usual Meetecho system.

A: If you're participating in person and you are not wearing one of the red "do not photograph" lanyards, then you consent to appear in these recordings. And if you speak at the microphones, then again you're consenting to being recorded, and, as I say, the recording is being made available on YouTube.

A: Equally, if you're participating online and you turn on your camera or your microphone and make a contribution, then that is being recorded and you consent to being recorded. The chat is also being recorded and will be made available in the usual Jabber archives.
A: As a participant in the IRTF, as I say, you acknowledge that recordings of the meeting may be made available, and that any personal information you provide will be handled in accordance with the privacy policy, and you also agree to work respectfully with the other participants in the IETF and the IRTF. If you have any issues or concerns about that, speak to me or speak with the ombudsteam; the IETF's code of conduct and anti-harassment procedures linked on the slide also apply to the IRTF.

A: For those of you participating in person, please sign in using the mobile Meetecho, the Meetecho Lite tool. We're running the queue electronically, so if you have questions, we're using the electronic queue that's accessed via the Meetecho tool, and keep the audio and video off if you're using the on-site version, the Meetecho Lite tool. Remote participants, please leave your audio and video off unless you're presenting or asking a question, just to avoid feedback.

A: Also a reminder, for those of you who are attending the meeting in person, that as a COVID safety measure the IETF is requiring those of you attending the meeting in person to wear an FFP2 or N95 mask, or its equivalent. The only exception to that is the chairs and the presenters who are actively speaking. In particular, participants who are making comments or asking questions from the floor microphones are expected to wear a mask at all times, including while they're asking those questions.
A: Okay, so, as I say, this is the IRTF open meeting. The goals of the IRTF are to complement the standards work being done in the IETF by focusing on some of the longer-term research issues. The IRTF is very much a research organization; it's not a standards development organization, and while it can publish RFCs, and we do publish both Experimental and Informational documents in the RFC series, the primary output of the IRTF is research, is understanding, is research papers.

A: The IRTF is organized as a series of research groups; hopefully you can see them on the slide here. The Crypto Forum group and the Privacy Enhancements and Assessments group met earlier today. The other groups, highlighted in dark blue on the slide, are meeting later in the week, so please do look out for those groups this week and try to attend the sessions if you're interested in those topics.

A: A little bit of research group news: I'd like to welcome Kurtis Heimerl, who has recently joined as co-chair of the GAIA group, the Global Access to the Internet for All research group. Kurtis will be joining Leandro Navarro, who is planning on stepping down from chairing that group after this meeting, and Jane Coffin, who is continuing. So I'd like to welcome Kurtis.

A: And I'd like to thank Leandro for his many years of service to the group; I very much appreciate the efforts Leandro has put into chairing the group, and I look forward to working with Kurtis going forward.

A: As I say, the IRTF is primarily a research organization; we tend not to publish many RFCs. We've had one RFC published since the last meeting, from the Information-Centric Networking group, looking at architectural considerations for using an ICN name resolution service. But primarily the IRTF tends not to publish much in the RFC series, and the output is more in the form of interesting presentations.
A: To support that, we run the Applied Networking Research Prize. The goal of this prize is to recognize some of the best recent results in applied networking research, to recognize some interesting new ideas which are potentially relevant to the Internet standards community going forward, and to recognize up-and-coming people who are likely to have an impact on the Internet standards process and Internet technologies. We're very grateful to the Internet Society, to Comcast and NBCUniversal, for their sponsorship of the ANRP, which allows us to make these awards and bring different people in to give these talks.

A: What we're doing today, the goal of this session, is to make some of these awards. So I'd like to congratulate Tushar Swamy and Sam Kumar, who will be giving their award talks.

A: In this session today, Tushar will be talking first, in a couple of minutes, about his work on data plane architectures for machine learning inference at line rate, and Sam will be following later in the session, talking about TCP for low-power networks.

A: Going forward, look out for more award talks: Gautam Akiwate, Corinne Cath, and Daniel Wagner will be giving their talks at IETF 115, and the nominations for the 2023 awards will be opening in September 2022. So look out for those; look out for the nominations opening online.

A: Okay, hopefully that's better. As I was saying, look out for the nominations for the 2023 ANRP in September this year, and congratulations to Tushar and to Sam, who will be giving their ANRP talks today.
A: In addition to the Applied Networking Research Prize, we also host the Applied Networking Research Workshop, which we organize in conjunction with ACM SIGCOMM.

A: This workshop is taking place tomorrow; it's co-located with the IETF in Philadelphia. Thank you to Tijay Chung and Marwan Fayed, the chairs this year, who have been organizing that workshop. We've got a program of, I think, four really nice research papers, a keynote, and some invited talks on novel approaches to protocol specification.

A: As I say, the workshop's happening tomorrow. If you're there in person, then please do consider attending; if you're attending remotely, then you can register and attend. Registration is free for anyone who's also registered for the IETF, although we do ask you to register separately so we know who's attending the workshop. The ANRW next year will again be co-located with the IETF, in July 2023, which is planned to be in San Francisco.

A: And to finish up before we get to the talks, I'd just like to note that we are very pleased to offer a number of travel grants for these meetings, both to support early-career academics and PhD students from underrepresented groups to attend the IRTF research groups, and a number of travel grants for the Applied Networking Research Workshop.
A: Thank you very much to the travel grant sponsors, to Akamai, Comcast, Cloudflare, and Netflix, for supporting that. Please see the travel grants page linked from the website for details, and if you're interested in sponsoring the travel grants in future, or if you're interested in applying for a travel grant, see that web page or contact me for details of those sponsorship opportunities. Again, thank you very much to the sponsors.

A: So that's essentially all I have to say today. The agenda for the remainder of the session: we have the two ANRP award talks. Tushar Swamy will be first, talking about Taurus, a data plane architecture for per-packet machine learning, and that will be followed by Sam Kumar's talk on performant TCP for low-power wireless networks.
A: Okay, I will at this point switch over to Tushar. Can you check the microphone while I get the slides up?

A: Okay, so I should have control over that. While Tushar is checking to see if that works, I'd just like to say that, as I said, the first talk today: he'll be talking about Taurus, a data plane architecture for per-packet ML. Tushar is a PhD candidate in the Electrical Engineering department at Stanford.

A: His research focuses on the intersection of machine learning, networking, and architecture, and he works on the hardware and software stack for data-plane-based machine learning infrastructure and applications. Tushar is due to graduate this year and, I understand, he's on the job market. So if you like this work, please do talk to him; he'll be around at the IETF all week. And if you find this talk interesting, I believe he's also going to be presenting in the COINRG session later this week.
C: Awesome, thanks Colin. Cool, so I'm going to be talking about Taurus, which is a project that my colleagues and I have been working on. Taurus is essentially a data plane architecture for per-packet machine learning, and I'll go into a little bit of what that means.

C: This here is a quote from a 2015 Google blog, and at that time Google was already dealing with over one petabit per second of total bisection bandwidth, and it's only grown larger and harder to scale since. So what we're essentially dealing with here is a situation where networks require more and more complex management at higher and higher performance, so the time is ripe for finding new solutions here. One of the promising solutions in this area is machine learning. Machine learning can allow us to take in data from the network and make progressively better and better decisions as we train our models, and these machine learning algorithms can approximate network functions based on the data they see. They're also going to customize their operation to the data they're training on, which in turn means that these machine learning algorithms are actually customizing their models to the network itself. And we're sort of doing elements of this already with handwritten heuristics in the network.

C: Something like an active queue management algorithm, or hashing and load balancing, comes with operator-tuned parameters. So all machine learning is doing here is taking the next step by automating the search for the kinds of parameters that allow these functions to work well within your network.

C: So, if we're okay with using machine learning, we now need to examine where exactly in the network it has to happen. I'm sure many of you are already familiar with software-defined networks: essentially, the control plane and the data plane are split, and the control plane is responsible for policy creation, essentially in the form of flow rules, which are installed into the data plane. The data plane is where you're going to find your switches, and they're doing packet forwarding via match-action.
C: On the left here I have a diagram of the same typical software-defined network, but on the right I have a software-defined network with the Taurus worldview. What we've actually done here is split the machine learning operation into training, which is going to happen in the control plane, and inference, which is going to happen in the data plane.

C: So in the control plane, policy creation is going to take the form of flow rules plus ML training, and when installing this information into the data plane, it's going to be sending flow rules as usual, but also the ML model weights. And in the data plane, we're going to be doing our typical match-action packet forwarding, but we're also going to be doing decision making with ML inference.
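To make the training/inference split concrete, here is a minimal, purely illustrative Python sketch of the worldview described above. All names (ControlPlane, DataPlane, install, process) are hypothetical and do not correspond to any real SDN controller or Taurus API, and the "training" step is a placeholder.

```python
# Illustrative sketch of the Taurus worldview: the control plane trains offline
# and installs model weights alongside flow rules; the data plane does
# match-action forwarding plus per-packet inference. Hypothetical names only.

class DataPlane:
    def install(self, flow_rules, weights):
        self.flow_rules, self.weights = flow_rules, weights

    def process(self, packet):
        features = [packet.get("len", 0), packet.get("port", 0)]    # parse / clean features
        score = sum(w * f for w, f in zip(self.weights, features))  # per-packet ML inference
        if score > 1.0:                                             # interpret the result
            return "drop"
        return self.flow_rules.get(packet.get("port"), "forward")   # ordinary match-action

class ControlPlane:
    def __init__(self, data_plane):
        self.data_plane = data_plane

    def update_policy(self, flow_rules, training_data):
        weights = [0.001, 0.01]   # stand-in for offline ML training on collected data
        self.data_plane.install(flow_rules, weights)

dp = DataPlane()
ControlPlane(dp).update_policy({80: "forward"}, training_data=[])
print(dp.process({"len": 1500, "port": 80}))   # "drop" for this toy model
```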
C: And that brings me to one of the core tenets of Taurus, which is essentially that ML inference should happen per packet in the data plane, and the intuition here is relatively straightforward.

C: You want to be able to do per-packet operation because that is the finest granularity of traffic, essentially operating on a packet scale. Now, not every application may need per-packet-level operation, but there are applications that need it, so the platform should be able to support per-packet operation. And the data plane is where the packets are, so if we're going to be making decisions on packets, it should happen in the data plane.

C: Oh, I think PowerPoint animations don't play well with the PDF; that's okay. What's basically happening here is some rough, off-the-cuff math. Say you have traffic at one gigapacket per second moving through your data plane, and think about the time it takes you to send a packet digest from the data plane up to the control plane, calculate flow rules, and then install them back into the data plane.

C: In this case we've assumed half a millisecond for each step, so we've missed 1.5 million packets in our traffic stream by the time we have flow rules installed into the data plane. In the example here we're doing anomaly detection, so we're trying to find out if incoming packets are malicious or benign, and maybe, if we find that one is malicious, we're going to install some rule to, say, block that IP. So we've missed 1.5 million packets during this flow-rule installation time, and by the time we block that IP, we've already let a ton of potentially malicious traffic into the network. The whole takeaway here is really just to show you why we can't let the operation for this level of application happen in the control plane: if we're committing to using machine learning, we can't have inference happen in the control plane.
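As a back-of-the-envelope check of the numbers quoted above (a minimal sketch assuming the talk's illustrative figures of one gigapacket per second and three half-millisecond steps):

```python
# Rough control-plane reaction-time arithmetic from the talk's example numbers.
packets_per_second = 1e9     # 1 gigapacket per second through the data plane
step_delay_s = 0.5e-3        # 0.5 ms assumed per step
steps = 3                    # send digest up, compute flow rules, install rules

reaction_time_s = steps * step_delay_s
missed_packets = packets_per_second * reaction_time_s
print(f"Packets missed while reacting: {missed_packets:,.0f}")   # 1,500,000
```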
C: So, fundamentally, the conclusion here is that the robustness and performance of your network are going to be determined by the quality of your reaction and the speed of your reaction. In the machine learning worldview, the quality of the reaction is going to be determined by your training data: how much do you have, what kinds of cases does it cover, how well is it cleaned. But also your speed of reaction: in the case of anomaly detection, you want to act on a malicious packet immediately.

C: So, zooming in on the control plane, let's talk a little bit about the actual implementation of how you do this. I mentioned before that we're going to split our machine learning into training in the control plane and inference in the data plane, and the key here is that training is off the critical path. If packet forwarding is happening in the data plane, then the control plane is not responsible for making per-packet-level decisions, which means that we can do our machine learning training there at leisure. Essentially, we can put in whatever the latest and greatest ML accelerators are, whatever your favorite ML framework is, install it in a control-plane server, and have it training models offline.

C: The trickier part comes in the next step, where now we need to deal with the actual critical path, basically tackling packets as they come. Machine learning inference here is going to happen in the data plane, like I mentioned, and the final outstanding question then is: if we're okay with doing training in the control plane, we can use whatever existing hardware we want, but what do we do about the data plane?

C: Do we have, say, a switch that can do inference at line rate, with per-packet operation? This is really the crux of Taurus, and that's what it needs to do. Taurus is an architecture for per-packet machine learning inference in the data plane.
C: So let's jump into the actual hardware and how we enable this kind of machine learning inference at line rate. I have a picture here of a PISA pipeline, a protocol-independent switch architecture. These are the typical programmable structures you'll find in these kinds of switches: some sort of programmable packet parser, match-action tables that allow you to encode your network functions, and then maybe a programmable traffic manager.

C: What does that look like, and, more specifically, what is the abstraction with which we're going to create our programmable machine learning fabric? In Taurus we use the map-reduce abstraction. MapReduce is really useful for machine learning because it supports a lot of the common linear algebra operations that you need for your ML algorithms. This covers everything from neural networks to support vector machines to k-means, all these different kinds of applications. And just as an example, I have here in the picture an example of a single neuron from a deep neural network.

C: You can see exactly how map and reduce are applied here. In this case, in the blue box, we are doing an element-wise multiplication: that's our map, with multiplication over inputs and weights. Then we're applying a reduction, which is going to essentially add all the values together, so you're going to produce a scalar value from your vector of inputs. And then, finally, we're going to apply an activation function. That suffices for a single neuron, but you can mix and match this pattern ad nauseam to create a full neural network.

C: By stacking extra copies of these blocks in parallel you'll be creating a layer of neurons, and then by stacking them sequentially you'll be creating multiple layers, and that's how you can create, say, a deep neural network.
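Here is a minimal sketch of the map-reduce view of a neuron described above, written in plain Python/NumPy purely for illustration; the actual Taurus design expresses this pattern in the Spatial HDL on reconfigurable hardware, not in software, and the ReLU activation here is just an example choice.

```python
import numpy as np

def neuron(inputs, weights, bias=0.0):
    """One neuron as map-reduce: map = element-wise multiply, reduce = sum to
    a scalar, then an activation function (ReLU chosen as an example)."""
    mapped = inputs * weights           # map: element-wise multiplication
    reduced = np.sum(mapped) + bias     # reduce: vector -> scalar
    return max(0.0, float(reduced))     # activation

def dense_layer(inputs, weight_matrix, biases):
    """A layer is the same pattern instantiated once per neuron; stacked in
    parallel in hardware, written as a simple loop here."""
    return np.array([neuron(inputs, w, b) for w, b in zip(weight_matrix, biases)])

# Example: a 4-neuron layer over a 3-element feature vector.
x = np.array([0.2, 0.5, 0.1])
W = np.random.rand(4, 3)
print(dense_layer(x, W, np.zeros(4)))
```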
C: The other advantage of the map-reduce pattern comes from the kind of performance that it enables, primarily from the SIMD parallelism: same instruction, multiple data. We can get a lot of performance out of the parallelism with minimal logic. This is as opposed to what you might find in, say, a typical Tofino pipeline, where they have VLIW pipelines, which give you much more flexibility.

C: But the cost there is that there's a lot of logic needed for the communication hardware, and that ends up taking up a lot of the overall on-chip area. In addition, SIMD parallelism gives us the ability to unroll the loops in our algorithms.

C: The idea of unrolling here: if we take the example of, say, a single layer of a neural network, and say you have four neurons in that layer, you can either execute them sequentially, doing one neuron after the other, or, if you have the resources, you can instantiate all four of them in parallel. The trade-off is that more unrolling is going to give you better performance, essentially doing all four of those neurons at once, while less unrolling gives a much higher latency, and we get this kind of control with the SIMD pattern essentially by adjusting unrolling factors.

C: So we went ahead and essentially augmented the switch pipeline with a MapReduce unit that implements the patterns I just described. We still have our typical programmable elements, a programmable packet parser, match-action tables, and a traffic manager, but you can see in the center we have this MapReduce unit, and that's essentially what's going to do our machine learning inference. There are a couple of little idiosyncrasies about the arrangement of the pipeline that I want to point out, and that's how we use these different elements in a machine learning context, even if they're typically network elements. A packet parser is normally for pulling out the headers from your packets and doing whatever you want with your match-action rules.
C
So
the
match
action
tables
before
mapreduce
can
be
doing
some
sort
of
cleaning
on
the
features
and
then
the
match
action
tables
on
the
output
on
the
right
side
of
the
mapreduce
unit
can
be
doing
some
sort
of
interpretation
of
the
results,
and
so
when
we
actually
went
to
design
this
mapreduce
unit,
there's
a
couple
of
things
that
came
up
it
turns
out.
You
can't
really
just
stick
an
accelerator
into
the
switch
pipeline,
so
what
we
did
was
we
kind
of
established?
What
were
the
the
points
that
we
wanted?
C
C
And
it
has
some
meat
line
rate
with
the
fixed
clock,
so
this
essentially
rules
out
an
fpga
makes
an
fpga
will
give
you
a
variable
clock.
We
want
it
to
be
deterministic
and,
of
course,
line
rate
is
our
performance
requirement
and
then
minimal
area
and
power
overhead.
We
don't
want
to
blow
up
the
entire
chip
area,
adding
in
like
a
a
map
reduced
block.
It
should
be
something
that
is
small,
but
gives
you
access
to
a
whole
class
of
applications.
C
And
so
finally,
the
one
little
thing
to
note
here-
that's
kind
of
interesting-
is
that
most
of
these
ml
accelerators
are
built
to
do
batch
processing
in
an
effort
to
get
high
throughput,
but
in
the
network
pipeline
you're,
actually,
processing
packets,
as
they're
coming,
which
means
that
you're
operating
on
a
batch
size
of
one
which
is
turns
out,
puts
a
lot
of
different
performance
demands
on
the
hardware
than
a
typical
accelerator
would
see.
C
C
So
you
can,
you
can
imagine,
say
a
packet
coming
into
the
switch
pipeline
and
we
want
to
see
essentially
whether
it's
malicious
or
benign,
so
the
packet
hits
the
first
stage
and
that's
where
we're
going
to
do
our
packet
parsing.
So
we're
going
to
read
local
features,
say
our
ip.
C
Whatever
information
we
can
extract
from
the
packet
itself,
the
packet
is
going
to
move
to
the
second
stage,
which
are
the
match
action
tables
and
from
there
maybe
we're
going
to
do
some
sort
of
retrieval
of
out-of-network
events,
so
these
would
be
different
kinds
of
elements
of
metadata
that
the
control
plane
may
have
installed
into
the
match
action
tables.
So
something
like
the
failed
logins
per
ip.
C
The
packet
then
moves
to
the
the
center
block
the
mapreduce
unit.
That's
where
we're
going
to
apply
our
learned
anomaly
detection.
So
you
can
imagine
this
is
maybe
a
binary
neural
network
and
it
gives
it
a
a
score
from
zero
to
one
on
how
anomalous
it
is
so
one
is
definitely
anomalous.
Zero
is
benign
and
finally,
the
match
action
or
the
the
packet
will
move
to
the
post-processing
match
action
tables,
and
that's
where
we
do
our
interpretation.
C: So in the paper we actually did a full ASIC analysis of this Taurus hardware; we wanted to show essentially that it has minimal overhead and that it's feasible to build something like this. We based our evaluation platform on a coarse-grained reconfigurable architecture called Plasticine, and we programmed our applications in the Spatial hardware description language. Spatial is just an HDL that lets you use these kinds of parallel patterns, like map and reduce, to program your reconfigurable architectures at the loop level. The basic architecture of the MapReduce unit here is really just a grid of compute and memory tiles, so it's easily scalable and very, very straightforward.

C: In the compute units we have SIMD lanes that are operating in parallel and a reduction network that allows us to implement the reduce operation, and the memory units are just blocks of banked SRAM. We're doing deep pipelining within the compute unit, but then we're also doing pipelining one level higher, between the compute and memory units. So the idea here is SIMD parallelism everywhere and then pipeline parallelism everywhere, and that's how you get your performance, really.

C: So we went through a set of real-world applications and programmed them onto our ASIC, and we ended up using a 12x10 grid to support all of them. We compared it to state-of-the-art switches with four pipelines; our reference was 500 square millimeters, and we found that our grid, which could support these different applications, was only adding a 3.8 percent overhead, or 4.8 square millimeters per pipeline.

C: So again, earlier I said we want minimal area overhead, and 3.8 percent is pretty low given that you're now getting an entire class of machine learning applications.
C: Jumping into one of these applications, I've been using anomaly detection as a recurring example. Here we tried out two different types of anomaly detection, with support vector machines and a deep neural network. For both models, you can see the throughput is one gigapacket per second, which is the line rate for high-end switch pipelines like your Tofinos and Broadcoms. The latency that we added was in the hundreds of nanoseconds or less, so in this case you would choose your application accordingly.

C: You can see here that the SVM requires 83 nanoseconds, while the DNN requires 221 nanoseconds. So, depending on your SLOs and what kind of requirements you have to meet, you can choose your algorithm to reduce latency. And in both cases, the area and power overhead required for the hardware to implement just these applications is in the single digits or less: 0.6 percent power overhead and 0.5 percent area overhead, or 0.8 and 1.0 percent respectively.

C: Again, if you don't need, say, the full suite of benchmarks, and you only want a reconfigurable fabric that will let you do anomaly detection, you can do it with minimal overhead here. And in the paper there are several more applications, if people are interested, such as a congestion control network and a traffic classification network.
C: So we went through this whole process of doing an ASIC analysis to prove that it could be done, but as far as research goes, we don't really want anyone waiting for some sort of mass-produced Taurus ASIC, so we've put out an open-source FPGA-based testbed. This is just a rough diagram of what it looks like. In the control plane we're using your typical network OS, like ONOS; we're using a Tofino switch to mimic the PISA pipeline elements, like your programmable packet parsers, match-action tables, and traffic managers; and then we're using an FPGA to mimic the MapReduce unit. We set it up in this bump-in-the-wire configuration, and because of the limits of an FPGA you're not going to be able to hit the same performance as you would get with the ASIC coarse-grained reconfigurable architecture.

C: Essentially, in the example I mentioned earlier about anomaly detection, either we're trying to do detection of anomalous packets in the control plane, or we're trying to use Taurus and do anomaly detection in the data plane, and with the testbed that I just showed you, you can do either. In the case of Taurus, we'd be placing our anomaly detection application on the FPGA, while if we're trying to do control-plane-based anomaly detection, we would run it at the controller, on the CPU.

C: The F1 score of the model in software is 71.1, and you can see that Taurus, on the far right side of the table, is achieving an F1 score of 71.1 as well. So it's faithfully recreating the model as it was in software, and we're processing packets as they're coming in. Whereas in the control plane, we actually had to sample packets from the network, run them through the control plane, run them through an ML framework, and try to install flow rules.

C: And what ends up happening is that you miss so many packets while doing this operation that your effective F1 score drops pretty heavily. You can see in the far right column that the F1 score for the baseline ranges from 1.5 down to almost zero, so you're effectively throwing away your model because of the added latency.

C: So that's just one example of what we did with our FPGA testbed. There are, of course, lots of other things you can do, but it's just to reinforce the point of why you have to operate in the data plane.

C: Cool, so yeah, that's mostly it for me. I have my contact information here, and at the bottom I have the GitLab link for the FPGA testbed; we hope people can try it out. And there's the link to the full paper at this easy-to-memorize URL.

C: So yeah, I'm happy to take any questions.
A: Okay, thank you very much, an excellent talk. Since we have some people remote and some in the room, I think if we can manage the queue using the Meetecho queueing tool, that would be helpful. I do see, I guess it's Barry at the microphone there.

D: Okay, actually, I'm being Dave Oran right now. Dave Oran asks: I assume the class of anomalies you can detect are those that can be detected by header fields within the width of the ALU of the switch; things in the packet data beyond the headers won't be seen. Is that correct?
C: So, in the case of anomaly detection, we used the NSL-KDD dataset, which has a record of different attacks that were calculated from, like you said, either header fields, or you can also actually calculate aggregate fields across headers. So you can, say, create a histogram using the match-action tables across different packets.

C: The packet headers are going to be limited by the packet header vector size that's moving between stages in the switch pipeline, but you don't necessarily need to be limited to features in the header, because the control plane can install different types of metadata into the match-action tables, and you can do your own processing in the match-action tables over time, or whatever other kinds of calculations you want to do on your headers. So the headers are just the starting point for the features here.
E: So the first one, and this is the naive attendee question: I suspect the paper is very important for interpreting that last table. It was really quite opaque how to understand the meaning of the columns and their impact on a comparison to the baseline. I think there's a lot of implicit knowledge in your table structure. I'm sure the paper explains it; the slide was just a bit opaque to a naive reader.

E: So, at the start of your talk, that was the first point you made: a case to say that the delay between doing a packet sample, constructing table match rules in the controller, injecting those rules down into the functional plane, and applying them had a huge packet loss and mismatch interval. But it seems to me the delay to perform the ML operation, tune your ML, have a model that is representative of the condition you want to model, and then install that, has a similar cost. It's not to say there's no benefit of ML.

E: I think it's huge, but the component that's about the delay cost of doing an instantiation of rules, I don't think, is the basis for doing it. I think you're on stronger ground arguing it's about the ability to do complex matching at line rate than about the static cost of doing the rule installation, right?
C: So you're right about installing the model itself. The idea is that you could be sampling packets from your network and sending different kinds of metadata to the control plane, essentially doing your training offline, and you can install model weights or replace model weights as needed. The idea is that whatever is operating in the data plane itself has nothing to do with the installation of model weights, yeah.
E: I thought that idea, that you could do the model training asynchronously to the sampling, exactly, is very beneficial. But if you consider a new class of attack, you have to understand it and do some form of Bayesian analysis and classification, which is completely unmodelled here. Exactly how you do that training is unknown, and how long that takes; it's not about the speed of the chipset, it's about your ability to do the good/bad classification a priori, to inform the model and then download it. That's quite a high cost in time.
C: Yeah, so this is always kind of the trouble with security, right? If you want to do an on-the-fly analysis of a brand new attack, that's not really what we're targeting.

E: But in engineering terms, your case that this is extremely fast at line rate is well made. I enjoyed listening to it a lot. Thank you.

C: Thank you.
C: Yeah, so I think the energy consumption needs to be looked at maybe more holistically. While you are increasing by some small percentage the energy that you'd be consuming in the switch itself, you can consider that, say, if you're doing anomaly detection, you're removing the cost of running an anomaly detection application in software on a server somewhere else.

A: Okay, thank you. Well, I have questions, for sure.
A
This
is
a
an
irtf
meeting,
which
is
which
is
co-locating
with
the
ietf,
and
obviously
you
know
since
that
this
communicates
with
the
itf.
The
question
is,
then,
you
know
to
what
extent
have
you
given
any
thought
towards
how
that
how
this
might
change
or
affect
the
type
of
work
the
ietf
does?
C: Yeah, I think one of the things that an earlier questioner brought to my attention was what kind of standardization is needed for packet headers if we're going to be using them as features, or carrying model weights, or basically doing these kinds of ML-assist type operations.

A: I mean, I'm thinking that your traditional programmable switch uses P4 or something like that as a programming model. Do we need a similar standardized programming model for these types of ML switches?
C: Yeah, so it's kind of a complement to P4. We went with MapReduce, so we're not necessarily married to the idea of using a MapReduce block or anything; the bigger idea here is just doing inference in the data plane. But yeah, it could definitely help to have some sort of standardization, in the way that P4 works, but for the MapReduce element.

C: You could even consider, say, an extra control block in P4 for MapReduce, and we actually have another paper in submission on what the language-level constructs here look like. So yeah, that's definitely an area for standardization as well.
A: It just took a little while, yeah. Okay, great. All right, so the second talk today is focusing, I think, on a very different problem domain. In this talk, Sam Kumar will talk about his paper on performant TCP for low-power wireless networks. This was originally presented at the NSDI conference in 2020.

A: If I remember correctly, Sam is a PhD student at UC Berkeley, advised by David Culler and Raluca Ada Popa. He's broadly interested in systems, security, and networking, and his research focuses on rethinking systems design to manage the overhead of using cryptography, and presumably also on improving the performance of TCP for low-power wireless networks.

A: So, Sam, over to you.
B: Okay, thanks Colin for the introduction. As you said, I'm Sam, and I'm going to present my research on performant TCP for low-power wireless networks. This is joint work with my collaborators at UC Berkeley and, as you mentioned, it was published in 2020 at NSDI.

B: So LoWPAN research began in the late 1990s, and at that point in time researchers deliberately cast away the Internet architecture, based on the idea that LoWPANs may have to operate in extreme environments too different from regular networks for the Internet architecture to directly apply. So many of the early protocols, like S-MAC, T-MAC, and so on, and the early systems, like TinyOS and Contiki, did not conform to any particular standard or architecture.

B: But, surprisingly, the adoption of IP did not come with TCP. For example, OpenThread, a LoWPAN network stack developed by Nest and used in the smart home space, didn't even support TCP, and instead the community has come to rely on protocols like CoAP, which are specialized LoWPAN protocols based on UDP.
B: It's also worth pointing out that during this time LoWPANs have not yet achieved the same kind of pervasive adoption that we've seen with other protocols, like Wi-Fi, at least in the context of bringing Internet access to devices. So a natural question is whether, to get that kind of pervasive adoption of LoWPANs, we should adopt not only IP but also the broader set of IP-based protocols, including TCP.

B: Now, there have been a few prior attempts to use TCP in this space, typically based on a simplified, embedded TCP stack like uIP or BLIP, and what we can see in this graph is that our work, TCPlp, achieves significantly higher goodput than prior attempts to use TCP in this space.

B: I'd also like to share an update that's happened since we published this research, which is that OpenThread, the low-power network stack that I mentioned that's used in the smart home space, has since adopted TCP directly, based on our research: it uses TCPlp as its TCP implementation. The research also influenced Thread, the network standard that OpenThread implements.
B
So
now
I'm
going
to
take
a
step
back
and
provide
some
more
context
as
to
what
exactly
low
pans
are
and
what
some
of
the
challenges
are
with
using
low
pens,
and
I
can
do
that
by
comparing
low
pans
to
other
wireless
technologies
that
you
might
be
more
familiar
with
so
on.
The
left.
Wi-Fi
provides
a
host
with
internet
access
via
an
access
point
in
the
middle
bluetooth.
It
doesn't
really
provide
full
internet
access,
it's
more
like
a
cable
replacement,
channel,
a
wireless
usb
of
sorts
and
then
on
the
right.
B
B
The
second
set
of
constraints
come
from
the
link
layer,
low,
pass
link,
clear
like,
for
example,
ieee
802.15.4
has
a
small
mtu
of
only
about
100
bytes
and
has
low
wireless
range,
which
means
that,
in
order
to
in
order
to
get
connectivity
over
a
large
area,
you
need
to
transmit
data
over
multiple
wireless
hops
and
finally,
energy
constraints
are
also
an
issue.
B
You
typically
don't
have
enough
energy
to
keep
your
radio
on
and
listening
all
the
time.
So
you
duty
cycle
your
radio.
What
that
means
is
that
your
radio
is
actually
in
a
low
power,
sleep
state
for
say,
99
of
the
time
and
then
one
percent
of
the
time
you
can
turn
on
your
radio
to
send
or
receive
packets
and
in
order
to
provide
an
always-on
allusion
to
applications.
B
So
to
make
this
more
concrete,
I'm
going
to
tell
you
about
the
platform
we
used
in
our
research.
It's
called
hamilton
and
some
of
the
stats
of
this
platform
are
on
the
slide.
The
key
point
here
is
that
this
kind
of
device
is
more
powerful
than
the
devices
we
had
when
low
band
research
first
got
started
in
the
early
2000s,
but
it's
still
substantially
less
powerful
than
even
a
raspberry
pi.
You
cannot
run
linux
in
a
device
like
this.
B
Instead,
you
have
to
run
a
specialized
embedded
operating
system,
and
you
can
understand
our
research
as
tackling
the
central
question
of
how
should
a
device
like
this
connect
to
the
internet
and
the
result
of
our
research
is
that
we
show
that
tcp
ip
works.
Well
now,
as
I
mentioned
earlier,
the
adoption
of
iep
in
this
space
did
not
include
tcp,
and
that
was
no
accident.
B
The
reason
is
that
researchers
had
doubts
as
to
whether
tcp
would
work
well,
and
we
expected
it
to
not
work
well,
given
the
challenges
of
lopens
so
here
are
some
quotes.
I've
taken
from
some
research
papers
to
show
some
of
the
concerns
that
the
community
has
had
about
using
tcp.
The
first
one
is
that
tcp
is
not
lightweight
and
may
not
be
suitable
for
implementation
and
low-cost
sensor
nodes
with
limiting
processing
memory
and
energy
resources.
B
The
second
one
is
that
certain
features
of
tcp
may
cause
harm
like,
for
example,
that
the
connection
oriented
protocol
aspect
of
tcp
is
a
poor
match
for
wireless
sensor
networks,
where
actual
data
may
only
be
in
the
order
of
a
few
bytes
and
finally,
there's
the
wireless
tcp
problem.
The
idea
that
tcp
may
use
a
single
packet
drop
to
infer
that
the
network
is
congested
which
can
result
in
extremely
poor
performance,
because
wireless
links
tend
to
exhibit
relatively
high
packet
loss
rates.
B
So
again,
to
summarize
more
simply,
there's
concern
that
tcp
is
too
heavy,
that
its
features
are
necessary
and
that
it
will
perform
poorly
in
the
presence
of
wireless
loss.
So
central
to
our
research
was
understanding,
tcp's
performance
and
what
we
did
is
we
did
a
study
where
we
actually
ran
tcp
in
a
low
pan,
measured
its
performance
and
try
to
draw
conclusions
about
how
well
tcp
really
does
or
does
not
perform,
and
what
we
found
is
that
out
of
the
box
tcp.
B: The issues on the right, it turns out, are fixable within the paradigm of TCP, with fairly straightforward techniques. So in our research we show why the expected reasons don't actually apply, we demonstrate techniques to address the actual issues causing poor TCP performance, and our overall conclusion is that TCP can perform well in LoWPANs.

B: Okay, in the next part of the talk I'm going to focus on why the expected reasons for poor performance don't apply, and, going back here, I'll be talking about this technique in this part of the talk, the reason being that this part of the talk is more about our experiments and our observations about the expected reasons.

B: The TCP stack was, of course, running on the Hamilton platform directly. Our software stack is using OpenThread with RIOT OS, and we used a wireless testbed to collect data, where each of those numbers is one of our Hamilton nodes. The lines connecting them show an example of a topology; in reality, OpenThread is going to generate this dynamically.
B
So
one
of
the
first
things
we
had
to
do
was
to
implement
tcp.
Now,
as
I
mentioned
earlier,
there
have
been
several
prior
attempts
to
use
tcp
in
this
space
based
on
simplified,
embedded
tcp
stacks,
but
we
wanted
to
use
a
full
scale,
tcp
stack
in
our
study.
Now.
The
challenge
is
that
implementing
a
full
scale.
B
Tcp
stack
is
hard
and
in
fact,
there's
an
entire
rfc
devoted
to
all,
describing
all
the
problems
that
people
were
seeing
in
full
scale,
tcp
stacks
back
in
1999,
even
though
these
tcps
had
matured
for
at
least
a
decade.
By
this
point,
so
our
approach
was
not
to
implement
a
tcp
stack
from
scratch,
since
we
felt
it
would
be
too
error
prone
to
do.
B
B
So
now
that
we
have
our
implementation
of
tcp,
we
can
concretely
answer
the
question
of
what
are
the
resource
requirements
of
running
tcp.
So
what
we
found
is
that
tcp
lp
requires
32
kilobytes
of
code
memory
and
about
half
a
kilobyte
of
data
memory
per
connection
to
store
all
of
the
tcp
connection
state
in
a
full
scale.
B
Tcp
implementation,
while
our
platform
has
substantially
more
code
and
data
memory
than
that
now
as
an
optimization,
we
use
separate
structures
for
active
sockets
that
are
actually
endpoints
of
a
tcp
connection
and
passive
sockets
that
are
just
listening
for
new
connections,
which
also
save
a
bunch
of
memory
as
well.
But
the
point
here
is
that
you
know,
at
least
in
terms
of
connection
state,
we're
well
within
the
bounds
of
the
available
memory.
B
So
natural
question
is
what
about
the
actual
buffers
used
to
send
and
receive
data,
so
the
tcp
buffers
need
to
be
the
bandwidth
delay,
product
and
size
in
order
to
be
able
to
send
at
full
speed
of
the
network,
and
we
empirically
determine
the
bandwidth
delay.
Product
has
two
to
three
kilobytes
and
we
can
see
in
the
graph
here
how
we
experimentally
did
that
you
can
see
two
to
three
kilobytes
of
buffer
size.
The
available
could
put
over
tcp
levels
off.
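As a quick illustration of why the buffer is sized to the bandwidth-delay product: the goodput and RTT figures below are assumptions for illustration only; the talk reports just the measured 2-3 KB BDP.

```python
# Sketch: receive/send buffers sized to the bandwidth-delay product (BDP).
link_goodput_bytes_per_s = 12_500   # ~100 kbit/s of usable goodput (assumed)
round_trip_time_s = 0.2             # ~200 ms multi-hop RTT (assumed)

bdp_bytes = link_goodput_bytes_per_s * round_trip_time_s
print(f"BDP ~ {bdp_bytes:.0f} bytes")   # ~2500 bytes, in the 2-3 KB range
```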
B
So
our
conclusion
here
is
that
tcp,
including
the
size
of
the
buffers,
fits
comfortably
in
memory
and,
in
fact,
there's
another
conclusion
to
be
drawn
here,
which
is
that,
if
you
notice
the
the
size
of
the
buffers
is
actually
much
bigger
than
the
connection
state,
which
suggests
that
most
of
the
overhead
of
tcp
doesn't
come
from.
The
complexity
of
the
protocol
is
from
the
buffers
and
any
performant
bulk
transfer
protocol
would
need
these
buffers
in
order
to
transmit
at
the
bdp.
So
in
some
sense
the
overhead
really
isn't
bottlenecked
by
tcp's
complexity
at
all.
B
There's
also
some.
We
also
introduced
a
technique
here
in
order
to
reduce
the
memory
used
for
the
buffers,
and
part
of
this
has
to
rely
on
tcp
having
both
a
receive
buffer
and
a
reassembly
buffer
to
store
in
sequence,
data
and
auto
sequence.
B
Data
for
reassembly
now
full
scale,
tcp
stacks,
like
freebsd
use
packet
queues,
there's
a
separate
queue
of
packets
for
each
of
these,
but
in
the
embedded
setting
we
don't
want
to
use
dynamically
allocated
packets
because,
if
we
hold
on
to
dynamically
allocated
packets
in
a
memory
constraint
setting,
we
may
cause
other
memory
allocations
to
fail.
So
we
instead,
we
want
to
use
flat
arrays
and
the
naive
strategy
would
be
to
have
a
separate
flat
array
for
your
receive
queue
and
for
your
reassembly
queue
now
to
optimize
this.
B
What
we
observe
is
that
there's
an
interesting
relationship
between
the
advertised
windows
size,
the
number
of
bias
we
currently
have
and
the
total
size
of
the
buffer,
which
is
that
the
number
of
received
bytes
plus
the
advertised
windows
size,
is
equal
to
the
total
size
of
a
receive
buffer.
Now.
The
observation
we
make
on
top
of
this
is
that
all
of
the
data
we
may
possibly
get
for
reassembly
has
to
fit
within
the
advertised
window
size,
that's
the
contract
of
tcp
that,
if
you're
sending
to
a
recipient,
you
do
not
go
past
their
advertised
window.
B
B: Okay, next I'm going to talk about the wireless TCP problem, and before we talk about that, I need to tell you about the number of in-flight segments, because that affects TCP's congestion control. So, as I mentioned, the bandwidth-delay product is two to three kilobytes; each segment is sized at about 250 to 500 bytes, and this was chosen carefully. It's actually based on the technique I'll tell you about later on in the talk for coping with the small MTU of these networks.

B: So we'll come back and explain this, but for now take it as a given that our segments are 250 to 500 bytes, and what this works out to is that we have four to twelve in-flight TCP segments at any one point in time. Now, this is different from other, higher-bandwidth networks.
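The in-flight segment count follows directly from those two numbers; a trivial check:

```python
# 2-3 KB of BDP divided by 250-500 byte segments gives roughly 4-12 segments.
bdp_bytes = (2000, 3000)
segment_bytes = (500, 250)
print(bdp_bytes[0] // segment_bytes[0], "to", bdp_bytes[1] // segment_bytes[1])  # 4 to 12
```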
B: Here our maximum segment size is 462 bytes, and when I say maximum segment size, I'm actually subtracting the space for TCP options, so this is how much data is sent in each TCP packet; and our bandwidth-delay product is filled by just four TCP segments.

B: So what ends up happening is that, yes, our losses are very frequent, but because we only need a congestion window of four segments in order to fill up the BDP and send at line rate, TCP's congestion control is actually able to recover from losses extremely quickly, and we spend most of our time actually sending at a full window.

B: We find that, because our bandwidth in these networks is so small, our bandwidth-delay product is small and, as a result, we can recover to a full BDP quickly after a loss. This means that the wireless TCP problem actually does not affect TCP's performance significantly in these networks, and TCP is much more resilient to wireless losses in a LoWPAN than it is in a higher-bandwidth wireless network.
B
So
now
I've
talked
about
the
expect
why
the
expected
reasons
don't
apply
in
the
next
part
of
the
talk,
I'm
going
to
tell
you
about
the
actual
reasons
for
poor
performance
and
going
back
to
our
slide
with
our
techniques
on
it
I'll
be
telling
you
about
these
three
techniques.
Now
there
are
a
couple
I
didn't
get
to
the
zero
copy
send
buffer
the
link,
clear
queue,
management,
and
that's
because
I
don't
have
the
time
in
this
talk
to
talk
about
it.
B
But
if
you
want
to
chat
about
it
afterwards
I'll
be
around
or
you
can
look
in
the
paper
to
find
some
details
about
those
so
first
dealing
with
the
mtu
problem,
here's
a
graphic
showing
the
size
of
the
mtu
in
ethernet
wi-fi
and
I
triple
eeee
into
the
15.4,
which
is
an
example
of
a
low
pan
link
layer,
and
what
we
can
see
is
that
tcp
ip
headers
are
very
small
compared
to
the
ethernet
and
wi-fi
mtus,
but
they're
significant
compared
to
the
ieee
inner
tutor,
15.4
mtu,
and
this
is
going
to
result
in
large
header,
overhead.
B
B
So,
in
order
to
overcome
this,
we
break
this
conventional
wisdom
and
instead
allow
tcp
lp
to
have
tcp
segments
that
span
multiple
link
layer
frames.
Okay,
what
that
means
is
that
we're
relying
on
the
six
lowpan
adaptation
layer
to
handle
fragmentation
and
reassembly
for
us,
which
adds
some
overhead,
but
it
means
that
the
overhead
of
our
headers
is
now
amortized
over
multiple
frames,
allowing
us
to
get
some
good
good
put
now.
There
is
a
trade-off
here
if
we
use
too
much
fragmentation.
B
If
we
set
our
our
mtu,
I
mean
if
we
set
our
tcp
segments
to
be
way
too
large.
What's
going
to
end
up
happening,
is
that
we
rely
on
too
much
fragmentation
and
that's
bad,
because
now,
if
one
fragment
gets
lost,
we
lose
the
entire
packet.
So
what
we
want
to
do
is
we
want
to
choose
our
tcp
segments
to
be
as
large
as
possible
to
effectively
amortize
the
overhead
without
incurring
more
fragmentation
beyond
that.
B
Okay-
and
this
graph
was
an
experiment
where
we,
where
we
measured
the
maximum
segment
size
and
the
good
put
that
results,
and
we
found
that
the
gains
essentially
level
off
around
three
to
five
frames.
So
that's
what
we
use
for
our
future
experiments
and
it
shows
that
you
know
there's
a
good
trade-off
to
be
made
here
where
we
can
get
good
good
put
despite
the
despite
the
header
sizes.
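A rough sketch of the amortization argument (the byte counts below are assumptions for illustration, not the paper's exact figures): with one TCP segment per 802.15.4 frame the TCP/IP header eats a large share of the roughly 100-byte frame, while spanning a segment over several frames pays that header once plus a small 6LoWPAN fragmentation header per frame.

```python
# Illustrative header-amortization arithmetic; all byte counts are assumptions.
FRAME_PAYLOAD = 100   # usable bytes per 802.15.4 frame (approximate)
TCPIP_HEADER = 40     # per-segment TCP/IP header cost after compression (assumed)
FRAG_HEADER = 5       # 6LoWPAN fragmentation header per frame (assumed)

def data_fraction(frames_per_segment):
    total = frames_per_segment * FRAME_PAYLOAD
    overhead = TCPIP_HEADER + FRAG_HEADER * frames_per_segment
    return (total - overhead) / total

for n in (1, 3, 5, 10):
    print(f"{n} frame(s) per segment: {data_fraction(n):.0%} of bytes carry data")
```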
B: Okay, now I'll talk about how the link-layer scheduling to support a low duty cycle interacts poorly with TCP. So recall that these devices often don't have enough energy to keep their radios on and listening all the time. We define the duty cycle as the proportion of time that the radio is listening or transmitting, basically the percentage of time where the radio is not in a low-power sleep state, and, in order to get good energy consumption, we want the duty cycle to be as close to zero as possible.

B: Now, there are several ways to support this in the literature. OpenThread uses a particular duty-cycling mechanism called a receiver-initiated duty-cycling protocol, which I'll now explain. In OpenThread you have two kinds of nodes: battery-powered nodes, where we want to minimize the duty cycle, and wall-powered nodes, which are plugged into a wall outlet and have enough power to keep their radios always on. Now, sending a frame from B to W is easy, because W's radio is always on.

B: So we can just send the frame whenever we like. More challenging is the reverse: getting a frame from W to B. What has to happen is that W has to wait until B's radio is listening. And how does it know when B's radio is listening? Well, this is where the protocol comes in. What B does is that, whenever it turns on its radio to listen for a frame, it sends a data request packet to W, informing it that it's now listening.

B: So W has to wait until it gets a data request packet, and once it does, it can go ahead and send the frame to B, and B will listen and receive the frame. Okay, so what's the key point here? The key point I want to emphasize is that B's idle duty cycle is directly related to how frequently it sends data request frames.
B: B can choose to send data request frames very rarely, which allows it to get very good energy consumption, but doing so will cause more of a delay in getting frames to it, since W has to wait for the request frame in order to send it one of the data frames.

B: Okay, so now let me talk about what this means for TCP operation, and I'll do this by comparing HTTP over TCP to CoAP. CoAP is a REST-based protocol running on top of UDP, and in our setup we had B send W a data request frame every one second; basically, it listens for packets every one second, and that allows it to get a really low duty cycle.

B: The key difference between HTTP and CoAP here is that HTTP requires two round trips, whereas CoAP only requires one round trip. So, for the first round trip, you start at a random phase within the thousand-millisecond sleep interval, so you'd expect, on average, a 500-millisecond delay, and CoAP is consistent with that. For HTTP, what happens is that for the first round trip we see 500 milliseconds, but the second round trip starts right at the beginning of the next sleep interval.

B: So the second round trip consistently sees the worst-case latency when transmitting the packet from B, and, as a result, HTTP performs more than twice as poorly as CoAP on this workload. Now, I want to point out that there have been some recent extensions to TCP, for example TCP Fast Open, which you can use to eliminate the second round trip and get performance parity between CoAP and HTTP. But this problem also happens for bulk transfers, where the ACK-clocked nature of TCP causes it to consistently experience the worst-case latency, even for bulk transfers.

B: So this is an important problem to solve regardless of that, and our approach to solving it is to use an adaptive duty cycle. The idea is that we can use the TCP and HTTP protocol state to vary how often we send data request frames, the idea being that when we expect a packet, we want to send data request frames more frequently.

B: So, for example, if I'm an HTTP server on one of these battery-powered devices and I've just accepted a TCP connection, I can be pretty sure that I'm soon going to receive an HTTP request on that connection, so I may choose to send data request frames more frequently at that point in time. Doing this nearly entirely eliminates the gap between CoAP and HTTP.
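A minimal sketch of the adaptive duty-cycle idea (my own illustration in Python; the interval values and state names are assumptions, not OpenThread's or TCPlp's actual API):

```python
SLOW_POLL_S = 1.0    # normal data-request interval, for a low idle duty cycle
FAST_POLL_S = 0.05   # aggressive interval while a packet is expected (assumed)

def poll_interval(expecting_packet):
    """Pick how often the battery-powered node sends data-request frames,
    driven by transport/application state (e.g. connection just accepted,
    request sent and response still pending)."""
    return FAST_POLL_S if expecting_packet else SLOW_POLL_S

# Toy timeline: poll fast only while a response is outstanding.
expecting = False
for step in range(5):
    if step == 1:
        expecting = True    # TCP connection accepted: an HTTP request is imminent
    if step == 3:
        expecting = False   # response delivered: fall back to slow polling
    print(f"t={step}s: next data request in {poll_interval(expecting)}s")
```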
B: So let me step back and go over hidden terminals, to provide some background on that for those who aren't familiar with it. We can understand the wireless range of a node as looking something like this. The unit disc model is a simplification, where we consider this to be some sort of perfect circle; in practice, of course, it can be more complex depending on the exact environment your deployment is in, but this model is going to be enough for us to capture the phenomenon of interest here, so we'll go with that.

B: So imagine you have four nodes in a line, with their transmission ranges shown here, and we want to transmit data from A to D. Now, the nature of TCP is that we have multiple segments in flight at the same time for a single connection, and that's why we have segment one being sent from C to D and segment two being sent from A to B.

B: But unfortunately this is bad, because the wireless ranges are going to overlap at B, so the two packets are going to interfere there. Okay, now, in the context of Wi-Fi, we typically overcome this using a protocol based on RTS and CTS frames, which allows us to mitigate the hidden terminal problem in most cases. But in the context of LoWPANs, the small MTU means that RTS/CTS typically has too high an overhead; as a result, most deployments don't use RTS and CTS packets.

B: This also happens because of data packets and ACKs going in opposite directions. So, for example, here what we'll ultimately see is that you get the same problem with B and D both sending at the same time to C, because each of their CSMA checks can't hear the other. So, to mitigate this, our approach is to add a new random backoff delay between link-layer retries.
B
So the idea is, if you transmit a frame and it fails, which you know because you don't get a link-layer acknowledgement for it, then you wait a random amount of time and retry the transmission. This is different from CSMA in two respects. The first is that in CSMA you do the randomized delay with exponential backoff if the channel appears busy; in this case, even if the channel appears clear, if our transmission fails we still do the backoff.
B
So it's different in terms of what triggers the backoff. Second, it's a much longer delay, because in CSMA you can rely on hearing a concurrent transmission: you can transmit immediately if the channel appears clear. For this new delay that we're adding, the link retry delay, we want a delay chosen between 0 and 10 times the time to transmit a frame, the idea being that even if there are two concurrent transmissions that can't hear each other, with high probability they won't overlap in time.
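As a rough illustration, the retry logic described here might look like the following. The frame-transmission time, retry limit, and the radio_send, sleep_ms, and rand_range helpers are assumptions made for this sketch, not the actual implementation from the talk.

```c
/* Sketch of the link retry delay described above: after a failed
 * transmission (no link-layer ACK), wait a uniformly random delay of
 * 0..10 frame-transmission times before retrying. */
#include <stdbool.h>
#include <stdint.h>
#include <stdlib.h>

#define FRAME_TX_TIME_MS 5      /* illustrative time to transmit one frame */
#define MAX_RETRIES      4

extern bool radio_send(const uint8_t *frame, size_t len);  /* true if ACKed (hypothetical) */
extern void sleep_ms(uint32_t ms);                         /* hypothetical platform helper */

static uint32_t rand_range(uint32_t max)    /* uniform in [0, max] */
{
    return (uint32_t)(rand() % (max + 1));
}

bool send_with_link_retry_delay(const uint8_t *frame, size_t len)
{
    for (int attempt = 0; attempt <= MAX_RETRIES; attempt++) {
        if (radio_send(frame, len))
            return true;                     /* link-layer ACK received */
        /* Unlike CSMA backoff, this delay is taken even if the channel
         * looked clear: the failed transmission itself triggers it. */
        sleep_ms(rand_range(10 * FRAME_TX_TIME_MS));
    }
    return false;                            /* report failure upward */
}
```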
B
The way this would work is that each of these two nodes would send its data once and those transmissions would collide, but then, when they retry, they'll transmit a second time at hopefully different times, so they won't overlap in time and the transmissions will succeed.
B
So we did a measurement study to understand what kind of link retry delays would be appropriate and what would work. What we observe is that there's a huge reduction in packet loss even from a small delay, and if we increase the delay too much, it starts to eat away at your goodput, because now you're waiting a lot when transmitting your packets.
B
So that's what we used in our study, and this reduced the packet loss from six percent to one percent, which we consider a significant improvement.
B
So, finally, I'm going to summarize our evaluation and conclusions. First, I previewed this result at the beginning: we're able to achieve significantly higher goodput than prior attempts at using TCP, and we're very close to a reasonable upper bound that we computed based on measurements of how fast the radio can send out packets and the overhead lost to headers and ACKs.
B
We also did a measurement study of energy efficiency: we used TCP and CoAP for a sense-and-send task and measured the radio duty cycle over a 24-hour period, which you can see here. The key point is that TCP is not significantly worse than CoAP; in fact, they perform comparably for the duration of the experiment, at about a two percent duty cycle. We consider this a success, because TCP is able to perform essentially on par with a protocol over UDP developed specifically for 6LoWPANs.
B
So now that TCP is a viable option, what does this mean? Well, first, we should reconsider the use of lightweight protocols that emulate part of TCP's functionality: if a specialized protocol performs no better than a general protocol that's more interoperable and used more broadly, you should perhaps prefer the one that's used more broadly and is more interoperable.
B
Second, we think that TCP may influence the design of 6LoWPAN network systems. For a long time it's been the case that many smart home devices that you buy on the market require a specialized gateway to get Internet connectivity, and TCP gives us the opportunity to allow these devices to connect end-to-end to any external services that they may depend on.
B
So, just to say a little more about that middle point, about how TCP may influence the design of 6LoWPAN network systems: when I say gateway architecture, I mean a setup like this, where you have your devices, the smart home devices you bought on the market, and in order to allow them to communicate with an application server in a data center somewhere, you have to install some specific gateway in your home that performs protocol translation and application logic to bring connectivity to those devices.
B
What this means, as some of you may have experienced, is that if you go buy smart devices from a new vendor, all of a sudden you need another gateway for those new devices, or maybe even for newer versions of devices from the same vendor. For example, for a long time it was the case that if you had bulbs from, say, LIFX and bulbs from Philips, you would need separate gateways for both of those devices.
B
So the introduction of IP in this space didn't really change this, in the sense that now the application protocol on the left is implemented over IP, but you still need the application-layer gateway. The missing piece, I think, that would allow an end-to-end connection here is a transport protocol that's supported on both sides, namely TCP. Once you do this, your application-layer gateways become regular border routers, and you could potentially consolidate them into a single border router.
B
So, in conclusion: we implemented TCPlp, a full-scale TCP stack for 6LoWPAN devices. We explained why the expected reasons for poor TCP performance don't apply, we showed how to address the actual reasons for poor TCP performance, and we showed that once those issues are resolved, TCP can perform comparably to 6LoWPAN-specialized protocols. That's all I have prepared; I'm happy to take any questions now.
A
Okay, thank you, Sam, for that excellent talk. I see we have a couple of people in the online queue and a couple of people at the microphone.
D
Hi, I'm Matthias, one of the co-founders of RIOT. Great work, thanks a lot. One remark and two questions. First question: you argued that supporting TCP is important because it's popular; now QUIC is becoming popular. Did you work on any comparison from a systems point of view?
B
So we didn't do a comparison against QUIC, but I'd like to comment on that, because it's a good point that other transports are becoming popular. Many of the issues that we addressed aren't specific to TCP; they apply broadly to TCP and other protocols intended for bulk transfer. For example, the main issues, getting it to work with hidden terminals, getting it to play well with link-layer scheduling, and so on, apply broadly to any protocol that transmits a lot of data and wants a significant amount of bandwidth.
D
And another question: in your paper you note that you also have an implementation for GNRC, the default network stack in RIOT. Do you also plan to submit a PR to the upstream implementation?
B
At some point we did have plans for that, but what happened is that RIOT OS had already adopted a different TCP stack, and it seemed a bit redundant to contribute a second one. What we have done recently is contribute our code to OpenThread, which now uses it as its default TCP stack.
D
First, I highly encourage you to submit the PR. And finally, a remark: you said that a packet is lost when a fragment is lost. This depends a little bit on the fragmentation scheme, right? If you consider, for example, selective fragment recovery, it doesn't matter too much for the whole packet whether a fragment is lost or not.
B
Yeah, so my understanding of the basic 6LoWPAN work, or at least the way it was implemented in the operating systems we looked at, was indeed that if a fragment is lost, you lose the whole packet. But I do agree that there are protocols you can use to recover a lost fragment without losing the entire packet, and those could also help with this problem, allowing you to make the packet bigger and amortize the TCP/IP headers even better.
D
All right, hello, Tommy Pauly from Apple. Thank you for doing this talk, very interesting; I'm super happy to see the use of TCP here. I just had a couple of questions from earlier in the presentation, and you don't have to go back: when you were talking about the memory-saving aspects and the ability to have the flat buffer, you had the diagram there of, essentially, here's what's in flight, and then there are the out-of-order bits, and there are gaps in there as well.
D
When you're doing this, are you able to essentially guarantee 100 percent of the time that you'll never need to allocate memory? Or is it just most of the time, and then there would be a failover case where you do need to have dynamic allocation?
B
Great question: we ensure that you never have to dynamically allocate memory. The way we do it is that you store the data in the buffer and keep a bitmap to track which bytes contain the out-of-order data, and the bitmap can also be sized statically, because it depends only on the array size, which is also static.
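As a rough illustration of that idea, a statically allocated buffer plus a statically sized bitmap over it could look like the sketch below. The buffer size and function names are hypothetical and are not taken from the TCPlp code.

```c
/* Sketch of a statically allocated out-of-order receive buffer with a
 * bitmap recording which bytes are filled.  No dynamic allocation. */
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

#define RECV_BUF_SIZE 1024                      /* bytes of reassembly space */

static uint8_t recv_buf[RECV_BUF_SIZE];         /* segment payload bytes     */
static uint8_t recv_map[RECV_BUF_SIZE / 8];     /* 1 bit per buffered byte   */

static void mark_byte(size_t i)    { recv_map[i / 8] |= (uint8_t)(1u << (i % 8)); }
static bool byte_present(size_t i) { return (recv_map[i / 8] >> (i % 8)) & 1u; }

/* Store an out-of-order segment at its offset from the next expected byte.
 * Data that falls outside the fixed window is simply dropped. */
void buffer_out_of_order(size_t offset, const uint8_t *data, size_t len)
{
    for (size_t i = 0; i < len && offset + i < RECV_BUF_SIZE; i++) {
        recv_buf[offset + i] = data[i];
        mark_byte(offset + i);
    }
}

/* How many in-order bytes are now ready to hand to the application? */
size_t contiguous_ready(void)
{
    size_t n = 0;
    while (n < RECV_BUF_SIZE && byte_present(n))
        n++;
    return n;
}
```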
D
Got it, okay. Is this something that needs tuning on the Internet hosts to make sure that they are friendly to the 6LoWPAN devices, or can you use completely unmodified Internet hosts to talk to them?
B
Yeah, that's an excellent question, and the short answer is that the hosts on the Linux side were completely unmodified. To say a little bit more about that: the timing that we adjusted, like the randomized delay, was not at the TCP level but at the link layer, so the other side doesn't actually see any of it. This is also one of the advantages of using a full-scale TCP stack like the one from FreeBSD: it's been battle-tested in the real world and it's interoperable with all the major TCP stacks that are out there. And I just want to say that interoperability is actually a problem in the embedded space. Many of the other TCP stacks you find have interoperability problems, in pretty subtle ways, with the real TCP stacks that are in use, and that's something we managed to sidestep by using a battle-tested TCP implementation as the basis of our study.
B
Yeah, so the hidden terminal problem affects even a single TCP connection in isolation, and we verified that our randomized backoff fixes the problem in that case.
B
Yeah, so if you have background traffic, this is also why we use randomized delays instead of fixed delays: with a randomized backoff it doesn't matter whether the interference is coming from the same stream or a different stream; in both cases you'll back off a random amount and hopefully transmit again without colliding.
B
This is also why we did it without looking at TCP state: there are several approaches you could use that look at TCP state in some way, but having it just be a randomized delay purely at the link layer gives us some confidence that it would work regardless of the source of traffic, whether it's different TCP streams or even something else.
D
Am I unmuted, finally? So this is following up on the multi-hop case. In these environments the forwarding devices are in fact also very low-power, low-resource devices.
B
So that's a great question. First I just want to clarify that the buffers used at the intermediate routers aren't TCP-layer buffers; they're just the general packet buffers used for forwarding, because with an end-to-end TCP connection there's no TCP state at the intermediate routers.
D
Sure, sure, but it may put a different aggregate load on those buffers than, say, CoAP traffic, or something that's more simple request-response related.
B
Yeah, so of course it's the case that when you're transmitting at higher bandwidth, you're going to place more stress on the buffers of the intermediate routers, and there are a couple of things that we actually did in our study to help mitigate that.
B
The first one is that we added some active queue management functionality to those intermediate routers, where you mark packets when the queue is congested, using explicit congestion notification and so on, in order to prevent TCP from filling up the entire buffer and to keep your queues short. The primary reason we did this was to improve fairness between different TCP flows that are competing for buffer space at these intermediate routers, and also to reduce the latency of traffic.
B
But it also has the side effect of limiting the amount of buffer space being used by a single TCP flow, which addresses some of the concerns that you brought up.
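A minimal sketch of that kind of queue-threshold ECN marking at a forwarding node is shown below. The queue capacity, marking threshold, and the packet structure are simplified assumptions for illustration, not the exact AQM used in the study.

```c
/* Sketch of simple threshold-based ECN marking in a forwarding node's
 * packet queue: once the queue grows past a threshold, ECN-capable
 * packets are marked instead of letting the queue fill completely. */
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

#define QUEUE_CAPACITY  16   /* total packet buffers at the router          */
#define ECN_MARK_THRESH  8   /* start marking once the queue is half full   */

#define ECN_ECT1 0x01        /* ECN-capable transport (1)                   */
#define ECN_ECT0 0x02        /* ECN-capable transport (0)                   */
#define ECN_CE   0x03        /* congestion experienced                      */

struct pkt {
    uint8_t ecn;             /* 2-bit ECN field from the IPv6 traffic class */
    /* ... payload, next-hop information, etc. ... */
};

/* Called when a packet arrives for forwarding.  Returns false if the
 * packet must be dropped because the queue is full. */
bool enqueue_for_forwarding(struct pkt *p, size_t queue_len)
{
    if (queue_len >= QUEUE_CAPACITY)
        return false;                        /* tail drop as a last resort */

    if (queue_len >= ECN_MARK_THRESH &&
        (p->ecn == ECN_ECT0 || p->ecn == ECN_ECT1))
        p->ecn = ECN_CE;                     /* signal congestion instead of dropping */

    return true;                             /* caller appends p to the queue */
}
```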
A
All right. So I have a question. I very much like the idea of the headers spanning multiple link-layer frames. Does this put any constraints on the link layer, or does the 6LoWPAN layer handle all of that?
B
That's a great question. Some of these things can potentially be handled at the 6LoWPAN layer, but others do indeed have to do with the link layer directly. For example, the randomized delay that we added to avoid hidden terminals is something that would operate at the link layer, because at the 6LoWPAN layer you don't have, or at least don't naturally have, the same kind of visibility into when your link-layer ACKs are coming in and so on, and you would need that to determine that a transmission failed and how much to back off before the retransmission. So some of them do indeed affect the link layer.
A
Yeah. Is there a requirement that the link layer delivers packets in order, to avoid damaging the headers, or is 6LoWPAN handling that?
A
Is there a requirement that the link layer delivers packets in order, because of the way you've split the headers across multiple link-layer frames, or is all of the reordering handled by 6LoWPAN?
E
Okay, yeah. Thank you very much for this work, this is great stuff. I did have a comment on the comparison with CoAP. I think the specification for CoAP was not entirely based on "we can't use TCP"; it was more based on "we can't use HTTP", because the justification for it came from folks who wanted to use a RESTful interface at the application layer. Not every application in IoT wishes to do that, but there's certainly a lot of incentive to use REST.
E
So when the RESTful folks started to become interested in IoT, the only alternative was HTTP/1.1, which, I completely agree, is terrible: it's massive, it's a text-based protocol, you cannot compress it, it's very verbose, et cetera. It's terrible.
E
We subsequently had HTTP/2, which became a binary protocol, and we actually had a paper three years ago in ANRW about how to use that over something like 6LoWPAN, for example, just initially scratching the surface. But now we have HTTP/3 and QUIC, and it's all binary. So I understand you haven't had a chance, after the excellent work you've done, to look at those layers, but I would highly encourage you to do that, because that would address a significant portion of the application-layer incentives for IoT as well.
B
Yeah, thanks for clarifying that. I do acknowledge that CoAP has evolved quite a bit in the past few years; some of those evolutions happened after we published this work. But I do want to clarify my position on CoAP a little bit, based on what you said: indeed, I think that CoAP is useful, it has its uses, and it's very flexible. It's been evolving a lot over the years, and that's great. I have noticed that CoAP has been evolving, in some sense, more and more towards the same kind of abstraction that TCP provides, for example with some of the recent work on streaming block transfers and so on.
D
Thank you so much for the presentation, I really appreciate it. I was just wondering: you talked mainly about the applications of this in LANs. Do you see any application for longer-range networks, like mobile ad hoc networks or anything of that sort?
B
That's a great question. All of our experimentation was done using IEEE 802.15.4, which is a personal area network protocol, and that was motivated by the recent interest in adopting some of that technology for the smart home and IoT space. Some of these lessons might carry over to the mobile and ad hoc network space, like LPWANs and so on.
B
I'm not sure I'll be able to tell you any specifics, given that I don't have much experience with those networks, but my first gut feeling would be that there's probably a way to make TCP work well, given that it's been adapted to work on so many different kinds of networks in all kinds of different environments. Other than that, I'm not sure which of the specific techniques would directly carry over.
A
All right, thank you very much.
A
And thank you again to both of the speakers; those were two really great talks. Both Sam and Tasha will be around all week, and I'm sure they'll be very happy to talk with people more about their work, so please do find them, have a chat about their work, and make them welcome to the IETF and to the IRTF.
A
Congratulations to both Sam and Tasha for the award of the ANRP this time. As I said earlier, look out for more ANRP award talks at IETF 115 in London in November. The nominations for the 2023 ANRP awards will be opening in September, so if you know of any good work, please think about nominating it. And look out for the Applied Networking Research Workshop, which is taking place co-located with the IETF in Philadelphia tomorrow.