From YouTube: IETF 116 IRTF Open
Description
The Internet Research Task Force (IRTF) Open session, including Applied Networking Research Prize (ANRP) presentations, will be held during IETF 116 at 0400 UTC on 27 March 2023.
E: Welcome, everybody: this is the IRTF Open meeting at IETF 116 in Yokohama.

E: Ah, very quiet. Is that better? All right. Okay. So, as I said, my name is Colin Perkins; I'm the IRTF chair. This is the IRTF Open meeting.
E: I want to start with the usual reminders of the IRTF policies, the intellectual property policy, and so on. First of all, a reminder that in the IRTF we follow the same IPR disclosure rules as the IETF does, and that by participating in this meeting you agree to follow those procedures. In particular, if you're aware of any intellectual property rights on your talk or your contribution at the microphone, then you need to disclose that fact; the precise rules for this are listed on the slide.
E: In addition, a reminder that we make audio and video recordings of these sessions available. This session in particular is being streamed live on YouTube, the recording will be on YouTube afterwards, and there's also a photographer here. If you're wearing one of the red "do not photograph" lanyards, then you will avoid the photographs; but if you speak at the microphones or give a presentation, you will be recorded and the recordings will go online.
E: We also have a code of conduct. Please do pay attention to the rules about the code of conduct and the anti-harassment procedures. Please behave professionally and appropriately, and ensure that everybody is welcome in this meeting, in the IRTF, and in the IETF in general.
E: Also, if you're asking questions, please use the Meetecho tool: either the on-site tool if you're in the room, or the full tool if you're remote. We're running a unified queue for questions, so please use the tool to put yourself into the queue rather than just going straight up to the microphone.
E: Also, a reminder that, as a COVID safety measure, in-person participants in this meeting and in the other IETF-controlled rooms are required to wear an FFP2 mask or equivalent. The only exceptions are people actively presenting, and the chair when they're actively speaking; participants asking questions from the floor are expected to remain masked.
E: All right. So, as I said, this is the IRTF Open meeting. The IRTF itself is a parallel organization to the IETF, which focuses on some of the longer-term research issues that affect the Internet.

E: It's very much a research organization: we're not here to conduct standards development, we're not here to produce standards. While the IRTF can publish Informational or Experimental RFCs, the primary outputs of the research groups are expected to be understanding and research papers, rather than protocol specifications and RFCs.
E: The IRTF is organized as a number of research groups; those shown in dark blue on the slide are meeting later this week. The Computing in the Network Research Group (COINRG) met this morning, and we have two groups which are not meeting: the Network Coding Research Group is essentially finished with its work and is expected to close relatively shortly, and the Thing-to-Thing Research Group is having a meeting online in a few weeks. All the other groups are meeting later this week.
E: The goal here is to bring together the protocol standards community in the IETF with the academic research community that is studying formal methods for protocol specification. The idea is to exchange experience and ideas, and to try to understand whether and how formal methods can be employed to improve the way we specify protocols and to improve the correctness of the protocols being specified in the IETF, and to pass back experience on what's useful for professional protocol designers to the academic community developing such tooling. That group will be meeting on Wednesday, I believe.
E: The other new proposed research group is the Research and Analysis of Standard-Setting Processes Proposed Research Group, which will be meeting on Thursday. The chairs for this are Ignacio Castro and Niels ten Oever, and this group is focused on understanding the standard-setting process itself.
E
It's
focused
on
understanding
the
community,
its
diversity,
the
impact
that
the
set
of
participants
in
that
Community
have
on
the
process
by
which
we
develop
standards.
The
impact
of
how
changes
in
that
Community
affects
the
standards
which
are
being
developed
is
focusing
on
understanding
the
decision-making
process
in
Internet
standards
and
understanding
the
interactions
between
the
different
parts
of
the
ITF,
the
participants
in
the
ATF
and
the
ITF
and
other
standard
setting
communities.
E
As
I
say,
we'll
hear
more
about
these
in
in
a
few
minutes,
and
please
do
consider
going
along
to
the
to
their
first
meetings
later
this
week.
E: The first of those was the CCNinfo draft, which came out of the Information-Centric Networking Research Group and was published as an RFC in February this year. And, I think just last week, the Quantum Internet Research Group published its first RFC, on architectural principles for a quantum Internet. The Quantum Internet Research Group is meeting in the slot immediately following this one, so if you're interested in that topic, please do go along to that meeting.
E: In addition to the research groups and the research we do in those groups, we also organize the Applied Networking Research Prize.

E: This is organized in cooperation with the Internet Society, with sponsorship from Comcast and NBCUniversal. It's here to recognize some of the best research results in applied networking: new research that has potential relevance to the Internet standards community, and perhaps to recognize up-and-coming people who are likely to have an impact on Internet standards and technologies going forward.
E
I'm
very
pleased
to
announce
that
we
have
two
a
RP
award
presentations
today.
The
awards
go
to
the
awards
for
this
meeting
go
to
Boris,
because
many
for
his
work
on
novel
offloading
architectures
for
for
NYX
and
so
our
facility
Jacobs
for
his
work
on
evaluating
machine
learning
for
network
security.
E: In addition, the final activity we organize in the IRTF is the Applied Networking Research Workshop. The workshop is a forum for the research community, the vendors, and the operators in the IETF standards community to present and discuss emerging results in applied networking research.
E
It
co-locates
with
the
July
ITF
meeting,
which
will
be
in
San
Francisco
this
year
and
I'm
pleased
to
announce
that
the
the
chairs
of
this
meeting
of
this
Workshop
will
be
Francis
Yan
from
Microsoft
and
Maria
apostolaki
from
Princeton,
both
of
whom
previous
anrp
winners
so
I'm
very
pleased
to
have
them
on
board
and
running
this
Workshop
paper.
Submissions
for
the
workshop
will
be
due
on
the
12th
of
May
and
you
should
look
out
for
the
detailed
call
for
papers
any
day
now.
E
Foreign
ly
I'd
like
to
highlight
that
we
do
offer
travel
grants
to
attend
the
irtf
meetings.
We
offer
diversity
travel
grants
to
support
early
career
academics
and
PhD
students
from
underrepresented
groups,
and
we
also
offer
travel
grants
to
attend
the
applied
networking
research
Workshop,
the
iitf.org
travel
Grant
site
will
include
information
about
those.
You
can
expect
the
call
for
travel
grants
for
the
the
July
meeting
to
go
live
later
this
month.
So
please
do
do
look
out
for
that
in
the
next
couple
of
weeks.
E: And that's essentially all I have to say. Our agenda for the rest of today: we are starting with a couple of short talks.

E: Introducing the two new proposed research groups, Jonathan will talk about the Usable Formal Methods group next, and that'll be followed by Ignacio, who'll be talking about the Research and Analysis of Standard-Setting Processes group. Following that, the majority of the meeting will be devoted to the award talks, starting with Boris talking about autonomous NIC offloads, and then Arthur will be talking about AI and machine learning for network security.
C: Yeah, okay. Good afternoon, everybody. I'd like to introduce the Usable Formal Methods Research Group, and before I do that, you probably want to know what I mean by "formal methods". The standard definition would be: the use of mathematical techniques and formalisms to assist in the specification, design, analysis, and implementation of, in this case, protocols.
C: Yes, there are lots of things missing, but basically there was work in the '50s and '60s that mostly dealt with safety and whether mechanical processes would fail. Over time that was applied more and more to digital processes, and nowadays we use it to analyze protocols. Certainly initially the proof techniques were really quite limited.
C: We had to make these very strange assumptions, or we couldn't even analyze some things; they were beyond our tools, and they required huge amounts of manual work: hundreds and hundreds of pages of hand-written algebra. Over time we eventually invented tools and techniques that helped with this process and allowed us to do more interesting things, and eventually the tools were sufficiently mature and usable that the formal methods community became deeply involved in the development and specification of what became TLS 1.3. And just to give you the way I think about this:
C
There
are
lots
of
things
that
don't
really
fit
into
this
categorization,
but
roughly
you
would
say
formal
analysis
is
the
set
of
techniques
which
we
use
to
say.
Is
this
specification
design
correct
so
ignoring
anything
that
we
care
about
the
implementation?
Is
the
design,
if
implemented
perfectly
correct?
Does
it
do
what
we
think
it
does
and
then
there's
the
second
half,
which
is
formal
verification
which
says?
Does
this
piece
of
code
do
what
we
think
it
does?
C
Does
this
piece
of
code
say
compute,
the
right
value
and
the
form
methods
research
group
will
try
and
look
at
both
of
these.
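The "formal analysis" half of that distinction can be made concrete with a toy example. The sketch below is an illustrative explicit-state model check (not one of the tools actually used for TLS 1.3): it exhaustively explores every reachable state of a tiny two-message handshake and checks a safety property of the design, independent of any implementation. All names and the protocol itself are invented for the illustration.

```python
from collections import deque

# Toy "formal analysis" sketch: explicit-state exploration of a tiny
# two-message handshake.  State = (client_state, server_state, wire),
# where `wire` is the set of messages currently in flight.

def transitions(state):
    client, server, wire = state
    out = []
    if client == "IDLE":                        # client sends HELLO
        out.append(("WAIT", server, wire | {"HELLO"}))
    if server == "LISTEN" and "HELLO" in wire:  # server consumes HELLO, replies
        out.append((client, "DONE", (wire - {"HELLO"}) | {"ACCEPT"}))
    if client == "WAIT" and "ACCEPT" in wire:   # client completes
        out.append(("DONE", server, wire - {"ACCEPT"}))
    return out

def check_deadlock_free(initial):
    """Breadth-first search of the whole state space: fail if any
    reachable state has no outgoing transition yet is not final."""
    seen, queue = {initial}, deque([initial])
    while queue:
        s = queue.popleft()
        nxt = transitions(s)
        client, server, _ = s
        if not nxt and not (client == "DONE" and server == "DONE"):
            return False, s          # stuck in a non-final state: deadlock
        for t in nxt:
            if t not in seen:
                seen.add(t)
                queue.append(t)
    return True, None

ok, counterexample = check_deadlock_free(("IDLE", "LISTEN", frozenset()))
print(ok)   # True: every reachable execution of the design completes
```

Real protocol analyses use dedicated tools and model adversaries and lossy networks, but the shape of the question is the same: properties of the design, with the implementation abstracted away.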
C: To go back briefly to the TLS design process: what happened with TLS 1.3 was that, from the beginning, academics were doing formal analyses of the protocol design. They were asking: does this design achieve the effects we want it to? And actually there were three or four major flaws found in the protocol design at various stages of its development. Eventually they were removed, and there were proofs written that say: this version of the protocol is secure; this version of the protocol meets its mathematical security goals.
C: The problem is: formal methods are not very user-friendly.
C: The proofs can be immensely long. We used an automated tool to do a proof of TLS 1.3, and the proof runs to 750,000 lines. You could, in theory, go through line by line and check each one of those, but most people don't want to, so you basically have to use a tool that checks each line and eventually gets to the end.
C: The alternative procedure that people use, rather than trying to use one of these tools, is to write proofs by hand: literally long-form proofs. Quite often I will be asked, "Can you review this paper? Here are 26 pages of algebra; you have two days to review it." Unsurprisingly, it's very, very difficult to actually check whether the proof is actually a proof. So there are a couple of issues, but mostly the proofs are hard to understand, verify, and adjust.
C: Academics have mostly been doing this work, and the problem with this work is that it's very high risk and very low reward, because quite often what happens is: you analyze a protocol and you say, "Great, it's secure, no problems at all." And then you go to get it published, and everyone's like, "Who thought it wasn't secure?"

C: You've just written a very boring paper. So unless you're thinking, "I'm pretty sure this is broken," you're not going to invest the potentially years it would take to check. And so, enter the proposed Usable Formal Methods Research Group.
C: How can we solve these problems? How can we take this technique, which we've finally gotten to a point where we can actually use, and make it so that anyone can use it? I think the initial steps that will be really good for the UFMRG are: it's going to provide a place for experts to gather, and it will be a pool of knowledge that other working groups can come to and say, "We can't make this work; we don't know how to do this.
C: Can you advise?" We can then start building up training materials, because at the moment it's very easy to analyze a very tiny protocol (there are examples), but the moment you want to go slightly more complicated, you're on your own. We can also provide feedback to tool designers.

C: A lot of the tools are not actually used by people who want to analyze IETF protocols; they're used in academia. So we can go to the tool designers and say, "We need this feature." Hopefully, because the IRTF does do some publishing, we'll have a place to publish negative results alongside papers that say "here is a proof of security", and we'll have a place to store all the proofs and the checking tools you need to check the proofs.
C: So, a couple of non-goals. A very important non-goal is to not try to change the IETF process. Maybe the IETF processes do need to change, but that's not what we're interested in; that's not our job. It's an IRTF group. And the way we can check whether we have succeeded in making formal methods usable is if people use them: if we say "you have to use them", we don't know whether we've succeeded. And, very much, we don't want to be an obstacle.
C: We want to provide useful tools. We're meeting on Wednesday at 09:30; please come and join us. Thank you. Any questions?
F: Oh, I guess I won the online lottery there; I've got the really low mic. All right. So maybe this is a question for when you have the session later in the week, but my very, very limited understanding of formal methods is that one of the challenges with them is that any time you're doing a proof, you're always dependent on a certain set of lemmas or a certain set of assumptions. So the big question is whether or not those are valid.
C: Absolutely. There are some very standard assumptions that we make, which we know aren't true.

C: For example, we assume that, say, asymmetric crypto is just a magic, perfect black box and that it always works. And yes, we will definitely need people who do cryptanalysis to continue doing their work. But the goal of these proofs is to say: if we assume that we have a valid, secure asymmetric cryptography algorithm, then the protocol built on it is secure.
B: Hello, Diego Lopez. A question: coming here, I was understanding that we're talking about formal methods in general, for protocols and for verifying or proving whatever properties. But it seems that you are focusing very much on security properties.
C
So
that
security
properties
is
my
background,
but
it's
certainly
not
the
only
scope
of
the
RG.
C
The
proposed
dodgy
you
know
one
where
one
of
the
first
things
we're
planning
to
do
is
try
and
come
up
with
some
examples
that
people
can
look
at
that
aren't
security
related.
You
know,
deadlock
related,
live
lock
related,
but
yeah,
that's
certainly
in
our
scope.
But,
yes,
my
background
is
Securities.
That's
what
I
think
about.
B: And do you have any idea of the background that you plan to build on? Because I remember, a long time ago, I used to work with communicating process algebras and things like that.

B: Yeah, exactly. Is it that, or are we talking about something a little bit ahead of that?

B: But also, for help based on related approaches, so...
E: Great, all right. So next up is Ignacio, who'll be talking about the RASP proposed RG. A reminder, while Ignacio is getting set up: if you're in this room, you need to wear a mask. If you are not willing to wear a mask, you need to leave this room.
G: Thanks, Colin. Hello, everybody. I'll be telling you about our new proposed research group, RASP, which is about research and analysis of standard-setting processes, and which I'm chairing together with Niels. Next slide, please!
G: So what is RASP about? Well, the name says it all: it's about better understanding the things that we do here, not only at the IRTF but mostly at the IETF. The point is not to judge whether the IETF is the best thing since sliced bread and the ITU is not; the point is just to analyze what we do. Other people might make judgments based on that, but that's not really the point of this research group.

G: The point is data rather than judgment. The types of outputs that we expect are joint reports, research papers, and databases, for example.
G
We
are
now
labeling
email
discussions
for
agreement
disagreement,
so
we
can
understand
better
consensus
formation
processes,
tools
and
open
software,
we're,
for
example,
making
a
little
tool
to
make
recommendations
for
cross
area
review
by
comparing
the
content
of
the
emails
of
the
people
with
a
draft
that
might
require
review
the
way
to
do
it
well,
similar
to
many
of
the
research
groups
collaborations
so
there's
many
different
people
from
many
different
areas
that
are
interested
in
this
sort
of
stuff,
organizing
working
sessions
and
prevent
duplication.
G
There
are
different
people
that
has
been
working
in
different
ways
around
this
and
mostly
produce
evidence-based
reproducible
work.
This
is
a
charter.
Please
subscribe
to
the
mail
list.
If
this
sounds
interesting
come
to
the
session
tomorrow,
we
do
a
lot
of
different
things.
That's
a
slide,
please
just
to
give
a
glance
like
this,
for
example,
is
from
one
of
the
papers
that
we
have
been
working
on,
and
this
is
the
call
for
ship
craft.
You
can
see
how
it's
a
group
of
people
that
write
a
lot
of
drafts
together.
G
You
can
see
also
some
points
in
the
middle.
You
can
guess.
Maybe
why
that's
happening
again?
Data
no
judgment
next
slide.
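The kind of co-authorship graph shown on the slide can be sketched in a few lines: nodes are authors, and edge weights count the drafts a pair has co-written, so clusters of frequent co-authors show up as heavy edges. The draft names and authors below are invented for illustration; the group's real pipeline works over the IETF Datatracker data.

```python
from collections import defaultdict
from itertools import combinations

# Illustrative co-authorship graph: edge weight = number of drafts a
# pair of authors has written together.  Data here is made up.
drafts = {
    "draft-example-a": ["alice", "bob", "carol"],
    "draft-example-b": ["alice", "bob"],
    "draft-example-c": ["dave", "erin"],
}

edges = defaultdict(int)
for authors in drafts.values():
    for a, b in combinations(sorted(authors), 2):
        edges[(a, b)] += 1        # one unit of weight per shared draft

# Heavy edges reveal the tightly collaborating groups on the slide.
heavy = {pair: w for pair, w in edges.items() if w >= 2}
print(heavy)                      # {('alice', 'bob'): 2}
```

From here, standard graph tooling can lay the graph out and compute clusters, which is essentially what the slide visualizes.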
G: That's the interaction graph of the working groups, and you can see how things cluster nicely around the different areas. This is just to give you a glimpse; we do many other things, and this is not an exhaustive list. If you have questions, or you have ideas, or you want us to look into something, please drop by. This is what we do.
G: What we don't do is make hierarchical comparisons between standard-setting organizations, saying which one is better; nor, similar to what Jonathan said, do we define how the operational processes of the IETF should work. People are welcome to use the data that we might produce to make those judgments by themselves, but that's not our goal. Next slide. So please drop by on Thursday: we have, I believe, a fantastic session, going from ethnography to large language models in understanding the IETF. Please come to agree, disagree, propose, or just discuss. The research group is starting.
E: Okay. Well, if you are interested in understanding how the IETF works, please go along to the meeting on Thursday. Thank you.
E: All right. So, with any luck, you now have control of the slides. Okay. I'm very pleased to introduce the first of the Applied Networking Research Prize winners for this meeting.
E: The first speaker is Boris Pismenny. Boris is a PhD student at the Technion computer science department in Israel and is currently visiting EPFL; he's also employed by NVIDIA as a software architect. His research is focused on improving system software performance by enhancing NIC controller hardware. In recent years he's worked on accelerating QUIC with UDP segmentation offload and receive-side coalescing offload, and he's also worked on accelerating encryption for QUIC, TLS, and IPsec.
D: The paper describes a software architecture that enables NICs to accelerate layer-5 protocol computations transparently to the software TCP stack. We're going to focus on layer-5 protocols over TCP; here are some examples. Specifically, we're going to focus on the TLS protocol and its encryption, decryption, and authentication ("digest", as it's called in the paper). In the paper we also present NVMe-TCP with its digest and copy offloads, and the combination of the two. So, a brief overview of TLS.
D: The next approach is to use hardware, such as the on-CPU acceleration available in Intel CPUs, called AES-NI. This is very efficient: it uses fast instructions and cache memory and has a relatively low overhead. Nevertheless, a CPU core doing TLS can spend more than 50 percent of the core on just encryption.
D
So
looking
further,
we
can
consider
an
off
CPU
accelerator,
such
as
a
pcie
Calder.
Does
encryption
like
the
Intel,
quick
assist,
call,
and
the
benefit
is
that
the
CPU
oval
head
is
independent
of
the
data
says
being
encrypted,
but
the
problem
is
that
significant
parallelism
is
required
to
outperform
the
on
CPU
acceleration,
and
this
can
be
problematic
and
sometimes
applications
need
to
be
redesigned
to
make
use
of
that.
D: Looking forward, we can ideally place the encryption on the NIC itself: data needs to traverse the NIC anyway. The problem is that current approaches to do this depend on offloading TCP, IP, routing, and quality of service (essentially the entire network stack) into the hardware. This introduces a lot of complexity and security problems, and has thus far shown itself to be undesirable and impractical.
D: Before kernel TLS, what we had is a baseline application using a TLS library. It calls the library with its data, the library encrypts the data and puts it in a TLS record, and then the record moves to the kernel, where TCP sends it. So we have an encryption pass and a copy pass: the encryption when passing from the application to the TLS library, and the copy when passing from the TLS library to the kernel.
D: Finally, in kernel TLS, the TLS layer can communicate directly with the NIC driver, and this is what we require for autonomous offloads; this direct communication allows us to make some optimizations. So, with autonomous TLS, we eliminate the encryption pass from software and move it into the hardware. When the application sends its data using kernel TLS, we just do a copy; we don't do the encryption, but we still create a record, and this time some parts of the record, for instance the MAC (the authentication), are left for the hardware to compute.
D: So, in a nutshell, this is the solution. Now we'll get into how we implement the offload, starting from the transmit side. In order to offload data that is in sequence, we don't need to do much, because the NIC uses state that is updated incrementally: we simply send the packet with an indication, and the hardware performs its operation, encrypting the data. The NIC uses two contexts to accomplish this operation: a static state, which is relatively constant for the whole connection,
D: The static state holds, for example, the encryption keys. The dynamic state is updated for each and every packet: it holds the state for the next expected TCP sequence number, so it tells the hardware how to encrypt at the next TCP sequence number position, and it holds things like the current offset within the record, the record size, the IV, and the rolling authentication (ICV) state.
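The two contexts can be pictured roughly as follows. This is an illustrative Python model of the state the talk describes, with invented field names and a simplifying assumption of fixed-size records; the real NIC interface and the Linux kTLS offload code are of course different.

```python
from dataclasses import dataclass

# Illustrative model of the per-connection TLS offload contexts:
# a static part set up once, and a dynamic part advanced per packet.

@dataclass(frozen=True)
class StaticState:            # roughly constant for the whole connection
    key: bytes                # encryption key
    salt: bytes               # per-connection IV salt

@dataclass
class DynamicState:           # advanced by every in-sequence packet
    expected_tcp_seq: int     # next TCP sequence number the NIC expects
    record_seq: int           # TLS record sequence number being processed
    record_offset: int        # bytes of the current record already seen

def on_in_sequence_packet(dyn, payload_len, record_len):
    """Advance the dynamic state as an in-sequence packet is encrypted.
    Assumes fixed-size records for simplicity of the sketch."""
    dyn.expected_tcp_seq += payload_len
    dyn.record_offset += payload_len
    while dyn.record_offset >= record_len:      # crossed a record boundary
        dyn.record_offset -= record_len
        dyn.record_seq += 1

dyn = DynamicState(expected_tcp_seq=1000, record_seq=0, record_offset=0)
on_in_sequence_packet(dyn, payload_len=1448, record_len=1000)
print(dyn.expected_tcp_seq, dyn.record_seq, dyn.record_offset)  # 2448 1 448
```

A packet that arrives with a TCP sequence number other than `expected_tcp_seq` is exactly the out-of-sequence case the talk turns to next: the dynamic state no longer matches, and it has to be rebuilt.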
D: The problem is when we want to send something that is out of sequence. For example, assume that we send packets one through eight, and then software decides to retransmit packet five. The hardware doesn't have the correct dynamic state to accomplish that, because it expects packet nine, and it needs to transmit packet five. So what happens is that the driver identifies this problem:
D: it compares the sequence numbers between the dynamic state, of which it keeps a shadow copy, and the packet being transmitted, and it performs a flow we call the recovery flow to transmit packet five. In the recovery flow, what we need to do is pass to the hardware the TLS record prefix of TCP packet 5, which is marked in the dashed lines
D: on the figure. After passing this information to the NIC hardware, we can send packet five, because the dynamic state will have been adjusted for packet five. To accomplish this, the driver communicates with the TLS layer directly, asking it for the TLS record prefix, and the TLS layer holds a mapping which allows it to provide this information.
D: The only problem is that the TLS layer needs to hold this information all the time: we can't release the data as soon as packets are acknowledged, because we might release part of a record. So we hold an extra reference for the packets that combine into a record, and we release those only when the entire TLS record is acknowledged.
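A minimal sketch of this recovery bookkeeping might look as follows: the TLS layer remembers where each record starts in TCP sequence space and keeps record bytes alive until the whole record is acknowledged, so that on a retransmit the driver can fetch the record prefix that precedes the retransmitted bytes. This is a pure-Python illustration with invented names, not the actual kernel code.

```python
import bisect

# Illustrative retransmit-recovery bookkeeping for the TX offload:
# map TCP sequence numbers to TLS record boundaries, and release a
# record only once all of its bytes are acknowledged.

class RecordMap:
    def __init__(self):
        self.starts = []      # record start sequence numbers, sorted
        self.records = {}     # start seq -> full record bytes

    def add_record(self, start_seq, record_bytes):
        bisect.insort(self.starts, start_seq)
        self.records[start_seq] = record_bytes

    def prefix_for_retransmit(self, seq):
        """Record bytes preceding `seq` (assumed to fall inside a tracked
        record), which the driver feeds to the NIC so it can rebuild the
        dynamic state before re-encrypting the retransmitted packet."""
        i = bisect.bisect_right(self.starts, seq) - 1
        start = self.starts[i]
        return self.records[start][: seq - start]

    def release_acked(self, acked_seq):
        # Drop only records whose *end* is acknowledged, never a part.
        while (self.starts and
               self.starts[0] + len(self.records[self.starts[0]]) <= acked_seq):
            del self.records[self.starts.pop(0)]

m = RecordMap()
m.add_record(1000, b"R" * 500)              # record covers seq 1000..1499
print(len(m.prefix_for_retransmit(1200)))   # 200 bytes of record prefix
m.release_acked(1400)                       # record not fully acked: kept
print(1000 in m.records)                    # True
```

The `release_acked` rule is the point of the last paragraph: an ACK in the middle of a record must not free it, because a later retransmit may still need its prefix.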
D: So that's it for the transmit path; moving on to the receive path. Again, the in-sequence flow is very straightforward: hardware decrypts incrementally as packets go through, and indicates for each packet whether it was decrypted and authenticated successfully.
D: Now, problems begin when we have out-of-order transmissions. For instance, in this example we receive packets one, two, three, four, then packet two again (so packet two is a retransmission), and then packet five. What happens is that the hardware skips the decryption of packet two, identifying that it was received before, because it expects packet five. The benefit here is that hardware continues its acceleration for packet five and onwards; it doesn't stop because of packet two.
D: Another problem that may occur is TLS record data reordering. Here, what we have is that packet 2 is not received.
D: Now, the real problem happens when, on the receive path, the TLS record headers themselves are reordered. When the TLS record header arrives out of order, the hardware can't use this trick, because it relies on the length field of the TLS record header to tell it the position of the next record. After hardware sees packet four, it knows it didn't see packet three, and the missing packet could have contained any number of TLS record headers, so it no longer knows where records begin. As a result, it stops offloading until it is resynchronized.
D: Okay. To solve this problem, we could just let software notify hardware every time it sees a TLS record header. But packets may keep arriving all the time, so we would always have a race condition between software and hardware, where software is chasing hardware while hardware keeps receiving new packets. So we need a solution that is not pure software for streaming TCP workloads, and the solution we devised is a software-hardware collaboration.
D: So suppose we have TLS header reordering. What the hardware does is speculatively search for what we call a header magic pattern; for TLS it's the hex bytes 17 03 03, which represent the record type and version. Once hardware identifies this pattern in the TCP byte stream, it asks software: is this a TLS record header, or is it something else? Meanwhile, packets continue to arrive, and the hardware tracks where it expects to see subsequent TLS record headers based on the length field.
D: So here we first found a header candidate in packet five, and then in packet seven we check and verify again that a header is where we predicted. If it's not there, we go back to step one, because the speculation was wrong. But if hardware was correct in its speculation, eventually software will be able to confirm that, indeed, packet 5 contained a TLS record header in the position the NIC asked about, and this allows hardware to synchronize and resume its acceleration from whichever point it is currently tracking.
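The speculation step can be sketched as a scan of the reassembled TCP byte stream: find the TLS 1.2 record-header magic pattern, then hop by the length field to predict where later headers should fall, rescanning whenever a prediction fails. This is a pure-Python illustration of the idea; in the real system this logic is split between NIC hardware and the kernel TLS layer, which also confirms each candidate.

```python
import struct

MAGIC = b"\x17\x03\x03"   # TLS 1.2 application-data type + version bytes

def find_candidate_headers(stream):
    """Speculatively locate TLS record headers in a byte stream:
    find the magic pattern, then hop header-plus-length to predict
    the next header; if the prediction fails, rescan (sketch only)."""
    candidates = []
    i = stream.find(MAGIC)
    while 0 <= i <= len(stream) - 5:
        if stream[i:i + 3] != MAGIC:           # prediction failed: rescan
            i = stream.find(MAGIC, i + 1)
            continue
        (length,) = struct.unpack(">H", stream[i + 3:i + 5])
        candidates.append(i)
        i += 5 + length                        # header + payload -> next header
    return candidates

# Two back-to-back records with 4-byte and 2-byte payloads:
stream = b"\x17\x03\x03\x00\x04AAAA" + b"\x17\x03\x03\x00\x02BB"
print(find_candidate_headers(stream))          # [0, 9]
```

The pattern can of course also occur inside encrypted payload by chance, which is exactly why the hardware only treats these as candidates and asks software for confirmation before resuming the offload.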
D: Taking a step back, we asked ourselves what protocol computations are autonomously offloadable, and tried to define the properties that make them such. We find that most computations and protocols are offloadable, but not all. Looking at the properties that make computations autonomously offloadable: on the transmit side, the computation must be size-preserving, and this precludes acceleration of compression and encapsulation. To explain the intuition behind this, we have here an example with encapsulation. So suppose we do:
D: Okay. So host A sends the first packet, and the NIC accelerates it by encapsulating it. By doing so it adds some additional payload bytes, and as a result an additional packet is required because the MTU is exceeded: two packets are sent on the wire instead of one. Host B receives those two and sends ACKs, but the second ACK is lost. As a result, host A thinks that all the data it sent has been successfully received, because it got an ACK for the 150 bytes it sent.
D: But in fact some data was lost, and the NIC would need to be responsible for its retransmission. Retransmission and TCP logic are exactly what we wanted to avoid placing in the NIC, and this is why we consider this undesirable. The next required property is that the computation is computable on TCP packets of any size.
D: So it can't require any bytes from future packets, and this precludes some block ciphers such as AES-CBC, which operates on blocks of 16 bytes: some packets may not contain all 16 of those bytes, and we need to pass them on. We don't want to start stalling packets in hardware to perform this operation.
D: Next, to do the recovery, we require that the state needed to compute the operation is of constant size and is message-independent, up to maybe some metadata such as message sequence numbers; it cannot depend on earlier payload or on future bytes. We find that most protocols adhere to this.
D: Here we compare the offload, with zero copy, against HTTPS just using OpenSSL. What we can see in the figures, starting from the left figure with the throughput, is that the yellow line and the blue line follow the same shape. The yellow line is plain HTTP, the best we can hope for when doing HTTPS with offloading, and we see that the gap is very small. The numbers below show the comparison between plain HTTPS and the offload with zero copy.
D: That's it. In conclusion, autonomous NIC offloads are a framework for accelerating layer-5 protocol computations efficiently while cooperating with the software TCP stack. The approach is applicable to most protocols and computations, and the evaluation shows that we can improve throughput by up to 3.3x, CPU utilization by up to 60 percent, and latency by up to 30 percent.
E: All right, so I have a question. Obviously this approach assumes certain properties of the protocols: that they're size-preserving, incrementally computable, have constant state, and so on.

E: These fit some protocols, but not necessarily all of them. To what extent are these fundamental limitations of the approach, versus limitations of the current work that might potentially be resolved in a future iteration of the ideas?
D: Almost all of them are fundamental. The only one that can be somewhat softened is the requirement excluding CBC: it's possible to think of a solution that stalls some bytes to enable offloading the partial block, the one that doesn't contain the full AES block, but this would be more complicated, and AES-CBC is deprecated, so there is not much interest.
A: Hi, Dave Oran, MIT. Do the trade-offs change at all if you have a user-mode TCP stack, as you would with DPDK or something like that?

D: Not...
E: Okay, so one other thing. Obviously this is an IETF meeting that you're presenting at: is there any guidance, or are there any issues to consider, that the IETF should be paying attention to when designing future protocols, to make this type of approach, or similar approaches, work better?
D: Yes, actually, that's a great question. So this approach also works for TLS 1.3, which was finalized more or less when we were finalizing the hardware, and one of the things that happened in TLS 1.3 is that the trailer started to carry the real content type, and when using an offload such as this, that created some problems. In general it worked well, because the format of the record remained the same, so keeping the formatting of records helped. But to do data placement we need to assume it's application data, and if it's actually a handshake, then that placement is bogus, and this creates quite a lot of complexity that we didn't anticipate and that doesn't exist with TLS 1.2. So this is somewhat unfortunate. Similarly, the padding also makes things more complicated than they could have been.
E: Okay, thank you. So it sounds like there are maybe some lessons that can be learned from the way the protocol design changed, to simplify offloading.
F: A corollary question is: if somebody offers you one parameter of the system to change that might make things better, how would that affect your performance? Say somebody offers you double the bandwidth, or twice as many cores, or something.
D
And
that's
a
great
question
so
so
these
are
the
holidays
of
this
technology
in
particular.
So
it's
how
to
predict
what's
going
to
happen
in
the
future
right
now,
we'll
see
Soviet
option,
but
it's
not
like
everybody
needs
to
do
800,
gigabit
TLS.
So
it's
not
obvious
that
it's
applicable
to
to
all
use
cases
yeah.
There
are
impressive
numbers.
D: The numbers are great, but not everybody needs those numbers. So there is a trade-off, though, and making the most out of it requires quite a lot of work in software. I think the performance in FreeBSD today is somewhat better than Linux, because the Netflix guys did a lot of work to make it so, and I think we'll see over time how it evolves.
E: Okay, so the final talk today is the other Applied Networking Research Prize winning talk. It is by Arthur Selle Jacobs. Arthur has a PhD in computer science from the Federal University of Rio Grande do Sul in Brazil, and he's worked with Jennifer Rexford's group in Princeton and with Walter Willinger. He's currently working as a senior software engineer for Nomad Health, I believe, and his paper today is entitled "AI/ML for Network Security: The Emperor Has No Clothes". I believe this was originally presented at the ACM Conference on Computer and Communications Security in November 2022. Yes? Okay, you should have control over the slides.
I: Into the mic, okay. Thanks, Colin, for the introduction. Hi everybody, my name is Arthur and I'm here today to present to you our work entitled "AI/ML for Network Security: The Emperor Has No Clothes".
I
So
in
recent
years,
we've
seen
exciting
advances
in
machine
learning
and
AI
in
fields
such
as
facial
recognition,
recommendation
systems
or
even
spam
detection
among
many
other
areas
of
computer
science
and
other
areas
in.
But
let's
take
a
look
at
what's
causing
all
that
excitement
and
what
we
call
the
traditional
AIML
development
pipeline.
Usually,
if
you
want
to
develop
a
new
machine
learning
model,
you
start
by
collecting
some
data
and
selecting
which
model
you
want
to
use.
I
You
didn't
use
that
data
to
train
your
selected
model
and
evaluate
it
using
traditional
evaluation,
metrics
such
as
Precision,
recall
or
F1
score.
Then,
if
you
have
a
high
enough
iPhone
score,
that
usually
means
your
job
is
done.
You
can
claim
your
model
Works,
deploy
it
in
a
network
product
in
a
production
and
you
move
on.
Otherwise
you
go
back,
you
collect
more
data
or
you
collect
better
data
and
re-evaluate
your
model
selection.
I: Now, we claim that this traditional AI/ML pipeline is good enough for low-stakes decision making, such as recommendation systems or spam detection. But what about high-stakes decision-making scenarios, such as self-driving cars or network security, in which a wrong decision can have a direct impact on people's lives or on companies' revenues and reputations? In these scenarios, we argue, being able to claim that a model works is not good enough.
I: We need to be able to tell why a model works and when the model does not work, especially considering that it is well documented that machine learning models can suffer from underspecification issues, such as shortcut learning, where the model takes shortcuts to classify the data rather than actually learning to solve the problem; or the model might suffer from out-of-distribution samples; or the model might simply be overfitted to spurious correlations in the data and not be learning anything.
I
Answer
that
question
we
propose
trustee
trustee
is
a
novel,
explainability
explainable
AI
technique
that
produces
Global
explanations
from
any
machine
learning,
Black
Box
model
in
the
form
of
low
Fidelity,
sorry,
High,
Fidelity
and
low
complexity,
decision
decision,
trees,
trustee
augments,
the
traditional
AIML
Pipeline,
with
two
new
steps,
the
first
one
to
extract
a
decision
tree
from
any
Black
Box
model
and
the
second
one
to
analyze.
That
decision
tree
for
any
issues
that
might
impact
be
impairing
the
the
model
to
make
the
correct
classifications.
I
And
finally,
our
last
requirement
was
for
Tracy
to
be
able
to
produce
stable
explanations
that
is
produced
roughly
the
same
explanation
for
the
same
input
on
multiple
executions,
so
trustees
algorithm
starts
receiving
as
input
to
a
data
set
in
a
black
box
machine
learning
model.
It
then
starts
by
splitting
that
data
set
into
a
training
and
testing
data
sets
using
a
given
split
such
as
70
and
30
percent.
I
It
then
uses
that
Black
Box
model
to
as
an
oracle
to
produce
the
expected
output
for
the
training
data,
which
will
then
be
used
to
guide
the
training
of
the
decision.
Trees
notice
that,
since
any
machine
learning
model,
can
be
used
to
produce
the
expected
output,
we
achieve
our
area.
Our
first
design
requirement
of
model
agnosticity,
then
trustee
selects
an
M
number
of
samples
from
this
training
data
set
in
expected
output
and
further
splits
it
into
a
training
and
testing
set.
I
The
training
set
is
then
used
to
produce
a
decision
tree
using
the
traditional
algorithm,
cart
called
classification
and
regression
trees
and
then
evaluate
it.
Using
the
tests
and
data
sets
to
produce
an
explanation
output
which
we
can
then
use
to
measure
the
Fidelity
of
the
of
the
produced
explanation
with
the
expected
output.
This
process
is
repeated
another
time
and
a
num
number
of
times
with
different
samples
from
the
training
data
set
in
which
we
call
trustee
inner
inner
loop.
That
runs
any
number
of
times
this
inner
loop
produces
an
output.
I
Then
one
thing
to
notice
here
is
that
it
is
not
uncommon
for
cards
algorithm
to
produce
decision,
trees
of
hundreds
or
thousands
of
nodes,
and
for
for
a
human
to
be
able
to
parse
it.
This
needs
to
be
much
much
smaller,
so
the
size
of
the
explanation
here
matters
to
circumvent
this
problem.
We
propose
a
new
pruning
algorithm.
I
Now,
finally,
given
that
trustee
uses
a
subsample
of
data
to
train
decision
trees,
it
is
possible
that
with
multiple
executions,
different
decision
trees
will
be
generated
and
so
to
mitigate
that
issue,
we
added
an
outer
loop
to
trustee
that
runs
the
inner
loop
for
an
s
number
of
times
and
calculates
the
pairwise
agreement
of
the
decision
trees.
The
decision
trees
produced
that
is
trustee
measures,
whether
or
not
they
produce
decision
trees
make
agree
on
the
decisions
made
for
the
same
samples
and
then
Returns
the
decision
tree
with
the
highest
main
agreement
amongst
all
of
them.
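The inner/outer loop just described can be sketched in a few lines. This is a drastically simplified reconstruction, not the authors' implementation: real Trustee fits CART decision trees, while here a one-feature threshold "stump" stands in as the interpretable surrogate, and `black_box` is any function mapping a feature row to a label.

```python
import random

def fit_stump(X, y):
    """Pick the (feature, threshold) pair whose split best matches labels y."""
    best = None  # (accuracy, feature, threshold)
    for f in range(len(X[0])):
        for t in sorted({row[f] for row in X}):
            pred = [1 if row[f] > t else 0 for row in X]
            acc = sum(p == label for p, label in zip(pred, y)) / len(y)
            if best is None or acc > best[0]:
                best = (acc, f, t)
    _, f, t = best
    return lambda row: 1 if row[f] > t else 0

def trustee_sketch(black_box, X, inner=5, frac=0.7, seed=0):
    rng = random.Random(seed)
    oracle = [black_box(row) for row in X]   # black box acts as the oracle
    trees = []
    for _ in range(inner):                   # inner loop: resample and refit
        idx = rng.sample(range(len(X)), int(frac * len(X)))
        trees.append(fit_stump([X[i] for i in idx], [oracle[i] for i in idx]))
    # outer-loop idea: keep the tree that agrees most with the other trees
    def total_agreement(t):
        return sum(t(row) == u(row) for u in trees for row in X)
    tree = max(trees, key=total_agreement)
    fidelity = sum(tree(r) == o for r, o in zip(X, oracle)) / len(X)
    return tree, fidelity

# Toy black box with a planted shortcut: "attack" iff feature 1 is large.
bb = lambda row: 1 if row[1] > 50 else 0
data = [[i % 3, (i % 2) * 100] for i in range(20)]
tree, fidelity = trustee_sketch(bb, data)
assert fidelity == 1.0                        # surrogate mimics the black box
assert tree([0, 100]) == 1 and tree([0, 0]) == 0
```

The fidelity computed at the end is the agreement between the surrogate tree and the oracle labels, which is the quantity the talk's loop maximizes and reports.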
I: Now, for the second analysis step of the augmented development pipeline, we introduce a novel method we call trust reports. Trust reports basically automate part of the analysis process, to try to identify the three underspecification issues that I mentioned before: shortcut learning, out-of-distribution samples, and spurious correlations.
I
We
basically
the
trust
report,
summarizes
important
information
from
the
decision
tree
explanations,
such
as
the
size
of
the
decision
tree
the
depth.
The
number
of
input
features
from
the
model
that
were
actually
used
to
classify
the
data,
the
Fidelity
amongst
many
other
things
and
and
small
experiments
to
see
how
much
the
explanation
is
staple
to
the
actual
Black
Box.
I
On
top
of
that,
the
trust
report
produces
useful
plots
on
the
on
the
decision.
Tree
explanations,
such
as
the
number
of
samples
classified
at
each
level
of
the
decision
tree
for
optimal
pruning,
the
number
of
the
number,
the
number
of
samples
and
classes
that
a
specific
Branch
classifies
and
then
the
number
of
samples
that
each
feature
is
responsible
for
in
the
decision
tree
classification.
I
Now
it
is
important
to
notice,
though
sorry,
that
the
trust
report
does
not
automatically
tell
you
which,
under
specification
issue,
your
model
suffers
from.
It
still
requires
a
human
to
look
at
them
and
I
try
to
identify
it.
Since
this
under
specification
issues
are
ultimately
domain
dependent.
So
it's
really
hard
to
automate.
I: We then used this model to extract a decision tree out of it using Trustee, with a fidelity of precisely one, and no pruning was required, since it only had seven nodes, which you can see here on the screen. Now, as you can see, the decision tree is telling us that the model is using bytes 49, 43, and 47 from the input bytes to make the classification between VPN traffic and non-VPN traffic. To understand this decision tree, we need to first understand the data that it's coming from.
I
So
if
you
go
when
we,
when
we
went
to
Dove
deep
into
the
pcaps
that
were
being
used,
we
quickly
noticed
that
there
was
a
split
in
the
data
that
is
all
the
non-vpn
traffic
pcaps
had
ethernet
headers
on
them,
while
the
VPN
trap,
pcaps
traffic
Recaps
did
not
have
internet
header
on
them.
This
created
a
mismatch.
I: So with that knowledge in mind, we can go back to the decision tree and see that the first decision this model is making is basically comparing whether, for VPN traffic, it is using the UDP or TCP protocols, values 6 and 17 in the IPv4 protocol field, against a random byte from a source MAC address, which in this data is always larger than 17. So that basically splits almost all of the data perfectly.
I
The
second
level
of
the
decision
three
is
basically
showing
that
to
weed
out
the
remaining
few
samples
that
do
not
follow
these
rules.
Like
one
percent,
the
the
model
is
picking
up
on
different
headers,
such
as
the
fragment
offset
against
a
random
bike
from
the
render
from
the
source
Mac
address
for
the
one
for
vpns
on
right
side
and
on
the
left.
Side
is
looking
at
the
destination
Mac
address,
which
is
always
zero.
I
That
is
that,
if
you
go
back
to
the
data
you
could
you
quickly
realize
that
the
developers
of
the
model
didn't
remove
the
pcap
metadata
from
the
from
the
pcapps
before
reading
the
features
so
they're
actually
reading
features
feature
values
from
the
pcap
metadata
for
the
first
40
bytes,
which
includes
a
lot
of
potential
potential
shortcuts
for
the
model,
including
by
3023,
which
indicates
whether
or
not
the
ethernet
header
is
present
or
not
in
the
pcap.
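The shortcut described here is easy to see in the file format itself. As a sketch (our reconstruction, not the paper's code): a classic pcap file starts with a 24-byte global header whose last field, bytes 20-23, is the link-layer type, so a model fed raw file bytes can "classify" VPN versus non-VPN from that field alone, e.g. LINKTYPE_ETHERNET (1) versus a raw-IP link type such as 101.

```python
import struct

def make_pcap_global_header(linktype):
    # magic number, version 2.4, thiszone, sigfigs, snaplen, link-layer type
    return struct.pack("<IHHiIII", 0xA1B2C3D4, 2, 4, 0, 0, 65535, linktype)

def linktype_of(header_bytes):
    # the field the model was (accidentally) reading: bytes 20-23
    return struct.unpack_from("<I", header_bytes, 20)[0]

non_vpn_like = make_pcap_global_header(1)    # Ethernet headers present
vpn_like = make_pcap_global_header(101)      # raw IP, no Ethernet header

assert len(vpn_like) == 24
assert linktype_of(non_vpn_like) == 1
assert linktype_of(vpn_like) == 101
```

Stripping these 24 bytes (plus the 16-byte per-packet record headers, hence the "first 40 bytes" mentioned above) before feature extraction removes the shortcut.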
I: So the takeaway is that this model is suffering from blatant shortcut learning: it didn't learn to classify the data at all, and is simply picking up on shortcuts present in the feature values. Now, for the second use case, I want to revisit the random forest example that I showed before. We saw many, many papers that relied on the CIC-IDS-2017 data set for classification; this data set is really popular.
I
This
data
set
contains
traffic
from
13
different
types
of
attacks
aside
from
benign
traffic,
including
Port
skins,
DDOS
and
Heartbleed
in
this,
and
this
data
set
comes
with
a
set
of
78
pre-computed
features
from
flow
statistics
such
as
flow
duration,
mean,
inter
arrival
time
number
of
packets
sent
and
received
in
each
flow
and
a
lot
of
most
of
the
Publications.
We
we
found
they
used
the
data
set,
reported
F1
score
numbers
of
0.99,
which
were
very
easily
able
to
reproduce
with
a
random
Forest
classifier.
I
By
simply
looking
whether
or
not
the
maximum
response,
the
maximum
length
of
response
packet
size
is
bigger
or
smaller
than
12K.
This
model
is
able
to
determine
whether
it's
a
heartbeat
attack
or
not.
The
reason
for
that
in
using
this
distribution
plots
from
the
trust
report.
We
can
see
that
this
feature
specifically
a
perfectly
splits.
I
The
entire
heart
bleed
flows
from
all
of
the
rest,
because
the
response
packet
size
for
heart
bleed
that
flows
are
is
always
bigger
than
12K
and
it's
always
smaller
than
12K
for
all
of
the
other
classes,
and
we
can
see
that
same
behavior
in
other
features
such
as
the
inter-arrival
time
response,
inter
arrival
time,
which
almost
perfectly
splits
heart
bleed
from
all
others.
All
of
the
other
classes.
I
Now
to
understand
why
this
happens,
we
need
to
First
understand
how
the
heartbeat
attack
works.
A
heartbeat
attack
and
I
feel
like
this
is
preaching
to
the
choir,
but
heartbeat
attack.
Works
happens
when
I
in
a
malicious
actor
sends
an
https
heartbeat
message
with
a
to
a
vulnerable
server
with
a
value
in
the
size
field
bigger
than
the
actual
package.
So,
basically,
you
can
send
a
16k
byte
packet
and
specify
it
as
64k
bytes
for
the
server
a
vulnerable
server
will
respond
with
a
message
with.
I
We
have
a
heartbeat
response
with
the
same
size
as
the
incoming
packet
by
copying
the
contacts
of
the
incoming
packet
into
the
into
the
response
packet.
So,
but
since
the
pack,
incoming
packet
only
has
16k
bytes,
the
response
packet
will
have
48k
48k
bytes
from
the
server
memory,
which
may
include
credit
card
information,
usernames
passwords
that
sort
of
thing
now
in
the
seek
IDs
2017
data
set.
Specifically,
we
notice
that
for
the
30
minutes,
duration
of
the
heartbeat
at
the
heart
bleed
attacks
generated.
I
They
didn't
close
the
connections
once
we
generated
huge
numbers
for
feature
values
related
to
response
packet,
sizes
and
Inter
arrival
times
in
in
those
flows
which
made
it
abundantly
easy
for
the
for
the
model
to
pick
up
on
those
values.
I: So with that in mind, we set out to generate a validation data set for our explanation, consisting of a thousand new Heartbleed flows with out-of-distribution values, in which we simply closed the HTTPS connection after every heartbeat message sent. This generated feature values for response packet sizes and inter-arrival times much more similar to benign traffic, and, as expected, the random forest classifier was unable to identify a single one of those thousand new Heartbleed flows as Heartbleed.
I: Now, the third use case I want to show is another example of a paper that uses the CIC-IDS-2017 data set. This paper was published at CCS 2020 and proposes a model called nPrintML. It uses an AutoML model for an intrusion detection system, with 4480 features taking the values minus one, zero, or one, which correspond to a bit-level representation of the packets' protocol headers, read from the first five packets of each flow.
I
The
reason
for
that
is
that,
because,
in
this
data
set
all
of
the
attack,
traffic
was
generated
in
an
outs
from
an
outside
computer,
one
hop
away
from
the
measurement
point,
and
so
this
is
basically
checking
whether
the
the
attack
was
inside
the
network
benign
or
outside
the
network,
which
is
malicious.
So
it's
basically
able
to
tell
the
difference
based
on
that
most
of
the
attack
traffic
was
generating
using
Kali
Linux,
which
is
the
initial
TTL
value,
is
64.
I: Minus one hop, that's 63, so that bit is one, and with that you get to split all the benign traffic from the rest. Now, the second decision is also looking at the TTL, but it's splitting off all of the DDoS attacks, which were generated using Windows, Windows 8.1 to be specific, which has an initial TTL value of 128, or minus one hop, 127. So it looks at a bit of the TTL of the second packet, or it could be the first packet, whether it's one or zero, and with that it's able to identify all of the DDoS samples in this data set.
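The TTL shortcut described above is simple to reproduce (our reconstruction, not the paper's code): Kali Linux sends packets with an initial TTL of 64 and Windows 8.1 with 128, so one hop away from the measurement point they arrive as 63 and 127, and a single high-order TTL bit separates the two attack sources from each other and from local benign traffic.

```python
def observed_ttl(initial_ttl, hops):
    """Each router on the path decrements the IP TTL by one."""
    return initial_ttl - hops

kali = observed_ttl(64, 1)      # 63 = 0b0111111
windows = observed_ttl(128, 1)  # 127 = 0b1111111

assert kali == 63 and windows == 127
# bit 6 (value 64) of the observed TTL cleanly separates the two senders,
# which is exactly the kind of single-bit split nPrint features expose
assert (kali >> 6) & 1 == 0
assert (windows >> 6) & 1 == 1
```

A bit that encodes the operating system of the traffic generator, rather than anything about the attack, is a textbook shortcut feature.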
I: The third decision here is also interesting, because if you notice, the decision tree is checking whether or not the value is negative. This means that the model is checking whether there is a second packet or not in the observed flow.
I
As
most
of
you
probably
know,
board
scans
are
usually
not
responded
to
by
by
an
attack
by
by
it
a
victim.
So
all
of
the
board
scan
flows
in
this
data
set
only
had
one
packet,
so
basically
the
model
was
able
to
check
whether
or
not
it
was
a
port
scanned
by
the
number
of
checking
the
numbers
by
checking
the
number
of
packets
in
the
flow.
I
And
finally,
this
was
our
last
observation.
We
noticed
that
this
decision
tree
relied
heavily
on
random
bits
from
the
TCP
options
headers
in
the
in
the
flows
and
using
the
using
the
sorry
and
using
the
trophy
part
we
iteratively
removed
features
from
this
data
set
until
there
was
only
TCP
options.
Fields
for
this
model
to
pick
up
on,
and
still
this
model
reported
an
F1
score
of
0.99
using
only
TCP
options
bits.
So
this
this
was
very
interesting
to
us
and
indicated
that
this
is
actually.
I: this model was not learning anything, but was picking up on spurious correlations in the data. We set out to validate this explanation by curating a balanced data set of 4047 flows from real-world traffic from the UCSB network, and used the Suricata IDS to label those flows as benign, denial-of-service attacks, or port scans, which were the three classes that we were able to collect in the short span that we captured. As you can see,
I: when we used the nPrintML model to classify that data, it was unable to identify a single one of the denial-of-service attacks, and it was able to identify only very few of the port scan attacks. The reason for that is, as I mentioned before, that the UCSB hosts also didn't respond to port scan attacks, so it was able to pick up on a few of those attacks based on the number of packets in the flow; but, as you can see, it failed when put under even the minimal stress of real-world traffic.
I
So
aside
from
these
three
data
set
three
use
cases
that
are
presented.
We
looked
at
four
other
different
use
cases
with
diff
with
reproducibility
artifacts
available
and
found
similar
issues
in
all
of
them,
including
kitsunis
Ensemble
of
neural
networks
for
anomaly,
detection
and
pensive's
reinforcement
learning
model
for
adaptive,
bitrate
selection,.
I
In
the
paper,
you
also
find
an
algorithm
description
of
trustee,
an
ablation
study
on
all
of
the
design
requirements
that
I
presented
and
more
information,
as
well
as
a
user
guide
on
the
trust
report.
That
I
am
that
I
mentioned
before
trustee
was
also
packaged
into
a
python
package
that
can
be
downloaded
today.
It's
available
for
anyone
to
use
and
has
received
surprisingly
number
of
amount
of
downloads
already
and
finally,
machine
learning.
I
High
stakes
scenarios
requires
a
level
of
trust
that
the
traditional
AIML
pipeline
simply
cannot
give
us,
but
trustee
helps
by
improving
the
trust
and
providing
High,
Fidelity
and
low
complexity
explanations
in
the
Forum
decision.
Trees
trustee
can
be
used
today
by
anyone
and
be
downloaded
in
that
website
or
using
pip
and
yeah.
So
just
download
the
python
package
go
analyze,
So
yeah.
Thank
you.
Thank
you.
E
All
right,
thank
you
very
much
excellent
talk.
Does
anyone
have
any
questions.
J
Hi
Stephen
Farrell,
yeah
I
read
the
paper.
It's
a
really
good
paper.
Thanks
I
enjoyed
it
I'd,
say
I
enjoyed
reading
the
paper
when
I
got
towards
the
end.
The
beginning
was
a
little
bit
harder
work
for
me
because
it's
something
I
feel,
but
the
end
actually
was
very
good
when
you
used
it
with
the
case
studies.
So
I
was
wondering
all
of
this
depends
on
having
access
to
the
training
data.
Essentially
right,
if
you
don't,
and
perhaps
in
in
a
lot
of
real
world
cases,
you
won't.
I
So
you
technically
don't
need
the
exact
training
data
to
produce
an
explanation
using
trustee
you
just
get
better
insights
using
the
training
data,
but
if
you
have
access
to
the
model
say
through
an
API
that
makes
the
classification-
and
you
have
data
to
to
that-
you
are
able
to
use
to
test
it.
You
can
produce
explanations
for
that
data
for
using
that
model,
but
it
I.
That
is
a
hard
challenge,
because
not
everyone
makes
that
model
available
like
that
through
an
API
for.
I: Yeah, so this is part of what I mentioned before about trying to automate this analysis as much as possible. It's really hard, because it's hard to tell whether or not the model is actually suffering from shortcut learning, for example, or whether the problem might just be super easy and not need machine learning at all.
I
So
it's
hard
to
make
that
call
unless
you're
actually
familiar
with
the
domain,
but
based
we
did
have
like
a
bunch
of
guidelines
based
on
the
values
we
we
produce
on
the
trust
report
that
could
indicate
whether
or
not
there
might
be
something
going
on
with
the
model.
So,
for
instance,
if
you,
if
your
model
has
4
000
features
and
you're
using
one
or
two
percent
of
it
to
make
a
a
perfect
classification
using
a
decision
tree.
That
is
probably
an
indication
that
there's
some
problem
going
on.
I: I would say the tree is not unique and probably not optimal. We optimize for fidelity, but there are different decision trees that might result in the same fidelity. We achieve the best possible fidelity for the use cases we analyzed, but if you have correlated features in the data set, you could generate a decision tree of the same fidelity using different features, for instance.
I
So
we
chose
not
to
tap
that
information,
basically
by
adding
that
outer
loop
to
trustee
to
produce
a
decision
tree,
that's
roughly
stable,
but
one
of
the
things
that
we
discussed
while
developing
is
that
there
is
knowledge
in
the
different
decision.
Trees
that
you
may
produce
the
ex.
The
expressivity
of
a
decision
tree
is
naturally
lower
than
this
than
the
neural
network.
For
instance,
a
neural
network
has
more
power
expressivity
power
than
a
decision
tree.
I
So
and
that's
the
reason
we
moved
away
from
decision
trees
in
the
first
place,
so
you
can
produce
different
decision,
trees,
explanation
for
the
same
neural
network
and
they
all
might
be
true,
it's
hard
to
say
it's
hard
to
say,
which
one
is
the
the
one.
True
decision
green
problem,
there
isn't
one
it's
amalgamation
of
all
of
them
and
that's
something:
we're
kind
of
working
on
trying
to
tap
on
that.
The
information
of
different
decision
trees
that
might
be
produced
because
they
are
they
they
can
be.
I
There
can
be
valuable
information
there
to
explain
how
the
neural
network
might
be
working,
for
instance,
does
that
answer
your
question?
Yes,
okay,.
E: Okay, I see Brian Trammell, who I think is remote.
L
A
L
Brian
Trammell
good
morning
from
Zurich,
so
I
noticed
something
in
thanks
a
lot
for
the
for
the
talk.
This
was
great,
unlike
Stephen,
I
haven't
read
the
paper
yet,
but
I'm
going
to
fix
that
this
week.
L
I
noticed
something
in
all
of
your
examples
where,
basically,
you
went
into
the
decision
tree
and
were
essentially
doing
analyzes
based
on
fairly
deep
domain
knowledge
of
how
the
sender's
in
a
network
operate
right.
This
is
something
I
think
that's
been
missing
from
at
least
a
lot
of
the
of
the
ml
literature
on
apply
ml
to
network
security
in
the
past.
So
so
thanks
a
lot
for
that.
L
Is
it
accurate
to
characterize
the
work
that
you've
done
here?
Is
basically
automation,
assisted
analysis
of
of
these
decision
trees
right
like
so
the
the
what
trustee
gets.
You
is
over
the
first
time
to
point
out:
hey
you
should
look
here
for
an
overfitting
or
an
over
classification.
Is
that
a
is
that
an
accurate
have
I
understood
that
correctly.
L: So one of the things that popped into my head when Stephen was talking, and this might be something to look at in future work, for being able to do verification without training data, is that you can essentially generate synthetic network data based on your knowledge of how these stacks work, right? The set that you have to explore, if you're trying to extract something from an API, is significantly reduced, even just in the examples that you have here.
L
You
have
a
lot
of
examples
where
it's
like,
oh
I
can
tell
I
have
peacock
metadata
versus
not
or
I.
Have
the
ethernet
header
versus
not
I
mean
there's,
there's
a
highly
restricted
set,
so
I
think
this
is,
is
more
of
a
for
follow-up
work.
L
I'd
really
like
to
see
some
sort
of
like
an
extractor
come
out
of
that,
so
that
would
actually
get
you
to
the
next
step
of
of
of
being
able
to
automate
this
analysis
because
it
did
look
a
lot
like
you
know
it
looked
a
lot
like
okay,
we
use
the
automated
thing
and
then
a
human
had
to
go.
Look
at
it,
which
is
super
useful,
but
you
know,
has
some
scalability
issues.
I: Yeah, understandably. That's actually a very good idea. As a reference, I can point to at least a HotNets paper that came out in 2021 that did something similar using ALE plots, but they didn't generate the data; they used ALE plots to sort of guide the collection of more data, to cover more data points. So it's something similar to what you mentioned, yeah, and I could see something like that being automated.
H
Okay,
so
the
the
start
of
the
paper.
Thank
you
for
this
talk.
It
was
really
nice.
The
start
of
the
paper
is
about
how
you're,
explaining
the
models
right,
but
as
I
go
further
on
it's
more
like
you're,
also
commenting
on
the
data
the
data
set
so
that
just
because
I'm
familiar
with
the
data
Excellence
work
when
you
must
have
heard
the
data
sheet
for
data
sets.
H: I just think that this has a lot of applicability in knowing how to do not just model explanation but also characterizing the data set itself; that, I think, is very new. And another thing: for the fidelity score that you have, you've shown it with precision, recall, and F1 score; have you considered using a specific information-gain measure?
I: I have not, but that's a good point, I can look into it. Regarding the data, yeah, we did notice that most of the problems we identified with these models arose from bad data, or from using the data wrongly, basically. We had this debate a lot while developing: whether we were finding issues with the model or with the data. But in the VPN/non-VPN example, for me, choosing which features to use is part of developing the model.
H: Because, like, I'm coming from an NLP domain, and there are these very well-known benchmark data sets which have been really exceeded by current algorithms. But then those data sets, as you have shown with the network data sets, are also in a sense biased. So this is interesting work.
B: Is this on? No? Okay. So, Diego Lopez. It's related precisely to these comments on the data sets, etc. When we started to work with AI models, I insisted very much that, since we were network management practitioners, etc., we should focus more on the data than on the models, and we have been trying to set up mechanisms for publishing data sets that are usable for this. And you make this reflection about getting the model plus the data, because otherwise Trustee cannot make, or it is much more difficult for it to make, any sense of things.
B
I
was
wondering
whether
well
not
necessarily
in
the
idea
in
the
ITF
or
the
irtf
or
whatever,
whether
it
will
be
advisable
that
we
try
to
push
for
a
set
of
public
data
that
could
be
equivalent
to
opening
I,
don't
know
how
to
call
it
open
source,
AI
or
open
data
AI
or
whatever.
So
just
I
I
felt
myself.
I
Thank
you
so
much
that
that
is
part
of.
We
do
comment
on
that
in
the
paper.
That
sharing
data
in
networking
is
different
from
other
areas
that
took
advantage
of
the
EI,
such
as
images
or
text,
because
there's
a
lot
of
private
data
in
networking
and
people
are
not
willing
to
share
that,
and
at
this
point,
I
think
we've.
I
There
there's
been
some
initiatives
like
the
people
that
produce
seek
ideas,
2017
to
produce
data
for
machine
learning,
that's
publicly
available,
but
they're
all
fundamentally
broken
that
they
make
them
kind
of
useless
for
us.
So
the
at
least
the
alternative
we
found
was
to
be
for
people
to
work
with
their
universities.
I
We
couldn't
publish,
for
instance,
the
UCSB
data
set,
because
that
was
Private
for
the
the
day
we
did
publish
the
headers,
only
not
the
payloads
of
that
of
the
of
the
traffic,
but
it
allows
the
a
lot
of
it
allowed
us
to
at
least
validate
the
model
that
was
trained
on
the
different
data
set.
So
that
was
the
alternative
we
found
and
I
think
might
be
the
one
that
we
as
researchers
can
take
advantage
of.
K: [question inaudible]
I: Yes, but I would say the idea of these proposals is not to retrain, right? You're publishing a black box, and this is your solution to it. The idea is basically that you take this box and put it in a production network, and it should work; that's sort of how these models are sold to us, not that you need to constantly be retraining them, as you should.
I
Our
idea
with
validating
those
papers
and
that
on
the
data
that
was
used
was
simply
that
curating,
a
different
data
set
to
show
that,
if
you
put
this
in
your
network
as
it
is,
it
will
break
basically,
but
we
did
retrain
if
I'm
not
mistaking
the
the
endprint
ml
model.
We
did.
We
train
using
the
UCSB
data
set
and
he
was
able
to
pick
up
on.
E: All right, that's everything we have for today. Thank you again to Boris and to Arthur for giving the talks, and congratulations on the awards. Both of the prize winners will be here for the rest of the week, so if you have further questions or want to talk about the work, please do go talk to them.
E
Please
also
consider
going
along
to
the
usual
usable,
formal
methods
group
or
the
the
raspberry
later
in
the
week,
and
seeing
what's
going
on
in
those
two
two
new
research
groups
with
that.
Thank
you.
Everybody
and
I'll
see
you
around.