From YouTube: IETF111-ANRW-20210727-1900
Description
ANRW meeting session at IETF111
2021/07/27 1900
https://datatracker.ietf.org/meeting/111/proceedings/
A: So yeah.
C: I'm pleased to welcome you to this third session, in which we will talk about interconnection and routing.
C: So we have two talks: one from Romain from IIJ, talking about hunting BGP zombies in the wild, and the next talk is going to be on MetaPeering, that is, automated ISP selection, by Mustafa. My name is Amreesh Phokeer. I work for the Internet Society as an internet measurement and data expert, and I guess we can start. First of all, let me introduce Romain. Romain is a senior researcher at IIJ. His current research interests include traffic modeling, network data analytics and anomaly detection. Please.
E: Hello, everyone. This is Romain Fontugne from the IIJ Research Lab, and today I will present our latest results on BGP zombies.
So, first let me explain what a BGP zombie is. This figure represents how one prefix is seen by RIS routers: on the y-axis you have all the RIS routers, and on the x-axis you have time. The prefix we are looking at is one of the RIS BGP beacon prefixes, and the green circles here show that the prefix is announced by one of the routers.
E: Then the green line shows that the prefix is active in the router's routing table, and the red cross shows that the prefix is withdrawn by the router. So here it means the prefix is active for two hours, then it's withdrawn for two hours, then it's announced again for two hours, withdrawn again, and announced again, and this is what we expect from BGP beacons.
E: But you can see that there are three lines here that represent three routers that think this prefix is active during this time, even though we know that the prefix was withdrawn by RIPE, and this is what we call BGP zombies. So here we have three zombies. In summary, a BGP zombie is an active entry in a routing table that corresponds to a prefix that is in fact withdrawn by its origin AS. We've looked at these BGP zombies in the past, and we used BGP beacons to do that.
E
That
was
what
we
published
in
palm
2019,
but
it
didn't
really
tell
us
anything
about
the
regular
prefixes
we're
using
on
the
internet,
and
this
is
the
goal
we
set
for
this
work
here.
We
want
to
see
we
want
to
monitor
big
zombies
for
regular
prefixes
and
see
if
it
was,
if
it's
as
bad
as
what
we've
seen
for
our
b
cards,
so
for
beacons.
E: Defining a zombie was very easy there, because we knew already when the prefix is withdrawn and when it's announced again. But here, because we are looking at any prefix on the internet, we have to find out when an origin AS is going to withdraw a prefix, and to do that we are looking at a metric, which is the number of active routers for a prefix, shown here in that figure. That metric ranges between zero and one; one means that all the routers we are using see that prefix as active.
E: But what is interesting for us is to see when there's a significant change, a significant drop, after which this metric, the number of active routers, is stable but at a low value. Here it means that only a few routers didn't withdraw that prefix, and if that lasts for a certain time, then we're going to say that this is a BGP zombie.
E: So, for the BGP zombies: when we see the majority of the routers withdraw the prefix, we're going to wait 90 minutes, and if after 90 minutes we see that the prefix was not completely withdrawn and wasn't re-announced, then we're going to say this is a zombie. You can check the paper for more details on why we use 90 minutes.
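Stated as code, the detection rule described here could look like the following minimal sketch. Only the 90-minute wait and the "majority withdrew, a few routers stuck" idea come from the talk; the majority threshold, the data layout and all names are illustrative assumptions, not the paper's actual implementation.

```python
from dataclasses import dataclass

ZOMBIE_WAIT_MINUTES = 90  # waiting period mentioned in the talk

@dataclass
class Sample:
    minute: int             # time of the observation
    active_fraction: float  # active routers / all routers, in [0, 1]

def find_zombies(samples: list[Sample]) -> list[int]:
    """Flag times at which a prefix looks like a BGP zombie.

    Heuristic per the talk: when the majority of routers withdraw the
    prefix (big drop in the active fraction), wait 90 minutes; if the
    prefix is then neither fully withdrawn nor re-announced, a few
    routers are stuck with it -> zombie.
    """
    zombies, drop_at = [], None
    for s in samples:
        if drop_at is None:
            if s.active_fraction < 0.5:          # majority withdrew
                drop_at = s.minute
        elif s.active_fraction == 0.0 or s.active_fraction > 0.5:
            drop_at = None                       # fully withdrawn or re-announced
        elif s.minute - drop_at >= ZOMBIE_WAIT_MINUTES:
            zombies.append(s.minute)             # still stuck after 90 min
            drop_at = None
    return zombies

# Toy example: 3 of 10 routers keep the prefix after the withdrawal.
series = [Sample(t, 1.0) for t in range(0, 60, 5)]
series += [Sample(t, 0.3) for t in range(60, 240, 5)]
print(find_zombies(series))  # -> [150]
```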
E: Using this very simple zombie detector, we analyzed six years of BGP data and found 6.5 million BGP zombies, and we looked at different things in these zombies. First we ran some sanity checks, which I will explain on the next slide, and we also looked at some of the characteristics of zombies in the wild. One of the first sanity checks we've done is to look at what we call the state variance between RIS peers.
E
So
s3
tells
us
that
it
can
reach
that
prefix
through
this
ss,
but
we
have
access
to
one
of
dcs
through
this
and,
and
that
is
tells
us
that
it
has
withdrawn
this
prefix.
E: So this really tells us that there is in fact a zombie and it's not a misclassification. We use that to validate our results, but to do that, we have to have zombie paths with at least two RIS peers in the AS path.
E: Only a few prefixes have created a lot of zombies, and one of the questions we had then was: okay, maybe noisier prefixes like BGP beacons are more prone to zombies, and this is what we looked at in this figure.
E
We
did
this
only
for
two
years
in
2018
and
2019,
and
we
looked
a
bit
more
at
this
asm.
We
look
at
a
different
characteristic,
and
what
we
found
was
that
the
top
s
and
popular
content
networks
usually
have
also
either
like
they
announce
a
lot
of
prefixes,
or
they
are
very
long,
especially.
E
And
if
we
think
about
it,
it
kind
of
makes
sense
if
we
assume
that
bgp
zombies
are
due
to
bugs
in
routers,
then
a
longer
space
lines
mean
they're,
gonna
imply
more
routers
and
thus
more
chance
to
hit
one
of
this
back.
E: Here we also found that some zombies have an origin that is different from that of their covering prefix; because that route is stuck, we might have wrong origin information.
E: Zombies can also create routing loops, and we found over 400 potential routing loops in our results. I also advise you to look at the presentation from Kellnag last year, where they give a concrete example of routing loops and also use traceroutes to show it. Okay, so that concludes my presentation. In this work, we looked at BGP zombies for regular prefixes.
C: If not, I have a question for you. My first question would be: what do you think could be the cause of these BGP zombies? Is it faulty routers? What could it be?
E: Thank you, that's a very good question, and that's definitely something that is missing in our study. It's simply missing because it's very hard to check all the causes of zombies.
E: The main cause, we think, is indeed faulty routers and bugs in routers. But it's just very hard to check all the different software versions, and it's probably a lot of corner cases where things don't work; it's hard to check. But that's why we think it's a problem.
E: What we've heard from operators is that they sometimes see those zombies appearing, and the common practice is to re-announce the prefixes and then really withdraw them. When you withdraw a prefix, you have a very small chance that a zombie appears, and just re-announcing and withdrawing again usually seems to remove the zombies that appear. We've also heard from some operators that they just reset the whole BGP session, which is a bit brutal.
E: Yeah, but I think the main problem for operators is to monitor this, to know that there is in fact a problem. For the paper we do monitor zombies, but in real time it's sometimes a bit hard to do, because if you use the RIS collectors, that's a lot of data to process; it can be a pain ingesting all this data. But still, RIPE provides some tools, like BGPlay for example, where people can look at how their prefixes are seen by RIS.
C: Okay, I see we have one person in the queue. Colin, you can proceed.
H: Hi, can you hear me? Yes? Hi, nice talk. So this is not even close to my area, so this question may make very little sense. I was wondering if you were seeing any difference in behavior between IPv4 and IPv6 prefixes, given that the interpretation is that it may be router bugs, and they may be exercising different code paths.
E: In this paper I don't think we've done much comparison between IPv4 and IPv6, but we had a previous paper on that, and the bad news is: we've seen many more zombies in IPv6.
E: And for one reason: when we really dug into the results, we saw that one of the networks was creating a lot of zombies, and we contacted them, and they said that, yeah, they had some problems with their IPv6 and they were restarting their BGP sessions when a customer complained.
A: Hi, can you hear me now? Yeah? Hi Romain, thank you so much for the talk, very interesting. I'm just wondering, because you processed so much data, and I know that going through all the information from RIS and from RouteViews is, you know, a complete work in itself. So thank you so much for taking the time to do that. So maybe I missed this.
A: I'm just wondering how much of what you're seeing as zombies are actual zombies, and how much is just, you know, some hiccups or accidents that may occur. Have you been able to separate these two reliably? And maybe as a follow-up: have you seen some big offenders, in the sense of, you know, somebody who may generate too many zombies, or do you see a very skewed distribution in the origin of these zombies? So again, thank you so much.
E: In this paper we didn't really look at the source of the zombies, but the previous one did a bit more work on that. We had a technique to find the source, that is, where the zombie was created, and there was not really one big offender; it was changing quite a lot, and that also gave us some more evidence that there are those bugs. It's a bit random how the zombies are created.
E: So even though we had this very controlled environment, you know, things were changing all the time. And for the first question, how we checked that they are really zombies:
E: In this paper we did some of these sanity checks. The difficulty with this work was that we're working with past data; we used six years of BGP data, so it's very hard to go back in the past and check, you know, what really happened. But in the previous studies we did run traceroutes.
E: Every time we found a zombie we ran traceroutes, and we could confirm it: every time, we would see a few routers that forward the packets, and then we receive an ICMP message saying that the network is unreachable.
E: So we could confirm that, and I don't remember exactly the number, but over 90 percent were really zombies.
I: Yeah, Jared Mauch, Akamai. I mention it because in your paper you highlight a number of our prefixes that apparently regularly get, you know, get stuck and such. Yeah, I think there are a lot of things that contribute to that. On our side, we've definitely been trying to improve some of our prefix stability efforts as well, you know, because we have a large set of distributed deployments, and a lot of these, like the AS16625-originated prefixes:
I: Those are all coming from BGP speakers that run either on routers or on servers specifically, and that may be more likely to actually have some negative operational impacts when those things are put into service or taken out. And we've been trying to improve the prefixes, so I'd be interested to know, or see, whether you're still seeing this from us and whether it's improved recently, because we've undertaken a number of efforts to improve this.
E: Oh, okay, that's very interesting! Then I guess, yeah, we could check again and see if there were some improvements. But one thing I'd like to mention: I remember that for Akamai, because you announce a lot of very small prefixes, I think at different places in the network, sometimes we see very long AS paths for these prefixes. I'm guessing that zombies will appear for those very long AS paths, and that this has little to no impact on your traffic, because that's probably not the place where you're going to direct your clients.
I: Yeah, we have a lot of distributed unique deployments, and we now actually have three different backbones that we operate that interconnect all of our sites together. So depending upon where the regional interconnection is for the distribution of the content into a customer network, you know, in Japan it might be different than in India or than somewhere else. You'll obviously see different AS paths for those, based upon the relationships we have with the service provider.
I: So it's quite possible that some of the prefixes might be negatively impacted depending upon that upstream provider, you know, network property. And we actually have teams who are working full-time, you know, going and chasing things like that around, as well as systems that kind of monitor and detect it; but they tend to just take things out of service, and then a human has to go and chase it down and figure out what happened.
I: You know, we quite often see, and I think it should be no surprise to anybody who's looked at BGP research, a lot of interesting events all the time. So when the systems just take stuff out of service automatically, it's really, you know, doing that to improve the customer experience, and that happens all day, every day.
C: Thank you, Jared. So this brings us to the end of your presentation, Romain. Thank you so much for coming here so early for this. Yeah, thank you very much, so we can move on. Thank you very much, bye-bye.
C: We can move to our next speaker, who is Shahzeb, a PhD student at the University of Central Florida, currently working as a research assistant at the Networks and Wireless Systems Lab. His key areas of research are network architecture, internet peering and data analytics.
J: Why is that? Because ISP admins often attend events sponsored by PeeringDB, NANOG, etc., where they network with each other, and using these events they identify ISPs for potential peering. After that, they negotiate traffic exchange terms and conditions, which may include the max traffic volume they are willing to exchange, the specific points that they are willing to peer at, whether or not it will be a public peering, etc.
J: And then, after this step, if and only if both of the ISPs agree to the terms and conditions, the BGP forwarding rules are written, so the deployment actually takes place. So, overall, since the whole process requires a ton of manual work, it is extremely slow, and it oftentimes takes a couple of weeks to months. And even with such an elaborate and lengthy process, finding the right peer is hard; say you put two months into selecting your peer:
J: It's not guaranteed that it's going to be an optimal option, because, see, the internet is far more dynamic than these interconnection deals, which means that during the negotiation and finding process plenty of great peering opportunities are discarded, and under- or over-estimation of various metrics can lead to future disagreements.
J: Now, bad selection can also mean that your resources are not optimally utilized, so your load-balancing factor is suboptimal, and in the bigger picture and in the longer run, such suboptimal relations, disagreements or missed opportunities can hurt both ISPs financially.
J: So it's clear why it's so important for these deals to be optimal and, along with that, to be dynamic, so that if you've identified an issue, you can fix it quickly.
J: Now we present MetaPeering, a tool that helps identify optimal peering ISP pairs and also gives you the best peering contracts. So, for two given ISPs, we need to decide whether or not they should be peering, and if yes, at which particular locations they should be peering. We first calculate the traffic matrices of both ISPs, which is sort of their internal traffic flow.
J: We also identify the locations where both these ISPs have a presence, because these are the points where peering is possible. Then we gather the gridded population data for the United States, which basically divides the whole country into small segments, so we know the population of each of these segments. And we take all of this data, all of these computations, and feed them into this policy generator machine, which extracts all the useful information.
J: The first part basically uses the PoP locations and the population data to construct an overlap map between the two ISPs; what it represents is the number of people each ISP presumably covers, and how many more people can become accessible with a peering deal. This is summed up in the affinity score.
J: The next part uses the same PoP locations and also the traffic matrices to give out peering willingness scores for each of the common locations, from the perspective of both ISPs. The overall willingness for a particular peering deal is just the average of these scores at the particular points.
J: We then take the geometric mean of the willingness score, which is representative of the willingness to peer, and the affinity score, which is representative of the non-overlapping areas and population, to get the felicity score, which tells us whether or not these ISPs should peer. Now, please note that these scores are novel; they're not an industry standard. We came up with these scores, and we have discussed how we came up with them and what they represent. So, with the felicity scores:
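For concreteness, the score combination described above reduces to a short sketch. The geometric-mean formula and the averaging of per-location willingness scores follow the talk; all input values and names below are invented for illustration.

```python
import math

def felicity(willingness: float, affinity: float) -> float:
    """Geometric mean of the two scores, as described in the talk.

    Both inputs are assumed to be normalized to [0, 1], so the
    felicity score also lands in [0, 1]."""
    return math.sqrt(willingness * affinity)

# Per the talk, the overall willingness is the average of the two
# ISPs' willingness scores at each common location (values made up).
per_location = {"Los Angeles": (0.8, 0.7), "Chicago": (0.6, 0.9)}
willingness = sum(sum(pair) / 2 for pair in per_location.values()) / len(per_location)
print(round(felicity(willingness, affinity=0.64), 3))  # -> 0.693
```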
J: ISP admins can set a threshold: okay, if this pair has a felicity score of more than 0.6 or 0.7, we will be peering; otherwise we will not. So this is the deciding factor for whether or not they should be peering. And along with that, we give them the acceptable peering contracts: okay, if you decide to peer, these are the locations that you should be peering at. Both of these results can be used by ISP admins to decide whether a peering deal will be worth it.
J: We can see the overlap map, which is calculated using the PoP locations, as mentioned earlier. We can also see the willingness scores for each of the contracts that are possible; not all of them are listed in the screenshot, but they are possible. And at the bottom we can see a sample contract recommendation: given that Sprint and eBay decide that they are peering, the MetaPeering tool recommends that they should be peering in Los Angeles and Chicago.
J: The website lists the top three such contracts: this is the best one, and then there's the second-best option and the third-best option. And just for reference, here's another example, for Columbus and eBay. In this case the same overlap map is given, the willingness graph is given, and a sample contract is given, but in this particular case our model does not recommend peering; in case they do end up deciding that they should peer, it should be at San Jose and Ashburn.
J: We tested this model on 23 different ISPs, which basically means 506 pairs, using two heuristics. On the x-axis we can see the ISP pair type, where A is access, C is content and T is transit; so AT, for example, means that it is an access-transit ISP pair. For the first heuristic, the ISP view, we recommend peering if any one of the two ISPs has a felicity score greater than a certain threshold.
J: In this case we used 0.55. For the second heuristic, the holistic view, we recommend peering if and only if both of the ISPs have a felicity score greater than 0.55.
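The two evaluation heuristics are easy to restate as code; a small sketch using the 0.55 threshold from the talk (the function names are ours, and the per-direction felicity scores are assumed given):

```python
THRESHOLD = 0.55  # value used in the evaluation

def isp_view(score_a: float, score_b: float) -> bool:
    # Recommend peering if either ISP's felicity score clears the bar.
    return score_a > THRESHOLD or score_b > THRESHOLD

def holistic_view(score_a: float, score_b: float) -> bool:
    # Recommend peering only if both ISPs' scores clear the bar.
    return score_a > THRESHOLD and score_b > THRESHOLD

print(isp_view(0.6, 0.4), holistic_view(0.6, 0.4))  # -> True False
```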
J: We believe that MetaPeering is a step in the direction of more dynamic and automated ISP relation management, and we are working on an extended, more complex MetaPeering model, which uses machine learning techniques to learn from previous data what's important in a peering deal before giving out its recommendations.
J: So this concludes my presentation. I hope you liked it. If you have any related questions, I would be happy to address them now.
B: Okay, just to clear something up in my mind from the presentation: you say that one of the scoring weights you use is population.
J: Yeah, so the metrics that we use assume that traffic is directly related to these eyeballs, as you mentioned, and that it's directly proportional to the population around that area. So, assuming that there are eyeballs there, the traffic originating from an area is directly proportional to the area that all of the PoPs cover. It's just a heuristic measure which we use in approximating the peering suggestions.
J: For this particular case, we're assuming an even distribution of eyeballs.
J: For this particular project, we have not focused on the economic side, but another project that we are currently working on focuses just on the economic side: optimizing peer selection based on how much you can save. That is something we are working on; maybe in the future these two projects will combine, but right now, in this project, we haven't considered that.
C: I don't think so. If not, I might have another one. I see that you base your model on data coming from the US. Do you intend to expand it to other regions of the world, where perhaps interconnection is a little bit different than in the US?
J: Yeah, definitely. We've seen that the peering and interconnection trends are different, especially for European IXPs. The reason we have focused on the US is that the model needs to train using peering trends, so if we combine the two, it will be a problem. But yes, the same model can be trained for European ISPs, where different metrics have different weightage.
A: So hi everyone, yeah, thanks so much. We are just closing session number three, so I'll invite Edmundo to help us chair session four. I don't know if you could also share the slides; if not, just let me know and I can share them, just to remind people how we're going to run the session.
A: Okay, because we don't see the slides, I don't know, should I share?
D: Okay, can you see the session slides? I'd like to share my screen. Okay, I was going to share my slides here, but if you have them, perhaps it's easier if you share yours.
D: Okay, so hello, everyone. My name is Edmundo de Souza e Silva. It's a pleasure to be here, and we have a very nice session on monitoring internet traffic. Today we have four papers. The first paper, Towards Cross-Layer Telemetry, will be presented by Justin Iurman. The second paper is about the spin bit and loss measurements. Then the detection of consumer IoT devices: how you detect IoT devices in the wild through the lens of an ISP. And the last paper is on the evolution of internet flows.
D: So the way we're going to run the session is to allow a very short question after each presentation, and then we have a panel at the end, a 15-minute panel. That's correct, Andre. So, without further ado, not to take time from the speakers, let's go to the first presentation: Justin Iurman, from the University of Liège; he's in a research unit for networking. So please, can you show the video?
A: Sorry, Edmundo, there's no sound for me. I'm not sure if other people have the same issue; if you can help us with that, thanks.
M: But first let me remind you of some basics. We moved from a kind of monolithic architecture to microservices, so you can see microservices everywhere now, and there are a lot of reasons for that, mainly that it's easier and faster to deploy and to maintain, but there are also other reasons.
M: So if you look at this kind of architecture, and let's assume there is a problem somewhere, and I ask you to debug it and to find the problem here, you would tell me: okay, easy game, right? But what about this one? It gets a little bit more complicated, right?
M: So hopefully, for that kind of spaghetti microservice architecture, you have APM, which is application performance management, and in this case, more specifically, you have distributed tracing tools. In this talk I will take Jaeger as an example, but you have to know that there are a lot of other alternatives.
M: Jaeger is just a famous one among them. Such a tool is very useful when we are dealing with microservices, and they all have something in common: they all have the same notion, the same concept, of traces and spans.
M: So a span is just the part of your code that you want to monitor, all right? And here at the bottom of the slide you can see a screenshot of the Jaeger visualization: you basically have a main trace which contains two sub-traces, and each trace has some child spans; so here a span, and then a child span, etc.
M: Well, that's when you're facing a problem. Just to cite an example: let's assume that my database lookup is slow. I can see in Jaeger that my database lookup is slow, because I traced it. In that case, should I just blame the app, or should I put the blame on the server, or even the database? Or maybe this is a network issue, or actually it could be anything else. So it's hard to know exactly what it is.
M: So let's look at a basic, simple topology here: you have the app, the database, and this guy in the middle of the path, with congestion on one of its interfaces. Jaeger will just report a slow execution time, right? So when you see this, you will investigate the app, you will investigate the database, and actually you won't find anything. You would be left scratching your head and wondering why it takes so long, and that's a big problem for root cause analysis.
M: The first question is: how do we find a way to correlate the traces from the APM with the corresponding network traffic? Back to my example: if you trace your database lookup in your code, you want to match the trace generated by the APM with the corresponding network traffic, which is the DB lookup.
M: Okay, on the link. And for that we will use IOAM. IOAM is In situ Operations, Administration and Maintenance; it's actually used to carry some useful data in packets. We have developed it in the kernel and it should be available soon. Why do we use IOAM? Well, we want to kill two birds with one stone. As I said, it can carry a lot of useful low-level information.
M: So, for instance, you have the queue size, and the IDs of the nodes and interfaces a packet is coming from and going to, and so on. And on the other side, we just enhanced the IOAM header to carry both the trace ID and the span ID. Remember what I told you about the common point of those tools: they all have the same notion of traces and spans, so a trace ID and a span ID together represent a unique ID of a span.
M: So the first important question is answered; let's face the second one: when and how should we inject these IDs? When? Well, we have two possibilities: either at socket creation or when sending data. If you think a little bit about it, at socket creation wouldn't be enough. Why? Well, because an operator could use the same socket for all connections all along, or you could have multiple traces for the same connection: if you want to monitor different parts of your code, you would have multiple traces on the same socket, okay?
M: So it's not an option to inject those IDs at socket creation. Moreover, you don't want to modify the C library, because you would also have to modify high-level languages, and that's not an option. We want to provide an improvement to those tools which is not a burden; you want to integrate it easily, without changing everything. Okay, so we select the option of injecting those IDs when we send the data. But now, how? Again, we have several possibilities.
M: So we are left with two other possibilities, which are to add a new syscall or to use a netlink call. Again, if you add a new syscall: syscalls are not always portable, and the preferred way of doing this is usually to use a netlink call; from a kernel perspective, it's always the best option. So we selected the netlink call.
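To make the send-time injection concrete, here is a rough user-space sketch of the idea, assuming the kernel side exists: right before data leaves, the current (trace ID, span ID) pair is associated with the flow, so that the IOAM layer can stamp it into the outgoing packets. The real CLT implementation does this with a netlink call from the instrumented application; everything below, names included, is an illustration of the concept, not the actual CLT API.

```python
import socket

# flow 4-tuple -> (trace_id, span_id) currently being sent on that flow
FLOW_TO_SPAN: dict[tuple, tuple[int, int]] = {}

def set_ioam_ids(sock: socket.socket, trace_id: int, span_id: int) -> None:
    """Stand-in for the real mechanism: CLT issues a netlink call here so
    the kernel's IOAM code tags this flow's packets with the
    (trace_id, span_id) pair. This sketch only records the association."""
    flow = sock.getsockname() + sock.getpeername()
    FLOW_TO_SPAN[flow] = (trace_id, span_id)

def traced_send(sock: socket.socket, payload: bytes,
                trace_id: int, span_id: int) -> None:
    # IDs are injected at send time, not at socket creation, because the
    # same socket can carry many different traces over its lifetime
    # (the reason given in the talk for rejecting socket-creation time).
    set_ioam_ids(sock, trace_id, span_id)
    sock.sendall(payload)
```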
M: So let me explain a bit this architecture of cross-layer telemetry.
M: First of all, we have a client, which is a Jaeger client in this case, which is used to add tracing code to your application. When a trace is available, it will be sent to the agent, which will forward it to the collector, which will apply some action on it and then store it in the database. Okay, now we added a CLT library, which is also a client library.
M: It comes with a lot of other challenges, but I think we are pretty close to a perfect solution. Right now this one is working pretty well, except in the corner cases I mentioned. But again, the main goal of this is to have some useful info to debug with, from layer 3 and layer 4, as long as it's low level.
M: So this is a correlation request, and the Jaeger collector will be responsible for just storing the correlation inside the database. As a result, you can see IOAM data, or layer 3/4 data if you want, directly in the Jaeger visualization. As I said, back to our example: you have the application and the database; we introduce and simulate congestion here, on this guy, on its interface. And as you can see, this is the first node, so this is the app, the second node, and the third node.
M: The second node is this one, and you can see that its egress queue is increasing. So if you are the operator, now that you see this, you directly find the root cause. Okay, you can see: there is a problem in the queue; maybe I could rebalance it. You are now capable of applying some actions to solve the problem. But again, without cross-layer telemetry you wouldn't have this data, so the only thing you would know is that it's slow, once again.
M: So let me conclude this talk. I definitely think that this is a hot topic in the industry; I've heard that there is a lot of interest in this, and I do believe that CLT solves a lot of challenges in the tracing world.
M: We are still working on some parts to improve it. I mentioned earlier another version, which will be per-packet, to have a perfect correlation and match solution, and there are also some other things to improve, but nothing that important. So I insist on the fact that this solution is working; this is something that you could use right now if you want, and there is a link to the GitHub repo, so feel free to have a look at it. There is also a video demonstrating how it works.
D: So thank you very much for your presentation, very nice. Let's see if there is anyone in the queue right now to ask questions.
D: I guess people are shy at the beginning, but I have one question. In your implementation, if I recall, you have to do some stuff by hand once you get all the information. So do you foresee, instead of manual intervention, potentially implementing some automated analysis that could be attached to your tool? Or have you thought about it?
M: Yeah, well, we could, but you have to know that below CLT there is IOAM, and the configuration of IOAM is actually the biggest part. Operators using IOAM usually configure it by hand, but again, we could provide some tools to automate it. We could also provide a way to merge everything into the APM tool, but maybe that would be a lot of burden for each tool.
M: So I think it would be better to keep it kind of decentralized, and maybe provide some tools to configure IOAM, yeah.
M: No, of course, you're right. It was a pretty limited testbed, just for the sake of the paper, but we are definitely planning on expanding the testbed to have some more realistic test cases.
D: Let's go to the next talk, which is going to be presented by Ike Kunze from RWTH Aachen University, where he's a PhD student and researcher. So please.
N: I think it is safe to say that network measurements have always been important to get a better understanding of what is going on inside the network. However, measurement techniques have typically been developed independently from protocols, and thus they oftentimes depend on externally visible protocol semantics.
N: A prominent example are TCP sequence numbers and acknowledgements, which can be used to compute the round-trip time of a connection. Let me quickly illustrate that with a short example. What we have here are two hosts interconnected by a network probe in the middle. If the host on the left-hand side now sends a packet with a certain sequence number, the network probe in the middle can store that sequence number and then basically start a timer.
N: As soon as the acknowledgement then arrives at the network probe, the network probe can basically stop the timer and compute the right-hand-side half of the round-trip time. Unfortunately, such techniques are no longer possible in times of encrypted transport protocols such as QUIC, because the protocol semantics are no longer visible to an observer. To still allow for meaningful measurements, the QUIC standard features a special-purpose bit, and that is the spin bit.
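The seq/ack trick described here fits in a few lines; a minimal sketch of a passive probe, ignoring retransmissions, SACK and sequence-number wraparound, which a real probe must handle (all names are ours):

```python
from typing import Optional

# (flow, expected ack number) -> timestamp when the data segment passed
pending: dict[tuple, float] = {}

def on_data_segment(flow: tuple, seq: int, payload_len: int, now: float) -> None:
    # The ACK covering this segment will carry seq + payload_len.
    pending[(flow, seq + payload_len)] = now

def on_ack(flow: tuple, ack: int, now: float) -> Optional[float]:
    # Probe-to-receiver-and-back half of the round-trip time, if we
    # previously saw the matching data segment.
    sent = pending.pop((flow, ack), None)
    return None if sent is None else now - sent

flow = ("10.0.0.1", 12345, "10.0.0.2", 80)
on_data_segment(flow, seq=1000, payload_len=100, now=0.000)
print(on_ack(flow, ack=1100, now=0.042))  # -> 0.042
```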
N: It is a dedicated bit in the QUIC short header and visible to on-path observers. While the spin bit allows for round-trip-time measurements, there are also other important network properties that one might want to measure. In this context, there's an ongoing discussion in the IPPM working group focusing on four different proposals that are similar to the spin bit but enable packet loss measurements.
N: As the name implies, it generates a constant square-wave signal; in other words, it first transmits a certain number of packets with a set Q bit and then a certain number of packets with an unset Q bit. The network probe can then simply count how many packets have arrived in which phase, and can thus derive the packet loss that has occurred here on the downstream WAN link. The third approach is then called the R bit, or reflection square bit, and builds upon the Q bit.
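A sketch of the observer side of the Q bit as just described: the sender flips the bit every fixed number of packets, so any completed run of identical bits shorter than that period indicates loss in that phase. The period value below is an assumption for illustration.

```python
SQUARE_PERIOD = 64  # packets per half-wave; an assumed value

def loss_from_q_bits(q_bits: list[int]) -> float:
    """Estimate loss from the Q-bit sequence seen by an on-path observer.

    Every completed run of identical bits should contain exactly
    SQUARE_PERIOD packets; shorter runs mean packets were lost."""
    runs, current, count = [], q_bits[0], 0
    for bit in q_bits:
        if bit == current:
            count += 1
        else:
            runs.append(count)       # a phase just completed
            current, count = bit, 1
    if not runs:                     # the trailing, possibly partial,
        return 0.0                   # run is deliberately ignored
    return 1 - sum(runs) / (SQUARE_PERIOD * len(runs))

# Three full phases of 64 packets, with 4 packets lost in the second.
wire = [0] * 64 + [1] * 60 + [0] * 64 + [1] * 10
print(round(loss_from_q_bits(wire), 3))  # 4 lost of 192 -> ~0.021
```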
N: Finally, we then have the so-called T bit, where we basically have one train of packets which is reflected several times between the server and the client. Mapped to our setting, our observer will now only be able to compute the packet loss that has occurred on the overall loop: from the time that the train has left the observer in one direction until it has entered the observer again from the other direction.
N: We then investigated three different scenarios. In the first setting, we induced random packet loss on the downstream WAN link. In the second setting, we induced burst packet loss, and for that we used the simple Gilbert model. And then, finally, we also considered the impact of different flow sizes on the measurement accuracy.
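For reference, the simple Gilbert model used for the burst setting is a two-state Markov chain in which packets are only lost in the "bad" state; a sketch with made-up parameter values:

```python
import random

def gilbert_losses(n: int, p: float, r: float, seed: int = 1) -> list[bool]:
    """Simple Gilbert model: GOOD -> BAD with probability p, BAD -> GOOD
    with probability r, and every packet sent in BAD is lost. This gives
    a mean burst size of 1/r and a loss rate of p / (p + r)."""
    rng = random.Random(seed)
    bad, lost = False, []
    for _ in range(n):
        bad = rng.random() < ((1 - r) if bad else p)
        lost.append(bad)
    return lost

losses = gilbert_losses(n=1_000_000, p=0.01, r=0.25)  # bursts of ~4 packets
print(sum(losses) / len(losses))                      # ~0.04 loss rate
```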
N: For this, we use symmetric traffic and disable congestion control so that there's a constant flow of packets. We then transmit roughly 1 million packets in each of our experiments and perform 30 iterations for each of our settings. We then report the cumulative loss rates that the different approaches have derived at the end of the experiments.
N: If we now look at the results of all four approaches, we can see that they are mostly very close to the ground truth, especially for higher loss rates. The only thing that stands out here is that the T bit is not that accurate for low loss percentages, as evidenced by the larger confidence intervals here. The main reason is that the T bit includes two pause phases, and thus it doesn't actually cover the whole traffic.
N: Additionally, it also has the highest fluctuations in the measurements and takes a really long time until it gets close to the ground truth. So, summarizing, it can be said that the L bit is the closest representation of the ground truth, while the Q and R bits are not far behind; only the T bit struggles a bit and takes a long time to get close to the ground truth.
N: Let us finally get back to the question stated in our title: which spin bit cousin is here to stay? Well, solely based on the measurement accuracy, the L bit seems to be the best choice, as it closely follows the ground truth. However, it depends on the end hosts' loss detection, and there's always this slight temporal delay between the actual packet loss and its reporting.
N: First, there's a decrease in accuracy when they are subject to burst loss, as evidenced in our second experimental setting. What we did is configure increasing average burst sizes, and as our results indicate, the Q, R and T bits struggle if the burst sizes increase. The second disadvantage of those longer algorithmic intervals is that they prolong the time until the measurement stabilizes.
N: These results now stem from our third setting, where we investigated different flow lengths. As evidenced in the plot, the different algorithms start their measurements at different times: the L bit starts first, then the Q bit joins, afterwards the T bit joins and, finally, the R bit joins.
N: However, in the long run, all the measurements start to stabilize at the same point, at roughly one to two megabytes. So overall, the measurement accuracy seems to be suitable in all four cases, although there are, of course, differences between them. But which of those approaches should one choose?
N: These eventually decide how closely network operators will be able to localize loss, and thus it actually depends on the needs of the operators, on how fine-grained they want to localize the loss in their networks, because from a measurement-accuracy perspective, all of the approaches should provide reasonable results for that.
D: We have time for one question for Ike. Thank you for the presentation. Let's see if there's anyone in the queue right now... not yet, so while people are thinking, let me try one question: have you thought about trying to measure the duration of packet loss bursts, and do you think that would be feasible with the scenario that you have?
N: So basically, you mean how long the individual bursts that occur are? Yes? So that is not directly the intention of the different techniques that we have there, and I actually didn't come up with them myself. So, yeah, I don't think that those techniques are that feasible for that. There are ways of determining how long certain bursts take, but I think that is mainly only possible for the Q and R bits in that case, because the others are not that feasible for that.
D: Okay, thank you very much. And Cedric had a question; I guess it disappeared from my screen. Okay.
O: Yeah, I can speak. I was just wondering if that instrumentation has an impact on the packet loss rate. So if you add all these bits to measure packet loss, do you change the actual packet loss rate that you observe in the network?
N: Basically, yeah, that's what I was about to say something about. The people in the IPPM working group are thinking about adding the loss bits to the two reserved bits that are still available in the QUIC short header; in that case we would only use currently unused space for that. But obviously, if we add additional bits to the overall transmissions, then that might have an impact.
N
Although
I
don't-
or
I
am
not
able
to
quantify
that
right
now,
but
that's
yeah,
I
think
a
valid
thought
about
those
those
additional
bits.
In
that
case
yeah.
D: Okay, I guess we defer further questions to the panel. Thank you very much, very nice work, and let's go to the next presentation. The presenter is going to be Said Jawad Saidi from the Max Planck Institute; he's a PhD student in computer science. Can we start the presentation, please?
L: These devices provide a wide range of services, from smart speakers to smart appliances, TVs and surveillance cameras. However, it has been shown that these devices can be exploited, and one notable example is the Mirai attack, where millions of exploited devices participated in launching one of the largest DDoS attacks, which crippled parts of the internet and service providers.
L: However, detecting IoT devices at the provider level is not an easy task. The reason is that traffic patterns across IoT devices are diverse. There has been some recent work suggesting that we can deploy an agent inside the premises of a customer; however, that is not scalable, and it is privacy-intrusive as well. And active measurement approaches won't work if the devices are located behind a NAT.
L: Moreover, if we want to do deep packet inspection, we will face serious privacy concerns among the customers of the ISP. One readily available data source are flow-capture utilities such as NetFlow and IPFIX; these data sources are already collected by service providers for other operational purposes.
L: The key insight of our work is that the devices we studied show repeating patterns of communication that appear even in sparsely sampled data. We generated detection rules using the extremely limited packet fields available, and we were able to generate detection rules for devices from seventy-seven percent of the studied manufacturers, and we detected devices in a dataset from an ISP within minutes to hours.
L: We leveraged the fact that IoT devices, in order to provide their services, have to communicate with certain backend infrastructures; if we focus on the destinations contacted by these devices, we can find out which of the subscribers of this ISP have which type of IoT device.
L: Second, we checked whether we can see traffic from a single device from a single vantage point, using the dataset from an ISP. Third, we identified which domains, IPs and port numbers can be used to generate detection rules for different devices, and then, of course, we generated our detection rules. And finally, we applied our methodology to a dataset from a large European ISP.
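A toy sketch of what applying such rules to sampled flow data could look like: each rule ties a device label to a set of backend destinations, and a subscriber matches when any of its sampled flows hits one of them. The rules, IP addresses and labels below are invented; the real rules are derived from the testbed captures described next.

```python
from collections import defaultdict

# Rule granularities per the talk: product level, manufacturer level.
RULES = {
    ("product", "amazon-echo"): {("52.119.196.1", 443, "tcp"),
                                 ("52.119.196.2", 443, "tcp")},
    ("manufacturer", "samsung"): {("54.81.0.10", 8883, "tcp")},
}

def classify(flows_by_subscriber):
    """flows_by_subscriber: {subscriber: set of (dst_ip, dst_port, proto)}."""
    hits = defaultdict(set)
    for sub, flows in flows_by_subscriber.items():
        for label, dests in RULES.items():
            if flows & dests:  # any sampled flow to a rule destination
                hits[sub].add(label)
    return dict(hits)

sampled = {"line-42": {("52.119.196.1", 443, "tcp"), ("8.8.8.8", 53, "udp")}}
print(classify(sampled))  # {'line-42': {('product', 'amazon-echo')}}
```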
L: We trigger devices to generate IoT traffic; next, we connect our testbeds to a home vantage point inside the premises of the ISP, push the IoT traffic to the internet, and capture the IoT traffic at these different locations. As shown in the figure, we observed IoT activity from more than 64 percent of the devices we tested in the ISP dataset.
L: As it's an involved process, I'm not going into the details, but as an overview, we generated detection rules at three different levels of granularity. The finest one is the product level, where we can say what type of product it is; for example, whether it's an Amazon Echo or not. The next level of granularity is the manufacturer level, where we could only say that it's a Samsung device.
L: In this figure we see the duration of the dataset that we observed, and on the y-axis we see the number of unique subscribers per hour that had the inferred IoT device.
L: The next question is what happens if we increase our observation period. Here we see the same plot as the previous one, but with a 24-hour observation period. We see that increasing the observation period helps detect even more IoT devices.
L: The next takeaway is that the number of detected devices is stable; there are not a lot of fluctuations in the number of devices.
L: Now, if we zoom in on these 32 different IoT device types, we have this plot here. On the y-axis we see each individual device; on the x-axis, the number of devices per day, per 24 hours. They are also categorized according to their Amazon ranking in the country of the ISP; the ones for which we didn't find a ranking are put into the "other" category.
L: For the rules, we use ports, protocols and destination IP addresses. Protocols as well, yes. And for finding which IP addresses to use for generating the rules, we used the domain names first, because we had captured the domain names in the lab; we cannot use all the IP addresses that are contacted by the devices.
D: Okay, and in your presentation you said how many different IoT devices you were able to identify.
D: Right, yeah, okay. I have one question from David Oren; could you please speak, please? David, how do I...
L: The first one is that if a device is known to be infected or participating in a large-scale attack, ISPs can notify the users, the owners of that device, and say: okay, you have a device that's infected. It has been shown, in the case of the Mirai attack, that ISPs were actively engaging in notifying customers with infected devices, and even taking extreme measures, for example by...
L: Not without a customer's consent. You will know that this customer has this type of device, and you don't need to repeatedly detect the same device for the customer, unless the device is moving from one customer to another or the device is only active for a few minutes there. In our setting, the subscribers were home users, fixed-line subscribers, not mobile ones.
D: Thank you. So thank you very much; I guess we defer the questions to the panel. We have to move on because of timing, so thank you very much, very interesting. And so, please: the next paper will be presented by Simon Bauer, who is a research associate at the Technical University of Munich. So please, the video.
G: At the same time, previous studies present methodologies to survey flow characteristics like flow duration, flow size or flow rates, but recent insights into flow characteristics in the internet are rare, and therefore our paper poses the question: how have flow characteristics changed during the last few years?
G
Well
before
we
start
talking
about
our
methodology
and
our
measurement
results,
let
me
briefly
introduce
a
scalable
flow
analysis
tool
implemented
in
go
that
provides
large
scalability
due
to
parallelized
packet,
parsing
and
flow
aggregation.
The
tool
is
published
as
free
and
open
source
with
our
paper.
G: For our study, we identified the start of TCP flows by the three-way handshake, of course, and we terminated a TCP flow when we observed a connection teardown, when there was an idle period in a flow for a certain timeout period, or when we observed a freshly established three-way handshake for a 5-tuple that is already tracked. For identified flows, we calculate the flow size as the sum of the layer-4 payload sizes.
G: We calculate the flow duration as the time interval between the first and the last packet we observe, and we calculate the flow rate as the average data rate, based on flow size and flow duration. For our study, we composed a dataset consisting of 28 traces provided by CAIDA.
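Condensed into code, the flow accounting just defined looks roughly like the sketch below: 5-tuple keyed flows, started by a handshake, ended by a teardown, an idle timeout, or a fresh handshake on a tracked 5-tuple; size as summed layer-4 payload, duration as last minus first packet, rate as size over duration. The real tool is written in Go and parallelized; the timeout value here is an assumption.

```python
IDLE_TIMEOUT = 60.0  # seconds; illustrative value, not the paper's

class Flow:
    def __init__(self, ts):
        self.first = self.last = ts
        self.size = 0  # sum of layer-4 payload bytes

    def update(self, ts, payload_len):
        self.last = ts
        self.size += payload_len

    @property
    def duration(self):
        return self.last - self.first

    @property
    def rate(self):  # average bytes/second over the flow's lifetime
        return self.size / self.duration if self.duration > 0 else 0.0

flows = {}

def on_packet(five_tuple, ts, payload_len, syn=False):
    f = flows.get(five_tuple)
    # Start a new flow on the first packet, after an idle timeout, or on
    # a fresh handshake for an already-tracked 5-tuple (the three rules
    # from the talk).
    if f is None or ts - f.last > IDLE_TIMEOUT or syn:
        f = flows[five_tuple] = Flow(ts)
    f.update(ts, payload_len)
```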
G: Such traces have anonymized IP addresses and no layer-4 payloads, and each trace provides one hour of traffic captured at a 10-gigabit-per-second ISP backbone link. As illustrated below on the timeline, we selected 23 traces taken in Chicago between 2008 and 2016, and five traces taken in New York between 2018 and 2020.
G: As you see, we have three periods without traces for several months. On average, we selected traces at an interval of three months, but there are three larger intervals without traces, simply because there are no traces available. Regarding pre-processing of the traffic: we only consider TCP flows that are longer than or equal to 200 milliseconds.
G: This is also done by related work and is proposed to filter out quite short flows, because calculated flow rates may be falsified for short flows, in the case of single-packet flows, or if all packets are sent back to back.
G: Let me point out two major findings. Regarding the 99th percentile of flow duration, here on top of the plot, we observe only a little increase during the years 2008 until 2013, but afterwards we observe an increase by a factor of 1.5 between June 2013 and March 2016.
G: Next, we were interested in the relevance of such heavy-hitter traffic. Therefore, we calculated the share of bytes transmitted by flows within the 99th percentile for each flow characteristic. We did not find a specific trend over time, so here the table shows the average across all traces taken in Chicago.
G
In
the
second
column,
on
the
in
the
right
column,
we
see
the
share
of
bytes
transmitted
by
different
percentiles
and
well,
especially,
the
flows
within
the
99th
percentile
of
flow
size
represent
a
large
share
of
totally
transmitted
bytes,
with
nearly
90
of
all
tcp
bytes
transmitted
by
such
one
percent
of
flows.
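That heavy-hitter statistic is straightforward to restate as code: take the flows at or above the 99th percentile of flow size and compute their share of all bytes. A sketch with a skewed toy distribution (the numbers are invented, not the paper's data):

```python
def top_percentile_byte_share(sizes: list[int], pct: float = 99.0) -> float:
    ranked = sorted(sizes)
    cut = ranked[int(len(ranked) * pct / 100)]   # 99th-percentile flow size
    top = sum(s for s in sizes if s >= cut)      # bytes carried by the top 1%
    return top / sum(sizes)

# Skewed toy distribution: a handful of elephants, many mice.
sizes = [1_000] * 990 + [10_000_000] * 10
print(round(top_percentile_byte_share(sizes), 3))  # -> ~0.99
```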
G: Further, we had a look at so-called big fast flows. Zhang et al. introduced a two-by-two taxonomy based on two threshold values to group flows regarding their size and their flow rate, and we had a closer look at the relevance of such big fast flows, which are represented by only a small share of flows but, as we will see, have a large relevance regarding the share of bytes that they transmit. So we defined three threshold pairs.
G: The first pair refers to the original threshold values from Zhang et al., i.e., one hundred kilobytes for size and 10 kilobytes per second for flow rate, and then we increased the thresholds by one order of magnitude each for pair two and pair three.
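The taxonomy itself is just a pair of thresholds; a sketch with the three threshold pairs from the talk (the dictionary layout and function name are ours):

```python
THRESHOLD_PAIRS = {        # (size in bytes, rate in bytes/second)
    1: (100e3, 10e3),      # original values from Zhang et al.
    2: (1e6, 100e3),       # +1 order of magnitude
    3: (10e6, 1e6),        # +2 orders of magnitude
}

def classify_flow(size: float, rate: float, pair: int = 1) -> str:
    """Place a flow in the two-by-two size/rate taxonomy."""
    size_thr, rate_thr = THRESHOLD_PAIRS[pair]
    return ("big" if size >= size_thr else "small") + "-" + \
           ("fast" if rate >= rate_thr else "slow")

print(classify_flow(size=5e6, rate=250e3, pair=2))  # -> 'big-fast'
```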
G: Let me highlight the increase in the share of bytes transmitted under the second threshold pair, illustrated in green. Here we observe an increase in the share of bytes transmitted by big fast flows from between 20 and 30 percent for traces from 2008 until 2010, up to between 40 and 50 percent of all bytes transmitted by TCP for the more recent traces taken in Chicago. The values for the New York dataset are smaller, which can be traced back to a larger share of small flows in those traces.
G: To conclude my talk, let me summarize our findings. As we have seen in the talk, we observe a significant increase in the 99th percentiles of flow duration and rate; we find a large significance of heavy hitters regarding the share of transmitted bytes; and we observe an increase in the relevance of big fast flows during the past years. There are further findings not included in the talk.
D: Thank you very much, Simon. We have time for one question before the panel.
D
Okay,
I
guess
people
are
different
for
the.
I
have
one
quick
question:
perhaps
there
there
has
been
a
change
in
flow
since
the
pandemic.
So
are
you
looking
into
that
the
change
of
the
frozen
last
year.
G: Yes, so we did not do that yet, but we definitely plan to. As you have seen, we worked on the CAIDA dataset, and CAIDA provides one trace per year, so we're looking for further datasets that allow a more fine-grained study, especially regarding the pandemic during the last few months. So yeah, this will be a topic for the future.
Q: Yes, hi Simon, thanks for the nice talk. I was wondering, I'm not sure about the dataset: do you know something about how the parallelization of flows changed over the years? Like the change from HTTP/1.1 to, let's say, QUIC or HTTP/2, where things start to get parallelized over the same connection, so you would see less parallelization.
D: Okay, so thank you very much. I think it's better to invite all the speakers to the room now and collect questions for everyone. I'm not sure how to do that, but, yes, okay, it's coming.
D: I guess there is one in the queue: Ali, you can just speak, please.
J: Yeah, so the question is for Simon: do you see anything related to the walking dead? So, is there any correlation between IPv6 and the ports correspondingly being used? Because the microservices architecture is dominating development, so I'm just wondering, have you considered that aspect as...
G: So we're currently working on adding plain IP addresses, and then we would be able to look at IPv4 versus IPv6 too.
D: Anyone else? Questions? If not, I have a question for Justin, for the first talk. Do you foresee, in your environment, any performance issues in implementing the solution, the collection of all the results, the performance overheads? Have you thought about that, or do you see any problem?
M: Yeah, so that's a good question. Can you hear me? Yeah? Perfect, okay. So actually the overhead, again, is all on IOAM; the overhead introduced by cross-layer telemetry is just a netlink call, so that's not that big, and all the overhead is on the IOAM side. We have already measured the impact of IOAM in another paper, and, unsurprisingly, the more you insert, the more it drops.
M
So
you
have
to
find
a
compromise
depending
on
your
hardware
and
and
a
lot
of
things.
M: And by the way, just a small notification to operators: IOAM is now available in the kernel, as of two or three days ago, and it will be available in the 5.14 version. So, okay.
H: I have a fairly general question for perhaps most of the presenters. The talk about spin bits: I mean, the spin bit obviously came from QUIC initially. For the other talks, does the presence of QUIC and the deployment of QUIC affect the types of systems you're building or the behaviors you expect to see?
D: So, is the question for everyone, Colin?
G: So we also considered detecting QUIC traffic in our dataset, but there was not a significant share, so we did not take a closer look.
D: I have one quick question for Ike. Did I pronounce your name correctly? Yes? Okay. Did you contrast the measurements you talk about in your paper, which are passive measurements, which is good, with active measurements for packet loss? What's the difference of doing that, or do you see any reason to compare?
N: The idea was to find out how well the different approaches can actually detect the loss that's happening, and actually, the normal traffic that we were sending was kind of the active-measurement part of that. I think the general idea of those approaches is also to not have to use active measurements, so that we can, yeah, keep the additional overhead on the network low and just measure on the already-passing traffic, without having to actually...
N: Yes, I think the main thing is that if you use active measurements, then you won't be able to capture the loss that is happening in your network under normal conditions, and thus you would always have a somewhat different picture than what you get when you just use those different loss techniques here. So I don't think we did that contrasting at this point.
D: Okay, I have a question for Simon, I guess. In your paper you say that seven percent of flows transmit at a rate between one kilobit per second and a hundred kilobits per second. I kind of wonder: do you expect these transmission rates, or did I get it wrong? It seems a slow transmission rate. Is there any reason for that, or do you expect that, or am I just...
G: So, if I understood you correctly, you mentioned that we observed a large share of flows within quite a small range of the rate distribution?
D: Right, yes.
G: Yeah, so spontaneously I don't have an explanation for this. We plan to have a closer look at ports and even IP prefixes, which may allow us to distinguish between different kinds of flows; then I might be able to answer your question.
D: Folks, well, I guess people are... maybe in Europe, well, it is late in Europe, not for me; people want to sleep. So, very nice talks; I learned a lot from you, and I hope the audience enjoyed them; we had a large crowd watching. I will certainly refer your work to my students. So thank you, everyone, for being here. It was really nice, and I hope the audience enjoyed it. So thank you very much, everyone.
D: Let's close the session for today. Okay, so thank you. Thank everyone.