From YouTube: IETF111-ANRW-20210728-1900
Description: ANRW meeting session at IETF 111, 2021/07/28 19:00
https://datatracker.ietf.org/meeting/111/proceedings/
A: Hello. I can be heard, but not seen, apparently, because Meetecho is not letting me turn my video on. How about now? Oh, there we go.

A: Andre, I have the videos queued up too, but since you're sharing the screen, do you want to play them? I just wanted to check.

B: We will help; we'll just play the videos.

A: Check, testing. I see audio waveforms, suggesting that I can be heard. Okay, awesome, great. So welcome, everybody. We...

A: I cannot hear you. Oh, that is why. Hold on a sec, I need to unmute this tab. Hello, can you hear me?

A: Oh, perfect, yeah. I was muting myself, you know, because there are 16,000 Zoom meetings going on at the same time. So.
A: Welcome, everybody, to session five, this session on DNS and privacy. Again, we'd like to thank the sponsors, but also, let's remind everybody of the logistics and the links here. So there is a Slack channel. If you have, you know, questions and such, that's a good place to put them. I will see them; Andra, our talented session chair for the last session, will also see them. That's a good place to do that.

A: The URL here, for your convenience, where you can access the program, the papers, and also the videos that you're about to see: they are all on the program website, thanks to Colin Perkins, who's also here. Great, thank you for putting everything up. So you can watch those again if there's something you really liked, and the papers themselves are also available in the ACM Digital Library, for your viewing pleasure.

A: If, for some reason, you want to follow up and read the papers, they're all available there. As a reminder, all these sessions are recorded, and the recordings will be available on YouTube after the workshop.
A: So, looking forward to an exciting session. Okay, yeah, a couple more points here: the videos of the presentations that we're about to see are pre-recorded, so we'll take questions at the end in a five-minute slot. And then, at the end of each session (we've got a two-hour slot, and each session runs an hour), we're going to end each of these sessions with a 15-minute panel for Q&A with all the authors presenting in the session.

A: Our rationale for that was, we thought it'd be a little bit more lively than just, you know, a back-and-forth with a single author after each presentation. So we organized things that way; hopefully we'll have a fun discussion on DNS privacy towards the end of the session, yeah.

A: Just some Meetecho stuff: to ask a question, just enter the queue, the mic-and-hand logo, and then I will call on you and enable the audio. And if you need more information on how to use Meetecho, there it is. Next slide, yep.
A: So here we are, session five, and our first presentation will be from Austin Hounsel. Austin, I see you're here. Long time, no chat, Austin. And not only am I the session chair, I'm also a co-author on this paper, so no hard questions, right?

A: No, but Austin, we're gonna play the video now, so we can queue that up. We can ask the Meetecho folks, wherever they are, to play "Encryption Without Centralization."
F: Hi, my name is Austin Hounsel, and I'm a PhD student at Princeton University. I'm going to be presenting some work today about distributing DNS queries across recursive resolvers to reduce centralization. This work was done in collaboration with Paul Schmitt, Kevin Borgolte, and Nick Feamster.

F: So, as we know, DNS privacy has become a significant concern within the past few years. We know that on-path network observers can infer which websites you're visiting by looking at your plaintext DNS queries. This kind of inference can occur within governments, or in coffee-shop networks, et cetera. So two protocols have been proposed to encrypt DNS traffic. One is DNS over TLS, or DoT. The other is DNS over HTTPS, or DoH, and we're seeing that, increasingly, DoH is being rolled out by various browser vendors and other software developers.

F: In fact, we're seeing that Firefox now uses DoH as the default DNS protocol for US users. Which, you know, encryption is a great thing; it's great that encrypted DNS is being increasingly deployed. However, with it, some users are growing concerned about ways in which the DNS may be centralized into a handful of large operators. And so we stand to ask: well, is there a way we can get encrypted DNS resolution while distributing our queries across multiple operators, instead of centralizing into a small set?

F: The stub resolver, then, as it receives queries, distributes them across multiple resolvers according to some user-specified strategy, such as a random model or a round-robin model. To prototype this, we forked dnscrypt-proxy, which supports DoH and DNSCrypt, and we added support for the hash and round-robin models, and we also evaluated some existing code that was within the proxy for random query distribution. We evaluate the performance of this.

F: Similarly, the round-robin model behaves as you might expect: the first query goes to the first resolver, the second query goes to the second resolver, the third to the third, and so on and so forth. And lastly, the random model also behaves as you might expect: the queries go to various resolvers without a particular order.
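The three distribution strategies described in the talk (hash-based, round-robin, and random) can be sketched in a few lines. This is a hedged illustration, not the dnscrypt-proxy implementation: the resolver names and the choice of SHA-256 as the hash are assumptions made for the example.

```python
import hashlib
import itertools
import random

# Hypothetical resolver labels, standing in for real DoH resolver endpoints.
RESOLVERS = ["resolver-a", "resolver-b", "resolver-c"]

def hash_strategy(qname: str, resolvers=RESOLVERS) -> str:
    """Hash model: the same domain name always maps to the same resolver."""
    digest = hashlib.sha256(qname.encode()).digest()
    return resolvers[int.from_bytes(digest[:4], "big") % len(resolvers)]

def round_robin_strategy(resolvers=RESOLVERS):
    """Round-robin model: first query to the first resolver, second to the
    second, and so on, cycling forever."""
    return itertools.cycle(resolvers)

def random_strategy(qname: str, resolvers=RESOLVERS) -> str:
    """Random model: queries go to resolvers in no particular order."""
    return random.choice(resolvers)
```

The hash model's key property, that repeated lookups of one name never reach a second resolver, falls directly out of `hash_strategy` being a pure function of the name.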
F: So if we look at this figure, which is a CDF of TCP and TLS setup times for CDN content, we can read the legend in this way: if you look at the Cloudflare-Google line, what we're seeing there is the CDF of TCP and TLS setup times when you use Cloudflare's DoH resolver to resolve Google CDN content. And essentially, what we're seeing here is that there really isn't too much of a difference being made in terms of using one particular resolver to resolve CDN content.

F: So these are measurements that were taken from four Amazon EC2 vantage points, in which we performed page loads using an instrumented instance of Firefox with Selenium in a headless browser, and we performed measurements with three different query distribution strategies. And so what we're seeing is that no single resolver, once again (or rather, no single model), seems to perform the best or the worst across different vantage points; we're seeing slightly different effects on page load times.

F: But everything looks roughly the same. And then, when you turn to a comparison of using a single resolver for all your queries, we're seeing a similar story, where each resolver seems to perform somewhat similarly in terms of its effect on page load times, whereas at certain vantage points you're seeing maybe a slightly higher effect than at others. So, for example, if you look at Oregon, you might see that there's a much more pronounced difference in page load times with certain resolvers.

F: Lastly, we look at the effect on how many unique domain names are seen by DNS resolvers whenever you use query distribution strategies. So we utilized a real-world data set from Allman et al. of about 100 homes in Cleveland, Ohio, that were connected to a fiber-to-the-home network. We utilized this traffic to perform our query distribution strategies after the fact, so to say, to kind of simulate...
F: ...how queries are seen by various resolvers on average. And this is fairly intuitive to understand, because if you think about it, in a hash-based model, queries for the same domain name should always go to the same resolver. It shouldn't be the case that two resolvers ever receive a query for the same domain name. Whereas if you look at the round-robin model, within the first couple of weeks we see that each resolver starts to gain access to more and more unique domain names from a given client.
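The intuition in this part of the talk, that a hash-based model confines each unique name to exactly one resolver while round robin spreads repeats of the same name across resolvers, can be checked with a small simulation. The resolver names and the toy query stream below are invented for illustration only.

```python
import hashlib
import itertools
from collections import defaultdict

RESOLVERS = ["r1", "r2", "r3"]

def unique_domains_seen(stream, strategy):
    """Return, per resolver, the set of unique names it observed."""
    seen = defaultdict(set)
    for qname in stream:
        seen[strategy(qname)].add(qname)
    return seen

def hash_pick(qname):
    # Keyed only by the name, so repeats always hit the same resolver.
    digest = hashlib.sha256(qname.encode()).digest()
    return RESOLVERS[int.from_bytes(digest[:4], "big") % len(RESOLVERS)]

_rr = itertools.cycle(RESOLVERS)
def rr_pick(_qname):
    # Position in the stream decides, so repeats of one name spread out.
    return next(_rr)

stream = ["a.test", "b.test", "a.test", "a.test", "b.test", "a.test"]
by_hash = unique_domains_seen(stream, hash_pick)
by_rr = unique_domains_seen(stream, rr_pick)
```

With two unique names in the stream, the hash model exposes exactly two (name, resolver) pairs in total, while round robin exposes more, matching the growth the talk describes.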
F: So, in summary, we present the design and prototype implementation of a refactored stub resolver architecture that allows for decentralized encrypted DNS resolution. We performed a preliminary evaluation of the stub resolver's performance, and utilized a real-world data set to evaluate how different query distribution strategies affect how many queries are seen by recursive resolvers.

F: What does your performance look like if you use a lot of resolvers, resolvers that are in different geographic locations? And how does this affect a client's privacy in a formal way, rather than the preliminary evaluation done here? Is there a way that we can formalize the privacy costs and benefits of using different query distribution strategies?
A: Great talk, yeah. Hey, Austin, all right. Let's see, I don't know if we have your video on, but I see a queue of two questions. So let me figure out how to turn my video on, yeah. I've got a question from Jim. Oh, oopsie, "remove from queue". Jim?
H: Yeah, yes. Absolutely excellent talk, Austin. I was a little bit confused by you saying that when you're querying these resolving servers, things are working out fine between, say, Google and Cloudflare and all the rest of it. If I understand correctly, the Cloudflare resolver service doesn't implement EDNS Client Subnet, so the information it's returning may not necessarily be an optimal IP address, save for something that's hosted on another network, or because of the routing information.
F: So, thanks for the question. I would say that we've been thinking more and more about CDN localization as we start thinking about doing collaborations with some folks in Cape Town, and we're planning on doing some measurements across Africa to see, you know:

F: okay, can we take these resolvers in North America and these clients in North America and extend our results to other countries, where we see similar effects in CDN localization? And what we're quickly starting to realize is that we're gonna have to think a bit more, a bit harder, about the evaluation we've done so far with regards to CDN localization; it's not quite as formal or as sophisticated as it could be. And we've definitely been thinking about questions related to EDNS for future work.

F: But that's not something we explored for this paper.
A: Let's see, we've got a question from DKG, if I can... hi.
G: Hey, thanks for doing this work, Austin. I really appreciate that you're trying to figure out how to formalize this, and I wonder whether you have any thoughts about the privacy implications of the two schemes that you have, beyond information learned from the DNS resolver. So, for example, in the hash-based scheme, it looks to me like each client is basically... I'm assuming that each client doesn't pick the same hash-based scheme; they're probably keyed by some individually selected hash. And I wonder if that...

G: ...which lookups you do to which servers: a resolver could record that, and then have a fingerprint of, like, who you or your device is, that they would find over time. So when you're formalizing the privacy evaluations, are there metrics like that that you plan to include? Like, how are you thinking about this?
F: So, gonna be honest, that is a scenario that I had not thought about in my head: the potential for using that kind of hashing strategy as a super-cookie. And I think that's very interesting to think about. You know, another kind of question that comes to our mind...

F: ...is, you know, yeah. So it may seem beneficial in the hashing scheme to use the second-level domain of a domain name as a way to ensure that all domains that are related to each other go to the same resolver. But then there's a further question: does it matter whether certain domain names go to certain resolvers at all? In the sense that, you know...

F: ...maybe there are certain sensitive domain names that you wouldn't want to go to a certain operator's resolver, right? Maybe that would reveal more information about you than you would like. So there's also a question of, you know: is there a way that a client could provide some kind of preference about the resolvers that it might wish to use, to avoid certain ISPs, or to avoid certain geographic locations?

F: And, you know, there are a lot of considerations beyond just, okay, how many resolvers do you use, or which resolvers, but also in what locations, across which ISPs. You know, if they're ISP-operated resolvers, maybe you wish to use them or not. So those are also considerations we're thinking about. But I guess, to answer your question directly: I had not thought about that scenario that you brought up, and that's definitely something to write down as we start thinking about formalizing this for future work. So I really appreciate that question.
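DKG's super-cookie concern can be made concrete: if each client keys its hash with its own secret, then the pattern of which probe names land on which resolver becomes a stable, client-specific fingerprint that colluding resolvers could observe. This is a minimal sketch of that scenario; the keys, probe names, and resolver labels are all hypothetical.

```python
import hashlib

RESOLVERS = ["r1", "r2", "r3", "r4"]

def keyed_pick(qname: str, client_key: bytes) -> str:
    """Per-client keyed hash: the mapping differs from client to client."""
    digest = hashlib.sha256(client_key + qname.encode()).digest()
    return RESOLVERS[int.from_bytes(digest[:4], "big") % len(RESOLVERS)]

def fingerprint(client_key: bytes, probe_names) -> tuple:
    # The pattern of which probe name lands on which resolver is stable for
    # a given key, so it can act as a re-identifying super-cookie.
    return tuple(keyed_pick(q, client_key) for q in probe_names)

probes = [f"site{i}.test" for i in range(16)]
fp1 = fingerprint(b"client-1-key", probes)
fp2 = fingerprint(b"client-2-key", probes)  # almost surely differs from fp1
```

The fingerprint is deterministic per key, which is exactly what makes it a tracking vector: the same device produces the same pattern every day.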
I: Hi, can you hear me? So this is slightly orthogonal to the idea of privacy, but it feeds into the idea of user experience. When you're talking about collecting sets of recursive resolvers, it might be worth including in that the characteristics of the resolvers, in terms of, for example, filtering or doing DNSSEC, because otherwise you'll potentially get different results from different resolvers, which could lead to a very strange user experience and intermittent behavior. So, just another thing to add into your selection criteria.
F: Yeah, absolutely. And I should say as well that we've already begun doing some work to explore whether we can do some kind of a user study, to explore: okay, do users have any understanding of what it might mean to distribute queries across multiple resolvers? What is DNS privacy to users, even technical users, right? And that's very much going to inform, among a wide variety of questions and preferences, what the default settings of such a system would be.

F: It's something that, frankly, we're kind of hesitant to provide any kind of answer for at first. Like, you know, maybe there shouldn't be a default strategy. But in terms of user experience, users probably do need some kind of default, as they're not going to set it themselves, and it's not immediately clear to us whether one strategy is better than another, or whether or not some users may want filtering resolvers or not. In terms of, like, you know...

F: If I'm a parent in a household, maybe I want a filtering resolver, versus somebody that is not a parent: maybe I don't want a filtering resolver at all. And how do you make these kinds of default decisions for users without making them get bogged down too much in the details?

F: And so that's definitely a consideration that comes to mind, especially as we look at various lists of DoH resolvers out there. Like, I think DNSCrypt has one on their GitHub page, and I think in one of the columns they indicate whether or not it's a filtering resolver, and that was something that came to our mind. Like, well, yeah, if we're going to be doing measurements across many resolvers, we've got to be sure that we're going to get the answers that we expect, just from a scientific perspective.
A: Siobhan, we have one more question in the queue. I know we're a little over time for the first talk, but let's take a look. Can you hear me?
J: Oh, cool, thanks, yeah. This is not a fully-thought-out idea, but to address what DKG brought up earlier about the fingerprinting concern: would it help at all if, globally, it was the same domain-name-to-resolver mapping, and all clients would use the same, you know, second-level-domain-to-resolver mapping? This is not ideal, but at least... I mean, you would still be able to find out that, for a certain domain name, everyone's queries are on this...

J: ...resolver. Like, an attacker would be able to do that, but, I mean, this would still be better than the status quo, perhaps.
F: Yeah, that might be better. I mean, something to consider, too, is whether even the hash-based strategy is the best we can do in terms of privacy, right? Like, even if we all agreed that we should use the same hashing scheme for all clients, maybe we can do better in terms of privacy than hashing domain names. It's something that we're still thinking a lot about, because we're not entirely convinced that that's the best we can do; it's something we're still actively thinking about.
A: Thanks, Austin. Great questions. I see there's one more question, but I think we should probably get on. So, Shawn, if you could hold your thoughts until the panel, that would be appreciated. Hang on to that thought, because we're definitely up against time here. So, Meetecho folks, maybe we can queue up our second talk, which is "Institutional Privacy Risks in Sharing DNS Data."
L: In our work, we introduce a new aspect of DNS privacy, which we call institutional privacy. It is concerned with the behavior of an institution's traffic as a whole, in contrast to individual privacy. This has not been closely studied before, but it's important to look at, because an institution's internal activities, such as sending or receiving an email, can leave a digital trail in the DNS ecosystem.

L: Based on this motivation, our contributions in this work are, first, to define institutional privacy as a new privacy risk in DNS. Within the model that we define, we give a methodology for finding institutional privacy leaks. We then apply this methodology to real-world data that's anonymized, show the privacy risks, and also demonstrate that the anonymization method used is not sufficient to prevent institutional leaks.

L: Examples of specific activities that may be confirmed through DNS include, for example, two institutions sending or receiving email between each other, which may reveal a relationship between them that's not known publicly. Another activity is an employee of a company accessing a privacy-sensitive website, or a website that's considered embarrassing from the company's perspective, such as an illegal or an adult website, while on the company's network.
L
For
our
threat
model,
we
consider
an
adversary,
that's
the
authoritative
server
and
has
either
accessed
the
server
logs
or
the
traffic
between
the
recursive
and
authoritative
server,
and
the
goal
of
the
adversary
is
to
associate
the
source
ip
and
the
domain
name
and
the
query
in
the
query
to
the
corresponding
institution,
the
adversary
targets,
an
institution
that
meets
two
kind
of
conditions.
The
first
one
is:
the
institution
must
run
its
own
recursive
resolver.
L
This
would
let
the
adversary
use
the
resolver's
ip
address
to
uniquely
identify
the
dns
traffic
of
the
institution,
and
the
second
condition
is
the
institution
must
route
traffic
from
its
own
autonomous
system,
and
this
would
let
the
adversary
map
the
resolver's
ip
address
to
the
corresponding
autonomous
system.
That's
owned
by
the
institution.
L
We
study
queries
from
66
institutions
that
meet
the
conditions
that
I
just
listed
and
also
represent
a
diverse
set
of
sectors
such
as
largest
mp500
companies,
government
institutions,
university
of
southern
california,
schools
and
so
on,
and
when
we
pick
those
companies,
we
exclude
institutions
that
have
apparent
deniability,
such
as
isps
or
hosting
service
providers,
because
queries
from
from
this
companies
might
might
be
coming
from
their
customers
and
not
from
their
employees
and
examples
of
potential
real
world
adversaries
in
our
model
include
dns
service
providers
and
also
researchers
that
have
access
to
data,
that's
shared
by
the
service
providers
for
research
purposes
and
also
government
level
actors
that
have
the
means
to
piece
drop
on
dns
traffic.
L: For the source IP, we use public IP-to-AS-number mapping data to map the source IP to the institution that owns the AS, and this works even if a partial, prefix-preserving anonymization method is used. Second, for the domain name, we use public WHOIS data to map the domain name to the institution that owns the domain. Here we assume the domain is sent in full, that is, that QNAME minimization is not being used. Once we identify queries related to an institution, we filter and find queries that are related to email exchange.
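The attribution pipeline just described, mapping the source IP through public IP-to-AS data and the queried name through WHOIS, then filtering for email-related queries, might look like this in outline. The lookup tables, the `/24` prefix granularity, and the DNSBL zone suffix here are toy stand-ins, not the datasets or rules from the paper.

```python
# Toy stand-ins for public IP2AS and WHOIS data.
IP2AS = {"192.0.2.0/24": "ExampleCorp"}     # prefix -> owning institution
WHOIS = {"examplecorp.com": "ExampleCorp"}  # 2nd-level domain -> owner
DNSBL_ZONE = ".dnsbl.example."              # hypothetical blocklist zone

def prefix_of(ip: str) -> str:
    # Prefix-preserving anonymization keeps the top bits, so a /24 lookup
    # still works even when the bottom 8 bits are anonymized.
    return ".".join(ip.split(".")[:3]) + ".0/24"

def attribute_query(src_ip: str, qname: str):
    """Map a (source IP, query name) pair to (querying, queried) institutions."""
    source_inst = IP2AS.get(prefix_of(src_ip))
    sld = ".".join(qname.rstrip(".").split(".")[-2:])
    target_inst = WHOIS.get(sld)
    return source_inst, target_inst

def is_email_related(qtype: str, qname: str) -> bool:
    # MX lookups and DNSBL lookups are the email-related signals studied.
    return qtype == "MX" or qname.endswith(DNSBL_ZONE)
```

The point of `prefix_of` is the paper's observation: anonymizing only the bottom eight bits of the address leaves the AS-level attribution fully intact.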
L: We apply this methodology to one week of B-Root data, and we also reproduce our results on a second week. This data set is anonymized using a prefix-preserving method: the bottom eight bits are anonymized. We ran this experiment with IRB approval and with the permission of the data owners. The research questions that we looked at are, first, how common are sensitive, email-related queries, and then, within these queries, are we able to find specific relationships between institutions that are not otherwise publicly known?

L: So first we looked at how common MX and DNSBL queries are in the data set. In this plot, the x-axis is the seven days of data that we looked at, and the y-axis is the number of queries in millions. We can see that several million DNSBL and MX queries are made each day, which can be a significant source of leakage of email-related traffic. And within these millions of queries, we can go further and ask:

L: are we able to narrow down and look for specific relationships between institutions? In this slide, the plot on the left shows a breakdown of those MX queries by the different sectors of companies that we studied, and the plot on the right narrows down even further and looks at queries made to specific institutions, such as Palantir and Well.
L: So, based on these results, our recommendations are for institutions to deploy QNAME minimization wherever possible; we recommend even faster adoption of this mechanism.
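For context on this recommendation: with QNAME minimization (RFC 7816, updated by RFC 9156), a resolver asks each delegation level only for the next label, so the root sees `com.` rather than the full name. A toy illustration of the sequence of names exposed per level:

```python
def minimized_labels(qname: str):
    """Names exposed at each delegation level under QNAME minimization,
    root-facing query first: the root server sees only the TLD label."""
    labels = qname.rstrip(".").split(".")
    return [".".join(labels[-i:]) + "." for i in range(1, len(labels) + 1)]
```

For `mail.example.com.` this yields `com.`, then `example.com.`, then the full name, which is why a root-level observer loses the domain-name side of the attribution described in the talk.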
L: Another option is to use a mechanism called local root, which caches the root zone locally, so that the recursive resolver does not have to make queries to the root server. And for service providers, our recommendations are: first, we have shown that host-only anonymization is not sufficient for protecting institutional privacy.

L: So we recommend that service providers put legal constraints in place when sharing DNS data, and, for the case of wider sharing where that's not possible, that they look into more rigorous privacy-preserving data-sharing approaches. In this work, we have shown that DNS queries may leak significant institutional information that's otherwise not publicly known.

L: Therefore, we recommend that institutions deploy QNAME minimization wherever possible, and that service providers evaluate institutional privacy risks when sharing DNS data for research purposes. With that, I conclude my talk. Thank you for listening. Our paper and data can be found at the link shown on this slide. Thank you.
A: Thanks, thanks so much. I was waiting to see... oh, good, we've got Hashem in the question queue. Hashem, I'll remove you, and hopefully you can speak up.

A: Basi, can you hear the question? The question is about: have you considered the use of differential privacy to essentially make it more practical, or possible, to share some of this data across institutions?

A: Yes, the question from Hashem was: have you considered the use of differential privacy to make it more feasible, more practical, to have institutions or organizations share this DNS data that you're talking about in your talk?
A: Seeing nothing else in the queue, and given that we're a little bit behind time, I think I'll ask Meetecho to queue up the third video. That'll leave us, hopefully, time for Q&A on that talk, as well as a little bit of group panel discussion at the end, where I'll ask all of our speakers to join again. So here we go: talk three.
N: Hello, everyone, welcome to our presentation. I am Tianxiang, and I am here to present our work, "DNS over TCP Considered Vulnerable." This is work together with Dr. Haya Shulman and Professor Michael Waidner. We are all from the German National Research Center for Applied Cybersecurity ATHENE in Darmstadt, Germany.

N: Our motivation is based on our previous work on DNS over UDP. We all know that DNS over UDP is vulnerable to IP-fragmentation-based attacks. So then another question is: what about DNS over TCP? Is it also vulnerable to IP fragmentation attacks, or not? We searched the related works, and we found that it's widely believed that IP fragmentation attacks don't work with TCP. For example, there's a Best Current Practice draft, "IP Fragmentation Considered Fragile."

N: It says there are alternatives to IP fragmentation, which is using TCP with PMTUD. There's also another Best Current Practice draft, on fragmentation avoidance in DNS, which says TCP is considered resistant against IP fragmentation attacks. Also, at last year's DNS-OARC meeting, the event for the DNS operations community, it was said that TCP normally implements PMTUD and can avoid IP fragmentation of TCP segments.

N: So is it real? Does DNS over TCP really protect against IP fragmentation attacks? To check this, we designed our evaluation in the Internet: we want to trigger fragmentation over TCP on name servers in the Internet, and then compare with UDP.
N: As a comparison, we do a similar evaluation over UDP, but considering that UDP is a connectionless protocol, it's a little bit different. We first send out the DNS request and then get the response. Afterwards, we send an ICMP "packet too big" instead of waiting for the retransmission, since UDP doesn't do retransmission. Then we send another DNS request, the same as before, wait for the second DNS response, and check whether it is fragmented or not.

N: When we checked among those domains only vulnerable to fragmentation over TCP, we found that there are even 76 domains which had the TC bit set when we tested them over UDP. So what does that mean? TC means truncated: it is set when the packet is too big, when the DNS payload is too big for UDP, or when the server is limited.
N: Based on our previous evaluation, we propose a potential exploit. As we present here, this is an example of a fragmented DNS packet over TCP. The packet above is the first fragment; the packet at the bottom is the second fragment. So we can inject a malicious payload into DNS via IP fragmentation over TCP.

N: In UDP there is the UDP source port as a challenge; here in TCP, too, the TCP challenges are the TCP source port (here, since we show the packet from the name server to the resolver, it's the destination port), and also the TCP sequence number and acknowledgment number. Both of them are 32 bits, so it's believed to be really hard to spoof. But, as you can see, both of these challenges, the TCP challenges and the DNS challenges, are in the first fragment.
N: There are many works making use of IPIDs; I won't dig into the details here, but we did a quick check. We found that there are even 2,000 domains still using globally sequential IPIDs for TCP. That means they use a global counter for the IPIDs of all their TCP connections.
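A globally sequential IPID counter is easy to spot from a handful of back-to-back probes: the IDs increase by small steps regardless of which connection they belong to, which is what lets an off-path attacker predict the IPID of a victim fragment. A rough heuristic sketch; the step threshold is an assumption for illustration, not a value from the paper.

```python
def looks_globally_sequential(ipids, max_step=8):
    """Heuristic: a host using one global IPID counter returns IDs that
    increase by a small step across back-to-back probes (mod 2**16)."""
    steps = [(b - a) % 65536 for a, b in zip(ipids, ipids[1:])]
    return all(0 < s <= max_step for s in steps)
```

A host with randomized IPIDs would fail this check almost immediately, since consecutive differences would be scattered across the 16-bit space.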
N: So what are the countermeasures? We propose countermeasures in different layers. For example, for IP, the easy, dirty way is just to filter the fragments, or just to filter the small fragments, like Google does: Google filters all the small fragments smaller than 500 bytes, which makes it less vulnerable. But sometimes filtering the fragments will make the network not work well. So another way is just to randomize the IPIDs, to make them hard to spoof, but still, sometimes it's still vulnerable to some side-

N: channel attacks, like before. Another way is to disable PMTUD at the TCP layer. Other research works found that PMTUD is not really useful, so we can just disable it, and we can also simply filter ICMP "packet too big" messages. Another way to counter these attacks is just to enable DNSSEC, because DNSSEC is like the final solution to all DNS cache poisoning attacks, but we just need to make sure that DNSSEC is configured properly.
A: Thanks, thanks so much, Tianxiang. Do we have questions? I think, yeah, I'm here to take questions, if we have any questions for the speaker.

A: Yeah, I see Mark Andrews in the queue. Mark, do you want to ask a question?

A: Mark, you're up. Oh, I probably have to "remove from queue". Let's do it.
K: [Did you look at] DNS cookies, as in clients using DNS cookies, to see whether they were actually affected? And the second question is: have you looked at the proposed well-known TSIG as a protection mechanism? There is a draft from 2019 about that, with my name on it as well.

K: There's a... using a well-known TSIG shared secret is one possible countermeasure. In other words, using TSIG for every DNS transaction, with a well-known shared secret, is also a countermeasure, which is theoretically doable at the DNS level. And the other question was: did you actually do an evaluation against a client-server pair which supports DNS cookies?
A: I see a question from Punit here; Punit's in the queue. Let me take Punit.

P: I think, looking to the future, that is the long-term solution to transport-level cache poisoning risks. Of course, there's still a lot of work to do, but it should be mentioned here.
N: Yeah, yeah, but the transport-layer protections only work if they are widely deployed everywhere. So, for example, the DNS over HTTPS mentioned before: it only protects from the client to the resolver, but still doesn't protect from the resolver to the name server. So if you really want to use the transport-layer protection protocols, they must be deployed everywhere. That's why we didn't include them in the countermeasure part. But that's true; that's a suitable countermeasure. Thanks.

D: Nick, we cannot hear you; you're muted. Sorry.
A: Sorry about that. We've got about six minutes to go. I see Peter in the queue, and I think let's take Peter's question if it's for Tianxiang. But let me invite our previous two speakers back on stage, if you will, so that we can have a more general discussion about these three papers.
E: Hello. I believe you said in the chat that when you cause a TCP server to fragment, they stop setting the DF bit, right?

N: Yes; setting it, that would be a mitigation, yeah. It's like a quick, dirty mitigation.
A: Hashem, I see you're back in the queue, so, Hashem, welcome back.

M: Yes, thank you.
F: Yeah, that seems reasonable. I'm not exactly sure what metric we would use to measure load, but the idea seems correct. I mean, something else that we experienced, actually, as we were testing the software, is that, from the UX perspective in general, with these strategies there may be some scenarios where, if a resolver goes down, a user may be able to use certain websites but not others. I'm not sure whose microphone it is, but there's some feedback.

F: Right, so you can imagine, as I think you're suggesting, that if a resolver goes down, that may be a problem for users, because they may be able to access some websites but not others, and from a UX perspective that might be hard to understand. So, certainly, if there's some way we can test out load or reliability...
M: Yeah, so there are different ways to measure how busy a given resolver is: by, for example, measuring the CPU utilization. You can also count the number of queries you send to a given resolver, and if that many have been sent to it, you can guess that it is very busy and has more load than others.
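The query-counting idea mentioned here can be folded directly into a distribution strategy: track how many queries the client has sent to each resolver and prefer the least-loaded one. This is a minimal sketch of that idea; the class and resolver names are invented, and real load would also reflect other clients' traffic, which a single stub cannot see.

```python
from collections import Counter

class LoadAwarePicker:
    """Approximates each resolver's load by the number of queries we have
    sent it ourselves, and always picks the least-loaded resolver."""

    def __init__(self, resolvers):
        self.counts = Counter({r: 0 for r in resolvers})

    def pick(self) -> str:
        resolver = min(self.counts, key=self.counts.__getitem__)
        self.counts[resolver] += 1
        return resolver
```

With equal counts this degenerates to round robin, which is one reason the talk treats query counts as only a rough proxy for busyness.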
M
Things like that, mm-hmm, yeah. We did something for load balancing: we have many servers and we would like to load-balance the traffic to them. We have ways to make sure we distribute the traffic fairly, in such a manner as to optimize the performance of the whole system.
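The load-aware selection idea discussed here can be sketched in a few lines. This is an illustrative sketch only, not code from any of the papers: the class name and the use of in-flight query counts as a proxy for load (since a client cannot observe a resolver's CPU utilization directly) are my own assumptions.

```python
import random
from collections import defaultdict

class LoadAwareSelector:
    """Pick a resolver, biasing away from ones with many in-flight queries.

    The in-flight counter is a stand-in for real load signals such as
    CPU utilization, which a client cannot observe directly.
    """

    def __init__(self, resolvers):
        self.resolvers = list(resolvers)
        self.in_flight = defaultdict(int)

    def pick(self):
        # Weight each resolver inversely to its outstanding query count,
        # so a busy resolver is chosen less often but never starved.
        weights = [1.0 / (1 + self.in_flight[r]) for r in self.resolvers]
        choice = random.choices(self.resolvers, weights=weights, k=1)[0]
        self.in_flight[choice] += 1
        return choice

    def done(self, resolver):
        # Call when a response (or a timeout) arrives.
        self.in_flight[resolver] = max(0, self.in_flight[resolver] - 1)
```

Counting queries, as the speaker notes, is only one possible signal; the same skeleton would accept response latency or error rate as the weight input.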
F
Right, and that would seem particularly relevant for the random model, if that was something we wanted to use, because you wouldn't want to accidentally use a resolver for many different queries that is under high load, as you suggested.
A
Thanks a lot for your question, Sean; there's certainly a lot of interest. I don't see too many other questions in the queue. Since we have a panel here, let me pose a question.
A
One question that, I think, maybe relates to the first two talks a little bit more, but even in the case of, you know, Austin, your proposed solution: there's still a proxy that gets to see some of the traffic before it's encrypted, and so we're moving encryption from the browser into a different place in the network.
A
What
do
you
all
think
about
like
the
implications
of
that
for
for
visibility
on
the
on
the
on
the
privacy
side
and
maybe
on
the
security
side
as
well
right?
So
it's
just
different
entities
who
may
get
to
see
that
maybe
it's
your
isp
instead
of
the
browser
vendor
or
maybe
it's
nobody
if
you're
the
one
operating
the
proxy,
you
know
behind
the
the
cpe,
so
there's
a
bunch
of
different
ways
that
that
could
go,
but
that
affects
privacy.
F
I can go, but I don't want to preempt somebody else's thoughts or take too much time. One thing that you and I have talked about, Nick, is particularly related to enterprises: there might be some legitimate reasons why an enterprise might want to see the queries that are being sent on the network. I guess this is just true of using encrypted DNS in general, but the more it's deployed...
F
You
know
they
may
have
less
visibility
into
that,
and
so
one
thing
that
you
might
want
to
factor
in
for
a
query
distribution
strategy.
Is
you
know
if
there
are
certain
domain
names
that
are
supposed
to
go
to
a
split
horizon
resolver,
then
you
might
need
to
map
integrate
that
into
your
strategy,
somehow,
whether
it's
in
some
kind
of
a
preference
box,
you
might
imagine
in
the
application
that
runs
the
resolver.
You
might
build
an
indicate
like
for
these
cert
force.
F
These
domain
names
to
go
to
these
resolvers
and
the
user
might
be
able
to
specify
that,
and
so
then,
from
a
visibility
perspective,
the
enterprise
might
be
able
to
run
their
own
encrypted
resolver
and
still
be
able
to
see
the
queries
that
they
need
to
see
in
a
certain
sense
right
like
they
may
still
wish
to
have
access
to
all
queries
in
order
to
look
for
command
and
control,
et
cetera,
but
still
at
least
in
one
particular
use
case.
F
You
might
imagine
being
able
to
say
you
know,
for
these
particular
domain
names
ensure
that
they
go
to
this
enterprise.
You
know
split
horizon
resolver,
that's
one
thing
you
might
want
to
integrate
in
terms
of
visibility.
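The per-domain routing just described can be sketched as a small routing rule in front of whatever distribution strategy the client uses. This is a sketch under my own assumptions: the class and method names are hypothetical, and round-robin stands in for any query distribution strategy over the public resolvers.

```python
class SplitHorizonRouter:
    """Route queries for enterprise-internal suffixes to the enterprise
    resolver; distribute everything else across public resolvers."""

    def __init__(self, internal_suffixes, enterprise_resolver, public_resolvers):
        self.internal = [s.lower().lstrip(".") for s in internal_suffixes]
        self.enterprise = enterprise_resolver
        self.public = list(public_resolvers)
        self._next = 0

    def route(self, qname):
        name = qname.lower().rstrip(".")
        # Enterprise names must always reach the split-horizon resolver.
        if any(name == s or name.endswith("." + s) for s in self.internal):
            return self.enterprise
        # Simple round-robin stands in for any distribution strategy.
        r = self.public[self._next % len(self.public)]
        self._next += 1
        return r
```

For example, with `internal_suffixes=["corp.example"]`, a query for `wiki.corp.example` goes to the enterprise resolver while everything else is spread over the public set.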
A
And
bassey:
do
you
have
any
any
thoughts
like,
let's
suppose,
austin's
proposal
were
or
sort
of
were
implemented?
Well,
it
is
implemented.
Let's
say
that
it
were.
We
had
a
choice
of
where
to
deploy
that
kind
of
stub.
Resolver
functionality
like
you
could
move
the
encryption
point
to
you
know
within
the
home
network
for
all
devices
you
could
move
it
to
the
cpe.
You
could
actually
proxy
somewhere
outside
the
home.
You
could
yeah
there's
a
bunch
of
different
places
where
that
could
be
deployed.
A
Do
you
have
thoughts
on
that
and
how
that
would
sort
of
affect?
You
know
what
you're
trying
to
achieve
with
with
with
the
dns
sharing,
and
I
see
we're
over.
L
We haven't talked much about it from the encryption perspective, but just in general, when we're thinking about solutions to some of the threats we talk about in the paper: for example, one potential solution might be, instead of querying the servers directly, forwarding the traffic to another public recursive resolver that uses encryption. Well...
L
...what we found is that some of the solutions only shift the threat model and don't actually solve it, because now the traffic becomes visible to the operators of the resolver, even though the traffic is encrypted. So the data still leaks: the data stored at the server still leaks the information. So yeah, any solution that we consider needs to take these factors into account as well.
A
Hey, thanks, Betsy, and thanks to all our speakers, and also thanks to those of you in attendance who asked such great questions. This was a lively session, and hopefully at the next ANRW we'll see more papers on this topic. I can see that these are really nice ideas with, hopefully, a lot of potential for future work, as the questions brought up. So outstanding job, everybody; I really enjoyed it.
A
It is now my turn to turn things over to Andra, who is chairing our last session on applications and specifications. So I think at this point I'll still be in the Meetecho, but I won't take the mic.
A
I will also take this opportunity to bid our audience farewell, and also, since I do have the mic here, I'll say thank you once again to Andra, our great session chair for the last session, who was just an absolute pleasure to work with on this workshop.
A
I hope we get to collaborate again on things like this, and hopefully on other topics as well. And I think I'm probably not alone in saying: Colin, we could not have done this without you. There are way too many moving parts between the IETF and the ACM, so we really appreciate your experience on that. So I'll turn it over to Andra for the last session and the last words as well.
D
Thank you so much, Nick, and likewise, it has been a pleasure working together. We do have a closing piece at the end, so hold on for two more talks. I'm trying to show you what we have in store now, but apparently I'm not able to; sorry about this.
D
Yeah, apparently I'm not able to, but hopefully everybody has seen the program for the last session. We have two very interesting talks: one about how we can better interpret RFCs, and Bharath is here and he's going to do the five-minute Q&A after the pre-recorded talk; and then we're going to hear from Ali, and he's going to discuss, well, it's a call for action, essentially, about how we can write better specifications for particular applications.
D
So I'm just going to ask Meetecho to load the video, and meanwhile I'm just going to remind everyone: we're going to run the session just the way that we have seen before, so queue up for questions. Again, it's a pre-recorded video, so questions at the end, and we're going to have a panel at the end of this session with questions for both speakers.
Q
So what I want to ask is: what makes protocols special? Since then, we have grappled with something fundamental about network protocols: they're specified in English, implemented in code through interpretation of that English prose, typically in a relatively low-level language, but with a fundamental intent to interoperate with other implementations.
Q
We've struggled for a long time to achieve all of the properties, such as security and interoperability, flexibility, scalability, correctness, and extensibility. While this is sometimes due to the design and specification process, it's often because of implementations, with issues being discovered incrementally over time. Those implementations get improved, the specifications get improved, and then around we go again. But it's not clear that this is fundamental, and so what we might do is consider the different options that we have in protocol specification and implementation.
Q
Many have rightly observed that this opens the door to many possible mistakes, so the natural option is to formalize the specification process. Many languages and tools for formal specification of software generally, and protocols specifically, have been developed, and these formal specs enable a degree of certainty about what the spec author meant. But they're a pain to write and to read, even for specialists, and so they've been considered really only worth it for the most safety-critical applications. And then one can consider: well, if we have this formal specification, we can produce an automated implementation.
Q
Our approach is built upon a set of concepts and techniques in natural language processing known as combinatory categorial grammar, which is a way to do what's known as semantic parsing: how to decompose a sentence into sensible units. We're not natural language processing experts, we're networking people, so really what we wanted to do is take these techniques that have been developed in NLP and apply them to this context.
Q
We
use
this
grammar-based
approach
rather
than
a
more
popular,
deep,
neural
net
based
approach,
because
we
wish
to
retain
the
structure
of
the
text
so
that
we
can
figure
out
whether
multiple
interpretations
of
a
sentence
are
valid
and
how
those
should
be
fixed.
If
at
all,
so
what
kind
of
text
are
we
talking
about?
Q
So we built a tool called Sage that attempts to identify ambiguity in RFC text, taking advantage of its structured nature, and flag sentences that are ambiguous. Then, for the limited RFC features that we support, we take this disambiguated text and convert it directly into protocol code. This is a very early-stage tool; it's not ready for wider use, so really part of our motivation in presenting it to you all is to learn from the community and identify the next steps for where we want to extend it.
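The ambiguity-flagging step can be illustrated with a deliberately crude sketch: pair an RFC 2119 requirement keyword with a bare pronoun subject and flag the sentence. This is my own toy heuristic, not how Sage actually works (Sage uses CCG-based semantic parsing); it only shows the kind of signal such a tool surfaces.

```python
import re

# RFC 2119 requirement keywords; a sentence that pairs one of these with a
# bare pronoun reference often needs disambiguation before code generation.
REQ_WORDS = re.compile(r"\b(MUST(?: NOT)?|SHOULD(?: NOT)?|MAY|SHALL)\b")
VAGUE_REF = re.compile(r"\b(it|this|that|they|them)\b", re.IGNORECASE)

def flag_ambiguous(sentences):
    """Return sentences that carry a normative keyword but refer to their
    subject only through a pronoun, a crude proxy for unresolved references."""
    flagged = []
    for s in sentences:
        if REQ_WORDS.search(s) and VAGUE_REF.search(s):
            flagged.append(s)
    return flagged
```

On the pair "It MUST be set to zero." and "The Length field MUST be set to zero.", only the first is flagged, which is exactly the "did you refer to the header or the message?" style of question discussed later in the Q&A.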
Q
The latter are absolutely needed to support more advanced protocols. And so I come back to where I began, which is: what makes protocols special? They're essential to everything we do today and have to work in contexts far outside their original intention. As a community, there exist standardized practices for the development of RFCs. What I'd like to pose for discussion is how the process for standardizing RFCs might integrate semi-automated tools, like, say, a future version of Sage, to improve specifications and to produce baseline implementations.
D
So, thank you so much for the very interesting talk. Bharath, can you hear us? Yeah? I already see people joining the queue. So, Stephen, would you like to announce yourself and ask your...
R
...question. Hey, can you hear me? Yeah. So, this is really interesting work. I guess my question is: how far do you think natural language processing techniques can take us towards fully automated code generation? Are there certain components of specifications that perhaps need more structured or more formal languages? What's that gap like, I guess?
Q
That, I hope, is going to be the easier piece of it. Now, the disambiguation part gets tricky when you're dealing with references that go way outside the bounds of a specific spec. So this is one of the pieces where there's a vocabulary piece, which is relatively easy to figure out.
Q
In our paper we talk about how we pulled in terminology from a networking textbook, so that we knew these were special terms in networking. So we have some domain knowledge, and that's a kind of dictionary that can grow, but it's not going to continue growing forever.
Q
There are other pieces which are really about the sentence structure; disambiguation there gets a little bit trickier. I'm not an NLP expert, so the answer I can give you here is going to be limited, but the way that we've looked at it is: you could think of it, initially, as a grammar checker. We all use grammar checkers; Google Docs, for example, now does auto grammar checking while you're writing.
Q
You could think of this as being an integrated feature. It's not going to give you 100% coverage on day one, but it's going to find things: this sentence is ambiguous; did you refer to the header, or are you referring to a message? Terms can be used in multiple ways, and you can have a tool that flags that, and also the structure of the sentence itself; maybe a noun is missing.
P
I think I can be heard; I believe I can be, yes. Hi Bharath, thank you for this lovely talk. This is super interesting, because clearly you're familiar with the history of this work, and generally the space that you showed in that graph has been around for a good 20 or 30 years.
P
What do things refer to? In doing that, you encounter some basic pieces of ambiguity; that I can understand. But the harder thing I have trouble with in ambiguity, and we find this routinely in RFCs, is the way that the text is written: there's a flow and there's context.
P
Any sentence has context coming in before it. So depending on what the building blocks are, and this is what I'm missing in your work at the moment: how are you building the semantic blocks? Are you using semantic blocks? You said you're not doing state machine generation, I get that, but just in terms of the language itself, are you doing some sort of...
P
I don't know; I mean, I'm thinking about the tools that are used for building summaries, for example, which do semantic block building and things like that. Are you trying to perhaps generate a summary of some pieces of this? How far do you think you can go with this? I would love for it to be able to detect ambiguity in text that is beyond just sentences, and so on, to actually pull semantics out of it.
P
That's a question, and in terms of a comment: you were asking about where this could go, or the potential for integrating something like this. I think this is super useful. Even at the lowest level, it could be used as part of things such as working group last call, IETF last call, the RFC editor. The RFC editor does this work for us today: typically there's a human who reads the text and says, hey, this sentence seems ambiguous, can you please replace it, and they'll offer suggestions.
P
So automation of any and all of those things would be super useful, and I think the value of automation is in not waiting until later to bring those tools in: just make it part of a CI, for every PR that goes into the draft, so that any new language that is introduced goes through this machinery, which then figures out if any ambiguity was introduced by it. So there are several places where this can be introduced, and I'd love to see that happen. So yeah, that's all I've got.
Q
Yeah, thanks for all that. So, to start at the beginning, about references in a sentence: this is co-reference resolution, a major studied area in NLP, and we're using some co-reference resolution tools right now.
Q
We don't mention it much in the current paper, but we're working on that right now. It's a problem that, I guess, can't be solved perfectly, just because of the nature of English text: sometimes references can't be perfectly resolved. But the good news is that we can then flag potentially unresolved references, and that would be part of this automated tool. So that's sort of the first stage. Then there's the question you had about whether we build up this sort of context of understanding.
Q
We don't do that right now. So far we've been mostly targeting simple protocols, and one thing that's kind of amazing is that for relatively simple protocols, you can often get away with not having too much context, as long as all the terms in each sentence...
Q
...have already been resolved. You don't necessarily need a lot of history of concepts that have been built up over several paragraphs in order to understand it, as long as all those terms have been defined. Now, once we get to more complex things, for example algorithm descriptions and more complex protocols, we're going to have to resolve that; currently we haven't figured that piece out.
Q
They don't have big working groups with a lot of intense work going on, but there are all these smaller protocols out there, and someday some of them may become big protocols, protocols that people are depending upon; we can never really anticipate that. And so it's good to be able to make sure that all of those get some level of scrutiny, say through these semi-automated tools.
P
Thank you for that response. I'll make a quick comment before I walk away. Do not disregard the important protocols that are getting a lot of scrutiny, because those are actually the ones that need this kind of help more than anything else. What often happens is that you get scrutiny on the ideas, you get scrutiny on various things.
P
You get scrutiny through people doing implementation, but not necessarily on how well the text is written. And I would say that's still a huge gap, in my mind: there's this huge gap between the intent of the working group and the text that shows up in the document, and you rely on just individuals to fill that gap. It would be good for that not to be the case.
D
So thanks so much, Jana, and I see Colin in the queue. Colin, would you like to ask your question, please? I'm just going to close the queue after Colin, but please do remember that we have a few more minutes at the end of the session, so do join the queue if you have more questions. Thanks.
O
Yeah, hi, thanks, Sandra, and thanks, Bharath, for the talk; this is fascinating work. I'd like to echo some of the comments that Jana made. I think we're starting to see a bunch of different tools that people are applying to RFCs and drafts. There's your work; we've seen people doing security analysis of some of the protocols; I've seen people analyzing state machines; we've done some work modeling critical data formats.
O
I think one of the challenges is that that sort of analysis tends to happen when the RFC is published, and if we can find ways of building these tools early, so they can be run routinely on the drafts, I think that would be a really nice way of improving the quality of the specifications before they're published.
O
So I would really encourage you to keep working with the community here to try to integrate this into the draft publication tools and so on, and I can introduce you to the tools team if need be. And if we can help out with this sort of work in the IRTF, then please do come and talk to me. Thanks.
Q
Yeah, thanks; exactly, we would like to integrate it at the draft stage, rather than it being only once it's final and published. The good thing is that even at the draft stage, it's just English text, so from the perspective of our tool it doesn't know any difference between the two; it will find ambiguities, or not, in either form.
D
So thank you so much again, Bharath, and thank you to the people asking questions; do join us again at the end of the session. But for now I'm just going to ask Meetecho to queue up the next video for the next talk, which is from Ali, and he's going to give us a call for action for more collaboration.
S
Hi everyone, my name is Ali. I'm a computer science professor, and I've been working on streaming media since my PhD years. The topic for today is the very recent developments in the streaming space, in particular the Common Media Client Data and Common Media Server Data specs coming out of the Consumer Technology Association.
S
Without a doubt, streaming media is a big part of our lives. With more efficient media codecs and faster internet access, streaming any content we want is quite easy today. HTTP played an important role in this, too. Back in the day, it all started with progressive download, which allowed us to play the media before having to download the entire media file; with sufficient bandwidth, this was a much better experience than the download-and-play approach. Following that came some indexing tricks and the use of byte-range requests.
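The byte-range mechanism mentioned here is just a standard HTTP `Range` header, which is what lets a player seek into a media file without downloading all of it. A minimal sketch, using only the Python standard library (the function names and the example range are mine):

```python
import urllib.request

def range_header(start, end):
    # An inclusive byte range, e.g. bytes=0-1023 for the first kilobyte.
    return {"Range": f"bytes={start}-{end}"}

def fetch_range(url, start, end):
    """Fetch bytes [start, end] of a file with an HTTP Range request.

    A server that honors the range answers 206 Partial Content with
    only the requested slice of the body.
    """
    req = urllib.request.Request(url, headers=range_header(start, end))
    with urllib.request.urlopen(req) as resp:
        return resp.status, resp.read()
```

A player combines this with an index of where each media piece starts, which is the "indexing tricks" part of the story.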
S
In a nutshell, we have a source, live or on demand, that provides the mezzanine content, which is supposedly the best and most crisp quality that we can offer. The mezzanine stream is fed to an encoder or a transcoder that generates a number of representations according to a pre-selected bitrate ladder, which is a combination of resolution and bitrate pairs.
S
HTTP is an object request-response protocol, but with small media pieces it can mimic streaming very nicely. The ultimate goal is to make the viewers happy, at which HTTP adaptive streaming does a pretty good job most of the time. But adaptive streaming is not perfect either, so bad things do happen from time to time.
S
Here's an example showing streaming client behavior. I have been showing this plot for more than 10 years now, so you might have seen it before. There are 10 clients streaming from the same server over a 10 megabit per second link. Ideally, all the clients should consistently stream the same representation.
In
this
case
it
is
the
866
kilobit
per
second
representation.
This
way,
the
network
capacity
will
also
be
used
as
much
as
possible.
However,
this
is
not
the
case,
as
you
can
see
from
the
frequent
upshifts
and
downshifts
for
each
client
in
a
scenario
like
this,
which
is
very
common,
we
get
either
unfairness
among
the
clients,
instability
for
each
client
or
under
utilization
in
the
network.
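The oscillation in that plot comes from each client independently running a rate-selection rule like the crude throughput-based one sketched below. This is a generic illustration, not the logic of any particular player; the `safety` factor and ladder values are my own example numbers.

```python
def choose_bitrate(ladder_kbps, measured_kbps, safety=0.8):
    """Pick the highest rung of the bitrate ladder that fits within a
    safety-scaled throughput estimate.

    When many clients share a link, each one's throughput estimate moves
    as the others shift rungs, which is one source of the upshift and
    downshift oscillation seen in the 10-client experiment.
    """
    budget = measured_kbps * safety
    eligible = [r for r in sorted(ladder_kbps) if r <= budget]
    # Fall back to the lowest rung if nothing fits the budget.
    return eligible[-1] if eligible else min(ladder_kbps)
```

With a ladder of [235, 375, 560, 866, 1430] kbps and a 1200 kbps estimate, the rule picks 866 kbps; a small dip in the estimate pushes it to 560, which is exactly the instability the talk describes.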
S
In other words, the money should be spent to put more fiber in the ground and improve cellular coverage. As nice as this sounds, one thing we know for sure is that the demand for bandwidth is always more than the supply. The last camp was more pragmatic: how could we best use whatever bandwidth or server capacity we have to meet the demand? So the last camp brought up the idea of some sort of a control plane that might enable information sharing between the servers and clients.
S
This SAND standard actually covered a lot of ground and established the control plane framework we always wanted. Despite all the academic research behind it, due to various reasons it wasn't really picked up by the industry, until a number of companies wanted to do something about it in CTA a few years later. That CTA effort first focused on metadata information sharing from the clients toward the servers, and the spec was published last year as CTA-5004, called Common Media Client Data.
S
They didn't get on the bandwagon a few years ago, but now they want to have something similar. Here is the deal: whose fault is it when a user's video keeps freezing or is low quality? One can blame the device, or the app, or the internet connection speed, or the CDN, or the content provider. It's a vicious cycle that mostly ends up without a satisfactory answer, which makes us, the users, quite upset. Fault isolation is very important for keeping the paying customers, and it requires analytics data from various points along the delivery pipeline.
S
There are many proprietary internet analytics solutions today, but surprisingly, the CDNs are quite blind to all of this. CDNs see the requests and send the responses, but they don't have the glue to connect all the pieces together to get the full picture and identify how well things are working or not. The solution is quite straightforward, though: if, with each request, the clients send some identifiers and status information, the CDNs suddenly gain visibility into the media analytics. See this new paper from friends of mine to explore how powerful CMCD reporting could be.
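What a client attaches per request can be sketched concretely. The key names below (`sid`, `bl`, `br`, `mtp`) follow my reading of CTA-5004; check the spec before relying on them, and note the helper itself is an illustration, not reference code.

```python
def build_cmcd(session_id, buffer_ms, bitrate_kbps, throughput_kbps):
    """Serialize a few Common Media Client Data keys as one value.

    Per the spec's style, strings are quoted, numbers are bare, and
    key=value pairs are comma-separated.
    """
    pairs = {
        "bl": buffer_ms,           # buffer length, milliseconds
        "br": bitrate_kbps,        # encoded bitrate of the object, kbps
        "mtp": throughput_kbps,    # measured throughput, kbps
        "sid": f'"{session_id}"',  # session id, ties requests together
    }
    return ",".join(f"{k}={v}" for k, v in sorted(pairs.items()))

# A client would send this as a CMCD request header or an urlencoded
# query parameter, letting the CDN group request logs per session.
```

Even the `sid` alone gives a CDN the "glue" described above: every segment request carrying the same session id can be stitched into one playback session.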
S
Sending a deadline might allow getting the response with a higher priority, reducing the risk of video stalling; see, for example, our NOSSDAV paper this year on the subject. Another use case is hints about the next request: the CDN may reduce the fetch time for the clients by using that information.
S
The work is still ongoing; if you're interested, please consider participating in the effort. Developing or writing a spec is usually the easiest part: we had the SAND standard, but it was not widely adopted. To get better adoption with CMCD and CMSD, all of us working in this industry need to work together. We all agree that information exchange is useful when the information is relevant and actionable; the information also needs to be fresh.
S
There are already a number of CDNs and clients supporting CMCD. So, for example, grab the latest hls.js code, start experimenting yourself, and come up with ways CMCD or CMSD can help in your environment. Both have extension mechanisms, so extending the functionality is pretty straightforward. And, most importantly, join the effort and contribute.
S
Review the use cases, propose new ones, and so on. With this, I provide a slide here with some useful links for your reference. Thanks for listening, and I hope those of you who are interested in streaming and other multimedia systems topics can attend the ACM Multimedia Systems conference, in person or virtually, at the end of September. Thank you.
D
So thank you so much, Ali. Can you join us by turning on your mic and your camera? And thank you for that plug as well at the end.
D
So please do join the queue if you have questions for Ali; the queue is open now. I don't see anybody in the queue right now, so maybe I'll just start it off. My first question: obviously, when you make this type of call for action, the question is how much adoption you have already seen, and which immediate benefits you already observe, just to maybe encourage people to join the effort.
S
Well, certainly we have some folks today in the workshop who are representing some of the biggest, largest CDN companies, so I hope they will also join this conversation. But definitely there is some interest, because this time, unlike the MPEG standard we did like five years ago, the request came from the CDN providers. That already shows their interest, because they are really struggling with the other analytics solutions, and they need something in place.
S
Otherwise they are just running blind, and they need to do this. Now, if only the CDNs implement it, obviously that's not going to be sufficient; we need client support as well. And yes, there are some client implementations out there that have already committed, but the big players, like the iOS platform HLS players, the AV library, they are not doing it yet. Maybe in the future they will, but not at the moment.
S
But this can only be a great success if everybody who's involved, on either the server side or the client side, starts implementing. And the biggest benefit, the second part of your question: as I mentioned in the talk, even with just some sort of unique identifier per playback session, the CDNs will at least be able to understand...
S
...okay, this client is in this session and asking for all these files, all these segments, and things like that. So they will be able to put things together at least per session. Otherwise, they are just seeing all these individual request logs, but they have no idea what they are for. So even with that simple request ID, I think there's a lot of benefit.
D
That's great to hear, and I do support that fully. Since the queue still seems to be empty right now, I'm just going to continue and bother you with a few more questions.
D
I liked very much your slide number seven in the talk you gave before, showing that different players just point the finger at each other, and that it's very hard, essentially, to do root cause analysis for anomalies that might occur in this environment. I'm just wondering: have you been, or are you, in touch with people who are looking at different deep learning or machine learning analytics for this type of anomaly detection approach, or proposing ways for different players to share data in a privacy-preserving way?
S
Well, when it comes to diagnosing problems with your video services, it's not really very privacy-preserving. I mean, there are a lot of privacy issues there, obviously, because whoever is looking into your problem will likely know what you were watching or what you were trying to watch, right? I'm not really interested in that part, but when it comes to identifying where the problem is, or what is causing the problem...
S
...obviously there are some companies out there who are doing all these probing types of things. They collect data from different parts of the delivery pipeline: they have some components in the client, they have some components in the CDN, they have some components on the head-end side, the packaging, the encoder, or whatever.
S
The more probes they can put in, obviously, the more visibility they will get. And collecting the data is one thing, processing the data is another thing, and reading the data and understanding the root causes is totally something different. All these different companies have different success rates for that. But most of the time, it's not necessarily that a CDN goes down...
S
...it happened a few days ago with one of the larger CDNs, but things do happen from time to time, or something happens with your ISP connection and so on. Still, most of the problems are within your home network, or some software issues on your client side. So if you have some components to understand what's happening in the client, that will give you a lot of visibility.
S
You know: okay, we made this change, and then the new problems started happening here and there, right? Most of these services often do A/B testing, so whenever they are testing a new feature, they usually only enable it for a subset of users and see whether everything is working or not. So yeah, there are a lot of ways to diagnose these things, but just because you diagnose something doesn't mean that it's going to get resolved.
S
That's another thing, and this finger-pointing, blaming someone else, is usually what the companies do. So it's not that easy to get away from it. But with this...
C
So I'm curious what the reaction is of the folks supplying very large quantities of this video who believe they control essentially the entire delivery chain, other than the ISP link. For example, Google and YouTube control effectively everything up until the last hop of the path to the user; Netflix today, in most cases, also does that. Do they?
S
Well, okay, I'm getting some echo; maybe you might want to wait. So the larger set of benefits are for those people in the second bucket. For example, today we have the Olympics: the broadcast is coming from Tokyo, Japan, and then it's going to a bunch of service providers, video providers out there, through multiple CDNs.
S
Everywhere, in every country, millions are watching the content live. In those cases, that's really where the benefit is, because if something doesn't work, you now know where things are falling apart. Whereas for companies like YouTube or Netflix...
S
Essentially, the problems are more limited in terms of where they can happen. Especially for Netflix: it's all on demand, so the content is already encoded, packaged, and tested. If something goes wrong, it's probably going to be in your home network, or maybe the ISP is the bottleneck at that time, so it's going to be relatively easy to figure that out. But for larger events, especially live events, when they are crossing different networks and different domains...
S
That's where most of the benefits will be. And the more direct answer to your question: at the moment I haven't seen anybody from those companies involved in this work.
D
Thank you so much. I'm going to open the queue for both speakers, and I'd like to invite Barath, from Facebook, to join us; basically, we're opening the panel. I see Jana now in the queue, so please, Jana, go ahead and ask your question. Thank you.
P
Thanks, Andra, and thanks, Ali, for that talk. I want to push back gently on something you said earlier about privacy not being important here. I think it is, and I think it's achievable; I don't think it's very far away. Just because, as a server, I can see the manifest and I can see the videos you're fetching doesn't mean that you have no privacy, right?
P
All I need, from a service point of view, is to be able to tie together the different videos that you're watching, to pull together a full picture of the video session that you've had. One of the problems I have when looking at logs at the CDN (I work at Fastly) is basically being able to pull together the different chunks that you requested as a client. So just the one single piece that you have in the CMCD spec, the GUID, the session ID...
P
Basically, that alone gives me like 90% of the value that I'm looking for. If I'm able to tie together just that single video session, that helps me tremendously, because it shows me the bit rates that you used. Of course, the bit rate being explicit would also be useful, because otherwise you have to infer it from the object names to figure out what the bit rate might be, and so on.
P
But those two pieces of information are hugely useful, and I think they can be provided, as I imagine the spec intends, in a privacy-preserving way, meaning that the session ID doesn't actually include the client's IP address or the timestamp directly. The timestamp is fine, but the client's IP address would be particularly problematic.
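As a concrete sketch of that idea (an explicit bitrate plus a session ID that embeds neither the client IP nor a timestamp), here is roughly what a CMCD-style query parameter could look like. The comma-separated `key=value` payload with `br` and `sid` follows my reading of the CTA-5004 conventions; treat the exact formatting as illustrative rather than normative:

```python
import uuid
from urllib.parse import quote

def make_cmcd_query(bitrate_kbps: int) -> str:
    # A random v4 UUID: 122 random bits, with no client IP address or
    # timestamp embedded, yet enough to tie one session's chunk
    # requests together in the CDN's logs.
    sid = str(uuid.uuid4())
    # CMCD payload: comma-separated key=value pairs, string values
    # quoted, URL-encoded into a single query parameter.
    payload = f'br={bitrate_kbps},sid="{sid}"'
    return "CMCD=" + quote(payload)
```

A client would append the returned string to each segment request's URL, so the server can group requests by `sid` without learning anything about the client from the ID itself.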
S
The spec is not really involved in that much. Obviously, we hope and assume that the CDN provider will do whatever they can to preserve privacy; otherwise, all this information flowing back to the CDN servers will obviously reveal some information. But how the CDN provider is going to use that information, that's really a separate topic, I think.
P
I appreciate that. Let me be more specific: I think the construction of the GUID, the session ID, should be done in such a way that it doesn't reveal the client's personal information. As long as things like that are taken care of, I think the rest of it is still very useful, and something that we can, and want to be able to, use. So thank you for this. I'm curious to understand the state of this work.
P
Andra asked about the deployment; I want to understand the state of the spec itself. Where does it stand right now, and what's the...?
S
Right. So you must have seen the Fastly logo on one of my last slides; Fastly is one of the CDN providers trialing this, and they already have some CMCD reporting. On at least some of your servers there's a trial going on: you can collect the CMCD data coming from the clients, and there's a dashboard where people can see what's happening, and things like that.
S
Akamai is the other partner at the moment, and we hope other CDNs will join the club. As I mentioned earlier, though, this is not just about CDN support; the clients also need to send the information to the CDN. So hls.js and a couple of other open-source players are currently also...
S
...adding some initial support for that. As for the other side of the spec, CMSD, the Common Media Server Data, I think it will be a lot more important for the CDNs, because now you will be sending some information back to the client. At the moment, since we are still working on the spec, there is really no deployment or any trials, to my knowledge, but I hope to see some action in that space later in the fall or late this year. So hopefully, with CMCD and CMSD (and anybody can extend them as they like)...
S
...there will be a lot of action in this space in the next year or so.
S
Where does one follow the spec? Well, it is under CTA, but there are two links at the end, on my last slide, where you can find the GitHub repositories; you can post issues, and you can find the spec there. So it's a pretty open process at the moment.
C
Here's a really quick one: is the session ID local to the client? Because if it isn't, doesn't that open you up to all kinds of interesting spoofing attacks, like piggybacking on other people's sessions, and collusion among clients to make it look like they're all cooperating when they're not?
S
Well, the spec doesn't really say much about how you're going to generate the session ID, but it's not local to the client itself. It's supposed to be something unique, so that two or three clients within the same home, or within the same neighborhood or city, will not be able to come up with the same ID.
C
S
Yeah, that's important, certainly. You don't want to send wrong or unintended information back to the CDN, because based on your information the CDN might take a different action, and obviously the CDNs must be careful about what they process once this information comes from the clients. So that's one very big issue with this.
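That caution can be made concrete with a small defensive-parsing sketch: before acting on client-reported values, a server can at least check that they are plausible. The field shapes and limits below are assumptions for illustration, not requirements from the spec:

```python
import re

# Accept only a UUID-shaped session ID and a plausible bitrate range,
# so a spoofed or malformed report can't steer server-side decisions.
UUID_RE = re.compile(
    r"^[0-9a-f]{8}-[0-9a-f]{4}-[0-9a-f]{4}-[0-9a-f]{4}-[0-9a-f]{12}$"
)

def sanitize_report(sid: str, bitrate_kbps: int) -> bool:
    if not UUID_RE.match(sid.lower()):
        return False
    # Reject bitrates outside what this service actually encodes
    # (the 100 Mb/s cap here is an arbitrary example value).
    return 0 < bitrate_kbps <= 100_000
```

Reports that fail the check would simply be dropped from analytics rather than trusted, which limits what a colluding or piggybacking client can achieve.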
D
Thanks so much. I don't see anybody in the queue right now, so I'm just going to take... oh, I do see Jana. Jonathan, perfect; I'll keep my question for a bit later.
P
D
Right, I was just trying to bring it back to Barath, because we have been discussing this open space where we can contribute to the specification. But coming back to sage: are you accepting, or are you considering at any point accepting, contributions from the community, especially this community? In terms of, say, opening up the tool and having people help with that?
Q
Yeah, absolutely. We would like to engage with folks in the community. We're at the point where we've been working through a back catalog of relatively simple RFCs to start with, mostly to figure out what are the essential features that we can actually disambiguate on and...
Q
...auto-generate code for. But one of the key things may be that, as I think Colin's question mentioned, there are existing tools that the community is starting to use, or that folks are working on.
Q
State machine analysis, for example, and maybe some of the other things. To the extent that we can integrate those kinds of tools: say there's a state machine analysis tool, but it needs a state machine, and the state machine is right now specified in text. There are some things that are structured text in RFCs; moving from that to something like an intermediate representation...
Q
...that then goes into a formal analysis: that's something we'd like to be able to plug into existing tools. One other thing is that being able to use markdown, or, even better than markdown, something like structured formatting, would be extremely useful. I don't know to what extent that is being adopted widely across the community, but the more we can structure the text itself, while still keeping it easy to read and not overly formalized, the easier it will be for both human interpretation and machine interpretation.
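As a toy version of that pipeline (structured text in, machine-usable intermediate representation out), the sketch below parses an invented "STATE --event--> STATE" notation into a transition table. The notation and the TCP-flavored states are made up for the example; this is not sage's actual format:

```python
def parse_transitions(spec_text: str) -> dict:
    # Each non-empty line has the form "SRC --event--> DST"; the
    # result maps (state, event) pairs to the next state, a shape
    # a model checker or code generator could consume directly.
    table = {}
    for line in spec_text.strip().splitlines():
        src, rest = line.split("--", 1)
        event, dst = rest.split("-->", 1)
        table[(src.strip(), event.strip())] = dst.strip()
    return table

spec = """
CLOSED --passive_open--> LISTEN
LISTEN --syn--> SYN_RCVD
"""
fsm = parse_transitions(spec)
```

The point of such an intermediate form is that the same table can feed both a human-readable rendering and an automated check (e.g., "is every state reachable?"), without re-parsing prose each time.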
D
P
Not at all. I actually had a question for Barath too, and I'm trying to bring it back into my head now. That's right, I was going to ask: you said that you've been looking at some simple protocols.
Q
So, in the SIGCOMM paper we cover ICMP, the full protocol. ICMP is obviously one of the simplest places to start, and the reason we started there is that we can go all the way from spec, to disambiguation (fixing the sentences), to an implementation that works with ping and traceroute and interoperates.
Q
That was our first test case. Then we went through other RFCs across the years and picked bits and pieces from them that weren't in the original ICMP RFC, in terms of style or structure of the text. So we have some text from NTP, some from IGMP, some from BFD. And right now we're actively looking at state machines...
Q
...trying to pick which, among the various protocols that have state machines, we want to try to analyze, so suggestions for those would be one place to start. One of the ways we could particularly improve the tool, so that it can be useful later, is with protocols where it's known, probably almost through oral history in the community, that there was a spec that was really bad...
Q
...and that was then improved over time. To the extent that we have that history, we can feed the original bad version in and see whether our tool can identify the things that human editors identified later on. It's not quite a training process in the way that, say, neural-net training would be, but we can find out whether our tool is missing those things or not. So that would be the first place we could engage.
P
That is actually an interesting one. One point where you could insert that tool (well, there are a few points in an RFC's life) is before it hits the RFC Editor's queue, because that's one place where a lot of English fixing happens: the RFC Editor will take a look and do a bunch of copy editing, and some ambiguity gets removed.
P
Another is last call, when a bunch of people look at the whole spec with fresh eyes, rather than the people who've been working with the spec for two or three years, who all understand the context and don't necessarily refer to the text to know what needs to be built. So you'll find that the RFC before last call and after last call can actually be quite different. Those are at least two points in the process that I can point you to.
P
D
That's all, thank you. And thank you so much for this wonderful conversation. We're just one minute from the end of our session today, and the end of the workshop.
D
I think we had Colin in the queue, which was fortunate, because I would like to invite both him and Nick to join me. Colin, please ask your question, and then we'll say goodbye and wrap up the workshop for this year.
O
Yeah, I think we're short of time, so I will skip my question, but again, thank you.
O
Thank you to Barath and Ali for the excellent talks. Thank you again to Andra, to Nick, to all the other authors, to all the presenters, and to the program committee members.
O
I think this has been a really interesting program: there have been some really excellent talks, really excellent papers, and we've seen some really interesting questions here. I have certainly learned a lot over the last three days of the workshop, and I hope other people have too. Thank you also to the sponsors, to Comcast and to Akamai, for supporting the fee waivers, for supporting the workshop, and, going forward, for supporting the IRTF to help improve access and the diversity grants. And if people have feedback about the workshop, or ideas for what to do in future, please talk to us.
O
Please talk to me, to Andra, to Nick. And I'd like to finish up by encouraging all the authors to join the rest of the IETF and IRTF sessions this week.
O
I think one of the benefits of ANRW is co-locating with the IETF meeting. Obviously this is a bit harder to do online rather than in person, but please do try to make the most of the opportunity. We'll do this again, and hopefully we'll do it in person in future.
E
O
That's all I have. Thank you again to the organizers; I think it's been excellent.
D
And
it's
been
great
working
with
you
and
yeah.
It's
been
great,
taking
care
of
the
workshop
this
year,
looking
forward
to
seeing
you
and
gather
if
you're
roaming
around
there,
so
that's
it
for
today.
Thank
you
again
so
much
and
thanks
thanks
nick
thanks
colin,
it's
been
great.