From YouTube: DASH High Availability Working Group July 26 2022
Description
1. Participants should read: https://github.com/Azure/DASH/blob/main/documentation/high-avail/design/xsight-labs-ha-proposal-new-ideas.md
2. Please bring concerns to the table.
3. The proposal is an attempt to balance reliability and performance while remaining interoperable.
B: Any other Q&A from the team here? The typical internal approach involves something like, for example, an end-of-RIB notification from a given protocol that it has finished forming, basically processing all the entries into the router, just as an example.
B: So, in terms of flows, I guess something similar could be used to mark itself ready. I'm just wondering if that is documented, and if not, then that is something we will need to do.
B: And is there anything related to operational state that is already documented in the document that Marian has?
B: Which document? Any document where we have recorded the operational states, like Marian mentioned earlier?
C: Yeah, as I said at the beginning of the call, I will send a link to my pull request after I get back home, because I'm not there yet. Okay.
B: So, any other general Q&A for HA before we end this session, then?
D: So there was discussion about resurrecting the topic of what to exchange and so on. Is that going to be formally specified as part of the DASH specification, what will be exchanged?
A: One approach is that we send packets from one to the other in order for the secondary to basically do the same thing: look at the packet, bring it through the slow path, make a connection entry, do everything that the primary did. The alternate proposal is to say no, the primary has already done that work, and its job is to send the state across, and the secondary just basically copies the state into its connection table, for example.
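The two options contrasted here, replaying the packet on the secondary versus copying the primary's computed state, can be sketched roughly as follows. This is an editor's illustration in Python; every name and structure is invented for the sketch and is not from the DASH spec or any vendor implementation.

```python
from dataclasses import dataclass

@dataclass
class ConnState:
    five_tuple: tuple
    state: str          # e.g. "SYN_SEEN" or "ESTABLISHED"

def parse_five_tuple(packet):
    return packet["five_tuple"]          # stand-in for header parsing

def evaluate_policy(five_tuple):
    pass                                 # stand-in for ACL/route lookups

def derive_state(packet):
    return "ESTABLISHED" if packet["flags"] == "ACK" else "SYN_SEEN"

class Secondary:
    def __init__(self):
        self.conn_table = {}

    def replay_packet(self, packet):
        # "Mode one": the secondary repeats the primary's slow-path
        # work on a copy of the original packet.
        ft = parse_five_tuple(packet)
        evaluate_policy(ft)
        self.conn_table[ft] = ConnState(ft, derive_state(packet))

    def apply_state(self, msg):
        # Alternate proposal: the primary already did the work and
        # ships the result; the secondary just copies it in.
        self.conn_table[msg.five_tuple] = msg
```

The difference is where the work happens: `replay_packet` repeats parsing and policy evaluation on the secondary, while `apply_state` is a single table write.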
A: So there are many ways to optimize where you don't have to send and redo work, or at least not as much work. That's why I've always said, and I don't know how much Pensando is going to disclose, but if they're going to disclose how they do theirs today, then that would be mode one. We can actually ask Pensando formally how much they're willing to disclose to the community; if not, we can make it up ourselves. And then I think Xsight was really working on two things, and I think they're separate, by the way. One was: should the protocol be reliable or not? And the second was: how can we make the messaging to the secondary more efficient, and not have it redo all the work?
A: So I think there's some debate over that, and even if we don't agree on the exact messaging, I think we need to discuss the different approaches. The reason we need to discuss it is that some approaches are much more efficient than others.
A: There's a lot more bandwidth or less bandwidth, fewer packets and less processing, and so it's important to have the discussion. I don't know if anybody from Pensando is here, so I think the first one is: if Pensando is on the line, and I'm not asking them to disclose anything now, they have to decide: are you going to disclose the packet sequence that you were thinking about to maintain state?
A: If not, then the community here will have to reinvent that, basically. So that would be the first question: if they're willing to disclose the sequence they're thinking about, then I think we can go over that, call it mode one, and agree that it would be one methodology that everybody can see and know would work. And then, if we can get through that, we go to mode two, which is defining a more efficient one that, like I say, transfers state as opposed to making the secondary recalculate it, with the secondary being a little bit more blind to what's going on: just maintaining a table and transferring state into the table as a slave, as opposed to recalculating everything.
D: Yeah, so I don't have an answer for that yet; I'll have to go back and discuss with the team. My question basically was because I was under the impression, from listening to the earlier recordings, that the actual implementation of how it's done would be left to the user.
A: We just stopped the discussions. We didn't say that. There was too much arguing over mode one, mode two, mode three, who's reliable or unreliable, and I didn't make it clear that this is a very important point: if we go down the road and we say no, everybody just do whatever you want...
A: First of all, it makes it really difficult to test, and secondly, there are going to be a bunch of losers here, because whoever comes up with the most efficient approach is obviously going to have a serious business benefit, and that's what everybody wants. Then there's going to be one winner here and everybody else is going to cry, so I don't agree with that approach.
D: Okay, yeah, because I thought I had read something about there not needing to be interoperation between vendors, and so on. So I think that...
A: I think the practicality of getting the approach exactly the same across vendors is a decision for later on; we'll have to discuss that later. I think the first thing is to document the approaches that are reasonable, that can be taken, and compare the efficiencies of each. That way we don't have people running off doing a very inefficient design, trying to comply, only to find out that they can never get business.
A: That would be the simplest approach, and we would just duplicate everything. But at, let's say, 5 million connections per second going back and forth, you can imagine it's 5 million times 6, and then you have to multiply that by two if you're actually going to send those packets back. You are literally taking up the entire user data path. So I think that's why this deserves discussion: even the simplest approach can be optimized, right? You can optimize the tail.
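The arithmetic quoted above works out as follows, assuming, as stated, roughly 6 replicated packets per connection and a doubling if every replicated packet is also echoed back:

```python
cps = 5_000_000                 # connections per second
pkts_per_conn = 6               # packets replicated per connection (assumed)

sync_pps = cps * pkts_per_conn          # one-way replication load
sync_pps_round_trip = sync_pps * 2      # if packets are sent back as well

print(sync_pps, sync_pps_round_trip)    # 30 million and 60 million pkt/s
```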
A: For example, do you really need to send all the state, the packets, across or not? And what is the result? Just a simple investigation of the worst case: a connection is held open too long. Well, that's a lot better than a connection that closes prematurely. A connection that closes prematurely can actually cause the endpoints to basically go haywire, and they really just can't talk to each other anymore. They can't send closes, they can't send resets, they can't send anything.
A: If you want to err on keeping things open versus closing prematurely, that obviously gives you a little bit of leeway. And I've documented a few things: when you do a sync, no matter how you do it, when you go to the secondary you've got to re-simulate all the connections anyway. So anything that shouldn't be there anymore, due to policy updates that happened kind of in flight, you have to re-simulate, and you're going to get rid of all the connections that shouldn't be there.
A: That way, the result will be: you may have some open connections that didn't receive a close, and what do you do with them? But the reality is, I think we can optimize even what I call mode one, where you're sending packets across. I don't think we need to send so many packets back and forth.
C: Basically, the ability for the sender to send either the original packet or some digested form of the packet, and defining that the receiver has an invariant logic that it has to perform. Then it really becomes up to the sender to decide: does it want to send the packet? Does it want to send a message? Does it need a reply back?
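A rough sketch of the sender-side choice being described: packet versus digested state message, with or without a requested reply. The field and function names are hypothetical, invented for this illustration rather than taken from the Xsight proposal.

```python
from dataclasses import dataclass

@dataclass
class SyncMessage:
    kind: str                   # "packet" (raw original) or "state" (digest)
    payload: bytes              # packet bytes, or an encoded state digest
    reply_requested: bool = False
    opaque: bytes = b""         # echoed back verbatim if a reply comes

def make_sync(payload: bytes, digest: bool, want_reply: bool) -> SyncMessage:
    # The sender alone chooses the message form and whether it needs an
    # acknowledgement; the receiver's handling stays invariant either way.
    return SyncMessage("state" if digest else "packet", payload, want_reply)
```

The point of the design is that richer or leaner senders interoperate, because the receiver's contract never changes.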
C: Does it not need a reply back? And if you build on top of this, you can get to increasing levels of optimality, but it allows for interoperability, and it doesn't require everyone to implement the most optimal solution in order to be interoperable. So I would encourage people to read it.
C: Sure, yes, yes. I would encourage people to read it, because I think it's consistent with what you're saying, Gerald, and in a way that allows for innovation and interoperability.
A: Yeah, that sounds excellent. I did not read this one, but I like it. I like the idea, especially that it would support all the different modes, and I think even mixed mode is possible.
C: Absolutely, yeah, it would allow that. It would also allow you to make a decision to say, well, for certain state transitions I'm going to send those immediately, and for other ones I'm going to send those periodically. It really becomes the decision of the sender how advanced it wants to be in the synchronization of state, and the receiver just performs a very simple function: it accepts the state, updates its local state, and either sends a response, if requested, or not.
C: I think people should read it. It's not pushing a particular mode, like you say, mode one or advanced modes; it's a framework that allows lots of different modes to be implemented while still remaining interoperable.
A: Does it allow for... I think it does, but I'm just going to pose one question, because I think a lot of times the receiver shouldn't process anything; it should just take the state it's receiving and copy it into its connection table, and hence forgo all computation other than writing that state into the connection table. Does it allow for that?
C: Yes, it does. Basically, the receiver receives either a packet or a message. In the case that the receiver receives a message, that message is just the current state of a connection, and it updates its table with that current state blindly. That's it. And it's up to the sender to decide whether it wanted to send the packet or the message. Then the sender also decides whether it wants a response, and the response could be...
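The receiver behavior just described, blindly overwriting local state and optionally acknowledging, might look like this minimal sketch. All names are invented for illustration.

```python
def receive_sync(conn_table, msg):
    """Blindly apply a sync message; optionally return an acknowledgement."""
    if msg["kind"] == "state":
        # Copy the sender's computed state straight into the local
        # connection table: no parsing, no policy re-evaluation.
        conn_table[msg["key"]] = msg["state"]
    # (a "packet" kind would instead go through normal slow-path
    # processing, which is omitted here)
    if msg.get("reply_requested"):
        # Echo enough for the sender to correlate the acknowledgement.
        return {"ack": msg["key"], "opaque": msg.get("opaque", b"")}
    return None
```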
C: ...the entire message comes back as a response. Like in your mode one, for example, the entire packet can come back as a response, or you could get a truncated version of it back, or the sender could even send some opaque information and that opaque information comes back. I'll give you another example, Gerald, and maybe nobody would ever implement this, but maybe the sender actually buffers the packets locally and just sends a message, but it still wants a response back before...
C: ...it then sends the original packet to the network. So, rather than sending the packet and getting the packet back, it just sends a message and gets an acknowledgement back, and all of those approaches are possible with what's being proposed. I think there was a period where there were, let's just say, conflicting opinions between Nvidia and Xsight about what's the right approach: should it be reliable, should it not be reliable? And so this proposal is really an attempt to embrace all of those approaches, while still being interoperable, yeah.
A: I really love it. I'd like to read it; until I read it, I can't know for sure, but it sounds awesome. I guess it's easy to formulate the message from all the technologies that we're talking about. How about other people on the call? What do other people think of that, just in general? And we'll all go off and read it in detail. Does anybody else have an opinion on that?
D: Yeah, just to throw out there a little bit, John and Gerald, right? I mean, we can go down into the weeds with various modes and combinations here. To throw one more curveball, or maybe a half-baked idea that I have, at least, that I'm thinking aloud, as I'm, you know, making some sense here...
D: Okay, okay. I mean, because if it's end-to-end redundancy, or the models that you want to develop, right, then the order of messages at that scale will just be overwhelming anyway. So you can reduce that at the scale of, you know, clustered operation.
C: I think there's also a case you could make, in some cases, depending on how much fault tolerance you need. What I mean by fault tolerance is: in the event of a failure, do you lose connections or not? Maybe every deployment is different, but let's just say there's a deployment that says, for established connections...
C: ...I cannot tolerate a single loss of a connection; I can't black-hole any single connection. Well then, in that case, I think the conclusion is that an acknowledgement has to be sent from the receiver of the state synchronization back to the sender, so that the sender is aware that the receiver is in sync. And then there's this idea of publishing it to a database or something like that...
C: I think that in some cases the sender needs a positive acknowledgement that the receiver is in sync, and I just don't know how to do that other than to have a direct protocol between the sender and the receiver.
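One way to picture the positive-acknowledgement requirement: the sender keeps every state update pending until the peer confirms it, so it never assumes an established connection is replicated. This is a hypothetical sketch, not part of any proposal; retransmission on timeout is only noted in a comment.

```python
class SyncSender:
    """Tracks state updates until the peer positively acknowledges them."""

    def __init__(self):
        self.pending = {}        # seq -> update awaiting acknowledgement
        self.next_seq = 0

    def send(self, update):
        seq = self.next_seq
        self.next_seq += 1
        self.pending[seq] = update   # would retransmit from here on timeout
        return seq                   # sequence number carried on the wire

    def on_ack(self, seq):
        self.pending.pop(seq, None)  # peer confirmed this update

    def in_sync(self):
        # Only when nothing is pending can the sender assume the
        # receiver holds every established connection.
        return not self.pending
```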
D: Yeah, the Cartesian product of the senders and receivers with messaging can just get to be too much, too many, so the database actually does some dampening of that effort. And if you have multiple classes of connection across these five million, like you're saying, some that cannot afford to lose messages and some that can, then you're only synchronizing what you really need to, as opposed to everything.
C: Then the clients of the database, right? So I think it's just a very costly solution to add a third entity into the HA design.
D: Right. In certain scenarios, like the UPF, they're able to subsume that cost, and yes, there are other costs to it, you know, development and footprint and things like that, but certain applications are okay with that, in the scenario of the user planes.
D: Suresh, somewhat related to Murali's question...
A: We have to start with one-to-one, and we're going to implement one-to-one. Would it be advantageous to have two-to-one? Of course, but I don't know if we can start there. We could have discussions on it, though, but I think we need to resolve one-to-one initially.
A: Yeah, so they're both active and they're both backing up the other device, one-to-one, but it's not active-standby. It's active-active.
A: When we talk about HA, we're talking about, to give you an example, two DPUs backing each other up, right? We're not saying, and in fact we've written about this, that an ENI exists on a single DPU. You can spray ENI connections across multiple DPUs, and if traffic arrives on multiple DPUs deterministically, then you can actually have each one of those DPUs backing up its fraction of connections to the others.
A: We do have coverage at least for this scenario, where we don't have just a one-for-one; the ENI can actually exist across many DPUs, but the decision of which DPU the connections arrive on really comes from the source, right? The source is ECMPing across multiple DPUs, and then each of those DPUs just has the same policy and processes some fraction of the connections. So we do have that. So I guess one-for-one active-active is not everything; we also have the load-spreading ability with this.
B: Thanks, Gerald. Anyone else, guys?
D: One last question: are these goals specified somewhere, like no drops after syncing, for new connections and things like that, or the bandwidth being used for the sync?
C: I can tell you, in the document I wrote, I tried to enumerate what I thought the goals were without quantifying them, because I think each provider might quantify those differently for each deployment. So I don't think it should be part of DASH to quantify that, for example, by requiring 100% reliability of every single connection.
C: I think someone who's deploying DASH can quantify that, because I think there's a wide range of possible modes. Gerald talked about mode one, but there might be other, let's say, looser modes that might be perfectly acceptable for some deployment. So I tried to enumerate a set of metrics for HA without quantifying them as requirements.
A: So there's always a moment in time where it's possible that one side, the primary, goes dead and doesn't send out the sync even though it received the packet, and so you're going to lose that connection, or you lose it in this transition from primary to secondary, because it's not planned, it's unplanned. During that, it is always possible that you may not have formed a connection. So we want to err on the side of having all the connections that were established stay open.
A: We don't want to lose any connection in the transition that was already open, but we can't ever say that it's impossible to lose something that was lost, right? You can't go back in time; you don't know what the failure was. So what I'm saying is, I'm agreeing with you, John, that your protocol actually accommodates that. You can choose not to get an ack and say: look, I know my network doesn't lose packets, which is mostly the case in our case too.
A: I'm going to accept that these connections were formed, because they're happening at 5 million per second, right? There are five million of them happening in that second, so the likelihood is maybe a small fraction will be lost, but I think your protocol actually allows for either.
C: Right, right. It allows you to be as loose or as strict as you want, and it's up to the sender to decide how strict it wants to be; the receiver is just getting the updates and updating its state. So I think people should read it. It's a framework that allows incremental improvement in optimization without having to change the wire protocol.
C: The wire protocol can be defined up front, and then more and more complex, or loose, or different kinds of approaches can be layered on top of it and still remain interoperable.
A: I like it. Let's all at least agree to read it and be open to it, and if, after reading it, you have concerns, because you've read the details, then in the next meeting, Christina, we could talk about those concerns. I mean, theoretically it sounds very good, a good place to start, because everybody's going to argue that their mode is better for a long time, but if it accommodates what could be...
B: Thank you, BJ, thank you so much for offering. Okay, I think we have something to do for the next call then, and we would also have PRs to look at, if there happens to be something in the APIs. So thank you. Is there anything else for today, then?