From YouTube: DASH High Availability Working Group Apr 25 2023
Description
F5 joining call, Q&A
Exposing table to SONiC is not an issue; concerned w/speed
Inline sync is not easily decoupled from the dataplane - would never get the performance if we went w/SONiC
Perfect sync/Bulk sync there is no tight timeline
Think of as snapshot/restoring feature
See:
https://github.com/sonic-net/DASH/blob/main/documentation/high-avail/AMD-Pensando_HA_Proposal.md
https://github.com/sonic-net/DASH/pull/271/files
https://github.com/sonic-net/DASH/blob/main/documentation/high-avail/xsight-labs-ha-proposal-new-ideas.md
A
Working group for the day: it's April 25th, and we just have a get-together with a short presentation, or a Q&A, by F5.
B
Hi. What we're looking at doing right now is taking our rSeries, which runs this thing called F5OS, which is super proprietary, building a SONiC layer on top of that, and then getting DASH running on top of that SONiC layer. Right now we're not aware of any reason why this won't work, but the SONiC layer implementation that I found is mostly in Python, so it's mostly slow control-plane stuff. The issue we have with session replication is that you have to do really a lot of it.
B
We already have that capability in our product, but it's very tightly tied to exactly how we have chosen to implement various tables and internal memory structures. I had a question that I wasn't able to find enough information about in the proposal right now.
B
Would it be wrong if we did the session replication according to the requirements as outlined by DASH, but using our proprietary failover system? Like I said, there's no real reason why we can't put a SONiC layer on top of F5OS; they fundamentally do the same thing, just in slightly different ways. And running DASH on top of SONiC, because it's in P4, shouldn't present any huge problems. But there is going to be one issue.
B
Our session replication is super hard-coded right now. It's very high performance, but if we put a SONiC layer on top of it, it'll go way too slow for anyone to care about, and modifying it is not trivial. It would be harder than the rest of this project put together to modify our HA system. So I wanted to have this conversation to say: this is what we're thinking about, and this is what we've run into.
C
This is John; I'll chime in. There was a period of time in this community where there was a lot of discussion going on about interoperability between vendors. Now, from what you just said, it would be almost like nothing would be interoperable with yours, because it's very proprietary. But I think that over time the community has pretty much settled into a state of not having vendor interoperability for HA; each vendor is just interoperable with their own.
C
So from that perspective, I think what you're proposing is okay. I don't know how much of the requirements you read, but there's sort of an active syncing that's happening as connections are changing state, and that's happening all the time. But there's also something called a perfect sync, which is sort of a live migration.
C
I call it live migration; I guess the community and Microsoft call it perfect sync. There's a little bit of uncertainty around whether that has to go through SONiC and be transported by SONiC, or whether it could be transported by a vendor's proprietary transport for doing the perfect sync.
C
Well, I'll give you my interpretation. You have system A and potentially system B, and they're actively syncing to each other. Let's say A fails over to B, and now you want to bring up system C. Or let's just say you have system A running and you want to upgrade it: you want to migrate all of the connections off from A to B, and B is a freshly booted system.
C
It has no state at all, no connection state at all, and you want to just migrate everything from A to B.
B
We have that capability; it's just not something specific. The only issue when you talk about perfect sync is that if you want to transition from A handling traffic to B handling traffic, obviously there has to be at least a few microseconds in there where A isn't accepting new connections, because otherwise the table never stops updating. You have to pause table updates, but it doesn't take milliseconds, it takes microseconds, so it's usually not an issue. But yeah, we can do that.
B
We don't have a specific brand name for it or anything; it's just a feature of the system. But if you want that to be handled through SONiC, it'll be a lot slower.
C
I'm not saying I'm in favor of that or not in favor of that. But I think the idea is that for the perfect sync there's not a tight performance requirement; the perfect sync doesn't have to happen in some very tightly bounded amount of time. The migration can happen over a longer period of time.
B
C
Yeah, agreed. Part of the perfect sync is that you're doing an active sync of state changes at the same time that you're doing a migration of connections that maybe are idle, or have been idle for a long time. So during the perfect sync you're still doing the live, active-backup kind of sync, while at the same time moving all of your idle connections.
C
I think there's some documentation on that, but that's the idea of it. Okay.
E
C
E
Just joined very late. Hi, good morning, Christina; good morning, everyone. This is the only slide that I have seen, because I just joined.
E
When I see SONiC, and then I see this session replication at the bottom, which is F5OS, in addition to what John was just mentioning I just want to mention two things. One is that, so far, in the discussions we have had related to HA in terms of DASH, we expect that the DPU/IPU card will have a partner elsewhere in the network.
E
The primary card will have certain sessions that it's actively forwarding, and the partner will do similarly for the other half of the flow entries, for example. The sync mechanisms that we discussed were two: bulk sync, and what is called inline sync. I see that the bottom line shown here corresponds to the inline sync for the most part. We had already discussed that it could have some proprietary attributes in it.
E
That covers how it could sync. The bulk sync that we discussed was in the context of SONiC: the DPU card would take a snapshot of its current set of flows, which it has already programmed based on all the rules it has followed and the policy tables it has gone through in its pipeline, and share that snapshot with SONiC, and SONiC would basically do the bulk sync to the newly advertised neighbor.
E
That is, to the partner. At the same time, while the bulk sync is happening, which again might take a longer time, there could be new flows coming in and going as well. So the inline sync, which is shown here, could happen in parallel with the bulk sync; that is what I wanted to mention.
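As a rough illustration of bulk sync and inline sync running in parallel, a minimal Python sketch might look like the following. Everything in it is an assumption made for illustration: the `FlowTable` class, the per-flow version numbers, and the `sync_in_parallel` helper are hypothetical and do not come from DASH, SONiC, or any vendor. The version check is one simple way to stop a stale snapshot entry from overwriting a newer inline update; flow deletion is not modeled.

```python
from dataclasses import dataclass, field


@dataclass
class FlowTable:
    """Hypothetical flow table: flow key -> (version, state)."""
    entries: dict = field(default_factory=dict)

    def apply(self, key, version, state):
        # Keep only the newest version so a stale bulk-sync entry
        # never overwrites a fresher inline-sync update.
        current = self.entries.get(key)
        if current is None or version > current[0]:
            self.entries[key] = (version, state)


def sync_in_parallel(primary, backup, inline_updates):
    """Interleave a bulk sync of a snapshot with inline updates."""
    snapshot = list(primary.entries.items())  # taken once, up front
    pending = list(inline_updates)
    while snapshot or pending:
        if pending:
            key, version, state = pending.pop(0)
            primary.apply(key, version, state)  # data plane changes state
            backup.apply(key, version, state)   # inline sync to the peer
        if snapshot:
            key, (version, state) = snapshot.pop(0)
            backup.apply(key, version, state)   # slower bulk sync path


primary = FlowTable({"flow-1": (1, "SYN"), "flow-2": (1, "ESTABLISHED")})
backup = FlowTable()
updates = [("flow-1", 2, "ESTABLISHED"), ("flow-3", 1, "SYN")]
sync_in_parallel(primary, backup, updates)
print(backup.entries == primary.entries)
```

In this toy run the backup ends up with the same entries as the primary even though `flow-1` changed state after the snapshot was taken.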
E
So if SONiC is doing the bulk sync, and if you want someone else to do the bulk sync, meaning a different NOS... yeah.
B
Our system has the ability to do the bulk sync and the inline sync at the same time; these are all capabilities that we've got. Exposing the bulk sync to SONiC is not an issue. The issue is just that it would go a thousand times slower.
B
Basically, the SONiC layer is not particularly performant, so it's just going to be so much slower. But if you're okay with the speed, then I don't see that as a huge problem. Exposing the table to SONiC is not an issue, and giving SONiC the ability to update the table is not the issue. The issue is speed. The reason I was concerned about this is that our HA session handling is obviously extremely proprietary. It's very tightly integrated with our specific implementation.
B
It depends on the details of how we do forwarding, so the inline sync cannot easily be decoupled from the forwarding plane, because the two are essentially one thing. So I was just concerned that if we were going to try to do inline sync at the SONiC level, there's no way you'd ever get the performance that you need. It's just too slow.
C
That's right, I think everybody agrees with that. For the perfect sync, or the bulk sync, I think the idea is that there's not a tight time bound on how long it takes to complete, so it could go through SONiC. Again, I don't necessarily agree or disagree with it going through SONiC. But for the requirement on how long it takes to sync when you're doing a perfect sync, I've heard minutes, or 10 minutes.
C
Those are the kinds of times that have been thrown out.
A
E
B
B
What we're aware of is that either we're being asked to present our forwarding table, or we're being asked to accept entries from SONiC into our forwarding table, right? What SONiC does with that, whether it's used to restore a backup, or whether we're making five copies of the same thing, however that's working, we're not aware of. We're just a single node.
B
We know about our inline sync, and we know that SONiC can tell us, "Hey, add stuff to your forwarding table." But we're not going to think of it as a synchronization feature; we're going to think of it as a snapshotting and restoring feature.
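The snapshot-and-restore view described here can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the `ForwardingNode` class, its `snapshot` and `restore` methods, the JSON encoding, and the flow key format are all hypothetical, not an F5, SONiC, or DASH interface.

```python
import json


class ForwardingNode:
    """Hypothetical single node that only knows its own table."""

    def __init__(self):
        self.table = {}

    def snapshot(self):
        # Serialize the current table; what the caller does with the
        # result (backup, copy to a peer, ...) is invisible to this node.
        return json.dumps(self.table, sort_keys=True)

    def restore(self, blob):
        # Accept entries handed in from outside and add them.
        self.table.update(json.loads(blob))


node_a = ForwardingNode()
node_a.table["10.0.0.1:1234->10.0.0.2:80/tcp"] = {"state": "ESTABLISHED"}
node_b = ForwardingNode()
node_b.restore(node_a.snapshot())
print(node_b.table == node_a.table)
```

Neither node knows about the other; whatever sits between `snapshot` and `restore` decides where the entries go.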
C
Right. So Reshma had mentioned that there's a partner or a peer, like an HA peer. Actually, there's not necessarily just one HA peer. There's a concept of an ENI, which is sort of an interface, and each ENI can have a different HA peer.
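To make the per-ENI pairing concrete, here is a small Python sketch. The ENI identifiers, the peer names, and the `peers_to_sync` helper are invented for illustration and are not part of any DASH schema.

```python
# Hypothetical ENI -> HA peer assignment; names are illustrative only.
ha_peer_by_eni = {
    "eni-1": "dpu-b",
    "eni-2": "dpu-c",
    "eni-3": "dpu-b",
}


def peers_to_sync(eni_ids):
    """Group ENIs by the peer each one must sync with."""
    groups = {}
    for eni in eni_ids:
        groups.setdefault(ha_peer_by_eni[eni], []).append(eni)
    return groups


print(peers_to_sync(["eni-1", "eni-2", "eni-3"]))
```

One card can therefore be syncing different ENIs with different partners at the same time.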
B
Yeah, so we actually have N+1 configurations for this. If we need more than 16 nodes it might become a problem, but up to 16 nodes I'm pretty sure our system will be able to do whatever we want it to do. I've seen it with my own eyes handle a 16-node configuration; I have not seen more than 16.
B
B
C
I don't think that that's the requirement. The requirement here is that there are many active-backup or active-active pairs. It's not that there are big clusters of N that are all syncing to each other; it's just that there are many active-active pairs. Yeah.
C
Yeah, so that's the requirement. And there's some scale information about the number of ENIs that need to be supported, and it scales with throughput. For a 200-gig data plane the scale requirement is 64 ENIs, and a 400-gig data plane would be double that.
C
So that's the kind of scale number for the number of ENIs for a 200-gig data plane. Each of those ENIs would have a partner and would be performing HA with its partner.
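The scale figures quoted here (64 ENIs at 200 gig, double that at 400 gig) can be written as a one-line calculation. The linear rule for other throughputs is an assumption extrapolated from those two data points, and `required_enis` is a hypothetical helper name.

```python
ENIS_PER_200G = 64  # figure quoted on the call for a 200-gig data plane


def required_enis(throughput_gbps):
    """Assume the ENI requirement scales linearly with throughput."""
    return ENIS_PER_200G * throughput_gbps // 200


print(required_enis(200), required_enis(400))
```

Since each ENI has its own partner, the same number also bounds how many HA pairings a card of that size would carry.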
B
C
It's possibly worth going back and looking at some of the recordings of past meetings. There were a lot of recordings by Pensando; I don't know if anybody from Pensando is on here.
C
They went through their approach, where basically any kind of packet that causes a state change is sent to the backup and then returned back to the primary. That's how they chose to implement their HA. But again, because there's no interoperability requirement, I think each vendor is free to choose how they want to do that synchronization, and there are many trade-offs. Yeah.
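The approach described here, where a state-changing packet goes to the backup and comes back before the primary forwards it, can be sketched roughly as below. This is a simplified illustration of the idea as stated on the call, not Pensando's implementation; the `Primary` and `Backup` classes and their methods are hypothetical.

```python
class Backup:
    def __init__(self):
        self.flows = {}

    def mirror(self, key, state, packet):
        self.flows[key] = state  # backup records the new state first
        return packet            # then returns the packet to the primary


class Primary:
    def __init__(self, backup):
        self.flows = {}
        self.backup = backup

    def handle(self, key, new_state, packet):
        if self.flows.get(key) != new_state:
            # State change: detour via the backup before forwarding.
            packet = self.backup.mirror(key, new_state, packet)
            self.flows[key] = new_state
        return packet  # forwarded only after the backup has seen it


backup = Backup()
primary = Primary(backup)
primary.handle("flow-1", "SYN", b"syn")
primary.handle("flow-1", "ESTABLISHED", b"ack")
primary.handle("flow-1", "ESTABLISHED", b"data")  # no state change, no detour
print(backup.flows == primary.flows)
```

The trade-off this makes visible is extra latency on state-changing packets in exchange for the backup never lagging behind the primary.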
B
And that's basically all I wanted to know. That right there is the answer that I was hoping to get out of this meeting, so I'm happy at this point. Okay, and we obviously have to keep investigating this on our end to try to uncover as many of the hidden problems as we can find, but that was one of the big calls I wanted to make.
B
I didn't want to go to all the effort of checking out the platform and SONiC and all that if we're going to have to go and re-implement HA from the ground up, which is not really possible to do realistically.
B
D
There's also a document, the proposal document, in there. It has the state machine, what the SONiC role is, and what the platform does. Maybe that will help too.
A
Have you seen that, Tim? I'm not sure.
A
Let me share my screen, if that's okay. Okay, and of course now you have to be patient with me, because my screen's wrong. Okay, so here, this is just a set of playlists from the working group. Long story short, you could go back and look at these. I think I sent you the link, but I could send it again.
A
But if you go to the repo and you go into documentation, it should be in high-avail here. Then there's an AMD one; there's a couple of them, an original and an updated, so there's a couple here that were very good. And then the proposal, yeah, okay. This one was very, very, very good, and it's long, just FYI; it's very long. Then there are also alternate ideas for proposals here. And then we did take a swing at creating some slides.
A
That was Marian, around what we thought the requirements might be, but they're not super fleshed out. So that's the folder, which is called documentation/high-avail.
A
D
Yeah, no, it's not gone to the agile duration. It's here. Whatever there is, is here.
E
D
Yeah, for the DASH API, I think we are working on a PR to get this thing in, but it's not in yet.
A
Yeah, so there's a little more here, Tim: we just added a few changes, and it's sitting in a PR.
A
If that's helpful. All right, okay. And then, like I said, the playlists are here. I try to put them in date order, so I think they were toward the end of the playlist rather than the beginning, where AMD was presenting. So anyway. Well, great; was there anything else for the day?
A
Yeah, hey, thanks for coming, and thanks for asking the questions. Does anyone else on the call have anything?