DASH High Availability WG, 27 Sep 2022

From YouTube: DASH High Availability Working Group Sep 27 2022

Description

Unplanned Switchover discussion
Requested updates to HA Proposal document
Thank you Sanjay for presenting!
Next week discussion on MetaData policies/expectations

A

Thanks for coming to the High Availability meeting today, September 27th. I was out last week and Reshma hosted for us, so thank you very much for doing that, Reshma. I have the notes here about the things we talked about last week. It looks like we had some things that need to be added to the HLDs, which I'll take care of, and it looks like we talked about how to do certain things,

A

how to verify behavior, things like that. We're saying that the HA document review is completed, the one from AMD, so I hope everyone agrees with that. If there is any other Q&A for them, we could bring that up in this call. Then it looked like this week we wanted to talk about this line here, packet flow, and see what happens when there's no ACK from standby, and then overlay ECMP with HA from Gohan. So.

B

Switchover: we haven't discussed those detailed flows last week.

A

Does everyone agree? We could go ahead and talk about that first, then, this week.

C

Yep, just an update from my side. For whatever review comments I got, I'm almost done making the changes in the document. I'll probably update the PR with the changes sometime later today or early tomorrow. I'll notify once it's out.

A

Okay, yeah, sounds good. Let me just grab this here. Okay, Gohan, we want to cover unplanned switchover, so maybe we can start with the context of the question.

A

And what we want to talk about.

B

Oh, can we go over the document again?

A

Okay. Sanjay, would that be okay with you? Okay, I'm going to stop presenting.

A

Did you need just a minute, Sanjay?

C

I didn't present.

C

Give me one second.

A

And let me know if you'd like me to present it. Oh, there you go. Okay.

C

Gohan, this is the one we wanted to...

C

So I think this is the easier flow. I'll just quickly go over it, and then, Gohan, you can raise any specific questions and we can talk about them. This is loss of the heartbeat. Once we detect loss of heartbeat on the old standby side, we record that peer connectivity has been lost, and then we switch over: there is a jump to becoming the standalone primary.

C

Once I move to standalone primary, the peer is gone, so there is no sync happening anywhere. We continue forwarding with whatever flow state we have, and the controller has the option of coming in and initiating flow reconcile. The idea here is that once the controller knows the configuration is at the level where it needs to be, there is a flow reconcile. When we do the flow reconciliation, we re-evaluate the flows based on the current policy,

C

whatever policy is there. If there are any changes to be applied, for example a deny flow might have become a permit or the other way around, or any other changes that are needed, those changes are applied, and at that point we are done. So that's the high-level flow. The key thing is that once we switch to primary there is no more syncing, because there is no peer available at that point.
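
[Editor's sketch] The unplanned switchover sequence described in this part of the discussion can be pictured as a small state machine. This is only an illustration of what was said, not code from the HA proposal. The names `StandbyNode`, `on_heartbeat_loss`, and `forward`, and the string actions, are hypothetical.

```python
from dataclasses import dataclass, field
from enum import Enum, auto


class Role(Enum):
    STANDBY = auto()
    STANDALONE_PRIMARY = auto()


@dataclass
class StandbyNode:
    """Hypothetical model of the standby side during an unplanned switchover."""
    role: Role = Role.STANDBY
    peer_connected: bool = True
    flow_sync_enabled: bool = True
    # flow key -> action that the old primary computed and synced over
    flows: dict = field(default_factory=dict)

    def on_heartbeat_loss(self) -> None:
        # Record the peer as lost, stop syncing (there is no peer to sync with),
        # and jump straight to standalone primary.
        self.peer_connected = False
        self.flow_sync_enabled = False
        self.role = Role.STANDALONE_PRIMARY

    def forward(self, flow_key: str) -> str:
        # While in standby the data plane drops traffic even if it has flow state.
        if self.role is not Role.STANDALONE_PRIMARY:
            return "drop"
        # After switchover, keep forwarding with the flow state already present.
        return self.flows.get(flow_key, "miss")
```

Flow reconcile is deliberately absent from `on_heartbeat_loss`: per the discussion, it is initiated later by the controller.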

B

How does the SDN controller know that they need to make the standby become the primary now?

C

I think it would need to be through some event that we raise.

C

I think the key here, Gohan, is this: if we reconcile the flows immediately after switchover, the configuration may not be up to date, because the controller pushes configuration independently to each of the nodes. So we might go and disrupt the traffic

C

that is there, because we might apply an older config and disrupt traffic, and then the controller catches up the config and we reconcile again. So there was a request from Microsoft to leave that control with the controller, so that we don't disrupt any existing traffic.

B

Because I think the loss of the heartbeat doesn't really mean that the primary is down, right? The primary could still be there. In that case you may have two primaries in the network now.

C

It could be, so we might have two primaries in the network. What happens if there is a network partition like that? It may not be a full network partition. I think you're right. The case might be a little contrived, but let's say someone put a rule or an ACL in the network which blocks only the heartbeat traffic,

C

or somehow affects only the heartbeat traffic, so the heartbeat traffic is not being received and there's no communication on the heartbeat channel between the two. If that is the case, we would switch over. The bad case here would be a partial failure, in the sense that the heartbeat is broken but the rest of the network is fine. If that is the case, that is still fine.

C

What happens in that scenario is that, because we are currently advertising via BGP, for example, if you're using BGP and doing this with AS-path prepend, the rest of the network would still send traffic only to the primary and not to the secondary.

C

So essentially the secondary would think it is standalone primary but receive no traffic, while whoever is the admin active also thinks it is standalone primary, receives traffic, and continues to forward. Let's say the connectivity gets restored, yeah.

C

When the connectivity gets restored for the heartbeat, they would peer up again and go back to the same procedure: it would flush all the flows, we restart bulk sync, and the standby would come back up as standby.

D

How does the controller know, even in the case where only the heartbeat channel is affected, that there is a switchover?

C

You mean... I think this would need to be via an event that we raise from the switch side, I mean from the...

D

Understand why.

C

On the standby side, here, we need to raise the event, which the controller needs to monitor. I think the key thing is that for the switchover itself we don't need the controller to know. The only portion where the controller needs to get involved is after the switchover, if you have to do flow reconciliation. That is when...

D

No, but you just mentioned that there is something Microsoft wanted you to hold back on the config side, on the controller. What was that thing?

C

That is just the flow reconcile. The thing is, like we were saying, all the flows on the standby follow the policies on the active side. We created the flow on the active before switchover: a packet hit the active, we looked up the policies, and the policies might be allow or deny, or some metering policy or something else.

C

So we created the flow according to the policies on the active, and we sync the flow as is, with the metadata, to the standby. We don't want to reapply the policy on the standby, because then the two could be out of sync, so we take the same metadata, whatever the policy results are on the active, and sync the flow to the standby. So now, when a switchover happens, the flow is according to whatever policies were there on the old primary.

C

Whatever the old primary had, we continue forwarding the flows. Now, it's potentially possible that the configuration state of the policies on the secondary side is a little behind, or is different from what the flows indicate, because the flows followed whatever was there on the primary. We want to bring the policy and the flow state in sync, and that is triggered by the flow reconcile.

C

When the flow reconcile happens, we walk all the flows and re-evaluate the policy, so there might be a change in the flow itself and we may need to do some reconciliation. There might be a new metering class that applies to a flow: on the old primary it was different, but now the configuration on the secondary, the new primary, says something different.

C

So then we'll need to go and apply that policy to the flow. That is what is held back.

D

You are applying the policy because you think that the policy may have changed in between, around the switchover?

C

Yes, it might be around switchover, or let's say there is a policy update. When the policy update happens, it's not an atomic operation for the controller to sync the policy to the primary and the secondary at exactly the same time. The primary might have gotten updated but not the secondary yet; it is following behind. It could happen that way. Within the node, it's not possible to know whether the policy is up to date or not.

C

So we leave that control to the controller. It lets us know: okay, policy is now up to date, now you can go and reconcile the flow state. We don't want to be caught behind, applying an older policy and disturbing the flow, and then immediately after that getting the newer policy and reapplying. We don't want to do that, so we just hold off on reapplying. So the reconciliation

C

is the one that is held back, the part that is dependent on the controller. But whatever existing flows and flow state were there, we continue forwarding as soon as the switchover happens.
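
[Editor's sketch] The controller-triggered flow reconcile discussed here amounts to walking the flow table and re-evaluating each flow against the policy now configured on the new primary. The function below is a minimal illustration under that reading. `reconcile_flows` and `evaluate_policy` are hypothetical names, and a real flow entry would carry far more than a single action string.

```python
from typing import Callable, Dict, List


def reconcile_flows(flows: Dict[str, str],
                    evaluate_policy: Callable[[str], str]) -> List[str]:
    """Re-evaluate every flow against current policy and update the ones that differ.

    flows maps a flow key to the policy result that was synced from the old
    primary. evaluate_policy returns the result under the policy currently
    configured on this node. Intended to run only when the controller signals
    that the configuration is up to date, never automatically at switchover.
    """
    changed = []
    for key, synced_result in flows.items():
        current_result = evaluate_policy(key)
        if current_result != synced_result:
            flows[key] = current_result  # e.g. a deny that has become a permit
            changed.append(key)
    return changed
```

For example, with `flows = {"a": "deny", "b": "permit"}` and a policy that now permits both, only flow `"a"` is rewritten and flow `"b"` is left untouched.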

D

Yeah, so essentially what you're doing is flow re-simulation after the switchover, correct? Okay, good, thanks.

B

So where are these two seconds coming from, the two seconds of disruption time? I don't see anything in these steps mentioning the two seconds. What's there to guarantee the two seconds?

C

This would depend on us detecting heartbeat loss, and on anything we need to do for the data plane to start forwarding again. So far, the standby is dropping traffic. Even if it receives traffic, it will not use the flow state to forward, because the state of the data plane is standby,

C

the secondary state. So there is some activity we need to do to start forwarding. That is one component, which is software work on the DPU, on the node itself. There is also another component for drops, which is the network itself.

C

When this switches, let's say there is an actual failure on the primary side, BGP or whatever the network protocol is has to reconverge and start forwarding all the traffic to the secondary side. So this two seconds is our goal, Gohan. All our testing so far has been to make sure that we do not exceed that two seconds.

C

We are trying to do whatever we can to keep it.

B

Yeah, I guess that part is the goal, but what are the components in this system that help us reach that goal? That's my question. The components you mentioned are the heartbeat and the network. So how do we guarantee that the heartbeat will meet this two seconds? What is our heartbeat rate currently?

C

I think that is, Balki, I think it's 100, 150 milliseconds, 200 milliseconds?

E

Correct, yeah. In our testing we tried it with 150 milliseconds. We have to tune it and see what suits, based on the actual deployment.

B

But what is the critical path here? We lost the heartbeat; when do we start advertising? From the loss of the heartbeat to the point where the previous standby is fully functional, what are the critical parts that we need to do, and what is the time spent on each of those components? Do we have a breakdown of those,

B

a timeline that, starting from the loss of heartbeat, shows each step that we need to carry out in order for the standby to...

C

Yeah, so from our testing, empirically, I think we can break it down. We don't have that breakdown yet. I think Balki might have something from our testing, but it's not in this document. But one thing, Gohan: one optimization we consciously made on the standby is that we do not wait to advertise until after we detect switchover, because then we would be adding

C

one more component, which is that the route has to start propagating up. We will talk about it with the ECMP approach too. So the idea here is that we don't create one more leg, the route propagation itself, by advertising the route only when we do the switchover.

C

Instead, even before switchover we are advertising the route, but with a worse metric. So all that happens is that when the route on the active side is lost, traffic switches over to this backup path. That way we reduce the switchover time. So that is one component, the network component. Other than this, like Balki was saying, 150 milliseconds: this is the actual heartbeat detection time itself.
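
[Editor's sketch] The pre-advertised backup path can be pictured as plain best-route selection: both nodes advertise the same prefix, the standby with a worse metric, so the network already holds the fallback when the active's route disappears. This is a deliberately simplified model. Real BGP best-path selection compares several attributes (local preference, AS-path length, MED and so on) rather than one number, and the dictionary fields below are hypothetical.

```python
from typing import Dict, List, Optional


def best_next_hop(routes: List[Dict]) -> Optional[str]:
    """Return the next hop of the advertised route with the lowest metric.

    Lower metric is preferred. Withdrawn routes (advertised=False) are ignored.
    """
    live = [r for r in routes if r["advertised"]]
    if not live:
        return None
    return min(live, key=lambda r: r["metric"])["next_hop"]


routes = [
    {"next_hop": "primary", "metric": 10, "advertised": True},
    # Standby advertises before any switchover, but with a worse metric.
    {"next_hop": "standby", "metric": 100, "advertised": True},
]
before_failure = best_next_hop(routes)   # the primary wins while it is present
routes[0]["advertised"] = False          # primary's route is lost
after_failure = best_next_hop(routes)    # traffic falls back to the standby
```

No new advertisement has to propagate after the failure; only the withdrawal of the better route does.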

C

So there is some detection time, and after we detect, there is some activity we need to do to switch the state to standalone so that we can actually start forwarding the traffic. So these are the three components within that successful switchover time.
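
[Editor's sketch] The three components named in the discussion (heartbeat detection, the state switch on the node, and network convergence) add up to the disruption time that is measured against the two-second goal. The arithmetic below only shows how such a budget could be tallied. The 150 ms heartbeat interval comes from the discussion; the missed-heartbeat count, the state-switch time, and the convergence time are made-up placeholder values, not measurements.

```python
GOAL_MS = 2000  # the two-second switchover goal mentioned in the meeting


def switchover_time_ms(heartbeat_interval_ms: int,
                       missed_heartbeats: int,
                       state_switch_ms: int,
                       network_convergence_ms: int) -> int:
    """Sum the three components of an unplanned switchover."""
    detection_ms = heartbeat_interval_ms * missed_heartbeats
    return detection_ms + state_switch_ms + network_convergence_ms


# Placeholder inputs: only the 150 ms interval is taken from the discussion.
total = switchover_time_ms(
    heartbeat_interval_ms=150,
    missed_heartbeats=3,         # hypothetical loss threshold
    state_switch_ms=200,         # hypothetical
    network_convergence_ms=500,  # hypothetical
)
within_goal = total <= GOAL_MS
```

With these placeholder numbers the total is 450 + 200 + 500 = 1150 ms. A per-step breakdown from real testing is what was requested for the document.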

B

Okay, that's much more clear. Can we update the document to include that critical path?

C

Sure, we can do that. We can try and add what happens during switchover and what pieces are involved. Sure, we can do that.

B

All right, thanks. I saw that metering was mentioned, so let me ask a loosely related question; I don't know if it has been discussed. During the switchover, do we also replicate the metering data from the primary to the backup?

C

No, Gohan, it's not replicated.

B

Okay, so those are lost, right? If they are doing billing and that data is lost, how do we get it recovered, or what was the plan?

C

I think the plan was that the controller polls to collect the data, so within that polling interval there is some loss. It might have polled the primary at whatever frequency and collected the meter data; then the switchover happens and we start metering on the new primary, on the other side. Whatever is lost within that polling interval is still lost, but the meter class etc. will remain constant.

C

So if a class ID was allocated on the previous side for a particular class of traffic, the same class ID will start metering on the other side. But within that one polling interval, whatever was lost is lost. I think that is the plan.
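
[Editor's sketch] Since meter counters are not replicated, what an unplanned switchover loses is whatever the old primary metered after the controller's last poll. The helper below illustrates that bound. `lost_meter_bytes` and the `(timestamp, bytes)` sample format are hypothetical and do not correspond to any DASH API.

```python
from typing import Iterable, Tuple


def lost_meter_bytes(samples: Iterable[Tuple[float, int]],
                     last_poll_t: float,
                     switchover_t: float) -> int:
    """Bytes metered on the old primary after the last poll and up to switchover.

    samples is an iterable of (timestamp_seconds, byte_count) increments for one
    meter class. Anything at or before last_poll_t was already collected by the
    controller, so it is not lost.
    """
    return sum(b for t, b in samples if last_poll_t < t <= switchover_t)


# Controller polls every 60 s; last poll at t=60, unplanned switchover at t=66.
samples = [(1.0, 100), (5.0, 200), (61.0, 50), (65.0, 70)]
lost = lost_meter_bytes(samples, last_poll_t=60.0, switchover_t=66.0)
```

The loss window is at most one polling interval, which is why the polling frequency, and therefore the acceptable loss per feature, is left to the controller.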

B

Okay, did we explicitly call this out in a document?

C

I don't know. Metering, not here. But this will apply to a lot of other things as well. I just brought up metering because it is a policy other than just permit/deny, but it applies to stats or anything else that we are collecting, because the only thing we are syncing between the nodes is the flow state. We are not syncing all the metering state, because that would require a lot of bandwidth.

B

Yeah, can we call out what those are, and what the expectations are for the controller side? Apart from the meter you mentioned, there are other states, and there will be implications for those. Can you elaborate?

C

Yeah, I can add that. Maybe somewhere we'll elaborate and say which states we sync; everything else is not synced. I think somewhere we already say we are only syncing the flow state. I will make it explicit.

D

I think what Gohan is probably asking is: what is the expectation from the controller or an outside entity? How often should they poll and collect those statistics so that, when the switchover happens, you have only lost the last interval that you missed?

D

So if the controller is collecting every second, or every five seconds, or whatever the case may be, then essentially only that much billing data is lost when the switchover happens, correct?

C

Yeah, I agree. What I was trying to communicate is that this would be left to the controller: for each of these features, how much are we ready to lose? That will dictate how often we want to collect, which might be specific to each feature, or to whatever the controller's SLA expectation is.

C

That's what I meant to say: in this document we cannot recommend how often to collect; that is up to the controller. What I can call out is that we are not syncing all the state, all the in-memory state, from active to standby. What we are syncing is only the flow information, whatever is mentioned in this document.

C

Would that make sense?

D

I think we can probably do it for the planned switchover; at least there we can protect the statistics. For unplanned, perhaps you lose it, but for planned, if you flush it out or send it to whatever outside entity is collecting it, at least you can preserve the data.

B

I think it's probably not that critical. For planned switchover, maybe the expectation is that the controller would query the primary right before the switchover, so you get the latest data, and once you switch over, then poll the other side.

B

So even though the controller may be polling at a one-minute interval, if it does a poll right before the planned switchover, it will get very accurate data before the switchover. That should address some of the concern that we lose some of the metering data during the switchover, because the switchover is pretty fast, on the order of seconds.

C

Yeah, you're right. I think with planned it might be even quicker, because once we notify the other side, we are...

C

Yeah, a message to the other side to switch.

B

For unplanned, I guess it's a little bit hard to know when it happens, because it's unplanned, so you might not have the chance to query before the switchover. Therefore we would set up some expectations: say, if you are querying every minute, then you might lose a minute of metering data during the switchover, depending on your

B

polling interval. I hope that we can call those specifics out in the design documents so that the expectation is clear on the controller side, because we're not planning to do state replication for things other than the flow data. So, apart from the meter you mentioned, there are some other states.

B

What are those other states? I'm trying to understand a little bit more about that. If we know the other state that an application requires, we can explicitly call those things out, along with the specific implications. That would be better. So, apart from metering, do you know of other things that we're not replicating?

C

There could be others. There's mirroring, and these are all the policies that we have. Probably port mirroring, if you are doing a SPAN kind of thing. I think there was a request to be able to do that. That, and...

C

What else? I'm trying to think through all the policies that we have.

C

Yeah, those are the two. Okay, anything else that you can think of? Any other features that we currently implement already, policies that might have some in-memory state?

E

Policy-wise, whatever policy lookup we do, we carry that as metadata and we try to apply that on the other side. Yeah, I mean, like metering, obviously...

C

Stats and all, we don't.

E

We don't, yeah. Hey, hello, can you hear me?

E

Okay, yeah. Stats and all we obviously don't sync, and obviously it is not a sync for every packet. With respect to state, we cannot sync each and every sequence number, ACK number, those things. With respect to policy per se, whatever lookup we do on the active side, we try to take that result to the standby as well.

C

So I think it's only things where we have a policy which carries its own independent in-memory state. Those are the things I'm trying to think of: metering and statistics. I can't think of any other. We can comb through all the policies that we have and see what is there.

B

What is the implication for the sequence number and the ACK number? You mentioned that we don't sync that part. What would be the implication if we do not sync those sequence numbers explicitly?

E

So after the switchover you need to learn the sequence number from the packet, and then you have to go from that.

C

I mean, if you have...

E

Yeah, if you have to do very sophisticated sequence number and ACK number checks, after switchover you first have to relax them a little bit and then go from there.

A

Would it be that we would lose small right.

B

So, first question: are we using the sequence numbers and the ACK numbers to do TCP flow tracking in the current DASH pipeline? Are we already doing that or not?

F

This is John from excite. I don't think it's in the behavioral model, but we've talked about this: tracking the sequence number on the SYN, so that you can't get into the established state unless the acknowledgments are for the SYNs, and also, for closing the connection, tracking the sequence number of the FIN. If you track the sequence number in every single packet, you can do even more secure things, like

F

TCP window checking, but I don't think we've ever talked about window checking. We've only talked about it as a way to know, for example, that the connection is actually closed, which you cannot know unless you track the sequence number of the FIN through to the final ACK.

B

Okay, so if we're doing that, and we're not synchronizing those sequence numbers between the active and standby, then after the switchover, if we receive a FIN packet, how do we handle it, given that we haven't kept track of those sequence numbers?

B

Is there going to be a problem with that?

G

There's still aging.

F

Yeah, right. So it seems like the connection won't be actively removed on the final ACK; the flow would be removed based on aging. There's a small window there: if the switchover happened between the FIN and the ACK for that FIN, then that flow might need to be aged out instead of actively removed.

G

Yeah. If we got the FIN before the switchover and then the FIN-ACK got missed, the flow will take its time to age out. But if the FIN comes after switchover, then there is no change in the behavior.
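
[Editor's sketch] The FIN/ACK corner case can be illustrated with a toy flow table. A flow whose FIN was seen by the old primary arrives on the new primary without a recorded FIN sequence number (sequence numbers are not synced), so the final ACK cannot be matched and the entry is removed only when it ages out. All names here (`Flow`, `on_ack`, `age_out`) and the timeout values are hypothetical, and sequence-number wraparound is ignored for brevity.

```python
from dataclasses import dataclass
from typing import Dict, List, Optional


@dataclass
class Flow:
    last_seen: float
    # Sequence number of a FIN that THIS node observed. It stays None when the
    # FIN was seen only by the old primary before the switchover.
    fin_seq: Optional[int] = None


def on_ack(table: Dict[str, Flow], key: str, ack_num: int, now: float) -> bool:
    """Remove the flow if this ACK acknowledges a FIN this node tracked."""
    flow = table.get(key)
    if flow is None:
        return False
    flow.last_seen = now
    # A FIN consumes one sequence number, so its ACK carries fin_seq + 1.
    if flow.fin_seq is not None and ack_num == flow.fin_seq + 1:
        del table[key]
        return True
    return False


def age_out(table: Dict[str, Flow], now: float, idle_timeout: float) -> List[str]:
    """Fallback path: drop flows that have been idle for idle_timeout seconds."""
    expired = [k for k, f in table.items() if now - f.last_seen >= idle_timeout]
    for k in expired:
        del table[k]
    return expired
```

A flow whose FIN arrives after the switchover has `fin_seq` set on the new primary and is removed actively, matching the "no change in behavior" case.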

D

I mean, there are a bunch of scenarios here: during establishment, while the flow is in flight, or towards the end when you're trying to close the connection. There are multiple scenarios where you can have this orphaned state on a switchover. Are we relying on retransmits here?

D

How we reconcile the state to be accurate on the standby is not clear to me. There could be orphaned connections and we may have to rely on other mechanisms, but an existing in-flight connection may be gone completely if the state machines don't converge.

C

Yeah, so there are two things. There is switchover during a state change; I think this is what we're talking about with aging catching it. For the flow itself, when it is an active flow, switching over during the life of the flow is fine. We relearn the sequence number and continue from there, so there is no cost of any kind and no aging, nothing of that.

C

The only thing is, since we are using the sequence number, as John was mentioning, only during state transitions, either during the creation or during the termination of the state, that is when the fallback mechanism is aging. Other than that, switchover does not affect it. We learn from the flow and start tracking it on the standby side,

C

on the new primary side.

D

Okay, yeah. Depending on the state, the fallback mechanism could be aging on the receiver side, but there could also be a timeout on the sender side, where, if it is expecting something, it can retransmit. Sorry.

C

Please continue.

B

Can we also call this out explicitly in the document, saying that because we're not syncing those sequence numbers, for some flows, if a state transition happens during the switchover,

B

the new primary has to rely on aging to age out those flows?

C

Yeah, we can add that.

G

And since we're talking about aging: I don't remember this document covering the regular case either, when everything is in steady state. We have primary and secondary. A flow is synced to the secondary, and then that flow lasts for a long time. It would have been aged out under normal circumstances, but there is traffic going through the primary, so the primary won't age it out. How will the secondary handle this?

C

The aging is always driven by the primary. Any flow that needs to get aged out gets aged out from the primary, and the secondary does not age out. Other than that, I think there is a very long timer. Balki can...

G

Okay, so if the flow is removed by aging and not by FIN, how will the secondary know about that?

C

It's synced from the primary.

G

Is it specified in the document as well? Because I didn't see it.

C

Aging... maybe we haven't talked about aging much in the document, I guess. Yeah.

G

Yeah, I tried to do a quick search and didn't find anything. So this is another thing that's also missing.

G

Sanjay, last week we talked about the retransmission cases. Maybe you discussed it; I was off for a little bit during that meeting. But can we write in the document when the retransmission happens and what scenarios it covers?

C

I think somewhere we had mentioned that we rely on retransmission from the client. We can just talk through it quickly if you want. When we complete what we're talking about now, we can go back to the packet flow and see what happens in each leg. If there is a drop, it gets covered by the retransmission from the client, but we can go through it if you want.

G

Okay. And also, when there is re-simulation happening in the primary that is syncing, do we at that point use the data path sync to sync the entries that have changed, since it can happen in the background or however the DPU might implement it? Or will the SONiC gRPC channel get opened up for the re-simulation-related sync?

C

Ah, so whenever there is a state change due to re-simulation?

G

State change due to re-simulation.

B

I was unable to follow that. I think you're mixing retransmission and re-simulation. Those are two topics, yeah.

G

Two separate topics. Retransmission is something that Sanjay will be covering with the example of the packet flow in each state. Re-simulation is a separate topic. Re-simulation is when the policies on the card are changed by the SDN, and there are different ways each DPU might be handling that. In that case, how do we sync the information about the state change to the HA partner?

B

Can we explore the topics one by one? Can we talk about the retransmission first and have it closed, and then we can talk about the re-simulation?

B

So, who is going to talk about the retransmission problem here?

C

I think this may be a picture that we can use. So this is the flow establishment flow. What I'm thinking is that we can go through each leg and see, if there is a drop, how it gets addressed. Does that make sense? So here the legs are: we have VM A sending to the primary. This is the first leg.

C

Then there is leg two, where the packet is synced to the secondary, then it comes back, and then it goes on. There are four legs in this. So, as you can see, let's say there is a drop in any leg, and let's say it is a SYN packet. We can think about TCP and then UDP. If you take a TCP packet and there is a drop in any of these four places, then VM B does not receive the SYN, right?

C

So VM A would retransmit the SYN packet, or anything during connection establishment, and that kind of retriggers, that kind of covers the flow again wherever there is a loss. For example, if there was any drop before leg number three, the primary DSC would keep sending the packet to the secondary DSC, because it hasn't seen that the flow is synced to the secondary yet.

C

So let's say the packet got dropped in leg number two or number three. Then on any retransmission from VM A, the primary would receive it and keep forwarding it to the secondary while in that state. But if leg number three finishes, that is, we get an ack back from the secondary for the flow sync, we know that the flow has been synced to the secondary. After that, let's say the drops are happening in leg number four.

C

In that case the packet would come to the primary. The primary would forward it directly to VM B, because we know that the flow has been synced to the secondary already, right? It would send to VM B, so it kind of gets short-circuited, hoping that one of the retransmissions makes it to VM B. Then the establishment completes. So that would be the sequence. If it is UDP, I think, since there is no SYN, we keep sending it to the secondary until the ack comes back.

C

Once the ack comes back, it's short-circuited on the primary.

B

The ack, for UDP?

C

We get the ack from the DSC. We get the indication that the flow has been synced to the secondary. It's just between the primary and the secondary: the flow is echoed back from the secondary to the primary, and that is treated as an ack.

B

That's number three, right? Leg number three, correct?

C

Once we get three, then we know that it's there in the secondary. The flow has been synced to the secondary and, from that point on, the primary does not forward anything to the secondary for flow sync.

B

That's the same for TCP, right? So sorry, what's the difference between...

B

Before you reach leg number three, if you receive a retransmission from VM A, you will always send it to the secondary, right? After reaching leg three, you no longer need to send the packet to the secondary; it is sent directly to VM B, right? And this applies to both TCP and UDP?

C

It applies to both. In TCP, the flow does not move past the SYN packet, right? In UDP, since we don't have that SYN, the only change is that the VM can transmit four consecutive UDP packets and all four are lost. It could happen that way. In TCP, at least, because we have the SYN and SYN-ACK, it gets gated by the SYN.

C

If there are drops, what I meant is that until four happens, VM A would not move on to the next packet, the ACK, right? It would keep retransmitting the SYN until four happens.

C

In UDP, that may not be the case; it may move forward in certain cases. That's the only difference, but internally it's about the same.

C

Until three happens, we will always forward the packets we receive in the data path to the secondary. That is true for both TCP and UDP. Once leg three is complete, it is short-circuited on the primary and forwarded to the destination.
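
The forwarding rule summarized here (send via the secondary until the leg-three ack arrives, then short-circuit on the primary) can be sketched as a small per-flow state machine. This is a minimal illustration, not the HA proposal's implementation; `SyncState`, `PrimaryFlowTable`, and the string return values are hypothetical names.

```python
from dataclasses import dataclass, field
from enum import Enum, auto


class SyncState(Enum):
    PENDING = auto()  # flow created on the primary, no ack from the secondary yet
    SYNCED = auto()   # secondary echoed the flow back (leg three completed)


@dataclass
class PrimaryFlowTable:
    flows: dict = field(default_factory=dict)

    def on_packet_from_vm(self, flow_key):
        """Return where the primary sends a packet that arrived from VM A."""
        state = self.flows.setdefault(flow_key, SyncState.PENDING)
        if state is SyncState.PENDING:
            # Legs two and three are unconfirmed: keep forwarding via the
            # secondary instead of buffering, so a retransmission re-drives
            # the sync if an earlier copy was dropped.
            return "secondary"
        # Flow is known to be on the secondary: short-circuit on the primary.
        return "destination"

    def on_sync_ack(self, flow_key):
        """Leg three: the flow echoed back by the secondary is treated as the ack."""
        if flow_key in self.flows:
            self.flows[flow_key] = SyncState.SYNCED
```

Note that the sketch keeps no per-flow timer and no packet buffer; the only state is whether the ack has been seen, and the same logic applies to TCP and UDP.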

D

But isn't it too aggressive when it comes to UDP, depending on the rate at which you're getting the traffic? Why do you have to send the entire traffic, all of it, until you receive an ack? Could you not short-circuit it earlier, by counting a certain number of packets, as opposed to basically sending all the traffic?

C

Yeah, the problem is that the drop could be in two or three, right? Maybe the packet is getting to the secondary and then getting dropped. So there are two things that control this. One goal is that we don't want to buffer packets on the primary, because we won't be able to keep up with buffering. So we don't want to keep any buffered packets on the primary.

C

We want to not buffer, and to take each packet as it comes and forward it. That is one goal. The second is that the drop could be in either two or three, so we want to keep the sync alive by forwarding whatever packets we receive from the client to the secondary.

C

It could be, but as soon as the first few packets come in, I mean once three comes back, we will start short-circuiting.

C

I think the thing is, if you don't want to do that, if you want to selectively forward, then we have to buffer the packets, because they may be getting dropped in between, and it gets much more complicated if you have to do that.

D

So in that case you will never have a timer on the ack, right? Whenever you see the ack, which is number three, there is no timer for that, right? Okay.

C

Yeah, once we receive three, we know that we have received the ack, and we short-circuit. You mean, is there a timer running per flow to see whether we got the ack?

D

Yeah, it looks like there is no need for a timer.

C

There's no need for a timer, because at the scale of millions of flows it might be...

B

Okay, sure. Just a follow-up question. In the packet between the two DPUs, there is metadata with the policy results. What is that used for? We just encapsulate the original packet and send it to the secondary.

B

What is that metadata with the policy results used for, and do we have those descriptions? I mean, I didn't see any descriptions there.

C

The metadata itself, yeah, I think the metadata itself will depend on each implementation, and that's why we have not called out what is there in the metadata. We're saying that what it carries will be the policy results. For example, the policy result could be the result of the ACL, or what metering class we decided to attach.

B

Is that optional, or is that a must?

C

I think whatever is needed by each DPU data path to forward needs to be sent in the metadata. Otherwise the problem will be on the secondary, once it receives the packet: if you did not put in the policy results from the primary, we would re-evaluate the policies on the secondary, right?

C

For example, let's say there is an ACL. If you re-evaluate the ACL on the secondary, the states may be different: on the primary we may say permit, but on the secondary, if the policies were not aligned, we may say deny. If that is the case, on switchover...

G

There is reconciliation anyway.

C

Yeah, but before reconciliation, we'll start dropping traffic...

G

It shouldn't, according to your proposal, it shouldn't even get any traffic before reconciliation.

C

No, no, sorry, that's not the case, Marion. What's happening is that before reconciliation, we are forwarding based on the primary's state.

C

The reconciliation only changes the state if there is a change to be made, right? But as soon as there is a switchover, we are forwarding traffic according to whatever the state is. That is how we are able to keep the drops at a minimum. We are continuing.

C

So let's say in this case that, just before switchover, the primary's configuration for a particular rule was permit, and on the secondary the configuration was slightly lagging behind, so that if we evaluated the same flow on the secondary, it would be denied. Let's say that, for example.

C

So when the flow got created on the primary, the metadata that gets synced to the secondary says that it is permit, and we would go and install a flow which says permit. Then there is an unplanned switchover: the secondary becomes the primary and starts forwarding traffic. When traffic hits the flow, we still see that it is permit, and we forward the traffic. We don't drop it.

C

So the end clients, VM A and VM B, see no difference on switchover. But when the reconciliation happens: if the controller initiated a reconciliation at that moment without changing any policies, then we would update the flow to say it's a deny now, and we would start seeing drops. What we expect is that the controller would know that the secondary is slightly behind on the policies before calling reconciliation.

C

It would push the new policies so that the secondary comes in line with whatever the policy was on the primary, and then call reconcile. When it calls reconcile, we see that it is a no-op: it was a permit before, and even now it's a permit. There is nothing to be reconciled when there's no change in the state. So this way there is no disruption.
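
The permit/deny example can be sketched as follows, assuming the synced flow carries the primary's policy result and that reconcile simply re-evaluates each flow against the local policy. The `Flow` record, the `reconcile` function, and the dictionary-based policy lookup are hypothetical names used only for illustration.

```python
from dataclasses import dataclass


@dataclass
class Flow:
    key: str
    action: str  # policy result carried in the sync metadata, e.g. "permit"


def reconcile(flows, evaluate_policy):
    """Re-evaluate each flow against local policy; return keys whose action changed."""
    changed = []
    for flow in flows:
        new_action = evaluate_policy(flow.key)
        if new_action != flow.action:
            flow.action = new_action
            changed.append(flow.key)
    return changed


# Secondary still has the lagging policy: reconciling now flips the flow to deny.
lagging_policy = {"vmA->vmB": "deny"}
flows = [Flow("vmA->vmB", "permit")]
print(reconcile(flows, lagging_policy.get))

# Controller pushes the current policy first: reconcile becomes a no-op.
current_policy = {"vmA->vmB": "permit"}
flows = [Flow("vmA->vmB", "permit")]
print(reconcile(flows, current_policy.get))
```

Until reconcile is called, the new primary keeps forwarding on the synced `action`, which is why the order (push policies, then reconcile) avoids the disruption.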

G

Yeah, I understand what you're saying, but is this really a requirement for the controller, to wait for the switchover to update the policies? I thought the controller tries to keep policies the same on both primary and secondary at all times. There might be a delay, but it's a matter of seconds. It's not something that lasts for long.

C

I'm not saying they wait for it. It might be delayed; it's not atomic that the controller syncs the configuration to the primary and the secondary at the same time. But if we go ahead and do a reconcile, I mean if we apply the secondary's policies immediately, then there are more chances that we will actually disrupt traffic, because the policy configuration on the two is not in sync.

C

So we have a flow which was working before switchover, and right after switchover it will get dropped for three minutes, or whatever that time is.

F

Are you assuming policy is just ACL, or is it everything, like in order to do the transformations, the mappings, the route result? Is it everything needed to forward that packet, or are you just saying it's the ACL result?

C

It is mostly the policy results. Anything that is dynamic, for example next hop or routing, is determined locally.

G

CA-PA mappings.

F

Wait, those are determined locally? So you can redo the mapping lookup on the secondary?

C

Yeah, for example, if there was a next hop that changed, because that we cannot sync from the primary to the secondary, right? Those are different.

F

It seems arbitrary. You're deciding that the ACL is an important piece of state that you want to move, but maybe the mapping entries aren't. How do you know that those aren't also delayed and being updated?

E

So for the mapping, yeah, I think Microsoft is asking for the mapping to also be dealt with something like the policy. So that is something we are looking into.

E

That is a must. But for the mapping, when it started, there was no requirement on that side, and also the scale of the mapping is huge, it's humongous. So yeah, that is a requirement that has been raised by Microsoft.

G

Also the ENI table, the VNI table, all of that. Anything that's coming from the SDN can change sometimes.

B

Sure, okay. Sorry, I think we need a time check, and I think this may be a bit of a bigger topic. I think we got quite a bit of feedback on the document. I don't know who can summarize it, but if someone can put it down, maybe AMD can take an action item to update the document.

B

I think there are a few things we found where the document doesn't have enough details, so maybe you can summarize those. And for the metadata policy, it looks like we need more discussion on that. It's not clear in the document what is needed and whether it's optional. We see arguments on that.

A

I'll go ahead and capture what I've gathered from this last hour and send it to Sanjay so that he has a note of what I heard needs to be updated. And then more discussion on metadata policies next time, is that what you're saying?

B

Yeah, I want to continue. I think the metadata policy might be a bigger topic.

D

I would also like to know about the metadata part. I know it's the implementation part, but I think if we know what the metadata expectations are, and if interoperability could be achieved, then we can probably look at that as well.

G

At the same time, you.

D

If you could also have the updates on GitHub as well, that would be great.

A

For the document, that's the thing, guys: we can always go into the document.

A

For what you'd like to see expanded, there's a PR. It's no problem to do that.

B

Maybe the exact packet format is implementation detail, but I'm thinking at least from the requirements side. There are some requirements that haven't been captured in this document. Someone mentioned there are Microsoft requirements that say some policy needs to be carried over, but it's not explicitly called out in the document. So there is some confusion on that front.

B

Part of the clarification would be what those requirements are. Those are the kinds of things we need to clarify on the metadata policy.

C

I think we can add that. We can discuss and put down what kind of information goes there, not the exact format, because that would depend on the implementation. But at least we could discuss and put down what kind of information is carried.

A

Every hardware supplier is going to have something slightly different, right? But if we can level it up, yeah.

D

Yeah, that sounds good. Thank you.

A

Okay, so we should probably close out. Oh, go ahead, Sanjay.

C

No, no, if you can send what you've captured. I have captured some here too, but if you can put in whatever you have.

A

Of course I will. I'm pretty good at taking notes, so I'll send you what I have.

G

For next time, re-simulation is also a topic that we would like to discuss. I think we talked about it earlier.

A

I'll make a note. I know re-simulation has been on the top of our minds. We need to figure out how we want to do it, how it gets triggered, what the triggers are, and what kind of API. So yeah, I'll mark that down. I'll go ahead and stop the recording and post it later today. I appreciate everyone's time, and Sanjay, I'll get with you on my notes.

A

Thank you, okay. Everyone. Thank you. Okay, thanks. Y'all have a good day.