From YouTube: IETF101-ICCRG-20180323-0930
Description
ICCRG meeting session at IETF101
2018/03/23 0930
https://datatracker.ietf.org/meeting/101/proceedings/
A
If you haven't seen this article, you should; it's a lot of fun, I thought. You should read it with a lot of popcorn, or with scotch, your preference, but it's fun. I'll move on; go search for it. Having an open Internet, it's just fun. So we're going to start the ICCRG meeting at IETF 101.
B
A
Somebody take minutes, please. If we don't get minutes, I'm going to spend five minutes asking the question of whether we should stop taking minutes in this meeting, because it's really not a meeting where I'd mind doing that, or having that conversation, either. Are you going to take minutes? Thank you, sir. You just avoided this conversation.
A
If you haven't dug it out, you really should read it. It's about IPR disclosure and IETF participation, and it applies to this meeting. So please look for the IETF Note Well and you will find it; that's about all I'm going to say about it now. If you haven't read it, please go read it.
A
Neal will give an update on what Google has been doing with BBR in the past four or five months. Praveen is bailing on me, so I'm going to call him out. Well, I'm not going to call him out; I wanted Praveen to present LEDBAT++, and he will hopefully do that again in the future, but he's not going to be able to do it today.
A
So that's off the agenda, and we have Michael Schapira from Hebrew University to talk to us about performance-oriented congestion control. It's an exciting and different form of congestion control than we've seen in the past, and I'm excited to have him here, so I hope it generates a fair amount of discussion.
A
I'm not going to close that one off very quickly, because I think we're going to have time to spare today. And then we have a very short presentation on a new idea for doing congestion control in bandwidth-constrained networks; we'll get to it when it's time for that presentation. In the meanwhile, if there's nothing you want to bash in the agenda, we will start. Do you want to start, Bob?
C
So as mentioned, we have a proposed project in IEEE 802.1 for a congestion control algorithm, or congestion management algorithm I guess I'd say. It's proposed, meaning it hasn't started yet, but it is in the process of approval, and one of the things we need is more review and feedback. Because it's intended to cooperate with congestion control from the IETF, we thought it was very important to do this presentation. So next slide, please. Back in November, 802.1 agreed to create a project.
C
The project authorization request is a proposal. It goes through various levels of approval, and ultimately the working group votes to create the project. We decided to delay this project until the July timeframe in order to obtain more simulation data, more approval, more review from others. So again, this is why I'm here. The motivation for this activity is documented in a pre-standardization report from IEEE 802; the link is here.
C
This is an open process that involves end users, standards people and vendors to kind of define the requirements of the next-generation data center. So have a read of that and send me comments; that's where we're at. So what is this thing we're talking about? Next slide, please. First of all, we are amending 802.1Q which, if you're familiar with 802, is the book that describes Ethernet switching, which is really the foundation for the chips that are used in data center switches and routers today as well.
C
So it's all sort of related, and the scope of this project is specific to the data center environment, within the data center, so we're not talking about going across wide-area links. What we're doing is trying to isolate the flows that are causing congestion in these environments, like RoCEv2 or high-performance data center computing and artificial-intelligence type environments. The way this works is that the bridges, and I'm calling them bridges because that's our terminology, but fundamentally they are layer 3 switches, routers; you know, they have Ethernet links.
C
The ASICs have effectively one set of queues on the ports, so they're shared between the layer 3 and layer 2 functionality. What we're doing is identifying flows that are creating congestion in the data center, and a lot of questions came up about, well, how are you doing that? One of the challenges we have in 802.1 is finding the right level of specification while allowing implementation flexibility as well.
C
So we don't typically try to over-specify things; we specify behavior wherever we can. We're assuming in this scenario that there are already many mechanisms in chips today that identify congestion, because you're doing things like marking the ECN bits, so we're going to use those existing mechanisms, or we will use mechanisms that we have defined; there are techniques in 802.1Q. Either way, you find the flows that are causing congestion, and for those flows you provide isolation.
C
So that's the high level. Another trick, and a reason why we want to standardize this, is that we intend to provide a signaling mechanism on a hop-by-hop basis to the upstream neighbor to identify those same flows, so that the upstream neighbor can also do the same and provide the isolation, which allows even a little bit more buffering, and so it can kind of propagate back. The hope is that it never propagates all the way back to the source, because we want ECN to do that; I'll explain more on that.
C
So anyway, the scope is really about reducing head-of-line blocking by separating the flows that are congested from the non-congested ones, and as I mentioned, it's intended to be used with the higher-layer congestion control defined here. Next slide, please. The assumption in the lossless data centers of today, the high-performance computing or storage, RoCEv2 storage and similar environments, is that the network is a layer 3 network.
C
First of all, all these links and boxes are really routers, and the links between them are subnets. It's a Clos network, a fat tree, and we are using ECN, you know, DCTCP or RoCEv2 with DCQCN, as an end-to-end congestion control mechanism, and that may be application specific; for QUIC, for example, or for RoCEv2, that's more or less an application thing.
C
If you want to create a purely lossless environment, there's the ability to use priority-based flow control, in other words pausing the frames on a per-class basis. This is often discouraged, or a lot of times people don't like to do it, because it creates head-of-line blocking, but it is your sort of last-gasp effort to never drop a packet.
C
If we're really running out of buffers and we don't want to drop, we're going to do flow control, and these days this typically all uses code points, rather than VLAN tags, as the way to identify the different traffic classes. Next slide, please. So 802.1 already has two sort of existing congestion management tools. I just mentioned priority-based flow control: typically we have eight classes of service, or eight queues if you will, and you can define those queues to have individual flow control, which requires extra buffering.
C
You know, so that you have headroom. Typically people don't turn this on for all the classes; they may turn it on for one class or something like that. But again, the concern is that it creates a head-of-line blocking scenario: as I push back, I'm blocking the entire traffic class. So if I have multiple flows sharing that traffic class, and they may be going to different destinations, I'm fundamentally stopping them all. That's the downside that we see; it causes some buffer bloat, causes backup and congestion spreading.
C
So in general we see priority flow control being used maybe only at the edges, to keep from overrunning NICs, not so much in the core of the network. But if you want a truly lossless environment, that's sort of your last tool in the bag. Next slide, please. There's another approach that 802 had defined for non-IP type networks, if you will.
C
This is where your fabric is a big layer 2 and you're doing something like Fibre Channel over Ethernet, or maybe RoCE version 1, where you didn't have IP and therefore didn't have congestion control. So we were trying to create a layer 2 lossless network, and here we actually signal a congestion message all the way across the layer 2 fabric to a source and have the NIC, so to speak, perform rate control itself. That was sort of designed for these non-IP-based protocols and didn't necessarily see a lot of use.
C
There was a mechanism defined there to identify flows causing congestion; that's one of the mechanisms we could reuse for our proposal. Next slide, please. So, congestion isolation. First of all, the goal here again is to work in conjunction with the higher-layer congestion control. We want the stacks, the endpoints, to do their rate adaptation the way they do, but we want to provide assistance, or more time, for that control loop to work, and we are doing that by creating yet another control loop. So that's the tricky part.
C
The goal is making bigger, faster data centers that reduce loss, or have no loss, and to do this in a flow-agnostic way. One observation is that as we get faster links, like hundred gigabit, and higher port density, we're seeing less and less available memory in those switches on a per-port, per-gigabit basis, so there is pressure on switch buffers; we can't just throw memory at the problem. And primarily we want to reduce flow control and head-of-line blocking. So next slide, please. OK.
C
So just briefly, and unless you want me to wait till the end to take comments, here is briefly how this is going to work. Imagine we have an upstream switch and a downstream switch. Obviously we're going to have a full fabric here and traffic coming from all directions, but this is just to simplify the discussion.
C
In the downstream switch, as congestion builds, frames begin to queue in the egress of the switch, the logical or physical egress of the switch. There are some thresholds here at which point we will be detecting congestion, and this is where mechanisms being defined in the IETF could be appropriate, or ones that we have already defined; we're not necessarily specifying the specifics of that yet, that's an option. But once we've identified congestion, and the kind of flow that may be causing it, maybe by sampling, we will notify.
C
We will remember that; we will keep track of this flow, and subsequent packets for that flow that are coming in we will now queue in a different traffic class, and we're going to schedule that class in a way that avoids out-of-order packets. The key is that we're effectively delaying it, using buffering in the switch to delay the flow that's causing congestion. Now this might statistically be an elephant flow, but we don't know; hopefully we've got the right one.
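To make the queue-level behavior just described concrete, here is a minimal Python sketch; the class name, flow key and threshold are invented for illustration, since the 802.1 proposal specifies external behavior rather than code.

```python
# Hypothetical sketch of the per-port congestion-isolation step described
# above: detect congestion at the egress queue, remember the offending flow,
# and divert its subsequent packets to a separate "congested" traffic class.
CONGESTION_THRESHOLD = 100  # packets queued before we declare congestion

class EgressPort:
    def __init__(self):
        self.normal_queue = []
        self.congested_queue = []
        self.isolated_flows = set()   # flows identified as causing congestion

    def enqueue(self, pkt):
        flow = (pkt["src"], pkt["dst"], pkt["proto"], pkt["sport"], pkt["dport"])
        if flow in self.isolated_flows:
            self.congested_queue.append(pkt)   # delay it in the congested class
            return
        self.normal_queue.append(pkt)
        if len(self.normal_queue) > CONGESTION_THRESHOLD:
            # Identify the flow causing congestion (here simply the arriving
            # packet's flow; a real switch might sample the queue instead).
            self.isolated_flows.add(flow)
            self.signal_upstream(flow)

    def signal_upstream(self, flow):
        # Hop-by-hop congestion isolation message to the upstream neighbor,
        # so it can isolate the same flow before its own buffers fill.
        print("congestion isolation message to upstream:", flow)

port = EgressPort()
for _ in range(150):
    port.enqueue({"src": "10.0.0.1", "dst": "10.0.0.2", "proto": 17,
                  "sport": 4791, "dport": 4791})
```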
D
C
We try to make this as easy as possible for existing implementations to use, so the objective in the idea is to use an existing traffic class: we wouldn't be using a new queue but an existing one that's already there, and the reason why is because of that last step. If I'm going to pause the congested queue, I need to actually be able to use the existing pause mechanism.
D
C
It's a good point. It would initially pause only that queue. However, there could be some packets still in flight, so there's a possibility that we might have to pause both queues. It really depends on what residuals are left in the non-congested queue.
C
Yeah, it would be possible to have, say, two or three non-congested queues and only a single congested queue, but you would have to coordinate that. It doesn't always have to be pairs, but the expectation is that it would be a pair, as you're mentioning. Okay, next slide. We have some early simulation results; I'm not going to go into great detail about them, the links are here. But remember our objective was to reduce the use of the pause frame, because that has been known to cause problems.
C
We did see some reduction of that, and improvement in flow completion times, particularly for the quote-unquote mice flows. This sort of naturally, or statistically, does elephant-mice separation: if you assume that the elephants are the guys effectively causing congestion, you kind of statistically get that separation in a lossy environment.
C
We also saw a significant reduction in packet loss: by getting the flows that are causing congestion out of the way, we reduced packet loss, and that really has a huge benefit for the mice and control frames. Again, the details are there and we're going to do some more. There were some questions about the modeling and, as usual, I would love to work with anybody here on simulation, if that is an option. Okay, last slide.
C
So what are the next steps? Again, we're socializing this with many people; we need technical review, and we'd like to do additional simulations with alternative switch architectures and different congestion control algorithms. I'm very interested in how this interplays with things like BBR or other time-based or rate-based congestion control schemes that don't rely on explicit marking. We would hope to progress our motion and start this project in July. So how can the IETF participate?
C
We already have an IETF-IEEE interworking relationship, and this would be a great topic for it to discuss. You can email me comments directly, and I'll be happy to work with you, engage, and do whatever we can to vet this idea a little bit more. And then we also have, as I mentioned, this paper that we wrote where we're motivating the problem and talking about the use cases, and would again love collaboration on that. It's an open process; anybody can work on it.
E
Matt Mathis: it sounds like this method is focused on there being specific pairs of ports that have hot flows between them.
F
I just wanted to know: most of your presentation is "this is what we're going to do," and then at the end you say you want to try different queueing disciplines, different traffic models, and all the rest of it. Are you saying you might not do what you've mostly described if someone comes up with something better, or that you'll definitely do that but might do other things as well?
C
So the key part, as I mentioned, with 802 standards is the balance of how much of it is known versus how much of it is not known; that's what the answer is. I mean, we have a pretty good idea of how we want this to work and integrate with the chip models and things that exist today, but it's not done.
C
Typically what we do is put out this project authorization request, which defines the scope, and we've already gone through defining objectives and goals. We can entertain multiple proposals for how to achieve those, bake them off and do all that; that's part of the process. So this is again soliciting those ideas.
F
C
Or we can say, here's one idea, but it's open to others. At the end of the day, we want to see this external behavior: we want to see no packet reordering, or some low level of it, and we want to see a message being sent that gives an upstream switch enough information to act. That's where you have pure interoperability, and then the behaviors obviously can't conflict with one another, but we try to avoid implementation details if possible. Does that make sense?
G
Okay, let's do it like this. It's a modification of BBR where L4S support is added; the ECN code comes from DCTCP and does this special thing, it doesn't do Accurate ECN. The main change is in the function that updates the bandwidth.
G
You update the bandwidth computation and subtract the number of CE-marked packets from the equation, divided by a factor, 4 in this case; whether it's 2 or 4 doesn't really matter that much. There is only one additional state variable, a delivered_ce counter alongside delivered, where delivered is the number of delivered packets, and you update both delivered and the CE-marked packet count. Take the next slide.
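To make that change concrete, here is a minimal sketch of an ECN-discounted bandwidth sample; the function and variable names are hypothetical, and the real modification is a patch against the Linux BBR/TCP code rather than this Python.

```python
# Sketch of the ECN-aware bandwidth sample described above: CE-marked packets,
# scaled down by a factor, are subtracted from the delivered count before the
# rate is computed. Names are illustrative.
CE_DISCOUNT_FACTOR = 4  # the talk uses 4; 2 vs 4 "doesn't matter that much"

def bandwidth_sample(delivered, delivered_ce, interval_s):
    """delivered / delivered_ce are packet counts over one sampling interval."""
    effective = delivered - delivered_ce / CE_DISCOUNT_FACTOR
    return max(effective, 0) / interval_s  # packets per second

# Example: 1000 packets delivered in 10 ms, 200 of them CE-marked.
print(bandwidth_sample(1000, 200, 0.010))  # 95000.0 pkt/s instead of 100000.0
```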
G
Please. The gain cycle is changed from eight RTTs to three RTTs, and the gain variation is also reduced a bit; it is essentially halved. That gives less jitter, but it can sometimes give a slightly slower rate increase, in some cases, if there are competing flows at the same bottleneck.
G
For instance, if you have 5G dual connectivity or whatever, and flows change from LTE to NR, or any kind of changing RTTs. The max bandwidth window is reduced to two or three RTTs. One warning here is that a too-short window can actually reduce performance for application-limited traffic, and it also gives more noise in the bandwidth estimate, but the benefit is that you get a quicker rate reduction
G
if you get into congestion. An additional change is that you exit BBR startup if you experience more than one or two RTTs with CE-marked packets. Take the next slide. Essentially, if you look at, for instance, the BBR bandwidth-update function, only the boldface red part is added to the code, but you need additional code in tcp_input.c and tcp_rate.c to account for the CE-marked delivered packets.
G
In some cases that reduces the throughput quite a lot; that behaviour is not present in the L4S-adapted version. And if you look at the queue delay, you can see that L4S mainly reduces the standing queue. It also reduces the delay spikes, but that is more attributable to the shorter max bandwidth window.
G
So L4S obviously reduces the standing queue faster. Okay, and this is an example with an optimal, or phantom, queue. This is my own interpretation of a phantom queue; in reality it is handled slightly differently, we did it as a virtual queue. We can see that you manage to reduce the queuing delay to almost zero; what remains is the actual serialization of the packets. You sacrifice some 10% of bandwidth and you get more noise in the throughput, though one can of course say that the throughput here is measured over a very short period, like 20 milliseconds, so it changes from one RTT to the next. Next slide.
And if you look at multiple flows: this is a simulation where you have five flows that arrive according to a Poisson arrival process, on average 0.2 flows per second, so they appear at random.
G
The first, blue, flow appears, then the red one, at some random intervals, and you can see that BBR doesn't initially manage to keep the queuing delay low; that's quite natural, because you have this late-comer effect when new flows start, but it stabilizes after a while at least, and the flow rates converge relatively well. Next slide. If you use this with L4S, then you get slightly better flow convergence and you reduce the queue delay, but it's not ideal.
You would have something like two milliseconds of queue delay, but it's more like five milliseconds of peak queue delay. What you can also see is that the newly arrived flows ramp up more slowly, and that's because you exit the BBR startup mode early. Next slide. Here we don't provide a phantom queue, and you get slightly better convergence, though I can't really say whether that is just happenstance, and slightly lower queue delay; but you don't reach zero queue delay without the phantom queue.
G
This tries to mimic a case where, for instance, the beamforming in a radio network gets slightly worse and you lose the beam, so the throughput gradually decreases over the course of 400 milliseconds. You can see that the modified BBR here manages to keep the queuing delay lower than plain BBR; plain BBR has spikes of up to sixty milliseconds of queuing delay. But you don't reach zero queue delay with the modified BBR either; it goes up to something like 20 milliseconds.
G
If you look at the next slide, you can see the reason: the max bandwidth is slightly overestimated at first. Around 1.05 seconds you have a too-high bandwidth estimate; after that you get a bunch of CE-marked packets and the bandwidth is pushed down, and then this process repeats itself. So more work is needed here to get a better bandwidth estimation, and the trick is that you can of course make this more cautious.
G
But then you can easily end up with the problem that you get worse convergence if you have more flows competing for the same bottleneck. It's a choice between the devil and the deep blue sea in some cases. Briefly about RTT fairness: this is a very short test, I must say, and you can see how the two behave.
G
The modified version behaves as expected, in the sense that if you have the longer RTT you get the lower share of the bandwidth. BBR, on the other hand, does the complete opposite: if you have a shorter RTT and compete with a flow with a longer RTT, you get the lower share of the bandwidth. It's a feature of BBR.
G
In this case it sort of flips this upside down, or turns it right side up. And then another example with a different set of RTTs for comparison, next slide: you have roughly the same behavior, and with the shorter-RTT scenarios you see the RTT unfairness with differing RTTs, and you get better RTT fairness with the modified version. And then the conclusion.
G
If you add L4S, you get good convergence with multiple flows competing for a given bottleneck; it keeps the standing queue small, or even smaller with the phantom queues, and you get a certain degree of jitter, which is a result of the bandwidth probing that is still present in this adapted BBR. And in this implementation you need to be a bit aggressive in order to get your fair share at the bottleneck.
G
I
Yuchung Cheng: thanks for this work, much appreciated. The small change about ECN seems very interesting; I would love to try it in our environment too. There is only one change I'm a little bit afraid of, which is exiting startup, or slow start, on the first round of experiencing ECN. I think this will cause the same problem that BBR was originally trying to fix on the Google backbone WAN links: you have these elephant copy flows and they're getting bursty losses, or likely bursty ECN signals, because of some incast happening, or just bursty small mice traffic driving by; we call that loss by drive-by traffic. So BBR might be exiting the startup phase, essentially slow start, very early, and then we have to rely on this 1.25 gain, or in your case the 1.12 gain, to slowly grow up to the full bandwidth. I could be right or wrong about that, but it could be an issue, so that part we might need to discuss further. But I think the rest looks very interesting. Thank you.
J
Gorry Fairhurst: first, I like this piece of work. I like the idea of being able to take some signals and really figure out what to do, rather than just carrying on blasting into the internet, so this looks really cool and I appreciate it. Your benefit seemed to come from multiple things. One of them was using ECN as a signal at all, which applies to L4S but would also apply to non-L4S; it would apply to other ECN methods. It would be kind of nice to know.
G
Maybe take the first graph, the first graph with the slides, yes. You can see after the blue graph that the queue delay gradually goes down with BBR, and that is attributed to the L4S marking; it gets the queue delay down real fast. But the delay spikes, and possibly also the height of the last parts, that difference is due to the shorter bandwidth window.
J
K
Andrew McGregor, Google: I'd like to second Gorry's point about changing one thing at a time, so that we can see what the impact of each of them is. I mean, as exploration this is great work, don't get me wrong, but from an engineering point of view it would be very nice to know what the contribution of each of the individual variables is. And I have one to add, which is that one of the things that stands out, particularly in this plot, is the impact of BBR's max filter window.
K
If you search around, you'll find that there's a way of tracking a given percentile over a distribution which costs about the same as an EWMA. There would need to be a little bit of numerical fudging to deal with the fact that a percentile is not actually what you want to track; you want to track an estimate of the maximum, but fudged like that, it could give BBR a significantly softer response to events in the network and probably eliminate most of those spikes in the bottom graph in the slide.
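One way to read that suggestion, sketched here with invented names and a plain stochastic quantile update rather than whatever filter BBR might eventually use: track a chosen percentile of the bandwidth samples at roughly EWMA cost.

```python
# Hedged sketch of percentile tracking at EWMA-like cost: a stochastic update
# toward the q-th percentile of the samples (gradient step on the pinball
# loss). The step size is the "numerical fudging" knob mentioned above.
def make_percentile_tracker(q=0.95, step=0.5):
    estimate = 0.0
    def update(sample):
        nonlocal estimate
        # Move up by q*step when the sample exceeds the estimate, otherwise
        # down by (1-q)*step; the estimate drifts toward the q-th percentile.
        estimate += step * (q - (1.0 if sample < estimate else 0.0))
        return estimate
    return update

track = make_percentile_tracker(q=0.95, step=0.5)
for bw in [10, 12, 11, 50, 12, 13, 11, 12]:   # Mbit/s samples with one spike
    print(round(track(bw), 2))
```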
G
I
Yuchung Cheng: yeah, I agree. Right now the BBR max filter is a straight max filter, and I think the main benefit of this ECN work is to reduce the effect of that max filter, because one problem is that everybody is probing, and once in a while everybody gets a really good bandwidth sample and they stay at it. Here the ECN serves as a kind of guidance to say, no, no, you are way too extreme, you have to reduce your rate, and that's why it achieves great queuing delay. I also agree with Andrew's suggestion; we are looking at using a different type of filter for the bandwidth. I have one question here: I noticed L4S uses sojourn time as the marking signal, but in our data centers we still use instantaneous queue length. What's your thought: if, instead of using sojourn time, you simply mark when the queue is over X kilobytes, what would the result look like?
L
Neal Cardwell: I also wanted to thank you for this work. I think it's really great to see people experimenting with simple ways to fold ECN signals into the BBR algorithm or framework. I had a question, or some comments, about the modifications to the algorithm; if it's possible to go back to that slide, that'd be great, and if not, that's fine.
L
It relates to some of those constraints where you basically want to remember the bandwidth over a time scale that includes some bandwidth probing, in order to get robust behavior and not fall over if you see delay variations due to cross traffic or radio bandwidth fluctuations or whatever. And then the second comment would be...
L
F
But there are papers that give you performance results on that; if you want, I can point you at them. With a dual queue, or if you've got priority queues, in other words if the output rate of your particular queue varies depending on the traffic in the other queues, then you can't really set a byte limit, because you don't know the rate of your particular queue, and so it's better to do it in time.
L
Okay, so I'm Neal Cardwell and I'm going to talk a little bit about recent work that we've been doing on BBR at Google. This is joint work with my colleagues at Google working on the TCP and QUIC BBR implementations, including the folks you see here, Van and Yuchung and Ian and the rest of the gang. Next slide, please.
L
I'd like to start with a super quick overview and status of BBR, which most people here are probably familiar with from previous presentations, then spend most of the time talking about work that our team has been doing on dealing with paths that have large degrees of aggregation, and then close with a few thoughts on packet loss and ongoing work. Next slide, please. So, just a quick overview and status; we've talked about BBR at the last couple of IETFs, so you're probably familiar with some of these.
L
BBR is used for TCP and QUIC traffic on google.com and YouTube, and internally at Google on the WAN backbone between data centers. The source code is available for Linux TCP and for the Chromium QUIC implementation, and there's also work going on implementing BBR for FreeBSD TCP at Netflix. We've got a couple of internet drafts documenting the algorithm, and then there are some slides and videos from previous talks. Next slide, thanks. So, aggregation: what do I mean here?
L
With all of these links and all the various techniques they're using, this turns out to be a very widespread phenomenon. To handle aggregation, BBR sort of decomposes the problem into two separate but related problems. The first is: how do you bound your sending rate based on the estimated rate at which your packets are making it through the network?
L
Basically, your bandwidth estimate. One of the two drafts we have out talks about the approach we have been using so far for delivery rate estimation, and there are links there. There's ongoing work on improving that, basically a couple of different approaches we're looking at, but both have the flavor of looking at bandwidth over longer timescales. And then the second part of the problem is, once you have a bandwidth estimate,
L
how do you use it to sensibly bound the amount of in-flight data in the network, which is to say, how do you set your cwnd? As a first approximation, the initial release of BBR has a simple approach where it generally bounds the cwnd to twice the estimated bandwidth-delay product, so twice the bandwidth times the min RTT. Of course, this can run into challenges if the RTT is noisy. So here's an example, next slide, please. This is a trace from a case where there's a Wi-Fi bottleneck.
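As a rough illustration of the cwnd bound just described (illustrative names only; the deployed Linux and QUIC code differs in details):

```python
# BBR v1's default in-flight cap, as described above: twice the estimated
# bandwidth-delay product (bandwidth * min RTT). Names are hypothetical.
def bdp_bytes(bw_bytes_per_sec, min_rtt_sec):
    return bw_bytes_per_sec * min_rtt_sec

def cwnd_bound(bw_bytes_per_sec, min_rtt_sec, gain=2.0):
    return gain * bdp_bytes(bw_bytes_per_sec, min_rtt_sec)

# Example: 100 Mbit/s path with a 20 ms min RTT.
print(cwnd_bound(100e6 / 8, 0.020))  # 500000.0 bytes of allowed in-flight data
```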
L
I wanted to spend a little time doing a deep dive on one particular case, just to illustrate some of the issues that you run into and how they impacted the approach we took for dealing with this. So here's a single trace we're going to be looking at, and this is just a very simple TCP transfer of 20 megabytes.
L
This is a Linux TCP BBR sender going over the public Internet to a cable modem and a typical Wi-Fi home network, with a laptop receiver, and we took some traces on the sender and the receiver with tcpdump. Next slide, please. If we look at things from the perspective of the receiver, it looks kind of reasonable: the data is mostly arriving fairly smoothly. This is a time-sequence plot; the x-axis is 2 seconds' worth of time, the y-axis is 20 megabytes' worth of data.
L
You can see the data arrives fairly smoothly, and the receiver is enqueueing TCP ACKs smoothly. There are a couple of flat spots that are kind of interesting; those turn out to be from when the sender is cwnd-limited or receive-window-limited, which we can see on the next slide. Next slide, please.
L
But then it runs into its cwnd bound of twice the bandwidth-delay product, as I was saying, and then eventually, after a while, there'll be a big burst of ACKs that arrives, the sort of vertical blue segments here, a little while later, and we can zoom into the circle here on the top right on the next slide. Next slide, please. This is what it looks like from the sender's perspective, zoomed in: we're sending along, in this green transmitted line, at our estimated bandwidth, and then all of a sudden...
L
The big aggregation effects that are visible to the sender in this case are not showing up at all at the receiver. From the receiver's perspective, what happened was that during the time here where the lines are smooth and diagonal, the data arrival is smooth and the receiver is enqueueing TCP ACKs on its own receiving host smoothly, but basically, as we could see from the previous trace,
L
those ACKs apparently are not making it out onto the network. There are a number of phenomena that could be happening here, but it appears that basically the Wi-Fi access point's data transmissions are monopolizing the link for a while, and the result of this is that, since the ACKs are not making it out on the wire, the sender, as we saw on the previous slide, runs out:
L
its cwnd is exhausted and it goes silent, and that gives us this flat horizontal section following there, where the link is underutilized because the sender stopped sending. Next slide, please. So how can we deal with this? At a high level, BBR is sort of aiming to keep the minimum amount of data in flight.
L
That works if and only if the delay variation is less than or equal to the min RTT, but as we were just seeing in the previous slides, the delay variation can obviously be quite high. So what happens is, if the delay variation is higher than that, as we saw, the sender exhausts its cwnd and underutilizes the link, and the lower the min RTT is, the more likely you are to run into this situation.
L
Let's see, we lost... okay, I think you can hear me. In fact this can happen even on the public internet, because content distribution networks will often try to place their sending hosts close to the user for a good user experience, so the min RTT might be just one to twenty milliseconds, whereas, as we just saw, the Wi-Fi delay variation can be quite a bit higher, up to say 80 milliseconds. Next slide. So to deal with this, in the past half year or so the BBR team has implemented an explicit aggregation estimator.
L
There's a question of how much counts as recent, and here we basically tried to keep that window at roughly the same time scale as the estimated-bandwidth filter, and then, as a sort of safety mechanism, we bounded that extra-ACKed amount by the cwnd, since of course over the timescale of a single flight you shouldn't have more delivered than your cwnd. So next slide, please. Here's just a quick way to visualize the mechanism.
L
The idea, basically, is that when we're sampling the extra amount that's been acknowledged, we take the actual amount that's been acknowledged, the entire blue height here, and then we subtract the amount that we expected to have been acknowledged, which is the green vertical segment here, computed from the estimated bandwidth times the sampling interval on the bottom there. If we subtract those two, we get that extra amount.
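A minimal sketch of that computation, with hypothetical names; the deployed code tracks this value in a windowed filter and applies further bounds, as described above.

```python
# "Extra ACKed" aggregation estimate: actual bytes ACKed in the interval minus
# the bytes we expected to be ACKed at the estimated bandwidth. The real BBR
# code keeps a windowed max of this and adds it to the cwnd.
def extra_acked(acked_bytes, est_bw_bytes_per_sec, interval_s):
    expected = est_bw_bytes_per_sec * interval_s
    return max(acked_bytes - expected, 0)

def cwnd_with_aggregation(bdp_bytes, max_extra_acked, gain=2.0):
    # Allow extra in-flight data to ride out ACK silences.
    return gain * bdp_bytes + max_extra_acked

# Example: a 60 ms ACK silence followed by a burst acknowledging 1.5 MB on a
# path estimated at 12.5 MB/s (100 Mbit/s), with a 250 KB BDP.
print(extra_acked(1_500_000, 12_500_000, 0.060))  # 750000.0 extra bytes ACKed
print(cwnd_with_aggregation(250_000, 750_000))    # 1250000.0 bytes
```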
L
There's code for this that's already been deployed in the Chromium QUIC implementation of BBR, and you can click through the link there to see it, and there's also a TCP implementation that's being rolled out on google.com and YouTube for some global experiments. And then here are just some quick examples in a controlled environment where you can see the kind of levels of improvement in this particular network.
L
For example, in the Wi-Fi network, when using 2.4 GHz there's roughly a 4x increase in throughput, and at 5 GHz roughly a 10x increase in throughput, so significant improvements. Next slide, please. In addition to the work on explicitly estimating the degree of aggregation in the path, a second mechanism that we've been working on recently is a mechanism to adaptively drain data from the network to maintain shorter queues on a more regular basis. So in BBR, in the steady state,
L
out of that, in various rounds of experiments, we've found that you need to be careful how you do this; there seem to be a couple of ingredients to making this work well. You need to bound the amount of time that you spend draining the data, because if you spend too long, then you can run out of good bandwidth estimates in your filter.
L
It was also necessary to deploy this in tandem with the aforementioned aggregation estimator, so that if you're trying to robustly drain excess data out of the network, you allow yourself to put more data in if the ACK stream dries up and the ACKs disappear for a while. So anyway, we've been running small-scale experiments with this on YouTube and seeing nice reductions in RTT and packet loss, and we're in the process of doing a global experiment.
L
Next slide, please. There's also work on BBR going on outside the Google teams. As I mentioned, there's work going on at Netflix. There was also recently an implementation of BBR for ns-3 that was released, with a little tech report accompanying it; the links are here. BBR is also one of the many interesting congestion control algorithms that the Stanford team is testing in their really neat system called Pantheon; I encourage you to check that out. Next slide, please. And then I want to share a few quick musings on packet loss.
M
It seems so implicitly, it notices that in all states, yeah?
L
Yeah, there are separate mechanisms in Linux, and I assume in QUIC, that make sure they never disobey the receive window. So, from my perspective, with packet loss as a signal the question is not really whether to use packet loss as an input signal; obviously Reno and CUBIC use packet loss as a signal, and even all of the versions of BBR that we've experimented with also use packet loss as an explicit signal.
L
I think the key question is really: how can we use packet loss effectively as a signal to adapt the transport sender's behavior? And I think there are some really interesting, fundamental issues here. Even where packet loss results from congestion, there's still the question of time scale. To take two examples: if the buffer is only one percent of the bandwidth-delay product, then you have situations where a link can be 99% idle.
L
So, in conclusion: we've talked about the status of the first version of BBR, and we're actively working on improving it. We know we've got some work left to do; there's work going on in several places, and right now, at least at Google, the focuses are how to deal with aggregation and how to deal with the packet loss signal.
L
I talked in November about one iteration on how BBR responds to packet loss, and we are also working on the dynamics of sharing with loss-based congestion control and AQMs, and as I said there's work going on on BBR at Netflix as well. Of course we're always happy to hear ideas, see patches or tests or experiment results, look at packet traces, all that sort of thing. Next slide, please. Yeah, so that's all I have; I'd like to thank these folks, and there's a link.
N
On that point, does that help for that case as well? Your order-of-magnitude improvement, does that come in those cases as well? What I've seen is this phenomenon where, because you don't have enough queue in the Wi-Fi device, you get less aggregation, because there's not enough data to build the aggregates, right?
H
L
That's a great question. I think we'll have to go back and run those tests as well and see how much it helps for that case. I think it will help to some extent; the open question would be how much is left on the table after this patch. But thank you, that's a great point.
E
Matt Mathis: I wanted to add a couple of footnotes about some other things. The issue of aggregation is actually a really critical problem in this community, and the issue is, I believe it to be true that the majority of humans on this planet are behind links that exhibit this kind of behavior, and the constraint is that if you have aggregation going on, you don't have a way of precisely controlling delay.
E
You just don't have enough information, and this is a fundamental problem here: in places where you have full-duplex wireline connections, it's easy to imagine precisely regulating queues, and that is not the case in wireless, and not the case in most of the southern hemisphere, and not the case in lots of places, so we're probably looking at a divided problem space. And the other thing is, I've become very concerned recently about the ruminations about loss.
E
O
I'm the chair of the coding research group, and this is with my chair hat on. In our group we started investigating how to deal with loss by using some form of loss coding, especially inside QUIC, and also how this interacts with congestion control in particular. So I think we're interested, and I think some collaboration could also be possible. Okay.
L
P
Kevin Smith, Vodafone: I've got two quick questions. First of all, on, for example, the Vodafone UK 4G network today, would Google services be using BBR?
L
Yes, they would.
P
Thank you, okay. The second question is regarding loss and how you deal with loss; this is maybe an open problem. Are you attempting to use a way to distinguish between loss versus the fact that the radio layer may well be retransmitting at the physical layer, at layer 1, layer 2, layer 2.5, for example?
L
No, we're not attempting to distinguish those in BBR, but what I would say is that typically, in traces of cellular traffic and Wi-Fi traffic, and everybody on our team has spent a lot of time looking at traces, what we see is that the link layers in cellular and Wi-Fi networks tend to do a very good job of link-layer retransmissions that completely fill in most of the holes most of the time.
L
P
K
From the queue, Andrew McGregor: I noticed there was periodicity in the flat spots in your Wi-Fi trace, and the periodicity was a hundred and two point four milliseconds. So what was happening was that something attached to that access point was in power-save mode, and what you were seeing was the time it took the access point to drain its power-save queues after a beacon.
L
K
L
Yeah, it might be really hard, and so I think that's probably going to be a long-term, ongoing research effort, and the question would be: what's the simplest mechanism that you can deploy that will behave reasonably, even if you may not have a complicated mechanism to distinguish different kinds of packet loss? But yeah, I agree that that's an interesting research area.
A
Jana, just participating from the floor: I want to make a general comment, and this is entirely my own view, about loss rate. I don't like the word loss, because it's really retransmissions that we're counting here, not losses. But beyond that, I think there's something important.
A
I think there's something important in that nuance, which is about sending additional retransmissions. I think your slide with the meditations on packet loss is spot-on: there's a very clear tension here between application performance, and most pertinently right now, for me and I think for most work folks are doing in transport, that's latency, so there's a tension between latency as observed by the application and the retransmissions that a sender is willing and able to do more aggressively.
A
It doesn't have to be the same as what CUBIC or Reno has, and that should be perfectly fine as long as it's not 50%; obviously at that point it's not a trade-off anymore, you've gone way past the line. But I think it's completely reasonable to increase the retransmit rates significantly from where they are right now for, you know, CUBIC, in order to get back some of that latency.
A
L
E
B
Go for it, Michael.
Q
Great, hi everyone, thanks a lot for coming, and thank you for the invitation. So I want to talk about... oh sorry, can you hear me now? Okay. I want to talk about PCC, or performance-oriented congestion control. Next slide, please. PCC is a collaboration between my research group at the Hebrew University and Brighten Godfrey's research group at Urbana-Champaign, and everything I'm about to tell you is based on two publications, one from 2015 at USENIX NSDI and another upcoming one, also at NSDI. Next slide,
Q
please. PCC is currently being evaluated in various places, and hopefully soon more as well; more will follow. Next slide, please. Before diving into PCC, I want to say a few words about what we think TCP does wrong. Think of a scenario where you have a flow, a connection F, sending at a specific rate R, and there's packet loss. There are many reasons why this might have happened, and there are many things you can do in response. So, you experience packet loss.
Q
Yet another option is that you're an insignificant flow and there's some other flow that's big and is causing the congestion, in which case I would argue you should maintain your rate; certainly you don't need to decrease it by a lot. And the last option, and this is not an exhaustive list, is non-congestion loss, due to physical-layer corruption, due to the medium, due to handover in mobile networks.
Q
Whatever. If you get loss that's random, say you're alone on a link and one out of every 200 packets is dropped, I would argue that you need to fully utilize that link. So in that case you actually need to increase your transmission rate until you actually do start suffering congestion losses. Next slide.
Q
The key point here is that TCP's design philosophy is such that you hardwire a mapping from packet-level events, like a packet was dropped, I received an ACK, the RTT of a packet was more than some bound, to changes in the congestion window, and no such hardwired mapping can be optimal across all of these scenarios.
Q
Stepping back from this, the abstract question is: what's the right rate to send at? And the key observation is that it's really hard to figure out what's going on within the network, but regardless of what's going on within the network, there's something you can quantify: if you send at a specific rate, you can quantify the outcome in terms of its implications for performance, and you can formulate that. So specifically, what PCC does is it tries out different rates. You pace yourself; you send at a specific rate.
Q
You wait long enough, an RTT or slightly over an RTT, for ACKs to return, for selective ACKs to return, and you gather meaningful statistics that pertain to performance, like what's the loss rate, what's the throughput, how fast is latency increasing during this monitor interval, and so on. Then you derive from these statistics a utility value: you aggregate them, using some utility function, into a numerical real value that intuitively represents your score in terms of performance.
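As a rough illustration of one monitor interval (the names are invented, and the utility shown is just the simple throughput-minus-loss-penalty form mentioned later in the talk, not the exact functions from the PCC papers):

```python
# Sketch of a PCC-style monitor interval: send at a rate for about one RTT,
# then reduce the returning (S)ACK statistics to a single utility score.
def utility(throughput_mbps, loss_rate, loss_penalty=10.0):
    # Reward goodput, penalize loss; PCC Allegro/Vivace use more refined forms.
    return throughput_mbps * (1.0 - loss_rate) - loss_penalty * throughput_mbps * loss_rate

def monitor_interval(sent_bytes, acked_bytes, duration_s):
    throughput_mbps = acked_bytes * 8 / duration_s / 1e6
    loss_rate = 1.0 - acked_bytes / sent_bytes if sent_bytes else 0.0
    return utility(throughput_mbps, loss_rate)

# Example: 1.25 MB sent in 100 ms, 1.2375 MB acknowledged (1% loss).
print(monitor_interval(1_250_000, 1_237_500, 0.100))  # about 88.1
```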
Q
We collect statistics, we aggregate these statistics into a utility value u1, then we send at a different rate r2, we do the same thing and aggregate, and now we have a different utility value u2. The crucial point is: how do we determine what the next rate should be? There should be some online learning algorithm that determines, based on our previous interactions with the network, based on the different rates we tried, the different statistics we collected, the different utility values,
Q
what should happen next. So at a very high level, this is the PCC architecture. You learn real performance, in the sense that you choose a rate, you gather statistics and you produce a score that indicates the performance level; you control based on this empirical evidence, and this, we are going to show, yields consistently high performance. Going back to TCP, there are two crucial aspects here. One is that, thank you, we gather meaningful statistics.
Q
Yeah, can you still hear me? Cool. So the first version of PCC, presented in 2015 and called PCC Allegro, employed a fairly simple rate adjustment algorithm. There are a lot of details, but I want to give the high-level picture. Think of sending at a specific rate r and deriving a utility value u. Now you need to decide whether to go up or down, so you're in this decision-making mode, if you like, and the question from PCC's perspective is: will I gain higher utility
Q
if I go up, will I get better performance, or will I get better performance if I go down? The way PCC v1 handles this is it does randomized controlled trials: it probes a higher rate and it probes a lower rate, and based on empirical evidence it decides in which direction to move. So in this case, for example, say that you were sending at a rate r: you try a higher rate of r times (1 + epsilon) and a lower rate of r times (1 - epsilon).
Q
Let's say you do two experiments: you send twice at the higher rate and twice at the lower rate, and in this case, as you can tell from the figure, the results are pretty conclusive. It didn't have to be this way, but in this case it's pretty obvious that I gain higher utility when I send at the higher rate, and so I will move up now.
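A toy sketch of that randomized trial (hypothetical helper names; the real algorithm adds confidence thresholds, adaptive step sizes and a non-decision fast path, as noted next):

```python
# Toy version of the PCC Allegro decision step described above: probe r(1+eps)
# and r(1-eps) in alternating monitor intervals and move in the direction
# whose measured utility is higher.
import random

EPSILON = 0.05

def choose_next_rate(r, measure_utility):
    """measure_utility(rate) runs one monitor interval and returns its utility."""
    higher, lower = r * (1 + EPSILON), r * (1 - EPSILON)
    trials = [higher, lower, higher, lower]      # two trials of each rate
    random.shuffle(trials)                       # randomize the trial order
    scores = {higher: 0.0, lower: 0.0}
    for rate in trials:
        scores[rate] += measure_utility(rate)
    return higher if scores[higher] > scores[lower] else lower

# Example with a toy utility that peaks at a 100 Mbit/s link capacity:
print(choose_next_rate(90.0, lambda rate: -abs(rate - 100.0)))  # 94.5
```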
Q
Obviously there are a lot of details I'm omitting, because, firstly, you wouldn't want to go into this decision-making mode each and every time, so PCC doesn't do that; PCC basically enters the decision-making mode only when there's evidence that the current direction is no longer the right one. And I'm not going into the details about how we choose the exact step size, which is also adaptive. But this is the high-level picture.
Q
One thing I want to point out is that even this fairly straightforward scheme is enough to distinguish between some of the cases, and actually all of the cases, we've seen before. Think of the scenario where I am a large flow causing congestion, and the opposite scenario where there's random loss, so one out of every 200 packets is dropped in expectation regardless of what I do.
Q
In these cases utility behaves very differently. If I'm sending at a certain rate in the first scenario and I increase my rate, then, because I'm the flow causing congestion, the loss rate is going to go up; I'm the problem. In the second scenario, if I increase my rate, the loss rate stays the same, because it has nothing to do with me, and this is translated into the utility.
Q
One important thing to note is that congestion control here is handled differently than in TCP. The question is, where's the congestion control? In TCP, congestion control is hardwired into the protocol: there is behavior baked into TCP that says, if you experience packet loss, interpret that as congestion, and once you interpret it as congestion, react in a certain way. PCC doesn't do that. Thank you; so in PCC, again, next slide please.
Q
Yes. So in PCC we have different senders, each employing an algorithm that maximizes its own utility, and the crucial point, I think, is that selfish doesn't mean aggressive. If my utility function is such that I suffer from excessive loss, then my behavior will be consistent with that. So we think of this interaction between flows as inducing a non-cooperative game.
Q
We have different senders, they optimize their own utility functions, but if the utility functions are such that the resulting equilibria are good, for example where the loss rate doesn't exceed a certain bound, then that's what we're going to converge to. So a lot of the engineering is devoted to figuring out what the right utility functions are, utility functions which induce good equilibria. In PCC v1 this was handled in a certain way.
Q
Going back to the utility function: the simple utility function I mentioned before, which is basically just the throughput minus some penalty for loss, isn't actually going to behave that well. But when you plug in another factor, and you can look at the paper for the formal analysis, another factor that penalizes loss further, what you actually get is a unique equilibrium that you're bound to converge to, that's fair and where the loss rate is bounded. So here are some experimental results.
Q
What you see here is a side-by-side comparison of PCC and TCP CUBIC sending on a single link, where flows come and go. The x-axis is time, the y-axis is throughput. The red flow enters and it's alone in the beginning, then the green flow enters, and so on and so forth, and then they start leaving.
Q
Here's another experimental result, another depiction of experimental results from the paper. There's often a trade-off between convergence rate and how variable your rate is upon convergence: often you'd have things that converge fast but show a high standard deviation once converged, or the opposite, where probing is slow but then you might stay at a fixed point.
Q
What you see here are different protocols and where they're located in terms of convergence time and the standard deviation of the throughput upon convergence, on the y-axis, and at the bottom you see different points corresponding to different choices of parameters for PCC. What you can see is that for quite a few choices of parameters you actually get better trade-offs than most existing algorithms out there.
Q
Here's one more result I wanted to show. This is an attempt at quantifying reactivity to rapidly changing network conditions. You have an emulated environment where every five seconds the network parameters change drastically: the bandwidth changes, the RTT changes and the loss rate changes. The blue line is the optimal sending rate in hindsight; this is what you should have done had you known that these would be the exact network parameters. What you see is that CUBIC performs really badly on this example.
Q
That's because it doesn't contend with loss well. TCP Illinois performs somewhat better, and PCC Allegro is actually very close to the blue line, so it is pretty much able to track the optimum in this example, and I'll actually revisit this example later when we talk about PCC v2, because there are some subtleties involved here. Okay, so we've run a lot of experiments in the course of the last three or four years, and all of these are documented in the paper, so please take a look, and I'd be happy to discuss.
Q
Could you go back one? Thanks. These are all documented in the paper; we've run experiments in many different emulated scenarios and also in the wild, and one thing I want to note is that we've run experiments in the context of inter-data-center, intra-data-center, satellite and so forth, and what we see is that we get significant performance benefits in each of these scenarios.
Q
Here's a graphic portrayal of some of the experiments we've conducted. These are 510 source-target pairs, so each and every line represents a comparison of PCC and TCP CUBIC between the associated points, and the color indexes how much higher the throughput is. In the median it's 5x.
Q
Okay, and this is my favorite figure from the NSDI '15 publication. This is sending a lot of data over very long paths, transferring 100 gigabytes of data over a very long path. You see, from [inaudible] to Berlin, from [inaudible] to China, from Illinois to Reseda. The green represents TCP Cubic, yellow is taking a flight, basically, and PCC is the blue.
Q
Yeah, this is the Internet in the 21st century, so there's a lot of work for this working group. So, a few words about deployment and deployability. There's one interesting observation, which is that TCP is a really bad learner from an online learning perspective, because of this hardwired mapping. So, for example, it doesn't do one of the most basic things that you'll find in any online learning algorithm, which is trying to close the feedback loop with the environment.
Q
So this is what I mean: when I go up, does the loss rate go up, does the latency go up, or do they stay the same? This seems to be crucial for dictating what I do next, but in TCP, because you're reacting to a packet-level event, that information is not available to you. But the more important observation from a deployability perspective is that the TCP sender might be the bad learner, but the TCP receiver is really trying to help it.
Q
So when you think of the TCP receiver, it's actually throwing a lot of statistics at you, which you could theoretically utilize, and from a deployment perspective, what that means is that PCC doesn't actually need to change the TCP receiver. The receiver can still run legacy TCP. You don't need to change the application. All you need to do is change the part of TCP in the kernel that adapts transmission rates; that's the part you need to change.
Q
The algorithm TCP employs to adapt rates. So this was in 2015, and since then we've continued working on PCC quite a bit, and the reason is that PCC Allegro is still far from optimal, and the more challenging the environment is, or the more challenging the applications are, the worse it performs. In particular, it suffers from a suboptimal convergence rate. We did incorporate latency to some extent into PCC v1's utility function, but it wasn't done in a very principled way and we actually did very little experimentation.
Q
There was no formal analysis of this. It suffers from bad performance in mobile networks, over LTE for example; I'll revisit that. And also from suboptimal quality of experience in the context of video streaming; I'll revisit this point as well. So in an upcoming publication we're presenting PCC Vivace, which we've already experimented with and which is being evaluated in several places. Thanks, yeah. PCC Vivace is another embodiment of the same high-level architecture, but there are two fundamental changes that pertain specifically to the utility function framework and to the algorithm.
Q
The online learning algorithm that we employ to adjust transmission rates. So yeah, thanks. We're changing the utility framework and we're changing the online learning algorithm. Next. So, without diving into the details (soon the paper is going to be available online, and the code also), a few words about what we did. The new utility function framework incorporates latency explicitly, and that required
Q
changing the original utility function of PCC v1 quite a bit, and it's intended to provably provide better convergence, that is, both faster and more stable. Another thing, which is a pretty crucial feature of PCC v2, is that different senders can employ different utility functions within the same broad class of utility functions without compromising convergence. So we can tweak the utility functions of different senders to accommodate their different performance needs; some care more about latency.
Q
Another thing that changed is the online learning algorithm: the simple online learning algorithm that I described before, where basically you had randomized controlled trials and you were probing in both directions, trying to figure out what the right direction is. So again, there are a lot of details. Some pertain to how we handle noise; that is, we are gathering statistics, but these statistics might not be accurate.
Q
Think of a scenario where I'm sending at a low rate: if I send just a few packets, it's really hard to get accurate statistics, and there are other scenarios where this might happen. So this is one thing we needed to address. There are also various issues related to the step size, but one thing I do want to mention is that in PCC v2, called PCC Vivace, we don't just use the utility function to determine what the right direction is.
Q
We also use it to determine by how much we should change the rate, and this is done by looking at the gradient of the utility function. That is, if I increase my rate and, at this point in time, the gradient looks steep, I should increase my rate by quite a lot, as opposed to maybe I'm very close to a stable state and I should just probe a little and go up by a little bit.
Q
So this online learning algorithm is intuitively based on doing gradient ascent on the utility function, and there are a lot of results in online learning theory that show that not only does this result in performance gains for the individual sender, but also, if different senders interact, they play well with one another.
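A minimal sketch of the gradient-style update being described, assuming hypothetical send_with_rate() and utility() helpers, where send_with_rate(r) runs a short monitor interval at rate r and returns the observed statistics; the real Vivace rules for step sizing and noise handling are more involved than this:

```python
def estimate_utility_gradient(send_with_rate, utility, rate, eps=0.05):
    # Probe slightly above and slightly below the current rate and estimate
    # the slope of the utility function from the two observations.
    u_hi = utility(*send_with_rate(rate * (1 + eps)))
    u_lo = utility(*send_with_rate(rate * (1 - eps)))
    return (u_hi - u_lo) / (2 * eps * rate)

def gradient_ascent_step(rate, gradient, gamma=0.05, min_rate=0.05):
    # Move in the direction of increasing utility, by an amount proportional
    # to the gradient: a steep gradient means a large rate change, a flat
    # gradient (near equilibrium) means only a small probe.
    return max(min_rate, rate + gamma * gradient)
```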
Q
So it's interesting to contrast PCC with BBR, because these two reflect very different design philosophies. BBR tries to track the bottleneck bandwidth and the RTT, and PCC employs online learning and tries to learn from empirical evidence. But they're also somewhat similar, because both rely on pacing and both rely on going beyond packet-level statistics and trying to gather statistics at longer timescales. So one thing I want to mention before we go onwards is that we have compared against BBR v1, and I know that the BBR...
Q
So, going back to the figure we saw before, where we try to quantify how reactive different protocols are to changes in network conditions. Again we have a single link, and the network parameters change every five seconds, and what you see here as the dashed black line is the optimum sending rate; this is what you should have done in hindsight. So notice that both PCCs, that is, both PCC Allegro, the old version, and PCC Vivace,
Q
the new version, track that line pretty well. But also notice that PCC Allegro, the old version, often overshoots, and when it overshoots that results in excess latency, because it's sending above the rate and packets get queued, etc. PCC Vivace does much better because most of the time, the vast majority of the time, it's close to the optimum but beneath the dashed black line.
Q
BBR has somewhat bizarre behavior here; as you can see, it sometimes dips and takes a while to recover. I conjecture that this is associated with scenarios where the behavior of the network deviates from the network model that underlies the BBR v1 variant. It'd be very interesting to see how BBR v2 does in this context, yeah.
Q
Change, and they change by choosing parameters uniformly at random within the specified ranges, and then you basically just run these protocols against exactly this. Oh, sorry about this, okay, but just to repeat my answer: this is an emulated environment. Every 5 seconds we choose parameters uniformly at random from these ranges, and we run all of these protocols against the exact same choices of parameters.
Q
Sorry, one at a time. So one thing I want to mention is that the way you react to changing network conditions has implications for quality of experience. So here's an experiment that's basically doing the same thing, only now we're measuring quality of experience for streaming video, and we're measuring that in terms of the buffering ratio, which is the time you spend buffering at the client out of the total time. So ideally the buffering ratio will be zero; lower is better.
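As a small illustration of the metric as defined here (an informal sketch, not code from the paper):

```python
def buffering_ratio(rebuffering_seconds, total_seconds):
    # Fraction of the session the client spends stalled waiting for data;
    # 0.0 is ideal, lower is better.
    return rebuffering_seconds / total_seconds if total_seconds > 0 else 0.0
```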
Q
So what you see on the x-axis are different bit rates, four standard bit rates, and the y-axis is the buffering ratio. The upper line is TCP Cubic; as you can see, BBR is significantly better, and PCC Vivace is actually quite close to the bottom in this example. And once again these are emulated network conditions, but we actually also ran this experiment, and you can see that in the paper, on streaming video from AWS to residential Wi-Fi, so there are such experiments as well.
Q
But if you, next slide please, another interesting phenomenon is that as the number of competing flows increases (the previous slide was for one flow), as the number of competing video streams increases, the gaps actually become bigger, and the reason is that stability and convergence play a role: you want the flows to play well with each other. So this is what you see here. The BBR line goes somewhat higher and the PCC line is still pretty much close to the bottom, yeah. So here's another set of experiments.
Q
This time we used Mahimahi, the network emulator, to replay Verizon LTE traces, and what you see here are different points on the trade-off between self-inflicted latency and throughput. It's very easy to do well on one of these individually: if I just blast away, I'll get perfect throughput, but I'll generate a lot of latency for myself, because packets will be buffered. On the other hand, if I send one packet each RTT, latency will be perfect, but throughput will be horrible. So different protocols exhibit different trade-offs.
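As a back-of-the-envelope illustration of that trade-off (the numbers are hypothetical, not from the experiment): with a deep drop-free buffer, sending above the link capacity builds a standing queue, and the queue translates directly into self-inflicted delay:

```python
def self_inflicted_delay(send_rate_mbps, capacity_mbps, duration_s):
    # Excess traffic accumulates in the bottleneck buffer; the standing
    # queue (in megabits) divided by the drain rate gives the added delay.
    excess_mbps = max(0.0, send_rate_mbps - capacity_mbps)
    queue_megabits = excess_mbps * duration_s
    return queue_megabits / capacity_mbps  # seconds of extra queueing delay

# E.g. blasting at 12 Mbps over a 10 Mbps link for 5 seconds leaves a
# 10-megabit queue, i.e. a full second of self-inflicted latency.
print(self_inflicted_delay(12, 10, 5))  # -> 1.0
```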
Q
So what you see here is that Cubic gets really high throughput, but it's loss-based, so the latency is horrible. PCC v1, which didn't incorporate latency (and actually, even once you did incorporate latency, it didn't do that well), is not much better in that respect. BBR is better, but still suffers about a one-second latency. Vegas, I would say, is a better trade-off, and Sprout is a protocol from MIT designed specifically for this context, which I think is probably the optimum trade-off in this figure, at least.
Q
I don't know how close it is to the optimum in general, but it's important to note that it's designed specifically for this context and also requires changes on both sides; it's like explicit receiver feedback. So you can see that PCC Vivace attains more or less the same level of self-inflicted latency, and the throughput is higher. Yeah, okay, so there are more things.
Q
You can look at, there are various demos, there's additional reading material. Thanks. So we're still very actively working on this, both on the algorithmic side and on the implementation side. We're still seeking better online learning and utility frameworks. I think that PCC v2, Vivace, is a step in the right direction, but I think that there's much more to explore. There's the question of how this should be adapted to mobile networks. There's the question of what the interface with the application should be.
Q
E
And that, this... I was trying to figure out your question about why BBR did so poorly in that one graph, and I realized that BBR has in it an explicit assumption that the min RTT is stationary, and just varying the min RTT in five-second time intervals is guaranteed to screw up BBR. And one of the things that we haven't done yet, which is going to come up at some point eventually, is how badly we screw up things like LEO, but also terrestrial routes.
E
Q
F
Q
Alright, so in PCC v1 we've hardwired a specific utility function into the protocol, and we chose that specific utility function because we could show, both formally and empirically, that you get good equilibria in this non-cooperative game that's induced by it. And in PCC v2, one of the things we've managed to do was to allow you to tweak your utility function without compromising convergence. So you no longer have to... there are different knobs in the utility function that reflect how much you care about latency.
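A hedged sketch of what such a parameterized utility can look like, with separate knobs for how strongly latency growth and loss are penalized; the exponent and weights below are illustrative assumptions rather than necessarily the paper's defaults:

```python
def vivace_style_utility(rate, loss_rate, rtt_gradient,
                         t=0.9, latency_weight=900.0, loss_weight=11.35):
    # Concave reward for sending faster, minus a penalty proportional to how
    # quickly the RTT is growing (a sign of queue build-up) and a penalty
    # proportional to the observed loss rate.
    return (rate ** t
            - latency_weight * rate * rtt_gradient
            - loss_weight * rate * loss_rate)
```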
Q
F
So my specific question was that if one cares about throughput and latency and another cares about throughput and not latency, surely the one that doesn't care about latency is going to cause the other one to keep starving itself, and then it has to compromise on latency eventually if it wants some throughput, which is sort of the whole problem that Vegas had and everyone else has, right?
Q
Yeah, so I think that goes beyond PCC. As you know, that's inherent: you can play with the utility values, but what you're basically doing is choosing a different balance of how much you care about loss and how much you care about latency as a signal, and in that I agree. If you ignore latency completely, you will end up taking over all the buffers and...
F
This is sort of why, given that latency, if you don't care about it, it also doesn't matter if you don't have it. In other words, having low latency, even if you don't care about it, is good for everyone else. So there you do need to somehow design things so that you keep the latency low, as long as you don't compromise your own utility function, just for everyone else that does want low latency. Yeah, yeah, okay.
Q
So if you know exactly the network parameters, then you can figure out, you know, what the bandwidth is, you know what the RTT is, and you also know what the loss rate is. So that means that you can try to operate at the exact operating point where you're fully utilizing the bandwidth and you're not going into the buffer. Okay.
R
Q
Yeah, thanks, that's a good question. What you could compute is their fair share at each and every given point in time. So you could argue that, ideally, let's say that it's a five-second interval, so they actually do have enough time to converge, most of that time you would want them to spend at exactly their fair share of the optimum. Okay.
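A minimal sketch of that notion of per-flow fair share (an illustration of the baseline being described, not code from the paper): with known link parameters, each flow's ideal rate is simply an equal split of the bottleneck capacity, with no standing queue.

```python
def fair_share_rate(link_capacity_mbps, num_active_flows):
    # With full knowledge of the bottleneck, the ideal operating point is to
    # split the capacity evenly and avoid building any standing queue.
    return link_capacity_mbps / max(1, num_active_flows)
```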
R
One thing that is interesting here is that, because you have now added latency as a part of the utility function, you know, less-than-best-effort congestion control, something like that, it would be a good test of whether the utility function can mimic something like less than best effort. So that would be a very good test to do for this. So...
H
Q
I want to note, about the utility function, which I think relates to one of the previous questions: if you're loss-based, purely loss-based, then you're aggressive towards legacy TCP. If you're purely latency-based, then Cubic might kill you. Because you have this utility function where you can play with the different parameters, you can
Q
strike the right balance between the two. I didn't talk about TCP friendliness, which relates to your question, but what we actually show in the paper, and we discuss TCP friendliness there, is that even the purely latency-sensitive Vivace performs better than Cubic in the wild, even though Cubic is supposed to kill it; I guess there's enough spare capacity for it to use. Sure.
R
I think this kind of ties in very well with, you know, applications that want to pick their utility function; there are scenarios where you want to make that trade-off. But this goes back to Bob's question on what should be the default, because most applications today are not providing that information. So what would you pick as the default that does best on the network? So...
Q
In the paper we had two default choices, one that's loss-based and the other that's latency-based, in terms of, you know, the function to choose to build into PCC Vivace, and you actually get good results for both. In an ideal world, everything would be the PCC Vivace latency one, I think. So I would argue the one that cares about latency is the one that won't end up filling up buffers, so that's the one you would probably want globally. Thank you.
S
[Inaudible] from UCL. I have two quick questions. First, all of the experiments that you are showing us use five-second intervals, right, in these specific settings. What if it's one second, or less than that? Because when the network becomes highly dynamic, all of those learning things that you're talking about don't work.
Q
It might be way too slow to react, and you can actually get significantly better results than the existing protocols. Unless you have in-network cooperation, but this is not a question of online learning: unless you have in-network feedback, you're going to react at the timescale of an RTT, and the question is what you do then, and I would argue that online learning is what you should do in that context. You shouldn't hardwire your responses, regardless. So...
S
What my concern is: you ended up with a rate which you think is the right rate, and the network has transient congestion. So when the transient congestion happens, your sending rate is quite high in comparison with other flows which react quickly to the transient congestion, so you're going to be taking all of the bandwidth. So you're sort of penalizing, when the network becomes highly dynamic,
S
you are penalizing other flows competing with you, which are non-PCC flows. And so this is my concern, rather than the online learning: I'm worried about the rate that you think is maximizing the utility function, when by that time the network will have changed; you are in a highly dynamic network, yeah.
T
S
Q
S
Q
Well, so we've actually, there are a few variants of kernel modules of this at this time, and one thing that we've done, and we owe Google a big thank-you for that, is we've leveraged the changes that BBR has inserted into the kernel module. I think in that respect the costs are approximately the same. Thank you.
I
Yuchung Cheng from Google. I love the PCC work; in fact, internally we often compare PCC with BBR, and yes, there are a lot of similarities and there are differences. I want to ask about the utility function: have you considered, beyond RTT, rate and loss, application-level metrics? For example, video probably cares about the quality of experience the most. So...
Q
So we've started considering this; I think it's a very important research direction. Right now we focused on transport-layer metrics, but there are some deployability issues here as well, because you may need the client's cooperation to reveal some statistics, but maybe you can infer them from the sender as well. So I think it's a very interesting direction. We have not yet plugged in a utility function that reflects the quality of experience at the client. I think that would be great to do. Yeah.
I
I think a lot of video clients actually can feed QoE metrics easily, so that would be very interesting. The other is a similar question: have you considered going beyond the transport layer, but also beyond flow-level control? For example, if you ask a data center application what they care about the most, they would tell you it's the tail latency of my RPCs across 10,000 connections: I don't really care about your flow-level thing, I only care about the last one, the straggler issues. So...
Q
We have not done that, but it seems like you could apply the same ideas; I think it's worth exploring. We haven't done that. We have looked at what PCC might look like in the data center environment, as in, if you want to incorporate ECN and if you wanted to incorporate receiver feedback, but you're talking about a different angle, yeah.
I
T
We cut down because we couldn't tell the difference between loss and congestion, and you know, some things remain constant over, you know, decades, and that's one of them. So let me, I ask you to consider that, but maybe more broadly for the group here, not for the next IETF but in the next year or two, to be thinking, you know...
T
You know, people have been working on TCP for a long time, because TCP is TCP and everything, yeah, I mean basically everything else is either UDP or blocked in firewalls. But we're trying, in the transport area in the IETF, to get to the point where you can roll new protocols without having to upgrade operating system kernels and without having to upgrade middleboxes and stuff like that, and you can roll new congestion control mechanisms. I would offer that as a possibility. I would also offer as a possibility thinking...
T
If you, you know, the Internet is a big place, right. So it's actually hard; it's not easy to make big improvements on something that smart people have been trying to fix for 40 years. But are there specific corners of the Internet where what you're thinking about would be more valuable than other corners?
T
Is there a way for you to isolate what you're trying to do to those kinds of corners? I have some theories about that; it doesn't matter what they are, and smarter people in the room could probably give you twice as good a theory over lunch. But, transport, okay, I can put that hat on: speaking as transport area director, I would like for people who are thinking about research to be thinking about stuff that's more than the next,
T
the next operating system update, you know, which is a couple-of-years timeframe. But, you know, I mean we're going to have an Internet for a long time. You know, and I know that ICCRG is schizophrenic, and it was before Jonah was here, because we rely on ICCRG so much for guidance in the IETF.
T
But, you know, and please don't stop doing that, because the Internet is a big place, and you will probably be running TCP next week too. But to be maybe a little bit more schizophrenic and say, you know, what are we looking at in the relatively short term? It has to work everywhere, because something's going to go in the Linux kernel, and it's not going to know, you know, who it's talking to, or it's not going to get deployed at all.
T
They're like, like Cubic; I mean, that was the deal, right. You put Cubic in the Linux kernel and then it has no idea who's trying to talk to it or from where. So that's a use case. And just thinking, especially if you guys are talking about really different kinds of congestion control mechanisms, like BBR and PCC and the next five that somebody will come in and want to talk to us about, you know, maybe that's something.
T
U
So my name is Nicolas Kuhn, I'm working for CNES, which is the French national space agency, and when I look at your curves, the comparison between your proposal and all the optimized TCP versions, I have to say that when we have 10-megabyte links, we can actually get that capacity. So that's just the other side, where you say that you get 17 times more capacity with your proposal, which...
U
That's it, 17. I looked at your paper, and basically you compare that with Hybla, which is not what is deployed in PEPs, and we have specific things there. So basically, you may want to revise this message, because I don't think that this curve, 10 megabytes over satellite, is the only one, and we can already achieve that, [inaudible]. So...
Q
U
U
U
Q
U
H
A
A
Apologies, my sincere apologies. This is me basically without adult supervision; this is why we needed it. Well, then, that's it, folks, and I'm going to apologize to the last presenter that we don't have the ten minutes to do the presentation. I'm sorry. You had a chance to do it in TCPM, so I'm not going to feel that bad, but we'll see you all next time.