From YouTube: IETF92-AVTEXT-20150324-1730
Description
AVTEXT meeting session at IETF92
2015/03/24 1730
A: [...] Agenda: we are very constrained for time. Lots of people came in with late documents; on time as far as their deadlines were concerned, but late as far as our asking for agenda time in this meeting was concerned. So we're going to try and get through this fairly quickly. As I said on the mailing list, I'd appreciate it if you could limit your questions to those that are basically around whether this is the right job for a header extension, i.e. the right thing for AVTEXT to do, and of course whether we want to support the work or not, and whether there is interest in doing the work. If it's something along the lines of "as you know, I've got this minor technical problem", don't bother raising it here; do it on the mailing list or save it for the next meeting, assuming we do have a discussion on it at the next meeting.
A: Splicing notification: we had a late IPR declaration after the adoption as a working group document. There have been a couple of mails on the list asking if there are any concerns with this, so this is your final time for asking: are there any concerns with this IPR declaration, and therefore with us proceeding with this document as a working group document?
A: Both the chairs are also concerned with declarations coming in late. Basically, ideally (and this applies to the four drafts on the table at the moment, which we're going to discuss later), please do try and make any declarations you know of at the time when it's still an author draft, so we can actually see those and take them into account when we make the decision as to whether we're going to work on these.
A: As a working group draft, it is much easier if we do get it then rather than later. Right, stream pause: we've had one working group last call, and we've had some changes. Is anybody seeing the need for another working group last call on the stream pause document, or are people happy with the document as it currently is? In other words, are there any other concerns that people want another review cycle on? So I think we'll just proceed with getting it tidied up for the publication request. You'll do that one, yeah?
B: So, on the header extension draft: it was adopted as a working group document Monday morning, following the consensus call last cycle, but I had neglected to ask the AD for the milestone until I was reminded, so I apologize for that. This is the plan, as suggested by the authors: they're going to update the draft based on the number of comments they received so far.
B: Since the last time it was presented, that is. So the expectation is they'll have an updated version of this document in early May, another update before the July meeting, and then hopefully we can go to working group last call in August or September, before the November meeting, and then do a publication request at the December meeting. Anybody have any suggestions or concerns with that timeline?
B: On the next slide: a layer refresh point, as I've defined it, is this. You have a receiver which previously only had the state to decode some of the layers of a layered encoded source, and after the refresh point it can decode a greater number of the layers.
B
So
next
slide
I
have
some
illustrations
of
this.
So,
for
instance,
this
is
a
spatial
refresh
point.
It's
the
ideas,
so
we've
got
two
spatial
layers
which
are
called
s0
and
s1,
and
a
decoder
is
currently
decoding
layer,
s0
and
then
at
frame
3.
He
can
certainly
coding
a
sorry
mess
one
because
frame
3
s,
1,
there's
no
dependency
on
previous
frames
in
that
layer.
It's
only
it's
only
dependency
on
the
lower
layer
and
future,
so
he
can
start
decoding
that
without
s0
stream
needing
any
kind
of
special
refresh
points.
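To make that rule concrete, here is a minimal sketch in Python of the property being described: a frame is a layer refresh point for a target layer if it belongs to that layer and depends only on lower layers. The Frame model and its field names are hypothetical, invented for illustration; the draft defines no such structure.

    from dataclasses import dataclass, field
    from typing import List, Tuple

    @dataclass
    class Frame:
        layer: int                  # 0 = S0 (base), 1 = S1 (enhancement)
        index: int                  # position in decode order
        deps: List[Tuple[int, int]] = field(default_factory=list)  # (layer, index) pairs referenced

    def is_layer_refresh_point(frame: Frame, target_layer: int) -> bool:
        # A decoder that so far only decoded layers below target_layer can
        # join at this frame iff it references nothing in target_layer or above.
        if frame.layer != target_layer:
            return False
        return all(dep_layer < target_layer for dep_layer, _ in frame.deps)

    # The illustration above: frame 3 of S1 depends only on frame 3 of S0,
    # so it is a spatial refresh point for layer S1.
    assert is_layer_refresh_point(Frame(layer=1, index=3, deps=[(0, 3)]), 1)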
B: Similarly for temporal: not needing earlier pictures in the temporal layer is, under most definitions of upper layers, the point at which you can start decoding the higher temporal layer without needing earlier frames of that temporal layer; in this picture, frame 6. Sorry, there's a typo on the slide: that should say "temporal refresh point" for layer T1 (copy and paste). And the next slide: this is the concept of temporal nesting.
B: So, just to say that it's possible you may never need this method if you're only doing temporal scalability with certain structures. Next slide. The layer refresh message is, as I said, an AVPF feedback message; the details are in the draft. It's a pretty straightforward AVPF message.
B: What you say is: you basically specify the new layer you want and, optionally, the current lowest layer you're able to decode. If you don't specify that lowest layer, it means you currently can't decode anything. How you describe layers, and how you identify when you've actually gotten the refresh point, is codec dependent. We've got layer definitions (at the moment sort of sketchy, just quick and dirty) for the four codecs we identified that have some form of temporal or spatial scalability.
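For readers unfamiliar with AVPF, here is a minimal sketch of the RFC 4585 feedback packet framing that a layer refresh request like this would ride in. Only the common header (version, FMT, packet type, length, sender and media SSRC) is standard; the FMT value and the FCI layout below are placeholders, not the ones defined in the draft.

    import struct

    def avpf_feedback_packet(fmt: int, pt: int, sender_ssrc: int,
                             media_ssrc: int, fci: bytes) -> bytes:
        assert len(fci) % 4 == 0, "FCI must be 32-bit aligned"
        length = 2 + len(fci) // 4            # RTCP length: 32-bit words minus one
        header = struct.pack("!BBH", (2 << 6) | fmt, pt, length)
        return header + struct.pack("!II", sender_ssrc, media_ssrc) + fci

    # Hypothetical FCI: "give me a refresh up to layer 2; I can decode layer 0".
    fci = struct.pack("!BBH", 2, 0, 0)        # target layer, current layer, reserved
    pkt = avpf_feedback_packet(fmt=10, pt=206,  # 206 = payload-specific feedback
                               sender_ssrc=0x1234, media_ssrc=0x5678, fci=fci)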
B: There are still a lot of details needed, especially on how you recognize where the refresh points are, because I think identifying layers is usually straightforward, while identifying refresh points, so you know when you've been satisfied and can start using the layers, is often trickier. Any comments so far, by the way? Then the next slide: the applicability to the multi-stream modes.
B: The question for the multi-stream modes is always: could you just use FIR, i.e. send FIR for one of the streams? And the question I have, the problem I have, is: nowhere have we actually specified what FIR means for any sort of multi-stream encoding of a layered source. Does it refresh the individual stream? Does it refresh the whole source? I think there are cases where it would be useful to have refresh-the-whole-source semantics, at which point saying that FIR just means "refresh the stream" is problematic.
G: It's Bernard; just a general comment. At least typically for temporal scalability, it doesn't make a lot of sense to send an FIR for an extension layer, right?

B: That is certainly true, yes.

G: Right, so in some sense it's kind of an irrelevant question, because you'd send it to the base SSRC, and that would...
B: Yeah, but I think most people don't tend to do multi-stream temporal scalability as much anyway. I mean, I don't know; you guys would be the most multi-stream guys, so how do you do it? Is everything actually a separate stream? Yeah.
B: I guess the main thing is that I don't actually know how existing stuff works, and whether we want to say that FIR means refresh the whole thing. I mean, if you're just doing temporal, it's probably six of one, half a dozen of the other; it's multi-stream spatial where I get confused.
A: A layer refresh for the base layer is equivalent to an FIR, I would say.
H: So the message tells the sender to create a refresh point, yeah? Is there anything we need to specify on how the receiver knows when it's received that refresh point?
B: Yes. I currently have text in there for VP9, because that's the one we were actually working on when we realized this was a good idea. That will also need to be codec specific; just recognizing when you have an IDR is codec specific. So in the draft I currently have sections for the four codecs I've identified which have scalability that people care about. How that's recognized does need to be codec specific, yeah.
B: Exactly. So it's possible we'll discover that for some codec you can't actually detect it, so we'd have to not allow it for that codec. But we're hoping that, certainly for anything that's not yet a published RFC, we can figure out how to identify this for all of them. Arguably you could...
B: Yeah. I mean, the question is where it lives documentation-wise, whether it's in the payload type documentation. For anything in the future it would clearly be in the payload format definition; anything that's already a published RFC will be covered in this document; for things that are currently in progress, it depends on where they are in the cycle. Arguably...
A: Okay, so I guess the fundamental question (we'll confirm all these on the list) is basically a show of hands, preferably, because it's easier for me to see. Show your hands if you think this is something the IETF should be working on, and the adverse: no, you don't think this is something the IETF should be working on. And then I'll be asking after that for a show of hands on interest in actually supporting the work, just to give it some prioritization.
I: Okay, so this is the video frame marking RTP header extension proposal; the co-authors are Espen and Suhas. Next slide, please. The reason for this work: it's primarily derived from some parallel work happening in AVTCORE; well, actually, now it'll probably be a new working group. There are some security architectures being proposed where the traditional middleboxes will not have access to media keys, and so the payload will potentially not be visible to those middleboxes that do the actual media switching in those end-to-end private media architectures.
I: We need some more critical metadata available at the switch in order to make forwarding decisions possible. So the goal of this work is to have payload-agnostic switching, where these middleboxes never have to inspect the payload and can still perform all their forwarding functions; again, mainly because the payload may be encrypted. Even if you're not doing end-to-end private media, if the payload is encrypted you can avoid the cost of doing the decryption in the middlebox, for higher scale. But most important is the middle bullet: in these end-to-end architectures...
I: ...it's simply impossible to do any kind of payload inspection, and that seriously hampers some of the forwarding functions that you need to do in the switch. So we need something beyond the current RTP headers to make those decisions. And then a more forward-looking goal is to have the switching function be codec agnostic: truly capable of handling any codec format, potentially without even knowing what the codec format is.
I: As long as it knows that the format conforms to these metadata markings, it doesn't even need to know what the underlying codec is. Next slide, please. So the intelligence in the switch can help to provide cleaner video switching. Typically when these switches operate, they're usually using an active speaker rule, so that when the active speaker is talking (using the voice activity level in the RTP header extensions), their video will be switched in.
I: And if you do that naively, you won't cut at a clean I-frame boundary, and so there may be a glitch during speaker switching. So we need some kind of indicator so that the switch knows when it's able to deliver a full, clean intra frame. You'll also be able to have better recovery during packet losses.
I: If there's information about the priority of the packets in that media flow, the switch can make better decisions that affect fewer participants. And finally, if you have diversity in the capabilities of the participants, you can drop entire layers towards some of them, and the header extension will allow that. For the endpoints there's also a small side benefit of being able to have better recovery, the same as the switch would have: you can better identify leading and trailing loss scenarios, to know when you have full media frames. Next slide, please!
I: So the proposed solution is an RTP header extension, and it's split into two parts. There's a mandatory fixed-length codec-independent part, and that's what we really want to focus on today. There's also an optional variable-length part that's codec specific and could be defined for each payload format; we're initially proposing to define, in this draft, what that optional extension looks like for the popular payload formats.
I: New payload formats going forward would be encouraged to define what this blob means for those formats. We're using the standard header extension model from RFC 5285, so the length that's encoded there tells you whether or not an optional part is present and, if it's present, how big it is. And of course it'll be negotiated in SDP, so you can figure out whether or not you're going to get this info from your peer. Next slide, please. Okay.
I: So, the codec-independent part that we want to focus on: a few flags and then an identifier. The first pair of flags are start and end of frame; we'll talk a little more in later slides about why some of these are important and what their relation to current RTP header fields is. Then the next pair: independent and discardable frames. This is talking about what the encoded media frame is, whether it's what's traditionally called an I-frame or a keyframe; we're calling it "independent" here to make it more general.
I: Basically, independent means that this frame doesn't depend on anything that came before, and discardable means that this frame can be dropped without any kind of impact on the decoding dependencies going forward. The next two are for the temporal scalability that was just presented on. The base layer sync indicator lets you know when you can effectively resynchronize to higher enhancement layers, and the temporal ID is similar to the temporal ID that Jonathan was talking about. Next slide, please.
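As a concrete illustration, here is a sketch in Python of packing and parsing that one mandatory codec-independent byte: start/end of frame, independent, discardable, base layer sync, and a 3-bit temporal ID. The bit positions are one plausible layout consistent with this description, not necessarily the exact ones in the draft.

    def pack_frame_marking(s: bool, e: bool, i: bool, d: bool,
                           b: bool, tid: int) -> int:
        assert 0 <= tid < 8, "temporal ID is 3 bits"
        return (s << 7) | (e << 6) | (i << 5) | (d << 4) | (b << 3) | tid

    def unpack_frame_marking(byte: int) -> dict:
        return {
            "start_of_frame":  bool(byte & 0x80),
            "end_of_frame":    bool(byte & 0x40),
            "independent":     bool(byte & 0x20),
            "discardable":     bool(byte & 0x10),
            "base_layer_sync": bool(byte & 0x08),
            "temporal_id":     byte & 0x07,
        }

    # First packet of a keyframe in the base temporal layer:
    m = pack_frame_marking(s=True, e=False, i=True, d=False, b=False, tid=0)
    assert unpack_frame_marking(m)["independent"]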
F: Quickly, two comments. One: it is entirely legal, and useful in some scenarios at least, to have, for example, a discardable base layer picture. So don't get too much into this MPEG-2 type of thinking; it's over, it was done 10 years ago. Second comment: you have absolutely no space for any extensions in there. I would suggest grabbing another byte and calling it reserved, just to be future proof. Okay.
G: I'll answer that question, at least for H.264: the answer is definitely no, you don't want all the crap that's in there in this extension. It would be worthwhile, I think, as part of this exercise, to go off and look at, for example, stuff like VP8: what's in there and why you need it, or don't, in the rest of it. But I do think that things like the KEYIDX... I would personally argue that certain fields that are missing are needed, but I think that would be...
I: ...little bits for the discussion, sure. Go ahead; next slide, please. Independent we've already covered: it's to allow clean video switching at intra-frame boundaries. Discardable we've already touched on: to drop packets whenever you have congestion, with the least disruption to the video quality. Next slide, please. The temporal ID: if you're not doing scalable media, this would always be zeros.
I: But if you have a defined hierarchy, i.e. you're using temporal scalability with a well-known defined hierarchy and signaling, this would be very useful for middleboxes to be able to peel off individual temporal layers. And signaling may already provide some part of this: if you're using MRST or MRMT, these might already be coming on different SSRCs, and there may be signaling means to associate those sources with what layer they are. So in those cases it may not be needed, but in SRST you would have no easy way to discern that without having something like this. Then the base layer sync is useful, once you have loss and need to recover, for knowing when you can start forwarding enhancement layers: at a point when you know that the base layers have been synced. It's kind of a hard point to convey quickly here in the few minutes, but for the few people that know the VP8 Y bit: it's essentially the same thing as the VP8 Y bit. Next slide, please.
A: Right, who's read it? A few: one, two, three, four, five, six, seven, eight, nine. Any concerns on scope, or on whether this is the right solution, i.e. in a header extension? That's basically what I'm interested in now.
B: My concern is that while it feels pretty solid for temporal scalability, it feels a little underbaked for spatial. We can talk about that offline, but I feel like I don't have a clear notion of whether this would do everything I need for spatial, whereas for temporal it feels very solid.
I: I agree, and it's mostly because the codec-agnostic part of spatial scalability could possibly be abstracted out, but it's a little more nuanced. Temporal scalability is uniform across the board; the only difference is that VP8 has two bits and everybody else has three. That's the only difference, so give it three bits and you're done. Spatial and quality scalability are a little more confusing, because in H.264 you have seven bits and they're partitioned hard into three and four, and in H.265 it's not hard-partitioned; it's six bits.
G: I more or less agree with Jonathan that the temporal stuff is well understood. I think to really get down into this you have to go over a lot of specifics about how implementations really work: what they use, what they don't use. But it's a good start, and there's no claim it's done yet, so we'll kind of have to have the discussion; probably go over the ins and outs of various codecs, what is used and what isn't, and also, frankly, what feedback messages are used.
D: Hi, I'm Colin Perkins. I agree that if we're going to do the sort of private cloud switching stuff, then we need something to solve this problem, and I think this is one reasonable way of solving that.
A: I think we need to get to the questions; I'm going to ask them anyway, but I think there's scope for a bit more discussion on the list on the actual scope. I'd like to get a viewpoint on what the support is in the room for doing the work. So, same questions as last time: those who think this is something the IETF should be working on, versus those who do not, and then I'll ask for interest in supporting the work. But I do expect to see some more discussion on the list on scoping.
A: So I think what I'm expecting you to do, Colin, is post that to the list, which will sort of initiate the discussion. Is that something you can do? Basically, that's what I want to see on the list: what is now the real scope of what we're trying to do, and what are the constraints on that scope, so we can actually know what to write in the milestone.
F: Real quickly: I just checked something here in real time. I think we do have IPR on this stuff, so I don't know whether you want to reopen the voting based on this information. I will verify that.
K: My name is Bob Berman; I'll be presenting this draft about reference picture verification information, which again involves things you want to handle per specific codec. Next, please. There are multiple IPR declarations on this one. Next, please. So, the motivation for this is that with modern video codecs you often have multiple reference pictures, which you can use for improved compression, and you can also use them to recover from decoding errors and packet losses.
K: For example, the HEVC codec also has the possibility to include decoded picture hash information, to really verify that you decoded the frame correctly; this is usually left out, since with previous codecs there's no per-pixel exactness measure, but here you have that. This message is introduced to enable the use of these features: having a controlled way of referencing pictures that may be lost, and also checking the correctness of the decoding.
K: We propose to use an RTCP feedback message to enable the communication of this information from the decoder to the encoder. You can indicate multiple decoded pictures that are to be used for reference, or indicate that a specific picture was not decodable, maybe because of loss. You can also indicate that a specific picture was decoded but with an incorrect result, and thus should not be used for reference. And you can explicitly include the decoded picture hash for the encoder to evaluate whether this was correct or not.
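To illustrate the decoded picture hash idea, here is a minimal sketch that hashes the reconstructed samples of each plane so that sender and receiver can compare decoding results. HEVC's decoded picture hash SEI supports MD5 among other checksums; the per-plane MD5 below is in that spirit and is not the exact SEI or RTCP encoding.

    import hashlib
    from typing import List

    def decoded_picture_hash(planes: List[bytes]) -> List[bytes]:
        """One 16-byte MD5 digest per plane (e.g. Y, U, V)."""
        return [hashlib.md5(p).digest() for p in planes]

    # The receiver reports the digests of its decoded frame; the sender compares
    # them with the digests of its own reconstruction to decide whether the
    # picture is safe to keep using as a reference.
    y, u, v = b"\x10" * 64, b"\x80" * 16, b"\x80" * 16   # toy 8x8 4:2:0 frame
    assert decoded_picture_hash([y, u, v]) == decoded_picture_hash([y, u, v])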
K: Next, please. There is some relation to existing feedback messages. The most obvious one is the RPSI in RFC 4585, the AVPF spec, but there you only have the possibility to indicate a single picture, not multiple. AVPF also has the picture loss indication, but there you can only indicate that pictures are lost, not which ones, and specifically not which of certain reference pictures; and there is no mechanism to indicate an incorrect decoding result.
A: Let's see if we can get the discussion equally precise and concise. Who's read the draft?
G: This is just more of a question about relevance. Having done a little bit of asking around, it turns out that RPSI is typically not used in conferencing scenarios, and I'm just wondering here: do you think of this more as a person-to-person kind of thing, or a conferencing thing?
I: Yes. Mo Zanaty here. The direction that we've looked at, especially for conferencing, is more about exposing RTP-level data instead of codec-level data. So, instead of things where you surface codec-level frame IDs and codec-level dependencies between those frames, some implementations have gone a different direction and only focus on RTP sequence numbers and the delivery of those. And if you look at the congestion work that's being done, it's very likely that we'll have ACKs and things like that, for reasons...
I: ...mostly related to congestion control, but which could also be used for recovery and repair and resilience. So I'm inclined to think a better direction is to go towards RTP-level repair, rather than deep codec-level surfacing of those codec-specific values up to some other feedback messages or headers.
C: Cullen Jennings. Just given how many feedback messages of various types we've had over the years, and that there's not all that much optimization from combining them together, in the same way I'd prefer to see us pursue something like this that wasn't as obviously IPR encumbered, and see if we could find a royalty-free sort of version of it. It seems like an area where we easily...
H: Another question I have is: in order to actually use this at the source, the information has to get there very, very quickly. In some scalable scenarios maybe "quickly" is longer than if it's not scalable, but I think one of the reasons messages like this haven't been adopted in the past is that by the time the message gets to the encoder, the encoder has no real good way to repair a frame that was lost ten frames ago.
L: So I think one reason it might work more these days, with more modern codecs, is that they have bigger frame buffers and more depth, and can actually look further back into history. So you still have your old reference pictures in the buffers, and you can reach back and use them when you're repairing, rather than having to do an IDR to repair the loss.
B: It might mean we need a clearer definition of what the problem is that this solution solves, because I didn't feel like this draft, at least as it came out in the presentation, and I don't recall it from the draft either, explained what the use case or model was where this would be a useful thing to do. Yeah.
B
Mean,
like
all
the
things
you're
talking
about
like
you
know
the
round
trip
time
issues
and
I
mean
you
know
it
is
it
just.
You
know.
Hevc
has
this
feature
so
therefore
we
want
to
use
it.
I
mean
what
what
is
the
scenario
in
which
this
would
be.
You
know
I
mean
what
is
the
network
topology?
What
is
the
setup
things
like
that?
No,
why
does
it
be
the
best
useful
thing
to
do.
G
Bernardo
Mike's,
just
a
general
comment,
though
I
think
it's
important
to
take
into
account
the
comment
of
that
mo
made
and
kind
of.
What's
behind
it,
we're
having
an
explosion
in
video
codecs,
this
interest
in
privacy,
all
that
tends
to
lead
to
away
from
solutions
that
are
very
specific
to
certain
codecs
and
are
incompatible
with
privacy
right
if
we're
going
to
have
a
dolla
and
vp9
and
v10
and
h.265
and
whatever
you
don't
want
to
have
a
zillion
specific
messages
that
only
work
with
one
specific
codec
right
nobody's
going
to
want
to
implement
that
stuff.
G
So
that's
kind
of
the
general
context
here
and
maybe
that's
explains
the
lack
of
enthusiasm.
It
certainly
does
on
my
part.
You
know,
especially
when
I
looked
at
all
the
feedback
messages
we
have
and
discovered
that
like
half
of
them
aren't
used
by
anybody
anymore,
you
know
so
inventing
yet
more
stuff.
That's
future
legacy.
It's
not
real
happy.
A: So what we'll do is a quick call to see what the interest is in working on this now, versus taking another round of discussion as an author draft and seeing where people want to go with it. So basically: yes, we think this is a working group item now and we want to get working on it, versus...
J: Okay, so I'm proposing a header extension that's like the MID header extension, but for multiple layers instead of multiple media sources. Next slide. We have three different things, or three different layers, within our taxonomy: we have media sources; we have encoded streams or dependent streams, in other words different layers within a media source; and then we have different RTP streams. So a media source can have N layers, i.e. encoded streams and dependent streams.
J: Each one of those can have one, two, three RTP streams: RTX, FEC, or either. Before, we could always use payload type for all of them, but payload type has problems in the sense that you can run out; or you could use SSRC for everything, but SSRCs have their own issues. So we defined the MID extension for the class on the top row, which is media sources. Now, if you're not doing any simulcast or any layering, you can just use MID and payload type.
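As a data model, the taxonomy being walked through here looks roughly like the sketch below: a media source fans out into encoded/dependent streams (layers), each of which fans out into RTP streams. The class names are invented for illustration and are not definitions from the draft or the taxonomy document.

    from dataclasses import dataclass, field
    from typing import List

    @dataclass
    class RtpStream:
        ssrc: int
        kind: str                    # "media", "rtx", or "fec"

    @dataclass
    class EncodedStream:             # one layer of a media source
        layer_id: int
        rtp_streams: List[RtpStream] = field(default_factory=list)

    @dataclass
    class MediaSource:               # the thing MID identifies
        mid: str
        layers: List[EncodedStream] = field(default_factory=list)

    # A layered source: MID names the source, but nothing in the packets names
    # the layer; that is the missing identifier the proposal fills in.
    camera = MediaSource(mid="v1", layers=[
        EncodedStream(0, [RtpStream(0x01, "media")]),
        EncodedStream(1, [RtpStream(0x02, "media"), RtpStream(0x03, "rtx")]),
    ])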
J: MID was kind of a mess, and it cleaned up a lot of the mess for BUNDLE, where we have multiple media sources over the same network port. But now we have the same problem for multiple layers within the same media source, and so there's kind of a missing thing for that box. What I'm proposing, next slide, is that we fill in the box with a header extension that identifies the layer. And honestly...
J: This could potentially be part of this other header extension where we have a temporal ID, if instead we put in a spatial ID, but they didn't have a spatial layer ID in theirs. So if we don't have anything anywhere else, I'm proposing that we have a header extension where we can specify an ID that indicates the layer, the spatial layer; and that could be called an encoded stream, or it could also be for a dependent stream.
J: I chose to call it ES-ID but, like I pointed out here, it could be LID, for layer ID, or some other thing. So, going to the next slide: the benefit is you don't need to signal SSRCs, and there's no risk of running out of payload types, which were the same problems that we overcame with MID, and this could work in conjunction with the MID work.
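Since the proposal mirrors how MID is carried, here is a sketch of what an ES-ID could look like as an RFC 5285 one-byte-header extension element. The extension ID of 5 and the single-byte ES-ID value are hypothetical; the real numbering would come out of SDP negotiation.

    import struct

    def one_byte_ext_element(ext_id: int, data: bytes) -> bytes:
        # RFC 5285 one-byte header: 4-bit ID, 4-bit length-minus-one, then data.
        assert 1 <= ext_id <= 14 and 1 <= len(data) <= 16
        return bytes([(ext_id << 4) | (len(data) - 1)]) + data

    def rtp_extension_block(elements) -> bytes:
        body = b"".join(elements)
        body += b"\x00" * ((-len(body)) % 4)      # pad to a 32-bit boundary
        # 0xBEDE marks the one-byte-header form; length is in 32-bit words.
        return struct.pack("!HH", 0xBEDE, len(body) // 4) + body

    es_id = one_byte_ext_element(ext_id=5, data=bytes([2]))  # "this packet is layer 2"
    block = rtp_extension_block([es_id])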
L: My name is [...]. I think you started in the wrong group. I think you need to go to MMUSIC and actually discuss what type of configuration identifier this is, et cetera, because that part of this work is actually more relevant. I mean, putting it in a header extension is trivial; that's basically just writing the text for it.
L: The big question here is: can this ID be signaled in the SDP in a reasonable way, to actually mean something? That's what's interesting to me. I want some solution for simulcast that identifies the kind of configuration of the encoded stream, and payload type isn't enough from my perspective. So yes, in that sense I support the work, but I think we need to start talking, in the draft, about how this looks in SDP, rather than being here discussing it by itself, right? So...
J: The first half is to define the header extension, which, as you point out, is fairly trivial; in fact, I just copied the text from the MID header extension and renamed some things, and that was that part of it. I actually think that part is useful all by itself, and I would be happy if we just had that, for scenarios where we want to use this header extension outside of the realm of SDP, which is still valuable. The second half is: how do we map it in SDP?
J: For those who want to use SDP, how do we define that? The second half of the draft defines a proposal for that. But you're right in the sense that this might not be the right working group for it; maybe instead we define the header extension here, and then the MMUSIC group can take the header extension and incorporate it into one of their drafts for the work on simulcast in SDP.
C: Probably some part of it has to happen here as well, so I don't see why we have to block the part that's happening here waiting on something to happen in MMUSIC; it's not the fastest working group in the world. And just a quick question, and you can feel free to answer "read the draft": the ES-ID, in a scenario with, let's say, a conference with ten participants, doesn't have to be unique for every participant, right? It can be the same ES-ID used for the corresponding stream from each of them?
D: Colin Perkins. I have no objection to defining identifiers to correlate the different streams; I think that is generally a good idea. I have strong objections to the way this particular draft works: I think the way it overloads the SDP format parameter is completely and utterly broken, and I think the way it attempts to extend payload types is completely and utterly broken. So I would object in the strongest possible terms to this draft, although I think the general idea of correlating streams is something that's worth considering.
D: What's completely broken is not the wrapping: once you have defined an appropriate way of identifying these streams, then wrapping it in a header extension is trivial. But the mechanism you're using to identify the streams, the identifier you've chosen, does not work. So once you have identified an appropriate identifier, then sure, we'll put it in a header extension; it'll take 10 minutes to write the draft. That's not the hard part, right? So, Roni...
E: It doesn't make sense without the other part, because you've said that it should not be unique. That depends on how you define it in the SDP: the way it's defined now, maybe not, but if you define it differently, it may need to be unique. So until you go to the SDP part and define how it's being used, there's no reason to do this header extension. The way we had this before... we had this before in the application...
A: I mean, what I'm saying is the work can be separate. But have you actually scoped the problem sufficiently at the moment? People are coming up to the mic saying "we don't believe this is right", to the point that you basically need to go round the loop and have another look at the problem someplace.
A: Take the opportunity here to talk to the MMUSIC chairs, possibly in conjunction with other people, and see where you go next. But I think you're basically going to have to re-spin the draft so that it more clearly says: this is something that is valid, and this is the header extension that encompasses that thing that is now valid. Okay.
D: I don't think anyone is objecting to the header extension; that's not the problem here, right? The problem is not the idea of putting in a header extension to correlate streams; that's a wonderful idea. The problem is the particular thing you're trying to use as the header extension, the particular identifier you're choosing. It's...