From YouTube: IETF111-ANRW-20210728-1900
Description: ANRW meeting session at IETF 111, 2021/07/28 19:00
https://datatracker.ietf.org/meeting/111/proceedings/
A: Hello. I can be heard, but not seen, apparently, because Meetecho is not letting me turn my video on. How about now? Oh, there we go.

A: Andre, I have the videos queued up too, but since you're sharing the screen, do you want to play them? I just wanted to check.

B: We will help; we'll just play the videos.

A: Check, testing. I see audio waveforms, suggesting that I can be heard. Okay, awesome, great. So welcome, everybody. We...

A: I cannot hear you. Oh, that is why. Hold on a sec, I need to unmute this tab. Hello, can you hear me?

A: Oh, perfect, yeah. I was muting myself, you know, because there are 16,000 Zoom meetings going on at the same time. So.
A: Welcome, everybody, to session five, this session on DNS and privacy. Again, we'd like to thank the sponsors, but also, let's remind everybody of the logistics and the links here. So there is a Slack channel. If you have, you know, questions and such, that's a good place to put them. I will see them; Andra, our talented session chair for the last session, will also see them. That's a good place to do that.

A: The URL here, for your convenience, where you can access the program, the papers, and also the videos that you're about to see: they are all on the program website, thanks to Colin Perkins, who's also here. Great, thank you for putting everything up. So you can watch those again if there's something you really liked, and the papers themselves are also available in the ACM Digital Library, for your viewing pleasure.

A: If, for some reason, you want to follow up and read the papers, they're all available there. As a reminder, all these sessions are recorded, and the recordings will be available on YouTube after the workshop.
A: So, looking forward to an exciting session. Okay, yeah, a couple more points here: the videos of the presentations that we're about to see are pre-recorded, so we'll take questions at the end in a five-minute slot. And then, at the end of each session (we've got a two-hour slot, and each session runs an hour), we're going to end each of these sessions with a 15-minute panel for Q&A with all the authors presenting in the session.

A: Our rationale for that was, we thought it'd be a little bit more lively than just, you know, a back-and-forth with a single author after each presentation. So we organized things that way; hopefully we'll have a fun discussion on DNS privacy towards the end of the session, yeah.

A: Just some Meetecho stuff: to ask a question, just enter the queue, the mic-and-hand logo, and then I will call on you and enable the audio. And if you need more information on how to use Meetecho, there it is. Next slide, yep.
A: So here we are, session five, and our first presentation will be from Austin Hounsel. Austin, I see you're here. Long time, no chat, Austin. And not only am I the session chair, I'm also a co-author on this paper, so no hard questions, right?

A: No, but Austin, we're gonna play the video now, so we can queue that up. We can ask the Meetecho folks, wherever they are, to play "Encryption Without Centralization."
F: Hi, my name is Austin Hounsel, and I'm a PhD student at Princeton University. I'm going to be presenting some work today about distributing DNS queries across recursive resolvers to reduce centralization. This work was done in collaboration with Paul Schmitt, Kevin Borgolte, and Nick Feamster.

F: So, as we know, DNS privacy has become a significant concern within the past few years. We know that on-path network observers can infer which websites you're visiting by looking at your plaintext DNS queries. This kind of inference can occur within governments, or in coffee-shop networks, et cetera. So two protocols have been proposed to encrypt DNS traffic. One is DNS over TLS, or DoT. The other is DNS over HTTPS, or DoH, and we're seeing that, increasingly, DoH is being rolled out by various browser vendors and other software developers.

F: In fact, we're seeing that Firefox now uses DoH as the default DNS protocol for US users. Which, you know, encryption is a great thing; it's great that encrypted DNS is being increasingly deployed. However, with it, some users are growing concerned about ways in which the DNS may be centralized into a handful of large operators. And so we stand to ask: well, is there a way we can get encrypted DNS resolution while distributing our queries across multiple operators, instead of centralizing into a small set?

F: The stub resolver, then, as it receives queries, distributes them across multiple resolvers according to some user-specified strategy, such as a random model or a round-robin model. To prototype this, we forked dnscrypt-proxy, which supports DoH and DNSCrypt, and we added support for the hash and round-robin models, and we also evaluated some existing code that was within the proxy for random query distribution. We evaluate the performance of this.

F: Similarly, the round-robin model behaves as you might expect: the first query goes to the first resolver, the second query goes to the second resolver, the third to the third, and so on and so forth. And lastly, the random model also behaves as you might expect: the queries go to various resolvers without a particular order.
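The three distribution strategies described in the talk (hash-based, round-robin, and random) can be sketched in a few lines. This is a hedged illustration, not the dnscrypt-proxy implementation: the resolver names and the choice of SHA-256 as the hash are assumptions made for the example.

```python
import hashlib
import itertools
import random

# Hypothetical resolver labels, standing in for real DoH resolver endpoints.
RESOLVERS = ["resolver-a", "resolver-b", "resolver-c"]

def hash_strategy(qname: str, resolvers=RESOLVERS) -> str:
    """Hash model: the same domain name always maps to the same resolver."""
    digest = hashlib.sha256(qname.encode()).digest()
    return resolvers[int.from_bytes(digest[:4], "big") % len(resolvers)]

def round_robin_strategy(resolvers=RESOLVERS):
    """Round-robin model: first query to the first resolver, second to the
    second, and so on, cycling forever."""
    return itertools.cycle(resolvers)

def random_strategy(qname: str, resolvers=RESOLVERS) -> str:
    """Random model: queries go to resolvers in no particular order."""
    return random.choice(resolvers)
```

The hash model's key property, that repeated lookups of one name never reach a second resolver, falls directly out of `hash_strategy` being a pure function of the name.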
F: So if we look at this figure, which is a CDF of TCP and TLS setup times for CDN content, we can read the legend in this way: if you look at the Cloudflare-Google line, what we're seeing there is the CDF of TCP and TLS setup times when you use Cloudflare's DoH resolver to resolve Google CDN content. And essentially, what we're seeing here is that there really isn't too much of a difference being made in terms of using one particular resolver to resolve CDN content.

F: So these are measurements that were taken from four Amazon EC2 vantage points, in which we performed page loads using an instrumented instance of Firefox with Selenium in a headless browser, and we performed measurements with three different query distribution strategies. And so what we're seeing is that no single resolver, once again (or rather, no single model), seems to perform the best or the worst across different vantage points; we're seeing slightly different effects on page load times.

F: But everything looks roughly the same. And then, when you turn to a comparison of using a single resolver for all your queries, we're seeing a similar story, where each resolver seems to perform somewhat similarly in terms of its effect on page load times, whereas at certain vantage points you're seeing maybe a slightly higher effect than at others. So, for example, if you look at Oregon, you might see that there's a much more pronounced difference in page load times with certain resolvers.

F: Lastly, we look at the effect on how many unique domain names are seen by DNS resolvers whenever you use query distribution strategies. So we utilized a real-world data set from Allman et al. of about 100 homes in Cleveland, Ohio, that were connected to a fiber-to-the-home network. We utilized this traffic to perform our query distribution strategies after the fact, so to say, to kind of simulate...
F: ...how queries are seen by various resolvers on average. And this is fairly intuitive to understand, because if you think about it, in a hash-based model, queries for the same domain name should always go to the same resolver. It shouldn't be the case that two resolvers ever receive a query for the same domain name. Whereas if you look at the round-robin model, within the first couple of weeks we see that each resolver starts to gain access to more and more unique domain names from a given client.
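The intuition in this part of the talk, that a hash-based model confines each unique name to exactly one resolver while round robin spreads repeats of the same name across resolvers, can be checked with a small simulation. The resolver names and the toy query stream below are invented for illustration only.

```python
import hashlib
import itertools
from collections import defaultdict

RESOLVERS = ["r1", "r2", "r3"]

def unique_domains_seen(stream, strategy):
    """Return, per resolver, the set of unique names it observed."""
    seen = defaultdict(set)
    for qname in stream:
        seen[strategy(qname)].add(qname)
    return seen

def hash_pick(qname):
    # Keyed only by the name, so repeats always hit the same resolver.
    digest = hashlib.sha256(qname.encode()).digest()
    return RESOLVERS[int.from_bytes(digest[:4], "big") % len(RESOLVERS)]

_rr = itertools.cycle(RESOLVERS)
def rr_pick(_qname):
    # Position in the stream decides, so repeats of one name spread out.
    return next(_rr)

stream = ["a.test", "b.test", "a.test", "a.test", "b.test", "a.test"]
by_hash = unique_domains_seen(stream, hash_pick)
by_rr = unique_domains_seen(stream, rr_pick)
```

With two unique names in the stream, the hash model exposes exactly two (name, resolver) pairs in total, while round robin exposes more, matching the growth the talk describes.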
F: So, in summary, we present the design and prototype implementation of a refactored stub resolver architecture that allows for decentralized encrypted DNS resolution. We performed a preliminary evaluation of the stub resolver's performance, and utilized a real-world data set to evaluate how different query distribution strategies affect how many queries are seen by recursive resolvers.

F: What does your performance look like if you use a lot of resolvers, resolvers that are in different geographic locations? And how does this affect a client's privacy in a formal way, rather than the preliminary evaluation done here? Is there a way that we can formalize the privacy costs and benefits of using different query distribution strategies?
A: Great talk, yeah. Hey, Austin, all right. Let's see, I don't know if we have your video on, but I see a queue of two questions. So let me figure out how to turn my video on, yeah. I've got a question from Jim. Oh, oopsie, "remove from queue". Jim?
H: Yeah, yes. Absolutely excellent talk, Austin. I was a little bit confused by you saying that when you're querying these resolving servers, things are working out fine between, say, Google and Cloudflare and all the rest of it. If I understand correctly, the Cloudflare resolver service doesn't implement EDNS Client Subnet, so the information it's returning may not necessarily be an optimal IP address, save for something that's hosted on another network, or because of the routing information.
F: So, thanks for the question. I would say that we've been thinking more and more about CDN localization as we start thinking about doing collaborations with some folks in Cape Town, and we're planning on doing some measurements across Africa to see, you know:

F: okay, can we take these resolvers in North America and these clients in North America and extend our results to other countries, where we see similar effects in CDN localization? And what we're quickly starting to realize is that we're gonna have to think a bit more, a bit harder, about the evaluation we've done so far with regards to CDN localization; it's not quite as formal or as sophisticated as it could be. And we've definitely been thinking about questions related to EDNS for future work.

F: But that's not something we explored for this paper.
A: Let's see, we've got a question from DKG, if I can... hi.
G: Hey, thanks for doing this work, Austin. I really appreciate that you're trying to figure out how to formalize this, and I wonder whether you have any thoughts about the privacy implications of the two schemes that you have, beyond information learned from the DNS resolver. So, for example, in the hash-based scheme, it looks to me like each client is basically... I'm assuming that each client doesn't pick the same hash-based scheme; they're probably keyed by some individually selected hash. And I wonder if that...

G: ...which lookups you do to which servers: a resolver could record that, and then have a fingerprint of, like, who you or your device is, that they would find over time. So when you're formalizing the privacy evaluations, are there metrics like that that you plan to include? Like, how are you thinking about this?
F: So, gonna be honest, that is a scenario that I had not thought about in my head: the potential for using that kind of hashing strategy as a super-cookie. And I think that's very interesting to think about. You know, another kind of question that comes to our mind...

F: ...is, you know, yeah. So it may seem beneficial in the hashing scheme to use the second-level domain of a domain name as a way to ensure that all domains that are related to each other go to the same resolver. But then there's a further question: does it matter whether certain domain names go to certain resolvers at all? In the sense that, you know...

F: ...maybe there are certain sensitive domain names that you wouldn't want to go to a certain operator's resolver, right? Maybe that would reveal more information about you than you would like. So there's also a question of, you know: is there a way that a client could provide some kind of preference about the resolvers that it might wish to use, to avoid certain ISPs, or to avoid certain geographic locations?

F: And, you know, there are a lot of considerations beyond just, okay, how many resolvers do you use, or which resolvers, but also in what locations, across which ISPs. You know, if they're ISP-operated resolvers, maybe you wish to use them or not. So those are also considerations we're thinking about. But I guess, to answer your question directly: I had not thought about that scenario that you brought up, and that's definitely something to write down as we start thinking about formalizing this for future work. So I really appreciate that question.
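DKG's super-cookie concern can be made concrete: if each client keys its hash with its own secret, then the pattern of which probe names land on which resolver becomes a stable, client-specific fingerprint that colluding resolvers could observe. This is a minimal sketch of that scenario; the keys, probe names, and resolver labels are all hypothetical.

```python
import hashlib

RESOLVERS = ["r1", "r2", "r3", "r4"]

def keyed_pick(qname: str, client_key: bytes) -> str:
    """Per-client keyed hash: the mapping differs from client to client."""
    digest = hashlib.sha256(client_key + qname.encode()).digest()
    return RESOLVERS[int.from_bytes(digest[:4], "big") % len(RESOLVERS)]

def fingerprint(client_key: bytes, probe_names) -> tuple:
    # The pattern of which probe name lands on which resolver is stable for
    # a given key, so it can act as a re-identifying super-cookie.
    return tuple(keyed_pick(q, client_key) for q in probe_names)

probes = [f"site{i}.test" for i in range(16)]
fp1 = fingerprint(b"client-1-key", probes)
fp2 = fingerprint(b"client-2-key", probes)  # almost surely differs from fp1
```

The fingerprint is deterministic per key, which is exactly what makes it a tracking vector: the same device produces the same pattern every day.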
I: Hi, can you hear me? So this is slightly orthogonal to the idea of privacy, but it feeds into the idea of user experience. When you're talking about collecting sets of recursive resolvers, it might be worth including in that the characteristics of the resolvers, in terms of, for example, filtering or doing DNSSEC, because otherwise you'll potentially get different results from different resolvers, which could lead to a very strange user experience and intermittent behavior. So, just another thing to add into your selection criteria.
F: Yeah, absolutely. And I should say as well that we've already begun doing some work to explore whether we can do some kind of a user study, to explore: okay, do users have any understanding of what it might mean to distribute queries across multiple resolvers? What is DNS privacy to users, even technical users, right? And that's very much going to inform, among a wide variety of questions and preferences, what the default settings of such a system would be.

F: It's something that, frankly, we're kind of hesitant to provide any kind of answer for at first. Like, you know, maybe there shouldn't be a default strategy. But in terms of user experience, users probably do need some kind of default, as they're not going to set it themselves, and it's not immediately clear to us whether one strategy is better than another, or whether or not some users may want filtering resolvers or not. In terms of, like, you know...

F: If I'm a parent in a household, maybe I want a filtering resolver, versus somebody that is not a parent: maybe I don't want a filtering resolver at all. And how do you make these kinds of default decisions for users without making them get bogged down too much in the details?

F: And so that's definitely a consideration that comes to mind, especially as we look at various lists of DoH resolvers out there. Like, I think DNSCrypt has one on their GitHub page, and I think in one of the columns they indicate whether or not it's a filtering resolver, and that was something that came to our mind. Like, well, yeah, if we're going to be doing measurements across many resolvers, we've got to be sure that we're going to get the answers that we expect, just from a scientific perspective.
A: Siobhan, we have one more question in the queue. I know we're a little over time for the first talk, but let's take a look. Can you hear me?
J: Oh, cool, thanks, yeah. This is not a fully-thought-out idea, but to address what DKG brought up earlier about the fingerprinting concern: would it help at all if, globally, it was the same domain-name-to-resolver mapping, and all clients would use the same, you know, second-level-domain-to-resolver mapping? This is not ideal, but at least... I mean, you would still be able to find out that, for a certain domain name, everyone's queries are on this...

J: ...resolver. Like, an attacker would be able to do that, but, I mean, this would still be better than the status quo, perhaps.
F: Yeah, that might be better. I mean, something to consider, too, is whether even the hash-based strategy is the best we can do in terms of privacy, right? Like, even if we all agreed that we should use the same hashing scheme for all clients, maybe we can do better in terms of privacy than hashing domain names. It's something that we're still thinking a lot about, because we're not entirely convinced that that's the best we can do; it's something we're still actively thinking about.
A: Thanks, Austin. Great questions. I see there's one more question, but I think we should probably get on. So, Shawn, if you could hold your thoughts until the panel, that would be appreciated. Hang on to that thought, because we're definitely up against time here. So, Meetecho folks, maybe we can queue up our second talk, which is "Institutional Privacy Risks in Sharing DNS Data."
L: In our work, we introduce a new aspect of DNS privacy, which we call institutional privacy. It is concerned with the behavior of an institution's traffic as a whole, in contrast to individual privacy. This has not been closely studied before, but it's important to look at, because an institution's internal activities, such as sending or receiving an email, can leave a digital trail in the DNS ecosystem.

L: Based on this motivation, our contributions in this work are, first, to define institutional privacy as a new privacy risk in DNS. Within the model that we define, we give a methodology for finding institutional privacy leaks. We then apply this methodology to real-world data that's anonymized, show the privacy risks, and also demonstrate that the anonymization method used is not sufficient to prevent institutional leaks.

L: Examples of specific activities that may be confirmed through DNS include, for example, two institutions sending or receiving email between each other, which may reveal a relationship between them that's not known publicly. Another activity is an employee of a company accessing a privacy-sensitive website, or a website that's considered embarrassing from the company's perspective, such as an illegal or an adult website, while on the company's network.
L
For
our
threat
model,
we
consider
an
adversary,
that's
the
authoritative
server
and
has
either
accessed
the
server
logs
or
the
traffic
between
the
recursive
and
authoritative
server,
and
the
goal
of
the
adversary
is
to
associate
the
source
ip
and
the
domain
name
and
the
query
in
the
query
to
the
corresponding
institution,
the
adversary
targets,
an
institution
that
meets
two
kind
of
conditions.
The
first
one
is:
the
institution
must
run
its
own
recursive
resolver.
L
This
would
let
the
adversary
use
the
resolver's
ip
address
to
uniquely
identify
the
dns
traffic
of
the
institution,
and
the
second
condition
is
the
institution
must
route
traffic
from
its
own
autonomous
system,
and
this
would
let
the
adversary
map
the
resolver's
ip
address
to
the
corresponding
autonomous
system.
That's
owned
by
the
institution.
L
We
study
queries
from
66
institutions
that
meet
the
conditions
that
I
just
listed
and
also
represent
a
diverse
set
of
sectors
such
as
largest
mp500
companies,
government
institutions,
university
of
southern
california,
schools
and
so
on,
and
when
we
pick
those
companies,
we
exclude
institutions
that
have
apparent
deniability,
such
as
isps
or
hosting
service
providers,
because
queries
from
from
this
companies
might
might
be
coming
from
their
customers
and
not
from
their
employees
and
examples
of
potential
real
world
adversaries
in
our
model
include
dns
service
providers
and
also
researchers
that
have
access
to
data,
that's
shared
by
the
service
providers
for
research
purposes
and
also
government
level
actors
that
have
the
means
to
piece
drop
on
dns
traffic.
L: For the source IP, we use public IP-to-AS-number mapping data to map the source IP to the institution that owns the AS, and this works even if a partial, prefix-preserving anonymization method is used. Second, for the domain name, we use public WHOIS data to map the domain name to the institution that owns the domain. Here we assume the domain is sent in full, that is, that QNAME minimization is not being used. Once we identify queries related to an institution, we filter and find queries that are related to email exchange.
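The attribution pipeline just described, mapping the source IP through public IP-to-AS data and the queried name through WHOIS, then filtering for email-related queries, might look like this in outline. The lookup tables, the `/24` prefix granularity, and the DNSBL zone suffix here are toy stand-ins, not the datasets or rules from the paper.

```python
# Toy stand-ins for public IP2AS and WHOIS data.
IP2AS = {"192.0.2.0/24": "ExampleCorp"}     # prefix -> owning institution
WHOIS = {"examplecorp.com": "ExampleCorp"}  # 2nd-level domain -> owner
DNSBL_ZONE = ".dnsbl.example."              # hypothetical blocklist zone

def prefix_of(ip: str) -> str:
    # Prefix-preserving anonymization keeps the top bits, so a /24 lookup
    # still works even when the bottom 8 bits are anonymized.
    return ".".join(ip.split(".")[:3]) + ".0/24"

def attribute_query(src_ip: str, qname: str):
    """Map a (source IP, query name) pair to (querying, queried) institutions."""
    source_inst = IP2AS.get(prefix_of(src_ip))
    sld = ".".join(qname.rstrip(".").split(".")[-2:])
    target_inst = WHOIS.get(sld)
    return source_inst, target_inst

def is_email_related(qtype: str, qname: str) -> bool:
    # MX lookups and DNSBL lookups are the email-related signals studied.
    return qtype == "MX" or qname.endswith(DNSBL_ZONE)
```

The point of `prefix_of` is the paper's observation: anonymizing only the bottom eight bits of the address leaves the AS-level attribution fully intact.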
L: We apply this methodology to one week of B-Root data, and we also reproduce our results on a second week. This data set is anonymized using a prefix-preserving method: the bottom eight bits are anonymized. We ran this experiment with IRB approval and with the permission of the data owners. The research questions that we looked at are, first, how common are sensitive, email-related queries, and then, within these queries, are we able to find specific relationships between institutions that are not otherwise publicly known?

L: So first we looked at how common MX and DNSBL queries are in the data set. In this plot, the x-axis is the seven days of data that we looked at, and the y-axis is the number of queries in millions. We can see that several million DNSBL and MX queries are made each day, which can be a significant source of leakage of email-related traffic. And within these millions of queries, we can go further and ask:

L: are we able to narrow down and look for specific relationships between institutions? In this slide, the plot on the left shows a breakdown of those MX queries by the different sectors of companies that we studied, and the plot on the right narrows down even further and looks at queries made to specific institutions, such as Palantir and Well.
L: So, based on these results, our recommendations are for institutions to deploy QNAME minimization wherever possible; we recommend even faster adoption of this mechanism.
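For context on this recommendation: with QNAME minimization (RFC 7816, updated by RFC 9156), a resolver asks each delegation level only for the next label, so the root sees `com.` rather than the full name. A toy illustration of the sequence of names exposed per level:

```python
def minimized_labels(qname: str):
    """Names exposed at each delegation level under QNAME minimization,
    root-facing query first: the root server sees only the TLD label."""
    labels = qname.rstrip(".").split(".")
    return [".".join(labels[-i:]) + "." for i in range(1, len(labels) + 1)]
```

For `mail.example.com.` this yields `com.`, then `example.com.`, then the full name, which is why a root-level observer loses the domain-name side of the attribution described in the talk.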
L: Another option is to use a mechanism called local root, which caches the root zone locally, so that the recursive resolver does not have to make queries to the root server. And for service providers, our recommendations are: first, we have shown that host-only anonymization is not sufficient for protecting institutional privacy.

L: So we recommend that service providers put legal constraints in place when sharing DNS data, and, for the case of wider sharing where that's not possible, that they look into more rigorous privacy-preserving data-sharing approaches. In this work, we have shown that DNS queries may leak significant institutional information that's otherwise not publicly known.

L: Therefore, we recommend that institutions deploy QNAME minimization wherever possible, and that service providers evaluate institutional privacy risks when sharing DNS data for research purposes. With that, I conclude my talk. Thank you for listening. Our paper and data can be found at the link shown on this slide. Thank you.
A: Thanks, thanks so much. I was waiting to see... oh, good, we've got Hashem in the question queue. Hashem, I'll remove you, and hopefully you can speak up.

A: Basi, can you hear the question? The question is about: have you considered the use of differential privacy to essentially make it more practical, or possible, to share some of this data across institutions?

A: Yes, the question from Hashem was: have you considered the use of differential privacy to make it more feasible, more practical, to have institutions or organizations share this DNS data that you're talking about in your talk?
A: Seeing nothing else in the queue, and given that we're a little bit behind time, I think I'll ask Meetecho to queue up the third video. That'll leave us, hopefully, time for Q&A on that talk, as well as a little bit of group panel discussion at the end, where I'll ask all of our speakers to join again. So here we go: talk three.
N: Hello, everyone, welcome to our presentation. I am Tianxiang, and I am here to present our work, "DNS over TCP Considered Vulnerable." This is work together with Dr. Haya Shulman and Professor Michael Waidner. We are all from the German National Research Center for Applied Cybersecurity ATHENE in Darmstadt, Germany.

N: Our motivation is based on our previous work on DNS over UDP. We all know that DNS over UDP is vulnerable to IP-fragmentation-based attacks. So then another question is: what about DNS over TCP? Is it also vulnerable to IP fragmentation attacks, or not? We searched the related works, and we found that it's widely believed that IP fragmentation attacks don't work with TCP. For example, there's a Best Current Practice draft, "IP Fragmentation Considered Fragile."

N: It says there are alternatives to IP fragmentation, which is using TCP with PMTUD. There's also another Best Current Practice draft, on fragmentation avoidance in DNS, which says TCP is considered resistant against IP fragmentation attacks. Also, at last year's DNS-OARC meeting, the event for the DNS operations community, it was said that TCP normally implements PMTUD and can avoid IP fragmentation of TCP segments.

N: So is it real? Does DNS over TCP really protect against IP fragmentation attacks? To check this, we designed our evaluation in the Internet: we want to trigger fragmentation over TCP on name servers in the Internet, and then compare with UDP.
N: As a comparison, we do a similar evaluation over UDP, but considering that UDP is a connectionless protocol, it's a little bit different. We first send out the DNS request and then get the response. Afterwards, we send an ICMP "packet too big" instead of waiting for the retransmission, since UDP doesn't do retransmission. Then we send another DNS request, the same as before, wait for the second DNS response, and check whether it is fragmented or not.

N: When we checked among those domains only vulnerable to fragmentation over TCP, we found that there are even 76 domains which had the TC bit set when we tested them over UDP. So what does that mean? TC means truncated: it is set when the packet is too big, when the DNS payload is too big for UDP, or when the server is limited.
N: Based on our previous evaluation, we propose a potential exploit. As we present here, this is an example of a fragmented DNS packet over TCP. The packet above is the first fragment; the packet at the bottom is the second fragment. So we can inject a malicious payload into DNS via IP fragmentation over TCP.

N: In UDP there is the UDP source port as a challenge; here in TCP, too, the TCP challenges are the TCP source port (here, since we show the packet from the name server to the resolver, it's the destination port), and also the TCP sequence number and acknowledgment number. Both of them are 32 bits, so it's believed to be really hard to spoof. But, as you can see, both of these challenges, the TCP challenges and the DNS challenges, are in the first fragment.
N: There are many works making use of IPIDs; I won't dig into the details here, but we did a quick check. We found that there are even 2,000 domains still using globally sequential IPIDs for TCP. That means they use a global counter for the IPIDs of all their TCP connections.
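A globally sequential IPID counter is easy to spot from a handful of back-to-back probes: the IDs increase by small steps regardless of which connection they belong to, which is what lets an off-path attacker predict the IPID of a victim fragment. A rough heuristic sketch; the step threshold is an assumption for illustration, not a value from the paper.

```python
def looks_globally_sequential(ipids, max_step=8):
    """Heuristic: a host using one global IPID counter returns IDs that
    increase by a small step across back-to-back probes (mod 2**16)."""
    steps = [(b - a) % 65536 for a, b in zip(ipids, ipids[1:])]
    return all(0 < s <= max_step for s in steps)
```

A host with randomized IPIDs would fail this check almost immediately, since consecutive differences would be scattered across the 16-bit space.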
N: So what are the countermeasures? We propose countermeasures in different layers. For example, for IP, the easy, dirty way is just to filter the fragments, or just to filter the small fragments, like Google does: Google filters all the small fragments smaller than 500 bytes, which makes it less vulnerable. But sometimes filtering the fragments will make the network not work well. So another way is just to randomize the IPIDs, to make them hard to spoof, but still, sometimes it's still vulnerable to some side-

N: channel attacks, like before. Another way is to disable PMTUD at the TCP layer. Other research works found that PMTUD is not really useful, so we can just disable it, and we can also simply filter ICMP "packet too big" messages. Another way to counter these attacks is just to enable DNSSEC, because DNSSEC is like the final solution to all DNS cache poisoning attacks, but we just need to make sure that DNSSEC is configured properly.
A: Thanks, thanks so much, Tianxiang. Do we have questions? I think, yeah, I'm here to take questions, if we have any questions for the speaker.

A: Yeah, I see Mark Andrews in the queue. Mark, do you want to ask a question?

A: Mark, you're up. Oh, I probably have to "remove from queue". Let's do it.
K: [Did you look at] DNS cookies, as in clients using DNS cookies, to see whether they were actually affected? And the second question is: have you looked at the proposed well-known TSIG as a protection mechanism? There is a draft from 2019 about that, with my name on it as well.

K: There's a... using a well-known TSIG shared secret is one possible countermeasure. In other words, using TSIG for every DNS transaction, with a well-known shared secret, is also a countermeasure, which is theoretically doable at the DNS level. And the other question was: did you actually do an evaluation against a client-server pair which supports DNS cookies?
A: I see a question from Punit here; Punit's in the queue. Let me take Punit.

P: I think, looking to the future, that is the long-term solution to transport-level cache poisoning risks. Of course, there's still a lot of work to do, but it should be mentioned here.
N: Yeah, yeah, but the transport-layer protections only work if they are widely deployed everywhere. So, for example, the DNS over HTTPS mentioned before: it only protects from the client to the resolver, but still doesn't protect from the resolver to the name server. So if you really want to use the transport-layer protection protocols, they must be deployed everywhere. That's why we didn't include them in the countermeasure part. But that's true; that's a suitable countermeasure. Thanks.

D: Nick, we cannot hear you; you're muted. Sorry.
A: Sorry about that. We've got about six minutes to go. I see Peter in the queue, and I think let's take Peter's question if it's for Tianxiang. But let me invite our previous two speakers back on stage, if you will, so that we can have a more general discussion about these three papers.
E: Hello. I believe you said in the chat that when you cause a TCP server to fragment, they stop setting the DF bit, right?

N: Yes; setting it, that would be a mitigation, yeah. It's like a quick, dirty mitigation.
A: Hashem, I see you're back in the queue, so, Hashem, welcome back.

M: Yes, thank you.
F: Yeah, that seems reasonable. I'm not exactly sure what metric we would use to measure load, but the idea seems correct. I mean, something else that we experienced, actually, as we were testing the software, is that, from the UX perspective in general, with these strategies there may be some scenarios where, if a resolver goes down, a user may be able to use certain websites but not others. I'm not sure whose microphone it is, but there's some feedback.

F: Right, so you can imagine, as I think you're suggesting, that if a resolver goes down, that may be a problem for users, because they may be able to access some websites but not others, and from a UX perspective that might be hard to understand. So, certainly, if there's some way we can test out load or reliability...
M: Yeah, so there are different ways to measure how busy a given resolver is: by, for example, measuring the CPU utilization. You can also count the number of queries you send to a given resolver, and if that many have been sent to it, you can guess that it is very busy and has more load than others.
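The query-counting idea mentioned here can be folded directly into a distribution strategy: track how many queries the client has sent to each resolver and prefer the least-loaded one. This is a minimal sketch of that idea; the class and resolver names are invented, and real load would also reflect other clients' traffic, which a single stub cannot see.

```python
from collections import Counter

class LoadAwarePicker:
    """Approximates each resolver's load by the number of queries we have
    sent it ourselves, and always picks the least-loaded resolver."""

    def __init__(self, resolvers):
        self.counts = Counter({r: 0 for r in resolvers})

    def pick(self) -> str:
        resolver = min(self.counts, key=self.counts.__getitem__)
        self.counts[resolver] += 1
        return resolver
```

With equal counts this degenerates to round robin, which is one reason the talk treats query counts as only a rough proxy for busyness.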
M
Things like that, mm-hmm, yeah. We did something for load balancing: we have many servers and we would like to load-balance the traffic to them. We have ways to make sure we distribute the traffic fairly, in such a manner as to optimize the performance of the whole system.
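The load-aware selection idea discussed here can be sketched in a few lines. This is an illustrative sketch only, not code from any of the papers: the class name and the use of in-flight query counts as a proxy for load (since a client cannot observe a resolver's CPU utilization directly) are my own assumptions.

```python
import random
from collections import defaultdict

class LoadAwareSelector:
    """Pick a resolver, biasing away from ones with many in-flight queries.

    The in-flight counter is a stand-in for real load signals such as
    CPU utilization, which a client cannot observe directly.
    """

    def __init__(self, resolvers):
        self.resolvers = list(resolvers)
        self.in_flight = defaultdict(int)

    def pick(self):
        # Weight each resolver inversely to its outstanding query count,
        # so a busy resolver is chosen less often but never starved.
        weights = [1.0 / (1 + self.in_flight[r]) for r in self.resolvers]
        choice = random.choices(self.resolvers, weights=weights, k=1)[0]
        self.in_flight[choice] += 1
        return choice

    def done(self, resolver):
        # Call when a response (or a timeout) arrives.
        self.in_flight[resolver] = max(0, self.in_flight[resolver] - 1)
```

Counting queries, as the speaker notes, is only one possible signal; the same skeleton would accept response latency or error rate as the weight input.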
F
Right, and that would seem particularly relevant for the random model, if that was something we wanted to use, because you wouldn't want to accidentally use a resolver for many different queries that is under high load, as you suggested.
A
Thanks a lot for your question, Sean; there's certainly a lot of interest. I don't see too many other questions in the queue. Since we have a panel here, let me pose a question.
A
One question that, I think, maybe relates to the first two talks a little bit more, but even in the case of, you know, Austin, your proposed solution: there's still a proxy that gets to see some of the traffic before it's encrypted, and so we're moving encryption from the browser into a different place in the network.
A
What
do
you
all
think
about
like
the
implications
of
that
for
for
visibility
on
the
on
the
on
the
privacy
side
and
maybe
on
the
security
side
as
well
right?
So
it's
just
different
entities
who
may
get
to
see
that
maybe
it's
your
isp
instead
of
the
browser
vendor
or
maybe
it's
nobody
if
you're
the
one
operating
the
proxy,
you
know
behind
the
the
cpe,
so
there's
a
bunch
of
different
ways
that
that
could
go,
but
that
affects
privacy.
F
I can go, but I don't want to preempt somebody else's thoughts or take too much time. One thing that you and I have talked about, Nick, is particularly related to enterprises: there might be some legitimate reasons why an enterprise might want to see the queries that are being sent on the network. I guess this is just true of using encrypted DNS in general, but the more it's deployed...
F
You
know
they
may
have
less
visibility
into
that,
and
so
one
thing
that
you
might
want
to
factor
in
for
a
query
distribution
strategy.
Is
you
know
if
there
are
certain
domain
names
that
are
supposed
to
go
to
a
split
horizon
resolver,
then
you
might
need
to
map
integrate
that
into
your
strategy,
somehow,
whether
it's
in
some
kind
of
a
preference
box,
you
might
imagine
in
the
application
that
runs
the
resolver.
You
might
build
an
indicate
like
for
these
cert
force.
F
These
domain
names
to
go
to
these
resolvers
and
the
user
might
be
able
to
specify
that,
and
so
then,
from
a
visibility
perspective,
the
enterprise
might
be
able
to
run
their
own
encrypted
resolver
and
still
be
able
to
see
the
queries
that
they
need
to
see
in
a
certain
sense
right
like
they
may
still
wish
to
have
access
to
all
queries
in
order
to
look
for
command
and
control,
et
cetera,
but
still
at
least
in
one
particular
use
case.
F
You
might
imagine
being
able
to
say
you
know,
for
these
particular
domain
names
ensure
that
they
go
to
this
enterprise.
You
know
split
horizon
resolver,
that's
one
thing
you
might
want
to
integrate
in
terms
of
visibility.
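The per-domain routing just described can be sketched as a small routing rule in front of whatever distribution strategy the client uses. This is a sketch under my own assumptions: the class and method names are hypothetical, and round-robin stands in for any query distribution strategy over the public resolvers.

```python
class SplitHorizonRouter:
    """Route queries for enterprise-internal suffixes to the enterprise
    resolver; distribute everything else across public resolvers."""

    def __init__(self, internal_suffixes, enterprise_resolver, public_resolvers):
        self.internal = [s.lower().lstrip(".") for s in internal_suffixes]
        self.enterprise = enterprise_resolver
        self.public = list(public_resolvers)
        self._next = 0

    def route(self, qname):
        name = qname.lower().rstrip(".")
        # Enterprise names must always reach the split-horizon resolver.
        if any(name == s or name.endswith("." + s) for s in self.internal):
            return self.enterprise
        # Simple round-robin stands in for any distribution strategy.
        r = self.public[self._next % len(self.public)]
        self._next += 1
        return r
```

For example, with `internal_suffixes=["corp.example"]`, a query for `wiki.corp.example` goes to the enterprise resolver while everything else is spread over the public set.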
A
And
bassey:
do
you
have
any
any
thoughts
like,
let's
suppose,
austin's
proposal
were
or
sort
of
were
implemented?
Well,
it
is
implemented.
Let's
say
that
it
were.
We
had
a
choice
of
where
to
deploy
that
kind
of
stub.
Resolver
functionality
like
you
could
move
the
encryption
point
to
you
know
within
the
home
network
for
all
devices
you
could
move
it
to
the
cpe.
You
could
actually
proxy
somewhere
outside
the
home.
You
could
yeah
there's
a
bunch
of
different
places
where
that
could
be
deployed.
A
Do
you
have
thoughts
on
that
and
how
that
would
sort
of
affect?
You
know
what
you're
trying
to
achieve
with
with
with
the
dns
sharing,
and
I
see
we're
over.
L
We haven't talked much about it from the encryption perspective, but just in general, when we're thinking about solutions to some of the threats we talk about in the paper: for example, one potential solution might be, instead of querying the servers directly, forwarding the traffic to another public recursive resolver that uses encryption. Well...
L
...what we found is that some of the solutions only shift the threat model and don't actually solve it, because now the traffic becomes visible to the operators of the resolver, even though the traffic is encrypted. So the data still leaks: the data stored at the server still leaks the information. So yeah, any solution that we consider needs to take these factors into account as well.
A
Hey, thanks, Betsy, and thanks to all our speakers, and also thanks to those of you in attendance who asked such great questions. This was a lively session, and hopefully at the next ANRW we'll see more papers on this topic. I can see that these are really nice ideas with, hopefully, a lot of potential for future work, as the questions brought up. So outstanding job, everybody; I really enjoyed it.
A
It is now my turn to turn things over to Andra, who is chairing our last session on applications and specifications. So I think at this point I'll still be in the Meetecho, but I won't take the mic.
A
I will also take this opportunity to bid our audience farewell, and also, since I do have the mic here, I'll say thank you once again to Andra, our great session chair for the last session, who was just an absolute pleasure to work with on this workshop.
A
I hope we get to collaborate again on things like this, and hopefully on other topics as well. And I think I'm probably not alone in saying: Colin, we could not have done this without you. There are way too many moving parts between the IETF and the ACM, so we really appreciate your experience on that. So I'll turn it over to Andra for the last session and the last words as well.
D
Thank you so much, Nick, and likewise, it has been a pleasure working together. We do have a closing piece at the end, so hold on for two more talks. I'm trying to show you what we have in store now, but apparently I'm not able to; sorry about this.
D
Yeah, apparently I'm not able to, but hopefully everybody has seen the program for the last session. We have two very interesting talks: one about how we can better interpret RFCs, and Bharath is here and he's going to do the five-minute Q&A after the pre-recorded talk; and then we're going to hear from Ali, and he's going to discuss, well, it's a call for action, essentially, about how we can write better specifications for particular applications.
D
So I'm just going to ask Meetecho to load the video, and meanwhile I'm just going to remind everyone: we're going to run the session just the way that we have seen before, so queue up for questions. Again, it's a pre-recorded video, so questions at the end, and we're going to have a panel at the end of this session with questions for both speakers.
Q
So what I want to ask is: what makes protocols special? Since then, we have grappled with something fundamental about network protocols: they're specified in English, implemented in code through interpretation of that English prose, typically in a relatively low-level language, but with a fundamental intent to interoperate with other implementations.
Q
We've struggled for a long time to achieve all of the properties, such as security and interoperability, flexibility, scalability, correctness, and extensibility. While this is sometimes due to the design and specification process, it's often because of implementations, with issues being discovered incrementally over time. Those implementations get improved, the specifications get improved, and then around we go again. But it's not clear that this is fundamental, and so what we might do is consider the different options that we have in protocol specification and implementation.
Q
Many have rightly observed that this opens the door to many possible mistakes, so the natural option is to formalize the specification process. Many languages and tools for formal specification of software generally, and protocols specifically, have been developed, and these formal specs enable a degree of certainty about what the spec author meant. But they're a pain to write and to read, even for specialists, and so they've been considered really only worth it for the most safety-critical applications. And then one can consider: well, if we have this formal specification, we can produce an automated implementation.
Q
Our approach is built upon a set of concepts and techniques in natural language processing known as combinatory categorial grammar, which is a way to do what's known as semantic parsing: how to decompose a sentence into sensible units. We're not natural language processing experts, we're networking people, so really what we wanted to do is take these techniques that have been developed in NLP and apply them to this context.
Q
We
use
this
grammar-based
approach
rather
than
a
more
popular,
deep,
neural
net
based
approach,
because
we
wish
to
retain
the
structure
of
the
text
so
that
we
can
figure
out
whether
multiple
interpretations
of
a
sentence
are
valid
and
how
those
should
be
fixed.
If
at
all,
so
what
kind
of
text
are
we
talking
about?
Q
So we built a tool called Sage that attempts to identify ambiguity in RFC text, taking advantage of its structured nature, and flag sentences that are ambiguous. Then, for the limited RFC features that we support, we take this disambiguated text and convert it directly into protocol code. This is a very early-stage tool; it's not ready for wider use, so really part of our motivation in presenting it to you all is to learn from the community and identify the next steps for where we want to extend it.
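The ambiguity-flagging step can be illustrated with a deliberately crude sketch: pair an RFC 2119 requirement keyword with a bare pronoun subject and flag the sentence. This is my own toy heuristic, not how Sage actually works (Sage uses CCG-based semantic parsing); it only shows the kind of signal such a tool surfaces.

```python
import re

# RFC 2119 requirement keywords; a sentence that pairs one of these with a
# bare pronoun reference often needs disambiguation before code generation.
REQ_WORDS = re.compile(r"\b(MUST(?: NOT)?|SHOULD(?: NOT)?|MAY|SHALL)\b")
VAGUE_REF = re.compile(r"\b(it|this|that|they|them)\b", re.IGNORECASE)

def flag_ambiguous(sentences):
    """Return sentences that carry a normative keyword but refer to their
    subject only through a pronoun, a crude proxy for unresolved references."""
    flagged = []
    for s in sentences:
        if REQ_WORDS.search(s) and VAGUE_REF.search(s):
            flagged.append(s)
    return flagged
```

On the pair "It MUST be set to zero." and "The Length field MUST be set to zero.", only the first is flagged, which is exactly the "did you refer to the header or the message?" style of question discussed later in the Q&A.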
Q
The latter are absolutely needed to support more advanced protocols. And so I come back to where I began, which is: what makes protocols special? They're essential to everything we do today and have to work in contexts far outside their original intention. As a community, there exist standardized practices for the development of RFCs. What I'd like to pose for discussion is how the process for standardizing RFCs might integrate semi-automated tools, like, say, a future version of Sage, to improve specifications and to produce baseline implementations.
D
So, thank you so much for the very interesting talk. Bharath, can you hear us? Yeah? I already see people joining the queue. So, Stephen, would you like to announce yourself and ask your...
R
...question. Hey, can you hear me? Yeah. So, this is really interesting work. I guess my question is: how far do you think natural language processing techniques can take us towards fully automated code generation? Are there certain components of specifications that perhaps need more structured or more formal languages? What's that gap like, I guess?
Q
That, I hope, is going to be the easier piece of it. Now, the disambiguation part gets tricky when you're dealing with references that go way outside the bounds of a specific spec. So this is one of the pieces where there's a vocabulary piece, which is relatively easy to figure out.
Q
In our paper we talk about how we pulled in terminology from a networking textbook, so that we knew these were special terms in networking. So we have some domain knowledge, and that's a kind of dictionary that can grow, but it's not going to continue growing forever.
Q
There are other pieces which are really about the sentence structure; disambiguation there gets a little bit trickier. I'm not an NLP expert, so the answer I can give you here is going to be limited, but the way that we've looked at it is: you could think of it, initially, as a grammar checker. We all use grammar checkers; Google Docs, for example, now does auto grammar checking while you're writing.
Q
You could think of this as being an integrated feature. It's not going to give you 100% coverage on day one, but it's going to find things: this sentence is ambiguous; did you refer to the header, or are you referring to a message? Terms can be used in multiple ways, and you can have a tool that flags that, and also the structure of the sentence itself; maybe a noun is missing.
P
I think I can be heard; I believe I can be, yes. Hi Bharath, thank you for this lovely talk. This is super interesting, because clearly you're familiar with the history of this work, and generally the space that you showed in that graph has been around for a good 20 or 30 years.
P
What do things refer to? In doing that, you encounter some basic pieces of ambiguity; that I can understand. But the harder thing I have trouble with in ambiguity, and we find this routinely in RFCs, is the way that the text is written: there's a flow and there's context.
P
Any sentence has context coming in before it. So depending on what the building blocks are, and this is what I'm missing in your work at the moment: how are you building the semantic blocks? Are you using semantic blocks? You said you're not doing state machine generation, I get that, but just in terms of the language itself, are you doing some sort of...
P
I don't know; I mean, I'm thinking about the tools that are used for building summaries, for example, which do semantic block building and things like that. Are you trying to perhaps generate a summary of some pieces of this? How far do you think you can go with this? I would love for it to be able to detect ambiguity in text that is beyond just sentences, and so on, to actually pull semantics out of it.
P
That's a question, and in terms of a comment: you were asking about where this could go, or the potential for integrating something like this. I think this is super useful. Even at the lowest level, it could be used as part of things such as working group last call, IETF last call, the RFC editor. The RFC editor does this work for us today: typically there's a human who reads the text and says, hey, this sentence seems ambiguous, can you please replace it, and they'll offer suggestions.
P
So automation of any and all of those things would be super useful, and I think the value of automation is in not waiting until later to bring those tools in: just make it part of a CI, for every PR that goes into the draft, so that any new language that is introduced goes through this machinery, which then figures out if any ambiguity was introduced by it. So there are several places where this can be introduced, and I'd love to see that happen. So yeah, that's all I've got.
Q
Yeah, thanks for all that. So, to start at the beginning, about references in a sentence: this is co-reference resolution, a major studied area in NLP, and we're using some co-reference resolution tools right now.
Q
We don't mention it much in the current paper, but we're working on that right now. It's a problem that, I guess, can't be solved perfectly, just because of the nature of English text: sometimes references can't be perfectly resolved. But the good news is that we can then flag potentially unresolved references, and that would be part of this automated tool. So that's sort of the first stage. Then there's the question you had about whether we build up this sort of context of understanding.
Q
We don't do that right now. So far we've been mostly targeting simple protocols, and one thing that's kind of amazing is that for relatively simple protocols, you can often get away with not having too much context, as long as all the terms in each sentence...
Q
...have already been resolved. You don't necessarily need a lot of history of concepts that have been built up over several paragraphs in order to understand it, as long as all those terms have been defined. Now, once we get to more complex things, for example algorithm descriptions and more complex protocols, we're going to have to resolve that; currently we haven't figured that piece out.
Q
They don't have big working groups with a lot of intense work going on, but there are all these smaller protocols out there, and someday some of them may become big protocols, protocols that people are depending upon; we can never really anticipate that. And so it's good to be able to make sure that all of those get some level of scrutiny, say through these semi-automated tools.
P
Thank you for that response. I'll make a quick comment before I walk away. Do not disregard the important protocols that are getting a lot of scrutiny, because those are actually the ones that need this kind of help more than anything else. What often happens is that you get scrutiny on the ideas, you get scrutiny on various things.
P
You get scrutiny through people doing implementation, but not necessarily on how well the text is written. And I would say that's still a huge gap, in my mind: there's this huge gap between the intent of the working group and the text that shows up in the document, and you rely on just individuals to fill that gap. It would be good for that not to be the case.
D
So thanks so much, Jana, and I see Colin in the queue. Colin, would you like to ask your question, please? I'm just going to close the queue after Colin, but please do remember that we have a few more minutes at the end of the session, so do join the queue if you have more questions. Thanks.
O
Yeah, hi, thanks, Sandra, and thanks, Bharath, for the talk; this is fascinating work. I'd like to echo some of the comments that Jana made. I think we're starting to see a bunch of different tools that people are applying to RFCs and drafts. There's your work; we've seen people doing security analysis of some of the protocols; I've seen people analyzing state machines; we've done some work modeling critical data formats.
O
I think one of the challenges is that that sort of analysis tends to happen when the RFC is published, and if we can find ways of building these tools early, so they can be run routinely on the drafts, I think that would be a really nice way of improving the quality of the specifications before they're published.
O
So I would really encourage you to keep working with the community here to try to integrate this into the draft publication tools and so on, and I can introduce you to the tools team if need be. And if we can help out with this sort of work in the IRTF, then please do come and talk to me. Thanks.
Q
Yeah, thanks; exactly, we would like to integrate it at the draft stage, rather than it being only once it's final and published. The good thing is that even at the draft stage, it's just English text, so from the perspective of our tool it doesn't know any difference between the two; it will find ambiguities, or not, in either form.
D
So thank you so much again, Bharath, and thank you to the people asking questions; do join us again at the end of the session. But for now I'm just going to ask Meetecho to queue up the next video for the next talk, which is from Ali, and he's going to give us a call for action for more collaboration.
S
Hi everyone, my name is Ali. I'm a computer science professor, and I've been working on streaming media since my PhD years. The topic for today is the very recent developments in the streaming space, in particular the Common Media Client Data and Common Media Server Data specs coming out of the Consumer Technology Association.
S
Without a doubt, streaming media is a big part of our lives. With more efficient media codecs and faster internet access, streaming any content we want is quite easy today. HTTP played an important role in this, too. Back in the day, it all started with progressive download, which allowed us to play the media before having to download the entire media file; with sufficient bandwidth, this was a much better experience than the download-and-play approach. Following that came some indexing tricks and the use of byte-range requests.
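The byte-range mechanism mentioned here is just a standard HTTP `Range` header, which is what lets a player seek into a media file without downloading all of it. A minimal sketch, using only the Python standard library (the function names and the example range are mine):

```python
import urllib.request

def range_header(start, end):
    # An inclusive byte range, e.g. bytes=0-1023 for the first kilobyte.
    return {"Range": f"bytes={start}-{end}"}

def fetch_range(url, start, end):
    """Fetch bytes [start, end] of a file with an HTTP Range request.

    A server that honors the range answers 206 Partial Content with
    only the requested slice of the body.
    """
    req = urllib.request.Request(url, headers=range_header(start, end))
    with urllib.request.urlopen(req) as resp:
        return resp.status, resp.read()
```

A player combines this with an index of where each media piece starts, which is the "indexing tricks" part of the story.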
S
In a nutshell, we have a source, live or on demand, that provides the mezzanine content, which is supposedly the best and most crisp quality that we can offer. The mezzanine stream is fed to an encoder or a transcoder that generates a number of representations according to a pre-selected bitrate ladder, which is a combination of resolution and bitrate pairs.
S
HTTP is an object request-response protocol, but with small media pieces it can mimic streaming very nicely. The ultimate goal is to make the viewers happy, at which HTTP adaptive streaming does a pretty good job most of the time. But adaptive streaming is not perfect either, so bad things do happen from time to time.
S
Here's an example showing streaming client behavior. I have been showing this plot for more than 10 years now, so you might have seen it before. There are 10 clients streaming from the same server over a 10 megabit per second link. Ideally, all the clients should consistently stream the same representation.
In
this
case
it
is
the
866
kilobit
per
second
representation.
This
way,
the
network
capacity
will
also
be
used
as
much
as
possible.
However,
this
is
not
the
case,
as
you
can
see
from
the
frequent
upshifts
and
downshifts
for
each
client
in
a
scenario
like
this,
which
is
very
common,
we
get
either
unfairness
among
the
clients,
instability
for
each
client
or
under
utilization
in
the
network.
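The oscillation in that plot comes from each client independently running a rate-selection rule like the crude throughput-based one sketched below. This is a generic illustration, not the logic of any particular player; the `safety` factor and ladder values are my own example numbers.

```python
def choose_bitrate(ladder_kbps, measured_kbps, safety=0.8):
    """Pick the highest rung of the bitrate ladder that fits within a
    safety-scaled throughput estimate.

    When many clients share a link, each one's throughput estimate moves
    as the others shift rungs, which is one source of the upshift and
    downshift oscillation seen in the 10-client experiment.
    """
    budget = measured_kbps * safety
    eligible = [r for r in sorted(ladder_kbps) if r <= budget]
    # Fall back to the lowest rung if nothing fits the budget.
    return eligible[-1] if eligible else min(ladder_kbps)
```

With a ladder of [235, 375, 560, 866, 1430] kbps and a 1200 kbps estimate, the rule picks 866 kbps; a small dip in the estimate pushes it to 560, which is exactly the instability the talk describes.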
S
In other words, the money should be spent to put more fiber in the ground and improve cellular coverage. As nice as this sounds, one thing we know for sure is that the demand for bandwidth is always more than the supply. The last camp was more pragmatic: how could we best use whatever bandwidth or server capacity we have to meet the demand? So the last camp brought up the idea of some sort of a control plane that might enable information sharing between the servers and clients.
S
This SAND standard actually covered a lot of ground and established the control plane framework we always wanted. Despite all the academic research behind it, due to various reasons it wasn't really picked up by the industry, until a number of companies wanted to do something about it in CTA a few years later. That CTA effort first focused on metadata information sharing from the clients toward the servers, and the spec was published last year as CTA-5004, called Common Media Client Data.
S
They didn't get on the bandwagon a few years ago, but now they want to have something similar. Here is the deal: whose fault is it when a user's video keeps freezing or is low quality? One can blame the device, or the app, or the internet connection speed, or the CDN, or the content provider. It's a vicious cycle that mostly ends up without a satisfactory answer, which makes us, the users, quite upset. Fault isolation is very important for keeping the paying customers, and it requires analytics data from various points along the delivery pipeline.
S
There are many proprietary internet analytics solutions today, but surprisingly, the CDNs are quite blind to all of this. CDNs see the requests and send the responses, but they don't have the glue to connect all the pieces together to get the full picture and identify how well things are working or not. The solution is quite straightforward, though: if, with each request, the clients send some identifiers and status information, the CDNs suddenly gain visibility into the media analytics. See this new paper from friends of mine to explore how powerful CMCD reporting could be.
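What a client attaches per request can be sketched concretely. The key names below (`sid`, `bl`, `br`, `mtp`) follow my reading of CTA-5004; check the spec before relying on them, and note the helper itself is an illustration, not reference code.

```python
def build_cmcd(session_id, buffer_ms, bitrate_kbps, throughput_kbps):
    """Serialize a few Common Media Client Data keys as one value.

    Per the spec's style, strings are quoted, numbers are bare, and
    key=value pairs are comma-separated.
    """
    pairs = {
        "bl": buffer_ms,           # buffer length, milliseconds
        "br": bitrate_kbps,        # encoded bitrate of the object, kbps
        "mtp": throughput_kbps,    # measured throughput, kbps
        "sid": f'"{session_id}"',  # session id, ties requests together
    }
    return ",".join(f"{k}={v}" for k, v in sorted(pairs.items()))

# A client would send this as a CMCD request header or an urlencoded
# query parameter, letting the CDN group request logs per session.
```

Even the `sid` alone gives a CDN the "glue" described above: every segment request carrying the same session id can be stitched into one playback session.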
S
Sending a deadline might allow getting the response with a higher priority, reducing the risk of video stalling; see, for example, our NOSSDAV paper this year on the subject. Another use case is hints about the next request: the CDN may reduce the fetch time for the clients by using that information.
S
The work is still ongoing; if you're interested, please consider participating in the effort. Developing or writing a spec is usually the easiest part: we had the SAND standard, but it was not widely adopted. To get better adoption with CMCD and CMSD, all of us working in this industry need to work together. We all agree that information exchange is useful when the information is relevant and actionable; the information also needs to be fresh.
S
There are already a number of CDNs and clients supporting CMCD. So, for example, grab the latest hls.js code, start experimenting yourself, and come up with ways CMCD or CMSD can help in your environment. Both have extension mechanisms, so extending the functionality is pretty straightforward. And, most importantly, join the effort and contribute.
S
Review the use cases, propose new ones, and so on. With this, I provide a slide here with some useful links for your reference. Thanks for listening, and I hope those of you who are interested in streaming and other multimedia systems topics can attend the ACM Multimedia Systems conference, in person or virtually, at the end of September. Thank you.
D
So thank you so much, Ali. Can you join us by turning on your mic and your camera? And thank you for that plug as well at the end.
D
So please do join the queue if you have questions for Ali; the queue is open now. I don't see anybody in the queue right now, so maybe I'll just start it off. My first question: obviously, when you make this type of call for action, the question is how much adoption you have already seen, and which immediate benefits you already observe, just to maybe encourage people to join the effort.
S
Well, certainly we have some folks today in the workshop who are representing some of the biggest, largest CDN companies, so I hope they will also join this conversation. But definitely there is some interest, because this time, unlike the MPEG standard we did like five years ago, the request came from the CDN providers. That already shows their interest, because they are really struggling with the other analytics solutions, and they need something in place.
S
Otherwise they are just running blind, and they need to do this. Now, if only the CDNs implement it, obviously that's not going to be sufficient; we need client support as well. And yes, there are some client implementations out there that have already committed, but the big players, like the iOS platform HLS players, the AV library, they are not doing it yet. Maybe in the future they will, but not at the moment.
S
But this can only be a great success if everybody who's involved, on either the server side or the client side, starts implementing. And the biggest benefit, the second part of your question: as I mentioned in the talk, even with just some sort of unique identifier per playback session, the CDNs will at least be able to understand...
S
...okay, this client is in this session and asking for all these files, all these segments, and things like that. So they will be able to put things together at least per session. Otherwise, they are just seeing all these individual request logs, but they have no idea what they are for. So even with that simple request ID, I think there's a lot of benefit.
D
That's great to hear, and I do support that fully. Since the queue still seems to be empty right now, I'm just going to continue and bother you with a few more questions.
D
I liked very much your slide number seven in the talk you gave before, showing that different players just point the finger at each other, and that it's very hard, essentially, to do root cause analysis for anomalies that might occur in this environment. I'm just wondering: have you been, or are you, in touch with people who are looking at different deep learning or machine learning analytics for this type of anomaly detection approach, or proposing ways for different players to share data in a privacy-preserving way?
S
Well, when it comes to diagnosing problems with your video services, it's not really very privacy-preserving. I mean, there are a lot of privacy issues there, obviously, because whoever is looking into your problem will likely know what you were watching or what you were trying to watch, right? I'm not really interested in that part, but when it comes to identifying where the problem is, or what is causing the problem...
S
...obviously there are some companies out there who are doing all these probing types of things. They collect data from different parts of the delivery pipeline: they have some components in the client, they have some components in the CDN, they have some components on the head-end side, the packaging, the encoder, or whatever.
S
The more probes they can put in, obviously, the more visibility they will get. And collecting the data is one thing, processing the data is another thing, and reading the data and understanding the root causes is totally something different. All these different companies have different success rates for that. But most of the time, it's not necessarily that a CDN goes down...
S
...it happened a few days ago with one of the larger CDNs, but things do happen from time to time, or something happens with your ISP connection and so on. Still, most of the problems are within your home network, or some software issues on your client side. So if you have some components to understand what's happening in the client, that will give you a lot of visibility.
S
You know: okay, we made this change, and then the new problems started happening here and there, right? Most of these services often do A/B testing, so whenever they are testing a new feature, they usually only enable it for a subset of users and see whether everything is working or not. So yeah, there are a lot of ways to diagnose these things, but just because you diagnose something doesn't mean that it's going to get resolved.
S
That's another thing, and this finger-pointing, blaming someone else, is usually what the companies do. So it's not that easy to get away from it. But with this...
C
So I'm curious what the reaction is of the folks supplying very large quantities of this video who believe they control essentially the entire delivery chain, other than the ISP link. For example, Google and YouTube control effectively everything up until the last hop of the path to the user; Netflix today, in most cases, also does that. Do they?
S
Well, okay, I'm getting some echo; maybe you might want to wait. So the larger set of benefits are for those people in the second bucket. For example, today we have the Olympics: the broadcast is coming from Tokyo, Japan, and then it's going to a bunch of service providers, video providers out there, through multiple CDNs.
S
Everywhere, in every country, millions are watching the content live. In those cases, that's really where the benefit is, because if something doesn't work, you now know where things are falling apart. Whereas for companies like YouTube or Netflix...
S
Essentially, the problems are more limited in terms of where they can happen. Especially for Netflix: it's all on demand, so the content is already encoded, packaged, and tested. If something goes wrong, it's probably going to be in your home network, or maybe the ISP is the bottleneck at that time, so it's going to be relatively easy to figure that out. But for larger events, especially live events, when they are crossing different networks and different domains...
S
That's where most of the benefits will be. And the more direct answer to your question: at the moment I haven't seen anybody from those companies involved in this work.
D
Thank you so much. I'm going to open the queue for both speakers, and I'd like to invite Barath, from Facebook, to join us; basically, we're opening the panel. I see Jana now in the queue, so please, Jana, go ahead and ask your question. Thank you.
P
Thanks, Andra, and thanks, Ali, for that talk. I want to push back gently on something you said earlier about privacy not being important here. I think it is, and I think it's achievable; I don't think it's very far away. Just because, as a server, I can see the manifest and I can see the videos you're fetching doesn't mean that you have no privacy, right?
P
All I need, from a service point of view, is to be able to tie together the different videos that you're watching, to pull together a full picture of the video session that you've had. One of the problems I have when looking at logs at the CDN (I work at Fastly) is basically being able to pull together the different chunks that you requested as a client. So just the one single piece that you have in the CMCD spec, the GUID, the session ID...
P
Basically, that alone gives me like 90% of the value that I'm looking for. If I'm able to tie together just that single video session, that helps me tremendously, because it shows me the bit rates that you used. Of course, the bit rate being explicit would also be useful, because otherwise you have to infer it from the object names to figure out what the bit rate might be, and so on.
P
But those two pieces of information are hugely useful, and I think they can be provided, as I imagine the spec intends, in a privacy-preserving way, meaning that the session ID doesn't actually include the client's IP address or the timestamp directly. The timestamp is fine, but the client's IP address would be particularly problematic.
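As a concrete sketch of that idea (an explicit bitrate plus a session ID that embeds neither the client IP nor a timestamp), here is roughly what a CMCD-style query parameter could look like. The comma-separated `key=value` payload with `br` and `sid` follows my reading of the CTA-5004 conventions; treat the exact formatting as illustrative rather than normative:

```python
import uuid
from urllib.parse import quote

def make_cmcd_query(bitrate_kbps: int) -> str:
    # A random v4 UUID: 122 random bits, with no client IP address or
    # timestamp embedded, yet enough to tie one session's chunk
    # requests together in the CDN's logs.
    sid = str(uuid.uuid4())
    # CMCD payload: comma-separated key=value pairs, string values
    # quoted, URL-encoded into a single query parameter.
    payload = f'br={bitrate_kbps},sid="{sid}"'
    return "CMCD=" + quote(payload)
```

A client would append the returned string to each segment request's URL, so the server can group requests by `sid` without learning anything about the client from the ID itself.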
S
The spec is not really involved in that much. Obviously, we hope and assume that the CDN provider will do whatever they can to preserve privacy; otherwise, all this information flowing back to the CDN servers will obviously reveal some information. But how the CDN provider is going to use that information, that's really a separate topic, I think.
P
I appreciate that. Let me be more specific: I think the construction of the GUID, the session ID, should be done in such a way that it doesn't reveal the client's personal information. As long as things like that are taken care of, I think the rest of it is still very useful, and something that we can, and want to be able to, use. So thank you for this. I'm curious to understand the state of this work.
P
Andra asked about the deployment; I want to understand the state of the spec itself. Where does it stand right now, and what's the...?
S
Right. So you must have seen the Fastly logo on one of my last slides; Fastly is one of the CDN providers trialing this, and they already have some CMCD reporting. On at least some of your servers there's a trial going on: you can collect the CMCD data coming from the clients, and there's a dashboard where people can see what's happening, and things like that.
S
Akamai is the other partner at the moment, and we hope other CDNs will join the club. As I mentioned earlier, though, this is not just about CDN support; the clients also need to send the information to the CDN. So hls.js and a couple of other open-source players are currently also...
S
...adding some initial support for that. As for the other side of the spec, CMSD, the Common Media Server Data, I think it will be a lot more important for the CDNs, because now you will be sending some information back to the client. At the moment, since we are still working on the spec, there is really no deployment or any trials, to my knowledge, but I hope to see some action in that space later in the fall or late this year. So hopefully, with CMCD and CMSD (and anybody can extend them as they like)...
S
...there will be a lot of action in this space in the next year or so.
S
Where does one follow the spec? Well, it is under CTA, but there are two links at the end, on my last slide, where you can find the GitHub repositories; you can post issues, and you can find the spec there. So it's a pretty open process at the moment.
C
Here's a really quick one: is the session ID local to the client? Because if it isn't, doesn't that open you up to all kinds of interesting spoofing attacks, like piggybacking on other people's sessions, and collusion among clients to make it look like they're all cooperating when they're not?
S
Well, the spec doesn't really say much about how you're going to generate the session ID, but it's not local to the client itself. It's supposed to be something unique, so that two or three clients within the same home, or within the same neighborhood or city, will not be able to come up with the same ID.
C
S
Yeah, that's important, certainly. You don't want to send wrong or unintended information back to the CDN, because based on your information the CDN might take a different action, and obviously the CDNs must be careful about what they process once this information comes from the clients. So that's one very big issue with this.
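That caution can be made concrete with a small defensive-parsing sketch: before acting on client-reported values, a server can at least check that they are plausible. The field shapes and limits below are assumptions for illustration, not requirements from the spec:

```python
import re

# Accept only a UUID-shaped session ID and a plausible bitrate range,
# so a spoofed or malformed report can't steer server-side decisions.
UUID_RE = re.compile(
    r"^[0-9a-f]{8}-[0-9a-f]{4}-[0-9a-f]{4}-[0-9a-f]{4}-[0-9a-f]{12}$"
)

def sanitize_report(sid: str, bitrate_kbps: int) -> bool:
    if not UUID_RE.match(sid.lower()):
        return False
    # Reject bitrates outside what this service actually encodes
    # (the 100 Mb/s cap here is an arbitrary example value).
    return 0 < bitrate_kbps <= 100_000
```

Reports that fail the check would simply be dropped from analytics rather than trusted, which limits what a colluding or piggybacking client can achieve.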
D
Thanks so much. I don't see anybody in the queue right now, so I'm just going to take... oh, I do see Jana. Jonathan, perfect; I'll keep my question for a bit later.
P
D
Right, I was just trying to bring it back to Barath, because we have been discussing this open space where we can contribute to the specification. But coming back to sage: are you accepting, or are you considering at any point accepting, contributions from the community, especially this community? In terms of, say, opening up the tool and having people help with that?
Q
Yeah, absolutely. We would like to engage with folks in the community. We're at the point where we've been working through a back catalog of relatively simple RFCs to start with, mostly to figure out what are the essential features that we can actually disambiguate on and...
Q
...auto-generate code for. But one of the key things may be that, as I think Colin's question mentioned, there are existing tools that the community is starting to use, or that folks are working on.
Q
State machine analysis, for example, and maybe some of the other things. To the extent that we can integrate those kinds of tools: say there's a state machine analysis tool, but it needs a state machine, and the state machine is right now specified in text. There are some things that are structured text in RFCs; moving from that to something like an intermediate representation...
Q
...that then goes into a formal analysis: that's something we'd like to be able to plug into existing tools. One other thing is that being able to use markdown, or, even better than markdown, something like structured formatting, would be extremely useful. I don't know to what extent that is being adopted widely across the community, but the more we can structure the text itself, while still keeping it easy to read and not overly formalized, the easier it will be for both human interpretation and machine interpretation.
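As a toy version of that pipeline (structured text in, machine-usable intermediate representation out), the sketch below parses an invented "STATE --event--> STATE" notation into a transition table. The notation and the TCP-flavored states are made up for the example; this is not sage's actual format:

```python
def parse_transitions(spec_text: str) -> dict:
    # Each non-empty line has the form "SRC --event--> DST"; the
    # result maps (state, event) pairs to the next state, a shape
    # a model checker or code generator could consume directly.
    table = {}
    for line in spec_text.strip().splitlines():
        src, rest = line.split("--", 1)
        event, dst = rest.split("-->", 1)
        table[(src.strip(), event.strip())] = dst.strip()
    return table

spec = """
CLOSED --passive_open--> LISTEN
LISTEN --syn--> SYN_RCVD
"""
fsm = parse_transitions(spec)
```

The point of such an intermediate form is that the same table can feed both a human-readable rendering and an automated check (e.g., "is every state reachable?"), without re-parsing prose each time.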
D
P
Not at all. I actually had a question for Barath too, and I'm trying to bring it back into my head now. That's right, I was going to ask: you said that you've been looking at some simple protocols.
Q
So, in the SIGCOMM paper we cover ICMP, the full protocol. ICMP is obviously one of the simplest places to start, and the reason we started there is that we can go all the way from spec, to disambiguation (fixing the sentences), to an implementation that works with ping and traceroute and interoperates.
Q
That was our first test case. Then we went through other RFCs across the years and picked bits and pieces from them that weren't in the original ICMP RFC, in terms of style or structure of the text. So we have some text from NTP, some from IGMP, some from BFD. And right now we're actively looking at state machines...
Q
...trying to pick which, among the various protocols that have state machines, we want to try to analyze, so suggestions for those would be one place to start. One of the ways we could particularly improve the tool, so that it can be useful later, is with protocols where it's known, probably almost through oral history in the community, that there was a spec that was really bad...
Q
...and that was then improved over time. To the extent that we have that history, we can feed the original bad version in and see whether our tool can identify the things that human editors identified later on. It's not quite a training process in the way that, say, neural-net training would be, but we can find out whether our tool is missing those things or not. So that would be the first place we could engage.
P
That is actually an interesting one. One point where you could insert that tool (well, there are a few points in an RFC's life) is before it hits the RFC Editor's queue, because that's one place where a lot of English fixing happens: the RFC Editor will take a look and do a bunch of copy editing, and some ambiguity gets removed.
P
Another is last call, when a bunch of people look at the whole spec with fresh eyes, rather than the people who've been working with the spec for two or three years, who all understand the context and don't necessarily refer to the text to know what needs to be built. So you'll find that the RFC before last call and after last call can actually be quite different. Those are at least two points in the process that I can point you to.
P
D
That's all, thank you. And thank you so much for this wonderful conversation. We're just one minute from the end of our session today, and the end of the workshop.
D
I think we had Colin in the queue, which was fortunate, because I would like to invite both him and Nick to join me. Colin, please ask your question, and then we'll say goodbye and wrap up the workshop for this year.
O
Yeah, I think we're short of time, so I will skip my question, but again, thank you.
O
Thank you to Barath and Ali for the excellent talks. Thank you again to Andra, to Nick, to all the other authors, to all the presenters, and to the program committee members.
O
I think this has been a really interesting program: there have been some really excellent talks, really excellent papers, and we've seen some really interesting questions here. I have certainly learned a lot over the last three days of the workshop, and I hope other people have too. Thank you also to the sponsors, to Comcast and to Akamai, for supporting the fee waivers, for supporting the workshop, and, going forward, for supporting the IRTF to help improve access and the diversity grants. And if people have feedback about the workshop, or ideas for what to do in future, please talk to us.
O
Please talk to me, to Andra, to Nick. And I'd like to finish up by encouraging all the authors to join the rest of the IETF and IRTF sessions this week.
O
I think one of the benefits of ANRW is co-locating with the IETF meeting. Obviously this is a bit harder to do online rather than in person, but please do try to make the most of the opportunity. We'll do this again, and hopefully we'll do it in person in future.
E
O
That's all I have. Thank you again to the organizers; I think it's been excellent.
D
And
it's
been
great
working
with
you
and
yeah.
It's
been
great,
taking
care
of
the
workshop
this
year,
looking
forward
to
seeing
you
and
gather
if
you're
roaming
around
there,
so
that's
it
for
today.
Thank
you
again
so
much
and
thanks
thanks
nick
thanks
colin,
it's
been
great.