Internet Engineering Task Force 109, 16 Nov 2020

From YouTube: IETF109-MAPRG-20201116-0730

Description

MAPRG meeting session at IETF109
2020/11/16 0730

https://datatracker.ietf.org/meeting/109/proceedings/

A

Okay, good morning, everybody. This is the MAPRG session. I had a little problem sharing with my other browser, so I'm now in Safari instead, sharing my whole screen, which means I cannot really see the controls.

A

However, I will quickly run through my set of slides and then, if possible, I would like to ask the first presenter, Emeline, to maybe share the slides on your own, or at least try that, and then I can try to fix that while I listen to the presentation. So that would be good. Welcome to IETF 109.

A

We are online again. Nevertheless, we have some nice presentations today.

A

This is the Note Well part about intellectual property rights. Similar to the IETF, the IRTF has adopted rules about participation and intellectual property rights, and you should study those RFCs, which are linked here on the slide, if you haven't seen this before, in order to know what you get yourself into when you're participating here at the mic, in the Jabber room, or in any other way. We also have in the IRTF a couple of policies about privacy and code of conduct.

A

So if you want to participate here and you're not aware of these RFCs, please also have a look at them. The next slide quickly gives you a reminder, or some understanding, that the IRTF is actually not the same as the IETF: what the differences are and how these two organizations work together or interweave. There's another RFC you can study if you want to. And this is the usual slide, which is more useful if you actually go to the material, so you can

A

click on these links and find all the links you need. You find our charter, so what this group is about. You find the slides. You find the Meetecho link, which you have hopefully found already, and there's also the Jabber room. And this is our agenda today.

A

So we have this very quick overview here because we have a packed agenda. We only requested a one-hour slot because it's an online meeting and some people are not in the perfect time zone, so we try to keep this a little bit short. But fortunately or unfortunately, however you want to take it,

A

we got a lot of nice contributions. So what we have today is two longer talks, the first about network fingerprinting and the second about Oblivious DoH measurements, which are kind of brand-new material. The last three are only five-minute heads-up talks. These have been really, really good pieces of work, all three of them, but they have been published as papers at the IMC conference, there have been presentations about them, and the long videos of those presentations are available online.

A

So if you want to know more, you can watch them, or maybe you have watched them in advance already because you're prepared, and you can read the papers. All the links for the papers and the videos are in the agenda, so if you look in the Datatracker and click on the agenda, you get more information there. With that, I will try to stop sharing here and see if Emeline is available to share the slides.

B

Should I share my screen?

A

If you can, that would be great for me.

B

Okay, so I'm clicking. Where should I... It's the first time I'm using this tool, so I don't... yeah.

A

So, yeah, next to the video and audio buttons there's also the screen-sharing symbol, so if you can click that one and...

B

I can grant.

A

You that, let's see.

B

Okay, yes, that's me, and these are my slides. Okay, perfect, thank you. Well, hello, everyone, and thank you for listening to this presentation on network fingerprinting and routers under attack. First, I will introduce what motivated this work with our two research questions.

B

Then I will talk about fingerprinting what it is and how it works, and after that I will review our methodology for the experiments that we did. And finally, I will comment on some key results that we found.

B

Okay, so for our first research question here we are interested in knowing what is the hardware ecosystem within the internet and within operators.

B

So here we want to discover where the different brands of hardware, that is to say Juniper, Cisco, Alcatel, etc., are located inside the network and what role they play. Knowing this is important for a second research question: what could happen if an attacker is able to identify the brand of a router and use that information to launch a targeted attack on said routers?

B

From the output that we get from question one, we can get a view of what could happen in such a scenario. These questions are motivated by the fact that five vulnerabilities have recently been found in Cisco devices, vulnerabilities that can lead to remote code execution and denial-of-service attacks. But it's not only Cisco: there are also several other manufacturers for which a vulnerability in the rsm implementation has been found.

B

First, a bit of background. How can we actually tell the brand of a router? This is called fingerprinting, and the basic idea is to send some probes to a device and analyze its responses.

B

Recently, a very lightweight fingerprinting technique was proposed that relies on only two different probes to get the hardware of the router and sometimes even the operating system. This technique is based on a specific field in the IP packet, the time-to-live (TTL) field. When routers send a packet,

B

they put an initial value in this field, which is almost always 32, 64, 128 or 255. The people who found this technique have also shown that it is enough to get these two values for a time-exceeded packet and an echo-reply packet. These two values are what we call the router signature, and this is what tells us the brand of the router. In this table, you can find the different classes of hardware that we are able to identify with this technique.

B

We can differentiate between Cisco, Juniper, Juniper with the JunosE operating system, and a last class composed of Brocade, Alcatel and Linux machines, which we will call the BAL class. Okay, so on to the data we collected. We used TNT, which is a Paris traceroute extension that comes with the lightweight fingerprinting technique we just talked about. Additionally, it can also reveal MPLS tunnels.
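
The signature lookup described above can be sketched in a few lines of Python. This is only an illustration, not the TNT implementation: the function names are hypothetical, and the signature-to-class table is an assumption based on the published TTL-signature technique rather than something read off the speaker's slide.

```python
from typing import Optional

# Common initial TTL values mentioned in the talk.
INITIAL_TTLS = (32, 64, 128, 255)

# <time-exceeded iTTL, echo-reply iTTL> -> hardware class.
# Assumption: these pairs are recalled from the published technique,
# not taken from the table shown in the presentation.
SIGNATURES = {
    (255, 255): "Cisco",
    (255, 64): "Juniper (Junos)",
    (128, 128): "Juniper (JunosE)",
    (64, 64): "Brocade/Alcatel/Linux (BAL)",
}


def infer_initial_ttl(observed_ttl: int) -> Optional[int]:
    """Round an observed TTL up to the nearest common initial value."""
    for initial in INITIAL_TTLS:
        if observed_ttl <= initial:
            return initial
    return None  # larger than any known initial value


def classify(time_exceeded_ttl: int, echo_reply_ttl: int) -> str:
    """Map the TTLs seen in the two replies to a hardware class."""
    signature = (
        infer_initial_ttl(time_exceeded_ttl),
        infer_initial_ttl(echo_reply_ttl),
    )
    return SIGNATURES.get(signature, "unknown")
```

For instance, replies arriving with TTLs 247 and 250 would both round up to 255 and be classified as Cisco under this table.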

B

So it has the potential to discover more IP addresses than classic traceroute does. We ran TNT from the 1st to the 13th of November of last year over 28 vantage points, and overall we collected 1.2 million addresses.

B

From these traces, we then used another tool, MIDAR, to perform alias resolution, which is the process of identifying which IP addresses belong to the same router. So from the IP-level topology that we got with TNT, we can get a router-level topology with MIDAR. This is interesting for us because these topologies are closer to reality. They are more concrete, and so we can better study network resiliency and robustness on this kind of topology.

B

With MIDAR, we were able to identify 65,000 routers having altogether 200,000 addresses.
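
To illustrate what alias resolution produces, here is a minimal sketch that merges pairwise alias findings into routers using union-find. It does not reproduce how MIDAR decides that two addresses are aliases. The input format and the function name are hypothetical.

```python
from collections import defaultdict


def group_aliases(alias_pairs):
    """Merge (address, address) alias pairs into per-router address sets."""
    parent = {}

    def find(addr):
        # Follow parent links to the representative, compressing the path.
        parent.setdefault(addr, addr)
        while parent[addr] != addr:
            parent[addr] = parent[parent[addr]]
            addr = parent[addr]
        return addr

    for a, b in alias_pairs:
        root_a, root_b = find(a), find(b)
        if root_a != root_b:
            parent[root_b] = root_a

    routers = defaultdict(set)
    for addr in list(parent):
        routers[find(addr)].add(addr)
    # One sorted address list per inferred router.
    return sorted(sorted(addresses) for addresses in routers.values())
```

Each resulting list corresponds to one node of the router-level topology, and the addresses in it are the interfaces observed at the IP level.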

B

And finally, the last bit of processing that we did on the data was to use the tool bdrmapIT, which allows us to determine who the owner of a router is, in the sense of which autonomous system operates the router. Our goal here is to delimit, as precisely as possible, the different ASes in the network to study the hardware ecosystem at a finer scale.

B

Okay, so back to our first research question: what did we find? Let's first look at the Internet as a whole. On your left, you have a graph that gives the signature repartition for both addresses and routers. What we can get from this graph is that Cisco largely dominates the overall market: we can see, for example, that nearly 60 percent of the routing devices are Cisco devices.

B

Now, if we look at a finer scale, the scale of the autonomous system, this graph gives us once again the hardware repartition, but this time for four large ASes: AS A, B, C and D. What we can already see from this graph is that the hardware repartition can be very different from one AS to another. For example, with AS A we see a lot of Cisco devices, but for AS D that's not the case at all, and we see almost only Junos. For AS B we also see a lot of Junos, and for AS C

B

we have a nearly equal mix of BAL and Junos routers. So what we can conclude here is that different ASes have very different hardware ecosystems, which can also be very different from the global picture.

B

Okay, now on to the second research question. Now that we know more about the ASes' infrastructure, what could happen if an attacker is able to easily identify the brand of a router and then use that information to launch a targeted attack?

B

Well, the first thing to know here is that not all routers contribute equally to forwarding packets in the network. We have small routers, and we have big routers with a lot of forwarding power. So if it turns out that a certain brand of router is important for forwarding the traffic, and that brand has a vulnerability, then those routers could be a target of interest for attackers.

B

So how are we going to measure the impact of an attack on a brand of routers? We already saw the hardware distribution, which I've put back here. For example, if we look at AS A, we could say: okay, we see a lot of Cisco routers, so we can conclude that Cisco routers are important for forwarding the traffic. But the number of routers does not necessarily reflect the amount of traffic being carried. So this is a first indicator,

B

yes, but we need another metric to measure that, and this metric is the hardware popularity. We compute the hardware popularity as the proportion of traces that cross each brand of hardware, and it has been shown that this metric reflects the actual amount of traffic. So what do we get when we compute it? Well, here it is.
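
As a rough sketch of the hardware popularity metric just described, the snippet below counts, for each brand, the fraction of traces that cross at least one router of that brand. The data layout is an assumption made for illustration: each trace is a list of router identifiers, and `brand_of` maps a router to its fingerprinted brand.

```python
from collections import Counter


def hardware_popularity(traces, brand_of):
    """Return {brand: fraction of traces crossing that brand}."""
    counts = Counter()
    for trace in traces:
        # Count each brand at most once per trace.
        brands_on_trace = {brand_of[r] for r in trace if r in brand_of}
        counts.update(brands_on_trace)
    total = len(traces)
    if total == 0:
        return {}
    return {brand: n / total for brand, n in counts.items()}
```

Because a single trace can cross several brands, the fractions do not have to sum to one.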

B

We see that it's quite close to the hardware distribution, so our first indicator was already a good one. Still, we can see some differences. For example, for AS C, in terms of number of routers we saw a nearly equal mix of BAL and Junos, but when we look at the amount of traffic being carried, we see that BAL is more important for forwarding the traffic. Okay, so how did we proceed with our experiment?

B

For each brand of router, we simulated an attack by taking down different fractions of the routers of that brand, and we looked at the impact in terms of traces impacted in the network. We performed the simulation 30 times and averaged the results, which you can see here on your right. We have one graph per AS. Let's look, for example, at AS A: we can see that it is enough to take down 20 percent of the Cisco routers to impact nearly 60 percent of all the traces in this network.
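
The simulation described above could look roughly like the following sketch. It assumes the same hypothetical data layout as before (traces as lists of router identifiers, `brand_of` mapping routers to brands) and treats a trace as impacted when it crosses at least one router that was taken down. The random selection and the fixed seed are illustrative choices, not details from the talk.

```python
import random


def simulate_attack(traces, brand_of, brand, fraction, runs=30, seed=0):
    """Average fraction of traces impacted when `fraction` of the routers
    of `brand` are taken down, over `runs` random selections."""
    if not traces:
        return 0.0
    rng = random.Random(seed)
    candidates = sorted(r for r, b in brand_of.items() if b == brand)
    k = round(fraction * len(candidates))
    total = 0.0
    for _ in range(runs):
        down = set(rng.sample(candidates, k))
        impacted = sum(1 for trace in traces if down.intersection(trace))
        total += impacted / len(traces)
    return total / runs
```

Sweeping `fraction` from 0 to 1 for each brand would give one curve per brand, comparable in shape to the per-AS graphs the speaker refers to.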

B

So it's quite an impact already. On the other hand, we see that attacks on Junos and on BAL routers don't have much of an impact compared to attacks on Cisco.

B

So, yes, this autonomous system, AS A, is quite vulnerable to attacks on Cisco. For AS B, however, this is not the case at all: here we see that attacks on Junos are more harmful to the network than attacks on Cisco.

B

So, as expected given the hardware popularity, we can see that not all ASes will react in the same way to different attacks. Okay, time to wrap things up. What can we conclude from this presentation?

B

Well, the first thing is that someone can very easily retrieve the brand of a router by sending only two probes to this router.

B

The second thing is that different ASes can have very different hardware repartition and hardware popularity. And third, because of this, different ASes will be more or less vulnerable to different types of attacks, depending on their hardware infrastructure. That's it for me. I hope you enjoyed the talk. Thank you.

A

Thank you very much for this nice talk. As you finished in time, we actually have some time for questions, so if people would like to join the queue, you have an opportunity right now.

A

Yep, there's Brian. Please unmute yourself.

C

Okay, so they've changed this slightly since the last time. Hi. A quick question on the takeaways here: would another way to phrase this be that there are definite advantages to being a multi-vendor AS in terms of your ability to... So this is basically looking at the amount of traffic that's going through these architectures.

C

Could this tool be used to quantify, for a given AS, what their return is on having routers from multiple vendors? You could use this to map yourself, as opposed to what you've done, which is to map it externally.

B

Yes. Well, in the paper we have a few suggestions for network operators, and one suggestion we didn't include, but that I thought about later, was that there could be an advantage for an autonomous system in having several vendors, several manufacturers, in order not to be that vulnerable, not to rely on one single brand. But a drawback of this might be management overhead. Maybe it's complicated to deal with that many routers, because they're...

C

Cool, but there's actual data on the vulnerability side of that from this paper. So that's cool, thanks.

A

You're welcome, thank you. And that also reminded me that Brian kindly volunteered to take notes, so thank you for that.

A

Any other questions?

A

Okay, then, thank you again, and we move on to the next talk. We have Sudheesh online here, and he will also try to share. I should be able to grant that.

A

And it's coming. Nice, okay.

D

Am I audible?

A

Yes, you are, you can start. Thank you.

D

Okay, yeah, thank you so much. Hello, everyone, and thank you for being here today. I am Sudheesh, and I'll be presenting measurements of a privacy-enhancing DNS protocol called ODoH, or Oblivious DNS over HTTPS. This work is the result of a collaborative effort between lots of incredible people at Cloudflare and was part of my internship with them this summer. This talk will focus on the ODoH standard proposal that was co-authored by Apple and Cloudflare, and we're committed to moving this forward.

D

As a quick introduction, DNS is the Domain Name System and is the foundation of the human-usable Internet.

D

It responds to client queries for host names with the corresponding IP addresses and records. Traditionally, the DNS protocol is not encrypted and uses UDP, which continues to be the majority of the traffic received by the public recursive resolver that Cloudflare operates. The use of unencrypted DNS leaks user information to network operators and to on-path adversaries who are observing the network, and even allows active attackers to modify the request from the client or the response from the server.

D

To overcome some of these problems, there have been recent efforts to secure DNS using DNS over TLS (DoT) or DNS over HTTPS (DoH). These have been gaining popularity and have been integrated into various web browsers and operating systems, protecting client traffic from being observed by onlookers or being intercepted and changed by attackers.

D

However, the resolver operator can still associate client requests with the client's IP address and build a profile of their browsing patterns.

D

Over the past few years, we've seen active measurement research trying to understand and measure the impact of encrypted protocols like DoH or DoT. Many large-scale measurements have shown that the performance of these encrypted protocols varies with the choice of resolver, that they do not significantly impact page load times, and that they improve user security.

D

There have also been various attempts to improve page loads using prefetching. But while DoT did improve the security of DNS queries for clients, it also received a lot of criticism because of the small number of publicly available resolver services, essentially centralizing the Internet and giving these organizations a lot of control.

D

Additionally, these resolver operators can associate client queries with client IP addresses and use the IPs to geolocate the client. To maintain privacy guarantees, some operators actively purge data older than 24 hours and give their users privacy-policy-based guarantees. But bringing a lot of organizations together to agree on a common set of privacy

D

practices is extremely difficult, and it requires explicit negotiation and a lot of effort. Mozilla, for example, actively defined criteria around data retention, aggregation and frequent audits for DoH services to be configured as a default in the Firefox browser. While some users might be comfortable with this idea of a policy-driven approach, such policies are difficult to enforce and also very time-consuming,

D

making users want a system that can technologically guarantee their privacy. In this talk I'll focus on the privacy critiques and the ability of resolvers to create a profile of their clients, and this is exactly where Oblivious DNS over HTTPS, or ODoH, kicks in. At a high level, there are three main components in ODoH.

D

The first is the clients, who prepare a query for which they would like a response. The goals of the clients in ODoH are to successfully send encrypted messages, receive valid responses to decrypt, and in the process be able to identify any malicious actors and take corrective action.

D

The second is a proxy instance, whose main role is to relay the encrypted queries and responses to and from the targets while removing the client IP addresses. The third is a target instance, which receives the encrypted query from the client, decrypts it, obtains the DNS response for the query, encrypts the response and sends it back to the proxy. It has no ability to identify the actual client by IP address.

D

At a high level, the design of ODoH is similar to that of DoH but injects an intermediate proxy node, which terminates the client query and performs the query on the client's behalf. The main goal of ODoH is to prevent recursive resolvers, and ISPs running such resolvers, from being able to link clients to their requests. So a client encrypts their query using a hybrid public key encryption (HPKE) scheme and a validated public key from the target resolver, and sends this query to a proxy instance.

D

There are various ways in which clients can learn about these services and validate them, such as through DNSSEC or by looking for conflicting keys. However, in this talk I'll be punting on this problem, and more details are available in the report. In ODoH, the proxy instance can see the IP address of the client but not the contents of the DNS query, and forwards

D

this message to the oblivious target, which decrypts the query and obtains the response from the resolver. The response is encrypted by the oblivious target and sent back to the proxy, which sends this message back to the client, where it can be decrypted. In this process the target only sees the DNS content and not the actual client's IP address.
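
The split of knowledge between proxy and target can be made concrete with a toy model. This is not the ODoH wire protocol and contains no real cryptography: the `Sealed` class is a stand-in for an HPKE ciphertext that only the named key holder may open, and all class names, addresses and the response-key handling are hypothetical simplifications.

```python
from dataclasses import dataclass, field


@dataclass(frozen=True)
class Sealed:
    """Placeholder for a ciphertext; only the matching key holder can open it."""
    key_id: str
    payload: object

    def open(self, key_id: str):
        if key_id != self.key_id:
            raise PermissionError("wrong key")
        return self.payload


@dataclass
class Target:
    key_id: str
    records: dict                      # stands in for the recursive resolver
    seen: list = field(default_factory=list)

    def handle(self, sealed_query: Sealed, source_ip: str) -> Sealed:
        qname, response_key = sealed_query.open(self.key_id)
        self.seen.append((source_ip, qname))   # sees the query, not the client
        answer = self.records.get(qname, "NXDOMAIN")
        return Sealed(response_key, answer)


@dataclass
class Proxy:
    ip: str
    target: Target
    seen: list = field(default_factory=list)

    def relay(self, client_ip: str, sealed_query: Sealed) -> Sealed:
        self.seen.append((client_ip, "<opaque>"))  # sees the client, not the query
        return self.target.handle(sealed_query, source_ip=self.ip)


def client_lookup(client_ip, qname, proxy, target_key_id):
    response_key = f"response-key-of-{client_ip}"  # known only to client and target
    sealed_query = Sealed(target_key_id, (qname, response_key))
    return proxy.relay(client_ip, sealed_query).open(response_key)
```

After one lookup, the proxy's log holds the client address with an opaque payload, while the target's log holds the proxy address with the query name, which is the separation the speaker describes.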

D

We often show the target and the resolver as independent entities, but this need not really be true. In practice, for performance reasons, the oblivious target and the recursive resolver could ideally be co-located, which avoids the additional network messages between the oblivious target and the resolver.

D

It is still possible to maintain these as individual services without any co-location.

D

Now, with an understanding of the protocol in place, we set out to answer three main research questions. The first was: what is the impact of ODoH on DNS response times? And how does this protocol affect page load times and user experience?

D

Also, how does it compare to the other privacy-enhancing protocols and techniques that are out there today? This helped us understand the cost of privacy while maintaining the security guarantees and the need for performance.

D

This slide pretty much sums up the results of our measurements and is probably of the most interest to all of you, but I'll leave it here as a reference to come back to in case anyone wants to refer to the slides. In the next few slides I'll talk about the measurements that we performed and detail each of these takeaways.

D

So let's get to the measurement. We implemented and deployed the oblivious targets and proxies using Google Cloud and a serverless platform, Cloudflare Workers. We physically separated the oblivious target instances from the resolvers and randomized the queries from the targets across three public resolvers, Cloudflare DNS, Google DNS and Quad9, which we used in our measurements.

D

We used nine Google data center locations, seven across the United States, one in Montreal, Canada, and one in São Paulo, Brazil, with 10 client instances running at each of these vantage points. They performed the experiment by sending DNS requests at a rate of 15 requests per minute, which is the average number of DNS queries sent by client devices with high Internet usage. The average bandwidth that we have for all the clients, each running on a single-core Intel

D

Xeon, is roughly 480 megabits per second. The clients perform DNS response time measurements by choosing pairs of available proxies and targets. By choosing a low-latency proxy-target pair for the measurement, shown in the orange line here, we find that the average query response time improves by 22.8 percent compared to only choosing a low-latency proxy, shown in the green line. This hints that having an intermediate proxy on the same network path to the target will improve the response

D

time performance. This path, however, can be quite different from the path that a UDP-based DNS packet might travel on.
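
The difference between picking a low-latency proxy-target pair and picking only a low-latency proxy can be illustrated with a small sketch. The latency tables, names and millisecond values are hypothetical, and the "nearest proxy" strategy shown here (closest proxy first, then its best target) is one possible reading of the baseline, not necessarily the one used in the measurements.

```python
def best_pair(client_to_proxy, proxy_to_target):
    """Pick the (proxy, target) pair with the lowest end-to-end latency."""
    return min(
        (
            (proxy, target, first_hop + second_hop)
            for proxy, first_hop in client_to_proxy.items()
            for target, second_hop in proxy_to_target[proxy].items()
        ),
        key=lambda candidate: candidate[2],
    )


def nearest_proxy_pair(client_to_proxy, proxy_to_target):
    """Pick the closest proxy first, then the best target from that proxy."""
    proxy = min(client_to_proxy, key=client_to_proxy.get)
    target = min(proxy_to_target[proxy], key=proxy_to_target[proxy].get)
    return proxy, target, client_to_proxy[proxy] + proxy_to_target[proxy][target]
```

With a proxy that is close to the client but far from every target, the second strategy can end up with a noticeably higher total latency than the first.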

D

But what about connection reuse? Connection reuse is an optimization that enables clients to improve their performance by at least 46 percent on average, as it avoids unnecessary TCP and TLS handshakes for each request.

D

In our experimental setup, we evaluate the worst-case performance and incur an additional network latency between the target instance and the resolver in the architecture that we showed before. As we see here, the target instances, which are located in Google Cloud and perform queries to three open resolvers, have faster response times for Google DNS than for the other services because of potential co-location within the same data center. Integrating the oblivious target into the recursive resolver can reduce the network

D

latency incurred to that of a cache hit for the answer and, in cache-miss cases, to the network cost for the recursive resolver to communicate with the other name servers. So we find that co-locating these services does result in better performance. But to understand the performance of ODoH,

D

we compare it to other protocols offering similar privacy and security guarantees and use DoH as the baseline protocol, shown here as the indigo line to the left. DoH over Tor is a variant of DoH that provides both security and privacy guarantees in addition to some anonymity guarantees, and this is the beige line that you see to the right of the figure.

D

When we compare this to ODoH, we notice that ODoH with no service co-location achieves an interesting position roughly in between DoH and DoH over Tor. These results get better as we start to co-locate the target and the resolver, shown by the dashed blue line over here. We notice that the response time for ODoH compared to the DoH baseline increases by fifty percent with service co-location and by a hundred percent when the target and the resolver are not co-located. DNS protocols with message

D

encryption, like DNSCrypt or Anonymized DNSCrypt, tend to have much larger compute overheads and use non-encrypted channels. These protocols have higher response times and lie somewhere in between ODoH and DoH over Tor, in this region of the graph. The response times of ODoH sitting in this middle ground are very interesting for us, and this brings us to a crucial measurement that we're really interested in: the page load time impact.

D

To do this, we established a measurement node in a lab network with an available on-path proxy and a randomly chosen off-path proxy. The node runs a local stub resolver, which is configured to use the DoH and ODoH protocols for various runs. In each run, we browse the same set of pages after purging all the local cache entries and perform page load time measurements with Selenium using the navigation events.
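
A minimal sketch of turning navigation events into a complete page load time is shown below. It assumes each measurement is a dictionary of Navigation Timing fields in milliseconds; with Selenium, such a dictionary could plausibly be fetched with `driver.execute_script("return window.performance.timing")`, but how the authors collected the events is not stated in the talk.

```python
def page_load_ms(timing: dict) -> int:
    """Complete page load time: from navigation start to the end of the load event."""
    return timing["loadEventEnd"] - timing["navigationStart"]


def mean_page_load_s(timings):
    """Average page load time in seconds over finished page loads."""
    # loadEventEnd stays 0 until the load event has completed, so skip those.
    loads = [page_load_ms(t) for t in timings if t["loadEventEnd"] > 0]
    if not loads:
        return None
    return sum(loads) / len(loads) / 1000
```

Running the same set of pages once per protocol and comparing the two means would give the kind of relative page load time increase discussed next.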

D

Our first measurements presented here are pessimistic: they use the worst-case network architecture with no service co-location and only consider the complete page load time events instead of other metrics like time to first byte or first useful paint. Additionally, there are various browser artifacts, like caching that happens within the browser, and this is the reason why the top graph here is quite different from the stress-test measurements that we showed before.

D

What we find in our preliminary measurements is that using ODoH with an on-path proxy increases the page load time by 20 percent compared to baseline UDP-based DNS usage or DoH, and a randomly chosen off-path proxy increases this by 25 percent. So the page load times move from 1.319 seconds on average to about 1.6 seconds with ODoH. These results are still preliminary, and we are optimistic that with service co-location they will get better.

D

But this is still ongoing work. To conclude, ODoH is a practical privacy-enhancing protocol for DNS, it has minimal page load time impact, and the performance impacts of the protocol are purely network topology effects. We make a lot of recommendations in our report for ideal usage in production systems, like having an ecosystem of on-path proxies, having service co-location, and making the clients

D

do connection reuse. All of the code and our implementation is open source and available at these links on GitHub under the Cloudflare organization. We're committed to moving this standard forward in the IETF and hope that more operators join us in providing support for the protocol, either by running proxies or targets. The report on our measurements is also available at this link here. Thank you. I'm open to questions.

A

Thank you very much. Oh, we already have a queue. So these are kind of brand-new results, right? There's no paper yet, but you're working on one, so as soon as that becomes available, maybe you can also share it on the mailing list. With that, we take the first question, from Mike.

A

I think you have to... I mean, yeah. Yes, there you go.

E

Thank you. So I was just wondering: does the work that you have here support ECS, or EDNS Client Subnet? It seems like the optimization that you're doing is really great for DNS performance, but on the other side of that, authoritative name servers may be used to try to do geolocation-based load balancing for actual delivery performance enhancement, and if that information is obscured, what's the balance? Have you looked into that at all yet?

D

So we haven't actually performed a lot of measurements for that, but to the DNS resolver, the proxy is essentially what they would see as the client. So any EDNS optimizations that are applied for, say, a proxy in a specific geolocation would continue to be applied. It is possible for the client to choose a proxy that is at a completely different geolocation, and that might have some effect on EDNS, but we haven't actively measured that.

A

Okay. Mike, do you have another point? No. Then we have Brian.

C

Hi, Sudheesh. This is really cool work, thanks a lot for sharing it. Actually, your answer there is sort of a lead-in to my question. So you're saying that you didn't really look at the impact of actively choosing a proxy.

C

You could actively choose a proxy either to maximize the confusion, that is, minimize the identifiability, or to minimize latency. What I'm seeing in these graphs, though, is that those two look like they're related to each other: you can either maximize confusability or minimize latency, because the minimum-latency path is going to point in the direction that you're pointing in. So I would actually like to see the follow-up work here

C

sort of dig into that particular question: what is the actual latency cost of additional

C

bits of confusion about who you are? Because it looks to me like this basically takes the problem that DoH was originally intended to solve, which is to encrypt the thing to a gigantic DNS aggregator, which will obscure who you are. And because there's a need to obscure who you are from the gigantic DNS aggregator, we're now going to use smaller DNS aggregators in the proxies. They will obscure who you are from the giant one, but the giant one can still kind of guess at least geography, based on the fact that everyone's going to implement for the lowest latency possible.

C

Right? Because the impact of this over DNS over 53, you said, is up to 20 percent, which is actually kind of massive, and that's just latency on the table. So I guess that's more of a "please dig into that". In the future, I want to see sort of a curve:

C

here's the x-axis of how confused I want to be, and here's the y-axis of how much I have to pay for that. Because if that curve can be pushed down, then you can make a really good argument that you should do this anyway: you're going to take the hit, but it's worth it. But if that curve is too high, then, yeah.

A

It's not only location, it's also linkability, right?

C

Right, so you're fixing the linkability problem. But if you're in a situation where only three people are behind a certain proxy, then you need to get the mass, and you're kind of flipping the mass argument that DoH was intended to solve in the first place back around. So, comparing other architectural variants.

D

Yeah, I think that's definitely a very valid suggestion. But as Jonathan, I think, pointed out in the chat, it's not an either-or in most cases: you don't have to choose between one of them. There are probably ways in which we could scale them, but it needs close collaboration by a lot of operators and network operators to put some of these proxies on path and see how we can do this.

D

But of course, one of the things that we noticed is that there are quite a lot of people who do use DNSCrypt, which is over here as this grey line that you see, but the performance of ODoH is much better than DNSCrypt, even in its worst-case setting.

C

Cool all right, thank you very much.

F

Yeah, thank you.

A

Thanks. We still have time; we're really good on time. That's really nice, I'm not used to that. But we don't have anybody in the queue anymore, so thank you very much for your presentation. Everybody else can ask more questions in the chat. We will move on to the next presentation: next we have Sebastian talking about DNS again. Are you able to share?

F

Working on that.

F

Okay, somewhere else here we go.

A

Yeah, we can see a slide. We can hear you're all set.

F

Yeah, sorry for the flickering, I didn't notice that before. Hello, everybody, nice to be here. I only have five minutes, and this is an accepted IMC 2020 paper, so there is a longer version of this, a presentation and a video recording, that you can refer to.

F

This is joint work between SIDN, the Dutch domain registry, the New Zealand domain registry, and a root server operator.

F

We joined forces to go and find out how the clouds are centralizing, dns traffic.

F

So, as you may know, and I have plenty of slides to justify this point, there are concerns from the US and the European Union about centralization of platforms and centralization of Internet infrastructure.

F

So there are hearings about antitrust, and you've heard, well, Facebook in there as a social media company, or Twitter, having too much power: about how they are creating a single point of failure, a point where they decide what is good and what is bad, undermining the privacy of users, and also market consolidation.

F

So our intent with this work is to see if we can do anything to measure Internet centralization. You can pick any metric you may want for this: the number of users, the amount of traffic, network infrastructure, as Emily presented a few minutes ago, computing infrastructure, market power, market share. But because all three of us are DNS organizations, we focus on DNS traffic.

F

That's mainly traffic coming from resolvers to authoritative servers, and we provide three different vantage points, using passive DNS, so packet captures.

F

We have the DNS traffic going to .nl, the Netherlands, with Dutch as the official language; .nz, New Zealand, with English, Maori and sign language; and B-Root. We focus our efforts on measuring traffic from five different cloud and content providers: Google, Amazon, Microsoft, Facebook and Cloudflare. Two of those, Cloudflare and Google, as you know and as you heard in the previous presentation, have public DNS, so we would expect to see a lot of traffic from them.

F

We did a lot of aggregations and a lot of analysis; you can take a look at the paper. But what I can quickly share here is, for example, how the traffic has been growing, using different weeks of data in different years. So you can see for B-Root.

F

The traffic coming from those five cloud providers basically doubled in a matter of a year and stayed the same in 2020.

F

However, the root servers and the ccTLDs in the TLD space are completely different beasts. You can see that for New Zealand almost a third of the traffic comes from those five cloud providers, which for the Netherlands is even worse, and you can see, if I go back and forth, that the amount of traffic the Netherlands gets from Google is around 15 percent of the total load.
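
A share like "a third of the traffic comes from five cloud providers" can be computed from a packet capture by mapping each query's source address to a provider. The following is a minimal sketch under stated assumptions, not the method from the paper: the provider names and prefixes are placeholders from documentation address space, and a real study would more likely map addresses to the providers' AS numbers.

```python
import ipaddress
from collections import Counter

# Hypothetical prefix-to-provider map (documentation prefixes only).
PROVIDER_PREFIXES = {
    "ExampleCloudA": ["192.0.2.0/24", "2001:db8:a::/48"],
    "ExampleCloudB": ["198.51.100.0/24"],
}

NETWORKS = [
    (ipaddress.ip_network(prefix), provider)
    for provider, prefixes in PROVIDER_PREFIXES.items()
    for prefix in prefixes
]


def provider_of(source_ip):
    """Return the provider whose prefix contains source_ip, or 'other'."""
    addr = ipaddress.ip_address(source_ip)
    for network, provider in NETWORKS:
        if addr.version == network.version and addr in network:
            return provider
    return "other"


def traffic_share(source_ips):
    """Fraction of queries per provider, given one source IP per query."""
    counts = Counter(provider_of(ip) for ip in source_ips)
    total = sum(counts.values())
    if total == 0:
        return {}
    return {provider: n / total for provider, n in counts.items()}
```

Running `traffic_share` over the source addresses of one week of queries per year would give the kind of year-over-year comparison described in the talk.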

F

We can focus on different aspects, for example IPv4 versus IPv6, and at the end of the slides I have a beautiful picture that shows this. Google and Cloudflare basically split their traffic between IPv4 and IPv6, Facebook is starting to send more and more IPv6 traffic, and Microsoft and Amazon barely use IPv6 as a transport.

F

If we look at DNS by query type, you can see that the patterns, the footprint of each of the cloud providers, are completely different, and you can see for Google, specifically in 2020, how the amount of traffic they send for NS queries drastically changed.

F

The reason for that, and I have a pretty picture here, is basically that Google deployed QNAME minimization in December 2019. That leads us to part of our conclusions: if a cloud provider decides to adopt a certain technology, an IETF technology, you will see a massive change, be it DNSSEC, UDP versus TCP, etc. So there are pros and cons of centralization, and again I invite you to check the video presentation on IMC, RIPE or DNS-OARC, or read the paper.

A

Okay, thanks a lot. Also a very interesting talk about an important topic we're discussing. We are still good on time, so we still have time for questions here.

F

Right, so, well, George is pointing out that QNAME minimization is sort of a counterexample, where market concentration is a good thing to have, and I'm pretty sure there are other examples. Google also does DNSSEC validation, and because the Netherlands has a lot of, you know, domain names, that's a likely explanation for why it is receiving so much traffic. Cloudflare also does DNSSEC validation.
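
To make the QNAME minimization effect concrete: a minimizing resolver reveals the query name one label at a time, so a TLD server sees a short name instead of the full one. The sketch below assumes the resolver uses NS queries for the intermediate steps, which is consistent with the jump in NS queries described in the talk; the function names are made up for illustration, and real resolvers handle zone cuts and fallbacks that are ignored here.

```python
def minimized_query_names(qname):
    """Names revealed step by step, from the TLD down to the full name."""
    labels = qname.rstrip(".").split(".")
    return [".".join(labels[i:]) + "." for i in range(len(labels) - 1, -1, -1)]


def query_seen_by_tld(qname, qtype, minimized):
    """Return the (name, type) pair a TLD authoritative server would see.

    Without minimization the TLD server sees the full name and original type.
    With minimization (assumed NS-based) it sees only the name one label
    below the TLD, queried as type NS.
    """
    names = minimized_query_names(qname)
    if minimized and len(names) > 2:
        return (names[1], "NS")
    return (names[-1], qtype)


if __name__ == "__main__":
    print(minimized_query_names("www.example.nl"))
    print(query_seen_by_tld("www.example.nl", "A", minimized=True))
```

When one large resolver operator switches this on, the query-type mix at a TLD shifts away from A and AAAA toward NS for that operator's whole share of the traffic at once.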

F

So if the IETF comes up with a new technology or a new DNS query type that might be very useful for users, this market concentration will be very helpful for them.

F

I don't see a lot of um questions.

A

No. I'm still waiting a little moment because, as I said, we have time. I also recommend that everybody read the full paper; it's a really good paper. Yes?

F

If we have more time, I can show you one more thing, which surprised us. Most of the traffic received by the Netherlands and New Zealand is UDP traffic, except for Facebook. Facebook has a non-negligible amount of TCP traffic, and by checking the EDNS buffer size we noticed they're using a very small buffer size. So they're likely querying for something, and then they have to retry the same query over TCP later.
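
The link between a small EDNS buffer size and extra TCP traffic can be sketched as follows. When a response does not fit in the UDP payload size the resolver advertised, the server sets the truncation (TC) bit and the resolver repeats the query over TCP. The response sizes and the 512-byte buffer below are invented for illustration; the talk does not state the actual value.

```python
def transports_needed(response_size, edns_buffer_size):
    """Transports used for one lookup, in order.

    A response larger than the advertised UDP buffer comes back truncated,
    so the resolver has to ask again over TCP.
    """
    if response_size <= edns_buffer_size:
        return ["udp"]
    return ["udp", "tcp"]


def tcp_fraction(response_sizes, edns_buffer_size):
    """Fraction of all queries sent (including retries) that use TCP."""
    sent = [
        transport
        for size in response_sizes
        for transport in transports_needed(size, edns_buffer_size)
    ]
    if not sent:
        return 0.0
    return sent.count("tcp") / len(sent)


if __name__ == "__main__":
    sizes = [100, 600, 1400]  # hypothetical response sizes in bytes
    print(tcp_fraction(sizes, 512))   # small buffer: more TCP retries
    print(tcp_fraction(sizes, 1232))  # larger buffer: fewer TCP retries
```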

A

Yes, so maybe something to discuss on the MAPRG mailing list; there might be people from Facebook.

A

But I don't know: oh.

F

Yeah definitely.

A

Thank you very much for your presentation, and with that I think we move on. Next we have Oliver, and I believe you should be all set to start sharing. Yes, there you are.

A

We can see you, we can hear you typing. Perfect.

H

Can you see my screen?

A

Now it's up! Yes, thank you.

H

Awesome. Okay, perfect. So hi everyone, my name is Oliver Gasser, I'm a postdoc researcher at the Max Planck Institute for Informatics in Germany, and today I'm going to present a short excerpt from our study on the so-called lockdown effect, namely the implications of the COVID-19 pandemic on Internet traffic.

H

As you all know, right around March, a large part of the world's population went into the so-called lockdown, which means that people were now working from home, studying from home and teaching from home, and parts of social life also moved online.

H

And in all of these efforts, the Internet is obviously essential, so we asked ourselves how well the Internet actually copes with this kind of change. We looked at a lot of data, namely a large European ISP and three Internet exchange points, one located in Central Europe, one in Southern Europe and one on the US East Coast, and also at an academic network, REDIMadrid, which is a university network in Spain. And to analyze all of these data.

H

We assembled quite a large group of people from different research institutions as well as companies, so let's get right started. The first analysis that I want to present to you is how does the traffic actually change during the lockdown?

H

What you see in this graph on the x-axis is the calendar week, all the way from January 2019 to June 2020. You see a horizontal line where you have the year change from 2019 to 2020, as well as some specific dates noted, for example the COVID-19 outbreak and the initial responses and lockdowns. And on the y-axis.

H

You basically see the normalized traffic volume, which means that whenever the line goes up, traffic increases and whenever the line goes down, traffic decreases and the blue line in this graph shows the traffic volume for the isp.

H

And what you see there is that right around the initial responses and lockdowns, there is an increase in traffic of about 30 percent within just a short period of time, growth that would normally be spread over multiple months.

H

If we now look at the internet exchange points in gray, we see a quite similar behavior there and for some of the ixps, the traffic level remains elevated, even after some of the lockdown restrictions have been lifted.

H

Now, looking at a different network again, the mobile operator network in orange: it behaves completely differently from the other networks that we've already seen. During the lockdown, the traffic actually goes down, which makes sense when you think about it, because people were staying at home more and not moving around outside anymore.

H

So they didn't use the mobile network as much as before, and once some of the lockdown restrictions were lifted, right around the end of April, mobile traffic started to pick up again. So now we've looked at the overall traffic volume characteristics over several months.

H

Let's look at a more detailed picture, namely how traffic changes within a day. For this we're going to look at a working day as well as a weekend day. We picked a working day, February 19th, before the pandemic and before the lockdowns. What you see on the x-axis is the hour of the day and on the y-axis again the traffic volume, so we see how the traffic volume is distributed over the day. What we see before the lockdown is that on working days there is a strong increase in traffic towards the evening hours.

H

If we compare this to a weekend day, we see that the traffic is more equally distributed over the complete day.

H

Now, if we compare this to a working day during the lockdown, we see that the traffic resembles a weekend day more than a working day, so workdays actually start to look more like weekends in terms of traffic characteristics.

H

And finally, the last analysis that I want to show you: we also looked at how traffic for different applications changed, and one important application when you're working from home is VPN. So we applied a port-based as well as a domain-based technique to identify VPN traffic. In the upper part of this graph on the left, you see the traffic volume for workdays.

H

So again, if the bar is higher, you have more VPN traffic there, and in the lower part you see the traffic volume for weekends. This is the VPN traffic for February. If we now compare this with March, we see that there is about a 200 percent increase in VPN traffic, mostly on working days during working hours.
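
A combined port-based and domain-based VPN classifier of the kind mentioned here could look roughly like the sketch below. The port list uses well-known defaults (IKE, IPsec NAT-T, OpenVPN, L2TP, PPTP), and the domain rule (a hostname label containing "vpn" on TCP port 443) is an assumption made for illustration; the exact rules used in the study are described in the paper, not here.

```python
# Well-known default ports of common VPN protocols: (protocol, port).
VPN_PORTS = {
    ("udp", 500),    # IKE (IPsec key exchange)
    ("udp", 4500),   # IPsec NAT traversal
    ("udp", 1194),   # OpenVPN
    ("tcp", 1194),   # OpenVPN over TCP
    ("udp", 1701),   # L2TP
    ("tcp", 1723),   # PPTP
}


def is_vpn_flow(protocol, dst_port, server_name=None):
    """Classify one flow as VPN by port, or by hostname for TLS-based VPNs."""
    if (protocol, dst_port) in VPN_PORTS:
        return True
    if server_name and protocol == "tcp" and dst_port == 443:
        # Assumed heuristic: a label such as "vpn" or "vpn2" in the hostname.
        return any("vpn" in label for label in server_name.lower().split("."))
    return False


def vpn_bytes(flows):
    """Sum the bytes of all flows classified as VPN traffic."""
    return sum(
        flow["bytes"]
        for flow in flows
        if is_vpn_flow(flow["protocol"], flow["dst_port"], flow.get("server_name"))
    )
```

Comparing `vpn_bytes` for the same hours in February and March would give the relative increase plotted in the bars.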

H

If we look a little further in time, at April and June, we see that the VPN traffic starts to slightly decrease again, but the volume is still above the initial volume that we've seen in February.

H

So what are our main takeaways from this paper? We saw that traffic increased between 15 and 30 percent within just a few days. The difference between the work days and weekends starts to vanish and also specific applications for remote work such as vpn and video conferencing saw quite a large increase in traffic.

H

There is a lot more in the paper, which we presented at IMC 2020. For example, we look at changes in transport ports and different traffic classes. We also investigate how traffic in an educational network changes during the pandemic, when students are staying off campus and at home, and we compare hypergiants versus non-hypergiants, and much more.

H

That was it from my side happy to take any questions.

A

Thank you very much. We do have one or two minutes for questions. I also wanted to mention, I forgot this earlier, that this is maybe not fully in scope for MAPRG, but we thought that, given the current circumstances, it might be interesting. It's a very comprehensive paper with a lot of different measurements, and there are also more papers at IMC, the Internet Measurement Conference.

A

So if you look up the agenda of the last IMC, which was just a few weeks ago, you will find more interesting papers about these kinds of effects.

A

Let's see we have people in the queue so jonathan.

I

Did you look at how weekend traffic changed? Did weekends just look the same, or have weekends become even more flat? So weekends, you're talking about weekends? Yeah, because you said that normal traffic looks more like weekends. What do weekends do?

H

Yes, so weekend traffic still looks like weekend traffic, more or less. The main thing that changes is the working-day traffic. I can maybe show a backup slide which I have here. What you see here is that we're trying to classify, from the traffic characteristics, whether a day looks more like a weekend or more like a working day. In the lower part, you see the days classified as working days, and in the upper part.

H

You see days classified as weekends, and all of the blue bars are classified correctly. So here, for example, we have the weekends classified correctly, here we have working days classified correctly, and what you see is that the weekends remain classified correctly, so they're still behaving like weekends.
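
One simple way to classify a day as workday-like or weekend-like from its traffic shape is to compare its normalized profile with two reference profiles. This is a sketch of that idea only, not the classifier from the paper: the reference profiles are invented, and four time-of-day buckets (night, morning, afternoon, evening) stand in for a full hourly profile.

```python
# Hypothetical pre-lockdown reference profiles: traffic per time-of-day bucket
# (night, morning, afternoon, evening).
WORKDAY_REF = [1, 2, 3, 6]   # strong evening peak
WEEKEND_REF = [1, 3, 4, 4]   # flatter over the day


def normalize(profile):
    """Scale a profile so its buckets sum to 1 (shape only, not volume)."""
    total = sum(profile)
    return [value / total for value in profile]


def distance(a, b):
    """L1 distance between the normalized shapes of two profiles."""
    return sum(abs(x - y) for x, y in zip(normalize(a), normalize(b)))


def classify_day(profile, workday_ref=WORKDAY_REF, weekend_ref=WEEKEND_REF):
    """Label a day by whichever reference shape it is closer to."""
    if distance(profile, workday_ref) < distance(profile, weekend_ref):
        return "workday-like"
    return "weekend-like"


if __name__ == "__main__":
    lockdown_wednesday = [2, 6, 8, 8]  # hypothetical: more daytime traffic
    print(classify_day(lockdown_wednesday))
```

Because the profiles are normalized, a day with twice the volume but the same shape gets the same label, which matches the point that the shape of workdays changed while weekends kept theirs.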

A

Okay, thank you. We have uh one quick question from spencer.

G

Thank you for bringing this work to us here. I'm one of the editors of a document in the MOPS working group on operations for media streaming, guidance for operators, and this is going to be a great resource for us. Thank you. Sure, thank you.

A

Thank you, Oliver, and with that we go to the last talk. We only have three minutes, but maybe we can go one or two minutes over. We have Marcus. Let's see if you can share. Yes, there you go.

A

Yes, we can see your screen, we can't see the slides yet yeah.

J

You can see in here so now you should be able to see.

J

Okay, yeah. Hi everybody, I'm Marcus from RWTH Aachen University, and I would like to summarize our results on the security configuration of Internet-facing OPC UA deployments. OPC UA is one of the most recent industrial communication protocols. It was designed with security in mind, making it a prime candidate for communication in Industry 4.0 and the Industrial Internet of Things, for example to control production via the Internet. And indeed, the security features of OPC UA are approved, for example, by the Federal Office for Information Security in Germany.

J

However, to actually achieve the level of security promised by OPC UA, extensive configuration is required, and to guide operators through this extensive configuration process, the OPC Foundation has published official security configuration recommendations.

J

However, the question still remains whether Internet-facing OPC UA deployments are configured securely. You might ask why this is relevant for the IETF, as OPC UA is not an IETF protocol, but we think that it might serve as a key example for secure-by-design protocols in general. So, let's get into it.

J

So first we have to find OPC UA deployments on the Internet. We performed active Internet measurements weekly over a period of seven months, using ZMap on the standard OPC UA port, and subsequently retrieved security configurations and publicly available payload data. And indeed, we found industrial deployments using the OPC UA protocol: between 1,700 and 2,000 of these deployments were reachable over the Internet in the IPv4 address space. 42 percent of them are discovery servers, which means that they are only used to announce OPC UA deployments on other IP/port combinations. We focused on the non-discovery servers here, as the security configuration is only relevant for the actual payload transmission, and then we answer the question of whether these are configured securely.

J

Looking into the security configurations of these deployments, we first look at the security mode, which basically enables or disables communication security features, for example confidentiality and/or authenticity. Here, one out of three modes completely disables communication security, and we found that one fourth of all deployments actually use this one mode, basically neglecting all of the OPC UA security benefits and not offering communication security at all.

J

Furthermore, we looked into the security policies, which define the security primitives to be used to implement the selected security features. Here, three out of six policies cannot be considered secure, two of them because they are deprecated due to the usage of SHA-1. We found that, even though these were deprecated in 2017, one fourth of all deployments that offer security still use these deprecated policies. More interestingly, we found that only 1.4 percent of the servers we found enforce the usage of the secure policies recommended by the OPC Foundation.
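
The mode and policy checks described here can be sketched as a small assessment function over the endpoints a server offers. This is an illustrative sketch, not the scanner from the paper. The policy names follow the OPC UA security policy identifiers as commonly listed (None, Basic128Rsa15, Basic256, Basic256Sha256, Aes128_Sha256_RsaOaep, Aes256_Sha256_RsaPss); which ones count as secure or deprecated is taken from the talk's description and should be treated as an assumption.

```python
# Assumed classification, following the talk: three of six policies are not
# secure (None plus the two deprecated, SHA-1 based ones).
SECURE_POLICIES = {"Basic256Sha256", "Aes128_Sha256_RsaOaep", "Aes256_Sha256_RsaPss"}
DEPRECATED_POLICIES = {"Basic128Rsa15", "Basic256"}


def assess_endpoints(endpoints):
    """Return sorted findings for one server.

    endpoints: list of (security_mode, security_policy) pairs the server
    offers, with modes "None", "Sign" or "SignAndEncrypt".
    """
    findings = set()
    for mode, policy in endpoints:
        if mode == "None" or policy == "None":
            findings.add("offers no communication security")
        elif policy in DEPRECATED_POLICIES:
            findings.add("offers deprecated policy")
        elif policy not in SECURE_POLICIES:
            findings.add("offers unknown policy")
    return sorted(findings)


def enforces_secure_policies(endpoints):
    """True only if the server offers endpoints and all of them are secure."""
    return bool(endpoints) and not assess_endpoints(endpoints)
```

A server is only counted as enforcing secure policies when none of its endpoints falls back to a weaker option, which is why that share can be much smaller than the share of servers that merely offer a secure policy.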

J

We also checked whether the deployments implemented the security policies correctly by looking into the certificates the servers use for authentication, and found that thirty percent of the servers use certificates that are too weak, which do not meet the minimum requirements of the security policies, for example in key length or in the hash algorithm that was used to generate those certificates. Furthermore, we detected several cases of certificate reuse. One was most striking, as one certificate was deployed on more than 380 devices, which we think are run by different operators, as we derived from the discussion with the manufacturer.
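
Certificate reuse across hosts can be detected by grouping scanned hosts by a fingerprint of the certificate they present. The sketch below assumes the scan results are available as a mapping from host to the DER-encoded certificate bytes; the host addresses and byte strings in the usage example are placeholders, and this is not the analysis code from the paper.

```python
import hashlib
from collections import defaultdict


def fingerprint(cert_der):
    """SHA-256 fingerprint of a DER-encoded certificate."""
    return hashlib.sha256(cert_der).hexdigest()


def reused_certificates(host_certs, min_hosts=2):
    """Map fingerprint -> sorted hosts for certificates seen on several hosts.

    host_certs: dict of host identifier -> DER-encoded certificate bytes.
    """
    hosts_by_fingerprint = defaultdict(list)
    for host, cert_der in host_certs.items():
        hosts_by_fingerprint[fingerprint(cert_der)].append(host)
    return {
        fp: sorted(hosts)
        for fp, hosts in hosts_by_fingerprint.items()
        if len(hosts) >= min_hosts
    }


if __name__ == "__main__":
    scan = {  # placeholder data, not real certificates
        "192.0.2.10": b"certificate-A",
        "192.0.2.11": b"certificate-A",
        "198.51.100.7": b"certificate-B",
    }
    for fp, hosts in reused_certificates(scan).items():
        print(fp[:16], hosts)
```

Since the private key belongs to the certificate, every device sharing one certificate also shares one key, so a compromise of a single device affects all of them.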

J

Looking into the third aspect of the security concept, we also looked into authentication and found that 44 percent of the deployments do not implement access control, meaning that everybody on the Internet can access these servers without authenticating. We also looked into the payload data of these deployments and found that, in our judgment, these are actual production systems.

J

So our results underpin that deploying a secure-by-design protocol is not sufficient; it requires, of course, a secure configuration. We think that standardization could also help with that, for example by specifying that implementations have to ship with a secure-by-default configuration. Our papers are available on arXiv, and we also published our dataset and our scanner. So thank you very much for your attention.

A

Thank you, Marcus. Sorry that you were a little bit squeezed for time here, but I also really recommend reading the paper. I think, even though this is not an IETF protocol, it's a really good example of what can go wrong. I think you ended up with over 90 percent of deployed systems having some kind of security flaw, even though recommendations exist. So I think this is a good lesson learned. People should read the paper and look at this, and we should consider this in future.

A

We can maybe take one quick question. We are already over time, but if there's something urgent, I would grant it. Otherwise, people can move to the list, send questions to Marcus, read the paper, or contact him directly. Thank you very much, and with that we're at the end of our session. Thank you very much to all the speakers. I really enjoyed it. I was really happy that we got so many good presentations and talks.

A

Thank you to everybody who participated, and for all the questions. See you on the list or next time. Enjoy the rest of the IETF meeting.
