From YouTube: IETF102-ANRW-20180716-1550
Description: ANRW meeting session at IETF 102
2018/07/16 15:50
https://datatracker.ietf.org/meeting/102/proceedings/
So, hi, I'm Brian Trammell, currently a pseudo-academic. I am chairing the next session, which is on traffic engineering, for which we have two talks. First is an invited talk from Nick Feamster of Princeton University about why and how networks should run themselves. "Why and How Networks Should Run Themselves": there we go. Yes.
Great, so thank you for the opportunity to speak to the group. This is an invited talk, as Brian said, and a lot of this actually is forward-looking, future sort of vision. I based the talk on a bunch of past work that I'll refer to throughout, but there's basically one premise behind the talk, which is that there's been a lot of work in inference, in control, and in network monitoring.
Should we start thinking about putting these together in certain ways to do more interesting, even more sophisticated things, and what kinds of challenges arise from that? I'd like to think about network management in terms of this cycle, where operators will get stuff out of the network: measurements from their switches, routers, etc. This might be anything from packet traces to NetFlow to SNMP. They then try to figure out what's going on.
In some cases, we've seen some coupling of different aspects here, but I want to sort of talk about the past a little bit and then look forward to the future, and point out that in all of these three areas there have been very exciting developments that may enable closing this control loop in interesting ways.
This, I think, by the way, relates a lot to some of the working groups here, both in the IETF and in the IRTF: ANIMA, some of the management working groups, MAPRG in the IRTF, and so forth. So hopefully this will resonate, as well as with NETCONF, obviously. Okay, so I'll talk about each piece here.
Let me start with inference. For each one of these three I'm going to talk about some past work, glimmers of hope I would say, and I'm going to focus on some of my own past work with some of my students and postdocs, but for any one of these three there are twenty examples. Okay, so my point here is just to say: hey, there are some building blocks here; there's also a gap.
We need to cross it to kind of close this loop. So in inference there's a whole bunch of stuff that you can do, of course, once you've gathered data from the network: machine learning, or perhaps AI is the term I should be using now. But you know: detection of attacks, what-if scenario evaluation, inference about performance or quality of experience, prediction for provisioning, troubleshooting, anomaly detection.
D
Looking
at
network
properties
to
detect
spam
to
detect
web
based
malware
to
you
know,
use
the
routing,
BGP,
routing
messages
and
anomalies
there
through
detects
bulletproof
web
hosting,
and
one
I'll
talk
a
little
bit
more
about
which
is
detecting
and
predicting
the
abuse
of
Internet
infrastructure
by
looking
at
anomalous
patterns
in
DNS
registration.
So
I'll
mention
just
a
couple
of
words
on
that,
just
to
give
you
some
color
and
then
I'll
sort
of
come
back
to
the
bigger
picture.
Okay.
A website gets set up. So if there's an email campaign, or some other type of campaign, typically you want the user to click, right, so that they can go somewhere and buy something or enter their password or do something that you'd rather not have them do. Predating setting up the website, of course, if you trace this way, way back, there's a DNS registration, a domain registration, at some point, and it so happens that those domain registrations look funny. For one, they can be hosted in funny places, they can be registered through funny registrars, they can have strange things in the names themselves, and they can have small edit distances to things that were previously on a blacklist, or even to each other. So all of those turn out to be pretty good features.
Other things that turn out to be good features: a lot of the time, these shady websites for these types of activities are registered in huge batches, right. It's not common that any of us would register thousands of domain names with very, very similar keywords in them, but that's a common thing that turns up in attacks.
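As a toy illustration of those two registration features (my sketch, not the classifier from the actual work; the thresholds and domain names are invented), a few lines of Python can score new registrations by edit distance to a blacklist and by batch similarity:

```python
# Illustrative only: flag registrations that are close to a blacklisted
# name, or that arrive alongside many near-identical names (a batch).

def edit_distance(a: str, b: str) -> int:
    """Classic Levenshtein distance via dynamic programming."""
    prev = list(range(len(b) + 1))
    for i, ca in enumerate(a, 1):
        cur = [i]
        for j, cb in enumerate(b, 1):
            cur.append(min(prev[j] + 1,                 # deletion
                           cur[j - 1] + 1,              # insertion
                           prev[j - 1] + (ca != cb)))   # substitution
        prev = cur
    return prev[-1]

def suspicious(new_domains, blacklist, dist_threshold=2, batch_threshold=3):
    """Flag domains near a blacklisted name or near many batch-mates."""
    flagged = set()
    for d in new_domains:
        # Feature 1: small edit distance to a known-bad name.
        if any(edit_distance(d, bad) <= dist_threshold for bad in blacklist):
            flagged.add(d)
            continue
        # Feature 2: registered alongside many near-identical names.
        near = sum(1 for other in new_domains
                   if other != d and edit_distance(d, other) <= dist_threshold)
        if near >= batch_threshold:
            flagged.add(d)
    return flagged
```

A real system would of course combine many more features (hosting, registrar, name structure) in a trained classifier; this only shows why the two features mentioned above are cheap to compute.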
OK, so it turns out this is just one example of where you can use network data to infer that something bad is, actually, not even happening yet: that it's going to happen, right. So you can predict. That's pretty cool. Another one: what-if scenario evaluation. This, by the way, was work done by my former PhD student, now at Google.
This was implemented a while ago in the CDN provisioning use case, and the idea was basically: OK, if we deploy a new front-end in our front-end/back-end web architecture, or if we need to take down a front-end for maintenance, what's that going to do to web search response time?
If you are a company like Google that has a lot of information about these lower-level network properties and how they relate to much, much higher-level quality-of-experience measures, things like search response time, then you can basically learn the relationship between those network properties and higher-level application quality-of-experience metrics, and predict what's going to happen if you do something like move a front-end or take it down for maintenance. So: what-if scenario evaluation. Since it's the IETF, I'll skip this slide.
I'll just tell you that it basically works. Okay. Okay, so that's inference. Is that the end of the story? No, in a word. I will tell you that everything I showed you there was basically cheating, in a way, and the reason it was cheating is that it's all offline. You've got lots and lots of traces; you collect them, you throw them into huge clusters, you train your models for a long time.
You run k-fold cross-validation, and you get some nice results for a graph like I just showed you, but we're a ways from doing this in a real-time closed loop. Okay, so good progress there, lots of work to go. Let me talk a little bit about control. So another thing that's happened over the last five, ten years is, through things like NETCONF, SDN, etc.
There's been a lot of work in programmatic control over the configuration and even the forwarding behavior of these network devices, and also there's a lot of work going on down here in programmable data planes. Okay, so there are basically two things going on: as I mentioned, programmable network-wide control, and the ability to sort of customize how packets get forwarded, in hardware, in the switches themselves. So this relates a lot to what's going on in NETCONF. Let me just give you one example of something that we worked on. So, in a conventional IXP...
As everybody knows, a bunch of ASes come together; they want to peer at this IXP. They exchange routes at the route server to limit the pairwise BGP interconnections. And we basically said: oh, well, actually, what if you did something like take that conventional route server and turn it into a controller of sorts, right, that knows how to talk directly to the switch? In our work we used OpenFlow 1.3, but this could be NETCONF or something else. The point here is that, hey...
You can actually do nifty things with control, right. If I've got a network-wide view over what's going on at my IXP, I could basically do things like have all these ASes send more flexible, more fine-grained policies to that controller and do more flexible, interesting things. Let me just give you one example of what happened there. Okay, one of the things that we worked on, and I'm basically going to elide a lot of the details, is things like port-specific inbound traffic engineering and even port-specific peering.
Okay, so cover your ears if you're worried about net neutrality, but basically what we were looking at here is: you might have an AS that would write a policy like this, that says: okay, if the traffic is destined for this particular port, send it out a particular port on the switch. And you can basically write that in a high-level configuration language, you can put it in the switch, and the problem, and where this gets...
...you know, why this becomes a research paper, is: that's all well and good, but it totally doesn't interact well with BGP, right. So we figured out ways to make this work nicely, so that you can't send traffic out an output port that your neighbor hasn't advertised. So there's a lot of interesting things going on there with how you couple programmable control with today's Internet routing protocol, so you don't actually, like, break the Internet. Okay, so there's a lot more going on in control.
So, historically, this has been pcap or NetFlow or the like, and we've been doing a lot of work on making telemetry more flexible. And I did see, actually, that there's a draft up for adoption here this week that relates to in-band network telemetry; I don't have a pointer to the draft itself. The idea behind this work, which is work of PhD student Arpit Gupta, is that there are a lot of analysis...
...questions, things that require getting measurements out of your network, right, but often, whether it's packet capture or NetFlow or what have you, you have too much or too little data. Could we better tailor the data that we're getting out of our switches and routers to the queries and the things that we really want to know? So one example that we looked at in this paper was DNS reflection.
Okay, so for those of you who don't know about reflection: this is basically an attacker sending DNS queries with spoofed source IP addresses to resolvers, and those responses, bigger than the queries, go back to some victim. Well, let's say you wanted to detect that. You're interested in maybe getting information from your network about DNS response traffic, from a certain set of unique DNS resolvers, directed at a single victim, that exceeds some threshold. Okay, that's not something that you could typically ask for.
This basically relates to the whole control loop that I talked about at the beginning. But basically the idea here is: collect data that's tailored towards the questions that you want to answer, right. Okay, and again, why is this research? Well, there are a lot of actually very tough questions there, because most of these systems, like Spark, aren't really tailored towards getting huge firehoses of network traffic coming off of switches and routers into their cluster. So we had to do a lot of work to partition the queries.
The answer would have been none of them, but now, thanks to a lot of work in programmable hardware, we can actually map a lot of those primitives right down onto the switch and save some of the load coming up at that server box at the top. As I mentioned, there's interesting work going on not only in the research community but also coming into the IETF on these network telemetry kinds of questions. Okay, so we've got all the pieces. Can we close the loop? As I mentioned...
We're basically part of the way there. On each one of these building blocks, really, we've got a ways to go. On inference, we're a long way from real-time inference: taking any of those problems that I mentioned at the beginning of the talk, you know, anomaly detection, security, we're a long way from sort of putting that into a real-time control loop. Monitoring: we're pretty good at doing low-level network metrics and aggregates. Even the thing I showed you is pretty low-level.
Okay, but what about higher-level application characteristics and properties, like: is the video stream rebuffering, right? Or was there a change in the resolution? Or what's the join time across all of my clients, for example? And those are things that we are actively working on in our research, but again, those are all just sort of little steps in the building blocks, and I think, basically (one slide left), we should be thinking about where we'd like to be. So: closing the loop.
There are interesting things that we could do in all three of these cases, ranging from getting real-time statistics on paths (so I think that telemetry draft that I just flashed up relates to that kind of question), to inferring properties up the stack, to even doing things on the security and privacy side, right. If we could close the loop in this way, we could do much better monitoring: we could gather only what's needed, which would help us both in the scalability aspects of things, but also in privacy. A lot of the time we don't want to gather things that we don't need to; the problem is, we have to, because of the way that our systems work today.
So, in summary, we've got building blocks in all three of these areas. There are a lot of things that we could do in closing this control loop in the way that I described, and I think there are challenges both in doing so and in figuring out what problems are most important to solve...
...first. So I hope I've convinced you that we've made a lot of progress in all three of these areas, but there's a lot of work to do in this sort of network automation space, and there's a URL here to a website; we just had an NSF workshop on this, and I think there's quite an agenda going forward for many years to come. So thank you very much.
We have a couple of minutes left for questions, if anyone would like to come up. Oh, cool, then I get to ask the question. Good. So, go back a couple of slides... there you are. You identified the gaps, which...
...right, so there's a real-time coupled control gap, there's the lack of integrated access, and there's the low-level to high-level application gap. Which of these problems do you think are accidental, and which are essential? You know, I'm a monitoring person, so I look at taking low-level metrics and aggregates and turning them into QoE, and I'm like: okay, well, we have mappings that we can use there, but it's kind of black magic and it'll probably never work, yeah.
But I spent a little bit less time on the other two, so I'm more optimistic about them. Is that just, you know, being optimistic about what you don't know, or is it actually... So the middle control-plane thing, I think, is actually just something where standardization is a key way to fix it, and it's not like an essential problem, yeah.
I agree, I agree, and there's certainly work here at the IETF about sort of multi-domain control, and of these three there's probably the least work in that area, even though I professed about it; that never stops an academic, right. But no, I think there are hard problems in control.
Certainly people are working on them. I think the other areas, as you mentioned, are arguably a lot more difficult, and I think you hit on the problems in monitoring. I think inference also is very challenging. The quantities of data are large, and that's why a lot of these inference techniques work: because a lot of them rely on having tons and tons of data. The one I showed you about the what-if scenario evaluation: unimaginable amounts of data. Take any deep learning problem, right, that...
...works great for a paper, works great for an offline analysis, but turning it into real-time control is another ballgame. And that's not to say that it's hopeless. I think there's an area of machine learning research, active learning, so if you look at these sorts of cost-sensitive optimizations in learning, I think we will be able to make some progress in saying, like: hey, it's a lot cheaper to get packet headers than to get payloads, and cheaper might be performance cost.
I think it was really interesting; I'll definitely go read the work. I particularly liked that you honed in on in-band network telemetry; I think there are huge opportunities there, really interesting. I have one really small comment, but actually let me ask it as a question rather than a comment: what do you see as the difference between policy-based routing and traffic engineering? [Answer inaudible.]
A little quick plug for me: the Network Management Research Group is having a special meeting about machine learning that I think touches on a number of these topics, and it's taking advantage of the fact that we have a bunch of faculty in that space here in Montreal, so check the schedule and join that, please.
So, in order to meet various competing objectives, operators of wide-area networks use traffic engineering to steer traffic in desirable ways, and a good traffic engineering system must be able to achieve good performance in terms of high throughput and link utilization. It should also be robust to failures.
Traffic in wide-area networks has different latency requirements: for instance, low latency for customer traffic, but more relaxed requirements for bulk replication traffic. And finally, these traffic engineering systems should be simple enough that it's easy to debug any issues that might arise in production. And traffic engineering systems must be able to achieve these competing objectives in the presence of various challenges. For instance, network capacity might be added based on anticipated demands, and this leads to highly non-uniform link capacities and an unstructured topology.
Unlike in data center topologies, failures, such as a link going down or a misbehaving router, are quite common in such networks. Similarly, it is quite difficult to predict traffic matrices in advance, so a good traffic engineering system must be able to handle these failures and mispredictions in a graceful manner. A TE system must also be aware of any limitations imposed by the underlying hardware: for example, the number of forwarding rules that can be installed on a router is limited. And finally, it must react rapidly.
For instance, by changing the link weight of this particular edge, an operator can steer the traffic over this other path. Now, this approach is easy to implement because it harnesses the capabilities of widely deployed conventional routing protocols. However, in order to achieve good performance, it is quite difficult in practice to change, or optimize, the link weights. Further, these distributed protocols often do not work well when failures happen, or during periods of reconvergence, when link weights have been updated. So around five to six years back, enabled by software-defined networking, we started to see...
K
Centralized
solutions
to
traffic
engineering
and
ice
team
gives
us
global
visibility
and
direct
control
over
the
network.
So
in
principle
we
should
be
able
to
implement
the
optimal
routing
scheme
at
any
point
in
time.
So
for
experts
in
this
room,
that'd
mean
we
firmly
the
routing
as
an
optimization
problem,
say
as
a
multi
commodity
flow
problem
and
then
solve
it
using
LP
optimizer.
K
Now, this should give optimal performance in theory, but in practice it is difficult to get optimal performance, because note that to actually achieve it you would have to solve the MCF instances quite frequently and update the routing state almost instantaneously, and this is limited by various practical and operational constraints. For instance, let's look at the time required to solve an MCF instance. This graph shows, on the x-axis, traffic matrices over a period of one week from Facebook's backbone network, and on the y-axis we have the time required to solve them.
This graph shows the number of routing paths that we might need to add or remove in each iteration, and we see that we have to update 1,500 to 2,000 paths, and these updates need to be done in a consistent manner to avoid any forwarding loops, black holes, and transient congestion. So this again imposes a significant overhead on the routers. So how did people actually build centralized TE systems? To understand that, let's take a step back and see what a TE system fundamentally does. Essentially, a TE system answers two questions.
First is path selection: that is, it should tell us what the set of paths to take from a source to a destination is. And the second is: given these multiple paths from a source to a destination, how should we split traffic among these paths? We'll call that rate adaptation. So it turns out that path selection is a slow, expensive process, because we might need to update many distributed routers for a single path update.
Specifically, we highlight that the path selection algorithm should pick paths which have low stretch, for good latency; it should select paths with high diversity, so that the traffic engineering is robust to failures; and third, the paths should be selected in a capacity-aware and globally optimized manner, to have good load balancing. I'll explain what I mean by these properties on the next slide.
Now, we can make this approach capacity-aware by using something like constrained shortest path first, or CSPF. Using that, let's see what happens in this case, where we have made the topology more uniform: let's say all the links are 100 Gbps, and A, B, and C each want to send traffic to E, and all the demands are again 100 Gbps. So when A wants to send traffic, it will take the shortest available path, and it will saturate the links along that path.
So B will have to take a longer path, and C will have to take an even longer one. But this is clearly suboptimal, because the paths were not selected in a globally optimized manner.
So instead, if we knew all the source and destination pairs at the beginning, and we could use those to select the paths, we could have come up with a much better set of paths. And we find that most commonly used path selection algorithms often do not meet the criteria for good path selection. For instance, shortest-path-based approaches can give us low latency, because they have no stretch, but they are often not diverse enough to be robust to failures, or they do not give us good load balancing properties. We can get good load balancing by using MCF, but we found that with MCF the set of paths is very brittle to failures, and the paths often have higher latency.
So in the past, people have looked at demand-oblivious schemes, such as VLB, or Valiant load balancing, in order to achieve good performance. So let's look at oblivious routing in a little bit more detail. VLB works by forwarding packets through random intermediate hops, and these intermediate hops forward the packet on to its final destination. VLB has been shown to work really well for mesh-like topologies, providing good performance as well as robustness.
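A minimal sketch of the VLB idea just described (illustrative only, not the talk's implementation; the `shortest_path` routine is assumed to be supplied by the caller):

```python
# Valiant load balancing: route each flow via a uniformly random
# intermediate node, in two phases (src -> intermediate -> dst).
import random

def vlb_route(nodes, shortest_path, src, dst, rng=random):
    """Pick a random intermediate hop and concatenate the two legs.
    `shortest_path(a, b)` must return a node list from a to b."""
    mid = rng.choice([n for n in nodes if n not in (src, dst)])
    leg1 = shortest_path(src, mid)
    leg2 = shortest_path(mid, dst)
    return leg1 + leg2[1:]   # drop the duplicated intermediate node

# On a full mesh every pair is directly connected, so each leg is one hop:
mesh_path = lambda a, b: [a, b]
rng = random.Random(0)
path = vlb_route(["A", "B", "C", "D"], mesh_path, "A", "D", rng)
```

Randomizing the intermediate hop is what spreads any traffic matrix evenly over a mesh; the next paragraph explains why this exact trick hurts on a wide-area topology.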
However, wide-area networks are not mesh-like, so even though VLB gives us good robustness, it will give us high latency, because you will have to take really long paths, and this also contributes to higher congestion over many different links. So oblivious routing, which was proposed by Räcke in 2008, generalizes VLB to non-mesh topologies. It does so by providing a hierarchy of intermediate hops instead of a single intermediate hop. More precisely...
It computes a distribution over routing trees, and each routing tree is computed using an approximation algorithm which guarantees that the paths in each routing tree are not significantly longer than the shortest paths, so we get good latency. And then, in this iterative algorithm, we update the weight of each link based on its usage in the previous set of trees. This means that we do not over-utilize any particular link, leading to good load balancing, and it also leads to a more diverse set of trees, leading to more robustness.
So this oblivious routing approach that was proposed in 2008 has good load balancing properties, and it has been shown that it is O(log n)-competitive with MCF. That means that no matter what demands arise in practice, we will always get maximum congestion within a log n factor of what you would get with the optimal MCF-based scheme.
Now, people have shown that semi-oblivious routing is not much better than oblivious routing in the worst case, but these worst-case scenarios do not occur commonly in practice. So what we looked at in this paper is how well semi-oblivious routing performs in practice. To evaluate SMORE, we conducted experiments with data from Facebook's backbone network. The backbone network consists of several geo-distributed data centers and points of presence across the globe, and we used the YATES traffic engineering system to perform high-fidelity simulations.
So we find that the optimal approach, based on MCF, is able to route all the traffic, and the maximum link utilization varies between 40 and 65 percent, showing a diurnal pattern. I will not go through all of these approaches, but I'll just note that only CSPF, oblivious routing, and SMORE are able to route all the traffic without dropping any, and SMORE's performance in terms of maximum link utilization remains closest to optimal; in fact, it remains within 16% of optimal on average.
Next, we performed a similar analysis for robustness. In this experiment, for each of these traffic matrices we failed one unique link in the topology and measured the performance. So again, optimal is able to route around these failures by rerouting traffic, and the failure loss, which is shown in blue, shows that optimal doesn't drop many packets.
A similar trend is seen across all the traffic engineering systems as maximum link utilization goes up, and again we find that SMORE is able to achieve good performance, rerouting all the traffic around failures and still keeping the maximum congestion close to optimal. Now, these experiments were done with a path budget of four; that means we allowed only four paths to be used between any pair of nodes.
Now, a natural question to ask is whether these results are specific to Facebook's backbone network, or whether they generalize over a large set of topologies and traffic matrices. So for this we performed a large-scale experiment using data from the Internet Topology Zoo, and we created traffic matrices using the gravity model.
This graph shows the aggregate performance (sorry about the fonts). On the x-axis we have normalized capacity, where capacity is defined as the factor by which we had to scale up the traffic matrices before we saw any link congestion, that is, before we saturated at least one link. These are all normalized with respect to the optimal MCF-based approach, and we found that SMORE performs close to optimal.
Similarly, we performed an experiment measuring the probability of achieving different levels of throughput availability SLAs, and we found that SMORE again performs quite well and is robust to failures. So, to summarize: we found that path selection plays a crucial role in the performance and reliability of traffic engineering systems, and SMORE, which combines oblivious routing for static path selection with rate adaptation that can be quickly updated, is able to meet the competing objectives of performance and reliability.
[Questioner], UMass Amherst: so early on, in the motivation slides, you were talking about how path diversity is the difference between using, say, K shortest paths plus MCF versus SMORE. I was wondering about the bottleneck links themselves: how often are they actually a problem, and what links are these, in say a wide-area network topology, that you would say define the bottlenecks, such that you would want to do something other than K shortest paths plus MCF versus SMORE?
Good question. So with K shortest paths, what we found is that the set of links which are normally congested, when you use a shortest-path-based approach, or in this case K shortest paths plus MCF, are those which are central in the graph in some sense. So there are different notions of node centrality and link centrality, and these links are the ones which are central in the graph.
It could be a transatlantic link, or it could be links within the continent itself, because they lie on many different shortest paths. So if you just take all the N-squared node pairs and you compute the shortest paths, then you will find certain links which are there on most of these paths. So those are the ones that could be congested if you're using K shortest paths plus MCF.
Thank you.
Thanks. Right, so I want to talk to you about some work we did on deploying a censorship circumvention system called TapDance. So one of the things that's interesting is that if you look at the population of Internet users, a large number of them come from countries where their access to the Internet is restricted. This is a study that was done by Freedom House, and you can see that there's the very well-known example of China, which runs the Great Firewall of China and restricts access to many different websites.
Censorship is done by filtering people's access to the Internet, and there are many different approaches that are used. People manipulate DNS; people use BGP to block certain destinations (some of you might remember the Pakistan YouTube incident that resulted from that); people use deep packet inspection to look at content, though that's a little bit less popular now, as more and more protocols are moving towards encrypted content and end-to-end encryption.
I would say the big workhorse of Internet filtering is just simple packet filtering, where you deploy a packet filter at the border of your country that filters things based on IP and port. So, for example, last year I visited Shanghai. It was a very beautiful city, but when I went to my hotel and tried to connect to Twitter, the Great Firewall of China knows Twitter's IP address and blocks it, so the connection doesn't go through.
Now, for those who know me, you know that I can't go very long without social media without getting twitchy. So what I did is I set up a proxy at my university, I set up a VPN, and I connected through it, and I was still able to use social media. So censorship is not foolproof, and in fact this concept of deploying various kinds of proxies that users can connect to, to reach censored websites, has been adopted by a large number of projects: Tor, Psiphon, Ultrasurf, etc.
Unfortunately, this creates a bit of a cat-and-mouse game. What happens is that the censors now go after these proxies: they find where these proxies are located and they block them; they simply add another block entry to this packet filtering list. Now the circumventors, of course, work to deploy new proxies and to distribute them, but in the end there's this race back and forth: new proxies, new blocks.
The
advantage
can
be
in
the
circumvent
source
pocket
if
they
can
make
their
proxies
either
harder
to
find
and
there's
work
doing
that
faster
to
deploy.
So
you
can
just
bring
them
online
more
quickly,
harder
to
block
and
easier
to
get
into
the
hands
of
real
users,
and
so
our
project
really
works
in
this
harder
to
block
space.
N
We want to make sure that, even when these proxies are eventually found, it's hard for the censor to respond by blocking them. Our project is based on deploying a proxy within the infrastructure of some friendly Internet service provider, and the system is called TapDance, because what we use is a tap on the connection from a client that goes through our friendly ISP to some website that is not currently blocked by the censor. So this could be catpictures.com or anything like that that looks benign.
N
The real web server, the reachable web server, just waits for the completion of this request. At the same time, the station recognizes the tag and responds, sending a response in place of the real server, and from then on communication keeps going in this way: the client sends messages ostensibly to the real server, but in fact it is sending requests to our proxy, which then responds masquerading as the server.
N
This is what the incomplete HTTP request looks like. You can see here that it's incomplete, because it's missing the second CRLF at the end of the message. If a web server receives this, it says: okay, I'm waiting for you to complete this HTTP message. This special X-Ignore header is constructed in such a way that when you encrypt it in your TLS connection, you get a ciphertext with a particular format, something that we can recognize.
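A rough sketch of the kind of deliberately incomplete request being described: the tag rides in an X-Ignore header and the final blank line is omitted, so the real server keeps waiting. The header layout and tag bytes here are illustrative assumptions, not TapDance's actual wire format.

```python
import base64

def build_incomplete_request(host: str, tag: bytes) -> bytes:
    """Build an HTTP request that is deliberately left incomplete: the tag
    rides in an X-Ignore header, and the final blank line (the second
    CRLF pair) is omitted, so the server keeps waiting for more data."""
    encoded = base64.b64encode(tag).decode("ascii")
    return (
        "GET / HTTP/1.1\r\n"
        f"Host: {host}\r\n"
        f"X-Ignore: {encoded}\r\n"   # no empty line follows -> request never completes
    ).encode("ascii")

req = build_incomplete_request("example.com", b"\x01\x02hypothetical-tag")
```

In the real system the interesting part is not the plaintext header but what its TLS ciphertext looks like on the wire, which the next part of the talk explains.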
N
Using this elliptic curve point together with the station's private key, you can derive a decryption key for the rest of the tag, and inside the rest of the tag we have some shared secrets and also the client random, so that we know the decryption happened correctly. So what happens here is that if this message was sent by a TapDance client, we are able to successfully decrypt it and recover the key material that lets us take over the TLS connection.
N
If it was sent by somebody else, then our decryption fails. So we apply this trial decryption on every single TLS connection; 99.9% of the time it fails and we ignore the connection, and when we have a TapDance client we're now able to act as the other side of the TLS connection and pretend to be this reachable server while serving censored data.
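The trial-decryption idea can be sketched as follows. As a stand-in for the elliptic-curve tag construction, this toy uses an HMAC under the station's key: tagged blobs verify and yield their payload, everything else fails and is ignored. All names and the keying are assumptions for illustration only.

```python
import hmac, hashlib, os

STATION_KEY = b"station-private-key"  # stand-in for the station's ECC private key

def make_tag(payload: bytes) -> bytes:
    # A TapDance client would embed an ECC-derived tag; we fake it with HMAC.
    return hmac.new(STATION_KEY, payload, hashlib.sha256).digest() + payload

def try_recognize(blob: bytes):
    """Trial-check one observed record; return the payload if it tags, else None."""
    mac, payload = blob[:32], blob[32:]
    expected = hmac.new(STATION_KEY, payload, hashlib.sha256).digest()
    if hmac.compare_digest(mac, expected):
        return payload          # tagged: take over this connection
    return None                 # ordinary traffic: ignore it

# One tagged flow hiding among ordinary (random) traffic.
traffic = [os.urandom(64), make_tag(b"client-random"), os.urandom(64)]
recognized = [try_recognize(b) for b in traffic]
```

The key property mirrored here is that non-client traffic fails the check essentially always, so the station can safely run it on every connection.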
N
This causes the server, when the client sends it messages, to ignore any future messages that are sent, because it says: this message has come with an acknowledgment number for something I have not sent yet. And TCP stacks, both by the RFC and according to our testing, just ignore those messages as long as the sender's sequence number is within the window.
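The TCP behavior being relied on can be sketched as a simplified acceptance check: a segment whose ACK number is beyond everything the receiver has sent is dropped, while its sequence number only needs to fall inside the receive window. This is a simplification of the RFC 793/9293 rules (e.g. it ignores the lower ACK bound), kept minimal to show just the property the talk uses.

```python
def receiver_accepts(seg_ack: int, seg_seq: int,
                     snd_nxt: int, rcv_nxt: int, rcv_wnd: int) -> bool:
    """Simplified sketch of the checks the talk relies on: a segment acking
    data beyond snd_nxt (data we never sent) is silently dropped, while its
    sequence number only needs to land inside the receive window."""
    ack_ok = seg_ack <= snd_nxt                      # acks only data actually sent
    seq_ok = rcv_nxt <= seg_seq < rcv_nxt + rcv_wnd  # lands inside the window
    return ack_ok and seq_ok
```

So by sending the real server an ACK for data it has not sent, the client makes it ignore everything that follows, leaving the connection free for the station to use.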
N
So
pulling
this
together,
you're
now
able
to
take
over
this
TLS
connection
and
use
it
for,
like
I,
said
accessing
sensor
data.
Now
let
me
talk
a
little
bit
about
our
deployment.
We
did
work
with
two
ISPs
Americ,
which
is
a
regional
research
ISP
in
michigan
and
UC
Boulder,
and
we
deployed
for
total
monitoring
stations
at
their
points
of
presence.
Merritt
had
two
points
of
presence
in
Chicago
and
one
in
Detroit,
whereas
Colorado
had
one
place
where
we
were
able
to.
N
We
then
looked
at
what
websites
could
be
used
as
these
target
reachable
sites
in
our
deployment.
So
we
did
some
scanning.
We
use
data
from
census
to
find
what
all
websites
that
are
in
the
right.
Ip
ranges.
That's
have
an
SSL
capable
web
server
that
has
a
trusted
certificate.
We
then
filtered
them
to
try
to
eliminate
some
that
had
really
short,
timeouts
or
short
CCP
windows
that
reduced
I.
N
Forget
me
about
20%
of
them,
then
we
also
filtered
things
by
whether
they
could
actually
talk
to
our
client
and
because
we
only
implement
a
few
of
the
TLS
cipher
suites.
This
actually
brought
them
down
to
cuts
out
about
another
two
thirds
of
the
websites,
but
in
the
end
we
had
about
900
reachable
websites
that
we
could
use,
and
so
what
this
means
is
that,
if
you
were
to
do
use
IP
blocking
to
block
or
deployment,
you
would
have
to
block
all
these
900
websites,
and
this
is
what
I
mean
by
harder
to
block.
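The filtering pipeline just described can be sketched as a pure function over scan records. The field names, thresholds, and example hosts are assumptions for illustration; the real scan worked from Censys data, and the exact cutoffs were not stated in the talk.

```python
# Hypothetical scan records: trusted certificate, idle timeout, TCP window,
# and whether our limited TLS cipher-suite set overlaps with the server's.
candidates = [
    {"host": "cats.example",  "trusted_cert": True,  "timeout_s": 90, "tcp_window": 65535, "cipher_overlap": True},
    {"host": "short.example", "trusted_cert": True,  "timeout_s": 5,  "tcp_window": 65535, "cipher_overlap": True},
    {"host": "weird.example", "trusted_cert": True,  "timeout_s": 90, "tcp_window": 65535, "cipher_overlap": False},
    {"host": "self.example",  "trusted_cert": False, "timeout_s": 90, "tcp_window": 65535, "cipher_overlap": True},
]

def usable(site, min_timeout_s=30, min_window=16384):
    """Keep sites a TapDance-style deployment could plausibly hide behind."""
    return (site["trusted_cert"]
            and site["timeout_s"] >= min_timeout_s
            and site["tcp_window"] >= min_window
            and site["cipher_overlap"])

reachable = [s["host"] for s in candidates if usable(s)]
```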
N
In our trial we deployed this to about 70,000 users by partnering with Psiphon, a company that supplies software for Android that uses a number of different censorship-circumvention technologies. Just some really quick numbers from our trial: the ports we were monitoring totaled about 100 gigabits of traffic; at the peak we saw about 60 gigabits of traffic, and we were able to process this across four mid-range servers.
N
A real web browser makes a DNS request, then opens up a bunch of connections to a bunch of different servers. It uses particular things in the TLS negotiation that have a certain look. It uses HTTP messages of various sizes back and forth, and all of these things create a certain pattern; anything that doesn't look like a real web browser might distinguish us. Some of the work that we're doing now is trying to make sure that we look more like a real web browser.
N
So it's harder to distinguish us. So, for example, in the ClientHello we just switched to using the Chrome 62/64-version ClientHello, which, in our measurements, is still used by a few percent of clients. One interesting thing that I want to point out is that a bunch of work that's going on in the IETF actually makes our job easier. DNS requests, if they're sent over HTTPS or TLS, give less information for the censor to be able to identify that something encrypted is going on.
J
The ISPs in many of the countries I'm thinking of have very limited connectivity, and when we looked into this and did an informal analysis, what we saw is that it's very easy to detect, just by testing, that an ISP in these countries is running this. And ISPs there are very strictly regulated by the country's authorities. So when it's so easy to detect by testing, it seems much of this falls apart.
N
I'm not sure if I fully understood your question, I'm sorry, but this is definitely not something that I think every single ISP can deploy. There are definitely different reasons why different ISPs may not be able to do this, so we've never thought about doing a kind of global deployment. The ISPs that we have worked with were comfortable doing this, though they also had some initial concerns that were overcome, yeah.
N
I'm not sure I would qualify it as easy or as difficult, but just to explain the detectability: the easiest way to detect it is to get a copy of our client, you know, be one of those users who uses our Psiphon client, and see what ISPs you're connecting through. Like the answer to the last question, we don't have a public list of destinations or of the ISPs that are participating, so it's not trivial to figure out which ISPs are using this, but the project is not, you know, secret either.
N
Excuse me for one second. If you look at this end of the connection on the slide, it is non-standard. If you look at the other end of the slide, everything looks as it should, because you get a message that's acknowledging data, and then you get a new message, and then you acknowledge this message in your sequence.
L
Hi everybody, thanks for staying. I'm Rachee, a PhD student at UMass Amherst, and today I'm going to be talking about the nature and dynamics of Tor exit blocking. This work was done in collaboration with folks at UMass, Stony Brook, UC Berkeley, and ICSI. So, as folks might be aware, the Tor network allows its users a way to communicate with content on the Internet anonymously. As an example here, I have Alice, who wants to browse the Internet anonymously.
L
Now, a problem in this setting that a lot of content providers face is that when Alice's traffic reaches them from the Tor network, they're not entirely sure whether this traffic is malicious or benign. As a result, Tor users have started to face discrimination, or differentiation, from content providers, and what this means is that, at times,
L
Tor users see a variety of CAPTCHAs that they need to solve to be able to get to the content they are looking for, and sometimes it can be as bad as being outright denied access to the content. I think this is a problem for two reasons. The first is that there are people like journalists and activists, people behind state-level censors, who are using the Tor network to get to content they can't get to otherwise. The other reason is one we think about less.
L
A
recent
report
found
that
tor
users
are
just
as
likely
to
make
purchases
on,
say,
revenue
generating
websites
as
non
tor
users.
So
basically
both
ends
on
this.
Both
parties
in
this
thing
are
losing
losing
out
the
tor
networks
are
not
able
to
get
to
content;
they
want
to
get
to,
and
revenue
generating
websites
could
potentially
lose
money
yeah.
L
So
this
is
sort
of
the
backdrop
in
which
we
place
the
work
that
I'm
going
to
be
talking
about
today,
just
to
sort
of
set
the
stage
here,
I'm
going
to
walk
us
through
what
sort
of
interaction
really
happens
when
a
benign
tor
user
faces
discrimination
while
accessing
content.
So
I
know
this,
this
figure
is
kind
of
crowded,
but
I'm
going
to
like
direct
your
attention
to
pieces
of
it.
On
the
left
side,
I've
got
two
users,
user,
1
and
user
to
user.
One
is
one,
is
a
benign
tor
user
user.
L
Two
is
malicious
and
both
of
these
users
want
to
access
some
popular
content.
That's
hosted
on
server
one
server
two
and
server
3,
and
they
want
to
do
this
anonymously
so
as
they
want
to
do.
This
anonymously
in
the
middle
I
have
like
a
simplified
representation
of
what
the
Tor
network
looks
like.
There
is
an
entry
relay,
there's
a
middle
relay
and
there's
an
exit
relay
with
a
certain
IP
address,
as
folks
know
that
servers
on
the
Internet
face
a
lot
of
Flik
abusive
traffic.
L
So
all
of
these
servers
are
subscribing
to
some
form
of
threat,
intelligence,
which
is
basically
serving
as
a
line
of
defense
against
Seay
attacks
or
whatever
malicious
traffic
they
receive.
This
threat
intelligence
could
be
something
homegrown.
It
could
be
something
which
is
like
a
central
repository
that
everyone
has
access
to.
So
these
are
the
different
like
parts
of
the
equation
here
and
as
an
illustration,
a
malicious
user
user.
It's
trying
to
attack
server
one
and
they
do
it
using
tor,
so
use
a
tooth
traffic
mix.
L
The
Tor
network,
egress,
is
the
Tor
network,
reaches
server,
one
suburban,
really
realizes.
This
is
an
attack
and
as
a
server,
that's
under
attack
there's
a
number
of
things
you
could
do.
For
instance,
server
1
says
this
is
an
attack
I'm,
going
to
tell
my
threat
intelligence
provider?
Hey
guys,
this
I
P
address
1.2.3.4
is
bad.
That's
one
scenario:
now
user
2
tries
to
attack
server,
3
server,
3
realizes
this
is
an
attack
and
another
scenario.
Server
2
is
going
to
send
an
email
and
abuse
complaint
to
the
who
is
contact.
L
They
can
find
for
this
IP
address
and
folks
in
this
audience
are
probably
quite
familiar
with
how
abuse
emails
work
you
find
like
the
abuse
contact
of
the
autonomous
system.
That's
announcing
this
prefix,
you
send
an
email,
saying
I'm
receiving
bad
traffic
from
this
IP.
Stop
doing
so,
and
now,
after
all
of
this
has
happened,
a
benign
user
that
just
wants
to
get
their
cat
videos
accesses
content
and
server
to
and
they
get
blocked
because
it
turns
out
server.
L
2
is
subscribing
to
the
same
form
of
threat,
intelligence
that
server,
1
and
server
3
were
sort
of
contributing.
This
is
one
of
the
ways
in
which
a
completely
benign
tor
user
is
facing
differential
treatment
on
the
Internet,
and
this
is
a
problem.
Another
way
this
could
have
happened
was
if
this
threat
intelligence
intelligence
system
I
have
on
the
right
side.
Here
was
simply
just
crawling
the
consensus,
the
Tor
consensus
and
listing
all
known
exit
IPS
on
on
on,
though
as
a
blacklist
or
something.
L
So
what
we
wanted
to
do
was
to
understand
exactly
what
happens
in
these
scenarios
and
like
which
of
these
scenarios
really
plays
out
in
the
wild,
and
how
would
one
go
about
doing
something
like
this?
Is
that
we
take
this
complex
interaction
and
we
try
to
figure
out.
Where
can
we
place
our
vantage
points
to
understand
these
scenarios
better?
So
what
if
I
could
be
like
I
could
simulate
being
a
benign
tor
user
and
at
scale
figure
out?
What
is
the
sort
of
differentiation
that
tor
users
face
when
they
access
content?
L
What,
if
I,
could
get
access
to
abuse,
complaints
that
exit
operators
get
right
like
then
I
could
use
that
as
a
proxy
to
understanding
this
is
the
sort
of
abusive
traffic
that
people
face
from
tor
and
then
that's
something
that
we
could.
That's
that's
a
problem,
a
concrete
problem
we
could
work
on
and
finally,
if
I
could
get
access
to
something,
that's
commercially
deployed
threat.
Intelligence
I
could
figure
out
how
do
tor
IP
sort
of
harness
this
bad
reputation.
How
does
it
happen
over
time?
L
What's
what
can
we
do
in
this
scenario
to
fix
the
problem?
So
this
is
exactly
what
we
do.
These
are
the
four
key
questions
we
answer
by
introducing
vantage
points
in
this
complex
ecosystem,
so
I'm
gonna,
like
sort
of
highlight
what
our
methodologies
are
for
answering
this
question,
and
what
do
we
learn
from
this?
L
So
to
do
this,
we
design
like
an
automated
crawler
which
crawled
Alex
at
top
500
websites,
and
our
crawler
was
capable
of
sort
of
interacting
with
these
websites
by
log
trying
to
log
in
by
trying
to
search
it
things
like
that,
since
we
were
doing
this
from
tor,
you
can
see
on
this
side.
There's
some
example
screenshots
of
what
we
got
back
as
a
benign
tor
user.
We
were,
we
were
denied
access,
we
were,
the
crawler
was
asked
to
sort
of
solve
a
lot
of
CAPTCHAs
and
so
on
now
at
scale.
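The crawler's per-page decision can be sketched as a small classifier over the fetched response. The status codes and marker strings below are illustrative assumptions, not the study's exact detection rules.

```python
def classify_response(status: int, body: str) -> str:
    """Roughly bucket a page fetched over Tor the way the crawl does:
    outright denial, a CAPTCHA challenge, or normal content."""
    text = body.lower()
    if status in (403, 451) or "access denied" in text:
        return "blocked"
    if "captcha" in text or "verify you are human" in text:
        return "captcha"
    return "ok"

results = [
    classify_response(403, "<h1>Access denied</h1>"),
    classify_response(200, "Please solve this CAPTCHA to continue"),
    classify_response(200, "<h1>Welcome</h1>"),
]
```

Running the same classification once through Tor and once from a control vantage point is what lets the crawl attribute the difference to Tor-based discrimination.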
L
So how often does this happen at scale? We found that 20% of the Alexa top-500 websites discriminate against Tor users at the front-page level: that is, you access, say, the home page and you see some form of discrimination. This discrimination ranges from, say, solving CAPTCHAs to something as bad as being outright denied access. Now, of the websites that did not do this on the front page, about seven percent or so discriminate
L
if you try to log into these websites as a Tor user, and another three percent or so will discriminate if you try to exercise the search functionality on these websites as a Tor user. So this basically says that, and we did not understand this as well before, this discrimination is clearly quite bad at this point for benign Tor users.
L
So we did that: we launched ten exit relays of our own with different properties. When I say properties, I mean what sort of uplink these relays have to the Internet; there were small, medium, and large exit relays, depending on how much bandwidth they were allowed to use. And then relays have something called an exit policy, which is basically what sort of traffic this exit relay is okay with egressing: what port numbers and destinations it will send traffic to, and so on.
L
So
that's
basically
an
exit
relays,
exit
policy
and
we
varied
that
from
default,
which
is
the
default
policy
toll,
allows
for
its
exit
relays
and
reduce
reduce,
which
is
a
much
more
conservative
policy
which
does
not
allow
a
lot
of
variety
of
traffic
and
the
reason
this
is
sort
of
important
is
we
thought
that
if
we
could
make
the
exit
policy
more
constrained
or
more
conservative,
maybe
a
traffic
that
comes
from
these
relays
gets
treated
better
from
servers
hosting
content.
So
that's
that's
one
of
the
hypotheses.
L
What
we
wanted
to
test
here
so
just
to
sort
of
put
this
at
the
back
of
your
mind.
We
did
this
and
then
we
harness
the
data
from
these
relays
and
I'll
show
you
what
we
learned
there
so
abuse
complaints
that
exit
relay
operator
is
get.
We
wanted
to
analyze
these
and
use
them
as
a
proxy
to
figure
out
what
abusive
traffic
comes
from
tor.
So
we
did
that
the
reason
this
was
the
reason
it's
important
to
do.
L
Something
like
this
is
that
in
general,
if
you
do
not
have
server-side
coordination,
it's
sort
of
hard
to
get
access
to
something
that
lets.
You
decide
what
fraction
or
how
much
of
the
traffic
is
abusive
from,
say
the
Tor
network,
or
you
know
other
endpoints,
so
by
using
these
abuse
complaints
that
exit
operators
received
over
a
period
of,
however,
may
click
I,
think
three,
so
the
the
dates
are
from
2010
to
2016.
L
So
for
that
period
of
time
we
looked
at
the
abuse,
complaints
that
exit
relay
operators
got,
and
these
are
25
exit
relays,
ten
of
which
are
our
own,
and
the
other
15
are
the
ones
that
had
been
running
in
the
Tor
network
for
a
while
and
then
we
sort
of
did
automatic
clustering
on
them
did
some
analyze
than
using
regular
expressions,
and
you
basically
found
that
a
large
majority
of
these
abuse
complaints
are
DNC
related
I'm
guessing.
This
does
not
surprise
people
in
this
audience.
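The regular-expression bucketing of complaints can be sketched as follows. The patterns and category names are illustrative assumptions, not the study's actual clustering rules.

```python
import re

# Illustrative complaint categories; first matching pattern wins.
CATEGORIES = [
    ("dmca",   re.compile(r"dmca|copyright|bittorrent", re.I)),
    ("botnet", re.compile(r"botnet|compromised|malware", re.I)),
    ("scan",   re.compile(r"port\s*scan|ssh\s*brute", re.I)),
]

def categorize(complaint: str) -> str:
    for label, pattern in CATEGORIES:
        if pattern.search(complaint):
            return label
    return "other"

complaints = [
    "DMCA notice: BitTorrent infringement observed from your IP",
    "Your host appears to be part of a botnet C&C",
    "We logged a port scan from 1.2.3.4",
    "Please stop sending us weird traffic",
]
counts = {}
for c in complaints:
    label = categorize(c)
    counts[label] = counts.get(label, 0) + 1
```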
L
There
were
a
lot
of
DNC
a
complains
related
to
BitTorrent
traffic
from
tor,
and
we
know
that
his
complaints
tend
not
to
be
very
high
fidelity
in
nature,
so
we
sort
of
disc-like
did
not
analyze
them
further
and
we
looked
at
the
non
DMCA
complaints
that
people
receive
that
exit
relay
operators
receive
and
off.
These
38%
were
about
botnets
and
compromised
machines,
so
we're
trying
to
understand
this
is
basically
the
kind
of
abusive
traffic
that
you
would
see
coming
from
dawn
and
there's
like
a
distribution
of
what
are
the
types
of
complaints
we
saw.
L
The
important
thing
here
to
note
is
there
was
sort
of
no
correlation
between
what
exact
policy
the
relay
had
and
how
many
complaints
did
it
get.
So
the
hypothesis
you
were
trying
to
figure
out
that,
can
we
sort
of
fix
this
problem
by
changing
how
conservative
the
exit
policy
of
the
relay
is
and
turns
out?
That
does
not
seem
to
be
the
case
in
this
analysis.
L
Switching
gears
a
little
bit
I'm
finally
going
to
get
to
the
last
part
of
this
equation,
which
is
commercial
threat,
intelligence,
and
you
want
to
understand
how
does
commercial
set
intelligence
treat
tor
exit
IPs
and
what
happens
over
time?
So
for
this
purpose,
Ving
were
given
access
to
Facebook's
threat
exchange
data,
which
basically
is
an
aggregation
of
threat.
L
Intel
information
from
a
number
of
companies-
it
consisted
of
over
100
black
list
or
feeds
consisting
of
like
these
IP
addresses,
have
done
something
bad
in
the
past
and
using
this
threat
intelligence
information,
we
looked
at
how
do
exit
ip's
harness
bad
reputation
over
time?
How
soon
do
they
get
blacklisted
and
so
on,
and
we
found
that
so
I'm
going
to
show
you
this
plot
and,
like
I,
think
the
point
I'm
trying
to
make
here
will
become
evident
so
on
the
x-axis
here.
L
I've
got
time
in
hours
and
y-axis
is
sort
of
the
cumulative
distribution
function
and
that
spike
at
zero
is
basically
saying
that
all
of
these
IP
addresses
got
blacklisted
on
this
particular
feed
that
I'm
talking
about
in
the
threat
intelligence
data
as
soon
as
these
relays
came
up
online.
So
basically,
this
is
like
black
listing
after
IP
addresses
as
a
matter
of
policy.
These
relays
hadn't
really
been
around
to
actually
do
some
damage
at
that
point,
but
they
got
blacklisted
almost
immediately
and
we
refer
to
this
as
proactive
blacklisting
of
touji
lays.
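The proactive-versus-reactive distinction can be sketched directly from listing delays. Each value is the number of hours between a relay coming online and its first appearance on a feed; the 24-hour threshold is an illustrative choice, not the study's.

```python
def split_listings(delays_hours, proactive_threshold=24):
    """Split first-listing delays into 'proactive' (near-immediate, the
    spike at zero in the CDF) and 'reactive' (listed long after birth,
    plausibly in response to observed abuse)."""
    proactive = [d for d in delays_hours if d <= proactive_threshold]
    reactive = [d for d in delays_hours if d > proactive_threshold]
    return proactive, reactive

delays = [0, 0, 1, 3, 60, 200, 500]         # hypothetical first-listing delays
proactive, reactive = split_listings(delays)
share_proactive = len(proactive) / len(delays)
```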
L
On
the
other
hand,
if
the
blacklisting
takes
its
time
are
further,
delays
are
born
in
the
tor
network.
We
try
to
think
that
this
might
be
in
response
to
abusive
traffic.
That's
coming
from
these
relays,
sadly,
7%
of
commercial
black
lists
that
we
analyzed
they
blocked
or
proactively
ODS.
It's
a
matter
of
policy
for
blocking
tor
exit'
IP
addresses
I'm
gonna.
This.
L
This
graph
is
kind
of
messy
so
on
the
on
the
x-axis
I
have
got
time,
and
that
bar
says
that
this
really
came
online
at
that
point
of
time
and
the
brown
line
is
sort
of
showing
this
relay
was
alive
in
the
Tor
network.
Every
all
the
other
colors
are
basically
try
to
show
this
blacklist
blacklisted.
This
relay
in
this
span
of
time.
So
it's
important.
It's
sick.
It's
interesting
to
note
that
as
soon
as
it
really
comes
up
online,
there
are
black
lists
that
enlists
them,
and
these
are
our
own
relays.
L
So there are both kinds of blacklists that you can see: some that are more reactive in nature and some that are more proactive. And, like I mentioned, we did not find a correlation between how a relay gets blacklisted and its exit policy, so that hypothesis we began with:
L
We
did
not
really
like
me,
we
found
an
answer
to
it
and
it
seems
like
a
conservative
exit
policy
is
not
surely
the
solution,
so
I
want
to
conclude
and
then
sort
of
think
about
gay
people's
opinion
on
how
we
could
do
something
better
here.
So
the
first
thing
that
I
want
to
highlight
is
blacklisting
laughter.
Ip
addresses
ends
up
being
a
matter
of
policy.
L
In
a
lot
of
cases,
we
find
that
a
lot
of
commercial
glaucous
are
trolling
tour
and
enlisting
the
exit
IP
addresses
so
there's
before
we
could
see
some
evidence
of
abuse
coming
from
these
IP
addresses
they
got
listed
on
the
on
the
blacklist
and
does
worse
then
adverse
and
what
we
understood
before.
There's
a
rejection
of
tour
by
websites
for
like
search
and
login
functionality
as
well
so
I
think
to
me.
L
The
the
the
key
interesting
points
here
is
our
here
are
the
the
sort
of
fate
sharing
dynamics
in
tour,
so
we're
tall
exit
IPs,
even
though
not
all
users
are
malicious,
a
bunch
of
them
end
up
sharing
feet
because
the
exit
IP
is
common
for
the
traffic.
That's
coming
out
of
the
network
and
I
do
not
think
IP,
blacklisting
sort
of
does
a
good
job,
because
a
lot
of
benign
tour
usual
users
end
up
sharing
fate
with
these
bad
guys.
So,
if
folks
have
opinions
on
this,
it'd
be
great
to
know.
Thank.
I
For context, I contribute to SpamAssassin, MIMEDefang, mod_security on Apache, and so forth, and I'm just wondering: if the web servers can detect bad entries and decide to blacklist you, why don't you just apply the same tests on the ingress side of your network, so that those requests never make it out the other side?
L
That would involve changing the relay software itself, right? Like, I would need to run some sort of fingerprinting there which says: okay, this looks like attack traffic. First, I do think there are some efforts that are similar, but not quite the same, going on, and those are on the egress end of the network, not really on the ingress end. And second, I do think it makes people a little nervous to have any kind of information being logged on Tor exit relays, because it's a slippery slope, right?
L
That is true, but aside from the fact that there is malicious traffic, there's also benign traffic, and the point that I was hoping to make here was: if there are these long-term bans for IPs that at some point in the past have exited malicious traffic, I think that's bad, whether it's Tor or not.
O
Alright, so we've finished the talk portion of the day, sadly for me, because I'm so excited about all of this. So thank you, everyone, for attending through this part. Just so you know, we had 44 submissions to this workshop, and right now I think we have 18 posters accepted in the other room. So if you go to the program page, you can see the list of posters; they're all in there, and the presenters will be here. So this is another opportunity to connect directly with researchers and IETFers.
O
I hope people will take this chance. Everyone can join us in the next room; we're sort of scheduled at, like, 6:16 to 6:42, but it's very fluid. So please, everyone is welcome to join us in the other room. Thanks again, it was a real pleasure to hang out with you all, and I'll see you in there. Thank you.