From YouTube: IETF105-PEARG-20190724-1330
Description
PEARG meeting session at IETF105
2019/07/24 1330
https://datatracker.ietf.org/meeting/105/proceedings/
A: All right, hello everyone, welcome to PEARG, the Privacy Enhancements and Assessments Research Group. My name is Chris Wood; I'm here with Shivan Sahib, and Sara is with us remotely via Meetecho. Just quickly moving forward: this is the Note Well. You've probably seen it lots of times by now, but if you haven't, please take a moment to familiarize yourself with it, or you can read it online. Shivan is passing around the blue sheets; please make sure you sign them. We have a Jabber scribe, correct? Yes — Ben, thank you.

A: This is the agenda we have today: we have eighty minutes of, sort of, researchy presentations to start, followed by some proposed individual drafts for the research group to close things out. Does anyone have any last-minute comments or revisions they would like to make before we get started?
C: Okay, take two. My name is Pete Snyder; I come from Brave Software, where I do privacy research, and I'm also co-chair of PING, the Privacy Interest Group in the W3C. This talk is some lessons taken away from the intersection of those activities. As a brief overview of what I'll be talking about: first, how standards impact the privacy work that Brave does as a privacy-oriented vendor.
C: There are other things that I think warrant further discussion, but there isn't time in this talk. So, first: how do standards impede or impact our work as a privacy-oriented implementer? In order to protect privacy on the web for our users, Brave makes a bunch of modifications to the browsing environment. We block state storage in different places; we integrate Tor, and a lot of users use Tor out of the box; and we block a whole bunch of resources from being loaded
C
The
first
place
from
a
variety
different
places
like
easy
list
and
in-house
generated
lists,
and
things
like
that.
The
most
relevant
to
this
conversation
is
we
modified
the
browser
environment,
and
so
we
end
up
violating
standards
in
a
large
number
of
places,
because
implementing
those
standards
correctly
would
harm
our
users,
privacy
just
to
give
kind
of
a
very
specific
example.
Here's
a
recent
paper
that
came
out
that
gives
a
an
overview
of
a
bunch
of
fingerprinting
methods
that
are
on
the
web.
C
The
relevant
irrelevant
to
this
conversation
is
not
only
do
they
do
a
nice
work,
nice
job
of
surveying
existing
work
on
how
people
are
actually
doing
think
of
fingerprinted
online,
but
they
also
give
the
impact
of
each
of
those
fingerprinting
methods
and
and
kind
of
a
benefit
how
much
identifying
power
they
give
in.
So
the
one
I
want
to
talk
about
here
is
on
the
right
arm,
on
the
right
hand,
side
audio
context,
which
is
a
way
of
identifying
the
user
based
on
particular
it's
about
how
the
browser
will
do
audio
synthesis.
C
We
know
a
lotta
bunch
of
things
that
the
browser
allows
you
to
do
by
default.
These
are
things
like
asking
querying
the
hardware
for
its
capabilities
or
different
low-level
things,
and
so
what
brave
does
is
we
just
say?
No,
when
it's
a
third
party
context,
we
just
say
you
can't
do
that
unless
you
user,
Ops
and
explicitly
says,
violates
a
standard,
but
to
do
so
otherwise,
with
the
harm
art
would
allow
our
users
privacy
to
be
hard.
We
do
this
in
a
whole
bunch
of
other
places
as
well.
C
This
is
it
not
even
nearly
a
complete
list,
but
I
mention
these
just
as
a
partial
enumeration
of
how
extensive
the
problem
is.
So
how
did
we
wind
up
in
this
situation,
or
why
is
this
problem
so
endemic
across
the
browser
platform?
I
only
have
three
different
examples
of
these
kind
of
anti
patterns.
We
see
in
standards
that
lead
to
this
outcome.
This
is
not
anywhere
near
an
extensive
list
or
a
comprehensive
list,
but
I
need
it
as
motivation,
then
demonstration
so
also
I
don't
mean
this
to
be
simple.
C
So,
for
example,
many
w3c
documents
have
a
privacy
consideration
section
this.
These
are
things
saying
like
implementers
should
be
concerned
about
these
privacy
concerns
that
when
they
implement
this,
but
they
are
not
normative,
they
are
not
mandatory.
They
are
just
a
list
of
concerns,
but
the
rest
of
the
document,
which
is
almost
without
fail,
much
more
much
longer
and
much
more
specific.
C
So,
specifically,
the
functionally
that
must
be
implemented,
and
the
result
is
that
everybody
agrees
on
everybody
implements
the
harmful
part
and
people
are
extremely
unsure
what
to
do
about
the
non
harm
or
about
the
mitigating
part,
and,
as
a
result,
nobody
can
people.
Vendors
hands
are
tied
when
they
want
to
try
to
do
the
mitigations,
because
websites
assume
the
standard,
the
well-defined
behavior.
C
So
just
as
a
motivating
example,
this
one
you
may
be
familiar
with
referred
policies,
a
is
a
standard
that
says,
under
certain
conditions,
notify
the
website
that
you're
visiting
now
of
where
you
just
came
from
this
has
extremely
specific,
very
obvious.
Privacy
harm
our
privacy
implications.
There's
a
very
short
piece
of
text
saying
we
know
that
there's
privacy
problems
here,
but
in
the
mitigation
section
it
just
says,
vendors
can
do
whatever
they
want,
there's
no
specific
specificity.
It
just
says
vendors
may
violate
this
at
any
time.
As
a
result,
many
websites
rely
on.
C
This
is
very
unusual
to
website
operators
and
that's
result.
Many
many
websites
just
assume
the
refer
policy
will
be
standardized
as
it's
described
in
the
rest
of
the
document
and
it
now,
if
you
in,
if
you're,
in
brace
position
or
somebody
like
Gray's
position,
where
you
want
to
protect
the
users,
privacy
and
you
remove
the
refer
header
or
you
do
different
things-
to
try
to
make
the
referer
header
less
privacy
harming
you
break
a
whole
bunch
of
websites.
C: I have a long list of things we could probably talk about offline, but the first high-order bit would be: don't do this. If you had to do it, I would say: don't expect it in third-party contexts, and a referrer should only be sent on a user gesture — things along those lines, I think.
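The kind of trimming just suggested can be sketched in a few lines. This is a hypothetical policy for illustration — it is not Brave's actual algorithm, and the function name and rules are my own: full path for same-origin navigations, origin only cross-origin, nothing on an HTTPS-to-HTTP downgrade.

```python
from urllib.parse import urlsplit

def trimmed_referrer(referrer_url, target_url):
    """Return the Referer value to send for a navigation, or None.

    Illustrative policy: same-origin requests keep the path (query and
    fragment dropped); cross-origin requests get origin only; an
    https -> http downgrade sends nothing at all.
    """
    ref, tgt = urlsplit(referrer_url), urlsplit(target_url)
    if ref.scheme == "https" and tgt.scheme == "http":
        return None  # never leak an HTTPS referrer to plain HTTP
    origin = f"{ref.scheme}://{ref.hostname}"
    if ref.port:
        origin += f":{ref.port}"
    if (ref.scheme, ref.hostname, ref.port) == (tgt.scheme, tgt.hostname, tgt.port):
        return origin + ref.path  # same-origin: keep the path
    return origin + "/"  # cross-origin: origin only
```

Even a reduction this simple breaks sites that parse the full referrer URL, which is exactly the compatibility problem being described.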
C: That may distill down to a distinction without a difference: by saying vendors can do anything, everybody is "following the specification" — Chrome is following the specification, and everyone else follows its lead. The standard is no longer the guiding principle. We can take that offline. Okay, example two: functionality that's useful only in a very, very small set of situations, but is made universally available to all websites. The problem here is that you have something that's very useful to a very small number of people.
C
Maybe
that's
doing
things
like
audio
synthesis,
but
it's
made
without
there's.
No
there's
no
permission
in
place.
There's
no
gating
the
functionality.
You
know
anything
like
this.
It's
all
of
a
sudden.
It's
not
being
used
just
for
this
very
narrow
use
case,
it's
being
used
for
finger
printing
or
it's
at
least
available
for
fingerprinting
it's
available
for
these
kind
of
privacy,
harm
uses
and
so,
as
a
result,
becomes
extremely
difficult
to
pull
this
off
the
web
without
breaking
all
bunch
of
websites.
That
expect
it
to
be
in
place
an
example
here.
C: The canvas element allows you not just to write things to the canvas, but also to pull things out. The use case for this is not zero, but it's obviously not the common case when you're writing things to a canvas. And what do we see? The common use of this is in libraries like fingerprintjs2, where, if you dig into the code (lower right-hand corner of the slide), you can see it being used to generate unique identifiers for users.
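Why canvas readback identifies users can be sketched abstractly. In a real fingerprinting script the pixel bytes come from drawing fixed text and shapes and reading them back (e.g. via `toDataURL`); here I simulate two machines' readback as byte strings — the data values are invented for illustration. Tiny device-specific rendering differences yield a digest that differs across machines but stays stable per user.

```python
import hashlib

def canvas_fingerprint(pixel_bytes):
    """Collapse canvas-readback bytes into a short, stable identifier."""
    return hashlib.sha256(pixel_bytes).hexdigest()[:16]

# Two machines rendering the same scene with slightly different
# anti-aliasing: one color channel differs by a single unit, yet the
# resulting identifiers are completely different.
machine_a = bytes([120, 84, 200, 255] * 100)
machine_b = bytes([121, 84, 200, 255] * 100)
assert canvas_fingerprint(machine_a) != canvas_fingerprint(machine_b)
assert canvas_fingerprint(machine_a) == canvas_fingerprint(machine_a)
```

The hash is deterministic, so the identifier survives across visits without any stored state — which is what makes readback, rather than drawing, the privacy-relevant operation.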
C
Now,
if
you
pull
out,
if
you
think
you're
somebody
like
brave
or
you're
somebody
who's
doing
privacy
protections,
you
think
great,
we'll
just
pull
out
this
functionality
on
all
of
us
and
you
broken
whole
bunch
of
useful
use
cases
as
well,
where
the
the
goal
in
the
first
place
should
be
just
to
say,
figure
out
a
way
of
getting
permission
to
the
sorry,
gaining
access
or
restricting
access
to
these
kinds
of
positive
use.
Cases
in
the
first
place.
C
So
again,
I
have
a
long
range
of
suggestions,
but
the
the
first
approximation
is
to
say
user
gesture,
say
permission
to
say
some
other
way
of
the
user.
Signaling
I'd
like
to
do
a
thing
I'd
like
to
have
you
back
canvas
likely.
If
so
reading,
back
cameras
in
and
of
itself
is
often
not
useful.
You
need
to
do
something
with
that
value
for
it
to
be
useful,
save
it
to
disk
stay
with
the
storage
that,
and
so
those
are
things
that
other
browsers,
already
gate.
C: Third and final is something we see over and over again in standards conversations: "websites can already do bad things — what is the marginal harm of allowing them to do one additional bad thing?" I think this is unhelpful. For example, if a standard is proposing a new way of allowing web applications to communicate with remote servers, the common refrain is: you can already do that with an image tag, so what's the harm of allowing this?
C
The
relevant
part
here
is
I,
won't
walk
through
this,
because
I
assume
people
are
familiar
with
since
it
lives
here,
but
there's
a
large
number
of
there's
a
large
amount
of
papers
being
written
and
research
that
exists,
showing
that
the
exact
kinds
of
values
that
can
be
requested
through
client
hints
can
nearly
identify
a
non-trivial
number
of
users.
As
a
result,
the
response
is
often
well.
You
can
already
identify
users
anyway.
What's
the
harm
of
already
doing
this
cookies
exist,
don't
they
etc,
etc.
C
I
think
this
is
the
way
of
just
doubling
down
on
the
problem,
to
put
put
it
differently
when
you're
digging
and
stop
digging
or
when
you're
in
a
hole,
stop
digging
figure
out.
Freeze
the
problem
start
thinking
our
ways
to
mitigate
the
problem
as
is,
and
don't
and
don't
entrench
the
problem
indefinitely
again,
I
can
I'm
happy
to
say
more
specifics.
If
this
is
a
topic
of
Congress
of
interest,.
C
One
thing
that
we
see
over
and
over
again
is
is
a
way
of
kind
of
pitching
the
problem
forward
to
say
we
know
this
part
of
this.
This
new
standard
introduces
a
privacy
harm,
but
we're
working
on
the
standard
that's
coming
down
at
the
pipeline
in
the
future.
That
will
fix
that
problem.
I
think
this
is
extremely
harmful.
C
Not
only
does
it
type
the
future
authors
hands
and
what
now
they
have
to
do
to
address
the
current
problem
being
introduced,
but
it
gives
you
know
it
makes
it
extremely
difficult
to
evaluate
the
privacy
harm
by
the
standard
you're
considering
right
now,
which
is
to
say
something,
may
change
in
the
future.
That
is
not
a
basis
to
judge
something
that's
gonna
be
introduced
today.
C
Second
point
I
think
is
worth
keeping
in
mind
when
evaluating
standards
is
that
the
idea
of
formalizing
bad
practice
has
at
least
some
some
appeal
in
that?
You
can
say
well
if
we
can
get
all
the
bad
uses
to
use
this
new
API
instead
of
the
old
API,
then
we
can
reason
about
that.
New
API
use
in
some
semantically
valuable
way.
C
I
think
this
is
not
useful,
because
what
ends
up
happening
is
that
actors
use
both
api's
instead
of
just
a
new
one,
and
the
last
is
this
kind
of
what
I
think
is
this
kind
of
like
judo
move
where
people
say
well
site
authors
use
this
people
like
sites
and
so
users,
indirectly
wanna.
You
want
this
to
exist.
I
think
this
is
totally
not
helpful.
C
It
is
important
to
consider
the
site
site.
Authors
needs
the
first
and
foremost,
is
the
person
using
operating
the
software
and
to
recognize
those
interest
at
verge
frequently
and
to
consider
the
harm
to
the
user,
not
to
nebulously
users
in
general,
so
some
last
takeaway,
some
last
things
I
hope
to
keep
in
your
mind,
is
that
the
amount
of
standards
getting
pushed
through
is
just
extremely
difficult
to
be
able
to
reason
about
privacy,
wise
and
so
to
a
first
approximation.
C
The
best
thing
we
can
do
for
privacy
is
just
just
slow
the
roll
a
little
bit
and
to
give
things
a
little
bit
time
to
percolate
into
to
percolate
through
the
review
processes.
Second
think
about
complexity
in
term
of
itself
as
a
privacy
harm.
It's
not
adding
anything
new
to
the
platform
brings
some
risk
and
also
brings
some
reward,
but
to
not
think
of
it
as
there's
no
privacy
harm
that
I've
identified
it.
So
it's
fine
to
add.
C: A lot of the work is standardizing things that have already shipped, and at that point it's nearly impossible to pull them back in. So figuring out some way to reason about things before they get out the door, or at least earlier in the process, is probably a place worth digging into. Okay — I'm happy to discuss any of this further, but I just want to say thank you very much for your time. I'm Pete Snyder, privacy researcher at Brave, and I'm here to try to help do better on privacy in standards. Thanks.
A: So I have one. You mentioned many examples in which standards — or specifications — could at least potentially be improved to benefit the privacy of those implementing them, or of the users using the things that are implemented. What sort of concrete or tangible steps can either the IRTF or the IETF take to move in that direction?
C: Sure — next steps. In the W3C we have the idea of horizontal review boards, or horizontal review groups: at different iterations in a standard's lifetime, from conception to recommendation, groups like PING — the privacy group — and other groups, like accessibility, are expected to give input, or at least have the option of doing so, at those points. I think formalizing that process and making it stronger would be an extremely useful way of letting interested and concerned actors get involved earlier in the process.
E: Most of the examples you mentioned are taken from the world of the W3C, which I'm not familiar with. Besides Client Hints, do you have in mind IETF protocols with the same sort of problem? Because in theory, in the IETF, people should write security considerations, including privacy — we have RFC 6973 — so in theory the problem should not happen in the IETF; but of course it does. Do you have specific examples in mind from your experience?
C: I don't have any examples from the IETF world specifically that I would feel very confident talking about, beyond an outside view. But I would say that W3C standards also have these privacy and security considerations sections, and, as a concerned vendor, I think those are step one of what needs to be a ten-step road.
F: Lemon here — thanks for presenting, this is really helpful. One of the things I've noticed people do to try to mitigate risks of this sort is to have essentially a list of things that a particular site is allowed to do. This really wouldn't apply to IETF standards, but it sort of seems like it applies to the stuff you're actually talking about here.
F: So, basically, a set of entitlements — does that make sense to you as an approach? Because the problem with slowing down the advancement of progress, so to speak, is that it's very difficult to do: there's always somebody who wants the new feature, and what do you say to them? The new feature is probably totally privacy-violating, and there really isn't a mechanism for preventing it from being exposed to the user when the user doesn't want it to be.
C: I completely agree that figuring out what sites should be allowed to do, at any given point, is extremely difficult. I slightly disagree that there aren't ways for vendors to enforce those choices. A couple of examples came up before; others might be policies determined by the browser offline, or shipped with the browser, given some known set of safe sites.
C
Permissions
are
knocked
around,
but
but
not
useless
way
of
going
about
this
user
gestures,
where
the
frame
is
where
the
code
came
from,
which
is
not
a
variable
in
any
standard
currently,
but
not
what
frame
is
it
executing
in,
but
what
who
delivered
it,
etc,
etc,
etc.
So
I
think
I
think
I
think
is
more
ivan
either
for
hope
than
well.
G: Giri Mandyam, Qualcomm. Calling back to six years ago, when I was chairing the Geolocation working group in the W3C: we had a discussion, related to PING advice on the topic, about whether a web page — or a web service provider — could declare to the user what their intentions were with respect to any information the user would provide, and we couldn't figure out a way to actually do that without it being abused by rogue parties.
G
I'm
wondering
now,
though,
you
know
when
you
look
at
the
only
innovations
such
as
certificate
transparency
and
we're
getting
better
and
better
authentication
into
the
browser
all
the
time
respective
websites,
whether
whether
browser
based
policies
with
respect
to
individual
websites,
could
actually
take
the
place
of
having
to
specifically
advise
the
user
from
the
service
service
itself.
So
I
was
wondering
what
your
thoughts
are
on
this
well.
C
I
mean
I,
think
geolocation
in
the
browser
is
actually
a
positive
example.
I
mean
that
is
a
of
the
many
ways
that
user
privacy
is
harmed.
That
is
not
often
one
of
them
because
it
very
explicitly
says
the
users
understand
what
that
means,
and
they
have
they
opt
in
and
I.
Think
of
the
hundred
things
that
makes
me
concerned
that
that's
not
one
of
them
I
think
for
that
reason,
I
think
the
idea
I'm
not
sure,
there's
any
solution
to
the
concern
of
the
website.
Saying
I'm
only
gonna
do
this
with
that
information.
H: Okay, hi everyone. I'm Sandra, a PhD student at EPFL, and today I'm going to present some work done by myself and my co-authors on traffic analysis of encrypted DNS. I'm going to start by jumping straight to the conclusion: we did a number of experiments on traffic analysis of DNS-over-HTTPS traffic, and we found that monitoring and censorship are still feasible even in the presence of encryption, and that currently proposed EDNS-based countermeasures against traffic analysis are not sufficient to prevent such attacks.
H: In this talk I'm going to describe some of these experiments in detail. When a client connects to a destination host, this is generally preceded by a DNS lookup. As we all know, there are measures in place to encrypt the connections between the client and the destination host, but DNS lookups have so far been sent in the clear, which makes them susceptible to eavesdropping and tampering.
H: For example, an adversary on the path between the client and the recursive resolver can get some idea of the browsing history of the user, which is a privacy concern. There is also censorship based on DNS. Various measures have been proposed before to improve DNS security.
H: You have measures such as DNSSEC, which provides authentication but not confidentiality, and measures such as DNSCrypt, which allows encryption but did not see much widespread adoption. Over the last couple of years, the protocols DNS over TLS and DNS over HTTPS have been gaining traction.
H: The scenario we are looking at is an observer monitoring the connection between a client and a recursive resolver. The user visits a page, the observer extracts some features from the DNS-over-HTTPS (DoH) traffic, and the observer tries to guess which web page is being visited by the user.
H: We consider two adversary goals here, monitoring and censorship, and I'm going to speak about each of them. As I mentioned, the goal of a monitoring adversary is to look at the DoH traffic, extract some features, and try to identify the web page visited by the user. For this we built a classifier based on size and directionality features of the DoH traffic. I won't go into the details of the classifier — they're available in our paper — but we conducted two experiments.
H: In the first experiment, we considered a case where the adversary knows the entire set of pages that a user visits, and the goal of the adversary is to identify which particular web page was visited. In this experiment we considered a set of 1,500 web pages, so a random classifier would guess the right page with probability one in 1,500. What we see is that our classifier gets about 90% precision and recall, where precision is a measure of the correctness of the results
H
That
word
is
turned
by
the
classifier
and
recall,
says
how
many
relevant
results
were
returned.
So
when
the
classifier
has
a
high
precision
and
recall
score,
this
means
that
not
only
did
the
classifier
identify
a
large
number
of
web
pages,
it
did
so
correctly
in
the
second
experiment.
We
consider
a
bit
more
realistic
scenario
where
an
adversary
does
not
know
all
the
set
of
web
pages
that
are
visited
by
the
user.
H
Rather,
the
adversary
has
has
is
interested
in
a
subset
of
the
webpages
called
the
monitored
set,
and
the
goal
of
the
adversity
is
to
determine
whether
the
user
visited
a
page
in
this
monitored
setting.
So
for
this
experiment,
we
looked
at
a
set
of
5,000
web
pages,
where
1%
of
the
web
pages
were
in
the
monitored
set.
This
is
generally
a
harder
classification
problem
in
the
area
of
website
fingerprinting,
and
we
see
that
we
get
a
lower
precision
recall
score
of
about
70%,
but
this
is
still
much
higher
than
a
random
case.
H: The second goal we considered was censorship, and we did a preliminary analysis of a censoring adversary. The goal of the censoring adversary is slightly different: the idea is to identify a web page as fast as possible and then try to block the connection. This means that the entire DoH trace will not be available to the adversary; the adversary has to work with partial traces.
H
So
what
we
found
out
in
our
analysis
is
that
generally,
the
fourth
TLS
record,
usually
curls
ons
to
the
first
doe
query
in
our
traces
and
the
size
of
the
TLS
records,
also
has
connection
to
the
length
of
the
domain
name.
So
this
means
that
one
strategy
that
an
adversary
could
follow
would
be
to
block
on
the
first
query,
this
means
that
the
user
will
not
be
able
to
load
the
page.
H
The
disadvantage
of
this
method
is
that
it
could
result
in
high
collateral
damage,
because
other
pages
with
the
same
domain
length
could
also
be
blocked
as
a
result
of
the
strategy.
Another
thing
that
we
found
out
was
that
by
the
fifteenth
record,
or
so,
which
corresponds
to
approximately
15%
of
a
trace
length,
most
of
the
traces
in
our
set
were
distinguishable,
so
the
adversary
could
follow
a
strategy
where
they
try
to
block
after
having
a
high
confidence
that
this
is
the
trace
that
they
want
to
block.
H: We also did a number of experiments on the robustness of the attack — by which I mean: when different aspects of the experimental scenario change, how does the adversary keep good classifier performance? For example, DNS traces can vary over time, so how often does the adversary have to retrain the classifier? We also wanted to see the effect of client location on the classifier, as well as changes in infrastructure. By infrastructure I mean we changed the resolver: we experimented with Cloudflare's and Google's resolvers.
H
We
looked
at
cloud
flat,
standalone,
doe
client
as
well
as
Firefox's
in
Bill
client.
And
finally,
we
did
some
analysis
of
desktop
versus
Raspberry
Pi
environments
and
the
main
takeaway
is
that
for
best
performance.
Ideally,
you
would
train
a
classifier
that
is
tailor
to
that
particular
scenario.
If
you
use
a
classifier
trained
on
one
set
of
parameters,
the
classification
performance
does
drop
when
you
test
on
another
set,
but
it
does
not
stop
the
attack.
H
So
what
we
saw
from
my
initial
set
of
experiments
are
that
monitoring
and
censorship
are
feasible,
even
when
DNS
traffic
is
encrypted.
So
we
looked
at
countermeasures
to
prevent
traffic
analysis
attacks.
So
the
first
thing
we
looked
at
was
edn
airspace
countermeasures
where
a
DNS
is
extension
mechanisms
for
DNS,
so
one
of
the
options
in
a
DNS
is
a
padding
option
which
allows
you
to
add
some
padding
to
DNS
queries
and
responses,
and
the
idea
behind
this
is
that
you
remove
the
size
information
that
is
available
for
the
classifier
to
distinguish
web
pages.
H
So
the
first
thing
that
we
did
was
we
implemented
padding
of
DNS
queries,
so
we
used
cloud
flash
stand-alone,
doe
client
for
this,
and
we
implemented
a
recommended
padding
strategy.
So
the
RFC
there
has
padding
strategies
and
the
recommended
one
is
to
pad
queries
to
multiples
of
128
bytes,
which
is
what
we
implemented
on
cloud
flask
line.
H: We had also contacted Cloudflare with the initial set of results, and they implemented padding of responses on their resolver. However, they padded their responses to multiples of 128 bytes, whereas the recommended strategy is to pad them to multiples of 468 bytes, so we decided to compare both strategies as well.
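The block-padding arithmetic behind these strategies is simple enough to sketch. The example message sizes below are invented for illustration; the block sizes (128 bytes for queries, 128 vs. 468 bytes for responses) are the ones discussed in the talk.

```python
import math

def padded_length(msg_len, block):
    """DNS message size after EDNS(0) padding to a multiple of `block`.

    Simplification: the few bytes of the padding option's own header
    are ignored; only the rounding-up behavior is shown.
    """
    return block * math.ceil(msg_len / block)

# Query padding (multiples of 128 bytes) hides the domain-name
# length inside a 128-byte bucket:
assert padded_length(47, 128) == 128
assert padded_length(130, 128) == 256
# Response padding at the two block sizes being compared:
assert padded_length(500, 128) == 512   # 128-byte blocks
assert padded_length(500, 468) == 936   # recommended 468-byte blocks
```

Larger blocks put more distinct message sizes into the same bucket, which is why the 468-byte response strategy resists the classifier better than 128-byte blocks — at the cost of more overhead.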
H: So the experiments we ran were the two EDNS-based measures I just described. We also looked at a case we called constant padding: a simulated scenario where we wanted to see, if we had perfect padding —
H
That
is,
if
all
the
TLS
records
work
to
the
same
size,
how
the
classifier
would
perform.
So
we
basically
set
all
the
sizes
to
the
maximum
possible
value
that
we
saw
in
our
trace
and
apply
the
classifier
to
this
case.
And
finally,
CloudFlare
has
a
dns
over
tall
service
where
DNS
queries
and
responses
are
sent
over
the
Tor
network.
H
So
we
decided
to
experiment
with
this
service
as
well
to
see
how
anonymous
communication
acts
as
a
defense,
so
this
table
outlines
the
performance
of
the
classifier,
just
note
that
the
values
are
as
decimals,
not
as
percentages
here
for
for
comparison
without
any
counter
measure
the
classify
attains
about
90%,
precision
and
recall.
We
see
that
with
a
DNS
with
the
current
CloudFlare
strategy,
precision
and
recall
drops
to
only
about
70%
with
the
recommended
padding
strategy
it
drops
to
about
45%.
H
Both
these
values
are
much
higher
than
a
random
case,
which
shows
that
a
DNS
based
measures
and
do
not
eliminate
traffic
analysis.
If
you
look
at
constant
padding
it's
about
7%,
so
there
is
a
major
drop
in
the
performance
and
DNS
over
tor
achieves
the
best
results
with
about
3.5%
precision
recall
we
also
looked
at
the
overhead
in
terms
of
amount
of
additional
traffic
that
is
added
by
these
counter
measures.
H
So
we
did
a
very
short
experiment
where
we
took
50
web
pages
and
about
6
samples
per
web
page
and
applied
each
of
the
counter
measures
and
looked
at
the
total
volume
that
is
sent
and
received.
Bytes
of
the
TLS
records.
Just
note
that
the
y-axis
is
in
log
scale
here
we
see
that
the
e
DNS
face
measures
as
expected,
do
not
add
much
overhead.
Our
constant
padding
adds
a
lot
of
overhead
because
we
are
padding
everything
to
the
maximum
size
and
DNS
over
tor
is
somewhere
in
between.
H
So
we
see
that
tor
is
a
effective
defense
for
the
traffic
analysis,
attacks
and
the
reason
so
that
interview
the
data
sent
in
fixed
cell
sizes,
which
reduces
the
variability
of
sizes
and
of
exercise
related
features
of
the
classifier.
Another
thing
is
that
there's
repackage
ization,
and
by
that
we
mean
the
data,
can
be
merged
or
bundled
together
in
tall,
and
this
affects
the
directionality
features
we
look
at
which
records
have
been
sent
and
received,
and
you,
when
thought
does
this,
this
affects
the
directionality
features
used
by
the
classifier.
H
One
thing
that
we
are
not
been
able
to
explain
the
clusters
that
we
saw
in
the
confusion
graph,
so
this
graph
shows
web
pages
that
have
been
mislabeled
as
one
another,
and
what
we
saw
is
that
web
pages
generally
tend
to
be
clustered
where
web
pages
within
the
same
cluster
and
to
be
misclassified
as
one
another.
This
means
that
the
anonymity
set
for
a
particular
web
page
is
not
the
entire
set
of
web
pages
in
the
test,
but
rather
only
the
web
pages
within
a
particular
cluster.
H: As for future work: first, our experiments looked at the case where a user visits one page after another, which is not exactly a realistic user scenario. So we are considering the case where a user has multiple tabs open, which results in some interleaving of the DoH traffic; our initial results show the classifier gets about 40% precision/recall with two tabs. Another thing is that we currently consider the case where there is no caching of DNS records.
H
So
we
want
to
study
the
impact
of
caching
as
well
on
the
classifier.
We
also
started
doing
a
comparison
with
DNS,
so
a
TLS
traffic,
and
we
looked
at
the
padded
DNS,
so
a
TLS
traffic,
and
we
see
that
it
is
much
more
resistant
to
the
classify
it's
about
28%
as
compared
to
dough.
So
we
have
started
doing
or
feature
analysis
to
see
why
this
is
the
case.
H
And
finally,
we
want
to
see
if
we
can
have
counter
measures
which
include
both
padding
and
rhe
packetization,
but
without
tors
overheads
and
latency
and
volume
caused
by
headers.
So
this
is
basically
the
summary
that
currently
proposed.
Ddns
measures
might
not
be
enough
to
prevent
the
traffic
analysis,
and
these
are
some
links
to
our
paper.
Thanks.
K: I wanted to encourage this kind of work, so I'm really happy that you've done it, and it's great to see the results. The way we arrived at the recommended padding policy was basically to look at individual DNS queries and responses, and I like what you've done here, which is to look at them in combination. So I have a couple of questions — maybe you can give me your intuition. I think what you're saying with the constant padding arrangement is that the…
H: Yes — there are a couple of things. It also depends on the kind of features we used. One of the things we looked at: if you have a trace, we looked at whether each TLS record was going from client to resolver or from resolver to client, so you have a sequence of directions. Even if you remove the sizes, you still have this directionality as a feature.
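The directionality feature just described can be sketched as follows. The trace below is invented for illustration; the point is that even perfectly padded sizes leave the query/response ordering intact as a signal.

```python
def direction_sequence(records):
    """Reduce a DoH trace to directionality-only features.

    records: list of (size, direction) pairs, where direction is
    'out' (client -> resolver) or 'in' (resolver -> client).
    With sizes padded away, this +1/-1 sequence is what remains
    for the classifier.
    """
    return [1 if direction == "out" else -1 for _, direction in records]

# Two queries back-to-back before a response show up in the pattern
# regardless of how the record sizes were padded:
trace = [(128, "out"), (468, "in"), (128, "out"), (128, "out"), (936, "in")]
assert direction_sequence(trace) == [1, -1, 1, 1, -1]
```

This is why Tor's repacketization matters as a defense: merging records changes the direction sequence itself, not just the sizes.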
M: (inaudible), very cool work, I really appreciate it. One of my questions has already been discussed, but the other one: our hypothesis was that if you mix DoH traffic with normal web traffic, that would obscure the signature of the DNS traffic a lot more. Is that something you considered studying as well?

H: Yes.
H: I think it might change the results, especially depending on how the classifier has been trained. If it's trained on traces where individual queries are in individual records, then a mix of traffic of course changes the pattern of the traffic, so I think it would affect the performance of the classifier. Yes — thank you.
Q: Nice work again. Did you consider emulating the behavior of a national security agency, or an intelligence agency — that is, you have a set of websites that you know they don't want you to go to, you try to fingerprint them with whatever your method is, taking domain names from various lists or entire DNS zones, and you see whether, in an extreme case like that, you could detect users going to those allegedly forbidden web sites?
H: We did an initial feature analysis, and we did consider inter-packet arrival times. At the time we didn't see much gain — I mean, timing did increase the precision and recall, but not by much — and we generally found that using timing as a feature can be complicated, because it depends on the position of the adversary: whether the adversary is located on a router, or whether you're doing the measurements right on the client or not.
A
It was presented at the ANRW workshop earlier this week. So, just to get right into it: as you may know, and as was just described, there is a recent shift in focus on privacy in the IETF and IRTF in general. In particular, we're trying to protect what resources, what applications, or what services particular clients are using; and the converse, the opposite side of that, is that we're trying to protect who is accessing these particular resources and using these services and connections.
A
Clearly the former is easier than the latter; or rather, the former is harder than the latter, because we currently have very distinguishing identifiers, like IP addresses and other things that are in client software. So that's what I'm going to focus on, and the IETF has been doing a lot of work to push in that direction, in particular rolling out DoH, DoT, and encrypted SNI.
A
It can observe all packets between the client and the server, and the goal is pretty simple: it just wants to learn some information about that particular connection, be it what resource was actually requested, or perhaps some metadata about that resource (referring back to the paper, say, what HTTP method was used or sent over this particular connection), and optionally,
A
they may also want to link this back to a particular client. I should also note that in the real world this is generally assumed to be studied in what we call the open world model, where you train your classifier, or whatever it is you're doing, and do your initial preliminary assessment on a fixed set of connections; but what the actual clients do in the wild is sort of unpredictable, and so you're not constrained to
A
what you trained your initial experiments on. The closed world model, which is generally considered to be much easier in this particular problem space, and which is what we actually studied for this particular work, of course gives better results for classification and identification; but ultimately the goal was to focus on the open world. And as was presented earlier, there are lots of different features available when you want to do this sort of identification. In particular, you can look at network addresses,
A
you can look at packet timing and sizes and all these things, or even the cleartext information that was previously available, or is currently available, depending on what client software you're running. In this work, though, we're assuming that the things that should obviously be encrypted, like DNS and SNI, are encrypted. We're assuming that the adversary is not looking at packet timing and sizes, but is strictly looking at network addresses to try and figure out what particular website or what service a particular connection is trying to access.
A
This is to kind of support the claim that it becomes increasingly difficult as you take away the cleartext information on the wire. So in the current state, assuming you're running perhaps some older software and resolving, or opening up a particular HTTPS connection: I have a client who sends a cleartext query to a DNS resolver and gets back an answer in cleartext. The adversary in the middle obviously knows where they're going, because they see both the query and the address; everything's fine, at least from the adversary's perspective. And then the client opens up a TCP connection and a TLS connection,
A
everything again is exchanged in the clear, and ultimately the resource that they're after is encrypted. But depending on the adversary model, they've already learned exactly what application the client is after, so it could potentially be game over at that particular point, especially if the goal is to censor based on that signal. But if you add DoH, DoT, and ESNI into the mix, then lots more things are encrypted.
A
For a particular page, you do the resolution for all of those names, the top-level domain and all the subresource URLs using CDNs, collect all this data, and then try to see how unique these IP addresses are; fairly straightforward for this particular experiment. And if you look at the anonymity set that results from that particular experiment, the data kind of suggests that basically there are many
A
of what we call unique IP addresses: those that fall into an anonymity set of size one or two. You may think this is perhaps uncommon as we move towards a world in which the big cloud providers host a lot of applications, but still there is a lot of legacy infrastructure out there:
A
there are older servers that run from behind a couch, or have a unique IP address, such that simply looking at the address can reveal exactly where you're going. So that's not great. But again, take this with a grain of salt: this particular data is collected from closed world environments, so it would become harder to identify this set in the open world. As we were saying earlier, if you want to identify, for example, what actual page a client is attempting to access,
A
perhaps the logical thing would be to look at not just a single connection that's initiated when loading that page, but rather the set of connections that are initiated when loading that page and all of its sub-resources. So as the on-path adversary you see things like the DNS query patterns, you see the TLS and TCP connection patterns, and the set of these things should, in theory, be sufficient, or at least more unique than each one on its own.
A
As an example, if you loaded nytimes.com in Safari, you would see many, many TLS connections kicked off, and many, many DNS requests sent over DoH or DoT, and the union of that set is what we're using as the fingerprint for a particular page load. So the privacy of a page load then considers the same exact adversary; it just assumes that the adversary is able to bucketize or group
A
these connections from a client into a single event, and then use that to make a determination as to how unique a particular connection is, and perhaps use that uniqueness to associate it with known top-level domains. So the same features are available; we're just expanding the scope a little bit here. And that pretty much describes or shows what I was describing:
A
you go from a single connection to multiple connections, and you look at sums instead of individuals; very straightforward, and otherwise the same thing. So the experiments that were discussed in the paper didn't focus on the very large IP address set or domain set that was used for the single-connection IP address anonymity experiment, but rather just the top one million. We loaded them using a crawler and then computed some basic statistics, for example: how many unique URLs are referenced upon loading each individual page?
A
How many different domains does that kick off underneath the hood, to see how many connections you're making? And the results of doing that, looking at the number of unique IP addresses, or the number of unique page load fingerprints that came out of it, are basically here, so you can see:
A
if you look at the x axis, the anonymity set size significantly shifts to the left, basically suggesting that by looking at the sum of these connections and grouping them into individual page loads, the uniqueness goes up, which matches our intuition. Again, closed world versus open world, so this could still be improved.
A
So the conclusions we kind of draw from this very, very preliminary research, which is still ongoing, are that clearly we need some sort of encryption of the obviously cleartext protocols to get some notion of connection privacy, and we need some notion of connection privacy to get some notion of page load privacy; and I think perhaps that's the ultimate goal of a lot of these things. I mean, encrypted SNI and DoH and
A
DoT are great in that they're focusing specifically on connection privacy, but perhaps there's more that could be done to get towards the larger, bigger picture that we're trying to protect. There are a lot of related issues here, and things that were potentially not considered in this experiment. In particular, if you have a client that's doing Happy Eyeballs to race connections across address families, or even across interfaces, that might be worse in some ways, because you're simply giving the network more information about where you're trying to go.
A
So it is great for performance, as clearly demonstrated by all the clients that are implementing it and the benefits that it brings; however, from this particular perspective it might make things a little bit easier for the adversary, which is not necessarily great. On the plus side, the things that we're doing in HTTPbis to coalesce connections with secondary certificates are great, because that potentially shoves more requests along a single connection, effectively removing information from page load fingerprints that would have otherwise
A
spun up new connections and perhaps added to the amount of uniqueness that exists for a particular page fingerprint. Consolidation within a single CDN also helps, because you have single connections: basically, clients are tethering to the CDN and then sending all their connections and all their requests over it.
A
You can no longer just identify a particular service by a specific IP address. I mean, you could do fancy things by potentially trying to identify the ASN to which an IP address belongs, associating the IP addresses of particular connections with ASNs, and then looking at the union, the set of ASNs that result from a particular page load, and using that to identify it; but that has not been done yet.
A
It's also worth looking at how bad traffic analysis is for non-Tor connections. Of course, much of this research is done on Tor, but it applies equally well, if not more, to non-Tor connections, which don't do things like fixed cell padding to make traffic analysis just a little bit harder. So I think the next steps for this are to really encourage people to take this problem a bit more seriously, if they're not already, and that means asking people to do more research in this area.
A
I mean, Nikita and others are obviously doing it; we should encourage them to keep going. I think documenting the known research that's been done in the area is also quite useful. The reason for having a single reference is that we can use it to assess countermeasures, or to sanity check, to make sure you don't just relearn something that has
A
actually, indeed, been done. And then perhaps the IRTF or the IETF can work with the people who are actively working on these problems to develop mitigations, and see what's effective from a cost and performance perspective. So, for example, in the previous presentation we saw good results from sending DNS over Tor, but there's a performance hit there. Finding the right trade-off is difficult, and so perhaps that's something we should implore the IETF and IRTF to be working on.
R
Hey Chris, just a quick thing: can you go back to the second graph that you showed us, with the buckets? From this it looks like an anonymity set of two is two orders of magnitude smaller than an anonymity set of one. So if we sum up everything, that means that something like 95% of web sites can be identified? Yes.
A
No, no; just that DNS-based load balancers might give you different IP addresses if you try to resolve the same name over and over again. So a simple, naive approach of trying to map a name to an IP address and looking at the first result that comes back might not always work, because the answer will change.
M
All right, my name is Roland van Rijswijk-Deij; I work for NLnet Labs. This is joint work with my master's student and with colleagues from SURFnet and the Technical University of Eindhoven. Right, so it feels like we're repeating ourselves here, because I think somebody had a similar slide, and the person before me had a similar slide: the IETF is focusing on protecting people's privacy, which is great. We have things like DPRIVE; we have DNS over TLS.
M
But if you look at the domain name system, there is a sort of obvious elephant in the room, which is the operator of the resolver: even if you protect all the traffic in flight, they still have access to your traffic. And they might actually have legitimate reasons for looking at it. If you run an enterprise network, you may want to use DNS to detect indicators of compromise, people that are infected with malware on your network, or you may want to be able to monitor for threats in large user bases.
M
For example, Quad9 does that: they have a very strict privacy policy, but they still inspect traffic in order to detect malicious behavior in the larger population that uses their resolver. So it is probably too easy to say they shouldn't do this. What we wondered is: can we find a better way of doing that, one that provides some privacy guarantees for users while we're inspecting traffic on the resolver itself?
M
So what we did is: my master's student developed a potential solution for this, which uses something called Bloom filters, which some of you may be familiar with, but I'll explain them in the next couple of slides. More importantly, we have a working prototype, which is open source; the URL is on one of the last slides if you want to have a play with it. It's very much a prototype, but it gives you some idea of what it does, and we tested this in production at SURFnet.
M
That's the national research and education network in the Netherlands, on resolvers that have a client base of roughly 200 to 250 thousand users. Right, so first up, Bloom filters. Bloom filters were developed in the 1970s; basically, they are a method to speed up database lookups, and they are a highly efficient mechanism: insertion and lookup are roughly order one. Basically, you can think of a Bloom filter as a probabilistic set.
M
An element may or may not have been inserted into the filter, and if you query the filter, then: if the filter says it's not in there, it is guaranteed not to be a member of the set; if it says yes, this is a member of the filter, then there is a small probability that this is a false positive, but it's highly likely that that particular element is in the set.
M
So how does this work? You take, for example, a domain name; you run that through a set of hash functions and you get some output; this is actually the SHA-256 hash of that particular domain name. You use parts of the hash, or the outputs of multiple hash functions, as indices into a bit array, and then you flip the bits at those indices to one. So on the left-hand side you see insertions: we insert example.com and example.org, as an example.
M
Now the idea that we had was that what you could do is take all of the queries from your clients and insert these into a Bloom filter. And this is actually a methodology that's already used to find things like newly observed domains; for example, I think PowerDNS has an implementation that does that. But what we wanted to do was use this information to check
M
if a name was queried for, while not being interested in whom that name was queried by, or exactly when. What we want to do with this is perform network-level threat monitoring. So we want to say: we have a domain name that we know is malicious; we want to be able to say, within a certain time frame, did anybody ever query that name in our network? And Bloom filters give you some nice properties for that, because they are non-enumerable: as soon as I've inserted
M
something, I have no clue what I inserted anymore, because it's turned into a few random bits that I flip in a filter. If I mix queries from lots of users into a single filter, it becomes really hard to distinguish queries from individuals. And what is also interesting is that, due to the mathematical properties of Bloom filters, we can actually take multiple filters and combine them into a single filter, which has a higher false positive probability but contains more data, and thus anonymizes even more.
M
So what the prototype that my student developed does is: it has sort of an auto-tuning mode. You can run it against your resolver for, say, a few days, to see what your query pattern looks like, and then it will suggest Bloom filter parameters that you can set in order to get a certain false positive probability.
M
So we, for example, wanted to do hourly filters in our experiment, and we wanted to be able to aggregate these into a single day; that is, when we do aggregation we want to combine 24 hourly filters into a single filter for a day, and we would like to have a false positive probability of one in 100,000 for the daily filters. And what the graph shows you is that it's important to do the tuning, because the number of distinct query names that you see in a day on a resolver is quite different from what you see on an hourly basis.
M
So what do we put in there? What we wanted to be able to do: SURFnet is a large research network; it has many universities connected, and it has schools connected to it too. We wanted to be able to distinguish queries from different institutions, but not from different users. So the things that we insert in the filter are on this slide: if we have, say, evildomain.com, then we will insert all of the labels, but also "organisation-A@evildomain.com" for the specific network that the query came from. This means that if we get an indicator of compromise and we want to work out whether that indicator of compromise was ever seen on the network, we can hold it against the Bloom filter, which tells us if it was seen, and then we can sort of enumerate over all of the institutions that are in there and figure out which institution sent us that query.
M
First, a little bit about the predicted versus the actual false positive rate for the filters. We ran auto-tuning for a week before we did our experiment, and we chose as filter parameters that we would set 10 indices for every query that we get, and the filter size was 491 megabits, which is roughly 59 megabytes in memory; so it's actually quite reasonable
M
if you compare it to the resolver cache on that machine, which was about two gigs. So our goal was to keep the daily false positive rate below one in a thousand, and of course we had to estimate the number of elements that we would insert, a little bit of hand-waving; but after you've used a filter, you can calculate the actual false positive probability. The formula is given there; it's explained in the paper, and forgive me, I can't remember what s is. But the graphs show you the result.
M
The black line is ten to the minus three, so that's one in a thousand. The graphs are a bit confusing: anything above the red line means we had a lower false positive probability than we actually set. So the takeaway from this is that we actually had a very good false positive rate on the Bloom filters, and the same goes for the hourly filters, because we used the same parameters for the hourly and the daily filters; otherwise we can't aggregate them.
M
Now, one of the things that we tested this with is the National Detection Network, and this is something that is managed by our government, the Dutch National Cyber Security Centre. What they have is a system that runs MISP, the malware information sharing platform, I think the acronym is, and what they put in there are high-value indicators of compromise, for example indicators of compromise from the intelligence services. Now, a condition for participating in this National Detection Network
M
is that you don't just take data from it, but you also put data back into it. So say you get an indicator that there is some malware active; what they want to know is, how does this affect your community? Because their goal is to figure out how society is impacted by these threats. And of course SURFnet wanted to participate in that, but they didn't want to monitor all of their individual users;
M
they didn't want to throw away the privacy of all users in order to participate in something like this. So this particular solution that we implemented was very interesting for them, because it allowed them to take the indicators of compromise, hold them against the Bloom filters, and figure out if there are hits. They could report those back to the National Detection Network, but they also got some indication of what threats there were in the network. The graph shows you the number of threats that occurred on a daily basis.
M
So it's not a huge number of threats, on the order of 40 to 50 unique threats per day. But the interesting thing about this is that SURFnet's privacy policy prevented them from monitoring individual queries, so they couldn't do this before; and now that they had the Bloom filter solution, they could do these lookups and actually work out whether these threats were occurring in the network. And we found an actual compromise, which was a WannaCry-infected machine.
M
So, rather than doing blanket surveillance, we can look for a specific threat and then see who is infected and chase up that machine. This is much less invasive for users than just monitoring all of their traffic and telling them, no, we're doing this for your own good. In this case we can make a balanced decision whether or not to monitor for something. Some other benefits: no personal data is stored, of course, because we don't retain any IP addresses.
M
This is aggregated at the individual institution level, but it means that you can retain this data much longer, which allows you to do historical lookups, which we think is very interesting. Think back to, for example, the WannaCry case, where you could recognize that this threat was present on your network because it would do certain DNS queries: if we had had this running at the time, before that threat existed, then as soon as the malware researchers discovered that particular query, we could have gone back in time and worked it out.
M
Another thing, and this is something I as a researcher find very interesting, is that you could share this data with third parties without disclosing PII to them, and then they could, for example, take filters that are collected in different networks and do co-occurrence analysis of queries: if I found this query that I think is malicious, which other networks did I see it in, and was that roughly at the same time?
M
And the final thing is that, as a nice side effect of how Bloom filters work, you can do cardinality estimates of the number of distinct queries that are in there; you can estimate that because it's related to how HyperLogLog works, if you're familiar with that. Right, so the prototype code has been released as open source; the URL is on the slide. SURFnet, where we trialled this, is planning to take this into production, because they want to use it for their CERT team; and at NLnet Labs
M
we're creating the tools to integrate this into our open source products, in particular our resolver product Unbound. The goal of that is, again, to release this as open source software and make it really easy to deploy, so that if you want to do this kind of network monitoring, you at least now have a more privacy-friendly solution available than just blanket recording of every query that all of your users make. And I hope it's a little bit of proof that security and privacy can go hand in hand. With that, this is the paper.
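The cardinality estimate mentioned in the talk, recovering the number of distinct inserted names from how full the filter is, is commonly computed as n ≈ -(m/k) · ln(1 - X/m), where X is the number of set bits. A small self-contained sketch with made-up parameters (deliberately tiny so it runs quickly); this is the standard estimator, not necessarily the prototype's exact code:

```python
import hashlib
import math

M, K = 1 << 14, 5  # small illustrative filter: 16384 bits, 5 indices

def indices(name):
    d = hashlib.sha256(name.encode()).digest()
    return [int.from_bytes(d[4 * i:4 * i + 4], "big") % M for i in range(K)]

bits = bytearray(M)
for i in range(300):  # insert 300 distinct names
    for idx in indices(f"name-{i}.example"):
        bits[idx] = 1

X = sum(bits)  # number of set bits
estimate = -(M / K) * math.log(1 - X / M)
print(round(estimate))  # expected to land close to 300
```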
M
So, for the National Detection Network: SURFnet is certainly not sharing the whole Bloom filter with the NCSC. What they are sharing with the NCSC is the detections that they do themselves. So say I have a threat and I detect it; we report back that we saw this threat in universities, or in, I don't know, schools for vocational education. But of course it's not the Bloom filter they're sharing; what they're sharing is whether or not a threat was detected.
S
Don't you have to take the coincidence of the Bloom filters to find out if the one filter had the vulnerabilities, the attacks, that were represented in the other? I'm not understanding, sorry. Okay, I can't hear you; can you step closer to the mic? Oh, I was just saying: isn't it the coincidence of the Bloom filters that shows you there's evidence of those attacks or vulnerabilities in one set? Yeah, but so who's the one that's finding the intersections of the Bloom filters? Because that's who you're trusting.
M
The idea is that the network operator, in this case SURFnet, who participates in the National Detection Network, does the detections on their own Bloom filters, which they record on their own resolvers, and they just report infections back to the NCSC. They don't send the Bloom filters anywhere.
V
Hi, STP from the NCSC in the UK. Thank you for your presentation; I found it really interesting, and I think it's a good piece of research that needs to happen in this space. And, like you said, it's nice to see that security and privacy can go together. I just had a question on your previous slide, I think.
V
Oh, maybe it was a couple back, sorry. Yeah, about sharing the Bloom filter with third parties: in this case I would just be concerned about the potential leg-up that you would give a threat actor. In knowing that their domain or whatever was noticed, they could just change their techniques. Threat intelligence is shared under the TLP protocol at amber, and, just to say, when you share it with researchers, I'd worry about giving it to any academic that, you know, fancies it.
M
Okay, so of course, when I say sharing with third parties, it should have a little asterisk that says "under certain conditions", right? I am an academic researcher, so I don't worry so much about sharing this with academic researchers; I mean, trust me. But you do have a point: you want to have certain safeguards in place before you share this kind of information.
M
I guess it also depends on the network that you collect this information in. SURFnet is a research network, so one of its goals is also to do research on the network itself. If you share that with researchers that are within, say, their constituency, then there could be good conditions for doing that. Actually, SURFnet has a data sharing policy that lists conditions under which this kind of data can be shared with researchers.
M
This was sort of easy to understand, and her intuition was that, for example, this would not be subject to the GDPR, because of the nature of the data that gets put in; and that meant that, under certain conditions, we could share the information, even though there is some privacy risk: you can just try sending whatever question you have to the filter and figure out if something's in there. Does that sort of answer your question? Sort of. Okay, any other questions?
W
So, essentially, we work on a set of three documents. These three documents used to be a single large document, and based on the advice of a number of groups we split that document into three. The first document is "Numeric IDs: A History", in which what we try to do is cover the timeline of some old numeric identifiers; this document targets PEARG. The second is about proposing algorithms to generate these numeric identifiers.
W
So, the first document, the numeric IDs history: essentially, we cover some sample numeric identifiers, IPv6 interface identifiers, IPv4 and IPv6 fragment identifiers, and so on, and what we tried to do with this work is essentially to illustrate how we faced the same problem over and over again; sometimes for the same identifiers in different protocols, like fragment identifiers in IPv4 and IPv6, and sometimes the same underlying problem but for different protocols.
W
This document, again, targets PEARG. Next slide, please. The second document is a little bit more complex: essentially, what we try to do is categorize numeric identifiers based on their interoperability requirements and their failure modes, interoperability requirements that, you know, can become quite different based on the protocol.
B
And I think the third draft is under consideration for AD sponsorship by Ben Kaduk. We spoke to the security AD and also to the author of the drafts, and it seems like the first two drafts, history and generation, are in scope for, and could benefit from, coming to PEARG, and we were thinking of adopting them. So, do people have opinions?
X
I'm David Oliver with the Guardian Project, and I'll talk about an Internet-Draft we've submitted recently on enabling network traffic obfuscation via Pluggable Transports. So, first of all, what are Pluggable Transports? A mechanism for enabling the rapid development and deployment of network obfuscation techniques used to circumvent surveillance and censorship. The deep details about this work, which has been going on for some time now, are available at this URL.
X
The generalized architecture here is that there is a server exposing a public proxy that accepts connections from Pluggable Transport clients. The client transforms traffic before it hits the public Internet; the PT server reverses the transformation and then passes the traffic on to the server app. There's also an optional lightweight protocol to facilitate communicating connection metadata, when you're migrating between one connection type and another, for example.
X
The draft that we've put together is based on the Pluggable Transports 2.1 specification, which is the work of a fairly large and diverse community of people, and it has two subsets: one is what we call the transport API interface, and the other the dispatcher interface. The first is focused around an in-process, language-specific API
X
That's integrated directly into the client app on the client side and into the server app on the other side, and communication happens within the app; the way the app on both sides sees the pluggable transport is like a socket. The dispatcher API, by contrast, is to be used between processes: there's another process on each side that handles the work of obscuring and un-obscuring, so that the actual application doesn't have to deal with that aspect of the problem.
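A minimal sketch of the transport-API style of integration: the app links the transport in-process and sees it as a socket. The class and method names below are hypothetical stand-ins, not the PT 2.1 API, and the XOR transform is a toy placeholder for a real obfuscation technique.

```python
import itertools
import socket

KEY = b"illustrative-only"  # placeholder, not a real obfuscation key

def transform(data: bytes) -> bytes:
    # Toy reversible transform; applying it twice restores the input.
    return bytes(b ^ k for b, k in zip(data, itertools.cycle(KEY)))

class PTSocket:
    """Hypothetical in-process transport: looks like a socket to the app,
    but obfuscates on send and de-obfuscates on receive."""
    def __init__(self, sock: socket.socket):
        self._sock = sock

    def sendall(self, data: bytes) -> None:
        self._sock.sendall(transform(data))

    def recv(self, bufsize: int) -> bytes:
        return transform(self._sock.recv(bufsize))

# The app writes plaintext; only obfuscated bytes cross the "wire".
a, b = socket.socketpair()
PTSocket(a).sendall(b"hello")
raw = b.recv(1024)                 # what an on-path observer would see
assert raw != b"hello"
assert transform(raw) == b"hello"  # the peer's transport reverses it
```

The point of the socket-like surface is that the application code is unchanged: it calls the usual send and receive operations, and the transform happens inside the wrapper.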
X
The dispatcher here can be configured with different types of proxies, so you can have live at any moment different kinds of transports available that are connected to different kinds of proxies and can respond to different kinds of traffic. So this is sort of that architecture described in graphic form.
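The dispatcher pattern can be sketched as a separate forwarder that the app talks plain TCP to, with the transform applied in between. In a real deployment the dispatcher is its own process and typically exposes a proxy interface such as SOCKS; here threads, a loopback "wire", and the same toy self-inverse XOR transform stand in, and all names are invented for the example.

```python
import itertools
import socket
import threading

KEY = b"illustrative-only"  # toy placeholder, not a real key

def transform(data: bytes) -> bytes:
    # Self-inverse toy transform; a real transport would be stateful
    # and cryptographically sound.
    return bytes(b ^ k for b, k in zip(data, itertools.cycle(KEY)))

def pump(src: socket.socket, dst: socket.socket) -> None:
    """Copy one direction of a connection, transforming each chunk."""
    while chunk := src.recv(4096):
        dst.sendall(transform(chunk))

def dispatcher(listener: socket.socket, upstream) -> None:
    """Accept one app connection and bridge it over the transformed link."""
    conn, _ = listener.accept()
    link = socket.create_connection(upstream)
    threading.Thread(target=pump, args=(link, conn), daemon=True).start()
    pump(conn, link)

def listen_local() -> socket.socket:
    s = socket.socket()
    s.bind(("127.0.0.1", 0))
    s.listen(1)
    return s

# app -> client dispatcher -> (obfuscated wire) -> server dispatcher -> server app
server_app, srv_disp, cli_disp = listen_local(), listen_local(), listen_local()
threading.Thread(target=dispatcher,
                 args=(srv_disp, server_app.getsockname()), daemon=True).start()
threading.Thread(target=dispatcher,
                 args=(cli_disp, srv_disp.getsockname()), daemon=True).start()

app = socket.create_connection(cli_disp.getsockname())
app.sendall(b"hello")               # the app never sees the transform
peer, _ = server_app.accept()
assert peer.recv(1024) == b"hello"  # arrives de-obfuscated at the server app
```

Neither the client app nor the server app contains any obfuscation logic; each just talks plain TCP to its local dispatcher, which is the separation of concerns the dispatcher interface is meant to provide.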
X
The Internet-Draft also talks about what we learned by looking at the older specification, which was just the dispatcher API, the issues related to linking across languages, and also this idea of adapting back and forth between types of technique. And I think that's it. Questions?
X
Yeah, I guess I can give maybe my own personal response to this, because I'm not sure overall myself, but there's been a lot of work going on in this area, and in my mind it occurs in user space out of necessity, because we want these techniques to be adopted in a very rapid way. So it's unclear to me, or it needs to be debated, what sort of final position this has in some standardization work.
X
Should some of this stuff take place lower in the network stack or not? And if so, why, and if not, why, etcetera. But we do hope: this work took place for a long period of time only within sort of one small set of the community, and now we have a larger community that finds this interesting. So there seems to be a need for some bridge between some long-term standard and the current lack of unity on the topic. Thanks.
Z
Just a real quick overview, ten seconds. It orders the discussion about censorship around three things: prescription, that is, what do you want to block; identification, how do you technically identify those things that you'd like to block; and then interference, the actual performance of the blocking. There's a ton of small and medium issues that a bunch of people have identified in the reviews that are in our issue tracker, and we're gonna be going over those in the next couple of months and incorporating all that good feedback.
Z
And since this is possibly a research group draft, we're gonna actually try and do some outreach to areas that haven't gotten much review of this, so routing and DNS in the IETF, just to excerpt the chunk of it that's there and see if we get additional feedback, although some people, like Stéphane a long time ago, reviewed an earlier draft of this. The bigger issue that has been talked about on the list is this thing about mitigations.
Z
Some people feel that it would be a better draft if it actually, in addition to describing censorship techniques, included discussions of mitigations that may or may not be relevant for each technique. I was initially pretty reluctant to do that, just because it felt like it would blow up the draft, but I should say I can go either way on this, in the sense that I think pithy statements under each technique that describe some types of mitigations may work.
Z
I've forgotten his first name, I should know: Vittorio, Vittorio Bertola, said specifically that censorship may be too negative of a framing. I come from a digital rights background, where we talk about when someone's blocking something you want to get access to, that's censored, no matter what it is; but I'm happy to say "blocking techniques". Do you want to talk now? I'm almost done. There's a chunk about non-technical forms of prescription and interference, so ways of finding stuff and blocking it, like self-censorship or legal mechanisms and stuff like that. I'm not wedded to that stuff.
Z
It just felt like it would make the document sort of complete, and it's sort of the rubber-hose thing that might come into play in certain kinds of situations, and it's nice to have some rough description of what that rubber hose might look like. There's the bigger issue that this stuff moves really quickly; it may need to be a regularly updated draft. I know that that's a different discussion in other places; some people don't care about that, some people feel very strongly about that.
Z
It may be something we want to wait on for the living-standard discussion. What's the research content? You can think of it as a review article, a systematization of knowledge. There are some gaps in certain areas that we're working to fix, and then there's one other thing I wanted to say that's escaping my mind. Oh no.
D
Sorry for scrolling. So, first of all, okay, you should call it SoK-1 rather than RFC-something. Cool. More seriously, on the question of whether it should be, you know, an updated draft, I guess that's a question for you: how often do things change? If things change on the scale of every two years, then just, you know, publish this one and another one in a couple of years. If they change every two months, then probably that's a problem. So I think it's really a question for you rather than me.
L
Mallory Knodel, Article 19. Since we're solving all the bigger issues on the slide right now: for the first one, and maybe this is a question for you, Siobhan, it seems like in the charter you are open to talking about threats, not just mitigations, and I do think that censorship and privacy are two sides of the same coin. So, I don't know, maybe we can talk about that more, because I feel like it would fit within the charter; I was just rereading it before I came up to the mic, and I agree.
AA
Wendy Seltzer. Thanks for this work; I think the non-technical forms section is a valuable addition, because we're always thinking about where something is going to get routed if the technical means aren't available, and thinking about that sort of broad piece of the threat model is helpful for thinking about how effective anti-censorship mechanisms are. Thank you. Stefan?
E
Stéphane Bortzmeyer. Regarding the first point, yeah, the fact that the draft is still not published as an RFC means it's complicated to keep this sort of thing up to date. And I would vote against including mitigations, because a big problem is not only that they evolve very fast, but also that they can have consequences: bad advice can be really harmful for people, because it can, for instance, show that they tried to work around censorship, things like that. So it's much more touchy, so I suggest we stay with description only.
P
If you're still looking for inputs: absolutely, for the first one. I'm not sure how much beyond RFC 6973 we will be able to contribute, even if we go down the path of including mitigations, so, I mean, personally I'm fine with the contents as they are. On the second one, I don't think there's a need to change it. Okay, yeah, but maybe the title should reflect the scope, which is that this is not on online censorship in general; it's just Internet traffic and website censorship. So maybe just fix that.
A
Just to be clear on the question of how frequently the document will be updated: as you said, it does, you know, kind of bleed into the living-document issue, so we'll discuss with Colin what is, you know, a good strategy for dealing with that, should that option happen, but for now the normal thing will happen, I guess.
A
Yes, that's good. People, please read the drafts if you're interested, and, yeah, we'll ask on the list whether or not people are interested in adopting, and go from there. At this point, perhaps given the small number of people that have actually read it, doing a hum here is not the best thing, so we'll just take that one to the list. And with that, is there anything else? We can end a few minutes early. Thanks, everyone. Thanks.