Internet Engineering Task Force 113, 21 Mar 2022

From YouTube: IETF113-PEARG-20220321-1200

Description

PEARG meeting session at IETF113
2022/03/21 1200

https://datatracker.ietf.org/meeting/113/proceedings/

B

Sorry does the share preloaded, slides, optional.

C

Okay, let me try that again. Is this one okay? Can everybody, remote and in the room, hear me now? Yay, okay.

C

Okay, so welcome to the PEARG meeting at IETF 113. This is a hybrid meeting. You will have noticed that, unfortunately, none of the chairs were able to be there in person, so there is a sad little empty desk at the front.

C

I will remind everybody that this session is being recorded before we go any further, and if I can ask for a volunteer minute taker: if you're interested in that, please contact the chairs via the chat.

C

As usual, we have the Note Well, which we hope that everybody is familiar with and has read, and in particular we draw your attention to the code of conduct highlights on this slide. We expect everybody to adhere to that.

C

Moving on to the agenda, the blue sheets are generated automatically by your attendance in Meetecho these days. Shivan will be acting as Jabber scribe and will be monitoring the chat. So if there's anything you put in there that you would like asked at the mic, please prepend it with "mic" and he'll relay it.

C

We have a notetaker. Thank you very much. Our agenda today is fairly light. We have one presentation on the effectiveness of QUIC padding against website fingerprinting.

C

We have a presentation on GDPR and network privacy, and then we have one update on the state of the survey of worldwide censorship techniques draft by Mallory.

C

Before we get going, I will just quickly remind people that participants in the room will need to join the queue via Meetecho, and then they can go up to the mic to ask their questions. When you do, please state your name clearly, as with masks it is particularly hard to see who is speaking.

C

And if all participants can keep their audio and video muted when they're not speaking, that will be much appreciated. I see that there is an echo. I hope that's not me. I will switch to a headset. Okay, some folks are hearing it and some aren't, so I don't know if it's... okay, maybe the in-room mic is picking something up. Hopefully that will get sorted out.

C

We will start by moving on to the first presentation, given by Sandra Siby. Sandra, would you like to request to share your screen and drive your own slides? Hey, William.

A

A

Okay, that's good.

D

I can see your slides, please go ahead. Let me just try to move the slides. Okay, that works. Sorry for the echo.

D

Okay, hello everyone. Today, I'm going to talk about work on developing defenses against website fingerprinting of QUIC traffic.

D

This is joint work with Ludovic, Christopher, Marwan, Nick, and Carmela, and we are from EPFL and Cloudflare.

D

So before I go into the details of our work, I just want to give a quick overview on website fingerprinting.

D

So let's say that we have a client and this client is trying to visit a webpage in this case example.com.

D

We have an adversary, that's on the path between the client and the destination host and the adversary is observing the traffic between the two parties.

D

Now, since the channel is encrypted, the client, sorry, the adversary does not know which web page is being visited by the client. All the adversary can see is some metadata of the traffic and, in this scenario, the IP addresses. We assume that there are measures such as ECH or encrypted DNS, so that the adversary does not see the domain that the client is visiting.

D

So the goal of the adversary in website fingerprinting is to determine which web page is being visited just from observing this metadata.

D

In order to do this, the adversary has a pre-trained machine learning classifier that's already been trained on some network traffic traces that they've collected.

D

So what happens here is that the adversary gets the traffic sample and then creates some features.

D

So on the left side of the slide, I have some of the characteristics that the adversary uses in order to develop these features. These features are then fed into the classifier, and the classifier spits out a prediction on what web page this could be. This is how the attack works.

D

So, in our scenario, we are interested in seeing how website fingerprinting works over a QUIC connection between the client and the destination. Now, website fingerprinting on QUIC traffic is not new. It's actually already been done, and this work was presented at the IETF last year. They concluded that it is no harder to fingerprint QUIC traffic as compared to TCP, and the adversary can identify pages over a QUIC connection with high accuracy.

D

What we are interested in looking at is whether it is possible to develop defenses against being fingerprinted by such an adversary.

D

So if you look at the QUIC RFC, there is this option for a QUIC PADDING frame that allows you to increase the size of QUIC packets, and the RFC specifies that this padding could potentially be used to provide protection against traffic analysis. This is what we are interested in exploring further.

D

So I want to talk a little bit now about the adversarial model that we are considering. We have a bunch of vantage points, which can be routers or switches or middleboxes on the internet, and they are located in different ASes.

D

So let's say that we have a client in AS X in this scenario, and the client is interested in visiting some web pages which are hosted on destination hosts in another AS. It is a simplification that they're all in the same AS; they could be located on different destination hosts in different ASes.

D

Now these web pages can host sub-resources, which are also located in many different endpoints.

D

So when the client tries to visit and obtain these resources, the network traffic passes through a bunch of different vantage points, as seen by the red arrows here.

D

So in our scenario, the adversary is an AS that is interested in finding out which web page is being visited by the client.

D

Now these vantage points will collect different subsets of the traffic and, as we saw here, we have a machine learning classifier, which requires collection, as well as storage and computation, on these traffic traces.

D

So because of this, we actually have a centralized location in each AS, and all the vantage points transmit the traffic that they collect to this location, which actually performs the website fingerprinting attack, as seen by the purple arrows.

D

So the goal of the adversary here is to identify the correct web page among all the web pages that are hosted on a single IP. Since the IPs are seen by the adversary, they can already filter traffic based on IP address, so in our scenario they just need to identify which web page is being visited among all the web pages that are hosted on one IP address.

D

So we do our experiments with a QUIC-dominant data set of 150 pages here. We built this by crawling the web pages in popular lists such as the Alexa one million. This is similar to prior work that's been done on free traffic, and we looked at which web pages have a high proportion of QUIC traffic. Since QUIC is still being adopted, a lot of web pages are still transmitting their resources over non-QUIC connections.

D

But since we are analyzing QUIC, we wanted to build something which has primarily QUIC traffic, and we found that for our data set, we had approximately 70 percent QUIC traffic on average. I just want to note that previously we tried to be more realistic and we partnered with Cloudflare to look at domains that are hosted on a single IP and get the traffic for those, but those had only about four percent QUIC on average. So we went with this method to build our data set.

D

As I mentioned, this is the process that we have for the website fingerprinting. But now we see here that, in the second step, we apply a defense to the traffic sample, which I'll come to in the upcoming slides, and then we pass this defended sample into the classifier and we see how well the classifier performs against these defended traces. I'll be using a couple of well-known classifiers from the literature, which are there in the yellow box below. If you have any questions, you can ask me about that later.

D

So the metric that we are using to evaluate is a common metric used in this kind of work, which is called the F-score.

D

Now, the F-score is the harmonic mean of the recall and the precision of the classifier. Recall means how many relevant results are returned by a classifier, and precision indicates how many of those are actually correct.
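
As an illustration of the metric just described, the following is a minimal Python sketch of the F-score as the harmonic mean of precision and recall. It is not code from the talk; the function name and the example counts are hypothetical.

```python
def f_score(true_positives: int, false_positives: int, false_negatives: int) -> float:
    """Harmonic mean of precision and recall (often called the F1 score)."""
    predicted = true_positives + false_positives   # everything the classifier labeled as this page
    actual = true_positives + false_negatives      # every real visit to this page
    if predicted == 0 or actual == 0:
        return 0.0
    precision = true_positives / predicted
    recall = true_positives / actual
    if precision + recall == 0:
        return 0.0
    return 2 * precision * recall / (precision + recall)


# Hypothetical counts for a single page: precision 0.9, recall 0.75.
example = f_score(true_positives=90, false_positives=10, false_negatives=30)
```

For a multi-page data set such as the one in the talk, a per-page score like this would typically be averaged across pages; how the authors aggregate is not stated in the transcript.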

D

So we have 150 pages in the data set, which means that if we were to have an adversary randomly guess which web page it was, they would have a 0.67 percent chance of getting this right. But we see that for our classifier we actually get a 96 percent F-score, which means that on undefended traffic, the adversary has a very good chance of identifying the web page that is being visited by the client.

D

So when we look at what features are important for the classifier, we see that the size-based features are very important: how many packets of particular sizes there are, as well as the total number of bytes that are incoming and outgoing, are mainly used by the classifier to identify the page.
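
To make the size-based features concrete, here is a small Python sketch that derives them from a packet trace. The trace representation (a list of signed packet sizes, positive for outgoing and negative for incoming) and the feature names are assumptions made for illustration; the transcript does not describe the classifiers' actual feature code.

```python
from collections import Counter


def size_features(trace):
    """Extract simple size-based features from a trace.

    trace: list of signed packet sizes in bytes.
           Assumed convention: positive = outgoing, negative = incoming.
    """
    outgoing = [s for s in trace if s > 0]
    incoming = [-s for s in trace if s < 0]
    return {
        "bytes_out": sum(outgoing),
        "bytes_in": sum(incoming),
        "packets_out": len(outgoing),
        "packets_in": len(incoming),
        # How many packets of each particular size were seen, in either direction.
        "size_histogram": Counter(abs(s) for s in trace),
    }


features = size_features([350, -1200, -1200, 80, -640])
```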

D

So the first thing that we do is we try to apply some defenses to hide these features from the classifier.

D

So the first thing that we do is we pad individual packets and hide the packet-based features, as shown in the figure here, and we see that that does not really decrease the F-score by a lot. It goes down by about two percent.

D

The next thing that we do is we hide the total amount of traffic in both directions by applying some amount of padding to each of the packets, and we still see that the F-score is really high, at about 92 percent.

D

And now we look again at the features of the classifier to see why it is doing so well when size-based features are no longer usable, and we see that there are a lot of directionality-based features that are still leveraged by the classifier.
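
The per-packet padding idea can be sketched as follows. This is only an illustration under stated assumptions: every packet is padded up to one fixed target size (the talk does not say which padding strategy was used), traces are lists of signed sizes with positive meaning outgoing, and the function names are hypothetical.

```python
def pad_packets(trace, target=1500):
    """Pad every packet up to `target` bytes, keeping its direction.

    Assumed trace convention: positive = outgoing, negative = incoming.
    """
    padded = []
    for size in trace:
        direction = 1 if size > 0 else -1
        padded.append(direction * max(abs(size), target))
    return padded


def byte_overhead(original, defended):
    """Extra bytes sent, as a fraction of the original byte count."""
    before = sum(abs(s) for s in original)
    if before == 0:
        raise ValueError("original trace carries no bytes")
    after = sum(abs(s) for s in defended)
    return (after - before) / before


trace = [350, -1200, -1200, 80, -640]
padded = pad_packets(trace)
```

Note that after padding, all packet sizes look the same, but the number of packets in each direction is unchanged, which is consistent with the speaker's later observation that directionality features survive this defense.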

D

So hiding the total size and individual sizes does not hide how many packets are going in each direction, and this is now being used instead of the size-based features.

D

So in order to hide the directionality-based features, we now perform this defense, where we inject dummies randomly into the trace, and what we actually see is that injecting dummies does decrease the performance of the classifier, but it also comes with a high cost. We see that it goes down by about 16 percent in the worst case, but with a hundred percent overhead when you add these dummies, and this is not including the additional traffic that is injected in order to hide the packet-based features.

D

So what we conclude is that these network defenses offer low protection with high costs. For example, just to get a 10 percent reduction in F-score, we need more than 50 percent overhead when it comes to dummy injection.
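
A rough Python sketch of random dummy injection and its packet overhead is shown below. The uniform choice of position and direction, the fixed dummy size, and the function names are all assumptions for illustration; the talk does not specify how the dummies were placed.

```python
import random


def inject_dummies(trace, n_dummies, dummy_size=1500, seed=None):
    """Insert `n_dummies` dummy packets at random positions with random direction.

    Assumed trace convention: positive = outgoing, negative = incoming.
    """
    rng = random.Random(seed)
    defended = list(trace)
    for _ in range(n_dummies):
        position = rng.randint(0, len(defended))  # inclusive upper bound: may append at the end
        direction = rng.choice((1, -1))
        defended.insert(position, direction * dummy_size)
    return defended


def packet_overhead(original, defended):
    """Extra packets sent, as a fraction of the original packet count."""
    if not original:
        raise ValueError("original trace is empty")
    return (len(defended) - len(original)) / len(original)


trace = [350, -1200, -1200, 80, -640]
defended = inject_dummies(trace, n_dummies=len(trace), seed=1)
```

Injecting as many dummies as there are real packets, as in this usage example, corresponds to the hundred percent overhead case mentioned by the speaker.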

D

Now, so far I've talked about an unconstrained adversary. This adversary observes all the traffic and also performs the classification with all the traffic that it sees. But in reality it is possible that you don't have a perfect adversary, and adversaries can be constrained not just in the amount of traffic they see, but also in what they do with this traffic.

D

So the first thing we look at is an adversary that has a limited view of the traffic. For example, if you have a vantage point at AS 5, you might miss a lot of the other red arrows and the other traffic that the client is generating.

D

So in order to see how much of the traffic is actually observed, we generated traceroutes from the client for all the pages in our data set and observed what the traceroutes span. This is the result from one of the vantage points that we conducted the experiment from. We did this on multiple vantage points and we observed similar trends, and what we see is that there are only a few large ASes that can observe a large proportion of the traffic in the first place.

D

So that means that here, in this scenario, on the left plot, we see that there are only three ASes that can even observe more than 50 percent of the pages in the data set, and on the right plot you see that, for those pages, about three of them can observe more than 50 percent of all the routes for each page.

D

So this means that a lot of the adversaries, by virtue of their location, might not even be able to successfully conduct an attack.

D

Another interesting thing that we saw was that Google actually has a large prevalence on the pages: more than 80 percent of the pages in our data set contain at least one resource that is hosted by Google. We found that these resources could be ordered differently on different web pages, which means that timing to Google resources is a low-cost fingerprint and can have up to a 77.9 percent F-score. This means an adversary, for example an ISP, can just use timings to Google instead of using all the traffic that they observe, and still identify pages with high accuracy.

D

The next thing that we looked at was how well these adversaries perform with limited processing. Previously I said that vantage points are transmitting all the traffic that they observe to this centralized location.

D

Now what we wanted to see is whether, instead of transferring all the traffic, these vantage points can just transmit flow summaries of this traffic to these centralized locations.

D

So in order to simulate this, we use something called sampled NetFlow. What happens here is that the packet traces are sampled, then flow summaries are created and sent to the centralized location, and we perform the attack with the summaries instead of the packet traces. We experimented with various sampling rates to see what happens to the adversary's performance.

D

So, as expected, we see that with lower sampling rates you also have much lower performance of the adversary, because there is much less traffic for the adversary to make good features from. I just want to note that, even at a 0.1 sampling rate, it is still much higher than the random baseline.
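
The sampling-then-summarizing step can be approximated with the short Python sketch below. It is a simplification made for illustration: each packet is kept independently with probability `sampling_rate`, and a flow summary is just a packet and byte count per flow key. Real sampled NetFlow deployments differ in details, and the names used here are hypothetical.

```python
import random
from collections import defaultdict


def sampled_flow_summaries(packets, sampling_rate, seed=None):
    """Sample packets and aggregate the survivors into per-flow summaries.

    packets: iterable of (flow_key, size_in_bytes) pairs.
    sampling_rate: probability in [0, 1] of keeping each packet (assumed model).
    """
    rng = random.Random(seed)
    summaries = defaultdict(lambda: {"packets": 0, "bytes": 0})
    for flow_key, size in packets:
        if rng.random() < sampling_rate:
            summaries[flow_key]["packets"] += 1
            summaries[flow_key]["bytes"] += size
    return dict(summaries)


packets = [("flow-a", 1200), ("flow-a", 1200), ("flow-b", 80), ("flow-a", 640)]
full_view = sampled_flow_summaries(packets, sampling_rate=1.0)
no_view = sampled_flow_summaries(packets, sampling_rate=0.0)
```

With lower sampling rates, fewer packets contribute to each summary, which matches the intuition in the talk that the adversary has less material to build features from.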

D

Then, when we applied the padding defenses, we do see that there is some reduction in the performance of the adversary. However, we want to point out that, in the case of the limited adversary, a lot of the gains come from the sampling process more than from the actual application of the defense.

D

So what we actually find is that the network-layer defenses, in the case of both unconstrained and constrained adversaries, do not efficiently hide a lot of the global features. The main reason for this is that they do not know the sizes of traces in advance to efficiently design a padding strategy. There is no application-layer information given here, so you need to randomly inject dummies or add size-based defenses, and this could increase the overhead of these network-layer defenses.

D

So then we wanted to see whether applying defenses at the application layer could potentially help protect against these attacks.

D

In order to do this, we started analyzing the structure of the pages. Here's a quick refresher on the terminology that I'm going to use. Let's say that we are visiting example.com. You might have resources from the same domain, example.com, and these are called first-party resources, or you might have resources from other domains.

D

So here you have tracker.com, and these are called third-party resources. When we look at the web page structure, what we see is that, in our data set, 18 percent of the web pages actually have a very small number of first-party resources. So you actually have a large prevalence of third parties, and there is a large prevalence of Google resources in our data set: 24 percent of the pages even have more than 50 percent Google resources.

D

So what does that mean when it comes to applying a defense? It means that third parties are contributing a large proportion of resources to the web page, and in order to apply a defense, we would actually need cooperation from all the parties that are supplying some resource to the web page. We did do experiments on this, where we hid the resources from third parties or from first parties to simulate the scenario of only one of these parties participating in the defense.

D

And then we went for the application-layer defenses. We applied the same packet and size padding, but at the application layer. So we are protecting the actual resources here, and we still see that the padding is ineffective, once again because the adversary just uses the ordering-based features.

D

However, when we inject dummies into the resources, we actually see that this can be more effective than in the network-layer case. For example, injecting five dummies on average reduces the F-score by 39 percent with a relatively low cost.

D

But in reality, if you wanted to implement something like this, that comes with some deployment complexity, because injecting dummies here means that you're actually sending some requests for some dummy resources, and we'll have to think about how those would be implemented and how that would impact the client's experience.

D

So, in short, what we found is that, at least for the network level, if we want to implement an efficient defense, we need some sort of information from the application layer. Otherwise this could lead to large overheads.

D

At the same time, if we decide to implement things on the application layer, they come with a whole set of other complexities. They would require some sort of coordination between parties, or we would have to talk to developers about how to write code so that resources are always fetched in some sort of standardized manner. And, finally, all of these changes in the application layer could potentially have a large impact on client experience.

D

At the moment, we are working to see further what kind of practices would have to be adopted in order to develop better application-layer defenses. Here's a link to our paper, and feel free to ask me some questions. Thank you.

D

I see some people in the queue. Sharon?

E

Yes, hi, how are you, Sandra? Thank you. I have a question for you. Think of the definitions slightly differently, where the adversary is actually the client sending traffic, because he's malware and he's doing some data leak or lateral movement, and the one sampling is the protection software, which really has a very high score by sampling NetFlow and detects the rogue client. So, according to your analysis, the only way the malware can throw off the detection sampler is by injecting a lot of dummy traffic into whatever it does to avoid the pattern recognition. Is that correct?

D

So you're assuming here that the client is the adversary, right?

E

Yeah, he's malware and I want to detect him by sampling, and you're saying that if I use QUIC, it's not going to reduce my F-score.

E

If he's going to use a lot of dummy traffic in his malware, then yes, but at a high cost. Something like that.

D

Yeah, so in this scenario, what you're saying is that analyzing the network traffic would actually be a good thing, because you want to detect the malware. Yes, yes. I mean, what I'm saying right now is that generally most of the defenses that we have, or rather, in this scenario, anything to prevent this detection, are not going to help much.

D

So you would be able to detect this with high accuracy already, and if the client wanted to protect against that, they would have to think of some good way of injecting dummies, as you say.

E

Great, great, thank you very much. We'll look forward to seeing some raw data if it's possible to share.

D

Yeah we'll be sharing that.

E

Thank you very much.

F

Want to go next.

F

Yeah, thank you for this presentation. It's very useful for us to think about this type of threat and it's useful to get some data. I'm curious about clarifying the threat a little bit more precisely. I've.

B

F

The paper, but my understanding is that you removed all caches and just made a single web page load of the index page of each popular domain name, and the goal was just: can you distinguish loading the index page, with no caching, of one domain versus another, in cases where, for example, they're hosted by the same CDN or something like that?

D

Yes, that's right.

F

And so I'm just curious how this type of attack is going to apply to cases where maybe I don't go to the index page, or I load a customized resource, or maybe I have cached resources, or various other situations like that. It seems like many of the times that we're worried about the network adversary, we're worried about them learning what I'm reading, or the contents of my messages, or lots of different threats. That's not to downplay the threat of them knowing that I'm even going to a particular domain name. But does this threat also apply to learning what pages I'm visiting, or to cases where the network traffic is going to be mixed or resources will be cached, et cetera?

D

Yeah, that's a good question, and this is also a field of research in this area. Like you said, we are working with relatively clean traces here, where we are assuming no caching and no background traffic, and it's just the home pages. But we are planning now, for example, to do some experiments where we are also visiting subpages of different websites to see how well this attack is going to work, and there is also, I think, work done by others.

D

Now there are some papers coming up where they're looking at fingerprinting in the presence of all these factors that add some noise. But yeah, I would say that this is kind of the worst case... I mean, the best case for the adversary, and all of these factors would possibly lead to a reduction in the F-score.

D

That's right.

F

We don't know how much of a reduction, but yes, it will be worth further research.

D

Yeah, we'll probably get some numbers when we run this, possibly next month.

F

Okay, great. Well, I think many of us will be interested in those numbers too. Thank you.

C

Okay sandra, thank you very much for that presentation and for all the questions. There is some interesting discussion in the chat and I'm sure sandra will follow up there on that as well.

C

Thanks, Sara. Yeah, thank you. Okay, so our next presentation, I believe, is one by somebody in the room, a real person. There we go, step forward. Thank you.

G

C

Do you want to attempt to share your screen?

C

Okay sandra, you may need to stop sharing your screen. Please, okay, great.

A

C

Okay, here come the slides. I think.

C

Okay, I think they've loaded.

H

C

Go ahead, then, please.

H

So I'm Luigi. I'm going to present a little bit the relation between GDPR... okay.

C

Luigi, I'm having trouble hearing you. I don't know if everybody else is as well. Maybe you're okay in the room, but it's quite quiet remotely.

H

C

Better, thank you.

H

Okay, I just have to hit the mic with the mask. Okay, I'm going to talk a little bit about GDPR and its relation to IP addresses in general, a little bit, as the title says, about how layer eight meets layer three. Okay, note that I'm not a lawyer, just to be clear, so this is my interpretation. I spent some time reading, so I'll give you some interesting points that we may discuss later on. I'll do, really, one slide of history, some terminology, and then we dig a little bit into GDPR and IP addresses.

H

There are a few slides that make a clear link between existing RFCs and GDPR. If there is time, we will go over those as well. Okay, so GDPR came into effect in 2018, so it's not that long ago. It replaced a very old data protection act in Europe. It is a regulation, which means it is slightly more complex than a simple law, because there are articles, which are the law itself, but there are also recitals, which are notes that explain how to actually apply the laws. Okay, that is the body of GDPR.

B

Luigi, I think it's still pretty quiet. If you wouldn't mind just speaking up a little bit, I guess.

H

Speaking even more up, okay.

B

H

As I can. Okay, thank you. The key point is personal data. Okay, what is personal data? It is anything that can identify a natural person. Not a legal person, like a company, which is different, but a natural person, like me, all of you in the room, and those connected elsewhere: names, personal addresses, anything that can identify ourselves, information concerning our employment, financial information. You have the details on this slide, which is pretty wordy. There is also sensitive information, such as ethnic origin or religious beliefs, or anything like that.

H

These are really personal choices, in a certain way, and there is any other information that you are willing to disclose yourself, for example to your employer. Okay, in this last bullet I put my employer, but any employer or any entity can ask you for some information that you may wish to disclose but that is really personal.

H

This is about the personal data. Then we have three key terms, which are the controller, the processor, and processing. The controller is whoever is controlling the personal data.

H

Okay, because if you give the data to someone, that someone is controlling your personal data. Okay, and he may wish to do some processing, which is the action of taking your data and doing something with it, to get some stats, for example. Right, this is the processing, and the entity that does the processing is the processor. Okay, which is actually doing the processing. Now it's kind of a headache, but these three things are correlated and do overlap.

H

Okay, it's like this: I connect to my ISP, I sign a contract, I give some personal information. The ISP controls the personal data that I gave to it. Okay, and it may decide to do some stats, and it decides how to process the data, but it does not necessarily do it itself. It can ask somebody else, a third party, to do it, which will be the processor. Okay.

H

GDPR and IP addresses. In the GDPR it is clearly stated that any online identifier is personal data. Okay, this applies especially to IP addresses, and the European Court of Justice ruled that an IP address is personal identification data, because you can associate and retrieve a lot of other information, even if you use temporary addresses. Okay. So as such, it falls specifically under GDPR and privacy protection.

H

Now I will go a little bit through the seven principles of GDPR and try to give an example of how each applies to the protocol stack, let's say more specifically to the network layer and IP addresses.

H

The first, very simple, principle is lawfulness, fairness, and transparency, which basically says that if I give my personal data to my ISP, I expect that it uses my personal data according to the law (GDPR is part of it), in a non-discriminatory manner, and in a transparent manner, which includes my explicit consent. I will come back to this a little bit later.

H

The second principle is purpose limitation: the fact that my ISP is not allowed to use my personal data for whatever it wants. Okay, it is able to use my IP address in order to count my packets for billing purposes, but it is not allowed to look at what is in my packets in order to measure how much shopping I do online, because this is not related to the service that it offers. The ISP is offering internet connectivity.

H

Okay, the third principle, data minimization, is the fact that my ISP is allowed to collect as much data as is needed to offer the service, but not more than that. Okay, so it can certainly, again, access the IP header and the transport header to provide the service, but not access the content of my packets in order to look at what I actually do, even if it is in the clear. Okay, accuracy, the fourth principle, is the fact that anything the ISP gathers about me must be accurate.

H

Okay, or error free, if you wish. Storage limitation is the fact that my data cannot be archived forever. Okay, there is typically a limited amount of time for which my data can be kept. Then it should be deleted, if I do not ask for that beforehand, because there is also this aspect that I have the right to be forgotten, so I can ask to delete all my data. Okay.

H

This is interesting; it is kind of a tussle somehow, because on the one side we have GDPR, which asks for storage limitation, and on the other side law enforcement asks for some minimal time to keep some logs for accountability and traceability of some stuff. So there is a balance to strike there at some point. Okay, security, integrity, and confidentiality is just this:

H

If I give my personal data to the ISP, I assume that the ISP is doing its best to protect my personal data so that it does not go out in the wild. Okay, which brings us to the fact that it is actually accountable for my personal data, even in the case, as I explained before, that my ISP is the controller of my data and gives my data to someone else in order to perform some processing, and something goes wrong and the processor actually leaks my data. The one that is accountable, from my perspective, is the ISP. Okay.

H

It's not moving anymore.

H

H

H

So this is, at a really high level, what happens in GDPR and how we can relate it to the IP protocol stack. You know that GDPR is specific to the European Union, but it is not the only example. Okay, in this table there is a summary of other laws and regulations you can find all over the world. Okay, they are more or less similar, and they all more or less consider IP addresses as personal data.

H

There are some legal nuances in a certain way, so, for example, in brazil he is not explicitly stated that uh ip addresses or online identifiers uh are included in in the law, but the way the law is expressed, ip addresses falling. Okay.

H

Another interesting uh peculiarity is the fact that in japan, even anonymized data covered by the appi, which is the the japanese equivalent of gdpr, which is not the case in here for gdpr in europe once you anonymize the data, and you are sure that there is no way to go back and find the the the original information gdpr is out of the scope. Okay and all of the laws are based on on consent.

H

Usually explicit consent, which means that you have to take an expletive, explicit action to give your concept, which means when you are in europe, you have this pop-up window. That say you accept the cookies and you have to to to click yes or no, and today we have also different settings. This is a explicit action. Okay, you cannot just put some place in the webpage. Oh by the way we are collecting cookies.

H

If you have something against it, please speak up. Okay, so you cannot do that, because it would be passive consent.

H

And in Canada, on the contrary, this is allowed, okay?
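The distinction between explicit and passive consent described above can be sketched as a small data model. This is an illustrative sketch only, not legal guidance and not something shown in the talk: the names `ConsentRecord` and `may_process`, and the `allows_passive_consent` flag, are hypothetical.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass(frozen=True)
class ConsentRecord:
    """Hypothetical record of what a user did at a consent prompt."""
    # True = clicked "yes", False = clicked "no",
    # None = the user never took any action.
    user_choice: Optional[bool] = None


def may_process(record: ConsentRecord, allows_passive_consent: bool) -> bool:
    """Decide whether processing may start under a given consent regime.

    allows_passive_consent is an assumed per-jurisdiction setting:
    False models a regime that requires explicit consent, True models
    one where silence can count as consent.
    """
    if record.user_choice is not None:
        # An explicit action always wins, whether it is yes or no.
        return record.user_choice
    # No action taken: only a passive-consent regime treats this as consent.
    return allows_passive_consent
```

Under this sketch, a user who ignores the banner yields `may_process(ConsentRecord(), allows_passive_consent=False) == False`, while the same silence yields `True` when passive consent is allowed. An explicit "no" is respected in both cases.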

H

I would like to make just one single point. You can go over the slides; there are the RFC numbers, which try to make a clear link between GDPR and what we do here in the IETF. The only thing is this slide, on RFC 6973.

H

There is already some wording that clearly points to the same principles that we can find in GDPR. So, just to state it: GDPR is not out on the moon. It is something that, in a certain way, we already share as a principle, okay? And I think I'm done, so we are on time.

C

Perfect, thank you so much, and thank you for giving this talk on a topic which comes up repeatedly inside the IETF. I see we have Patrick in the queue. Please go ahead, Patrick.

I

Hello, Patrick Tarpey, Ofcom. Very interesting presentation. Just an observation, really: it's a kind of known fact that in the European Union area there's a piece of regulation called the Data Retention Directive, which requires communication providers to keep call data records for the purposes of lawful intercept and all that malarkey.

I

To what extent do you think that new protocols, such as MASQUE over QUIC and oblivious technologies, to some extent render this conversation about the privacy of IP addresses, or indeed the actors that you've mentioned in your presentation, kind of obsolete? Do you think that's a fair comment? That the evolution of MASQUE proxies and oblivious technologies makes this idea a bit of a challenge?

H

Thank you for the question. It's a very, very interesting question. Off the top of my head, the answer is: I think these technologies are very useful and can help in privacy protection.

H

There is one thing that should be considered, which is where the data goes. The fact that you have an oblivious methodology to obfuscate some things doesn't mean that you are GDPR compliant, because you also have to consider where you actually do that and how you do that, and then the logs that you keep for accountability come into play.

H

So does my answer make sense? We can chat more. It's a very interesting question, but it's quite complex, I think, to discuss.

C

Thank you again, Luigi, for that presentation. That's great, and again there's some further discussion in the chat that you may want to follow up on.

C

Very much. Okay, and the final presentation is from Mallory. It's an update on the Survey of Worldwide Censorship Techniques draft. So please, do you have slides you want to show on screen?

G

I do have slides. I just requested to share them. Yeah, there we go. Thank you so much.

G

Yeah, all right, thanks, and thanks for the time on the agenda. This is a research group document, if people recall. I'm presenting it only because the other authors, who've worked on this draft for a very long time, no longer have the capacity to continue working on it. So a lot of the content in the slides and the draft itself is not written by me, but I am its steward at the moment, happily so. Joe Hall was originally in my position at CDT when he started writing this in November 2014.

G

So that's quite some time. I think one of the challenges of this draft is that, of course, it's a contentious topic. It uses the word censorship throughout, which is not something that the IETF is used to talking about. But also, the more time that passes, the more techniques can be added and refined. So at some point, I think, there's a recognition.

G

This has been recognized multiple times in PEARG: it just has to be published. So we're working towards a document that is good enough to be published, but that also has a sort of time stamp on it when it does get published, and everything that evolves after that can be captured in a different way.

G

So yeah, it's gone through one research group last call. We'd like to, by the end of this presentation and then on the list very soon, move to another last call; that's my goal here. We're now on the fifth version since the research group adopted it, and the changes that we've made most recently have been to scale back the section on self-censorship to the bare minimum.

G

We talked a bit more about domain seizure, again not under the sort of technical table of contents, which I'll get to in a second, and we now also talk about how TLS 1.3 extensions are sometimes being blocked.

G

That's, again, an example of how the longer we wait, the more new and novel ways to block the internet arise.

G

So, the contents, the summary of the draft: it's in essentially four parts. There's a section that helps to define what to block. That's followed by a section on how to detect what to block after you've defined it, and then the last part is really about the actions that you can take to block it. And then there's some discussion of how the network is layered and how that actually matches. Yeah, I just saw a point in the chat, which is absolutely right.

G

We've talked a lot about how this could be a living document, but also how it might be useful to actually get it published once and then think about how to keep it up to date. So we are tracking this in GitHub. You can see the open issues. This is just a list of what's open and my analysis of the way to move forward.

G

So actually, since I submitted the most recent version, Ekr has given some really great suggestions on how to resolve issue 81, on the TLS identification piece. There's been an open issue for quite a while.

G

Just to incorporate, I think, this very relevant report; thanks to Chris Wood for pointing that one out. There has been some treatment of issue 64, about the issue around TLS, and it's just that the person who opened the issue, Chelsea, needs to review the change we made and say whether or not she's satisfied with it. There are two other issues that are still open, and I'm coming here to you to recommend that we drop them.

G

One is to introduce this concept of censor maturity. Again, I think this is something that maybe changes over time. I'm not sure it adds a great deal, and I'm also just not very clear what the text on censor maturity should be, since there's not been a suggestion. And then the other one I suggest dropping is changing.

G

So, throughout the document we have a sort of trade-offs caveat under most subsections, meaning that there's a cost to the censorship to some degree, and there's just been a suggestion that we change that terminology from "trade-off" to "cost to implement". But actually, looking at the text, I don't think that the terminology change is really warranted, and it would also require us to do more wordsmithing, because we tend to say "trade-off: the cost to implement this", so it would be really redundant and kind of, yeah.

G

I'm just not convinced that it's really worth making the change. So anyway, like I said, we're waiting for the re-review on issue 64.

G

We know that we need to incorporate 81, 55 and 64 into the next version, and then we'll go to the list right after that's done and hopefully ask the chairs for another last call. But I wanted to stop there, I think, to ask if there were any questions or comments. Thanks again for the time. We have like three minutes left, I think. Thanks, Mallory. Are there any questions from the meeting today?

C

Otherwise, I think I can say, from the chairs' point of view, we're very keen to see this version move forward now that there's been some action on it again. So we would be very keen to have a discussion about setting a provisional date for getting to the next research group last call, so that we have some time-based targets to move this forward because, as you say, it really needs to be published at this point in time. So perhaps we can chat about that offline.

G

Reasonable. I think it should be fairly imminent, really. I feel really confident that the changes I need to make, I can make; I just need the reviewers who originally raised them to give me their stamp of approval, because I want to get it right. And I would also say, I forgot to say this in the presentation, and maybe it's only tangentially relevant, but there's also a brand new -00 draft on IP blocking that's going to be presented, I think, three times at IETF 113.

G

This is, I think, reading between the lines, in response to some of the requests that have been made of various internet infrastructure to act on Russia's behavior during its war on Ukraine, and that's a good draft, if folks want to read it. I think there's some overlap here; there's definitely some stuff in the IP blocking space that isn't in this draft. So I don't want to open that whole can of worms.

G

I just want people to be aware of it and to note that this draft has been around for quite a while. There's been a lot of really great thinking put into it, and I'd just like to see it get some attention. Yeah, and if there are particular

C

Reviewers that you want feedback from, please let the chairs know, so we can help you with that. Cool, and if you think those other drafts are useful for people from PEARG to provide input on, then please send that to the PEARG list, so folks are aware. Exactly.

G

I did recommend that the authors send it to the list, the PEARG list, so we'll see if they will or not. So anyway, thanks, all.

C

Right, thank you very much for that. With that, I think we are wrapped up for today. Thank you to all our presenters today, and please continue to enjoy the hybrid IETF, whether you are remote or in person. Thank you very much, everyone.
