From YouTube: IETF114-ANRW-20220726-1900
Description: ANRW meeting session at IETF 114, 2022/07/26 19:00
https://datatracker.ietf.org/meeting/114/proceedings/
A: All right, it looks like the rate of people joining has slowed.
A: All right, in that case, welcome everybody to the second of the Applied Networking Research Workshop sessions for 2022.
A: So I'd like to begin with the usual reminders. The Applied Networking Research Workshop is organized by ACM SIGCOMM and by the IRTF, and it's co-located with the IETF meeting, as you all know. As a result, that means we follow the IETF's intellectual property rights disclosure rules.
A: So by participating in this meeting, you agree to follow those rules, which I hope you are all familiar with. If you're speaking at the microphones and you know of any patents or intellectual property that covers the topic you're discussing, then you agree to disclose that. The links on the slide point to the detailed policies.
A: In addition, this session is being recorded and will be available online on YouTube after the session. If you speak at the microphones in the room, or if you join the session remotely, then you agree to being recorded and to that recording being made publicly available.
A: Any personal information you give to the IETF or the IRTF will be handled in accordance with the privacy policy, and there's a link to that on the slide. That mostly applies to your registration details and to your participation in the IRTF.
A: And a final reminder before we start: because of the risk of COVID, as a safety measure we are requiring all in-person participants in this meeting, and in the other IETF-controlled areas of the meeting venue, to wear N95 or equivalent masks at all times while they're in this meeting or in the IETF-controlled areas.
A: The only exception to that is for the speakers in the session who are actively giving their talks at the front microphone.
A: The goal of the ANRW is to provide a forum for researchers, vendors, network operators, and the standards community to discuss interesting new results in applied networking research.
A: One of the goals is to try and find inspiration from the topics and problems being discussed in the IETF, and the other is to try and connect the research community with the IETF, to help bring new ideas into the IETF community and into the standards community. The goal is to encourage this two-way exchange of ideas.
A: The ANRW is very much there to encourage, hopefully, the submission of new research results that could form the basis of engineering work in the IETF, but also to help better specify Internet protocols, change operational practices, and possibly influence future research and experimentation in the IETF and the IRTF.
A: As we saw in the talks this morning, and as we've seen in previous editions of this workshop, we've had a lot of talks about how to specify new protocols, about new protocol designs, improvements to protocols, and so on.
A: Something we haven't discussed so much in the ANRW, which I think is one of the goals of this session, is to start to think about how we describe and specify protocols: how we can ensure that the protocols we're specifying are consistent and correct, how we can verify that the specifications we are writing, as researchers or as engineers in the IETF, are correct, and how we can validate whether the implementations match those specifications.
A: I think one of the key goals of writing a specification is to communicate, and to communicate in a way that lets us know that what is written in the spec is correct: check that the specification is self-consistent, check that the implementations match that specification, and then everything should interwork.
A: What I'm hoping we can start to do in this session is to discuss whether this is the right way of writing protocol specifications, and to discuss to what extent formal methods, structured specification languages, natural language processing, or other techniques can help to describe network protocols.
A: The goal is very much not to think about how we change how RFCs are prepared. There was what I hope was a light-hearted comment in the chat earlier about whether this is about using Markdown or XML for writing RFCs; the goal of today is not to start a conversation about the RFC series.
A: I'd like to use this session, hopefully, to start a conversation about how we specify protocols, and to begin to connect the academic researchers who are studying protocol specification with the engineers in the IETF, and the researchers in the IETF and the IRTF, who specify protocol standards and design and develop new protocols.
A: So, our agenda for today: we just finished the introduction and status update, and then we have three invited talks.
A: The first talk will be by Max von Hippel from Northeastern University, who will be talking about automated attack synthesis by extracting finite state machines from protocol specifications.
A: The second talk will be about some work on tools for disambiguating RFCs, and then finally Chris Wood from Cloudflare will be talking about some of the work that's been happening in the Crypto Forum Research Group, and some of the successes and limitations of the way that group has been specifying cryptographic standards.
A: And then we have, I hope, some time for discussion before we close.
A: There's a note that there's a HedgeDoc for notes; the link is on the slide here. If you have questions or comments, either put them in the chat or put them in the HedgeDoc. I guess it's probably better to put them in the chat. There's also a section in the HedgeDoc, in the notes: if you would like us to get in touch about any of this work, or if you're interested in following up on any of these ideas, then please put your name in there and we'll try and follow up afterwards. We'll talk more about this in the discussion session at the end.
A: So with that, if Max is present in the room, could he come up? I'd like to hand over to him for the first talk.
D: Okay, and, Colin, can you.
A: Yep, I'm just trying to find the right button in Meetecho.
D: This is old news to those of you who build protocols, so everybody in this room, but the Internet runs on protocols such as TCP, UDP, etc., and these protocols create a sort of shared language for computers to talk to one another. Here we have a picture of the Internet, or specifically the ARPANET, in December 1969.
D: I'm a formal methods researcher, and in formal methods we call these things Kripke structures, but to normal software engineers, protocols are described by finite state machines. A finite state machine is actually a completely unambiguous specification of the language of a protocol. Essentially, when I say that a protocol ensures that two computers speak the same language, I'm saying something both informal and formal: the formal statement is that they run either the same state machine, or state machines that accept the same language.
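As a minimal sketch of this idea (the state and event names below are illustrative, loosely modeled on the TCP handshake; they are not taken from the talk), a finite state machine can be written as a transition table, and a machine "accepts" exactly the event sequences that the table allows:

```python
# Tiny FSM sketch: a transition table keyed by (state, event).
# States and events are illustrative, loosely based on TCP establishment.
TRANSITIONS = {
    ("CLOSED", "passive_open"): "LISTEN",
    ("CLOSED", "active_open/send_SYN"): "SYN_SENT",
    ("LISTEN", "recv_SYN/send_SYNACK"): "SYN_RECEIVED",
    ("SYN_SENT", "recv_SYNACK/send_ACK"): "ESTABLISHED",
    ("SYN_RECEIVED", "recv_ACK"): "ESTABLISHED",
}

def run(start, events):
    """Run the FSM; return the final state, or None if a sequence is rejected."""
    state = start
    for ev in events:
        state = TRANSITIONS.get((state, ev))
        if state is None:
            return None
    return state

# One "word" in the protocol's language, as seen by the active-open side:
print(run("CLOSED", ["active_open/send_SYN", "recv_SYNACK/send_ACK"]))  # ESTABLISHED
```

Two machines speak the same language, in the speaker's sense, exactly when `run` agrees on every event sequence.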
D: In our group at Northeastern, we specialize in studying these finite state machines, either using formal methods or other approaches, and trying to find flaws in them. So we take a protocol (we define a protocol to be whatever is in its RFC), we then formalize it using a finite state machine that we write in some language like Promela or Uppaal or something like this, and then we have a tool that generates attacks against that protocol expressed as an FSM.
D: So there are kind of a lot of hidden steps in between there, going from specification to FSM to attacks, and I'd like to disambiguate that a little bit. When we talk about protocols, we first and foremost have to talk about RFC specifications, which is obviously especially salient in this context. These are written by the IETF, in English prose, which I think is the topic of today's conversation, and although they may seem technical to somebody who doesn't write these documents for a living,
D: the reality is that they really could be quite a bit more technical than they are. So, for example, they could be written in Coq or Agda; they could be written in a programming language if we wanted them to be, and there are advantages and disadvantages to their current presentation.
D: Then a programmer reads these RFC specifications and implements them. So, on the left, we have a tiny little screenshot of the DCCP RFC, and on the right we have a tiny little screenshot of DCCP implemented in the Linux kernel, and there are all sorts of decisions that the implementer has to make when implementing the RFC, and that doesn't just include decisions about how to implement some functionality, like "oh, this list needs to be sorted."
D: So when the implementer reads the RFC specification and decides what she is going to implement, implicitly she's creating a mental model in her head, and to me, as a formal methods person, I view that mental model as a finite state machine, but it doesn't really matter; you could view it as any isomorphic thing. The point is that, obviously, a human being reading English text is going to imagine what that text is supposed to communicate, and that thing that she or he imagines is some abstract expression of what they intend to implement.
D: They then go and implement it, and hopefully there are no errors in that process either. So bugs in a protocol implementation can arise from any one of the steps in this process. First of all, there could be fundamental design flaws in the specification: it could be that we specified something that was bad, it was a bad idea, and the idea needs to be reworked.
D: It's also entirely possible that the mental model, the finite state machine, is wrong; in other words, that the programmer or implementer made some mistake when she read the specification and misinterpreted something. Now, we could blame that on the programmer or blame it on the author; it doesn't really matter mathematically.
D: The second half will be doing something useful with that. So, in order to show that the extracted finite state machine is actually good for something, I'm going to do something with it, and by "something" I mean attack synthesis: I'm going to automatically generate attacks. If you're in this room and you're thinking "well, that sounds cool, but I don't care about security," I'll tell you right now: you can view these attacks as bugs, so you could also view it as "I'm going to automatically find bugs."
D: When we try to extract a finite state machine from a Request For Comments document, there are a number of critical challenges that we face. First and foremost, there's simply no one-to-one mapping between the text and what we call the canonical finite state machine. In fact, it takes human expertise to read an RFC and interpret what it says, or should say, for a number of reasons that we'll get into. Second, and related to the first point, RFCs contain omissions, mistakes, and ambiguities.
D: Here's an example from the DCCP RFC, where a sentence, if interpreted mathematically, would imply the existence of a transition from PARTOPEN to OPEN upon receipt of a DCCP-Close packet. But of course that would be nonsensical: if you receive a DCCP-Close packet, you should close your connection, not go to OPEN, and if you read the rest of the document, it becomes clear that that transition should not exist.
D: Another problem is that off-the-shelf natural language processing tools are simply not suitable. Generally speaking, they are trained on newswire text, or books, or things like this that are non-technical in nature, and so they face the exact same problem that I faced when I tried to describe my research to my girlfriend, who's not a programmer: I use all sorts of lingo that is kind of domain-specific and doesn't mean anything to people in other fields, like, in her case, biology.
D: So we need something that's trained on relevant text, such as computer networking textbooks, or Stack Exchange and Stack Overflow questions, etc. Finally, there's a ton of variation in the language and structure of different RFCs, and this is one of the, I think, interesting results we'll show today. Spoiler alert: the DCCP RFC is more complicated than the TCP one, at least in terms of its English text.
D: So we have a multi-step approach. The first step is that we create a technical language embedding. Essentially, what we're doing here is that, rather than impose our own concept of what the most natural vector space is for the text that describes an RFC to live in, we are allowing a machine learning model to come up with an ideal vector space that it wants to learn. That embedding is going to capture all sorts of interesting things about the text, such as the punctuation of sentences, or the use of mathematical symbols, etc.
D: Second, we have zero-shot protocol information extraction. That's where we take the text that is now projected into the vector space of our technical language embedding, and we automatically extract an intermediary representation, which I'll get into more later, which contains sufficient information for us to then, in a subsequent step, extract a finite state machine. So I want to stress this point: we don't go end to end, in one step, from English to FSM.
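The staged structure described here (text, then embedding, then intermediary representation, then FSM, rather than one end-to-end step) can be sketched as a composition of stages. Everything below is a toy stand-in to show the shape of the pipeline, not the authors' actual models:

```python
# Toy stand-ins for the three stages of the pipeline described in the talk.
# None of this is the real system; each function mocks one stage's role.

def embed(sentence):
    # Stage 1 stand-in: a real system maps tokens into a learned vector
    # space (e.g. a domain-trained BERT); here we just tokenize.
    return sentence.lower().replace(",", "").split()

def extract_ir(tokens):
    # Stage 2 stand-in: produce an intermediary representation tagging
    # source state, target state, and triggering event.
    if "enter" in tokens and "from" in tokens:
        return {
            "source": tokens[tokens.index("from") + 1],
            "target": tokens[tokens.index("enter") + 1],
            "event": tokens[tokens.index("receiving") + 1] if "receiving" in tokens else None,
        }
    return None

def ir_to_fsm(irs):
    # Stage 3 stand-in: turn tagged blocks into FSM edges.
    return {(ir["source"], ir["event"], ir["target"]) for ir in irs if ir}

sentence = "Enter ESTABLISHED from SYN_SENT upon receiving SYNACK"
print(ir_to_fsm([extract_ir(embed(sentence))]))
# {('syn_sent', 'synack', 'established')}
```

The point the speaker is making survives even in the toy: the middle representation is explicit, so each stage can be inspected and improved independently.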
D: Finally, we do something useful with that FSM; in our case, attack synthesis. I'd like to stress that you can do other things if you care about other things. So if you're thinking "well, I don't care about attacks, I'm not a security person, but I care about bugs, I care about correctness monitors, whatever," well, boy, do I have other tools for you. There are other things you can do with these, but what we did was attack synthesis because, frankly, my dissertation is going to be about attack synthesis, so I applied attack synthesis.
D: Let's talk about step one. In step one, we want to learn a technical language embedding. As I said, I don't want to impose my own personal moral and ethical beliefs about the natural vector space of RFC documents; rather, I'm going to let the computer decide. We use something called BERT. The idea of BERT, loosely speaking, is that it takes a contextualized approach to understanding what words mean, so the word "reset" could mean something different in different contexts, because it is informed by the words around it.
D: We train on roughly 9,000 documents and roughly 475 million words, so our BERT model is trained on quite a lot of data, including Stack Exchange, Stack Overflow, networking textbooks, and all sorts of other things.
D: Next, we want to do protocol information extraction. Just to remind you, in case you missed it: I now have some text that is projected into my vector space, and I want to translate it into an XML document (I'll show you what that looks like momentarily) which somehow captures enough data for me to then be able to extract a finite state machine in a straightforward manner. So, put differently, I essentially want to read the text.
D: We have reference tags that reference previously defined states, transitions, events, etc.; state machine tags that tag essentially the structure of the finite state machine in the document, any references to transitions or other logic that relate to the state machine; and then, finally, control flow tags that essentially capture logical structure in the FSM. When you read these documents, a lot of the time they have kind of complicated if-else structure, often related to the indentation of different blocks of text, and our NLP captures that using these control flow tags.
D: We have a transition block that contains an entire state machine transition, including the source state and the target state, and then we have an event, and we know exactly what occurs on that event. So, from the perspective of a formal methods person, I now have the source state, the target state, and the label on the transition in my Kripke structure; but from the perspective of a software engineer, I have, you know, an edge in the directed graph that is my finite state machine.
D: In order to go from text to this intermediary representation without requiring a bunch of grad students to spend countless hours annotating things, we use a machine learning approach. In fact, we use two different machine learning approaches, and we compare them. The first one is a linear conditional random fields model, which starts out by splitting text into chunks.
D: It then extracts features from the embedding that the chunks are in, and feeds those into a linear conditional random fields model, which builds on the Markovian assumption that the next state is predicated only on the previous one.
D: So we evaluate these against each other, and also in contrast to a completely rule-based heuristic program, which we don't expect to do very well, because it doesn't learn anything. The rule-based system in fact performs poorly, as we expected, achieving below 35% token-level accuracy and similarly bad span-level accuracy. In contrast, the linear CRF and neural CRF models perform quite a bit better, with the neural one slightly outperforming the linear one, as you'll actually see in almost all the metrics.
D: So, now that we've gone from text to embedding to intermediary representation, we have these RFC documents expressed in XML like this; you can imagine a big, long document, entirely structured with XML tags that say what's happening where, and now I need to extract a finite state machine from that. I want to be very clear: this itself is not a finite state machine. A finite state machine would consist of things like actual transitions. I need to interpret the XML and extract the FSM.
D: But if you want to improve on our work, a good place to start would be the FSM extraction algorithm, because at the moment we're not using machine learning for that step, and I think a fruitful thing to do would be to try that. Once we've extracted the FSM, we want to know: how good is it? I mean, it's probably not exactly what was written in the text, so how close is it?
D: So we compare against one that I wrote by hand, and, to get back to the point about why all this is necessary, I will tell you it took quite a long time to write that by hand, because there are all sorts of subtleties, when you get into a model checking environment, about the ordering of events and the semantics of the channels that connect the two processes, which are not captured by just the diagram that you find in your average RFC document. So we expect to have 20 transitions in our TCP FSM.
D: We also hand-annotate, or hand-create, an XML for the TCP RFC, and we call that our gold XML; the FSM from that is then the gold FSM. That one should have perfect annotations, so any bugs are related either to ambiguities in the text or to problems with our extraction algorithm. Comparing these, we find that the gold FSM does quite well.
D: We define a transition as being approximately correct if its source state and target state are both correct and at least one of the events on the label is correct, but it might have, say, the wrong order of events; maybe you receive SYN and then send ACK instead of the opposite, or something. In the case of gold, we find slightly fewer transitions than we expect to, but almost all the ones we get are approximately correct. With the linear and neural models, there's quite a bit of noise.
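The "approximately correct" criterion just described can be written down directly. This is a paraphrase of the definition in the talk, using a made-up transition encoding of (source, target, list of label events):

```python
def approximately_correct(extracted, reference):
    """A transition is approximately correct if its source and target
    states match the reference and at least one event on its label
    matches, even if the events are incomplete or out of order."""
    src, dst, events = extracted
    ref_src, ref_dst, ref_events = reference
    return src == ref_src and dst == ref_dst and bool(set(events) & set(ref_events))

ref = ("LISTEN", "SYN_RECEIVED", ["recv SYN", "send SYNACK"])

# Right endpoints, only one of two label events captured: still counts.
print(approximately_correct(("LISTEN", "SYN_RECEIVED", ["recv SYN"]), ref))  # True
# Wrong target state: does not count.
print(approximately_correct(("LISTEN", "CLOSED", ["recv SYN"]), ref))        # False
```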
D: So, although we find most of the transitions we expect to find, we also get a bunch of incorrect transitions.
D: Comparing to our skyline, another interesting result here is that, in the very particular case of TCP, the neural and linear models extract different intermediary representations, but actually all of that is modulated away by the finite state machine extraction algorithm. So any differences in intermediary representations do not matter, in this particular case, in terms of the extracted FSM, although this doesn't generalize.
D: Looking at DCCP, we find deceptively more transitions than we did with TCP, but we expect substantially more, so fractionally fewer are found. We expect to have 34 total transitions; with the gold, we only get 17. As you can see, the linear and neural models do not perform as well as gold, but again, neural slightly outperforms linear. Okay, so let's talk about why we miss things, because, as you just saw in these graphs, we missed a substantial number of transitions.
D: You would have to search around for a while to try to figure it out. Also, if we wrote a rule to tell the computer to look that far back, it would cause problems when we parse other parts of the document. So this is a good example of why you can have missing transitions when you extract from an RFC.
D: So we do an evaluation and try to figure out why it is that we miss transitions, and we do this by manually inspecting every single missing transition. We find that, in the case of TCP, most of the missing transitions are due to omissions or ambiguities in the text. The thing I just showed you I would consider to be an omission or ambiguity; I think it's ambiguous because you have to search in other parts of the document, paragraphs away, to figure out where the transition is supposed to go.
D: However, there are also a number of extraction errors for TCP. In contrast, with DCCP we also get some prediction errors, which is interesting, but that might just be a scaling aspect, because there are so many more transitions in DCCP. So, who knows: if you added, like, 20 more transitions to TCP, maybe I would get some prediction errors with TCP too; I'm not really sure. It should be noted that we missed very few for TCP, and additionally, this result with gold implies that the DCCP specification is substantially more ambiguous than the TCP one, because the fraction of missing transitions that are due to omissions or ambiguity is substantially larger than the fraction due to extraction errors, in contrast to the same analysis with the gold TCP FSM. In both of these cases, we're using our skyline intermediary representation, so any problems are at the extraction level or in the spec, and again, the neural model is slightly better.
D: Next, we look at incorrect transitions. I've told you about missing transitions, and I've said that, essentially, they can happen for multiple reasons, but they mostly happen because of omissions or ambiguities in the text. Another question would be: what about when we make a mistake and extract a transition that's just completely off? It has the wrong start state, or the wrong target state, or both, or none of the correct events on the transition. In the case of TCP, these are primarily caused by prediction errors, so this would be something we can improve on.
D: The prediction errors are problems with our methodology, so our methodology can be improved to not have so many prediction errors, potentially even just by training on more RFCs, to be honest. In DCCP, however, looking at the gold, we find that a substantial number of incorrectly extracted transitions are due to omissions or ambiguities in the text, and so this again highlights that the DCCP RFC experimentally appears to be more ambiguous than the TCP one, along numerous metrics that we can measure:
D: how many transitions we can extract and, conversely, how many errors we have. And again, this suggests that DCCP is more complex.
D: Okay, so now I've told you, end to end, that we have a way to automatically read an RFC document and extract a finite state machine, and I've shown you that the finite state machine has a number of errors, including missing transitions and also transitions that are wrong. And I'm going to claim, perhaps scandalously, that despite these errors, it is good enough to do something useful. To me, that useful thing is going to be attack synthesis: the automatic generation of attacks against the protocol.
D: But you could also view this as the automatic discovery of potential bugs.
D: So I used an attack synthesis tool called Korg, which I created, which is why I used it. I presented Korg at SAFECOMP 2020, which was in Lisbon, Portugal, but I was over the Internet because of COVID, unfortunately; so hopefully, next time, we'll get to go to Lisbon. Korg takes as input an extracted finite state machine, like the one we just generated, and a linear temporal logic property, which is just a formalization of what it means for the state machine to be correct.
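The specific properties aren't shown in the transcript, but to fix ideas, a correctness property of this kind might be the LTL liveness claim that a handshake, once started, eventually resolves (the state names here are illustrative, not taken from the talk):

```latex
\square \bigl( \mathrm{SYN\_SENT} \;\rightarrow\; \lozenge\,(\mathrm{ESTABLISHED} \lor \mathrm{CLOSED}) \bigr)
```

Read: it is always the case ($\square$) that if a peer is in SYN_SENT, it eventually ($\lozenge$) reaches ESTABLISHED or CLOSED.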
D: It then creates a system in which two peers are communicating over a vulnerable channel: it injects a vulnerability into the channel between them, a man in the middle, and it asks whether, in that threat model, the property can be violated. So we're not asking if the protocol is correct on its own; we kind of assume it is. In fact, we don't even look at properties unless they're satisfied by the model in isolation. But we want to know: under attack, is this still satisfied?
D: The Spin model checker reduces this to a language emptiness problem by intersecting a Büchi automaton with the system; basically, it does fancy formal methods from the '80s, and it says: no, the system is not correct when it's under attack, and here's a counterexample, which is just a sequence of events that the system goes through in which the property is violated. Finally, Korg takes that counterexample and turns it into an attacker program that simply replays that attack. So it's a pretty simple approach.
D: We looked at two case studies: TCP and DCCP. The FSMs shown here were made by me and have been extensively model-checked for a number of properties and shown to satisfy all of them, so these can be viewed as perhaps more canonical than the FSMs you'll find in the RFC documents, or in any number of security papers where they do maybe graph-guided search but don't really care whether the FSMs satisfy formal mathematical properties. For TCP, we use four properties.
D: Likewise, we also have four properties for DCCP. I don't think these are incredibly interesting properties, so I'll kind of skip them, but I'm happy to talk about them afterward if people have questions. We generate a number of attacks. This is kind of a confusing table, so I'm going to try to break it down into sub-points. Essentially, with TCP, there's only one property with which we are successfully able to generate attacks using any of our extracted finite state machines.
D: However, we get some false positives. So, for example, with the third property I wrote for DCCP, using the FSM that was extracted with our linear conditional random fields model, we generate 13 attacks, and all 13 are false positives. So again, there's sort of a noise-to-signal ratio thing here, where I'm generating more attacks with DCCP, but there are more errors.
D: I generate very few with TCP, but there are zero errors. Again, to me this seems to indicate more ambiguity, or more complexity, in the DCCP RFC, because errors in the attack synthesis step are caused by errors in the finite state machine. So this is fairly predictable from the results I showed you up until this point.
D: Okay. When I say the word "attack," it can mean any number of things, and people use this word rather liberally, so I want to give some examples of what I mean by an attack, so you know exactly what I'm saying. One attack would be to inject an ACK to the first peer in a TCP active-passive establishment routine in order to, if I remember correctly, get that peer stuck in SYN-RECEIVED, because it can non-deterministically decide to exit; but if it doesn't decide to exit,
D: for example, if it's configured to have a very long timeout, then it could stay there forever, or essentially forever. That may sound a little silly, but if you talk to Samuel Jero at Lincoln Labs, he'll tell you that he's found TCP implementations in the wild that have this bug, and in fact I found Stack Exchange queries, even Experts Exchange queries, from people who had this problem, where their TCP connections hung forever because they'd configured the timeout incorrectly.
D: Another example, which is much more complicated, and which we also find, would be to elaborately spoof each peer in a DCCP connection in order to guide the other one into CLOSEREQ. This is sort of a silly attack, because I'm not really sure what, like, an elite Russian hacker would want with this; however, it is a violation of a correctness property, because there's no active-active close in DCCP.
D: So if I can induce an active-active close, maybe I can cause a deadlock or something. And I want to point out this interesting point: because our approach is based on model checking, we'll find many, many variants of kind of the same attack, and indeed we find many variants of this attack; I can show you dozens of different ways to do essentially the same thing.
D: I want to conclude with future directions. The research I showed you today was presented at S&P, to a security audience, and there's kind of a certain set of things that interest them, in terms of breaking protocols; but I think you may have different interests, in terms of maybe building protocols that won't be broken in the first place.
D: So it would be nice if we could automatically highlight omissions and ambiguities in RFC text, and I think we're part of the way there, if we can kind of take our tool and make it say, "I'm kind of confused about how to interpret this piece of text; could you clarify it?" That might be a useful functionality for RFC authors, to be able to know when something they wrote is not entirely clear.
D: Another thing I'd like to do is automatically suggest bug fixes. So if I extract the finite state machine, and the RFC author looks at it and says, "yeah, actually, that looks correct, I think that's what I intended to describe," and then I find an attack against it, well, it would be nice if I could use some synthesis techniques from the formal methods community to automatically suggest a patch that would resolve that problem. We actually have an undergrad working on that problem with us right now, which is very exciting.
D: A completely different vein of research that we'd like to undertake is to extract logical properties. Right now, I'm writing those properties by hand. That's not easy: writing properties by hand is maybe not quite as hard as writing models by hand, but they're both much more difficult than they seem, and it's very easy to write a property that, for example, is vacuously true, and then you really can't do anything with it.
D: So there are a lot of logical subtleties in these things. We'd also like to offer support for secure protocols; there are a lot of kind of interesting problems there, like capturing the arithmetic that goes into the cryptography in such a protocol, and capturing secrecy requirements, and we might have to take a different approach in terms of the back end, for example, maybe relying on ProVerif or Tamarin instead of Korg.
D: Finally, I think it would be really nice if we could take all of these different ideas and turn them into some sort of cohesive software suite for an RFC-author-in-the-loop. So an RFC author, or authors, are writing an RFC; it would be nice if they could run our code and be told, "hey, this seems ambiguous."
D
"Maybe you could clarify this," and they could have a feedback loop to do so until they get an unambiguous piece of text. You could even imagine perhaps mixing formal language with the English, so that if the RFC author actually likes the way they describe something in English, they could just give little formal hints about how to change it. And that's it; with that, I'll take any questions. Thank you very much.
A
All right, thank you very much, excellent talk. I see a couple of people in the queue. Who's first? Yoshifumi Nishida.
E
...Web Services. So I think this is very interesting research, thank you so much. This might be part of a huge direction, but one of the complicated things for TCP is congestion control and loss recovery, and I'm wondering whether you can apply this technology to congestion control and loss recovery, and how promising that is. That's what I'm curious about.
D
Yeah, that's a fantastic question. My current project is congestion control; however, I have not been applying NLP to the problem. I've been manually modeling a congestion control algorithm of interest. I think that congestion control is pretty complicated, and it would be fantastic if we could partially automate the modeling, but I'm trying to get an intuition for the modeling to begin with, by doing it by hand, before we consider automating it. But I mean, you basically predicted the next year of my life, so yeah, that's what I'm doing. Thanks.
A
All right, thank you. Michael?
B
You said you analyze the text to get the finite state machines. Most of the transport RFCs contain a graphical representation of the finite state machine. So did you also consider using that, or is that too hard, or too weak, or...?
D
That's a great question. So I did write a little toy script that could read the ASCII diagrams in RFCs, and got it to work on quite a few of them, but there was nothing very scientific about it; I just wrote a little heuristic Python script. I've spoken to Maria about this, and Maria did the natural language processing, for the most part, on this project.
D
I did all the formal methods. Maria says that there is a field of NLP that deals specifically with diagrams and figures, and it could be applied to this type of problem, which is an interesting approach. I will tell you, however, that the FSM diagrams are missing a bunch of information. For example, if I recall correctly, I think that the TCP FSM diagram is missing a core component of the active close routine, and in the DCCP one...
D
I think that there was, again... oh, you know what, I think in DCCP there's a transition that's shown in the diagram but not in the text. There are a bunch of places where the diagrams are missing things. There's also this problem when protocols have multiple diagrams that have to be somehow merged. If you look at LTP, which is a pretty interesting protocol that's used in space exploration, it has, I think, seven or eight FSM diagrams in it that all interact in, I would say, mathematically ambiguous ways.
D
Even DCCP has multiple FSM diagrams, because it has one for the state changes and then another one for the communication. So diagrams are interesting. I'm not personally working on that problem; I do think it merits attention, but I don't think it is a golden ticket to automatically getting good FSMs from RFC documents, no.
D
Yeah, that's a good point. We mentioned these things in our paper, but I'd also be happy to summarize some of the differences in an email to whoever would find that interesting. So maybe offline you can help me figure out who would like to know about the things I found in the RFCs that maybe could be improved.
G
Great work, very interesting. I have just two quick questions. The first one: you've shown two different methods, one with linear, one with... I don't remember, sorry. Anyway, I wanted to ask whether they are wrong in the same way, or whether they could actually be used together to cross-validate each other and say, okay...
D
Yeah, that's a good question. I haven't thought about that; Maria probably has. Yeah, I'm sure that you could use them together. You could do some sort of intersectional approach, potentially, where you have a higher layer that takes as input both models and then finds where they agree. That sounds interesting. I haven't tried that, but I think it's a good idea.
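The "higher layer that finds where they agree" could be sketched as intersecting two transition tables. This is purely illustrative of the idea, not anything the speakers have built: keep the transitions both extraction methods agree on, and report disagreements for a human to review.

```python
# Sketch of cross-validating two independently extracted FSMs
# (e.g. outputs of two different NLP extraction methods).
# The model contents below are illustrative.

def cross_validate(model_a, model_b):
    """Split two transition tables into agreed and disputed entries."""
    agreed = {k: v for k, v in model_a.items() if model_b.get(k) == v}
    disputed = {k: (model_a.get(k), model_b.get(k))
                for k in set(model_a) | set(model_b)
                if model_a.get(k) != model_b.get(k)}
    return agreed, disputed
```

The agreed set would be the higher-confidence model; the disputed set is exactly what one would hand back to an RFC author.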
G
Thank you. And the second one is whether you are taking into account, not now but maybe in the future, the problem of the length of the fields, because sometimes the same transition is actually governed by some fields. It might be either too short or too long, well, usually too short, and that might also be a problem for the state machine.
D
That's a great question. I would say that prior work by my group, before I started my PhD, specifically by Samuel Jero, who, as I said, is at Lincoln Labs now working with Maria, applied similar techniques to figure out the packet structures of protocols using natural language processing applied to RFCs, and you could potentially combine these approaches.
D
So if you have one approach to find the packet structure and another to find the finite state machine, then you can go from just reasoning about your atomic propositions, like SYN and FIN, to actual full-fledged packets. That sets you up pretty nicely to be able to make arithmetic statements about the contents of those packets, such as their length. So that would be a next step. However, from a formal methods perspective it is complicated, because you get into a state space explosion problem, so you probably wouldn't want to use model checking.
H
So have you tried to actually do the work that you suggest in the future steps, for instance for TCP or for DCCP? Which would be essentially measuring how many amendments you need to actually make in the original RFC until you get to the point where you don't have any missing transitions anymore, and you don't have any false transitions anymore. Because it feels like, with that many errors, you would need to essentially completely rewrite everything.
D
Great question. I have two thoughts on that. First of all, we have not worked on this yet, because I think there's a lot to improve in our methodology before we try to improve the RFCs. I mean, just from an academic perspective: I understand that, as the IETF community, you are more interested in improving the RFCs, but we are academics working on trying to find cool ways to extract information from documents.
D
We're perhaps focused right now on improving our ways of extracting information from documents, and as you can see from the prediction error, there's a lot that we could improve on, by training on more documents and improving our model structure. That being said, I actually think that what we'll eventually find, when we do get to doing this experiment, is that the amount of changes necessary is not that great, I think.
D
For the most part, it would be things like: you have a sentence in which the target state is implicit, and you just add the words "to SYN-RECEIVED" to the end of the sentence, and now it's explicit. Essentially, taking things that to a human are implied and making them explicit, because I'd rather have a slightly redundant, boring spec that's unambiguous than one that's very cleverly written and I get confused by. So I actually do not think it would take that many changes in order to make it unambiguous. Good, thanks. Thank you.
A
All right, thank you. A reminder that we're using the Meetecho queue, so we can handle both the local and remote people in the queue. The last question, I think, is from Jonathan.
I
Yes, yeah. So I spend a lot of my time doing Tamarin modeling, so I'm interested to know why you're using FSMs, because almost all the protocols that I've come across cannot be expressed as an FSM. You need tokens, or they have state, so you're actually very limited, to maybe just TCP. I can't imagine what other ones don't have state.
D
Yeah, this is a fantastic question. So people who work in Tamarin or ProVerif or, dare I say, Cryptol, are interacting with mostly cryptographic protocols, or protocols that use cryptographic primitives, and they care about things like secrecy and privacy. And I care about those things too, but because they care about those things, they need to reason about threat models where secrecy or privacy or confidentiality, et cetera, could be violated, and about attacks against those things, and FSMs are, I would say, probably not the right approach for that.
D
So I think we agree on that front, and I think that maybe a more type-theoretic approach, like what Cryptol does, is quite good there. But there actually are quite a few protocols that have at least components that can be studied using FSMs, mostly communication protocols.
D
Examples include LTP, TCP, DCCP, SCTP, SFTP, FTP, and also some newer protocols I've been looking at that are not related to the IETF, which we'll release in future work. So I would say that many communication protocols that have handshakes, in which you establish some sort of connection, are very well described by FSMs.
D
But another aspect of your question is that we often will take a protocol and break it into chunks and say: this chunk can be studied by model checking, with an FSM representing that component of the protocol, but this other chunk might be best studied in some other way. So the cryptographic aspects are probably not suitable for model checking. Yeah, thanks for the question, though; great question.
A
All right, thank you. A really excellent talk, really excellent questions. There's a bunch of discussion in the chat as well, which you might want to look at in a couple of minutes. So thank you again, Max, for your talk.
A
All right, thank you. So our next speaker today is Jane Yen from the University of Southern California, who is going to talk about tools for disambiguating RFCs.
J
Yes, thank you. Okay. Okay, can everyone hear me? Okay, great. So hi, everyone. Today I want to talk about work I did with my advisors, Barath Raghavan and Ramesh Govindan. The topic is tools for disambiguating RFCs.
J
It also indicates that great human effort is involved. Human effort is involved in various parts of the specification production process. As an example, here is a working group: they need to gather and discuss which and/or what protocol needs to be standardized, and there will be one or more specification authors composing the contents of the spec. Although it might be obvious, the content is full of domain-specific knowledge and requires professionals to verify it.
J
When we speak of concerns about the quality of a spec, it is hard to neglect the impact of ambiguities. With the existence of ambiguities in a spec, different protocol implementers could interpret the content differently and generate multiple versions of the protocol. Here's one example which might cause different interpretations.
J
At first glance, it might not be obvious why the sentence is ambiguous. However, when we focus on the last part of this sentence, it explicitly mentions that the checksum starts with the ICMP Type, but it doesn't mention where the checksum computation should end. So how could this affect a person's interpretation?
J
Here I show the ICMP header to help illustrate how we can come up with at least two different interpretations. One is to checksum only the ICMP header part, and the other is to checksum both the header and the payload. As shown, the two different interpretations would lead to different protocol implementations.
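The two readings can be made concrete. The sketch below implements the standard Internet checksum (the RFC 1071 style one's-complement sum) and computes it both over the ICMP header alone and over header plus payload; the results differ, so implementations following the two interpretations would not interoperate. The sample header bytes are illustrative.

```python
def internet_checksum(data: bytes) -> int:
    """RFC 1071-style 16-bit one's-complement checksum."""
    if len(data) % 2:          # pad odd-length data with a zero byte
        data += b"\x00"
    total = 0
    for i in range(0, len(data), 2):
        total += (data[i] << 8) | data[i + 1]
    while total >> 16:         # fold carries back into 16 bits
        total = (total & 0xFFFF) + (total >> 16)
    return ~total & 0xFFFF

# Illustrative ICMP echo request: type=8, code=0, checksum=0, id=1, seq=1
header = bytes([8, 0, 0, 0, 0, 1, 0, 1])
payload = b"ab"

header_only = internet_checksum(header)
header_and_payload = internet_checksum(header + payload)
```

The two interpretations yield different checksum values, so a sender computing one and a receiver computing the other would reject each other's packets.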
J
Motivated by this question, we presented our work at last year's SIGCOMM and at ANRW. In the following, I will introduce what they are about, respectively.
J
The first one is Sage. In this work, we get to know what ambiguity exists in a number of long-standing protocols. Specifically, we uncovered five instances of ambiguity and six instances of under-specification in the ICMP RFC. With an unambiguous specification, we are able to generate executable code that can interoperate with third-party code, and these results generalize to significant sections of BFD, IGMP, and NTP.
J
So in Sage we apply natural language processing techniques to discover the existence of ambiguities. To do so, we need to understand the semantics of the specification, which is commonly termed semantic parsing, and ultimately we want to leverage the results from semantic parsing to generate low-level executable protocol code.
J
A complete RFC parsing workflow will look like this. In addition to the three components already introduced, a human will be involved in the loop after the disambiguation phase. If more than one representation remains, it means we have discovered ambiguities, so we provide feedback to the user, who is likely the specification author, to resolve the ambiguity by rewriting the sentences.
J
The purpose of doing semantic parsing is to derive a semantic representation. Therefore, what we can expect is to take in a textual sentence and turn it into a chosen intermediate representation, and in our work we use a semantic representation called a logical form. A simple logical form is composed of a predicate with a number of arguments.
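As a toy illustration of what a predicate-with-arguments logical form might look like (the shape here is illustrative, not Sage's actual intermediate representation), a sentence like "set the Type field to 8" could parse into a single predicate carrying its arguments:

```python
from collections import namedtuple

# Toy logical form: one predicate plus named arguments.
# Both the representation and the "parser" below are illustrative.
LogicalForm = namedtuple("LogicalForm", ["predicate", "args"])

def parse_assignment(sentence):
    """Very naive 'semantic parser' for sentences of the form
    'set the <field> field to <value>'."""
    words = sentence.lower().split()
    field = words[words.index("the") + 1]
    value = words[words.index("to") + 1]
    return LogicalForm("assign", {"field": field, "value": int(value)})
```

A real semantic parser handles far more sentence shapes, but the output it targets is of this flavor: a machine-checkable predicate rather than free text.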
J
For example, the NLP parser wouldn't recognize all the nouns or noun phrases, so in this work we use a generic parser called spaCy, and we extend spaCy's term dictionary with networking terms such as "one's complement". As for the semantic parts, a specification might include some idiomatic usage; for example, the equals sign in "0 = echo reply" is not syntax that can be processed by the generic parser.
J
I'm only giving you an idea of what kind of evaluation we've done in Sage. We applied Sage to the ICMP RFC, used generic Linux ping and traceroute on the sender side, and used the packet formulation function to generate a reply packet back to the sender. The tested ping and traceroute programs were able to process our automatically generated reply packets.
J
While Sage takes a significant step towards automatic specification processing, much work remains. In an RFC there are many components: Sage is able to parse packet formats and pseudocode, but there remain other components it doesn't yet support, such as state machines, communication patterns, and architecture.
J
In addition to the limitations shown, there are other challenges that can also be considered from a code generation perspective. Think of Sage not only as a method to understand the level of ambiguity, but also as a method to automate code generation; there is a lot of room to automate the code generation part. For example, Sage mostly parses sentence by sentence to determine a sentence's ambiguity, and generates code according to the order of the descriptions. However, it's unrealistic to assume that all specifications use self-contained sentences every time.
J
Another challenge could be discovering mismatched or mis-captured behavior. For example, we have both text and syntactical components, which might be a diagram. What if the information in the diagram is not consistent with the textual description, or what if a piece of information is illustrated in a diagram but there is no textual explanation?
J
Supposing that we are able to process a single RFC already, what about a protocol describing its functionalities or constraints across multiple RFCs? In some cases a protocol gets revised, and more implementation constraints are added with the publication of another RFC. How do we aggregate the information from multiple RFCs?
J
Then there's another type of challenge, which is aggregating the information of multiple protocols instead of one protocol. In practice, we send packets with a stack of protocols. When we want to generate the code for a stack of protocols, how do we correctly organize and categorize the information from diverse RFCs?
J
That means we will know the protocol implementation should be logically correct, but not how its performance will be. If the spec author suggests any performance-oriented implementation, will we be able to differentiate that, and leave the implementation details as the flexibility, or the freedom, of the code implementer?
J
I have shown this figure previously. It illustrates how we process an RFC document and involve a feedback loop to indicate to the human user where the ambiguity exists. I would like to point out that there is still room to improve the human effort involved, such as how to assist in editing ambiguous sentences, or how to extend Sage to support more protocols.
J
This leads to another two challenges we need to deal with. Considering how we can assist in editing ambiguous sentences, we might also want to ask: can we avoid writing an ambiguous sentence in the first place? As for the other challenge, we have only talked about a subset of specification components; what other protocols are we going to support?
J
As for the second challenge, we are aware that stateful protocols require considerably more complicated operations, and many protocols belong to this group. Unlike stateless protocols, a stateful protocol is required to keep internal state to interpret packet contents given the current state. The ability to keep historical information could allow us to cover significantly more protocols.
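The difference can be seen in a few lines. A stateless responder maps each packet to a reply on its own, while a stateful one must consult stored per-connection state to interpret the very same event. The sketch below is a minimal illustration (not Sage output; the event and state names are made up) of the extra machinery statefulness implies.

```python
# Minimal contrast between a stateless and a stateful handler.

def stateless_reply(packet):
    """An ICMP-echo-like responder: the reply depends only on the packet."""
    return {"type": "echo_reply", "payload": packet["payload"]}

class StatefulHandler:
    """A handshake-like responder: interpreting 'ack' requires the
    history kept in self.state."""
    def __init__(self):
        self.state = "LISTEN"

    def handle(self, event):
        if self.state == "LISTEN" and event == "syn":
            self.state = "SYN_RCVD"
            return "syn_ack"
        if self.state == "SYN_RCVD" and event == "ack":
            self.state = "ESTABLISHED"
            return None
        return "rst"   # the event is meaningless in the current state
```

The same "ack" event produces different behavior depending on stored history, which is exactly what a code generator has to model once it moves beyond stateless protocols.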
J
As one of the options, we might design interfaces similar to Microsoft MakeCode, which leverages a web interface and puzzle-like shapes to generate executable code. But for us, the output becomes English sentences: once we design a parser and parse the information received from the user interface, we leverage that information to compose both easy-to-read English sentences and executable code.
A
One question from me. Both the mechanisms you've developed and the mechanisms in the previous talk are based around natural language processing and so on. How expensive are they to run? What was the sort of runtime for these models?
J
The runtime is long. The ICMP RFC contains seven... oh sorry, I forgot the exact number of messages, but processing that whole RFC runs for about five minutes.
J
Yes. Part of the complexity is due to the NLP parser, which has to analyze how many different ways there are to generate the semantic representation. So, for example, if you have a noun phrase that can be interpreted in multiple ways, then all of them have to get enumerated to generate the representation, and that takes the majority of the time.
F
In the room? Yes, I am, thanks. Well, okay, maybe I'm sidetracking a little bit with a comment directed towards Colin. Colin, you said, well, okay, we are used to doing the specifications in English prose. Well, okay, sometimes we enhance the specifications with some formal stuff.
J
Okay, so for the ASCII art parts, we didn't use the NLP tool to parse that. Instead, we parse the ASCII art of the packet header by writing another small program, and it can automatically extract the structure.
J
The packet header structure, so that's one part of it. And I just want to address another perspective: our current work is trying to extract the essential information out of it, and by looking at that essential information, we are thinking not only of generating the English text or the executable code. We are also thinking of using that information to generate the diagram automatically.
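A small program of the kind described can be surprisingly short. In RFC packet-header diagrams each bit column occupies two characters, so a field of n bits sits between pipes 2n - 1 characters apart. The sketch below is an illustration, not the actual Sage code: it recovers field names and widths from one row of such a diagram.

```python
def parse_header_row(row: str):
    """Parse one '|'-delimited row of an RFC packet-header ASCII diagram.
    Each bit column is two characters wide, so a field of n bits spans
    2*n - 1 characters between its pipes."""
    fields = []
    for segment in row.strip().strip("|").split("|"):
        fields.append((segment.strip(), (len(segment) + 1) // 2))
    return fields

# Illustrative row modeled on the first line of the ICMP header diagram:
row = ("|" + "Type".center(15) + "|" + "Code".center(15)
       + "|" + "Checksum".center(31) + "|")
```

Running the parser on that row recovers the three fields with widths 8, 8, and 16 bits; a fuller parser would also stitch together multi-row fields and the bit-ruler line above the diagram.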
I
Hi, and that was a really interesting talk, thank you. I was just sitting there thinking: wouldn't it be easier for us to formally specify protocols and then use some tool with NLP to generate the prose spec? Then we can be sure that the prose spec actually implements the formal specification, and then we can check the formal specification.
J
Yeah, so I agree that is also one way to do it. The thing is, we have to fill in the gap of how to convert between the formal specification language and the English text, so that part is still missing. And I think there's one problem here: there are too many different kinds of intermediate representations, so a formal specification language is sort of like another intermediate representation.
J
You can always choose different ones and then try to fill in the gap from the English text to that representation. So I would say that is totally doable, and actually a good way to go. But our first work, Sage, just directly chose a simpler way, as a first step, to analyze whether we can find some ambiguities. But for further specification...
J
I agree that writing in a formal specification language and then generating the English text might also be a very doable way, but I think we would start from a smaller scope, from the state machine, with some simpler language, and just generate very crude, simple English text. We have to see whether that is a good approach first; otherwise we are not sure whether we should go from the formal specification language to the English text. Because a formal specification language, I believe, takes more: you need to take some time to learn it, to pick it up, and then to do those things.
A
A
All
right,
thank
you.
The
next
speaker
is
chris
wood.
K
Yeah, hello. Trying to figure out how to control this remotely. All right.
K
All right, thanks. All right, everyone, my name is Chris Wood. I do some work in a number of places in the IETF, in particular in the CFRG; as of recently, I guess, I help out producing technical specifications for that particular group. And what I'm going to try to talk to you about is, I think, a problem that has been emerging in that particular group, and not only in the CFRG, it exists elsewhere, and it's on this topic of, you know, how...
K
How can we improve the quality of specifications in the CFRG and elsewhere in the IETF and IRTF? And I should say, first and foremost, that this is very much based on my own personal experience writing these things, trying to communicate technical things in a way that's clear and concise to people.
K
I've not talked about this with many other people; I've not asked for a review of the slides. So, you know, if you disagree, that's cool, we should talk about it, and this does not reflect consensus of the CFRG in any way, shape or form. Also, I'm not trying to point fingers or blame people; in fact, I work on a lot of documents, so I should just be pointing the fingers at myself.
K
Okay, so I'm going to start with what I think is sort of the problem, or maybe it's not a problem, depending on your perspective, but I'm going to refer to it as a problem.
K
So in theory, the CFRG is chartered to do a lot of things. It's a pretty broad charter. You could summarize it in the sentence that I have quoted here.
K
It's meant to effectively bridge things that happen in academia, in theory, and bring them into practice: documenting things that are written down in cryptography-related papers, published in conferences and whatnot, and presenting them in a way that people can actually use, and to do so via what are referred to as informational RFCs.
K
An informational RFC just contains information; it does not go through the same sort of editorial, or rather review, process, with the same sort of rigor, that other things in the IETF and elsewhere go through. So the bar to publication of an informational RFC has historically been lower than that of a standards-track or proposed-standard document.
K
That's kind of a problem, because in practice the CFRG's specifications, I think, demand a much, much higher bar, given what they're actually being used for. You could go back and look at many of the specifications or RFCs that came from the CFRG, and look at how often they're used and referenced in practice, and it should be very clear that these are very, very important documents: things like HMAC, HKDF, Curve25519, EdDSA, and HPKE, which is a recent one.
K
Moreover, these specifications that are developed have a huge audience. They are meant to be useful to people who are designing protocols and want to be able to consume these cryptographic objects. They're also useful to people reviewing these particular specifications to make sure they're consistent with things that have been analyzed and proved in practice, and there are probably lots of other audiences as well.
K
But the gist is that the stakes are very high, I think, for the things that the CFRG does, despite what the charter lists, so the quality of its outputs is critical. And so...
K
I'm going to be trying to focus on specification quality here. What do I mean by specification quality? I claim that there are at least three fundamental aspects of a technical specification, the first of which is the functional piece of it.
K
What does this thing do? And the CFRG is actually quite nice in that, typically, the things that it works on are much, much simpler compared to the protocols that the IETF standardizes. It was only recently, I guess, that we actually started looking at real protocols that involve sending messages between different parties. Historically, it's just been, you know, specifying an algorithm that has some steps, an API, a syntax, and stuff like that.
K
The second part is what I refer to as the syntax specification: what does its API look like? How would I interact with it? If I were to implement this in a type-safe language, what would that API look like, and how might I enforce it? And the third component is the actual implementation: how would I write code to implement this particular algorithm, and to do so correctly, keeping in mind the different audiences that were listed earlier?
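As a tiny illustration of the "syntax" layer, the sketch below uses Python's standard hmac module as a stand-in for a CFRG primitive. The point is the API shape, not the primitive: verification takes the same inputs as signing and returns a boolean, with constant-time comparison handled internally rather than left to the caller.

```python
import hmac
import hashlib

# Illustrative "easy to use, hard to misuse" wrapper around a MAC.
# hmac and hashlib are Python stdlib; the API shape is the point.

def mac_sign(key: bytes, message: bytes) -> bytes:
    """Produce a tag over the message under the given key."""
    return hmac.new(key, message, hashlib.sha256).digest()

def mac_verify(key: bytes, message: bytes, tag: bytes) -> bool:
    """Check a tag. compare_digest avoids the timing side channel
    that a naive '==' comparison would invite."""
    return hmac.compare_digest(mac_sign(key, message), tag)
```

A spec that pins down this kind of interface, and not just the internal algorithm steps, makes the correct use the obvious one for both implementers and protocol designers.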
K
It's important to make sure that each of these different components of a specification is tailored to the relevant audience. So, for example, the implementation specification should be written such that implementers can actually use it and implement the thing correctly, whereas the functional piece of a particular specification needs to be written such that people who want to just figure out what this thing is doing can do so without having to understand the implementation details. You shouldn't have to be a cryptographer to understand what an RSA signature is, or whatever.
K
These are simple concepts, and people building these protocols in the IETF should not have to be experts to use them. They should be easy to use, hard to misuse, that sort of thing, blah blah blah. Okay. And to figure out whether or not the specification, or the different components of a specification, satisfy these goals, there are questions you can ask yourself as an editor or an author of a document.
K
I think the first and foremost one that comes to mind for me is: is the specification easy to understand and use? Can I just read it and figure out what the thing is doing? Can I reason about the behavior of the cryptographic object? Could I reasonably think about what the API might look like, and what the exceptional cases might be if I try to hold this thing the wrong way?
K
The second most important question is: if I'm reading this thing and I were to try to do a faithful implementation of it, would I produce a consistent and correct implementation that could interoperate with other people's implementations? And that depends on, you know, the behavior being well defined; it depends on the implementation description being clear, and so on. And lots of other questions, I'm sure, would come up to try to figure out whether or not you're on the right track.
K
So I want to look at a couple of examples of what the CFRG has done, and try to reason about these two high-level questions: is the specification easy to use and understand, and will it yield consistent and correct implementations? And I picked on an RFC that gets picked on a lot: RFC 8032, for EdDSA.
K
It has this one component of the draft that specifies how you verify digital signatures. You don't need to read this.
K
In fact, I encourage you not to read it, because you might get bored. But I will just note that in the final piece of this particular body of text, they have something like: you could do, or we recommend you do, this one thing for correctness and security and safety, but maybe, if you're feeling up for it, maybe you do this other thing instead. And it turns out that has been the cause of a tremendous amount of pain in practice. It's been the cause of security problems in practice.
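The flavor of the problem can be shown with a deliberately toy example (small numbers standing in for the group order; this is not real EdDSA): if one implementation rejects a scalar that is not in canonical range while another silently reduces it, the two disagree about whether the very same signature bytes are valid.

```python
# Toy model of divergent signature validation. L plays the role of the
# group order; the "signature" is just a scalar. Not real cryptography.
L = 13

def strict_valid(s, expected):
    """Reject non-canonical encodings outright: s must lie in [0, L)."""
    return 0 <= s < L and s == expected

def lenient_valid(s, expected):
    """Silently reduce s mod L before comparing."""
    return s % L == expected
```

Both verifiers accept a canonical signature, but a non-canonical encoding of the same value splits them, which is exactly the "people just don't agree on what signatures are valid" situation described next.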
K
People just don't agree on what signatures are valid, and that's probably not a desirable property for a digital signature scheme. So I would claim, to answer the first question: yeah, the specification is easy to use and understand; it's actually a really well written technical specification. But was it written in such a way that it yields consistent and correct implementations? Unfortunately not. And that's just my own personal opinion; like I said, if you disagree with any of this, that's fine, we can talk about it.
K
Okay, another example is actually an ongoing specification that's being developed right now, this thing called OPAQUE. Without describing what these terms are, OPAQUE is basically a compiler for taking a bunch of different cryptographic algorithms, gluing them together, and the result is a password-authenticated key exchange protocol. Conceptually, once you understand the object and understand the protocol, it's pretty easy to think about. Unfortunately, to actually understand the spec and implement it correctly...
K
You have to understand and be familiar with all these other concepts: what the authenticated key exchange thing is, what this VOPRF thing is, what hash-to-curve means, what a prime-order group means. And there are other relevant specifications, not listed here, that are likely also in this dependency tree and necessary to understand OPAQUE. And so, to answer the two questions, I would say that this specification is not easy to understand and use, though it will yield correct and consistent implementations.
K
There are test vectors, and people have done it, so maybe that check is a bit biased; or maybe there should not be a check there, because it's still an ongoing draft, so we don't know. But the takeaway is that this is an incredibly complicated document. It could potentially be much simpler if you tried to change what features are supported, and whatnot, but it's not.
K
Okay, so I think the core problem here is that the nature of the work that the CFRG does is highly technical, and it has to be consumable by different audiences, with different levels of expertise and different purposes for consuming that technical specification. An implementer just wants to know how to implement the thing; the protocol designer wants to know how to use the thing. And the problem is that this is a hard problem, so there's no way around it other than just acknowledging that this is a hard problem, like writing.
K
A
technical
specification
that
takes
all
of
these
boxes
in
such
a
way
that
it's
maximally,
useful,
easy
to
implement
and
correct
and
consistent
is,
is
not
an
easy
thing
to
do,
and
we've
been
doing
it
for
trying
to
do
it
for
many
many
years.
I
do
think
we're
improving,
but
I
think
there's
more
that
can
be
done
and
trying
to
shine
a
light
on.
You
know,
potentially
more
weight,
more
things
that
can
be
done.
K
So
I
wanna,
I
wanna,
call
out
one
particular
draft
that
is
going
through
the
process
right
now
to
kind
of
showcase.
How
we
chose
to
you
know,
structure
the
specification
in
such
a
way
that
it
does
try
to
tick
some
of
these
boxes
in
what
I
consider
to
be
a
useful
way
for
people
who
are
consuming
the
spec,
so
this
draft
called
hash
to
curve.
K
It
is
again
without
going
into
any
of
the
math
details.
It's
basically
just
like
a
hash
function
or
it's
a
specification
that
describes
a
hash
function
that
takes
as
input
arbitrary
data
a
string
and
produces
a
point
on
a
curve,
and
that's
basically
it
at
the
highest
of
levels.
It's
a
very
simple
interface.
K
It
can
be
summarized
in
this
one
picture
here:
it's
internal
steps
are
like
quite
simple
and
and
and
and
pretty
easy
to.
I
guess
like
if,
if
you
don't
want
to
know
more,
it's
it's
easy
to
figure
out.
I
guess
what
what
the
I
guess
high
level
functionality
is
basically
from
this.
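The high-level hash-to-curve flow can be sketched in a few lines of Python. The step names below follow the published draft's structure (hash_to_field, map_to_curve, clear_cofactor), but everything else, the modulus, the maps, the "points", is a toy stand-in invented here for illustration, not the draft's actual math:

```python
import hashlib

# Toy sketch of the hash_to_curve pipeline. The step names follow the
# hash-to-curve draft, but the field, the map, and the "points" below are
# invented stand-ins (integers mod a prime), not the draft's real math.
P = 2**61 - 1  # prime modulus for the toy field/group

def hash_to_field(msg: bytes, count: int) -> list:
    """Derive `count` field elements from msg (toy: SHA-256 with a counter)."""
    return [
        int.from_bytes(hashlib.sha256(msg + bytes([i])).digest(), "big") % P
        for i in range(count)
    ]

def map_to_curve(u: int) -> int:
    """Map a field element to a 'point' (stand-in for a map like Elligator 2)."""
    return (u * u) % P

def clear_cofactor(r: int) -> int:
    """Toy cofactor clearing: multiply by a small constant cofactor."""
    return (8 * r) % P

def hash_to_curve(msg: bytes) -> int:
    u = hash_to_field(msg, 2)   # 1. derive two field elements
    q0 = map_to_curve(u[0])     # 2. map the first to a point
    q1 = map_to_curve(u[1])     # 3. map the second to a point
    r = (q0 + q1) % P           # 4. "add" the two points
    return clear_cofactor(r)    # 5. clear the cofactor

# Deterministic: the same input always lands on the same "point".
assert hash_to_curve(b"hello") == hash_to_curve(b"hello")
```

The point here is only the shape of the pipeline: derive field elements, map each to a point, combine them, clear the cofactor.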
K
So
as
I
go
through
next
couple,
slides
just
I
guess,
keep
in
mind
and
ask
yourself,
you
know
is
like
based
on
the
limited
set
of
information
I'm
going
to
present
like
is
the
specification,
easy
to
use
and
understand,
and
will
it
you'll
consistent
in
correct
implementations?
K
Okay,
so,
let's
consider
the
the
functional
description
of
this
particular
specification.
First,
that
is
what
the
part
of
the
specification
that's
written
for
the
people
who
just
want
to
understand
what
hashing
to
curve
means.
K
So
I'm,
if
I'm
reading
this
specification
for
the
first
time
I
might
go
to
the
part
of
the
spec
that
describes
like
what
this
functionality
is-
and
I
come
across
this.
This
blurb
here
that
describes
hash
to
curve
with
some
input,
produces
an
output
and
there's
some
there's
just
little.
There's
these
few
steps
here.
K
Okay,
but
maybe
I
want
to
know
like
what
these
steps
are,
helpfully
there's
like.
Similarly
simple
descriptions
in
the
draft
that
describe
each
of
these
different
steps
without
any
of
the
implementation
details
like
you,
don't
need
to
understand
them
to
figure
out
like
what
this
particular
function
is
doing
like
this.
This
is
sufficient.
I
would
claim
for
understanding
what,
like
the
function
of
hash
to
curve
is
doing,
and
I
think
that's
like
a
good
balance
for
this
particular
audience.
K
K
Now say I'm the implementer. I see there's this thing called map_to_curve as one of the steps that I have to invoke in an implementation of hash_to_curve. What is this map_to_curve thing supposed to do?
K
There's
specific
sessions
that
describe
what
the
high
level
syntax
and
api
for
hash
to
curve
would
be
for
different,
like
curves
that
you
might
target.
I
just
talked
about
one,
this
alligator
two
method
for
one
map
to
curve
thing.
It
describes
basically
all
of
the
parameters
that
you
might
need
to
configure
for
this
particular
function.
What
are
the
constants
that
are
relevant
for
this,
like
particular
function?
K
What
are
the
exceptional
cases
that
might
you
might
have
to
specify
when
thinking
about
an
api
and
like
this
there's
this
other
thing
called
sign
of
the
output,
but
don't
worry
about
that
and
I've
truncated,
some
of
the
like
text
that
follows
because,
from
a
like
an
api
perspective,
this
is
kind
of.
I
would
say
mostly
what
you
need.
Okay,
now
you're
now,
let's
say
you're
like
you're
the
implementer,
and
you
want
to
actually
like
try
to
implement
this
thing,
to
make
sure
that
you
know
it.
K
I
I
can
use
it
for
some
protocol.
It
interoperates
with
some
other
implementation
again
start
on
the
left.
Go
to
this
map
to
curve
thing,
and
then
I
follow
the
pointers
down
to
like
the
actual
implementation
description
of
alligator
2
in
this
particular
case,
and
it
goes
into
excruciating
detail
about
every
single
step.
K
That
needs
to
be
done
with
like
comments
as
necessary
for
like
the
different
parts
of
the
pseudo
code,
noting
like,
for
example,
on
line
three
like
what
in
exceptional
cases
that
was
like
described
earlier
in
the
syntax
and
the
api,
and
this
seems-
and
moreover,
it's
not
it's
not
actually
included
in
this
back.
K
But
this
is
almost
there's
almost
a
one-to-one
mapping
between
this
implementation
description
as
written
here
and
the
reference
implementation
that
we
provided
in
sage,
not
the
sage
that
was
talked
about
in
the
previous
presentation
and
so
with
the
ultimate
purpose.
To
be
like
each
step
along
the
way,
minimizing
potential
source
of
errors
and
interrupt
bugs
or
whatever.
K
Coming back to the original questions: if you were to approach this document and ask yourself, is it easy to use and understand, and will it yield consistent and correct implementations? I'm not going to put checks or X's or question marks next to them; I'll leave it to you to decide as an interested reader. But I offer it to you as a potential way to think about how we might structure these particular documents. I think this one went through.
K
So I may have just implicitly answered these things, but anyway. Okay, drawing on hash-to-curve, I think there are a number of things we can do going forward to improve the overall output of the things that we're doing in the CFRG. I'm going to enumerate just a couple of high-level things, without drilling into any specific details, because a lot of it will vary based on the thing that's being specified, how involved it is, whether it's a protocol or not, and so on.
K
So maybe that means, or rather, that does mean, being consistent where consistency matters. If you're using pseudocode to describe different implementations of different things in the specification, use the same format of pseudocode at every single step of the way; don't introduce one format for section A and then use something entirely different in the appendix. Consistency everywhere will help and will go a long way. And, as hash-to-curve does, I claim that any time you are using pseudocode, you should try to make it as close as possible to the actual reference implementation that you are providing with your specification; and we should require reference implementations for these things. But there's been a lot of interesting discussion in various venues about what this reference implementation format should be. Should it be Sage? Should it be Rust? Should it be something else?
K
And on the heels of that, I would say that consistency beyond format, consistency in concepts, is incredibly important, primarily because we want to reduce the cognitive load of the people who are reading these specifications. So not only reusing formats, but reusing concepts, within a draft itself but also across drafts.
K
So
if
we're
talking
about
say
elliptic
curves
in
one
draft
and
groups
or
prime
order
groups
in
another
draft,
that
could
be
confusing
to
people
because
sometimes
they're
used
interchangeably,
even
though
they
shouldn't,
but
sometimes
they
are
so
being
consistency
where
you
being
consistent
where
you
can
is
incredibly
important
using
a
shared
vocabulary
for
the
things
that
we're
talking
about
is
incredibly
important
and
I
think,
for
the
most
part,
the
the
some
of
the
documents
that
the
cfrg
is
working
on
right
now
are
like
certainly
they're
internally
consistent.
K
I think the review process is such that a document gets a tremendous amount of review in the group before it goes out; then it goes through the formal review panel step, and then there are a number of other steps, but I don't remember what they are. I think we have reached a level of internal consistency, but external consistency across drafts could be improved, and that's a hard problem, because things and concepts change over time. Maybe we used to talk about elliptic curves as the best thing since sliced bread, but now we mostly talk about prime-order groups instead of elliptic curves; we want to shy away from elliptic curves in general. So maintaining consistency over time is difficult, but I think, for batches of drafts that are being developed at around the same time, it's not an unreasonable request and something we could do.
K
There is a tremendous amount of work on reference, informally verified, and formally verified implementations, especially of cryptographic algorithms, like hacspec and HACL* and whatnot. Whether or not they're suitable for the permanent nature of an RFC is not immediately clear to me. But all of this work basically lets someone take one single implementation and produce test vectors from it, produce other implementations in different languages that they can test against, and sometimes produce optimized implementations as well that are suitable to drop into production systems.
K
Those
those
are
those
are
good
things
and
it
removes
any
like
choice.
Chance
of
potential
error
from
someone
implementing
these
things
themselves,
trying
to
you
know,
interpret
an
implementation,
description
or
syntax
description
or
whatever
and
potentially
getting
it
wrong.
I
I
was
alluding
to
earlier.
It's
like
still
not
clear
what
it's
like
the
right
format
for
this
this
these,
like
reference
implementations,
be
it
like
something
based
in
python.
K
You
know
that
a
dsa
rc8032
uses
python
to
describe
some
of
the
relevant
steps.
More
recently,
things
have
been
using
a
lot
of
sage.
K
I
guess
you
can
call
it
a
library,
a
library
of
like
the
the
specs
that
are
being
developed.
The
building
blocks
and
and
drafts
are
like
building
on
these
building
blocks
and
adding
more
building
blocks
and
and
and
whatnot.
There's
no
reason
you
couldn't
do
this
in
rust
or
hacks
back
or
whatever.
I
think
it
it'd
be
interesting
to
experiment
with
different
reference
implementations
and
see
what
is
most
approachable
from
an
implemented
implementer's
and
what
is
what
is
good
for
really
the
you
know:
inclusion
in
an
rc.
K
Okay,
so
I
want
to
wrap
up
and
lose
some
time
for
questions,
so
I
my
core
point
here
is
that,
like
the
work
that
the
cfrg
does
again
is
incredibly
important,
we
don't
call
them
standards,
they're,
not
technically
standards,
but
I
think
if
it,
you
know
walks
like
a
duck
and
quacks
like
a
duck.
It
is
a
duck.
These
things
should
be
considered
standards.
K
In
my
opinion,
they
are
incredibly
valuable
and
important
to
what
the
ietf
and
irtf
is
doing,
and
I
think
we
should
treat
them
as
such
there's
definitely
a
lot
more
work
that
we
can
be
doing
both
within
the
group,
even
in
the
rtf,
for
you
know
producing
tooling,
like
sage
or
the
name
that
was
in
the
first
one.
I
apologize
to
improve
the
the
quality
of
these
specifications.
K
There's
like
ways
that
document
editors
can.
You
know
right
now
start
improving
the
quality
of
their
specifications
through,
like
better
pros,
better,
better
use
of
terminology,
better,
more
consistency.
What
have
you-
and
I
think
it's
about
time-
that
we
really
took
the
question
of
or
the
the
task
of
exploring
formal
method
approaches
for
cfrg
specifications
seriously?
And
I
don't
know
I
don't
know
what
that
looks
like
in
practice
if
it's
like
a
task
force
that
go
off
and
like
try
to
you
know,
work
through
these
different
options.
K
If
it's
a
group
in
the
rtf
to
sort
of
explore
this
area,
I
think
you
know
anything
is
a
reasonable
start
here,
but
I
think
it's
given
all
the
work
that's
been
done.
I
think
it's
it's
now.
It's
now
time
to
start
actually
pulling
it
into
the
cfrg
with
that,
I
guess
I'll
I'll
close
for
questions.
If
you
have
any.
A
Thank you very much, Chris. Jonathan?
I
So
I
was
just
thinking
if
we
are
going
to
have
reference,
implementations
and
pseudo
code,
which
has
precedence
right
if
the
reference
implementation
doesn't
agree
with,
what's
written
in
the
spec,
which
one
do
we
consider
correct,
do
we
have
to
decide
on
a
case-by-case
basis,.
K
My
personal
opinion
is
that
we
should
have
reference
implementations
in
no
pseudo
code,
but,
like
the
that's
just
not
the
way,
specs
are
published
right
now
we
have
pseudocode
to
describe
the
things
and
reference
implementations
sort
of
hang
off
the
side
of
them.
A
All
right,
thank
you,
thomas.
L
K
I don't think so; for that, I... I don't think so. The topic of whether the CFRG should be in the IRTF versus the IETF has come up before. My general stance has been: if it's not broke, don't fix it, from an IRTF-versus-IETF perspective. But I can't see any harm from rechartering to potentially, you know, explicitly include non-informational documents.
A
I
mean
there
are
procedural
issues
there,
but
in
practice
we
cfog
should
be
producing
as
good
equality
documents
as
it
can
and
yeah.
We
can
have
the
arguments
about
whether
it
should
be
an
atf
or
an
ircf
group.
Later.
A
M
I guess I'll go. Shivan Sahib, from Brave. This goes to a discussion that we were having on Slack, Chris, but security proofs: do you think there's a conflict between the proof and implementation complexity or simplicity? And then what should you do? Maybe you can say in the draft, go and look at this other place where we did something slightly different and then proved it. Yeah, just wondering what you think.
K
I
think
anecdotally,
like
the
security
proof,
should
trump
everything
if
the
implementation
doesn't
match.
There
should
be
like
very
clear
reason
as
to
why
it
doesn't
match,
perhaps
with
like
justification
as
to
why
it
would
deviate
from
what
has
been
analyzed
in
practice,
but
I
think
we
should
strive
very
hard
to
keep
like
all
these
things
consistent,
and
I
would
include
like
the
the
the
security
model
and
the
analysis.
That's
been
done
as
part
of
like
the
description
of
the
specification,
so
yeah
keep
everything
consistent.
G
I have mixed feelings about reference implementations, because I fear that for some cryptographic algorithms, having a reference implementation would lead to more errors than a carefully written one, meaning timing errors or stuff like that, possible timing attacks. I mean, somebody might just grab the reference implementation, transform it into another language more or less mechanically, not use their brain, and go on.
G
I
would
love
to
have,
however,
some
test
cases
like
if
these
numbers
are
there.
You
must
reach
this
point,
because
I
fear
that
in
some
cases
this
could
be
more
useful
to
debug
or
test
case.
Your
particular
implementation.
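A minimal sketch of what such known-answer test cases might look like in practice; the algorithm under test and its vectors here are just SHA-256 and its well-known digests, chosen only to show the shape of the check, not anything from a CFRG draft:

```python
import hashlib

def my_hash(data: bytes) -> str:
    """The implementation under test (here simply SHA-256)."""
    return hashlib.sha256(data).hexdigest()

# (input, expected output) pairs, as a spec appendix might list them.
TEST_VECTORS = [
    (b"", "e3b0c44298fc1c149afbf4c8996fb92427ae41e4649b934ca495991b7852b855"),
    (b"abc", "ba7816bf8f01cfea414140de5dae2223b00361a396177a9cb410ff61f20015ad"),
]

for msg, expected in TEST_VECTORS:
    got = my_hash(msg)
    assert got == expected, f"mismatch on input {msg!r}: got {got}"
print("all test vectors pass")
```

Pinning intermediate values as well as final outputs makes a failing vector point at the specific step that went wrong.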
K
I
mean
yeah.
I
certainly
agree
that
the
test
cases
and
test
factors
are
invaluable
and
I
think
most
of
the
the
specifications
that
I'm.
K
Yeah
yeah
in
terms
of
like
whether
or
not
reference
implementation
should
exist
that
it
was
actually
an
explicit
choice
to
use
something
like
sage
to
describe
these
like
hashtag
based
drafts,
because
you
can't
just
like
easily
take
sage
code
and
plop
it
in
production.
We
want
to
make
sure
that
was
like
something
you
did
not
ever
do
and
we.
Furthermore,
we
took
steps
in
the
reference
implementation
to
try
to
describe
you
know
or
avoid
sort
of.
I
guess
implementation
pitfall.
K
So,
for
example,
if
there
you
need
to
do
like
a
comparison
of
two
like
byte
strings
and
constant
time
we
didn't
just
like
in
python
say:
like
does
a
equal
b,
we
would
say
like
if
a
is
equal
to
b
constant
time,
I
forget,
like
the
extra
actual
string,
but
like
you
make
it
very
clear,
like
you
know
this,
this
is
how
you
would
actually
implement
in
practice,
but
I
mean
it's
a
good
point.
Like
should.
Should
reference
implementation
exist,
I
I
I.
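The pattern being described, naming the constant-time intent instead of using the language's bare equality, might look like this in Python; `hmac.compare_digest` is the standard library's constant-time comparison, while a naive `==` can short-circuit on the first differing byte and leak timing information:

```python
import hmac

def tags_equal(a: bytes, b: bytes) -> bool:
    # Deliberately NOT `a == b`: the helper's name and the explicit call make
    # the constant-time requirement visible to anyone porting this code.
    return hmac.compare_digest(a, b)

assert tags_equal(b"\x01\x02", b"\x01\x02")
assert not tags_equal(b"\x01\x02", b"\x01\x03")
```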
G
Tend
to
think
they're
useful,
but
write
a
reference
implementation
that
cannot
be
used
as
it
is
by
let's
say,
an
undergraduate
student
doesn't
actually
have
time
to
actually
do
something
right.
That
would
be
great
and
I'm
all
up
for
it.
K
Yeah,
I
mean,
I
guess
the
other
other
side
of
that
coin.
Is
there
are
projects
like
I
mean
like
the
the
recent
nist
pqc
competition
that
all
the
reference
implementations?
There
were
actually
very
high
quality
implementations
and
they
perform
very
well.
You
could
reasonably
take
some
of
those
and
plot
them
in
production.
If
you
were,
you
know,
if,
if
you
wanted
to
do
so-
and
I
guess
it
depends
on
you
know
what
sort
of
bar
you
hold
for
you
know
the
reference
implementation.
G
D
Hey, this is Max. Great talk. I just wanted to say I've found at least one protocol where not the pseudocode but the English-language text did not match the reference implementation, not an IETF one, but one from outside. So that can happen too, and I do think it's maybe a good idea, or worth considering, to say which one should be believed, as somebody else mentioned.
A
All right, thank you very much. Just to follow on from that last point, I think it's very clear that, while Chris had some great examples, this is very much not a CFRG-specific set of problems. There are ambiguities and issues with all of our specifications, and certainly looking at the dependency graph Cullen Jennings has for the WebRTC specifications, and the sprawling mass of documents there, I think that's clear.
A
So
what
what
I've
been
trying
to
do
with
these
talks
is
is
to
sort
of
try
and
stimulate
some
discussion,
try
and
see
if
there's
interest
in
your
research
on
protocol
specification
thinking
about
how
we
specify
protocols
in
in
the
in
in
the
itf
and
the
irtf
communities,
it's
time
to
think
about
what
are
the
research
challenges
in
ineffective
protocol
specification
and
how
we
can
improve
the
way
we
specify
protocols.
A
There's
certainly
been
a
lot
of
discussion
in
the
chat
and
a
lot
of
really
good
questions
to
the
talks,
and
I
think
there
are
three
fantastic
talks
there.
So
thank
you,
everybody
for,
for
both
giving
the
talks
and
for
the
discussion
in
the
chat.
A
I
I
am
certainly
looking
to
to
gauge
interest
to
see
if
people
are
interested
in
potential
irtf
work
in
this
area.
If
you
are
interested
in
this
pl,
please
do
talk
to
the
speakers.
A
Please
do
talk
to
me
and
if,
if
you
would
be
interested
in
following
up
on
this,
put
your
names
into
the
that
the
documents
that's
linked
on
the
slide-
and
I
I
just
put
in
the
chat
and
we're
you
know.