IPFS IPFS Camp 2022 - Libp2p Day, 7 Nov 2022

From YouTube: Formal Analysis of GossipSub - Ankit Kumar

Description

This talk was given at IPFS Camp 2022 in Lisbon, Portugal.

A

GossipSub is a libp2p-based protocol.

A

It is a local, reputation-based system in which peers add to their mesh only those neighbors that they are confident are well behaved and have high scores. They pass full messages only to those neighbors, rather than sending full messages to everybody, which would be flood publishing.

A

This helps to disseminate messages fast while keeping overhead low. It has been claimed to be resilient against Sybil attacks in previous work by Vyzovitis et al. It has been tested for resilience in Testground, and the specification for GossipSub is given in English prose.

A

So in our work, we implemented a full executable specification of GossipSub in ACL2s, which is a very powerful first-order theorem prover that has been used quite a lot in industry in the areas of hardware verification and requirements analysis.

A

We can use this executable spec to model arbitrary topologies, which gives us fine-grained control over events, messages, behaviors, parameters, and weights, and it allows us to try out any configuration that could be used on top of GossipSub. We cross-validated our model against the original implementation of GossipSub, specifically the Golang implementation.

A

What we did was run the unit tests provided in the Golang implementation on our own model. We also type-checked the traces output by the Golang implementation in ACL2s, to verify that whatever it outputs is actually valid events.
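
The trace type-checking step described above can be illustrated with a small Python sketch. It is only an analogy for checking a trace against a type in ACL2s: the event names, the field names, and the `ALLOWED_FIELDS` table are assumptions made for illustration, not the actual trace schema of the Golang implementation.

```python
# Hypothetical event schema: event type -> required field names (all strings).
ALLOWED_FIELDS = {
    "GRAFT": {"peer", "topic"},
    "PRUNE": {"peer", "topic"},
    "DELIVER_MESSAGE": {"peer", "topic", "message_id"},
}


def is_event(obj) -> bool:
    """Recognizer: True only if obj is a well-formed event under the hypothetical schema."""
    if not isinstance(obj, dict):
        return False
    kind = obj.get("type")
    if not isinstance(kind, str) or kind not in ALLOWED_FIELDS:
        return False
    fields = ALLOWED_FIELDS[kind]
    # The event must carry exactly the expected fields, and each must be a string.
    return set(obj) == fields | {"type"} and all(isinstance(obj[f], str) for f in fields)


def ill_typed_positions(trace) -> list:
    """Return the indices of trace entries that are not well-formed events."""
    return [i for i, entry in enumerate(trace) if not is_event(entry)]


trace = [
    {"type": "GRAFT", "peer": "A", "topic": "blue"},
    {"type": "PRUNE", "peer": "A"},  # missing the topic field
    "junk",                          # not an event at all
]
bad = ill_typed_positions(trace)
```

An empty result from `ill_typed_positions` would mean every entry in the trace is recognized as an event.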

A

An advantage of this formalization effort was that several discrepancies between the implementation and the specification were clarified. While we were implementing it, we had to be in regular contact with the Protocol Labs developers, so a lot of discrepancies were settled. Another advantage was that, while writing the scoring function, which is used to judge whether your neighboring peers are good or bad, we came up with four properties that were quite evident, and we included them as we wrote our code.

A

We believe that these properties are essential against Sybil attacks. Because our work is currently under submission, we can't share our code, but we can definitely discuss the results we arrived at.

A

So we tested these properties in the context of Filecoin and Ethereum. Filecoin satisfied all of those properties, whereas Ethereum satisfied only one. Based on these properties and the counterexamples that we came up with, we were able to generate an attack that exposes certain vulnerabilities.

A

So consider a GossipSub network.

A

There are three topics: blue, green, and orange. There are two nodes, A and B, that are in the meshes of all three topics, and B is keeping a score of A. Let's say A is misbehaving, and therefore B judges that the score of A has dropped below zero. In the next heartbeat event, B will prune A from its mesh, which happens as shown. B is then no longer connected to A in the mesh, so it is not sending or receiving full messages directly to or from A in any topic.
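
The heartbeat pruning in this example can be sketched in a few lines of Python. This is a minimal sketch, not the GossipSub implementation: the `Peer` class, its fields, and the `heartbeat` method are hypothetical names, and the only rule modeled is "prune a mesh neighbor whose score is below zero".

```python
from dataclasses import dataclass, field
from typing import Dict, List, Set, Tuple


@dataclass
class Peer:
    name: str
    # topic -> names of neighbors currently in the mesh for that topic
    mesh: Dict[str, Set[str]] = field(default_factory=dict)
    # neighbor name -> score as judged locally by this peer
    scores: Dict[str, float] = field(default_factory=dict)

    def heartbeat(self) -> List[Tuple[str, str]]:
        """Prune every mesh neighbor whose score is below zero.

        Returns the (topic, neighbor) pairs that were pruned.
        """
        pruned = []
        for topic, members in self.mesh.items():
            bad = {n for n in members if self.scores.get(n, 0.0) < 0}
            members -= bad  # in-place removal from this topic's mesh
            pruned.extend((topic, n) for n in sorted(bad))
        return pruned


# B has A in the mesh of all three topics and has scored A below zero.
b = Peer("B", mesh={"blue": {"A"}, "green": {"A"}, "orange": {"A"}}, scores={"A": -1.0})
prunes = b.heartbeat()
```

After the heartbeat, A is gone from all three of B's meshes, which matches the picture described in the talk.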

A

So this was an example of how GossipSub works.

A

Now, how is the score calculated? We have certain parameters, P1, P2, up to P7, that are used to track certain behaviors of the neighbors. P1 is time in mesh. P2 is the number of first message deliveries. P3 is the number of mesh message deliveries per heartbeat interval. P3b is the same as P3, but it is sticky; that is, it is remembered even after a peer is pruned from the mesh. P4 is the number of invalid messages.

A

P5 is the application-specific score, P6 is the IP colocation factor, and P7 is a penalty on certain behaviors. The ones that are colored red are bad; that is, we don't want them, and they indicate bad behavior. The ones in blue are good. Since we know which are good and which are bad behaviors, we multiply them by corresponding weights that are either negative or positive, depending on the behavior.

A

Once we have that, the topic-wise parameters are multiplied by their weights and summed over all topics, and then the global score contributions are added. The whole thing is then capped using a topic cap, TC.
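
A rough Python sketch of a scoring function of this shape follows. All class and function names are hypothetical, and the weights are placeholders rather than any network's real configuration. One assumption needs flagging: the sketch applies the cap to the summed topic contributions before the global terms are added; if the cap is meant to cover the whole sum, as the description above suggests, the `min` call would move to the end.

```python
from dataclasses import dataclass
from typing import Dict, Optional, Sequence, Tuple


@dataclass(frozen=True)
class TopicCounters:
    p1: float   # time in mesh
    p2: float   # first message deliveries
    p3: float   # mesh message deliveries
    p3b: float  # sticky variant of P3
    p4: float   # invalid messages


@dataclass(frozen=True)
class TopicWeights:
    topic_weight: float
    w1: float
    w2: float
    w3: float
    w3b: float
    w4: float


def topic_score(c: TopicCounters, w: TopicWeights) -> float:
    """Weighted sum of one topic's counters, scaled by the topic weight."""
    weighted = w.w1 * c.p1 + w.w2 * c.p2 + w.w3 * c.p3 + w.w3b * c.p3b + w.w4 * c.p4
    return w.topic_weight * weighted


def overall_score(
    counters: Dict[str, TopicCounters],
    weights: Dict[str, TopicWeights],
    topic_cap: Optional[float],
    global_terms: Sequence[Tuple[float, float]] = (),
) -> float:
    """Sum topic scores, optionally cap them, then add (weight, value) global terms."""
    total = sum(topic_score(counters[t], weights[t]) for t in counters)
    if topic_cap is not None:
        total = min(total, topic_cap)  # assumption: cap applies to the topic sum
    return total + sum(w * v for w, v in global_terms)


# Placeholder numbers, chosen only to exercise the function.
counters = {"blue": TopicCounters(p1=1, p2=2, p3=3, p3b=0, p4=1)}
weights = {"blue": TopicWeights(topic_weight=1, w1=1, w2=1, w3=1, w3b=-1, w4=-2)}
uncapped = overall_score(counters, weights, topic_cap=None)
capped = overall_score(counters, weights, topic_cap=3.0)
with_penalty = overall_score(counters, weights, topic_cap=3.0, global_terms=[(-5.0, 1.0)])
```

Passing `topic_cap=None` models a configuration without a cap, which is how the talk later describes Filecoin.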

A

So this is the scoring function that GossipSub uses, and our work has been to verify this particular scoring function with regard to some properties. So now we'll talk about the properties. We wanted properties that were hopefully uncontroversial; it would seem that any scoring function that claims to be resilient against Sybil attacks should satisfy them.

A

The first property says that if a topic score is always less than zero, then eventually the overall score should be less than zero. Ethereum does not satisfy this property, whereas Filecoin does. The second property says that an increase in bad performance counters should lead to a decrease in the overall score, which again Ethereum does not satisfy, but which is satisfied by Filecoin.

A

The third property states that an increase in good performance counters should lead to an increase in the overall score, which Ethereum again does not satisfy and Filecoin does. Finally, the fourth property says that identical performance counters should achieve identical scores.

A

This was trivial to prove, because we have used ACL2s, which is a functional programming language, and therefore we have referential transparency. Because the definition is the same in both contexts, the fourth property is trivial to prove. Now we'll go over each of these properties and try to reason about why that is the case. Yeah, ACL2s: I have provided a link for that on the first slide.

A

So after we had implemented the whole of GossipSub, here is our property. The property takes a, which is a peer; tp, which is a topic; and counters, which is a map from a (peer, topic) pair to counters. Counters can be considered a record that keeps track of these five counters.

A

We want to prove that if the topic score is less than zero, then the overall score is also less than zero. So this is the property that we put into ACL2s, and ACL2s comes back to us with an initialization of counters, as shown, for some peer A. The topics are color coded for each of the counters, so these are the counters for blue, green, and orange.

A

Since we have implemented the scoring function, we have the computational power to evaluate these counters and come up with the score, which comes out to be 5. However, notice that if we calculate only the topic-based scores, the blue topic has a score of -25, the green one has 22, and the orange one has 7.78. Note that all the weights and parameters that we use in this computation have been derived from the Golang implementation of Ethereum. So these are actual parameters and values.
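
The counterexample can be checked with a few lines of arithmetic. The sketch below uses only the three topic scores quoted in the talk; `violates_property_1` is a hypothetical helper name. It adds up the topic scores alone, so the sum is only approximately the overall score of 5 quoted above, and the talk does not break down the difference.

```python
# Topic scores quoted in the talk for peer A under the Ethereum parameters.
topic_scores = {"blue": -25.0, "green": 22.0, "orange": 7.78}


def violates_property_1(scores: dict, overall: float) -> bool:
    """Property 1 fails when some topic score is negative but the overall score is not."""
    return any(s < 0 for s in scores.values()) and overall >= 0


# Sum of the topic contributions only; global terms are not included here.
topic_sum = sum(topic_scores.values())
is_counterexample = violates_property_1(topic_scores, topic_sum)
```

The blue topic is clearly negative while the total stays positive, which is exactly the situation property 1 is meant to rule out.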

A

Based on this data, we can generate an attack. If A maintains a steady rate of message delivery in the orange and green topics, but decreases the number of mesh messages in the blue topic after a certain time interval, then A can maintain a positive overall score while misbehaving in the blue topic, because it has severely constrained its mesh message deliveries to B. And B can do nothing about it, because A has a positive score, and so A will not be pruned.

A

However, Filecoin does satisfy this property. Why is that? It is because they set their weights such that the maximum positive score achievable through the weights assigned to positive parameters is less than the penalties issued through the weights assigned to negative parameters. The penalties are so high that, even if you add up the maximum positive score from each of the topics, it is less than any single penalty you get, and that will drive your overall score below zero.
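
The condition on the weights can be written as a one-line check. This is a sketch under stated assumptions: `penalties_dominate` is a hypothetical helper, and the numbers are invented for illustration; they are not Filecoin's or Ethereum's actual parameters.

```python
from typing import Dict


def penalties_dominate(max_positive_per_topic: Dict[str, float], smallest_penalty: float) -> bool:
    """True if the best achievable positive total is smaller than the smallest single penalty.

    smallest_penalty is given as a positive magnitude.
    """
    return sum(max_positive_per_topic.values()) < smallest_penalty


# Invented numbers: each topic can contribute at most +10.
max_positive = {"blue": 10.0, "green": 10.0, "orange": 10.0}

strict_config = penalties_dominate(max_positive, smallest_penalty=50.0)   # penalty outweighs all gains
lenient_config = penalties_dominate(max_positive, smallest_penalty=25.0)  # gains can hide a penalty
```

When the check holds, a single penalty in any topic is enough to push the overall score below zero, no matter how well the peer behaves elsewhere.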

A

Similarly, for property two, we can see that Ethereum has a topic cap.

A

However, the sum of the score contributions from each of the topics can be higher than the topic cap. After that, if there is some misbehavior in some topic, it will decrease the uncapped sum, but the topic cap will keep the score at the same level as before, because the sum is not going below the topic cap. In Filecoin there is no topic cap, so any decrease in a topic score will lead to a decrease in the overall score. For property three, we have the same problem.

A

That is because of the topic cap. Any increase in the contribution of a certain topic will not actually increase your overall score, because it is already capped. So even after the increase, your score will be at the cap. In the case of Filecoin, there is no such cap, and therefore the score can increase.
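
The masking effect of the cap on properties two and three can be seen with a tiny example. The numbers and the `capped_total` helper are hypothetical; the point is only that `min` with a cap is flat once the uncapped sum exceeds the cap.

```python
from typing import Optional


def capped_total(topic_sum: float, topic_cap: Optional[float]) -> float:
    """Apply the topic cap if there is one; None models a configuration without a cap."""
    return topic_sum if topic_cap is None else min(topic_sum, topic_cap)


CAP = 100.0  # invented cap value

# Property 2: misbehavior lowers the uncapped sum from 120 to 110.
with_cap_before = capped_total(120.0, CAP)
with_cap_after_bad = capped_total(110.0, CAP)   # unchanged: the decrease is hidden
no_cap_before = capped_total(120.0, None)
no_cap_after_bad = capped_total(110.0, None)    # decreases as expected

# Property 3: good behavior raises the uncapped sum from 120 to 130.
with_cap_after_good = capped_total(130.0, CAP)  # unchanged: the increase is hidden
no_cap_after_good = capped_total(130.0, None)   # increases as expected
```

With the cap in place, neither the decrease nor the increase is visible in the result, which is why both monotonicity properties fail in that setting.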

A

So we discussed all the properties that we came up with in the context of Ethereum and Filecoin.

A

Now, in the future, what we would like to do is to model the application level on top of our current network protocol layer. When we have both the application and the network layer working together, we can write properties about both of them in conjunction and come up with more interesting properties to try to find bugs. This really emphasizes the property-driven development approach.

A

While writing a specification, if we have a view of the properties that we would like our specification to satisfy, that helps us avoid the kinds of bugs that could arise if we had not considered those properties in advance.

A

So what does the community think about it? What are your thoughts? Should we have more property-driven development and a more formalized approach towards web3 going forward?

B

You mentioned this tool, ACL2s, and it looks like it's a Lisp, right? So I'm wondering: is this a similar thing, or does it have a similar purpose to, say, Lean or Agda, or something like that?

A

That is correct. [inaudible] is a theorem prover in the lambda cube. ACL2s is significantly less expressive than that; it is in first-order logic. However, it is more automated, because there are quite a lot more useful heuristics that can be used to prove things quickly. So that is ACL2s, which is...

B

First-order logic, yes, okay. And the follow-up question is: you have these properties, right, and these properties are things that you'd like to show are true, or you...

A

You also get counterexamples. The parameters that we got were given to us by ACL2s, telling us: see, this is something that does not satisfy this property. Using that, we could generate our attacks.

B

I see. So you say, all right, this is something that I would like to have, and it gives you back a model that makes it...

A

We try to prove the theorem. It was not proved, because there existed certain counterexamples, which were given to us.

B

One more question: you gave an example with the Lisp, and there was a colon, like ":topic", for example. Yeah, that slide.

A

Right, these are types. These are user-defined types; we define them using defdata. A good thing about these types is that they can be defined using any function or recognizer. So you could have any arbitrary function that, given any object in the universe of ACL2s, returns true or false, and it can be used to specify your type.

B

And I'm sorry, this might be a super common feature in Lisp and I just haven't done much Lisp. Is there a type checker that runs at compile time, or is this something that's type-checked at runtime, or...

A

Yeah, so when you define a type you get an enumerator and a type checker, and when you are specifying the property, what it tries to do is come up with witnesses that satisfy those types. But Lisp is an interpreted language; it's not compiled.
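
The idea of a type that comes with a recognizer and an enumerator, and of searching for witnesses that falsify a property, can be mimicked in Python. This is only an analogy and not how ACL2s or defdata work internally: `is_scores`, `enum_scores`, `property_1`, and `find_counterexample` are all hypothetical names, and the search here is plain random testing.

```python
import random
from typing import Callable, Optional, Tuple

Scores = Tuple[float, ...]


def is_scores(x: object) -> bool:
    """Recognizer: a triple of numbers, each in [-50, 50]."""
    return (
        isinstance(x, tuple)
        and len(x) == 3
        and all(isinstance(v, (int, float)) and -50 <= v <= 50 for v in x)
    )


def enum_scores(rng: random.Random) -> Scores:
    """Enumerator: produce a random value that the recognizer accepts."""
    return tuple(float(rng.randint(-50, 50)) for _ in range(3))


def property_1(scores: Scores) -> bool:
    """If any topic score is negative, the total must be negative too."""
    return sum(scores) < 0 if any(s < 0 for s in scores) else True


def find_counterexample(
    enumerate_value: Callable[[random.Random], Scores],
    recognizer: Callable[[object], bool],
    prop: Callable[[Scores], bool],
    tries: int = 1000,
    seed: int = 0,
) -> Optional[Scores]:
    """Generate well-typed witnesses and return the first one that falsifies prop."""
    rng = random.Random(seed)
    for _ in range(tries):
        witness = enumerate_value(rng)
        if not recognizer(witness):
            raise TypeError("enumerator produced an ill-typed value")
        if not prop(witness):
            return witness
    return None


counterexample = find_counterexample(enum_scores, is_scores, property_1)
```

Any witness returned has at least one negative topic score and a non-negative total, which is the same kind of counterexample that drove the attack discussed earlier.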

B

Yeah, okay, thanks a lot. Thank you.

C

You said there is a cap on the score for each topic, but that's not in the spec. Or is it just... so in the spec you have a cap for each value, but you don't have a cap for the entire topic.

A

Yeah, the cap is for the entire score; it's not for each topic, and therefore we have the problems with properties two and three.

C

Because I have the specs [unclear] right here, and I don't see any cap on the score. So we can talk later on.

A

This is derived from the English prose specification of GossipSub itself.

C

Yeah, we'll check. Okay, yeah.

D

Good. You might have covered this already, but how do you know that these properties add up to Sybil resistance, or strictly imply Sybil resistance?

A

Yeah, so for the first attack that we showed, B's neighbor A was essentially acting as a Sybil. That is, it worked for some time, passing messages at the required delivery rate to gain some positive score, and after that it dropped its message delivery in a particular topic while maintaining that positive score. So it was essentially acting as a Sybil.

D

Got it, but do you have a sense of how that adds up systemically across a large graph that many Sybils can join? How does the provability extend to the broad phenomenon of many, many different nodes interacting?

A

Yeah, so in our proof we used a very small scenario where we just had three peers connected to each other in all topics. One of them was the observer, another was the Sybil. But you can extend it easily. You can consider that B is surrounded by Sybils like A, all of which start behaving in the same manner, and that constitutes an eclipse attack against B.

E

So I have a question. Is it possible to abstract away the complexity of this formal verification, within a library or something, and then use some kind of CI that could be run automatically somewhere in the repository that we have?

A

Let me make sure I understand the question correctly: you are asking about reusing the code already implemented in ACL2s?

E

To make sure that future implementations are also compliant with the properties, yeah.

A

Actually, there is recent work on this by Andrew, who's sitting right next to you.

A

So yeah, he has worked on witness-generating data types for the purpose of integrating this with other languages. I don't think that would be possible, because we need to reason about that particular code, but Andrew does want to say something.

F

A couple of things. One of them is: right, this is useful for validating implementations, but it's also useful at the design stage of a protocol. That is where, when you don't have any code, you write a sort of abstract model, similar to what Ankit has done, and then you can write down the properties that you might want this protocol to satisfy, tweak things, add additional constraints, and run the properties to make sure that the protocol actually satisfies the properties that you care about, like Sybil resistance, for example.

F

We can talk more offline about the other stuff.

A

So a useful thing would be to formalize your specification, reason about it, and then go on to code it in your preferred language.

G

Did this impact the English prose spec at all? Did we make any improvements to the spec as a result of some of the ambiguities you found?

A

Yeah, we found that some portions of the English prose were out of date, in which case we had to look at the implementation, because that was more up to date. In those cases we had to consider the code as the specification. But right now, what we can claim is that we have a formal specification in the ACL2s language, which is much less ambiguous than English, and therefore it can be used as a specification.

G

Cool, okay. So I guess just following on from that: if a researcher comes along a year from now, will they find the English prose updated, and/or will they find a link to your formal proof?

A

I would like the formalized model to be linked. We are working on adding it to the community books of ACL2s. So if you just download ACL2s, you can find that code in the books as well. Thank you.