From YouTube: IPFS Recovery - Govind Mohan & Hlib Kanunnikov
Description
HackFS finalists Hlib Kanunnikov and Govind Mohan take us behind the scenes of the process and thinking that went into their project, IPFS Recovery, which creates a way for content to persist permanently, despite any damage to data and the network, by bringing data recovery algorithms into the IPFS protocol.
Discover IPFS Recovery at https://github.com/Wondertan/go-ipfs-recovery
For more information on IPFS
- visit the project website: https://ipfs.io
- or follow IPFS on Twitter: https://twitter.com/IPFS
Sign up to get IPFS news, including releases, ecosystem updates, and community announcements in your inbox, each Tuesday: http://eepurl.com/gL2Pi5
A: So, this is IPFS Recovery. We built this during HackFS 2020, which took place about two months ago and ran for 30 days. This is the story of what we did, sprinkled with ample technical detail. Coming into this hackathon I had a lot of experience with error correction in the context of distributed systems.
A: So I thought it would be a pretty cool idea to bring this over to IPFS. What problems does this solve? The first existing problem I wanted to tackle is data corruption, which can lead to the loss of vital information. In a distributed system you have data at rest and data in transit, and you want to ensure the integrity of both. Data at rest can be compromised by things like coffee poured on your laptop or a massive power grid failure, and if potentially vital content is being served from your devices to other devices on the network, you don't want that to happen.
A: You want to ensure there are different ways to access the same information in your distributed system. You also have the problem of node churn, where devices can just go offline for any reason whatsoever; I could simply decide to turn my computer off. There is also the issue of censorship: if there are people actively targeting certain types of data on the network, they could go after any link to make sure it's no longer there. And finally, transient connectivity: poor internet connectivity can happen for all kinds of reasons in a distributed system, so it's important to keep that in consideration.
I also want to add that cyber security has three components: confidentiality, integrity, and availability. Confidentiality is probably the one that gets the most attention, but integrity and availability are quite important too, and I think erasure coding is really what allows us to ensure proper security for distributed systems. So what is erasure coding? It's a method of data protection in which data is broken into fragments, expanded and encoded with redundant data pieces, and stored across a set of different locations or storage media.
Well, what this jargon actually means is that erasure-coded data becomes like a hydra: it has multiple heads, and if you chop some of them off it just keeps recovering, over and over again, until you chop off most if not all of the heads. So it's worth consuming some extra storage to obtain better data resiliency, and since the data can be spread across the network geographically, it even allows for better performance and delivery guarantees.
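The hydra intuition can be sketched with the simplest possible erasure code: a single XOR parity fragment over equally sized data fragments, which survives the loss of any one fragment. This is only a minimal illustration of the principle, not what the project uses (its Reed-Solomon codes tolerate more than one loss), and all names here are hypothetical:

```go
package main

import "fmt"

// xorParity returns one redundant fragment: the byte-wise XOR of all
// equally sized data fragments. Losing any ONE fragment (data or parity)
// is survivable, because XOR-ing the survivors reproduces it.
func xorParity(frags [][]byte) []byte {
	parity := make([]byte, len(frags[0]))
	for _, f := range frags {
		for i, b := range f {
			parity[i] ^= b
		}
	}
	return parity
}

// recoverFragment rebuilds the fragment at index `lost` by XOR-ing every
// surviving fragment together with the parity.
func recoverFragment(frags [][]byte, parity []byte, lost int) []byte {
	out := append([]byte(nil), parity...)
	for i, f := range frags {
		if i == lost {
			continue
		}
		for j, b := range f {
			out[j] ^= b
		}
	}
	return out
}

func main() {
	frags := [][]byte{[]byte("ipfs"), []byte("reco"), []byte("very")}
	parity := xorParity(frags)
	rebuilt := recoverFragment(frags, parity, 1) // pretend fragment 1 was lost
	fmt.Printf("%s\n", rebuilt)                  // prints "reco"
}
```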
A: So why does this matter specifically for IPFS? I think the limitless distribution of data that is at the core of IPFS requires strong integrity guarantees, but we also want to make sure that data is available at all costs, no matter what kind of data it is. It doesn't matter who, or what circumstance, is trying to get rid of that data; we want to ensure it's still there. And it's a long-requested feature.
A: There are GitHub issues from as far back as 2016, still open, that raise exactly this problem, and no specific solution exists at the IPLD layer, which is where we actually built this out, as you'll see shortly. Finally, we kept this in the spirit of IPFS, which is to keep everything very modular and pluggable in many different ways, so we actually have a few different erasure codes built out. Let me talk a little bit about our hack and what we actually did.
B: There you go. Hey guys, I'll go over some more technical details of how it all works. My original goal was to create a module for recovery that follows most of the best practices I've seen in IPFS for how you do modular things and how you split everything up, so this module also introduces interfaces that try to be useful, abstract, and convenient to use.
B: Originally we concluded that IPFS needs erasure coding. We did some research and saw that there are actually a lot of issues about it, with ideas on how to implement it, but I decided to integrate it into IPLD. IPFS currently works with the first version of IPLD, so IPFS Recovery is integrated there. For the hackathon demonstration, so we had something to show at HackFS, we also added new commands to IPFS.
B: The main one is "encode", which lets you encode any DAG into a recoverable DAG: you put in one CID and you get another one back. The interesting part is that if you run "ipfs get" with the new hash, you will still see the original file. So you effectively exchange one hash for another, and you are still able to retrieve the original content through the newly generated hash, which additionally carries the recovery feature.
B: There are also ideas for other commands that could be added to the IPFS CLI to extend IPFS with new features and help manage recoveries in some way: you could run manual recoveries, or look at state inside the IPFS Recovery module that can be useful during debug sessions.
B: Our main implementation is Reed-Solomon, which is a very popular erasure code and simple to understand. I won't describe how it works in terms of the mathematical details, but put simply, it generates some additional blocks for your existing blocks.
B: For example, say you have seven blocks and you decide to generate three more, giving you ten blocks in total. If you then lose any three of them, it doesn't matter which, you can still get the original file: you can regenerate the lost blocks that represent the actual user data. Now, about the implementation.
B: For Reed-Solomon, and for erasure-coding integration in general, we needed to find a way to fit it into IPLD.
B: IPLD v0 has a Node interface that can be addressed by a CID, and that CID identifies the content, which is connected to the actual node implementation. We added a new custom CID codec to address the new recovery nodes, which wrap ProtoNodes, the nodes whose links point to blocks on the network. What a recovery node does is add additional redundancy links alongside the node's original links.
B: This allows recovery to work across the whole IPFS network. It means that if you have a recovery node but you cannot get a block you need, you can fall back to fetching the recovery nodes instead.
B: Fetching the recovery nodes helps you recover the block you originally needed, and the good thing is that this works not only on your local node: if you distribute your content and share it with others on IPFS, and it is in demand and flowing around the IPFS network, this really helps to recover it and to increase the availability of the data you store on IPFS.
Let's continue with the implementation. The custom CID codec is actually a hack, because from my understanding we really need a separate mechanism: some kind of plugin model where we could register additional representations or metadata expressing extra capabilities of the data being addressed. Recovery isn't actually a codec; you can't mark an arbitrary node as recoverable the same way you can say which encoding it uses.
B: Codecs are mostly used just to say: okay, I use JSON encoding, I use dag-cbor, I use some other codec, so that a node understands how to decode blocks into memory. Now, about the encoding implementation.
So, there is an important property of Reed-Solomon: it requires all blocks to have the same size. That was an issue for us to solve, because IPFS blocks do not always have the same size, but Reed-Solomon needs them to be the same.
B: So I decided to encode the sizes of the actual blocks inside the redundant nodes.
B: That way, when we try to recover some blocks, we copy the blocks we have into shards, a special data structure that manages whether we have the required number of blocks and helps us actually restore the missing blocks with their original sizes.
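A minimal sketch of that equal-size workaround, using hypothetical helper names: pad every block to the largest size before encoding, store the original lengths alongside the redundant data, and truncate back after recovery:

```go
package main

import (
	"bytes"
	"fmt"
)

// padBlocks zero-pads variable-size blocks to a common shard size,
// returning the padded shards plus the original sizes, which the talk
// describes storing inside the redundant nodes.
func padBlocks(blocks [][]byte) (shards [][]byte, sizes []int) {
	max := 0
	for _, b := range blocks {
		if len(b) > max {
			max = len(b)
		}
	}
	for _, b := range blocks {
		sizes = append(sizes, len(b))
		padded := make([]byte, max) // zero-filled up to the common size
		copy(padded, b)
		shards = append(shards, padded)
	}
	return shards, sizes
}

// unpadBlocks truncates each shard back to its recorded original size.
func unpadBlocks(shards [][]byte, sizes []int) [][]byte {
	blocks := make([][]byte, len(shards))
	for i, s := range shards {
		blocks[i] = s[:sizes[i]]
	}
	return blocks
}

func main() {
	blocks := [][]byte{[]byte("short"), []byte("a longer block")}
	shards, sizes := padBlocks(blocks)
	restored := unpadBlocks(shards, sizes)
	fmt.Println(bytes.Equal(restored[0], blocks[0]), sizes) // true [5 14]
}
```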
B: I hope I explained that understandably. The next thing I would like to tell you about is the custom DAG session.
B: For IPLD v0, IPFS uses a DAGService, which lets you get IPFS nodes by their CIDs, and there is a concept of sessions that scopes some of that logic.
B: A session groups the nodes you fetch from the network or get locally. Our custom DAG session manages pairs of the blocks you have already retrieved or recovered, and saves them as references for the next iteration of traversing the graph, so that when a node fails to be fetched from the network we can find its parents.
B: This is currently implemented as a kind of soft link, just in memory, recording who can recover the node I failed to get and need right now in the current session. So at the moment these are only saved in memory, but I think there should be a special component for this.
B: It would wrap the datastore and simply keep, locally, a small amount of metadata about which nodes can be recovered from which.
B: This wouldn't add a lot of additional storage, but it is still useful, and it works only locally: the node knows what it can use to recover the thing it needs, the thing the user asked for.
B: The next thing is the Recoverer. This is an interface, and it has a Reed-Solomon implementation.
B: It may happen that, at the same moment, there are several recovery processes running for the same data, which we would like to avoid. So there is a recovery singleton that manages the different recovery sessions, and what it allows is for any user of IPFS to point to...
B: Oh sorry, my headphones came off. It points you to the recovery sessions that may be going on at the same time. That's probably all about the Reed-Solomon implementation.
B: That's all I wanted to tell about this. The next thing is a draft of a novel alpha entanglement implementation.
B: Reed-Solomon is the industry standard for erasure coding, but there is a lot of ongoing research on codes, and there is an interesting paper on alpha entanglement codes. Govind even had the chance to contact the people who wrote the paper; they also did a hack for Ethereum where they integrated those alpha entanglements somehow. We decided to keep this out of scope, since Reed-Solomon is easier to understand and easier to implement, but we do have a draft implementation for entanglement.
A: Yeah, sorry, I'd just like to say a couple of things about that. First of all, I was stuck on mute for the longest time, so I couldn't say anything. So, alpha entanglement is a way of spreading data across the network where you have actual data uploaded onto the network, which could be multiple different files: you could upload a file, and someone else could upload a file.
A: Redundancies are formed for both of these files in such a way that your file could potentially help recover someone else's file, which might be damaged or degraded for the reasons I mentioned before. It's a very new form of erasure coding, and there is a lot of experimentation going on with it in other distributed systems, such as Ethereum Swarm, as well.
A: So I just wanted to bring that over to the IPFS world, and I think the high level of modularity that we have with IPFS Recovery also helps us plug in any erasure codes that, in the future, could be even more exotic.
A: Of course, the good folks over at IPFS created Testground, and we want to battle-test our implementations of recovery, both Reed-Solomon and alpha entanglement, to get an exact metric for how much degradation, of the data and of the nodes on the network, we are able to resist. That is going to be a key factor in determining how this can be integrated into the core and then, you know, potentially deployed on the mainnet.
B: So there are some concepts at our interface level that should be covered here. The first is recoverability. It is a parameter for the encoded data that defines what percentage of your data can be safely lost.
B: That is what recoverability means: if you lose the percentage you set as the recoverability of the data, you will still be able to retrieve it from the network. So let's say I have a DAG already in the network and I run encode on it with a recoverability of 25. What that does is generate 25 percent more nodes across the whole DAG, and if you then actually lose 25 percent of the DAG you originally had on the network...
B: ...you will still be able to read the whole data from it. You could object that this just adds 25 percent more data, but in terms of distributed systems I think this overhead is helpful, and it allows the content to be retrieved more reliably.
B: That is the general idea. For Reed-Solomon, though, recoverability works not as a percentage but as a parameter that defines the number of redundant nodes you want to generate for each layer of the DAG.
B: You can see this in the picture we have here. If we have recoverability 2, then for the root block, which has two child nodes, we generate two more, so the data survives in case nodes two and three are lost.
A: At each level, that is at each parent node, the number of redundancies specified by your recoverability is created. So at this level you can see that A and B are created and connected to node one in such a way that if you lose any two of these four nodes, you can use the rest to recover all of them. And the same applies to node two and all its children, node five and all its children, any parent node and all its children. Again, this is Reed-Solomon.
you
have
alpha
entanglements
in
our
in
our
github
in
one
of
the
issues,
you
can
actually
see
a
more
specified
diagram
for
that
kind
of
use
case.
So
there's
this
there.
It's
there's
different
ways
to
play
around
with
the
dag
structure.
Thanks
to
you
know
the
ipld
api,
which
is
allows
for
this
kind
of
thing
to
be
done
really
well
and
a
little
bit
about
strategies
as
well.
B: When you cannot get the data from IPFS and you fall back to recovery, this may leave additional data on your node that you never asked for. That is: I need some content, and to get that content I need some other content that is available on the network; I fetch it and it helps me recover the data I needed, but I never asked for that recovery data itself.
B: So strategies are the kind of thing that is better put into the IPFS config; they help you decide what to do with this additional data you needed for recovery.
B: The first is "all", which means you help the network and store everything. "Just data" means you store only the data, and a third option stores only what was requested. Consider a recovery where you want just part of a file: you don't need the whole file, but the part you need has some lost blocks, so what do you do?
B: You fall back to other data and blocks in order to recover the original blocks. If the user hasn't asked for those helper blocks, then under the third, requested-only option, they would not be stored. But if you choose "just data", then the data you haven't asked for, but...
B: ...still fetched from the network, or regenerated from the other blocks you asked for, is saved locally, since maybe you will need it in the future; with "all" you additionally help the network and actually provide it. And with the requested-only option you save only the things you need, not the things that helped you get what you need.
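The three storage strategies described above can be sketched as a config option. The names (All, JustData, Requested) paraphrase the talk and are assumptions, not necessarily the project's identifiers:

```go
package main

import "fmt"

// Strategy controls what happens to blocks that were fetched only to
// drive a recovery, as opposed to blocks the user explicitly requested.
type Strategy int

const (
	All       Strategy = iota // store everything and provide it to the network
	JustData                  // keep helper blocks locally, in case they are needed later
	Requested                 // keep only blocks the user explicitly asked for
)

// shouldStore decides whether a fetched block is kept, given the
// configured strategy and whether the user asked for this block or it
// was only pulled in as recovery input.
func shouldStore(s Strategy, userRequested bool) bool {
	switch s {
	case All, JustData:
		return true
	case Requested:
		return userRequested
	}
	return false
}

func main() {
	fmt.Println(shouldStore(Requested, false)) // false: helper block is dropped
	fmt.Println(shouldStore(JustData, false))  // true: helper block kept locally
}
```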
A: Yeah, in the interest of time, because I think we're running low: there are some use cases, maybe the more vital ones, where I think storing the data and the redundancies would be important even if it means bearing a little bit more cost, and then there are some cases, probably the more average ones, where you just want to get the data and keep only the data.
A: So we want to have these different kinds of strategies as well, and I think the future really calls for community discussion. We want to be able to talk with the community; I mean, we clearly have a lot of ideas, and we want to see what kind of ideas you all have and how we can implement this, both for...
B: ...Speaking of the second version of IPLD: I haven't had enough time to actually dive into the new specs that Protocol Labs is working on. I know there are a lot of cool new features that might be useful for recovery as well, though I'm not sure exactly which ones we could use. But as I already said about the CID part: recovery is a feature of the content, not an actual way to represent the content, so adding something extra to the CID that would also go into IPLD seems like a connected idea. The second point is defining recovery nodes over the DHT. Govind already talked about this in terms of alpha entanglements, but there is also another interesting idea.
B: I described the soft links that let you find locally which nodes you need in order to recover the data you asked for, and that you would ideally store in the datastore. But what if we put this on the DHT? That would allow us to create a kind of soft link on the network, which would be a last resort for you to recover the data.
B: Currently, when you try to get something from IPFS, it will just wait endlessly for data it cannot find. What if we could asynchronously look for other things on the network, these soft links, so you can establish that this node can recover that one, because someone connected them? And there is a motivation for users to interconnect recovery nodes.
B: For example, you might want your data to help another person's data recover, so you can connect them and create helper nodes that link parts of different DAGs; from these two DAGs you create some more redundant nodes, and that way different pieces of data can help each other...
B: ...to recover. There is an interesting example: when you want your content to be more widely distributed over the network, you can entangle it with in-demand content, so your content can help the in-demand content recover. That way your content becomes much more distributed to the peers who needed a recovery for that in-demand content. So there is a lot of future work and new things ahead.