From YouTube: HAMTs & Other IPLD Data Structures - Rod Vagg
Description
How does IPLD represent complex data? How does this affect data transfer and what kinds of data you can request?
Distributed data structures: we call these persistent and immutable data structures, and they're not new. The term has been around for a long time, and you'll find these data structures in the functional programming world, in particular in the standard libraries of Clojure, Haskell and Scala. They all contain persistent and immutable data structures, and they map really well to the content-addressing world, because we do immutability and we like persistence.

So a lot of these data structures are old and we've sort of pulled them into the new world of content addressing. The problem is that the algorithms involve a lot of trade-offs, and when you come into the content-addressing world you've got a whole new suite of things to think about, not just what happens on the local computer.

So let's see what we're talking about with persistent and immutable data structures by building a really simple one that I'm calling the super large array. This would work in practice; I don't know if you would use it, but you could build it. We're building an array, something where we can store a potentially infinite number of entries, distributed on IPFS.

We're going to start off by saying we can store five things in a block on IPFS, so we're setting a limit, and we're going to fill it up with five elements. Now what happens when we want to add more things to our array, because we're going to be able to expand this thing, right? Well, we stick more things into another block and then we create a structure to reference both of these blocks. So we've created a DAG.

We have a root that includes a link to the first block and the second block, and then we have our array elements in order. So we end up with three blocks, and so three CIDs. The two leaf nodes contain the leaf data, whatever it is we're storing; perhaps they're just numbers, perhaps they're CIDs, but distinctly the leaf nodes are where our data is, and then we've got the intermediate node, the root, to contain them both.

Now we can fill this up and keep on adding blocks. We've got 25 elements in our array, but what happens at this point? Our root block now contains five entries and we're at our limit for block length, for block size.

So what do we do if we want to put another element into this array? Well, we add another level to this structure. We're going to give each of these levels a height: height 1 is where our values are, height 2 is the one just above that that holds them together, and as we start to need to collect more than one of these height-2 nodes we need a height 3, where we start to create more nodes as well.

So potentially this thing could keep on going up and up and up, and in that way we form a DAG that can contain an arbitrary number of elements. But you can see some of the problems we get into when we're getting large DAGs: we have these very tall and very wide structures, and we get into all the sorts of data transfer problems that we've been talking about here as well.
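
As a rough illustration of how a fixed-width tree like this addresses its elements, here is a minimal sketch in Python, not the actual IPLD codec: it maps an element index to the child offsets you would follow at each level. The width of five and the helper names are just taken from, or invented for, the example above.

```python
# Minimal sketch (not the real IPLD codec): with a fixed width of five, an
# element's index determines the child offset to follow at every level.
WIDTH = 5

def height_for_length(length: int) -> int:
    """Smallest number of levels whose capacity (WIDTH ** height) covers `length`."""
    height = 1
    while WIDTH ** height < length:
        height += 1
    return height

def path_for_index(index: int, height: int) -> list[int]:
    """Child offsets to follow from the root down to the leaf slot."""
    return [(index // WIDTH ** level) % WIDTH for level in range(height - 1, -1, -1)]

# 25 elements fit in two levels; the 26th element (index 25) forces a third level.
print(height_for_length(26))   # 3
print(path_for_index(25, 3))   # [1, 0, 0]
```

Here `height` counts levels, which lines up with the talk's numbering where the leaves sit at height 1 and the root of a three-level tree sits at height 3.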
What should the width of these things be? I set five just because it fits on the screen really nicely, but how do we come up with a good value for width? We could approximate the byte size that we're aiming for, and we've got various limitations through our systems; Bitswap, for example, is not so good with blocks in the one-to-two-megabyte range. So we could aim for, say, one-megabyte blocks, but that depends on what we're storing in these things.

Maybe we're just storing CIDs, and if we are, then we could take the approximate size of a CID and use that to figure out what the width of these blocks should be. It's a bit of a challenge, because even CIDs are flexible in their size: you can get CIDs that are 64 or 128 bytes long, even identity CIDs, which just break everything anyway.
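
To make the width question concrete, here is a back-of-the-envelope sketch of the kind of calculation you might do; the target block size and per-link byte count are illustrative assumptions, not measured figures.

```python
# Back-of-the-envelope width calculation; both constants are illustrative
# assumptions, not measured values.
TARGET_BLOCK_BYTES = 1_000_000   # aiming for roughly one-megabyte blocks
APPROX_LINK_BYTES = 40           # rough size of a CIDv1 (SHA-256 multihash) plus encoding overhead

approx_width = TARGET_BLOCK_BYTES // APPROX_LINK_BYTES
print(approx_width)              # 25000 links per node, ignoring per-entry overhead
```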
The most interesting part of this is the cost of mutation. The bigger we make these things, the more complicated mutation becomes, because we don't just want a static array. Often we'll want an array of data that we mutate over time: we want to add things, we want to change elements. This is the nature of real-world data. We love immutable data, but we like to be able to mutate these things too, and create new copies of them.

So let's look at the implications of mutation. In this picture I've made up a CID for each of these blocks, so each block has a unique CID, and we're going to do a couple of mutation operations. Firstly, we're going to append another element to the end. What happens when we mutate by adding an element on the right-hand side here? We get a new CID for that block, because it's changed.

But then, thanks to Merkle DAGs, the linking block has to get a new CID, because the new CID of that child now has to be stored in the intermediate node, and then the root gets a new CID. So every mutation does this bubbling up to the root, and we've created essentially three new blocks in the process of just adding one element to the end. So if you want to keep all the historical data, you've got all those original blocks, plus three more blocks now to contain this.

Now, what happens if you just want to edit one of the elements? Let's say you just want to change one of the middle ones and give it a new value. You get the same thing happening: that creates a new CID for that block, then you need to include that in the linking intermediate node, and then you need a new root. So each time, you get a new root.
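
A minimal sketch of that path-copy effect: editing one leaf forces every block on the path from the leaf up to the root to be re-encoded, so one edit creates roughly as many new blocks as the tree is tall. The helper names and the truncated hash standing in for a CID are illustrative assumptions.

```python
# Sketch of the path-copy cost: one leaf edit re-encodes every block from the
# leaf up to the root. `fake_cid` is a stand-in for real CID creation.
import hashlib

def fake_cid(data: bytes) -> str:
    return hashlib.sha256(data).hexdigest()[:8]

def edit_leaf(path_blocks: list[bytes], new_leaf: bytes) -> list[str]:
    """path_blocks runs from leaf to root; returns the new 'CIDs' created."""
    child_cid = fake_cid(new_leaf)
    new_cids = [child_cid]
    for parent in path_blocks[1:]:
        # Re-encode each parent with its child's new CID spliced in (simplified).
        child_cid = fake_cid(parent + child_cid.encode())
        new_cids.append(child_cid)
    return new_cids

# A three-level path (leaf, intermediate, root) means three new blocks per edit.
print(len(edit_leaf([b"leaf", b"intermediate", b"root"], b"edited leaf")))  # 3
```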
So this mutation has real costs, and they're flow-on costs with implications for all parts of the stack. These costs cascade: as you can see, just small changes in parts of the DAG cascade all the way up. If we make our blocks too large, say we were aiming for one-megabyte blocks with these arrays, then even changing one little element means re-encoding those big blocks.

We could make the blocks small, so that mutating them doesn't seem so costly, and you get more potential deduplication across the DAG. But when you make blocks small, you're going to get much taller DAGs, so you end up having to mutate a lot of blocks.

So there's a sweet spot here somewhere, but it really is application-specific. It depends on what you're doing with the data: whether you're transferring it, whether it's going over IPFS, whether you're transferring it around the world.

So the moral of this story is that these kinds of costs really are application-specific and you should benchmark to figure them out. And a data transfer slide, because this is definitely a data transfer talk: mutation costs are very similar to the data transfer costs. The taller the trees, as we've heard from a few previous talks, the more round-trip costs you get. Say you just want one element out of this array; you need to search your DAG to find it.

If it's a long, stringy DAG, then you're going to have to do this round-trip thing just to find the element you want. But if the blocks are really thick, really heavy, then you might have fewer round trips, but you're going to be transferring a lot more data just to find the one element you want, because each of those nodes is going to be really thick. There's also an algorithm involved in finding what you want, so sometimes with these structures it's difficult to prefetch.
You can't predict where you're going to go until you get the next thing, and that makes this whole data transfer process really tricky with our DAGs. Anyway, let's get into the really interesting bit, which is HAMTs, which we talk about a lot. If you haven't heard of a HAMT... how many people have heard of a HAMT? Okay, so you've heard of it, but I want to talk about how they actually work, because this is kind of fun.

So a HAMT is a hash array mapped trie. It's a key-value store; it's just a map, like an associative array that you would have in a normal programming language, and it's a mashup of a hash table and a prefix tree. We put these two concepts together and we get this; it's like a hash table tree.

Now, it's actually a common algorithm that's used within standard libraries; you'll even find it in Java standard libraries for their collections, because it's such an efficient way of making associative arrays. The nice thing about HAMTs for our purposes is that they use hashing to create randomness, which leads to a denseness property, and denseness is something we really want out of our DAGs where we can get it.

Basically, the fewer blocks, the better, most of the time, and the fewer hops from the root to where you want to go, the better it is, for data transfer and storage and all sorts of things, and for this mutation cost as well. When you get these cascading costs, the fewer things you have to mutate, usually the better, within the constraints of size and all that sort of stuff. So block size does become a secondary consideration when you're talking about denseness.

Let's start off by having a look at what a hash table is, and then I'm going to show you how we can take a hash table and a prefix tree and get a HAMT; it's actually simpler than it sounds. So, a hash table: remember this one for your next technical interview, because this will definitely come up.
A hash table is given a list of key-value pairs, and we want to distribute them into an array of indexed buckets, zero to n, for some number n.

You set the size of this thing, and then you select the bucket by taking the hash of a key and running some little algorithm over that to get the index of a bucket. So you use the hash of the key to get some randomness, and you use that random value to find an index, which tells you which bucket to put the value in. The buckets will then contain an array of elements.

Any number of elements could end up in these buckets, but the hash provides randomness in the distribution, so that the buckets even out over time. Two keys can resolve to the same bucket, because you can see how we're sort of truncating the hash here a little bit; we're using it to narrow things down.

So you can often have two keys in the same bucket, and we call that a collision; they're said to collide. Let's run a really simple hash table across seven keys and values. We've got keys here, which are some names, and values, which are just individual integers, and we're going to store them in a hash table of also seven buckets. Potentially we could add more keys here, but we're just going to do seven into seven and see what happens.

Then we're going to run a little index function, and what I'm going to do is really simple: I'm going to take the first byte of the SHA hash output and run a mod 7 over it, just to find which bucket these go into. I'm using seven because I could also mask off the first three bits, which would be a mask of seven, and just use three bits; that's a really interesting thing that I'll let you think about, why we might want to do that.

But for now it's just a mod 7, and that gives us a number between 0 and 6, and then we sort them into the buckets they belong in. You can see that we end up filling only three buckets with these seven names, which doesn't sound great, but given enough of these keys you would find the distribution roughly even across all of these buckets, and you'd fill them up evenly.
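
Here is a small runnable sketch of that scheme: the bucket index is the first byte of a hash of the key, mod 7. I'm assuming SHA-256 as the hash, and most of the names below are made up; only William, Ann and Gary come from the talk.

```python
# Toy hash table in the style described above: bucket index is the first byte
# of SHA-256(key) mod 7.
import hashlib

NUM_BUCKETS = 7
buckets: list[list[tuple[str, int]]] = [[] for _ in range(NUM_BUCKETS)]

def bucket_index(key: str) -> int:
    return hashlib.sha256(key.encode()).digest()[0] % NUM_BUCKETS

def put(key: str, value: int) -> None:
    buckets[bucket_index(key)].append((key, value))

def get(key: str):
    # Jump straight to the bucket, then scan it linearly for the key.
    for stored_key, value in buckets[bucket_index(key)]:
        if stored_key == key:
            return value
    return None

for i, name in enumerate(["William", "Ann", "Olivia", "Noah", "Emma", "Liam", "Ava"]):
    put(name, i)

print(get("William"))  # 0, the value stored alongside the key
print(get("Gary"))     # None: the bucket scan finds no matching key
```

Looking up Gary lands in whichever bucket his hash points at, but the scan of that bucket finds no matching key, so the lookup returns nothing.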
To look up in a hash table, you do the same thing. You take a key, you run the index off the hash, then you jump to that bucket and you do a search through it; that could be a sorted search or just an iteration through the bucket to find the value that you want. We store both the key and the value in those entries so that you can actually check the key.

So, for instance, if I want to look up William, I take the key William, I run the hash over it, and I find out, using the hash and that little function, that it's in bucket one. So I go and look in bucket one, there's an array of three items, I look through that array, I find William, and then I've got a value. Great. What happens if I want to find Gary? Well, I run a hash of Gary, the index function gives me one, I go to bucket one, there are three items, and Gary's not in there. So Gary does not exist in our hash table.

Okay, hash tables for IPLD: are they any good for IPLD? Well, it's this bucket-size problem that becomes a problem for IPLD. If these buckets can grow infinitely, then we're going to get blocks that are too big.
We could encode a bucket into a block, but if we keep on adding to it, the block just gets out of control, and it's not very useful if you've just got one block with all this data in it. When we keep on mutating these things and adding things to our blocks, the block sizes get bigger, the costs associated with that mutation keep on going up, and that goes up exponentially. So we don't actually want very large blocks for our data.

We also want to stop users from being able to force collisions. So to solve these issues, we're going to turn the hash table into a tree, and that's what a HAMT does. A HAMT takes a hash table and repeats the pattern, by taking buckets and turning them into their own little hash tables, and repeating the index operation as we iterate through a hash.

So, instead of an array of buckets, what we can do is have an array of either entries or links to a new table. As an in-memory data structure in some programming language, that link would just be a pointer to another struct, but in IPLD it's a CID, a link to a whole new block. We run the same index function, but we have a way of deriving a different index at each level of this tree, depending on which level you're on, over the same hash.
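
A minimal sketch of that per-level indexing, assuming SHA-256 and a whole byte of the digest per level; real HAMTs typically consume a configurable number of bits per level rather than a full byte.

```python
# Per-level indexing over the same digest: level 0 uses byte 0, level 1 uses
# byte 1, and so on (a simplification of real HAMT bit-slicing).
import hashlib

NUM_BUCKETS = 7

def index_at_level(key: str, level: int) -> int:
    digest = hashlib.sha256(key.encode()).digest()
    return digest[level] % NUM_BUCKETS

print(index_at_level("William", 0))  # bucket to use in the root block
print(index_at_level("William", 1))  # bucket to use in a second-level block
```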
So, for example, if the original block uses the first byte, like we did, then maybe the second layer could use the second byte to derive the location. We're going to try that with this hash table; we're going to turn it into a tree. First of all, we'll notice that we have arrays of entries here, and we just want to have one entry per slot. One of them is fine: the first one has only one element in it.

So we'll leave that one alone, but two of them have three elements, so we're going to insert links there and run this same procedure across those two. We'll run the hash again, or we could just reuse the same hash, but our index function now jumps up to the second byte.

We take the second byte of the hash and we get a totally new index for our elements, and then we can put them in new buckets for this second layer, and you'll see they fall nicely into completely separate buckets. So now we have a two-layer tree where each entry falls into a separate bucket at the end. There are no arrays of elements here, there's no searching: you've just got to look up where something is and jump to the next thing until you find it.

This is what our IPLD blocks will look like. They're simply blocks with an array, and the array slots either hold an entry or they hold a CID to another block.
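
Purely as an illustration of that block shape, a node might be pictured like this; the slot layout, field names and placeholder CID are made up for the sketch, not taken from the actual IPLD HAMT spec.

```python
# Illustrative shape of one node: an array whose slots are empty, hold an
# inline key/value entry, or hold a link to a child block.
node = [
    None,                          # empty slot
    {"link": "bafy...child"},      # CID pointing at another block
    {"entry": ["Ann", 3]},         # key/value stored inline
    None,
    {"entry": ["William", 0]},
]
```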
So, looking up: we run the hash of Ann, and we find that in the first block the bucket we want to look in is bucket number four. We go there and we find a CID. So the entry is not there, but it tells us where to go: the second block, the second layer. We run that index function again, but over the second byte, and we find a new index, which is five. We look in five, and there it is. Great.

Now, if we did Gary: the first step finds bucket number one, we go in there, find a CID, jump to that second CID, and it tells us to look in bucket number two, and it's not there; Gary doesn't match William. So Gary doesn't exist in our hash table.
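
Here is a tiny runnable sketch of that two-level walk, with a hand-built structure standing in for the blocks; the keys, values and use of SHA-256 are assumptions for illustration, not the data from the slides.

```python
# Two-level lookup walk over a hand-built structure standing in for the blocks.
import hashlib

NUM_BUCKETS = 7

def index_at_level(key: str, level: int) -> int:
    return hashlib.sha256(key.encode()).digest()[level] % NUM_BUCKETS

# The root links to a child node in the slot William hashes to at level 0;
# the child holds William's entry in the slot he hashes to at level 1.
child = {index_at_level("William", 1): ("William", 0)}
root = {index_at_level("William", 0): {"link": child}}

def get(key: str):
    node, level = root, 0
    while True:
        slot = node.get(index_at_level(key, level))
        if slot is None:
            return None                               # empty slot: key absent
        if isinstance(slot, dict):
            node, level = slot["link"], level + 1     # follow the link, use the next hash byte
            continue
        stored_key, value = slot
        return value if stored_key == key else None   # reached an entry: keys must match

print(get("William"))  # 0
print(get("Gary"))     # None
```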
Okay, have we solved our problems? Well, we've controlled the size of the blocks, because each one holds at most n entries or links, so these things can't grow infinitely. They're bounded, which is great. Overflows: when you hit an overflow, it causes the creation of new blocks. And we've also increased security by using more of the hash.

You might be able to force collisions by making keys that have a particular first byte that you want, but it's very difficult to keep doing that as you proceed through a hash. You could keep on forcing it, but look how difficult it is to hash for Bitcoin and get those leading zeros; it's very difficult to brute-force hashes that will collide at all the places you want.

So, characteristics. Denseness: hash digests give us randomness, which is great, because it gives us this dense distribution of non-random keys. It doesn't really matter if you're storing keys that look like each other; the hash function gives us this random distribution. We also don't have a strict height-one base layer where all the entries are stored, like our super large array; they don't all exist at the bottom.
They're actually distributed all the way through the thing, and I'll show you a picture of what that might look like. We don't strictly say that there's a bottom level where they all exist, but the more you fill it up, the more the entries tend to get pushed down. The linear scan is also eliminated, so we don't need to search through buckets to find these things; the lookup costs are really all about jumping blocks and running the index function.

So that's lookups. Deletion: when we delete an element, we have to reverse what we've done, and what we do is a collapse, to turn this back into the form it would have if those elements had never been there. So if we only have one element left in a node, then we collapse that element back up into the parent, and we end up with the form of the HAMT that it would have had if the deleted element was never there.
I'll tell you why that's important in a second. We can also bring back buckets, and we actually do this in practice when we use HAMTs: we bring back buckets with a maximum length. So what we say is that there's a maximum bucket length, say three.

It's a little bit like having these breaks between talks: you allow for some flexing when people go too long or too short. It helps balance the mutation size against the block size and the traversal-depth trade-offs. So buckets tend to be something we use in practice.
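
A rough sketch of that rule, under the same illustrative assumptions as before (SHA-256, one digest byte per level): entries pile up in a bucket until a maximum length is exceeded, then the bucket is replaced by a link to a child node one level deeper.

```python
# Bucket-overflow sketch: a bucket grows until it exceeds MAX_BUCKET, then it
# is replaced by a link to a child node and its entries move one level down.
import hashlib

NUM_BUCKETS = 7
MAX_BUCKET = 3

def index_at_level(key: str, level: int) -> int:
    return hashlib.sha256(key.encode()).digest()[level] % NUM_BUCKETS

def insert(node: dict, key: str, value, level: int = 0) -> None:
    idx = index_at_level(key, level)
    slot = node.setdefault(idx, [])
    if isinstance(slot, dict):                 # the slot is already a link to a child node
        insert(slot["link"], key, value, level + 1)
        return
    slot.append((key, value))
    if len(slot) > MAX_BUCKET:                 # overflow: push the whole bucket down a level
        child: dict = {}
        for k, v in slot:
            insert(child, k, v, level + 1)
        node[idx] = {"link": child}

root: dict = {}
for i, name in enumerate(["William", "Ann", "Olivia", "Noah", "Emma", "Liam", "Ava"]):
    insert(root, name, i)
```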
So, canonical forms; this is back to the idea of why we do that collapse on deletion. What we typically want in IPLD is that, for any set of data, we can run the algorithm and end up with the same root, the same CID for our DAG, regardless of the order in which we create this thing.

So, for a key-value set: regardless of what order we add our keys, we want to end up with the same root CID if it contains the same data. This deduplication allows us to have consensus when we're talking about the same data set. With HAMTs we do that by sorting the buckets, if we have them, so that their entries aren't just in there in insertion order.
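
A minimal sketch of one such canonicalisation rule, with JSON standing in for the real IPLD codec: sort bucket entries by key before encoding, so the same set of entries always produces the same bytes regardless of insertion order.

```python
# Canonicalisation sketch: order bucket entries by key, not by insertion order,
# so the same set of entries always encodes to the same bytes.
import json

def canonical_bucket(entries: list[tuple[str, int]]) -> bytes:
    ordered = sorted(entries, key=lambda kv: kv[0])
    return json.dumps(ordered).encode()

a = canonical_bucket([("Will", 2), ("Ann", 1)])
b = canonical_bucket([("Ann", 1), ("Will", 2)])
print(a == b)  # True: insertion order no longer affects the encoded block
```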
Just for a mental picture of what HAMTs end up looking like: this is the kind of thing we would see in practice with a HAMT, where the little orange blocks are values that we might find. You'll see that the values are distributed throughout the DAG, but they tend to be pushed down towards the bottom, and the upper nodes tend to contain mostly links.

But, unlike the super large array, you can't just look along the bottom layer and find all your entries; they're actually distributed all the way through. So, just evaluating HAMTs: they're dense, which means they can be fast for lookups. We don't have to traverse all the way down to the bottom to find what we want; we can often just hop straight to it, and because we pack the values in really nicely with this randomness, we get this denseness. And they're flexible; they've got lots of knobs.
We need a cryptographic hash function. If we don't use a cryptographic hash function, then we allow people to brute-force collisions. That may not matter in some situations where you don't accept user input, but typically for IPLD we want cryptographic hash functions, to get this randomness and avoid these collision problems; HAMTs are all about randomness. They can't do key ordering, which is a bit of a bummer. You can iterate over a HAMT, but the order in which you iterate will always be random.

So you can't do much with a HAMT without running the algorithm. You can transfer it, but it's very difficult to make it meaningful without having the algorithm to run over it.
I might skip that one. For transfer or traversal, the denseness we get is great: when we're transferring this between two parties, we don't have these long stringy things that we have to go back and forth over. So denseness is great, but again, because we need the algorithm, it's very difficult to do any prefetching.

Any smart prefetching, anyway; it's basically impossible to predict where you're going to go in a HAMT to find the value that you want. I'm going to skip schemas. Just quickly, other data structures, just to compare this to some other ones: we have this other data structure we're using in Filecoin. It works okay in Filecoin, but it's not very awesome.
If we were to replace a data structure in Filecoin, it would be the AMT. It's a specialized data structure that is used to store array-like data; it's a little bit like the super large array, but it's a mashup between that and a HAMT. The reason we do that is because in Filecoin, where we use it, we actually want a sparse array: we don't just want to pack up from zero onwards, we want to insert indexes in a sparse way.

The reason it's not awesome is that it fails our denseness criteria, because what we end up with is the AMT doing this kind of thing where the root, at the top left here, connects down to nodes that span a long way out to the far right. The larger the indexes you're storing, the further out you push, so you can imagine that transferring this kind of data gets pretty annoying.
Lastly, ordered collections. This is something Mo mentioned: ordered collections are the next great Holy Grail. With ordered collections we can build databases. I'd love to be able to come and do a talk about all the things we can do with ordered collections, but we don't have solid ordered collections implemented yet.

If we can build these things, then we can build all sorts of powerful things on top of them, but they're a little bit hard, because if you take away the randomness, then you have all sorts of problems with density and structure. The key insight that we're working on, including in Prolly trees, and there are a couple of other approaches as well, is that we can use hash digests as a source of randomness to generate probabilities, rather than strict branching like we do with a HAMT: we can run batches of keys through the hash and generate probabilities of branching.
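
As a very rough sketch of that idea, and only as an assumption about how such schemes tend to work rather than a description of any particular Prolly tree implementation: a key's hash can decide, with some target probability, whether a node boundary falls at that key.

```python
# Hash-derived branching probability: a boundary falls at a key with
# probability roughly 1/target_fanout, determined entirely by the key's hash,
# so chunking is deterministic for a given key set. The threshold is illustrative.
import hashlib

def is_boundary(key: str, target_fanout: int = 16) -> bool:
    return hashlib.sha256(key.encode()).digest()[0] < 256 // target_fanout

keys = ["key-%d" % i for i in range(32)]
print([k for k in keys if is_boundary(k)])  # the keys that happen to start a new node
```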
A talk on Prolly trees would probably go into more detail on this. Anyway, that's it from me. Hopefully that wasn't too long.