From YouTube: Web Scale IPFS - @mikeal - IPFS Implementations
Description
Web Scale IPFS - presented by @mikeal at IPFS Thing 2022 - IPFS Implementations - https://2022.ipfs-thing.io
Hey everybody, so I'm going to talk a little bit about the read and write pipelines in web3.storage.
web3.storage is the platform that nft.storage kind of rests on top of, and, you know, I think we built nft.storage in a couple of weeks, so it was not built on the most sustainable architecture. As we've continued to scale, we've found that we can't really buy the scale that we wanted, so we had to go and build it. And now web3.storage is really set up to be not just the provider for nft.storage, but for many nft.storage-sized customers, so I'm here to talk about what we've had to do at scale.
So, a little bit about DAG House: we're a nucleating entity inside of Protocol Labs. That means we're in Protocol Labs and we're kind of on the rails to become an independent entity. This is a little bit weird to talk about in an audience full of protocol experts, but in the broader ecosystem of service providers it's pretty rare to have this much protocol expertise inside of one team building things, and that's really shaped what we've been able to do as a team and, soon, as an independent company.
We also have a lot of real live users, a lot of whom just learned how to program in a boot camp, and they will tell you when things break and they will tell you when things are too slow, so that's kept us really honest and really focused on user needs as well. The way that we tend to look at things is: what is the user need that we can uniquely solve as a service provider with protocol expertise?
And man, it is awesome to build distributed cloud systems when your users hand you decentralized protocols. Nothing is in the way of you scaling that out and building it properly. We've had a lot of iterations now in cloud systems, and there's a lot of amazing cloud infrastructure that you can pull off the shelf, but you're often limited by what your users want to do requiring some level of centralization, right? Like, no number of features in Dynamo will help you if what your users want to do is a SQL query.
So let's talk first about the write pipeline. When we're looking at writes, we need to think about the writes being in three states. The first is at rest, on a user's device. The second is: we have the data, we've taken it into our system in some way and ingested it from the user. And the third is that the data is actually available in the IPFS network, and that can be a little tricky to talk about and guarantee, but I'll get into that.
So, data at rest: it's usually not already in a CAR file, but we would very much love it to be in a CAR file. If it is already in a CAR file and broken into a DAG, then we don't have to do that work in our back end, but even more importantly, the cryptographic guarantees that they want in order to do the next part of their workflow in the user's application are already available to them.
Before we've made the data available, what we're looking at when we take in IPFS data is that we're usually part of some other user's transaction, right? They're putting data into our system so they can get a CID, so they can put it into a blockchain transaction or some other system, and giving them that CID early means they can start to put that all together concurrent to us taking in the data, which really helps them out and provides a much better user story where they get immediate feedback.
But one of the disadvantages here is that there's just not a UnixFS encoder for every language and every system; if you're on Python right now, you kind of can't do this. So for this we've been building two-stage infrastructure, where we take data in various formats and then turn it into a CAR file.
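As a rough illustration of that second stage, here is a minimal sketch of turning a value into a CAR file with the public @ipld/car and multiformats libraries. The real pipeline builds a full UnixFS DAG out of the user's files first, so this only shows the shape of the step, not the production code.

```ts
// Minimal sketch: encode one dag-cbor block and wrap it in a CAR file.
// Illustrates the "turn data into a CAR" step, not the web3.storage client.
import { CarWriter } from '@ipld/car'
import * as dagCbor from '@ipld/dag-cbor'
import { CID } from 'multiformats/cid'
import { sha256 } from 'multiformats/hashes/sha2'

async function toCar (value: unknown): Promise<Uint8Array> {
  // Encode the value as a dag-cbor block and derive its CID.
  const bytes = dagCbor.encode(value)
  const hash = await sha256.digest(bytes)
  const root = CID.create(1, dagCbor.code, hash)

  // Stream the block into a CAR with that CID as the root.
  const { writer, out } = await CarWriter.create([root])
  const chunks: Uint8Array[] = []
  const collected = (async () => {
    for await (const chunk of out) chunks.push(chunk)
  })()
  await writer.put({ cid: root, bytes })
  await writer.close()
  await collected

  // Concatenate the streamed chunks into one CAR payload.
  const size = chunks.reduce((n, c) => n + c.length, 0)
  const car = new Uint8Array(size)
  let offset = 0
  for (const c of chunks) { car.set(c, offset); offset += c.length }
  return car
}
```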
The CAR file then becomes what we take in. Even our new pinning API that's getting built out is just taking pin requests, turning them into a CAR file, and writing them into the system. We'll take that for regular user data, and we'll even take, like, tarballs for, you know, directory structures and stuff like that.
We're currently testing this internally right now, so it isn't available to users yet, but this is a peek at our new data receive pipeline.
We really wanted to build a system in which users could delegate that to other users and to devices down a chain, and UCANs really gave us that. So the new system takes this UCAN request that says, hey, I want to upload this CAR CID. A CAR CID is a CID that's the hash of the entire CAR file that's coming in. It's not the root node in the CAR file; that's nothing, don't worry about that.
It's the hash of the entire CAR file, and we actually enforce SHA-256 for this, for a reason that I'll get into in a minute. Then what we return to the user is actually a signed URL to S3. This allows our customers' customers' devices to directly upload into S3 without any proxying layer in between, and then we receive that into our system and we can do stuff with it.
We key that, and this is brilliant, with the CID, as cid/cid.car. The way that S3 implementations work (every cloud provider has an S3 implementation now, because it's so popular) is that they have various load-balancing and scaling algorithms, but they tend to partition by prefix, where you need some data locality.
If you look at AWS's documentation, for instance, for what the read and write throughput limits are in S3, they won't tell you that there's any limit on a bucket; the limit is always per prefix. So if you prefix by something that's a hash, you've now distributed all of the names across the entire keyspace evenly, so all of their scaling algorithms are going to work kind of perfectly and you hit no limitation, right?
The other amazing thing is that in these signed S3 URLs you can tell it to validate the SHA, the checksum of the actual input data. So we only give you URLs that S3 will validate with that SHA-256 hash, so if a thousand users try to upload the same thing at the same time, we can give them all a signed URL into the same bucket and none of them can overwrite each other. We basically have a lock-free upload infrastructure into S3, into our distributed system.
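A minimal sketch of that write path using the AWS SDK v3 presigner: the bucket name and expiry are placeholders and this isn't the actual web3.storage API code, but it shows the two ideas from the talk, the CID-derived key and the SHA-256 checksum that S3 enforces on the upload. The uploader then has to send the matching x-amz-checksum-sha256 header with its PUT.

```ts
// Sketch: hand back a presigned S3 PUT URL keyed by the CAR CID, with the
// CAR's SHA-256 baked in so S3 rejects any upload whose bytes don't match.
// Bucket name, expiry, and key layout here are illustrative assumptions.
import { S3Client, PutObjectCommand } from '@aws-sdk/client-s3'
import { getSignedUrl } from '@aws-sdk/s3-request-presigner'

const s3 = new S3Client({ region: 'us-east-1' })

async function signCarUpload (carCid: string, carSha256Base64: string): Promise<string> {
  const command = new PutObjectCommand({
    Bucket: 'car-uploads',               // placeholder bucket name
    Key: `${carCid}/${carCid}.car`,      // hash-prefixed key: spreads load evenly across S3 partitions
    ChecksumSHA256: carSha256Base64      // S3 validates the uploaded bytes against this digest
  })
  // Anyone holding this URL can PUT exactly these bytes for the next hour;
  // identical uploads land on the same key, so concurrent writers can't clobber each other.
  return getSignedUrl(s3, command, { expiresIn: 3600 })
}
```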
So yeah, we're incredibly excited about this, and it really sets us up to not just support nft.storage, but to support thousands of customers the size of nft.storage. Once we have the data, we need to make it available, and this is why we built Elastic IPFS. The way that Elastic IPFS works is that you give it a URL to a CAR file as input; you don't sort of write data into the system.
Our original design actually was to write it into a bucket and get the bucket notification, and then we realized it's so much nicer if you just decouple those things, because if it's just a URL you can put data in different buckets for whatever reason you want, and you can take data from remote systems and other customers that already put it up at HTTP URLs; it doesn't really matter to us. This also allowed us to onboard a lot of data that we already had
before we even had the piece that I just showed you ready. So we actually have this running in production now; it's the main storage provider for nft.storage and web3.storage. We were able to onboard our entire backup bucket of CAR file data and then basically use the bucket that we were sending backups into as our main storage infrastructure, without swapping out any infrastructure, and that was all able to come up in parallel to the existing systems.
Then, once we have a URL to a CAR file, that CAR file gets indexed by a Lambda, and for every block in it we'll pull the multihash out. The multihash gets written into DynamoDB, and we've just sort of revved the Dynamo schema there.
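Roughly what that indexing step looks like, as a sketch rather than the production Lambda, using the public @ipld/car CarIndexer: the table name and attribute layout are assumptions, and it assumes Node 18+ for the built-in fetch.

```ts
// Sketch of the indexing step: fetch the CAR by URL, walk its blocks, and
// record each block's multihash -> (carUrl, offset, length) in DynamoDB.
// Table and attribute names are illustrative, not the real schema.
import { CarIndexer } from '@ipld/car'
import { DynamoDBClient, PutItemCommand } from '@aws-sdk/client-dynamodb'
import { base58btc } from 'multiformats/bases/base58'

const dynamo = new DynamoDBClient({})

async function indexCar (carUrl: string): Promise<void> {
  const res = await fetch(carUrl)
  if (!res.ok || !res.body) throw new Error(`failed to fetch ${carUrl}`)

  // CarIndexer yields { cid, blockOffset, blockLength } for every block
  // without buffering the whole CAR in memory.
  const indexer = await CarIndexer.fromIterable(res.body as any)
  for await (const block of indexer) {
    await dynamo.send(new PutItemCommand({
      TableName: 'blocks',   // placeholder table
      Item: {
        multihash: { S: base58btc.encode(block.cid.multihash.bytes) },
        carUrl: { S: carUrl },
        offset: { N: String(block.blockOffset) },
        length: { N: String(block.blockLength) }
      }
    }))
  }
}
```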
That works a lot faster now too. Then, once those records are in Dynamo, there's actually a pool of Node.js processes managed by Kubernetes, so depending on the amount of load in the system it'll spin up more or scale down, and those are handling the Bitswap requests that all come in over WebSockets. As far as how we distribute and manage the WebSocket connections, we're just leveraging the regular AWS WebSocket infrastructure there; they do all of that load balancing for us. In fact, we now get to operate as one peer ID for the entire system, and we can even run in multiple regions and have that also managed by AWS at the connection layer.
This is really nice, actually, because when you're a really large provider you want to get added to every gateway's peer list, so that you're just really fast without having to do DHT lookups, and if you're constantly adding new nodes into your cluster you have to constantly be messaging all of those providers like, hey, add these new peer IDs. So having one peer ID has been really phenomenal here.
All right, let's look at reads. Most reads come from HTTP gateways; that's just kind of the reality right now. So let's look at how we handle some of that. We're kind of in love with Cloudflare for our read architecture, and really for a lot of our HTTP architecture.
They have mostly free egress, which is really, really nice. Being a multi-tenant IPFS provider is a little bit difficult in that you don't know which customer to charge for read throughput, because multiple customers can upload the same data and you're just getting the content addresses, so it becomes very, very difficult to actually charge for read bandwidth. So it's nice to have some revenue alignment between what we're being charged for and what we can actually charge our users for. We built a gateway CDN.
I think this has been mentioned a few times, but it's not technically an IPFS gateway, even though it has the whole IPFS gateway API there. It's really a CDN in front of gateways. The way that operates is: there's the regular HTTP cache in Cloudflare, which is there for anything you do, including Workers, and about 40% of our requests just hit that regular HTTP cache and they're fine, and obviously IPFS data is immutable.
So all of those cache headers say, you know, never take this out of cache if you don't need to. Then there's a secondary cache that we have in Workers KV, and we also have a product called SuperHot, where you can take any gateway URL and we'll just cache the whole rendered state of that gateway URL in Cloudflare forever, so we can make that super fast as well. Those caches actually catch more than 60% of the 60% of requests that get past the first cache.
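Here's a rough sketch of how that layering could look in a Cloudflare Worker. It's an illustration under assumptions (the KV binding name, the TTL, and the fallback fetch are made up), not the actual gateway worker.

```ts
// Sketch of the cache layering in a Cloudflare Worker: edge HTTP cache first,
// then a KV namespace, then an origin fetch; responses are marked immutable
// because content-addressed data for a given CID can never change.
// The KV binding name and the fallback are illustrative assumptions.
export interface Env { GATEWAY_CACHE: KVNamespace }

export default {
  async fetch (request: Request, env: Env, ctx: ExecutionContext): Promise<Response> {
    // Layer 1: Cloudflare's regular HTTP cache.
    const cache = caches.default
    const cached = await cache.match(request)
    if (cached) return cached

    // Layer 2: secondary cache in Workers KV.
    const key = new URL(request.url).pathname
    const fromKv = await env.GATEWAY_CACHE.get(key, 'arrayBuffer')
    if (fromKv) {
      return new Response(fromKv, {
        headers: { 'cache-control': 'public, max-age=31536000, immutable' }
      })
    }

    // Layer 3: fall through to the origin (in production, the gateway race described below).
    const response = await fetch(request)
    const immutable = new Response(response.body, response)
    immutable.headers.set('cache-control', 'public, max-age=31536000, immutable')
    // Populate the edge cache in the background without blocking the response.
    ctx.waitUntil(cache.put(request, immutable.clone()))
    return immutable
  }
}
```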
So those hit rates are great, and if it's not in one of these caches, then we race a bunch of gateways to see who's the fastest. Cloudflare was kind enough to actually run a Cloudflare gateway in our zone, private just to us, for this infrastructure, so we hit that one, we hit ipfs.io, and we hit Pinata, all in parallel. So that's great; we're really happy with the performance of our gateway right now, and our customers are really happy.
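The race itself can be sketched with Promise.any plus an AbortController per gateway. The gateway list follows the talk, but the URLs and the helper are illustrative, not the production code.

```ts
// Sketch: race several gateways for the same IPFS path, take the first
// successful response, and cancel the losers. Error handling is simplified.
const GATEWAYS = [
  'https://cloudflare-ipfs.com',   // stand-in for the private Cloudflare gateway in our zone
  'https://ipfs.io',
  'https://gateway.pinata.cloud'
]

async function raceGateways (ipfsPath: string): Promise<Response> {
  const controllers = GATEWAYS.map(() => new AbortController())

  const attempts = GATEWAYS.map(async (base, i) => {
    const res = await fetch(`${base}${ipfsPath}`, { signal: controllers[i].signal })
    if (!res.ok) throw new Error(`${base} responded ${res.status}`)
    return { res, winner: i }
  })

  // Promise.any resolves with the first fulfilled attempt and only rejects
  // if every gateway fails.
  const { res, winner } = await Promise.any(attempts)

  // Abort the requests that lost the race so we stop pulling their bytes.
  controllers.forEach((c, i) => { if (i !== winner) c.abort() })
  return res
}

// Usage: const res = await raceGateways('/ipfs/<cid>/path/to/file')
```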
And having immutable data means you can return it with cache headers that say never let this fall out of cache, and you're very safe. Now we're starting to look at what a Bitswap CDN would look like, because we're very happy about all of this and we're really kind of unhappy with AWS's bandwidth charges, so we're looking at Cloudflare Workers. Alan and Vasco actually got Bitswap running in a Cloudflare Worker just between the last time I did this talk and now; they're amazing. We were running the math on this, and it's actually cheaper for us to copy the data out of S3 into R2 and then serve it once than it would have been to serve it twice from AWS.
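To make that cost comparison concrete, here is the back-of-the-envelope version under assumed list prices; the talk doesn't give exact figures, so roughly $0.09/GB for S3 internet egress and free R2 egress are the assumptions.

```ts
// Back-of-the-envelope sketch with assumed prices (not figures from the talk):
// S3 internet egress ~ $0.09/GB, R2 egress ~ $0/GB. Copying S3 -> R2 pays the
// S3 egress once; every read after that is free, while serving directly from
// S3 pays egress on every read.
const S3_EGRESS_PER_GB = 0.09   // assumed AWS list price
const R2_EGRESS_PER_GB = 0.00   // Cloudflare R2 does not charge egress

function costPerGB (reads: number, viaR2: boolean): number {
  return viaR2
    ? S3_EGRESS_PER_GB + reads * R2_EGRESS_PER_GB  // one copy out of S3, then free serving
    : reads * S3_EGRESS_PER_GB                     // every read egresses from S3
}

// costPerGB(2, true) === 0.09 < costPerGB(2, false) === 0.18
// i.e. the copy pays for itself by the second read, matching the claim above.
```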
So even having two copies of the data in two systems is not really a problem for us, and we have a lot of processing workloads that we actually have to run in AWS, because Cloudflare has like two percent of the features of AWS. So yeah, we're looking into this right now, and I think that we're probably going to end up going that way. And that's my talk, thank you.
And I should say: there are a bunch of follow-up talks about Elastic IPFS in a few different tracks, including the Connecting IPFS track that I'll be running tomorrow.