Content Routing Performance Goals - Juan Benet at IPFS Thing 2022 - Content Routing 1: Performance - https://2022.ipfs-thing.io
So now I'm going to give a quick intro to content routing, to make it easier to think about the content routing problem for us in terms of IPFS, given that a lot of us are really familiar with it. Think of the traditional HTTP model as not having to solve a content routing problem, because how to find the content is embedded in the URL, to some extent.
A URL like example.com has a name system: the domain gets mapped to an IP address, and there's routing to the particular computer. So there's not exactly content routing; there is regular IP network location routing, but once you connect to those machines, those machines are supposed to know what that content is, and they can either return that content to you or give you an error. Now, in reality, there are content routing systems underneath those machines.
When you think of large cloud systems, whenever you get to those machines and request a resource, there usually is some form of content routing, but it usually happens entirely within one administrative domain or one single service. That means there's a range of sophisticated tools they can use within one locus of control to provide whatever resource you're looking for.
If you think of an S3 bucket, or an image on a social network site, or something like that, there's some sophisticated system underneath, where the machines handling your request find the specific object you're looking for and return it to you. But the problem is a lot easier when you control the thing end to end: when you completely control that system, you can evolve it over time.
You can scale it to your requirements and so on, and you don't have to deal with writing a protocol that a lot of different participants handle. The content routing problem emerges when you decouple the location addressing of the web and do content addressing instead, gaining all the benefits that confers, like being able to route things by cryptographic hash and so on.
So the content routing question is: how do you go from a CID to the set of participants in the network that have the content, and do so with a decentralized protocol? Think of it kind of like the IP routing world, with its whole large set of systems designed to route IP addresses, and imagine an equivalent system emerging for routing content addresses: mapping the locations of all the different resources and being able to route your particular request to wherever the nearest participants are.
Sometimes it's also not just about near participants. Sometimes you have to deal with authorization requirements or authentication requirements, and so on. So it's not just as simple as finding all of the participants: you also have to take into account who the request is coming from. Are they authorized to view this resource? Are they even authorized to find out who has the relevant content? So there's a bunch of details there.
The good news in all this is that it's not harder than IP routing. IP routing is a pretty hard problem; there's an enormous set of protocols and massive-scale systems that enable it. Content routing is a similar scale of problem, so it's totally doable; it's just a matter of getting there. To talk a little bit about scale: currently in the Filecoin network, this is the broad map of the system.
Just to mention scale for a moment: we're already talking about hundreds of petabytes of data. That's a lot of content to route. Let's do a quick calculation, starting from 100 petabytes.
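One way that back-of-envelope might go. The block size and record size here are my own assumptions, not numbers from the talk, but they reproduce the tens-of-terabytes figure:

```python
# Back-of-envelope: how much routing metadata does 100 PB of content imply?
# Assumed numbers (not from the talk): ~256 KiB average addressable block,
# ~100 bytes per (CID, peer ID) provider record.
content_bytes = 100e15          # 100 petabytes of stored content
avg_block_size = 256 * 1024     # assumed average block size, bytes
record_size = 100               # assumed bytes per provider record

num_objects = content_bytes / avg_block_size
index_bytes = num_objects * record_size

print(f"{num_objects:.2e} routable objects")               # ~3.8e+11
print(f"~{index_bytes / 1e12:.0f} TB of provider records")  # ~38 TB
```

That lands right around the 40-terabyte order of magnitude discussed next.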
So you're dealing with a 40-terabyte set of records just for the content on Filecoin today: 40 terabytes of content routing information. That's the order of magnitude we're dealing with. One other component here: when you think about CDNs and the layout of the internet, you would ideally like to solve the content routing problem as close to the requester as possible.
Think of the big internet as a massive grapevine, where there are all kinds of different domains and sub-networks of devices, and a request from a particular computer at the edge is being routed through a whole set of machines. You ideally want to solve the content routing problem as close to the user as possible, to minimize the latency of answering that request.
If you can get to sub-10-millisecond resolution, that would be great: if you send a request out from a house somewhere and there's a content routing system right in your ISP, you can resolve a lot of the requests right there and redirect the user to wherever the content is, without having to go all the way out to the rest of the network.
I know this is very different from how traditional peer-to-peer systems have solved this problem. The traditional model is to route everything through a DHT, and this is what IPFS has been doing for a while. But when we do that, you end up with very long latencies to retrieve a particular piece of content, because you're having to hop around the entire internet trying to find who has it, and you don't end up with a very high-throughput way of resolving these queries.
Just for a sense of scale: this is the sort of range of objects getting generated by various different applications, and this data is three years old; you can imagine a bunch of these continuing to grow on some exponential curve. This is the number of uniquely routable objects that all of these applications are generating.
So in order to have a very efficient and very successful content routing model for the internet, we have to solve a problem of this magnitude. That means you need page-load-quality latency: a user opens a web page, enters an address or some search terms, presses enter, and the result has to render on a human-perceptible time scale. That means you have at most on the order of 500 milliseconds before it starts feeling slow.
Now, of course, you can amortize a lot of this: knowing that the user was already on Twitter, or has already been seeing tweets from the same sub-networks, can narrow down the problem space. But ideally you want that level of random access to this number of objects.
This is a very non-trivial problem. Being able to solve something at this scale requires treating latency very seriously when you design the system. You don't want to be hopping around the internet, because if you do, you're dealing with 100-to-300-millisecond links every time you send a message somewhere else.
If, right now, here from Iceland, we send a message to San Francisco, that's a 150-to-200-millisecond round trip. By the time we hear back, we maybe get one more of those before the budget is already spent, so we can't really build systems that require many hops through many nodes just to route.
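The arithmetic behind that, using the roughly 500 ms budget mentioned above and an assumed ~175 ms transatlantic round trip:

```python
# How many sequential cross-continental round trips fit in a page-load budget?
# Assumed numbers: ~500 ms total budget, ~175 ms Iceland -> San Francisco RTT.
budget_ms = 500
rtt_ms = 175

round_trips = budget_ms // rtt_ms
print(f"{round_trips} sequential round trips before it feels slow")  # 2
```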
For another sense of scale, take typical cloud-style workloads. I've estimated in the past that if you think of all of these applications and project their growth out to some extent, being able to handle 10^15 to 10^18 objects is roughly where you want to be.
There are probably not that many objects yet, and there won't be that many soon, but being able to build systems that can handle that scale is roughly where you want to be. Again: you need systems that can handle large scale and do so with very low latency. To me that very much constrains the problem, and we'll talk about this more later in the constraints section, but it constrains you to replicating a lot of the indexing information and putting it close to wherever it's going to be requested.
Let's talk about the problem a little more formally. Think of a content routing system that enables users to find content in a network. The "find" part is the query users are going to do; that's the search process through the system, a routing query. Content here we're going to identify by CIDs. Mapping all that into libp2p terms, finding content means finding other peers.
That means specific public-key-derived peer addresses that you can then map to the actual IP addresses, or whatever else you need to be able to connect to them. Ideally you have glue records there, or things like that, which give you that information so you don't have to do additional hops to find those peers.
Providers map content as a tuple of the CID they're providing and their particular peer ID, and then clients have the ability to search: the "find" operation is "find providers for a particular CID", and that should return an asynchronous channel of multiple peer IDs over time, because you want the search process to start returning things as quickly as possible.
Meanwhile, the system continues to look for more potential providers, because you might find a set of providers but not be able to get the content from them: you might try reaching them and they might not be online, you might not have the right authorization, or they might not want to interact with you.
So you need a system with a one-to-many mapping: for every CID there are many possible providers. Note, by the way, that it doesn't have to be completely consistent. You do not need a full view into all providers in the world, and you don't even need a lot of providers.
You just need enough providers to make the routing query successful. We've also talked in the past about record systems: you can create a record which includes the CID and the peer ID of the provider, and use the provider's private key to sign that record.
That's a non-repudiable record, which means that once it's created, that particular peer has used their identity to declare to the network that they do indeed have this content. That's tricky for any content that might become a target of censorship at some point.
It matters because it means signed statements can be found across the network showing that particular parties had particular content at particular moments in time. We'll talk about all these kinds of properties tomorrow, but these kinds of things can sneak in there.
So now, there's a whole set of properties that you might want in systems like this. (Sorry, I need to get rid of this animation thing.)
Think of properties from traditional distributed systems: consistency, availability, and partition tolerance. You want to be able to find the content in the network; you want high availability; you want the system to work through network partitions, so a user trying to request something should be able to do so with high throughput even if the rest of the internet is getting disconnected from them.
You want very high performance in terms of the throughput of raw records that the system can handle, and there are efficiency requirements on the search process: ideally it should be O(1) in the size of the network.
Meaning the search query itself should not grow with the number of nodes in the network. If it does, you're in a bad spot, because it might mean additional hops get introduced into the search, and each of those hops, if they're not local to one region of the world, is going to add 50 to 100 milliseconds. That can bust your entire latency budget.
And then there's a whole bunch of security properties we'll talk more about; there's a long document I've been working on with tons of different properties related to the security of these systems. We'll talk more about this tomorrow. So, just to put things into perspective:
This is roughly my grading of what content routing through a DHT looks like. There's pretty good scalability in some senses: a DHT is roughly okay on the distributed-systems model, but not great. This is why it was a good starting content routing system: you can use DHTs when you have a small amount of content, but they quickly degrade in terms of the performance of the search.
The red flag here is that in a traditional log(n) DHT like Kademlia, just to find content you're going to have to take log(n) hops through the network, and if that routing query does not take into account where you are in the world and your latency to other peers, then you're going to end up waiting multiple seconds to request something. Compare that to the tweet example.
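A rough illustration of why that multi-second figure is plausible. The node count and per-hop latency are assumptions (real Kademlia lookups also parallelize and take wider branching, so they do somewhat better), but the shape of the problem holds:

```python
import math

# Rough cost of a location-unaware DHT lookup.
# Assumed numbers: 1e6 DHT nodes, ~150 ms per hop when hops
# criss-cross the globe with no latency awareness.
nodes = 1_000_000
hop_latency_ms = 150

hops = math.ceil(math.log2(nodes))   # ~20 hops for a log2(n) lookup
total_ms = hops * hop_latency_ms
print(f"{hops} hops -> ~{total_ms / 1000:.1f} s per lookup")  # ~3.0 s
```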
The good news is that DHTs are really space-efficient: as you add more nodes into the network, you add a lot more capacity, and you can deal with huge numbers of records. So that's pretty good. DHTs also score fairly well on some security properties, things like resilience and permissionlessness: anybody can join a DHT, and they're pretty robust to certain kinds of failure.
They have some problems with things like eclipse attacks and so on, where an attacker can do some amount of censorship of records, but for the most part DHTs tend to be used because they have reasonably good security properties. Another problem, though, is that in DHTs you do not have reader or writer privacy.
There have been many approaches to try to imbue DHTs with reader/writer privacy, but in general it's extremely difficult, in great part because whenever you're looking things up there's a trade-off: you want to terminate the lookup as fast as possible for the user, but if you do, the network learns something about the information being requested, who's requesting it, from where, and so on. So you end up in a really bad spot.
Now, today we don't have to worry about the bottom half, the security and privacy properties; that's not for the short-to-medium term, that's more for the long term. But we ideally want really high-performance systems when it comes to the speed of querying things and handling a certain scale of records. Cool, so I think I'm going to stop here and move on to the next talk. Any quick questions I can maybe answer? I guess two questions.
Yes. I mean, we're talking super generally here, so you could have many different kinds of networks; there are many different ways of thinking about them.
A lot of DHT-like systems have this model where all of the participants in the network act as routers as well: both the content providers and the content consumers act as routers, and you try to organize that into a system.
There are other structures that distinctly separate the parties doing the content routing into a different type of agent, one that is neither providing nor consuming content. Then you get a lot more flexibility in designing things: for particular content, you can design a content routing system to do what you described, and that could totally work.
Yep. Are there other examples of routing systems that resolve in constant time? Yeah. So, for example, what would happen if you had something very straightforward, kind of like a hash table?
If you can enumerate the set of possible routers in a network and you know their identities, you can assign each of them a particular part of a key space, and then always go directly to that participant for it.
If you have churn in the participant set, then you have to deal with that. But if you can enumerate the set of participants and store it, then you totally can do it, and that can scale pretty well. The question is: how big does the list have to get before it becomes unreasonable?
Yes, exactly, yeah.
Sure, well, it depends again on the design of the system, because if it's not open membership, then you can get asymptotic performance down to O(1).
If you have consistency on the set of content routers, then whenever you're doing a query, you know precisely the set of content routers and you can go precisely to them, and you get O(1) on every request. But enforcing consistency on that set is difficult, right? Well, what do blockchains do? Blockchains enforce consistency; modern proof-of-stake blockchains enforce consistency across all the consensus nodes.
If you're going to be a consensus participant, you have a permissionless-ish way of joining the network and getting promoted to participate in the consensus, and once you do, everybody in the network knows you're participating and knows precisely all the nodes, and so on. And you can get that to scale to hundreds of thousands to millions of nodes.
But you may not need millions of content routers; Earth's not that big. The number of content routers you need depends greatly on latency: you want to serve one of these requests in ideally something like 50 milliseconds, and so you just need enough content routers to handle a lot of queries in a particular region.
You need enough of them laid out everywhere on the internet to serve queries at the 50-millisecond level, and then you need full replicas of everything right there. Or not full, actually: it doesn't have to be fully consistent. You could just hold the hot part of the content, because content usually has a drop-off rate where a tiny fraction of the content is requested the most.
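One illustrative model of that drop-off is a Zipf distribution over content popularity. The exponent and catalogue size here are assumptions, not measurements from the talk, but they show why caching only the hot part goes a long way:

```python
# Zipf(s=1) popularity over 1M objects: what share of requests
# hits the top 1% of objects? (Illustrative model, not measured data.)
N = 1_000_000
weights = [1 / rank for rank in range(1, N + 1)]
total = sum(weights)

top_1_percent = sum(weights[: N // 100]) / total
print(f"top 1% of objects serve ~{top_1_percent:.0%} of requests")  # ~68%
```

Under this model, a regional router caching just 1% of the records can resolve roughly two-thirds of lookups locally.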