Graph Protocol The Graph - MIPs Program, 14 Oct 2022

Previous Meeting Next Meeting

⏯

youtube image

►

From YouTube: The Graph - MIPs Workshop - AutoAgora

Description

The Graph is an indexing protocol for querying networks like Ethereum and IPFS. Anyone can build and publish open APIs, called subgraphs, making data easily accessible.

Follow The Graph on social media
Twitter: https://twitter.com/graphprotocol?s=20
Instagram: https://instagram.com/graphprotocol
LinkedIn: https://www.linkedin.com/company/thegraph/
GitHub: https://github.com/graphprotocol
Website: https://thegraph.com

Find more about the MIPs program here: https://thegraph.com/migration-incentive-program/ and follow all details of the program here: https://www.notion.so/MIPs-Home-911e1187f1d94d12b247317265f81059

A

Recording hello.

B

A

Yes, we're recording hello everyone and welcome back to another mips Workshop. Thank you all for attending today. I am very excited to have Alexi who is going to talk to us all about ogura and also Gora, as you can see, from the presentation, slides and with that I will hand it straight to him to take us into the workshop hi.

B

uh Thanks a bell uh yeah, so we're gonna talk about a little bit about Agora and then a little bit more about Otto Agora, um it's all about. Basically, the theme is how to price bigqueries you're serving right.

B

Okay, so I'm gonna introduce Agora. First, um it's gonna be a short introduction. I'm not gonna, go too much in detail here, because there's already some material out there, uh notably a workshop from Zach, who is the uh the main developer of Agora, so uh yeah I hope you'll get a chance to to check that Workshop out.

B

In any case, uh the prom setting. um We observe on real queries from from the network that you know for some sub graphs. So in particular here we have uni swap the the execution. Time of of queries can span three orders of magnitude, so anything between 10 milliseconds all the way up to minutes right.

B

B

Another thing also is that we're seeing that queries are quite repetitive, so that means that for most of the queries you're receiving, you can take action and you know determine if you wanna kick them out, or uh you know, price them according to their difficulty. Right I do hope by the way that you already know about graphql I'm, not gonna, I'm, gonna I'm, not gonna, teach you about graphql in here, but there is a lot of material out there right um in any case, in the context of the decentralized service.

B

um Here's the here are the components that are, you know, playing a role in in serving a query, so at the top I'm going to get the laser great.

B

um So the top we got the end user, the end user talks to the Gateway, so the end user prepays for their queries, with um with GRT on polygon or at least last time, I checked, that's how it worked and the end users provided a uh an endpoint to consume um to consume from their uh from their GRT money.

B

Queries right. The Gateway will select an indexer for each query once again, I'm not going to go into detail, but basically the selection is a function of the economic security, which is how much the indexer has staked uh I mean how much the indexer has in delegations take and also how much is located to the particular subgraph uh that is being served uh and other. You know, quality uh aspects like how many felt queries there were in the past and um latency right and and also the price of the query.

B

So that's where it gets interesting. If you press the particular query that the end user sent uh too high, then uh there is there's going to be a higher chance that another indexer is going to be picked in any case. So we've got the end user, send the graphql query to the Gateway and the Gateway selects us. Let's say um that query is gonna.

B

Go so that query, plus some payment data cryptographic, payment data, that's generated by the Gateway, will go to the um through the indexer service that the indexer is running and index or service is tasked with interpreting the payment data and accepting accepting the query and the payment and verifying that the payment is correct.

B

Etc if and the indexer service then um proxies that graphql query to uh to the query node, which the software is called graph node um to to to actually process that query and the graph node itself of course um gets its data from um from a postgres database that contains all the data from the subgraph. That's been indexed by the index, node Etc, so here I'm only showing the flow of queries, I'm, not showing all the rest of the components, index, reagent and the index nodes and and your you know, ethereum nodes Etc.

B

This is not here, I'm, just showing the relevant beats right.

B

um So an interesting aspect of all this is that the indexer service will also serve a cost model to the Gateway and that's how the Gateway determined how expensive the query will be for a particular indexer. The Gateway pulls the cost model from the indexer every about 30 seconds.

B

So before you can receive a query and before you can, the Gateway can decide for a price. You have to supply a cost model, so those cost models are written in the Agora DSL, and here we have a little example of a nagora entry, and it looks very much like a graphql query, except that um you define a price for that query shape basically right. So in particular here we say that the base cost of that query is 0.0042. Grt plus point zero: zero, zero, zero 38 times first.

B

So if you in that particular case, if the client asks for many um Many Many Items uh from the from the query response, uh we would charge more right, and so that would apply for the com to an incoming query like like such such a query and the charge price in that particular case would be 0.004314 GRT right. So that's pretty much how it works and that pricing is computed on the Gateway prior to receiving the query.

B

uh I invite you I, urge you I, wouldn't I would even say to check out the Agora language Workshop from Zach. uh That's the link and you can snap the QR code um I'm, also going to give a link to the presentation such that you can go back to this.

B

So the the overall flow once again, I'm not going to go in details, but you can find some information in the readme's of different pieces of software I'm going to mention here with the manual flow to create your Agora models. Is that first, you run your query node with a graphlock query: timing equals gql, and that will make the query node spit out. um Query the query contents and the query execution times uh to STD out and then it's your job to capture the logs uh Second Step.

B

You run those logs through q log, which is under the graph protocol organization in GitHub for analysis.

B

This tool will help you determine which are the most frequent query shapes what are the execution times of those query shapes Etc, and it also will help you to generate a Json L block files.

B

um Then what you can do from the information you got in step, two from the analysis is that you can write by hand your uh your Agora models for each of the subgraphs of interest and, fourth step. You can try out those Agora models using the Json L files that you generated with q log.

B

um You can try them out with Agora itself. There's a Agora software also under the graph protocol organization.

B

And that will basically estimate how much money uh you would have made if you had um basically replace replace your history of past queries that you've served and will help you determine how much money you would have made with that new uh cost model.

B

um So here's an example of a very small Agora model. We've got two entries and a default price, um so here it's pretty straightforward to understand. What's going on um and yet the default price is a catch-all. So that's.

C

B

Know if, if the queries you're receiving, are not matching those first two, then it will be applied that default price. uh Once again, if you want to know more details, you can do way, fancier things with Agora uh check out the Agora language Workshop that I mentioned earlier, um and in any case after this, so you you've got your Agora file. You can load it into your indexer stack to be served to the Gateway using that command, so that command comes from the indexer CLI um and you run graph index for cost set model.

B

You give the ipfs hash of the subgraph. You want to apply this to and the file, and that will, after that, we'll start serving that model to the Gateway for the gate which you press. Your your queries.

B

um If you want to know more about that tool, though I would expect it has been mentioned already, in other workshops, uh check out this link, it's a bit long, so we probably want to get to the presentation and go back to that page and get the link and you'll get more information about what the graph indexer tool can do for you.

B

So it's all nice, it's very powerful, but the problem is that it's hard to do and it's repetitive, you would have to run that loop I described here um for every new subgraph to do your indexing and ideally you would even redo the models for the subgraphs. You were serving uh because the statistics of the queries you're receiving, might change Etc, um so some indexers just elect to put a default price guesstimate and then roll with it. But we're not very happy with this, and especially for mips.

B

We really want you to um uh optimize your Revenue right and get as much revenue as possible. um So for this we created Otto Agora.

B

Automating pretty much every step of that process um and it's functionally split in two parts: uh it's split into a relative disc cost Discovery part which helps you determine for each subgraph. How different core queries are um are cost in terms of your Hardware um relative to each other and then in the second part, we're going to discuss how all of those prices are adjusted in real time to the current market conditions, foreign.

B

Looks a little bit scary, but it's it's not too bad um and that's that's a diagram of of part of the indexer stack and the auto Agora components that you have to plug onto your indexer stack for for it to work.

B

So we've still got our Gateway Etc. uh The indexer service is wrapped in a container called Auto agor index or service, and its job is to automatically um filter out the query timing logs. So the very ones that we were mentioning earlier, um such that those are sent to a queue um that and and then those log lines are processed through the auto Agora processor.

B

uh What the Auto Group processor is doing is uh extracting all the data from those log lines such that, such as the uh in the query itself, um the the execution time uh Etc.

B

um That data is then saved into a new postgres database that uh that saves the query logs and the query skeletons I'm going to come back to what that means. It's pretty much a log of all the queries. You've served with how long that you know with their execution, latencies and Otto. Agora then runs statistics from that database to determine which are the most frequent queries and builds and updates the relative.

B

The the Agora cost models automatically those cost models are pushed to the index reagent, just as the indexer CLI would do, and the index region takes care of, pushing that to the database that the indexer service uses and then the index or service serves those um autologoric generated, cost models continuously to the Gateway um yeah. That's pretty much so in green.

B

Here uh we got the original indexer components and in blue are the auto Agora added components right so I'm going to go into details uh for for each each step of the process so and uh in the Agora part, I discussed that the uh the query execution times come from the query nodes.

B

um Here we are gathering that data from the indexer service, the for the indexer service to to give you that data you have to set the environment. Variable query: timing, logs equal, true.

D

B

You get those log entries, it's already in Json format and, as you can see you can you get your um subgraph hash, get your query um as well as the query variables and the execution time in milliseconds.

B

So that data goes into the auto Agora processor, such that the um um the so the the query itself is extracted the um so that's where it comes from yeah from the queue um the the queries extracted. The schema of the subgraph is retrieved from the indexer service um and that helps us normalize the query, because in graphql there is many ways of basically asking the same question, but at the end it's the same price for us, so we want to normalize all of that in our statistical data.

B

um The normalized queries are split into query skeletons where we got only the shape of the query and no value in there. So, for example, if you had a query that said, you know, first 12 then first would be a variable in here and 12 would go into the query logs.

A

B

A separate variable um and the way we link those query skeletons to to the core logs is uh by using a hash as the key uh black to Hash, so cryptographically strong hash such that in case, an attacker would want to mess with her pricing.

B

um They wouldn't be able to very easily figure out how to mangle their queries. To confuse us right.

B

um So yeah all that gets fed into two tables the core logs table and the query each each log entry refers to a particular skeleton.

B

uh I'm gonna show you later how that looks in the in the tables. Here uh it's extremely detailed, so I'm just going to fly through it, but uh pretty much. These are the steps of normalization and um cleaning up of of the queries to normalize them and separate the variables and the and the skeletons.

B

um Here's your supporting data to prove that we do need to normalize our our queries, and here we have the frequency of the most queries and before normalization, you can see that the uh the the frequencies are dropping much quicker than after normalization, because uh queries get grouped um much more efficiently and much more tightly, so it does make a huge difference to normalize the queries. It's a very important step, and here it's just an example, data that we got in our in our tables, so the logs table we can see.

B

It refers to a subgraph query hash, which refers to a particular query: skeleton timestamp when we receive that query the execution time and the variables to fill into the uh to the skeleton such that we can completely rebuild the original query and, in the course skeletons table we can see so here. Actually, the first hash here corresponds to the first hash here, which is lucky and you can see the the normalized query, which is also very much compressed, all Extra, Spaces and and new lines Etc are removed.

B

So it's just a very compressed, very generic form of the query um so here to give an example. So that's that's basically kind of a compression technique and it really works well because sometimes those query skeletons are very large.

B

um So we can see that here we have a half a million uh we've seen a half a million queries and we have only 64 skeletons. So that gives an idea of how repetitive those queries really are, and here our table of products table is using only 100 megabytes of data. So it's not very much.

B

um So yeah, so after that we got the the main part, the autograph part that uh generates the the models and just gets this data from the postgres DB determines what the most frequent core skeletons are: compute the average execution times for the skeletons then create the Agora source file and it injects it into the indexer agent.

B

um At the same time, otogora determines which subgraph it has to generate uh Gore models for by asking the index or agent. What subgraphs are we actually allocated on right now? So it will not.

B

um It will not generate models for subgraphs that we are indexing off chain or or something else on the subgraphs that we're currently allocated on and thus serving right now um so using the index or CLI. If you you know, while togor is running, if you ask the the index or C like what are the current cost models, that's what it would tell you. So that gives you an idea of what autogora generated, and you can see that you get some comments here.

B

That just gives you a little bit of context data for you to double check. um You know that those models are satisfactory. So it gives you the minimum maximum time of execution. That's been observed the average and the standard deviation actually here. The standard deviation is quite large somehow, and you got your query skeleton and the price that was computed so really the what we call relative cost is that value here right.

B

So here we have a second entry and it's a cheaper, statistically cheaper, query to execute, and we can see here that the relative cost is 44.. So we got almost. We got a good two orders of magnitude, difference of cost between those two queries, um yeah and we've got our default here still because we still need to catch-all right.

B

um And yeah the interesting thing here is that um we got a global cost multiplier, so all of those relative costs are multiplied by global cost multiplier that is determined per subgraph and that is fed as a variable to uh to Agora and so to determine what value should be fed in this we have to compute absolute prices. So before we were caring about the cost of a query, and now we have to determine a price that is a function of that cost and and the current market conditions Etc.

B

So here's a rough um route visualization of how, at least from the indexer's perspective, how the query Market looks like.

B

um As you increase your query price, you will see the the queries, uh the query rates that you receive for that subgraph go down, so it will start at some. It will Plateau at some level and it will go down all the way to zero.

B

If, if you press, uh you know if your prices are too expensive compared to the the current market conditions, um how it looks like in terms of Revenue per second is like this. So, of course, if you set your price to zero, you're gonna get quite a bit of queries, but you multiply Whatever by zero. You get zeros, you get no money.

B

As you increase your price, your Revenue will go up and it will reach a tipping, a maximum Tipping Point after which, because you're pressing too high uh you you get less competitive and you receive less queries. So you sharply your your query rate will sharply Decline and that's your Revenue also so once again right. If your price is extremely high and you get no queries once again right, you multiply by zero, you get zero, um and so the game here is to uh maximize our Revenue, which is the average query price times.

B

The query rate that we're receiving.

B

So for this we decided to do some continuous reinforcement, learning to continuously adjust a.

B

An action space that we've modeled with a with a gaussian, so basically we have just a gaussian, that's over.

B

um That's just determining the uh the probability density for us to to pick a a particular price, and so we're sampling that gaussian and changing our price multipliers constantly, um and with this we determine um a reward function. That reward function is just our Revenue and sorry, and um the reinforcement, learning algorithm will tweak the standard deviation and the mean of that gaussian such that we optimize the revenue I'm going to show you after that, A visual representation of what that means.

B

It might seem a little bit abstract um in any case, the loop of the algorithm is we sample a press multiplier from the policy, so the policies that gaussian distribution over prices we apply that price multiplier to Agora. We wait one minute we measure, so one minute is because we have to wait for the gateway to apply that new price.

B

Then we measure the number of queries per second we receive over a minute and with that we can compute the the revenue rate we got over that minute, and then we can update the policy using the the um the optimizer um uh using the revenue per second as reward, and we just repeat repeat- and that will continuously move that uh gaussian to follow the optimal uh Revenue rate, which is that the peak here right so that big, just to give him more context that Peak will move around depending on how many indexers there are and how the consumers change their um their budgets right.

B

So we just want to move. We want to follow this guy uh whenever it moves, I invite you to watch that Defcon presentation um from Thomas cornuta, a colleague of mine he actually presented yesterday at Defcon, and it's all about that algorithm. So yeah check it out any case. uh Here's a visual representation of what goes on. So what you can see at the top here is the red line. Is that gaussian that we use to sample the prices the x-axis are?

B

Is the prices and the um the y-axis is the query rate in the in that top top quadrant of the video um the uh The Gray Line is the horizontal Gray Line is the um the simulated Network query rate and we add some noise to it, because it's more realistic that way and the um the vertical Gray Line is the budget limit that the consumer has set.

B

So what you can see here is that we start we start over there with our gaussian and it will rapidly move towards the uh the maximum limit right, trying to get as much money as possible.

B

So I'm gonna play it again, as we can see that that gaussian widens to figure out faster where that limit is, and then it converges down such that we can sample. Mainly, you know more tightly around the higher prices right, so you can see the the Red Dot here is, just being it samples quite right, a quite a wide range of prices, and then it's getting Tighter and Tighter and Tighter as it trains foreign.

B

So the reason why we've elected with to to use uh machine learning is that here, as you can see from very visually from that gaussian, is that there is a um there's, an equilibrium to play between um what we call an in reinforcement, learning, uh exploration and exploitation.

B

um If you want to explore more, you would widen that gaussian. You can sample a wider range of prices and you quickly see broadly what happens. You know, depending on how you choose your prices.

B

The problem, though, is that as you're doing that, you're you're learning a lot about the environment, but you are also getting less money because to to see to to to explore that environment, you will go into places where you get less money, um whereas when the gaussian Titans uh you're doing exploitation, which is which means that you have a quite high confidence in in terms of where the optimal point is, and then you just stick there and then then, just you know, get as much money as possible.

B

um I invite you to watch the uh Defcon presentation that I linked just before, because it has more experiments where you can see what happens if suddenly, the price moves and the custom will just basically expand and and then and then find the optimal Point again and what happens when there is multiple competing agents, Etc, so I'm not going to go into those details because it was already presented by my colleague yesterday um yeah.

B

So here are the components that are involved with absolute price Discovery. So there's a bit less components, um because we don't for the for that price multiplier. We don't really care about what the contents of those queries of of the I mean. We don't really care about. The relative prices of queries costs sorry, but we care about basically shifting those those prices to to follow the market price.

B

So the way we're doing it is by looking at the query, metrics that are exposed by the indexer service. That's how we know how many queries were served in a uh in a chunk of time, um and so yes, we do. This to Agora will address that price. Multiplayer variable in the gore models and push it to an extra agent into the indexer services. Db and an indexer service will serve that to the Gateway and it just repeats repeats repeats in a much tighter Loop by the way than the uh the relative cost discovery.

B

uh It will update pretty much every two minutes if I remember well,.

B

So here are the metrics that we extract from Auto Agora, so togor sends some metrics to you to the indexer's Prometheus, and um we can see that um so here that particular sub graph I think is uh is Uma on ethereum, and um so we can see that the number of queries uh per second that we're serving is is somewhat flat. You got to remember that the the market and the number of queries uh that we're receiving varies right, so it's not only a function of our prices, but it's also a function of Demand right.

B

um So that's why there's not a very much visible correlation between the the course per second and the the prices were sampling. In any case, um we can see that the mean of the gaussian fluctuates. So it starts pretty uh it's pretty high and then goes down a little bit to to find that optimal Point. uh We also see the standard deviation going down over time. As confidence uh goes up about the the market environment.

B

We can see that also the consequence in terms of the sampled prices that we're applying over time. You could see that it's extremely wide to start with, as it's exploring and you know poking around, to see what happens and then it converges quite tightly. We can see our reward going up as expected. So that's the uh um the reward in that particular case is the um price multiplier times uh queries per second, and we can also see that the the GRT uh per second that we're making is is going up over time.

B

So it's a pretty pretty good run here, so putting it all together. um Here are all the components. So you see, the flow for the relative casting, which is going from Gateway, receive query, goes through index or service Lexus, Service computes. The execution time goes into the logs into the queue uh that Q is consumed by the auto Agora processors that gets fed into the database.

B

That keeps a historical um trace of of all the queries we've served and and what were their latencies, and that goes into Agora to compute its statistics to generate the relative cost models. It also gets the um number of queries that are served from the the query metrics and generate those models. The October, the Agora model, sorry fits them into the index reagent, which fits them into the index for service DB, and then the indexer service serves those models to the Gateway. So you got the the full cycle here.

B

There's some limitations with the current implementation. It's still work in progress, um but yeah the the modeling, the the relative derivative cost modeling is is quite simplistic because it's just based on execution time.

B

We could improve it with. You know, for example, extracting special variables like first and Skip and inject those um into into the Agora models to have uh to what happened there.

B

um Where was I uh I was saying that yeah, um the the modeling is somewhat uh somewhat basic for now, but it's still much better than having to do things by hand or not doing anything at all, um but that could be improved over time.

B

um Also for now, we're not uh Computing execution time for multi-root queries. So what's a what's a multi-word query: it's it's basically a graphical query. That's asking for that! Has two root nodes or more right.

B

That's a little bit tricky because uh agorak is casting those nodes separately, but in terms of execution time we just get the overall execution time of the whole query with all the you know, including all the notes in there so yeah, that's a work in progress and the the problem with absolute costing is that it is a little bit slow, we're quite limited by the uh gateways polling speed.

B

um So that's why you can see here. For example, in that run, it took um let's say about six between six and 12 hours to get to a satisfactory um convergence on the on the pricing model.

B

Yeah, so that's it! um So please check out the auto Agora repo, and you also got a link to the presentation uh on the right side that will probably help you with getting all those links that I've shared here.

B

A

Amazing, thank you so much Alexi um as we were going through the presentation. There were a few questions, so if you want to maybe scroll up into the chat and address all the questions, that'll be a good idea. Okay,.

B

Let's see going up.

B

um With those Auto grow components be on our machines or in some cloud it would be wherever your index or stack runs pretty much it's tightly coupled to the index reagent uh and the indexer service. So you would have to run it in situ.

E

Oh yes, that was a question for me. I wanted to expand it a little bit uh so I guess this stack is a very computationally. It requires a lot of computation power or.

B

No, not at all, that's the the thing it is very light. It.

E

Turns out so the queries themselves would not uh um be longer for our customers to get. It would be the same.

B

Thing yes, um so, first of all, so yes, there is that flow right that um sorry yeah. So you may think that there could be bad pressure. If, let's say our talk or processor is too slow. It would uh back up the indexer service, standard output and could slow things down, but that's not the case. We have two levels of uh pressure reliefs.

B

Basically, so if so the the rapid mqq is, um if it fills up, it just throws away entries and if um and if there's any kind of back pressure in terms of our filtering software for the logs here same thing, it just like throws away entries. uh So this was optimized for uh not impacting uh the the system.

B

um We prefer throwing away um log entries then slowing down the operations. It's a good question, though yeah in.

E

This whole architecture uh I have a question about postgres database. uh How much storage would it require and um can we delete some query logs that are older and I guess became Irrelevant for a newer version subgrowth. So in your time period.

B

Yeah so um so yeah we haven't done precise statistics, but in the one example I gave for half a million queries. It was using about 100 megabytes of data. But yes, you could prune older entries, no problem. There is no automated process yet to do that, but there should be indeed.

E

um And how fast, in terms of memory uh the model would be would be good enough to produce better results. So, as far as understand from the video it takes some time. So it takes some queries just to learn the model to teach the model.

B

Yeah so the model itself. um It is very compact right because the only the only variables we're training are the standard deviation and the mean right. So it's basically just two numbers, and then you multiply that by the number of subgraphs, but in terms of memory, it's very small. It's a few hundred megabytes for auto Agora, a few hundred megabytes for autograph processor.

B

um Yeah, it's very small, it's much less than the indexer service or even the indexer agent. As a matter of fact, much much smaller we've developed everything in Python, so it's both. You know easier to easier to comprehend and it's much much lighter than the node.js parts of the of the infrastructure.

E

Yeah, thanks for your antonson, for a great presentation, great work. Thank you. I I think it's going to be a nice tool.

B

They're still improvements to be made, so if you want to participate, if you want to check up the code, if you want to you know it's all uh open source, it's Apache, 2.

B

Okay, so next question: do the drops to zero mean that we serve some queries for free? uh No, we never serve queries for free. um It is uh because okay, let's go there and see.

B

Yeah there's some drops to zero here.

B

um It's because we're sampling prices that are too high, and that means that we don't get queries for that amount of time and that's because we're exploring the the space, and so that is that information is, is basically a strong um um disincentive right. It's a negative reward for the for the the optimization algorithm so.

A

B

Gets zero queries and then it learns it and then it doesn't do it anymore.

E

A question about the video, because there.

B

E

Drops to zero uh in the late stages after.

B

That yeah that's interesting, um yeah, you're, right, and so this is. um This could be seen as a prompt. So when it goes to zero, we are not serving queries, so that is actually it's similar to um to the grafana.

B

It's because we're sampling over the price limit that the consumer set see when that point the the round, the the red disk flashes down there, it's right after the right after the uh the the price limit that the consumer set and so effectively here we're not serving queries at all um in this particular simulation, because we have a single agent for the subgraph uh it. It looks as if, basically, the the core is never served right.

B

So the the end user would be not getting data uh from our experiments as long as you have at least two indexers on a subgraph uh that cannot happen.

B

It's a mix of the fact that uh both agents are competing so that limits basically how far up the agent will go, um and also uh because they're sampling randomly there's very little chance that they will both select a price that is over the Consumer Price. So, if you're over, then the other guy takes the query right. So that's the idea uh in general, it's known that quality of service is poor when there is a single indexer on a subgraph. Ideally, you would get three indexers.

B

Does it answer your question yeah.

E

It's answered completely yeah. Thank you very much sure.

B

Okay, so will the recording and presentation be shared after the workshop? Yes, um yeah? Okay? Is there much of a response time penalty when attaching a to Agora to our indexer stack? Has this been measured so, as I said, um zero penalty, because it is um there's multiple buffers right so first there is the standard output buffer from the indexer service, because we're taking everything from the standard output of the the logs right of index of service, um then we got a programming go that is filtering the logs.

B

um It is using internal, a buffers and cues and it also has a pressure relief valve. So if, if the queues fill up then entries are thrown away, then the rabbitmq is also um uh also has a pressure. Relief also will throw away entries, so the penalty should be exactly zero and if it's not zero, then write an issue and we'll try and fix it, but we have three layers of protection, so it should not impact at all.

B

And that's it because there's no more questions. Maybe some people want to ask with sound the questions instead of the chat, there's still time, I think right. There's 10 minutes.

A

Correct yeah we have 10 minutes so uh feel free to ask more questions. If you have any guys.

D

ah Hello, can you hear me yep? Yes, um uh thank you for presentation. First of all, um my question is: uh can we impact the price by hand? We can maybe we're going to make some constant price or something, as you mentioned uh previously, it was possible, but what about uh automating system?

D

Is it possible to set minimal price.

B

D

Like that, thank you.

B

Yeah um in this particular case, uh we haven't yet implemented minimum price, um but that can be done. Yes, um we are not expecting that a minimal price would be needed from our um we've.

B

We've studied the behavior of the the gateways, indexer selection, algorithm and um we are, um it- is made in such a way, which you can see by the way from the um the Agora language presentation from from Zach, but it's made in such a way that, for now at least the market is, is, is made in such a way that there is no race to the bottom.

B

If there is multiple competing indexers, um if it was a a purely um let's call it linear Market, uh you would expect indeed that, as soon as you have more than one agent, they would just like you know, try and get the queries and be cheaper than the other guy and then go down to zero. But that's not the case.

B

um Still, though, um uh a minimum price should be implemented. Indeed, yes,.

B

I guess we're good.

A

uh We have another question from Ovi.

B

Oh, the chat, oh sorry, I wasn't, monitoring the chat. I'm gonna, switch from presentation to chat and I yeah uh feels like will be a disadvantage if we're not running Auto Agora. Yes- uh and that's because from my past experience, if you want to find the market price, if you want to do it by hand, you have to sit there for hours and then it moves all the time.

B

So that means you, you don't even get to sleep like every 30 minutes or one hour like okay, I'm, gonna check it again and and adjust it um as you can see actually from uh too far back from here.

B

um You know once it's converged, you can see the mean is pretty much market price, the optimal price, it's moving all the time. It's moving a little bit up a little bit down all the time, so we do have to do that by hand which would be crazy.

B

Okay, congrats with tripo.

B

Finish page: yes, why it was asked yep.

B

uh Silly question: what if Robert mq goes down if it goes down what's expected, is that the um the auto Agora indexer service that rapper should just throw away the throw away the data? Maybe I should test it just to be sure, but because I implemented this quite a while ago, but it should throw away the data we can try. All this mentioned on ongoing test net or you're encouraged to use it. Actually, uh if you want to, you know, show that you can generate revenue on the test net the better.

B

So yes, please use it and tell me if something is wrong: can we a B test or to Agora.

B

um So by a B test, I would think that you mean um switching between a manual model and the auto Agora and and seeing which one is best. Is that what you uh you mean Anton.

E

Yeah, it can be even implemented into Auto goros, so half of the traffic goes through Auto gorans half of the traffic goes through a normal pricing. The models, if we have already just to take a look yeah.

B

So that makes sense. uh The problem, though, is that uh we don't get to choose which press model is applied because it's the Gateway, so you would have to change your your the price Model that you've declared for the subgraph to the uh to Agora, to the Gateway, sorry and um and then it would just switch all traffic to to that model it.

B

Basically all traffic goes through the model right, that's the problem, so uh only solutions I could see is running two indexers, though it seems that it's in the rules of mips, that's not good uh or manually, turn it off and on, but yeah there's not really a solution right now, because the Gateway is doing the pricing. It's not us.

B

Does that make sense.

E

um Yes, I see the obstacle uh there. That's why well I want it. uh um Maybe he knew the solution, how to make it better than just switching out switching off.

B

Any other questions.

B

B

So getting close to the end, but yeah don't hesitate to just go to the repo and write issues.

B

Yes, Daniel, uh you need some load for sure. uh That's actually.

B

uh One of the trickier problem we had with Otto Agora is how to make it not explode when there is no queries, uh because, basically, if there's no queries, the um the optimizer goes into Extreme Exploration, because wherever it pokes at there's nothing, so it goes wide wide wide and then you know get basically a number overflow and it explodes uh so we we do have a pullback force right now implemented that uh limits, how how far the um the the the the the policy can go, and it basically brings it back to the initial state.

B

If there's no queries but yeah, it makes it kind of static. A steam is running queries already. Can we redirect this at some point into service.

B

uh I, don't quite understand that question.

B

B

Still not sure what the question is. um Is it a question about merging the Yoto Agora into the indexer service or something like that.

C

uh um Maybe uh I think partner of linear working on the same thing. um I can explain uh uh in real voice because that shot that um we were following the instructions of me um uh guidelines to open the port for graph nodes.

C

But now, as we saw in the announcement, it was like for graph note or graph queries uh uh for queries now, so we're now wondering if we opened their exact port for that for this phase. Zero of this Meep test net. So there's still a question and it depends tomorrow so that uh I believe that was didn't was asking about, uh because the first uh there were instructions to open the graph node on Port 8000 uh and uh later we found out that probably a query node was uh has to be open. Yeah.

B

Yeah yeah, so, okay, we're.

C

Not sure if we have four yeah, oh is.

B

um Is the graph note? uh Basically, the graph node is Task uh of two things right: it indexes the data, so it basically loads subgraph data continuously into its database, and then it also serves uh graphql queries. So it's read-only. You know from from that database.

B

uh The way it's recommended to run. That stack is to separate the graph node into two parts, so you basically run graph node twice and one instance of graph node is test of unindexing and we call it index node.

B

But it's still group the graph node software and the query node, which is also the graph node software, is, is turned into just a query serving mode and they both talk to the same database as to opening the query node, that's probably for debugging purposes, because um in the in production you don't want to open this because you're not going to get paid for queries that go directly to the graph node queries have to go through the indexer service for them to for you to be paid right. So.

C

This is just period.

B

Debugging system right yep.

C

uh Yes, that's absolutely clear for us that one endpoint is for serving page queries and another for debugging purposes and accessing queries directly, but uh the question I think we still uh that is about this place: zero, which endpoint we are at the open because it wasn't. uh um It was very clear if we want if we need to open the debugging endpoint or the paid one.

B

um So basically, Auto Agora will only care about the indexer service. If that's also the question, um because that's the payment layer, um whatever goes through graph, node we're ignoring completely, and that has also advantages um the advantages that we can rerun queries if we wanted to and they're not going to affect our data sets um because we're gonna We would bypass the indexer service. So that's the very reason why we're getting that data from the indexer service um but yeah in terms of what should be open or not. I'm, not quite aware.

B

uh You'd you'd have to ask um other people that are managing mips, but.

A

In any case, will.

B

Work uh because it's internal, it's within it's inside your infrastructure, so you don't have to open any port to the outside for a togor to work, because it's going to run in your um in your cluster right.

C

Yeah that's sort of clear thanks, yep.

B

So, under stress with enormous amount of queries, would Agora help to serve them better or on the opposite, make things worse due to additional computations um I mean it depends. What machine you're running on I guess um in deck, the autograph processor could end up using quite a bit of resources, but you would need a huge amount of queries. We've tested, a single autograph processor can go through 10 15 queries per second, no problem, um I, don't remember which limits we said in our kubernetes cluster, but they're.

B

Very small I think it's one CPU, each and and 500 Megs of ram, but even that that amount of ram was Overkill we can. We can lower it um so yeah, um yeah, I, hope that tells us that answers the question but uh in our particular case we're running in Google Cloud. So uh actually you can see in the diagram that here the reason why I put you know those stacked rectangles is that you can Mo. You can run as many autograph processors as you want.

B

So if it's not going fast enough, so first of all, if you don't want to have impact on your system, just run the one. If it can't take all the queries, then some are going to be thrown away, and that way you limit your impact to 500 Megs of RAM and and one CPU right. If you wanna, if you don't wanna, throw away data for the training, then run multiple autograph processors.

B

um If you can, if you can scale if you can afford to scale in our case for our setup, we can afford to scale. It's just gonna ask for more VMS from Google cloud and it's okay. But it's up to you, though, but once again the the impact of all the auto Gore stack is small compared to the query, node and the indexer service, and all that. So it's about maybe an order of magnitude, less right.

B

So, okay, according to Kohinoor Goran, meets phase. Zero are not compatible and that I would guess is because queries are not going through the index or service. Yet.

B

um I, don't know correct: okay, great so yeah there you go now, I understand the nature of the the whole prompt.

B

But eventually the uh the load is going to go through the indexer service right um You Gotta test this, because there is a lot more involved with the indexer service. uh Basically, whenever you have the indexer service on, then you also have the proper. You know the actual Gateway code running on the other side and there's like payment channels and Etc so yeah. Eventually, that's going to be turned on because um there's a lot of components there that have to be tested, Etc at least I- would expect.

B

Maybe Ford would know this.

B

Because, indeed, if, if the index or Services is never ever used, that means that your Revenue will never ever be taken into account in terms of the uh you know, your your performance and auto Gore is useless, so that would be sad.

F

Oh, an extra service will definitely be used. Yeah phase. Zero is really just mostly to get indexers that are not that are new to the systems familiar, so they can only focus on the graph node first and then in the next mission, we'll be adding the other components.

B

Okay, great, that makes like makes a lot of sense, cool super clear. Now, thanks, okay, um I'm gonna, put back the QR codes just in case any more questions.

A

All right: well, if there aren't any more questions, then I'll take this opportunity to wrap up today's conversation. Thank you. Alexi for presenting this was highly informative and I hope it wasn't the same for everyone here too, um as we've established throughout the the presentation. This will be recorded and put onto YouTube um it'll, be there'll, be a link to the mips playlist where it will be posted and also uh there is a link to the presentation, then a chat and that will also be shared after as well.

A

Thank you all, and we look forward to seeing you at the next mips Workshop, which will be happening next week. Tuesday at 5, PM ugc take care. Everyone, bye, bye, bye,.

B