GitLab Sharding Working Group, 26 May 2020

From YouTube: 2020 05 26 Database Sharding Working Group

A

All right, Database Sharding Working Group. So, follow-up items: Alberto, thanks for the feedback on the capacity planning.

B

I can explain very, very quickly. There are actually a couple of issues there. One of them is going to be led by Jerry and Jose, so they're going to try to come up with some quick conclusions based on all of the information, all of the efforts that we've carried out on the database over the last couple of years, and summarize a good amount of that and put the information together by the end of the week or so. So there we're

B

Going to see some conclusions around the capacity of the database and the potential walls that we might be hitting in the next few months. And then the other, that's point number two, actually. So point number one is the capacity assessment for our main production DB cluster. That will give you a more detailed, fine-grained approach: we're going to be stress testing our database, and we'll have conclusions in a couple of months,

B

I think, but I should complete the other issue, the one that we'll have ready by the end of this week. So it's a combination of short-term and mid-term issues, a couple of issues. Okay.

C

When the new issue comes out, can you link it back in the doc for reference? Yeah.

B

So Jose is working on that, and Jerry, so I'll see if I can add it in a couple of hours or so.

C

Sure thing, thanks, appreciate it. Yes.

A

Thank you. The next section: as far as what's been done since last week, there were a bunch of questions about progress. Capturing table size was one of the questions, as were some of our assumptions around namespace sharding, and it turns out, as we all realized, there's a lot of scattered information. So I updated this on sharding to consolidate a lot of the work that's been done so far and some of the assumptions, and just tried to make it more representative of what we're working on right now.

A

So you can read that. As far as what else has been done, there were some design changes that came out of discussions on the audit log partitioning efforts that we're undertaking right now. Andreas, anything noteworthy to call out on this one?

D

Sorry, which one? I was just on something else. Database design changes? No, we settled on a target design for the audit log, and we ran it by the people owning the feature and we sort of agreed on it, so that's good. Okay.

A

Yanis is working on a script to generate some test audit events data while we're still working with the right folks to figure out what we can use for test data, and whether or not we can continue to use production data for test data. So Yanis is creating a way for us to generate a large volume of test data.

A

There was also, as part of the questions about namespace sharding, a repeated question that came up: are there any use cases out there for sharding and scaling a production database the way that we're exploring? So we started an issue to try and look for some of those use cases out in the wild, and we haven't found anything yet. So if anybody has any useful information on scaling a production database using native Postgres with foreign data wrappers, feel free to add it to that list. That issue is linked here.

A

Andreas, you have the next item.

D

Yeah, so this is remotely related to sharding, in the sense that I think those are the things that we should also be doing, and this is about analyzing our workload and actually understanding that better. So for a while we didn't have reporting on that, and that came back today,

D

I think, and it was very easy to see that there's one query that eats 40% of the total database time on all the replicas in the cluster, which is really a lot if you think about it, and that's been around for at least two weeks. I'm still figuring out when it was introduced. It didn't impact the database health much, at least not that we're aware of, so I like to think that's a good sign as well. And it was easy to fix: it's really just fixing an index or fixing the query.

D

It's a very simple fix, and I would just like to encourage us to do more in that direction: analyze the workload that we have and find those fixes, because this is really low-hanging fruit. And ultimately I think, even if we have a sharding solution, if we keep doing the wrong things in that environment, it's perhaps even going to be worse than today. We have to do these things anyway. I think this is important.

D

There is a great reporting tool that we're using, postgres-checkup, and it looks like it's working again, but the primary database instance is not included in the reporting. I created an issue for this today. It would be great if you can fix that; I think it's just a configuration setting. But it's definitely interesting to look at.

E

Let's take a second. I'd kind of like to emphasize Andreas's point: we had a lot of squeeze on PgBouncer, and we were taking a look and we discovered one background job, a pretty unimportant job, that was consuming about 50% of our PgBouncer connections. Just by fixing that we effectively doubled the capacity, so there's a huge amount of optimization that we can do that'll just give us tons of headroom.

A

Right, so going back to the original goal for this working group, which was database scalability: there's been a singular focus on sharding for some time in this working group, but there are lots of other things like this that we can call out that will help with our database scalability. So thank you, Andreas and Andrew, for bringing those up. Any other questions on that bullet point?

A

All right, on to the next. So it was a week or a week and a half ago that Andreas put together an issue to track where we would break if we implement namespace as the sharding or partitioning key, in the form of timeouts on the SQL database, right? If you do a query that does not use the partitioning or sharding key, it's going to take forever. And there was a really good amount. We started with a table of features that would break if we went with namespace sharding, and they are,

A

The list is started here, and then Andreas and Andrew did a query across our database. Andrew, you want to go over that real quick?

E

Yeah. So the first point is that it's very optimistic. What it's doing is looking for the most difficult tables. There are about 350 tables in the database, and basically I'm excluding most of those and looking at the most difficult ones, and there are about 88 in that group. We could probably cut a few of them out, but what's left is the things that have got no connection to a shard in any way, or rather to a namespace in any way.

E

So that's your users and your todos, obviously, as well, but there's a whole bunch of tables in there. Basically what I did was start off with all the tables and say, well, things like namespaces and projects are obviously shardable by namespace, which is actually a very optimistic assumption to make, but we'll start with that. And then everything that's connected to that through a foreign key, we'll just assume is also shardable by namespace.

E

Again, that's a very optimistic assumption, and if you go through that, you end up with about 88 tables that are not shardable by namespace, because they have no relationship to a namespace, and so that would be where you need to start that discussion.
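Editor's note: the classification described above can be sketched as a graph traversal over foreign keys. This is a minimal sketch under stated assumptions, not the query that was actually run: the toy schema, the function name `classify_tables`, and the choice to follow foreign keys only from referencing table to referenced table are all hypothetical.

```python
from collections import deque


def classify_tables(tables, foreign_keys, roots):
    """Split tables into (optimistically shardable, not shardable).

    foreign_keys is a list of (child, parent) pairs, meaning that
    `child` has a foreign key referencing `parent`. A table counts as
    shardable if it is a root or references a shardable table, directly
    or through a chain of foreign keys (an assumption of this sketch).
    """
    children_of = {table: set() for table in tables}
    for child, parent in foreign_keys:
        children_of.setdefault(parent, set()).add(child)
        children_of.setdefault(child, set())

    shardable = set(roots)
    queue = deque(roots)
    while queue:
        current = queue.popleft()
        for child in children_of.get(current, ()):
            if child not in shardable:
                shardable.add(child)
                queue.append(child)
    return shardable, set(tables) - shardable


# Hypothetical toy schema, for illustration only.
tables = ["namespaces", "projects", "issues", "users", "todos", "emojis"]
foreign_keys = [
    ("projects", "namespaces"),
    ("issues", "projects"),
    ("todos", "users"),
]
shardable, not_shardable = classify_tables(
    tables, foreign_keys, roots=["namespaces", "projects"]
)
print(sorted(shardable))
print(sorted(not_shardable))
```

In this toy schema the tables left over are the ones with no foreign-key path to a namespace table, which mirrors the group of roughly 88 tables mentioned in the discussion.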

E

And then, if you look across those relationships, there are about 80 or 90 relationships. Sorry, I've actually just put it in this other document. I don't know, should I just share my screen? Sure. Sorry, I was just going to, yeah, let me share my screen. Yeah, sorry, I wasn't prepared.

E

So you can look through it: there are basically all the database tables, all the columns and all the foreign keys, and then I just do this awful join across everything to figure out which tables are namespace tables and which ones are linked through foreign keys, and then I end up with this filtering on it at the moment.

E

So that's all the tables, and a lot of them are one relationship away from a namespace table. That's not to say that those won't be difficult to do, but the most difficult ones will possibly be these ones.

E

These are the ones that, you know, and the ones that we always use are todos and users, so that's something to think about. And then more recently, in the last few minutes, I started looking at relationships between something that's namespace-scoped, so that would fit easily in a shard, and a user, and I think I've got about 55 relationships from things that would be

E

Namespace-scoped to users, which are not. So that's effectively many-to-many relationships that we would need to denormalize in some way with the namespace sharding solution, and each one of those would need consideration; they wouldn't be trivial. But that's also not to say that inside a namespace scope you wouldn't have other problems. For example, when a project references another project, say in a fork network, there's no guarantee that those will be on the same shard, and so that would be something that we need to consider as well.

E

So it's just kind of a brute-force attack on our database.

F

For my own clarity.

F

So for the tables that are not namespace-shardable, we have to duplicate or replicate the data to all shards. But for the other problem, where we can shard based on namespace but the data may reside on different shards, then the solution will be, I mean, put simply,

F

The simple way is we have to do a distributed join across the shards to pull all the data from all shards together, so we can get an accurate view of the timing accuracy, not from the current.

E

Yeah, I mean, I am not an expert on Postgres, I don't know the details of that, but that's what I want: I want to see a proper proposal of how we do that and how we deal with transactions across the shards.

E

And all the other questions, like how we actually construct those foreign data wrappers, and all of that work. And that's not to say that it's, yeah, I'm just personally finding it difficult to visualize it at the moment. But, you know, I'm not.

G

It goes along these lines, right. So what I would do is actually correlate this information, which I believe is very useful, with table size, irrespective of whether it's number of rows or size, right, and then shard those tables that are easy to shard because they're namespace-based or namespace-related, and then consider the rest of them as potentially replicated tables, similarly to what Citus is doing, and connect those replicated tables. Now, to do those joins across shards,

G

Then that's the point where you need to go through the foreign data wrappers. That's where you need to go to a coordinator node, which will hold partitions as foreign tables pointing to the shards, and if you perform the join at this coordinator level, you can express any query, as complex as you may want, at the coordinator level. The main problem is that it will be very slow, especially because, given the current Postgres implementation of foreign data wrappers, each node is processed serially, not in parallel. That may change in Postgres 13, but that's still true today, as of Postgres 12.

G

So the goal of this approach would be to try to minimize those kinds of queries. You still have the option to run those cross-shard joins, but the less you use that functionality the better, because otherwise those queries will take a hit.
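Editor's note: the cost difference between a query that carries the sharding key and one that does not can be illustrated with a toy in-memory model. This is only a sketch under stated assumptions: the `Coordinator` class, the modulo placement rule, and the row layout are hypothetical, and nothing here uses Postgres or its foreign data wrappers.

```python
class Coordinator:
    """Toy coordinator that holds one in-memory list of rows per shard."""

    def __init__(self, shard_count):
        self.shards = [[] for _ in range(shard_count)]
        self.shards_scanned = 0  # counts shard visits across all queries

    def _shard_index(self, namespace_id):
        # Assumption: rows are placed by namespace_id modulo shard count.
        return namespace_id % len(self.shards)

    def insert(self, row):
        self.shards[self._shard_index(row["namespace_id"])].append(row)

    def query(self, predicate, namespace_id=None):
        """Return matching rows; without a namespace_id, visit every shard."""
        if namespace_id is None:
            targets = self.shards
        else:
            targets = [self.shards[self._shard_index(namespace_id)]]
        results = []
        # Shards are visited one after another, modelling the serial
        # processing of foreign tables described in the discussion.
        for shard in targets:
            self.shards_scanned += 1
            results.extend(row for row in shard if predicate(row))
        return results


coordinator = Coordinator(shard_count=8)
for namespace_id in range(16):
    coordinator.insert({"namespace_id": namespace_id, "title": f"issue {namespace_id}"})

with_key = coordinator.query(lambda row: True, namespace_id=3)
scans_with_key = coordinator.shards_scanned

without_key = coordinator.query(lambda row: row["namespace_id"] == 3)
scans_without_key = coordinator.shards_scanned - scans_with_key
```

Both queries return the same row for namespace 3, but the one without the key has to visit all eight shards in turn, which is the kind of query the approach tries to minimize.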

E

Thanks, it's very clever.

D

And it's perhaps even safe to assume that as soon as you have queries that go across all the shards, if we really talk about 100x scale, and I think this is why we're doing this in the first place, it's safe to assume that they're not only going to be slow, but those features are going to break, because you can't be scanning all the shards, and we do have a lot of features that actually do this.

D

I think this is because GitLab is so cohesive: it allows you to interact with many things across the instance, across different namespaces and all that.

C

Oh, very sorry to interrupt. I don't think scanning all shards is necessarily that bad a thing, depending on how expensive it is to scan each shard.

D

That's true for sure, but we're doing this in the first place because we expect a lot of data growth, and if we assume that we have shards that are the size of GitLab.com today, and we have a couple of those, you can't just assume that those features are going to continue to work without any change.

C

Yeah, absolutely agree.

G

With the foreign data wrappers, shards are scanned in series, so you cannot get good performance. This may change.

G

Now, I presume that the initial version will contain a small number, right, maybe let's say eight shards. Sure, maybe inside they will be partitioned differently, but that doesn't matter. You may have virtual partitions in a bigger number, but still you count roughly at the shard level. So if you have eight shards, you need to scan them in series, but it will be only eight, or the coordinator may filter in the query, so maybe you need to touch two of the eight, yet it's still a penalty.

G

That's going to be solved by the time you need a hundred physical shards. Probably Postgres will have implemented this kind of parallelism, so it won't be an issue by that time.

A

Yeah, but that could conflict with our adoption model, right? If you're saying 13 could be where they implement parallel scanning, we're not going to adopt that until, I think, GitLab 15 is the timeline, and that's two years from now. So go ahead, Andreas.

D

I mean, I really like that we explored this idea of foreign data wrappers with Postgres. But Álvaro, do you know anybody who has successfully turned to that in production at that scale? Because I don't think we have any examples of anybody, actually.

G

Citus itself uses the same model, except that, instead of using foreign data wrappers, it uses a coordinator node. In MySQL land it's the same story: it's a different architecture, but I think all these designs share the same idea. Now, other production deployments based on this, I.

D

I mean, yeah, for sharding, I think this is very similar across all the sharding technologies, but I think the point here is that in our case there is no readily available sharding distribution for Postgres that we can just use, right? So it's not only about changing our product to align with that idea, but it's also about implementing that sort of distributed database technology, which seems like a large effort that we would pursue if we go down that route, right? Yeah.

G

Absolutely, absolutely, and that's why we're entering some uncharted territory, right? The sizes that I mentioned require some new solutions. Not all of them are readily available as software that exists. The same happens in other areas. But in any case, first of all, yes, we might need to start developing software instead of consuming it, and I'm talking about database software.

G

In my opinion, that's my take. And second, these efforts at understanding how to shard, even though this solution may end up coming later, are something GitLab could gain from anyway, because this is going to happen in time, either earlier or later, but it will happen, and these patterns, which are ineffective anyway, need to be fixed at some point. So I think they go in the same direction.

B

In any case, uncharted territory doesn't sound great for a solution that has to be super proven and stable, something that has to last and never fail, right? So yeah, we don't want to test anything that's going to be brittle or similar, right?

D

And we also have a value of boring solutions, right, of picking a solution that we find boring; that's very important for us. Perhaps we can talk a bit about the multi-tenancy solution that we're preparing a proposal for. Would that make sense?

A

Yep, that's a nice segue. Okay.

A

So, to kind of preface it: a lot of this work, and I always refer to it as issue 50, so the research that Andreas and Andrew have done into what's easily shardable and what's not, has led to a lot of conversations about how we would need to change the application, which is now listed under follow-up items. We need to discuss with product if we are pursuing namespace sharding as the solution.

A

There are a lot of application changes that would come into play that we would need to make, and we need to have those product discussions, and Sid's going to be involved starting next week, so we can get the ultimate product discussion from him. But while we were discovering those changes and the lack of shardability on namespaces, Andrew put together some thoughts about sharding using a multi-tenancy approach on GitLab.com. Andrew, you want to give us the elevator pitch on that one?

A

It's the last bullet here on what's happening next. Oh, you want to open that up, Craig? Sure.

E

Yeah, I mean, really what this is: we've been given the constraints of how we want to do this with a single database, and I think that's pretty clear, and then within that we've been trying to navigate how we can take this and shard it in a way that we can unroll everything, and it's complicated, right? I think we all agree on that much. And so I was talking to Andreas about it, and I was trying to think, well,

E

If only there was something above a user and a namespace that we could use as a sharding key, and of course there isn't. But one potential option is that we basically create effectively virtual private instances of GitLab, right? So they're all running on the same application, but Acme Corp could come in and sign up and they could have acme.gitlab.com, and effectively

E

What that is, is a tenant. And for the key tables in the database, so users, namespaces, projects and all the big ones, we add a sharding key, and GitLab.com will effectively become tenant one, and on every self-managed instance, they will become the default tenant, and on the self-managed instances there will never be any other tenants; it will just be the one tenant.

E

So it's got quite a good, simple story for self-managed versus GitLab.com, and then on GitLab.com we can start expanding it out, so customers can create their own virtual tenants and they could effectively be the admin of that tenant. That has some advantages; there are obviously a lot of disadvantages, and we should discuss and think very aggressively about the strengths and the weaknesses of this approach.

E

But one of the things that I think is really interesting is that the Access group is working on another stream of work called GitLab spaces, and this could be quite a good proposal for that as well. So it really provides very solid isolation. When you're in your instance and you @-mention somebody, you only see people in your instance.

E

It's quite tightly bound to that instance, and it's quite difficult to break out. So it was just an idea that came about from trying to battle with the complexities of what we're doing, and I think the advantage of it is that conceptually, or sorry, technically, it's quite easy to think through. You sort of know where you need to have the keys; it becomes very easy to shard.

E

Obviously the downside of it is that we have one very big shard to start with, and that would be the GitLab.com shard, which would have everything on it.
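Editor's note: the tenant idea described above amounts to routing by a tenant key, with one default tenant that starts out holding everything. A minimal sketch follows; the `TenantRouter` class, the constant `DEFAULT_TENANT_ID`, and the round-robin placement of new tenants are assumptions made for illustration and are not part of the proposal as discussed.

```python
DEFAULT_TENANT_ID = 1  # assumption: GitLab.com, or the only tenant on self-managed


class TenantRouter:
    """Maps tenant ids to shard indexes; the default tenant lives on shard 0."""

    def __init__(self, shard_count):
        if shard_count < 1:
            raise ValueError("need at least one shard")
        self.shard_count = shard_count
        self.assignments = {DEFAULT_TENANT_ID: 0}
        self._registered = 0  # number of non-default tenants placed so far

    def register(self, tenant_id):
        """Place a new tenant and return its shard (idempotent)."""
        if tenant_id in self.assignments:
            return self.assignments[tenant_id]
        if self.shard_count == 1:
            shard = 0  # single-database case: everything stays together
        else:
            # Round-robin over shards 1..shard_count-1, keeping shard 0
            # for the large default tenant (a choice made for this sketch).
            shard = 1 + self._registered % (self.shard_count - 1)
        self._registered += 1
        self.assignments[tenant_id] = shard
        return shard

    def shard_for(self, tenant_id):
        return self.assignments[tenant_id]


hosted = TenantRouter(shard_count=3)
placements = [hosted.register(tenant_id) for tenant_id in (2, 3, 4)]

self_managed = TenantRouter(shard_count=1)
```

Because every row in the key tables would carry the tenant key, a lookup like `hosted.shard_for(tenant_id)` is all the routing a request needs, which is the sense in which the approach is technically easy to think through. The sketch also shows the stated downside: the default tenant keeps one very large shard to itself.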

E

But one of the things that I found quite interesting was, I've been doing a bit of analysis of who uses GitLab.com, particularly from a database point of view, and I took the top 1,000 users on the site, sorry, the top 1,000 users by the amount of database time that they spend on GitLab, so specifically on databases, and those top 1,000 users account for about 50% of our capacity.

E

So that's about half, and then beyond that there's obviously a long tail. But browsing through that list, there's a pie chart, and if you open the source, you can see who the customers are. Basically they're all private namespaces. A lot of them are free as well, which I think is something also: we need to convert a lot of those users, because you'll see some big-name brands in there.
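Editor's note: the "top 1,000 users account for about half" figure is a share-of-total calculation. The sketch below shows the shape of that calculation on made-up numbers; the function name `top_n_share` and the sample data are hypothetical and do not reproduce the analysis that was presented.

```python
def top_n_share(db_time_by_user, n):
    """Fraction of total database time consumed by the n heaviest users."""
    total = sum(db_time_by_user.values())
    if total == 0:
        return 0.0
    heaviest = sorted(db_time_by_user.values(), reverse=True)[:n]
    return sum(heaviest) / total


# Made-up database seconds per user or namespace, for illustration only.
sample = {"acme": 50, "globex": 30, "initech": 15, "hobbyist": 5}
share_top_1 = top_n_share(sample, 1)
share_top_2 = top_n_share(sample, 2)
print(f"top 1: {share_top_1:.0%}, top 2: {share_top_2:.0%}")
```

Running the same calculation for increasing values of `n` gives the cumulative curve whose flat end is the long tail mentioned above.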

E

So if you click on the pie chart, Craig, and you just click open source, you can see who they are. But anyway, the point is that a lot of the users using GitLab are these private users, and they would probably fit quite well into a tenant model. They don't really care about what other users are doing; they're using GitLab for their own sort of world view; they're not sharing merge requests across with other organisations that much. So it might work quite well for them, but obviously it's a very different direction and it's not really the remit of this working group.

E

So there's a little bit, but technically it would be quite straightforward, and something we should discuss, I think. That's my elevator pitch.

A

Thank you. Yeah, we're running up on time here, but I called it out as a follow-up item: getting product management really involved in the discussions about what changes would need to take place for the namespace approach versus what changes would need to take place for this tenant approach, so we can start talking about what the initial iterations would be and get their feedback on what's more tenable for them. I think that starts next week with Sid, and Josh is off this week too, which is why he's not here.

C

In prep for that, those 88 tables: does it make sense to try to figure out how those map, as far as either teams or product features, or some mapping associated with them? I was trying to get some intuition about them when I was looking through, in regards to, you know, if emojis isn't going to give me all the data based on what shard I'm on, I think I can deal with that as a customer, just hypothetically speaking. But there may be others that we have to be explicit about, and that's what I'm trying to figure out. It's like,

C

What's the communication to the customer in those situations? Or is it that we tell customers, well, you've got to now start using namespaces, and namespaces are by default, as an example. At least that's kind of where my current head is, but I don't know if that's like, or if anybody else has thoughts on the tenant idea.

C

That's a quietness. I guess I'm asking either very deep or very ill-formed questions.

E

I mean, the one point I wanted to make was that this tenant proposal is kind of very left-field, and so I don't want to have it as a competitor to the other idea, and I think we should all,

E

You know, sort of agree on one solution in this working group. But because that idea is so left-field, it'll be good, before you spend too much time on it and come up with this grand plan and then have it shut down by product and by Sid, I think it'd be really good just to test the water and get an idea of whether it's worth doing that, or just scrap it and focus on this. I'm not answering your question at all, but.

C

No, I think you're getting to what I'm after, which is, I didn't want to say this, because this is another kind of right field to your left field, which is: why can't we take the tenant and push it into the namespace, right? Like,

C

Why can't we just say, by definition, if you're a tenant, you get a namespace, and therefore everything kind of falls in sequence from there, right? I know that probably sounds crazy, but that's how you get isolation, right? You tell customers, look, this is the way to get isolation, and you're being very upfront about it, and then that does mean that they can't get to some information.

C

But that's the trade-off that they're making in those situations, right, and we're also helping them make it. Sure. So I don't think it's actually a bad way to think about it. I'd actually be curious to see what product's reaction is, to be honest. But I agree, we need to work as a team, not having competing proposals, but just making sure we've vetted those proposals so that we understand the problem.

A

Thanks for bringing this proposal forward, Andrew. It was supported by data and by the struggles that we've had with namespace so far. That being said, we are out of time. If there are any other comments or questions, feel free to add them to that doc or this doc, and we will follow up. Everybody, happy Tuesday.