From YouTube: Review of plan to partition events table
A: Okay, cool. So basically the root issue is that the events table is very, very big. It's about one terabyte, I think, so quite big. Let me try and find this one... okay. So in this issue we have all the data about this table. It's used in user activity, project activity, group activity, contributions, and analytics. Those are the few places it's used, and it looks roughly like this.
A: So the main use case is in this calendar here, the activity calendar. If you click on one of the cells, it will show you the activity for that day, and we only show one user's data. You can see later why this is important. Basically it's all date-based, and we only show about one year of data, I think. If you keep scrolling you can see more, maybe, but it's really hard to scroll through one user's data. So that's the events table.
A: There's a Sidekiq background worker that removes old data, data that's three years old, but it's very inefficient. It only removes 10,000 events at a time; it would take about 20 years to remove all the data.
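A cleanup worker like the one described might look like the following batched delete — a sketch only, since the real worker's query isn't shown in the video and the table and column names are assumed:

```sql
-- Hypothetical sketch of the batched cleanup: remove 10,000
-- three-year-old events per run. At billions of rows, a cadence
-- of a few runs per day works out to decades of total runtime,
-- which is the inefficiency being described.
DELETE FROM events
WHERE id IN (
    SELECT id
    FROM events
    WHERE created_at < now() - interval '3 years'
    ORDER BY id
    LIMIT 10000
);
```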
A: No, we just delete it. That was a decision that was made years ago, back in the very early startup days.
A: It's all one year — we only show one year of data here, right. So yeah, that's the idea anyway.
A: So anyway, because this table is quite large, we want to figure out how to reduce its size, because a very big table like this is very hard to write to. It also costs money to store — a few thousand dollars, mind you, so it's not trivial — and it's also very slow to read now as well.
A: I think the crazy thing is that it's one terabyte, but the data itself is only 100 gigabytes; because it's so slow to read, we need like ten times that amount of indexes just to read it efficiently. So one of the great things about — well, how much do you know about partitioning?
B: I know the high level — it's a Postgres feature, right?

A: Yes, yes. You can also implement it at the application level, but I think we use Postgres.
B: So you set up some rule —

A: Yes, and the most common use is something related to a timestamp or a date.
B: Right, and then it will create these tables, but they are transparent to the application.

A: Yes, so it is basically — how do you say — like a view; the main table will still be there.

B: I think it will be renamed and then a new table will be added.

A: Yes.
A: Right, yeah, correct. It's been available since Postgres 10, I think, and they've progressively added more and more features, so by now it's pretty complete, which is very nice. I think when we first introduced it, it didn't support foreign keys and things like that, which made it really hard to use.
A
Oh,
yes,
that's
the
idea,
so
so
the
most
obvious
solution
for
you
for
the
events
table
is
to
partition
it
so
behind
the
sensors,
we'll
have
events
for
so
the
events
partition
for
December
2023
or
the
events
partition
for
January,
February
and
so
on
so
by
by
month
by
year.
It's
not
enough
because
if
we
split
one
terabyte
by
by
five
years,
say
it's
too
too
big.
So
the
the
part
each
partition
is
still
too
big
and
then.
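The monthly scheme being described maps directly onto Postgres declarative partitioning. A minimal sketch, assuming illustrative column names (the real events schema isn't shown):

```sql
-- Parent table partitioned by range on created_at; note the
-- partition key must be part of the primary key.
CREATE TABLE events_partitioned (
    id         bigint      NOT NULL,
    author_id  bigint      NOT NULL,
    action     smallint    NOT NULL,
    created_at timestamptz NOT NULL,
    PRIMARY KEY (id, created_at)
) PARTITION BY RANGE (created_at);

-- One partition per month, e.g. December 2023 and January 2024:
CREATE TABLE events_202312 PARTITION OF events_partitioned
    FOR VALUES FROM ('2023-12-01') TO ('2024-01-01');
CREATE TABLE events_202401 PARTITION OF events_partitioned
    FOR VALUES FROM ('2024-01-01') TO ('2024-02-01');
```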
B: And it's growing — there will be more events.
A: So who knows. And then there is a very helpful guide that the database team has created, so I am mostly following that guide. It has steps one, two, three, four, and there are correspondingly two issues that I've created: one issue for steps one and two, and then the other issue for the rest.
A: Yeah, there's a lot of subtleties here. Oh yes, let's talk about other things — other things we could try besides partitioning. One is to somehow drop some of these indexes, because 80% of the table is indexes, but I think we need all the indexes, so... well.
B: For example — where your cursor is now, one line up, the one with the long name, this one.
A: We'd still have hundreds of gigabytes, so yeah, those are some of the things I've considered. The other thing, which is a lot more drastic: this is a polymorphic table — it actually stores like ten different types of events. Right — it stores push events, it stores project events, like I create a project, or I create a... it stores project membership events, right. So they all have slightly different logic.
A: It doesn't give you that much, because the access pattern is across all these target types, right. We just show all events for a given user or a given project.
A: Yeah — if you wanted to build that query, it would be like: you need this one, you need that one, a UNION, and then an ORDER BY created_at — kind of expensive. I mean, we could do something like that; we'd just need to change the UI/UX to something different. I think this doesn't show — this isn't shown here.
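The expensive cross-type query being gestured at would look roughly like this, assuming the events had been split into hypothetical per-type tables:

```sql
-- Hypothetical: with per-type tables, a combined activity feed
-- needs a UNION of every type plus a global sort — the expensive
-- part is the ORDER BY across the combined result.
SELECT id, author_id, created_at FROM push_events    WHERE author_id = 42
UNION ALL
SELECT id, author_id, created_at FROM project_events WHERE author_id = 42
ORDER BY created_at DESC
LIMIT 20;
```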
A
But
if
you
go
to
say
the
project
activity
close
set
for
no
reason
you,
you
start
to
see
that
we
actually
have
like
different
English
status,
show
by
Stephen
event
types
already,
but
then
there's
this
passkey
or
thing.
B: Another read-performance optimization is to denormalize into tables based on the usage patterns. So you would have project events, maybe, that are accessed using the project ID, and you could have another table like user events or something that is used for the my-activity feed. But that is actually duplicating data, because an event can be related to different...
B: That is something that could be done, but it will basically triple your storage demand. And —
A: ...to use ClickHouse, but ClickHouse is not available for self-managed yet, so...
A: The medium-term solution is to partition it, as far as I can see, but there is still a long-term problem: if your disk keeps growing, I think even partitioning doesn't keep up, right. So at some point you need to start removing data, or archiving data, or something like that.
A: Yeah — and when we need to delete it, or move a partition somewhere else, it's very easy, right. Yeah.
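This is the retention payoff of partitioning: removing a month of old data becomes a cheap metadata operation instead of a years-long batched delete. A sketch with hypothetical partition names:

```sql
-- Detach the expired month (optionally archive it first), then drop it.
ALTER TABLE events_partitioned DETACH PARTITION events_202312;
DROP TABLE events_202312;
```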
A: Okay, so what I've done is create issues for this. Firstly — the way the guide recommends doing it is that we create a copy of the table, and the copy is already partitioned. So we set up the partitioning, and there is Rails code to automatically create the required partitions. Then the second step is to copy the data from the old table to the partitioned events table.
A: That will take 40 days instead — who knows — which is nice.
B: Could we also — since at some point we will delete the old data — only create partitions for newly added data?
A: While the backfill is going on, I think it also creates a trigger so that whenever data is inserted —
A: — it copies it over as well, yeah. So it does that, and then the second thing — this is step three — is to clean up, to just throw away the background migration, and then step four is to basically swap the two tables around. So that's the thing.
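The insert-sync mechanism described — keeping the partitioned copy up to date while the backfill runs — is typically a trigger on the old table. A sketch with hypothetical names:

```sql
-- Mirror new writes on the legacy table into the partitioned copy
-- while the backfill catches up on historical rows.
CREATE FUNCTION sync_events_to_partitioned() RETURNS trigger AS $$
BEGIN
    INSERT INTO events_partitioned (id, author_id, action, created_at)
    VALUES (NEW.id, NEW.author_id, NEW.action, NEW.created_at)
    ON CONFLICT DO NOTHING;  -- the backfill may already have copied it
    RETURN NEW;
END;
$$ LANGUAGE plpgsql;

CREATE TRIGGER events_sync
AFTER INSERT ON events
FOR EACH ROW EXECUTE FUNCTION sync_events_to_partitioned();
```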
A
Okay,
let's
go
back
to
here.
So
the
problem,
the
thing
with
partitions
the
trade
over
partitions
is
that
yes,
it's
good,
but
if
you
want
to
query
data
across
two
partitions
say:
I
want
data
from
January
and
I
want
data
from
February.
It's
very
inefficient
to
do
that.
A
So,
ideally,
every
single
query
that
we
have
has
the
created
ad
thing
inside
the
inside
the
query
or
what
I
call
the
partition
key.
A
So
there's
a
few
ways
of
getting
all
the
queries
that
the
application
generates
for
this,
and
it
mostly
has
the
Creator
add
thing,
but
so,
but
not
everything.
So
one
of
the
one
of
the
issues
that
I've
created
is
to
go
change.
All
the
queries
to
to
having
to
create
it
at
inside.
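The change being described is adding the partition key to each query's predicate so the planner can skip irrelevant partitions. An illustrative before and after, with hypothetical names:

```sql
-- Before: no created_at predicate, so every partition is scanned.
SELECT * FROM events WHERE author_id = 42;

-- After: the created_at bound lets the planner touch only the
-- partitions overlapping the last year.
SELECT * FROM events
WHERE author_id = 42
  AND created_at >= now() - interval '1 year';
```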
B: Yeah, and in the past I've also used an index on a function — you can do that in Postgres — because timestamp indexes are huge: almost all values in the created_at column are unique, you know, there are microseconds.
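The functional index B mentions can be built in Postgres as an expression index on the date; a sketch (the timezone cast keeps the expression immutable, which Postgres requires for indexed expressions):

```sql
-- Day-granularity values repeat heavily, so this index is far
-- smaller than one on the raw microsecond timestamp.
CREATE INDEX events_created_date_idx
    ON events (((created_at AT TIME ZONE 'UTC')::date));

-- Queries must use the same expression to hit the index:
SELECT *
FROM events
WHERE (created_at AT TIME ZONE 'UTC')::date = DATE '2024-01-15';
```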
B: That index is updated when you insert or update a record, and then I could query it — a very small index that would give me everything for a certain date, and then I would narrow it down on something else, right.
B: The index becomes — yes — smaller and faster. Wow.
A: Yes, that is true. I think the good news is that Postgres does this automatically for us, which is really nice. I'm not sure about indexes, though, because I think you need to have the partition key as part of your primary key. So —
A: So your primary key now becomes (id, created_at), so it's already part of the primary key index. But I think what Postgres does is take the created_at and, in the background, do something like an index — but it's not really an index; I can't remember what they call it. It converts that into, like, the hash of the partition that it's looking for.
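The behavior described sounds like partition pruning: the planner compares the query's created_at predicate against each partition's bounds and skips the rest, with no extra index involved. It can be observed with EXPLAIN (hypothetical names again):

```sql
-- With pruning, only the partition whose range overlaps the
-- predicate appears in the plan (here, the January partition).
EXPLAIN
SELECT * FROM events_partitioned
WHERE created_at >= '2024-01-01' AND created_at < '2024-02-01';

SHOW enable_partition_pruning;  -- on by default since Postgres 11
```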
A: So we don't need special indexes to find the right partition, if you get what I mean. We don't need to create an additional index for that — Postgres naturally does that already, which is super nice.
A
The
good
news
about
all
these
queries
is
that
I
think
a
lot
of
queries
already
go
order
by
events.id,
so
it
should
be
quite
easy
to.
Hopefully
it's
quite
easy
to
convert
the
order
by
to
autobuy
e
events.created
at
comma
events.id.
A: That query — oh yeah, that's such a small link here. I should probably update this, because it's not very clear. Okay, there we go.
A
Okay
you're,
following
so
far:
okay.
B
A
A: So there is another table whose foreign key refers to the events table, but that one's only 200 gigabytes, so let's not talk about that.
A
Let's
talk
about
the
complex,
the
the
last
complexity
here,
which
is
that
they
we
have
a
Unix
and
unique
index
on
on
the
events
table,
but
a
unique
index
doesn't
have
created
at
so,
which
means
that
postgres
will
not
allow
this
unique
index
when
we
partition
it.
It's
just
part
of
the
rules,
I
guess
it's
too
expensive
to
have
a
weird
unique
index
like
that.
So
I
think
we
we
can't
drop
it
because
we
will
still
need
it.
A
So
I
think
what
we
need
to
do
instead
is
to
do
this,
which
is
move,
move
some
of
the
data
into
a
normalized
table,
and
then
we
can
keep
querying
that
data.
It's
I
think
this
is
this
system
the
most
complicated
bit,
I
think
the
other
bits
are
slow,
but
yeah,
but
it's
too
manageable.
This
one
I,
don't
know!
B: On this one — is this to guarantee uniqueness?
A
I,
don't
know
what
it
does
exactly
like.
Yeah,
maybe
I
think
it's
because
like
Wiki
Pages
like
when
we
create
a
Wiki
page
it
it
also
checks
into
repo,
and
it
does
it's
slightly
differently.
So
yeah
I'm
not
sure.
A
Okay,
anyway,
any
what.
B
Yeah
and
are
there
any
since
we
will
create
a
new
table
for
the
events?
Yes,
are
there
any
other
opportunities
for?
Because
now
we
start
with
a
new
table,
there's
also
a
chance
to
maybe
optimize
something.
A
Yes,
so
we
can
reorder
the
we
can
reorder
the
column
order,
yeah,
so
I've
linked
I've
put
it
as
a
note
in
one
of
the
the
issues
when
we
create
a
copy.
So
the
current
Table
order
is
not
not
efficient.
It's
it's
waste.
It
weighs
24
bytes
per
row,
so
24
bytes
times,
3
billion.
A: It's a few gigabytes. So if we can save those 24 bytes, we can save 14 gigabytes — so yeah, there we go. We should definitely do that.
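The waste comes from alignment padding: Postgres aligns 8-byte values on 8-byte boundaries, so interleaving narrow and wide columns leaves gaps. A hedged illustration (not the real events schema):

```sql
-- Padded layout: each smallint (2 bytes) is followed by a bigint
-- that must start on an 8-byte boundary, so 6 bytes are wasted
-- after each smallint.
CREATE TABLE events_padded (
    action     smallint,  -- 2 bytes + 6 bytes padding
    author_id  bigint,    -- 8 bytes
    visibility smallint,  -- 2 bytes + 6 bytes padding
    project_id bigint     -- 8 bytes
);

-- Repacked layout: wide columns first, narrow columns together,
-- so the padding largely disappears.
CREATE TABLE events_packed (
    author_id  bigint,
    project_id bigint,
    action     smallint,
    visibility smallint
);
```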
A
Yeah
yeah:
do
we
want
to
do
that.
B: And I would only do that if Rails' polymorphic associations can support changing it to an enum or something like that.
A
Yeah,
you
can
you
can
I
promise
I,
don't
know
how
how
the
backfield
works,
because
you
know
how
to
do
transformation
when
you
do
backfill,
because
the
new
table
is
going
to
have
a
different
data
type
and
then
the
old
table
is
going
to
have
Yeah
a
different
data
type,
so
I
don't
know
how
to
backtrack.
In
that
sense,
yeah.
B
Yeah,
it's
mostly
storage,
I'm
thinking
about
yeah
yeah.