From YouTube: 2021-02-02 delta-rs open development meeting
Description
Tentative agenda
* Neville talks through some of the Arrow merged changes
* Tokio 1.0 planning
* Chris shows off his writer branch
* ???
A: All right, welcome — we are officially live. This is another delta-rs development meeting. The tentative agenda, which I shared in our delta-rs Slack channel, is: Neville talks through some of the recent writer-support changes he's had merged in the arrow crates; then I wanted to discuss a little and get on the same page about what our roadmap, or plan, should be for releasing crates with Tokio 1.0 as a dependency; and after that, I'd love for Chris to share his screen and walk through some of the writer changes he's been working on.
C: Okay, thanks. So there hasn't been that much activity from my side in the last two weeks. The main thing we've done is the update to Tokio 1.0 — that's just an update to dependencies. There were some issues with the IPC integration tests failing, but QP helped out, we found a solution for that, and we merged it. I'm still looking at supporting the 2.6 version of the parquet format.
C: There's a bigger issue, in the sense that the Rust writer only supports the legacy format, but because of backwards compatibility, even if you're writing with version two of the writer, you still sort of end up writing valid stuff — it's just actually version one. So I've been doing a lot of exploratory work on the parquet repository.
C: I've also been looking at parquet-mr, the Java equivalent, and the C++ equivalent, to see how we can pave a way to proper 2.6 support. One of the challenges we face with 2.6 support is that there are, I think, two data types that we don't actually write correctly. We lose precision if you're writing nanosecond-precision timestamps, because the old version that supports it...
C: ...used that 96-bit integer (INT96), which is quite messy to work with and has been officially deprecated for a few years now. To get proper nanosecond-precision timestamp support, I need to do a bit of plumbing around separating the formats between version one and version two. I've got some rough work that I've been doing, but it looks like it's leading me down a long path.
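The precision loss Neville describes can be illustrated with plain integer arithmetic — a minimal sketch, not delta-rs or parquet crate code, just showing what a writer limited to microsecond precision does to a nanosecond timestamp on a round trip:

```rust
// Nanosecond timestamps are typically carried as an i64 count of
// nanoseconds since the epoch. If the write path only supports
// microsecond precision, the sub-microsecond digits are silently
// dropped on a round trip.
fn truncate_to_micros(ns: i64) -> i64 {
    ns / 1_000 // integer division discards the sub-microsecond part
}

fn micros_to_nanos(us: i64) -> i64 {
    us * 1_000
}

fn main() {
    let original_ns: i64 = 1_612_224_000_123_456_789; // some instant, ns precision
    let round_tripped = micros_to_nanos(truncate_to_micros(original_ns));
    // The last three digits (789 ns) are gone.
    assert_eq!(round_tripped, 1_612_224_000_123_456_000);
    assert_eq!(original_ns - round_tripped, 789);
}
```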
C
So
last
week,
early
last
week,
I
was
looking
at
how
I
can
sort
of
separate
the
work
better,
so
that
I
can.
I
can
be
able
to
deliver
it
in
chunks
without
disrupting
anyone.
I'm
still
thinking
about
it,
but
I
think
yeah.
C: Yes, I've done that — I've just done it locally. Yeah, I'm doing it with — what's it called, I keep forgetting — PyArrow, right. I'm doing it with PyArrow and PySpark; I'm just using notebooks. So I write stuff in Rust and then I test it in both of those two. That's where I picked up that there are some compression formats we're not supporting correctly — I think I opened issues for that. But encodings look fine so far; it's just that the version-two write stuff is not...
C: There is some discussion happening on the parquet repository where they want to document core features — by that they're looking at what every implementation should support as a minimum, because there's a lot of exotic stuff, especially when you go beyond 2.6: the 2.7 and 2.8 versions of the format introduced encryption, which we don't need. But they want to standardize, because of what I saw — actually, I opened a pull request on the parquet-format repository suggesting that we adopt the Arrow intervals as part of the version.
C: So I started a discussion there — or rather, I opened a pull request — because I noticed that, even though we have version 2.8 of the format, a lot of the writers still use version one by default. I guess everybody's doing it to be conservative, not to break things. But the downside of that is that the Rust writer only really supports version one, and reading version-two files back in the Rust reader won't give us the correct results.
C: So I've been doing quite a lot of exploratory work; I haven't done anything concrete that I can say I'm opening a pull request on. The other thing I've been working on is relevant to Christian, so it'll be interesting to see what you've done. Around the write support — remember I mentioned there was that pull request somebody had started, where we were trying to be able to write data to a vector of bytes instead of a file?
C: So I've been working on that, trying to abstract it out into a trait, but then I picked up that I'd actually missed something: there was a pull request a few months ago where somebody created an in-memory...
C: Yeah, the writeable cursor, yeah — okay, cool. I was wondering whether we could use that, and what the gaps would still be, because I think for writing to blob stores like S3, we need to be able to split the file into multiple parts, so I'm wondering how we'd be able to do that.
A: Gotcha. So, unfortunately, I've got a meeting scheduled at 9:15, so it'd be great if y'all kept chatting. Let's talk about Tokio 1.0 real quick, and then when I have to jump to my other meeting, I'll just leave this window open — somebody ping me in Slack when it's time for me to close the meeting, because I don't know if the stream will disconnect if I leave. Okay, so Tokio 1.0. My first question on this: I saw the Arrow commit get merged, and QP and I merged the code into delta-rs. From the Arrow release standpoint, Neville or QP, how long do you think it's going to be before there's an actual Arrow crate pushed to crates.io with the Tokio 1.0 dependency?
A: Yep. If Neville does that Tokio 1.0 thing in Arrow, does that mean there would be a major version bump, or are we just waiting for an incremental version?
C: I think we should be able to pin — yeah, we should be able to pin to a version, and then what I can do from my side is, if there's interesting work that has been done, especially on the parquet side, which is relevant for us, I can submit a PR just updating the pinned versions.
A: I don't know if we — aside from maybe Christian — I don't know if we have too many downstream Rust-crate-only users right now.
A: Talk to y'all in a bit.

D: All right, I'll do my usual fumbling to find my screen share... got it. Okay, so I did just — actually, let me post this in the channel real quick. I did just submit a draft PR based on the code that I'm going to share. Where's the chat... there it is.
D: So, draft PR here. Basically, what I'm doing right now for the initial write support — what this draft PR is trying to do, as far as the core code in delta-rs goes — is just to add a very basic transaction API.
D: Most of the interesting code in the PR is actually all tucked into this one big test, and that test includes a lot of the meat — including a delta writer struct.
D: That struct would use this transaction API, and the reason I did it this way is that I feel like we probably have a lot more iterations to do on the actual delta writer struct itself before it becomes part of the delta-rs code base. And I think for my needs — for the primary use case we're trying to solve — I don't even need it to.
D: I will happily keep this delta writer completely outside of the delta-rs code base, but it's nice to use it in the test for validating that the transaction API I've come up with meets my needs. And as a Rust noob, it's really good for me to use this to work out all the ownership things — ownership is something I'm not a master of yet.
D: So there's the test I've called write_exploration, and it includes, like I said, a delta writer struct for writing Arrow record batches, as well as interacting with the transaction API. In the main little smoke test, I initialize a delta table and a delta writer, create a little vector of JSON values, and write out my record batch after initializing a transaction. And then, after writing out my record batch...
D: ...I create an add action for the delta log and commit that. Then we do a little update as well: the update basically changes some values, writes an add and a remove to a different transaction, and then commits that. Everything in this test file is a very naive implementation of writer support, but I think one of the things that's really interesting to look at specifically is what Neville mentioned a moment ago: that in-memory writeable cursor.
D: I actually think that — so this is the implementation in Arrow.
D: It's implemented in a very complete way, so we might not need any further work — but you guys feel free to tell me if I might be wrong on that. My thought is this is exactly what we want, so we'd want to have this on hand. Let me silence Slack — it's killing me.
D: At the moment, the naive delta writer API that I have takes in a delta table metadata object and a record batch, instantiates one of these in-memory writeable cursors, determines what the next data path to write to delta is, writes everything to the cursor first, plucks off the size and some of the metadata out of the record batch as well — such as the partition values — and then writes it to storage and creates an add action.
D: Ultimately, I think what we would want to do here is rearrange this so that we could write multiple record batches — basically hold that cursor and writer as owned properties of the delta writer struct, and on flush is when we would write to storage and create the add action instead. So, basically, I think that's the missing bit that I have, but I feel like the in-memory writeable cursor itself is exactly the struct we need to do that.
D
I
would
just
need
to
figure
out
how
to
rearrange
this
so
that
I
can
own
the
cursor
and
hold
the
writer
across
multiple
calls
from
the
calling
context.
If
that
makes
sense,.
C
Yeah
this
does
and
then
I
think,
when
you
do
that,
one
of
the
benefits
becomes
that
you
you'd
be
able
to.
Potentially
you
know.
After
writing.
Each
batch
check
the
size
of
the
the
castle's
data
exactly
yeah,
if
it's
big
enough
or
I'm
not
sure
how
the
what
the
what
the
limits
or
thresholds
are
with
you
know,
writing
multiple
files.
C
So
if
it's
let's
say,
for
example,
your
limits
is
five
megabytes
and
then
you
find
that
what
you've
written
so
far
is
close
to
that
five
megabytes.
For
example,
then
you'd
probably
close
the
close
close
that
close
the
file.
You
know
write
that
photo
and
then
I'm
writing
to
stories.
And
then,
when
the
next
batch
comes,
you
sort
of
clean
the
yeah
you
sort
of
clean
the
the
data
in
the
in
the
right,
so
in
the
cursor
from
there
yeah.
I
think
that
could
work.
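The buffer-then-flush flow discussed here can be modeled in a few lines. This is a hedged, self-contained sketch — every name below is a hypothetical stand-in, not the actual delta-rs writer; the "storage write and add action" is reduced to a counter:

```rust
// Buffer serialized batches in memory; once the buffer approaches a size
// threshold, close out the current "file" and start a fresh one. In the
// real writer, flush() would write a parquet file to storage and produce
// an add action for the delta log.
struct BufferedWriter {
    buffer: Vec<u8>,
    threshold: usize,
    files_written: usize,
}

impl BufferedWriter {
    fn new(threshold: usize) -> Self {
        BufferedWriter { buffer: Vec::new(), threshold, files_written: 0 }
    }

    /// Append one serialized batch; flush if the buffer crosses the threshold.
    fn write_batch(&mut self, batch: &[u8]) {
        self.buffer.extend_from_slice(batch);
        if self.buffer.len() >= self.threshold {
            self.flush();
        }
    }

    /// Close the current file and reset the buffer for the next one.
    fn flush(&mut self) {
        if self.buffer.is_empty() {
            return;
        }
        self.files_written += 1; // stand-in for storage write + add action
        self.buffer.clear();
    }
}

fn main() {
    let mut w = BufferedWriter::new(10); // tiny threshold for illustration
    w.write_batch(&[0; 4]); // 4 bytes buffered, below threshold
    w.write_batch(&[0; 4]); // 8 bytes, still below
    w.write_batch(&[0; 4]); // 12 bytes, crosses threshold -> flush
    w.flush();              // final flush of an empty buffer is a no-op
    assert_eq!(w.files_written, 1);
}
```

Holding the buffer as an owned field of the writer struct, as Chris describes, is what lets the size check span multiple `write_batch` calls.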
D
Cool
yeah,
that's
that's
what
I'm
thinking
I
you
know.
I
tried
a
little
quick
pass
at
it
and
because
of
my
rust
nudeness,
I
I
just
couldn't
figure
out
all
the
ownership
details,
but
I
feel
like
this
is
something
that
a
I'm
getting
slacked
again.
Okay,
but
I
feel
like
this
is
something
that
a
non-rust
noob
would
be
able
to
figure
out
quickly.
I'm
just
kind
of
gonna
beat
my
head
against
the
wall
again
until
I
figure
it
out,
but
I
feel
like
that
api
should
fully
support
it.
D
So,
having
mentioned
that,
the
other
thing
that
I'll
mention
is
just
the
transaction
api
itself.
So
it's
super
simple,
like
I
said,
like
most
of
the
code,
is
in
this
in
this
draft.
Pr
is
in
this
test
the
api
for
creating
a
transaction.
D: I feel like this separation is good, and what we really want is for the delta writer to basically return the appropriate actions based on whatever record batch was written to it, so we have a decoupling here. Basically, it would be the responsibility of the caller to instantiate a writer and a transaction, write to the writer, get back the actions, and handle anything more complex than a pure append — any sort of update, merge, or so on — which, to be really honest...
D: ...it's only the pure appends that I care about for my purposes. But if it needs to do any adds and removes based on an update, for example, then it would be responsible for collecting all of the actions that need to be committed to the transaction when everything is done.
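The caller-collects-actions flow might look roughly like this. A minimal sketch under stated assumptions — the types are illustrative stand-ins, not the transaction API from the draft PR, and conflict resolution is reduced to a version comparison:

```rust
// The caller writes batches, gathers the actions the writer hands back,
// and commits them all at once. Conflict detection here is just "did the
// table move past the version we started from?" — a placeholder for the
// real snapshot diffing discussed in the meeting.
#[derive(Debug, PartialEq)]
enum Action {
    Add { path: String },
    Remove { path: String },
}

struct Transaction {
    start_version: i64,   // table version when the transaction began
    actions: Vec<Action>, // collected by the caller as it goes
}

impl Transaction {
    fn new(current_version: i64) -> Self {
        Transaction { start_version: current_version, actions: Vec::new() }
    }

    /// The caller hands over whatever actions the writer produced.
    fn add_actions(&mut self, actions: Vec<Action>) {
        self.actions.extend(actions);
    }

    /// Commit everything in one shot; bail out if another writer moved
    /// the table forward since this transaction started.
    fn commit_all(self, latest_version: i64) -> Result<i64, String> {
        if latest_version != self.start_version {
            return Err(format!(
                "conflict: table at v{}, transaction started at v{} ({} actions pending)",
                latest_version, self.start_version, self.actions.len()
            ));
        }
        Ok(self.start_version + 1) // version of the newly written log entry
    }
}

fn main() {
    let mut tx = Transaction::new(0);
    tx.add_actions(vec![Action::Add { path: "part-00000.parquet".to_string() }]);
    assert_eq!(tx.commit_all(0), Ok(1)); // no concurrent writer: commit lands as v1
}
```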
D
The
transaction
should
be
capable
of
checking
the
basically
the
snapshot,
that's
represented
by
the
delta
table.
It
was
initially
instantiated
with
against,
whatever
snapshot
might
exist
after
the
transaction
was
created,
and
so
what
I
have
right
now
does
make
assumptions
that
the
caller
will
do
all
that
management.
So,
basically,
the
caller
would
be
responsible
for
making
sure
that
only
a
single
transaction
context
exists
at
a
time
or
or
rather
let
me
rephrase
that
that
a
single
transaction
instance
refers
to
a
a
delta
table
which
contains
and
represents
the
snapshot.
D
And
then,
if
and
then,
basically
pulling
and
diffing
whatever
a
later
well,
whatever
the
later
snapshot
might
be
against
that.
Ultimately,
we
need
to
have
conflict
resolution
internal
to
the
transaction
which
could
check
the
the
the
current
metadata
against
whatever
metadata
was
present
when
the
transaction
was
started,
and
actually,
when
I
say
metadata
here,
I
actually
mean
more
than
just
metadata
metadata,
as
well
as
new
files
added
to
the
delta
log
that
part's
not
implemented.
D
So
if
there
are
any
other
writers
which
there
will
be
in
my
use
case,
then
we're
going
to
need
to
do
conflict
resolution
prior
to
during
the
commit,
and
that
would
be
handled
inside
of
this
commit
all
so
does
that
make
sense
to
everybody
the
the
way
I
said
it.
C
I
mean
it's
interesting
because
if
you
go
back
to
that
portion,
where
you
create
a
transaction,
and
then
you
you
add,
and
then
after
that,
when
you
commit
the
transaction
at
line
372,
I
don't
know
if
it
would
be
an
anti-pattern
and
I'll
make.
This
comment
on
the
pull
request
itself.
When
I
look
at
it,
but
it
would
be
interesting
if,
where
you
write
record
patch
in
line
366.
C: ...I don't like creating a flow where we require someone to do non-trivial work, because sometimes, you know, if you don't remember to add the right transactions, et cetera, in the right order, it might cause issues.
D
Yeah,
that
makes
total
sense
and,
to
be
honest,
my
kind
of
longer
term
view
is-
is
pretty
similar
to
what
you're
describing.
However,
I
I
kind
of
have
a
split
in
my
own
mind.
On
the
one
hand
I
I
was,
I
was
thinking
it
would
be
nice
to
kind
of
maintain
a
vector
of
actions
on
the
transaction
itself,
but
on
the
other
hand,
I'm
also
thinking
that
beef
yeah
as
delta
rs,
continues
to
evolve.
D: So, for instance, I'm thinking that delta-rs should have a concept of an update command, which would encapsulate the process of creating whatever adds need to be created, as well as whatever removes need to be created. Those update commands would still be in delta-rs, so it wouldn't fall to the ultimate user to manage.
D
It
would
be
encapsulated
in
delta
rs
still,
but
the
kind
of
the
parts
would
be
out
on
the
floor
for
delta
rs
for
those
higher
level
commands
to
use,
as
they
see
fit,
rather
than
kind
of
tying
everything
to
use
a
you
to
to
kind
of
tuck
all
their
actions
into
the
transaction
as
they're
going
along.
D
I
don't
know
I
haven't
thought
through
it
enough
to
to
be
strongly
opinionated
on
which
way
makes
more
sense
to
me,
but
that
was
kind
of
part
part
of
what
was
feeding
my
hesitancy
to
add
that
action
vector
to
the
transaction
right
out
of
the
gate.
If
that
makes
sense,.
C
Yeah
that
does,
I
suppose
I
also
need
to
look
at
the
delta
specification
in
more
detail,
but
I
think
I
looked
at
it
like
two
months
ago,
just
skimming
through
it,
but
now
now's,
probably
the
time
to
sit
down
and
read
it
properly.
D
Yeah
like
so
so,
the
thing
is
like
an
update
is
actually
pretty
complex,
and
this
does
not
this.
This
test
does
not
fully.
You
know
if,
as
you're
going
through
the
pr
you'll
have
to
read
the
comments,
because
this
is
actually
not
a
legitimate
update.
D
A
real
update
would
rewrite
the
initial
add
file,
and
I'm
not
doing
that
right
now,
because
that
requires
like
reloading
the
initial
file
scanning
for
only
the
rows
that
have
a
change
in
them
and
rewriting
a
new
file
that
contains
all
the
unchanged
rows
and
the
changed
rows
together,
removing
the
old
ad
and
then
including
the
the
new
ad
that
includes
the
rewritten
rows.
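The rewrite flow Chris describes can be sketched abstractly — a hedged illustration only, with rows reduced to plain integers, the predicate passed as a closure, and the file naming purely hypothetical:

```rust
// A "real" update: read the old file's rows, write changed and unchanged
// rows together into a new file, and pair a remove (old file) with an
// add (new file) for the delta log.
fn rewrite_for_update(
    old_path: &str,
    rows: &[i64],
    update: impl Fn(i64) -> i64,
) -> (Vec<i64>, (String, String)) {
    // Apply the update to every row; unchanged rows pass through as-is,
    // so the new file holds the complete replacement for the old one.
    let new_rows: Vec<i64> = rows.iter().map(|&r| update(r)).collect();
    let new_path = format!("{}-rewritten", old_path);
    // (rows for the new file, (path to remove, path to add))
    (new_rows, (old_path.to_string(), new_path))
}

fn main() {
    let rows = vec![1, 2, 3];
    // Update: double only the row with value 2.
    let (new_rows, (removed, added)) =
        rewrite_for_update("part-0.parquet", &rows, |r| if r == 2 { 4 } else { r });
    assert_eq!(new_rows, vec![1, 4, 3]); // unchanged rows survive alongside the change
    assert_eq!(removed, "part-0.parquet");
    assert_eq!(added, "part-0.parquet-rewritten");
}
```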
D
So
I'm
shortcutting
that
just
for
the
sake
of
the
test,
but
yeah
it's
when
I,
when
I'm
looking
at
the
reference
implementation,
I
I
really
like
the
layered
approach
they
have
where
they
have.
The
transaction
is,
is
very
decoupled
from
the
from
the
log
itself
and
then
the
commands
are
also
are
decoupled
from
either
of
those
and
they
kind
of
use
them
in
a
compositional
way.
D
So
you
know
you
have
this
concept
of
an
update
command,
that
that
creates
actions
and
writes
data
as
it
needs
to
and
then
and
then
commits
the
transaction
in
one
in
one
shot.
But
I
mean
I
don't
know
I
don't
know
how
closely
we
really
want
to
match
the
reference
implementation,
I'm
not
trying
to
match
the
reference
implementation,
but
there
are
some
good
ideas
in
it.
B: Cool, no, I follow. I have a question: is the PR ready for review? I saw that it's still in draft state — so, sure, should we start reviewing it now?
D
Yeah,
so
I
put
it
in
draft.
I
think
the
only
thing
that
I
absolutely
want
to
do
before
before
I
consider
it
mergeable
is
at
least
just
want
to
rebase,
because
I
got
a
bunch
of
sloppy
commits
in
there
so
I'll
do
that
real,
quick
and
I'll
put
it
in
official
pr
status
there's
and
we
can
go
from
there
yeah.
So
I'll
just
do
the
rebase,
but
that's
not
going
to
change
any
of
the
code,
so
yeah
feel
free
to
start
reviewing.
B
Cool,
I
think,
all
of
those
discussions
we
just
had.
We
can
have
that
in
the
github
yeah.
You
are
perfect
with
the
context
of
the
code.
That
will
be
a
bit
more
helpful.
I
understand.
B
Yeah,
I
think
that
looks
pretty
cool
to
have
media
demo
on
the
right
support.
D
All
right,
I
think,
that's
everything
I
need
to
talk
about
here
since
we'll
just
carry
this
on
in
the
pr
discussion,
any
any
other
things
we
need
to
talk
about
before
we
adjourn.
B
Not
from
that
side
yep
not
from
my
end
anything,
I
guess
yeah,
I
think,
that's
all
then
cool
right,
I
can
think
tyler
and
then
we
can
end
the
stream
awesome.
Okay,
all
right
thanks.
Everyone,
bye
cool
good
doctor
in
two
weeks,
bye.