From YouTube: Delta Hack: Live demo of kafka-delta-ingest
Description
One of the lead developers on kafka-delta-ingest, Christian Williams, will share the progress and demonstrate the tool. Kafka Delta Ingest helps bring Kafka-based data into Delta Lake as quickly and efficiently as possible.
A
Welcome! Let's see if we get our YouTube stream going... looks like it's going. Welcome to another live demo, or live stream, for the Delta Hack hackathon that we're organizing. It runs for just a couple of days; it'll end at the end of the day tomorrow. So if you've got projects, definitely submit those. The site is on Devpost; just search for "deltahack2021" and you'll find it, and you can sign up or submit whatever projects you have there.

But today I've got Christian Williams, who is one of the primary developers on a daemon in the Delta project called kafka-delta-ingest. The goal of kafka-delta-ingest is to bring data out of Kafka as quickly and efficiently as possible and put it into Delta tables, which can be very useful for data streaming, Spark streaming, or any sort of high-throughput streaming application you might want to build on top of Delta Lake. Now, kafka-delta-ingest is still early. It's not quite production-ready, but I'm looking forward to learning just how ready it actually is from Christian.

If you've got questions, feel free to throw those in the chat sidebar on YouTube, or, if you join the Delta community Slack workspace (which you can find at the bottom of delta.io), you can join us in the kafka-delta-ingest channel on our Delta users Slack workspace. But without further ado, Christian:
Why don't you go ahead and show us something cool?
B
Sure. So, like Tyler mentioned, kafka-delta-ingest is a daemon written in Rust for streaming data from Kafka topics to Delta Lake tables. We've done a demo on this previously, so for today's demo I'm going to pick up on one of the key features.
B
Speaking of that production-readiness thing: one of the key features we did not have in our prior demo, that we do now have in kafka-delta-ingest, is support for multiple worker processes running for the same topic-to-table stream, which is of course really useful for handling high-volume topics.
B
So I've already got stuff up on my screen; let me explain what we're seeing here. Prior to running anything, I set up a Kafka topic with three partitions, and the name of that topic happens to be "web requests". That's the same example data set that we used in the previous demo, and it's also the same data set that we have in our open source repo that you can run locally. I've launched two separate kafka-delta-ingest workers, and these are each configured...
A
...and we're seeing those at the top, and then the second... we've got four panes here, right?
B
Yeah, four panes. The top two are my two separate kafka-delta-ingest workers, and they're configured to write a new Delta file every minute. In between that time, they're just buffering messages: they're consuming from Kafka, doing some tiny transformations on those messages, and sticking them in a buffer. So that's one worker in the top pane and one worker in the second pane.
B
So we have 238 files right now. Since we've got two workers and three partitions, Kafka has assigned two of these partitions to one worker and one partition to the other worker, and our output directories...
B
Yeah, so one of the things I wanted to zoom in on, to make the point that we're leveraging Kafka partitions to split the traffic across our workers: each of these data files will have a mutually exclusive set of partitions. For the worker that is handling partitions zero and one, all the records are going to have the partition value in our message equal to either zero or one, because those are the partitions it's handling.
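As an aside on how that two-and-one split can arise: Kafka's assignor divides a topic's partitions among the consumers in a group. Here is a toy, standard-library-only Rust sketch of a range-style split; this is a simplification for a single topic, not Kafka's actual assignor code:

```rust
// Toy range-style assignment of one topic's partitions across workers:
// each worker gets a contiguous chunk, and the first
// (partitions % workers) workers get one extra partition.
fn assign(partitions: i32, workers: usize) -> Vec<Vec<i32>> {
    let base = partitions as usize / workers;
    let extra = partitions as usize % workers;
    let mut out = Vec::with_capacity(workers);
    let mut next = 0i32;
    for w in 0..workers {
        let count = (base + if w < extra { 1 } else { 0 }) as i32;
        out.push((next..next + count).collect());
        next += count;
    }
    out
}

fn main() {
    // Three partitions, two workers: one worker ends up with [0, 1]
    // and the other with [2], matching the split seen in the demo.
    println!("{:?}", assign(3, 2));
}
```

The real assignment strategy is configurable on the consumer, which is why the split can come out differently after a rebalance.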
B
These are the Delta log files. They don't necessarily correspond to the exact same parquet files I was showing, but these are Delta log files, and one difference here is that the worker handling partitions zero and one is going to include this action we've called "txn", the t-x-n action. It's going to include two of those, one for each partition that it's handling, and each records the Kafka offset, so that if another worker gets this later, it can look at the version number and then seek to the appropriate Kafka offset. So that is the part for the worker handling partitions zero and one.
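To make the txn idea concrete: in the Delta protocol, a `txn` action carries an application id and a version number, and here the version effectively records the last committed Kafka offset for one partition. A hedged sketch of building such log lines in Rust follows; the app-id naming scheme (`<app>-<topic>-<partition>`) is an illustrative assumption, not necessarily the exact keys kafka-delta-ingest writes:

```rust
// Sketch: building the per-partition `txn` actions a worker would append
// to a Delta log entry. The field names (`appId`, `version`) follow the
// Delta protocol's txn action; the app-id scheme shown here is an
// illustrative assumption.
fn txn_action(app_id: &str, topic: &str, partition: i32, last_offset: i64) -> String {
    format!(
        r#"{{"txn":{{"appId":"{}-{}-{}","version":{}}}}}"#,
        app_id, topic, partition, last_offset
    )
}

fn main() {
    // A worker handling partitions 0 and 1 writes one txn action per partition.
    for (partition, offset) in [(0, 1042_i64), (1, 987)] {
        println!("{}", txn_action("kafka_delta_ingest", "web_requests", partition, offset));
    }
}
```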
B
So the next thing we can do, now that we're playing with two workers: anybody who's familiar with Kafka probably knows about the concept of a rebalance and the concept of a partition assignment.
B
So
if
you
have
the
simplest
possible
scenario
and
you've
got
a
single
topic
and
a
single
worker,
I'm
sorry
a
single
topic
partition
and
a
single
worker
there's
only
one
place
for
that
partition
to
go
to
to
the
one
worker
and
there's
only
one
partition
that
needs
to
be
distributed
to
anywhere,
and
so
obviously,
worker
one
is
going
to
handle
that
partition.
B
Now, the number of consumers for a topic can change at any time, and when that happens, something called a rebalance happens. And what does Kafka do when there's a rebalance?
B
Our partitions were revoked, and then immediately afterwards we received a new partition assignment, and our new partition assignment includes all three partitions, because we only have two workers, right?
B
Right. And so once it got its new assignment, it stopped what it was doing, went and checked out the latest state of the Delta log, and found: okay, the Delta log says the last data we committed to the Delta table corresponded with these three partitions, and it had a record written for each. This is what comes out of those txn actions.
B
It had the offset recorded for each of those partitions, so it extracted those offsets and then did a seek with the Kafka consumer to pick back up where it left off. So basically, when we lose workers, we always start from the last recorded offset. We know that we're not duplicating data and we're always picking up where we left off. Eventually, once it hits that one-minute threshold, it's going to do another write. Let's see... I haven't been watching the bottom panes.
B
...of the whole run loop. Just to situate us and be able to answer that question, I'll actually, if you don't mind, walk through it from the beginning. Is that okay?
A
Go for it.
B
All right, cool.
B
So after it reads the Delta log, it gets the offsets out of those txn actions, and once it discovers exactly what offsets it needs to start from for each of its partitions, it seeks the Kafka consumer. There's an API for doing that; it's like consumer.seek, and you give it the offset for each partition.
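That recover-then-seek step can be sketched with just the standard library. The per-partition app-id keys (`<app>-<partition>`) and the resume-at-last-offset-plus-one rule are assumptions for illustration; a real worker would then hand these offsets to the Kafka client's seek API rather than return a plain map:

```rust
use std::collections::HashMap;

// Sketch: compute where each assigned partition should resume, given the
// txn versions recovered from the Delta log. The app-id key scheme is an
// illustrative assumption, not necessarily kafka-delta-ingest's exact keys.
fn resume_offsets(
    txn_versions: &HashMap<String, i64>, // appId -> version (last committed offset)
    app_id: &str,
    assigned_partitions: &[i32],
) -> HashMap<i32, i64> {
    assigned_partitions
        .iter()
        .map(|&p| {
            let key = format!("{}-{}", app_id, p);
            // Resume one past the last committed offset, or from 0 if this
            // partition has never been committed.
            let next = txn_versions.get(&key).map(|v| v + 1).unwrap_or(0);
            (p, next)
        })
        .collect()
}

fn main() {
    let mut log = HashMap::new();
    log.insert("kdi-0".to_string(), 1041_i64);
    log.insert("kdi-1".to_string(), 986);

    // After the rebalance this worker owns all three partitions; partition 2
    // has never been committed, so it starts from offset 0.
    let offsets = resume_offsets(&log, "kdi", &[0, 1, 2]);
    println!("{:?}", offsets.get(&2)); // prints: Some(0)
}
```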
B
Then it starts the actual run loop, and it checks to see: has a rebalance happened? When we're first starting up, it's unlikely that it has, so it's going to go down this "no" path, and it's going to consume and buffer the message it just received, and it's going to buffer each message that comes in... actually, I've got these things out of place: the rebalance check happens after consume-and-buffer. Apologies.
B
So it's consuming and buffering, and each time it consumes and buffers, it checks to see if a rebalance has happened yet, and then afterwards it checks: should I write yet? Have I buffered for long enough? Have I buffered either enough messages, or buffered for the max time? If so, it writes out a parquet file and writes out a Delta log file.
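The loop described above, consume and buffer, check for a rebalance, then check whether to flush, can be sketched in plain Rust. The `Worker` type and its thresholds here are stand-ins, not kafka-delta-ingest's real types:

```rust
// Sketch of the run loop: consume & buffer each message, handle a pending
// rebalance, then flush when the message-count or max-latency threshold
// is reached (writing a parquet file plus a Delta log entry, stubbed out).
use std::time::{Duration, Instant};

struct Worker {
    buffer: Vec<String>,
    last_flush: Instant,
    max_messages: usize,
    max_latency: Duration,
}

impl Worker {
    fn should_write(&self) -> bool {
        self.buffer.len() >= self.max_messages
            || self.last_flush.elapsed() >= self.max_latency
    }

    // Returns true when this step triggered a flush.
    fn run_once(&mut self, msg: String, rebalance_pending: bool) -> bool {
        // 1. Consume and buffer (with any small per-message transforms).
        self.buffer.push(msg);

        // 2. If a rebalance happened, drop the buffer; the real worker would
        //    re-read the Delta log, re-seek, and start the cycle over.
        if rebalance_pending {
            self.buffer.clear();
            return false;
        }

        // 3. Flush when a threshold is hit.
        if self.should_write() {
            self.buffer.clear();
            self.last_flush = Instant::now();
            return true;
        }
        false
    }
}

fn main() {
    let mut worker = Worker {
        buffer: Vec::new(),
        last_flush: Instant::now(),
        max_messages: 3,
        max_latency: Duration::from_secs(60), // "a new Delta file every minute"
    };
    for i in 0..3 {
        let flushed = worker.run_once(format!("msg-{}", i), false);
        println!("message {} -> flushed: {}", i, flushed);
    }
}
```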
B
Then it goes back to consuming and buffering. In the case where a rebalance does happen, it rechecks its partition assignment and starts back over, reading the Delta log. And just because it's so confusing that I got those things misplaced... Tyler, you can call me off of this if you want me to just go back to the demo, if it's clear enough.
A
Like I mentioned, I think it's clear, though.
B
Okay, yeah. This is one of my little OCD behaviors that I just can't...
B
I can't live with it if this isn't in the right spot, but all right, I'll stop tinkering with it. I think this is enough to get the idea. So basically: consume and buffer; have we rebalanced? If we have, we need to check what our current partition assignment is from Kafka, go back and read the Delta log again, get our offsets, and then re-seek and start the whole cycle over.
B
Okay, so I switched back over to, as we see now, the console, and let's see here... oh yeah, okay, so we spent a bit too long on that diagram; we missed the first write, the first exciting write after our rebalance. But as you can see, it picked back up where we left off and kept writing new Delta log files. So what happens if we restart worker two?
B
What we'll see here is that we get a rebalance on the worker that was continuing to run, and we also get a rebalance on the new worker, and, you know, what's...
A
...
B
Unless we have a bug, which we could have, I don't know; but as far as the designed behavior goes, for sure we should, you know, when it hits that should-write node in the workflow.
B
One of the funny things that happened here when we terminated that second worker: I think, and I could be wrong about this, that the second worker was originally handling partition two and the first worker was handling partitions zero and one, and now they're flip-flopped. That's the thing about a rebalance: when you have one consumer handling all partitions and you bring another one on, you don't know how the shuffle is going to come out.
B
It's kind of an interesting nuance that I happened to notice while we were watching this. So, let's see: we watched our workers run, we inspected the parquet files, and we took a look at the Delta logs. That's all I had lined up for the demo. Any further questions, or anywhere you'd like to go deeper, Tyler?
A
Could you share a little bit about the work that's come in lately to make sure that those two workers can actually write to the same Delta table?
B
The original plan was that we were only going to have one transaction, one txn action, in the Delta log, and instead of storing Kafka offsets there, we were going to store an identifier that points to a record in our write-ahead log, which keeps track of the offsets for all partitions.
B
Then somebody on our team had a really brilliant idea: what if we don't do that, and instead just write multiple transactions, multiple txn actions, in the Delta log file, where each would have an app id specific to the partition and would record the offset for that specific partition? By doing that, we were able to get rid of a bunch of infrastructure that we were going to have to support.
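One way to picture why per-partition txn actions help avoid duplicates (a deliberately simplified sketch; the real conflict handling in kafka-delta-ingest is more involved): before committing, a worker can compare the offsets a batch would record against the versions already in the log, and skip a batch that was already committed:

```rust
use std::collections::HashMap;

// Sketch: a pre-commit check against the per-partition txn versions already
// in the Delta log. `committed` maps partition -> last committed offset;
// `batch_end` maps partition -> last offset of the batch about to be written.
fn batch_already_committed(
    committed: &HashMap<i32, i64>,
    batch_end: &HashMap<i32, i64>,
) -> bool {
    batch_end
        .iter()
        .all(|(p, end)| committed.get(p).map_or(false, |last| last >= end))
}

fn main() {
    let committed: HashMap<i32, i64> = [(0, 1041), (1, 986)].into_iter().collect();
    // A retried batch that ends at already-committed offsets can be skipped.
    let retried: HashMap<i32, i64> = [(0, 1041)].into_iter().collect();
    println!("skip retried batch: {}", batch_already_committed(&committed, &retried));
}
```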
B
So, whereas our initial write-ahead log was going to require a DynamoDB table, and then a locking table as well, now we're just relying on the Delta log for all of our offset information, and that's what allows us to maintain this... you know, I hate to use the phrase "exactly-once semantics", but we think it is.
A
I think that answers all of the questions I was thinking about. Maybe to wrap us up: if you could talk a little bit about how somebody might be able to get involved, either helping with writing code, or testing, or documenting, or what have you?
B
Yeah, definitely. And you sent out the repository?
A
Yeah, I did.
B
Issues and pull requests are very welcome in any of those three categories: documentation, testing, code. I'll say, you know, a lot of people have expressed interest in supporting formats other than JSON, and there's even been some talk of abstracting Kafka as the source, so that this might become delta-ingest instead of kafka-delta-ingest, which would also be really cool. Those are two of the bigger items.
B
I think those would be neat contributions. Some of the smaller-scoped things that would be good contributions include adding some resiliency to some of the network calls we're doing right now. We've got some pretty limited retry handling in case of network failures when we're, for example, writing to S3. That might be a good, easier item to get in on. What else... there's something else that just occurred to me.
B
Oh yeah, we're very focused on S3 right now, so if there's any kind of Azure-related thing that somebody might want to contribute... really, though, at this point, since we got rid of the write-ahead log, I actually don't think we have any direct S3 or cloud storage touch points; we always go through delta-rs for that. So I might take that back; that might not be as important anymore.
B
I did notice on our issue list that somebody had a really neat issue.
A
So I should mention, for anybody watching: if you go to the kafka-delta-ingest repository, good first issues are noted there. So if you're just getting started with Rust, or are looking for an easy entry into kafka-delta-ingest, there are a couple of good first issues available.
B
Oh yeah, here it is: somebody had posted this issue to add the ability to access Kafka message headers in transformations. I could see that being a nice, easy add.
A
Interesting. So kafka-delta-ingest, I'll reiterate, is open source; it's part of the Delta project, or Delta Lake project, excuse me. Any parting words?
A
Okay, well, Christian, thank you very much for demoing the current state of kafka-delta-ingest. I'm very much looking forward to this finding a production workload soon. I also want to thank QP, Misha, and Neville, who have all been, I would say, major contributors in getting kafka-delta-ingest this far.
A
If you've got any more questions about kafka-delta-ingest, or want to tinker with it, you can join us in Slack in the kafka-delta-ingest channel on the Delta users Slack, or hit us up on GitHub: start a discussion, open up an issue, etc. But with that said, have a good rest of your day, and I hope everybody enjoys some Delta hacking. Bye now.
B
Okay, have a good one, Tyler.