From YouTube: Get Started with Processing Delta Tables Using AWS Glue, Amazon Athena, and Amazon Redshift
Description
There are a lot of use cases for Delta tables on AWS. Noritaka Sekiyama, Principal Big Data Architect on the AWS Glue team, will demonstrate how to get started with processing Delta tables on Amazon S3 using AWS Glue, Amazon Athena, and Amazon Redshift, on Tuesday, November 8, 2022 at 4:00 PM PDT.
Learn more about Delta Lake: https://delta.io/
Noritaka Sekiyama: https://www.linkedin.com/in/moomindani/
Denny Lee: https://www.linkedin.com/in/dennyglee/
Join us on Slack: https://go.delta.io/slack
Delta Lake Releases: https://github.com/delta-io/delta/releases
A
And our YouTube channels... but if you're wondering what session you're in, we popped it right up on screen: we're talking about getting started with processing Delta tables using AWS Glue, Athena, and Redshift. But like I said, we're going to take a couple of minutes just to get our YouTube channels and our LinkedIn channels set up, so in the interim, by all means...
A
If you have questions, please use the Q&A, and if you want to tell us a little bit about where you're based out of, where you're from, by all means go ahead and chime in inside the chat with where you're based. For example, my name is Denny and I'm based out of Seattle, Washington, specifically Kirkland, but the Seattle area. How about you, Nori? Where are you based out of?
A
So we've got a nice global audience today, awesome. So again: Nori in Tokyo, myself in Seattle. Yeah, for those who are chatting away, go ahead and tell us where you're based out of.
A
Excellent, we've got Steven based out of Wicklow, I think in Ireland, so that's really cool. All right, and, oh, Alexandria, Egypt, now that's pretty cool! Oh man, I like that. Okay, let's see... sorry, we've got the Linked...
A
We've got LinkedIn all set up, which is really cool, and it looks like now we've got YouTube set up, so we are set. All right, perfect. Well, welcome again, everybody; we'll go ahead and restart. So now that everybody is online, in terms of all the LinkedIn and YouTube and Zoom channels: hi there.
A
My name is Denny Lee, I'm with Databricks and I'm a Delta Lake committer, and I'm really, really, really happy to introduce Nori. He's from AWS; he's going to introduce himself a little bit, but he's going to be talking today about getting started with processing Delta tables with AWS Glue, Athena, and Redshift. Before we dive into those particulars, I did want to do a little housekeeping.
A
So if you do have questions: for those of you who have dialed in using Zoom, just use the Q&A, okay? For those of you who are on LinkedIn, by all means go ahead and put your chat or your questions into LinkedIn, and the same for the folks who are on YouTube. I'll chime in in the background; I'm going to mute myself when Nori goes ahead and speaks, so I'll do my best to answer various questions. But by the same token, here's...
A
...how we're going to go here: we'll make sure to answer the questions live once Nori has done the main section. So without further ado, let me just shift over to Nori: go ahead and introduce yourself and dive into today's session, please.
B
Sure. Hello everyone, I'm super excited to talk about Delta Lake on AWS using Glue, Athena, and Redshift in this session. I'm Nori, Principal Big Data Architect on the AWS Glue team. I am responsible for building software artifacts that help customers build data lakes on the AWS Cloud. Recently I published a book about AWS Glue; for today's webinar we will be giving away a few copies of this book to three lucky winners.
B
As many of you already know, Delta Lake is an open source project that helps implement a modern data lake architecture, commonly built on Amazon S3 or other cloud storage. With Delta Lake you can achieve ACID transactions, time travel queries, CDC including upserts, and other common use cases on the cloud.
B
Aws
group
is
a
serverless
data
integration
service
on
AWS.
It
gives
you
scalable
data
integration,
ending
unified
data,
governance,
ability
to
connect
to
various
data
source
and
so
on
this
database
group
you
can
easily
integrate
data
across
multiple
data
stores
and
transform
and
enrich
data
and
make
it
queryable
from
different
analytics
and
email
applications.
B
Why
AWS
group
is
good
for
the
rake
he
has
five
years.
First
AWS
group
make
data
preparation
and
data
integration
simpler,
faster
and
cheaper.
Second
AWS
do
provide
the
powerful
servers
data
integration
capability
for
all
Enterprises
using
the
AWS
ecosystem;
third,
that
both
AWS
screw
jobs
and
chloras
support,
Delta
Lake
and
then
break
formation,
AWS
rake
formation.
The
sub
is
for
centralized
access
control,
which
also
supports
the
the
direct
tables
foreign.
B
This is very good for simple use cases such as data movement, data transformations, and so on.
B
Okay, let me start the demo. Okay, now I will create a new notebook job from an existing Jupyter notebook (.ipynb) file; this is available in a public GitHub repository. Then I give it a job name, then set the IAM role. When I have provided this information, Glue Studio will spin up the serverless managed Jupyter notebook interface.
B
So it's loading now; it will spin up within several seconds. After that I'll explain what the notebook experience with AWS Glue and Delta Lake looks like. Okay, now the notebook is ready, so I will make some small changes to make it available for the marketplace connector, then run it. Then the options are configured for this notebook session. Now let me give a bucket name.
B
This time I am using the Sydney region, so I am passing my bucket name, which is located in the Sydney region. Once I run this cell, internally it spins up an ephemeral Glue cluster. While waiting for that, I will explain what we are doing in this notebook. First I will create five sample records using the Spark DataFrame API and then create a Delta Lake table using these sample records.
B
That's pretty much everything for this notebook demo, and for the upsert phase I will use the MERGE command. Now the ephemeral cluster is already set up, so I'm running a cleanup phase first; then, in the next cell, I'll create the five sample records.
B
This table shows a fictional product inventory that has the product name, product price, and category. Okay, so right now I am writing it into an S3 location as a Delta table. Once this cell has completed, the Delta files, I mean the Parquet files with the transaction log and metadata, will be located in your Amazon S3 bucket.
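The notebook cells Nori walks through can be sketched roughly as follows. This is a hypothetical reconstruction, not the demo's exact code: the record values, column names, and bucket path are illustrative assumptions, and the Spark calls are shown as comments since they only run inside a Glue notebook session with the Delta Lake connector enabled.

```python
# Five sample inventory records, as described in the demo (values assumed).
records = [
    (1, "notebook", 12.99, "stationery"),
    (2, "pencil", 0.99, "stationery"),
    (3, "stapler", 6.49, "office"),
    (4, "monitor", 199.00, "electronics"),
    (5, "keyboard", 49.95, "electronics"),
]
columns = ["product_id", "product_name", "price", "category"]

# In the Glue notebook session, roughly:
# df = spark.createDataFrame(records, columns)
# df.write.format("delta").mode("overwrite").save(
#     "s3://my-bucket/delta/product/"  # hypothetical bucket and path
# )
```

Writing in the `delta` format is what produces the Parquet data files plus the `_delta_log/` transaction log Nori points out in the S3 bucket.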
A
For folks who are having problems posting comments on LinkedIn: we noticed that there's a problem there. I'm going to repost on LinkedIn, under my personal account, an invitation to chime into the Delta Users Slack, and you can just ask your questions in #events there, or you can also join us directly through the Zoom link. Actually, you know what, I'm going to do that now. Well, Nori, why don't you go ahead and carry on while I go ahead and send out the Zoom link.
B
Thanks. Okay, let me resume the demo there. Now the data has been written into S3, and the Delta table is visible in the catalog. Now let me read from the Delta table: yep, five records show up in the table, so with this Spark SQL you can see the content of that table. That's the basic operation. Now I will try to insert a new record, product ID 6, where the product name is "pen".
B
The next one is pretty important: it simulates upsert operations on the data lake. One record will be newly inserted, and one record will be updated, or upserted; I'm using a MERGE query on the Delta table. After this query finishes, you can see one new record for product ID 8, and for one existing record, product ID 1, the price has been changed successfully.
B
We have seven sample notebooks available on GitHub. You can easily use any of the samples to understand how it works and to reproduce this demo; three of the seven notebooks are related to Delta Lake.
B
Okay, in this demo I have demonstrated what the notebook experience looks like. In the next demo I'll show what the Glue Studio visual editor experience with Delta Lake looks like. Let me share my screen.
B
That's right, so this is the AWS Glue console, and from now on I am demonstrating the Glue Studio experience. So let me open Glue Studio.
B
To use Delta Lake, as I explained, first you will need to subscribe to the marketplace connector that is designed for Delta Lake. The Delta Lake marketplace connector is located here, so if you view this product, you will see the wizard to subscribe to this connector.
B
Once you finish this wizard, you will be able to use Delta Lake in Glue Studio. Let me go back.
B
By the way, as you can see, there are many connectors, so if you want to bring some data from somewhere into the data lake, you can use these, and if you want to move some data from the data lake to another place, you can also use them.
B
Okay,
now
let
me
create
a
new
visual
job.
Okay,
so
in
this
demo,
I
read
from
some
public
S3
packet
and
write
into
or
my
S3
bucket,
using
Delta
rig,
so
the
sources
standard
S3
the
target
target,
is
the
Delta
rig.
So
let
me
do
like
this.
Okay,
then
the
template
will
be
shown
up.
Let
me
name
it
there
with
the
rake
visual
okay
in
the
at
the
source.
You
can
specify
a
three
location
or
you
are
existing
catalog
table
in
this
demo.
B
Okay, this is the COVID-19 dataset. Okay. The ApplyMapping transform is very simple: a transformation that maps your columns to different names or different types, whatever you need. After that, I'll use Delta Lake to store this table; for this I need to pass one extra parameter to set the location.
B
Okay, now that's it. Okay, now I save this job, then let me run it, and now the Glue system is trying to spin up the Spark cluster in the internal backend. Then I will demonstrate how it works with Delta Lake.
A
That's really cool. So, because I noticed all the configurations and the scripts: do you then basically specify, in the script or in the initial job, which version of Spark you want to play with, for sake of argument?
B
Okay, now the job is still running, but let me explain what is happening right now. You can see a startup time of 8 seconds. It means Glue was able to spin up a 10-node, I mean 10-DPU, Spark cluster within eight seconds. Then after that we run the script, and it is executing right now... succeeded. So the startup time was seven seconds and the execution time was 1 minute and 43 seconds.
B
Okay, now for the destination, let me open the S3 console, because I am reading from the public bucket and writing into my bucket. Let me verify that S3 path... this is the S3 bucket name.
B
Then the Delta Lake visual demo folder. Now you are seeing the raw files located on S3: this is the main Parquet file, and this is the transaction log that is required for Delta Lake. Okay, so as you can see, we were able to ingest into Delta Lake format on S3. I will use this Delta table for the further demos.
B
Thank you. The next step: once we have ingested this Delta table into my S3 bucket, I want to query this table from different engines, like Athena and Redshift. So in the next demo we will demonstrate querying Delta tables using Amazon Athena and Redshift.
B
Before doing that, let me explain a little bit more about what a Delta table is, and the concept of a manifest table. In the previous demo I used AWS Glue with Apache Spark to read and write the Delta table. That Delta table was created with Apache Spark, so it is a native Delta table. In the next two demos I want to query the Delta table from Amazon Athena and Redshift, and to make the Delta table queryable for those engines...
B
It
is
important
to
create
a
manifest
table
which
is
based
on
the
simmering
text
format.
The
standard
way
to
create
a
manifest
table
is
to
run
generate
command,
so
you
can,
as
you
can
see
here
the
example
of
the
generate
command.
If
you
run
this
command
in
your
spark
cluster
with
data
rig
library,
then
it
will
automatically
populate
the
same
link
based
manifest
table.
This
is
one
standard
way
that
is
natively
supported
in
the
Drake.
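The GENERATE command Nori shows is standard Delta Lake SQL for producing a symlink-format manifest, which is what lets Athena and Redshift Spectrum locate the table's current Parquet files. A sketch follows; the table path is an illustrative assumption, and the `spark.sql` call is commented since it needs a Spark session with the Delta Lake library.

```python
# Produce _symlink_format_manifest/ under the Delta table's S3 path.
generate_sql = (
    "GENERATE symlink_format_manifest "
    "FOR TABLE delta.`s3://my-bucket/delta/product/`"  # hypothetical path
)
# In a Spark session with Delta Lake: spark.sql(generate_sql)
```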
B
Another possible way, on the AWS Glue platform, is to use an AWS Glue crawler to do the same thing. It simplifies the manifest-generation operation, and you can easily schedule it to keep these tables in sync at a specific interval. I will choose this way, for simplicity, for the next demo. Okay then, in the next demo I will demonstrate crawling Delta tables using the Glue crawler. The crawler is a component that automatically creates a table definition from the actual data.
B
You do not need to configure any of the table schema; the Glue crawler will create the table definition on your behalf. For Delta Lake it also has the extra capability to auto-generate the symlink-based manifest file, and that manifest file is what enables Athena and Redshift Spectrum to query the Delta table.
A
No
worries
yeah
I
was
just
about
to
ask
you
see
if
he
can
go.
Let's,
let's
go
show
that
part
again,
because
actually
we
got
a
lot
of
questions
both
on
okay,
linked
on
the
Q,
a
and
also
on
the
LinkedIn,
actually
exactly
about
how
does
glue
work
with
the
Delta
tables
and
the
crawler.
So,
yes,
please
go
ahead
and.
B
So this is the deck I wanted to show, that's right. There are two types of tables: native Delta tables and manifest tables. In the Glue demo I used a native Delta table, because I used the Apache Spark ecosystem to create the Delta table. But for the next demo I will need to create a manifest-based table, which is based on the symlink text format, to make it queryable from Athena and Redshift. Usually you would need to do this operation with the GENERATE command.
B
The S3 path is this one; the S3 console has a link to copy the S3 path, so let me use that, then copy and paste the location. This is the S3 location that I used in the previous demo: in the visual interface I ingested the sample dataset into this location. Now let me crawl this location using the crawler.
B
The
important
thing
is
here
so
I
need
to
check
enable
write
manifest
to
make
it
clearable
from
both
Latin
and
redshift
specter.
Okay,
now
this
location
and
this
time
I
need
only
one
data
source,
then
next
other
than
that
I
need
to
pass.
The
IEM
role
to
this
program
then
choose
the.
B
Okay, now the crawler is trying to read from the S3 location to infer the schema; then it creates a table definition and also the manifest file, which is based on the symlink text format.
B
And you can see the table content here. That was very quick, but as you can see, I was able to query the Delta table, based on the symlink format and created by the Glue crawler. This is a sample record coming from the public S3 bucket. Okay, as you can see, this is queryable from Athena, but the same table is also queryable from Amazon Redshift, so you can choose either of these query engines based on your requirements.
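Querying the crawled table from Athena can also be done outside the console, for example with boto3. A hedged sketch, assuming hypothetical database, table, and results-bucket names; the API call is commented since it requires AWS credentials:

```python
# Query text for the crawled manifest table (names are hypothetical).
query = "SELECT * FROM delta_demo.delta_lake_visual_demo LIMIT 10"

# athena = boto3.client("athena")
# athena.start_query_execution(
#     QueryString=query,
#     QueryExecutionContext={"Database": "delta_demo"},
#     ResultConfiguration={"OutputLocation": "s3://my-bucket/athena-results/"},
# )
```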
B
Here we have the Query Editor v2 button too, so this connects to the Redshift Serverless cluster.
B
The table shows up in the Redshift console too, so based on your requirements you can choose any of Amazon Athena, Amazon Redshift, or other engines as well. Yep, this is pretty much everything I prepared for today's demo. As you can learn from this demo, you can easily create a table definition with the Glue crawler, then use Amazon Redshift or Amazon Athena to query the Delta table.
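On the Redshift side, the usual pattern for reaching a Glue Data Catalog table is an external schema queried through Redshift Spectrum. A hedged sketch, with the schema, catalog database, role ARN, and table name all as illustrative assumptions:

```python
# SQL run in the Redshift query editor (identifiers are hypothetical).
spectrum_sql = """
CREATE EXTERNAL SCHEMA IF NOT EXISTS delta_demo
FROM DATA CATALOG DATABASE 'delta_demo'
IAM_ROLE 'arn:aws:iam::123456789012:role/RedshiftSpectrumRole';

SELECT * FROM delta_demo.delta_lake_visual_demo LIMIT 10;
"""
```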
A
Awesome, Nori, this is great. I love the fact that you're showing how easy it is to use Glue with Athena and Redshift to go ahead and work with Delta tables. We got a bunch of questions, and even though I answered some of them, I'm actually going to go back to them. But I did want to ask the audience: if you do have questions, please chime in on LinkedIn, the Zoom Q&A, or YouTube; I'm going to be looking at all three.
A
So
you're
probably
wondering
why
I'm
zooming
back
and
forth
It's,
because
I
got
two
monitors.
I
have
to
go.
Look
at
all
the
questions,
but
I'm
going
to
start
with
the
ones
some
of
the
ones
that
people
have
already
asked
just
to
provide
a
little
context
for
everybody,
okay,
sure
and
oh
by
the
way,
I
know
you're
in
the
Q
a
section,
but
I
did
think
you
won.
You
had
two
additional
slides,
one
on
re
invent
and
one
on
the
book.
Why.
B
Okay, so let me add two more slides. Today's demo is based on past blog posts authored by me and other co-authors, so if you are interested, you can go visit those blog posts and check out any of them to try the same instructions in your own account. And we have re:Invent, the largest event at AWS, which is coming soon; it happens at the end of this month, and I have two sessions there.
B
So
if
you
are
interested
or
if
you
are
coming,
then
please
come
to
my
session
and
the
red
stock
there
yeah
that's
the
two
two
thread:
I
have.
A
No
problem:
that's
perfect!
Okay,
all
right!
So
let
me
let
me
start
by
asking
some
of
the
questions
that,
like
the
that
that
came
through
okay,
let's
think,
oh
right.
For
example,
one
of
the
questions
that
actually
had
to
do
with
blue
in
the
Crawlers
is
that
basically,
it
is
the
is
there
crawler
running
in
the
background
to
collect
the
info
on
the
Delta
tables.
What
you
demoed
basically
was
it
running,
so
I
presume
that
you
can
basically
have
this
continue
running
as
their
data
is
being
loaded
into
the
Delta
table.
B
That's a good question. So the crawler can be run by a manual operation, on a schedule, or on an event-driven basis. If you want to keep your Delta manifest file very fresh, then you can configure a very short interval there: every five seconds, five minutes, or whatever. If you want to run the crawler for all the operations you have, then you can configure S3 events with SQS, so that you can reflect all the changes in the table definition. That will be better for real-time use cases.
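The scheduled option Nori describes can be sketched with boto3 as follows. The crawler name is a hypothetical carry-over from the earlier demo, and the schedule uses Glue's six-field cron syntax; the call itself is commented since it requires AWS credentials.

```python
# Put the crawler on a short recurring schedule (identifiers are hypothetical).
schedule_request = {
    "Name": "delta-demo-crawler",
    "Schedule": "cron(0/5 * * * ? *)",  # run every five minutes
}
# glue = boto3.client("glue")
# glue.update_crawler(**schedule_request)
```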
A
Perfect
so
then
related
to
that,
in
terms
of
the
real
that
sort
of
real-time
use
case-ish,
typically
with
the
Manifest
right.
That's
not
it's
you're
not
going
to
be
able
to
refresh
that
fast
enough.
Potentially,
so
the
question
I'm
wondering
about
is
that
it
does
glue
automatically
take
into
account
like
if
the
Manifest
gets
refreshed.
Then
it's
automatically
refreshes
in
the
glue
catalog
as
well
or
is
it
just
really
part
of
the
job
itself.
A
Gotcha,
so
so,
then,
basically
what
happens
just
to
clarify,
because
I
actually
did
answer
a
question
that
I
wasn't
complete
for
some
folks
is
that
you
can
actually
Pro
read
data,
that's
streamed
into
a
Delta
table,
provided
that
basically
you're
allowing
glue
to
basically
manage
the
Manifest
and
manage
the
update
of
the
schema.
So
that
way,
when
you're
querying
the
data
through
glue
right,
it
is
the
one
that
defines
what
the
scheme
is.
Is
that
a
correct
assertion?
Yes,.
A
So, with those capabilities... I'm going to switch over to LinkedIn, and then I'm going to switch over to Zoom, and I'll go back, sorry, to YouTube, then I'll go back to Zoom. So from LinkedIn, Miguel has a question: with this capability that we're seeing right now, is Lake Formation security also available for this as well? Yes?
B
Yeah, the Lake Formation access control is already integrated with the Delta crawler, so you can configure any of the table-level, column-level, and row-level security using Lake Formation.
B
So currently Athena queries the manifest table, so it is just a snapshot of the Delta table. Yeah, that's the current behavior.
B
Yeah, as you can imagine, we have so many interesting and exciting announcements at re:Invent, so I'm looking forward to seeing you at the event.
A
Exactly
exactly
all
right
perfect,
there
is
another
question
from
Sandeep
from
this
one's
back
to
the
our
Zoom
here
it
so
he
originally
asked
is
the
the
visual
ETL
able
to
generate
optimized
code,
optimize
spark
code
and
I'm.
Just
curious
like?
Is
that
something
that
the
that
the
glue
Studio
takes
care
of
as
well
or
is?
This
is
more
like
you're
sort
of
responsible
for
going
and
doing
that
yourself.
B
Yeah,
so
Guru
Studio,
visual
user
interface
will
automatically
create
a
spark,
a
bunch
of
spark
code.
So
let
me
show
you
how
it
works.
Perfect!
That's
right!
Let
me
go
back
to
Google
studio.
B
Okay,
this
is
the
console
I
have
used
for
the
demo
and
this
one.
This
one
is
a
new
table
created
in
the
demo.
But
if
you
click
this
script
tab,
then
you
will
see
the
auto
generated
code
that
is
coming
from
the
this
visual
dag.
So
if
you
write
this
feature
that
then
Google
studio
will
populate
this
script
automatically.
A
Cool, all right, I think you definitely answered that question. Another question: since you're using an S3 bucket here, can you actually use a Kafka topic as a source as well when you're using Glue Studio? Yes?
A
Perfect. And you know, we've got time for one more question, and this is also a Glue-specific question, so I'm loving the fact that we're giving some love to our friends at AWS on Glue. Matt is asking: are there any plans to support newer versions of Spark past 3.1, since that basically limits Delta Lake to version 1.0, right? So in other words, for sake of argument, support Spark 3.3, which would allow us to do Delta 2.1, for example?
A
Yes, and so, in other words, we're not trying to avoid answering your questions, ladies and gentlemen, but we are calling out that there are going to be some really great announcements at the end of the month, which is part of the reason why we did the session now, and I'm sure Nori and I can have a follow-up session after re:Invent where we can update some of our answers for y'all. Sure, yes, perfect. Well, hey, I think that's it for today.
A
So
in
terms
of
questions,
my
apologies,
if
I'm
not
able
to
answer
all
of
your
questions
but
I
think
we
covered
most
of
them,
but
in
case
you
can't
just
go
ahead
and
join
us
at
go.delta,
dot,
IO
slack
and
just
ask
your
questions
there,
because
we're
actually
online
on
on
that
slack.
Answering
questions
as
well,
so
saying
that
Nori
I
really
appreciate
you
taking
the
time
to
answer
everybody's
questions
and
I.
Think
I
want
to
that's
it
for
today
and
oh,
yes,
perfect
leave
it
at
the
screen.
A
If you want to ask Nori questions directly as well, you can ping him directly on GitHub, Twitter, and LinkedIn, but like I said, you can also join us at go.delta.io/slack if you want to ask a bunch of Delta Lake questions. Hopefully you all enjoyed this awesome session showcasing Glue, Athena, and Redshift with Delta Lake. And with that, Nori, thank you very much. Thank you.