Description
Update on Operator SDK - Rob Szumski
Operator Metering - Chance Z
Operators @ Pantheon - Daniel Feinberg
GCP Spark Operator - Chaorun Yu (Lightbend)
A
Alright, so welcome again to another Operator Framework call. This is the October meeting. I had put a call out on the email list to see if anyone had joined. Usually what I like to do is have someone external to Red Hat who's working on an operator talk. I couldn't track anyone down to commit to coming this time; next time the folks from PingCAP said they would come and talk about the TiDB operator that they've created and how they operate it, so there'll be one next week.
A
If someone who comes on the call wants to take a risk and share their operator story, just let us know and we'll do that today. What I have on the agenda today is to get an Operator SDK update from Rob Szumski and a little talk about operator metering from Chance. I was also just looking at the Google group to see what else was bubbling up in topics; after we do the updates, maybe we can talk about some of the stuff that's on there.
B
So we have a few different efforts underway for getting a more structured versioning and release process going for the SDK, and some of this discussion has been happening on some of the PRs up in the repo, if you've been following that. What it comes down to is that we're going to start rapidly progressing from our current status to more of a 1.0 status.
B
Has
been
progressing
on
a
separate,
PR
and
there's
about
to
be
merged
here
soon
and
further
complicating
that
we're
adding
a
few
new
types
of
operators
to
the
operator
SDK.
So
we
have
the
ansible
based
operator
sdk,
getting
merged
into
the
mainline.
That
has
also
been
updated
to
use
this
new
library
under
the
hood,
and
so
what
we'd
like
to
do
is
we'll
tag
a
release
of
that
once
we
get
it
merged
gets
close
to
start
testing.
B
That is going to change around some of the mechanics of how your operators work and how they call the SDK under the hood, but your baseline logic for what you're doing in Go is going to remain the same. So what we're going to do is have a beta period for that, and then once we iron out all the bugs, which will hopefully not take too long, we'll call that the 1.0 and that'll be our stable API.
D
I just wanted to follow up on what you mentioned. We already have the controller-runtime refactoring changes merged onto the master branch, and we have our latest release, version 0.0.7, I believe; that's the one before any of the controller-runtime changes. So up until that release you shouldn't face any breaking changes.
E
Alright, I'm Chance. I work at Red Hat, previously CoreOS; I came in with the acquisition and I've been with CoreOS and now Red Hat for a little over three years. Rob asked me if I would give a brief overview of what operator metering is and, if possible, also a demo. I thought I'd start by just giving the base idea of what metering is for, what its purpose is, what we'll be able to solve with it, and where you can find out more information.
E
So the project name is operator metering, but I want to preface that with the fact that this isn't necessarily only geared towards operator use cases, though that is probably the best way to get better integration: if you do have an operator, you can leverage metering in a more Kubernetes-native way.
E
So
the
base
idea
of
metering
is
that
we
work
closely
together
with
your
monitoring,
stack
and
potentially
other
data
sources
to
collect
data,
store
it
for
a
long
term
and
then
provide
the
ability
to
report
on
it
over
over
time
and
slice
and
dice
it.
The
way
you
need
so,
let's
see
just
to
give
a
quick
example
of
kind
of
what
everything
actually
looks
like
we.
E
What I'll do is actually go through this in more detail, but the rough rundown is that it starts with a Prometheus query: you start ingesting the data through the reporting operator. The reporting operator then has the ability to query it using some SQL that either you write or we write, and you get that query run by creating a Report or a ScheduledReport, which actually says what you want to report on.
E
So
in
the
background,
I
already
have
an
installation
of
metering
running
by
default.
It
runs
a
set
of
pods
for
storing
our
data
for
querying
it
and
then
also
for
the
part
that
runs
collection
and
the
actual
queries
themselves.
In
particular.
The
primary
component
here
is
the
reporting
operator.
It's
the
one
that
does
the
data
collection
we
theists,
and
it's
also.
What
query
is
the
database,
which
is
presto
for
all
the
real
work
on
the
underlying
data?
We
use
ACS
HDFS
for
storage,
but
that
is
something
you
can
change.
E
You
can
actually
also
use
s3
natively,
as
basically
a
file
system
is
where
you
can
think
of
it,
or
you
can
also
use
a
local
disk
that
is
mountable
on
all
the
pots,
still
like
NFS
cluster
ifs
deficit.
Anything
that's
mountable
by
many
pots,
all
so
be
used
as
a
storage
in
for
for
this,
instead
of
HDFS.
E
Alright, so we have a number of custom resource definitions. The top one here is the Metering resource; it's the resource that tells the operator to install everything. Our metering operator installs the pods listed above, Presto, Hive, the reporting operator; it does all that through the Metering resource, which is basically the config resource for installation.
E
We
have
some
presser
tables
which
people
don't
usually
interact
with,
but
they're
kind
of
restoring
some
of
our
state
and
then
the
rest
are
all
things.
I
would
expect
the
end
user
to
deal
with
so
there's
a
report,
data
source,
which
is
basically
incoming
data
or
data
that
already
exists
and
I'll
show
you
that
there
are
report
generation,
queries
which
are
the
sequel
queries
that
we
saw
before
there's
the
report.
E
For
me,
this
query,
which
is
a
Prometheus
ul
expression
for
collecting
data
out
of
Vitas,
and
then
there's
reports
and
scheduled
reports,
which
are
the
parts
to
actually
act
upon
that
data.
The
storage
locations
are
a
way
for
configuring.
Whether
or
not
you
want
your
data
to
be
stored
in
HDFS
s3
or
a
local
file
system,
for
example.
E
I will make this slightly smaller and see if I can get a nice break in here. This is a large Prometheus query expression that gets the containers' memory usage and then groups it at the pod level, so you get pod-level usage information instead of just container-level. Then we do a bunch of joining; basically, at the end we align it with other Kubernetes data so that we have the pod name, the node name, and the namespace. This query is just run by the configuration of the data source.
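A minimal PromQL sketch in the spirit of that expression (the metric and label names here are illustrative assumptions, not the project's actual query, which is much longer):

```promql
# Illustrative only: sum container memory usage up to the pod level,
# keeping the Kubernetes labels the report later joins on.
sum(container_memory_usage_bytes) by (pod, namespace, node)
```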
E
Which,
actually
is
what
Maps,
eventually
down
to
a
real
table,
table
query.
So
the
report
data
source
has
a
Prometheus
query
name,
which
is
the
name
of
the
pot
of
the
report
from
UT
square.
We
looked
at
before
cloud
usage
memory
bytes
and
that
configures
the
operator
to
actually
go
and
collect
this
periodically.
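As a rough sketch, a ReportDataSource tying a collected Prometheus query to a table might look something like this (the API group/version and field names are assumptions reconstructed from the talk, not verified against the released CRDs):

```yaml
apiVersion: metering.openshift.io/v1alpha1   # assumed API group/version
kind: ReportDataSource
metadata:
  name: pod-memory-usage-bytes
spec:
  promsum:
    # name of the ReportPrometheusQuery whose PromQL results are collected
    query: pod-memory-usage-bytes
status:
  # filled in by the operator once the backing database table exists
  tableName: datasource_pod_memory_usage_bytes
```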
E
This
section
normally
can
have
some
extra
options
for
like
how,
often
to
poll
and
like
pump
sizing
for
like
how
much
data
to
grab
at
once,
but
by
default
it
would
just
use
some
cool
defaults
and
then
we
have
a
status
just
like
every
CR
which
store
there's
other
information
about
this
resource.
In
this
case,
the
table
name
field
is
set
indicating
there's
a
database
table
created
for
this
and
that
we're
collecting
the
data.
So
now
that
we
have
a
data
source,
we
can
actually
query
it
from
our
database.
Using
a
report
generation,
query.
E
So
there
can
be
many
report
generation
queries
that
act
on
the
data
sources.
That's
kind
of
the
idea
is
that
the
data
source
is
the
underlying
raw
data,
and
then
you
can
have
one
zero
or
more
queries
actually
utilize,
that
underlying
data.
That
way,
you
don't
actually
have
to
collect
the
data
more
than
once
for
processing
in
different
ways,
just
obviously
useful.
E
So
a
report
generation
query
just
like
everything
else
has
a
name
all
this
other
stuff
is
auto-generated
because
Hayes
likes
to
fill
in
the
metadata.
We
have
a
comest
set
of
columns,
which
is
basically
what
we
expect.
This
query
to
output,
in
terms
of
like
a
database
schema
if
you're
familiar
was
like
a
sequel
table.
This
is
roughly
what
that
map's
to
is
the
columns
in
that
table
and
then
some
extra
information
for
how
to
display
it.
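A skeletal ReportGenerationQuery along the lines described (the column list, the SQL, and the table-templating helper are illustrative sketches; the project's real queries are considerably longer):

```yaml
apiVersion: metering.openshift.io/v1alpha1   # assumed API group/version
kind: ReportGenerationQuery
metadata:
  name: namespace-memory-usage
spec:
  columns:                      # the schema this query is expected to output
    - name: namespace
      type: string
    - name: memory_usage_bytes
      type: double
  query: |
    SELECT namespace, sum(memory_bytes) AS memory_usage_bytes
    FROM datasource_pod_memory_usage_bytes
    GROUP BY namespace
```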
E
Is this the one? So we have an hourly report, two reports actually: one for memory usage and one for CPU usage. What these do is run the SQL query specified by the generationQuery field, and they run according to a particular schedule. We can do hourly, daily, monthly, whatever you want really, and we also support cron for the more flexible use cases as well. It will report on data starting at the reportingStart time until the reportingEnd; here we don't have a reportingEnd.
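A minimal ScheduledReport in that spirit (again a sketch; the API group/version, schedule shape, and timestamp are assumptions following the talk, and may differ from the released CRD):

```yaml
apiVersion: metering.openshift.io/v1alpha1   # assumed API group/version
kind: ScheduledReport
metadata:
  name: namespace-cpu-usage-hourly
spec:
  generationQuery: namespace-cpu-usage   # which ReportGenerationQuery to run
  schedule:
    period: hourly                       # hourly/daily/monthly, or cron
  reportingStart: "2018-10-01T00:00:00Z"
  # no reportingEnd: the report keeps running and backfills from reportingStart
```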
E
So
it's
gonna
report
forever,
which
is
what
I
want
for
this
purpose
and
it
what
retro
actively
go
back
and
fill
in
the
data?
That's
missing
from
the
start,
assuming
we
have
the
data
collected
from
Prometheus
already,
and
so
as
I
showed
before
this
has
been
running
for
about
ten
hours
since
last
night,
so
that
we
actually
have
some
more
than
just
a
few
rows
of
data
before
I
show
you
that,
though,
we
can
see
the
status
to
indicate
like
where
it's
at
in
the
report.
E
I'm using Routes; you can use LoadBalancer Services or NodePorts as well. I have a Route that is configured to expose my endpoint at a particular domain name here, so I actually already have a command set up for this. Yes, you can query it, but it's not really anything I'm too worried about; this is a CI cluster. I set it up with auth using the OpenShift OAuth proxy, and I'm querying the route that I just showed before, and this is the endpoint.
E
The
API
v1
schedule
reports
it
and
then
it
hard
to
see
but
I'm
querying
for
a
particular
report,
which
is
the
namespace
B
usage
hourly
and
I'm.
Getting
it
in
type
of
the
de
format,
don't
agree
that
we
basically
get
the
results
as
tab.
Tab,
delimited
format
for
each
column
period
start
is
the
start
time
for
the
given
scheduled
interval.
So
it's
an
hourly
report,
so
each
period
starts
a
period.
End
isn't
one
hour.
E
The
namespace
is
the
namespace
that
we're
calculating
on
a
start
and
data
and
are
the
minute
max
of
the
values
in
that
time
range.
And
then
the
policy
of
usage
for
seconds
is
the
CPU
every
instance
in
time,
multiplied
by
the
resolution
of
that
data,
all
added
together
to
give
us
an
actual
CPU
core
usage
seconds,
and
then
we
do
this
for
every
hour.
So
we
have
13
to
14
14
to
15
and
everything
up
down
to
basically
the
last
hour
15
to
16.
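The core-seconds arithmetic just described can be sketched in a few lines of Python (a toy illustration of the calculation, not code from the project):

```python
def cpu_core_seconds(samples, resolution_seconds):
    """Sum CPU usage over an interval, as described in the talk.

    samples: CPU core readings (cores in use at each scrape instant).
    resolution_seconds: spacing between scrapes.
    Each reading contributes value * resolution to the total core-seconds.
    """
    return sum(value * resolution_seconds for value in samples)

# Three scrapes at 60s resolution: 0.5, 0.5 and 1.0 cores in use.
print(cpu_core_seconds([0.5, 0.5, 1.0], 60))  # 120.0
```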
E
You
see
we
have
as
well
with
memory
and
you
can
see
the
same
value
except
it's
in
bytes
and
it's
a
different
set
of
values
for
the
these
Pollock's
usage
information.
So
this
is
actually
all
coming
from
node
exporter
at
the
end
of
the
day
and
then
Prometheus
collects
annoyed
exporter
data
and
we
do
some
extra
processing
with
the
sequel
and
the
Prometheus
query
to
get
it
into
this
format.
But
that
means
basically
that
if
we
ever
needed
to
change
power,
it
works.
E
All
we
have
to
really
do
is
modify
one
of
these
resources,
like
the
the
namespace
memory
usage
request.
I
can
just
edit
it
and
modify
it
to
my
needs.
So
that's
the
rough
idea
of
like
how
you
interact
and
use
me,
but
there
are
other
things
you
can
do
we're
currently
working
on
a
another
concept
which
is
see.
E
This
is
a
regular
report,
but
the
concept
is
that
you
can
have
custom
inputs
where
you
could
imagine
having
a
query
that
maybe
is
specific
to
a
particular
namespace
and
you
could
add
the
old
inputs
to
it
that
customize
the
behavior
so
that
it
filters
everything.
That's
not
the
namespace
that
you
want.
Maybe
you
know
your
CI
test
namespace,
and
you
only
want
to
report
on
that
namespace.
E
This
is
something
that
we
just
added
in
is
finally
being
worked
on,
so
I
don't
have
a
great
demo
of
it
has
never
expertise,
utilize,
these
custom
inputs
very
ugly
yet,
but
that's
something
that
we
just
released
and
then
we're
also
working
on
with
this
future,
a
concept
of
roll-up
which
allows
you
to
calculate
really
granular
reports
that
have
like.
Maybe
you
say,
like
the
hourly
interval
that
I
was
showing.
But
then
this
could
be
rolled
up
into
a
daily
report
which
basically
aggregates
the
hourly
results.
E
Yeah,
so
that's
the
rough
idea.
I,
don't
really
have
a
whole
lot
more,
given
that
all
of
this
is
just
custom
resources.
The
real
power
here
is
that
you
can
program
it
using
kubernetes.
Just
the
same
way,
you
can
program
anything
else.
Thank
you,
Burnett
ease.
You
have
an
operator
that
wants
to
interact
with
this
system.
It
can
do
so
using
typical
kubernetes
technologies
like
operators
or
true
CTL.
B
So here's kind of how it all comes together, a few examples of how you can use this in a real environment. Chance showed us all the reports and all the stuff under the hood, but at the end of the day, say you want to do showback for a number of different teams: each team has three different projects, and they have a certain budget. You can run the reports Chance was just talking about and get the usage on Amazon.
B
We
can
actually
correlate
to
a
dollar
amounts
which
is
really
cool
using
the
Amazon
billing
API,
and
so
you
can
get
those
into
Excel
and
just
you
know,
sort
them
and
group
them
by
the
different
namespaces
and
total
things
up
just
manually
or
because
these
are
all
just
using
CSVs.
You
can
actually
import
these
into
your
business
intelligence
tool
of
choice.
Whatever
you
want
to
do
and
make
dashboards
out
of
these
and
have
a
more
automated
flow,
you
can
also
start
doing
a
number
of
kind
of
like
augmented
math.
B
If
you
wanted,
if
you
want
to
call
that
on
so,
if
I've
got
like
two
bare
metal
clusters,
for
example
and
I
know
how
much
that
they
I'm
leasing
the
hardware
for
a
certain
amount,
maybe
I've
got
a
block
of
bandwidth.
You
know
whether
I
use
it
or
not,
don't
cost
any
money,
and
so,
if
you
wanted
to
combine
all
of
that
infrastructure
cost
together,
you
know
of
your
shared,
like
enterprise
math
device.
For
example.
B
You
can
take
some
of
the
usage
from
these
reports
and
combine
it
with
that
fixed
cost
and
multiply
that
stuff
together
and
then
show
that
back
to
your
team,
whether
in
you
know
another
Excel
document
or
in
that
bi
tool
hook
it
up
to
any
of
the
other
cost
reporting
that
you
might
be
doing.
Email
reports,
that
type
of
thing-
and
my
favorite
use
case
for
this
of
all,
is
you
can
shame
teams
that
are
under
utilizing
what
they've
reserved.
B
So
if
you're,
you
know
asking
for
more
than
2x
what
you're
using
on
the
cluster
itself,
you
can
start
shaming
those
teams,
you
know
calculate
the
ratio
of
what
they're
using
like
or
not
list
them
out,
so
exactly
which
apps
need
to
be.
You
know
yanked
down
to
size
or
even
just
go
ahead
and
do
that
for
them.
You
could
have
automation,
that's
running,
that's
automatically
adjusting
people's
resource
limits
and
that
type
of
thing.
So
it's
pretty
exciting.
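That reserved-versus-used ratio check is easy to sketch once the report totals are in hand (a toy illustration; the team names and the 2x threshold are just the example from the talk):

```python
def over_reservers(teams, threshold=2.0):
    """Return teams whose reservation exceeds `threshold` times their usage.

    teams: mapping of team name -> (reserved_cores, used_cores),
    e.g. totals pulled from the metering report CSVs.
    """
    flagged = {}
    for name, (reserved, used) in teams.items():
        # teams with zero usage are maximally over-reserved
        ratio = reserved / used if used else float("inf")
        if ratio > threshold:
            flagged[name] = ratio
    return flagged

usage = {"team-a": (8.0, 2.0), "team-b": (3.0, 2.5)}
print(over_reservers(usage))  # {'team-a': 4.0}
```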
B
This
is
you
know
all
using
cluster
metrics
that
we
have
today,
and
that
is
one
whole
use
case
for
this.
But
you
can
also
you
know,
export
custom
metrics
from
your
operators,
and
so
that
is
kind
of
the
the
other
use
case
of
this.
So
the
cluster
monitoring
and
gaining
insights,
for
that
is
great,
but
if
you've
got
a
database
operator
and
it's
emitting
vetrix
for
the
different
things
that
is
tracking
internally
the
number
of
rebalance
operations
or
things
like
that
that
are
critical
to
how
it.
A
All right, well, I think that probably covers off metering; I'm not seeing any questions. Daniel Feinberg from Pantheon has been asking a lot of questions in the chat, with updates around the controller; I think he joined late and we missed him. So I'm going to unmute you, Daniel, and you can introduce yourself and what you've been working on, and we can kick off a conversation.
F
So I'm Daniel Feinberg, a senior engineer, SRE, at Pantheon. We are a Drupal and WordPress hosting platform; we also offer developer tools to agencies, and we host about, I don't know, 150,000 sites; it varies quite a bit depending on the day as customers join. We host a large-scale Cassandra infrastructure; we use Cassandra in two different major pieces of our platform, and so we're in the process of building a Cassandra operator.
F
That
building
that
are
talking
about
so
our
other
operator
is
a
machine
operator
that
is
also
right
now
being
built.
It's
not
running
in
production.
We
have
our
standard
operator
operating
production
sandra
clusters
in
a
very
simple
way.
Our
larger
monolithic
database
will
be
migrated,
hopefully
by
the
end
of
the
month.
Probably
in
the
first
week
of
next
month
is
more
realistic
on
to
the
operators
management
plane.
F
So
our
machine
operator
is
actually
we
may
we
wrangle
and
maintain
systemd
containers
in
large
quantities.
All
of
our
customer
code
runs
and
systemd
containers,
no
docker,
no
images
system,
D,
namespaces,
C
groups,
all
the
base
stuff
we've
been
around
about
seven
years
and
when
they
built
this
out,
docker
wasn't
there
yet,
and
so
we,
the
large
monolith
in
our
system
is
an
orchestration
plane
and
our
goal
is
to
slowly
piece
that
out
into
operators
in
kubernetes
wrangling,
our
system
D
containers
our
first
step.
F
There
is
putting
couplets
on
all
of
our
our
servers
that
run
the
system,
D
containers
and
bringing
in
a
provisioning
tool
that
is
being
built
as
an
operator
it
will
deploy.
Daemon
sets
to
manage
services
on
each
server
that
worked
a
knitter,
a
great
for
customer
load
on
our
servers
with
those
assistant
e
containers.
So
the
machine
operators
super
interesting
and
unfortunately
won't
be
open
source.
But
what
the
big
the
big
deal
for
us
is
that
we're
moving
into
a
way
where
kubernetes
will
be
our
central
database
for
infrastructure.
F
It
will
we'll
be
using
it
like
right
now.
The
machine
builder
operator
doesn't
really
do
much
beside
daemon
sets
and
allows
us
and
our
monolith
provisional
the
system
via
containers
still
and
so
kind
of
utilizing
at
CD
and
kubernetes
in
its
multi
region
way.
We
use
gke
at
Google
to
our
multi
zonal
way
to
store
all
of
our
operational
information,
whether
or
not
it
affects
kubernetes
and
using
its
event-driven
system
and
operators
to
operate
on
kubernetes
resources,
but
also
external
things
that
kubernetes
isn't
managing
so
kind
of
a
hybrid
operator
there.
F
Well, because we have two different kinds of use cases for Cassandra: one is a multi-region, eventually consistent setup, and the other is three-way data replication of the metadata for a distributed file system. We maintain a distributed file system for our customers; the metadata is stored in Cassandra, and the files are stored in GCS on Google. The two different use cases have had us broaden out the build and spec of the management of the Cassandra cluster.
F
So ours will be specific to our containers: you'll have to use our images with our operator, because we have some logic inside the entrypoint files of the Docker images that does calculations which can't be done at the level of the operator, to set up configurations and things. So there are definitely design decisions we've made that are based on our feature set, but I believe that by open-sourcing it we can surface things like, oh, we do repairs differently.
F
Hoping early November; as soon as we get our monolith migrated and see that it's stable, then I'm hoping to put that out. You know, Sean just asked about the controller-runtime; I missed the beginning, and part of my project this weekend is hopefully migrating to the controller-runtime version of the Operator SDK, so I'm wondering if maybe someone can speak to how hard that's going to be and how much change that's going to be across the codebase for a user.
D
So
I
can
kind
of
jump
into
that
dannion.
So
you
don't
have
the
migration
guide
out
yet
we're
planning
to
have
it
out.
I
guess
this
coming
week,
along
with
release
that
tags
on
changes
on
the
master
but
kind
of
at
a
high
level.
There
is
a
fair
amount
of
change
involved
in
just
the
interfaces
changed,
because
we've
moved
over
from
using
the
SDK
api's
SDK
dot.
Watch
as
to
give
the
handler
that
you
have
in
the
SDK.
You
move
those
over
to
using
the
controller
runtimes
controller
package.
D
Essentially,
most
of
your
reconciliation
code
would
probably
stay,
as
is
it's
just
that
you
would
need
to
update
the
project
layout
just
a
bit.
So
if
you
look
at
the
master
branch
today
and
if
you
try
to
create
like
a
samples
and
catch
the
operator
as
an
example
you'll
be
able
to
see
just
what
the
new
project
new
clothes
looks
like.
So
there
is
a
fair
amount
of
change
involved
in
actually
taking
your
reconcile
cold
and
just
moving
it
over
to
a
controller
package
and
then
just
kind
of
changing
the
interface
to
the
reconcile
code.
D
So
obviously,
when
you
would
get
events
sent
to
your
handle
now
you
basically
get
like
an
object
key
that
you
use
to
look
up
the
object
and
the
cache,
for
instance.
So
there's
some
changes
around
the
edges,
but
I
think
by
and
large,
the
bulk
of
your
like
operator
code
should
stay
the
same
and,
if
you're
using
multiple
consume
resources
within
your
SDK
base
project
today.
C
Just as another data point, we moved the Ansible operator over to the controller-runtime a while back, and as he was saying, most of our core logic didn't have to change; we just changed the interfaces it was calling, so that was really nice. The rest of it was moving files to new locations. So as far as the amount of work goes, it should just be about moving files into the right locations or renaming them.
C
But
as
far
as
like
the
core
logic,
it
shouldn't
change
that
much.
Your
main
file
will
change
some
because
you
need
instead
of
a
manager
and
do
some
of
that
other
stuff.
But
you
can
kind
of
just
take
the
core
like
the
controller
one
times,
like
example,
just
like
take
that
and
that
they'll
mostly
work
for
you,
so
just
add
them
as
you're
starting
to
go
through
things.
A
All
right,
then,
I
was
asking
in
the
chat.
If
anyone
else
had
any
updates
and
Sharon
was
gonna.
Ask
for
a
slot
today
about
GCP
spark
operator.
Oh
I
apologize
for
that
sermon
that
got
lost
in
the
email
threads.
We
have
15
minutes
if
you'd
like
to
take
it
I'd
quite
like
to
hear
it.
So
why
don't
you
share
your
screen
and
take
it
away
and
it's
a
little.
A
So while he's setting up, I just wanted to make an announcement: we did get a room at KubeCon North America to host a Kubernetes Operator Framework hands-on workshop. It is on December 14th, the morning of Friday the 14th, in Seattle. It's not actually at the convention center; it's going to be at the Seattle Sheraton. So if you are interested in joining that, I will send the information out on the mailing list after this. So, Chaorun, take it away.
G
Ok
yeah
sure
today,
I
can
talk
a
little
bit
about
the
Google
cloud
platform
spark
operator.
I.
Think
in
two
weeks
ago
I
attended
a
meeting
where
Chiri
presented
his
spark
operator,
but
this
was
from
Google
I
think
it
has
more
momentum
and
more
contributors
and
users.
So
let's
look
at
what
it's
about
so
have
a
few
things.
I
want
to
talk
about
for
today,
so
to
get
started
out.
G
Just
talk
briefly
about
what
the
operator
pattern
is,
but
I
think
most
people
are
already
familiar
with
it,
but
the
this
treaty,
Peace
Park
operator,
is
basically
an
implementation
of
this
pattern
and
then
I'll
talk
about
the
architecture
of
the
of
this
operator,
how
to
install
it
and
what
are
when
some
of
its
basic
features
are
and
then
I'll
talk
about
a
COI
tool.
That's
provided
in
this
operator
project
called
the
spark
CTL.
It
makes
some
of
the
workflow
with
managing
spark
drops
the
easier,
as
we'll
see
then
comes
a
feature
called
militating
animation
webhook.
G
This
is
a
feature
that
the
spark
oratory
leverages
to
to
provide
lots
of
flexibility
in
customizing.
Your
spark
driver
and
executor
paths.
I
think
this
is
one
of
the
most
useful
features
in
this
project
and,
as
the
last
thing
I'll
talk
about
exporting
and
looking
at
Prometheus
matric
for
Matias
matrix
with
this
spark
operator.
I'll
conclude
with
some
future
things
that
you
can
contribute
to
this
project.
G
What it really is, is an application-specific controller that extends the Kubernetes API and makes management, creation, and configuration of this complex application easier. The way it does that is with an event loop that's constantly running: the operator component keeps observing for events, listening for the creation of a new custom resource, for example; then, when something happens, the operator evaluates the current status and what it should do, and then it acts on that, and this event loop keeps going.
G
Then
that's
quickly
cut
into
the
gist
to
talk
the
specifics
of
the
GCP
spark
operator.
It
was
created
by
this
guy
called
Ian
Ali
at
Google,
and
it's
not
open
source.
The
link
is
provided
on
the
slide.
The
approach
that
it
takes
to
managing
spark
jobs
is
that
it
creates
two
customer
resource
definitions
or
CR.
These
one
called
spark
application.
Another
cost
schedule
spark
application,
so
those
are
these
represents
the
abstractions
of
a
structure
and
they
are
what
make
scratch
jobs-
citizens
in
cadiz.
G
We'll see what one looks like in a minute, but once you have the Spark job spec documented in the YAML, you use kubectl, or the sparkctl tool that we'll talk about, to submit your YAML to the API server. Once the server receives your request to, for example, create a new SparkApplication or ScheduledSparkApplication, there's a component called the controller in the Spark operator that gets this request, assembles those configurations, and passes them on to another component, the submission runner.
G
The other basic features: because it uses a YAML document for the spec of a job, and YAML is declarative in nature, it's easy to do things like version control. And because, under the hood, what it really does is translate the spec into a spark-submit command, everything that spark-submit takes in the way of configuration options, the Spark operator also supports; you only need to figure out what to put in the YAML for the translation. There's good documentation, so it's easy to figure out.
G
It
also
supports
a
crown
like
scheduled,
spark
jobs.
So
that's
what
the
ECR
the
scheduled
spark
application
is
for
and-
and
the
interesting
feature
is
mutating
animation
black
book
the
operator
uses
that
to
enable
product
customization,
you
can
mount
configure
config
maps
or
volumes
in
your
driver
and
executor
parts.
We'll
see
that
in
a
few
slides-
and
you
can
also
use
the
spark
operator
to
enable
automatic
job
pre-submission.
If
you
would
like
to
change
the
specs
of
an
existing
start
job
or
to
restart
it
if
upon
failure.
G
Let's talk about prerequisites. It requires Kubernetes 1.8 and above, because it relies on garbage collection of custom resources, which is only available starting with 1.8. And if you would like to use the mutating admission webhook, then Kubernetes 1.9 and above is required, because that only becomes a beta feature starting with 1.9. Exactly which distribution of Kubernetes you install the operator on doesn't really matter; personally, I've used it on GKE and OpenShift and both worked fine. As for how to install it...
G
It's
easy
to
install
because
there's
a
there's,
an
incubator
chart
on
the
central
helm,
charts
repo
yeah,
you
basically
added
the
people,
how
people
and
they
started
just
what
it
would
do
for
any
other
standard
chart,
and
there
are
other
options
to
customize
it.
For
example,
you
know
you
would
like
to
install
it
in
a
different
name
space
or
there
are
some
components
that
you
would
like
to
enable
or
disable.
But
you
you
can
look
at
the
the
link.
There's
a
concise
documentation
that
you
can
look
at.
I
won't
go
into
details
here
now.
G
Let's
take
a
look
at
a
sample
llamo
what
it
looks
like
so
here.
So
basically
you
would
like
you
name
your
Spock
job
expert
PI,
and
you
would
like
to
run
it
in
default.
Namespace
yeah!
That's
where
you
specify
your
namespace
and
you
provide
your
image,
your
main
class
application
file.
It's
all
standard
things,
then
you
can
configure
our
driver
pod
to
with
some
memory,
resource
requirements
or
service
account
without
you
use
and
how
many
executor
instances
you
would
like
to
launch
and
resource
requirements
for
executor.
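A SparkApplication along the lines of the spark-pi example being described might look roughly like this (a sketch from memory of the project's examples; the API version, image tag, and exact field names are assumptions and may not match the current CRD):

```yaml
apiVersion: sparkoperator.k8s.io/v1alpha1   # assumed API group/version
kind: SparkApplication
metadata:
  name: spark-pi
  namespace: default
spec:
  type: Scala
  mode: cluster
  image: gcr.io/spark-operator/spark:v2.4.0   # illustrative image
  mainClass: org.apache.spark.examples.SparkPi
  mainApplicationFile: local:///opt/spark/examples/jars/spark-examples.jar
  driver:
    cores: 1
    memory: 512m
    serviceAccount: spark
  executor:
    instances: 2
    cores: 1
    memory: 512m
```

Submitting and inspecting it then works like any other resource, e.g. `kubectl apply -f spark-pi.yaml` followed by `kubectl get sparkapplications`.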
G
So
this
is
a
very
simple
llamo
and
you
can
have
all
sorts
of
other
configurations
as
long
as
this
proximity.
Of
course
it
and
you
figure
out
the
corresponding
speck
in
the
llamo
but
yeah.
This
is
what
it
looks
like
the
basic
operations
very
easy,
because
now
the
CRTs
are
there.
Then
you
can
just
create
a
spark
job.
For
example,
just
as
you
would
create
a
pot
cookie
I'll
apply
the
llamo
and
took
place
all
the
jobs.
Okay,
get
spark
applications,
the
name
of
the
the
CRD
you
have
to
get
other
details,
for
example,
events.
G
Let's
now
look
at
the
the
customs
tôi
có
provided
in
the
in
this
project
called
spark
CTL.
So
here
I
said
it's
a.
It
complements
cube
CDL
to
make
some
operations
easier,
but
but
I
would
say
that
it
it
can
fully
replace
coups
to
do
when
working
with
working
with
spark
application
or
schedule
spark
applications.
Here
are
these
because
yeah,
because
everything
that
Cuba
CDL
can
do
in
Sparks,
they
all
can
do
and
it
makes
things
easier.
For
example,
listing
our
spark
drops.
G
Besides that, there are a few other features. sparkctl also supports port-forwarding to the Spark web UI. Again, this is something that kubectl can do, because with kubectl you can figure out the pod first and then do a port-forward on that pod; but here you don't need to find the pod, you just know the Spark job name, here spark-pi. So it's easier.
G
It
also
supports
staging
local
dependencies
to
s3
and
GCS,
and
so
for
your
dependencies
that
you
specified
in
your
spark
idle
llamo.
You
can
specify
your
a
GCS
pocket
or
a
tree
bucket
to
upload
them
to
to
a
remote
place,
but
you
need
to
configure
your
authentication
and
stuff
up
front.
The
details
are
in
the
documentation
which
I
won't
talk
about
here.
G
Ok,
so
now,
let's
go
to
the
mutating
animation
graphic.
So
this
feature
is
a
it's
a
feature
about
of
kubernetes
itself
rather
than
the
operator,
but
the
sparkle
operator
leverages
this
feature
to
to
enable
a
flexible
customization
of
pods.
What
this
feature
is,
is
it's
a
so-called
animation
controller
that
intercepts
requests
to
the
API
server
and
the
modify
stand
object
before
the
object
is
persisted
as
I
mention
it's
a
beta
feature
in
1.9
above
and
the
SPARC
operator
uses
this
feature
to
achieve
mostly
three
use
cases.
G
The
first
use
case
is
Mountain
config,
mats
and
Driver
and
executive
pause.
The
second
feature
is
the
monkey
volumes.
The
third
feature
is
setting
positive
affinity
and
I.
Definitely
things
like
what
what
notes
you
actually
run
on,
or
which
knows
it
would
like
to
avoid.
Let's
look
at
the
view
sample
use
cases.
So
when
would
you
like
to
month
amount
complete
maps
in
your
spark
job
parts?
So
here's
a
here's
an
example.
So
it's
a
very
common
to
have
some
custom
configurations
for
a
job
in
Sparky,
Falls,
calm,
partying,
that
Sh
or
log4j
properties.
G
So you first create these files as config maps, and then in your YAML file you simply refer to the config maps that you created. Then, when the Spark CRD object, the Spark job, is created, those config maps that you created beforehand will be automatically mounted inside the pods, inside the driver and executor pods, and your Spark job will be automatically configured as desired.
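A minimal sketch of what this could look like in the SparkApplication YAML (the field and config-map names here are assumptions for illustration, hedged against the CRD version in use):

```yaml
apiVersion: sparkoperator.k8s.io/v1alpha1
kind: SparkApplication
metadata:
  name: spark-pi
spec:
  # Config map created beforehand, e.g.:
  #   kubectl create configmap spark-config \
  #     --from-file=spark-defaults.conf --from-file=log4j.properties
  sparkConfigMap: spark-config
```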
G
Another use case is supplying Hadoop configurations to access HDFS, for example. You need core-site.xml and hdfs-site.xml. These files, again, you can create as config maps and refer to the config maps in the YAML, which would then bring in the config maps and mount them in the pods once your job starts. This way you get connectivity with HDFS.
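Sketched the same way (hypothetical names; the field name is an assumption based on the operator's API):

```yaml
spec:
  # Config map holding core-site.xml and hdfs-site.xml, created with e.g.:
  #   kubectl create configmap hadoop-config \
  #     --from-file=core-site.xml --from-file=hdfs-site.xml
  hadoopConfigMap: hadoop-config
```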
G
Another use case for the mutating admission webhook is mounting volumes. Here's a use case that I've been working on myself: the use of the Spark history server. In this case, both driver and executor pods of a Spark job need to log events to the same volume, which is also the volume used by the history server pod itself, for things like displaying them on the UI, for example.
G
So here, for example, we have a PVC-type volume, and then, in order to have the driver and executor pods log to that volume, you need this section here with the volume mounts: you specify the name of the volume that's declared here, and then the path that you would like to mount the volume at. This way the volume is available at that mount directory, and then you can log events there, which is configured here.
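A minimal sketch of such a spec, assuming hypothetical volume and claim names and a hypothetical /mnt/spark-events mount path (the slide's actual values were not captured in the transcript):

```yaml
spec:
  volumes:
    - name: spark-events
      persistentVolumeClaim:
        claimName: spark-history-pvc   # shared with the history server pod
  driver:
    volumeMounts:
      - name: spark-events
        mountPath: /mnt/spark-events
  executor:
    volumeMounts:
      - name: spark-events
        mountPath: /mnt/spark-events
  sparkConf:
    # Point Spark event logging at the shared volume:
    "spark.eventLog.enabled": "true"
    "spark.eventLog.dir": "file:/mnt/spark-events"
```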
G
So these are some use cases for the mutating admission webhook. I think it's a pretty useful feature, but it's an optional component; you can disable it if you don't want to use it. Last but not least, let's talk about Prometheus metrics. The Spark operator configures a Prometheus JMX exporter to run as a Java agent in the operator pod itself, but it also supports emitting Prometheus metrics in the driver and executor pods themselves.
G
So
so
the
two
sets
of
metrics
are
in
a
an
application-specific
metric.
For
example,
spark
driver
app
status,
stop
duration,
so
this
is
a
metric,
that's
specific
for
that
job.
Coming
from
a
driver
or
executive
park,
there's
also
a
set
of
metrics
that
are
a
higher
level,
for
example,
spark
app
running
count,
so
these
are
metrics
that
are
specifically
provided
by
the
operator
pod
itself.
So
these
are
application,
metrics
application
level
metrics,
but
note
that
tweaks
to
expose
driver
and
executor
metrics.
G
So for the first set, the Spark application image that you specified in your YAML needs to contain the Prometheus JMX exporter Java agent jar; otherwise the metrics won't be exported. But once you have that jar available in your image, it's easily configurable to have those metrics exported. Here's an example YAML file that configures the driver and executor metrics to be exported; this is for the first set of metrics that I showed on a previous slide. The operator pod itself already configures itself to export application-level metrics.
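As a hedged sketch of what such a monitoring section might look like (field names are assumptions based on the operator's API, and the jar path depends on where the JMX exporter agent was baked into the application image):

```yaml
spec:
  monitoring:
    exposeDriverMetrics: true
    exposeExecutorMetrics: true
    prometheus:
      # Path of the JMX exporter Java agent jar inside the application image
      jmxExporterJar: /prometheus/jmx_prometheus_javaagent.jar
      port: 8090
```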
This slide is just the general thing about how you look at those metrics. You can look at them in the Prometheus UI, or, for example, if you would like to verify the list of metrics exported and advertised by the Spark operator pod itself, you can find the pod and then do a port forwarding on it. The default port is 10254, and once you have that port forwarded, you can go to the metrics endpoint.
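For example (illustrative only; the namespace and pod name are placeholders, and the commands need a live cluster):

```shell
# Find the operator pod, then forward its metrics port (default 10254):
kubectl -n spark-operator get pods
kubectl -n spark-operator port-forward <spark-operator-pod> 10254:10254
# In another terminal, fetch the metrics:
curl http://localhost:10254/metrics
```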
G
On the current status of the project and future work: it's fully compatible with Spark 2.3, and Spark 2.4 is being tested with the 2.4 release candidate versions. It's currently alpha, but it will be upgraded to beta once 2.4 is officially released. We've been trying it out here at the lab, and we are actively evaluating and contributing to this project. Our past contributions included the Helm chart and the integration with Prometheus and the Spark history server. The project is still in its early stage, and it requires lots of testing to make it mature.
G
So
more
integration
tests
are
I,
think
that
needs
to
be
added,
and
we
are
also
working
on
that
also
Kerberos
support.
That's
currently
lacking
also
spark
Cpl,
doesn't
have
very
good
support
for
scheduled
spark
application.
That's
also
something
to
be
added.
Just
a
few
words
about
the
team.
I
mean
I
worked
at
the
light
bends.
The
team
I
working
is
called
fast
data
platform,
so
we
are
a
the
product.
Is
a
curated
fully
supported
the
platform
that
helps
you
help
to
helps
developers,
design,
build
and
run
data
pipeline,
and
all
mortgages
is
on
streaming.
G
The history server, for example. Currently it's looking good, so I think this is a promising project with a lot of activity, so I encourage you guys to try it out and maybe consider contributing. Yeah, I think I went a little bit too fast because we're short on time, but if you have questions you can shoot me an email, or just talk to me. Okay.
A
Would be great. I will post all that up on the mailing list a little bit later today. So thank you, everybody. I don't see any questions in the chat at the moment, but I will post all this, and Chaorun, you can reach out to him on the mailing list as well. So thanks again, and we'll be meeting again on the third Friday of next month, but you can always find us on the Google group mailing list. So that was a really great talk.