From YouTube: Ceph RGW Refactoring Meeting 2022-11-23
Description
Join us every Wednesday for the Ceph RGW Refactoring meeting: https://ceph.io/en/community/meetups
Ceph website: https://ceph.io
Ceph blog: https://ceph.io/en/news/blog/
Contribute to Ceph: https://ceph.io/en/developers/contrib...
What is Ceph: https://ceph.io/en/discover/
B: One of them is that they're not restricted to Lua. They have all kinds of language support; they're just manipulating object GETs, which makes sense, because this is what you have to manipulate inline. You can also do it offline, but the one difference that I've seen there, and this could kind of make me think about it.
B: So if it's just, you know, switching some metadata fields, then it's probably not too bad. But if it comes to actually reading the object, or reading chunks from the object and doing some heavy processing on them, then it's kind of: okay, we don't want those scripts to hog all the CPU of the RGW and affect all the rest of the people using the RGW to upload and download objects.
B: We don't want to slow them down; scaling would be a problem, and estimating the capacity of the system would be a problem. What they did in AWS is that if you want to fetch the object and you want to run the Lambda function against it, then you use a different URL.
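The two-URL split described here can be sketched with a toy in-memory store. This is an illustration of the routing idea only, not the AWS API; a real setup would go through an S3 Object Lambda access point, and all names below are invented:

```python
# Toy sketch (not the AWS API): the routing idea behind S3 Object Lambda.
# A plain URL returns the raw bytes; a second, separate URL runs a
# transform on the bytes before they reach the caller.

store = {"logs/app.txt": b"hello WORLD"}  # stand-in for a bucket

def get_object(key):
    """Plain endpoint: returns the stored bytes untouched."""
    return store[key]

def get_object_via_lambda(key, transform):
    """'Lambda' endpoint: same object, but a function runs on the bytes
    before they are returned. Because it is a separate entry point, the
    plain data path above is never slowed down by it."""
    return transform(store[key])

raw = get_object("logs/app.txt")                          # b"hello WORLD"
shouty = get_object_via_lambda("logs/app.txt", bytes.upper)
```

Keeping the transformed fetch behind its own entry point is what lets the provider scale and account for the two paths independently, which is the concern raised above.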
C: This is a whole line of development there, and it's been there a while. We've been in conversations, been in meetings with people about different things: serverless functions, and supporting them from other stores or with object stores. It's a whole area where you proliferate an entire suite of software to run such serverless functions, if that's the case, and all of those would be out of process.
C: Well, there have also been conversations about having a converged one, and I think I can say here that we had meetings with Ronen's team, and Gal was present, so I was a little confused why he doesn't remember it. If you remember, maybe you were looped into some of that too. One of the things that Ronen's team did was develop the prototype of what IBM calls Fabric, and Fabric is its own sort of technology.
C: They've been doing a bunch of things with it, but that's fine, because the core intended functionality of Fabric was that it would be a gateway with an interface to a variety of back-end systems. But I think the consumer interface for it actually uses S3 in some way, and you can insert it into various workflow pipelines or application data pipelines.
C: Data passes through it, and it enforces policy-based security of various kinds on the data, or on the access. That could include restricting how much of the data can be seen: which columns in columnar data sets can be seen.
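A minimal sketch of that column-restriction idea, assuming a simple role-to-columns policy. The policy format and all names here are invented for illustration; Fabric's actual mechanism is not shown:

```python
# Sketch of column-level, policy-based filtering at a gateway: each role
# may only see an allowed subset of columns in a columnar record.

POLICY = {
    "analyst": {"order_id", "amount"},                       # no PII
    "admin":   {"order_id", "amount", "customer_email"},
}

def filter_columns(rows, role):
    """Return rows with only the columns the role's policy allows."""
    allowed = POLICY[role]
    return [{k: v for k, v in row.items() if k in allowed} for row in rows]

rows = [{"order_id": 1, "amount": 9.5, "customer_email": "a@example.com"}]
analyst_view = filter_columns(rows, "analyst")   # email column stripped
```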
C: It can anonymize or transform things for the process asking for them. They wrote all this, and we talked to them about whether implementing an object Lambda using a technology like this would be interesting. One of the reasons is that part of Fabric was a wasm runtime that they could accelerate, and they had already written all of this.
C: All these transforms to allow wasm32 code to interact with 64-bit data, and other stuff. But then there's a whole bunch of things out there, right, that I haven't really been able to explore; there is a whole suite of software that does serverless functions as an infrastructure component, which you could link to S3. So these are all things that could be done.
C: Right, so there's a question of whether each of these wants its own system. But I think if we start combining it with RGW, and it shouldn't be instead of RGW, then I think those premises fight with each other. But I think if you then say: hey, community version, there's a lot of off-the-shelf stuff.
B: Okay, I mean, my kind of question was: the whole idea here is about inline processing, right? Because if this is offline processing, then we have a...
C: If it's inline, then I would say there are interesting projects there, but I wouldn't be focusing on how to get it outside and inside at the same time. I'd be focusing on how to get it inline, but very efficient and safe. So sandboxing and other ideas come up at that point; hence maybe the wasm runtime stuff is interesting.
B: Yes, yes; sure, we don't know what they do, but I'm saying I think the idea is that if you just want to fetch the object, you use one URL, and if you want to fetch the object so that it goes through something, you use another URL. Maybe this is not the gateway; maybe this is something that's done before that.
C: Yeah, the flexibility. So they can do that, and they probably are doing it out of process, but we wouldn't have to make that choice, right?
A: So I guess I'm curious what the advantages are of doing it inline, instead of as some other client that just issues a GET request to S3 and processes the data that it gets back.
B: Well, when you do a GET, then you have to do that inline, whether it's in the RGW or in some other layer, but it has to be inline. When you do a PUT, you can later on kick off something that will do the processing, because you just uploaded something, and it doesn't really matter whether it becomes the right thing immediately or later on. When you do a GET, you have to get it right then. So...
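That asymmetry (a PUT can defer its processing, a GET cannot) can be sketched with a small in-memory store. This is a minimal illustration with invented names, not RGW code:

```python
# Sketch: PUT-time processing can be queued and run later by a worker,
# but GET-time processing must happen inline, on the request path.

from collections import deque

store, pending = {}, deque()

def put(key, data):
    store[key] = data
    pending.append(key)          # defer processing; the caller returns now

def process_pending(transform):
    """E.g. run from a background worker, any time after the PUT."""
    while pending:
        key = pending.popleft()
        store[key] = transform(store[key])

def get(key, transform=None):
    data = store[key]
    # No "later" here: whatever the caller must see has to be computed now.
    return transform(data) if transform else data
```

After `put("k", b"abc")`, a later `process_pending(bytes.upper)` can fix up the stored copy at leisure, while `get("k", bytes.upper)` pays the transform cost on the request path every time.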
C: Another thing, and I don't know if this is true of Object Lambda per se (I don't know everything about Object Lambda), but another thing that I remember from when NooBaa came along: one of the things NooBaa said it could do, and had examples for, was that you could add to the processing pipeline. This is more like the Lua integration now, in some ways, and some things you could do with it. Like, they had an example, I don't know...
C: I don't know whether it's an exemplar of good practice, but one of the demos they had was credit card anonymization, data anonymization, which is something that, yeah, could be inline and permanent on a PUT; other ideas like this. So there are concrete reasons why you would. You know, who knows what their Object Lambda does, but there are...
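A minimal sketch of that kind of anonymization transform, assuming 16-digit card numbers with optional spaces or hyphens. This is illustrative only, not NooBaa's or AWS's actual implementation:

```python
# Sketch: mask credit-card-like digit runs in an object's bytes (e.g. on
# PUT, so the stored copy is permanently anonymized), keeping only the
# last four digits.

import re

# 16 digits, each of the first 12 optionally followed by a space or hyphen
PAN = re.compile(rb"\b(?:\d[ -]?){12}(\d{4})\b")

def anonymize(data: bytes) -> bytes:
    return PAN.sub(rb"****-****-****-\1", data)

masked = anonymize(b"card 4111 1111 1111 1234 charged")
```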
C: There are hypotheticals where you would do something that maybe isn't the same as most client processing, but where you do want to sort of converge the processing together with the operation, either for performance reasons or, I guess, for non-repudiation: by the time the data is at rest, it's where you want it. Okay.
B: Right, right. Even if you're limited to the GET, to Casey's point, then sure, the client can do everything with the data. But sometimes the owner of the data is not the client that gets the data, right? So it could be that the owner of the data wants you to get something that goes through some filters or processing, and they don't want to count on the implementation of the application or the client to actually do that. I mean, this is what the Object Lambda is for.
C: I think the argument for doing it inside of RGW is not as strong. At least, the bigger picture is that so much of this sort of area would be about building a resilient ecosystem that developers can use to organize the different codes and outputs and credentials and all that stuff, and that's a whole different sort of area of activity, and it can always just access S3.
C: It certainly seems to me like you'd have a hard time winning a race with well-developed infrastructure that does that; for example, with the wasm ecosystem you've got multi-language runtimes and all those things. So probably that would be the way. But it's possible that the wasm work by Ronen's group is useful, and it's possible that people are doing similar things as part of other serverless function suites, and all of those should be able to consume us via S3.
B: Yeah, yeah, I mean, sure, you can put that in front of the RGW and do whatever processing, and if it just speaks S3 at the other end, then it does the same thing. Yeah.
B: I think this is why AWS separated that: so, you know, they'll have better control over how things need to scale. Yeah.
C: I feel differently about the data integration piece, like the Arrow Flight stuff, and I've been pushing for that to be there, I think, for routing pipelines and things like that. The kind of purpose of that is that there's a likely advantage to being close to the storage; at least that's kind of our pitch. But yeah, I'm sort of thinking that the best way to write these things is as some sort of elastic facility that's not part of the S3 service.
B: Yeah, I agree, that makes sense. I mean, what we've seen in Ceph is that doing those small manipulations on the metadata is useful inline.
C: Sorry. Well, I think the idea is that the Spark connectors are evolving interfaces that we'll just implement. So Eric has been working on this; Eric's prototyping is aimed at getting the basic Arrow Flight interface there, which is relatively narrow. It's super generic, what he's doing right now, but the Spark communities are already layering stuff on top of Flight.
C: That is more interesting. Like, there's something called Flight SQL, which has capabilities like a superset of our S3 Select in some ways. And the idea would be that the generic optimization of Flight SQL, integrated into the Spark connector, or into the Spark Catalyst with the connector enabled, would let us just be a first-class provider for data sets, or they could send pushdowns similar to how S3 Select does, and maybe other stuff after that.
C: There's a lot to it, but this is sort of the base level, where we show: okay, we've got an S3, we've got an Arrow Flight there, you can do some simple operations, and we can admit some conventions there. But I think the gold we're after is above that, and it's probably in the Flight SQL space, or some other sub-protocols that end up getting layered on. But there should be free money here.
B: By the way, if we are going to develop the skills to write stuff in the Catalyst, this could be useful for the S3 Select offering as well. Because, at least from what I know from Gal, there are some limitations: some things that we support are not supported by AWS, and the Catalyst implementation is very, very prudent whenever it is not 100% sure that the server is going to ingest it.
B: So that could be very useful. I mean, this would be good because it would demonstrate the real power of S3 Select, because currently, because the client is kind of cautious, quite often they're not pushing down the query; they're just doing everything themselves.
C: Does this apply mostly to the... this is actually, I mean, I don't know which things Gal knows. I mean, we spent a long time sort of fighting with this. In addition to S3 Select... well, S3 Select hasn't seemed to be a big theme in Spark, though for cold data it could always show up there.
C: They have S3 there, you know, but for S3 Select, there are some applications that are starting to use it; it took a long time to get all the way there. One is Trino / PrestoDB.
C: So Gal did work to get PrestoDB to work, although, just as you said, their optimizer isn't Catalyst; but it also tended not to be using us when he expected it to. And then we tried to use Trino, because we had people testing with that, and Trino didn't even have S3 Select working. We've since read a couple of white papers that suggest that some parts of Trino now have it.
C: So each of these application ecosystems has some notion of an optimizer that, as you say, has to decide whether to push down something and whether it would be useful; what Spark calls its Catalyst. So yeah, each of these things has to have somewhere where it's deducing whatever it should, and I don't think this is a well-evolved area in S3 Select, really, at this point. But the problem there might be simpler. I think it's a very important topic for the hot integration over on the Spark side, though: it's going to need some sort of way of inferring when it could be a good idea, and it's going to have to have information from us.
C: This prefetching, you know, this was work that was done by the students at Mass Open Cloud. They have this engine that they moved into Spark, and it included a couple of other things too; one was called Pig, but Spark was one. When Spark queries, or other jobs in Spark, are set up, Spark creates a DAG, a graph description of the work, and at the point where it does that and inserts it into its collective work plans, this glue stole the DAG and sent it over...
C: ...to this prefetching engine, and then that engine would look at all the DAGs that it had seen recently, do a critical-path analysis on them, and decide which objects were going to be needed at what point, and it would use that to prefetch stuff into the cache, and it could use it to discard stuff we weren't going to look at anymore.
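The DAG-driven prefetch idea described here can be sketched with Python's standard graphlib. The job structure and all names are invented, and this shows only the ordering step that a cache would consume, not the cache itself:

```python
# Sketch: given a DAG of job stages, each reading some objects, walk the
# stages in topological (execution) order and emit the objects in the
# order they will be needed, so a cache can prefetch ahead of the job.

from graphlib import TopologicalSorter

def prefetch_order(deps, reads):
    """deps:  stage -> set of stages it depends on
       reads: stage -> list of objects that stage reads"""
    order, seen = [], set()
    for stage in TopologicalSorter(deps).static_order():
        for obj in reads.get(stage, []):
            if obj not in seen:          # each object fetched once
                seen.add(obj)
                order.append(obj)
    return order

deps = {"scan": set(), "join": {"scan"}, "agg": {"join"}}
reads = {"scan": ["s3://b/part1", "s3://b/part2"], "join": ["s3://b/dim"]}
order = prefetch_order(deps, reads)
```

A critical-path analysis, as described in the talk, would additionally weight stages by estimated runtime to decide how far ahead to fetch; the topological order above is the simplest version of "which objects are needed at what point".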