From YouTube: Ceph Developer Monthly 2021-09-01
B: So the idea here is that people deploy Ceph in many different kinds of use cases and many different kinds of environments, and our existing defaults are pretty well tuned for high-performance servers.
B: So the idea is that we could have performance profiles applied to the cluster that would affect the defaults for a bunch of the tunables that control how much CPU and memory we consume.
B: Especially for things like RGW and MDS, which are more about passing data through themselves rather than storing it directly.
B: I think there's also a lot of investigation we could do to figure out exactly which things we could tune to get the lowest possible resource usage in scenarios where that's demanded.
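To make the idea concrete, here is a minimal sketch of what such a "resource profile" could look like: a named bundle of config defaults. The option names mirror real Ceph tunables (`osd_memory_target`, `osd_op_num_threads_per_shard`), but the profile names and the specific values are purely illustrative assumptions, not anything decided in this discussion.

```python
# Hypothetical resource profiles: each is just a set of key/value
# overrides laid over the compiled-in defaults.
PROFILES = {
    # Roughly today's defaults, tuned for high-performance servers.
    "default": {
        "osd_memory_target": 4 * 1024**3,        # 4 GiB per OSD
        "osd_op_num_threads_per_shard": 2,
    },
    # A hypothetical low-resource profile for small/edge deployments.
    "minimal": {
        "osd_memory_target": 1536 * 1024**2,     # ~1.5 GiB
        "osd_op_num_threads_per_shard": 1,
    },
}

def profile_overrides(name):
    """Return the key/value pairs a profile would apply on top of defaults."""
    return dict(PROFILES[name])
```

A profile is deliberately nothing more than data, so applying, diffing, or exporting one stays trivial.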
B: How would it act? So you said performance profiles — would there be, like, a "default" one, or a "limited" one? I'm not sure what the name would be for the resource-constrained one.
A: I think there's some valid discussion to be had about what that UI should look like, but...
B: It seems useful to start with two, and if we need more, we can always add them.
B: Yeah, it sounds like a nice simple abstraction to be able to import and export — it's essentially like importing and exporting a ceph.conf file, isn't it? But maybe, yeah.
A: From an implementation standpoint, that would probably look something like: the config now has some kind of a baseline config that lives above the most basic defaults but below the specifically set options — something like that. You'd need a way to dump all this stuff, so there would need to be UI elements for exposing that.
C: Yeah, I mean, the config subsystem already has, I think, like five different ordered sources of a config value, and it tracks where they come from, so you could just sort of insert profiles in there somewhere.
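The layering being described could be sketched like this. The exact set and ordering of sources here ("profile" slotted between the compiled-in defaults and everything explicitly set) is an assumption for illustration — the real config subsystem's layer names and order may differ.

```python
# Ordered config sources, lowest to highest priority. A "profile" layer
# sits just above the compiled-in defaults, so anything explicitly set
# (mon store, environment, command line, runtime override) still wins.
SOURCE_ORDER = ["default", "profile", "mon", "env", "cmdline", "override"]

def resolve(option, layers):
    """Return (value, source) for the highest-priority layer that sets
    the option, mimicking what `config diff` reports per daemon."""
    result = None
    for source in SOURCE_ORDER:              # walk lowest to highest
        if option in layers.get(source, {}):
            result = (layers[source][option], source)
    return result
```

Because each value remembers its source, a `config diff`-style dump falls out for free: show every layer that sets the option and mark the winning one.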
C: We're probably not touching those enough, but yeah — I'm just remembering that the asok command `config diff` on a running daemon will show you all the different values it has in the different layers and which one is the actual one it's using, including the default, the one that came from the command line, from the environment, from the monitor, and so on.
C: Yeah, it just needs to be — I mean, it could even be implemented entirely in the monitor, and then the monitor would feed it through to the daemon, so as far as the daemon is concerned, the options just come down from the monitor. Or you could plumb it so that at the daemon level you can actually see the source.
B: I guess you may have some complexity around daemons that aren't able to talk to the monitor for some reason, just like we do today with — yeah, yeah.
C: Well, if for any given profile you say that it's driven by a named config option, then you could set that config option. So say that the OSD scale, or whatever — the OSD profile — is a set of key-value pairs, and you have all the different values for it. You could define the config option that selects the profile, and then you could set that using the normal hierarchical config machinery.
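That selection mechanism can be sketched in a few lines. The selector option name and profile contents here are hypothetical; the point is only that one ordinary config option picks the profile, and explicitly set options still override it.

```python
def apply_profile(selector_value, profiles, explicit_config):
    """Merge a profile's key/value pairs under any explicitly set
    options. `selector_value` is the value of a (hypothetical) config
    option like 'osd_profile' that names the active profile."""
    merged = dict(profiles.get(selector_value, {}))  # profile layer first
    merged.update(explicit_config)                    # explicit settings win
    return merged

# Example: a tiny hypothetical profile table.
profiles = {"minimal": {"osd_memory_target": 1536 * 1024**2}}
```

Since the selector is itself a normal config option, it inherits the existing hierarchy (global, per-class, per-daemon) with no new machinery.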
C: It might look kind of gross inside — like a big option blob you'd put in config-key or something.
B
It's
with
some
like
some
of
the
pre-existing
ones
already
defined.
B
I
think
today
the
reason
fast
I
think
mark
is
tested
kind
of
like
osu's
scaling
down
memory
like
two
gigs
or
maybe
one
half
gigs,
but
they're,
probably
some
other
things
that
we
can
do
to
make
that
minimum
even
lower.
B: Basically — I'll still take notes here, but we can start describing the next one a little bit. This is about how we could handle pools with no replication better, particularly how to deal with EIOs on the client side — say this kind of pool is accessed by an RBD client.
B: For example, it's got a file system running on top of its local block device, and if the file system — like ext4 or XFS — starts getting I/O errors because the disk backing that pool died, it's not going to behave well: it's going to essentially corrupt its local file system if it gets some EIOs and some successful writes.
B: So the idea is to prevent the client from getting any successful writes or reads as soon as it starts to fail — basically, after the first EIO it gets, it just continuously returns EIO for that device.
B: Until it's unmapped. There are a couple of ways we thought of implementing this. One is based in the OSD map: you could add a pool flag that indicates this pool has failed.
B: That way the OSD would also be able to return the error directly to the client in case the client didn't support this yet, and it would work well with clients in, I think, multiple places, like CephFS.
B: The library — RBD in user space, or the kernel client — would detect an EIO, and after the first time it has to return that to the upper layer, it just continues to do that, keeping track of it in its own memory.
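The client-side "sticky EIO" behavior being described can be sketched as a thin wrapper around a block backend: once any EIO is seen, every later read or write fails until the device is unmapped (i.e., the wrapper is discarded). The class and method names are illustrative, not actual librbd APIs.

```python
import errno

class StickyEIODevice:
    """Once the backing pool returns EIO, keep returning EIO for all
    subsequent I/O, so the file system above never sees a mix of
    failures and late successes."""

    def __init__(self, backend):
        self.backend = backend
        self.failed = False          # remembered in the client's own memory

    def _submit(self, op, *args):
        if self.failed:
            return -errno.EIO        # fail fast, without touching the backend
        result = op(*args)
        if result == -errno.EIO:
            self.failed = True       # first failure makes the state sticky
        return result

    def read(self, off, length):
        return self._submit(self.backend.read, off, length)

    def write(self, off, data):
        return self._submit(self.backend.write, off, data)
```

This is exactly the ordering property discussed below: a well-defined point in time before which I/O can succeed and after which nothing does.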
B: They were designed only for applications that could handle those errors — basically, like librados, like the CLI, or something that expects to be interactive and responsive and doesn't mind if some of the requests it sends out fail. But the file system can't deal with that kind of thing.
B: Yeah — a local disk doesn't usually fail that way, where it gives you some errors back but then completes a bunch of your other writes that are still being written out. So that's essentially why the file systems get corrupted in that kind of situation.
C: The reason I'm partial to the pool EIO flag is that it separates out the implementation mechanics of having a well-defined point in time, after which you return EIO and before which you don't — having that ordering so it behaves the way a disk would if you disconnected the cable or something.
C: The point is that it separates the policy decision of when to return EIO from the mechanics of saying, okay, set the flag in the OSD map, and now we return EIO for that pool. We can make sure the client implementation and the OSD are properly coded so that it behaves in a way that doesn't have weird ordering issues — ops that are resubmitted later or happen to be in flight — we can make sure those issues don't exist, and then separately we can have...
C: Say you have a hung OSD and a bunch of ops that are in flight and hanging, and then you set the flag. Those ops that were already in flight — maybe the OSD actually went down, and the op made it to the OSD and was processed, or maybe it didn't. So I think the implementation that made sense to me, at least, was that for existing ops on the client side we don't do anything, and we wait for the OSD to respond.
A: I think there's a simple way to do this. The pool EIO flag is semantically equivalent to deleting the pool: the OSDs immediately start getting rid of their state, they drop any in-flight ops, period. The client eventually sees the flag, goes "oh, that's why I didn't get any responses," and starts returning EIO.
C: It doesn't have to, though. I was assuming you could remove the flag — like, some time went by and you realize, oh, I didn't actually lose the data, I don't actually want to delete it. Because if that's all you wanted, we could just delete the pool and get what you're talking about, right? We'd just fix the client so that if the pool doesn't exist, you get EIO instead — something like that.
C: And I think in general that is the case. But say you're running your application, you've got the replicas and one of them's down, and you have some policy that says:
C: ...if it's been disconnected for this long, then make it look like the disk died and start returning EIO. And then, you know, 12 hours go by, and the administrator realizes that their system's down because that happened to too many disks, and it turns out we didn't actually lose any data — they were just offline for a long time, and the policy that was triggering this didn't realize it killed two volumes, which broke the upper application. All the data's still there, and so you would...
C: You start getting EIO, but the data doesn't go away, and you can plug it back in. It's that lever that we can then implement some policy on top of, or have a human intervene if they need to.
A: I mean, I personally don't think the semantic difference matters very much, because I think we will almost never use this feature on purpose on a replicated pool. I think 99.999 percent of the time this will be used immediately before deleting a pool, and that's the use case we really care about. So I don't think we want to go out of our way to make this harder just to make replication behave in a slightly less weird fashion.
C: Well, I think there are two things. First, I think I disagree with Sam that this is 99.9 percent "you're going to delete the data" — sometimes that's going to be true, but I think there are going to be lots of cases where the OSD is offline for some reason, and that triggers the policy that makes these disks appear to fail, but the OSD actually isn't failed — it's not because the disk failed.
C: Hopefully — but maybe two hosts are down, and you've got the two disks, and it can't recover and it's stuck. You turn the hosts back on, they come up... I think this might actually need to be coupled with the equivalent of — there's some other annoying OSD flag, like "auto out" — so that if it comes back up, it clears the flag automatically, or maybe not; that's a separate discussion. But we need the ability to plug the disks back in — conceptually, logically, metaphorically plugging them back in.
B: It sounds almost like you get the same semantics either way. The distinction you're drawing — it sounds like Sage is describing being able to recover data in a disaster, when you thought you lost it but you didn't: the application has moved on in some way, but you find you've had these extra copies lying around.
D: I still want to recover something, so I will say, okay, I will do this manually. Say there's a cascading failure like what Sage is describing, where all the nodes — even though they're replica-one pools — suddenly became unavailable. At that point we want to still recover the data. But if it was just a one-off incident that happened and we don't care about the data, that should be, like, a manual step.
A: And the scenario I was imagining wasn't so much an administrator — it was OCS observing that, for whatever reason, a pool marked scratch or something, replica one, became unresponsive, and it automatically took its own steps to increase the replication factor or whatever. And because this is a thing that has to happen automatically, freeing up the space also has to happen automatically.
C: I mean, I think librbd might — yeah, it probably doesn't already do that. XFS, for example, will do that: if it gets any EIO on any of its metadata structures, it goes into this zombie state where you can maybe read stuff or whatever, but it won't write anything else. Whatever it is should do the same thing once it sees the EIO.
C: I don't think so, because again, it might come back — yeah, it might come back, right? You might have done this because the OSD was offline for too long and you hit some timeout, and you're like, oh...
C: ...it's going to fail. But then it turns out that you need the data, and you have the data, and metaphorically you want to plug the disk back in.
C: So I think the takeaway from this is that it requires OSD changes that are relatively well defined and understood. There are some client changes too — again, mirroring what's on the OSD side, for both. That means librados needs to be changed and the kernel client needs to be changed. So if this is going to go in — in order for this to actually be used — we need to get all those changes upstream, into the upstream kernel and into the downstream kernels and all that stuff.
C: So my thinking here is that — yes, I think you generally want to do this with single-PG pools, probably, because having a replica one...
C: PG — yeah, yeah, pg_num will usually be small, so it's not so much a PG-count issue, and then...
C: But the other part is that if we're thinking about these as analogous to disks, then I think there are two ways to do it. We could set it up so these pools actually sort of correspond to disks, and then you have multiple RBDs, maybe, on the same disk — disk-slash-pool — right? So the pool becomes the failure domain, and if there are lots of applications doing this, they would each have their own RBD image in that pool.
C: That would be one way to work it, and if we did that, then the number of pools is on the order of the number of OSDs, so we don't have this OSD map explosion, right?
C: Each of these pools already has to have some goofy CRUSH rule that pins it to a failure domain, so I think you'd basically have a pool per failure domain, whatever it is, yeah. And that could either be — I mean, probably you want to push it all the way down to the disk, don't we?
E: Part of it — but, oh, I just mean that we're going to need to start writing tests for this, the EIO returns and stuff, and that's a little different from a lot of our testing. If we're putting this in Ceph, we need to actually write some purpose-built testing for it.
A: I was going to suggest something clever — if we had a way of limiting the set of PGs an RBD image could map to, then we could do something clever with recreating PGs, but it interacts poorly with this not wanting to permanently delete data concept.
B: Didn't we want the idea of, like, RBD masking for other reasons, though? For like...
A: Oh yeah, the watch/notify thing. I mean, the semantics are still the same: watch returns EIO, or whatever the watch-callback equivalent of that is, yeah. And the client — no, I think it's just that all the watch state is therefore explicitly...
A: I mean, probably we shouldn't make things die any more aggressively than they need to, but for RBD — no RBD user is ever going to continue operating without actually reinitializing RBD, because no one's going to want to build some kind of online-reinitialize setup thing. That's too much work for no reason when you can just unmap and remount.
B: Okay, I think we've covered that one pretty well. Let's go to the — well, the large-scale testing one.
B: So the idea with this one is that there are different ways we can try to test very large-scale clusters without necessarily having the physical hardware. We can't necessarily test the capacity or the performance characteristics exactly the same way without that same number of OSDs, but just having, say, a thousand-ish OSDs on a very, very small number of nodes...
B: ...we can find lots and lots of bugs that we wouldn't detect otherwise without a very large cluster — things in the manager, the monitor, and things to do with cluster-wide operations.
B: So we could use memory-backed OSDs — like memstore — but I think we would quickly run into memory constraints if we want to deploy thousands of these, so I think it might be...
B: ...more efficient to use a little bit of persistent storage on disk for things like the metadata pools and the OSD-internal metadata — PG log, PG info, that kind of thing — so that we can restart the OSDs and test peering and, well, recovery of no data, perhaps. I was talking with Gabi about this earlier, and he was suggesting having the data pools essentially be like /dev/null and /dev/zero: writes go nowhere, and reads return zeros, or some deterministic data based on the hobject.
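The "deterministic data based on the hobject" idea can be sketched as a pure function of the object name, so every read of the same range returns the same bytes without any data ever hitting disk. The function name and the choice of SHA-256 as the generator are illustrative assumptions.

```python
import hashlib

def fake_read(object_name, offset, length):
    """Return deterministic pseudo-data for a fake object: the bytes at
    a given offset are derived only from the object name, so repeated
    reads (and reads after an OSD restart) always agree."""
    out = bytearray()
    block = 0
    # Generate 32-byte blocks until we cover the requested range.
    while len(out) < offset + length:
        seed = f"{object_name}:{block}".encode()
        out += hashlib.sha256(seed).digest()
        block += 1
    return bytes(out[offset:offset + length])
```

Writes would be discarded entirely, while peering, PG logs, and the rest of the metadata path still exercise the real code.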
B: Yeah, yeah — I think to get memory down we'd have to tune RocksDB settings, potentially, to make it not use any disk space for its first level — or make it use all the disk space for its first level or something — so that it doesn't have that giant space amplification.
A: An object-creation flag, or whatever — a hint that says: I assume you're going to lie to me about the contents of this object. And then create a pool flag that says: don't actually store any of the data for the objects in this pool. Yeah, this is a fake pool — the pool will store all of its metadata, it'll do all the usual peering stuff, it's got a PG log, but the actual objects won't have any data; they'll just return zeros.
C: I mean, you could do the same with memstore — you could teach memstore to do the zero detection — and then you could also run a bazillion of those. Although the thing is that you probably want to test things like stopping a bunch of OSDs and starting them up again, all those crashing-type behaviors, and memstore doesn't do that super well; you actually do want to write down everything else.
B: It really depends on the memory — like, if you're talking about, like, 60, like 100 megs...
A
Yeah,
I
think
what
I'm
saying
is
if
the
goal
is
to
test
how
the
monitors,
which
is
what
we're
talking
about
in
this.
C: So I wrote a bunch of fake monitor/OSD things like, I don't know, ten years ago, and they've just languished. I periodically have to touch them if I ever change an interface, and there's all this annoying test code that never gets run that I probably have to go fix. I might have finally ripped it out, but I think the maintenance burden is, like, I don't know...
C: I think 200 megs, whatever it is — and then run two dozen of them on a smithi, and then lock a hundred smithis. That feels like an...
C: I guess the concern I always have with faking things is that you're never sure that you've faked it right. So yeah, you might run this crazy-scale tester and the monitors are great, but they're getting asked questions that aren't realistic, or that have some different behavior that isn't the same, plus...
B: To get this running — and to actually use this with, like, teuthology and deploy it — I guess we need to change...
C
Yeah,
sorry,
I
was
distracted.
That's
what
right
now
that
the
ansible
job
is
the
thing
that
every
every
time
the
machine
is
brought
up
for
a
particular
test.
It
creates
recreates
those
like
four
lvs
and
they're
each
100
gigs
right
now,
and
so
we
could
have
an
alternate
answer
which
or
whatever
it
is
so
we
create.
Instead
of
creating
four
of
them,
we
create
40
that
are
10
gigs
each
as
long
as
bluestorkin,
fine,
so
yeah.
C
C
C
C
If
the
old
ones
are
definitely
400,
gig
or
490,
gig
or
whatever,
because
there
are
only
400
gig
but
whatever
whatever
sets
lower
bound
on
blue
story
is
like
it's
not
like
a
fundamental
thing:
if
there's
some
option
somewhere,
but
it's
for
rocks
to
be
it's
whatever
or
something
like
that,
so
those
can
be
changed
too.
C: And I like the idea of running big teuthology tests, especially now that the scheduler will be able to handle it. Hopefully, yeah.
C: So I think the counter-argument is that maybe these things should be using the new local storage operator thing that we've been talking about, and they shouldn't be using Ceph at all — and obviously that has the benefit that you get better performance; you're not using this complicated stack to do it. I think the counter-argument to that is there are cases where replica one makes sense, and you want the full feature set that Ceph provides — like RBD mirroring, for example. Well...
C: I mean, maybe you can do that with, like, LVM snapshots and mirroring and all that stuff — maybe — but I think there's... if everything else is already using Ceph...
C: Because that's not necessarily enough — it's a policy thing. Maybe that's what the environment, user, administrator, or application wants, and maybe it isn't. So I really feel like we could implement a particular policy in a manager module, and some external thing can implement its own policy and just call, like, `ceph osd pool set <pool> eio` or whatever. This opens up all those possibilities, since we have total flexibility.
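The policy/mechanism split being argued for could be sketched like this: a pure decision function that a mgr module or external tool evaluates, with the mechanism — actually setting the (proposed) pool flag via something like `ceph osd pool set <pool> eio true` — kept entirely separate. The function, field names, and default threshold here are all hypothetical.

```python
def should_set_eio(pool, down_seconds, policy):
    """Pure policy decision: given how long a pool's backing OSD has
    been down and the operator's policy, decide whether to set the
    (proposed) pool EIO flag. The mechanism of setting it lives in the
    OSDMap; this logic could live in a mgr module or any external tool."""
    if not policy.get("auto_eio", False):
        return False                       # operator opted out of automation
    if pool.get("size", 1) != 1:
        return False                       # only unreplicated pools qualify
    return down_seconds >= policy.get("down_threshold_sec", 300)
```

Because the decision is just data in, boolean out, OCS (or anything else) can supply its own policy without Ceph interpreting application-specific details.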
A: What I mean is that there should be business logic that takes the raw information from the current stats and makes a choice based on the set policy. Yeah — that business logic should live in the manager, and not in OCS. OCS should specify the policy, not the implementation, yeah. Otherwise we get into things like OCS interpreting really specific details of, like, Ceph status, which is probably not a good thing.
A: Oh, I know — what I mean is that the time frame is less about Cassandra's expectations of when it fails, and more about how long you should go before OCS should start setting up another replica. It's a replication delay, I think — not the application's internal expectations, or a recovery-point objective, or whatever time you're allowed to be degraded.
C: I think the good news is that — I thought there was some category of device, I don't know, maybe it's not stated, but it's like the non-enterprise whatever, where you can get I/Os that just take a long time before the firmware decides to time them out.
E: Yeah, and Kafka never decides that it's dead because we hang forever — that was the specific example I heard, at least. And so I think Kafka expects us to return an error in a reasonable amount of time.
A: It's not so much application correctness as: how does it behave when a hang happens, and how long are we supposed to wait before spinning up another one? Because with Kubernetes you're going to be setting up a — what do you call it, a ReplicaSet? I forget, but it's like: Kubernetes, please keep N of these alive, basically.