From YouTube: 2020-05-07 :: Ceph Performance Meeting
A
Alright, let's get started. Okay, new PRs. In the last two weeks I only saw two; it's possible I missed one, though. The first is from Adam, and that's for the FIFO type for the datalog. Adam, I very much enjoy your comments; don't ever, ever lose them. I don't actually know what this does, Adam. What's this PR?
B
So we have been under the ruthless domination of omap for quite a while, and we've been trying to use less of it in RGW, because I think we're probably the heaviest, biggest consumer, especially in things that are essentially linear. It doesn't really make sense to use a key-value store to emulate a queue, and so this is basically just implementing a segmented queue on top of RADOS objects.
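The segmented-queue idea can be sketched in a few lines. This is a toy illustration of the concept only, not Ceph's actual cls code; a plain dict stands in for the RADOS objects:

```python
# Toy sketch of a segmented queue: entries append to a tail segment and
# whole segments are dropped on trim, so no key-value store is needed to
# emulate a FIFO. A dict stands in for RADOS objects, keyed by segment
# number; this is an illustration, not Ceph's actual implementation.

class SegmentedQueue:
    def __init__(self, entries_per_segment=2):
        self.entries_per_segment = entries_per_segment
        self.segments = {}   # segment number -> list of entries ("objects")
        self.head = 0        # oldest live segment
        self.tail = 0        # segment currently being appended to

    def push(self, entry):
        seg = self.segments.setdefault(self.tail, [])
        seg.append(entry)
        if len(seg) == self.entries_per_segment:
            self.tail += 1   # next push starts a fresh "object"

    def entries(self):
        out = []
        for n in range(self.head, self.tail + 1):
            out.extend(self.segments.get(n, []))
        return out

    def trim_to(self, segment):
        # Trimming is cheap: delete whole head segments, no per-key deletes.
        while self.head < segment:
            self.segments.pop(self.head, None)
            self.head += 1

q = SegmentedQueue()
for i in range(5):
    q.push(i)
q.trim_to(1)          # drop segment 0 (entries 0 and 1)
print(q.entries())    # [2, 3, 4]
```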
A
All right, also the other big new PR (so I guess we have two big new PRs, wonderful): Sam's initial SeaStore work, and Keith, who has been reviewing that. I don't know what state it's in, but just having it there is a really big milestone; really, really excited to see what we see with that work.
A
All right, well, we had a bunch of stuff closed this week. There was a bug fix for the monitor to make sure that the priority cache manager invocation is also regularly calling balance and tune_memory, to make sure that we're properly releasing freed memory and also regularly rebalancing things as we should be. So that's good; that should improve the memory balancing and memory consumption of the monitor. Kefu had a PR that increased the standard deviation variance in the Crimson...
A
It's the... oh, this one for increasing the 4k min_alloc_size for spinners, from Igor, merged. That was okay to merge now because of his other work merging with the hybrid allocator and also the deferred big writes PR. So that's really good. We'll get lots of testing on all of that over the next couple of months on master.
A
Ma Jianpeng's PR: this doesn't actually change the OSD threads per shard anymore. This one instead fixed a bug where we were not properly sending notifications to all threads to wake up, I believe, if I remember; that's what he changed. In any event it improves performance when you have, like, random read workloads. So that's really good.
A
Adam's sharding PR merged; that's a big one. I think we're still waiting on his tool to be able to convert OSDs from the old format to the new format, but having that PR there means now we can get a couple of other PRs that have been waiting implemented, including the double cache PR fix and also the age-based binning, and then we'll also have to change that tool for doing the OSD conversion.
A
Since the double cache PR will cause an on-disk format change, just like the column family sharding would, hopefully we can roll all that up for Pacific. And the last one that merged is this one from Casey to use iterators for the comparison operators in bufferlist. We had a couple that were updated.
A
Some discussion on my PR for the MDS: you add a new expected-files flag, or directory xattr, rather. Just discussion about how to implement that, and a kind of general feeling on whether this is a good route to go or not. The objecter PR had another review and, I think, another update in the last two weeks. Thank you, Leah. Oh, fantastic, very good.
A
This BlueStore write lock: I think we're all a little scared of it. I think Jianpeng went through and figured out why it was failing in... I forget which tool it was, but there was a failure that I think he diagnosed and figured out. But I think we're all... well, at least personally I'm still a little scared of messing around with the locks in BlueStore at the moment.
A
Okay, and then Igor's memory reduction PR for the onode: that one was rebased and it's now going through the testing branch, so hopefully that gets merged. That's exciting! That's a nice reduction in the onode size, I think about a 20% reduction in memory, right, and also a little bit of a performance increase because of it. And that was it for the last two weeks from what I saw.
C
Yeah, sure, more discussion and a presentation, but I just wanted to share some test results from some experiments I've done. The goal of this was to see how much variation exists in our smithi nodes; particularly, if we run the same performance test on the same smithi machine over and over again, how much variation exists.
C
Yeah, I guess the two metrics that I initially looked at were the IOPS and average client latency across our 10 runs of librbd FIO performance. Overall, in terms of IOPS, I am pretty satisfied; I think the variation is not too high. As you can see, the average varied by about 196; most trials have similar numbers, or acceptably close numbers. For latency I still think there is a bit of variation, especially if you look at trial 9.
C
Comparatively far from the average value, that is. There are a couple of other graphs: on the left there is a plot (thanks to shredder and his visualization tool) where we were able to plot the 95th to 99.99th percentile latency information that FIO collects, and, as you can see, that variation also shows in the 99.99%ile results of trial 9.
A
This looks great. I'm really glad to see that on an individual node we're seeing... I mean, looking at this, the variation looks to me like it's well under 5%, if I'm just running it through my head properly. Yeah, this looks really good. I suspect that if you add the network in we'd see a lot more variation, but that's, you know, a future endeavor; just doing single node, this isn't bad.
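One way to put a number on "well under 5%" is the coefficient of variation (standard deviation over mean) across trials. The IOPS figures below are made up for illustration, not the actual smithi results:

```python
# Coefficient of variation (stddev / mean) puts one number on the
# run-to-run spread of a benchmark. The per-trial IOPS figures are
# invented for illustration only.
from statistics import mean, pstdev

iops_per_trial = [5150, 5230, 5190, 5080, 5210, 5170, 5240, 5120, 4990, 5200]

cv = pstdev(iops_per_trial) / mean(iops_per_trial)
print(f"mean={mean(iops_per_trial):.0f} IOPS, CV={cv:.2%}")
```

A CV comfortably below 0.05 would match the "under 5%" eyeball estimate above.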
A
You know, unfortunately, that's kind of the case: we're at the easy case, right? Single node is where we can, yeah, we can get good stuff like this. The benefit that teuthology brings us is being able to do the big multi-node tests, and that's where we probably need to figure out how we're going to do that effectively. But this is a good start. Yeah.
C
But as long as we have this fine, then that's when we should start looking at the second step, I guess. So I think I'm just going to do a few more experiments, for one on some more machines, and then maybe talk to David Galloway about narrowing down some of these machines and do multi-node testing as well. At this point it's a matter of me just, like, trying to lock smithi machines and, like, run these tests over and over again; it's a pretty manual job at the moment.
A
The reason I asked about it was that the IOPS is around, like, 5,200, and these should be either Intel P3700 NVMe drives or Optane drives, depending on which specific smithi node we're talking about. That number is quite a bit lower than what we can get with a single OSD. Are these writes, or are these reads? Yeah.
A
Yeah, yeah, I suspect that we're probably CPU limited, which means that probably things like RocksDB compaction... the NVMe is probably fast enough to kind of absorb the extra IO workload, but we'll just have extra CPU overhead, which should slow it down. But that might change kind of the variance that we see in the results over time; like, it might actually be smoother, because they're not seeing, you know, the disk backing up at all.
A
NVMe drive... you might actually have a little bit more on smithi, if I'm thinking about that right, because it would be basically, on incerta, five real physical cores at 2.3 GHz per OSD, and on smithi it would be four real physical cores at 3.5 GHz. I think on smithi you actually get a little more.
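The comparison above is just aggregate clock per OSD. Sketching the arithmetic (ignoring IPC and turbo differences, which the speaker also glosses over):

```python
# Back-of-the-envelope CPU budget per OSD from the numbers above:
# aggregate clock (cores x GHz) as a rough proxy for available CPU,
# ignoring IPC and turbo differences between the two machine types.
budgets = {
    "5 x 2.3 GHz": 5 * 2.3,
    "4 x 3.5 GHz": 4 * 3.5,
}
for label, ghz in budgets.items():
    print(f"{label}: {ghz:.1f} aggregate GHz per OSD")
# 14.0 aggregate GHz edges out 11.5, matching the guess that the
# 4-core machine gets a little more per OSD.
```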
A
Remember last summer, when I was doing the cache work, I was doing a lot of testing on the OSDs, and I thought that on incerta I had gotten up to about 35 to 40 thousand 4k random write IOPS with some of that work that was then merged. So if now we're seeing, you know, more like 18 or 19 thousand, we've introduced a regression, so yeah, we should look into that. That would be a good idea. Yeah.
C
You can go there; so that's some early experiments I had done across different smithi machines, and this is... I just saw the IOPS and I got discouraged, and I decided to just stay on one. I was just comparing: it's the same workload profile, but running on different smithi machines, and I just looked at the IOPS; there's a lot of variance.
C
And whatever it is, it's worse than on a single machine, right? So there's, yeah... that's also part of the reason why I'm saying that I want to do the same kind of experiment on a different smithi machine, to just see what those results look like. It's possible those might not be as impressive as the first one. Yes.
A
So this thing right here, the RocksDB log parser: we can run that against the log. When we create the logs, we can even just delete them if they're too big to keep, but they shouldn't be a problem for us, because we should be running at the default log levels anyway. Ooh, yeah, what level are we running at in those tests?
A
Oh, fantastic. So we will want to make sure that we get the RocksDB logs. There's probably really low overhead, so it shouldn't cause a lot of performance impact, but we should get that, and then we can run this thing against it, and this will give us all these statistics about, like, you know, how many compaction events there were, what size they were.
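As a rough sketch of what such a parser pulls out: assuming the structured `EVENT_LOG_v1` JSON lines that recent RocksDB versions write into their LOG file, compaction events can be counted like this. This is an illustration only, not the actual tool, and the sample lines are synthetic:

```python
# Sketch of pulling compaction statistics out of a RocksDB LOG file.
# Assumes the structured "EVENT_LOG_v1 {...}" JSON lines that recent
# RocksDB versions emit; the real parsing tool surely does more.
import json
import re

EVENT_RE = re.compile(r"EVENT_LOG_v1 (\{.*\})")

def compaction_events(log_lines):
    events = []
    for line in log_lines:
        m = EVENT_RE.search(line)
        if not m:
            continue
        ev = json.loads(m.group(1))
        if ev.get("event", "").startswith("compaction"):
            events.append(ev)
    return events

# Synthetic LOG lines for illustration only.
sample = [
    '2020/05/07-12:00:01 7f0 EVENT_LOG_v1 {"event": "compaction_started", "job": 3}',
    '2020/05/07-12:00:02 7f0 some unrelated log line',
    '2020/05/07-12:00:05 7f0 EVENT_LOG_v1 {"event": "compaction_finished", '
    '"job": 3, "total_output_size": 1048576}',
]
evs = compaction_events(sample)
finished = [e for e in evs if e["event"] == "compaction_finished"]
print(len(evs), "compaction events,",
      sum(e["total_output_size"] for e in finished), "bytes out")
```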
A
The OSD logs, for that first one, the RocksDB log parser tool. And then this thing, I think, if I remember (it's been a while since I used it), will just take the output of the perf counter dumps: you give it, like, two files, you know, one from a run, or more files, and then it will take those output dumps and just walk through them, looking at what's different between them, almost kind of like a diff; not exactly, but more or less.
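The "almost like a diff" walk over two counter dumps might look like this minimal sketch. The nested dicts mimic perf-dump style JSON and the counter names are invented for illustration; this is not the actual tool:

```python
# Minimal sketch of the "almost like a diff" idea: walk two counter
# dumps (nested dicts, like perf-dump style JSON) and report which
# numeric values changed. The counter names here are invented.

def diff_dumps(before, after, path=""):
    changes = {}
    for key in sorted(set(before) | set(after)):
        here = f"{path}.{key}" if path else key
        a, b = before.get(key), after.get(key)
        if isinstance(a, dict) and isinstance(b, dict):
            changes.update(diff_dumps(a, b, here))
        elif a != b:
            changes[here] = (a, b)
    return changes

before = {"bluestore": {"kv_flush_lat": 10, "write_big": 3}, "osd": {"op_r": 100}}
after  = {"bluestore": {"kv_flush_lat": 12, "write_big": 3}, "osd": {"op_r": 180}}
print(diff_dumps(before, after))
# {'bluestore.kv_flush_lat': (10, 12), 'osd.op_r': (100, 180)}
```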
G
Okay, my question is: I really want to understand the plan for how we get into the perf CI, because I added it to the perf suite there, to our Jenkins. I don't want to... but as I mentioned last time, I think you said it's just the very beginning of our perf CI. I really want to know what our plan is.
A
So, Kefu, I think... just my understanding is, the idea here is, for Jenkins, we do tests looking at performance regressions of specific PRs, and the goal would be fast-running tests. We iterate through lots of them, and hopefully we can test every single PR that comes in with a performance flag or a label, and have different test suites that run based on the labels, so, like, rgw or...
C
What do you think?
A
There are a couple of advantages to doing so. Then CBT knows about the topology of the disks, so it can do things like... and it controls the OSDs, so it can do things like start them with Valgrind or do some other kind of wallclock-trace type stuff, which maybe we'd want to be able to add as a feature. When a PR is coming in and you see something weird, maybe you want to be able to tell Jenkins, you know: do it this way, run it with...
F
Yes, so what you can do is use local binaries with an existing cephadm deployment, and that needs built packages. Either you just kind of replace the binaries in the container with local development builds (packages or, sorry, binaries) and you can get up and running with the usual cephadm tooling, so you've kind of got another cluster just like a normal user would.
F
He has an initial implementation right now, but it still requires a bunch of copying of binaries around, so it's not ideal from a timing perspective. The approach being worked on is instead to kind of mount your build directory into the container and replace all the binaries with symlinks to the ones in your build directory. Essentially it's trying to avoid any data copying.