From YouTube: Ceph Code Walkthroughs: SeaStore
Description
Find future Ceph Code Walkthroughs: https://tracker.ceph.com/projects/ceph/wiki/Code_Walkthroughs
We expect to see a lot of cost-focused capacity workloads using only QLC or ZNS flash devices or hard disks. We also want to be able to serve highly performance-oriented workloads using only fast NVMe devices. And finally, I expect that there will be a lot of demand for a combination of high-endurance, low-capacity devices with low-endurance, high-capacity devices. In other words, we'd really like SeaStore to be able to do tiering internally. It would take a lot of pressure off of the design parameters for setting up Ceph clusters, and it would allow people to better combine RGW-type capacity pools and RBD-type throughput-oriented pools without having to be quite so specific with the cluster setup.
So internally to the OSD, there is an interface called the ObjectStore. When I say "the object store" in this talk, that is what I am talking about, not anything else. The ObjectStore is the interface by which the OSD talks to its local storage. It has some properties: it is transactional, and it is a flat object namespace. Object names may be large, because RGW uses them to do direct lookups, so user-side S3 names may translate to ObjectStore object names within the OSD directly.
Each object contains a key-to-value mapping called the omap, as well as a data payload. We use that key-to-value mapping for things like CephFS directories, RGW bucket indexes, and certain kinds of RBD metadata.
It also needs to support copy-on-write object clones to support RADOS snapshots, and therefore CephFS and RBD snapshots. And we need to be able to support efficient, ordered listing of both the omap and object namespaces, for different reasons: the omap because it's part of the interface, and the object namespace because both scrub and recovery rely on being able to efficiently iterate over the set of objects in the same order on all the replicas.
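
To make those properties concrete, here's a minimal C++ sketch of the interface surface being described. The names are illustrative, not the actual declarations from the Ceph tree:

```cpp
// Illustrative sketch only -- not the real ObjectStore declarations.
#include <cstddef>
#include <cstdint>
#include <map>
#include <string>
#include <vector>

struct Transaction;  // a batch of mutations, applied atomically

// An object is a data payload plus a key-to-value "omap".
struct ObjectLike {
  std::vector<uint8_t> data;
  std::map<std::string, std::string> omap;  // sorted: ordered listing
};

class ObjectStoreLike {
public:
  virtual ~ObjectStoreLike() = default;
  // Transactional: every mutation in t commits, or none do.
  virtual int queue_transaction(Transaction& t) = 0;
  // Flat namespace with efficient ordered listing, in the same order on
  // all replicas -- what scrub and recovery rely on.
  virtual std::vector<std::string> list_objects(const std::string& after,
                                                std::size_t max) = 0;
  // Ordered omap listing is part of the interface contract too.
  virtual std::map<std::string, std::string> list_omap(
      const std::string& object, const std::string& after,
      std::size_t max) = 0;
};
```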
The internal B-tree nodes refer to their children by direct location on disk, and we use these two data structures to maintain an LBA indirection, so that the trees on the left side here, the onode tree, the omap B-tree, and the actual extents containing object data, do not use physical offsets. They use logical offsets.
So the last sort of important bit of layout for SeaStore is the way we do journaling. If you're familiar with the old FileStore or, to a lesser extent, BlueStore, the journaling mechanism, I believe, is expressed primarily in terms of the transactions the user submits, but with SeaStore we actually do all consistency at a block level. So SeaStore's consistency semantics are actually totally independent of the ObjectStore interface; they work in terms of blocks.
So if it's a B-tree block, for instance in the LBA tree or the omap tree or whatever, the deltas will look like "insert key", "remove key", that kind of thing, whereas for data blocks it'll just be changes to portions of the data range. After that come the newly written logical and physical blocks, and taken as a unit, the entire record represents one unitary change to SeaStore's committed state, including the changes to the LBA tree required to find the new blocks.
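
As a rough illustration of what a record holds, here's a simplified sketch; the real types and encoding in the SeaStore sources differ in detail:

```cpp
// Illustrative sketch of the record/delta idea, not SeaStore's encoding.
#include <cstdint>
#include <vector>

using laddr_t = uint64_t;  // logical block address

// A delta describes a mutation to an existing block in terms the owning
// extent type knows how to replay, e.g. "insert key" for a B-tree node.
struct delta_t {
  laddr_t target;                // the block being mutated
  uint32_t extent_type;          // which code interprets the payload
  std::vector<uint8_t> payload;  // e.g. encoded {op, key, value}
};

// A freshly written block is journaled in full.
struct fresh_extent_t {
  laddr_t laddr;
  std::vector<uint8_t> data;
};

// Taken as a unit, one record is one atomic change to committed state,
// including the LBA-tree updates needed to find the new blocks.
struct record_t {
  std::vector<delta_t> deltas;
  std::vector<fresh_extent_t> extents;
};
```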
So the architecture of SeaStore looks something like this. Up at the top there is the SeaStore class itself, which implements Crimson's version of the ObjectStore interface. Below this we have the onode manager, the omap manager, the object data handler, and a few other things responsible for dealing with metadata structures specific to the ObjectStore interface.
These all deal in terms of logical addresses and are largely oblivious to the underlying storage. Below that there's a big interface barrier called the TransactionManager. This is the layer that handles transactions and extent reads and writes, and it is capable of placing extents among different backing devices and doing garbage collection and tiering between them.
Below that we have an ExtentPlacementManager, responsible for deciding which device an extent actually goes onto, with AsyncCleaner and device implementations for dealing with the actual devices themselves and maintaining free space. And then we have an LBAManager responsible for the logical-to-physical mapping I described, and a Journal responsible for providing transactional consistency.
So without further ado: my strategy here is going to be to touch on each component, show you the files responsible for it, and then, if anyone has any questions, they can stop me. All right, so the first piece I'm going to touch on is the onode manager.
This is the metadata structure responsible for mapping a ghobject_t, which is the OSD's internal type for an object, to an onode, a name we use because BlueStore used it.
It's a perfectly good name for the metadata structure that is the root for a particular object, although other than that it has no actual relationship to BlueStore's onode: similar concept, totally different implementation.
The order of these elements is, unfortunately, obligatory, because it's how we define ordering. We expect objects listed out of the ObjectStore interface to sort with all of the objects in a particular placement group sorted together, so shard, pool, and key have to come first, followed by the object name and the namespace; and then, because we want snapshots and generations for a particular object to sort together, those have to come last.
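
A minimal sketch of that ordering constraint as a tuple comparison follows; the real type is ghobject_t, which has more fields and subtleties than shown here:

```cpp
// Sketch of the sort order described above; illustrative, not ghobject_t.
#include <cstdint>
#include <string>
#include <tuple>

struct obj_key_t {
  int8_t shard;
  int64_t pool;
  uint32_t hash_key;  // placement-group-derived hash
  std::string name;
  std::string nspace;
  uint64_t snap;
  uint64_t gen;

  // shard/pool/key first so a PG's objects sort together; snap and gen
  // last so all versions of one object sort together.
  bool operator<(const obj_key_t& r) const {
    return std::tie(shard, pool, hash_key, name, nspace, snap, gen) <
           std::tie(r.shard, r.pool, r.hash_key, r.name, r.nspace, r.snap,
                    r.gen);
  }
};
```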
So if you've ever designed a B-tree with long keys, one really common trick is to elide prefixes and suffixes: prefixes when they're common for an entire node, and suffixes when they differ for every element of a node. That's what the FLTree implementation does. As you descend the tree, at the top of the tree you primarily store keys containing just the shard, or the shard and the pool, of the key.
As you get to the middle of the tree, you have to store the actual object name, and then, if any particular object has so many snapshots that it fills up a whole leaf node, that leaf node won't need to store the object name, because we can elide everything to the left of that portion of the key.
Okay, so most things in SeaStore have this sort of interface-and-implementation split, mainly because it makes testing easier. So onode_manager.h contains the interface for any implementation of this metadata structure. It has roughly the methods you would expect; you have a way to check for the presence of an onode, for example.
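
As a rough picture of its shape, here's a hedged sketch; the real onode_manager.h is asynchronous (future-returning) and uses Ceph's own types rather than these illustrative ones:

```cpp
// Illustrative synchronous sketch of an onode-manager-style interface.
#include <memory>
#include <optional>
#include <string>

struct Transaction;
struct Onode;  // per-object metadata root
using OnodeRef = std::shared_ptr<Onode>;

class OnodeManagerLike {
public:
  virtual ~OnodeManagerLike() = default;
  // Check for the presence of an onode without creating one.
  virtual bool contains_onode(Transaction& t, const std::string& oid) = 0;
  // Fetch an existing onode, or nullopt if it is absent.
  virtual std::optional<OnodeRef> get_onode(Transaction& t,
                                            const std::string& oid) = 0;
  // Fetch an onode, creating it if needed.
  virtual OnodeRef get_or_create_onode(Transaction& t,
                                       const std::string& oid) = 0;
};
```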
Let's see, so the next major component up at the top of SeaStore would be the BtreeOMapManager. I mentioned that every object optionally contains this sort of omap structure.
The implementation of that in SeaStore is fairly basic, just string-based, because at this time we're not trying that hard to optimize for RGW; not much effort has gone into improving this implementation yet. Eventually, though, we'll probably want to add things like prefix and suffix elision to it. As with the onode manager, there's an omap_manager.h interface file with pretty much what you'd expect.
There are ways to list omap entries, remove key ranges, clear omaps, etc., with the corresponding implementation in btree_omap_manager.h. One thing I thought might be interesting here is to show you this file, omap_btree_node_impl.h, which contains the actual extent that we store on disk, or rather the data structure that represents it.
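
As a rough sketch of the shape of that interface (the real methods return futures and take the omap root as an argument, elided here; names are illustrative):

```cpp
// Illustrative synchronous sketch of an omap-manager-style interface.
#include <cstddef>
#include <map>
#include <optional>
#include <string>

struct Transaction;

class OMapManagerLike {
public:
  virtual ~OMapManagerLike() = default;
  virtual std::optional<std::string> get_value(Transaction& t,
                                               const std::string& key) = 0;
  virtual void set_key(Transaction& t, const std::string& key,
                       const std::string& value) = 0;
  // Remove every key in [first, last).
  virtual void rm_key_range(Transaction& t, const std::string& first,
                            const std::string& last) = 0;
  // Ordered listing starting after `after`, up to `max` entries.
  virtual std::map<std::string, std::string> list(
      Transaction& t, const std::optional<std::string>& after,
      std::size_t max) = 0;
  // Clear the whole omap.
  virtual void clear(Transaction& t) = 0;
};
```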
This is a bad example, but this inherits from OMapNode, which I believe inherits from CachedExtent. Many of these things, like duplicate_for_write and maybe get_delta_buffer, are methods responsible for hooking into the commit protocol in SeaStore. When we go to do a commit, we go through every extent and we call these methods on it to prepare it for that process and for putting it into the cache.
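
To make those hooks concrete, here's an illustrative sketch of the pattern; the actual CachedExtent in crimson/os/seastore/cached_extent.h is considerably richer:

```cpp
// Sketch of the commit hooks on a cached extent; names are illustrative.
#include <cstdint>
#include <memory>
#include <vector>

struct CachedExtentLike {
  virtual ~CachedExtentLike() = default;

  // Called when a transaction first mutates a clean, cached extent:
  // produce a private copy whose changes stay invisible until commit.
  virtual std::shared_ptr<CachedExtentLike> duplicate_for_write() = 0;

  // At commit time, return the compact encoded delta (e.g. "insert key")
  // accumulated since the duplicate was made, for the journal record.
  virtual std::vector<uint8_t> get_delta() = 0;

  // On replay, apply a journaled delta to bring the buffer up to date.
  virtual void apply_delta(const std::vector<uint8_t>& delta) = 0;
};
```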
The last sort of top-level data structure I'm going to touch on here is the ObjectDataHandler. Because we have this sparse logical-to-physical mapping, we don't need to explicitly represent an extent list or anything like that. Instead, what we do is reserve a large region of the logical address space and simply let the object use it.
Any portions of that space that aren't zero, and that fall within the size of the object, get extents mapped at the corresponding logical address; otherwise the space is simply left unmapped, which represents a missing portion of the object.
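
Here's a small self-contained sketch of reading through such a sparse mapping, where unmapped holes read back as zeros; the types and names are illustrative, not SeaStore's:

```cpp
// Sketch: a sparse read zero-fills any holes in the mapped extent set.
#include <algorithm>
#include <cstddef>
#include <cstdint>
#include <map>
#include <vector>

using laddr_t = uint64_t;
// Mapped extents within the object's reserved logical range.
using extent_map_t = std::map<laddr_t, std::vector<uint8_t>>;

std::vector<uint8_t> sparse_read(const extent_map_t& extents, laddr_t off,
                                 std::size_t len) {
  std::vector<uint8_t> out(len, 0);  // holes stay zero-filled
  for (const auto& [start, data] : extents) {
    laddr_t end = start + data.size();
    if (end <= off || start >= off + len)
      continue;  // no overlap with the requested range
    laddr_t lo = std::max(start, off);
    laddr_t hi = std::min<laddr_t>(end, off + len);
    std::copy(data.begin() + (lo - start), data.begin() + (hi - start),
              out.begin() + (lo - off));
  }
  return out;
}
```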
So we don't need a secondary extent map. That code can be found in object_data_handler.h, with a similar deal up at the top here. Oh yeah, here's a better example: struct ObjectDataBlock inherits from LogicalCachedExtent.
Oh, I have to actually show the tab. There we go. So now that we've gotten through the top-level structures, I'm going to look at read in SeaStore real quick, as soon as I can find the tab that has it. One moment.
Okay, so I mentioned before that Crimson has its own version of the ObjectStore interface, and this is it. It's called FuturizedStore, because it is basically the ObjectStore interface but reinterpreted to use Crimson-style futures. I'm not going to go into this too much, especially the errorator part, but the gist of it is that read, instead of taking a callback, returns a future object to which you can chain callbacks.
So it's sort of a more ergonomic way of dealing with asynchronous I/O. Anything that returns a future is asynchronous, and if you chain a callback onto its return value, it will either execute immediately, if the result is already available, or you will end up returning up to the reactor, and the reactor will run the callback later once the future is available.
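
A minimal sketch of that future-chaining style, using Seastar's API; read_object here is a hypothetical stand-in that happens to be immediately ready:

```cpp
// Requires Seastar; a sketch of chaining rather than passing callbacks.
#include <seastar/core/future.hh>
#include <seastar/core/temporary_buffer.hh>

// Hypothetical asynchronous read, immediately ready for this example.
seastar::future<seastar::temporary_buffer<char>> read_object(
    uint64_t /*off*/, size_t len) {
  return seastar::make_ready_future<seastar::temporary_buffer<char>>(
      seastar::temporary_buffer<char>(len));
}

seastar::future<size_t> read_and_measure(uint64_t off, size_t len) {
  // No callback parameter: chain a continuation on the returned future.
  return read_object(off, len)
      .then([](seastar::temporary_buffer<char> buf) {
        // Runs immediately if the result was ready; otherwise the reactor
        // invokes it later. It looks synchronous, but it is not.
        return buf.size();
      });
}
```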
So these things look synchronous, but they are not. Other than that, the interface looks very much like the ObjectStore interface, for good reasons, and it uses the same transaction structure as the classic implementation. The implementation is partitioned so that each Seastar reactor gets its own local port into SeaStore, so that we can avoid synchronization. Other than that, it's a fairly straightforward implementation of that interface in terms of the components I mentioned before. Internally, SeaStore has a Transaction structure.
That Transaction gets passed around to every component that needs to perform reads or mutations. While there are public methods here, users above the TransactionManager don't really interact with them; you use TransactionManager or other component methods to interact with a Transaction instead. Transaction has two responsibilities: it tracks the extents the transaction has read and mutated.
This avoids the need to do synchronization while constructing the transaction, but it does mean that it's possible a transaction will have to be retried if a concurrent transaction turns out to conflict. That state tracking is pretty similar to the state tracking we already need to do for commit, so it doesn't turn out to be a very large extra bit of overhead.
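
Here's a plain synchronous sketch of that optimistic retry pattern; create_transaction and try_commit are hypothetical stand-ins for SeaStore's future-based machinery:

```cpp
// Sketch of optimistic concurrency: no locks, retry on conflict.
#include <functional>
#include <memory>

struct Transaction { /* tracks the extents read and mutated */ };
enum class commit_result { SUCCESS, CONFLICT };

std::unique_ptr<Transaction> create_transaction() {  // hypothetical
  return std::make_unique<Transaction>();
}
commit_result try_commit(Transaction&) {  // hypothetical: always succeeds
  return commit_result::SUCCESS;
}

void with_transaction(const std::function<void(Transaction&)>& mutate) {
  for (;;) {
    auto t = create_transaction();
    mutate(*t);  // reads and mutations are recorded on the transaction
    if (try_commit(*t) == commit_result::SUCCESS)
      return;
    // A concurrent transaction touched the same extents: discard the
    // projected state and retry.
  }
}
```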
All right, so here's the read method in SeaStore, sort of linking together these concepts. repeat_with_onode here is going to pull out from the onode manager the onode corresponding to this oid, which got passed in from the user, or rather from the OSD, as a ghobject_t, and then it will call this callback with a transaction and the onode.
From the onode we can tell whether it has any data at all, and the reserved data base, which would be the logical address corresponding to offset zero in the object. After that, it pulls from the TransactionManager the set of pins, which would be the set of all mapped extents that fall between the offset and the length of the object, and then it iterates over these extents, ignoring the ones that are zero, and populates the return buffer as needed.
Anyone have any questions so far? I know this is a sort of pointless depiction of SeaStore, I suppose. I don't know if this is linking up.
All right, we'll move on to the next thing then, jumping back to the TransactionManager. The beauty of this abstraction, to the extent that there is one, is that the relatively complex code for especially the onode tree, but also the omap tree and the object data, doesn't need to worry about the actual location on disk, and it can be entirely agnostic to tiering and garbage collection in general.
It also means that we can allow a great deal of transaction concurrency without needing to know ahead of time what will be read or written. So we don't need a sophisticated locking strategy, because most of the time we won't have transaction conflicts in the first place.
Possibly of some interest to Mark, anyway. So the TransactionManager presents a uniform transactional interface for allocating, reading, and mutating logically addressed extents. Mutations to these extents can be expressed as a compact, type-dependent delta, which will be included transparently in the commit journal record.
For example, the BtreeOMapManager is able to represent the insertion of a key into a block by simply encoding the key-value pair, rather than needing to encode a delta moving all of the subsequent keys one position to the right, which would potentially be much larger. The code for this is in transaction_manager.h.
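
As a back-of-the-envelope illustration of why that delta is compact, assuming an encoding along these lines (purely illustrative):

```cpp
// Sketch: an insert is journaled as {op, key, value}, not a node rewrite.
#include <cstddef>
#include <cstdint>
#include <string>

enum class omap_op : uint8_t { INSERT, REMOVE };

struct omap_delta_t {
  omap_op op;
  std::string key;
  std::string value;  // empty for REMOVE
};

// The journaled payload is proportional to key + value size...
std::size_t delta_size(const omap_delta_t& d) {
  return sizeof(d.op) + d.key.size() + d.value.size();
}
// ...whereas rewriting the whole node would cost a full block
// (e.g. 4 KiB) for the same logical change.
```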
The interface looks like a bunch of ways to manipulate logical mappings and to perform reads and writes on extents. get_pins is the method I showed you before, and it simply returns the set of all pinned extents beginning at offset and extending length; you'll notice it's a fairly direct call through to the LBAManager. The editor keeps doing this highlighting. read_extent is also probably fairly useful: we call get_pin to find the pinned extent at offset.
Okay, so for a number of reasons SeaStore has a cache, both for the usual performance reasons but also for correctness ones. While there are transactions in progress, the cache represents a projected outcome. So when you go to commit a transaction, the updated versions of those extents go into the cache, so that pipelined transactions get the correct versions, especially of metadata structures.
It has sort of the usual stuff you'd expect. CachedExtent, the parent interface, has intrusive members that allow relatively memory-inexpensive membership in all of the cache-internal data structures, including the LRU and other things. There is a basic LRU for managing a configurable amount of memory to be kept around, and some logic for forcing things to remain in cache when they need to be, for instance for parts of the LBA manager's B-tree.
It also includes some other things like reference counts, so that extents that are referenced from multiple places in the LBA tree can be tracked and we can release them at the correct time. There's also a backref tree implementation that allows us to do garbage collection more efficiently.
It's structured very similarly to the LBA tree and uses the same B-tree implementation, but the mappings are backwards, from physical addresses to logical addresses. At some point in the future we will also add the ability to do checksums, so that we can get full end-to-end checksumming from the root of the tree all the way down to the leaves; that should be relatively straightforward in SeaStore, since we already do updates that way. That code is in, yeah, this tab.
So, as usual, there's an LBAManager interface that governs how the rest of the code interacts with this. The interface deals mainly in terms of mappings: for mappings we believe already exist, there are getters; if we need to create a new mapping, there's an alloc call, etc.; and there are ways to increment and decrement ref counts.
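
A hedged sketch of the shape of that interface follows; the real lba_manager.h returns futures and pin/mapping objects rather than raw addresses, and these types are illustrative:

```cpp
// Illustrative sketch of an LBA-manager-style interface.
#include <cstdint>
#include <vector>

struct Transaction;
using laddr_t = uint64_t;  // logical address
using paddr_t = uint64_t;  // physical address (device id + offset)

struct mapping_t {
  laddr_t laddr;
  uint32_t len;
  paddr_t paddr;
};

class LBAManagerLike {
public:
  virtual ~LBAManagerLike() = default;
  // Getters for mappings we believe already exist.
  virtual std::vector<mapping_t> get_mappings(Transaction& t, laddr_t off,
                                              uint32_t len) = 0;
  // Allocate a new mapping for a freshly written extent.
  virtual mapping_t alloc_extent(Transaction& t, laddr_t hint,
                                 uint32_t len, paddr_t addr) = 0;
  // Reference counting, so shared extents are freed at the right time.
  virtual void inc_ref(Transaction& t, laddr_t addr) = 0;
  virtual void dec_ref(Transaction& t, laddr_t addr) = 0;
};
```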
The corresponding extent for this is in lba_btree_node.h, and it actually uses this FixedKVInternalNode, which is a templated extent implementation that's shared among all of the B-trees that use a fixed-size-key, fixed-size-value layout kind of mapping. So, basically, just this and the backref tree; but all of the code responsible for splitting and merging nodes, for instance, is in that template. So, some code reuse.
Next is the Journal, which has operations for submitting a record. This is the primary commit pathway: you submit a record, where a record is a struct containing deltas and extents, and the journal then writes it down. The commit protocol is actually agnostic as to the final location of the record, the idea being that if we're using a device with anonymous append, where you don't know ahead of time what the offset will be, the protocol will still work correctly.
Internally to the deltas and records, references to other extents within the same record are relative to the record base. So any time you read an extent, particularly a metadata extent, you have to adjust the in-memory representations of the child pointers to reflect the actual location of the extent.
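
An illustrative sketch of that fixup, assuming a tag bit marks record-relative addresses; the actual paddr_t representation in SeaStore differs:

```cpp
// Sketch: resolve record-base-relative pointers once the record's final
// on-disk location is known (e.g. after an anonymous append completes).
#include <cstdint>

using paddr_t = uint64_t;
constexpr uint64_t RELATIVE_BIT = 1ull << 63;  // illustrative tag only

paddr_t make_absolute(paddr_t stored, paddr_t record_base) {
  if (stored & RELATIVE_BIT)
    return record_base + (stored & ~RELATIVE_BIT);
  return stored;  // already absolute
}
```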
And then there are methods for replay and that sort of thing. There are two implementations of Journal: one for segmented devices, which would be devices where we greatly prefer to do sequential writes (that would be ZNS and QLC devices in general, but also some garden-variety flash devices as well), and then there is a circular bounded journal implementation for very fast NVMe devices, which don't have endurance problems when written to randomly. At the lowest level of SeaStore, most things work this way: there's an RBM journal and a segmented journal.
There are RBM and segmented devices with different update rules, which I'll get to in a moment. Papering over that difference is the ExtentPlacementManager.
So, as I mentioned at the very beginning, part of the core idea with SeaStore is that we want to be able to deal with heterogeneous device configurations.
We want to be able to deal with OSD deployments combining both fast, high-endurance devices and slow, low-endurance devices, which necessarily means that SeaStore needs to be able to manage multiple devices of different performance classes. The ExtentPlacementManager is where all of that business logic lives.
So if you have a segmented device, either as your only device or as a cold tier, then there is a SegmentCleaner that performs garbage collection on the segments within that device to free them up so they can be reused. For fast NVMe devices, where random writes are efficient, there is an RBM cleaner background process instead. And by background process I do not mean a separate thread, and certainly not a separate process; in both of these cases, these are simply callbacks that are called periodically by the reactor.
The journal protocol I mentioned before isn't the only way you can write extents. You're allowed to write extents down before the commit, but they will not be linked into the metadata tree until afterwards; that's this out-of-line (OOL) extent concept.
Like everything else, these things are registered on the transaction, but the ExtentPlacementManager is responsible for threading those writes through the commit process.
Same deal with read: the ExtentPlacementManager knows how to translate these paddr_t's, which contain a device ID, into the correct device. First we get the device ID, and then we use the implementation corresponding to that device ID to do the actual read. There are also some affordances here for doing space management; the ExtentPlacementManager is also responsible for tiering.
At this time it's still a bit primitive, but quite a bit of the underlying implementation is already present: when there are two devices configured, and we do garbage collection, the ExtentPlacementManager is given information about how long ago the extent was written, so that it can make new choices about where to put it. So over time, extents that don't get read or written very often will tend to find themselves demoted to the cold tier.
That is an example of something this component will handle. If we decide we care, and I think we probably do, we will probably need to make changes to the way the cleaner works so that, for instance, when we clean an extent corresponding to an object, we also go and get the other nearby extents and write them all out together.
I don't think that work has been done yet, but this is the component it will live in. I see; we are leaving the doors open.
[Audience] Yeah, actually, supporting spinners in SeaStore was the driving factor behind the question.
Yeah, we do plan on supporting spinners, but it's not the immediate goal, as in the next three months. Basically, as soon as someone's willing to start working on it, it'll become more of a priority. It's just a matter of what things are important, and BlueStore is quite good at spinners, so it's slightly lower priority.
As I mentioned, the SegmentCleaner is the cleaner responsible for doing garbage collection on segmented devices. This is particularly important compared with RBM devices, because segmented devices have this property that you cannot overwrite written extents.
The interface is meant to model the way ZNS devices behave. If you aren't familiar with those devices, ZNS is an NVMe interface modification that restricts your access to the disk to appending sequentially to a large, gigabytes-in-size segment, closing it, and releasing it; releasing it releases the entire segment, and you're not allowed to overwrite portions of the segment.
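
A small sketch of the segment lifecycle being described; the types are illustrative, not the ZNS or SeaStore API:

```cpp
// Sketch: append-only segments, reclaimed only a whole segment at a time.
#include <cstdint>
#include <stdexcept>

struct segment_t {
  uint64_t write_pointer = 0;  // next sequential append offset
  uint64_t capacity;           // gigabytes in size, in practice
  bool open = true;

  explicit segment_t(uint64_t cap) : capacity(cap) {}

  // Only sequential appends; no overwriting earlier bytes.
  uint64_t append(uint64_t len) {
    if (!open || write_pointer + len > capacity)
      throw std::runtime_error("segment closed or full");
    uint64_t off = write_pointer;
    write_pointer += len;
    return off;
  }

  void close() { open = false; }

  // Releasing frees the *entire* segment; any live extents must be moved
  // elsewhere first -- that's the segment cleaner's job.
  void release() {
    write_pointer = 0;
    open = true;
  }
};
```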
So for that reason, before you can release and reuse a segment, you need to move any extents that are still there somewhere else. The intention is that, if you're familiar with how flash actually works internally, there are portions of the device that can't be overwritten without erasing them, so internally a conventional flash device does something like this anyway, but exposes a mutable interface to you.
The cost is that you have this non-deterministic, device-controlled garbage collection process that tends to impact tail latencies. It also tends to produce a lot of extra writes for every write the user does. That was fairly tolerable for three-level-cell (TLC) devices, but it's a problem with QLC devices, because their write endurance is so poor.
Managing it ourselves also means that we get to do the garbage collection when we want to do it, not when the device feels like it, so we have better control over our tail latency. For these and other reasons, even for non-ZNS devices, we'll probably want to use the segmented interface for slower flash, even if it happens not to be ZNS.
So, as I mentioned, the cleaner runs within callbacks that are periodically scheduled within the same reactor. Logical extents are simply remapped within the LBA manager: when we want to move a logical extent around, all we have to do is update the location of the extent in the LBA manager.
The conflict detection mechanism will correctly deal with the case where we're relocating an extent that's being accessed by another transaction, because they'll both touch the same LBA-tree extents during the process, so one or the other will retry in the unlikely event that that happens.
It's also responsible for throttling foreground work based on the amount of pending garbage collection work. We want to avoid running out of space and having to do a bunch of garbage collection before I/O can resume, so instead we inject increasingly long pauses as the device approaches its sort of hard cap on fullness. I expect that heuristic will need to change quite a bit, but that is where we are now.
Anyway, it's in async_cleaner.h. The SegmentCleaner implementation is in here as well, actually; yeah, here it is. You'll notice it inherits from AsyncCleaner, and it implements SegmentProvider, because it's responsible for tracking which portions of the backing segmented devices are free.
Yeah, so if you're interested in contributing: we're at a place now where we have fairly basic multi-core support, although Chunmei is working on completing that, and it should be testable if you're willing to tolerate a certain amount of flakiness. We're particularly interested in random-write and random-read workloads on RBD.
[Audience] Is this already integrated into the Crimson teuthology suite? I believe it is in the Crimson experimental suite, and it does not usually pass; the regular Crimson suite just runs on BlueStore. Chunmei, you can correct me if I'm wrong, but I think all of the SeaStore tests are still in the Crimson experimental one.