From YouTube: 2019-01-02 :: Ceph Developer Monthly
Description
Monthly developer meeting for the coordination of Ceph project development.
http://tracker.ceph.com/projects/ceph/wiki/Planning
B: We'll see. Weren't there a couple of things from before? Are there? Oh... maybe he added them... oh yeah, there we go. Yes.
C: Before we start, yes, I had a quick question for you. Yep: when you're adding a new monitor, I was thinking that you would be able to do it with a minimal ceph.conf, with just the mon entries, like just "mon host" equals the monitor addresses.
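For context, the minimal ceph.conf being described would look something like the following. The addresses are placeholders, and whether "mon host" alone is sufficient depends on the release, so treat this as a sketch rather than a guaranteed-working configuration:

```ini
[global]
; enough for a client to find the monitors; everything else
; can in principle be fetched from the cluster once connected
mon host = 192.168.0.10, 192.168.0.11, 192.168.0.12
```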
C: ...zlog, and then Skyhook; Jeff from Southern Cal, over to you. Okay, so we tried to... we have a lot of use cases, but we tried to boil them down into something pretty easy to digest. I don't know which one is first on the list; I don't have the CDM page open. Is it the PG one? Yeah. Could you also just paste it in the chat? Okay, yeah. So there's a couple of use cases for this.
C: Really, what we'd like to do is define some custom PG-level ops, very much analogous to object class methods, and we'd like them to run with the same atomicity semantics as object operations. (I think that is already the case. Yeah, although I'm not sure.) And then the two use cases we have that are pretty compelling are: we have some cases where we'd like to put a guard in front of object operations, but we'd like to do it scalably, at the pool level.
C: So we don't want to iterate over all the objects to update the value that's being checked on the object I/O path.
C: The simplest case is that we have an application-level epoch value, or version value, that we're controlling, and the app is going to go around and update this value on each PG. But then all of the object operations would do a read and just make sure that their assumptions are correct. It's really a scalable way to do some very lightweight synchronization between clients. I think there's a case in Skyhook where we're interested in doing collection-level locking.
C: But we need to do this... we want to do the coordination through RADOS, by collection. You mean a RADOS pool? Yeah, but it's not necessarily at that granularity; it's arbitrary groupings of objects. So it would be like a table-level lock in Skyhook, but that table spans some number of objects. Yeah.
A: The problem here is that I think using the word "scalable" is misleading. You can get to the point where you basically push the enumeration over objects into the OSD, but it's still O(n) in objects; it's just the OSD that's doing the iteration, and you're not sending everything across the wire.
C: We don't need to change the state of all the objects, so we're not talking about pushing an op to the PG level and then doing a local iteration over the objects. Okay? We'd like to have a value, say a "foo version", in the PG; there's physically only one value of it. And then in the I/O path for an object operation, we would just read this one value from the PG. I see, so just...
A: It sounds like what you want to do is set, say, a pool attribute foo_epoch = 12, and you want to set that sort of globally for the entire dataset, right? And then have RADOS op guards that check that the pool attribute is equal to twelve, or greater than or equal to it, or whatever; do your little comparison operation, and have it sharded by PG. Actually, that seems harder, because then when you set it, you have to set it...
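The guard scheme being discussed can be sketched as follows. This is a toy model, not the librados API; all names (ToyPG, guarded_write, assert_epoch) are illustrative. The point it demonstrates is that bumping the epoch costs O(number of PGs), while the check on the write path is a single read of PG-local state, never an iteration over objects:

```python
# Toy model of a per-PG epoch guard: one value stored per PG, checked on the
# object I/O path. All names are illustrative; this is not a real Ceph API.

class EpochMismatch(Exception):
    pass

class ToyPG:
    def __init__(self):
        self.epoch = 0        # the single per-PG value
        self.objects = {}     # object name -> data

class ToyPool:
    def __init__(self, num_pgs=4):
        self.pgs = [ToyPG() for _ in range(num_pgs)]

    def pg_for(self, name):
        return self.pgs[hash(name) % len(self.pgs)]

    def set_epoch(self, value):
        # O(number of PGs), not O(number of objects).
        for pg in self.pgs:
            pg.epoch = value

    def guarded_write(self, name, data, assert_epoch):
        pg = self.pg_for(name)
        if pg.epoch != assert_epoch:      # guard: one read of PG-local state
            raise EpochMismatch(f"pg epoch {pg.epoch} != {assert_epoch}")
        pg.objects[name] = data           # write proceeds with the check

pool = ToyPool()
pool.set_epoch(12)
pool.guarded_write("obj.0", b"row-data", assert_epoch=12)
stale_rejected = False
try:
    pool.guarded_write("obj.1", b"stale", assert_epoch=11)
except EpochMismatch:
    stale_rejected = True
```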
C: I think, for... I mean, it's unclear.
C: Well, let's... I mean, maybe it would help to separate the two cases. So I think there is a case... in the zlog case, it really is a very tiny amount of metadata per pool, and it's not going to scale up; it's just constant. I think you're right that that might not work for all the different use cases that exist in Skyhook, like where you have one per Postgres table. Right, right.
A: But what would happen in Skyhook if, instead of having one epoch per table, you had a global epoch?
C: If I had one... it would be nice to basically reach for something like watch-notify, where the clients are using a cooperative scheme to coordinate their writes. The problem there is... there's sort of a fundamental issue with watch-notify where, when the client goes away and fails, you have to rely on... you have to wait a certain amount of time for the operations to time out before you go into some recovery mode or something, and that might not be appropriate for a database.
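The failure mode being described can be illustrated with a minimal toy: a cooperative lease-based lock where a dead holder leaves no signal except silence, so other clients must wait out the timeout before recovering. This is a self-contained sketch, not the watch/notify API; the class and parameter names are made up, and the "now" argument exists only to make the example deterministic:

```python
# Toy cooperative lock illustrating the watch/notify failure mode: if the
# holder crashes, everyone else is stuck until its lease ages out.
import time

class CoopLock:
    def __init__(self, timeout=0.05):
        self.holder = None
        self.renewed_at = None
        self.timeout = timeout

    def acquire(self, client, now=None):
        now = time.monotonic() if now is None else now
        if self.holder is not None:
            # The only signal that the holder died is silence, so we must
            # wait until its lease expires before breaking the lock.
            if now - self.renewed_at < self.timeout:
                return False              # still have to wait
            self.holder = None            # recovery: break the stale lock
        self.holder, self.renewed_at = client, now
        return True

lock = CoopLock(timeout=0.05)
assert lock.acquire("client-a", now=0.0)
# client-a crashes without releasing; client-b is stuck until the timeout:
assert not lock.acquire("client-b", now=0.01)
assert lock.acquire("client-b", now=0.06)  # only after the lease expires
```

For a database-style workload, that forced waiting period on every holder failure is exactly the latency cost the speaker is objecting to.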
C: But fundamentally the objects are storing relational rows, and you end up in this situation where you actually need a lot more intelligence at the storage level than just reads and writes. We're going to be controlling things like transactional consistency, so there's a lot of metadata flying around that we need to track, and some of that metadata is harder to scale than the rest. So there are things like table-level mapping, which is a coarse-grained solution, but it just gets more complicated.
D: The second stage, asking for a list of all the objects that satisfy some query condition, is something I'm a lot more interested in than we've been comfortable with in the past. In fact, we actually have something sort of like this: there's a PGLS filter. Mm-hmm.
D
If
you
guys
have
seen
this
and
I,
don't
remember
what
all
it
matches
with
it
might
only
be
named
based,
but
I
know,
rgw
uses
that
for
doing
some
searches
and
the
file
system
does
for
what
it's
looking
for.
Think
directories
to
do,
effort
to
do
it's
backwards,
grub
and
repair
stuff.
A: But it's enumerating objects and then checking each object; it's not actually an index. Yeah.
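The distinction just made can be shown in a few lines. A PGLS-style filter saves wire traffic by evaluating the predicate at the OSD, but it still examines every object; an index would not. This is a toy illustration with made-up names, not the actual PGLS interface:

```python
# A PGLS-style filter still enumerates every object; the OSD just does the
# checking locally instead of shipping everything to the client.
objects = {f"obj.{i}": {"size": i} for i in range(100)}

def pgls_filter(pred):
    """Enumerate all objects and test each one: O(all objects)."""
    checked = 0
    out = []
    for name, meta in objects.items():
        checked += 1                 # every object is touched...
        if pred(meta):
            out.append(name)         # ...even though few match
    return out, checked

matches, checked = pgls_filter(lambda m: m["size"] >= 95)
```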
A: Where does it live? It doesn't live inside the PG, because PGs split and merge, and you don't want to have to rewrite and re-shard the index; in fact, in general, that's probably... yeah, that's not an efficient thing to do.
A: But that means that we're basically just pushing all the atomicity and transactional stuff down into RocksDB, which works fine today... you think it works fine today... because we have RocksDB underneath, and it, you know, has its own big global lock, and we aren't hitting it quite hard enough that it has mattered (or maybe we could argue about that). But this will have the same issues, basically, in that you have independent operations on different PGs, on different objects, that both have to atomically update this shared index structure that's global to the OSD sitting underneath. And so that might cause more contention, even in today's OSD. But it makes life really hard when you start thinking about the future Crimson OSD, where we want to be much more explicit and much more strict about the sharding, so that we can shard more aggressively across the PGs. Mm-hm.
A
You
were
to
do
this
by
actually
having
/
PG
indexes.
That
means
that
when
you
merge
to
VG's,
you
have
to
do
this,
they
merge
of
these
two
structures.
That's
gonna,
be
a
big
latency,
blip
or
any
split.
You
have
to
actually
the
same
thing
and
so
far
we've
done
a
pretty
good
job
of
making
split
and
merge.
Not
do
that.
The
exception
emerged
has
a
bit
of
a
flip,
but
it's
it's
rare,
but
it's
it's
yeah.
It's
not
a
it's,
not
a
price
on
an
order.
A
Mean
we
could,
if
you
words,
if,
if
we
knew
that
there
was
like
a
maximum
degree
of
charting
in
the
pool,
for
example,
we
could
just
like
pre
shard
the
index
so
that,
if
you
have
one
PG,
you
actually
have
a
hundred
little
indexes
to
update
and
you
any
particular
object
is
calling
it
a
modify
one
of
those
charts
and
when
you
do
a
query,
you
sort
of
walk
across
them
or
if
you
had
an
index
structure
that
sort
of
naturally
lended
itself
that
being
restarted
at
the
same
I.
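The pre-sharding idea can be sketched as follows. The shard count is fixed up front, so an insert touches exactly one shard, a query walks all shards, and a PG split or merge only regroups whole shard ids instead of rewriting index contents. All names and the shard count of 100 are illustrative assumptions, not anything from the Ceph codebase:

```python
# Sketch of a pre-sharded index: fix the shard count up front so PG split
# or merge moves whole shards around instead of rewriting the index.
SHARDS = 100

def shard_of(key):
    return hash(key) % SHARDS

class PreShardedIndex:
    def __init__(self):
        self.shards = [dict() for _ in range(SHARDS)]  # shard -> {key: value}

    def insert(self, key, value):
        self.shards[shard_of(key)][key] = value  # touches exactly one shard

    def query(self, pred):
        # A query walks across the shards.
        return [k for s in self.shards for k, v in s.items() if pred(v)]

def split_pg(shard_ids):
    # Splitting a PG in two: each child takes half the shard *ids*;
    # no shard is rewritten, so no latency blip.
    mid = len(shard_ids) // 2
    return shard_ids[:mid], shard_ids[mid:]

idx = PreShardedIndex()
for i in range(10):
    idx.insert(f"obj.{i}", i)
left, right = split_pg(list(range(SHARDS)))
```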
D: I think you want to think real hard about what exactly you're trying to push into the Ceph cluster versus what you're doing on your own. Because whether you're talking about PG metadata or PG-level indexes, you're basically asking for cross-object atomic operations, and you're saying: oh, it's on the PG, so it's free. But that's just not as true as you might like it to be.
D: Maybe try to figure out what kinds of operations you need to do, and if we can break it into smaller pieces, maybe we can make it happen. Well, yeah: we don't have anything that really works at a PG level that's accessible to clients, but maybe we could come up with something where we say: okay, we want a PG op, and we know what versions of these ten objects we're working on, so run a thing on these ten objects and store the result in this other object that's also there. I don't know... I haven't thought about whether this works; I'm not saying that it will work, but it's a lot more plausible than what I've heard so far.
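The hypothetical PG op just floated can be made concrete with a toy: the client names the object versions its computation was based on, the op runs only if none of them have moved, and the result lands in another object in the same PG. This is purely illustrative; none of these names exist in Ceph:

```python
# Toy version of the proposed PG op: version-guarded computation over a set
# of objects, with the result stored in a sibling object in the same PG.
class VersionMismatch(Exception):
    pass

class ToyPG:
    def __init__(self):
        self.data = {}       # name -> value
        self.version = {}    # name -> int, bumped on every write

    def write(self, name, value):
        self.data[name] = value
        self.version[name] = self.version.get(name, 0) + 1

    def pg_op(self, expected, fn, out_name):
        # expected: {name: version the client saw when it planned the op}
        for name, ver in expected.items():
            if self.version.get(name) != ver:
                raise VersionMismatch(name)   # an input moved; reject
        result = fn({n: self.data[n] for n in expected})
        self.write(out_name, result)          # result lands in the same PG

pg = ToyPG()
for i in range(10):
    pg.write(f"in.{i}", i)
pg.pg_op({f"in.{i}": 1 for i in range(10)},
         lambda objs: sum(objs.values()),
         "out.sum")
```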
D
Okay,
say
just
being
really
nicely
talked
about
the
snapper
just
like
okay,
so
we
have
a
snapper
that
sort
of
is
like
this,
but
like
the
degenerate
case,
it's
like
it's
its
stores
of
religious
virgins
for
objects,
but
only
when
those
virgins
exist
and
join
us
all.
Anyone
yeah
the
entire
Rados
codebase
not
just
like
a
set
of
random
data,
and
it
works
pretty
well
now,
but
it
didn't.
A
It's
it's
it's
hard
because
I
can
see.
A
million
uses
for
a
reverse
index
would
be
a
pretty
powerful
thing.
At
the
same
time,
we've
managed
to
build
a
bunch
of
like
robust,
complete
stuff
on
top
of
the
sort
of
the
more
primitive
stuff
that
readers
does
today,
and
so
it's
it's
hard
to
I.
Don't
know
it's
hard
to
argue
that
we
need
to
make
it
more
complicated
because
it's
already
very
complicated
and
the
challenge
yeah.
D: No, this would... I mean, this would have to be a thing that runs on a PG, and if it's doing backfill or something, it just blocks until it has all the objects; but not across PGs. And I think we can make that work more easily, because we can just check them against their data and say yes, it matches, and we're not backfilling; you know, we know which PG this lives in, so we're good. It's just split and merge that's the thing that...
C
Okay,
let's
play
bridge
Oh,
consider
that
yeah,
probably
okay,
all
right
all
right!
Well,
I,
don't
want
to
do
all
on
this
room
too
long.
I
think
that
was
good
though
so
the
next
one
is
pretty
simple.
I
think
there's
a
we
have
some
use
cases
for
allowing
some
data
to
be
returned
when
OSD,
ops,
mutate.
A: Hammer? Jewel? It was in Jewel, I think... I can't recall, it was a while ago. Okay, so, yes, pre-Luminous I think it was, anyway. So it goes in the PG log; yeah, that's where the error code is recorded, because the writes have to be idempotent. So if you resend an op, it's the PG log, or the adjacent structure, that's consulted to see if that operation is a replay, and if so, then it returns the same...
A: ...the same result that it recorded. So it's pretty easy to extend that to include a payload; it just means that the PG log has the potential to get bigger, by about 16 bytes. I'm not too concerned, because PG log entries already have a bunch of other stuff; erasure-coded pools, for example, keep a copy of all the attributes for the object, for every time you write something like that, or maybe just the changed ones, I can't remember, but it's on that order. Or at least it did in the old code; maybe with the new EC overwrite stuff we don't need that anymore, but previously we did. In any case, yeah, I think it would be reasonable to include it there, as long as it's bounded, and yeah, 16 bytes sounds reasonable enough.
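The replay mechanism described here can be sketched in miniature: per-request log entries record the return code (and, under this proposal, a small bounded payload), so a resent write consults the log and returns the original result instead of re-executing. This is an illustration of the idea only; the names and the request-id format are made up:

```python
# Toy replay log: a resent mutation returns the recorded (rc, payload)
# instead of re-executing, which is what makes writes idempotent.
MAX_PAYLOAD = 16  # bytes; the bound discussed above

class ToyOSD:
    def __init__(self):
        self.store = {}
        self.pg_log = {}   # reqid -> (rc, payload)

    def mutate(self, reqid, name, value, payload=b""):
        if reqid in self.pg_log:          # replay: consult the log, don't redo
            return self.pg_log[reqid]
        assert len(payload) <= MAX_PAYLOAD
        self.store[name] = value
        entry = (0, payload)              # rc 0 plus the small result payload
        self.pg_log[reqid] = entry
        return entry

osd = ToyOSD()
first = osd.mutate("client1:42", "obj", b"v1", payload=b"\x01" * 8)
osd.store["obj"] = b"changed-elsewhere"   # a later write lands
replay = osd.mutate("client1:42", "obj", b"v1", payload=b"\x01" * 8)
# The replay returns the recorded result and does not clobber the object.
```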
A: ...for the last one, or the one that failed, and not for the intermediate ones. And there are some tests that do writes and look at the return values for the individual operations; right now there are unit tests that fail on this. If you have an ill-timed reset, we're asked to resend that op, it hits the replay path, and whatever it returns as a return value for the intermediate operations doesn't match. Mm-hmm. And it's the same basic problem.
A
You
get
at
50,
not
just
16
bytes
worth
yeah
I'm
practice.
You
don't
really
actually
return
those
right,
yeah
I,
don't
know
if
anything
really
needs
it.
So
it's
like.
Maybe
we
should
care
to
that,
but
but
it
does
exist
and
there
is
a
you
know:
a
functional
test
that
tests
that,
in
the
liberators
test,
suite
or
whatever.
C
I
think
there's
I
think
there's
well
so,
for
my
specific
use
case,
I
want
a
64-bit
number
plus
some
metadata
and
that
to
fit
in
sixteen
bytes,
but
16.
Bytes
also
seems
like
enough
that
if
you
really
wanted
to
return
a
larger
amount
of
data
that
was
atomically
consistent
with
an
update,
you
could
stash
it
in
the
object
and
just
return.
Some
sort
of
unique
identifier.
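The stash-and-return-an-identifier idea can be sketched like this: the op writes the large result into the object alongside the update and returns only a 16-byte handle, keeping the replayable payload bounded. This is an illustration with invented names, not a real librados interface:

```python
# Stash a large, atomically-consistent result in the object itself and
# return only a fixed 16-byte handle to it.
import hashlib

object_store = {}   # (object, handle) -> stashed large payload

def mutate_with_large_result(obj, large_result):
    # A content hash makes a convenient deterministic 16-byte identifier.
    handle = hashlib.blake2b(large_result, digest_size=16).digest()
    object_store[(obj, handle)] = large_result  # stashed with the update
    return handle                               # only 16 bytes travel back

h = mutate_with_large_result("obj.0", b"x" * 4096)
fetched = object_store[("obj.0", h)]            # follow-up read by handle
```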
A: I mean, this could even be something that, like you said, is a pool property or something, right? With this pool, operations are allowed this much write payload. Sure. And the main purpose of it is just to prevent a poorly behaved operation from putting, like, two megs in the bufferlist, returning it, and blowing up. Yeah.
A: ...currently, right now, we toss it on the floor. You could just keep it, capped at whatever the max size is. That's right; it seems like all of the plumbing is already there. Yeah, yeah, except for storing it in the PG log entry. And at the same time that we fetch the return code when we're doing a replay, we would also fetch that. It would be easy enough.
A
Made
sense
at
the
time
because
or
reads
it
made,
it
made
sense
for
reads
and
I
think
it
reads
and
writes
are
sort
of
similar,
and
so
the
writes
got
them
for
free
without
really
thinking
about
it,
but
the
way
that
the
ops
are
set
up
for
each
up.
You
have
an
input
and
output
buffer
and
a
return
value,
and
you
like
to
have
n
in
them
in
your
transaction,
yeah,
so
I
think
for
reads:
it's
generally
useful
right.
A: In practice, they're usually always zero, so we could fix that bug just by storing only the non-zero ones. And yes, if you put a bazillion write ops that return non-zero return codes in the same transaction, then you'd have big PG log entries, but maybe that doesn't matter, I don't know. Or we could just make them non-existent for write operations. I...
D: From memory, it was something else, something about being able to build the transaction efficiently, where the return value was set late; but that doesn't make sense either, because it doesn't seem like it would be hard. Maybe it was a problem that got fixed up as a result of the earlier work, I don't know. But people have tried to do this before and not been successful.