Ceph CDS G/H, 25 Jun 2014

From YouTube: CDS G/H (Day 2) - RBD: testing overview & strategy

Description

https://wiki.ceph.com/Planning/CDS/CDS_Giant_and_Hammer_(Jun_2014)

25 June 2014
Ceph Developer Summit G/H
Day 2
RBD: testing overview & strategy

A

Yes, here, yeah.

B

Right, the next session here is the RBD testing overview and strategy. This was something that Josh and a couple of other people asked for. They thought it might be a good idea to give a rundown of some of the different testing strategies that are going to go into RBD, given some of the things that are coming up in the roadmap. So Josh, do you want to give an overview of where the thinking on this was? I know it was coming off of an email chain, so yeah.

C

So, um, I guess the general thing here is that we have a lot of tests for userspace RBD

C

across, um, different things, and a few for kernel RBD. They're kind of using multiple different frameworks right now, and a lot of that could be unified more, to take advantage of running the same tests against both kernel space and user space. And just thinking in general about what kinds of tests we have right now, what kinds of tests we don't have or are missing, and what things we want to think about with testing the new features we're talking about that are coming up. So Ilya

C

had some ideas about, um, how to do the kernel space testing using the same framework. Ilya, do you want to talk a little bit about that?

A

Well, the general thing was that, like you said, we were lacking tests, and I was thinking about something like fsx, which now has the kernel mode and the normal mode, which uses librbd. Essentially it runs the same set of operations against either kernel RBD or librbd, whichever you choose. And I was thinking we should have something like that for the librbd tests, at least the I/O tests, so everything snapshot,

A

cloning, and striping related should go into such a framework. That would be a prerequisite for actually releasing the fancy striping support in the kernel, because otherwise there's really no.

A

The one-shot testing during development that I would do is just not sufficient. We want to have it continuous.

D

Do you mean just, instead of fsx, a set of functional tests similar to the tests for librbd, that does, like, create an image, delete an image, create an image, write this thing, read it back, those sorts of unit-type tests? Oh yeah.

A

I know you but I.

A

It's not the create. I was thinking more of the corner cases in cloning, overlap, etc. In particular, in the librbd tests, for example, there is a specific test that tests the overlap thing, and the kernel had a bug for almost a year, actually over a year, where the overlap was actually botched.

A

So I was thinking of writing something similar to fsx, to have at least those sorts of I/O tests maintained in one place, so that basically the same set of tests can be run against librbd and kernel RBD in a similar manner to fsx. fsx has been a great help, we found lots of bugs, and it suggests that this approach will benefit us in the long run.

C

So I think a lot of those tests we have for librbd right now are actually in the Python tests. There's a bunch of tests for specific cloning corner cases, discard corner cases, and snapshot and resize corner cases too in there.

C

What do you think about like making that kind of a.

A

Yeah, then we can make Python bindings for libkrbd and have some sort of test driver, and basically do the same thing that I did for fsx: abstract out the map/unmap sequences and introduce Python bindings for libkrbd. Actually make it a library, because right now it's just a static set of functions. I think that will do it.

A

I'm not so much interested in, you know, the librbd-specific stuff, because obviously that's, oh yeah.

D

The I/O path, that's what's important, yeah.

A

The I/O path. If there is something else we can always bring it in later, but the core is the I/O, and that would be a blocker for merging the fancy striping into the kernel.

C

The only other thing that might be part of some of those tests that's a little bit librbd-specific is changing which snapshot you're reading from, or currently writing to, while the image is open. But we could probably restructure those tests to avoid that for the kernel tests.

D

The Python tests do that, you mean? Yeah.

C

OK, a few of them do, just to make sure that that actually works with, like, caching enabled.

D

So I mean, would we actually wrap the librbd Python bindings, or would there be a parallel set of librbd-like bindings that actually do a map, open a kernel device, and implement a subset of the functions, and then you would just use that instead?

A

I guess, yes, we would have to do that too. The test will have to call some generic function, say write, and then, depending on the switch that has been supplied, it either calls the librbd write or does a pwrite into the mapped block device. And the same thing for mapping: for librbd, mapping is almost a no-op, and for the kernel we'll have to do all the krbd stuff.
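A minimal sketch of that switch, assuming a hypothetical two-backend layout: the class names, `open_backend`, and `overwrite_test` are invented for illustration and are not part of the Ceph test suite. The user-space side is an in-memory stand-in rather than a real librbd image, and the kernel side simply does `pwrite`/`pread` on whatever path it is given (a mapped device node in practice, a regular file here).

```python
import os


class MemoryBackend:
    """Stand-in for a librbd-style image: read/write by offset, in memory."""

    def __init__(self, size):
        self.buf = bytearray(size)

    def write(self, data, offset):
        self.buf[offset:offset + len(data)] = data

    def read(self, offset, length):
        return bytes(self.buf[offset:offset + length])

    def close(self):
        pass  # nothing to unmap for the user-space path


class BlockDeviceBackend:
    """Kernel-style path: pwrite/pread against an already mapped device."""

    def __init__(self, path):
        self.fd = os.open(path, os.O_RDWR)

    def write(self, data, offset):
        os.pwrite(self.fd, data, offset)

    def read(self, offset, length):
        return os.pread(self.fd, length, offset)

    def close(self):
        os.close(self.fd)


def open_backend(mode, size, path=None):
    """Pick the implementation from a switch, as a test driver might."""
    if mode == "user":
        return MemoryBackend(size)
    if mode == "kernel":
        return BlockDeviceBackend(path)
    raise ValueError("unknown mode: %s" % mode)


def overwrite_test(backend):
    """One test body, run unchanged against either backend."""
    backend.write(b"a" * 8, 0)
    backend.write(b"b" * 4, 2)
    return backend.read(0, 8)
```

Keeping the method names identical on both classes is what lets the test body stay the same; only the code that opens the backend knows which mode is in use.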

C

I think the simplest way to not change the tests would be to implement the same interface as the Python bindings for librbd, which you can implement by using libkrbd, and I think that'd be pretty doable.

C

But I don't see any reason to make it a different interface, really.

D

So I mean, that'll cover the specific test cases that the developer sort of dreams up as being potentially problematic and wanting to validate. fsx is nice because it just throws sort of random stuff and also validates the results, right? But it's also somewhat limiting in that, you know, there's only ever a single I/O in flight at a time, and it's generally only using a small piece of the image, or.

D

It actually keeps an in-memory map of everything that it's written, and that's what it's

D

validating against. I'm wondering if there are other stress test tools out there that are also validating correctness that we might look at. Yeah.

C

I guess the biggest one that we have been using is actually xfstests, um, which do have a lot of different stress-type tests in them. Yeah.

C

And currently those are run with different versions for kernel versus user space, and running in different environments. It may be nice to unify those more, so that we can run the same set of tests against both there.

C

I think there may be some things that aren't tested as well by that, which would be things like, um, snapshots or discard.

A

Yeah, well, despite all the good things that xfstests do, in my experience at least, they turn out to be much less useful than fsx and some set of targeted tests, because even if you get a failure, you generally can't reproduce it. So you end up chasing it, and sometimes you don't even know which test in particular caused the failure. So, for example, the nesting bug that I was fixing back in January or whenever it was.

A

It was just pure coincidence that a particular set of tests, like 200-something and then 100-something, in a particular ordering, was causing the bug to trigger somewhat reliably. So something fsx-like, which actually has a reproducible seed and does a reproducible sequence of operations, is much more useful, yeah.
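The reproducibility point can be sketched as follows. This is not fsx itself, only a toy in the same spirit: `run_ops` is a hypothetical helper, the device is any bytearray-like object standing in for the image, and the in-memory `shadow` buffer plays the role of the record of everything written. Because every choice comes from a generator seeded up front, rerunning with the same seed replays the same operations.

```python
import random


def run_ops(device, size, seed, num_ops, max_len=64):
    """Apply a seed-determined sequence of writes and verified reads.

    Returns the list of operations so a failing run can be compared
    against a rerun with the same seed.
    """
    rng = random.Random(seed)
    shadow = bytearray(size)  # in-memory copy of what should be on the device
    log = []
    for i in range(num_ops):
        offset = rng.randrange(size)
        length = rng.randint(1, min(max_len, size - offset))
        if rng.random() < 0.5:
            data = bytes(rng.randrange(256) for _ in range(length))
            device[offset:offset + length] = data
            shadow[offset:offset + length] = data
            log.append(("write", offset, length))
        else:
            got = bytes(device[offset:offset + length])
            want = bytes(shadow[offset:offset + length])
            if got != want:
                raise AssertionError(
                    "op %d (seed %d): read mismatch at %d+%d"
                    % (i, seed, offset, length))
            log.append(("read", offset, length))
    return log
```

A failure message that carries the seed and the operation index is enough to rerun exactly the sequence that triggered it, which is the property the ordering-dependent failure described above lacked.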

A

So perhaps in the long run we can extend fsx to do something more parallel. So in the short term, we need to generalize the Python tests for fancy striping. In the long term, we can look into parallelizing fsx to do more of what we need it to do, rather than relying just on xfstests or even finding some other fancy testing tool. The scenarios that fsx tests are the most useful, and I think that would provide the most benefit.

D

There is a version of fsx that uses MPI that we run in the file system test suite. It's like MPI fsx. I've never looked at it closely, but presumably it's using a bunch of clients and running fsx, but the backend driver is doing the writes and reads on different clients, so it's verifying sort of the coherency between clients. I wonder if it might be worth doing on top of kernel RBD.

D

You'd have multiple kernels mapping the same image and then running fsx across them.

D

I don't know if that's gonna, I mean, we don't really do any caching, so it shouldn't turn up much. But who knows.

C

Yeah, maybe that'd be something good to run, to verify that it does work for shared disk scenarios. That's something that we don't have tests for right now.

D

Um, that clearly wouldn't work with librbd when caching is enabled, right.

C

So what about some of these new features that we're thinking about? For the bitmap stuff, it seems like it's going to be pretty straightforward. Just add that as an option at create time, much like how we treated format 2 images, where you can run the same tests but enable or disable the bitmaps based on an environment variable or something like that, since they don't actually affect the operation. They're just an optimization.

C

I'm sure we'll want to have some targeted tests around those too, though, um, probably around things like snapshots, making sure that the bitmap associated with the snapshot is consistent.

D

I wonder if we need

D

like, tests where we're forcibly killing clients.

C

Yeah I think we'll definitely want that for the mirroring.

C

So we can verify that journal replay is working correctly and failover works correctly in that kind of case.

D

I'm getting tired, I can't remember what I was thinking. Oh.

D

Yeah, having sets of tests that are doing fencing and shutting down clients and taking over locks, fencing the old client and then picking up where it left off. I think that's probably part of a larger strategy to sort of make better use of the locking, because right now OpenStack isn't using it at all, right?

D

And the other sort of cloud things. It seems like it could be a useful exercise to figure out, to have a good, um, you know, example user or whatever of the locks that does it correctly,

D

to say, this is how you.

C

Yeah, I think we have very basic tests for those right now, but they're not doing anything complicated. And for OpenStack it doesn't matter, since it's guaranteeing there's only one accessor by using Cinder's database.

D

OK, I see. Maybe the iSCSI gateways, like if we do iSCSI gateways with multipath and some, like, HA thing on the gateways, because they'll need to do fencing in order to take over, yeah.

C

Hubby mythology.

D

That's popular place to interesting.

C

Most applicable one burn.

D

Yeah, I think that still needs sort of a focused effort around drawing up exactly what the, quote-unquote, right way to do the gateways is.

D

And what the performance is going to look like.

C

What about, um, journaling and mirroring?

D

Hmm. There's gonna be a whole lot of testing around that, I think, yeah.

C

At the most basic level we want to test that, you know, failover actually works and journal replay is consistent and idempotent.

D

I wonder if you could, like, if you're doing writes to the journal and async write-back from the journal back into the image,

D

if the shim underneath fsx could just periodically kill the RBD client, instantiate a new one, trigger a replay, and continue, similar to how it's now doing clones underneath and flipping over to the new clone. So it would actually just be like restarting underneath the covers, but making sure that it's still getting correct data back. That would be sort of an easy validation, just of the journal replay component, to make sure that the journaling is correct on the source.
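A toy model of that kill-and-replay shim, under loudly stated assumptions: `JournaledImage` and `run_with_kills` are hypothetical names, the journal is a plain Python list that is never trimmed, and nothing here reflects how RBD journaling is or will be implemented. It only shows the shape of the check: writes land in the journal first, a random amount of write-back happens, the client is dropped without flushing, and a fresh client must replay the journal so the image matches the shadow copy.

```python
import random


class JournaledImage:
    """Toy client: writes are journaled first and applied to the image later."""

    def __init__(self, image, journal):
        self.image = image      # persistent bytearray, survives a client kill
        self.journal = journal  # persistent list of (offset, data) entries
        self.pending = []       # volatile: journaled but not yet written back

    def write(self, offset, data):
        self.journal.append((offset, data))
        self.pending.append((offset, data))

    def flush_some(self, rng):
        """Write back a random-length prefix of the pending entries, in order."""
        n = rng.randint(0, len(self.pending))
        for offset, data in self.pending[:n]:
            self.image[offset:offset + len(data)] = data
        del self.pending[:n]

    @classmethod
    def recover(cls, image, journal):
        """New client after a kill: replay the whole journal in order."""
        for offset, data in journal:
            image[offset:offset + len(data)] = data
        return cls(image, journal)


def run_with_kills(size, seed, num_ops, kill_every):
    """Random writes with a client kill every `kill_every` operations."""
    rng = random.Random(seed)
    image, journal = bytearray(size), []
    shadow = bytearray(size)
    client = JournaledImage(image, journal)
    kills = 0
    for i in range(1, num_ops + 1):
        offset = rng.randrange(size)
        length = rng.randint(1, min(16, size - offset))
        data = bytes(rng.randrange(256) for _ in range(length))
        client.write(offset, data)
        shadow[offset:offset + length] = data
        client.flush_some(rng)
        if i % kill_every == 0:
            # Drop the old client without flushing, then bring up a new one.
            client = JournaledImage.recover(image, journal)
            kills += 1
            assert bytes(image) == bytes(shadow), "replay lost a write at op %d" % i
    return kills
```

Replaying the full journal in order is safe in this toy because re-applying an already written entry is harmless as long as the later entries follow it.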

D

Validating that the client side is correct, a little bit harder because it's always labeled.

D

It's going to target.

C

Maybe think about testing the actual agents handling the replication,

C

starting it and killing it and restarting it a bunch, and verifying correctness.

D

I mean, the most important property is that the thing you're replicating to, the target image, is, like, point-in-time consistent.

D

I wonder if there's a, I guess we can construct something that's doing, like, random writes across the device, and every write is writing words that are, like, one more than the previous. So given a point-in-time consistent image, you could make a scan of the image and validate that it was in fact a point-in-time thing. You could probably construct something that writes things in a particular pattern, so that you can confirm that that's always true.

C

That's just kind of an optimization for the general case, though, of writing a bunch of random data and comparing,

C

random fill on both sides, perhaps, and.

D

Yeah, but you have to know that it's correct. Like, if you're writing

D

data like fsx does, you can't just take an fsx image and know that it's correct as a point in time without having that running process. But if you have something that's writing a pseudo-random sequence, that's somehow generating lots of events or whatever, I mean, you'd just construct a pattern so that you could do, like, a quick scan over the device and know that you didn't miss something or whatever and that it's correct. It'd be a little tricky, but we could probably dream something up.
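One way such a pattern could look, as a sketch under assumptions rather than a worked-out design: write number k stores the value k into a block chosen by a seeded generator, so the newest value found by a scan tells the checker how many writes the image claims to contain, and the expected contents for exactly that many writes can be recomputed from the seed. The block granularity, the function names, and the use of a Python list as the image are all illustrative.

```python
import random


def targets(seed, num_blocks, count):
    """Block index hit by writes 1..count, fully determined by the seed."""
    rng = random.Random(seed)
    return [rng.randrange(num_blocks) for _ in range(count)]


def image_after(seed, num_blocks, count):
    """Image contents after the first `count` writes; write k stores k."""
    image = [0] * num_blocks
    for k, block in enumerate(targets(seed, num_blocks, count), start=1):
        image[block] = k
    return image


def is_point_in_time(image, seed):
    """True if `image` equals the state after some prefix of the writes."""
    newest = max(image)  # the last write of a prefix is never overwritten
    return image == image_after(seed, len(image), newest)
```

A replica that applied a later write but missed an earlier one fails the check, because its contents do not match any prefix of the sequence. No record of the writer's history is needed beyond the seed.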

D

Well, I think that in the short term, we have sort of a pretty specific set of things to do right, yeah.

D

It looks like doing the Python bindings for that, and that's going to satisfy the immediate need, and along with that, cleaning up the xfstests so we can run the same set of tests inside the

D

that we're doing on top of the kernel RBD.

D

I think that's going to give us pretty good coverage.

D

I guess on the bitmaps, we could have a simple verification: you load the bitmap, and then you just verify that there are no objects that exist that aren't covered in the bitmap, yeah.

C

Just can't hold a mansion or if I didn't have mattress it.

D

At least that the bitmap is a correct superset of all existing objects. Yep, OK.
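The superset check described here is small enough to sketch directly. The representation is an assumption made for illustration: the bitmap is a sequence of booleans indexed by object number, and the existing objects are given as a collection of object numbers obtained some other way (for example by listing them); `unmarked_objects` is a hypothetical helper name.

```python
def unmarked_objects(bitmap, existing):
    """Return object numbers that exist but are not set in the bitmap.

    An empty result means the bitmap is a superset of the existing objects.
    Objects numbered past the end of the bitmap count as unmarked.
    """
    return sorted(n for n in existing if n >= len(bitmap) or not bitmap[n])
```

Bits that are set for objects that do not exist are deliberately not reported, since the property being checked is only that nothing existing is missing from the bitmap.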

D

All right, shall we take a little break, and then we might as well get started on this FS dash.

B

Yeah, it sounds good.