From YouTube: Fault Management by Don Brady & Justin Gibbs
B
So at about that time, FMA was landing, and so there was an effort to make ZFS visible in that FMA space. It turned into sort of a roadmap, and they added diagnosis and basically spread it out over several years as they got more familiar with the territory. They, you know, started adding items to the roadmap to fill out the stack, and some of the features never made it back. I sort of alluded to that earlier: the generic I/O fault, the SMART data telemetry.
B
Some of that data is not actually being fed into the diagnosis engine, so that's one of the things I'm hoping we can, you know, renew with this effort. Most of this history I gleaned from Eric Schrock; I had talked with him, and he was sort of the instigator, I guess, of some of this stuff, and at Fishworks he was, you know, actively involved with bringing FMA — pool FMA-like sort of support — into ZFS.
B
So yeah, my overall goal for this talk is to sort of renew that staged roadmap approach that they were following before. If you look back into the history, there was a Phase 0, a Phase 1, a Phase 2 — and I have frantically googled for a Phase 3, but I have not been able to find it, so we'll have to invent that one. So yeah, like I said, at the end — and then maybe even tomorrow — a breakout, if people are interested.
B
So this is sort of a simplification of fault management in ZFS; it sort of shows all the key players. Overall, the idea is automated diagnosis and isolation of a fault of a vdev. A fault is something we can associate with an impact, like loss of redundancy, and then there's a corrective action: onlining a disk, replacing a disk.
B
So yeah, these errors come in, and then the engine, per vdev, will associate a case. The case is sort of like a detective's case file: it'll actually open a case, and in the case of ZFS we attach a SERD engine. SERD is just a fancy way of saying Soft Error Rate Discrimination: we're looking for N events in a T time period. These vary across platforms, but it's typically on the order of, like, 15 errors in 10 minutes, or something to that effect, and this was what was chosen.
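As a rough illustration of that idea — a minimal sketch in C, not the actual FMA SERD engine; the structure and names here are invented — a threshold of N events inside a sliding window of T seconds can be tracked with a small ring buffer of timestamps:

```c
#include <stdbool.h>
#include <stddef.h>
#include <time.h>

/* Hypothetical sketch of a SERD (Soft Error Rate Discrimination)
 * engine: fire once n events land inside a sliding window of t
 * seconds (e.g. 15 errors in 10 minutes).  Assumes n <= 64. */
typedef struct serd {
	size_t	n;		/* threshold: event count */
	time_t	t;		/* window length, seconds */
	size_t	count;		/* events recorded so far (capped at n) */
	size_t	head;		/* next slot in the ring buffer */
	time_t	times[64];	/* timestamps of the last n events */
} serd_t;

/* Record one error event; returns true when the last n events all fit
 * inside the window, i.e. the case should be solved and the vdev
 * faulted. */
static bool
serd_record(serd_t *s, time_t now)
{
	s->times[s->head] = now;
	s->head = (s->head + 1) % s->n;
	if (s->count < s->n)
		s->count++;
	if (s->count < s->n)
		return (false);
	/* head now points at the oldest of the last n events. */
	return (now - s->times[s->head] <= s->t);
}
```

A real engine would also expire stale events and reset when a case closes; this only shows the rate test itself.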
B
Yeah, so it'd only replace with a spare, and currently, I think, what's in the top of the OpenZFS tree is real simplistic: it just takes an array of spares, typically global, and it'll try them — I have a faulted disk, let me try the first spare and see if it works. And so I know for a fact that people want to have a better matching algorithm.
B
We now have the capability of having different tiers — like, for example, metadata — and you might want to have a separate spare for that. And in the case of dRAID, which Intel is working on — it's a declustered RAID solution — it actually has virtual spares. So that's another special case where you'd have to actually have criteria for matching the spares, instead of just blindly trying things until something works.
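A smarter matcher could be a predicate consulted before a spare is tried. The sketch below is hypothetical — none of these structures exist in OpenZFS — and only illustrates the kind of criteria being discussed: capacity, tier, and dRAID virtual spares.

```c
#include <stdbool.h>
#include <stdint.h>

/* Invented types for illustration only. */
typedef enum { TIER_DATA, TIER_METADATA } tier_t;

typedef struct spare_info {
	uint64_t size;		/* capacity in bytes */
	tier_t	 tier;		/* tier this spare is reserved for */
	bool	 is_virtual;	/* dRAID distributed (virtual) spare */
	bool	 in_use;
} spare_info_t;

typedef struct faulted_vdev {
	uint64_t size;
	tier_t	 tier;
	bool	 in_draid;	/* member of a dRAID top-level vdev */
} faulted_vdev_t;

/* Accept a spare only when it is free, big enough, serves the same
 * tier, and matches the vdev's dRAID-ness, instead of blindly trying
 * spares in array order. */
static bool
spare_matches(const spare_info_t *sp, const faulted_vdev_t *fv)
{
	if (sp->in_use || sp->size < fv->size)
		return (false);
	if (sp->tier != fv->tier)
		return (false);
	if (sp->is_virtual != fv->in_draid)
		return (false);
	return (true);
}
```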
B
OK, so then there's this other agent, which I took the liberty of calling the disk-add agent. In Solaris it's called zfs_mod, and, you know, that just doesn't resonate very well. It came out of the syseventd loadable modules framework, and for various reasons it was implemented that way, and that's fine — but in essence it is an agent, so I took the liberty of calling it an agent. It basically will consume the disk monitor events: if you get a disk add, the disk agent will see a new disk and try to match it against any imported pool to see if it's missing. The most obvious case is onlining: if you had a late responder during your import, or a disk went offline because somebody unplugged it and pushed it back in, this agent will actually see that disk, notice that it belongs to a certain pool, and place it back online.
B
So it's, like, actually magic. You can take a random disk out of the pool, stick in a totally different disk, and ZFS will automatically make it good; as soon as the resilver's done, it's a brand-new disk in there, without having to actually sit at the command line and type, you know, offline and replace and all that.
C
Excellent. OK, so the support in FreeBSD I'm going to talk about was developed by Spectra Logic. I'm no longer working at Spectra Logic, and neither is Will Andrews, but Alan Somers is there, and the poor soul is trying to upstream all the work that we did while we were there — you'll see pull requests from him in illumos. In fact, I saw some activity from him, I think, today.
C
OK, so, as Don talked about, there's kind of a simple description of what you want this event daemon, or this fault management system, to be able to do. It's supposed to be able to detect drives that are in a bad state, which could be either degraded or faulted. In ZFS, degraded means that the kernel will continue to use the device — it will continue to read and write to it — but it's kind of like a notification to the fault management system that this is an ailing drive that should be retired.
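A minimal sketch of that distinction, with invented helper names (the real states in ZFS are VDEV_STATE_DEGRADED and VDEV_STATE_FAULTED, where a faulted vdev is taken out of service entirely):

```c
#include <stdbool.h>

/* Simplified vdev health model for illustration only:
 *   DEGRADED - kernel keeps reading and writing, but the fault
 *              management system is on notice to retire the drive;
 *   FAULTED  - device is out of service and needs a replacement. */
typedef enum { VD_HEALTHY, VD_DEGRADED, VD_FAULTED } vd_state_t;

/* The kernel still issues I/O to healthy and degraded vdevs. */
static bool
vdev_usable(vd_state_t st)
{
	return (st != VD_FAULTED);
}

/* Both ailing and dead drives should get a spare, if one matches. */
static bool
vdev_wants_spare(vd_state_t st)
{
	return (st == VD_DEGRADED || st == VD_FAULTED);
}
```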
C
A faulted drive, though, is basically taken out of service. Whenever we detect these drives, and assuming that we have spares available, we want to activate spares for them. If somebody decides to pull a drive out and you put a different drive in, or shove the same drive back in, we want to detect those events and make sure that the drive comes back into the pool. And then there are these situations where we have, like, replace-by-physical-path, which is this magical thing that Tom was talking about.
C
Where you have an array that has physical path information, you pull out a drive and you stick in another drive that is of similar capabilities and similar size, and the system basically just detects that, decides to integrate it into the array, and brings it online. This last part, deactivating a spare after a successful resilver: that's the case where you have activated a hot spare and then, for instance, you went and removed the original failing device and replaced it with a nice healthy device.
C
The system resilvers back onto the original device in its original physical location, and at that point we can return that spare to the spare pool — that's deactivating a spare. In most cases the deactivation is supposed to occur in the kernel but, as we found in the development of this work, there are cases where that doesn't happen, and I'll have more about that on a later slide.
C
So one of the challenges in trying to make this all work is that we have events that come from different places within the kernel. In fact, in the illumos/Solaris implementation you have system events, and then you have ereports — error reports — and they're kind of considered two separate namespaces, sometimes going through different systems. And yet, in order to be successful in managing one of these faults, you might need information from all of these different places, and you have to actually aggregate it correctly.
C
The block diagram here will give you a sense of the other systems in the FreeBSD kernel that we're using to get this information, and of some problem areas that occur because of the asynchronous and distributed system that we're trying to manage here. Things happen in the system, and in the ZFS case they could be coming from a user-initiated command — like adding a device, deleting a pool, manually onlining a device, or doing a replace.
C
So we have to actually aggregate this information, detect the fact that we can't act on it just yet, and then rely on another event that comes out of ZFS to let us know that we should try again. Then there's the case of hot device removal. Usually — because, you know, our users are special — that usually happens when the device is actively being used, right? So we're sending I/Os left and right to this device, and we pull it out.
C
Well, what happens is that it kind of looks — because we see maybe some I/O errors or whatever during that process — like this is a drive that's on its way out. Well, it is; it's just that it's completely leaving the building, not that it's dying. So we have to be careful about how we attribute those errors, to make sure that we don't decide this is a bad device that can't be brought back into the array when it returns.
C
Devices can degrade slowly over time, so we need to actually carry state, and this state needs to survive across reboots, pool import/export, things like that. And then events don't always happen, right? Say I've run the system for a while, I have some things that go bad, I do some imports and exports, whatever — events come flying at me — then I turn off the system and I turn it back on again. I'm at some basic state, and at that basic state I still need to know, without having the event stream, what things have failed.
C
What things do I need to take action on, even if they occurred while the system was powered off? For instance, something has returned the drive that went missing, or added spares, while the system was down. So in FreeBSD, the main systems that we're dealing with to be able to effect this fault management are, of course, ZFS, with its streams of sysevents and ereports. It also has a nice ioctl interface, which is what allows the user-space portion of the fault management system to effect change in the system.
C
We also have this thing called GEOM in FreeBSD; that's where we get physical path information. GEOM is kind of like a Lego brick system which allows you to take a raw device, or compose multiple raw devices, add partitioning, do volume management, all those types of things — the geometric nature of how it slices and dices and combines is why it's called GEOM — but we primarily use it for physical path information. And then devfs is the way that we detect that a device has come to or left the system.
All of that information, all of those events, goes through devctl(9). You can think of devctl as, like, the poor man's version of an event reporting system: essentially it predates JSON, it doesn't use XML, it basically is just strings of key/value pairs that happen to come out of this driver. It was primarily done back in the PC Card days, if you can remember those — so, like, you insert a modem or, you know, an ATA controller in your PC Card slot, and a script would run.
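For a sense of what consuming that stream looks like, here is a hedged sketch of parsing one such line of key=value string pairs; the event format shown is approximate, and the helper is invented:

```c
#include <string.h>

/* Pull one value out of a devctl-style event line, e.g. (approximate
 * format):  "!system=DEVFS subsystem=CDEV type=CREATE cdev=da4\n"
 * Naive sketch: assumes values contain no spaces.  Returns 0 on
 * success with the value copied into val. */
static int
event_get(const char *event, const char *key, char *val, size_t vlen)
{
	char buf[1024], *tok, *save;

	strlcpy(buf, event, sizeof(buf));
	for (tok = strtok_r(buf, " \n", &save); tok != NULL;
	    tok = strtok_r(NULL, " \n", &save)) {
		char *eq = strchr(tok, '=');
		if (eq == NULL)
			continue;
		*eq = '\0';
		/* The first token carries a leading '!'; skip it. */
		if (strcmp(tok[0] == '!' ? tok + 1 : tok, key) == 0) {
			strlcpy(val, eq + 1, vlen);
			return (0);
		}
	}
	return (-1);
}
```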
C
When we developed this at Spectra, we were thinking about doing a better event management system in the kernel but, as usually happens when you're trying to ship a product, you get to the point where things work well enough within your product and then you get busy working on something else. But anyway, that stream of string data is what you can really think of as an event — just key/value pairs of string data — coming through the user/kernel boundary through devctl. devd is the generic daemon that can, like, fork off scripts if certain things happen.
But we can actually subscribe to the event stream by also looking at its named pipe, and that's how zfsd enters the system. libdevdctl — the thing at the top of zfsd — is our attempt to abstract away the event stream, in its current format, from what zfsd has to do, again hoping that in the future some better event stream would become available. Unlike in other systems, this is not a pub/sub stream.
Basically, you connect to that named pipe and you get all the events, so zfsd has to filter, and it has to deal with the fact that if it's too slow, it could lose events. And then, at the very bottom of zfsd, we basically, you know, use the normal libzfs and libzfs_core libraries to be able to make our changes in the system. The case-file repository is where we store information about devices that are in a slowly degrading state.
C
So in FreeBSD, essentially, we sit in a loop until we basically get a clean scan of the system without new events that might change our view — our worldview — and then we enter a normal event loop, where we can process events from these different systems that I mentioned before. Because of the event flood that can occur, essentially we have to have this — the yellow diamond up there.
C
We have to notice if an event has been dropped in our event stream, and essentially we modified devd, the daemon that feeds us data, to make sure that it would always close our pipe if it couldn't buffer an event for us. So essentially, if our stream gets closed, we have to go back to the main loop again, resynchronize, and make sure that our worldview is correct.
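The connect-and-resynchronize shape of that loop might look roughly like this; a sketch only, assuming devd's conventional /var/run/devd.pipe local-domain socket and two invented helpers (rescan_system, process_event):

```c
#include <sys/socket.h>
#include <sys/un.h>
#include <stdio.h>
#include <string.h>
#include <unistd.h>

void	rescan_system(void);		/* hypothetical: rebuild worldview */
void	process_event(const char *);	/* hypothetical event dispatcher */

/* Subscribe to devd's event socket; EOF means devd had to drop an
 * event on us (the modified devd closes the pipe rather than drop
 * silently), so loop back, reconnect, and rescan. */
static void
event_loop(void)
{
	for (;;) {
		struct sockaddr_un sun = { .sun_family = AF_UNIX };
		int fd;

		if ((fd = socket(AF_UNIX, SOCK_STREAM, 0)) < 0)
			return;
		strlcpy(sun.sun_path, "/var/run/devd.pipe",
		    sizeof(sun.sun_path));
		if (connect(fd, (struct sockaddr *)&sun, sizeof(sun)) != 0) {
			close(fd);
			sleep(1);
			continue;
		}
		rescan_system();	/* clean scan before trusting events */

		FILE *fp = fdopen(fd, "r");
		char line[8192];
		while (fgets(line, sizeof(line), fp) != NULL)
			process_event(line);
		fclose(fp);		/* stream closed: events were lost */
	}
}
```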
C
The main way that we deal with, or detect, errors is through this path here: ZFS emits a vdev ereport event, and then we go and, once we have all the information in the case file, we can try to evaluate the case and see if there's something we can do. So if we have already decided to degrade the device, we can online a spare. But in most cases what happens is, if we have, like, just a single I/O error, we'll open up a case file, it will sit there, and we'll go on our merry way.
C
They could do things like that, and essentially what we do is take the pool GUID, and we can use that to look for all case files that are about vdevs on that pool, and look to see if we can now, with the new state of the pool, do something to rectify that particular problem.
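That re-evaluation walk, in a hedged sketch — the case-file types and functions below are invented stand-ins for zfsd's internal repository, shown only to make the pool-GUID lookup concrete:

```c
#include <stddef.h>
#include <stdint.h>

/* Invented stand-ins for zfsd's case-file repository. */
typedef struct case_file {
	uint64_t	 pool_guid;
	uint64_t	 vdev_guid;
	struct case_file *next;
} case_file_t;

extern case_file_t *case_list;			/* all open cases */
extern void case_reevaluate(case_file_t *);	/* try to act on a case */

/* A pool-level event arrived: revisit every open case about a vdev in
 * that pool, since the pool's new state may let us act now. */
static void
cases_for_pool(uint64_t pool_guid)
{
	for (case_file_t *cf = case_list; cf != NULL; cf = cf->next) {
		if (cf->pool_guid == pool_guid)
			case_reevaluate(cf);
	}
}
```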
C
And then the last two, devfs and GEOM: that's basically where we detect that devices have departed or arrived. On an arrival, the devfs handlers and the GEOM handlers actually have to kind of cross-check each other: if a device arrives and it has physical path information — well, great, we can do all the checking that we do with physical path information; if not, hopefully there'll be a GEOM event later that will tell us exactly where it lives in our chassis.
C
So we ran into a couple of things while we were doing this work. Probably the biggest issues that we found were things around the way that spares are kind of bolted into the system. Both spare and aux devices behave a little bit differently than other vdevs that are part of a config: they're attributed to a pool through a MOS object; they're not in the main label information.
C
So a lot of the things that we end up doing that would affect something like a RAID-Z vdev or something, we just can't do for spares and aux devices. And, you know, Alan is probably the best person to ask about all the different, kind of torturous, use cases that we ran to be able to make these things break. But you can imagine things like: you have a pool, you export it, you move it into another chassis, maybe some devices get mixed around, including a spare.
C
You activate a spare on a new pool — because that's certainly allowed; there's nothing that stops a spare from being added to another pool — it gets activated, and then maybe it fails. You can wind up in situations where multiple pools have pointers to devices, some of which might not even be in the system anymore, because of the way that this accounting is done.
C
The solution for Spectra was to only activate spares — only call things a spare — when we decided we could actually use a spare. So we don't use a global spare pool; in a Spectra appliance that's basically maintained by the management software in the appliance. When we decide that something needs to be spared, we call it a spare just then, add it to ZFS, add it to a pool, and then the rest of zfsd takes effect.
C
The one in the middle there, the RAID-Z of spare or replacing mirrors, is also another interesting one. This one's pretty hard to hit, but essentially, in ZFS, the expectation is that if you have a pool that has parity information, you can always recover — by doing some kind of, you know, read from the other vdev, or reconstruct from parity, or whatever.
C
That ability lives at the top-level vdev layer, and for this reason you usually can't make, like, a RAID-Z of mirrors — but you can in this case, and when you do, if you try to read from a spare that reports bad data, essentially there's not enough information in the stack to be able to force the read of the other member, which is kind of surprising. And lastly, what's missing inside zfsd in FreeBSD is most of the stuff that Don talked about; this was basically enough to get a product out the door.
C
We don't have things like the ability to take SMART data and use it in a diagnosis engine. We don't have the ability to detect differences in performance between peers, to notice that a device is slowly degrading but perhaps not throwing errors yet. Physical path replacement works great as long as you have no partitions, and in an appliance you can do that; but a lot of times users like to have GPT partitions and things like that, and if you don't have a really good physical path provider, you can't even do this magical spare replacement.
B
Yeah, so I'll just go through this quick — we're running a little bit short on time — but on Linux we had this thing called ZED; I referred to it earlier. Basically it's an event monitor in user space: it'll collect, or watch for, events and send them out to any of these things we call ZEDLETs that listen for, or that have subscribed to, a certain event class, and then they can perform an action — typically send an email or do something like that.
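The subscription idea is simple prefix matching on the event class; here is a hedged sketch — an illustration, not ZED's actual matching code, with the "all" convention mirroring how stock ZEDLETs are named:

```c
#include <stdbool.h>
#include <string.h>

/* Does a ZEDLET subscribed to `subscribed` want an event whose class
 * is `event_class` (e.g. "sysevent.fs.zfs.resilver_finish")?
 * "all" subscribes to everything; otherwise match a class prefix on
 * a '.' boundary.  Illustration only. */
static bool
zedlet_wants(const char *subscribed, const char *event_class)
{
	size_t n = strlen(subscribed);

	if (strcmp(subscribed, "all") == 0)
		return (true);
	return (strncmp(subscribed, event_class, n) == 0 &&
	    (event_class[n] == '\0' || event_class[n] == '.'));
}
```

In ZED itself, the action side is a script that gets forked off with the event's name/value pairs exported in its environment.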
B
There's no diagnosis in the current state. However, we've had some recent developments, and we've sort of expanded the mission and moved the FMA logic into ZED itself. So rather than, like on Solaris, where they run as separate plugins, we just basically bake that logic into that process, so it can actually do the equivalent work that I referenced earlier with the retire agent, a diagnosis agent, and an add agent. And so that's what we've been working on lately.
B
I think I have a — well, so all the different platforms essentially have to do the same thing: they have to see a disk. This is just sort of the basic schema that we've implemented, and it matches exactly what you would expect on illumos; basically, you have all the keys you need to make a diagnosis. So some of this stuff is landing.
B
We landed that Phase 1 work there, which is basically doing the auto-replace and auto-expand, along with that disk-add agent and then, of course, the disk monitor itself, and we have some other stuff coming up in the future. We're doing a work-in-progress — if anybody's interested in helping out there, see me later today or tomorrow. And Justin touched on this:
B
We really want the diagnosis agent to be smarter, so we need input from people on what they've seen — what are good metrics to look at to make it smarter. And that's pretty much all I had. I have some resources here; you can refer to them later, but basically, the fault management — it might…
B
I'll just go back to that, yeah. So basically, if there are any questions — but really, you know, we're really looking for input from people on how to make a better diagnosis of when a resource is going bad. One of the other things we need to consider, too, is: if you had top-level vdevs that were each, like, a single vdev — maybe backed, you know, by some hardware RAID or something like that — you don't want to necessarily look at the errors and say, well, I've had 15 errors, you're gone, when it's like…
B
That's a good question; that's actually a work in progress. We've actually — oh, I'm sorry, the question is: how do you automate this when there are a lot of hardware dependencies in this stack? That's a good question. We've actually been able to simulate removing and adding devices, in Linux at least, and so we've automated the testing of auto-online, and that's a start.
C
In the case of what Spectra did: since all of our enclosures have at least an expander, we did most of our simulated device faults by disabling PHYs on expanders, to be able to take out drives basically without the drive knowing that it was going to lose connectivity. We also had, from a previous-generation product, the ability to actually de-power drives on demand programmatically, and so we could do things like that. So you can imagine tests where you take out a portion of the drives.
C
Maybe you export the pool, you bring them all back, you try to do an import — all of those types of permutations were possible from a programmatic standpoint. But because we were doing it with the real hardware, it requires, you know, a chassis with a significant number of slots, and software that will allow you to play with the PHY status. The ability to change the PHY status, though, is all upstreamed in FreeBSD; you can do it with the CAM utilities.
C
We had this pie-in-the-sky idea of being able to do it with target-mode simulation, because we also had another guy on staff who did a lot of target-mode stuff, but we just never found the time to do that. It'd be nice if you could basically just do the injection down there — you know, injecting SCSI ASC/ASCQ codes or whatever to simulate that a particular I/O failed.
C
Right, right. So the question was: is it possible that, through very special activation of spares, you could wind up with a pool where all of its vdevs are now bigger, and what prevents that from turning into an auto-expand? I think that is an optional feature, right? You can turn off auto-expand.
C
Yeah, so just for the recording: essentially, autoexpand requires a certain minimum amount of space before it will do the expand — it has to be at least enough to be able to do another metaslab — and so if it's just a small difference, you're not going to run into that problem. And it is still an opt-in feature to have this happen.
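As a sketch of that guard (illustrative names only; the OpenZFS code expresses this inside the metaslab/vdev layer rather than as one function):

```c
#include <stdbool.h>
#include <stdint.h>

/* Only grow into newly available capacity when (a) the pool has opted
 * in (the autoexpand property) and (b) the growth is at least one more
 * metaslab's worth of space -- small differences are ignored. */
static bool
should_autoexpand(uint64_t old_size, uint64_t new_size,
    uint64_t metaslab_size, bool autoexpand_enabled)
{
	return (autoexpand_enabled &&
	    new_size >= old_size + metaslab_size);
}
```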
C
OK, so there were two questions there. One was: what do you do in the face of SAS instability — SAS flapping around and injecting errors that aren't really related to the vdevs? In that case, I mean, I guess I'd need to see the failure; we chained up a lot of fairly large systems, but didn't actually see, I guess, the type of errors that you saw as we tuned.
C
The point that I think Don's been trying to make is that in these systems we have this framework for being able to do diagnosis and recovery, but it's just a basic framework — it's not complete, just as you found. And then what was the second one? Oh, right, enclosure affinity: we didn't have a really good answer for that at Spectra.
C
Essentially, what we did in our management platform is essentially what zfsd does. So the question was: if you were to add SMART information into the diagnosis engine — in this case in FreeBSD, or wherever — would you have to replicate the type of dictionaries and information that's recorded inside the SMART tools? As far as the failure system is concerned, it just needs the diagnosis, the output, right? It just needs to know: should I replace this drive or not? And so you just need a binary answer out of the SMART tools.