From YouTube: Kubernetes SIG Storage - Volume Health Proposal 20191028
Description
Discuss Volume Health KEP
29 October 2019
B: Okay, so basically, right now in Kubernetes there is no way to monitor the PVs. That's why we want to add support for finding out whether volumes are unhealthy, to find out if any problem has happened with a volume: either the volume doesn't exist anymore, or it's not attached, or it's not mounted anymore. And in terms of local PVs, if something happens to the node, then we should also know. So that's the motivation. Here I've listed four cases.
B: Somehow it's not attached anymore; I think this one needs some rewording. And if the mounted volume is no longer mounted, we also want to mark the volume. And then this fourth one, I don't know if it's really a use case; basically it's more like a different way of implementing this. It actually came from one of the review comments on the KEP: some storage systems can actually detect these types of problems, and in that case maybe they want to report them when they happen, and then our health controller can detect that right away. That's something we added after that review comment. Later we can take a look at the diagram; how that looks is a little different, okay.

It still works with the health controller. I think it was after someone opened an issue, and then Michelle actually CC'd me and Nick on that issue. So basically it's CSI volumes that got attached and mounted, and somehow something happened and they're no longer mounted. Then we also want to know, right, because otherwise it just looks the same as before, and the pod still thinks it can use whatever volumes are in use here, I guess.
C: Maybe just reword the use cases, I think. Instead of saying we need to mark the PV, because marking the PV is sort of an implementation detail, it's more like: what do we want to happen? Yeah, like for the attach one, what we want to happen is: if a volume becomes unattached out-of-band.
B: They want, you know, the annotation, and to actually look at it. Yeah, but definitely we should add more to this part. I think we do intend to capture all of those types of errors. We got a lot of comments, actually, when people were reviewing this KEP; they listed all kinds of errors, and so far it's kind of hard to categorize them. That's why for now we just say: okay, right now we still have this.
B: Yeah, that makes sense. Okay, so this basically just shows how we're going to implement this. We'll have an external controller, a health controller, that will be deployed on the master node. It will detect health problems for both network storage and local storage. So now we've combined the two; I think in the first draft they were kind of completely separate. And then for the network storage...
B: Then we're going to have this CSI controller RPC to go check the conditions of those volumes: whether they are still there, whether they got deleted, and whether they are still attached. And then for local storage, there is no CSI controller or RPC, so this just checks the node condition, to see if the node actually failed. And actually, this node failure check is something we can also have for network storage as well.
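The controller-side check described here can be sketched as follows. This is only a sketch: the message shape (`VolumeHealth`), the helper `classify`, and the condition strings are all hypothetical, since the KEP had not yet settled the CSI interface at this point.

```go
package main

import "fmt"

// VolumeHealth is a hypothetical shape for what a controller-side health
// check RPC might return per volume; the real CSI message names were still
// under discussion in the KEP.
type VolumeHealth struct {
	VolumeID string
	Exists   bool   // volume still exists in the storage system
	Attached bool   // volume is still attached to its node
	Message  string // driver-supplied detail
}

// classify reduces a health report to the condition the external health
// controller would record, e.g. as a PV annotation.
func classify(h VolumeHealth) string {
	switch {
	case !h.Exists:
		return "VolumeNotFound"
	case !h.Attached:
		return "VolumeNotAttached"
	default:
		return "Healthy"
	}
}

func main() {
	fmt.Println(classify(VolumeHealth{VolumeID: "vol-1", Exists: true, Attached: false}))
}
```

The point of the sketch is just the flow: the controller polls the driver, gets existence and attachment status back, and maps that to a condition it can record on the Kubernetes side.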
B: Just those would be events. And this is the use case that was mentioned earlier. It's basically saying: okay, if some storage systems have a way to detect those errors early, and they want to report them, then we want to be able to receive them. This way it's quicker; we can collect them.
B: Is it CSI? Yeah, so basically, okay, what happens is, maybe if we go further you will see. If you look here, I think Nick actually added some notes. When the CSI driver for local storage is ready, then we're going to use that; but otherwise, local storage doesn't have CSI yet, right. So this actually means, if you see here, we mention this controller RPC; those are actually what we would add to CSI. So for local storage...
B: Yeah, it's going to be very [heavy], so that's why, when I actually think about it that way, the second model may be more appealing. This one is more like: okay, this agent will be kind of waiting here, yeah, and then whenever something happens it's going to send out some message, and then the controller will detect that.
B: We can say that the reason we added it is because someone added a comment saying: hey, isn't that maybe a more efficient model, if you do it this way? That's why we have it up here with a question mark. So we could list it, or maybe we can remove it, or leave it here as an alternative, saying: hey, there are some storage systems that don't really fit into the CSI model, and, you know, [they could report this way].
B: But we also have a node component, right? That has to be on every node, actually, because it actually checks the mount points and stuff, right. So I think especially for local storage you definitely need that; even for remote, for network storage, you also need that, you need to check if...
B: Yeah, so that's the thing, because I thought we were not really planning to add this into kubelet. I mean, if that's the case, we could also try to add it there. But if this lives outside, then you kind of need an agent on every node. But then, if you look at the interface I was trying to design there...
B: So that's the tricky part, yeah. Because I was looking, so maybe, okay, we can actually jump to the later part, which is the CSI setup. Looking at it, this is the, well, the controller part; but look at the node RPCs: they need all of this, basically the same parameters that you need for the node stage and publish volume calls.
B
If
you
want
to
check
not
only
that
it
is
mounted
and
also
mounted
the
same
way
as
before,
then
you
need
potential
need
all
of
this,
because
it's
if
you
just
check
this
amount,
a
then
yeah.
We
don't
need
all
of
this,
but
if
you
want
to
check
if
it's
really
the
same
as
before,
then
so
all
of
this
I
think
most
of
this
we
can
get
in
from
TV
for
awarding
attachment
and
from
the
the
PvP
VCC
itself,
but
I
think
the
past
party
will
have
to
assume.
B
We
know
how
cue
blade
construct
the
path
we
just
constructed
the
same
way.
I
don't
know
if
that's
a
good
way
to
do
it,
but
otherwise
we
don't
know
the
those
pathway
mm-hmm.
So
that's
that's
the
same
that
we
need
to
decide.
I
guess
that
so
basically
for
for
node
check
volume
request
basically
I
have
you
know
everything
that
I
needed
this.
B: So, okay, I think that is possible, if you don't really want to know whether it's mounted exactly the same way as before. I guess that's definitely possible, if you're just checking whether it is still mounted or not, right. Otherwise, I will definitely take a look at that. I was just thinking that if it doesn't have all of this information, then it can probably still give you some information back, but it won't be able to check everything.
B: So, like, I think for our driver, if you give it the volume context and the volume ID, we can figure out something. But I guess every driver is different, right, so some drivers may need more. That's why I thought: okay, probably we need to keep all of this information. Yeah, because I don't know what the drivers do with this; they probably just don't save anything. Yeah.
B: I was just thinking that kubelet assumes it's sort of mounted; it's probably not really checking whether it's mounted the same way as before. That's, you know, the only thing. Maybe that's enough for the first phase; maybe we'll try this out, and if we need to check more later, then we can add more. Maybe it's good for the first phase to make it simple initially. So yeah, I'll take a look at the volume stats work and this idea.
B: This one basically just added a few more things. If you want to check whether it's attached exactly the same way as before, basically we'll need all of those parameters. But, I don't know, maybe in the first phase we just make it simple: just check whether it's attached or not, and assume it's still the same. Yeah, I think the work that David is doing will cover the attach part.
B: ...is not needed. Okay, okay, it's probably out, yeah. Maybe just sync up with him; his work will be in 1.17, hopefully. Okay, okay, all right. So if that's the case, we can actually just remove this one. So we will just need to... The other use case you mentioned was if the volume got deleted. Yeah, yeah, so that one we still need. Okay. So here is the response.
B: I probably want to take a look at the issue as well, the one somebody opened; I want to see if it's all covered here. So the health condition: that's something the existing interface does not have, so we need to somehow have a way for it to provide some messages. That's the annotation that we want to add to those PVs in case something is wrong. I think this could also be put somewhere else; you could add it there, but then we wouldn't have it for each volume.
B: But, I mean, every attach is different. You do want to find out, you know, that it attached to this node but not the other node, and what's wrong, right? Otherwise it's not really helpful. I mean, if it just says some error happened on this node, then people don't really know why there is a problem, right? So I think you still need to have those.
B: That's true. Actually, if there is already an attach retry happening, then, meaning, we will never have a volume that we thought was attached but is actually not attached, because Kubernetes will keep trying until it succeeds. So basically, that means we shouldn't have any problems yet. So if that's the case, maybe, yeah, it's true, okay. So let's...
B: So we don't need this "is attached" check; we don't need this. We don't need the health condition. So basically, we just want to check whether it's still there, and then the capacity. Well, there is actually capacity information on the node side already, so I'm going to take a look and see if that one is enough to cover this.
E: For local storage, could we get the right capacity on the node side? Yeah.
B: Okay, so for this one, oh, this one we need to sync up on, to see whether we should make this a kubelet call or keep it outside. And then we may not need all of these parameters in the beginning; we'll see. I think the node gets... that's right, I'll take a look at that one.
B: Okay, so let's go back to the CSI part. Yeah, that's me. So basically there are two RPCs in the alpha for the node. For the node check volume, other than finding out whether it is mounted, we can also find out other information, like, you know, file system corruption, those types of information. We can also get that from the node check volume. So yeah.
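A hypothetical node-side check response along the lines discussed, the mount status plus extra findings such as filesystem corruption, might look like this. The type and field names are illustrative only; they are not from the final CSI spec.

```go
package main

import "fmt"

// NodeCheckVolumeResponse is a hypothetical response for the node-side
// check RPC discussed above; field names are illustrative.
type NodeCheckVolumeResponse struct {
	Mounted     bool
	FsCorrupted bool
	Message     string
}

// summarize reduces the node-side findings to a single condition string
// that the per-node monitoring agent could surface.
func summarize(r NodeCheckVolumeResponse) string {
	if !r.Mounted {
		return "NotMounted"
	}
	if r.FsCorrupted {
		return "FilesystemCorrupted"
	}
	return "Healthy"
}

func main() {
	fmt.Println(summarize(NodeCheckVolumeResponse{Mounted: true, FsCorrupted: true}))
}
```

The mount check covers the basic "is it still mounted" case from the use-case list, while the extra fields show how richer diagnostics could ride on the same RPC.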
B: Yeah, we can look at the existing CSI error codes and, you know, come up with some common ones. From the Kubernetes API point of view, since we're going to use annotations, there won't be any new APIs, at least for the first version. And this will be based on whatever error message is returned by those health check RPCs. Annotations are just for alpha, yeah, because I think I mentioned this: you have the existing gRPC error codes, right.
B: Yeah, so I think, because the volume taints also have other use cases, like the data populator, maybe that will come up when we discuss the data populator; this will be another use case as part of it, mm-hmm, yeah. So we definitely will still think about that one; it's just a question of whether we depend on that, mhmm. Then we can get started working on the CSI controller part.
C: I know in the past the security team has been pretty strict about not allowing kubelet to modify objects. Okay, that might be another hurdle for the node part of things. Okay.
B: Attached, yeah; but since we said we maybe don't need this anymore... This is just an example; like, for example, this one. Those are just some examples; the messages can be anything. Oh, this is the taint, basically, yeah. But since we're using annotations right now, we don't really have a fixed format. I think it's very general, and the driver can put whatever message they have there. But once we define some common error codes, maybe we can still reuse the message part.
C: I mean, I was thinking more like, in a future phase, when we can actually build things to reconcile these errors, it would be good to be able to group types of errors. Like, if it's an I/O error, maybe you could run some, I don't know, file system fixer tool or something. If it's a provisioning error? Oh, I don't know if we want to reconcile provisioning, but: create a new one, and say, here is your old volume, but here is a brand new one.
C: I think that was something like: as an application, you could say, I tolerate this error. I think the original idea was, if you tainted the PVC, it would prevent the pod from getting scheduled to it, and as a pod you might say: oh, I don't care about this error, go ahead and schedule me anyway. Okay.
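The PVC-taint idea described here would work analogously to node taints and tolerations. A minimal sketch of the matching logic, with made-up types and keys (the KEP had not defined any of this), assuming exact-match tolerations:

```go
package main

import "fmt"

// Hypothetical volume taint and toleration types, modeled on node taints;
// nothing here is from an actual Kubernetes API.
type VolumeTaint struct {
	Key    string // e.g. "volumehealth.storage.kubernetes.io/io-error"
	Effect string // e.g. "NoSchedule"
}

type VolumeToleration struct {
	Key    string
	Effect string
}

// schedulable reports whether a pod with tolerations tols may be scheduled
// against a PVC carrying taints: every taint must be tolerated.
func schedulable(taints []VolumeTaint, tols []VolumeToleration) bool {
	for _, t := range taints {
		tolerated := false
		for _, tol := range tols {
			if tol.Key == t.Key && tol.Effect == t.Effect {
				tolerated = true
				break
			}
		}
		if !tolerated {
			return false
		}
	}
	return true
}

func main() {
	taints := []VolumeTaint{{Key: "io-error", Effect: "NoSchedule"}}
	// A pod with no matching toleration is kept off the tainted PVC.
	fmt.Println(schedulable(taints, nil))
}
```

The eviction variant mentioned a bit later would just be a second effect value (analogous to NoExecute on nodes) checked by a different controller.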
B: At least, I think, for the initial use case for the volume taints, because of the data populator, right: there will be a data populator pod that will still be able to attach and, you know, copy data over, but the application pod should not be using that volume yet; something like that, I think, initially.
C: Mm-hmm. I think the other thing with taints is that you can also specify whether you want the pod to be evicted or not. So maybe you can, via a toleration, say: even though my disk has I/O errors now, don't evict me; I still want to, you know, continue running and just get I/O errors in my application and handle them there. Maybe, hypothetically.
C: And also, maybe, just to summarize the main action items: primarily, I think, the controller side.
B: Yeah, I actually initially thought I was going to put something together, and then I forgot about that. Okay, so yeah, welcome to do that. So for the CSI part: okay, so once we update this, for the CSI spec part we can probably go to the CSI community meeting to propose at least the enhancements in ListVolumes. That's...