Description
Kubernetes Storage Special-Interest-Group (SIG) Volume Health Discussion - 23 April 2021
Meeting Notes/Agenda: -
Find out more about the Storage SIG here: https://github.com/kubernetes/community/tree/master/sig-storage
A
Hello everyone, thank you for joining today's meeting on volume health. Today we are going to talk about what the use cases are for this feature and what we should do as the next step.
A
So let me share this. I talked to a few people and got some feedback from them, so this is what I have so far. Right now we know that for volume health we are just collecting it from the storage system and then reporting events on PVCs or pods.
A
So right now, on the controller side we have an external health monitor controller, which is a sidecar, and on the node side we have that implemented in kubelet. I think it is still useful to have those events. So as the next step, I think we can discuss whether we want to bring this to beta right now. The only concern I have is that right now there is no CSI driver implementation other than our sample implementation in the hostpath driver.
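(For reference: roughly what the controller-side reporting described here amounts to. The external health monitor sidecar asks the CSI driver for the volume's condition and, when it is abnormal, records a Warning event on the PVC. This is an illustrative sketch only, assuming the CSI v1 Go bindings and client-go; the function name and event reason string are placeholders, not the actual sidecar code.)

    package main

    import (
        "context"

        "github.com/container-storage-interface/spec/lib/go/csi"
        v1 "k8s.io/api/core/v1"
        "k8s.io/client-go/tools/record"
    )

    // reportPVCHealth asks the driver's controller service for the volume's
    // condition and records a Warning event on the PVC when it is abnormal.
    func reportPVCHealth(ctx context.Context, cc csi.ControllerClient,
        recorder record.EventRecorder, pvc *v1.PersistentVolumeClaim, volumeID string) error {

        resp, err := cc.ControllerGetVolume(ctx, &csi.ControllerGetVolumeRequest{VolumeId: volumeID})
        if err != nil {
            return err
        }
        cond := resp.GetStatus().GetVolumeCondition()
        if cond != nil && cond.GetAbnormal() {
            // Events are the only surface today; they can expire, which is the
            // limitation being discussed in this meeting.
            recorder.Event(pvc, v1.EventTypeWarning, "VolumeConditionAbnormal", cond.GetMessage())
        }
        return nil
    }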
A
So that's the current situation for this feature. I also heard several people talking about using this feature for local PVs.
A
What happens is that right now, because we only have this as events, we can't really have a controller make any reactions based on it, because events can disappear. So what I found out is that a few vendors have actually just started to build their own.
A
They have their own implementation: although this one is useful, since they want to be able to react to it, they develop their own CRDs or add it as an annotation so that a controller can do something with it. Which is, I think, unfortunate, because we have this feature and we would like vendors to use it.
A
So that's what we want to discuss. Okay, I think there are a couple more things. Maybe we should go one by one, or maybe let me just go over all of them and then we can go through each one, just to see which use case is relevant for you, the people who are in this meeting.
A
This is just what I have heard so far, so I want to collect more from this meeting. Okay, let me put this link in there, if you want to add yours.
A
Okay, so let me actually share this one so that everyone can see.
A
Okay, so as you are writing down yours, I will just continue with the other use cases that I heard about. I think there are some people asking on the mailing list and also on Slack.
A
Basically that means cases where we have to forcefully delete the pod, either from kubelet or from another controller, to force a remount. Then there was someone also asking about rescheduling to another node, but I think that was from an email thread on the mailing list started a long time ago, and I think we would build that on top of the CSI capacity tracking feature that Patrick was working on; I think that's one level up. So there are quite a few things. And then also someone was asking if we can pass in additional information to NodeGetVolumeStats.
A
I'm not sure, maybe not. He was saying that right now we pass in the staging path, and then we have this volume path, which is either the staged or the published path. So okay, we have this.
A
I
so
I'm
not
sure
that
we
actually
have
that
information
right
now,
because
the
the
volume
get
note
get
volume
sets
right
now
we
call
this
periodically
incubate,
but
in
that
place
we
may
not
have
that
information
so
yeah,
that's
that
those
are
what
I
collected.
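(For reference: the node-side hook being discussed. kubelet calls NodeGetVolumeStats periodically; the request carries the volume ID, the volume path (staged or published) and optionally the staging target path, and in recent CSI spec versions the driver may return a VolumeCondition alongside the usage numbers. A minimal sketch under those assumptions; the path check is a made-up placeholder for a real driver-specific health check.)

    package main

    import (
        "context"
        "os"

        "github.com/container-storage-interface/spec/lib/go/csi"
    )

    // nodeServer stands in for a CSI driver's node service; a real driver
    // implements the rest of csi.NodeServer as well.
    type nodeServer struct{}

    func (ns *nodeServer) NodeGetVolumeStats(ctx context.Context,
        req *csi.NodeGetVolumeStatsRequest) (*csi.NodeGetVolumeStatsResponse, error) {

        // Placeholder check: is the mounted path still reachable? A real driver
        // would ask its storage backend about this specific volume.
        cond := &csi.VolumeCondition{Message: "volume is healthy"}
        if _, err := os.Stat(req.GetVolumePath()); err != nil {
            cond.Abnormal = true
            cond.Message = "volume path not accessible: " + err.Error()
        }

        return &csi.NodeGetVolumeStatsResponse{
            // Usage (capacity/inode) entries omitted to keep the sketch short.
            VolumeCondition: cond,
        }, nil
    }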
A
Okay, so while you are writing this down, I'd like to look at the local PV case. I think the person who raised it is here.
D
So it's basically a local persistent volume that is attached to a host, and we're using the health monitoring components to detect the failure cases for the volumes. What our customers request is this: basically, our customers deploy their application in a StatefulSet, and when something happens to the local persistent volume, for example...
D
If
you
know
these
down
or
something
happens
to
the
to
the
ssd,
the
pod
and
the
powder
will
fail,
but
we
won't
be
able
to
get
rescheduled
because
because
you
were
always
stopping
that
in
that
crash
loop
due
to
the
volume
failures.
So
what
they
want
is
to
have
something
that
automatically
remove
the
present
volume
claim
so
that
it
can
recreate
a
volume
and
unblock
the
recreation
of
the
pod.
D
The application itself has some mechanism to handle disk failures, so it can recover from an empty persistent volume.
D
Yeah, so basically what would happen is that after deleting the PVC, the provisioner would just delete the PV, and then the StatefulSet controller will recreate the PVC, which gets a new PV provisioned and bound to it. That makes sense because you can't change the PV, right? Yeah, you can't change the PV. Basically, the application owners just want to unblock the remediation for their pod.
D
No. Basically right now we don't have automation around this. We just manually delete the PVC, including removing the finalizer for PVC protection.
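(For reference: a rough client-go sketch of what automating the manual procedure just described could look like. This is not an existing controller, only an illustration of the flow: delete the PVC, clear the pvc-protection finalizer so the delete can complete, and delete the crash-looping pod so the StatefulSet controller recreates both and a fresh local PV gets provisioned and bound. A real controller would have to verify the volume is truly unrecoverable before doing anything this destructive.)

    package main

    import (
        "context"

        metav1 "k8s.io/apimachinery/pkg/apis/meta/v1"
        "k8s.io/apimachinery/pkg/types"
        "k8s.io/client-go/kubernetes"
    )

    func remediateLocalPV(ctx context.Context, cs kubernetes.Interface, ns, pvcName, podName string) error {
        // Request PVC deletion; it stays Terminating while the finalizer and the
        // consuming pod are still present.
        if err := cs.CoreV1().PersistentVolumeClaims(ns).Delete(ctx, pvcName, metav1.DeleteOptions{}); err != nil {
            return err
        }

        // Clear the kubernetes.io/pvc-protection finalizer so the deletion can finish.
        patch := []byte(`{"metadata":{"finalizers":null}}`)
        if _, err := cs.CoreV1().PersistentVolumeClaims(ns).Patch(
            ctx, pvcName, types.MergePatchType, patch, metav1.PatchOptions{}); err != nil {
            return err
        }

        // Delete the stuck pod; the StatefulSet controller recreates it together
        // with a new PVC of the same name, which gets a new PV bound to it.
        return cs.CoreV1().Pods(ns).Delete(ctx, podName, metav1.DeleteOptions{})
    }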
A
But I thought you were just deleting the PVC, right? So you are also deleting the pod?
D
Yes, that's right, because the pod is stuck in a crash loop and we have to delete it. I forgot to mention that.
D
Yes. The controller is basically to automate this process, and we are working on that.
H
Sorry, I'm still confused. Why does that have to be a StatefulSet, if you don't care about the state on that disk?
H
So the other controller, if you add the persistent volume claim there, it would create it, right? Wouldn't it?
A
And also, right now on the controller side we have events on PVCs, and on the node side we have them on pods. So for your use case, let's say we add this volume condition information to the PVC status from the controller side. Will that be enough, or do you also need it from the node side?
D
Yes, so right now we only rely on the two types of events: the ones emitted from the controller side on the PVC, and the ones triggered by the node notifier.
D
Yeah, I think that's one of the pain points our customer mentioned. We manage a fairly large cluster with some bare metal hosts today, and disk failures are a very common case for them. They're migrating to Kubernetes, and they want to fix that problem in Kubernetes, so they request that we handle this kind of failure automatically.
C
Thank you. Just another data point: at least some of our customers were looking to get the health monitoring status, and when this health monitoring daemon detects that the disk is failing, it should blink the light and they want to replace the disk. But that could potentially change the device ID on the node, and we cannot update the PV, so it has to be recreated, right?
A
In place? Or you're saying, okay, but they have to have a different PV, right, because they have changed the disk? So it has to be a different one.
C
Yeah, yeah, almost, sorry.
A
Okay, so it's almost like you have to rebind, keeping the PVC somehow, just like an in-place restore.
C
Yeah, so OpenEBS actually has a component which exposes the SMART statistics of block devices to Prometheus. Right now I think the detection doesn't support SSDs, but it works with HDDs, and it can be extended to support SSDs as well. But the problem is that the only thing it does is detect the issue and expose metrics, so it can send alerts, but the problem with in-place replacement would still exist.
A
It would be helpful if we had that volume health information in the PVC status.
C
I'm just thinking out loud actually: where does it belong?
A
Yeah, so at least right now we're talking about the controller side. That's why I was asking: is this from the controller side or the node side? It looks like for the node side we're not sure yet, so right now we're talking about the controller side. So if you can detect this from the controller side and we have this information in the PVC status, would that be useful?
D
I forgot to mention that we actually implemented some of the I/O failure checks on the controller side, for two reasons. One is that our Kubernetes version doesn't have the kubelet patch included. The other reason is that in our implementation our controller actually talks to each of the nodes to do the provisioning, so we basically piggyback on that route to get the volume health information.
A
I'm not sure if you noticed, but there's this new feature, distributed provisioning. You could do that as well.
A
Yeah, in the external provisioner there are two modes, but that's still on the controller side, I think.
G
Yeah, I just want to ask: the controller that was mentioned is doing some provisioning for local volumes, or...?
A
Yeah, for local volumes, if you look at the external provisioner right now, it actually supports two modes, and there are local PV CSI drivers that are using those two modes. What are the two modes? Distributed and central. Central is the same as what we had before; distributed means you can actually run your provisioning on the node.
A
I'm just thinking about this particular use case right now. Because it's only events, we can't do anything. So for this feature to be usable, since right now we can't actually react to it, does it make sense to add this to the PVC status so that it can be acted upon?
A
This
is
something
that
yeah.
This
is
like
one
one
question
that
I
I
have
so
this
is
we're
talking
about
the
controls
and,
of
course,
we
also
need
to
talk
about
this.
Other
this
note
side
as
well.
Does
it
make
sense.
A
It
is
that
is
what
we're
doing
right
now.
Actually
we
actually
only
doing
that,
because
this
is
only
for
this
is
only
for
if
you
have
already
provisioned
a
pd
pvc,
you
don't
you
start,
everything
is
all
set
and
you're
not
you
know,
kubernetes
ever
is
all
set,
so
you
don't
do
anything
anymore.
Then
this
is
the
then
we
are
actually
checking.
We
actually
don't
check
it.
If
it's
like
not
bound,
if
you
think
we
should
check
that
and
that's
something
else
we
need
to
consider.
But
right
now,
it's
really.
We
assume
okay.
E
So in the case of local PVs, if we do not think about the distributed provisioner, right now it's mainly late binding, right? Local storage classes typically always use delayed binding.
A
Okay, so I'm not saying we're only adding this for local volumes. I'm just asking: does it make sense to add this? Of course we can't know, we wouldn't know, whether it's local or not local; we don't really have a special type for a CSI driver. This really depends on the driver implementation.
A
So
this
is
like
a
jet.
This
will
be
like
a
general
field
for
any
any
pvc
status.
Does
it
make
sense
to
have
this
field?
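(For reference: to make the question concrete, one possible shape, purely hypothetical, would be to reuse the existing pvc.status.conditions list, which today carries resize conditions, with a new condition type. Nothing here is an agreed API; the "VolumeHealthy" type is made up for this discussion.)

    package main

    import (
        v1 "k8s.io/api/core/v1"
        metav1 "k8s.io/apimachinery/pkg/apis/meta/v1"
    )

    // setVolumeHealthCondition records a hypothetical "VolumeHealthy" condition
    // on the PVC status, replacing any previous condition of the same type.
    func setVolumeHealthCondition(pvc *v1.PersistentVolumeClaim, abnormal bool, message string) {
        status := v1.ConditionTrue
        if abnormal {
            status = v1.ConditionFalse
        }
        cond := v1.PersistentVolumeClaimCondition{
            Type:               v1.PersistentVolumeClaimConditionType("VolumeHealthy"), // hypothetical
            Status:             status,
            Message:            message,
            LastTransitionTime: metav1.Now(),
        }
        for i, c := range pvc.Status.Conditions {
            if c.Type == cond.Type {
                pvc.Status.Conditions[i] = cond
                return
            }
        }
        pvc.Status.Conditions = append(pvc.Status.Conditions, cond)
    }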
A
I guess we don't have to make a decision right now, but I'm asking because of what I heard. I already know a few vendors doing this; even Nick, right, he started this, and when I talked to him the other day he said, oh, we are using our own CRDs because we can't really use events. He actually started this whole volume health feature, but he can't even use it. And from our side, we actually use an annotation.
A
Okay, yeah, so right now it's just events. Even with that problem it is of course helpful, but events can disappear, so it's not that reliable, though definitely it can still be very helpful. We started with some error codes, and we couldn't agree on what the error codes should be, so now we just have this boolean, whether it's abnormal or not, plus a message. Is that enough to solve this? I guess that's the question, because otherwise we would have to decide what error codes we want, I think.
I
A kind of next step might be to decide between these. In our first iteration we said: let's remove programmatic use from the picture to simplify the overall design, and let's just focus on what signals we want to surface to the end user. We've accomplished that through events, and it seems to be well received.
I
The second step, I think, is that now we want programmatic use of these signals, to have systems respond automatically to volume health. The question is which systems we want to target first: do we want to target third-party integrations, or do we want to target kubelet and the Kubernetes scheduler responding, like a first party?
I
And I think it's good to figure out what our final consumer is going to be for this iteration, so that we can ground it in concrete reality and say: okay, this is the type of behavior we're trying to support, and then, based on that, we can figure out whether the API makes sense or not. I think, and this is just instinct and it might be incorrect, that it might be better to start with a first-party use case.
I
Because in general, with first-party Kubernetes use cases, we try to focus on making things widely reusable for lots of different consumers, versus if we focus on a single third-party use case it might work for that one person or one use case but may not be broadly applicable. But that's just a gut instinct; it might be wrong.
A
Yeah, okay, I think we should talk about all of those anyway. So let's see what it means for kubelet; we do have a few people asking about this. Let's say on the node side we detect that abnormal condition.
A
So in this case, right now those are events on pods. Does that mean we should have that information added to the pod status? I think we probably need to ask the node team, because I'm not sure. And then there's also this: if the PVC is used by multiple pods, right now we have events on all the pods, right? So how does that get reacted to, for each pod where it is detected?
A
This
is
about
how
they
delete
itself
or
you
know.
So.
That's
are
questions.
I
think
we
need
to
ask
okay.
B
The data for that NFS share moves somewhere else, and now the pod is accessing its data through a sub-optimal path. What you would like to be able to do is just go to the node and update the mount with a different server IP, but that's not how NFS works on Linux. Unfortunately, you have to fully unmount the NFS share and then remount it with the new IP address, which would involve bouncing a pod in order to get optimal I/O performance again. So the interesting trade-off here is that you don't have to do anything; it's just that until you bounce the pod you're wasting some I/O performance. So it's really a question of preference: do you want to keep your pods alive until they're done, or do you want to optimize for I/O performance? What we would like to do is flag the volumes that are in this situation, because we can detect it.
B
We can say this volume is being accessed through a sub-optimal mount, and we can flag that through the volume health feature. We don't do this yet, but we'd like to be able to, and we'd like to have something on the other end watching for that and basically enforcing some user-decided policy. The user might decide: I never want to bounce my pods, because bouncing a pod costs me more than just dealing with the worse I/O performance, whereas other users might make the opposite trade-off.
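(For reference: the policy knob could be as simple as an opt-in the user sets on the PVC. The annotation key and the controller below are hypothetical, just to show the shape of "watch the health signal, then apply the user's preference"; nothing like this exists today.)

    package main

    import (
        "context"

        v1 "k8s.io/api/core/v1"
        metav1 "k8s.io/apimachinery/pkg/apis/meta/v1"
        "k8s.io/client-go/kubernetes"
    )

    // Hypothetical opt-in: "delete-pod" means bounce pods when the volume is
    // reported unhealthy; anything else means leave them running.
    const onUnhealthyAnnotation = "example.com/on-unhealthy-volume"

    func enforceVolumeHealthPolicy(ctx context.Context, cs kubernetes.Interface,
        pvc *v1.PersistentVolumeClaim, unhealthy bool) error {

        if !unhealthy || pvc.Annotations[onUnhealthyAnnotation] != "delete-pod" {
            return nil // user prefers pod lifetime continuity over I/O performance
        }
        pods, err := cs.CoreV1().Pods(pvc.Namespace).List(ctx, metav1.ListOptions{})
        if err != nil {
            return err
        }
        for _, pod := range pods.Items {
            for _, vol := range pod.Spec.Volumes {
                if vol.PersistentVolumeClaim != nil && vol.PersistentVolumeClaim.ClaimName == pvc.Name {
                    // Bounce the pod so its controller recreates it and the volume
                    // is freshly mounted (e.g. picking up a new NFS server IP).
                    if err := cs.CoreV1().Pods(pod.Namespace).Delete(ctx, pod.Name, metav1.DeleteOptions{}); err != nil {
                        return err
                    }
                }
            }
        }
        return nil
    }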
B
We can detect, for a given volume, whether any client anywhere is accessing it through a sub-optimal path, and we could say that means it's unhealthy and return that information back to the sidecar through the gRPC socket. Then what happens on the Kubernetes side is up to what we implement as a community, and what I would like to see is some policy enforcement engine that says: if it's unhealthy, bounce the pod, or if it's unhealthy, don't bounce the pod.
A
But right now, I think the reason we chose to add those to pods is that at the time we had this from both the controller side and the node side, and we thought we didn't want those two to be conflicting with each other. So that's why we said, okay, on the node side let's just add them to pods. I think that was the decision we made at the time, so that we differentiate.
A
But on your controller side... we're talking about the mount, right? Your controller doesn't know about the mount, because supposedly it is the node side looking at the mount problem; the controller side does not look at that. Are you saying your controller will be looking at the mount, at this mounting problem?
G
So you are doing kind of a similar thing, using this to trigger an unmount and remount, if you are not...?
B
Well, I'm proposing this. We aren't actively doing it, but I would like to be able to, and I'd like to put the policy decision in the end user's hands, so that they can decide which behavior they prefer.
G
So there are two options. One is to delete the pod and recreate the pod. The other option is to just trigger an unmount and a remount; both can achieve it. How do you trigger it?
B
But with an option to decide whether that's what you want or not, because it's perfectly valid to not delete the pod and continue running with a less-than-optimal I/O path. So it's up to the user: what do they care more about, pod lifetime continuity or I/O performance?
G
Right, the reason I asked about remount is that you put in your use case something like fixing the mount with a remount for certain abnormal volumes, right? We kind of have a remount behavior, like periodically triggering a remount, because for a secret, for example, if the secret is updated, we want to do that.
A
Yeah, but it's like volume snapshot, which is only for CSI, right? I think we're not adding that.
C
But this mechanism that we are talking about, where you put the health check result in the PVC status and an external controller deletes the pod so that the ReplicaSet or StatefulSet recreates the pod, possibly on another node or something: this mechanism is basically generic. Nothing is CSI-specific.
A
Okay, so I think we talked about this. Going back to here now, I think some other people talked about this: if kubelet detected this, then terminate the pod.
A
So this is from the node side.
A
So do we need to do this in both places? Okay, so we have what we're talking about here. Basically, let's say you're implementing this volume health in your CSI driver; then you should try to avoid sending the same information from both the controller and node side, because, assuming we have both, you don't want it to be reacted to from both sides, right? So let's say on the controller side...
A
There's some controller trying to do some reaction, and then from kubelet we're also talking about deleting the pod or triggering a remount. We don't want this to be happening from both places, right?
A
Yeah, so maybe the driver can choose. Maybe if they just want to rely on the controller side, they don't need to implement the node side, or they implement the node side only for the events but not for the reaction. I'm just talking assuming that we have a reaction; let's say we have reactions in both places.
A
Maybe
we
don't
want
to
yeah.
We
don't
want
to
turn
on
the
reaction
in
both
places
because
they
could
be
conflicting
with
each
other
overstepping.
A
So,
okay,
okay,
so
going
back
to
okay
going
back
to
this
one,
so
I
think
we're
now.
In
this
case
I
guess
I'm
not
sure.
Where
should
we
have
that?
Have
those
information?
Let's
say
if
we
want
public
to
do
something
for
the
for
the
amount
right
so
well
clearly
to
let's
determinate
the
pod
trigger
a
remark,
then,
should
that
information
be
in
the
pod
itself
or
is
that
I
still
have
this
question?
Is
it
because
we
are
going
to
have
that
in
multiple
parts?
E
Like multiple pods accessing the volume on the same node, yeah.
A
Yeah, so that's the ReadWriteMany case. Does it make sense to have that information in the pod itself? I'm not sure; this seems like a question for the node team.
C
I had a question: what happens if we find a PV or a volume that is not healthy and we have a pod running on it, we kill the pod, but we want to block any pod or workload from using that PV/PVC combination? Do we have a use case like that? Or do we expect that after, for example, I'm thinking of local volumes, maybe CSI...
A
Yeah, I get it, because this is a trigger, right? If we only delete the pod, the volume is still there. So actually, if you're looking at this case, we're talking about deleting both the PVC and the pod so that they can be recreated, rebound and remounted. If we only delete the pod, that seems not right, because the volume is still not healthy; it's not the pod that's the problem, it's the volume.
G
Yeah, I kind of feel like the focus here is volume health monitoring, and in terms of how to react to a volume's abnormal behavior, it's kind of up to the application or controller.
C
I agree, but we can ignore the part about how to take action; it still determines where to put the information. For example, if you put the health status in the pod and the pod is deleted, then that information is lost, whereas the PV/PVC is still there and can be used by any pod.
G
Yeah, we can focus on the problem of where to put the information. I don't think the pod is a good place, but PV or PVC, we can have a more focused discussion there. I didn't closely follow the original design, but maybe we can also iterate. Right now the design only checks the PVCs that are available and monitors the health of whatever PV the PVC points to.
A
Yeah, right now we're only checking the ones that are already created and bound, because we know there are already controllers handling the case where they are not bound; that's something different. So we are looking at the health status after it is already created and bound.
B
There are certain types of health problems that are fundamentally associated with the volume, and those probably should be associated with the volume at the API level. But in my particular use case the problem isn't with the volume; the problem is with the pod-volume pair, where a particular pod is having a particular problem with the volume, but that doesn't necessarily implicate any other pods having problems with the same volume. And unfortunately, the way the CSI RPCs are written, there's no information about which pod is being asked about.
B
It's just saying: this is the volume, how is it doing? But if there were a way to understand the health relationship of a pod-volume pair and then put that information on the pod, that would make sense.
A
So that is the node side. For the node side we have that information: we actually have what volume and what pod, you get all of that information in the message we have. Right now, since we only have a message and whether it's abnormal or not, you will get the volume information and the pod in it.
A
This is from... we have the sidecar, right, so after we collect that information... Well, actually no, it's not the sidecar right now; this is in kubelet, sorry, yeah.
A
Yeah, that is actually possible, I think, but then we are getting this information from the storage side, right? So your node plugin knows. So if you are...
A
Right, so basically it will just tell you if there's some problem from its point of view, and then we are just going to bubble that up, and it's going to be the same message for all the pods that are using that PVC. That is true. I don't know the case where one pod is okay and the other is not; I'm not sure what case that is.
B
Some users of the volume are healthy and some users of the volume are unhealthy, and it would be very hard to translate that into which pods are the unhealthy ones, but in principle it could be done. So I don't know; saying all the information has to be associated with the volume doesn't feel 100% right to me, although it might be the expedient choice.
A
I don't know what to check on the pod side, to be honest with you, what we would check. But it does have that information; it does have the pod information, it knows which pod it is. I actually don't know what, or are you saying, for example, to check the particular path?
B
The external health monitor sidecar doesn't know anything about pods. It just knows about volumes, right?
A
Okay, time check: it looks like we are at the top of the hour. It looks like we didn't finish; there's one more thing we didn't get to, but we talked about these two cases at least. So what are the next steps?
A
So then that volume also needs to be deleted, right? If we are only deleting the pod, that doesn't mean the volume will be deleted, and kubelet doesn't know it is supposed to also delete the volume that is attached. That seems a little strange, because if this is part of a StatefulSet, kubelet just seems to be the wrong place to be deleting this. I'm not sure.
G
The point is, we might not be able to guess what a controller wants to do here. So maybe the focus from the volume health side is to discuss how to update the status, like what API to add it to, either PV or PVC, and not talk too much about the controller side and what a controller would need to do, whether to delete the PV, the PVC or the pod.
G
Maybe in the next step we can start talking about that, but right now I feel like there's no standard way of doing this; it's up to the custom controller and also really to how the pod uses this volume, whether the StatefulSet controller recreates the pod, and so on.
G
So having Kubernetes do something, having kubelet do something, will definitely be a long discussion; a lot of decisions need to be made. Yeah, it will be long.
A
Oh, you're saying, okay, okay, don't disable the feature for now for the CSI audience.
A
So
so
I
think
we
right
now.
I
think
we
are
talking
about
like
adding
maybe
adding
a
field
in
either
pb
or
pvc,
so
that
is
a
first
class.
We
can
see
how
that
can
be
used,
but
I
think
it's
it's
more
like
we
are
talking
about
this
api,
but
then
we
are
not
really
going
to
add
the
reaction
directly
in
the
in
the
kubernetes
controllers
right
now
at
in
this
step,
I
think
that's
that's
what
I
heard
so
far
right.
I
Yeah, I think the suggestion is to maybe think about the Kubernetes reactions first, because the Kubernetes reactions might be more generic and reusable than a third-party reaction.
I
I think the thing is, it will be a forcing function. I think the natural reaction is to go with the path of least resistance, and in this case that would be something like: there's use case X from consumer foo, and if you focus on getting that working, it will be easier to get it working end to end, because it's a very specific use case for that consumer.
G
Okay, so the reaction has two areas. One is recovery; that's where I'm saying it's hard to have a general way that fits every case, but we can discuss that. The other is the scheduler side, which can help schedule pods better: if you know some PV or volume has issues, you can avoid scheduling onto it. That one, I feel, might be easier to think through.
A
That would kind of combine it with the CSI capacity tracking. I think that has to go together with capacity tracking.
A
Oh, okay, if that's the case, I need to take a look at this one and see where we should add this, because we actually talked some time ago about adding that in the... but it seems really weird to add it in the CSI capacity. So okay, let's talk about this one later. So for better scheduling...
A
Then
this
we
have
to
sync
this
together
with
this
at
the
second
city,
scheduler
csi
capacity
tracking
feature
and
see
you
know
where
this
field
should
be
added
so
yeah,
okay,
so
maybe
that
yeah,
I
think
that
makes
sense
in
think
about
how
to
do
a
better
scheduling
rather
than
saying
yeah
from
public
side.
This
is
a
little
messy
if
you're
like
okay,
so
I
think
we
are
out
of
time
yeah.
A
So I will schedule another meeting in the future to talk more about what this would look like in the API and how kubelet can potentially use this to recover.