From YouTube: SIG - Performance and scale 2021-09-30
Meeting Notes: https://docs.google.com/document/d/1d_b2o05FfBG37VwlC2Z1ZArnT9-_AEJoQTe7iKaQZ6I/edit#heading=h.qs7aweajr18k
A
Okay, welcome to SIG scale, September 30th. I linked the doc in the chat. Please add yourself as an attendee.
A
Okay, let's start with the first item: add all the scripts and configurations in the upstream CI/CD system to configure the perf cluster. So let's look at this.
B
Yeah, so this is to configure the performance cluster in the CI/CD system. Well, I did many tests just to verify the performance. Now I want to include the system that I was testing in the CI/CD system, and then we will have all the performance jobs that we discussed before running there. It's merged now.
B
But it's still missing some, you know, some jobs to create the cluster and do that, so I'm doing that with Federico, who's responsible for the CI/CD system. So hopefully we'll get this cluster, you know, ready soon.
A
What does this mean, like, the SIG scale cluster? Is this a dedicated job or cluster, or something that's run for things that we want to test? Can you describe this some more?
B
However, it's shared with many jobs, and I would say it's kind of impossible to have any performance job there. So we have another cluster that will run the performance tests, and the tests should be isolated; they will not be collocated with any job. We're running bare metal for the functional tests. Actually, the way that it works: it creates a VM, installs Kubernetes inside, creates a cluster, so it uses nested virtualization. So the functional tests are not regarding performance, okay, they're just, you know, functional tests.
B
So we have the dedicated cluster on bare-metal nodes that will run the performance jobs that we were discussing, that we are planning. So yeah.
A
Cool, okay, pretty cool. So eventually, once we have this cluster, we can start doing a little bit of, you know, generating some baselines, some thresholds. This is basically where we'll put a lot of those things, to test all our performance things. Okay.
C
Yep, okay, cool. All right, is that all you had there, Marcelo?
B
Yeah, just something that I want to mention: I will be on PTO from October 1st to the 30th, so the whole of October. Okay! So just to mention that to you guys.
A
Okay, we'll miss you. All right, thanks, Marcelo. Okay, let's go to the next one: memory usage of cluster-profiler on large clusters. Let's take a look.
D
Yeah, so that's me. Hi guys. So I've experimented with cluster-profiler, which was recently merged, and I had some problems with running it on large clusters. I noticed that there's nothing wrong with the start and stop request logic; they're simply broadcast. But there's a dump request, which basically works like this: virt-api gathers in memory all of the profiles from each of the KubeVirt pods, right? And I noticed that a single pod produces around nine megabytes of profiler results.
D
So I have some thoughts on how to fix this, but I was just thinking, if you guys have any input on this, how would you like to approach it? Especially David, who was the author of the original solution.
E
Yeah, this is really interesting. Sorry, there's a ton of background noise; I don't know how distracting that is to you all. Someone is chainsawing a tree behind me. I don't have any immediate thoughts other than to minimize the number of nodes that we profile at a time. So you can have some sort of selector, just to make sure we don't get all the virt-handlers.
E
For example, that's probably the one that's causing the most problems. And maybe there are some optimizations to how we gather and dump these; it doesn't require it all to be stored in memory, yeah. What were your thoughts? Because it sounds like you actually have an environment where you can do this sort of thing.
D
Yeah, so yes, I was thinking that we don't actually need virt-api to gather all of the results in memory. We can just change a bit the API by which we fetch the profiler results, right? So from the client side, we can query the API one pod at a time, for instance. Or we could change the dump request so that the KubeVirt pods actually dump the results into their volumes, and then have a client traverse all of the volumes and, you know, copy the results into the client's own file system. So these are like my two initial ideas for how we could solve it.
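To make the first idea concrete, here is a minimal sketch of a client that pulls one pod's profile at a time and streams it straight to disk, so no aggregate ever sits in memory. The per-pod `/dump?pod=` endpoint, the base URL, and the pod names are assumptions for illustration, not the real cluster-profiler API.

```go
package main

import (
	"fmt"
	"io"
	"net/http"
	"os"
	"path/filepath"
)

// fetchProfile streams a single pod's pprof dump to a local file.
// The URL scheme here is a placeholder, not the actual cluster-profiler API.
func fetchProfile(baseURL, pod, outDir string) error {
	resp, err := http.Get(fmt.Sprintf("%s/dump?pod=%s", baseURL, pod))
	if err != nil {
		return err
	}
	defer resp.Body.Close()

	out, err := os.Create(filepath.Join(outDir, pod+".pprof"))
	if err != nil {
		return err
	}
	defer out.Close()

	// io.Copy streams the response body, so only a small buffer is ever
	// held in memory regardless of how large the profile is.
	_, err = io.Copy(out, resp.Body)
	return err
}

func main() {
	// In practice this list would come from the API server.
	pods := []string{"virt-api-0", "virt-handler-0"}
	for _, pod := range pods {
		if err := fetchProfile("http://localhost:8080", pod, "."); err != nil {
			fmt.Fprintf(os.Stderr, "skipping %s: %v\n", pod, err)
		}
	}
}
```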
D
Basically,
just
just
remove
this
cluster
provider
results
extract
which,
which
is
present,
because,
as
I
mentioned
there,
I
think
it
like
won't
fit
into
memory
for
a
large
large
scale
yeah.
So
if
you
guys
don't
have
any
like
strong
opinions
on
this,
I
guess
I
just
try
to
propose
my
implementation
like.
I
think
it
won't
be
a
big
change,
and
I
just
just
see
for
any
good
diamond.
E
I
think
the
biggest
thing
I'd
like
to
preserve
is
the
ability
to
retrieve
this
information
without
having
to
be
inside
of
the
cluster,
so
just
using
standard
cube
control
over
ctl
or
qctl
or
ctl
tooling,
outside
of
the
cluster,
because
if
we,
if
we
dump
it
to
a
volume
or
something
like
that,
then
somehow
we
have
to
get
that
information
out
onto
our
local,
like
laptop,
for
example,
or
wherever
we're
going
to
be
analyzing
this,
and
if
we're
not
doing
it
through
the
api
server,
then
we
have
to
deal
with
ingress.
Somehow.
D
Yeah, yeah, okay, that makes sense. I remember that.
E
What I mean is, how useful is, for example, all the virt-handler information? Is it just as simple as adding a flag that somehow allows or denies lists of which nodes you want to collect from, or whether you want control plane only? I guess, is there a way to narrow down the amount of information that we retrieve, in a way that prevents having to send so much information? Are you actually looking at every handler's results, or do you just want one, for example?
D
Yes. So if you're talking about a scale of hundreds of nodes, it probably doesn't matter to virt-handler how many nodes are in the cluster, because it just looks at its own environment, which is a node. It only makes a difference for, let's say, virt-api or virt-controller how many nodes there are in a cluster. So I guess it makes sense as well to add a selector for only control-plane nodes.
D
So,
if
you
have,
if
you
have
such
a
large
cluster,
then
then
profiling
build
api
and
weird
controller.
That's
useful,
but
maybe
looking
into
every
weird
handler
profiler
results.
It's
it
doesn't.
It
doesn't
do
much
difference
from
the
from
the
cluster
of
size
like
few
nodes
right
that.
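Along those lines, the selector could be as simple as listing only the pods whose behavior actually depends on cluster size before sending the dump request. A minimal client-go sketch follows; the namespace and label values are assumptions based on how KubeVirt components are commonly labeled.

```go
package main

import (
	"context"
	"fmt"

	metav1 "k8s.io/apimachinery/pkg/apis/meta/v1"
	"k8s.io/client-go/kubernetes"
	"k8s.io/client-go/tools/clientcmd"
)

func main() {
	cfg, err := clientcmd.BuildConfigFromFlags("", clientcmd.RecommendedHomeFile)
	if err != nil {
		panic(err)
	}
	client := kubernetes.NewForConfigOrDie(cfg)

	// Profile only virt-api and virt-controller instead of broadcasting
	// the dump request to every virt-handler on every node.
	pods, err := client.CoreV1().Pods("kubevirt").List(context.TODO(), metav1.ListOptions{
		LabelSelector: "kubevirt.io in (virt-api, virt-controller)",
	})
	if err != nil {
		panic(err)
	}
	for _, p := range pods.Items {
		fmt.Println("would profile:", p.Name)
	}
}
```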
E
That would be my expectation, yeah; it would be smaller. I don't know. But these are all really great thoughts. I'm glad you noticed this, because I did not notice that it was that much information. Yeah, that could be pretty bad. So I guess virt-api is just going to swell in the amount of memory it consumes, and then, memory is weird like that, where once it grows it might not really shrink how much it's using. It just forever looks like it consumes a ton of memory.
D
Yeah, yeah, that's the case. It just tries to gather everything in memory, and only once it has gathered everything does it return the results to clients. So.
E
Maybe there's a way to do a zero-copy transfer of that information as well; I don't know how, because the way I structured it, I have that aggregate structure, the cluster profiler results, that stores everything. So it might not be very practical either. Also, what's the biggest result? There are lots of different results that are returned; is there one? Is it the memory profile that's the biggest, if it's just the memory, like the heap?
D
Yeah, yeah, that's true, but at some point someone might want to use heap and allocs and whatnot. So I guess, having a solution which has all of the profiles and then doesn't use that much memory, it's like feasible. So let's try it, and maybe have the selectors on the type of the pod and the type of the profile, somewhat optional, let's say.
D
So you see, you have a map keyed by each of the pods, and as a value just the results of the profiling. So if you iterate over 500 pods or even more, then, you know, when you're reaching the second half of the pods, you get out-of-memory errors.
B
You know, like, you get the first 10 nodes, then you get from the other nodes, something like that, instead of getting everything at the same time. Of course, it will be a different timestamp that you are going to measure, but it's a way that maybe you can just, you know, have less information in one dump. So.
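A sketch of that batching idea, with an assumed dumpBatch callback standing in for whatever actually issues the dump request: profiles are gathered a few pods at a time, so any single request stays small, at the cost of batches carrying slightly different timestamps.

```go
package main

import "fmt"

const batchSize = 10

// dumpInBatches walks the pod list in fixed-size windows so that no single
// dump request has to aggregate results for the whole cluster at once.
func dumpInBatches(pods []string, dumpBatch func([]string) error) error {
	for start := 0; start < len(pods); start += batchSize {
		end := start + batchSize
		if end > len(pods) {
			end = len(pods)
		}
		if err := dumpBatch(pods[start:end]); err != nil {
			return err
		}
	}
	return nil
}

func main() {
	pods := make([]string, 25)
	for i := range pods {
		pods[i] = fmt.Sprintf("pod-%d", i)
	}
	_ = dumpInBatches(pods, func(batch []string) error {
		fmt.Println("dumping", len(batch), "pods:", batch)
		return nil
	})
}
```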
D
Yeah
yeah,
that
makes
sense.
I
think
you
would
just
have
to
agree
on
a
subset
of
filters
which,
which
we
should
implement,
because
you
know
we
have.
We
can't
filter
from
the
by
the
node
name,
but
the
number
of
nodes
type
of
like
weird
spot,
but
that's
just
maybe
too
much
work
too
much
effort,
which
eventually
could
be
reachable
by
some
simpler,
simpler
filter.
E
My
advice
here
is
to
pick
the
filter
that
makes
most
sense
for
your
use
case,
so
a
selection
of
nodes
that
only
profile
cuber
components
that
live
on
these
set
of
nodes
works,
for
you
then
implement
that
if
you
just
want
to
profile
specific
instances
like
only
cluster
controllers
or
cluster
controllers
plus,
I
don't
know
this
vert
handler
one
bird
handler,
I
don't
know
or
if
you
want
to
limit
the
amount
of
information
you
give
back
to
say,
only
get
back,
cpu
results
and
not
like
the
heap
and
alec.
E
You have a lot of say in what's best.
D
Yeah, okay, so I'll have to think about it, and I'll just propose something.
E
Sounds great, yeah, and thanks for bringing this up. I'm glad that you are starting to use this. Did you get any results that were actually useful? I'm curious if you were able to act on any of this, or is it still more experimental: trying to get back some cluster profiles and see what's useful in them?
D
Not yet. So I gathered a few profiles, but I have actually spent some time recently trying to use go tool pprof, this Go tool which helps you visualize the profiles, and Go is complaining about the binary missing, so it can't actually correctly interpret the flow of the graph, you know, which function calls which, how deep. For instance, where the memory heap grows, where the allocs are, and so on. So I'll have to work on this and see if that's actually a problem with, I don't know, maybe my setup, maybe my cluster, or maybe that's something which needs to be done on the KubeVirt side, on the profiler.
E
Does it execute at all for you, or does it just say you can't read the results?
D
It
says
it
says:
main
binary
is
missing
and
the
graph
is
just
a
one
note
graph
or
like
something
like
this
so
yeah.
But
I
just
try
a
few
different
things
and
because,
if
you,
if
you're
saying
that
yeah
it
used
to
work
for
you,
but
maybe
that's
something
wrong
with
my
setup.
So.
A
Cool, thanks so much. It would be really cool, yeah, to see some of those graphs; that would put together a bunch of cool images, and I'm sure we'll learn a lot from that. Okay, cool, thanks, Tomas. I wrote it as the item here, with the action item that we can go on: it's "add a filter to limit the number of nodes/pods that we can gather info from", okay. Next, let's go to the VM pool discussion from David.
E
Yeah,
so
I
don't
want
to
spend
the
whole
time
on
this.
Like
I
did
last
time,
I
wanted
to
make
you
all
aware
of
a
couple
of
changes
and
a
couple
of
things
that
I'm
thinking
about
as
kind
of
we're
getting
close
to
finalizing
this
design,
but
first
off,
can
you
guys
hear
me:
is
this
like
background
noise?
It's
so
distracting
that
it's,
I
need
to
read.
Yeah.
E
It's
driving
me
nuts,
there's
somebody
with
a
chainsaw
right
outside
my
window
and
I
don't
know
if
I'm
gonna
be
able
to
get
through
the
day
but
anyway,
all
right.
So
the
biggest
change
that
I
made
to
the
design
was
that,
after
talking
to
roman
and
really
talking
to
others,
they've
kind
of
brought
up
a
few
times,
I've
converged
the
virtual
machine
config
into
the
vm
pool
and
made
it
like
a
template
section
similar
to
how
deployment
sustainable
sets
work.
Where
you
define
you
want
deployment
with
so
many
replicas.
E
I have a VM pool where you define the replicas, everything you want, and you actually have the template of the VM within the VM pool. And I was hesitant about this until I began thinking about the use cases that I wanted to keep them separate for, and I don't think they make sense. So, the reasons I wanted to keep them separate: I was thinking you'd have a one-to-many relationship, where you'd have one VM config matching multiple VM pools, and the reason you'd want to do this...
E
This
is
perhaps
graduating
a
vm
config
to
production
and
turns
out.
I
just
don't
think
that
makes
any
sense
at
all,
because
vm
configs
are
going
to
be
namespace
scoped
you're
not
going
to
have
your
prod
and
staging
and
dev
environments
in
the
same
name,
space
and
it's
it
doesn't
make
any
sense
practically
in
the
kubernetes
environment.
To
me
anymore.
E
The
other
reasoning
I
came
up
with
was
you
could
have
versioning
of
your
beam
configs
and
you
could
have
a
vm
config
that
you
sign
one
version
to
the
vm
pool
and
then
create
a
new
config
where
you
have
assigned
the
next
version,
and
you
could
roll
back,
but
deployments
already
have
this
kind
of
behavior,
where
we
save
a
history
of
like
the
transaction
history
of
all
the
different
changes
that
have
occurred
to
the
deployment
and
there's
a
revision
history
associated
with
each
one
of
those
changes,
and
you
can
roll
back.
E
And
if
we're
going
to
do
that
kind
of
behavior,
I
should
probably
align
with
how
other
kubernetes
primitives
work
today.
So
that's
my
thoughts.
I
think
it
makes
the
bm
pool
spec
way
more
complex.
Looking
because
we
have
both
the
tunables
related
to
how
to
manage
all
these
virtual
machines
with
the
virtual
machine,
spec
itself
and
it's
kind
of
verbose,
but
maybe
that's
just
the
nature
of
what
we're
dealing
with
any
thoughts
about
combining
this
virtual
machine
config
with
the
vm
pool.
Does
that
sound
okay
to
anyone?
Everyone.
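For readers following along, here is a rough sketch of the converged shape being described, with all type names assumed rather than taken from the final KubeVirt API: the pool embeds a VM template the same way a Deployment embeds a pod template, instead of referencing a separate config object.

```go
package v1alpha1

import (
	metav1 "k8s.io/apimachinery/pkg/apis/meta/v1"
)

// VirtualMachineSpec stands in for KubeVirt's real VM spec (which itself
// nests a VirtualMachineInstance template, hence the layered templating
// mentioned later in the discussion).
type VirtualMachineSpec struct{}

// VirtualMachineTemplateSpec mirrors corev1.PodTemplateSpec: metadata plus spec.
type VirtualMachineTemplateSpec struct {
	metav1.ObjectMeta `json:"metadata,omitempty"`
	Spec              VirtualMachineSpec `json:"spec,omitempty"`
}

// VirtualMachinePoolSpec plays the role appsv1.DeploymentSpec plays for pods:
// the replica count and management tunables live next to the embedded template.
type VirtualMachinePoolSpec struct {
	Replicas *int32                     `json:"replicas,omitempty"`
	Selector *metav1.LabelSelector      `json:"selector,omitempty"`
	Template VirtualMachineTemplateSpec `json:"template"`
}
```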
A
It's interesting, yeah. I mean, I think I've gone back and forth on this idea. So what I liked about the thing was, maybe the last thing you mentioned: that it does simplify.
A
It does simplify the VM pool object quite a bit. I know when I was originally thinking about this idea, I think I used either the VM template, or like a running VMI, and used the object reference as a way to take that thing and just kind of multiply it. Yeah, I mean, I think where I'm going with this is: yeah, having it in there does make it more complicated, and only more complicated. It just makes it more complex, and yeah. I liked the idea of having some sort of reference before, but the technical reasons for it, really, I don't know; other than just simplifying, that was really the only reason that I had. Maybe easier to read, but that was kind of it.
E
It felt better to me to have them in separate resources, for the similar reason that you're talking about, where I felt like it was easier to kind of grok what was happening. But I don't know if that's the case or not. I think that these are probably just expected usage patterns in Kubernetes now. So I'm not sure that anyone would find it helpful; it's more confusing that these two objects exist rather than one object, when they're used to standard Kubernetes primitives, where one object would exist and kind of embed the thing that you're going to replicate.
G
Yeah, it's always... what's Fabian calling it? Roman, your mic's doing that thing. Yeah, yeah, Fabian always...
H
...called it the cookie cutter pattern, which everyone expects, yeah. So, like, here is the thing that will be used, and here is how often you will get it, and you have it on Deployment, StatefulSet, DaemonSet. It's always the same thing, and also on other controllers which bring their own CRDs, yeah. Yes, yeah, I can see. I mean, the object is big; that's really not so nice about it.
E
I
think
the
thing
I
I
dislike
most
about
it
being
embedded
is
we
have
like
layered
templating,
so
you
have
a
virtual
machine
template
and
you
have
the
virtual
machine,
instant
template
inside
of
that
and
it's
just
kind
of
maybe
you
call
it
stanza
or
something
I
kind
of
hate.
I
hate
the
nested
part
of
this,
but
I
can't
do
it.
H
And
it
potentially
opens
very
understandable
support
for
commands
like
cube,
cuddle,
roll
back
and
so
on,
yeah,
because
yeah
they
are
nowadays
mostly
generic,
so
that
everyone
can
make
use
of
them.
E
No strong feelings? Nobody's offended by this? Okay, all right. So the last point I had here: Roman and I were talking about this ordered selection for scale-in, and the different selectors and things like that, and I want to make sure that we document and have kind of strong use cases for why we need these custom selectors and things like this to exist, for selecting the virtual machines that are going to be torn down during the scale-in process. Ryan...
E
I
know
this
is
one
of
the
features
you're
most
interested
in.
Do
you
have
like
a
real
world
example
of
like
how,
in
practice
you
might
use
this.
A
Yeah, well, we can... let me see, I'm gonna go to your document; you have them all captured in here, I think.
E
Look at that, I had an example, I think.
A
Is this it? Here? Yep.
E
That's the one; we can work with that. There's also one in the... let's see... yeah, I have one, automatic, okay. So let's look at the example that you have. The exact same example is in the document today as well, though. Okay.
A
Yeah, okay. So, the label selector. Okay, so, for everyone, for background: as we're scaling in VMs as part of the pool, we have taken the number of replicas down from, say, 100 here to, like, 90. So which are the VMs that we are going to choose to terminate? The order policies are selected in order; first is the label selector here. So, okay.
A
So the idea behind this is that, as an admin, I know I have VMs running, but they're not serving a lot of traffic, or they're not in use by someone, a customer or something. So I know that those are safe to terminate, but they're still running, because, you know, I want to have them around, running already, in case someone shows up and I can just provide them with a virtual machine. So they're not important, so I can remove them first during the scale-in process, before I remove one that is being used by someone.
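As a sketch of the selection behavior being described (the label key and the oldest-first fallback are assumptions for illustration, not the proposed API): victims matching the ordered label selector are taken first, and a base policy covers whatever remains of the scale-in delta.

```go
package main

import (
	"fmt"
	"sort"
	"time"
)

type VMI struct {
	Name    string
	Labels  map[string]string
	Created time.Time
}

// pickVictims returns the VMIs to delete when scaling the pool in by count.
func pickVictims(vmis []VMI, count int) []VMI {
	victims := []VMI{}
	rest := []VMI{}
	for _, v := range vmis {
		// Ordered policy 1: VMs the admin has marked as safe to terminate.
		if v.Labels["pool.kubevirt.io/priority"] == "expendable" && len(victims) < count {
			victims = append(victims, v)
		} else {
			rest = append(rest, v)
		}
	}
	// Base policy (catch-all): oldest-first, one of several possible choices.
	sort.Slice(rest, func(i, j int) bool { return rest[i].Created.Before(rest[j].Created) })
	for _, v := range rest {
		if len(victims) == count {
			break
		}
		victims = append(victims, v)
	}
	return victims
}

func main() {
	now := time.Now()
	vmis := []VMI{
		{Name: "vm-a", Labels: map[string]string{"pool.kubevirt.io/priority": "expendable"}, Created: now},
		{Name: "vm-b", Labels: map[string]string{}, Created: now.Add(-time.Hour)},
		{Name: "vm-c", Labels: map[string]string{}, Created: now.Add(-time.Minute)},
	}
	fmt.Println(pickVictims(vmis, 2)) // vm-a (label match), then vm-b (oldest)
}
```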
E
Okay, got it; that one makes sense to me. Do you think you would ever need this kind of ordered...?
A
It's a good question. I think it's hard to say. Like, I think right now it's sort of like an on-or-off switch, but I could very well see the case where it needs to be more than that: if, you know, I had to make a choice between two bad options, and I knew one was worse than the other.
E
Okay, that makes sense. I think I can get behind that. So that's the label selector; I can write a use case for that, and I think we're good. So, the node selector. Yeah, go ahead.
A
Anyone could, like an admin. So what I was thinking, when I was talking about these cases, was that we'd write some operator to do the labeling, or there would be some controller that would do it. But an admin can do it, really anyone that has access to this API can do it; I just figured it'd be done automatically.
A
Well, since we're doing deletion, if we did it that way, that would sort of be the reverse of the order, I think. Because, I guess, to kind of start from the bottom here: the whole idea is that we need to have... well, I mean, I guess we could do it that way, but the way we have it here is that we have one policy that's always going to be true. I guess, if you just do your suggestion, we would just look at the list in the reverse order.
E
That's difficult, though, because you have new virtual machines coming online; you have to immediately mark them as "don't delete them", or... I don't know.
B
Because I'm just thinking, who is going to mark it as "it's not important", you know? So it will be all the nodes marked not important, and then you remove the nodes that are not important. I'm just thinking about the workflow; it's not important because it's not running important workloads on it. So I'm just thinking about the administrator, you know, the logic that someone is going to use there.
H
I
guess
one
example
would
be
like:
okay,
there
are
less
than
five
people
logged
in
on
these
machines.
They
should
be,
should
go
first
or
there's
no
one
locked
in
right
now,
so
they
are
preferred
if
you
scale
down
and
that
can
be
done
automatically
just,
for
instance,
the
guest
agent
can
report.
There
is
no
one
logged
in
you,
see
that
and
you
mark
it
yeah.
A
Right, yeah. I mean, I can kind of live with that idea, but mostly, yeah, like, being able to... I don't know. To me, logically, I'd mark the ones that don't matter, but yeah, we could go... I don't know, I mean, we could go either way, I mean, I could...
A
I
think
we
can
also
solve
for
that
use
case
marcelo,
just
by
simply
putting
the
the
most
important
at
the
bottom
right
I
mean
I
can
I
can.
I
can
deal
with
it
both
ways.
H
What
I
also
wanted
to
ask
here
on
this
section
is:
there's
a
base
policy,
all
this,
for
instance,
what
I'm
used
to
from
other
kubernetes
objects
which
try
to
the
the
best
candidate
is
that
they
have
quite
a
lot
of
quite
a
lot
of
criterias
and
how
to
test
it
like
the
oldest
first,
but
also
the
oldest,
not
ready
ones,
first,
for
instance,
so
there's
kind
of,
and
that's
not
the
only
one
there
also.
Then
it
also
considers
how
long
of
how
long.
E
Roman, I'll just speak to it, because I can't hear you. It's meant to be a catch-all, a base policy, where you go through the order policies and then, if nothing hits and you set the base policy, it would land on that, and we can set a lot of different options for that. It could be oldest, it could be newest, it could be oldest-and-not-ready, I don't know, like, anything.
H
That's the question for me. Like, on a Deployment and ReplicaSet, it normally takes the newest ones first, because they have the least amount of ready time, and then it takes the ready ones: the ones that passed the readiness probe it takes last, and so on, right. There is quite a lot of that.
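For reference, a simplified sketch of the ReplicaSet-style deletion ranking being alluded to (not the actual controller code, which weighs several more criteria): not-ready pods are deleted before ready ones, and among equals the newer pod goes first.

```go
package main

import (
	"sort"
	"time"
)

type candidate struct {
	Name    string
	Ready   bool
	Started time.Time
}

// rankForDeletion orders pods so that the best deletion victims come first.
func rankForDeletion(pods []candidate) {
	sort.Slice(pods, func(i, j int) bool {
		// Not-ready pods are preferred victims over ready ones.
		if pods[i].Ready != pods[j].Ready {
			return !pods[i].Ready
		}
		// Among equals, delete the newer pod first (least runtime lost).
		return pods[i].Started.After(pods[j].Started)
	})
}

func main() {
	pods := []candidate{
		{"old-ready", true, time.Now().Add(-time.Hour)},
		{"new-unready", false, time.Now()},
	}
	rankForDeletion(pods)
	// pods[0] is now "new-unready".
}
```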
A
Okay, yeah. That was another thing we talked about, David. That was like... that could be an optimization, or, I don't know if you have that in the doc or something, or if it was something that could be turned on here, or if it's something that's just assumed; what did we end up with?
A
Okay, let's see. Sure, so my thought behind this is that, if I am monitoring my node health, there are going to be times when I have a node that's unhealthy, and my intent is going to be that I kind of want this node to be drained, because it's unhealthy and I don't know what's happening there. I'd really like my workloads to scale down, so let's target those at a higher priority than the ones on nodes that I know are healthy.
E
That's interesting. So the example is really a node that's being... Will we say that the node's being drained, so we're trying to shut down a node or something like that, or are we saying that you've detected that this node is acting strange, so you want to select it? What would be the difference between a node selector and some sort of automation that labels every VMI or VM?
A
Yeah,
so
with
the
label
selector,
I
would
expect
I
expect
to
like
the
sort
of
the
level
of
granularity
to
be
or
sort
of
the
level
of
protection
to
be
based
on
like
these.
Are
you
know,
vms?
That
are
just
that.
I
don't
care
that
much
about
so
I'm,
first
and
foremost
like
that's
fine,
let's
just
get
rid
of
them.
I
don't
really
care.
If
I
need
to
then
make
another
choice,
you
know
I've.
I've
went
through
those.
If
I
have
a
node,
you
know,
maybe,
for
whatever
reason
it
could
be
unhealthy.
A
It could also be that the hardware is not as good. It could be a number of things that I'd want to use to distinguish it, to be killed next. Anything that I run on that node... maybe there are specific types of VMs I run on that node, and those, you know, are the ones that I'd want to kill next. So I'll use...
A
I'd use those nodes as my way to distinguish, instead of a set of labels. I mean, I could use labels here, but I think the idea is that the addition to this would be that it's like the node health: something is going on with the node, or something is different about this node, such that I would rather go to that node next to kill. Okay.
E
Maybe there's older hardware you want to phase out over time, or things like that, and you have the opportunity to begin draining things off of that node in a natural way. I guess, what would be the difference between... well, I'm trying to think if that's the accurate way of doing that, though, or if you'd want to mark that node as unschedulable and begin shutting down the workloads on that node, if it is gonna be taken out of rotation or something like that.
A
Yeah, like, kind of the scenario I see is that, at this point, I may have marked it as unschedulable, and, you know, I may even be attempting to evict at this point, but I'm in a crunch now, because, you know, for whatever reason, I need to bring down my number of VMIs.
A
It
needs
to
be
scaled
down,
and
so
at
this
point
I
I'm
just
deciding
that,
like
okay,
I've
had
enough
like
these
workloads
just
have
to
go,
you
know
they're
be
blocking
because
of
eviction.
You
know
pod
whatever
disruption,
but
something,
but
now
we're
we're.
We're
deciding
like
it's
time
to
it's
time
to
remove
these
these
vmis,
and
this
would
be
like
you
know,
it's
sort
of
the
easier
way
to
do
it.
So.
H
I
think
this
may
collide
with
a
few
mechanisms
which
you
can
already
use
like.
One
is
the
unscheduled
label.
Marketing
is
unskippable,
another
one
is
having
having
on
the
vm
pool,
template
affinity,
affinities
or
undefinities
to
specific
labels
like
you
would
like
you,
you
just
decide
once
on
the
pool
or
on
your
on
yeah,
for
this
pool
that
whenever
a
specific
label
appears
on
a
node,
you
prefer
that
new
vms
are
not
scheduled
there.
H
Then
you
get
that
kind
of
automatically
there
it's,
because
what
I'm
a
little
bit
what's
a
little
bit
unclear
with
me
on
for
me
on
the
selection
policy
is
what
is
done
next,
so
you
say:
set
the
node
select
there,
but
what?
But?
How
would
you
then,
for
instance,
what's
the
intention?
The
next
intended
step
like
do
you
expect
that
new
vms
are
also
then
not
created
there,
so
that
there
is
an
empty
affinity
automatically
added
to
or
is
it
independent,
that's
kind
of
hard
to
get
from
justice
yeah?
A
Yeah,
I
think
I
think
in
yeah
I
think
in
general,
so
in
general
like
if
I'm,
if
I
like,
just
assume
this
is
the
only
pool
in
my
cluster.
If
I'm
scaling
it
down,
I
wouldn't
I
wouldn't
expect
any
more
vms
to
land
there,
because
I'm
not
creating
any
more.
So
it
could
be
a
factor
like
that.
A
So
that
could
be
true
that,
like
we
like
just
because
of
this,
as
being
my
only
pool,
I
wouldn't
expect
anyone's
to
land
in
there
anyway,
and
but
it
also
could
be
the
case
that
I
I've
marked
it
in
a
schedule.
It
could
be
either
one.
So
I
would
be
forceful
and
then
you
know
you're
not
no,
no
one's
going
to
land
there.
But
if,
if
I
have
a
sort
of
a
fixed
count,
then
then
I
know
anyone's
going
to
land
there.
A
So, okay, I guess, like: if I'm scaling in, I could possibly remove from node two, like, yeah, I could remove from node two, and let's say we scale up again: a VM could land on node two, and, like, I'd be okay with that. It would be up to sort of me to decide, like, okay...
A
If
it's
the
node's
marked
on
the
schedule
or
not,
I
think
I
think
maybe
the
way
to
like,
like
the
way
I'm
kind
of
looked
at
this
is
that
we
have
label
selector.
This
could
technically
classify.
We
could
like.
We
could
label
a
vmi
anything
it
could.
This
could
capture
every
use
case.
What
this
does
is
it
it's
sort
of
it's
a
subset.
It
sort
of
allows
me
to
not
have
to
have
to
label
everything
it
can.
I
can
also
distinguish
this
way
by
note.
A
It's
sort
of
like
a
way
I
could
without
having
to
deal
with.
You
know
the
cases
where,
like
basically
writing,
a
controller
that
labels
based
on
nodes
and
then
you
know
having
it
in
this
field.
I
could
just
use
this.
It's
as
a
way
of
doing
it
and
for
all
those
reasons
before
like
you
know,
because
maybe
the
hardware
is
different,
maybe
because
I
know
it's
not
in
a
good
state
on
you
know
the
nodes
that
needs
to
be
remediated,
whatever
any
reason
like
that.
B
Hardware
shouldn't
it
have
like
also
label
like
if
the
hardware
is
different
or
if
you.
B
A
What I'm saying, yeah, like, what I'm saying is that these could fall under the label selector. But what I'm saying is that it's more convenient to have a field that explicitly states, like, okay: we can use VMs on this node as the next ones to be killed, instead of having to create a label and mark the ones per node and then effectively doing that, you know? Like, effectively, like, I could...
A
Well, so, the idea that it's, like, you know, different hardware, different node states, sort of... does that make sense as a reason why? Like, to me it does: when I'm scaling down, I want to take those at a higher priority on my kill list, because, you know, the node is just not healthy.
A
Yeah, I agree. I mean, you should remove the node, I mean, but, like, what I'm saying is, like: if I'm managing this using a pool, and I notice that there are VMs running on this unhealthy node and I need to scale down, I would rather target those, as opposed to a healthy one. Like, what's my next option? It's the oldest one; well, I could be targeting a healthy VM there. I would rather target these.
H
With healthy and unhealthy nodes, what you normally will see there, and I'm not sure if the scale-in helps you there, is that nothing happens automatically. So you could have the node selector there, and your pool controller would potentially delete the VMs, but they would be stuck, because the node is not behaving properly, so the objects don't get cleaned up by the kubelet. You get no confirmation; you have to do forced deletes. So I'm not sure if that helps you, or if you need, like, a fencing agent or something.
A
Yeah, yeah, so in that case, like, I'm okay. I was gonna bring that up when you were saying that, because, yeah, like, I'm okay with that, because now my pool has, like, a correct understanding of, like, the cluster state. I can handle that node's remediation, like, separately. I can be like, okay, that node's a problem, we're going to deal with it separately. I don't want those VMs in my pool here; I'd rather just start a new one.
A
You know, I'd rather, like, if I'm scaling down, I'd rather get rid of them, and then, when I scale back up, you know, I'm not going to be like... those are gone, like, I don't care about them. What I would rather do is get rid of one that I know is on a misbehaving node, then, yeah.
H
Yeah, all I mean is, with scale-in you would probably have to somehow... then you're changing the meaning of the node selector; it has a different meaning. So one meaning is, okay: when I scale down, delete these VMs first; once they're done, you create new ones, whatever. But what I just wanted to say is, they do not go away; you still have them in the inventory if the nodes have issues, right. What would probably help you...
A
You know... I don't quite know what you mean. Like, my understanding on this is: if I know that this VM... we'll make the assumption here that it's on a node that's not responding. If I know this VM is bad... When I say inventory, I mean the VM pool's inventory: if I know that the VMI that I'm holding in this VM pool inventory is on a node that is not responding...
H
So, when you scale in, take them first. But you also mentioned something about unhealthy nodes, and there you have the issue that you would have a different meaning for the node selector, because it would mean that these nodes are unhealthy, and potentially you cannot even delete the VMs there, because you don't know their states, but you want them ignored now.
E
I'm gonna make a suggestion here, especially for the sake of time and getting this worked on at some point. Virtual machine pool: why don't we follow up with the node selector, or whatever this is, after the initial implementation takes place? And maybe that gives Ryan...
E
...for example, you, a chance to adopt VM pools and kind of discover the cases where you would want to select things in different orderings, like, in the real world, and then we can work through those exact scenarios. Because I'm not sure the node selector is clear to me.
E
So
what
I'm
proposing
here
is
to
take
note
selector
out
of
the
design
document
for
now
keep
these
ordered
policies,
they'll
just
be
label
selectors
and
keep
the
base
policy
with
the
understanding
that
we
can
expand
the
ordered
policies
to
include
things
like
those
selector
or
perhaps
something
more
accurate
for
exactly
what
you're
trying
to
target
in
the
future.
Does
that
sound,
reasonable.
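In API terms, the pared-down proposal might look roughly like this sketch; every name here is an assumption for illustration, not the final KubeVirt API. Ordered policies are plain label selectors walked in order, and the base policy is the catch-all discussed earlier.

```go
package v1alpha1

import (
	metav1 "k8s.io/apimachinery/pkg/apis/meta/v1"
)

// BasePolicy is the catch-all applied when no ordered policy matches.
type BasePolicy string

const (
	BasePolicyOldest         BasePolicy = "Oldest"
	BasePolicyNewest         BasePolicy = "Newest"
	BasePolicyOldestNotReady BasePolicy = "OldestNotReady"
)

// ScaleInStrategy selects victims by walking OrderedPolicies in order and
// falling back to BasePolicy for whatever remains of the scale-in delta.
type ScaleInStrategy struct {
	OrderedPolicies []metav1.LabelSelector `json:"orderedPolicies,omitempty"`
	BasePolicy      BasePolicy             `json:"basePolicy,omitempty"`
}
```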
A
Yeah, I think, like I said before, with the label selector we can do everything that node selector currently does. All those cases that I mentioned can be covered by the label selector in one form or another, yeah. I mean, it's just... I think, well, the only thing here is, for all the different reasons I mentioned, it would be a convenience. But this is something that, like I said, we can expand on, I think.
H
I
think,
and
I
can
understand
all
the
cases
you
brought
up-
I'm
just
not
sure
if
it's
really
convenient
also
in
operation,
if
you
put
them
all
into
this
node
selector.
So
I
think
it's
great
if
we
have
the
chance
to
discuss
the
use
cases
for
this
separately
to
see
where
it
best
suits,
because
I
think
it
will
not
all
end
up
there
but
yeah
an
exciting
general
yeah.
So.
E
There
might
be,
for
example,
ryan.
If
we
we
see
the
use
cases
that
the
node
is
unhealthy
or
unresponsive.
We
can
start
saying
all
right
so
vms
that
are
running
on
nodes
that
are
not
reporting,
like
their
health
check
and
everything
target
those
first
and
then
that's
like
a
catch-all
where
you
don't
have
to
actually
list
your
nodes
and
this
node
selector
we're
just
gonna.
Do
the
right
thing
dynamically.
A
Yeah
just
to
make
it
more
easy:
okay,
yeah,
like
I
said,
label,
selector,
no
matter
what
is
serviceable
and
if
we
need,
if,
if
there's
when
we
talk
about
like
state
and
readiness
for
like
this
yeah
like
that,
could
be
a
whole
large
discussion.
I
mean
you've
already
talked
about
the
state
of
the
vmi
and
node
state.
There's
another
one
yeah.
We
could
include
that
in
there
that's
how
we
want
to
terminate
automatically
yeah,
that's
possible.
Yeah,
I
mean
yeah.
I
mean
the
only
other
thing
I
like
I
mentioned.
A
Like
I
said
hardware
yeah
I
mean
again,
we
could
do
labels
for
that.
So
I
think
at
least
for
the
for
the
time
being
like
this
would
be
the
thing
that
could
solve
the
use
cases
either
way,
and
then
you
know
if
it
becomes
something
that
whatever,
if
it's
just
a
pain
like
we
just
need
to
expand
it,
because
we
have
a
clear
use
case
on
whatever
labeling
or
something
you
know
based
on
nodes.
Then
then
that's
fine.
We
can
talk
about
this.
E
Okay,
so
I'll
capture
this
in
the
document
I'm
going
to
take
out
the
node
selector
and
the
examples
I'm
going
to
make
a
note
about
this
discussion
and
kind
of
the
future
thoughts
on
what
need
to
what
we're
going
to
look
at
after
it's
like
a
follow-up.
I
guess
this
is
a
I'm
documenting
that
this
discussion
has
taken
place
and
that
there'll
be
a
follow-up
on
how
to
handle
this
after
the
base.
Implementation
lands.
Okay,.
E
I
think
this
is
really
close.
I
think
this
could
probably
be
worked
on
like
in
the
next
week.
We
just
I'd
just
like
to
get
some.
I'm
gonna
finish
out
this
last,
hopefully
last
revisions,
and
then
I
think
I'd
like
to
get
one
final
final
round
of
feedback
where
people
give
hopefully
looks
good
to
me,
and
then
this
might
be
something
I
can
start
working
on
cool.
A
Nice, okay, all right, cool. We got some... just in here, good. And then, all right, I think we're at time. Any final thoughts here, in the last few seconds, from people before we conclude?