From YouTube: Kubernetes SIG Node 2021-06-29
Description
Meeting Agenda: https://docs.google.com/document/d/1j3vrG6BgE0hUDs2e-1ZUegKN4W4Adb1B6oJ6j-4kyPU
A
Thanks Dawn, go ahead. This is the June 29th SIG Node weekly meeting. Looks like we have a pretty packed agenda for today. Let's kick it off with the PR initial update. I believe it is going to be Sergey or Elana.
B
Yeah, sorry, I didn't prepare this yet. I can go at the end of the meeting.
C
Good. From the bug perspective, we squashed a lot of bugs. I don't know if you want that update.
C
What were the specific numbers? I posted them in the Slack as well, but we definitely closed over 100 bugs when we squashed bugs last week. So thanks everybody who helped out with that. Let's see what the exact numbers are; Sergey made some graphs too, which were really great. Yes, we closed 136 issues and we updated over 200. I should say we updated over 200 that were still open; I don't think that counts all of the ones that we closed as well.
C
So we did a lot of work. Thanks everybody who participated in that.
C
Oh, it's me. I thought that Vinay was in front of me, yeah. I think that I just need, like, final approval. Jordan finished up the API review and I'm just pushing out some stuff to fix nits, but other than that, I think it's basically good to go. Seth reviewed the implementation a couple weeks ago, and there were no really big changes.
D
Yeah, so I think Mrunal and myself were drafted as approvers for this when we did the KEP review, so we'll do that this week.
C
Yeah, Derek, if you could take a look as soon as you feasibly can. I know there are a lot of folks who are kind of chomping at the bit to be able to test this, and until it merges it's hard to test. So basically, the sooner we get it in for the alpha or beta releases for 1.22, the better.
A
Cool. So the next one, again, I think it's you, Elana: requesting node approver.
C
Oh wow, the whole agenda is just me today. Yeah, I have requested approver access; I hope that's okay with folks. Somebody pestered me saying, "You have reviewed something like 500 node PRs. Why are you not an approver?" So I said okay, I will apply. I think I'm waiting on Dawn and some of the folks from Google to take a look at the request.
A
That sounds great. I think, other than the bug scrub — I think everyone knows about the bug scrub — there's one more thing that I'd like to add which a lot of people might not know.
A
Iran has been leading and facilitating the 1.22 release leads cohort training group, which I've had the opportunity to experience kind of first hand, and that has been a great learning opportunity for me as a representative from SIG Node. There are representatives from other SIGs as well, and this is part of the contributor ladder growth program. So if people want to sign up for the following releases, it'll be a great learning opportunity.
D
Like the growth path for SIG Node — others have expressed interest as well, and so one of the things I wanted to do was get the existing set of approvers together to write down maybe their path to approver, and maybe some common criteria that we can use to evaluate these things going forward. And then hopefully we can meet this week and proceed from there.
D
Like Swati, I think everybody else appreciates all your effort in the SIG, and yeah, hopefully we can find a path forward.
E
Yeah, so we don't have a clear written requirement; that's been an issue in the past. That's why SIG Node, and also some other SIGs, have something written down, but it's not very clear. One issue we realized in the past is that some of the big SIGs, like SIG Node, are large enough that approver means three different roles, and so there's no final, granular definition. That's why this is missing.
E
And
then
we
can
come
back
because
we
think
about
the
a
fair
fox
includes
avalanche
on
the
past
towards
that
one,
and
we
should
make
that
more
clear,
and
so
we
have
a
constant
standard
for
the
community
can
follow
and
also
it's
written
down
publicly
and
then
we,
the
people
can
follow
and
they
can
really
just
self
exams.
They
are
where
they
are.
E
So
just
yeah,
so
we
should
do
this
so
derek
and
I
just
before
the
meeting
say
we
should
address
this
problem
and
hopefully
this
week
we
can
get
together
to
start.
D
And
then
just
in
transparency,
one
of
the
concerns
is
that
public
intersects,
with
storage
and
networking
and
to
some
degree
scheduling
for
what
it
embeds
so.
D
We
just
have
to
work
our
way
through
that
sensitive
to
those
six,
so
we'll
look
to
get
that
written
down
and
yeah.
So
thanks
for
coming
stepping
forward
alone,.
C
Yeah, one of the things that I just linked in the chat: there exists a contributor ladder and path and whatnot, but the requirements to become an approver according to the project are, I think, not aligned with what a lot of the larger SIGs actually expect. They list requirements like, "oh yeah, if you've reviewed 30 PRs you're good to go, and if you've been a reviewer for three months" — and I recognize that, for certain chunks of code, that's really not sufficient.
D
Yeah, I think we'll try to get something more granular or appropriate, and actually I would like everyone in the SIG here to give feedback on it as well.
D
We have to find a way to balance these things. So for those approvers on today's call, I'll look to set up some time, and then maybe in next week's SIG meeting we can review and evaluate both Elana's proposal and any others who want to come forward with that community standard. So thanks again, Elana, for helping us to get this clarified.
F
Yeah, so over the past week, I got the core implementation mostly completed.
F
The only piece that remains in the core implementation is changing the scheduler and resource quota accounting to use the max of requests and resources allocated, where in the previous iteration of the design we had it at resources allocated. The new piece that came in, besides the change from spec.resourcesAllocated to status.resourcesAllocated, is the resize, which was easier than I thought, so it is working. And the good thing is that we have the e2e test suite, which is pretty comprehensive from the last iteration of the design. I think Chen Wang is adapting it to cater to the updated API, and it looks like she's making great progress on that.
F
So we have a certain degree of confidence, with the automated e2e tests from the previous iteration, that we can rely on the quality of the changes that we've made. That said, I'm planning to get the scheduler and resource quota changes looked at closely, because I don't want to just port this; I want to look at it and make sure that it makes sense with the max. And I was wondering if Lantao — I know Mrunal had a comment about CRI.
F
I'm
wondering
I
don't
know
if
tim
hawkin
got
a
chance
to
look
at
the
api
changes
so
I'll
ping
him
again,
he
mentioned
that
he's
got
me
in
the
queue
I
just
don't
know
how
backed
up
he
is.
I'm
wondering
if
lantau
had
a
chance
to
look
at
the
core
implementation.
Most
of
this
is
report
from
what
david
ashbal
had
already
looked
at
pretty
closely.
F
I
kept
it
kept
the
code
structure,
the
the
key
piece
of
how
the
cubelet
processes
changes
and
how
it's
plumbed
down
to
the
cri
and
how
the
container
status
is
queried
and
then
the
the
the
api
status
is
generated
for
part
status.
They
mostly
remain
the
same.
The
new
thing
is
the
checkpointing
code
and
I
put
that
under
status.
F
I didn't want it to be considered merge-ready until it's pretty close to ready, so I'll remove those labels — okay, I want to remove them after I get done with the scheduler and resource quota changes. Although it's pretty small, I want to make sure that I do this by hand, as in manually, to make sure that I don't miss anything. This was already done in the last iteration: we switched from using requests, because requests can now change.
F
Previously, requests were immutable, so when you create the pod, it is what it is. Now we have to do a max of requests and resources allocated. That's what we agreed and decided to do, and I think it is reasonable; it might even be simpler. I'm planning to do that in a couple of days — I've just been backed up with other things, so I haven't gotten to this. I know time is running short, but I think the rest of the code is ready for review.
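For readers following along, here is a minimal sketch of the max(requests, allocated) accounting just described; the map shape and helper name are illustrative, not the actual scheduler code:

```go
package main

import (
	"fmt"

	"k8s.io/apimachinery/pkg/api/resource"
)

// maxResources takes, per resource, the larger of the pod's desired requests
// (spec) and what the kubelet has already allocated (status). This is the
// quantity the scheduler/quota should account for while a resize is pending.
func maxResources(requests, allocated map[string]resource.Quantity) map[string]resource.Quantity {
	out := map[string]resource.Quantity{}
	for name, req := range requests {
		out[name] = req
		if alloc, ok := allocated[name]; ok && alloc.Cmp(req) > 0 {
			out[name] = alloc
		}
	}
	return out
}

func main() {
	requests := map[string]resource.Quantity{"cpu": resource.MustParse("500m")}
	allocated := map[string]resource.Quantity{"cpu": resource.MustParse("1")}
	q := maxResources(requests, allocated)["cpu"]
	fmt.Println(q.String()) // "1": the pod still holds 1 CPU until the downsize completes
}
```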
H
Okay, yeah, I will take a look this week.
F
Oh, one more thing we were wondering: should we create a separate, direct PR for the e2e tests from Chen Wang to k/k, or should it go through my PR? What is the usual preference for SIG Node?
C
You need to test that it actually is working before you can merge the PR, so you probably want the e2es in your PR.
C
Yeah, you can put the person's name in the commits, or do co-authoring.
A
Thanks. Next we have Daniel; he wants to talk about auto-sizing kube-reserved and system-reserved. Do we have Daniel on the call?
I
Oh okay, so yeah, that's being done. I have opened a feature issue, and it is about reserving the kube-reserved and the system-reserved memory — basically a new approach to doing that. Okay, now I think I can share.
I
Hopefully you can see my screen now. Okay, I have opened the issue here. I've written a little bit about it, but I want to give you some background on it. First off, hi there, I'm Daniel. I work for SAP, on Gardener — an open-source Kubernetes-as-a-service — and because this is the first time I'm presenting something here, I wanted to say hi. Okay, let's jump right into it. Currently, setting kube- and system-reserved is basically a static configuration in the kubelet configuration file (kubeReserved, systemReserved). It's typically done prior to kubelet startup, and updating kube- and system-reserved requires a kubelet restart. Here is how it is typically done for managed Kubernetes offerings.
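For reference, this static configuration lives in the kubelet's config; a minimal sketch using the Go types behind it (values are illustrative):

```go
package main

import (
	"fmt"

	kubeletconfig "k8s.io/kubelet/config/v1beta1"
)

func main() {
	// Reservations are fixed before the kubelet starts; changing them
	// later requires a kubelet restart, as noted above.
	cfg := kubeletconfig.KubeletConfiguration{
		SystemReserved: map[string]string{"cpu": "100m", "memory": "500Mi"},
		KubeReserved:   map[string]string{"cpu": "100m", "memory": "1Gi"},
	}
	fmt.Printf("systemReserved=%v kubeReserved=%v\n", cfg.SystemReserved, cfg.KubeReserved)
}
```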
I
They calculate the kube- and system-reserved based on the machine size. GKE is apparently doing it, Azure too, OpenShift has an enhancement open, and Gardener is doing that as well. What we have observed is that calculating reserved resources prior to the kubelet process starting is an approximation in the best case, and we have had some troubles: on particularly busy clusters we have seen that the resource reservations were too low, and for mostly idling clusters, without a lot of pods running on them, the resource reservations based on the machine-size calculation were too high.
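To make the machine-size heuristic concrete, here is a sketch of the kind of tiered, capacity-only formula managed offerings publish. The tiers below follow GKE's documented memory reservations; Gardener's actual numbers may differ:

```go
package main

import "fmt"

// reservedMemoryGiB illustrates the machine-size heuristic discussed above:
// the reservation is a function of capacity only, computed before the
// kubelet starts, regardless of what actually runs on the node.
func reservedMemoryGiB(capacityGiB float64) float64 {
	if capacityGiB < 1 {
		return 0.25 // ~255 MiB floor for machines under 1 GiB
	}
	tiers := []struct {
		sizeGiB float64 // width of this capacity band
		frac    float64 // fraction of the band that is reserved
	}{
		{4, 0.25}, {4, 0.20}, {8, 0.10}, {112, 0.06},
	}
	reserved, remaining := 0.0, capacityGiB
	for _, t := range tiers {
		band := t.sizeGiB
		if remaining < band {
			band = remaining
		}
		reserved += band * t.frac
		remaining -= band
		if remaining <= 0 {
			return reserved
		}
	}
	return reserved + remaining*0.02 // anything above 128 GiB
}

func main() {
	for _, c := range []float64{4, 16, 64} {
		fmt.Printf("capacity %3.0f GiB -> reserve %.2f GiB\n", c, reservedMemoryGiB(c))
	}
}
```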
D
This is Derek. So is the machine-size heuristic based on just the instance size, or is it also based on the pods per node? Is it just choosing the defaults from the kubelet for that, or are you also tweaking, like, your desired pod density?
I
So
I
think
there
are
different
kinds
of
calculations
out
there.
There's
some
that
also
considers
max
parts
right,
yeah.
I
But
I'm
currently
in
the
process
of
of
finding
a
better
way
to
do
that,
at
least
for
gardner
and
so
yeah.
That's
what
the
issue
is
technically
about,
and
I
also
wanted
to
ask
of
course:
how
are
your
experiences
using
this
heuristic
based
on
the
machine
size
or
max
parts?
For
instance,
yeah.
D
Yeah,
I
guess
so
I
was
just
trying
to
some
some
folks,
you
know,
might
tune
for
particular
workload
characteristics
I
guess
and
so
like.
If
you
are
trying
to
bin
pack,
your
nodes
or
you're,
trying
to
you
know,
plan
your
notes
for
20,
40,
80
or
just
the
pods
per
core
calculator,
I
think,
is
what
10
pods
per
core
right
now
is
the
default
and
cable.
I
just
wasn't
sure
if
those
other
knobs
were
going
in
the
heuristic
beyond
either
ram
or
cpu.
D
Fine, thanks. Go ahead, I'm sorry.
I
Yeah, that makes sense. So, just real quick: why is that even important? The consequence of over-reservation is lower node allocatable, which has effects on scheduling and all of that; in the end it increases your costs and lowers utilization, as you were mentioning regarding bin packing, for instance. The bigger problem that we have faced is under-reservation. I'm just mentioning CPU and memory here, but I think it's similar for other resource types.
I
For CPU, the kubepods cgroup, for instance, ends up with too many CPU shares, and in case there's CPU contention, there's a risk of starving all the other processes — basically everything besides the kubelet and the container runtime — and that has led in our case to some node instabilities.
I
Maybe something like "PLEG is unhealthy" — it can surface in various ways, and it's hard to debug and then to know why. For memory, the major issue that we're seeing when you under-reserve memory is that the global Linux OOM killer hits before the cgroup-level OOM killer: basically, the node is running out of physical memory before the kubepods cgroup limit is hit, and that has some bad implications, because then almost any process on the host can be terminated.
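For reference, the relationships in play here (per the node allocatable design): hard-eviction headroom only reduces what the scheduler sees, while the kubepods cgroup limit is derived from the reservations alone, so under-sized reservations leave that limit near physical capacity and the global OOM killer fires first:

```latex
\begin{aligned}
\text{Allocatable} &= \text{Capacity} - \text{KubeReserved} - \text{SystemReserved} - \text{EvictionHard}\\
\text{kubepods memory limit} &= \text{Capacity} - \text{KubeReserved} - \text{SystemReserved}
\end{aligned}
```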
I
Yes, it's influenced by the quality-of-service class, yeah, but we have also seen problems here — also OOM deadlocks — when this global out-of-memory happened. So ideally, of course, the cgroup limit on the kubepods slice would be hit first, before there's a global out-of-memory happening.
I
So that's the goal for memory: setting the right limit here. Yeah, just to show it real quick: usually we under-reserve, so the cgroup memory limit is quite high — actually almost at physical capacity level; I've indicated that here — and the only thing I want to show here is that the memory limit on the kubepods cgroup is not reached at all.
I
There's a lot of capacity left, technically, but the host itself is almost running out of memory. So even when adding a little bit of memory usage to, for instance, kubepods or any other process, the limits on the kubepods slice are not hit; but before that happens, the whole machine is running out of memory. That's typically what we are seeing when we under-reserve memory, and that's problematic.
I
And now the question is: what drives the kubelet and container runtime resource usage? What we have observed is the number of pods currently running on the node. I think that also makes sense, because the kubelet and the container runtime have to handle more pods, and there are also additional container runtime shim processes for each pod.
I
In
addition,
a
big
influence
is
the
kind
of
work
like
running
or
deployed
on
the
node
currently,
and
that's
something
that's
very
difficult
to
predict
or
impossible
to
predict
if
you're
running
a
managed
service,
because
yeah
people
can
deploy
any
kind
of
workload
on
it
and
we
are
unsure
exactly
why.
That
is
the
case.
I
think
that
would
need
more
deep
investigation
to
why
exactly.
But
we
we
see
that
across
a
lot
of
clusters
showing
you
here,
the
parts
deployed
just
from
one
to
one
10
to
100
parts.
You
can.
I
You
can
see
here
that
basically,
the
cpu
and
the
memory
requirements
for
for
docker
the
container
runtime
here
and
the
cubelet
are
increasing
almost
linearly.
I
We created a node with exactly the same configuration — the same operating system and size — but one with real-world applications running on it, some real workload, and the other one with just 110 pods running some sleep, so doing nothing essentially. Then we compared the required CPU and memory for the system slice, and in particular the kubelet and the container runtime.
I
What you can see here is that in this test cluster Docker, for instance, is using 2.6 Gi and the kubelet is using 120 Mi. Comparing that with the real-world usage that we are seeing, it is more than double: it's 6 Gi, and 300 Mi for the kubelet, even though we are running the same number of pods, the same operating system, the same machine size. So yeah, the workload has some influence on it.
I
This was actually a long-running node — a node that hosts, in this particular case, control planes of other clusters.
I
So this is long-running stuff, and these are basically the numbers we're seeing there on multiple clusters. So finally, to the proposal, or the feature request.
I
Basically, the main idea — because we are seeing all these problems — would be to try to find a better, or different, way to reserve resources, especially for memory and CPU, that also takes into account the current usage on the system. Because it's really hard to predict, at least for us, what kind of workload is running on a node, how many pods, and then how much CPU and memory is going to be consumed by that and has to be reserved.
F
Just one more question to clarify, sorry: for the real-world system and the test system, were the pod specs similar in terms of both the init container and the actual container spec, except that it's not doing any work while reserving the same amount of memory? I just want to make sure the init containers were not missed.
I
No, I think they were missed. In this particular case there were no init containers, for instance; these were really plain load-test pods. But yeah, it's definitely not a comprehensive enough investigation that I would comfortably say everything is covered.
I
Yeah, good point, so yeah.
J
I have a question about these non-pod, non-kubepods cgroup processes. Are you trying to parse the cgroups hierarchy by yourself?
E
Alexander, you are right — maybe we let Daniel finish the proposal, but you are absolutely right. Daniel, I posted what we discussed in the past, but given today's cgroup hierarchy structure, you would propose to pass that down to the cgroups on the file system. That makes those things more complicated, but we can finish this first, and then there are some things we need to consider.
I
Yeah, definitely, there are a lot of things to consider. The first thing I definitely wanted to ask is: what are your thoughts on it? Is it feasible? Do you have similar problems? That's the main thing, regardless of the actual implementation, because I haven't detailed that out. I'll just jump here — or I'm already there.
D
A couple questions, Daniel, if it's okay — just trying to understand the bounding of maybe what you have explored or haven't explored.
D
For Gardener, are the kubelet and container runtime under system.slice, or in a separate cgroup slice from system?
I
D
And to be fair, I think that's perfectly fine. Here at Red Hat we were doing the same, and while we talk in the community about ways of splitting them out, I think very few have done it in practice. So that's good. And then the other question I was curious about is if you've done any testing on what happens
D
If
you
have
the
enforce
node
allocatable
feature
turned
on
to
be
enforcing
on
the
system
c
group
so
like
right
now,
there's
nothing
actually
inducing
pressure
back
on
system
slice
to
force
reclaim
because
it's
unbounded-
and
there
is
a
knob
that
lets.
You
set
a
bounding
limit,
but
it's
kind
of
risky
to
set,
because
you
don't
really
know
what
the
moon
killer
is
going
to
do
when
it
gets
to
pressure
on
system
slice.
But
it
would
naturally
point
so.
D
I
was
just
curious
like
if
you
have
explored
either
bounding
the
keyblade
in
run
time
in
a
in
a
separate
c
group
where
node
allocatable
is
enforceable
on
that
c
group
and
seeing
different
results,
because
that
was
one
question
and
the
other
question
was
renault
and
I
were
chatting
a
little
bit
in
the
background.
But
one
of
the
reasons
we
at
red
hat
have
been
so
motivated
to
look
at
secret
speed.
2
is,
we
think
we
would
see
improvements
in
this
space,
and
so
I'm
curious
if
sap
has
explored
that
at
all.
I
Enforcing
the
node
allocatable
on
on
the
separate
on
the
system,
slice
and
then
also
on
the
runtime
slice,
but
currently
that's
not
being
done
yet
because
it's
just
as
you
as
you
see,
the
problem
is
just
that
the
memory
and
the
cpu
requirements
are
so
vastly
different
that
at
least
I
am
not
feeling
very
comfortable
on
on
enabling
that
because
yeah
stuff
in
system
slice
and
might
getting
killed.
As
you
were
saying
yeah.
So
I
have
not
done
a
full-blown
investigation
on
on
adding
yeah.
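For context, the enforcement Derek describes is opt-in through enforceNodeAllocatable; a sketch of what turning it on looks like (cgroup paths are illustrative, and those cgroups must exist before the kubelet starts):

```go
package main

import (
	"fmt"

	kubeletconfig "k8s.io/kubelet/config/v1beta1"
)

func main() {
	cfg := kubeletconfig.KubeletConfiguration{
		// "pods" is the default; adding the reserved cgroups makes the
		// kubelet apply the reservations as limits on those cgroups too —
		// the risky-but-bounded behavior discussed above.
		EnforceNodeAllocatable: []string{"pods", "system-reserved", "kube-reserved"},
		SystemReservedCgroup:   "/system.slice",
		KubeReservedCgroup:     "/kubelet.slice",
		SystemReserved:         map[string]string{"memory": "1Gi"},
		KubeReserved:           map[string]string{"memory": "1Gi"},
	}
	fmt.Printf("%+v\n", cfg.EnforceNodeAllocatable)
}
```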
D
Is
that
something
we've
ever
evaluated?
I
can't
recall
if
we've
ever
tried
to
parent
or
run
time
and
cable
in
a
separate
c
group
and
turn
it
on
or
not
in
our
own
preference.
But
am
I
off
that
nothing's
going
to
be
putting
pressure
on
that
to
to
force
reclaim
that
the
numbers
will
just
grow?
I
don't
know.
E
So
the
number
could
be
go
and
here
based
on
your
system
level
of
those
config
and
the
trigger
system.
So
then
the
system,
what's
the
trigger
the
whole
note
the
performance
and
also
which
one
even
we
have
the
score
side
right.
Oh,
it
is
the
best
effort,
because
when
colonel
look
at
the
kernel
wheel,
because
the
kernel
will
also
try
to
proven
of
the
kernel
dialogue,
which
is
really
bad
after
this
room,
so
they
will
try
to
pick
up
which
whatever
process
based
on
available
of
those
kind
of
things
and
actually
explicitly.
E
This
is
not
unkilled
even
like
the
new
kernel
version.
There's
not
a
lot
to
like,
like
the
mark,
certain
process,
not
unkilled,
but
you
could
still
so
then
they
will.
Basically,
it
is
just
random
or
some
process
has
some
range
of
the
um's
circle
range
and
then
we'll
think
about
which
one
have
the
most
can
we
can
reclaim
the
memory
most
and
then
they
will
cure.
So
you
make
this
a
whole
system
and
predictable
performance
and
predictable
and
at
the
same
time
will
be
rendered.
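A small sketch of the user-space bias being described: Linux exposes it per process as /proc/&lt;pid&gt;/oom_score_adj, in [-1000, 1000], where -1000 effectively exempts a process from the OOM killer (the protection critical node daemons rely on). The paths are real; the helper is illustrative:

```go
package main

import (
	"fmt"
	"os"
	"strconv"
	"strings"
)

// setOOMScoreAdj biases the kernel OOM killer for a process: positive values
// make it a preferred victim, -1000 makes it effectively unkillable.
func setOOMScoreAdj(pid, adj int) error {
	path := fmt.Sprintf("/proc/%d/oom_score_adj", pid)
	return os.WriteFile(path, []byte(strconv.Itoa(adj)), 0644)
}

func main() {
	raw, _ := os.ReadFile("/proc/self/oom_score_adj")
	fmt.Println("current:", strings.TrimSpace(string(raw)))
	// Raising the score needs no privilege; lowering it requires CAP_SYS_RESOURCE.
	if err := setOOMScoreAdj(os.Getpid(), 500); err != nil {
		fmt.Println("write failed:", err)
	}
}
```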
E
So
all
those
like
the
kubernetes,
the
priority
priority
class
is
won't.
Make
sense
here:
what's
the
signal
good
part,
it
is
right
now
when
we
cure
we
can
detect
in
kubernetes
we
can
detect
and
we
will
kill
the
entire
depart.
E
We
do
have
some
signal,
but
occasionally,
if
you
don't
handle
very
well,
you
may
be
end
up,
kill
certain
process
in
that
c
group,
but
the
whole
holy
work
node
is
not.
The
company
could
be
also
don't
get
around
to
kill
the
entire
path,
so
you
may
paper
over
the
real
problem,
but
the
customer
maybe
suffer
so
so
all
kind
of
those
complexity.
E
I
think
the
executive
complexity
is
being
discussed
even
at
the
earlier
kubernetes
time,
so
so
so
hopefully,
this
is
why,
when
the
seekable
version
two
come
out,
I
hope
we
can
do
a
better
job
on
that
one
giveaway.
We
have
more
signal
from
kernel
and
we
also
they
have
the
hub.
They
have
the
clubs
and
we
can
passing
more
doing
dynamic
during
the
time
we
could
trigger
more.
E
Have
the
more
control
like
the
user
space
can
based
on
those
like
the
or
priority
class
based
on
how
important
the
job
it
is,
then
we
can
pass
those
values
to
the
kernel.
E
I'm
not
sure
version
sql
version
2
can
do
that.
Also,
there
are
one
things
because,
based
on
our
today's
layout,
no
matter,
you
are
using
the
slides
or
using
the
old
way
with
them
c
group,
so
they
always
have
the
not
reclaimable
memory,
so
a
q
which
is
accumulated
at
the
top
level
of
the
c
group.
So
that's
also
make
that
node
unlockable
over
time
will
be
not
represented
at
a
given
time.
So
there's
the
many
complexities
you
just
share
here.
L
You
kind
of
detailed
pretty
well
like
what
happens
if
you
under
reserve,
I'm
trying
to
understand
the
problem
with
over-reserving
or
maybe
reserving
for
kind
of
what
you
would
say
is
a
bin-packed,
node
or
fully
utilized
node,
because
I
see
I
feel
like
if
you
adjust
it
you're,
going
to
show
that
there's
more
resource
allocatable,
but
then
more
pods
are
going
to
come
in
and
then
you're
going
to
have
to
reduce
the
amount
of
allocatable,
because
system
reserve
would
need
to
grow
to
grow.
I
So
I
guess
for
over
reservation,
then
less
resources
right
would
be
advertised
to
the
scheduler
and
less
parts
would
be
able
to
run
on.
On
that
note,
and
also
the
c
group
limit
on
the
q
parts
right
would
be
lower
and
basically
you
would
be
wasting
some
resources
on
the
node
because
you're
over
reserving
for
the
cubelet
and
the
container
runtime.
If
I
understood
your
question
right,
I'm
not
sure.
L
Well,
I
I
suppose
that
in
the
scenario
that
your
pod
is
your
your
note
is
is
doing
nothing.
You
have
one
pod
on
it.
Let's
say
you're
going
to
have
too
much
space
that
you'll
be
the
system
reserved
will
be.
I
have
a
lot
of
headroom
but
nobody's
there
to
use
that
headroom
anyway.
L
I don't see what issue it has, because, yes, when it's empty it'll have extra headroom that could be consumed, but nobody's there to consume it anyway. If the node is completely full, you already know how much system-reserved you need for a full node, and you couldn't take any more if you size it correctly at that point.
L
So
I'm
just
trying
to
point
out
that
if
you
do
reduce
the
amount
of
system
reserved
you're
leaving
room
for
somebody
who
isn't
there
necessarily
and
if
you
just
have
it,
set
up
for
a
full
node
correctly
that
that's
kind
of
my
question
or
a
point.
I
Yeah, I guess that makes sense to me, the way you said it.
E
That could also be the case, like Derek mentioned and Mrunal mentioned earlier; there is also the other thing I mentioned, that non-reclaimable usage accumulates. And there could be one more thing in today's situation: on a certain node with running jobs, users constantly run kubectl exec and kubectl logs, and unfortunately today that is not all charged to the pod.
E
I
I
I
guess
that's
another
point
of
of
why
the
the
cube
and
system
reserve
is
not
really
determined
to
be
able
to
be
determined
prior
to
the
kubele
startup.
I
guess,
and
so
there
are
some
variants
there,
what
would
make
some
sort
of
reconciliation
or
at
least
being
able
to
update
it
at
runtime
handy
if
I
understood
it,
correct.
E
You
are
so
right.
This
is
so.
This
is
why
the
nomad
is
the
cubos
reserve.
Those
kind
of
node
equivalent
reserve
is
the
best
practice
and
also
it
is
the
best
what
we
can
share
based
on
the
production.
So
it's
more
like
that.
We
generate
those
kind
of
things
through
the
monitoring
and
as
the
average,
and
so
this
is
also
like
earlier.
We
say
that
we
cannot
guarantee
those
things,
because
if
we
guarantee
then
we
have
to
enforce
you.
E
You
are
already
mentioned
that
even
we
are
enforced,
you
feel
uncomfortable
right,
so
we
just
yeah,
because
we
due
to
the
imitation
and
the
viral
cases,
no
matter
is
in
the
kernel
or
user
space.
So
that's
why,
after
this
many
years,
unfortunately,
we
still
didn't
enforce
those
things.
E
So
if
you
say
that
I
share
something,
and
in
the
past
we
talked
about,
there's
also
have
the
formula
we
talked
about
in
the
past.
Hopefully
we
could
enforce,
but
until
today
we
feel
you're
uncomfortable
to
enforce
those
things.
D
One
thing
I'm
thinking
about
here
is
like
if
it.
If,
if
this
is
an
area,
you'd
want
to
do
further
study
or
how
hard
it
is
to
maybe
tweak
your
setup,
I
I
would
be
curious
to
know
if
you
did
go
and
parent
the
cubelet
and
runtime
in
a
different
c
group
and
did
enforce
node
allocatable
on
the
cube
reserve
c
group
only
and
not
system
slice.
D
Do
you
see
different
memory
usage
characteristics,
because
the
one
thing
I
am
thinking
is
that
memory
pressure
on
your
c
group
will
get
reported
at.
I
think
what
70
or
80
threshold
value?
I
can't
remember
and
you'll
start
getting
some
reclaim,
but,
as
don
said,
you
might
hit
issues.
D
Things like exec and logging might be variable in your real-world environment, but I'd be very interested to learn about that. And then maybe, Mrunal, do you want to talk about why we think cgroup v2 might be helpful here, or maybe some things that we're holding out on?
K
I think so. With v1, the IO and memory aren't as well behaved; in v2 they have proper accounting between the two. And another issue we really have is when you get memory pressure: since we don't have swap, the file pages backing your executables will start getting swapped out, leading to system instability. During recent conversations with kernel engineers, they really said: oh, you cannot totally pack it.
K
You
need
to
leave
some
headroom
and,
depending
on
your,
like
workload,
characteristics
how
much
io
they
are
performing,
how
much
memory
they
are
requesting.
You'll
always
need
some
headroom,
so
I
think
it's
like
you'll
have
to
balance
stability
versus
wasting
some
space
for
like
for
head
room
and
to
address
some
points
like
don
mentioned
about
c
groups,
killing
some
process
in
the
part.
So
in
v2
we
have
a
knob
where
we
can
say:
oh
okay,
kill
at
the
c
group
level,
so
the
whole
thing
goes
away.
So
what.
K
So that will help. The second thing is, with PSI, and with Elana working on swap, with those things in place we'll be able to bring in something like oomd, get earlier notifications, and be able to make better decisions on what processes to kill. The kernel OOM killer doesn't work well at all under load; it will end up killing processes it shouldn't, and the only way to protect against that
K
is setting a minus thousand, which you pointed out. One more thing — I'm not sure about other runtimes, but on the CRI-O side, what we are doing is that our shim, which is called conmon, is running under the pod slice, so it gets charged to the workload, but it is also protected by a minus thousand. And we wrote it in C, so it doesn't have a lot of overhead. That way we always get a notification, and also we are not charging it to the system slice.
K
I think what I didn't quite understand is how better accounting in cgroup v2, or PSI, would help with the issue here, where you would need to — or where it would make sense to — adjust the cgroup limits at runtime, for instance for the kubepods slice, because processes outside of it, like the container runtime, are suddenly using more?
K
So
there
are
a
couple
of
things:
jobs
in
c
groups.
We
do
like
the
memory
high
and
low
and
then
where
we
are
kind
of
trying
to
use
the
min
or
the
system
slice
and
then
we'll
see
how
it
helps.
The
idea
is
like,
as
opposed
to
like
the
high
and
max
the
pressure
starts
happening
based
on
overage
and
min,
gives
you
some
like
a
higher
level
of
guarantee
that
okay,
you
won't
get
till.
You
really
start
it
by
a
lot.
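A sketch of setting those knobs directly (assumes cgroup v2 mounted at /sys/fs/cgroup and root privileges; the target cgroup path is hypothetical):

```go
package main

import (
	"fmt"
	"os"
	"path/filepath"
	"strconv"
)

// setMemoryKnobs writes the cgroup v2 memory knobs discussed above:
// memory.min gives a reclaim guarantee, memory.high throttles and reclaims
// before memory.max would trigger the OOM killer.
func setMemoryKnobs(cgroupPath string, minBytes, highBytes int64) error {
	writes := map[string]string{
		"memory.min":  strconv.FormatInt(minBytes, 10),
		"memory.high": strconv.FormatInt(highBytes, 10),
	}
	for file, val := range writes {
		if err := os.WriteFile(filepath.Join(cgroupPath, file), []byte(val), 0644); err != nil {
			return err
		}
	}
	return nil
}

func main() {
	// Hypothetical target: guarantee 1 GiB, start reclaim above 2 GiB.
	err := setMemoryKnobs("/sys/fs/cgroup/system.slice", 1<<30, 2<<30)
	fmt.Println(err)
}
```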
K
So there are some knobs, and we don't have all the answers right now, to be honest. It's more like: okay, we got the basic cgroups feature in place, which is at parity with v1, and now we are exploring the new knobs, and then we'll do performance sessions to drive what will end up being the recommendations at the node.
G
Yeah, I can add: just in the 1.22 cycle right now there is an open KEP for memory QoS for cgroup v2 that's starting to explore using the memory.high knob to basically, you know, induce memory reclaim under system pressure. So that's something actively being worked on right now.
E
I
also
can
attack
why
the
memory
accounting
fighter
can
have
this
situation
so
another
beyond
the
water
menu
and
the
david
said
earlier
so
just
earlier.
I
also
mentioned
that
this
is
actually
the
user
space
problem,
but
we
do
have
the
same
problem
for
kernels
right
right,
so
kernel
takes
some
behavior
and
to
have
certain
part
of
workload
container
most.
M
Yeah, hi everyone. I just want to really quickly ask SIG Node approvers to look at this PR, because it's been ready for a long time, and it got LGTM'd maybe months ago or something like that. So that's basically the only code.
C
Ed
I
had
a
quick
question:
did
the
so
the
test?
Look
green?
I
don't
see
a
pr
for
promoting
the
test
to
conformance
test.
Did
that
happen.
C
For that specific thing, I don't think it uses the CRI.
M
Okay,
so
that
that
that's
seems
that
it's
it's
not
going.
I.
A
Thanks, Ed. We have Vinayak — I believe he wasn't able to join us. He has his item pointed out there; people can have a look at the changes.
N
Yeah, I have some context on that; I can give a brief introduction. Hi everyone, I'm Chipton. I reviewed the capabilities KEP. It's about the security capabilities API we have in Kubernetes: as people know, it has add and drop, but the problem is that add doesn't work for non-root users in a straightforward way.
N
So
so
in
the
original
cap
he
proposed
to
change
the
kubernetes
api
by
adding
a
new
field
and
ambient
and
also
add
a
new
field
in
the
cri
correspondingly
and
but-
and
I
talked
with
him-
I
I
kind
of
hesitating
of
changing
the
kubernetes
api
in
this
way.
So
we
think
about
the
alternative
that
we
reduce
the
current
api
but
changing
the
behavior
in
a
transparent
way.
That
is
the
current
current,
the
pl
he
mentioned
in
the
in
the
in
the
notes,
contented
apl
and
cryopl.
N
So
we
just
changed
the
container
runtime
implementation
yeah.
That's
it's
just
a
perfect
introduction
and
the
people
can
review
the
cab
can
review
this
revised
prs,
so
the
the
I
think
he
wants
people
thought
on.
If
this
is
a
battery
or
we
want
to
change
the
api.
Okay.
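For reference, the existing add/drop surface under discussion, as it appears in the pod API — a sketch only; the open question is what Add should mean for non-root container users, not this shape:

```go
package main

import (
	"fmt"

	corev1 "k8s.io/api/core/v1"
)

func main() {
	// Today, capabilities listed under Add do not take effect for non-root
	// container users in a straightforward way; the KEP weighs a new
	// "ambient" field versus transparently changing runtime behavior.
	sc := corev1.SecurityContext{
		Capabilities: &corev1.Capabilities{
			Add:  []corev1.Capability{"NET_BIND_SERVICE"},
			Drop: []corev1.Capability{"ALL"},
		},
	}
	fmt.Printf("%+v\n", sc.Capabilities)
}
```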
N
Sure, yes. And personally, I think we may want to clarify the spec: the current spec is maybe not very clear, so we can clarify what will happen in different cases. Yeah.
H
Okay, I think both options were discussed before, but the actual field was preferred just because of backward compatibility, right? Given the pod object, if we just change the meaning of add capability, it may cause some unexpected behavior — for example, there could be a non-root pod that has added some capability by mistake, and with this...
this.