From YouTube: Kubernetes SIG Network Bi-Weekly Meeting for 20200819
A: Hello, everybody. Today is Thursday, August 18, 2022. This is the Kubernetes SIG Network meeting. As usual, we are under the governance of our code of conduct, which distills down to: be good people. We have a pretty thin agenda today, and I just got back from a vacation, so I haven't had a whole lot of time to catch up on what's been happening in the last two weeks. I apologize for all the PRs in these emails.
A: Let's do some triage. Actually, there's nothing on the agenda other than triage, as far as I can tell, so if people want to talk about stuff, now is your moment to throw it on the agenda while we do some triage. We have 15 issues for triage, although we've pre-triaged a few. Where did this window go? There we go. I will go ahead and share a window.
A: Excellent. Everybody got that? Yes? All right, going from most recent to oldest: "DNS does not work correctly." This seems like a misconfiguration issue. I asked for a little bit more information.
A: It's pretty clearly not a Kubernetes bug; I could probably just close the tab. I assigned it to myself, and I'll try to help the person figure out their config issue. If anybody who's super familiar with the DNS side of things wants to jump in, they're welcome to; I'm certainly not going to be greedy. Next, network policy: "Host could not be resolved when adding network policy." It sounds, if I'm reading it correctly, like they have set up a Service with no selector pointing to an external object, an external IP address, and they're trying to use some selector in a network policy, but there's no pod to select. NetworkPolicy is written in terms of pods, not in terms of services. So I've asked for follow-up. If that's the case, then we just have to close this, unless we want to consider handling this use case more elegantly. It seems pretty niche, but.
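
For reference, a minimal sketch (not from the meeting) of the situation being described, using the Kubernetes API types; the names "external-db" and "allow-db" are hypothetical.

```go
// Hypothetical sketch of the case above: a Service with no selector
// forwards to whatever its manually managed endpoints point at (e.g.
// an external IP), so a NetworkPolicy podSelector has no pods to match.
package example

import (
	corev1 "k8s.io/api/core/v1"
	networkingv1 "k8s.io/api/networking/v1"
	metav1 "k8s.io/apimachinery/pkg/apis/meta/v1"
)

// A Service with no selector: no pods back it.
var externalSvc = corev1.Service{
	ObjectMeta: metav1.ObjectMeta{Name: "external-db"},
	Spec: corev1.ServiceSpec{
		// Selector intentionally unset.
		Ports: []corev1.ServicePort{{Port: 5432}},
	},
}

// NetworkPolicy is written in terms of pods. This podSelector matches
// pods labeled app=external-db, and since no such pods exist, the
// policy selects nothing; it cannot "select" the Service above.
var allowDB = networkingv1.NetworkPolicy{
	ObjectMeta: metav1.ObjectMeta{Name: "allow-db"},
	Spec: networkingv1.NetworkPolicySpec{
		PodSelector: metav1.LabelSelector{
			MatchLabels: map[string]string{"app": "external-db"},
		},
	},
}
```
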
A: So yes, at least on GCP, and I thought it was on other platforms too, the load balancer IP is configured as a local address, because in the VM case it would be a local address; you'd want to receive traffic on it, right?
A: Okay, I'll read the comments here, but I thought it was an interesting corner case. The slippery slope is that we could end up with as many rules in the filter chain as in the NAT chain, because we would have to recognize allowed IPs, which could perhaps argue for the mark-drop approach being the simpler model, which Dan just ruled out. At least then it's clear: just add one rule at the end of that chain, right? Although it's not clear from the code point of view where to put that, because we iterate on ports, not on IPs. All right, I'll look at the updates there. I'm going to go ahead and triage-accept this, though, because it is a bug.
A: I would presume that anything that acts like a VIP will have the same problem. Yes? Okay, moving on. I'll read the follow-ups here.
A: Yeah, we'll have to follow up on this one; issue filed anyway. This one is basically saying: hey, IPVS doesn't support node ports on localhost, which we know. They're asking for documentation, which is fair. I think the thing to document is that iptables mode supports localhost node ports, probably shouldn't have, and here's how you can disable it. There's a PR still open, I think, that disables it, so we should encourage people to disable that.
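
One existing knob in this area, for reference: kube-proxy's nodePortAddresses setting restricts which node IPs serve NodePorts, and leaving loopback out of it effectively disables localhost node ports. A minimal sketch, assuming the k8s.io/kube-proxy config types; the CIDR is an example value.

```go
// Minimal sketch: serve NodePorts only on non-loopback addresses.
package example

import (
	kubeproxyconfig "k8s.io/kube-proxy/config/v1alpha1"
)

func restrictedNodePorts() kubeproxyconfig.KubeProxyConfiguration {
	return kubeproxyconfig.KubeProxyConfiguration{
		// NodePorts are served only on addresses within these CIDRs;
		// 127.0.0.0/8 is not listed, so localhost node ports stop
		// working.
		NodePortAddresses: []string{"10.0.0.0/8"},
	}
}
```
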
C: Yeah, well, and also we have this allocateLoadBalancerNodePorts flag, which defaults to true for historical reasons, but on GCP it's not needed. If there were a way to set that as the default on GCP, then I wouldn't have filed this bug, because I would have tested it and said: oh, it doesn't work.
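
For reference, a minimal sketch of the field being discussed, opting out per Service, assuming the standard corev1 types:

```go
// Minimal sketch: a LoadBalancer Service that opts out of node port
// allocation via spec.allocateLoadBalancerNodePorts (defaults to true).
package example

import corev1 "k8s.io/api/core/v1"

func lbWithoutNodePorts() corev1.ServiceSpec {
	alloc := false
	return corev1.ServiceSpec{
		Type: corev1.ServiceTypeLoadBalancer,
		// On providers whose load balancers deliver traffic to pods
		// directly (the GCP case mentioned above), node ports are not
		// needed and need not be allocated.
		AllocateLoadBalancerNodePorts: &alloc,
	}
}
```
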
A: We have a bunch of evolved things that we can't change the defaults on, or haven't changed the defaults on. We should think about that. I actually have, literally, a note on my whiteboard on my desk here to think about a core v2 proposal where we could bake in some of these default changes. It's not just Service; there's a bunch of things where it would be nice to move the defaults toward more secure defaults.
B: Plus one, big plus one on changing the default, especially when the default is basically a load-bearing flag. As long as we make it very clear and, you know, give people a path, I think that would be really nice.
A: Yeah, yeah. Okay.
A: Maybe we should open another issue. I'll take a note: at least on GKE, I should be able to get them to change some service flags, change kube-proxy flags, to make some of these defaults saner. Like, maybe we should add a kube-proxy flag that is... no, never mind. I've got to think it through. It feels like I should be able to change this default for GCP somewhere, because we know we don't need node ports for load balancers. Okay.
A: That was this one. So sorry, Antonio, were you going to write the docs, or was it somebody else? Why did I think it was you who signed up for this?
A: You touched it, now it's yours. Okay, no. It would be great if somebody volunteers to write some documentation on this. I'm just going to go ahead and triage-accept it, because it's real.
A: Yeah, I mean, if we just put it in the comments around kube-proxy, the references get regenerated periodically, don't they, every release? I thought they do. I would not put it in the API docs; I would perhaps put it in this one.
A: Yeah, we're serving multiple things from one binary. All right. I also filed this bug. I haven't read any of the follow-ups on it yet, but as I was looking at the changes around the service controller, I realized it's not clear what the "being deleted" state means. In fact, this is why I went off and wrote a doc in the community repo on controllers and this intermediate state, because we have a real bug. So I'm going to go ahead and triage, oops, accept.
A: It's not... that's a good point, I'll find it. I tweeted about it this week too; you can find it on my Twitter. It's really short. It just says there exists a third state between "does not exist" and "exists," which is "going to not exist," right?
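
A minimal sketch of that third state as controllers see it, assuming the standard apimachinery types; the helper name is made up:

```go
// Minimal sketch: the "third state" is an object whose deletion has
// been requested but which still exists (finalizers pending). The
// helper name isBeingDeleted is hypothetical.
package example

import metav1 "k8s.io/apimachinery/pkg/apis/meta/v1"

// isBeingDeleted reports whether obj exists but is on its way out.
// Controllers that only check exists/not-exists miss this case.
func isBeingDeleted(obj metav1.Object) bool {
	return obj.GetDeletionTimestamp() != nil
}
```
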
D: This is the thing, Alexander. So we have this on Endpoints and EndpointSlices, so endpoints have the terminating state. The problem is we had a bug in the endpoint slice controller, because the node wasn't present but the pod was present. So we have one, two, three controllers that depend on their behavior, and the three of them handle it differently.
A: Yes, I mean, each resource probably needs to define what it means when it's being deleted and how controllers should treat that. Maybe the right answer is that it's not a per-controller decision; it's a per-resource decision.
A: Yeah, I mean, this came up on Slack this week too, with someone who was asking why their service was experiencing connection-reset errors while they were doing upgrades. They thought: well, I set a graceful termination period, and when I receive a SIGTERM I call http.Shutdown, so why isn't Kubernetes doing the right thing? Shouldn't all the traffic have been drained before I get the SIGTERM? And I had to explain: no, actually, that's not at all how it works.
A: Don't do the shutdown immediately, because everything is asynchronous to everything else, and there's no check-in after all the load balancers are configured. It's just wall-time oriented, and it's unsatisfactory, but here are the reasons. And this goes right to the same topic of, like, end of life.
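
For reference, a minimal sketch of the pattern that does work: keep serving for a wall-clock grace period after SIGTERM before draining, since endpoint and load balancer updates happen asynchronously. The durations are arbitrary example values.

```go
// Minimal sketch: keep serving after SIGTERM, then drain. The sleep
// gives kube-proxy and load balancers wall-clock time to stop sending
// new connections; both durations are arbitrary example values.
package main

import (
	"context"
	"net/http"
	"os"
	"os/signal"
	"syscall"
	"time"
)

func main() {
	srv := &http.Server{Addr: ":8080"}
	go srv.ListenAndServe()

	sig := make(chan os.Signal, 1)
	signal.Notify(sig, syscall.SIGTERM)
	<-sig

	// Do NOT shut down immediately: nothing has checked that the
	// endpoints and load balancers have been updated yet.
	time.Sleep(15 * time.Second)

	ctx, cancel := context.WithTimeout(context.Background(), 30*time.Second)
	defer cancel()
	srv.Shutdown(ctx) // now drain in-flight requests
}
```
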
C: Okay, yeah, if it's somewhere in Slack, that'd be great. We had some confusion over on our side regarding that behavior.
A: Okay, yeah. You know, the truth is I should probably write a blog post or something, because there's enough there that it's probably useful for people to know about. All right, I took a note to see if I can grab that and post it somewhere, but it is on Slack. I think the discussion was in SIG Apps.
A: Does somebody want to look at it and see if they can verify that it is in fact an issue? Whether or not we're going to fix it doesn't matter, but just to analyze the report.
A: Yep, okay. Next: "kube-proxy should report events on proxier failovers." What is a proxier failover? Oh, sorry, I did look at this one. This was the one where they ask for IPVS and it falls back to iptables, or something like that.
A: Awesome, cool. So I guess there's the bigger question, though, of whether we should send events more readily for stuff. There's this other discussion that you were on, Antonio, about sending events for node port collisions. It's dangerous whenever we send events from kube-proxy, right, because we can easily flood the system.
A: The problem with events is they age out relatively quickly compared to, you know, the time frame that people tend to detect things in. So, like, if we send an event at startup that says:
A: "Oh no, you asked for IPVS, but I couldn't do IPVS, so I gave you iptables instead," then two hours later that event goes away. Will anybody have seen it? There isn't a place to decorate permanently. You guys remember, way, way back, there was this ComponentStatuses API that we were trying to add to, like: hey, kube-proxy component status, you can go and note that something is abnormal about kube-proxy. And it turned out...
A: It was bad for a lot of reasons, not the least of which is that I could have thousands of kube-proxies. Am I going to have thousands of component statuses?
A: But if it doesn't fail to boot, like if you fall back... I know we're going to get rid of the fallback, I agree with that. But if we do something one time at startup, like "oh no, I've noticed I can't configure this particular sysctl," we throw an event, and then nobody ever sees it. Was it worthwhile? Or are we going to set up a timer that repeatedly throws that event, and then we're back to the thundering herd?
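
For reference, a hypothetical sketch of what a one-shot event like that might look like, using client-go's event recorder; the reason string and function name are made up, and the aging-out concern above applies to anything recorded this way:

```go
// Hypothetical sketch: a one-shot warning event from kube-proxy via
// client-go's event recorder. Events recorded this way age out after
// a TTL, which is exactly the concern raised above.
package example

import (
	corev1 "k8s.io/api/core/v1"
	"k8s.io/client-go/tools/record"
)

func reportFallback(recorder record.EventRecorder, node *corev1.Node) {
	recorder.Event(node, corev1.EventTypeWarning, "ProxierFallback",
		"asked for ipvs but fell back to iptables")
}
```
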
D: Well, that's for sure. My experience is that logs are for someone who knows where to look; I mean, you go to the kube-proxy logs because you already know something's happening. When you want to get the first impression of what is going on, the first thing you look at is the events. So that's, for me, the rule we should follow: when we want to point people somewhere, you know, to give them a starting point.
A: No, but again, looking at failover: it's going to work, right? We'll just have switched; we won't have given you what you asked for, but it's actually going to work pretty much okay, until you hit some weird corner case that is different between IPVS and iptables. Only then will you notice that you didn't get what you wanted. And you can make the argument, I guess, that if it worked, why would we bother telling people?
A: But if that were true, why did we bother implementing IPVS? It's there for a reason, and if they're not getting what they wanted, there should be a way to tell them. But we know that nobody looks at the logs, and especially nobody looks at the logs when things aren't on fire.
A: I agree with you, it's interesting. I mean, it's better than nothing. And in fact, you know, I know some people take all the events and archive them, so they can go back and look at them later. So it's better than nothing.
A: All right, let's move on. This is not the topic here, so this is going to get fixed; close it. We're halfway in. Is there more on the agenda? Nobody's put anything else on the agenda. All right, we can keep going then, to the dual-stack ones. "Node addresses show only one IP if the node-ip parameter is passed." Didn't we look at this one last time? Yes, we did.
C: Oh yeah, I don't know if we did look at this. So this is possibly, and I made a comment, an API break; we might decide. Basically, node-ip was always useless when you were using an external cloud provider, so somebody changed it to make it useful, and it turns out that this breaks people who were using it for its side effects.
A: Okay, there's a lot of history here that I haven't read. Like, is it a break? Is it something we need to roll back?
D: I think... go ahead. The thing is, when I saw the change last time, that it was implemented with the annotation, it just was one of those things where you could see something was going to go wrong, because I remember Andrea and Dan discussing the node-ip piece, and it took a long time to get this behavior stable.
A: I'll give this a read later today, but Dan, I'm going to tag you as maybe a proposal maker: do we fix it? Do we not fix it? Can I...
C: After you've read it and weighed in on it... I don't have a sense of... you know, like I said, it's really an edge case. If you take the position that we can never change behavior, then it is an API break, but it's an API break in using a feature that wasn't working the way it was supposed to, in order to get a side effect that wasn't documented.
C: We'd just lose the new, better functionality in 1.25 that mdbooth, I think, had added.
A: Okay. I mean, what you just described is where I was going in my brain too. Let me have a read over it today and I'll weigh in. All right.
A: Oh, wasn't that... that was linked in here somewhere. Too many windows.
A: All right: "API server crash." This is the dual-stack service versus advertise address one.
A: Aside from the fact that it panics: if they've set up an advertise address and a service address and those two things are in conflict, what should we do?
D: The problem is that, by definition, the cluster is not going to work, because the kubernetes.default service is not going to work. So everything that uses in-cluster configuration, all the pods, controllers, and everything, is going to fail, but they're not going to fail obviously; they fail with connection timeouts.
D: It will take hours until you find out that that is the problem.
A: I mean, tell me if you think I'm wrong, but it seems like it should be a simple fix: at least in this case it won't panic; it will just fail with at least a useful error message.
D: Because those values are derived... well, I don't remember, I'd have to check. But the problem is how the API server builds the configuration: it goes through different stages and has different layers, one on top of the other. So the options get derived along the way, and I don't know at what point you have that information.
A: Right, I mean, the code snippet here is exactly it. It looks like we do report it somewhere, but then later we crash. So we should probably report it earlier and just say: I'm not even going to try to spin up the controller. Yeah.
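
A minimal sketch of the fail-early check being suggested: validate IP-family agreement up front and return an error instead of panicking later. The function and parameter names are hypothetical.

```go
// Hypothetical sketch of an early validation: refuse to start if the
// advertise address and the service CIDR disagree on IP family,
// instead of panicking later in a controller.
package example

import (
	"fmt"
	"net/netip"
)

func checkIPFamilies(advertiseAddr, serviceCIDR string) error {
	addr, err := netip.ParseAddr(advertiseAddr)
	if err != nil {
		return fmt.Errorf("bad advertise address %q: %w", advertiseAddr, err)
	}
	cidr, err := netip.ParsePrefix(serviceCIDR)
	if err != nil {
		return fmt.Errorf("bad service CIDR %q: %w", serviceCIDR, err)
	}
	if addr.Is4() != cidr.Addr().Is4() {
		return fmt.Errorf("advertise address %s and service CIDR %s are different IP families", addr, cidr)
	}
	return nil
}
```
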
A: We're still working on these PRs and KEPs. Thank you for all the excellent work on this. I was out last week and this week has been crazy, but I have them all open and I will look at them ASAP.
A: Yes, man, that cherry-pick makes me anxious.
A: They probably will take a look at the PR; they should. What I'm afraid of is, sometimes, you know, if I say "hey, approve this," then they will, even if they don't think it's the right idea. And I don't want to be in the position of telling the branch managers to take risks they're not comfortable taking, because it's them that's on the line, not me. Right? Sure.
A: So maybe we should have a... I haven't looked at the cherry-pick PR, but maybe we should just actually loop them into a conversation and say: this is the risk, this is the bug. Do you think the risk is commensurate with the bug? Or do we just tell users they have to wait for 1.25, which, practically speaking, won't land in managed providers until Q1?
B: Yeah, that's kind of what my company is facing. I mean, the problem... I don't know. In any case, I guess we can have a discussion after this meeting, offline somewhere, maybe with the branch managers. But I created this issue in any case to track the work: both the PR that went in in 1.25, and also the KEP for 1.26 and future improvements that we'll make.
A: Oh, they did respond. Okay, all right, then I'll look at it. Sorry. All right: "Worker nodes are not showing external DNS address."
A: Okay, I don't think there's a fix here, right?
A: The resolution is as cited: this is not a guaranteed API. And then there's Cal's thing, which I don't even remember why we left open. Oh.
A: All right, I'm going to stop my share now. We still don't have anything else on the agenda, so rather than waste more time, as much as I enjoy all of y'all's company, I'm sure you can do something more useful with the last 13 or so minutes. So, last chance: anybody have anything they want to talk about?
A: We have a lot of open KEPs, and more incoming, so we're going to need to pay close attention and maybe do some prioritization about which ones we're going to spend our energy on. Personally, in between the code freeze and the KEP window opening, that several-weeks period, I like to try to find tech-debt issues: long-standing, little, ugly things that I can spend some time on.
A: So I will encourage anybody who finds themselves with a few free minutes: go peek at one of the code bases we own and see if there's not some nasty little bit of work you can pay down, closing one of these old issues, documenting some weird behavior, helping fix a flag, or whatever. We have plenty of debt. If you have trouble finding some, let me know; I'll be happy to help you.
A: We need more docs, yes, and the docs we have are not always coherent. I was recently looking at the debug-services doc, which I link people to all the time; it could use a refresh to cover a lot of the newer stuff, especially the traffic policy fields. So there's plenty of debt to work on.
A: So I was asking someone this week whether we have a label that sort of indicates tech debt, and the answer is no, we don't. But we can probably assume that the combination involving priority/backlog, or maybe just priority/backlog on its own, is the closest thing we have. So.
A: Okay, everybody gets 10 minutes back. Thanks, all, for your time. We will see you in two weeks. Don't forget: KubeCon is coming up shockingly fast. See you soon. Bye.