From YouTube: Kubernetes SIG Network Meeting for 20220428
A
This is the SIG Network meeting for April 28, 2022. Do we have a triage lined up for today?
A
C
B
Oh no, I mean there were a bunch open, but people have been asking questions on them, so we don't need to discuss them here. I think the only one that I thought was worth discussing here was this one, which Cal opened. Cal, I don't know if you're here.
B
No problem. He's merely asking the question of what the intention of the loadBalancerSourceRanges field is. I thought if he was here we could discuss it, but there's discussion in the bug, so maybe we'll just leave it there.
B
D
B
D
B
I think you're more likely to know. So I think there are two problems. One, Cal thinks it's underspecified, which maybe the text of it is, but it is being validated. And the second part of it is, like, should we be programming iptables with it? I think the answer is yes, and I don't see it as problematic.
B
A
E
People might be traveling, especially if they're participating in, like, pre-event Contributor Summit stuff, etc., but...
A
A
So I'll do that, and then we've got Alexander to discuss the cloud controller manager. Yeah.
G
G
Okay, I hope you can see my screen; if not, tell me so. Essentially, I would like to talk about the cloud controller manager, and specifically at scale. I'm kind of working on a project right now where we have a many-to-many relationship between LBs and cluster nodes, which isn't really all that common in the normal cases.
G
But in this case it's something that is being used in this project, and we are encountering a couple of issues with regards to the time to sync by the cloud controller manager. What we're seeing is that this is mainly happening on large clusters, where nodes transition from not ready to ready, or back and forth. Having looked at the implementation of the cloud controller manager, and especially at the sync loop for the nodes, what I'm seeing is that we sync all load balancers for each node event.
G
So to speak, whenever one transitions. And in a specific corner case this is causing outages for clients which are trying to connect to these LBs. I just wanted to quickly present this. I already filed the PR which kind of fixes it; obviously I'm looking forward to getting your input on that and maybe reaching consensus, but the idea is essentially this.
G
So in this case we can imagine that we have five load balancers, each of which is using externalTrafficPolicy: Local, so they're only pointing to one node, and one of these nodes transitions from ready to not ready, which will cause all of these load balancers to get updated.
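For reference, a minimal sketch (not from the meeting; names and ports are placeholders) of the kind of Service being described, type LoadBalancer with externalTrafficPolicy: Local, built with the k8s.io/api Go types:

```go
package main

import (
	"fmt"

	corev1 "k8s.io/api/core/v1"
	metav1 "k8s.io/apimachinery/pkg/apis/meta/v1"
	"k8s.io/apimachinery/pkg/util/intstr"
)

func main() {
	// With externalTrafficPolicy: Local, the cloud LB should only route to
	// nodes that host a ready endpoint, which is why node readiness flaps
	// end up triggering the LB re-syncs discussed here.
	svc := corev1.Service{
		ObjectMeta: metav1.ObjectMeta{Name: "example-lb", Namespace: "default"},
		Spec: corev1.ServiceSpec{
			Type:                  corev1.ServiceTypeLoadBalancer,
			ExternalTrafficPolicy: corev1.ServiceExternalTrafficPolicyTypeLocal,
			Selector:              map[string]string{"app": "example"},
			Ports: []corev1.ServicePort{{
				Port:       80,
				TargetPort: intstr.FromInt(8080),
			}},
		},
	}
	fmt.Println(svc.Name, svc.Spec.ExternalTrafficPolicy)
}
```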
G
So I detailed this a bit better in my PR, and that kind of maybe exemplifies things, so whenever people get a chance to have a look at that, please do. Essentially the idea here is that for externalTrafficPolicy: Local types of services, whenever a node transitions from not ready to ready, if that node isn't actually hosting the endpoint for that service...
G
...it's not really impactful for that service, so there's no real reason in actually updating that load balancer. Like I said, this is in this specific corner case, with a many-to-many relation and with externalTrafficPolicy: Local. Now there's a second problem as well, which is mainly with regards to applications which are high-intensity, as I call it, or low-latency, which kind of reduces the problem to the kubelet.
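A hypothetical sketch of the skip rule described above (illustrative names, not the actual PR), assuming the service controller can look up the service's Endpoints object:

```go
package main

import (
	"fmt"

	corev1 "k8s.io/api/core/v1"
)

// nodeHostsEndpoint reports whether any ready address in the Endpoints
// object is bound to the given node.
func nodeHostsEndpoint(eps *corev1.Endpoints, nodeName string) bool {
	for _, subset := range eps.Subsets {
		for _, addr := range subset.Addresses {
			if addr.NodeName != nil && *addr.NodeName == nodeName {
				return true
			}
		}
	}
	return false
}

// shouldResyncLB encodes the proposed rule: only skip the LB re-sync for
// externalTrafficPolicy: Local services, and only when the node whose
// readiness changed hosts none of the service's endpoints.
func shouldResyncLB(svc *corev1.Service, eps *corev1.Endpoints, nodeName string) bool {
	if svc.Spec.ExternalTrafficPolicy != corev1.ServiceExternalTrafficPolicyTypeLocal {
		return true // behavior unchanged for all other services
	}
	return nodeHostsEndpoint(eps, nodeName)
}

func main() {
	node := "node-1"
	svc := &corev1.Service{Spec: corev1.ServiceSpec{
		Type:                  corev1.ServiceTypeLoadBalancer,
		ExternalTrafficPolicy: corev1.ServiceExternalTrafficPolicyTypeLocal,
	}}
	eps := &corev1.Endpoints{Subsets: []corev1.EndpointSubset{{
		Addresses: []corev1.EndpointAddress{{IP: "10.0.0.5", NodeName: &node}},
	}}}
	fmt.Println(shouldResyncLB(svc, eps, "node-1")) // true: node-1 hosts an endpoint
	fmt.Println(shouldResyncLB(svc, eps, "node-2")) // false: skip, no endpoint here
}
```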
G
An example of this is, for example, if a service, so the pod acting as an endpoint to the service, gets a burst, an increased number of transactions per second, for example. This can increase the CPU and memory consumption on the node, hence leading the node's resources to get a bit starved.
G
An application that follows this type of deployment model doesn't really want to impose resource limits, because that might impact service SLAs, either in terms of the memory that is being used or the CPU at any given moment in time, right? So I have a couple of proposals for this. I just wanted to present them in this meeting to get people's opinion on them, but obviously I guess the greater discussion will maybe be on that PR.
G
There are a lot of reasons for a node transitioning to not ready. Many of those might not have any impact on whether or not the application can actually service requests, networking-wise. And if that is not the case, I'm kind of looking for people's opinion on maybe adding a knob, so, an annotation on the service object, to be able to disable the node readiness impact on the reconfiguration of the load balancer. This would again be useful for applications which are experiencing a high intensity or a high load.
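To make that proposal concrete, a hypothetical sketch; the annotation key here is invented for illustration, since no such annotation exists today:

```go
package main

import (
	"fmt"

	corev1 "k8s.io/api/core/v1"
	metav1 "k8s.io/apimachinery/pkg/apis/meta/v1"
)

// ignoreReadinessAnnotation is a made-up key for the proposed knob; the
// real name would be settled in the PR/KEP discussion.
const ignoreReadinessAnnotation = "service.kubernetes.io/ignore-node-readiness"

// ignoresNodeReadiness is how a service controller might consult the knob:
// if set, node Ready/NotReady flips would not trigger LB reconfiguration.
func ignoresNodeReadiness(svc *corev1.Service) bool {
	return svc.Annotations[ignoreReadinessAnnotation] == "true"
}

func main() {
	svc := &corev1.Service{ObjectMeta: metav1.ObjectMeta{
		Name:        "example-lb",
		Annotations: map[string]string{ignoreReadinessAnnotation: "true"},
	}}
	fmt.Println(ignoresNodeReadiness(svc)) // true
}
```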
G
And then I was kind of thinking about this even further, and I was imagining things a bit the way healthCheckNodePort works currently for ETP Local, which is to say that a probe is configured on the load balancers, and that actually targets a node port endpoint on the node, which returns a status code to it. So why don't...
G
...we do that as well for what concerns the node readiness state? Which is to say that, in this case, the cloud LB would target the same endpoint, but the endpoint wouldn't just return an HTTP 200 if the endpoint is running on the node: it could also interrogate the kubelet's read-only port, because the kubelet is the one that actually sets the readiness state. And then, going even further beyond that...
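For context, a rough sketch of what the existing ETP Local health check amounts to; kube-proxy's real server is more involved, and localEndpoints here stands in for its per-service count of locally hosted endpoints:

```go
package main

import (
	"fmt"
	"net/http"
)

func main() {
	// Stand-in for kube-proxy's count of ready, locally hosted endpoints
	// for one externalTrafficPolicy: Local service.
	localEndpoints := 0

	// Roughly what the healthCheckNodePort does today: 200 when this node
	// hosts at least one endpoint for the service, 503 otherwise, so the
	// cloud LB's probe steers traffic away from endpoint-less nodes.
	http.HandleFunc("/healthz", func(w http.ResponseWriter, r *http.Request) {
		if localEndpoints > 0 {
			w.WriteHeader(http.StatusOK)
		} else {
			w.WriteHeader(http.StatusServiceUnavailable)
		}
		fmt.Fprintf(w, `{"localEndpoints": %d}`+"\n", localEndpoints)
	})
	http.ListenAndServe(":30000", nil) // a NodePort-range port, for illustration
}
```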
G
G
So for the discussion, for number three: I think obviously we need an enhancement proposal and a sign-off for this, but given that CNI plugins can implement healthCheckNodePort themselves...
G
...I don't really see if there's any possibility to force CNI plugins to use a specified implementation for how it is supposed to work. I just checked before this meeting: Cilium, for example, is using a specific method, or a specific server so to speak, for the healthCheckNodePort, whereas kube-proxy obviously is using the standard implementation, and OVN-Kubernetes (I'm only seeing it because I used to work on it) uses kube-proxy's implementation for that server. But could we align all of them?
G
I'm not sure. And then obviously number four is even more difficult, because that has impact on the cloud provider implementation for all of this, yeah. So this is kind of the problem that we're facing on this project, and some of the ideas I have surrounding it. Feel free to tell me if you think anything's crazy. I see somebody raising their hand.
B
Hey, thanks for that great exploration. I have in the past spent some time thinking about this too, and I have some thoughts. I know Bowei is here too, so he can actually speak in maybe more detail.
B
Some of the things you identified, especially in the first part, I think are bigger than you're even describing, because some implementations have limits to the number of backends that can be behind a single load balancer, right? Like, they're designed for VMs, not for containers, and like the Google load balancers: we have to pick a subset of the available nodes in order to put them behind it. So we keep...
H
B
Right, there we go. There are also things like programming time, right? Like, it takes this propagation delay, like you suggested, so you don't want to be flapping if you can avoid it.
B
B
B
Well, first, can you clarify: when you say don't sync not-ready-to-ready, do you mean don't publish that on the node, or do you mean just don't use that to change the service endpoints?
B
Right, okay, so yeah, this is where I had spent most of my time thinking of how we can do this, because there's a second part of this. On some of the load balancers, if you remove a node from an endpoint set, any open connections will be killed, whereas if you leave it in the set but fail the health check, there will be no new connections, but the existing connections will be left alone, right? And so today we do the worst thing, which is every time a node comes or goes...
B
B
That's not what we want, right? What we really want is to say there are conditions where the node is in the set but not accepting new traffic, and there are conditions where the node is out of the set, right? Very similar to the discussion we've had over the last couple of meetings about graceful endpoint draining at kube-proxy. We have the same problem here at cloud load balancers, and we haven't done a great job making this lifecycle possible. So some of your ideas here are interesting to me. Some of them seem impossible.
B
Like, I don't see how number four could actually work. Number three we could actually do, the "or" and the "and", sorry, inside kube-proxy: like, we could make the healthCheckNodePort not just return true, but actually dynamically check kubelet at the same time, right? That helps kube-proxy. It doesn't help other implementations, but we could at least say: hey, we think this is valuable.
H
G
By the CCM, by the CCM. So the CCM would only update the LB whenever a node gets added or deleted, so that the LB has the full set of nodes, but it wouldn't care anymore about transitioning readiness state, so to speak, because kube-proxy would do that now by interrogating the kubelet port instead. It would be more dynamic, essentially.
B
I
G
B
Even, yeah. We're, like everybody, in the process of converting to a Google-specific cloud controller manager, so that we only link the Google code into it and not all the other cloud providers, and so that we have clear ownership and the ability to, you know, do stupid things when you need to.
C
G
Yeah, for what concerns the service controller part, so this node sync loop, so to speak: that would continue to exist even in the Google implementation? Or maybe not, maybe that's a...
G
Okay, so kind of a corollary question to this entire discussion, at least for point one: given that this is an issue that this project is currently experiencing at scale, could it be considered a bug? If it's considered a bug...
G
D
The other thing that you need to check, Alex, is which cloud providers this touches, because I know there's another one using the out-of-tree cloud controller manager. I think Andrew is not here? No, and [inaudible] is.
G
No, in this case not. I didn't see the necessity for that, because this is only handling ETP Local, so to speak, and so you're just provided the entire set of nodes that exist in your cluster, and as long as that is synced...
G
...the healthCheckNodePort will actually tell the LB where the endpoint is, so there's no reason to watch endpoints or pods and check if they're being scheduled on other nodes and whatnot.
B
G
B
At number one, you're saying: for services that are ETP Local, do not sync if it does not have service endpoints. How do I know, in the service controller, whether you have service endpoints or not? Oh right, yeah, yeah: I use the endpoints lister, so, to get the name it depends on the metadata in the node, whatever the node reference field is.
G
G
B
Way to go for it. So if we were to draw out the sort of state machine of nodes with regards to load balancer traffic, right, we have at least: node doesn't exist; node exists but is unschedulable; node exists but is not ready; and actually the criss-cross of those two, right? So there are four states there, and then, well, like I said, it covers node exists and is schedulable and is ready. And I guess the question is: do we want to change the behavior on certain edges, right? Is that a fair assessment?
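A tiny sketch of that state space, with illustrative names, just to make the four existing-node states and their edges concrete:

```go
package main

import "fmt"

// nodeState is an illustrative criss-cross of the two axes mentioned,
// schedulable x ready, for nodes that exist.
type nodeState struct {
	Schedulable bool
	Ready       bool
}

func main() {
	// The four existing-node states; each transition between any two of
	// them is an "edge" whose LB-sync behavior could, in principle, differ.
	for _, s := range []nodeState{
		{true, true}, {true, false}, {false, true}, {false, false},
	} {
		fmt.Printf("schedulable=%v ready=%v\n", s.Schedulable, s.Ready)
	}
}
```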
G
So in this case, what is kind of shooting us in the foot is nodes transitioning from ready to not ready and back, and back and back again, so to speak. So the schedulable part is not really the big pain point with clusters at scale, especially with these CPU-intensive loads that we're running and all of that.
B
Right, and the problem here is that ready and unready can either mean, hey, the kubelet went out for a long lunch and didn't come back, or it could mean this node has powered off and we don't know yet, but this is our first indicator, right? I guess in that case, if we're relying on health checks, then the health checks will cover it, right? If that's the assertion: yes, definitely.
B
I would need to go back and do some spelunking. We need to make sure that in the non-ETP-Local case, so when you have a regular old service, we don't continue...
G
...to send... That case I didn't touch at all, so the implementation only focuses on ETP Local, yeah. So that's what the change concerns itself with; for everything else it remains the same as it currently stands.
J
Just a naive question: is this implying that there aren't enough existing knobs to protect kubelet from abusive workloads? It's more a general problem.
B
Well, you mean protecting me by having you set limits is fragile, right? We need to set the appropriate request for kubelet, make sure that kubelet is getting enough shares that it can always service what it needs, right? This is the whole, like, allocatable problem, right? If we carve off too much, then we eat a lot of your node that you might not be using, and if we carve off too little, then you have this. So I might need...
B
...thing, because allocatable comes off the top, right? Like, if the node is eight cores and you say allocatable is seven, then kubelet's got one to mess with. It's supposed to, anyway.
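The arithmetic being described, as a trivial sketch (real kubelet configuration expresses this with resource quantities via systemReserved/kubeReserved; the numbers here are just the example from the discussion):

```go
package main

import "fmt"

func main() {
	// Allocatable comes "off the top": what pods may use is capacity minus
	// the slice reserved for system daemons like the kubelet.
	capacityCores := 8.0
	systemReservedCores := 1.0 // e.g. systemReserved/kubeReserved in kubelet config

	allocatableCores := capacityCores - systemReservedCores
	fmt.Printf("allocatable: %.0f cores (kubelet keeps %.0f to mess with)\n",
		allocatableCores, systemReservedCores)
}
```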
B
That's a good point. I'm sure... I'm fairly confident, not sure, fairly confident, that SIG Node has got some of those things documented. I wonder if we can do a better job, or maybe this is a good catalyst to go and, like, re-look at those things and make sure that we have a new thing for them to consider: network abuse.
G
Just the last thing I wanted to discuss was the possibility of doing 2a, which is to say that you have the possibility, as a user, to annotate the service object, which tells the CCM to completely ignore node readiness checks at all, so that the CCM will only think about the addition of a node or the deletion of a node for what concerns the LB. And for the LB associated with this service, if a node transitions to a not-ready state, the CCM doesn't really care about that; it will just keep it configured as it was.
B
How about: that's really my least favorite option. We have so many of these little fiddly knobs that force users to understand our implementation details.
B
I would put that at the bottom of my list of things to prefer. Personally, I'm trying to figure out if we can do something like one, but not even consider whether you have endpoints or not, just change that edge. Like, maybe that edge doesn't make sense. Maybe we should lean more on health checking and just ignore the ready/unready and schedulable/unschedulable differences.
B
B
So what is that, exactly? I didn't fully follow. What if we just did something like what you described in one, but we ignore... we just always ignore ready/not-ready, and we rely on health checking, and we say, like, there needs to be a health check for regular old services and there's a health check for ETP Local services, and those might not be the same health check.
B
G
Though, at least, I maybe did a piss-poor job of writing it down, but that's my idea behind point four, which is to say that we have a healthCheckNodePort probe by the LB, and the cloud provider, or the cloud implementation so to speak, already defines a probe as well on the LB for regular services of type LoadBalancer.
B
G
Right, that's what I was getting at, that's what I wanted to do. So my plus sign, at least on point three here, my plus sign is (and sorry, it wasn't really clear): HTTP 200 if the endpoint is running on the node and kubelet is reporting okay. And the thing doing this, so checking that, is kube-proxy. So kube-proxy would check the endpoint as it currently does, and it would also do, I don't know, a curl of 127.0.0.1 on the kubelet's read-only port.
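A rough sketch of the combined check being proposed, assuming a kubelet health endpoint on localhost; the exact port and path (the read-only port 10255 versus the localhost healthz port 10248) is a placeholder here, not a settled design:

```go
package main

import (
	"fmt"
	"net/http"
	"time"
)

// kubeletHealthURL is a placeholder; which kubelet port/path to probe
// would be part of the actual design.
const kubeletHealthURL = "http://127.0.0.1:10248/healthz"

func kubeletHealthy() bool {
	client := &http.Client{Timeout: 2 * time.Second}
	resp, err := client.Get(kubeletHealthURL)
	if err != nil {
		return false
	}
	defer resp.Body.Close()
	return resp.StatusCode == http.StatusOK
}

func main() {
	localEndpoints := 1 // stand-in for kube-proxy's local endpoint count

	// The proposed healthCheckNodePort answer: 200 only if this node hosts
	// an endpoint AND the kubelet itself reports healthy.
	http.HandleFunc("/healthz", func(w http.ResponseWriter, r *http.Request) {
		if localEndpoints > 0 && kubeletHealthy() {
			w.WriteHeader(http.StatusOK)
			return
		}
		w.WriteHeader(http.StatusServiceUnavailable)
	})
	fmt.Println("serving on :30000")
	http.ListenAndServe(":30000", nil)
}
```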
B
Okay, I'm gonna have to read your PR and think about that. Anybody else? I've been...
G
G
But that doesn't mean this isn't actually a problem. Yes, that is kind of, yeah, that's more for point two here, which is to say resources are starved, right. Right, but yeah, the time to sync LBs is definitely still a problem, because when you have so many LBs on a cluster, the cloud provider can either limit you, or it just takes a ton of time to sync, and...
B
B
A
Great. Should we move this discussion to the mailing list, or the PR, or both? Where do you guys think is best? The PR, and if we overflow, to the mailing list.
G
B
A
Thanks, Alexander. Antonio, you were next on the agenda.
D
D
I think that Casey, as I understand it, fixed the problem, but I'm not really sure. Well, I wanted to say that for the CNI plugins this was going to be a nightmare, because they wanted to do a blog post, a big announcement saying that everything was going to break, and, okay, I'm not sure (and I was expecting Dan Williams to be here) if this was really a bug for containerd and CRI-O, or a bug in the CNI plugins, because the ones that parse the CNI configurations are the CNI plugins, right?
D
Is that... was that the summary? It's fixed in the CNI library, and the containerd folks thought that it was a problem for them, so everybody in Kubernetes had to agree. But what I'm not sure of is: is this a problem for containerd, because they are doing some kind of kubenet thing for testing, or is it a problem for the CNI plugins?
D
So that's why it's a heads-up for the CNI vendors to check this, because if they are using this 1.1.0 as a library, they may have customers or clients or people failing after the upgrade, or, I don't know, it's a complicated problem. I just got into that yesterday by chance, and thankfully we were able to solve it, because this was going to create a lot of noise, and that wasn't really nice. But, for example, Calico and all those people should check this.
A
Yeah, I had seen this thread but hadn't yet had a chance to read through all of it, so it's definitely on my list.
D
When you use the... after 0.8.0 or something like that, if you don't have a version in the CNI config, it returns a minus one, and before, it assumed that was zero point... no, 0.1.0, and that was the breaking change. Casey fixed that yesterday, and now in 1.1.0 it keeps assuming that it is 0.1.0. And what the containerd people found out is that that was breaking the CI, but I think that is because they are not using a CNI plugin.
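To illustrate the failure mode, a sketch of decoding a CNI network config with and without an explicit cniVersion; this uses plain JSON decoding rather than the actual libcni API, since the regression was about what the library assumes when the field is absent:

```go
package main

import (
	"encoding/json"
	"fmt"
)

type netConf struct {
	CNIVersion string `json:"cniVersion"`
	Name       string `json:"name"`
	Type       string `json:"type"`
}

func main() {
	// A config with no cniVersion field: what the library assumes for it
	// (historically 0.1.0) is exactly what the regression changed.
	implicit := []byte(`{"name": "mynet", "type": "bridge"}`)
	// Declaring the version explicitly sidesteps the default entirely.
	explicit := []byte(`{"cniVersion": "0.4.0", "name": "mynet", "type": "bridge"}`)

	for _, raw := range [][]byte{implicit, explicit} {
		var c netConf
		if err := json.Unmarshal(raw, &c); err != nil {
			panic(err)
		}
		fmt.Printf("name=%s type=%s cniVersion=%q\n", c.Name, c.Type, c.CNIVersion)
	}
}
```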
B
D
They... I stopped them, because they were creating a blog, they were creating release notes; I mean, there was a full announcement about how everything was going to fall apart in 1.24 with the CNIs, and I couldn't believe it. That's why I pulled Casey in, in the morning, and he found this regression, and then I went to bed; I didn't think about it more yesterday. So when I woke up, I said: but this is impossible, containerd is not parsing the CNI configuration.
B
B
No, and the Tigera folks are here? Yes.
B
D
A
Oh, thanks, Antonio. I'm just looking at those PRs now; I'm pretty sure we've already updated, but I'll double-check that. Who's next? Sanjiv, if you're next.
J
Okay, how much time do I have? Because it's probably...
A
J
J
K
Go ahead. Sure, mine's, like you said, very simple. I wanted to follow up on that mailing list thread that Tim started back in the day. Specifically, the idea was: hey, can we move this earlier, so it's more accessible to people who want to attend from Europe? I summed up the responses on that thread. The most popular response was either of the suggested times; the suggested times were 9 and 11 a.m. Pacific time, and it seems like 9 a.m. very slightly won out over 11 a.m.
K
I tried to add a table in the agenda just with those numbers, so based on that I would suggest just moving it to 9 a.m., if there are no objections. Does that make sense?
E
I mean, some people, including me, may have an overlap with Helm, which is at 9:30 a.m. on Thursdays, so I'm always gonna be juggling that. But if that's just me, then...
B
E
I heard that we're all going to the smallest possible WebAssemblies, right? That's right: everything on the edge, no more computers, only Raspberry Pi ASICs.
K
Perfect. Well, I'll send out a mailing list update on that thread, with the assumption of moving to 9 a.m. on Thursdays, starting after KubeCon; I'll move the meeting. Okay, and I'll give the time over to Sanjiv. Thanks.
J
Okay, so yeah, I've put the link to this doc in the mailing list, so obviously we won't be able to go through most of it right now; I just wanted to introduce it and have you all look at it later and provide comments. This is something that's relevant both to SIG Network, and of course network policy, and SIG Multicluster in particular; it kind of straddles these areas. So I'm hoping to come to a resolution there and decide how to take forward multi-cluster network policy.
J
Let me do this. Yeah, okay, so I'm not going to go through this, but, you know, there are some requirements that have sort of made sense to me, based on discussions with others, but by no means are we done with being sure of the exact set of requirements.
J
So here's a working list of requirements that, you know, I've put together based on some discussions with some of you offline as well, but we need to keep working through that. But in addition, there are at least a few use cases that are more obvious than others, so we can get going with those use cases, and we can decide whether we want to add, delete, modify. There's a lot of stuff here related to, you know, single-network models, multi-network models and things like that. I'll...
J
J
Should we design this to cover various kinds of multi-cluster deployment models, like what is frequently called single-network and multi-network modes of operation? Should all the features work in all modes, or is it okay to identify that some modes are more important than others and, for example, design some features that only work in single-network mode? Should this feature be limited to the reference architectures defined by the Kubernetes Multi-Cluster Services API?
J
J
I'm presuming many of you are. And then, you know, a lot of these are very analogous to discussions that happen both in SIG Network and SIG Multicluster, as well as in other kinds of multi-cluster networking projects, including, for example, Istio multi-cluster. For example, you can have Istio meshes, which can themselves have a single-network or multi-network deployment model, and then you can have federation across meshes. So the analogous sorts of topologies come into play here as well. And then, a little bit more clarity on requirements on the trust model across clusters.
J
You know, is it, you know, namespaces which are global? And there's a little bit more that needs to be worked out on the requirements. But having said that, you know, one can reasonably put together some reasonable use cases, so let's just go through them real quick, and you can feel free to provide comments later. One is: okay, I want to be able to control... I'm going to be using MCS terminology here, the Multi-Cluster Services API.
J
Forgive me if you're not entirely familiar, but it should be fairly clear from the pictures. So, (a): I want to control which pods in a cluster are allowed access to a service that's been imported from another cluster. Okay, so here I've got cluster A, I've got two pods; I want to be able to flexibly control that pods which fit into category P1 are not allowed to access this remote service, right?
J
So, okay, context here is that there's this remote service running on cluster B; it's a multi-cluster service, and it has been imported across the network, which is an arbitrary network, maybe single-network, maybe multi-network, and since it may be multi-network, it probably requires a gateway, which is what I've shown here. And allowing for all these different models, I still want to decide how some pods are allowed to access imported services and some pods are not allowed to access imported services. So that's use case number one you see here.
J
J
Let's just worry about the selectors. So in this case, something like, okay, just having an additional kind of egress destination, which is basically a service import, or a reference to a service import, can allow you to configure that the pods which are selected by this policy are either allowed or not allowed to egress to this imported service, right? So that would be an example of a sample manifest that satisfies this kind of use case.
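Since the sample manifest itself isn't captured in the transcript, here is a hypothetical sketch of such an egress destination; the serviceImportRef shape and field names are invented for illustration and are not an existing API:

```go
package main

import "fmt"

// serviceImportRef is an invented shape for the idea described: an egress
// destination that points at an MCS ServiceImport instead of a pod/IP peer.
type serviceImportRef struct {
	Namespace string
	Name      string
}

// egressRule sketches a policy rule whose destination is an imported
// service; a real API would live alongside NetworkPolicy's existing peers.
type egressRule struct {
	Allow           bool
	ToServiceImport serviceImportRef
}

func main() {
	rule := egressRule{
		Allow:           false, // e.g. pods in category P1 may NOT reach it
		ToServiceImport: serviceImportRef{Namespace: "prod", Name: "remote-svc"},
	}
	fmt.Printf("%+v\n", rule)
}
```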
J
J
Okay, and again, I would ideally want this to work in all kinds of topologies, including single-network, multi-network, flat network and so on. It would appear that we can do this without changing the current MCS API.
J
With a few assumptions. The assumptions could include the fact that, if you're running this in multi-network mode, the combination of gateway IP and cluster set IP must be unique per export, right? And what this would do is that every cluster that is exporting this service would attach the appropriate labels to allow the importing cluster to decide what to put in its data plane as the destination.
J
We can spend more time, if needed, thinking about it. So there are a few more use cases; I could either pause here for any comments or keep going.
A
We've got two minutes left, so probably a good time to pause and get any comments.
F
F
The pre-previous one... okay, this one right here. No, I think use case (b), I guess, controlling this, yeah, this one.
J
Okay, so here, this was actually suggested to me by Nathan Midler from Google. What he said was that, you know, very often a service would be provided by lots of different clusters.
J
It will be given the same name, but you may have policy regulations which tell you that, okay, this cluster should only access this global service from clusters located in the U.S., for example, and not from clusters located in Asia-Pac, or things like that.
J
J
F
F
...how to do such things, which might not be ideal, but, you know, this is just what comes to my mind.
J
They may have good reasons. You know, we don't want to force them to always create more clusters in order to solve a problem. In some cases, if that's the only option, then yes, but very often it would be that that restriction applies to pods of category P2, while pods of category P1 may want to access a global service across all providers of that service, wherever they are, and they're both located on the same cluster.
J
So yeah, you can solve network segmentation by just creating separate disjoint cluster sets, but sometimes you want to have shared cluster sets, partitioned within the cluster set. But yeah, your input is welcome, and, you know, we should discuss how relevant these use cases are. But there was something...