From YouTube: Antrea Community Meeting 07/31/2023
Description
Antrea Community Meeting, July 31st 2023
A
Good morning, good afternoon, good evening, thanks for joining the Antrea community meeting. Today is the first meeting of the month of August — of course, if you are in the United States you live in the past, so it's still July for you. In any case, jokes aside, let's go to today's agenda. We have two topics today, I will say two very interesting topics. We will start with the proposal about adding first-N sampling to live-traffic Traceflow — I'm just relaying what I was told as a topic for today, because I'm also very curious to learn what this is about. [The presenter] will be leading this presentation, and I am sorry if I did not get the right pronunciation of your name. Then we'll have a talk by Andrew about improvements to the existing CI pipelines and new functionality for Kind.
C
Yes, so I'm going to give a presentation about introducing first-N sampling to live-traffic Traceflow. Let's first quickly review the milestones of the Traceflow feature. Initially we supported injecting a crafted packet, so we could gather information about Antrea's packet processing, and then we added support for live-traffic Traceflow.
C
That
is,
we
can
capture
the
first
packet
that
satisfy
the
given
conditions,
such
as,
for
example,
we
can
assess
on
conditions
like
the
package
headers
so
button
only
capturing
the
first
package
is
not
enough,
so
we
plan
to
add
the
sampling
feature
to
the
trace
flow.
We
plan
to
add
three
sampling
methods.
The
first
is
the
so-called
first
and
Sample.
That
is
with
we
capture
the
first
and
packets,
and
then
we
have
the
interval
assembly.
We
can
capture
one
out
of
every
n
package
and
the
third
is
the
time
interval
sampling.
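The three sampling strategies described here can be sketched as simple predicates that decide, per matched packet, whether to capture it. This is an illustrative sketch only — the type and method names below are invented for the example and are not Antrea's actual implementation.

```go
package main

import (
	"fmt"
	"time"
)

// firstN captures only the first n matched packets.
type firstN struct{ n, seen int }

func (s *firstN) Sample() bool {
	s.seen++
	return s.seen <= s.n
}

// oneInN captures one out of every n matched packets.
type oneInN struct{ n, seen int }

func (s *oneInN) Sample() bool {
	s.seen++
	return (s.seen-1)%s.n == 0
}

// timeInterval captures at most one packet per interval (e.g. one second).
type timeInterval struct {
	interval time.Duration
	last     time.Time
}

func (s *timeInterval) Sample() bool {
	now := time.Now()
	if now.Sub(s.last) >= s.interval {
		s.last = now
		return true
	}
	return false
}

func main() {
	f := &firstN{n: 3}
	captured := 0
	for i := 0; i < 10; i++ {
		if f.Sample() {
			captured++
		}
	}
	fmt.Println("firstN captured:", captured) // firstN captured: 3

	o := &oneInN{n: 5}
	captured = 0
	for i := 0; i < 10; i++ {
		if o.Sample() {
			captured++
		}
	}
	fmt.Println("oneInN captured:", captured) // oneInN captured: 2
}
```

As the talk notes, first-N is the simplest of the three, which is why it is being implemented first.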
C
This means capturing packets at a given time interval, such as one second. We plan to implement first-N sampling first because it seems the simplest, but there are some challenges to overcome when implementing it. The first is that we cannot just use the CRD status to store our results, because they are too large. Currently, our solution is to save the results on the disk of the nodes in the pcapng format, and each node passively waits for the user's request to fetch them.
C
The second problem is the possible OVS overhead. Before the change, we only made OVS send the first TCP packet — that is, the packet that starts the connection — so there was no problem; but to support sampling, we need to make OVS send every matched packet.
C
There are some parameters for the sample action. The first is probability: for example, if we set the probability to ten thousand, it means we capture one out of 10,000 packets. We also have a collector set ID.
C
The current status of the development is that we have already developed an alpha version that can sample the first 10 packets and write the data to disk, but there is still a lot of work to be done. For example, we need to determine the final Traceflow CRD design, and we also need to find a way to design a user-friendly API to expose the Traceflow results to our users.
C
Then I will briefly introduce the code changes we made. The first is that we changed the Traceflow CRD: we added a sampling field. The sampling field has two properties, method and num. Currently, the method can only be firstN, and the num field determines the value of N.
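As a rough sketch, the draft CRD change described here could look like the following. The field names sampling, method, and num come from the talk; the apiVersion, the FirstN spelling, and the source fields are assumptions for illustration, since the exact schema is still under discussion.

```yaml
apiVersion: crd.antrea.io/v1alpha1   # assumed; actual group/version TBD
kind: Traceflow
metadata:
  name: tf-sample
spec:
  liveTraffic: true
  sampling:          # draft field discussed in the talk
    method: FirstN   # currently the only supported method
    num: 10          # capture the first 10 matched packets
  source:            # illustrative source selector
    namespace: default
    pod: client
```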
C
But I think this is not very reasonable, because if we set liveTraffic to false and still assign a sampling config, there will be a contradiction. So I think a more reasonable way is to make sampling a property of liveTraffic, like this, and the same for the droppedOnly property, because the sampling and droppedOnly properties are both for the live-traffic mode.
C
Then, if we have captured enough packets, we set the status of the Traceflow to Succeeded.
C
Also, on the Antrea agent side, the OVS pipeline is changed a little. Before the change, we only tracked the first packet: as you can see, we matched the conntrack state to be new and tracked. The new state means that this packet is the first packet of a TCP connection. But for the sampling feature, if sampling mode is on, we disable this conntrack-state new condition.
C
So
we
can
check
every
Target
package,
but
this,
but
this
change
has
also
introduced
some
unintended
side
effects
so
without
temporarily
change
the
priority
of
this
flow
to
be
high.
Previously
this
priority
used
to
be
low,
but
this
change
also
caused
some
some
parts.
So
we
need
to
find
out
the
reason
and
to
fix
the
bug
and
for
the
object
updates.
C
We
can
see
that
before
the
change
we
update
every
time
we
receive
the
packed
in
message,
but
for
the
sampling
mode,
this
will
be
true
will
cause
too
much
resources
because
the
there
will
be
too
much
more
texting
messages.
So
we
cannot
update
every
time
we
received
a
pacting
message,
so
so,
if
we
are
on
the
sampling
mode,
so
if
I
will
assembly
mode,
we
must
add
a
update
rate
limiter,
which
limits
the
rate
of
crd
updates.
C
Also, after we process the packet-in message successfully, we write the packet to a local file in the pcapng format. This is done by introducing a new library, which is the gopacket package. The gopacket package is a library that provides support for the pcap and pcapng formats, and gopacket is maintained officially by Google.
E
Actually, I have a question. You mentioned that the captured packets would be too large to be saved in the CRD itself. So how could users retrieve the pcapng data — how could they retrieve that? I didn't see it mentioned in the slides.
F
I can answer this question. We are planning to reuse methods like the ones from the support bundle. In the older version of the support bundle, we used an HTTP API to retrieve the raw data from Kubernetes. For example, when a user starts a new request to fetch a specific support bundle, we aggregate the files from the local disk, create a tarball, and then use the HTTP API to return the data to the user.
F
So only if the user starts a new request do we give the data to the user; otherwise it's just stored on the local disk. I think the current version of the support bundle has switched to a new mechanism — maybe a file server or something like that — but I think the older support bundle approach still works for our situation. Yeah.
E
I remember, for the support bundle API — I mean not the latest support bundle collection CRD API, I mean the old one — I remember it sends a request to each agent's API.
E
So I guess this may not work when you want to do something with a CRD, but probably it could be done by Antrea's own CLI, antctl. Perhaps it could read some status from the CRD — the CRD status — and start a direct connection with the corresponding agent to get the data. It might be possible, but for people who are not using antctl it may be difficult to get the data, right? Because...
F
Yes, so in the design we actually had a new field in the status field of the Traceflow CRD, something like a packets path: for example, we can generate a new HTTP path and write it to the status field.
F
Yeah, I think there definitely is. The problem is what exact number we should set, so we need to evaluate all the parameters, like the storage size and the packet size. In the pull request we definitely need an upper limit, but for the exact size I think we need more discussion or tests to set a reasonable one. Yeah.
E
Right, yeah. In the previous code, when we only supported live-traffic Traceflow capturing the first packet only — and when it's not in live-traffic mode — I think we have a timeout, I guess it's a limit, for the flows to match the target packet, and I remember it's something like five minutes. For first-N packets, do we stay near that, or do we need to dynamically set the timeout according to N and how long to wait?
F
Yeah, so I think a fairly hard limit still works for all the sampling methods. Even if we add a new sampling method, we can treat the timeout as a limit for all these choices. I think it's reasonable for a user to understand the meaning of the field, and I didn't see any conflict yet between these fields, so I'm not sure what the issue would be.
G
By the way, I joined late, so just looking at the spec I start to think: should we really reuse the Traceflow CRD, or should we have a new one? Because I think this is a little different from the original Traceflow — in a way it's capturing packets, right?
C
I think it's a philosophy problem — everyone has their own preference — but personally I think this function is different from the current one. So I think at least, as mentioned, we need to put the sampling property into liveTraffic, not like the current draft where we just add a new parameter.
G
The current live-traffic Traceflow is different from what you guys are proposing, right? Today we just capture the headers — and a very limited set of headers — we don't capture packets.
F
So if we have a new CRD, we have the advantage of still copying the design from Traceflow. I'm not familiar with it, but I think if it used an external file server, that may be better for the packet storage.
C
I think it's reasonable to expand the current live tracing. So even if we don't use sampling, we could still collect the raw packet data as well. I think that would make the new function and the current function consistent.
G
Yeah, I don't have a strong opinion. Probably let's think a little more here. If you think it is consistent behavior, maybe we can reuse it; if it's very different and you want to add many new parameters, probably we should design a new CRD. That's my opinion.
A
Okay, nice, right. So I don't know if we have any other questions regarding sampling for Traceflow — waiting just 10 seconds.
A
All right, it looks like that's all for this presentation, so thanks a lot to Shiwana for doing this presentation. Then we can move to the next topic regarding the CI improvements, which should be led by Andrew. So please go ahead.
D
Okay, okay. So hi everyone, today I want to share a few improvements to the existing CI pipeline. Currently I am working on these four upcoming improvements. The first is using a stop-all-stale-jobs trigger to kill jobs related to a PR; the second is running CI tests in a dual-stack Kind cluster; the third is running IPv6 CI tests in an IPv6 cluster; and the last one is automated upgrade support for the Jenkins Kind cluster. So let's see them one by one.
D
So basically, the question is why we need this improvement, or why we need this job. Suppose you have created a PR and triggered a few CI tests on it to test your changes, and after four or five minutes you push some new changes to your PR. Now you need to re-trigger your CI tests to test your updated changes, and before re-triggering the tests you need to abort your previous jobs.
D
For that, you would need to use the Jenkins UI to abort all the running or waiting stale jobs for your PR. Instead of using the Jenkins UI, you can use this stop-all-jobs trigger to kill all the previously running tests on your PR. So, as you can see in the workflow: you trigger a CI test on the GitHub PR, then after some time you push some changes to the PR, and then you need to abort the previous jobs before re-triggering the tests.
D
For that, you can trigger this stop-all job. It will abort all the waiting or running jobs on your PR, and then after this you can re-trigger the CI tests to test your latest changes.
D
Basically, in this we have used the Jenkins REST API to get information about the running and waiting jobs for the PR, and we have used a Jenkins token to perform the POST operation against Jenkins. Currently I have enabled this feature only for the CAPV-related jobs.
D
Why have I enabled this feature only for the CAPV-related jobs? Because there are lots of jobs running in Jenkins, so we can't apply it to all the jobs at this time; otherwise it would impact some other important jobs. Once this implementation is approved by the maintainers, and it is working fine for the CAPV jobs, we can have a follow-up PR to enable this feature for other Jenkins jobs. Next, we have an improvement related to Kind.
D
We can run the network policy, conformance, and e2e tests in a dual-stack Kind cluster. Here I have created three Jenkins jobs, so we have three trigger phrases, for e2e, conformance, and network policy, and the workflow is very simple: you can trigger any phrase on the GitHub PR to run the e2e, conformance, or network policy tests, and based on the trigger phrase it will create the dual-stack Kind cluster and then run the CI tests.
D
Whatever you have triggered, after finishing the tests it will delete that particular Kind cluster. Similarly to dual-stack, we can run the IPv6 CI tests in an IPv6 cluster. Here I have also created three jobs, because it is very simple for a developer to trigger any test — e2e, conformance, or network policy — so if they want to run only one test, like e2e, they can use only that one trigger phrase. Its workflow is also similar to the dual-stack one.
D
You can trigger any test phrase on the GitHub PR, and then it will create a new Kind IPv6 cluster, run the CI tests based on the trigger phrase, and after finishing the tests it will delete the cluster. The last one is automated upgrade support for the Jenkins Kind cluster. Basically, we know that Kind clusters are currently running in the Jenkins pipeline, and the setup does not have the capability for automatic upgrades, so on every Kind release
D
We
have
to
perform
a
manual
upgrades
upgrade
in
the
kind
test
fit
so
which
is
INS
of
inefficient
so
to
resolve.
This
I
have
added
a
Jenkins
script,
name
like
a
kind
of
grid.essage
to
support
automatic
upgrade
and
its
algorithm
is
pretty
simple.
So
we
can
like
we
can
fetch
or
we
can
get
the
latest
version
from
the
GitHub
using
called
command,
and
then
we
can.
D
We
can
have
our
existing
kind
of
version
using
using
command,
and
then
we
can
check
if
latest
kind
version
is
greater
than
existing
kind
version,
then
we
can
call
a
upgrade
kind
function
and
in
that
function,
based
on
the
OS,
we
can
upgrade
the
kind
in
the
test
bit
and
after
merging
this,
the
workflow
will
be
like
here
or
you
can
see.
First,
you
would
trigger
the
trigger
the
CI
test
faces
on
the
pr.
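The core of the upgrade check described above is a version comparison between the latest GitHub release tag and the installed `kind version` output. A minimal sketch of that comparison is below; it assumes plain `vMAJOR.MINOR.PATCH` tags and does not handle pre-release suffixes, which Kind releases generally avoid anyway.

```go
package main

import (
	"fmt"
	"strconv"
	"strings"
)

// newerThan reports whether version a (e.g. "v0.20.0") is newer than b.
// A minimal numeric comparison of dotted components.
func newerThan(a, b string) bool {
	parse := func(v string) []int {
		parts := strings.Split(strings.TrimPrefix(v, "v"), ".")
		nums := make([]int, len(parts))
		for i, p := range parts {
			nums[i], _ = strconv.Atoi(p)
		}
		return nums
	}
	x, y := parse(a), parse(b)
	for i := 0; i < len(x) && i < len(y); i++ {
		if x[i] != y[i] {
			return x[i] > y[i]
		}
	}
	return len(x) > len(y)
}

func main() {
	latest, installed := "v0.20.0", "v0.19.0"
	if newerThan(latest, installed) {
		// Here the real script would download the new kind binary
		// for the testbed's OS and replace the installed one.
		fmt.Println("upgrade needed")
	}
}
```

The Jenkins script then only runs the upgrade step when this check is true, so testbeds stay current without manual intervention.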
E
Could this be done automatically, without users having to type the instruction? Because I remember in GitHub Actions, if we push a new commit, the previous run will be canceled automatically, and I think it's the same situation for the Jenkins jobs: even if the job finishes and sets its result, the result will not be reported to the new commit anyway, so running it is usually meaningless. So could we just make it automatic?
H
Another goal for this job is to clean up the previous CAPV clusters, since if we just abort the previous job, there could be some resources remaining on the testbed. So we need this trigger phrase to clean up all the redundant clusters on the testbed.
H
Yeah, I think, like you said, we can investigate it for the new trigger phrase. If we can clean up the cluster as a reaction behavior, we can maybe add a cleanup action in this function.
E
In general, to confirm, if I understand correctly: the cleanup jobs triggered by this instruction are for cleaning up the testbeds and jobs associated with this PR only, or is it for...?
E
Yeah, then what's the problem if we just make it automatic, without the user typing the instruction?
I
Yeah, so actually this is a work in progress. Currently this can be done manually, but yeah, it can be automated, because it is based on the PR. So next time, when any changes get pushed, the previously running jobs will be canceled automatically; that will be taken care of once this PR gets merged.
J
I think one thing that comes to mind with the automatic approach is — well, it's not the case for most PRs, but sometimes, when we are working on, for example, the release PR, I don't know if we want the jobs to be aborted automatically. Because when you're working on the release PR and you're editing the release notes, for example, you kind of don't want to have to run those Jenkins jobs again
J
If
you
get
a
passing
status,
if
you
know
what
I
mean
yeah
yeah
I
get
it
I
mean
it's
only
a
minority
of
the
pr
that
or.
E
in this category, yeah. Yes, I agree. But the jobs that will be cleaned up — is it only the other ones, or could it also be the running ones?
E
Okay, okay, then yeah, that scenario — if it's automatic, it might cause some small problems, like Tony said. But my concern with the manual instruction is that perhaps people wouldn't remember to type two commands: I guess it requires you to type stop-all-jobs once, wait for the jobs to be terminated, and then start another round of tests.
J
We can investigate it. I just wanted to say, maybe, like today: I think if someone types /test-e2e, then pushes a change to the PR and types /test-e2e again, we're going to wait like one hour for the previous job to complete before starting a new job for the second command. Maybe in this case — where we're typing a command to run a job again and the previous iteration of that job is not completed yet — I
J
Think
in
that
case
it
would
make
sense
too
aboard
the
first
one
automatically,
because
we're
basically
running
that
job
to
completion
for
nothing.
In
this
case.
H
Yes, and also, if you just abort the job on Jenkins, the previous cluster will still remain on the CAPV testbed, so this trigger phrase can also clean the redundant clusters for this PR.
K
In that specific case, you know, my two cents: for the cloud jobs, obviously, I think we had a downstream trigger — there are two different Jenkins jobs, one setting up the cluster and running the tests, and the other Jenkins job is a cleanup job. So when the test job is finished, the cleanup job is always triggered,
K
No
matter
what
so
you
know,
we
could
probably
do
something
similar
to
make
sure
that
whenever
somebody
kills
a
running
job,
which
is
testing
in
you
know
the
on
the
capv,
and
you
know
it
triggers
the
capv
cleanup,
no
matter
at
which
point
you
know
it,
it
was
killed.
That's
something
that
we
can
also
look
into.
I
guess.
H
Yes, we can investigate whether, if a user aborts a job, a post section can be executed, like a cluster cleanup function.
E
And I have a small comment on the second improvement: the trigger phrase I see uses "ipv6-ds", but for the other dual-stack tests there's no "ds". I have no preference for the phrase itself; I just want all related jobs to use a unified naming style, to avoid people having to remember too many different phrases. Yeah, it could be with "ds" or without, but they should be unified, I think.
A
If
there
are
no
other
questions,
I
guess
so,
do
we
have
any
final
question
going
three
two
one
zero.
So,
yes,
you
can
stop
sharing
now.
A
All right, so those were two very interesting presentations; thanks a lot to both of you. We still have some time allocated for today's meeting, so if there is any other topic you would like to bring up for discussion, please go ahead. I will be waiting, as usual, 30 seconds for topic proposals.