From YouTube: 2020-10-01 CAPZ Office Hours
A: All right, fine. Hello, everybody, welcome. It's October 1st, 2020. This is the Kubernetes Cluster API for Azure meeting. We are being recorded, and this is a Kubernetes SIGs meeting, so please be on your best behavior. Basically, let's try not to talk over each other, and maybe use the raised-hand feature, although if it's just the four of us, it's probably not that urgent.
A: And please add your name to the attendees list here, so we have a record. There are no new members or attendees as far as I can tell, but if somebody wants to say something random or unrelated, this would be a great time.
C: Yeah, I added this at the top. Just wanted to mention that the 0.3.10 CAPI release just dropped, and I think we had e2e signal on the CAPZ PR that Nader had opened.
C: Yeah, great. And then we'll follow up with the CAPZ release, probably tomorrow or early next week, depending on whether there are any PRs waiting. Cool.
A: Yeah, in my testing I've been rebasing off of that, so that'd be great when it's final. Want to move on to Nader?
D: So this is the one where, in the last meeting two weeks ago, we said we would increase the timeout and see how it would look over a period of two weeks, to see if the flakiness is any less. I think it's been less, but in the last couple of days it has been failing again consistently, and in that PR with 0.3.10 I had to increase the timeout. It kept failing until I increased that timeout again.
D: So if you don't mind clicking on the testgrid link, it will show the pattern. I haven't investigated yet. I mean, it's been failing on timeouts, but I haven't looked into what happened. I thought the link here in the description would take you to the actual grid of the test.
D: I would have thought that with the change to delete that Cecile made, things should have been faster. I mean, tests in general are faster, but this upgrade one fails while waiting for the upgrade to happen, so it's kind of not related to this specifically.
D: Looking into it again, this one, the upgrade one, the KCP upgrade in the Cluster API test, has just been timing out consistently. So last meeting we said we'd bump the timer where it waits for the upgrade to happen and see if that's going to fix things. It fixed things a little bit, but not 100%.
D: Yeah, because all the times it failed, you find it saying it's still finding the old ones, they're not gone yet. That's the part that usually fails: the last part of the upgrade, where the test checks that the old ones are already gone. I'm just bringing it up.
E: Also, the delete is, you know, definitely sub-optimal, right? We delete one thing, wait, try the next, do the next, wait, next, wait, when in fact we could probably take a slightly different approach and just do the delete with no wait, and just try it across everything.
E: Even if we get an error, then just hit the reconcile loop again. Just say, hey, let's time out for 15 seconds or something like that, try to do the delete again with no wait, and just keep cycling through like that. The problem is we'll eat up Azure API calls, but we might be able to find a nice balance.
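A minimal sketch of the "no wait, cycle again" delete loop being proposed here. The type and function names are illustrative only, not the actual CAPZ reconciler:

```go
package main

import (
	"fmt"
	"time"
)

// deleter issues a non-blocking delete for one Azure resource and
// reports whether Azure has finished removing it.
type deleter interface {
	Delete() error // fire the delete, do not wait for completion
	Gone() bool
}

// reconcileDelete fires every delete on each pass and asks for a
// requeue instead of blocking. It returns 0 once everything is gone.
func reconcileDelete(resources []deleter) time.Duration {
	pending := 0
	for _, r := range resources {
		if r.Gone() {
			continue
		}
		pending++
		_ = r.Delete() // an "in use" error is fine; the next pass retries
	}
	if pending == 0 {
		return 0
	}
	return 15 * time.Second // hit the reconcile loop again shortly
}

// fakeResource pretends to need two passes before Azure reports it gone.
type fakeResource struct{ deletes int }

func (f *fakeResource) Delete() error { f.deletes++; return nil }
func (f *fakeResource) Gone() bool    { return f.deletes >= 2 }

func main() {
	nic, vm := &fakeResource{}, &fakeResource{}
	for pass := 1; reconcileDelete([]deleter{nic, vm}) != 0; pass++ {
		fmt.Println("pass", pass, "requeued")
	}
}
```

The trade-off discussed next, dependency ordering and extra Azure API calls, is exactly what this pattern has to pay for.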
C: I actually looked at it when I was looking at deletion optimization. It's really hard to do that, because all of them depend on each other. What we're deleting is the network interface, the VM, and the disk, and that's it. And the VM depends on the network interface, and the disk depends on the VM, otherwise it's left behind. So you can't really delete one before the other is completely deleted, and that's what we're doing right now. So I feel like just adding a "try to delete everything"...
E: ...I don't know, 128, and then it takes a big jump to 256 or something. So it could very easily, like... Sometimes it just works beautifully, and then other times we get this really nasty exponential backoff. We can set the upper limit on that backoff, or we could change the backoff to be linear. It really depends. We can do some things.
E: We could instrument it, we could log it. Like, what do we want to do? Do we want to just throw it into logs, low-tech, or do we want to actually start adding Prometheus metrics or something?
C: Yeah, and then there's also the thing that Brian showed at the office hours, using Jaeger. I don't know if that's also an option.
C: Sounds good. I can try to get a local repro in the meantime, at least try to do that.
D: On Azure, I guess, unlike a different thing, on Azure if I'm running other stuff... For our case, at least, ideally I would want Prometheus running somewhere else, like in a different instance, and then I'm sending to it, and then everybody can install it wherever they want, kind of thing.
E: Yeah, so what if we had a log outputter for metrics? So if you wanted to, like, OpenTelemetry supports different outputters. Prometheus usually scrapes, right: you have an endpoint that Prometheus goes out to and says, hey, tell me your metrics, and then Prometheus collects them and processes them.
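To make the scrape model concrete: a pull-based endpoint is just plain text in the Prometheus exposition format served over HTTP. This is a stdlib-only sketch, with a made-up `capz_reconcile_total` counter name, not the project's real metrics or the Prometheus client library:

```go
package main

import (
	"fmt"
	"net/http"
	"sync/atomic"
)

// reconcileTotal is a hypothetical counter the controller would bump.
var reconcileTotal atomic.Int64

// renderMetrics returns the counter in the Prometheus text
// exposition format that a scraper expects to find at /metrics.
func renderMetrics() string {
	return fmt.Sprintf(
		"# TYPE capz_reconcile_total counter\ncapz_reconcile_total %d\n",
		reconcileTotal.Load())
}

// metricsHandler is the endpoint Prometheus would come and scrape.
func metricsHandler(w http.ResponseWriter, r *http.Request) {
	fmt.Fprint(w, renderMetrics())
}

func main() {
	reconcileTotal.Add(3)
	http.HandleFunc("/metrics", metricsHandler)
	fmt.Print(renderMetrics())
	// In a real controller you would now serve it:
	// http.ListenAndServe(":8080", nil)
}
```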
E: The reason I said App Insights is because that allows us to push those, and it's a managed service that runs inside of Azure that collects them, right. But I do like the Prom thing better, because then that gives us the ability to just say, hey, it's here; you don't have to have access to any, you know, possibly permissioned resource or something. Is that kind of what you're pushing towards, right, Cecile?
E: Yeah, I think it would be really weird to try to expose the Prometheus endpoint outside of the cluster to, say, some other Prometheus service that's going to be coming in to gather things up. Because if it doesn't gather them up on a timely basis, and the controller ends up getting destroyed before it could get the metrics out, it would be difficult to...
E: I was working on the encryption secret stuff, but I think we're waiting on Nader, and I don't think there's much out there right now. So unless there's something more pressing, I'm happy to do this. I'd love to.
E: Me. And based on the secret stuff, do we want to push that out to the next milestone?
A: Right, cool. Are we done with this topic, or...?
A: It basically just boils down to... and I'm just looking for some fresh ideas, because I think I see a way to get it done, but I'm not at all happy about it. It basically boils down to: there's just a callback method and some other little scaffolding you set up inside the e2e tests, and then the callback gets called when we're dumping and collecting logs from the clusters we create. And then inside that, obviously, you want to connect to the node and basically do a systemctl.
A: I looked at Bastion for a while, but it's not clear to me that it's available outside the portal, or that there's an SDK wrapping it yet. David's shaking his head: no, don't use Bastion, or it is purely UI? Yeah, that's what it looked like to me. They kind of waved their hands about multiple connections and stuff, but I guess they're actually saying you'd want to open that many instances of the portal, so that really doesn't help us.
A: That's a little funky to set up. I don't want to write the jumping, the tunneling, code in Go, so I was falling back on just shelling out to ssh and doing, like, -J to jump through, which will work. But obviously there are some other issues involving, you know, known_hosts, key checking, and all that you have to massage around, which is a pain in the butt.
A: And then the most problematic thing is that the context you get in the callback is just: here's a machine, here's a VM, go ahead and scrape the logs off of it. So the only reasonable way I see to do that is to make further az CLI or API calls to say, okay, what resource group am I in? Do I have a load balancer or public IP? Well, go in through that. And, yes, Cecile, stop me before I go farther off the road.
C: Okay, so when we first discussed this in the CAPI issue, the consensus was to go with a DaemonSet that would run and collect the logs on the nodes, just like we do for the conformance tests. The downside of that is that you can only get the logs from nodes that have actually joined the cluster. So if a node fails to join, then you can't get the logs, which is not great. But the idea when we started this was: okay, this is best-effort, it's better than nothing.
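The DaemonSet approach described here looks roughly like the following. This is a hypothetical minimal manifest; the names, image, and log path are illustrative, not what the conformance jobs actually deploy:

```yaml
# Runs on every node that successfully joined the cluster and tails a
# host log. Nodes that never join never run it, which is the stated
# downside of this best-effort approach.
apiVersion: apps/v1
kind: DaemonSet
metadata:
  name: log-collector
  namespace: kube-system
spec:
  selector:
    matchLabels:
      app: log-collector
  template:
    metadata:
      labels:
        app: log-collector
    spec:
      containers:
      - name: collector
        image: busybox
        command: ["sh", "-c", "tail -F /var/log/syslog"]
        volumeMounts:
        - name: varlog
          mountPath: /var/log
          readOnly: true
      volumes:
      - name: varlog
        hostPath:
          path: /var/log
```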
C: So we'll start with that. And I'm wondering if the framework that Fabrizio wrote actually makes any assumptions about that, or does it just not specify how at all?
A: It doesn't. No, you just have a single callback method, and then at that point it's up to you. They have some useful code for setting up what logs you would want to scrape, but, like, the Docker implementation just sort of makes a direct connection. Amazon uses their... you know, they have kind of a bastion-like service that lets you get into any host directly, which is nice. That would be nice to use.
C: So the other thing is, we do have the boot diagnostics enabled now, and you can...
C: ...get the boot logs. Well, we want cloud-init mostly for now, yeah, at first, but we also want kubelet afterwards. But that's a good, easy way to get cloud-init logs without needing to, like, SSH.
C: And you can do az vm boot-something. I forget the command, but yeah.
E: What if we just mount a share on there, and part of the script, just in the exit handler, writes files to a share, and then we just grab them off the share?
E: Like a share mount, so, like, an NFS share or something like that. We set up a share mounted to the VM during the test runs, and then have the exit handler in the CSE end up dropping those. Or apply a new extension, a script extension or something like that, and have it drop the files at the end onto NFS...
E: ...storage. And we could have an NFS disk, and then it's just, like, the test-run UUID or something like that as a folder, and then machine names or something, I don't know.
A: Yeah, that would work. It kind of feels simpler to have them all mount some persistent volume and then have some other pod go in there and just pull the files off of it. But maybe I'm not thinking it through. Do you think that's simpler? Or do you think just setting up the NFS and all that as a precondition for all the tests sounds like a pain? Maybe it wouldn't be that hard.
A: Well, it's kind of horrible here, because tunneling through the... because finding the master is not a definitive thing. You know, you're kind of making a lot of assumptions that tunneling through this one IP address will get you through a host that knows how to get to the other one. And, yeah.
A: Yeah, I've got the commands running through exec.Command, but it's ugly. And you kind of have to... well, you have to run two commands, right? Because the -o, you know, "don't check if I'm in known_hosts, reduce strict checking," only applies to the first host, not the second one. So you have to kind of pave the way by making an initial command that does that and then does an ssh command on the second host, so it adds it to known_hosts there.
A: So you can go all the way through the second time. I don't think there's a better approach for that, but with Go, you could do whatever.
A: Thanks very much, plenty of ideas. Do we have anything else we want to say today for CAPZ stuff?
C: Okay, all right. So this is technically due tomorrow, yeah. So how do we want to do this? Because I know CAPI is done with, like, the big v1alpha3 releases, but I think we should still do another one, because we're not going to have v1alpha4 right away. Or does anyone think we shouldn't do that, that we should just start with the v1alpha4 types right now, at the same time, and vendor the main branch from...?
C: Okay, that sounds good to me. I think we should maybe try to get the ones we didn't get done into this one, and not try to, like, overstuff it with new stuff.
C: So we're still, like, on the way to slowing down and trying to get the things that we really want done, instead of trying to get new, bigger features in.
C: Okay, so I'll just move everything that's open, and then we can...
C: Yeah, so this is what's there right now. So let's just go through the list. The enhancement proposal: David, do we want to move that to next, or is it still relevant right now?
E: Or would you rather wait? Before we started talking about metrics, I was actually going to bring up writing this up, so I can start working on the refactoring for this. Part of what will be needed for the reconciliation, the cloud resource reconciliation, would be, you know, breaking out all of the reconcile services that we have, so that we can replace an interface there with one implementation or another. So, basically, all of our reconcilers.
E: I was intending on starting, or at least I was thinking about starting, to get that code structured in such a way that we could replace one with another, just based on, like, some config settings for the controller. But before we do that, we probably want to write up what the design really should be, then have everybody say yea or nay, change this stuff. So the sooner we get the proposal out there, the sooner that work can start.
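The "replace one implementation with another based on config settings" idea is the classic interface-plus-factory shape. A tiny sketch with made-up names, not the actual CAPZ service types:

```go
package main

import "fmt"

// ServiceReconciler is the interface each broken-out reconcile
// service would satisfy; the names here are illustrative only.
type ServiceReconciler interface {
	Reconcile() error
}

// vmService stands in for the implementation that talks to Azure.
type vmService struct{}

func (vmService) Reconcile() error { return nil }

// noopService stands in for an alternate implementation.
type noopService struct{}

func (noopService) Reconcile() error { return nil }

// newVMReconciler picks an implementation from controller config,
// so callers only ever see the interface.
func newVMReconciler(useNoop bool) ServiceReconciler {
	if useNoop {
		return noopService{}
	}
	return vmService{}
}

func main() {
	r := newVMReconciler(false)
	fmt.Printf("%T\n", r) // prints main.vmService
}
```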
C: So I will leave it in here. Sounds good. Okay, private clusters: I am definitely working on that, I'm working on it right now. I don't have a PR yet; it's a really big change.
C: It's backwards compatibility. We can talk about it a bit more, but basically, before, the load balancer spec was all done in the controller; it was all hard-coded.
C: There was no user-configuration part of it, and because I'm adding a whole new spec, if you weren't using that spec before, you won't have it defined. So if you have an existing cluster, that spec will all be empty, and so it's like: how do we reconcile the load balancer of clusters that pre-existed without these new features or user settings?
D: Status: I kind of worked on it for a little bit and had a chat with David, but got distracted by a bunch of other issues, like smaller things and some VMware stuff. But I was planning on working on this for the next few days, and not picking up any other stuff until I have at least, like, a work-in-progress PR, just to have the conversation started.
C: Failure domains for AzureMachinePools. That's a PR that's been open for a while. I'm going to move it to next, because even though we have the PR, I think we've pushed it several milestones, and I think...
D: Is that something that we need to get in, like, do we need...? I reached out to Jose, who's, like, the person, and he said he's just too busy. I was asking if he needs help finishing up his stuff, and he has a couple of PRs and said he wants to work on them, but he just has too much to do with his other stuff.
C: Yes, we should add that to this milestone, right? Yeah. Is there an issue for that already? Okay. Internal load balancer created for public cluster: I'm actually fixing that as part of private clusters, because it's all related. "Does not currently handle the separate route tables," where it's... yeah, where are we on that? It's...
C: Okay, the network describer interface. I think I'll leave it in here, and if X doesn't get to it, I'll take it. I think that's pretty easy to do. Improvements in e2e: is this done? I...
D: Wasn't that related to the provider or something? It is, that was something that, yeah.
C: Oh, okay, let's just leave it for now.
C: I mean, I can leave it, I think, or what should I do? Yeah, I'll leave it. Oh, this one we might want to move to next, David: it's the secure sending of bootstrap data. Yes, next! Okay, do you also want to be unassigned? All right, yeah.
C: What's the command for that? I don't know, I'll just do this. Okay, and then, oh, this we want to keep for sure.
C: Okay, yeah. Do you want to summarize the, like, action items from the meeting, the metrics and everything, and then we'll keep it in here?
E: No, that's an existing one.
E: Because what the CAPI test there does is it scales up, scales down, and then upgrades. So machine pool upgrade, I think, has been troublesome for a little bit, like it just doesn't apply your model.
E: That is true, and to be honest, I would be thrilled to go fix that.
C: Are there any other issues that are top of mind that people think should be going into this milestone, anything important?
C: Oh, I think this is definitely happening. I mean, yeah. The PR, I think, is...
D: I know we don't have enough emojis, for sure, that's clear. I think it feels like there's nothing I have any issues with, except, because, like, there used to be more people.
E: Oh, okay, yeah. I think we're trying to be inclusive also to folks who perhaps hadn't come before, like the New Zealand folks that are on a sister team to us. They haven't been coming to meetings. I think we were hoping that maybe they would, at least I was, but it doesn't seem to have worked out.
C: I think there are also, like, phases and periods where more things happen in Cluster API versus more things happening in the providers, and I think this is one of those times where more things are happening in core Cluster API, with, like, v1alpha4 coming up and the last release. So I think that's where everyone is kind of putting their focus right now, me included, and so I think there's less happening right now. It's more boring in the infrastructure providers, which is fine, yeah.
D: Yeah, that's fair. It will pick up when we have the new types to be able to perform stuff on. There'll be a lot more work then, too.
C: Yeah, that's a good idea, yeah. And we should maybe also promote it, the meeting, like, internally, like you said, David. You know, some people might not be aware, or might not remember, that this is happening now. Yeah.
E: Well, that's something that we've been talking a bit about, like how do we write about the work that we're doing and start to speak out to the community and more users? Because I think we're at that place where, you know, we have some pretty solid functionality that people could take advantage of. Cool. One outreach thing coming up is in November at GopherCon: we're going to be doing a workshop, and we're going to be focused on GitHub and, like, git flow and Go.
E: You know, building Go code in GitHub. And one of the things that we're going to work on is Cluster API git flow at GopherCon. So, just a heads-up: if anybody's interested, they're welcome to participate, and I'll send you some details if you all want.
E: Yeah, it'll be fun, it'll be fun if anybody wants to do it. And GopherCon is a really great event. If you haven't gone before, it's really fun. So if anybody's interested, they're welcome.
A: Oh yeah, sure. I guess that's all we got. Thanks, everybody, for coming. We'll see you in two weeks. I will post the recording in the document. Okay, thank you.