Description
We would like to take this time to answer questions from our contributors regarding our Automation and CI: how to get things done, why things are the way they are, where things are headed, and how to help out. All is fair game. If the audience is lacking in suggestions, we'll have some slides and demos prepared based on FAQs we've encountered.
Presenters:
Aaron Crickenberger, Google
Christoph Blecker
Aaron: Okay everybody, I think we're gonna go ahead and get started, give some time for folks to roll in, but just kind of start with the most important part of this talk, where I get to tell you that my name is Aaron Crickenberger, and you are all here to hear me and Christoph talk about automation and CI. You may have heard of me from the Kubernetes steering committee. You may have heard of me as the co-founder of SIG Testing.
You know, today we manage 147 repos; like Tim Allclair was saying earlier today, close to 150. 64 of those are in the kubernetes org, 35 are in the kubernetes-sigs org, which we created just this year, 17 are in the kubernetes-incubator org, and then there are about 30 others in two other orgs for the Kubernetes clients and the Kubernetes CSI integration.
That's a lot of repos, that's a lot of different projects, and a lot of people trying to contribute to projects in different ways. And so, rather than try and figure out what the best message is to spread across all of those people, I thought maybe it's better to just kind of do this talk live. I'd like for this to be a very interactive session, where we try and address the needs of the people who were motivated enough to show up today.
So yeah, I want to welcome you to our talk, where the topics are made up and the points don't matter. Just a couple of ideas to seed the conversation: maybe you're here because you have no idea what you're doing on this project. Maybe this is your first KubeCon ever. Or maybe, even though you've been on the project for a couple of months or years, you still feel that way.
It could be that you're actually, like, super senior and you know exactly what you're doing, you just want help solving for x. Or maybe you're really motivated and have some ideas on how the automation and CI for this project could be improved, and you want to talk about roadmap. Or maybe you want to hear what our roadmap is for automation in the project. Maybe you want to talk about fejta-bot.
The contributor survey that was sent out a couple months ago had some specific questions around what automation you liked best, what automation you liked least, and areas of the project you thought could be improved, and fejta-bot was by far the most polarizing answer. We had an exact 50/50 split between people who really liked it and people who really disliked it. fejta-bot, for those of you who haven't had the pleasure of interacting with what is definitely a robot and not a human being, is the account that's responsible for flagging issues as stale if they haven't been touched in over ninety days. It's the bot that's responsible for closing issues if they haven't been touched in over 150 days. It's also the bot that's responsible for automatically spamming retest comments on your PR if it doesn't seem to be passing tests for whatever reason, but it has passed code review.
Christoph: There are a few things that we could go through: some tools that you may have interacted with, and some tools you might not know exist. One of the really useful ones: we have a code search tool that searches all our orgs and all our repos for a specific string of code, if you know the function that you're trying to find but don't know where exactly to find it. A very useful tool. We also have a number of tools around flake hunting and visualizing test failures.
Things like Velodrome and Gubernator. We have automation and processes around a lot of our GitHub management, as I mentioned earlier: things like requesting membership to an org, that invite actually going out, and managing and pruning people from the org. All of that kind of stuff is handled through automation; there isn't a human doing that anymore.
We also have tools like our PR dashboards that help contributors manage the flood of notifications from a repo, which for one like kubernetes/kubernetes can get overwhelming. We have workflow tools as well; if you've opened a PR, you will have interacted with things like Prow, a.k.a. the k8s-ci-robot, as well as Tide, which now handles automatic merging across all of our repositories.
Audience: Okay, and you know, with triage you just ask them a question, get no reply, and it's just sitting there, right? And many times you don't have time to close it, because you're not following each and every issue when you're talking about thousands of issues out there. So with the bot it really works really well; we have our issues somewhat under control. But, like you mentioned, it takes 150 days before an issue closes, right?
Christoph: We have automation that will take a GitHub search query, go leave a comment on each matching issue to do a thing, and then we have other pieces of automation that listen to those comments to mark and transition an issue through stale, rotten, and then closed. The time frames that we've selected right now are based around pinging authors and pinging assignees to say: hey, if this is actually still real, if you actually still need this issue, if people are looking at it, please mark it as fresh, because there's been no action. Right now the time frame that we've chosen is a quarter, so a release: the idea being that if an issue has not been touched at all during the course of a release, somebody should comment and say whether it is still needed. But all those time frames are adjustable, because right now the source that we're looking at is just a GitHub search query, and we can adjust the time frame on that.
Audience: [inaudible question]

Christoph: For that particular stage, if somebody leaves a comment but it's still marked as stale, like they comment to say, hey, is somebody working on this?, it'll still be marked as stale, but it resets the stale timer, which gives another thirty days from that point. So it does reset at that particular stage. And in the comment that fejta-bot leaves, it says: hey, if this is actually still needed, please say so in a comment, so that our bots can react and go, no, this isn't stale.
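Contributors drive these transitions with Prow comment commands such as /remove-lifecycle stale or /lifecycle frozen. As a rough illustration of the query-driven side, here is a minimal sketch of how a stale-marking pass could be wired up as a Prow periodic job, assuming the commenter tool from kubernetes/test-infra; the job name, image, entrypoint, and the exact flags and query are illustrative assumptions, not the production config:

    # Hypothetical periodic job: comment on issues that have gone quiet.
    periodics:
    - name: periodic-issue-triage-stale      # illustrative name
      interval: 1h
      spec:
        containers:
        - image: gcr.io/k8s-testimages/commenter:latest   # assumed image
          command: ["/commenter"]                          # assumed entrypoint
          args:
          # GitHub search for open issues with no lifecycle label yet...
          - --query=org:kubernetes is:open is:issue -label:lifecycle/stale -label:lifecycle/frozen
          # ...that have not been updated in 90 days (2160 hours).
          - --updated=2160h
          - --token=/etc/github/oauth
          # The comment itself carries the command other bots listen for.
          - --comment=Issues go stale after 90d of inactivity. /lifecycle stale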
Aaron: So this was the graph I was trying to find. This is DevStats, a project developed by the CNCF, which shows all of the open issues and PRs that were opened against the kubernetes/kubernetes repository. Can you guess when we turned on fejta-bot to close issues that were older than 150 days? Right.
So, one of the complaints I often hear with fejta-bot is: guys, I'm really sick of this; the bot flags my issue as stale because nothing's happened in ninety days, and then I remove the stale label, because I still care about this issue, and then 90 days later the bot adds the stale label back, and I'm really annoyed by this, so I remove it again, and then 90 days later... whoa, whoa, whoa, hang on, hang on: maybe the problem is that for 270 days nothing has happened on this issue.
Would you like to help out with this issue? Could you maybe help us understand why this issue is more important than the thousands of other issues that we have on the project? I know that's not a great answer; it's a difficult problem to solve, but ultimately I think it's a problem of finding the most efficient way for our contributors to find the right things to work on. It's not necessarily the bot's fault that we don't have enough people working on your particular problem right now. So, I mean, for real.

Audience: [inaudible question]
Aaron: So we can do this for things like bumping the version of the container that is used to run the end-to-end tests, or changing the testing libraries that we use to actually stand up the clusters, things of that nature. We do actually kind of lack the concept of a staging Prow cluster, so when we deploy Prow changes, we do it live.
We actually historically kind of did it on Friday afternoon, yeah. You can also thank fejta-bot for that, for real. We eventually evolved from a human named Erick Fejta having the tendency to deploy Prow on Friday afternoon whenever he was on call, to creating a Prow plugin that automatically opens up a pull request every morning, so that we can check that pull request, see what has changed in the test infrastructure, and decide if it is prudent for us to deploy that day.
Christoph: When we deal with things like making changes to live GitHub issues and GitHub PRs, we do have things in Prow like a dry-run client and a fake client, where we try to basically guess, based off the API specs, what GitHub is going to return to us and how we're going to interact with it. So there is a bunch of unit testing around those kinds of things with our fake client, as well as the dry run.
That said, in practice there are a number of things that we still haven't figured out and still haven't solved, at the rate of webhooks that we end up receiving and given how dependent we are on GitHub sending us details on when a PR was updated, when an issue was updated, and what has changed on that PR or issue. We get issues where webhooks will just completely go missing, and a certain comment action or a certain push action will just not appear to us. Or we'll get issues where the API spec says we should get a certain response back, but actually we'll get a slightly different response back from the v3 API. So a lot of it ends up coming down to trial and error and seeing what works at this particular scale, because every second we're receiving so many events from GitHub and trying to ingest them in Prow to take some action.
We have postsubmit jobs that run against the branch after a PR is merged, to do an action or to run a test immediately after that action has been taken. We have periodic jobs, CI jobs that run on a regular schedule to maintain a certain CI signal. All those different types in here are sorted primarily by SIG, but we also have some provider-specific ones to provide signal on different types of things.
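All three job types are declared in Prow's YAML configuration. A minimal sketch of the three shapes, with placeholder job names and images rather than real jobs:

    # Presubmits run against a PR before it merges.
    presubmits:
      kubernetes/kubernetes:
      - name: pull-example-unit
        always_run: true
        spec:
          containers:
          - image: example.com/builder:latest
            command: ["make", "test"]
    # Postsubmits run against the branch after a PR merges.
    postsubmits:
      kubernetes/kubernetes:
      - name: post-example-build
        spec:
          containers:
          - image: example.com/builder:latest
            command: ["make", "release"]
    # Periodics run on a schedule to maintain CI signal.
    periodics:
    - name: ci-example-e2e
      interval: 2h
      spec:
        containers:
        - image: example.com/e2e:latest
          command: ["./run-e2e.sh"]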
These are used both by the SIGs, to identify when they have issues, and very heavily by the release team, to provide signal before, during, and after a release on how the release is doing and whether it's stable. Any time they're looking at cutting a particular release or cutting a particular branch, they go back to this and use it as their main indicator of: is this release healthy, or are we going to have some issues with it?
This tool is also really helpful for identifying flakes, because you can see PRs... sorry, you can see tests and specific jobs over time, and whether they take longer, with these time series that are springing up; you can see how long a job is taking, and you can see outliers. So yeah, with the graph button you can do test duration in minutes, and that works for anything.
Yeah, there are a lot of interesting details that you can kind of pull out of some of these, and we run so many different test suites: we run upgrades, we run downgrades, and there are jobs in here to check out specific cloud providers. We also have jobs in here for the automation that we have around org memberships.
So, right now our org memberships are defined in YAML in a GitHub repo, and then a bot takes those, compares them against our current membership in GitHub, and goes and makes mutating changes. That job is a postsubmit CI job, so once a PR merges, that job will run, and we can see the results in Testgrid of each individual run of that task against GitHub.
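The shape of that configuration is roughly as follows; this is a minimal sketch loosely modeled on the kubernetes/org repo and the Peribolos reconciler from test-infra, with placeholder names:

    # Illustrative org definition; a reconciler diffs this against live
    # GitHub state and issues the invites and removals needed to converge.
    orgs:
      kubernetes:
        admins:              # org owners, kept deliberately small
        - example-admin
        members:             # everyone else with org membership
        - example-contributor
        - another-contributor
        teams:
          sig-testing:
            members:
            - example-contributor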
So we try and automate as much as we possibly can at the scale that we're trying to take actions at: even something as simple as somebody putting in a membership application, and being able to make sure that that membership gets processed, the person gets their invite to our GitHub org, and then gets whatever permission set they need.
At the scale that we're doing it, we need to be able to scale that out, and we also need to be able to deal with limitations in GitHub's permissions model, because it used to be that the only people who could do this particular task needed super-owner privileges over everything that we have, and we try to limit those as much as possible. So now, when we take a defined configuration in GitHub and have a bot do all the work for us, we know exactly what's happening.
Aaron: Yes, and so, if you want to do anything GitHub-related on this project, this is where you probably go: the kubernetes/org repo. We use that handy-dandy feature of GitHub where you can use predefined templates for different issues. Generally we expect, if you're coming to this repo, that you want general support, or you want to be added as an organization member so that you can do fun, cool stuff like be assigned issues and PRs. Wait, sorry, I mean actually applying LGTM to PRs: you want review privileges.
Basically, org membership is the thing we use as a proxy for: we trust you, and you have decided you also trust us and want to work on the project with us. One of the other ways we use this often is that, if you're not a member of the org, we apply a needs-ok-to-test label on your PR, because you could be some random person who's trying to submit a Bitcoin miner as a pull request so that you can run it on our fancy-dancy test infrastructure.
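A trusted org member lifts that gate by leaving a Prow comment command on the PR; the exchange below is an illustrative sketch:

    (PR arrives from a non-member; Prow labels it: needs-ok-to-test)
    reviewer comments:   /ok-to-test
    (Prow removes the label and triggers the presubmit jobs)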
Also, if you have some random third-party integration that you would really, really like to try out on just your repo, you end up getting punted to: well, a GitHub admin has been sent a notification about that. I really don't have a great way to reply back, other than stalking down the person who tried going through that flow, because we need to have a larger conversation about why you are adding that integration, just because Kubernetes is a super large open-source project and we take everybody's privacy seriously.
We find that there are many third-party integrations that require more privileges than perhaps they should, and we'd like to understand if one is going to cause any headaches for us down the road. So, for example, those of you who remember Reviewable: a couple of years ago it was used as an alternate code review path, instead of doing pull requests, and we found that it just wasn't really worth the hassle from an administration perspective. I also wanted to punt us back to the Testgrid question.
I seem to have already blown away the tab, but I want to make clear that anybody in the Kubernetes community can contribute their test results back to Testgrid through a PR process. This is here for you to use. So we have a conformance tab, for example, where different Kubernetes providers who want to demonstrate that their offering is certified as conformant can show their results, not just at the moment they opened up a PR to the CNCF repo for that. But, for example... boy, I hope this works. Look.
So, that said, I feel bad that Testgrid isn't open source. I definitely have a dream that I want to be able to stand up a Kubernetes cluster and have an opinionated CI stack right out of the box. Prow is super cool, but it doesn't really provide this sort of historical view of all of the test results. You do kind of get a view of what jobs Prow is running right now, along with some filter dropdowns that you can use to go by presubmit or periodic or postsubmit.
We want to open-source Testgrid. There are two parts to it. One of those is the front end that displays all the things. I know there are probably some of you who know a thing or two about CSS or JavaScript and would love to finally make Testgrid display the timestamps in local time instead of Google Standard Time... I mean, Pacific time.
There are some of you who might like to understand what all of these columns actually mean; you can't actually hover over them, for what it's worth. Here's another pro tip, maybe you haven't noticed: that stands for the commits, apparently, and that stands for the time. I'm sure this is all super tiny. There's also the backend component, which is the piece that is responsible for scraping all of the data out of Google Cloud Storage buckets and translating it into data that is more efficiently consumed by this UI.
We want to open-source both of those things, and we want to hear your use cases for why you agree it should be open sourced. What do you, as a potential customer of this offering, want? So if you want to use Testgrid, if you think it should be open source, please come talk to me. Or, I hate to do this to you, Michelle, but I will call out the fact that that is basically the Testgrid author right back there with her hands up. Yay.
The current maintainer of Testgrid. But much as I am affectionately referred to as 'the community' within my team, Michelle is 'the Testgrid' within our team, so all credit to her, for example, for this recent change where, if you know there's a job that you care about but you forget which stupid tab or dashboard it's in, you can just type in 'AWS', go see the conformance Gardener thing, hit enter, and it'll go.
Okay, lots of opinions on that one; just stop me from monologuing if I go on too long. Yes, I first see a need for each SIG to kind of own their tests. We do this partially by saying: here are the different groups of dashboards that correspond to different SIGs. So I sort of expect that for everything under sig-aws, for example, SIG AWS is responsible for making sure that every single job on each of these dashboards is solidly green, all the time.
Okay, you've got a couple of red ones. Sorry, let me try calling out a different SIG. Let's be fair and see what SIG GCP has. Okay, also a little bit red. This is the reality of the project today: almost every single SIG has a bunch of failing tests, and it's really impacting the velocity of the project and the health of the project. I would like to help SIGs change this.
One way, for the SIGs that I am aware of that actually have this handled, is that they send their test results to a Google Group, and they have an on-call person responsible for triaging. So as long as you actually have your tests green to begin with, you can then have somebody make sure that they stay green. That's one way of doing it. You also said the magic word, federation. I just want to hammer on the point that Testgrid gets all of its data by reading from Google Cloud buckets; it doesn't really matter how the data got into those buckets in the first place.
So while I love Prow, and I think it's awesome, and we run tens of thousands of jobs a day through Prow, we also support reading results that are put in there by, say, Travis CI or Circle CI or, if you are so inclined, Jenkins. As long as that bucket is publicly readable, we will read the data out of it and consume it.
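That federation story is just configuration. Here is a minimal sketch of a Testgrid entry that reads results a third party uploads to a public GCS bucket; the group, dashboard, and bucket names are illustrative:

    # Point a test group at any publicly readable GCS prefix laid out
    # in the expected results format, then surface it on a dashboard.
    test_groups:
    - name: example-provider-conformance
      gcs_prefix: example-bucket/logs/conformance-test
    dashboards:
    - name: conformance-example
      dashboard_tab:
      - name: example-provider
        test_group_name: example-provider-conformance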
So this is how the conformance stuff works, for example. Though I'm not sure they actually have any data in here, we do read from buckets that DigitalOcean and others can populate to prove that their offerings are conformant. So this is one way that we can encourage federation of periodic and postsubmit results. Then the other question becomes presubmit results. Generally, if you're running your own project, we trust you to take care of what's important to you from a presubmit perspective.
So, like the cluster-api-provider-aws project: I'm sure they've got their own presubmits that they care about, or maybe the kustomize project. I start to get a lot more finicky about what the requirements are to have a presubmit when it comes to blocking the kubernetes/kubernetes repository or an actual release. That's where we've tried to put together a set of documented policies and requirements for what it means to be release-blocking, and I anticipate we will do something similar for merge-blocking. So here.
Christoph: There are ways that we can, and have in other repos, and are going to be trying to tackle this as much as we can for kubernetes/kubernetes, where you can take different pieces and split them up into separate tests. That has the benefit that, number one, they can run in parallel: when we start testing on a particular commit, almost all the tests will start as soon as there are CI resources available to run them.
The other advantage, as far as dealing with flakes is concerned: if a particular test is flaking for some particular reason, you can retest just that specific test in Prow, as opposed to running one large CI job where, okay, it failed in this run due to a flake, okay, I'm going to hit retest, and it's going to take another hour and 20 minutes
for that particular job to run. The more we can split those pieces up, the better the experience contributors will have: if retesting after a flake ends up taking you 15 minutes as opposed to taking an hour, that makes a really big difference as far as the impact those flakes have on PR authors.
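That granularity comes from Prow's retest commands; the job name below is a placeholder:

    /test pull-example-integration    # re-run just the one flaked job
    /retest                           # re-run every failed presubmit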
Aaron: So what I clicked through here was an example of a Prow job configuration file. You may have actually read this; I forget who wrote what, but this is an example of a job that basically involves running this container. This should look an awful lot like a pod spec to those of you who have worked with pod YAML day in and day out: here's the image we want to run, and here's the command we want to run inside of that image.
Here's some information about the repo you're going to clone in order to run this job, and then here's a bunch of stuff that we want to use to understand when and how to schedule the job. We use these labels for things like automatically pre-populating service accounts or credentials to stand up a cluster in AWS, for example.
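Putting those pieces together, a sketch of such a job might look like the following; the job name, image, repo, and preset label are illustrative assumptions about how Prow's preset mechanism injects credentials:

    periodics:
    - name: ci-example-e2e-aws          # illustrative name
      interval: 6h
      labels:
        preset-aws-credential: "true"   # assumed preset; matches a preset
                                        # that mounts AWS credentials
      extra_refs:                        # the repo cloned to run the job
      - org: kubernetes
        repo: example-repo
        base_ref: master
      spec:
        containers:
        - image: example.com/e2e-runner:latest
          command: ["./run-e2e.sh"]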
This job is just a YAML file, and it lives inside of a repo, in a directory that's named after the repo it works against, and it has an OWNERS file, and that OWNERS file has these people listed as approvers. So if you want to add a job for the cluster-api-provider-aws repo in kubernetes-sigs, you don't have to talk to anybody from SIG Testing at all. You can talk to Justin, or Tiberius, or Chuck, or David Watson, right?
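An OWNERS file is itself a small YAML file; a minimal illustrative sketch with placeholder GitHub handles:

    # OWNERS file sitting next to the job configs for one repo
    approvers:
    - example-maintainer
    - another-maintainer
    reviewers:
    - example-contributor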
We have a similar pattern for the jobs that run against kubernetes/kubernetes, where we have directories named after each of the different SIGs. So sig-network, for example, has a bunch of different files; there's no real consistent naming scheme, but, for instance, here are jobs that stand up tests related to ingress.
Christoph: And we need things like this, because there's no way that SIG Testing as a group can go and be like: okay, we are the managers of all CI and all tests. We need help from the SIGs to be able to not only design the tests that are running and design the jobs, but respond to them when they fail, and fix them.
Aaron: Since we are getting close to time, we have about eight minutes left if there are any burning questions. But one thing I wanted to make sure I showed people, because I often feel like people aren't aware this stuff exists: this is the PR dashboard as offered by Gubernator. So if you, like me, refuse to look at your GitHub notifications, because you have like 800 of them, and you try declaring bankruptcy on that but end up getting another 300 the next day...
There are also 37 PRs that are in some way, shape, or form on Christoph's plate, and Christoph also happens to have two PRs outgoing. So these are PRs he needs to pay attention to from the perspective of: I authored it, I really want it to merge, what do I need to do? You're probably wondering what these things mean and how PRs end up here.
So there's a link at the bottom: 'needs attention' is based on a simple state machine, and because we love documentation, and we thought code is the best form of documentation, you can see we link directly to the state machine code itself. Essentially we try and say: if there's a PR out there, and it's assigned to you or your review is requested, and then one of these things happens, either a comment applies to it or somebody labels it as LGTM, it's going to go back to you.
We could probably stand to document that better. We could also probably stand to maybe have this based on GitHub queries that are specific to certain repos, or something; we've taken a stab at this, but not quite as thoroughly, with the PR dashboard from Prow. So, since we now live in a world where Prow and Tide are responsible for the testing and merging of every single PR across all 147 of our repos, you might have questions about what it takes for your PR to merge in any given repo, and this dashboard uses a GitHub query.
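Under the hood, what it takes to merge is expressed as Tide queries in Prow's configuration. A minimal illustrative sketch; the exact label sets vary per repo:

    tide:
      queries:
      - repos:
        - kubernetes/kubernetes
        labels:                # all must be present to merge
        - lgtm
        - approved
        missingLabels:         # none of these may be present
        - do-not-merge/work-in-progress
        - needs-ok-to-test
        - needs-rebase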
So, for example, we're looking at all PRs that are open that Christoph wrote, but we could, and I'll just call myself out here, also see what it looks like for all PRs that I wrote, and apparently I need to work on, like, passing tests. I also have a bunch of labels on this PR. One of them is good, because it's a required label. One of them is bad, because it shouldn't be there.
So I need to find some way to get rid of the work-in-progress label on my PR, which I can do, because I have titled this PR as work-in-progress; I don't want somebody to accidentally merge it. If I were to remove this WIP text from the PR title, that label would go away. So we feel like this is a more granular view of what it takes for your outgoing PRs, for example, but I have found that the Gubernator dashboard is by far the best
way of keeping up with your workload. So if you work on this project, and you have to deal with a lot of PRs day in and day out, and you've never seen this, I would highly encourage you to give it a shot. It's also a good way to just kind of stalk other people's workloads: so maybe you're waiting on Daniel Smith, also known as lavalamp... don't do that, find somebody else. Which is just another thing.
Christoph: It just comes down to the scale of what we're dealing with. We're talking about 147 repos, including the core Kubernetes repo, which currently has nearly 2,200 open issues and nearly a thousand open pull requests. That is way too much information for anybody to digest, even the people who are very involved, people who are top-level approvers, people who are involved in SIG Architecture.
Aaron: Okay, one last shiny thing I wanted to make sure I show people, because I'm not sure how many people are aware we have this. This is our dashboard that is based on BigQuery metrics, run against a publicly accessible BigQuery data set. So if you know what BigQuery is, or you know how to use it, you can absolutely query the same set of data that we do, to produce maybe a better dashboard for you.
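For a flavor of what those metrics look like, here is a minimal sketch of a metric definition with an embedded BigQuery query; the config shape, dataset, table, and field names are assumptions modeled on the public Gubernator build data, not a verbatim production config:

    # Illustrative metric: jobs with the most failures in the past week.
    metric: failures-per-job
    query: |
      SELECT job, COUNT(*) AS failures
      FROM `k8s-gubernator.build.all`    -- assumed public dataset/table
      WHERE result = 'FAILURE'           -- assumed result field values
        AND started > TIMESTAMP_SUB(CURRENT_TIMESTAMP(), INTERVAL 7 DAY)
      GROUP BY job
      ORDER BY failures DESC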
On this dashboard I want to point out this table right here, where we show the flakiest PR jobs for the past week. So, for example, right now, unfortunately, the kops AWS job has been the job that's failed or flaked the most this week. A flake is something that passes or fails even though the commit hasn't changed; you've just run the same test.
This is a great place where, if you have time, and you want to help improve the health of the community, and you want to figure out the most important test for you to fix right now, like, the test that is impacting people the most: turns out it's this integration test called test terminal pod eviction. If you could fix that, you would make life really great for a lot of people. This is also a dashboard... Josh Berkus will personally send you a gift if you fix test terminal pod eviction. You've heard it here first.
It's also a dashboard I can use to, at some point, start coming after people whose jobs have been running continuously for over 400 days and continue to fail every single day. I don't think anybody's paying attention to these; I don't think we should be spending money generating these results. Finally, if you feel like your PR is running into a lot of flakes, or things seem flaky lately,
we have this graph that just shows, as a percentage, how many times a given PR job on kubernetes/kubernetes has failed versus passed. So we can see here, for example, that the kubernetes e2e GCE job had kind of a bad day right close to the end of November, and I've used this in the past too: for example, here the integration test job was awful around October, and then shortly thereafter we started having a problem with the kops AWS job. So this is just a good way to confirm your intuition that, yes, this problem isn't just affecting me, it is actually happening elsewhere as well. And if you want to find out how to improve this situation, now that you know what is flaking, come to another talk that is happening sometime during this contributor summit, where I will explain how you can hunt these down and fix them.
Christoph: I think we're at time, so I'll just throw up this one. This is where you can find us. I'm always around, watching the SIG Contributor Experience channels in Slack and our mailing list, and we have issues open in kubernetes/community, if you want to help out with improving the experience of contributors, not only to the kubernetes core repo, but across our entire organization.