From YouTube: Kubernetes SIG Testing - 2019-06-25
A: Hi everybody, I am Aaron, a SIG lead, and you are at the Kubernetes SIG Testing weekly meeting for Tuesday, June 25th. We'll all be adhering to the Kubernetes code of conduct by basically not being jerks. On today's agenda, we're going to have Steve walk us through the new monitoring stack that's been put together for Prow, and then we have some discussion about configuration and related things that we're interested in for Prow. So I will hand off to Steve to show us Prow monitoring.
B: Yes — you can see that? Okay, awesome. So this is the monitoring for the Prow cluster that I own, and we have added a number of dashboards that we've found useful and interesting. Some of these already existed in Velodrome and are currently just being calculated using the— let me take one step back. The architecture of this monitoring stack is a Prometheus instance, a Grafana instance, and an Alertmanager instance that are standalone. All of the manifests to deploy those are in the repo, and they don't require, like, any other external services, other than a PV if you want your data to stick around for a long time — whereas the Velodrome instance requires some other databases, and it's complicated and much harder to set up. So this data here is for the Tide dashboard; it's pretty much the same thing that was exposed in Velodrome, it's just being exposed from Prometheus itself and not using the push gateway.
B: So these are some of the pre-existing dashboards — they're still here — and then there have been a couple of new dashboards added, like the actions of Tide, and you can look at, you know, interesting hysteresis over a week. The metrics underneath this sort of thing can also be used, for instance, to calculate, like, HTTP request and response latency for Deck, and there's an HTTP section here.
B: The metrics underneath all of these plots are also hopefully going to be used to create alerts. So, for instance, one of the alerts that is set up — though it's a little bit finicky — is: if hook hasn't ingested a webhook from GitHub for five minutes during what we've arbitrarily called, you know, business working hours, then the alert fires. And, you know, some other alerts that might be interesting could be if the ratio of 500s or 400s coming from Deck is larger than expected.
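[Editor's note: a Prometheus alerting rule in the spirit of the webhook-staleness alert described here might look roughly like the sketch below. The metric name `prow_webhook_counter` and the exact expression are assumptions, not the rule actually deployed; the real rule also gates on business hours, which is omitted here.]

```yaml
# Hypothetical sketch of the "hook stopped ingesting webhooks" alert.
groups:
- name: prow-hook
  rules:
  - alert: WebhookIngestionStalled
    # Fires when hook has recorded no GitHub webhooks over five minutes.
    expr: sum(increase(prow_webhook_counter[5m])) == 0
    labels:
      severity: warning
    annotations:
      message: hook has not received a GitHub webhook in 5 minutes
```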
B: We also show the remaining tokens for GitHub, alongside, like, what exactly the calls were making, stuff like that. And we're hoping to add more dashboards as we realize which parts of the system are useful to monitor. So, if you're looking to get involved with this, all of the manifests for this can be found in test-infra; they are under the prow directory.
B: Actually, that's US working hours — because, right, so the version of that that's running on the Prow cluster I personally administer has working hours defined as any working hour where we have somebody who can be on call. Obviously somewhat suboptimal, but it's at least better; we had a fire on one of the U.S. holidays because nobody was working. But, you know, it's going to be a fragile alert — it should have only a few false positives, but it's admittedly fragile.
A: I'm hugely excited by this. I haven't really been paying too much attention, because I didn't want to watch a pot boil, but I just did a quick glance and it looks like you have all the Grafana dashboards stored inside of ConfigMaps — so we can use the same, like, GitOps-driven model that we use to update jobs and stuff to update our Grafana dashboards?

B: Yep, that's right.
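[Editor's note: the pattern described — dashboards checked in as ConfigMaps so changes flow through the same review process as job config — could be sketched as below. The names, labels, and dashboard JSON are illustrative, not the actual manifests in test-infra.]

```yaml
# Sketch: a Grafana dashboard stored as a ConfigMap, managed via GitOps.
apiVersion: v1
kind: ConfigMap
metadata:
  name: grafana-dashboard-hook
  labels:
    grafana_dashboard: "true"   # hypothetical label a provisioner watches for
data:
  hook.json: |
    {"title": "Hook", "panels": []}
```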
B: Or something — so this is currently, like, kind of an open question. The approach that OpenShift has been using is: we have a staging namespace with all of this stuff deployed, and the Grafana instance there is just using the production data as a backend, so you can log into the staging Grafana, noodle around manually, and then create something that you like — see what it will actually look like with production data. Cole wasn't super interested in having to deploy twice as many things, so right now the Kubernetes Grafana has an admin login that you can use, and the thought was: you would create the dashboards there, and then delete them, and then check them in. Yeah — but I think that's still kind of up in the air; Katharine was talking about, you know, maybe having a staging anyway, so yeah.
A: I mean, I personally am interested in it from the perspective of: we've had a number of people — like on the release team, or just community members in general — who are interested in, like, making dashboards to show things, since we've talked about test health and flakiness and stuff like that. Yeah, I'd love to give them, like, a canonical place to do that.
B: Yeah, and I think that, like, part of that is a question of: are there essentially security problems with that? I don't know if there are. I just want to pull up super quickly one of the dashboards that we've got that I think we haven't migrated yet, 'cause it's just specific to what we're doing, but it would be useful.
A: The one thing that I don't see in this dashboard, that I know we use Velodrome for — but I think we just use it as, like, an InfluxDB sink — is the jobs that run queries against BigQuery and then transform the results using jq into things that can be dumped into Influx, so that we can get statistics about — yeah, the BigQuery metrics dashboard, the one that shows sort of everything that's going on. Yes.
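[Editor's note: the pipeline described — query BigQuery, transform each result row with jq, dump the output into InfluxDB — could be sketched roughly as below, with Python standing in for the jq step. Every name, field, and value here is invented for illustration; the real jobs live in test-infra.]

```python
# Hypothetical sketch: turn query-result rows into InfluxDB line protocol,
# the text format Influx accepts for writes (measurement,tags fields timestamp).

def row_to_line_protocol(row, measurement="job_results"):
    """Render one result row as an InfluxDB line-protocol string."""
    tags = ",".join(f"{k}={v}" for k, v in sorted(row["tags"].items()))
    fields = ",".join(f"{k}={v}" for k, v in sorted(row["fields"].items()))
    return f"{measurement},{tags} {fields} {row['timestamp_ns']}"

# Example rows as a query job might emit them (made-up data).
rows = [
    {"tags": {"job": "pull-kubernetes-e2e", "state": "failure"},
     "fields": {"runs": 3}, "timestamp_ns": 1561420800000000000},
]
lines = [row_to_line_protocol(r) for r in rows]
# Each line could then be POSTed to Influx's write endpoint.
```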
B: At the moment it's not using any Influx at all, and that's where it's, like, significantly less complicated — it's not using it as a backing store at all; it's just using Prometheus entirely.

A: Got you. What's the retention like on that?

B: It just depends on how large you make the PV. I don't know off the top of my head what it's configured for; for OpenShift, I think we have 100 gigabytes and we have it set to two months.

A: Okay.
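[Editor's note: the two knobs mentioned — PV size and time-based retention — correspond to a Prometheus flag and a PVC spec, roughly as sketched below; the values mirror the OpenShift figures quoted (about 100 gigabytes, two months), and the names are illustrative.]

```yaml
# Prometheus retention is set via a startup flag, e.g.:
#   --storage.tsdb.retention.time=60d
# The backing store is just a PersistentVolumeClaim:
apiVersion: v1
kind: PersistentVolumeClaim
metadata:
  name: prometheus-data
spec:
  accessModes: ["ReadWriteOnce"]
  resources:
    requests:
      storage: 100Gi
```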
B: Absolutely. So yeah, when you're creating your Grafana dashboards — like, configuring which data source they draw from — if we continue to have the pieces that ingest GitHub data and digest that into Influx, as well as having the BigQuery stuff also, you know, publishing to that database, then as long as that database is around, all of that can just be transparent.
A: This is partially inspired by the awesome docs that Daniel showed us from the Kyma project — though I still haven't even begun to plumb the depths of what we could use from that. The other thing is all this stuff around job management. If I click on this, it will take me to a README inside of the config/jobs directory, which tries to spell things out, hopefully, a little bit more. I still don't think it is quite as ideally step-by-step as what I have seen from the Kyma project and other people, but we try to talk a little bit more now about, like: hey, what even are presubmits and postsubmits, when we use this language? What are sort of our best practices for the images you're supposed to use for jobs? What even are these preset things that you see everywhere?
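[Editor's note: for readers unfamiliar with the terms above, a presubmit entry of the kind the config/jobs README documents looks roughly like this; the job name, image, command, and preset label below are made up for illustration.]

```yaml
# Hypothetical presubmit: runs on PRs before merge.
presubmits:
  kubernetes/test-infra:
  - name: pull-test-infra-example
    decorate: true
    labels:
      preset-service-account: "true"   # a "preset" injects shared env/volumes
    spec:
      containers:
      - image: golang:1.12
        command: ["go", "test", "./..."]
```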
B
Eric's
knowledge,
no,
he
is
not.
Yes,
it's
just
for
everyone
else
that
might
be
interested,
definitely
jump
into
this
right
here
and
hopefully
you're
not
hearing
the
wine
jump
into
the
thread
and
put
your
thoughts
in
there.
The
basic
thing
is
I
think
you
can
so
right
now
we
were
kind
of
in
like
a
weird
halfway
State
in
Prague,
where
you
can
either
configure,
but
you
can
configure
proud
to
operate
inside
of
a
namespace
by
changing
config
mode.
You
can
do
it
by
changing
the
credentials
that
are
given.
B: On top of that, you're also able to, like, direct — like, I think the build controller and the pipeline controller currently also honor a specific field on the ProwJob that tells them which namespace to use to put things, where the things they create are not necessarily pods. And now, on top of that, I believe with those two controllers as well, you're able to change how that works by giving cluster credentials with a default namespace. So—
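[Editor's note: the config-side half of this — pointing Prow at namespaces through its config file rather than through credentials — involves fields along these lines; the values are illustrative, not the settings on any real cluster.]

```yaml
# Sketch of the namespace-related fields in Prow's config:
prowjob_namespace: default   # where ProwJob custom resources are created
pod_namespace: test-pods     # where the controllers schedule build pods
```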
A: I'd say, personally — like, I think I can understand the idea that we want to, like, have sort of sane prescriptive defaults, but we also want to allow enough flexibility for people to sort of meet their use cases. So I'm wondering if this is a case of, like, there being too many options that can conflict with each other too much, or if we just haven't, like, documented sort of the hierarchy of which options should override which, so that we can sort of— it's—
A: That's fair. I have noticed a trend a lot where we're also not really sure what the best practices are. So our tools sort of have, like, really sharp edges that can allow you to cut yourself, but then we try to sort of figure out what sane practices are, and then lint or test those away via presubmits. So—
A: Yeah, this is just kind of, like, off the agenda, but I tried moving the job over to the pod utilities as part of that docs thing, and I find that there's still a little bit of weird, unexpected behavior there that is difficult to test locally. Like, I would prefer to live in a world where we can encourage other people to migrate their jobs over, and the best way to empower them to do that is to give them, like, a, you know, develop-and-test cycle before they open up a PR. Right.
B: With these, like, sub-sub-items in it, it was like: the self-service management story for ProwJobs is, like, pretty poor without layering on extra tools — which I think... but I think, as I mentioned to you on Slack, part of that has always been, like — I think there's a higher level of trust in other deployments of Prow, and so then, you know, running job changes as a presubmit for PRs that are changing config makes sense. Yeah, there's definitely a middle ground.
A: So I kind of feel like I don't really have the right people here, or enough time, to talk about the last thing I have on the agenda, which was breaking Prow into its own subproject and understanding the mechanics of moving Prow to — sort of, the code lives over here and the configuration lives over here. That's what I want to get to, and I want to understand what technical or social limitations we have in between here and there. I'm also aware that next meeting, next week, would be July 2nd. I am planning on being around for that, but it is near — it is adjacent to — a holiday in the US, which may mean some people are not available. So I will be sending out a notice to the mailing list asking if we want to hold it, but I think it would be a great idea to sort of walk through the Prow epics doc next week and really kind of hash some of these things out. I also am aware that we have a sort of GMT-friendly, or European-friendly, meeting happening on Friday, Steve.