From YouTube: Testing Group Think Big #3
Description
Today we think small about slow tests. What is the fastest way to test whether users value knowing which test is the slowest?
A
This is the Think Big for the Verify Testing group for July 14th, 2020. Today is the second part of talking about test history. So, a quick recap: we did the Think Big part of Think Big (just a little bit of a misnomer) a week ago. You can read through the notes in the agenda or view the recording, which is unlisted and available on the YouTube Unfiltered channel. Where the discussion really went was providing actionable suggestions about tests that are slow or flaky based on their past performance, and we thought that a good experience for that, for a developer, would be tips in the IDE, the Web IDE, about how to not write those kinds of tests.
A
So I added a sub-bullet in our agenda today. I think the problem that solves is that it's hard to know which tests are slowing down our pipelines the most, and impossible to not write more tests like that on a distributed team where that knowledge might be spread out. Which is a subtle way of saying there are two problems here. So the problem that I want to tackle today, in the Think Small part of Think Big, Think Small, is the first half of that.
A
We could do a research spike. We could do a survey. We could say the next thing that we need to do to validate and move forward on this is additional research that looks like this, to answer this question, and do that through, you know, customer interviews. So we're looking for a tangible issue that comes out of this, one that we can act on in the next milestone, so that we can move this forward towards that direction of, hey:
A
Now we have this great view into which tests are problematic and slow, and we're not going to write more of them as we proceed in this project. And a preemptive thank you to Ricky for taking notes of my rambling as we go. So that's the Think Small. I'm going to tee it up with: what could we do in our next milestone, 13.3 or 13.4, that moves us forward towards that vision of "hey, now we know which tests are..."
B
Slow. Good question. I guess my first question is: what is our definition of slow? Is it one minute, ten minutes, an hour? I think it's going to be subjective per group.
A
Let's focus on our persona of, like, the internal stakeholder, and really, I think, this goes to the manager. So Ricky, or even Darby or Sam: what would you say is a too-slow test as we're looking at a team?
C
I'll say something, I guess. I think maybe not, like, a definition of slow, because I think that's relative, but it could be just, like, the slowest test. So if I have the whole test suite, it'd be: what's the slowest one? And maybe the slowest is still plenty fast for my project, but maybe it isn't. So that would, you know, show me the slowest ones, and then I'll go fix the slowest one and save that much time on my build.
D
Okay. Would we be able to get some sort of average? So it was like, you know... most tests take... you know, if you've got end-to-end tests, they take a lot longer than unit tests. So you look at a suite and say: most of these tests take 10 seconds-ish, this one's taking five minutes, this is a slow test. Or, you know, some sort of deviation-from-the-mean kind of thing might be more helpful than figuring out, you know, how slow is slow. How long's a piece of string?
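
D's deviation-from-the-typical-duration idea could look something like the minimal sketch below. All test names and durations here are hypothetical; note it uses a median-based cutoff rather than the mean D mentions, since a single five-minute outlier drags the mean up and can hide itself.

```python
from statistics import median

def flag_slow_tests(durations, factor=3.0):
    """Flag tests that take much longer than the suite's typical test.

    durations: dict of test name -> duration in seconds.
    factor: multiples of the median duration that count as "slow".
    The median is used instead of the mean so a few huge outliers
    don't mask themselves by inflating the average.
    """
    typical = median(durations.values())
    slow = {name: d for name, d in durations.items() if d > factor * typical}
    # Slowest first, matching the "fix the slowest one" workflow.
    return sorted(slow.items(), key=lambda item: item[1], reverse=True)

# Most tests take ~10 seconds; the end-to-end one takes five minutes.
suite = {"test_login": 9.8, "test_signup": 11.2,
         "test_search": 10.5, "test_e2e_checkout": 300.0}
print(flag_slow_tests(suite))  # [('test_e2e_checkout', 300.0)]
```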
C
So I think that a lot of the discussion around averages might get a little bit strange in those types of situations where maybe you just have five real slow tests that test your whole application, kind of thing, so averages aren't going to be as useful there. I think we talked last time about what Darby brought up: give me a dashboard that's ordered by the slowest, and then I'll address them in order as we go.
C
I think that was one of the key points that we talked about last time, and I think that would still probably be the best way to go about this, and you could add additional metrics in there. They might not be useful for everyone, but for some people they might be; like you said, Sam, having the average and the flags, like "this one's slow." Because I think a lot of times what will end up happening, as people go down that list from slowest tests, is they'll come to a point where they're like:
C
"Oh, actually, this test needs to be slow because of X. I can't fix this." And so they'll skip that one every time and start moving down the list to the ones that are after that, and maybe after four years of doing that you'll have a whole page of tests that are the slowest ones, that you can't improve anymore. You know what I mean?
A
So let's assume that we have some sort of view for the unit tests, or the... the test history. Sorry, back up. Let's assume that we have a view to sort tests, and you can see tests by how long it took them to run. Let's limit that view to, say, the last pipeline. What could we do in our next milestone to gauge: is this helpful for our manager persona, or for the developer? I'm blanking on the persona name. Delaney?
C
Delaney.

A
Delaney, thank you for summoning it: the manager. I'm used to them being an alliteration, and that one is not necessarily an alliteration. So what can we do in... what would that look like, if we had an issue that was written up to go test this hypothesis?
C
Historically, the same way we think about test failures. So if you have, you know, one test... you know, you end up with that whole page of tests that's been slow forever, and it's going to be slow because, you know, that's the way they're going to be. If you could look at it over time, you might see when tests are becoming slower. Or, I think right now we have duration by, like, different...
C
We have duration per suite, and I think we can even get duration of individual tests, so we can get as granular as we want. But we could see not necessarily just "this test is slow," but "our tests might be slow, but this one's 50% faster than it used to be, so that's pretty good," or "all of a sudden, this test got really slow."
A
So, as we think big... we can think big, or we thought big, about: hey, here's what flaky tests look like; they're dependent on other things, or their performance changes over time. As we narrow the scope down, we can say: well, "flaky" is hard to define and it's not something that we can move on next. The next thing we could do, because we have the data, is show you the tests in order of how long they took, and start with the slow ones first. So, did that help clarify for you, Eric?
E
Nothing, yeah.

A
It took Think Small in a little bit different direction than what JJ mocked up from our discussion last week, or based on the historical test data that we have, because we started talking about not necessarily which tests have passed and failed, but more about that flakiness. And so that scoped us down into: well, let's talk about slowness first, and then we'll expand into flaky as this moves on, potentially.
C
I think there's still... I think there's still value in talking about past failures, especially with a limited scope. I'm curious to know which is more valuable to our customers. Like, would you rather know the pass/fails of the last 10 tests that we ran, or would you rather know which of your tests is the slowest one? Like... I understand that's a false dichotomy there; like, it's not one or the other, we can do both. But which one's more important right now?
F
That speed is something we always want to make sure we focus on, but pass/fail seems to be the most important thing from a historical standpoint, at least. Yeah.
F
I wonder if we could capture even just... even just comparing the last run to the next run. If we can see that there is a great variance in speed between them, that's a pretty good indicator that something's happened.
F
Of
course,
a
lot
of
tests
are
dependent
upon
infrastructure
and
I
think,
as
as
an
app
grows
over
time,
tests
might
become
slower
just
by
their
nature,
yeah
because
of
a
turn
within
the
app
itself,
but
I
think
just
understanding
when
there's
a
large
variance
between
the
the
previous,
the
previous
run
and
the
existing
run
might
tell
me:
hey.
I've
got
to
go
switch
gears
and
look
into
the
performance
of
this
particular
test
and
understand
why
I
have
such
a
big
variance
compared
to
the
last
one.
Now.
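
F's last-run-versus-this-run comparison could look like this minimal sketch. The test names are hypothetical and the ratio is an arbitrary cutoff, not anything the group settled on:

```python
def flag_duration_regressions(previous, current, ratio=2.0):
    """Compare two runs and report tests whose duration grew sharply.

    previous, current: dicts of test name -> duration in seconds.
    ratio: flag a test when its current duration exceeds ratio * previous.
    """
    regressions = []
    for name, now in current.items():
        before = previous.get(name)
        if before and now > ratio * before:
            regressions.append((name, before, now))
    # Largest relative slowdown first.
    return sorted(regressions, key=lambda r: r[2] / r[1], reverse=True)

last_run = {"test_checkout": 12.0, "test_login": 1.5}
this_run = {"test_checkout": 120.0, "test_login": 1.6}
for name, before, now in flag_duration_regressions(last_run, this_run):
    print(f"{name}: {before:.1f}s -> {now:.1f}s")  # test_checkout: 12.0s -> 120.0s
```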
A
So I just want to make sure I understand your user flow there. You had data that showed you which tests, historically, in previous runs, failed, passed, whatever. As you're looking at those, you've satisfied your questions about those tests. Now you're looking at a view of: and then, here are the tests that differ in duration, the amount of time that it took to run the test, regardless of pass or fail.
F
I'm not sure I would do those things sequentially. I would probably be looking at them in parallel; that way, if something had a huge variance and all of a sudden a test took 10 times as long as it did before, I'd need to stop and go look at that. Yeah.
A
So I just want to make sure that I'm sharing the context. What I've done, I think, internally, and not verbalized, is that I've split this problem. I'm thinking about, for Delaney the manager: they're looking at slow tests because they're thinking about "how can I help the team work more efficiently," and they're looking at a lot of tests, a bigger set of data. Versus looking historically at how this test run compares to previous ones, because I'm maybe triaging a pipeline, and that's more of our SET persona or our individual developer persona.
A
And so I think I keep tracking back to the manager problem in our discussion, and I think that we're bouncing back and forth a little bit. So I just wanted to share that context: that I'm actually thinking about these as two separate problems, and two separate solution spaces as well.
F
Yeah, no, no, you're right about that. I was extrapolating on Ricky's question there, yeah, which is a good question, Ricky; that would definitely make me change my priorities. For your question, James: I think the biggest thing right now is probably just exposure.
F
You've got to go into the logs. RSpec and any other tool will give you individual timings for tests. Just preventing a manager from having to dig down into the logs of a pipeline to be able to see those is probably the first step: just scraping that information out and then exposing it at a higher level. Yeah.
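
The "scraping that information out" step F describes could start from the JUnit-style XML reports that many test tools can emit (RSpec via formatters, among others) and that CI pipelines commonly collect as artifacts. A minimal sketch, with a hypothetical report path:

```python
import xml.etree.ElementTree as ET

def test_timings(report_path):
    """Extract per-test durations from a JUnit-style XML report.

    JUnit XML records each test as <testcase name="..." time="...">,
    where time is the duration in seconds.
    """
    root = ET.parse(report_path).getroot()
    timings = {
        case.get("name"): float(case.get("time", 0.0))
        for case in root.iter("testcase")
    }
    # Slowest first, ready to surface above the raw logs.
    return sorted(timings.items(), key=lambda item: item[1], reverse=True)

# Hypothetical path to a report artifact collected from a pipeline job.
for name, seconds in test_timings("rspec-report.xml")[:10]:
    print(f"{seconds:8.2f}s  {name}")
```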
A
Yeah, we'd do it on .com, potentially do it as a .com-only test, and dump the data into Snowplow; probably set some sort of target for it to help gauge "is this successful or not?" That's how I would... as I wrote up the issue: what are our success criteria?
E
Sorry, just a random question in my mind: if ever you focus on showing, like, slow tests... and then we have this other feature, "run modified tests first," so not necessarily... if you're, like, touching specific files on your MR, then...
A
The "give me just a download with the slowest test data" is going to be: as a manager, I want to see what's slow within the suite that my team owns, so that we can focus on improvements there for the overall efficiency of running our pipelines. So while individual changes locally you want to run really fast, we also want those other pipelines to run fast, because maybe they're charged with, and they have responsibility over, even budget and P&L for their CI systems; like, I need to pay attention to runner minutes. And so, any time we're spending...
A
You know, half an hour on a single test when it could be five: that's 25 runner minutes that I've got to pay for, and if we're running that pipeline every single day, that is a whole lot of runner minutes that I could be saving, and saving as budget for other exploratory things. From that perspective, adding efficiency to the team that way is more of what they're thinking about, versus "how do I just get that feedback loop faster?"
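
The runner-minute arithmetic above compounds quickly. A back-of-the-envelope sketch, assuming roughly one pipeline run per day:

```python
# One test: 30 minutes today, 5 minutes after a fix (the example above).
saved_per_pipeline = 30 - 5           # 25 runner minutes saved per run
pipelines_per_month = 30              # assumption: about one run per day
print(saved_per_pipeline * pipelines_per_month)  # 750 runner minutes a month
```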
C
Just had one thing I want to bring up so I don't forget. One thing I've been thinking of is that there are sometimes changes that can affect a whole swath of suite timing at once, and I think alerting on that would be a really neat feature. Like, if you committed a change that made all your tests take 10% longer, that'd be something that I'd kind of want to know about.
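
C's whole-suite alert could start as a coarse check on suite totals between two pipeline runs, rather than per-test tracking. A minimal sketch with hypothetical numbers and an arbitrary threshold:

```python
def suite_regressed(previous_total, current_total, threshold=0.10):
    """Return True when the whole suite slowed down past the threshold.

    previous_total, current_total: total suite durations in seconds.
    threshold: fractional increase to alert on (0.10 = 10% slower).
    """
    return current_total > previous_total * (1 + threshold)

# A commit that makes every test ~10% slower trips the alert.
print(suite_regressed(previous_total=600.0, current_total=665.0))  # True
```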
C
Yeah, I had something written in here from before the meeting, so I'll just... I think my answer for both of those questions is the same. It's going to be striking a balance to find how we can store just enough information to provide useful insights to the user, but not so much information that we're bogging down the application and we end up with another 10-million or 100-million-row table in every GitLab instance.
C
And conceptually, it seems like a tricky problem to me to make sure that what we show is actionable. You know, we're gonna put up all this time and data everywhere, and I think it'd be easy to build something that somebody would look at and say: "okay, so what?" So keeping it concise enough that it's obvious what somebody should do because of what we told them... I think it is tricky, yeah.
A
I think where that starts to get valuable, and how you solve that, is where you start to do that comparison over time. It doesn't matter if this test is slow and it's always been slow; but if now it's slower than it was a month ago, or even three MRs ago, then that's something you should go focus on and get back to the performance the app had before. And maybe, I mean, even if you just label it as runner minutes as opposed to minutes...
A
Maybe then people start to care, or something like that. Or you start to roll that data up into "here's the increase in runner minutes just from tests that have slowed down, and here's the increase from additional tests that have run." Like, all of a sudden that director and VP persona start to care about how much we are spending on this now. Yeah.
C
And then what do you use that file for later? Like, you're going to make your whole dashboard thing backed by a file instead of backed by a database? Then that gets a little bit weird. And then there's a whole thing about historical data: like, how much historical data is going to be useful, and are you going to track it for every single test? GitLab has, for argument's sake, let's say a million tests. Are we going to store information about all one million tests in perpetuity? And, like, how do we figure out...
C
Which of those million tests are important? And how do we figure out how to store a minimal set of data that brings value, without actually having to store a million rows, one for each test, and then n million rows, one for each test run? Like, how can we not do that and still have data that is valuable to our end user?
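
One way around the "n million rows, one for each test run" table C describes is to keep a bounded rollup per test: running aggregates plus only the last few durations. This is a sketch of that trade-off, not a claim about how GitLab stores anything:

```python
from collections import deque

class TestRollup:
    """Bounded per-test record: O(1) storage per test, not per run."""

    def __init__(self, keep_last=10):
        self.run_count = 0
        self.total_seconds = 0.0
        self.recent = deque(maxlen=keep_last)  # only the last N durations

    def record(self, duration):
        self.run_count += 1
        self.total_seconds += duration
        self.recent.append(duration)

    @property
    def average(self):
        return self.total_seconds / self.run_count if self.run_count else 0.0

rollup = TestRollup()
for d in [10.1, 10.3, 9.9, 31.0]:   # the last run slowed down sharply
    rollup.record(d)
print(f"avg {rollup.average:.1f}s, last {rollup.recent[-1]:.1f}s")
```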
A
Cool. To quickly answer your question before we wrap up: I think that we could just make that slow, because what we want to track is the button clicks. We can caveat it with a tooltip that this may take a minute, but I think that would still test our hypothesis that, yeah, this is valuable; somebody gets value out of this data if they're clicking the button to get it. And then we can iterate on that to make it performant, or store the data somewhere else so that it's viewable within the app.
C
Yeah, I think it'd still be interesting to me to see, from like a user-interview perspective: here's a wireframe of what we're thinking; do you think this would bring value to you, and why or why not? I'd love to get some feedback from customers, and maybe internal customers as well, on just what they're looking for out of that type of feature set. Sure.
A
Yeah, we're getting some interviews lined up, and I will show those back to the team as we complete them. All right, we have probably like 90 seconds left. I have a to-do to write the issue that's come out of this, and I'll share it back to the team. I'll also take a to-do of updating our direction page for code testing and coverage.
A
I think that we had a great discussion last week about where we could go with this, of showing you which tests are potentially flaky, and that's a great direction item that we should put out there and update the direction page with. So I'll make that update as well and share it back to the team, so that you know where we're going. I'll probably link to the first video in that and say we had a great discussion about the direction that we want to take identification of flaky tests, here's a reference to it, just so everybody's in the loop. And I'm always open to feedback on how this went.
A
I think this format went really well, so I'll propose that we maybe tackle this again next month: take a couple of weeks off, and then do back-to-back sessions, or try to extend the session so that we can get a full hour in, all in one day, and do the Think Big and Think Small together.