From YouTube: Testing Group Think Big #2
Description
Today we think big about Test History: who would use it, the ideal experience for them, and how this could be a differentiator for GitLab.
A
This is the Verify Testing group's think big session for July seventh. Today we're going to take on, or try, the think big part of a think big, think small discussion, and the topic at hand is going to be historic test data for all projects. I indicated earlier in my message to the team that we would do both the think big…
A
There are some issues open on it already, but I'd like to kind of step back, forget about what we already have open, and think about the experiences we've all had in the past with CI systems and automated test data, and what a great experience could be when you're working with one of those systems, when you're working with GitLab and you want to see historic test data. How has this test performed on previous runs? What could that be?
B
For engineers, I've used Code Climate, for instance, and it has this kind of feedback for engineers. I think they do a fairly good job of putting it in front of the engineers while they're working, so it becomes a little bit of a gamified thing, where you see that you're in an area, you see where the score is, and you can see whatever the metric is, whether that was complexity or code coverage.
C
I think for a test automation engineer, it would be good to see the trends for a specific test. Like, is this a flaky test? In the last ten runs, has it passed four times and then failed six times, but not consistently? Or a way to surface the tests that are consistently failing, so that they know to go in and investigate those further, because it might be that the test is broken, or it might be that there's a bug.
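
To make that heuristic concrete, here is a minimal sketch, assuming pass/fail results for the last N runs are already available (for example from stored JUnit reports). The threshold and labels are invented for the illustration, not an existing GitLab feature.

    from collections import Counter

    def classify_test(results, flaky_threshold=0.2):
        """Classify a test from its most recent results.

        `results` is a list of "passed"/"failed" strings, oldest first,
        e.g. pulled from stored JUnit reports for the default branch.
        """
        if not results:
            return "no-data"

        counts = Counter(results)
        failure_rate = counts["failed"] / len(results)

        # Count how often consecutive runs disagree; frequent flips with a
        # middling failure rate looks flaky rather than consistently broken.
        flips = sum(1 for a, b in zip(results, results[1:]) if a != b)

        if failure_rate == 0:
            return "healthy"
        if failure_rate == 1:
            return "consistently-failing"
        if flips / (len(results) - 1) >= flaky_threshold:
            return "flaky"
        return "intermittent-failure"

    # Example: 4 passes and 6 failures, interleaved -> likely flaky.
    history = ["passed", "failed", "passed", "failed", "failed",
               "passed", "failed", "failed", "passed", "failed"]
    print(classify_test(history))  # -> "flaky"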
B
Are there any interesting ways to do meta-analysis of those tests, to be able to show somebody who's writing a test, or investigating a test, what the characteristics might be? There might be one particular characteristic of flaky tests in the suite, and at a higher level it might be really valuable to say, okay, if you want to know whether this test is really failing or just flaky, it'd be helpful to see.
B
Oh, over the past six months, a lot of our tests that were deemed flaky had this particular characteristic, they're this style of test, or they use this tool. And just to be able to kind of step back and say the way we're approaching writing these tests might be able to be improved, because this test style isn't producing good results for us, or it's producing a lot of investigation work, or something else.
C
Yeah, that's a good point. We tend to run into those kinds of things with specific failures or specific error messages. So, for instance, if you're trying to click on an element and that element has gone stale for some reason, but that only happens some of the time, that's probably a flaky test. Or if there's an infrastructure issue that pops up intermittently, that could also cause flakiness.
A
Yeah, that's what I was thinking. It would be like, all right, here are all of the tests we've identified that meet our flaky threshold, and a pass/fail percentage within those tests, and all of them are trying to call this specific API, or run this specific query, or use this specific test data. If we can start to surface data like that, it seems like it would help a lot with the investigation time.
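
A sketch of what surfacing a shared trait across flaky tests could look like, assuming each flaky test has already been tagged with attributes such as the APIs it calls or the fixtures it uses; the tag names and threshold below are made up for the example.

    from collections import Counter
    from typing import Dict, List

    def common_traits(flaky_tests: Dict[str, List[str]], min_share: float = 0.5):
        """Return attributes shared by at least `min_share` of the flaky tests.

        `flaky_tests` maps a test name to a list of attribute tags, e.g.
        {"spec/a_spec.rb": ["api:projects", "fixture:large_repo"], ...}.
        """
        total = len(flaky_tests)
        if total == 0:
            return []

        tag_counts = Counter(tag for tags in flaky_tests.values() for tag in set(tags))
        return [(tag, count / total)
                for tag, count in tag_counts.most_common()
                if count / total >= min_share]

    # Example input: three flaky tests, two of which hit the same API.
    flaky = {
        "spec/merge_request_spec.rb": ["api:projects", "fixture:large_repo"],
        "spec/pipeline_spec.rb": ["api:projects"],
        "spec/issue_spec.rb": ["fixture:large_repo", "query:slow_join"],
    }
    print(common_traits(flaky))
    # -> both "api:projects" and "fixture:large_repo" appear in 2 of 3 flaky tests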
D
It's definitely going to be different for developers. From an SET standpoint, we're looking at, all right, how do we get rid of it? Not that developers don't do this, but we're really going to focus in on how do we get rid of it, what are the things that are causing it to be flaky, and trying to identify whether that's really something in the test itself, something in the app itself, or something in the infrastructure.
B
For me, looking at it from one level higher, what I would want to look at is what is really giving my team grief, so that I can best devote some of my time, or ask someone on the team to devote some of their time, to analyze it. So what I'm thinking about is, I want a table of the top ten most failing tests, and by that I mean the top ten tests that haven't changed recently that have the most failures.
B
So, basically, if I go in and I change that test and alter it, I want it to be removed from that table, so I know that I've taken care of this one. Or at least the counter of how many failures it has is reset after I go and modify that test, so that I can keep track of what I've kind of already attacked and what remains to be done.
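
One possible reading of that table, sketched under the assumption that each test's failure timestamps and last-modified time are already tracked; the record fields here are invented for the illustration.

    from dataclasses import dataclass
    from datetime import datetime
    from typing import List

    @dataclass
    class TestRecord:
        name: str
        failure_times: List[datetime]  # timestamps of failures on the default branch
        last_modified: datetime        # last commit that touched the test file

    def top_failing_unchanged(records: List[TestRecord], limit: int = 10):
        """Rank tests by failures counted only since their last modification."""
        scored = []
        for rec in records:
            failures_since_edit = sum(1 for t in rec.failure_times if t > rec.last_modified)
            if failures_since_edit:
                scored.append((failures_since_edit, rec.name))
        # Editing a test resets its count, so tests that have already been
        # worked on drop off (or down) the table, as described above.
        scored.sort(reverse=True)
        return scored[:limit]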
B
The more immediate need for the engineer would be to make the pipeline green, but I think when you start to look at the problem holistically, like an SET or a manager would look at it, it would be more like, okay, this is causing a lot of grief, it's causing us to spend more money on pipelines because we're re-running this test over and over again, so what can we do to make this more robust?
B
Yeah, for me, I think it would involve some static analysis of the tests, like you were kind of hinting at earlier. So if you can identify that the tests that have API calls in them are the most likely to be flaky, why not extrapolate that a little bit: tests with API calls are the most likely to be slow, or tests that call this one function take ten times longer than tests that don't call this one function, and stuff like that, and kind of identify it for you, like, okay, you asked for it…
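
A minimal sketch of that kind of static check: scanning spec files for patterns that, per the discussion, might correlate with flakiness or slowness. The patterns, file glob, and labels are assumptions for the example, not an existing analysis.

    import re
    from pathlib import Path

    # Hypothetical patterns that might correlate with flaky or slow tests
    # (external HTTP calls, hard-coded sleeps, etc.).
    RISKY_PATTERNS = {
        "external-http-call": re.compile(r"\b(Net::HTTP|HTTParty|Faraday)\b"),
        "hard-coded-sleep": re.compile(r"\bsleep\s*\(?\s*\d"),
    }

    def scan_test_files(root: str):
        """Yield (file, pattern name) for each risky pattern found in spec files."""
        for path in Path(root).rglob("*_spec.rb"):
            text = path.read_text(errors="ignore")
            for label, pattern in RISKY_PATTERNS.items():
                if pattern.search(text):
                    yield str(path), label

    # Example: list every spec under ./spec that hits an external API or sleeps.
    for file, label in scan_test_files("spec"):
        print(f"{file}: {label}")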
B
Maybe you had some sort of artificial intelligence that could analyze the static analysis reports and then, based on the data it's been fed, identify trends and present them to you. So like, these tests that all call this one function are really, really slow, you need to address this function and make it better, and kind of elevate that right in front of my face from the dashboard that I'm looking at.
B
Yeah, it would be kind of neat if you could have, and again this is pie-in-the-sky, if you could surface that type of information in the IDE when you're looking at the test. Maybe you have a little exclamation point, you hover over it and it says, hey, this test is the slowest test in the whole codebase, maybe you should think twice before you put some more stuff in here. Yeah.
A
Oh, I'm slowly trying to get you to Clippy, that's what I'm really trying to do. But aside from trolling you guys, if you're looking at the file view of your repository, you can start to see, hey, I'm getting in, I'm writing new code, and I'm going to write some new tests, and I can see that in that test suite there are already three tests identified as slow, and some sort of mechanism then, like you said, of: these are probably slow, or rather these are probably flaky, not slow, because they all hit this API that is inconsistently up or down. Or we just identified that these three are flaky, you should go look at them and try not to write tests like that, and maybe fix those as well.
B
You could also then put it in the Web IDE. As you're writing a test, it could pop up and say, hey, just so you know, you're adding this API call here, and historically, when you add this line of code to a test, it becomes more likely to be flaky or slow, or what have you. Yeah, I think what's really interesting…
E
…is whether they're holistically well-written tests, because if the test has, like, good coverage, is it good, is it right? I mean, the way that I see it is that if I write code that's bad, like it's creating bugs, then the test will tell me that, right? But I feel many times that I'm in the mindset of, oh, I need to write a test just so it fits whatever I'm doing, you know, and I don't know if there's a way to visualize that. Maybe like…
E
If someone is changing a test a lot, you know, like if they change it every day, like developers are always changing this particular suite, that sounds like that's a bad test. You know, like if tests are being changed a lot. I don't think tests are meant to change that much, but I might be wrong. You know, I'm just seeing it from my perspective, I don't know, I would like to hear some thoughts about that.
E
Yeah, that's one way to see it. The other way that I was seeing it is, we run these tests every day, every time, you know, and they pass. Many of them pass without any problem, right? Which means that whatever I'm doing is not affecting that particular test. So the usefulness of the test is that if I change something that breaks the conditions being tested there, then it should tell me, hey, you broke these.
E
You know, like, now you've got to fix it, right? But I feel that many times when you're developing, you end up changing the test more than the actual code that you're working on, because you're just trying to address whatever new conditions you are adding to the code. You know, so I'm wondering if there's a way to visualize how much you're doing that type of behavior.
E
You know, like those quick-fix type of changes to the testing code, to the suites, in a way that allows anyone to see it. You know, I think it comes back to the flakiness of the tests, but yeah, it just seems like that, like the volatility of the tests as you go through time. Maybe I'm wrong, maybe someone with more expertise would know, I mean, but I…
E
…think, like, the perfect suite of tests shouldn't change that much, right? They should always be very consistent over time, and just change as you add new features. You know, I don't know if that makes sense, but that's just one perspective that I'm seeing, of something that could be measured, yeah.
D
There are lots of measurements that we can pull in. Churn, or volatility, is one. We have others that, as was brought up, can show, hey, this is how old this code is, and we can show coupling of this code with other code that always ends up getting changed. I've used some tools, Code Maat, and I think there's a service called CodeScene, but I think some of the stuff that we use right now can also give us those kinds of measurements. There's another one too, where you lose knowledge over time.
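
As an illustration of the churn or volatility measurement mentioned here, in the spirit of Code Maat but only a toy sketch rather than that tool's actual output, counting how many commits have touched each spec file can be read straight out of git history.

    import subprocess
    from collections import Counter

    def test_file_churn(repo_path=".", pathspec="spec/"):
        """Count commits touching each file under `pathspec` (higher = more volatile)."""
        # `git log --name-only` lists the files changed by every commit.
        out = subprocess.run(
            ["git", "-C", repo_path, "log", "--name-only", "--pretty=format:", "--", pathspec],
            capture_output=True, text=True, check=True,
        ).stdout
        files = [line for line in out.splitlines() if line.strip()]
        return Counter(files)

    # Example: ten most frequently changed spec files in the current repo.
    for path, commits in test_file_churn().most_common(10):
        print(f"{commits:4d}  {path}")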
E
Yeah, I think you hit the nail on the head. I think test design is, like, how can we help our customers by giving them metrics or data that allow them to create their test strategy and design in a better way? You know, it's like, perhaps we're not creating the right tests, we just need to have that, or use a different framework. You know, kind of give them more actionable…
E
…items out of the data. For instance, it seems that you have been using Karma; we can give them, like, that actionable path of why you should move from Karma to Jest, for instance. You know, and that's probably very opinionated, not something that you can just extract and show, but if you can give them those insights of why certain things might not be designed the right way, I think many customers will appreciate that, and it seems like something we can bring.
D
You're probably right, and I'm probably incorrect as well. I think QA has at one time used some cops, but it's not something I generally see running, but I kind of brought it up as an example. Yeah, that's kind of the situation: either an organization has one way to do something or they don't, and if they don't, there's not much tooling out there to help us with design.
A
Great. So it'd be interesting, maybe, as you're looking at a test history, you get pass/fail, like Ricky talked about, with: the test hasn't changed, and this is its history since its last change. Or, hey, this test failed, and here's the last time it changed, so you can see, oh, it just changed right before I started working on this code, and maybe I didn't make that change, somebody else did. So yeah, that can be interesting as well.
A
I think we've alluded a little bit to how we could differentiate with some of the uniqueness of GitLab and the potential of having both your source and your CI in one tool and one interface. What are some other ways that we could potentially differentiate from competitors, with a big pie in the sky? We have some smarts about your tests and your test history and where there might be flakiness.
B
The thing that I'm thinking about right now isn't directly related to what you're saying, sorry to be tangential, but I think a lot of points were brought up in this conversation about, like, how can I tell if my tests are good or not, basically, right? So when I think about that, the kind of bespoke industry solution is mutation testing, right? That's where you have a dynamic analysis thing for your code, and it just starts messing with your code and changing it
B
a little bit, then running your tests to see if your tests caught it messing with the code. So if it doesn't catch that, that means your tests aren't good, and you get an undetected mutation count. So every time it changes your code and your tests don't catch it, that's bad, and you get a counter, and it does that over and over and over again to kind of validate the strength of your test suites.
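
A toy illustration of the mutation-testing idea described here; real tools such as mutant for Ruby or mutmut for Python are far more thorough, and this hand-rolled sketch only flips one kind of arithmetic operator.

    import re

    def mutate(source):
        """Yield naive mutants of `source`: each one flips a single '+' to '-'."""
        for match in re.finditer(r"\+", source):
            i = match.start()
            yield source[:i] + "-" + source[i + 1:]

    def run_tests(source, tests):
        """Exec the (possibly mutated) code and return True if every test passes."""
        namespace = {}
        exec(source, namespace)  # load the code under test
        return all(test(namespace) for test in tests)

    def mutation_score(source, tests):
        """Fraction of mutants the test suite kills (higher = stronger suite)."""
        mutants = list(mutate(source))
        killed = sum(1 for m in mutants if not run_tests(m, tests))
        return killed / len(mutants) if mutants else 1.0

    # Code under test, plus a deliberately weak test that never exercises addition.
    code = "def add(a, b):\n    return a + b\n"
    weak_tests = [lambda ns: ns["add"](0, 0) == 0]    # survives the '+' -> '-' mutant
    strong_tests = [lambda ns: ns["add"](2, 3) == 5]  # kills it

    print(mutation_score(code, weak_tests))    # 0.0 -> the tests caught nothing
    print(mutation_score(code, strong_tests))  # 1.0 -> the mutant was detected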
B
That's something I'm actually super interested in as well. And then one of the other things I was thinking about while everyone was talking is that there are some people who actually advocate for the architecture of your test suites being divergent from the architecture of your codebase. So instead of having an RSpec file for every Ruby file, it would be more like you'd have a separate testing application scaffold that you would build and grow differently as the needs of your application changed and it grew over time.
B
So it's not a one-to-one mapping mirror, but more of an application that's purpose-built to test your other application, kind of thing. That's something that Robert Martin advocates for, and I think there are a couple of other books about that kind of idea out there. I haven't read them, though, so I'm a little bit ignorant there.
D
So we have both here, right? We have the end-to-end framework with QA, which is a completely separate framework, not tied to the structure of GitLab at all, and then we've got our built-in RSpec tests. Of course, if we went completely to the other realm, that would eliminate being able to use things like Drew's fail-fast template and things like that.
D
So I think they're still married one way or the other, but I think separating out the framework is definitely useful. But then you have the whole mapping issue: how do you know what you're testing? Because the tests, at least in our case, are mapped to the idea of a feature, whereas our RSpec tests are literally mapped straight to the code.
A
We are at time, or over time even. Like I said at the beginning, we were going to do just the think big part of this this week, and since we have another think big session next week, we will transition into think small, thinking about what is the smallest piece of functionality we could deliver in the next milestone to move us forward towards that big vision of a holistic view of your test history and test mutations. And a great discussion that we've had. All right.