From YouTube: Quality Team: Failure Triage Training - Part 1
Description
Beginning of discussion by quality team on how we quickly respond to failing quality pipelines and quarantine necessary tests for further investigation.
B: Right, so, yeah: end-to-end test failure triaging training session. Let's go. So I'll share my screen; I think that would be a good way to go, and I'll go through the process. I'll share my desktop... come on, share... there we go. And now I can't get to the buttons, I need... all right, I want to open that document.
B: So we have the debugging guidelines that Sanad was kind enough to create just recently, so yeah, I'll follow through that, and feel free to jump in with any questions at any point. We'll add some stuff to this as well, and we also have the Google document that we can take notes in, to make sure we don't forget anything that's important to refer to later. Alright, so starting off: what I'd usually do is check the nightly and staging pipelines.
B: For example, going back to staging today, we can see here that I'd looked at the failures and found that there were some failures that were possibly related to other ones, and going back again further, you can see that Sanad reopened an issue on a previous staging pipeline failure. So yeah, I would start off by checking Slack to see if any work has already been done, and if so, the job's already done. But let's assume that isn't the case and we're starting from scratch. So I would open the pipeline and...
A:
B: But assuming that hasn't happened, then I'd be doing the same thing that he'd done, which was: from the pipeline's link, go through and check the failures. In that, I clicked on one where it actually passed, but if you see, it failed the first time and then passed, which means it was retried. So I'm going to go through and click on just any of them, so I can bring up a list, and then scroll down and see the actual failures that occurred, and yeah.
B: Yeah, we try to... so yeah, it should be reported anyway, even if it does pass. Even if it does pass, like it did this time (it passed after retry), that just means it's a flaky test, so it should still be reported. I don't know if this one has actually been reported. So what I would do then is copy the file name there and then search for it.
B: There's a couple of ways of doing this. One quick way is just to have a look at the issues here, so having a look at the nightly issues to see if it's already open here; there isn't one. Now, what I want to do, ideally, is make sure that nobody else has opened an issue somewhere else. It's possible that an engineer had encountered the same problem running the test themselves.
B: That'll take a while, but if nothing comes up there, then I can be pretty confident nobody has seen that failure. I'm going to assume that's... yeah, okay, good, it didn't take too long, but long enough: there's nothing open. So we can create a new issue, back in the nightly project, because that's where the failure happened. I'm going to report a failure in this file there.
C:
B: Right, yeah, and that's a good point. It is a good idea to have a look at the closed issues; the same issue might have been closed pretty recently, so that's a good point. Let's see... if it was similar enough we might reopen it, but it looks like it's not. "Expected to..." I'll scroll across and see, yeah, so see what the text was that was expected.
B: This was merged a week ago, so yeah, we'll continue with reporting this one. So I'm gonna copy the failure stack trace, just up to... okay, I'm going to grab it up to the first time that the test file appears, so we know exactly which test it was. The rest is the stack trace from the framework, which is always the same, so we don't need to worry about that. And we're gonna paste the stack trace in there. I'm also going to paste... whoops, okay, I'm going to paste in the link to the job.
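As a made-up illustration of what that excerpt looks like (the failure text, spec path, and line number below are hypothetical, not taken from the actual job), you keep everything up to the first line that names the spec file and drop the framework frames below it:

    Failure/Error: expect(page).to have_content(expected_text)
      expected to find text "..." in the page
    # ./qa/specs/features/browser_ui/some_stage/some_feature_spec.rb:42:in `block (3 levels) in <top (required)>'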
C:
B: We have the conflict there happening, but we don't have "uploading LFS", so it's possible that LFS didn't actually work in this case, rather than the test just not finding the right text. So that would need some more investigation, so I'm gonna leave the issue details at that for the moment and we'll get back to the investigation. So the issue should have a link to the failing job, should have a screenshot if available, and an HTML capture if available, so then browse to the job artifacts and...
B: And that just means it's a bug, either in the test or in the application; we don't know yet, but in either case that's the same tag that's needed. And it's a p1 based on the priorities mentioned here, so just double check those: p1 is for tests that are needed to verify fundamental GitLab functionality, as opposed to p2 for tests that deal with external integrations, and this one is Git LFS functionality.
B: Enablement, right, okay, yeah. So we don't have a cross-functional team member for the Enablement stage at the moment, so I won't assign that to anyone on that one. All right, so we can submit that issue. Oh sorry, and it's a transient failure, because it did pass after retry. And so we can submit that, all right.
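Pulling that together, a rough sketch of what the submitted issue can contain (the layout and placeholders are illustrative, not a required template):

    Title: Failure in <spec file>
    Stack trace, cut at the first line that names the spec file
    Link to the failing job
    Screenshot / HTML capture, if available, from the job artifacts
    Labels: bug, a priority (p1 here, since it covers fundamental GitLab
    functionality), the stage, and a note that it was a transient failure
    (it passed after retry)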
B
So
I
don't
want
to
spend
too
much
time
on
that
now
so
I'll
just
say.
The
next
steps
here
would
be
to
do
that
replication
to
try
to
reproduce
it
locally
and
for
the
sake
of
hopefully
fixing
the
test.
If
it's
a
fellow
in
the
test,
otherwise
providing
some
more
input
to
report,
an
issue
for
a
developer,
to
an
engineer
to
look
into
fixing
an
actual
bug.
If
it's,
if
it's
an
actual
bug
in
the
application
and
then
the
next
step
for
us
would
be
to
quarantine
that
test.
A:
D:
B: Of course, but if it's a bug in the test itself, then we want to make sure that that test is running. Even if it's quarantined, we can run it and see the results, but it's slightly less likely to be on our radar if it's running in the quarantine jobs that are allowed to fail. So if I go back to the pipeline, you can see here that all of these quarantine jobs are allowed to fail.
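As a rough sketch of what quarantining a single test can look like in an RSpec-based suite like this one (the example name and comment are placeholders, and the exact metadata the framework expects may differ):

    # Quarantined: <link to the failure issue>
    # Examples tagged :quarantine run only in the quarantine CI jobs,
    # which are allowed to fail, so this failure no longer blocks the
    # pipeline while it is being investigated.
    it 'creates a new file from a template', :quarantine do
      # original test body stays unchanged
    end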
B: All right, so we know how to merge an MR anyway, so I think I'll skip that point now and fix that offline, but basically all I'm doing there is creating a merge request with that change in it, and then I would post in the quality channel asking somebody to review it. So there's probably one somewhere nearby... so here's an instance of me opening a merge request and posting in the quality channel to ask the team...
B: ...if anybody is available and able to review it. That's just so we can get the change in quickly, so that it doesn't hold up any other pipelines. And we can see here that there's just the one change, so similar to what you saw there, but for a different test: just the failure issue again and the quarantine tag, and that's it. Sanad approved it and I think he set it to merge as well. So yeah, that's quarantine. Any questions about the quarantine process?
B: What I think would be useful is some more coverage of the actual investigation. So after we've quarantined, or during the quarantine process if you want to do it side by side, we also want to debug the test failures: investigate the failure, try to reproduce it locally, and see if there are any logs or any further insights we can add to the issue, to help whoever's going to be fixing it, or to fix it ourselves if it's quick and easy enough to do so. So there was actually another failure.
B: So an element that it's trying to click is not clickable, because another element is in the way. So it looks like that dialogue is in the way, and the test is expecting to be able to click something that's behind the dialogue, and it was trying to do that at "create new file from template". So assume I've gone through the process of opening an issue, pasting in...
B: ...all of this, adding the labels, etc., and quarantining the test, and now I'm at the stage of investigating it. So I'm going to skip that for now and just get straight into the investigating. So I want to see what's going on here and what element it was trying to click, and the file is the Web IDE test, so I'm gonna go to the code this time and...
B: I'm gonna open that file... oh, I didn't actually check if that was the right one. There are two tests with the same file name, and that's the repository one; that's the wrong one, I actually want the Web IDE test. And the line that it was failing at was 70, so that should be where that "create new file from template" line is, and it's having problems in the edit page, the Web IDE edit page, and that was this line, line 55, yeah.
B: Alright, so let's see what that looks like in GitLab itself. So I'm going to open a test project; I have a test project that I just use to play around in when doing this sort of thing, and I want the Web IDE, and I'm adding... so the test is about adding the template, and so here is the dialogue that was in the way, and what the test is supposed to do is click on one of those. So let me get rid of these other files.
B: That's the edit file, and the test... back here with the test: it opens the Web IDE and then tries to create a new file from the template. That clicks the new file button, so I did that, clicked the new file button there, and then within the template list it's going to click on the file name, so we should be able to find the template list element in here.
B: qa-template-list... so that's that one, qa-template-list, and it's going to click on the file name. Now, what file name was it in this case? It doesn't really matter, because it failed on all of them.
B: So it gets the file name from this array of hashes here. You can see the file names here (.gitignore and so on); it matches each of these file names, and then, if I click on one of them, which is what the test does (the test, in the edit page here, clicks on the file name), so do that, and the dialog disappears. But in the screenshot we had, the dialog was still there, so something has gone wrong there: Capybara tried to click the file name, and that's when the failure happened.
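To make the code being walked through easier to picture, here is a simplified sketch of that kind of page object; the view path, element names, and method body are approximations, not the exact GitLab source:

    module QA
      module Page
        module Project
          module WebIDE
            class Edit < Page::Base
              view 'app/assets/javascripts/ide/components/new_dropdown/modal.vue' do
                element :template_list   # rendered on the page as qa-template-list
              end

              # Opens the new-file dialog, then picks a template by file name
              # from the template list.
              def create_new_file_from_template(file_name)
                click_element :new_file            # hypothetical element name
                within_element(:template_list) do
                  click_on file_name               # plain Capybara click, as discussed below
                end
              end
            end
          end
        end
      end
    end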
B: So what might be happening here is that there's an animation there. D, can you see that when I click the new file button, the animation drops the dialogue down slowly? This is something that's happened in a couple of cases before, where, while the animation is happening, Capybara finds the element, goes ahead and clicks it, and yeah... then...
C: Yeah, I wanted to actually bring this up: this is actually part of the dynamic element validation. One thing I completely forgot to do on the dynamic element validation is that we validate that it appears, but we could also validate that it's clickable, so something like that would fix this as well. That's exactly what's happening: it's not clickable, either because it's animating or, you know... it's weird, though, because, yeah, it's basically not ready to be clicked yet.
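As a sketch of the kind of defensive click this suggests (the helper name is made up, and this is one possible approach rather than how the framework's dynamic element validation is actually implemented), the click can simply be retried while the element is still animating or covered:

    # Retry a click a few times when Selenium reports that the element
    # cannot be clicked yet, e.g. while the dialog animation is running.
    def click_when_clickable(locator, retries: 3)
      attempts = 0
      begin
        find(locator).click
      rescue Selenium::WebDriver::Error::ElementClickInterceptedError,
             Selenium::WebDriver::Error::ElementNotInteractableError
        attempts += 1
        raise if attempts > retries
        sleep 0.5   # give the animation time to finish, then try again
        retry
      end
    end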
C:
B: Yeah, so it would be better for this to be click_element, and in general we want that anyway: we want page objects to be using the element methods rather than pure Capybara methods. So there's a couple of things we'd want to do there, and there are a couple of options for fixing this. Yeah, okay, so yeah, but basically we're getting into the weeds of fixing the test now, so we don't need to go into that in detail for this training session; basically, fix the test. So this is looking at the code.
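A sketch of the shape of that change inside the page-object method from the earlier sketch; the :template_file_link element name is hypothetical, and in practice the element would still need to be narrowed down to the right file name:

    # Before: plain Capybara click inside the template list.
    within_element(:template_list) do
      click_on file_name
    end

    # After: go through the framework's element method instead of a bare
    # Capybara call, so the page object stays in element terms.
    within_element(:template_list) do
      click_element :template_file_link   # hypothetical element name
    end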
B: We haven't got to actually running it yet, but we already know how to do that. So in that case you would run the test locally, make whatever fixes you need to, and then submit the fix and unquarantine the test if it's... so, let's assume... we know this isn't a flaky test, in fact, because it's failing every single time. So if we fix it, then we're pretty safe in assuming that it's fixed.
B: So that's a bit of an overview. We haven't actually had a chance to demonstrate running it locally in this session, so we might need to think about doing that in another session. Yeah.
A: Yeah, going through actually running it locally: I think where a lot of people, new people, are gonna get hung up is with all the different things that we have to get set up locally to get things running. That's kind of a whole different beast, so yeah, you probably don't want to go too far off the common path at this point, at least for this training, yeah.
C:
B: So if you can't reproduce it locally, but we're still seeing the failure, there's a couple of things we would do. What I would first try is to reproduce it locally using GDK. If that doesn't reproduce it, then I would try it using Docker, so yeah, in the guidelines there's some help about how to run the GitLab Docker image, and then you can run the test against that Docker image. If that doesn't reproduce it, then finally I would run the tests themselves inside Docker, which is essentially running the tests with gitlab-qa.
B: Potentially something might be happening in that interaction that doesn't happen when running via GDK, or via Docker plus the local test framework. If that doesn't reproduce it, then something really strange is going on, because that's basically how it works in CI: it's purely Docker containers communicating with each other. Yeah, another...
B: Yeah, and also, when reproducing locally... yeah, that can be tricky initially, because you might have changes that are different to what's being tested in CI. But if we follow the instructions in the guideline and actually use the SHA that corresponds to the nightly image that actually fails, then maybe...