From YouTube: Quality Team: Failure Triage Training - Part 2
Description
Follow-up discussion by the quality team on how we investigate failing quality pipelines using the original containers the tests were executed on.
C: Basically, the steps are: create an issue, quarantine the test, and then investigate the failure. So I'm assuming that you're familiar with the pipelines, familiar with the process of creating an issue and creating an MR to quarantine a test, and now we're getting to the point of investigating the failure, so that we can add some notes to the failure log for anybody who's going to be fixing it, or even move on to fixing it yourself. So in the document here I had a link to an existing issue. This is one that's still quarantined.
C: So this is a staging issue, and in this case it failed because it was unable to find a selector, the admin area link, while trying to use the menu to go to the admin area. Now, this test I was familiar with, so the troubleshooting was pretty straightforward: I knew that in staging we don't have admin access, so this isn't going to work. But when I tried to reproduce it locally...
C: ...I found another problem. There's actually another error in this test, and in this case the error that I got when I ran it locally was that the branch name master already exists. So I'm going to try to reproduce that using GDK. I'll switch over to my terminal, and I'm going to restart GDK. Okay, so that's just killed GDK. I'm in the GDK folder, you can see there.
C: So I run `gdk run` to start it up, and then I'm going to switch over to Visual Studio Code and bring up the test. The test was this one: the push over HTTP file size spec. I've got that open here, and I've just commented out the quarantine metadata tag there, because otherwise (that's another bug) it wouldn't run while it was quarantined. Commenting it out just allows me to run the test. And I've got a launch config set up here.
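The behaviour of that quarantine tag can be sketched roughly like this (a plain-Ruby illustration of the idea, not the real GitLab QA code; the metadata key name and the issue URL are assumptions):

```ruby
# Plain-Ruby sketch (NOT the actual GitLab QA implementation) of what a
# quarantine metadata tag does: a quarantined example is skipped on
# normal runs, which is why the tag has to be commented out to run the
# test locally.
def run_example(metadata, run_quarantined: false)
  return :skipped if metadata[:quarantine] && !run_quarantined

  :ran
end

# A quarantined example is skipped unless quarantined runs are requested.
run_example({ quarantine: 'https://gitlab.com/example/issues/1' })  # => :skipped
run_example({ quarantine: 'https://gitlab.com/example/issues/1' },
            run_quarantined: true)                                  # => :ran
```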
C: I've got a launch config set up here that uses the CHROME_HEADLESS=0 environment variable and QA_DEBUG=1. CHROME_HEADLESS=0 turns off headless mode and allows me to see what's going on; QA_DEBUG=1 writes logs, as you can see at the bottom here. And then the argument here: this is running the bin/qa command, so it's like running that command from the terminal. If you were going to do it from the command line, it would look like that: bin/qa, Test::Instance, and then there's the URL.
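From a terminal, the equivalent invocation would look something like the following (a hedged sketch; the scenario name and target URL are assumptions about the local setup, and the command is only echoed here rather than executed, since running it needs a GitLab QA checkout):

```shell
# Sketch of the command-line equivalent of the launch config described
# above. CHROME_HEADLESS=0 turns headless mode off so the browser is
# visible; QA_DEBUG=1 writes verbose logs. The scenario name and URL
# below are illustrative assumptions.
cmd='CHROME_HEADLESS=0 QA_DEBUG=1 bin/qa Test::Instance http://localhost:3000'
echo "$cmd"
```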
C: So anyway, I'm going to execute that, and that should run the test. I'll just drag the window over to my second monitor. Now you should be able to see it; let me know if you can't. At the moment it's loaded the list of projects. Oh, it's going to open the console, so that was hidden for a second in the back. So it's adding a personal access token, and you can see in the background the logs of it pushing the files.
C: And it's running the after block. Yeah, it's running the after block; there's an after block here that is going to undo the file size change. That's there because this is two tests, so we want to revert the change that the first one made before running the second test. But anyway, this is the failure that we were seeing in the logs, the same one here: branch name master already exists. Now, if you were triaging and you just wanted to verify that the failure is reproducible, this is far enough.
C
You
could
say
that
yes,
you've
reliably
reproduce
the
failure.
That
would
be
enough
to
provide
some
information
that
yeah
this
is,
is
definitely
a
failure,
but
it
would
be
good
to
go
a
bit
further
and
determine
if
possible,
if
this
is
failure
with
the
test
or
if
there
is
something
wrong.
We've
give
that
itself
now.
C: In this case, we can go back up to the command that was issued, and we can see that it's issuing a checkout to create a branch called master. And earlier in the test it had already checked out master and created the branch successfully. So this is a failure of logic: we know that you can't create a branch that's already been created, so there's something wrong with the test there. If you were looking into it yourself, you'd need to do a bit of troubleshooting: step through, try to reproduce it, and identify where things have gone wrong.
C: We won't go through all of that, but to cut to the solution: whatever change was made changed the logic, so we now need to tell this test not to create a new branch. So this is a failure with the test, and it's a pretty easy fix. I haven't submitted it yet, but I will now. And if you were doing the triage yourself and you'd got to this point...
C: ...if you were able to figure out pretty quickly that, okay, here's the fix for the test, then just submit it and you're done; no need to worry about anybody else having to follow up on it. So let's assume we haven't done that, and let's assume that we weren't able to reproduce the failure using GDK.
C: The next step that we had in the document was to have a look at running it in Docker. GDK is a different environment from the one the tests run in under CI: in CI we use Docker containers based on the Omnibus nightly image. So that's what we can do locally as well. We can run the Docker container, and this is the command.
C: docker run, publishing port 80 on the container mapped to port 80 on the host; naming the container; putting it on a network named test, because the Docker container that runs GitLab and the Docker container that runs the tests need to communicate, and we want the two containers on the same network (it's easier to have it as a separate network); hostname localhost; and there's the image name. The container takes a while to start up, so I'll just show you one that I prepared earlier.
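Put together, the docker run command described here would look roughly like this (a sketch; the container name and image tag are assumptions, and the command is echoed rather than executed so it can be copied and adapted):

```shell
# Sketch of the docker run invocation described above:
#   --publish 80:80       map container port 80 to host port 80
#   --name gitlab         name the GitLab container (name is an assumption)
#   --net test            put it on a user-defined network named "test",
#                         so the QA container can reach it
#   --hostname localhost  set the container hostname
# The image tag (the Omnibus nightly image) is an assumption.
cmd='docker run --publish 80:80 --name gitlab --net test --hostname localhost gitlab/gitlab-ee:nightly'
echo "$cmd"
```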
C: So if I switch back to Visual Studio Code and I change the host that I'm running the test against, I can just run the same test from the same environment. Let me bring over the browser. And now it's running the same test, but this time it's running it against the Docker container, and this is the nightly image, so it doesn't have my changes, so it should fail in the same way as it did in the logs.
C: This is going to take a while, but after it runs the GitLab image in Docker, it will run the tests, and we can come back to this and see what the output is. While it runs: I also mentioned earlier that you can set the personal access token. You can do that using GITLAB_QA_ACCESS_TOKEN if you set up a token manually, or alternatively there's now one that's built in. Let's see if I can find the source for it.
C: Personal access token, there we go. So the development seed adds the personal access token; that's the one there. So if you configure the environment variable with that access token, then when you run the tests it shouldn't need to create an access token; it just skips that first step. We'll see if that works. Hmm, something has gone wrong.
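Setting the token up front, as mentioned, would look something like this (a hedged sketch; the token value here is a placeholder, and the built-in alternative would be whatever token the development seed created):

```shell
# If you have created (or seeded) a personal access token, exporting it
# lets the test suite skip the step that creates one through the UI.
# The value below is a placeholder, not a real token.
export GITLAB_QA_ACCESS_TOKEN='<your-personal-access-token>'
echo "$GITLAB_QA_ACCESS_TOKEN"
```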
C: In the meantime, I'll check on that Docker run. Here it shows that it started GitLab in a Docker container and was waiting for it to become responsive, and now it's just issued another docker run command to start the QA container image and run Test::Instance against the GitLab container's address, running that test that we told it to. So we're waiting for that to finish. Let's go back here and see why... okay, so we're going to need to investigate that. This should work.
C: So those are the three first-line options that you would take when troubleshooting. One of these should allow you to reproduce the error, and hopefully some combination of them, or going back to GDK, will allow you to identify where things are going wrong and either fix the problem or add some notes to the issue for somebody else to follow up on. So I think that wraps up the demo. Are there any questions, or anything you'd like to see?
C: Right, yes. And if it was a flaky test, then ideally you wouldn't unquarantine it when you submitted the fix. Ideally you'd submit the change, leave it in quarantine for a few runs, say five, make sure it passes every time, and then submit another merge request to unquarantine it. But in this case it failed all the time, so we could unquarantine it immediately.
C: Yeah, refer to the front-end or back-end engineering manager of the relevant team by mentioning them in the issue comments. I think if you do know that somebody was working on a particular feature and you can identify that as the cause (for example, if you've been working with the team and you know that some merge request just went in recently), then you could go directly to an engineer; but otherwise, tag the relevant manager of the relevant team.
E: What do we constitute as a flaky test? I just realized, because, Mark, when you said, if you think it's a flaky test, let's not move it out of quarantine yet until it's proven stable: the Configure team smoke test that I had actually assigned to you was failing for a specific reason. It wasn't flaky; it was just failing. So do we treat regular failures as flaky tests?
C
No,
no
and
it
can
be
difficult
if
we
just
say
one
report
but
I'm
pretty
sure.
Now
the
the
pipelines
are
set
up
to
retry,
so
a
spec
will
retry
each
test
three
times
and
then
fail.
So
you
can
look
through
the
logs
and
see
if
it
has
retried
and
if
it's
retried
and
failed
each
time,
then
we
know
it's
not
flaky.
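The retry-based reasoning described here (retry up to three times; consistent failure means not flaky) can be sketched in plain Ruby (an illustration of the triage logic, not the actual pipeline configuration):

```ruby
# Sketch of the retry-based triage reasoning above: run a test up to
# three times, stopping early on a pass, then classify the result.
def run_with_retries(max_attempts = 3)
  results = []
  max_attempts.times do
    results << yield
    break if results.last == :pass
  end
  results
end

def classify(results)
  if results.last == :pass
    # Passed eventually; needing a retry to pass is the flaky signature.
    results.size > 1 ? :flaky : :pass
  else
    # Failed on every attempt: not flaky, a genuine failure.
    :genuine_failure
  end
end

classify(run_with_retries { :fail })  # => :genuine_failure
```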
B: I know, and, I hate to say this, but there's a term product management uses: label health. You take it at face value, but it can be overwhelming. And yeah, quality::bug is the same as quality plus bug. If you can think of a better name for me to propose, do, but yeah, I understand. It's not flaky; it's a defect in our test harness.
C: Yeah, so I have a launch config here; I'll make it public. It's just basic. Essentially it's adding the environment variables and whatever arguments, and there's this handy little ${file} variable there, which means that if you select a file in Visual Studio Code, it will substitute the path to that file and run that specific test, which can be handy. So I actually have a few different...
C: Oh, this is the wrong Visual Studio Code window, I see. So I have a few different configurations set up: if I want to run a specific test, or run the smoke tests, or run tests in a certain group, or run tests against staging, I have a few different configurations set up so that I can quickly do any of those. But "current file" is the one that I mostly use.
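A "current file" launch configuration like the one described might look roughly like this (a hedged sketch of a VS Code launch.json entry; the `type`, `program` path, scenario name, and URL are assumptions about the local setup, while `${file}` and `${workspaceFolder}` are standard VS Code substitution variables):

```json
{
  "version": "0.2.0",
  "configurations": [
    {
      "name": "QA: current file",
      "type": "ruby",
      "request": "launch",
      "program": "${workspaceFolder}/bin/qa",
      "args": ["Test::Instance", "http://localhost:3000", "--", "${file}"],
      "env": {
        "CHROME_HEADLESS": "0",
        "QA_DEBUG": "1"
      }
    }
  ]
}
```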
C: Yeah, in my opinion you should always have that to disable headless if you're going to be running locally. And then sometimes I might run all of the tests headless in the background while I'm working on something else, just to make sure, if I've made a bunch of changes, that I haven't broken anything else. I'll just run them all, and I'll run them headless so it doesn't get in the way. Okay.
B
Have
we
noticed
any
flakiness
specifically
to
headless
and
I
think
this
came
up
with
a
discussion
with
Dan
and
me
a
while
back
where
she
was
just
running
you
a
full
full
head,
because
that's
what
our
users
are
using,
because
I
assume
right
now
we're
running
headless
correct
everywhere.
Do
you
mean
in
CI
yeah
yeah
yeah.
C
Yeah
I
I,
don't
think
so.
I
only
recall
one
instance
where
there
was
a
problem
that
was
related
to
running
headless
and
I
think
it
was
actually
just
that
a
developer
ran
in
an
environment
where
the
where
their
desktop
resolution
was
too
low,
and
so
they
ran
into
an
error
there
and
it
wasn't
reproduced
during
CI,
because
there
was
more
virtual
space
in
the
headless
CI
environment.
B: Yeah, I was about to say: until we have our own internal grid. I think once we have that, I would like to actually turn on headed mode, because that's probably a better representation of what our users are using. I mean, RSpec and mocking can be headless.
D: All right, well, we are at just over three minutes left, so that was perfect timing to get everything done. You get two gold stars, Mark.