From YouTube: How to launch product experiments at GitLab
Description
Growth team talks about how to do product experimentation
Training deck
https://docs.google.com/presentation/d/1nmStWChWkYad9K-dced9wS4jS7XLIrHB-WKafc7jrMU/edit#slide=id.gca4c496ea4_0_0
Hila: Awesome, so welcome, everyone. Thank you so much for joining this experimentation workshop. Many team members from the Growth team contributed to these training slides: Sam, Emily, Phil, Caroline, and myself. We will walk you through how we launch experiments here at GitLab. Sam, do you want to move to the next one?

For anyone who hasn't done many experiments before, you may ask: why experimentation? Why does it matter? I want to share an example.
The Growth team actually read about this in a book we read together in a book club, called Trustworthy Online Controlled Experiments. A while back, an engineer on the Microsoft Bing search team had an idea about how to tweak the search results. The top of the slide shows the previous search result format; his idea was to move some of the detailed copy into the title, but he never thought this would change anything. He felt it was just a small change, and the idea sat in the backlog for six months. One day, probably just bored with the regular work the PM had assigned him, he decided to spend a few hours coding up the experiment. Within a couple of hours, the entire company was getting alerts basically saying revenue was too high. That sounds like an awesome alert, but the entire company didn't know what had happened. He went to investigate the experiment and realized this small change had actually brought a 12 percent increase in total Bing revenue, which translates into a hundred million dollars a year for the US alone, and they could implement the same change for other regions as well.
So why experimentation? It really comes down to this: sometimes you expect something to work and it doesn't; sometimes you expect something won't work and it works like a miracle, as in this case. Human beings are very confident in our judgment of how things should work, but a lot of the time that's not how our customers and users will actually behave.

With digital products such as GitLab, we have a unique advantage: we can test different product changes, flows, and new features, and observe the real change in business metrics rather than guessing. It's a unique advantage that all of us should leverage. Sam, do you want to move to the next one? So yeah, this is the high-level experimentation process we use here at GitLab.
It starts with coming up with ideas, ideally generating as many as possible, so you have an idea backlog. Then you select the best ideas; we will talk about the framework we use, which basically weighs how much impact a change could bring against how much work it is, to pick the best ideas to test first. Then comes the phase where we turn an idea into an actual experiment. That requires a very clear and concise experiment design; we will share the template we use to write experiment designs here at GitLab. We need our UX team to help develop the different versions we want to test. For example, in the Bing example, you have a version with the copy at the bottom and a version with the copy in the title. That's a very small change, but some experiments require quite different UX designs. Then we need our awesome engineering team to help us implement those versions: there is the control, which is basically the original version, and we may have a version one or version two with a different experience or design for the users. After the engineering implementation, we move on to a testing phase: we do staging tests, and we also roll out gradually rather than launching the experiment to all users at once. After going through that launch phase, we collect results and analyze them to see whether the experiment worked or not. We add the experiment to our knowledge base, which is basically the place we document our past and currently live experiments, and we collaborate with the data team to analyze the results and find out whether it worked. Finally, we need to clean up the code: experimentation usually requires some extra code to wrap the experiment, and we want to remove it afterwards to keep the code base clean and set the foundation for future experiments.
Sam: Sure, thanks, Hila. To kick off the process Hila just highlighted, it's important to collect as many ideas as possible and build up a bit of a backlog of experiments you might want to run. You can go about that in a lot of ways: reviewing the qualitative and quantitative feedback and data you have is really valuable for understanding that, but it's also important to chat with your team and work through workshops.
In particular, look at your data and see what it is saying, and what customers are saying in feedback. Do you see a large drop-off in a funnel, or people getting hung up on a particular area, where we're not totally sure what the solution to that experience is? That's an opportune moment to start thinking: is this a place where we could run an experiment to learn and potentially improve the user experience?
A first consideration: if it's a poor experience for the user today and we believe we have a fix that simply improves it, we should ship that fix and improve the customer experience. We don't need an experiment to pat ourselves on the back and say that we improved it by X if we know the current experience is broken or close to broken. The second consideration is to ensure you have the data you need to understand the experiment.
Do enough people interact with the area you're exploring for a test to be able to reach significance? And lastly, do we collect data on that particular area of the product at all? If not, you might want to start by adding event tracking or back-end tracking to that area, to understand the volume first.
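For a sense of what that first step can look like in the GitLab code base, here is a hypothetical fragment; the category and action names are invented for illustration, and Gitlab::Tracking is GitLab's wrapper around Snowplow:

```ruby
# Hypothetical fragment for a controller action: record a back-end event
# each time users reach this area, so we can gauge traffic volume before
# designing an experiment around it.
Gitlab::Tracking.event(self.class.name, 'render_invite_banner', user: current_user)
```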
The next step in the process is to write up an actual experiment idea. We have a template in the GitLab project, which is linked at the bottom here in tip one, and it's also in the growth process page in our handbook. It really starts with defining your hypothesis, and it's important to note that your hypothesis isn't tied strictly to a single experiment.
It can be a broad statement about the area you want to explore through testing, because one experiment could be invalidated (your treatment simply didn't win out over the control), but that doesn't mean your hypothesis is; you may have a follow-up test to run to try to prove the same theory.
Emily: Awesome. Now, from the UX design perspective, as a designer: the UX process in experimentation isn't too different from other design work at GitLab, but the main thing is we want to ensure the proposed solution is small, the smallest thing we can do to get a relative change, and that success is defined and measurable.
Some of the big dos we follow as designers in this area: explore multiple options early on and share them with the team; use existing patterns, unless the experiment calls for a different or new pattern that we want to test; and get cross-functional feedback, as you normally would, to check whether this is reasonable to do and whether it will make the biggest change. We've also created a Figma template for experiments that designers can use. And if you're getting different data than you expect,
consider a follow-up usability test to understand why what you're getting back is different from what you thought you would get from the experiment. Then some small don'ts: don't spend time refining details that won't impact the experiment results; that can be done during cleanup. An example would be making a very tiny change to an icon that isn't relevant to the experiment and won't impact results. And don't make UI changes outside the scope of the experiment.
Phil: Thanks, Emily. On to the engineering implementation phase. Now that we've got a well-defined experiment and design, we need to implement it in GitLab. Growth engineers at GitLab use GLEX, the GitLab Experiment gem, a project that lives under the gitlab-org group.
I call this a second-generation experiment framework. Initially, our growth engineers implemented experiments within the GitLab code base using a module and standard GitLab development feature flags. We've iterated on that process and have now settled on GLEX.
GLEX uses a custom experiment feature flag type that still supports the approach used with development feature flags, but our experiment feature flags tend to live a little longer in the code base than development feature flags do. This will be familiar to anyone who's used to developing features behind a feature flag in GitLab, including supporting tools such as ChatOps for enabling and disabling the flag itself.
GLEX supports A/B testing and multivariate experiments. For a basic A/B test, we need to define both the control and the candidate experience in the code base; for a UI test, this could be defined in a controller action. This builds on the approach used in the open-source Scientist gem. The GLEX project README covers the different types of implementation, and there are really good examples in there.
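For illustration, here is a minimal sketch of a basic A/B test defined in a controller action with GLEX. The use/try block style comes from GLEX's Scientist lineage; the experiment name, actor, and view names are hypothetical, and the GLEX README remains the authoritative reference:

```ruby
class GroupsController < ApplicationController
  def show
    # `use` defines the control experience, `try` the candidate.
    # GLEX assigns a variant per actor, so a given user keeps seeing
    # the same experience for the life of the experiment.
    experiment(:invite_members_cta, actor: current_user) do |e|
      e.use { render 'show' }           # control: the original page
      e.try { render 'show_new_copy' }  # candidate: the revised copy
    end
  end
end
```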
The README would be my recommendation for where to start if you're looking to implement an experiment, but the GitLab code base also includes many examples: there are currently around 20 experiments defined in it, and that's another really useful resource for anyone looking to implement an experiment in GitLab. A good place to start there is searching for uses of the experiment method in both controllers and helpers.
Both front-end and back-end tracking are supported, including Snowplow for the front end and tracking data to the database. Tracking the right data to be able to make informed decisions is not trivial, and respecting data privacy is another important consideration. Please reach out to a PM, analyst, or engineer in Growth if you want to discuss further. Thanks.
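As a sketch of what event tracking can look like (the experiment and event names here are hypothetical; check the GLEX README and the Growth team's conventions for the real patterns):

```ruby
# Track a conversion event against the variant the actor was assigned to.
# On GitLab.com this is emitted as a Snowplow event.
def accept
  experiment(:invite_members_cta, actor: current_user).track(:invite_accepted)

  # ... the rest of the action ...
end
```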
Sam: Thanks, Phil. After the experiment is implemented by engineering, the next step is to think about bringing it to staging and out to production.
Staging is a nice way, before the experiment reaches production, to check that you're collecting the front-end events you anticipated and decided on with your team. Once you're comfortable with the staging experience, you can roll it out to production. When you're doing an A/B test in Growth, we generally start with 20 percent of users or less and leave it there for at least a day, if not a few days.
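Rollout happens through the feature flag that backs the experiment. A sketch of what that can look like from a Rails console, assuming a flag named invite_members_cta (ChatOps wraps equivalent commands):

```ruby
# Start with roughly 20% of actors (users)...
Feature.enable_percentage_of_actors(:invite_members_cta, 20)

# ...then, after at least a day of verifying events and listening for
# feedback, raise it to the full 50% A/B cohort.
Feature.enable_percentage_of_actors(:invite_members_cta, 50)
```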
The reason for starting small is twofold. First, our data is loaded in nightly, so you have to wait at least 24 hours to see whether your events are actually coming into the database. Second, we want to ensure we provide users with a good experience, so depending on how big the experiment is, we listen for anything from support or anybody else internally about the experience before we roll it out to the full cohort size, which for an A/B experiment would be 50 percent. From a tracking perspective, you also want to ensure you create an experiment tracking issue related to your experiment.
The point of that issue is documentation: when was the experiment rolled out on production, and at what percent? If it started at 20 percent, when did it go up to 50? When was it eventually turned off or on, and what was the result of the experiment? At the end of the day, at least in Growth, we treat this tracking issue as the end result of the experiment; the conclusion is defined there.
The other important thing to do throughout this process, as you're planning your experiment, when it launches, and when it concludes, is to ensure you add it to the Growth knowledge base, which is part of the Growth direction page in the handbook. We've recently updated that page so there are sections for planned and upcoming experiments, active experiments, and concluded experiments. Our goal is to provide a space for any team to post when they're planning to run a test, when they have an active test, and when they have a conclusion with the result.
Caroline: For the first item, if you just pop in the name of your experiment, you can see which events are coming through and how they're coming through in production.
The second item is a SQL snippet available in Sisense that will pull in both front-end and back-end events for your experiment. If you find yourself needing extra help, or you need a more robust analysis, please engage the product analysis team by opening an issue in our project; the template is linked there. One thing I do want to emphasize is that we ask you to open the issue at least one milestone before the experiment is going to launch, so that we can plan accordingly.
The last two items here, the experiment framework and the SOP we've recently developed on the Growth team, can give you an idea of how to design and measure your experiment, essentially the steps of when to engage which teams, and finally some general experimentation best practices.
Phil: On to experiment cleanup. Once our product managers have made a call on the outcome of an experiment, it's important that we clean up the code appropriately. Our experiments run behind experiment feature flags, and those flags, along with the code, are both technical debt, so we want to clean up as soon as we can.
Some of the considerations are whether this should run on SaaS, self-managed, or both, and whether it's EE- or CE-only. We usually need to make changes to tracking, add documentation and, of course, a changelog entry. By default, our growth experiments only run on SaaS, and we do this so that we can move fast. We don't want to be running experiments on self-managed, where it's harder to update and conclude them, so we run on SaaS by default; that's probably one of the first considerations.
If you're rolling the change in as a feature, note that some of the tracking calls are custom to experiments and need to be either refactored or removed. And if you're rolling it out as a feature on self-managed, consider converting the experiment feature flag to a development feature flag, if that's your standard practice. If an experiment has not been successful, the code and the experiment feature flag can be removed from the code base, and any learnings can be applied through follow-up issues.
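To illustrate what cleanup means in code (a hypothetical before-and-after, reusing the earlier sketch):

```ruby
# Before cleanup: the action is wrapped in an experiment.
def show
  experiment(:invite_members_cta, actor: current_user) do |e|
    e.use { render 'show' }
    e.try { render 'show_new_copy' }
  end
end

# After cleanup, assuming the candidate won: the wrapper, the control path,
# and the experiment feature flag are all removed.
def show
  render 'show_new_copy'
end
```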
Sam: Thanks. Moving on to tips for creating effective experiments. There isn't always data to support running a potential experiment, and you may not feel you have the strongest footing to judge whether that test is worth running.
One approach you can take, if it's a larger initiative that could take a lot of work to build, is a painted door test. In a painted door test, you design a piece of the UI to see whether users will actually interact with it, without building out the full experience. The idea is to run it for a limited amount of time with a limited number of users, just to understand enough
about whether users are interested in that particular thing. We've linked to an example painted door test the Growth team ran, which helped us understand whether non-owners and non-maintainers were interested in finding out who could invite users to a particular namespace.
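A painted door can be as small as rendering the entry point and tracking clicks. A hypothetical sketch (names invented; not the actual experiment linked above):

```ruby
def members
  experiment(:invite_link_painted_door, actor: current_user) do |e|
    e.use { render 'members' }                   # control: no new link
    e.try { render 'members_with_invite_link' }  # candidate: painted door
  end
end

def invite_interest
  # The painted door's click target: record the interest, then explain
  # that the capability isn't built yet.
  experiment(:invite_link_painted_door, actor: current_user).track(:clicked)
  render 'invite_coming_soon'
end
```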
In that book Hila mentioned, one of the examples from a team at Microsoft was that they would slow down a particular area of the product by a few milliseconds and measure what that actually did to adoption between screen one and screen two of a feature. That drop helped them understand what the perceived benefit would be if they improved infrastructure speed by that same amount. The other thing to really highlight here is to check your experiment tracking early and as often as possible.
The reason for this is twofold. One is for your team: you want to know you're getting the data you need to be able to understand the results of the experiment. Two, we always want to protect the user experience. If we start to see trends in the data showing the experiment is winning out, that's great.
We can start to look at significance and understand when it's validated and should be rolled out to 100 percent. Or, if we run into an experience that's quickly becoming invalidated, where it's not as good as the control, it's important that we understand that and are ready to return the control to 100 percent of our users. And then, lastly: if there's little difference between the two results in your test, an A/B test for example, it's important not to see that as useless for your team. You should use those results to try to understand what follow-up experiments, if any, you can run to try to validate particular areas of the experience.
I know we covered a lot through two slides here. This one is full of resources and documentation on where you can find how Growth works and how each specific team works; we have a lot of content, so feel free to check out our handbook pages. And then, lastly, you're always more than welcome to reach out to us.
You can always start by creating an issue and at-mentioning us in it. I know I can speak on behalf of the PMs: we would be more than happy to provide feedback or answer any questions for other PMs or anybody else who's interested, and I'm sure the engineers are as well. You can also find us on Slack in the growth channel.
Audience member: One of the things I noticed, and I think it was Emily who mentioned it: when you're designing an experiment, you don't want to make any other code changes to that area during the experiment. But since Growth launches experiments on top of code that other people are also building on, how do we safeguard that? Or is it sometimes okay to play in the same space?
Hila: Yeah, I think that's a great question. Emily, I want to hear your perspective, and Sam's too. My perspective is that, first of all, in areas we own or have more control over, we definitely try to avoid it. For example, two PMs shouldn't be launching two experiments in the exact same area, changing different things; that would mess up our results.
Second, we should ideally launch A/B tests. In that case, even when there's an external force outside Growth making changes, that change should impact version A and version B the same way. The only difference we make between version A and version B is the change we want to test, so we can still tell the effect of that change.
The last point, which I think is important, is to keep open communication with other teams. Like with your team: we appreciate it whenever you share updates; that's super helpful. Whenever we branch into another team's focus area, for example when we're working on Verify adoption or Secure adoption, we make sure we have close communication with that team and share our roadmap and plan, so that if there are any potential conflicts, we can identify them early on. But yeah, there's no perfect solution, so we try all of these ways to minimize the impact.
Phil: Because we're using feature flags (we have a custom experiment feature flag type), this is something our engineers are very familiar with. When they're looking in the code base, they can see that something is behind a feature flag; whether it's an experiment or not, they're used to that case and should know how to introduce new changes. It does happen, though.
We're certainly not doing anything to stop other teams from changing their code while we're running an experiment; it's very similar to a development feature flag your engineer would be looking at where they're implementing that change. And as Hila alluded to, we would hope that such a change is implemented on the control rather than on a variant we're testing. If you use git blame, you'll see which engineers worked on the experiment, and they'd be more than happy to answer any questions.
Caroline: One thing I want to piggyback on, as Hila mentioned, is open communication being really important. I think as we start to increase velocity, we're going to have to make sure we're not tripping over each other. There was also an item on a previous slide Sam was covering about why you should or shouldn't run an experiment...
Hila: Going once, going twice, going a third time... If not, thank you, Aaron, for coming. I will share the video afterwards. Again, if you have any questions we can help with, feel free to hit us up in the growth channel. Bye.