From YouTube: A/B Testing Think Big Session

Description: Recording of the Release stage, Progressive Delivery group — A/B Testing Think Big and Brainstorming Session.
A: Hopefully we can also form an MVC out of this conversation, but it's not a must-have; we can also do that asynchronously. But it's important for me to discuss the feature and to hear your feedback about it, and the different angles we can take, because that's a lot of the purpose of this session. Now, a little bit about the basics of A/B testing. I imagine that everyone knows what A/B testing is, but I will spend a few minutes discussing it at a high level.
A: I would say A/B testing is a flavor of feature flags which lets you leverage different audiences. When you think about a feature flag, it's either on or off for a specific user, group, or whatever strategy we have in the future. A/B testing allows you to make a decision based on what the behavior is for a specific audience. So the feature flag here is not on or off; it gives you different options that someone will get. To make it really simplistic:
A: Let's say we have a website and we want to see which color the user prefers. Some users are going to get a yellow interface, and some of them are going to get a green interface. Later on we can look at some measurements, see which one the users liked better, and then decide that the chosen color is going to be yellow forever and close the feature flag.
A: It doesn't have to be a temporary feature flag. A/B testing can also mean that you have different flavors per user long-term. You can say, I don't know, users in different geo-locations will get a different set of features, and you can manage that with variants. You can think about it like a switch-case in code, where each case is a different outcome the user would get, and each user gets, you know, whatever their...
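The switch-case analogy above can be sketched in a few lines of Python. This is a hypothetical illustration of the idea, not the GitLab or Unleash API — the variant names and the `outcome_for` helper are made up:

```python
# Hypothetical sketch of variant lookup behaving like a switch-case:
# each variant name maps to a different outcome the user receives.

VARIANT_OUTCOMES = {
    "yellow": "render_yellow_interface",
    "green": "render_green_interface",
}

def outcome_for(variant_name):
    """Return the action for a variant, falling back to a default case."""
    return VARIANT_OUTCOMES.get(variant_name, "render_default_interface")

print(outcome_for("green"))   # render_green_interface
print(outcome_for("purple"))  # render_default_interface
```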
A: ...whichever if-statement resolved for them. That's, in a very general sense, what A/B testing is. Now, the persona that we look at for A/B testing is also interesting, because it is not necessarily a developer or a product manager, which is the usual persona for feature flags. The persona in A/B testing can even be a sales manager, or an executive. And why am I saying that?
A: Because sometimes A/B testing results have to do with, let's say, revenue. Imagine that you have a shopping website; you want to make sure you have a lot of customers adding things to their carts, and you want to see which flow makes users add more items. So basically, what you do is run a regular A/B test, and then look at which flow generated more items in the cart, or generated more purchases.
A: So this is really interesting to think about. Another really interesting thing about A/B testing is that a lot of the decisions are based on metrics that are collected, and we'll talk in the brainstorming about what metrics we might need to measure and how. The idea is that, usually visually, in order to make this decision a user needs to know what's going on with the different cases and, based on that, decide which one is the right flow. That was a very long introduction, I guess.
A: It's a little combination of everything. GitLab is a little bit late to the game with feature flags. A/B testing is, I wouldn't say a basic feature, but it's definitely something that most of our competitors have, and something that has been used for years in the industry. It's not something new or something that we invented. I would say one of the most important customers GitLab actually has for this is our own Growth team.
A: So we already have a potential for dogfooding, which is really exciting. The way that Growth wants to use this — well, we don't have to imagine, they just told me the use case: they want to see how we can onboard people faster to GitLab. Onboarding means this is a new user.
A: It could be a user that has been using GitLab for a month, or for three months, it doesn't matter; they bucket the users according to age, and they want to make the experience for the early users easier by helping them out with in-product guidance and things like that. So they want to feature-flag things based on the user that is there and the creation date of their user in GitLab. So that's a use case of who would get features.
A: This is really interesting for any executive in our company, but really in any company. There's also a lot of buzz this year about progressive delivery, and A/B testing is definitely something that comes up a lot in what relates to progressive delivery, since it drives a specific decision, at the end of the day, about what the endpoint is going to be. We can also think about A/B testing...
A: ...as, you know, post-deployment metrics: you're deploying something, you're collecting metrics about it, and then that affects your decision about how you're going to deploy it in the future. So this also connects to continuous delivery. It connects everywhere within the team — that's the way I'm thinking about it.
A: Our biggest competitor in feature flags, and also in terms of A/B testing, is LaunchDarkly; they have a nice UI that they use. I have read different blogs about how to connect Unleash to Google Analytics, for example, to achieve this, which is an option that we can explore. And Andrew, who I'm going to pass the mic to in a minute, has also looked into something called variants that Unleash has, and will tell us a little bit about that.
A: So because Andrew is drinking, I'm going to give a little bit of an intro. Unleash is what we're basing our feature flags solution on, so they're kind of our foundation, but recently they also decided that they're going commercial. So they're a competitor, but also the open source project that we're basing on, and also something that we're watching as a competitor.
C: Gets me every time. Another interesting fact is that feature flag variants have the ability to kind of stick, or pin, to specific values, similar to how the percent-based rollouts work. There's a little bit of a blurb down there, but basically it tries to find a user ID; if it can't find a user ID, it tries to find a session ID; and if it can't find a session ID, it falls back to the remote address of the request.
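The fallback order described here (user ID, then session ID, then remote address) can be sketched roughly like this. It's a simplified illustration of the stickiness idea, not Unleash's actual implementation — its real hashing scheme and field names differ:

```python
import hashlib

def stickiness_seed(context):
    """Pick the first available identifier, mirroring the described fallback."""
    return (context.get("user_id")
            or context.get("session_id")
            or context.get("remote_address")
            or "")

def pick_variant(context, variants):
    """Hash the seed so the same context always lands on the same variant."""
    seed = stickiness_seed(context)
    digest = int(hashlib.sha256(seed.encode()).hexdigest(), 16)
    return variants[digest % len(variants)]

# Same context always yields the same variant.
ctx = {"session_id": "abc123"}
assert pick_variant(ctx, ["yellow", "green"]) == pick_variant(ctx, ["yellow", "green"])
```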
C: But again, all the client-side work is done for four languages. For those four languages, you can look at the limitations and decide — we can go over which ones we can fix and which ones we can't. The first one, setting the variant weights, is something that we can absolutely handle without touching Unleash code. The other two, I think, are probably less likely to be able to be handled on just our side.
A: So let's say that we do use this beta feature, which may or may not change, and assume that we're using the simple example where we have five colors. How do we know — again, we need some kind of dashboard to make a decision. Is there something like that that Unleash has today, or no? Yes?
C: If we go into the API in the top right, we can look at their metrics endpoints. The first half, I guess, is what a client would send: basically the toggle name, and how often it got toggled in which direction. What's missing here is the variant side of things, which does get reported, but because it's a beta feature it's not really documented.
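For illustration, a client metrics report along the lines described might look roughly like this. The field names below are guesses for the sake of the example, not the documented Unleash schema:

```python
# Hypothetical shape of a client-side metrics report: per-toggle yes/no
# counts for a time bucket, plus the (undocumented) per-variant counts
# mentioned above.
report = {
    "appName": "my-website",
    "bucket": {
        "start": "2019-11-01T10:00:00Z",
        "stop": "2019-11-01T10:01:00Z",
        "toggles": {
            "interface_color": {
                "yes": 120,           # evaluated as enabled
                "no": 30,             # evaluated as disabled
                "variants": {"yellow": 58, "green": 62},
            },
        },
    },
}

counts = report["bucket"]["toggles"]["interface_color"]
# Variant counts should partition the enabled evaluations.
assert counts["yes"] == sum(counts["variants"].values())
```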
C: Yeah, and it would be, like, the number of times people got blue. We're actually receiving those metrics every day from people who already have our feature flags client set up, but we have no way to store them right now. And on the metrics side of things, I don't know how we would want to hook that up — whether we'd want to do something with Prometheus, or I don't know what our other metrics collectors are.
A: ...you know, implement this. So let's go back to the metrics today, which, aside from A/B testing, would still be interesting for us to understand, just to count toggles, because we've been trying to figure out how many people are using feature flags regardless of A/B testing. So it may be worthwhile to build something internally for us.
C: I think it would be super handy as a feature, generally speaking. If these metrics are an important decision-making factor in A/B testing, I imagine that's still definitely going to be data we want to capture, and we may as well capture all the metrics and give people a good dashboard of their toggles.
C: Here's how variants work today. Again, it's not very smart, so don't expect too much, but basically we have a little variant tab now. We can name our variant — we can say the variant is, like, green. The payload is always a string at the moment, so we could just do "green", or we could do a hex value.
C: The payload is the value of the variant. So if you're writing your program and — in the color example — you're requesting a color, what color do I want, that would be the payload. It's equivalent to the final true/false value on a feature flag. And then we can add some overrides based off of user IDs, at the moment.
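A minimal sketch of how a program might consume such a variant payload, with a user-ID override layered on top. The names here are hypothetical; the real client API differs:

```python
# Variant definitions: each name carries a string payload (e.g. a hex color).
VARIANTS = {"yellow": "#ffd700", "green": "#00a000"}

# Overrides pin specific user IDs to a given variant, as described above.
OVERRIDES = {"user-42": "green"}

def color_for(user_id, assigned_variant):
    """Overrides win; otherwise use the assigned variant's payload."""
    name = OVERRIDES.get(user_id, assigned_variant)
    return VARIANTS.get(name, "#cccccc")  # fallback default color

assert color_for("user-42", "yellow") == "#00a000"  # override applies
assert color_for("user-7", "yellow") == "#ffd700"   # normal assignment
```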
E: Well, that's the part that I always struggle with, because in A/B testing that's half of the equation, right? Sure, X number of people got this as green, but that's only half the story. What you really need to know in A/B testing is: what result did that produce? Which is usually, like, three pages down the line.
C: That makes a lot of sense. I think Unleash is ultimately extremely focused on providing the toggle functionality, and anything outside of that appears to be out of their scope. So yeah, I think we will probably need something on top of this that we can kind of glue together. But if it's just some easy API endpoints, or something like that, right, yeah, we can see here.
A: Yeah, so I'm going to step away from the money side of the business. If we talk about, for example, performance, and checking performance based on a variant, I think that's something that we could probably connect — I think that already exists today in the Monitor team. So that's something really interesting that we can do with this. Revenue is even more interesting, but I think that's a little bit trickier, and depends on the business, like how you need to put some kind of breadcrumbs in place in order to collect it later on.
A: But you know, people that do use A/B testing — that's the most common pattern — they create experiments. That's one of the lingos it's called; I don't remember which competitor, but that's what they do. They create experiments, then they get metrics back on the experiments, and then they decide on the result of the experiment based on whatever was collected. I think we have some time until we get there, but we can definitely leverage what we already have.
A: ...it's part of the GitLab installation, whatever, so it's written in it. And someone told me that CD is like the entry point to Monitor, so I guess it's true for this as well. They already have a bunch of APIs that we can also use in order to make this richer from a performance standpoint, but not everything.
F: Yeah, and if there's a way to publish those things in a standardized way, like syslog or something else, then it gives you more flexibility if we needed to integrate with Segment or some other thing, right? We re-publish these packets out for one of our other user-behavior-tracking things that they might want to use somewhere else, right? So.
A: Another really interesting thing I've been working on with the Monitor team, and it turns out we already have in place, is a response for when an error rate is met. Prometheus collects the metrics — we already have a bunch of metrics that they're collecting — and if the error rate is reached, it automatically opens an issue. We could probably leverage that for, for example: this variant is not getting traffic, or is getting less than 10 percent, so it can open an issue until you remove the variant.
A: What I think is that we need to separate these metrics in two. One of them is purely feature-flag related, and that's what we're getting from the Unleash API — the metrics that just tell you, here are the variants users are getting — which is valuable in itself, because a customer would want to know the distribution the feature flag was actually producing. And then we have everything that's post-deployment, which is Prometheus and going forward. So I actually think we need two dashboards. That's my opinion, but I'm happy to hear yours.
C: Yeah, I think, ultimately, it's just a data bucket that handles what you'd expect a time-driven metric to look like. As long as you have timestamps on your payloads, they can probably figure it out, and then it would be up to somebody to build queries around collecting and displaying that data — which I know we on the frontend have some fun libraries for, embedding charts of data from Prometheus.
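The "data bucket with timestamps" idea could be sketched like this — the record shape is hypothetical, just to show that timestamped variant observations are enough for someone to build queries and charts on later:

```python
from collections import Counter

# Hypothetical timestamped variant observations, as they might be stored.
records = [
    {"ts": "2019-11-01T10:00:00Z", "toggle": "interface_color", "variant": "green"},
    {"ts": "2019-11-01T10:00:05Z", "toggle": "interface_color", "variant": "yellow"},
    {"ts": "2019-11-01T10:00:09Z", "toggle": "interface_color", "variant": "green"},
]

def variant_totals(records, toggle):
    """Aggregate how often each variant was served for a given toggle."""
    return Counter(r["variant"] for r in records if r["toggle"] == toggle)

assert variant_totals(records, "interface_color") == {"green": 2, "yellow": 1}
```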
C: So there could be a good plan there, where we display some very simple metrics close to the feature flags, while still keeping all the data with the rest of the metrics. And then, if you go over to Monitor, you might be able to get some fancy cross-referenced data displays.
C: We can maybe kind of do both at the same time a little bit — start simple. And then, for the variant side of things, I don't know what the best user experience would be. But I imagine they're almost complicated enough to be a separate data-input side of feature flags, even though they share a lot of configuration and display.
C: Something like — if we can look again at these variants here, and I'd have to play around with this a little more — it's still, in Unleash anyway, a feature flag with strategies, and that doesn't make any sense to me. I don't know why it's like that; I don't understand how this side of things impacts that side of things.
C: No, no, no. It would be more fair to say that you can either request a feature flag's status, or you can request its variants. The strategies impact its enabled status — you do default, or flexible rollout, or whatever other strategies you want — or you can request its variants. They're two totally separate API calls in the client, so as far as I can tell they shouldn't impact each other, and I'm not sure why it's designed this way.
E: I can see a world, though, where — like with GitLab Next — you only want to A/B test on a subset of people that have agreed to see new stuff, right? The core users don't get to see these variants. So I could see a world where you would want both a feature flag strategy and variants as well.
A: So this also requires the API, the new dashboard or whatever we'll want to call it, and the new variants UX — also an API, I assume, because not everyone wants to use the UX. What am I missing? Oh, we also need, I guess, to add everything to the audit log, but that's just one more thing, like what we're saving today — who created it, at what time — it has nothing to do with toggles. Okay, anything else?
E: I hate to be the UX guy who says "testing" — I don't know, I refuse to answer the question, but testing is the only way to answer it. This is a big enough thing that I honestly don't know. I can see it just living in the strategy world, where it's just an additional strategy; I can also see it the other way, as a completely segmented-out thing. I think we're going to have to prototype and get some feedback on it. So yeah.
E: Agreed. I don't think this has to be a long, drawn-out thing, and I don't think we need actual customers. I think we could probably leverage our internal folks, or some of the people that we've already talked to for this. I think what we need is prototypes of both ways, and then feedback on which one people like and why. So yeah, hopefully this is something we can accomplish without it turning into a long, drawn-out two or three milestones.
E: I think we can accomplish this relatively quickly as far as creating the mockups. I had this pretty high on my list of things to do — it's a very new feature, so we kind of have the pattern in place. We're not talking about wildly new or different fields or anything like that, so mocking it up should not be terribly difficult; we just need to work it through the process.
E: One thing I do want to say, though: if we get to the MVC and we're attaching metrics on variants, I think that's great — that's absolutely where we should start. But that next piece after that, connecting it to the data that makes it so much more valuable — I think we really need to start thinking about how we're going to do that, because one step in the wrong direction, once we go down that path, will be disastrous. We need to make sure we pick the right one.
A: We know what we want to do for the MVC, yeah. The second phase is totally unclear, and also worth validating, by the way. Maybe even there we can start with problem validation, like what users even want to accomplish. We need to figure out what's interesting, and that will also pull us in the direction of whether to leverage the performance metrics that we currently have with Prometheus, or go two totally different paths.
E: My gut feeling on that, though, is that we're going to have to hook into other things, like Segment — there's a whole industry that does this. I know we like to do everything as an internal tool here, but this is a whole industry that we're backing into, after tackling its own industry of feature flags. I just don't see how we're ever going to compete in that market in the near future; connecting to a Segment or a Google Analytics seems like such a win.
A: I already saw posts that show you how to connect to Google Analytics, so that shortens the research time by far. Again, it's introducing another third party that we haven't yet dealt with — I'm not saying it's a bad thing, it's just a thing that we probably need to learn. But definitely any user that's doing websites would gain value from this. Google Analytics — how many people went into this page, how many people did this — it's an awesome tool, right?
A: ...that dashboard we were talking about — it's not really related to this, because it has to do with the age of feature flags; it's not related to metrics or the client side. It's more like a feature flag management dashboard: here are all my feature flags in the project, here's when they were created, this one is rolled out to 100% so you can probably turn it off — with the ability to filter and sort. That thing is a little bit different; it's more about the index page. Maybe it's poorly worded, yeah.
A: Cool, anything else?