Description
Arun Kumar shares how the team at DoorDash uses Dagster to power their metrics layer and scale experimentation.
A
Hey everyone, my name is Arun and I work as a software engineer on the experimentation team at DoorDash. Are you able to see the slides? Okay, sorry, yes, moving on.
So today I'm going to talk about how DoorDash leveraged Dagster to power and scale experimentation analysis. Before getting into the talk, I want to give some quick context about who we are and what we do, for those of you who might not have heard about DoorDash. DoorDash is a local commerce platform that connects consumers with local businesses in multiple countries; it was founded in 2013.
We are building the infrastructure to enable a three-sided marketplace between merchants, consumers, and Dashers, where merchants provide services to consumers, consumers find their favorite local businesses within their locality, and Dashers deliver the goods from the merchants to the consumers.
DoorDash is truly a data-driven company. We have multiple data use cases, and we use Dagster for the ones that require dynamic orchestration: ML feature and training pipelines, forecasting pipelines, some dynamic Spark transformations, and experimentation analysis and reporting. Today I want to dive deep into the experimentation analysis use case and talk about a particular project that we recently implemented using Dagster.
Before getting into the project, I want to give a quick overview of what experimentation means, for those of you who might not have heard much about it.
Experiments are commonly used as an approach to making data-driven decisions. They help us statistically test the efficacy of any new feature that we want to introduce into our platform. Instead of just making decisions based on instinct, we look for statistical evidence to prove that the feature is really beneficial for our company's key performance metrics.
There are different types of experiments currently being performed in the industry, and A/B testing is probably the most commonly used experiment type. Here is how it works: let's say we have a search team that wants to test a particular search algorithm. We do not roll it out to the entire audience directly. What we usually do is bucket the users into control and treatment, and show the new search algorithm only to the treatment users, while the control users still see the old algorithm. Then we incrementally roll more users into the treatment group and measure various metrics that would demonstrate an improvement in the search algorithm, like click-through rate. We only ship more users into treatment when we see some upward movement in the metric for the treatment users.
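To make that decision rule concrete, here is a minimal, hypothetical significance check on click-through rates using a two-sample t-test from SciPy; the numbers and the choice of test are purely illustrative and are not DoorDash's actual statistical methodology.

```python
# Hypothetical example: is the treatment click-through rate significantly higher?
# The data and the plain two-sample t-test are illustrative only.
from scipy import stats

control_ctr = [0.11, 0.12, 0.10, 0.13, 0.11]    # daily CTR, control bucket
treatment_ctr = [0.13, 0.14, 0.12, 0.15, 0.14]  # daily CTR, treatment bucket

t_stat, p_value = stats.ttest_ind(treatment_ctr, control_ctr)
if p_value < 0.05:
    print(f"Statistically significant lift (p={p_value:.3f}); ramp up the treatment.")
else:
    print(f"No significant movement (p={p_value:.3f}); hold or roll back.")
```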
So, let's say we are not seeing any positive movement in our company's key business metric: then we will decide not to ship that particular search algorithm and instead roll back the entire feature. This is how we make sure that any new feature we introduce into our product is actually moving our company's key business metrics in the right direction.
To give a quick overview of how experiments are run: an experiment is a multi-step process involving multiple stakeholders, such as product engineers, product managers, and data scientists. We have built an end-to-end experimentation platform that guides and streamlines the entire experimentation process.
In the first step, the product engineers implement the feature and configure the experiment, and then the experiment goes live. Once the experiment goes live, it starts generating data, and that data starts flowing into our Snowflake warehouse. Then the data scientists start analyzing the data using our experiment analysis platform, called Curie. Curie is our in-house experiment analysis platform that automates most of the data analysis process for experiments.
To start with, when a particular user opens the DoorDash app, they get exposed to an experiment, and an exposure log flows in real time through our streaming pipeline into a Snowflake table called experiment exposures. This log contains the information about which bucket a user got exposed to for a particular experiment. So, let's say for experiment A, the user with ID cx123 got exposed to the control bucket.
We will get a log that records this information, and it gets ingested into the experiment exposures table. Once we have this experiment context, the data scientists come to Curie, add the business metric definitions, and use Curie to start analyzing the data by joining the experiment exposures with the business metrics.
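As a rough sketch of that join, here is an illustrative Snowflake query that compares a delivery metric between buckets; the table and column names are assumptions made for illustration, not DoorDash's actual schema.

```python
# Illustrative exposures-to-metrics join; schema names are assumptions.
ANALYSIS_SQL = """
SELECT
    e.experiment_name,
    e.bucket,                               -- control vs. treatment
    AVG(m.num_deliveries) AS avg_deliveries
FROM experiment_exposures AS e
JOIN consumer_delivery_metrics AS m
  ON m.consumer_id = e.consumer_id
 AND m.event_date >= TO_DATE(e.first_exposure_time)
WHERE e.experiment_name = 'experiment_a'
GROUP BY e.experiment_name, e.bucket
"""

def fetch_bucket_aggregates(conn) -> list:
    # `conn` is any DB-API connection to the warehouse (e.g. the Snowflake connector).
    # Returns one row per bucket, ready for statistical comparison.
    with conn.cursor() as cur:
        cur.execute(ANALYSIS_SQL)
        return cur.fetchall()
```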
Now I'm going to dive into Curie: I'll give a quick bit of context about Curie and how we analyze experiments, and then jump directly into the project. So, how we started Curie: when we started to build our own in-house experimentation analysis platform, the primary goal was simply to standardize all of the experimentation analysis methodologies into a single platform.
A
So
for
metrics
we
started
pretty
much
easy.
We
adopted
a
bring
your
own
SQL
approach
and
allowed
users
to
bring
their
own
ad
hocs
equals
to
query
in
order
to
fix
the
required
metrics
data.
This
actually
worked
really
well
like
there
was
not.
There
was
no
friction
on
the
USS,
because
most
of
the
users
were
already
using
ad
hocs
equals
to
analyze
most
of
the
experiments,
so
it
increased
the
adoption
to
our
platform
and
people
started
using
our
standard
analysis
methodologies
by
just
bringing
in
an
ad
hoc
SQL
that
they're
already
using.
However, as we gathered more adoption, we started facing a lot of challenges with this ad hoc approach. The first one is standardization: we did not have a single source of truth for all the business metric definitions.
In addition, there is now a dependency on subject matter experts, because everyone needs to know the metric definitions, how to fetch a metric, and where the metric needs to be fetched from. They need to know this domain-specific information, which made it hard to analyze experiments.
The second one is scalability. Obviously we don't have any control over ad hoc SQL: users can just write their own queries, and sometimes they write expensive ones that we have no control over. The only lever we had was to add more machines and scale up the compute, so ad hoc SQL was extremely hard to scale.
The main reason for that is that we were doing a lot of redundant computation. Let's say a particular metric is being used by different experiments from different teams. We ended up recomputing those metrics from scratch again and again for each of those experiments, because we did not know what a particular metric definition was: the only option was to blindly execute the SQL provided by the users.
So we identified that the challenges we faced, mentioned in the previous slide, were primarily caused by the lack of standardization of metrics and the lack of centralization and scalability of metrics computation. To tackle these issues, we chose to build a metrics layer for experimentation, otherwise known as a semantic layer or metrics store. If you have been following the recent news about dbt, you have probably heard of the semantic layer.
To give a quick overview, a metrics layer is a centralized framework that translates the data in the warehouse, the tables and columns, into business metric definitions and dimensional definitions. Using our metrics layer, users build metrics as reusable data models through a declarative DSL framework, and they consume the same metric definitions from other platforms like Curie. Every metric creation goes through a standard approval process with the appropriate domain stakeholders.
So this is how our DSL looks today. When someone wants to create a metric, it's basically a two-step process: first onboard a source, and then define a metric on top of it. To dig a bit deeper into the source: a source here mimics a dataset or a table, defined by a SELECT SQL.
In this case, the SELECT statement is returning a measure called num_deliveries, and it's returning two identifiers called delivery_id and consumer_id. The measures, as I mentioned, are the basic quantifiable values which will later be aggregated into the metric, and the identifiers are basically the join keys used to join with other tables.
There is a lot of other metadata tracked in the source, which will be used in our Dagster jobs. Once the measure is created, we refer to that measure in the metric definition and just add an aggregation function on top of it. That's how we define a metric. In this case, let's say we are going to define the number of deliveries made on DoorDash.
We just use the measure that was defined in the source layer and add an aggregation function, like SUM, in order to be able to compute the number-of-deliveries metric. That's all the user has to do: they build these two YAML files, push them to GitHub, and the files go through a standard approval process. The metric definitions are then synced to our backend using our gRPC servers. Once these models are created, the metrics are automatically displayed in our experimentation platform.
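To make the two-step DSL concrete, here is a hedged sketch of what a source and a metric definition might look like. The field names and YAML shape are assumptions for illustration rather than DoorDash's actual schema; the YAML is embedded in Python only so the snippet can be loaded and inspected.

```python
# Illustrative source + metric definitions in the spirit of the DSL described above;
# field names and table names are assumptions, not the real schema.
import yaml  # pip install pyyaml

SOURCE_YAML = """
name: deliveries_source
sql: |
  SELECT
    delivery_id,            -- identifier (join key)
    consumer_id,            -- identifier (join key)
    1 AS num_deliveries     -- measure (quantifiable value)
  FROM deliveries
measures: [num_deliveries]
identifiers: [delivery_id, consumer_id]
"""

METRIC_YAML = """
name: total_deliveries
description: Number of deliveries made on DoorDash
source: deliveries_source
measure: num_deliveries
aggregation: SUM   # aggregation applied at analysis time
"""

source = yaml.safe_load(SOURCE_YAML)
metric = yaml.safe_load(METRIC_YAML)
print(f"{metric['name']} = {metric['aggregation']}({metric['measure']}) over {source['name']}")
```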
So all the user has to do is go to our experimentation platform, create a config for their experiment, and then select the metrics from the dropdown. That's all. If the metrics are already available in our platform, they can skip the entire metric onboarding process and go directly to the UI to select the metrics.
With this, we were able to improve standardization. We started seeing users use common metrics across different experiments. As a platform team, we built some standard collections of metrics and auto-configured them for various experiments. We also saw a lot of non-technical stakeholders start analyzing experiments, because now they don't have to depend on the SQL definitions provided by the data scientists.
We also basically revamped the entire metrics computation engine on top of Dagster.
As I stated earlier, the original scaling concern came from the redundant computation that we performed. Because a single metric is used by different experiments across different teams, we repeatedly computed those metrics from scratch, performing all the redundant table scans and joins over and over. That was the primary root cause of most of our scaling problems.
In order to avoid this redundant computation, we first started materializing the measures defined in each source incrementally. The idea is to use those materialized source measures to compute metrics for different experiments from different teams, so that we don't repeat those redundant joins and scans of huge tables again and again. How did we do it? With Dagster: for every source that a user creates, we dynamically build a Dagster job and a sensor.
So, as I mentioned, if you look here, every source will have a Dagster job and a sensor. The sensor automatically tracks the upstream dependencies that are mentioned in the source definition; every SQL has to read from certain tables, so the sensor that we generate takes care of monitoring the upstream jobs that materialize those tables. Once the upstream dependencies are satisfied, the sensor triggers the source job to materialize the measures.
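Here is a minimal sketch of that pattern, dynamically building one Dagster job and one sensor per source; the `upstream_is_fresh` helper, the naming scheme, and the job body are assumptions for illustration, not DoorDash's actual implementation.

```python
# Sketch: build one Dagster job + sensor per user-defined source.
from datetime import date
from dagster import RunRequest, SkipReason, job, op, sensor


def upstream_is_fresh(table: str) -> bool:
    """Hypothetical freshness check; in practice this would query warehouse metadata."""
    return True


def build_source_job(source_name: str, select_sql: str):
    @op(name=f"materialize_{source_name}")
    def materialize_measures(context):
        # In the real pipeline this runs the source's SELECT SQL incrementally
        # and writes the measures into a Snowflake table.
        context.log.info(f"Materializing measures for {source_name}")

    @job(name=f"{source_name}_materialization_job")
    def source_job():
        materialize_measures()

    return source_job


def build_source_sensor(source_name: str, upstream_tables, source_job):
    @sensor(name=f"{source_name}_sensor", job=source_job)
    def source_sensor(_context):
        # Fire the materialization job only once all upstream tables are updated.
        if all(upstream_is_fresh(t) for t in upstream_tables):
            yield RunRequest(run_key=f"{source_name}-{date.today()}")
        else:
            yield SkipReason("Upstream tables are not yet updated")

    return source_sensor
```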
We also dynamically build a Dagster job and a sensor for every experiment, very similar to how we do it for sources: if a user comes to our UI and creates an experiment config, a Dagster job is automatically created for that experiment in the background. Again, these sensors wait for the source assets to be materialized. If you remember, there is an inherent relationship between the source and the metric through the measures.
So let's say a particular user wants to analyze experiment A, and they want to analyze metric B. That metric would depend on a certain measure from another source, right? So the sensors wait for those source assets to be materialized before starting to analyze that particular experiment.
Once the source has been materialized, the sensor triggers the Dagster job. Within the job, we join the experiment exposures with the materialized measures and then perform the aggregation to run all the statistical analysis. Once we have the experiment results, the job pushes them into a Postgres database, from which they are served in our Curie UI for the users.
That is the overall end-to-end flow of how we analyze experiments using the materialized sources today. Next, I'm going to talk about some specific aspects of our platform and the pipeline, and dive deep into some special cases.
First, backfills. As mentioned, all of our materialization jobs are incremental and partitioned by date.
Dagster has first-class support for partition-based backfills, so it's quite easy for us to run backfills without having to do anything manually: all we do is go to the backfill UI, select a set of partitions, and start triggering backfills. We also trigger the same backfills programmatically.
We allow users to trigger these backfills using GraphQL directly from our experimentation platform, so that they don't have to go into the Dagster UI or find the corresponding jobs.
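As an example of what triggering a backfill over GraphQL can look like, here is a hedged sketch that posts Dagster's launchPartitionBackfill mutation with `requests`; the endpoint URL, repository names, and the exact mutation and parameter fields are assumptions that should be checked against your Dagster version's GraphQL schema.

```python
# Hedged sketch: trigger a partition backfill through Dagster's GraphQL API.
# Field names follow the launchPartitionBackfill mutation but may differ across
# Dagster versions; the URL and repository names are illustrative.
import requests

DAGSTER_GRAPHQL_URL = "http://dagster-webserver.internal/graphql"  # hypothetical

BACKFILL_MUTATION = """
mutation LaunchBackfill($backfillParams: LaunchBackfillParams!) {
  launchPartitionBackfill(backfillParams: $backfillParams) {
    __typename
    ... on LaunchBackfillSuccess { backfillId }
    ... on PythonError { message }
  }
}
"""

def trigger_backfill(partition_set_name: str, partition_names: list) -> dict:
    variables = {
        "backfillParams": {
            "selector": {
                "repositorySelector": {
                    "repositoryName": "experimentation_repository",        # illustrative
                    "repositoryLocationName": "experimentation_location",  # illustrative
                },
                "partitionSetName": partition_set_name,
            },
            "partitionNames": partition_names,
        }
    }
    resp = requests.post(
        DAGSTER_GRAPHQL_URL,
        json={"query": BACKFILL_MUTATION, "variables": variables},
        timeout=30,
    )
    resp.raise_for_status()
    return resp.json()
```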
I just wanted to add one callout here: by default, Dagster creates one job run for each date partition, which can be quite expensive, particularly when you're building Snowflake-based SQL pipelines, because this can end up in numerous job runs, each triggering one SQL statement to backfill a single day's worth of data.
This was quite a big problem for us, and initially we chose not to use the backfill UI due to this limitation. However, with recent Dagster versions, we now have a feature where we can batch multiple dates into a single run. So when you run a backfill for a specific date range, I think it is better to batch multiple date partitions into a single run rather than creating one run per date partition, particularly when you are building SQL-based pipelines. When we moved from single-partition runs to batched backfills, we saw close to a 5 to 10x improvement in our backfill performance.
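For reference, here is a hedged sketch of that batching idea. Recent Dagster versions expose it for partitioned assets through BackfillPolicy.single_run(), so one run covers the whole selected date range; our own pipelines are described above in terms of jobs, so the asset shape, table reference, and commented-out query below are illustrative assumptions rather than our exact setup.

```python
# Illustrative sketch: batch a whole date range into a single backfill run.
from dagster import AssetExecutionContext, BackfillPolicy, DailyPartitionsDefinition, asset


@asset(
    partitions_def=DailyPartitionsDefinition(start_date="2023-01-01"),
    backfill_policy=BackfillPolicy.single_run(),  # one run covers the selected range
)
def source_measures(context: AssetExecutionContext) -> None:
    window = context.partition_time_window  # start/end of the backfilled range
    context.log.info(
        f"Backfilling measures from {window.start} to {window.end} in one Snowflake query"
    )
    # run_snowflake_query(  # hypothetical helper
    #     f"... WHERE active_date >= '{window.start}' AND active_date < '{window.end}'"
    # )
```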
One other thing is the look-back period. Most of our jobs are designed to be look-back aware. What do we mean by look-back? As I mentioned, most of our jobs are incremental; however, some of the upstream dependencies that our sources depend on are not purely incremental.
For example, we have some fact tables that can change the last 90 days of data in a daily pipeline. If we have a source that depends on those upstream dependencies, the user adds something called a look-back period, which specifies the number of days that could change in the upstream dependencies, say the last 10 days, and we basically use Dagster's partition support to rebuild the last N days' partitions on a daily basis.
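A minimal sketch of that look-back rebuild, assuming a daily-partitioned job and a schedule that re-requests the last N partitions every day; the partition start date, cron string, and job body are illustrative, and RunRequest with a partition_key requires a reasonably recent Dagster version.

```python
# Sketch: rebuild the last N daily partitions on a schedule (look-back period).
from datetime import timedelta
from dagster import DailyPartitionsDefinition, RunRequest, job, op, schedule

partitions = DailyPartitionsDefinition(start_date="2023-01-01")

LOOKBACK_DAYS = 10  # illustrative; in practice this comes from the source definition


@op
def materialize_source(context):
    context.log.info(f"Rebuilding partition {context.partition_key}")


@job(partitions_def=partitions)
def source_materialization_job():
    materialize_source()


@schedule(job=source_materialization_job, cron_schedule="0 8 * * *")
def lookback_schedule(context):
    # Re-request the last N daily partitions so late-arriving upstream changes
    # (e.g. fact tables that restate recent history) are picked up.
    today = context.scheduled_execution_time.date()
    for offset in range(1, LOOKBACK_DAYS + 1):
        key = (today - timedelta(days=offset)).strftime("%Y-%m-%d")
        yield RunRequest(run_key=f"lookback-{key}", partition_key=key)
```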
Dagster's partition support makes this very easy and provides a clear audit trail of when the asset partitions actually changed. We also designed our pipelines to be self-healing: if a job fails to run on certain days, the next pipeline run will automatically catch up and backfill the unprocessed data based on the last-updated timestamp in the table. These steps ensure that the data is always up to date and complete.
Dagster is completely abstracted away from the users. Our jobs are dynamically orchestrated based on the models available in the backend, so if a user creates a new model, it should automatically be reflected in the Dagster repository without needing manual intervention. To enable that, we use a workaround suggested by the Dagster team; thanks to Daniel specifically for this workaround.
We implement the repository data class and override the logic to fetch the jobs, sensors, and schedule definitions. While creating the repository, we start a background thread that periodically fetches and refreshes all these definitions. This made it possible for any new jobs to be created automatically without requiring a Dagster deployment.
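Here is a hedged sketch of that workaround. RepositoryData is part of Dagster's repository machinery, but the exact methods to override vary across versions, so the ones below and the backend fetch helper are assumptions meant to show the background-refresh idea rather than a drop-in implementation.

```python
# Hedged sketch of a dynamically refreshing Dagster repository.
import threading
import time
from dagster import RepositoryData, repository


def fetch_definitions_from_backend():
    """Hypothetical helper: pull source/metric models from the gRPC backend and
    turn them into Dagster job, sensor, and schedule definitions."""
    return {"jobs": [], "sensors": [], "schedules": []}


class MetricsRepositoryData(RepositoryData):
    def __init__(self, refresh_seconds: int = 300):
        self._definitions = fetch_definitions_from_backend()
        # Background thread keeps the in-memory definitions fresh, so new user
        # models show up without redeploying the Dagster code location.
        thread = threading.Thread(
            target=self._refresh_loop, args=(refresh_seconds,), daemon=True
        )
        thread.start()

    def _refresh_loop(self, refresh_seconds: int):
        while True:
            time.sleep(refresh_seconds)
            self._definitions = fetch_definitions_from_backend()

    # Method names below are assumptions; check your Dagster version's API.
    def get_all_jobs(self):
        return self._definitions["jobs"]

    def get_all_sensors(self):
        return self._definitions["sensors"]

    def get_all_schedules(self):
        return self._definitions["schedules"]


@repository
def experimentation_repository():
    return MetricsRepositoryData()
```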
Next, metrics and monitoring. As a platform team, since we are responsible for the entire orchestration, we measure various metrics to make sure that the platform is healthy, and we use Prometheus to record most of these metrics. Again, we use Dagster primitives to enable this: metrics are recorded from run status sensors, where we use the Dagster API to fetch the job metadata, such as job latency, queuing time, and error status, and then push it to Prometheus from within the run status sensors.
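A minimal sketch of that recording path, assuming a Prometheus push gateway; the gateway address, metric name, and labels are illustrative, and the run-stats fields should be checked against your Dagster version.

```python
# Sketch: record job latency from a run status sensor and push it to Prometheus.
from dagster import DagsterRunStatus, RunStatusSensorContext, run_status_sensor
from prometheus_client import CollectorRegistry, Gauge, push_to_gateway

PUSHGATEWAY = "prometheus-pushgateway:9091"  # hypothetical address


@run_status_sensor(run_status=DagsterRunStatus.SUCCESS)
def record_job_latency(context: RunStatusSensorContext):
    run = context.dagster_run
    stats = context.instance.get_run_stats(run.run_id)
    latency = (stats.end_time or 0) - (stats.start_time or 0)

    registry = CollectorRegistry()
    gauge = Gauge(
        "dagster_job_latency_seconds", "End-to-end job latency",
        ["job_name"], registry=registry,
    )
    gauge.labels(job_name=run.job_name).set(latency)
    push_to_gateway(PUSHGATEWAY, job="dagster_metrics", registry=registry)
```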
We measure various important metrics related to jobs, like counts, latencies, failure rates, and queuing time. By measuring the queuing time, we know when a lot of jobs are piled up in the queue, and then we can go and scale up the machines or change the concurrency limits. We also measure the number of lagging jobs, to see whether too many jobs are lagging because they missed a particular trigger.
We also have detailed alerts and notifications. Based on the Prometheus metrics that you saw in the slide, we configured multiple alerts to proactively identify and fix issues at the platform level. We have strict thresholds for job wait times, job queuing times, job latencies, and sensor latencies, so that we get alerted before there is a bigger issue.
We also heavily use Dagster's Slack integration, and we have configured multiple alerts using it, like job failure alerts that go directly to the source owners. We also have something called the Dagster daily report, which provides a bird's-eye view of all the jobs that ran on the previous day: the jobs that are consistently failing, the jobs that are taking too much time to complete, and the jobs that are lagging.
Lastly,
like
we
use
heavily
used
adaptive
graphql
endpoints,
to
integrate
with
most
of
our
internal
tools
here
in
the
snapshot,
you
can
see
how
a
screenshot
from
our
experimentation
platform,
we
heavily
use
the
dragster
UI
in
order
to
show
all
the
job
statistics
and
asset
metadata
directly
on
our
platform,
so
the
users
have
to
don't
have
to
jump
between
Dexter
and
the
platform.
We also use the Dagster GraphQL API to surface all this metadata in our internal data catalog tool. Some of the metadata included is the Snowflake query URL used for the job and the row count, and we also link the job run URL directly, so that if there are any issues, users can jump straight into the job run that was responsible for that operation.
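For completeness, here is a small hedged sketch of fetching run status over GraphQL with the official dagster-graphql client so an internal tool can render it next to the linked run; the host and port are illustrative.

```python
# Hedged sketch: look up a run's status via Dagster's GraphQL API for display
# next to the linked job run URL in an internal tool.
from dagster_graphql import DagsterGraphQLClient

client = DagsterGraphQLClient("dagster-webserver.internal", port_number=3000)


def run_status_for_ui(run_id: str) -> str:
    # Returns e.g. "SUCCESS", "FAILURE", or "STARTED".
    return client.get_run_status(run_id).value
```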
From the scalability perspective, the metrics computation framework resulted in a 10x improvement in average experiment analysis latency compared to the previous ad hoc SQL approach, enabling users to make decisions and ship more features quickly. The improvement in scalability also enabled us to work on some advanced features like automated analyses, and helped us improve the overall reliability of our experimentation results.
Currently, we have been able to standardize most of our experimentation metrics, but the semantic layer, or metrics layer, as most of you might know, has much broader applications across a lot of other data-driven use cases. We are working to replicate this success in other areas like business intelligence, exploratory analysis, forecasting, and so on, and we're currently trying to find new use cases beyond experimentation for the same business metrics.
This entire presentation is actually a condensed version of the blog post that I wrote last month. If anyone wants to know more about this, I would recommend giving the blog a read. I can share the link in the Zoom chat once I'm done with the presentation.
That's all I have for today. I'm open to any questions, but I'm not sure if we have enough time; otherwise I'll be hanging out in the Zoom so I can answer any of the questions.
B
We have time for a couple of questions. Thanks, Arun, that was super interesting. The first question is: do you run backfills automatically, by detecting changes in the code, or manually?
A
That's a great question. Currently we are doing it manually, but we do have plans to do it automatically. Right now, whenever someone makes any changes to the source definition, they run the backfill manually, although we have some automation: users don't have to go into Dagster to find their jobs and trigger them, because we use the Dagster GraphQL integration to do it automatically from Curie, from the experimentation platform.