Description
Whatnot is the fastest-growing e-commerce marketplace in the USA.
It specializes in livestream auctions for collectors and enthusiasts.
Stephen Bailey led Whatnot's migration from Airflow to Dagster to boost the Data Engineering team's productivity. Starting with 50 Airflow DAGs running on mostly-daily cadences, the team now runs one hundred thousand individual Dagster jobs a month on myriad schedules.
Here is his advice to other data practitioners.
Bailey: I'm a data engineer at Whatnot. Whatnot is a livestream shopping platform, kind of like Twitch meets eBay, where we allow enthusiasts to make a living selling things they're passionate about.
So, if you're in the position I was in, where you're evaluating orchestrators and you're going to commit to one for the foreseeable future, the first thing I would say is: Dagster is fully production-ready. Coming into this exploration, you have to ask whether a startup is going to be stable enough for you to build your platform on. We've had zero issues with stability in the platform from the beginning.
The second thing I'd say is to focus on the value-add part, which is an ergonomic experience for your developers, and on agility: getting new pipelines out and plugging them into your existing architecture. To that extent, I think Dagster offers a couple of really great features.
The first is the built-in GitHub Actions; the second is serverless deployments. If you don't have to think about infrastructure for your first several pipelines, that's a great win, and it means you can focus on building out your graph of jobs and pipelines.
The third thing I recommend is using the new asset functionality. That's really where Dagster starts to differentiate itself from its competitors, and it also makes Dagster a little more compatible with other modern tools like dbt.
So the project plan was to lift and shift everything. That was the goal. We started with a couple of key pipelines, moving them over to Dagster and really learning the Dagster workflow and developer experience, including things like: How does the Python package look? How do we set up CI/CD? How do we set up notifications?
How do we get accustomed to all of the new APIs? So we took a couple of trial pipelines, moved them over, ran them in production, and then deprecated them in the Airflow instance. Then there was a period of just migrating pipeline after pipeline until we had only a couple of residual ones left in Airflow. Those we kept going for quite a while, running at the same time as the Dagster pipelines, just as a fallback in case anything went wrong.
I think our experience was a little bit like dating, in that it took a couple of months to build trust in the relationship, trust in the new infrastructure we had stood up. We had some Kubernetes issues where pods were getting killed when things were getting scaled, so the migration takes a little bit of time for everyone to build confidence in it.
But then we found there's definitely a tipping point where having everything in Dagster became a much faster workflow, much more visible. You'd go back to our Airflow workflow and kind of be like, "Oh, what is this?" At that point we were really able to pull everything off the Airflow instance and just make it in Dagster, and during that process we improved most of our DAGs as we pulled them over.
You know, it's mostly Python modules, so we pulled over the functions and just kind of wrapped them and ran them like we would in Airflow. But that process was a great opportunity to go fix some pipelines and harden configurations that were a little squeaky in the Airflow pipelines. It was really a good chance for us to pursue that long-term vision of a very stable platform and refresh some of our core processes.
All of these data assets that you have out there can be Dagster-ized and connected to your other processes, and that's a huge unlock for our team. It allows us to move to a much more event-driven architecture, where dbt refreshes things and then everything downstream of those pieces refreshes too. That's the next level of orchestration we really wanted to get to: much more event-driven, not just a bunch of schedules where we didn't really know what was hitting what. So I would recommend starting there.
So: ignore infrastructure if you can, use all of the built-in CI/CD and actions that make things easy and nice, and then focus on building out your asset graph and leveraging all of that potential early. Those would be the three things I'd recommend.
There are some features in Dagster, like the assets, that make it a little easier and make it not just a net move but a net improvement. The migration challenge is going to be proportional to your DAGs, the size of your footprint in the orchestrator, and that can be a challenge.
I think the other piece there is what your DAG dependencies are, because if you have triggers in Airflow where one DAG is triggering another, and then custom alerting is built on top of that, those things do take time to migrate over.