Description
Dmitry Matasov from Bestplace talks about the business problem the company solves, the evolution of their data platform, and how they leverage Dagster in their workflows across GitLab, Jupyter, and even Google Sheets!
🎞 Slides 🎞
Bestplace & Dagster (Dmitry Matasov) ➡️
https://drive.google.com/file/d/1BSaQmSc9szcKTT16-B_HzwPIYUKuxe81/view?usp=sharing
🌟 Socials 🌟
Follow us on Twitter ➡️ https://twitter.com/dagsterio
Check out our GitHub ➡️ https://github.com/dagster-io/dagster
Join our Slack ➡️ http://dagster-slackin.herokuapp.com
Visit our Website ➡️ https://dagster.io/
Check out our Documentation ➡️ https://docs.dagster.io/
Hello everyone, my name is Dmitry, I'm the CTO of Bestplace. To give you some quick context on what we're actually doing with Dagster, I'll show you what Bestplace does. We're building a machine-learning-driven geoanalytical platform: for retail companies, to help them find the most profitable locations for new stores; and for consumer goods companies, to optimize their distribution strategy and to align their product mix with the actual customers around the stores they distribute to.
So it all started like this: we had a team of engineers and data scientists, and the client. Then it all changed, and now we're somewhere in here: we have many more clients, and to deal with them we now have a lot of analysts, not data scientists. So basically, we wanted everybody in this formula to be happy.
We wanted our data scientists to keep working with their pandas, Dask, and Jupyter notebooks, and we wanted our analysts not to dive too deep into coding, managing all the complicated machine learning stuff using configs: YAMLs or Google Sheets.
So we actually have several ways of collaborating with each other. The data science team collaborates with the analyst team through Jupyter: data scientists build new experimental methods, and the analysts can all code in Python and use those notebooks. We also all use GitLab as our main storage for configs and deployments, with branches, and the analysts can use it too, through its nice web IDE.
The best alternative was to build something like this ourselves, because the other pipeline engines and orchestrators were either too immature, too buggy, or too slow for a startup looking for its solution. So our own solution looked like this: we had a YAML config describing the model and the features we wanted to calculate, and it was run from a Jupyter notebook with a small script: import our library and run the pipeline.
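The "YAML config plus a tiny runner script" pattern described above can be sketched in plain Python. This is a minimal stdlib-only illustration of the idea, not Bestplace's actual library: the dict stands in for a parsed YAML file, and all names (`FEATURES`, `run_pipeline`, the feature keys) are hypothetical.

```python
# Registry of feature calculations such a library might expose.
# A point is a dict describing a candidate store location.
FEATURES = {
    "population_density": lambda point: point["population"] / point["area_km2"],
    "competitor_count": lambda point: len(point["competitors"]),
}

def run_pipeline(config, point):
    """Calculate every feature the config asks for, in order."""
    return {name: FEATURES[name](point) for name in config["features"]}

# This dict stands in for a parsed YAML config describing the model.
config = {
    "model": "store_revenue",
    "features": ["population_density", "competitor_count"],
}
point = {"population": 12000, "area_km2": 4.0, "competitors": ["a", "b"]}

print(run_pipeline(config, point))
# {'population_density': 3000.0, 'competitor_count': 2}
```

The appeal of the pattern is that analysts only ever touch the config, while the feature functions stay inside the library.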
A
Then
we
had
more
of
the
pipelines
and
not
each
of
them
were
about
the
predicting
the
new
locations,
but
about
the
other
things,
and
we
wanted
them
to
be
scheduled
and
to
be
reproducible.
So that's why it was a no-go for our analysts, and they went this way. Finally, about half a year ago, we came back to Dagster, and it turned out it was mature enough for us to try again. So we tried Dagster, and it actually helped us with several things: keeping our pipelines reproducible and version-controlled, using our Jupyter notebooks with Papermill, and configuring everything in a YAML-friendly way.
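The "YAML-friendly" configuration mentioned here maps onto Dagster's run config, which in the solids-era API was keyed by solid name under `solids:` and by resource under `resources:`. A hedged sketch of roughly what such a run config could look like (the solid name, config keys, and resource settings are assumptions, not Bestplace's actual files):

```yaml
solids:
  calculate_features:        # hypothetical solid name
    config:
      features:
        - population_density
        - competitor_count
resources:
  io_manager:                # where intermediates land (e.g. their S3 server)
    config:
      s3_bucket: intermediates
```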
That's how our development flow looks. In the shared production environment we have our own self-deployed S3 server for intermediates and log storage, PostgreSQL for run storage, and a Docker plus Dagster-Celery deployment. When the developers work, they start everything with Ansible: we deploy with Ansible both locally and in production, and they deploy a local version of Dagster.
A
Why
jupiter?
Because
it
is
blazingly
fast
in
updating
your
pipelines
code,
you
can
just
restart
the
kernel
and
run
it.
You
don't
need
to
wait
when
dexter,
when
dagit
will
catch
the
new
code
and
restart
and
the
other
things
is
debugging
and
profiling.
You
can
actually
we're
working
working
load
with
data
frames
and
pandas,
and
we
wanted
to
display
it
nicely
and
to
have
profiling
and
ipad.
So
it
looks
like
this.
A
You
just
develop
your
solids
and
by
charm,
for
example,
and
run
it
with
jupiter
locally
deployed
having
this
yamu
config
and
loading
it
and
running,
execute
pipeline
yeah
we're
having
this
nice
tequiliums
and
we
can
preview
everything
just
in
line
yeah.
What
comes
next?
We
commit
it
to
gitlab
and
deploy
it
to
the
shared
environment
and
yeah
it.
You
can
also
test
it
locally
on
the
exact
same
deployment
as
in
production
having
the
nice
stuff
with
presets
yeah.
A
Here
it
looks
like
this
we're
building
versioned
containers
with
code
and
run
it
with
daggett
and
having
the
same
jupiter
in
production.
So
you
can
tweak
something
or
explore
the
failure.
A
So
in
gitlab
we
just
deploy
it
with
sensible
and
have
our
nice
pipelines
and
deck
it
yeah.
You
see
we're
we're
using
tequila
in
dec
yeah,
and
how
does
the
analyst
workflow
looks
like
so
it
starts.
A
It
consists
of
four
elements:
the
developers
solids,
which
are
robust,
documented
and
optimized,
and
you
can
use
them
as
these
from
the
library.
The
second
way
is.
The
second
element
is
analyst
solids,
so
there
are
paper
mill
driven
jupiters
that
are
business,
specific
they're
visual.
You
can
do
graphing
and
displaying
data
frames,
making
your
own
ad-hoc
little
parts
of
the
pipelines
and
I'm
a
kind
of
scientist
myself.
The
analyst
can
tweak
everything
you
hear
she
wants.
So
it
actually
looks
like
this.
Papermill will put the real parameters here and run the notebook top to bottom, and then you actually have two options in our repository. You can just commit your notebook and it will automatically become a solid with no inputs and no outputs, nothing bound; or you can define your own inputs and outputs definition, like this.
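A committed notebook with explicit inputs and outputs might be described by a small definition file alongside the `.ipynb`. The layout below is an assumption based on the talk, not Bestplace's actual schema; every key and name is hypothetical:

```yaml
# Hypothetical per-notebook definition committed next to the notebook
notebook: label_sales_points.ipynb
inputs:
  raw_points:
    dagster_type: DataFrame
outputs:
  labeled_points:
    dagster_type: DataFrame
```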
So these are two identical pipelines defining the same stuff, one in Python and one in YAML. You can reference the existing solids: take this one, rename it. And this one takes its inputs from the example solids, and it has an output named output, with the sum and the product somewhere here. So you can actually do forking with the YAMLs. And the fourth part of it is sophisticated configs from Google Sheets.
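The YAML pipeline definition described above (reference an existing library solid, rename it, wire its inputs to other solids' outputs, fork by reusing a solid) might look roughly like this. The schema is an assumption reconstructed from the talk, not a public Dagster format:

```yaml
pipeline:
  name: example_pipeline
  solids:
    - solid: example_source      # reference an existing library solid
      alias: source_a            # ...and rename it
    - solid: example_source
      alias: source_b            # forking: reuse the same solid twice
    - solid: combine
      inputs:
        left: source_a.output    # wire inputs to upstream outputs
        right: source_b.output
```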
A
It's
actually
quite
a
specific
thing:
we're
storing
google
sheet
that
can
define
way
to
process
your
data,
for
example.
This
is
a
labeling
solid
that
contains
the
rules
and
substrings
for
categorizing.
Some
points
on
map
like
data
sales
points
and
we're
just
having
a
function
in
our
library
to
download
it
from
there.
So
you
can
actually
pass
a
link
to
this
table
in
google
api
to
the
solid
config
and
it
will
download
it
and
use
the
configuration
file
and
yep.
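The "pass a link to the sheet into the solid config" step relies on Google Sheets' standard CSV export URL scheme. Below is a hypothetical stdlib helper in the spirit of the library function mentioned (the function name is mine, not Bestplace's):

```python
import re

def sheet_csv_url(share_link: str) -> str:
    """Turn a Google Sheets share link into its CSV export URL.

    Uses the standard docs.google.com export scheme; a solid could then
    fetch this URL (e.g. with pandas.read_csv) and treat the rows as config.
    """
    m = re.search(r"/spreadsheets/d/([a-zA-Z0-9_-]+)", share_link)
    if not m:
        raise ValueError("not a Google Sheets link")
    gid_match = re.search(r"[#&?]gid=(\d+)", share_link)
    gid = gid_match.group(1) if gid_match else "0"  # first sheet by default
    return (f"https://docs.google.com/spreadsheets/d/{m.group(1)}"
            f"/export?format=csv&gid={gid}")

print(sheet_csv_url("https://docs.google.com/spreadsheets/d/abc123/edit#gid=42"))
# https://docs.google.com/spreadsheets/d/abc123/export?format=csv&gid=42
```

Keeping the link in the solid config, rather than the sheet's contents, means analysts can edit the labeling rules in Google Sheets without touching the repository.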
So the proposal is this: don't substitute the parameters cell, but comment it out and leave it as-is, then insert the Dagster stuff, and at the end, near the teardown cell, add a commented-out commit statement. That way, when you play around with your notebook, find an issue, and fix it, you can just uncomment that statement and execute the cell: it will strip the unnecessary cells, uncomment the original parameters cell, and commit the notebook to GitLab.
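The cell layout proposed above can be sketched as plain Python, with comments standing in for the Papermill cell tags. The parameter names and the `commit_notebook` helper are hypothetical, used only to illustrate the flow:

```python
# --- original "parameters" cell: commented out instead of substituted ---
# city = "moscow"
# radius_m = 500

# --- inserted Dagster cell: the real parameters injected at run time ---
city = "spb"
radius_m = 1000

# ... the notebook body runs top to bottom with these values ...

# --- near the teardown cell: a commented-out commit statement ---
# After fixing an issue interactively, uncomment and execute; it would
# strip the scratch cells, restore the original parameters cell, and
# commit the notebook back to GitLab (commit_notebook is hypothetical).
# commit_notebook("label_sales_points.ipynb", message="fix labeling rule")

print(city, radius_m)
# spb 1000
```

The point of keeping the original parameters cell (commented out) is that the notebook stays runnable standalone: restore it, restart the kernel, and you are back in plain Jupyter.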