Description
In this month's Dagster Community Call, Rex Ledesma provided an overview of the new dbt™ integration and Georg Heiler shared his dbt™ use case: Unlocking Advanced Metadata Extraction with the New dbt™ API.
A
Okay, great. Thanks everyone for joining, and welcome to the Dagster community call; we have this meeting every other month. So let's get into the agenda for today. First we have Rex, who's going to walk through the dagster-dbt integration updates, and then we'll see those updates in use from Georg, who will show how he wrote a blog post and how he used them in different situations with the metadata. So, over to Rex, who'll be going through dagster-dbt.
B
Okay, hopefully everyone can see my screen here. I'm going to go over the API changes that we've made to dagster-dbt in our 1.4 release.
B
We basically revamped the API for using dbt and Dagster together. There's now a more ergonomic API to create software-defined assets, we give you a CLI to scaffold a Dagster project given a dbt project, and much more. This is a basic overview of the changes we've made, as well as some resources you can look at if this is of interest. So let's get started with what motivated this.
B
Our dbt integration is very, very popular: over 50% of our users use it. Once we released it back in 2022, with an integration with software-defined assets, a lot of the same recurring questions came up as people used the integration more extensively. People wanted to chain their dbt assets with upstream and downstream computations.
B
As Lorenzo pointed out here, people wanted to customize the metadata about the assets that they were creating in Dagster from their dbt assets. People wanted more flexibility in what they were able to execute using dbt.
B
For example, Stephen here is asking how he can use Slack to send a message after his dbt models have executed properly, and Dennis is asking how he can run other steps before his dbt run step.
B
The second question is: how do I customize the execution of my dbt assets? Rather than just running dbt run or dbt build, how can I add multiple dbt commands, or how do I use other Python libraries alongside my dbt executions?
B
The third is: how do I customize my asset attributes? How do I change my op name, how do I change my asset keys, et cetera. And then fourth: how do I add dependencies to my dbt assets? How could I make it so that my Fivetran or Airbyte assets kick off before my dbt assets, or materialize a dashboard after my dbt models have executed? How do I do that in an ergonomic way?
B
And so these are the four problems that we were tackling in this revamp, and we came up with a set of APIs that address these underlying problems.
B
So here's the new API that we have. This is very similar to the decorator-based approach we give for a raw op or a raw asset.
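The code shown on screen isn't in the transcript, so as a rough, stdlib-only sketch of what a decorator-based asset API looks like (the real integration uses the `@dbt_assets` decorator from `dagster-dbt`; the registry below is a hypothetical stand-in, not the actual implementation):

```python
# Hypothetical sketch of a decorator-based asset API, loosely mimicking
# the ergonomics of dagster-dbt's @dbt_assets. Not the real implementation.
ASSET_REGISTRY = {}

def dbt_assets(name):
    """Register a function as the compute for a named group of dbt assets."""
    def decorator(fn):
        ASSET_REGISTRY[name] = fn
        return fn
    return decorator

@dbt_assets(name="my_dbt_assets")
def my_dbt_assets():
    # In the real API this body would invoke dbt (e.g. `dbt build`)
    # and yield materialization events for each model.
    return ["dbt build"]
```

The point of the decorator style is that the function body is ordinary Python, so you fully control what runs inside it.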
B
Another example: with this new dbt_assets API, you can now run multiple steps, or multiple dbt commands, that execute on the same assets. For example, you can now execute a dbt run step and then a subsequent dbt test step. Before, this wasn't possible, but now it is with the new APIs.
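Independent of the dagster-dbt API itself, that two-step flow boils down to two CLI invocations against the same project; a minimal sketch that only assembles the commands (the `--target` value and project dir are assumptions):

```python
def dbt_command(step, target="dev", project_dir="."):
    """Build the argument list for a single dbt CLI step."""
    return ["dbt", step, "--target", target, "--project-dir", project_dir]

def run_then_test(target="dev", project_dir="."):
    """A dbt run step followed by a subsequent dbt test step."""
    return [dbt_command("run", target, project_dir),
            dbt_command("test", target, project_dir)]
```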
B
Another example: you can now integrate other resources into your dbt execution.
B
So in this example, we're sending a Slack message after running a dbt command. Here, we're running a dbt build step, creating events from it, and then, if it's not successful, we send a message to a Slack channel saying that it failed. And this isn't limited to just a Slack resource.
B
Another example: people want to customize the asset keys that are created for their dbt assets.
B
Here we give you an API to define the asset key associated with any dbt model that's ingested by Dagster. So in this example, we define a translator object that ensures that any asset key created from this integration has a prefix called "snowflake", so that we can label all of our dbt assets accordingly.
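The real mechanism is a `DagsterDbtTranslator` subclass in dagster-dbt; the key-prefixing logic itself can be sketched with a plain function (the node dict below is a simplified stand-in for dbt's per-node properties):

```python
def prefixed_asset_key(dbt_resource_props, prefix="snowflake"):
    """Derive a Dagster-style asset key (a tuple of parts) for a dbt
    node, prepending a fixed prefix so all dbt assets share a label."""
    return (prefix, dbt_resource_props["name"])

node = {"name": "customers", "resource_type": "model"}
key = prefixed_asset_key(node)
```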
B
We also give you new APIs to create downstream dependencies on your dbt assets. For example, here we created our dbt assets, and now we want to define a downstream dependency in Python that takes in the customers model, which is defined in dbt, and we can run any sort of Python to create that clean customers model.
B
And here is a similar use case: we can define an upstream dependency for dbt assets. We can take a dbt source called my_source and translate that into a Python asset defined in Dagster, so that it becomes an upstream dependency of your dbt assets.
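Conceptually, both cases just add edges around the dbt nodes in one asset graph; a minimal sketch with a plain adjacency dict and a topological order (asset names are hypothetical, matching the examples in the talk):

```python
from graphlib import TopologicalSorter

# Asset graph: each key depends on the assets in its set.
# "my_source" is a Python asset feeding a dbt source (upstream);
# "clean_customers" is a Python asset reading a dbt model (downstream).
graph = {
    "my_source": set(),                    # Python asset, upstream of dbt
    "customers": {"my_source"},            # dbt model reading the source
    "clean_customers": {"customers"},      # Python asset, downstream of dbt
}

# Dependencies come before dependents in the computed order.
order = list(TopologicalSorter(graph).static_order())
```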
B
So those are the flavor of API changes that we've made to improve the customization of our integration, giving our users more levers to customize the compute, customize the metadata about their assets, and integrate it more seamlessly with the Dagster framework. And we culminated all of these API changes in a nice little quickstart for dbt users.
B
So now, with the release of our 1.4 APIs, we're actually giving users a CLI to scaffold a Dagster project given any dbt project. In this example, we're creating a Dagster project from the standard jaffle_shop project. You can see here the brand new CLI command to create this new jaffle_shop Dagster directory.
B
This has a set of definitions that let you load from this dbt project, and we've updated our tutorial to make use of this new CLI.
B
So you can see here, we've scaffolded this project, and when we go to load it, it automatically loads in all the assets from our dbt project. When we materialize these assets, this actually runs a dbt build step by default, and you can customize this. It just gives you a template to get started quickly with Dagster and dbt. And yeah, this is just showing a successful run using the scaffold.
B
And that's a quick overview. If people have any questions about these APIs we've released, there's API documentation on our website; we've updated our tutorial, as I said before, to use these new APIs alongside our dagster-dbt project scaffold CLI; we have a set of frequently asked questions regarding this integration; and we covered all of this, and our vision for Dagster and dbt, in last week's live stream event.
C
And I will try to get the videos back on the second screen, so I can keep watching you. I want to start with the blog post, to give you, let's say, a deep dive into how you can utilize these new APIs of Dagster and dbt, in conjunction with, on one side, parsing additional metadata out of the dbt YAML files or run results, and secondly, integrating dbt with a full-blown data governance solution, OpenMetadata.
C
So Rex, thankfully, has already introduced the basics, but let me show you how this might look in code. I don't know if the code on the shared screen is large enough; please tell me. Okay, I will try to make it a bit larger in any case. So first of all, dbt has the concept of a target environment, like dev and prod.
C
You have to tell the dbt CLI resource which target should be used. Here I'm passing the dev target, but you might want to use a different one; it's defined up here. The reason why this is different now is that initially, with, let's say, the first release of the API, you had to basically pass it into the CLI as an option, and I just monkey-patched the code example yesterday against the fully released API.
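At the CLI level, the target and profiles location amount to a couple of flags; a small sketch of assembling them (the flag names `--target` and `--profiles-dir` are dbt's own; everything else here is illustrative):

```python
def dbt_cli_args(command, target=None, profiles_dir=None):
    """Assemble dbt CLI arguments; target and profiles dir are optional,
    falling back to dbt's defaults when omitted."""
    args = ["dbt", command]
    if target:
        args += ["--target", target]
    if profiles_dir:
        args += ["--profiles-dir", profiles_dir]
    return args
```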
C
In any case, we can pass the dbt config settings like this. You may be required to add a profiles location if you're using a different location. But from this point onwards, the basics are set up and you can start to make use of the new API. Most interestingly, you may want to retrieve additional information from the dbt JSON documents, like the manifest, the run results (maybe with test results), or the catalog file from the documentation. You have to actually retrieve these artifacts, and the new API has a method to get hold of them.
C
The nice thing about the new API is that it allows you to stream the events in a quite neat way, and this works fine for, let's say, the default use case, where you only want to visualize the outcomes in Dagster as quickly as possible. However, when you want to actually work with these JSON documents and retrieve the results, these documents are only written after dbt has finished.
C
So basically, it is required that we block the computation and then retrieve the data. This also means that the logs in target will not be updating. You might be able to change it up a bit and add some more complex logic to keep the logs streaming and still block, but I didn't do this here, for the sake of keeping the example simple.
C
So first of all, you have to wait on the CLI's task to block until the dbt run is finished, and once it is finished, you can retrieve the artifacts as desired. The events are then all available at once; they are not emitted in a streaming way, one after the other as they happen.
C
But after the blocking is finished, you have to retrieve the individual structures from the document. As Dagster is using the asset key notation, and dbt a slightly different notation with schema, database, table name and so on, you might be required to have a remapping or lookup table, which is basically inverting the dictionary, to look up between both systems. But once you have that in place, you can basically iterate over these events and extract the desired details from each event.
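A minimal sketch of that lookup table: build the forward map from dbt unique_id to a Dagster-style asset key, then invert it for lookups in the other direction (the manifest shape is heavily simplified; real manifest nodes carry much more):

```python
def asset_key_for_node(node):
    """Map a (simplified) dbt manifest node to a Dagster-style asset key."""
    return (node["schema"], node["name"])

manifest_nodes = {
    "model.jaffle_shop.customers": {"schema": "analytics", "name": "customers"},
    "model.jaffle_shop.orders": {"schema": "analytics", "name": "orders"},
}

# Forward map: dbt unique_id -> asset key; inverted: asset key -> unique_id.
dbt_to_dagster = {uid: asset_key_for_node(n) for uid, n in manifest_nodes.items()}
dagster_to_dbt = {key: uid for uid, key in dbt_to_dagster.items()}
```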
C
I was personally interested in retrieving two types of information. First of all, rows affected, which is somewhere in this JSON, and the second one is the compiled code. In the old version of Dagster, I was using the convenience feature of loading the dbt project right away, where basically the whole dbt project would be compiled on the fly, but this is a slow approach, and I wanted to move towards, let's say, a baked CI approach with a pre-parsed project.
C
So basically, I will only get the compiled code after actually running the code. This means the Dagster UI would not be able to show the nice documentation around the compiled SQL, and I wanted to get this back. Basically, by running the code in the dbt run task here and retrieving the executed, compiled SQL statement from the event, you can get back pretty much 99% of the initial functionality.
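In run_results.json, each result carries an adapter response (often including rows affected) and, in recent dbt versions, the compiled code; a sketch against a simplified structure (the field names follow dbt's run results artifact, but treat the exact shape as an assumption that varies by dbt version and adapter):

```python
def extract_run_details(run_results):
    """Pull rows affected and compiled SQL per model out of a
    (simplified) dbt run_results.json structure."""
    details = {}
    for result in run_results["results"]:
        details[result["unique_id"]] = {
            "rows_affected": result.get("adapter_response", {}).get("rows_affected"),
            "compiled_code": result.get("compiled_code"),
        }
    return details

sample = {
    "results": [{
        "unique_id": "model.jaffle_shop.customers",
        "adapter_response": {"rows_affected": 42},
        "compiled_code": "select * from raw_customers",
    }]
}
details = extract_run_details(sample)
```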
C
And yeah, you will also get the compiled SQL back in the UI. As explained before, that was necessary because I didn't pre-compile the dbt manifest, only parsed it. But last but not least, it is sometimes interesting to work with a more full-blown data cataloging solution, maybe for the sake of quickly accessing the documentation or working with assets beyond only dbt. There are multiple open-source projects around for this, like DataHub or OpenMetadata. I personally liked the OpenMetadata approach, and I'm also showcasing here how to integrate it.
C
The call that the CLI is doing is basically this one here: metadata ingest with the path to the config file. I have created a small CLI wrapper that is basically executing it from Python, and in order to make this work, we follow pretty much a similar pattern as before: we have to first call this wait method to block the task, and again we can then use the wrapper or utility functions to retrieve the artifacts.
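That wrapper boils down to shelling out to the OpenMetadata CLI (`metadata ingest -c <config>`); a sketch that separates command construction from execution so the runner is injectable (the config path is a placeholder, and the exact CLI flags should be checked against your OpenMetadata version):

```python
import subprocess

def ingest_command(config_path):
    """Build the OpenMetadata ingestion CLI command for a config file."""
    return ["metadata", "ingest", "-c", str(config_path)]

def run_ingestion(config_path, runner=subprocess.run):
    """Execute the ingestion; `runner` is injectable for testing."""
    return runner(ingest_command(config_path), check=True)
```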
C
However, because with the 1.4 release Dagster is modifying the path of the dbt outputs, so that it will allow for non-colliding parallel runs, we have to somehow pass the final output path downstream so that other tools can work with it. I was basically hard-coding this path, so not fully supporting parallel runs, but for the sake of my use case this was enough.
C
Why is this perhaps interesting? Because the data catalog not only has the assets, but it also has the pipelines, like the Dagster runs, and you might be able to show these to business users in a more streamlined way. It could also feed, let's say, machine learning models, Kafka topics and whatnot, and it can give you a neat way to surface the test results that you define in dbt tests to end users, to drive trust in the data assets, where you can show whether they were successful all the time or maybe not. Interestingly, it also surfaces dbt's default documentation.
C
That being said, this is the dbt demo part one, and I'm wearing multiple hats. So, first of all, I want to thank my university, where we were showcasing this.
C
That was the Dagster installation, and I was allowed to show you these screenshots from a production system. But with one of my other hats, I'm also working at, let's say, an enterprise company, where we are also starting to explore this new dbt integration, in a different way: we have a custom JVM-based, Spark-based data ingestion tool, and we want to integrate it with Dagster, where we intend to have this tool copy data over into the cloud environment and then retrieve the lineage from Dagster itself for this outside tool.
C
This is a very small demo of how it will look in the future, where you have this source asset that is living outside, you have an inside asset that is, in fact, depending on this outside one, that is managed from this ingestion tool, and then you have some dbt transformation kick off. The demo example would actually look like this, and again it quite neatly ties Dagster and its new API together.
C
Yeah, that's the second demo, in very short. I hope it was not too quick, because I could go into more depth there. If there are some questions after the call, you can reach out to me; maybe Rex will share how to contact me. And regarding the blog post, I think we can share it in the show notes, so if you want to look at it in more depth, it would be quite easy to read up. I'm also part of the Dagster Slack, so you can simply reach out to me there as well.
A
Boom, we do have two questions. One was on the column-level lineage: that was from OpenMetadata, is that correct?
C
Yes.
A
Cool, thank you. And one from Pedram, which is: is the dbt CLI running every time Dagster definitions are loaded?
C
So that depends on how you use it. In the old way, pre-1.4, you had two options. Let's say the easy mode was to simply run it every time Dagster is loading the definitions, and I was using that for the sake of convenience, but it is a bit slow. This is why I switched, when adopting the new API, to this pre-baked mode, where you can containerize with a pre-compiled image, or in my case a pre-parsed image, and then have it run only when you need it.
A
[inaudible question]
C
Only vaguely, but in general Dagster allows you to orchestrate anything. So if you write your custom code to do that, then you will be able to do it. Dagster doesn't have this integration out of the box, but you can easily roll it yourself by writing the custom code that is needed to get the job done. Yeah.
A
Okay, so any more questions on any of that, or even the dbt stuff?
A
Can you guys see my screen? No? Okay. So, other than that, just a couple of other call-outs: if you're interested in presenting on this call, feel free to reach out to us.
A
We have this call every two months, like I mentioned, and we're always looking for new people, or anyone who's interested in either collaborating on a blog post or presenting in this meeting itself, so we'll definitely reach out. And other than that, we are releasing a trial version of Dagster University, which is really meant for newer Dagster users. So if you have people on your team, or you're interested in taking a look, let us know and we'll add you to the Slack channel where you'll get the updates.