From YouTube: Data Engineer Internship - DBT sharding
Description
Data Engineer Internship - DBT sharding, Chris Sharp and Radovan Bacovic discussing how to shard DBT models per provider
A
There are extra options, or extra things you don't need, so I can show you here. This is okay, and this import is usually okay; with some linters later on we can check what is not in use. It's not a problem, but this is the catch: this pull commit is not needed for you, because it's used only for specific tags, in case you want to get the commit ID. In our case, you can just comment it out or delete it. This is not needed for us. We have an isolated problem, or an isolated DAG run.
A
I would say this part is not needed for the test, because this command, along with the select for our dbt job, will pick up everything regarding dbt and push it, I mean in this case, into a Snowflake table for the results of dbt. I think we don't need that anymore. Just for this testing we can put it in a comment or somewhere, but feel free to remove this. It's definitely not needed.
B
I haven't looked into that one. We'll just take it as far as the one marked just now, and then... so I think it's the one above that.
A
And what do you think: will we get any speed or any cost efficiency, saving money, by changing the model from full refresh to incremental, in your experience?
B
Yeah, we could have a look at it and see if any of them would benefit.
B
There's an updated... I don't know how the opportunities are modeled, but if it's an updated time, then yeah, we could have a look at that and see.
B
So I've pasted your changes in there, so yeah, I can see the dbt6 early here now.
A
Yes. How it's going in our case, I think this will be fine; don't worry about why it's not running. You have Airflow installed locally and running, right? When you execute your DAG, or a task in the DAG, it spins up a pod on Kubernetes. The cluster name is Facebook test, and for that reason our code will be executed there, not here on the local machine; it will be executed on the cluster, inside the pod. Yeah.
A
So for that reason you have a dynamic command to call dbt with --no-use-colors, and that command will be executed there on 25, because, you see, line 105 says: okay, the dbt6 model task equals KubernetesPodOperator. It says: okay, create a pod in our cluster, where it's defined or predefined somewhere, and run the command. Your name is here inside ebitivity.
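A minimal sketch of the kind of dynamic command described above, assuming the pod simply runs one dbt invocation per model selection. The function name and the model name `sfdc_opportunity` are hypothetical and not taken from the recording; wrapping the string in an Airflow operator is left out so the example stays self-contained.

```python
import shlex


def build_dbt_command(model: str, subcommand: str = "run") -> str:
    """Build the shell command a pod could run for one dbt model selection."""
    # --no-use-colors is a global dbt flag, so it goes before the subcommand.
    parts = ["dbt", "--no-use-colors", subcommand, "--select", model]
    # Quote every part so an unusual selector cannot break the shell command.
    return " ".join(shlex.quote(part) for part in parts)


# Hypothetical model name, used only for illustration.
print(build_dbt_command("sfdc_opportunity"))
```

Building the command as a plain string keeps it easy to inspect locally before it is handed to whatever runs it on the cluster.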
B
Yeah, let's just get it pushed. So you can do this: commit with a message is gcmsg, and yeah. There are some good shortcuts, because I was typing out the same things over and over again.
A
Anyway, I can take the testing point. We just don't know why it takes so much time. What about focusing, maybe, on these dbt jobs and what we can do next for that? What do you have on your mind? I'm thinking, anyway, we can run it locally, right, as of now.
A
Yeah, yeah, this is a great source, of course. So as of now, we don't see a good benefit in, let's say, moving from full to incremental, but maybe for the sake of the experiment you'd prefer to check that part, at least for one, the first model or something, to see what we can get, right? Yeah.
B
It is quite complex, and I haven't sat down with it. Yeah, I think these numbers might change with the sort of sales targets, perhaps, so I think that's why there are the hard-coded numbers.
B
But yeah, it's a big, it's a large model.
A
I'm thinking, where is the... okay. If you go to that macro and search for the opportunity source, I mean the source table, we can maybe just add one more parameter to make it incremental, if you know what I mean. Because now you have one parameter; you can add a second parameter with a default of nothing, or incremental or full, and if you choose it to be incremental, we can... yeah.
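The idea of a second parameter with a default can be sketched as follows. This is a Python stand-in for the macro, not the macro itself: the function name, the `updated_at` column, and the shape of the filter are assumptions made for illustration. `{{ this }}` is how dbt refers to the already-built target relation.

```python
def source_query(table: str, mode: str = "full") -> str:
    """Render the SELECT for a source table; the default keeps a full refresh."""
    if mode not in ("full", "incremental"):
        raise ValueError(f"unknown mode: {mode!r}")
    query = f"SELECT * FROM {table}"
    if mode == "incremental":
        # Assumed filter: only rows newer than what the target already holds.
        # The doubled braces emit a literal {{ this }} for dbt to resolve.
        query += " WHERE updated_at > (SELECT MAX(updated_at) FROM {{ this }})"
    return query


# Existing callers pass one argument and get the same query as before.
print(source_query("opportunity_source"))
print(source_query("opportunity_source", mode="incremental"))
```

Because the new parameter defaults to the full behaviour, models that already call the macro with one argument would not need to change.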
B
And then we can, yeah.
A
Yeah, as I said, in real life, for this example, we determined there is no big benefit in choosing incremental over full, but I mean, just for playing, we can do that as an exercise for us, nothing more than that. This is super complex, I would say.
B
Yeah, there's a lot of sources. Yeah.
A
Yeah, the catch here is that that macro, simple_cte, is complex, right, because you can't parameterize anything there. It will select star. Yeah, that's the... but you'll see, because I dealt with similar stuff and wanted to check something, and with simple_cte you have select star from that table. Yeah.
A
I think, maybe for this experiment, we could create some kind of simple_cte incremental macro.
A
It could be tuples, and at the end of the day it could be, let's say, a date name or a column name. Imagine if in one model it is updated date and in another model it's update time. Usually it's unified, but yeah, if it's not, that's the reason for the parameter, or maybe something like that, you know what I mean. So, from my point of view, it should pick that up.
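A rough sketch of the tuple idea, written in Python rather than as a dbt macro. It assumes each entry is either (alias, table) for a plain select star or (alias, table, column) when that model should be filtered incrementally on its own timestamp column. The function name, the table names, and the filter are hypothetical, and the sketch also assumes the target relation carries the same column.

```python
def simple_cte_incremental(ctes, target="{{ this }}"):
    """Render a WITH clause from (alias, table[, incremental_column]) tuples."""
    rendered = []
    for alias, table, *rest in ctes:
        body = f"SELECT * FROM {table}"
        if rest:
            # An optional third item names the timestamp column for this model.
            column = rest[0]
            body += f" WHERE {column} > (SELECT MAX({column}) FROM {target})"
        rendered.append(f"{alias} AS ({body})")
    return "WITH " + ",\n".join(rendered)


# One model filters on updated_date; the other keeps the plain select star.
print(simple_cte_incremental([
    ("opportunity", "src_opportunity", "updated_date"),
    ("account", "src_account"),
]))
```

Keeping the column name per tuple covers the case where the timestamp column is not named the same way in every model.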
A
This is the top one. I mean, the lineage is very simple if you have five models, but the logic inside the model is complex, I would say. I don't know, even this thing is repeatable, so it can also be the subject of some future optimization here, with more macros. But still, for us, let's take something like simple_cte incremental, and as I said, sort it out for one model of your choice, and when we fix that we can apply the same. Yeah.