From YouTube: Delta Lake Community Office Hours 2021-09-16
Description
Join the Delta Lake Community Office Hours to ask your questions on all things Delta Lake! You can also join us in the Delta Lake Users Slack at: https://dbricks.co/delta-users-slack. Thanks!
A
I'll start with that. I was messing around with the live stream, and here it is. So, okay, perfect.
A
QP and Ryan, you guys are joining, awesome. We'll see if anybody else joins. I'm actually currently live streaming as well to see if anybody has asked us questions. Excellent, Misha, I see you on YouTube. If you have any questions, why don't you go ahead and chime in: just type them in and we'll do our best to answer. We'll do some introductions in a second; we were just working through some of the technical issues, like usual.
A
And QP and Ryan, if you can just unmute yourselves, that way you can talk whenever you're ready.
A
All right, let's see. Yeah, excuse me, sorry. And I'll fix these Zoom links shortly for all the attendees, so I can make you guys panelists. It was a mad dash because I mistakenly forgot that a Zoom meeting and a Zoom webinar are different things, so I couldn't broadcast it to YouTube. And hey, Misha, did you want to ask any questions, by the way? If you do, you're more than welcome to just chime in and type them in here.
A
Okay. Well, meanwhile, in that case, let's just jump-start it. Welcome to the first of what will be biweekly (every two weeks) Delta Lake Community Office Hours. We're starting off a little light today, but fortunately we actually have multiple people attending, and we're probably just going to ask each other questions.
A
We'll also talk a little bit about the Delta Lake roadmap, so for any of the folks who are watching this after the session, not a big deal, you can go ahead and dive in and learn a little bit about the roadmap as we speak. So, for starters, I'm going to introduce myself. My name is Denny Lee. I'm a developer advocate here at Databricks, a long-time Brickster, and a long-time Apache Spark and Delta guy as well.
A
So I'm going to be here to answer some questions and also host a little bit. Next, Scott, do you want to introduce yourself?
D
Morning or afternoon, everyone. I don't think you can see me, but I'm Scott. I'm a software engineer on the Delta ecosystem team at Databricks. I'm really excited to be here.
A
Excellent, thanks. And next is QP, I think.
C
Yes, hi, I'm QP. I'm one of the engineers from Scribd, and I'm one of the maintainers of the Delta Lake Rust bindings and the native Python bindings.
A
Perfect, thank you very much, QP. And by the way, QP, Scott, and Ryan, you guys should be able to turn your cameras on now if you want to, because I made you all panelists. And last, but certainly not least, Ryan, why don't you introduce yourself?
E
I'm Ryan from Databricks. I'm a software engineer, and I have been working on Delta Lake for many years, basically since the beginning. Before Delta Lake, I also worked on Spark, mainly focused on the Structured Streaming project. Yeah, pretty excited to be here.
A
Cool, thank you very much. My apologies. Yes, you would figure I would know how to work with Zoom by now, but okay, it is what it is. All right, perfect. Well, Misha has chimed in on YouTube saying he doesn't have any questions yet, so not a big deal. But then I'll call on someone like QP. Go for it, you've got some questions, so let's...
C
Start with you. First question I was going to ask: are you going to show the awesome roadmap in this office hour?
A
Let's see. Okay, here we go, presenting the roadmap as we speak. Oh sorry, there.
A
Perfect. So do you have any questions, any call-outs? I mean, I've got a couple myself, but I want to see whether Scott or QP or Ryan want to call anything particular out, for starters. I'll just zoom in a little bit so it's easier to read.
E
Yeah, I think one big project is the Delta Standalone writer, which we are building to support writing without Spark, and which we will try to use to build a lot of connectors. Currently Scott is leading this project and working on it, and we expect this can be released in Q4, next quarter. So yeah, I'm pretty excited about this project.
D
Yeah, it's just a really exciting feature and project. It's coming along well, and we're looking forward to releasing it next quarter.
A
Cool. And so for anybody who is interested in more on it, you can chime in with us on the Delta Users Slack channel, the Delta Lake email distribution list, or directly on GitHub. In addition to the official roadmap, which calls out Flink, Presto, and Pulsar, those three, we recently have had pings from Beam as well, and there are actually some other folks too.
A
So Apache Heron and Beam, excuse me, those are the two that immediately come to mind. So if there are additional projects that would like to utilize a JVM-based writer, in this case the Delta Standalone writer, but not use Spark at all, then absolutely, please chime in, ask us questions, and talk to us about this. I can probably say this for Scott: we're actively working on it right now, but we're still early enough that we can take new requirements.
D
Yeah, just leave feedback because, like Denny just mentioned, we're in the early stages of development, so things can change if users want a different feature or a different exposed API, whatever we can do to help people build connectors.
A
Excellent, perfect. That's the big one, the Delta Standalone writer. QP, any callouts you want to make from the Rust API, especially because, for example, the kafka-delta-ingest project, which is related to Delta Rust, is actually in production? So anything that you particularly want to call out there?
C
Yeah, I'll just repeat that. We've been working on a Kafka to Delta Lake connector, not a JVM one but a native connector that's fully written in Rust, and it's been in production and running really well so far. So we don't have any major plans from Scribd's side. I know there are some new features that the community is planning to add to the Rust bindings.
C
But I'll just call out that the Delta Rust API for creating new tables is already completed, by one of the community members, Connor. And so, I'll repeat, I'm really happy with the Delta ingest performance so far.
A
Excellent. And one thing I'd love to add is that we're planning, I think in early October, to have a session specifically devoted to kafka-delta-ingest and the associated Delta Rust API. So if you want to dive deep on that, obviously you can talk to us in the Delta Users Slack channel, but at the same time, yeah, we are planning a session, so you can dive deeper there as well.
A
But of course, if anybody chimes in right now and has any questions, that's what these office hours are for. Cool, anything else that you'd like to add? I know that there's work from, for example, Nessie and lakeFS. We're also having some great work with the folks over at StreamNative with Pulsar. Gerhard has been doing some amazing work with the Power BI connector, so that's been extended even more.
A
So, anything we want to add about the Hive connector that you can think of, Ryan? Right now, I don't recall any updates yet per se.
E
Hive connector, yeah. We are planning to support Hive 3 this quarter, so we are working on Hive 3 support right now.
A
Excellent, okay, perfect. And then for any of you folks who are Delta Lake inclined but not so inclined to write any of the code, we actually have a lot of updates that we're planning to do to both the website itself and the documentation. So feel free to chime in on those GitHub issues and say that you're interested in helping, or that you want to see XYZ. That's a great way to get involved as well.
A
One other thing I did want to call out is that we will be sending out a Delta Lake user survey later today, in both Slack and email. We'd love to get your feedback about the Delta Lake project. And I want to call it out: this is the initial proposed roadmap. Right, yes, middle enough.
A
So I will call myself out on that one. But by the same token, we're still early enough that if we're seeing other prioritizations from the community, we're glad to change them accordingly. And in fact, anybody that chimes in and provides feedback will automatically get a Delta Lake t-shirt as well. Okay, so this is me broadcasting the t-shirt. So we'd love your feedback. Like I said, we'll be sending that out later today, and I'll attach it to this YouTube video as well.
A
So I just want to call that out. Again, we'd love to get your feedback. So that's it for that point. Any other questions from either the panel or from anybody that's currently on YouTube?
C
I have a question. Okay, so this is about timestamp handling. For context, in the Delta Lake spec we define the timestamp to be of microsecond precision, but when we write Parquet in Spark, by default it uses INT96, which is nanosecond precision. So I'm curious what the expected behavior is: as an alternative implementation, how should we handle this when there is a mismatch between the spec precision and what's physically being written out in Parquet?
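To make the mismatch in this question concrete, here is a minimal Python sketch of how a non-Spark reader might decode a Parquet INT96 timestamp and reduce it to microseconds. It relies on the commonly documented INT96 layout (8 little-endian bytes of nanoseconds within the day, followed by a 4-byte Julian day number). The function names are hypothetical, and treating truncation as the right behavior is an assumption here, not something the Delta Lake spec is claimed to mandate.

```python
import struct
from datetime import datetime, timedelta, timezone

JULIAN_DAY_OF_UNIX_EPOCH = 2_440_588  # Julian day number of 1970-01-01
MICROS_PER_DAY = 86_400 * 1_000_000


def int96_to_micros(raw: bytes) -> int:
    """Decode a 12-byte INT96 timestamp to microseconds since the Unix epoch.

    Hypothetical helper: sub-microsecond digits are truncated (an assumption).
    """
    if len(raw) != 12:
        raise ValueError("INT96 values are exactly 12 bytes")
    # Little-endian: 8 bytes of nanoseconds within the day, then a 4-byte Julian day.
    nanos_of_day, julian_day = struct.unpack("<qi", raw)
    days_since_epoch = julian_day - JULIAN_DAY_OF_UNIX_EPOCH
    # Integer division by 1000 drops the nanosecond remainder.
    return days_since_epoch * MICROS_PER_DAY + nanos_of_day // 1_000


def micros_to_datetime(micros: int) -> datetime:
    """Convert microseconds since the Unix epoch to a UTC datetime."""
    return datetime(1970, 1, 1, tzinfo=timezone.utc) + timedelta(microseconds=micros)
```

For example, a value holding 1,500 nanoseconds on the day after the epoch decodes to one day plus one microsecond; the trailing 500 nanoseconds are dropped.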
E
C
A
All right. Well, let's see, I'm checking YouTube one more time. Oh, here we go, we do have some questions coming in now, perfect. "Can you tell a good example from delta-rs Python to ADLS Gen2? Any samples available?" So I'm going to direct that to you, QP, if that's okay.
C
So there is a really lengthy GitHub discussion in the delta-rs project. I'll send Denny the link, and maybe Denny can link to the issue. Long story short, there is some example code there, but it won't work out of the box due to some limitations in the ADLS Python filesystem library. There is a workaround for that, though.
A
Oh sorry, did you Slack me with the link? I'm still trying to find the link. Oh okay, no problem! Okay, sorry, I heard a Slack and I wasn't sure. Okay, Open Source Techie says, "I'm joining the Delta Lake community after almost two years." That's great, Open Source Techie. You can go ahead and chime in with your question, or of course you can always ping us on Slack.
A
So, oh, I think you're coming with a question, so yeah, I'll let you finish the question that you're trying to ask.
A
Okay, let's see here. Open Source Techie is asking the question: "I'm using Attunity Replicate. The issue is that records are coming in late. What is the best practice to use Delta Lake in a CDC system?" Okay, so specifically the idea is that you want to do change data capture. I'm going to start with a quick call-out, which is that Paul Roome and I actually did a tech talk.
A
I'll have to find the link and post it here, in which we talk about using Delta as a CDC source, and it basically calls out some of the best practices to do that. There was an ask in the open source community to potentially open source CDF, Change Data Feed.
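As an illustration of the late-arriving-record problem raised in the question, the following Python sketch shows one common pattern: only apply a change when its ordering value is newer than what the target already holds, and keep tombstones for deletes so that a late upsert cannot resurrect a deleted row. This is not taken from the talk mentioned above. The `Change` type, the `sequence` column, and `apply_changes` are all hypothetical names, and a plain dictionary stands in for the Delta table. In Delta Lake itself, a similar comparison could be expressed as a condition on a MERGE.

```python
from dataclasses import dataclass
from typing import Dict, Iterable, Optional, Tuple


@dataclass(frozen=True)
class Change:
    key: str
    op: str          # "upsert" or "delete"
    sequence: int    # hypothetical ordering column from the CDC source
    value: str = ""


# Target state: key -> (last applied sequence, value or None if deleted).
Table = Dict[str, Tuple[int, Optional[str]]]


def apply_changes(table: Table, changes: Iterable[Change]) -> Table:
    """Apply CDC changes in arrival order, ignoring records that arrive late."""
    for change in changes:
        current = table.get(change.key)
        if current is not None and current[0] >= change.sequence:
            # A newer (or equal) change was already applied: skip the late record.
            continue
        new_value = None if change.op == "delete" else change.value
        table[change.key] = (change.sequence, new_value)
    return table
```

Keeping the tombstone (a `None` value with its sequence number) is a design choice in this sketch; without it, an old upsert that arrives after a delete would look like a brand-new row.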
A
There have been discussions on whether we can do that or not. I'm not privy to what the current discussions are right now. I don't think we're against the idea of doing it per se; I think it's more a matter of trying to figure out what the priorities are. And right now the priorities, unequivocally, even though there's a lot of interest in CDC...
A
The priorities, at least so far from the community, seem to be much more about the integration points, which is why we've been focusing primarily on that. That said, I will find the CDC link shortly and post it here in this YouTube channel. And Young Luca, you said: "Can someone shed some light on why we need to strip the user info here?" Could you clarify what user information you're saying is being stripped?
A
So Yawn, yeah, if you can go ahead and get back to us and clarify what your question is, please. Meanwhile, oops, sorry. Meanwhile, I've just shared the CDC best practices video...
A
Directly into the YouTube channel, so hopefully that helps. Hopefully, Open Source Techie, that's a good starting point. Okay, for us... oh, Remy Kinshan, I hope I said your name correctly. You've asked the question, "Why is the parallelism..." Oh, great question actually, so I'm going to direct this to you, Ryan.
A
"Why is parallelism disabled by default for the vacuum operation? Under what circumstances can enabling parallelism for S3 take a performance hit?"
E
A
Thanks very much, Ryan, that's actually super helpful. Ram Kishan, if you have any more questions, definitely chime in here. And John, I did go ahead and respond to your question concerning user information. I believe you're referring to the S3 commit service, but could you please clarify that, so that we can understand a little better what you're referring to?
A
So Yawn hasn't chimed in yet, but if you're referring to the S3 commit service... okay, gotcha, gotcha, all right. So, okay, Yawn, I'll probably direct this to Ryan or Scott: there's a method in S3SingleDriverLogStore, at line 63, where it actually strips the user information from the S3 paths.
A
Okay, I believe, and you guys are going to correct me now, that this has to do with the fact that we're trying to keep any PII from being logged into the system, basically for security purposes. But I do want to verify that, so Ryan or Scott, if you can chime in.
E
Yeah, that's basically just trying to prevent logging your S3 user credentials. For example, sometimes you can just put a key and secret in the path, and then if you try to query or list and this file doesn't exist, it may throw, like, a FileNotFoundException with the full path. We just try to avoid this.
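The idea described here, dropping the credentials portion of a path before it can end up in an error message or log line, can be sketched in a few lines of Python. This is only an illustration of the technique, not the code in S3SingleDriverLogStore; the function name `strip_user_info` and the example bucket and key values are made up.

```python
from urllib.parse import urlsplit, urlunsplit


def strip_user_info(uri: str) -> str:
    """Return the URI without any 'user:password@' part in its authority.

    Hypothetical helper for illustration only.
    """
    parts = urlsplit(uri)
    # Everything after the last '@' in the authority is host[:port];
    # if there is no '@', the authority is returned unchanged.
    host_and_port = parts.netloc.rpartition("@")[2]
    return urlunsplit(
        (parts.scheme, host_and_port, parts.path, parts.query, parts.fragment)
    )
```

With this, a path such as `s3a://ACCESSKEY:SECRET@my-bucket/table/_delta_log/0.json` would be reported as `s3a://my-bucket/table/_delta_log/0.json`, so an exception that includes the path no longer carries the key and secret.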
A
Thanks, Ryan. Okay, so that's what I thought. Thank goodness, I'm not completely off. So the context basically is that we're trying to ensure, especially from a GDPR or, you know, GRC compliance perspective, that we're not logging your credentials directly into S3. That's what we're trying to avoid. So I can understand why you might say, "Hey, I'd like to know what the user information is," but from the standpoint of us ensuring that we meet compliance rules, that's the reason why that line's there.
A
Cool. Well, we've got a couple of minutes left before we sign off. So if you have any other questions, please chime in here on the YouTube channel. Like I said, we'll try our best to answer all your questions, and if the question's going to be a little bit longer, like I said before, definitely join us in the Delta Users Slack channel, as that will be the best way for us to give longer answers.
A
Cool, all right. Well then, going once, going twice, any other questions, anybody? All right, then I will call it for today. QP, Ryan, Scott, thank you very much for helping with today's community office hours session. We will be doing the community office hours every two weeks. This week we're having it at 9:00 a.m. Pacific, and two weeks from now we're going to have it at 4 p.m. Pacific. And then, to answer agentdero's question, "When's the Golang binding going to happen?"
A
I did want to call out: you know what, whenever you go ahead and write it, agentdero. And that's a running joke, for everybody who may not realize, that's Tyler, who's decided to chime in from YouTube as opposed to being on this panel today. So it is what it is. So again, thank you very much. I appreciate your time, and we'll see you in a couple of weeks.