From YouTube: 2021-03-16 delta-rs open development meeting
Description
Tentative agenda:
* @Neville Dipale to share the 2.6.0 parquet writer updates
* @QP to review the recent Python and Rust bug fixes
* Try to come to a decision on a path forward for atomic rename support (led by @misha and @Christian Williams)
* ???
A
Okay, so welcome again to our bi-weekly (that's every two weeks) regular delta-rs open development meeting. Our agenda I had shared in Slack in the delta-rs channel, which, if you're seeing this on YouTube, can be found in the...
A
Sorry, that can be found in the delta-rs README. I'm seeing...
A
I saw somebody knock at the lobby, but I can't actually see it on my screen. Let me go to a different view. Okay, sorry. Anyway, the delta-rs Slack channel details are in the README on GitHub. As for the agenda for today, I wanted to start off with Neville for a change, so that you can give your Parquet update, and then we'll move on from there.
A
After that, QP, you'll talk about some of the bugs that we've had come in recently, and then I figured we can try to come to a decision on the atomic rename problems. But anyway, Neville, you're up. Cool.
B
I've substantively completed the 2.6 schema work, so that's parsing the schema and printing the schema. I've got two PRs that are currently open. I'll grab the PR numbers and put them in the chat after this. They're currently undergoing review.
B
After that, I'm going to be able to work on the timestamp nanosecond support and, I can't remember what else, it's just two types of arrays that we can't write, a limitation because we didn't really support 2.6 yet. After that, I think I'll be done with the actual schema support of 2.6, so we'll be able to start reading a lot more types of files that are written with 2.6.
B
So that was the main gist, I suppose, of the work that I was doing. It was really to unblock us from being able to read and write 2.6 with logical types.
A
With the changes that are pending right now, what do you think the timeline is for an arrow crate to be released with the 2.6 writer support?
B
I saw on the mailing list that the next release is going to be around April, which coincides with our sort of three-month cadence. There's a...
A
A
And for anybody who's not already aware, the Rust crate and the Python binaries depend on git versions of the underlying arrow crate. The big motivation to depend on a released version of the arrow crate is so that we can publish a new deltalake crate to crates.io. It's not blocking any of the Python work or the kafka-delta-ingest work, which is downstream of delta-rs.
A
No? Guess not. Cool. QP, do you want to share some of the Python and Rust bugs that have come up, or discuss whatever you wanted to discuss on that topic?
B
Yep. So for the last two weeks, Florian has fixed two bugs with regard to schema handling in the Python binding. I found that there was a bug in the S3 listing, and we have that fixed, so we can now load a table from S3 at a specific version. Thomas has also been playing around with the pretty new write support for delta-rs, and he found a bug where, during the write process, we were reading duplicated transaction logs, which is a simple fix.
B
So, luckily. Another piece of really good news is that Misha has added single-writer support for our S3 backend. So we are now at least feature comparable with the reference Scala implementation, which also only has single-writer S3 support, and we're now working on multi-writer S3 support.
C
A
So exciting. I saw, I think Thomas was the fellow's name, some of the writer bugs that he was working on. Did we get any new test data that we could merge in for our test suite?
B
So there was some discussion about adding tests for that. For that particular case, the test data we have today is enough to cover it. It's just that we need to add some more functionality in delta-rs to expose some of the operational metrics for us to actually write the unit test.
A
I see. I'm just looking at the GitHub issues. Is that the "high level read API to return file content" issue? No, that's not it.
B
No. Yeah, so there were two bugs: one is the performance bug, the other one is a correctness bug. The performance bug has already been fixed.
B
Yeah, so we have the correctness bug that's still pending, but it should be a simple fix. That was the first one, "duplicate files after update".
B
A
Yeah, cool. I just dropped a link, for the YouTube channel later if anybody watches, for the good first issues. Somewhat related, I'm kind of curious: did Florian cut a new release of the Python binding?
A
Okay. I was actually more curious because that would be the first release that was not cut by you.
A
D
Yes, so I've been looking into that library, dynalock, in Rust. From what I saw, it actually looks like just an MVP with a single release. It covers very basic operations, and it's actually unusable for production, because there's no recovery from failures whatsoever.
D
They cover only very basic operations, such as acquire a lock, release it, and move on. But if something goes wrong, like the lock owner has died, there's no way to recover unless you manually delete the DynamoDB item, which would require calling DynamoDB directly without using the dynalock API.
D
So in order to use that, we'd still need to write some DynamoDB-related code. There's also another one, the Amazon DynamoDB Lock Client, which is created by AWS Labs, and there are a lot of ports of it, like to Node.js and Go, which are actual ports. I tested it and it's fully functional for us. It supports acquiring locks, updating them, and also failure recovery.
D
It's resilient, meaning that it covers the case when some workers have died, and the lock is automatically cleaned up by other workers, or something like this. So in the end, we should either spend a long time creating a DynamoDB lock port for Rust, like the Go library or the Node.js one, or, if this is time sensitive, I would suggest we use only the basic logic and implement that within delta-rs.
C
Can you post a link to the AWS distributed lock?
D
So the main issue is this. Say you start a worker, for example worker A, on the Amazon DynamoDB Java client, and it dies. The other worker then reads the previous lock item. It sees when the lock was created and that it has, for example, a lease duration of 30 seconds. It waits 30 seconds, then deletes the previous lock and creates a new one. That functionality of waiting and re-updating the lock is missing in dynalock.
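The takeover behaviour described in this turn can be illustrated with a minimal Python sketch. Everything here is an assumption made for illustration: `InMemoryLockTable` stands in for a DynamoDB table and only models a conditional write, and the names `LockItem`, `record_version`, and `try_acquire` are hypothetical rather than taken from dynalock or the AWS Labs client. The sketch also replaces the stale item with a conditional overwrite instead of a separate delete and create.

```python
import time
import uuid
from dataclasses import dataclass
from typing import Callable, Dict, Optional


@dataclass
class LockItem:
    owner: str            # hypothetical: identifies the worker holding the lock
    record_version: str   # changes every time the holder refreshes the lock
    lease_seconds: float  # how long the holder may stay silent before takeover


class InMemoryLockTable:
    """Stand-in for a DynamoDB table; it only models a conditional write."""

    def __init__(self) -> None:
        self.items: Dict[str, LockItem] = {}

    def put_if_version(
        self, key: str, new: LockItem, expected_version: Optional[str]
    ) -> bool:
        # Write only if the stored version is still the one the caller saw.
        current = self.items.get(key)
        current_version = current.record_version if current else None
        if current_version != expected_version:
            return False
        self.items[key] = new
        return True


def try_acquire(
    table: InMemoryLockTable,
    key: str,
    owner: str,
    lease_seconds: float,
    sleep: Callable[[float], None] = time.sleep,
) -> bool:
    """Acquire the lock, taking over from a holder that stopped refreshing."""
    expected: Optional[str] = None
    existing = table.items.get(key)
    if existing is not None:
        # Wait out one full lease, then check whether the holder refreshed.
        sleep(existing.lease_seconds)
        after = table.items.get(key)
        if after is not None:
            if after.record_version != existing.record_version:
                return False  # the holder is alive, so back off
            expected = after.record_version
    new_item = LockItem(owner, uuid.uuid4().hex, lease_seconds)
    return table.put_if_version(key, new_item, expected)
```

Passing `sleep` as a parameter keeps the sketch testable without real waiting; a production lock would also need the holder to refresh its `record_version` periodically.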
A
With the Spark implementation, is there just no atomic rename at all? Is this just functionality that's missing for S3 in the Spark implementation?
D
I have not looked into the Spark implementation. Actually, I have...
C
In the reference implementation, there is no pessimistic locking happening like this. Basically there's a file, I can't quite remember the name of it, something like "single driver S3 storage backend" or something like that. So basically, it seems like they're funneling all of the S3 interactions through a single thread on the driver.
B
E
Sorry, I meant to chime in, but I completely forgot that I had muted myself. That is correct. In essence, when they're doing the renames or the rewrites for the JSON, in order to be the safest and easiest implementation, the idea was that we would only have a single writer.
A
I'm wondering. This is not something, I think, we necessarily need to decide now. There's a fairly robust discussion, which I'll link in the Slack channel, that mostly, I think actually entirely, Christian and QP have been having around the create-if-not-exists guard, which I believe is related to this issue. It's the same thing, I think.
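For readers unfamiliar with the term, a create-if-not-exists guard means a commit file is written only if no file with that name exists yet, so two writers cannot both claim the same version. The following is a minimal Python sketch on a local filesystem, where the operating system provides this guarantee through exclusive creation. The function name `commit_if_absent` is hypothetical; the file name uses the zero-padded version numbering of the Delta transaction log.

```python
import os


def commit_if_absent(log_dir: str, version: int, payload: bytes) -> bool:
    """Write the commit file for `version` only if it does not exist yet.

    Returns True if this caller created the file, False if another writer
    had already committed that version.
    """
    path = os.path.join(log_dir, f"{version:020d}.json")
    try:
        # O_EXCL makes creation fail if the path already exists, so the
        # existence check and the creation happen as one atomic step.
        fd = os.open(path, os.O_WRONLY | os.O_CREAT | os.O_EXCL)
    except FileExistsError:
        return False
    with os.fdopen(fd, "wb") as commit_file:
        commit_file.write(payload)
    return True
```

The locking discussion in this meeting is about getting the same effect on S3, where the participants treat this primitive as unavailable without extra coordination.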
A
But what comes to mind is that, whatever functionality we add here, if we are talking about DynamoDB being another dependency in order to have safe multiple writers, I think we'll need to provide a flag. That way, if you want to use the Python binding or the Rust crate directly, and you want to shoot from the hip and be a cowboy coder, you can disable it and not have the requirement for DynamoDB and AWS.
B
Yeah, I think so. We had two designs in that discussion. One of the designs moved the transaction log entirely to DynamoDB. That design is more performant and safer from a liveness point of view, but it has the downside that all the readers have to use DynamoDB as well. So we went with this locking design, where only the writer needs to use DynamoDB and readers can just use S3 without accessing DynamoDB at all, and so on.
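A rough Python sketch of the chosen design, combined with the opt-out flag raised earlier in the meeting, might look like the following. It is an illustration under stated assumptions, not the delta-rs implementation: `ObjectStore` is an in-memory stand-in for S3, the optional `lock` argument stands in for a DynamoDB-backed lock, and `write_commit`, `latest_version`, and `CommitConflict` are hypothetical names.

```python
import threading
from contextlib import nullcontext
from typing import ContextManager, Dict, Optional


class ObjectStore:
    """In-memory stand-in for S3: plain key/value storage, no conditional put."""

    def __init__(self) -> None:
        self.objects: Dict[str, bytes] = {}


class CommitConflict(Exception):
    """Raised when the requested version has already been committed."""


def write_commit(
    store: ObjectStore,
    version: int,
    payload: bytes,
    lock: Optional[ContextManager] = None,
) -> None:
    # Writers take the lock around the check-then-write sequence. Passing
    # lock=None models the proposed flag that disables locking, which is
    # only safe when there is a single writer.
    guard = lock if lock is not None else nullcontext()
    key = f"_delta_log/{version:020d}.json"
    with guard:
        if key in store.objects:
            raise CommitConflict(key)
        store.objects[key] = payload


def latest_version(store: ObjectStore) -> Optional[int]:
    # Readers never touch the lock; they only list the log in the store.
    versions = [
        int(key.split("/")[1].split(".")[0])
        for key in store.objects
        if key.startswith("_delta_log/")
    ]
    return max(versions, default=None)
```

In this sketch a `threading.Lock()` can be passed as `lock` to try it out locally; the point of the design is that `latest_version` has no dependency on whatever the lock is.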
B
D
B
A
Well, I'm not super concerned about those use cases. I just don't want to make this difficult to adopt for the first kick-the-tires setup. The more infrastructure you have to have to use any particular tool, the higher the burden of entry is going to be.
A
B
D
Yeah, I was planning to start playing with it today, like moving the implementation into delta-rs, but it took me a while with that file system atomic rename pull request.
D
I'm planning to start working on that tomorrow, moving pieces from the Java client and Rust dynalock into delta-rs, and see what the results would be.
C
Yeah, I agree. I feel like this decision is made. We're going to implement it in delta-rs, yeah.
D
Ideally, it would be nice to have a full port like the Go library and Node.js did, but it would be time consuming to port everything from the Java client, since it is really rich. It supports, for example, background threads which send heartbeats each second, so they fully support those errors. But I'm pretty sure that for an MVP we won't need all of that.
A
I also don't think it makes sense for us to fork dynalock and then have another dependency that we need to manage. But based on the thread and the stuff that I've seen in the pull request, I don't think there's a definition of what the lock needs are that we would be bringing into delta-rs.
B
C
A
Well, what I'm hearing is that there are two poles here. One is dynalock, which doesn't have everything that we would need. The other is the AWS Labs DynamoDB lock client, or some implementation of that, which has a lot of extra functionality. What we actually need is somewhere between those two points. What I'm asking is for somebody to actually write down what it is we actually need, so that we only implement what we need in delta-rs, and not implement the Amazon DynamoDB lock client.
A
C
Right, yep, that makes sense. The task should not be a complete, full port of this AWS distributed lock. Yep.
A
A
Well, thank you all for joining. I've personally found this productive. We'll continue the discussion in the delta-rs Slack channel, and I'll see you in a couple of weeks. Have a good day.