From YouTube: 2021-03-02 delta-rs open development meeting
Description
Tentative agenda
* @QP and @Florian Valeye share/highlight some recent changes in the Python bindings
* Discussing policy for adding new committers to the repository (e.g. @Florian Valeye and @Christian Williams)
* @Neville Dipale share some updates on delta-rs writer support that he's been working on.
A
All right, welcome. Neville, we're live. This is the delta-rs semi-regular open development meeting. Thank you, everybody, for joining. I put a tentative agenda in the Slack channel. Let me make sure I've got that open in front of me. QP, I'm hoping your audio is back in good working order. I guess Florian won't be joining us today, but you might be able to just share.
B
Yeah, let me try to find the place to share the screen.
B
So we've been working on, I mean mostly Florian has been working on, a new method for the Python binding. Can anyone see my screen? Yeah, we can. So we actually released a new version of the Python binding last night, so it's now at version 0.3, and the main new feature that we added is the new schema method.
B
So previously, to get the schema for a Delta table, you would have to load the table, convert that to a PyArrow table, and then use the built-in schema inference from PyArrow to get it, which takes a long time and requires a lot of I/O to actually get the data from the table. Now, because the Rust implementation already loads the schema from the Delta transaction log, we expose that schema directly through the new schema method. That means you don't have to load the table anymore to read the schema. You don't have to load the data files when you want to know the schema, and that makes it a lot faster. There are still some implementation improvements we can add. Right now, the schema handling is in Python.
B
We want to move that into Rust so that we can avoid the JSON serialization and deserialization between these two languages.
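To illustrate why reading the schema from the transaction log avoids touching the data files, here is a minimal Python sketch. It does not use the delta-rs bindings; `read_schema_from_log` is a hypothetical helper that only scans the JSON commit files under `_delta_log` and returns the most recent `schemaString` from a `metaData` action. It assumes the metadata is still present in a JSON commit and ignores Parquet checkpoints entirely.

```python
import json
import os


def read_schema_from_log(table_path):
    """Return the latest schema found in the JSON commits of a Delta log.

    Hypothetical helper for illustration only. Checkpoint files are
    ignored, so the metaData action must still live in a JSON commit.
    """
    log_dir = os.path.join(table_path, "_delta_log")
    schema = None
    # Commit files are named with zero-padded version numbers, so sorting
    # by file name replays them in commit order.
    for name in sorted(os.listdir(log_dir)):
        if not name.endswith(".json"):
            continue
        with open(os.path.join(log_dir, name)) as f:
            for line in f:
                if not line.strip():
                    continue
                action = json.loads(line)  # one action per line
                meta = action.get("metaData")
                # Later metaData actions override earlier ones.
                if meta and meta.get("schemaString"):
                    schema = json.loads(meta["schemaString"])
    return schema
```

Only the small log files are read here, which is the reason the approach is cheaper than loading the table and inferring the schema from the data.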
B
I went through all of the golden test data sets and we are passing all of them, except those that don't have a schema defined in the transaction log. I don't know how useful those tables are, because in real scenarios you would always have a schema, I think, for a table defined in the metadata.
A
The Delta protocol does allow for that behavior, right? At least by the protocol, are those invalid tables?
C
From my recollection, the protocol requires the first commit to contain the metadata, which contains the schema.
A
The reason that I'm kind of curious about this is, I think, there's that and then the decimal thing. Since Denny is on the call, maybe you could give an overview of the decimal problem that you and Florian identified. Because we're an independent implementation of the Delta protocol, I'm wondering if we're just finding bugs in the protocol, or just gaps of what's missing there. Yeah.
B
I don't think it's mentioned here. It didn't mention whether this is an optional field or not. It only says it's part of the fields, but this particular field is definitely missing in some of those test data sets.
A
Could you check your audio devices? It sounds like you're talking from the other end of the...
B
Yeah, but other than that, it's been working pretty well, and we passed all the other tests.
E
Oh, sorry about that, guys. What I was trying to say, basically, is that typically the schema string needs to be populated when you are about to establish the table and/or change the schema. Otherwise, what ends up happening is that the example data, or the statistics that's referred to, is actually placed inside there. In other words, it's not consistent. If you ever look at the transaction log, just from running it, you'll notice that sometimes it has it and sometimes it doesn't.
B
Yes, one second. So I think we have the PR open, right? I don't remember if the PR is merged or not. Oh, there you go, that's the one, and it's approved. So Florian also found, I think probably Denny also found, that Delta tables support the decimal data type, which is not documented in the spec.
B
So we added this implementation in the Rust and the Python bindings, and Florian also sent a PR to add this to the official spec, so it's properly documented.
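As a rough illustration of what handling the decimal type involves on the schema side, here is a minimal Python sketch. It assumes the type appears in the schema JSON as a string of the form `decimal(precision,scale)` with a precision of at most 38, which matches how Spark serializes decimals; `parse_decimal_type` is a hypothetical helper and not part of delta-rs.

```python
import re

# Assumed serialized form: "decimal(10,2)" (whitespace tolerated).
_DECIMAL_RE = re.compile(r"^decimal\(\s*(\d+)\s*,\s*(\d+)\s*\)$")


def parse_decimal_type(type_name):
    """Return (precision, scale) for a decimal type name, else None."""
    match = _DECIMAL_RE.match(type_name.strip().lower())
    if match is None:
        return None  # not a decimal, e.g. "long" or "string"
    precision, scale = int(match.group(1)), int(match.group(2))
    # Assumed limits: precision between 1 and 38, scale not above precision.
    if not 1 <= precision <= 38 or scale > precision:
        raise ValueError(f"unsupported decimal type: {type_name!r}")
    return precision, scale
```

A reader that only knows the primitive types listed in the spec would fall through to the `None` branch for such a field, which is the kind of gap the spec PR closes.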
E
Oh, I was about to simply say that, yeah, it looks like Lewin already went ahead and said: hey, good fix.
B
Yeah, as far as I know, this is the second protocol PR that was sent from the delta-rs project. I also found one minor spec issue previously as well, when I was improving data [unclear].
A
Cool. Neville, I know it's a little bit later in your time zone. Why don't we go with you first, and then we can come to the committers topic.
F
Yeah, no worries. Hi, everyone. What I've been working on now is the Rust writer, getting us to Parquet 2.6 compatibility with the format. I think I mentioned in the last call that I joined, four weeks ago, that I had some issues with understanding what data is missing in the format, because we didn't have nanosecond timestamp support. So I tried out an experiment about two weeks ago and went down a rabbit hole. It didn't work out, so I started it from scratch in the past week, and I've already submitted two pull requests. I'm joining on my phone, so I'll put the pull requests in the chat after this. One is under review; the other one just builds on top of it.
F
The next change that I'm planning on working on this week is supporting the text format of the schema. If you look at a Parquet file, maybe you print the schema to the console, you'll find that you've got the repetition, the optionality: you'll have a field that's repeated, or that's required. Then you'll have the name and the data type, which is a primitive; you only have int32, int64 and a few others. And then you'll have the logical type, which is, for example, an int32, int64, et cetera. The new version changes the naming convention a bit there. So that's the next item that I'll be working on, and that should keep me busy for the rest of the week.
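To make the pieces of a printed Parquet schema line concrete, here is a small Python sketch that splits one leaf field into repetition, physical type, name and an optional logical type annotation. It assumes the common text layout `repetition physical_type name (ANNOTATION);`, for example `optional int64 ts (TIMESTAMP(NANOS,true));`. `parse_field` and `Field` are hypothetical names, and the sketch ignores groups and parameterised physical types.

```python
import re
from typing import NamedTuple, Optional


class Field(NamedTuple):
    repetition: str            # required, optional or repeated
    physical_type: str         # e.g. int32, int64, binary
    name: str
    annotation: Optional[str]  # logical type text, if present


# Assumed layout of one leaf field line in the schema text format.
_FIELD_RE = re.compile(
    r"^\s*(required|optional|repeated)\s+"  # repetition
    r"(\w+)\s+"                             # physical type
    r"(\w+)"                                # field name
    r"(?:\s+\((.+)\))?"                     # optional (ANNOTATION)
    r"\s*;\s*$"
)


def parse_field(line):
    """Parse one leaf field line, or return None if it does not match."""
    match = _FIELD_RE.match(line)
    if match is None:
        return None
    return Field(*match.groups())
```

For instance, `parse_field("optional binary name (UTF8);")` yields the annotation `UTF8`, while a line without parentheses yields `None` for the annotation.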
F
I think 2.6 is where we need to be. The format is at 2.8, but 2.7 and 2.8 mainly introduce, I think, a new encoding, bloom filters if I remember correctly, and then just some encryption stuff that's not relevant for our purposes. So I think 2.6 is a good format version to be on, but at minimum 2.4. Even though we were building against the 2.4 specification of the format, we weren't really complying with it because of this logical type issue.
A
I see. Christian, if you don't mind me picking on you for a moment. Neville, I don't know if you saw his pull request; I'll drop it in the channel right now. Christian is working on arguably the first delta-rs writer-based application, and he just put up a draft of the writer for kafka-delta-ingest. Are you there, Christian? You had mentioned something on an internal chat about the delta-rs writer support?
C
I think we do. I will say at this point I haven't tested anything with nested structs or lists or anything like that, but as far as I can tell, the Arrow writer and Parquet writer provide what we need for now. Arrow already contains an in-memory writer cursor. That seems to be working for the use case that we require.
A
No, the reason that I bring up the kafka-delta-ingest work is that Christian is starting to do this now. We're working on this from the Scribd standpoint, which, for anybody watching, is our employer. We've got enough of a test case there that if you've got branches or pull requests that you need some extra hammering on, I think between Christian, QP and myself, we can definitely do that if you ping us.
F
Yeah, that's great, because I don't think there's been any user, Scribd or otherwise, who's using the Parquet writer and who's come back with some feedback on deeply nested structs and lists. With some level of list nesting, I've done tests myself, but when it comes to arbitrarily nested stuff, that's where I'm relying on generating test cases, and sometimes they're not as complex as they need to be, because on the Rust side of things we don't have very good support for going from JSON directly to Parquet. We have to go through Arrow to Parquet, and sometimes encounter some blockers or issues on the Arrow side. One thing which won't be relevant for kafka-delta-ingest, in your use case for now at least, but becomes relevant for general users who want to use it, is that Arrow has a zero-copy facility where you could take a record batch.
F
Let's assume that you've got a record batch that has a hundred thousand records, so each column is a hundred thousand rows, and you only want to write the first fifty thousand of that. You could slice that record batch, or slice the arrays, and only select the first fifty thousand. It'll create a zero-copy view of that, and then you could write that out. But writing that out I'm still testing; I've got a blocker on the Arrow side that is preventing me from being able to test with deeply nested structs. This becomes useful if, for example, you've got a record batch that has a hundred thousand records again, and you've determined that the optimal page size for Parquet is, let's say, ten thousand or five thousand records. You want to slice that into smaller portions so that you can write it without creating a lot of memory pressure.
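The slicing idea can be sketched in plain Python. This is not Arrow: a `memoryview` over an `array` stands in for a column, since slicing a `memoryview` also yields a view onto the same buffer rather than a copy. The helper names `chunk_ranges` and `slice_column` are hypothetical, and the chunk size of ten thousand rows is just the figure mentioned above.

```python
from array import array


def chunk_ranges(total_rows, chunk_rows):
    """Yield (offset, length) pairs that cover total_rows in order."""
    if chunk_rows <= 0:
        raise ValueError("chunk_rows must be positive")
    for offset in range(0, total_rows, chunk_rows):
        yield offset, min(chunk_rows, total_rows - offset)


def slice_column(column, offset, length):
    """Return a view of `length` rows starting at `offset` (no copy)."""
    return column[offset:offset + length]


# One 100,000-row column, stored as 64-bit integers.
backing = array("q", range(100_000))
column = memoryview(backing)

# Split the column into 10,000-row views that could be written one by one.
chunks = [
    slice_column(column, offset, length)
    for offset, length in chunk_ranges(len(column), 10_000)
]
```

Each element of `chunks` shares memory with `backing`, so a writer can process one slice at a time without duplicating the full column.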
A
That's it, I didn't have any other questions. Does anybody else have any questions about the work that Neville is doing?
A
Okay, going once, going twice. All right. The only other thing that I had on my suggested agenda, and I can't underline "suggested" enough because I'm not the boss of everything, so if there are other topics, please suggest them. The other thing that I wanted to raise was an item that I wanted us to talk about last time around, but we canceled because there wasn't a quorum. I wanted to discuss a policy for adding new committers to the delta-rs repository. Right now, I believe QP and myself, of anybody on the call, are the only ones that can merge. I think Denny technically can merge as well. But Florian has been contributing, I know that Christian has an opportunity to contribute coming up, and I wanted to set some guidelines for when we give somebody write access to the repository. QP or Denny, since you've been in the Delta project for a while, do you have any opinions or guidelines you might suggest, either of you?
E
Cool. So, okay, I was about to say that we would probably want to establish some rules out of the Linux Foundation, which I'm actually currently trying to establish for Delta overall. So whatever we want to experiment with here in delta-rs, we're probably going to go ahead and do the same thing for Delta overall, because we have the tech stakeholders, but we didn't bother writing down the list of maintainers and all that stuff.
A
Denny, from the Linux Foundation part of the Delta project: in Delta, the Spark Delta implementation, there's this utilization of the Developer Certificate of Origin, and I linked in our Slack channel the CONTRIBUTING.md.
E
Yeah, so right now we have the governance here. In fact, I was talking to the folks over at StackStorm about how they designed it, and that's the one that I want to follow much more closely. Let me see if I can find it.
A
Early on, like 2007, we basically gave commit rights to anybody that asked for them. Kohsuke, who's the founder of the Jenkins project, his attitude at the time was: well, I can always revert something. He was not worried about someone blowing anything away. His view was that if you extend that trust early, it helps contributors get excited and engaged sooner, rather than putting up gates. For anybody that doesn't know, with what we've got on GitHub, we can set up branch protection to prevent any force pushes or merges into main without going through pull requests.
E
I'm actually okay with it. The key concern with that is basically the maintenance overhead that's typically associated with the more lax policies. If it's between just the seven of us here, I'm honestly not that concerned about it at all. It's more a matter of when we expand and include more folks: do we then start, in essence, locking things down a little bit, for precisely that reason, the maintenance overhead of allowing that looser policy?
C
I think the primary workflow comes down to who's going to merge the PR, right? He's already had to merge two of my PRs, whereas he could have just done an approve and then waited for me to watch for other approvers, and then I could have merged it.
C
...Slack to get reviewers, and I also can't add labels to issues and stuff like that. I don't know if that's on the same topic or not.
A
They're related. On GitHub, to add labels to issues, we can set permissions to triage, which is a permission level that means you can work with issues, and that's a different permission level than write. Write supersedes triage, of course. I would recommend for QP, and I want to ping Florian on this as well: the StackStorm governance document that Denny linked is definitely interesting and, I think, worth a read. If there are, from the Delta project standpoint, Denny...
A
That is the primary one I wanted to get addressed. Let me check our pull request requirements. I'm just checking our branch protection on main: we require an approver, and there are no force pushes and no deletions on that. So I'll go ahead and add Florian now to delta-rs committers. Now, one thing, Denny, I should mention: right now, delta-rs committers also are committers on kafka-delta-ingest. Let me ask you to create a separate team, because these two projects are definitely going to grow in different ways.
A
Yeah, I'll go ahead and extend the invite to Florian, QP, Christian, Denny. Well, Denny already read it, but let's look at the StackStorm contributing guidelines, and maybe I can take the follow-up to... oh, I can't invite somebody to this team. So Denny will have to invite them to the team.
A
...Slack right now. I'll take the action item to write up some guidelines, a contributing guideline similar to the StackStorm thing, and we can discuss what we want from a proposal standpoint.
A
I just had [unclear] coming in; I think that's Thomas again. That was the end of the suggested agenda that I had. Are there topics that anybody else wants to bring up for us to discuss?
B
Oh, so one thing that I recommend everyone... go ahead. Yes, okay, go ahead.
F
No, no, no, I was just going to say: I'm very excited to see kafka-delta-ingest.
B
Speaking of which, I guess it also relates to what I just wanted to say: in the delta-rs project, if you go to the Discussions page, we actually have some interesting discussions around distributed system design for delta-rs itself, and some of those designs will apply to kafka-delta-ingest as well. So please feel free to check out those discussions and chime in with your own opinions, because it will have a big impact on how we do distributed concurrent writes into delta-rs in the future.
A
Yeah, I think it's definitely worth highlighting the Discussions. This is a new feature in GitHub that we're trying out. There are definitely longer-form design discussions that need to happen where Slack is probably not the best approach, so we've been exploring that. You can actually watch the repository: if you go to Watch in the top right and go to Custom, you can watch just Discussions, if you would like to keep apprised of that.
A
All right! Well, if that's everything, thank you, everybody, for joining. Just as a reminder, the delta-rs page mentions that we have this every two weeks at 9 a.m. PST on Tuesday mornings. About an hour or so beforehand, I put the meeting link in the Slack channel, just to reduce the probability of spammers Zoom-bombing us or anything. But thank you all for joining. I hope to see you in a couple of weeks or on Slack.