From YouTube: 2021-02-02 delta-rs open development meeting
Description
Tentative agenda
* Neville talks through some of the Arrow merged changes
* Tokio 1.0 planning
* Chris shows off his writer branch
* ???
A: All right, welcome — we are officially live. This is another delta-rs development meeting. The tentative agenda, which I shared in our delta-rs Slack channel, is: Neville talks through some of the recent writer-support changes he's had merged in the arrow crates; then I wanted to discuss a little and get on the same page about what our roadmap, or plan, should be for releasing crates with Tokio 1.0 as a dependency; and after that, I'd love for Chris to share his screen and walk through some of the writer changes he's been working on.
C: Okay, thanks. So there hasn't been that much activity from my side in the last two weeks. The main thing we've done is the update to Tokio 1.0 — that's just an update to dependencies. There were some issues with the IPC integration tests failing, but QP helped out, we found a solution for that, and we merged it. I'm still looking at supporting the 2.6 version of the parquet format.
C: There's a bigger issue, in the sense that the Rust writer only supports the legacy format, but because of backwards compatibility, even if you're writing with version two of the writer, you still sort of end up writing valid stuff — it's just actually version one. So I've been doing a lot of exploratory work on the parquet repository.
C: I've also been looking at parquet-mr, the Java equivalent, and the C++ equivalent, to see how we can pave a way to proper 2.6 support. One of the challenges we face with 2.6 support is that there are, I think, two data types that we don't actually write correctly. We lose precision if you're writing nanosecond-precision timestamps, because the old version that supports it...
C: ...used that 96-bit integer (INT96), which is quite messy to work with and has been officially deprecated for a few years now. To get proper nanosecond-precision timestamp support, I need to do a bit of plumbing around separating the formats between version one and version two. I've got some rough work that I've been doing, but it looks like it's leading me down a long path.
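The precision loss Neville describes can be illustrated with plain integer arithmetic — a minimal sketch, not delta-rs or parquet crate code, just showing what a writer limited to microsecond precision does to a nanosecond timestamp on a round trip:

```rust
// Nanosecond timestamps are typically carried as an i64 count of
// nanoseconds since the epoch. If the write path only supports
// microsecond precision, the sub-microsecond digits are silently
// dropped on a round trip.
fn truncate_to_micros(ns: i64) -> i64 {
    ns / 1_000 // integer division discards the sub-microsecond part
}

fn micros_to_nanos(us: i64) -> i64 {
    us * 1_000
}

fn main() {
    let original_ns: i64 = 1_612_224_000_123_456_789; // some instant, ns precision
    let round_tripped = micros_to_nanos(truncate_to_micros(original_ns));
    // The last three digits (789 ns) are gone.
    assert_eq!(round_tripped, 1_612_224_000_123_456_000);
    assert_eq!(original_ns - round_tripped, 789);
}
```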
C
So
last
week,
early
last
week,
I
was
looking
at
how
I
can
sort
of
separate
the
work
better,
so
that
I
can.
I
can
be
able
to
deliver
it
in
chunks
without
disrupting
anyone.
I'm
still
thinking
about
it,
but
I
think
yeah.
C: Yes, I've done that — I've just done it locally. Yeah, I'm doing it with — what's it called, I keep forgetting — PyArrow, right. I'm doing it with PyArrow and PySpark; I'm just using notebooks. So I write stuff in Rust and then I test it in both of those two. That's where I picked up that there are some compression formats we're not supporting correctly — I think I opened issues for that. But encodings look fine so far; it's just that the version-two write stuff is not...
C: There is some discussion happening on the parquet repository where they want to document core features — by that they're looking at what every implementation should support as a minimum, because there's a lot of exotic stuff, especially when you go beyond 2.6: the 2.7 and 2.8 versions of the format introduced encryption, which we don't need. But they want to standardize, because of what I saw — actually, I opened a pull request on the parquet-format repository suggesting that we adopt the Arrow intervals as part of the version.
C: So I started a discussion there — or rather, I opened a pull request — because I noticed that, even though we have version 2.8 of the format, a lot of the writers still use version one by default. I guess everybody's doing it to be conservative, not to break things. But the downside of that is that the Rust writer only really supports version one, and reading version-two files back in the Rust reader won't give us the correct results.
C: So I've been doing quite a lot of exploratory work; I haven't done anything concrete that I can say I'm opening a pull request on. The other thing I've been working on is relevant to Christian, so it'll be interesting to see what you've done. Around the write support — remember I mentioned there was that pull request somebody had started, where we were trying to be able to write data to a vector of bytes instead of a file?
C: So I've been working on that, trying to abstract it out into a trait, but then I picked up that I'd actually missed something: there was a pull request a few months ago where somebody created an in-memory...
C: Yeah, the writeable cursor, yeah — okay, cool. I was wondering whether we could use that, and what the gaps would still be, because I think for writing to blob stores like S3, we need to be able to split the file into multiple parts, so I'm wondering how we'd be able to do that.
A: Gotcha. So, unfortunately, I've got a meeting scheduled at 9:15, so it'd be great if y'all kept chatting. Let's talk about Tokio 1.0 real quick, and then when I have to jump to my other meeting, I'll just leave this window open — somebody ping me in Slack when it's time for me to close the meeting, because I don't know if the stream will disconnect if I leave. Okay, so Tokio 1.0. My first question on this: I saw the Arrow commit get merged, and QP and I merged the code into delta-rs. From the Arrow release standpoint, Neville or QP, how long do you think it's going to be before there's an actual Arrow crate pushed to crates.io with the Tokio 1.0 dependency?
A: Yep. If Neville does that Tokio 1.0 thing in Arrow, does that mean there would be a major version bump, or are we just waiting for an incremental version?
C: I think we should be able to pin — yeah, we should be able to pin to a version, and then what I can do from my side is, if there's interesting work that has been done, especially on the parquet side, which is relevant for us, I can submit a PR just updating the pinned versions.
A: I don't know if we — aside from maybe Christian — I don't know if we have too many downstream Rust-crate-only users right now.
A: Talk to y'all in a bit.

D: All right, I'll do my usual fumbling to find my screen share... got it. Okay, so I did just — actually, let me post this in the channel real quick. I did just submit a draft PR based on the code that I'm going to share. Where's the chat... there it is.
D: So, draft PR here. Basically, what I'm doing right now for the initial write support — what this draft PR is trying to do, as far as the core code in delta-rs goes — is just to add a very basic transaction API.
D: Most of the interesting code in the PR is actually all tucked into this one big test, and that test includes a lot of the meat — including a delta writer struct.
D: That struct would use this transaction API, and the reason I did it this way is that I feel like we probably have a lot more iterations to do on the actual delta writer struct itself before it becomes part of the delta-rs code base. And I think for my needs — for the primary use case we're trying to solve — I don't even need it to.
D: I will happily keep this delta writer completely outside of the delta-rs code base, but it's nice to use it in the test for validating that the transaction API I've come up with meets my needs. And as a Rust noob, it's really good for me to use this to work out all the ownership things — ownership is something I'm not a master of yet.
D: So there's the test I've called write_exploration, and it includes, like I said, a delta writer struct for writing Arrow record batches, as well as interacting with the transaction API. In the main little smoke test, I initialize a delta table and a delta writer, create a little vector of JSON values, and write out my record batch after initializing a transaction. And then, after writing out my record batch...
D: ...I create an add action for the delta log and commit that. Then we do a little update as well: the update basically changes some values, writes an add and a remove to a different transaction, and then commits that. Everything in this test file is a very naive implementation of writer support, but I think one of the things that's really interesting to look at specifically is what Neville mentioned a moment ago: that in-memory writeable cursor.
D: I actually think that — so this is the implementation in Arrow.
D: It's implemented in a very complete way, so we might not need any further work — but you guys feel free to tell me if I might be wrong on that. My thought is this is exactly what we want, so we'd want to have this on hand. Let me silence Slack — it's killing me.
D: At the moment, the naive delta writer API that I have takes in a delta table metadata object and a record batch, instantiates one of these in-memory writeable cursors, determines what the next data path to write to delta is, writes everything to the cursor first, plucks off the size and some of the metadata out of the record batch as well — such as the partition values — and then writes it to storage and creates an add action.
D: Ultimately, I think what we would want to do here is rearrange this so that we could write multiple record batches — basically hold that cursor and writer as owned properties of the delta writer struct, and on flush is when we would write to storage and create the add action instead. So, basically, I think that's the missing bit that I have, but I feel like the in-memory writeable cursor itself is exactly the struct we need to do that.
D
I
would
just
need
to
figure
out
how
to
rearrange
this
so
that
I
can
own
the
cursor
and
hold
the
writer
across
multiple
calls
from
the
calling
context.
If
that
makes
sense,.
C
Yeah
this
does
and
then
I
think,
when
you
do
that,
one
of
the
benefits
becomes
that
you
you'd
be
able
to.
Potentially
you
know.
After
writing.
Each
batch
check
the
size
of
the
the
castle's
data
exactly
yeah,
if
it's
big
enough
or
I'm
not
sure
how
the
what
the
what
the
limits
or
thresholds
are
with
you
know,
writing
multiple
files.
C
So
if
it's
let's
say,
for
example,
your
limits
is
five
megabytes
and
then
you
find
that
what
you've
written
so
far
is
close
to
that
five
megabytes.
For
example,
then
you'd
probably
close
the
close
close
that
close
the
file.
You
know
write
that
photo
and
then
I'm
writing
to
stories.
And
then,
when
the
next
batch
comes,
you
sort
of
clean
the
yeah
you
sort
of
clean
the
the
data
in
the
in
the
right,
so
in
the
cursor
from
there
yeah.
I
think
that
could
work.
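The buffer-then-flush flow discussed here can be modeled in a few lines. This is a hedged, self-contained sketch — every name below is a hypothetical stand-in, not the actual delta-rs writer; the "storage write and add action" is reduced to a counter:

```rust
// Buffer serialized batches in memory; once the buffer approaches a size
// threshold, close out the current "file" and start a fresh one. In the
// real writer, flush() would write a parquet file to storage and produce
// an add action for the delta log.
struct BufferedWriter {
    buffer: Vec<u8>,
    threshold: usize,
    files_written: usize,
}

impl BufferedWriter {
    fn new(threshold: usize) -> Self {
        BufferedWriter { buffer: Vec::new(), threshold, files_written: 0 }
    }

    /// Append one serialized batch; flush if the buffer crosses the threshold.
    fn write_batch(&mut self, batch: &[u8]) {
        self.buffer.extend_from_slice(batch);
        if self.buffer.len() >= self.threshold {
            self.flush();
        }
    }

    /// Close the current file and reset the buffer for the next one.
    fn flush(&mut self) {
        if self.buffer.is_empty() {
            return;
        }
        self.files_written += 1; // stand-in for storage write + add action
        self.buffer.clear();
    }
}

fn main() {
    let mut w = BufferedWriter::new(10); // tiny threshold for illustration
    w.write_batch(&[0; 4]); // 4 bytes buffered, below threshold
    w.write_batch(&[0; 4]); // 8 bytes, still below
    w.write_batch(&[0; 4]); // 12 bytes, crosses threshold -> flush
    w.flush();              // final flush of an empty buffer is a no-op
    assert_eq!(w.files_written, 1);
}
```

Holding the buffer as an owned field of the writer struct, as Chris describes, is what lets the size check span multiple `write_batch` calls.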
D
Cool
yeah,
that's
that's
what
I'm
thinking
I
you
know.
I
tried
a
little
quick
pass
at
it
and
because
of
my
rust
nudeness,
I
I
just
couldn't
figure
out
all
the
ownership
details,
but
I
feel
like
this
is
something
that
a
I'm
getting
slacked
again.
Okay,
but
I
feel
like
this
is
something
that
a
non-rust
noob
would
be
able
to
figure
out
quickly.
I'm
just
kind
of
gonna
beat
my
head
against
the
wall
again
until
I
figure
it
out,
but
I
feel
like
that
api
should
fully
support
it.
D
So,
having
mentioned
that,
the
other
thing
that
I'll
mention
is
just
the
transaction
api
itself.
So
it's
super
simple,
like
I
said,
like
most
of
the
code,
is
in
this
in
this
draft.
Pr
is
in
this
test
the
api
for
creating
a
transaction.
D: I feel like this separation is good, and what we really want is for the delta writer to basically return the appropriate actions based on whatever record batch was written to it, so we have a decoupling here. Basically, it would be the responsibility of the caller to instantiate a writer and a transaction, write to the writer, get back the actions, and handle anything more complex than a pure append — any sort of update, merge, or so on — which, to be really honest...
D: ...it's only the pure appends that I care about for my purposes. But if it needs to do any adds and removes based on an update, for example, then it would be responsible for collecting all of the actions that need to be committed to the transaction when everything is done.
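The caller-collects-actions flow might look roughly like this. A minimal sketch under stated assumptions — the types are illustrative stand-ins, not the transaction API from the draft PR, and conflict resolution is reduced to a version comparison:

```rust
// The caller writes batches, gathers the actions the writer hands back,
// and commits them all at once. Conflict detection here is just "did the
// table move past the version we started from?" — a placeholder for the
// real snapshot diffing discussed in the meeting.
#[derive(Debug, PartialEq)]
enum Action {
    Add { path: String },
    Remove { path: String },
}

struct Transaction {
    start_version: i64,   // table version when the transaction began
    actions: Vec<Action>, // collected by the caller as it goes
}

impl Transaction {
    fn new(current_version: i64) -> Self {
        Transaction { start_version: current_version, actions: Vec::new() }
    }

    /// The caller hands over whatever actions the writer produced.
    fn add_actions(&mut self, actions: Vec<Action>) {
        self.actions.extend(actions);
    }

    /// Commit everything in one shot; bail out if another writer moved
    /// the table forward since this transaction started.
    fn commit_all(self, latest_version: i64) -> Result<i64, String> {
        if latest_version != self.start_version {
            return Err(format!(
                "conflict: table at v{}, transaction started at v{} ({} actions pending)",
                latest_version, self.start_version, self.actions.len()
            ));
        }
        Ok(self.start_version + 1) // version of the newly written log entry
    }
}

fn main() {
    let mut tx = Transaction::new(0);
    tx.add_actions(vec![Action::Add { path: "part-00000.parquet".to_string() }]);
    assert_eq!(tx.commit_all(0), Ok(1)); // no concurrent writer: commit lands as v1
}
```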
D
The
transaction
should
be
capable
of
checking
the
basically
the
snapshot,
that's
represented
by
the
delta
table.
It
was
initially
instantiated
with
against,
whatever
snapshot
might
exist
after
the
transaction
was
created,
and
so
what
I
have
right
now
does
make
assumptions
that
the
caller
will
do
all
that
management.
So,
basically,
the
caller
would
be
responsible
for
making
sure
that
only
a
single
transaction
context
exists
at
a
time
or
or
rather
let
me
rephrase
that
that
a
single
transaction
instance
refers
to
a
a
delta
table
which
contains
and
represents
the
snapshot.
D
And
then,
if
and
then,
basically
pulling
and
diffing
whatever
a
later
well,
whatever
the
later
snapshot
might
be
against
that.
Ultimately,
we
need
to
have
conflict
resolution
internal
to
the
transaction
which
could
check
the
the
the
current
metadata
against
whatever
metadata
was
present
when
the
transaction
was
started,
and
actually,
when
I
say
metadata
here,
I
actually
mean
more
than
just
metadata
metadata,
as
well
as
new
files
added
to
the
delta
log
that
part's
not
implemented.
D
So
if
there
are
any
other
writers
which
there
will
be
in
my
use
case,
then
we're
going
to
need
to
do
conflict
resolution
prior
to
during
the
commit,
and
that
would
be
handled
inside
of
this
commit
all
so
does
that
make
sense
to
everybody
the
the
way
I
said
it.
C
I
mean
it's
interesting
because
if
you
go
back
to
that
portion,
where
you
create
a
transaction,
and
then
you
you
add,
and
then
after
that,
when
you
commit
the
transaction
at
line
372,
I
don't
know
if
it
would
be
an
anti-pattern
and
I'll
make.
This
comment
on
the
pull
request
itself.
When
I
look
at
it,
but
it
would
be
interesting
if,
where
you
write
record
patch
in
line
366.
C: ...I don't like creating a flow where we require someone to do non-trivial work, because sometimes, you know, if you don't remember to add the right transactions, et cetera, in the right order, it might cause issues.
D
Yeah,
that
makes
total
sense
and,
to
be
honest,
my
kind
of
longer
term
view
is-
is
pretty
similar
to
what
you're
describing.
However,
I
I
kind
of
have
a
split
in
my
own
mind.
On
the
one
hand
I
I
was,
I
was
thinking
it
would
be
nice
to
kind
of
maintain
a
vector
of
actions
on
the
transaction
itself,
but
on
the
other
hand,
I'm
also
thinking
that
beef
yeah
as
delta
rs,
continues
to
evolve.
D: So, for instance, I'm thinking that delta-rs should have a concept of an update command, which would encapsulate the process of creating whatever adds need to be created, as well as whatever removes need to be created. Those update commands would still be in delta-rs, so it wouldn't fall to the ultimate user to manage.
D
It
would
be
encapsulated
in
delta
rs
still,
but
the
kind
of
the
parts
would
be
out
on
the
floor
for
delta
rs
for
those
higher
level
commands
to
use,
as
they
see
fit,
rather
than
kind
of
tying
everything
to
use
a
you
to
to
kind
of
tuck
all
their
actions
into
the
transaction
as
they're
going
along.
D
I
don't
know
I
haven't
thought
through
it
enough
to
to
be
strongly
opinionated
on
which
way
makes
more
sense
to
me,
but
that
was
kind
of
part
part
of
what
was
feeding
my
hesitancy
to
add
that
action
vector
to
the
transaction
right
out
of
the
gate.
If
that
makes
sense,.
C
Yeah
that
does,
I
suppose
I
also
need
to
look
at
the
delta
specification
in
more
detail,
but
I
think
I
looked
at
it
like
two
months
ago,
just
skimming
through
it,
but
now
now's,
probably
the
time
to
sit
down
and
read
it
properly.
D
Yeah
like
so
so,
the
thing
is
like
an
update
is
actually
pretty
complex,
and
this
does
not
this.
This
test
does
not
fully.
You
know
if,
as
you're
going
through
the
pr
you'll
have
to
read
the
comments,
because
this
is
actually
not
a
legitimate
update.
D
A
real
update
would
rewrite
the
initial
add
file,
and
I'm
not
doing
that
right
now,
because
that
requires
like
reloading
the
initial
file
scanning
for
only
the
rows
that
have
a
change
in
them
and
rewriting
a
new
file
that
contains
all
the
unchanged
rows
and
the
changed
rows
together,
removing
the
old
ad
and
then
including
the
the
new
ad
that
includes
the
rewritten
rows.
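The rewrite flow Chris describes can be sketched abstractly — a hedged illustration only, with rows reduced to plain integers, the predicate passed as a closure, and the file naming purely hypothetical:

```rust
// A "real" update: read the old file's rows, write changed and unchanged
// rows together into a new file, and pair a remove (old file) with an
// add (new file) for the delta log.
fn rewrite_for_update(
    old_path: &str,
    rows: &[i64],
    update: impl Fn(i64) -> i64,
) -> (Vec<i64>, (String, String)) {
    // Apply the update to every row; unchanged rows pass through as-is,
    // so the new file holds the complete replacement for the old one.
    let new_rows: Vec<i64> = rows.iter().map(|&r| update(r)).collect();
    let new_path = format!("{}-rewritten", old_path);
    // (rows for the new file, (path to remove, path to add))
    (new_rows, (old_path.to_string(), new_path))
}

fn main() {
    let rows = vec![1, 2, 3];
    // Update: double only the row with value 2.
    let (new_rows, (removed, added)) =
        rewrite_for_update("part-0.parquet", &rows, |r| if r == 2 { 4 } else { r });
    assert_eq!(new_rows, vec![1, 4, 3]); // unchanged rows survive alongside the change
    assert_eq!(removed, "part-0.parquet");
    assert_eq!(added, "part-0.parquet-rewritten");
}
```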
D
So
I'm
shortcutting
that
just
for
the
sake
of
the
test,
but
yeah
it's
when
I,
when
I'm
looking
at
the
reference
implementation,
I
I
really
like
the
layered
approach
they
have
where
they
have.
The
transaction
is,
is
very
decoupled
from
the
from
the
log
itself
and
then
the
commands
are
also
are
decoupled
from
either
of
those
and
they
kind
of
use
them
in
a
compositional
way.
D
So
you
know
you
have
this
concept
of
an
update
command,
that
that
creates
actions
and
writes
data
as
it
needs
to
and
then
and
then
commits
the
transaction
in
one
in
one
shot.
But
I
mean
I
don't
know
I
don't
know
how
closely
we
really
want
to
match
the
reference
implementation,
I'm
not
trying
to
match
the
reference
implementation,
but
there
are
some
good
ideas
in
it.
B: Cool, no, I follow. I have a question: is the PR ready for review? I saw that it's still in draft state — so, sure, should we start reviewing it now?
D
Yeah,
so
I
put
it
in
draft.
I
think
the
only
thing
that
I
absolutely
want
to
do
before
before
I
consider
it
mergeable
is
at
least
just
want
to
rebase,
because
I
got
a
bunch
of
sloppy
commits
in
there
so
I'll
do
that
real,
quick
and
I'll
put
it
in
official
pr
status
there's
and
we
can
go
from
there
yeah.
So
I'll
just
do
the
rebase,
but
that's
not
going
to
change
any
of
the
code,
so
yeah
feel
free
to
start
reviewing.
B
Cool,
I
think,
all
of
those
discussions
we
just
had.
We
can
have
that
in
the
github
yeah.
You
are
perfect
with
the
context
of
the
code.
That
will
be
a
bit
more
helpful.
I
understand.
B
Yeah,
I
think
that
looks
pretty
cool
to
have
media
demo
on
the
right
support.
D
All
right,
I
think,
that's
everything
I
need
to
talk
about
here
since
we'll
just
carry
this
on
in
the
pr
discussion,
any
any
other
things
we
need
to
talk
about
before
we
adjourn.
B
Not
from
that
side
yep
not
from
my
end
anything,
I
guess
yeah,
I
think,
that's
all
then
cool
right,
I
can
think
tyler
and
then
we
can
end
the
stream
awesome.
Okay,
all
right
thanks.
Everyone,
bye
cool
good
doctor
in
two
weeks,
bye.