From YouTube: .NET Design Review: DataFrame
Description
We're looking at a new API for DataFrame. https://github.com/dotnet/apireviews/blob/master/2019/10-08-dataframe/README.md
So if people are familiar with pandas, then this is like that; and for those who are not familiar with pandas, that's what the DataFrame is. It's basically tabular data that is in memory, but you can execute operations on it. So, like a binary operation: you can do, like, column A plus column B, and that would give you the sum column, right.

The DataFrame has a backing store in the Apache Arrow format, which means it just stores bytes at the back, and there's a standard representation for float and int and so on. Which means that if you have a column of, say, ints, and then you add a column of floats, the result, the compiler knows, is a float; so you need to make a new column. You cannot, like, update the int column itself in place, because the data type would change, right.

Related to the data space, it's a way for people to explore a data set that they have. So if you have, like, a CSV or something, you can import the CSV in a notebook and you can explore the data. So, for example, say that you have a column that you want to investigate: you can do a correlation of column A and column B, and that would be an API that gives you access to a column's data, so you can write the correlation function.

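In pandas terms, the exploration workflow just described looks roughly like this (a sketch; the CSV content and column names are made up):

```python
import io
import pandas as pd

# Stand-in for a CSV file you would load into a notebook.
csv = io.StringIO("a,b\n1,2\n2,4\n3,6\n")
df = pd.read_csv(csv)

# Column-level access is what lets you write things like correlation.
r = df["a"].corr(df["b"])   # Pearson correlation of column a and column b
assert abs(r - 1.0) < 1e-9  # b is exactly 2*a, so they correlate perfectly
```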
Whatever computation you want to do, stuff like that. But it doesn't have any dependencies on either ML.NET or System.Data; not on System.Data, and from ML.NET there's no dependency. Right now it only implements IDataView, which is its own... yeah. So, like, it's okay, so it's, yeah, it's independent! It's related.

It's important for two reasons: one is you can do zero-copy wrapping, and two is, we expose, you'll see later, but we expose APIs to get at the actual buffers. And so the Arrow format is a columnar format, right. So all the data is stored in columns, and so the idea is you can do SIMD operations on this data, which you couldn't do in, like, a row-based format, right.

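A small NumPy sketch of why the columnar point matters: with each column stored contiguously, one vectorized (SIMD-friendly) kernel handles the whole column, where a row-based layout forces per-row scalar work (the data here is hypothetical):

```python
import numpy as np

# Columnar layout: each column is one contiguous buffer.
col_a = np.array([1, 2, 3, 4], dtype=np.int32)
col_b = np.array([10, 20, 30, 40], dtype=np.int32)
col_sum = col_a + col_b              # one vectorized pass over contiguous memory

# Row-based layout: the same data as tuples forces scalar, per-row work.
rows = [(1, 10), (2, 20), (3, 30), (4, 40)]
row_sum = [a + b for a, b in rows]

assert col_sum.tolist() == row_sum == [11, 22, 33, 44]
```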
Question, going back to IDataView: so pandas allows random access, right, and IDataView, I think, is built only for forward, kind of streaming, access to data. And so, were you able to, even though you based your store on IDataView, were you able to overcome this, so you can do random access?

And the primitive types that the column can hold are, like, the primitive types that we have in C#, like int and float, and those are mutable. And then strings have two columns, a StringColumn and an ArrowStringColumn. The ArrowStringColumn is immutable; the StringColumn is mutable. So that's the whole point: the data we can represent in Arrow.

So when you say Head, that would mean the first five rows is what Head gives; Tail would give the last five rows. So, basically, from lines 170 to 180 you're extracting data; those are all the indexers, or the get-value indexers. Line 183 is a set-value indexer; this means change the value at that index to 1000.

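The pandas equivalents of the Head/Tail and get/set indexers being walked through (a sketch with made-up data):

```python
import pandas as pd

df = pd.DataFrame({"x": range(10)})

assert len(df.head()) == 5     # Head: the first five rows by default
assert len(df.tail()) == 5     # Tail: the last five rows

value = df["x"][3]             # get-value indexer: read the value at an index
df.loc[3, "x"] = 1000          # set-value indexer: change the value at that index

assert value == 3
assert df["x"][3] == 1000
```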
The derived type is PrimitiveColumn<T>; so that's what I'm doing. The reason that is important is that PrimitiveColumn<int> knows that it holds integer data, so in the APIs that you use, IntelliSense will tell you that the return type is an int, and the data is an int there. If I index into the DataFrame itself, everything would be object, because that's the base column.

We're doing that because they're different target use cases, right. That IDataView is, it's extremely, like, the least common denominator of what tabular data could possibly do, right. By the least common denominator, I mean it describes to you what the columns are, and it allows you to get a cursor, a forward-only cursor, over the rows, right. That's all you can do with IDataView; those are the only operations it can possibly do.

Like, that's its only goal in life, because the idea being you can implement that with all sorts of different data, right. Some could come from Azure blob storage, from a file, from a database, all sorts of different places, right. Whereas a DataFrame is very opinionated about how its data is actually stored, and since it's opinionated about how its data is stored, it can allow a much greater range of operations on the data.

I think it would be good if we ended up with these two technologies being rationalized, so that, for somebody who wants to deal with tabular data, they look like a single, coherent feature, with, maybe, you know, different APIs and abstractions optimized for slightly different scenarios, but one feature. I worry whether we are not, instead, on track to come say, well, IDataView has these limitations, and in ML.NET we're gonna create almost, like, you know, another one.

There is also another rationalization point: IDataView doesn't provide random access, and the target goal here is to mimic the pandas DataFrame, which allows you random access, so you can access any data in any order you want. Whereas with IDataView you have to go, like, scanning from the beginning to the end all the time.

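The forward-only versus random-access distinction can be sketched in plain Python: an IDataView-style cursor is like a one-pass generator, while a DataFrame column is like a list you can index in any order (the names here are illustrative only):

```python
def forward_only_cursor(rows):
    # IDataView-style: the cursor only ever moves forward.
    for row in rows:
        yield row

cursor = forward_only_cursor([10, 20, 30])
assert next(cursor) == 10
assert next(cursor) == 20   # revisiting 10 would require rescanning from the start

# DataFrame-style random access: any row, in any order, as often as you like.
column = [10, 20, 30]
assert column[2] == 30
assert column[0] == 10
```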
No, I don't think that that is the goal, and I understand you can add all sorts of things on top of the APIs. I'm just saying: can we end up in a situation where, for users who want to use any of these technologies, they basically appear as different features in a single technology, not two competing technologies?

But DataFrame is already implementing IDataView. What it's already implementing is, for example, it can be treated as an IDataView, but it's not the other way around; you apply operations to the tables differently. Not if you only have it as an IDataView. Yes, but that direction is one way; the other direction doesn't seem to be good. I mean, it's a specialized type that has specialized operations.

A specialized type is good; you could think of it as an array versus a span, right. I mean, they're not completely orthogonal, where one is a strict subset of the other, and so you end up with a world where certain operations don't make sense; you have to layer it in terms of, like, a concentric-circles kind of thing, right. So.

You said it was very important, the format that it's exposed in; the format, it's an input type. It is very transparent over the Apache Spark stuff. I mean, this is Spark, and... no, the Apache Arrow stuff, which is different, sorry, from the Spark stuff. Sorry, Apache Arrow, okay, yeah. There's a data format that this is entirely dependent on; this is just a representation of that data format. This is not data in general; this is Arrow, or whatever it is.

I mean, I would separate the particular implementations from the use cases, right. In the same way that, you know, DataSet had hard dependencies on certain XML formats. But the question is: what interaction model are these things used in, how are we promoting this, and what is our design really promoting?

If you want to do something in an interactive fashion, similar to a workbook-style thing, then, yeah, you basically have to think about what that means for the user experience. But the fact that it also mirrors somebody else's format, I think, is not the high-order bit of it. At that point it just becomes: what's the use case for the API, and does the namespace make sense for that use case? I don't think it should be Microsoft.Data, because I think that is way too...

You know, it simply implies a commonality that just isn't there, right. It's a way more specific thing. But the right understanding is that it is general-purpose from the point of view that you can load any data into it, you can look at any data, you can transform it in any way you see fit, and then, you know, over time you'll probably also have multiple ways you can export this data into some shape, be it a CSV file or some, you know, other shape. But I think they...

I just mentioned the namespace, kind of, you know, as an illustration. I think external customers will see these two features as related; usually we communicate that by, hey, you know, they belong together, maybe not in exactly the same namespace, but kind of, like, you know, the same area, operations that look, you know, similar. I don't know, like to ML.NET. There is no API to convert one to the other, correct me, but, yeah.

No, it's not at odds, because, yeah, there's another aspect: in a notebook, at least in the examples that I've seen, you always want a copy in all these places. So you did something, you don't like it, you can just go back, like press up-arrow, and you get the previous state that you were in. If you were mutating in place, then it's gone. So it's more an exploration style; you don't know if that's the end product you want.

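The copy-not-mutate exploration style being described is the same one pandas encourages in notebooks; a sketch:

```python
import pandas as pd

df = pd.DataFrame({"x": [1, 2, 3]})

# A non-mutating operation hands back a new frame...
df2 = df.assign(x=df["x"] * 10)

# ...so if you don't like the result, the previous state is still there.
assert df["x"].tolist() == [1, 2, 3]      # original untouched
assert df2["x"].tolist() == [10, 20, 30]  # new state
```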
At the moment it's completely separate. The way I understand it is: you take the data that you want, I mean, you take your CSV file, your source, and you do all the exploration you want. So you see if this column is actually correlated to the other one or not; do all the operations you want, like add columns, do projections and stuff like that; and come up with a set of operations that you like, so that you know that this is now...

And this is, I mean, one of those, I guess, concerns I have: if we don't have multiple consumers, if you only have one consumer, then you're always running the risk that the thing you're building is very specific to that one scenario. Versus when you try to create something that is more, you know, general-purpose, you kind of need more than one consumer to really make sure that you actually have something general-purpose, and not something that is, you know, just part of my data plan, right.

Yeah: one is data exploration, specifically in a notebook, right. In a Jupyter notebook I want to load up some data and manipulate it, right, analyze it, chart it, maybe modify it. That's one complete use case. A second major case that we're seeing is in Spark: there's an operation called a UDF, right, a user-defined function, yeah.

The input and output of these UDFs is an in-memory tabular data set, right, which would be this DataFrame. And so, in a user-defined function, right, you have all this data coming from Spark and you want to do an operation on a chunk of it. There you're not, like, exploring the data like you would be in a notebook, right; there you're actually taking that data and manipulating it somehow, whatever the UDF does to the data, right, whether it's adding these two columns or whatever. But...

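A minimal sketch of the UDF shape being described, in pandas rather than Spark (the function and column names are made up): the UDF takes an in-memory tabular chunk and returns another one.

```python
import pandas as pd

def my_udf(chunk: pd.DataFrame) -> pd.DataFrame:
    # A user-defined function: manipulate one chunk of the larger data set.
    out = chunk.copy()
    out["c"] = out["a"] + out["b"]   # e.g. "adding these two columns"
    return out

# The engine (Spark, in the discussion) would invoke this once per chunk.
chunk = pd.DataFrame({"a": [1, 2], "b": [10, 20]})
result = my_udf(chunk)
assert result["c"].tolist() == [11, 22]
```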
I think that's, I mean, it was originally the idea; that's similar to, you know, what IDataView was, it was the phrase we used. So I think this was the same desire. That's why I'm saying: depending on where we land with the number of consumers, that may or may not make sense, right. This is why, I mean...

I'm not sure that we're going to have something that should be general-purpose until we actually sit back, look at what things like Python and other languages do, and start supporting combining localized compute kernels, properly optimized, so that you don't walk memory again and again and again. Because if we're not doing that, we're always going to be light-years behind what other frameworks and languages are doing. And also...

I would say, you know, like, we may say we messed up on IDataView, and the ML.NET pipeline is not it, but I thought that was the idea. So if we think that the usability of ML.NET is not great, that it doesn't support random access, would we think about trying to fix ML.NET to support these additional scenarios? Because I can...

You know, like, ML.NET started as forward-only and lazy. I don't see a reason why, for, you know, smaller data sets, data sets that fit in memory, and various other, you know, scenarios, you would want different kinds of implementations of some very common abstractions, and then the operations. I completely agree that today the ML.NET pipeline is not super easy to use. But that's another thing: would we consider trying to fix it first, if we were working on, actually, you know, the same idea?

I mean, for ML.NET the question is really, like, how much mirroring will actually be necessary. But if you have one model that is all about, you know, creating new instances, and it's lazy, then without duplicating the entire API surface it might be very hard to have a model like that; you almost need different kinds of wiring-up at that point. But I think it's a point well taken, right. I mean, like...

We know that when you try to explore models, and you start with small data sets, the pipeline model is very hard to explore, because you basically have to run the whole thing, debug in one pass, then change code and rinse and repeat. Versus in, you know, other systems that have DataFrame-style APIs, you can set breakpoints, you can introspect intermediate results and things make sense to you; I mean, you can just mess with stuff.

Implementations are lazy, and there are some APIs that, kind of, you know, support the lazy scenarios, which you can implement, right. But the IDataView is an eager API; the forward-only thing is separate. Random access it doesn't support, but it's not fundamentally lazy. Correct me if I'm wrong. I mean...

This direction is great, literally no concerns here. I have a bit of a concern about the operations working differently: operations from ML.NET and operations from this feature will basically not work alike. You said that the reason it was done this way is that ML.NET is not super easy to use, to apply those operations. I agree with that. I wonder whether we kind of gave up on trying to make it easier.

Yes, maybe, all right. However, let's focus on this API here. Because, I mean, maybe a source API is easier to use than this API, but first let's get a handle on what that API is, and then maybe we can later decide: should this API be used in other places? If not, then let's pick a different name; if yes, then let's pick it. Right? Again, like, so what you have here is the modification part of it, right?

Right, yeah, that's right. So, yeah, ignore that; if that is not fixed, is this supposed to work? Yes. What is the intention? The intention was: I did not want two columns with duplicate names in the DataFrame, because then, when I say dataFrame[columnName], if there are two columns with that name, I don't know which one to return. So I was throwing.

So the bug was: if I replace the column with another column of the same name, I was preventing it, but I shouldn't, because that means I'm replacing one column with another column of the same name, so it doesn't matter. I see, but you're passing here index 2, right. If you pass a different index, that would throw, because that means you're creating duplicate column names, yeah.

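The rule being debated (duplicate names are rejected, except when replacing a column at its own index) can be sketched in a few lines of Python; everything here is hypothetical scaffolding, not the actual implementation:

```python
class ColumnCollection:
    """Toy model of a name-unique column collection."""
    def __init__(self):
        self.names = []

    def set_column(self, index, name):
        # A duplicate name is only allowed when replacing at that same index.
        if name in self.names and self.names.index(name) != index:
            raise ValueError(f"duplicate column name: {name}")
        if index == len(self.names):
            self.names.append(name)    # appending a new column
        else:
            self.names[index] = name   # replacing a column in place

cols = ColumnCollection()
cols.set_column(0, "a")
cols.set_column(1, "b")
cols.set_column(1, "b")      # replacing a column with one of the same name: fine
try:
    cols.set_column(0, "b")  # different index: duplicate name, so it throws
    raised = False
except ValueError:
    raised = True
assert raised
```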
Basically because otherwise... the slowest possible thing you can do on a computer, outside of, like, network access, is memory access, yes. And so you're changing something that could be walking 100 megabytes once into walking, you know, 2 gigabytes, right, doing 20 operations. Okay, so...

We don't have any plan to do that, because there's two reasons. The first one is, like, on a notebook, if you say dataFrame.Add(something), and then on the next... so if you say dataFrame.Add().Divide(something), there's no way for me to differentiate which one you did, because I'm not building up the set of operations that you're doing. So, like, I don't know that you're doing an Add followed by a Divide, so I can't combine those two together.

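A NumPy sketch of the trade-off just described: with eager, one-operation-at-a-time evaluation, every step materializes an intermediate column and walks memory again, and because the library never sees the whole expression it cannot fuse the Add followed by the Divide into one pass:

```python
import numpy as np

a = np.ones(1_000_000)
b = np.ones(1_000_000)

# Eager evaluation: each operation is a separate pass over memory.
added = a + b            # first pass; materializes an intermediate buffer
result = added / 2.0     # second pass, over that brand-new buffer

# A fused kernel would compute (a + b) / 2 in a single pass, but that
# requires recording the expression instead of executing it immediately.
assert result[0] == 1.0
```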
That requires people to go and write their own, basically, compute kernels that do add-and-divide combined, and so you've got an explosive number of expression trees that becomes infinite, rather than having a system that understands them inherently and combines them itself, just like Python knows how to do, right. I just had one caveat.

Back to the compute-kernel stuff: another thing in the Apache Arrow project is exactly that. There's a thing called Gandiva, which is, like, basically compute kernels built in C++ on top of Apache Arrow data, right. And so another plan here is, since we're building on top of Apache Arrow, in the future we should be able to take advantage of any new capabilities that come to the Apache Arrow realm, space, whatever you want to say.

Those were good ones, though. Actually, if you go back to that test, right, like DF, DF... I'm on line 317, right: doing the equals-equals on two columns compares each value in the column and then gives you back another column full of booleans, yes, that is true or false, whether those values were equal or not. Yeah.

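This is exactly the pandas behavior the API mirrors; a sketch:

```python
import pandas as pd

c1 = pd.Series([1, 2, 3])
c2 = pd.Series([1, 0, 3])

# Elementwise ==: the result is a whole column of booleans, not one bool.
mask = c1 == c2
assert mask.tolist() == [True, False, True]
```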
That one is the only one where, I'm, you know, I'm not sure that's such a good idea. I mean, generally speaking, we do plus, and I think it's fine; the return type is within the same domain, that kind of makes sense. But for equality comparisons... the thing is, I mean, it's not impossible to do reference-equality checks, but now doing object or reference equals, or the typical null-check thing... I think people will find that confusing.

Sure, but our general guideline is: don't be cute with operator overloads. The equality operators return a boolean; the inequality operators are all boolean operators. It's just, don't be cute with operators; that's how it's worded. So what would the suggestion be, then? Just a static method that returns a new column that is the equals mask between A and B? Yeah, but then, you know, how will you know the name?

The one thing that everyone remembers from the SIMD stuff we reviewed at some point: it wasn't so much about operators, it was specifically about equality. Like, equality is hard, and people have a very, very pre-canned understanding of what they think that equals-equals does, and I think returning anything but boolean will conflict with that. At the same time, though, you expect people to be able to overload equals, not to mess with the return type, but to just change the way you do equality.

You know, you could look at concat and other operators, right, and I think with plus and minus people kind of have an intuitive understanding already that the types may change. If you multiply an int and a float, you don't get back an int, you get back a float; so people understand that for arithmetic, yeah, the return types may differ, and they don't tend to use those in if-checks, right. Versus every time you use equals-equals, it turns out, you expect a boolean from the operator.

I think there's more to it. I think operators work better in domains where there is an expectation that operators would work. So, in, you know, the domain of numbers: if something claims to be a number, you know that it's going to support plus, minus, multiply, and this and that and everything, right. Here, I also worry, like, how will people know that a column has a plus operator that takes an int? It's really nice, you know, it looks very nice once it's written, but operators are not...

They don't show up in IntelliSense. That's right, and it's a bit surprising that, you know, I can add one to a column. A second thing I think is a bit unfortunate here: when operators work everywhere, it kind of looks nice, like math, you can add two things. But here it's like a mixture with methods that are not really named like methods, like, for example, a column that takes an int.

I think we're worrying about the average user; I buy that it's the same thing. We solved things like object initializers, right: if you look at how people discovered those in our JSON APIs, they usually didn't write them cold; either they had an expectation it's there or it's not. But once you know it's there, it makes your code really nice to read, right. So the thing is, I think it should be generally true that for every operator we would have a method that does the same thing.

So if you just browse IntelliSense, you see those methods, and then that's the way you would do it. And at some point you see somebody's code where you see, oh my god, they used the super-compact notation, and it's super nice. And so I think, in that sense, I don't think we need to take something away from people because, you know, they don't expect it; but at the same time, I think when we give something to people it has to be self-consistent.

I think, to what Eric said, in Python a lot of people do that, and it is part of the reason why the DataFrame is successful there: because it is fairly compact, yeah, and because you can do relatively complicated things in what looks intuitive. You can look at a few lines of code and learn what it does; versus if you look at vector multiplication in C, where you don't have operators, it's very hard for you to visualize what it does.

There are limits, like, what am I getting away from Python? Thank you. I think, to me, it's not important that feature X in .NET is the same as feature X in Python, because that's not how the world works. You don't just move to .NET for this one feature, you know; I have to absorb the rest of it. Do you have to access files?

You have to deal with the compiler, with the IDE; there's a whole different ecosystem. So once you're in that ecosystem, it is super important that things are self-consistent, and so that's why I think equality is the thing you really can't mess with. Because it is already very complicated in .NET: we have value types and reference types, you have different semantics. And so if you now also blend in the fact that, yeah, in this one feature we also completely redesign what the expectations are for equality, I think that would be fairly bad. Yeah.

We would be making a feature for Python developers and not for .NET developers. But, I mean, it comes back to my original question: why would Python developers... yeah. So in our design guidelines we say equal-equal is the same as IEquatable, and the inequality operators are the same as IComparable; that means that they are boolean operations, right.

I get all that. The only pushback I have is: for now, you've only been doing column plus one. But if you chain operations together, then you could have operators everywhere, except where you want equals-equals, where you have to say .Equals, and then you have, again, just operators. It doesn't look...

So don't get me wrong, right, I hear what you're saying, and I think this goes back to Eric's point earlier: should we remove all of the operators? To me this is kind of like throwing out the baby with the bathwater. But, like, the one thing you need to think about is: how often do you do plus, you know, concat, multiplication, versus equality? I would assert you probably don't compare columns that often, compared to the other modifications you make, so is it worth giving up the compact syntax?

The indexer? Yeah, but if that's reducing them, as long as you get back... that's a filter operation, and not, like, not returning a column that contains the answer of the equality call, yeah. Like, these are, again, a different operation; you could say this is what I want less-than to do. And, in fact, the actual rule that I thought I was paraphrasing as "don't be cute with operators" is actually "DO NOT be cute with operators", so good job, Chris.

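The filter operation referred to here, in pandas terms (a sketch): indexing a frame by a boolean mask selects the rows where the mask is true.

```python
import pandas as pd

df = pd.DataFrame({"x": [1, 5, 2, 8]})

mask = df["x"] < 4     # elementwise comparison yields a boolean column
filtered = df[mask]    # indexing by that mask is the filter operation

assert filtered["x"].tolist() == [1, 2]
```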
And note that overload resolution depends on the expression ordering: if you're using methods for add, subtract, multiply, divide, you have to be very explicit and very careful about how you order those expressions, yeah. You don't get the compiler just saying, oh, I know multiply comes first, so I'm going to do that, even if it's over here in your expression tree. Yeah, I...

I was not saying that C# is the best language for these scenarios. Well, I mean, and you take what you can get in C# here: it's going to be more than just operators on those tables, and therefore, sooner or later, you end up in a .NET world which is full of functions that have names, and they're all in IntelliSense. What do you mean, you had a...

The thing you have to think about is how this will feel natural even if you're a C# developer: how do you get the same benefits of a compact notation, where you can do fairly complicated, you know, data transformations in a way that feels, you know, both intuitive, easy to read, and also easy to write. I think that everybody...

The remaining thing there is, then: if there is a place where people want and need to be able to do equalities which return a mask, and C# doesn't have support for that today, then is that something that we need to talk with the C# LDM team about, to see if they can add new operators for that? There's already a proposal for a new power operator; so maybe we need something that says, here's equality and here's comparison. The argument that I've heard multiple times during this

session is that this is not intended for typical .NET developers; this is intended for data scientists who aren't otherwise steeped in the .NET world. And that also comes back to the point that we spent a while, you know, beating on earlier, which is: does this belong in CoreFX? If that's the target audience, I would say the answer was no. If that's the target audience, I mean.

One of the things that usually happens a lot, and we hear this from a lot of people, is: there are tons of code and algorithms from papers available in Python, and people try to copy or move this code over and use it. So I expect also that people who use .NET will look at Python code, because there is a rich, large amount of Python code, and try to convert it to C#; and the easier we can make that, the better we can show them how to migrate.

And I don't think that, you know, even though this might be primarily targeted towards people who have some ML background and stuff... one of the reasons people like Python and are able to do it is they might be trying to add some minimal ML to their app or something like that, and Python is very easy: you just open a command line...

You start typing code, like you would math or anything else you already know, and you can get results, because it's familiar to people. Even if they don't have that data or ML background, they just type math and they get math back, and it works. And that's basically all this is: data frames, vectors, tensors, they're all just well-defined mathematical types with well-defined operations. So if you understand math, you can type math and you get math back.

Equal... like, the inequality operators, and the equality operator, which is a special case of an inequality operator, are a single boolean, and we have that. It's actually mostly implicit in the guidelines, but the two specific things we have are, like: IComparable translates to the classic inequalities, and IEquatable translates into equal-equal and not-equal. So...

Those are single booleans, and that is the defined behavior we have in .NET. And the whole point of API review and the Framework Design Guidelines is that .NET feels like .NET, and not like this thing that came out of, inspired by, Python feels like Python. It needs to feel like .NET; that is the number one rule in .NET. We are not...

Don't get me wrong, like, I really agree with what you just said. At the same time, though, I think the goal is not to take a Python concept and import it to .NET, right. I think the goal is to say: there is something that is really popular and successful in the Python world, for a particular sort of characteristics. And so the question is, if you were to build something that has similar characteristics in .NET, what would be the .NET way to do that?

Right, and I think we just discovered that for operators it is harder, because people like the way strongly-typed systems like C# or Java work: there is an expectation for what certain operators do, right. At the same time, though, I think there is a desire to say, can we find a way to do these things in a compact fashion?

I don't know what that would look like, but, for example, one thing I could see you doing is: instead of saying we have these operators on the column type or the DataFrame type, which is the thing that people will pass in and out of methods, and thus probably want to do null checks against, it's different from saying you have a method that takes a lambda; the lambda has some funky arguments, some types, and those have to have certain operators defined for them...

...for you to express what that condition looks like, right. And for those things, maybe we don't care that we have a double-equals, because those are not the things you're actually passing to a lot of methods. But I think the question is: can we create a compact notation to do column-based transforms, and not create a world where people, when they write argument validation, are surprised that double-equals doesn't return a bool anymore, right. So I think that, to me, is more, like, you know...

Okay, so a named method Equals returning a column and stuff: I think it needs to be something like EqualsMask, because, okay, well, you already have them at the name Equals, and anyone who has an IEquatable or IComparable or any other .NET background will, by default, assume GreaterThan returns a bool. So you need something like GreaterThanMask. Even if it does not implement IEquatable, I still think the default assumption is the usual one.

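For comparison, pandas resolves the same tension by pairing each comparison operator with a named elementwise method (eq, ne, gt, and so on), which is the kind of named "mask" method being proposed here:

```python
import pandas as pd

s1 = pd.Series([1, 2, 3])
s2 = pd.Series([1, 0, 3])

# The operator and the named method do the same elementwise comparison.
assert (s1 == s2).tolist() == [True, False, True]
assert s1.eq(s2).tolist() == [True, False, True]   # the named spelling
```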
You can't really understand what the flow of what you're getting is, and, worse, you have an expectation that is wrong, right. Like, when var leaves your computation ambiguous, that's okay; when you have concretely decided on the wrong answer, now that's when you get confused in reading the rest of the code, I guess.

Thank you. Also, like, another argument: we also don't tend to put operators on non-sealed types, because you can override and change the behavior, and the operator is a static function. Well, but the operator is just supposed to defer to the named method; that's our guideline up there, anyway. They're...

A "Base" suffix... but a "Base" prefix would be all sorts of fine, yeah. If it's a usable type by itself, then it should just be DataColumn or something, and then you have specialized types beyond that; that's fine. But, like, we call it Collection<T>; we don't call it BaseCollection, even though it's mostly only ever used as a base.

If somebody is already, you know, in this world... then, yeah, that's another part of the design. Because we do tend to say: if somebody who already has domain-specific knowledge would see that type name and say, oh, I kind of know what it does, then it's a decent type name, as long as somebody who doesn't have the domain knowledge wouldn't believe it means something it isn't. So if it's overly generic, we should... well.

An issue, a direct issue with this, is that Spark .NET has a concept called a DataFrame, right, and it's like a distributed DataFrame, right. Like, it can represent data where, you know, some of it is on this machine, some of it is on that machine, some of it is over here where you're running, and it represents all of the data across all of this. And since we actually want to use this type in the context of Spark .NET, in those UDFs that I talked about earlier, the names directly collide, right.

A
Yeah, I wasn't seeing all this for a second, because this goes back to what our principles are all about: let's make one feature, or one technology with multiple features, rather than competing technologies. But again, I don't know; I don't see enough reason not to pick a different name than DataFrame. I think once you talk about how this fits in with other things, maybe we'd have a different conversation about that, but...
A
G
B
J
I
I
R
B
A
M
E
J
Our general guideline is: if you always return the same type, someone is going to depend on that and they're going to cast it, yeah. So only return the interface from a property or a method if you actually return multiple different things behind that interface; otherwise you will break someone when you change it.
J
In System.Linq we have a .ToList extension method, and it returns a List of T. Okay. Here we could return a read-only list.
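The ToList point, shown with the real BCL APIs: LINQ's ToList hands back the concrete List&lt;T&gt;, not an interface, because it is purely producing data.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

class Demo
{
    static void Main()
    {
        IEnumerable<int> source = new[] { 3, 1, 2 };

        // System.Linq.Enumerable.ToList returns the concrete List<int>,
        // not IList<int> or IReadOnlyList<int>: when you are only
        // producing data, return the most specific collection type.
        List<int> list = source.ToList();
        list.Add(4); // the concrete type makes mutation available
        Console.WriteLine(list.Count);
    }
}
```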
J
J
J
J
B
A
B
J
R
R
J
If what you return could vary, then the IList would make sense. But since you are only producing data and it's a collection, you should return the most specific collection type you can.
B
P
C
J
R
Q
J
I mean, assuming this is called DataFrame and that's called DataFrameColumn, then that seems fine. Okay, these days I tend to make tree-like types like that not point back up, because it gives you a lot more flexibility, right. If it's not important that a column know about its DataFrame, then don't tell it. And especially: what happens after you get one and you remove it from the DataFrame, what universe are you in now? So don't add the back-pointer unless you need it.
Q
J
J
L
J
Q
Q
J
Q
R
J
M
M
J
R
B
J
E
E
I
A
R
R
E
E
L
B
B
K
F
J
Yeah, this doesn't feel right: the indexer is not getting you a smaller type, which really suggests it's not an indexer. I mean, you're getting less data, but I can't think of a type off the top of my head where the indexer returns the same type. Is it a common scenario that you already know the row indices you care about, and you just want to select those very particular indices? I think you guys were going...
B
J
J
I think both of the indexers that are returning a DataFrame should be methods and not indexers. While you are reducing the amount of data that comes out of the operation, you're not getting to a smaller and smaller type, the way List of T indexes down to a T and string indexes down to a char, because a string is really a list of char.
J
P
J
B
Q
B
A
A
M
B
J
J
M
B
The BaseColumn one: if you remember, the equality operators, before we had that conversation, returned a predicate column. You can just take that boolean column that you got back from the comparison operators and pass it into this indexer, right. Again, it's about writing compact code, right. Yes.
E
J
B
A
A
Maybe that's why we react this way. If you have an expectation of what these things do, then having compact code is obviously charming, but I'm wondering whether people would look at those signatures and be able to tell what these things are doing, or whether they'll guess clearly the wrong way, or have a lot of difficulty telling what these things will do. Because...
J
Something like this can always be added later, once there does seem to be a feeling that a lot of people are asking for it, because a lot of people feel it's natural and so on. You can always add things later; you can't ever take things away. So for these two, I look at them and I say: they're not doing the same thing as the other indexers around them, and they're not building on something like Range, where we have the same pattern throughout the entire framework.
J
This is a different concept. One of these would be, like, SelectRows, and another one would be, like, Filter; those are two operations, and they're different. So we shouldn't be using the same syntax for "give me this row", "give me a thing that is logically the set of rows at these indices", and "give me rows based on a predicate value". They should just get different names, so you help clarity; and again, you can add sugar over time.
E
J
E
S
B
So, in this case, what this is doing (can you keep the line breaks? well, whatever): housingData in this case is a DataFrame, right. What this chunk of code is doing is splitting 10% into a test data frame and 90% into the training data frame. The Shuffle method just takes an integer array and randomizes it; basically, don't worry about that. What randomIndices gets you is a random permutation.
B
You know, of all the indices from zero to the count of data that you have, randomized, right. So lines 16 and 17 are saying: take the test size of those random indices and split them up into two arrays. And now the key is lines 19 and 20, right: from housingData I can just say select out the train rows, and on line 20 I can say select out the test rows. So...
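The split being walked through can be sketched as below. Shuffle, SelectRows, RowCount, and the indexer form are stand-ins for the code on screen and the API under review, not confirmed signatures:

```csharp
using System.Linq;

// Randomize all row indices, then carve off 10% for test ("lines 16-17").
int rowCount = (int)housingData.RowCount;
int[] randomIndices = Shuffle(Enumerable.Range(0, rowCount).ToArray());

int testSize = rowCount / 10;
int[] testRows  = randomIndices.Take(testSize).ToArray();
int[] trainRows = randomIndices.Skip(testSize).ToArray();

// "Lines 19-20": select rows out by index. The review argues a named
// method reads better than the square-bracket form for this.
DataFrame train = housingData.SelectRows(trainRows); // vs housingData[trainRows]
DataFrame test  = housingData.SelectRows(testRows);  // vs housingData[testRows]
```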
J
J
Looking through the var here is good, because it's testing what you think the expression does. I'm good up through line 17; I think I have a good idea what it's returning. Line 19, to me: I see housingData indexed by a thing. I don't know why it's taking multiple rows, but to me it's returning, like, one thing. So train must be a row, or maybe it's a column. But, like...
B
J
J
P
J
E
M
J
But I mean, you're designing an API for something that is intended to hold potentially more than two billion rows, and now you're saying that this particular indexer can only access up to two billion. So I just think line 20, if it said housingData.SelectRows(trainRows), would be way more clear than using the square bracket notation.
B
J
For one, this is Range, the type Range that we added. For better or worse, we came up with guidelines for what it does, which is that it does return the T. A lot of people were upset with that, which is why I deleted that fact from my brain. So this is a pattern that we already have, versus inventing a new thing based on IEnumerable that has a similar feel; again, we'd be inventing a concept.
P
J
B
B
J
A
A
Have a named method, or make it very clear in concepts, right. So, for example, we don't represent a Range as two loose ints. It seems reasonable to me to say you have an overload that takes a long, one that takes an int, and an overload that takes a Range, and there's not really a conflict in what they do, because they're completely different concepts. The fact that they are backed by similar types is almost irrelevant.
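The overload set being argued for might look like the sketch below. The type and member names are illustrative (GetRow, Slice, and DataFrameSketch are not proposed API); the point is only that int, long, and Range coexist as distinct concepts:

```csharp
using System;

// int, long, and Range overloads do different jobs, so having all
// three is not a conflict even though Range is "backed by" ints.
public abstract class DataFrameSketch
{
    public abstract object GetRow(int index);   // convenient common case
    public abstract object GetRow(long index);  // frames beyond 2^31 rows

    // Range is its own concept: a slice, e.g. frame.Slice(1..^5),
    // not a pair of loose ints.
    public abstract DataFrameSketch Slice(Range range);
}
```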
P
A
R
J
Make a thing that takes a Range instead of inventing your own pattern for it. Then you can see the thing that it did, like the ".." syntax, and you can yourself look at the Index and see that it's counting five from the end. Great; callers can hook it themselves and convert it.
J
J
Yes, there are things that you wouldn't be able to express, but you're going to hit that when you try to take whatever thing you got that was a long index and build the Range object out of it: either it throws and you deal with that, or you cast the long to int yourself, and that cast is a sign that something might go wrong here. Like, no one ever uses checked blocks, and there's a lot of code that should be in a checked block. Well...
A
A
To me, the one thing is, I would generally say the preview should be based on what the feature team feels is appropriate, right; I don't think we have to resolve all issues in order to ship a preview. Think of how many questions we just had where, you know, it's interesting to see how customers react to that, what feedback you get. Right, I don't...
A
A
B
P
P
L
B
A
J
Yes, that Arrow one should be in the next namespace over; it shouldn't take Microsoft.Data, because it's not the be-all and end-all of data. It is Arrow-compatible data, and maybe everything is Arrow-compatible data, but if a new standard comes along tomorrow and we're like, oh, that one's better than Arrow, we want to be able to do that.
E
J
M
J
J
You could write a conversion on it, but then you're going to get whatever massive performance penalty you get from transparently converting, so you're never going to do that. But you would be okay if I had extension methods that take the Arrow types? So then you would be okay, because then, when you're using this type, it has nothing to do with Arrow. The import is not from Arrow, the export is not to Arrow; that's all in something else, or derived types, or whatever. That's all.
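The shape being suggested, sketched with hypothetical names (ArrowInterop, FromArrow, ToArrow, and RecordBatch as the Arrow-side type are all assumptions): the core DataFrame type stays Arrow-free, and the interop lives in a separate namespace as statics and extensions.

```csharp
using System;

namespace Microsoft.Data.Analysis.Arrow // "the next namespace over"
{
    public static class ArrowInterop
    {
        // Import: build a frame from an Arrow batch. The core DataFrame
        // type never mentions Arrow in its own surface area.
        public static DataFrame FromArrow(RecordBatch batch)
            => throw new NotImplementedException(); // sketch only

        // Export: hand the frame's buffers back out in Arrow form,
        // as an extension so it reads like a member.
        public static RecordBatch ToArrow(this DataFrame frame)
            => throw new NotImplementedException(); // sketch only
    }
}
```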
J
E
B
B
J
I
L
B
E
J
M
J
Okay, so if anything is based on "oh, you can see how to build this by looking at Apache Arrow's website," then that's an Arrow-ism and not a DataFrame-ism. If you're designing it and it happens to look like Arrow, and you're okay with the notion that you've copied their behavior, that's fine; but what if something else came along later that everybody else was doing, and we thought it was good to adopt, not just to bandwagon but because it had value?
J
If we're now going to have "oh, we want to go change an internal implementation detail, we're not using the Arrow string anymore, but we still export as Arrow string": okay, this went from a non-transforming copy to a transforming copy. Or this went from "we gave you a span over the data because it was free" to "we're making a copy to give it to you." These are very big changes, and anything that would ever require them for switching off of Arrow ties this type down.