From YouTube: Weekly Sync: 2022-04-15
Description
Meeting Minutes: https://docs.google.com/document/d/1vKYEPtqKiwsFwhVKPmPub5ebMqN9HteBcbdFAuTXalM/edit#heading=h.1f4g9rj08f5e
B
I'm going to post this link in here. We have some issues with — basically, we renamed the master branch to main, and...
B
You can add image files to a repo and it'll be okay for a while, but as you start to get hundreds to thousands of commits, it becomes a problem, so we had to remove them.
B
So this is the old version of the docs that has the image in it, and this is a good one to look at. I believe you're talking about the contributions section here. The contributions section is normalized, with six being whoever did the most work. Not many people have gotten PRs merged this year, because things have been rather disorganized this year compared to previous years.
B
It's very hard for us to judge. If you're proposing a 175-hour project or a 350-hour project, we need to understand whether that's actually realistic, because that's a different amount of time for everyone, right?
B
So yeah, that's the purpose of getting stuff in. So I'm going to put this on here, because we're trying to — no.
A
No, no, but the thing that was confusing me: right now, my proposal would be submitted by, I guess, Sunday, so there would be a period between right now and, I guess, June.
B
So it ends up being kind of an issue. So let's look at the timeline together real quick, and then we can decide.
B
Yeah, and that's fine. It's something that's very helpful but not critical. But obviously, for people who have contributions, it's going to be a lot easier for us to—
B
—you know, grade their proposals. Because if you don't have contributions, we just won't know what's realistic and what's not, right?
B
Okay, okay, so: proposal rubric.
B
Okay, all right, let's jump over to — so what are you thinking about? We can spend this time, since there's nobody else here today, talking about what your thoughts are, at least until anybody else arrives, and we can just workshop together. So — I'm sorry, how do you pronounce your name again?
A
Just call me Tintin, it's very easy. It's my nickname.
B
That's great, yeah, that is very easy. Okay, all right! So what are you thinking, then? What's your proposal thought process?
A
Yeah, at this point I'm going into the time series project. I'm a fresher — I just joined college — so I don't really have much experience in computer science before this, but since—
A
—I started about four months ago, I've been really into artificial intelligence, machine learning and such. So the time series thing kind of resonated with the type of stuff I was already into. As I was researching the time series project, I was attracted to the concept of the decomposition of a time series graph: there's a trend, a seasonality, and something else — the residuals.
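The decomposition being described can be sketched in a few lines. This is a rough, illustrative version using a centered moving average — not any particular DFFML or statsmodels implementation:

```python
# Hypothetical sketch: split a series into trend + seasonality + residual.
def decompose(series, period):
    n = len(series)
    half = period // 2
    # Trend: centered moving average over roughly one period.
    trend = []
    for i in range(n):
        window = series[max(0, i - half):min(n, i + half + 1)]
        trend.append(sum(window) / len(window))
    detrended = [x - t for x, t in zip(series, trend)]
    # Seasonality: mean of the detrended values at each phase of the period.
    seasonal_means = [
        sum(detrended[i::period]) / len(detrended[i::period])
        for i in range(period)
    ]
    seasonal = [seasonal_means[i % period] for i in range(n)]
    # Residual: whatever trend and seasonality don't explain.
    residual = [x - t - s for x, t, s in zip(series, trend, seasonal)]
    return trend, seasonal, residual
```

By construction the three components sum back to the original series, which is the defining property of an additive decomposition.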
A
So I want to work on that and try to implement that as a project.
A
Other than that — another part — I was recently looking into the iris dataset that you guys have.
B
Cool, so you familiarized yourself with it. Have you tried writing any code to do that?
B
So — do you want to pop that up and share your screen, and we can workshop it together? I think that could be some good coding practice, and it could get you a PR probably pretty quickly. We may be able to get one within the scope of this meeting.
B
I can walk through writing one, and then it can be on the recording, and you can follow the recording and go from there, if your environment is not conducive to doing it right now. Do you want to?
B
Okay, great, all right, so let's open up a terminal here and let's—
A
I was looking at the iris dataset, and one thing that struck me as a little odd was that we were writing individual code for specific datasets. For the iris dataset we were writing one, and if we were to introduce another dataset, we'd have to write another. So why can't we just make that into one module and then just apply it to new datasets?
B
That's a great observation — I love that. That is the general pattern that we follow, and actually this is probably something worth writing down here. So: general pattern.
B
So the general pattern is sort of this. We're going from a high level — we're going from ideation—
B
—the idea is: we want to take your idea and shoot it to production as fast as possible, and the steps along the way — that's part of DFFML's core mission, right? To figure out how we take anybody who wants to do something related to machine learning from the idea to, boom, into production in an existing application, as fast as possible. And we'll—
B
—follow these same principles as we do the development on the library itself. So the first steps: first, we have the idea, and in this case — let me just make sure that we have an example here — so, example: datasets and dataset sources.
B
Okay, so, idea: we want to access various datasets transparently. We want to access various datasets where the model is decoupled.
B
This is basically the core premise of DFFML: decoupled from dataset access. So, first attempt: access the dataset directly, just to confirm that you can do it — you can read a file, connect to a database, etc.
B
Okay, then — I'll just... All right, so yeah, first we confirm that we can do it.
B
All right, and then we'd go in — and this would be for any dataset you might want to add, right? Okay, then, for the second step, we need to abstract behind an interface, which — oh, no, the second would be: write another one. So you write one, you write another one, and then—
B
Yeah, exactly right — and then I'll just say this is the core of DFFML; we're just going over it.
B
Okay, yeah, and this is a generally applicable process. So we do one, we do the next one — another one — then come up with an interface that works for both. In this case we came up with sources, and that's the class, right? And then fourth: expose the first and second via the interface created in the third. Okay, so—
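The four-step pattern above can be sketched with a toy example. Everything here is illustrative — the class and function names are hypothetical, not DFFML's actual API:

```python
import abc
import csv
import io
import json

# Steps 1 and 2: two direct, concrete ways of reading records.
def read_csv_records(text):
    # Read rows from CSV text directly.
    return list(csv.DictReader(io.StringIO(text)))

def read_json_records(text):
    # Read rows from a JSON list directly.
    return json.loads(text)

# Step 3: an interface that works for both.
class Source(abc.ABC):
    @abc.abstractmethod
    def records(self):
        """Yield each record as a dict."""

# Step 4: expose the first and second implementations via that interface.
class CSVSource(Source):
    def __init__(self, text):
        self.text = text

    def records(self):
        yield from read_csv_records(self.text)

class JSONSource(Source):
    def __init__(self, text):
        self.text = text

    def records(self):
        yield from read_json_records(self.text)
```

The payoff is that anything written against `Source` works with either backend without change.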
B
—those datasets, in terms of how you access them — and expose them, and iteratively modify the interface you come up with; modify the interface you hypothesize.
B
Yes — and this is why. This example covers how we came up with the dataset source abstraction — sources — and how this pattern can be extended to anything.
B
So, your question — let's go back to your question: it struck you as odd that we were writing code for each dataset. Why not—
A
Would
make
that.
B
Yeah, okay, so you can think of this as building layers onto the software stack — you're building layers of abstraction. What we did first is we built this layer of abstraction around a generic dataset itself — maybe a CSV file or a MySQL database. So the first thing we did was access the CSV file directly, access the MySQL database directly, and then we came up with this sources abstraction, or Source class, right?
B
That allows us — and let me go ahead and open it up — we then wrap the implementations that we came up with and expose them through those; we implement the class, basically. And then we move on to the next problem, right? So this would be — this is the general pattern, and then: pattern.
B
Copy-paste, okay. So, high level, we want to — and I'll just copy-paste from here — write something generic to allow for accessing any dataset.
B
Okay, so, idea: I want to be able to access any dataset using the same interface.
A
All
right
that
was
the
solar
energy
one.
Second,
let
me
just
stop
yeah.
B
So we are going through this right now, and then we can jump to any agenda items that you have. How's that sound?
B
Sahil, let me let you know what we're doing right now, and then we'll get to your agenda items, just so you know where we're at. You said your nickname is Tintin, all right. So Tintin is working on a proposal for the time series project. He's interested in the decomposition of time series graphs, including trends and seasonality, and he wants to add some more datasets. He was looking at the iris dataset and said—
B
—you know, it's odd that we're writing code for each dataset; why not write something generic that allows us to access any dataset? So we're taking this opportunity to understand how we add a layer to the software stack — a layer of abstraction — how we came about the sources abstraction in general, and then how we can extend that. We're going to extend that general pattern: we're just going to wrap an existing dataset real quick.
B
So let's see — all we're going to do is wrap the existing dataset, and then we'll just talk about what we would do going forward. So let's get your agenda items before we go further.
B
Some people don't have the option to use Google Docs — I know, the Great Firewall of China and such. Basically, if I see it, I'll provide feedback. People who email my work email have gotten responses, because that's what I've been monitoring lately, and so—
A
So when mine gets completed, do I email it?
B
Well — anything that we see; you know, we can't catch everything.
B
Things that might be helpful: pre-work related to your proposal, which you could do as PRs, and things that you need to explore more, probably because your proposal may not align exactly with the code base. Any other comments?
A
I saw those, but — I guess I'm asking for your personal opinion, what you thought.
C
So
as
as
someone
reviewing
this,
I
would
say
that
if
it
is
to
the
point
and
not
very
verbose
instead,
it
just
delivers
it
as
in
a
small
form
factor
not
just
for
short,
but
in
a
reasonable
amount
of
detail
that
what
that
would
do
writing
long
paragraphs
is
just
you
know.
We
have
to
dig
through
it
and
read
through
it.
That
is
something
something
that
I
would
say
is
not
not
possible
for
everyone
in
the
day
to
day,
because,
like
I
am
or
working
somewhere,
john
is
also.
A
Also, one thing — I mentioned this to John as well — I joined the GSoC program really late, only about a few weeks ago. I didn't really know much about it before, so I don't have many contributions to my name. So, well, what do I do about that?
C
You know, you just need to go through the guides and information out there.
C
We are kind of a beginner-friendly project; we very much welcome beginners. Also, if you have the time, you can watch the GSoC talk that we have linked on the ideas page — we have the video up there.
B
Great, thank you, yeah. I appreciate that.
B
Let's go ahead here. Okay, so, typical feedback — what did we talk about? We talked about areas you need to understand more, reading more of the existing docs to understand how things currently work in places, and issues or potential PRs where that might be helpful. What else did we say?
A
Also, another thing — I know you have that scale for the contributions, but, like — you might remember this — the pull request I sent about the file structuring, and you pointed out that it wasn't actually needed. (Which one — the relative import one?) Yes, that one! Yes, yes. Now, I get that that was a rookie mistake sort of thing, but I sort of incorporated that as a closed PR in my proposal. Would that be okay?
C
Yes, that kind of explains it, and it is fine, but I wouldn't say that counts. So, what would count as a contribution and what would not? In my view — and maybe John can add to this — what would count as a contribution is something that has been merged, or that has been blocked due to something else. That goes as a contribution. Anything else that is not merged, or that is not contributing towards the collective goal of the project, we cannot count as a contribution, to be fair to everyone. Okay.
C
Other than that, there were some questions on Twitter that I would want you to address. Yeah.
C
To answer at my best — someone asked why we don't use external libraries.
C
That is pretty much it, I guess. So, for every type of data there would be some atomic type: for a CSV, a row is the atomic type; for SQL, there is a record, which is called a tuple, or something like that; and in DFFML we have a record, which is a dictionary, and maybe it's—
B
Yeah, that's very closely related to what we were just talking about, right before Sahil joined — which is, come on—
B
Google
docs,
which
is
that
we
have
this
source
abstraction
right,
and
so
the
record
is
the
abstraction
around
you
know,
whatever
one
of
those
rows
are
so
like
a
row
in
a
csv
file
or
a
row
in
a
database,
or
you
know
an
entry
in
a
json
list
right
so
that
that
is
a
record
so
that-
and
so
we
talked
about
you
know,
how
did
we
get
to
the
source
abstraction?
Well,
we
applied
the
same
pattern
to
get
to
the
record
abstraction
for
each.
You
know
piece
of
data
within
a
source.
B
Right,
so
does
that
make
sense
a
little
bit:
okay,
okay,
yeah!
You
can
just
think
of
it,
because
we
have
this
generic
thing
right
where
we
want
to
be
able
to
access.
We
want
our
models
to
be
separate
from
our
sources
so
that
we
can
use
any
model
with
any
source
right.
Then
we
also
need
an
abstraction,
for
you
know
each
entry
within
that
source
right
or
within
any
source-
and
that's
just
a
record-
is
what
we're
calling
that
right,
yeah
so
and
then
okay,
so
anything
anything
else.
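The record abstraction described above can be sketched as a thin wrapper over one entry of any source. The field names here are illustrative, not DFFML's actual `Record` class:

```python
# Hypothetical sketch: one entry of any source, normalized to one shape.
class Record:
    def __init__(self, key, features):
        self.key = key                    # unique identifier within the source
        self._features = dict(features)   # feature name -> value

    def feature(self, name):
        # Look up a single feature value by name.
        return self._features[name]

    def features(self):
        # Return a copy of all features.
        return dict(self._features)
```

A row in a CSV file, a row in a database table, and an entry in a JSON list all map to this same shape once wrapped, which is what lets any model consume any source.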
B
So we talked about — okay, so, mentor login. I'll get on that one. Let's see — mentor login. Okay, Sahil, could you send me an email to my work email? Just because the other day I got off a call with somebody and then did the exact opposite of what we had just talked about, for like 15 minutes. If we don't write things down, I'm liable to do the wrong thing.
B
Yeah — actually, let's finish this. It's john.s anderson at intel.com. Thank you, yeah — just put it in the subject, because, like you said, it's proprietary, and we're on the recording; I don't want to switch over to my calendar or email right now. Yeah, okay, great, thank you! Okay. So I think we captured some stuff here, and then let's say: okay. We also want to grab Hashim's name.
B
So Hashim is also mentoring this year, and I think we may have some other mentors involved, but I haven't reached out to as many people. I usually reach out to many people — previous students and such — but I haven't had a chance to this year. I know some of our previous students were actually trying to start their own org, but I haven't synced with them in a while on how that went for them. So right now—
B
—it's looking like three of us, and potentially we can rope in some more folks, but I'm not sure right now. So, let's see — and Hashim's last name is in his email, great.
B
Yeah, no, that's good, yeah! There's no music this time — the noise canceling must be working, whatever Google's doing. All right. So, let's see — docs, contributing — oh, that was funny, I wonder. Okay, can you guys see?
B
It's still sort of in progress, I would say; that's not fixed right now. So let me just — transform, and transform, we need to crop — okay, never mind, I'm just going to leave that one. Okay, so, GSoC '22 — what did we do here? We're adding some clarification around proposals.
B
It's not a GitHub policy. The thing is — this is more of a thing in the United States, I think, than in other places — there's this general awareness happening of different word usages, so we're trying to align with that, which is the reason to change the branch name in the process.
B
We.
So
when
you,
when
we
change
the
branch
name
yeah
it
it,
it
was
an
opportunity
for
us
to
go
through
and
clean
up
these
files
as
well.
Now
we
didn't
totally
get
all
of
that,
but
yeah
that
the
file
cleanup
proved
to
be
a
lot
trickier
than
than
originally
expected.
So
we'll
probably
yeah
we'll
have
to
we'll
have
to
that.
That's
gonna
need
to
be
dealt
with
remorse.
A
I
was
wondering
like
in
some
cases
why
why
don't
you
just
fork
it
into
a
new
branch
called
maine
and
just
treat
main
as
the
master
branch.
B
Oh
yeah
for
oh
fork,
I
see
which
media
fork
so
so
there
is
so
okay,
so
we'll
we'll
only
spend
a
couple
more
seconds
on
this
and
then
I
want
to
go
write
the
data
set,
but
we
are
trying
to
get
the
project
transition
to
an
external
government
governance
structure.
That's
non-intel,
long-term,
to
enable
people
to
become
maintainers
of
the
project.
Because,
right
now
the
permission
settings
are
locked
down.
B
—by Intel itself. So I'm in the process of talking to legal about that, and they've okayed it, conditionally on governance documentation from the buildtree org, so we'll see how that evolves. It stalled out — some of our previous mentors were running that, Yash and Saksham and some others, but I haven't heard from them in a while. I think they got busy, same as all of us — your day job can be a lot of work, right?
B
It's a full-time job for a reason, right? So yeah, that may be a next-year thing, but for this year—
B
—it'll be under Intel. Okay, so — and I'll put this: how are we providing feedback? That would be this link here. And I'm actually going to nix that proposal review link, so that people go read the full thing if they see the meeting minutes. Okay, so — anything else?
B
Okay, all right! So let's jump into this. We're going to go apply this pattern to specific datasets. We have the iris dataset, and I'll drop a link to that so that we can follow along here.
B
A good question — that's a very good question. So, I work for Intel, and this project was originally developed at Intel, and we worked it through — every company has different processes around how they deal with open source; Intel is pretty good about it — so we were able to open source the project, and then we can do GSoC and things, which is cool.
B
And the Python Software Foundation — within GSoC, there are top-level orgs and sub-orgs, and we're a sub-org of the Python Software Foundation, which means that the Python Software Foundation manages talking to Google and all the complexities involved in actually signing up as an org with Google, and then—
B
We
have
a
simplified
process
which
we
can
follow
for
our
suborgs
right
and
we're
on
a
much
smaller
scale
right
than
the
python
software
foundation,
because
they're
a
conglomeration
of
other
things,
and
so
so
yeah,
that's
the
general
gist
of
it
and
good
good
friend
of
mine
and
co-worker
terry.
She
runs
the
python
software
foundation,
gsoc
involvement,
and
so
she
runs.
B
—a project called CVE Bin Tool, which is a really cool project, and I recommend you check it out if you haven't. It's not machine-learning focused — it's security focused — really cool, really important stuff. That's another good project. So yeah, she got us involved as well. Okay, does that answer your question?
B
We're
under
that
umbrella,
it's
sort
of
like
we're
associated,
but
we're
we're
not
sort
of
in
right.
B
Yeah, it's an association sort of thing. Okay. So where are we going here? We're going — source — god, we've got a lot.
B
Oops, okay — so here's the iris source. Basically, what we've done here — and what we're seeing — is that there's actually another layer in the stack in between, and that layer is this dataset source wrapper. We'll cover this briefly, but basically we implemented the sources construct to access different sources, and then we saw, okay, we need this abstraction around accessing specific datasets. And actually — okay, so unfortunately we're doing this sort of backwards here, because we already have the abstractions.
B
So
what
happened
is
so
if
we
look
at
this
adding
layers
to
the
stack
here,
what
happened
is
we
skipped
step
two
and
we
went
right
to
coming
up
with
the
interface,
and
so
we
implemented
a
single
source
and
the
the
other
source
was
actually
implemented
as
an
example.
B
So
for
step,
one,
we
implemented
this
iris
training
right,
and
so
this
is
the
meat
of
it
right
here
this
we
basically
we
download
the
data
set
and
then
the
data
set
itself
is
really
just
a
csv
file.
So
then
we
just
you
know,
yield
the
appropriate
source
and
you'll
see
heavy
heavy
heavy
usage
of
context,
management
and
async
in
dffml.
B
Yeah, and the reason for that is: context managers give us reliable cleanup of any resources used, and asynchronous methods give us network access in a concurrent way, because a lot of things are I/O-bound. Most things are I/O-bound; machine learning models and such are CPU-bound, and we can schedule those out to threads. So you can think of this as attempt one. We wrote this thing that's like, okay—
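The two reasons given — reliable cleanup and concurrent I/O — can be shown in a minimal sketch. This is illustrative only, not DFFML's actual source API:

```python
import asyncio
import contextlib

@contextlib.asynccontextmanager
async def open_source(name):
    # "Acquire" a resource (stand-in for opening a file or DB connection).
    resource = {"name": name, "open": True}
    try:
        yield resource
    finally:
        # Cleanup always runs, even if the body raises.
        resource["open"] = False

async def fetch(name):
    async with open_source(name) as src:
        # Stand-in for I/O-bound work; many of these can overlap.
        await asyncio.sleep(0)
        return src["name"]

async def main():
    # Concurrent access to several sources at once.
    return await asyncio.gather(fetch("iris"), fetch("mnist"))
```

CPU-bound work (like model training) would instead be handed off to a thread or process pool, e.g. via `asyncio`'s `run_in_executor`.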
B
Well,
how
would
we
implement?
You
know
a
simple,
abstract
or
a
simple
source?
Well,
we
would
download
the
download
the
data
set
like
because
this
sources,
abstraction
is
around
generic
sources
in
general,
and
then
the
data
set
source
abstraction,
which
is
this
next
layer
which
we're
adding,
and
maybe
I
should.
We
should
probably
say
that
so
pattern
applied
to
specific
data
sets,
develop
the
data
source,
abstraction,
okay.
B
This is a helper function here, and then for our second attempt we basically just wrote an example — this is the example data, my training data — and this is going to look a lot better rendered on the website. So let's go to dataset source.
B
Okay,
so
in
this
example
here
we
are
in
this
example.
Here
we
say
all
right
like
this
is
the
abstraction
itself
that
we
ended
up,
adding
as
this
layer
right
and
here's
the
training
data
right.
So
instead
of
the
iris
data
set,
that's
a
csv
file,
it's
this
we're
downloading
it
from
some
server
and
when
we
download
it.
This
is
what
we
do
right.
B
We
basically
just
use
this
cache
download
function,
another
helper
function-
all
it
does
is
download
if
it
doesn't
already
exist
and
then
it
yields
the
csv
source
right
and
then
here's.
A
I
just
you
know
what
I'll
save
my
questions
for
later.
B
There's
not
a
lot
of
network
access
in
this
library
in
in
the
base
library
itself,
but
when
we
do
do
it,
we
have
this
helper
function,
cache
download
and
the
reason
why
we
always
go
or
you
don't
have
to
go
through
cache
download
it's
just
typically,
if
you
download
something
you
don't
want
to
re-download
it
right.
No.
A
No,
I
I
get
the
cash
download,
I'm
asking
why
cash
like
cash
means,
like
you
know,
it's
temporary
memory,
is
that
what
this
is.
B
Oh,
so
cache
doesn't
necessarily
mean
that
something
is
temporary
memory.
Cache
a
cache
is
really
just
a
place
that
you
store
something
so
that,
like
so
that
you
don't
have
to
go,
get
it
all
the
time
from
its
main
source
right.
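The point above can be shown in three lines: a cache is just a faster place to keep something you'd otherwise re-fetch from its main source. A toy sketch:

```python
# Tiny illustration: only go to the (slow) main source on a cache miss.
_cache = {}

def get(key, fetch_from_main_source):
    if key not in _cache:
        _cache[key] = fetch_from_main_source(key)
    return _cache[key]
```

The second lookup of the same key never touches the main source, which is the whole idea — whether the cache lives on a CPU, in RAM, or on disk.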
B
When you hear about the L1 or L2 or L3 cache on a CPU, that is just a place where we can store stuff that's usually in main memory, so we don't have to go out to main memory, because that takes lots of cycles. So "cache" is just a word used for a—
B
—quick-access location. So basically, we use this cache download function because if somebody wants to load this training dataset — if you open it twice — we don't want to re-download it; it's already on disk. The—
B
Is
that
we
have
for
security
reasons,
these
the
synchronous
download
function
which
this
cache
download
is
backed
by
and
then
cache
downloaded
itself?
Have
a
protocol
allow
list
which
basically,
the
the
purpose
of
this
is
to
stop
people
from
accidentally
passing
it
http
links
which
are
insecure
right.
So
we
do
hash
validation
as
well,
which
is
another
thing.
B
So,
basically,
if
we're,
if
we
download
something
to
the
cache,
we
we
it's
it's
it's
kind
of
nice,
because
we
can
just
turn
around
and
like
the
way
that
we
can
ensure
that
we
have
the
right
contents
in
the
cache
is
the
same
way
that
we
ensure
we
downloaded
the
right
stuff
right
and
that's
by
pinning
it
with
a
with
a
a
cryptographic
hash.
B
Exactly
and
so
basically,
if
the
hash
doesn't
match,
we
either
re-download
or
throw
an
error,
and
if
somebody
tries
to
you
know,
send
you
know,
give
us
a
link.
That's
in
you
know
an
insecure
link,
we're
going
to
throw
an
error
right,
and
so
here
we're
explicitly
allowing
the
use
of
tls
encrypted
http
access
right
by
by
adding
this
to
the
allow
list
so
and
this
kind
of
gets
in.
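The behavior described — download only on a cache miss, refuse insecure protocols, and pin the contents with a hash — can be sketched as follows. This is a hedged, illustrative version, not DFFML's actual `cached_download` helper:

```python
import hashlib
import pathlib
import urllib.request

def cached_download(url, target, expected_sha384):
    # Hypothetical sketch of a cached, hash-pinned download helper.
    target = pathlib.Path(target)
    if not target.exists():
        # Protocol allow list: refuse insecure links before any network I/O.
        if not url.startswith("https://"):
            raise ValueError(f"insecure URL not allowed: {url}")
        urllib.request.urlretrieve(url, target)
    # Hash validation applies to cached and freshly downloaded files alike.
    digest = hashlib.sha384(target.read_bytes()).hexdigest()
    if digest != expected_sha384:
        raise ValueError(f"SHA-384 mismatch for {target}")
    return target
```

Note that the same SHA-384 check guards both cases: a tampered download and a corrupted cache entry fail identically.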
B
So
basically,
what
we've
done
here
is
we,
let's
see
so
write
another
one
and
then
I'll
just
put
this
example
right,
and
so
that's
the
example
and
then
you
know
come
up
with
the
interface
that
works
for
both
and
here's
the
same
link.
Basically
because
you
know
this
is
that
that's
the
point
of
this
is
that
we,
the
second
one,
was
ended
up
being
the
the
example
for
the
abstraction
itself.
B
So,
okay,
so
then
your
question
is
why
don't
we
have
something
that
that
you
know
we'll
go
back
to
your
question,
which
was
you
know,
can't
we
just
have
something
that
works
for
everything,
and
you
know
the
answer.
Is
we've
now
built
this?
We've
we've
built
this.
We
did
our
first
attempt.
We
did
our
second
attempt.
B
We
wrote
our
abstraction
layer
right
so
now
we
can
start
basically
saying
well
what
are
all
the
things
that
we
might
want
to
access
right
and
if
the
the
pattern
we're
going
to
follow
the
same
pattern,
basically
for
the
third,
this
third
level
right.
So
the
first
level
was
access
data
access,
a
data
source
right
and
then
the
second
step
was
access.
Specific
data
sets
right,
which
might
be
backed
by
a
specific
data
source
right,
which
is
like
a
csv
file
or
whatever.
Now
the
third
step
is
well,
we
need
to
figure
out.
B
How
do
we
access
an
arbitrary
data
set
right?
So
we
need
to
probably
do
this
for
a
lot
of
different
data
sets
right,
because
we
only
did
two
if
we're
going
to
make
something
that
accesses
an
arbitrary
data
set.
We
need
to
have
a
really
good
understanding
of
all
the
types
of
ways
that
that
might
be
accessed
and
the
only
way
to
get
that
is
to
implement,
implement,
implement
right.
So
part
of
this
project
is
to
go
through
and
implement
right.
So
so
we
had
a.
B
We
had
a
proposed.
A
If
I
understand
what
you
were
saying
correctly,
you
mean
like
different
places
like
different
websites
like
kaggle
or
another
data
source
would
have.
B
Yes — and actually, we'll do one right here, which is: why don't we do MNIST? That's an easy one; we already have an example for it, and we don't have a dataset source for it. So we can basically go ahead — we're going to get kicked off here, so we'll see if we can do it in like seven minutes. So this is the deal: here are all our SHA-384 sums, here's our link to the data.
B
Let's
go
through
and
implement
mnist
here
and
I'm
gonna
pop
shell
again.
Okay.
So
where
are
we
okay
so
get
check
out?
Dataset.
B
You
know
I
need,
I
think
I
might.
I
think
my
calendar
might
have
gotten
cleared,
but
if
we
get
kicked,
let
me
actually
either
either
it
got
cleared
or
I'm
missing
the
meeting
right
now.
I
think
it's
fine,
I
think
we're
fine.
I,
the
past
few
weeks,
I've
accidentally
run
over
another
meeting.
Doing
this.
B
That sounds good — so let's plan on that, then. I think that's doable. Documents, python, okay! So let's take a look at what we did here. I'm going to remove this file; I don't think we need it — I think that just got generated as a one-off. All right, so basically I just copied the base into the new file, which is the MNIST one, because the base has this example here. Actually — actually, I'm not going to do that.
B
I'm
gonna
copy
the
iris,
because
this
is
one
that
has
a
this,
has
the
whole
url
and
everything
okay.
So
let's
do
mnist
training,
yeah
and
then
the
question
will
be
you
know
how
do
we
then
extend
this
to
be
arbitrary
right,
and
so
you
might
end
up
like
eventually
we'll
end
up
with
something
once
we
implement
a
few
of
these,
that's
really
obvious
that
that
you
know
we
can
basically
just
say
you
know,
source.
B
No worries. Okay, so — all right, so we've got our—
B
So, okay — we're grabbing — actually, we're grabbing from multiple sources, that's right. Okay, sourcing, which is a data flow? Oh, okay, that's right — it was a tricky one. Okay, so we're basically going to create — what we see in the usage here is — oh, this would be fun — okay, we're actually going to do this example as a dataset source.
B
It
looks
like
so,
let's
see
so
we're
definitely
going
to
get
kicked
off,
and
this
is
going
to
take
a
little
bit
longer,
but
basically
we're
going
to
re-implement
this
example
here.
So
what
do
you
mean?
Kicked
off?
B
Google
has
limited
the
meetings
to
an
hour,
so
we'll
have
to
start
a
new
meeting
link.
I
don't
think
this
will
take.
You
know
more
than
like
15
minutes
here,
but
it's
definitely
gonna
take
more
than
three
minutes
so
so
we'll
we'll.
I
think
we
can
run
into
that,
and
then
I
can
confirm
that
I'm
not
in
another
meeting
right
now.
So,
let's
see
so.
B
Basically,
if
you
look
at
this
tutorial
here,
what
we're
doing
is
we're
showing
how
to
how
to
train
a,
I
think,
a
tensorflow
model
yeah,
let's
see
so
yeah,
okay,
so
yeah
we're
going
to
show
how
to
train
the
dnn,
classifier
and
so
effectively.
What
we're
going
to
do
here
is
we're
going
to
end
up
doing
all
of
these
pre-steps
and
we
can
actually
just
actually.
We
can
just
take
this
whole
thing
and
we
can
just
do
all
of
that.
So
our
first
step
is
download
all
the
data
right.
B
Actually, no, paste, read, docs, examples, mnist, all right. So our first step is: we want to download all the data, right? So we're gonna go through and we're gonna cache-download the train idx3 and train labels, because these are the things that we care about, because we're implementing the train dataset right now.
B
Right, so this dataset itself, download features, is made up of four files: one for the training features, one for the training labels (the things that we're going to predict), and then two more files for the test features and labels, but we're not concerned with those right now, because we're just doing the training.
B
So, let's see, okay. Let's just do a find-replace of iris to mnist, and that way we should be pretty good here. Okay, and then we can grab these SHA values.
B
Okay, and that was the images, and then this is the labels. Okay, and then these are also HTTP links, so we'll make sure that we're adding them to the allowlist, right? And then we can say, you know, training original features and labels. So we'll download these files to the cache: we're going to download the features, and then we're going to download the labels, and we're going to download them to this cache directory here, which is in, you know, home, cache.
B
dffml, datasets, mnist. And yeah, we're going to validate the SHAs based on the values that we had given in our tutorial already, and we could calculate these: if you look at the cache download function, there's an example, it just runs it through sha384sum and calculates the SHA. So then, looks like we also depend on model tensorflow and operations image here. Which kind of goes to your question about: why?
B
Don't we use pandas? Every time we have a new, distinct set of dependencies, we should have a new plug-in, and that's because, so basically, think: a distinct set of dependencies would be like, I have models implemented using TensorFlow right now. I'm sure you're aware, when you download some of these machine learning libraries, they take a long time to download, right? And they.
B
So for your specific use case, right, once you've trained your model, you would download whatever things are applicable to your model, or your dataset collection if you're doing, like, pre-processing on the fly, and that way your image is smaller, right. And so to do that, though, we need to make sure that we create a new plug-in every time we have a new, distinct set of dependencies, and that allows people to only install what they need.
B
So in this case, this is going to require dffml-model-tensorflow and dffml-operations-image. So we're just going to do a quick check here that we have those: we're gonna run the version command and see. Okay, it's gonna look at this file, all right. Okay, so we have a bunch of stuff in here.
B
So okay, so let's just actually keep implementing for a second. So we download the data; we're actually going to include this little install command here, or actually that's going to be covered already, or, well, we'll include it for now. Basically, you know, you won't be able to run this.
B
If you don't have this stuff installed. And TensorFlow is not one of them, but you will need the image operations, right, because, to use this training dataset, because we have to normalize the data to do anything here. Let's see, should we do that? I think maybe we should not do that, yeah. I think we should leave the normalization out of this, so we're actually gonna leave the normalization out of this and basically say, hey, you know.
B
If you use it, then you normalize; we're just going to give you the raw data right now. And just to recap there, what.
B
Yeah, exactly, we're just going to provide the raw data. We don't need to do any normalization built in, but the cool thing is that you could, right? Like what we're seeing here is that we could, you know, have mnist.training.normalized which, whenever you import that, kind of like you're saying, you know, why can't we just have one for everything? Like this one, you could have one specifically that already does that normalization built in, right? Oh.
A
Yes, speaking on this: while I was writing my proposal, one of the aspects of the project was writing various operations, right, normalization and NaN removal and stuff. So I was wondering, like, I want to implement a few of them myself, but you guys would already have frameworks for several of these operations, right? And so, like, how do I tell which ones need to be which?
B
So I would say: git grep is your friend, mainly because the docs are not entirely up to date, and then also the plugins page. We currently have a bit of an issue with the way the docs are splitting out, but this is sort of your main list of what exists where, this plugins page, because this searches across, like we said: each time you get a new, distinct set of dependencies, you implement a new plugin.
B
So all of this, which is split out right now for some reason, we don't know why, we haven't gotten to that in a while, that is just the main package, right? So if you were to, like, get that, that's basically all of the, anything you see inside, anything you see. Okay, that's a bunch of .pyc files.
B
Everything that's implemented. Then you would go to the plugins link, and now you can see across all the different plugins, right? So everything, like, you know, if you were to look in, like, this is the top-level repo, right, and so the API reference is dffml, and then there's also, like, model and operations, right, and within those we have, this is dffml operation binsec, dffml operation image, which is the one that was referenced in this tutorial that we were just looking at, wherever that was, oh yeah, here, right.
B
So this slash operation image is this package on PyPI, and so the plugins page allows you to search across those, and the goal is eventually we'll support third-party plugins. So basically, you know, these are all the plugins that we as a community have developed, but we also want to make it very easy to just say: hey, you know, here's my random one, like we were talking about, you know, hey.
B
So if you go here and you look at this, dev create, there's a helper script to create a new package, and so basically this will generate you a new model or operation or whatever, right? And then you can push, that's just a skeleton of files that gives you a blank Python package that you can then push to your own personal GitHub, and then other people can use your models or operations or whatever by basically just doing a pip install and pointing at your repo link, and all of a sudden they'll have access to all your stuff, whatever you implemented within dffml. And that way, if you implement a random new source, then somebody can take the existing models and use it with your new source.
B
Without writing any code, right? All it is, is a command-line flag. So that's the goal. We aren't quite there yet; right now it's all just within the repo, but yeah. So okay, so where were we here? So we're gonna do, oops, okay, so basically, and then, looking at our, this was, you know, so there's a Python API, an HTTP API, and the command-line interface, and they all sort of mirror each other.
B
The HTTP API is not the greatest right now, but anything that you can do in one, you can do in the other, right? So this line here basically says, says no, paste, so this says: create a source object, which is actually two sources combined.
B
This will expose the separate features and labels files we downloaded.
B
As a single source, right, because that's what we want to do. So what we end up with is, so this one is going to pre-process to normalize. So one of these, you see, is pre-processing, and one of them is just straight: this is the labels, and this is the images, and the images are being pre-processed.
B
And so this is the images here, and we don't want to pre-process them. You can see they're being pre-processed using this data flow, and so we're basically just going to cut that out, because we don't want to do any pre-processing. And we can see, within that pre-processing, it was actually accessing this idx3 source, right? So we can instantiate an idx3 source for the train images.
B
So, let's see, later, features path and labels path, right. So we downloaded the features, we downloaded the labels, and now we're going to instantiate an idx3 source for the features, because that's the format that they're in, and an idx1 source for the labels.
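For context on the file format under discussion: the MNIST files are in the IDX container format (idx3 for the image arrays, idx1 for the labels). This is a minimal reader sketch of that format, not DFFML's actual idx source implementation.

```python
import struct

def read_idx(data: bytes):
    """Parse an IDX payload: two zero bytes, a dtype byte, a dimension
    count, then one big-endian uint32 size per dimension, then raw data.
    The MNIST idx3 files hold 28x28 images; idx1 files hold labels."""
    _zero, _dtype, ndim = struct.unpack_from(">HBB", data, 0)
    dims = struct.unpack_from(">" + "I" * ndim, data, 4)
    return dims, data[4 + 4 * ndim:]
```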
B
And we need to go, I think this works out of the box like that, but we're gonna go double-check that, because, so yeah, so we need to pass probably file name. Oh yeah, so it looks like idx3. So this says what class to instantiate, and then this says property file name equals this file. So that means we need to say: file name equals this file.
B
Eventually we're hoping to have a converter that actually just spits out Python code given command-line arguments, and vice versa. Feature equals image, and this says basically: hey, you're loading all the data from this file in this format; like, when I yield a record for each entry, what should I call the data that I loaded from the file? And we should call the data "image". So file name on this one is images path, and then feature is "label".
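The combining step described here, one record from each file, merged under the configured feature names, can be sketched like this. The function name is hypothetical; it just illustrates the pairing behavior, not the library's actual sources API.

```python
def combined_records(images, labels):
    """Pull one record from each underlying reader and merge them under
    the feature names chosen above: "image" from the idx3 file and
    "label" from the idx1 file."""
    for image, label in zip(images, labels):
        yield {"image": image, "label": label}
```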
B
So, basically, you know, create this sources object, which is really, you know, two sources in a trench coat, and each time you read a record from one, read a record from the other one, and call the data that you read from the idx3 source "image" and call the data that you read from the idx1 source "label". And this should be it, so, and then we'll yield it, yeah, okay, cool. So then from here we basically just say, you know, base, so csvx.
B
I think there's an idx dataset source cache download file, all right, I think we implemented it. So yeah, download the files and spit them out. Okay, so now we can run the tests. Okay.
B
Contributing, and then testing, and then here's how to run a specific test, and here we'll just do this shorthand. So basically, oops, oh, and I figured out this pretty cool thing recently, which I really like. If you do python and then pdb, then if there's an exception that you didn't handle already, like when you're developing some stuff, it'll drop you right to a Python debugger shell, which is pretty sweet.
B
Tests. Oh, we don't want discover, so we're going to run just this test, and what it's doing is it's running these console examples. When you put reStructuredText in a markdown file, we have some extra stuff that we've added on to say: when you mark it as a test, it's going to run that. And so what we're missing here is, from the top level, every plugin that we talked about needs to be registered, and so we're gonna go register this, reinstall the package, and re-run mnist, and then.
B
So what happened here is: we downloaded the files, right? We instantiated our sources, and we have some code, and then, you know, we combined them by yielding this sources object. Well, this would be.
B
This would be, so, basically, this is the implementation of this. Effectively, there's another layer of abstraction between the dataset source, and it's this context manager wrap source, and it has, on line 108, see over here, line 108, this is the error that we're getting: function did not yield source.
B
So this is just, basically, there was input validation being done to ensure that we're yielding a valid source from the dataset source, right? Because we have this dataset source decorator that we can use here, which is this @dataset_source, and this function is calling another function, which is in this wrapper.py at line 108, which is saying: whatever is yielded, let's make sure that it's a valid source, and then it throws an exception if it's not a subclass of Source. Well, Sources itself is not a subclass of Source.
B
So we hit a new use case here, and all we had to do was go modify the wrapper to say: you know, don't just check for Source; Sources is also valid. Does that make sense?
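The validation bug and its fix can be sketched in miniature. These classes are stand-ins for the library's types, not its real definitions; the point is only the shape of the isinstance check that was loosened.

```python
class Source:
    """Stand-in for the library's Source base class."""

class Sources(list):
    """Stand-in for the Sources collection: it holds Source objects but
    is deliberately not a Source subclass, which is what tripped the
    original check."""

def validate_yielded(obj):
    # The original validation only accepted Source; the fix described
    # here is to also allow a Sources collection to be yielded.
    if not isinstance(obj, (Source, Sources)):
        raise TypeError(f"function did not yield Source(s): {obj!r}")
    return obj
```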
A
Is not being, yeah.
B
In the idx3 file read, this is being thrown in, like, the call to the open function in Python.
B
Gigabytes free, so I'm highly suspicious of that, because this file is small. So we're going to add some logging, and this can be an opportunity to learn how to debug when you hit weird issues.
B
Okay, so then we're probably going to see a lot of log messages here, so we're going to go ahead and, you know, give ourselves some debug information: so, open file, reading size, blank, and this will tell us how many iterations of the loop we're gonna go through, and then we're gonna say, you know, okay, inner array size, and we'll just dump out this inner array size.
B
So then the thing that we can do is: if we set this LOGGING equals debug as an environment variable whenever we run a test case, it should print both debug logs, and then enable all of those, oops, and enable us to see what is going on here.
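The pattern being described, a LOGGING environment variable selecting the log level for one run, can be sketched with the stdlib logging module. The variable name follows the transcript; the mapping function here is illustrative, not the project's actual implementation.

```python
import logging
import os

def level_from_env(env) -> int:
    """Map a LOGGING environment variable (e.g. LOGGING=debug) to a
    logging level, defaulting to WARNING when unset or unrecognized."""
    name = env.get("LOGGING", "warning").upper()
    level = getattr(logging, name, logging.WARNING)
    return level if isinstance(level, int) else logging.WARNING

# Configure the root logger from the real environment.
logging.basicConfig(level=level_from_env(os.environ))
```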
B
So, let's see, okay. So it may not, because we're actually running into a sub-, oh no, there it is, okay. So here we see the download. So just to recap here: we did this.
B
We ran the same command we were running before, but we exported this environment variable, and this is how you do a temporary environment variable: you just prefix whatever command with it. And then when it ran, you know, this subcommand here, it picked up on the fact that, hey, you know, I need to be enabling the debug logging. So here's the nice download progress code that we have, and, oh, so here you see it instantiate the source. So this is the source.
B
That's why, because they're CSV files. So I can tell you right now what's wrong. So basically, these are idx3 format files, which are gzip compressed, and our file source abstraction will automatically decompress files for us, but it works based on the file extension. So if you get the file extension wrong, then it's not gonna properly decompress it. So we basically.
A
Okay, yeah, so, sorry, if I understand this correctly: you tried to read the wrong type of file format.
B
How many gigabytes that is, like, that's a lot of gigabytes, probably terabytes. So if we fix our file extensions, we will end up with sanity. So let's see, so, okay, and here it is: .gz. So if we fix these, and we name these with .gz, right, you saw where I just did that, here and here, then we should end up with, you know, something that's not completely nuts.
B
Let's see what happens. It works! So this is the correct output. The correct output is a giant, giant, you know, those are, what are they, yeah. I'm trying to Ctrl-C this, but, you know, it's got the better of me here. But those are, I think, what, 28 by 28 images, I think, right? Does anybody remember? Yeah, so these are the flattened-array versions of the 28 by 28s.
B
So it's dumping the whole thing to standard out, but it works. So we were able to successfully, okay, I'm gonna kill it. So we were able to successfully dump that out, and let's see, so if we look at it here, because, you know, just to save ourselves some trouble here, we can dump it to a file. So now we can actually just, so, this is, we're writing.
B
We wrote our test case as the documentation within the docstring of the function where we did the implementation, right? So everything is just, like, boom, all together: docs, tests, implementation. And so we can just dump this to a file, so, temp test.json, because this outputs in JSON by default, and it's going to take a little bit, but it'll dump it out for us here. And we can go and remove those debug messages now, so we can say git checkout source idx3, because I don't think we really need those; those are sort of.
B
I don't think we want to keep those around, okay. So then, at the end of the day here, what we ended up with is a couple fixes, or one fix. So, git add. So this is what we usually do: you know, we ended up with our implementation, so, git status.
B
So we have three files that we changed. setup.py changed because we added the entry point, right, and then that points to this, this is the Python path, this points to this new file. And then we changed this source wrapper, because we needed to change the input validation.
B
So what we would do is, we would say git add source wrapper, and then git commit, and then we'll just say, you know, "source:". So whenever things are within that main plug-in, or the main library itself, we don't prefix with anything; we don't say dffml.
B
Allow Sources to be yielded.
B
Type allowlist for context, and then I'll say "for context managed wrapper sources", and then I'll say "dataset source", right, because this is what ends up calling into that. And I'll add you both as co-authors, since we did this together.
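Crediting co-authors on a commit uses GitHub's Co-authored-by trailer convention: trailer lines at the end of the commit message body. A small sketch of building such a message (names and the helper itself are illustrative):

```python
def commit_message(title: str, coauthors) -> str:
    """Build a commit message ending in Co-authored-by trailers, the
    GitHub convention for crediting everyone who worked on a change."""
    trailers = "\n".join(
        f"Co-authored-by: {name} <{email}>" for name, email in coauthors
    )
    return f"{title}\n\n{trailers}\n"
```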
B
So I have, I have Sahil's email.
B
All right, so can you spell your full name for me? I mean, I'd eventually contribute.
B
And then we give, I always give my quick look over here, just to make sure everything's correct. Oh, and we've got to format with Black. I think everything's correctly formatted, let's see, yeah, that looks like correct formatting to me. All right, we'll take a chance here and see; the CI will kick us out if we did something wrong.
B
Okay, and then we'll say:
B
So, "source: dataset: mnist: training: Add mnist training source".
B
That's good, I'm happy to hear that. I think this was a pretty good recording today, I think this is a pretty good one. I think we got a lot of information on paper, so to speak. So then we'll push it up. So, you know, for me: let's see, git remote -v, so, git push, so, so mine.
B
I push, you know, to my fork: git push, then my remote, and then dataset source mnist, oops, oops, I missed a letter in my own name. And then gh pr create, and I've been having fun with this.
B
With this GitHub CLI: boom, pull request, all right. So now we can wait for the CI to fail, because we have lots of failing CI, but yeah, I'm planning on spending some time fixing this stuff. I think I told Sahil, but I should have some more scope. Basically, this project has been something that I do outside of my day.
B
Job, and I know Sahil does this outside of his day job as well, and so, you know, we spend as much time as we can working on it, but, you know, sometimes things fall by the wayside. So currently the CI jobs are failing, and I know you've been working on that, so thank you for working on that and fixing the ones you have. All right, so, any final things for this meeting today?
B
Great, and then you and I can jump on that call. Okay, all right! Let me go here, and I will add this as the mnist example for how to add a dataset source, and that should give you a pretty good idea. And for pre-work for the proposal, I would recommend that you maybe go through and, you know, add something similar to what we just did, and you could follow the.
A
One thing, I had one question: yes, so I actually asked Sahil this earlier; he said he wasn't exactly sure about it. So, one of the projects, this same project, the dataset-adding thing: I wanted to ask if I can work on it before GSoC, even though.
B
You can work on whatever you want before GSoC, right. Keep in mind, though, everything's open source, right? Other people are gonna see your stuff. The beautiful part about open source is nobody can say that they did your work when you did it, because it's right out there, and we can tell that you did it first. So if anybody tries anything like that, we'll just say: well, then we're not even going to consider your proposal.
B
We had somebody try that one year, I think, and it was like, really? Like, we can tell who did what; we've been talking to you all. So yeah, I would say the more the merrier, and, that is, you know, the more you get.
B
If you decide to go knock some of this stuff out, then that's just going to be great pre-work, right? And then you're going to modify your proposal accordingly, right, to say, you know, maybe do something else, right? Because you want to get the most out of the time, right, and you want to.
B
This is, all of this is, like, you know, stuff that you're hoping other people will use, right? So, cool, all right, great. And then I think we have just enough time to sync before my nine o'clock here, so, and then, so, let's see, writing.
A
So I, just to come back to the first question that I had: so yeah, the whole centralized generic function for this data source thing, instead of, like, copy-pasting the same code.
B
Yeah, so basically, so what we found here, no, that was a very good thing, I'm glad you said that. So what did we find here? Let's pull up the, now we have two sources that we can easily look at side by side: dataset and iris, okay. So let's look at iris, and let's look at our new one.
B
Right, okay, so let's go to the bottom, right. So we started with this, we copy-pasted it, modified it a bit, and got to here, right. So now we can look at it and see: well, okay, well, how far are we from our hard-coded two examples here to the point where we would have something generic that works for everything, right? And I think, looking at this, we can see: well, what's our general pattern, right? It's.
B
Source that you need, right. And then, you know, in this one we happened to do some pre-processing; on this one we didn't need to do any pre-processing, right. So the generic sense here is, well, you know: what is the file you want to download, and then what is the internal source you want to use, right? Now, the goal of this project, so, so this is just, you know, regular old Python code, which is great, right, now.
B
This project is also heavily based around this idea of the data flows, right, and so the data flows.
B
Are this generically configurable thing that you can use to basically, you can sort of do anything on the fly with them, right, and they allow you to sort of, like, mix configuration and code, but at the same time keep them separate and organized. And so, looking at this, in general my answer is: you use a data flow, because the goal of this project, in many ways, as well as trying to be a good place for machine learning, is to explore this concept of data flows and how you could, you know, extend anything.
B
How would I define a, okay, so: how would I define a source as a data flow? And then the data flow allows us, like, basically arbitrary configurability, right? So you could have an operation that instantiates a source, you can have an operation that runs cache download and an operation that instantiates a source, and you could effectively wire those up. And so the spoiler alert here is that I'm working on, I have a pull request open.
B
I believe that allows you to define any class as a data flow, and so once that pull request gets merged, so basically you can say: this method does this within the data flow. So a data flow is a set of functions that get run, like, on demand, right, and on demand in this case is whenever a method is called, right. We talked about update and records and record. So in this case you would run, you know, a specific operation for a.
A
For instance, like, you guys have this, you were working on the web interface, right? Right, drag and drop.
B
Yes, well, so you wouldn't, so basically, so here's an example of creating a data flow, and there's a command-line interface for creating it, but it basically dumps out to, there's a serialized format which can be represented as really anything, so in this case we're dumping it out to YAML. And this is the graph, like, you can visualize it, and that visualization is similar to that web UI project, where you could put together this.
B
For you. So basically what you would do is you would say, so, for example, this is multiply, right? So this is a data flow, these are inputs here, right? So you could define your data flow, and then you'd say: when I call the records method, which lists all records, I want you to send the input, well, records doesn't take.
B
You can run on any method call, right? So you're basically defining a class dynamically, right? You can basically think of it as, like, code generation, almost, right. And so you can get arbitrarily complex in terms of, like, mapping dynamic inputs or static inputs, right? And so in the case of these static dataset sources, you would say, you know, we would create a data flow which just statically maps these, like: these are the inputs that we provide statically, the URL and the hash, right, and we would.
B
We would say: on records, call cache download, and then, you know, run and yield each record in the CSV, or yield the CSV source, like, it's all arbitrary, right? It can be whatever you want, right. And so effectively what it does is it allows you this mechanism to make everything configurable, right. So we would take the code that we have here.
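The "define a class as a data flow" idea, method names wired to operations, static inputs (like the URL and hash) merged with whatever the caller passes, can be sketched as a toy. This is an illustration of the concept under discussion, not the library's actual DataFlow machinery.

```python
class DataFlowObject:
    """Toy sketch: each method name maps to an operation (a plain
    function here), and calling the method runs that operation with the
    statically configured inputs merged in."""

    def __init__(self, operations, static_inputs):
        self.operations = operations        # method name -> function
        self.static_inputs = static_inputs  # e.g. {"url": ..., "sha": ...}

    def __getattr__(self, name):
        # Only called for attributes not found normally, i.e. the
        # dynamically defined "methods".
        try:
            op = self.operations[name]
        except KeyError:
            raise AttributeError(name)
        return lambda **dynamic: op(**{**self.static_inputs, **dynamic})
```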
B
So if we wanted to create an arbitrary source that works for any dataset, we would add cache download as an operation, and then we would add something that figures out what the file extension is and instantiates the correct source based on the file extension, right? So we would have something that, every time it sees a URL, it downloads.
B
Okay, so I think we're over on time. I need to drop here and double-check that I can go into nine o'clock, and I think I can. I think that meeting got, I think I had no eight o'clock, and I think my nine o'clock got canceled, but let me double-check, and then let me hop on the meeting with you, Sahil. Okay.