Description
In the first part of the meeting, Jeff discusses grid cells formed via oscillatory systems, the Bush & Burgess model of ring attractors, and how this idea might be overlaid onto cortical columns.
Starting at 34:00, Subutai switches gears and discusses a new paradigm for achieving AGI via meta-meta-learning, reviewing Jeff Clune's 2019 paper "AI-GAs: AI-generating algorithms, an alternate paradigm for producing general artificial intelligence" (https://arxiv.org/abs/1905.10985). We discuss the prospects for meta-learning AGI, and for meta-learning Numenta's neuroscience-based approach.
A: All right, we are recording. I've been working on a basic task: really trying to understand, in detail, how a metric space like grid cells works and how it's implemented in the cortex, specifically the real anatomy of it. How would you create these things? I haven't been able to focus as much time on preparing anything, but I think I've made some interesting observations, so I thought today I would just present some of them. You know, I've been working on the book so much that I haven't really been able to present this material. So what I thought I would do today is very briefly explain what I'm working on, not my results. This is just sort of a toe in the water to say: here's the thing I'm working on, and if that's of interest, maybe some of you can join me on it later. But hopefully, a couple of weeks from when the book is out, I'll be able to put it together.
A: I've been writing this up, but I'm not going to go through my write-up here. So, in that context, here's what I'm working on when I'm not writing the book. I showed this figure last time. It's a nice illustration of the basic idea of how a grid cell is created using the oscillatory interference model. Hopefully everyone will remember this. The basic idea is that you need two oscillators. You need a baseline oscillator, which is in red; that's your baseline theta. Almost all the oscillatory interference models work on this basic idea, but this is only one cell. It does not map the entire arena, it doesn't address 2D; there's a whole bunch of questions that sit on top of this. But this model is also well regarded because it illustrates the precession idea, where the cell fires later in the theta cycle as it approaches its peak and then fires earlier in the theta cycle as it recedes from it.
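The two-oscillator mechanism described here can be sketched in a few lines. This is an illustrative toy, not any specific published model; the theta frequency, running speed, and velocity gain (`beta`) are assumed values chosen for the example:

```python
import numpy as np

# Toy 1D oscillatory-interference model: a baseline theta oscillator
# interferes with a second oscillator whose frequency grows with running
# speed (a velocity-controlled oscillator). Their beat pattern produces
# periodically spaced "firing fields" along the track.
dt = 0.001                         # time step, seconds
t = np.arange(0, 10, dt)           # a 10-second run
f_theta = 8.0                      # baseline theta frequency, Hz (assumed)
speed = 0.2                        # constant running speed, m/s (assumed)
beta = 10.0                        # assumed gain, Hz per (m/s)
f_vco = f_theta + beta * speed     # active oscillator runs slightly faster

interference = np.cos(2 * np.pi * f_theta * t) + np.cos(2 * np.pi * f_vco * t)
firing = interference > 1.5        # cell "fires" where the oscillators align

# The envelope repeats at (f_vco - f_theta) Hz, so in space the firing
# fields repeat every speed / (f_vco - f_theta) metres.
field_spacing = speed / (f_vco - f_theta)
print(f"predicted field spacing: {field_spacing:.3f} m")
```

Note that the spacing works out to 1/beta, independent of speed, which is the usual appeal of velocity-controlled oscillators: the temporal beat rate changes with running speed, but the spatial period of the fields stays fixed.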
A: Instead of having one green guy, you have multiple green guys here, each one at a different phase. If you did that, you'd have one cell firing here, another cell firing here, and another cell firing here, and this has been postulated as a ring attractor. I took this picture from the paper we reviewed a while back, the Bush and Burgess hybrid model paper. Unfortunately, I find it a confusing drawing, but it does illustrate the issue. Here they're proposing there's a ring oscillator: all these cells are firing at the same frequency, but they're slightly in and out of phase, so the peak of their firing travels around in a circle like this. And they're all at a velocity-controlled frequency, depending on movement in a particular direction.
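The ring-attractor reading of this (cells sharing a frequency but at evenly spaced phases, with the activity peak traveling around the ring) can be sketched as follows. The cell count and frequency are assumed for illustration; this is not the Bush & Burgess implementation:

```python
import numpy as np

# Toy ring of phase-shifted oscillators: N cells fire at a common
# frequency but at evenly spaced phases, so the peak of activity
# travels around the ring once per cycle.
N = 6                                     # cells in the ring (assumed)
f = 8.0                                   # shared frequency, Hz (assumed)
phases = 2 * np.pi * np.arange(N) / N     # evenly spaced phase offsets

def active_cell(t):
    """Index of the cell whose oscillation peaks closest to time t."""
    activity = np.cos(2 * np.pi * f * t - phases)
    return int(np.argmax(activity))

# Sample one full cycle: the activity bump visits each cell in order.
cycle = 1.0 / f
order = [active_cell(k * cycle / N) for k in range(N)]
print(order)  # → [0, 1, 2, 3, 4, 5]
```

Reading the phase of the currently active cell against the baseline theta is what turns this traveling bump into a position signal.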
A: So if I just had these cells here (sorry about that), these cells here: if I looked at when they peak relative to the phase of theta, they would implement a 1D grid cell module in this direction of movement, and these would represent another 1D grid cell module in this direction, and so on. So these are a bunch of 1D grid cell modules, if you compare when these cells peak against the base theta. And then, if you wanted to get a two-dimensional grid cell, that would be this lower row of cells. But if you want a really nice one, you should use a bunch of them, so here they're showing six of them at 60-degree angles. If you had a nice, perfect arrangement like this, then this cell, which is looking at these 1D grid cell modules, would basically be saying, "I'm going to create a 2D grid." So that's one 2D grid cell. So the question I've been asking...
A: No, they would all be going at different frequencies based on movement in a particular direction. Think of every cell as having a base theta, and then the theta increases a little bit depending on how fast you're moving in that cell's particular direction. So if the animal was moving very rapidly in this direction, this one would have the fastest frequency, and this one would be increased only a little bit, because it's moving only a little bit in that direction. Imagine it's going from left to right. Well, in this case, this one would be incrementing a little bit. It's not clear what would happen over here, because these oscillations don't slow down; they only go faster. I worried about that, but anyway: either this one is discounted, or it runs at base theta in this case. But the ones where you have a positive movement in their direction would be running faster than the other ones.
A: Yeah, I think that's right. I don't think that's likely going to turn out to be the case, Kevin, but it's possible. These models don't account for any kind of intermediary or propagation delay or anything like that, and I'm not thinking about that either. I think it's all local enough, and slow enough, unless we're going to insert extra intermediaries into it ourselves for some reason. I'm not thinking about that.
A: That's okay. Yes, actually, I agree with you. The question I'm going to ask is: where are the ring attractors within a cortical column? That's the question I'm getting at, and so there has to be a physical mechanism to make cells do this, right? If that's what you're saying, I'm with you. In essence, yeah: these models, as proposed... I don't believe they try to map this onto the biology in any sense. They're just saying this is a theoretical model, and I'm trying to map it onto the biology, where you get to the physical structure that has to implement it. If that's what you're asking, then you're right; that's what I'm trying to figure out. I've come to believe that this basic model of ring attractors is correct, and there are a lot of things which say it's probably the right thing, but how is it actually implemented in the neurons?
A: You know, in insects and so on there are some ring-like cell structures, but nothing like this in the cortex. My first assumption is that these are not in a ring, that they're actually a linear array of cells, and that the phase progression is moving along in a linear direction. So that's what I write down here: ring attractors are most likely linear arrays, not rings. That's the first thing. It seems that if there are going to be ring attractors, they aren't rings.
A: Exactly, exactly. So let's get down to the next thing. So then I asked myself: okay, I'm going with this idea of ring attractors; it makes a lot of sense, and I mentioned earlier how mini-columns look like they're velocity-modulated movement vectors and so on. So I've got all the right information there. So where are these? I've basically been pursuing two basic ideas.
A: We know it has these orientation preferences, and we used to say they were orientation columns, but that's not really true; they're orientation slabs. If you move in one direction, you see the cells respond to orientations that are changing, but if you move in the other direction, it's basically a common orientation. It's often drawn like this, but we know that these things are kind of messy; they come to these points, pinwheels, and so on. But this isn't...
A: This was a very well-established idea: in V1, if you move your probe across the columns in one direction, you'll find changing orientations, which to me are changing movement vectors; they're sensing the different directions this thing can be moving. And then in this direction you have a commonality, and it's really not clear what's going on in that direction. And of course there are mini-columns throughout this; I'll get to that in a second. So one idea I've been pursuing (this is not the one I'm currently favoring, but I put it in for completeness) is a simple one: the ring could be a set of cells in a layer in a mini-column.
A: So all the cells in this mini-column are basically representing the same movement vector, and then, if I can get the cells to fire at slightly different phases, and as long as there are enough cells to cover a whole cycle, you would have a ring attractor. That was one idea, and it seemed really convenient, because I've been arguing the mini-column is a movement vector, and I have this bunch of cells in it, and they can just progress like this. And it's not every cell in the mini-column; it's just one layer, even. So that was really nice, and it would result in many 1D grid cell modules. Essentially, every mini-column would be a 1D grid cell module, and there are a lot of advantages...
A: Having a lot of 1D grid cell modules makes for a very high-dimensional space. The other possibility I pursued (this is the one I mentioned when Florian was here) is that your ring is not vertical; instead the progression runs across the orientation slab. In this case, each mini-column is in essence a phase-shifted element of the ring, so each slab becomes a ring attractor, and you have a progression of phases moving in one direction like this. This seems a really odd idea; could that happen, and why would it happen that way, and so on? It means that going in one direction I get changing phase, and in the other direction I get changing orientation, or changing movement vectors. What's going to force that to happen? But I've actually come to like this idea a lot. I think there's something real there, and I'm not going to go through...
A: ...why at the moment, but one thing you do is give up a lot: you end up with many fewer 1D grid cell modules. What you gain is the possibility of using the pyramidal cells in each unit to identify a sort of temporal-memory-like context. So now I could select... you know, if there are 10,000 cells in layer 3 or layer 5, for example, I could selectively activate one or two to basically pick out a context, the way I see temporal memory working. I cringed at first, thinking this could really be true, because it's a very complicated system, but it's starting to look pretty good, actually, and I'm trying to work my way through the different issues associated with it. But I think the basic idea is that, well, we know there are going to be grid cells, and I'm really convinced that the oscillatory interference models are the way to go. That leads essentially to someplace...
A: You have to have 1D grid cell modules, which can be combined into 2D grid cell modules. You need ring oscillators to do that. So where are the ring oscillators? There are only so many places you can have them: one way is to do it vertically, the other way is to do it laterally, assuming there aren't actually rings anywhere in the cortex. And then you can sort of tease apart the different attributes: what happens when you do it one way versus the other way? I have a lot of notes on this, but I haven't structured them yet. So that's the basic idea of what I'm working on right now. I haven't reached a conclusion, but right now this one looks promising. The only downside to this method... and by the way, this can even lead to the two-dimensional grid cell models...
A: ...a two-dimensional array like we saw in the Tank paper. Because if you assume that mini-columns inhibit a surrounding set of mini-columns, you end up with a set of active cells, spaced apart, that look like the grid cells we saw in the Tank paper. That's all I want to say about this. I want to get back to it; I probably won't be able to get serious about it for about two weeks, but I just thought...
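The surround-inhibition point (active mini-columns forced apart into a grid-like arrangement) can be sketched with a toy winner-take-all loop; the sheet size and inhibition radius are assumed values for illustration:

```python
import numpy as np

# Toy surround inhibition on a 2D sheet of mini-columns: repeatedly pick
# the most excited unit and suppress everything within radius r of it.
# The surviving winners are forced apart, giving the evenly spaced
# activity pattern the discussion compares to grid-cell firing maps.
rng = np.random.default_rng(0)
side, r = 40, 6.0                       # sheet size and inhibition radius
excitation = rng.random((side, side))   # random drive to each mini-column
ys, xs = np.mgrid[0:side, 0:side]

active = []
exc = excitation.copy()
while np.isfinite(exc).any():
    i = np.unravel_index(np.argmax(exc), exc.shape)      # strongest unit
    active.append((int(i[0]), int(i[1])))
    # Suppress it and its surround so no later winner lands within r.
    exc[(ys - i[0]) ** 2 + (xs - i[1]) ** 2 <= r ** 2] = -np.inf

# Every pair of surviving winners ends up more than r apart.
d2 = min((a[0] - b[0]) ** 2 + (a[1] - b[1]) ** 2
         for n, a in enumerate(active) for b in active[n + 1:])
print(f"{len(active)} active mini-columns, min separation {d2 ** 0.5:.1f}")
```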
C: One of the things you're doing here: the previous models assumed... like, one of the papers assumed there's a synchronous clock frequency, the way you'd do it in a lot of chips right now. But there's an alternative form of logic, which is self-timed, where basically things propagate when they're ready to go and activate the next thing.
A: Yeah, although here I think it's a bit of a hybrid, because remember, these models all rely on the idea that there's a base theta frequency, and that is being shipped to everybody. I didn't show it here, but these are pyramidal cells in these mini-columns, and pyramidal cells always have an apical dendrite, so I'm assuming this is either layer 2/3 or layer 5. So in some sense this is clocked and unclocked at once: these guys propagate at their own rate, but they're also comparing it to a base theta. I think all of these cells will be getting the base theta, maybe on the apical dendrites, while they're being driven locally, at the theta-plus frequency, down in their particular layer.
C: I agree. If you have two different mechanisms, one of which is inherently low-latency, which allows you to distribute a reference signal, and the other of which inherently has a higher latency, so it can propagate across and produce those phase differences, then it makes sense to me. Because if you have mechanisms in there for both low-latency and high-latency propagation of signals within the cortex, then it falls out, you know.
A: Yeah, it is a little odd to imagine. If we're going to have these swim lanes, and they're running at slightly different frequencies, there has to be tissue structure to support that. And why, when I go in one direction, do I get a swim lane of phase shift, and when I go in the other direction, an orientation shift? Well, at least the orientation shift is there...
A: As far as I can tell, there are no theories at all for why there are slabs. I've been looking; I can't find anything that says what they do functionally. So it's appealing that maybe this is the purpose for them, and you've got these two things. But then there are questions like: why is it structured like this? How come it ends up this way?
A: There's the context thing we think about all the time: you have a bunch of cells in a layer that all represent the same thing, but in different contexts one cell becomes active. So I started thinking about what's going to cause all the cells in a mini-column to represent the same thing. It's going to be a bipolar cell; we've already determined that with the temporal memory. Now, why would we get phase in this direction and orientation in that direction?
A: Well, there are those other weird cells, the chandelier cells and these other interneurons. I've come to believe that all this calculation is being done at the interneuron level, and then you just assign a bunch of pyramidal cells to each of the elements created by the interneurons, and these allow you to create context.
A: So I need to really understand the mechanism for this propagation. We have to look very carefully at the interneurons, and there is a lot of literature on them. I think I mentioned once before (I can't recall) that I read an article saying one of these interneurons really has this little planar sort of aspect to its projections; someone else looked for it and couldn't find it, but I think it's there.
A: Another interesting question: this would imply that this sort of slab behavior would exist in every cortical region. I looked briefly (I mean, I spent like 20 minutes searching) trying to find papers that talk about slab-type receptive fields in other cortical regions. I didn't see anything, and I didn't see anything that contradicted it either. It's just that people haven't done this kind of analysis; they don't know what to look for, as far as...
A: ...if all these mini-columns were firing the same way... they're not firing the way you might imagine. You can imagine there are bipolar cells that represent the mini-column; okay, that means a set of bipolar cells. Those bipolar cells are firing at the same theta-plus-delta frequency, and they're phase-shifted, so this is the ring attractor. These would be six elements in a ring: this guy fires first, and within this theta-plus-delta cycle they go bing, bing, bing, bing, back to the beginning, bing-bing-bing-bing, and so on. And actually I've drawn these wrong; they're much longer than this, and you can have multiple peaks going along. So these mini-columns are now the elements of the ring attractor.
A: No, I'm not aware of it. There are so many papers; I just need to spend a quality day or two doing research on them. But as far as I know, I have not found anything which says what happens along the slab: if you go far enough, do you switch, say, from a left-eye preference to a right-eye preference?
A: You know, I looked at some of those papers, and they really didn't talk about this at all. There might be more; I'm sure I didn't see everything this morning, Kevin, but I haven't seen it yet. I saw some papers with titles like "traveling waves in the cortex", but when you read them, they're really not about this at all. So, you know, the slabs are kind of wiggly and they get wider; they go all over the place, and then they come to these points of singularity here. So it's not clear. What we're proposing is that there's a propagation of a phase along these contour lines, which is perfectly understandable. It's not like there's one wave of something going across everything here; it's not like that. But I don't think...
A: ...even to do it, you'd want to see a consistent phase relationship in a recording. In this case, if you were doing this in V1, it would require the animal to move; remember, we're talking about flow fields here. You have to activate these complex cells, and to do it properly the animal really should be moving. Then, depending on how fast the animal is moving, that's how much the frequency would change, and then you'd have to be able to see the phase shift between the cells. I don't know; I think I'm getting ahead of myself. It would be hard; you'd have to make sure you're designing the experiment properly to record it. And I'm with you: it could very easily have been missed. Also, I think it's not the pyramidal cells you have to be looking for; I think it's actually the bipolar cells.
A: That gets to be a subtle distinction, because it's the bipolar cells that are really doing this; the pyramidal cells themselves are coming along for the ride, and they may not be firing all the time. It's the bipolar cells that would be firing all the time. So I bet there's a lot of literature out there that touches on this in one way or another; it's just going to be difficult and time-consuming to find it and go through it and search it. But yeah.
A: As I said earlier, it's exciting to work on, and I think this can be solved. I think we probably have enough information, collectively, in the world's neuroscience literature to piece together the answer, though maybe not to really know for certain. Okay, I want to make sure we have time for Subutai.
A: No, that's just... yeah, you can't see it. Basically, the best ideas out there take another modality, ocular dominance, and just drop it on top of this base map. But again, you don't need two eyes to see, and so vision and hearing are not dependent on ocular dominance at all. So I tend to ignore that component, because it's a very specific thing to vision; it wouldn't apply to a fingertip or something like that.
A: I mean, again, I'm arguing that almost everything people thought about orientation in V1 is wrong. It applies correctly to layer 4 only, but for all the other layers, which have complex cells, I'm arguing those aren't even feature detectors; those are movement detectors, flow detectors. And remember, I talked about that research with the random-dot images, and they kind of concluded... yeah, right.
A: I just don't know if that applies to movement; it seems like it applies to trying to detect edges, as opposed to trying to pick up flow and movement. Maybe it does, I don't know. But almost all the literature about how these orientation preferences come about... well, a good portion of it will be incorrect if I'm right about this, because the complex cells are not doing what people think, right?
C: Something else: the one place I would argue it might be relevant, not to the somatosensory case, but to the extent that vision has a contributory signal to support whatever the motion is. That's a way of basically getting salience out of a very confusing set of things moving around and shifting and so on.
A: Yeah, well, I guess the question is how you determine flow movement; there's the question of how that happens. But remember again, those random-dot images work really well for complex cells, and there are no spatial frequencies in them, zero spatial frequencies in those images, and they actually activated complex cells better than edges did. Okay, so, if I understand spatial frequency, that would argue that those...
B: So this is a paper I read over the weekend that I thought was very thought-provoking. I was talking with Jeff about whether we had research meeting topics or not, and I just decided to put this together in the last hour before 10:00, so it might be a little disjointed; I apologize for that. But I did think this was a very thought-provoking paper. The paper is called "AI-GAs: AI-generating algorithms, an alternate paradigm for producing general artificial intelligence", by Jeff Clune. Some of you may have read this paper already.
B: He released it, I think, last year, and there's a bunch of stuff coming out along these lines that is quite interesting. He critiques the state of the machine-learning world as well as, in part of his paper, some of the neuroscience-inspired approaches, and he proposes a particular way that he thinks is going to be the fastest way to get to machine intelligence. Whether or not we agree with it... I'll have a take on that.
B: And if anyone else has read this paper, feel free to jump in as well. This is the paper that's up on arXiv, released last year. I found a couple of talks by him on YouTube, so I put together a short set of slides taken from his slides; I only created one slide of my own, which helped me put this together very quickly. I wanted to go through that and see what you guys think of it. Okay.
B: So, let's see. First, he talks about the manual path to AI. This is the dominant machine-learning paradigm, which is basically: you identify key building blocks, whether it's backpropagation and ReLUs and sigmoids and all of that, and then you try to put these together into more and more complex networks, trying to solve more and more complex tasks, and it's all hand-designed.
B: In machine learning it's sort of bottom-up mathematical principles, and he's asking: is this even possible? It's a Herculean task: a huge space of possible algorithms you might be trying to create, a huge space of possible networks; debugging and optimizing these things is a nightmare, and you need huge teams of people working on it.
B: You know, such as OpenAI and Google and so on. You put more and more compute resources and more and more data into it, and you try to build up, bottom-up, from that. This is the dominant paradigm today. In some sense, what we're doing is also on this path, except our key building blocks are neuroscience-based things, whether it's dendrites or sparsity or grid cells.
B: We're reading a lot of stuff from the neuroscience, but then we're trying to manually create these networks and figure out how to put it all together into working systems. Okay. So here, in machine learning, is an example of the types of building blocks that are out there.
B: Convolutional networks, attention mechanisms, different loss functions, models, Bayesian methods, active learning: there's a huge list of things, and literally a million people around the world are taking examples of these and trying to create a network that solves some specific problem. And the question is: is this even an efficient approach? Are we able to find all these building blocks? Can we create systems based on that?
B: One basic lesson the community has learned over the years is that hand-designed pipelines are ultimately outperformed by learned solutions as you get more data and more compute. Computer vision and reinforcement learning are really good examples of this. In the beginning, people used to design features by hand; HOG and SIFT are examples of features that people used a lot in the 90s. Now, in deep learning, you can pretty much learn these end to end.
B: You can just give it pixels and data; you don't need to hand-design features, because the learning algorithms are powerful enough to do a better job than any hand-designed thing. Same thing now with architectures: they used to be hand-designed, but now, through hyperparameter search and meta-learning algorithms, you can figure out which architectures are best. And hyperparameters: rather than manually tuning the learning rates and so on, you can, through various mechanisms, some of which we've used as well...
B: ...learn which parameters are best. Same thing with data augmentation: it used to be the case that you would hand-design specific data-augmentation techniques, but now you can run these huge meta-learning algorithms and they figure out the best data augmentation, which can be better than any manually designed data-augmentation scheme. So there are tons and tons of examples where, through learning and optimization, you can figure things out better than with manually hand-tuned networks.
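The "learned beats hand-designed" point can be made concrete with a minimal sketch. The search space and scoring function here are invented stand-ins for illustration, not anything from Clune's paper:

```python
import random

# Toy "search instead of hand-tune" loop: rather than picking a
# hyperparameter by hand, let a simple random search find it. The
# pipeline_score function is a stand-in for training a model and
# returning validation accuracy; here it peaks at lr = 0.01, which
# we pretend not to know.
def pipeline_score(lr):
    return 1.0 - (lr - 0.01) ** 2

random.seed(0)
hand_designed = pipeline_score(0.1)   # a plausible manual guess

# 200 random trials over an assumed range of learning rates.
best = max(pipeline_score(random.uniform(0, 0.2)) for _ in range(200))
print(best >= hand_designed)          # the searched setting matches or beats the guess
```

Real systems replace random sampling with smarter search (Bayesian optimization, evolutionary methods, learned optimizers), but the division of labor is the same: humans supply the building blocks and the objective, and the search supplies the configuration.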
B: So what he's proposing is: you want to learn as much as possible. You don't want to take all of these manual building blocks and hand-tune them; it's a very experimental, expensive process. Now that computing is becoming cheaper and cheaper, you should generate the algorithms themselves; that process itself can be automated. And this is slightly different, so he makes this point: it's not just evolutionary algorithms. The analogy is evolution; that's his existence proof: on Earth, evolution, with very, very simple rules, led to intelligent systems. But the machine-learning algorithms today that do this are much more efficient than evolution, because the problem with evolution is that it's not very efficient.
B: Yeah, we have much better search techniques now than evolution did. And so he proposes these three pillars that we need. The first one is to meta-learn the architectures, and that's fairly well known in the literature, I believe; I'm not as familiar with all of these.
B: Why not let the computer figure a lot of these things out? His hypothesis is that, by doing this, you're going to need fewer building blocks. You still need to come up with building blocks; it's not like there's no manual process in this. But it's a much smaller set, and all the work of combining them is much easier. This definitely resonates with me, because we've spent a lot of time coming up with neuroscience-based...
B: ...building blocks, but then, to actually get things working, there's a huge amount of tweaking: figuring out the data set, figuring out what the right learning rules are, the exact learning rules, what the right parameters are, and all of that. You spend a ton of time on this stuff. So the problem, at least, resonates with me; whether the solution is his solution is a separate question. But the basic idea is that you need fewer building blocks if you were to do...
A: ...this. By the way, the way I've always felt about that: think about when we did the temporal memory and anomaly detection and all that tweaking we had to do there. I always assumed all that tweaking was because we didn't have the algorithms right. They were close, but they weren't quite right, and we didn't really know how to make them right. I always felt that if we had really understood them better, it wouldn't have been hard. But maybe I'm wrong about that. Yeah.
B: He would say those would be the building blocks that you'd put in, but exactly how you implement reference frames so they work on real-world tasks and such is really difficult. It's one thing to have the cortical column roughly laid out; the gap between that and actual working systems is really, really hard. Well, I guess...
A: Maybe there are, like, ten things: you know, the temporal memory, the context-in-mini-columns hypothesis, and now we have reference frames. To me, you first discover those in biology, and until you know all of them, it's going to be really hard to do anything; but once you do know all of them, it won't be so hard. That's...
A: No, because that implies there's more neuroscience work to do. It's always been my take that the quickest way to get there is to understand those building blocks, and the only way you can figure out those building blocks is by studying the brain, right? And we don't know them all yet, so don't...
B
...give up on the brain, yeah. But if I were to represent his point of view, it would be: yeah, that's fine, but then, once you have the building blocks, or some set of building blocks, combining them and figuring out their details so that they actually work well is quite a big task. I...
A
I know, because we didn't have enough building blocks. There are so many things I know are wrong about it, but it was pretty good, and it highlighted a few neuroscience principles that no one knew about. So I guess, since we're talking about a hypothetical here, the question is always: hey, do we have enough building blocks today to turn on the automated meta-learning systems and build something, or is that a hopeless task and we need more building blocks? And to me it's...
B
...always. Well, I think the part you probably didn't see as much is all the detailed neuroscience: until you have working code where every single detail of it actually works (and I still don't know whether what we have is the best actual implementation or not, right?), there's still a ton of manual work that goes into it.
A
I understand that, and that's the part that could be, I know. But my point is, I think that manual work exists because we were missing all these other major pieces, and therefore even our models were wrong. They were better than other models, they were getting at some truth, but they basically had missing components, and because we missed those components, any implementation we do is going to be difficult. That's...
A
I don't know about that; I think that's unclear. How would we know? We've never had the right set of building blocks, so we've always been trying to make things work with a partial set that isn't correct, where the pieces don't fit together, right, and so on. So I think it's undetermined: if we knew all the correct principles by which the brain works, would it inherently be very, very difficult to put something together? I don't know. I don't think we know the answer to that.
B
So the problem with this is that it's going to be extremely computationally expensive. But his take is: well, the availability of computation is just increasing exponentially, so this is going to be a solved problem. And let's take a look at why this is likely to win. Well, it's fairly obvious.
B
Generating algorithms: he's arguing for AI-generating algorithms, and the thing is, the amount of human ingenuity you need gets smaller and smaller. You still need it, but the work of parameter tuning, of trying out lots of different variations and stuff like that: if you could do this, you'd need much less manual work there, and you could automate a lot more of that stuff.
B
Okay, so if we go do these AI-generating algorithms, he gives one example that I thought was quite interesting for us, and that's in the realm of catastrophic forgetting, or continuous learning. We've talked about this in the past; it's a big problem with today's machine learning. The basic framework here is: you learn task A, then you learn task B, but when you learn task B, you basically tend to forget everything about task A. And so, you know, humans and animals...
B
...don't have this problem. So how do we create systems that can solve the catastrophic forgetting problem, that can learn a continuous set of tasks without forgetting the past ones? He has a list of all of these different proposed solutions, which he would all basically put in the manual category. The more interesting one is this.
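The failure mode being described can be seen even in a one-parameter toy (my own illustration, not an example from the paper): fit a single scalar to task A, then to task B, and the task-A loss collapses.

```python
# Toy illustration of catastrophic forgetting: one parameter, two "tasks".
# Training on task B erases what was learned for task A.

def train(x, target, lr=0.5, steps=20):
    """Plain gradient descent on the loss (x - target)**2."""
    for _ in range(steps):
        x -= lr * 2 * (x - target)
    return x

x = train(0.0, target=1.0)        # learn task A: x ends up at 1.0
loss_a_before = (x - 1.0) ** 2    # essentially zero
x = train(x, target=-1.0)         # now learn task B: x is dragged to -1.0
loss_a_after = (x - 1.0) ** 2     # task A is "forgotten": its loss is now large
```

Nothing in plain gradient descent protects the old solution; that is the gap the proposed meta-learning machinery is meant to close.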
B
So you get theta one-one, and you keep going, and at the end of it you evaluate how well your sequence of this inner loop worked on this meta-loss function. Then, based on that, you differentiate: you use backprop through this entire process in the top row and come up with a better set of parameters, theta two. Then you go through the entire process again, evaluate the loss, and do backpropagation, or some other optimization method, that goes all the way back and generates a new set of parameters.
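That two-level structure can be sketched in a deliberately tiny toy (my own simplification, assuming scalar tasks and a scan over candidate values in place of the paper's differentiation through the whole training run): the inner loop trains sequentially through the tasks, the meta-loss scores the result on every task, and the outer loop adjusts a meta-parameter, here the inner learning rate.

```python
# Inner loop: train theta sequentially on each task (target t, loss (x - t)^2).
def inner_loop(theta, tasks, lr, steps=10):
    x = theta
    for t in tasks:
        for _ in range(steps):
            x -= lr * 2 * (x - t)       # ordinary gradient step
    return x

# Meta-loss: after the whole sequence, evaluate on ALL tasks at once.
def meta_loss(theta, tasks, lr):
    x = inner_loop(theta, tasks, lr)
    return sum((x - t) ** 2 for t in tasks) / len(tasks)

tasks = [0.0, 1.0, 2.0]                  # stand-ins for task 1 .. task T

# Outer loop: instead of backpropagating through the entire training run,
# just scan candidate values of the meta-parameter and keep the best one
# (the real scheme would take a gradient step from theta_1 to theta_2 here).
candidates = [0.001, 0.005, 0.01, 0.05, 0.1, 0.5]
best_lr = min(candidates, key=lambda lr: meta_loss(0.0, tasks, lr))
```

With the largest rate the final parameter snaps to the last task and forgets the rest; the scan settles on a smaller rate that balances all three targets, which is exactly the behavior the meta-objective rewards.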
B
I think, you know, you could think of evolutionary algorithms as an example of something like this, but we now have much better mechanisms, he claims, for going from theta one to theta two and so on. And if you were to apply this to catastrophic forgetting, what you get is something like this: first you start with a set of parameters, you learn task 1, then you learn task 2, and you keep going until you learn task T, and then you evaluate on all of the tasks.
B
Okay, what's one of the tasks? In a continuous learning case, a task might be learning, let's say, a couple of categories. Say you take ImageNet, which has a thousand categories. You learn two categories first, then the next two categories, and the next two, and you don't want to forget the first two.
B
So you learn the categories in sequence in this particular case, but then at the end you evaluate how well it learned all the categories. Of course, with typical backprop it would do horribly on all the categories; it would only learn the last two. But you evaluate on all of them, and then, as he proposes, you backpropagate through the whole thing, generate a new set of parameters, and repeat this entire process.
A
There's an optimization process that's running, and a meta-optimization process that's running, which is what goes from theta 1 to theta 2. So inside here you might have something like backpropagation running. Where it says task 1 here, that is an entire process of training one network on two categories, say; then you train that same network on the next two categories, and so on. That's what we all do. But then the new thing is this meta-objective.
B
You say: okay, how well did this whole thing do? You can look at the area under the curve, or some metric that says how well it learned all of the tasks, all of the categories. And based on how well it did, the optimization process, in this case backpropagation through this entire learning process, generates a new set of hyperparameters, and in this case actual starting weights, and goes through this entire process again. So this entire slide is automated, yeah.
B
No, no, the loss function is computed just normally. So let's say this is ImageNet. In the first task you learn categories 1 and 2, right? In task 2 you learn categories 3 and 4 only. In task 3 you learn categories 5 and 6 only, and then in task T you would learn categories 999 and 1000. No, no, I understand.
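Concretely, that split of the thousand labels into two-class tasks takes a couple of lines (a sketch; the labels here are just the integers 1 through 1000):

```python
# Carve 1000 class labels into 500 sequential tasks of two classes each.
def make_tasks(num_classes=1000, classes_per_task=2):
    labels = list(range(1, num_classes + 1))
    return [labels[i:i + classes_per_task]
            for i in range(0, num_classes, classes_per_task)]

tasks = make_tasks()
# tasks[0] == [1, 2], tasks[1] == [3, 4], ..., tasks[-1] == [999, 1000]
```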
C
What I was saying, and that's exactly what I was trying to feed back to you, was that, basically, you could look at what is being forgotten across all these tasks as it moves through, and by moving backwards it can come up with a better notion of how to deal with the loss as it's coming through, right? I mean, that's...
B
Right, yes, yes, yeah. The details of that I don't know right now; I would need to go through this paper in detail, which I'd like to do at some point, to understand it. But basically he does have a mechanism for doing that, they do have that, and I'll give the rough idea of it later. Just...
B
So once you've done all of this stuff, then what do you do? What's called meta-testing. Now you take a new set of categories, a new thing, and you go through a training process (this is meta-test training, oops), and then you evaluate how well it retained all of the different categories.
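The meta-train / meta-test separation described here amounts to holding out classes that the meta-optimization never saw (a minimal sketch; the 800/200 split is an arbitrary choice of mine, not from the paper):

```python
import random

random.seed(0)
classes = list(range(1000))
random.shuffle(classes)

meta_train_classes = classes[:800]   # used inside the meta-optimization loop
meta_test_classes = classes[800:]    # unseen: train on these sequentially with
                                     # the meta-learned settings, then measure
                                     # how much of each task was retained
```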
B
Then there was one interesting piece: the new building block they put in was a neuromodulatory network. Here he says: well, if you look at neuroscience, there are these neuromodulatory pathways that can affect plasticity and the learning system; whether it's dopamine or others, they can affect how fast you learn. So let's just put that in. So he puts in a network that basically acts like an attentional type of network.
B
Okay, so this is an example of a manual building block that he throws into this uber-optimization process. The red path is your standard convolutional neural network. The blue path is a separate thing, and basically all it's doing is updating how fast these things learn, or don't learn, in a kind of precise way, along this path back here. So...
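One way to picture that blue path (my own toy, not the paper's actual architecture) is as a sigmoid gate that scales each weight's effective learning rate, so the modulatory output decides, element by element, how plastic the prediction path is:

```python
import math

def modulation(scores):
    """Hypothetical modulatory output per weight: 1 = fully plastic, 0 = frozen."""
    return [1 / (1 + math.exp(-s)) for s in scores]

def gated_update(weights, grads, base_lr, gate):
    # Each weight's step is scaled by its own gate value, which is how the
    # modulatory path controls "how fast these things learn or don't learn".
    return [w - base_lr * g * m for w, g, m in zip(weights, grads, gate)]

gate = modulation([10.0, -10.0, 0.0])        # roughly [1, 0, 0.5]
new_w = gated_update([0.0, 0.0, 0.0], [1.0, 1.0, 1.0], base_lr=0.1, gate=gate)
# new_w[0] moved a full step, new_w[1] barely moved, new_w[2] moved a half step
```

In the meta-learning setup, the parameters producing the gate would themselves be tuned by the outer loop rather than set by hand.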
B
He set up meta-learning of the learning algorithm, where you're basically learning how to train the system, or how fast learning should go when you present new categories. And the third pillar, he says, which has had hardly any work on it yet, is to learn new kinds of data sets and benchmarks: the learning environment itself can be learned. Again he uses evolution as an analogy. Here are a couple of analogies. One is evolution, where, you know...
B
...let's say you have trees. I don't actually remember the exact example, but the environment itself is changing during evolution. So first you have trees, and you may have tons of leaves, maybe too many trees. And then, I think in his example, you have caterpillars that start eating the leaves, and then you have other things that eat the caterpillars, and then maybe giraffes that are eating the leaves.
B
And
then
you
have
predators
any
you
know,
I
know
they
giraffes,
but
they
you
know,
herbivores
that
stuff.
So
the
basic
idea
is
that
the
environment
itself
is
changing
and
as
the
environment
changes
you
get
more
and
more
intelligent
animals
emerging.
So
you
want
to
be
able
to
having
the
right
set
of
environments
and
data
sets
and
benchmarks
is
very
critical
to
this
entire
process.
B
I
think
another
example:
he
used,
as
you
know,
what
they
call
curriculum
learning
where
you
might
need
to
train
a
system
on
simpler
tasks
first
and
then
gradually
build
up
and
train
it
on
more
and
more
complex
tasks
and
that's
sort
of
similar
to
how
humans
learn.
You
know,
as
babies,
we
might
be
exposed
to
much
much
simpler
environments
and
our
parents
are
teaching
us
in
a
particular
way
and
and
then,
as
we
grow
up,
we
gradually
learn
more
and
more
complex
at
s
and
that
curriculum
itself
is
very
might
be
important.
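At its simplest, the curriculum idea reduces to ordering the training tasks by some difficulty measure before handing them to the learner (a trivial sketch; the task names and difficulty scores are invented):

```python
# Hypothetical tasks tagged with a difficulty score; train on the simplest first.
tasks = [("shapes", 3), ("edges", 1), ("scenes", 5), ("textures", 2)]
curriculum = sorted(tasks, key=lambda task: task[1])
order = [name for name, _ in curriculum]
# order == ["edges", "textures", "shapes", "scenes"]
```

In the paper's framing, even this ordering, and the tasks themselves, would be generated and tuned by the outer optimization rather than written down by hand.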
C
It sounds like, in a nutshell, he's found a way to automate generalization of a network. And if you don't take the lesson from curriculum learning, it could learn the wrong lessons for quite a while, because it might have simplified a complex problem down to something simpler, which would handicap it from that point on. So you kind of want to feed it a set of examples.
B
That's an example of what he's proposing, but it's not the only thing. We've talked about some other things too: you know, maybe we need to learn simple primitives, like curvature and smoothness and so on, before we can learn more complex objects and more complex shapes. But it's not the only thing. The basic idea is that the whole environment the system is embedded in, the details of how that environment works, should be learned.
B
Okay, so that's his overall conclusion: he thinks that working on these three pillars is going to be the fastest path to reaching AGI, and the manual path is not the way to go. You know, maybe you need the manual work to figure out what these building blocks are, but then, at the end of the day, you need these other meta-learning algorithms in there too, to really do well. And then there's one other part of his paper that I thought was relevant to us and wanted to bring up.
B
He
does
discuss
sort
of
neuroscience
based
approaches,
so
this
is
not
what
we're
doing
here.
I
just
want
to
be
fair,
but
he
sort
of
critiques
the
neuroscience
based
approach
in
the
following
way.
He
calls
it
the
mimic
path
and
that
it's
that
that
involves
neuroscience
the
studying
animal
brains
and
attempting
to
reverse-engineer
how
engine
intelligent
works
and
in
his
example
the
mimic
path
tries
to
recreate
brains
in
excruciating
detail.
B
You
know
as
much
faithful
detail
as
possible
and
the
Blue
Brain
Project
is
an
example
of
this,
and
you
know
he
says
these
are
worthwhile
independently,
just
because
it
he's
not
saying
they
shouldn't
exist,
but
he
does
nothing.
That's
gonna
be
the
fastest
way
to
intelligent
systems,
get
to
intelligent
systems
and
his
to
get
to
critiques
of
this,
which
I
thought
were
work
relevant
and
we've
come
across
this
scenario
too.
A
I don't know if everyone understands this sort of thing, so it's worth pointing out: the idea behind the mimic approach is that you don't actually understand what's going on; you're just recreating the details and hoping it's going to work. Which I think is nearly impossible. It's so remotely possible that that would work, because there are so many parameters, you don't know which ones are important, and therefore you just can't get it right. So I think that approach is never going to work. That doesn't mean it's not valuable, but right.
B
I totally agree. However, I think these two issues are issues for us too. The fact that neuroscience is very slow to produce new data and run new experiments, that's one fundamental kind of friction. The other critique (these are my words, not exactly his) is that the neuroscience community itself is a source of friction, because their goals are not necessarily to create machine intelligence. In my experience, 98% of them are not computer scientists, and a lot have trouble understanding computation and algorithms.
B
It's just something new; that's why it's interesting. And this makes it really hard for us to go through papers, communicate with this community, and get the data that we need. So I think this is a valid source of friction that we face, in Jeff Clune's terms.
A
Your loss function: knowing what it is, is critical. If your loss function is to do continuous learning on data sets like vision, well, that's what you're going to optimize for. And I think one of the problems with AI is that they haven't had the right loss function, or anything close to it. I would say the loss function is...
A
...you need to learn a sensory-motor model of the world that is able to predict its inputs continuously, and it's not, you know, doing better on this type of training and so on. So I guess my critique of this approach is that I don't think it's a path to get to AGI, because to get to AGI you need a very clear loss function, and I...
A
...don't think they're close to that yet, and neuroscience can tell us what that loss function is. The second thing is: you've got a set of parameters on the left, and that set of parameters is much more complicated than what people are thinking about today. Those parameters include things like attentional mechanisms with the thalamus, and modulatory mechanisms, and the hierarchical construction of columns, and so on.
A
These are the parameters the brain works with, and if you don't have the right set of parameters and the right loss function, you're not going to get to AGI. You may optimize the particular problem you're working on, but you're not going to get to AGI. So I think it's a wonderful approach if you knew exactly what your loss function should be (and it's a very complex loss function) and you knew exactly what your parameters should be; and my argument is that you can only figure those out by studying the brain.
A
If you knew those things, then yeah. But if you don't know those things, you're not going to get the correct loss functions or the correct parameters using this approach; AGI is just not going to fall out of this thing on its own. So that's my critique: it's great if you know those two things, and then you can solve a lot of machine learning tasks with it, but it's not a path to AGI until you can crack those two very complex things, the loss function and the parameter sets. Yes.
B
I completely agree with that, and I think with neuroscience we can come up with these building blocks and a lot of these details. You know, on our research team, a large part of what we end up doing is going from theta 1 to theta 2, and that might take us several weeks or months. And so, you know, can we automate our research team with this approach? Yeah.
A
It would, but don't fool yourself that that's the path to AGI. That's the path to getting better machine learning: better continuous learning, or some other thing you're trying to achieve. That's my critique. If we're trying to do continuous learning on certain types of well-established problems, yeah, I think that would work. But to call it a path to AGI, I think that's...
A
...pieces. But I've always felt, and I'm sticking with it, that you have to understand the complete framework of what a brain does and how it does it before you can do AGI, and that there is no shortcut to getting that framework without studying the brain. It just seems that people weren't able to intuit what the brain is doing; they can't even get that basic thing correct, or the basic mechanics of how it works.
A
So nothing has changed my mind that this was still necessary, and we're not done yet. Some pieces, like, oh yeah, okay, that's one of 25 pieces, or 15, or whatever it is, that you need to know, but on its own each one is insufficient. I mean, we have neuron models now, we have neuron models with dendrites, we can have, you know, oscillatory models.
A
That's a very specific goal, right? The goal should be, you know, and we're talking about not just a robot navigating: it's a system that learns continuously, working in any sensory modality, building a model of the world, where you interact with that model not just by moving your body but by moving your senses.
D
Someone proposes a data set, and then everybody just rushes in and tries to beat that data set. It seems to me that he's proposing that the data set is part of learning, so we should also learn better data sets as part of our learning algorithm. So how would that work with what we have today? I mean, how would people compare different approaches or algorithms if we move away from this benchmarking? There's...
B
Probably... yeah, you're right, I think that is what he's proposing. There's got to be some uber-benchmark, some end goal that is objective, and exactly what data you use to achieve that objective is irrelevant; it's not the issue. So maybe, you know, maybe you want to have an autonomous agent that can solve...
B
You
know
really
complex
tasks
in
3d.
You
know
you
where
you
can
put
it
into
any
3d
environment
and
it
can
figure
out
how
to
navigate
in
and
learn
about
the
environment.
Let's
let's
just
say,
and
then
maybe
in
order
to
get
there.
The
system
can
generate
simple
environments
as
a
starting
point
and
simple
actions
and
add
in
more
complex
actions
and
more
complex
environments
automatically
as
a
part
of
training
it
but
you're
right.
It
sort
of
begs
the
question
there.
You
know:
how
do
you
evaluate
it?
C
One of the things that's kind of, you know, driving some of the research I see Jeff doing is looking at animal behaviors and how to explain them, in terms of: they've got to be able to do this, they've got to be able to do that, they must have that capability. Those are, by definition, tasks. And the interesting thing to me is that, you know, I mentioned the robotics community.
A
That's, you know, how does the animal solve this particular sniff task in a rat, or something. But you may not be getting at the core elements of what it means to be intelligent at all; you're just asking how a brain, as a whole, solves a problem when the animal is motivated to get something to drink. And so I've always been cautious about that. What you really want, you know, humans would be the ideal subject, because you could have humans do cognitive tasks, but we can't probe human brains anyway.
A
I
think
focusing
on
animal
behavior
is
also
wrong.
You
need
to
it's
gonna,
get
you
down
the
wrong
path.
You
need
to
have
a
solid
framework
of
what
it
means
to
be
intelligent
and
then,
which
is
what
we've
been
doing
and
we've
made
a
lot
of
progress
on,
and
then
you
can
back
off
from
that
and
say:
okay,
if
I
wanted
to
design
an
animal
experiment,
how
do
I
make
sure
that
that
determines
is
testing
that
concept
versus
just
no
animals?
Just
trying
to
remember
you
know
get
through
water.
A
...ways, but I believe it leads you down the wrong path. I've been to these research labs where all these animals are tested, and you can see that they test what they can test. They can't test what we're trying to get at, which is what it means to be intelligent. I just think it's misleading; it takes you down the wrong path. That's...
A
We have a lot of data, we have neuroscience data. The problem, as Subutai pointed out, is that the neuroscience data we have, which is literally tens and tens of thousands of neuroscience papers, is not well written for our purposes. That's why it's difficult to sort through all the data. But I believe we have sufficient data in the neuroscience community. Just like I was talking about earlier, with these different problems: I said, well, what do we know about this? Well, you know, it should be distributed.
A
...where we can sort through it. I've found over and over again that if you spend enough time at it, look at enough papers, and then occasionally contact the scientist and say, you wrote this, but what did you mean, why did you put that in there, you can generally get to the answers you're looking for. It just takes a lot of time. I think that's what's unique about Numenta. I mean, you could argue...
A
If
you
don't
like
this
approach,
it's
fine
but
I'm
convinced
it's
the
only
way
it's
going
to
work
and
it
has
been
working,
which
is
you
have
to
stick
to
the
neuroscience
it's
difficult
as
it
is
combining
it
with
into
and
observation
and
psychology.
You
know
cycler
observations
and
so
on
and
and
then
you
can
start
matching
these
pieces
up
together
and
so
I
know
brain
has
to
do
this,
I
think
there's
no
tissue.
It
looks
like
this.
That's
what
I
was
doing
earlier
with
the
grid
cells
and
now
oscillators
like.
A
Where does this fit into the neural data? For the oscillatory interference model, I think I concluded it's like 99% certain it's right, and then we said, okay, if it's right, then it has to be somewhere in the neuroscience; where do we find it? Almost nobody does that. Very, very few people think about this at all. They might make neural models, but they're not constrained neural models.
B
I think so. I think the issue is the following: if we want to use these principles, once you get across the hurdle and say, okay, now these have to be applied to practical problems to create actual working AGI systems, there's a big gap. Maybe Jeff disagrees, but even once you have all of these principles, to the point where you have code that actually works and can demonstrate the principles on some non-trivial thing...
B
There's
a
lot
of
engineering
work
involved
in
that
process,
and
so
that's
the
and
there
we
might
be
able
to
use
approaches
like
this
to
kind
of
make
that
piece
more
efficient.
It's
not
sort
of
it
doesn't
remove
the
work
of
figuring
out
what
those
principles
are
in
detail.
It's
just
sort
of
an
engineering
step
of
taking
starting
from
that
and
getting
to
something.
That's
actually,
you
know
solve
demonstrably
short,
showing
something
pretty
difficult
or
showing
intelligence
of
some
sort.
So.
B
And-
and
maybe
you
know
what
Jeff
said
is
once
we
know
the
right
set
of
principles
that
could
be
around
that
could
be
much
more
efficient
than
then
a
purely
you
know,
machine
learning
the
principles
themselves
can
lead
to
a
faster
way
of
implementation.
Okay-
and
we
saw
this
with-
you-
know
some
of
the
sparsity
stuff,
with
temporal
memory
and
so
on,
but.
A
Right, that's the right summary, I think. Given that what we're doing is not AGI, that we're building partial solutions, like we did with temporal memory and are now doing with sparsity applied to neural networks, given that we are taking one principle at a time, or a couple of principles at a time, and applying them to problems that are not quite real brain problems, then this approach could be very, very helpful. I think we can all agree on that, and I think that's one of your main points. I think it remains to be seen when we have a full cortical model framework.
B
Possibly, yeah. I haven't gone that far. It may not be just the continuous-learning thing; it could be anything, any place where we are trying to create something that works in practice. The better we can optimize away the more rote parts of what we're doing, the better.
D
And to be fair, I don't want to diminish Jeff Clune's work or proposal. There is already a growing community, especially the meta-learning community, trying to solve a similar problem. I think Jeff is taking it further: he's talking about including the data sets as well in the optimization loop. But there are a lot of people who have already been thinking about that for the past few years, yeah.
B
Yeah, I think if you look at the three pillars he had: the first one is becoming a lot more popular; the second one, the meta-learning of learning algorithms, is still small but reasonable; but the third one, where you generate the data and environments, almost no one is working on that right now. There's...