Description
Jeffrey Hawkins is the American founder of Palm Computing and Handspring. He has since turned to work on neuroscience full-time, founded the Redwood Center for Theoretical Neuroscience in 2002, founded Numenta in 2005 and published On Intelligence describing his memory-prediction framework theory of the brain.
Recorded At MIT, Dec 15th, 2017
A: ...that is, cognitive science and neuroscience. I introduced him for the first time, as a speaker of our then Intelligence Initiative seminar series, in 2010, so that's seven years ago. He is the founder, as everybody knows, of Palm Computing and Handspring and, as such, has been a legend in Silicon Valley for quite some time. In 2003 he was elected a member of the National Academy of Engineering for the creation of the handheld computing paradigm and the creation of the first commercially successful example of a handheld computing device.

A: He has a deep connection to MIT. In its infinite wisdom, the MIT computer science admissions office (representing, I note, the other side of the street, not this one) rejected Jeff's application to the AI lab, and so made it possible for him to invent handheld computers and for us to have iPads and the like. Jeff wrote a book which is, in the meantime, a classic: On Intelligence, from 2004, describing his memory-prediction framework theory of the brain.
A: He has since maintained the belief that it is time for computer science to learn from the brain, for making computers more similar to the brain. Jeff and I agreed then on the belief that the time had come for a new attack on the problem of AI, and that neuroscience would provide important cues. You know the initiative: this was the Intelligence Initiative, the precursor of CBMM.

A: "The initiative is exciting. Over the last 40 years I have seen many intelligence initiatives come and go, but the positioning and thought behind I-squared (that was the name of the Intelligence Initiative) is the best I've seen. MIT is the ideal location for an initiative like this." And since then, companies such as Mobileye and especially DeepMind, which were then just tiny startups when they participated in the MIT symposium Brains, Minds and Machines that we organized in 2011, have gone on to great success.
A: So because of this, when I am asked what will be the next breakthrough in AI, I answer, of course, that I don't know, but that it is a reasonable bet that it will also come from neuroscience, and it may well come from looking in more detail at the anatomy and function of the layers in each cortical area. And this is what Jeff will speak about. The title is: "Have we missed half of what the neocortex does? Allocentric location as the basis for perception." Please join me in welcoming Jeff Hawkins.
B: Thank you, Tommy, that's very generous, and it's nice to be back here. I do view MIT as really setting the agenda in the field that I like to participate in, and I had almost completely forgotten about the fact that my application for a graduate program here was rejected many years ago. That's good, so I don't hold anything against you guys. Anyway.
B: So, yes, this is how I talk, and I won't explain it further; I'll just jump right into it. Just a few words about my company first, because it's a bit unusual. Numenta is a small business in Northern California. We're really like a private research lab: there are 12 people, scientists and engineers, almost completely dedicated to neocortical theory. We have a rather ambitious goal, which is to reverse-engineer the neocortex. I'm not embarrassed to say that it's an ambitious goal: it's achievable.
B: We should all be working on it one way or the other, and our approach is a very detailed biological approach. We want to understand the neurons and the circuitry as we see them in the mammalian neocortex: what it does and what its function is. Ideas inspired by the brain can come after you understand how the brain works, so we really stick to the biology. We test this empirically through collaborations with experimental labs and via simulation, and that's what I'll talk about today.
B: We have a second goal, which relates to what Tommy just mentioned, and it's definitely second in our case: to enable technology based on cortical theory. I'm still a believer that the way we're ultimately going to get to truly intelligent machines, the fastest path there, is to understand how the brain works, and to that end we have a very active open-source community. All of our stuff is very open, all of our source code; you can reproduce all of our experiments. We believe that ultimately this endeavor, whether it's us or other people, will be the basis for machine intelligence as we will see it in the future. Okay. I want to be mindful that I know everyone here is a neuroscientist and you all know this, but I find it's a good idea to review a few basics before I delve in. Mammals have a neocortex; non-mammals don't. In the human it's about 70% of the volume of your brain. This is my model; I carry it with me all the time.
B: ...language, somehow, is all based on the same sort of underlying fundamental architecture, which is just a remarkable thing to think about, but it appears to be true. And Vernon Mountcastle basically proposed this. He said: the way to think about the neocortex is to think about one little section of it, one that goes through the full two and a half millimeters. He called it a column, and he said that basically, in that column, you're going to have the essential function.
B: Now, if you open up a basic textbook, an introduction-to-neuroscience type of thing, you'll see a picture like this. They'll say: oh, there's a bunch of layers in the cortex. Input arrives at layer 4; layer 4 projects to layer 2/3; layer 2/3 is the output and goes to the next region; and then layer 2/3 projects to layer 5, and that projects to layer 6. That's how information flows through the cortical columns. It's actually not bad, but it's leaving out quite a bit, by my count.
B: Right now we deal with about 12 relevant cellular layers. Layer 3 is easily divided into two; layer 5 has three different cell types. These may not be visible layers (it doesn't mean the cells are actually stratified), but they are cells with a different anatomy, morphology, or physiology, which can be uniquely identified. Layer 6 is a very complicated layer. It has two sublayers, 6a and 6b, two very interesting layers, and it's got a bunch of other cells down below there.
B: If you just follow, for example, the same feed-forward circuit as we did on the left there, it gets complicated. There are actually two inputs to every cortical column, especially the non-primary ones: sometimes you have connections directly from other cortical regions, and sometimes they go through the thalamus. So there are two sorts of feed-forward inputs. They do arrive at layer 4, among other places, but they only form about 10% of the synapses on layer 4 cells. About 50% of the synapses on layer 4 cells come, as shown by this blue arrow, through this very unusual bidirectional connection with layer 6a. So if you want to understand what layer 4 is doing, you can't ignore what layer 6a is doing, because it's providing about half the input there. Then there's the layer 4 projection to layer 3; that's the output layer, which goes directly to other cortical regions. But layer 3 also projects down to layer 5.

B: And here you see a very similar type of circuit between layer 6b and one of the layer 5s: a similar sort of parallel structure, with this very, very characteristic bidirectional connection. That then projects to upper layer 5 (at least in some species it's upper layer 5, but it's the layer 5 thick-tufted cells), and that becomes a second output of the cortical column, and that is the one that goes through the thalamus.
B: So there are these two sorts of inputs and two outputs, and there's this complicated circuit going on in between. Now, a lot is known about the cortical anatomy. I'm not going to go through it, but we can summarize a few things here. We can say: cortical columns are complex, very complex. There are at least 12 or more excitatory cellular layers, two feed-forward pathways, and at least two feedback pathways.
B: You can show them here, and there are numerous connections up and down the column and in between columns. And then, of course, there's an entire inhibitory circuit, with at least as many cell types and equally complex. So this is a very complex system. Now, the function of this thing is also going to be complex. It's not going to be simple, so anybody who says "oh, it's a filter, it's changing this or changing that": that doesn't seem to be the case.
B: We should expect this thing to do a lot, and in some sense that's what we're looking at; this is the thing that makes us think it is the source of everything. In fact, whatever a column does has to apply to everything the cortex does, because this is the circuitry of the cortex. So we might think: oh, how does this handle touch, or how am I going to see with this? But it's also going to explain how we do language, and it also has to say something about how we do neuroscience and how we build buildings and so on. So it's something really remarkable. Now, I have two thoughts about this before I get into the details of my talk. One is, I just want to remind you: this is one of the most important scientific problems of all time.
B: It's worth stating, it's worth remembering, that it sits up there with the discovery of, you know, genetics. It's up there. It's really kind of the core of who we are as humanity. It's the only structure that knows things, it's the only structure that discovers things, and of course it defines us as a species.
B: It goes beyond the abstract, as I mentioned. Today, at the end of my talk, I'm going to give you explicit proposals about what many of these layers are doing. I'm going to be filling in a diagram here, explaining what's going on, at least our hypothesis for it. It won't be everything, but it's going to be an interesting foundation, and I'm going to make the case for it. Now, to do that in the time I'm allowed...
B: ...I have to move quickly through a whole series of concepts. Typically, when you give a scientific talk, you give one concept and you explain how you did it, what didn't work, your experiments, blah blah blah. I don't have time for that. I want you to understand that everything I present here is not just made up. It was a lot of work, a lot of testing; a lot of it took a long time, and I have a lot of confidence in it, but I can't present the data to explain why I have that confidence.

B: So I just want you to at least give me the benefit of the doubt. Later, when you ask me questions, I can go into detail about this stuff, in great detail, but I'm trying to tell a story here today, and I want to get to that end picture. Now, the way I'm going to tell the story is the way we discovered it. It may not be the best way, but it's the way I know.

B: So I'm going to start at the beginning. In the beginning, all of our work was based on a single observation: the cortex is constantly making predictions of its inputs. Every time I feel something, I have an expectation of what I'm going to feel, and that expectation is a very detailed prediction. As I move my hand along this lectern, if there were even the slightest little dip here, I would notice it.
B: So we asked ourselves the question; our research paradigm has been: how do networks of neurons, as seen in the neocortex, learn predictive models of the world? It's not that the cortex is only doing prediction, but prediction seems to be a fundamental component of what the cortex does. If we tease apart prediction, we might understand some of the functional components underlying it. So that's what we went about. Now, this research question can be broken into two parts, if you think about the patterns that are coming into the brain.
B: You've got these sensory streams, millions of sensory bits coming into the brain all the time. Why are they changing? For two fundamental reasons. Either the world itself is changing; I'll call those extrinsic sequences, like when you're listening to a melody: you're learning the sequence, and it's the pattern in time that matters. That's one form. The second form is when you move yourself, and you're doing this constantly: every time you move your eyes, several times a second, every time you touch something.
B: Every time you, you know, walk around the room, there's a flood of changes coming in. And it's been known for a very long time, back to Helmholtz, that you can't really understand the world from those sensory inputs if you're not accounting for the behaviors that go with them. It's the sensorimotor sequences that are leading to those changes, and so that's the other part of the problem. We started with the first one and then we tackled the second one. On the first one, we had a paper that came out in March of 2016 called...
B: ..."Why Neurons Have Thousands of Synapses: A Theory of Sequence Memory in Neocortex." The big idea is that we suggested every pyramidal cell is actually a prediction machine, and the vast majority of the synapses on the pyramidal cell are actually used for prediction; I'm going to walk through that. Then we showed that if you took a cellular layer, say one of the layers in one cortical column, a network of those neurons...
B: ...will learn a type of sequence memory: a very powerful sequence memory, a predictive memory. And in order to understand that, you also need to know something about sparse activations. So that's in that paper. Then we just had a paper come out in October of this year called "A Theory of How Columns in the Neocortex Enable Learning the Structure of the World." In that paper, the big idea is that we deduced that every column, when you think about it...
B: ...(we're talking mostly about primary and secondary sensory columns, but ultimately I think it'll be every column) must have a sense of an allocentric location. And I use the word "allocentric" in a very broad sense; it just means "other"-centered. I'm not using it in the specific technical sense that people who study grid cells and the like do. Really, and I'll say this because it was tripping some people up today, you can think of it as object-centric.
B: So when I touch this little clicker here, when my finger feels something, I'm arguing that the column receiving the input from my finger is also figuring out where the finger is on this object, and we'll get into that. So that was the big idea there: as the sensors move over objects and through the world, the columns learn models of complete objects, and I'll walk you through that. And then the third part is our current research, which has not been published; it's very new.
B: We asked the question: well, how could columns compute this allocentric location? We had the idea: let's look at grid cells and place cells, because they solve a similar problem. And after we studied this for a while, we came to believe that cortical columns contain analogs of grid cells and head-direction cells, and that they're solving the same basic problem that the entorhinal cortex solves when it maps environments.
B: That's what has been observed there, and the cortex is now using the same mechanism to map physical structures, objects. It's a very parallel process, and once we understood that, we started to understand the function of numerous layers and connections. So I'm going to go through this in order: I'm going to go very quickly through these points and end up down here with the specific functions of layers, so I'm going to go pretty quickly. Let's start with one slide on the pyramidal neuron as a prediction system. Here's your typical pyramidal neuron.
B: It has thousands of synapses, anywhere from five to thirty thousand excitatory synapses. Only 10%, or less than 10%, typically are proximal; those can actually drive the cell to fire. 90% of them are on either the distal basal dendrites or the apical dendrites, and typically they're completely unable to make the cell fire, although a lot of great research has been done showing that dendrites are active processing elements.
B: So if you have somewhere around 15 synapses that become active relatively close together in time and space (they have to be within about 40 microns on a dendritic segment), they can generate a dendritic spike. The dendritic spike can travel to the soma. Generally it does not cause the cell to fire; it depolarizes the cell, so it raises its voltage, but not enough to generate a somatic spike. That can be a sustained depolarization, lasting hundreds of milliseconds, up to a couple of seconds.
B: We are going to argue that that is a predictive signal. This is our theory: the proximal synapses cause somatic spikes and define the classification field of the neuron, but the distal synapses cause dendritic spikes, and they put the cell into a depolarized state, a predictive state. What's the benefit of a cell being depolarized? Our neuron models and our network models rely on this fact: what happens is that the depolarized neuron will fire a little bit sooner than another neuron.
B: If they both have the same feed-forward receptive field, the one that's depolarized will generate its first spike a little bit quicker, and it's going to inhibit its neighbors via a very fast inhibitory circuit. And it turns out a typical pyramidal neuron can recognize hundreds of unique patterns, hundreds of unique contexts in which it can predict its input. This is how we model it; we use this in all of our simulations. This is a picture of our software model.
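The proximal/distal split just described can be written down as a toy point neuron. This is a minimal sketch, not Numenta's actual implementation: the class name, thresholds, and integer input encoding are all illustrative assumptions.

```python
class ToyPyramidalNeuron:
    """Sketch of the neuron model described above: proximal synapses define
    the feed-forward classification field; distal segments are coincidence
    detectors whose dendritic spikes put the cell into a predictive state."""

    def __init__(self, proximal, distal_segments, theta_prox=8, theta_dist=10):
        self.proximal = set(proximal)                      # can drive the soma
        self.segments = [set(s) for s in distal_segments]  # contextual inputs
        self.theta_prox = theta_prox   # somatic spike threshold (illustrative)
        self.theta_dist = theta_dist   # dendritic spike threshold (~15 in the talk)

    def fires(self, active_inputs):
        # enough proximal synapses active -> somatic spike
        return len(self.proximal & active_inputs) >= self.theta_prox

    def predictive(self, active_context):
        # any one distal segment over threshold -> dendritic spike -> depolarized
        return any(len(seg & active_context) >= self.theta_dist
                   for seg in self.segments)

n = ToyPyramidalNeuron(proximal=range(10), distal_segments=[range(100, 115)])
print(n.fires(set(range(9))))              # True: 9 proximal synapses active
print(n.predictive(set(range(100, 112))))  # True: 12 synapses of one segment active
```

Each distal segment acts independently, so a single cell can carry hundreds of stored contexts, which is the point made above about recognizing hundreds of unique patterns.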
B: For this thing, you see the proximal synapses in green there; and then we have the basal synapses, which we label "context": an array of coincidence detectors. The apical dendrites are similar; these act like threshold detectors. So this is our model of the neuron. It has multiple states; I won't get into all of it. I should also point out the learning model.
B: Here we rely on synaptogenesis, so we're not changing the weights of synapses; we're actually growing new synapses in our model, in a very clever way that matches the biology, but I'm not going to get into that now. Now, the properties of sparse activations: we have to cover this, because you won't understand anything else unless I cover it. Maybe you know this already, but let's take an example.

B: Take a bunch of cells, say one layer of a cortical column (which layer doesn't really matter), and let's say it's five thousand neurons. Typically what we see is a very sparse activation, so let's say two percent of the neurons are going to be active at any point in time. So we have a hundred active neurons at any point in time; a moment later it's another 100, and a moment after that another 100. The first question to ask is: what is the representational capacity of a layer of cells? How many different ways can I pick 100 out of 5,000? Well, you're probably not surprised: it's very, very big. What you may not know (you can type "5,000 choose 100" into any browser and it'll tell you) is that in this case it's about 3 times 10 to the 211. That's infinite as far as we're concerned; we don't have to worry about running out, we can pick them all day long. The second thing to ask is: if you randomly choose two activation patterns, what's the likely overlap?
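The "5,000 choose 100" capacity claim is easy to check directly. This is just a quick sanity check of the arithmetic, not part of the talk's software:

```python
import math

n, w = 5000, 100              # cells in the layer, active cells (2% sparsity)
capacity = math.comb(n, w)    # number of distinct sparse activation patterns
print(len(str(capacity)))     # 212 digits, i.e. about 3e211
```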
B: What's the distribution of the overlap; that is, how many cells would they have in common? In this case it's about two. Then you can ask: well, what's the chance of them having 10 cells, 20 cells, or 30 cells in common? It turns out that it's very, very unlikely. It very quickly drops off to essentially never, even though technically it could happen.
B: So you can pick random SDRs, sparse distributed representations, all day long, and they almost all overlap by just a few cells; they're very, very orthogonal in that sense. Now, we can take advantage of this, because what it means is that a neuron only has to form a few synapses to recognize a pattern; it doesn't have to form connections to all of the cells that are active. So in this case I said: I want this neuron to recognize a pattern where I have a hundred cells active. These are the gray cells.
B: It only needs connections, on one of its dendrites, to ten or twenty of those cells, and it can reliably recognize that pattern. Technically it could have a lot of false positives, but it just won't; it's just never going to happen.
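The "technically possible but never going to happen" claim about subsampling can be quantified. A sketch, with the subsample size and dendritic threshold chosen to match the numbers mentioned in the talk (20 synapses, spike threshold around 15):

```python
import math

n, w = 5000, 100   # layer size, active cells in a random pattern
s, theta = 20, 15  # synapses sampled from a target pattern, dendritic threshold

def p_false_match(active):
    """P(a random pattern of `active` cells drives >= theta of the s synapses)."""
    total = math.comb(n, s)
    return sum(math.comb(active, k) * math.comb(n - active, s - k)
               for k in range(theta, s + 1)) / total

print(p_false_match(w))   # well below 1e-20: effectively never
```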
The second thing we can do (this is perhaps something you haven't seen before, but maybe you have) is to ask ourselves: what happens if I form a union of patterns?
B: So instead of just invoking one pattern in this layer of cells, I'm going to invoke ten patterns; that's a thousand active cells, or twenty percent of the cells being active. Well, you could say: wow, this cell is going to be in trouble now, because it's still only looking at its ten or twenty synapses, and it could have a false positive. But if you do the math, it's still extremely unlikely.
B: So this cell, by connecting to 20 synapses in this whole population, can reliably pick out that pattern, even though there's a whole bunch of other patterns going on, and you can do unions much greater than that. We're going to rely on this property, because of what we think is going on in every cellular layer.
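The same calculation shows why unions stay safe: even with ten patterns superimposed (1,000 of 5,000 cells active), a 20-synapse subsample with a threshold of 15 almost never matches by chance. A sketch reusing the talk's numbers:

```python
import math

n, s, theta = 5000, 20, 15   # layer size, sampled synapses, dendritic threshold

def p_false_match(active):
    """P(a random set of `active` cells hits >= theta of the s synapses)."""
    total = math.comb(n, s)
    return sum(math.comb(active, k) * math.comb(n - active, s - k)
               for k in range(theta, s + 1)) / total

print(p_false_match(100))    # one pattern active: effectively zero
print(p_false_match(1000))   # union of ten patterns: still below 1e-6
```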
B: The column is representing things, and often there's uncertainty, and when there's uncertainty it's going to use a union. It's going to say: well, I don't know; it could be X, Y, or Z, and so on. And what it means is that the networks don't get confused as they try to resolve that uncertainty: they bounce back and forth and essentially narrow down to the only consistent answer. I'll explain some of this.
B: But the point is, we think unions are happening everywhere, and so the density of cell activity basically represents uncertainty; when you've really got something, when you know what's going on, it's going to be very sparse. Okay. Then we said: okay, take a bunch of those pyramidal neurons with sparse activations and put them in a layer like this, and add a few more things. We're basically going to put cells into minicolumns, say 10 cells per minicolumn, and the minicolumn doesn't have to be a physical structure.
B: All we're asking is that the cells in a minicolumn have a common feed-forward receptive-field property. This is like the classic Hubel and Wiesel finding from many years ago: all the cells arranged vertically have some common receptive-field property. You don't have to see the minicolumns; you just have to have that property.
B: So the cells in a minicolumn are going to respond to the same feed-forward pattern, but they're going to form horizontal connections that are unique. Here's what would happen in two time periods, time 0 and time 1. If there is no predictive state and an input comes in, it's going to activate all the cells in the minicolumn, because they're all equally receiving it and they all look similar. The other condition is where there is a predicted input, a predictive state, which I've represented by little red circles...
B: This means these cells are predicting they're going to be active; they're depolarized. The same input comes in, but now it's going to select one of those cells: the one that was predicted gets to fire first, very fast inhibition kicks in, and that basically enforces a sparse pattern. The next moment, the active pattern will in turn predict other cells, and so you move through these sparse activations in time, prediction then activation, prediction then activation, and that's the basis of sequence memory.
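The burst-versus-selection rule just described is tiny when written out. A sketch; the cell names are illustrative:

```python
def minicolumn_activate(cells, predicted):
    """One minicolumn receiving its feed-forward input: if any of its cells
    were depolarized (predicted), only those fire and, via fast inhibition,
    silence the rest; with no prediction, every cell fires (a 'burst')."""
    matching = [c for c in cells if c in predicted]
    return matching if matching else list(cells)

cells = ["c0", "c1", "c2", "c3"]
print(minicolumn_activate(cells, {"c2"}))   # ['c2']: predicted input, sparse code
print(minicolumn_activate(cells, set()))    # all four cells: unanticipated input
```

Because the surviving cell depends on which cell was predicted, the same feed-forward input is represented differently in different temporal contexts, which is what makes the sequence memory high-order.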
B: We have built this for years, we've tested it, and we've applied it commercially, so we understand it very well. I'll just mention a few things. It's very high capacity, and this is important to remember: a slightly bigger network than this, we've shown, can learn up to a million transitions, meaning something like 10,000 songs of 100 notes each. It's really high capacity. It's surprising.
B: It can learn high-order sequences. So imagine you're training it on two sequences, ABCD and XBCY. If you show it ABC, it predicts D; if you show it XBC, it predicts Y. It doesn't get confused by the shared B and C. Similarly, if I just show it the B and the C, it's going to predict both D and Y, because that's all it can do at that point in time. It does all these things automatically. It's extremely robust to noise and failure.
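The ABCD/XBCY behavior can be illustrated with a toy comparison. This is not the HTM algorithm itself (HTM achieves the context sensitivity with per-context cells inside shared minicolumns, as described above); it only shows the difference between first-order and high-order prediction:

```python
from collections import defaultdict

sequences = ["ABCD", "XBCY"]

# first-order memory: the prediction depends only on the current element
first_order = defaultdict(set)
for seq in sequences:
    for cur, nxt in zip(seq, seq[1:]):
        first_order[cur].add(nxt)

# high-order memory: context (here, the whole prefix) disambiguates
high_order = defaultdict(set)
for seq in sequences:
    for i in range(1, len(seq)):
        high_order[seq[:i]].add(seq[i])

print(sorted(first_order["C"]))   # ['D', 'Y']: ambiguous without context
print(sorted(high_order["ABC"]))  # ['D']
print(sorted(high_order["XBC"]))  # ['Y']
```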
B: You can knock out 40% of anything and it still performs well, and it has very desirable learning properties: it's all local learning, very simple rules (I won't get into all of that), and it satisfies many biological constraints. There are many people implementing this by now, and it's being used in some commercial applications, but it is a biological model first and foremost. Okay, that was the first section. In the second section we asked: how are we going to learn predictive models of sensorimotor sequences?
B: Our first idea was: okay, let's start with the same cellular layer; can we turn it into a sensorimotor model? And we said, well, here's a basic idea: what if we just added a motor-related context? So instead of the context just being the previous state, we have a motor-related context. We were inspired because we said: look, we know that 50% of the inputs to the layer 4 cells come from layer 6a. So that's an idea.
B: Let's go with that. Then we asked ourselves: what would that motor-related context be? This is the hypothesis: by adding a motor-related context, a cellular layer can predict its input as the sensor moves. And then we asked: what is the correct motor-related context? We started working on this several years ago. We tried different things, and they kind of worked, but they didn't work really well; they didn't scale well and so on. But a little bit under two years ago...
B: ...we had an "aha" about it, and this gets to the allocentric part. Let me use my coffee cup as my prop; I'm going to use this a lot during this talk. You can just ask yourself a very simple question. Imagine I'm not looking at this coffee cup, I'm just touching it, and I'm familiar with it.

B: This is my coffee cup from my office, and I'm holding it in my hand. I'm about to move my finger; can I predict what I'm going to feel? Yes, I can. I know I'm going to feel this edge here. I also know that if I touch down here, I'm going to feel this little rough patch, because this cup has a rough bottom. It also has this little doodad here. So before I touch it with my finger, I make the predictions; I know what I'm going to feel.
B: Now, how could I know that? What do I have to know? First of all, the cortex has to know that this is a cup; it has to know the object. And it has to know where it's going to touch the cup. If I'm going to predict what I'll feel, it must know where, and the "where" that matters is where on the cup it's going to touch. It's not relative to my body; it's relative to the cup. I need to know the allocentric location to possibly make that prediction.
B: That's the deduction, and the predictions are going to be at a fairly fine granular level. Every part of my skin touching this cup is predicting what it's going to feel, and that's a lot of predictions; it's not some global prediction, it's a very local prediction. So we realized that this is a requirement, and that's where the allocentric location comes from. Okay, so the question now is: if we have an allocentric location, a location on the cup, how could we derive it? We didn't know what it looks like.
B: We didn't know; in the beginning we just assumed we had it in some form. We did experiments where we sort of randomly made up stuff. And we also realized we really wanted a second layer in the network. The second layer is what is typically called a pooling layer; that's a term a lot of people use. If you don't know what it means, in this case what I mean by it is this: in the second layer we're going to essentially pick a sparse activation of cells up there, and it's going to stay constant...
B: ...while the lower layer is changing. The cells in that upper layer are going to learn to respond to a series of independent, individual sparse activations in the lower layer. So if you think about the lower layer, it's sort of representing a feature, the sensory feature, at a location, and so you're basically modeling an object as a set of features at locations. It's kind of like a CAD file. Well, it kind of makes sense; what else could you do...
B: ...modeling an object? And what's interesting here is that the output layer, this object layer, is going to be stable over movements of the sensor, while the input layer will be changing with each movement of the sensor. You have a stable representation of the object as you move, and it doesn't matter in which order you move or how you touch the object, as long as you know the allocentric location, that magic signal.
B: We don't know how to compute that yet, but that's the magic. So we modeled this, and we did a lot of work with it. With an allocentric location input, a column, this two-layer network, can learn models of complete objects by essentially sensing different locations on the object over time. So, integrating over time, you can both learn models of objects and infer objects. I'll show you that now.
B: The next thing we realized is: suppose you had a series of columns near each other. Imagine they represent the tips of three of your fingers, and you're going to touch that coffee cup with three fingers at a time. Well, each finger is going to have its own location on the object; each finger is going to have its own sensory input. Those are unique, but they're all basically trying to model the same object, and if they're confused, an individual column may not know what the object is. But the output layers...
B: ...of the three columns are going to be basically representing the same thing. And so, if you form associative links between them, they can vote together and help resolve ambiguity. That's the basic idea: each column has partial knowledge of an object as its sensor moves, and these long-range connections in the object layer allow the columns to vote. Inference will be much faster when you're using multiple columns than with one column.
B: It's just like if you asked me to reach into a dark box: if I use one finger, I'd have to move it around to figure out what I'm touching, but if I grab the object with my hand, I'll get it. Or if I were looking at the world through a straw, I'd have to move the straw around a bit, but if I open my eyes, I see the whole thing and I can do it very quickly. So here's a little cartoon animation just to illustrate some of this. It's not terribly accurate; it's for illustration purposes.
B: So imagine this finger is going to touch this cup in three locations, and I have one column, with its input layer and an output layer. As I move toward the spot I'm going to touch, I have a predicted location signal, and it basically invokes a union of the possible sensations I might find at that location. When I actually touch it, a sensory feature comes in; the input layer selects one of those sensations and projects up to the output layer, and the output layer says: I know three objects that meet this.
B
The coffee cup, the can and the tennis ball all meet that, so I'm forming a union representation up there. Then I go to a new location; I get a new location signal, and it basically makes a prediction about what it might sense. I actually get a proper sensation, this feature at this location; I pass it up to the output layer, and I eliminate the tennis ball, because that's inconsistent with feeling a lip or an edge. And then I go to the final sensation here.
B
New location, new sensory feature; pass it up, and I can eliminate the soda can, because it's inconsistent. If I do this with three fingers at the same time, like the hand grasping it, I get three different locations and three different features (in this case, we're showing them the same). They pass them up, and in the output layer we can say: well, column one says it could be a coffee cup or a ball, and the other ones are saying it could be the coffee cup or the can. They just quickly associate with each other, you eliminate, and you're down to the only thing that's possible for all three of them: the coffee cup. So very quickly you resolve it.
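The elimination process described here can be sketched as simple set operations. This is purely an illustrative toy with hypothetical object names and (location, feature) pairs; the actual network described in the talk uses sparse neural representations, not Python sets.

```python
# Toy sketch of the narrowing described above. Objects and features are
# hypothetical; the real model uses sparse neural activations, not sets.
OBJECTS = {
    "coffee cup":  {("top", "lip"), ("side", "curve"), ("bottom", "edge")},
    "soda can":    {("top", "lip"), ("side", "curve"), ("bottom", "flat")},
    "tennis ball": {("top", "curve"), ("side", "curve"), ("bottom", "curve")},
}

def consistent(sensed):
    """Objects consistent with every (location, feature) pair sensed so far."""
    return {name for name, model in OBJECTS.items() if sensed <= model}

# One column integrating over time: each touch narrows the candidate set.
touches = set()
touches.add(("side", "curve"))
print(consistent(touches))           # all three objects remain
touches.add(("top", "lip"))
print(consistent(touches))           # tennis ball eliminated
touches.add(("bottom", "edge"))
print(consistent(touches))           # only the coffee cup remains

# Three columns voting at once: intersect their candidate sets in one step.
votes = [consistent({("top", "lip")}),
         consistent({("side", "curve")}),
         consistent({("bottom", "edge")})]
print(set.intersection(*votes))      # {'coffee cup'} immediately
```

The same intersection that takes three sequential touches with one column takes a single step when three columns vote, which is the speed-up the talk describes.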
We tried this out then on a more sophisticated problem. We started with the YCB (Yale-CMU-Berkeley) object benchmark, which is about 80 objects. They'll actually send them to you, or you can just use the 3D CAD files. So we figured, since some of them are perishable food items,
B
we would go for the 3D CAD files. And then we built a simulated robotic virtual hand using the Unity game engine. We built sensory arrays on each of the fingers, and we built a multi-column array representing each finger. We used 4,096 neurons per layer per column, so with three fingers we've got about 24,000 neurons, each with thousands of synapses. Not surprisingly, because it's a simulation, it worked very well, but I'll just point out a few things.
B
One thing to talk about here: we did it with one finger, and the one finger is touching the object at different places. With one touch, you can't really tell what the object is. So this is the confusion matrix: the actual object is on this axis, and the vertical axis is what the network thought it might have been, and you can see this.
B
Obviously the right answer is the diagonal, but in this case there's a lot of confusion. After the second touch, things started narrowing down quite a bit; after six touches it was doing really, really well; and after ten touches you're nearly guaranteed to get it. There's a lot of variability here, because if you touch unique features on the object, you can narrow it down quicker than if you touch non-unique features. But this gives you the general idea.
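For readers unfamiliar with the evaluation, a confusion matrix just tallies, for each actual object, what the system inferred; a perfect run puts every count on the diagonal. A minimal sketch, with hypothetical labels:

```python
from collections import Counter

def confusion_matrix(actual, inferred):
    """Tally (actual, inferred) pairs; perfect inference is all-diagonal."""
    return Counter(zip(actual, inferred))

actual   = ["cup", "cup", "can", "ball"]
inferred = ["cup", "can", "can", "ball"]   # one error: a cup inferred as a can
cm = confusion_matrix(actual, inferred)
print(cm[("cup", "cup")], cm[("cup", "can")])   # 1 1
```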
B
We also did a lot of experiments looking at basically the number of columns, or, if you want, the number of fingers, though we can do this abstractly. And of course, what we'd expect is that the fewer columns you're using, the more touches, the more sensations, you have to have to recognize the thing; and if you have more columns, it quickly settles down to the point where it can basically do it in one sensation. And it gets harder or easier depending on some other parameters.
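This tradeoff can be illustrated with a toy simulation (my construction, not the experiment in the talk): objects are random sets of (location, feature) pairs, and each step senses one pair per "column" until a single candidate remains.

```python
import random

random.seed(0)
# 30 hypothetical objects, each defined by a feature (0-4) at 12 locations.
OBJECTS = {f"obj{i}": {(loc, random.randrange(5)) for loc in range(12)}
           for i in range(30)}

def sensations_needed(target, n_columns):
    """Sense n_columns distinct locations per step; count steps to uniqueness."""
    candidates = set(OBJECTS)
    pairs = sorted(OBJECTS[target])
    steps = 0
    while len(candidates) > 1:
        steps += 1
        sensed = set(random.sample(pairs, n_columns))
        candidates = {o for o in candidates if sensed <= OBJECTS[o]}
    return steps

# Fewer columns generally means more sensations are needed:
print(sensations_needed("obj0", 1), sensations_needed("obj0", 3))
# Sensing every location at once identifies the object in a single step:
print(sensations_needed("obj0", 12))
```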
B
There are a lot of parameters; you can make this harder or easier, but the point is we showed that sort of characteristic. All right, so that was the big idea there. But then we really said: okay, we've got to get to the heart of this allocentric location thing. What's going on there? What does that mean? And, as I said, when we thought about that, we said: let's go look at the entorhinal cortex to see
B
what's going on there. Now, I know there's a bunch of hippocampal people here, and we were talking about this this morning. There are various reasons why we chose to model the entorhinal cortex; so think of it as the entorhinal cortex, but don't get mad at me if I don't talk about your favorite topic. So we ended up here. This wasn't our initial hypothesis. Our hypothesis is that cortical columns contain analogues to grid cells, and very recently we realized they also had to have analogues to head direction cells.
B
That was the last missing piece, one that I didn't know about until just a few weeks ago. So let's just talk about what goes on in the entorhinal cortex. I don't claim to be an expert in this, but we have run this by some experts and they said: it's okay, you can say this, Jeff. So one of the things the entorhinal cortex does is
B
it allows an animal (typically we study rats) to basically build maps of its environment, to know where it is, and to be able to make predictions and know where things are; it's sort of the foundation of navigation problems. And grid cells: I won't go into all the details (we all know about them, and some of the details are really important, but I won't get into them); they allow us to encode location. So here's the way to think about this. If you look at these rooms, they're actually the same shape.
B
I'll just say: every point in this room can be associated with a sparse activation of the grid cells. So you have a bunch of grid cells in these grid cell modules, and if you just looked at which cells are active and which cells are not active, it's sort of a sparse representation. I've shown you here three locations in these rooms; every location in these rooms has an associated pattern.
B
What's interesting is that the location codes are unique to the room, so the actual coding of these locations in room one will be very different than the coding in room two. This is actually essential to the whole theory. So this pattern means that location in that room (it's a sparse activation), this one means that location in that room, and this one means that location in the other room.
B
Those are very different things. And, of course, one of the most important things here is that this location is updated by movement. So even in the complete dark, if the rat is in that room and it moves, if it walks forward, it updates its location information. And one of the clever things is its path integration properties: if I want to go from here to there, I can go this way and then turn this way, and I'll get the same representation
B
as if I just went straight, or went around in a circle. And what's clever about this: it works even in novel environments that the rat has never been in before. So it may never have been in room three, but it'll have that path integration property there, even in the dark. So that's kind of clever.
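One way to picture these two properties in code (my illustration only; real grid-cell coding works quite differently, using phases across multiple grid modules): a location is a sparse set of active cells determined by the room and by coordinates that are updated only by movement, so any path to the same point yields the same code, while the same coordinates in another room yield an unrelated code.

```python
import hashlib

def location_code(room, x, y, n_cells=128, n_active=6):
    """Deterministic sparse code: the set of active cells for (room, x, y)."""
    digest = hashlib.sha256(f"{room}:{x}:{y}".encode()).digest()
    return {digest[i] % n_cells for i in range(n_active)}

def move(pos, dx, dy):
    """Path integration: movement updates the coordinates, nothing else."""
    return (pos[0] + dx, pos[1] + dy)

# Two different paths from (0, 0) to (2, 1) produce the same code:
p1 = move(move((0, 0), 2, 0), 0, 1)   # go right, then turn and go up
p2 = move(move((0, 0), 0, 1), 2, 0)   # go up, then turn and go right
assert location_code("room1", *p1) == location_code("room1", *p2)

# The same coordinates in a different room give an unrelated code:
print(location_code("room1", 2, 1), location_code("room2", 2, 1))
```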
Now, the rat needs to know its orientation. Try to do this in the dark yourself; I do it at night. It's actually fun to try, to see how good you are at it.
B
You need to know the orientation of, in this case, the animal's head relative to the room, and so there are these things called head direction cells. These are not driven by magnetic fields or anything like that; they are basically a set of cells which indicate the direction of the head. The anchoring of those head direction cells is unique per room; it's not always aligned along an edge, but it's always consistent within the room.
B
So if I walk forward two steps, well, it depends which way I was facing where I'm going to end up. Also, if I want to predict what I'm going to see or sense, I have to know where I am and which direction I'm facing, because I could be in the same location facing different ways. As the animal moves, both of these are updated simultaneously: you have to update the orientation (I'm going to use the word orientation, because I'm trying to generalize it) and the location; both get updated as I move.
B
In this case, the movement is, in the case of my finger, the movement of my finger relative to the cup, and so we have to have that. The second thing, and I only realized this recently: to solve the problems of modeling objects and modeling structures, you also need to have a good equivalent of an orientation. What I'm trying to show here is the somatosensory patch on the tip of your finger, sensing point A but from different orientations. So you can look at it
B
this way: I'm touching the lip of this cup, and as I rotate my finger like this, the sensation on my finger is changing, but the location I'm sensing on the cup is not changing. It is a feature of the cup, but I'm not actually sensing the feature; I'm sensing the feature at an orientation. The feature is actually the lip of this cup in the frame of the cup, but the sensation I get changes as I move the orientation of my finger relative to the object.
B
So we need to have something like that to relate the orientation of the sensor patch to the object. Now, I should state that I'm going to give this whole theory in terms of touch, but the whole thing applies to vision, and I believe it applies to audition as well. It's a little harder to think about that, but we're not doing anything modality-specific here; we're really trying to talk about generic properties of sensor patches relative to things. Anyway,
B
we're going to argue that this orientation is anchored to the object, and that it has to be updated with my movement. So our basic idea is the following: location and orientation are both necessary. That is, the location and orientation of my sensor patch, whether it's part of my retina or my skin: where is it?
B
So now, with this knowledge, we went back and did the following. We started putting these pieces together in ways that are interesting, and this is where I'm going to lay out the basic elements of the theory. This was my most complex slide, so if I lose you here, sorry; I'll bring you back in a moment, hopefully. And I think everyone who's really smart about figuring this out
B
is probably ahead of me already. I'm just going to say upfront, without any further justification, that layer 6a is representing the orientation of the sensory patch and layer 6b is representing a location. There are reasons for this; I'll get to them in a second. These are both going to be motor-updated; they're going to do path-integration-type updates; they're grid-cell-like and head-direction-cell-like, and they're going to have properties similar to those cells in the entorhinal cortex. Now, let's follow the circuit.
B
Here's the sensory information in your basic feed-forward pathway: you've got a sensation arriving at layer 4, and that's paired with this bidirectional connection, a very characteristic connection between layer 6a and layer 4. And what I'm going to argue there is that layer 4 is representing a sensation at an orientation.
B
Now again, if I didn't know the orientation, I'd just have a bunch of cells that look like edge detectors or something like that; but in the context of an orientation, I'll get a sparse pattern, and it's a sparse pattern that represents a sensation at an orientation. This is our sequence memory layer that I started with: it can learn sequences, but it can also learn sensorimotor sequences, and so it forms this unique representation of sensation at orientation. Now, the next layer is going to be a pooling layer.
B
What you end up with is a stable representation of the underlying feature, independent of the orientation of the sensor. So I would end up with a representation of whatever the thing is that I'm actually sensing at that point, independent of whether the sensor is this way, this way, or this way; if I went through that motion, that's what would happen in this layer, and it represents the feature that is being sensed at that point. At the moment, there's no concept of object: I'm not locating this on an object, I'm just representing what I'm sensing with my finger.
B
Layer 3 then projects to layer 5, a classic projection, and we're going to repeat this same circuit: we're going to have the location information projecting to layer 5b, and that's now going to represent a feature at a location. This is another sequence memory, and this is good: now we really have the feature at a location. Our earlier experiments didn't do this right and they had some problems, but now, because I've added the second stage up above, I really am locating the feature at a location.
B
This feature-at-location representation is independent of the orientation of my sensor, and if I pool over that in the upper layer here, which I'm labeling layer 5a (it would really be the layer 5 thick-tufted cells; in some species that's above or below, but just pretend it's the one above here), that pooling layer would then be stable over objects. It would be the actual object.
B
So we have this sort of two-stage sensorimotor inference engine.
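As a toy sketch of this two-stage idea (my simplification with hypothetical names, not the cortical circuit itself): stage one strips the sensor's orientation from the raw sensation to get an orientation-invariant feature; stage two pairs that feature with the allocentric location and matches it against learned object models.

```python
# Stage 1: sensation + sensor orientation -> orientation-invariant feature.
# Rotating the finger changes the sensation, but not the inferred feature.
SENSATION_TABLE = {
    ("edge-up", 0): "lip", ("edge-left", 90): "lip",
    ("flat", 0): "bottom", ("flat", 90): "bottom",
}

# Learned object models: which feature sits at which allocentric location.
MODELS = {
    "cup": {("lip", "rim"), ("bottom", "base")},
    "box": {("edge", "rim"), ("bottom", "base")},
}

def stage1(sensation, orientation):
    return SENSATION_TABLE[(sensation, orientation)]

def stage2(feature, location):
    return {name for name, model in MODELS.items() if (feature, location) in model}

# The same physical feature sensed at two orientations gives one answer:
assert stage1("edge-up", 0) == stage1("edge-left", 90) == "lip"
print(stage2("lip", "rim"))   # {'cup'}
```

Lookup tables stand in here for the sequence-memory and pooling layers described above; only the division of labor between the two stages is the point.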
Now, think about what I said earlier about sharing information between columns. The only two things being shared here are the object layer and the feature layer; those are the two things that neighboring columns might also be representing in common. Everything else in here should not be projecting to other columns, because it's unique to this column. And sure enough,
B
the two primary output layers of a cortical column are always identified as layer 3 and the layer 5 thick-tufted cells, and those basically represent the feature that you're sensing, independent of the object, and the object that you're sensing. Those actually can be shared across multiple columns, and those become the feed-forward input to the next regions. It's worth noting (this is the second point) that a column, therefore, is a two-stage sensorimotor model for learning and inferring structure; this just falls out of these properties.
B
One thing about touch, and it's important to remember: a column usually cannot infer either the feature or the object with a single sensation. It's just not going to be possible. You have two choices: you can take the single column and integrate over time, by sensing, moving, sensing, moving (as if your eye were looking at the world through a straw, a sequential mode), or you can vote with neighboring columns. Both of those strategies are employed in the brain.
B
So the theory is that evolution discovered a way of navigating and mapping out environments; it had to do this a long time ago, because all animals move and they have to figure out where they are and how to get home. And then there's another theory that's been published, that the entorhinal cortex arises from a three-layer structure in two parts. I forget the scientists who proposed this initially, but they proposed that the neocortex was actually formed by folding those two halves on top of one another.
B
That gave a six-layer structure. So we think what's basically happened is that evolution preserved much of what's going on in the entorhinal cortex (not exactly; there are differences, but it preserved that), and now it's used to learn how to model objects in the world. And in the human brain, what happens? It's continued that, and it's using that same mechanism to model concepts themselves. And that would suggest
B
that when we think about things, whether it's mathematics or physics, brains or neuroscience, politics or whatever, we're going to be using a similar type of mechanism. And what's interesting about this space, this idea of location and orientation: they're dimensionless, they're defined by behavior, and they're not metric; it's not like x, y and z.
B
It's sort of a very unusual way of representing these things. And if the behaviors weren't physical behaviors but more mental behaviors, like mathematical transforms or something like that, you could apply behaviors to abstract spaces, and this might be the core of high-level thought. Okay, I want to add one more thing here. It suggests that we might want to rethink some thoughts about hierarchy that we've all had for a long, long time. This is a cartoon drawing, but it captures some of the basic essence of it.
B
We think about senses arriving at a primary sensory region, labeled region one here; we extract some simple features, then we converge onto the next region and extract some complex features, and then somewhere up the hierarchy we actually start representing objects in their entirety. The proposal I have today is quite different. It says that every region has columns, and every column is actually learning complete models of the world. I mean, I'm not joking: a single column can learn thousands of things, and I've only talked about what six of the layers do.
B
There's a lot more to be done, but the idea is that these things are actually very powerful modeling engines, and you have a huge array of them, basically models all modeling the same stuff in the world. Now, a couple of things here. I want to make really clear: I'm not saying that the classic view was wrong. I'm adding some new thoughts to it that we hadn't really thought about before. One is: what's the difference between all these columns?
B
Well, one of the things about the cortex: when we talk about how regions project to each other, they never do it in a strict chain; they always project to at least three regions above. It's like the LGN: it projects to V1, but it also projects to V2 and V4, and people think, yeah, but those connections aren't really strong. Well, they might be diverging.
B
The point is, there's nothing that requires a strict hierarchy here, and so a secondary region could be looking at the same sensor array but over a wider area. And why would it be doing that? Imagine we're going to recognize the letter E. I'm going to argue that I can do that in V1; every column in V1 can recognize the letter E. And if that E was really, really small, right at the edge of my abilities,
B
it's only going to be recognizable in V1, because in the other regions it just doesn't exist; it's too fuzzy. But if it gets a little bit bigger, then it might be recognized by the columns in both V1 and V2. But if it gets really big, then V1 can't do that anymore; it's just too big an area for a column to move over. And so you could be representing things at different scales here, but complete objects in each case; they're sort of overlapping.
B
Now, what if I had two sensory arrays going at the same time? So I have a vision array and a touch array, and we're going to basically grasp the cup and see the cup at the same time. Well, you would be invoking models of the cup in many cortical columns, because there would be columns on the retina that are sensing the cup, and those columns in the somatosensory regions are sensing the cup, and so multiple columns are trying to infer that this is a cup. They all have models of the cup.
B
Some are derived visually, some are derived tactilely, but they all model it. Now, interestingly, if they all have models of the cup and they're all sensing similar features, it's possible that they can vote in various ways. One of the things we see in the cortex is a lot of projections which don't make sense in a hierarchical fashion. You see projections from S2 going to V2. Well, that doesn't make sense in a hierarchical fashion, but here they can be voting on cups; they can be voting on objects, voting on features.
B
They can go up and down the hierarchy, across the callosum, and it's interesting: as long as you go to the right layers, you can form very sparse connections to different parts of the brain and it works. You don't have to have a lot of connections to each column; you could just send one connection over here. It's kind of odd the way it works, but anyway, you can have all these connections that help vote. So the auditory system
B
will be helping the tactile system, the tactile system will be helping the vision system, the vision system will be helping the somatosensory system. So these non-hierarchical connections allow columns to vote on shared elements such as objects and features, and that's the kind of thing we see up here. Okay, so I'm almost done. The summary of the talk: we started with our goal, which is to understand the function and operation of the laminar circuits in the neocortex. Our methodology is to study how cortical columns make predictions of their inputs.
B
We then proposed the pyramidal neuron model, which is basically the prediction engine. We say every pyramidal neuron is basically using 90% of its synapses for prediction; each neuron predicts its activity in hundreds of contexts, and that prediction is manifest as a depolarization. We then said a single layer of neurons forms a predictive memory of high-order sequences. This has been well documented; it works as long as you have sparse activations, mini-columns, fast inhibition and lateral connections.
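The prediction idea summarized here can be caricatured in a few lines (a toy with made-up segment contents; the published model uses thousands of synapses with learned permanences): distal dendritic segments each recognize a sparse context pattern, and a match depolarizes the cell, putting it into a predictive state without making it fire.

```python
class ToyPyramidalNeuron:
    """Toy predictive neuron: distal segments recognize sparse contexts."""

    def __init__(self, segments, threshold=2):
        self.segments = [set(s) for s in segments]  # presynaptic cell ids
        self.threshold = threshold                  # active synapses needed

    def is_predicted(self, active_cells):
        """Depolarized (predictive) if any segment sees enough active inputs."""
        return any(len(seg & active_cells) >= self.threshold
                   for seg in self.segments)

n = ToyPyramidalNeuron(segments=[{1, 2, 3}, {7, 8, 9}])
print(n.is_predicted({2, 3, 5}))   # True: segment {1, 2, 3} matches
print(n.is_predicted({1, 7}))      # False: no segment reaches threshold
```

Each segment here plays the role of one learned "context"; a real neuron would carry hundreds of them, which is the "hundreds of contexts" in the summary.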
B
It's certainly not proven, but it's a potential framework that would tie a bunch of things together in a way that kind of makes sense. Columns learn models of objects as features at locations, using a two-stage sensorimotor inference model; I went through the details, and they matter a lot, but that's the basic idea. And then, in total, the neocortex contains thousands of parallel models that are all modeling the world:
B
surprisingly high-capacity models, which resolve uncertainty by associative linking and by movements of the sensors. There are a couple of things I should point out that we didn't do, and they're very big ones. Objects have behaviors. Now, I point out that everything I've talked about so far is really about the 'what' pathway; we haven't been talking about the whole cortex, we've been talking about how the 'what' pathway models structure and so on. And when I talk about behaviors in the 'what' pathway,
B
I'm talking about behaviors of objects themselves. So my laptop has a behavior: the lid can open and shut, and I know that. Also, if I touch keys, they move. I know that this thing has behaviors too: if I push this button, something happens. Objects have their own set of behaviors. We have to add that into the model, because it's not just the shape of an object; it can change. And here is the way I think we're going to model behaviors.
B
If you think about the model of objects as features at locations: those features can move in the object's space (that's what happens if I'm opening a laptop lid), or the features can change at a particular location. So if I bring up my cell phone while it's on and I touch something on the screen, new features appear at the same locations where other features were before. So the proposal is that modeling the behaviors of objects is modeling how features move and change at locations. We have to do that; we haven't done that yet.
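In the features-at-locations picture, a behavior could be sketched as a transformation of the object's feature map. This is purely my illustration with hypothetical names; it is exactly the part the talk says has not been built yet.

```python
# An object model as a mapping from locations to features (hypothetical names).
laptop = {"hinge": "lid-closed", "key-K": "key-up"}

def open_lid(model):
    """A behavior: the feature at one location changes; the rest stay put."""
    changed = dict(model)
    changed["hinge"] = "lid-open"
    return changed

opened = open_lid(laptop)
print(opened["hinge"], opened["key-K"])   # lid-open key-up
```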
B
We also need a detailed model of the hierarchy, including the thalamus. I didn't talk about the thalamus here; we spent a lot of time today talking about it, and we have hypotheses about what it's doing and why we need it, but we have to finish that out. And, as I also already mentioned, we need to build the complementary 'where' pathway. This is not a motor model; we haven't described anything about how we generate behaviors, why I might move, or how I reach for something. I'm not talking about that at all. I've just talked about
B
how a 'what'-pathway column learns the structure of objects. Now, I want to put in a plug here for collaborations. There are many testable predictions in this model. In some sense it's a greenfield, because we're proposing that cortical columns, even primary ones, are doing a hell of a lot more than most people
B
think. And so we've spent a lot of time this week talking to various labs about how we could do that, and we welcome that; we'll have discussions, we can talk on the phone or here today, and so on. And we're always interested in hosting visiting scholars and interns; we have a couple right now. So if you want to come spend some time in sunny California, even for a short period of time: we have people come just for a couple of days who want to get immersed in what we do.
B
We like having visitors like that. This is the team we have; on the left there are 12 people. I want to call out two specifically: Subutai Ahmad, who is with me right here; we've been partners for 12 years, and he's critical to the whole thing. And Marcus Lewis is one of our scientists; he really helped understand the interaction between layer 4 and layer 6a, and layer 5 and layer 6b. I didn't really talk about his work here, but it's subtly underlying everything we're doing, and there are some real insights in it.