From YouTube: How Sensorimotor Inference Works (Part 3)
Description
Now some details about how sensorimotor inference works in HTM theory.
You should have some background in HTM before understanding these videos. See http://numenta.org/htm-school/ for more videos explaining basic HTM theory.
Music: "Holy Roller" by YACHT
(used with permission from Free Music Archive)
A: So again, let's go back to this very first slide in the deck. This is just showing one version of the circuit, with four layers. Usually we just talk about the two-layer input/output model, and right now I'm going to go down to a more detailed version of that. Okay, so on this slide we're just going to look at one of these pairs: the two-layer input/output model.
A: So we had two layers of cells; now we're going to introduce another thing: we're going to talk about multiple columns. And by columns I mean the cortical columns, not the mini-columns. A cortical column is about half a millimeter across; it's just bigger.
A: A cortical column is a collection of mini-columns; it's just bigger in extent. Okay, so in this picture here we show three columns: column one, column two, column three. The input layer in each of these columns is equivalent to the spatial pooler, right: we have the mini-columns, the same mechanism, but the difference now is we have this allocentric location on the object. So we have these three columns, and you can think of them like the tips of three of your fingers.
A
So
as
a
comrade
setting,
you
know
these
three
fingers
and
each
finger
touches
something
and
each
being
get
some
info.
Now
it's
it's
really
good
thing
about
in
terms
of
touch
because
you
can
net.
You
understand
that
three
fingers
moving
somewhat
independently
they're,
not
touching
the
same
far
server
and.
A: Each fingertip is one column, and the important thing is that the columns communicate in the output layer. The output layers have these long-distance connections running across multiple columns, maybe spanning something like 16 columns in the cortex. This is a very well-documented feature of cortex, these long-distance connections.
A
What
way
do
you
think
about
it?
Each
finger
is
getting
some
information
about
us,
so
my
index
fingers
touch
this
coffee,
capsule
I'm
doing
this
feature
at
this
location
that,
on
its
own
may
not
be
sufficient,
identify
the
coffee
cup
probably
isn't
I
can
I
could
evident
coffee
cup
I
move
my
finger
in
multiple
locations,
vaginally
I'm,
not
seeing
right
I'm
just
reaching
in
this
black
box
and
touching
things
yeah.
A
Can
do
this
like
this?
It's
kind
of
like
looking
at
you
over
to
the
straw.
I
sure
is
the
world
it
because
I
got
to
move
a
lot
by
little
eyes,
but
I.
Do
it
I
know
that's
man,
this
your
hyung
Connect
laptop
John,
so
that's
the
same
as
touching
with
one
finger,
but
often
you
get
to
touch
with
multiple
sensors.
At
the
same
time
so
my
hand,
my
three
fingers
touching
it
I
would
get
three
different
feature:
location
representations
on
the
object
and
essentially
to
get
it
down.
A
What
layer
the
input
layer
is
doing
is
representing
not
just
features
but
features
and
location
pairs.
It's
forming
a
sparse
representation
of
a
feature
at
a
particular
location.
Just
like
we
do
sparse
representations
and
simple
memory,
us
or
a
feature
a
particular
location,
see
so
at
the
feature,
a
particular
location.
It
doesn't
identify
the
object
necessarily.
We
have
to.
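As a toy illustration of that pairing (my own sketch in Python, not Numenta's code; the encoding scheme, cell counts, and names are all invented), the point is simply that the same feature sensed at two different locations yields two different sparse codes:

```python
import random

def sparse_code(feature, location, n_cells=900, n_active=20):
    """Toy sparse code for a (feature, location) pair: a small set of
    active cell indices chosen pseudo-randomly but stably per pair."""
    rng = random.Random(f"{feature}@{location}")  # stable seed per pair
    return frozenset(rng.sample(range(n_cells), n_active))

# The same feature at two locations produces distinct representations:
assert sparse_code("rim-edge", "top") == sparse_code("rim-edge", "top")
assert sparse_code("rim-edge", "top") != sparse_code("rim-edge", "side")
```

Because the location enters the code, downstream layers can distinguish "rim edge at the top" from "rim edge at the side" even though the feature itself is identical.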
A
Is
a
pooling
their
the
pooling
layer
represents
the
object
itself,
and
so,
if
actually
learns
to
associate
a
specific
representation
for
popping
up
with
a
local
feature.
Location
pairs.
I
shall
probably
happen.
Consists
of
these
features
at
the
location
right
I
said:
there's
nothing
magic
about
it,
but
at
any
point
on
each
finger
only
gets
partial
input,
and
these
finger
may
not
have
enough
information
to
identify
this
right,
but
they
all
trying
to
do
it.
So.
A: What we do with the long-range connections in the output layers is let them settle on an answer that's consistent with all the inputs, and they do it very quickly. So imagine the input on one of my fingers says it's object A, B, or C; the input on my other finger says it could be object A, R, or S; and then there's the third one.
A: What's going to happen is they're each going to form, in the second layer of each of those columns, a union of those options: "I don't know what it is; it could be A, B, or C." But the long-range connections essentially hold a vote, instantaneously, very quickly, on what's consistent across all of them, and we settle on that. It's all just voting.
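The settling step can be caricatured as a set intersection. This is a minimal sketch under invented names; the real model votes through learned long-range synapses between cell populations, not through explicit sets:

```python
def vote(column_guesses):
    """Keep only the objects consistent with every column's union."""
    result = set(column_guesses[0])
    for guesses in column_guesses[1:]:
        result &= set(guesses)  # long-range "vote": intersect candidates
    return result

# One finger allows A, B, or C; another allows A, R, or S; a third allows A or W.
consistent = vote([{"A", "B", "C"}, {"A", "R", "S"}, {"A", "W"}])
assert consistent == {"A"}  # the columns settle on the one shared answer
```

A single ambiguous column would be stuck with its whole union; three columns voting collapse it in one step.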
A
This
is
model
Val
using
our
a
shim
there
onto
this
is
not
like
totally
like
provocated
sure.
So
this
says
why
you
can
identify
objects
very
quickly.
If
you
have
multiple
fingers
touching
something
or
Y,
you
can
identify
something
very
very
quickly
if
I'm
looking
at
it
with
my
full
web
I
can
just
flash
on
the
in
front
of
me
very
often,
I
can
recognize
it
because
all
the
individual
patches,
the
retina
image,
passes
the
v1
or
arch,
basically
modeling
the
same
object
and
they're
all
things.
A
So
this
and
then
what
happens
now
is
that
the
Z's
they
oscillator
the
object
layer
which
is
stable
right.
Imagine
on
the
alkylation
cooling,
where
the
cooling
layer
it's
it's
saying
this
is
a
coffee
cup.
Every
time
I
move
my
finger.
The
output
layer
is
still
as
stable
as
this
coffee
cup,
but
the
insulator
is
changing
every
time
right.
A
But
once
I
know
that
object
unless
I
know
what
is
accomplished
up
news
is
there's
no
doubt
about
it,
unless
one
of
the
inputs
could
be
inconsistent
sure.
So
what
happened
is
if
you
look
at
this
diagram
again,
the
the
coffee
cup
is
projecting
back
down
to
the
insulator
and
it's
saying,
but
as
a
coffee
cup,
these
are
all
the
future
location
pairs
that
you
might
find
on
a
coffee
cup.
Well,.
A
Where
this
edges
facing
up
a
model
of
expansion,
a
cooling
layer,
so
the
output
layer
position
is
associated
with
me-
are
20
different
input
features
future
location
pairs.
That's
a
definition
of
the
object
right
and
any
one
of
those
might
be
occurring
with
so
which
it
would
projects
back
and
it
depolarizes
the
input
layer
and
says
these
are
the
union
of
all
the
future
location
pairs
you
might
find
on
this
object
right
now.
A
If
I
know
the
location,
because
I'm
about
to
move
my
finger
and
so
I
actually
miss
I,
I
told
you
I
know
the
new
location,
the
new
expected
location
because
I'm
moving
my
finger
right
now
and
I,
the
new
album
interpretation,
let's
say
magically
no
map
yeah,
so
now
I,
say:
okay,
I
note,
the
object
is,
is
just
consist
of
these
future
location.
Pairs.
I
also
know
whether
the
new
location
I'm
going
to
be
on
and
when
you'll
end
up
with
a
jewel
upon
the
prediction
of
the
interlayer
of
exactly
in
put
your
spectrum
right.
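That narrowing can be sketched like this (a hypothetical data layout where an object is just a set of (feature, location) pairs; the real model expresses the prediction as depolarized cells, not set lookups):

```python
def predict_input(object_pairs, next_location):
    """Features the input layer should expect at the next location,
    given the object the output layer has settled on."""
    return {feat for feat, loc in object_pairs if loc == next_location}

# An invented "coffee cup" made of feature-location pairs:
coffee_cup = {("curved-surface", "side"), ("handle", "side"),
              ("rim-edge", "top"), ("flat-surface", "bottom")}

assert predict_input(coffee_cup, "top") == {"rim-edge"}
assert predict_input(coffee_cup, "side") == {"curved-surface", "handle"}
```

Feedback alone gives the union over the whole object; intersecting it with the expected location is what turns a broad prediction into a specific one.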
A: Every column is learning as much as it can, within its limits. Each column is trying to learn the entire three-dimensional structure of objects, and they're all running in parallel. So in a region, if I have, say, a hundred cortical columns, there are a hundred models of the same objects.
A
I
could
isolate
recognize
if
I
have
no
country
input.
I
could
recognize
out.
This
is
one
finger.
Get
the
it
be
feature
was
unique.
If
it's
not
unique,
then
I
have
to
do
it
with
two
multiple
touches
right.
It's
generally
looking
through
a
straw.
I
could
recognize
anything
but
destroy
if
it
was
unique.
I
hope
that's
not
known.
I
got,
but
it
is
not
unique.
I
have
to
look
at
all
thing
right
as
you
move
around.
So
when
you
have
this
choice
of
balance
between,
you
know
how
many
columns
are
asked
to
be
touching.
A
B
A
A: Now we're getting to the last thing I wanted to show. This is the basic idea, the basic circuit, and we've tested it extensively. Let me just show you a few other slides. Look at the next figure in the slide deck here, just to remind you that we're building all of this using HTM neurons. So, for example, we didn't have to reinvent the neuron.
A: New neuron models were not required. What we've done, just to show you, is this slide called "simulations of convergence, time versus number of columns." It's showing that we can model these things, that we can model this convergence. I won't go into detail, but look at the section at the top of the diagram: we've created a bunch of virtual objects.
A
These
are
so
simulated
object
that
have
feature
that
location,
so
it
would
design
those
two
that
there
they're
similar
enough,
that
you
can't
distinguish
them
very
easily,
have
to
touch
multiple
places
on
the
distinction
right.
They
all
this
similar
features,
and
you
know
any
particular
feature
you
touch
on
locations,
lack
of
unique
and
so-
and
this
is
a
be
section
here-
we
show
how
long
it
takes
for
a
single
column
like
a
single
finger.
You
need
to
recognize
one
of
these
objects,
and
this
is
sort
of
the
activity
pattern
in
the
output
layer
over
time.
A
Yeah
well
I
want
this
Randall
better
point
to
them.
I
actually,
don't
know
how
or
you
a,
but
we
looking
at
nnd
were
classes,
the
number
of
cells
that
are
active
at
one
time
so
showing
the
Union
over
up
with
the
beauty
of
objects.
So
you
see
the
activities
a
lot
in
the
beginning.
I've
got
just
touch
once
I
have
a
union
I,
don't
know
what
is
a
touch
again
again
and
again
and
then
at
some
point
it
gets
down
to
the
after
I
think
it's
about
seven
or
eight.
A
On
average
it
says
that
I
know
what
this
is
right
and
then
it's
that
it's
locked
it
yeah.
If
you
do
three
columns
at
lunch,
you
still
initially
in
the
right
first
touch,
you
might
be
ambiguous,
but
there
are
quickly
a
lot
to
compaction,
much
quicker,
just
getting
more
data,
well
you're,
getting
multiple
he's
like
you're,
touching
local
public
to
the
object
at
once
and
you're
voting
and
collecting
there.
So
you
can
innovate
over
time
or
you
can
integrate
over
multiple
centuries
right.
Okay,
I.
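The two regimes, integrating over successive touches versus over simultaneous sensors, can be mimicked with a toy simulation. The objects, sizes, and seeds below are invented for illustration; they are not the slide's data:

```python
import random

def touches_to_converge(objects, target, sensors, order):
    """Touches needed until only one candidate matches everything sensed
    so far; each touch samples `sensors` not-yet-visited locations."""
    remaining = list(order)
    candidates = set(objects)
    touches = 0
    while len(candidates) > 1 and remaining:
        sensed, remaining = remaining[:sensors], remaining[sensors:]
        touches += 1
        candidates = {name for name in candidates
                      if all(objects[name][loc] == objects[target][loc]
                             for loc in sensed)}
    return touches

rng = random.Random(0)
# 50 objects drawn from a tiny feature alphabet, so single touches are ambiguous:
objects = {f"obj{i}": {loc: rng.choice("xy") for loc in range(12)}
           for i in range(50)}
order = list(range(12))
random.Random(1).shuffle(order)  # one fixed exploration order for fairness

single = touches_to_converge(objects, "obj0", sensors=1, order=order)
triple = touches_to_converge(objects, "obj0", sensors=3, order=order)
assert triple <= single  # more sensors per touch, fewer touches needed
```

With one sensor the candidate union shrinks one location at a time; with three sensors per touch the same evidence arrives in a third as many steps, which is the shape of the convergence curves on the slide.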
A: Exactly, yeah. When you're given a new object you haven't seen before and you want to learn what it looks like, typically what you do is hold it in front of you, turn it around, look at it, touch every part of it. Every part of your V1, every part of your V2, is learning simultaneously, independently and together, what that object is. You're not building one model of the object; you're building hundreds of them. One thing you might ask yourself is: is this really going to work?
A
What
the
capacities
are
system
like
this,
so
this
slide
here
called
simulation
results
passages
there's
a
lot
of
assumptions
here,
but
we
want
to
make
sure
that
you
know
this
is
real.
This
could
actually
work
that.
How
much
could
the
neurons
in
single
column
actually
learn?
It
turns
out
if
you
have
a
single
column
of
reasonable
dimensions,
it
would.
A: ...a number of mini-columns that's kind of consistent with a half-millimeter cortical column, so you have 900 cells in the input layer, which is pretty small, and 4,000 cells in the output layer. Under those kinds of realistic assumptions, which are smaller than we typically model, a single column can learn somewhere between 200 and 300 three-dimensional objects.
A: And if you combine multiple columns together, you get more.
A: There is a limit tied up with it; it's just a question of capacity, of how many synapses you can actually fruitfully employ on a neuron, so you kind of run into an asymptote. But the point is that even a single column, a very small patch of a sensory region, can learn on the order of several hundred objects. It's not ten thousand, but we don't need that; we're still going to have a hierarchy if you're going to do things in our view. But those are a few of the numbers.