Description
Marcus is further investigating Capsules and how they might inform HTM (and vice versa?)
So yeah, I mean, the topic title is representing pose, pose being location and orientation. For example, the pose of this is changing when it rotates. I'll also throw size into this, when I get to it. The point is, sometimes I'll also talk about the size of the object; its scaling is also kind of represented, though strictly speaking that's not pose. Sometimes I'll just be sloppy and say "pose" to describe all of it. This picture is basically using notation from Hinton's 1981 papers.
Yeah, so ours, yes: spatial relationships, the trees. This is pretty much exactly a drawing of our compositionality model. One way we might interpret it: this is a location in the space of a child object, and this is the location in the space of the parent object.
Very closely. It describes capsules conceptually, but capsules make an additional change: they don't have this neural population that represents the spatial relationship between things out there. No population represents that; they pull it inside. I've taken this letter and stuck it inside the triangle, inside the flow, just to say that it creates a similar transformation circuit, one that takes one of these and gives you one of these, but they've removed that population.
What caused me to walk away and think about this was: why did they land on something a little different from us? Why did we land on this idea that you need some population representing that, where they ended up somewhere else? Because there are so many other conclusions we have in common.
Well, you could just continue keeping all of that in your viewer-centric coordinates. You could just use the initial buffer: okay, there's a chair, there's a robot right there. I take a step forward, I update all of those. You could keep those temporary buffers all along.
An example I would have used there, rather than that one: the stapler, or these glasses right now. If this is a spatial arrangement of, say, three components, then when this moves, what you're doing is updating this. And it seems nice to be able to store sequences.
So, okay, I've pretty much said everything here. The topic is representing pose in kind of three different ways: with numbers, with neurons, and with capsules. First, to dive into the numerical way of talking about this: one straightforward, common way of representing spatial relationships is with a matrix. This is a four-by-four affine transformation matrix, and I'll just break down its components.
One of these components is the 3x3 matrix inside of it: the orientation, which I've described as a rotation matrix, plus some scaling. The scaling indicates the size of the object; the S is this part, and it's optional. If you don't want the system to have a notion of, say, a giant cylinder versus a small cylinder, you can remove the S. Then there's the 3D location, which is like an XYZ of the sensor in the object's space.
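A minimal sketch of that four-by-four layout, assuming a yaw-only rotation for brevity (the function name and arguments are illustrative, not anything from the discussion):

```python
import math

def make_pose(yaw, scale, location):
    """Build a 4x4 affine pose matrix: a 3x3 rotation R scaled by a
    uniform size factor S, plus an XYZ translation t, laid out as
    [[ S*R  t ]
     [ 0 0 0 1 ]]."""
    c, s = math.cos(yaw), math.sin(yaw)
    # Rotation about the z-axis only, to keep the sketch short.
    r = [[c, -s, 0.0],
         [s,  c, 0.0],
         [0.0, 0.0, 1.0]]
    m = [[scale * r[i][j] for j in range(3)] + [location[i]] for i in range(3)]
    m.append([0.0, 0.0, 0.0, 1.0])  # homogeneous bottom row
    return m

pose = make_pose(yaw=math.pi / 2, scale=2.0, location=[1.0, 2.0, 3.0])
```

Dropping the optional S is just passing `scale=1.0`.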
If you take those variables and choose to express them as a matrix like this, then it's really straightforward to operate on. I'm just laying out the problem of what pose is, or one straightforward way to represent it, partly so I can get to this fun aside.
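"Straightforward to operate on" can be made concrete: composing a child-in-parent pose with a point in the child's space is just matrix multiplication. A sketch in plain Python, with an illustrative translation-only example:

```python
def matmul(a, b):
    """Multiply two 4x4 matrices, i.e. compose two poses."""
    return [[sum(a[i][k] * b[k][j] for k in range(4)) for j in range(4)]
            for i in range(4)]

def apply(m, p):
    """Apply a 4x4 pose to an XYZ point (homogeneous coordinate 1)."""
    v = p + [1.0]
    return [sum(m[i][j] * v[j] for j in range(4)) for i in range(3)]

# A translation-only pose: the child's origin sits at (1, 0, 0) in the parent.
child_in_parent = [[1, 0, 0, 1],
                   [0, 1, 0, 0],
                   [0, 0, 1, 0],
                   [0, 0, 0, 1]]
# A point at (0, 2, 0) in the child's space lands at (1, 2, 0) in the parent's.
assert apply(child_in_parent, [0.0, 2.0, 0.0]) == [1.0, 2.0, 0.0]
```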
And each of these is really good at one thing. The one that represents where the sensor is in the reference frame of the object is very much inspired by grid cells: your finger is the rat, or your eye is the rat, and in this one the object itself is the rat. So as the object rotates, its head direction cells are what changes.
Yeah, so one quick statement that can be made about these: this location of the sensor is good for path-integrating the sensor, that is, for taking movements of the sensor and integrating them, while this one is good for path-integrating the object. If you see the object rotate, or see it shift around, updating this one is really simple, whereas each of them is awkward in the opposite case. With this one, if you see the object rotating, you're like: oh no, now I've got to trace big circles around it.
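The asymmetry can be made concrete. In sensor-in-object coordinates, integrating a sensor movement is plain vector addition, while an object rotation forces every stored location through the inverse rotation (the "big circles" update). A sketch assuming a z-axis rotation:

```python
import math

def rot_z(theta):
    """3x3 rotation about the z-axis."""
    c, s = math.cos(theta), math.sin(theta)
    return [[c, -s, 0.0], [s, c, 0.0], [0.0, 0.0, 1.0]]

def matvec(r, v):
    return [sum(r[i][j] * v[j] for j in range(3)) for i in range(3)]

# Sensor location, expressed in the object's reference frame.
sensor_in_object = [1.0, 0.0, 0.0]

# Sensor movement: just add the object-frame displacement. Trivial.
moved = [a + b for a, b in zip(sensor_in_object, [0.0, 0.5, 0.0])]

# Object rotation by theta: the sensor's object-frame location must be
# swung through the inverse rotation, sweeping a circle around the object.
theta = math.pi / 2
swung = matvec(rot_z(-theta), sensor_in_object)
```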
This one does it only using twelve numbers, while the other just has to shove matrices around. Now, moving into the neuron world: each of these variables, the scaling factor, the rotation matrix (which is an orientation), and the location. I'm mostly going to say things that are familiar to everyone, but let's start with the unfamiliar one. Scale is some sort of scalar value. It could be something like our scalar encoder, or a firing rate, some way of encoding a scalar value.
Yes, it's a little uncertain how a scalar value is represented. It might be a firing rate, it might be some frequency, it might be some population; let's go with binary. Okay, now, I called this "rotation" because I wanted it to be straightforward how to turn it into a matrix, but it's an orientation, and we know different ways to represent orientations.
What I think is most realistic is probably a point on the surface of a sphere, which gives you the reference direction, plus a ring; we've talked about that a little. It could also be a point in the volume of a sphere. There are different options, and the point is each is going to require a relatively large number of cells. Then there's representing the location, again with many different options. I just called them option one and option seven, to make it clear these aren't the only ones.
So we've talked about the paper on bioRxiv about how multiple 2D attractors could do it, but each of those needs a relatively large number of cells. Another way is appealing for the thalamic scaling hypothesis, the idea that the thalamus is kind of scaling your input, making the learning problem much easier for the cortex.
For that you want to split up location a little differently. It isn't quite this coordinate system, but the point is there are other ways you can decompose it: like the direction from the object to you, and then its scale, which is proportional to its distance and its size. The scale is like a scaling vector, what you would want to tell the thalamus if you want the same thing to keep arriving at cortex as this gets closer.
So now I've said how to represent this with numbers and how to represent it with neurons. Okay, how do capsules deal with it? Well, capsules allow their quote-unquote neurons to have scalar values, so they can represent size and pose with twelve neurons: the three-by-three matrix plus the three-by-one location gives you twelve. Whereas our model assumes it takes many more neurons than that to represent a pose, and the size anyway.
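The twelve-number arithmetic, as a sketch (the helper name is hypothetical):

```python
def capsule_pose_vector(rotation, location):
    """Flatten a 3x3 rotation/scale matrix plus a 3x1 location into the
    twelve scalar 'neuron' activations a capsule would hold."""
    return [rotation[i][j] for i in range(3) for j in range(3)] + list(location)

identity = [[1.0, 0.0, 0.0],
            [0.0, 1.0, 0.0],
            [0.0, 0.0, 1.0]]
vec = capsule_pose_vector(identity, [4.0, 5.0, 6.0])
```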
Size is not, strictly speaking, part of pose. But we were built on different underlying assumptions, and that's caused us to go in different directions. Here I'm going to talk about the design space, the circuit space, the algorithm space, that comes from different answers to a couple of questions. One of those questions is: can pose be represented and computed succinctly?
Yeah, the key word is "succinctly": does it take twelve neurons, or does it take thousands of neurons? I'm just guessing that it's thousands, but that seems like a good guess. If you, the designer, are saying, okay, I have these biological neurons and I'm going to design a really good system with them, then you have this dial: can it be expressed succinctly, or does it require lots of neurons?
Which I'm not totally discounting; I'm just laying out the space of what direction you would go based on these premises. So yeah, if pose can be represented and computed succinctly, then capsules seem like a logical direction, whereas if it requires lots of neurons, it's more like our model. I don't have a firm answer on this question. Maybe neurons do a lot more than we give them credit for; maybe there's more going on inside of them.
Maybe the timing of their spikes actually encodes a lot of information; maybe a surprisingly small number of neurons can actually do a lot. I don't know. And if they can't, then you arrive at something like our model. And then the interesting thing is that, like, different...
Each one of them represents a basic feature, a basic component. One of them might be a cylinder capsule, say, and it has its own set of cells that represent how that cylinder is oriented and scaled and such. Basically, every component, every feature, every object has its own dedicated set of pose cells. Which, of course, means you have to have a very fluid notion of what an object is. Maybe an object is actually a population of multiple capsules.
So the way I would think of it with capsules: there will be a few layers of them, and up top, ranked above the other capsules, there are individual capsules per object. That's a hack, and gross, but everything below is then learning a set of reusable components that get rearranged in different ways. And yes, there's a fixed number of capsules. It's not like our SDRs, where you can just generate them on the fly; the system is going to learn its features.
Yeah, so this next one is basically the flip side of what I said before. A capsule is almost exclusively just a set of cells representing the pose of a particular feature. It doesn't even really represent the feature itself, because those cells are already devoted to it.
In this design space of algorithms, you have some computational unit: here it's the capsule, here it's perhaps the cortical column. Inside both of them, that computational unit is representing a pose. Here it's very small; here it's very large. The other cells in the capsule are almost non-existent, whereas here the other cells in that unit are...
So the capsule essentially doesn't even represent, quote-unquote, "what is out there"; it only represents "where is it", because it's already devoted to the "what". The "what" is kind of fixed. Maybe that's a little bit of a stretch, but I'm just saying the capsule's only notion of "what is out there" is just a probability, whereas our notion of "what is out there" is a distributed representation of something. Okay, that's what I'm trying to assess; there is some truth here.
Can you quickly learn these? Let's put it down here: suppose you know this and you know this, can you quickly learn this? Is it a trivial operation, like it is if you assume cells have scalar values and you just set up weights accordingly? If it is, then it makes sense to learn these on demand. You learn them like: okay, this spatial relationship between this component and this component on this child object...
...and this parent object: I'm going to learn it, I'm going to learn whatever circuit is required, and just make that your entire mechanism. Whereas if it's hard, as we've generally assumed it is, if it's elaborate, then you want this more general-purpose, flux-capacitor thing, because what's going on in this triangle is pretty complicated.
And that's the simplest version: the displacement cells. A displacement routes every grid cell to a different grid cell, essentially, and it only gets more complicated as you bring in orientation and scale. So you either want to learn this early in life, or you might even want it genetically determined, or some hybrid of the two.
Yeah, and then just to state what's maybe very clear already: in this one, the spatial relationships are being represented in weights, or connectivity, something in the connections, whereas over here they're being represented by a cell population, for example displacement cells. For both of these questions I'm not taking a strong stance; I'm almost trying to say things that everyone in every camp would agree on. Maybe neurons do more than we give them credit for.
I think the more interesting question it immediately raises is about the future of machine intelligence: at what point do we say, hey, we've stopped using neurons and distributed representations or whatever, and can we start doing it in some cheating way? Because it seems pretty clear in my mind that the biology is on to something.
You know, like melodies, varying melodies; your example is the glasses, the movements of objects. These are really fundamental, at the core of all generalization. And I was trying to pick apart: is that exactly just a consequence of putting it into the weights, or is it a consequence of having the single-cell representation?
I do find it a little useful to do. I mean, I knew matrices and did this stuff before, but just spending time thinking like this: okay, this is what's going on at the algorithmic level, and this is one way of describing what a neural population does, and then asking, how are the neural populations accomplishing that? I do find it a little useful to think at this level a bit, just making sure I'm keeping it in mind.
Yeah, one day I will present again, and I'm going to tell you how the image on the retina doesn't depend on the size of the object. I won't try to explain it all right now. The point is, thinking in these terms made me realize: all capsules have to do to predict what they're going to sense is just ignore the S, and everything will work out. That was a terrible presentation of the point, but...
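A toy version of that "ignore the S" observation, assuming a pinhole-style projection where apparent size is proportional to object size over distance (the function name is illustrative):

```python
def apparent_size(object_size, distance):
    """Pinhole-style apparent size on the retina: size / distance."""
    return object_size / distance

# Scale the object's size and its distance by the same factor S:
# the retinal image is unchanged, so a capsule predicting its own
# input can drop S from the computation entirely.
s = 3.0
assert apparent_size(1.0, 2.0) == apparent_size(1.0 * s, 2.0 * s)
```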
One of the capsules versions: this is an affine transformation matrix, encoding scale, translation, and rotation, but you can imagine other transformation matrices that are much more general. If you allow yourself more than twelve neurons and nonlinear transformations, you could do a lot of fun stuff, like folding and stretching. I think in one of the papers they're representing a much more general transformation.
And I meant to mention: here I'm describing the kind of principles underlying capsules, the motivation. The motivation of capsules was that they would represent something like this, but then they intentionally leave it very flexible, training via backpropagation and letting it learn whatever it wants to learn. Anyway, the point is, I'm describing an idealized version. It might not even be ideal.
And it's like that with capsules: they didn't just stop there. They spent the next five years trying to make this work, because there's really no royal road. Although conceptually you're doing the same thing with neural populations, you might just run into insurmountable problems if you don't, you know, use distributed representations and a full representation of...
When you say noise, you're talking about the cells' values being corrupted? I don't know if they talked about that. If you're talking about occlusion, I don't have a direct answer to that, but they do have really good segmentation: for example, if you overlay digits on top of each other, being able to segment the different digits, being able to see...
At least for some people, I think they know the neurons aren't literally representing these digits; they know the neurons are actually doing something a little more like this. Maybe each of these neurons is actually represented by, like, four neurons down here. The point is, they're saying this is the fundamental algorithm, and somehow biology makes it robust by using multiple neurons, and they don't think too deeply about this.
Are they also working on sparse distributed representations? Maybe that's a foundational element: you have to have sparse representations, and that is a constraint that has to be applied to all future AI systems. If you accept that constraint, then you wouldn't say, oh, I can just get rid of it.
That's one of the core things they admit: a capsule can't do that. The papers are kind of apologetic about this; they're saying that a capsule can't represent, like, there's a pen and there's another pen right next to it.
In order for it to do that, in the same sense that a convolutional neural net segments space into different squares, the pens would have to be in different squares for it to be able to represent them. So they point to the fact that in the actual brain, crowding is a thing: if something's off to the side and you have two objects that are close to each other, you can no longer tell that there are two objects. But the point stands.
Funny, I can rephrase what both of us just said; we said two totally different things. I said: here's one way capsules could solve this, if they could represent a union of poses, then it would work. And you said: we just need to add attention to capsules, and then the problem will be solved for capsules as well.