Description
Matt and Florian will present their interpretation of paper "Deep Predictive Learning: A Comprehensive Model of Three Visual Streams" as described here: https://discourse.numenta.org/t/deep-predictive-learning-a-comprehensive-model-of-three-visual-streams/3076
We're going to get really far, I think, with just the pulvinar part, and we'll talk a bit about this: the hierarchy and the "what" and "where" streams. The big thing is they've created this structure and they've gotten out of it invariant representations of objects that are moving through the system's field of view, and they even have a model of saccades baked into it.
There are deep layers and there are superficial layers; I'll go to the screen for the specifics. One of the main ideas is that they've broken the cortical system up into superficial layers and deep layers, and they've simplified it based on that. So the model I'm going to show you that they created doesn't have L5 and L6a and L6b and L2/3 and L4. They've got a model of the superficial layers, they've got a model of the deep layers, and they've got a model of the pulvinar. So.
They originally called it the pulvinar because it didn't seem to get sensory input, only input from the cortex. But now the pulvinar is understood to be broken into multiple sub-regions that individually get input from different regions. So, equivalently... physically they sit right next to the LGN, I thought they were, but...
Yeah, and so that first naming was done a long time ago. The pulvinar is not like that. Right, one gets input from the sensory periphery, but that was one of the things they're really emphasizing: no, it's really just the same thing, and we only treat it differently because it's the one getting the sensory input, but that's just the first stage. Really, we don't...
They sit up there next to each other, and for historical reasons they grouped a whole bunch of them together and called them the pulvinar, and they called the other one the LGN. But actually they're just a bunch of nuclei, and this is a nomenclature issue: they decided to group a bunch together and call them the pulvinar for the reason I mentioned before, which is that one gets input from the eyes and the others get input from the cortex. So they said these must be different. But now, in the new interpretation...
This is a catch-22; you have to jump in somewhere. So, assuming we're in the alpha cycle: in the negative, or minus, phase, the predictions from the deep layers are being projected to the thalamus, and then the plus phase is the new input coming up from below and being compared to that prediction. And then there is a temporal-difference mechanism that projects this error signal back to cortex through...
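The minus/plus phase comparison just described can be sketched in a few lines of Python. This is purely an illustrative reconstruction of the idea, not the paper's code (their model is built in the Leabra framework); `W_pred`, the array sizes, and the linear prediction are assumptions made for the example.

```python
import numpy as np

rng = np.random.default_rng(0)

def alpha_cycle(deep_activity, W_pred, sensory_input):
    """One ~100 ms alpha cycle over a pulvinar-like 'blackboard'.

    Minus phase: the deep layers project a prediction onto the thalamus.
    Plus phase: the actual bottom-up input drives the same units.
    The temporal difference between the two states is the error signal
    sent back to cortex.
    """
    minus_phase = W_pred @ deep_activity   # prediction drives the pulvinar
    plus_phase = sensory_input             # ground truth drives the pulvinar
    error = plus_phase - minus_phase       # implicit, temporally coded error
    return minus_phase, plus_phase, error

# toy dimensions: 10 deep-layer units predicting 5 pulvinar units
deep = rng.random(10)
W = rng.random((5, 10)) * 0.1
sense = rng.random(5)
minus, plus, err = alpha_cycle(deep, W, sense)
```

The point is that the same pulvinar units carry the prediction first and the ground truth afterwards, so the error never needs its own dedicated cells.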
So the plus phase would be the final quarter of the cycle, and that's where the new input comes in, and there's a difference that can then be communicated to cortex from the thalamus. The thalamus isn't computing; the thalamus is, as they call it, a blackboard or a projection screen. It's just a place for the cortex to project the predictions coming from its deep layers and compare them, because it's predicting what it thinks is going to be coming next from the L5 intrinsic bursting (5IB) neurons from below. So one way to think about this is as part of the hierarchy, but there's also sensory input too. So we're predicting not only what's coming from below, we're also taking sensory input here; that's these deep layers, so.
You have to have some kind of architecture that allows you to compute the differences between predictions and feed-forward information. One way to do that is to have the projection screen, a scratchpad: a place where you can put out your prediction and have that be the same place that also receives the feed-forward, the actual ground truth. And the way you're then capable of computing the difference, the error, right, without having error neurons, is that you simply compute a temporal difference, and for that we need some kind of rhythmic signal, which is...
There are no units whose job is to represent the difference. When we call it a temporal difference, you're basically saying: I'm trying to calculate the difference between these two, and I'm going to do it by assuming there's a sudden shift between the predicted and the actual. It's not a difference over time in the usual sense; it's just a difference, but I'm using time as the biological mechanism to accomplish it, yeah.
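One way to make the "time as the mechanism" point concrete is a contrastive-Hebbian-style weight update: the same synapse sees minus-phase and plus-phase activity in succession, and learning is driven by their difference. The paper's Leabra models use a related but more elaborate rule (XCAL); this simplified form is only a sketch.

```python
import numpy as np

def chl_update(x_minus, y_minus, x_plus, y_plus, lr=0.1):
    """Contrastive-Hebbian-style weight change.

    No unit ever represents the error explicitly: each synapse sees
    pre/post activity in the minus (prediction) phase and then in the
    plus (outcome) phase, and the difference *in time* drives learning:
        dW = lr * (y_plus x_plus^T - y_minus x_minus^T)
    """
    return lr * (np.outer(y_plus, x_plus) - np.outer(y_minus, x_minus))

# if the prediction already matches the outcome, nothing changes
x = np.array([1.0, 0.0, 1.0])
y = np.array([0.5, 0.5])
dW = chl_update(x, y, x, y)   # identical phases -> zero weight change
```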
When you think about that, you have to think about another really important aspect of this model, which is that bidirectional connectivity all over the place is essential for this model to work. We've got bidirectional connectivity between regions in the same layer, between regions in different layers, from superficial to deep layers, from deep to superficial layers, and it has to be that way. When I get to the models they construct, you'll see how they arranged all that when they created their vision model.
...while these cells are changing, or these cells are changing, because that's what invariance means, right? It means the input is changing, but I know that underneath it there's a stable concept that's not going to change. And I don't think this is the right diagram for this, because also...
Anyway, this is one of the key things about how the cortex builds invariant representations, and what it means to create an invariant representation of an input; we've been thinking about that for decades. So in this case, you know, it has to occur here, or someplace. So we could say, and people can look around before I step aside... Oh no, every time. We used to say, well, it was solved by the hierarchy. I'd say it's not solved by the hierarchy.
...anywhere in this, except for just one 2D topology of the input space. So as far as representations in here: these, the superficial layers and the deep layers and the thalamus, are all modeled with something called DeepLeabra, which is, at its core, a deep learning model. But it seems, and this is another paper that I didn't read, there's a whole paper on this, but it seems to create a deep learning model out of a bunch of, a bunch of models.
This was a location, like a grid-cell type of theory. Then the map between this and this has to be learned, because this on its own, the location relative to the column, doesn't tell you what you're going to sense; that's part of the definition of the object: what am I going to find at that location, and at what orientation? So just knowing the location, in our model or their model, just knowing the location to predict the new location is insufficient to predict the input. It has to be learned; that's part of what the cortical learning algorithm has to learn.
They made a very simple observation, one important observation which I would agree with, which is that you can think of these as location-plus-orientation predictions, like this: this is where I am on the object. To associate that with a sensory input requires learning. What's modeled here is that, between the locations, in the grid cells and place cells, you associate the location with the sensory input. So you have the sensory input...
Currently, here, this location is going to predict a sensory input down here, but for that to happen it means that these connections, between this and what's expected here, have to be learned. This is a very broad connection; there are ten times as many cells connecting back here, and they're not drivers. And so you would have to say, well, in our model this can't be a location prediction.
If, though, there has to be learning in these synapses, you would have to learn the mapping between the location and what the input is at that point in time. We're doing that up here, back and forth between here; I think it's interesting that they're proposing it down here. But the point is, if you just say this is a sensory prediction, you can't even do that, even if this was an encoding of the sensory input and you projected it down.
In our world, we associate... we have to learn this both ways. When you have a new location, it has to learn to predict what the new input will be, and when you have an input, it has to learn to predict, or derive, what the location is. So these are self-reinforcing, which I think you have to learn, and they are learned, because you can't know the location from an input alone. Okay. So both of those connections go both ways, right.
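A minimal sketch of that two-way learning, with made-up names and one-hot toy vectors: one learned mapping from location to expected feature and a second learned mapping from feature back to location, neither of them wired in advance.

```python
import numpy as np

class BidirectionalAssociator:
    """Hebbian association learned in both directions (illustrative only).

    W_loc_to_feat answers "what will I sense at this location?";
    W_feat_to_loc answers "where am I, given this sensation?".
    Both start at zero and must be learned from paired experience.
    """
    def __init__(self, n_loc, n_feat, lr=0.5):
        self.W_loc_to_feat = np.zeros((n_feat, n_loc))
        self.W_feat_to_loc = np.zeros((n_loc, n_feat))
        self.lr = lr

    def learn(self, loc, feat):
        # same pairing strengthens both directions of the association
        self.W_loc_to_feat += self.lr * np.outer(feat, loc)
        self.W_feat_to_loc += self.lr * np.outer(loc, feat)

    def predict_feature(self, loc):
        return self.W_loc_to_feat @ loc

    def predict_location(self, feat):
        return self.W_feat_to_loc @ feat
```

After a single paired presentation of a location and a feature, each one recalls the other; before learning, both predictions are uniformly zero, which is the "you can't know the location from an input alone" point.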
This diagram is all cortico-cortical, and it shows how they connected up the different regions in their hierarchy. The dotted ones are superficial to deep, and vice versa. The pulvinar comes into this diagram, so this one is a bit busy, yeah. So we've got deep and pulvinar in V1, for example; each pulvinar layer receives 5IB driving inputs from the deep layers it's associated with. So this is a little bit backwards.
This area does have separate sort of pieces of thalamus for different areas, but they can communicate with each other because, as you can see, for example, all those predictions converge onto V1P, right; there is a shared projection screen, right. The argument being that these connections are much wider, whereas the feed-forward ones, the driving ones, are very specific, right. Whereas...
So this is a very important question as to how they were thinking about movement. Individually, we would think about it as: let's start with one problem. One problem: I have a static scene, or a static world, and I'm moving through it, in which case there is movement across the retina; there are saccadic movements which change the input. But now, and that's why I asked: which one are they doing?
Yes, they have both of them in there, and I still find it hard to figure out. That's part of the challenge here, right, because the model is supposed to learn invariant representations, and they're in the higher areas, for the objects. Yet at the same time you also have this one dedicated area, the FEF, the frontal eye field, right, which we know has commands and representations for all these things, but that is part of one of the streams. Yes.
But one of the constraints I put on our work was that I have to be able to understand invariant representations in a static environment. So as I move my eyes or something, I need to be able to build a representation of that thing. I think the representation of the thing I'm looking at is stable while my eye is moving, and that tells you that the brain must be predicting what the inputs are going to be, otherwise it would be surprised all the time.
So you could break the problem apart and say: yeah, there's a static image recognition problem, and by static I don't mean that nothing's moving; your eyes are moving, but the object, or the world, is not moving. Like, I can walk around this building, and as long as nothing is moving...
Your input is constantly changing, but the underlying structure is the same. You have to solve that problem; it's almost all of vision. Then you separately say, hey, but what if something is moving, like a bird flying, how do I see that? Someone dancing, something like that. But you have to be able to do both of those. So the question is: does this model deal with the first problem, which is, hey, I have a static world through which my eyes are moving?
...into the "what" and "where" streams. The "where" stream starts the object recognition process, whatever that means, in the higher levels, and then the higher regions, yes, higher regions, inform the regions in the "what" area, which provide details about the object. I don't have a good understanding of this, honestly; this is the hardest part for me to get, and there's a lot of detail about how...
Right, so you see the, quote-unquote, normal thalamocortical feed-forward from V1 at, sort of, the bottom left, which is essentially just your retinal input, right. So you have an object, and this is actually one of the tasks that they did; they didn't just do things like moving and tracking objects and building representations of the environment, predicting what you're going to see. They also did things with, like, partially occluded objects.
Right, and so what you are seeing here is the pulvinar level, the pulvinar thalamocortical relay cells, the TRC level, and you see that there are two boxes, left and right, right. These are the two states of the pulvinar representation: during the minus phase and the plus phase. During the plus phase it's driven by the feed-forward input, which gets relayed via the pulvinar into both the superficial and the deep layer just above it. This...
Of course, you know, the superficial layers know that the deep layers carry the previous time context, so you should always think of it as prediction first, and then comes the actual information. And so you don't have an error signal per se, but you have an implicit representation of the error if you have access to the difference between these two different inputs that you're getting from the pulvinar: first you're getting one that is a prediction, and then, followed by that, one that is the actual, stable input, right.
Both get the error back. Now, there's a question: why do both get involved? The idea here is that the deep ones are representing the context state, so both the deep layers and the superficial layers would both calculate the temporal difference, both. So, right, look, I know, it's not really clear why you would want them both together. Why...
Each alpha cycle, the V2 deep layer uses the prior hundred milliseconds of context information to generate a prediction, or expectation, over the pulvinar units. That then gets compared with what comes in next via the strong 5IB driver inputs from V1. That's first here; that was the context update here; the system's prediction. They call it a predictive autoencoder.
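A toy version of that per-cycle loop, with invented names and random weights: the prediction for the current cycle is generated from the context stored on the previous cycle, and then the context is updated from the driving input that just arrived. This is only a sketch of why "predictive autoencoder" fits; the real model's dynamics are far richer.

```python
import numpy as np

def predictive_autoencoder_steps(inputs, W_ctx, W_out):
    """Run successive alpha cycles.

    Each cycle's prediction comes from the *previous* cycle's context
    (the prior ~100 ms); the context is then updated by the strong
    5IB-style driver input that just arrived.
    """
    context = np.zeros(W_ctx.shape[0])
    errors = []
    for x in inputs:
        prediction = W_out @ context   # expectation over pulvinar units
        errors.append(x - prediction)  # compared against the driver input
        context = np.tanh(W_ctx @ x)   # context update for the next cycle
    return errors

rng = np.random.default_rng(1)
W_ctx = rng.random((3, 4)) * 0.5   # driver input -> context
W_out = rng.random((4, 3)) * 0.5   # context -> predicted input
frames = [rng.random(4) for _ in range(3)]
errs = predictive_autoencoder_steps(frames, W_ctx, W_out)
```

Note that the very first error equals the first input, since no context exists yet to predict from; every later prediction is conditioned on the previous 100 ms.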
I think, yes, let's do it. So I'm going to try to give the highest-level description of this I can. We have this concept: you have a bunch of cortical regions that are somehow connected like this, this is all over, and one way of thinking about it is that you have invariant representations up here at the top of the hierarchy. I wrote about this in On Intelligence. I'll take a second, and then I think what you're saying... well, how would I get...
That's the general way of thinking about it. That's how I used to think about it; that's why deep learning networks work, right. I no longer think that's true. With something like this, in our world, when you have grid cells in here, or your locations and reference frames, then each region on its own can create invariant representations. It doesn't need a top-down signal telling it what it's looking at.
Then I can integrate inputs over space, not just over time, right. And so I can get an invariant representation here, and I can pass that as a driving representation up to this guy, which can actually build an invariant representation over its entire input, something along those lines, where I don't need to impose this top-down all the time. I don't have to enforce the stability; it comes out naturally from the learning algorithm. We're getting invariant representations in every cortical region when it does that.
So now that basically says: oh, these guys are all building invariant representations; now I can arrange them, rearrange them, in different ways. But this is the idea: that you form the invariant representation in some high-level region and, you know, propagate down some sort of signals that help you learn. That's actually the basic divide, where they're proposing a mechanism for doing this.
Right, I'm trying to strip away all that detail and come up with the highest-level, simple description of what's going on. They've come up with a very clever way of doing this; maybe that's what the pulvinar does, maybe you can do this. I'm just pointing out that our theories are no longer doing that; theirs are doing this, and, you know, I'm not going to say which is right or wrong. I'm pretty sure this one is right, but we'll leave that open for now; let's just say that's the difference here.
The problem, I think, to me... I'm just trying, and I'm guessing it's a question. I'm trying to understand: the field had always assumed that a local column, say V1 to V2 to V4, would only know about what its inputs are, and therefore couldn't form a representation of the entire input space.
You know, I'm not sure if you want to go ahead; I think you're going to want to delve into this further. Personally, from my point of view, I want to understand more about the alpha cycle. I want to understand more about the evidence, whether that really is evidence for these two phases. It's important to do that; we have never...
For example, we have no idea, we have no concept, that prediction really works that way; I mean, the projection to the thalamus... If that were true, you'd have to modify our models to accommodate that, yeah. That basic idea, that there is phasic prediction, epochs of prediction, in natural vision, and that was actually...
The experiments have always been based on saccades. So a monkey is saccading, looking at something, and during the saccade, the cells that are going to become active when the saccade is finished, some of those cells, not all, some subset of them, become active in advance. It cannot, it does not, predict what the input will be two saccades out; it's just the next one. It's just like, while...
While moving, I predict whatever I'm going to sense, absolutely, but it can't predict multiple steps in advance. It's not like it can predict what you would see as you keep moving along. During the saccade itself, nothing's happening; actually, basically no vision occurs during the saccade: when the eyes move rapidly, they've shown the system shuts off until it stops. So it's just predicting what's going to happen after it stops, and it can't predict what's going to happen after that, because it doesn't know; no one could do that, because no one knows where the eyes are going to move next.
The thing that Florian just mentioned is a huge problem when we try to evolve our model, which is, you know, the stability of perception. You're looking at this picture here, and you have no idea that the inputs to your brain are rapidly changing; three times a second there's a completely new input. So you have to explain that, and in our temporal pooler with this location mechanism, we explain exactly what that does: you have a layer of cells which are stable in their activity. Yeah, that's it, but in one layer.