From YouTube: Deep Learning London Meetup: Brains, Data, Machine Intelligence & Cortical Learning with Jeff Haw...
Description
This is a live-stream of an upcoming Meetup for the Deep Learning London Meetup group. Jeff will be presenting from the Numenta offices in Redwood City to an audience in London. See the details here: http://www.meetup.com/Deep-Learning-London/events/171081112/
Thank you very much for showing up. We've really had a lot of interest in this event — it's been amazing seeing everybody sign up, and it's amazing seeing everybody turn up. So thank you very much. Thank you very much to Skills Matter for hosting us, and for streaming and recording this, and of course a big thank you to Jeff and his team for agreeing to do this. This is organized as part of the London Deep Learning meetup, which Ali and myself have been organizing.
Back to the meetup. Today we're really honored to have Jeff Hawkins present to us, live from California, and maybe we can convince him to come over in person next time. Many of you already know his name, but I'll just go through his bio. Jeff Hawkins is an engineer, serial entrepreneur, scientist, inventor and author. He's the founder of two mobile computing companies, Palm and Handspring, and was the architect of many computing products such as the PalmPilot and the Treo.
He also founded a scientific institute focused on understanding how the neocortex processes information; the institute is still there, and it's located at UC Berkeley. In 2004 he wrote the book On Intelligence, which some of you may have read, which describes progress on understanding the neocortex, and in 2005 he founded Numenta — for a time known as Grok — a startup company building a technology based on neocortical theory. The hope is that Grok will play a catalytic role in the emerging field of machine intelligence, and I think so far it's doing pretty well.
Jeff originally earned his bachelor's in electrical engineering from Cornell, and he was elected to the National Academy of Engineering in 2003. So, everybody, again: thank you very much for coming, thank you to Skills Matter for hosting, and thank you very much to Jeff for being willing to talk to us. Over to you — oh, by the way, one more thing: we'll take questions at the end, and there'll be a microphone that goes around.
Right, we're all set. Hi, I'm Jeff. I'm sorry I'm not there in London with you in person — I do get over there occasionally, I have family in Britain, but I'm not there today, unfortunately. So hopefully it's going to work out all right, and I trust Olli and Matt here on the A/V side; if something goes screwy or weird, hopefully they'll let us know so we can fix it. So I'm going to talk about the work we're doing in our company and in our open source project, and, as was said, we'll take questions at the end. So hopefully it will go well. I unfortunately cannot see you very clearly, and I can't judge how things are going — fast or slow, or whether you want more information or less information on a particular topic.
E
So
if
there's
something
really
going
wrong
and
you
really
feel
like
I'm
missing
something
well,
let
all
you
know
and
I'll
be
happy
to
try
to
go
and
address
them
all
right.
So,
let's
start
actually
Ali
was
cause.
We
all
right.
We
started
this
company,
the
name
Numenta.
We
briefly
changed
it
to
grok,
but
we're
actually
back
to
new
mentis
I'm.
Sorry
about
the
confusion
of
that
the
name
of
a
company
is
momentum,
and
our
mission
is
to
be
a
catalyst
for
intelligence.
So
you
know
the
catalyst
is
something
that
speeds
reaction.
that's happening anyway, but really slowly. We're going to build intelligent machines — we as a society are compelled to build intelligent machines. I have a different view than most people of what those will look like, but we're going to build them, and Numenta is just trying to accelerate that, so that it can be a positive force in that transition. That's kind of our role. We do three things at Numenta: we have a research group where we study neocortical theory, and we do algorithm development.
We also have a product called Grok, which has been less than a month in the market. This product uses our cortical algorithms, which we've applied to streaming analytics. So I'll talk about the research first in this talk, then I'll talk about the Grok product to show you what we're doing with these cortical algorithms, and then we'll end with a little discussion about NuPIC. That's my email address — don't be afraid to email me if you need to; I try to stay involved.
I'm also active on the NuPIC email list as well. OK, I'm going to start off with a story, and this is a story about Bill Gates. When Bill Gates was still CEO of Microsoft, a number of years ago, he was speaking to a group of young students — what we call grade school — and one of those students asked Bill Gates: do you think it would ever be possible to build a company as large as Microsoft — another company as large as Microsoft?
Of course this was before Google, but Bill had a very quick answer. He didn't hesitate. He said yes, and he said: if you could build a company that invents a breakthrough so that computers can learn, that is worth ten Microsofts. When he said that, I thought it was a very smart answer. He was saying, basically: we've built computers on the same principles for 70 years.
These are the principles laid down by von Neumann and Turing — programming principles — and he was saying, you know, computers really don't learn, and if I could tell you how to make machines that learn, that would actually be a bigger revolution than the first one. I agree with that; I thought it was a very astute thing, and he said it right away: computers that learn. I talk about machine intelligence — I prefer that term. They're kind of the same, but one has a bit more vision to it, and a lot of people are becoming interested in machine intelligence.
These days there are a lot of people moving in this direction, and I've been to lots of conferences: computer manufacturers, applications people trying to figure it out, people doing big data. You know: we need machines that can learn, that can adapt, and so on. Now, if you're going to build machine intelligence, you might ask a couple of questions. What are the principles we can use to build intelligent machines?
How are we going to do that? How will these things be structured? There's a lot of disagreement about this. The second question you might ask is: what applications will drive adoption in the near and short term? Well, you know my answers to these questions. On the first one: I'm interested in brains, and I believe that machine intelligence is going to be built on the principles of the neocortex. Now, just to remind you, the neocortex is about 80% of the volume of a human brain.
It's what makes you intelligent. Language, high-level vision, planning, motor behavior and so on are all in the neocortex, and I believe we need to understand those principles before we can build machines that are intelligent. The goal — my goal, and the goal of Numenta — is not to build machines that are like humans. It's not to replicate the human neocortex; it's to understand the principles by which it works and then apply them to other problems that may not be human-like at all.
So this is not, you know, a robot company building human-like robots. This is a company that aims to understand how the brain works and then build intelligent machines that work on those same principles. So, a couple of reasons that recommend the brain for this — why we should think about the neocortex. One: it's an incredibly flexible organ. When you're born, you know very little about the world.
Almost nothing. Your neocortex has structure, but no knowledge about the world, and it's extremely flexible. You can learn to program computers and design computers. You can learn to drive a car, or submarines and airplanes. You can learn spoken language — any one of thousands of languages — mathematics, physics, and so on. It's an incredibly flexible tool. These are things you can learn to do that you never evolved to do — we were under no evolutionary pressure to do them.
It's as close as we know to a universal learning machine: in the same way that Turing talked about universal computing, the cortex is the closest thing we know to a universal learning machine. It's not proven mathematically that that's the case, but that's one reason. The second is that the neocortex is very robust. It's built out of very simple elements that are slow and unreliable — the neurons and the synapses in the brain are fairly unreliable elements; none of them works particularly well — yet together they make a very, very robust system.
There are no single points of failure, and as we think about going forward — in terms of building new computer hardware appropriate for machine intelligence — this is a very desirable property: you could build memories and so on that are naturally fault-tolerant. But there are still quite a few people who don't believe that machine intelligence is going to be built on the principles of the brain — people who just do not care about the brain — and, in fact, I think mine
might still be a minority opinion. But I can say the following: if we did know how the neocortex works, there would be a race to build these machines. If we had a theory written down of exactly how the neocortex works, I don't think anyone would be sitting around arguing about it — we'd be off building the things. And honestly,
that's where we're starting to be. We're starting to really deeply understand how the neocortex works, and we're starting to build these systems, and I think this debate about whether brains are relevant or not will disappear in the coming years. OK, so now we're going to dip into some neuroscience, just to tell you a bit more about how brains work. Here's an overall picture of what the cortex does. It receives information from your senses — and the retina is really like an array of senses; it's not one sense.
It's not batched in any way. And the cortex, as I said, starts out early on not really knowing anything, and it has to learn a model of the world. So from the sensory stream it builds a model of the world, and from that model it makes predictions, detects changes or anomalies, and takes actions. And because it's taking actions, most of the changes that occur on your sensory organs — most of the changes coming in on your sensory streams — are because of your own behavior.
Every time you move your eyes — which is three to five times a second — you get a completely new pattern coming in on your optic nerve, and when you feel things, it's your own body moving through space and touching things, and so on. So a good portion — the vast majority — of the changes on your sensors come from your own behavior as you move and manipulate the world.
So what we say is that the cortex learns a sensory-motor model of the world, and what we want to know is: how does it do that? What does that model look like, and how is it learned? We know a lot about this now. OK, let's start with the basic high-level theory about what the cortex does. We call it hierarchical temporal memory, or HTM, and the three terms are quite descriptive. The first is that it's a hierarchy.
When you look at the neocortex, it's actually a sheet of cells. It's about two and a half millimeters thick and about a thousand square centimeters in area, so you can imagine it's a very thin sheet — like a large dinner napkin. That sheet is divided into regions, and those regions are connected together in a hierarchy. This is very well documented, and the amazing thing is that
the cortical sheet looks nearly identical no matter where you look — no matter what region, where you are in the hierarchy, or what information it's receiving. In fact these regions are common across species and modalities: the visual regions in a human and in a mouse are almost the same, and the higher-level and motor regions are almost identical. It's been known for over 30 years that everywhere the neocortex is basically implementing the same learning algorithm — there are variations on a theme, but it's basically the same thing.
This is a wonderful discovery, because if you understand how one area of the neocortex works, you're going to understand how most of the neocortex works. There's nothing inherently visual about the visual areas of the neocortex, and nothing auditory about the auditory areas — the brain hears and sees using the same methods.
The second part is that the memory in the cortex — and this really is a memory system — is mostly sequence memory. You might be surprised by that, but think of a sequence memory as something like the memory of a melody. Most inference we do relies on sequence memory. As you listen to my voice right now, you're understanding what I'm saying from the pattern through time — what pattern follows what in time is very, very important, and if I mixed up the order, it would be different.
The same is true of touch: when you touch things, your hands move in a particular pattern over surfaces, and so on, as you move through the world — these are sequences. And in vision, most people are confused about this: they think vision is a spatial inference problem, as if there's a static picture in front of you. That's not true. As I mentioned earlier, the input from the eyes is changing three to five times a second.
Every time your eyes move — a saccade — and your head is moving, and you move through the world, and things in the world are moving. So vision, too, is mostly a temporal inference problem. And finally, motor behavior — the high-level motor behavior generated in part by the cortex — is also a time-based pattern. My speech right now is being generated by my neocortex; it's a very complex pattern of muscle innervations that's going to continue for forty-five minutes or so, and I'm playing back patterns that I've stored in the past.
You learn sequences of sequences of sequences going up the hierarchy, and when you come back down, those sequences are unfolded. So I just said, in effect, "give a brain talk about this," and I unfold this very complex pattern going back down the hierarchy. That's the overall theory of how the cortex works. It's a very simplistic view of it, but I think it's correct. Now we can jump in a little bit further — we're going to dive a little deeper.
A little more theory. Imagine you have a sheet of the cortex, and you're looking at any area — it doesn't really matter what area; it's about two and a half millimeters thick — and then you jump in one level deeper and zoom into one spot. (It looks like Matt's in control here, giving the slides crazy directions.)
You jump down one more level and you'll see that in the cortex — no matter where you look, no matter what species of mammal — you're going to see layers of cells. The exact count is not that important, but basically there are layers of cells labeled two, three, four, five and six, and you see this everywhere. Then you jump in one more level of detail.
You zoom in a little bit further and you'll see the second organizing principle — and this is as deep as we're going to go for a while — which is that the cells are arranged in little columns, these little mini-columns as I call them. All the cells are packed vertically in these very tiny vertical columns that run across the layers, and there might be 100 cells in a mini-column. So this is the basic organization; this is the structure in which the brain works.
All your memories are stored in this kind of structure, and the question is: what's going on here? We have a theory about this. We believe that each layer in the neocortex is learning sequences of patterns, and the layers are doing it for different purposes under different conditions, but it's all about sequence memory. Layer four and layer three are both inference layers, layer five is the motor output layer, and layer six is the attention layer.
We have studied this quite a bit, and we think we know in detail how layer three works — and it's very similar to the other layers. We call this the cortical learning algorithm, or CLA. It's basically a model, or a theory, of how a layer of cells in the neocortex learns sequences of patterns and does prediction and inference with them. It's a basic learning algorithm, and we've been testing it for quite some time — we have software we work with every day — so we're pretty confident we understand, to a fairly deep level,
exactly what's going on here. Now, it's important to understand that the CLA is not just another neural network. Some people say "artificial neural network" — not really, no. It's a cortical model. It's got neurons in it, but the neurons are unlike anything you've seen before — I'm going to tell you a bit about them. They're like real neurons, not artificial at all in terms of how they operate, and so you just have
a different architecture than you would in a typical artificial neural network that you might be familiar with. I'm going to give you some flavor for that. I'm going to go a little deep here for a while, and then I'll come back out and make it easier in a few moments — hopefully I won't lose everybody here. So let's talk about these layers a little bit more. I've already mentioned this briefly, but here's the way this works in the real brain,
in the cortex. Layer four is the basic input layer, and it gets your sensory data — the term neuroscientists use is "afferent", which means feed-forward data. So it gets a copy of what's going on in your senses. But from somewhere else in the brain it also gets a copy of motor commands — meaning there are parts of your body that generate your behavior; the cortex controls them, but it also gets a copy of what's going on. So when your eyes move,
layer four gets a copy of the motor command that made your eyes move. So the cortex knows what behaviors you're performing, and it also knows what you're sensing — sensory plus motor. What we believe is going on is that the layer four cells learn, essentially, sensory-motor transitions. What do I mean by that? Think about a saccade in vision. Here's a face, and as you look at the face, your eyes saccade over the different parts — you look at the hair, the eyes, the nose, the mouth. You don't realize your eyes are doing this.
Your perception is stable — you just see this face — but the reality is that the input coming into your brain is changing dramatically, completely, every few hundred milliseconds. Now, what layer four tries to do is predict what you're going to see next, or hear next, or feel next, and it's going to do that based on your own behavior. So in this case — let me go back a second — the order in which I look at a face is not fixed. Sometimes I'll go to the eyes and then the nose,
sometimes I go to the hair, or whatever — it's not a fixed order, not a sequence. But if I knew what behavior you were about to perform, and I knew what you're looking at now, I could predict what you're going to see next. That's what's going on in layer four. Now, what happens is, if layer four can make a correct prediction about what's coming next, it creates a stable pattern in the next layer up in the cortex, which is layer two/three.
This is the standard neuroscience of how information flows through the cortex. If it can't predict what's going to happen, it essentially says: I have no idea what's going on here, I can't model this — and it passes the changes through. This comes back to the temporal stability I talked about earlier. The basic idea is that layers two/three are learning high-order transitions — a high-order transition being something like the memory of a melody.
This is the problem the brain has to solve, in language and music and motion and hearing — you name it, everything, vision. These are the two major ways of doing inference, and you see them everywhere. Then in your cortex, layer two/three projects to the next higher region and the process repeats. I'm going to argue that these two basic ideas of inference — for sensory-motor transitions and for high-order transitions — apply to every modality: to vision, hearing, touch; it doesn't really matter.
These are universal inference steps, and they apply to anything: if you've got some sensory organs and you have some behavior, this will work for them. The other two layers: layer 5 is where motor generation is created in the brain — the cells there actually project to the areas of your body that generate behavior — and layer 6 handles attention. What I'm going to claim here is that we understand layers 2/3 very well; we've been building these for years.
We've tested them in our commercial product. We're starting to understand layer four pretty well, and when I say 90% understood, I mean I know 90% of what I need to do to build this in software and test it — it's not "look, interesting, I understand 90% of the biology"; we're talking about actually building something practical with it. On the motor side we're only at about 50%, and on the attention side it's even less, about 10%. I have another talk online
that covers the state of the world right now — you kind of need to know this stuff if you're going to work on our open source project. OK, let's jump back in; I'll bring up this slide again. OK. I already said that the cortical learning algorithm, the way we first implemented it, is really modeling high-order sequences — high-order transitions. So what's going on there? Before I can tell you exactly how it works,
I want to give you a few more things to think about. I'm going to talk about sparse distributed representations — what they look like — and then we'll put it all together and show you how this thing works. So let's jump into sparse distributed representations. These are the language of the brain. SDRs, as we call them, are how the brain works, and they're not optional.
This is not something you could choose — it's just there in the biology; it's part of the theory of how all information is represented. So how do we understand them? The easiest way is to first contrast them with what we do in computers. In computers we use what are called dense representations. A dense representation is like a binary word: it might take 8, 32 or 64 bits, and it's dense because we use all combinations of ones and zeros — we use all possible representations.
An example is the ASCII code. There's the 8-bit character for the letter "m". Notice that in a representation like this, the bits themselves don't mean anything — the third bit of an ASCII code has no meaning on its own — and if I change one bit, I have a completely different representation. These are, in some sense, arbitrarily assigned representations, and this is pretty much what we use in computers all the time; we have to assign meaning to them.
The computer itself doesn't know what these things mean. In a brain it's quite different. Now, when I talk about bits, you can think of ones and zeros, but you can also think of cells. If I say I have thousands of bits, I'm talking about thousands of neurons; if I say most of them are zeros, I mean most of them are inactive. So here is what we typically see in an SDR.
You need at least several thousand bits to do something, and most of them are zeros — meaning most of the cells are inactive; there are very few ones and mostly zeros. In the brain, what we always see is that very few cells are active and most of the cells are relatively inactive. We'll use this example of a 2,000-bit SDR, and we'll say 2% of the bits are active. We're always going to have 2% of the bits active — that's static; that's what an SDR is. It's always sparse.
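The numbers in this example are concrete enough to sketch in code. Here is a minimal illustration — my own sketch, not NuPIC code — of a 2,000-bit SDR with exactly 2% of its bits active, represented simply as the set of indices of its one-bits (the per-bit semantics described below are learned in a real system, not random as here):

```python
import random

N, W = 2000, 40  # 2,000 bits total, 40 of them active => 2% sparsity

def make_sdr(rng):
    """An SDR as the set of indices of its one-bits (the other 1,960 are zero)."""
    return set(rng.sample(range(N), W))

rng = random.Random(42)
sdr = make_sdr(rng)
assert len(sdr) == W                   # always exactly 40 one-bits
print(f"sparsity = {W / N:.1%}")       # prints: sparsity = 2.0%
```

Storing only the active indices is also how an SDR is held compactly in practice: 40 small integers instead of 2,000 bits.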
It always has a certain number of bits active. So I might have 40 one-bits and 1,960 zero-bits. Now, the difference here is that the bits mean something — they have semantic meaning; you can actually say what each bit means. In the brain this is learned, not assigned, but to make it concrete, imagine what a bit might mean if I were to
E
craft a representation for letters — which I wouldn't actually do, but if I did, I could say one bit means it's a vowel, another that it's a consonant, another that it sounds like an "e" or "i" or "o" sound, another that it has a soft or fricative sound. I could have bits describing how the letter is written: is it closed or open, does it have ascenders or descenders. I could have bits saying where it is in the alphabet,
what's next to it, and so on. And what I would do, if I wanted to represent a letter, is pick the top 40 attributes that match that letter; and if I pick another letter, I pick the top 40 attributes that match that one. That's how the brain forms representations. Now, this has some really great properties, so let's go through a few of them — and I'll tell you the key fact now.
If you want to remember only one thing from my talk about the future of intelligent machines, remember that they're going to be built on sparse distributed representations. That's the key. OK, so the first property: if I take two SDRs — two representations — and they have a bit in common, meaning they share the same bit being one, I can say they have semantic similarity; they're sharing some semantic meaning. This will not happen by chance.
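This overlap property can be sketched in a few lines (again an illustrative sketch, not the actual implementation). The overlap of two SDRs is the count of one-bits they share; for two unrelated random SDRs of this size the expected overlap is only W·W/N = 40·40/2000 = 0.8 bits, which is why a sizeable overlap essentially never happens by chance:

```python
import random

N, W = 2000, 40
rng = random.Random(0)

a = set(rng.sample(range(N), W))   # two unrelated random SDRs
b = set(rng.sample(range(N), W))

# Overlap = number of shared one-bits. When the bits carry meaning, shared
# bits imply shared semantics; for random SDRs the expected overlap is only
# ~0.8 bits, so any large overlap is a strong signal, not coincidence.
overlap = len(a & b)
print("overlap of two random SDRs:", overlap)
assert overlap < 10   # a large chance overlap is astronomically unlikely
```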
So if I see a couple of shared bits like this, I can say these two are similar, and here's how they're similar, because those bits represent something semantic. Next, suppose I want to remember a pattern and then ask: did this pattern occur again later? A store-and-compare operation: store this pattern and see if it occurs again. We're not going to save all 2,000 bits — that's what we would do in a computer.
What we're going to do is just save the locations of the one-bits. So I have a list of 40 indices — OK, I have 40 ones, let's remember where they are — and now, if I see a new pattern coming in, I just look at those locations, and if there are ones there, I know I have the same pattern. That's practically guaranteed. By the way, in a brain these stored connections are the synapses between cells.
Now, what if I couldn't store the locations of all the one-bits — what if I could only store the locations of a few of them, a subsample? Let's say I can only store 10 of them, picked at random. Well, we can do the same operation: a new pattern comes in, I look at those 10 locations, and if there are 10 ones there, I'll say it's the same pattern. But you might say: wait, that could be an error — what about the other 30? They could be different. And that's true, but the chance of such a false match turns out to be extremely small.
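This subsampled store-and-compare can be sketched as follows (an illustrative sketch; the names `fingerprint` and `matches` are mine, not from any library):

```python
import random

N, W, S = 2000, 40, 10                 # store only S of the W one-bit locations
rng = random.Random(1)

stored = set(rng.sample(range(N), W))              # the pattern we saw
fingerprint = set(rng.sample(sorted(stored), S))   # remember 10 random locations

def matches(candidate):
    """Declare a match if every remembered location holds a one-bit."""
    return fingerprint <= candidate

assert matches(stored)                 # the original pattern always matches
other = set(rng.sample(range(N), W))   # an unrelated random SDR
print("unrelated pattern matches:", matches(other))   # almost surely False
```

The point of the sketch: matching ten remembered locations against a random 2%-sparse pattern requires all ten to land on that pattern's 40 one-bits, which is vanishingly improbable.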
Also, even if you do make a mistake, you're making a mistake with something that's semantically similar to the thing you stored — it has a lot of semantic structure in common, and is therefore a good substitute. And this is the basis of generalization in the brain: we don't need to recognize things completely; we can subsample, we can look at a partial pattern and say this is semantically close. And finally, there's another property that we use a lot in our algorithms, and this is the most complicated one.
I can form a union of these things — I can OR them together. So in this case, I may take ten of these SDRs, each with two percent of its bits active, and if I OR them together — literally just do that — I'll end up with a new representation, two thousand bits, that has about twenty percent of its bits active. Now, I can't undo this: you can't ask it what the original ten were; that's not possible. But you can do something almost as good. You can say: here's a new one —
is it one of the original ten? And I'm going to claim that if you just look and ask whether the union has one-bits in the same locations as the one I'm checking for, then I can say it's a fit — a good match. Now again, you could point out that this could be an error, because I could be matching some bits from one of the original ten and some bits from another of the ten. But again, by the same logic, this is very unlikely to occur.
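The union property can also be sketched directly (illustrative only — in the brain the "union" is carried by synapses, not Python sets):

```python
import random

N, W = 2000, 40
rng = random.Random(2)

# OR ten 2%-sparse SDRs together into a single union representation.
members = [frozenset(rng.sample(range(N), W)) for _ in range(10)]
union = set().union(*members)
print(f"union has {len(union)} one-bits (~{len(union) / N:.0%} of {N})")

def in_union(sdr):
    """Membership test: every one-bit of the candidate appears in the union."""
    return set(sdr) <= union

assert all(in_union(m) for m in members)   # every stored member still matches
novel = set(rng.sample(range(N), W))       # an unrelated random SDR
print("novel SDR matches:", in_union(novel))  # a false positive is astronomically unlikely
```

For a novel random SDR to falsely match, all 40 of its one-bits would have to land inside the ~20%-dense union, with probability roughly 0.2^40 — which is why the "wrong member" error the talk mentions can be safely ignored in practice.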
Extremely unlikely — almost astronomically unlikely. It could happen, but if you do make a mistake — if things don't quite line up — you're making a mistake with something semantically similar to the things you stored. So this is the kind of logic that the brain uses, the kind of logic that intelligent machines will use, and the kind of logic that we're using in our algorithms. OK, now just a couple of words about neurons, and I promise this is the last picture of a biological thing.
This is a picture of a real neuron in a brain. About 80% of the neurons in your brain and your cortex look like this; it's called a pyramidal cell. Now, these cells have lots and lots of synapses on them — these are the connections to other cells — and typically there are 10,000 synapses on a neuron. However, only a few hundred of them are close to the cell body, and those define what you might consider the classic receptive field of the cell. When a neuroscientist asks why a cell responds,
they look at those few hundred and say: that defines it — and all the other 9,800 don't seem to be doing much. But the vast majority of the synapses are on those distant connections, the distal dendrites — the dendrite trees that come out from the neuron — and what we now know, which we didn't know 15-20 years ago,
is that these dendrites are active computing elements. They do something interesting: if you have, say, 10 to 15 to 20 connections on a dendrite, and these connections are close to each other — very close, right next to each other on the branch — and they become active at the same time, then the dendrite generates what is called a dendritic spike, which has a large effect on the cell body. If those synapses become active at different times, or in different locations, nothing occurs.
What they do is recognize a pattern and then put the cell into what's called a depolarized state, or what we call a predicted state. So the cell can say: look, I'm going to be able to predict my own activity if I see one of these many patterns out there. This is a picture of the artificial neuron that we use in our simulations, and it captures this.
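The dendrite-as-coincidence-detector idea can be sketched like this. It is a toy illustration, not the CLA implementation, and the threshold of 15 co-active synapses is an assumed, illustrative number:

```python
# A distal dendritic segment "spikes" when enough of its synapses see
# active inputs at once; any spiking segment puts the cell into the
# predicted (depolarized) state described in the talk.
THRESHOLD = 15

def segment_active(segment_synapses, active_cells):
    """A segment spikes if >= THRESHOLD of its synapses connect to active cells."""
    return len(segment_synapses & active_cells) >= THRESHOLD

def is_predicted(cell_segments, active_cells):
    """The cell is depolarized if ANY of its distal segments spikes."""
    return any(segment_active(seg, active_cells) for seg in cell_segments)

# One cell with two learned patterns (two segments of 20 synapses each).
seg_a = set(range(0, 20))
seg_b = set(range(100, 120))
cell = [seg_a, seg_b]

assert is_predicted(cell, set(range(0, 16)))      # 16 >= 15 hits on seg_a: spike
assert not is_predicted(cell, set(range(0, 10)))  # only 10 co-active: no spike
```

Each segment is one stored coincidence; a cell with many segments can predict its own activity from many different contexts, which is exactly the "one of these many patterns" behavior described above.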
The
green
dots
are
the
proximal
or
the
synapses
that
are
that
are
near
the
cell
body
and
then
the
blue
dots
and
this
one
are
those
ones
on
the
far
distance
and
those
are
the
ones
like
coincidence,
detector.
So
that's
what
we're
showing
here,
and
so,
when
we
build
these
models,
this
is
inherent
in
how
our
models
work.
Okay,
now
we're
gonna
tell
you
how
the
basics,
how
you
learn
transitions?
I'm
not
going
to
go
through
all
of
it.
E
Imagine I have a bunch of cells in an array like this. Each of those little cubes is one of those cells, and they're all receiving some input — they get different amounts of input from the input space. Imagine the input coming from your eyes: it's this big array of bits, it's not like one thing, and each cell is getting a different amount of input — we won't talk about exactly how — and that's what the color represents.
E
You'll end up with just a few of the cells being active and most of the cells being inactive. There's a picture of a small section of one of my simulations, just showing you a few cells and what a sparse representation might look like in a brain — or in a simulation, if you want to put a visualization on it. This is a sparse cell activation, or sparse representation.
E
Now, this is at one point in time: you have this pattern, but at another time you'll have a different pattern. Let me go back one — forward — back one. Okay. This is what the brain has to learn sequences of: when it's learning sequences, it's learning sequences of these sparse patterns, and the way it does that is pretty cool.
E
As input comes into the system, what will happen is some of the cells become active — those are the red cells in this picture — and some of them will be predicted — those are the yellow cells in this picture; they're depolarized. There are more yellows than reds in this case, because typically the system will have learned many transitions.
E
So if I learned A followed by B, and A followed by C, and A followed by D, and I show it A, it's going to predict B, C and D — the union of those patterns at the same time. That's what's typically going on. Now, there's a problem with this: this is not a high-order memory. It can only learn one-step state transitions, so this is a first-order sequence memory.
E
It could not distinguish between A-B-C-D and X-B-C-Y; it doesn't have the ability to do that. The way the brain solves this — and the way our algorithm, the CLA, solves it — is by going back to those mini-columns I mentioned earlier, and I have one slide on that, which is a little bit of a build. So here's what we want to understand now: how can cells learn higher-order sequences, and how are we going to use mini-columns for that? We're trying to solve the A-B-C-D prediction versus the X-B-C-Y prediction.
E
You can see there are six cells in each column, and maybe a dozen columns in each of these little pictures. The sparse activity due to the letter A — that input A — is represented by three columns, then B is represented by three columns, C is another three columns, and so on. These are sparse representations of the patterns A, B, C, D. Of course we would be doing this with a much larger number of cells, but this is to illustrate. Now, what if I showed it the next pattern, X-B-C-Y?
E
Well, you see, X is different than A, but the B is the same as the B, and the C is the same as that C, and the Y is different than the D. So this isn't going to work. What happens after training is interesting. We start off with the same A, but when you go to the next pattern — this is after training on the sequence — you end up with a new pattern: B-prime. And what does B-prime have?
E
It has the same columns active — you can see they're the same three columns — but now only an individual cell in each column is activating. It's much sparser: instead of having 18 cells active, I have three in this picture. The same thing would happen with C-prime, and with D-prime. So after training, after learning the sequence, I'd go A, then B-prime, then C-prime, then D-prime. The columns are the same as before, but now we have different cellular representations.
E
If I did the same thing for the X-B-C-Y sequence — I'm now labeling it B-double-prime and C-double-prime — you'll see that B-prime, B-double-prime and B all have the same columns, but they have different cells. This allows the system to represent B or C in many different contexts. It allows us to say: this is a C, but it's uniquely the C in this sequence — and it allows us to predict D instead of Y.
E
There would be 10 to the 40th ways to represent the same pattern in different contexts — I could learn 10 to the 40th different ways of representing B in different sequences — and that's a very, very large number, of course. So the brain has this ability to learn an almost unlimited number of contextual ways of representing the same thing in different contexts.
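A number like "10 to the 40th" is just combinatorics. As a hedged illustration (the parameter values below are assumed for the arithmetic, not quoted from the talk): if an input is coded by M active mini-columns and each column holds N cells, then choosing one cell per column gives N to the power M distinct context-specific codes for the same input.

```python
# Illustrative arithmetic behind a "10^40 contexts" style claim
# (assumed parameters: 32 cells per column, 27 active columns).

def num_contexts(cells_per_column, active_columns):
    """Distinct one-cell-per-column codes for one column-level pattern."""
    return cells_per_column ** active_columns

codes = num_contexts(32, 27)
assert codes > 10 ** 40   # comfortably more than 10^40 representations
```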
All right — that's all about the CLA, and I know it's a lot to absorb; you can't absorb this in one session, and I hope I didn't
E
lose you too much on that, but I wanted to give you a flavor for it. In the end you have to take my word for it: the thing really works, and it works well. It converts an input into a sparse distributed representation in columns, it learns transitions — these high-order transitions — and it's able to make predictions and detect anomalies. The brain uses it, and we can use it in machine intelligence for inference — higher-order inference, sensory inference — and motor behavior. That's how, I'd tell you, your brain
E
does it. And it has some really nice capabilities. It's an online learning system, meaning you don't batch up the data — you just learn as it goes; you throw in your data. I didn't tell you how this happens, but you can read about it. It's very high capacity: even a very small region of the CLA can learn millions of transitions. It works with simple learning rules. It's naturally fault tolerant, so no cell, no neuron, no synapse, no column is essential. And there are no sensitive parameters.
E
It's not like this thing has to be really tweaked to get it to work. For performance you can make it work differently in different ways, but nothing is really sensitive. So these are great attributes, and these are the kinds of things we'd like to see in a machine intelligence system. Okay — I argue that this is the basic building block of machine intelligence. It's the basic building block in your cortex, and there are people out there now trying to figure out how to build this stuff in hardware.
E
Okay, let me make sure you all still know I'm here. Let me just talk briefly now — I'm going to switch to applications. Does this stuff really work? What do you do with it, and what can you do? In this case I'm going to talk about how we used it in our company for anomaly detection, and then I'll talk about another application in natural language processing.
E
We were trying to find a commercial application for the CLA, and so we said: let's do anomaly detection in streaming metric data. We can take a server, and we can take a data point off that server every minute, or every five minutes, bringing it in one point at a time. We run it through an encoder, which turns it into a sparse distributed representation. We then feed that to the CLA. The CLA builds a model of how this metric — this value — changes over time, and from that we get a prediction.
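The encoder step in that pipeline can be illustrated with a toy scalar encoder. This is a hedged sketch, not NuPIC's encoder implementation — all widths and ranges below are assumed: a metric value is mapped to a fixed-size SDR by turning on a small contiguous window of bits whose position tracks the value, so nearby values share bits.

```python
# Toy scalar encoder in the spirit of the pipeline described above
# (assumed parameters; not Numenta's implementation).

def encode_scalar(value, vmin=0.0, vmax=100.0, n_bits=400, n_active=21):
    """Encode value in [vmin, vmax] as a set of n_active contiguous ON bits."""
    value = min(max(value, vmin), vmax)          # clamp out-of-range values
    span = n_bits - n_active                     # highest possible start bit
    start = int(round((value - vmin) / (vmax - vmin) * span))
    return set(range(start, start + n_active))

# Nearby values share many bits (semantic overlap); distant values share few.
a, b, c = encode_scalar(50.0), encode_scalar(51.0), encode_scalar(90.0)
assert len(a) == 21
assert len(a & b) > len(a & c)
```

That overlap property is what makes the SDR meaningful to the CLA: similar metric readings produce similar representations.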
E
We can detect that there's a prediction error, and we do some interesting statistical processing on the other side, which you can read about in a white paper. What you end up with is an anomaly score. You can end up saying, about the state of the system: how unlikely is this, given what I've seen in the recent past — given what I've learned about how this thing changes over time over, say, the last three weeks or a month? How unlikely are these patterns I'm seeing now? These are temporal patterns.
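The raw score feeding that statistical post-processing can be sketched simply. This is a simplified illustration (the white-paper statistics on top are omitted): score each time step by the fraction of the input's active columns that the model had not predicted from the previous step.

```python
# Sketch of a raw anomaly score (simplified; the product described above
# adds statistical post-processing on top of something like this).

def anomaly_score(active_columns, predicted_columns):
    """0.0 = everything was predicted, 1.0 = nothing was predicted."""
    if not active_columns:
        return 0.0
    unpredicted = active_columns - predicted_columns
    return len(unpredicted) / len(active_columns)

assert anomaly_score({1, 2, 3, 4}, {1, 2, 3, 4}) == 0.0   # fully expected
assert anomaly_score({1, 2, 3, 4}, set()) == 1.0          # total surprise
assert anomaly_score({1, 2, 3, 4}, {1, 2}) == 0.5         # partly expected
```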
E
It's like listening to someone play music and saying: hey, have they gotten better or worse? Are they making more mistakes? Did they change their style? Something like that. We can do this for lots of metrics off the same machine, and we can put this in a product — which we did. So I'm going to talk about our product. I'm not trying to sell you on it; I'm just going to illustrate how it worked. The product is called Grok. It's about a month old — it's been on the market for about that long.
E
It's designed for the Amazon marketplace, for monitoring things that run on AWS — this is their cloud service that much of the internet runs on — and we run on top of AWS. The point of this is that we get the data from a device through something called CloudWatch, and we get data from the servers themselves. We feed this into Grok; Grok builds models of this data, and it does this in an automated way, because the whole thing is learning online continuously. Then it sends the results to a mobile client. This is not brain-related,
E
but just to give you a sense of how this whole thing works: what we do is show you all the different instances, all the different servers you're monitoring. We show you how anomalous they have been over the last week, the last day, the last hour. It's all sorted, so very quickly you can look at something and say: hey, is everything working well? What's unusual? And then you can drill down to see what's going on. I need to tell you a little bit about the interface for this,
E
just so you know why I'm showing you these pictures — they're interesting. We start off, on the left, with a sorted list of how anomalous your servers are. Then, on any particular server, you can drill down and see which metrics are involved and how anomalous they are. The rows in the middle picture are three different models, and each one is a cortical
model.
E
Every one of these things is running — we're running hundreds of cortical models on this data. Then in the final picture, someone can actually look in to see what the actual metric data is, and see why Grok determined it was unusual or not. So we sort this by anomaly score, and it's continuously learning and continuously updating. Now here come some of the interesting things. I want to give you a sense of the kinds of sensitivities in the CLA — what does it see? What does it bring to this?
When something was running along at one level and then jumped to another one — that's pretty obvious. Here's one that's a little bit less obvious: it's a slow change. These things were creeping up slowly over a matter of days, and eventually Grok says: that's enough, that's a change. And here's a very subtle change — the two pictures on the right. Again, we're talking about how the cortical model is modeling this data: we're feeding the data into the cortical model and it's trying to make predictions about it.
E
In this case the data is very predictable — it's a very regular pattern — and Grok did not flag those two spikes in the third picture on the right, because it had seen them before. But if we zoom in on the big block of blue there — going from looking at a week's worth of data to looking at a day's worth — you'll see that Grok detected a very subtle change in a very repetitive pattern.
E
Normally, every hour there's a little tick up in the behavior here, and on one day, in one hour, it changed slightly. So this is a very regular data set; Grok built a very, very highly predictive model, and it could detect when something was very subtly different. Okay, here's a case where we could detect changes in a noisy data stream.
E
It's not obvious why Grok flagged that area — what was it about it? Why was the data less predictable at that point in time? It's not clear a human would have been able to pick that out, but Grok and the CLA said: there's something unusual about this. I know the patterns of this data; I couldn't predict this; it's less predictable than it was before. And it turns out, in this particular case, these two models both detected a very sudden, subtle change.
E
One of our engineers had gone onto a server that is normally an automated build server — if you know what that means: basically, it builds our product over and over again on an automated basis — and he had started a manual build process. So it detected when a human did something slightly different, because the human was doing what is normally done automatically. This shows the power of our approach: we're using the CLA, these cortical models, to do very sophisticated anomaly detection in streaming data.
E
You might not think of that as a machine intelligence application, but that's the kind of thing we can do today, and it's very powerful and very valuable. Okay, let's switch now and talk about a different application. This one is from a company called Cept, in Austria. A guy there named Francisco Weber read our papers about the CLA and its sparse distributed representations, and he said: hey, this solves a problem. He was a natural language expert.
E
He said: I've been working on these problems of representation in natural languages, and my gosh, these sparse distributed representations are the key to solving natural language problems. So he did something very clever; let's take a look at that. First of all, he built a tool that builds sparse distributed representations of words — and proper nouns and so on. He started with a hundred thousand; I think it's more now, I don't remember, but he did.
E
Why do I call this a sparse distributed representation? In this case there are 16,000 bits. They're sparse, meaning only a few of the bits are ones and most are zeros, but they have the properties we talked about earlier, where bits have semantic meaning. He doesn't assign that meaning; it's learned. When two representations share bits, they're sharing some sort of semantic meaning, and you can try to tease out what that semantic meaning is — it's not so easy, just like in the brain — but they are true sparse distributed representations.
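Word SDRs like these are naturally represented as sets of ON-bit indices, with overlap as the similarity measure. The bit positions below are invented purely for illustration — this is not Cept's data:

```python
# Toy illustration of word SDRs: each word is a small set of ON bits out of
# a large bit space, and shared bits carry shared semantic meaning.
# (Made-up bit positions, not real Cept/cortical.io representations.)

N_BITS = 16_000  # the talk mentions roughly 16,000 bits per word SDR

dog = {12, 407, 2210, 5120, 9001, 15890}
cat = {12, 407, 2210, 7777, 9001, 14002}   # overlaps "dog" on 4 bits
car = {33, 901, 4001, 8080, 11111, 15000}  # no overlap with either

def overlap(a, b):
    """Shared ON bits = shared semantics."""
    return len(a & b)

assert overlap(dog, cat) > overlap(dog, car)  # dog is more cat-like than car-like
assert all(bit < N_BITS for bit in dog | cat | car)
```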
E
So now he did some very clever things with them. You can do something like the following. You take the SDR for the word "apple" and the sparse distributed representation for the word "fruit". They're going to share some bits, and they're going to have some bits that are different — there are some bits, some semantic meaning, common between those two.
E
If you subtract out the bits in "apple" that are in "fruit" — meaning we remove the fruit-ness from "apple" — you get another sparse distributed representation. It's a subset of what the "apple" representation was, and you can look that up. It's a new representation we've never seen before — a novel one — but it's going to be close to, and overlap with, other ones. So we can ask: what does it overlap most with? And the answer you get is "computer". You take apple minus fruit, and the best match you get is computer.
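That "apple minus fruit" operation is plain set arithmetic over SDRs. Here is a hand-made toy vocabulary illustrating the mechanic — the real system derives its bits from text, and every bit position here is invented:

```python
# Sketch of "apple - fruit ≈ computer" with an invented toy vocabulary.
fruitness = {1, 2, 3}           # bits shared by fruity words (assumed)
computerness = {10, 11, 12}     # bits shared by computer-y words (assumed)

vocab = {
    "apple":    fruitness | computerness | {20},  # "apple" has both senses
    "fruit":    fruitness | {21},
    "computer": computerness | {22},
}

def best_match(query, vocab, exclude=()):
    """Return the word whose SDR overlaps the query SDR the most."""
    return max((w for w in vocab if w not in exclude),
               key=lambda w: len(vocab[w] & query))

residue = vocab["apple"] - vocab["fruit"]   # strip the fruit-ness out of apple
assert best_match(residue, vocab, exclude=("apple",)) == "computer"
```

Removing the shared fruit bits leaves mostly the computer-sense bits, so the nearest stored SDR is "computer" — the same shape of result the talk describes.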
E
The next best matches are shown here: Macintosh, Microsoft, Mac and so on. There's semantic processing being done with SDRs — that's very, very cool, and there's a lot of stuff you can do with this. At our last hackathon in the fall, our VP of engineering, Subutai Ahmad, took this and used it with the CLA. Here's what he did. He said: okay, let's train the CLA on sequences of words — like sentences. Very simple: we did three-word sentences, and the sentences he picked had this structure to them.
E
They began with an animal; the second word was either "eats" or "likes"; and the last word was what the animal eats or likes. So: "elephant eats leaves", "elephant likes water". He made up just, you know, 50 or 70 sentences like this, fed them into the CLA, and trained it on these sequences. And then he said: okay, let's ask the CLA a question. He fed in the pattern for "fox" and "eats". Now, the CLA had never been trained on the word "fox" — never. It had been trained on other animals, but not the word
E
"fox". This was the first time it saw the word "fox" — but "fox" obviously is going to have semantic similarity to other animals in the list. So we can ask: okay, what does this say? Let's predict what the fox eats. It knows what an elephant eats, and what a cat eats, and what a wolf eats, and what a frog eats — so what would a fox eat? You get a prediction from the CLA, and you can look up what the prediction is closest to, and the answer he got was "rodent" — which is really cool.
E
I don't want to oversell this. We haven't built a natural language machine that's very capable yet, but we're doing it the way the brain does it — exactly the way the brain does it. We're using the same type of representations, we're using the same type of memory systems, we're using the same type of predictions, and I think this was a really, really beautiful demonstration of both SDRs and the CLA — and there's a lot of application for this.
E
So, to review: this whole thing was done without supervision, and it exhibits semantic generalization both at the word level and at the grammatical level, and we think there are going to be a lot of interesting applications — commercial applications — of this. Now, let me just tell you that in these two cases — Grok, which I talked about using for streaming analytics, and Cept, using it for natural language processing — that's the exact same CLA code base. In fact, it's the exact same code.
E
We didn't modify the code to get it to work in these two different examples. The code wasn't written to do either one of these; the code was written to emulate the generic — the universal — process of how brains learn higher-order sequences, and we were able to apply it to different problems with really very little effort. Well, I don't want to make it sound like it's from the lottery. Okay, so now, getting close to the end here, I'm going to switch to our NuPIC open source project — I just wanted to tell you about it for a few minutes.
E
The guy on our end, who you can't see — bring me back up here, sure — is Matt Taylor. He's the guy who runs our NuPIC open source project. We started this last summer, and it seems to be going very well. In there we have the source code for the cortical learning algorithm, for the encoders, and support libraries. You should know that this is a single source tree, meaning the code that we use in Grok is in this repository. So when we make an update to the Grok algorithms, it's there right away.
E
We have a hackathon coming up on the third and fourth; that's going to be here in San Jose, California. And if we have enough interest — and we can somehow manage to do it — maybe we'll do one with the guys in New York; we'll see. You can find that at numenta.org. Now let me look to Matt — if you want to add a few more comments about NuPIC at this time?
B
There we go — hopefully you can hear me. This is the numenta.org network, and we've got a community of 92 contributors now; it's growing very quickly. We've got over 700 people on our mailing lists. So if you're interested in the theory that Jeff is talking about, we have a mailing list for theory. We've got a discussion list if you want to try and get NuPIC working for yourself, and then a list for those of us who are trying to develop NuPIC itself.
B
We've also got a wiki with a nice path if you want to learn more about how to use NuPIC, how to get things started, and more stuff about the theory. Jeff mentioned a video about sensory-motor integration, so there's a link there. Please feel free to go to numenta.org — there's a link to our wiki from there.
E
I hope you could all hear that. Matt's doing a great job of running the NuPIC project. Let me just go to the next slide here, which is our goals for 2014. We have some research goals: we want to finish implementing the layer 4 sensory-motor inference, and then we can get back to introducing hierarchy. We have a goal to publish in peer-reviewed papers — we haven't done enough of that, so we have to do that. For NuPIC, we're going to grow the community.
E
We have some commercial partners, like Cept, which I mentioned; we're also working with IBM and some others, and so we have to support them. There's a lot of cool stuff going on in NuPIC — a lot of projects; talk to Matt about that. And then we're trying to show the commercial value of the CLA with our Grok product, which will help attract developers and help attract commercial dollars and so on — and we're also looking at new applications.
E
Why do I say it's like the 1940s? Because as we entered the 1940s, people had the theory of computers — Turing had written his seminal papers in 1935 — but we really hadn't built any commercial computers yet, and by the time we left, in 1948 to 1950, the computing industry was actually going. I feel that's a lot like where we are right now, where we're getting
E
these theories: we're understanding how the brain works, we're starting to build machines that work on those theories, and we're starting to show commercial value. And, you know, it's hard — I won't beat around the bush about this. There are a lot of new concepts here, a lot of challenging things to understand, and it'll take a while to really deeply understand how the CLA works. You can get there — trust me, it's beautiful when you get the whole thing; it's not that hard — but
E
it's not so simple either. We're on the forefront of what's going to be many decades of advances in machine intelligence, and these are really the formative years. So this is the summary of my talk, of what I've covered so far. I argued that the neocortex is as close to a universal learning machine as we can imagine, and therefore machine intelligence will be built on the principles of the neocortex — and not some other principles.
E
We need to understand these principles. And again, it's not about building human-like machines — machines that have bodies and are going to talk — it's about building machines that learn on these principles, that can build sensory-motor models of the world and so on. Next: we have an overall, overarching theory about what's going on — HTM, the hierarchical temporal memory theory — and we know in detail one particular building block.
E
That's the cortical learning algorithm. We've been exploring and testing it extensively for years, and we've chosen two near-term applications: anomaly detection — and prediction, which I didn't talk about — and natural language processing. It's very hard to know where this is going to go in the long term, but that's what we're doing right now, and I've invited you to participate. You can go to numenta.org, and there are a bunch of papers and lots more talks by me and other people.
G
Hello, my name's Jack Kelly. I'm a computer science PhD student at Imperial, but before that I did an undergraduate degree in neuroscience. So what you're talking about here is extremely exciting — you know, a lot of machine learning is quite dry, and this is really cool. I just wanted to ask: can the CLA do classification? In terms of a lot of conventional machine learning, you show it an image and it says "that's a cat", yeah?
E
So it's all the knowledge applied to the current input. The pattern you have coming out — the activation pattern, which cells are active — is a fairly unique state at any point in time, and you can classify it. Now, in the brain that's classified by just feeding it associatively to a bunch of other cells, but you can literally just feed it into a classic classifier. We've done this extensively.
E
You can take any favorite classifier you have — nearest neighbor, or whatever you want — and we've done a bunch of these; you can classify that state, and it works really well. Now, that's assuming you've had the right input and you've trained the system properly. So the general answer is: yes, you can do classification. We've also done clustering with it — I don't know if we've described that anywhere, but we could talk to you about that if you wanted to.
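The "feed the state into any classifier you like" point can be made concrete with a trivial nearest-neighbor classifier over active-cell sets. The labeled states below are invented toy data, not the output of a real CLA:

```python
# Sketch of classifying a CLA-like state: treat the set of active cells as
# the feature vector and pick the stored label with the greatest overlap.
# (Toy hand-made states; any off-the-shelf classifier would also do.)

def classify(state, labeled_states):
    """Return the label whose stored state overlaps the query the most."""
    return max(labeled_states,
               key=lambda lbl: len(labeled_states[lbl] & state))

labeled = {
    "cat": {3, 14, 15, 92, 65},
    "dog": {2, 8, 71, 81, 82},
}
noisy_cat = {3, 14, 15, 92, 100}   # mostly cat-like activation, one stray bit
assert classify(noisy_cat, labeled) == "cat"
```

Because the representation is sparse and distributed, overlap-based matching degrades gracefully with noise, which is part of why this works well in practice.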
E
Now, the question was also: can it recognize the image of a cat? Well, we haven't done anything with vision. When we first created the CLA, we started working on vision, and we abandoned it for two reasons. One is that in the brain, a huge amount of your neocortex is assigned to vision — something like 40% of the neocortex is only vision, and about 60% is primarily vision. It turned out, when you understand
E
what's going on, that it takes a lot of memory to get human-level visual performance, and we were finding, even on very simple problems, that our simulations were slow. So the long story is that we just wouldn't be able to do those simulations the way the brain does it. We also made some mistakes back then — this was four or five years ago. We think we could do a better job now, but we haven't.
E
We haven't attacked that again yet; we're still kind of fearful of the amount of memory and the resources that would be required, and we would go about it in a very, very different way than other people would. Now, that's probably where some of the advances have been made recently in deep learning. Deep learning is just a hierarchical artificial neural network, so they've got some of the same principles that we talk about in HTM, and I
E
think these two fields may move together. But the deep learning people jumped right in in a way where they use no time whatsoever — there's no time element to it. That's not how humans learn; we learn through time. We know how to do that, we think, but we're not really there yet computationally.
E
And don't be surprised by this. I mentioned that in the human neocortex, about 60% is working primarily on visual problems — 40-some percent almost exclusively — while the areas associated with language are tiny compared to that; they're very small. The evidence suggests that language is a much easier problem than vision; it certainly takes a lot less resources. Now, we could debate the intricacies of this, but, you know, once you get to it,
E
we can do a lot more interesting stuff in language processing than we can in vision, given the software constraints we have today. So I think we know how to do things like recognition — doing it the way the brain would do it, a little bit different from the way the deep learning guys are doing it — but we're not really there yet from a simulation point of view. We can do classification on lots of other problems, though, and that works really well. I can't see your eyes, so I have no idea if I answered your question.
D
I'm Jon Drummond; I work for a spread betting company, in research. I was just wondering — some of this you've already answered — having seen the success Hinton and others have had with deep learning, whether you could expand a bit more on the similarities and differences in the way you're working. Yeah.
E
My approach — our approach — is starting from the biology, starting from the neuroscience. Most of the deep learning people are not taking that approach; they're starting from more mathematical premises. But, as I mentioned earlier, we both believe in hierarchy — it's all about hierarchy — and we're not there yet, because we decided to model the individual layers and understand how those processes work before we model the hierarchy. But most deep learning
E
people will admit that the big thing they're missing is time. They have no concept of time in most deep learning and convolutional networks, so they can't possibly model or understand what a saccade does, or how things move through time. There's some talk about that — there are some primitive attempts — but really nothing inherently going on in the time space. So they know they have to move in the time direction, and I know we have to move more in the hierarchy direction.
E
I also know we have to introduce motor behavior, which they have no concept of, and we're working on that. So I see these two fields as ones that should be merging together. They're not really contrary approaches; they're actually working on different aspects of the same problem. They're focusing more on non-time hierarchy; I'm focusing more on time and motor behavior, while knowing we have to reintroduce the hierarchy. So as
E
long as we don't get stuck in some local minimum — which happened in the past with AI and with early artificial neural networks — we have to keep going. We have to reintroduce time into these hierarchical models, we have to introduce behavior into these hierarchical models, and we have to introduce attention. And when we do that, then we'll really achieve something.
J
Hello there, my name is Antonio. It's the second time I have the honor to speak with you; the first time was about three years ago, when we had a meeting right there in your office — oh yeah — so even back then I was convinced that you were going in the right direction toward, you know, biological intelligence. So I asked you:
J
how could I be helpful? And you said I should become a programmer, because you were trying to build the product — the Grok project. I thought that wasn't really exciting for me, so instead I tried to explore the theoretical foundation of your work, and I would like to summarize your theory in about ten words. I would put it this way: we live in space-time, so we use time to understand space. I think this is the core of the universality we are looking for in a learning system.
E
Right — and I didn't talk about spatial patterns much here, but if you actually read the white paper on the CLA, we talk about the spatial pooler and the temporal pooler, the sequence memory and so on. But I think you're right: we're looking for these universal properties of space and time.
E
What I argued back then, and I still believe is true, is that time has been the single most largely ignored component of AI and artificial neural networks. And the reason people ignored it is that they focused on spatial vision problems. They said: well, look, we can recognize a pattern, a picture, flashed in front of your eyes — there's no time involved — so let's just take time out of the picture. That was a big mistake, because time turns out to be the most important part of the whole memory.
E
I argue that about 90% of the memory in the cortex is time-based transition memory, and maybe 10% is spatial memory. So time is a critical component, but pairing the two together is really the powerful system — and they are universal. I didn't speculate about the future here, but there are a bazillion types of problems that we don't even think about today that could be dealt with using these universal principles.
J
One last sentence. However powerful this universality might sound, I think it still has a fundamental limitation. I guess it is a bit far-fetched, but the theory you described, and recorded as well, relies on the separation of space and time — and if we think about the theory of relativity, space and time are ultimately connected, in a sort of way. So perhaps one day we will discover an even more general theory of intelligence. Maybe.
E
Start with the mouse! You know, I didn't tell you this, but the CLA we implement today — which is 2,048 columns and 64,000 neurons; that's pretty much every model we build these days — is about one one-thousandth the size of a mouse cortex, and about one one-millionth the size of a human cortex. One one-thousandth of a mouse cortex doesn't sound very big, but that little slice is actually really powerful. So we've got a long way to go — the point I'm making is that we can do a lot with a little.
H
I'm Greg, I'm just a programmer with an interest in neural networks. I just kind of wanted to know how the system compares on the common sort of pitfalls of general machine learning — things such as overfitting, such as local minima. And one of my questions is also: does the CLA rely on the vast representation space to avoid overfitting?
E
There's a lot of concepts in that question — it's a great question — so let's try to tease it apart a little bit. Let me just tell you a little more about how the CLA learns. First of all, it has very large capacity: we can learn millions of transitions in a small section of it. But how does it fail, and what happens when you continue to overtrain it? Well, the way we implement it is what we call fixed resources.
E
We don't increase the number of synapses or the number of cells with the amount of training — so, like a brain, it's relatively fixed. And when you train it more and more and more, two things can happen. You can set it up so that it forgets, which it does anyway, but you can change the ratio of learning to forgetting. That's one of the parameters of the system: how much do I want to bias toward previous learning?
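The "fixed resources" idea, with a tunable learning-to-forgetting ratio, can be sketched roughly as follows. This is a simplified illustration with made-up parameter values and a single segment of synapses — not Numenta's actual implementation:

```python
# Fixed-resource learning sketch: a fixed pool of synapse permanences is
# reinforced for active inputs and decayed for inactive ones. No new synapses
# or cells are ever allocated, however long training runs.
import random

N_SYNAPSES = 64          # fixed resource: this pool never grows
CONNECTED = 0.5          # permanence threshold for a "connected" synapse
INC, DEC = 0.10, 0.02    # the learning-vs-forgetting ratio (here 5:1)

random.seed(0)
permanence = [random.uniform(0.3, 0.7) for _ in range(N_SYNAPSES)]

def train(active_bits):
    # Reinforce synapses aligned with the input; decay the rest (forgetting).
    for i in range(N_SYNAPSES):
        if i in active_bits:
            permanence[i] = min(1.0, permanence[i] + INC)
        else:
            permanence[i] = max(0.0, permanence[i] - DEC)

pattern = set(range(16))          # one recurring input pattern
for _ in range(50):
    train(pattern)

connected = [i for i, p in enumerate(permanence) if p >= CONNECTED]
print(connected)  # the connected set converges to the trained pattern's bits
```

Raising DEC relative to INC biases the system toward forgetting old structure faster; lowering it biases toward preserving previous learning — the trade-off described above.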
E
The failures are almost always failures of generalization. It never fails catastrophically — it's not like I train it up too much and start getting garbage results. As you start over-generalizing, it starts predicting things in a more generic way, and so, you know, you may no longer be able to detect very, very subtle changes. So I don't know if that gets at your question — it was kind of open-ended, and there's a lot to it.
E
I just want to say it's not magic — I'm not claiming it is — but it's got a lot of nice properties to it, and you have some control over them. When we're using it in our product, or in our research and our tests and so on, we generally don't run into a lot of those issues of overfitting and things like that.
E
We mostly look for and find the bugs, and we have more problems getting the right data and figuring out some of the other parameters and things like that. But because the system fails so nicely, even the failures are sometimes hard to find — precisely because it fails nicely. So I hope I answered that a little bit. It's a very deep topic, and we'd have to get into lots of details to talk about specific types of problems.
F
Hi Jeff, it's Peter Morgan here. I have a little bit of a left-field question, more of a business-related one. You've been going a while, and, you know, I love your product — I'd like to see it accelerated to market. You mentioned you're working with IBM. Has anyone approached you, like Google or Facebook, to kind of work with you?
E
A lot of big companies are interested in machine intelligence, and there are different opinions about how to go about it. We represent one end of the spectrum of approaches — I tend to think it's going to be the one that carries the weight at the end of the day, but we'll see. So, you know, we've had lots of discussions with people. Yeah, there's a talk online: I gave a Google Tech Talk.
E
Last year I talked with some of the senior people at Google about these approaches, and we discussed the relative merits and so on. So there's a lot of interest in different quarters. I mentioned IBM because we could talk about them, but I can't mention other people that we're talking to. There is lots and lots of interest in this. There's a program being put together at DARPA — the United States Defense Advanced Research Projects Agency — built largely around our work; these are people interested in hardware implementations of these algorithms.
E
So there's a lot going on. It's fun. It's a very noisy field at the moment, though — you see a lot of companies claiming different things and making lots of different investments. Google bought some British company just last year, and Qualcomm bought another one. You know, who knows — our approach is just to keep our heads down.
E
Alright, great question. So yes, it is deterministic, and we use that in some of our tests. There are a bunch of random initializations in it, and if you start with the same random seeds, under a controlled environment, you'll get the exact same answers. From a practical point of view, though, it's not always deterministic, because if I actually run it on a real machine someplace, timing delays could change things and the results end up different.
E
But I can transfer the knowledge from one model to another, no problem, and under the right environment it's deterministic. From a practical point of view, if I take some data off of a server and run it through Grok, and then take the same data off the server and run it through Grok a day later, I'll probably get slightly different results, because the data comes in in slightly different orders and the servers have different queues and all this kind of weird stuff going on — it's too hard to figure out.
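The distinction being drawn — seeded determinism versus order-dependent results in practice — can be sketched with a toy stand-in model (not Grok's actual pipeline; the model and parameters here are invented for illustration):

```python
# Seeded determinism vs. order sensitivity: the same seed and same input
# order reproduce results exactly; reordering the arriving data (as real
# server queues do) changes the outcome of any order-dependent model.
import random

def run_model(data, seed):
    rng = random.Random(seed)                  # controlled initialization
    weights = [rng.random() for _ in range(8)]
    state = 0.0
    for x in data:
        # state carries history, so the result depends on arrival order
        state = 0.9 * state + weights[x % len(weights)] * x
    return round(state, 6)

data = list(range(20))
assert run_model(data, 42) == run_model(data, 42)   # fully repeatable

shuffled = data[:]
random.Random(1).shuffle(shuffled)                  # same items, new order
print(run_model(data, 42), run_model(shuffled, 42)) # different final states
```

Running the same experiment twice on one machine reproduces bit-for-bit; feeding the same records in a different order does not — which is the practical nondeterminism described above.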
E
So if you run on your laptop and you run the same experiment twice, you get the exact same results. Now, on the hardware side — this is a big question: how can things be accelerated here? I don't know if you guys know this, but there are a bunch of companies right now, very large companies, trying to figure out what the next substrate for computing is going to be over the next—
E
—you know, decades. They're all looking at machine learning algorithms, they're all trying to figure out neural models, and a lot of them are interested in what we're doing — and they have very different approaches. There's a guy I know at Sandia National Labs, which is a United States national laboratory, and he's engaged in doing photonics — on-chip photonics: they're trying to, you know, use light and waveguides and so on. Other people are trying to use different kinds of memory.
E
The key answer to your question, specifically, is that the bottleneck is essentially memory and memory transfer. There's a lot of memory, we have a lot of connectivity, and the bits have to get places. If you look at a human brain, the white matter — which is the connectivity, the wiring if you will — the big volume of the brain is white matter. It's wiring.
E
If you've ever seen the back of old computers — trays of them, tons of wires — well, that's what the brain is like. And this is the problem, because it's basically a memory architecture which is distributed. How do you build a distributed memory architecture with lots of memory? I'm not an expert in this field, but in the end, people are trying to figure out what memory architecture will work best. Is it various bus structures?
E
There are several problems — it's just like regular computers. There's a need to make them faster: we run up against that all the time, and we'd love our model to be much faster. There's a need to make them embedded, like the kind of stuff we're doing with Grok. You know, that's a fairly hefty model, and what if I want to embed that in every, you know, disk drive, every I/O controller, everything on the Internet, right — a refrigerator or something like that?
E
Well, that's a pretty heavy thing to run there, so we want to go towards embedded things and low power. So power, size, and speed are all going to be critical here, in the same way that we've struggled with all of those in computers over the decades. We're going to be doing the same thing here, and I haven't a clue which architectures are going to win out — at this point in time, I don't know.
C
Hello — I'm doing computer science and currently working as a data scientist, and I have a bit of a practical question. I recently got interested in natural language processing, and you showed an amazing example of how your networks might be used in that area. I've recently been looking through publications by Tomas Mikolov — he's from Google — and he also used neural networks on a problem different from yours, sure, but similar: word representation in vector space. What you did is strikingly similar, and what I found very different and amazing—
C
—is your example of the fox eating a rodent. You said that this network hadn't been trained on "fox", and this is actually my question, if I can ask it: how have you been able to present "fox" to the network if it has never seen the word? Because in all of the existing research, if we didn't train on a word, we don't know its representation in vector space. How did yours do it?
E
Let me just walk you through it again. So the company CEPT, which is in Austria — they came up with this, and you can talk to them or read about it. They came up with a way of creating these sparse representations for words. So although the CLA was never trained on that word, there was training involved in figuring out how to represent the word "fox". There was a representation for the word "fox", and it was not an arbitrary representation — it was a sparse distributed representation where the bits meant something.
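That core SDR property — shared bits imply shared meaning — is easy to sketch. The bit assignments below are made up purely for illustration; they are not CEPT's actual word encodings:

```python
# Semantic overlap between sparse distributed representations: each SDR is a
# set of active bit indices, and counting shared bits measures similarity
# directly. The specific bit assignments here are hypothetical.
def overlap(a, b):
    return len(a & b)

sdr = {
    "fox":    {3, 17, 42, 77, 104, 200},
    "coyote": {3, 17, 42, 90, 104, 311},   # shares many "animal/predator" bits
    "rodent": {5, 17, 61, 104, 150, 243},  # shares only generic animal bits
    "car":    {8, 33, 129, 256, 301, 388}, # shares nothing with the animals
}

for word in ("coyote", "rodent", "car"):
    print(word, overlap(sdr["fox"], sdr[word]))
# coyote overlaps fox the most, car not at all — so a never-before-seen
# "fox" SDR can still be interpreted through the bits it shares with
# words the system was trained on.
```

Because the meaning lives in the bits rather than in an arbitrary index, a new word's SDR is automatically "partly familiar" to a system that has seen semantically related words.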
E
So if you compared "fox" to all the other animals — if I took the representations for all the other animals in this dictionary — you would see that it shares semantic bits with all of them, in different ways. The representation for "fox" in some sense encodes how it is similar to other animals. So literally, when you feed that pattern into the CLA, even though it has never seen the word "fox" before, it sees some of the structure of "fox": it has seen the bits that are on in the word "fox" be on—
E
—in other representations that it has been trained on. It goes back to that first property of SDRs: two patterns that share bits have semantic similarity. So the trick of that demo — which blew me away the first time I saw it — was that the representation for "fox", although it's unique, also overlaps in bits with other animal representations, and so the CLA picks—
E
—it up. It has never seen this exact pattern, but parts of it are very similar to ones it has seen before — the same bits are on. So the things that a fox does that it shares with, I don't know, a coyote or whatever, who knows, were on. The trick to making that work was that the representations themselves encoded the semantic meaning of the word, and the CLA was just generalizing to something similar to what it had—
E
—seen before. It says, essentially: "fox" is closest to these things I've seen before, in this way. I don't know if I answered that — I don't know if you understood that answer, but are you satisfied with it? The trick was in the representations themselves. It's not an arbitrary representation: "fox" was already similar to other things the system had seen. But that was cool, and it worked because CEPT figured out how to do that.
E
Thank you. Hey, I don't know if there are any more procedural things here, but I do want to thank the organizers, and Ali, for helping arrange this, and the guys at Skills Matter, and Matt Taylor from our end, who put this together. And again, I apologize for not being there in person, but maybe we'll have some future events where I will be over there. And I appreciate you guys all coming out — I guess it's evening for you — and spending the time. I'm very excited and happy we were able to do this. So—