From YouTube: What The Brain Can Tell Us
Description
Second Annual IBM Research Cognitive Computing Colloquium keynote by Jeff Hawkins, co-founder of Numenta.
We're witnessing the birth of machine intelligence, and it's a very messy and confusing time; there are lots of different approaches. We have people arguing for specific machine learning techniques to solve specific problems, and other people arguing for more universal systems. There are different types of learning algorithms: mathematical ones, memory-based ones, and different training paradigms going on. It's a really messy world, trust me, if you're not living in it.
But I believe that, in fact even before the end of this decade, we're going to settle out on one dominant paradigm. My talk today argues for one of these paradigms, and I believe it's the paradigm used by the neocortex. The neocortex is a universal algorithm. As you'll hear me talk about, it's memory-based, it is an online learning system that learns continuously, and its learning is behavior-based. This, I believe, is going to be the dominant paradigm for the next 50 or 60 years of machine intelligence.
The reason we're going to end up with one is the same reason as always: network effects. People are going to want to build new hardware and software and systems on top of the winning solution. And why is this particular one going to win? Because it's the most flexible. It's not always the best solution, but it's the most flexible solution, and it can scale. How do we know this is going to happen? We have a proof case. The proof case is our own brain, our own neocortex, and we know it's scalable.
We know it's flexible and scalable because nature has built neocortices both very small and very large, and there's no reason we can't build them larger. So my talk today is really about making the argument for this, and I think we're in this period of time right now; it's happening as we're sitting here. So my company has two basic goals.
The first goal is to discover the operating principles of the neocortex. Just to remind you, that's about 75 percent of the volume of your brain. It's the big wrinkly thing on top, and it's the location of all intelligence: language, planning, high-level vision, hearing, and so on, everything you think of as intelligent. Our second goal is to create technologies based on neocortical principles. We are not trying to build a brain or anything like a human; we're trying to build learning technologies that work on the same principles as the neocortex.
For my talk, I'm going to start with a little discussion about the cortex and some facts about it; we're going to do a little neuroscience theory here. Then I'm going to talk about our research roadmap and where we are in understanding the system, then I'll talk about applications, and I'm going to leave you with some thoughts on machine intelligence. So let's just jump right into it.
Here is a picture of a human neocortex. It is a memory system; it has to learn. When you're born it knows nothing, and the way it learns is that it interfaces to the world through a set of sensors. Those sensors change some physical quantity into patterns on neurons, and once those patterns are inside the brain, it's no longer light, it's no longer touch and sound. The neurons are identical: the ones carrying information from the retina and the ones from the auditory system. It's just patterns, and the way the cortex handles these patterns, once you get to the cortex, is identical.

This is a pattern system. It's not a vision system, not an auditory system; it's a pattern system. It builds a model of the world from the changing data stream, and the data stream coming into the cortex is very rapid: my voice is bringing you patterns that change on the order of milliseconds to tens of milliseconds.
Now, if you think about it, the patterns coming in from your senses change as you move; you're basically moving your senses through the world. So the vast majority of the changes in your sensory stream are coming from your own behavior. Your eyes are moving, my body's moving; when I touch things, these changes are coming in not because the world is changing, but because I'm moving in the world. A lot of things do move in the world, but most of the changes coming in are from your own behaviors.
So the model that the cortex builds is a sensorimotor model of the world. It's very difficult to separate those two things out: it's a sensorimotor model of the world, and we want to understand how that happens. Okay.
So let's start with some cortical facts. Here is a picture of a human neocortex, and next to it is a picture of a rat neocortex, because it's all the same: everything I'm going to tell you about today is neocortex. It's not specific to any particular animal. It is a thin sheet of cells about two and a half millimeters thick. In a human it's about the size of a dinner napkin, about this big and about two and a half millimeters thick; this is you, this is me. In a rat it's about the size of a small postage stamp. Okay, it's a remarkably uniform system. Anywhere you look in it, you'll see a detailed architecture that's preserved across species and across modalities, incredible detail that's remarkably uniform from an anatomical point of view.
It's functionally uniform as well, even though different parts of the neocortex serve vision and hearing. It has been known for over 35 years that the cortex processes vision, hearing, touch, and everything else it does in the same way, and the evidence for this is overwhelming. Most people have trouble believing it, but it's true; it's remarkably functionally uniform. You can actually swap the auditory nerve and the visual nerve in a young animal, and the auditory parts of the cortex become visual and the visual parts of the cortex become auditory.
We know that the cortex is organized as a hierarchy: these regions connect together, and if you look at the connectivity, it's a hierarchy. If you dig down and take a slice through the cortex, through those two and a half millimeters, the first thing you'll see is an organization. You see layers of cells; there are roughly four layers of cells: layers two/three, four, five, and six.
Now, ten percent or so of the synapses on a neuron are close to the cell body, or proximal, and these are what most people think about when they think about a neuron. They say: oh, these inputs come in, they depolarize the cell, and the cell can fire. But 90 percent of those synapses are further away from the cell body, and for many years people had no idea what to think of them, because if you activate one of those distal synapses, it seems to have no effect on the cell body.
But we now know that the dendrites away from the cell body are active processing elements, and if more than, say, 10 to 20 synapses become active at the same time within a short distance of each other, in close spatial and temporal proximity, they generate what's called a dendritic spike. It travels to the soma and depolarizes the cell. It doesn't make the cell fire, but it brings the cell very close to firing, and we believe this is a prediction: the cell is in a predictive state.
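The idea just described can be sketched in a few lines (a minimal illustration; the threshold value and function names are assumptions for this sketch, not Numenta's actual implementation): a distal dendritic segment whose coincidently active synapses exceed a threshold fires a dendritic spike, putting the cell into a depolarized, predictive state without making it fire.

```python
# Sketch of a neuron with active distal dendrites (illustrative only).
# A distal segment that sees enough coincident active synapses fires a
# "dendritic spike", depolarizing the cell without making it fire.

SEGMENT_THRESHOLD = 15  # assumed: coincident synapses needed for a spike

def is_predictive(distal_segments, active_cells):
    """A cell enters the predictive state if any distal segment has at
    least SEGMENT_THRESHOLD synapses onto currently active cells."""
    return any(
        len(segment & active_cells) >= SEGMENT_THRESHOLD
        for segment in distal_segments
    )

# One segment synapses onto cells 0..19; if 16 of them are active, the
# segment generates a dendritic spike and the cell becomes predictive.
segments = [set(range(20))]
assert is_predictive(segments, set(range(16)))      # 16 >= 15: predictive
assert not is_predictive(segments, set(range(10)))  # only 10 coincident: no spike
```

The key design point is that prediction is local to a segment: the proximal synapses still determine firing, while distal segments only bias the cell toward firing.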
Further, most people think about learning in a neuron as strengthening synaptic weights. This is not really true. We now know that most learning occurs through the formation of new synapses, and synapses can form very rapidly, on the order of minutes or seconds; new ones can appear and they can disappear. This is a much more powerful type of learning than trying to increase the efficacy of a particular synapse. So this is the system that we want to understand. This is you, this is me; this is intelligence.
This is what it looks like, and can we understand in detail how it works? We have an overall theory for this, which we call hierarchical temporal memory, or HTM. It's fairly simple. It starts with the premise that you have a hierarchy of identical memory regions; that's a fact. The next thing we say is that the primary memory going on in each of these regions is a time-based memory. It's a memory of sequences; it's like learning melodies.
What every region does is build a model of time-based patterns, and if it can recognize those patterns, it passes a more stable representation to the next level, so you have increasing temporal and spatial stability as you go up the hierarchy, which is what's observed in the brain. Similarly, you can take a high-level, stable concept and unfold it into sequences going down, creating a very fast-changing pattern, such as my speech right now, which is what's going on in my head. Now, the questions we want to ask are: how exactly does it do this? What does a region do?
What do the cellular layers do? What are the neurons, and how do the neurons implement this? We're making great progress in understanding this. Let's jump in and keep going; bear with me if this is more than you want to hear, but we'll get into some detail here and then we'll come back up to the high level again. So let's keep going down further.
The basic principle we think is going on is that each of the layers is implementing a type of sequence memory. In fact, they use the same neural substrate and the same basic process, but they apply it in different ways; it's sort of a variation on a theme. There are two layers that are basically feed-forward layers, toward the top of the cortex, layers two/three and four, and there are two layers that are basically feedback layers, layers five and six.
Let's just walk through this a bit. Input comes into the cortex, typically into layer four; this is the feed-forward input. Everyone thinks of this as the input from the eyes, the sensory data coming in, which is true. But what most people don't remember, or don't know, is that the cortex also gets a copy of your own behaviors, your own motor commands. So when you move your eyes, which is done by something called the superior colliculus, a copy of that command gets sent to the cortex, so the cortex knows what behavior was just generated.
This is a universal property, and what we think is going on in layer four is that it's building a sensorimotor model; it's doing sensorimotor inference, if you will. To give you an example of that: when you look at a face, your eyes saccade over the face, and you're doing this three to five times a second. You're not aware of it, because the world seems stable, but it's happening.
This next goes on to layer three, and layer three is a high-order inference model. This is where you're just looking at the sequence itself, like a melody or like my speech, and making a prediction of what's going to occur next; all you need to know is the sequence of things coming along. If I know the history over the last some number of notes in the melody, I can make an accurate prediction of what's going to occur next. Then layer three projects to the next level of the hierarchy.
That's your basic feed-forward pathway. Layer five is where you have cells generating motor behaviors. My speech right now is being generated by cells in layer five in parts of my cortex, and these project subcortically to other motor areas. The cortex basically controls other motor areas; it doesn't actually innervate muscles itself, it drives other things that move you. And then finally, layer six is primarily attention; it's a feedback layer. Now, the point I want to make on this slide is that each layer is doing a variation of a common sequence memory algorithm.
If we understand the basic model of one layer, we will understand the basic model of all the layers, because these are universal functions. I want you to understand the premise here, what biology tells us: if you do this in a hierarchy, you've got everything the cortex does. And as you can see here, there are no pure sensory areas of the cortex; that's a misnomer.
There are no pure motor areas of the cortex either; that's a misstatement. It's all sensorimotor, and this same process is being used in every modality and in a hierarchy. If we can understand this, we are a long way toward building brains. So the question is: how does this sequence memory work? Can we really understand this? Can we understand what these layers of cells are doing? And the answer is yes, we actually think we do; we think we've made huge progress on this.
We call this, with a basically pretty boring name, HTM temporal memory. For those of you who might have been following Numenta for a long time: we previously used the term CLA, but now it's HTM temporal memory. This is a picture of one of our simulations; those little cubes are neurons. You can see they're in a layer, arranged vertically in many mini-columns. The colored cubes are the active ones: the red ones are active and the yellow ones are in a predictive state.
Now, I don't have time today to tell you exactly how this works, but you can go learn about it; I'll tell you how to do that in a moment. I just want to tell you the attributes of it. What this system, this layer of cells, does is essentially learn sequences: it recognizes and recalls sequences, and it predicts next inputs. It does all three of these simultaneously; these are not separate steps. It's constantly learning (online learning), constantly inferring over everything it's learned, and constantly making multiple predictions at the same time, not a single prediction but a union of predictions.
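Those attributes can be illustrated with a toy model (this is a sketch of the idea only, not the HTM algorithm itself, which uses high-order cellular state rather than first-order transitions): a memory that learns transitions online while it infers, and whose prediction is a union of everything previously seen to follow the current input.

```python
from collections import defaultdict

# Toy online sequence memory: learning, recognition, and prediction
# happen in the same step, and predictions are a union of candidates.
class ToySequenceMemory:
    def __init__(self):
        self.transitions = defaultdict(set)
        self.prev = None

    def step(self, element):
        """Learn the transition from the previous element, then return
        the union of all inputs ever seen to follow this element."""
        if self.prev is not None:
            self.transitions[self.prev].add(element)  # continuous learning
        self.prev = element
        return set(self.transitions[element])         # union of predictions

mem = ToySequenceMemory()
predictions = [mem.step(n) for n in ["C", "D", "E", "C", "D"]]
# By the second visit to "D", the memory already predicts "E" next.
assert predictions[-1] == {"E"}

mem.step("F")  # learns that "F" can also follow "D"
# Now "D" is followed by a union of predictions, not a single one.
assert mem.transitions["D"] == {"E", "F"}
```

Note how there is no separate training phase: every `step` both learns and predicts, which is the online-learning property described above.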
It has some really nice attributes, and I can tell you because we've been building this for four years; we figured this out four years ago, so we have a lot of experience with it. It's very high capacity: even a small simulation can learn millions of transitions.
It is a distributed system with local learning rules. This makes it naturally fault-tolerant: you can lose neurons, you can lose columns, you can lose synapses, you can have a huge amount of noise in the system, and it still behaves very well, just like brains do. There are no sensitive parameters; it's not hard to get it to work once you've built it correctly. And it actually generalizes: it's able to apply the same learning to new situations that are semantically similar to previous situations.
I want to leave you with an impression about this: this is not just another neural network. There are some things that are really unique about it, and I'll just give you three of them. First of all, it incorporates a fair amount of detailed cortical anatomy. We didn't do this just because we could; we did it because we had a theoretical need to. We've only added features to this model that we know exist in the brain, but that we also need to get this thing to work the way we think it has to work. So we have a model for what mini-columns are doing to create high-order representations; we model certain inhibitory cells, certain connectivity patterns, and so on. No one else does this in an information-theoretic model like this. Second, the whole thing is built on sparse distributed representations. What I mean by that is that at any point in time in the brain, only about half a percent to two percent of the cells are active; most of them are inactive.
This is the key to intelligence. We have figured out the mathematical properties here, and they're very unusual; there's a talk online you can see, which we just posted yesterday, plus some papers describing these properties. This is the key to intelligence: if you want to understand how we're going to build these machines, you have to understand the representations, and sparse distributed representations have these amazing, surprising properties. The whole foundation is built on them; I can't go into detail about that today. And finally, the neurons we model have active dendrites.
We model learning by synaptic growth. Again, we had to do this: this is how we get the online learning to work, and this is how we make a highly predictive system. This is unlike any other artificial neural network you've ever heard of; I'm not aware of any other system that incorporates this kind of level of detail. It may exist, but I don't know about it.
Okay, if you're interested in this, it's completely documented, including the source code, and people have built it multiple times. You can get a lot of information at numenta.com/learn, and there's some new material up there, just posted yesterday, that you should check out.
Let me talk about our research roadmap. Here's the system we're trying to understand: these layers of cells in a region of cortex. Once we can figure that out, we can go build brains.
Where are we? We started with layer 3 because it's actually the simplest one anatomically; it's the cleanest one to look at, and that's the high-order sequence memory. From a theory point of view, and this is purely subjective, I feel like we really understand this very, very well, so I put it at 98 percent, which is about as close as you can get to anything; there are maybe a few things we might get wrong here and there. It's been extensively tested, and we've put it in commercial products.
We know this thing inside and out. On layer four, we figured out about a year ago what's going on there and how it builds the sensorimotor model of the world; it's just a variation of what's going on in layer three, where we're additionally using motor commands. I would say the theory is about 80 percent there. We're implementing it, it's working, and we have a lot more to do, but we're really, really far over the hump on this one; it's in development.
This is what we're working on now. On layer 5, which is where you start generating behavior, we have the big building blocks of the theory: we understand the basic components of how the neurons are doing this and how they're interacting with the rest of the body and the brain. But we haven't started implementing it all, and there are several other big building blocks we're missing, so I put it at about 50 percent, though I feel really good.
We're going to get this one. And then finally, layer six is more complicated; it's a little bit more nebulous what's going on down there. Okay, so that's what we've been doing. Since we started with layer 3, with this high-order inference, and we got it working and the theory hangs together really well, we said: let's try it on real data, let's see about applying it.
So what we did is we said: okay, what could we do with that? Well, this part of the cortex, this layer, basically does high-order sequence inference, and it basically requires that the data be changing on its own; there's no behavioral component to it. So we said: it can work on streaming data. Anything that's changing over time, it should work on, and we can do prediction, anomaly detection, and classification. We said there are a lot of applications here, let's try them out, and I'll show you what we've done. Now, how do you build a streaming data application using this technology?
Well, you take a data stream and you stick it through something called an encoder, which basically changes some number or quantity into a sparse distributed representation; that's the language that we need. Now that I have the sparse distributed representation, I feed it to the HTM, and I get out a stream of predictions or anomalies or classifications. That's what I can do with this now.
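The encoder stage of that pipeline can be sketched very simply (a minimal illustration with assumed parameters; the actual NuPIC encoders are more elaborate): a scalar encoder maps a number onto a fixed-width block of active bits, so that nearby values share active bits and therefore share semantics.

```python
# Minimal scalar encoder sketch: a value in [lo, hi] activates a
# contiguous run of W bits out of N_BITS, so nearby values overlap.
N_BITS = 100   # total bits in the representation (assumed)
W = 11         # active bits per encoding (assumed)

def encode_scalar(value, lo=0.0, hi=100.0):
    """Encode a scalar as a set of W contiguous active bit positions."""
    span = N_BITS - W
    start = round((value - lo) / (hi - lo) * span)
    return set(range(start, start + W))

# Nearby values share most of their active bits; distant values share none.
assert len(encode_scalar(50) & encode_scalar(52)) > 5
assert len(encode_scalar(10) & encode_scalar(90)) == 0
```

The output of such an encoder is what gets fed into the sequence memory; the overlap structure is what lets the system generalize between similar input values.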
There are many, many sources of streaming data. You know, John mentioned earlier that we're going to be awash in all this data; most of it is streaming data, anything you can regularly get from applications and servers, medical data, industrial equipment, social media. All these things can generate millions and millions, billions, of data sources that are changing over time. So we potentially have a way of modeling all of that. Now, what kinds of encoders do we have?
I won't tell you how the encoders work here; it's kind of cool, and I'll talk about it later. But we built ones for numbers and categories and dates and times, we have one for GPS, and even one for words; I'll talk about this in a moment. So we have everything we need here, and we went and built some applications. Here are six applications I'm going to briefly talk about; they're all streaming data applications. The top three are all about anomaly detection, and they're similar; the bottom three are very different.
We started with server metrics. We said: let's see if we can model servers and detect when they are in anomalous states by looking at the temporal characteristics of their metrics. That's the one we developed first, and we've turned it into an actual product called Grok. The way we do that is we take some server, we take a bunch of server metrics off of it, things like CPU utilization and file access, and we run them through encoders. We build a model for each metric.
We actually started by assuming we'd combine these metrics together into a single model; we found it works better to build lots of separate models and then combine them later. So now what I'm doing is basically modeling the temporal characteristics of various data streams. We detect when those temporal characteristics change significantly, and we say: there's something unusual going on here.
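The anomaly measure itself can be sketched simply (a sketch consistent with Numenta's published raw anomaly score; the column sets here are invented for illustration): it is the fraction of currently active columns that the model did not predict at the previous step.

```python
# Anomaly score sketch: fraction of active columns the model failed
# to predict. 0.0 means the input was fully expected, 1.0 means the
# input was completely novel.
def anomaly_score(active_columns, predicted_columns):
    if not active_columns:
        return 0.0
    unexpected = active_columns - predicted_columns
    return len(unexpected) / len(active_columns)

assert anomaly_score({1, 2, 3, 4}, {1, 2, 3, 4}) == 0.0  # fully predicted
assert anomaly_score({1, 2, 3, 4}, set()) == 1.0         # nothing predicted
assert anomaly_score({1, 2, 3, 4}, {1, 2}) == 0.5        # half expected
```

Because the model predicts a union of possibilities at every step, this score stays low on familiar patterns and spikes only when the temporal structure of the stream genuinely changes.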
As you can see here, the height of the graph, the little bars, and the color indicate a highly anomalous state, so you only have to look at the top few things; it's like a little dashboard, and it's continually updated. It's running on my phone right now, and these bars move across over time as the servers perform. We also have a web dashboard, but I'm just going to show the mobile one.
So what kinds of anomalies can it detect? We didn't know how well this was going to work. We didn't tell the system at all what any of these numbers mean, what any of these metrics are, that they come from servers, or anything like that; we just said: here's a stream of data. We had no idea what it was going to find, and it turned out to be really, really good, much better than I even imagined it could be.
I'll just give you some simple examples. In these pictures, what you're seeing in the top bar is a server anomaly; that's in the white area. The middle part is the actual metric data that was anomalous; that's the black with the blue lines. And underneath it you'll see the anomaly score for that particular metric, so you can just look at when the anomaly occurred and the graph around it.
You can see some very simple things: sudden changes, slow changes, sudden subtle changes in regular data. In that third picture, you might see there's a slightly different blip on the right-hand side where the anomaly occurred; because this is a regular data stream, the system says that is a very highly significant event. On the one on the right, you see it's a very noisy data stream and any particular spike doesn't mean anything, but you can still capture these sorts of changes. Now, here's where it got interesting.
This is a single server, two different metrics, two different models, and both caught an anomaly at the same time, even with something as simple as CPU utilization; I think this one is disk write bytes. If you look at that data, that blue graph, you can't see what's going on; you wouldn't pick that point in time as being highly anomalous. Well, I wouldn't, but this system is statistical, and you can show mathematically that this point is highly, highly unusual given the recent history of the system, say over the last few weeks. What occurred here?
This is a build server, where every time an engineer checks in code, it starts a build process. What happened on this particular day is that an engineer started the build process manually at that point in time. That's it: it was just started manually as opposed to automatically. You and I can't see the difference there, but it catches it. It says: I've never seen this, I've caught it in two separate models, I'm certain of it, something is unusual here. Now, in this case it could be benign.
It could be a risk, something that shouldn't be done that way, or it could be something malicious; we don't know. But you don't get many anomalies, and when it catches them, they're really important. So we saw this and said: this is really cool, what else could we apply it to? We said: can we apply it to human metrics? Like, you're sitting at a computer; can we look at your keyboard access and your file access?
Can we tell if someone's changed their behavior, or if someone else is using the computer? It turns out we can; it works very nicely. Then someone came to us and asked about financial data: can you predict volumes of stock trading? We said we don't know. They gave us the data, we looked at it, and it turns out we did a really great job at it.
In a matter of an hour, we had results that equaled the best in the industry. Then we said: we can turn this into anomalies. So what we're doing here is actually monitoring thousands of equity trades and finding when there are subtle anomalies in the volume of those trades. We're now trying to add social media data to it; we're in the process of doing this right now, trying things like Twitter and Tumblr to see if we can find anomalies in there too and combine those. So someone might say: hey,
is there something unusual going on with this company? This is not just for people who trade; it could be people who want to track their customers, to see what unusual things are going on in their customer or procurement base, things like that. Anybody would want to find that. So we think this is cool, and we're going to have a product this next year.
Now here's something totally different. This is a team of researchers at Berkeley. They want to use EEG, scalp recordings, to control things like prosthetic arms and robots and things like that. So they said: can we take this data, run it through the HTM, classify it, and try to say: am I thinking of going left? Am I thinking of going right, or up, or down, that kind of stuff? They just did this work two weeks ago, and they got really great results.
I won't claim success here because I think it's too early, but I do think this is the kind of problem we should be able to do a good job on. Here's a company in Europe called Peoneck. They are using this technology to basically track ships through the harbors of Europe. They said: look, can I learn the typical spatiotemporal patterns of ships moving through harbors and detect if they start moving differently than normal? It turns out it works very nicely.
We created a very cool encoder for GPS, so you can feed GPS coordinates into the HTM and it turns them into SDRs. So as a ship moves through the harbor, you can tell what kind of motion it is. We don't tell it what it should be looking for, but it can detect, we've found so far, changes in velocity, changes in direction, and being outside whatever path is typical. You don't have to tell it anything it's supposed to know; you just say: here's a whole bunch of ships, let's learn for a while, and now you tell me when something is unusual. And one thing I didn't mention is that it's continually learning, so if a new pattern becomes normal, after a while it says: okay, that's not anomalous anymore.
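One plausible way a GPS encoder like that could work is sketched below (an assumed design for illustration, not Numenta's actual coordinate encoder): quantize the coordinate to a grid cell at some resolution, then hash that cell and its neighbors to bit positions, so nearby positions produce overlapping SDRs.

```python
import hashlib

# Hypothetical GPS encoder sketch: quantize (lat, lon) to a grid and
# hash each nearby grid cell to a bit, so nearby coordinates share bits.
N_BITS = 1024       # size of the output representation (assumed)
RESOLUTION = 0.01   # grid size in degrees (assumed)

def encode_gps(lat, lon):
    """Encode a coordinate as the set of bits for its 3x3 grid neighborhood."""
    gx, gy = round(lat / RESOLUTION), round(lon / RESOLUTION)
    bits = set()
    for dx in (-1, 0, 1):
        for dy in (-1, 0, 1):
            h = hashlib.md5(f"{gx + dx},{gy + dy}".encode()).digest()
            bits.add(int.from_bytes(h[:4], "big") % N_BITS)
    return bits

a = encode_gps(48.2082, 16.3738)  # a point in Vienna
b = encode_gps(48.2083, 16.3739)  # a few meters away: same grid cell
c = encode_gps(51.5074, -0.1278)  # London: entirely different cells

assert a == b            # same grid neighborhood, identical encoding
assert len(a & c) < len(a)  # distant points do not share the encoding
```

Feeding such encodings in sequence gives the HTM a stream where position, and therefore velocity and direction over time, has the overlap semantics it needs.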
Now, the last example here is about natural language, and this was done with a company called Cortical.io; they're based in Austria. They read our papers and said: holy smokes,
this is cool. They do natural language processing, and they felt that sparse distributed representations were the key to understanding natural language processing. I agree with them, because this is the language of the brain, and it has all these nice properties of semantic representation. So they created an interesting tool. They take a corpus of documents, like Wikipedia, and they train the system on it; I can't explain how it does this in this short time, but let me tell you what you get out of it.
You can give it a word or a document and say: give me a sparse distributed representation. There's a picture of them here. There are something like 16,000 bits; you can see most of them are off, and the little dots are the one bits that are on. Now, one of the properties of sparse distributed representations is that the bits mean something; they have semantic meaning. You may not be able to say what it is, but they have semantic meaning.
So if I have two representations and they share bits in the same locations, the same part of the array, then they're sharing semantic meaning, and this doesn't happen by chance; it's meaningful. So you now have these representations of words which capture the semantic meanings of the words. All right, what can you do with this? To start, you can do some very simple things with the words themselves. You take the representation for the word apple, and you take the representation for the word fruit.
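The kind of set operations this enables can be sketched as follows (the bit positions here are invented for illustration, not Cortical.io's actual representations): overlap between word SDRs measures shared meaning, and operations like set subtraction strip the shared semantics out of a word.

```python
# Toy word SDRs as sets of active bit positions (invented positions,
# purely illustrative). Shared bits represent shared semantic meaning.
apple  = {3, 17, 42, 80, 95, 120}
fruit  = {3, 17, 42, 77, 101, 130}
server = {200, 210, 220, 230, 240, 250}

def overlap(a, b):
    """Number of shared active bits: a simple semantic similarity."""
    return len(a & b)

# "apple" and "fruit" share semantic bits; "apple" and "server" share none.
assert overlap(apple, fruit) > overlap(apple, server)

# Set subtraction strips the shared "fruit-ness" out of "apple",
# leaving the bits specific to apple.
assert apple - fruit == {80, 95, 120}
```

Because the representations are sparse in a very large bit space, any substantial overlap is semantically meaningful rather than accidental.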
This is really cool; it works over all these different things you can do with them. Then we said: oh, these are the other words that are the nearest matches after that. So then we said: let's train the HTM on series of words, just sequences; it's a high-order temporal pattern of words, and what can we do with that? It's a very simple little system that's doing this.
So we did a first test where we said: okay, create a bunch of three-word sentences, where an animal either eats or likes something, followed by what it eats or likes. So: an elephant likes water, an elephant eats grass, things like that. We trained it on 50 or 60 sentences like this, and then we gave it a new sentence. What I mean by new is that the system has never seen the word fox, and we feed in "fox eats". The HTM always makes predictions, so it's going to predict something.
I believe this is the first time anyone, even in a small little example like this, has taken the brain's representations and the brain's neural mechanisms and shown the core of how language is processed in the brain. I'm not saying this is exactly it, but we're getting close to the real way this is happening in brains.
This system is completely unsupervised, it does semantic generalization, and it actually works across languages; you can mix and match languages. They did a very cool job at this. We think there are many, many applications here. We're excited about it, they're excited about it; we're not talking about what those applications are yet, but we think we can do some really cool things that no one else has been able to do before.
Okay, I want to make one point here. Of these six applications, only one is a real product; the others are just demonstrations, but the code is available for them. They all run on the exact same code. I mean not a recompilation, not a re-parameterization, not a tuning: the exact same code. We didn't tweak anything. We just said: change the data type, new encoder, run it through the system, what do you get? I think that's a very powerful statement.
It's getting at the core, fundamental flexibility of these algorithms. If you asked a bunch of data scientists to do something like this, they wouldn't come up with one algorithm to do all these things, and this does really well on all of them without any modification whatsoever.
A
Let's go back to our research roadmap. I talked about how we did layer three and the applications we can build there. We're in the process of doing this layer, layer 4, the sensorimotor inference. What kind of applications can we build there? Well, imagine this again: this is like your eyes saccading over an image, so we can do static pattern recognition. We can work with static data, but we have to have an active learning system. You have to move through the data instead of the data moving itself; the data is more static.
A
You move through the data, and that's how the brain learns. So we can do classification and we can do prediction in this case. We are currently working in a vision paradigm because that's a very well understood problem, so we're working on image classification, but we're doing it the way the brain does it. But there are many applications here.
A
I think what you want to think about is anything which has spatial structure that you want to classify. So you can imagine some sort of network, whether it's people networks or computer networks, and you want to classify it. You want to say: okay, I'm going to have the system look through the data, and it's going to come back to me, make classifications, and predict what it's going to see. You could use this in analyzing corporate structures or financial structures or social media network structures and so on.
A
I believe you'll not only get robotics, of course, but you'll be able to do things which are virtual, like smart bots or proactive defense. Now you're not just moving through the data in a simple way; you're moving in a way where you're trying to achieve a goal or an end game. I think we have the basics of this down, in terms of understanding how it works, but there's a part we haven't started working on yet; we need to finish the other stuff first. And finally, the last layer.
A
Layer six is really about enabling larger hierarchies with multiple sensory modalities. Okay, we're very transparent in our research. All these algorithms are documented. I wish they were better documented, but they're documented well enough that many people have independently created them in multiple languages around the world, so that proves they're documented well enough. We have an open source project called NuPIC, which is at numenta.org.
A
There we've placed all of our own software, which is under a GPL license. We also have a commercial license. We even post our daily research code, so you can look at all the messy stuff we're doing, and we have active discussion groups for theory and implementations. We have lots of collaborations. We have a small collaboration with a group at IBM Almaden Research; we've been looking at these algorithms. We have a collaboration with DARPA, who's trying to get a program going for the cortical processor, which is based on HTM, and with little companies like Cortical.io.
A
We're very open; we're just trying to make this thing happen. That's what we're trying to do, and anything that works, we're open to. This is just a chart of our open source community. We started it about 15 months ago and it's been growing very nicely, continuous growth, and more and more people are getting excited about this. More and more people are actually understanding it. There are some people out there who really deeply understand what we're doing, you know, as well as we do. It's just scary, but that's happening.
A
Okay, I'm going to end my talk with a story, and it's a true story.
A
21 years ago I gave a talk at Intel. They have an annual meeting where they bring in the top 200 managers in the company from around the world, plus the exec staff, to do business planning, and as part of that meeting they have an invited outside speaker. 21 years ago, I was the invited outside speaker.
A
They did not believe a word I said, and they said to me: well, what are the applications going to be for these mobile computers? Why are a billion people going to buy these things? And I honestly said: I don't know. I said: here's what I do know. I know it's going to be primarily about information access, because that's what you can do on a small device.
A
I said: I know some simple things, like a calendar and an address book; people are going to want those. And I said: I know people are going to want to access information on a small device, and I know we will be able to build machines that will be capable of that. But I don't know what the applications are, but I tell you, it's going to be great.
A
That didn't work. Three years later we introduced the Palm Pilot, which was essentially a calendar and an address book, but it was a computer, and a year after that we had thousands of applications on it.
A
Three years after the Palm Pilot we introduced the Treo, which was one of the first smartphones, and which I designed. And today, of course, 20 years later, I bet every one of you has one of these computers in your pocket, and it is the driving force. Now, here I am today, and I feel déjà vu: I'm talking not about the future of personal computing, but the future of computing.
A
I've said we've had 60 years of one paradigm. I think we're about to start the next 60 or 100 years of a different paradigm, and the future is about machines that learn. I'm very confident saying that those machines are going to be built on the principles of the neocortex. Sparse distributed representations are going to be essential, and temporal learning algorithms, distributed temporal learning algorithms, are going to be part of this. I'm very confident in this.
A
These principles I talked about are going to be the foundations for machine intelligence. Once again, I'm speaking to a company that I think should be a leader in this field, and maybe you're going to be. But you know, IBM has all the right roots and all the right history and all the right capabilities to do this. One of those, by the way, is that you have to build really neat, cool hardware, unlike anything that's ever been designed before, and how many companies can do that? You know, IBM's one of those.
A
I showed you what we're going to do next, and I could sort of lay out a roadmap. But those applications, as cool as they are, and maybe as big as they might be, are kind of like the calendar and the address book. You know, we can't really know, and I can't pretend to. But I've shown a roadmap to get there, and I'm very confident this is going to happen. This is not going to take a long time; I am sure in two to four years this is going to be going.
A
B
Well, I will tell you that this is very different from that luncheon. We've made a decision, the community's here, and it's really about building this out. So, Jeff, we've got time for some questions. Let's take those from the audience, or from those at the remote sites or in our remote labs.
A
B
A
The mic? All right. Do you want to call out people, or should I call? Why don't you look for friends.
A
C
So the one thing I wonder about is your idea that synapses grow de novo, as opposed to being tunable. Because my sense of neurobiology, cancer biology, and genomics generally is that I don't think of it as a new-versus-old set of growths in a cognitive learning system; rather, in vivo, in the brain at least, and in biology, I think of it as tunable, much more subtle than you're suggesting. So have you thought about that, and have you thought about...
A
You know, increasing the efficacy of a synapse and growing a new one are very, very similar; it's just starting from zero instead of, you know, going from a half to one. It's not really that different than you think. It just means you have a much bigger information capacity if you're able to form new connections. But I want to make a point about real synapses.
A
They often don't work at all. You can actually potentiate a synapse and it may not release any neurotransmitter; they're really just very stochastic devices. And so any model that requires an information-theoretic synapse with even one digit of precision, if it requires that, is not biologically possible.
A
So I'm not saying you can't strengthen synapses by tuning them; it's just that you can't really rely on it. But going from no synapse to a real synapse is a very strong event. And so I'm not trying to say it's not possible, and I'm not trying to say it's different. It's just a much more powerful way of learning, and we know it's happening. And by the way, there's another thing, a real advantage of it.
A
It allows the system to basically experience a pattern a few times and start forming a connection before it actually has any effect. So you might not want to affect behavior until you've experienced something a few times. And so for the growth of a synapse, what we do is increase something called the permanence. Where you have an axon and a dendrite near each other with nothing between them, that's a zero permanence, and as you increase the permanence, which is a Hebbian-type learning, you're essentially growing that filopodium from the dendrite toward the axon. When you make the first connection, we then give the strength of the synapse a one; it's a weight of one. But it's not very permanent, and can be very easily forgotten. With increased training it becomes a longer-lasting memory and it's harder to forget. So we've chosen that paradigm.
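The permanence scheme just described can be captured in a few lines. This is our simplified sketch of the idea, not Numenta's exact implementation, and the threshold and increment values are arbitrary choices for illustration: the synapse's weight is binary, while a scalar permanence grows with repeated exposure before the connection has any effect.

```python
# Simplified sketch of permanence-based learning; thresholds are
# illustrative choices, not values from the talk.

CONNECT_THRESHOLD = 0.3   # permanence needed to count as a synapse
INCREMENT, DECREMENT = 0.1, 0.05

class PotentialSynapse:
    def __init__(self):
        # axon and dendrite near each other, nothing between them yet
        self.permanence = 0.0

    @property
    def weight(self):
        # binary weight: connected or not, never a graded strength
        return 1 if self.permanence >= CONNECT_THRESHOLD else 0

    def reinforce(self):  # Hebbian-style: pre and post active together
        self.permanence = min(1.0, self.permanence + INCREMENT)

    def weaken(self):
        self.permanence = max(0.0, self.permanence - DECREMENT)

s = PotentialSynapse()
s.reinforce(); s.reinforce()
early_weight = s.weight        # still 0: forming, but no effect yet
s.reinforce()
connected_weight = s.weight    # now 1, but barely permanent
```

A newly connected synapse contributes fully yet sits just past the threshold, so a little forgetting disconnects it, while a long-trained synapse with permanence near 1 takes many weakenings to lose, matching the "easily forgotten at first, harder to forget later" behavior described above.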
A
It's just a much more powerful information paradigm. And you know, when you learn something basic, the whole learning model here is not just tweaking something. It's like: I need to learn something new. This is a new pattern, this is a new idea, this is a new animal, and you have to really lay these foundations down quickly. So maybe what I'm trying to say is that overall there's some biological evidence for what we're doing, a lot of it actually, and also, I don't think it's diametrically opposed to the principles that you adhere to anyway.
A
D
I'm just wondering: you very much emphasized the uniform principles of organization in the mouse and rat brain as compared to the human brain, and if you just look at the thickness of the cortex, this is of course true; it's maybe larger by a factor of two in the human brain as compared to the mouse brain. But when you look at the number of cells, it's about 2,000 times larger.
D
When you look at the connectivity, it's about 50,000 times larger in the human brain as compared to the mouse brain, and this is not only a question of quantity. It's also a qualitative question, because in the human brain there are more areas, areas which are not present there, and there are other gene expression patterns.
A
Yes, yes, I'm very familiar with this line of thinking. So look, with what I presented today, I tried to make it look simple, right? That's my goal today, and as a scientist who's trying to understand fundamental principles, you have to try to get at the core principles. Now, cortex is not nearly as simple as I pointed out here. There are other structures involved; we study the thalamus, and, you know, there's tons of stuff going on that I didn't talk about. Now, the cortical world:
A
It can be divided into two ways of thinking about it. One way is to say: what are the common principles that are operating everywhere, and what are the variations on them? There are variations; not all cortical regions are identical. There are variations on the theme going on here. The way I view it is that you want to understand those common principles first, and then you can ask: how do I deviate from that? Why does a rat have a barrel cortex? Why did it...
A
Why does it form that for the whisking sense? Why do we see, you know, a striped layer 4 in V1 in certain mammals but not in other mammals? Those kinds of questions. Why do we see certain cell densities? Nothing I presented here, I believe, was incorrect.
A
There are further variations on a theme that evolution has discovered, and so it's a perfectly good line of research to say: what are the differences between these areas? Most people focus on that. It's almost like saying: well, this is a vision area, it must be different; or, there's a language area, it must be different; let's try to find some magic cell over here.
A
That cell may exist, but the point is, I want to try to find those common principles, and once you find the common principles, then you can do variations on a theme with them. From a theorist's point of view, I believe that's the way to go. The evidence for common principles is unequivocal.
A
The evidence for variations on them is also unequivocal. But we choose to find the common principles first and understand those in detail before we go and ask why this species or this region is slightly different. Slightly different; we're not talking about radically different. That, I think, would be a mischaracterization of it. So again, I try to stick to themes which I can justify across all species and basically across all areas, and I didn't get into variations like: oh yeah, well, why is the striate cortex like that, you know?
A
You know, that's a perfect example. In humans and certain mammals we have layer 4 subdivided in V1, but there are other mammals that don't have a subdivided layer 4, and they see too; they may just not see as well as we do. So let's not worry about that detail yet; we'll come back to that later. That's my basic answer to that question.
E
Jeff, great work, and thank you for sharing. My question is along the same line. You're doing your common core model here, but in your hierarchy, are you actively looking toward the differences that might be involved in, for example, a youth learning pattern versus the common adult model you're on right now, since there seem to be some differences in synapse formation there?
A
Sure. I mean, there's a lot that goes on. We could talk for a long time here about how the synapses are formed and at what level the connectivity is developed. You know, we have a lot of advantages in software that the brain doesn't have. So, for example, we know that when a mammal is born, especially a human, there's a dense over-connectivity at birth and in early life, and it gets pruned back very quickly. Now, you know, we can speculate about
A
why that is. We can say: look, the neurons don't know where they're supposed to connect. I didn't explain the mechanisms here, but they're basically trying to find other cells that predict their own activity, other cells that are active before they become active; that's the sequence memory. And they don't know where to look for that, right? And we do know in the brain that there are certain directions they do need to look in if they're going to find the right pattern. And so you can start at birth:
A
You can have this overabundance of connections, and then you see which ones get established, and then you forget the other ones, right? As an adult it's harder to learn new things, because you don't have that ability. A neuron doesn't say: I need to go over there, about half a millimeter away, and find that cell. They can't do that; they can only connect nearby. So we don't have to deal with that issue in our models. We can start off by saying: hey, look.
A
We can give it, in software, and software is really good at this stuff, this huge connectivity matrix. We still have topology; a cell is still going to connect to some other cells nearby. But I can just essentially say: you have potential synapses everywhere around here, and you'll find the ones that you need to connect, which axons to connect to, where brains, real biology, have to grow these things. And, you know, at birth you have one type of thing going on; later in life,
A
we know that if you want to learn new things, you have to actually progressively get closer to that, so the dendrites and the axons can grow. So it's a complex field your question relates to, but again, from a technology point of view, we don't have to deal with that. We can just step back and say: okay, you know, we can skip that part. We don't have to say: at birth it looks like this.
A
We can just have a large connectivity matrix that's very sparsely connected; it doesn't cost as much, and so we just have that all the time. Our systems are like brains at birth: they never have to prune back; they just have potential connections everywhere. Yes, Guru.
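The "potential connections everywhere, never prune" setup can be sketched as a fixed, sparse potential pool per cell, sampled from its topological neighborhood. Pool size and neighborhood radius below are illustrative assumptions, not values from the talk.

```python
import random

def potential_pool(cell, n_cells, radius=50, pool_size=10, seed=0):
    # each cell gets a fixed random sample of potential partners drawn
    # only from its topological neighborhood, like a brain at birth;
    # learning selects among these and never grows new entries
    rng = random.Random(seed * 1_000_003 + cell)
    lo, hi = max(0, cell - radius), min(n_cells - 1, cell + radius)
    neighbors = [c for c in range(lo, hi + 1) if c != cell]
    return set(rng.sample(neighbors, min(pool_size, len(neighbors))))

# a sparse "connectivity matrix": each cell can ever reach only ~1%
# of the population, all within its local neighborhood
pools = {c: potential_pool(c, 1000) for c in range(1000)}
```

Because the pools are deterministic and fixed, the model never pays for dense infant-style over-connectivity or for pruning; the permanence mechanism described earlier does all the selecting within each pool.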
F
Jeff, this is a classic question; I'm sure you've heard it many times before. So: we invented flying machines that don't look like birds, right? So why do you think computing machines are going to look like the brain? And, more importantly for me right now, what is an alternative architecture that you may have seen in your research that could get to a similar intelligence?
A
Yeah, I do hear this a lot. You know, if you go back and look at the history of the Wright brothers, this is a misapplied analogy; it's like saying: well, planes don't flap their wings. The Wright brothers knew they had to understand the principles of flight, and they studied birds to understand the principles of flight. They knew that the principles of flight had to do with wing design.
A
They did wind tunnel tests, and so on; they knew they had to get the principles of flight right. They knew that propulsion was something completely different. So airplanes share the principles of flight that birds use, but the principle of propulsion wasn't important; that wasn't the thing they were trying to copy. It doesn't matter if you have a propeller or a jet engine. But the principles of flight are the same, and they knew that. The same thing applies here: the principles of intelligence are important; the actual implementations are not. Now, this could be seen as
A
you know, conjecture, or subjective. Why do I think these are the principles? Are there other ones like them? My working assumption is this. Now, I've been at this for a long time; over 30 years I've been working on this, and, you know, I originally started by doing a literature search of AI, a larger search of linguistics, and a literature search of neural networks. I read thousands of papers on all these things, and I watched the world evolve. I watched artificial neural network experts come by.
A
I observed the 1980s, back-propagation and so on, and I kept saying: you know what, these guys aren't getting any closer. And it seemed obvious to me that you ought to look at a brain if you want to build a cognitive system. What's the only example we've got? A brain. Now, why would I look anywhere else? Now, we might have some hubris and say: well, we're smart enough, we don't need to look at brains, we'll figure it out on our own. Well, maybe that's true; that could have been true.
A
It doesn't appear to have happened; at least I haven't seen it happening. So then, when you go look in brains, you find surprising principles. You find it's an amazing, surprising thing that there is a common architecture across all these different modalities. And then we learn about sparse distributed representations; those are amazing. These are things I would never have thought of. I wouldn't have thought of a hierarchy of similar regions; I would never guess that stuff. So at some point we can throw away the brain; we'll know enough.
A
We'll just do our own thing. But that hasn't happened yet. And if you think you can do it some other way, that's great; I just don't know how, and I don't know what it is. I've never seen anything else like it. So to me this is the way to go until we know better. All right.
G
Very interesting talk, and I agree with the question there, but I want to bring this to a higher level, rather than the level of neurons and things like that. Because one of the points here is, you know, there's a lot of evidence in machine learning around sparse representation; it's not a new concept.
G
One of the questions I have for you is: there's a lot of infrastructure that you're building here, a lot of things. Is there any evidence that this is buying you anything that doesn't currently exist? If you just threw random forests, you know, machine learning techniques, at the data, would you get anything different, or any better? That's the first question.
G
The second question is, you know: fundamentally, the cognitive system works at a symbolic representation, and there's clearly a grounding problem between the data input and that symbolic level. You know, we have one of the foremost researchers here, Anne Treisman, who's done work on this since the 80s: the binding problem, being able to differentiate between different pieces of information and understand how they're different. So the next time I see you, probably 90 percent of the optical input will be different; that is, you're probably going to be wearing different clothes.
G
I'll still recognize you, because I understand where the important information is. But, more importantly, I can tell you what's different between those two things. I can say his shirt is different, not which pixels are different or which neurons firing are different. So I want to make sure of that as we're moving forward in this. I agree with the existence proof, right? The human brain does things that nothing else can do, so we have an existence proof.
A
Okay, those are two very different questions, and so I'm going to take them both, even though you said only one question. So let's get back to the first one: can we do a better job? Is this better than applying random forests or some other type of learning technique? The analogy I made at the very beginning is that, you know, I'll never say it's better than, you know, taking three PhDs, sticking them in a room, and having them try to solve a problem. Can I be better than them?
A
I don't know; probably not. The point, the argument of my analogy at the beginning, is flexibility. That's what drives platforms. And what we've found so far in our testing and other people's testing, not just us, other people's testing, is that these networks very quickly get up to parity, or very close to parity, with the best solutions out there, and then people get into these benchmarks
A
where they're saying: well, I can get three percent, one percent, half a percent better, blah blah blah. But we got there in half an hour, and they spent, you know, three months; literally, that's what it's been like for us going through this. Also, most of the existing machine learning techniques do not handle time very well at all; they're just not about time-based patterns, and so it's sometimes hard to actually make equivalent comparisons on these things. But again, it's really the flexibility that matters, and that's the key there for that piece.
A
A sparse distributed representation basically represents, in a distributed fashion, all the attributes of something, semantically, and you're really just doing a bit comparison between patterns to understand what is semantically similar and what is semantically different, and you can still recognize them as the same thing. It really is the key to solving the representation problem in AI.
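The bit-comparison point can be made concrete with a toy example. The tiny hand-built bit sets below are illustrative assumptions, not real SDR encodings: similarity is just the overlap between active-bit sets, and the differing bits name what changed, echoing the "same person, different shirt" case from the question.

```python
# Toy SDRs: shared "identity" bits plus attribute-specific bits.
# These sets are hand-made for illustration, not learned encodings.

def overlap(sdr_a, sdr_b):
    # semantic similarity is simply the number of shared active bits
    return len(sdr_a & sdr_b)

person_blue_shirt = {1, 2, 3, 4, 5, 20, 21}  # person bits + blue-shirt bits
person_red_shirt  = {1, 2, 3, 4, 5, 30, 31}  # same person bits, new shirt
stranger          = {7, 8, 9, 40, 41}

same_person  = overlap(person_blue_shirt, person_red_shirt)  # large overlap
unrelated    = overlap(person_blue_shirt, stranger)          # no overlap
what_changed = person_blue_shirt ^ person_red_shirt          # the shirt bits
```

The symmetric difference reports the change at the attribute level ("the shirt bits flipped") rather than as raw pixel or neuron indices, which is the kind of answer the questioner asked for.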
A
You know, an AI researcher once came to the Redwood Neuroscience Institute, and he said to me, and he had just retired after, I mean, a lifetime in AI, he said: you know, the problem of representation is the biggest problem in AI. And then he said: no, it's the only problem in AI, the problem of representation. And I didn't understand what he meant by it at the time. I now understand what he meant by it, and I now understand the solution to it.