From YouTube: SDR Capacity & Comparison (Episode 2)
Description
In this episode of HTM School, we formally introduce the Sparse Distributed Representation (SDR).
Properties of Sparse Distributed Representations and their Application to Hierarchical Temporal Memory: http://arxiv.org/abs/1503.07469
SDR Visualizations: https://github.com/nupic-community/sdr-viz
Intro music: "Books" by Minden: https://minden.bandcamp.com/track/books-2
Hi, I'm Matt Taylor from Numenta, and welcome to HTM School. I'm really excited for this episode, because this is the first time we're going to delve into the world of sparse distributed representations, or SDRs. So what's the big deal about SDRs? We sometimes refer to them as the data structure of the brain, and for good reason: it turns out that they're used pretty much everywhere.
Let's take a look at an example. Suppose you were playing a musical instrument. As you listen to the instrument, some of the neurons in your auditory cortex become active when they hear specific frequencies, but most of them stay silent. The same goes for your visual system. In every sensory region, there is a sparse pattern of activity that represents the perception of the world at any point in time.
At the other end of the scale, your frontal cortex is involved in planning the music that you're about to play. It creates SDRs that represent that plan and sends those SDRs down to lower regions of the cortex. Your motor cortex contains neurons that represent specific muscle movements. A small percentage of these neurons fire in response to the top-down SDR representing the plan, as they recognize their part in it. They fire in sequence, and the resulting SDR controls your finger movements over time.
A
Similarly,
through
STRs,
you
predict
how
your
muscle
movements
will
create
sound,
and
you
will
use
them
as
well
to
focus
your
attention
on
the
specific
activities
that
are
creating
those
sounds
essentially.
Strs
are
used
in
the
cortex
for
every
aspect
of
cognitive
function
for
every
sensory
modality,
so
we're
going
to
jump
in
to
some
of
the
technical
details
of
STRs
now,
but
first
I
think
I
should
go
over
a
few
terms.
So,
let's
look
at
this
graphic
and
let
me
define
a
few
things
for
you.
First of all, I've been talking in previous episodes about bit arrays and representing them as grids or arrays of ones and zeros. From here on out, we're going to look at SDRs in a more visual format like this, where empty boxes represent zeros and colored boxes represent ones. So let me define some of the terms we're going to be using for SDRs throughout this episode. First of all, n is the array length; in this case there are 256 boxes, so n is 256. We'll use w for the number of on bits.
The sparsity is the percentage of bits that are on. Very simple. Now, I know this is sometimes called density, but we always call it sparsity in the HTM world, because all of the SDRs we deal with are sparse. So we're going to first talk about SDR capacity. In the last episode, I talked about the capacity of a dense representation, or dense bit array. That formula was 2 to the power of the length of the array.
In this case, that would be 2 to the 256th power. But when we have a sparse array and our number of on bits is restricted, our capacity formula changes a bit. SDRs are much less capable of holding information than dense arrays, but it turns out that in the long run it doesn't really matter very much, and we'll explain why soon. Let's take a look at the capacity formula that we use for SDRs.
So if we have an array of 16 bits like this one and a w of zero, nothing is on. It makes sense that the capacity is one: we can only store one value in this empty set. If we dial w up to one, that single on bit could go in any one of sixteen places, so the capacity is 16. And if we dial it up even further, the capacity begins changing very quickly: for a w of two, the capacity is 120; for a w of three, it jumps to 560.
That's because the formula for SDR capacity involves factorials. The formula is the number of bits in the SDR, factorial, divided by the number of on bits, factorial, times the number of off bits, factorial: n! / (w! × (n − w)!). It's not a very complicated formula, but it does give us a way to show how many values we can fit into an SDR. So let's take a look at a 256-bit SDR.
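As a sanity check on these numbers, the capacity formula is just the binomial coefficient, which Python's standard library computes directly (a minimal sketch; `sdr_capacity` is my own naming, not something from the video):

```python
from math import comb

def sdr_capacity(n: int, w: int) -> int:
    """Number of distinct SDRs with n total bits and w on bits:
    n! / (w! * (n - w)!), i.e. the binomial coefficient C(n, w)."""
    return comb(n, w)

print(sdr_capacity(16, 0))   # 1, the empty set
print(sdr_capacity(16, 1))   # 16
print(sdr_capacity(16, 2))   # 120
print(sdr_capacity(16, 3))   # 560
print(sdr_capacity(256, 5))  # 8809549056, the "8.8 billion" at ~2% sparsity
```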
Once again, at a sparsity of about 2%, we get 8.8 billion. As we dial up the sparsity, because of the combinatorial explosion of the factorials in this equation, the capacity goes up very quickly: we're at 11% sparsity at this point, and at 20% sparsity the capacity is very, very big, as you can see. So let's put this down, slam this all the way up, and look at an array of 2048 bits.
So at a sparsity of, let's get it to 2%. All right, there: 40 bits on, a sparsity of 2%. There is an enormous capacity for this array already, and this is essentially the entry point for the size of SDRs that we use in HTM. This is about the smallest we're going to use in an HTM system: a 2048-bit array with a population of 40 on bits. Generally we'll go all the way up to 64 or 65 thousand bits and keep the sparsity at 2%. It just shows you that you can fit a massive amount of data inside an SDR.
You can represent a massive number of different values with SDRs, especially as we increase the size of n and w. So let's talk about comparison now. I'm going to take one SDR that I've randomly generated here on the left, with 1,024 bits. I've given it a sparsity of 10%, dialed up a little just for the sake of example, so we have exactly 103 bits active in this left SDR.
The right SDR is another randomly generated SDR. It has the exact same n, w, and sparsity as the first one, but it was generated independently, so we have two different random SDRs. What we see on the right here is the overlap. If you remember from our last episode, the overlap is just a simple binary AND operation, nothing more: a bit is on in the overlap if it was on in that exact same position in both of the SDRs being compared. So we can see we have a much, much sparser array here.
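The overlap operation described above, a binary AND followed by counting the shared on bits, can be sketched with plain Python sets (the random generation is my own illustration, not the video's visualization tool):

```python
import random

def random_sdr(n: int, w: int, rng: random.Random) -> set[int]:
    """An SDR stored as the set of indices of its on bits."""
    return set(rng.sample(range(n), w))

def overlap(a: set[int], b: set[int]) -> set[int]:
    """Bits on in the exact same position in both SDRs (binary AND)."""
    return a & b

rng = random.Random(42)
left = random_sdr(1024, 103, rng)   # ~10% sparsity, like the example
right = random_sdr(1024, 103, rng)  # same n, w, sparsity; independent
score = len(overlap(left, right))   # overlap score: count of shared on bits
print(score)
```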
So if each of the SDRs we're combining has a sparsity of 10%, we get a combined sparsity of about 20%, though not quite 20%, because the bits they share are not counted twice. We can also see in this comparison view exactly which bits belong to the left SDR versus the right SDR, and displayed right on top is the overlap. The overlap score between these two SDRs is 15.
That's how many bits they have in common. We'll talk much more about the overlap score, overlap sets, and especially unions, which are a very important property. There are some surprising properties of unions that we really take advantage of in HTM systems, and we're going to cover them in upcoming episodes, so stay tuned. But right now, we're going to go back and show some matching operations.
Okay, so I'm going to try to show you that SDRs are noise tolerant; actually, they're quite noise tolerant. In this example, I have an SDR of 2048 bits with a population of 41, which is about two percent sparsity. In the center SDR, I'm adding 33% noise: I'm just flipping some of those bits, keeping the same sparsity, but essentially giving it about a third noise. On the right-hand side, we see the comparison view from the previous visualization; all of the red bits are the overlap.
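The noise being added here, turning some on bits off and an equal number of off bits on so that w stays constant, can be sketched like this (a minimal sketch; `add_noise` and its parameters are my own naming, and I'm assuming noise is measured as a fraction of the w on bits):

```python
import random

def add_noise(sdr: set[int], n: int, pct_noise: float,
              rng: random.Random) -> set[int]:
    """Flip pct_noise of the on bits off and the same number of off bits on,
    keeping the population w (and therefore the sparsity) unchanged."""
    w = len(sdr)
    k = round(w * pct_noise)
    turned_off = set(rng.sample(sorted(sdr), k))
    off_bits = [i for i in range(n) if i not in sdr]
    turned_on = set(rng.sample(off_bits, k))
    return (sdr - turned_off) | turned_on

rng = random.Random(0)
original = set(rng.sample(range(2048), 41))   # n=2048, w=41, ~2% sparsity
noisy = add_noise(original, 2048, 0.33, rng)  # about a third noise

print(len(noisy))             # population is still 41
print(len(original & noisy))  # overlap score drops by the flipped bits
```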
The overlap score, as you can see here, is 28 between these two, so the 33% noise that we added resulted in an overlap score of 28, and you can also see the left bits and the right bits over here. There's a big indication at the bottom that says "nope"; that is telling you whether this is a match or not. So we don't have to have an exact match; we can do a sort of fuzzy matching.
We can just compare SDRs based on their overlap score to decide whether they are a match or not, and the way we do that is by using this value theta. If you read the SDR literature, which I'll link in the description of this video, you'll know that theta is a threshold: if the overlap score falls under theta, we say it's not a match; if it's equal to or greater than theta, we call it a match. So in this case: nope, it is not a match.
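The theta test itself is a one-liner on top of the overlap score (a sketch using the set-of-on-bit-indices representation; the two toy SDRs are contrived so that their overlap is exactly 28, as in the example):

```python
def matches(a: set[int], b: set[int], theta: int) -> bool:
    """Two SDRs match when their overlap score meets the threshold theta."""
    return len(a & b) >= theta

a = set(range(41))        # toy SDR: bits 0..40 on (w = 41)
b = set(range(13, 54))    # shares bits 13..40 with a: overlap score 28
print(matches(a, b, theta=30))  # False: 28 < 30, so "nope"
print(matches(a, b, theta=28))  # True: dialing theta down to 28
```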
However, let's say we change theta. The overlap score is 28, so we can dial theta down to 28, and then it's a match, right? The interesting thing is that you can set your theta, the threshold at which you're going to call an SDR a match or not, given some noise. Then let's change our noise. We had it at 30%.
Now, with a w of 41 and a theta of 30, how much noise can we tolerate? Let's dial this up: we're not matching above 70 or 80 percent noise. How far down can we go? How much noise can we tolerate? Keep going, keep going, and there we go: around 29% noise. So we can add almost 30% noise to this SDR and still recognize that it matches the original SDR without noise. And here's the interesting thing.
A
Here's
we
have
a
formula
to
tell
what
is
the
chance
of
this
being
a
false,
positive
and
I
will
get
into
this
in
the
next
episode.
We'll
actually
look
at
this
formula
and
do
some
calculations
on
overlap
sets,
but
it's
there
is
a
very,
very
small
percentage,
almost
minuscule
astronomically
small
percentage,
that
this
is
going
to
be
a
false
match.
So
we
can
tolerate,
with
the
theta
of
30,
up
to
30%
29%
noise
in
the
signal
and
still
almost
never
almost
never
get
a
false
positive
and
that's
really
powerful
with
sdrs.
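The false-positive formula previewed here is from the SDR paper linked in the description: the probability that a randomly chosen SDR overlaps a fixed one by at least theta bits. A sketch under that reading (the function and parameter names are mine):

```python
from math import comb

def false_positive_rate(n: int, w: int, theta: int) -> float:
    """Probability that a random SDR (n bits, w on) overlaps a fixed SDR
    by theta or more bits, i.e. a false match under threshold theta.
    Counts, for each overlap size b >= theta, the SDRs sharing exactly
    b on bits, then divides by the total number of SDRs C(n, w)."""
    total = comb(n, w)
    hits = sum(comb(w, b) * comb(n - w, w - b) for b in range(theta, w + 1))
    return hits / total

fp = false_positive_rate(2048, 40, 20)
print(fp)  # astronomically small: far below 1e-20
```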
So let's say we go up to 50%: say we have a system that has 50% noise. How low do we have to go with theta to match? There we go: 21. So if our theta is 21, we can be fairly confident of a match, and the chance of a false positive is still about 9.8 times 10 to the negative 30th power. That's still really small. So SDRs have a massive resistance to noise; it's one of their key properties.
It's really important, and it's one of the things that makes your brain so intelligent as well. So thank you for watching this episode of HTM School. If you liked it, please give it a thumbs up and subscribe to our YouTube channel so you won't miss the next episode, which is going to go even deeper into SDRs. I'm going to talk about overlap sets, we're going to talk about subsampling, and at some point we're going to get to unions, and that's where we really get into some of those HTM properties of SDRs that are so important.