From YouTube: SDR Sets & Unions (Episode 4)
Description
Using SDR sets and unions to identify SDRs that have been seen in the past.
Help me decide what episode to do next, Encoders or Spatial Pooling! Comment below or vote here: https://discourse.numenta.org/t/htm-school-episode-4-sdr-sets-unions/455
Intro music: "Books" by Minden: https://minden.bandcamp.com/track/books-2
A
Hello and welcome to HTM School. I am Matt Taylor from Numenta. In our last episode we talked about why SDRs are fault tolerant and noise tolerant. In today's episode, we're going to talk about sets of SDRs and unions of SDRs. If you remember from episode 2, SDRs have semantic meaning in HTM systems. This means that SDRs with similar overlaps have similar semantic meanings. Imagine a stream of SDRs, and also imagine that we're collecting these SDRs and putting them into different sets as they're coming across the stream.
A
So we've got these different buckets that we're putting these SDRs into. Over time, those will collect into a significant amount of SDRs in each set. Now, as each new SDR comes down the stream, we can compare it to all these different sets to see if we've seen it before and potentially understand what it might represent. So let's see a visualization of SDRs and sets in action.
A
So here we have an SDR of 256 bits with four bits on. That is a 2% sparsity, and that's what is drawn right here. What's going to happen is I'm gonna click this button, "Add SDR to stack", and it will drop that SDR down below the button bar here. Every time I click it, it will generate a new SDR and drop the last one down into the stack. So this is going to give us a big stack of randomly generated SDRs.
A
It's keeping track of how many SDRs are in the set, and I also can click this button to add, I think, 50 all at one time. So then we'll end up with 63 SDRs in this stack. So now we can take one of those SDRs. I'm gonna click this match button, and we'll click one of those SDRs randomly and sort of simulate that we're seeing this SDR again.
A
So again, imagine this scenario where we've had a stream of SDRs and we're taking some of them out and putting them in this set, because we want to compare the SDRs that are coming through this stream to that set to see if we've seen them before. So when I click this SDR, some random SDR in the stack, it is going to bring it up top here. This is the SDR that I clicked. It's going to highlight it down at the bottom; that's what this orangish highlight around it is.
A
And it's going to rearrange this stack and order it by the overlap scores of every single SDR versus the one that I selected.
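The ranking step described above can be sketched roughly as follows. This is only an illustration, not the visualization's actual code: it assumes each SDR is stored as a set of on-bit indices, and the names `overlap` and `rank_by_overlap` are hypothetical.

```python
import random

def overlap(a, b):
    """Overlap score: the number of on bits two SDRs share."""
    return len(a & b)

def rank_by_overlap(query, stack):
    """Return (score, sdr) pairs, highest overlap with the query first."""
    scored = [(overlap(query, sdr), sdr) for sdr in stack]
    return sorted(scored, key=lambda pair: pair[0], reverse=True)

# Assumed parameters, taken from the demo: n = 256 bits, w = 4 on bits.
rng = random.Random(42)
stack = [frozenset(rng.sample(range(256), 4)) for _ in range(63)]

# Pretend we are seeing one of the stored SDRs again.
query = stack[10]
best_score, best_sdr = rank_by_overlap(query, stack)[0]
```

Because the query is itself in the stack, the top-ranked entry has the full overlap of four bits, which is the "winning the competition" behaviour shown on screen.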
So as we can see here, it identified the one I selected as having the highest overlap score, and it ranked it up at the top, winning this competition. I have a noise slider up here, so I can add bits of noise to it.
A
If I slide it up and add one bit of noise, as you can see, the overlap score of that SDR that we matched against went down, because now I have one bit of noise in it, and now there's only an overlap of three.
A
So another interesting thing that I want to show off here. Let me refresh this. I'm going to click this "calculate false positive" button, and what this will do is, as I add more SDRs to the stack, it'll do a calculation to tell us the probability that some random SDR will match against an SDR on the stack when it's not really a match. So initially I'm just going to add a couple of SDRs to the stack here. Here's our probability of false positive. Currently it is 2.3 times 10 to the negative 8, which is a significantly small number, but not too small. Watch what happens to this number as I continue to add SDRs to the stack. It goes higher and higher and higher. Let's add a bunch; I'm going to add 50.
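One way to reproduce a number of this kind is sketched below. It is an assumption about the method, not the demo's confirmed code: it counts how many random SDRs would overlap a stored SDR by at least `theta` bits, divides by the number of possible SDRs, and then uses a simple upper bound of `m` times that value for a stack of `m` SDRs. All function names are hypothetical.

```python
from math import comb

def overlap_set_size(n, w, b):
    """Count n-bit SDRs with w on bits that share exactly b on bits
    with one fixed SDR that also has w on bits."""
    return comb(w, b) * comb(n - w, w - b)

def false_positive_single(n, w, theta):
    """Chance that a random SDR overlaps one stored SDR by theta or more bits."""
    matching = sum(overlap_set_size(n, w, b) for b in range(theta, w + 1))
    return matching / comb(n, w)

def false_positive_set(n, w, theta, m):
    """Upper bound (assumed here) for a stack of m stored SDRs."""
    return min(1.0, m * false_positive_single(n, w, theta))

# n = 256, w = 4, exact match required (theta = 4), four SDRs stored.
p = false_positive_set(256, 4, 4, 4)
```

Under these assumptions, four stored SDRs give a value of roughly 2.3 times 10 to the negative 8, and the bound grows linearly as more SDRs are added to the stack.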
A
So all of this gets much more interesting when we're dealing with bigger SDRs. Now I have increased the size of the SDRs we're looking at from 256 to 2048 bits. Just so you know, I'm not displaying all of them. These SDRs go onward, and I'm doing the math on the full SDRs, but I'm only visualizing a percentage of them so that they can all fit on our screen. So at this dimension, let's add a bunch of SDRs to the stack. Right now I have 53; let's get it up to over 100.
A
Look how steep this curve is here for the one that we actually identified as the matching SDR versus the others, which aren't even coming close. And you might notice that I'm adding 10 bits of noise to this, so 25% of these bits are actually noise. If I take the noise all the way down, then we'll get an overlap score of 40, or at least we should once the slider ticks all the way down. Yeah, so we get an overlap score of exactly 40. But you can see how resilient this set comparison method is to noise.
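The effect of the noise slider can be imitated with a small sketch. The noise model here is an assumption: it moves `k` on bits to positions that were previously off, so the SDR keeps the same number of on bits. The helper names `add_noise` and `is_match` are hypothetical.

```python
import random

def add_noise(sdr, n, k, rng):
    """Move k on bits to previously off positions, keeping w on bits."""
    off_bits = [i for i in range(n) if i not in sdr]
    dropped = set(rng.sample(sorted(sdr), k))
    added = set(rng.sample(off_bits, k))
    return frozenset((sdr - dropped) | added)

def is_match(a, b, theta):
    """Two SDRs match when they share at least theta on bits."""
    return len(a & b) >= theta

rng = random.Random(7)
n, w = 2048, 40
stored = frozenset(rng.sample(range(n), w))

# 10 of the 40 on bits become noise, i.e. 25% noise.
noisy = add_noise(stored, n, 10, rng)
remaining_overlap = len(stored & noisy)
```

With this noise model the overlap drops from 40 to 30, so any threshold `theta` of 30 or less still recognizes the noisy SDR as one that has been seen before.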
A
Even if we add 50% noise, I could still adjust my theta to 20-something, and we still have a significantly steep curve here and a very high chance that we're going to be identifying the proper SDR: yes, we have seen this SDR before. And we can also calculate the false positive rate for this as well, as we add more SDRs to it.
A
So in a stack of a hundred and four different SDRs of this dimensionality, the false positive chance, that just some random SDR will match something in there that we actually haven't seen, is 2.5 times ten to the negative 24. So a significantly low number, not astronomically low, but pretty low. The problem with this type of classification is that it's a lot of calculations.
A
So, as you hopefully remember from previous episodes, a union is where you take one or more SDRs, or bit arrays, and you OR them together. So we're going to turn a bit on if that bit is on in any of the ones we're combining. So now we're looking at unions in this visualization. I have another SDR on top here.
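The OR operation described above can be sketched as follows, again assuming SDRs are held as sets of on-bit indices. The names `union_of` and `matches_union` are hypothetical and chosen only for this illustration.

```python
def union_of(sdrs):
    """OR a collection of SDRs together: a bit is on if it is on in any of them."""
    result = set()
    for sdr in sdrs:
        result |= sdr
    return frozenset(result)

def matches_union(sdr, union, theta):
    """An SDR matches when at least theta of its on bits are on in the union."""
    return len(sdr & union) >= theta

a = frozenset({1, 2, 3, 4})
b = frozenset({3, 4, 5, 6})
u = union_of([a, b])

# This SDR was never stored, yet every one of its bits is on in the union.
never_stored = frozenset({1, 2, 5, 6})
```

Here `never_stored` passes the check with `theta = 4` even though it was never added. That is the trade-off of a union: a single comparison replaces a comparison against every stored SDR, at the cost of possible false positives.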
A
We're gonna start off dealing with another 256-bit array with four bits on; that's a 2% sparsity. And it's the same thing: when I click this button, it's gonna dump down here onto the stack. I'm keeping a stack down here at the bottom, but I'm also keeping track of the union of the stack, right? Every time I add an SDR to the stack, I'm also going to OR it into that union. So the more SDRs we add, and here I'm adding 20 at a time now, I'm gonna make this set really big.
A
I've got 75 different SDRs in this stack right now. This union gets denser and denser and denser. So now, if we take a random SDR, when I click this "match random" button, it will tell us that, given this union, our chance of a false positive is about 23%, which is pretty bad. So every time I click this match random button, there's about a 23 percent chance that some random SDR will overlap with that union by the four bits that are required for an exact match, which makes sense because this union is super populated.
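A rough estimate of this figure can be computed with the sketch below. It rests on simplifying assumptions that are not stated in the episode: the stored SDRs are uniformly random, each bit of the union is treated as independent, and a match requires all `w` on bits of the new SDR to be on in the union. The function names are hypothetical.

```python
def union_density(n, w, m):
    """Expected fraction of on bits after OR-ing m random SDRs
    of n bits with w on bits each."""
    return 1.0 - (1.0 - w / n) ** m

def union_false_positive(n, w, m):
    """Approximate chance that every on bit of a new random SDR
    is already on in the union (an exact match)."""
    return union_density(n, w, m) ** w

# 75 SDRs of 256 bits with 4 bits on, as in the demo.
density = union_density(256, 4, 75)
p = union_false_positive(256, 4, 75)
```

With these inputs the union is a little under 70% dense and the estimated false positive chance comes out at about 23%, in line with the value shown in the visualization.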
A
So, you know, I would expect that 25% of these random SDRs would match. So the more you click, and you can see over here I'm showing the overlap score, about a quarter of them will actually match. You can see that match indicator here. So the more I add to the stack, now I have 95, now I've got 115, that probability of false positive goes up and up, and so now about 50% of them will just be random matches. You know, we don't want that, right?
A
We don't want random matches. So let's make our SDRs bigger, clicking the "go big" button once again, and we're going back to a 2048-bit SDR with a w of 40, so 2% sparsity again. Keep in mind again that I'm only displaying the first 256 bits in the visualization, but I'm doing the math on the entire SDRs. So let's add some SDRs to the stack, and one thing you notice immediately is that the chance of a false positive is much, much lower, because we're dealing with much bigger SDRs.
A
So as I add to the stack, yes, it continues to go higher and higher, the chance of a false positive. Let's add 20 at a time. So now, at this point, I have 49. Let's make this an even 50: 50 SDRs in my union. You can see them represented all down here in the stack of 50 SDRs in the set. And take the probability of any new random SDR, compared to that union, getting an exact match, an overlap of 40 bits.
A
The chance of that happening is 7.82 times 10 to the negative ninth power, so pretty low, not hugely low, but pretty low. And so we can see that happening when we start matching random: we're never gonna get a random match. I could sit here and click for probably weeks and never get a random match with a probability of false positive that low.
A
As we continue to add more and more SDRs, that probability of false positive will continue to go up, until we've saturated that union to the point where we're starting to get a lot of false positives. So the more I add to the set, the denser and denser this union gets. You can see right here, the union is at this point at 93% density. It's really surprising: even with a union that has 93% of its bits on, the chance of a false positive is still only 4%, so that's still pretty low.
A
When you're just taking some random SDR and seeing if you've seen it before, there's a 4% chance you're not going to be right; it's going to be a false positive. You identify something you thought you've seen, but you haven't. And the calculation for doing that comparison is just so much faster than it is if you're keeping the entire set and every SDR in the set. So, wow, this has been the third episode that we've done just on SDRs.
A
We could talk about encoders. Encoders actually take real-world data and convert them in some way into representations that have semantic meanings. We'll talk about binary representations with semantic meanings and how we can encode meaning into those bits, and that's more the sensory aspect of things as far as HTM systems. Or we could go and talk about spatial pooling, which is a mechanism for normalizing SDRs in the spatial aspect, and that's more of a cortical process, so that would dive right into the cortical theory. It's up to you.
A
I want to know what you think. Should we go with encoders? Should we go with spatial pooling? We're going to get to both of them eventually. What do you think? Let me know in the comments. Don't forget to like this video if you're enjoying the series and subscribe to our YouTube channel, and I'll keep making them for you. Thanks again for watching HTM School.