From YouTube: Beginner's Guide to NuPIC
Description
With your host, Scott Purdy.
You can find Scott's IPython Notebook here http://fer.io/~scott/nupic_overview.ipynb (JSON) or here http://nbviewer.ipython.org/github/numenta/nupic/blob/master/examples/NuPIC%20Walkthrough.ipynb (static HTML).
Another one here: https://github.com/numenta/nupic/blob/master/examples/NuPIC%20Walkthrough.ipynb
But it's just good to know about, and then I'll talk about some of the different things and what we call the OPF, which is sort of a framework for streaming data models, and there are a few different entry points into that. So I'm going to go through them in roughly the order I mentioned them. First there's encoders. If you don't know what encoders are: all the algorithms work on this concept of sparse distributed representations, which are just ones and zeroes, and encoders are our way of taking different types of data and turning it into ones and zeroes.

Why it's 2.5 and 97.5 instead of 0 and 100 is just an implementation detail for this encoder that I'm not going to go into, but it's going to create a set of ones and zeros that represents that number. The idea is that it buckets numbers over that range and then selects a set of ones equal to the W — that's the width of the representation for a value — and then n is the total number of bits.

So again, I'm not really going to go into the theory of SDRs and what they mean; I'm just going to say this is how you get SDRs from values. So here, for the scalar encoder, I've encoded a few different values with the encoder that I created, and then there's the output. You can see that for values that are close to each other, values that fall into the same bucket,

their representations are going to be the same — you can see that with three and four — and then when we move into the next bucket above that, with the value five, it shifts the bits over, and there are still overlapping bits because the values are close to each other. It retains the semantics of the data, but if you had a much larger value, then obviously it would not have overlap as those values move over. This is a really simple encoder — it's just turning on bits right next to each other — and, I'll show you, if a value falls below or above the min/max range, you can see you get the same representation.

So we have another encoder, the random distributed scalar encoder, and the idea of that is that, instead of picking bits that slide across the range, it randomly selects the bits for each bucket, and that way it can represent a much larger number of buckets. You run the risk of collisions — obviously, if you create enough buckets, eventually you'll have one that's exactly the same as a previous one.

But for the purposes of this demo I'm going to run this code. Here you can see that, with 3 and 4, they have the same bit representation, just like in the previous one, and 5 is different, but it's not adjacent bits anymore. And then again with the one hundred and the thousand: a thousand gets a new representation — it's not the same as 100, even though it's outside the range. So that's the random distributed scalar encoder.
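For reference, the calls being demonstrated look roughly like this — a minimal sketch, assuming a standard (Python 2-era) NuPIC install; module paths, defaults, and the tiny n/w values here are only illustrative and may differ from the notebook:

    # Simple scalar encoder: w adjacent active bits slid across n total bits,
    # bucketing values between minval and maxval (out-of-range values are clipped).
    from nupic.encoders.scalar import ScalarEncoder
    from nupic.encoders.random_distributed_scalar import RandomDistributedScalarEncoder

    enc = ScalarEncoder(n=22, w=3, minval=0, maxval=100,
                        clipInput=True,   # clip 1000 down to the top bucket
                        forced=True)      # allow a deliberately tiny w for display
    for v in (3, 4, 5, 100, 1000):
        print("%6s -> %s" % (v, enc.encode(v)))   # 3 and 4 share a bucket

    # Random distributed scalar encoder: each bucket gets a random set of bits,
    # so values outside the original range still get new, distinct representations.
    rdse = RandomDistributedScalarEncoder(resolution=5.0)
    for v in (3, 4, 5, 100, 1000):
        print("%6s -> %s" % (v, rdse.encode(v)))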
Okay, so here's the takeaway from this. What I'm trying to show here is how to use these different encoders and just give you a flavor of the different encoders that ship with NuPIC. With the scalar encoder you're taking scalar values and turning them into ones and zeroes, and the way it does this is that it buckets the values across that range and then turns them into ones and zeroes, and within a bucket you're not going to get a different representation.

So I think that's what you're commenting on — that three and four have the same representation? Yes, and that's because obviously we can't represent everything uniquely with this encoder, because it has a limited output. So you have to pick the right n, W, min, and max values to make sure that your granularity makes sense for your data, and this actually should be a benefit in some cases, because it smooths out a little bit of the noise in the data, so the model doesn't have to learn that four and 4.001 are the same thing.

If you change that to a hundred and you set the window — the W — to one, then 1 would turn the first bit on, 2 the second bit, 3 the third bit. If you changed it to 100 and made the window ten, anything between one and ten would turn the first bit on, and anything in the twenties range would turn the second bit on. So you're just bucketing and encoding scalar values to fit within some of those buckets, however you define the range and the window.

Does that make sense? Basically that's a bucket, which is represented with these three bits, and then you shift it over one to get the next bucket. So you can fit 20 on this range, and the formula for that is n minus W plus one. With the random example, because we're randomly picking the bits to represent a bucket, you can theoretically create as many as you want.
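As a quick sanity check of that bucket-count formula, using the small numbers from the talk:

    # Number of buckets for the simple ScalarEncoder is n - w + 1.
    n, w = 22, 3
    num_buckets = n - w + 1
    print("%d total bits, %d active bits -> %d buckets" % (n, w, num_buckets))  # 20 buckets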
As you keep going, the problem you run into is collisions between values. Here I'm showing very small numbers, so an n of 21 and a W of 3 is not a very good representation in most cases, because you're going to have a higher risk of collisions — you only have 21 bits — and it's not going to deal with noise or collisions very well, because you only have three active bits.

This is a scalar encoder — it's not just for integers, so you can give it a floating-point value as well. You don't want it to represent every single thing it sees uniquely, because things that differ by, say, 0.0001 — you usually want those to be treated the same, but that varies by data set.

Okay, so those are the scalar encoders. I'm going to show you two other kinds. One is the date encoder, and the date encoder encodes things differently: you have options for how you want to encode a date, which you can see if you go through the documentation. In the example I have here, I'm going to encode the season of the date. When you create a date encoder, you can tell it which aspects of the datetime you want it to encode.

You can choose season, you can choose day of the week, you can choose weekend vs. weekday, which is just a boolean, and it's going to capture that aspect of the date in the output SDR. So in the example here with season, the value five is essentially the W that we had in the scalar encoder, but specifically encoding the season. I'm creating three different datetimes and encoding them.

One is right now, roughly — I guess it's a little bit later than that now, but I was guessing what time I'd get to this part of the talk — then next month, and then I just picked another time, Christmas. What you can see in the output here is that for each of these encodings you have five active bits, because I specified that in the construction of the encoder, and it's capturing the seasonality — it's selecting the total number of bits to represent the whole year.

I'm not actually sure how it picks the total number of bits — basically, I guess, making roughly 12 buckets — but you can see, if I take now versus next month, there's some overlap between them, these three bits, and some difference between them. Whereas when I encode Christmas, you get the three at the end here and the two at the beginning, and that has no overlap with now and next month. And so that's how it captures the semantics of seasonality — you know, it does it roughly.
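A minimal sketch of that date-encoder usage (season component with width 5), assuming a standard NuPIC install — the exact bits that come out will vary:

    import datetime
    from nupic.encoders.date import DateEncoder

    de = DateEncoder(season=5)          # 5 = width of the season sub-encoding
    now        = datetime.datetime.now()
    next_month = now + datetime.timedelta(days=30)
    christmas  = datetime.datetime(now.year, 12, 25)

    for label, d in (("now", now), ("next month", next_month), ("christmas", christmas)):
        print("%-10s %s" % (label, de.encode(d)))
    # Nearby dates share some active bits; dates in a different season share none.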
So that's the date encoder. Like I said, there are a number of different options for it — weekend vs. weekday, day of the week, things like that. I'm just doing season here, but you can do whatever makes sense for your data. And you might ask, how do you know what works well for your data? I'll talk about that towards the end with swarming, which is a way of selecting parameters for your data set.

Then you'll notice there are also these first three bits that aren't active for any of the four categories that I specified. You might think that's for a None value, but it's actually for when you get something that's not None but also not one of your categories. So that's the category encoder, and I'm going to actually use these four SDRs in a second to show how the spatial pooler works. So that's encoders. I'm going to move on to the spatial pooler, unless anyone has questions about them.
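A sketch of the category encoder being used here — the four animal categories mirror the ones in the walkthrough notebook, and the exact keyword set is an assumption about a standard NuPIC install:

    from nupic.encoders.category import CategoryEncoder

    categories = ("cat", "dog", "monkey", "slow loris")
    ce = CategoryEncoder(w=3, categoryList=categories, forced=True)

    cat = ce.encode("cat")
    dog = ce.encode("dog")
    print(len(cat))     # 15 bits: (4 categories + 1 "other" slot) * w
    print(cat)
    print(dog)          # a disjoint set of 3 bits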
I'm not sure I know exactly how to answer that, but I guess you can represent a much larger number of integers when you're bucketing them, because you can have the same number of buckets as you have categories here, or you can have more buckets because you have overlap between them. So yes — and it also brings up a good point.

Do you need to have totally exclusive bits for a category encoder? You don't need to — you could do something similar to the random distributed scalar encoder, where you randomly pick the bits and then just hope you don't have collisions very often. And if you picked a large enough n and W, maybe three or four hundred bits total and thirty or forty active, that probably would work just fine.

Cool, so next is the spatial pooler. Again, I'm not going to really talk theory here, but the spatial pooler takes ones and zeros as input and outputs a sparse set of ones and zeros that hopefully captures the spatial invariance of the input pattern. It's going to adjust itself over time to represent the input space well — it's sort of a transform from all possible inputs into the most common groupings.

That makes sense, and actually there was a paper that I was going to put a link to in here, but I forgot — someone had put a paper up on the mailing list comparing the spatial pooler to clustering algorithms. I thought that was an interesting use of it, an interesting comparison. But again, there's documentation, so if you guys pull this up you can go through this and run these examples, and I just put this line in so it's easy to get to the documentation.

If you want to look at that — or if I need to look at it based on a question from you guys. I don't know why I have all these. Okay, so what I'm going to do to show how to use the spatial pooler is have it learn the four categories that we created with the category encoder. First I'm just printing out the length of the output from the category encoder for cat.

So it's 15 bits long, and we're going to use that when we construct the spatial pooler, to tell it how big the input is. The input dimensions can be multi-dimensional — if you're doing a vision problem you can do two-dimensional data — but here I'm just doing one dimension, just to show how it works. I'm going to give it four columns, and the reason I'm going to do that is because we have four categories, and I'm going to try to have the spatial pooler learn to represent those four categories — well, exactly learn those four categories. In a real situation you'd probably want much bigger numbers than these, and you would not want a one-to-one mapping between columns and categories as I'm doing in this example, but I'll talk about that more after I show this. So there are a few other parameters here.
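A minimal sketch of how a spatial pooler like the one in this demo gets constructed — a flat, one-dimensional input of 15 bits (the category encoder output) and only 4 columns, with global inhibition; parameter values are illustrative, and in newer NuPIC versions the module lives under nupic.algorithms instead of nupic.research:

    from nupic.research.spatial_pooler import SpatialPooler

    sp = SpatialPooler(inputDimensions=(15,),
                       columnDimensions=(4,),
                       potentialRadius=15,            # every column can see the whole input
                       potentialPct=1.0,
                       globalInhibition=True,
                       numActiveColumnsPerInhArea=1,  # one winning column per input
                       synPermActiveInc=0.03,
                       seed=1)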
I don't really know how to explain this without getting into a lot of detail, but let's say in a vision problem we were doing a 2D topology, so you have a two-dimensional input. What you can do in that case is lay out the columns in a similar way, so that the column in the upper left corner can only connect to the input bits in the upper left corner, and that way different columns can learn different local features.

And then, if you had temporal pooling, you could use a hierarchy and learn higher-level things as you move up, things like that. In this case I'm just going to do a totally flat space: all the columns can see all the input bits, and the inhibition is global, so columns compete with every other column rather than only inhibiting the columns next to them.

And I just realized — okay, I lost track of where I was. Before I feed any data into the spatial pooler, I'm just going to print out which input bits each of the columns is connected to, and that's randomly initialized. Each of these vectors — each of these bits — matches the input bits that come in, and a one means the column is currently connected to that input bit, which will make more sense in a second.

The point of showing this is that they're random initially, and when I feed data in, those are going to adjust. One of the columns is going to learn to represent one of the categories, another one is going to learn to represent another one, and you'll see what that looks like in a second. So here I'm going to feed in our category cat and look at the output from the spatial pooler. SP compute — the compute method is how you do that.
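Continuing the sketch above (reusing the sp object and the encoded cat value from the earlier snippets), feeding one input in looks roughly like this:

    import numpy

    # compute() takes the input bits, a learning flag, and an output array that
    # it fills with the winning columns.
    active_columns = numpy.zeros(4, dtype="uint8")   # one slot per column
    sp.compute(cat, learn=True, activeArray=active_columns)
    print(active_columns)    # e.g. [0 0 1 0] -- the column that won for "cat"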
So the columns that have the most active bits in their connected pool will win, and then, with learning set to true, it'll increment the connection weight to each input bit that is active, and for all the ones that are not active it will decrement them. The implication is that, over time, a column learns to represent the inputs it is active for better and better. So where we start out with random connections here, after I feed in the value for cat 20 times, if I print these out again you'll see that this third column — the one that was active for it, which is this one here — has adjusted its connected weights so that, rather than being connected to a random set of the input bits, it's now connected to the ones for the input it's active for most commonly. So again, when we look at these random connections and we feed the value for cat in, the active bits in cat are these three,

and all of these columns competed to become active, and this third one was the one that happened to win — because it had connections to all three of those bits. You can see the second one did too, so there's a little bit of randomness there, but the third one won. And so when I fed cat in a whole bunch of times, it adjusted its weights.

I run this loop to print out each of the columns again, and now you can see that the columns have adjusted their weights: rather than the random connections we had up here, the third one, like we saw before, is connected to all the bits for cat, the fourth one is connected to all the input bits for dog, and then we see something interesting here.
I guess it's not obvious which of these is which animal, but I just happen to remember from before. So why is that? Well, again, these columns are competing for these values, and what can happen is that one column may become active for — may win for — multiple animals, and then it's going to be incrementing its weights for the bits for monkey, or for cat when it's active for cat, and decrementing the other ones.

But then, when slow loris comes along and that column happens to win for it, it's going to increment the connections to the slow loris bits and decrement those for cat. So it just happened that this one learned to represent two different things, even though there was another column that maybe could have. We can keep running this a whole bunch and see if they end up evening out.

Yeah, there's always going to be some randomness — I was trying to explain how that could happen, where one column happens to win for two different animals, and then you have another column that doesn't really represent anything, or maybe partly represents one of the animals. What you'll see in a real system is — as I was trying to say, normally you won't want to do this one-to-one mapping. You just have a big pool of columns and a big pool of inputs, and some columns will learn to represent specific parts of an animal.

This is really simple data, but you might have one column that represents the first two bits for an animal and another one that represents the third. In real cases you'll allow multiple columns to be active at the same time, so you can get much more complex representations where you have some of these features and some of those features.
A specific example — yeah, that's a good point. So in this example, if we knew that this is what we were doing, we could set it up so that the columns could only potentially connect to three bits — we could have five columns, say their radius is three bits, and not have global inhibition. Okay, but —

yeah, we know that in this example, and I don't know if that's necessarily a property of global inhibition. Taking advantage of the topology here would be useful because we know how the bits are laid out, but in more complex cases, where the input isn't laid out evenly, that might not help. In a vision example, where you have 2D input, it's very obvious and very intuitive that you would lay the columns out in 2D as well and have a local field for each one to look at.

But you don't have to — thank you. Cool, so I'm not going to go too much into the properties of the spatial pooler; I'm mainly just trying to show how to use these different pieces. But I want to show one property, which is the spatial invariance that it learns. I'm going to create a new SDR that looks like cat but is a little bit different. So I'm taking cat's bits — which start at bit 3, counting from zero.
I'm going to take the first two of those and set them to one, so it represents most of cat, and then I'm going to pick another bit that isn't part of cat — one that happens to be part of dog. Let me print that out real quick. Normally, if I had an SDR for cat — if I used the category encoder and fed cat in — I'd get these three bits as ones and all the rest zeros, but I'm artificially creating an SDR that's a little bit different.

It mostly looks like cat, but it has one bit from dog. Now, if I feed this into the spatial pooler, what should happen? Well, the columns are going to compete to win. We can look at our connected synapses here, and we can see this third column is going to match two out of the three bits and the fourth column is going to match one, so I expect that the third one will win.

It will have the most overlap, so it'll win, and the output pattern will be the same as if we had fed cat in. This is essentially what the spatial pooler is doing: it takes noisy input and outputs a stable, invariant representation of it. So the output is going to look like this — basically there are the four columns and which one is active.

When we did cat before, it was this third column that was active, because that's the one that's connected most to it, and we expect we'll get the same thing here. So let's run this and — yeah, we do see that. The third column, which is the one that represents cat, is the one that's active. Does that make sense?
It's going to look at the sum of its connected bits that are active, and that's what it uses when it competes with the other columns. The biological analogy would be neurons: a neuron sums the inputs on a dendrite segment, it either fires or it doesn't, and whoever fires first can inhibit its neighbors. So even if you have way more segments and synapses, it doesn't matter — those don't matter, only the ones that are active matter.

And again, when the cat column is active, it's going to strengthen its connections to the input bits that are active and weaken the connections to the ones that are not active. So it's going to weaken its connection to the slow loris bits in that iteration, but then, if it sees slow loris again in the future and becomes active, it'll strengthen them. Generally the amount we increment the connections is higher than the amount we decrement them, and that usually works out well.

Okay, so that's all I have for the spatial pooler. I'm not really going to go into too many of its properties. There will be some other talks — I don't know if there'll be a talk on spatial pooler theory, but there's plenty of information online: videos, talks by Jeff, and other resources. So, any questions about using the spatial pooler?

Yes — okay, so with learning set to false, it's not going to change the weights. Yeah, I didn't realize I had set that to false — good point. You can enable and disable learning as you want. Generally you leave it enabled; I'm not sure why I decided to turn it to false here.
Yes, sure. Okay, so the last individual piece I'm going to talk about, before I get to some of the higher-level ways to interface with the code, is the sequence memory, which is called the temporal pooler in the code. The temporal pooler — the sequence memory — is where it learns the patterns in the data over time.

Generally you take the output from the spatial pooler and feed it into the temporal pooler. You can also use the temporal pooler by itself — in Chetan's linguist example earlier, I believe that's using the temporal pooler just by itself, feeding the word SDRs from CEPT's API directly into the temporal pooler, then looking at which columns are active and mapping the columns one-to-one with the bits from the word SDR, so it uses which columns are predicted to know which word is predicted.

There's a hello_tp example, so you can find it there. I'm going to walk through it real quick. Here I'm creating an instance of the TP. I'm not going to go through the different parameters, but you can say how many columns you want, how fast to learn — like how much you increase the permanences of the connections — different things like that.
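A sketch of creating the sequence-memory ("temporal pooler") instance, along the lines of the hello_tp example that ships with NuPIC; the parameter values here are illustrative only:

    from nupic.research.TP import TP

    tp = TP(numberOfCols=50,
            cellsPerColumn=2,        # tiny on purpose, so the output is easy to read
            initialPerm=0.5,
            connectedPerm=0.5,
            minThreshold=10,
            newSynapseCount=10,
            permanenceInc=0.1,       # how fast connections strengthen
            permanenceDec=0.0,       # how fast they weaken
            activationThreshold=8,
            globalDecay=0,
            burnIn=1,
            pamLength=10)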
Once you have an instance of the TP, you need some data, so this demo just creates fake data to feed in, and then to send the data in you call the compute function — that's actually the same as the SP. You feed it the input array, and again you can enable or disable learning.

If you're using the CLA model or the OPF client, you have a spatial pooler and a temporal pooler in your network. And yes, in this example I'm creating artificial data to feed into the temporal pooler — I'm not using the spatial pooler that I created before, I'm not using the output from that. Right, yes — the answer is yes.

So later on, when I do the CLA model, I'm going to use the hot gym example. In some of the hot gym datasets — it's just one of the data sets, ignore the name — it's a data set of electricity usage in gyms, and they have different locations. If we were training a model on that data, you'd want to put a reset in between the data points from one gym and the sequence from the next, so you're not learning over that boundary.
Okay, so now I'm going to send the same sequences in, and we're going to look at the predictions made by the temporal pooler. They should be accurate now, because I don't have any noise in this data — it's very clean and simple to learn — and I fed it in a whole bunch of times already, so it should have had time to learn the sequences. Then there's this function that just prints out the SDRs that are predicted and the ones that I'm feeding in, so you can see what it looks like.

If you're doing this yourself, you can see here there's this TP print-states helper. You call the compute function and pass the array in, and then, if you want the predicted cells, you can get them with getPredictedState, and then I'm just formatting those in a way that makes them easy to read. So I'm going to run this, and the output is basically, for each input vector, it's going to print it out.
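Continuing the sketch above (reusing the tp object), feeding a vector in and reading the prediction back looks roughly like this; the input pattern is a dummy value just for illustration:

    import numpy

    x = numpy.zeros(50, dtype="uint32")   # width must match numberOfCols
    x[0:10] = 1                           # an arbitrary input pattern

    tp.compute(x, enableLearn=True, computeInfOutput=True)

    predicted = tp.getPredictedState()    # shape: (numberOfCols, cellsPerColumn)
    print(predicted)
    # tp.reset() would be called between independent sequences, e.g. between
    # the different gyms mentioned earlier.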
It makes a prediction based on the connections of the cells and what it's learned in the past, and predicts that this next set of cells is going to be active. Then we feed the next record in, and since it's already learned this sequence, it is correct, and so then we see

that those columns are active, and then it predicts the next value. So this all matches — you guys can look at this later for more details. One important part of this is that the temporal pooler is learning these sequences within context. When we created this temporal pooler, we picked the number of cells per column and set it to two, and the cells inside a column all represent the same spatial pattern, but in different contexts.

So when we see the output of this, the top row is one cell and the bottom row is another cell — this is one cell in the first column and this is the second cell in the first column; they're just two cells. And so when we see a value like this B, it's in context — because it was predicted as coming from A to B.

Okay, so that's pretty much it for using the temporal pooler. Again, you can look at the example here and at the parameters. The parameters aren't always fully explained, so feel free to grab me if you're trying to use this and you don't understand what the parameters are. The best way to start with this is to find an existing example with data that's similar to yours and then just start with those parameters.
Yeah — and these are really simple examples with really small numbers, so it's easy to visualize; two cells per column is pretty small. Typically in the examples you'll see we use 32, which is probably more than we normally need, but it's big enough that we never have problems with it. The linguist example — or the Fluent example — is probably a good place to look for a more complex use of the temporal pooler, rather than a simple dummy version like this.

I'm going to switch back to this. So I've covered the algorithms — don't worry, the rest of this should go a little bit faster; there's not too much more. I'm not going to show examples of using the networks and regions API, but I wanted to mention it, to give you an idea of what it is.

What I'm trying to capture in this slide is that there are different levels at which you can access the code. All the stuff I just showed was using the algorithm implementations directly — just instantiating the SP, instantiating the TP, feeding data in and getting data out. There are some ways to use those, which I'll talk about in a little bit, that are a little higher level, where you can just specify the parameters and what your data is and then run it, and that can be really useful.
But it's sometimes a little hard to understand what's going on. The networks and regions API is intended for arbitrary topologies of the different pieces. So you may want to have — well, here I can show you an example. Here's an example of how you might set up a topology: you may have two different inputs, audio and images, and so at the bottom level you'd want each of those to go into its own temporal pooler to learn the sequences over time, and then you can have the outputs of those temporal poolers combined into a single spatial pooler, and then you can put another temporal pooler on top of that to learn transitions over the higher-level output you get from the spatial pooler of the combined inputs. And then you can put a classifier at the top to say what's happening in the images or in the audio, or whatever your problem is.

What the networks and regions API is really nice for is that once you set this up, you feed the data into it and it propagates it through all the different pieces, and then you can get the output. So you don't have to manually create all these different pieces — feed your audio into an audio encoder, take the output of that, feed it into the spatial pooler, take the output of that and feed it into the temporal pooler, and so on.

With the network you just call, I think, the run method on it, and it does the whole thing. It also formalizes serialization: each of the regions can decide how it wants to serialize itself, but the network will call into each of those functions to serialize them and store them with a file that knows the topology of the network.

One of the pieces is the CLA model, which is basically an encapsulation of all the pieces I showed before. Rather than creating those yourselves, or using the networks and regions API, the CLA model encapsulates the most common case we have, which is a single level of encoder, spatial pooler, an optional temporal pooler, and then a classifier that turns the output of the temporal pooler — the predicted cells — into a predicted value.

I also didn't talk about the CLA classifier or show examples of that, but maybe I'll send out a follow-up with information on it. Okay, so let's run through this example real quick. This is again just copying an example in NuPIC, so you can find it in here, and it's doing mostly the same thing as that file. The way that you create a model is with the model factory, and you pass in params.
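A minimal sketch of that call, along the lines of the hot gym OPF client; MODEL_PARAMS here is a stand-in for the parameter dictionary shown below and in the NuPIC examples, not reproduced in full:

    from nupic.frameworks.opf.modelfactory import ModelFactory

    model = ModelFactory.create(MODEL_PARAMS)
    model.enableInference({"predictedField": "consumption"})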
Here we haven't defined the params yet, so I'm going to show you what that looks like. I'm not going to go through all of these, but the model type tells it what kind of model it wants — again, the OPF has this pluggable model system, so it knows that CLA maps to the CLA model. There's some other information that we're not really using.

It supports aggregation, so here we're aggregating hourly. The model params are what actually go to the model; we're using a CLA model, so it has some different parameters, one of which is the inference type. In this case I'm going to do temporal multi-step. There's also a non-temporal multi-step, which is for when you don't want to use the temporal pooler, and that sometimes works better

if you have data that can be predicted with just a first-order model. Okay, so I'm not going to go through all these parameters, but I wanted to put them in here so you could see what they were. These control things like how fast the model learns and, for the encoders, what n and W to use, things like that.

Some of the tools in NuPIC make it easy to get datasets into experiments. There's this findDataset function, and it has a few different rules: you can set an environment variable to point at your own directory of data sets, and then you can use this function to get the full path for a file based on those rules.
So first I just wanted to show what this data looks like, so I'm opening it up and printing out the first few lines. This is a file format that our tooling understands — there's a file reader that understands it. It has three header lines and then the data, and it's a CSV format, but it takes those first lines and uses them to understand the data. The first one is just the names of the fields.

This S means that it's a sequence — I mentioned earlier how in the hot gym data set there are different gyms. This first field is the gym that the record belongs to, and this sequence flag — when you put an S in here, it's the sequence flag — basically inserts a reset when that value changes. So when we go from one gym to the next, it'll insert a reset, so it doesn't learn over that boundary.

Then we have an address, a datetime, and then the scalar value. We're not going to use all these fields in our model — we're just going to use the datetime and the value — and that's controlled by our encoders, which are set up here. Here we're doing the time of day and the consumption, which is the value we want to predict.

Cool. So here I'm using another piece of the tooling, the file record stream. It understands that file format, so when we open it and print out the data, it has already stripped off those first three lines and it interprets the data as the appropriate type — rather than a plain CSV reader, which would give us everything as strings.
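For reference, a sketch of the three-line header format the tooling expects, plus reading it back with FileRecordStream. The field names and sample rows here mirror the hot gym data set being described but are illustrative, not copied from the shipped file:

    from nupic.data.file_record_stream import FileRecordStream

    sample = """gym,address,timestamp,consumption
    string,string,datetime,float
    S,,T,
    Balgowlah Platinum,Shop 67,2010-07-02 00:00:00,5.3
    Balgowlah Platinum,Shop 67,2010-07-02 01:00:00,5.5
    """
    with open("sample.csv", "w") as f:
        f.write(sample)

    # Row 1: field names; row 2: types; row 3: flags (S = sequence, T = timestamp).
    stream = FileRecordStream("sample.csv")
    print(stream.getFieldNames())
    print(stream.getNextRecord())   # values parsed into str / datetime / float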
So we're going to tell it here that we want to predict consumption. Once we have that model, we can feed data into it. I'm going to feed 100 records into this, and I'm going to print out the input — which is the consumption field in the record that you feed in — and I'm also going to print out the inferences that I get out of the model.
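Roughly, the run loop looks like this (continuing the sketch above, reusing the model and stream objects and the same field names):

    for _ in range(100):
        record = dict(zip(stream.getFieldNames(), stream.getNextRecord()))
        result = model.run({"timestamp": record["timestamp"],
                            "consumption": record["consumption"]})
        predicted = result.inferences["multiStepBestPredictions"][1]   # one step ahead
        print("input=%s  predicted=%s" % (record["consumption"], predicted))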
Okay, so I'm going to feed a bunch of records into this and print the results so you guys can see what that looks like. Here I fed in the value 5.3. Initially the model doesn't know anything about the data, so it doesn't know what to predict — it just predicts the value it just saw. But if we run this enough times, the model will start to learn the data and get better at predicting it.

I'm just going to do this a few more times — it looks like, yeah, now I've run it through maybe six times over the hundred records, and you can start to see how it's making a prediction about the value dropping. During the day the consumption is higher and overnight it's lower, and you can see it's making this prediction of a value of one point something while the value is still pretty high.

Essentially what it does is look at the predicted and active cells in the temporal pooler, and the proportion of active cells that were not predicted — the fraction of active cells that were not predicted — is the anomaly score. So if most of the cells that are active were not predicted, the score is going to be above 0.5, and if most of them were predicted, it'll be under 0.5. Does that make sense?
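In other words, the raw anomaly score is just the fraction of the currently active columns that the previous step did not predict — a rough sketch of the idea, not NuPIC's exact implementation:

    def raw_anomaly_score(active_columns, predicted_columns):
        # fraction of active columns that were not predicted
        active = set(active_columns)
        if not active:
            return 0.0
        return len(active - set(predicted_columns)) / float(len(active))

    print(raw_anomaly_score([1, 2, 3, 4], [3, 4, 5]))   # 0.5 -- half the activity was unexpected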
So here I just put the model params again — these are almost identical. There are two differences, I think: one is that I've changed the inference type from temporal multi-step to temporal anomaly, and that tells it we want to get the anomaly scores out; and then down at the bottom there's this anomaly params section, although I don't think that's really doing anything — I don't think you need to include it. So I'm going to run through the example

I did before again. Same thing: create a model with the model parameters, specify the field that we're trying to predict, run some data through, and print it out. Here are the predictions, just like before — with an anomaly model you still get predictions, it's still doing that — but you also get this anomaly score. I'll show that first.

Okay, so here, rather than printing out the multi-step best predictions from the result inferences, I print out the anomaly score, and I get a value here. I recreated a new model to do temporal anomaly rather than temporal multi-step, and because of that I didn't have any learning from before — it's a completely new model, it doesn't know anything about the data. I fed five records through, and that's why the anomaly score is a little over 0.5:

most of the columns were not predicted, because it hasn't learned the data yet. So I mentioned I'd show you more of what's in the model result. Here I print out the whole result — this isn't a very pretty format, but I'll point out the important parts. The inferences section is what we've been using: we're using multiStepBestPredictions for the predictions, and the anomaly score is here. There's also multiStepPredictions, but we're using the best predictions.

You actually get multiple predictions. For one step, we have this dictionary that maps from the value being predicted to its likelihood — so there's a 23% chance of the next value, the value one step in the future, being 5.1, and a much higher chance, 76%, that it will be 5.34. So there's a lot of information in this result object, just in this inferences section — it includes multiple predictions for each step into the future that you've specified.
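Inspecting one result object from the run loop above, the inferences section looks roughly like this — the numeric values are illustrative, echoing the ones just mentioned:

    print(result.inferences["multiStepBestPredictions"])   # e.g. {1: 5.34}
    print(result.inferences["multiStepPredictions"])        # e.g. {1: {5.1: 0.23, 5.34: 0.76}}
    if "anomalyScore" in result.inferences:                  # only present for TemporalAnomaly models
        print(result.inferences["anomalyScore"])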
It also includes the anomaly score, if it's an anomaly model. And then, in addition to the inferences, this record includes the raw input that you fed into it, as well as some processed things like the data encoding — the raw input after it's been encoded with the encoders you specified — and some other things like sequence reset. So if we were using that sequence field with the gym name, you could see when we cross the boundary, because it would show sequence reset as one here. Cool.

So in both of the examples I specified temporal multi-step and then temporal anomaly, so both get that — and then in the model parameters you can also enable or disable the temporal pooler. But yeah, in both these examples we're using encoders: a scalar encoder for the consumption value, which is the predicted field, and the date encoder for the datetime, and I

think it was doing day of week — that's how it was encoding the date. So we're using encoders, then the output of that feeds into a spatial pooler, the output of that feeds into the temporal pooler, and there's also a CLA classifier on top. The classifier takes the output of the temporal pooler and tries to convert the predicted cells back into a value — so we're predicting a scalar field.

It knows that some cells are predicted, and its job is to figure out what scalar value to call that, and that's where it figures out the multiple possibilities and how likely it thinks they are. For the anomaly detection portion itself, the anomaly score is computed solely from the state inside the temporal pooler, but you're still using all those other pieces — with the exception of the classifier; you don't need the classifier.
The reason we still have the classifier, though, is that we do swarming a lot, which I'll describe in a little bit. Swarming is used to figure out which parameters are best for your data set, and without a classifier there's no way to evaluate how good a model is — you basically have to use the prediction to see how good the model is, and then we use that to pick the best parameters. But once you've picked the best parameters and created the model, if you just want the anomaly score, you don't need the classifier.

Okay, so that's it for this notebook. I'm just going to show a few command-line things. The CLA model, like I just showed, can be created programmatically — you can feed your data in, get the results out, and do whatever you want with them. We have a few tools to make experimenting with these a little bit easier.

One of them is the OPF run-experiment script. Basically, you specify the parameters in a description.py file, and then you call this script and give it a directory that has a description.py in it. It uses the parameters from that file to create the model; that file also includes some other information, like where to find the data, and it runs.
Rather than creating the model myself, I'm using the existing client and just specifying the parameters and where to get the data from. The output to the terminal here isn't really that important for the purposes of our talk — it's basically just periodically printing out the metrics as it goes, so you can follow the progress.

And this looks very similar to the model params I showed you before — it's actually pretty much the same exact thing. So "config" here is what I was calling model params when I was creating the CLA model myself, and then there are a few things at the end of this file, in addition to the model params, that control things like where to put the results — in this case I'm just specifying a file somewhere — and there's also the ability to specify which metrics you want it to compute.

The run-experiment script does what you'd expect, but the advantage of structuring things this way is that, because we've structured the experiment like this, we can run a swarm on it. The swarm tool basically lets you specify which of those parameters you want to permute over and what range you want to try, and then it runs a bunch of different models with different combinations of parameters and figures out which ones get the best results.

The swarm script is just in the NuPIC repository, in bin — run_swarm — and I'm going to pass the same path that I passed to the run-experiment script, but pointing to the permutations file, which kind of mirrors the description.py but just specifies which of those fields we want to permute over. I'll show that in a second, but first let me run this.
Again, there's going to be a lot of output here — as it's running you can kind of see where it is. For most cases you probably don't need to understand exactly what all of this means. When it finishes, it's going to write a directory that has the description.py file from the best model out of all the models it tried.

So it takes a base description.py, the one I ran before, changes some of the values, runs the model with the new values, checks the results, does that a whole bunch of times, and keeps track of which one worked the best. Then, at the end, it takes the one that worked the best and writes those parameters into a file.

Is that large enough — can you guys see that okay? Inside this model_0 directory it's going to have a description.py, and inside that file are just the differences between it and the base description.py it started with — you can see at the bottom it imports the base description and then updates the config with those values. So in the permutations.py script we specify — I should have shown that first; let me show it real quick.

So when you do swarming, as I mentioned, you specify which fields you want to permute over. Again, I would recommend taking an existing example and then adapting it to work with your data, but basically there are these different classes that you create instances of for different types of values. PermuteEncoder is a permute class that understands encoders, so it's going to permute specifically over this floating-point value — I think that's the radius. Yes, it's the radius of the encoder. So this is a random distributed
scalar encoder, and the radius is how big the buckets are, and the right value for that can change a lot based on your data. If your data has really large values you generally want a bigger radius, and if it has really small values you want a smaller radius. Fine-tuning that can have a pretty big impact, because it can take out some of the noise and take some of the burden off the algorithms.

But if it's too big, then the buckets are going to represent multiple values as the same thing when you might not want them to. So when the swarm runs, it looks for these permute objects, picks a value in the range, runs the algorithm, and then, for each of those permute objects in the permutations.py file, when it picks the best model and writes out the description.py, it writes out the values it had for each of them.

Just as an example, here for time-of-day these values were specified as something we wanted to permute over, and these are the values that it came up with as the best. So that's basically it for swarming — I can't really give too much advice; it's going to depend a lot on your data set, but I'd recommend finding an existing example that's similar to your data set and then adapting it. And that's why you guys are all here.
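For a rough idea of what such a file contains, here is a heavily abbreviated sketch of a permutations.py, adapted from the generated permutation files in the OPF examples; the import path, field names, and ranges are assumptions and may differ between NuPIC versions:

    # Invoked with something like:  python run_swarm.py path/to/permutations.py
    from nupic.swarming.permutationhelpers import PermuteEncoder, PermuteFloat

    permutations = {
        "modelParams": {
            "sensorParams": {
                "encoders": {
                    "consumption": PermuteEncoder(
                        fieldName="consumption",
                        encoderClass="RandomDistributedScalarEncoder",
                        resolution=PermuteFloat(0.1, 10.0),   # range the swarm explores
                        w=21),
                    "timestamp_timeOfDay": PermuteEncoder(
                        fieldName="timestamp",
                        encoderClass="DateEncoder.timeOfDay",
                        radius=PermuteFloat(1, 12),
                        w=21),
                },
            },
        },
    }

    # Metric the swarm minimizes when ranking models.
    minimize = "prediction:aae:window=1000:field=consumption"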
It's going to take whichever metric you specify first and use that as the metric that determines which model is considered best. When it runs the models, the output gets fed into the metric, and that metric gives you a value where the lower the value, the lower the error, and the better the model — so it takes the one with the lowest error. There's one kind of tricky thing about those: you can specify a window for them.

If you have 10,000 records and specify the window as a thousand, it's going to compute that metric over a moving window of a thousand records. So when you run the swarm, it takes the last thousand records, and the metric over those records is what gets used to determine whether a model is better than another one. So keep that in mind.

Know that when you're computing the anomaly score, it does not use the classifier at all — you can actually throw away the classifier, not use it at all, and still get anomaly scores out. It's only needed to get predictions. And if you want to swarm on an anomaly model — if you have an anomaly problem, you may still want to do swarming — then in order to do swarming you have to have a classifier, because you need the predictions to evaluate which model is better.
I was having some trouble distinguishing between the encoder and the spatial pooler, because it seems like they're performing similar functions. But is it right to say that the spatial pooler would take multiple variables which have been encoded and create a kind of group SDR from those multiple variables?

So the encoder takes a value and turns it into ones and zeros, and generally encoders don't learn at all — for a given input, the output from the encoder will always be the same. The spatial pooler, on the other hand, adjusts the weights of each column's connections to the input bits, so over time, as those change, the output for a given input will be different.

The spatial pooler also requires an array of ones and zeros as input. So if you're starting out with the value 5, you first have to put it through the encoder to get ones and zeros — the output of that encoder will always be the same for the value 5 — and then, when you put it into the spatial pooler, you don't really know what the output is going to be: it's going to be the spatial pooler's invariant representation of that value, which may change over time. Okay.