Description
Today we'll be taking questions about anomaly detection and HTM. Specifically: https://discourse.numenta.org/t/live-q-a-session-on-anomaly-detection/2674
A
Hey everybody, welcome to HTM Hackers' Hangout. It is Friday, September 1st, 3 p.m. on the west coast of the United States, and today we're going to be talking about anomaly detection, with NuPIC specifically but also in general with HTM systems. In the Numenta offices we have Subutai Ahmad and Scott Purdy, who are research engineers at Numenta, and I'm going to let Subutai run a good portion of this talk, because he's got a lot of good resources about anomaly detection.
B
Yeah, so thanks, Matt. And Scott Purdy is here as well; he knows a lot about this stuff. What we thought we'd do for the first part of this talk: there are a couple of sort of generic, general questions that seem to come up quite a bit, so I thought we'd walk through some of those first, and then maybe go to some specific questions. I know Jacob has listed a bunch of them that we could try to go through, and maybe others will have questions as well.
B
Okay, all right. So in the beginning, what I'll do is walk through a couple of things that are in a paper that we actually published: Scott, myself, Alex Lavin, and Zuha Agha, who was an intern here, published a paper.
B
A few things in here first. One question that maybe we should start with is just: what is anomaly detection, and what is streaming anomaly detection, which is the thing that we tend to focus on? There's an image here in the paper that shows an example, a stream of data over time, and this is a typical kind of sensor stream that you might see in many industrial applications, like the Internet of Things and so on.
B
So
what
we
define
anomalies,
as
is
something
unusual,
some
unusual
pattern
in
the
data
given
what
you've
seen
so
far
in
the
past
and
the
three
black
dots
here
are
three
anomalies
that
were
actually
marked
by
a
human
expert
and
the
engineer
who
works
on
this
machine,
and
it
highlights
you
know
some
easy
examples
of
anomalies
and
a
really
tricky
one.
So
you
can
see
this
dot
on
the
left.
Here
is
a
very
easy
example
of
an
anomaly.
The
reading
suddenly
dropped,
went
to
zero
and
then
went
back
up
again.
So
that's
clearly
an
anomaly.
B
It's
unusual
in
this
data
and
if
you
look
at
their
very
last
anomaly,
on
the
right
hand,
side
this
is
a
case
where
the
the
reading
dropped.
It
didn't,
go
all
the
way
to
zero,
but
this
was
actually
turned
out
a
catastrophic
failure
in
the
machine.
There's
the
machine
basically
just
broke
down
and
the
temperature
didn't
go
all
the
way
back
down
to
zero,
but
it
was
significantly
lower
than
the
operating
normal
operating
temperature.
B
And,
what's
interesting
is
this
middle
anomaly,
which
is
not
so
obvious,
and
most
traditionalist
techniques
actually
would
not
pick
up
on
this.
But
this
anomaly
acts
preceded
the
catastrophic
failure.
So
if
you
look
closely,
you
can
see,
there's
some
unusual
fluctuations
in
the
data
and
that
you
know
you
kind
of
don't
see
before
I
mean
you
have
to
be
a
bit
of
an
expert
to
really
notice
this.
But
there
are
unusual
fluctuations
here
and
this
kind
of
highlights
what
we
tend
to
focus
on,
which
is
temporal
anomalies.
B
So
it's
not
just
that
the
value
at
a
given
point
in
time
is
unusual
is
just
that
the
temporal
pattern
of
the
data
is
highly
unusual
and
if
you
can
pick
up
on
temporal
patterns,
you
can
pick
up
on
much
more
subtle
anomalies
than
if
you
just
look
at
the
instantaneous
value,
and
this
anomaly
isn't
a
good
example
of
that,
because
there's
no
there's
no
one
reading
here,
that's
unusual
at
all.
It's
really
the
temple
pattern
of
reading
sets
that
are
unusual
so
with
HTM
on
these
kinds
of
anomalies,
at
least
in
this.
B
Where
look
we're
focusing
on
something
we
call
streaming
analytics
or
streaming
anomaly
detection
and
the
paper
has
a
definition
of
what
we
mean
and
basically
it's
just
a
continuous
stream
of
readings
that
is
evolving
in
time
and
you
there's
no
training
set
or
test
set
you're
just
constantly
getting
data,
and
you
just
have
to
tell
at
every
point
in
time.
Is
it
unusual
or
not
given?
B
What's
happened
in
the
past,
so
this
means
the
system
has
to
be
continually
learning,
it
should
be
unsupervised
and
one
of
the
things
that's
quite
interesting
as
as
sort
of
noted
in
this
chart
here.
Is
you
really
want
to
find
anomalies
as
early
as
possible,
it's
great
to
detect
this
last
anomaly
on
the
right,
but
this
is
actually
after
the
machine
has
already
broken
down.
What
you
ideally
want
to
do
is
detect
the
second
anomaly
here,
which
is
well
before
the
system
actually
breaks
down
so
detecting
anomalies
early
is
a
really
big
deal?
Okay.
B
There's a measure we call the anomaly likelihood, and the anomaly likelihood is what you really should look at to determine whether the system is anomalous or not. This is a probabilistic measure; it's roughly the probability that the system is in a normal state, or one minus the probability that it's signaling an anomaly. And if you look inside the HTM block, if you're familiar with NuPIC, you'll see some of our familiar components: we have encoders, we have the spatial pooler, and we have the sequence memory, or the temporal memory.
B
Those
are
all
inside
this
block
here.
So
this
is
sort
of
how
we
you
know
use
do
anomaly:
detection
with
HTM,
so
the
stream
of
data
goes
into
HTM
will
model
the
sequences
of
the
temporal
characteristics,
and
these
two
other
measures
here
will
try
to
determine
given
what
the
HTM
is
predicting
and
modeling
right
now.
What
is
the
chance
that
there
actually
is
an
anomaly?
Ok,
so
let
me
switch
to
some
code
I
think.
B
Probably
most
of
you
have
seen
it,
but
if
you
go
to
github.com,
slash
Numenta,
slash
nab.
This
is
what
we
call
the
new
mint
anomaly
benchmark
and
it's
got
results
from
lots
of
different
algorithms,
running
anomaly,
detection
and
I'm,
going
to
focus
in
on
new
mint
HTM
here,
which
does
the
best
in
this
data
set.
There's
a
ton
of
data
in
here
under
the
data
folder
that
you
can
take
a
look
at,
and
our
paper
describes
all
of
this
in
detail,
but
I'm
going
to
focus
in
on
some
of
the
code.
B
If
you
go
into
the
nab
directory
here,
there's
code
for
of
different
detectors
and
there's
code
for
the
momenta
HTM
detector.
So
if
you
click
on
momenta
detector
up
high,
it's
got
pretty
short.
This
is
a
pretty
small
file.
This
shows
you
how
to
use
new
pic
to
do
anomaly.
Detection
using
all
of
the
kind
of
techniques
and
stuff
that
we
know
how
to
and
some
of
the
best
practice
for
for
doing
it.
So
let
me
walk
through
that
code
here,
so
I'm
going
to
switch
to
my
editor.
B
So in here we assume that you basically know the range of values that your data is going to be in; that's the input max and input min. If you're doing percentages, you'll be between 0 and 100; if you're doing temperature, you might be between whatever range your system's temperature is at. But we assume you know what that is, and then NuPIC has this convenient method where, if you give it the min and the max, it will give you back the best HTM model parameters for anomaly detection.
B
This
assumes
you
have
time
stamp
associated
with
your
system,
so
you
know
time
stamp
and
a
value,
but
I
would
highly
recommend.
Starting
with
this.
It
actually
took
us
a
quite
a
long
time
to
figure
out
the
best
set
of
model
parameters
that
works
well
for
anomaly
detection
and
we've
tested
this
on
literally
hundreds
of
maybe
thousands
of
data
files,
and
this
is
them
the
set
of
parameters
that
works
best
across.
B
You
know
the
vast
majority
of
data
files
that
we've
tried-
and
this
is
also
the
best
set
of
parameters
that
works
well
with
them,
so
I
strongly
recommend,
starting
with
this
set
of
parameters,
even
if
what
you're
doing
is
something
slightly
different
from
the
way
we're
doing
it.
If
you
start
with
this
and
then
not
five,
you
have
much
more
likely
to
get
good
results
than
if
you
start
from
scratch.
B
Let's
see,
then
there's
a
method
here
to
set
up
the
encoders.
It
basically
says
how
to
map
the
names
of
the
fields
that
are
in
your
data
file,
so
that's
fairly
straightforward
and
then
this
line
will
actually
create
your
HTM
model.
So
this
is
what
goes
into
that
HTM
block
I
showed
earlier,
and
this
bit
of
code
sets
up
the
anomaly
likelihood
class.
B
Okay,
well,
you
have
to
enable
inference
from
well.
This
is
is
basically
it
because
you
have
multiple
fields
in
here.
You
have
the
timestamp
fields
and
the
value
field
you
have
to
tell
the
system
which
field
is
the
key,
is
the
actual
value
field
and
I'm,
not
hundred
percent
sure?
Why
we
need
this
either,
because
the
anomaly
score
doesn't
really
use
this.
So
it's
a
valid
point
of
confusion,
but
for
whatever
you
think
it's.
B
Yeah
I
guess
we're
repurposing
a
prediction
model
for
this,
but
yeah
in
in
in
in
theory,
this
is
not
really
needed,
but
in
right
now,
with
our
code,
it
is
it's
a
valid
valid
confusion
with
the
anomaly
likely
that
a
couple
of
parameters
you
need
to
pass
in
you
typically
you
want
to
be.
You
want
to
have
the
system
learn
on
a
certain
amount
of
data.
Before
you
start
trusting,
it's
anomaly,
you
know
likelihood
outputs.
So
this
kind
of
this
parameter
kind
of
tells
you
what
that
period
should
be,
and
usually.
B
I'm going to skip this for a second; I'll get back to it in a little bit. Then you take that anomaly score, or prediction error, and pass it to the anomaly likelihood class, and you get back an anomaly probability, which, confusingly, we call an anomaly score here as well. And then, since this is a probabilistic measure and anomalies are extremely unlikely...
B
...it's actually very convenient to work in a logarithmic domain. So we convert the probability into a log likelihood, which is this log score, and then we use that everywhere else in our system. Okay, so that's the basic flow. We found in practice that there's another exception that's very useful to use. Scott, do you want to explain this one? You know what it's about.
D
Yeah, so basically the idea is that occasionally you have cases where the data is really noisy, so one point by itself, no matter how far out of the norm, isn't enough to move the likelihood, because the likelihood uses some window. That is not very intuitive, and a lot of people would come to us and say, hey, why isn't this detecting this very obvious anomaly? It's a clear spatial anomaly, a value way outside the normal range.
D
We've never seen anything like this before; this should be flagged as an anomaly. And so we put this in, and it basically just looks for values some amount outside of the range of data that's been seen so far. It's not the most elegant way to address this, but it would just take anything outside, in this case, five percent of the range we've seen so far and say that that's an anomaly, period, no matter what the likelihood comes out to.
B
So
it's
actually
pretty
straightforward.
You
know,
there's
an
initialization
step
and
a
convenient
function
to
get
the
best
model
parameters
and
then
what
you
have
to
do
is
run
the
model
and
then
send
the
results
through
the
anomaly
likelihood
and
then
use
the
log
version
of
that
to
actually
do
the
threshold
them.
One
question
we
see
in
the
forum's
quite
a
bit.
Is
you
know
what
value
should
I
use
to
to
actually
then
detect
the
anomaly?
B
Sometimes
they
say
well,
I.
The
only
likelihood
is
giving
me
a
value
like
0.99
and
I,
don't
see
an
anomaly
and
that
is
actually
expected
again.
The
anomaly
likelihood
is
a
very,
very
it's
a
probabilistic
measure
and
0.99
means
is
a
there's
a
one
in
a
hundred
chance
that
it's
an
anomaly
that
it's
an
unusual
data
point
and
usually
that's
actually
not
sufficiently
rare
to
call
an
anomaly
and
what
we
use
is
actually
five
minutes.
So
zero
point,
nine,
nine,
nine,
nine,
nine
and
it's
a
threshold.
And
so
that's
why.
B
If you're plotting, plot the log value; it's much more intuitive to use that. So one question: there's the anomaly score and the anomaly likelihood, and some people often ask why they are getting high or low anomaly scores. I see that question quite a bit, and my basic answer is: don't even look at the anomaly score, just look at the anomaly likelihood. The anomaly score can spike up, or it can be low for a short period of time; that does not necessarily mean there's an anomaly.
B
So those are the basic things I wanted to cover. One other thing I wanted to point out is that we have a sample app on our website called HTM Studio. You might actually want to start with this even before you start coding anything, because it contains the same code I showed earlier, embedded inside a UI.
B
You
can
actually
upload
or
use
open
up
one
of
your
data
files
and
it
will
run
through
the
whole
process
with
you
and
show
you
a
nice
UI
where
the
different
anomalies
are,
and
you
can
read
through
it
here.
It's
free
to
download
and
I
think
the
source
code
for
this
whole
app
is
available
as
well,
but
this
is
this
app
underneath
it
does.
This
exact
same
thing,
I
showed
earlier.
A
Ok,
Matt
anything.
What
else
should
I?
Yes,
some
other
teams
of
questions
I
think
we
already
covered
this,
but
I
want
to
emphasize
that
you
don't
need
to
you
shouldn't
need
to
swarm
to
get
anomaly
model
parameters
like
Zubat
I
said
we
already
have
a
good
set
of
them.
However,
I
think
some
people
are
trying
to
do
some
other
things
like
they
have
multiple
scalar
values
that
they're
they're
trying
to
do
multiple
fields
with
time
stamps
and
addition
to
other
scalar
values.
B
That's a great question. So, swarming: we have not found a good way to do swarming for anomaly detection. If you're just doing basic prediction, swarming works quite well, but not for anomaly detection. What swarming does is find the set of parameters that optimizes a particular metric. For prediction error, you can optimize that, and it will give you a good prediction system, but we don't have a good metric for anomaly detection.
D
Specifically, you can't use the anomaly score or likelihood for this; that won't work well. Yeah, I actually think that if your problem is set up so that you can frame it as a prediction problem, and there is a variable that is indicative of your problem that you can make the predicted variable and swarm on, then I think that is a good way to approach it.
B
If
so,
you
could
be,
you
could
do
a
normal
prediction
swarm
and
then
use
the
parameters
for
that
and
then,
instead
of
in
that
code,
I
showed
you
you
can
use
substitute
model
params
for
under
the
swarm,
but
but
you
have
to
be
careful
because
if
you
have
a
data,
that's
very
very
new
care
data,
that's
very
noisy
and
inherently
hard
to
predict.
Then
swarming
is
on
that
on
prediction.
Error
is
not
going
to
give
you
a
good
result,
I
think
so.
The
latency
thing,
for
example,
you.
D
Know
it's
true.
You
have
to
be
calm,
and
the
other
thing
to
keep
in
mind
is
that
the
prediction
case
you're
the
way
that
works
is
its
optimizing
for
a
specific
field.
So
if
you
have
multiple
fields,
it's
optimizing
to
predict
one
of
them
and
it's
gonna
wait.
It's
decision
based
on
the
internal
values
that
they
actually
help
it,
which
might
only
correspond
to
some
of
the
fields
not
all
of
them,
and
so
your
anomaly
score
is,
is
different
from
that.
D
Your
naama
score
is
based
on
the
entire
internal
state
and
it
doesn't
know
what
parts
are
which
feels.
So,
that's
where
you
have
to
be
a
little
careful
where
you
might
get
a
good
predictive
model.
But
if
you
look
at
the
entire
internal
state,
it
might
not
actually
be
a
good
metric
for
for
anomaly
section
for
your
application.
B
Each of these: this is the prediction, the set of cells that were predicted from the previous time step, and then this is the current set of active cells in the temporal memory, and this is just a normalizing term. Actually, sorry, it's the number of active columns; this is all in terms of columns.
A
Okay,
so
some
other
common
questions
that
I
think
I
think
we've
hit
on
most
of
these,
we
talked
about
how
anomaly
detection
is
related
to
prediction.
We've
talked
about
multiple
input
streams
and
how
it
affects
anomaly.
Detection
I
was
just
a
little
bit
confused
by
what
you
just
said,
because
you
said,
if
you're,
if
you're,
calculating
different
predictions
that
that
could
affect
the
the
prediction
error.
B
Yeah, so HTM can predict multiple things into the future. That's one of the nice things about the way the system represents things using sparse vectors. Say you are flipping a coin: heads and tails are both reasonable next steps, so the HTM will be predicting both heads and tails. But if something completely different happens that wasn't predicted, that will be an anomaly. If you're flipping a coin, there's no single prediction that's going to be a hundred percent accurate.
B
The first question, though, is something different: there are parts of the code that are no longer used, the anomaly likelihood region and then the auto classifier. Yeah, they are not used in the NAB example today. The anomaly likelihood is kind of computed after the fact, but ultimately it could be a region that's included in the network, and then you wouldn't have to do that extra calculation. I think that is still in process and not complete, though.
D
Yeah, it could be, and I think he was trying to add different types of computations here, whereas I think we want to just put in what we've sort of proven works, and then people are welcome to create their own regions to do whatever they want. Again, to use this you would be working at the Network API level, as opposed to using the OPF model; the OPF model handles this in a different way.
B
Do you use that one, or do you not use it? No, we haven't used it; I think there are some challenges with it. One big thing is that it's very rare that you're going to see the exact same sequence again that you want to classify. If you think about the example I was showing earlier, this here might be a pattern that you want to classify again, but it's unlikely to look exactly like this.
B
It
might
be
quite
different,
and
so
you
know
how
you
classify
a
sequence
is
actually
quite
a
tricky
problem
and
I.
Think
in
general
it's
an
unsolved
problem
in
machine
learning.
You
know
you
want
to
classify
maybe
things
that
are
quote-unquote
similar
to
this
pattern,
but
not
exactly
Mike
doesn't
have
to
be
exactly
the
same.
So
you.
B
And in general we have not done too much, we've done a little bit of work, but not too much, on sequence classification with HTM. So I think within the HTM community this is an unsolved problem: how do you take a sequence of patterns and, even with multiple training sets, be able to classify it robustly? I would say it's a good research area.
B
One of them is simply to build a completely separate anomaly model for every single field, and then if any of them, or two of them, report an anomaly, you say it's an anomaly. That's one possibility. You could also feed all the data into a single spatial pooler and then do our normal anomaly detection. I've found that that works okay for a few fields, but the amount of training data you need to train the spatial pooler and the temporal memory will grow pretty fast.
B
It's dependent on the underlying dimensionality of the data, so it's hard to give a rule of thumb. The space increases exponentially as you add in more and more fields, so it could be that the amount of data you need grows exponentially, but usually real-world data will fall on some lower-dimensional manifold in that space, so it's a function of that underlying dimensionality.
B
With HTM, what we're doing is making predictions into the future, so data that's correlated at a particular point in time is not going to help too much; you can correct me if I'm wrong, but the spatial pooler should handle that. What you want is data that's correlated in time.
B
Yeah, so let's say you have five variables in there and one of them is slightly off. The spatial pooler, I think, is somewhat resistant to noise, so you'll see just a few bits of difference in the spatial pooler output, and the temporal memory is looking at the spatial pooler output, so it might not detect a huge difference there.
D
So question number four here is about the absence of correlated signals, specifically when you start a new sequence. Basically, I think this is a statement more than a question: you can create your own start signal that is a unit step function in the scalar encoder, and I think what this is giving...
D
In
this
case
it's
not
random,
but
it's
basically
making
sure
it's
some
element
in
between
the
sequences
that
will
break
up
and
kind
of
break
the
model
out
of
its
predictive
state
from
the
previous
sequence.
And
so
what
another
way
to
do?
This
is
just
to
read
to
give
a
reset
to
the
model
so
model
that
reset
will
basically
get
rid
of
the
current
state
and
just
start
from
scratch,
and
so
that
should
do
the
same
thing
as
putting
a
unit
step
function
in
yeah.
D
Interesting. So it shouldn't be too much of a problem, because, assuming that the sequences are things that recur, that happen multiple times, that the model is going to see multiple times, then it will learn the sequences, and things that are noise in between occurrences of predictable sequences will just be considered noise. So I'm not sure.
E
There are three different types of start signals. There's the implicit start, which is at t equals zero; then there's the generated start, where you use some kind of detector to figure out where the event you're interested in actually occurs; and then there's the third one, where your natural data has some sort of natural start signal that comes in on a different channel. I've tried all the different approaches, and the generated one is not always accurate, but it's useful.
E
I did spend a lot of time on not having the start signal, and I actually got very good at detecting anomalies on things that start at random times. Because when you just have a data set and the sequence starts at some time, it's always going to be anomalous at the beginning.
E
I checked, yes, that's the statement, but I actually figured out that it should be non-interleaving, because I tried to interleave the training and it doesn't seem to pick up on the sequences very well. That's weird; it shouldn't matter. Maybe I'll try it again; maybe I was using bad parameters at the time. Okay.
B
How high-order is the sequence? That's the technically correct answer; let me try to unpack it. There are high-order sequences, which means sequences that share common elements, and if you have common elements in there, let's say at most three shared elements in sequences that are ten elements long, then you'll have to do about six repeats of the whole sequence to learn it.
B
So it depends on the resolution of the encoder, but you should be able to do either one. If you have a very high-resolution encoder, 0.4 and 0.5 might be completely different; in that case it won't predict anything. But if you have a coarse enough encoder they'll be very similar, and in the limit it's going to predict both 0.9 and 0.1.
B
It's in NuPIC as well; there's the temporal memory implementation in NuPIC and the temporal memory in our research code. We also have another version which is very similar, where we're doing things like feedback and sensorimotor stuff with the temporal memory, so we have a slightly different version of the code in the research repository.
D
That would be good; we just don't have time to really go into it. But the simple version of what these things mean: maxInfBacktrack is for inference and maxLrnBacktrack is for learning, and each is the maximum number of steps that you can backtrack. This is an optimization in this implementation of temporal memory where, when its predictions are not correct, it will basically go back multiple steps until it can pick up a sequence that it could have followed that would have been correct. Is that correct? Yeah.
B
So yeah, I mentioned earlier that with high-order sequences you have to do multiple repeats. Having a higher pamLength kind of avoids some of those repeats; it sort of really tries to learn high-order sequences. And so you're right that a pamLength of 100 would mean it thinks sequences are very, very high-order, and it's just going to keep predicting along them.
B
Yeah, so this stuff, these are optimizations that were put in to get it to learn quicker in some scenarios, and they're not necessarily biologically accurate, although I think Jeff thinks there could be a biological analog to pamLength as well, for when you're trying to memorize something.
D
Just for the sake of time: we've talked about how encoders are important; you have to capture the semantics correctly in your encoding and get the right proportions of bits and whatnot. Delta encoders: definitely try them out and see if they help or not; they might work in some cases. You might only need a delta encoder and not even need a scalar encoder. Normally we find the scalar encoder works; we start with that and then see whether the delta might help, but you should experiment. Yeah.
B
In NAB we found the scalar encoder works best across all the data sets, all the kind of industrial data sets that we've tried. For the delta encoder, the one case where I found it useful is when you have data that's continuously increasing or continuously decreasing, and it's really the changes that are more important than the magnitude of the values. If it's continuously increasing, the system will never really predict it, because it's always a new data point.
B
I think it's like combining sequences, but yeah, if you're giving it the same value over and over again, it does not actually know where it is in the sequence.
A
That's it. It was the one called C-L-A something; it says something about CLA. Okay, but he was talking about the backtracking TM, and just to be clear, in NuPIC, in the code, the production version of the TM is what we are calling BacktrackingTM, and the one we use in research is just TemporalMemory.
A
Okay, so we're way over time, and we got through everything; I think everyone had a chance to ask questions. So that's another HTM Hackers' Hangout in the bag. We didn't really talk about anything except anomaly detection, so the next hangout will just be a standard one. I really appreciate Subutai and Scott taking their time to answer these questions, and you guys on the forums for laying them out so we could walk through them.
A
I
hope
we
can
refer
to
this
video
and
other
anomaly
detection
questions
that
come
up
that
when
others
have
similar
questions
and
that's
about
it.
Here's
a
treat
for
those
of
you
stuck
around
ol
time.
Here's
my
new
brain
model
really
happy
with
my
new
brain
model,
the
old
one
that
Jeff
gave
me
was
just
old
and
dirty
and
then
had
glue
in
it.
A
Sure
so
I
bought
a
cheap
one
from
the
same
company
for
like
35
dollars,
but
it
was
just
really
bad
and
I
sent
it
back.
So
this
one's
not
so
cheap,
but
it's
a
good
model,
but
anyway
that's
it
for
HT
Macker
hangout.
Thank
you
for
joining
us
one
more
time.
Anybody
has
anything.
Here's
your
chance
to
speak
up.