From YouTube: Temporal Memory Part 1 (Episode 11)
Description
This episode offers a detailed introduction to a key component of HTM theory and describes how neurons in the neocortex can remember spatial sequences within the context of previous inputs by activating specific cells within each column.
Using detailed examples, drawings, and computer-animated visualizations, we walk through how cells are put into predictive states in response to new stimuli, and how segments and synapses connect cells within the columnar structure.
Intro music: "Books" by Minden: https://minden.bandcamp.com/track/books-2
So far, we've only talked about spatial patterns and how the spatial pooler translates them from an input space into a normalized representation with a fixed sparsity that still contains all the semantic information of the original encoding. Today, we're going to talk about temporal patterns and how the temporal memory algorithm in HTM recognizes sequences of spatial patterns over time.
The temporal memory algorithm does two things. One, it learns the sequences of active columns in the spatial pooler over time. Two, it makes predictions about what pattern is coming next, based on the temporal context of each input.
It does this by activating individual cells within many columns. In previous episodes, I've talked a lot about active columns in the spatial pooler and how each column becomes active in response to stimulus in the input space that overlaps its potential pool.
So let's talk a little bit more about this neurobiological structure.
Let me just get rid of this stuff here. First of all, the input space is in green down here below; as you've seen in other videos, it shows all the active bits. This is the feed-forward input, and I'm going to draw the minicolumns now, each one having four cells.
So let's highlight this as being a column, in yellow. Each column, if you remember from other episodes, has a connection to this input space and has a receptive field.
Each column is connected to different bits in the input space, and these are proximal dendritic connections: feed-forward input into the system. Every cell shares this proximal input in each column. Every column has a separate receptive field, and each one of the cells within the column shares that receptive field through its proximal connection.
Now let's look at the columns without the input. Aside from the proximal dendritic connections, we also have these other connections between cells within the structure. So, for example, here's a cell with a segment that's connected to four other cells, so it has four synapses. Every cell in the structure could potentially connect to any other cell in the structure.
So how does this look on the cell body itself? The cell body, or the soma, has different areas of receptivity: the feed-forward proximal input comes from below, and the contextual information, the distal connections, comes laterally from other cells within the structure.
There are two primary phases of the temporal memory algorithm. The first is to identify which cells within active columns will become active on this time step. The second phase, once those activations have been identified, is to choose a set of cells to put into a predictive state.
This means that these cells will be primed to fire on the next time step. So I'm going to show you a visualization of a running HTM system, and we're going to look and see how the temporal memory algorithm actually deals with active columns and tries to activate cells within each column.
Okay, let me introduce you to my little sequencer here. What I wanted to do was show a way that you could input a very simple sequence into the temporal memory algorithm, so I chose a note sequencer to do that. I can choose which notes I want to send, and in what order, and send a sequence of patterns into the HTM system.
So you can see it running like this. This should look familiar from other episodes: we have the encodings on the left, where each note gets a different individual encoding, and then the spatial pooler's columns are on the right, with the active columns in big yellow. We can see this better.
So what I'm doing is feeding in a four-note sequence and then resetting and restarting the sequence, over and over. Every time it sees F-sharp, the first note in the sequence, it's seeing it without any context: nothing came before it. When it sees E, it sees it within the context of F-sharp, and C-sharp within the context of E, etc. But this is only a four-note sequence, and then we reset and start over. So I'm going to explain to you the two conditions in which cells activate within the temporal memory algorithm.
If I show you the active cells here by toggling this on and off, you can see that, with this first time step in the sequence, every cell within every active column is active. Because it has never seen this spatial input within a context before, it has no predictive cells to activate. The other reason cells within active columns activate is because they have been put into a predictive state by some previous context of this spatial input.
So in this example, let's highlight the active cells here again, and you'll see that these columns are no longer bursting, because the algorithm has seen E in the context of F-sharp in the past. What these active cells are representing is that they were in a predictive state when we got to this point. So, instead of showing active cells, let's show the predictive cells: they are the exact same thing. The algorithm goes and looks into every cell in every active column. Only cells within active columns become activated, because these activations are completely driven by the proximal segments' connection to the input space.
So if one of these cells in an active column is in a predictive state, like all of these cyan cells we're looking at here, then it becomes active in this time step. That's the first phase of the temporal memory algorithm: to decide which cells become active. Either all the cells in a column become active, because there are no cells in a predictive state, meaning we're sort of kicking off this sequence, seeing it for the first time; or there are cells in a predictive state within a column, and those will be switched to active if they were correctly predictive, given the current set of active columns. It basically validates the prediction: yes, it was active; it fell within an active column.
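To make that first phase concrete, here is a minimal sketch in Python of the activation rule just described. This is an illustration, not Numenta's implementation; it assumes cells are named by (column, cell-index) pairs, and the function and parameter names are mine.

```python
def activate_cells(active_columns, prev_predictive, cells_per_column):
    """Phase 1 of temporal memory: pick the active cells for this time step.

    active_columns  -- set of column indices chosen by the spatial pooler
    prev_predictive -- set of (column, cell) pairs predicted last time step
    Returns the set of (column, cell) pairs that become active.
    """
    active_cells = set()
    for col in active_columns:
        # Cells that correctly predicted this column's activity.
        predicted = {(c, i) for (c, i) in prev_predictive if c == col}
        if predicted:
            # The prediction is validated: only those cells fire.
            active_cells |= predicted
        else:
            # No cell predicted this input in this context: the column
            # "bursts" and every cell in it becomes active.
            active_cells |= {(col, i) for i in range(cells_per_column)}
    return active_cells
```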
You might be wondering at this point: how did these cells become predictive?
That's the second phase of the temporal memory algorithm. After we have identified which cells are currently active in this time step, based upon their predictive states within the active columns, we go and look at every single cell in the structure and look at its segments: we look at its distal segments and each synapse on each segment.
So let's take a closer look at the HTM neuron before we start talking more about these distal segments.
In this diagram, which we show a lot in our literature, we're comparing a biological diagram of a neuron to the HTM neuron in the software that we're creating. We have the feed-forward input, which is the proximal dendritic input from the input space, shown in green on both sides, and then the distal input from lateral connections to other cells within the space, shown in blue, for context. Now, this HTM neuron diagram over here is showing that there's feed-forward input, but it's also showing that it can have one or many distal connections. These are distal segments.
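As a rough picture of that structure in code, here is a toy sketch of an HTM-style cell. It deliberately leaves out synapse permanences and learning, and the class and field names are my own, not NuPIC's API.

```python
from dataclasses import dataclass, field

@dataclass
class HTMCell:
    """A toy HTM-style neuron: the proximal input is shared by the whole
    column, so the cell itself only carries its distal segments."""
    column: int   # the minicolumn this cell belongs to
    index: int    # position of the cell within its column
    # Each distal segment is a set of (column, cell-index) pairs naming
    # the other cells it synapses onto, for lateral context.
    distal_segments: list = field(default_factory=list)
```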
Next, knowing that, I'm going to show you how cells become predictive. What we do is go through every single cell, and in the algorithm we look at all of its segments, and on each segment we sum up the synapses, checking whether the cells on the far end of those synapses are active or not. If that sum breaches a threshold, we put the cell in a predictive state.
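Continuing the toy `HTMCell` sketch from above, that second phase might look like this. The `activation_threshold` value is an illustrative placeholder, not a real default.

```python
def compute_predictive(cells, active_cells, activation_threshold=3):
    """Phase 2 of temporal memory: find cells to put into a predictive state.

    cells        -- iterable of HTMCell objects
    active_cells -- set of (column, cell-index) pairs active this time step
    """
    predictive = set()
    for cell in cells:
        for segment in cell.distal_segments:
            # Count synapses whose far-end cell is currently active.
            active_synapses = sum(1 for target in segment
                                  if target in active_cells)
            if active_synapses >= activation_threshold:
                predictive.add((cell.column, cell.index))
                break  # one matching segment is enough to predict
    return predictive
```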
Now, let's show the currently predictive cells in this structure.
These are the cells that the algorithm is predicting will be active in the next time step. Let's dive in and take a look at a couple of these cells here. A little shaky camera work, but I just clicked on a cell that is in a predictive state: it thinks it's going to be active next, and I'm showing you exactly why, by showing you this cell's segments.
There is only one segment on this cell, because it's all one color, this magenta color, and all of the cells that it's connected to are active in the current time step. That's why it's predictive: it looked at its segments, summed up all the synapses, and apparently they were all active, and that breached its threshold to become predictive, so it is in a predictive state. If we look at these other cells that are in predictive states, they are all basically in the same boat.
They are all recognizing the same contextual pattern: they're basically all seeing the previous note being F-sharp and the current note being E. So these are all representing E in the context of F-sharp, and predicting that C-sharp will be next.
These predictive cells are indicating that C-sharp will be next, and if we move this one step forward, you're going to see all of these blue cells turn orange; they're all part of the next set of active columns. So in this next time step we did the same calculation: we looked inside the current active columns
that the spatial pooler gave us, and checked to see if any of those cells were predictive. All those cells we just saw were predictive, and they all fell within these active columns, so they became active. Then we do the same calculation across all the cells to see which cells become predictive for the next time step.
So let's let this run a little bit, because there's something interesting I'd like to show.
So it's learning this sequence really well. Let's show the active cells and the predicted cells, and dive in and look at this. Aside from the first note in the sequence bursting, because it bursts every time, since it has no context for that F-sharp note (it's never seen it coming after anything, because it's the first note in the sequence), all the rest of the predictions seem to be spot-on.
Now what happens if I change one of these notes? Let's pause right here. Let's go one more. Now let's do it right here.
So instead of a C-sharp, let's play an F-sharp. I just changed the sequence. All of these predictive cells right now, the blue ones, are predicting C-sharp. When I step forward, we're not going to get C-sharp; we're going to get something different. What do you think is going to happen? Let's find out. So let's step forward, and we see columns bursting all of a sudden.
Does that make sense? Yeah, it does make sense, because even though we've seen the input F-sharp before, we have never seen F-sharp coming after an E. So this spatial context, which we do recognize, because we've seen these active columns before, is not recognized temporally, because we have never seen the sequence F-sharp, E, F-sharp; we've only seen F-sharp, E, C-sharp. That's why these columns burst: they're seeing a new sequence, something new, as we step forward and come back around on the next time step.
We see something else interesting if I show you the predicted cells. Let's let it learn this new pattern a few times; it takes it a while to recognize it. There we go. So now, back at the E, you might notice that there are almost twice as many predictive cells at this point, because we're teaching it a second pattern. These predictive cells at this time step are not only predicting C-sharp, which it has seen several times.
These predictive cells are predicting F-sharp as well, and if we move one time step forward, we will see that some of them were correctly predicted: it correctly predicted F-sharp with some of them, and those, the green ones, are correct predictions. Some of them were wrongly predicted: the red ones are incorrect; those were the predictions it was making for C-sharp. So it can represent these two different sequences, branching off in different directions
from this point onward. Now, over time, as the algorithm learns, it will forget sequences that it hasn't seen in a long time.
All of these parameters are tunable, so you can make it forget very quickly or retain information for a long period of time. It's all about how the synapses degrade; we'll talk about that in the next episode of HTM School.
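The details are next episode's topic, but as a rough idea of what "synapses degrading" could mean, here is a minimal sketch, assuming each distal synapse carries a permanence value that is decremented over time and pruned when it reaches zero. The `decay` value is an arbitrary placeholder, not a real NuPIC default.

```python
def decay_segment(permanences, decay=0.02):
    """Weaken every synapse on a segment and prune the dead ones.

    permanences -- dict mapping (column, cell-index) -> permanence in [0, 1]
    Returns a new dict containing only the synapses that survive the decay.
    """
    return {target: p - decay
            for target, p in permanences.items()
            if p - decay > 0.0}
```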
So let's say we have another temporal sequence, a four-step sequence, and we're trying to encode the concept "boys eat many cakes". Just assume that we have an encoding process that encodes the semantic information for these four words in a way that's recognizable by the spatial pooler, and that the spatial pooler is assigning them active columns. So, before learning, the spatial pooler is going to represent these four elements in the sequence
with these active columns. The temporal memory, after learning, is going to identify specific cells within each column to represent that spatial information within the context of the temporal sequence. So after learning, once we get the term "boys", the spatial pattern for "boys", which kicks off the sequence, all the columns burst, because there is no context for "boys": it's always the first word in the sentence.
Then when "eat" arrives, we get eat-prime: it's "eat" with an additional temporal component. Same thing with "many": many-prime is the spatial representation for "many" with the additional context of "many" in the context of eat-prime, in the context of "boys". So we're driving sort of down this temporal sequence, and it goes all the way to "cakes", which is now cakes-prime. The spatial pattern for "cakes" is represented in cakes-prime, within the context of many-prime, which is in the context of eat-prime, in the context of "boys".
So you can see how we're driving down a sequence that could potentially diverge along each node in the path.
So what happens if, after we've trained the temporal memory algorithm with "boys eat many cakes", we start sending in a new sequence: "girls eat many pies"? Now we've got these two different spatial-temporal patterns that the algorithm has learned. The first and last elements are spatially completely different, but the innermost ones are very similar, so there's going to be some ambiguity here.
So let's talk about this. After learning, when we go and see which cells activate for this temporal pattern, "girls eat many pies", we see that we have eat-double-prime, which is a different representation than we had when we said "boys eat", which was eat-prime. That "eat" was within the context of "boys"; this "eat" is within the context of "girls", hence the double prime. Same thing with "many": this is now the spatial pattern
for "many" within the context of eat-double-prime, which is in the context of "girls". So we get this temporal information of "girls" passing all along this temporal pattern. Even pies-double-prime now is "pies" within the context of many-double-prime, within the context of eat-double-prime, within the context of "girls". So the temporal information that the sequence started with a different term than "boys" is followed all the way through the sequence.
So after learning, we can see that there are these two completely different representations of the ambiguous terms in the sentences, so we can identify one "eat" versus another "eat", knowing that we're within a specific context. If we got one of these activations, we would know: oh, "girls eat". With the other, we would know: "boys eat".
So what happens now, if we've learned these specific cell activations for these specific spatial-temporal patterns, and we send in an ambiguous input? When we were starting off each sequence, just sending in "boys" or "girls", sending in that spatial pattern, all the columns activate and all the cells within the columns activate. What if we sent the spatial pattern for "eat" in as the first element of the sequence, with no context at all? The system won't know which "eat" it is: did boys or girls eat?
It turns out that all the cells for both many-prime and many-double-prime become predictive, because we don't know the context of this "eat", but we do know the contexts that we have seen "eat" in in the past. So we might as well predict that it's going to be this one, or it's going to be that one.
So we have a set of predictive cells, like I showed you in the previous representation, that is potentially looking at two different pathways that this sequence could take, based on the two paths that it's seen in the past. We can take this even one step further, if we give it two ambiguous inputs.
What if we just sent in the spatial pattern for "eat" and then the spatial pattern for "many"? Which cells will now be predictive? We've got all these cells for "cakes" and "pies", and what happens here is the exact same thing: since we don't have a context, the system still knows, well, I've seen "eat many" in two different ways.
So I know that in one way "pies" follows, and in another way "cakes" follows. So it gives you the representation for both "pies" and "cakes", within the context of those two temporal sequences.
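That union of predictions falls straight out of the phase-2 rule sketched earlier: every learned context that matches the current activity contributes its predicted cells. Here is a toy illustration of just that behavior, with made-up sequences standing in for learned distal segments.

```python
# Toy illustration: an ambiguous input predicts the union of learned futures.
# The contexts and predictions here stand in for learned distal segments.
learned = {
    ("boys", "eat", "many"): {"cakes"},
    ("girls", "eat", "many"): {"pies"},
}

def predict(seen):
    """Union of predictions from every learned context ending in `seen`."""
    n = len(seen)
    return set().union(*(nxt for ctx, nxt in learned.items()
                         if ctx[-n:] == seen))

print(predict(("eat", "many")))          # {'cakes', 'pies'} -- ambiguous
print(predict(("boys", "eat", "many")))  # {'cakes'} -- context disambiguates
```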
The next episode will be a more in-depth view of the temporal memory algorithm, including how it learns: how distal segments get created over time, and how their synapse permanences are adjusted to better match recurring patterns.
I'm going to end this episode with a puzzler. The examples I've shown today have all had many cells per minicolumn.
How would the behavior of the temporal memory algorithm change if we only used one cell per minicolumn? If you think you know, leave a comment and try your luck. Don't forget to subscribe to our YouTube channel so you won't miss any episodes of HTM School, and please hit that like button if you enjoyed this video, and I'll keep them coming. Thank you for watching HTM School.