From YouTube: Semantic Anomaly Detection - Cortical.IO
Description
2015 HTM Challenge Application submission (ineligible for prizes)
B: […] the content of the candidates' tweets. So first, some quick background information about what we do at Cortical.io and how this relates to the HTM. Basically, we provide an API that encodes text into an SDR representation, and this encoding process is similar to the way information is distributed throughout different areas of the brain. So on the left here, you can see a graphical representation of how we store semantic information in a 128-by-128 matrix.
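A hedged sketch of that representation: a semantic fingerprint can be thought of as a sparse set of active positions in the 128-by-128 grid, and semantic similarity as the overlap between two such sets. The words and bit positions below are invented for illustration, not Cortical.io's actual encoding:

```python
# Hypothetical sketch: a semantic fingerprint as a sparse set of active
# positions in a 128 x 128 grid (16,384 possible bit positions).
GRID_SIZE = 128 * 128

def overlap(fp_a, fp_b):
    """Number of positions two fingerprints share (their semantic overlap)."""
    return len(fp_a & fp_b)

# Two toy fingerprints for related words, sharing some active bits.
fp_dog = {10, 42, 99, 512, 1024}
fp_wolf = {42, 99, 512, 2048, 4096}
print(overlap(fp_dog, fp_wolf))  # 3 shared bits
```

The more bits two fingerprints share, the more semantically related the underlying texts are taken to be.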
B: So once you have one of these semantic fingerprint representations of a piece of text, there are a lot of things you can do with it, and one of the cool things about our SDRs is that they're compatible with the HTM and can be fed directly into the temporal pooler. So, in a way, our API acts as a text encoder and spatial pooler in one, and once you start creating SDRs for text, you can use them to let the HTM learn patterns in human language, detect anomalies, and also make predictions.
B: So that's what we did with this application. First, we extracted the text from the Twitter feeds of six presidential candidates, shown here in no particular order. Then we grouped the tweets per candidate by day and created a semantic fingerprint for each day's group of tweets. Then we input those fingerprints into the HTM and graphed the anomaly scores that it outputs by day. And because we use semantic fingerprints as input for the HTM, we're not graphing the anomaly scores based on the volume of tweets, but on their actual semantic content.
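The pipeline described above can be sketched roughly as follows. This is not the actual NuPIC/HTM temporal memory, just a minimal first-order stand-in to illustrate the idea: each day's fingerprint is scored by how many of its active bits the model predicted from the previous day.

```python
def anomaly_score(predicted, actual):
    """Fraction of today's active bits that were not predicted."""
    if not actual:
        return 0.0
    return 1.0 - len(predicted & actual) / len(actual)

class FirstOrderModel:
    """Toy stand-in for the temporal memory: learns day->day transitions."""

    def __init__(self):
        self.transitions = {}  # frozenset(day t) -> set of bits seen on day t+1
        self.prev = None

    def step(self, fingerprint):
        predicted = self.transitions.get(self.prev, set())
        score = anomaly_score(predicted, fingerprint)
        if self.prev is not None:
            self.transitions.setdefault(self.prev, set()).update(fingerprint)
        self.prev = frozenset(fingerprint)
        return score

model = FirstOrderModel()
usual = {1, 2, 3}                 # a candidate's usual daily topics
print(model.step(usual))          # 1.0: nothing learned yet
print(model.step(usual))          # 1.0: transition not yet seen
print(model.step(usual))          # 0.0: the repeated pattern is now expected
print(model.step({7, 8, 9}))      # 1.0: a semantically new day spikes the score
```

A high score on a given day means the day's semantic content was unexpected given what came before, which is exactly what the peaks in the graphs show.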
B: So, what the candidates are actually talking about. The higher the anomaly score, the more unexpected the content of the Twitter posts was for that day. So when you see peaks in the graph, like here or here, the HTM determined that whatever the candidate posted on those days was unusual for that candidate. For reference, we also plotted a few real-world events on the graph as vertical red lines.
B: So you can see how detected anomalies correspond with events like the candidates making official announcements, holding campaign rallies, and taking part in debates. The graphs are interactive: you can move your mouse over data points to see the keywords and exact anomaly scores for those days, and you can also click on the data points to see the full text of the tweets for that day. For most of the candidates, you can see that at the beginning of the graphs, the anomaly scores were initially quite high.
B: But, for example, you can see with Hillary Clinton, in the top graph: as soon as she officially announced her candidacy in mid-April, the HTM immediately detected a change in what she was posting about on her Twitter account. Then it quickly adjusted to this new pattern of topics in her feed, with only minor anomalies popping up after that, like this one here, which seems to correspond with a rally that she held on Labor Day.
B
This
is
done
by
working
on
the
fingerprint
level
of
the
tweets
to
determine
what
the
candidates
are
talking
about
and
not
just
simple
keyword
matching.
So
when
you
click
these
buttons,
it
reduces
the
Twitter
feeds
to
only
posts,
they've
eye
similarity
to
these
topics,
and
then
we
train
separate
HTML
for
each
candidate
on
these
feeds.
So
the
anomaly
scores
are
based
only
on
the
filter
data,
so
you
can
see
that
certain
candidates
tend
to
post
more
about
certain
issues
and
for
some
candidates
it's
actually
an
anomaly
when
they
do
talk
about
certain
issues.
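That topic filter could be sketched as follows, assuming fingerprints are sets of active bit positions. The threshold and toy data are invented; the real application compares full Retina fingerprints:

```python
# Hedged sketch of the topic filter: keep only the posts whose fingerprint
# overlaps a topic fingerprint above a cutoff, then train a separate model
# per candidate on that filtered subset.
def filter_by_topic(posts, topic_fp, min_overlap=2):
    """posts: list of (text, fingerprint) pairs."""
    return [(text, fp) for text, fp in posts if len(fp & topic_fp) >= min_overlap]

topic_economy = {5, 17, 42, 99}          # invented topic fingerprint
posts = [
    ("tax plan announced",    {5, 17, 200}),
    ("rally photos",          {300, 301}),
    ("jobs report reaction",  {17, 42, 99, 400}),
]
print([text for text, _ in filter_by_topic(posts, topic_economy)])
# ['tax plan announced', 'jobs report reaction']
```

Because the filter works on fingerprint overlap rather than literal keywords, a post about "jobs" can match an "economy" topic even if the word "economy" never appears in it.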
B: And the entire application is available live right now at this URL; I will put the link in the video description. We encourage people to take a look at it, draw their own conclusions about the anomalies detected, and hopefully use it as a way to get a clearer picture of how politicians speak in the media. So that's it: semantic anomaly detection with the Cortical.io Retina API and the HTM. We at Cortical.io are big fans of the HTM, and we're very much inspired by the work that Numenta does.
D: I hope I might be jumping in here. What's not clear to me is how much the temporal memory is actually adding here versus just using the fingerprints. Because, you know, if a candidate is very consistent every day and they're putting out the same basic topics, then there wouldn't be any pattern there. It would be kind of flatlining; it's just, you know, the same, same, same, and then you would be able to detect a change, and the HTM temporal memory would see that.
B: I think that's true, and we could try to just pick out what topics are happening when, and then see: okay, is this a new topic that hadn't happened before? But I think, yeah, I don't know, I think by having the anomaly score, you really see exactly, you know, how predictable was this? How different was this? But...
D: If you're going to be doing a sequence of day-to-day compilations, you're going to need to see a flow or change day to day, right? And, you know, there might be something, like as current events occur, or the approach to an election, or I don't know what it is. It just wasn't clear to me that the temporal memory is going to add a lot over just doing a distance or overlap score. I don't know.
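The static baseline D is describing could be sketched like this, assuming fingerprints are sets of active bits: each day is scored only against the previous day, with no learned sequence model.

```python
# Sketch of the static baseline under discussion: score each day purely by
# its fingerprint distance to the previous day, with no sequence learning.
def static_novelty(prev_fp, today_fp):
    """1.0 = no overlap with yesterday's bits, 0.0 = identical active bits."""
    if not today_fp:
        return 0.0
    return 1.0 - len(prev_fp & today_fp) / len(today_fp)

print(static_novelty({1, 2, 3}, {1, 2, 3}))  # 0.0: same topics as yesterday
print(static_novelty({1, 2, 3}, {7, 8, 9}))  # 1.0: entirely new topics
```

One limitation of this baseline: a candidate who regularly alternates between two topics would look novel every single day, even though the alternation itself is a perfectly predictable sequence that a temporal model could learn.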
G: I think it actually contributes a lot, because if you would do that in a static fashion, the whole detection of something new would be directly dependent on what the person is talking about overall. The only way of doing this without the temporal memory would basically be to spot for specific features or pixels in the fingerprint to appear, but it would not tell you whether, for those pixels to appear at this very moment in time, this would be something new.
G: It's basically, it's somehow smarter in figuring out how they are related. So if you just take the bag of words, the only thing you can do is match the keywords, and if someone expresses a certain concept by using other keywords, there is no way of directly figuring out that this is actually similar, but there is by using the arrangement in the fingerprint.
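G's contrast can be made concrete with a toy example. The fingerprint bit-sets below are invented for illustration (the real Retina learns them from text): two synonyms share zero keywords yet still overlap in fingerprint bits.

```python
# Toy illustration: keyword matching sees nothing in common between two
# synonyms, while their (invented) fingerprint bit-sets still overlap.
toy_fingerprints = {
    "physician": {1, 2, 3, 8},
    "doctor":    {2, 3, 4, 9},
}

def keyword_overlap(text_a, text_b):
    """Bag-of-words comparison: count shared literal words."""
    return len(set(text_a.split()) & set(text_b.split()))

def fingerprint_overlap(word_a, word_b):
    """Semantic comparison: count shared fingerprint bits."""
    return len(toy_fingerprints[word_a] & toy_fingerprints[word_b])

print(keyword_overlap("physician", "doctor"))      # 0: no shared words
print(fingerprint_overlap("physician", "doctor"))  # 2: shared semantic bits
```

This is the sense in which the fingerprint arrangement is "smarter" than keyword matching: relatedness survives a complete change of vocabulary.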
G: Yeah, so it's a, yeah, exactly, it's a language model, I would call it. It's a semantic model, even, because there are certain aspects of language that we don't consider, for example, the actual sequence of the words. But as far as the aboutness of the whole thing is concerned, that's what is basically modeled.
G: So with the tweets, the problem is basically to actually find something meaningful. I mean, an improvement would be to actually train the system purely on tweets, which would then allow you to also take into consideration all the smileys and all these shortcuts that they use. And, as I said already in a couple of conversations, I could imagine that by taking all the smileys into account, you would get a better sentiment analysis on the tweets than we see in current systems, which try to do this by dictionary, basically.
B: That part was done by some of my colleagues, but they did a very similar hack at the last hackathon, the breaking-news demo, and we started basically with the parameters that we had for that. So we didn't do a whole lot of tuning this time around, because I guess the last time they already did a lot of it.