Numenta Featured Applications, 1 Dec 2015

From YouTube: HTM Models for Adelaide Arterial Traffic Flow - Jonathan Mackenzie

Description

2015 HTM Challenge Application Submission (1st place winner).

A

This next submission is called HTM for Adelaide Arterial Traffic Flow, and we will have Jonathan Mackenzie on Skype to take comments after the video.

B

So hi, I'm Jonathan Mackenzie, and this is my presentation for the Numenta HTM Challenge. My presentation is on HTM models for automated incident detection in Adelaide, Australia. So what is automated incident detection? Firstly, we need to know what congestion is, and that's basically where the capacity of the road is exceeded, and so everyone slows down. So this can be recurrent, which you might think of as something like rush hour in the morning when everyone goes to work and rush hour in the afternoon when everyone goes home.

B

So our goal is to detect nonrecurrent congestion, and that's where something like an accident, a breakdown, spilled load, burst pipe, landslides, flooding, riots and protests cause congestion on the road. So the problem is basically that freeway automated incident detection is well researched and there's lots of solutions for it.

B

But we want to try and do the same thing on arterial roads, which is much more difficult due to the very nature of the traffic, with everyone turning and going different ways. And we use loop detectors, which are sensors underneath the road which count traffic, and they come with their own warts. So in our solution we use HTM to detect anomalous traffic flow.

B

This is an improvement on previous research, which mostly used simulations and supervised learning techniques, which allowed people to easily set up incidents in traffic flow and really easily monitor everything. But we have to use real-world data, which comes with all its own problems. So the way my system works is we import the data, which is provided by the Transport Systems Centre at Flinders University, into MongoDB.

B

The data is analyzed using HTM and we view it in a Python web application. So we've currently got three and a half terabytes of data from seven years, and that's at five-minute intervals, but for now I'm only looking at a two-month period, which is about 130 intersections and 170,000 data points per intersection. And this is private data, so you don't have access to it.

B

Sorry. And we also use crash data to verify anomaly predictions, and there are 142,000 incidents, but in the period we're looking at there are only [unclear]. This is public data and you can get it from data.sa.gov.au. So here's my readings data. Basically, it's just a collection of objects which map the sensor to the count at the sensor.

B

So we can see that on intersection 3041 at 8:30 am there were 58 vehicles on sensor 136, and here's all the crashes, which basically just say when and where a crash occurred and what sort of features it had, like whether it was raining or dry, etc.
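A minimal sketch of the two kinds of document described here may help. The field names, the date, and the crash values are assumptions for illustration only; the talk gives just the intersection (3041), the time of day (8:30 am), the sensor (136) and the count (58).

```python
from datetime import datetime

# One readings document: sensor id -> vehicle count for a five-minute interval.
# Field names and the date are hypothetical; only the intersection, time of
# day, sensor and count come from the talk.
reading = {
    "intersection": "3041",
    "datetime": datetime(2015, 1, 1, 8, 30),  # placeholder date, 8:30 am
    "readings": {"136": 58},
}

# One crash document: when and where it happened, plus descriptive features.
# Every value here is a placeholder for illustration.
crash = {
    "datetime": datetime(2015, 1, 1, 8, 25),
    "intersection": "3041",
    "features": {"weather": "raining", "crash_type": "rear end"},
}


def count_for(reading_doc, sensor):
    """Return the vehicle count for one sensor, or None if it is absent."""
    return reading_doc["readings"].get(sensor)
```

Keeping the counts in a sensor-keyed mapping makes it straightforward to pull out the series for a single sensor, which matters once each sensor gets its own model.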

B

So the analysis script uses HTM, and there's an option to do it with one model which takes in every sensor as an input, and then we get the anomaly score from that, but that didn't work very well. So we use one model per sensor. So if it's an intersection with 24 sensors, that'll be 24 models and 24 different outputs, but still this takes a long time. So we use a supercomputer.
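The one-model-per-sensor layout can be sketched roughly as below. The detector is a deliberately simple stand-in (a rolling z-score), not HTM and not the NuPIC API; `StandInDetector` and `score_intersection` are hypothetical names used only to show how each sensor keeps its own state and produces its own score.

```python
from collections import deque
from statistics import mean, pstdev


class StandInDetector:
    """Placeholder for a per-sensor HTM model. This is NOT HTM: it only
    scores how far a count sits from a rolling mean of recent counts."""

    def __init__(self, window=12):
        self.history = deque(maxlen=window)

    def score(self, count):
        if len(self.history) < 2:
            self.history.append(count)
            return 0.0
        mu = mean(self.history)
        sigma = pstdev(self.history) or 1.0  # avoid dividing by zero
        self.history.append(count)
        z = abs(count - mu) / sigma
        return z / (1.0 + z)  # squash into [0, 1)


def score_intersection(models, readings):
    """Score one interval of readings (sensor -> count), one model per sensor.

    `models` maps sensor id -> detector and is filled in lazily, so an
    intersection with 24 sensors ends up with 24 independent detectors.
    """
    scores = {}
    for sensor, count in readings.items():
        if sensor not in models:
            models[sensor] = StandInDetector()
        scores[sensor] = models[sensor].score(count)
    return scores
```

Because each sensor has its own detector, a dip on one lane shows up in that sensor's score alone, rather than being averaged into a single output for the whole intersection.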

B

So here you can see the script running on my machine using multi-model, which means that there's one model per sensor, and each one runs in its own subprocess. This is for intersection 3001, and it's writing the results to the database.

B

There is also a smoothing option available, which we haven't used, but this applies a median filter with a window size of your choosing to the readings data. And you can see down here that this uses more processes than cores, since intersection 3001 uses 20 sensors, so it won't run as fast as it could. So we run it on a supercomputer, and here you can see the supercomputer queue with all the jobs lined up, and the supercomputer has around 1200 cores.
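The median-filter smoothing option mentioned above could look something like the following. This is a sketch, not the project's code: the function name, the requirement for an odd window, and the choice to shrink the window at the ends of the series are all assumptions.

```python
from statistics import median


def median_filter(counts, window=3):
    """Centered sliding median over a list of counts.

    The window shrinks at both ends of the series instead of padding,
    which is an assumption made for this sketch.
    """
    if window < 1 or window % 2 == 0:
        raise ValueError("window must be a positive odd integer")
    half = window // 2
    return [
        median(counts[max(0, i - half): i + half + 1])
        for i in range(len(counts))
    ]
```

For example, `median_filter([10, 12, 90, 11, 13], window=3)` replaces the isolated spike of 90 with 12, which is the kind of sensor noise a median filter is meant to suppress.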

B

So we can easily use as many cores as there are processes for the intersection.

B

And then you can view it using the web application. So here you can see the lovely city of Adelaide, and this is the central business district, which is the only area that we have traffic data for. So we can click on an intersection and click a button to see the data for the intersection. So here we've got intersection 3083, a signalized T-junction, and here's all the readings on sensor 56. And here we have the anomaly scores in blue, anomaly likelihood in red, incidents in green, and the orange dots indicate that a sensor has exceeded the threshold

B

given here, 0.99. [unclear] taking the log of the anomaly likelihood. So if we zoom into this section here, we can see that at this time there was some anomalous traffic happening.
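As a rough illustration of the threshold and the log transform mentioned above: the formula below follows what NuPIC's anomaly likelihood code is understood to use for its log scale, but the constants should be treated as an assumption and checked against the library. The talk does not say whether the 0.99 threshold is applied to the raw likelihood or to the log value; this sketch applies it to the raw likelihood.

```python
import math

THRESHOLD = 0.99  # likelihood threshold quoted in the talk


def log_likelihood(likelihood):
    """Map an anomaly likelihood in [0, 1] onto a log scale in roughly [0, 1].

    The constants are an assumption modelled on NuPIC's log-likelihood
    helper; -23.02585084720009 is ln(1e-10), so a likelihood of 1.0 maps
    to about 1.0 and values below 0.9 stay close to 0.
    """
    return math.log(1.0000000001 - likelihood) / -23.02585084720009


def exceeds_threshold(likelihood, threshold=THRESHOLD):
    """True when a sensor's anomaly likelihood reaches the threshold."""
    return likelihood >= threshold
```

On this scale a likelihood of 0.99 corresponds to about 0.2 and 0.9999 to about 0.4, so the plotted curve only rises noticeably when the likelihood is very close to 1.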

B

So certainly here there was a significant dip due to this incident here, and if we mouse over, we can see that it was caused by inattention and was a rear-end that cost two thousand dollars, and down here we can see on the map exactly where it occurred, for the traffic going north. But it did think this bit of traffic over here was anomalous, and that was on sensors 8, 48, etc. We can click the button to change the sensor

B

and see what the readings were at that time.

B

So that's that view, and we also have an incidents view where we can easily cross-validate crashes with the anomaly scores.

B

So here we have a list of incidents with the date and which intersection they occurred at, and if any of its sensors exceeded the 0.99 threshold, then they'll be shown here with the sensor. We can filter to only show ones which have exceeded it, and we'll click on this one here, 3040. And so here we can see there was an accident at that time. If we zoom in, we can see that there was a dip in traffic at that time, and that did come up as an anomaly on these sensors: 56, 72 and 80.
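The cross-validation in the incidents view can be sketched as a lookup of which sensors crossed the threshold near the time of a crash. The function name, the tuple layout of the scores, and the 30-minute window are assumptions for illustration; the talk does not say how wide a window the application uses.

```python
from datetime import datetime, timedelta


def sensors_exceeding(incident_time, scores, threshold=0.99,
                      window=timedelta(minutes=30)):
    """Return sorted sensor ids whose anomaly likelihood reached the
    threshold within `window` of the incident.

    `scores` is a list of (timestamp, sensor_id, likelihood) tuples for one
    intersection. The 30-minute window is a hypothetical default.
    """
    hits = {
        sensor
        for timestamp, sensor, likelihood in scores
        if abs(timestamp - incident_time) <= window and likelihood >= threshold
    }
    return sorted(hits)
```

An incident with an empty result is one the detector missed (or one that did not disturb traffic), while a non-empty result lists the sensors to inspect in the readings view.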

B

They were anomalous for a while around the time that [unclear] occurred.

B

So here's some results I found. It didn't find, even though there was quite clearly anomalous traffic here around an accident, that it was anomalous, and this is probably due to the very noisy nature of the data, which you can see here. And thank you for watching, and thank you to NuPIC for providing this challenge, and thank you to the Numenta team for helping me out with all parts of the project.

A

Thank you, Jonathan.

A

First off, as someone who has built traffic analysis applications with NuPIC before, I just want to say that the amount of software it takes to do what Jonathan did is pretty significant. So, impressive, just in the maturity of the software in a short period of time. Thank you, Jonathan, for your work. Judging panel?

C

First of all, you look very tired. Don't drive.

B

Just woke up.

C

A question: you said that you tried to put all the sensors from an intersection into one model, and I imagine that there was some encoding that you had to do and that you had a number of choices to make, and you said it didn't work very well. Can you say more about that?

B

With one giant model that took in every sensor as an input, it could only give one output, and that was one anomaly output, so you get an anomaly score per intersection rather than per sensor. And so it was difficult to tell which sensor was actually acting up, because at an intersection an incident in one lane might not slow the traffic in another in a significant way, and so you really need to detect that.

B

So if NuPIC had multiple outputs for anomaly scores, that would be good, but yeah, it didn't really work, because the score it produced was pretty much just very low the whole time, no matter what. And I'm not sure if that's due to the fact that I didn't put enough data into the model, but that's just what I found, so I pretty quickly switched to one model per sensor.

C

Thanks.

D

Yeah, it's an impressive application, and I had a question about the historical data, the incident data.

D

How confident are you in the accuracy of that data, to know if you should have gotten an anomaly score when you didn't, or did?

B

I'd say fairly confident, because I think that the damage caused is a pretty good indicator of the severity of the incident, and so the length of time that it would take to clean up, etc. I think it's fairly accurate, because people are required by law to report any accident on the road to the police. I mean, I think they have a data entry person entering it, but there's never a way to verify that, because it's done at a [unclear].

D

Right. I was curious, since sometimes you're expecting to see an anomaly and you didn't, whether it could have been because of a data entry error on the incident report.

E

It occurred to me, just from my casual observation as a driver, that sometimes with accidents they quickly go to the side and it doesn't slow down anybody, and sometimes it just backs up everything, so the correlation between an accident and an anomaly in traffic flow is not always so great. [unclear] my question here: were you using the traffic incident reports just as labeled data, or is that somehow the goal of the system? What was the goal?

E

So what would the ultimate utility of the system be?

B

I think the ultimate goal is to have it actually identify an accident that causes congestion, because, as you say, if there's a small rear-end and they pull over, that doesn't really cause congestion. You really want the ones that do cause congestion, because they're the more severe ones. But I think in my application it was mostly just, for now, to search for anomalous traffic, and I think we found a fair bit of that, but yeah.

D

You mentioned that there's a desire from city services to be able to identify actual incidents, and therefore have greater confidence in when to roll trucks, and things like that, right?

B

Yeah, so that's a public safety thing, so you can clear out accidents faster.

D

Could you imagine this, then, as a real-time system, if it were actually to be implemented?

B

Yeah, for sure. It's pretty simple just to hook up the data and have it run continuously, but that's something I need to organize with the council in the city.

F

Right. I noticed you were using a threshold of 0.99 for anomaly likelihood. It seems very high. Does that mean you were getting a lot of false positives that you had to filter out by being very picky about what it calls an anomaly?

B

The 0.99, or 0.999, was provided to me by Matt Taylor. I think that was a standard threshold that's used in NuPIC projects. You can take the logarithm of the anomaly likelihood and then that will show you it, because it really only peaks when it actually thinks there's anomalous behavior going on. So that's what I was told was best practice.

F

Okay, thank you.

G

Question. So you trained the models with your supercomputer. Did you also run swarming to set the parameters for these models?

B

I did run a swarm, but it didn't produce results very different from those provided in the standard models, and I think there was a video made by Matt Taylor and Scott Purdy on whether you should swarm for anomaly detection, and I think the answer was basically no. So it was a waste of three hours.

G

I'm just curious if maybe one of the reasons why sometimes it doesn't detect an anomaly is a min/max problem or something like that. For this particular intersection, is it, like, the number of vehicles? Maybe there is this issue, and that's why it doesn't detect the anomaly. You see what I mean? If the maximum in your model params is set at some value, but you end up with something that's much higher, it's kind of [unclear].

B

The maximum number of vehicles through a sensor that I was provided by the team here at Flinders was 200 vehicles. That is pretty much the maximum number of vehicles you're going to get, ever, because it's in five-minute intervals, so that's pretty much constant, very high-speed traffic over a single sensor.

G

Okay.

H

I was really impressed, again, like Matt said, at the whole end-to-end nature of everything you did, plus the other part of it was just the fact that you have so much data that you ran through and really got a sense of it. My question is, did you get a sense

H

of what types of situations the anomaly detector worked well in and what types it didn't? Do you have some sort of a meta-analysis of the overall behavior?

B

Frankly, I can't really tell which ones it's best at, because I think it didn't separate [unclear], like 12 midnight, very well from other times of day, and there was a huge gap in the data for about three days that I think it thought was very anomalous. But I'm not really quite sure about which ones it's best at. I'd have to look through all the incidents that it thought were highly anomalous.

H

It's very challenging with all of this data to figure that out. But yeah, I think, sort of piggybacking on Marian's thing, it might be interesting to do, offline, a kind of sweep through your parameters, and there may be a way to improve the overall system a little bit.

B

Right, I am planning on running a search, basically, to see if it can find which parameters give the best anomaly scores for some incidents that I know specifically did impact traffic, and then see if I can improve the anomaly scores it gets for those.

H

You mentioned you did a median filter before you fed it to the HTM.

B

It was a median filter, right.

H

[unclear] in the video, but was there a reason for that? Did you find it worked better with that than without?

B

It didn't work any better or worse, I guess.

H

That's what I thought, yeah.

A

Any other questions or comments for Jonathan?

F

Thank you.

A

Alright, thank you for joining us.