Description
A recap of the Deep Learning portion of the event, by Lucas Souza, Numenta Research Engineer.
Numenta Research Meeting - Aug 7 2019
Discuss at https://discourse.numenta.org/t/deep-learning-reinforcement-learning-summer-school-2019-recap/6434/2
We had three hundred people out of about five thousand applicants, so the numbers are a lot bigger than usual. Just a quick disclaimer first, mainly because this is going to go on the web.
These are just rough notes; they don't do any justice to the lectures. It's not an exhaustive review. I just focused on some areas that are of interest to our research here and tailored the presentation to that.
And we had researchers from other fields as well: a lot of neuroscience, some people from biology, from physics, etc. Looking at the attendees, we had everyone from undergrads to professors; we even had a girl who had just come out of high school and had published a paper with Yoshua. So it was the whole spectrum. The main organizers were Yoshua and Rich Sutton, who was there having a beer at the bar; we got to talk with people there, which was fun.
It was also a lot about networking; every day we had these kinds of mixers with different people. It was organized by CIFAR; it's the same summer school that's been going on for about 30 years, together with Mila and Vector, the institutes from Montreal and Toronto, and Amii from Alberta. Mila is Yoshua Bengio's institute, and Vector is Geoff Hinton's.
These institutes are huge, especially Mila. It has, I think, about 500 people, including students, professors and everyone, and it's affiliated with four universities in Montreal. Basically, all incoming students studying machine learning at these universities go through a selection process first, so the institute is really kind of sitting on top of the universities, especially in this machine learning field. So they are quite big.
Most of these people were not looking for a job; most of them were quite happy in their PhDs or professorships. But all these companies wanted to hire, and there wasn't much interest, since everyone is just focused on research. These are hardcore researchers; they spend their summer going over tons of slides of math, and they're not interested in going to the companies.
It was kind of weird. The only company that drew everyone's attention was DeepMind, mainly because DeepMind works on research, and that was an opportunity to continue doing research, but in industry. I think that's why DeepMind's pitch was appealing for a lot of people.
That's interesting. Were the big ones, like Facebook AI and Microsoft, not there? They were, but the only one that was purely research focused was DeepMind. All the other companies were small startups trying to solve some specific problem, and most people didn't care; they didn't want to go work for a credit card company or something like that. They only care about research.
I don't think they got much there. They're looking for machine learning researchers, but maybe that's not what they actually need; they just want to apply machine learning. I think they'd have better luck getting someone smart from industry and training them in machine learning, or just using existing tools, rather than trying to hire someone whose goal is to publish papers.
I applied as a student from Brazil, and they gave me full funding. Since I'm not a student anymore and my situation had changed, I didn't accept the funding, so I paid for part of it: I paid for the registration and everything, and they paid, I think, the plane ticket. They had a lot of money; it was really well organized, and paying for all of this must be expensive.
Up until last year the two schools were separate, with separate application processes, and now it's just one. We had deep learning for four days and then reinforcement learning for five days. On deep learning day one we started with Hugo Larochelle, who gave an introduction to neural networks. Just a comment: these slides are mainly my notes, transcribed as I wrote them. Most of them made sense at the time, but viewing the slides now, without the context, a lot of them don't make much sense.
The lectures are eventually going to be online. One bad thing a lot of people complained about is that they didn't provide the slides, not even before or after. Usually what I like to do when I go to these things is to look at the slides beforehand, so that not every slide is a surprise; I know what's coming and I can prepare for it. But since they didn't release them, that wasn't possible.
So Hugo gave an introduction to neural networks, and it was a really good class; he has his videos online, and he's working at Google AI right now. He talked about overparameterization as an approach to escape saddle points, and about the idea that we might have a winning ticket inside the neural network. That's exactly the problem we have been working on.
Overparameterization is a benefit because when you initialize a network, it's like you hold all these lottery tickets and one of them is going to be a winning one. It's also a way of escaping saddle points, so it is a benefit; but if you have a different way of escaping saddle points, maybe you don't need the overparameterization, and that's the problem we have also been working on. Later we had Graham Taylor, from Vector, talk about CNNs. It was an introductory talk, and he covered state-of-the-art networks a little, like squeeze-and-excitation networks.
That was actually the first time I heard about those. The only difference is that you have this new layer that weights each channel adaptively, so not all channels have the same weight when creating the output feature map, which is how regular CNNs work. I thought it relates a lot to attention: you're basically assigning different weights to different features, which is what attention does as well.
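To make the channel-reweighting idea concrete, here is a minimal PyTorch-style sketch of a squeeze-and-excitation block; the module name and reduction ratio are my own illustration, not code from the talk.

    import torch
    import torch.nn as nn

    class SEBlock(nn.Module):
        # Squeeze-and-excitation: learn one scaling weight per channel.
        def __init__(self, channels, reduction=16):
            super().__init__()
            self.fc = nn.Sequential(
                nn.Linear(channels, channels // reduction), nn.ReLU(),
                nn.Linear(channels // reduction, channels), nn.Sigmoid())

        def forward(self, x):
            b, c, _, _ = x.shape
            s = x.mean(dim=(2, 3))           # squeeze: global average pool -> (b, c)
            w = self.fc(s).view(b, c, 1, 1)  # excitation: per-channel weight in (0, 1)
            return x * w                     # scale each channel adaptively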
That's from a 2017 paper, and I think it was the latest winner of the ImageNet challenge. The frontier for CNNs is extending them to non-Euclidean data; a lot of people talked about that, and it includes graph convolutional neural networks, neural networks that work on spherical data, or any other kind of non-Euclidean data.
One interesting thing he mentioned is that batch norm affects robustness, and he showed a few slides on that; Fixup is maybe a solution. Fixup is just a better way to initialize the network, so that the network is stable without requiring batch norm. Fixup is also recent work; not a lot of people have been using it, but as he presented it, it seems to work fine and might be a good replacement for batch norm. So we can try it.
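As a rough sketch of what Fixup-style initialization looks like in practice; this is my own simplification of the published method, assuming two convolutions per residual branch, not code from the talk.

    import torch.nn as nn

    def fixup_init(residual_branches):
        # Scale the first conv of each branch by L**-0.5 (the Fixup rule for
        # two convs per branch) and zero the last conv, so the residual
        # stream is stable at initialization without batch norm.
        L = len(residual_branches)
        for branch in residual_branches:
            convs = [m for m in branch.modules() if isinstance(m, nn.Conv2d)]
            for conv in convs[:-1]:
                nn.init.kaiming_normal_(conv.weight)
                conv.weight.data.mul_(L ** -0.5)
            nn.init.zeros_(convs[-1].weight)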
One speaker was talking about moving beyond humanism, and how we should look toward a diverse society in the near future where humans, hybrids and artificial agents can coexist. Another professor, whose name I didn't write down, talked about AI researchers being world builders, and he discussed the paperclip analogy from Nick Bostrom: if you optimize an AI with the single objective of creating paperclips,
it will just do that one thing. And that's like every startup's dream, you know: to convert the whole world into its objective. So we already have that scenario today, and what stops startups from doing it is that there are several of them, each trying to maximize its own single objective, so they compete in the space. That's probably going to happen with AI as well: you might have many AIs, each with a single objective.
A law professor focused a lot on ethics. She was talking about how every decision we make comes with an implication, and they were working on defining a kind of ethical scorecard for machine learning projects. She talked a lot about how machine learning researchers should worry about what they are doing and about the ethical implications of every small decision.
Even small decisions: if you just make a network that is ten times bigger and performs better, then you are also consuming about ten times more energy, and that comes at a cost; or you can insert all kinds of bias into your product. That should be the concern of every machine learning researcher, and you shouldn't just say "oh, my boss asked me to do it."
We also had Derek Coffee, a natural language processing researcher, who talked about this as well: every technology conceived has a possible dual use that developers should take into account. When they're building something, they should think of all the other ways people might use it, and whether any of them are bad. He also talked about privacy and about the early Industrial Revolution.
In the early Industrial Revolution we had many social issues that seemed inevitable at the time, but we largely fixed them, and we don't have kids working in factories twelve hours a day anymore. The similarity today is privacy: it seems inevitable that privacy is going to be gone, but we might fix it; society may fight back and fix privacy in the near future. It's an optimistic stance, and the only general consensus I found is that ethical consideration should be made by every developer. I think that's a solid argument.
Some summer schools try to teach a one-semester graduate course in one week; it's just that one thing, and you end up really learning that thing. Others just cover a lot of advanced topics and work more like a conference. This one was kind of a mix, trying to convey two or three years of graduate courses in two weeks.
So it's not the kind where you go, learn something, and go home and apply it; it's more the kind of summer school people try to attend every year to get more exposure to what's going on in the field. Day two, I think, was a mix of talks. The one by Angel Chang was really interesting. She talked about the state of the art being instance segmentation. I think some of my slides here are okay, but the bottom part got cut off.
These are the differences between the tasks. Regular classification just tells you whether it's a cat or not a cat. Semantic segmentation is basically pixel classification: you say all these pixels are grass, all these are cat. Object detection is when you have multiple objects and you name each of them; semantic segmentation can't do that, because if you have two cats it just mixes up the pixels. The state of the art now is instance segmentation, where you can capture each instance separately.
Mask R-CNN is the 2017 state of the art. These systems are very advanced: I could take a picture of this room, remove three people, and nobody would know the difference, which is kind of scary. The evolution went from R-CNN to Fast R-CNN, then Faster R-CNN, and now Mask R-CNN, and it just keeps getting faster and faster.
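If you just want to play with instance segmentation, torchvision ships a pretrained Mask R-CNN; a minimal usage sketch (my own example, not from the lecture, using the older pretrained=True API):

    import torch, torchvision

    model = torchvision.models.detection.maskrcnn_resnet50_fpn(pretrained=True)
    model.eval()

    image = torch.rand(3, 480, 640)  # stand-in for a real RGB image scaled to [0, 1]
    with torch.no_grad():
        out = model([image])[0]      # one dict per input image

    # Each detection has a class label, a confidence score, a box and a mask.
    for label, score, mask in zip(out["labels"], out["scores"], out["masks"]):
        if score > 0.5:
            print(label.item(), score.item(), mask.shape)  # mask: (1, H, W)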
For 3D, there is volumetric data, which is just voxels: a 4D tensor, pixels but with an added channel dimension. And there are point clouds; most 3D data comes in the form of point clouds, just a bag of points. But the issue with a bag of points is that it has no canonical ordering and no locality, so you have some models that work on that directly, while some newer models work mainly on voxels, and there are techniques to convert a point cloud to voxels if you'd like.
And she showed recent work on scene synthesis, which tries to create scenes; for example, you can build a room. You can train this by taking several rooms, masking out a few objects, and asking the network to rebuild the object that's gone. So you can show it different rooms with the bed masked out and train the network to figure out where the bed should be, and then you can use that network to generate new rooms, so you get a new scene.
The whole idea is that you can have a memory for the network that's not limited to a small space, as long as you have a way of knowing where you saved a specific memory. It's like using attention to know where in memory to look for the data you want, so you can have a large memory associated with a small network.
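That attention-based memory read boils down to something like this; a minimal sketch with made-up sizes, not code from the talk.

    import torch
    import torch.nn.functional as F

    def memory_read(query, keys, values):
        # Score every memory slot against the query, softmax, weighted sum:
        # a soft, differentiable lookup into a memory far larger than the net.
        scores = keys @ query / keys.shape[-1] ** 0.5
        return F.softmax(scores, dim=0) @ values

    keys = torch.randn(10_000, 64)    # 10k memory slots, 64-dim keys
    values = torch.randn(10_000, 64)
    read = memory_read(torch.randn(64), keys, values)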
This one is actually quite interesting: the consciousness prior, which comes up in another talk as well. What does the consciousness prior mean? It's the idea that you have a very high-dimensional abstract representation space of concepts and factors, and you reason based on that, with attention directing you toward a low-dimensional space... actually, sorry, it's the other way around.
You have this low-dimensional conscious spot, so you can represent things in low dimensions and reason faster over those low-dimensional things, and then, when you need to attend to something high-dimensional, you attend to just the specific thing you want. To generalize: we can reason about the room by thinking about chairs, tables and people, and when we need the specifics of the table, we can access a higher-dimensional representation of the table.
You need at least a mapping that tells you that this low-dimensional representation corresponds to that high-dimensional one, and that's done by the attention mechanism. It decides when you need to go down into the details, and when you do, you access the high-dimensional representation. Interesting.
He seemed to be doing good research. He talked about the state of the art being activity recognition and activity detection: you not only recognize the activity, you also localize it in space and time, so you know when the activity starts, when it finishes, and where it's happening; that's what you want in videos. There is an example here: for all these people, you know what they're doing and what they're about to do, and you also have a time frame of when they started and when they ended.
There is also early recognition, where you're trying to recognize the activity in the video as early as possible, for example so that the police could stop someone before they shoot people. You want to recognize the activity even before it fully takes place, so you can predict it and act.
There is also group activity recognition, where you want to identify what a group is going to do in the video: where people are moving, who is going to interact with whom. There are new ideas here, like social pooling, where the LSTMs of spatially proximal sequences share their hidden state. The whole idea is that you can share knowledge between LSTMs for things that are happening close together in the video, as in the sketch below.
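A toy sketch of that sharing step; my own simplification in the spirit of the Social LSTM line of work, where the radius and the averaging rule are assumptions.

    import torch

    def social_pool(hidden, positions, radius=2.0):
        # Each person's pooled vector averages the hidden states of the
        # other people standing within `radius` of them.
        dist = torch.cdist(positions, positions)      # (n, n) pairwise distances
        mask = (dist < radius).float()
        mask.fill_diagonal_(0)                        # don't pool with yourself
        return mask @ hidden / mask.sum(1, keepdim=True).clamp(min=1)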
So there would be an LSTM for each person in the group, and if I have a thousand people in the video, they would share their hidden states socially. He also talked about generative models of video, which are still in their early stages but are likely the most groundbreaking application of machine learning coming in the next few years. We already have some examples, which are a bit scary, of videos being generated; when we have very good ones, it may even change the movie industry.
With a single image, a small inconsistency doesn't matter much, but when you go to videos you need temporal consistency, which is a whole new issue; it's not that easy. You need to find ways of keeping high-level control of the content to buy that kind of consistency. Current models are using variational autoencoders combined with other ideas, and people are finding ways of inserting priors and incorporating physical relations between objects, so you don't break the laws of physics when trying to generate a realistic video.
He started the talk by noting that Newton's method performs well and asking why we don't use it, which I think is a big question. Newton's method is an iterative approach that tries to find the roots of a function; I don't know the details exactly, but it's a different optimization method.
But you can take steps toward it: you look at the second derivative and take steps using it. He said the reasons we don't use it are that it's computationally expensive and because of the behavior you get locally in each region of the space, but there might be ways of moving gradually from gradient descent toward Newton's method, and that's what he has been giving attention to.
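For reference, Newton's method applied to minimizing a one-dimensional function looks like this; a toy sketch of the standard algorithm, not code from the talk. In n dimensions the division becomes a Hessian inverse, which is what makes it expensive.

    # Minimize f by finding a root of f': step = f'(x) / f''(x).
    def newton_minimize(grad, hess, x0, steps=20, tol=1e-10):
        x = x0
        for _ in range(steps):
            step = grad(x) / hess(x)
            x -= step
            if abs(step) < tol:
                break
        return x

    # Example: f(x) = x**4 - 3*x**2 + 2 has a minimum at x = sqrt(1.5).
    print(newton_minimize(lambda x: 4*x**3 - 6*x, lambda x: 12*x**2 - 6, x0=2.0))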
After optimization, we had a Bayesian deep learning talk, which was a really good one. The difference between Bayesian and regular deep learning is that instead of doing point estimation, you're learning a distribution: say, assuming the weights are all Gaussian, you're learning a mean and a standard deviation for each. I think this picture kind of represents it: this is the point estimate, and this is the distribution you're learning instead. It's another way of seeing it.
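In code, the "distribution instead of a point estimate" idea can be sketched like this; a minimal reparameterization-trick layer where the sizes and the initial rho are illustrative, and a real version would also add a KL term to the loss.

    import torch
    import torch.nn as nn

    class BayesianLinear(nn.Module):
        # Each weight is a Gaussian: learn (mu, rho) and sample on every pass.
        def __init__(self, n_in, n_out):
            super().__init__()
            self.mu = nn.Parameter(torch.zeros(n_out, n_in))
            self.rho = nn.Parameter(torch.full((n_out, n_in), -3.0))

        def forward(self, x):
            std = nn.functional.softplus(self.rho)
            w = self.mu + std * torch.randn_like(std)  # reparameterization trick
            return x @ w.t()

    # The spread across repeated forward passes on the same input is an
    # uncertainty estimate for the prediction.
    layer = BayesianLinear(4, 1)
    x = torch.randn(1, 4)
    samples = torch.stack([layer(x) for _ in range(100)])
    print(samples.mean().item(), samples.std().item())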
Regular networks tend to output probabilities that are very low or very high, so you end up with probabilities that are not calibrated to reality. You can kind of fix that with temperature scaling, but there are other ways of fixing it: if you're doing Bayesian deep learning, you get the right probabilities right away, because you're modeling the uncertainty as well, not just point estimations.
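For contrast, temperature scaling is the one-parameter fix mentioned above; a quick sketch (fit a single scalar T on held-out logits; function and variable names are my own):

    import torch
    import torch.nn.functional as F

    def fit_temperature(logits, labels, steps=200, lr=0.01):
        # Learn one scalar T so that softmax(logits / T) matches reality
        # better; T > 1 softens an overconfident network.
        log_t = torch.zeros(1, requires_grad=True)
        opt = torch.optim.Adam([log_t], lr=lr)
        for _ in range(steps):
            opt.zero_grad()
            F.cross_entropy(logits / log_t.exp(), labels).backward()
            opt.step()
        return log_t.exp().item()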
It's also an interesting avenue for exploration, since these networks are naturally stochastic and model uncertainty. The good thing about it is that when you get a result, you also know how certain or uncertain you are about that prediction. So that's a good point about Bayesian deep learning; I think there has been a lot of research on it recently.
We had courses in the afternoon, starting with autoencoders; here maybe I'm just explaining the pictures. It's basically the idea of compressing data into a latent representation and reconstructing it from that. This latent representation is much smaller than the original, so you can compress any image and recreate it. An internet application, for example: if you want to send an image to your phone, you can send the latent code instead.
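The whole idea fits in a few lines; a minimal sketch where the dimensions are illustrative (e.g. 28x28 images squeezed to a 32-number code):

    import torch.nn as nn

    # Encoder compresses 784 pixels to a 32-dim latent code; decoder rebuilds.
    autoencoder = nn.Sequential(
        nn.Linear(784, 128), nn.ReLU(),
        nn.Linear(128, 32),                    # the latent representation
        nn.Linear(32, 128), nn.ReLU(),
        nn.Linear(128, 784), nn.Sigmoid())
    # Train with a reconstruction loss: nn.MSELoss()(autoencoder(x), x)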
What's interesting here is that you have these new generative models related to autoencoders, like Glow, which can generate images very similar to GANs. I think GANs right now are the state of the art for generative models, but these approaches are closing the gap. Then GANs, which are basically unsupervised: the difference is that you have two networks playing against each other, one trying to generate better images, the other trying to classify them as real or not.
This is how a GAN works: you have the generator network, and then the discriminator network trying to classify what's real and what's not, and they're competing, so the generator keeps getting better because it's trying to fool the discriminator. Usually you just feed random noise into the generator, but you can also use priors: you can set up your problem in a way that you generate something specific, whatever you want to generate.
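That competition is literally two optimizers taking turns; a bare-bones sketch with toy sizes and my own variable names:

    import torch
    import torch.nn as nn

    G = nn.Sequential(nn.Linear(16, 64), nn.ReLU(), nn.Linear(64, 2))
    D = nn.Sequential(nn.Linear(2, 64), nn.ReLU(), nn.Linear(64, 1), nn.Sigmoid())
    opt_g = torch.optim.Adam(G.parameters(), lr=2e-4)
    opt_d = torch.optim.Adam(D.parameters(), lr=2e-4)
    bce = nn.BCELoss()

    def gan_step(real):
        n = real.shape[0]
        fake = G(torch.randn(n, 16))
        # Discriminator: push real toward 1, generated toward 0.
        opt_d.zero_grad()
        (bce(D(real), torch.ones(n, 1)) +
         bce(D(fake.detach()), torch.zeros(n, 1))).backward()
        opt_d.step()
        # Generator: try to make the discriminator say "real".
        opt_g.zero_grad()
        bce(D(fake), torch.ones(n, 1)).backward()
        opt_g.step()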
That was interesting; maybe we should talk more about this one. That was Blake Richards, and the talk is called "Deep learning in the brain"; the whole talk was about how deep learning is feasible in the brain. He started the talk saying that long ago he drank the kool-aid that backpropagation is the way to go, and that he wanted to show us the brain can do backprop.
He laid out three issues people usually raise for why backpropagation can't work in the brain; one is that you don't have an error term. Then he showed a solution, equilibrium propagation: the brain alternates between a free phase, which has no external feedback, and a weakly clamped phase, where the external environment nudges the network, and the difference between the correlations in the two phases gives the gradient. I'm not entirely sure right now how this works; I wish I had access to the slides. I do have a lot of pictures, yeah.
You can go to great extremes to do so; people will come up with tremendous arguments for the existence of something. So I have to ask myself all the time, well, am I wrong about this? And the other point is, you know, are other people buying in and drinking the kool-aid?
I
found
it
just
this
part
of
that
verse.
It
says
an
event
could
be
a
spike
of
birth.
We
also
used
for
is
in
our
model,
and
he
says
the
event
rate
can
communicate,
Purab
signals
and
the
best
way
to
communicate
top-down
signals,
and
you
can
update
the
weights
using
the
difference
in
the
first
probability
between
time
steps,
so
the
gist
of
it.
They
gets.
This basic attitude existed way back; there has always been this sort of tension between neuroscience and AI. When I applied to the MIT AI lab, that's exactly what I ran up against. This was before neural networks, in the classic AI era, and they basically said that you don't need to study the brain, and here's why: because our algorithms already capture that, and the brain is just a messy version of those algorithms.
To take it one level further, the big question here is: whatever they're doing today, what's going on in the brain doesn't have to be that. The real question is whether we need to know what's going on in the brain to move forward. That's the broad question, whether it's backprop or reinforcement learning or whatever you want to call it, all those other terms. Do you need to know the brain to make progress? That's really the big question, I think.
The last talk was about what is next in deep learning; it was a really good talk, and his slides are online, if I remember right. He said the long-term goal is to learn representations that disentangle causal features. He talked a lot about modeling abstract spaces, which goes back to that consciousness prior.
There were a couple of things I found quite interesting. For example, this idea about modeling an abstract space, and one of the ideas is that language can be a handle on this abstract space, so they are working on grounded language understanding problems. It's similar to that CLEVR one: "put the blue key next to the green ball." It's a very simple task, but it requires grounding.
His idea of System 1 versus System 2 goes back to the consciousness prior: you have abstract concepts, and you can do counterfactual reasoning and generalization on top of those abstract concepts, and then, when you need to, you ground them with System 1, which handles fast inference, classifying images, etc. So we need to have the two systems, one on top of the other.
He also talked about doing self-supervised learning in latent space rather than in raw pixel space: we don't reason in pixels or think in pictures, so it doesn't make much sense to do self-supervised learning directly on top of pixels; CPC captures a little bit of that. And I think when we talk about discovering good disentangled representations, it's one big deal that goes straight toward the continual learning problem. Right now our representations are all tangled up, and if you change one thing, you just mess up your whole network.
He talked about the work being done at Berkeley with agents and intrinsic rewards, where the agents define for themselves what the goal and the reward are, and they can learn by playing. Then, when you assign them a downstream task, they already know some things. The agent can, for example, see a ball and decide:
"Oh, I'm just going to interact with the ball," and then it can learn things like throwing the ball and grabbing the ball, and when you give it a downstream task like "okay, put the ball on the table," it has already learned a lot of things just by playing. It's an analogy to how humans learn. He also talked about going beyond i.i.d. data. The main thing I got from his talk is that he really wants us researchers to bring in GOFAI, good old-fashioned AI, ideas.
So yeah, that was the last activity. This presentation is on the Google Drive, and the other decks there are more general. If someone wants a good introduction to neural networks, I really like Hugo Larochelle's presentation; he has it online, and I like it.