From YouTube: Week 1 - Introduction to PyTorch - Evann Courdier
Description
More about this lecture: https://dl4sci-school.lbl.gov/evann-courdier
Deep Learning for Science School: https://dl4sci-school.lbl.gov/agenda
All right, so Mustafa just introduced the school, so I'm going to introduce Evann and then we'll kick it off. We're very pleased to have Evann Courdier joining us today from EPFL, that is the Swiss Federal Institute of Technology in Lausanne, Switzerland. Evann works on fast image segmentation networks for drones; he got a master's degree in general engineering and another one in math and machine learning. Evann is going to talk to us today about PyTorch. Last year at the school we did TensorFlow, but, as is appropriate, everybody is really loving PyTorch nowadays. So with that, Evann, I'll hand it over to you. We do have a Q&A feature on Zoom; hopefully everything goes smoothly, but this is our first one, so if there are any sort of weird hiccups we'll try to make it better for the next webinars. Please use the Q&A feature to submit questions.
Good, so welcome everybody to this introduction to deep learning with PyTorch. I'm Evann Courdier; I'm doing a PhD in deep learning and I'm working with PyTorch every day, so I'll try to walk you through the basics of PyTorch. What should you expect from this webinar? At the end, you'll have an overview of the PyTorch ecosystem, I'll get you to know all the basic tools for deep learning that PyTorch provides, and you'll be able to read and understand PyTorch code and fairly quickly start coding.
The format of this webinar is one hour to one hour fifteen minutes, and since we only have one hour, I assume basic knowledge of Python and object-oriented programming, some gradient-based machine learning, and maybe some experience with a numerical computing library like NumPy, for example. I'll go back and forth between slides and notebooks, so that I can give some explanation on the slides and then move on to the notebooks for real-life examples. So, let's dive in: what is PyTorch? It's a scientific computing library that is written for Python.
It's Python-first and it works very much like NumPy, but you additionally get GPU support, automatic differentiation, the optimization algorithms that go with it, and all the necessary tools for deep learning that you might want to use, from the loss functions to the optimizers and so on. It comes from the original Torch library, brought to Python, and in PyTorch a whole deep learning pipeline exists.
Let me give you some pointers: you have some dataset, some training loop, you get your model trained, you can put it into production using a production server, and you can do your training distributed. What we'll focus on for this presentation is how to get your data available for your model to be trained, and how the training loop is built.
What I want to do is basically to see each bit of the training loop and find out how to do it in PyTorch. The skeleton of a deep learning program is: first, create your model, load your data, and initialize all the hyperparameters that you might need. Then you have the training loop, where you get some samples, move them onto the GPU if you have one, compute the model's prediction and the loss, and then compute the gradients of the loss with respect to the parameters of your model and update those parameters.
So this is the training loop, and I want to focus on how each of these small bits is done in PyTorch. That's how I will build it, bit by bit, and it actually all starts with samples, which are tensors in PyTorch. Let's quickly see what a tensor is: these are multi-dimensional arrays, very much like in NumPy, and they behave similarly to NumPy arrays.
The interface is similar for creation, indexing, masking. Here is an example of how the torch interface is similar to the NumPy one, and let's jump in to see how it works in real life, with creating tensors and so on. So here we go; I hope everybody can see the screen properly. For this first notebook I will go through quickly, because things are quite simple, and you may want to go back to it later.
If you want to find more information, there is documentation for it. To use the PyTorch library, we need to import torch, and then I create a tensor using the torch.tensor function. As you can see, my tensor here has two dimensions; this is something you can check using the dim function. You also have the shape attribute and the size function to check the size of the tensor along each given dimension.
It has size two by three. You can perform a lot of operations in PyTorch, and there are two ways to call an operation: either you can call, for example, torch.operation on a tensor, or you can directly call the operation on the tensor itself. With the example of the sum operation, I can simply call torch.sum(x), or x.sum().
The second way of writing it is quite useful because you can then chain the operators. For example, if I want to compute the norm of the tensor x, I can simply square the tensor, compute the sum, and then take the square root. Chaining operators like this is quite common in PyTorch, and of course there is also a norm function that allows you to do the same thing.
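Both calling styles and the chained norm computation look like this; the tensor values are only an example.

```python
import torch

x = torch.tensor([3.0, 4.0])

# Two equivalent ways to call an operation:
s1 = torch.sum(x)   # function style
s2 = x.sum()        # method style on the tensor

# Method style lets you chain operators, e.g. the Euclidean norm:
norm = x.pow(2).sum().sqrt()   # norm of [3, 4] is 5
print(s1, s2, norm)
print(torch.norm(x))           # the built-in norm gives the same result
```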
One thing you want to pay attention to in PyTorch is whether your operation takes place out of place or is mutating the tensor in place. For example, here I look at the add operation, which adds a value to the tensor. Let's create a tensor a that is an identity matrix; if I apply add to a, we get a tensor that has indeed had five added everywhere, but the tensor a itself hasn't been changed.
There is an in-place version of this add operation, add_ with a trailing underscore, and here you can see that a has indeed been changed. That's something you want to pay attention to, because in certain contexts in PyTorch you might need to know whether your tensor has been changed in place or has been mutated. On a similar note, I want to show the difference between the assignment operator and the addition assignment operator.
Here I have a tensor a and I add 1 to it with a = a + 1; this operation creates a new tensor, and therefore, if I run this cell, the two tensors will be different objects. However, if I use the addition assignment operator a += 1, I get the same tensor, because a is modified in place by this operator. So that's something you might encounter, and you might want to be aware that in these two cases you get different behavior.
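The out-of-place vs. in-place behavior just described can be demonstrated like this (values chosen for illustration):

```python
import torch

a = torch.eye(3)          # identity matrix

b = a.add(5)              # out-of-place: returns a NEW tensor, `a` is unchanged
assert not torch.equal(a, b)

a.add_(5)                 # in-place (trailing underscore): mutates `a`
assert torch.equal(a, b)

# Assignment vs. addition assignment:
c = torch.zeros(2)
d = c
c = c + 1                 # creates a new tensor; `d` still points at the old one
e = torch.zeros(2)
f = e
e += 1                    # modifies `e` in place; `f` sees the change too
print(d, f)
```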
Something you use very often in deep learning is reshaping your tensors. In PyTorch you do this using the view function. Let's create a tensor that has one dimension and a size of six; here I can choose to view my tensor as a 2 x 3 tensor. You can also ask PyTorch to infer the missing dimension by putting -1, which is something you see quite often in code.
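A short sketch of the reshaping just described, including the inferred -1 dimension:

```python
import torch

x = torch.arange(6)        # 1-D tensor of size 6
y = x.view(2, 3)           # reshape to 2 x 3
z = x.view(2, -1)          # -1 lets PyTorch infer the missing dimension (3)
print(y.shape, z.shape)    # both torch.Size([2, 3])
```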
Something else that you use very often is a GPU. In PyTorch you might want to check first whether CUDA is available on your machine, with torch.cuda.is_available(). Here on my personal computer I don't have CUDA available, but let's see on this other machine: I first need to import torch, and then I should have CUDA available on this machine.
Here it's true, so let's create a tensor and move it to the GPU. There are two ways in PyTorch to do so. One way is to call .cuda() on the tensor itself. The first time you do so it will take a little while, and then your tensor is on the GPU; you can check that the device is written there. You may want to bring it back to the CPU, and there is a .cpu() function to do that.
This first approach is used in many older codebases, so you might need to know it. However, the new way is more convenient, because you can define a device at the beginning of your code and then you don't need to worry about whether you're on CPU or GPU. You can define a device using the torch.device function.
There you specify whether you want to be on CPU or GPU, and which GPU number you want to use, and then you can use the .to() function to move your tensor x to the corresponding device. For example, here I can move my tensor x to the GPU using that instruction. That's quite useful, as I said, because you can put a single line at the beginning of your code that checks whether CUDA is available, and if not, it falls back to CPU.
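The device pattern the speaker recommends can be sketched as below; it runs the same way whether or not a GPU is present.

```python
import torch

# Define the device once at the top of your script; everything after this
# line is written the same way on CPU and GPU.
device = torch.device("cuda:0" if torch.cuda.is_available() else "cpu")

x = torch.ones(3)
x = x.to(device)           # moves to GPU if one is available, otherwise stays on CPU
print(x.device)
```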
All right, the two last bits I want to cover about tensors are the type conversion and the conversion to NumPy. To convert the type of a tensor, you can use the same .to() function as for moving it to a different device. Here I've created a tensor y, and we can see that by default it will be a float32 tensor; if you want, for example, to convert it to float16, you can do so with this kind of line.
Also here, I have y, and I can call the .numpy() function on it to get a NumPy array.
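Both conversions just mentioned, sketched with an illustrative tensor:

```python
import torch

y = torch.rand(2, 2)               # float32 by default
print(y.dtype)                     # torch.float32

y16 = y.to(torch.float16)          # the same .to() function converts the dtype
print(y16.dtype)                   # torch.float16

arr = y.numpy()                    # NumPy array sharing memory with a CPU tensor
print(type(arr))
```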
Alright, so that's basically it for the basics on tensors; there are a lot of functions that you will discover as you are using it. What I propose is that we start building our training loop right now. There will be two main bits: initializing the different parameters, models and so on, and then the training itself.
Here we can already set the device: we import torch and we set the device that we want, and then in the training loop we can already move our samples and labels to the proper device. Now let's see how we can get these samples and labels from a dataset. Let me move back to the slides for a quick recap on tensors; that's what you probably want to remember from what I have just shown.
I'll let you go through it as well when the slides are provided. This is what we've just done: we have moved the samples and labels to the GPU; we have seen how to do so. Now let's see how we can actually get these samples and labels from a real dataset. That's a crucial aspect of deep learning: to have data. So let's assume here we have training data.
We have a dataset, with samples that we represent with these squares and different labels that are represented by the colors. This could look like this on your file system, for example: you could have two folders, one for the train split, the other one for the validation split, and in each of these folders one folder per class; here, for example, a human-versus-dog dataset. Now, to make the link between your file system and your model that is waiting for input,
PyTorch provides a Dataset class. It's really a representation of the dataset as an object that you can query with an index and that will return the corresponding sample from the dataset. The only things that the dataset has to implement are the two functions __len__ and __getitem__. __len__ is a function that returns the number of items in the dataset, and __getitem__ is a method that, given an index, returns the corresponding sample and its label.
For example, in practice you could have a dataset with five samples. Then, if you call the len function on your dataset, it should return the number of samples in the dataset, so here five; and if you index it, asking for index four with the indexing syntax, then it should return the fourth sample in your dataset, alongside its label, so here "human".
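A minimal custom Dataset in the spirit of the dummy example shown later in the notebook; the class name and the choice of random numbers as data are illustrative.

```python
import torch
from torch.utils.data import Dataset

class DummyDataset(Dataset):
    """A toy dataset: n random numbers, each labelled with its own index."""
    def __init__(self, n=10):
        self.data = torch.rand(n)

    def __len__(self):
        # number of items in the dataset
        return len(self.data)

    def __getitem__(self, index):
        # returns a (sample, label) tuple for the given index
        return self.data[index], index

ds = DummyDataset(n=5)
print(len(ds))       # 5
print(ds[4])         # (fifth random number, 4)
```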
So then you have this Dataset. But when you work with data, you often need to process batches of samples and to shuffle the data, and the Dataset class doesn't provide this. For this, you will need to use a DataLoader. The DataLoader is another class that PyTorch provides; it takes a Dataset object as input, and it will help you with iterating over this dataset.
You can use it to create batches, to shuffle the data, to sample the data in a particular fashion, and you can also launch multiple workers to make the data loading faster. So here you can see how you use it in practice: you will be querying a mini-batch from the data loader, which will be querying each index from the dataset, and it then returns mini-batches of data that it might have shuffled, if you have configured it to do so.
Here you can see the three functions that are needed for the dataset; I haven't said it, of course, but you inherit from the Dataset class. Once you have done that, your dataset is ready to be used, so now, to actually instantiate an object, you would just run that line. Here you instantiate an object, an instance, and then you can already check what the data attribute contains. Here you can see that it has created ten random numbers, and we can check, for example, the first one.
The first one should be 0.88, and indeed it is, with the label 0; as you can see, it returns a tuple. You can check, for example, that if you query the fourth one, you should get the number that has the label four, so our dataset is working properly. Now that we have seen how a simple dataset works, we can already use it within the DataLoader.
For the DataLoader, there is the class that PyTorch provides called DataLoader, which you can load from torch.utils.data. Creating a DataLoader is very simple: you just need to provide the dataset you want to iterate on, then the batch size that you wish, and whether you want to shuffle it or not. Here I said that I wanted shuffling and a batch size of five, and then you can simply iterate through this data loader using a simple for loop: for sample, label in loader.
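The DataLoader usage just described, sketched with a stand-in TensorDataset so the example is self-contained:

```python
import torch
from torch.utils.data import DataLoader, TensorDataset

# A stand-in dataset of 20 (sample, label) pairs.
ds = TensorDataset(torch.rand(20, 3), torch.arange(20))

# batch_size and shuffle are the two options discussed in the talk.
loader = DataLoader(ds, batch_size=5, shuffle=True)

for samples, labels in loader:
    print(samples.shape, labels.shape)   # torch.Size([5, 3]) torch.Size([5])
```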
Let's try this with the Alien vs. Predator dataset. This is a dataset that you can find on Kaggle, and it is a classification dataset between images of aliens and predators. On my repository you can find this dataset, with a train and a validation folder, inside which you have the two classes, alien and predator; in each of these folders you have multiple images that look like this.
Here is a predator, here is an alien, and now let's create a dataset class to load the data that is actually inside my folders. Here I have created this AlienPredatorDataset. As you can see, the structure is the same as the dummy dataset before: we have an __init__ function loading the data, a __len__ function returning the length, and a __getitem__ function giving a sample when you provide an index. In this case, you don't want to load all the images into memory.
Instead I store the list of file paths, so __len__ returns the number of paths I have in my dataset, and then __getitem__, for a given index, will return the image and the target. First, from the index, I can retrieve the path and the target label that I have defined before; I can open the image, and then I simply need to return it alongside the target. So here I get the path from the array, I open the image, and I return the image alongside the target.
Once this is done, let's run this cell. I can create a dataset, providing the root and the split, so the train split here, and I can already check the length of my dataset: len(dataset) will call the __len__ function, and here I can see I have 694 training samples. Let's run this and get the first sample. Here, as you can see, we get a tuple: a PIL image alongside the label 0; and here you get an image with the label 1.
Now let's see how to instantiate a simple transformation. You can instantiate a random crop, given the size of the crop here, and the crop transform will be a function that you can apply on an image. So let's take the first image in our dataset, and then I can simply apply this crop transform to our image.
Here we go: we get a random crop of our image, and each time I run this cell I get a different random crop. Now what we actually want is to crop our image and then convert it to a tensor, so I want to perform multiple transforms, and to combine multiple transforms,
we can use the Compose class from torchvision, whose role is to compose different transformations. So here I do the random crop and then I convert to a tensor, and if I apply this combined transform function to my image, I can see that, indeed, I get a tensor now. And if I check the shape of this tensor using the .shape attribute, you can see that I get a tensor of the crop size I wanted.
Here you go, and once that's done, you can check: I load the dataset again, but now, when I query for the first sample, I get a tensor rather than an image, with the corresponding class label. With that being said, we have a dataset that is ready to be used with our model: a dataset that provides tensors and that has a fixed size.
Yeah, a tensor whose first dimension has size five; the first dimension is the batch size. Here you can see I have asked for a batch size of five, and then you can see we have images, so three channels by the crop height and width. Alongside these samples I print the labels, and the labels here are the five labels corresponding to the images.
What is interesting is that the data loading code doesn't change depending on the dataset. We have built an alien-versus-predator dataset, but doing image classification is a pretty common task, and therefore in torchvision there is also a dataset class that automatically handles loading data from this specific folder structure. So here, if I provide the root of my dataset to the class ImageFolder, which is a class that exists in torchvision,
then it will automatically find the different classes, and basically it will do all the job that we have been doing in our dataset class. So let's run this ImageFolder: we provide the root and the transform that we want it to use, and we can verify that we indeed get a proper sample from our dataset. That's probably the main thing that you want to know for datasets.
We have seen how to build a dataset and how to use the DataLoader, and in the case of images you can use the transformations from torchvision. So now we can incorporate these bits inside a training loop. Let's take on the attempt to classify this alien-versus-predator data. Here I have added three lines to import the Compose transformation from torchvision, as well as the DataLoader and the ImageFolder; I have added some transformations, I load the dataset, and then I create the data loader.
[Answering a question about data augmentation] Alongside moving the image, you can also crop it, as you've seen, and you can do all kinds of cropping, rotation and so on. Of course you lose information when you crop the image, but usually the model will see that image multiple times with multiple crops, so that's not an issue. And of course, torchvision provides a lot of functions to augment the image, to rescale it and so on.
Let's move on then. All right, so now that we have seen how to get batches of samples and labels from the dataset and move them to the GPU, let's see how you can compute the model's prediction. First, how can you actually create a model and compute its prediction? PyTorch provides what we call modules, which are reusable model components that help you with building your networks.
PyTorch already provides a lot of built-in modules. Let's see how modules help to manage the parameters. A module class actually has to keep track of all the parameters in your model: you may have a lot of convolutional layers, linear layers, batch normalization and so on, each with parameters, and the module will actually be aware of these parameters. This allows you to quickly load and save the model, and it allows you to reset all the gradients of the parameters that we will be computing.
As I said, PyTorch provides a lot of predefined modules; there is the torch.nn submodule of PyTorch, which is a whole library dedicated to neural networks. They have convolutional layers, linear layers, activation functions, loss functions and so on.
Really, it's full of functions that you might need. Then, when you want to create your own modules, you can inherit from the torch.nn.Module class, and then you benefit from all the advantages that I talked about when I talked about managing parameters. All the modules from torch.nn, like convolutions, linear layers and so on, also inherit from this class, and when you want to inherit from torch.nn.Module, you need to implement two functions.
You need to implement the __init__ function, like the usual __init__ when you create a class, but in PyTorch specifically this is where you define all the sub-components of your network, so it can become aware of the parameters, because you define them in the __init__. Then, once you have defined what components your model has, you define in the forward function how all these components, the ones you have defined in the __init__, will be connected together.
That's the basics on modules; let's see how to use and create custom modules through some typical notebook examples. First, we will need to import torch, along with torch.nn, which we import as nn. Then let's see how you can instantiate a simple module that exists in the torch.nn library. Here I want to create a fully connected layer, which is called Linear in PyTorch.
It's actually simply a matrix multiplication plus a bias. To create this I need to provide the number of features in and out, so here five features in and two features out of this fully connected layer; that's why I call it here a linear regression model. Like this I have created a linear layer, and therefore my model has a weight and a bias parameter.
When you use a tensor as a parameter in the model, the model will be aware that this tensor is a parameter, because it has this Parameter class. For a module, you can get all the parameters: here I'm using the named_parameters function on the module and I iterate through them; it returns tuples with names and tensors, so here I print the name and the tensor's shape. You can see that in the linear regression model there is a weight of shape 2 x 5 and a bias of size 2.
B
It
is
just
parameters
that
returns
a
generator
of
for
all
the
parameters,
in
fact
the
model.
So
here,
if
I,
if
I
use,
if
I
use
this
function,
I
can
get
the
two
parameters,
weight
and
bias
that
that
that
is
in
my
daenerys
mother.
So
then,
how
can
we
use
this
model
so
a
motive
in
Python?
They
will
work
on
batches,
which
means
that
if
you
have
a
sample
of
feature
size
5,
for
example,
you
will
always
need
to
have
as
first
dimension
the
batch,
the
batch
size
and
then
a
second
dimension.
your feature size. So here I'm creating dummy data with random numbers: what I have is a tensor x with a batch size of 3 and then 5 values per sample. With this input of batch size three and feature size five, I can just call my model directly on the sample. Note that you don't call forward explicitly; that's quite important.
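Calling the model directly on a batch, as the speaker does, can be sketched as:

```python
import torch
import torch.nn as nn

model = nn.Linear(5, 2)

x = torch.rand(3, 5)      # batch size 3, feature size 5
out = model(x)            # call the model directly; do NOT call model.forward(x)
print(out.shape)          # torch.Size([3, 2])
```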
We can see that we get a batch size of three for the output, and then a feature size, if you wish, of the output of size two, which is the output size of the linear model that we have defined. All right, so here we have seen a bit how to use a predefined module in PyTorch, our linear regression model, and now let's see how you can actually build your own module.
As I said before, you need to inherit from nn.Module, and then here, in the __init__, I will just initialize the different modules from the torch.nn library that I want to use. Here I want to use two fully connected layers in a row, so I just define them here, and then in the forward function you can see that I am calling them successively.
First I call the linear1, I apply a ReLU function, then I call the linear2 on the output, and then I return this output. Just like this, my neural network module is ready to be used. You can note that I have used F.relu here, and you might wonder where this comes from: it comes from torch.nn.functional.
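The custom module just walked through can be sketched as follows; the class name and layer sizes are illustrative, not the speaker's exact values.

```python
import torch
import torch.nn as nn
import torch.nn.functional as F

class TwoLayerNet(nn.Module):
    """Two fully connected layers with a ReLU in between."""
    def __init__(self, input_size, hidden_size, num_classes):
        super().__init__()
        # Sub-modules defined in __init__ are registered, so the module
        # is aware of all their parameters.
        self.linear1 = nn.Linear(input_size, hidden_size)
        self.linear2 = nn.Linear(hidden_size, num_classes)

    def forward(self, x):
        # forward() wires the components together.
        out = self.linear1(x)
        out = F.relu(out)       # functional counterpart of nn.ReLU
        return self.linear2(out)

model = TwoLayerNet(input_size=4, hidden_size=8, num_classes=2)
out = model(torch.rand(5, 4))   # batch of 5 samples
print(out.shape)                # torch.Size([5, 2])
```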
As you can see, it's kind of a mirror of nn: all the modules that you can find in nn, like linear, ReLU, convolutions and so on, you can also find as functions inside torch.nn.functional, which we use sometimes. So here I used the relu function from torch.nn.functional. All right, so now our network is ready to be instantiated; let's create one instance of our model. Here I just call the model with the parameters that I have specified:
the input size of my first linear layer, the hidden size, and the number of classes in the output of my second linear layer. And PyTorch provides a handy way of displaying a model: when you print it, you can see all the modules that are registered inside your network; so here I can see we have two linear layers.
However, the order here does not necessarily match the order in which you will apply them in the forward; it's just about what parameters are inside your network. And then, just as before, when you want to call your model on an input, you apply your model on the input x directly; note that you don't call forward, you directly call the model on the input x. Here I create an input x with batch size five and with input dimensions that match the input dimensions of my first linear layer.
When I run this, I get an output of size five by two, so the batch size and the output size of my linear layer. What I have done here is actually building a sequential model: it runs the different modules sequentially, and that's something you can do directly, because it is quite common. There is a class called torch.nn.Sequential, which will apply the modules sequentially; so here we apply three modules sequentially: a linear module, then a ReLU module,
then another linear module; and this will act very much exactly like what our custom module is doing, applying a linear, then a ReLU, then another linear module. This allows you to define a network much more quickly. This one is actually a Sequential, and then we run it the same way as you would have run the previous one; so here I get the same result.
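The same two-layer network, rewritten with nn.Sequential as the speaker shows (layer sizes again illustrative):

```python
import torch
import torch.nn as nn

# Equivalent to the custom module: linear -> ReLU -> linear, applied in order.
model = nn.Sequential(
    nn.Linear(4, 8),
    nn.ReLU(),
    nn.Linear(8, 2),
)

out = model(torch.rand(5, 4))
print(out.shape)        # torch.Size([5, 2])
```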
Batch size five, output size two. Right, then models can be moved to GPU, and you don't need to move each parameter individually: you can simply call .cuda() on your model to move all the parameters to GPU directly. Similarly to what I've said just before, .cuda() is the older way to do so, and the newer way is to call .to(device) on your model: assuming you have defined your device before, this will move your model to your device.
And probably the last thing we want to check on modules is how to store and load them. After you have trained your models, you might want to be able to save them and load them later. There are two ways to do so, and I'll go through both, because there is one that is easier but not necessarily always safe. Let me show you.
This first option, torch.save on the model, will actually pickle the model: it will use the pickle package from Python to save the model, and this means that what you save will be bound to the folder structure that you had. So if your model is defined in a specific file, and later, when you want to load your model, that file has another name, for example, you won't be able to load it. This might be a problem, and so there is a way to save, rather than the whole model,
just the values of the parameters, and that's what you can do with the state dict. The model has a state_dict function that will return all the parameters of this model. That's why it's actually useful to have the model aware of its parameters: it's able to return all of its parameters at once in the state dict. Then you can just save this state dict; the state dict is really a dictionary mapping the name of each parameter to the corresponding weights.
If I save my model that way, I'm also able to then load it, but this time I need to use a function called load_state_dict. I need to create my model first, which I didn't need to do before with the other way: I create my model first, and then I can use the load_state_dict function on my model to load the state dict that I have saved.
And here you can check that, indeed, I have the same values for the two models. The difference, and it's important to know this a bit, is that the first way is much simpler, but you need to make sure that your class names and the organization of your folders won't change; otherwise, use the state dict, which, rather than saving the model itself, simply saves a Python dictionary with all the parameters of your model. One last thing we may want to cover now is how to compute a loss with PyTorch.
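The state-dict save/load round trip just described can be sketched as below; the temporary file path and the single-layer model are placeholders for a real checkpoint.

```python
import os
import tempfile
import torch
import torch.nn as nn

model = nn.Linear(5, 2)
path = os.path.join(tempfile.mkdtemp(), "model.pt")

# Save only the parameter values: a dict mapping parameter names to tensors.
torch.save(model.state_dict(), path)

# To load, create the model FIRST, then load the values into it.
model2 = nn.Linear(5, 2)
model2.load_state_dict(torch.load(path))

# The two models now hold the same values.
print(torch.equal(model.weight, model2.weight))
```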
PyTorch already comes with a lot of predefined losses: we have the L1 loss, cross-entropy and so on, and they all exist inside the torch.nn package. So if we want to create, for example, an L1 loss function, you can just instantiate it this way, and then, to compute the loss between two tensors, we can just pass them as arguments to our loss function and get the value of the loss; so here the L1 loss between the two tensors.
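Instantiating and applying a predefined loss, as just described; the tensor values are illustrative.

```python
import torch
import torch.nn as nn

l1_loss = nn.L1Loss()                   # mean absolute error

pred   = torch.tensor([1.0, 2.0, 3.0])
target = torch.tensor([1.0, 0.0, 3.0])

loss = l1_loss(pred, target)            # mean of |pred - target| = 2/3
print(loss)
```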
So that's it for PyTorch modules, creating your own modules and using the ones from PyTorch. Now we can see how we can add this to our training loop. What we can do is simply instantiate a model that way, with a Sequential for example, and then move it to the proper device.
We also want to instantiate the loss function. Before, I had the L1 loss; here I want to use the cross-entropy loss that PyTorch provides, which is more suited for classification problems. Then I can simply, in my training loop, add two lines: the first applies my model on the samples, so we get predictions, and then we compute the loss between the predictions and the correct labels. That's what we do on these lines, and we get the loss.
B
We have covered most of the steps. What is next is to see how to compute the gradients and update the parameters with gradient descent. That's probably the most important step that we're going to see right now. Let's cover the gradient computation and then maybe have a small break for your questions.
So that's what we have done: computing the model predictions and computing the loss adds these two lines to the training loop.
B
Now, let's see a bit how you can compute gradients in PyTorch. PyTorch has a package called autograd which will allow you to compute the gradients of the different parameters. PyTorch is what we call a define-by-run framework, and this means that the backpropagation is defined by how the code is run, so every single run of the code can be different. That's what we will use: autograd, which is the automatic differentiation package of PyTorch. So, how does that work?
B
When you finish the computation, you can call backward on the tensor to directly compute all the gradients automatically. What you will usually have is a succession of computations that lead to a loss; we call loss.backward() to compute the gradients of the loss with respect to each parameter. So, before jumping into the notebook,
B
let's look at a quick example. Oh, one thing I forgot is that the gradients are accumulated into the tensor's .grad attribute. Note here that when you call backward, the gradient that is computed will be added to the previous gradient that was inside the tensor's .grad. It's not stored, it's accumulated, which means that if something is already there, you will add to the previous gradient. We will see a bit later why we do this.
B
It happens that way, so, as I said, here's a quick example of how that could work before coming to the notebooks. Here we have the computation graph for the previous example: we have an input x that we multiply by a weight matrix W, we add a bias b, we subtract the value of the label y, then we square it and we get a loss. So it's like a little linear model.
B
What you can note here is that x doesn't require grad, because we don't need to have the gradient of the loss with respect to x: we don't want to update x. However, we do want to update the weights W and b, and therefore we will set their requires_grad to True. Initially the gradients are set to None, and then what we want are the gradients of the loss with respect to each parameter, and so we will call the backward function to get that.
B
B
Usually this accumulation is used to provide some more flexibility when you work with gradients of complex models, but it won't be useful to us today, and it means that, after we have used the gradients to update our parameters, we need to set them back to zero. That's something you have to remember to do in PyTorch: set your gradients to zero after you have used them, so that on the next backward you will get fresh gradients.
B
Okay, great. So how does autograd work? Let's see it in practice. As I said, first you need to set the requires_grad property on your tensor. When you create a tensor, by default requires_grad will be False, as you can see. Then, if you want to change this, you can use the requires_grad_ function, with an underscore at the end, meaning in-place, which allows you to change the value of the requires_grad attribute. And here you can check that, indeed, we get a tensor x with requires_grad equal to True.
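In code, that check looks like this (a minimal sketch):

```python
import torch

x = torch.ones(2, 2)
print(x.requires_grad)     # False: new tensors don't track history by default

x.requires_grad_(True)     # trailing underscore = in-place change
print(x.requires_grad)     # True: autograd will now track operations on x
```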
B
So once that's done, you can see how autograd will actually track the operations. Let's do a simple addition: y = x + 2. Here you can see that our y tensor also has the requires_grad attribute equal to True, and that it has a gradient function. This grad_fn is actually a pointer for autograd to know which kind of operation it has to perform to compute the derivatives. We won't dive into this.
B
But it's interesting to see that, through these grad functions, autograd is building a graph of the computation, and that's what will allow autograd to compute the derivatives later on. So if we continue and do more computation on y, we can see that z will also require grad, and it will also have a gradient function, which is a different one.
B
So here it's the backward function defined for the mean, for example. And once we've seen that, we can reproduce the example that we had in the slides: a little linear prediction model. So we have an input x and a target y that I define here, and then a weight matrix W and a bias b.
B
I will make W and b require gradients so that we can update them, because these are the parameters that we want to update. You can check here our two tensors, and then we can just run the line to compute the loss. You can see that the loss is also a tensor that requires grad and that has a gradient function, so it knows what kind of operations were used to create it.
B
And now, if we check the grad attribute of W and b, which is where the gradient is stored, you can see that at the beginning they have no gradient: the grad attributes are set to None. Then, if we call backward on the loss tensor, it will compute the derivatives with respect to W and b. And if we check the gradients now, you can see that the grad attribute of W and b has been populated with some data; however, x and y, which don't require gradient, still have none.
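The same experiment can be sketched in a few lines (the shapes and values here are made up):

```python
import torch

x = torch.tensor([0.5, 1.0, -1.0])         # input: no gradient needed
y = torch.tensor(2.0)                      # target: no gradient needed
W = torch.randn(3, requires_grad=True)     # parameters we want to update
b = torch.tensor(0.0, requires_grad=True)

loss = ((W @ x + b) - y) ** 2              # squared error, as in the slide
assert W.grad is None and b.grad is None   # nothing populated yet

loss.backward()                            # autograd fills in .grad
assert W.grad is not None and b.grad is not None
assert x.grad is None                      # x never required grad
```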
Let's just confirm what I've said about gradients accumulating. So if I run this line again, computing the loss again with the same values, and then I call backward again, we can check that the gradients of W and b are now actually twice as much as before. So here you can see that the gradients accumulate, and if you don't want this to happen, you'll have to set the grad to zero between two backwards.
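A minimal demonstration of that accumulation behavior:

```python
import torch

w = torch.tensor([1.0, 2.0], requires_grad=True)
x = torch.tensor([3.0, 4.0])

(w * x).sum().backward()
first = w.grad.clone()                     # d(sum(w*x))/dw = x = [3., 4.]

(w * x).sum().backward()                   # second backward, same values
assert torch.allclose(w.grad, 2 * first)   # gradients were added, not replaced

w.grad.zero_()                             # reset between backwards if unwanted
```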
B
So we have seen how to compute the gradients for our W and b tensors. How does it actually work for the model parameters, if we have a model? Here we have a quick example of how this could work. So let's create a neural net: here I create a simple sequential net.
A
A
B
I also create a loss function, so that I can compute the loss and then call backward. Here, if I ask for the 0 index of the neural net, it will actually return the first module that is inside the Sequential. So when running this cell, I actually get the first Linear layer, which has in_features 5 and out_features 10, and then I can check the weight of this module. We know that it's a Linear module,
B
so it has a weight. You can check this, and we can indeed see that it has shape 10 by 5. You can see again that it's a Parameter and that requires_grad is True. This is something we might not have paid attention to before, in the previous notebook, but that's a feature of the Parameter class: the parameter is registered by the model as being a parameter, and it automatically has requires_grad set to True.
B
Then, let's try to compute the loss from a prediction and some targets, and see how the gradients populate inside our neural net. So here I created some dummy data, x and y, two tensors. First you see there is the batch size, which is here and here, and then the input size of x is 5.
B
I can run my neural net, calling it on x, get some predictions, and then compute the loss between the predictions and the values we want, y, here. And I get a loss which has a gradient function: we have tracked the operations that happened to get to that loss. And then we can check the gradient attribute of the weight of our neural net.
B
Then, as I said, you need to zero the gradients before you actually do another prediction and update. To do that, modules have a specific function called zero_grad that you can call directly on the model, so here on the neural net, and this will zero the gradients of all the parameters. If we run that, we can check that afterward the gradient of our weight tensor is set to zero. That's something that you would need to do between each time
B
you see a sample in your training, during the epochs. One last thing that is useful to know is how to stop this history tracking from autograd. All your parameters have requires_grad True by default, and during inference, when you are not training, you don't want to update your parameters; you simply want to get some predictions, so you don't want to create a computation graph. You can do that with the no_grad context.
B
So if you write `with torch.no_grad():`, no computation graph will be built. If I run this here, you can see that the operation y = x to the power of two creates a tensor y that requires gradient. However, if you run the same operation inside the torch.no_grad context, the tensor that you create here doesn't require gradient, and you can see that the grad and the grad_fn of it are both None.
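The same check as in the notebook, sketched:

```python
import torch

x = torch.ones(3, requires_grad=True)

y = x ** 2                        # tracked: y has a grad_fn
assert y.requires_grad and y.grad_fn is not None

with torch.no_grad():             # inference: no graph gets built
    z = x ** 2
assert not z.requires_grad and z.grad_fn is None
```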
B
So those are the basic contents of autograd. You might want to go through it again to make sure you have understood everything. For now, let's just see how we can add it to our training loop. From what we had before, the only thing that we will actually add will not be inside the initialization part; it will be inside the training loop itself. So we add these two bits, with the loss backward function.
B
So after having computed the loss, we will compute the gradient with respect to each parameter of the model, and that's with loss.backward(). Then we will have to update the model parameters, and finally we will need to set the gradients to zero, as I have said, otherwise they would accumulate forever. Note that the exact place doesn't matter: I have put this line here, but you can also put it at the beginning of the training loop.
B
The only place where you don't want to put it is between the computation of the gradients and the update of the model with those gradients. So you don't want to put your zero_grad right here; otherwise, you will never update your model parameters, because your gradients will always be zero. Apart from that, this zero_grad call can go anywhere.
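Putting the three steps together, here is a minimal sketch with dummy data; the model, sizes, and learning rate are made up for illustration:

```python
import torch
import torch.nn as nn

torch.manual_seed(0)
model = nn.Linear(4, 2)                 # hypothetical tiny model
criterion = nn.CrossEntropyLoss()
lr = 0.1

x = torch.randn(8, 4)                   # dummy batch
y = torch.randint(0, 2, (8,))

start = criterion(model(x), y).item()
for _ in range(20):
    loss = criterion(model(x), y)
    loss.backward()                     # 1. compute the gradients
    with torch.no_grad():               # 2. gradient-descent update
        for p in model.parameters():
            p -= lr * p.grad
    model.zero_grad()                   # 3. reset grads (never between 1 and 2)
end = criterion(model(x), y).item()
```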
B
So that's it for our new training loop. Let's move back to the slides and see what we have added. From our training loop, we are nearly done: we have computed the gradients, and now we have this new step of zeroing the gradients that we have just seen in the demo. What remains is simply to now update the model parameters, and so to optimize the model for the task.
A
B
B
A
B
B
We will now see how to update the parameters using optimizers. So, how to optimize parameters in PyTorch? PyTorch has a torch.optim module where we can find lots of different optimizers. An optimizer that you construct will take as input a list of parameters that you want to optimize, and then, when you want to actually do an optimization step, you can call the optimizer's step function to update the parameters.
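A minimal sketch of that construct-then-step pattern (the model and learning rate are made up):

```python
import torch
import torch.nn as nn
from torch import optim

torch.manual_seed(0)
model = nn.Linear(3, 1)                        # hypothetical model
opt = optim.SGD(model.parameters(), lr=0.01)   # give it the params to optimize

before = model.bias.detach().clone()
loss = ((model(torch.randn(5, 3)) - 1.0) ** 2).mean()
loss.backward()       # fills .grad on every parameter passed to the optimizer
opt.step()            # uses those .grad values to update the parameters
opt.zero_grad()       # zeroes the grads of the same parameters
```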
B
So all the parameters that are passed to the optimizer will be retained inside the optimizer object, and that way the optimizer can access their grad attribute and update their values when you call optimizer.step(). Finally, we know that we have to zero the gradients; you could also use optimizer.zero_grad(). This function does the same thing for the parameters that were passed to the optimizer. So, maybe to make it a bit more concrete:
B
here we go. First, we want to import the optim package, so from torch I import optim here. This command allows me to check all the different optimizers that are provided by PyTorch: you have Adagrad, Adadelta, SGD, Adam. These are very common optimizers that are already implemented, and that you can then use out of the box.
B
As I said, you always need to provide the parameters when you instantiate one. So here I'm going to use the SGD optimizer; this is in the torch.optim submodule. To create this optimizer I need to provide the parameters, and here I can use the parameters() function that we have seen in the modules notebook, which allows us to directly get all the parameters of our model.
B
So let's see how it's used in practice. I define a loss function here, and dummy input and output: x, which is an input with batch size 15 and feature size 10, and a random label y. I compute the prediction with my neural net, so I can just run it on x and get predictions; I compute the loss corresponding to these predictions and the labels, and then I compute the gradients with backward, as we have seen just before.
B
We know that all the gradients of the parameters of the model have been computed here. Let's check what the value of the bias of the first linear layer in my neural net is. As you can see here, we have a bias with five values, and now optimizer.step() will use the gradient of this bias, that is, the gradient of the loss with respect to this bias, to update these parameter values.
B
So if we do optimizer.step() now, when we print the new value of the bias, we can see that we have slightly different values, because the optimizer has changed the value of the bias using the gradient. So that's it: you just have to create the optimizer with the proper parameters that you want to optimize, and then, after having computed the gradients, you simply need to call optimizer.step() to request a parameter update from the optimizer.
B
One thing you usually also want to do with optimizers during your training is to decay the learning rate. You might not want to keep the same learning rate all the time, and to do so, you have learning rate schedulers. In PyTorch there are a couple of learning rate schedulers that already exist and that allow you to change the learning rate during training.
B
B
So here, what you can see is that I print the learning rate of my optimizer, which is 0.1. Then I do a step of the scheduler, which is supposed to decay my learning rate by multiplying it by 0.8, and here I print the learning rate after the decay. You can see that, indeed, the scheduler has decreased the learning rate from 0.1 to 0.08. So that's how you could use a scheduler inside your training.
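That multiply-by-0.8 decay matches what `ExponentialLR` does, multiplying the learning rate by `gamma` at each scheduler step (a sketch; the dummy parameter is only there so the optimizer has something to hold):

```python
import torch
from torch import optim

params = [torch.zeros(1, requires_grad=True)]   # dummy parameter
opt = optim.SGD(params, lr=0.1)
sched = optim.lr_scheduler.ExponentialLR(opt, gamma=0.8)

print(opt.param_groups[0]["lr"])   # 0.1
opt.step()                         # normally: one (or more) training steps
sched.step()                       # then decay the learning rate
print(opt.param_groups[0]["lr"])   # 0.08
```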
B
It's not used all the time, but they are available in the package. Once you have seen that, we can just add this optimizer to our training loop to finally finish and have a full training loop. In the initialization part, what we need is to add this line, which instantiates a new optimizer here from the optim package, an SGD optimizer. I pass to the optimizer the model parameters, which are the parameters I want to optimize, and a given learning rate. And inside the training loop,
B
I added this line, which will use the gradients that autograd has computed with loss.backward(), and the optimizer will use these gradients to update the parameters. So you need to call the optimizer step after having computed the gradients with backward. After having used the gradients, you can discard them by setting them to zero, as I have said; rather than using model.zero_grad(), you could also use optimizer.zero_grad() here, and it does the exact same thing. So here we are; we are good.
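The full loop with an optimizer then looks like this (the model, sizes, and learning rate are again made up for illustration):

```python
import torch
import torch.nn as nn
from torch import optim

torch.manual_seed(0)
model = nn.Sequential(nn.Linear(4, 8), nn.ReLU(), nn.Linear(8, 2))
criterion = nn.CrossEntropyLoss()
optimizer = optim.SGD(model.parameters(), lr=0.1)   # initialization part

x = torch.randn(16, 4)              # dummy data standing in for a loader
y = torch.randint(0, 2, (16,))

start = criterion(model(x), y).item()
for _ in range(30):                 # the training loop itself
    loss = criterion(model(x), y)
    loss.backward()                 # autograd computes the gradients
    optimizer.step()                # optimizer updates the parameters
    optimizer.zero_grad()           # same effect as model.zero_grad() here
end = criterion(model(x), y).item()
```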
B
B
So I think I don't have much time left, but I will try to quickly go through a complete example of how to build and train a full model on a task, the MNIST digit-classification task. Let me go through this quickly inside the Jupyter notebooks, and you can probably ask more questions later on Slack or after the webinar. So for this example, I have a couple of imports that you now know, and the network we want to build is LeNet.
B
So here is a picture, which is not very clear, of the network we want to build, and the details are here. What we want to have is a couple of convolution, ReLU, max-pooling layers: a convolution, a ReLU, max pooling, another convolution, ReLU, max pooling, and then a convolution with given input and output channels. And then, once I have decreased the size of the image using these convolutional layers, I can use fully connected layers to do the final classification.
B
So for such a network, you could write a custom module this way. Here I still inherit from nn.Module, and then I choose to create two Sequential models inside __init__. The first Sequential will be the convolutional part and the second one will be the fully connected part. So here you can see I used different modules from the nn package: Conv2d, ReLU, MaxPool2d.
B
These are modules that you can find in the nn package, depending on what you need; you can check them in the PyTorch docs. And then here I use Linear layers to form the fully connected part. Once I have defined the network in __init__, in forward I simply need to tell which is the way I want to combine them. So first I apply, on my input images,
B
the convolutional part, and I get some output. Then this line, if you check it, will flatten the output: it will actually take the tensor, which is usually four-dimensional, with the batch, the channel, the height and the width, and convert it to a tensor that just has a batch dimension and a feature dimension. So this is the flattening part.
B
And then you apply the fully connected part on the output of this flattening. So you apply the convolutional net, you flatten the output, and then you apply the fully connected part to get the final predictions. So here it is: with our __init__ and our forward, this is enough to define this convolutional network. So we run this cell, and then we can continue.
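Such a module could be sketched like this; the channel and unit counts follow the classic LeNet layout and are assumptions, not the exact notebook code:

```python
import torch
import torch.nn as nn

class LeNet(nn.Module):
    def __init__(self):
        super().__init__()
        # Convolutional part: conv -> ReLU -> pool, twice.
        self.conv = nn.Sequential(
            nn.Conv2d(1, 6, kernel_size=5), nn.ReLU(), nn.MaxPool2d(2),
            nn.Conv2d(6, 16, kernel_size=5), nn.ReLU(), nn.MaxPool2d(2),
        )
        # Fully connected part doing the final 10-way classification.
        self.fc = nn.Sequential(
            nn.Linear(16 * 5 * 5, 120), nn.ReLU(),
            nn.Linear(120, 84), nn.ReLU(),
            nn.Linear(84, 10),
        )

    def forward(self, x):
        out = self.conv(x)               # convolutional part
        out = out.view(out.size(0), -1)  # flatten: keep batch dim only
        return self.fc(out)              # fully connected classifier

net = LeNet()
scores = net(torch.randn(2, 1, 32, 32))  # batch of 2 fake 32x32 images
```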
B
We can instantiate the network. So here I create the net with this instantiation, and I use the print function from PyTorch to be able to check that I have defined the proper network. Here you can see it prints all the operations that I want to perform; however, it's not necessarily the order in which I'm going to perform them, it's just what parameters and modules are inside my network. So let's check here all the parameters that are actually inside my model and that I want to optimize.
B
You can see that we have a couple of biases and weights for the convolutional layers, and a couple of biases and weights as well for the fully connected layers. These are all the parameters that I want to optimize during my training. We can also check, if you want, what would happen if we feed that network a random input: we take an input that has a specific size, so batch size, channel, height and width, pass it as input to our convolutional network, and we can see the output.
B
B
We need to load the data and then create our training functions. So, loading the data is now something that we know how to do: here we create two datasets, loading MNIST in two different splits. Here we load the train split, and here we load the test split of MNIST. This is a dataset that is already available in PyTorch, through torchvision.
B
So you don't have to create your own class to deal with this data; it will even download the dataset for you if you wish. We define a couple of transformations: we resize the images to 32 by 32, and then we convert them to tensors. We provide these transformations to our dataset, and then we create two loaders, one for the training set and another one for the test set. This is done simply as we have seen before.
B
B
Next, the function that will actually train our network. So here I have chosen to actually write a train function, and not just a training loop, so that I can call this function multiple times with different numbers of epochs. In this function, first I want to create the optimizer I want to use: in this case I chose Adam, to which I give the parameters and the learning rate.
B
The model has two modes, train and eval, and there is a slight difference between the two. In train mode, when you are using batch normalization or dropout layers, the dropout layers will actually drop certain connections inside your network, which is something you don't want to happen when you are using your network for inference. So, when you don't want to have the dropout active in inference, you want to set your model to eval mode.
B
However, when you train your network, you want your dropout to be active, and so you will set your model to train mode. Batch norm has a similar behavior: it will behave differently depending on whether you are in eval or train mode. So make sure, when you're training, to use the train mode, and then, when you're not training, to use the eval mode. And so here we have the whole training loop.
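The switch itself is one call on the model; a small sketch using dropout to show the difference:

```python
import torch
import torch.nn as nn

model = nn.Sequential(nn.Linear(10, 10), nn.Dropout(p=0.5))

model.train()                  # training mode: dropout zeroes random units
model.eval()                   # eval mode: dropout becomes the identity
assert not model.training      # the flag propagates to all submodules

x = torch.ones(1, 10)
with torch.no_grad():
    a, b = model(x), model(x)
assert torch.equal(a, b)       # eval mode is deterministic
```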
B
So if I run these two cells, I then have my train function and my accuracy helper defined. And here I will first define the device I want to use: in this case, on this laptop, I don't have a GPU, so we use the CPU. Then I move my convnet to the proper device, I feed all of that to my train function, and the training runs.
B
B
Let's just wait for our final accuracy here, which is actually quite impressive: we can see that we can actually learn the task and recognize MNIST images with just a simple convolutional network, and already get 96% accuracy. So that's something that is actually quite simple and easy to do in PyTorch.
B
So, to wrap up, we have seen a couple of things, but there is much more you can discover. There is torchvision, where you can find pretrained networks and datasets; there is a way to move to production with high-performance TorchScript. There are a lot more things that you can discover in PyTorch. And maybe, as a last word:
B
this is a nice quote from Andrej Karpathy; basically, trusting PyTorch makes your life better. I encourage you to check the GitHub repository that Steve already provided in the chat, where we have the notebooks I have presented during this webinar, along with detailed notebooks for offline study, with lots of comments, and there are a couple of other resources that you will want to check. All the great pictures I have taken from the deep learning book; this is an e-book.
B
A
A
A
Okay, so I think some folks have been dropping off already, but let me just say: apologies if we didn't get to anybody's questions; please feel free to drop them on Slack and we'll get to them. But maybe we can just get to a couple right now, if you're still interested in answering. So one of them, which I think is maybe worth touching on, and I don't know exactly what your level of familiarity is with other frameworks:
A
B
It's a polemic question, I guess. I haven't really used much TensorFlow stuff myself, so I would probably be biased toward PyTorch, but what I know is that, for now, the production part of TensorFlow is more mature. So if you want to put a model into production, it would be much easier using TensorFlow.
B
However, when you want to quickly tweak a network and test a couple of things, you can easily debug, you can easily look into your gradients, and do things that are much harder to do in TensorFlow. So I would say that PyTorch is usually more interesting on the research side of things, when you want to try out things that are not standard, which is not what everyone does.
B
If you want to do what most people do, and do it quickly, I guess Keras is probably your best friend, because with Keras you can really do something very quickly and at a very high level. PyTorch will help you tweak a bit more on these things and have more control, and TensorFlow Serving would be more production-ready. That must be what most people would say, yes.
A
Agreed. Let's see here, there's another question about whether we can define custom loss functions. This was a while ago and I didn't have a chance to bring it up, but maybe just comment on whether it's possible and how you can implement custom loss functions.
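For reference, custom losses in PyTorch are just functions of tensors; autograd differentiates through them like any built-in loss. A hypothetical example (the penalty term here is made up):

```python
import torch
import torch.nn as nn

# Any differentiable function of tensors can serve as a loss.
def my_loss(pred, target):
    return ((pred - target) ** 2).mean() + 0.1 * pred.abs().mean()

model = nn.Linear(3, 1)
pred = model(torch.randn(4, 3))
loss = my_loss(pred, torch.zeros(4, 1))
loss.backward()                          # gradients flow as usual
assert model.weight.grad is not None
```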
B
A
B
Again, with the link to the GitHub notebooks, you will find a live-notebooks folder, where you will find the notebooks I went through, in which I didn't put any comments, and also notebooks that I previously made for another course. There you will find assignments, where you will find much more, and they will be a bit longer. So I encourage you to go through these assignments to get all the explanations, and for any questions you can contact us through Slack.