From YouTube: 5. Introduction to cuML
Description
From the NERSC NVIDIA RAPIDS Workshop on April 14, 2020. Please see https://www.nersc.gov/users/training/events/rapids-hackathon/ for all course materials.
A
All right, everybody, welcome back from break. The next item on the agenda here is going to be a presentation on cuML from Zahra. So take it away, Zahra.
B
Thanks, Rollin. Hey everyone, I'm Zahra Ronaghi, a data scientist on the solutions architect team at NVIDIA, and, like Rollin mentioned, also a former NERSC postdoc.
B
In my current role I work with HPC users at DOE labs, and it's actually great to be able to continue working with Rollin, Laurie, and others at NERSC and to be with you all today, remotely. I'll talk a little bit about cuML before we get to the notebook examples, just a couple of minutes to review what Nick mentioned earlier. The goal is to be able to import a GPU library, for example cuML instead of scikit-learn, and run machine learning algorithms on GPUs without having to write CUDA code.
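As a rough sketch of that import swap (illustrative, not the exact notebook code; cuML deliberately mirrors the scikit-learn estimator API):

```python
# CPU: scikit-learn
# from sklearn.linear_model import LinearRegression

# GPU: cuML exposes the same estimator name and the same fit/predict API
from cuml.linear_model import LinearRegression

model = LinearRegression()
model.fit(X, y)          # X, y can be NumPy arrays, CuPy arrays, or cuDF objects
preds = model.predict(X)
```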
B
Of course, as we saw in the cuDF examples, it's sometimes as easy as just changing the import: you just import a different library. But sometimes it requires changing a few parameters and keywords, and we'll see a couple of those examples in the cuML notebooks.
B
Okay, and so this is what the software stack looks like today. The top layer that's exposed to the user is Python. Underneath we have Cython wrapping, which connects the C++ to the Python layer. Our core algorithms are built with a combination of CUDA libraries like Thrust, cuSPARSE, and cuSOLVER, and we also have machine learning primitives.
B
These can be the core linear algebra functions, distance functions, norms, transpose, and so on. Another nice thing is that we're actually working on wrapping these primitives in Python too. So if there's an algorithm that you're interested in working on, and it's not on our roadmap and we're not working on it, you can actually use these building blocks to create or test it, again without having to write CUDA C++ code. And this is the status of our algorithms.
B
Today, like Nick mentioned earlier, we'll see a couple of those: linear regression, logistic regression, and UMAP. Some of them, like linear regression and random forest, currently run multi-GPU and multi-node, but for some of the other ones we're in the process of working on that too, extending them to multi-node, multi-GPU. So this is the roadmap as we get closer to 1.0.
B
Yes, okay, perfect! I was just wondering if I'm still sharing. Okay, so we have three notebooks, more like two, because the first and second one are very similar. When I was preparing this, I actually ran through these on my local workstation.
B
These are Quadro cards, compared to the Tesla cards that you're running on the Cori GPUs. They're basically the same, but one is designed for servers, which are the Tesla cards, and the other one, Quadro, is for workstations. Basically, what it adds is cooling and display ports, because it's a graphics card rather than just an accelerator. So I'll use that. And another thing that I wanted to mention: I'll close these tabs.
B
So we can see the whole screen, and hopefully it'll be large enough to go through it. This is NVDashboard, which we actually added, I think, a few months ago. It's really nice that you can monitor your GPU utilization and GPU memory while you're going through these notebooks. Before this, one thing that we used to do was just run nvidia-smi under watch to monitor the GPU, but this is much nicer. This was installed in the container that I'm running, which is 0.13.
B
So I think it should be in the same one on the NERSC side, but basically what it is, it's called NVDashboard. You can install it yourself too with pip, but if you're running a similar container it should be in your Jupyter notebook on the left too, so you can click on System Dashboards and pick.
B
What else do you want to look at? For example, here I have GPU utilization and memory, and then PCIe and NVLink throughput, which are actually interesting if you're running, or when we get to, the Dask workflows. It'll be interesting to look at those too.
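If it's not already in your container, the dashboard she's describing is the jupyterlab-nvdashboard extension; a sketch of the install (package and extension names taken from the public project, so treat the exact commands as approximate for your JupyterLab version):

```python
# In a notebook cell (or drop the "!" in a shell). The extension adds the
# "System Dashboards" tab with GPU utilization/memory, PCIe and NVLink plots.
!pip install jupyterlab-nvdashboard

# On JupyterLab 2.x era installs, the lab extension also had to be enabled:
!jupyter labextension install jupyterlab-nvdashboard
```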
B
For those of you that are familiar with scikit-learn and have used it, the API looks very similar, so some of it might actually be super easy and straightforward. The goal is to basically use some random or synthetic data with cuML and port the code from scikit-learn to cuML to run on GPUs. Then in the second part of this notebook we actually have a hyperparameter optimization example, and the nice thing about that is that we're actually compatible with other HPO packages that scikit-learn can use.
B
We can use that HPO from scikit-learn to run our cuML estimators and cuML algorithms, and you don't need to use another library or create your own. Then in the second notebook we'll get to logistic regression, which is very similar to this first one, and the goal is implementing what you've learned in the first notebook.
B
So it's very short, very similar, just a few differences that I'll talk about when we get to that. And then finally we have a notebook with UMAP, to see how we can visualize MNIST datasets. We'll compare that with t-SNE too, and we'll use two datasets: one with digits, and another one with the fashion dataset, which includes boots and clothes and things like that.
B
So let's start with this one. In the first cell we're just going to import some of the libraries that we need, like matplotlib and NumPy, and, as I mentioned, we're just going to create a random dataset in this example. Then here we just want to plot this. What we're doing is adding some random noise to y, and then we want to see how that looks.
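A minimal sketch of that setup (variable names and constants here are illustrative, not necessarily the notebook's):

```python
import numpy as np
import matplotlib.pyplot as plt

n_samples = 1000
x = np.linspace(0, 1, n_samples)                    # single input feature
y = 2.0 * x                                         # the "true" linear relationship
y_noisy = y + np.random.normal(0, 0.1, n_samples)   # add random noise

plt.plot(x, y, color="black", label="true relationship")
plt.scatter(x, y_noisy, s=2, label="noisy observations")
plt.legend()
```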
B
So we'll use scikit-learn to run linear regression. Again, you'll see throughout these notebooks that we check the version too. So, for example, the scikit-learn one that we have is 0.21, and this is really useful for RAPIDS too, since we have almost a six-week release cycle.
B
It's good to keep track of what you're running, so if there are any issues, or if you want to report any bugs later on, it's good to just keep track of that. Then this is importing linear regression from scikit-learn, and actually the scikit-learn API is pretty user friendly. What we did here is create an object, and this is the object that will do the ordinary least squares regression.
B
Basically, what it will try to do is fit a line to the dataset that we just created, and it'll minimize the squared distance between these observations and the true relationship, which is this line. It's basically one of the introductory machine learning algorithms that's commonly used.
B
So the next thing that we do is run fit. We'll use NumPy, and some NumPy functionality, to create a format that scikit-learn accepts; in this case that means expanding the dataset. What we did was create this x, which is one variable, so it's equal to a column, and what NumPy's expand_dims does is it just creates a column on our x: it expands the shape of the array at the position that we've specified here.
B
So if we look at x, this is the array and its shape, and then we can run numpy expand_dims to create this dataset.
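For example, continuing the sketch above:

```python
print(x.shape)                    # (1000,) -- a flat 1-D array
X = np.expand_dims(x, axis=1)     # insert a new axis at position 1
print(X.shape)                    # (1000, 1) -- n_samples x n_features,
                                  # the 2-D layout scikit-learn expects
```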
B
One thing that I mentioned: I think in the example here we had y, so just a slight modification to change that to y_noisy when we want to fit the data. Then what we do next is run that to see how it will fit. So we run fit here to train our model, and then after that we want to create another set of data to run inference on.
B
So basically we want to predict how this matches with the trained model that we just created. We'll again use some of the NumPy functionality to create a linear space, basically a grid of these points. Again, in this case it's just a single line, because it's one dimension, and then we run predict.
B
We predict on these new inputs, and what it does underneath is basically a matrix multiplication, but some of the more sophisticated algorithms can be more complex and computationally expensive.
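The train-then-infer steps she's describing look roughly like this, continuing the earlier sketch variables:

```python
from sklearn.linear_model import LinearRegression

ols = LinearRegression()
ols.fit(X, y_noisy)                 # train on the noisy observations

# new inputs for inference: an evenly spaced grid over the same range
x_new = np.linspace(0, 1, 100)
X_new = np.expand_dims(x_new, axis=1)
y_pred = ols.predict(X_new)         # underneath, essentially a matrix multiply
```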
B
But in this case it's just a matrix multiply. Then, finally, what we're doing here is plotting the true line and the predicted relationship, which is scikit-learn on CPU, so the red line that you can see here. And what we realize is that it actually matches the true relationship line, which means the fit worked pretty well. With the training dataset and the white noise, it means the optimization process minimized the error between the observations and the fitted line, and it worked really well, because we can't even see the black line.
B
Now, one thing that I should mention is you don't necessarily need to create a cuDF DataFrame to use cuML. In a lot of cases it's recommended, because you can run some of the other operations faster, like the examples that we saw in the cuDF notebooks, but cuML can actually accept CuPy and NumPy arrays too. So if you really wanted to test this fit, what you can do is use the same array from the previous section and then pass that in without creating a cuDF DataFrame.
B
So it really depends on the workflow, if you already have it working with a dataset, and we'll get to that when we get to the logistic regression. But if you already have data, for example in CuPy format or as NumPy, and you don't want to convert it because you're using other frameworks too, like PyTorch, you can easily do that and still use cuML.
B
So what we did here again: we're going to import cuML. This is the current version, which should be similar to the one that you have on Cori. Then here, similar to the scikit-learn version, we'll just import linear regression, and we'll name it LinearRegressionGPU so as not to confuse it with the other one. Again, we'll instantiate our linear regression object and fit it.
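The GPU version is, roughly, the same few lines with the cuML import (a sketch; the default parameters she describes next are written out explicitly, with names from the cuML 0.13-era API):

```python
import cuml
print(cuml.__version__)            # e.g. 0.13

from cuml.linear_model import LinearRegression as LinearRegressionGPU

# defaults: the 'eig' (eigendecomposition) solver, fit the intercept,
# and don't normalize the data
ols_gpu = LinearRegressionGPU(algorithm="eig", fit_intercept=True, normalize=False)
ols_gpu.fit(X, y_noisy)            # NumPy input is accepted; no cuDF required
y_pred_gpu = ols_gpu.predict(X_new)
```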
B
So what we did was, similar to scikit-learn, you'll have some of these default parameters; we've added those too. The goal is: if a user is new to, let's say, linear regression, and they don't know there are different options for the solvers, why would they use one over the other?
B
By default we'll make an educated guess at a solver for you, and in this example it'll be eigendecomposition, which is one of the faster ones. Then the other parameters are going to be fit_intercept set to True, and we're not going to normalize the data, so that's going to be False.
B
Now, again, in this example we have a trained model, and what we want to do is visualize this to see how it compares to our CPU prediction and then the true line that we had, which was the relationship between y and x. One thing is, you'll probably see a lot of this warning, especially, I think, in the last cell or the one before the last.
B
It basically means that for training we're using column-major data, and prediction requires row-major data, so there's an inconsistency between the data layouts, and that will result in an overhead of additional memory usage. We're just surfacing that this is happening in cuML. It's not really a bug; it's more of a feature request that we're working on, and you shouldn't see this in the next couple of releases. And then this is the final graph, where we have the black line.
B
However, sometimes we're not running the same exact algorithm as scikit-learn. Sometimes we're implementing algorithms that make more sense for GPUs, for massive parallelism, so it might be a different solver, and neither is right or wrong. It's just a different approximation that we're using. And if you really want to compare the same exact algorithms, you might have to adjust parameters or solvers to be able to compare apples to apples.
B
Sometimes, though, this is to our advantage, and we can get better results in cuML, and the reason for that is we can optimize more. We have an optimization loop that we can actually take advantage of, because we can run more iterations since we're faster.
B
So overall, sometimes, like I mentioned, we're using different implementations. Okay, so now let's look at a hyperparameter optimization example, or hyperparameter tuning, which is basically the process of choosing a set of optimal hyperparameters for a learning algorithm. These parameters are usually set by the user before training.
B
These are things that we can modify, and you can either do that by hand, trying different sets of hyperparameters, but usually it's more efficient to use HPO, and we'll see that example. There are different methods: you can try all the combinations of those parameters, or just randomly select a few, and then compare performance and accuracy to see which parameters will give you the best results.
B
For this example we're going to use the diabetes dataset from scikit-learn. scikit-learn actually has a few built-in datasets that are really good for testing and running demos and sample notebooks like these. We'll run HPO in this case for ridge regression.
B
For this example we'll have one parameter to work with, which is alpha, but for other algorithms we might actually have a lot of different parameters. If anyone has done deep learning, you know that HPO can optimize a lot of different parameters, for example when training a neural network. The nice thing about it is, like I said, you can go through all of the possibilities or do a random search and then compare the performance of those to find the best ones.
B
So what we're doing here is, after creating that dataset, we just split it into train and test. Here we're setting the test size at 0.2, so we're keeping 20 percent of the data for testing, and then here, similar to the previous one, we're just going to import Ridge from scikit-learn.
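A sketch of that setup:

```python
from sklearn.datasets import load_diabetes
from sklearn.model_selection import train_test_split
from sklearn.linear_model import Ridge

X, y = load_diabetes(return_X_y=True)

# hold out 20% of the data for testing
X_train, X_test, y_train, y_test = train_test_split(X, y, test_size=0.2)
```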
B
And here we've just added some of the details about the differences between the scikit-learn and cuML examples. Both approximate the same thing, but cuML currently has three different solvers, and scikit-learn has some other ones. The only one that's in common between the two right now is SVD, and, like I mentioned earlier in the previous section, sometimes we're not comparing the same exact parameters. For scikit-learn, the auto solver is actually a good option.
B
What they do is they have heuristics that can choose the best solver for you, based on your dataset size and other factors, like the type of data that you're using. We haven't implemented this in cuML yet, so we don't have an auto solver, and one of the reasons is that for complex algorithms we still don't have good heuristics; we basically want to see more use cases to figure out which ones will give us the best performance.
B
So what we're going to do here: we'll create the ridge regression objects, fit them, and then run predict on both of these, the CPU and the GPU version. I explicitly added the parameters that I used, but you can use the defaults and basically just select the solver.
B
Then what I did here is I ran score, so we can run the prediction and compare the accuracy of these two. We use the built-in score, which is the R² of the fit: it's based on the squared errors between the observations, all the points that we have, and the line that we're fitting.
B
So a higher score is better, and that's what we want to optimize on for HPO. When we got to that, the hint was that you can actually use grid search from scikit-learn with your cuML estimator.
B
So, like I said, we'll basically pick certain points of the parameter to explore. We have alpha in this example, so we'll set the exponent from -3 to -1, giving us 10 uniformly distributed points, in this case on a logarithmic scale, and then we'll do the grid search, trying all 10 points we're generating. We're using grid search, so it will use all the points. There's also another way to run this.
B
You can select random search too, and that would make sense if you have a larger space to explore, or more data that you want to run on. If you have a lot of hyperparameters and you run grid search, so you actually test all of them, it might take hours or days to run through that.
B
So it really depends on the number of parameters and how many different values you want to test. Then in the next part we'll create our grid search object, passing in the Ridge estimator from scikit-learn, and the optimization strategy is R² scoring. At the end of that we'll fit the model, and then, if we check the best parameters and best score after it runs, we'll actually see the different values.
B
So if we want to run the same thing with cuML, we'll basically run GridSearchCV from scikit-learn and run fit again with our cuML algorithm.
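A sketch of that pattern, reusing scikit-learn's GridSearchCV to drive a cuML estimator (the parameter grid is assumed from her description: 10 log-spaced alphas between 10^-3 and 10^-1; 'svd' is the solver the two libraries have in common):

```python
import numpy as np
from sklearn.model_selection import GridSearchCV
from cuml.linear_model import Ridge as RidgeGPU

params = {"alpha": np.logspace(-3, -1, 10)}   # 10 log-spaced points

# scikit-learn's HPO machinery calls fit/score on the cuML estimator directly
grid = GridSearchCV(RidgeGPU(solver="svd"), params, scoring="r2")
grid.fit(X_train, y_train)

print(grid.best_params_, grid.best_score_)
```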
B
Excuse me. And it processes without issues, actually, and you'll be able to see the results. The only thing is, again, the warnings that I mentioned earlier, which hopefully you won't see in the next release. And then, because there's a lot to scroll through, I'm comparing the output from cuML with scikit-learn here, and they're very similar.
C
A question about the warnings. You see those because your data is in row format and it's expecting column format, and I guess it's converting it for you. So does that mean you should just always be working in column format, like from the beginning?
B
So that's the thing. Based on the latest release, I think for the fit and the training it's expecting column format, but for inference it's expecting row format, and that's because we're using the Forest Inference Library. So within cuML, too, depending on what you're running, it's expecting different formats of data. This was actually something where I think I saw an issue on GitHub that Nick mentioned.
D
Great, thanks. So yeah, you pretty much nailed it, Laurie. In general, most operations that we're doing will want a column-major memory layout, but in this specific case the pipeline of predict after fit needs different ones, and that's something that's being worked on currently. In general, though, column-major is what we would usually say.
D
In the future it will probably be whatever the appropriate memory layout is; I think that's going to depend on a variety of things.
C
So we have a question from Venkitesh, and he says: scikit-learn itself performs well, with parallelization over data batches, but how much extra performance does cuML give?
B
So, as far as I know, with scikit-learn I don't think you can actually run on GPUs, and that's what we're doing: we're porting a lot of these algorithms to run on GPUs. Now, there's a scikit-learn-style API, for example, if you use XGBoost, but then again, underneath, that's using a lot of the implementations that we've added.
E
[inaudible]
B
Correct, and I mentioned that earlier too: what we're doing is trying to complement what already exists. So if you're, for example, using PyTorch, and I think Nick had a slide on that too, you should be able to pass your data between the different libraries now, so cuML or PyTorch, with zero copy, so you wouldn't have to go back to the CPU.
E
There are cases where it does parallelize over different GPUs, and then it seems like maybe you can just use that. But most workflows have this data reading aspect where things get slow, and so it seems like cuDF could help there, because it doesn't parallelize the data reading part of it.
B
So, no, if you're using, for example, PyTorch or TensorFlow, some of these frameworks currently work with GPU arrays. So if you use the CUDA array interface, then you don't need to even convert that back to NumPy or pandas. You can directly use your CuPy array, or, for example, for XGBoost you can directly pass your cuDF DataFrame to your learning algorithms.
B
Okay, so I won't really go through all the cells of this notebook, because, like I mentioned, it's very similar to the first one. The only difference is this is logistic regression. Basically, it's a classification algorithm used when the dependent variable is binary, whereas with the linear regression that we saw earlier, the dependent variable, the outcome, is usually continuous.
B
So there are a few differences. The first one is you'll see here that we're using CuPy, which Nick mentioned earlier too. I really encourage you to look at the documentation for it. It's actually a really nice package, it's very interesting, and if you're already familiar with NumPy the API is very similar, but underneath it's actually using CUDA to create arrays on GPUs. The goal of this notebook is to deal with data that already exists.
B
And so we set up our data similar to the previous notebook; we'll import logistic regression in this case, and then the main part is actually fitting the scikit-learn logistic regression. One point is: this is a GPU array, so if you're using a CPU algorithm, you can't really pass that in directly. You need to convert it to a NumPy array, and the data has to be on the host. There are a couple of ways to do this.
B
I think the hint here was to convert it to a NumPy array, so you can do that with cupy.asnumpy, and that will convert your CuPy array to a NumPy array. Another way you can do this is with the get call, to move a device array to the host.
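Both conversions, roughly:

```python
import cupy as cp

x_gpu = cp.random.rand(1000)    # a device (GPU) array

x_host = cp.asnumpy(x_gpu)      # option 1: convert the CuPy array to NumPy
x_host = x_gpu.get()            # option 2: .get() moves the device array to the host
```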
B
And so, for the second part, we'll use the cuML logistic regression, which can actually train models with CuPy arrays. That's what we talked about earlier: you don't necessarily have to convert your data to a cuDF DataFrame; you can use CuPy arrays for training these machine learning algorithms on the GPU. Although one difference here, which you'll see, that we didn't have in the CPU version, for example, is that you have to define your data type.
B
However, another thing that you can do, and this is actually, I think, pretty exciting, is you can set this convert_dtype to True, and cuML can do that conversion for you.
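Roughly like this (a sketch; the keyword is convert_dtype, but treat the exact signature as approximate for your cuML version):

```python
from cuml.linear_model import LogisticRegression

clf = LogisticRegression()

# convert_dtype=True lets cuML cast the input dtype for you
# (e.g. float64 -> float32), at the cost of extra GPU memory for
# the converted copy; converting manually beforehand avoids that
clf.fit(X_gpu, y_gpu, convert_dtype=True)
```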
B
I think this was actually added in the last couple of releases, but the only disadvantage is that it'll use more memory, because it's doing that conversion for you, and so because of that it's off by default. If you want to use it, you can just set convert_dtype to True. Converting the data manually can actually be an optimization step: if you do that, it's more efficient and you'll actually use less GPU memory. And then we compare the results of the logistic regression.
B
So we have about 10 minutes, and I can go through this UMAP notebook, but I'll just pause here if there are any other questions.
C
Sure, so we have a question from Janna. He wants to know if CuPy can convert to float16. Ooh.
D
Sorry, unmute. Yes.
B
Okay, so we'll go through the last notebook. I'll try to go through this quickly. Again, we're using UMAP and comparing that with t-SNE, and we're going to use the MNIST dataset.
B
Each handwritten number is actually a 28 by 28 image, stored as a single array, so to graph it we have to reshape it. For a machine learning algorithm, what this means is that this is actually a 28-squared-dimensional array, not an image per se.
B
UMAP is used for this, and essentially what it does is model the data in a higher dimension and make an assumption about how it is distributed in that higher dimension. In the case of UMAP, it actually makes the assumption that it's uniformly distributed, and a projection from the top of these objects can be made into a lower-dimensional space. This is what we see; we can think of it like a shadow of these dots.
B
So what we basically want to do is look at these datasets with t-SNE and UMAP, and then we'll see the different distributions and different clusterings when we compare the two. For the CPU version we'll use the umap-learn package, which should already come in the container, if you're using that.
B
If not, then you probably have to install it, for example through conda. Then what we did was create the object, selecting the number of neighbors, which at some point is used for the nearest neighbor search in high-dimensional space. That could actually be a hyperparameter to optimize too, so if you want to run that, it would be something interesting to look at in your results and compare. In the example that I have, I modified the number of neighbors to 15.
B
So, as you can imagine, it has to start from somewhere when it tries to calculate how many different types of clusters there are, and this is done with an initialization called spectral; in this case, for example, it uses spectral clustering.
B
With 15 neighbors, it took about a minute and a half to run on the CPU for me, and obviously that depends on the CPU that you're running and the number of cores. Then, when we go through and use cuML, we'll create a cuDF DataFrame and then make the same call from cuML, so it'll be cuML UMAP with the same parameters; we pretty much support all the parameters. And we'll see that it actually goes from a minute and a half on CPU to about three seconds on GPU, and it's actually interesting to plot the results too.
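Side by side, the two calls look roughly like this (a sketch; parameters assumed from her description):

```python
import umap      # CPU: the umap-learn package
import cuml      # GPU: cuML's UMAP, with a matching API

# CPU version: ~1.5 minutes on her workstation for this dataset
embedding_cpu = umap.UMAP(n_neighbors=15).fit_transform(X)

# GPU version: same parameters, ~3 seconds; random_state (added in 0.13)
# seeds the initialization/optimization so runs are reproducible
embedding_gpu = cuml.UMAP(n_neighbors=15, random_state=42).fit_transform(X)
```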
B
It creates a different visualization compared to t-SNE, and a lot of times it's actually more informative to use UMAP. For example, if you look at the output, you'll see that it clusters four, seven, and nine together, and then eight, five, and three together, so those are more similar in the way they look. Another thing that I added, which I encourage you to add too, is a parameter called random_state. Sometimes, when you run the GPU version at different times, you might not get the same output, so the visualization might not look the same.
B
So if you set this random_state parameter, which is basically the seed used by the random number generator during initialization and optimization: it was added to cuML in the current release, so it's actually pretty new, in 0.13, and it matches the CPU implementation. What this gives you is reproducible results, so across different runs your results will look very similar, almost the same. And one of the reasons that I recommend using this is specific to cuML.
B
I mentioned that we're improving performance, adding a lot of parallelism, but that sometimes becomes challenging for the optimization stage. All that parallelism might cause slightly different results even with the same seed, and sometimes it can impact determinism too. So setting a random state will enable this consistency and we'll get similar results, I think up to three digits of precision, but it can also potentially make training slower, maybe by a few seconds, and it can increase the memory usage too.
C
Sorry to interrupt you again, but your screen is frozen, which is fine, but maybe just so you know, we can't see what you're seeing.
B
It's just a little slower to scroll.
C
Okay, well, all of our participants should have your notebook and should be on it themselves, so we'll just encourage everyone again to run through it.
B
Sounds good, and I can share screenshots of it too, just to show how it looks, when the next person presents or during the break; I can just do that.
C
Yeah, before we wrap up, thank you very much, Zahra. Does anyone have questions about cuML or any of the methods we just talked about?
C
It's pretty quiet on chat; maybe people are hungry, but yeah.
A
Okay, great, thank you, Zahra. So Zahra's going to finish up the part of the presentation where she had problems right before lunch. I look forward to that. Okay, thanks, Zahra.
B
I just wanted to share it with you. So this is the one with the digits, the dataset with the numbers, and I was mentioning earlier that if you run this, if you run the cuML UMAP, you can run the CPU version too. But basically the point is you'll see these clusters, and in this one, for example, four and nine and seven are in one cluster, and then in another one
B
we have two, three, eight, and five, which are similar in shape. Then for the second part there's also another dataset, and the difference with this one is, again, it's from MNIST, but it's the fashion dataset. So it's going to be very similar if you read through it, and again we compared it with t-SNE, and then, if you run through that, here's the random_state that I was mentioning earlier,
B
if you want to be able to get reproducible results when you run it a few times on the GPU. And this is what it finally looks like, again very interesting: you can see that, for example, the different images of dresses and coats and shirts are in the same cluster, and in another one we have sandals and sneakers and ankle boots.
B
And another thing, I mentioned this earlier too: I changed the number of neighbors here to 15, rather than the five in the initial version. Like I said, this is another parameter that you can modify and run hyperparameter optimization on. The last part is just applying the trustworthiness metric to compare the cuML UMAP and the CPU UMAP, and a higher score will actually indicate that the GPU implementation is comparable to the CPU one, which here is about 97, 98 percent.
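A sketch of that comparison (cuml.metrics.trustworthiness is cuML's implementation of the metric; the embedding variables continue the earlier sketch):

```python
from cuml.metrics import trustworthiness

# trustworthiness measures how well each embedding preserves the local
# structure of the original high-dimensional data; closer to 1.0 is better
score_gpu = trustworthiness(X, embedding_gpu)
score_cpu = trustworthiness(X, embedding_cpu)
print(score_gpu, score_cpu)   # both around 0.97-0.98 in her run
```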
B
So that's all I have for now. If you have any other questions, because we didn't really get to go through all the cells of this notebook, feel free to reach out to us, or to Rollin and Laurie, and we'll be happy to help you with any questions or issues.
B
Like I mentioned earlier, part of my role is also supporting users, along with some of my other colleagues. I saw, I think, Max Katz on the call too, so feel free to ask any questions and we'll be happy to help you. And I won't take any more time; I'll hand it over to Vivo.