National Energy Research Scientific Computing Center (NERSC) NVIDIA RAPIDS Training, April 14, 2020, 14 Apr 2020

From YouTube: 4. Introduction to cuDF

Description

From the NERSC NVIDIA RAPIDS Workshop on April 14, 2020. Please see https://www.nersc.gov/users/training/events/rapids-hackathon/ for all course materials.

A

Great. So for the next 45 minutes, before lunch, or lunch if you're in the Pacific Standard time zone, we're gonna be going over one of the notebooks that was sent out for the flipped-classroom aspect of this workshop. This is gonna be an introduction to the cuDF library. For those who have had a chance to work on the homework, thank you for that. We'll be reviewing it and also taking questions on it as well. For those who have not had a chance to get started, that's okay!

A

We are not going to be going into depth about all aspects of it, so we encourage you to take some time on your own to do this as well, but we will be covering the entirety of both of these notebooks. I'm going to quickly share my screen again, and if there are questions that you feel like asking, please do use the chat. I think that system worked really well in the last session.

A

Is this a large enough font for people to see, or should I make the font larger?

B

It's okay, but it could be a little larger. Okay.

A

Let me enlarge it.

B

That's great. Great.

A

Okay, so again, hopefully some of you had a chance to work on this. If not, that's okay. This is gonna be a fairly comprehensive but high-level overview of the cuDF library. It's geared for new users, and if you're familiar with pandas, a lot of this should look very, very familiar. So we're gonna start by introducing pandas and then move on to cuDF, in case some folks are not as familiar with pandas.

A

So again, these are just some basic imports. NumPy is the standard PyData ecosystem numerical Python library for array processing. We're setting a seed in order to make sure things are reproducible, and so then we get to pandas.

A

Pandas is for structured and unstructured data, usually in the form of text, not really images and video. It's more about structured data, and it's a data frame tool that lets you do things like this. In this case we're using pandas 0.25, which I think is the current release. Sorry, it's the release before the current one; they're now at 1.0. And so it lets you do things like create a data frame.

A

Put some values in a column and put some more values in a column. This is very basic Python syntax for creating floats in a range of numbers, and you end up with this object. This is a data frame that has two columns, and you can do things to it, like summations. You can do aggregations, and you can do all sorts of different things.

A

You could take the mean. There's a whole host of them, and there's lots of documentation. Now, cuDF is the GPU equivalent of pandas.
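A minimal sketch of the pandas steps just described. The column names and values here are illustrative assumptions, not the notebook's exact cell.

```python
import pandas as pd

# Build a two-column data frame from plain Python ranges of floats.
# Column names "key" and "value" are assumptions for illustration.
df = pd.DataFrame()
df["key"] = [float(i) for i in range(10)]
df["value"] = [float(i * 10) for i in range(10)]

total = df.sum()     # column-wise sums
average = df.mean()  # column-wise means

print(total)
print(average)
```

Both calls return one value per column, which is the shape of result the talk shows on screen.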

A

Pandas is great for small data sets that fit into your memory and for workflows that aren't super intensive computationally, but the baseline pandas rule of thumb is that for every gigabyte of data that you're gonna be working with, or rather every megabyte, you want to have at least five to ten times as much available memory in order to avoid going out of memory.

A

These are actually guidelines that Wes McKinney, who's the lead developer and creator of pandas, has put together. The reason for that is complicated, but it relies on the way pandas' internal structure, called the block manager, handles things. The result is that pandas is an incredibly powerful tool, but it can run into problems with efficient computation and problems with blowing up memory.

A

So cuDF is the GPU version of this, designed to solve a lot of those problems, to be faster, and to stay on the GPU. You'll notice that this code block right here that I'm about to run is gonna look just like this code block. We're doing the exact same code with cuDF that we did with pandas.

A

We do it with cuDF. It's the same code, except we use a cuDF data frame, the CUDA data frame. And we're using cuDF... actually, I apologize, I'm using a version ahead of the stable release. We release nightly versions as well. Every time we make a new change, there's a new version that you can get if you want the tip-of-the-spear development version. I apologize for not pinning mine to 0.13, but all the features will work in that.

A

But the point is, it looks the same. The only way you'd know this data frame was on the GPU is by looking to see that it's not a pandas data frame, it's a cuDF data frame. And so we can do things to it. As before, we can get the sum.
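A rough sketch of the "same code, different library" idea. The alias `xdf` and the fall-back to pandas are assumptions added here so the snippet also runs on a machine without RAPIDS; the talk itself simply imports cuDF.

```python
try:
    import cudf as xdf    # GPU data frames, if RAPIDS is installed
except ImportError:
    import pandas as xdf  # CPU fallback exposing the same calls

# Identical construction code for either library.
df = xdf.DataFrame()
df["key"] = [0.0, 1.0, 2.0, 3.0, 4.0]
df["value"] = [float(i * 10) for i in range(5)]

print(type(df))  # the type is how you can tell which library backs the frame
key_sum = float(df["key"].sum())
print(key_sum)
```

Checking `type(df)` is the same trick mentioned in the talk for telling a cuDF data frame apart from a pandas one.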

A

The syntax is the same. This is important because it allows you to be productive, and sometimes it also allows you to just drop-in replace some of your existing code and put it on the GPU. It reduces the cognitive burden and lets you focus on the actual workflows. So now we're going to go through the basics of cuDF, and there are some exercises, which we will go over for those who have not yet had a chance to look at this.

A

The solutions to the exercises are in the notebooks. So when you do this later, or if you do it later, try not to cheat and look at the solutions, but they are there if you need them. And so you can do things like create series. A series is an individual column of data. Just like pandas, you can do it with nulls, and you can do it without nulls. cuDF supports an extensive list of operations that fully support nulls. You could also create data frames like before.

A

You can think of a data frame as, well, this one is just three separate columns. There are three columns here, in this case just some integer columns. You can also create a data frame in a different way. I see there may be some questions. Laurie, if there are any questions that you think are worth stopping for, please just...

A

Um, happy to do that. Feel free to use your judgment, whatever you think.

B

All right, well, great.

A

So if you noticed here, these data frames are being created on the CPU, right? This is a standard Python operation; we're creating this on the CPU. We can also create this directly on the GPU by using other tools, and we'll show an example of that. But keep that in mind. Now, you might already have a data frame. Maybe you have a pandas data frame on the CPU and you want to make that a cuDF data frame. You can do that too.

A

cuDF provides a from_pandas API that lets you take a pandas data frame and put it on the GPU in the same way. Now, just like you would do with pandas, you can do things like call head. Note that I'm using print, but there's no need to use print here. It just depends on different people's machines, and it's more consistent to use print. You can use head to get the first two rows or the first five rows. You can do a sort.

A

This is the same API as pandas, for those of you who are aware. You can sort by one of these columns, in this case column B. I can also decide if I want to sort ascending. In this case I didn't want to sort ascending, but by default you will; that's consistent with pandas. Now, just like pandas as well, you can do things like select columns. You can select rows, you can select all sorts of things, and you can use this syntax, for those of you familiar with Python.

A

This is the getitem syntax. Essentially, this is using the getitem operator of this data structure. This is a canonical Python data structure: there's a Python method of a class, and it defines this protocol, which is actually called the getitem protocol, and it lets you grab things from this data structure. So this is column a. We can also grab column C.
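A small sketch of head, sort, and column selection. It is written against pandas so it runs on a CPU; the talk's point is that the same calls work on a cuDF data frame. The three-column frame (a ascending, b descending, c ascending) is an assumed stand-in for the notebook's data.

```python
import pandas as pd

# Assumed example data: 20 rows, with b running in reverse.
df = pd.DataFrame({
    "a": list(range(20)),
    "b": list(reversed(range(20))),
    "c": list(range(20)),
})

first_two = df.head(2)                               # first two rows
by_b_desc = df.sort_values(by="b", ascending=False)  # ascending=True is the default
col_a = df["a"]                                      # __getitem__ returns one column as a Series

print(first_two)
print(by_b_desc.head())
print(col_a.head())
```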

A

We can also select by label with .loc. This is the same kind of API as pandas. In fact, it is actually the same API, and it lets us grab specific rows and specific columns by name. If we were so inclined, we can also grab by position only; we don't have to grab by name. You can get the first row. Notice that, like pandas, the first row here is returned as its own column. That's consistent with pandas. You can also grab the first several rows of the first two columns.

A

This is by position rather than by name, and you can also use direct access. It's generally better to use these, .iloc, the index location, and .loc, but you can use direct access. And so, as an exercise (actually, I think I may have already put in the solution to this one, I apologize), you can try to select only the rows at index 4 and 9. And so you can do this.

A

You might want to pass 4 and 9 like this, but you'll get an error, because it's expecting this to be inside of a list. That's similar to pandas, and that's going to give you those two rows. You can also do more complicated things, like create filters on data sets by using boolean indexing, and what that essentially does is let you filter. So in this case we're gonna say, I'd like all of the rows in which the value for column B is greater than 15.
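A sketch of label-based and position-based selection, including the "rows 4 and 9" exercise. It uses pandas and an assumed 20-row frame; the talk describes the same accessors on cuDF.

```python
import pandas as pd

# Assumed example data, matching the shape described in the talk.
df = pd.DataFrame({
    "a": list(range(20)),
    "b": list(reversed(range(20))),
    "c": list(range(20)),
})

by_label = df.loc[2:5, ["a", "b"]]  # rows labeled 2..5 (inclusive), columns by name
first_row = df.iloc[0]              # one row, returned as a Series
top_left = df.iloc[0:3, 0:2]        # first three rows of the first two columns
rows_4_and_9 = df.iloc[[4, 9]]      # several row positions must be wrapped in a list

print(rows_4_and_9)
```

In pandas, writing `df.iloc[4, 9]` without the inner list is read as "row 4, column 9", which is why the bare pair fails on a three-column frame.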

A

That's what we got, as expected. If you were to take this out, this would create that boolean index: it's true for these, it's false for these, just like in the pandas world. There's also a query API. You can use the query API instead of doing things like this. You can say, I want things where B equals 3. You can also pass variables to this. If you have a local variable defined, you can pass it in, and this is something that is generally pretty useful.
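A sketch of the boolean mask and the query call, shown with pandas on assumed data. The variable name `threshold` is hypothetical, and it is passed through `local_dict`, which both pandas and cuDF accept on `query`.

```python
import pandas as pd

# Assumed example data: b runs from 19 down to 0.
df = pd.DataFrame({
    "a": list(range(20)),
    "b": list(reversed(range(20))),
    "c": list(range(20)),
})

mask = df["b"] > 15   # boolean Series: True where b > 15
filtered = df[mask]   # keeps only the True rows

eq3 = df.query("b == 3")  # same idea, written as a query string

# Referencing a Python variable inside the query with the @ prefix.
threshold = 3
eq_var = df.query("b == @threshold", local_dict={"threshold": threshold})

print(filtered)
print(eq_var)
```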

A

You can also do this with the local_dict keyword. You could pass this as a keyword argument to query with local_dict, and you would just pass it there. You could rename this something else, and so this lets you do all sorts of operations. The standard boolean operations are also supported. And so, as an exercise, I will go through this: trying to select the rows in the data frame where the value in B is greater than C plus 6.

A

So I want to select where df.b is greater than df.c plus 6, and there we go. Once we get down to here, this is 6, and 6 plus 6 is 12. Once we got to 7, it would no longer be true, and so we don't get that as a result. You can see the solution if you're following along in your own notebooks. Missing data is also supported. We can do things like filling in missing values. We can do things like descriptive statistics. I create a series, in this case just from 0 to 9.
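A sketch of the exercise above (rows where b is greater than c plus 6), using pandas and the assumed layout in which b counts down from 19 while c counts up from 0.

```python
import pandas as pd

# Assumed example data: b = 19 - i and c = i for row i.
df = pd.DataFrame({
    "a": list(range(20)),
    "b": list(reversed(range(20))),
    "c": list(range(20)),
})

# Boolean mask built from two columns at once.
selected = df[df["b"] > df["c"] + 6]
print(selected)
```

With this data the condition holds for rows 0 through 6: at row 6, b is 13 and c + 6 is 12; at row 7, b is 12 and c + 6 is 13, so the row drops out, as described in the talk.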

A

We can do things like mean, variance, standard deviation, kurtosis. We can do skew.

A

Sorry, we could do skew. We could do all sorts of different things. There's no skew here, but we can describe the data to get the summary statistics and get the quartiles.
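A short sketch of the descriptive statistics on a series from 0 to 9, shown with pandas; the talk applies the same method names to a cuDF series.

```python
import pandas as pd

s = pd.Series(range(10), dtype="float64")

mean = s.mean()
var = s.var()           # sample variance (divides by n - 1)
std = s.std()           # sample standard deviation
skew = s.skew()         # zero for this symmetric series, up to rounding
summary = s.describe()  # count, mean, std, min, quartiles, max

print(summary)
```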

A

We can also apply functions, and in the next notebook, part of the homework, we will see in more depth how you can do this, but you can natively do a lot of things with just the basics. You can add 10 to every value in a series. Now keep in mind that currently, as of today, cuDF does not support applying custom functions on string columns. It only supports this today for columns that are not strings, so numeric or datetime, but generally numeric columns. But it works, and it's nice.

A

We have full support of string methods, but we don't yet support user-defined functions for strings. So, for example, I encourage you to check out the documentation. I have our home page documentation right here. This is the RAPIDS site, docs.rapids.ai, and there are links to this in the Jupyter notebook to see the cuDF or the strings API guides themselves. But you can do things like lowercase your strings.

A

You can do all sorts of different things, such as uppercase them, and as expected it will follow the same API. So we can uppercase our strings, which look like this, with that .str accessor. This is something that those of you who are familiar with pandas should recognize. This is the string accessor, and it's how pandas exposes its string functionality. cuDF is the same. It's the same API, and there you go, off to the races. Then there are things like concatenation.
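A minimal sketch of the .str accessor, shown with pandas. The sample strings are made up for illustration, with one null included to show that it passes through untouched.

```python
import pandas as pd

s = pd.Series(["Hello", "World", None])

upper = s.str.upper()  # "HELLO", "WORLD", null
lower = s.str.lower()  # "hello", "world", null

print(upper)
print(lower)
```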

A

You can combine columns, noting that by default the indexes repeat; you can force them to be consistent. You can do all sorts of things with them. You can also concatenate them on a different axis if you're so inclined.

A

This should look very familiar for those of you who have used pandas before. You can append, which is essentially just a single type of concatenation. You can do joins and merges. In this case we're just creating some data, we're doing a left join, just like pandas, on the key, and we're getting the result. Note, though, that our order has changed.

A

By default, parallel joins will not preserve order, because that would add an explicit step to the operation, which would hurt performance. You can enforce this, but the default join will not enforce order. So the exercise is to do an inner join. To do an inner join in this case, just like before, you would do df_a.merge with df_b. We're gonna say I want to do an inner join, not a left join, and we're gonna join on this key.
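A sketch of the left join, the inner-join exercise, and the differently-named-keys case, shown with pandas. The frames `df_a`, `df_b`, the renamed `df_c`, and all values are assumptions for illustration.

```python
import pandas as pd

# Assumed example data: df_b shares three of df_a's five keys.
df_a = pd.DataFrame({"key": ["a", "b", "c", "d", "e"],
                     "vals_a": [10.0, 11.0, 12.0, 13.0, 14.0]})
df_b = pd.DataFrame({"key": ["a", "c", "e"],
                     "vals_b": [100.0, 101.0, 102.0]})

left = df_a.merge(df_b, on=["key"], how="left")    # keeps all five rows of df_a
inner = df_a.merge(df_b, on=["key"], how="inner")  # keeps only the matching keys

# A GPU join may return rows in any order, so sort if the order matters.
inner = inner.sort_values("key")

# When the key columns have different names, name each side explicitly.
df_c = df_b.rename(columns={"key": "key_b"})
inner_named = df_a.merge(df_c, left_on="key", right_on="key_b", how="inner")

print(inner)
```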

A

Now, if the keys were different, we'd have to use a different syntax. Note that the inner join does not return the other values that didn't actually have a collision. It only returns the ones that collided, the three rows. If the keys were different, we'd have to use the left_on and right_on arguments. In this case the keys are the same, so it doesn't actually matter, but if they were different, if this was key...

A

Five, key four, and this was key three, then we'd have to do it like this. That's a join. Like pandas, we also support groupbys and this sort of split-apply-combine paradigm. For those of you who use R, you may be familiar with this. Hadley Wickham has done a lot of really good research and infrastructure development around this kind of design-oriented data flow, and it's a really fantastic paradigm, and we support this fully. You can do things like aggregations.

A

Do groupbys, call sum. You can do more complicated groupbys where you want to do things differently to different columns. You can also do multiple things to this column by putting them in a list, and I could do count here. As you can see, it's quite comprehensive. Oh well, I guess I sort of got ahead of myself, but you can do multiple at once. You can also group by multiple columns.

A

So if I were to take this and run this, what I could do is put this in a list, and I could also group by the second aggregate column, and now I'm gonna have a hierarchical groupby. Again, this should look very familiar, but it's very useful, and it's also fully on the GPU.
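A sketch of the three groupby shapes mentioned: a plain sum, different aggregations per column, and grouping by two columns. It is written with pandas; the grouping columns `agg_col1` and `agg_col2` and the data are assumptions.

```python
import pandas as pd

# Assumed example data with two grouping columns.
df = pd.DataFrame({
    "a": list(range(20)),
    "b": list(reversed(range(20))),
    "c": list(range(20)),
})
df["agg_col1"] = [1 if x % 2 == 0 else 0 for x in range(20)]
df["agg_col2"] = [1 if x % 3 == 0 else 0 for x in range(20)]

# One aggregation applied to every column.
sums = df.groupby("agg_col1").sum()

# Different aggregations per column; a list applies several to one column.
mixed = df.groupby("agg_col1").agg(
    {"a": ["max"], "b": ["mean", "count"], "c": ["sum"]}
)

# Grouping by two columns gives a hierarchical (two-level) index.
hier = df.groupby(["agg_col1", "agg_col2"]).sum()

print(mixed)
print(hier)
```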

A

Now, we also have time series work that we support, which is great. The time series work is using the GPU-based datetime variable and datetime data structure, and so this is a cuDF data frame that we put a pandas date range into. We could have created this on the GPU, but just for showing it... I'm sorry, we can look at the dtypes here. We've got a datetime nanosecond-precision column, and of course we have a float column.

A

cuDF fully supports this, and so if we want to do things with this, we can use the query API or use .loc or all sorts of different things. If we want to do this query, for example, we want to query only the rows with a time before this date. Well, how would we do this? If we used query, would we possibly run into a problem here? We could say I want the dates that are before 2018-11-23, but remember, we don't support this.

A

We don't support certain types of operations on strings for custom things like this. The query API will fail here. What you could do is say, okay, well, I'll use .loc, the .loc API, and I'll say I want date to be less than 2018-11-23.

A

Ah, sorry. And so here again we reach the problem: we can't do a datetime operation with a string yet. So what we do instead is use the timestamp operation. If we do pandas Timestamp, we can see this object is gonna create a datetime object. And so now, if we took this string right here and wrapped it...

A

Now we have a timestamp.

A

With this time filter, now we know how to handle this, and we're off to the races. This is an example of how we can naturally support interaction between datetime variables and datetime columns, all sorts of things like that. There's also, of course, like pandas, more detailed things: you can take the minute, you can get the second, the hour, etc. There's a lot of them. I won't go through the whole thing.
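A sketch of the datetime filter using a Timestamp rather than a raw string, shown with pandas. The start date, number of rows, and column names are assumptions chosen so that the cutoff from the talk (2018-11-23) falls inside the range.

```python
import pandas as pd

# Assumed example data: one row per day starting 2018-11-20.
dates = pd.date_range("2018-11-20", periods=72, freq="D")
df = pd.DataFrame({"date": dates,
                   "value": [float(i) for i in range(72)]})

print(df.dtypes)  # date is datetime64[ns], value is float64

# Compare the datetime column to a Timestamp object, not to a string.
cutoff = pd.Timestamp("2018-11-23")
before = df.loc[df["date"] < cutoff]

# Component accessors live under .dt
days = df["date"].dt.day
hours = df["date"].dt.hour

print(before)
```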

A

You can do all of them, which is great. Now, some of you have probably used CuPy. CuPy is the GPU array library with a NumPy-consistent API, and cuDF and CuPy play together incredibly well. In fact, they play together without having to make copies, so you can go between worlds without having to take time copying data in between. For any data frame or a column, a series, you can call the .values API, just like in pandas, to go to a CuPy array.

A

In pandas, this would go to a NumPy array, and in cuDF it goes to a CuPy array.

A

Now, you can also go to pandas. You can call to_pandas and it will just put your data frame on the CPU as a pandas data frame, being consistent with your data types, with your null handling, with all sorts of things. You can also go to NumPy if you want to; you can call as_matrix. You can also do this by going to CuPy and keeping things on the GPU.

A

But if you need a NumPy array, you can do that as well, and you can also send a series to a NumPy array with to_array. Now, I mentioned before that there's GPU-accelerated writing.

A

This is only one small example. I'm using CSVs, but again, you can use Parquet files or all sorts of things, JSON.

A

This is gonna send our data frame to a CSV. You can see that I called it foo. We've got foo; it's just that same table we have.

A

We can read that file, and it's the same API as pandas. And so that's it, that's sort of the intro to cuDF, and for the next 20 minutes we'll go into the user-defined functions section. But before we do that, I want to show an example of what the performance can really mean. So if we had some random values, in this case 10 million, and we make a data frame both in the pandas world and the cuDF world, and call sum on this for pandas...

A

On one of the columns, it's not too time-consuming. In this case we're using timeit to get a good estimate. Pretty fast: 16 milliseconds. For the GPU, 300 microseconds. That's quite a bit faster, and in this case perhaps it's not so significant, because if you only need to call sum once, 16 milliseconds is probably okay. But think about a more realistic example. Often we're doing operations many times, and so in this case we're gonna simulate some sensor data.

A

Sensor data is relevant for a lot of workflows, and it's actually particularly relevant because it's the kind of data we're gonna be working with later in one of the NERSC workflows. In this case we're gonna simulate some sensor data with pandas. We're going to imagine we have a time series with one day's worth of data, 2019 August, sorry, July 5th through the 6th, with a value for every millisecond, and then we're gonna extract the hour, extract the minute, print the shape, and get the data's head. And so that took 20 seconds.

A

It took 20 seconds. It's a lot of data, it's 86 million rows. That's a lot of data, but it's only one sensor's worth of data from one day; it's potentially very small. And a groupby on this, where you say I want the max value on an hour and minute basis, takes four seconds. Imagine doing this for all of your sensors. If you had a fleet of sensors, this would take forever. Well, with cuDF we can do this much faster.

A

In this case we're actually gonna run the same code, except just put it in a cuDF data frame. We're doing the data generation on the CPU, but we're gonna just put it on the GPU and show you how it's far faster to do all this processing on the GPU. In this case we actually added a thing that we didn't do the first time: we added seconds. It still took only two seconds compared to 20 seconds. The groupby took 52 milliseconds versus four or five seconds. That's a hundred times faster.

A

This is just one example of a groupby; it's a hundred times faster. Other groupbys might be 50 times, others might be three hundred times faster. It's gonna depend, of course, and so I encourage you to play around to get a sense of how this works. But at a high level, that's cuDF. It is a fully featured GPU-based data frame library. And Laurie, perhaps we should take some questions now before we move on to the next section.
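The figures quoted in this example can be checked with a little arithmetic. The timings below are the ones spoken in the talk (four to five seconds versus 52 milliseconds); nothing here was re-measured.

```python
# One value per millisecond for one day.
rows_per_day = 24 * 60 * 60 * 1000
print(rows_per_day)  # 86400000, the "86 million rows"

# Groupby timings as quoted in the talk, in seconds.
cpu_groupby_s = (4.0, 5.0)
gpu_groupby_s = 0.052

speedups = [cpu / gpu_groupby_s for cpu in cpu_groupby_s]
print([round(s, 1) for s in speedups])  # roughly 77x to 96x
```

So "a hundred times faster" is a fair round number for the upper end of the quoted range.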

B

Okay, so this question is about the details of boolean indexing, and I can't really read the example he posted in chat, but he wants to know: can you search a data frame for the header?

A

So perhaps by the header, do you mean... Ralph, if you want to clarify, perhaps in the text, we can come back to this later. Do you mean perhaps the column type, or things like that?

B

The column name is B, can you search? Yes, I mean, okay, you can take a look at the chat, or maybe Ralph can unmute and ask.

C

Yeah, I can unmute. So I'd do, like, df, open parenthesis, and then b, close parenthesis, is larger than df c plus 6, for example, and then use this result as an index inside the bracket operator. Yes?

A

You can. I'm sorry, you want to add...

C

A larger-than, and then go to the very beginning of the line, df, open bracket, and then close bracket. Use that instead of the query, yeah.

A

This would absolutely work. Anything that generates a boolean mask can be passed to this. This will work.

C

Is there a best practice, or a drawback, for using one over the other?

A

There's no real drawback in this, but in general, and there's a reason in the pandas world that applies here as well, it's better to be explicit and use .loc when you do boolean indexing, because often what happens is we're gonna do a series of boolean indexing operations, and then in the pandas world you might see this warning.

A

Perhaps many of you have seen it. It says SettingWithCopy, and it's not an error, it's a warning that tells you you're setting with a copy. That's a result of using boolean indexing to create a new view on a data set versus a copy and then doing operations, and it can cause unexpected results. So while this does work, and it will of course behave correctly, it often is more robust to use this specific accessor.
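A sketch of the more explicit .loc pattern recommended in that answer, shown with pandas on assumed data. The replacement value -1 is arbitrary.

```python
import pandas as pd

# Assumed example data.
df = pd.DataFrame({
    "a": list(range(20)),
    "b": list(reversed(range(20))),
    "c": list(range(20)),
})

mask = df["b"] > 15

# Reading: both forms select the same rows.
plain = df[mask]
explicit = df.loc[mask]

# Writing: select rows and column in a single .loc call, so the assignment
# goes to df itself rather than to an intermediate filtered object.
df.loc[mask, "c"] = -1

print(df.head())
```

Chaining a filter and then an assignment (for example `df[mask]["c"] = -1`) is the pattern that typically triggers the SettingWithCopy warning in pandas.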

C

Okay, thank you.

B

Thanks, Ralph. That was the only question, so if anyone else has questions, feel free to ask. Otherwise we can continue. Great.

A

So we saw briefly in one of the sections of the notebook that we have a user-defined function we could make.

A

We could do that applymap on a series and things, but if we want to do a more complicated function, it's a little bit harder. I'm not going to go through all of the reasons why GPUs are fast here. We've heard a little bit at a high level already: there are many threads, many cores doing lots of work independently and then coordinating. This notebook has a pretty decent overview of what's actually happening when you write a custom function.

A

What does that mean in CUDA under the hood? I'm not gonna spend the time now, but I encourage you to read it. It's fairly short, but it's fairly informative, I think.

B

And I'm just wondering, why wouldn't you call Numba directly? Why would you want to call it via RAPIDS?

A

Great question. So you can do both. In this notebook we actually link out to our docs that have an example of using Numba directly. Well, it's somewhere, but we have some docs that show how you can use Numba directly as well, and here is the reason you'd potentially want to use Numba directly.

A

If you need to do things that don't naturally map to the data frame world, you would want to use Numba directly. If you can map your functions and you want to have it just work in the data frame world and put your things in a data frame by default, this is a nice, convenient API, but you can of course use Numba directly.

A

Great. So, essentially, to create a user-defined function, you do have to rely on Numba. For those who are not familiar, Numba is a just-in-time compiler for Python code that transforms it into an intermediate representation that then will run much faster, because it's, I believe, LLVM-compiled. For the GPU, it eventually gets compiled down to something called PTX code, and we use this under the hood for the apply_rows API, which lets you do things like write a function, which we call a kernel.

A

You choose some columns to operate on, name the output column, and you can even pass values into it. There's a lot of stuff you can do here. I'm not gonna go through all the exercises, but I will go through one, and mention that this has actually been wrapped up into a new library that we have in the RAPIDS world called cuSpatial, for spatial data analytics. This haversine distance function is fairly commonly used.

A

This has been wrapped up into a nice API, as well as many other things like points and polygons, trajectory modeling, and all sorts of stuff. So this exercise is to calculate the haversine distance between two points, for all the points, and it's a fairly complex algorithm, a lot of steps.

A

So to do this, you would just run this. It's actually pretty straightforward. We're gonna define the columns we care about in this function, and then we're gonna have our output column. We took this example from a Stack Overflow post by someone named Michael Dunn. In order to do this, you have to enumerate and loop through your columns, and so we're looping through the four columns and we're keeping track of our index.

A

GPUs rely on CUDA threads, and these CUDA threads need to be operating in the right place and send the data to the right specific index in the output, and this API handles it for us. So to Laurie's question of why you might want to use Numba directly versus this: it's different with Numba directly.

A

You need to be a little more explicit in making sure that your threads are writing to the correct spot. It's not hard, but you need to be a little more explicit than just looping through, and that's one advantage of doing it like this: we've taken care of that for you. But you can see that we're just doing some math. This algorithm is just math, and we've imported the standard math functions from Python's built-in math library. We're doing some math. We are printing some stuff out just for visual's sake.

A

We don't need to do this. And then we're writing our output, and each thread will write independently, in parallel, to this column. We'll run it, and we can see that we have the output, and we know it's correct. We had print statements in our code. These did not print here because this print is actually running on the terminal from which this was launched, which is not shown on the screen, but it would show us that for every single one of these iterations that are happening in parallel, a lot of things are happening.
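A CPU-only sketch of the row-wise haversine kernel described above, written with plain Python lists so it runs without a GPU. It keeps the same shape as the kernel in the talk (input columns, an output column, and an enumerate loop writing `out[i]`); the function name, variable names, and test coordinates are assumptions. In the notebook, a function of this shape is handed to cuDF's `apply_rows`, which compiles it with Numba and assigns rows to threads.

```python
from math import asin, cos, pi, sin, sqrt


def haversine_distance_kernel(lat1, lon1, lat2, lon2, out):
    """Great-circle distance in km for each row, written into out[i]."""
    for i, (x_1, y_1, x_2, y_2) in enumerate(zip(lat1, lon1, lat2, lon2)):
        # Convert degrees to radians.
        x_1 = pi / 180 * x_1
        y_1 = pi / 180 * y_1
        x_2 = pi / 180 * x_2
        y_2 = pi / 180 * y_2

        dlat = x_2 - x_1
        dlon = y_2 - y_1
        a = sin(dlat / 2) ** 2 + cos(x_1) * cos(x_2) * sin(dlon / 2) ** 2
        c = 2 * asin(min(1.0, sqrt(a)))  # clamp guards against rounding above 1

        r = 6371  # radius of the Earth in kilometers
        out[i] = c * r


# Two assumed test rows: identical points, and half-way around the equator.
lat1, lon1 = [0.0, 0.0], [0.0, 0.0]
lat2, lon2 = [0.0, 0.0], [0.0, 180.0]
out = [0.0, 0.0]
haversine_distance_kernel(lat1, lon1, lat2, lon2, out)
print(out)
```

On the GPU each thread handles its own subset of `i` values, which is why every row must be computed independently of the others.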

A

In this case the array size was four. That was automatically created for us by cuDF's Numba-based UDF compiler. We had one block, we had 64 threads per block, and it just gives us information about this.

A

We have two blocks shown, or I guess we have a few blocks shown, but you can find more information about what's going on here, and apply_rows handles it all for us. So the exercise is to modify this to pass in the radius of the earth as a keyword argument. We saw that right here we passed the radius of the earth, 6371, but if we were on Mars, this would be completely incorrect.

A

I don't know off the top of my head what the radius of Mars is, but we could do this in the same way by passing in a keyword argument. So how do we do that?

A

Well, we look back up to where we were, and we saw that we can pass keyword arguments into this function and our kernel will just grab them. So we can pass them into the kernel function, just like here, kwarg1, kwarg2. So I'm gonna do that. I'm gonna take the same process and I'm going to say radius.

A

And I'm not gonna use this; I'm gonna comment this out. I'm gonna say that R equals radius.
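A plain-Python sketch of how an extra keyword argument reaches a kernel. The helper `apply_rows_sketch` and the toy `scale_kernel` are hypothetical stand-ins written for this illustration; they only mimic the idea that a dictionary of keyword arguments is forwarded to the kernel, and they are not cuDF's implementation.

```python
def scale_kernel(x, out, factor):
    """Toy kernel: one input column, one output column, one extra argument."""
    for i, value in enumerate(x):
        out[i] = value * factor


def apply_rows_sketch(func, incols, out, kwargs):
    """Hypothetical stand-in: forward columns plus keyword arguments to func."""
    func(*incols, out, **kwargs)
    return out


values = [1.0, 2.0, 3.0]

# Supplying the keyword argument the kernel expects.
scaled = apply_rows_sketch(scale_kernel, [values], [0.0] * 3,
                           kwargs={"factor": 5000.0})
print(scaled)

# Leaving the dictionary empty fails, because the kernel still expects it.
try:
    apply_rows_sketch(scale_kernel, [values], [0.0] * 3, kwargs={})
    missing_kwarg_error = None
except TypeError as err:
    missing_kwarg_error = err
print(missing_kwarg_error)
```

The second call mirrors the first error seen in the live demo, where the kernel had gained a `radius` parameter but the keyword dictionary was still empty.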

A

So we've got this, and now I have to execute it like before. I'm gonna execute it in the same way, so I'm just gonna copy this to save time.

A

Right now, you saw before that I was passing an empty dictionary. I had no keyword arguments. But now, when I pass this, there's an error. Why is there an error? We're expecting a keyword argument, so I've got to fill it in. So I'm gonna fill this in with radius, and maybe I'll call it...

A

Here, let's say 5,000, why not? We can pass it in and, oh no, we got a typing error. Why did we get a typing error? Oh.

A

Sorry, it's a lot of scrolling. A lot of times we're gonna see typing errors when we use Numba. In this case we have an issue with the wrong type of argument being passed, and so right here it's expecting something slightly different. Perhaps it's just a little bit finicky, or perhaps I made a subtle...

A

I mean, a subtle typing error earlier, but in general we can do it like this, and for now I will just grab the...

A

The little solution. Perhaps I actually have to type it, pass it in explicitly, unless I do it differently, and, well, maybe not.

A

Well, for now I'll just take the solution, but there's a very subtle error that I'm likely making somewhere, and I apologize for that.

B

With my experience.

B

Yep, okay, but we do have a question from Alex, so maybe you could answer. He says: when the haversine distance kernel is invoked, the grid and the block dimensions are not specified. Can we invoke the kernel and explicitly define the number of threads in a block and the number of blocks in the grid?

A

Great question. Yes, we can. That would be a great reason to use Numba if you wanted to, but apply_rows does all that for you. It's automatically gonna try to optimize that under the hood. We do expose it with apply_chunks. apply_chunks is going to let you handle specific things. You can do things like, what are the threads per block?

A

What are the block counts? I hope you can see this stuff; it was a little hard with all the prints. But you can do things like controlling the threads and the blocks and all that kind of stuff with the same API if it's with apply_chunks, and we have a section about this right here, in fact, in the notebook.

A

It goes through asking you how to set that, and you can see that when we do this down below. In the interest of time, I'll skip to this. There's another exercise that is essentially the same as the one above, but in this case we can run the same function with apply_chunks, and we can set chunks to 16 and eight threads per block. It will of course run, but it will have run with these chunks and these threads per block, and we can also set the block count.
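To make the chunk and thread settings concrete, here is a rough CPU emulation in plain Python. It is a sketch under stated assumptions, not cuDF code: `apply_chunks_cpu` and `add_one_kernel` are hypothetical names, the data is made up, and it assumes that a chunks value of 16 means chunks of 16 rows. Each emulated thread walks its chunk with the same striding pattern a Numba CUDA kernel would write as `range(cuda.threadIdx.x, size, cuda.blockDim.x)`.

```python
def add_one_kernel(in_col, out_col, tid, block_dim):
    """Work done by one emulated thread over one chunk."""
    # Thread `tid` handles elements tid, tid + block_dim, tid + 2 * block_dim, ...
    for i in range(tid, len(in_col), block_dim):
        out_col[i] = in_col[i] + 1.0


def apply_chunks_cpu(kernel, in_col, chunk_size, tpb):
    """Hypothetical stand-in: split the column into chunks and run the
    kernel once per emulated thread within each chunk."""
    out_col = [0.0] * len(in_col)
    for start in range(0, len(in_col), chunk_size):
        chunk_in = in_col[start:start + chunk_size]
        chunk_out = [0.0] * len(chunk_in)
        for tid in range(tpb):  # sequential here; concurrent on a GPU
            kernel(chunk_in, chunk_out, tid, tpb)
        out_col[start:start + len(chunk_in)] = chunk_out
    return out_col


# 40 made-up rows split into chunks of 16, 16, and 8 rows.
data = [float(x) for x in range(40)]
result = apply_chunks_cpu(add_one_kernel, data, chunk_size=16, tpb=8)
```

Changing `chunk_size` or `tpb` changes how the work is divided but not the result, which is the point of exposing those settings separately from the kernel body.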

A

Does that answer the question? Hopefully, Alex.

A

We'll assume it does. So there's about five minutes left in this segment.

A

The next piece is apply_grouped. This is a way you can apply the same kinds of user-defined functions on a group-by basis.

A

Now this is a little bit more advanced, so perhaps it's not necessary for a lot of people, but conceptually I always want to show that you can do it, and I encourage you to take the time to go through it. You can do things like take the rolling average of a group. We can write our function, and this might look a little bit more like a Numba function.

A

For example, you can see that I'm explicitly going through the thread index, the total size of the group, and the block dimensions, and you can do this to basically just do a group-by-based user-defined function. Now, in this case it was a rolling function, and it was because I was going to take a window size of three that we got these missing values. That makes sense: with a window size of three, by default

A

we're gonna have results only when there are at least three values. And so I encourage you to spend more time after this to really go through these notebooks and see what's going on here. The group-by functionality is a little bit more complicated, as you can see, but it's incredibly powerful as well. Okay.
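The grouped rolling average described here can be emulated on the CPU to see where the missing values come from. This is a plain-Python sketch, not the notebook's kernel: `apply_grouped_cpu` is a hypothetical stand-in, `tid` and `block_dim` play the roles of the thread index and block dimension, and the keys and values are made up.

```python
import math


def rolling_avg(val, avg, tid, block_dim, win_size=3):
    """Work done by one emulated thread over one group's values."""
    for i in range(tid, len(val), block_dim):
        if i < win_size - 1:
            # Fewer than win_size values are available: no result.
            avg[i] = math.nan
        else:
            avg[i] = sum(val[i - win_size + 1:i + 1]) / win_size


def apply_grouped_cpu(kernel, keys, values, tpb):
    """Hypothetical stand-in: run the kernel separately on each group."""
    groups = {}
    for pos, key in enumerate(keys):
        groups.setdefault(key, []).append(pos)
    out = [math.nan] * len(values)
    for positions in groups.values():
        val = [values[p] for p in positions]
        avg = [0.0] * len(val)
        for tid in range(tpb):  # sequential here; concurrent on a GPU
            kernel(val, avg, tid, tpb)
        for p, a in zip(positions, avg):
            out[p] = a
    return out


# Made-up data: two groups with four and three rows.
keys = ["a", "a", "a", "a", "b", "b", "b"]
values = [1.0, 2.0, 3.0, 4.0, 10.0, 20.0, 30.0]
rolled = apply_grouped_cpu(rolling_avg, keys, values, tpb=2)
```

The first two rows of every group come out as NaN because the window restarts at each group boundary, and the `if` inside the loop is the branch that produces them.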

B

We have a question about the rolling average kernel that you showed. Alex says: you have a branching condition, which could lead to warp divergence. Does the code take care of warp divergence automatically?

A

The answer is that it depends and it's complicated, but by default it probably will not take care of warp divergence. It depends how it's gonna get optimized.

B

Is there any way that you, as a user, would know?

A

Likely yes, but not without adding some configuration, and so I'd be happy to chat about that after this. But by default you wouldn't get a warning about that, if that's what you're asking.

A

Great. Just as a note, it's not part of this, but with this DataFrame we did a rolling window calculation here to show that we could do it. We could just do a rolling window. We support rolling, and so DataFrames on the GPU support rolling.

A

So if we wanted to do this, and in this case it's not a group by, we could do rolling with a window of three and take the mean, and in this case you'd see 48 50 144. Now note that this is not a group by. With a group by,

A

you have to use the group-by API for now, but we do support a fully featured, or at least deeply featured, rolling window API to do things, and you can also apply user-defined functions to rolling windows. If you're really interested in user-defined functions, I highly encourage you to take a look at this guide to UDFs that we've put together. There's a lot of different stuff involved in user-defined functions and a lot of different opportunities for significant speedups with complex code,

A

and there's an example of using a user-defined function and applying it onto a rolling window, which I encourage you all to look into. So hopefully this has been a useful 45 minutes for those of you who have not yet had a chance to work on the notebooks, to get a sense of what they contain and give you a sneak peek of what you're gonna see when you go through the homeworks. For those of you who have gone through them,

A

hopefully this has been useful to get some more context and see it as well.

A

And with that, I don't know if there are any questions.

B

So you can go ahead and move on to cuML. We're right on schedule. Great.