From YouTube: Kubernetes & Cloud Native Berlin Meetup February Edition
Description
Welcome to the live stream of the Kubernetes & Cloud Native Berlin Meetup - Feb 2023. Doors open for the in-person meetup at 5 pm. The talks will begin at 6 pm, so stay tuned.
Find more information here: https://www.meetup.com/berlin-kubernetes-meetup/events/291177583/
About this meetup: We are a group for people interested in discussions around working with, running, and developing Kubernetes and other cloud native technologies. We're excited about container infrastructure and distributed systems, and about learning more about managing and extending them.
A
Hello, good evening, hi everyone. Please gather around, we'll begin shortly. The stream has already begun, though. I'm Benazir Khan, the program coordinator here, and we've been putting up a few of these meetups. It's so nice to see today's turnout, and so lovely to see so many of you back. Let's kick off the February edition of the Kubernetes and Cloud Native Berlin Meetup.

First up we have Chris, who will open with some remarks. Chris has been doing meetups in the community for a very, very long time. He'll touch upon a little bit of the history of the meetups: why we do these meetups, the programs that we try to have, and all of that. I'll let Chris begin.
B
I won't go too deep into why; it's just fun, and we usually learn stuff. That's why we do it, right, and to get everybody together. But I just want to, you know, because we restarted these things since COVID has, quote-unquote, ended, and now we're able to get back together.
B
So I wanted to give a history of that, because we've also had some changes in the last few months, and I'd like to address those and talk about how we got here. There's kind of an interesting history to the Kubernetes meetups. We did our first ones in 2015. Talking about Kubernetes in 2015: how many people had heard of Kubernetes in 2015? Probably very few. I mean, Jérôme, come on, you...
B
He was with the team who created Docker, so he knows this stuff. We first actually started the CoreOS Berlin Meetup in February 2015. I hope you like the colors. We were working with CoreOS, and we were building this container runtime called rkt, which was kind of a competing container runtime that CoreOS had started.
B
Competing with Docker, of course. They hired us to work on that at the end of 2014, and we were basically some of the folks...
B
The founders of Kinvolk were basically three-fourths of the rkt team at the time, and we worked on that for about two years in total. So we were very close to the CoreOS team, and we started the CoreOS Berlin Meetup, and it really got started with a bang. Our second meetup, in fact, was...
B
So then we continued doing that; we were basically covering CoreOS and related topics, but it became kind of obvious that Kubernetes was becoming a bigger thing and we should probably start a Kubernetes Berlin Meetup. So in summer of 2015, I think it was August...
B
I was looking to grab the Kubernetes Berlin Meetup so that we could start doing meetups there, but alas, I think it was five days before that, somebody had grabbed it. Those are folks who actually later became our friends, but when it happened we were like, oh man, those guys, they're in Washington State. Well, actually they were mostly in the US, not only Washington State, but they had grabbed a bunch of the Kubernetes meetups, and they actually ran them very well.
B
They got a bunch of local people, but we were still kind of annoyed by that. So, no Kubernetes Berlin meetups for us. But the folks who were doing it basically started their first ones in January.
B
2016 is when they had their first actual event, and they continued that until, I mean, the present, basically. But I put fall 2022, because that's when the two meetups merged, which I'll get into later. So, because the Kubernetes one was taken, and we actually wanted to do this locally and have the Berlin meetup be led by Berlin people...
B
We chose to start the Cloud Native Computing Berlin meetup, which we thought, okay, that includes Kubernetes, because it was also becoming obvious that Kubernetes was creating an ecosystem and it wasn't only about Kubernetes. So we started the Cloud Native Computing Berlin Meetup. We carried on with the CoreOS one for a while and did very much related things, but we eventually created this one and, going forward, pretty much exclusively held events under the Cloud Native Computing Berlin Meetup umbrella.
B
Of course COVID came and all the meetups stopped, except we did do remote meetups, so we did carry on with those. But I was living in the US at the time, so I wasn't super involved with these; the rest of our team did carry on, no matter how much they missed the in-person aspect of things.
B
But in mid-2022 we were looking to start things up, and it turned out that we had actually lost control of all of our meetups, because there was such a long pause. If you basically don't log into your account for a long time, and you don't respond to emails, which we did not see, the meetup actually gets assigned to other people. So we were like: oh, we lost access to a few of them.
B
We also lost access to this one, and it turns out, when we reached out to the Kubernetes folks, they had lost access to theirs as well. So we were both in the same situation. I contacted the person who picked up the Cloud Native Computing one; funnily, I think he was in Stuttgart, but a very nice guy. He said: yeah, I just picked it up because I saw it was available. And so he gave it back.
B
So we had that one. For the Kubernetes meetup one, we saw who had it and were pretty sure we could work it out; he was a nice guy, and he's sitting right here. We had lunch, and he was very happy to say: hey, let's do this together. But we also wanted to talk to the...
B
...Kubernetes Meetup folks, and so we basically all got together and made the plan to merge the Cloud Native Computing Berlin Meetup and the Kubernetes Meetup. That's why the Kubernetes and Cloud Native Berlin Meetup exists today, and that's where we are now. So that's basically the story. It's not really that much drama; it was just, you know, people who don't check their emails enough.
B
If that's drama for you. So yeah, I just thought we'd give a little bit of history. Anybody have any questions, actually? There might be a few questions in there; maybe I messed up or missed out on things.
B
It was obvious in summer of 2015 that Kubernetes was going to be something, and a bunch of companies were basically forming startups around it. They were just kind of early, and they realized that having these meetups would allow them to get their company in front of a lot of eyes. So they basically had a bunch of meetups all over the world, and they did actually reach out to people locally and had them run them. But yeah, it was...
B
It was a little weird, but it was probably a good move for them. Anyway, I should mention that not only do we do meetups, we did a bunch of other events. The reason we were interested in events is that it's kind of what we did. I mean, I've been doing open source meetups since 2011.
B
My first one was the Desktop Summit, which was KDE and GNOME coming together here in Berlin. Then we also did, you'll see signs up here, the systemd conf; we did that for a couple of years, and that was the first event we actually did at Kinvolk. And we still do Rejekts: the CFP, by the way, is open for Cloud Native Rejekts. That is a conference designed to be the B-side for KubeCon.
B
It's held a couple of days before KubeCon. So if you got a rejection, or anybody you know got a rejection, you can submit to Rejekts, just copy-paste, and then we'll do a review of that, and maybe you'll be a rejected reject, or you'll be an accepted reject.
B
We haven't started that one back up, but I'm trying to get it started up this year. That one is low-level; we're kind of kernel and low-level folks, but also doing the Kubernetes stuff. That was kind of our niche, and still is. So we have that, which we're trying to start up again later this year, for low-level Linux user space.
A
Thanks, thanks Chris. If you're curious about meetups or Cloud Native Rejekts, chat with us during our break. I for some reason wanted to say lunch break, but I realized it's way too late for lunch; during our pizza networking break, then. Next up we have Aisha. Give us five to ten minutes and we'll get everything all set up, and yeah, chat.
A
No, but you need to, no, you need to mirror it, I think. If you go back to settings... okay, I think it's under display or something.
E
Okay, okay, thank you. Hi everyone. My name is Aisha Kaleem and I work at Red Hat as a software engineer. First of all, thank you so much for coming and listening to me. Today I'm going to be talking about eBPF: what eBPF is, and what it does.
E
How does it work? I'm also going to go a little into the architectural level of eBPF, to discuss how it works at the kernel level. So I'm going to be discussing eBPF maps, the eBPF program execution architecture, and also some libraries that we can use to write eBPF programs.
E
So, let's get started with: what is eBPF? eBPF is a technology that allows you to run sandboxed programs in the Linux kernel. Basically, eBPF allows you to add more functionality, more programs, into the Linux kernel. Historically, the operating system was the right place to implement networking, security, and observability-related functionality, but that was hard due to the restrictive nature of the Linux kernel. Now eBPF makes it possible to add more functionality, more programs, into the Linux kernel itself.
E
So the use cases of eBPF are basically networking, security, and observability. And just to summarize, because there are a lot of very smart people sitting here, some of whom have already used eBPF and some of whom might not know about it, I just wanted to give a small example. In web development we have HTML pages, and we have CSS for designing those HTML pages, and those are static websites.
E
If you want to add something dynamic to what runs in the browser, you would have to change the HTML, change the code, because the website is static. But if we add JavaScript, then we can add a lot of functionality to the website, on the front-end side as well.
E
That's not an exact analogy for eBPF, but it's doing a similar thing. Operating systems are constantly evolving; distributions like Ubuntu are constantly shipping new versions and updates. But the Linux kernel is not evolving at the speed the rest of the operating system is, while the demands are constantly growing: the demands for better security, better networking, and better performance keep growing, and due to the restrictive nature of the kernel, we don't have access to the kernel space.
E
So the purpose of eBPF is basically to add the functionality, the programs, or the features that we need for efficient performance, for collecting more data, or for networking. When we need a little extra functionality added to the Linux kernel, we add eBPF programs.
E
That's where eBPF comes in: it makes sure we can add more functionality into the kernel and run those sandboxed programs in the Linux kernel without depending on the kernel version or on anything else. And for that, there are a lot of projects that developers use to write those eBPF programs.
E
For example, there are a few which I'm going to discuss, like BCC, bpftrace, and Cilium, and there are a lot of other projects that can be used to write eBPF programs. But before going into that, I'm going to go into the kernel level: how eBPF programs actually work in the kernel.
E
When the developers were building eBPF, kernel security was the most important concern: to make sure that if we are running eBPF programs, or adding more functionality into the kernel, it doesn't harm the kernel itself, because that's the heart of Linux and the heart of the server. So security was the most important factor, and that's why two components come in when it comes to loading eBPF programs: first the verifier engine, and then the JIT, the just-in-time compiler.
E
In the kernel, the verifier is the one that makes sure the eBPF programs are verified properly: that they follow the set of rules required of eBPF programs, and that there are no exceptions, no uninitialized variables, and no infinite conditions or infinite loops.
E
The verifier itself is really something like 10,000 lines of instructions' worth of checks that an eBPF program has to pass before it's allowed to run in the kernel. Then the verifier sends the program on to the JIT, the just-in-time compiler, which basically takes the eBPF bytecode and converts it into native machine code, so it can run in the kernel much like a kernel module would.
E
That's the work of the JIT compiler, because the kernel expects the program as bytecode. As a developer you can, of course, also write the program directly in bytecode...
E
But that's not very practical. The practical way is to use BCC, Cilium, and these kinds of projects to run that eBPF program in the kernel, because they are the ones that convert the program into bytecode, which I'm going to discuss right now. I'm trying to cover this from a higher level down to a lower level. So, the verifier: once the verifier approves the program, that it's verified and safe to run, it goes to the JIT compiler.
E
The JIT compiler basically makes sure it runs in the kernel like a kernel module or an executable. And then here eBPF maps come in. eBPF maps are what eBPF programs use to share their data and their state, basically using them as their memory storage in the kernel. Because eBPF programs don't depend on any kernel version, the running procedure of eBPF programs in the kernel is independent, and...
E
So that's why eBPF maps come in: basically to share state, to communicate with the kernel, and to do data sharing. And then, when the programs get executed, results are also returned back, via system calls, to user space.
E
System calls are basically the most important aspect when it comes to eBPF programs, because system calls are what's used to load those eBPF programs into the kernel, and from kernel space to user space, system calls are what does that communication.
E
eBPF maps are also how you collect the execution results from user space: eBPF maps are the ones that share the data with user space. And there are also some helper functions for writing eBPF programs, which are predefined and kernel-independent.
E
Basically, helper functions are what you can use when you are writing eBPF programs using any project, whether it's BCC or bpftrace, which I'm going to discuss in the coming slides. The purpose of these helper functions is to do common helper tasks, for example, generating random numbers, getting the current date and time, and giving access to eBPF maps.
E
These are kernel functions. So that was, I think, a very high-level overview of how eBPF programs run in kernel space. But a lot of you might be thinking: how does this actually run from user space, and how do those eBPF programs get to the kernel?
E
Via a system call we send those programs to the verifier, then they get compiled by the JIT compiler, and then all the execution can be done. So, BCC is a library that uses LLVM and Clang as its compiler, and you can write Python programs using BCC. BCC is, I would say, more beginner-friendly, for people who just want to get started with eBPF and writing eBPF programs.
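As a concrete illustration of the pieces described above (compilation to bytecode, the verifier, maps, and helpers), here is a minimal BCC sketch; it is not from the talk, it assumes BCC is installed and root privileges, and the traced syscall and map layout are just illustrative:

```python
#!/usr/bin/env python3
# Minimal BCC example: count execve() calls per process in a BPF hash map.
# BCC compiles the embedded C to eBPF bytecode via LLVM/Clang; the kernel
# verifier checks it, the JIT compiles it, and user space reads the map
# through the bpf() system call.
from bcc import BPF
import time

program = r"""
BPF_HASH(counts, u32, u64);            // eBPF map shared with user space

int trace_execve(struct pt_regs *ctx) {
    u32 pid = bpf_get_current_pid_tgid() >> 32;   // a predefined helper
    u64 zero = 0, *val;
    val = counts.lookup_or_try_init(&pid, &zero); // map access via helpers
    if (val) { (*val)++; }
    return 0;
}
"""

b = BPF(text=program)
b.attach_kprobe(event=b.get_syscall_fnname("execve"), fn_name="trace_execve")

time.sleep(10)                          # collect events for a bit
for pid, count in b["counts"].items():  # read the map from user space
    print(f"pid {pid.value}: {count.value} execve calls")
```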
E
Because it's easy: you can write eBPF programs in Python, it's easy to use, and I think it is one of the oldest ones as well. bpftrace is also a high-level language that can be used to write eBPF programs, and it also uses LLVM as its compiler.
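For comparison, a bpftrace one-liner gets you tracing with almost no setup (assuming bpftrace is installed; the probe chosen here is just an example, not one from the talk):

```sh
# print each process that opens a file, via the openat tracepoint
sudo bpftrace -e 'tracepoint:syscalls:sys_enter_openat {
  printf("%s opened %s\n", comm, str(args->filename));
}'
```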
E
Apart from that, a lot of new libraries have also come along, for example a Go library for writing programs in the Go language, and it works the same way BCC and the others do: basically compiling the code into a lower-level representation that goes to the kernel. So there are a lot of new libraries, and some of the newer ones, in Go for example, support CO-RE.
E
They address the issue that, once you write the programs, they otherwise get compiled every time in user space before going to the kernel. With CO-RE and libbpf and all that, the program is compiled one time and can then run anytime, everywhere, with every kernel.
E
So, to summarize the architecture I have discussed in this talk: we write an eBPF program using some project, whether it's BCC, a higher-level language like bpftrace, or Cilium, which is also one of the really popular projects.
E
Cilium basically gives you an abstraction layer on top of eBPF; it's a really popular and, I would say, really cool project for writing and using eBPF programs. The project acts as the compiler, and the eBPF code goes to the verifier, which verifies that it's safe to run in the kernel, and then it's compiled by the JIT compiler. That's the summary of this presentation, and yeah, I think I'm done.
A
I just want to point out that this is Aisha's first presentation, and as a local meetup I think we should all be extremely encouraging, so, a hand. And also, please do engage: if you do have some questions, let's just make it a discussion format. Yep, yeah.
E
Improve the, your account, yeah.
E
So the question was... okay, so the question was that, for example, initially, in big data centers or cloud platforms, we have really big requirements related to security and observability, and not all requirements can be implemented in the operating system, because we need more information related to the networking.
E
Initially it was BPF, the Berkeley Packet Filter, but now it's the extended Berkeley Packet Filter, extended BPF. With the JavaScript example I just meant to say: for example, if it's a simple website and we need to add more functionality, for example, if the user clicks a button, this function is called or this event happens, but it's a static HTML website, how are we going to do that? So we use JavaScript for that.
E
That's, I would say, not exactly but a similar example here: the requirements increase, the demands increase, and operating systems are also evolving and keep adding features. But when it comes to adding features into the kernel, adding improvements into the kernel, it doesn't grow that fast: it takes a lot of time for the kernel version to get updated, and a lot of the latest operating systems might be using an older or similar kernel.
E
Kernel evolution is not very fast, but we still need to add more functionality, whether it's networking or, especially, observability. So here eBPF comes in: we add one portable program that does some extra task in the kernel, so that we don't need to change any kernel module and we don't need to change anything in the kernel source code. We just need something added on top of it. Yeah, like, I don't know, some extra spice on top of any dish you like.
E
As for including it in the kernel in the long term, I'm not really sure, because it also depends on the use cases. Not everyone is using eBPF directly; projects like Cilium and other projects are the ones that make it possible for the use cases to use eBPF, and they are just programs that can be run on the basis of the requirements, on the basis of better security and functionality.
E
So yeah, as for the future of it heading into the kernel, I'm not really sure, because the kernel itself has a very restrictive nature, and a lot of security concerns are connected with the kernel itself. The verification process of eBPF itself is, I would say, a really long process, just to verify and make sure that it's safe to run in the kernel. So yeah.
E
I think the latest versions of the kernel are introducing the eBPF features; I am not exactly sure about the exact kernel version, but the latest ones are introducing it.
E
Yeah, recently I was working on a project where we were basically collecting data between CRI-O and the kubelet for observability purposes. We needed to add eBPF to get more observability related to CRI-O, the kubelet, and other security factors. So yeah, I think the latest ones are using it.
E
Sandboxed, yeah. Like I've said, eBPF is a sandboxed program for the Linux kernel, and I would say a sandboxed program is a program which is independent of the kernel and that can be run inside the kernel. So it's a piece of program, yeah.
E
Yeah, well, I think writing an eBPF program is not very difficult. You just need to have an eBPF setup in your kernel, and you must make sure that it has the eBPF verifier. And I think just using the BCC library you can write eBPF programs and run them via system calls. So yeah, that's possible, I think, for a beginner.
E
Avoiding security issues: like, if I have a container starting an eBPF program to read the network traffic of another container, what are the security features to avoid that? I think that needs a really detailed answer, I would say, because on the security side there are still security loopholes in that. But how you can run it: you can use system calls and run those eBPF programs with that. I'm not really sure beyond that.
E
If you are running it in a container and it goes to the kernel, what the security factors are for that, I think it also depends on which project you are using. For example, Cilium provides great help when it comes to running eBPF programs: it gives an abstraction layer on top, and it also provides a lot of security features for running eBPF programs in the kernel.
F
The correctness and the consistency of the verifier and the just-in-time compiler are probably very crucial properties of the whole eBPF subsystem. Yes.
E
Yeah, yeah, I think your question is really good. The just-in-time compiler is basically just compiling the program, and the verifier is the one responsible for verifying the programs: something like 10,000 lines' worth of checks to make sure the verifier is verifying...
E
...that the program is okay to run. I think it's pretty strong, but still there can be loopholes. There are a lot of strategies that can be used to verify programs, and they can also come in at the user level, in user space, as well. I think the more of the verification process that can be handled in user space, the better, rather than in the kernel, because the kernel is a different space, and the kernel is the one doing the verification.
G
Like, if I write an eBPF program, will it work if I take an older kernel, or if we keep working on a newer kernel, or has there been a difference? Yeah.
E
I think, yeah, it's a nice question: whether the kernel version affects the running of the eBPF programs. Actually, the purpose of eBPF is to make sure that it's very independent of the kernel. That's why we are compiling it in user space.
E
We are using the helper functions, not even using the internal system calls or internal functions of the kernel. As long as you have a kernel that does the verification process and just-in-time compilation, I think you can run the eBPF program; they are not kernel-dependent, they are independent. So yeah, that's it.
H
Thank you. All right, so what was the cliffhanger I was talking about? Right. Clearly there's a lot of interest in eBPF, and it has been nice to see this discussion. I just wanted to look at everyone in this room: there's a lovely crowd in this room, and I'm sure a lot more people are also interested in eBPF and other topics, and this is the theme of the meetup, right? We have some experts here from all across the domain. So what I...
G
Yeah, I'm going to keep talking so that you can check that it's working correctly on the other side. I will pretty soon run out of interesting things to say, but I think it's not a problem, because the point is just to check that the sound works correctly, and at that point I hope it does. But if it doesn't, I'm willing to keep talking until we know for sure that it works. I'm still testing. Are we good? Okay, perfect.
A
And while we're at it, I just want to stress one quick thing which I think we always miss out on logistically, but when we do remember, we do talk about it, which is the code of conduct. This is a harassment-free zone, and I know everyone respects that, but we should also talk about it and emphasize it as much as possible. So yeah, let's disagree if you want to, but in a nice, tasteful way, all right?
A
That said, you know the rest of the general code of conduct, which you generally read and know about everywhere. Are we all back? We have a quorum of enough people, I think.
A
We now have container OG Jérôme Petazzoni. I don't know if I've said your name right or if I still need to improve on it, but I'm sure we're all waiting to hear from him. It's a good 30-minute talk, so get your refreshments, sit down, and take it away, Jérôme.
G
So, reminder: this is streamed, which means it's also recorded, but it's also going to be recorded twice. I mean, you should always record anything important twice, to have a backup. So I'm going to go to that little terminal here and I'm going to start the recording; it's going to be useful for the demo in half an hour. So: ffmpeg, from the PulseAudio default input, and that's going to be meetup.mp3.
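The exact command isn't readable on the stream; a minimal invocation matching what he describes (capture the PulseAudio default input into meetup.mp3) would be something like:

```sh
# record the default PulseAudio source and encode it to an MP3 file
ffmpeg -f pulse -i default meetup.mp3
```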
G
Let me make the font bigger. Now, that's not important; the key thing is to note that this is recording. Awesome. Okay, so this is recorded twice. So, I'm Jérôme Petazzoni; you can follow me on Twitter or Mastodon these days. I was an early employee at Docker; I did a bunch of different things there. I was also an early employee at Enix, which you probably don't know. This is not Square Enix, the makers of Final Fantasy.
G
But that's not what I'm here to talk about today. I'm here because I'm in the right circle, I think: I know a bit about containers, and I'm trying to learn about machine learning. There are many people knowledgeable about machine learning, but I didn't find many folks in the middle circle, and I really struggled with a lot of questions about how to run ML stuff in containers in a way that I would find satisfactory from a container standpoint.
G
So this is kind of a debrief of the stuff I've learned. By the way, who here is more on the right circle, you know, container folks? And who's more on the left circle, machine learning folks? Oh, very few. Okay, well, if I say anything wrong, please point it out to me, don't hesitate; I'll be happy to learn. And by the way, we had a presentation about eBPF before, so I have an idea for a talk next year.
G
Actually, there are many folks who know both about containers and eBPF, because there are many cross-pollination opportunities; like Kinvolk, or maybe I should say the band formerly known as Kinvolk, has lots of folks who are really knowledgeable about both containers and eBPF. Now, eBPF and machine learning and containers: that's a startup idea for next year. Anyway, this talk is not an intro to machine learning, because I am not qualified to do that.
G
If you want to learn about machine learning, there are tons of resources; one that I found pretty good is 3Blue1Brown. It's videos; I don't like videos, but I like these ones, so that would probably be good. This is also not an introduction to containers and Docker and Kubernetes. I might be qualified to deliver that, but I won't, because that's not the theme today. I'm also going to sweep under the rug, that rug here, a lot of questions.
G
So I'm going to tell you about the context and the specific use case that got me into all this. Then I'm going to talk about running GPU stuff in containers, running machine learning in containers, then putting everything together, and then we'll see where that takes us. So first, why am I even bothering with this? It starts with a model. Now we have some interesting open source models, I would say. When we think about open source code, we think: yeah, it's Python, it's Go...
G
It's Rust, it's whatever, and it's going to be thousands of lines, and that's it. But in machine learning you have models, where you have basically weights in matrices, representing coefficients in neural networks, and that's a pretty important thing. It's actually what makes the magic in the model, and that's the difficult part. So what's interesting is that we now have some models where both the code and these parameters, or weights, are available. A really popular one is Stable Diffusion; you have probably heard about that one.
G
You can ask it: give me a photo of an astronaut riding a horse, and it will happily give you that kind of picture. This takes a few minutes on a relatively fast CPU and a few seconds on a GPU, so now we're talking some real acceleration. And the parameters I was talking about are about five gigs. So there is the code, and then there are like five gigs of matrices of floating-point numbers, and I...
G
Don't know exactly. Stable Diffusion is not the only one; you might have heard about DALL-E and Midjourney and many others. But the interesting thing about Stable Diffusion is that you can run it on your own computer. I can run it on this old laptop; it would probably take longer than the meetup to give me a picture. I can run it on a GPU, and when I say GPU, I mean a few-hundred-bucks GPU, not a data center Tesla whatever that costs more than a Tesla car.
G
These things are incredibly popular on GitHub. There is this graph where they're like: hey, look at this hockey stick growth. That's the number of stars on GitHub, so they're like: look, we're growing much faster than everyone else. Then there's the first comment on a pull request: apparently somebody thought that if they commented on the pull request with a prompt, like "give me a rock album cover, blah blah blah", somehow the image would show up. No, that's not how it works. I'm...
G
Yes, it's popular, because, you know, when you compare B2B and B2C: obviously there are more people using apps like, let's say, Uber or Lyft than people using VS Code; not the same target. And now I think there are literally millions, if not tens of millions, of folks trying to run these models on their machines, because, let's face it, it's pretty cool. You put in some crazy prompt and it comes up with an image. That's really nice. I mean, back in the day, like a thousand years ago...
G
It works great on my French as well. Even when I'm talking about Kubernetes and kubectl and etcd and stuff like that, it actually transcribes them properly, and that was really impressive. And I have to give a big thanks to my friend Julius Volz, one of the founders of Prometheus, who told me about this and was like: dude, you have to try this. I listened, I did, and now here I am, so it's all his fault. Okay, so in the case of Whisper, the parameters can have very different sizes.
G
So, okay, there is that table here, but you see it goes from about 80 megs on disk to about three gigs, so all the sizes that you want. And to give you an idea of how fast it goes: if you want to transcribe half a minute of speech, it's going to take a few seconds if you do that on GPU, and maybe a few minutes or whatever on CPU, depending on the CPU. But again, big difference between CPU and GPU. And by the way...
G
This is what Whisper gives me when I ask it to transcribe some of my Docker or Kubernetes courses, in English on top and in French below. It works really well, and this is raw, you know, not edited, not fixed or whatever; we don't fix in prod, or in post, here. So now we might wonder: do you really need to run that on GPU? Okay, sure, it's like 10 times or 50 times faster, but who cares, you can just wait a little bit?
G
Well, if I tell you that you can work on a new programming language and it's going to take five seconds to compile, or five minutes to compile, what do you prefer? And beyond our own personal comfort: if I can compile and test in a few seconds, I'm going to be able to experiment a lot more. I try, I fail, I try again, I fail again, and in an hour I can do dozens of experiments, while if it takes one hour to compile, in one hour I can do one experiment.
G
So that's why I wanted GPUs: so that I had a faster feedback loop in my experiments. The result of these experiments, and this is kind of the beta version of the whole thing: I'm taking my 16-hour Kubernetes course and breaking it down into small chunks and automating the whole editing process. It takes maybe 20 minutes of me editing a text file to indicate cut here, cut here, cut here, and then a few hours of compute to get something like this. You know, you have like, sorry...
G
The text is really small here, but okay: you can pick a specific chapter, like exposing containers, and I don't know if it's going to play really fast, but normally that's supposed to work, and then it plays. And what's really fun is that you can... interesting, okay, demo effect. I told you it was a beta version. But normally, if you click in the middle of that text, it takes you to that particular location. So that's my use case, basically: taking my live courses and turning them into bite-sized chunks.
G
All right, now, let's say that we agree: we want to run machine learning stuff, and we'd love a GPU, because it's faster, and we want to do that in containers. I mean, maybe you don't, but personally I want to do it in containers, because reasons. So how does that work? Normally, when you run code, not just in containers but anywhere, the way that the code interacts with the rest of the world is through system calls. So our code, you know, I write like C code or Python or Go or Rust...
G
It's going to use libraries, and eventually, at some point, the libraries are going to make these system calls. I think it's called the ABI: kind of the API between what we call userland, our code, and the kernel. That's the frontier, you know, the line between inside the container and outside the container, the kernel code. And this thing has been incredibly stable over decades. There is, here, The Linux Programming Interface.
G
I don't know exactly when that was written, but I have an older Linux book, which was about kernel 2.0, and I'm pretty sure this book is more recent; it has more stuff, like more system calls and newer features. But you could get my super old French book about programming for the Linux kernel, and you could write code using those system calls that would still work today. Actually, you can take container images that I built 10 years ago...
G
One of the advantages of being at Docker in the early days is that you can have images that are actually 10 years old, and they still work today on a modern kernel. Why? Because, and I won't show you that link, because that would be at least half a dozen code of conduct violations, but Linus's maintenance of the Linux kernel tends to get a little bit harsh when people break that compatibility: he insults them and berates them, etc., etc.
G
So maybe there are better ways to maintain compatibility, but at least it works, and we have this stability on the kernel. Okay, great. Now add GPUs, and the fun ends here. When you use, for instance, NVIDIA, and use the NVIDIA drivers: on the right-hand side, in the kernel, we have some kernel modules, and on the left-hand side...
G
We have userland libraries, and the bad news is that the library that you use and the kernel module have to match exactly, which means that when you update your NVIDIA driver, you also need to update the corresponding libraries. So it means that if I update the kernel on my machine, I need to rebuild my containers. That sounds like extremely bad news from a container person's perspective. All right, so let's try to do something like this.
G
We take the device nodes, you know, the stuff in /dev, and put that in the container; we take the libraries from the machine, you know, mount them in the container, copy them in the right place, and run some program, and it works. Okay, that was a terminal reference, sorry about that, but it works. You do that and it works for real, but it honestly feels super hackish. Nobody wants to do that in production; at least I don't.
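He doesn't show the exact command; the hackish approach being described looks roughly like this (device paths and the library location are illustrative and vary per system and driver version):

```sh
# hand-mount the NVIDIA device nodes and a host driver library into the container
docker run --rm -ti \
  --device /dev/nvidia0 --device /dev/nvidiactl --device /dev/nvidia-uvm \
  -v /usr/lib/x86_64-linux-gnu/libcuda.so.1:/usr/lib/x86_64-linux-gnu/libcuda.so.1:ro \
  my-gpu-image some-gpu-program   # hypothetical image and program names
```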
G
The fun thing is that it's stable enough that, during the first year of the pandemic, when I moved all my courses online, I had this whole encoding setup with OBS, Open Broadcaster Software, and I was running OBS in containers, because reasons, with GPU acceleration and all that stuff, and it was stable enough for me to basically base my source of income on it. So it works. But seriously, that kind of command line?
G
No. So there is a better way, and the better way is something called nvidia-docker, or actually it's the NVIDIA container runtime. When you install that, you add a couple of flags to your docker run, and it works. Now, it does exactly the same thing. You know, I'm sure many of you have done stuff with Puppet, Chef, Ansible, etc., and what we very often do is take some really ugly shell scripts and put them behind some config management rules. And now you make this thing happen.
G
It looks nice like this, but behind the scenes it's just a bunch of shell scripts. So here, what happens is that the NVIDIA container runtime is a wrapper for runc, so that gives us an OCI-compliant runtime, and it itself is going to call runc to run the containers. Now, if you don't know about OCI and runc, etc., you might be like: okay, who cares? Good news:
G
Well, that means that since it's an OCI-compliant thing, you can drop it in and use it with Podman, Kubernetes, and many other projects in the container ecosystem. So, yes, great. A little detail on that: you need to not only pass that --runtime nvidia, but you also need these environment variables. If you don't put the variables, the runtime is not going to wake up, so to speak, and it's not going to inject the libraries and binaries into the container.
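The slide with the exact flags isn't in the transcript; a sketch of the kind of invocation being described, assuming a stock CUDA base image, is:

```sh
# both the runtime flag and the NVIDIA_* variables are needed;
# without the variables the runtime hook stays dormant
docker run --rm \
  --runtime nvidia \
  -e NVIDIA_VISIBLE_DEVICES=all \
  -e NVIDIA_DRIVER_CAPABILITIES=compute,utility \
  nvidia/cuda:12.0.0-base-ubuntu22.04 nvidia-smi
```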
G
It took me a while to figure that one out. So yeah, that gives us something that lets us use GPUs inside containers. Great. Next, we want to run machine learning applications in containers, and I walked into this thinking: it's just apps, so I'm just going to write a Dockerfile and it's going to be easy, right? So I took Whisper, the voice transcription thing, and okay, this is Python, and they have a setup.py and a requirements.txt and all that stuff.
G
So you can actually pip install straight from the repo on GitHub, and it works. They require FFmpeg to do some audio transformation, codec stuff, but I got this. Great. By the way, all the Dockerfiles for this are gathered in the repo here, if you want to compare and check, etc. So I build this thing, literally this three-line Dockerfile, and it builds a six-gig image.
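The actual Dockerfiles are in the repo linked from the slides; a sketch of what a naive three-line version plausibly looks like (base image choice is an assumption):

```dockerfile
FROM python:3.10
RUN apt-get update && apt-get install -y ffmpeg
RUN pip install git+https://github.com/openai/whisper.git
```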
G
I've seen some pretty big images in my days, I was at Docker, as you can imagine, but this seemed a little bit excessive. And by the way, that doesn't include the three-gig file with the models and everything. At first I thought: oh, okay, that's the model. No, no, the model is not there yet. So if you look into the dependencies: my understanding is that a lot of machine learning stuff is in Python, not all of it, but a lot of it, and there are two big churches.
G
Well, the two major ones, basically, are PyTorch and TensorFlow. TensorFlow, I think, is kind of backed by Google, and PyTorch by Meta, the band formerly known as Facebook. Or maybe the other way around; no, no, I think that's it. Anyway, if you look at the size of these packages, you have 1.8 gigs for PyTorch, and I don't even know how you manage to have almost two gigs of code. It's probably not just code, but I haven't dived into it yet, so I can't explain why; I'm sure there are really good reasons.
G
Let's do some multi-stage builds, let's use a virtualenv, and let's do... oh, I'm going to talk about that one, because this is fairly new, and even if you work with containers, you may or may not know about it, and I think it's incredibly cool. This is a pretty recent feature with BuildKit, where you can basically mount a cache directory. Two benefits there: it means that the cache directory is not going to be included in the final build.
G
So all the stuff, you know, when you do an apt-get update or a pip install or anything like that: if you set the right directories to be cache directories, well, first advantage, they don't end up in the final image. Second advantage: they persist from one build to the next. So it's a little bit as if you were doing a good rm -rf on that directory, but you keep the advantages of caching when you invoke the build over and over. All right.
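The syntax isn't shown in the transcript; the BuildKit feature being described is the `RUN --mount=type=cache` instruction, which for a pip install looks like this (package list illustrative):

```dockerfile
# syntax=docker/dockerfile:1
FROM python:3.10
# pip's download cache persists across builds but is not committed to the image
RUN --mount=type=cache,target=/root/.cache/pip \
    pip install torch openai-whisper
```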
G
So we do all that stuff, and that gives us some significant improvements. You know, I squeezed things as much as I could; I got almost down to a one-gig image with the Whisper code, and I was like: yay! Then I was like: how do we get the GPU stuff again? Oh. Now we are back at 3.7 gigs. Anyway, my takeaway was: well, going from 6 gigs to one gig makes me feel pretty good, because it's almost one order of magnitude.
G
So I think the key thing is not really to try to squeeze every gig out of the image at this point, but more to make sure that the build system leverages caching correctly, meaning that when you push changes to your code, it's not going to rebuild the whole thing and push a completely new image, but just update the layers that need to be updated. Because in that case, you're going to see differences like five or six orders of magnitude. So it's worth it. All right, now...
G
Let's talk about models. I was telling you earlier that the models in machine learning can be big. Not all of them are; you know, if you follow a lot of tutorials on machine learning, it's about recognizing handwritten digits, and those models are going to be way smaller. Another good presentation I saw had somebody teaching a model to play rock-paper-scissors, and again, the models are not going to be huge. But these models, like the two I was telling you about...
G
Well, that's it: for Whisper, the big one, the one that we want to run because it's the best, is like three gigs, and for Stable Diffusion they are like five gigs each. So that's not small. So, good news: when you run these machine learning applications, they tend to automatically download the models. That's pretty awesome, right? Except they do that every time you run your container, because each time you run the container, you get a brand new container, and you re-download the model.
G
I don't think you should do that, except maybe if you like pain, and even then. Okay, some exceptions: if the models are really small, like a few megs, okay, fine; you already have 700 megs of PyTorch, and I don't know what's in there, but if you add 10 gigs of... sorry, 10 megs, not 10 gigs, 10 megs of models, nobody will ever notice. Or maybe if you have a big model, but it never changes and you will never rebuild the image. And if you believe that, I think you're deluding yourself.
G
But who knows, maybe. In other situations, though, I don't think you should put the model in the image, and I'm going to try and explain why. It's just that bigger images mean bigger problems. If you build without BuildKit: BuildKit is kind of the new-style Docker builder, and it's great and we really should use it. But if you build without BuildKit, which is still the case in some antiquated CI build systems, etc., each time you docker build, it's going to send the whole build context to the Docker engine.
G
If you have a five-gig model file, you're going to have three copies of that file: one on the machine, I mean like in your local, let's say, home directory; one in the BuildKit cache; and another copy in the final Docker image. So that's not great. Also, Docker registries, and not just Docker but container registries in general, because at the end of the day, on your production Kubernetes clusters you're probably not using Docker anymore these days, but registries are not CDNs. It's not Netflix, right?
G
They are not meant to send five-gig, ten-gig files like this. They can, but should they? I think not. They typically are not going to be as fast as doing a massively parallel pull from S3 or R2 or whatever. So I think it would be a good idea to keep the models somewhere kind of aside, and pull them separately, to let the registry breathe a little bit. Okay, so instead, what should we do?
G
Well, for instance, something like this. That's what I'm using locally: I have a Compose file, and I've put my models in this cache directory, so each time I run that container, the model files are here. And if I want to run the models locally, I can use the same cache directory, and that's great: I avoid having multiple copies of these files, and it's honestly super easy, one line in the Compose file.
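The Compose file itself isn't shown in the transcript; a minimal sketch of the idea, assuming a locally built `whisper` image and Whisper's default cache path, is:

```yaml
services:
  whisper:
    image: whisper            # hypothetical locally built image
    volumes:
      # the "one line": keep downloaded models on the host across runs
      - ./cache:/root/.cache/whisper
```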
G
The idea is to try to make sure that these models are pulled into some persistent cache, so that if you have a container that you run over and over, a pod that you run over and over, it doesn't download these five-gig models over and over. All right. And it's honestly not too hard to do, by the way, and I don't have examples here, but...
G
The most important part is to be aware that these huge files exist, that they are going to be automatically downloaded every time, and that we want to avoid that from happening. All right. So now that we have all that, you know, it's demo time. Normally I've been recording that; okay, let's stop the recording now, and normally this has been synced pretty much in real time to there. Yes, meetup.mp3. So this machine on the lower side, zavtra, that's a machine at home with a GPU. Okay.
G
So on top, I'm going to run Whisper with ./w; it's a little shell script to run the Whisper container and mount the local directory and the cache and everything. So we are in a container. Yeah, we are in a container; I don't even have ps in the container, but trust me, we are in a container. And I'm going to do whisper meetup.mp3, and this brave little laptop is going to try its best.
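The wrapper script isn't shown on stream; based on his description (run the Whisper container with the current directory and the model cache mounted), it presumably looks something like:

```sh
#!/bin/sh
# hypothetical ./w wrapper: run the whisper image against files in the
# current directory, reusing the host-side model cache between runs
exec docker run --rm -ti \
  -v "$PWD:/work" -w /work \
  -v "$HOME/.cache/whisper:/root/.cache/whisper" \
  whisper "$@"
```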
G
So the first little moment was loading the model; it's three gigs, so even if it's on the local SSD, it takes a while. Now it says: hey, you're using CPU, that's too bad, it's going to take a while. It has successfully detected English, and in a moment, probably 30, 40 seconds from now, it's going to start telling us what was said at the beginning of the meetup. Meanwhile...
G
So on the machine here at home, we have that meetup file that was kind of synchronized progressively, and I'm going to do the same thing: whisper meetup.mp3.
G
And oh, okay, there's a moment: it says, warning, you're on CPU. But it shouldn't, because I have a GPU at home. So let's check. A good way to check if all the GPU stuff is set up right is to try to run nvidia-smi, and... okay, command not found, so it means it's not set up properly. Okay, let's get out of here and let's check my little wrapper. Right, I forgot to put the --runtime nvidia. So let's add that: --runtime nvidia, and...
G
Oh, maybe just one minute and we're almost, almost there. So, okay, now I can run nvidia-smi, and yep, I have the output that says: you have a brave GPU and it's ready to work. So now I do whisper meetup.mp3; I don't get the warning about using CPU and things going to be slow, and okay, there we go. Let's unzoom a little bit, so that we can see the comparative speed: on top, CPU, running on this laptop; on the bottom, GPU, running at home. And as you can see, the GPU is much, much, much, much faster.
G
Now we can zoom a little bit, and okay, it didn't do that well here on kubectl and etcd. I don't know which model it's using by default, so let's add --model large; I think it's maybe medium by default. Let's let it work for a bit and see what happens. By the way...
G
The models are so big that even at home, where I have mounts over Ethernet, like on gigabit, loading the model is still slow; yeah, because a three-gig file on a one-gig link still takes like 30 seconds to load the first time, which is super long. Oh, by the way, you...
G
"This is a meetup talk in Berlin by Jérôme Petazzoni": it's going to use that to know... I mean, I'm not exactly telling it what it's about, but it's going to try to continue in the same style, and because the Petazzoni name is written correctly in the prompt, normally, where it wrote "potassium" in the beginning of the transcript, it should now get it right, because it had the correct spelling just before. And, okay, yep, now it's got the name correctly.
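What he's describing maps to Whisper's prompting mechanism; with the openai-whisper CLI this is the `--initial_prompt` option (the prompt wording is taken from his demo, the rest of the command line is assumed):

```sh
whisper meetup.mp3 --model large \
  --initial_prompt "This is a meetup talk in Berlin by Jérôme Petazzoni."
```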
G
One thing that may or may not work: you can also influence the style. So, for instance, instead of putting punctuation and all that stuff, if I say: let's do this Twitter style, because machine learning is so cool, shake my head. Sometimes this works, sometimes it doesn't, but sometimes it's going to continue in the same style and just let go of punctuation and everything, all lowercase. We'll see, it depends.
G
A funny thing is that these models can hallucinate. So what does it mean, hallucinate? It means that it's going to give you stuff that's not there. For instance, if you have a long pause... yep, see, it works now: it reads like I'm feverishly typing on social media. So, it can hallucinate.
G
So, for instance, if there is a long pause, very often it's going to write something like "Subtitles by Radio-Canada", and you're like: what? I didn't say that. This is because, very often, at the end of a movie you're going to have "subtitles by" whomever, and so the model kind of learned: when there is a long pause, it means the text should be "subtitles by blah blah blah". So that's a hallucination. Okay, that was a little example. Yeah, it's pretty cool, right? So, results: I got my kind of semi-automated chaptered video thingies.
G
The links are in the slides, so you don't have to remember that. It's demo.qctl.party slash... K8s part one... slash "horrible goose", whatever, "malicious goose". No, there is the link. So, just in case some of you want to get started with Kubernetes, there is the equivalent of maybe the first days of my Kubernetes training there, and you can use and abuse that content as much as you want. It's free.
G
The quality is not so great yet, because beta stuff, etc., but use it if you want. I'm trying to make this better; for instance, to have better timestamps, because, as you can see in the transcript here, we have timestamps, but it's kind of coarse-grained: it's like a whole line, and you have a few seconds of audio. I would like to get something word by word, so that I can detect long pauses, so that I can...
G
...know when there is a new chapter, for instance. Or even here, you can see we have this big block of text that's not very palatable, so I would like to automatically put breaks in there. Also: get rid of the hallucinations, because it's kind of weird when you're in the zone, learning about LoadBalancer services, and it says "Subtitles by Radio-Canada", and you're like: what? And better constrain the prompt text so that it gets my name and other things like that correctly. Now...
G
So it's kind of embarrassing in a way, because I'm not really doing machine learning; I'm using machine learning, but I'm not really diving into the thing itself, so I hope to eventually get there. Stuff that I'm working on: I hold the strongly held belief that if you are using the cloud but you're not using spot instances, or whatever your provider calls them, then you're basically lining the pockets of your cloud provider, when you should be using spot instances, because it's like 40, 60, 90 percent cheaper. So I'm putting up some recipes to...
G
...let us run Kubernetes pods with GPU acceleration on spot instances. Maybe that could be a future talk. I'm also trying to build some content on machine learning operations, illustrating different patterns. So here, for instance, we have relatively long inference times. Inference is when you ask the model to predict something, as opposed to training, which is when you build the model itself. So you have pretty long inference, so it's a good use for Kubernetes jobs and pods, something like that.
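He doesn't show the manifests; requesting a GPU for such a batch workload, assuming the cluster runs the NVIDIA device plugin, looks roughly like this (names and args are hypothetical):

```yaml
apiVersion: batch/v1
kind: Job
metadata:
  name: transcribe            # hypothetical job name
spec:
  template:
    spec:
      restartPolicy: Never
      containers:
        - name: whisper
          image: whisper      # hypothetical image
          args: ["meetup.mp3"]
          resources:
            limits:
              nvidia.com/gpu: 1   # schedule onto a GPU node
```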
G
But if you have a really short inference time, something that would take like 10 milliseconds, it would be better to put that behind an API endpoint, for instance. So I'm trying to build something like that, but honestly I'm not feeling comfortable enough with the machine learning part to do something like this yet. And also, to kind of come full circle: somebody used machine learning sentiment analysis on the Linux kernel mailing list to rank Linus Torvalds' rants from most hateful to least hateful.
G
So that's kind of interesting, in case you want to dive into that. First: everything I showed you works on CPU — you don't need a GPU. You will probably want to buy one anyway, but you don't need one. One thing — you know, I don't really like this situation, but honestly, pretty much only the Nvidia stuff really works. I have an AMD GPU, I played a little bit with an M1 Mac, and... no. It feels really hackish.
G
You sweat a lot, and at the end you have maybe one tenth of the speed of the Nvidia stuff. I don't think it's just because the Nvidia stuff is better; I think it's also a matter of support. So progressively the other platforms are going to catch up, but right now, love it or not, if you want to be serious with GPUs you might have to go with Nvidia. And also, a little detail:
G
if, like me, you have an older GPU lying around, like a 16 series — like a 1660 — it doesn't support FP16 math correctly, and so I had to buy a new one, because all the stuff I was trying to do didn't work. Little details.
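(Editor's note: for what it's worth, openai-whisper can be told to stay in FP32, which is one way to limp along on a card with broken FP16 support — slower, and the model still has to fit in VRAM. A sketch:)

```python
import whisper

model = whisper.load_model("medium")
# fp16 defaults to True on CUDA; forcing FP32 avoids the broken FP16 path
# on cards like the GTX 16 series.
result = model.transcribe("meetup.mp3", fp16=False)
```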
We can talk about that after, because I think I'm running out of time, but I will be happy to take questions. Questions? Thank you.
G
Would that be correct? And what sort of ways are you looking at to be able to make this model tell the difference between each of these? Right — so the question is: how can we improve on some of the problems I mentioned, like the hallucinations, and how can we kind of tune things?
G
And the timestamps, yes. Okay, so first, to be honest, I'm not quite sure what "tuning a model" means. In general terms I understand it — oh, I want the model to perform better on this and that — but when I started to look into it, I saw very different examples.
G
So, first there is this whole thing like the Transformers architecture, which — the way I understand it, which is probably extremely wrong — is that instead of having just this one model, we do a bunch of transformations, and that unlocks some interesting possibilities. Like, for instance, Stable Diffusion: the thing that lets it run on a plain consumer GPU is that, instead of working with pixels, it works with a kind of compressed representation, which is smaller and therefore fits into the video memory of the GPU.
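(Editor's note: that pixels-to-latents compression can be poked at directly with the diffusers library — the VAE used by Stable Diffusion turns a 512x512 image into a much smaller latent tensor. A sketch, assuming torch and diffusers are installed; the model name is the commonly published VAE, used purely for illustration:)

```python
import torch
from diffusers import AutoencoderKL

vae = AutoencoderKL.from_pretrained("stabilityai/sd-vae-ft-mse")

image = torch.randn(1, 3, 512, 512)  # stand-in for a real, normalized image
with torch.no_grad():
    latents = vae.encode(image).latent_dist.sample()
    decoded = vae.decode(latents).sample

# The diffusion model works in the small latent space, not on raw pixels.
print(image.shape, "->", latents.shape)  # (1, 3, 512, 512) -> (1, 4, 64, 64)
```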
G
And then there is a transformation, you know, from pixels to what they call this latent space, and then back. So you kind of compress, work on the compressed thing, and uncompress at the end. So there is this whole thing, and what I'm understanding is that sometimes tuning means changing
G
some of these steps — changing some of these parameters, using one text encoder instead of another, or little details like that. A little bit like, you know, tuning my shell pipeline when I have find pipe grep pipe this pipe whatever: changing those pieces. And sometimes tuning would mean retraining the model. And again, the way I understand it — so probably pretty wrong — is that when you train a model, you do a number of iterations, and you could do extra iterations with your own content. Honestly...
G
So, interestingly, there are lots of guides about how to do that with Stable Diffusion, because many people want to use it to produce images that would, once again, be kind of... let's say, not safe for work. And the makers of Stable Diffusion initially didn't want that, so they filtered the data set that they used to train the model, to remove anything that was nudity, gore, porn, etc. But some folks were like: no, we want that anyway. So there are lots of tutorials to teach you how to do that.
G
I didn't really look into that yet. I might — and honestly, not because I want to do that, but because I want to learn it. And it kind of sucks that the best resources for learning how to retrain the model would be for that kind of use, but that's where we are. And the other question, about the timestamps:
G
so there is that thing called WhisperX — let's see... no, not "whisperex", Whisper X — and what they do is, instead of the example I was showing you where you have... would that be? Yes, that's the one. Instead of having one line with, you know, the time of the beginning of the sentence and the time of the end of the sentence, you have the time for each word, so you can do something like... where is the demo... ah well, we can't really hear it at this point.
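(Editor's note: for reference, WhisperX's word-level alignment looks roughly like this in Python. The project's API has changed across versions, so treat this as a sketch of the idea rather than current usage:)

```python
import whisperx

device = "cuda"
model = whisperx.load_model("medium", device)
result = model.transcribe("meetup.mp3")

# Second pass: align the transcript against the audio to get per-word times.
align_model, metadata = whisperx.load_align_model(
    language_code=result["language"], device=device
)
aligned = whisperx.align(result["segments"], align_model, metadata,
                         "meetup.mp3", device)

for word in aligned["word_segments"]:
    print(word["word"], word["start"], word["end"])
```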
G
And so basically, I want to do that for a couple of reasons, the main one being to be able to detect silences and pauses. Now, to be honest, to detect silence you don't need machine learning — I could probably also do that on the side. And by the way, that's also one way to, not exactly solve, but help with the hallucinations: when you have a stretch of blank, you know that, okay, there is blank here, there is no "Radio Canada" subtitling, so you know that you can remove it in post.
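(Editor's note: as he says, silence detection doesn't need ML — something like pydub does it in a few lines. A sketch; assumes ffmpeg is installed, and the thresholds are arbitrary:)

```python
from pydub import AudioSegment
from pydub.silence import detect_silence

audio = AudioSegment.from_mp3("meetup.mp3")

# "Silence" = at least 2 s that is 16 dB quieter than the overall level.
pauses = detect_silence(audio, min_silence_len=2000,
                        silence_thresh=audio.dBFS - 16)

for start_ms, end_ms in pauses:
    # Anything Whisper emits inside these windows is a likely hallucination.
    print(f"pause: {start_ms / 1000:.1f}s - {end_ms / 1000:.1f}s")
```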
G
So there are many things to explore — some of them looking super fancy and advanced, like retraining models with epochs and stuff, and some of them being like: well, we know that at these locations there is silence, so we know that we can just snip out the text, and that's it, it's just grep and find. So it will probably end up being a mix of both techniques. Yeah.
F
G
For GPU? Good question. So, to repeat the question, if I understood correctly: we have pretty good tools to manage resource sharing on Docker and Kubernetes, like saying this container can use that much CPU, that much RAM — what about that for GPU? It's a mixed bag, honestly. My understanding is that we can do stuff like saying, hey, I want this thing to have one GPU. And honestly, in my case at home, that's kind of moot — I only have one GPU anyway, so what do you mean?
G
But when you are in a data center with machines with multiple GPUs, you could say: oh, this thing gets one GPU, or three GPUs, or two GPUs. My impression, though, is that scheduling and resource management are still medieval.
G
One thing that is pretty interesting is to look at the resource usage. Maybe I can run that in tmux and then split the pane. And here we can run whisper — whisper Meetup.mp3 — and while it does that, I can run top. So we can see it's going to use CPU, and then if I use nvidia-smi... what is it again?
G
I don't remember the exact syntax, but what's interesting is that you can see the video memory usage climbing up here, and now we are at like three... yeah. That's how I know that we are probably on the medium model, because when you run the large model, it takes like 10 gigs. And as far as I know, there is no way to limit the video memory usage, for instance, or the GPU compute usage.
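(Editor's note: the same numbers nvidia-smi shows can be polled from Python via NVML, which is handy for logging VRAM during a run. A sketch using the nvidia-ml-py bindings:)

```python
import time

import pynvml

pynvml.nvmlInit()
handle = pynvml.nvmlDeviceGetHandleByIndex(0)  # first GPU

for _ in range(10):
    mem = pynvml.nvmlDeviceGetMemoryInfo(handle)
    util = pynvml.nvmlDeviceGetUtilizationRates(handle)
    print(f"VRAM {mem.used / 2**30:.1f}/{mem.total / 2**30:.1f} GiB, "
          f"GPU {util.gpu}%")
    time.sleep(1)

pynvml.nvmlShutdown()
```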
G
Something like that. What we do have, though, is metrics, and I have no idea if this is going to work, because this is a machine at home and... yay! Okay, shout-out to Tailscale, because I started using it recently and it apparently just works, magically. So I have a little Raspberry Pi at home running a bunch of Prometheus, Grafana, etc., and there is an excellent exporter which can show you... I'm really sorry, where is it... yeah.
D
G
GPU metrics, and that shows you your usage. So you can see here memory utilization — that's when I was doing that demo earlier — memory allocation, what else, power draw, yep, 150 watts, so this is definitely not zero, etc., etc. So we can see that usage, and we can probably use that to be smart about it: this workload only needs that much VRAM, so I might be able to colocate it with that other one, etc.
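(Editor's note: once such an exporter is scraped by Prometheus, those gauges can be pulled programmatically too. A sketch against the Prometheus HTTP API; the server address is hypothetical and the metric name varies per exporter, so the one below is only an example:)

```python
import requests

resp = requests.get(
    "http://prometheus.example:9090/api/v1/query",      # hypothetical server
    params={"query": "nvidia_smi_memory_used_bytes"},   # exporter-specific name
)
for sample in resp.json()["data"]["result"]:
    print(sample["metric"], sample["value"])
```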
G
However — honestly, we don't have stuff on Kubernetes or Docker to say "I limit you to that much VRAM and GPU compute", etc. Not that I'm aware of, at least. I wish I could remember the name of the person who made that exporter, because I shopped around for exporters, and there are half a dozen of them, and this one is definitely the best.
G
I will tweet it or whatever later, because he really deserves a hat tip as well. Yes?
G
Right, yeah — on the data center line, there is some partitioning thing going on; I have no idea. Oh, one thing that's also worth mentioning, just in case you're dabbling with GPU stuff: there are some features that are not available on consumer GPUs. For instance, if you do video encoding, you can only do two streams at a time, which is pretty ridiculous. So there is something called nvidia-patch, which patches your DLLs or .so files.
G
What's the best framework for distributed training on Kubernetes? Honestly, I don't know — I haven't trained a model yet. I mean, I've looked at a bunch of videos now about, you know, the classics, like the handwritten digits and the rock-paper-scissors ones, etc., and I'm like, okay, tomorrow I'm going to train a model. Tomorrow. But I haven't done it yet, so I haven't dived into that yet. I'm sorry, yeah.
G
I haven't gone that far yet, but yeah, I'm going to remember that, because I think it's going to be useful in the near future.
G
That is an excellent question: which capabilities and permissions do we need in the container? Wow — let's see if this is going to work. Okay, I'm going to guess, but it's just an educated guess: when we use the Nvidia runtime, it's going to grant the permissions, and that's it, because here I'm just doing a docker run and it works — I didn't need to add extra capabilities or permissions.
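(Editor's note: the "docker run and it works" part maps to a GPU device request. A sketch with the Docker SDK for Python, equivalent to `docker run --gpus all`; the image is just an example:)

```python
import docker

client = docker.from_env()

output = client.containers.run(
    "nvidia/cuda:12.0.0-base-ubuntu22.04",  # example CUDA image
    "nvidia-smi",
    device_requests=[
        # count=-1 means "all GPUs"; same plumbing as --gpus all.
        docker.types.DeviceRequest(count=-1, capabilities=[["gpu"]]),
    ],
    remove=True,
)
print(output.decode())
```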
G
To be honest, I have no idea if the Nvidia runtime is just dropping the SELinux or AppArmor profiles and capabilities, or if it's leaving them more or less as they are. I don't know. So I have no idea if using the GPU in that case completely defeats a bunch of security features. I don't know.
G
Oh yeah, and the last thing: usually the last slide of any presentation is "we're hiring". I'm not hiring — I'm just by myself. So usually I also add: well, I'm doing Kubernetes training, so if you want to train your team, get in touch. Except I don't really have any availability for months and months ahead.
G
So, no. But if you do cool machine learning stuff and you want to hack together — you bring the machine learning stuff and I bring the container stuff, and maybe we can do fun projects — hit me up. Thank you.
A
Yet another meetup! I just wanted to summarize what we're doing with the meetups here, in these organic community-building efforts of ours. It's supposed to be a 10-minute slot, but I'm not going to speak for that long, because you hear me around all the time. So I'd love to meet some of you, maybe later, and I'm so happy with all of this — the numbers that we're seeing, the turnout here. This is important.
A
This is important for us to keep going — for all of us, in fact, for this momentum to also keep going. But I also wanted to touch upon some things in the community-building efforts and initiatives and the processes that we're looking at. As you can imagine, it's organic: it very much involves the community, and it very much involves everyone coming together and doing these things together.
A
I think it will happen slowly, but this is just our third meetup and, as you can see, the numbers are already so much nicer and bigger. We're constantly trying to rope in great speakers — like Jerome and this amazing talk that we had today, and all of the other amazing speakers that have spoken here. But I also wanted to briefly touch upon a component that's very near and dear to us, which is diversity, inclusion, and representation.
A
All of that — especially because, in choosing speakers, getting these kinds of programs running, and choosing the topics that we want to talk about and would like to hear more about, we'd like to hear more from you: about what topics you'd like to hear more about, and about the speakers you can recommend, the people you want to encourage and push forward. We're constantly looking for ways to best represent the diverse perspectives in the community — diversity both in terms of perspectives
A
and your companies, and also in terms of, you know, women speakers coming forward. Sometimes it takes that little bit of a push, I can tell you — and it's worth it. I can tell you as a woman, too: sometimes it's just worth it. So please do push women that you know in the community, and anyone else who'd like to speak, irrespective of their perspectives, gender, etc. We just want to see a broad representation here.
A
We just want to bring you a broad spectrum of speakers from across the community, and we want everyone to be heard. And this is really not something I'm saying just like that — you can ask the other co-organizers how much I keep pushing and pushing: hey, can you talk to this person, to that person, do you know someone?
A
Would that person like to speak about this totally different thing, or from this totally different affiliation that they're from? And sometimes I also get asked questions like: oh, but you know, they just had this product release with, for instance, Amazon or something like that — wouldn't it be a conflict of interest? And I'm always like: no, that's precisely what we want. We want that in the mix. So don't be afraid, and don't keep overthinking these things. Just apply.
A
I keep saying this a lot, and yes, I think I need to stop harping on it: find Kinvolk on Twitter, find Kinvolk on LinkedIn — yeah, find me on LinkedIn. I'm happy to talk to you and exchange contact details. Yes, and Mastodon very soon — I realize that we have to get there anyway. But yeah, find anyone in the community there, try to reach out, and give us some ideas. There's also a feedback form that I've circulated.
A
I will be circulating it very soon on Kinvolk's LinkedIn as well. So if you did like today's program, if you have any thoughts whatsoever, do share. I understand it's not easy for everyone to just come forward and have that dialogue or conversation, but we're really, really willing to hear from you. So please, please talk to us. Find me, corner me, talk to me.
A
I do want to chat with everyone — I want to chat, that's why I keep saying chat, chat. Yeah, but on that note, I think I should say "chat" again. Encore! And yes, that's it from my side; I'll hand it over to Aditya. I do think that our turnout is so great and wonderful today, and we had Jerome speak, and I really made the time to actually sit down, attend his talk, and listen to everything — and I learned a lot.
A
G
If you don't want to be in pictures, you can, you know, lower your head or hide or something — no, I mean it, it's important to mention that too, because not everyone wants to be in pictures, and that's totally fine. Okay: say cheese, whiskey, or say a happy thing! Okay, one, two, three... and there.
C
H
So, just a couple of things. I have a 30-minute talk now about community building and so on... no, I don't, I will let you go. I have a two-minute note about logistics and the next meetup, as always. So the next meetup is going to happen next month; we are going to keep this streak going. We, as always, are looking for — I don't know what we're always looking for — feedback, as Benazir said, and we are always looking for help, as you can see from us putting up this audio-video situation.
H
If anyone's an expert, please join in, show up — we would love to see the YouTubers in the crowd experimenting with some of this stuff. I think I'm going to take on the challenge of hosting the next one, with help from Benazir and Chris, as always. So see you at the next one. Again, a final note: everybody is welcome to speak. We want to hear your experiences, whether they are about Kubernetes, about containers, about any technology in cloud native. So please do tweet, do reach out — as Benazir says, you can find us anywhere.