OctoML Apache TVM Community, 20 Apr 2022

Previous Meeting Next Meeting

⏯

youtube image

►

From YouTube: Apache TVM Community Meeting, April 20, 2022

Description

https://discuss.tvm.apache.org/t/next-tvm-community-meeting-apr-20/12593

A

uh Okay, so welcome everyone to the april 20th edition of micro tv community meeting. um We have a great topic today where um youtube's gonna present on on relax um before we get started with that. um I just wanted to take a moment. uh Anyone who's new. uh If anyone wants to briefly introduce themselves, tell us a little bit about what got you interested in tvm. What you're interested in working on um feel free to do so now, we'll give a second here for that.

A

I do see a lot of familiar faces here, uh so um not expecting a ton of introductions, but um just wanted to give a brief moment for that um uh next. Okay, so we have basically one agenda item today, um which is just to discuss, relax um and just does anyone have anything else they wanted to bring up um or make any changes um kind of late breaking changes now.

A

All right hearing, none, um let's go on to the announcements. So um just a couple announcements. This week we have a new committer lou cutton, as well as ellen calda as a new reviewer, um and I believe, luke worked on some of the arm seems to end stuff, as well as uh some of the aot stuff and ellen works on uh some of the micro mpu stuff, as well as well as cascade scheduling and pattern matching. So um welcome welcome and excited to have you guys as part of the community, as always.

A

So uh with that um I'll hand the floor over to you chen, so we can hear about relax.

B

Cool uh I'm gonna share my screen go for it cool. um Can you see the slice.

A

Yep we're good here.

B

Awesome yeah uh thanks everyone for coming today. um It's really a great honor uh to talk about, relax uh and share our efforts with the whole community and relax is a collaborative effort that many community members are involved and actually a lot of whom are in the audience today.

B

So the main purpose of this meeting is to foster open discussion about relax, so feel free to interrupt me at any time. If you have any questions and for the relaxed contributors and feel free to chime in during the talk. So it's a little bit informal, um and so let's talk about why we started the effort. So relay has been the graph level uh intermediate representation in pvm since 2018, and almost one year ago a bunch of community members asked asked such a question.

B

What are the main, what are the pan points of relay and how can we evolve relay to adapt to today's version, workloads and needs? So the tpm stack for machine learning.

B

Compilation consists of four layers of abstractions at the high level, there is a computational graph which represents a machine learning model and is relayed in pdm and the tensor programs which implement the kernels for the operators and also called like subgraphs in the computational graph, and this is tens ir in pvn and under that there is library and runtimes which include code to execute these tensor operations, such as tvm tech functions and vendor libraries, such as kudiran and so and at the bottom, is the hardware primitives, which include low level assembly code, and if we look at the compilation pipeline today, it follows a multi-stage lowering approach.

B

That is, we have one ir at each level of abstraction and we do some internal optimizations at each level and then we lower the whole machine learning program to the next level to continue the optimizations.

B

Is this a good approach as tng presented in the tbm unity talk in pvmcon last december? There are clear boundaries between these vertical layers, so, firstly, the translation between these vertical layers are done at one shot.

B

In today's tvm, for example, the lowering from relay to tr programs are done in a single pass and it makes it really hard to lower part of the program and incrementally lower different parts of a program. In addition to that, the lowering goes in one direction. So once the relay program is lowered to tensor ir program, uh it's hard to send the auto tuning information back to the high level to rewrite and optimize the high level graph, uh for example, the auto tuning.

B

I might have some information about like we want to rewrite the width layout to get better performance, but it is hard to do to do it today and, lastly, adding customized op is really hard today, because it has to be added in several places in all of the layers.

B

So the first goal of relax is to build a unified interface that breaks the boundaries of the pvms abstractions, to allow for interactions between these layers and and enable call optimizations.

B

So our goal is to evolve the multi-stage lowering approach which we showed like there are one direction arrows into some circle like the graph here, which allows organic interactions between these abstractions.

B

So to achieve this goal, the key design point we made in lex is to allow the high level r to be able to directly uh interact and, according to the low level, cancer, r programs and runtime pack functions.

B

So here is a program written in pvm script, which is the python dsl that is widely used in cancer io. Today we can see that this iron module contains uh two functions, uh a case, ir print function and a relax function.

B

So this relax function uh represents the high level uh computational graph and the tr program represents a kernel implementation and we came up with two intrinsics in relax to bridge the gap between different abstraction levels.

B

The first one is called pact. It indicates that in the high level error we can drag it according to a runtime pack function. It can be like third-party libraries like coding function or like some customized functions, registered using tbm, ffi and people can write it in python and incorporate that in your high-level high-level computational graph, and the second interest intrinsic we introduced is a kotia which allows user to call into a pr function and also it actually allows for uh calling into a pack function, uh biking an uh immutable way.

B

So the pr in this, the core tr the tr here- means like a tir calling convention. So it's not limited to pr program, but the most use case is to call into some tr function as shown here. So basically we have this uh dr function and the second argument is the input input tensor here and the third argument is the output shape and, fourth is the output data type.

B

So, basically, uh through these two intrinsics, we bring the low level cost into the high-level data flow land, and this will unlock a lot of like interesting opportunities.

B

For example, I think michael here he once wrote a test ir program and he want to add that to his uh model, but he found out. He needs to modify a few places to be able to, according to his customize the tensile program by inside relax, you can actually write it in tvm script and directly construct your model, and you can write core tr into your customized kansai program and also like.

B

If we want to introduce some op, which you don't want to implement in tbm, you can actually write it in python and wrap it with the tbm ffi and directly use call pact to according to that uh python program.

B

And secondly, it also uh makes the incremental lowering uh possible. So, basically, you can incrementally lower different parts of your program and at any stage of the transformation, the program is represented with an ir module which can contain both pir print functions and relax functions, and it also allows the automation decisions like the beta schedule to take a core tr node and perform some optimizations and rewrite it into several. uh Several core plan nodes that inform the layout of writing decisions uh to the high level.

B

So it has the enables the capability to uh send some feedback to the high level and do the rewriting for both the high level relax functions and the low level uh test ir functions together and also it brings the byoc flow as a natural part of the transformation, because we can transform part of the graph into some course of some opaque type functions. So these are the opportunities we can unlock by using. uh I have unified ir yeah. This is the first goal of relax um yeah.

B

If no questions I can go on but feel free to present questions.

B

Yeah cool the second goal of relax is to support and optimize dynamic shift workloads, and we can obviously see there are increased number of real-world models today with dynamic shapes and the dynamism can come from a variable input shape, for example, the dynamic batch size and dynamic sequence length in your keywords, and that dynamism can also come from data dependent op such as unique operation.

B

So the output tensor shape uh depends on the input data and we don't know the uh output shape uh until runtime in relay. We have a the relay dot any we can represent that we can represent that unknown dimension during compile time.

B

Using this any, uh for example, uh we have a flattened uh function here to flatten a tensor, but but in relay we can see that the before flattening tensor, a the first dimension is a question mark, uh meaning unknown and after the flattened uh function, the output tensor one dimensional and its first dimension shape is also unknown. So actually we don't know the shape relationships between these two tensors, a and b, but in relax. We can use symbolic shape to represent a and b shapes in the input tensor.

B

A the dynamic batch size can be represented as a symbolic integer n and the input tensor shape can be represented as this right and it gives us additional information at a compile time because, after the flattening function, we know that the b's function, sorry b shape can be represented using uh this equation, and we know that actually tensor a and tensor b have the exact same cancer size. So potentially they can reuse uh potentially b can reuse the same memory space as a so this kind of shape.

B

Relations between the tensors provide more uh optimization opportunities during compile time.

B

So in order to achieve that uh that, in order to achieve that, one design point in relax is to make shape deduction as the first class uh computation. So this uh example last program covers uh some typical scenarios in shape reduction.

B

So, for example, a shape can be constructed by the the symbolic integer, for example the n here, and we can then use a expressions of these symbolic integers to represent some shape calculations and secondly, we can also implement some opaque shape function like the my shape function here, and these opaque shape functions are used for fallbacks to quickly hack, up a runtime shape function, and lastly- and I think is important- is that for the data dependent op like the unique here, we introduced a construct called uh mesh shape in the relaxed ir.

B

So uh so, at compile time, you can see that the uh the output uh tensor of this unique function uh is shape, is runtime dependent, so it's unknown during compile time, but then we can use this uh match shape to refine its shape. Here we we can match this lv5 with this symbolic shape, so n will be defined here, and then we can use this n, the symbolic integer in the following program to uh to carry out optimizations.

B

So this is second goal of relax, which is enable uh symbolic shape and also we have the general support uh which which is like, even though we have unknown shapes throughout the program, we will need to make sure the program can be compiled and executed in tbn.

B

So the third goal of relax is to support computational graph uh writing with advanced semantics. So what does it mean? So? Most of the machine learning engineers today are familiar with the concept of computational graph and, and these optimizations are under the assumption that every operation in the graph has no side effect, which is many passes implemented in relax. Oh sorry, in tbm today have this assumption.

B

This optimization is clearly useful for a majority of the optimizations, but, as we start to working on, uh for example, enable training training in tvn, uh we need to work with uh some effect for uh sorry, I don't know whether it's effect for or like effect yeah effect, for, I think uh operations like random number generations and the weight updates in place updates, and we also need to be able to represent programs that contain uh some complex semantics, such as in place, updates and control flow which have side effects.

B

So we need to find a way to enable most people to write computational graph optimizations accordingly, while still being able to represent these advanced semantics, like the in-place updates and and the control flow.

B

So here we introduce a data flow block construct in relax ir, for example, in this relaxed function. Here uh all the operations under the data flow block uh is side effect free and does not contain any uh advanced semantics like control flows or like if-then-else or nested scopes, and a data flow block can effectively be viewed as the computational graph and most of the bindings here, like the lv 0 within the date of the block, is local, which means they are only visible within this block and these, but these variables can be.

B

We can mark them as outputs, in which case the variables will be visible in outside of its data flow block. It can be visible in the later part of the program like we can use gv0 here, and it can be viewed as the output nodes in the computational graph and note that the uh custom in place update the compact function, call is outside of the data flow block, so everything outside the data of the block can have side effects in relax.

B

um So to conclude that we expect most of the optimizations will happen at the data flow block level, and these optimizations can be done by machine learning. Engineers who are familiar with the pure data flow graph concept and we isolate and represent the effect for components which are outside the data flow block, gives opportunities for more advanced optimizations, for example like the in-place updates and so on.

B

So uh to conclude, uh relax is a compiler system dedicated to the tvm unit division with three major goals. So, first of all, we want to unify the abstractions to enable cross-layer optimizations and secondly, we want to unify the static and the dynamic shape, and also we want to support uh the computational graph style.

B

Writing with advanced semantics like in place, updates and and the control flow which allows for optimizing both for training and influence, and we have been doing uh urban development for a long while- and here is our apple and also we open sourced uh all of our design. Docs here and we have the sub channel in the tvm, discord feel free to join and we have our weekly development meeting every week. People will discuss uh the progress and also the technical discussions, and we it's very interesting.

B

So these are the uh slides I prepared, but I do have some demo jupiter notebook and I will also leave the acknowledgement in the end. So do you guys want to see some demos.

B

C

C

B

Yeah, please bring out questions if you are confused about anything I just talked about and throughout the demo yeah. So, let's see we. Firstly, we import a bunch of python packages and then uh let me show you how to uh build and run a neural network in relaxed.

B

um Here we use a class called blog builder, which is to construct the vlsi, and we want to build such a three-layer neural network, very simple mlp and we can in relax. We have a convenient uh interface called a module which is very similar to pi dodge interface, which allows us to quickly build up some simple neural networks.

B

So basically we can use a very similar api, like as the pi touch, so basically we have unsequential uh and inside we define each layer, uh its input, size and the opposite side, output, size and so on. So basically it's a very simple mlp, and here we define a tr symbolic variable to represent a dynamic batch size and we define the data and the parameter and so basically use the builder to construct the relapse function and, let's see what we will get.

B

So here is the ir module. So in the io module, so basically we we can. It use the mlp using this uh module interface. So here we can see, uh we have a relax function and the relapse function. It has the data which has the first dimension as symbolic uh variable n, because uh it's a dynamic batch size right and inside this uh main function.

B

We have a bunch of courtyard and this courtyard according to the print functions generated by this module interface and under hood, it use something called emity which I will elaborate further later so basically, according to these print functions, and these are the inputs to these print functions, and here is the output shape and the output d type.

B

So these print functions are automatically generated by the mdt in in relaxed block builder, and after that we want to build the model so basically compile it and create a relaxed vm. To execute that so note that the the symbolic variable n is threaded through these relaxed functions. So it's a dynamic workload.

B

So, let's see we initialize the weights randomly and here uh suppose our input data is uh basically is a. It has dynamic uh batch size right. So, let's generate some uh data with batch size of three and we call the main function, which is that mlp and fill in the data and the random parameters so yeah. We we can see that we, we see three rows uh in the output, so basically it can.

B

It means that it outputs the uh like the image classification for your three input, images right um and let's try like data with uh the the uh batch size of five, and you can see that it can be also be executed. So this demonstrates that uh that we can have symbolic shape in a relaxed program and then we can build it and uh and run it in in relaxed vm and under the hood.

B

uh We use something called emity um in relax to automatically generate this core tr cores and the corresponding print functions, and it's actually very simple and neat.

B

So here we want to define some linear layer, and these are just to initialize uh some uh ways and bias and the main implementation is in this forward uh forward function. So it takes some inputs of type relaxed expression to a relaxed variable and you just use the emity function, which takes a topi function and given the relaxed variable.

B

So this is less variable. This is also a relaxed variable. It will emit the right cortial core and also generate the print functions. um So the reason why uh it's so simple is because the t language in pvm also has the symbolic shape as first class and also in relax.

B

Each variable has a symbolic shape and note that the constant shape is also just a specific case of uh symbolic shape. Right, so basically it it can extract the uh extract, the symbolic shape from the uh expression and fit that into existing topi functions and generate the corresponding tr function and the cortial nodes.

B

So yeah here is just another example, so basically we can still use blob builder and inside the blob builder inside we want to generate some function called mlp and we use some data and weight, and we here we just use the blob builder's meeting feeding some. For example, third party c blocks memo and feeding any like relaxed relaxed uh variables, and then, let's see what we will get so here are just also like symbolic variables um and the data weight are all are both dynamic.

B

Yeah so basically, it generates uh the mlp function and it generates two uh core tr one according to the matmo uh prim funk and one coin to the value. So basically, uh this uh integration of uh relax and te and the topi is very organic because both are based on symbolic shapes.

B

So we can also run it with some like dynamic shape data right and in uh next I want to introduce a uh the the the thing we currently use to import models, uh which is which is relate to relaxed translator. So, for example, uh we can get some workloads uh in relay. We can get some workload by using the testing api right and in uh relax.

B

We have a relay translator and you can just call relay translator.from relay and feed in the relay function and it will translate that to a relaxed ir module and, let's see what we get so.

B

Basically, we we get it's pretty long, because it's resnet and currently the printing is, uh is not like ordered, meaning that the relax function is not necessarily at the top of the module, but we can refine that at later, if, uh uh if necessary, so basically the relaxed function is uh is like this um because, like uh the relay model, is a static shape. So basically, you can see here we generate a bunch of like core tr and the print functions are automatically generated, and I want to emphasize that this tr programs are just naive.

B

Pr programs generated by the t functions uh and obviously we can uh with the integration with a meta schedule. We can tune these kernels. So, uh let's see uh uh here is the tuning code. um We can use key uh tr script. Oh sorry has a pvm script to uh construct the uh input module input, io module. It contains like very simple pr print functions and inside this uh relax function.

B

We call into these two tr print funds, and here we just like call the meta schedule, tune, relax function and it will tune the kernels, as you can see here. So it's tuning and I set very small tuning trials.

B

And it is succeeded um and it outputs the result, uh so this is uh relaxed, has already been totally integrated with meta schedule for the training purpose.

B

With that, I think yeah, that's the demos I want to show and do you guys have any questions.

D

I I have a question, but it's not um specifically about this. This notebook.

E

I was wondering if you.

D

Could I was wondering if you could talk a little bit about um how the type how typing works for relax um yeah, it seems like you, might need dependent typing. So I just just thought it might be interesting right right actually.

B

The type scene uh relax is uh like is different is different from relay, so basically, we separate the shape from the type and for tensor we have a dynamic tensor type and the type itself only carries the rank information and also the d-type information, and each expression has a check type field and the chat type could be dynamic. Cancer type with the rank and detail information, and each expression also has a shape field and the shape field can be like uh any expressions right, it can be, uh normally is uh a.

B

uh We call it the shape expression, it's just an array of premium expert. So basically, you can use the tr ti variables uh symbolic to uh to represent some symbolic shape, so we don't have dependent typing in relax.

B

Does that answer your question? Sorry, I I don't know who I asked.

D

B

D

Uh-Huh yeah yeah um yeah. I think so I'm kind of a little bit surprised. We don't have dependent typing because it seems like it would be required, but maybe I'm I'm wrong. There.

F

Is another round dependent on typing in a sense that you can also equivalently introduce a dependent type where your tensor type depends on value? But but then that's at the additional complexities, because because in this case we find that you know, shape is kind of we kind of don't want to get into depending on the typing line. Initially.

F

So so that's why you know it's equivalent to dependent typing but effectively we're separating types, so that type are can still are still non-dependent, but straight values themselves are, are runtime dependent and we can use things like constant, propagations and other things to to still get some of the uh constant shape. Information in analysis.

D

Okay, so the the shapes themselves are kind of dependent but to a limited extent,.

F

They are dependent, but they are not part of type right now.

B

Any other questions.

B

Oh and uh andrew sorry, I cannot hear you, but I you are talking somewhere. Okay,.

A

um So one question I had was um uh just around, so uh you know it seems like one thing we could do in in relay before um that might be harder to do and relax is to express uh like a program without specifying the implementation of of each of the layers um is that I was just curious. It seems like whenever, whenever you call the builder, you know create module here. It's always choosing some implementation, even if it's a really naive implementation for each layer.

A

So I just wanted to kind of better understand that is that also true, basically, that um now it will at least sort of choose a naive implementation that might be later revised by meta scheduler.

B

The implementation you mean the naive printfund.

A

Correct yeah, yeah, sorry yeah, so for each each model layer it seems like. Whenever you create the module, you can't create a module that just has the relax function and none of the implementations of each layer. It has to have a prim function to at least sort of provide a naive implementation.

B

Actually, it's sort of like more flexible in relaxed, so, firstly, you can a user can implement their own test ir program and use the core tr construct in uh relax ir uh to according to that, and secondly, we can use, uh for example, compact uh to directly according to some uh runtime library right and thirdly, we will have a relaxed uh op system uh uh future.

B

We currently don't, which currently sort of like rely on unity, so basically uh uh emit they're, also, according to some key function, to generate the corresponding naive print functions and use uh the uh use the rely on meta schedule to uh tune that, but in the future we will introduce ops, yeah.

A

B

So it sounds like.

A

Like what I was noticing here was when you're importing a relay program into relax, it seems like it schedules everything right then and there, and so I think what you're saying is that in the future, there'll be a corresponding set of relaxed operators, um and perhaps it wouldn't necessarily schedule right away when you converted a relay program into relax exactly.

B

We will actually have uh direct importers from, for example, pi touch to relax because, like you know the relays in reliance, there are like many any stuff, and you cannot translate that, obviously to the relaxed land and the relax has more expressive ability than than relay so we have to directly import from, for example, guitar script or onyx, with the shape information there yeah.

A

B

Anyone else have.

E

A

I don't want to take the whole floor.

G

Yeah, I'm a little interested in the um you know. You talked a little bit about the progressive um lowering rather than having to do everything in one shot, as you kind of just touched on yeah part of that, at least from like the development perspective is being able to inspect, uh inspect the relaxed ir and the corresponding tir at like every step of the transformation.

G

I don't know if this is possible, like one of the things that we lack today with relay is even really just the ability to see what the final state of relay is, after all, transformations before it's lowered in the coach, and I wonder if there's any just like user-friendly way for us and relax to see at, for example, after constant folding or after um fusion to be able to inspect um the transformation at a given sp at a given step and then similarly well actually I'll follow up after one other.

B

Exactly exactly that's actually a main goal of relax, which I didn't uh touch upon uh so basically in relax. We want every transformation to be ir module to our module transformation and which means that each part is just a modular module and the path can, for example, inspect uh part of the relapse function and transform that into either like, for example, some core tr and some tr nodes, uh and also it can, for example, uh transform some relax function uh uh to saying.

B

Okay, these functions need to go into another coget, instead of going through the tr right. So basically, uh each uh parts we want make sure in relaxes our module to our module transformation and for each io module we want, we will make sure to be able to print it out using the tvm script and we have a. We will have a minimum build, which means at any stage of the transformations.

B

If the ir itself is valid, we can always call the build to build the whole module and run it in the last vm and the main benefit of it is that uh when we want to introduce some like tuning pass, which means that we need to explore some candidates and transform the uh transform the uh release r module a little bit and then build it and measure it and send that feedback uh and basically is a for loop and select the next uh candidate, which is some working on currently. So.

B

Basically, the main point is that we want to make sure that uh each uh transformation in relax is our modular module transformation and we can build our module at any stage and visualize it using the tbm script. 3D printer yeah.

G

I see so in this case you're saying that um in principle the optimization step is separated out from build exactly.

B

Okay, that makes sense exactly so. The build is just a very minimum and all the optimization passes, including a better schedule tuning, is just one pass.

G

And we can in principle, then I guess tune at any spot too. If we felt like, for example, we have only incrementally lowered, like some portion of some subgraph of relax. We could focus on tuning and evaluate based on whatever the output, assuming was. If we is that, like all right, if this was the appropriate transformation of the relaxed level, yeah.

B

It's the like. We call it modular uh compilation uh in relax, yeah, that's great stuff cool thanks, yeah thanks.

A

Yeah one question I I also had, since I don't see anyone who has their hand up or anything, but um let's just uh I wanted to understand a little bit better about kind of um where you guys saw this um kind of how you saw it landing in in tvm. um First, just a little bit about it.

A

You know it seems like from the examples right now you're calling things using this relaxed vm and um you know one naive question is: is this the same thing as the the vm inside the the current tvm code base.

B

Actually uh is different, so relax vm is different uh from uh relay vm in several uh parts, so, firstly, we want to have the shape computation in our vm relax the end. So basically, we maintain some ship heap and the shipping manufactures uh there in relation. Secondly, relay vm actually has a a lot of, I think 20-ish instructions and you can relax vm. We dramatically decrease the number of instructions and have a call instruction called the call pact.

B

So basically, the copy function will, according to a bunch of like built-in functions, to uh achieve the same goal as uh like relay v and which uh has them as separate instructions, but it's uh just another uh design which is like reducing instructions that um and so uh so. These are the two main differences uh between relax, vm and the relay vm, uh and for the uh migration upstreaming plan uh for this quarter. We want to robustify the key components of relax and we imagine like during this process.

B

We want to keep communicating with the community and declaring feedback and improve the core structure of relax and next quarter. We want to upstream relax as a separate co-path which will not affect the relay co-part, so they will co-exist.

B

So, uh basically relaxed will have everything in in its like folder and that people can actually try out, relax and also, uh and also uh contribute to relaxing tv. So yeah.

A

Cool okay- and I imagine there's some- I think you mentioned that de- has like symbolic shapes and relax. So there's probably some overlap there to some degree right. No, actually.

B

And so actually, we won't like modify t because t e has the uh also has the first time symbolic shape. Okay,.

A

B

Yeah, so basically in the high level, we have uh also like first class symbolic shape, so they actually interact with each other very uh organically right. Okay, so you haven't modified that yeah yeah yeah. We don't have to yeah.

A

Gotcha um and then um you know, I guess one. What did I want to ask? Oh you know uh you know around. I gosh it's been maybe six months at least since this came up, but.

E

A

One thought that um had been kind of circulating for a little while was whether or not it would be possible to essentially map the vm lowering flow onto the aot back end, basically and essentially, code generate all the vm instructions into an aot program. um I was curious, just it actually almost seems easier to do with relax since there's less instructions, but instructions might be a little bit more dynamic. So in terms of memory usage- and things like that, so I was curious just to hear your thoughts on that and cool.

B

Yeah yeah, actually like we imagine like relaxed vm, is, uh will be just like one of the executors in relax and we will uh have like aot excluded in the future right, as you said, and uh but currently we uh don't have much time uh spending on that. We just rely on relax vm, because it's like flexible yeah and in the future. We imagine we we will have aot executor, which could get uh give us more performance benefit.

G

Cool yeah, yeah yeah, it seems kind of like in your data flow blocks. It would be pretty straightforward to just as andrew was saying just codegen directly to an aot function that is evaluating calls on the tr um that has been lowered and and then it'd, be interesting.

G

Interesting to think about, like the lowering the dynamic aspects is, I feel like this does fit really well into the discussion previously about how we would maybe consider lowering static and dynamic regions separately.

G

Since it's uh um dynamic, eot programs, um I think, are definitely a challenge, but it just seems like it brings us a little bit closer, at least from respect to the runtime.

E

G

E

Would I would also like to second that uh to if you can, it will be really useful if you can explore to codes in those utility functions that will be baked into the relaxed vm.

B

Sounds great, we will definitely uh start attending that. I think you should be like q4 at least this year yeah after we upstream yeah yeah.

B

Oh, I cannot hear you andrew.

A

Sorry, I'm just gonna say: michael, do you have a question that there.

H

Is muted uh yeah, so I I would wanted to ask a little bit on the future questions uh for future work. Maybe this is a bit redundant to things that are have already been said, so I would be interested in. uh I mean the usability aspect so usually, when I talk to people, they give me an onyx model or some some file, and they just want to use it. I mean we have discussed this in the forum. I think um so are there already plans?

H

uh I mean how many steps have to be done until uh the the front ends are available, that can generate relaxed code.

B

uh You may make importers right.

H

B

Yeah yeah, uh I think, um for this quarter. uh What we are doing currently is that a few folks here, like uh suryan, reihang and hongyi, they are working hard on uh matching the performance we relay in the in the static shape, uh workloads and uh using meta schedule, and uh basically we also want to introduce a few like uh fusion like passes like fusion right, and we want to actually match the relay front-end performance on a set of like representative models, and then we will benchmark some dynamic models.

B

So currently we plan to have the importers.

B

For example, we will. We talked about uh the pathogens in powder uh with folks like jungwoo and aws. Folks, like cody and eg, we want to have the employer started, implementing, I think next quarter, and then we will, if we have more contributors, join, which is very welcome. If you want to contribute to relax, um uh we we will definitely have like honest importers uh implementing uh uh to directly uh import models from rx.

B

I it's in the plan, but currently we haven't started investing time on that yeah thanks thanks and I yeah uh I I want to like. Take the rest of time acknowledge a few core contributors who contributed a lot to relax and andrew is from udab and he implemented ebt and now he's investigating uh uh enabling training uh in relax and the syrian.

B

uh I think many of you have heard him, so uh he is, uh he implemented the fusion pass in relax and did a lot of tuning and homie is also uh working on the tuning and he also introduced a few optimization uh passes to relax uh like constant, folding and so on, and the dream rule has been always uh discussing a lot with uh uh with us in either the like the open divide, meeting and internally and thanks a lot for uh for a lot of uh insightful discussions and the illusion.

B

The github profile is, is a bit of funny that he has has been contributed to the passive structures like how adding the python uh pulse decorators to relax and uh he will be working on memory. Dynamic memory planning uh in relax and riyang has been working on uh developing passes, uh especially the fusion parts and summarized a lot of like past patterns, and I learned a lot from him and he he added a lot of improvements to relax.

B

Vm and prep hub has been adding uh shape, support like runtime dependency, and he is now working on onboarding docs and the language back uh relax. Hopefully it will make the like on board more more and more contributors, and the song has been working on a lot.

B

uh A lot of things like the uh meta schedule, uh integration with relax and also uh the passing structure and the tuning parts, and a lot of like modular compilation, flow design and uh thanks song and uh pq uh always uh supports uh relax a lot, and uh we haven't learned a lot from him and uh yeah and the young uh is now working on a control flow, supporting relax and he has been added a lot of support to the partner and printer and he will introduce like he will have lstm demo discorder very exciting, and hopefully we can see.

B

Some implementation, like cancer array and direct cancer in in relax and zucchin, has been contributed. A lot to some calling structures of uh reliance, including uh ast, and he uh also like relaxed vm and the uh visitor and so on, and also he implemented the immediately with andrew together, and these are like 12 active development members who work very hard and happily and contribute to relax, and we definitely hope more and more people join this journey and build the uh relax together with us.

B

And uh besides that, we have a lot of contributors who contribute to designs and ideas and roadmaps and so on, including alton uh from octoml cody from aws uh denise from octo lily.

B

Our previous colleague jared george mihales, tristan from octo and bowen from cmu and eg from aws yeah. Thank you for a lot for all of the uh contributions and all the discussions.

A

Thanks uten uh thanks for a great presentation, um it's a really really interesting piece of work and uh really looking forward to seeing this uh land upstream. So uh thanks so much for our presentation. um If there's any other questions uh speak now, otherwise uh we will post notes here and follow up on the uh relax discuss thread please um and I'll just give one second in case anyone wants to ask one last thing:.

E

Yeah, I have one last question: can I hear you andrew yep.

A

Yeah go for it. Yeah yeah.

E

So can relay and relax exist in a single layer, module.

B

uh No, we cannot, because we relax, has a lot of asd changes, so they uh we didn't uh imagine they. They should exist in the cmi module, yeah, okay, yeah yeah.

A

Cool thanks, yeah and again definitely follow up on the rfc thread and ask questions uh further. If you want to explore that or other topics further, um we have two minutes. So I guess, if there's anything else, anyone wants to mention.

A

E

A

We won't open uh too many cans of worms with uh with only two minutes left so uh join us again uh next week. I think we'll be probably back to packaging. um I don't know if we've had a concrete proposal for what to discuss next week but feel free to post up on the meeting agenda thread. If you have something you want to bring up and um thanks again yutin for for the great presentation, yeah.

B

Thank you thanks to all the contributors, actually, they did a lot of work. Yeah.

A

Absolutely yeah thanks everyone for working on this. Thank you guys.

B

Everyone see you later.

A

B

You there please see you.