.NET Foundation Design Reviews, 4 Sep 2018

From YouTube: .NET Design Reviews: Hardware Intrinsics (ARM)

Description

We're looking at these issues today:

https://github.com/dotnet/corefx/issues/26574

https://github.com/dotnet/corefx/issues/26179
https://github.com/dotnet/corefx/issues/26180
https://github.com/dotnet/corefx/issues/26181
https://github.com/dotnet/corefx/issues/26182
https://github.com/dotnet/corefx/issues/26183
https://github.com/dotnet/corefx/issues/26185
https://github.com/dotnet/corefx/issues/26220
https://github.com/dotnet/corefx/issues/26527
https://github.com/dotnet/corefx/issues/26564
https://github.com/dotnet/corefx/issues/26581

A

All righty, so now we are actually live for real; before, we were only streaming.

A

I'll remember to mute myself. So we want to look at the intrinsics stuff that Tanner, I think, reviewed with us a long time ago, and this time we want to go over the specifics that are relevant for the slightly odd design, I guess. So how do you want to do this? Do you want to go with the meta issue first and then discuss the individual items? How do you want to drive this? I think

B

It'd be better to go over the meta proposal, I think. For the most part, these sub-issues we're going to end up approving, provided that the overall shape of the API looks correct, which I think is the important bit. Sounds

A

Good. I think Steve filed those, probably, right? Yeah.

B

A

Filed the majority of the ARM issues.

B

A

Do you want to walk us through them, then?

C

Sorry I haven't really thought about walking you through them.

C

We can. We can talk through them. I wrote these very quickly when I wasn't a Microsoft employee. I quickly drafted this one.

C

This one, the one on the screen right now, was drafted to cover the SIMD issues, so the SIMD subset of the intrinsics, which is what I thought I was going to be working on heavily at the time, and I did work on it somewhat. Because I started filing a whole bunch of API proposals, I started thinking maybe I should address naming conventions and arguments in one place, rather than minutely reviewing every one. So this document really came from a review of the ARMv8 ARM reference manual and, specifically, the SIMD subset of the ARM64 instruction set. The naming convention here was my proposal to try to convert the squirrely assembly into human-readable, .NET-style naming conventions. I think

A

We were talking about this before. I mean, basically, we said that if you look at the instruction set, there are very often just abbreviations. If you just spell them all out, you basically get a name that is probably good enough for most cases, unless it's something that is really, really butchered. But I also don't think we need to innovate too much here, because some of them will probably be very domain specific, where you just have to know what those terms mean in order to make sense of them, and I think that's generally acceptable. I mean, I think the last example that Tanner brought up was, you know, fused multiply-add. You have to know what this thing does, but it gives you enough keywords to Google as well, so I think it's better than some weird abbreviation that is very common but not specific enough. So you said you want to drop the adjectives because we think they're handled by the types. Do we need new types for that, or is it largely the existing types that we already have? I

C

Think the existing ones, like unsigned long, signed long, float, or double, are sufficient, right?

B

Yeah, for the most part, these were envisioned to reuse Vector64 and Vector128 and, if we ever support the SVE extensions in the future, the 256, 512, and upper types, and then the ten primitive types: the eight primitive integer and two floating-point types. Yeah, right. I

A

Just quickly want to check one thing. Take a typical one, SSE, right now. I'm just looking for one so I can at least find the namespace they're in.

B

I think right now these are put under System.Runtime.Intrinsics.Arm, for ARM32 and ARM64, and then most of the SIMD ones are put under a Simd class name. If

A

They also have, directly under Intrinsics, the Vector64, Vector128, and Vector256 types. So then those

B

Are the base shared types between both x86 and ARM, yeah. And then I put a link in the Skype chat, which is to the official ARM page for their SIMD intrinsics. It's listed under their NEON page, and it covers the A32 and A64 intrinsics. The big difference between the ARM32 NEON and the ARM64 SIMD is the additional support for double precision and a few other new APIs.

C

So one of the things that drove the decision to put this in a separate ARM64 namespace, versus ARM32, was the discrepancy in the support of types. In order to make it clear which types are supported, the ARM64 hardware intrinsics were kept in a separate namespace from ARM32, or ARMv7. And

B

MSVC, GCC, and Clang, for their ARM intrinsics, also keep them completely separate for 32-bit versus 64-bit. So there is prior art for that. So

E

I remember a long time ago, when we talked about x86, we said there are different specs and we can have separate classes for different versions. That to me totally makes sense, because a spec is at some point ratified and then it's fixed, so version, you know, 2.1 will always have this instruction. When we separate APIs into different classes based on the manufacturer of the chip and some other things, do we expect it to change over time? Like, for example, today we say: hey, you know, this instruction is specific to ARM, and then two years later x86 just, you know, adds the same instruction. What are we going to do? They

B

Would be, well, they could be the same instruction functionally, but they would be different instructions, in different architectures and with a different ISA check bit, so they'd be different classes.

F

We had talked about, I believe, to your point, taking functionality that was common to multiple different chipsets and also exposing that as more generic APIs, like more typical .NET APIs, in addition to what's being displayed here. But that's exactly the issue: the concepts

E

That can grow over time, and then, you know, there would be confusion, like: where do I look for this instruction? Is it in the, you know, chip-specific type or in

A

The common one? I believe the idea of these APIs is that they are always chip specific, right? So the idea is that they are all supported by the JIT, in the sense that if we call them, the JIT will just emit the exact opcode that the architecture supports. And if you want to emit different code between, you know, ARM and x86, you are supposed to write that yourself. But as I said, there are some that we already put

G

In common, no because there would be, there will be no different in so there would be not, and it.

D

Wouldn't be intrinsic.

G

D

There will be a.

B

G

That even we call.

B

So you might have something like system, manipulation.

B

Could call x86 or you know you have.

F

A software fallback, you know, and that's the important point that Tanner just made: the ones that we've considered moving into a common namespace would presumably have a software fallback. I see. So, basically, if something is common, we don't

E

E

A

I think the other thing is that you need to have the right granularity as well, in order to make software fallbacks not insane. I mean, the vector ones that we have are, I forgot what the measurement was, but it was like 3x slower than the native implementation, which basically takes away all the value, right? And so you would expect something like CRC32 or whatever, where, when you have an intrinsic, it's a one-line instruction. Otherwise you have, I don't know, four or five lines of code that you have to write in C# or whatever that do the same thing. And then you have, you know, a sensible API, I guess. But I think for some of these things, I don't even know what the common API would look like, right? Unless you have the exact same instruction, but I don't think that's super common.

H

Things like pop count are even more important, yeah. This is a well-defined unary operation.
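As an illustration of the "common API with a software fallback" idea, here is a minimal C sketch of a population count. The function names `popcount32` and `popcount32_soft` are made up for this example and are not part of any proposal discussed in the meeting. It assumes a GCC- or Clang-compatible compiler for the fast path; whether the builtin becomes a single hardware instruction depends on the target and compiler flags.

```c
#include <stdint.h>

/* Portable software fallback: clears the lowest set bit on each pass,
   so the loop runs once per set bit. */
static unsigned popcount32_soft(uint32_t x) {
    unsigned count = 0;
    while (x != 0) {
        x &= x - 1;
        ++count;
    }
    return count;
}

/* Hypothetical "common" entry point: uses the compiler builtin when one
   is available and falls back to the software loop otherwise. */
static unsigned popcount32(uint32_t x) {
#if defined(__GNUC__) || defined(__clang__)
    return (unsigned)__builtin_popcount(x);
#else
    return popcount32_soft(x);
#endif
}
```

Callers only ever see `popcount32`; the choice between the builtin and the loop stays an implementation detail, which is the property the speakers want from a shared API.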

H

Every operation.

A

A

I think, on the argument naming conventions, we talked about this as well: we usually refer to them as left and right. I believe that's somewhat consistent with other places.

B

For the x86 ones, we haven't necessarily called them left and right unless they explicitly make sense as left and right. There's a number of them where we've had to call them something like source versus upper instead, because they may have different semantics than just left and right. Yeah.

A

I think that makes sense. If it's a normal binary operator, you know, you can usually swap them.

G

B

One of the other things that needs to be discussed is that, between x86 and ARM, some of the instructions may do basically the same thing, but they may have different conventions. On x86 you've got AndNot, which actually, functionally, does a NOT and then an AND, and you've got the same instruction on ARM that does the AND and then the NOT instead. If we call them both AndNot, that may be confusing to some.
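The operand-order difference described here can be modeled with plain scalar C. This is only a sketch of the two conventions, with made-up function names: x86 ANDN/PANDN complements its first operand, while ARM BIC (bit clear) complements its second.

```c
#include <stdint.h>

/* x86 ANDN/PANDN convention: the FIRST operand is complemented. */
static uint32_t andnot_x86_style(uint32_t a, uint32_t b) {
    return ~a & b;
}

/* ARM BIC convention: the SECOND operand is complemented. */
static uint32_t bic_arm_style(uint32_t a, uint32_t b) {
    return a & ~b;
}
```

With the same arguments the two give different answers, for example `andnot_x86_style(0xF0, 0xFF)` is `0x0F` while `bic_arm_style(0xF0, 0xFF)` is `0`, which is why one shared name for both could mislead.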

A

Sure. To me, honestly, for the intrinsics, I don't think alignment between them is super important. I think the design, you know, how we expose intrinsics, should follow a set of patterns that are the same. But if one instruction set does this operation one way and the other does it the other way around, I would say I wouldn't try to abstract this, because the whole point of these things is that they're so close to the metal that, if you know what you're doing, you can do what you want to do. And that is completely separate from the higher-level APIs, where we try to decide how we expose, you know, a high-level operation

I

To the developers, and that's the place where we try to align. I mean, another argument for keeping these low-level is that presumably sometimes you'll be translating them from a sample or something that is really actual assembler, so you'll be looking for the name. Well, yeah, I would just be careful with that, because

A

If you look at the C++ intrinsic names, for example, they are so ugly and undiscoverable that, well, if you make it a goal that you can just copy and paste some C++ code, then these APIs will not work well with our current design. But I also think that, honestly, ours are somewhat better than

F

What C++ has done. But to be honest, if you're taking one of the C-style intrinsic APIs and just copying and pasting it into a search box, presumably you find the correct thing. Yeah.

B

I would also, I mean, that's the other thing, I think: right now all the doc comments have the C++ equivalent there. So for

H

The binary operators, are we saying always left and right on a binary operator, or only on an associative operator? Because for division, I think we normally name them dividend and divisor, so that we're telling you which one is which. Especially for things like, I guess, DivRem, which is a binary, right, a tuple to tuple, so we need to name that one. Yeah.

A

We can name them a, b, and

H

Result so else yes, awesome, please are different, because every time I need to use it anything I test it to remember which thing is no paper and pencil.

E

For add, left and right; you can use left and right, but for division

H

E

H

So I'm just I'm just asking you to use division. Do we want to just say if I.

F

Think your your comments of only returns with make sounds I.

H

E

Which ones wait? We.

H

Should tell you which one is which. Well

E

So subtract, the subtraction is not, yeah. Left and right make sense, I think. It's more about paper and pencil, like how you write it. Yeah.

B

In x86, and I think in the other places where I normally see binary operators, we also use left and right, just because when you write it in C# code, you do left operator right, and you're expected to know the semantics of what the left versus the right is. Is there supposed to be an intrinsic for DivRem? No, there's a proposal for that. It falls under that general category of: this is part of the base CPU, it's on every CPU ever, but it may have platform- or architecture-specific behavior that is not easily representable in C# code. We

J

Were just referring to, like, what the instruction was for DivRem.

B

So just div and idiv, right?

J

E

The way we have Math.DivRem, yeah, which doesn't use the divide instruction once; it actually does division twice. That's very annoying if

B

You try to print numbers. Actually, it doesn't. I think it does a different computation right now instead, because it's faster than doing two divisions.

A

Yeah, but to your point, I think your question is whether that will be fixed to actually use the intrinsic. That's going to be an

F

A

Implementation.

F

F

B

So the reason it's doing something that's functionally equivalent, but faster than two division operations, is that the JIT today does not know how to do a single divide and also give you the remainder. Yeah.

A

The good point I think you made earlier was that you cannot easily represent the specific behavior of DivRem in C#. Well,

B

You can't express it at the language level, but you can express it in an API with an out parameter, for example. Oh, okay, that makes more sense.
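A minimal C sketch of the API shape being discussed: one division, with the remainder recovered by a multiply and subtract and handed back through an output parameter. The name `div_rem` and the parameter names are illustrative only. Naming the inputs `dividend` and `divisor` also reflects the earlier point that callers should not have to guess which argument is which.

```c
#include <stdint.h>

/* Returns the quotient and stores the remainder through `remainder`.
   Only one division is performed; the remainder is derived from it.
   The caller must ensure divisor != 0 and must avoid INT32_MIN / -1. */
static int32_t div_rem(int32_t dividend, int32_t divisor, int32_t *remainder) {
    int32_t quotient = dividend / divisor;       /* truncates toward zero */
    *remainder = dividend - quotient * divisor;  /* same sign as dividend */
    return quotient;
}
```

For example, `div_rem(17, 5, &r)` returns 3 and sets `r` to 2.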

H

B

H

For DivRem, we named the parameters like a, b, and out c, and it's like, well, is the out the result of the division or the remainder? So going

E

Back to fixing it: do we actually have serious plans to fix it? Because we would use it in our formatters. You know, is

F

The name? No, no, no, the implementation of Math.DivRem. Basically, if we're adding all these APIs, would we go back through the framework and start using them internally? I would hope so, because it would be faster. Yeah, we would want it

E

To happen, because we currently have a formatter that actually does it inefficiently because of this issue.

B

Like for DivRem today, there's a TODO, issue 3439: restore to using mod and div when the JIT is able to reduce them to one. So

A

By the way, are there any intrinsics right now in Math?

B

There are, well, there's no intrinsics; they all have a software fallback. But there are some that will, when the underlying CPU supports it, emit the direct instruction. Like Math.Round, when you have a CPU that's SSE4.1-capable, will emit roundps instead, and I think square root will emit sqrtss. Okay.

A

So it's not quite as bad that everything's works a little bit against always on the way food. There.

I

Which we have the intrinsic I.

D

Know our native.

I

D

I

D

Me more than a stock positions as well yeah.

A

All right, so then I think we talked about most of the things. One question that I had is: do we have any cases where instructions would fail?

B

At least on x86 today, there are a few functions which are x64-specific. They're generally the long and ulong overloads, which will report IsSupported as true because they're in the general ISA, but if you try to execute them on a 32-bit machine, they will fail. That's something that we're supposed to discuss next week with the x86 design review. There

A

Was one thing: do you have any cases where, like, how would we raise exceptions from the intrinsics side? Is this something we can generally do, or can't we? Right

B

Now the ones that will throw exceptions are like, for example, if you've got something that requires a constant parameter and that parameter only accepts, you know, 0 through 128 and doesn't support 129 through 255, it will throw an ArgumentOutOfRangeException, right, and that's done by the JIT. Okay.

C

There's also the not-supported exception when the hardware that's running doesn't support the intrinsic.

B

Yeah, so for any instructions that throw an exception, like, for example, if it requires an aligned parameter, then the instruction will just raise the access violation exception, because the JIT already handles all of that behavior from the instructions. But otherwise they just return whatever result the instruction returns, with no fiddling around with bits or anything.

B

Tell it take care, well it it's just the general like.

B

Maybe Carol can explain it better, but the JIT has all the functionality. Say, if an instruction tries to access memory, then the CPU itself raises some exception internally, and then the OS captures that and raises it back to us, and we convert it to, like, an access violation exception. So there's no "we're opting this instruction into this". It's just that the instruction will do this, and then the JIT has

H

It. So the argument exceptions, you know, like ArgumentNull, that's a check on the way in, not checking an error flag on the way out, right? Right.

A

This is just basically a summary, oh baby. All right is that right, see yeah.

C

This is, if you looked at that table of SIMD instructions, sort of the outline of the classes of instructions. I was keeping track of which ones I had created an API proposal for and which ones I hadn't. I think this is a little out of date. I think there are proposals for a few that have been implemented that aren't in here yet, so the associated API proposal numbers aren't in here. Okay.

A

I think there's a long list of issues already opened that Tanner attached to the original email. All right, so any more questions on this one? Otherwise I would just jump into

B

The actual API shape. The other thing that kind of crosses between both x86 and ARM is this: the majority of intrinsics are a one-to-one mapping with an underlying instruction, and they're basically contracted to emit that instruction. You just don't get to choose the registers or the memory operands; the JIT decides that for you. There are a number of functions which are basically helper functions that represent core functionality, like reinterpret cast, or create a Vector64 and set each element, or set all elements to one value. They basically have a software implementation that we build up using the other instructions, and they're not contracted to emit a specific instruction. For those, there's been discussion and requests on: do we keep these separate between the two, or do we expose these operations on the Vector64 base types or in a general shared class?

A

So my gut feeling would be that I would try to centralize that, because it seems very unfortunate if we have different behavior for things that are completely hardware independent, like setting all slots in the vector to one value. It seems like the semantics should be the same.

C

So one of the issues is that ARM64 supports Vector64 and Vector128, I think, if that's the right name, and x86 supports 128 and 256. So if you start trying to figure out what's supported: keeping it with the intrinsics, we've had this idea that there's an IsSupported property on every one of these classes, right, so that the program can figure it out. If you start moving it outside of the intrinsics, it may be a little harder. Well

A

I would then say that setting all, you know, elements in the vector to a specific value, I would hope this would be a, you know, BCL-provided API that happens to use intrinsics, but I don't have to write it myself. Like, I can just call Vector128 set-all-to-zero, and then on ARM it does this, on x86 it does the other thing, to be as efficient as possible, but I don't have to write that code. But

B

I think Steve's point is, for something like ARM, which doesn't support Vector256, would we also have a Vector256 IsSupported? If you called one of the functions like set zero, would that just work at the JIT level, so the JIT would have to do the software implementation for those specific cases, or would we just say it throws on ARM, since ARM doesn't have the hardware support for this? Well, I would, I

A

Would say, if it's not in the intrinsics namespace, it should not fail, because that would generally be the expectation of these APIs. For the vector types, my understanding was that they will always be supported, because we can always have a value on the stack of that size. I would say in those cases, yeah, you should probably have an else at the very bottom that says, if nothing is supported, just fill it in software. Yeah, it might be slow as hell. But then all operations on 256 will work? No.
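The "else at the very bottom" can be sketched in C with compile-time dispatch. This is an analogy only: the meeting is about JIT-time selection in .NET, and the `vec128_i32` type and `vec128_set_all` function below are invented for the example. The SSE2 and NEON calls used (`_mm_set1_epi32`, `_mm_storeu_si128`, `vdupq_n_s32`, `vst1q_s32`) are the standard C intrinsics for broadcasting and storing four 32-bit lanes.

```c
#include <stdint.h>
#if defined(__SSE2__)
#include <emmintrin.h>
#elif defined(__ARM_NEON)
#include <arm_neon.h>
#endif

/* Hypothetical 128-bit data holder: four 32-bit lanes, no operations. */
typedef struct { int32_t lane[4]; } vec128_i32;

/* Sets every lane to `value`, using whichever path the target offers. */
static vec128_i32 vec128_set_all(int32_t value) {
    vec128_i32 v;
#if defined(__SSE2__)
    _mm_storeu_si128((__m128i *)v.lane, _mm_set1_epi32(value));
#elif defined(__ARM_NEON)
    vst1q_s32(v.lane, vdupq_n_s32(value));
#else
    /* Software fallback: always works, possibly slower. */
    for (int i = 0; i < 4; ++i) {
        v.lane[i] = value;
    }
#endif
    return v;
}
```

The caller writes `vec128_set_all(7)` once and gets the same result on every target, which is the behavior argued for here for APIs that live on the shared vector types.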

B

No just this one just true.

A

B

Creating and initializing I.

A

Think, there's no don't think there are any operation yeah.

B

There's no operations.

A

On there today. It's just a data holder for these intrinsic APIs, for the most part. Okay, let me restate.

E

Like not methods directly on the detector 256, but if I'm.

A

On ARM, yes. Can I use Vector256? Well, I would say that you will have a bunch of types that are in the intrinsics namespace. Those have an IsSupported property because they represent instruction sets, and those that take a Vector128, yes, you operate on that. But the Vector128 by itself is useless; it just holds, you know, the data. So you can use all the instructions that are supported, and the ones there's

C

Currently no, there are currently no ARM instructions that use Vector256, right? None proposed, yeah. So why would you get

E

One? Yeah, so this is what I was trying to get at. I think we should make it one-sided: either it works or it doesn't work. We shouldn't be like, oh well, setting lanes to zeros and ones will work, but anything else will

A

Not. That's kind of strange. Look, what I'm saying is that it depends on where the API is. If the API is on the vector, it should work everywhere. If the API is on an instruction-set-specific type, well, then it works where the instructions are supported. So it's pretty straightforward, and I think we already made Vector256 work on ARM; it just doesn't do anything. Well

B

It's a type and if there's.

G

F

G

B

Then it works like a regular C# type. So if you said Vector256 x = default, it will, at the C# level, just call the default constructor. You just can't

J

A

J

B

Do anything useful with it. Which

A

Is what I said: I don't think the vector itself should have any operations; those should all be somewhere else, except for, you know, the initializers. Yeah, so reinterpret

H

B

Yes, Ilana's right.

H

You can pass it, you can read from it. Just no operation will ever act on it, unless you wrote it yourself.

A

I mean, the other option that we have is that we don't have Vector128 and 256 in Intrinsics; we actually move them into the, you know, architecture-specific namespaces. But then you have, like, well, Intel has one, ARM has one, and they're all effectively the same; they're just 128 bits. And

B

And when the SVE extensions for ARM come out, and the 256 through 2048-bit support comes in, if we presumably add that, then we'll have duplicated types between the two that are identical.

B

In SV, or our discussion and x86.

C

B

Supports Vector64 via the MMX extensions. Those are just outdated and not recommended for use, and we didn't expose them. Well, no one should be using them anyways.

A

So do we have any generics, or will we always expand them into the specific instantiations that we support? So

B

For ARM64, as far as I've been able to see, the majority of instructions support all ten types. For ARM32, that's not the case; they generally support nine out of the ten types, where they don't support double. But this is a similar question to one we've had: with ARM, they kind of exposed everything all at once, rather than in multiple versions of the ISA, whereas

D

With Intel.

B

They exposed, like, only float in SSE, and then they added double and the other types in SSE2, etc. So what can I instantiate the

A

Vector64 with today? Do we support literally anything, like my own struct as well, or do we always blow up and say this must be one of the ten primitives? Yeah.

B

You can use anything, but you won't get the JIT recognition for anything that's not one of the ten primitive types. So what happens if I call Add on my fancy struct? That would fail at the JIT level with the PNSE.

A

But it will compile just fine, right? Yeah, yeah.

B

Because that's one of the things that's part of the analyzer that I want to get written, and that I started working on in my spare time: if you're using an illegal type that's not one of the ten primitives, then throw, or, you know, give a compilation error, and also flag things like "you're doing this, but that's equivalent to this instruction, which is more performant". I mean,

A

I think Vector<T> currently, the other one that we did three years ago, only supports, I think, one of ten types, right? Yeah. So you can compile code that uses any type, but at runtime it will fail with a type initialization error, which seems like something I would probably like us to do, because it's the next best thing we can do. Will we ever want to support other structs? Probably not, right? And ideally, if C# ever adds, you know, "where T : primitive", then I would like to be able to use it and not have to worry about backwards compatibility. The only backwards compat break there would be somebody who managed to compile source that they never loaded, which I don't think is a reasonable compat bar. I'm

B

Not sure we'd want to do "where T : primitive" in the future, because we're not going to treat things like Half as primitive. It'd mostly end up as "where

F

T : numeric" or something. Yeah, yeah. Just as a point of reference for the analyzer: make sure that there's at least a way to turn off that warning, because we do have code paths where we have generic methods where we take a T, do all the checks up front, and then call into the vector routine under the covers, and the JIT will just sort things out. It would be bad if those no longer compiled. Great. Well,

B

You can even turn the analyzer off. And

A

So then, why don't we expand this one, but not this one.

C

So the absolute value one: there are a couple of things. Right now it only supports signed types as the input, and it's returning unsigned types as output, so

D

C

Whether that should be done or not, that's that. So the set of supported types is not the full set. In general, we expanded it out if the full set of the ten primitive types was not supported, or in these special cases where the types don't match. Right, if you want to change the type, that makes complete sense. Anyway, we'll do that.

B

Why isn't the Vector64<double> overload on there?

C

It's not a vector. Well,

B

Well, I realize it's not, like, actually a vector, but it's one of the overloads that the underlying instruction supports. It

C

Doesn't support it cuz, it's not a vector.

C

It's not a SIMD instruction.

C

Right, a 64-bit double in a Vector64: a double is 64 bits, so it's not a vector. So in general, I didn't create intrinsics for non-vectors.

C

That make sense yeah.

K

But when I use the T overloads, so like Add below, can I basically make a Vector64 of double and use it, if I was so inclined? Yeah.

C

I don't think it works. I think in general it will throw a platform-not-supported exception at runtime. It

B

Looks like it's supported, from the NEON intrinsics page for A32 and A64. They explicitly show, like, int64x1_t for most of these instructions.

C

Ok well, I'm, not sure why you.

B

I think it's that the instruction supports it, even if it's only one element and technically it's not a vector.

I

You're just recording this yeah to.

J

The extent that there are useful places where you would want to use generics, it would be very convenient if it would just work.

I

E

I have a question on absolute value. Math.Abs returns the same type, so why do we change the type here? Does it make sense because this doesn't throw? Well,

B

Because the underlying instruction says that it takes a signed type and returns an unsigned type.

C

I think you could go either way. I just made a choice. I think the Intel stuff does the same thing; I would have based it on what Intel did. Yeah, I was looking, like, did

H

We do absolute value for.

B

For all of the x86 instructions, basically, if it explicitly said it takes a signed and returns an unsigned, or vice versa, then we made sure that the types matched up there. If it didn't matter, then we just supported both. Because

H

System.Math says int goes to int, etc., and Vector<T> says Vector<T> goes to Vector<T>; there's no unsigned type. And there

B

Is no, like, return-type overload resolution in C#. But if you wanted to get a signed type back, then you could just reinterpret cast. But

E

To be consistent, well, just return, the same type would again always been claimed to a 70. What does that negative operator? At the same time,.

B

People may think that it has particular semantics when it really doesn't. The semantics at the instruction level is that it returns a type that is unsigned, so it may have special handling around MinValue as compared to what C# may do if you take the absolute value of int.MinValue, which in C# throws right now.

H

The framework is promising, so that's correct and everything else we've done is wrong. Okay,.

E

Yes, so if we were starting from scratch, we would basically make Math.Abs do the same thing. Yeah.

H

That's okay. So the min value of every signed type is not representable as that signed type after absolute value. For

B

Reference, this is the absolute value which takes an int64x1_t. I

B

Would share the screen, but your.

C

B

C

I mean, strictly speaking, you could treat the return value as a signed vector; it just wouldn't have negative values in it, but you could put them in afterwards. I mean, the absolute value is going to remove the negative values, but you could put them in again.

E

But I think the main argument is MinValue.

H

As Levi said, for sbyte, negative 128: absolute value can't handle that one. Well, I mean, it'll make it positive 128, but you can't have a positive 128 in a signed byte, so that would need to throw if it stays signed. Okay.
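The sbyte case can be checked with a scalar model in C. This is a sketch with a made-up function name, not the proposed API: the absolute value of -128 is 128, which does not fit in a signed 8-bit type (maximum 127) but does fit in an unsigned one (maximum 255). That is why an absolute-value operation that never throws pairs naturally with an unsigned return type.

```c
#include <stdint.h>

/* Hypothetical scalar model of one lane of an "abs" that takes a signed
   8-bit input and returns an unsigned 8-bit result. */
static uint8_t abs_i8_to_u8(int8_t x) {
    int wide = x;  /* widen first so negating -128 cannot overflow */
    return (uint8_t)(wide < 0 ? -wide : wide);
}
```

Every input has a representable output, including `INT8_MIN`, so no error path is needed.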

F

H

F

The unsigned type is correct, yeah, especially if we're saying that these correlate directly to the instructions, that will never throw, and.

B

And we've tried, or at least I've tried, to make everything account for that. Like, we don't do special handling. We don't do transformations that the user might not expect, because, for the most part, they're writing performance-oriented code, they're taking the risk factor on, and we don't want to introduce unexpected performance issues when they thought that it would do one thing and we instead do something else. How would they do a reinterpret cast on one of these things? Is there an API for it? On Intel and x86.

B

There's the Sse.StaticCast and Avx.StaticCast, but, as per your earlier conversation, I think we're going to move those to the Vector64 or Vector128 types. And somewhere in the ARM proposal, there's an equivalent. You can use Unsafe, yeah.

F

Well, ideally, we wouldn't force people to use Unsafe, especially since there's no instruction that this corresponds to, right? It's just, yes, yeah.

B

It's basically: ignore the type system and just reinterpret the bits as something else, which is safe.
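
To illustrate what "reinterpret the bits" means here, a small sketch. It assumes a later .NET version (.NET Core 3.0 or newer) in which the reinterpret helpers live on the vector types as `AsByte()` and friends; the API names under discussion in this meeting were still in flux, so treat the exact calls as an assumption rather than the reviewed design.

```csharp
using System;
using System.Runtime.Intrinsics;

static class ReinterpretDemo
{
    static void Main()
    {
        // Every lane holds the sbyte value -1, i.e. the bit pattern 0xFF.
        Vector128<sbyte> signedLanes = Vector128.Create((sbyte)-1);

        // Same 128 bits, viewed as unsigned bytes. No instruction is needed;
        // only the static type changes.
        Vector128<byte> unsignedLanes = signedLanes.AsByte();
        Console.WriteLine(unsignedLanes.GetElement(0)); // prints 255

        // The scalar equivalent: read the raw bits of a float as an int.
        int bits = BitConverter.SingleToInt32Bits(1.0f);
        Console.WriteLine(bits.ToString("X8")); // prints 3F800000
    }
}
```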

F

For well exactly unsafe, we say for framers yeah yeah, so do.

A

We expect consumers to be generally working with specific types, or sometimes using actual generic types? Because the problem with the expansion is, as soon as you're in generic code, you're screwed: you now have to write a lot of, like, manual.

F

You would never.

A

Call Abs with a generic T, right? No, but you can, for example, Add, right? So that's what I'm saying: if the consumer is expected to basically never be generic, and you always have a specific type, then the generic ones are here just for us to have fewer methods. I.

B

Think unless you're writing a general-purpose helper library, like what Vector<T> is, then you're probably going to be writing an algorithm that's typed. Like, you'll be writing a vectorized implementation of string.IndexOf and your type will explicitly be ushort, right? Or you'll be doing Math, you know, Ceiling, and you'll explicitly be using float or double.

A

Yeah I think if you generally expect that the consumers are never generic like I, have no concerns. I'm.

B

Sure that there are people who want to be generic and who will write that code but I, don't believe that's going to be the primary use case for the types of algorithms. People probably should be writing for performance oriented code, yeah.

A

Because the ones that one thing is generic word, that would not I mean calling abs will be a pain in the ass I.

F

Think, honestly, without the existence of something like type traits in the framework, it's going to be very hard to write an application with use of generics.

B

Like, you have to write a helper function that takes T and then if-checks each one, and a software fallback for all the other Ts, yeah, which.
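
The helper described above might look roughly like the following. This is only a sketch under stated assumptions: `GenericHelperDemo.Add` is a hypothetical consumer-side helper, the typed branches stand in for where a typed intrinsic call would go, and the fallback goes through double purely for brevity (which loses precision for large 64-bit values).

```csharp
using System;

static class GenericHelperDemo
{
    public static T Add<T>(T left, T right) where T : struct
    {
        // One branch per type that has a typed fast path. A real helper would
        // call the typed intrinsic overload inside each branch.
        if (typeof(T) == typeof(float))
            return (T)(object)((float)(object)left + (float)(object)right);
        if (typeof(T) == typeof(int))
            return (T)(object)((int)(object)left + (int)(object)right);

        // Software fallback for every other T (illustrative only).
        double sum = Convert.ToDouble(left) + Convert.ToDouble(right);
        return (T)Convert.ChangeType(sum, typeof(T));
    }

    static void Main()
    {
        Console.WriteLine(Add(1.5f, 2.0f)); // typed float branch
        Console.WriteLine(Add(2, 3));       // typed int branch
        Console.WriteLine(Add(2L, 3L));     // fallback branch
    }
}
```

Every supported type needs its own branch, which is the manual work the speakers are pointing at.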

I

Could just be this.

F

I

Mean this is exactly.

F

What something like type traits with the Intendant.

A

And we might have a numerical constraint.

F

At some point in the future, right, yeah but I think this is that doesn't hold a candle to them. However, that's right.

A

All right so then,.

A

I think we should just scroll over them, unless there are any interesting points. I mean, they're all pretty much what I would expect them to look like, without giving it too much thought.

A

Yes, Multiply, as an example, where you probably want to mess with the types, right?

C

So Multiply doesn't support ulong and long in the short form. Maybe something.

H

C

Supported here and multiply, they don't remember what.

K

It looks like factor 65 double again, there's no long for.

C

128. Yeah, so long is not supported in 128; that's what's actually missing. Yeah.

B

E

You know what could be useful: adding remarks. Basically, the comments that you guys just made, add them to the remarks of these APIs, because I can actually imagine people coding and thinking, where the heck is this overload? And then the remarks would say, you know, it doesn't make sense because multiplying two very.

H

Large. As far as I know, we don't have anywhere in our doc system the ability to put comments on a method group.

F

H

F

And what we've done for things like Span<T> is we actually create the method and have it obsolete with a message, so that you know exactly what's happening. Really? Yeah, we've done that before. We did it with GetHashCode, for instance. Okay.

D

There no but I mean I'm.

F

I'm actually adding a similar API in the UTF-8 type, like an API that doesn't have to exist, literally because I predict people will reach for it, and I want them to type it and then see an error message that says: don't do this, you should be doing this other thing instead.
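
The "create the method and obsolete it with a message" technique can be sketched as follows. `HypotheticalSimd` and its `Multiply` overloads are invented for illustration and are not part of the proposal; `ObsoleteAttribute` with `error: true` is the standard framework attribute.

```csharp
using System;

static class HypotheticalSimd
{
    // A supported overload (scalar stand-in for a vector operation).
    public static int Multiply(int left, int right) => left * right;

    // Deliberately declared so that callers who look for it get a
    // compile-time error carrying an explanation, not a missing-overload error.
    [Obsolete("Multiply has no 64-bit integer form in this instruction set.", error: true)]
    public static long Multiply(long left, long right) => throw new NotSupportedException();
}

static class ObsoleteDemo
{
    static void Main()
    {
        Console.WriteLine(HypotheticalSimd.Multiply(6, 7)); // prints 42

        // Uncommenting the next line fails to compile (CS0619) and the
        // compiler shows the message from the attribute:
        // HypotheticalSimd.Multiply(6L, 7L);
    }
}
```

The trade-off raised later in the discussion still applies: the placeholder is visible to reflection and has to be maintained if a later extension adds the missing form.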

F

It's just it's.

E

A

I don't think it's a bad idea, but given the volume of these kinds of things, we don't really have the group pages anymore; we just put them all on one giant page. So do you want every single overload to specify every other type it supports? That seems a bit over the top. At this point, it might be easier to just specify that once, because basically you have to say "I don't support ulong" on each of these overloads, right, which seems like adding.

H

A

H

If they call this, the compiler will fail. This means anybody who was using reflection, finding Multiply for their types, they find it, and now they have to special-case: oh, but if it's long or ulong then don't do this. The.

A

Other call, this reflection review call these areas where we affectionately, don't do optimized code. I will assert. Oh.

B

I would think that these are low enough level and they are specific enough that people are expected to at least have some familiarity with the underlying instruction set that they're writing for, and they would understand why.

H

And it would be weird if we add it now and then in two years a new instruction set extension comes out, and then we probably go back and be like: don't call this one, call the one that lives over there. And you'd be like, why? It's like, because we break these classes up by instruction set architecture, and it lives over there. Okay, people.

E

I suspect that some people will be using these APIs who can successfully use them but don't understand why, yeah.

D

E

Know, some of the overloads are missing. If we hit this issue, maybe we can start with remarks at the type level and then start doing smarter things. So maybe.

F

It doesn't do anything until we really know it's a problem. Can we figure out what probability in the salsa here and there there is something to be said to for, as you mentioned like, we expect the consumers of these types to kind of understand the instructions I. Don't.

E

Know, I'm just saying: I learned that .NET has, you know, vectorized operations, and I just want to try vectors. I.

F

Would hope that, because these are bearing in a very specific namespace look it would it would kind of scream, do it separately, I think.

B

The real concern, the real time when we're going to run into these issues, is you're going to have plenty of people who are familiar with x86 code, who want to add an ARM path, and they go and assume that ARM operates the same, and it doesn't, and that's probably where people are going to start running into things.

F

B

Stuff you mentioned yeah, but.

H

Again, that's a test. You yeah.

F

The writing instruction.

H

Level here. If you're writing assembly, which is what calling this class is, then you need to know what you're doing. Yes.

E

But but it would be, we had some documentation that it'll help you get started, meaning like how could I get this information? Whether I learned about the difference is I can interconnect s1 support right, I.

H

Mean, so I would expect that our documentation has, for each class, a link to, like, this is where you go get the instruction set architecture doc. That's pretty much the level of it. That's where we stop being able to help you, and everything else is: you need to understand what the side effects of this are, and it's all copyright the company that made the instruction set. Well, I would.

A

Hope for, I don't know, like three paragraphs on what intrinsics are and how they roughly work, because that's probably still at least two orders of magnitude more approachable than: here's a 500-page instruction set manual, go knock yourself out. For.

B

MSVC, we've got a compiler intrinsics page, which starts off with a little blurb on what intrinsics are and then links to the various instruction manuals, right. So I noticed that this class was named Simd.

H

Like Arm64.Simd. Well, yeah, it is. Is that reasonable, based off the instruction set architecture name that is being implemented here? Because if they add a Vector128 multiply or whatever later, I guess what I'm asking is: is Simd the name from the instruction set, or are we naming it for the concept? If we're naming it for the concept, we're doing people a disservice.

B

So for ARM32, it is formally called NEON. For ARM64, most people, including MSVC, GCC, and the ARM page on these intrinsics, still call it NEON, but the actual architecture manual no longer calls it NEON. They just call it the SIMD instruction set. Is that.

A

Face still what happens if the chip, another version of that they.

B

Call this and Steve it sounds like you were trying to say something so.

C

There are some extensions already existing; I don't know about the proposal part. But if you look at this comment about ID_AA64PFR0_EL1, that's the identifier in the CPU. So there's a field in there called AdvSIMD, and when that's equal to 0, that's basically the base SIMD support for ARM64, which basically exists on every ARM64 processor, except, I think, that processors can be given an exception for various architecture-specific uses.

C

So I think there is one of the SIMD extensions that comes later. It's like long polynomial multiplies, and so there's a name for that extension, and so this would be another class. It's like SimdLongPolynomial or something; I don't happen to remember what I proposed, or if I proposed it yet, but it would have to be in a separate class. So any extension would go in a separate class, and the name would have to correspond to the architectural name. So yes, actually, one of the big ones is SVE.

B

So it would be in a separate class too. Should we call this class AdvSimd, then, to match what we've done with x86, where we just named a class what, basically, the CPUID check is officially documented as?

C

That's the name of the field. I think the 0 corresponds to SIMD, but someone would have to check. So that's the name of the field in that register.

H

On this one in particular, I'm just concerned that, while this may match the name of the instruction set, it's also the name of the concept, and that might actually just confuse people as they're looking for things from the extensions. And so, if there's a word that we could put on here that identifies that this is the instruction set, so whether that's AdvSimd or Simd0 or whatever, just something that helps people understand that this is only one particular set of instructions.

H

If what you're looking for is from an extension built like everything such a class and I.

E

Also, don't like that. I know that it's qualified by the namespace name, but we don't tend to have types that are names. We don't have two types that are named the same in the framework. If you strip out the namespace names yeah, we try to leave China.

H

E

H

A

Like user control console on but I think they're I think depression for those kind of things is: are you very likely using them together in.

E

One file? Yes, for example, you implement a fallback. So you're implementing some algorithm and you basically say: if I'm on ARM, then do this; otherwise do this other thing. Yeah, I guess, but I mean, I assume.

H

And I guess SSE is probably fine. Is SSE called SSE on both x86 and AMD64? Yeah.

K

H

We already had that, for it consists well.

B

Well, the difference is that for x86, AMD64 is a true extension on top of 32-bit, whereas with ARM, ARM32 and ARM64 are technically disparate ISAs, and you can't say that ARM64 is an extension of the previous. So with x86, we just have an X86 namespace and then an Sse class. We don't have different ones for 32-bit versus 64-bit. So, yeah, where.

H

We can't do that with ARM, all right, because it isn't the same instruction encoding. It's just the 64.

B

Okay, it looks like the manual formally calls it Advanced SIMD, and the flag is just AdvSIMD, and so maybe we should call the class that.

C

Well wonder if you've, just given your thing, maybe should have a part city or prefix I, don't know on everyone. I.

B

Think the namespace.

H

Is what does that, yeah. Like, anybody who wants to do ARM and ARM64, they would just only use using up to a certain point, or I guess they can be using-aliasing the namespace, and then it's Arm64.Simd dot whatever and Arm.Simd dot whatever. I guess there's just a language thing that I can work around. Yes, yeah, I got.

B

H

No one else, but.

B

The actual question here would be for arm 32 versus arm 64.

B

If people want to use both, they'd have to do a using alias and rename it to, like, Arm64AdvSimd and Arm32AdvSimd.
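
A sketch of the using-alias workaround mentioned above, for two classes that share a simple name. The `Hypothetical.Intrinsics.*` namespaces and their `Simd` classes are invented here so that the example is self-contained; the real namespace and class names were still being decided in this meeting.

```csharp
using System;
// Aliases give each same-named class a distinct local name.
using Arm32Simd = Hypothetical.Intrinsics.Arm.Arm32.Simd;
using Arm64Simd = Hypothetical.Intrinsics.Arm.Arm64.Simd;

namespace Hypothetical.Intrinsics.Arm.Arm32
{
    public static class Simd
    {
        public static string Name => "Arm32 Simd";
    }
}

namespace Hypothetical.Intrinsics.Arm.Arm64
{
    public static class Simd
    {
        public static string Name => "Arm64 Simd";
    }
}

static class AliasDemo
{
    static void Main()
    {
        // Without the aliases, importing both namespaces and writing 'Simd'
        // would be ambiguous.
        Console.WriteLine(Arm32Simd.Name);
        Console.WriteLine(Arm64Simd.Name);
    }
}
```

The alternative raised in the discussion is to import only the parent namespace and write the partially qualified names (`Arm32.Simd`, `Arm64.Simd`) at each call site.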

E

Do not force people to do with, despite the fact that they said okay, yeah I mean.

A

When I usually have done, is they do support the other names face and it is critics it because.

B

A

The names are solved by saying Arm64.Simd and X86.Simd; it reads reasonably well in the code as well. I don't even have to alias it, yeah. If.

B

We were to prefix, I would want to go with A32 and A64, which are the formal short names for the two, but I mean.

A

Regardless of the conflict, I mean, what I care about is that we don't corner ourselves with, as Jeremy said, what if we name it for the concept, but then it really represents a spec level? Then what happens if this thing versions? I would like all developers to be able to say.

A

Oh, it's clear to me that if it's an SSE1 instruction I go to Sse, and if it's SSE2 I go to Sse2, and if I'm on Sse2, due to inheritance, I get SSE1 and SSE2 at the same time. I mean, it needs to be somewhat sensible, so that you can reasonably know which type you need to go to in order to get a particular instruction. Yeah.

B

It looks like in the manual it's AdvSIMD; it's four bits rather than one. A bit pattern of 1111 means it's not implemented, a bit pattern of 0001 means it implements everything plus half-precision support, and 0000 indicates that it implements integer, single-precision, and double-precision support. So you're saying they've got two forms of Advanced SIMD? Oh God. So we could call it a.
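
The three encodings read out above can be written down as a small decoder. This is a sketch only: `DescribeAdvSimd` is a hypothetical helper, the register value is passed in as a plain number (reading the real register requires privileged or OS-provided access), and the field position, bits [23:20] of ID_AA64PFR0_EL1, is taken from the Arm architecture manual as I recall it and should be checked against the manual.

```csharp
using System;

static class AdvSimdFieldDemo
{
    // Decodes the 4-bit AdvSIMD field of an ID_AA64PFR0_EL1 value.
    public static string DescribeAdvSimd(ulong idAa64Pfr0)
    {
        // Assumed field position: bits [23:20].
        uint field = (uint)((idAa64Pfr0 >> 20) & 0xF);
        switch (field)
        {
            case 0x0: return "Advanced SIMD: integer, single- and double-precision";
            case 0x1: return "Advanced SIMD: as 0, plus half-precision arithmetic";
            case 0xF: return "Advanced SIMD not implemented";
            default:  return "reserved encoding";
        }
    }

    static void Main()
    {
        Console.WriteLine(DescribeAdvSimd(0x0UL));       // field = 0000
        Console.WriteLine(DescribeAdvSimd(0x1UL << 20)); // field = 0001
        Console.WriteLine(DescribeAdvSimd(0xFUL << 20)); // field = 1111
    }
}
```

Because the field is four bits wide, later encodings can be added, which is what motivates the numbered-name suggestion that follows.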

H

AdvSimd and AdvSimdHalf, yeah.

B

C

Actually, because this field has four.

B

Bits in it, they have room for 15. Yeah, so maybe we should call this AdvSimd0. Zero: this is the instructions that you get when it's, yeah, zero or better.

A

So I think that's it yeah. So.

H

A

I'd rather have slightly less readable names, as long as it's clear what they are going to do.

H

We care about like do. They have a casing rule on a TV, simply they they use our crazy. Well, they.

B

Made a little deal, it'll be big. No, it's all.

H

B

But Adv, yes, is actually Pascal-cased, and so I would name it with that casing. We've not done that with x86; we've named it according to our casing rules for Sse, et cetera. So big A, big S, and the zero.

H

Needy some new 0.

B

Yeah, it happened Pasco case for seventy as well. Oh yeah.

K

Adv stands for advanced, right? Yeah, Advanced SIMD.

H

But maybe B is its name, will use the name on the class. Also Chris, not like short things. No.

E

I, don't like abbreviations, but.

J

You guys know everything to be short, I like.

E

Short words, but no I think it's fine to call it as this background isn't good name, which is English, not German.

A

H

A

Poly synthetic language so I only because that's what includes.

H

The adjectives in someone else, thank you. Okay,.

A

So basically we said AdvSimd0 is the name? Yeah.

E

Hopefully, then, I'm select the numbers in order.

E

K

Only thing I don't like.

E

K

Maybe I'm two seconds behind, but a big pattern is you have, like, a line of ifs, right? If AVX is supported, then do something; if SSE is supported, then do a thing; if AdvSimd0 is supported, and so on. SSE and AVX are very x86-specific names, right? AdvSimd isn't really. I mean, is it an ARM-specific term? That's the one where I'd just look at this code and it wouldn't be like, oh, obviously this is, I'm on an ARM chip. We.
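
The "line of ifs" pattern described above looks roughly like this. The sketch assumes the class names that later shipped in .NET 5 and newer (`Avx2`, `Sse2`, and `AdvSimd` without the trailing zero discussed here); `PickPath` is a hypothetical method, and real code would call the intrinsics inside each branch.

```csharp
using System;
using System.Runtime.Intrinsics.Arm;
using System.Runtime.Intrinsics.X86;

static class DispatchDemo
{
    public static string PickPath()
    {
        // Each IsSupported property is a constant for the running process,
        // so the JIT can drop the branches that do not apply.
        if (Avx2.IsSupported) return "AVX2 path";
        if (Sse2.IsSupported) return "SSE2 path";
        if (AdvSimd.IsSupported) return "AdvSimd path";
        return "scalar fallback";
    }

    static void Main()
    {
        Console.WriteLine(PickPath());
    }
}
```

Read in this form, the concern is that the x86 names identify their architecture at a glance while an ARM class name may not.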

H

Apparently, we just pulled that name out of the spec. So yes. Well.

B

I think, if I just saw AdvSimd, would I assume that it's ARM, or would I assume that it's something else?

E

Or what its Intel implants that that's and.

H

It'll be under the connections to.

A

To be fair, when you search for AdvSIMD, the only thing that's showing up on Google is ARM.

I

A

Know so I would say that quote says: okay.

B

In technically.

A

B

x86 has already created an "advanced" one; it's called AVX, Advanced Vector Extensions, yeah. So.

H

Intel, or AMD, would just be silly to name a new extension the same thing that exists at the competitor, whether it has the same or a different meaning. They're not gonna do this. They like trademarks.

A

A

Happy chivalry.

B

Steve, were these just the simple operations in this particular proposal? I noticed there's some, like, add wide and stuff that aren't here. So.

C

The proposals, so these were, I think, the ones that were actually produced by the SIMD implementation of the Vector class, and so these were really easy to do. So they're, like, already ready as proposals. It was just before shipping 2.1, and I was trying to sort of divvy up work that could be delivered by 2.1. So this is strategically low-hanging fruit. Okay, so these are sort of numeric, simple ops that already.

H

Exist. We did talk about the generics already, but given that AdvSimd0 is these ten types and AdvSimd1 is the half type, should we pre-expand all these generics to be very clear which ones are the zero and which ones are the one? Because, happily, you would write the line for half, and then we're just going to throw at runtime because you generic-expanded when you weren't.

H

Looking at the other type. So I think we should pre-expand all generics here to the types that are supported. It'll be big and ugly on the class, but it is very, very clear what is and isn't supported. I don't.
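
The difference between the two API shapes being debated can be sketched with scalar stand-ins. `GenericShape` and `ExpandedShape` are hypothetical classes, and `Negate` is an arbitrary operation chosen to keep the example short; only the shape of the signatures matters.

```csharp
using System;

// Shape 1: one generic method. An unsupported T is only discovered at run time.
static class GenericShape
{
    public static T Negate<T>(T value) where T : struct
    {
        if (typeof(T) == typeof(int)) return (T)(object)(-(int)(object)value);
        if (typeof(T) == typeof(float)) return (T)(object)(-(float)(object)value);
        throw new NotSupportedException($"Negate<{typeof(T).Name}> is not supported");
    }
}

// Shape 2: one overload per supported type. An unsupported type does not compile.
static class ExpandedShape
{
    public static int Negate(int value) => -value;
    public static float Negate(float value) => -value;
}

static class ShapeDemo
{
    static void Main()
    {
        Console.WriteLine(ExpandedShape.Negate(3)); // prints -3

        try
        {
            GenericShape.Negate(1.5m); // compiles, then throws at run time
        }
        catch (NotSupportedException e)
        {
            Console.WriteLine(e.Message);
        }

        // ExpandedShape.Negate(1.5m); // would be a compile-time error instead
    }
}
```

With the expanded shape, the list of overloads on the class is itself the list of supported types, at the cost of a larger surface area.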

A

Care over the SICU, that's a nice thing.

C

We actually went round and round whether expand them all or not. In some of the discussions, I don't know how its falling out in the.

B

The x86 ones, we've still not decided because that's discussion for next week's meeting, but for the most part we have to have everything exploded. The way the ISAs are layered, but there's a few where they are generic and it's not always clear that they definitely should be generic.

B

There's basically no hard set rules today, those.

C

Are implement I thought there was some concern about the size of the manage interface or the what it would cost to expand all these yeah.

B

Right now, they're all recursive, so at least for the platform they're supported, they're recursive, so the so we got an order.

C

Of order and search for these intrinsics yeah.

B

The search is, well, the search is just to match the method name, not the type name as well. I think the type names are done separately, after the method name search. But at the IL level, two methods that have the same body can be collapsed to the same IL, but if they're all recursive, then they don't collapse. They each have their own method body entry, and so the library gets bigger. So that's not.

E

A precise yeah.

B

The metadata size, because.

E

We can't collapse I.

B

E

If it was like thousands of methods, I mean this is this is denied probably what 50 to 800 methods yep. That.

B

E

B

Is like at least a thousand, and ARM's going to be just as large. But the thousand only impacts, like, x86, because we've got the multiple builds of CoreLib, so on the other builds the x86 instructions throw PNSE directly, so those collapse into one method body. So just the Intel ones take space for x86, and it doesn't take any space on ARM, or vice versa.

A

Can we look at how bad that is, I mean if you talk about 5k I, don't think I care. We talk like you know, close to megabyte, then it becomes.

A

B

If it was too much of a concern Carol do you think there would be another way to do the recognition other than recursion if it ever did become a concern? I.

J

Have not given it sufficient thought to.

J

Speculate: okay,.

E

Yeah, because I also like expanding them, because first, it makes it very explicit and reliable. Secondly, it's consistent. Like, you look at the type and, you know, it's consistent. Here, it's like, geez, some of them are generic, some are not, and then.

B

You actually get the type safety. Oh I can't pass in my shirt. Well.

A

The other thing that I'm really concerned about is, say, you know, adding more types in the future, right? That really breaks it for me. Yeah, it's impossible to reason about that when some things are open-ended. Okay, many advantages. All right, so then that's that one.

E

E

Yeah comparisons ICS, so it's gonna be the same type. Just we broke the PRS in to or the issues it yeah. So.

B

It's going to be impossible. Yes,.

E

If ya gettin´ a single.

B

A

So maybe I don't understand what this thing does, but, like, what does CompareEqualZero do when I pass a vector of float, and what does the individual data represent at that point? So the results.

C

I think it gets set to all ones if it's equal to zero, and if it's not equal to zero, it gets set to zero. So.

A

Is it then really like should I really interpret them as others? It's load, then I? Was it just a bit better than I'm, so.

B

What we did with x86 is that for anything that takes or returns a mask, we return the same type as the input. So if it took a float, it returns a float. And for anything that took a control word, instead, we would take and return an integer type rather than, like, a floating-point type. And the reason we did that was because, when you're working with masks, you would generally create and construct those based on the result of another operation, like, for example, CompareEqual, whereas for the control word, the user would be constructing.

G

It themselves, so what is it again, what.

A

Puts the control over it like well,.

B

Like that stupid mask effectively aids like it's control, controls what the instruction does. Oh, like.

A

Liquid swiveling take this I'll put it there. Oh yeah.

E

A

Masking is more like okay, this is the user to multiplication against their.

B

So, like, for float, if it's true, it returns basically not-a-number, but a specific encoding of not-a-number that's not the same encoding as Single.NaN.
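
The point about the "true" mask being a not-a-number with a different encoding can be checked with scalar code. A minimal sketch: `CompareEqualLane` is a hypothetical one-lane stand-in for a vector compare that returns its mask in the input type.

```csharp
using System;

static class MaskBitsDemo
{
    // One-lane model of a float compare that returns a float-typed mask:
    // all bits set when equal, all bits clear otherwise.
    public static float CompareEqualLane(float left, float right)
    {
        return BitConverter.Int32BitsToSingle(left == right ? -1 : 0);
    }

    static void Main()
    {
        float mask = CompareEqualLane(1.0f, 1.0f);

        // Read as a number, the all-ones pattern is a NaN...
        Console.WriteLine(float.IsNaN(mask)); // prints True

        // ...but its bits differ from the constant float.NaN.
        int maskBits = BitConverter.SingleToInt32Bits(mask);
        int nanBits = BitConverter.SingleToInt32Bits(float.NaN);
        Console.WriteLine(maskBits == nanBits); // prints False

        // The "false" mask is simply positive zero.
        Console.WriteLine(CompareEqualLane(1.0f, 2.0f)); // prints 0
    }
}
```

This is why a float-typed mask is only meaningful as input to bitwise operations, not as a numeric value.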

I

J

Main reason for distinguishing that I mean in some sense logically you'd like to treat the masks. They were more like an integer. Those are not really used as floats, but um it complicates the usage model to have to figure out that. Oh, this is a float. So what I need to hold the mask is an int.

J

Well, that's better! Yeah.

E

But if we expand the API, and so actually don't use generics, we could do whatever we want. It would be very usable, simply because you will see it in the signature. You.

B

Could do whatever you want, but, like, for example, with floats, you might be doing something like doing an And, which will return a float, and that can be your mask, and And should return a float. So you don't want to have something like CompareEqual return not a float, and you have to do something special with it to make it work with the rest of the float And operations. Yeah.

A

I remember when we did this for Vector<T>. In my exercise of writing code, it became this horrific thing where you kept casting between them, but the problem is you have to cast to the right size, so it's not as trivial as you think it is. For the most part, you usually don't care about the values, because you just pass them on, and then things compose much nicer in code, and it becomes much less annoying to me. But yeah, I mean, writing vectorized code in general.

A

Is not a super straightforward exercise. The.

B

Perfect example really is: you do some operation, you compare to determine which values you want to operate on, and then you pass that compare to And to mask out everything that you don't care about. And if you return an integer type for the mask, then you have to reinterpret cast to pass it to the And, and that's the most common case. So you don't want to make that overly verbose to.

A

Even meet it, oh.

G

E

So the parameter name sel, shouldn't it become select? I.

B

Think for x86, most of the time we call it control, but that's also what it's referred to as in the spec. If ARM refers to it as select, that's probably the better name for them. Whatever the spec actually refers to it as, I would think, would be the better name.

B

C

You look under BSL I, think it's bit select logical or something yeah.

E

So it basically takes bits either from the left or the right.

C

F

C

This is really like the question mark operator, with the with the comparisons from the previous right is.
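
The comparison to the question mark operator can be made concrete with a scalar model. This is a sketch, not the proposed API: `BitwiseSelect` here is a hypothetical helper operating on a single 32-bit lane, where each bit of the result comes from `left` if the matching select bit is 1 and from `right` otherwise.

```csharp
using System;

static class BitwiseSelectDemo
{
    public static uint BitwiseSelect(uint select, uint left, uint right)
    {
        return (left & select) | (right & ~select);
    }

    static void Main()
    {
        uint[] a = { 1, 5, 3, 7 };
        uint[] b = { 4, 2, 6, 0 };
        uint[] max = new uint[4];

        for (int i = 0; i < 4; i++)
        {
            // A compare yields all ones or all zeros per lane...
            uint mask = a[i] > b[i] ? 0xFFFFFFFFu : 0u;
            // ...so the select behaves like: max[i] = a[i] > b[i] ? a[i] : b[i].
            max[i] = BitwiseSelect(mask, a[i], b[i]);
        }
        Console.WriteLine(string.Join(", ", max)); // prints 4, 5, 6, 7

        // With a hand-built mask the selection really is bit by bit.
        uint mixed = BitwiseSelect(0x0000FFFF, 0xAAAAAAAA, 0x55555555);
        Console.WriteLine(mixed.ToString("X8")); // prints 5555AAAA
    }
}
```

With compare-produced masks the operation is effectively element-wise; the bit-level behaviour only shows up when the mask is constructed by hand.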

A

It clear what the value should be for taking Reza not taking X hopeful I. Guess it's zero.

E

A

Right, that's what.

C

I get it's one left for the comment about. It looks like selectively the bit said it would take a lot bit.

A

But, like, I mean, it's logically the one, right? So it's a floating-point one versus an integer one, right? It's not like a bit pattern that I have to construct.

B

A

B

Yeah, this is one of the ones where I think, for x86, we would have done an integer type, because the user will probably be constructing the bitwise pattern for left versus right. It doesn't correspond to an entire field of left versus an entire field of right. It's an individual bit. Oh, I see. So, like, with x86 there are some masks which are like: we take the upper bit of each field, right, and then select versus.

D

B

Pack, a bit pattern: 0 1, 2 3, which corresponds to each element, in which case you have to do like a mask and then a bit extract to get what do I think Levi's doing in some of the utf-8 code.

A

So what will be the into like? All of them were just the selector just to select oh right.

B

Yeah I think just to slide. It.

C

So you turn all the comparison. Operations have to change the same.

E

G

E

You use this in general. This operation doesn't make a lot of sense for floats, right?

F

Doesn't even I mean you could I just that's how our there's shortcuts to it, though,.

C

Well, if you want to select between floats, you'd, still use the same integers instruction. It's just kind of weird.

F

In reality, this isn't treated as a vector anymore. It's just treated as an opaque 64-bit integer for the first method and an opaque 128-bit integer for the second, because it's bitwise, not element-wise, right? Yes.

A

This is one of the cases where maybe it's okay to say you have to reinterpret cast if you really want to do it on floats.

C

I mean, right, the normal comparisons are going to produce all ones in the field or all.

D

Zeroes in the field.

C

So in the normal case, this wouldn't actually be a bitwise select; that's just how it would be implemented. So.

B

You're saying that this is the arm equivalent of basically doing a doing a shuffle or an unpacked to select the element from left and right that you want in the resulting vector. Yes,.

C

B

I guess he's saying that this is called bitwise select, but normally you wouldn't do it bitwise. You would do it element wise and.

A

You would choose your bits accordingly, so that it effectively in.

B

Which case, in the normal case, you'd be passing in a mask. Not a manually constructed bit pattern, but one where each element of the mask is all zeros or all ones, yeah. You could do something crazier if you wanted to, but I guess that's not at all how.

A

To do this venom day, iwent is a filter that gave me it's not clear to me a little bit better for FLOTUS well,.

B

You could you could theoretically do it to extract like just the exponent a like I would guess if you wanted to I'm.

A

Just saying how do I get if a new type as flawed bill bits, etc? You know I mean for Frank for him. You would.

C

Be engine reinterpret cast a piece.

B

C

Well, the so that, compared if you didn't compare floats, that's what the result was was an.

D

C

For each of the doubles fields for the float fields are all zeros.

A

Right took down the ballot, it.

C

May be confusing.

A

Now, in estimating once we expand by the way we have flexibility, we can remove certain overloads. We can change types like that.

A

All right, basics for simply leading zero counts.

B

Yeah, so this one's not Advanced SIMD. For the first set, they're part of the base; all ARM64 CPUs have to implement it. So what.

E

Will be called the type.

B

Maybe Base, without the number. Well, these ones aren't even SIMD; these operate on scalar types, but they're like the equivalent of the popcount or leading zero count. So you wouldn't want to call it SIMD. You.

C

Might call it A64Base or something. So.

A

Basically, this IsSupported would always be true, effectively, if.

B

You're not it's basically, president.

A

B

If CPU architecture is ARM64, then return true. So why do we want to do A64? A64 is the formal name for the AArch64 instruction set.
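
The "always true on ARM64" check described above could be modelled as follows. This is a sketch under stated assumptions: `HypotheticalA64Base` is an invented class, its `IsSupported` is expressed with the existing `RuntimeInformation.ProcessArchitecture` API and is not how the runtime would implement an intrinsic class, and `LeadingZeroCount` is a portable software version standing in for the single instruction.

```csharp
using System;
using System.Runtime.InteropServices;

static class HypotheticalA64Base
{
    // Base instructions exist on every ARM64 CPU, so the check reduces to
    // "is this process running on ARM64?".
    public static bool IsSupported =>
        RuntimeInformation.ProcessArchitecture == Architecture.Arm64;

    // Software equivalent of a 32-bit leading zero count.
    public static int LeadingZeroCount(uint value)
    {
        int count = 0;
        for (uint bit = 0x80000000; bit != 0 && (value & bit) == 0; bit >>= 1)
        {
            count++;
        }
        return count;
    }
}

static class BaseDemo
{
    static void Main()
    {
        Console.WriteLine(HypotheticalA64Base.IsSupported);
        Console.WriteLine(HypotheticalA64Base.LeadingZeroCount(1));          // prints 31
        Console.WriteLine(HypotheticalA64Base.LeadingZeroCount(0x80000000)); // prints 0
        Console.WriteLine(HypotheticalA64Base.LeadingZeroCount(0));          // prints 32
    }
}
```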

A

I've heard the words and what they mean a 64.

B

Is just the spec term for it. Okay.

A

So I think Base, to me, sounds okay, but that's something that maybe we should probably, well.

E

We would qualify.

A

It would be called A64.

E

Base, yeah, yeah. That's.

I

A

That would be, what I'm saying is, should we use the same thing everywhere else as well? Right, so if we have some instructions that are on all Intel CPUs, you would call it, you know, X86Base over there, something.

B

A

B

If we, they have an idea and I already expect.

E

Them to be. So, X86, A64, no suffix, Base. Is "base" part of the spec, or did we just invent it? Because Base in object-oriented APIs kind of implies, maybe I misunderstood this, that it's a base class for something. x86.

B

And x64, it's formally called the base instruction set. I don't know about ARM, but I think I made it up for ARM. Yeah, I think Base probably makes the most sense for the term. Given that these are ISA names, calling it the base ISA makes sense. You could call it Basic if you wanted.

E

Or would it make sense to call it A64 Common?

B

No, they call it the A64 base instructions, and.

E

B

Then yes, it's fine. So I would do like A64 Base; that works.

F

Is the existence of a type called Base with a capital B going to somehow mess up IntelliSense, if you try to type the C# keyword base? It's going to be called A64 Base, okay, but.

G

It generally didn't I mean I, get.

A

I mean it will show up over the I. Think bass will show up before I came over to show up before this people saw it before that, and.

B

If it does cause a problem, that's about.

D

Across all the intrinsic types, are we going to have the same-named types underneath the namespaces? Like, I think we're naming this one AdvSimd. Is that going to be the same for each of the different architectures or processors? No.

B

So with x86, the ISAs have like trademarked names, like SSE, SSE2, AVX, and that's what they use today, yeah.

B

The abbreviation stands for SIMD; it's just, they may actually have a trademarked, formal abbreviation for it.

A

Concerns in this white, otherwise closest one as well is.

B

And that same comment there, "byte index must be a JIT constant", also applies to a lot of the x86 instructions, and what we do today is, if the user passes in a non-constant value, then we emit a fallback, which is basically a jump table for each index. So it's not as efficient, but it still emits the contracted instruction.

J

And you would never expect people to use that; it's just there so that the debugger and reflection can work.

C

If you run a for loop, it might show up.

D

A number of the reviews have the IsSupported API listed in there. I assume that's the only one in this class? We're not intending to have one for each of these groups of API reviews? Well, I mean, you could not, the.

G

Same name that I was gonna: ask.

D

If we need to change the name, this.

A

Is why this is where the idea was that the containing time must it's.

B

Already, a spec that doesn't change.

A

B

The IsSupported check is basically: if the CPU says this ISA is supported, then everything in this class is supported under that ISA. The one difference there is on x86, where you've got the 32-bit versus 64-bit extension. There may be a couple, specifically the long overloads, that throw platform not supported. Yeah.

B

Do we need another IsSupported to switch on for those extensions? Maybe. If we were to do that, we'd have to create a subclass that contains just the 64-bit ones, right, because otherwise you'd have like an IsSupported for five or six different individual methods, and that would just get insane. It's probably easier just to say, don't do this on 32-bit. I just wanted to ask whether.

D

Or not attention that you had to.

C

For the architecture, we had sort of decided that we would have one IsSupported for every sort of architectural extension that was enabled or disabled on a given CPU. And so then, if for AdvSimd the half-precision support was added, it would be a separate class. So.

D

It might be City half or something, but the extensions are set up to the rule right. These extensions are only.

D

B

They're not they're.

D

Not extensions, they're just regular methods. No, I mean not extension methods in the sense of C#; I'm talking about the processor, yeah.

B

Well, so for example, SSE2 has vector instructions which also support long, but then there's a couple like Extract, or Insert is a better one, which takes a T data, and for 32-bit, T data can't be long because there is no register for a long.

B

But that's completely fine on 64-bit, and they're both under the same ISA, so under the same.

C

G

How this recode was a separate.

C

I say a separate.

D

Class, at least. I think they need to be separate classes. I'm wondering whether or not there's something that people can check to determine whether this is going to succeed without having to handle the PlatformNotSupportedException. I would.

B

Hope the analyzer would cover that, and we just tell people, if you're writing these instructions, you really should be using this analyzer. Do you need to do it at run time in some cases? Maybe. Yeah, but then you just do, and I don't know if this is a JIT constant right now, but it probably should be, if OS architecture or process architecture is x86 versus x64.

D

So you would do the.

B

D

B

And hopefully we would get the JIT to fold that to be a constant, just like we do with the IsSupported checks, right.

J

I mean that the JIT has checks for that, and so they are effectively just one constant.

D

J

Does generate new code.

F

127, would you expect this to be a warning with the index parameter? Is all the noise for.

B

For the analyzer, I planned to emit a warning and say something along the lines of: you're not passing in a constant value; this may have less performance than expected, or something.

F

Why? So if the JIT is generating a switch table, doesn't that kind of go against what we were saying earlier about how these should only correspond to specific instructions? Well.

B

So what it does is it emits a jump table that basically says: switch on index; case 0, and then it emits the exact instruction with the zero byte encoding; case 1, the exact instruction with the one byte index. So it still calls the exact instruction, but you're only expected to have that happen if you do something like reflection.
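
The fallback described here can be sketched in portable C. This is only an illustration under stated assumptions: `EXTRACT_BYTE_IMM` is a made-up stand-in for an instruction whose lane index has to be encoded as an immediate, `extract_byte_dynamic` is a hypothetical name for the fallback path, and masking the index to the legal range is a choice made for the sketch, not something stated in the discussion.

```c
#include <stdint.h>

/* Stand-in for an instruction that needs its lane index as an immediate.
   By convention in this sketch, LANE is always a literal constant. */
#define EXTRACT_BYTE_IMM(v, LANE) ((uint8_t)((v) >> (8 * (LANE))))

/* Fallback for a non-constant index: one case per legal index, and every
   case still uses the constant-index form of the operation. */
uint8_t extract_byte_dynamic(uint64_t v, int index)
{
    switch (index & 7) {           /* assumption: wrap out-of-range indices */
    case 0:  return EXTRACT_BYTE_IMM(v, 0);
    case 1:  return EXTRACT_BYTE_IMM(v, 1);
    case 2:  return EXTRACT_BYTE_IMM(v, 2);
    case 3:  return EXTRACT_BYTE_IMM(v, 3);
    case 4:  return EXTRACT_BYTE_IMM(v, 4);
    case 5:  return EXTRACT_BYTE_IMM(v, 5);
    case 6:  return EXTRACT_BYTE_IMM(v, 6);
    default: return EXTRACT_BYTE_IMM(v, 7);
    }
}
```

The cost is the extra branch through the table; the operation that finally runs is the same one a constant index would have selected directly.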

J

It's not a normal code pattern; we really wouldn't want people to be doing this, because, you know, you lose efficiency, but yeah.

B

So it still emits the documented instruction, and it's not like a software fallback or anything, yeah.

B

We just thought it was a better experience than throwing in the case where they didn't pass in a constant value, or did call this via reflection, or anything else that was trivial to support, yeah.

J

So the biggest thing that convinced me of that was a debugging argument: there were cases where the debugger would actually want to use reflection to call something, and there you wouldn't want it to throw.

B

You also wouldn't want something like: in release mode the JIT correctly folds this constant down, but in debug mode it doesn't, so this works in release but then fails in debug mode, right.

A

The names are unfortunate here with insert, but other than that for.

D

Have you talked about the namespace? Why is it Arm.Arm64? Why not just Arm64?

B

I think that was just the original proposal. I think just Arm64 would make sense, and people would be able to see Arm64 versus Arm32. Yeah, I would just make them side by side. Yeah, is.

E

There anything that's.

K

Common between the two is.

C

K

That should be just an Arm namespace; we put it there in case of that, but I don't think there is anything yet. Would it be the end of the world if we made Arm32 and Arm64 top level and then later said, oh, these are common, so we just put them in Arm, like as a side namespace? They're not nested, but I think that.

G

We don't I mean in case that ever came.

B

K

We're still have it out there, yeah I, think that made sense to me. Yeah.

E

I would go with that. Well, or we could name the namespace System.Runtime.Intrinsics.Arm, and then the type names would be prefixed, Arm64 and so on. You'd have to prefix all the types, but.

F

We are doing this kind of anyway, just like he'll be named. The arcs 86 namespace of x86 x86, 64 I, know.

B

I guess the problem is that we'd have to rename AdvSimd to A64 AdvSimd, because it's AdvSimd on both 64 and 32, but they're basically different instruction sets in some sense. What.

A

We do in Franco, do we have x86 Nexus, of course, never know.

D

Because they are documented exist.

A

This is what I was asking.

B

About earlier, which I.

D

Guess maybe I didn't specify, but are they going to have the same type names in different namespaces? Because that's going to potentially cause problems if you're trying to make the right call. You just told me now that Arm32 is going to have the same type name as well, but so.

E

If we collapse it into the Arm namespace, I would probably then just drop the "advanced" and call it A64 SIMD, or A64Simd, and I'm against.

D

It predict the quadrature.

E

Having the same type names across namespaces; it's a recipe for disaster.

G

B

I mean if anyone hits it well, though, don't see it, and then they would have to explicitly yeah.

D

That's what just makes it much more complicated to use decoded? No, it's.

B

Annoying, but are there going to be a lot of people who are writing Arm32 performant code, since Arm64 seems to be picking up? So.

D

Yeah they're still gonna, probably be people who want to be able to write code. That's going to handle all the processors for different missions, yeah.

E

Maybe not many will do it, but what's the negative, if we think that the negative of drafting this a adv is very large sure. But if we just wait on, you know a 64 of a 832, it seems like it would just I think you were to come in after Carl yeah.

I

C

So there's a comment here at the end that there was some discussion of what the parameter names and order should be, to be consistent between x86 and Arm64. It looks like I didn't update the order to match x86.

C

It probably correlates to the Insert case, which, I would guess, x86.

B

Would have the index at the end. Would it make sense to just order these the same as the operand order of the instruction, and not care about matching them up? I think that would also make the runtime handling easier to understand when you're looking at it, because then you don't have to worry about: oh, operand 3 at the C# level is really operand 2 at the instruction level, and so I've got to reorder them and have comments in my code indicating why, and what you're translating from.

F

The C style intrinsics do they tend to keep these parameters in the in.

F

Because if they do that'd.

C

Be a really good argument for them. That would be my preference and that's why I wrote it this way. I think x86 has it in the other order, but it may be that their instruction order goes in a different order.

B

Yeah, because for x86 the index, the immediate parameter, is always the last operand in their encoding. Okay.

C

Yeah, whereas for ARM you'd actually write V0, bracket, lane 3, comma, something, yeah.

C

Okay, so this makes sense, so it's okay, that they're different I'd.

A

Say yes, I mean in the end, like I, think you should just match as closely to the instruction as possible for the reasons that I mentioned I think you're good with ability for your customers.

A

All right so then, actually no, that.

A

Really works if you actually pass the back.

F

This tether is something that you were talking about: yeah.

B

F

Go to call the namespace right.

B

Yeah, this is one of the ones where it's core functionality; it's effectively a helper method for initializing. But this is actually an instruction on Arm64, right? Well, it's effectively an instruction on Intel as well, but, and I'm not sure if it's the same with Arm64, at least on x86 you might use one instruction if you only support SSE.

B

But if you support AVX, you might use a slightly more efficient encoding to do it, and so it's a helper method because, depending on the ISA supported, you may do something slightly different to efficiently SetAll. So one of the.

C

Arguments was that the common stuff should be implemented in terms of the base intrinsics at one point, so if you want to do the common one, it should be implemented in terms of this and the x86 equivalents, right. So that's why that's.

B

Like on x86, SetAll effectively does: if SSE is supported, then do a shuffle; if AVX is supported, do a permute; if you're loading from memory, then do a broadcast, etc. But.

C

That's done in managed code, right? It.

B

Should be done again.

C

B

Yeah, it's currently C.

C

B

Partially managed code and then a little bit of runtime support so.

E

This is nothing intrinsic.

B

I think Steve was saying that in ARM's case this actually is a single instruction, and you always use the single instruction, whereas with x86 it varies. So I would generally.

A

Say that the way I see these types is, they should expose effectively almost all of the instructions, if that makes sense. And then, as Steve said, what I would do is I would have a generic setter on the types that is implemented in terms of those. On ARM, it might just be calling this one method; on x86, it might mean either calling two of them or calling different ones depending on how far the spec level is. So what would the common API look like?

A

It would be, I think, a method on Vector that says SetAll or whatever, so a constructor or a static method, yeah, I.

E

A

Ted, exactly because.

E

The constructor, obviously. In addition, this method name is kind of strange: Set in a .NET API implies that there is some destination that you set it to; this one is more like Create or so. Yeah. This.

D

Man is not here because.

E

Fill, again, would take the destination. This one is basically Create.

A

A vector that has these values, yes. I mean, the same as the previous method that was called Insert, but you don't really insert it into the vector; you just set a particular element. It's.

B

Effectively Insert is a set on an immutable type. It.

G

B

If you had an immutable type and you exposed an Insert method, you'd expose basically a fluent API there that creates a new value with the element set and returns the new one.
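
The "create" versus "set on an immutable type" distinction can be shown with a small C sketch. Everything here is hypothetical: `vec128_u8` is a plain struct standing in for a 128-bit vector, and `set_all_u8` and `with_lane` are illustration names, not the names being reviewed. Because the struct is passed and returned by value, the caller's original is never modified.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical stand-in for a 128-bit vector of 16 bytes. */
typedef struct { uint8_t lane[16]; } vec128_u8;

/* "Create" style: builds a new vector with every lane set to value. */
vec128_u8 set_all_u8(uint8_t value)
{
    vec128_u8 v;
    for (int i = 0; i < 16; i++)
        v.lane[i] = value;
    return v;
}

/* "Insert" style on an immutable value: v is a by-value copy, so the
   caller's vector is untouched and the modified copy is returned. */
vec128_u8 with_lane(vec128_u8 v, int index, uint8_t value)
{
    assert(index >= 0 && index < 16);
    v.lane[index] = value;
    return v;
}
```

A call such as `with_lane(set_all_u8(7), 3, 9)` reads left to right as a chain, which is the fluent shape described above.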

B

Well, at least with x86, it's called Set because that's what the C intrinsic is called. Oh.

C

Yeah, so there was debate over the months of whether we were trying to keep things consistent with ARM nomenclature or consistent with the C# x86 nomenclature. But.

E

I think we should be consistent and choose one, and then it.

B

Was basically: x86 does it, so maybe we should do ARM the same way. But I would think, for familiarity for people who are already writing that code, we should be calling this Duplicate instead, because for the C++ intrinsic it's called vdup_n_, you know, s8 for signed byte, and so people will not go to AdvSimd and type SetAll; they'll type "dup" and expect Duplicate to come up. Which.

A

Seems reasonable to me. I mean, I would even say, if you already have to pick a different name, then at least make it one that is consistent with what the spec says.

B

One of that kind of the spec yeah.

A

Because it seemed, like said all seems, like almost I mean it said, all kind of said it, but.

F

And honestly, we would imagine most people just using Holly.

A

Especially on x86, when it's more involved, but you know it doesn't matter what they call this one.

B

Kristy is effectively broadcaster from you.

F

Yeah almost like using the x86 instructions like it, took me a long time to figure out how to create one of these things. Yeah from.

B

F

Broadcasters, family yeah and that's.

B

Where the SetAll is supposed to abstract that away from you, and why it's the helper method.

D

B

But then there's also the thing that today we're exposing SetAll on SSE, but SSE only supports float, while the SSE SetAll is also supporting the other nine types that are not supported until SSE2.

F

I, don't even remember what method I ended up, calling to be honest, I do I, have one.

E

Go with this one.

A

All right, Arm64 generic intrinsics here. I.

B

Think the generic intrinsic comment is because both ARM and x86 expose CRC32. So this is where we would expose core functionality, in addition to the intrinsics, at some point in the future. I will.

A

Be very careful with doing generics at this level. I would honestly say, the way I would do it, I would rather have some duplication between the individual hardware stuff, and say then there is a completely non-intrinsic, generic BCL API that will always work. I think that's a way better way to factor that than to try to say, oh, they have some overlap.

A

Let's try to find some way to share. Because for the six or seven ISAs it doesn't seem to be worth it, and it's probably a very small percentage of things you can share. The.

B

People who want the shared functionality will probably go looking for System-dot-whatever instead. So what would this be called?

C

Tell if it's a 64, crc32 good so.

E

What is A64 Crc32? What.

B

What is the cpu ID check for this.

F

Does 32-bit ARM support CRC32? It says.

B

That it does, just not the 64-bit forms. Oh.

D

I'm missing something: these are the same method, other than the C at the end of the second method names here?

C

The two algorithms use a slightly different polynomial. Okay.
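
To show what "a slightly different polynomial" means in practice, here is a bitwise C sketch that computes a CRC with either the standard CRC-32 polynomial or the CRC-32C (Castagnoli) polynomial. The function and macro names are made up for this example. The initial value and final inversion are the usual checksum conventions and are an assumption of the sketch; a hardware CRC instruction only performs the per-chunk polynomial step.

```c
#include <stddef.h>
#include <stdint.h>

#define POLY_CRC32  0xEDB88320u   /* bit-reflected form of 0x04C11DB7 */
#define POLY_CRC32C 0x82F63B78u   /* bit-reflected form of 0x1EDC6F41 */

/* Hypothetical helper: bit-at-a-time reflected CRC over a byte buffer.
   Only the polynomial differs between the two algorithms. */
uint32_t crc32_bitwise(uint32_t poly, const void *data, size_t len)
{
    const unsigned char *p = (const unsigned char *)data;
    uint32_t crc = 0xFFFFFFFFu;                 /* conventional start value */
    for (size_t i = 0; i < len; i++) {
        crc ^= p[i];
        for (int bit = 0; bit < 8; bit++) {
            /* if the low bit is set, shift and fold in the polynomial */
            crc = (crc >> 1) ^ (poly & (0u - (crc & 1u)));
        }
    }
    return crc ^ 0xFFFFFFFFu;                   /* conventional final inversion */
}
```

For the ASCII string "123456789" the two polynomials give the well-known check values 0xCBF43926 and 0xE3069283, so the same input produces unrelated results depending on which variant is called.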

F

So the type name would be called A64 Crc32, even though it also works on 32-bit? Yeah.

B

This is one of those cases where both expose it, but they're quite different at times in what exactly it supports. This might be a case of, if we're going to put everything under an Arm namespace: if we can look at the spec and determine that they are identical in both CPU check and behavior between 32 and 64, we should just call it Crc32 or whatever that ISA is. But if there are actual differences between the two, because they're technically distinct ISAs, we should make them separate.

B

But so, if you on I'm 32.

E

And you use, the you know, will be kind of simulating software. No.

G

Well, that's something it would not be an intrinsic anymore good. Well,.

B

That's where today we say it throws a PlatformNotSupportedException, but.

A

Why would we not, then, have a 32-bit CRC32 and a 64-bit CRC.

B

32? Possibly. We didn't do that for x86 because, at least for x86, it applies to like 5 or 6 instructions or methods total, and it wasn't worth the additional complexity to support that. And.

E

Also, we will then need a base crc32 base, because if you have a shortened it always works. So we would end up with, and you can maybe understand.

G

E

G

E

Would have base and 64-bit; we could end up with two types, yeah, and then you just make one type derive from the other and you're good to go. Yeah.

B

The CPU check for this is ID_AA64ISAR0, yeah. I wanted. We.

F

Didn't in practice like we were considering exposing this, we would call band type anymore, so yeah.

E

This is not for kind of user, but just for consistency such that you know, I can kind of understand. Oh.

D

Yeah, like the document we.

E

Had at the beginning, those are the rules, how we do intrinsic and then you go into the namespace and you find what they will do. We said that the rules are versus, we said the rules. Are that and then oh I totally.

F

I call this supported.

E

F

32 I, just don't want to be in a situation where, like you're having these exact same.

C

So consistency makes implementation easier too. Like, a lot of this is a table-driven implementation. So if you try to add exceptions, where like this class works on both, and I can't answer the IsSupported question from Arm64 code, then it's more complicated. Yeah, but, you know, you make the implementer make it do the right thing anyway. So, but it is, it's.

A

To me, the biggest one with these types is not so much usability, because usability is what it is. But I would say that you need to be able to predict where stuff will run and when it will fail, right, and that's why I think it's worthwhile to just mirror the reality here and say: there are two types, two different things. Yeah.

E

But so if we make it into two types, what are they called? Are they called. I would.

A

Call it base and 64? No, I would have an A32 Crc32 and I would have an A64 Crc32, and make that one extend the other. So if I have an int now, if you have an if statement, no.

F

Because you would always be able to use the 32-bit version? Yes. Oh, okay, because the 64-bit version would simply say class extends. Well, obviously, it's.

A

A static class, so it wouldn't be static. No, it would be so you can sue the rightfulness right. You just don't make an Isetta time because to make it a private, you know bake it face it, just sealed and then the compiler decid it does the right. So we make it. We.

F

Won't be in static, it's it's! What we do with the x86 instructions, where we have like a beer tutus, the difference.

B

There, though, is that, like.

B

x64 is a direct extension of the 32-bit; you cannot implement the 64-bit without the 32-bit. Whereas with ARM, AArch64 is not, strictly speaking, an extension of AArch32. So you can't necessarily say that the AArch64 Crc32 extends from Crc.

E

Well, what you say is true for other intrinsics, but for CRC32 it happens to be true that they are. It.

B

Might be true, I'm trying to determine that now, but if even then, it still might not necessarily be a safe assumption, it.

C

Just seems like it's not a common enough case that we want to special-case this one. Inheriting from an Arm32 class seems suspicious. So.

E

Basically, we will have two types, and you have to know that. Well, according to what you just said, you cannot even know that you can call this instruction on Arm64, yeah.

B

Which is why, originally, they were in two different namespaces here, but.

C

B

Right this is this.

C

Was justifying putting it in an Arm64. This sentence at the bottom was just justifying putting it in the Arm64 namespace. So.

E

Maybe we just have.

C

Tuned in a common.

E

We have two types, and then, if you have the short or int and you want to have support for both 64 and 32, you write an if statement. As I said, it's not that common, and A32 is going away, so it would be a non-issue.

E

Meanwhile, if you want to have a right, adaptive value does have a mean statement.

F

Look for me, part 3. She was famous last words just like how exodus is going.

A

So, is it fair to say that the Arm64 one will be its own type and it has a copy of the 32-bit methods? Either one works for me. All right: load, store, cast.

C

In the assembly code, there's no reason to require them to be aligned. Yeah.

B

Yeah, and for x86 we did: load vector is unaligned, and then load aligned vector is aligned. But it sounds like ARM doesn't distinguish.

F

You know it used to it doesn't not anymore. Maybe never, professor.

C

There are some instructions that require alignment and some that don't, if you turn alignment checking on in the OS, but I think all the OSes have it off right now, if I remember correctly.

B

So you're saying that the underlying instruction's behavior depends on potentially a flag setting that the OS could set? Yeah, I think so.

C

I'm not sure, though, I think the basic loads. It's not that I, don't know. I'd have to think about well.

F

I guess I'd have to go look. The meta question is, with these particular APIs, can I write an application that works correctly regardless of what the OS or processor setting might be?

B

If you just make them aligned, then yes. Well.

F

But the data might not be aligned; the data might be coming in from the network, so it might not be aligned. I might just be saying: read these bytes from this arbitrary pointer and make it a vector register. I think.

C

That the assumption was that, yeah, I guess they could always use Unsafe.ReadUnaligned; that should work, right?
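
The "read these bytes from an arbitrary pointer" case has a well-known shape in C, sketched below as an analogy. The names `vec128_u8`, `load_unaligned`, and `is_aligned_address` are hypothetical. Copying with `memcpy` places no alignment requirement on the source pointer, which is the same guarantee an unaligned read helper is meant to give.

```c
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* Hypothetical stand-in for a 128-bit vector of 16 bytes. */
typedef struct { uint8_t lane[16]; } vec128_u8;

/* True when address is a multiple of alignment (a power of two). */
int is_aligned_address(uintptr_t address, size_t alignment)
{
    return (address & (uintptr_t)(alignment - 1)) == 0;
}

/* Loads 16 bytes from any address; memcpy does not require p to be aligned. */
vec128_u8 load_unaligned(const void *p)
{
    vec128_u8 v;
    memcpy(&v, p, sizeof v);
    return v;
}
```

Code that wants the aligned fast path can test the address first and fall back to the unaligned load when the test fails.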

B

C

A lot of these correspond to basically a load of a vector, which must support bytes. So if it just corresponds to a byte-oriented load, then that load's not going to have an alignment fault. So the element has to be aligned.

F

On Intel, at least, it's not that the element has to be aligned; it's that the vector itself has to be aligned, because.

C

I think this is actually just the element has to be aligned for these, if I remember correctly, but Tanner is looking, yeah.

B

G

To find Annelise.

B

It's got this really cryptic: if n equals 31, then check alignment, otherwise don't. Oh, n comes from the Rn parameter.

B

It looks like it might be a setting you can encode as part of the instruction encoding if I'm reading this right does.

F

That imply that we would want a line; it wouldn't mean over what it would be any other than that point.

B

If you can actually encode both versions in the instruction I would say yes, but I'm not familiar enough with reading the arm manual to say for sure.

E

Yeah, it would be good to this. You know figure it out before we end, because this athletes.

F

It also gives consistency with actual users, which I know isn't terribly important. Yeah you.

A

Want me thinking about Simon I mean we should probably be somewhat specific about Hollywood encoders in instructions in.

F

Terms of names I mean it's a good point. Oh we would. We would at least want to say, like this is guaranteed to work, even in the face of I, wonder if I might next and they say you.

A

Know you know it's a figures you down or whatever the roads are open.

A

All right, so I think just do some homework. The other ones are fine. Like the static cast is also one that we probably want to have on the generic vectors, right? Well.

F

What exactly does static cast do here? It says move, but it's not really a move, right? Isn't it just telling the JIT that this is now a different type? Oh yeah.

B

So static cast, well, unless Steve had a different idea for this, was supposed to be basically reinterpret cast, but with the same name as x86 uses, which is StaticCast. Right, but reinterpret cast doesn't correspond. Yeah.

D

B

To be a no op in all cases, right.

C

Yeah, I think this corresponds to a move in the Arm64 JIT, but if the move is to the same register, that move just gets eliminated. The static cast here could create it, because it's creating a new object that could go into a new register; just because you cast it doesn't mean the old object's going to disappear. No, of course, but it seems like move doesn't.

F

Really belong as an intrinsic yeah.

C

This is really a static cast, I think.

B

I think what Steve was just saying was how the static cast was originally implemented for x86 as well, which was: by the time we got to codegen, if target register and source register were the same, then we just dropped it; otherwise we did a move. But then Carol suggested that in the importer we just eliminate the static cast there, and then basically you never see static cast past the importer. So.

F

I have a related question along those lines. Consider APIs like AndNot. Is the JIT free to say: you have vectors A and B, you're inverting B and then ANDing it with A; you've called two different instructions, but I'm still going to collapse it into a single AndNot? Is.

B

That a valid optimization? We don't do that today, and there's still discussion on whether or not we should do that, because the user may be taking advantage of pipelining or something else. But that's still an up-in-the-air discussion. But it sounds like you might want to reserve the right to.

B

Like, effectively all the native compilers do this. If they see a shuffle by like 0x68, they convert it to an unpack low instead of doing the shuffle, because it does the same logical operation with a much more efficient encoding. We don't do that today, and there's still an open issue of: should we allow this, or should we just say we will never do this and we'll rely on an analyzer to tell you that you should be doing this instead? Going that way.

B

The users at least get deterministic behavior, and they can pipeline as they see fit and trust that it will work correctly.

C

So these are using void* addresses. I was always surprised that we didn't choose to use a Span. Is there some reason Span doesn't work here? So.

B

The reason this is also an open discussion.

B

There are some users who want us to add either ref of T or Span overloads, and after having thought this over a lot and discussed it with various people, I am against it. The reason being that, at least with x86, oftentimes you are doing this with performance-oriented code, and you are wanting to do things like work with a large number of elements in an array, and so the typical operation should be: I.

B

Pin this, I do something to make sure the rest of the data is aligned, and then I operate on all the data as if it was aligned till the end. Because otherwise the GC is free to move your types around if it's in a span or a ref, and that will kill cache locality, that will kill alignment checks and everything else. So in the end, it just ends up being better for users to pin their types and operate under the assumption that it has been pinned.
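
The pattern described here (handle the bytes before the first aligned address one at a time, then process aligned blocks, then finish the remainder) can be sketched in C. This is an assumption-laden illustration: the names are invented, the 16-byte block size stands in for a vector width, and the inner loop is scalar where real code would issue one aligned vector load per block. C memory does not move, so the pinning step has no counterpart in the sketch.

```c
#include <stddef.h>
#include <stdint.h>

/* Bytes to process individually before address reaches the given
   power-of-two alignment; 0 when it is already aligned. */
size_t bytes_until_aligned(uintptr_t address, size_t alignment)
{
    return (size_t)((alignment - (address & (alignment - 1))) & (alignment - 1));
}

/* Hypothetical example operation: sum of all bytes, split into an
   unaligned head, 16-byte aligned blocks, and a leftover tail. */
uint32_t sum_bytes(const uint8_t *data, size_t len)
{
    size_t i = 0;
    uint32_t sum = 0;
    size_t head = bytes_until_aligned((uintptr_t)data, 16);
    if (head > len)
        head = len;
    for (; i < head; i++)                 /* scalar head up to alignment */
        sum += data[i];
    for (; i + 16 <= len; i += 16)        /* data + i is 16-byte aligned here */
        for (size_t j = 0; j < 16; j++)
            sum += data[i + j];
    for (; i < len; i++)                  /* scalar tail */
        sum += data[i];
    return sum;
}
```

The alignment computed for the head only stays valid if the buffer does not move while the loop runs, which is the argument being made for pinning before operating on a large array.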

F

On to say that, if you took ref of T, you would want to prefix every single method with the same yeah.

B

And if someone does have a Span of T, they can just pin the Span of T and operate on it, because the normal thing shouldn't be "I'm doing one operation on a span"; it should be "I'm operating on the entire span", or a large subset of it.

E

We can always add Span a little later. Yeah, I.

E

Agree spend I'm on defense. I think it would be.

F

Although there are people who'd like it. There are APIs right now on MemoryMarshal, I believe, that can go from a span of byte into a vector. But then, all of a sudden, you kind of deal.

E

With unsafe code and apology, which is likely to not support it, which I mean then something right.

A

I mean, I understand the reason why you want to have a pointer one, but is there any reason why you cannot also have a Span one? I mean, can't you just overload? Why.

B

Yeah, you could just have them, but that may prompt people to do the wrong thing. Basically, the difference between using a span directly, and pinning a span and using it for a large array, is multiple orders of magnitude. If you take a span.

E

You would pin inside the method. Well then, why not just use Span, because then the user doesn't have to have unsafe code? You have to have.

G

E

Code in this file, why do you get all.

C

E

Not the user code. The user code just has a span, because this method would be completely safe, yeah.

B

And that could also be provided in a helper library, not as part of the core API. Well, you know, there's so many things we could be doing.

F

Nothing stopping.

B

Us from having this in the future. So Steve, to answer your question, the reason that you basically have to pin it is because otherwise the GC can move the underlying span, and then you're stuck with: oh, I read from address X and now I'm reading from a new address rather than address X plus one. But.

C

If you're passing in a span, the garbage collector, when it moved it, would update the span, so the span would be consistent. It would be just like any other time the span was, right.

B

But the underlying reference was relocated, and now you've messed up cache locality and non-temporal stores and everything else that you may have wanted to optimize your code for. Also, you have to answer the alignment question, which was, yeah.

D

B

No longer have the ability to say "this was definitely aligned because it was before," because it needs to be an aligned vector. Well, at least on the x86 side of it, yeah. On x86 there's an explicit load-aligned instruction, which basically people want. Objects in the heap are aligned to, like, only...

C

To the natural word size.

B

They are not 16-byte or 32-byte aligned, which is many times important or required for vector types, so that you prevent a read crossing a cache-line boundary, which will drop your perf considerably. It just seems...

C

Like you've inverted it. If you want to pin it, so you don't want it to move, you pin it. You don't want the intrinsic itself to require the pinning, though.

B

No, but it encourages the correct behavior for "I'm writing code that's dealing with high performance." If you don't pin it, then you're going to get unexpected perf hits. Most of the time, people will be wanting to pin it, and that's how they will be operating on it.
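Editorial sketch of the alignment point made above: an address that is aligned to the natural word size (8 bytes here) is not necessarily 16-byte aligned, and a 16-byte read from such an address can straddle two cache lines. The 64-byte cache-line size and the sample addresses are assumptions chosen for illustration; nothing here is part of the proposed API.

```python
CACHE_LINE = 64  # assumed cache-line size in bytes; typical, but hardware-dependent


def is_aligned(address: int, alignment: int) -> bool:
    """True when `address` is a multiple of `alignment` (a power of two)."""
    return address & (alignment - 1) == 0


def crosses_cache_line(address: int, size: int, line: int = CACHE_LINE) -> bool:
    """True when a read of `size` bytes starting at `address` touches two cache lines."""
    return address // line != (address + size - 1) // line


# Hypothetical heap address: word-aligned, but not vector-aligned.
addr = 0x1038
print(is_aligned(addr, 8))           # word-size alignment holds
print(is_aligned(addr, 16))          # 16-byte alignment does not
print(crosses_cache_line(addr, 16))  # the 16-byte read spans two cache lines
```

A pinned buffer keeps whatever alignment was checked up front; a buffer the GC may relocate gives no such guarantee from one access to the next.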

B

A

E

An extra item on the.

A

On the meta issue.

E

A

Methods dealing with pointers should also accept Span<T>, or instead of that. I think that's a fine design issue that you guys should figure out, but...

E

I think that's, you know, let's take the note and move on, because, as we said, we know about the inside implementation of the method: the span one would pin, then just call this one. It's just a convenience for the caller, so they don't have to have unsafe blocks. All...

A

Right, two more. Should we, can we wrap them up? I don't know how much time you'll need, but all these are crypto. We just read something.

B

Well, I think the question was the same one that came up for x86. It's like, AES, sure. SHA-1: technically, SHA-1 is completely deprecated. There's still people who may want to use it for non-cryptographic purposes. Should we still expose it? SHA-1 still...

F

Has its uses, even for cryptography: HMAC-SHA1, which is not busted. Someone is going to comment on the video after the fact and say "but it is," so yeah.

A

I want to see that if I want to read it good people in advance way, that gives any reason why I shouldn't use. You know those ones. For example, there have been other.

B

Other than the fact that, like, at the enterprise level, various corporations will say, "I don't care what you're using SHA-1 for, just don't use it." So it's just that question. Even we have rules like "you're not allowed to use SHA-1 for anything, no exceptions," except for the already exposed APIs. My...

A

I mean policy at that level of to be honest and like, if you wouldn't.

F

A

That's fine, then you look for that, or you code-audit, or whatever, but it seems like not giving people functionality just because somebody might not like it is generally an error. I...

B

Would personally vote to expose it. I just wanted to be sure to raise the...

A

B

It's the same thing that came up for Intel, where we said we're not exposing it because of this. I think...

E

B

Jeremy was the one who raised it last time, so.

E

I put in a comment about the underscores; I think we should not do that. There's...

C

An open issue to rename these parameters. Unfortunately, I don't understand the algorithms well enough, and I didn't really write this. But actually, right now, these parameters correspond...

F

To words zero through three, words four through seven, words 8 through 11. Although, if you look at the actual algorithm itself, those parameter names actually make sense, but...

A

Yeah, I think you could go with a name that doesn't involve an underscore, and there's other ones here, like, you know, hash_abcd and...

E

It's totally something, but somebody who understands the semantics would be good to work with you guys on the names, because, you know, some transformations may not be trivial. Like w0_3: unless you know that it's from zero to three, you probably wouldn't know how to...

A

K

In the case where the algorithm explicitly names it, whatever name, we typically use that name, right? Even if it breaks our naming conventions, no?

F

Well, what I was saying is that if you actually look at the SHA algorithm, these terms actually appear in the math. Yeah, well...
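Editorial sketch of what the parameter names discussed above refer to. SHA-1 and SHA-256 read each 64-byte block as sixteen 32-bit big-endian words, and names such as w0_3 denote groups of four consecutive words. The function names below are hypothetical, and the w12_15 group is extrapolated from the pattern the speakers list (words 0 through 3, 4 through 7, 8 through 11).

```python
import struct


def message_words(block: bytes) -> list:
    """Split one 64-byte block into sixteen 32-bit big-endian words."""
    if len(block) != 64:
        raise ValueError("a SHA-1/SHA-256 block is exactly 64 bytes")
    return list(struct.unpack(">16I", block))


def group_words(words: list) -> dict:
    """Group the sixteen words four at a time, using the underscore-style names."""
    return {
        "w0_3": words[0:4],
        "w4_7": words[4:8],
        "w8_11": words[8:12],
        "w12_15": words[12:16],  # extrapolated name, not quoted in the discussion
    }


block = bytes(range(64))  # dummy input, for illustration only
groups = group_words(message_words(block))
print([hex(w) for w in groups["w0_3"]])
```

Each group of four 32-bit words fills one 128-bit vector, which is why the names pair up with vector parameters.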

A

I would say, if it's a random abstract thing that happens to be a parameter...

K

We have some of the APIs in cryptography that have properties named, you know...

A

Well, I would say, generally speaking, cryptography is not a shining beacon of usability, but I think that's...

K

A

What I would say is that it depends on why they're named this way, right? Sometimes they're just lazy and just name a buffer a, b, or whatever, and there's no reason why that should be named like that, right? And there's other cases where you name a parameter coming from some math that just happens to be called E in any sort of math paper, and that's the only sensible name for that, in which case it's fine, right? So, but I think there's a, you know...

A

There's no reason why, you know, RSA should be spelled out, for example, right? But there's also no reason why they should be all uppercase, right? I mean, so there's conventions and there's rules, and then there's what the name of the thing is. Sometimes the name of the thing is the name of the thing, and that's fine, but it's...

E

Not necessarily the problem here, right? So I would say even in the crypto namespaces I don't think we have many underscores. Oh, we have plenty of violations.

F

F

But I mean, he is right that we have D, and that's also because the actual algorithms refer to those variables, yeah. So I was trying to keep the names...

E

Such that, if you understand the spec, you can look at this parameter name in the managed API and you can make a connection. But try to, you know, normalize the names: no underscores and, yeah, proper casing. Alright,

B

Last one, to answer the alignment question from earlier. There are explicit instructions: LDXR, which requires alignment, versus LDR, which does not require alignment. There's basically two different instructions, just like on x86, so maybe we would expose both, yeah.

D

We probably expose.

B

A load... they're called the load-exclusive and store-exclusive instructions, but when...

C

Those are exclusives; that's not just a normal load. Exclusive...

D

C

Atomics. So with the exclusive instructions, you do a load exclusive, then a store exclusive, and the store exclusive fails if someone else wrote to that location between your load and your store, but...
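Editorial sketch of the load-exclusive / store-exclusive behavior described above, as a toy single-location model. The class and method names are hypothetical, and the version counter stands in for the hardware's exclusive monitor; real hardware tracks this per core and may also fail a store exclusive spuriously, which this model does not attempt to show.

```python
class ExclusiveCell:
    """Toy model: one memory location guarded by a version counter."""

    def __init__(self, value: int = 0):
        self.value = value
        self._version = 0  # bumped on every successful write

    def load_exclusive(self):
        """Return the value plus a token identifying the state that was observed."""
        return self.value, self._version

    def store(self, value: int) -> None:
        """A plain store by some other party; invalidates outstanding tokens."""
        self.value = value
        self._version += 1

    def store_exclusive(self, value: int, token: int) -> bool:
        """Write only if nobody wrote since the matching load_exclusive."""
        if token != self._version:
            return False  # someone else wrote in between: the store fails
        self.value = value
        self._version += 1
        return True


def atomic_increment(cell: ExclusiveCell) -> int:
    """Classic retry loop: reload and try again until the store succeeds."""
    while True:
        old, token = cell.load_exclusive()
        if cell.store_exclusive(old + 1, token):
            return old + 1
```

This is why the pair is an atomics primitive rather than an ordinary aligned load and store.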

B

Those ones require alignment, whereas the regular LDR does not. But there is an OS-level flag to require stack alignment, which is different, so I just...

E

C

E

C

The stack always requires alignment, but that's a different thing, right.

A

So I wrote it up in my comment. You can decide what to say. Alright, so, more SIMD stuff.

C

Here yeah I was already saying it seems pretty.

A

Much the lamb we said earlier, no pun intended.

C

So, the pairwise and across: the pairwise adds pairs of elements within the vector, and it's the naming convention from ARM. And across, I think I copied that from x64, yeah, x86, sorry.
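Editorial sketch of the two naming terms just mentioned, using plain Python lists as stand-ins for vector lanes. The function names are hypothetical, and the two-operand pairwise form (pair sums of the first vector followed by pair sums of the second) is an assumption modeled on ARM's ADDP; lane-width wraparound is ignored.

```python
def add_pairwise(a: list, b: list) -> list:
    """Add adjacent pairs of lanes: first the pairs of `a`, then the pairs of `b`."""
    joined = list(a) + list(b)
    return [joined[i] + joined[i + 1] for i in range(0, len(joined), 2)]


def add_across(a: list) -> int:
    """Reduce every lane of one vector to a single scalar sum."""
    return sum(a)


print(add_pairwise([1, 2, 3, 4], [10, 20, 30, 40]))
print(add_across([1, 2, 3, 4]))
```

Pairwise keeps a vector-shaped result; across collapses the vector to one value.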

F

For this one, Extract: are you going to do the same thing for debugging, where, if the input's not a constant, you build up a switch or something? For the ones like Extract<T>, where it says the index must be a compile-time constant. Yes,

C

Yeah, so that's the.

B

Same case going on, yeah. So it's a switch table underneath, and I think in this case, because it's not a full-range immediate, it's a partial range, if it's outside that partial range then it will throw an ArgumentOutOfRangeException. Yeah.
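Editorial sketch of the fallback described above: when the index is not a compile-time constant, dispatch through a table with one entry per legal immediate, and reject anything outside the partial range. All names are hypothetical, and Python's IndexError stands in for .NET's ArgumentOutOfRangeException.

```python
def make_extract(lane_count: int):
    """Build an extract function backed by one handler per legal immediate."""
    # Each entry plays the role of one `case` in the switch table: the lane
    # index is baked into the handler, as an immediate would be in an instruction.
    table = {i: (lambda vector, i=i: vector[i]) for i in range(lane_count)}

    def extract(vector: list, index: int):
        handler = table.get(index)
        if handler is None:
            # Outside the partial range of valid immediates.
            raise IndexError(f"index must be in [0, {lane_count - 1}]")
        return handler(vector)

    return extract


extract4 = make_extract(4)  # four lanes: legal immediates are 0..3
print(extract4([10, 20, 30, 40], 2))
```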

A

Yeah, I like across and pairwise more than vertical and horizontal, which we had before, which...

F

Is there no long and ulong for these, for things like that one? I...

C

Think, yeah, there are strange places in the ARM instruction set where long is unsupported. Yet...

E

What is one a 32-bit I.

F

Mean as long as it's yeah.

C

They're usually called out; it's fully expanded. When long wasn't supported, that's one of the most common cases.

B

And just like with x86, I'm sure there's possibly typos in some of these, and we'll come across them as we implement them, and then log issues, get them temporarily fixed, and then come back post-implementation, like we're going to do next week for x86, and be like, "does this all look correct?" Yeah, a little...

C

The good news, regarding implementation: I think most of these things are implemented. The load and store are not yet; everything else that we've reviewed today, I think, is implemented. Obviously, the changes have been implemented.

G

And that rolls reasonably sweet, yeah.

B

Done, then. I think we're done for the day. What do we want to say about the rest of the AdvSimd instructions that we haven't reviewed, that weren't part of the proposal but still exist? Like, there's a number of them that didn't show up in any of these that we reviewed. My guess is 90% of them aren't proposed yet, or 80%. So yeah, there's like 4,000, but then, if I take what I copied off the...

B

The NEON page, it looks like there's roughly 4,000 different total methods, including overloads. What do...

E

You mean by "they exist"? They exist in the spec, but not in our API set? Well...

B

Yeah, so, for example, I didn't see AddWide up here, which is one of the instructions that is exposed as part of the AdvSimd instruction set but wasn't part of any of these six or seven proposals that we went through so far. We...

C

The result is wider than the input, yeah.

B

It's basically byte plus byte equals ushort, without...
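Editorial sketch of the widening idea described above, where the result lanes are wider than the input lanes, so the sum of two 8-bit values is kept in 16 bits instead of wrapping. The function names are hypothetical and model only the arithmetic as stated in the discussion, not the exact operand shapes of any particular ARM instruction.

```python
def add_same_width(a: list, b: list) -> list:
    """8-bit lanes in, 8-bit lanes out: sums wrap modulo 256."""
    return [(x + y) & 0xFF for x, y in zip(a, b)]


def add_widening(a: list, b: list) -> list:
    """8-bit lanes in, 16-bit lanes out: every sum fits, so nothing is lost."""
    for lane in (*a, *b):
        if not 0 <= lane <= 0xFF:
            raise ValueError("inputs must be unsigned 8-bit values")
    # The largest possible sum is 255 + 255 = 510, well inside 16 bits.
    return [x + y for x, y in zip(a, b)]


print(add_same_width([200, 255], [100, 255]))
print(add_widening([200, 255], [100, 255]))
```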

E

And are they not in the proposal because they are not yet implemented, or because you don't think that they should be implemented? No.

B

Because there's like 4,000 APIs in total, and it would take way too long to put them all in.

A

So, I mean, my opinion, generally speaking, is, I mean, I don't care about the spec completeness, right? To me it's more about, like, yeah, if we only ever needed these 13 out of the 4,000 in the spec, you know, whatever, no complaint, and that would be a good MVP for us to ship, right? Yeah.

A

It would be nice if we had a recipe where somebody could step in and add an intrinsic, and we can say: okay, you have to add this method here, put this attribute on it, change it over here, add the tests over there, and then rinse and repeat. And then we can probably open that, to an extent, to external contributors as well. I mean, my hope is that most of them are also motivated by what people ask for and not just us only picking from the spec, in which case, yeah, I don't see the problem.

E

Well, how do you see it? So this is, I still don't... What is the issue? You said, because there are 4,000 of them, and putting them in their separate proposals would be, for example, a lot of work. Is it to put them in the proposal and review, or actually implementing...

B

That? No, no, it's putting them in the proposal and reviewing it. So for x86, when we did it originally, behind closed doors or whatever, we basically had Intel come in and be like: here's the proposal, here's the abstract concept, and here's a review of select pieces of it, but we're proposing that this entire ISA be implemented, right? And now we've gotten it implemented. Are we also saying that for ARM64: we have looked at several of the APIs,

B

We've determined that this ISA is useful, we'll implement it, and we're going to implement everything, because they're exposed for specific reasons. We just don't have time to review them all in one sitting. Well, we may have time to implement it.

B

We do, but it would probably take just as much time to type up the proposal and separate out all 4,000 APIs.

A

So, the thing is, I think we have enough of a handle on what the patterns are, yeah. If now somebody effectively dumps the PR and says, "here's, you know, a three-thousand-API delta," then API review can look at whether this is consistent, yes or no. I mean, when we reviewed Roslyn, they had 25,000 APIs. We did not look over all of them. We looked at, like, three of the syntax APIs and said: okay, that's the pattern you have.

A

The rest is just the rest. In other.

E

Words, implement it, and then we have a tool to generate the API listing, and then we can do the review. It's kind of offline; I'd send an email and say, you know, "as far as I'm concerned, they all follow the same pattern. Let's discuss it there. If somebody finds an issue, please let me know." Yes,

B

And that's basically what I would like to do: you know, after I eat lunch today, go sit down and open an up-for-grabs issue on corefx, saying ARM64 AdvSimd and ARM32 AdvSimd have been approved; these ones have been formally reviewed; you can go ahead and implement them if you feel like you're up to the task; and the rest of them have been basically preliminarily reviewed and will be approved, and they'll be reviewed as the PRs come...

F

In. Yeah, that works. So, quick question: do these two overloads actually exist, or was that a copy/paste error? Reverse elements on... Let me check.

E

G

C

It's there. I think it doesn't necessarily correspond to assembly instructions, but I think it was for, like, when you read a file and it was a big-endian file and you're on a little-endian system: you need to reverse the bytes of the elements to be read from that. Where...

B

Things are encoded according to a big-endian architecture as well, because it's byte-wise or word-wise.
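Editorial sketch of the endianness use case described above: data written by a big-endian system is made readable on a little-endian one by reversing the bytes inside each fixed-size element. The function name and the sample data are hypothetical.

```python
def reverse_element_bytes(data: bytes, element_size: int) -> bytes:
    """Reverse the byte order within each `element_size`-byte element."""
    if len(data) % element_size != 0:
        raise ValueError("data length must be a multiple of element_size")
    return b"".join(
        data[i:i + element_size][::-1]
        for i in range(0, len(data), element_size)
    )


# Two 32-bit values, 1 and 256, as a big-endian file would store them.
big_endian = bytes([0, 0, 0, 1, 0, 0, 1, 0])
little_endian = reverse_element_bytes(big_endian, 4)
print(int.from_bytes(little_endian[0:4], "little"))
print(int.from_bytes(little_endian[4:8], "little"))
```

The element size matters: reversing the same buffer as 2-byte elements gives a different result than reversing it as 4-byte elements.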

A

Yeah, all right. So then, no issues then. All right, then I will call it a day. Thanks, guys online, and then I'll see you later.

K

A

K