Rust Programming Language Language Design Team, 20 Apr 2022

From YouTube: 2022-04-20 Design Meeting: Felienne Hermans, Psychology of Programming

Description

Felienne Hermans talks about the "Psychology of Programming" and the various ways we can approach answering the question "Should we make this proposed change to Rust?"

A

This meeting is being recorded. All right, Felienne.

B

Great, are you all seeing my slides?

A

Yeah, let me just introduce you for the recording, I guess. So this is the lang team meeting. We've got Felienne Hermans, who is going to talk to us about the psychology of programming. Take it away.

B

Yes, hello, everyone. My name is Felienne Hermans. I'm an associate professor at Leiden University in the Netherlands, and what I'm presenting today is actually part of my course. My course is a grad-level course.

B

It's called Psychology of Programming. At first I thought I would make these slides something different for you, but then I thought: no, why not show the lecture as I usually do it in class? That can be fun. The goal of this lecture is to help you all think about how to make decisions on programming language design.

B

That's also a big part of what the course is about. So these are my actual lecture slides that I use, and for the students I say active participation is required. I think it would also be fun if there's active participation from you all. You don't have to use your voice or your video if you don't want to, but if you do want to share things, you can put them in the chat.

B

I also have the chat in view, so I can see what you're doing. To make participation a little bit more fun, I actually have the participation panda, which I also use in class. Whenever there's a panda on the screen, this means you have to do something or think about something, to keep it a little bit active. We will immediately start with the first panda exercise, so feel free to type the answers in the chat.

B

Given that I have two programming languages, A and B, what would be a way to study which version is better? You may, of course, also think of this as Rust version one and Rust version two, so it doesn't necessarily have to be two entirely different programming languages; it can also be versions of one programming language. Just type your answers in the chat.

B

I will read out some of the answers in the chat that are worth it. Someone is saying: spend 10 years getting good at both of them and then make your own comparison.

B

"Ask users" is what someone is saying. Let's see what else we get: look for code that is similar and compare it to see how simple it is; see which one helps people write programs faster, with fewer bugs; what are you using the language for; which language catches more issues. Very, very cool. Already quite a wide variety of things that you could do if you wanted to compare two languages and, as I said, that's really the goal of this talk:

B

to give you a wide vocabulary of things you can try if you want to compare different languages with each other. My guess, and this is mainly a guess that I use for the students, is that something people would say is doing experiments, trying out experiments. This is specifically true for our students.

B

This is a fun statue that's in front of our old university building. It is in Dutch, but the sentence here says "by measuring we get to knowing." So if we measure things, we will learn things. This is in front of our building, and that's interesting because, of course, it expresses a certain research philosophy: we can only know things by measuring things. That's very much the frame of thinking that I see with my students, but also with people that I interact with in the PL community.

B

Very often people want to get to numerical values. You want to measure something because we've sort of all grown up with this idea that science is about numbers and that it is possible to measure things, right? If you went to high school in most countries, you are probably familiar with the scientific method.

B

This is what we see as science, right? The natural sciences, where there's a stage that is observation: something happens in the real world. Then you form a theory, which may have the form of letters or words, but it can also be letters and formulas.

B

And then, if you have an observation and a theory, you must have a hypothesis, like "this apple will fall," and then, if you have a hypothesis, you can do an experiment. That's more or less what we see as the scientific method, and whether we like it or not, that is a frame we often think in when we want to gain more knowledge, even if that knowledge is about programming language design.

B

So the goal today is this: I'm assuming you are familiar with this way of doing research, and my idea is to give you additional ways that you could use to make decisions based on science that aren't necessarily measurements and experiments.

B

So, let's go into the scientific method and science a little bit more deeply. We typically divide science into different forms, or different ways of doing science. First, we have the natural sciences, which is studying nature, studying things that exist in the real world. But we can also study people: we have the social sciences, in which we study people and their behavior. And then there are also the humanities, which usually means studying the things that people produce: artifacts, as in archaeology, but also books and music. Studying what people create.

B

That's also a way of doing science. The scientific method very much fits with the natural sciences. That's a good fit because the world is as it is, whether we like it or not. There is gravity, right? We don't have much control over exactly what the world looks like and how it works, so it fits that we try to measure things to gain an understanding. But the social sciences don't really fit, and there you already have to make some changes.

B

We often do this in medicine, for example: we don't do actual experiments, because people might know that they're in an experiment and then start behaving differently. That's not true for magnetic fields or particles, but it is true for people. So then you have a controlled experiment. It still sort of fits the model, but a little bit less. But then, if you're studying the things that people make, what kind of hypothesis is there?

B

What kind of experiment can you do, right? If you're an archaeologist and you're gathering axes that are in the ground from a certain period, well, you look at one and you're like: oh, this looks like a thing people could use to chop a tree with. That's not really an experiment that you can do there. You just observe, and then only from observations

B

you will form a theory, and that's sort of the end of the science.

B

So all three of these things can play a role, and I would go as far as saying should play a role, in informing the design of programming languages. But very often the first forms are more commonly used than the latter forms.

B

So imagine we have two groups of people: some people really like JavaScript and other people really like C#, and we want to understand which language is better, for whatever "better" means, right? There are good points in the chat, like what are people even building. But imagine we want to somehow compare these languages to inform decisions about yet another language, like Rust.

B

So you can imagine there are numerical things that you can do to gain an understanding of these languages, but then an interesting question immediately becomes: okay, what are we measuring? So, next participation panda: name some things that we could measure. Imagine we want to compare JavaScript and C#, or other languages, it doesn't really matter. What could we actually measure? Let's see what people are already typing in the chat.

B

Yeah, that was really fast. Scott, well done. Length of programs, lines of code. Execution time can be measured. Efficiency. Errors: how many errors do people make? How pleasant is it to use the design? So you can ask people what they like. Nice. The joy of users, code golf, feedback. Many, many things already, many numerical things that we can find. The community size, existing libraries, size of the ecosystem: also really good suggestions of numerical values that we could compare.

B

So one of the things that people often think about, and that some of you also thought about in the chat, is performance benchmarking. I think this is one of the most common ways. If you look at POPL papers, for example, scientific papers about programming language design, very often there will be benchmarking that says: look, we implemented it in such and such a way, and now we've shaved off one minute of compilation time or one minute of execution time under certain conditions.

B

Interestingly enough, this is an accepted measure of okayness, of goodness, right? Look: this is one second, this is three seconds, so this one is quicker, and it's sort of agreed upon that quicker is better. But there are also other measurements. Someone also mentioned this in the chat: lines of code, shorter. And with shorter, it's already doubtful whether it's better, right? Yes, in general we like short code, but it's not always so clear-cut

B

that short is better than long. For example, if we look at Game of Life, probably familiar to many of you.

B

With cells, as you see visualized here. This is an implementation of Game of Life that's done in C. If you scroll, this is like 200 lines of C code, and if you execute this C++, it's not even that exciting, right? You just get strings in the command line. It's not even very nice. You can compare this with APL. This is also Game of Life, in APL, and this final line is the entire program. And then they even get this really nice visualization.

B

But now I have 200 lines of C++, or I have one line of APL. Is shorter really better? I'm not sure that I would prefer to maintain this one line of APL rather than the 200 lines of C++. So short and long are sort of in the eye of the beholder, and it very much depends on your preference. I'm going to say I would rather maintain the 200 lines of C++. Yeah, yeah.

B

I guess me too. So with shorter, taste already starts to play a role. What do we actually like? What do we prefer? Are we typing it? Are we maintaining it? Those are different questions.

B

What we could also do, and what people also said in the chat, is ask people. For sure: we just get random people, programmers from the internet, and say, hey, what do you prefer? And then people might say, well, I prefer JavaScript, or I prefer C#, and some people say I don't prefer either of them, I prefer something else. And then we can count what the preference is.

B

However, as soon as people get involved, very often this is the moment where people start to talk about sample sizes and sample bias. Yes, but who have we asked? Was this survey not only posted on the JavaScript website? And that's already interesting,

B

I think, because if you're benchmarking, clearly there's selection bias too: which algorithms are you benchmarking, which machines, which CPUs and GPUs are you using? But it is my experience that whenever people are in the picture, this becomes a bigger thing than when it's still machines, even though clearly in any selection, including benchmarking, there is always selection bias. So, to dive a little bit deeper into these different types of research that you can do, and that we also do in computer science, in programming language design:

B

We can measure speed, we can measure lines of code, and we can ask for opinions, and all these things are what we call quantitative ways of doing research, because the ultimate result is a quantity, a number. So we measure how many people prefer C, C++, or C# over JavaScript. We're looking for a numerical value, and that's always quantitative research, and it sort of feels like science, because it has numbers. What we can also do, rather than asking "which one do you prefer?", is ask "how do you like this language?"

B

Someone also said this in the chat: we can measure the joy of people using the language. So we can say, hey, how do you feel about C#, and also, how do you feel about JavaScript? And then you have people give open text answers like: oh, well, I don't really like JavaScript. Oh, well, I'd rather eat stones than program JavaScript. I hate it, it is horrible. And from such open text you could also draw conclusions, for example: well, more people like C#. So here's another panda question.

B

If I ask people "how do you feel about JavaScript?" and I look at what the answer is compared to how they feel about C#, would you say this is quantitative? Are we ultimately looking for a number here, or are we doing something else? A simple yes or no in the chat would be okay for this.

B

Depends on the question.

B

Interesting answers in the chat. Some people are saying it sort of depends on the question. I would say it depends much more on the way of analysis than on the question. Even with this question, I can analyze the answers in a numerical way, because I can give a rating, right? I can say: well, "okay" is neutral, "hate it" is negative. I can do sentiment analysis on the answers, for example.

B

I can get this data into a numerical form, so it is somewhat quantitative if I want to get this data in a numerical form, but I can also do something else with it like learn from it observe from the data. So I would say it is definitely possible to get this data into a numerical form and therefore it can be quantitative.
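As a minimal sketch of getting open-text answers into numerical form, assume a researcher has already hand-labeled each answer as negative, neutral, or positive. The answers, the labels, and the -1/0/1 scoring below are hypothetical illustration data, not results from the talk.

```python
from statistics import mean

# Hypothetical hand-coded survey answers: (language, label assigned by a researcher).
coded_answers = [
    ("JavaScript", "negative"),  # "I'd rather eat stones"
    ("JavaScript", "negative"),  # "I hate it"
    ("JavaScript", "neutral"),   # "it's okay"
    ("C#", "positive"),
    ("C#", "neutral"),
]

# Assumed rating scale; any ordinal scale would work the same way.
SCORE = {"negative": -1, "neutral": 0, "positive": 1}


def mean_score(answers, language):
    """Average rating of all coded answers about one language."""
    scores = [SCORE[label] for lang, label in answers if lang == language]
    return mean(scores)


for language in ("JavaScript", "C#"):
    print(language, mean_score(coded_answers, language))
```

Once the labels are numbers, the analysis is quantitative, even though the question itself was open-ended.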

B

I also very much like Josh's answer. He says we could also write code and look at people interacting with it. This is very much also a way we do research, and we will get to that type of research a little bit later as well.

B

So here I would say we could count these numbers, so it's still quantitative. But we could also get more and more away from quantitative research, because I could also ask people: why do you like JavaScript? What are the things about JavaScript that you very much like? And then I get really different answers, right? I get all sorts of answers. Maybe people like JavaScript because their friends also like JavaScript, so it's a community thing, or maybe people like it because it doesn't have a compiler.

B

So it's easier to use. This data is harder to get into a numerical form. There's not an easy way I can make this into numbers, even if I wanted to.

B

That type of research we typically call qualitative research. It's not really about quality; it's more about understanding the aspects of something, understanding the qualities that something has. So something like asking opinions can be quantitative, but can also be qualitative. And there are other things, which is a bit, I think, what Josh was referring to: we could also study what people's experiences are. We could do over-the-shoulder studies, where you really look at someone's screen.

B

You can have think-aloud studies, where people tell you what they like or dislike about something. And a kind of study that's also quite interesting, and that has not been used so much in PL design, is the diary study, in which you, for example, send people a push message every day at 2 p.m. and ask them: hey programmer, how's your day been so far? And you just let them write down little snippets, like one problem that you encountered today,

B

one good experience that you had with your programming language today. If you do that over a longer period of time, you can definitely gain insight into the type of things that people like or the type of things that people struggle with, in a way that's really different,

B

that might show small things that might not come up in an interview. Or you could also ask opinions not in a survey but in an interview, which is again different, because in a survey people write stuff down, while in an interview you can really go back and forth, like: oh, tell me more about this, or do a little bit more of that. So these are types of research that are typically seen as qualitative, because we're not aiming to get to a number; we're aiming to get to an understanding.

B

Because one of the things I see people who do PL design misunderstand is that they're like: oh, but if we have version A and version B of the language and we measure which one is better, then we know what is better. But very often we're not only interested in which one is better, but also in why it is better. Imagine I do a hypothetical experiment, and C# programmers are, I don't know, half a percent happier than Java programmers. Suppose I can measure this to perfection.

B

Why, right? How does this help Rust design? What part of C# can I take that I need to put in Rust? You can't get that type of understanding from numerical data only. For example, in one of the papers I did myself, I looked at block-based visual programming interfaces compared with textual interfaces, and one of the things we found is that if people work in a block-based language like Scratch, then what they like is that they have this canvas in which they can place stuff, right?

B

This is a thing that people appreciate from such an interface that doesn't make it good or bad. It just is a quality that such an interface has over another one.

B

You might wonder: okay, so we will get some open text data, right? We can ask people what they like, but now what? What do you do with this type of data? Now I have this data; what do I do?

B

Typically, what people do with this type of data is categorize it, which is called thematic analysis. For example, if you look at this tiny example, you can see that some people talk about technical aspects of JavaScript and other people talk about social aspects of programming language design, which are also very important. There might be reasons that you're interested in programming Haskell that have nothing to do with Haskell itself, but that maybe have to do with the type of status that Haskell might give you, which JavaScript might not, right?

B

So that's a way you can classify this data, and you can dive deeper and make subcategories as well. For example, you can say that there are technical aspects that really concern the language, like: oh, you can do functional programming. But there are also technical aspects that really concern the environment, like: where do I program, and how do I program JavaScript?

B

This is what people do if they're experts in this type of research: you look at data and you say, oh, I see a theme arising here, there's a bit of this and a bit of that. Sometimes you do multiple rounds of interviews, where you find three or four themes and then you ask people to dive a little bit deeper into them. And this is not just random hand-waving, right? This sounds a bit non-scientific, but it's just a different way of doing science.

B

This is why I want you to be aware of these different types of streams. There are very clear methodologies to group these types of answers. It's not just that we randomly browse the answers and say, oh, it looks a little bit like this. You can do this in a more systematic way, where you have different researchers classify independently and then compare notes and refine your methods.
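One common way to make "compare notes" concrete is an agreement statistic such as Cohen's kappa, which corrects raw agreement for agreement expected by chance. The talk does not name a specific measure, so the choice of kappa, the two hypothetical coders, and their labels below are assumptions for illustration only.

```python
from collections import Counter


def cohens_kappa(coder_a, coder_b):
    """Cohen's kappa for two coders who labeled the same items."""
    if len(coder_a) != len(coder_b) or not coder_a:
        raise ValueError("both coders must label the same, non-empty set of items")
    n = len(coder_a)
    # Fraction of items on which the two coders chose the same theme.
    observed = sum(a == b for a, b in zip(coder_a, coder_b)) / n
    # Agreement expected if each coder labeled at random with their own frequencies.
    freq_a, freq_b = Counter(coder_a), Counter(coder_b)
    expected = sum(freq_a[theme] * freq_b[theme] for theme in freq_a) / (n * n)
    if expected == 1:
        return 1.0  # both coders used one identical label for everything
    return (observed - expected) / (1 - expected)


# Hypothetical themes assigned independently to four survey answers.
coder_a = ["technical", "technical", "social", "social"]
coder_b = ["technical", "technical", "social", "technical"]

print(cohens_kappa(coder_a, coder_b))
```

A low value would be the cue to go back, discuss the disagreements, and refine the category definitions before coding the rest of the data.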

B

I also want to talk a little bit more about the different types of things we can look at. So far, we've mainly looked at people: asking people, interviewing, observing. So we're really doing social sciences research, in a sense: we're looking at people and asking what their experiences are. But there's also this humanities research, in which you look at artifacts.

B

That is also a way of doing research that has not been adopted that much by programming language design, while it has been adopted quite a bit in software engineering research, which is interesting. The idea of this type of humanities research is that, rather than manually reading one or two programs, which you could do, and I think some of you also said in the chat that you could look at what people write, you do this with many programs in an automated way. In linguistics or in history, this is called corpus analysis,

B

where you look at a corpus of data and look at how people talk about certain things. How are women or men, for example, discussed differently in literature? You could do this by manually reading books, but you could also do it with a little bit of code. And you can imagine that something like GitHub is actually very good at helping us make decisions on PL design based on what people have already created, which might be a little bit less biased, because people might say,

B

oh, I never use this feature, but they do, right? Maybe they don't know, they don't remember, they forgot about it, they don't want to share it. So here are just two examples of papers that do corpus analysis to look at programming patterns, potentially also to inform decision making. Here's a paper that I very much like that looks at code smells, right? Fowler's code smells, like a long method or a big class. We sort of all know that code smells are bad, but are they really bad?

B

So these researchers have looked at the Eclipse codebase, and they see that in almost all releases of Eclipse, anti-patterns, which is the word they use for code smells, correlate with code being more change-prone, more issue-prone, and more exception-prone than other code. So they sort of prove: look, code smells are really bad, because they actually correlate with errors in code. That might inform programming language design, because of course some programming languages might not allow certain code smells to exist. And I have another example here.

B

This is actually one of the papers where I am also a co-author. For this paper we looked at Scratch, which is a visual language mainly used by children and for education, and we looked at what names children use for variables and procedures. And just to show you an interesting piece of PL design: Scratch allows you to use spaces in variable names. You see here that you create a variable with a little pop-up, and you see that the name of this variable has spaces in it: it's "number of jumps."

B

This is, of course, an interesting choice, because most textual languages will not allow you to put spaces in variable names, for very obvious reasons. So we were curious: is this a feature that children like? Is this something that is useful to them? We could ask them, hey, do you like to put spaces there? But children might not even be aware that this is something special. If this is the only programming language they know, they'll be like: yeah, sure, of course, why not?

B

So we thought it would be super cool to actually look at what children do, and we saw that these spaces are very common: one in three projects actually uses a space in a variable name. And then some other interesting things: you can also use strings in certain places, and sometimes children would use round brackets, even though they're not necessary because there's no parsing going on. This is a decision that could potentially inform programming language design, because, you know, I don't even know what the right answer is.
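The kind of corpus count described here can be sketched in a few lines, assuming the variable names have already been extracted from each project. The project names and variable lists below are made up for illustration; they are not data from the Scratch study.

```python
def share_with_spaces(projects):
    """Fraction of projects that have at least one variable name containing a space."""
    if not projects:
        return 0.0
    hits = sum(
        any(" " in name for name in names)
        for names in projects.values()
    )
    return hits / len(projects)


# Hypothetical corpus: project name -> variable names found in that project.
projects = {
    "maze": ["score", "number of jumps"],
    "pong": ["lives"],
    "quiz": ["answer"],
}

print(share_with_spaces(projects))
```

Counting per project rather than per variable is a design choice: it answers "how many programmers use the feature at all" instead of "how often is it used."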

B

I'm not the creator of Scratch. Do we want spaces here? It's very convenient, but at some point kids go to Python or they go to Rust, and the spaces are not allowed anymore. So maybe it is confusing. But here you see an example of looking at artifacts that kids, or programmers, have created to get a sense of: hey, is this feature used or not, and in what way is it used? So this is a brief,

B

well, a very brief overview of the types of research that you could do. I hope that going forward, if you want to make more evidence-based, or evidence-informed, decisions about Rust design, you think: hey, what am I doing? Do I need quantitative data?

B

Do I want to understand how many people think this is confusing? Or do I want qualitative data: do I want to understand why this is confusing, or in what three or four competing ways people could interpret this piece of source code? So think about qualitative and quantitative. And I hope you also start to think about the difference in what you are studying, what you're curious about. Am I curious about people's behavior, which I can maybe measure with their code, or other things on GitHub as well?

B

There we're not just looking at code but also, as the previous paper did, at: hey, what type of code creates an issue? What type of code is very likely to be changed in the future? GitHub is a bit richer than only the code. So sometimes you're curious about people's behavior, which you can also see from GitHub, but sometimes you're really curious about people's opinions, like: what do you think about this? Looking at this piece of syntax, does it give you joy? Does it make you happy?

B

Those are just a few things about research methodology that might be useful going forward if you want to measure things. This is the end of my talk. I'm seeing some questions already going in the chat, so feel free to drop your questions in there. We can totally spend the rest of the session doing some Q&A and maybe also some brainstorming: are there open questions that you might want to think about, and how can you experiment with them?

B

Yes, I'll start with the question.

A

I was gonna say, why don't we take a few questions that are recorded, and then we'll cut off the recording and chat openly.

B

Yes, let's do that. So do you want me to look at the questions in the chat first?

B

Yes. "For corpus analysis, are there existing tools that are commonly used?" Yes, there are a lot of tools but, as I said, they mainly come from software engineering. There are many tools that, for example, can extract your variable names or do smell detection for structural smells, like a long method. There are also tools that can detect linguistic smells. For example, say you have a method that's called isValid, but it doesn't return a boolean, it returns an integer. You probably want something that starts with "is" to return a boolean.

B

These are called linguistic smells. Those types of tools are very common, but most of the tools that I know are meant for Java, because that's just sort of the dominant thing that people have looked at. So there's some prior art, but I guess if we want to do this for Rust, we will also have to do a little bit of figuring stuff out for ourselves, because, you know, Java is not Rust.
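To make the "is" example concrete, here is a minimal sketch of a linguistic smell check. It is written for Python source using the standard `ast` module rather than for Java or Rust, and it only looks at declared return annotations; the `is_` naming rule, the function name `find_is_smells`, and the sample source are assumptions for illustration, not an existing tool.

```python
import ast

# Hypothetical source to analyze: one smelly function, one clean function.
SAMPLE = '''
def is_valid(x) -> int:
    return 1

def is_empty(xs) -> bool:
    return not xs
'''


def find_is_smells(source):
    """Names of functions called is_* whose declared return type is not bool."""
    smells = []
    for node in ast.walk(ast.parse(source)):
        if not isinstance(node, (ast.FunctionDef, ast.AsyncFunctionDef)):
            continue
        if not node.name.startswith("is_"):
            continue
        annotation = node.returns
        # Functions without a return annotation are skipped: nothing to compare against.
        if annotation is None:
            continue
        if not (isinstance(annotation, ast.Name) and annotation.id == "bool"):
            smells.append(node.name)
    return smells


print(find_is_smells(SAMPLE))
```

A Rust version would need a Rust parser and would check signatures such as `fn is_valid(...) -> i32` in the same spirit.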

A

To ask a follow-up question: in Java, how much work has there been on developing general-purpose query languages? I'm wondering if there's a design we can adapt to Rust.

B

Yeah, there are generic things where you can read a Java codebase and say: how many classes does this have, how many methods does this have? There are quite some tools that either print this out as a text file or give you a nice visualization of, oh, these are sort of the code metrics.

B

So if that's the type of thing you're looking for, something's probably there. But those queries, as far as I know, are not the things you might want to ask, like how many borrows or how many moves there are. That's all Rust-y.

B

"Same question, but for diary studies." Yeah, that's cool, because I only know of a few diary studies. One, my favorite diary study, is very old. It's actually from Donald Knuth, the guy that made TeX. He wrote a paper in which he logged all his own TeX errors for like 10 years, and that was the diary study, which was actually very cool because it really gave a certain perspective. So that's an example

B

I know, but I don't know so many examples. Going forward to Felix's question, saying: hey, git logs could be used as a proxy for diary studies. I like that idea. I don't know if that exists. I know there's lots and lots of work in software engineering.

B

That's always very quantitative in nature, so it really tries to predict. I've done a bit of that work myself as well, like: if an image is in an issue report, is the issue closed quicker? That type of stuff. And if a certain line of code has been changed many times, it is more likely that it will also be changed in the future.

B

So, for example, you can then predict which lines in a codebase are likely to be changed in the future. But this is all very software-engineering based, on error-proneness, and never really,

B

as far as I know, or at least not so often, used as feedback into the programming language. It's usually feedback into the codebase, like: oh, this part of your codebase is smelly and you should make changes there. I don't know of that much work that really ties it back into programming languages.
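The "changed often before, likely to change again" idea can be sketched as a simple frequency ranking. This assumes the change history has already been extracted as one list of touched paths per commit, and it works at file level rather than line level to stay short; the commit data and file names are hypothetical.

```python
from collections import Counter


def rank_by_change_count(commits):
    """Paths ordered from most to least frequently changed.

    `commits` is a list of commits, each given as the list of paths it touched.
    """
    # set() so that a path listed twice in one commit is counted once.
    counts = Counter(path for changed in commits for path in set(changed))
    return counts.most_common()


# Hypothetical history of three commits.
commits = [
    ["parser.rs", "lexer.rs"],
    ["parser.rs"],
    ["parser.rs", "main.rs"],
]

for path, count in rank_by_change_count(commits):
    print(path, count)
```

To feed this back into language design rather than into one codebase, the same ranking could be grouped by which language feature the changed code uses, which is the step the talk notes is rarely taken.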

B

Sadly, the type of work that I do, or that I want to do, is not so commonly done. If you read most POPL papers, it's always like: hey, here's an algorithm and here's the benchmark. That's the research method that's really dominant in PL design. Somehow there isn't that much work that's really trying to do this human-factors side of PL design in the mainstream. This is why I'm here, where I'm trying to change that. Josh has a nice question.

B

It's about the last point that you were making: the trade-off between using one technology, Scratch, and providing good transfer to other languages, right? If you allow something in a certain language, should you also allow it in a different language? It might make sense to not provide a feature if it will negatively affect people's experience with future language learning. Yes, this is really a great question that I could talk about for hours, so I'll try to be brief.

B

I think with every programming language that we're designing, we're always teaching two things: we're teaching the language and we're also teaching programming. So I do think, to a certain extent, we all have a responsibility to design our languages in such a way that you can reasonably expect trade-off, oh sorry, transfer to another language. One of the things about Rust that I am not so excited about, to give an example, is that we use a quotation mark in the triangle brackets, because typically this is a string.

B

So if I look at this, my brain so often goes: oh, it's a string and it's not closed. Why is the string not closed? And I have to override that all the time, because a quotation mark means string in most places. So when making such a decision, you can expect a little bit that it might interfere with other things.

B

The same is true for choosing similar words for very different things, or choosing very different words for similar things. So it's always interesting: with the decisions that we make, you should expect people to also go to different languages. And I think the Scratch paper is a really interesting example, where, as I said, I don't know what the right answer is. My gut feeling sort of says: why would you allow this? It will only make stuff worse later. But on the other hand, you know, it doesn't have a parser.

B

So why not take advantage of that? There are no easy answers here.

B

Okay, great question from, I think, Scott: what is the best way to deal with familiarity impact, for example in the await conversation? I've anecdotally seen people who disliked postfix initially but have since warmed up to it. How could we know whether the choice was worse and what we're seeing is survivorship bias, or whether it's better?

B

That is also a great question to which, sadly, there's not an easy answer. I guess this is again almost a question of language philosophy, right? I think Rust is a type of language where we all say: well, it's not easy at first, it is really hard and it takes you a while to get it. It's like, I don't know, a good wine or long-distance running.

B

You know, you have to sort of stick with it for a while, and it takes a while to become good or fun. I think different languages have different opinions on whether or not that's a bad thing, but it's already good that you are aware that, yes, there will be survivorship bias: the people that like a certain feature are generally not everyone.

B

The question is: is that a problem? You almost want to do something like exit interviews: hey, why did you give up Rust, and what was the straw that broke the camel's back? There is not an easy answer here, apart from, I think, being aware that this is the case. And, as I already talked about a little bit with Josh and Sam, what's also very important if we're really talking about introducing a new feature is that you have to figure out what you're worried about, right?

B

So sometimes you're worried that people will not understand this feature and then they give up, right? If you don't understand the borrow checker, then Rust is not for you. But sometimes you may be worried about negative transfer, right? Like: hey, we have await, but it's slightly different from JavaScript. We're not so much worried that people see this and they're like, what is this?

B

What we're worried about is that people take their prior knowledge from JavaScript, which is slightly different, think that they understand this, and then make all sorts of horrible errors that they cannot then fix. That's a really different worry, right? Is this scary, versus is this so similar to something else? So sometimes people come to me and they're like: hey, can you help design an experiment?
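Illustration (not part of the talk): the talk does not say which difference from JavaScript is meant. The sketch below assumes one well-known candidate: in JavaScript, calling an async function starts running its body immediately and `await` is a prefix operator, while in Rust `.await` is postfix and the body of an `async fn` does not run until the future is polled. The helpers `block_on`, `fetch_answer`, and `lazy_then_awaited` are hypothetical names, and `block_on` is a deliberately tiny busy-polling executor built only on the standard library, not something to use in real code.

```rust
use std::future::Future;
use std::pin::pin;
use std::sync::atomic::{AtomicBool, Ordering};
use std::sync::Arc;
use std::task::{Context, Poll, Wake, Waker};

// A waker that does nothing; enough for futures that never actually wait.
struct NoopWake;

impl Wake for NoopWake {
    fn wake(self: Arc<Self>) {}
}

// Minimal executor: poll the future in a loop until it is ready.
fn block_on<F: Future>(future: F) -> F::Output {
    let waker = Waker::from(Arc::new(NoopWake));
    let mut cx = Context::from_waker(&waker);
    let mut future = pin!(future);
    loop {
        if let Poll::Ready(value) = future.as_mut().poll(&mut cx) {
            return value;
        }
    }
}

// Records that its body has started running, then returns a value.
async fn fetch_answer(started: Arc<AtomicBool>) -> u32 {
    started.store(true, Ordering::SeqCst);
    42
}

// Returns (body ran before awaiting?, awaited value, body ran after awaiting?).
fn lazy_then_awaited() -> (bool, u32, bool) {
    let started = Arc::new(AtomicBool::new(false));

    // Calling the async fn only builds a future; the body has not run yet.
    let future = fetch_answer(Arc::clone(&started));
    let ran_before_await = started.load(Ordering::SeqCst);

    // Postfix `.await` inside an async block drives the future to completion.
    let value = block_on(async move { future.await });
    let ran_after_await = started.load(Ordering::SeqCst);

    (ran_before_await, value, ran_after_await)
}
```

Calling `lazy_then_awaited()` from a `main` function shows the body running only once the future is awaited. Someone who carries over the JavaScript model may assume the work has already started when the function is called, which is the kind of misleading similarity described above.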

B

The first question is really: what are we worried about here? Are we worried that this is confusing, or are we worried that this is misleading? Those are already different types of answers. And I am sometimes also very worried about confusing, but if people have no clue what this means, they will probably google it. I am maybe more worried about this being so similar to Rust in another situation, or to another programming language in a similar situation.

B

But those are already different types of worry, and once you've nailed down what you're worried about, you can design a study or experiment where you ask people: hey, here's a piece of code, what do you expect this to do? Then you can measure whether the thing that you're worried about really occurs very often, and maybe even figure out: do we have negative interaction more with Python programmers or with C programmers?

B

So it is very much a matter of what we think the problem might be, and gaining more understanding about that.
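Illustration (not part of the talk): a minimal sketch of how answers from such a "what do you expect this code to do?" study could be tallied per prior language. The `Response` type, its fields, and `misprediction_rate` are hypothetical; no real study data is implied.

```rust
use std::collections::BTreeMap;

// One participant's answer to "what do you expect this code to do?".
struct Response<'a> {
    // Self-reported prior language, for example "Python" or "C".
    background: &'a str,
    // Whether the participant predicted the actual behavior.
    predicted_correctly: bool,
}

// Fraction of wrong predictions for each background group.
fn misprediction_rate(responses: &[Response<'_>]) -> BTreeMap<String, f64> {
    // Per group: (number of wrong predictions, number of responses).
    let mut counts: BTreeMap<String, (u32, u32)> = BTreeMap::new();
    for response in responses {
        let entry = counts
            .entry(response.background.to_string())
            .or_insert((0, 0));
        if !response.predicted_correctly {
            entry.0 += 1;
        }
        entry.1 += 1;
    }
    counts
        .into_iter()
        .map(|(group, (wrong, total))| (group, f64::from(wrong) / f64::from(total)))
        .collect()
}
```

A clearly higher rate in one group would suggest negative transfer from that language rather than general confusion, although a real study would also need enough participants and a proper statistical test.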

A

Okay, sorry, I think at this point I'm going to end the recording. Thank you very much, Felienne. That was super great. We'll put this up, and then we'll talk a little bit after. Bye-bye to those of you watching the recording.