June 6, 2021

Naming variables in code

When I was doing pure research, I wrote computer code only to run mathematical experiments. And, importantly, any piece of code that I wrote was used exclusively by me. This has changed since I started my new job and I now find myself writing more and more scripts that other people will need to use in the future. In particular, these scripts all have to go through an internal code review process.

Having colleagues comment on my code has been a very useful learning experience. And besides helping me improve my coding skills, it made me notice some differences in the way I used to write code compared to people with more programming experience. Among these differences, the most striking one is related to variable naming: I received countless comments asking that I use more descriptive variable names.

And it’s true, instinctively I would always use very short or even single-letter variable names. After having been corrected for this many times, I still find myself choosing short variable names if I don’t pay attention. So why do I do this? One of the comments I got acknowledged that “mathematicians tend to use short variable names.” If this is true,¹ then we should ask more generally what leads mathematicians to use short variable names.

A first answer might be simple laziness. It is often said that mathematicians are lazy, and this is definitely true in my case. When writing code for a quick experiment that no one else will run, why use ten or more characters for a variable when a single character is sufficient? Yes, auto-complete tools can often save the typing effort, but these are not always available and, importantly, they do not help with the cognitive burden of coming up with a long descriptive name!

But besides laziness, I think that there is another possible explanation for why I instinctively use non-descriptive variable names. The reason is that the variable description is almost always irrelevant for the way the code actually functions. It is only the value of the variable that matters, while the name could very well be completely arbitrary. In particular, giving a descriptive name could be seen as making an unnecessary assumption on the “meaning” of the variable.

To make things a bit more concrete, let’s assume that we need to define a variable for indexing elements of an array. When naming such a variable, it is often said that one should avoid using names like i or j (something that I would always do!). Rather, one should try to make explicit reference to what the array actually contains, choosing a name like user_num. However, it’s clear that the nature of the elements of the array has absolutely nothing to do with the indexing variable, which is simply an integer. So even though calling the index user_num can help make the code easier to read, it implicitly ties the index variable to the nature of the content of the array, even though this content is irrelevant from a computational perspective. In practice, we should only do this if we are certain that the code will never be used in a situation where the elements of the array do not represent “users.”
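As a minimal sketch of this tradeoff, here is the same loop written twice in Python. The function names, the sample data, and the "active" field are all hypothetical, made up only for illustration. The first version uses i and x and says nothing about what the array contains; the second uses user_num and reads more naturally, but only makes sense when the elements really are users.

```python
from typing import Callable, Sequence, TypeVar

T = TypeVar("T")


def first_index_where(xs: Sequence[T], pred: Callable[[T], bool]) -> int:
    """Generic version: nothing here says what the elements of xs are."""
    for i, x in enumerate(xs):
        if pred(x):
            return i
    return -1  # no element satisfies the predicate


def first_inactive_user_num(users: Sequence[dict]) -> int:
    """Descriptive version: same loop, but the names assume an array of users."""
    for user_num, user in enumerate(users):
        if not user["active"]:  # "active" is a hypothetical field
            return user_num
    return -1  # every user is active


# Hypothetical sample data
users = [{"name": "ada", "active": True}, {"name": "bob", "active": False}]

print(first_inactive_user_num(users))                       # 1
print(first_index_where(users, lambda u: not u["active"]))  # 1
print(first_index_where([3, 8, 5], lambda n: n % 2 == 0))   # 1
```

Both functions do the same computation on the list of users, but only the generic one can be reused unchanged on a list of integers.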

In summary, the choice in naming a variable seems to reflect a tradeoff between readability and generality of the code. For a programmer, readability is usually more important, and using names that refer to external information is a good thing, since they can guide our intuition when looking at the code. For a mathematician, who is used to discarding any assumption that is not required in a given context, these external references may seem like an unnecessary loss of generality.

As a side note, I think it’s interesting that mathematicians tend to give non-descriptive names even to concepts within mathematics. Different properties in multiple areas of mathematics have vague names such as “regular” and “proper.” This contrasts with CS and engineering where it is common to have long technical names (generally shortened using acronyms) and where even very similar concepts often sound completely unrelated. A famous quote by the mathematician Henri Poincaré says that “Mathematics is the art of giving the same name to different things.” In CS and engineering, it seems more common to give different names to the same thing.

Getting back to variables in code, I think that working with more complex code has made me realize how important readability is. But, to be completely honest, there is still a side of me that does not like using names like user_id, row_index, file_count, and probably never will. The conciseness and generality of variables like i, j, n just seem more aesthetically pleasing.

¹ See a related discussion on Mathematics Stack Exchange, which focuses more on the use of variable names in mathematics than in computer code. Some answers are quite similar to what I’m writing about here.
