27th January 2006
Amended 5th August 2006
In my article Breaking Backwards Compatibility is EVIL I voiced my disapproval of the idea of introducing case sensitivity for function names into PHP when version 6 is released. This would mean that a function name with the same spelling but different case would be treated as a completely different function. PHP already has case-sensitive variable names (yuck!), and some of the core developers want to extend it for no other reason than to be "consistent" with other languages. This is such a controversial topic that I decided to split it off as a separate article.
As far as I am concerned the introduction of case-sensitivity into software (operating systems, compilers and tools) was the worst ever decision in the entire history of computing. Personally I blame the authors of Unix as they not only created an operating system which was illogical, unintuitive and user-unfriendly, but they also were either too stupid or too lazy to create a case-insensitive file system. All the existing computer systems were case-insensitive, so what was the justification for the change? To make matters worse this caused every piece of software written to run on a Unix system to also be case-sensitive, and so this stupid mistake has spread like a plague.
In the past 30 years I have worked on a variety of mainframe, mini-computer and micro-computer (PC) systems, and none of these has been case-sensitive in any way - this includes the operating systems, compilers, text and document editors, and database query tools. The authors of this software saw no need for case-sensitivity, and none of the users ever requested it, so where did this stupid idea come from?
The Windows operating system, the most widely used OS in the world today, is not sensitive to case, and neither is any tool or application which runs on it. Does this cause any problems? I think not. Would it cause any problems if this software were to decide that case was important? You betcha!
Can you name me one single problem where introducing case-sensitivity was the solution? On the other hand I have lost count of the situations where having case-sensitivity was actually the cause of the problem, so the idea of implementing something which causes problems instead of solving them strikes me as being incredibly stupid. If you think that case-sensitivity is important, can you answer the following questions:
READFILE(), where each combination of case means something different?
BOX, where each combination of case means something different?
NOTE: Some people who read this article are jumping to the wrong conclusion. I am not advocating FOR the right to deliberately use one combination of case for a variable or function only to use a different combination of case elsewhere in the same program. Anybody who deliberately does such a thing deserves a good talking to. What I am advocating AGAINST is the situation where a variable or function, if defined or referenced in a different case, actually becomes a totally different object. If I encounter
READFILE() it causes far more problems if they are totally different functions than it does by being the same function but with different case.
Using a different case may offend the sensibilities of some delicate souls out there (oh, the poor little darlings), but the consequences of having a series of different functions or variables which have the same spelling but different case are far more serious as they can create genuine problems. In my many decades of experience only a complete moron cannot handle a simple change of case, and only a complete moron thinks that having
READFILE() as a series of different functions is a Good Thing.
In order to aid those of you who are intellectually impaired let me give you some practical examples.
If you were to write a series of functions which performed some sort of file access, which naming convention would you choose?
Imagine you are writing code to transfer funds from one account to another, which means that you need to hold two different account numbers. Which naming convention would you choose?
Those who think that option (1) is best and option (2) should be avoided are actually agreeing with my argument. Those who think that option (2) is a Good Thing are beyond redemption and should be subject to involuntary euthanasia at the earliest possible opportunity. I am not the only person who thinks that using duplicate names that differ subtly only in case is not a good idea - check out item 21 on How to write unmaintainable code.
Those of you who say that the ability to write the same name in different case is wrong are confusing "mildly annoying" with "catastrophic". They completely fail to realise that the potential for genuine mistakes is far greater in software which is case sensitive. Most reasonable people would not even notice a slight change in case as they concentrate on the spelling of a word, not the case in which it is written. Those who complain about case sensitivity are being overly sensitive (pun intended). In fact I would go so far as to call them nit-picking, anal retentive, OCPD sufferers who are in serious need of a reality check.
If constructs such as GOTO are removed from modern languages due to their propensity to produce spaghetti code, then why include a feature that can help produce an even worse mess? Some languages which have case-sensitivity (such as Visual Basic) avoid any such problems by automatically changing the case of any variable or function names as they are keyed in to whatever has been previously declared, thus making the fact that they are case-sensitive totally invisible. While this can work with languages which are statically typed and compiled (such as VB), it is more difficult to implement in those which are dynamically typed and interpreted (such as PHP).
Some of the arguments I have heard in favour of case-sensitivity are pretty weak:
avariablelikethisisveryhardtounderstand is less readable than
itsMuchEasierToSayThingsWithSomeCapitals is no excuse. Personally I prefer the use of underscores to those awful camelcaps, as in
thisisaspecialfunction() he may even write
THISISASPECIALFUNCTION(), or, god forbid,
While I agree that the mixing of case in such a manner would be a stupid thing to do, I should also point out that although it is inconsistent it most certainly does NOT cause any problems in a case insensitive language because the language disregards case and treats all those variations as the same function. This inconsistency, while being a feeble point to argue, does not confuse the language therefore no actual harm is done.
foo the variable and
foo the function. To circumvent this "difficulty" those programmers have created conventions whereby the use of upper and lower case can help identify what each token actually is. These are individual preferences, not language rules, yet these same programmers, when they move to another language, want to bring their old conventions with them. What is worse they want to change what were originally a set of personal choices into a set of requirements that everyone else must follow. What is their justification? "To be consistent", or "To have the same standards in all languages".
move 'text' to foobar
call foobar using ... giving ...
It is quite obvious from the context when
foobar is a variable and when it is a function. I do not need to use different combinations of case to tell the difference, nor does the compiler, so the use of different case does not provide any benefits whatsoever.
Other languages, like PHP, use different symbols to identify different tokens, as in:
foo - a constant
$foo - a variable
foo() - a function
$object->foo() - a class method
$object->foo - a class property
There are five occurrences of the token
foo which are all different, yet neither the programmer nor the compiler need special combinations of upper and lower case to tell the difference. As far as I am concerned if the language itself does not need differences in case to tell the difference between one type of token and another, then neither should any programmer. If anyone feels compelled to use variations in case in such a manner then all they are doing is highlighting a deficiency in themselves.
BOX. It doesn't matter because they each mean the same thing. But if you have several containers it is common sense to use names such as
box2, or perhaps
box_out. It is common practice to use different names for different objects, not the same name with different case. Only someone who is intellectually challenged would use
Box to mean different containers.
command -b different from
command -B does not necessarily make it a good idea. The benefits of allowing case-sensitivity in one area should have been weighed against the contra-benefits in other areas. The authors of this idea obviously did not consider the wide-ranging ramifications of their decision.
The fact that some languages and tools have case-sensitivity is no excuse for insisting that ALL languages and tools be changed to implement case-sensitivity "just to be consistent". Ideas are supposed to be implemented because they are good ideas, because they provide benefits or solve problems. Implementing a bad idea just to be consistent with other languages simply perpetuates a consistently bad idea.
Some of my critics like to argue that "plenty of modern languages are case sensitive, yet nobody complains about any problems they cause". But what they fail to notice is that most of these languages trap the situation where a variable or function is declared or referenced with the same spelling but a different case and are able to deal with it before it can cause any problems. This is what happens in Visual Basic for example:
So even though Visual Basic is case sensitive it does not allow the same name (i.e. with the same spelling) to exist more than once with different combinations of upper and lower case. Thus the functions
ReadFile() are exactly the same, and the variables
SomeData are exactly the same. The VB IDE automatically corrects any variation in case, so the use of different case does not cause any problems. How many other languages shield the unwary programmer from differences in case in the same way? THAT is why programmers never complain about case sensitivity causing problems, for the simple reason that any problems which COULD be caused are automatically detected and corrected by the IDE or notified at the time of compilation. It is just not possible to create a compiled VB program which contains the same spelling but with different case even though the language is (apparently) sensitive to case.
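The auto-correction behaviour described above can be sketched in a few lines. This is only an illustration in Python, not how the VB IDE is actually implemented: the class `CasePreservingNames` and its methods `declare` and `resolve` are hypothetical names invented for this example. Lookups ignore case, the spelling used at declaration is preserved, and a second declaration which differs only in case is rejected.

```python
class CasePreservingNames:
    """Hypothetical symbol table: lookups ignore case, the declared spelling is kept."""

    def __init__(self):
        # Maps the case-folded name to the spelling used at declaration time.
        self._declared = {}

    def declare(self, name):
        """Register a name, refusing any case variant of an existing one."""
        key = name.casefold()
        if key in self._declared:
            raise ValueError(
                f"{name!r} clashes with existing declaration {self._declared[key]!r}"
            )
        self._declared[key] = name
        return name

    def resolve(self, name):
        """Return the declared spelling for any case variant of a known name."""
        try:
            return self._declared[name.casefold()]
        except KeyError:
            raise NameError(f"{name!r} has not been declared") from None


names = CasePreservingNames()
names.declare("ReadFile")
names.declare("SomeData")

# Every variant is "corrected" back to the spelling that was declared first.
corrected = [names.resolve(n) for n in ("readfile", "READFILE", "somedata")]
```

With a table like this the programmer can key in any combination of case and still end up referring to the one and only declaration, which is the effect the paragraph above attributes to the IDE.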
Where a language does NOT provide this auto-correction facility, thus deliberately allowing the same name with different case to refer to different objects, this can lead to situations which are a maintenance nightmare.
ReadFile() to exist as different functions? Is it good practice to allow
SomeData to exist as different variables?
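To show what this looks like in a language without any such safety net, here is a small sketch in Python, which is case sensitive. The function bodies, the file name and the variable values are invented purely for illustration. The point is that nothing in the language objects to either spelling, so a slip of the shift key silently selects a different function or creates a different variable.

```python
def readfile(path):
    """Hypothetical helper: pretend to read a text file."""
    return f"text from {path}"


def ReadFile(path):
    """A different hypothetical helper whose name differs only in case."""
    return f"binary from {path}"


# Both calls are valid, so using the "wrong" case is not reported as an error.
intended = readfile("report.txt")
mistyped = ReadFile("report.txt")

# Variables behave the same way: the second line creates a new variable
# instead of updating the first one.
somedata = 1
SomeData = somedata + 1
```

After this runs `somedata` still holds its original value, and the only clue to the mistake is a single capital letter somewhere in the source.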
The reason that few programmers complain about the problems caused by case sensitive software is that their complaints are instantly rejected. "It's the standard" they are told, "so you must learn to live with it". Few programmers have the audacity to question such stupid practices, but I am not so timid. I have had my share of being forced to work with second-rate standards which were full of half-baked ideas, and I know what a joy it is to work with first-rate standards where every statement is properly explained and justified. Statements which cannot be explained or justified have no place in any standards, and I'm afraid that explanations such as "it's the standard", or "it's consistent" or "because I say so" just don't qualify.
How many languages or libraries come supplied with functions and variables which exist more than once but with different case? The answer is NONE! Why not? Because it is not considered to be good practice. It would cause immense amounts of confusion and maintenance headaches. So, if this language "feature" is avoided by all language authors and competent programmers because of its potential for misuse, then why do these languages allow this "feature" to exist in the first place? If the GOTO statement has been eliminated from many languages due to the problems which can be caused by its misuse, then why not remove case-sensitivity for exactly the same reason? After all, this would be "consistent" and "promote good practice".
Traditional naming conventions state that all functions and variables should be given names that are meaningful and descriptive - a function name should describe what it does, and a variable name should describe what it contains. This means that if you want a different function or a different variable then you create a different name with different spelling and therefore a different meaning, not the same name in a different case. Am I really the only person to see this?
Those who say that the correct use of naming conventions avoids any problems with case sensitivity are missing the point - it does not solve the problem, it merely hides it. It simply papers over the crack, but the crack is still there and waiting to catch the unwary. It does not prevent programmers from using what is supposed to be the "wrong" case, either accidentally or deliberately. If you have ever debugged a program where the problem was caused by the incorrect use of case you will know what a ridiculous problem this is. Doing this accidentally is excusable, but some perverse programmers do it deliberately just to cause confusion, to create obfuscated code which only they can maintain. In my opinion if the accidental or deliberate use of the wrong case can cause such problems then the ability to use the wrong case should be removed from the language. The re-introduction of case insensitive software, or at least case preserving software, even if it were limited to variable names and function names, would eliminate such annoying problems without any downside whatsoever.
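Where the language itself cannot be changed, a tool can at least detect the problem. The following is a minimal sketch, again in Python, of a check which reports names that differ only in case. The function name `case_collisions` and the sample identifiers are assumptions made for this example, and a real tool would first have to extract the identifiers from the source code.

```python
from collections import defaultdict


def case_collisions(names):
    """Group names by their case-folded form and return the groups that
    contain more than one distinct spelling."""
    groups = defaultdict(set)
    for name in names:
        groups[name.casefold()].add(name)
    return sorted(
        sorted(spellings) for spellings in groups.values() if len(spellings) > 1
    )


# Hypothetical identifiers collected from a program.
found = case_collisions(
    ["readFile", "ReadFile", "box", "BOX", "box_out", "account_from"]
)
```

Here `found` contains two groups, one for the `box` variants and one for the `ReadFile` variants, while properly distinct names such as `box_out` are left alone.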
As far as I am concerned if a computer language does not care which case a token is written in then neither should any programmer. If a programmer cannot look at code and understand how that code will be processed by the computer then he is, quite frankly, in the wrong profession. Every programmer is exposed to mixtures of upper and lower case in the outside world before he becomes a programmer, so anyone who cannot understand source code which is written in a mixture of upper and lower case is, quite frankly, in the wrong profession.
There are some people who try to justify the use of case sensitive software with spurious reasons:
Have you also given thought to the time when keyboards give way to voice-controlled input? How cumbersome will it be not just to say the word but to spell out the case of every single letter? Do you think your audience will congratulate you for your foresight and wisdom? I don't think so.
So remember, when you say that you are in favour of case sensitive software you are also saying that you are in favour of the following:
Am I really the only one who thinks that these are NOT good ideas? Apparently not. Please take a look at the following:
I think that the article somethinkodd.com sums up the argument quite nicely:
There is no longer any excuse for making humans learn and handle the quirks of the way computers store upper- and lower-case characters. Instead, software should handle the quirks of human language.
© Tony Marston
27th January 2006
5th Aug 2006 - Added a NOTE for those of questionable intelligence who fail to understand exactly what it is I am arguing about, and whether I am FOR it or AGAINST it.