Tony Marston's Blog About software development, PHP and OOP

Is PHP too verbose?

Posted on 14th October 2017 by Tony Marston

Amended on 1st September 2020

Introduction
Compact code must not compromise readability
The 80-20 Rule
Supporting Opinions
Contrary Opinions
Is compact code more readable?
Importing features from other languages
And finally ...
Amendment History
Comments

Introduction

Although I do not make amendments to the language (I have never programmed in C), I follow the PHP Internals newsgroup just to see what changes are being proposed. Occasionally I enter the debate on a particular Request For Comments (RFC) just to provide the opinion of a long-time user of the language, and someone who makes a living by selling applications written in PHP. I have a vested interest in the language, and I am not afraid to let my opinion be known should I think that a proposed change would have a detrimental effect on my usage of the language as well as the usage of thousands, if not hundreds of thousands or even millions, of other developers who do not have a voice.

Recently I have seen several comments from people who seem to think that PHP is too verbose, and this is causing a genuine problem for them. What is meant by "verbose"? Here is a definition from dictionary.com:

characterized by the use of many or too many words

In other words, using 10 words when only 5 will do. Do I think that PHP is verbose? Not by a long chalk. My first programming language in the 1970s was COBOL which, with its numerous divisions (identification, environment, data and procedure), required a lot of typing to achieve even the simplest of things. PHP is the exact opposite - it is concise yet still readable. As with many other principles in the world of software, this drive for compact and concise code can be taken too far. The point about principles is that they can provide benefits when applied in appropriate circumstances, but you have to know when to stop, otherwise instead of eliminating problems you end up creating them. This is known as the Law of Diminishing Returns. Unfortunately, knowing when to stop seems to be a skill which is sadly lacking in far too many of today's programmers.

Compact code must not compromise readability

IMHO when it comes to software development there is only one golden rule which is equally applicable in all programming languages:

Programs must be written for people to read, and only incidentally for machines to execute.

H. Abelson and G. Sussman, "Structure and Interpretation of Computer Programs", 1984

In his blog post Avoiding object oriented overkill Brandon Savage wrote:

Code spends 80% to 90% of its life in maintenance. And in my own experience, the people maintaining the code are not always the same people who wrote it. This means your code is going to have to communicate with more than just a computer processor. It's going to have to communicate intent to developers who have never seen the code before. This is much more challenging than writing code that "just works."

It is therefore absolutely vital that new readers of program code in the future should be able to grasp the structure and logic of that software as quickly as possible. The longer it takes to read and understand a piece of software, the longer, and therefore more expensive, it will be to maintain that software. Remember that maintenance can mean debugging as well as adding enhancements. This means that trying to write code with as few keystrokes as possible is a false economy. Not only does it take time for the author to translate a series of simple instructions into a compact and clever alternative, every single reader will then have to go through the reverse process and translate that compact code into its plain English equivalent. This is even worse when some proposals seek to replace meaningful words with symbols. Code which is comprised of nothing but symbols instead of words is as readable as hieroglyphics. If I show you a piece of code written in COBOL you will understand it far quicker than if I showed you the same code in FORTRAN.

As far as I am concerned the verbosity of a language is directly proportional to its readability. Scanning a statement containing simple words takes far less time than scanning a statement which is full of symbols. Looking up a word in the PHP manual is far easier than looking up a symbol. Using symbols instead of words to reduce the number of keystrokes does NOT make either the programmer or the program more efficient. The time taken to execute an instruction will not improve by shortening the name of that instruction. However, the time taken for the reader, who is usually someone who is not familiar with the code, to understand what a statement containing a bunch of symbols actually does is increased far beyond the original saving in keystrokes. This increases the cognitive load on the reader which causes the compact code to be less readable and therefore less maintainable. So if "verbose = readable" then "less verbose = less readable".

A programming style which aims to put as many commands as possible into a single line also produces unreadable code. A good example of this can be found in Writing highly readable code by Dylan Bridgman:

<?php
// data
$a = [
	['n'=>'John Smith', 'dob'=>'1988-02-03'],
	['n'=>'Jane Jones', 'dob'=>'2014-07-08']
];

// iterate the array
for($x=0;$x<sizeof($a);$x++){/*calculate difference*/$a[$x]['a']=(new DateTime())->diff(new DateTime($a[$x]['dob']))->format("%y");}

?>

When this is written properly with one statement per line it becomes much quicker to read and understand, as shown in the following:

<?php
// data
$users = [
	['name'=>'John Smith', 'dob'=>'1988-02-03'],
	['name'=>'Jane Jones', 'dob'=>'2014-07-08']
];

// Calculate and store the age in years of each user
foreach($users as &$user) {
    $today = new DateTime();
    $birthday = new DateTime($user['dob']);
    $age = $today->diff($birthday);
    $user['age'] = $age->format("%y");
}
unset($user);

?>

Anyone who says that they can read and understand the first code snippet just as fast as the second is a liar. If you look carefully you will see that the second code snippet is actually more efficient as it does not have to evaluate sizeof($a) within every iteration. It could be made even more efficient by moving the line $today = new DateTime(); to outside the loop as this only needs to be evaluated once.
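A minimal sketch of that last suggestion follows, with the current date created once and passed into the loop instead of being rebuilt on every iteration. The function name `addAges` and the fixed reference date are assumptions made purely so that the example produces a repeatable result; neither appears in the original snippet.

```php
<?php
/**
 * Calculate and store the age in years of each user.
 * $today is supplied by the caller, so it is created once rather than once per iteration.
 */
function addAges(array $users, DateTimeInterface $today): array
{
    foreach ($users as &$user) {
        $birthday = new DateTime($user['dob']);
        $age = $today->diff($birthday);
        $user['age'] = $age->format('%y');
    }
    unset($user); // break the reference left over from the loop

    return $users;
}

$users = [
    ['name' => 'John Smith', 'dob' => '1988-02-03'],
    ['name' => 'Jane Jones', 'dob' => '2014-07-08'],
];

// A fixed date is used here only for illustration; real code would pass new DateTime().
$users = addAges($users, new DateTime('2020-09-01'));
```

Each statement still sits on its own line, so the saving in execution time costs nothing in readability.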

The 80-20 Rule

As well as allowing programmers to write code which is less readable, which is a sin in itself, this continual addition of syntax to the language which will only be used by a minority of clever programmers will then increase the burden on the average programmer (those of us who do not have a PhD in Computer Science). Not only that, it will also have a detrimental effect on the language as a whole as it will take more effort to maintain that lesser-used clever stuff than it will to maintain and enhance the more commonly-used boring stuff. As anyone who has spent more than a decade in designing and building applications will tell you, 80% of the user transactions follow a simple and straightforward pattern, but 20% have to deal with the edge cases, the exceptions, the complicated variations. The 80-20 Rule (also known as the Pareto Principle) will tell you that only 20% of the code will deal with the 80% "normal" transactions while 80% of the code will be required to deal with the 20% of "abnormal" variations.

In a 1960s study IBM discovered that 80% of a computer's time was spent executing about 20% of the operating code. This was because they had been adding to the instruction sets which were being built into computer processors by making more and more complex instructions which needed to be executed in a single clock cycle. The study showed that these complex instructions, which accounted for 80% of the construction costs, were only being used 20% of the time. They then redesigned their processors so that instead of complex instructions which executed in a single clock cycle they used a series of simpler instructions which used one clock cycle each. By reducing the number of instructions that were built into each chip they not only made the clock cycles faster, they also reduced the amount of power that was consumed as well as the amount of heat that was generated. By replacing complexity with simplicity they reduced their manufacturing costs and actually made their processors run faster and cooler, and with less energy. The overall effect was to make their computers more efficient and faster than competitors' machines for the majority of applications.

This philosophy, the movement from Complex Instruction Set Computing (CISC) to Reduced Instruction Set Computing (RISC), was eventually taken up by other manufacturers. I was working with Hewlett-Packard mini-computers in the 1980s when they introduced their PA-RISC range, which included a new version of the operating system and language compilers to take advantage of the new chip architecture. I can personally vouch for the fact that they were visibly faster than the previous generation of processors.

This ratio between the complex (expensive) and the simple (cheap) can be applied to program code as well as computer processors. If the language already has a series of simple functions which do simple things then what is the advantage of making it do the same thing but with fewer keystrokes? Apart from satisfying the selfish whims of a small but vociferous minority it does absolutely nothing for the majority and it also adds to the maintenance burden for those who look after the language.

I am of the opinion that if something can already be done in 5 lines or less of userland code, then adding a new function to do the same thing should be automatically disallowed as it does nothing but add "noise" to the language instead of substance. Or, to misquote Frankie Howerd: "Clutter ye not!"

Supporting Opinions

If you think that I am the only one to hold this opinion then think again. Below are a few supporting statements which I found on the internet.

Here is a quotation taken from the Revised Report on the Algorithmic Language Scheme.

Programming languages should be designed not by piling feature on top of feature, but by removing the weaknesses and restrictions that make additional features appear necessary.

In this newsgroup post Zeev Suraski said the following:

... new syntax is hardly needed. Few, if anybody, are saying that PHP's syntax is preventing them from doing what they need to do. The argument is always that the new syntax can be useful here or helpful there - which even if we accept as true, would make the rating of these features as 'nice to have', and not 'important' let alone 'must'. Not having them is not a barrier to adoption, nor is it pushing anybody away from PHP. Plus, there's this whole theory that less is more, and in its more relevant form - more is less. So arguably, the added complexity may even hamper adoption.

It's of course a lot easier to implement a patch to the engine to add some new syntax, but that's not what the language needs. There's no need to add new stuff to PHP every year especially not at the language level, and we seem to be obsessed with that. If people focused their efforts on things that can truly move the needle, even if it took a lot longer, it would eventually pay off. Instead, we're not even investing in them - because we're in a 'vicious' yearly cycle of adding new syntax.

> Furthermore, type system enhancements can have enormous impact.
> Consider if generics landed in PHP 7.1. You had better believe that would bring us to the "next level".

I fail to see how adding C++ templates to PHP takes it to the next level in anything but the complexity scale. Not having them is not preventing anybody from doing anything today. Sure, a bunch of frameworks would adopt them once they become available - but it will not enable them to do things that are radically different from what they're doing today.

> Please do not reduce type system enhancements to mere syntax.

For the most part, it is - as most sensible use cases have alternative solutions - either by different methods or by writing tiny bits of userland code.

In this newsgroup post Stanislav Malyshev said the following:

> This is one of my favorites out of HackLang. It's pure syntactic
> sugar, but it goes a long way towards improving readability.
> https://wiki.php.net/rfc/pipe-operator

I think this takes the syntax too much towards Perl 6 direction. There is a lot of fun in inventing cool signs like |> and making them do nontrivial cool stuff. There's much less fun when you try to figure out what such code is actually doing - and especially why it is not doing what you thought it is doing. Brian Kernighan once said:

Everyone knows that debugging is twice as hard as writing a program in the first place. So if you're as clever as you can be when you write it, how will you ever debug it?

I think this applies not only to debugging, but also to reading programs in general. If we are "as clever as we can be" with syntax, the program becomes unreadable.

For this feature specifically, I see nothing wrong with assigning intermediate results to variables - moreover, I see an advantage to it both in reading (you tell people what you expect to have happened in this stage) and in debugging (you can actually *see* what happened and *check* whether what you thought should be happening actually happened).

If we could achieve the elegance of Unix pipes, I'd agree that it is simple enough that we can see what is going on - but having it rely on two magic things which are in different part of expression and context-dependent ($$ means different things in different places) is too much for me.

In this newsgroup post Stanislav Malyshev said the following:

> See, I take that quote in the exact opposite direction. I find chaining
> methods to be far, far less "clever" than deeply nesting them inside

If you tell me that syntax like "foo() |> bar($$)" is more natural or intuitive or readily understandable to anyone with any programming and PHP background than "$fooResult = foo(); bar($fooResult); " then we indeed have exactly opposite understanding of what is natural and intuitive.

I think that overly clever is inventing cryptic syntax to save a couple of keystrokes and rearrange code in unusual pattern that looks unlike the code used to look before and resembles some other language (this time it's F#? how many languages PHP should be at once - can we get a dozen?)

> each other. We shouldn't force pointless intermediate variables on

Why you think they are pointless? And you do produce them, you just hide them behind $$ so you can not actually neither check them nor understand which value goes where without tracing through the code with your finger.

> people for the sake of readability. The fair comparison is the deeply
> nested version vs. the chained version, and in that comparison the
> readability of the chained version is far, far better.

No, it's not a fair comparison. You shouldn't do deep nesting, and you shouldn't do cryptic syntax either. Variables are not evil. They are there to make things easier for you. Use them.

> intermediate variable to analyze and nothing has changed. However,
> passing an array to array_filter() then passing the result to
> array_map() is always type safe (because array_filter still returns an
> array even if it filters down to 0 items), and array_map on an empty
> array is essentially a no-op, so I'm comfortable doing so, and wish I
> could do so more often with less fugly syntax. :-)

I appreciate the careful choice of the example on RFC - indeed, in some very carefully chosen cases (mostly with arrays) you could get away with chaining without getting into trouble. In most cases where it would be used, though, that would mean not checking for errors, forgetting corner cases and so on. And that's how people would use this, because none of the existing functions were developed to work with this pattern, so you can only get away with it due to luck.

In strictly typed languages like Haskell, they use types like Option and Maybe to enable this pattern - essentially, to enable functions to operate on more than one simple type of value. However, it doesn't look in place in PHP, and it would require much more deep restructuring to actually make it work than just having |> as an operator.

> of promises", but more "is there something we can do here to be forward
> thinking, since lots of people want to see async in core PHP?")

I'm not sure what promises have to do with inventing strange syntax to pass result of one function as a parameter to another.
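Returning to the array_filter()/array_map() case discussed in that exchange, the difference between nesting and intermediate variables can be sketched with syntax that exists today. The data and variable names below are invented for illustration; only array_filter(), array_map() and array_values() are standard PHP functions.

```php
<?php
$prices = [5, 12, 8, 20];

// Nested version: has to be read from the inside out.
$nested = array_map(
    fn($price) => $price * 2,
    array_values(array_filter($prices, fn($price) => $price > 6))
);

// Intermediate variables: each stage has a name and can be inspected while debugging.
$overSix = array_values(array_filter($prices, fn($price) => $price > 6));
$doubled = array_map(fn($price) => $price * 2, $overSix);
```

Both versions produce the same array, but only the second gives the reader a named value to check at each stage.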

In this newsgroup post Stanislav Malyshev said the following:

> The old function is actively causing confusion - the reported type-names
> aren't even inconsistent with scalar type-hints, and then you have to go
> and read the manual with some explanation about historical reasons why
> the function doesn't work like you'd expect.

Why it should match scalar types? You can't use output of this function in a scalar type in any way.

> PHP is infamous for these surprise moments.

So let's add more of it by having multiple functions that do exactly the same thing but name null and float differently.

> I think that gettype() should be deprecated in favor of a new function
> that actually makes sense.

If you think people would want to edit gigabytes of existing code because you want NULL to be lowercase, you are very seriously mistaken about the order of priority of an average PHP developer. I am sure 99.9999% of people care about all this pedantry infinitely less than they care about their code keeping working and their development not be impeded by things like having to read the manual each time to choose which two of almost identical functions they need now and which of them has null in which case.

> I think that deprecating and fixing things is long-term less confusing
> than documenting your way around legacy functions that produce
> surprising and confusing results.

I think constantly disrupting the language environment by pedantic tweaks that add BC and cognitive load but do not actually enable anything new, just move things around - is not only confusing, but harmful for the whole ecosystem.

And if "NULL" really confuses you to the point you have no idea what it means - well, really, I don't know what to say.
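For readers who have not met the behaviour being argued about in that post, this short sketch shows the names that gettype() reports. The variable names are invented for illustration.

```php
<?php
// gettype() reports historical names which differ from those used in type declarations.
$nullType  = gettype(null);  // "NULL" in uppercase
$floatType = gettype(1.5);   // "double" rather than "float"
$intType   = gettype(42);    // "integer" rather than "int"
$boolType  = gettype(true);  // "boolean" rather than "bool"
```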

In this newsgroup post Stanislav Malyshev said the following:

> I agree the example code is readable, but it makes me feel the
> language is a little obsolete.

This is a mindset that I feel to be objectionable and take issue with. The idea that we have to constantly invent new syntax to replace working old one, just because old is somehow "obsolete", even though new syntax's only advantage is doing things slightly differently - it seems to me an exactly wrong idea. It may be exciting to invent new syntaxes - but for an industry programmer that has other priorities, like code stability, compatibility, maintainability, etc. "new" is not a positive things unless it gives them measurable or at least perceivable improvement on these qualities.

Existing syntax is not "obsolete" and works completely well. New syntax invents new magic variables (thing that never existed in PHP before, adding a whole new conceptual level to PHP mental model) and a new way of doing the same thing that is already perfectly doable right now with exactly the same effort. I personally strongly object to such changes.

There is a lot of ways in which PHP needs improvement, but right now I think inventing more syntax tricks in not one of them. Even in syntax department, PHP has areas where we could use improvement (e.g. to name named arguments as one) but this one doesn't seem to do much but doing the same thing in a shiny new way. Read: less comprehensible for people not watching "latest new 20 syntaxes PHP invented in the next version", more things to learn to read PHP code, more things to maintain, more complexity for the language that once was supposed to be accessible to beginners.

This is the price of all innovation, but sometimes benefits are much greater and the price is completely warranted. I do not feel this is the case here.

In this newsgroup post Max Semenik said the following:

> instead of defining constants like:
> const FOO = 'FOO';
>
> they could be defined like:
> autoconst FOO; // defines a constant FOO with the value 'FOO'

Sorry, but I'm not a fan of this proposal. Features should not be aiming at minor savings of keystrokes at the expense of readability and maintainability. Remember, we write code once but afterwards it might end up being read hundreds of times. This proposal makes code less readable, and unintuitively works differently from enums where cases without explicit values don't default to anything.

> Also, modifiers could be useful:
> autoconst uppercase foo; // defines a constant foo with value 'FOO'
> autoconst lowercase FOO; // defines a constant FOO with value 'foo';
>
> and maybe:
> autoconst camelcase FOO_BAR; // defines a constant FOO_BAR with value 'fooBar'
> autoconst snakecase fooBar; // defines a constant fooBar with value 'foo_bar'

This one saves even fewer keystrokes and harms maintainability even more: imagine you're debugging your program and you've dumped some value. You see "MyConstant". Now you search the code for the source of that value but find nothing because instead of MyConstant there's only MY_CONSTANT.

In this newsgroup post Michal Marcin Brzuchalski said the following:

Do we really need it?
IMO the performance of this is not gonna change anything much. We're talking about nanoseconds here possibly.

I'm looking forward to Hello World RFC - that's also often implemented in userland.
```php
function hello(?string $who = 'World', bool $return = false, bool $newline = true): ?string {
    $str = "Hello {$who}!" . ($newline ? PHP_EOL : '');
    if ($return) return $str;
    echo $str;
    return null; // a ?string return type needs an explicit return value
}
```
I get an impression that we constantly add things into standard library which are from a language perspective irrelevant and that all could be developed in userland with no harm.

Contrary Opinions

There are some people who STILL claim that PHP is too verbose and should have all these long words reduced to smaller abbreviations or, even better, a collection of symbols. In this newsgroup post Benjamin Eberlei said the following:

PHP is the most verbose language I know. Everything in PHP requires more characters than in other languages. Keywords are usually long, variables need an extra $ in front. You have to use $this-> as no implicit context exists. Functions need to be prefixed by "array_", "str_" instead of methods on the objects.

I have even seen some programmers complain about having to type in the three characters "new" when instantiating a class into an object. Unbelievable!

If PHP is the most verbose language you have ever used then you must be used to using some obscure languages which can only be used by people with PhDs. I spent 16 years programming in COBOL which was the most verbose programming language ever written as it used proper words for almost everything. I also did a bit of programming in Assembler on a UNIVAC mainframe and SPL (Systems Programming Language) on a Hewlett-Packard mini-computer, both of which used one-letter mnemonics instead of words. Guess what? I found the COBOL code easier to read than the Assembler because it contained words whose meaning was immediately obvious instead of symbols and mnemonics that had to be translated. It is not the amount of time taken to WRITE lines of code which is important, it is the amount of time taken to READ and UNDERSTAND those same lines of code, especially when you are not the original author and you have not seen those particular lines before. Reading COBOL is like reading a piece of prose. Reading assembler is like reading the hieroglyphics on the wall of an Egyptian tomb.

The fact that some keywords are long is not an issue for me. The fact that string functions are prefixed with "str_" and array functions are prefixed with "array_" makes it easier to identify the family to which a function belongs. The fact that I have to prefix a database query with "mysqli_" or "sqlsrv_" or "pg_" is not an issue for me as I have to identify which DBMS I am using anyway. What would be a viable alternative?

The fact that some words require an extra symbol to identify what type of "thing" they are is also not an issue. In fact NOT having those symbols WOULD create an enormous issue. I invite you to look at the following list and convince me that the time taken to use those symbols is time wasted.

foo                       // WTF is this? A variable? A function? A method?

$foo                      // this is a local variable
$this->foo                // this is a class variable in the current object
$GLOBALS['foo']           // this is a global variable
foo()                     // this is a function
$this->foo()              // this calls method 'foo' in the current object
$foo = new foo;           // this creates a local variable 'foo' to contain an object from class 'foo'
$foo->foo                 // this accesses the class variable 'foo' in object 'foo'
$foo->foo()               // this calls method 'foo' in object 'foo'
$foo->foo($foo)           // this calls method 'foo' in object 'foo' passing local variable 'foo' as an argument 		

Here I have used the same word 'foo' with different sets of symbols in order to identify the various usages. If you tell me that the same level of identification can be achieved WITHOUT those symbols then you are either a liar or an idiot.

In this newsgroup post a person called Tyson Andre said the following:

The primitives any() and all() are a common part of many programming languages and help in avoiding verbosity or unnecessary abstractions.

For example, the following code could be shortened significantly
// Old version
$satisfies_predicate = false;
foreach ($items as $item) {
    if (API::satisfiesCondition($item)) {
        $satisfies_predicate = true;
        break;
    }
}
if (!$satisfies_predicate) {
    throw new APIException("No matches found");
}

// New version is much shorter and readable
if (!any($items, fn($x) => API::satisfiesCondition($x))) {
    throw new APIException("No matches found");
}

You may think it is more readable, but as far as I am concerned it fails one important test - instead of showing that code to another person try reading it out loud. You will then find your use of symbols instead of proper words to be a hindrance instead of help.
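For reference, the behaviour being proposed can be sketched in userland in a handful of lines of plain words. The function name `any_match` and the sample data are hypothetical and chosen only for this illustration; this is not the implementation from the proposal.

```php
<?php
/**
 * Return true as soon as one item satisfies the predicate, otherwise false.
 */
function any_match(iterable $items, callable $predicate): bool
{
    foreach ($items as $item) {
        if ($predicate($item)) {
            return true;
        }
    }
    return false;
}

$hasNegative = any_match([3, 7, -2], fn($n) => $n < 0);
$hasLarge    = any_match([3, 7, -2], fn($n) => $n > 100);
$emptyResult = any_match([], fn($n) => true);
```

An empty list never satisfies the predicate, so the last call yields false.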

Is compact code more readable?

Why is it important that program code be "human readable" more than "machine readable"? What does the term "human readable" actually mean? I don't know about you, but when I read instructional text, whether it be program code or a manual, I tend to hear that text as the spoken word in my mind. The only symbols that I (and the vast majority of the population who don't hold PhDs) have no trouble understanding are those that were taught in school as part of basic mathematics. Here I am talking about such common symbols as:

= equals
< less than
> greater than
<= less than or equal to
>= greater than or equal to
<> not equal to
+ plus
- minus
* multiply by
/ divide by
% percent

When I read any of these symbols in a piece of text the sound that I hear in my head is taken from the column on the right as none of these symbols have names which are different. It was only after I started to write OO code in PHP that I became familiar with other symbols. Again none of these symbols have simple names to identify them, it is only the context in which they are used which gives them meaning.

<<< triple less-than sign - this is used to denote the beginning of a heredoc statement.
=> equal-to plus greater-than sign - this is used in a foreach statement to identify that you are extracting both the key and its value from the array.
-> minus/hyphen plus greater-than sign - this has two variations (property access and method call), shown here for both a named object and the current object:
  • $foo->bar refers to variable/property bar in the object contained within variable $foo.
  • $foo->bar() refers to function/method bar in the object contained within variable $foo.
  • $this->bar refers to variable/property bar in the current object.
  • $this->bar() refers to function/method bar in the current object.
<=> less-than-equals-greater-than sign - this is also known as the spaceship operator. The statement $result = $a <=> $b will return into $result an integer value of either -1, 0, or 1 when $a is less than, equal to, or greater than $b, respectively.
expr1 ? expr2 : expr3; ternary operator - this evaluates to expr2 if expr1 evaluates to TRUE (non-zero), and expr3 if expr1 evaluates to FALSE. This is equivalent to
if (expr1) {
    return expr2;
} else {
    return expr3;
}
expr1 ?: expr2; shorthand ternary operator - this evaluates to expr1 if it is TRUE (non-zero), otherwise it returns expr2. This is equivalent to
if (expr1) {
    return expr1;
} else {
    return expr2;
}
expr1 ?? expr2; null coalescing operator - this returns expr1 if it exists and is not null, otherwise it returns expr2. This is equivalent to
if (isset(expr1)) {
    return expr1;
} else {
    return expr2;
}
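The shorthand operators described in this list can be placed next to their longhand equivalents in a short sketch. The variable names and sample values are invented for this illustration.

```php
<?php
// Spaceship operator: -1, 0 or 1 depending on how the two operands compare.
$less    = 1 <=> 2;
$equal   = 2 <=> 2;
$greater = 3 <=> 2;

// Ternary operator, followed by its longhand equivalent.
$stock = 0;
$label = $stock > 0 ? 'in stock' : 'sold out';
if ($stock > 0) {
    $longLabel = 'in stock';
} else {
    $longLabel = 'sold out';
}

// Shorthand ternary: use the left side when it is truthy, otherwise the right side.
$nickname = '';
$display  = $nickname ?: 'anonymous';

// Null coalescing: use the left side when it is set and not null, otherwise the right side.
$options = ['colour' => 'red'];
$size    = $options['size'] ?? 'medium';
if (isset($options['size'])) {
    $longSize = $options['size'];
} else {
    $longSize = 'medium';
}
```

In each pair the shorthand and longhand forms arrive at the same value; the difference lies only in how much the reader has to decode.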

There are several other specialist symbols, but I won't bother to list them here. The important thing to note is that the common symbols are more easily readable and understandable than the more complex symbols that are used in software. Each additional symbol adds to the cognitive load required to understand the language, and as it is more difficult to search the manual for a symbol than it is for a word, it leads to many novice programmers asking the question "What does so-and-so symbol mean?"

When I am reading code and encounter one of those statements above which combines symbols with expressions, I don't know about you but I always take the time to convert it into the longhand version. I am sure that other programmers do the same, but after writing out the code in longhand some of them expend further effort in reducing it to a more compact version. Sometimes I think they are trying to see how much functionality they can squeeze into a single line. The problem is that a new reader of that code will have to do the reverse and translate that code from shorthand into longhand before it can be understood, which often takes longer than the original translation. If you think carefully you will see that the translation from longhand (long and simple) to shorthand (short and complex) has to be followed by the reverse translation from shorthand into longhand. So while the original writer is reducing the number of characters being typed he is actually wasting valuable time, both in his original translation from longhand into shorthand, and for every subsequent reader's translation from shorthand into longhand. When you take into account the number of possible readers that a piece of code may have in its lifetime you should see that a lot of time is actually wasted.

There is a word for the act of replacing proper but common words in a document with abbreviations and symbols which are less well known, and that word is obfuscation. This is not usually the intent of the writer, but when you replace Plain English with jargon, argot and hieroglyphics it is the inevitable effect. Just ask an adult to read a teenager's message on his phone, one that is full of textspeak and emojis, and you should get the picture. Code should be written so that even a junior programmer can read, understand and amend it. Code that can only be read by a select few is something which should be denounced, not applauded. Some developers like to write such code in an attempt to prove that only those with brains as powerful as theirs are actually qualified to do the job. These are followers of the KICK Principle. But beware, if this approach is taken too far it can have dire consequences. In his blog post Three Bad MySQL Query Types You May Be Writing the author makes this observation:

the developer was a fan of writing tricky to read code for reasons of job security. And yes they were eventually fired for writing code that even they could not maintain.

Writing code is much like any other form of writing. Although it is written for a machine to execute, that objective is purely incidental; its primary purpose is to be readable by other people. By "people" I mean any programmer at any level, not just those with PhDs and degrees in obfuscation. In his article How to Eschew Obfuscation & Write Clearly the author states the following:

Don't use your vocabulary to impress or intimidate; use it only to communicate.

In order to do that, you need to understand your audience. You need to know .... what words they're used to. And you need to serve them.

Don't write over their heads; that's pointless.

When programmers forget that their primary purpose is to write code which other programmers can read and understand, instead of code which is simply executed on a computer, they can easily fall into the habit of writing bad code. Some inept programmers try to blame the language by saying that it prevents them from writing good code, but this is a fallacy. Just as a good programmer can write good programs in any language, a bad programmer can also write bad programs in any language. While a language may try to incorporate features which promote what are supposed to be "good" practices, there will always be some creative programmer out there somewhere who will still produce a dog's dinner. Neither does the lack of such features in a language mean that you are doomed to write bad code from the outset. In the early days of my computing career I remember one disgruntled programmer saying that "COBOL is not a structured language". What the poor little innocent did not understand was that the language did not force you to adopt a particular structure, it simply offered a selection of possibilities, each of which required different degrees of effort, and it was up to the individual programmer to make a choice commensurate with their intellectual abilities.

FORTRAN has often been cited as a language which makes it very easy to write bad code, and this has led to articles such as the following:

This leads me to a simple question - is the time taken by the original writer to produce clever code with fewer keystrokes actually worth the effort when you consider the time taken by multiple readers to "fill in" those missing keystrokes? Is it true to say that saving such keystrokes is a waste of time? This is analogous to writing plain text, then zipping it in order to save space on disk, which then requires every reader to unzip it before they can read it. Sure, you are saving space on disk, but you are trading that space for the time taken to zip and unzip the text. This might have seemed like a good idea in the last century when disk space was very expensive and programmers were cheap, but today the reverse is true. Disk space is ridiculously cheap and it is programmers' time which is expensive. What should be the primary concern of every programmer today - to save space or to save time and money?

What I find even worse is the attitude of some developers who, after writing some readable code, decide that it is not compact enough but they cannot find a way to compress it, so they call for the language to be changed to provide an esoteric way of replacing userland code with new functionality which often requires a new symbol to replace a proper word. This may seem like a good idea to them, but it makes the language more complicated for the core developers, and increases the WTF!! factor for other application developers.

Importing features from other languages

As well as adding shortcuts to existing functionality (such as the short array syntax), another disturbing trend I see far too often is where somebody wants a feature added to the language for no good reason other than that it exists in another language.
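For anyone unfamiliar with the shortcut mentioned above, here is a small sketch with made-up data. The short array syntax, added in PHP 5.4, is simply a second spelling of the `array()` construct and produces exactly the same value.

```php
<?php
// Illustrative data only; the keys and values are assumptions.

// The long form, using the array() construct.
$long = array('id' => 1, 'tags' => array('php', 'oop'));

// The short form, added in PHP 5.4 as an alternative spelling.
$short = ['id' => 1, 'tags' => ['php', 'oop']];

// Both spellings build identical arrays.
var_dump($long === $short); // bool(true)
```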

In this newsgroup post Rasmus Lerdorf said the following:

Rather than piling on language features with the main justification being that other languages have them, I would love to see more focus on practical solutions to real problems.

In this newsgroup post Zeev Suraski said the following:

> A language that is usable primarily by beginners will only be useful for beginners. Non-beginners will shun it, or simply grow out of it and leave.
> A language that is usable only by PhDs will be useful only for PhDs. Beginners won't be able to comprehend it.
> A language that is usable by both beginners and PhDs, and can scale a user from beginner to PhD within the same language, will be used by both.
> Doing that is really hard. And really awesome. And the direction PHP has been trending in recent years is in that direction. Which is pretty danged awesome. :-)

I would argue that PHP was already doing that almost since inception. I think we have ample evidence that we've been seeing a lot of different types of usage - both beginners' and ultra advanced going on in PHP for decades.

I would also argue that in recent years, the trending direction has been focusing on the "PhDs", while neglecting the simplicity seekers (which I wouldn't necessarily call beginners). Making PHP more and more about being like yet-another-language, as opposed to one that tries to come up with creative, simplified ways of solving problems.

Last, I'd argue that a language that tries to be everything for everybody ends up being the "everything and the kitchen sink", rather than something that is truly suitable for everyone.

We also seemed to have dumped some of our fundamental working assumptions - that have made PHP extremely successful to begin with:

- Emphasis on simplicity
- Adding optional features makes the language more complex regardless of whether everyone uses them or not

It does seem as if we're trying to replicate other languages, relentlessly trying to "fix" PHP, which has been and still is one of the most successful languages out there - typically a lot more so than the languages we're trying to replicate.

In this newsgroup post Zeev Suraski said the following:

As a whole, people don't realize that PHP does not need fixing. I'm NOT saying it's perfect and that it cannot be improved - of course it can - but I am saying that it's not broken; In fact, it's remarkably successful the way it is, and in fact, we have no evidence that since the RFC process was embraced and language-level features started making their way into it on a much faster pace - anything changed for the better in terms of popularity. People arguing to introduce radical changes to it (and making PHP a lot more of a typed language, optional or not, absolutely constitutes a radical change) should realize that it's not risk-free, and given that they tend to be advanced, top 5-10% coders - that they're catering not to just coders like them, but also the rest of the 90-95% of the world.

Introducing new syntax to PHP, with new semantics, adds a lot of cognitive load no matter how we spin it. Given how easy it is now to propose an RFC, and the general bias-for-change of internals, we're now doing this at a remarkable pace, with very few checks and balances. Every feature is evaluated context-free, on whether it's useful in some cases yes/no, and without taking into account in any way that 'less is more'. Just see how much discussion we're seeing here about open questions in this typing discussion. Whatever decision we take in each and every one of these discussions - means added cognitive load, as by definition that decision wasn't an intuitive one, but one that required much discussion, debate and sometimes compromise in order to reach.

In this newsgroup post Zeev Suraski said the following:

Creating a generic feature that makes sense in a handful of situations, while at the same time being one that's waiting-to-be-abused in the vast majority of the rest (or as Tom put it, a 'footgun') is a pretty poor bargain IMHO.

In this newsgroup post Stanislav Malyshev said the following:

It also seems to me that some measure of support for these features comes from the "coolness factor" - look, ma, we have complex types, just like those cool academic languages everybody is excited about! And I don't deny the importance of language having some coolness factor and getting people excited, but in this case I think it's a bit misplaced - in *PHP*, I believe, most of the use for this feature would be to hide lazy design and take shortcuts that should not be taken, instead of developing robust and powerful type system.

Now, PHP's origins are not exactly in "powerful type system" world, so it's fine if some people feel not comfortable with this rigidity and having to declare tons of interfaces, and so on. This is fine. But inserting shortcuts in the system to make it "strict, but not strict" seems wrong to me.

In this newsgroup post Stanislav Malyshev said the following:

> same thing as "PHP is not $other_language, therefore nothing from that
> language is useful for PHP." Which is an utterly wrong and useless

No, it's not. Nobody claims *nothing* from other languages is useful in PHP. What is claimed instead that *not everything* from other languages is useful in PHP, and, for example, importing random high-order type constructs from these languages without having extensive supporting infrastructure that those languages have makes no sense. Or, for example, importing arguments like "nulls are evil, let's replace nulls with monadic types" into PHP make much less sense in PHP context then they make in the context where they are made.

> Referencing other languages to support the inclusion of a feature is not
> a coolness argument. It's a "solved problem, prior art exists" argument.

The problem is that this prior art exists in different context, targeting different audience and having different styles, traditions, capabilities and support system, and it is taken wholesale without accounting for that.

> If a need is identified within PHP for a given feature ...

Maybe my feeling is wrong, but I do have a feeling that recently "need is identified" turned into "I can, if I think really hard, think of a complex artificial example where this feature would provide a marginal improvement". I would like a much higher barrier for "need is identified" that that. For me, some of the proposals look like solution is search of a problem. Maybe it's just because I do not understand the actual need enough. In that case I'd like to see it shown more prominently.

I mean, for adding a function or a parameter to function - fine, almost any use case will do, the more the merrier. But for overhauling whole language's type system - I don't think so.

> Similarly, actual computer science (as opposed to the software
> engineering most of us do) is developing real and meaningful new
> solutions to problem spaces, which take years, often decades, to
> percolate down into production languages. That doesn't mean proposing a
> language feature informed by academia is just being hipster or elitist,
> it means learning from and benefiting from the work of others. That's
> the whole point of OSS.

I do not object to being informed by academia, far from it. I object to arguments "some folks from academia say X is a good thing, therefore we must do X". Maybe X is a good thing for PHP, maybe it isn't - but whichever way it is, it's not because somebody likes (or dislikes) it in a completely unrelated context.

> its inclusion. I completely agree with that. But rejecting a feature
> suggestion with "you're just trying to look cool" is unhelpful,
> unconstructive, and frankly harmful to the community and the language.

Nobody does that - that is not the *reason* for rejecting anything, it's just a marginal side note. I just try to turn our attention to the fact that not all cool features that exist in other languages can, or should be, in PHP, even if they do look cool. And I try to share my worry that some of the things being proposed include seriously complicating PHP's conceptual model while serving at best infrequent use cases. Simplicity is a virtue, and we have already lost significant amount of that virtue. Maybe we gained more - I don't know yet, as most of our user base isn't even on 5.5 by now. But it does worry me that we are not only losing it - we seem to be actively trying to get rid of it.

In this newsgroup post Zeev Suraski said the following:

> It's the same thing as "PHP is not
> $other_language, therefore nothing from that language is useful for PHP."

Larry, I don't believe that anybody has ever said anything of the sort on internals, ever (although I've been known to readily admit I'm senile). We never block a feature from PHP because it comes from a given language. However, when considering a new feature for PHP - a procedural, OO loosely-typed language, and not a functional or strongly typed one - bringing in features that make it more functional or more strongly typed, cannot be on the grounds that they exist in other languages. Of course they exist in other languages - there are many different types of languages, PHP cannot and should not try be all of them.

> PHP's history has
> very clearly been one of borrowing and stealing ideas from every language we
> can find if they fit and make sense in PHP (and not if they don't).

I think that it's much more correct to say that PHP's history has been clearly one of borrowing and stealing ideas from C, Java and Perl, and not every language we can find. C, Java and Perl have some very strong commonalities, which is why creating a language that merged good stuff from all of them - plus adding more of our own - made sense and created a generally successful mix. But we never ever wanted, nor do we want right now, to borrow ideas from all of the languages in existence, even if they're good ideas. Good ideas exist in other languages, that don't fit PHP language characteristics.

Which again, does not mean that a feature that comes from a functional/academic language is inherently disqualified, and I do maintain to nobody is saying it; But when we come to evaluate whether it "fits and makes sense in PHP", than naturally, the likelihood that it does is inherently lower.

> Referencing other languages to support the inclusion of a feature is not
> a coolness argument. It's a "solved problem, prior art exists" argument.

But it's a weak, almost trivial argument. It's still one that is relevant - but given that it's weak, people should not expect that by saying that "XYZ language has it", this constitutes a strong argument in its favor. If that XYZ language is from a very different language family, then as I mentioned above, it may be an indicator that it's not a very good fit for PHP. Again - not inherently disqualified - just 'raising questions'.

> If a need is identified within PHP for a given feature, it is both
> logical and expected to look for prior solutions to the same or similar
> problems. That's the whole point of OSS. That doesn't make the solution
> used by another language necessarily the right one, but it should be
> considered a viable candidate.

The problem is, IMHO, that we're very, VERY flexible with the definition of the word 'need'.

There used to be a rule of thumb on internals that finding some use cases for a given language-level feature hardly constituted grounds to add it. It had to be useful on a very wide range of situations, in order to be worth the trouble of implementing it, maintaining it, but most of all - of adding complexity layers to the language (both in terms of cognitive burden and likelihood of misuse). Now, the whole 'complexity' factor is almost ignored. Focus is on finding a use case or a handful of use cases where the feature can be useful - a task which is almost always doable - especially when borrowing features from other languages.

> Remember:
>
> 'Programming languages teach you not to want what they don't provide.'
> --https://twitter.com/compscifact/status/375283793923670016

Is that inherently bad? It could be if it truly limits you, but if a language has a certain way of doing things, and not another - is it bad that it'd funnel you to do things its way?

Is it that bad that something that wants to use functional syntax, will not embrace PHP but something else? We don't have to be everything for everyone.

Regardless, at least as far as I can tell, it seems as if on internals, the sentiment is the 180 degrees opposite from Paul's statement. It's as if we feel PHP's syntax is never ever enough, and is in desperate need of extension - even though some amazingly advanced apps have been and are written on top of it. I'm not saying we should halt adding new syntax, but I am saying that (a) the pace at which we're discussing new syntax is mind boggling and way too fast, and (b) the bars we seem to be happy with in what constitutes 'need' are extremely low.

I would counter that statement with this one:

Perfection is achieved not when there's nothing more to add, but when there is nothing left to take away

IMHO, it would be AWESOME if we could funnel some of these cycles from new syntax and onto other things like parallel processing, async IO, JIT and more - which can truly take PHP to the next level. New syntax cannot.

In this newsgroup post Stanislav Malyshev said the following:

> In general, improving the type system provides a much more interesting and
> practical playground for any kind of tool that would rely on static

That's my point - "more interesting playground" does not sound like a reason enough to mess with the type system of the language used by millions. This sounds like a good description of a thesis project or an academic proof-of-concept language, not something a mature widely-used language prizing simplicity should be aiming for. I completely agree that *if* we added a ton of shiny things into PHP then there would be a lot of interesting stuff to play with. I am saying that is not reason enough to actually add them.

In this newsgroup post Stanislav Malyshev said the following:

There are a lot of additions that may improve PHP in many practical ways. I just think right now there's a bit too much focus on adding new syntax that adds much more complexity than it's worth. I'm not against adding new syntax per se, I guess I just want more necessary capabilities enabled per complexity added ratio.

In this newsgroup post Zeev Suraski said the following:

> This would mean, by an large, that people had tried a more recent version of
> PHP and found that their code was incompatible. I think on the contrary that
> they haven't tried because they have little motive. A lot of running apps are in
> maintenance mode with no significant investments in new code, without which
> it's easier to take the attitude that it's not broken so don't mess around with it.

It's more complicated than that - people don't actually have to try and upgrade in order to know (or think they know) that they'll have to invest time and efforts in getting their code to run on a new version. They guess as much.

That said, I don't think the issue with shiny new things is that they introduce incompatibilities. They rarely do - I think the biggest source of incompatibilities we have is removal of deprecated features and not introduction of new ones. Shiny new features have other issues - increased cognitive burden, increased code complexity, etc. - but typically introduction of incompatibilities is not one of them.

However, we can learn that the attractiveness of new features in PHP is not very high - or we'd see much faster adoption of new versions (which also leads me to believe that we're spending too much effort on the wrong things). I think we're going to see much faster adoption of 7.0 - but in my experience at least, it's predominantly the increased performance and reduced memory consumption that gets people excited - the new features are secondary if they play any role at the decision.

In this newsgroup post Pierre said the following:

I think the language should not enforce any practice, convention or code style, it must remain neutral considering people's whereabouts, workflows or practices.

And finally ...

I shall finish with some other quotes.

Everything should be made as simple as possible, but not simpler.

Albert Einstein

All that is complex is not useful. All that is useful is simple.

Mikhail Kalashnikov

Any idiot can write code that only a genius can understand. A true genius can write code that any idiot can understand.

The mark of genius is to achieve complex things in a simple manner, not to achieve simple things in a complex manner.

The readability of a language is directly proportional to its verbosity. If "verbose = readable" then "less verbose = less readable"

Tony Marston (who???)

You may also like to read RE: Improving PHP's Object Ergonomics which also discusses this topic.


Amendment History

01 Sep 2020 Added Contrary Opinions and Is compact code more readable?
