Please do not break our language

By Tony Marston

31st December 2014
Amended 1st April 2016

Introduction

This post is addressed to PHP's core developers who are proposing to break our beloved language yet again in the next major version of PHP (version 7) by removing functionality which has worked perfectly for years simply because it does not fit in with their ideas of how it should be done today. I am talking about PHP RFC: Remove PHP 4 Constructors (and this post on php.internals) which proposes that all code with PHP 4 style constructors be made invalid in favour of the "correct" method which was introduced in PHP 5. This is despite the fact that both types of constructor have lived quite happily side by side for over a decade and that large volumes of code, including PEAR libraries, were written in the PHP 4 style.

Note: since this article was written a new set of existing features has been put up for deprecation in version 7.X and eventual removal in version 8, such as the removal of the "var" keyword. Once again the reasons are spurious and will cause existing applications to break without the expenditure of large amounts of time and effort in "refactoring" to deal with unnecessary changes. Will these numpties ever learn?

This change will require massive amounts of code to be modified for no tangible benefit other than to please the delicate sensibilities of a few minor nonentities. Millions of existing websites will not be able to upgrade to PHP 7 until their code is rewritten, and I fear this single BC break will make the move from PHP 5 to PHP 7 much slower than the move from PHP 4 to PHP 5. When the core developers start complaining that PHP 7's uptake is slower than they expected they should first consider their own incompetence in breaking the language rather than those millions of website owners who don't see why they should have to expend effort on fixing something which shouldn't have been broken in the first place.

The idea that PHP 4's constructors or using the "var" keyword is the "wrong" way is absolute nonsense. If you look at Constructor (object-oriented_programming) you will see that none of the other languages uses the "__construct" method name at all. What you do see is that Java, C# and C++ all use the PHP 4 style. Languages such as VB.NET, F# and Perl use the method name of "new", Coldfusion uses "init" while Python uses either "new" or "init". This puts paid to the notion that PHP 5's constructors are in keeping with other widely used languages and are therefore the "correct" way.

The idea that using the "var" keyword is the "wrong" way is also absolute nonsense. If you look at the RFC you will see that its removal adds absolutely nothing to the language. Its only purpose is to break BC as the author does not like its use and wants to prevent everyone else from using it. I find this attitude to be dictatorial and authoritarian, and therefore unacceptable.

There are some developers out there who seem to think that the rule that BC breaks can only occur in major/minor releases (where the numbering system is <major>.<minor>.<revision>) means that in any major/minor release they are therefore allowed to break as much BC as they like, and to hell with the consequences on the rest of the community. It is about time that this arrogant attitude was nipped in the bud. In his article On Backwards Compatibility and not Being Evil Derick Rethans calls this willingness to break BC at the drop of a hat an "evil" attitude, and I have to agree with him.

There are some developers out there who are so childish they say that if PHP doesn't change and become more up-to-date, more consistent, more "pure" and more like other languages then they will leave the PHP community altogether. All I can say to such people is "Goodbye, good riddance, and close the door behind you on your way out". We don't need or want people like you in our community, and we don't want to have to rewrite working code just because you say that we should.

There are more than a few developers out there who are chomping at the bit to break as much BC as they can in order to turn PHP into a language which is more suited to their ideals. In his article The Siren Song of Backwards Compatibility Phil Haack has this to say:

Usually, when someone tells you breaking compatibility is fine, they mean as long as it doesn't affect them.

Who owns PHP?

PHP was originally created by Rasmus Lerdorf in 1994, and later enhanced by the efforts of Andi Gutmans and Zeev Suraski who helped produce PHP 3 in 1998. They also rewrote the core to produce PHP 4 in 2000. When it became open source it effectively came under the control of the PHP Group on behalf of the greater PHP community. But who comprises this "greater PHP community"? It is, in fact, made up of the following groups:

While there may only be between a couple of dozen to a couple of hundred core developers, as at January 2013 there were over 240 million websites which were powered by PHP. As there were only 22 million in 2005 this represents a considerable increase in growth and popularity.

This popularity was not down to the efforts of the current set of core developers. While the original core developers provided the basic language it was down to the efforts of the application developers who created large numbers of effective applications. Website providers, the hosting companies, could easily support the installation of these applications as the main ingredients - the Apache web server, the PHP scripting language, and the MySQL database were all open source and therefore free to use and easy to install. Compared with other languages PHP was easier to learn and easier to deploy. It could be said that PHP contributed to the explosion in internet usage simply because it enabled websites to be built for customers without the need for expensive software and without the need for highly trained and expensive developers.

The PHP language is not fixed at what Rasmus Lerdorf initially created in 1994, or the version that Andi Gutmans and Zeev Suraski helped create, it is a growing language that is improving in capabilities all the time. However, some people have great difficulty in understanding what the term "improvement" actually means.

To some core developers the term "improvement" means that they can remove functionality which they don't like and replace it with newer and "proper" alternatives. Rather than extending the efforts of the original creators they are effectively changing it into a different language altogether. Amongst their pathetic excuses are:

It has been said that as the core developers are the authors of PHP then they have the right to do what they want with the language. I disagree. Apart from Rasmus Lerdorf, Andi Gutmans and Zeev Suraski who were the original authors, everyone else is a late-comer, a johnny-come-lately, a barnacle on the bottom of a behemoth, one who basks in the glory of others. These newcomers do not understand that PHP became popular because of the way it works, all they want to do is change it into something else. They do not understand that by breaking BC for no good reason (where the only GOOD reason is to fix a security vulnerability) they are actually breaking a large number of applications. This causes those application developers to have to find time to fix what shouldn't have been broken, and this slows down the adoption of each new release. Then these nitwits have the nerve to complain that the new release with all its security fixes is not being adopted quickly enough! I have news for you guys - if you stopped breaking BC with each new release we developers would find it easier to upgrade to the new release and would therefore upgrade sooner. So for f***s sake stop complaining about a problem of your own making!

Once a language has been released to the public and adopted by large numbers of users, the ability of the authors to make arbitrary changes without the consent of the larger community should be seriously diminished. The more BC breaks that each new release contains the less confidence the greater community will have in that language to support their desire to have stability and longevity.

How to become a core developer

Who can become a core developer? As PHP is written in and based on the 'C' language anyone who can program in 'C' can contribute to PHP. Anyone can submit a code change, but it has to be reviewed and voted on by other contributors before it is accepted. Only people who have contributed at least one code change are allowed to vote. Unfortunately there appears to be no minimum on the number of people who need to vote on any proposed change, or a minimum of PHP experience either in years or contributions to open source projects, and I have seen changes go through with as few as 12 voters. This means that a group of 12 activists (I regard them as saboteurs) can gang together and force through changes without consulting the greater community, changes that do not build on the language that we know and love, but which instead morph it into something completely different, something which moves in the direction of "purity" (whatever that is) instead of simplicity. These people want to make PHP look and feel more like other languages, in which case they should leave PHP alone and stick with those other languages.

Rather than adding value to the language they are changing it simply because they can, and with total disregard to any inconvenience that may be caused to the 240 million websites out there in userland. They don't care how much damage they cause or how much code they break, just that they have made the language "better". However, their opinion of what "better" means is totally different to what it means to the greater PHP community.

Adding new functionality to the language is an acceptable improvement as existing code should still work, but the choice of whether or not to use the new functionality should be entirely up to the developer and not those minor nonentities who think that they are masters of the universe. The core developers are not the masters of the greater PHP community, they are its servants, and they should remember that it is their duty to serve us, and not our duty to follow their diktats.

Refactoring code in the real world

One of the most common arguments I hear from those who are pushing for BC breaks is that the language is constantly evolving, so all developers should do their duty and refactor their code to deal with the latest fashions and fads. I'm afraid that this attitude simply does not wash in the real world where organisations have to balance costs against income. If their costs are too great they quickly go out of business. They pay their programmers to write and maintain the software which is essential to the running of the business. Once an application has been written it goes into "maintenance mode", but as well as fixing bugs this also usually entails the addition of new functionality and the modification of existing functionality. This is to cater for changes or additions to the organisation's requirements which either come from internal sources or are the result of external market pressures. As a result of this the size of the application grows as well as its cost. The software therefore becomes an expensive asset which would be very costly to replace in terms of both money and time, which means that the organisation expects that software to last a very long time. It will not be replaced unless the cost of its replacement can be justified. An organisation simply will not authorise a complete rewrite of its core application every 5 years just because it is 5 years old. It is a fact that some organisations are still running COBOL applications which are over 30 years old simply because they cannot afford a complete rewrite in a new language. It may cost extra to hire COBOL programmers due to their rarity, but it's still cheaper than a full system rewrite.

Organisations do not (usually) authorise expenditure unless it first goes through what is known as a cost-benefit analysis to calculate the Return On Investment (ROI). This is where they compare the cost of doing something with the benefits that it may/should provide in order to determine if it would be a worthwhile investment. Unless any proposed work provides benefits which outweigh its costs it is deemed to be not cost effective and is usually dropped from the schedule. Spending 1p to save £1 would be a good investment, but spending £1 to save 1p would not. The length of time it takes for the investment to repay its costs is also a factor - if the costs can be recouped within 1 year then that would be more attractive than taking 10 or 20 years.

When it comes to programmers working on the code, the standard practice is for the organisation's management to decide what needs to be done. After being told how long it will take and how much it will cost they either drop an idea as being too expensive or add it to the development schedule with the relevant priority. As each task is authorised it is given to a programmer, or a team of programmers, to implement. The important thing to note here is that this process means that developers do not work on the code unless the particular task has been costed, shown to have benefit and then authorised. It is simply not done for a programmer to implement any changes which have not been authorised. It is simply not done for a programmer to decide for himself what features he will or will not work on. It is simply not done for a programmer to say "I want to take some time to rewrite the code because I don't like the way it was written". Refactoring code to remove bugs or add wanted functionality can be seen to provide benefits and can be authorised, but refactoring code just to satisfy the whims of a developer can never be seen as providing any benefit to the organisation so will never be justified. Similarly when new programmers join the team who maintain a legacy system they cannot turn around and say "I cannot work on this code unless it is rewritten to conform to my personal idea of perfection, purity or best practice". If they are not prepared to maintain that legacy software then they can be expected to be removed from the team.

When it comes to the PHP language itself you can regard the community of userland developers as the organisation which employs the language developers even though no money actually changes hands. Just as employees are there to service the needs of their employer, the core developers are there to service the needs of the greater community of application developers in userland. This implies certain behaviour:

When you recognise the fact that a small change by a single core developer may require effort by millions of userland developers on millions of applications in order to keep those applications running, you should see that the total impact of that change is really quite significant. It may only be a 30 minute change, but multiply that 30 minutes by 1 million and the total comes to 500,000 man hours. Is that significant or what? In this case before any BC break is implemented the core developers should get it authorised by the greater PHP community. The effect it will have on the core developers themselves is irrelevant. The question should always be phrased around its effect on the greater community:

If the cost of this change is X and its benefits are Y, is this change worth the effort?

In the case where X > Y then the costs outweigh the benefits, so the change is not worth the effort and should be rejected. If there are no benefits at all to the greater community then the change should be rejected regardless of its cost.
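The arithmetic and the decision rule above can be sketched in a few lines of PHP. This is only an illustration: the function names total_rework_hours() and change_is_worthwhile() are hypothetical, and it assumes that every affected application needs the same amount of rework and that cost and benefit are measured in the same unit.

```php
<?php
// Hypothetical helpers for illustration only; the names are not part of PHP.

// Total userland effort in hours, assuming every affected application
// needs the same number of minutes of rework.
function total_rework_hours($minutesPerApplication, $applications)
{
    return ($minutesPerApplication * $applications) / 60;
}

// Decision rule from the text: reject when cost X exceeds benefit Y,
// and reject outright when there is no benefit at all.
function change_is_worthwhile($costX, $benefitY)
{
    if ($benefitY <= 0) {
        return false; // no benefit: rejected regardless of cost
    }
    return $costX <= $benefitY;
}

// A 30 minute change applied to 1 million applications.
echo total_rework_hours(30, 1000000), " hours\n"; // 500000 hours

// 500,000 hours of cost against no benefit to the community.
var_dump(change_is_worthwhile(500000, 0)); // bool(false)
```

The point of the sketch is that the cost side is multiplied by the number of affected applications, while the effort saved by the core developer is counted only once.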

The greatest good for the greatest number

Where a small number of people are supposed to service the needs of a much larger community, such as language developers serving the needs of the language users, whose needs and wishes take priority? I worked with several languages before switching to PHP, and each of those languages went through several upgrades which added features but never took any away. In this way all pre-existing code still worked as it was expected to, and nobody was forced to rewrite anything unless they wanted to incorporate a new feature. There was none of this juvenile mentality "I don't like this old feature, so I'm going to remove it" as the language developers were all competent professionals who were the servants of the greater community, not its masters.

Unfortunately the current batch of PHP core developers seem to be cut from a different and inferior cloth. Instead of being competent professionals they give the appearance of being bungling amateurs who have very little understanding of the background, history and usage of PHP. Instead they have their own limited viewpoint and a total disregard for any view which does not match their own.

In post http://news.php.net/php.internals/78979 Kris Craig said:

We certainly have no obligation to support BC on features that became obsolete in PHP5.

They have never been marked as deprecated so they cannot be regarded as obsolete, and the fault is yours if you think that they are. The fact that they have co-existed with PHP 5 constructors for the past 10 years should show that they are capable of co-existing for the next 10 years.

In post http://news.php.net/php.internals/78993 Maxime Veber said:

I didn't know that this use case was still valid.

I'm sorry, but if you don't know what is valid or not valid in PHP then you are not competent enough to be a core developer, and certainly not competent enough to dictate to the rest of the PHP community which language features they should use or not use.

In post http://news.php.net/php.internals/79040 Andre Andreev said:

Most developers with no PHP4 experience don't know that such a feature exists and spend hours trying to figure out why a parent class constructor isn't getting called when it's not overridden by self::__construct().

I'm sorry, but if you cannot be bothered to read what the manual says regarding class constructors then you have only yourself to blame if you get it wrong. It might help to take a look at PHP RFC: Default constructors as this may help solve this particular problem. He also said:

PHP can't support everything forever, old and nowadays rarely used syntax like this one must go away.

Who says that functionality which has worked for decades cannot continue to be supported? How much effort would be required to take this functionality out? How much effort would be required to leave it in? Who says that the old syntax is rarely used? Do you have any idea of how much code which was originally written in the PHP 4 style is still running in userland? Are you aware that a significant number of the PEAR libraries still use PHP 4 constructors?

In post http://news.php.net/php.internals/79086 Nikita Popov said:

I've lost count of the number of times I had to debug some "completely impossible" behavior I got while writing quick testing code (which is obviously not namespaced), because I accidentally created a class "Test" with a method "test" or similar.

Then I strongly suggest that you RTFM which, in the page on Constructors and Destructors, clearly states the following:

For backwards compatibility, if PHP 5 cannot find a __construct() function for a given class, and the class did not inherit one from a parent class, it will search for the old-style constructor function, by the name of the class.

As of PHP 5.3.3, methods with the same name as the last element of a namespaced class name will no longer be treated as constructor. This change doesn't affect non-namespaced classes.
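As a rough illustration of the two constructor styles living side by side, here is a minimal sketch. The class name Invoice and its property are hypothetical. Because __construct() is defined, the lookup rule quoted above means the old-style method is never chosen automatically, so it simply forwards to __construct() for legacy callers that still invoke it by name.

```php
<?php
// Hypothetical class used only for illustration.
class Invoice
{
    public $number;

    // PHP 5 style constructor. When this exists it is the one that
    // "new Invoice(...)" runs.
    public function __construct($number)
    {
        $this->number = $number;
    }

    // PHP 4 style constructor (a method named after the class). It is
    // kept here only so that legacy code calling it by name keeps working;
    // self:: forwards to this class's own __construct().
    public function Invoice($number)
    {
        self::__construct($number);
    }
}

$invoice = new Invoice('INV-1');
echo $invoice->number, "\n"; // INV-1

// Legacy-style call by name still reaches the same initialisation code.
$invoice->Invoice('INV-2');
echo $invoice->number, "\n"; // INV-2
```

A class which defines only the old-style method is the case covered by the fallback lookup described in the manual, and how that case is treated depends on the PHP version in use.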

In post http://news.php.net/php.internals/79053 Johannes Schlüter offers a rare piece of sanity by saying:

Yay! Instead of helping users to keep up to date and respecting their needs we give them reasons to stay on old versions by making things "cleaner". How wonderful.

In post http://news.php.net/php.internals/79077 Jan Schneider said:

I still don't get what problem this RFC is actually going to solve? I don't see a problem. Yes, PHP4 constructors are old and should no longer be used. But there is no problem still supporting them.

In post http://news.php.net/php.internals/79078 Matteo Becatti said:

I can accept BC-breaks if there's a good reason, but this one seems a little weak. For reference 1/3 of the 4500+ php files in the project I work on every day still uses PHP4 constructors. And one third of them is within the bundled PEAR libraries.

I'm not saying it is quality code, but it's a dinosaur open source project... I believe that many other historical open source projects are in the same boat. All the (unpaid) time that the team(s) would need to waste on this could be better spent elsewhere.

There is another PHP RFC: Scalar Type Hints (and this post on php.internals) which proposes to change the existing type hinting mechanism into type enforcement, and this will produce a different set of BC breakages. There are some comments in this thread from wiser members of the core development team which reiterate the common problem that BC breakages cause, which is to slow down the adoption rate for the new version. In this post Stanislav Malyshev said:

Breaking ZF2 and all software built on it is not "some breakage", it's a serious issue which would produce a big barrier for PHP 7 migration. And looks like there are more frameworks that do the same. This would be a barrier to PHP 7 adoption, and note that is even for people that couldn't care less for scalar typing. We'd find ourselves in python 3 situation - where people would be glad to upgrade but they use library X and it doesn't work and they have no idea how to fix it and they keep all their development on the old version and the new one never catches on. It'd be a shame if we spend all this effort on PHP 7 and get no adoption since people can't run their existing code on it.

In this post he also says:

PHP is an universal platform and PHP 7 would be offered as upgrade to all PHP users - running ZF1, ZF2, Symphony, Drupal, anything. If there would be a sizable chance that their existing code would not run on PHP 7, people would not upgrade. Our upgrade record is not stellar as it is, even with extraordinary effort we put in keeping BC 5.4->5.6. If we break major libraries in 7, I am afraid we'd have adoption problem.

When it comes to improving the language only additions and bug fixes should be allowed. Removing a feature simply because a junior core developer does not like it should be disallowed. The issue should be broken down into two simple questions:

  1. How much effort would be required by the core developers to keep the old and new style constructors running side by side, just as they have been for the past 10 years?
  2. How much effort would be required by the greater PHP community to deal with the consequences of removing support for the old constructors?

In any competent organisation the area which identifies the least amount of effort also identifies where the least amount of effort will gain the maximum advantage. So why force the greater PHP community to take up the slack which is a direct result of the core developers' inability to do their job properly?

If there is a choice between a lazy or incompetent core developer doing only half a job and leaving the 240 million members of the greater PHP community to clear up his mess, then it should be obvious to anyone who has more than two brain cells to rub together that it is the core developer who needs to put in the extra effort so that the greater PHP community does not have to. This is clearly a case where the core developer needs to act like a professional, do the job properly, "man up", bend over, and take one for the team. Anything else is lazy, selfish and likely to upset all those developers who have spent years of effort in creating applications which have made PHP the most popular language for the web. If those applications have to be reworked every 12-18 months just to keep up with the whims and preferences of a small number of know-it-all core developers then the popularity of PHP as a language will surely suffer.

One man's meat is another man's poison

The idea of purity is completely subjective, and different people will have a different idea of what this means and how it can be achieved. The idea of OO purity is even more difficult to achieve as nobody can agree on what "OO" actually means, let alone "pure OO". Pragmatic programmers know that something which is effective in the eyes of paying customers has more value than something which is pure in the eyes of a few dogmatists. Pragmatic programmers know that their customers appreciate the effectiveness of a piece of code much more than its theoretical purity.

An example of where a core developer broke backwards compatibility in the name of "purity" can be found in bug #36770 which resulted in a change to array_merge. Before this "improvement" if any of the arguments was not an array, such as FALSE, NULL or an empty string, the function would simply ignore it and carry on using the other arguments. After the "improvement" if any argument was not an array then the function would ignore all the other arguments and return FALSE. His excuse was that each argument was supposed to be an array, so anything which wasn't an array should cause the whole function to error. This is regardless of the fact that PHP was designed from the outset to have automatic type casting or type coercion, so NULL or FALSE had always been treated as empty arrays. How many users complained that the old behaviour was wrong? How many users complained that the new behaviour broke their code?
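For code which relied on the tolerant behaviour described above, a userland wrapper along the following lines is one way to get it back. This is a sketch only: the name lenient_array_merge() is hypothetical, and it reproduces the behaviour as described in this article (non-array arguments are skipped), not the internals of any particular PHP release.

```php
<?php
// Hypothetical helper, not part of PHP: merges only the arguments that
// are arrays and silently skips anything else (NULL, FALSE, '' and so on).
function lenient_array_merge()
{
    $result = array();
    foreach (func_get_args() as $argument) {
        if (is_array($argument)) {
            // array_merge() itself only ever sees genuine arrays here.
            $result = array_merge($result, $argument);
        }
    }
    return $result;
}

print_r(lenient_array_merge(array(1, 2), null, false, '', array(3)));
// Array
// (
//     [0] => 1
//     [1] => 2
//     [2] => 3
// )
```

The cost of this approach is exactly the one complained about above: every affected call site has to be found and changed to use the wrapper.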

A second example is the switch from the DOM XML extension in PHP4 to the DOM extension in PHP5. These extensions have exactly the same functionality with the only difference being the method names which were changed from snake case (lower case with underscores) to camel case, for example changing $doc->create_element to $doc->createElement. This meant a complete rewrite of all affected code with absolutely no difference in functionality. The author of this change just decided that snake case was old fashioned and that camel case was the new "best practice" and to hell with what anybody else thought. The fact that snake case was the original style for PHP which was inherited from its roots in the C language did not seem to have any significance. This developer thought otherwise and forced the whole community to modify their code to conform to his preferred style.
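Where a rename really is the only difference, old and new method names can be served by the same object. The sketch below shows the general idea using PHP's __call() magic method. SnakeCaseAdapter and CamelDocument are hypothetical names, and CamelDocument is a stand-in for any class with a camel case API rather than the real DOM extension.

```php
<?php
// Hypothetical adapter: forwards snake_case calls to the camelCase
// methods of the wrapped object, so both spellings keep working.
class SnakeCaseAdapter
{
    private $target;

    public function __construct($target)
    {
        $this->target = $target;
    }

    public function __call($name, $arguments)
    {
        // create_element -> "create element" -> "Create Element" -> "createElement"
        $camel = lcfirst(str_replace(' ', '', ucwords(str_replace('_', ' ', $name))));
        if (!method_exists($this->target, $camel)) {
            throw new BadMethodCallException("No method $camel() for $name()");
        }
        return call_user_func_array(array($this->target, $camel), $arguments);
    }
}

// Stand-in for a class with a camelCase API (not the real DOM extension).
class CamelDocument
{
    public function createElement($tag)
    {
        return "<$tag/>";
    }
}

$doc = new SnakeCaseAdapter(new CamelDocument());
echo $doc->create_element('title'), "\n"; // <title/>
echo $doc->createElement('title'), "\n";  // <title/>
```

The same aliasing could in principle be provided once, inside the extension, instead of being left for every application to deal with.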

A third example is the switch from the XSLT extension in PHP4 to the XSL extension in PHP5. These extensions have exactly the same functionality with the only difference being the switch from procedural style to OO style. Other extensions, such as mysqli for example, can happily offer both styles, but the author of this extension decided that anything which was not pure OO was not acceptable. The fact that PHP was originally procedural only and then enhanced to be multi-paradigm was lost on this individual, and so he forced his personal preference on the entire PHP community.

If it ain't broke then don't fix it

The idea of replacing code simply because it is old or because the same thing can be done with another function does not carry weight in the real world. Unlike their mechanical counterparts, software components do not wear out over time and have to be replaced. They can run and run and run for as long as there is suitable hardware and electrical power.

There are some developers out there who think that a project is never complete but always "work in progress" that needs constant refactoring. They assume that when a new version of the language is released that all developers will revisit existing code to see if it can be refactored to include any new functionality, or simply to achieve the same result in a different way. This is *NOT* how things happen in the real world. Once a unit of work has been delivered to and accepted by the customer then that unit should be left alone until there is a very good reason to go back to it. I deal with enterprise applications which contain 2,000+ units of work, and if you think that I am going to examine all of those units each time a new PHP release comes out then you are seriously deluded. Taking time out to "fix" code which doesn't need fixing is non-productive and therefore not appreciated by the paying customer who wants improvements in the application that he can actually see. This sentiment is echoed by Clayton Neff in his article When is Enough, Enough?.

When electronic ignition was invented did it mean that all those cars with carburetors suddenly stopped working? Did mechanics say to millions of car owners "I'm sorry but I cannot work with obsolete technology any more"? There may be more than one way of achieving something, there may be new and better ways, but each method has its set of pros and cons, so each should be allowed to exist side by side and the choice of which technology to use should be left to the developer/engineer/customer, not the inventor of the new technology. The idea of a piece of code being redundant or obsolete has no value in the real world. If something works you leave it alone, you do not go through the expense and inconvenience of replacing it with the latest gizmo simply because the gizmo's inventor tells you to. Not everyone can afford to completely rewrite their software every five years or so, so if something works it should be left alone. Some applications are expected to have a shelf life which can be measured in decades, not days. Leaving something alone and antagonising no-one takes less effort than taking it out and antagonising millions of customers.

An example of the removal of so-called "redundant" code was the POSIX regular expression engine which was ditched simply because the same functionality was provided by the PCRE extension. "This redundant code is a maintenance burden" they said. Excuse me, but how can a piece of code which does not have any bugs and therefore does not require any maintenance be a maintenance burden? How much effort would it take to say "The POSIX extension is frozen, but we are going to leave it in for our current users, but if you have any issues or want enhancements then you should switch to the PCRE extension instead"?

Using this logic how long will it be before some bright spark jumps up and says that this old 'long' way of creating an array is redundant and therefore should be eliminated in favour of the new 'short' way:

$arr1 = array('long', 'version');
$arr2 = ['short', 'version'];

How long will it be before some bright spark notices that there are two ways of iterating through an array - for and foreach - and decides that one of them must be redundant and can therefore be removed?
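To make the point concrete, the short sketch below shows that the two array syntaxes build identical arrays and that the two loop styles visit the same values in the same order. The variable names are illustrative only, and the short array syntax assumes PHP 5.4 or later.

```php
<?php
// Both syntaxes build exactly the same array.
$long  = array('same', 'values');
$short = ['same', 'values'];      // short syntax, PHP 5.4 or later
var_dump($long === $short);       // bool(true)

// Both loop styles visit the same values in the same order.
$viaFor = '';
for ($i = 0; $i < count($long); $i++) {
    $viaFor .= $long[$i] . ' ';
}

$viaForeach = '';
foreach ($long as $value) {
    $viaForeach .= $value . ' ';
}

var_dump($viaFor === $viaForeach); // bool(true)
```

Neither form gets in the way of the other, which is the whole argument for leaving both in place.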

There are two things I expect from a language if I am going to continue using it for any length of time - stability and longevity. This means that code which I wrote 10 years ago should still run today, and should still run 10 years from now. There may be new features added to the language, but the choice of whether or not to use these new features should be mine and mine alone. If I am forced to modify my code with each new version of PHP then I am unlikely to adopt each new version of PHP.

If a job is worth doing, it is worth doing properly

This concept appears to be totally alien to the latest batch of novice core developers who seem to think that as soon as a change works in the way that they expect it to then their work is done and they can stop. They fail to take into consideration all the possible edge cases, or uses in situations which they did not envisage, or uses in ways in which they do not approve. For example, although PHP has two ways of defining class constructors the authors of the traits and namespaces additions chose to completely ignore the old style constructor. These short sighted developers should be forced to go back and correct their mistakes.

There is also PHP RFC: Default constructors (see also this post on php.internals) which proposes the introduction of the concept of default constructors in PHP. This RFC should continue to deal with PHP 4 constructors and not perpetuate this juvenile and unprofessional argument that only PHP 5 constructors are acceptable. Whether you like it or not PHP 4 constructors have been working happily for over 15 years, and there is no valid reason to suddenly decide that their use is no longer acceptable. Acceptable to whom? A couple of dozen core developers, or 240 million website owners?

In his article Backwards Compatibility Is For Suckers the author Anthony Ferrara said the following:

The problem with trying to maintain Backwards Compatibility between releases is that every release adds more cruft for you to maintain. Over time this creates a halting effect on the code base involved that makes it nearly impossible to clean up and "make things better". This tends to create an anchor that keeps a project stuck in the stone age.

Later on in the same article he says:

That's why I want to introduce a concept here. Instead of worrying about BC as a primary rule, why don't we worry about Forward Compatibility? The basic premise here is simple:

If the core developers followed "best practices" themselves and did their job properly in the first place, then the addition of new features to the language would not result in all this "cruft". If the language is a mess then it is because they made it that way. If they did their job properly in the first place it would not be necessary for the developers in userland to spend untold amounts of their valuable time in fixing mistakes which were not of their making.

Questions to be asked before breaking BC

Too many of the newest additions to the core development team are too eager to break BC in the name of "purity" or "cleanliness" or "fashion" with complete disregard for its effect on the greater PHP community. Before any BC break is considered I would like to see answers to the following questions:

  1. How much of a maintenance burden does this feature currently present to the core developers?
  2. Are there any open bug reports?
  3. How many bug reports have there been in the last 5 years?
  4. What problems does this feature cause to the userland developers?
  5. What effort would be required by the core developers to NOT make this change?
  6. Does this feature stand in the way of adding new features?
  7. How much effort would be required by the core developers to make this change?
  8. What would be the benefit to the core developers by making this change?
  9. How much effort would be required by the userland developers to deal with the effects of this change?
  10. What would be the benefit to the userland developers by making this change?

If the answers to questions #2 and #3 are both zero, then how is it possible to say that the feature creates a burden for the core developers?

If the answers to questions #5 and #6 are both "none", then how can leaving this feature as it stands be causing a problem?

If the feature as it stands does NOT present a genuine problem to either the core or userland developers, then why does this non-problem require a solution?

If the costs of this change represented by questions #7 and #9 are "significant", yet the benefits represented by questions #8 and #10 are "insignificant", then how can this change be justified if the costs outweigh the benefits?

When is an upgrade not an upgrade?

For the vast majority of developers who live in the real world, a proper "upgrade" for a language means that they can replace their old version of the language on their PC and their existing applications will still run without having to change any code. If the new version contains performance improvements then those improvements will automatically be available to that application. If the new version contains bug fixes then those fixes will automatically be available to that application. If the new version contains some new features then those features will NOT be available to the application unless its code is amended to make use of them.

However, if an "upgrade" causes your application to break then it becomes a "non-simple very expensive upgrade" and this will cause a slowdown in the uptake for the new version. If you look at How Microsoft Lost the API War you will see that Raymond Chen and his team bent over backwards to ensure that numerous applications which worked on the previous version of the Windows OS would also work on the new version even when this meant allowing an application to do something which was now considered to be invalid. That is why a program which was written in 1983 will still run flawlessly.

That is why every Windows user is confident that their investment in software applications will not be flushed down the toilet whenever they upgrade their OS. Their existing applications will still run and they will not have to spend any money in upgrading them.

Now contrast this with the Apple Macintosh where each OS upgrade totally ignored backwards compatibility in favour of "purity" and "cleanliness" and "consistency". The OS developers may have patted themselves on the back for a job well done, but its users wanted to give them a hefty kick up the backside for all their expensive applications which were now broken. To take a quote direct from Joel Spolsky's article: "It's why so few applications from the early days of the Macintosh still work."

Now take a look at what happened when Microsoft "upgraded" Visual Basic 6.0 to VB.NET. This was NOT an upgrade as existing VB 6 code would not work in the new compiler. While various people at Microsoft saw .NET as a "better" language as it did not have to contain all that "unclean" or "impure" code to deal with backwards compatibility, in the outside world it went down like a lead balloon. As Joel Spolsky states:

So now instead of .NET unifying and simplifying, we have a big 6-way mess, with everybody trying to figure out which development strategy to use and whether they can afford to port their existing applications to .NET.

No matter how consistent Microsoft is in their marketing message ("just use .NET—trust us!"), most of their customers are still using C, C++, Visual Basic 6.0, and classic ASP, not to mention all the other development tools from other companies.

Although the .NET family was technically "better" in the eyes of its authors, it went down like a lead balloon with its customers, the application developers. A lot of these developers took the decision that writing applications using Microsoft products was no longer a safe bet as stability and longevity could no longer be guaranteed. They would have to spend large amounts of time and money in modifying their applications every few years or so just to keep up with the latest offering from Microsoft. A lot of them decided that if they had to rewrite their applications anyway, or needed to choose a platform for a new application, then that platform would not be Microsoft. The world is moving away from desktop applications to web applications, and if you look at the most popular languages used today for developing web applications you will see that .NET is nowhere near the top of the list.

The message is clear - keep breaking BC and your customers' confidence in your product will slowly disappear. If your product cannot offer stability and longevity - things which are essential before companies will invest large sums to write large applications which are expected to last decades or even longer - then your customers could end up by thinking that their money would be better spent in switching to a new language altogether.

My credentials

I am not a core developer as I have never used the 'C' language. I am instead a framework and application developer. I have been developing enterprise applications for over 30 years, initially with compiled languages such as COBOL and UNIFACE. Because I worked for a software house I was involved in the design and development of many applications for many organisations, and I created my own frameworks in each of those two languages to make the job easier. The original version was created in COBOL in 1985, and rewritten in UNIFACE in 1995. In 2002 I decided to switch to web development with PHP and rewrote my framework again as a personal project. After requests from programmers who read the articles on my personal website I released this PHP version as open source in 2006.

Because I deal with enterprise applications which are responsible for viewing and maintaining enterprise data, and enterprise data has very high value and a very long life, my applications, one of which is a package used by several customers, tend to have a long life. I wrote my framework in PHP 4 and my first applications in PHP 4, and because of the slow uptake of PHP 5 I had to support PHP 4 for many years. This did not bother me as PHP 4 had all I needed to write effective OO programs, and all the OO "improvements" introduced in PHP 5 I dismissed as mere window dressing which I did not need to use. This is explained in more detail in A minimalist approach to Object Oriented Programming with PHP.

When PHP 5 was released I did not rush to change my code so that it was PHP 5 only because I would have instantly lost access to all those potential customers who were still running PHP 4. Instead I took the effort to ensure that the same codebase could run on both PHP 4 and 5. The only changes that I made to my code were those forced upon me which I mentioned earlier (the DOM XML, XSLT and POSIX extensions). I even included in my framework the ability to automatically switch to use the improved MySQL extension if it had been loaded. This meant that my customers could upgrade both the language and their MySQL database at their leisure and my framework would automatically deal with it without missing a beat.
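A minimal sketch of this kind of runtime detection is shown here. The helper name choose_mysql_driver() is hypothetical and is not code from the framework itself; it only illustrates the idea of preferring the improved MySQL extension when it has been loaded.

```php
<?php
// Hypothetical helper: pick a driver name based on which extensions are loaded.
function choose_mysql_driver()
{
    if (extension_loaded('mysqli')) {
        return 'mysqli';    // the improved MySQL extension
    }
    if (extension_loaded('mysql')) {
        return 'mysql';     // the original extension (removed in PHP 7.0)
    }
    return null;            // neither extension is available
}

$driver = choose_mysql_driver();
```

The calling code can then load whichever database class matches the returned name, so the same codebase works regardless of which extension the customer has installed.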

The first applications that I created all those years ago still run under the latest versions of my framework, PHP, et cetera without the need for any code changes. Since they were first created I have made numerous enhancements to both my applications and my framework, and have constantly updated my development machine to the latest version of PHP, MySQL and Apache, and everything keeps on running. If I can attain such longevity with my framework then why can't the core developers do the same with the language?

If you tell me that code longevity is infeasible then I must call you either a liar or incompetent. It may take extra effort to introduce new features without breaking backwards compatibility, but the language developers are supposed to take that extra effort and not shift it on to their customers, the people who work with their product. If I can do it with my framework then you should be competent enough to do it with the language.

If developers such as myself are forced to change code which has worked for years simply to satisfy the whims of some upstart pipsqueak who thinks he knows better than everyone else, any time that is spent is time wasted as far as I am concerned. It does not result in any visible improvement in the application, and the application users cannot be charged for an "improvement" which they cannot see.

If you think that I am alone in my wish for code longevity, for applications with a long shelf life, then take a look at PHP vs Ruby – Application Shelf Life which states that the high cost of reworking an application just to keep up with changes to the underlying language and framework caused the developer to switch to a totally different language and framework. Once an application has been built the only extra costs that are acceptable are the costs of adding improvements, not in keeping it current with the latest updates to the OS, framework or language.

Conclusion

To borrow a phrase from Shakespeare: "To break or not to break - that is the question". There is no simple "yes" or "no". It requires an evaluation of the advantages and disadvantages to both sides - the language developers and the language users. It also requires a modicum of common sense, something which appears to be sadly lacking in some circles. It is a balancing act which requires that all arguments should be considered and weighed against each other. Which side will have the most pain? Which side will have the most gain? Which side is the most important - the 240 language developers or the 240 million websites which run PHP applications and who would be affected by the decision?

If applications written in PHP 4/5 do not have a long shelf life because of PHP 7 then how long will it be before PHP itself falls out of favour and is consigned to the shelves of history? If PHP 7 is incapable of running applications written in PHP 4/5 then it will not be an "improvement" to the language at all. It will instead be like a new language which needs either completely new code or a complete rewrite of old code. If I am forced to rewrite my code then I might start looking for a new language, something which offers more stability and longevity than the current set of PHP core developers think is necessary.

Any version of PHP that is incapable of running perfectly valid code that was written for PHP 4 does not deserve to use "PHP" in its name. Instead it should be called something like FUBAR, which stands for F****d Up Beyond All Recognition. Those developers who want to work with a "better" language should consider creating their own fork, or joining an existing one such as Hack/HHVM.

This quote came from Rasmus Lerdorf, who invented PHP, on the internals list:

Rather than piling on language features with the main justification being that other languages have them, I would love to see more focus on practical solutions to real problems.

This quote came from Johannes in a comment on Derick Rethans' article No to a Uniform Variable Syntax:

As Bjarne Stroustrup says: "Compatibility is a feature." To existing users who can stay up to date easily as well as to new users who can be assured that their investment is safe. Randomly changing the language doesn't do that. Randomly changing the language tells "be prepared to be continuously reviewing your code for breakage" instead of helping them solve their actual issues. Staying up to date should not be an issue for any user.

Here is a quotation taken from the Revised Report on the Algorithmic Language Scheme (PDF).

Programming languages should be designed not by piling feature on top of feature, but by removing the weaknesses and restrictions that make additional features appear necessary.

Feedback

After I published this article on my personal website I also provided a link to it on LinkedIn. It also had a reference in The PHP 7 Revolution: Return Types and Removed Artifacts. Although I have received support from some members of the PHP community, the vast majority of the comments have come from radicals who want to change the language using the philosophy "out with the old and in with the new". They do not want to improve the language by building on the old ways, they want to completely destroy the old ways and create a completely different language which is more in keeping with their ideas of what a language should be. Instead of removing old functionality when it becomes absolutely necessary they want to remove it simply because they can, using pathetic excuses such as code "cleanliness", "purity" or "compatibility". Below are some of their arguments with my responses:

You shouldn't be using PHP 4 constructors at all
Who says? They have coexisted with PHP 5 constructors for the last 10 years and have never been marked as deprecated, so they are a valid part of the language. As such I am perfectly free to use them should I so wish. The simple fact of the matter is that my codebase started with PHP 4, and when I upgraded to PHP 5 I only changed my code where it was absolutely necessary (i.e. to replace a function which had become deprecated or had been removed). It has never been "necessary" to change all PHP 4 constructors to PHP 5 constructors.
PHP 4 is obsolete, so PHP 4 constructors are obsolete
Just because PHP 4 is obsolete does not mean that every feature that existed in PHP 4 is now obsolete. Besides, the term "obsolete" is irrelevant. Unless a feature has been marked as deprecated it is fully supported and therefore not obsolete.

Using this logic I suppose that sometime in the near future some idiot will call for the removal of the long syntax for arrays as PHP 5 now contains a "new and improved" short syntax. How many people would support that idea? How many would be horrified at the volume of code that they would have to amend?

Nobody writing new applications uses PHP 4 constructors
It depends on what they have been taught. Just because PHP 5 introduced an alternative constructor does not mean that all the existing code was automatically updated, or any of the numerous online tutorials or books were automatically updated. People tend to carry on doing what they have always done unless they are presented with a valid reason to change. Just because you personally have never used PHP 4 constructors and don't even know that they exist does not mean that they should automatically be removed from the language.
PHP 4 constructors cause problems.
One problem that I have heard about is when someone creates a non-namespaced class with a method which has the same name as the class and they do not realise that the method becomes the class constructor instead of a callable method. This is ignorance on the programmer's part and not a fault with the language itself. If they cannot be bothered to read what the manual says about constructors and destructors then they have only themselves to blame.

A second problem is when a subclass wishes to call the parent's constructor and doesn't know whether to use parent::__construct() or parent::<classname>() or even if the parent has a constructor in the first place. This situation could easily be rectified in PHP RFC: Default constructors.
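The uncertainty described in the second problem can be sketched as follows. Both class names are hypothetical, and the method_exists() check is only one possible way for a subclass to cope, assuming PHP 5 or later. Note that PHP 7 reports a deprecation notice for the old style constructor, and PHP 8 treats it as an ordinary method.

```php
<?php
// Hypothetical parent written in the PHP 4 style: the constructor shares the class name.
class LegacyBase
{
    var $ready = false;

    function LegacyBase()
    {
        $this->ready = true;
    }
}

// Hypothetical subclass which checks which constructor style the parent offers.
class Child extends LegacyBase
{
    function __construct()
    {
        if (method_exists('LegacyBase', '__construct')) {
            parent::__construct();      // parent uses the PHP 5 style
        } elseif (method_exists('LegacyBase', 'LegacyBase')) {
            parent::LegacyBase();       // parent uses the PHP 4 style
        }
        // If neither method exists the parent has no constructor to call.
    }
}

$child = new Child();
```

The subclass has to know the parent's class name and test for both styles, which is exactly the guesswork that a default constructor would remove.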

Using PHP 4 constructors is not best practice
Don't wave that old argument in my face as it does not have any weight. There is no single document called "best practices" or "standards" which is universally accepted by all programmers. Each group has its own set of standards which are nothing more than the personal preferences of the group leader, and what is best for one group may be totally ignored or even contradicted by another.
By not allowing the language to change you are holding up progress
A change for the better is a good idea, but that change should be a genuine improvement and not some purely cosmetic exercise which has no measurable benefits. While new features may be added to the language I am not obliged to use them. I am not against functions being removed if it is for a genuine reason such as a security issue (like register_globals) or a bug fix, but removing functionality for a totally frivolous reason does nothing but cause grief for your customers, us application developers.

Progress can be made in one of two ways - revolution or evolution. Revolution is fast, while evolution is slow. Revolution replaces the old with something completely different, while evolution builds on the old and only replaces those parts which actually need replacing. Revolution means that the group with the new ideas are forcing their opinions upon everyone else, while evolution allows all groups to have their say. Revolution is dictatorial while evolution is democratic.

It should be obvious that I favour evolution, not revolution. Those who want revolution are actually asking for a completely new language, not an improved version of the old one, in which case they should either stop using PHP altogether and start using a different language which meets their expectations (if one actually exists, that is) or move to a fork such as Hack/HHVM. In other words they should "Fork Off" and leave our beloved language alone.

The language needs to be cleaned up in order to attract new developers
Why? The language is already good enough to be used by millions of developers to write millions of applications which have been installed on over 240 million websites (source is wikipedia.org) which account for over 80% of the world's web sites (source is w3techs.com) so if they don't have a problem then the new guys should learn to use it as-is. If a programmer who is used to a different language doesn't like the fact that PHP has a different syntax and different behaviour he should learn to understand one simple thing - all the various languages have differences, otherwise they wouldn't be different and would have no reason to exist. If they all had the same syntax and the same behaviour there would be no need for different languages. If a programmer doesn't like PHP then the solution is simple - he should stop using it and find a language that he does like.
You need to remove redundant functionality
How can you prove that a particular function is redundant? Just because you and your colleagues don't use it doesn't mean that all the other developers in the world follow suit. You don't have access to every PHP script that is currently being used, so you cannot even guess at the usage statistics for every function in the language.

The only time you actually need to remove old functionality from the language is when it physically prevents the addition of new functionality, but this is rarely the case. New functionality can be provided with new APIs, and by leaving in the old APIs you also retain access to the old functionality. Unless the two sets of APIs cannot coexist at the same time there is not a genuine reason to remove the old APIs. This argument is echoed by Max Kanat-Alexander in his article When Is Backwards-Compatibility Not Worth It?:

The only time you should seriously consider ditching backwards-compatibility is when keeping it is preventing you from adding obviously useful and important new features. But when that’s the case, you’ve really got to ditch it.

Note here that I support the removal of old APIs if their replacements are genuinely useful, but not when they are not.

You need to refactor your code more often
Some people are under the impression that if code isn't refactored at regular intervals - as little as every 12 to 18 months has been quoted - it becomes stale, outdated and unmaintainable. Others have the impression that when a new version of PHP is released with new features you must find a way to put these new features in your code in order to be up-to-date and modern.

It may come as a surprise, but I do not subscribe to either of those theories.

Having worked for several software houses for many years where I had to build applications for paying clients, and maintain them afterwards, the prime directive was "time is money". We could not do any work unless it was approved by the client, and I'm afraid that refactoring the existing codebase was never anything that they were willing to pay for, so it never got done. We only ever looked at existing code if it was to fix bugs or to deal with a user requirement.

That habit has stayed with me ever since. I created my RADICORE framework in PHP 4, and even when PHP 5 was released I had some clients who were stuck on PHP 4 for up to 4 years afterwards, so I had to support both versions of PHP with the same codebase. Where there were code differences between the two versions I was able to switch to the correct code at runtime by using the following:

    if (version_compare(phpversion(), '5.0.0', '>=')) {
        // ... do it the new way
    } else {
        // ... do it the old way
    } // if

When new versions of PHP are released I scan the update notes to see if anything has been marked as deprecated or there are any BC breaks. Where something is marked as deprecated then I usually have plenty of time to change my code before it is eventually removed. So far my code has only been affected by the following:

Although the POSIX functions were deprecated in 5.3.0, as of 5.6.5 they still exist in the language. Although technically I can still use them, because I was given plenty of warning I found the time to change my code to make the switch. This was relatively easy as I did not have to complicate my code with the version_compare() function, and there was no loss in backwards compatibility as the PCRE functions have been available since PHP 4.
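Switching a call from the POSIX functions to PCRE is usually a small change. A minimal sketch follows, assuming a hypothetical date check; the function name and the pattern are illustrative and are not taken from the framework.

```php
<?php
// The old POSIX call is shown only as a comment because ereg() was removed in PHP 7.0:
//     $ok = ereg('^[0-9]{4}-[0-9]{2}-[0-9]{2}$', $date);

// PCRE equivalent: the same pattern wrapped in delimiters.
function looks_like_iso_date($date)
{
    return preg_match('/^[0-9]{4}-[0-9]{2}-[0-9]{2}$/', $date) === 1;
}
```

Because preg_match() has been available since PHP 4, a replacement like this does not need to be wrapped in a version_compare() test.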

Although some new functions are sometimes available with a new PHP release, I never bother to incorporate them into my code as they never provide anything that I actually need. Most of the changes are purely cosmetic or nothing more than syntactic sugar, so any time spent in refactoring my code would be time wasted as it would not do anything that it did not already do before.

When it comes to refactoring my code I tend to follow the maxim "If it ain't broke don't fix it", which means that I don't touch it until I have a compelling reason. I also avoid the problems highlighted in Clayton Neff's article called When is Enough, Enough which describes the consequences of relentless tinkering with the code. A side-effect of my approach is that, so far, I have never broken BC in either my framework or my primary application, which means that my entire codebase will run in all versions of PHP from 5.0 right up to 5.6. This is something that other programmers cannot achieve as by utilising a new feature that comes out in a particular release they immediately make their code unrunnable in any earlier releases. Some of them maintain separate versions of their code for each different version of PHP, but my approach gives me a single codebase which will run on any version of PHP. This means that I can sell my application to customers regardless of which PHP version they are running, and this is far more valuable to me than the lack of benefits from pointless refactoring.

The old function has been replaced by a new function, so the old function is a duplicate and can be removed
This argument is extremely weak. Many languages provide more than one way to achieve an objective, each with their pros and cons depending on the circumstances. This is evident in the motto There's more than one way to do it which is associated with the Perl programming language. It is up to the individual developer to decide which method is appropriate for his current circumstances and not for the core developer to dictate which method is the "only" one that should be used.
You need to clear out the old to make way for the new
What you are describing is revolution instead of evolution. Revolution is more convenient for the core developers but results in a different language which is totally incompatible with the old one, and this can cause a difficult transition period for the application developers which can last for some considerable time. Evolution is slower for the core developers but builds on the old instead of tearing it down, thus causing the transition period to be less painful and problematic for the application developers. Revolution breaks backwards compatibility while evolution maintains it. As far as the PHP language goes I am in favour of evolution, but not revolution.

Here is a quotation from Wayne Mack in the c2.com article Backwards Compatibility:

Backwards Compatibility is the assertion that improvement does not imply the previous version was an absolute failure. One should build on the past, not mindlessly tear it down. ..... Backwards Compatibility often requires minimal effort to maintain, but many seem to prefer to rewrite rather than improve.

Other languages have BC breaks, so it is accepted practice
Just because some languages do it does not necessarily mean that it is right. Some languages support multiple inheritance while others do not, so which is right? Some languages require the use of interfaces while others do not, so which is right? Lots of people cheat on their spouses, but does that make it right? Lots of people cheat on their income tax returns but does that make it right? Lots of people take drugs, but does that make it right?

Before any BC break is allowed through, the questions listed earlier under "Questions to be asked before breaking BC" should be asked.

A smaller codebase for the language is easier to maintain
By saying that you are implying that you want to remove features so that the language developers have less work to do and therefore have an easier job. Compilers do not exist to give compiler writers a job, they exist to provide application programmers with the ability to develop effective software for their clients, so the needs of the application programmers should always take precedence over the whims of the compiler writers. Compilers by their very nature are complex beasts, so people who volunteer to work on them should have no right to complain that they can't handle it because it is too complex. If you can't stand the heat then get out of the kitchen. In his article On the importance of backward compatibility Ian Murdock has this to say:

I’ll argue that it takes a better engineer to move a platform forward while at the same time making sure things don’t break. It’s pretty easy to wash your hands of something and declare it to be someone else’s problem.

One of the main reasons that the codebase is getting larger and larger is the fact that the core developers are constantly adding new functionality to the language simply to appease a vociferous minority of radicals and OO purists, but without any evidence that it will be of benefit to the wider community. So if you think that the codebase is full of crap then one way to address this issue would be to stop adding even more crap! In his article When Is Backwards-Compatibility Not Worth It? Max Kanat-Alexander has this to say:

This also gives one good reason why you shouldn’t just add features willy-nilly to your program. One day you might have to support backwards-compatibility for that feature that you added "because it was easy to add even though it’s not very useful." This is an important thing to think about when adding new features – are you going to have to support that feature forever, now that it’s in your system? The answer is: you probably are.

A compiler does not exist just to provide employment for compiler writers, it exists so that application developers can use it to create applications. A software application does not exist just to provide employment for application developers, it exists to provide benefits for the organisation which commissioned that application. Just as an application developer should not make changes to the application unless they have been authorised by his employer, a compiler developer should not be able to make changes to the language unless they have been authorised by the compiler's users.

References


© Tony Marston
31st December 2014

http://www.tonymarston.net
http://www.radicore.org

Amendment history:

01 Apr 2016  Added Refactoring code in the real world.
             Added Questions to be asked before breaking BC.
             Added When is an upgrade not an upgrade?
25 Jan 2015  Added section Feedback and References.
