Tony Marston's Blog About software development, PHP and OOP

BC break in 7.2 caused by undocumented and unauthorised change

Posted on 15th June 2018 by Tony Marston
Introduction
Yasuo Ohgaki is illiterate
Yasuo Ohgaki is arrogant
Yasuo Ohgaki is incompetent
Yasuo Ohgaki is a liar
Storm in a Teacup?
References
Comments

Introduction

I have recently upgraded from 7.1.11 to 7.2.5 and have encountered a BC (Backwards Compatibility) break due to a change in the behaviour of the session_name() function. I have been using this function in my code since 2003 and it has always performed as documented.

When I now run the session_name() function it fails with the message "Cannot change session name when session is active". This is a break with the previous behaviour. The documentation clearly states "Get and/or set the current session name" which means that this function can be used in three ways:

At no point did the documentation say that it could not be run in either mode while a session was current active. Although it did say that in order to change the name from the default session.name you had to call session_name() in every request before session_start(), that merely told me that the new name would not become effective until the next call to session_start(). If a session was currently active then it had to be closed before the call to session_start(), but there was certainly no need to close the current session before starting a new one. This is not what the code which I wrote in 2003 did, and this ran for 15 years without any errors.

In my application I allow a user to create multiple browser windows on the same device so that they can view different parts of the application in different windows or tabs. This is done by first creating a clone of the current window in a new window (by using Ctrl+K in Internet Explorer), then starting a new session in one of the windows by clicking on a hyperlink which then executes code like this:

session_start();
... do stuff
if ($_GET['action'] == 'newsession') {
    $session_name = getNewSessionName();  // user-defined function to increment digits at end of name
    session_name($session_name);          // change session name
    session_regenerate_id();              // change session id
    header('location: ' ...);             // restart script to use new session name and id
    exit;
}

The reason that I restart the script is because I want to drop "action=newsession" from the URL.

Note here that I am NOT erasing the previous contents of the $_SESSION array, nor am I destroying the previous session in the file system or database. I am allowing the two sessions to run side by side without interfering with one another. The only reason that I am changing the session name is because the session_id() is stored in the cookie data as a value with the session.name as its key. The only way that I can store multiple session_ids in the cookie data is to have a separate session.name for each one.

During my work on this code, which is documented in Client Clones and Server Sessions and which also runs with my custom session handler, I noticed the following regarding the behaviour of sessions within PHP:

Although I was calling session_name($newname) while a session was active this did NOT cause any problems as it had absolutely zero effect on the current session. No calls were made to the session handler to deal with this change in name, and the current session cookie was not updated. The value of $newname was never associated with a session until the next call to session_start(), just as the manual states.

The change implemented in 7.2 breaks this behaviour and breaks my code. Before I reported this as a bug I search the bug database and found bug #75650 which declared this change in behaviour as "not a bug". My usage is clearly different, so I reported this as a new bug in bug #76358. Again the responder replied "this is not a bug" with a long list of lame excuses which boiled down to the fact that someone had reviewed the behaviour, didn't think that it should work that way, and decided arbitrarily to change it without further ado.

The bug report contains this observation from requinix@php.net:

It's clear to me from the current and past implementations that session_name() was never meant to alter the current session's name. The documentation does not clarify whether the behavior it states applies if the session has or has not been started and without that I find what it says to be ambiguous.

Then you need to read the manual for both session_name() and session_start() which say:

session_name():
The session name is reset to the default value stored in session.name at request startup time. Thus, you need to call session_name() for every request (and before session_start() is called).

session_start():
To use a named session, call session_name() before calling session_start().

The only possible ambiguity is in the use of the term "current session". To me a session cannot be classed as "current" until it has started, and it ceases to be current when it finishes. Changing the value of session.name while a session is active has NEVER affected the current session as it NEVER makes any calls to the session handler. The manual clearly shows that $session_name is only ever passed to the session handler on the open() method which is called when the session is started.

I raised the point that this BC break was never the subject of an RFC to which he replied:

Not all changes to PHP require an RFC.

I asked if this "problem" with session_name() were ever reported as a bug, to which he replied:

Not all changes to PHP require a dedicated bug report. This one came during other session fixes for bug #71038.

If you read that bug report you will see no mention of the session_name() function.

I raised the point that no change to session_name() was ever discussed and voted upon in the php.internals to which he replied:

This one was mentioned on the internals list by @yohgaki in mid October 2016 ("Fixing insane session_start() behaviors").

I checked this thread, but there was no mention of session_name(), and certainly no mention of any change in behaviour that would cause a BC break.

I raised the point that no changes to session_name(), or the fact that the change would cause a BC break, ever appeared in any PHP change logs or documentation, to which he replied:

It was listed in UPGRADING in https://github.com/php/php-src/blob/PHP-7.2.0/UPGRADING.

Later on another git called cmb@php.net made this comment at [2018-05-30 21:30 UTC]:

Not every (minor) BC breaking change requires an RFC or discussion on the internals mailing list. This very change was submitted as pull request on Github (which is an official and publicly available channel), and since there have been no general objections, the PR was committed.

Point #1: There is no such thing as a "minor" BC break, just as there is no such thing as being "slightly" pregnant or "slightly" dead. Every BC break causes code that used to work to suddenly fail after an upgrade. BC breaks are bad for the language as they destroy developers' confidence that their investment in writing software for their clients will not turn to dust with the next upgrade in the language. That is why a lot of developers delay any upgrades for as long as possible as they do not like the idea of being forced to rewrite parts of their applications just to appease some know-it-all knucklehead on the internals list who wants to impose on the world his idea of how things should be done.

Point #2: Neither Github, Twitter, Facebook, Stackoverflow, Linkedin or whatever-forum-you-care-to-mention are suitable channels for discussions on the internals of PHP. There is only one OFFICIAL channel, and that is php.internals. Github may be the official channel for pull requests, but it is not the official channel for discussions or votes.

Point #3: There were no general objections for the simple reason that not every proposed change was properly identified, discussed, justified and authorised. There was just this blanket statement "the former behavior is not valid and allows crash and misbehaviors". What misbehaviours? Where are the bug reports?

Point #4: As well as the proposed BC break never being discussed or voted on in the php.internals list, there was no deprecation notice issued, no mention in any change logs (at least not in the one contained in the 7.2 zip download for Windows), and no change in any documentation. Code that had run quite happily for 15 years suddenly stopped working - without any warning and for (as far as I could ascertain) no good reason.

The comment from cmb@php.net also pointed to a documentation change made by Yasuo at http://svn.php.net/viewvc?view=revision&revision=345080. You should notice three things about this message:

A person identifying himself as a at b dot c dot de made this comment on [2018-05-23 06:43 UTC] :

During the discussion on the internals list and on github, @yohgaki did mention that he _should_ have written an RFC on the subject, but never got around to doing so.

That just proves that he is a lazy git who cannot be bothered to follow the correct procedures.

I sent details of bug #76358 in a private email to Rasmus Lerdorf, Stanislav Malyshev and Nikita Popov as I regarded them as senior members of the php.internals and I wanted them to be made aware of this serious breach in procedure. I had this reply from Nikita:

I don't really know about sessions, maybe Yasuo has some insight here, as he implemented these changes originally?

From a quick look I think the particular combination of session_name + session_regenerate_id does seem legitimate (unlike most other uses of session_name on a started session).

My 2c on session changes in general is that sessions are a huge mess and any changes are dangerous (and many changes made in the past have had unexpected fallout). For the future it might be preferable to stop touching ext/session code and implement a new module instead.

Nikita

This reply makes no reference at all to the breach in procedure regarding BC breaks, so he has gone down in my estimation. I did, however, receive a reply from Yasuo in which he said the following:

In PHP 7.2, session module behaviors are improved to disallow nonsense usages.
https://github.com/php/php-src/blob/PHP-7.2/UPGRADING

You will notice that these "nonsense usages" were never actually identified, so nobody else could offer an opinion as to whether they were actually nonsense or not.

<?php
session_start();
$oldname = session_name('new_name');
?>
This code is wrong because, 'new_name' cannot be set when session is already started.

It could be set in all versions of PHP prior to 7.2 for the simple reason that session.name is used only in the next call to session_start(). Using session_name('new_name') after a session has started has absolutely no effect on that session as all activity is keyed on session_id(). All you have to do is look at the read() and write() methods in the session handler to realise this.

This code is ok in old PHP because output buffering is enabled and cookie can be modified

You are mistaken. session_name() has never modified any cookies, either prior to 7.2 or with 7.2. According to my usage over the past 15 years session cookies are only ever created by session_start() and updated by session_regenerate_id().

I have subsequently exchanged over 100 emails with Yasuo in which I challenged every statement he made and asked for proof of every issue that he thinks he has identified. I shall not quote every one of those emails, instead I will pick out certain statements and provide the conclusion which I drew from those statements.

Yasuo Ohgaki is illiterate

By this I mean that either he cannot read or he cannot understand what he reads. The manual for session_name() states the following:

The session name is reset to the default value stored in session.name at request startup time. Thus, you need to call session_name() for every request (and before session_start() is called).

The session name references the name of the session, which is used in cookies and URLs (e.g. PHPSESSID).

The default value of session.name, usually 'PHPSESSID', is defined in the php.ini file, and this is the name that will be used by session_start() unless it is changed beforehand by a call to session_name().

The manual also states:

session_name() returns the name of the current session.

If name is specified, the name of the current session is changed to its value.

Where this refers to "the name of the current session" this is not actually technically correct. A session cannot be classed as "current" until it is started, and it ceases to be current when it is closed. Thus instead of "the name of the current session" it should read "the current value of session.name". This value will not be used in a session until the next call to session_start().

The manual for session_start() states the following:

As of PHP 4.3.3, calling session_start() after the session was previously started will result in an error of level E_NOTICE. Also, the second session start will simply be ignored.

This makes it perfectly clear that a PHP script can only access one session at a time, but it is possible to close one session and open up another during the life of a script. The manual also states:

session_start() creates a session or resumes the current one based on a session identifier passed via a GET or POST request, or passed via a cookie.

The "session identifier" here is the value for session_id(). When using the cookie option (which is preferred as IDs embedded in URLs present a security issue) a lookup is performed on the cookie data using the current value of session.name as the key, and the returned value is used as the ID. Once the ID has been obtained (or generated if it did not currently exist) all further activity on that session is governed by the value of session_id() and the value of session.name becomes irrelevant while the session is current. This can be verified by looking at the manual for the SessionHandler class where you will see that $session_name is only passed as an argument on the open() method while $session_id is used in the read(), write() and destroy() methods.

The two functions are used for entirely different purposes:

It is possible to switch the value of session_id() during the life of a session, but only by using the session_regenerate_id() function. This function will automatically close the session with the current ID, generate a new ID, then open another session with the new ID. The session data for the old ID will be carried forward to the new ID. The session data for the old ID will not be deleted from the file system unless it is explicitly destroyed.

Note that there is no similar function to switch the value of session_name() while a session is active. The use of session_name('newname') will NOT make any calls to the session handler, will NOT automatically close the current session and will NOT automatically open another one. This is because once a session has started all reading and writing of the session data is based on nothing but the session_id() just as the manual states. Once the value for session_name() has been used to provide the session_id() from the cookie data it is no longer relevant as it is not used to access the session data.

This basic knowledge was obtained by both reading the manual and trying the various functions to see which combinations worked. It would appear that Yasuo was totally incapable of reaching the same level of knowledge even though he is a C programmer and had access to the internal code.

This is why I wrote the following code in 2003 to perform manually what could not be done automatically in a single standard function, and it ran happily for the next 15 years - until Yasuo broke it in version 7.2.

<?php
session_name('oldname');
session_start();
... do stuff
if ($_GET['action'] == 'newsession') {
    session_name('newname');
    session_regenerate_id();
    session_write_close();
    session_start();
}
... more stuff

As you can see I am closing the old session before starting the new one, copying the old session data into the new session, and leaving the old session alone. This technique allows me to have multiple browser windows open on the same device with each window accessing the same application but using a different session, as described in Client Clones and Server Sessions.

Note that it was not necessary to close the current session before calling session_name('newname') simply because it had no impact the current session, only the next session just as it states in the manual.

It would appear that Yasuo did not understand this, but simply decided that it was not logical and should be prohibited. To quote his words to me:

Setting new session name for active session is totally nonsense to me.

This just proves that he doesn't understand that changing the value of session.name has absolutely no effect on the current session as it does not become effective until the next call to session_start(), just as it states in the manual.

Yasuo Ohgaki is arrogant

I pointed out to Yasuo that I had been monitoring the internals list where I had observed that there had been quite a lot of opposition to his proposals for changes to PHP's session management, and he didn't take kindly to people questioning his ideas. I even noted that three of his RFCs - User defined session serializer, Enable session.use_strict_mode by default and Precise Session Management - had been rejected. In his email of 3rd June @ 03:18 he stated:

I wrote RFC, but declined by those who don't understand session security.

I've told the same thing over and over. I'm a bit sick of those who don't/wouldn't realize what the true session security is.

In another email of 3rd June @ 11:59 he stated:

They simply (and/or)
- don't understand what could happen
- don't care other users' use cases
- don't care what experts say
- satisfied with poor security

People are better to watch what experts say.

In his email of 5th June @ 11:13 he stated:

I fed up with their reports, so I finally made session work as it should.

You mean you changed it to work as YOU thought it should. Which RFC covered those changes? THERE WASN'T ONE!!! You completely bypassed the discussion phase as you didn't like the idea of having to argue the merits of your proposed changes with lesser mortals.

In his email of 8th June @ 02:43 he said:

Only ignorants would oppose my patch

So there you have it straight from the horse's mouth - anyone who disagrees with him is ignorant. Does that match the definition of "arrogant" or not?

Yasuo Ohgaki is incompetent

Yasuo claimed that his fix was necessary to prevent problems (what he referred to as "misbehaviours"), so I asked him to provide details of such problems. He could not point to any reports in the bug database (except for bugs which he himself had created), but he did give me five examples of userland code which I repeat below:

Example #1 - Wrong write and/or crash>
<?php
session_start();
session_name('new_name');
session_commit(); // Save handler can write session data to wrong storage.

There is no such thing as "wrong" storage in this case as session_commit() does NOT reference $session_name when calling the write() method in the session handler, it uses $session_id. This code does not change $session_id so the session data is put back in the same place from where it was read.

Example #2 - Wrong name returned
<?php
session_start();
session_name('new_name');
// somewhere in the code
$my_current_session_name = session_name();  // Wrong session name. session_start() uses old one.
// This behavior can cause bug when user is distributing session storage accesses by session name.

There are a number of glaring mistakes here:

Example #3 - Wrong write and/or crash. Pattern 2.
<?php
session_start();
session_name('new_name');
// somewhere in the code
session_regenerate_id();  
// Save handler can write session data to wrong storage.
// In worst cases, PHP crashes
?>

Bad usage again. session_regenerate_id() will regenerate a new id for the CURRENT session, but this will be linked with the value of session.name which was available when the current session was started. The value 'new_name' will not take effect until the next call to session_start(), just as the manual says.

Example #4 - Wrong write and/or crash. Pattern 3.
<?php
session_start();
session_name('new_name');
session_write(); // Save handler can write session data to wrong storage. 
// In worst cases, PHP crashes
?>

This is the same as Example #1.

Example #5 - Wrong write and/or crash. Pattern 4. + Bogus session_name() call.
<?php
session_start();
session_name('new_name');
// Save handler can write session data to wrong storage at the end of script execution.
// In worst cases, PHP crashes
// In addition, user wouldn't notice bogus session_name() call if there is no error.
?>

This is a duplicate of #1 and #4. The session is started with a particular $session_id, and as that $session_id is never changed any updates to that session data will be written back using the same $session_id. A change in $session_name does not change the $session_id. The $session_name and $session_id are different entities which are accessed using different functions. They are only brought together when the cookie is accessed, which is either by session_start() or session_regenerate_id(). This code does nothing to cause 'new_name' to be written out as a cookie, therefore it simply disappears.

I communicated my responses to these five examples to Yasuo, and all he could do was reiterate his claim that his changes were necessary "to prevent side effects of misbehaviours".

In his email of 28th May @ 23:36 he stated:

Remember that session save handler can respect session name. Since nobody cannot expect how save handlers are implemented, session name change may results in crash in the worst case.

Where is this documented? There has never been any evidence that the value of session.name can be used within the session handler after a session has started. The documentation for the SessionHandler class clearly states that $session_name is passed as an argument on the call to the open() method but not to the read() and write() methods.

In his email of 31st May @ 12:06 he stated:

Session name can be used to specify where/how session data is stored. I know there are such usages. e.g. Distributing session storage, distinguishing session with the same session id, etc.

Excuse me? If you bother to read the documentation it is only the $save_path argument on the open() method which identifies where the session data will be stored. If you somehow manipulate the value of $save_path based on the value of $session_name you should also be aware that the value of $save_path cannot be altered after the call to open() has been completed, so you will need to close the current session and open a new one for any change to become effective. As for "distinguishing session with the same session id" you simply have to be joking! Nobody in their right minds would store different sets of session data with the same session_id in different places just because the session_name was different.

I reiterate my point that there is no such thing as "wrong" storage as session_commit() does NOT reference session_name() when writing the session data to the file system, only session_id(). When writing the session data to disk it uses the value of $save_path which was output from the call to the open() method. A change in session_name() does not change the value in $session_id or $save_path, so the session data is put back in the exactly same place from where it was read.

In his email of 1st June @ 12:16 he stated:

It's just your opinion. Nobody said respecting session name is invalid.

In fact, current bundled save handler design that ignores session name is semantically wrong. IMO.
"Got new session name, but the same session data" is awkward. New session name = New session data feels right/natural.

So there you have it. Yasuo firmly believes that his opinion is the only right one, and that anyone who disagrees is automatically wrong. He says that the session handler has been wrong for the past 20 years and that a change in session_name() should automatically generate a new session. I would like to see him run THAT idea past the people on the internals list.

In his email of 2nd June @ 00:03 he stated:

Using both session name and session id in save handlers is perfectly valid.

Then why isn't $session_name passed as an argument on the read() and write() methods? Because it is NOT passed it is reasonable to assume that it is NOT relevant.

In his email of 3nd June @ 11:30 he stated:

Bundled save handlers ignores session name (It should store to different storage, IMO)
Save handlers are free to use session name to achieve intuitive and useful behavior.

That opinion is not reflected in the design of the SessionHandler which has existed for decades. All reading and writing of session data is controlled ONLY by the value of $session_id and NEVER with the value of $session_name. I began to suspect his implementation of a session handler as he appeared to be attempting something which was not covered in the design and was not hinted at in any of the code samples in the manual. Then he explained that he was storing changes to both $session_id and $session_name in global variables, and in his handler he was ignoring the values passed as method arguments and using his global variables instead. I pointed out to him that if it really were necessary to access the values of $save_path and/or $session_name in the read() and write() methods then what a competent programmer would do is put code in the open() method to save these values to class properties so that they could be accessed in other methods using $this->save_path or $this->session_name, but he simply refuses to acknowledge the efficacy of this approach.

He is firmly convinced that the value of $session_name is important to the session handler even though the documentation clearly states that its only purpose is as a lookup in the cookie data to provide a value for $session_id, and once this value has been provided then all further reading and writing of the session data is done using this $session_id, which means that $session_name has become totally redundant. He is also convinced that changing the value of $session_name has an effect on the current session even though the documentation clearly states that a change in name does not take effect until the next call to session_start() and that using the session_name() function has NEVER resulted in any calls to the session handler. Unlike the session_regenerate_id() function which allows the current session to be switched to using a different value for $session_id there is no equivalent function for switching $session_name in the middle of a session, which is why it has to be done in userland code by calling the right combination of functions in the right sequence.

His lack of understanding of how PHP's session module was designed to work, coupled with his inability to follow basic programming practices leaves me no choice but to categorise him as incompetent.

Yasuo Ohgaki is a liar

In his email of 28th May @ 01:04 he stated:

I've made session changes because I was sick of session abuse bug reports that exploit un-managed session status. e.g. Session save handler crash bugs.

You must mean "sick of the bug reports which were generated because of bad changes which YOU made".

In his email of 28th May @ 23:33 he stated:

There are number of bugs that "Session save handler crashes". You can find these reports in our bug db.

I looked in the bug database for such reports, but all I could find were bug #72599 (trying to open a second session before closing the first), bug #73616 (incorrect code in custom session handler), bug #74548 (incorrectly assumed that a change is session name would automatically create a new session id), bug #75350 (changing session name after a session auto-start), and bug #75496 (caused by FastCGI micro caching with nginx). I could find no reports of any problems which concerned the correct usage of session_name(), so I asked him to identify those reports, but he could not.

I also noticed the following bug reports which were made AFTER Yasuo had made changes to the code - bug #67694, bug #68331, bug #69127, bug #69964, bug #70013, bug #70516, bug #70885, bug #71038, bug #71162, bug #71599, et cetera, ad nauseam, ad infinitum. I think this speaks volumes on Yasuo's skill (or lack thereof) as a programmer and his ability to make effective changes to the internal PHP code.

Yasuo eventually admitted that the "fault" he had supposedly fixed was never actually reported in the bug database, it was a bug he had introduced into his own session handler. I eventually squeezed out of him the mechanics of this bug, and it turned out that he was using the contents of $session_name to influence where the session data was actually stored, but instead of taking the value that was supplied on the call to open() he was taking it from a global variable on the assumption that the two values were identical. That is surely a bad assumption that only a rookie would make.

So, instead of fixing a genuine bug which was reported in the bug database, a bug in session_name() which caused a problem in a custom session handler, he instead changed the implementation of session_name() to get around a problem in his own badly written session handler. Instead of fixing his own badly written code he introduced an unannounced BC break into the entire PHP community.

Storm in a Teacup?

Some of you out there might be thinking that if this problem can be fixed simply by swapping round two lines of code in my application then I shouldn't be making such a fuss, that I'm making a mountain out of a mole hill, that it's just a storm in a teacup, but you would be completely missing a very serious issue. If @yohgaki had followed the correct procedures and issued a deprecation notice in one PHP version and then altered the behaviour in the next then I would have had proper warning of the impending change. That would have given me time to change my code and allowed me to upgrade without a hitch.

But that did not happen. There was no deprecation notice nor any other sort of prior warning. There was no RFC, there was no discussion on the internals list, no particular BC breaks were identified nor justified, and there was no vote. The broken code was snuck in under the radar in total secrecy and took everybody by surprise. There was no entry in the backward incompatible changes for upgrading to 7.2, there was no mention in the changelog which was contained in the zip download, and there was no change in the manual. I did notice that after I raised bug #76358 the changelog in the online manual was altered to contain an entry for session_name() which states:

session_name checks session status, previously it only checked cookie status. Therefore, older session_name allows to call session_name after session_start which may crash PHP and may result in misbehaviors.

You will notice that is does not identify that any other session functions were included in the 7.2 upgrade although many were listed in the upgrade notes in Github.

I noticed on June 5th that the documentation for session_name() had been altered to include the following statement:

When new session name is supplied, session_name() modifies HTTP cookie (and output contents when session.transid is enabled). Once HTTP cookie is sent, session_name() raises error.

This behaviour never existed prior to 7.2, so I tested this with my debugger and found that it also did not exist in 7.2, so I raised bug #76413. You will notice that Yasuo is still refusing to admit that he has made a mistake and that his knowledge of the session module is not as good as he thinks it is.

This whole episode raises serious doubts about the future of PHP if someone is allowed to make a change to the language, especially a change which causes a BC break, without following the correct procedures, namely propose, discuss, then vote. If anyone can make an arbitrary change to the language without proper peer review just because they think it should work in a different way then this could lead to more and more BC breaks which could then make the language unreliable and unusable for the army of application developers who have made the language as popular as it now is. I urge the leaders of the internals group to review this serious breach of procedure and takes the necessary steps to ensure that it never happens again.


References

Here are my views on changes to the PHP language and Backwards Compatibility:


counter