Getters and Setters are EVIL

Posted on 2nd December 2023 by Tony Marston

Introduction
Using arrays to pass data around
Data validation made easy
Reading from multiple tables
Writing to multiple tables
References

Introduction

I write nothing but enterprise applications. For the first 20 years of my career I used compiled languages which used bit-mapped displays, but for the last 20 years I have been developing nothing but web-based applications using PHP, which is an interpreted language. PHP is the first (and only) programming language I have used which has object oriented capabilities, and as this has often been advertised as the greatest thing since sliced bread I wanted to know what it meant. The best description I found went as follows:

Object Oriented Programming is programming which is oriented around objects, thus taking advantage of Encapsulation, Inheritance and Polymorphism to increase code reuse and decrease code maintenance.

When I was learning PHP I had 3 sources of information - the PHP manual, books and online tutorials. I learned about encapsulation and inheritance, but saw no examples of polymorphism. I loaded some of the sample code onto my home PC and stepped through it with my debugger which was built into the IDE which I chose to use instead of a plain vanilla text editor. I did this so that I could examine in great detail how each line of code worked. As I became more and more familiar with PHP I strived to utilise its OO features in order to create as much reusable code as was humanly possible. I implemented these features in the following ways:

Encapsulation - I created a class for each table in the database. This is because a database application does not handle objects in the real world, it only handles the data about those objects, and that data is held in a database in things called "tables".
Inheritance - Regardless of what operations may be used on objects in the real world, in a database the only operations that can be performed on tables are Create, Read, Update and Delete (CRUD). Because these operations are common to ALL database tables it made sense, to me at least, to define them once in an abstract class so that they could be easily shared by every concrete table class.
Polymorphism - The ability to call a method on a dependent object without knowing the identity of that object, thus making it possible to swap between one object and another. This means that when a Controller calls one or more methods on a Model (concrete table class) it can work with ANY Model, so if I have 40 Controllers and 450 Models it provides 40 x 450 = 18,000 (yes, EIGHTEEN THOUSAND) opportunities for polymorphism. These opportunities can be realised through that mechanism known as Dependency Injection.
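
The three points above can be sketched roughly as follows. This is a minimal illustration and not RADICORE code: the class names AbstractTable, Person and Product, the in-memory $rows store and the addController() function are all hypothetical stand-ins chosen for this example.

```php
<?php
// Hypothetical sketch: one abstract class supplies the shared methods,
// each concrete table class only supplies what is unique to that table.
abstract class AbstractTable
{
    protected string $tablename = '';
    protected array $rows = [];     // stands in for the database in this sketch
    public array $errors = [];

    public function insertRecord(array $fieldarray): array
    {
        $this->rows[] = $fieldarray;
        return $fieldarray;
    }

    public function getData(): array
    {
        return $this->rows;
    }

    public function getTableName(): string
    {
        return $this->tablename;
    }
}

class Person extends AbstractTable
{
    protected string $tablename = 'person';
}

class Product extends AbstractTable
{
    protected string $tablename = 'product';
}

// A "Controller" that never mentions a table name: the Model is injected.
function addController(AbstractTable $model, array $input): array
{
    $model->insertRecord($input);
    return $model->getData();
}

$personRows  = addController(new Person(),  ['first_name' => 'Fred']);
$productRows = addController(new Product(), ['product_id' => 'P001']);
```

The same addController() function works with either Model because both inherit the same method signatures, which is where the polymorphism comes from.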

While I saw a few samples of "advice" on how things could be done I never accepted this as an instruction on how things should be done. How a programmer solves a problem is down to the skills of that programmer, and I refuse to be limited by the lesser skills of others. I played with some of the sample code which I found and did a bit of experimentation using ideas of my own to see what worked best for me and the type of application which I was writing. When I later became aware of things called "best practices" I could see that the results which they produced were inferior to mine, so I chose to ignore them. They appeared to be nothing more than personal preferences from people with limited experience rather than universal rules compiled by genuine experts, so I dismissed any idea which stood in the way of my aim to create as much reusable code as possible. One of these so-called "best practices" is the subject of this article.

Using arrays to pass data around

I noticed that PHP's handling of data arrays was far superior to that which was available in my previous languages. It meant that I could pass around collections of data whose contents were completely flexible and not tied to a particular pre-defined record structure. The data passed into objects from both the Presentation layer (via the $_POST array) and the Data Access layer (via the result of an SQL SELECT query) appears as an array, and this can contain a value for any number of fields/columns. The foreach construct in PHP makes it easy to step through an array and identify what values it contains for what fields.

However, in all of the OOP samples I saw in books or within internet tutorials I noticed that the same convention was followed:

Each table column was defined as a separate class property.
Each table column required its own 'setter' to put a value into the object.
Each table column required its own 'getter' to get a value out of the object.

When I saw this I asked myself some simple questions: If the data outside of an object exists in an array, why is the array split into its component parts before they are passed to the object one component at a time? Can I access the data from an array inside the object, or am I forced to use a separate class variable for each field/column? The answer turns out to be a choice between:

$this->column            // each column has its own class property
and
$fieldarray['column']    // all columns are held in a single class property

Guess what? To PHP there is no discernible difference as either option is possible. The only difference is in how much code the developer has to write to put that data in and to get that data out. I then asked myself another question: Under what circumstances would a separate class property for each piece of data, forcing each to have its own setter (mutator) and getter (accessor), be the preferable choice? The answer is as follows:

When the number of data items and their identities is fixed and never varies.
When each data item is written to and/or read from in isolation.
When the data is processed continuously in real time.
When the response to a change in an item's value may or may not result in a visible response.

This scenario would fit something like an aircraft control system which relies on discrete pieces of data which are supplied by numerous sensors all over the aircraft. When changes in the data are processed the system may alter the aircraft's configuration or it may update the pilot's display in the cockpit.

This scenario does NOT fit a web-based database application for the following reasons:

It deals with any number of tables each with any number of columns in any number of rows.
- It is not necessary to provide values for all columns on an INSERT operation as some may be optional (nullable) or have default values.
- It is not necessary to provide values for all columns on an UPDATE operation, only those columns which you wish to change.
- It is not necessary to provide values for all columns on a DELETE operation, only those values in the primary key.
- The output from a SELECT operation may be any number of columns from any number of rows. Some columns from the table may be missing, some columns may be read from other tables and included because of a JOIN, while other values may be generated from an expression.
It does not input or output data one piece at a time, it deals with data sets which may contain values for any number of columns for any number of rows for any number of tables.
- Data from the Presentation layer is given to the object in the form of the $_POST array which only appears when the user activates a SUBMIT operation. This can contain any number of fields for any number of rows for any number of tables. Data does not appear one field at a time.
- Data from the Data Access layer is given to the object in the form of the mysqli_fetch_assoc array which can contain any number of columns and, with the aid of an SQL JOIN, any number of tables. Note that a single SELECT query can also return data for any number of rows, which includes zero rows.
Note that in both of these associative arrays the keys are the column names and the values are all strings.
The application data is not processed in real time, instead it operates on a request/response cycle where:
- The request, which is known as a user transaction, is instigated manually by a user or possibly automatically according to a timed schedule.
- The response may be a document (such as HTML, CSV, PDF, XML or JSON) which can either be sent to the client device from which the request was issued, or it may be written to disk or sent to another computer. It is also possible for there to be no visible response as the transaction does nothing but update the database.
The only operations which can be performed on a database table are Create, Read, Update and Delete (CRUD). While the Read operation may access any number of tables the Create, Update and Delete operations can only access one table at a time.
Each user transaction can perform one or more of the CRUD operations on one or more tables.

Having built enterprise applications which have hundreds of database tables and thousands of user transactions I realised straight away that having separate class properties for each table column, each with its own setter and getter, would be entirely the wrong approach as it produces tight coupling which in turn greatly restricts the opportunity for reusable software. As the aim of OOP is supposed to be to increase the amount of reusable software I decided that any practice which did not support this aim was something to be avoided.

Consider the following sample code which is required when using a separate property for each table's column:

<?php
require_once 'classes/person.class.inc';
$dbobject = new Person(); 
$dbobject->setUserID    ( $_POST['userID']   ); 
$dbobject->setEmail     ( $_POST['email']    ); 
$dbobject->setFirstname ( $_POST['firstname']); 
$dbobject->setLastname  ( $_POST['lastname'] ); 
$dbobject->setAddress1  ( $_POST['address1'] ); 
$dbobject->setAddress2  ( $_POST['address2'] ); 
$dbobject->setCity      ( $_POST['city']     ); 
$dbobject->setProvince  ( $_POST['province'] ); 
$dbobject->setCountry   ( $_POST['country']  ); 

if ($dbobject->insertPerson($db) !== true) { 
    // do error handling 
} 
?>

This suffers from the following deficiencies:

The name of the database table is hard-coded, which means that this code is tightly coupled to that table class.
The name of the object method includes the table name, which means that this code is tightly coupled to that table class.
The names of the columns are hard-coded, which means that any change to the number of columns would require a change to the number of getters and setters in both the object and the code which calls that object. This is a perfect example of the ripple effect which is a product of tight coupling.

Contrast this with the following code which can be used when the data array is not split into its component parts:

<?php 
require_once "classes/{$table_id}.class.inc";  // $table_id is provided by the previous script (double quotes so that it is interpolated)
$dbobject = new $table_id;
$result = $dbobject->insertRecord($_POST);
if ($dbobject->errors) {
    // do error handling 
}
?>

This is loosely coupled and offers the following advantages:

No table name, table-specific method name or column name is mentioned anywhere in that code, so it can be used with any table in the application.
The number of columns in the array can be modified at any time, which eliminates any ripple effect caused by having to change method signatures and the places which call those signatures.
Note that standard code within the table class will automatically validate that the data array contains values for the correct number of columns, and that each column's value matches its data type.
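
Because the class name arrives in a variable, a reader may want to check it before instantiating it. The following is a minimal sketch of one way to do that, assuming an explicit allow-list; the createTableObject() helper and the Customer and Invoice classes are hypothetical and are defined inline only so that the example is self-contained.

```php
<?php
// Hypothetical table classes, defined inline so the sketch is self-contained.
class Customer { public string $tablename = 'customer'; }
class Invoice  { public string $tablename = 'invoice'; }

// Assumed helper: only names on an explicit allow-list may be instantiated.
function createTableObject(string $table_id, array $allowed): object
{
    if (!in_array($table_id, $allowed, true) || !class_exists($table_id)) {
        throw new InvalidArgumentException("Unknown table class: $table_id");
    }
    return new $table_id();   // the class name is resolved at run time
}

$dbobject = createTableObject('Invoice', ['Customer', 'Invoice']);
```

The calling code still contains no hard-coded table name; the allow-list is simply one possible safeguard and is not something described in this article.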

This means that I can use the following methods to handle the communication between a Controller and its Model:

Common Table Methods
Methods called externally	Methods called internally	UML diagram
$object->insertRecord($_POST)	$fieldarray = $this->pre_insertRecord($fieldarray); if (empty($this->errors)) { $fieldarray = $this->validateInsert($fieldarray); } if (empty($this->errors)) { $fieldarray = $this->commonValidation($fieldarray); } if (empty($this->errors)) { $fieldarray = $this->dml_insertRecord($fieldarray); $fieldarray = $this->post_insertRecord($fieldarray); }	ADD1 Pattern
$object->updateRecord($_POST)	$fieldarray = $this->pre_updateRecord($fieldarray); if (empty($this->errors)) { $fieldarray = $this->validateUpdate($fieldarray); } if (empty($this->errors)) { $fieldarray = $this->commonValidation($fieldarray); } if (empty($this->errors)) { $fieldarray = $this->dml_updateRecord($fieldarray); $fieldarray = $this->post_updateRecord($fieldarray); }	UPDATE1 Pattern
$object->deleteRecord($_POST)	$fieldarray = $this->pre_deleteRecord($fieldarray); if (empty($this->errors)) { $fieldarray = $this->validateDelete($fieldarray); } if (empty($this->errors)) { $fieldarray = $this->dml_deleteRecord($fieldarray); $fieldarray = $this->post_deleteRecord($fieldarray); }	DELETE1 Pattern
$object->getData($where)	$where = $this->pre_getData($where); $fieldarray = $this->dml_getData($where); $fieldarray = $this->post_getData($fieldarray);	ENQUIRE1 Pattern

Please note the following:

The column labelled Methods called externally shows the method calls that appear in a Controller.
The column labelled Methods called internally shows what happens inside a Model when this method is called.
The column labelled UML diagram shows a detailed UML diagram which includes the interactions between the invariant methods in the abstract class and the variable "hook" methods which may be overridden in each concrete subclass.
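
The insert sequence in the table above can be sketched as a fixed method in an abstract class which calls a series of overridable steps. This is a simplified illustration, not the RADICORE source: the DefaultTable and Booking classes, the $stored property and the decision to treat every step as an overridable method are assumptions made for this example, and the commonValidation() step is left out for brevity.

```php
<?php
// Hypothetical sketch of the insert sequence; not the RADICORE source.
abstract class DefaultTable
{
    public array $errors = [];
    public array $stored = [];   // stands in for the database table

    // Invariant method: the sequence of steps never changes.
    public function insertRecord(array $fieldarray): array
    {
        $this->errors = [];
        $fieldarray = $this->pre_insertRecord($fieldarray);
        if (empty($this->errors)) {
            $fieldarray = $this->validateInsert($fieldarray);
        }
        if (empty($this->errors)) {
            $fieldarray = $this->dml_insertRecord($fieldarray);
            $fieldarray = $this->post_insertRecord($fieldarray);
        }
        return $fieldarray;
    }

    // Variable steps: they do nothing here and are overridden only where needed.
    protected function pre_insertRecord(array $fieldarray): array { return $fieldarray; }
    protected function validateInsert(array $fieldarray): array { return $fieldarray; }
    protected function post_insertRecord(array $fieldarray): array { return $fieldarray; }

    protected function dml_insertRecord(array $fieldarray): array
    {
        $this->stored[] = $fieldarray;
        return $fieldarray;
    }
}

class Booking extends DefaultTable
{
    // Only the step which differs for this table is overridden.
    protected function validateInsert(array $fieldarray): array
    {
        if (empty($fieldarray['customer_id'])) {
            $this->errors['customer_id'] = 'Customer is required';
        }
        return $fieldarray;
    }
}

$good = new Booking();
$good->insertRecord(['customer_id' => '42']);

$bad = new Booking();
$bad->insertRecord(['customer_id' => '']);   // fails validation, nothing is stored
```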

Because these methods are common to every table class it would be foolish to duplicate them, so following my own interpretation of "best practices" I decided to move them to an abstract class so that they could be shared using that simple mechanism called inheritance. Because the same methods produce different results depending on what object they are called on this satisfies the definition of polymorphism which is "same interface, different implementation". This means, for example, that when a Controller calls a series of operations on a Model it is not tightly coupled to a particular Model as it can use any Model which it is given. This is implemented using a technique known as Dependency Injection - Injecting the Model into the Controller.

Extracting the data from an object, such as when transferring it to the View object, does not require a collection of getters as it can be done with one simple command:

$fieldarray = $dbobject->getFieldArray();

This array can contain any number of columns from any number of rows and from any number of tables, which means that it does not require different variations in the code to deal with different combinations. PHP's foreach construct provides the ability to iterate through an array and identify both column names and their values.
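
As a rough illustration of that iteration, the following sketch assumes that the returned array is an indexed list of rows, each of which is an associative array keyed by column name. The sample data is invented for this example.

```php
<?php
// Hypothetical data: what a call such as getFieldArray() might return,
// an indexed array of rows, each row an associative array of columns.
$fieldarray = [
    ['order_id' => '1001', 'customer_name' => 'Smith', 'order_total' => '25.00'],
    ['order_id' => '1002', 'customer_name' => 'Jones', 'order_total' => '99.50'],
];

$lines = [];
foreach ($fieldarray as $rownum => $rowdata) {
    foreach ($rowdata as $column => $value) {
        // the column names are discovered at run time, not hard-coded
        $lines[] = "row $rownum: $column = $value";
    }
}

echo implode(PHP_EOL, $lines), PHP_EOL;
```

The loop does not change if a column is added to or removed from the query which produced the data.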

Another reason which caused me to reject the idea of having a separate class property for each column, each with its own setter and getter, is that it restricts each object to only being able to deal with columns on that particular table.

Long experience has shown me that some input screens contain data that needs to be split across several tables, so how can you utilise setters on a Model for those columns which do not exist as properties within that Model? With my method you can put whatever data you like in the input array, and the operation will only fail if the data required by that Model fails its validation checks. Any data which does not belong on that table will be ignored, but can be handled in the relevant "hook" method, such as _cm_post_insertRecord() or _cm_post_updateRecord().
Long experience has also shown me that sometimes an output screen requires data that comes from several tables, not just one, so how can you utilise getters on a Model for those columns which do not exist as properties within that Model? With my method you can build a SELECT query which will retrieve as little or as much data as you like, which includes the results of expressions as well as JOINs. All the data from an object can be obtained with a single call which does not need to identify any column names.
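
One way to picture how a single input array can serve several tables is to filter it against each table's list of columns. The sketch below is an assumption about how such filtering could be done, not a description of the framework's internals; the filterToTableColumns() helper and both column lists are hypothetical.

```php
<?php
// Hypothetical helper: keep only the entries whose keys are columns of this table.
function filterToTableColumns(array $fieldarray, array $fieldspec): array
{
    return array_intersect_key($fieldarray, $fieldspec);
}

// Assumed column lists for two tables (the keys are the column names).
$personSpec  = ['person_id' => [], 'first_name' => [], 'last_name' => []];
$addressSpec = ['person_id' => [], 'city' => [], 'country' => []];

// One input array, as it might arrive from a single screen.
$input = [
    'person_id'  => 'FB',
    'first_name' => 'Fred',
    'last_name'  => 'Bloggs',
    'city'       => 'London',
    'country'    => 'UK',
    'submit'     => 'Save',      // not a column on either table
];

$personData  = filterToTableColumns($input, $personSpec);
$addressData = filterToTableColumns($input, $addressSpec);
```

Each table receives the whole array and simply takes what belongs to it, so the calling code never has to name an individual column.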

Data validation made easy

Anybody who has ever written a database application should know never to trust input submitted by a user. Each item of data in a database has a particular data type, and if your code tries to insert an incompatible value into a column, such as inserting the string "four" into a column expecting a number, or "30th February" into a column expecting a date, it will cause the query to abort. After receiving input from a user it is essential that you validate it before attempting to store it in the database.

The first question to ask is "Where should this validation be performed?" The answer is "In the Model", as shown by the calls to the validateInsert() and validateUpdate() methods in common table methods above. There are those who think that this validation should be called outside of the Model as it is wrong to insert unvalidated data into the Model, but it is they who are wrong! Data validation is part of the business logic, and business logic belongs nowhere but in the Business/Domain layer. The principle of encapsulation is defined as "the act of placing data and the operations that perform on that data in the same class". As validation is one of those operations it follows that each entity (table) in the business layer should have its own class, and that class should therefore contain all the necessary business logic for that entity, and this logic includes all validation.

I have also been told by some OO "experts" that I should be using setters as that is the correct place to validate a column's value, but this is wrong for two reasons:

I don't use setters.

That only works when I can validate a single column without reference to another column as that other column may not have been set yet. With the RADICORE framework I can compare the values in several columns very easily with code similar to the following:

function _cm_commonValidation ($fieldarray, $originaldata)
// perform validation that is common to INSERT and UPDATE.
{
    if ($fieldarray['start_date'] > $fieldarray['end_date']) {
        // 'Start Date cannot be later than End Date'
        $this->errors['start_date'] = getLanguageText('e0001');
        // 'End Date cannot be earlier than Start Date'
        $this->errors['end_date']   = getLanguageText('e0002');
    } // if
    
    return $fieldarray;
}

The second question to ask is "How should this validation be performed?" The answer is "With as much reusable code as possible". In all the code samples which I came across I noticed that everybody was writing code manually to validate each column, but I had already worked out a method of calling a standard routine to perform this validation automatically. This is because the range of possible data types for a column in a database table comes from a fixed list, and the validation for each data type can therefore be fixed in the code. This means that it should be possible to write code along the lines of "If the data type for column X is Y, then validate X's value according to the rules for Y".

As the database schema already contains the specifications for each column that exists in every table, and that includes its name, size and data type, it is a straightforward process to extract that information and make it available in each table's class file. This is why I created my Data Dictionary which has one process to import a table's specifications from the INFORMATION_SCHEMA in the application database into an intermediate database, and a second process to export that information to a table structure file so that it can be used to populate the common table properties in that table's object. This then enabled me to create a standard validation class which has two arguments - $fieldarray containing column data and $fieldspec containing each column's specifications. It then becomes a straightforward process to compare each column's value with its specifications.
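
The idea of validating each value according to its column's specification can be sketched as below. This is an illustration only and not the framework's validation class: the validateFields() function, the layout of $fieldspec (the keys 'type', 'size' and 'required') and the error messages are all assumptions made for this example.

```php
<?php
// Hypothetical sketch of data-type-driven validation; the layout of
// $fieldspec (keys 'type', 'size', 'required') is an assumption.
function validateFields(array $fieldarray, array $fieldspec): array
{
    $errors = [];
    foreach ($fieldspec as $column => $spec) {
        $value = $fieldarray[$column] ?? '';
        if ($value === '') {
            if (!empty($spec['required'])) {
                $errors[$column] = 'A value is required';
            }
            continue;   // nothing more to check on an empty value
        }
        switch ($spec['type']) {
            case 'integer':
                if (filter_var($value, FILTER_VALIDATE_INT) === false) {
                    $errors[$column] = 'Value is not an integer';
                }
                break;
            case 'date':
                $date = DateTime::createFromFormat('!Y-m-d', $value);
                // reject impossible dates such as 2023-02-30, which would roll over
                if ($date === false || $date->format('Y-m-d') !== $value) {
                    $errors[$column] = 'Value is not a valid date';
                }
                break;
            case 'string':
                if (isset($spec['size']) && strlen($value) > $spec['size']) {
                    $errors[$column] = 'Value is too long';
                }
                break;
        }
    }
    return $errors;
}

$fieldspec = [
    'quantity'   => ['type' => 'integer', 'required' => true],
    'start_date' => ['type' => 'date'],
    'notes'      => ['type' => 'string', 'size' => 10],
];

$errors = validateFields(
    ['quantity' => 'four', 'start_date' => '2023-02-30', 'notes' => 'ok'],
    $fieldspec
);
```

No column name appears inside validateFields(), so the same routine can be reused for any table whose specifications are supplied in this form.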

Note that secondary validation can be performed in any of the following "hook" methods:

_cm_validateInsert() - for an INSERT only
_cm_validateUpdate() - for an UPDATE only
_cm_commonValidation() - common to both INSERT and UPDATE

Reading from multiple tables

I have been informed that in "proper" OO programming each table class is only supposed to retrieve those columns which actually belong in that table. This rule is enforced by each table class having a separate getter for each column which belongs to that table. This also makes it impossible to retrieve data from multiple rows as each call to a getter can only obtain a single value for a single row.

This to me is an artificial restriction which does not support the way that databases work. They deal in data sets which can contain data for any number of columns from any number of tables from any number of rows. This cannot be duplicated with a fixed set of class properties each with their own getters and setters, but it can be duplicated with a single $fieldarray property.

With the RADICORE framework it is possible for an HTML screen to contain data from more than one table in the following ways:

By using a Transaction Pattern whose screen is comprised of more than one zone, such as LIST 2. Each zone is assigned to a different Model and is subject to its own getData() operation.
By causing the SELECT query to perform an SQL JOIN to one or more additional tables. This can be done in one of the following ways:
- Allowing the framework to do it automatically using the procedure described in Using Parent Relations to construct sql JOINs
- Building the query manually by inserting the relevant code into the _cm_pre_getData() method

By accessing a different table within a custom method, such as in the following:

function _cm_getForeignData ($fieldarray, $rownum='')
// Retrieve data from foreign (parent) tables.
// $rownum identifies current row number.
{
    if (!empty($fieldarray['prod_cat_id']) and empty($fieldarray['prod_cat_desc'])) {
        // get description for selected entry
        $dbobject = RDCsingleton::getInstance('product_category');
        $dbobject->sql_select = 'prod_cat_desc';
        $foreign_data = $dbobject->getData("prod_cat_id='{$fieldarray['prod_cat_id']}'");
        // merge with existing data
        $fieldarray = array_merge($fieldarray, $foreign_data[0]);
    } // if

    return $fieldarray;

} // _cm_getForeignData

Note here that I do not use any individual getters to obtain data from the PRODUCT_CATEGORY table, I take whatever data was obtained, in the form of an array called $foreign_data, and merge it with the current contents of $fieldarray. The View object can then extract all this data with a single call to the getFieldArray() method.
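
The merge step on its own can be shown with a few invented values. The sketch below also adds a guard for the case where the lookup returns zero rows, which is an addition of this example and not part of the method shown above.

```php
<?php
// Illustrative values only.
$fieldarray   = ['product_id' => 'P001', 'prod_cat_id' => 'TOYS'];
$foreign_data = [['prod_cat_desc' => 'Toys and Games']];   // one row from the other table

if (!empty($foreign_data)) {
    // guard against the lookup returning zero rows
    $fieldarray = array_merge($fieldarray, $foreign_data[0]);
}
```

With string keys array_merge() appends any new keys and overwrites existing ones, so the description is added without disturbing the columns which were already present.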

Writing to multiple tables

It is quite possible that data from a single HTML screen needs to be spread across more than one database table, so how can this be achieved? The RADICORE framework provides the following ways:

By using a Transaction Pattern whose screen is comprised of more than one zone, such as ADD 6 and ADD 7. Each zone is assigned to a different Model and is subject to its own insertRecord() operation.
By accessing a different table within a custom method, such as in the following:
```
function _cm_post_insertRecord ($fieldarray)
// perform custom processing after database record has been inserted.
{
    $dbobject = RDCsingleton::getInstance('other_table');
    $other_data = $dbobject->insertRecord($fieldarray);
    if ($dbobject->errors) {
        $this->errors = array_merge($this->errors, $dbobject->getErrors());
    } // if

    return $fieldarray;
		
} // _cm_post_insertRecord
```
Notice here that I do not have to bother with individual setters as all the data is present in the associative array called $fieldarray. It does not matter if this array contains values for columns which do not exist in 'other_table' as they will be ignored when it comes to updating the database. The business logic within this other object will take care of all data validation and insert any error messages into the $errors array if anything goes pear-shaped. If this array is not empty the Controller will automatically perform a rollback() instead of a commit() and all error messages will be loaded into the screen.
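
The decision between commit and rollback can be pictured with the following sketch. None of it is RADICORE code: the FakeTransaction and FakeModel classes and the runInsert() function are hypothetical stand-ins which exist only to show that the Controller needs to inspect nothing but the errors array.

```php
<?php
// Hypothetical stand-ins: neither class is part of RADICORE.
class FakeTransaction
{
    public string $outcome = 'pending';
    public function commit(): void   { $this->outcome = 'committed'; }
    public function rollback(): void { $this->outcome = 'rolled back'; }
}

class FakeModel
{
    public array $errors = [];

    public function insertRecord(array $fieldarray): array
    {
        if (empty($fieldarray['email'])) {
            $this->errors['email'] = 'Email is required';
        }
        return $fieldarray;
    }
}

// The Controller only looks at the errors array, never at individual columns.
function runInsert(FakeModel $model, FakeTransaction $txn, array $input): array
{
    $model->insertRecord($input);
    if (empty($model->errors)) {
        $txn->commit();
    } else {
        $txn->rollback();
    }
    return $model->errors;   // these would be shown on the screen
}

$txn1 = new FakeTransaction();
$errors1 = runInsert(new FakeModel(), $txn1, ['email' => 'fred@example.com']);

$txn2 = new FakeTransaction();
$errors2 = runInsert(new FakeModel(), $txn2, ['email' => '']);
```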

Here endeth the lesson. Don't applaud, just throw money.

References

PHP, RFCs and Changing fundamental language behaviors from owensoft.net
Why getter and setter methods are evil by Allen Holub
More on getters and setters by Allen Holub
Getters/Setters. Evil. Period by Yegor Bugayenko
Avoid getters and setters whenever possible by Scott Shipp
Why getters and setters are terrible by Eric Normand
