Object-Oriented Programming for Heretics

By Tony Marston

10th December 2004
Amended 23rd August 2013

Introduction
Definitions
My Approach
- Understand what "abstraction" really means
- Each database table requires its own class
- Don't waste time with Mappers
- Don't waste time with Object Oriented Design
- Three levels of separation is enough
- Methods are centered around database access
- Don't use getters and setters for user data
- Two levels of class hierarchy is enough
- Interfaces are not necessary
- Use design patterns sparingly
- Multiple inheritance is not necessary
Results of my approach
Criticisms by the 'Paradigm Police'
- Your design is centered around data instead of functions
- You have not achieved the correct separation of concerns
- An object can only deal with a single database row
- Your classes are too big
- Your class methods are too visible
- You have the wrong levels of coupling/cohesion/dependency
- Your design is dependent on SQL
- You don't understand what '3 tier' means
- You have the same data names in every layer
- Your database schema is known to your domain layer
- Your approach is too simple
Conclusion
- Basic misunderstandings
- Reverse Imperative Principle
- Other peculiar ideas
References
Amendment History

 

Introduction

Visitors to my website may be aware that my efforts in building A Development Infrastructure for PHP have been attacked most vociferously by a group who do not like what I have done simply because it is different to the way in which they would have done it. Because it is 'different' it is automatically branded as 'impure'. The following articles document some of their criticisms along with my responses:

These people are like religious zealots who think that their way is 'the only way, the one true way', and that anybody who dares to think differently is an unbeliever, a heretic, and should be burned at the stake. They act like a modern equivalent of the Spanish Inquisition. I call these people the 'paradigm police'.

I do not care for their brand of religion. I will not conform. I will not apologise for being different.

Definitions

The problem with OOP is that there is no clear definition of what it is and what it is not. Since its first appearance numerous people have expanded on its original definition (whatever that was) and invented new 'rules' which in turn have been subject to different interpretations. For each interpretation there are also many different possible implementations. There is no single definition, or set of definitions, which is universally accepted. Therefore, no matter what you do, somebody somewhere will always find a reason to complain that 'you are wrong!'

Here is a list of basic definitions that I referenced while creating my infrastructure:

Object Oriented Programming: Writing programs which are oriented around objects. Such programs can take advantage of Encapsulation, Polymorphism, and Inheritance to increase code reuse and decrease code maintenance.
Object: An instance of a class. A class must be instantiated into an object before it can be used in the software. More than one instance of the same class can be in existence at any one time.
Class: A class is a blueprint, or prototype, that defines the variables and the methods common to all objects (entities) of a certain kind.
Encapsulation: The act of placing an entity's data and the operations that are performed on that data in the same class. The class then becomes the 'capsule' or container for the data and operations.

Note that data may include meta-data (type, size, etc) as well as entity data.

Inheritance: The reuse of base classes (superclasses) to form derived classes (subclasses). Methods and properties defined in the superclass are automatically shared by any subclass.
Polymorphism: Same interface, different implementation. The ability to substitute one class for another. This means that different classes may contain the same method names, but the result which is returned by a particular method will be different as the code behind that method (the implementation) is different in each class.
Cohesion: Describes the contents of a module. The degree to which the responsibilities of a single module/component form a meaningful unit. The degree of interaction within a module. Higher cohesion is better. Modules with high cohesion are preferable because high cohesion is associated with desirable traits such as robustness, reliability, reusability, extendability, and understandability whereas low cohesion is associated with undesirable traits such as being difficult to maintain, difficult to test, difficult to reuse, difficult to extend, and even difficult to understand.

Cohesion is usually contrasted with coupling. High cohesion often correlates with low coupling, and vice versa.

As shown in figure 7 I have split my application into several component types, each of which performs a separate function:

  • Controllers which accept input from the user, instruct the Models to perform actions, and update the Views to show the results of those actions.
  • Models contain the data validation and business rules for each business entity. There is a separate Model for each table in the database.
  • Views show the results of actions which were performed on Models.
  • Data Access Objects construct and issue the SQL queries for a particular DBMS, with a separate class for each DBMS (MySQL, PostgreSQL, Oracle, SQL Server).

Coupling: Describes how modules interact. The degree of mutual interdependence between modules/components. The degree of interaction between two modules. Lower coupling is better. Low coupling tends to create more reusable methods. It is not possible to write completely decoupled methods, otherwise the program will not work! Tightly coupled systems tend to exhibit the following developmental characteristics, which are often seen as disadvantages:
  • A change in one module usually forces a ripple effect of changes in other modules.
  • Assembly of modules might require more effort and/or time due to the increased inter-module dependency.
  • A particular module might be harder to reuse and/or test because dependent modules must be included.

Coupling is usually contrasted with cohesion. Low coupling often correlates with high cohesion, and vice versa.

As shown in figure 7 I have split my application into several layers and several component types. The only place where table names and column names are mentioned in any code is within the Model, which eliminates a great deal of tight coupling:

This means that if I change a table's structure or validation rules I only have to change a single Model class, and none of the other classes will be affected. I do not have to change any Controllers, Views or Data Access Objects. This high level of reusability is a clear sign that I have achieved low coupling.

Dependency: The degree to which one component relies on another to perform its responsibilities. High dependency limits code reuse and makes moving components to new projects difficult. Lower dependency is better.

You can only say that "module A is dependent on module B" when there is a subroutine call from A to B. In this situation it would be wrong to say that "module B is dependent on module A" because there is no call in the reverse direction. If module B calls module C then B is dependent on C, but it would be wrong to say that A is dependent on C as A does not call C. Module A does not even know that module C exists.

This term is often interchangeable with coupling.

Visibility: The ability to 'see' parts of an object from outside. Any method or property marked as 'public' is visible, whereas any method or property marked as 'private/protected' is not visible to the outside world and is therefore 'hidden'. Methods and properties which should not be directly accessed from outside should be hidden. Lower visibility is better.

My Approach

Fortunately I was not trained in OOP by any of these religious zealots. I trained myself using a combination of common sense, logic, and 25+ years of programming with a mixture of 2nd, 3rd and 4th generation languages. I have successfully built my own development infrastructures in COBOL and UNIFACE which enabled my team members to achieve high rates of productivity, so I saw no reason why I could not repeat this success with PHP.

Understand what "abstraction" really means

One of the most important techniques in software engineering is a concept called "abstraction", but a lot of people seem to start off on the wrong foot by performing the wrong type of abstraction. This state of affairs is made possible by the simple fact that there are two meanings for the term "abstract":

  1. A statement summarizing the important points of a text. To reduce to the essential details. Summary, synopsis, précis, résumé, outline, abridgment, condensation, digest.
  2. Thought of or stated without reference to a specific instance. An ideal or theoretical way of regarding things. Separated from matter, practice, or particular examples; not concrete; insufficiently factual; unreal; hypothetical; abstruse; difficult to understand; incomprehensible.

So which is the correct meaning? In his book, Object-Oriented Design with Applications, Grady Booch (one of the authors of the Unified Modeling Language) defines abstraction in the following way:

An abstraction denotes the essential characteristics of an object that distinguish it from all other kinds of object and thus provide crisply defined conceptual boundaries, relative to the perspective of the viewer.

In other words, abstractions are concerned with the simplification of reality and the removal of inessential details that may be associated with that reality. The end result of this process called "abstraction" is supposed to be one or more class definitions, where each class defines a different type of entity, with its own properties and methods, which can be instantiated into any number of objects. Yet some people seem to think that the product of abstraction should be something that is unreal, something that does not reflect the reality which it is supposed to represent. Take the following example:

Which one of those "abstractions" provides something that models the real world closely enough to be of use in an anatomy class? Would you want to be operated on by a surgeon whose perception of the human body was limited to what he had seen in one of Picasso's paintings?

This misconception about the meaning of the word "abstraction" leads to my approach of having a separate class for each database table being subject to criticism such as this:

Abstract concepts are classes, their instances are objects. Classes are supposed to represent abstract concepts. The concept of a table is abstract. A given SQL table is not, it's an object in the world. Having a separate *class* for each table is therefore bad OO.

If you say that "a given database table is an object in the real world" then how can you say that each table object cannot have its own class?

While an unidentified database table can be described in abstract terms (it does not have a name or any columns yet), a particular database table can be described in more concrete terms. Thus a CUSTOMER table contains properties in the form of columns, and it has the methods which are common to all database tables - select, insert, update and delete. The DDL script for a table is the blueprint for each record within that table, and each record within that table is an instance of that blueprint. The DDL script, the blueprint, can therefore be used to define a CUSTOMER class, and an object of this class can be used to manipulate records within the CUSTOMER table.

Similarly a table to hold PRODUCT data will have its own DDL script with its own set of columns, and because it has a different set of properties it therefore qualifies to have its own class definition. It will share the same abstract methods as any other database table - select, insert, update and delete - although the actual implementation will be different for each. This does not mean that each class will contain its own code to generate those SQL queries as it is possible to generate them using a single shared function which is provided with the table name and an array of column names and their values as its input arguments.

If a given SQL table is an object and its DDL script provides the blueprint for each row (instance) within that table, then all I am doing is following the principles of OOP and using that blueprint to define a concrete class for that table. All abstract concepts which can be applied to any database table are inherited from an abstract class. This is supposed to be what OOP is all about, so why do you insist that I am wrong?
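The arrangement described above can be sketched in a few lines of PHP. This is only an illustration under stated assumptions: the names AbstractTable, buildInsert() and insertRecord(), and the column lists, are hypothetical and are not taken from my actual infrastructure. The shared function receives a table name and an array of column names and values; the abstract class holds what is common to every table, and each concrete class supplies only what its DDL script defines.

```php
<?php
// Hypothetical sketch: AbstractTable, buildInsert() and the column lists
// are illustrative assumptions, not code from the actual infrastructure.

/** Shared function: builds an INSERT from a table name and column => value pairs. */
function buildInsert(string $table, array $fieldarray): string
{
    $columns = implode(', ', array_keys($fieldarray));
    $quoted  = [];
    foreach ($fieldarray as $value) {
        // addslashes() keeps the sketch short; real code should use the
        // DBMS driver's own escaping or prepared statements.
        $quoted[] = "'" . addslashes((string) $value) . "'";
    }
    return "INSERT INTO $table ($columns) VALUES (" . implode(', ', $quoted) . ")";
}

/** Everything which is common to all database tables lives here. */
abstract class AbstractTable
{
    protected $tablename;        // supplied by each concrete class
    protected $fieldlist = [];   // column names, as defined in the DDL script

    public function insertRecord(array $fieldarray): string
    {
        // Discard any entries which are not columns on this table.
        $row = array_intersect_key($fieldarray, array_flip($this->fieldlist));
        return buildInsert($this->tablename, $row);
    }
}

/** A concrete class contains only what is specific to its table. */
class Customer extends AbstractTable
{
    protected $tablename = 'customer';
    protected $fieldlist = ['customer_id', 'name'];
}

class Product extends AbstractTable
{
    protected $tablename = 'product';
    protected $fieldlist = ['product_id', 'description', 'price'];
}

$customer = new Customer();
echo $customer->insertRecord(['customer_id' => 1, 'name' => 'Smith', 'submit' => 'OK']), "\n";
```

Note that the 'submit' entry is silently dropped because it is not a column on the table, and that neither concrete class contains any SQL-generating code of its own.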

Take a look at the following:

So is the result of your abstraction the "essential details" or something "hypothetical and incomprehensible"? Is the result of your abstraction a work of art like Michelangelo, or is it a piece of nonsense like Picasso?

Each database table requires its own class

The biggest problem virtually everybody has with OOP is how to split the entire application into a collection of different classes. What should be defined as a class, and what should not? What sort of class hierarchy would be best? Getting back to basics, what you are trying to do is build a system where you have software objects that represent Real World (RW) objects. Once you have identified which RW objects your application is supposed to deal with, then surely it follows that you must define a class for each of these RW objects from which you are able to create software objects?

In order to avoid confusion between RW objects and software objects I am going to use a different word. 'Thing' is valid but too common for some people. Another word already in use within the IT community is 'entity', so I shall use that. So an 'object' in the software is a representation of an 'entity' to the business.

In my many years of designing, building and using databases one valuable tool is the Entity Relationship Diagram (ERD) without which you cannot design a database that will support the needs of the business. This is where you identify all the entities used in the business and the relationships between them.

To me this seems blindingly obvious:

Also, if you look at the schema (DDL script) for a database table doesn't this qualify as the blueprint for all records of that type? Isn't a record an instance of that blueprint? Doesn't this mean that there are great similarities between the schema for a database table and the contents of a class? Yet there are some OO zealots out there who think that having a separate class for each database table is not good OO. I shudder to think how they divide their applications into classes. These must be the people who complain that "your design is centered around data instead of functions".

This now provides the following definitions to add to the list:

Entity: A real world 'thing' with which the business has to deal.
Object: A software representation of an entity (which is still an instance of a class).

But suppose my CUSTOMER information includes ADDRESS and CONTACT details - do I treat this as a single entity and therefore a single class?

You may, and others may, but I most definitely would not. There may be circumstances when you wish to access them as a group, but there may be other circumstances when you may wish to access only one of them independently of the others. If you build a class in which you cannot get to an ADDRESS or CONTACT without first going through CUSTOMER then you are creating a potential problem for yourself. If ADDRESS and CONTACT are sufficiently different from CUSTOMER that they require separate database tables then in my humble opinion they also require separate classes. This gives you the ability to access them as a group OR independently.

So instead of this:

Figure 1 - A Class for multiple tables


I would create this:

Figure 2 - A separate Class for each table

But what happens if I wish to access several tables as a group, not independently, within the same task?

There are several possibilities which depend on what it is you are trying to do:

This simple arrangement gives you the option of accessing multiple objects which are independent of one another (just like database tables are independent of one another), as shown in Figure 3:

Figure 3 - Accessing multiple objects from a Controller


Alternatively a Controller can access one object which in turn accesses one (or more) other objects as shown in Figure 4. The fact that the first object links to other objects is completely unknown to the Controller. This simple arrangement gives you the ability to link from one object to another in whatever combinations take your fancy. The possibilities are endless.

Figure 4 - Accessing one object from another


The method of combining two or more objects is often referred to as object composition/aggregation, but as usual there is more than one way in which this can be implemented. If you look at Figure 1 there is an "outer" object which acts as a container for other "inner" objects. There are two methods I have seen in which this type of structure can be accessed:

I do not like either of these approaches as they totally destroy any advantages gained from polymorphism. This would mean that each calling object (usually a controller) would have to be tailored for each object it wishes to call, which in turn means that it would be difficult to employ generic and reusable controllers. Tailoring a controller so that it can only call a single object results in tight coupling. This is a Bad Thing (™) in my book, therefore something which should be avoided.

If, for example, I have an input screen which requires data to be split across two tables, TableA and TableB, I do not bother customising a controller so that it communicates with both the TableA object and the TableB object. Instead I use the generic ADD 1 controller and use the insertData() method on the TableA object. This has the responsibility of inserting the relevant data into TableA, and also of instantiating an object for TableB so that it can use the insertData() method on the TableB object for the remaining data.

The generic controller sends all the data to TableA, and it is an implementation detail inside the TableA object that some of the data actually needs to go to TableB. Because this "implementation detail" is inside the table object and not the controller it means that the implementation detail can be changed at any time, such as dropping the reference to TableB, or expanding onto TableC as well, by changing the code inside the table object instead of the code inside the controller. Because these "implementation details" are maintained within business objects and not controllers it is possible to use a smaller set of generic and reusable controllers instead of having to customise a controller for each different set of circumstances. This, in my mind at least, is a Good Thing (™).
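As a rough sketch of this arrangement, and nothing more, consider the following. The names Table, afterInsert() and addController(), the column lists, and the in-memory log which stands in for the database are all assumptions made for the purpose of illustration. The point is that the controller knows nothing except the insertData() method, and the hand-off to TableB is buried inside TableA.

```php
<?php
// Hypothetical sketch: Table, TableA, TableB, addController() and the
// column lists are illustrative assumptions, not the actual framework code.

abstract class Table
{
    public static $log = [];     // stand-in for the database: [table, row] pairs
    protected $tablename;
    protected $fieldlist = [];

    public function insertData(array $data): array
    {
        // Pick out only the columns which belong to this table.
        $row = array_intersect_key($data, array_flip($this->fieldlist));
        self::$log[] = [$this->tablename, $row];
        $this->afterInsert($data);
        return $row;
    }

    // Empty by default; a concrete class may override it to do extra work.
    protected function afterInsert(array $data): void
    {
    }
}

class TableB extends Table
{
    protected $tablename = 'table_b';
    protected $fieldlist = ['a_id', 'b_note'];
}

class TableA extends Table
{
    protected $tablename = 'table_a';
    protected $fieldlist = ['a_id', 'a_name'];

    // Implementation detail: the remaining data goes to TableB.
    protected function afterInsert(array $data): void
    {
        $tableB = new TableB();
        $tableB->insertData($data);
    }
}

// A generic "add" controller: it works with ANY table object because
// all it ever calls is insertData().
function addController(Table $object, array $input): array
{
    return $object->insertData($input);
}

addController(new TableA(), ['a_id' => 1, 'a_name' => 'first', 'b_note' => 'extra']);
foreach (Table::$log as $entry) {
    echo $entry[0], ': ', implode(', ', array_keys($entry[1])), "\n";
}
```

Dropping the reference to TableB, or adding a TableC, would mean changing afterInsert() in TableA and nothing else. The controller is untouched.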

If object 'B' is a type of object 'A', then surely 'B' must be a subclass of 'A'?

This is a common mistake that produces class structures that are out of step with the underlying database structure, which leads to no end of problems. In a lot of tutorials on OO I see examples of class hierarchies created just because something is a type of something else. For example, "dog" is a class, but because "alsatian", "beagle" and "collie" are regarded as types of "dog" they are automatically represented as subclasses. This results in a structure similar to that shown in Figure 5:

Figure 5 - hierarchy of "dog" classes


This to me is absolute rubbish as it does not accurately represent the structure which I would create in the database. The differences between each type of dog are not significant enough to warrant a separate database table for each type, so why should they require a separate class? To me DOG_TYPE is just a piece of data associated with each dog, such as GENDER, DATE_OF_BIRTH and COLOUR. In order to avoid a hard-coded list of dog types I would have a dynamic list maintained on a separate database table, and I would use a standard one-to-many relationship as shown in Figure 6:

Figure 6 - structure of "dog" tables


In this set of circumstances I have just two classes - one for DOG and another for DOG_TYPE. An advantage to this method is that I do not have to create a new subclass if I want to deal with a new type of dog - I simply add a new record to the DOG_TYPE table. Extracting details for particular types of dog is just as easy as extracting details by any attribute, such as:

SELECT * FROM dog WHERE dog_type='COLLIE'
SELECT * FROM dog WHERE gender='MALE'
SELECT * FROM dog WHERE colour='LIGHT BROWN'

This is not exactly rocket science, so why do some people insist on making it more complicated than it need be?

Don't waste time with Mappers

A "mapper" is something which sits between two objects in order to ensure that the output generated by one object is converted to the input expected by the other object. If the objects can engage in a two-way conversation then the mapper is expected to handle the conversion in both directions.

The reason that one might want to employ a mapper is that if the message format or contents in one of the objects ever changes then instead of making a corresponding change in the second object you make the change in the mapper instead.

This might have benefits in a situation where the communication is many-to-mapper-to-one as the change need only be made in the single "mapper" object instead of within each of the "many" objects, but in a situation where the communication is one-to-mapper-to-one there are no savings. On the contrary, in such a situation the introduction of a mapper does nothing but provide an extra level of complexity, more code to write, more code to test, more code to document and more code to debug. Just because a mapper may provide benefits in some circumstances does not guarantee that benefits will be provided in all circumstances. I've heard people estimate that as much as 70 percent of a given project's programming and debugging is spent in the object-relational mapping code, so as far as I am concerned this is a totally unnecessary overhead that can easily be eliminated.

Object-Relational Mappers (ORMs)

An OR Mapper is something that sits between an in-memory object and a relational database. It is required when the structure of one is different from the structure of the other so that data being moved around can be correctly reformatted for the structure of the receiving component.

OR Mappers were originally created when relational databases that were accessed via SQL statements first appeared, and developers were reluctant to learn (or incapable of learning) a new language. Thus all SQL statements were maintained by SQL developers in separate objects, leaving the application developers to continue using the language of their choice. It was not necessary that the data structure in the application code be identical to the data structure in the SQL code as any differences could be dealt with programmatically. Not only was it possible for the two data structures to be different, over periods of time it became inevitable as one side was modified without the corresponding changes being made to the other side.

Another reason for the growth of OR mappers is because a lot of OO programmers have a nasty habit of designing complex object structures and hierarchies without any regard to the physical database design, often because the database is built after the event by different people. Because an OO database (OODBMS) which can support object structures and hierarchies without any mapping code is as rare as rocking horse shit, it is necessary to build a relational database (RDBMS) using different principles, then to write code to deal with all the differences between the Object Oriented and the Relational components.

I have found a simple way to avoid this unnecessary complexity: instead of allowing the structure of the two components to be different, thus requiring the need for a mapper, why not keep the two structures completely synchronised, thus removing the need for a mapper? If there are no differences then you do not need any code to deal with those differences.

I do not build my object hierarchy then add on a database as an afterthought; I design my database up front using the process of normalisation, then I create my class structure. This is very easy because I import the database schema into my Data Dictionary then export those details to my application, which produces one class file and one structure file for each database table. If the structure of any table changes then the import process can be run again to detect and deal with those changes, and the export process will regenerate the structure file. Note that it does not have to regenerate or update the class file as no methods or properties are affected. Thus my class structure and database structure are always in sync, which means that I have absolutely no need for any type of mapper.
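To give a feel for why the class file need not change when the table does, here is a small sketch. The layout of the $fieldspec array and the validateRow() function are assumptions made for illustration only; they are not the actual format produced by the Data Dictionary export. The structure data describes the columns, and the code which uses it is entirely generic.

```php
<?php
// Hypothetical structure data for a CUSTOMER table. The array layout is an
// assumption for illustration, not the real exported file format.
$fieldspec = [
    'customer_id' => ['type' => 'integer', 'required' => true],
    'name'        => ['type' => 'string', 'size' => 40, 'required' => true],
    'email'       => ['type' => 'string', 'size' => 80],
];

/** Generic validation driven purely by the structure data. */
function validateRow(array $fieldspec, array $row): array
{
    $errors = [];
    foreach ($fieldspec as $column => $spec) {
        $value = isset($row[$column]) ? (string) $row[$column] : '';
        if ($value === '') {
            if (!empty($spec['required'])) {
                $errors[$column] = 'This field is required';
            }
            continue;   // nothing more to check on an empty value
        }
        if ($spec['type'] === 'integer' && !preg_match('/^[0-9]+\z/', $value)) {
            $errors[$column] = 'Must be an integer';
        } elseif ($spec['type'] === 'string' && strlen($value) > $spec['size']) {
            $errors[$column] = 'Too long';
        }
    }
    return $errors;
}

print_r(validateRow($fieldspec, ['customer_id' => 'abc', 'name' => '']));
```

If a column is added, or its size changes, only the structure data is regenerated. The validation code, like the class file, stays exactly as it was.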

This simple and effective process means that I have a separate class for each database table, which, according to the OO purists, is not good OO. Ask me if I care!

Whether you like it or not both the data and the software which manipulates that data have some sort of structure:

In the early 1980s all the courses on Structured Programming emphasised the point that the program structure should mirror the data structure as closely as possible, so if the structure of the data changed then the structure of the code which accessed it should change accordingly. Having personally witnessed the advantages of writing and maintaining code which has the same structure as the data I simply would not consider doing it any other way, so when someone tells me that this approach is wrong I can only say - BALDERDASH! POPPYCOCK! PHOOEY!

Another reason to avoid OR mappers is their impact on performance. The following quote comes from http://www.polepos.org, a company that provides benchmarking software:

The use of O-R mapping technology like Hibernate or JDO O-R mappers has a strong negative impact on performance. If you can't compensate by throwing hardware at your application, you may have to avoid O-R mappers, if performance is important to you.

This theme is followed up in Object Relational Mappers are EVIL.

Metadata Mappers

This type of object deals with the translation of column/field names between one object and another, such as a data access object which communicates with the database, a business object where all business rules are applied, and a presentation object which displays output to and accepts input from the user.

I have been programming for 25+ years, and in that time I have used numerous languages and numerous file systems, and I have worked on numerous different projects with different teams of designers and developers. In all that time I have NEVER, EVER come across the idea that different components should have different names for the same piece of data. It has always seemed so logical to keep the same item names throughout the application. Indeed, some languages have made it virtually impossible to do otherwise, while others will only allow it with the addition of volumes of extra code. It is only by deliberately choosing a different naming scheme between different objects that the need for the services of a mapper arises. But by doing what comes naturally, by applying common sense and logic, the naming schemes are identical and thus there is absolutely no need for a mapper.

As far as I am concerned it is normal to have a single naming scheme for data items throughout the application, regardless of the type or size of the application. To deliberately choose more than one naming scheme strikes me as being abnormal if not perverse. Only a masochist, someone who seeks a path of pain, would make such a choice.

When programming with PHP for example, the various functions with which data can be accessed and manipulated seem to assume a single naming scheme. For example:

As you can (or should) see, the use of a single naming scheme throughout the application requires the least amount of effort and presents the least amount of problems. Any attempt to introduce multiple naming schemes would require an enormous amount of effort for absolutely no gain (that I can see) whatsoever. I don't know what you learned at school, but to me any effort which produces no tangible benefit is wasted effort and should therefore be avoided as a Bad Thing (™). The use of mappers signifies wasted effort, therefore they have no place in my methodology.
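A minimal sketch of what a single naming scheme looks like in practice, assuming a hypothetical CUSTOMER table with customer_id and name columns. The functions buildUpdate() and renderField() are invented for this illustration, and $post merely stands in for the $_POST array which PHP supplies. The same associative array keys serve as form field names, array keys and column names, so nothing ever needs translating.

```php
<?php
// Hypothetical sketch: buildUpdate(), renderField() and the column names
// are illustrative assumptions. $post stands in for PHP's $_POST array.
$post = ['customer_id' => '42', 'name' => 'Smith'];

/** Column names come straight from the array keys; no mapping step. */
function buildUpdate(string $table, array $fieldarray, string $keyname): string
{
    $pairs = [];
    foreach ($fieldarray as $column => $value) {
        if ($column !== $keyname) {
            // addslashes() keeps the sketch short; real code should use the
            // DBMS driver's own escaping or prepared statements.
            $pairs[] = "$column='" . addslashes((string) $value) . "'";
        }
    }
    $key = addslashes((string) $fieldarray[$keyname]);
    return "UPDATE $table SET " . implode(', ', $pairs) . " WHERE $keyname='$key'";
}

/** A row read from the database is keyed by column name, so the same
 *  name can be used for the HTML form field as well. */
function renderField(string $name, array $row): string
{
    return '<input name="' . $name . '" value="' . htmlspecialchars((string) $row[$name]) . '">';
}

echo buildUpdate('customer', $post, 'customer_id'), "\n";
echo renderField('name', $post), "\n";
```

With two naming schemes every one of those steps would need a lookup table to convert one name into the other, plus the code to maintain it.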

Don't waste time with Object Oriented Design

When designing an application which uses a relational database it is essential that the database be properly normalised otherwise its performance may be catastrophic. The process of data normalisation requires the following of a number of rules and techniques which must be implemented in a set sequence, from 1st Normal Form (1NF), 2nd Normal Form (2NF) and 3rd Normal Form (3NF), and possibly all the way up to 6th Normal Form (6NF).

When designing the objects which are to be used in an application you need to identify the individual classes, their properties and their methods. This requires the following of a set of rules and techniques which are not as well defined and are therefore open to a lot of interpretation (and also mis-interpretation).

If you employ both design methodologies, one for the database and another for the software, it is more than likely that you will end up with two structures which are not entirely compatible (see Object-Relational Impedance Mismatch). In order to allow the two incompatible components to communicate with one another the usual answer is to introduce a third component, known as an Object Relational Mapper, to sit between the two and deal with the differences.

To my mind once you have designed a properly normalised database it is not necessary to carry out a separate design process for the software objects as all you require has already been provided:

So if everything you need - classes, properties and methods - has been provided by Table Oriented Design (TOD), why waste time with Object Oriented Design (OOD)? Why waste time producing two designs which are so different that it requires an extra component to deal with the differences?

Three levels of separation is enough

It is not considered good practice for each object to have too many responsibilities or concerns otherwise it may become too large and complex, and therefore difficult to maintain. Here is another problem - how do you identify those responsibilities or concerns which can be split off into other objects? Again my previous experience provided valuable insight. I have spent many years writing 1-tier systems in COBOL and 2-tier systems in UNIFACE, so I am well aware of how much complexity and duplication of code these involve. When proper support for the 3-tier architecture was introduced into UNIFACE in the year 2000 I managed to convert my 2-tier infrastructure into 3-tier and I could immediately see the benefits. This has been my favourite design pattern ever since.

For those of you who are unfamiliar with the 3-tier architecture it involves the separation of application logic into three tiers or layers:

Having successfully implemented this degree of separation in one language I saw no reason why I should not be able to do the same in PHP. To this end I decided that all the classes built around business entities would go into the middle business layer and be responsible for nothing more than business logic, while presentation and data access logic would be moved to different components.

This now provides the following definitions to add to the list:

Presentation Object An object or component which exists in the presentation layer and which contains nothing but presentation logic.
Business Object An object or component which exists in the business layer and which contains data validation and business logic. It may also contain information which may be passed to other objects to help them carry out their responsibilities. Each of these objects will be associated with a database table. It is possible for an object to link with other objects in order to access multiple tables for a single task.
Data Access Object An object or component which exists in the data access layer and which contains nothing but data access logic. It is possible to have a separate DAO for each DBMS so that an application can be switched from one DBMS to another without having to change any code in the Business layer.

Please note that by "logic" I mean program code, not data. Information is not Logic just as Data is not Code.

There are some who would argue that it is possible to achieve a greater degree of separation which means creating more than three objects, but having witnessed a disastrous attempt at implementing a ten-tier structure I can only disagree.

However, after I had built my infrastructure someone pointed out that I had actually implemented a version of the Model-View-Controller (MVC) design pattern as my presentation layer contained a controller component and a view component, with my business layer containing the model component. Upon reading the description of this design pattern I could see the similarities, but this was pure coincidence, not deliberate design.

Methods are centered around database access

My approach is simple, yet frowned upon by some OO fanatics because it breaks their rules. See if you can follow my logic:

  1. Each business object is associated with a single database table.
  2. A database table has only four basic operations that can be performed on it - Create (Insert), Read (Select), Update and Delete.
  3. Each object should therefore have a method which corresponds to each of these operations.
  4. The table name does not have to be included in any method name as it is already specified in the class from which the object was instantiated.

Point (3) led me to create the following set of methods for each database table object:

Some of these methods only deal with single database records, so later on I added the following:

Point (4) is directly related to polymorphism, one of the fundamental principles of OOP. This means that in the situation where there are database tables for Customer, Product and Invoice I do not do what I have seen others do and create methods such as the following:

I can achieve the same result with the following:

This means that I can use generic controllers which call generic methods on whatever object they are told to work with. This also means that the same controller can be used with ANY object to achieve a predictable result without any modification, thus making them infinitely reusable. This high level of reusability would seem to indicate that my implementation achieves low coupling which is supposed to be a Good Thing (™).

Because each database table class requires the same generic methods I can define these methods just once in an abstract table class which can then be extended into a separate subclass for each individual database table class. This is making good use of inheritance, another of the fundamental principles of OOP.
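
The combination of inheritance and polymorphism described above can be sketched as follows. This is a minimal illustration under stated assumptions, not the actual framework code: the class names GenericTable, Customer and Product, the in-memory $rows store and the function runController() are all hypothetical, and insertRecord() is an assumed method name. Only getData() is named elsewhere in this article.

```php
<?php
// Hypothetical abstract table class: the generic methods are defined once.
// An in-memory array stands in for the database so the sketch is runnable.
abstract class GenericTable
{
    protected $tablename;        // supplied by each subclass
    protected $rows = array();   // stand-in for the physical table

    public function insertRecord(array $fieldarray)
    {
        $this->rows[] = $fieldarray;
        return $fieldarray;
    }

    public function getData()
    {
        return $this->rows;
    }

    public function getTableName()
    {
        return $this->tablename;
    }
}

// Each table subclass supplies only what is specific to that table.
class Customer extends GenericTable
{
    protected $tablename = 'customer';
}

class Product extends GenericTable
{
    protected $tablename = 'product';
}

// Generic controller: it is told which class to use and calls the same
// method names whatever that class happens to be.
function runController($table_name, array $input)
{
    $dbobject = new $table_name;
    $dbobject->insertRecord($input);
    return $dbobject->getData();
}

$customers = runController('Customer', array('name' => 'Smith'));
$products  = runController('Product',  array('code' => 'P001'));
```

Note that runController() contains no table names and no field names, which is what allows the same controller to be pointed at a different table class without modification.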

If I am using polymorphism and inheritance to create reusable code, which is one of the fundamental aims of OOP, how is it possible for these OO fanatics to tell me that my methods are wrong? If I can achieve the desired result by breaking their precious rules, then doesn't it indicate that their precious rules are in desperate need of serious revision?

Don't use getters and setters for user data

In all of the OOP samples I saw in books or within internet tutorials the same convention was followed:

I thought this was very cumbersome as within my previous 3-tier infrastructure all the data could be passed around 'en bloc' as a single XML stream instead of one field at a time. Although it is possible within PHP to write to and read from XML streams I decided that this would be total overkill as PHP is already equipped with a much simpler mechanism - associative arrays.

As you can see it is perfectly possible to pass the whole array around from one layer to the next without losing any functionality, so I saw no need to copy everyone else and generate extra code to pass the data around one item at a time.

This means that instead of code like this:

<?php 
$client = new Client(); 
$client->setUserID    ( $_POST['userID']    ); 
$client->setEmail     ( $_POST['email']     ); 
$client->setFirstname ( $_POST['firstname'] ); 
$client->setLastname  ( $_POST['lastname']  ); 
$client->setAddress1  ( $_POST['address1']  ); 
$client->setAddress2  ( $_POST['address2']  ); 
$client->setCity      ( $_POST['city']      ); 
$client->setProvince  ( $_POST['province']  ); 
$client->setCountry   ( $_POST['country']   ); 

if ($client->submit($db) !== true) 
{ 
    // do error handling 
} 
?> 

I can use code such as this:

<?php 
require_once "$table_name.class.inc";  // double quotes so that $table_name is interpolated
$dbobject = new $table_name;
$errors = $dbobject->updateRecord($_POST);
if ($errors) 
{ 
    // do error handling 
}
?> 

This simple technique means I can add or remove fields from a database table without having to change any getters or setters anywhere. This also means that because my controllers do not contain getters and setters with hard-coded field names I am one step nearer to having generic controllers which can be used on any database table.

Another advantage of returning an array of values instead of having to use a separate getter for one value at a time is that I do not need one piece of code when dealing with a single record and another piece of code for a collection of records. I have a single getData() method which returns an array, and this array can contain any number of rows.
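
A minimal sketch of this point, assuming a hypothetical Client class which is backed by an in-memory array instead of a database. The array-based selection criteria and the helper listNames() are invented for illustration and are not the framework's real signatures.

```php
<?php
// Hypothetical table object backed by an in-memory array instead of a database.
class Client
{
    private $rows = array(
        array('userID' => 1, 'lastname' => 'Smith', 'country' => 'UK'),
        array('userID' => 2, 'lastname' => 'Jones', 'country' => 'UK'),
        array('userID' => 3, 'lastname' => 'Brown', 'country' => 'FR'),
    );

    // Returns every row which matches $criteria: zero, one or many rows.
    public function getData(array $criteria)
    {
        $result = array();
        foreach ($this->rows as $row) {
            // keep the row only if every key/value pair in $criteria is present in it
            if (array_intersect_assoc($criteria, $row) == $criteria) {
                $result[] = $row;
            }
        }
        return $result;
    }
}

// The same loop handles a single row and a collection of rows.
function listNames(array $rows)
{
    $names = array();
    foreach ($rows as $row) {
        $names[] = $row['lastname'];
    }
    return $names;
}

$client = new Client();
$one    = listNames($client->getData(array('userID' => 2)));
$many   = listNames($client->getData(array('country' => 'UK')));
```

The calling code does not need to know in advance whether the selection will produce one row or several.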

Two levels of class hierarchy is enough

Instead of doing what some people seem to do and design a complex class hierarchy before writing any code, I did exactly the opposite. I wrote the code, then I split it into a suitable class hierarchy which ended up as only two levels deep - a superclass containing generic code for all database tables, and a series of subclasses for individual database tables. I wrote a class for one database table, then tested it using a family of six standard screens. Once this was working I made a copy of the entire class, then modified it to deal with a second database table. The next exercise was to compare the two classes and determine which code was duplicated and could therefore be shared, and which code was specific to just one database table and therefore could not be shared.

The standard mechanism for sharing code in the OO paradigm is through inheritance, so I put all the common code into a class of its own, a generic table class. For each individual database table I created a separate table class which held specific information for that database table. As each database table subclass 'extends' the generic table superclass the resulting object shares or 'inherits' all the properties and methods from the superclass.

As a final step I took out all the code which communicated with the database and put it into a separate SQL/DML class. This now exists in the data access layer and is sometimes referred to as the data access object or DAO.

There are some circumstances when I find it useful to create subclasses of my database table subclasses, and these are documented in When and how do you use subclassing?

Interfaces are not necessary

My development infrastructure was written in PHP 4 which does not have the more comprehensive object model of PHP 5. Yet it works, so the OO functionality provided by PHP 4 must be perfectly adequate. My infrastructure will also run under PHP 5 without modification. This leads me to believe that some of the new OO features in PHP 5 are merely cosmetic and offer no functional benefit. They must have been put there just to satisfy those OO zealots who say 'Java/C++ does this-and-that, so I want PHP to do the same'.

An example of this can be found with object interfaces which the PHP manual describes thus:

Object interfaces allow you to create code which specifies which methods a class must implement, without having to define how these methods are handled.

Take a look at the following code:

<?php
// Declare the interface 'iTemplate'
interface iTemplate
{
   public function setVariable($name, $var);
   public function getHtml($template);
}

// Implement the interface
class Template implements iTemplate
{
   private $vars = array();
  
   public function setVariable($name, $var)
   {
       ....code....
   }
  
   public function getHtml($template)
   {
       ....code....
   }
}
?> 

Exactly the same result can be achieved with this code:

<?php
class Template
{
   private $vars = array();
  
   public function setVariable($name, $var)
   {
       ....code....
   }
  
   public function getHtml($template)
   {
       ....code....
   }
}
?> 

So my question is this - if I can achieve exactly the same result without using interfaces, then what is the benefit of using them? Why should I waste my time writing more lines of code than is necessary?

After doing a little investigation I discovered the following comment regarding interfaces:

The reason for this is that PHP has its own method of dealing with arguments of different types, or arguments which are optional, while other languages can only do this through the use of interfaces.

I also came across articles which proposed the use of delegates instead of interfaces. Delegates are like interfaces except that they do not require the callee to declare an explicit interface. The caller must have access to the interface declaration in the form, but it is not necessary for the target to explicitly declare the implementation of an interface. Anonymous classes in Java are used for the same purpose.

This raises another important question - if interface declarations are so good, why do these statically typed languages keep using such cumbersome methods of avoiding them?

As far as I am concerned the word 'interface' means 'application program interface' (API) which means the method/function name and its associated arguments. I do not need to define method names in one place and interfaces in another when a single definition will achieve the same result.

Use design patterns sparingly

All too often the OO zealots like to say 'you must prove you are one of us by implementing design patterns'. This usually means THEIR favourite patterns from THEIR favourite author (see all the references to Martin Fowler's Patterns of Enterprise Application Architecture (PoEAA) in In the world of OOP am I Hero or Heretic?). This again produces another dilemma as there are dozens of books by different authors containing hundreds of different design patterns, so which ones do you choose? If you have the time to examine all of these books closely you should observe the following:

So as you can see there is not one set of universally accepted design patterns just as there is not one universally accepted definition of OOP, which means that if you dare make the 'wrong' choice (according to the paradigm police) you are automatically a heretic.

Some programmers start by picking a collection of what they deem to be 'suitable' patterns (or what they are told are suitable to 'real' OO programmers) then attempt to implement them. This, in my opinion, is the wrong approach. Although it is a good idea to be aware of what patterns exist, and what problem each pattern is supposed to solve, you should not seek to employ any particular pattern until such time as you encounter a situation that the pattern was designed to solve. There may be a choice of alternative patterns for a particular situation, so you must take the time to choose the one which is most appropriate for your circumstances.

The only exception to this, where a pattern is deliberately chosen before the first line of code is written, should be a high-level architectural pattern. Before I wrote my own development infrastructure for PHP, for example, I knew that I wanted to employ the 3-tier architecture as I had used it with great success in a previous language. It wasn't until afterwards that someone pointed out that my code also included an implementation of the Model-View-Controller (MVC) design pattern, but that was entirely accidental, not deliberate. The only other design pattern that I have implemented after reading about it is the Singleton. There may be other recognisable patterns within my code, but that is pure coincidence.

For more of my views on design patterns please refer to Design Patterns - a personal perspective.

Multiple inheritance is not necessary

I have never considered multiple inheritance to be the solution to any problem I have encountered, yet others seem to employ it at every possible opportunity. Take the situation where a screen is required to show data from two database tables, TableA and TableB, which exist in a one-to-many relationship. The screen is required to show one occurrence from TableA at the top, with multiple related occurrences immediately below it.

According to some people, the Controller part in the MVC design pattern can only communicate with a single Model object, therefore under these circumstances this object must be a composite of TableA and TableB. It will therefore need one set of methods to access the data from TableA, and another set of methods to access the data from TableB. It will also need to inherit the properties and methods of the original TableA and TableB classes, hence the need for multiple inheritance.

My approach is far simpler. The first thing is to ignore the rule that "a controller can only access one object" and build a controller that is specifically designed to access any two objects which exist in a one-to-many relationship. That means that I do not need to construct a composite object, I do not need different methods to access the data from each of the two tables as the standard methods are more than adequate, and I certainly do not need multiple inheritance. What is more, that single controller can deal with ANY pair of tables which exist in a one-to-many relationship.
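
As a rough sketch of such a controller, assuming a hypothetical TableObject class with a simplified getData() method and a hypothetical function listOneToMany(). None of these names or signatures are taken from the actual framework, and an in-memory array stands in for each database table.

```php
<?php
// Hypothetical table object; an in-memory array stands in for the database.
class TableObject
{
    private $rows;

    public function __construct(array $rows)
    {
        $this->rows = $rows;
    }

    // Return all rows whose $field equals $value.
    public function getData($field, $value)
    {
        $result = array();
        foreach ($this->rows as $row) {
            if ($row[$field] == $value) {
                $result[] = $row;
            }
        }
        return $result;
    }
}

// Controller for ANY parent/child pair: it needs only the two objects and
// the names of the key fields which link them. No composite object is built.
function listOneToMany($parent, $child, $primaryKey, $foreignKey, $id)
{
    return array(
        'outer' => $parent->getData($primaryKey, $id),   // one row at the top
        'inner' => $child->getData($foreignKey, $id),    // many rows below it
    );
}

$tableA = new TableObject(array(
    array('a_id' => 1, 'name' => 'first'),
    array('a_id' => 2, 'name' => 'second'),
));
$tableB = new TableObject(array(
    array('b_id' => 10, 'a_id' => 1),
    array('b_id' => 11, 'a_id' => 1),
    array('b_id' => 12, 'a_id' => 2),
));

$screen = listOneToMany($tableA, $tableB, 'a_id', 'a_id', 1);
```

Both objects are accessed through the same method, and neither class inherits anything from the other.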

I once came across a post in a PHP newsgroup where someone complained that he could not write a routine to filter user input without multiple inheritance, and because PHP did not support multiple inheritance such a routine was physically impossible. As an example he took a piece of data which had to meet the conditions is_numeric and is_required. In his design he wanted to create an object for that piece of data which inherited from the numeric class as well as the required class. I don't know about you, but I can test that a piece of data meets the is_numeric and is_required conditions without putting that piece of data into its own object, and I can certainly do it without requiring any sort of inheritance, multiple or otherwise. The poor deluded soul did not understand that the need for multiple inheritance was a product of his design, and a different design would remove this need. Other programmers can write code which filters user input without the need for multiple inheritance (see below), so why can't he?

This is an example of how this requirement is satisfied in my framework:

$this->fieldspec['field1']           = array('type' => 'integer',
                                             'size' => 5,
                                             'minvalue' => 0,
                                             'maxvalue' => 65535,
                                             'required' => 'y');

$this->fieldspec['field2']           = array('type' => 'numeric',
                                             'size' => 12,
                                             'precision' => 10,
                                             'scale' => 2,
                                             'blank_when_zero' => 'y',
                                             'minvalue' => 0,
                                             'maxvalue' => 99999999.99);

Can you see what I'm doing here? I am describing the characteristics of each piece of data, which then enables me to have a single routine which checks that each piece of input data conforms to these characteristics. If it doesn't then I can generate a meaningful error message and send the data back to the calling component. Simple, effective and flexible, and all without any of this "I can only do it with multiple inheritance" nonsense.
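
A single routine of this kind might look something like the following. This is only a sketch: the function validateFields() and its error messages are hypothetical, and it checks only the 'type', 'required', 'minvalue' and 'maxvalue' entries from the specifications shown above.

```php
<?php
// Hypothetical validation routine driven by a field specification array.
// Only 'type', 'required', 'minvalue' and 'maxvalue' are checked in this sketch.
function validateFields(array $fieldarray, array $fieldspec)
{
    $errors = array();
    foreach ($fieldspec as $field => $spec) {
        $value = isset($fieldarray[$field]) ? trim((string) $fieldarray[$field]) : '';

        if ($value === '') {
            if (isset($spec['required']) && $spec['required'] == 'y') {
                $errors[$field] = 'This field is required';
            }
            continue;   // nothing more to check on an empty value
        }

        if (in_array($spec['type'], array('integer', 'numeric'))) {
            if (!is_numeric($value)) {
                $errors[$field] = 'Value is not numeric';
                continue;
            }
            if ($spec['type'] == 'integer' && filter_var($value, FILTER_VALIDATE_INT) === false) {
                $errors[$field] = 'Value is not an integer';
                continue;
            }
            if (isset($spec['minvalue']) && $value < $spec['minvalue']) {
                $errors[$field] = 'Value is below the minimum';
            } elseif (isset($spec['maxvalue']) && $value > $spec['maxvalue']) {
                $errors[$field] = 'Value is above the maximum';
            }
        }
    }
    return $errors;   // an empty array means that all fields are valid
}

$fieldspec = array(
    'field1' => array('type' => 'integer', 'minvalue' => 0, 'maxvalue' => 65535, 'required' => 'y'),
    'field2' => array('type' => 'numeric', 'minvalue' => 0, 'maxvalue' => 99999999.99),
);

$good = validateFields(array('field1' => '42', 'field2' => '19.99'), $fieldspec);
$bad  = validateFields(array('field1' => '',   'field2' => 'abc'),   $fieldspec);
```

The "numeric" and "required" conditions are simply entries in an array which the routine reads, so adding a new condition means adding a new array key and a new test, not a new class.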

Results of my approach

By using this simple and straightforward approach I have managed to produce a development infrastructure in PHP which has the following characteristics:

If my infrastructure manages to achieve all this, then who are you to tell me that my implementation is wrong, invalid or impure?


Criticisms by the 'Paradigm Police'

Here is a selection of criticisms generated by members of the OO Purity League:

Your design is centered around data instead of functions

Some people like to use a function-driven design instead of a data-driven design, and complain when somebody dares to be different. If you inspect my infrastructure you should notice that in the different layers some components are designed around functions while others are designed around data:

I may design one component around the function it is supposed to perform and another around the data structure which it is supposed to represent, but that depends entirely on the individual component and where it fits into my infrastructure. There is no 'one size fits all' answer.

You may also wish to take a look at Why is your design centered around data instead of functions?

You have not achieved the correct separation of concerns

My framework is an implementation of the 3 Tier Architecture and Model-View-Controller design pattern (see Figure 7), and if you look closely enough you will see that each component does exactly what it is supposed to. My implementation may be different from yours, but that does not mean that it is wrong.

Figure 7 - MVC plus 3 Tier Architecture


The problem with the term 'separation of concerns' (which is sometimes expressed as 'separation of responsibilities' or 'separation of logic') is that different people have a different interpretation of what this actually means. If you study my infrastructure you should notice the following division of responsibilities:

As you can see each component has a single and clearly defined responsibility. The fact that information (data) may be supplied by another component in order to carry out that responsibility does not mean that this other component shares in that responsibility. The code which transforms data into HTML exists in its own component. The code which transforms data into SQL exists in its own component. The code which applies business rules to that data exists in its own component. If I have separate components which are responsible for HTML logic, SQL logic and business logic, how can this possibly be an "incorrect separation of logic"?

When some people talk about the 'separation of logic' they get confused over what the word 'logic' actually means. To me 'logic' means 'code' where an operation or function is actually performed. It is not the same as 'data' or 'information' which may be held in one component but passed to another when it needs to be processed. As an example consider the following:

Although some information is held with a business object it is not actually processed within that object. It is passed to another object (DAO or view) for processing as only that other object contains the logic (program code) to process that information in the relevant manner. The fact that a business object contains information which is passed to a DAO or view object most certainly does not mean that the business object shares in the responsibilities of those other objects. The different responsibilities are clearly carried out within separate objects, therefore I have (in my humble opinion) achieved a clear separation of responsibilities.

You may also wish to take a look at the following articles:

An object can only deal with a single database row

Some OO zealots say that if a class is built around a database table then surely an instance of that class (an object) should only be allowed to deal with a single instance of that database table (row) at a time. They obviously don't know the difference between a Domain Model and a Table Module:

The primary distinction with Domain Model (116) is that, if you have many orders, a Domain Model (116) will have one order object per order while a Table Module will have one object to handle all orders.

This may be because their use of separate getters and setters for individual fields within that table forces them to deal with one database row at a time. They then require a special procedure to obtain a collection of rows, then another procedure to step through them one at a time.

I do not have this problem as I do not use getters and setters for individual fields. All data goes in and out as an associative array, and as arrays can be multi-dimensional they can contain multiple rows as well as multiple fields. I use a standard getData() method to retrieve data regardless of how many rows may be selected, and a standard foreach() loop to process the result.

Your classes are too big

It has been suggested that 'real' OO programmers do not build classes which are beyond certain size limits:

Anything which exceeds these arbitrary limits should therefore be broken down into smaller classes.

I disagree. Breaking a class down into smaller units would break encapsulation. Having the information for a business entity contained within a single class makes it much easier to maintain than having that same information spread across multiple classes. I have already separated the application logic into different components as suggested by the 3-tier architecture and the Model-View-Controller design pattern, so I consider any further breakdown to be nothing more than an academic exercise with no practical benefit.

Instead of using an arbitrary value for 'too big/too many' I recently came across a definition (I forget where) which is less ambiguous:

'Too Many' means that you have more than you need. 'Too Few' means that you have less than you need.

Using this definition I can safely say that:

Your class methods are too visible

It has been suggested that even though an object may have a number of public methods, if any other object (such as a controller) which accesses it does not reference all of those methods then those other methods should be 'hidden' from that controller.

I disagree. Once a class has been defined with a set of methods it is not possible, through the mechanism of inheritance, to remove or hide any of those methods. It would be possible to create an entirely separate class for each controller, with only those methods actually required by that controller, but having multiple classes for the same entity would break encapsulation. This proliferation of classes would also make the application more difficult to maintain, and as it would not provide any discernible benefit I consider it to be nothing more than an academic exercise which can safely be ignored.

You have the wrong levels of coupling/cohesion/dependency

The terms coupling, cohesion and dependency are open to more than one interpretation. All too often I am accused of having the 'wrong' level of one or the other according to someone's personal interpretation. The problem lies in the fact that these variables cannot be measured on any scale - they are simply 'high' or 'low'.

So when is it 'too high' or 'too low'? When is it 'high enough' or 'low enough'?

My measuring stick happens to be the results of the 'right' and 'wrong' levels:

My architecture provides for extremely high levels of reusability, therefore my levels of coupling, cohesion and dependency must be at the right end of the scale.

Your design is dependent on SQL

This series of criticisms came from mjlivelyjr:

By having any SQL fragments in your presentation layer creates a dependency within your presentation on SQL. For example, if the database gods decided one day to radically alter SQL then you would have to make changes to your presentation layer because it has that dependency (or knowledge if you will) of SQL. SQL is obviously something that should be in the Data Access (Infrastructure) layer and if we are talking a 3 tiered application the presentation layer should have absolutely no dependency on your data access layer.

Before you start lecturing me on dependencies I suggest you go back to school and learn what dependency actually means. There can be a dependency between one module and another module, but there cannot be a dependency between a module and a piece of data. There is also no dependency between my presentation layer and my data access layer for the simple reason that the presentation layer does not call the data access layer.

My presentation layer does not execute any SQL queries, it merely passes around SQL fragments as data. These variables, which are entirely optional by the way, are passed through the business layer down to the data access layer where they are assembled into a valid query which is then executed. It is where SQL queries are actually executed which is the critical factor, not where the various components of those queries may originate.

Another significant point that you keep failing to take into consideration is that the DAO is never passed a complete SQL query for execution, it is passed a collection of data (user data and meta-data) which must be assembled into a query before it can be executed. As I have a separate DAO for each database engine (MySQL, PostgreSQL, Oracle and SQL Server) the query can be built according to the requirements of the DBMS in question. Thus any changes can be made within the DAO without having to go back to the source of that data.
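
The following sketch shows what "assembled into a query" could look like. The function buildSelect(), the array keys and the sample table are hypothetical and are not taken from the framework. The only dialect difference illustrated is row limiting, where SQL Server uses SELECT TOP n while MySQL and PostgreSQL use LIMIT n.

```php
<?php
// Hypothetical DAO routine: it is handed pieces of data, never a finished query.
// The array keys ('select', 'from', 'where', 'orderby', 'limit') are assumed names.
function buildSelect(array $parts, $dbms)
{
    $select = empty($parts['select']) ? '*' : $parts['select'];

    // Row limiting is one place where SQL dialects differ.
    $top  = '';
    $tail = '';
    if (!empty($parts['limit'])) {
        $limit = (int) $parts['limit'];
        if ($dbms == 'sqlserver') {
            $top = "TOP $limit ";        // SQL Server: SELECT TOP n ...
        } else {
            $tail = " LIMIT $limit";     // MySQL and PostgreSQL: ... LIMIT n
        }
    }

    $sql = "SELECT {$top}{$select} FROM {$parts['from']}";
    if (!empty($parts['where'])) {
        $sql .= " WHERE {$parts['where']}";
    }
    if (!empty($parts['orderby'])) {
        $sql .= " ORDER BY {$parts['orderby']}";
    }
    return $sql . $tail;
}

// The fragments could originate in any layer; until now they are only data.
$parts = array(
    'from'    => 'person',
    'where'   => "last_name LIKE 'S%'",
    'orderby' => 'last_name',
    'limit'   => 10,
);

$mysql     = buildSelect($parts, 'mysql');
$sqlserver = buildSelect($parts, 'sqlserver');
```

The same $parts array produces a different query string for each DBMS, and the component which supplied the fragments does not have to change.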

Your presentation layer, by using SQL, requires you to have knowledge of how SQL works.

So what? As I am in the business of designing and building web applications I require skills in all the relevant technologies - HTML, CSS, XML, XSL, HTTP, SQL, et cetera. I would find it rather difficult to write software without such knowledge. Even Martin Fowler in his article Domain Logic and SQL says that hiding SQL from developers may not be such a good thing after all:

Many application developers, particularly strong OO developers like myself, tend to treat relational databases as a storage mechanism that is best hidden away. Frameworks exist who tout the advantages of shielding application developers from the complexities of SQL.

Yet SQL is much more than a simple data update and retrieval mechanism. SQL's query processing can perform many tasks. By hiding SQL, application developers are excluding a powerful tool.

If he says that mixing SQL and domain logic is not a crime, then who are you to argue?

Yet more criticisms from mjlivelyjr:

I don't know that your view on dependency is entirely accurate. It may not seem like your controller is depending on SQL because it's not using full fledged SQL statements. However, let me ask you this. If you take a controller that is providing SQL fragments, and you decide you want to change the underlying database to a data system that doesn't use SQL will you have to make changes to your controller? If the answer is yes then that means your controller is dependent on SQL. Now you may say "I won't ever change away from SQL." That isn't the point, I am just saying your code is dependent on SQL. I am not even really saying that it's bad to depend on SQL in your controller. I am just saying it's not 3-tiered.

I think it's time for a reality check. Any such "dependency" on SQL is simply theoretical because there are no viable alternatives to SQL databases in the world of enterprise applications. If you don't believe me then answer these questions:

The reason that I do not cater for the possibility of dealing with a non-SQL database is that I do not need to. Refer to You Aren't Gonna Need It for a discussion on the logic of this argument.

This view was supported by aborint:

Sadly, the best argument against his design is that it would make it harder to do things that you would probably never do (e.g. convert you database to a CSV file). Increasing coupling to simplify code is a valid design decision. If you can manage the negative aspects of that decision, more power to you.

This little gem came from Brenden Vickery:

Not being able to switch to another data source is a problem whether that data source is an OO Database, an Relational Db or an XML file. Being able to make that switch is the point of the data source layer and if you know you'll never need to make any changes to how you access your data source then you don't need a data source layer.

Why on earth should I have two data layers, one for data source and another for data access, when both can be provided in a single component? My original Data Access Object communicated with a MySQL database, but when I wanted to use a PostgreSQL database instead I found that all I had to do was take the MySQL class, copy it, keep all the method names but change the code within each method. When I come to instantiate the Data Access Object all I have to do is identify whether to use the MySQL class or the PostgreSQL class. The business layer communicates with whatever object is instantiated using a common interface, so does not have to use different code to talk to a different object. This is a classic example of polymorphism, so should be familiar to every OO programmer. The fact that your implementation would be different does not concern me in the least.
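
A minimal sketch of that switch, assuming hypothetical class names DaoMysql and DaoPgsql and a hypothetical business-layer function fetchAll(). No database is contacted here; each method simply returns the SQL it would execute, using the identifier quoting of its own DBMS.

```php
<?php
// Hypothetical DAO classes: identical method names, different code inside.
// No database is contacted; each method returns the SQL it would execute.
class DaoMysql
{
    public function getData($tablename)
    {
        return "SELECT * FROM `$tablename`";      // MySQL quotes names with backticks
    }
}

class DaoPgsql
{
    public function getData($tablename)
    {
        return "SELECT * FROM \"$tablename\"";    // PostgreSQL uses double quotes
    }
}

// Hypothetical business-layer code: the only DBMS-specific step is choosing
// which class to instantiate. Everything after that is the same call.
function fetchAll($dbms, $tablename)
{
    $class = 'Dao' . ucfirst($dbms);   // 'mysql' becomes 'DaoMysql'
    $dao   = new $class;
    return $dao->getData($tablename);
}

$viaMysql = fetchAll('mysql', 'person');
$viaPgsql = fetchAll('pgsql', 'person');
```

The calling code in fetchAll() is identical for both engines, which is the polymorphism referred to above.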

More logic from mjlivelyjr:

The controller class is dependent on SQL. SQL is part of the data storage system. The data system lies in the data access layer. Follow the chain and you see that the controller class is dependent on the data access layer. Follow that one step further and it says your presentation layer is dependent on the data access layer.

I think it is your view on dependency which is not entirely accurate. The following description was provided by dagfinn:

from Martin Fowler, PoEAA

Together with the separation, there's also a steady rule about dependencies: The domain and data source should never be dependent on the presentation. That is, there should be no subroutine call from the domain or data source code into the presentation code.

This clearly states that "A is dependent on B" only when there is a subroutine call from A to B. If you agree with this description (and I dare you to disagree with Martin Fowler) then, as nothing in my presentation layer makes a direct call into the data access layer (it always goes indirectly through the middle business layer), I can state quite categorically that there is no dependency between my presentation and data access layers.

If writing software which is dependent on SQL is such a crime then why does Martin Fowler not have anything of significance to say on the matter? His book Patterns of Enterprise Application Architecture contains the following patterns: Table Data Gateway, Row Data Gateway, Data Mapper, Query Object and Record Set which all take for granted the fact that the underlying database can be accessed using SQL queries. If he regards SQL as the standard, then who are you to say otherwise?

You don't understand what '3 tier' means

This pearl of wisdom came from mjlivelyjr:

You are reading extremely watered down views of n-tier architecture that are most likely being conveyed in a tutorial for people new to the concept. I would wager to guess that the authors themselves would even agree with this assessment.

A similar comment from Brenden Vickery:

Being able to switch to different RDBMS isn't enough to call your layer a data source layer in the 3 tier sense.

I disagree. I suggest you take a look at: Client/Server and the N-Tier Model of Distributed Computing from a company which has been in business since 1982 and which knows a thing or two about the subject. This article clearly identifies the data source as "some sort of SQL server". It also states that database independence is achieved by "using standard SQL which is platform independent. The enterprise is not tied to vendor-specific stored procedures."

This article (and all the other articles I have read on the subject) quite clearly states that by implementing the 3-tier architecture it should be possible to switch from one SQL database to another SQL database simply by switching the data access layer. This I have achieved, therefore my implementation is correct. If this does not conform to your interpretation of the rules I can only suggest that it is your interpretation that needs to be questioned.

This observation came from lastcraft:

You have a client/server app., not a three tier one.

This was followed by this comment:

3 tier is not about dividing up code. You could do that just by placing different source files into different folders on your hard drive and claim it was "3 tier". 3 tier is about severely restricting visibility across those boundaries. If you fail to do that then you don't have a 3 tier architecture. There is no room for opinion here, you simply don't understand the definition if you've not done this.

I think that it is your definition of '3 tier' that needs to be re-examined, not mine. Perhaps if you look hard enough you can find one that is not printed on toilet paper. A 3 Tier Architecture is one which has the following component layers:

  1. A front-end component which is responsible for presentation logic.
  2. A middle-tier component which is responsible for business logic.
  3. A back-end component which is responsible for data access logic.

Communication between these layers is limited to the following:

In other words the requests must always be in the direction front-to-middle-to-back while the responses must always be back-to-middle-to-front.

That is precisely what my framework achieves, so it most definitely is 3 tier. Any definition of 3 tier which excludes these basic principles - such as your "severely restricting visibility across those boundaries" - is completely nonsensical and unworthy of consideration by any competent person.

You have the same data names in every layer

This wonderful piece of wisdom came from lastcraft:

The column names, and with it the schema, are bleeding upwards and destroying the layering.

What the f*** does 'bleeding upwards' mean? Where is this documented? This explanation was offered by Dr Livingston:

The concept of 'bleeding upwards' is not really a concept but a term to refer to one layer knowing about the layer above it.

Any given layer (regardless of it's disposition or task) should only ever know of the layer(s) below it, and not never know what's above it

You do not understand what one layer knowing about another layer actually means. The presentation layer knows about the business layer because it is capable of calling a method (issuing a request) on an object in the business layer. The business layer does not know about the presentation layer for the simple reason that the business layer never issues a request to any object in the presentation layer. The object in the business layer returns a response to a request, but it never issues a request on the presentation layer.

This comment came from mjlivelyjr:

The reason why your example breaks layering is because it references the column name as a column name.

A similar one came from Brenden Vickery:

Your Presentation is tied to your database through column names, and form names. You couldn't change your database without changing your presentation. You cant change your presentation without changing your database. I find the way you have done this to be extremely difficult to use.

So according to your interpretation of the rules it is wrong to refer to data items by the same name in each of the software layers? What absolute rubbish! In all the 25+ years that I have been programming I have never encountered a system which used different data names in different parts of the system. It is illogical, counter-intuitive, and would require additional modules to translate the data names between one component and another. Imagine how much more difficult debugging would be if a data item changed its name each time it passed between modules! What I am doing is standard practice. What you are suggesting is nothing short of perverse.

Yet another one from Version0-00e:

Presentation layer shouldn't need to know of what fields to use from the database. Looking last night on Tony's site, I seen in XML he passes over the database field names (for whatever reason).

This isn't real separation of concerns surely? An XSL stylesheet doesn't need to know this, all that it's interested in is getting the data from the XML and parsing it dependent on a given template, nothing more.

You are missing the point - as usual. Those data names are simply the data names which exist within the XML document. There is no reference as to where each item of data came from - it may or may not have come from a database, it may have been plucked out of thin air, it may or may not have come from a data source with the same name. The only thing that matters is that the data exists within the XML document - where it came from is totally irrelevant.
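To make the point concrete, here is a small Python sketch using the standard library's `xml.etree.ElementTree`. The element layout (a `root` node with one child element per row) and the function name `to_xml` are assumptions made for illustration; the framework's actual XML structure is not shown in this article.

```python
import xml.etree.ElementTree as ET


def to_xml(node_name, rows):
    """Turn a list of name/value rows into an XML string (hypothetical layout)."""
    root = ET.Element("root")
    for row in rows:
        node = ET.SubElement(root, node_name)
        for name, value in row.items():
            ET.SubElement(node, name).text = str(value)
    return ET.tostring(root, encoding="unicode")


# One row pretends to come from a database, the other is typed in by hand
from_database = [{"person_id": 1, "first_name": "Fred"}]
from_thin_air = [{"person_id": 1, "first_name": "Fred"}]

print(to_xml("person", from_database))
```

Both inputs produce an identical document, so a stylesheet reading `first_name` from it has no means of telling where the value originated.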

Your comment that "this isn't real separation of concerns" indicates to me that you haven't a clue as to what "separation of concerns" really means. In the 3 Tier Architecture each component layer has a distinct set of responsibilities/concerns:

  1. A front-end component which is responsible for presentation logic.
  2. A middle-tier component which is responsible for business logic.
  3. A back-end component which is responsible for data access logic.

Note that logic means code, not data, so the fact that an item of data can flow through all 3 layers, and be referenced with the same name in each layer, does NOT violate the "separation of concerns" principle. If I were to access the database from within the presentation layer, or execute business rules within the data access layer, then that would be a violation, but sharing common data names across layers most definitely would not.
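The distinction between logic and data can be illustrated with a row that keeps the same names from back end to front end. This is a Python sketch with hypothetical function names, a hypothetical 20% uplift standing in for a business rule, and made-up alternative names in the translation table. The second half shows the extra translation step that becomes necessary once each layer insists on its own names.

```python
def data_access_fetch():
    # Back end: data access logic only, returns a row keyed by data name
    return {"order_id": 42, "order_total": 100}


def business_apply_rules(row):
    # Middle tier: business logic only, the names are left untouched
    row = dict(row)
    row["order_total"] = row["order_total"] * 120 // 100  # hypothetical rule
    return row


def present(row):
    # Front end: presentation logic only
    return ", ".join(f"{name}={value}" for name, value in row.items())


# The alternative: different names per layer, which needs a translation table
DB_TO_DOMAIN = {"order_id": "orderNumber", "order_total": "grossAmount"}


def translate(row, mapping):
    return {mapping[name]: value for name, value in row.items()}


fetched = data_access_fetch()
processed = business_apply_rules(fetched)
print(present(processed))
print(translate(fetched, DB_TO_DOMAIN))
```

Each function contains only the logic of its own layer, yet `order_total` is called `order_total` everywhere. The `translate` step adds nothing except another place for a name to go wrong.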

Your database schema is known to your domain layer

This little gem came from Brenden Vickery:

The problems here are that, the fact you are using a relational database is known, your database schema is known, ...

So what? Where does it say that it is wrong for an object to have knowledge of the underlying database schema? All the programming languages I have used in the past 25+ years have actually made it impossible for the code to be built around anything other than the physical database schema:

So where does it say in any OOP manual that a business object must not be constructed around a data schema which is the same as the physical database schema? Just because someone has invented some object-relational metadata mapping patterns (refer to Metadata Mapping, Query Object and Repository) which deal with the situation when they are different does not mean that they must be different. It is impossible to write software which does not, somewhere in its bowels, have knowledge of the physical database schema. As a pragmatic programmer it seems utterly stupid to introduce an additional arbitrary structure which then requires an additional mapping layer to convert from one structure to the other. By making the object schema the same as the physical schema I avoid this extra layer of complexity. As a follower of the KISS principle I seek to avoid unnecessary complexity whenever and wherever possible, so this bright idea is a prime candidate for the rubbish bin.

Where does it say that when your software communicates with a relational database that it must not know that it is communicating with a relational database? Knowledge is data, not code. Knowledge is information, not logic. While my presentation and business layers may have variables which can be traced back to an SQL database, it is only within my data access layer that you will find logic (program code) which performs the actual communication with the database.

Decades ago when relational databases were first being introduced the number of people who knew SQL was pretty small, so it was common for software development to have two teams - one writing program code and another writing SQL. Those days are long gone, and nowadays it is expected that anyone who writes software which uses a relational database is capable of writing SQL queries, just as a programmer who writes software for the web is capable of writing HTML.

Your approach is too simple

In a recent blog post someone made the following observation:

If you have one class per database table you are relegating each class to being no more than a simple transport mechanism for moving data between the database and the user interface. It is supposed to be more complicated than that.

Why on earth should it be more complicated than that? I am an old-timer at this game, and my experience tells me that the simplest approach is always the best approach. Before the name was changed to Information Technology (IT) this profession used to be known as Data Processing, and what we developed were called Data Processing Systems. The definition of a "system" is "something which transforms input into output", as shown in Figure 8:

Figure 8 - a system


Software is a system as data goes in, is processed, and data comes out. Sometimes the "processing" part of the system is nothing more than saving the data in a high-speed high-capacity storage mechanism (a database) so that it can be quickly retrieved and displayed to the user in more or less the same format that it went in. In other cases the data may be transformed or manipulated in some way before it is stored, and/or transformed or manipulated in some way before it is output. This would give rise to the situation shown in Figure 9:

Figure 9 - a data processing system


Every database application I have ever worked on, in whatever programming language my employer used at the time, has always started off as nothing more than a "simple data transport mechanism" between the user interface and the database. In order to become a usable application the programmer then has to insert code to process all the business rules, either at the input stage or the output stage, and it is this coding of the business rules which adds complexity. In an ideal world a programmer should have to spend as little time as possible on the simple stuff so that he has more time to spend on the complex stuff. I have built three frameworks in three different languages which were aimed at delivering the "simple stuff" as quickly as possible, thereby giving the programmer more time for the "complex stuff". My employer in 1985 liked my framework so much he made it the company standard. My fellow developers liked it because they did not have to spend as much time coding the boring bits. Our customers liked it as we could build applications quicker and therefore cheaper than our rivals.

I have always built my database first, then structured my software to match the database structure. Using an OO language where I can have a separate class for each database table and where the common code can be inherited from an abstract class has given me a framework which is far more productive than any of its predecessors. Because I generate my classes from my database tables I don't have to waste my time with OOD, and because my class structure is always in sync with my database structure I don't have to waste my time with an ORM. My framework takes care of a huge amount of the simple stuff, thus leaving me more time for the complex stuff. I'm not going to throw all that away just because you say it is too simple. My approach is not too simple, it is your approach which is too complicated.

The aim of the RADICORE framework is to build the "simple data transport mechanism" for each database table as quickly as possible. All the developer has to do for each user transaction is to code the processing rules, either at the input or output stages. This task is made easy by virtue of the fact that each table class contains empty methods at both stages, so it is a simple matter of deciding which code to put into which method. If your framework does not make it as simple as that then I would suggest that it is your framework which is too complicated and is in serious need of refactoring.
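The idea of empty methods at the input and output stages can be sketched as follows. This is Python, not RADICORE code, and the names `AbstractTable`, `insert_record`, `get_data`, `pre_insert` and `post_fetch` are hypothetical stand-ins. An in-memory list takes the place of the database.

```python
class AbstractTable:
    """Generic data transport: everything here is shared by every table class."""

    table_name = ""

    def __init__(self):
        self.storage = []  # stand-in for the database table

    def insert_record(self, row):
        row = self.pre_insert(dict(row))   # input-stage hook
        self.storage.append(row)
        return row

    def get_data(self):
        # output-stage hook applied to a copy of each stored row
        return [self.post_fetch(dict(row)) for row in self.storage]

    # Empty hooks: a table class fills in only the ones it needs
    def pre_insert(self, row):
        return row

    def post_fetch(self, row):
        return row


class Country(AbstractTable):
    table_name = "country"  # no business rules, so no code at all


class Person(AbstractTable):
    table_name = "person"

    def pre_insert(self, row):
        # hypothetical rule applied before the data is stored
        row["last_name"] = row["last_name"].strip().upper()
        return row

    def post_fetch(self, row):
        # hypothetical rule applied before the data is output
        row["full_name"] = f"{row['first_name']} {row['last_name']}"
        return row


person = Person()
person.insert_record({"first_name": "Fred", "last_name": " bloggs "})
print(person.get_data())
```

`Country` works as a plain transport mechanism with no code of its own, while `Person` adds its rules simply by overriding the two hooks it cares about.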


Conclusion

My approach to OOP causes some consternation among OO zealots who constantly claim that my approach is impure, unclean and should be banned in case it corrupts the minds of those with less experience. According to some I should even be banned from contributing to popular forums altogether or even hanged from the nearest tree. What is it that causes such animosity and hatred? What have I done to offend these people? It cannot be that my methods do not work, because I can clearly demonstrate that they do. It can only be that I have broken the rules which they consider to be sacred, and such sacrilege must not be allowed to go unpunished. They are like religious zealots who start foaming at the mouth if anyone dares to question their beliefs.

Their attitude seems to be:

Your methods are wrong because you have broken the rules.

Whereas my attitude is quite simple:

My methods cannot be wrong for the simple reason that they actually work. Something that works cannot be wrong just as something that does not work cannot be right.

I have not broken the rules as there is no such thing as a single set of rules that everybody must follow. I have simply broken your interpretation of the rules. As there appear to be many different interpretations of many different rules floating around the ether, who is to say which interpretation is right and which interpretation is wrong?

If these rules are open to so much interpretation (and mis-interpretation) then is it not the author's fault for creating rules which are so vague? Or is it because these rules are supposed to be no more than an outline of the major objectives, with the fine details left entirely up to the individual programmer within his particular implementation?

The purpose of the software developer is to develop software which works, not to develop software according to an arbitrary set of rules. It is results that count, not rules. I achieve better results without your rules, therefore I see no reason to be restricted by them.

Like any religion which is gradually corrupted over a period of time the principles of object oriented programming have been gradually corrupted in exactly the same way. The original principles behind OOP were described simply as encapsulation, inheritance and polymorphism (see definitions above), but with the passing years different interpretations have been proposed, and these re-interpretations have in their turn been subject to even more re-interpretation. The end result is a hodge-podge of misinterpretation, misrepresentation and misunderstanding, and is so far removed from the original concepts that it is a wonder that they can be used to produce anything workable at all.

One of the most common criticisms I receive about my approach to OOP is that it is "too simplistic". I have news for you guys - that's what the KISS principle is all about! It seems that some people deliberately avoid the simplest approach in order to make themselves look more clever than they really are. They seem to think that unless a solution is complicated, convoluted and obfuscated it cannot be much of a solution. As for me, there is the simple solution, or there is the stupid solution. If the simple solution works, is easy to implement and easy to maintain, then anything else is just plain stupid.

Basic misunderstandings

Here are some examples of the basic misunderstandings which cause confusion among the OO zealots:

  1. Encapsulation means enclosing information (properties and methods) about an object within a container or capsule. It does not mean that the information must be completely hidden, just the implementation. Take a look at the following:
  2. There is no limit to the size of the container (no more than n properties, no more than n methods), therefore the idea that multiple containers should be used where this arbitrary limit is breached is doing nothing more than breaking the fundamental idea behind encapsulation - one object, one container. Take a look at the following:
  3. The very idea that inheritance breaks encapsulation is just beyond belief.
  4. Some people do not understand the difference between the selection of data and the formatting of data which leads them to say that my method of pagination is wrong because I have pagination code in the wrong class. Take a look at the following:
  5. Some people do not understand what 'separation of responsibilities' actually means. I have a DAO which is responsible for all communication with the database, but because it requires information to be passed down from the business object before it can perform its responsibilities I am accused of breaking the 'separation of responsibilities' rule. "You have information about your database in a business object - that is against the rules!". Take a look at the following:
  6. Another source of confusion is the way that the logic in simple statements can be twisted to form a totally different meaning. This is explained more fully in Reverse Imperative Principle.

Reverse Imperative Principle

This is the method by which it is possible to take a simple sentence and, with a small change, completely reverse the logic. Take, for example, this common piece of pseudo-code:

if <condition> then <imperative statement>

Now, everybody knows that what this means is:

Yet why do some OO zealots seem to translate this as:

"Hang on," I hear you say, "Nobody can be that stupid!" Yet bear with me for a moment and follow this train of thought:

Does this sound familiar to anyone? This must be why I am told the following:

This principle may be familiar to others under the name Contraposition.

Other peculiar ideas

While surfing the web I occasionally come across articles containing statements with which I heartily disagree. I would like to share some of these with you.

In the article Why extends is evil the author makes the following statement:

The first problem is that explicit use of concrete class names locks you into specific implementations, making down-the-line changes unnecessarily difficult.

I have used concrete class names in my framework for many years and have never had any difficulty making down-the-line changes. In fact I have less difficulty now than I did previously with non-OO languages. Perhaps it is the way that I use inheritance which is superior to yours?

Later on he states the following:

In an implementation-inheritance system that uses extends, the derived classes are very tightly coupled to the base classes, and this close connection is undesirable.

Undesirable? In what way? In my framework every concrete table class is derived from an abstract table class. The abstract class is quite huge while the concrete classes (and I have hundreds, by the way) are quite small. They are small because 95% of the code is inherited from the abstract class. This is the way that OO programmers share code, how they make code reusable. If you are not getting the same results then you must be mis-using inheritance.
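A rough picture of why the concrete classes stay small: if the shared behaviour lives in the abstract class and is driven by each table's metadata, a concrete class need contain little more than that metadata. The Python sketch below uses hypothetical names (`AbstractTable`, `field_spec`, `validate`) and a deliberately tiny validation rule; it does not reproduce the framework's own classes.

```python
class AbstractTable:
    """Holds the code that every table class inherits."""

    table_name = ""
    field_spec = {}  # column name -> {"type": ..., "required": ...}

    def validate(self, row):
        # Generic validation driven entirely by the subclass's field_spec
        errors = {}
        for name, spec in self.field_spec.items():
            value = row.get(name)
            if value in (None, ""):
                if spec.get("required"):
                    errors[name] = "is required"
            elif not isinstance(value, spec["type"]):
                errors[name] = f"must be {spec['type'].__name__}"
        return errors


class Person(AbstractTable):
    # The concrete class supplies metadata only
    table_name = "person"
    field_spec = {
        "person_id": {"type": int, "required": True},
        "first_name": {"type": str, "required": True},
    }


class Product(AbstractTable):
    table_name = "product"
    field_spec = {
        "product_id": {"type": int, "required": True},
        "description": {"type": str},
    }


print(Person().validate({"person_id": "x"}))
print(Product().validate({"product_id": 7}))
```

Adding another table means adding another few lines of metadata, and every table gains any improvement made to `validate` in the abstract class.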

One of the people who commented on this article made the following statement:

OOP is not well suited to use in a Database application.

I disagree completely! Perhaps that is the case because of how you implement OO, but for me the opposite is true. I have been writing database applications for many years in several different non-OO languages, and ever since I switched to PHP with its OO capabilities I have found it infinitely easier. Perhaps it is the artificial rules that you follow which make it difficult? I don't follow those rules, therefore I don't experience any difficulty.


If the OO zealots can get confused with relatively simple concepts, is it any wonder they lose the plot completely when things get more complicated? They are so tied up in their fancy rules that they have completely forgotten the purpose behind OOP in the first place - to be able to create software quicker and with fewer bugs. I have managed to achieve this, but in order to do so I have found it necessary to draw on my past 25+ years of experience and reject the ridiculous rules of the OO zealots. If I can produce workable (and some would even say superior) results by breaking their rules, then what does it say about the quality of their rules? I do not appear to be alone with this opinion - take a look at the following:

So before you tell me again that I'm breaking one of your precious rules just answer these simple questions:

All the while you OO zealots keep inventing these ridiculous rules I shall exercise my God-given right to break them. That is, after all, the only way I can create acceptable software.


References


© Tony Marston
10th December 2004

http://www.tonymarston.net
http://www.radicore.org

Amendment history:

23 Aug 2013 Added Your approach is too simple.
10 Apr 2012 Added Other peculiar ideas.
26 Oct 2006 Added Multiple inheritance is not necessary.
15 Feb 2006 Added more details to section Each database table requires its own class
11 May 2005 Added section Don't waste time with Mappers
26 Feb 2005 Added section Understand what "abstraction" really means
19 Feb 2005 Added section Your database schema is known to your domain layer
Added section Reverse Imperative Principle
12 Feb 2005 Added section You have the wrongs levels of coupling/cohesion/dependency
Added section Your design is dependent on SQL
Added section You don't understand what '3 tier' means
Added section You have the same data names in every layer
22 Jan 2005 Added section Conclusion.
29 Dec 2004 Added section Use design patterns sparingly.
