What is Object Oriented Programming (OOP)?

By Tony Marston

3rd December 2006
Amended 29th April 2012

Introduction
What OOP is NOT
What is an Object Oriented language?
What OOP is
- Optional Extras
The difference between OOP and non-OOP
Practical Examples
- Encapsulation
- Inheritance
- Polymorphism
Popular misconceptions
- What Encapsulation is not
- What Polymorphism is not
Conclusion
References
Amendment History

Introduction

Quite often I see a question in a newsgroup or forum along the lines of: What is this thing called 'OOP'? What is so special about it? Why should I use it? How do I use it?. The person asking this type of question usually has experience of non-OO programming and wants to know the benefits of making the switch. Unfortunately most of the replies I have seen have been long on words but short on substance, full of airy-fairy, wishy-washy, meaningless phrases which are absolutely no use at all to man or beast.

Having created 1000's of programs using non-OO languages, and another 500+ using the OO features of PHP I feel more than qualified to add my own contribution to the melting pot. According to some OO 'purists' I am not qualified at all as I was not taught to do things 'their' way and I refuse to follow 'their' methods. My response to that accusation is that there is no such thing as 'only one true way' with OOP just as there is no such thing as 'only one true way' with religion. People tell me that my methods are wrong, but they are making a classic mistake. My methods cannot be wrong for the simple reason that they work, and anybody with more than two brain cells to rub together will tell you that something that works cannot be wrong just as something that does not work cannot be right. My methods are not wrong, they are simply different, and sometimes it is a willingness to adopt a different approach that separates the code monkeys from the engineers.

One reason why some people give totally useless answers is that it was what they were taught, and they do not have the intelligence to look beyond what they were taught. Another reason is that some of the explanations about OO are rather vague and can be interpreted in several ways, and if something is open to interpretation it is also open to a great deal of mis-interpretation. If you do not believe that there is widespread confusion as to what OO is and is not then take a look at Nobody Agrees On What OO Is. Even some of the basic terminology can mean different things to different people, as explained in Abstraction, Encapsulation, and Information Hiding. If these people cannot agree on the basic concepts of OOP, then how can they possibly agree on how those concepts may be implemented.


What OOP is NOT

As a first step I shall debunk some of the answers that I have seen. In compiling the following list I picked out those descriptions which are not actually unique to OOP as those features which already exist in non-OO languages cannot be used to differentiate between the two.

OOP is about modeling the 'real world'

OOP is a programming paradigm that uses abstraction to create models based on the real world. It provides for better modeling of the real world by providing a much needed improvement in domain analysis and then integration with system design.

Rubbish. OOP is no better at modeling the real world than any other method. Every computer program which seeks to replace a manual process is based on a conceptual software model of that process, and if the model is wrong then the software will also be wrong. The conceptual model is created as an analyst's view of the real world, and the computer software is based solely on this conceptual model. OOP does not guarantee that the model will be better, just that the implementation of that model will be different. You should also consider the fact that it would be totally impractical to model the whole of the real world as it is simply too vast and too complicated. It is only ever necessary to model those parts which are actually relevant to your current application, and it is the exercise of deciding what is and is not relevant which decides if your abstraction is correct.

The article Don't try to model the real world, it doesn't exist puts forward an interesting viewpoint.

The term "abstraction" is also open to interpretation, and therefore mis-interpretation, as discussed in Understand what "abstraction" really means. This is why some people's abstractions look more like the work of Picasso when what is required should look like the work of Michelangelo.

That is why it is possible to create software that does A, B and C but it is useless to the customer as it does not also do X, Y and Z. The real world may contain X, Y and Z but the analyst did not include it in his model either because he did not spot it or because the customer failed to mention it in his Specification Of Requirements (SOR). I know because I have encountered both situations in my long career.

Not everyone agrees that direct real-world mapping is facilitated by OOP, or is even a worthy goal; Bertrand Meyer argues in Object-Oriented Software Construction that a program is not a model of the world but a model of a model of some part of the world; "Reality is a cousin twice removed".

OOP is about code re-use

The power of object-oriented systems lies in their promise of code reuse which will increase productivity, reduce costs and improve software quality.

Rubbish. This implies that code re-use is possible in OOP and not possible in non-OOP. Using OOP does not guarantee that more reusable code will be available as reusability depends on how the code is written, not the language in which it was written. It is possible to produce libraries of reusable modules in any non-OO language (I know, because I was doing just that with COBOL in 1985) just as it is possible to produce volumes of non-reusable code in any OO language.

It does not matter on the capabilities of the language as it is possible to have the same block of code duplicated in 100 places in any language. It is also possible, in any language, to put that block of code into a reusable module and call that module from those 100 places.

One of the early promises of OOP that I heard many years ago was that it would be possible for a software vendor to produce a library of pre-written classes, and for other developers to use these "off the shelf" classes instead of creating their own custom versions and thus "re-inventing the wheel". This dream never materialised, which just goes to prove that OOP promises much but delivers little.

OOP is about modularity

The source code for an object can be written and maintained independently of the source code for other objects. Once created, an object can be easily passed around inside the system.

Rubbish. The concept of modular programming has existed in non-OO languages for many years, so this argument cannot be used to explain why OO is supposed to be better than non-OO. Just as it is possible in any language to hold the source code for an entire application in a single file, it is just as possible, in any language, to break that source code into smaller modules so that the source code for each module can be maintained and compiled independently of all other modules.

Besides, any software which consists of multiple classes is automatically "modular" as each class can be considered to be a self-contained "module". The critical factor is how well each module or class is designed.

OOP is about plugability

If a particular object turns out to be problematic, you can simply remove it from your application and plug in a different object as its replacement. This is analogous to fixing mechanical problems in the real world. If a bolt breaks, you replace it, not the entire machine.

Rubbish. This is the same as modularity where the source code for any individual module can be modified, recompiled and inserted into the application without having to touch any of the other modules.

OOP is about implementation hiding

By interacting only with an object's methods, the details of its internal implementation remain hidden from the outside world.

In the first place implementation hiding was never one of the aims of OOP, it is merely a by-product of encapsulation. The outside world can see the method names which can be used on a object, but not the code which exists behind those method names.

In the second place implementation hiding is not unique to OOP, nor did it suddenly appear because of OOP. In any language, whether it is object oriented, procedural, functional, or whatever, when you write a function or procedure (and remember that a class method is nothing more than a procedural function within a class) all you are exposing to the outside world, including programmers who write code which calls that function, is the function's signature, as in:

$return = functionName(arg1, arg2, ..., argX);           // procedural
$return = $object->functionName(arg1, arg2, ..., argX);  // object oriented

Here you are identifying three things:

The only thing which is not exposed is the code that is executed when the function is called. You know what the function does but not how it does it. You know what data goes in and what data comes out, but not what code is executed in the middle. In other words how that function is implemented, the actual code which is executed, is hidden from view. The documentation which comes with the function library should describe what the function does so that the programmer can decide if that function is the right one to call, and how to call it, but the actual code behind the function name is still hidden. The documentation may provide a listing of the source code, and the actual source code may be provided for the programmer to view and possibly modify, but as far as the function's signature goes the implementation is effectively hidden. This means that at any time you could install a new version of that function with a modified implementation and, provided that the function's signature did not change, you would not have to change any code which calls that function. This means that the implementation of a function could change at any time but the calling program would not know that it had changed. The implementation is "hidden", so how can the calling program possibly know that it has changed?

OOP is about information hiding

A lot of people assume the misunderstanding regarding implementation hiding then go one step further and say that because the data is part of the implementation then the data must be hidden as well. This is how they justify the use of the visibility options of public, private and protected which in turn necessitate the user of getters (accessors) and setters (mutators). These people do not realise that there is a fundamental difference between "implementation" and "information":

The act of encapsulation is supposed to put an entity's methods and data inside a capsule, but nowhere does it say that the walls of the capsule should be opaque. Nowhere does it say that the data inside the capsule should be hidden from view. Nowhere does it say that an object's data cannot be accessed directly without the use of a separate API. It is only the code behind the object's API which is hidden from view, not the data contained within the object itself.

It is possible to access an object's data in two ways:

While some programmers say that the use of getters and setters to access an object's data should be mandatory, there are others who consider this to be evil.

OOP is about the passing of messages.

Message passing is the process by which an object sends data to another object or asks the other object to invoke a method.

Rubbish. The way that an object's method is invoked in an OO language is identical to the way in which a function or procedure in a non-OO language is invoked. If the language supports both non-OO functions and object methods (as PHP does) the method of invocation is called "calling", not "message passing". In fact in some languages it is necessary to specify the word "call" when invoking a subroutine.

non-OO: $result = function(arg1, arg2, ...)
OO:     $result = $object->function(arg1, arg2, ...)

The result of each invocation is exactly the same - the caller is suspended while control is passed to the callee, and control is not returned to the caller until the callee has finished.

I have worked with messaging software in the past and I can tell you quite categorically that they are completely different:

As you can see the mechanics of activating a method in an object is exactly the same as activating a non-OO function and nothing like passing a message in a messaging system.

OOP is about separation of responsibilities.

Each object can be viewed as an independent little machine with a distinct role or responsibility.

Rubbish. It depends entirely on how the module was written, and not the language in which it was written. It is possible to write independent modules in a procedural language such as COBOL, just as it is possible to write non-independent modules in an OO language.

The problem with "separation of responsibilities" is that different people have a different interpretation as to what it actually means. To some people the database operations such as SELECT, INSERT, UPDATE and DELETE require their own objects whereas others (like myself) put them all together in a single data access object (DAO). Some programmers may have a separate DAO for each table in the database while others (like myself) may have a single DAO which can deal with any and all database tables for a specific DBMS. Before you can separate any responsibilities you must first identify what those responsibilities are, and this is a design decision which is totally separate from the language in which the design is ultimately implemented.

In all my many years of experience the only project that I have ever been involved in which failed to be implemented due to "technical difficulties" was one where the system architects were OO "experts" who knew everything there was to know (or so they thought) about this "separation of responsibilities". They designed a system around design patterns which had a different module for each responsibility, and this resulted in a design with at least ten layers of code between the UI and the database. This made the creation of new components far more complicated and convoluted than it need be, and it made testing and debugging an absolute nightmare. The result was far too expensive for the client, both in time and money, so he pulled the plug on the whole project and cut his losses. A pair of components which took 10 days to build using these "new fangled" OO techniques took me less than an hour to build using my "old fashioned" non-OO methods. So much for the superiority of OO.

Besides, any software which consists of multiple classes/modules automatically has "separation of concerns" as each class/module can be considered to be "concerned" with a particular entity. The critical factor is how well each class/module deals with the requirements of its entity.

OOP is easier to learn.

OOP is easier to learn for those new to computer programming than previous approaches, and its approach is often simpler to develop and to maintain, lending itself to more direct analysis, coding, and understanding of complex situations and procedures than other programming methods.

Rubbish. This is just marketing hype. Every new language/tool/paradigm is supposed to be better than everything else, but it rarely is. It is not what you use but how you use it that counts, and I have personally witnessed where an "old" language, when used by competent programmers, regularly outperformed a "new" language which was advertised as being more productive by several orders of magnitude.

A person's ability to learn something is often limited by the quality of the teachers or teaching materials, and I'm afraid that too much of what is being taught is too complicated, too inefficient, and more likely to lead to project failures than successes. Too often the teachers insist that there is "only one way" to do OOP, and that is where I most strongly disagree. I have successfully migrated to OOP by ignoring all these so-called "experts" and drawing on my years of experience with non-OO languages.

Someone once told me that OOP is not as simple as taking a procedural function and wrapping it in a class. I disagree. It *IS* that simple. The only "trick" is placing related functions in the same class (this is called encapsulation), then adjusting them to deal with the state which can be maintained within an object. The really clever thing that you can do with classes is to extend a parent or abstract class into a number of subclasses through inheritance. You can also have a function/method which is available in objects which are instantiated from different classes, which gives you polymorphism. I have seen too many examples where classes have been created with the wrong mix of functions, either related functions not being in the same class, or classes containing functions which are not actually related. I have seen inheritance over used so much that the resulting class hierarchy is really difficult to maintain and enhance. I have seen programmers use every OO feature or construct which is available in the language for no better reason than to impress other programmers with their ability to write obscure code, the theory being that the more obscure it is the more OO it is. They seem to think that if it is too simple then you are not doing it right. These people have obviously not heard of the KISS principle.

OOP is about actors and actions.

Object Oriented Programming is a mode of software development that modularizes and decomposes code authorship into the definition of actors and actions.

This is so vague it is meaningless, and therefore of absolutely no use at all.

OOP is all about late binding.

'Late' refers to the fact that the binding decisions (which binary to load, which function to call) are deferred as long as possible, often until just before the function is called, rather than having the binding decisions made at compile time (early).

Rubbish. Whether such binding takes place early or late does not separate OOP from non-OOP. It is possible to have a non-OO language which offers late binding, but that does not magically turn it into OO. Conversely, a language which supports classes, encapsulation, inheritance and polymorphism is suddenly not OO simply because it only offers early binding.


As you can see, the above descriptions are either too vague or not specific to OOP, so they cannot be used as distinguishing features.


What is an Object Oriented language?

A computer language can be said to be Object Oriented if it provides support for the following:

Class A class is a blueprint, or prototype, that defines the variables (data) and the methods (operations) common to all objects of a certain kind.
Object An instance of a class. A class must be instantiated into an object before it can be used in the software. More than one instance of the same class can be in existence at any one time.
Encapsulation The act of placing data and the operations that perform on that data in the same class. The class then becomes the 'capsule' or container for the data and operations.

Please also refer to What Encapsulation is not.

Inheritance The reuse of base classes (superclasses) to form derived classes (subclasses). Methods and properties defined in the superclass are automatically shared by any subclass.
Polymorphism Same interface, different implementation. The ability to substitute one class for another. This means that different classes may contain the same method names, but the result which is returned by each method will be different as the code behind each method (the implementation) is different in each class.

Please also refer to What Polymorphism is not.

A class defines (encapsulates) both the properties (data) of an entity and the methods (functions or operations) which may act upon those properties. Neither properties nor methods which can be applied to that entity should exist outside of that class definition.


What OOP is

This is a lot simpler than some people would like you to believe. They like to use the more complicated definitions because it makes them sound more intelligent than they really are. Here is the real definition:

Object Oriented Programming is programming which is oriented around objects, thus taking advantage of Encapsulation, Polymorphism, and Inheritance to increase code reuse and decrease code maintenance.

To do OO programming you need an OO language, and a language can only be said to be object oriented if it supports encapsulation (classes and objects), inheritance and polymorphism. It may support other features, but encapsulation, inheritance and polymorphism are the bare minimum. That is not just my personal opinion, it is also the opinion of the man who invented the term. In addition, Bjarne Stroustrup (who designed and implemented the C++ programming language), provides this broad definition of the term "Object Oriented" in section 3 of his paper called Why C++ is not just an Object Oriented Programming Language:

A language or technique is object-oriented if and only if it directly supports:
  1. Abstraction - providing some form of classes and objects.
  2. Inheritance - providing the ability to build new abstractions out of existing ones.
  3. Runtime polymorphism - providing some form of runtime binding.

Later versions of various OO languages have added more features, and some people seem to think that it is these additional features which decide if a language is OO or not. I totally disagree. It would be like saying that a car is not a car unless it has climate control and satnav. Those are optional extras, not the distinguishing features, and not using them does not make your code not OO. It would also be incorrect to say that a car is a car because it has wheels. Having wheels does not make something a car - a pram has wheels, but that does not make it a car, so having wheels is not a distinguishing feature.

That is why I say that such things as "modularity", "reusability" and "messaging" are not features which distinguish an OO language from a non-OO language for the simple reason that they already exist in some non-OO languages. That is why I say that using some of these later additions, these optional extras, to OO languages does not make your code "more" OO. If you write programs which are oriented around objects then you are an object oriented programmer. It's as simple as that.

Optional Extras

Among these "optional extras" which have nothing to do with a language being OO or not are:

As these are optional extras I am merely exercising the option to not use them. Some people say that my minimalist approach to OOP, the fact that I use nothing but encapsulation, inheritance and polymorphism, means that I am not a "proper" OO programmer. Yet with my approach I can still achieve high reusability and low maintenance, so why am I wrong?


The difference between OOP and non-OOP

A better way of trying to explain the differences between non-OO and OO programming is to actual examples.

They are defined differently

A function is defined as a self-contained block of code. Each function name "fName" must be unique within the application.

function fName ($arg1, $arg2) 
// function description
{
    ....
    
    return $result;
    
} // fName

A class method is defined within the boundaries of a class definition. Each class name "cName" must be unique within the application. Each class may contain any number of functions (also known as "methods"), and the function name "fName" must be unique within the class but need not be unique within the application. In fact, the ability for different classes to share common function/method names is a requirement of polymorphism.

class cName
{
    function fName ($arg1, $arg2) 
    // function description
    {
        ....
        
        return $result;
        
    } // fName
    
} // cName

They are accessed differently

It is important to note that neither a function nor a class can be accessed until the function/class definition has been loaded.

Calling a function is very straightforward:

$result = fName($arg1, $arg2);

Calling a class method is not so straightforward. First it is necessary to create an instance of the class (an object), then to access the function (method) name through the object. The object name must be unique within the application.

$object = new cName; 
$result = $object->fName($arg1, $arg2);

They have different numbers of working copies

A function does not have to be instantiated before it can be accessed, therefore only one copy (or instance) is said to exist at any one time.

A class method can only be accessed after it has been instantiated into an object (unless it has been defined as a static method, see below), and it is possible to create multiple instances (objects) of the same class with different object names.

$object1 = new cName;
$object2 = new cName;
$object3 = new cName; 

Although it is possible to access a static method without first creating an object, this is no better than accessing a non-class function. As it is not actually using an object it cannot be considered part of object oriented programming.

They have different numbers of entry points

A function has only a single point of entry, and that is the function name itself.

An object has multiple points of entry, one for each method name.

They have different methods of maintaining state

A function by default does not have state, by which I mean that each time that it is called it is treated as a fresh invocation and not a continuation of any previous invocation.

An object does have state, by which I mean that each time an object's method is called it acts upon the object's state as it was after the previous method call.

It is possible for both a function and a class method to use local variables, and they both operate in the same way. This means that the local variables do not exist outside the scope of the function or class method, and any values placed in them do not persist between invocations.

It is possible for a function to remember values between different invocations by declaring a variable as static, as in the following example:

function count () {
    static $count = 0;
    $count++;
    return $count;
}

Each time this function is called it will return a value that is one greater than the previous call. Without the keyword static it would always return the value '1'.

Class variables which need to persist outside of a function (method) are declared at class level, as follows:

class calculator
{
    // define class properties (member variables)
    var $value;
    
    // define class methods
    function setValue ($value) 
    {
        $this->value = $value;
        
        return;
        
    } // setValue
    
    function getValue () 
    // function description
    {
        return $this->value;
        
    } // getValue
    
    function add ($value) 
    // function description
    {
        $this->value = $this->value + $value;
        
        return $this->value;
        
    } // add
    
    function subtract ($value) 
    // function description
    {
        $this->value = $this->value - $value;
        
        return $this->value;
        
    } // subtract
    
} // calculator

Note that all class/object variables are referenced with the prefix $this-> as in $this->varname. Any variable which is referenced without this keyword, as in $varname, is treated as a local variable.

Note also that each instance of the class (object) maintains its own set of variables, so the contents of one object are totally independent of the contents of another object, even it is from the same class.


Practical Examples

Here are some practical examples which demonstrate Encapsulation, Inheritance and Polymorphism.

Encapsulation

Encapsulation The act of placing data and the operations that perform on that data in the same class. The class then becomes the 'capsule' or container for the data and operations.

Please also refer to What Encapsulation is not.

Every application deals with a number of different entities or "things", such as "customer" "product" and "invoice", so it is common practice to create a different class for each of these entities. At runtime the software will create one or more objects from each class definition, and when it wants to do something with one of these entities it will do so by calling the relevant method on the relevant object.

The data held within each object at runtime cannot remain in memory for ever, so it is written out to a persistent data store (a database) with a separate table for each entity. There are only four basic operations which can be performed on a database table (Create, Read, Update, Delete) so I shall start by creating a method for each one.

class entity1
{
    // class properties
    var $dbname;           // database name
    var $errors = array(); // array of error messages, indexed by field name          
    var $fieldarray;       // associative array of name=value pairs
    var $fieldspec;        // array of field specifications
    var $numrows;          // number of database rows affected
    var $primary_key;      // array of field names which make up the primary key
    var $tablename;        // table name
    
    // class methods
    function entity1 ()
    // constructor
    {
        $this->tablename   = 'entity1';
        $this->dbname      = 'foobar';
        
        $this->fieldlist   = array('column1', 'column2', 'column3', 'column4');
        $this->primary_key = array('column1');
        
    } // entity1
    
    // class methods
    function getData ($where)
    // read data from the database which satisfies the selection criteria in $where
    {
        ....
        
        return $this->fieldarray;
        
    } // getData
    
    function insertRecord ($fieldarray)
    // create a database record using the contents of $fieldarray
    {
        ....
        
        return $this->fieldarray;
        
    } // insertRecord
    
    function updateRecord ($fieldarray)
    // update a database record using the contents of $fieldarray
    {
        ....
        
        return $this->fieldarray;
        
    } // updateRecord
    
    function deleteRecord ($fieldarray)
    // delete a database record identified in $fieldarray
    {
        ....
        
        return $this->fieldarray;
        
    } // deleteRecord
    
} // entity1

Please note the following:

Each of these classes therefore acts as a 'capsule' which contains both the data for an entity and the operations which can be performed upon that data. This is 'encapsulation'.

Inheritance

Inheritance The reuse of base classes (superclasses) to form derived classes (subclasses). Methods and properties defined in the superclass are automatically shared by any subclass.

After writing and testing a class to deal with 'entity1' I copied it and made it work for 'entity2'. I then compared the two classes to see what code was common and could be shared, and what code was unique and could not be shared. I then transferred all the common code into a separate class known as a 'superclass'.

Firstly, to create the superclass, I changed the class name and the constructor to the following:

class Default_Table
{
    // class properties
    ....
    
    // class methods
    function Default_Table ()
    // constructor
    {
        $this->tablename   = 'unknown';
        $this->dbname      = 'unknown';
        
        $this->fieldlist   = array();
        $this->primary_key = array();
        
    } // Default_Table
    
    function getData ($where)
    
    ....
    
    function insertRecord ($fieldarray)
    
    ....
    
    function updateRecord ($fieldarray)
    
    ....
    
    function deleteRecord ($fieldarray)
    
    ....
    
} // default

This class cannot be used to instantiate a working object as it does not refer to a database table which actually exists, so it is what is known as an 'abstract' class.

Secondly, I altered each table class to remove the common methods and properties, and included the keyword extends to force inheritance from the superclass.

include 'default.class.inc';
class entity1 extends Default_Table
{
    function entity1 ()
    // constructor
    {
        $this->tablename   = 'entity1';
        $this->dbname      = 'foobar';
        
        $this->fieldlist   = array('column1', 'column2', 'column3', 'column4');
        $this->primary_key = array('column1');
        
    } // entity1
    
} // entity1

When a subclass is instantiated into an object that object will contain all the properties and methods of the superclass as well as those of the subclass. If anything has been defined in both the superclass and the subclass, then the definition from the subclass will be used.

In my current development environment the superclass contains several thousand lines of code, but there is only one copy of this code which is inherited by several hundred table classes. Inheritance is therefore a powerful mechanism for making one copy of common code accessible to many objects instead of having multiple copies of that common code.

Polymorphism

Polymorphism Same interface, different implementation. The ability to substitute one class for another. This means that different classes may contain the same method names, but the result which is returned by each method will be different as the code behind each method (the implementation) is different in each class.

Please also refer to What Polymorphism is not.

Polymorphism can only be employed where the same method names exist in several classes. The code within the method may be inherited from a parent class, or it may be totally different. This means that the same method can be used on different objects, but the results will be different.

For example, take a series of classes called 'Customer', 'Product' and 'Invoice'. One practice I have seen which makes polymorphism impossible is to incorporate the entity name into the method name, as in:

  1. getCustomer(), insertCustomer(), updateCustomer() deleteCustomer()
  2. getProduct(), insertProduct(), updateProduct() deleteProduct()
  3. getInvoice(), insertInvoice(), updateInvoice() deleteInvoice()

The problem with this approach is that the object (the controller in MVC) which communicates with each table object (the model in MVC) needs to know the method name before it can open up that channel of communication. If each model has a unique set of method names then it must have a unique set of controllers to communicate with it.

My approach is to use a standard set of method names for standard operations, as in:

  1. getData(), insertRecord(), updateRecord() deleteRecord()

This is made easier as these methods are defined in the superclass and made available to each subclass through inheritance.

The advantage of this is that I can have one standard controller for each standard function, and this controller can work with any table class in the system. This is far better than having a separate set of controllers for each table class.

Here is some example code from one of my controllers:

....
include "$table.class.inc";
$object = new $table;
$data = $object->getData($where);
....

The contents of $table and $where are made available at runtime.

The significant point is that the name of the class (database table) is not hard-coded into the controller, it is passed as an argument at runtime. Only the method names are hard-coded, but as these method names exist within every table class by being inherited from the superclass they will always work. So, if the class name is 'Customer' the controller will obtain data from the 'Customer' table, if it is 'Product' it will obtain data from the 'Product' table, and so on.


Popular misconceptions

I am often berated by my critics, of which there are more than few, for not understanding what OOP really means. This simply boils down to the fact that they have extended the basic principles of OOP to include their personal interpretations, and they have developed rules which govern how these interpretations should be implemented.

Let me make it quite clear that I do not care for these personal interpretations, and I certainly do not care for these rules.

What Encapsulation is not

I disagree with all the following statements:

  1. Encapsulation is about implementation hiding

    This is a meaningless statement as "implementation hiding" is not restricted to encapsulation, and neither is it restricted to OO languages. This was not the aim of encapsulation, nor is it a unique or distinguishing feature of encapsulation. It is a universal property of all languages and all paradigms, as described in OOP is about information hiding.

  2. Encapsulation is about information (data) hiding

    Just because "information" and "implementation" have similar sounds, it does not follow that they also have similar meanings. It is also untrue to say data is part of the implementation, therefore information hiding automatically means data hiding. This is wrong for the simple reason that implementation is code while information is data. Data is not code, so information is not implementation.

    If you think that I am the only person who thinks that encapsulation has nothing to do with information hiding then take a look at the following:

  3. You must use getters and setters to access your data

    As a consequence of the "Encapsulation means information/data hiding" rule some programmers insist that it also means that the visibility of each piece of data should be changed from "public" to "private". This means that instead of

    $foo = $object->foo;
    $object->foo = 'bar';
    
    I should use:
    $foo = $object->getFoo();
    $object->setFoo('bar');
    

    This is because the 'get' and 'set' methods (also known as 'accessors' and 'mutators') provide the opportunity to execute some additional code when the data is being retrieved or inserted.

    Considering that I do not even acknowledge that the rule exists in the first place, why should I be bound by additional consequential rules?

    It should also be noted by experienced programmers that there are not just two methods of retrieving data from and inserting data into an object. There is actually a third - on the method signature itself. Consider the following statement:

    $output = $object->doSomething($input);
    

    $input is data going in, and $output is data coming out. Neither of these variables can have their visibility downgraded from "public" for the simple reason that they are part of the method signature and therefore cannot be hidden. This means that I can put data into and get data out of an object without having to reference a class variable, public or not, hidden or not.

    There is also no rule that says each class property must be a scalar value, so I can use an array if I want to. This can be especially useful when dealing with data associated with relational databases as they deal with rows and columns, and arrays are the perfect mechanism as they can deal with any number of columns from any number of rows. It is possible for the sql SELECT statement to exclude some of the table's columns, or it may include columns from other tables via a JOIN. This method is much more flexible because I don't have to code the names of any columns in any getters and setters. The Controller object injects the entire HTTP request into the Model object as a single array, and the View object retrieves all the data from the Model as a single array.

  4. You must validate your data in the setter methods

    There is a golden rule in programming that you must never trust any data which is supplied by a user - it should always be validated or filtered before it gets written to the database. While I agree with this rule (now there's a surprise!) I do not like the follow-on rule which states As you must be using setter methods it follows that you must validate the user input within the setter method.

    I don't use setters therefore I cannot validate within setters. That does not mean that the data goes unvalidated, it just means that I perform my validation in a different manner. All data goes into the object as an array, and all data gets passed to the data access object (DAO) as an array so that it can be written to the database. But it only gets to the DAO after it has been validated, and that is done by passing the entire array through a validation object. If there are any validation errors then the whole array gets thrown back to the user with a suitable error message and never gets as far as the database.

  5. You must not have more than N methods in a class

    Some programmers say that if you have more than N methods in a class (where N is a completely arbitrary number) then you class is too big and unmanageable. They say that you should break that class down into smaller subclasses as they are easier for the programmer to get his brain around.

    Firstly, encapsulation requires that all the data and all the operations that can be performed on that data are placed in the same class. If you use multiple classes then you are breaking encapsulation.

    Secondly, if a class requires 100 methods then it requires 100 methods. If a programmer cannot deal with 100 methods in a single class then how can he possibly deal with all those methods spread across multiple classes with the additional complexity of extra code which would then be required to pass control to another class just to deal with another operation?

    Thirdly, you can take the idea of "lots of small methods" too far end up with an unmaintainable mess. As a prime example I was recently forced to use a certain email library, and when I downloaded it I found it had over 140 classes, each in its own file, spread across 24 directories and subdirectories. When I came to step through it with my debugger in order to track down what I thought was a minor problem I spent over 30 minutes stepping through line after line of code which didn't actually do anything useful. All it was doing was instantiating object after object and jumping from one object method to another. Most of these methods contained just a single line of code, and very little of this code was actually associated with constructing and sending the email. If that is your idea of "best practice", and you are teaching this idea to others, then all I can say is "God help us!"

What Polymorphism is not

The definition of polymorphism should be easily understood by everyone, yet there are some people who even manage to screw this up. I recently answered a question in the 'Dynamic form generation' thread in the comp.lang.php newsgroup in which a comedian called Jerry Stuckle said:

You've got a *partial* definition. Polymorphism is only applicable when the two classes have a parent/child hierarchy, and the child class has a method of the same name (and in some languages, the same parameter list) as the parent. When there is no parent/child relationship (as in the case of two different database tables), there is no polymorphism.
This was my reply:
The definition of polymorphism does NOT state that the classes have to exist in a parent/child hierarchy, only that they have the same method signature. Having said that, it is usually the case that the two classes ARE related. You obviously haven't used an abstract table class which is inherited by every concrete table class. All my concrete table classes have instant access to all the methods and properties which are defined just once in the abstract class.
To which Jerry responded:
I've used them much longer than you've even known they existed. When you have a class derived from the abstract class, there is a parent/child relationship. But there is no such relationship between two classes derived from the same one. Once again you show you have no knowledge of OO. Polymorphism cannot exist without inheritance - which requires a parent/child hierarchy.
This was my reply:
Yes it can. Polymorphism simply requires that the same interface exist in more than one class. That may come from inheritance, or it may not. It *IS* possible to define the same interface more than once without inheritance. Each one of the 200 concrete table classes in my application is derived from the same abstract table class, so each one of those 200 classes is a sibling of the other, and "sibling" implies a relationship.
To which Jerry responded:
So there is a sibling relationship? Once again you prove how you don't understand the concept of polymorphism.
Later on he said:
You can have the same interface in more than one object WITHOUT inheritance, but that is not polymorphism.

So, according to Jerry Stuckle, a self-proclaimed "expert", polymorphism is restricted by the following rules:

If multiple classes have the "same interface, different implementation" then the conditions for polymorphism exist whether you like it or not. If you have invented additional rules then you are wrong for inventing those rules. I am most definitely *NOT* wrong for refusing to follow those additional rules.


Conclusion

Many people use different words to describe what OOP is supposed to mean, but the problem with words is that they are slippery. Like Humpty Dumpty proclaimed in Lewis Carroll's Through the Looking Glass:

When I use a word, it means just what I choose it to mean -- neither more nor less.

If you take the words used by the originators of OOP and apply different meanings to those words, then others take your words and apply different meanings to them, then you can end up with something which is nothing like the original, as immortalised in that children's game called Chinese Whispers.

There are only three features which really differentiate an Object Oriented language from a non-OO language, and these are Encapsulation, Inheritance and Polymorphism. Everything else is either bullshit or hype. Object Oriented Programming is therefore the use of these features in a programming language. High reusability and low maintainability cannot be guaranteed - that depends entirely on how these features are implemented.

Some people accuse me of having a view of OOP which is too simplistic, but instead of saying that my view is "more simple than it need be" surely it can also mean that their view is "more complex than it need be"? As a long-time follower of the KISS principle, which has a more modern variant in Do The Simplest Thing That Could Possibly Work. I know which view I prefer, and I also know which view is easier to teach to others.


References


© Tony Marston
3rd December 2006

http://www.tonymarston.net
http://www.radicore.org

Amendment history:

29 Apr 2012 Added Popular misconceptions.
10 Apr 2012 Added Optional Extras.
17 Jun 2010 Amended What OOP is to include a reference to a definition provided by Bjarne Stroustrup.

counter