How to handle a Many-to-Many relationship with PHP and MySQL

Posted on 21st May 2003 by Tony Marston

Amended on 2nd December 2023

Intended Audience

This tutorial is intended for those developers who come across the situation where they need to cater for what is known as a 'many-to-many' relationship. It will explain how to generate the database tables to deal with such a relationship, and how to design HTML forms to display and update the contents of these tables.

This situation can often catch the inexperienced designer/developer unawares, often leading to clumsy or unworkable solutions. This tutorial will show a technique that has been successful for me for many years.

Standard Many-to-Many relationship

What exactly is a many-to-many relationship? When your user tries to explain to you how a pair of business objects are related he might say "many of these can be related to many of those, and many of those can be related to many of these." If you attempt to draw such a relationship in an Entity Relationship Diagram (ERD) you will probably end up with something like Figure 1:

Figure 1 - Many-to-Many (wrong)

Those of us who are more experienced in database design will immediately see this as a totally illegal relationship. The only sort of relationship which is possible is a 'one-to-many', so the novice might redraw his ERD as shown in Figure 2:

Figure 2 - Many-to-Many (wrong)

That may look better, but again it is impossible to implement in any RDBMS that I have come across. You can have A-to-B in a 'one-to-many' relationship, or B-to-A in a 'one-to-many' relationship, but you cannot have both at the same time. So what are we to do?

The solution is to create a new table which will act as the 'many' in two 'one-to-many' relationships, with each of the original tables acting as the 'one'. This is shown in Figure 3, with the new table identified as 'X'.

Figure 3 - One-to-Many x 2

Table 'X' is sometimes referred to as a 'link', 'intersection' or 'xref' (cross-reference) table.

Multiple relationships with the same Parent

It is also possible to have a structure where the child table is related to the same parent table more than once, as shown in Figure 4:

Figure 4 - multiple relationships

There are two variations:

Recursive - In this structure each entry on table 'X' signifies a relationship between two entries on table 'A' where one entry is the parent or senior while the other entry is the child or junior. It is recursive by virtue of the fact that a child in one relationship can also be the parent in another, which can produce a hierarchy which is many levels deep. As well as allowing a parent to have many children this structure also allows a child to have many parents. Paths through the hierarchy always starts with a "root" which does not have any parents, "branches" which have both parents and children, and terminating with "leaves" where each "leaf" has parents but no children. An example would be a a Bill Of Materials (BOM) where a product is comprised of a number of other products called components. Each component product could also be comprised of components of its own.
There are several rules in this structure:
- Entries on the child table are optional as a product may not be comprised of separate components.
- Both foreign keys are required.
- No entry can be related to itself - the two foreign keys can never point to the same parent entry.
- No hierarchy can loop back on itself, which means that any path from root to branch to leaf cannot have the same key of table 'A' appearing in more than one branch as it would then form an infinite loop and be impossible to find the bottom of the hierarchy.
Non-recursive - This is where the child table 'X' is not an intersection table but a separate entity where each link to the same parent 'A' is for a different purpose. An example would be where a SALES-ORDER (the child) has links to several addresses (the parent), one being the primary address and another being an optional delivery address.
The rules of this structure are simpler:
- The table contains other columns in additional to the foreign keys.
- Each foreign key may be either required or optional.
- Each foreign key may point to the same entry on the parent table as another foreign key.

Self-Referencing

It is also possible for a table to contain a foreign key which points back to itself.

Figure 5 - Self-Referencing Hierarchy

This has a self-referencing relationship in that the foreign key points to the primary key of a different row on the same table. Each row in table 'A' can be related to no more than one other row in table 'A'. This identifies the row's parent, which is optional as some rows will not have a parent. A typical example of this structure would have PERSON as table 'A' where the foreign key points to that person's manager who is another person. The person at the highest level, the top of the tree (or the hierarchy's "root"), will not have a manager. This would then produce a management or organisational hierarchy.

There are several rules in this type of relationship:

The foreign key must be optional otherwise it will be impossible for the first entry to be related to anything.
No entry can be related to itself - the foreign key can never be the same as the primary key.
While it is possible for the child in one relationship to be the parent in another, it must not be possible for the path through a hierarchy to loop back on itself otherwise it will be impossible to find the bottom of the hierarchy.

Database schema

Database schema - standard

This would be for the structure shown in Figure 3. Let us assume that the existing tables, 'A' and 'B', are defined as follows:

CREATE TABLE `a` (
  `a_id` varchar(6) NOT NULL,
  `a_desc` varchar(40) NOT NULL,
  PRIMARY KEY  (`a_id`)
)

CREATE TABLE `b` (
  `b_id` varchar(6) NOT NULL,
  `b_desc` varchar(40) NOT NULL,
  PRIMARY KEY  (`b_id`)
)

This new table must be constructed to allow the following:

It must have a column which links back to table 'A'.
It must have a column which links back to table 'B'.
It must allow no more than one row to exist for any combination of rows from table 'A' and table 'B'.
It must have a primary key.

In order to satisfy all these rules all I need in table 'X' are two columns, 'a_id' and 'b_id'. I do not need an additional column for a primary key as I can use the existing two columns. This produces a database schema as follows:

CREATE TABLE `x` (
  `a_id` varchar(6) NOT NULL,  (links to a.a_id)
  `b_id` varchar(6) NOT NULL,  (links to b.b_id)
  PRIMARY KEY  (`a_id`,`b_id`)
)

Note here that I have created a primary key from two columns, not one. Some naive database designers insist that every database table should have a single-column technical primary key called 'id' which obtains its value from an automatically-incrementing internal counter. In MySQL this can be done with the 'auto-increment' option. While I agree that there are places where a technical primary key is definitely beneficial, in my years of experience I have also encountered instances where a technical primary key is not only unnecessary, it is actually detrimental. A cross-reference table is one of those places.

Also note that it is not necessary to create a separate index for 'a_id' as this is already covered by virtue of the fact that it forms the leading portion of the primary key. A separate index on 'b_id' might be of benefit depending on the volume of rows in the table and the frequency of searches on that column.

Database schema - multiple, recursive

This would be for the structure described in Multiple relationships with the same Parent (recursive).

CREATE TABLE `a` (
  `a_id` varchar(6) NOT NULL,
  `a_desc` varchar(40) NOT NULL,
  PRIMARY KEY  (`a_id`)
)

CREATE TABLE `x` (
  `a_id_snr` varchar(6) NOT NULL,  (links to a.a_id)
  `a_id_jnr` varchar(6) NOT NULL,  (links to a.a_id)
  PRIMARY KEY  (`a_id_snr`,`a_id_jnr`)
)

Note here that the recursive hierarchy is defined using the entries on table 'X' where the two foreign keys include a suffix which identifies the senior-to-junior/parent-to-child relationship. This makes it possible to traverse the hierarchy both downwards from parent to child and upwards from child to parent. Because it is a recursive structure the entire hierarchy can be retrieved using a recursive query in SQL.

Note also that both foreign keys are marked as NOT NULL. The two foreign keys are joined together to form the primary key as this will disallow the same combination of a_id_snr and a_id_jnr from being added more than once. It is not necessary to have a separate primary key called x_id as it would be completely redundant.

Database schema - multiple, non-recursive

This would be for the structure described in Multiple relationships with the same Parent (non-recursive).

CREATE TABLE `a` (
  `a_id` varchar(6) NOT NULL,
  `a_desc` varchar(40) NOT NULL,
  PRIMARY KEY  (`a_id`)
)

CREATE TABLE `x` (
  `x_id` varchar(6) NOT NULL,
  `a_id_primary` varchar(6) NOT NULL,  (links to a.a_id)
  `a_id_delivery` varchar(6) NULL,     (links to a.a_id)
  PRIMARY KEY  (`x_id`)
)

Note that the two foreign keys include a suffix which identifies their purpose. Note also that a_id_primary is required while a_id_delivery is optional and in this example should only be specified if it is different from a_id_primary.

Self-Referencing

This would be for the structure described in Self-Referencing.

CREATE TABLE `a` (
  `a_id` varchar(6) NOT NULL,
  `a_desc` varchar(40) NOT NULL,
  `a_id_snr` varchar(6) NULL,  (links to a.a_id)
  PRIMARY KEY  (`a_id`)
)

In this structure it is not possible to identify that the entry is a parent in a relationship, only that it is a child. Each entry is therefore limited to only having a single parent. Note that the foreign key a_id_snr is optional as not all entries will be the child in a relationship. The entry at the top of the hierarchy (the "root") will not have a parent. Additionally no entry should be related to itself, so a_id_snr should never be the same as a_id.

Form design

Having designed the database we must now turn our attentions to designing the forms that will manipulate the data. For this I will suggest two alternatives although experienced developers may be able to suggest more.

Simple

In this design, which implements the LIST2 pattern, I have 4 simple forms which are interconnected as shown in Figure 5:

Figure 5 - Forms in the LIST2 pattern

We start off with the form Browse 'A' which enables the user to browse through all occurrences of table 'A'. During the processing of this form the user will be able to select an occurrence of 'A', then press a control (a button or a hyperlink) which will pass the identity of the selected occurrence to a second form which I have identified as Browse 'X' for selected 'A'.

This second form will show which occurrence of table 'A' has been selected, then list the current contents of table 'X' for that occurrence. For each occurrence of table 'X' it will include the relevant description from table 'B'. The MySQL query to achieve this is as follows:

SELECT x.a_id, x.b_id, b.b_desc
FROM x
LEFT JOIN b ON (b.b_id = x.b_id)
WHERE (x.a_id = 'whatever')

A similar result may also be achieved using the following:

SELECT x.a_id, x.b_id, b.b_desc
FROM x, b
WHERE (x.a_id = 'whatever') AND (b.b_id = x.b_id)

The difference between these two SQL queries will only become apparent if an occurrence of table 'B' is missing for an occurrence of table 'X'. With the first query all columns selected from table 'B' will be returned as empty. With the second query the effect will be that the occurrence of 'X' will be dropped from the selection.

The second form can only show existing occurrences of table 'X'. In my design this form would have controls to invoke a third form to Add to 'X' or a fourth form to Delete from 'X'. Note that this pattern can only add or delete from 'X' one row at a time.

Complex

An alternative to this combines the facilities of forms (2), (3) and (4) into a single more complex form, as shown in Figure 6:

Figure 6 - Forms in the LINK1 pattern

This modified form will still show which occurrence of table 'A' has been selected, but it will then list all occurrences of table 'B'. Against each occurrence of table 'B' will be a checkbox to indicate if an associated occurrence exists on table 'X' or not. The user will be able to toggle the checkbox in order to create or delete the entry on table 'X'. The MySQL query to select the data is as follows:

SELECT b.b_id, b.b_desc,
'whatever' AS a_id  
CASE WHEN x.b_id IS NULL THEN 'F' ELSE 'T' END AS selected 
FROM b 
LEFT JOIN x ON (x.a_id = 'whatever' AND x.b_id = b.b_id)
ORDER BY b.b_desc

Notice here that I am not including table 'A' in the query, so I am supplying its primary key, which is also a foreign key on table 'X', as a literal value. Notice also that the value is built into the ON clause of the LEFT JOIN and not as part of the WHERE clause.

As an added refinement I use a third form to allow the entry of selection criteria so that the contents of the second form can be refined. This is usually more relevant in those cases where there are more columns to be displayed, but it does provide the ability to select only those occurrences of 'B' where associated occurrences of 'X' either do or do not exist. This is accomplished by altering the WHERE clause to be one of the following:

WHERE selected='T'

WHERE selected='F'

Complex update

If you thought that the code to retrieve the current data from the database was tricky then what will you think about the code needed to update the database after the user has made his changes? For this I have a standard procedure which has as its input two arrays:

$fieldarray which has an entry for each row displayed in the form, and for each row it contains the names and values for the primary key required for table 'X'.
$selectarray which has an entry for the checkbox on each of the rows. Note that this will only contain entries where the checkbox is ON. If it has been checked OFF then the $_POST array will not contain an entry for that row.

I shall now describe the code to process these two arrays.

I begin by looping through each row that was displayed in the form and initialise two string variables:

foreach ($fieldarray as $rownum => $rowdata) {
   $insert = NULL;
   $delete = NULL;

Each row provides me with the names and values for the primary key, so I can move their details into the two string variables.

   foreach ($rowdata as $fieldname => $fieldvalue) {
      $insert .= "$fieldname='$fieldvalue',";
      $delete .= "$fieldname='$fieldvalue' AND ";
   } // foreach

When there are no more fields left I can trim the unwanted ',' and ' AND '.

   $insert = rtrim($insert, ',');
   $delete = rtrim($delete, ' AND ');

Now I examine the contents of the checkbox in $selectarray and construct the SQL query to either create the entry if the checkbox is ON or delete the entry if the checkbox is OFF:

   if (isset($selectarray[$rownum])) {
      $query = "INSERT INTO $tablename SET $insert";
   } else {
      $query = "DELETE FROM $tablename WHERE $delete";
   } // if

Finally I execute the query and check for errors. Note that I ignore errors concerning duplicate entries. This is caused by a checkbox being ON originally and not being changed to OFF by the user:

   $result = @mysql_query($query, $dbconnect);
   if (mysql_errno() <> 0) {
      if (mysql_errno() == 1062) {
         // ignore duplicate entry
      } else {
         trigger_error("SQL", E_USER_ERROR);
      } // if
   } // if
} // foreach

Transaction Patterns

There are several examples of many-to-many relationships within the RADICORE framework, so I have created several different Transaction Patterns in order to make it easier to create them. As shown in the Introduction the DBMS does not actually support a many-to-many relationship, instead it has to be implemented as a pair of one-to-many relationship utilising not two tables but three, as shown in Figure 3. There will be one relationship between table "A" and table "X" and a second relationship between table "B" and table "X".

LIST2

This is the simplest pattern. It contains two entities - a parent/outer and a child/inner - as shown in Transaction Patterns - List2. In this screen both the parent and child entities are read-only, so child rows can only be added or deleted one at a time using separate tasks which can be activated using navigation buttons.

Two versions of this pattern will be required, the first which has table "A" as the parent, and a second with table "B" as the parent. In both cases the child will be table "X". The following combinations exist in the MENU subsystem:

Task	Parent entity	Child entity
List Role by User	MNU_USER	MNU_USER_ROLE
List User by Role	MNU_ROLE	MNU_USER_ROLE
List Menu Items	MNU_TASK_SNR	MNU_MENU
List Menu Parents	MNU_TASK_JNR	MNU_MENU_SNR
List Navigation Buttons	MNU_TASK_SNR	MNU_NAV_BUTTON_JNR
List Navigation Button Parents	MNU_TASK_JNR	MNU_NAV_BUTTON_SNR

LINK1

This has the screen structure shown in Transaction Patterns - Link1. It does not have navigation buttons to add or remove entries one at a time, instead it uses the SELECT checkbox on each visible row to update the database.

On initial entry it will show the selected occurrence from table "A" and all possible occurrences from table "B", and the SELECT checkbox in each row of the child area will indicate if a corresponding row exists on table "X" or not. If a row exists the checkbox will be ON, but if a row does not exist then the checkbox will be OFF. Rows can be added or deleted simply by toggling the checkbox. Note that you can toggle the checkboxes on multiple rows before pressing the SUBMIT button. You can use the associated SEARCH screen to change the selection criteria to only show those entries where the SELECT checkbox is currently ON, or alternatively only show those entries where the SELECT checkbox is currently OFF.

This pattern is useful when the intersection/link table does not contain any columns other than those for the two foreign keys as there is no mechanism to update any rows, only to add or delete them.

Task	Parent entity	Child entity	Link entity
Link Role(s) to selected Task	MNU_TASK	MNU_ROLE	MNU_ROLE_TASK
Link Task(s) to selected Role	MNU_ROLE	MNU_TASK	MNU_ROLE_TASK
Maintain Menu Items (2)	MNU_TASK_SNR	MNU_TASK_JNR	MNU_MENU
Maintain Navigation Buttons (2)	MNU_TASK_SNR	MNU_TASK_JNR	MNU_NAV_BUTTON

MULTI2

This has the screen structure as shown in Transaction Patterns - Multi2. It is different from the LIST2 pattern in that it does not have a separate UPDATE task as columns on multiple rows can be amended in the child area.

Task	Parent entity	Child entity
Maintain Menu Items (1)	MNU_TASK_SNR	MNU_MENU
Maintain Navigation Buttons (1)	MNU_TASK_SNR	MNU_NAV_BUTTON

TREE View

This collection of patterns was designed to view the contents of a recursive hierarchy shown in Figure 4. This is known as a "tree" view as it is capable of show an entire hierarchy from its root down through all the branches and only stopping at the leaves. There are three variations - TREE1, TREE2 and TREE3 - which only differ in the way that each root is chosen.

As an alternative to the HTML screen the OUTPUT6 pattern will export the details to a CSV file which can then be viewed in a spreadsheet program.

Note that these patterns are only for viewing the hierarchy. The children of a chosen parent should be maintained using the standard LIST2 pattern.

Summary

I first designed this solution to the many-to-many problem several years ago while developing in a different language. It was an interesting exercise to convert it to PHP and MySQL.

In my long career I have also encountered variations of this problem. In this solution table 'X' contains nothing but the foreign key fields which link up to the two related tables, but this can be made more complicated by the following:

There may be additional fields on table 'X', which means that the database update routine will have to deal with more than simply choosing between an insert or a delete.
A more complicated version is where table 'X' can contain multiple occurrences for each combination of 'a_id' and 'b_id'. The example I have encountered several times is where each entry covers a different date range causing only one to be regarded as 'current' on any particular date, thus providing a history of values over a period of time.

Amendment history:

02 Dec 2023	Split Multiple relationships with the same Parent and Self-Referencing into separate sections. Added Transaction Patterns
02 Feb 2019	Updated the COMPLEX section to replace the CROSS JOIN with a more efficient query.

counter

Tony Marston's Blog About software development, PHP and OOP

How to handle a Many-to-Many relationship with PHP and MySQL