Identity Map pattern

I first read about the identity map pattern in Patterns of Enterprise Application Architecture by Martin Fowler. The pattern affects the data access layer of an application and helps to avoid inconsistencies in data objects of your application.

Since yesterday's alpha release of the PersistentObject package in eZ Components, we have an implementation of Identity Map. In this article I want to give you a rough overview of the pattern itself and show you how you can test and use this functionality of PersistentObject.

Identity Map

Typically you have a central layer in your application which abstracts the database access and encapsulates retrieval, storage and manipulation of data objects. Because the modules of your application are loosely coupled, different modules might access the same data without knowing about each other.

While this typically means overhead in terms of SQL queries and therefore database load, there is the much larger problem of inconsistent data inside your application: If module A manipulates a data object, while module B uses it, too, the data objects are out of sync. Module B still works with the old data.
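The problem can be reproduced in a few lines of plain PHP. This is only a minimal sketch: SketchUser, naiveLoad() and the $rows array are hypothetical stand-ins for a data object, a data access layer without identity management, and a database table. None of them is part of eZ Components.

```php
<?php
// Hypothetical data object, not part of eZ Components.
class SketchUser
{
    public $id;
    public $name;

    public function __construct( $id, $name )
    {
        $this->id   = $id;
        $this->name = $name;
    }
}

// Stand-in for a database table: ID => row.
$rows = array( 23 => array( 'name' => 'alice' ) );

// A naive loader that creates a fresh object on every call.
function naiveLoad( array $rows, $id )
{
    return new SketchUser( $id, $rows[$id]['name'] );
}

// ... in module A ...
$userA = naiveLoad( $rows, 23 );
// ... in module B ...
$userB = naiveLoad( $rows, 23 );

// Module A changes its copy, module B never sees the change.
$userA->name = 'bob';

var_dump( $userA === $userB ); // bool(false)
echo $userB->name, "\n";       // alice
```

Both variables describe the same database record, but they are two independent objects, so the change made in module A is invisible to module B.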

To avoid this, you can make your data access layer aware of object identities, using an Identity Map implementation. That means that whenever a data object is to be loaded, the layer first determines whether it already knows this object. If it does, it does not create a new instance, but returns the existing one. If the desired object does not exist yet, it is created and recorded for later use.
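The lookup described above can be sketched in plain PHP as follows. This is an illustration of the pattern under simplifying assumptions, not the PersistentObject implementation: SketchIdentityMap, SketchUser, loadUser() and the $rows array are hypothetical names, and the "database" is just an array.

```php
<?php
// Hypothetical data object, not part of eZ Components.
class SketchUser
{
    public $id;
    public $name;

    public function __construct( $id, $name )
    {
        $this->id   = $id;
        $this->name = $name;
    }
}

// Minimal identity map: remembers one instance per class name and ID.
class SketchIdentityMap
{
    private $identities = array();

    // Returns the recorded instance, or null if none is known.
    public function get( $class, $id )
    {
        return isset( $this->identities[$class][$id] )
            ? $this->identities[$class][$id]
            : null;
    }

    // Records an instance under its class name and ID.
    public function set( $object )
    {
        $this->identities[get_class( $object )][$object->id] = $object;
    }
}

// Loader that consults the map before creating a new object.
function loadUser( SketchIdentityMap $map, array $rows, $id )
{
    $user = $map->get( 'SketchUser', $id );
    if ( $user !== null )
    {
        return $user; // Already known: reuse the existing instance.
    }
    $user = new SketchUser( $id, $rows[$id]['name'] );
    $map->set( $user ); // Record it for later use.
    return $user;
}

$map  = new SketchIdentityMap();
$rows = array( 23 => array( 'name' => 'alice' ) );

$userA = loadUser( $map, $rows, 23 );
$userB = loadUser( $map, $rows, 23 );
$userA->name = 'bob';

var_dump( $userA === $userB ); // bool(true)
echo $userB->name, "\n";       // bob
```

Because both calls return the same instance, a change made through one variable is immediately visible through the other.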

PersistentObject

The PersistentObject component of the eZ Components library provides you with a data access abstraction for your application. Similar to the Data Mapper pattern (also described by Fowler), it allows you to access your data objects in a unified way, without writing much SQL for typical tasks.

If you are not familiar with the component yet, I suggest taking a look at the PersistentObject tutorial on the eZ Components website. The general usage of this component is out of scope here.

The latest alpha of PersistentObject is available through our PEAR channel. If you have not discovered this channel yet, use the following command:

$ pear channel-discover components.ez.no

After that, to install the PersistentObject alpha version, use:

$ pear install -f -a ezc/PersistentObject

This command will install the alpha version of PersistentObject for you and all components it depends on. If you already use a version of PersistentObject via PEAR, you can update as follows:

$ pear upgrade -f -a ezc/PersistentObject

Beware: This is currently an alpha release and is only meant for testing purposes. Do not use this in a critical production environment!

Activate identity management

The Identity Map pattern is implemented in PersistentObject as a decorator for the central session class. Assuming that you already have an instance of ezcPersistentSession configured in the variable $session, you can instantiate an identity session as follows:

<?php
$idMap = new ezcPersistentBasicIdentityMap( $session->definitionManager );
$idSession = new ezcPersistentIdentitySession( $session, $idMap );
?>

The identity session uses the existing session object for database access. In addition, it uses an instance of an implementation of the ezcPersistentIdentityMap interface. This object performs the actual management of object identities. The ezcPersistentBasicIdentityMap shown above serves the original purpose of an Identity Map: keeping references to object identities in memory.
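To illustrate the decorator idea in isolation, here is a minimal plain PHP sketch. It is not the eZ Components code: SketchSession, SketchDbSession, SketchIdentitySession and SketchAccount are hypothetical names, the inner session only counts its calls instead of querying a database, and the map is folded into the decorator as a simple array rather than being a separate object.

```php
<?php
// Hypothetical session interface, not the eZ Components API.
interface SketchSession
{
    public function load( $class, $id );
}

// Stand-in for the real session: counts "queries" instead of running SQL.
class SketchDbSession implements SketchSession
{
    public $queryCount = 0;

    public function load( $class, $id )
    {
        $this->queryCount++;
        $object = new $class();
        $object->id = $id;
        return $object;
    }
}

// Decorator: same interface, but known identities are served from memory.
class SketchIdentitySession implements SketchSession
{
    private $session;
    private $identities = array();

    public function __construct( SketchSession $session )
    {
        $this->session = $session;
    }

    public function load( $class, $id )
    {
        if ( !isset( $this->identities[$class][$id] ) )
        {
            // Unknown identity: delegate to the decorated session once.
            $this->identities[$class][$id] = $this->session->load( $class, $id );
        }
        return $this->identities[$class][$id];
    }
}

// Hypothetical data object.
class SketchAccount
{
    public $id;
}

$dbSession = new SketchDbSession();
$idSession = new SketchIdentitySession( $dbSession );

$first  = $idSession->load( 'SketchAccount', 23 );
$second = $idSession->load( 'SketchAccount', 23 );

var_dump( $first === $second );    // bool(true)
echo $dbSession->queryCount, "\n"; // 1
```

Since the decorator exposes the same load() method as the session it wraps, calling code does not need to know which of the two it is talking to.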

If you feel the need for a more advanced solution, which e.g. caches object identities using Memcached or similar, you can simply implement the interface and use your own implementation.

Identity Map in action

This section shows a few examples of the identity session object in action. For each example, assume that $idSession contains the identity session object that was created in the previous section.

The identity session can simply be used like the original instance of ezcPersistentSession. That means, you can transparently exchange both objects in your application without changing any code.

A simple case

The simplest scenario where the identity session takes effect looks as follows:

<?php
// ... in module A ...
$userA = $idSession->load( 'myUser', 23 );
// ... in module B ...
$userB = $idSession->load( 'myUser', 23 );
?>

In this case, the object of class myUser with the ID 23 is loaded twice, in different modules. The first call to load() issues a database query to actually load the user object. The second call does not perform a database query, since the desired object is already loaded; the existing instance is returned, so $userA and $userB refer to the same object.

Finding objects

While the identity session cannot help you to reduce SQL queries when finding objects using a find query, it still solves the problem of inconsistent data:

<?php
// ... in module A ...
$query = $idSession->createFindQuery( 'myUser' );
$query->where( $query->expr->gt( 'id', 23 ) );
$users = $idSession->find( $query );
// ... in module B ...
$query = $idSession->createFindQuery( 'myUser' );
$query->where( $query->expr->lt( 'id', 42 ) );
$users = $idSession->find( $query );
?>

In this example, modules A and B both load a set of user objects. Module A finds all users with an ID greater than 23. Module B finds all users with an ID less than 42. Both result sets therefore contain the objects with IDs between 23 and 42.

As said before, the identity session cannot avoid the second SQL query in this example. But it avoids data inconsistency inside your application: The user objects which would otherwise exist twice are replaced by their already existing instances.
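The replacement of duplicate instances can be sketched in plain PHP. This is only an illustration under stated assumptions: SketchMember, findMembers(), the $table array and the closures used as filters are hypothetical and stand in for the data object, the find() call, the database table and the WHERE conditions.

```php
<?php
// Hypothetical data object, not part of eZ Components.
class SketchMember
{
    public $id;
    public $name;
}

// Stand-in for a database table: ID => name.
$table = array( 30 => 'carol', 40 => 'dave', 50 => 'erin' );

$identities = array(); // Known instances, keyed by ID.
$queryCount = 0;       // Counts simulated find queries.

// Every call "hits the database", but rows whose identity is already
// known are mapped to the existing instance instead of a new object.
function findMembers( array $table, $filter, array &$identities, &$queryCount )
{
    $queryCount++;
    $result = array();
    foreach ( $table as $id => $name )
    {
        if ( !$filter( $id ) )
        {
            continue;
        }
        if ( !isset( $identities[$id] ) )
        {
            $member       = new SketchMember();
            $member->id   = $id;
            $member->name = $name;
            $identities[$id] = $member;
        }
        $result[$id] = $identities[$id];
    }
    return $result;
}

// ... in module A: IDs greater than 23 ...
$usersA = findMembers( $table, function ( $id ) { return $id > 23; }, $identities, $queryCount );
// ... in module B: IDs less than 42 ...
$usersB = findMembers( $table, function ( $id ) { return $id < 42; }, $identities, $queryCount );

$usersA[30]->name = 'changed in A';

echo $usersB[30]->name, "\n"; // changed in A
echo $queryCount, "\n";       // 2
```

Both queries are executed, but the objects in the overlapping range exist only once, so a change made in module A is visible in module B.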

Further possibilities

The identity session already provides additional features for you. One is called relation pre-fetching.

So far, it was not possible to fetch related objects for a set of objects via a JOIN. You needed to issue multiple calls to getRelatedObjects() to get the desired result, which works fine for a small number of objects, but results in heavy database access for larger sets.

With relation pre-fetching you can define a tree of relations to be loaded in a single query and to be stored in the Identity Map. After that, you can retrieve them using getRelatedObjects() without any additional SQL queries being executed.
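The idea behind pre-fetching can be sketched in plain PHP without the library. Everything here is an assumption for illustration: the $joinedRows array imitates the result of a single JOIN over hypothetical user and comment tables, and getRelatedComments() is a hypothetical lookup function, not the PersistentObject API.

```php
<?php
// Rows as a single JOIN over hypothetical "users" and "comments"
// tables might return them.
$joinedRows = array(
    array( 'user_id' => 23, 'comment_id' => 1, 'text' => 'first' ),
    array( 'user_id' => 23, 'comment_id' => 2, 'text' => 'second' ),
    array( 'user_id' => 42, 'comment_id' => 3, 'text' => 'third' ),
);

// Pre-fetch step: record all related comments per user in memory.
$relatedComments = array();
foreach ( $joinedRows as $row )
{
    $relatedComments[$row['user_id']][$row['comment_id']] = $row['text'];
}

// Later lookups are served from memory, without any further query.
function getRelatedComments( array $relatedComments, $userId )
{
    return isset( $relatedComments[$userId] )
        ? $relatedComments[$userId]
        : array();
}

echo count( getRelatedComments( $relatedComments, 23 ) ), "\n"; // 2
echo count( getRelatedComments( $relatedComments, 42 ) ), "\n"; // 1
```

One pass over the joined result replaces one query per user, which is where the savings for larger sets of objects come from.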

Testing

If this new feature of the PersistentObject component looks interesting to you, I encourage you to install and test it. Any feedback is welcome, and since we are still in alpha state, we have the chance to fix shortcomings as they become visible.

If you have any comments, praise or criticism, please do not hesitate to send them to the eZ Components mailing list or comment on this entry.

Comments

Hi Toby

I think I spotted a small error. In the example code under "Related objects":

$newComment = new myComment( 'some text' ); $idSession->addRelatedObject( $user, $comment );

Shouldn't $comment in the 2nd line be $newComment instead?

Best regards

Kristof

Kristof Coomans at 2009-04-22

Hi Kristof,

thanks for the hint, I fixed the error.

Regards, Toby

Toby at 2009-04-22

"In module B, no SQL queries are issued at all."

Shouldn't it perform the query always, as the data could be altered/deleted/inserted by another process?

Ren at 2009-04-22

Hi Ren,

that might happen, yes. However, the other module that originally loaded the object would also be affected by this deletion. Therefore it is more sensible to have a consistent state inside the current request, by only loading the data object once.

However, if you want to have objects fetched every time, you simply do not use identity management.

Regards, Toby

Toby at 2009-04-22