Identity Map pattern
I first read about the identity map pattern in Patterns of Enterprise Application Architecture by Martin Fowler. The pattern affects the data access layer of an application and helps to avoid inconsistencies in data objects of your application.
Since yesterday's alpha release of the PersistentObject package in eZ Components, we have an implementation of Identity Map. In this article I want to give you a rough overview of the pattern itself and show you how you can test and use the functionality in PersistentObject.
Identity Map
Typically you have a central layer in your application which abstracts the database access and encapsulates retrieval, storage and manipulation of data objects. Because your application's modules are loosely coupled, different modules may access the same data without knowing about each other.
While this typically means overhead in terms of SQL queries and therefore database load, there is the much larger problem of inconsistent data inside your application: If module A manipulates a data object while module B uses its own copy of the same object, the two copies are out of sync. Module B still works with the old data.
To avoid this, you can make your data access layer aware of object identities, using an Identity Map implementation. That means that whenever a data object is to be loaded, the layer determines whether it already knows this object. If this is the case, it will not create a new instance of the object, but return the already existing one. If the desired object does not exist yet, it will be created and recorded for later use.
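The lookup described above can be sketched in a few lines of plain PHP. This is only an illustration of the pattern, not the PersistentObject implementation: the classes User and IdentityAwareLoader are hypothetical, and an array stands in for the database table.

```php
<?php
// Hypothetical data object, used only for this sketch.
class User
{
    public $id;
    public $name;

    public function __construct( $id, $name )
    {
        $this->id   = $id;
        $this->name = $name;
    }
}

// Hypothetical loader that is aware of object identities.
// The array $rows stands in for a database table (ID => name).
class IdentityAwareLoader
{
    private $rows;
    private $identities = array();
    public $queryCount = 0;

    public function __construct( array $rows )
    {
        $this->rows = $rows;
    }

    public function load( $id )
    {
        // Known identity: return the existing instance, no "query" needed.
        if ( isset( $this->identities[$id] ) )
        {
            return $this->identities[$id];
        }
        if ( !isset( $this->rows[$id] ) )
        {
            throw new RuntimeException( "No user with ID $id." );
        }
        // Unknown identity: "query" the store, create the object, record it.
        ++$this->queryCount;
        $user = new User( $id, $this->rows[$id] );
        $this->identities[$id] = $user;
        return $user;
    }
}

$loader = new IdentityAwareLoader( array( 23 => 'alice' ) );
$userA  = $loader->load( 23 ); // ... in module A ...
$userB  = $loader->load( 23 ); // ... in module B ...
$userA->name = 'bob';          // module A changes the object

var_dump( $userA === $userB );   // same instance in both modules
var_dump( $userB->name );        // module B sees the change made by module A
var_dump( $loader->queryCount ); // the store was "queried" only once
```

Because both modules hold a reference to the same instance, a change made in one module is visible in the other without any synchronization code.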
PersistentObject
The PersistentObject component of the eZ Components library provides you with a data access abstraction for your application. Similar to the Data Mapper pattern (also described by Fowler), it allows you to access your data objects in a unified way, without writing much SQL for typical tasks.
If you are not familiar with the component yet, I suggest taking a look at the PersistentObject tutorial on the eZ Components website. The general usage of this component is out of scope here.
The latest alpha of PersistentObject is available through our PEAR channel. If you have not discovered this channel yet, use the following command:
$ pear channel-discover components.ez.no
After that, to install the PersistentObject alpha version, use:
$ pear install -f -a ezc/PersistentObject
This command installs the alpha version of PersistentObject together with all components it depends on. If you already use a version of PersistentObject via PEAR, you can update as follows:
$ pear upgrade -f -a ezc/PersistentObject
Beware: This is currently an alpha release and is only meant for testing purposes. Do not use this in a critical production environment!
Activate identity management
The Identity Map pattern is implemented in PersistentObject as a decorator to the central session class. Assuming that you already have an instance of ezcPersistentSession configured in the variable $session, you can instantiate an identity session as follows:
<?php
$idMap = new ezcPersistentBasicIdentityMap(
    $session->definitionManager
);
$idSession = new ezcPersistentIdentitySession(
    $session,
    $idMap
);
?>
The identity session uses the existing session object for database access. In addition, it uses an instance of an implementation of the ezcPersistentIdentityMap interface. This object performs the actual management of object identities. The ezcPersistentBasicIdentityMap shown above serves the original purpose of an Identity Map: keeping references to object identities in memory.
If you feel the need for a more advanced solution, which e.g. caches object identities using Memcached or similar, you can simply implement the interface and use your own implementation.
Identity Map in action
This section shows a few examples of the identity session object in action. Assume for each example that $idSession contains the identity session object that was created in the previous section.
The identity session can simply be used like the original instance of ezcPersistentSession. That means, you can transparently exchange both objects in your application without changing any code.
A simple case
The simplest scenario where the identity session takes effect looks as follows:
<?php
// ... in module A ...
$userA = $idSession->load( 'myUser', 23 );
// ... in module B...
$userB = $idSession->load( 'myUser', 23 );
?>
In this case, the object of class myUser with the ID 23 is loaded twice, in different modules. The first call to load() results in a database query to actually load the user object. The second call, in contrast, does not perform a database query, since the desired object is already loaded. The existing instance is returned, so $userA and $userB refer to the same object.
Finding objects
While the identity session cannot help you to reduce SQL queries when finding objects using a find query, it still solves the problem of inconsistent data:
<?php
// ... in module A ...
$query = $idSession->createFindQuery( 'myUser' );
$query->where(
    $query->expr->gt(
        'id',
        23
    )
);
$users = $idSession->find( $query );
// ... in module B ...
$query = $idSession->createFindQuery( 'myUser' );
$query->where(
    $query->expr->lt(
        'id',
        42
    )
);
// Module B also has to execute its query to receive objects.
$users = $idSession->find( $query );
?>
In this example the modules A and B both load a set of user objects. Module A loads all user objects which have an ID greater than 23. Module B finds all users with an ID less than 42. Both result sets therefore contain the objects with IDs greater than 23 and less than 42.
As said before, the identity session cannot avoid the second SQL query in this example. But it avoids data inconsistency inside your application: The user objects which would otherwise exist twice are replaced by their already existing instances.
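The replacement step can be illustrated with a minimal sketch in plain PHP. It does not show how PersistentObject works internally: the class IdentityAwareFinder is hypothetical, an array stands in for the database table, and a callback stands in for the WHERE clause of the find query.

```php
<?php
// Hypothetical finder that is aware of object identities.
// The array $rows stands in for a database table (ID => name).
class IdentityAwareFinder
{
    private $rows;
    private $identities = array();

    public function __construct( array $rows )
    {
        $this->rows = $rows;
    }

    // $filter is a callback that stands in for the WHERE clause.
    public function find( $filter )
    {
        $result = array();
        // The "query" runs on every call; it cannot be skipped.
        foreach ( $this->rows as $id => $name )
        {
            if ( !call_user_func( $filter, $id ) )
            {
                continue;
            }
            // Only unknown identities get a new object; known ones are reused.
            if ( !isset( $this->identities[$id] ) )
            {
                $this->identities[$id] = (object) array(
                    'id'   => $id,
                    'name' => $name,
                );
            }
            $result[$id] = $this->identities[$id];
        }
        return $result;
    }
}

$finder = new IdentityAwareFinder(
    array( 10 => 'alice', 30 => 'bob', 50 => 'carol' )
);
// ... in module A: IDs greater than 23 ...
$usersA = $finder->find( function ( $id ) { return $id > 23; } );
// ... in module B: IDs less than 42 ...
$usersB = $finder->find( function ( $id ) { return $id < 42; } );

$usersA[30]->name = 'changed in module A';

var_dump( $usersA[30] === $usersB[30] ); // the overlapping object exists once
var_dump( $usersB[30]->name );           // module B sees the change
```

The user with ID 30 is part of both result sets, but only one instance of it exists, so both modules work on the same data.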
Further possibilities
The identity session already provides additional features for you. One is called relation pre-fetching.
So far, it was not possible to fetch related objects for a set of objects via a JOIN. You needed to issue multiple calls to getRelatedObjects() to get the desired result, which works fine for a small number of objects, but results in heavy database access for larger sets.
With relation pre-fetching you can define a tree of relations to be loaded in a single query and to be stored in the Identity Map. After that, you can retrieve them using getRelatedObjects() without any additional SQL queries being executed.
Testing
If this new feature of the PersistentObject component looks interesting to you, I encourage you to install and test it. Any feedback is welcome, and since we are still in alpha state, we have the chance to fix drawbacks as they become visible.
If you have any comments, praise or criticism, please do not hesitate to send them to the eZ Components mailing list or comment on this entry.