This blog post was first published on the Qafoo blog and is duplicated here since I wrote it or participated in writing it.

Tracking Changes in PHP Projects

For quite some time, I have talked to people about the idea for a tool that tracks changes to the classes and methods of your PHP project in order to detect which entities change most frequently, which are often affected by bugs, and other statistics. After some hacking, we are now making it available on GitHub.

The Qafoo ChangeTrack tool consists of multiple commands that build on each other to analyze the source code history of your PHP project. First off, there is the analyze command, which produces an XML document of the source code history, showing which commit affected which method and how many lines were added and removed. For illustration, here is an extract of the analysis result for the Twig project:

<changes repository="https://github.com/fabpot/Twig">
    <!-- ... -->
    <changeSet revision="92bbc7ee405f5635f4647040d883dbd77d1ac7da" message="made a small optimization to for loop when no else clause exists&#10;git-svn-id: http://svn.twig-project.org/trunk@32 93ef8e89-cb99-4229-a87c-7fa0fa45744b&#10;">
        <package name="">
            <class name="Twig_Node_For">
                <method name="compile">
                    <added>15</added>
                    <removed>3</removed>
                </method>
            </class>
        </package>
    </changeSet>
    <!-- ... -->
</changes>
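
To give an idea of how this output could be consumed, here is a minimal Python sketch that reads such a document with the standard library's xml.etree.ElementTree and lists the changed lines per method. The element and attribute names are taken from the extract above; the function name iter_method_changes and the shortened inline sample are only illustrative and not part of ChangeTrack.

```python
import xml.etree.ElementTree as ET

# Shortened sample in the shape of the extract above (commit message abbreviated).
SAMPLE = """
<changes repository="https://github.com/fabpot/Twig">
  <changeSet revision="92bbc7ee405f5635f4647040d883dbd77d1ac7da"
             message="made a small optimization to for loop when no else clause exists">
    <package name="">
      <class name="Twig_Node_For">
        <method name="compile">
          <added>15</added>
          <removed>3</removed>
        </method>
      </class>
    </package>
  </changeSet>
</changes>
""".strip()


def iter_method_changes(xml_text):
    """Yield (revision, class, method, added, removed) for every changed method."""
    root = ET.fromstring(xml_text)
    for change_set in root.iter("changeSet"):
        revision = change_set.get("revision")
        for cls in change_set.iter("class"):
            for method in cls.iter("method"):
                # Missing <added>/<removed> elements are treated as zero.
                added = int(method.findtext("added", default="0"))
                removed = int(method.findtext("removed", default="0"))
                yield revision, cls.get("name"), method.get("name"), added, removed


for revision, cls, method, added, removed in iter_method_changes(SAMPLE):
    print(f"{revision[:7]} {cls}::{method} +{added} -{removed}")
```

Because the function is a generator, it can feed further processing (for example, summing changes per method across all commits) without building an intermediate list first.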

The second command that is currently implemented is calculate. It works on the basis of the XML generated by analyze and calculates how often a certain method was affected by a change of a certain type (e.g. bug or feature).

To detect whether a revision was created in order to fix a bug or to implement a feature, you can currently match regular expressions against the commit message or connect GitHub issue labels. It should be easy to extend this feature to check for issue references on Jira, etc.
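
A regular-expression check of this kind might look like the following Python sketch. The patterns, the label names, and the label_for function are assumptions made up for illustration; they are not ChangeTrack's actual default configuration. Only the labels fix and misc are borrowed from the generated stats.

```python
import re

# Hypothetical label patterns, checked in order; the first match wins.
LABEL_PATTERNS = [
    ("fix", re.compile(r"\b(fix(e[sd])?|bug)\b", re.IGNORECASE)),
    ("feature", re.compile(r"\b(add(s|ed)?|implement(s|ed)?|feature)\b", re.IGNORECASE)),
]


def label_for(message, default="misc"):
    """Return the first label whose pattern matches the commit message."""
    for label, pattern in LABEL_PATTERNS:
        if pattern.search(message):
            return label
    # No pattern matched: fall back to a catch-all label.
    return default


print(label_for("fixed a bug in the for loop"))
print(label_for("added a sandbox extension"))
print(label_for("made a small optimization to for loop when no else clause exists"))
```

Keeping the patterns in an ordered list makes the precedence explicit: a message that mentions both a fix and a new feature is counted as a fix here, which is a design choice of this sketch rather than a rule of the tool.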

The following example shows an extract of the stats generated for Twig on the basis of the default label provider configuration:

<stats repository="https://github.com/fabpot/Twig">
    <package name="">
        <!-- ... -->
        <class name="Twig_Environment">
            <!-- ... -->
            <method name="loadTemplate">
                <stats>
                    <count label="misc">17</count>
                    <count label="fix">1</count>
                </stats>
            </method>
        </class>
    </package>
</stats>
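
One possible use of these numbers is to rank methods by how often they were touched by a change with a given label. The following Python sketch assumes the element names shown in the extract above; label_counts is a hypothetical helper and not a ChangeTrack command.

```python
import xml.etree.ElementTree as ET

# Sample in the shape of the stats extract above.
STATS_SAMPLE = """
<stats repository="https://github.com/fabpot/Twig">
  <package name="">
    <class name="Twig_Environment">
      <method name="loadTemplate">
        <stats>
          <count label="misc">17</count>
          <count label="fix">1</count>
        </stats>
      </method>
    </class>
  </package>
</stats>
""".strip()


def label_counts(xml_text, label):
    """Map 'Class::method' to the number of changes carrying the given label."""
    root = ET.fromstring(xml_text)
    result = {}
    for cls in root.iter("class"):
        for method in cls.iter("method"):
            total = sum(
                int(count.text)
                for count in method.iter("count")
                if count.get("label") == label
            )
            result[f"{cls.get('name')}::{method.get('name')}"] = total
    return result


# List methods from most to least frequently affected by changes labeled "fix".
fixes = label_counts(STATS_SAMPLE, "fix")
for name, total in sorted(fixes.items(), key=lambda item: item[1], reverse=True):
    print(name, total)
```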

Note that the software is still pre-alpha and has only been tested against a few repositories so far. Also, the analyze command is really slow at the moment, since I have not yet paid much attention to performance. For example, on my X220 it takes roughly 40 minutes to analyze the Twig project history.

I would love for you to try out the tool, report any bugs you find on GitHub, and discuss any fancy ideas you might have around the generated data and what can be achieved on that basis.

(This tool is, among others, inspired by the bugminer script from Sebastian Bergmann.)