Do your docs suck?

A week ago Sebastian pointed out an article on LinuxJournal that talks about documentation coverage. His question, "Isn't that exactly what tobyS' tool does?", reminded me that I had wanted to blog about the little tool I wrote for eZ Components a while ago. Since this post then sat around as a draft in my blog for another week, you get my write-up a little belated.

The actual idea was inspired by a blog post by Lukas Smith, which planted the term "documentation coverage" in my mind. We (as in "the eZ Components team") are very keen on documentation, which is reflected in extensive API docs, additional tutorials and lots of example code. While the latter two can still be checked conveniently by hand, API docs are not that easy to validate because of the huge number of classes and class members in eZ Components. Typos, missing doc tags and violations of our documentation standards creep in easily during development and are hard to spot. Checking every doc block by hand is a lifetime's work, and even if you try, you will miss many small issues.

Therefore I wrote a little tool to assist us with checking the consistency of our API documentation. The tool uses PHP's Reflection API to retrieve the OO elements of a component and a simple (regex-based) parser to extract the doc block elements assigned to them. Parsing alone already hints at broken documentation tags, as far as a regex-based parser can detect them. A simple visitor interface can then be used to perform checks on the tree of API elements and their documentation.
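To give an idea of the approach, here is a minimal sketch of the reflection-plus-regex step. It is not the code from docanalysis.php: the class SampleClass, the helper parseDocTags() and the shape of the $report array are made up for illustration, and the regex only understands simple one-line "@tag value" entries.

```php
<?php
/**
 * Hypothetical sample class, only here to have something to analyse.
 *
 * @package Example
 * @version 1.0
 */
class SampleClass
{
    /**
     * Adds two numbers.
     *
     * @param int $a First operand.
     * @param int $b Second operand.
     * @return int
     */
    public function add( $a, $b )
    {
        return $a + $b;
    }
}

/**
 * Extracts one-line tag entries from a raw doc comment.
 * Hypothetical helper, not part of the real tool.
 */
function parseDocTags( $docComment )
{
    $tags = array();
    // getDocComment() returns false if the element has no doc block.
    if ( $docComment === false )
    {
        return $tags;
    }
    // Match lines of the form " * @name rest of line".
    preg_match_all(
        '/^\s*\*\s*@(\w+)[ \t]*(.*)$/m',
        $docComment,
        $matches,
        PREG_SET_ORDER
    );
    foreach ( $matches as $match )
    {
        $tags[] = array( 'name' => $match[1], 'value' => trim( $match[2] ) );
    }
    return $tags;
}

// Walk the class and its methods via reflection and collect their tags.
$class  = new ReflectionClass( 'SampleClass' );
$report = array(
    $class->getName() => parseDocTags( $class->getDocComment() ),
);
foreach ( $class->getMethods() as $method )
{
    $key = $class->getName() . '::' . $method->getName();
    $report[$key] = parseDocTags( $method->getDocComment() );
}
```

A visitor would then walk something like $report and flag missing or malformed tags. A real implementation also has to cope with multi-line tag descriptions and inline tags, which this sketch ignores.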

(Almost) 100% of the eZ Components API elements are documented using phpDocumentor syntax, so our major concern is not the doc coverage itself but the syntactic correctness of the phpDocumentor annotations and (as far as possible) their semantic correctness. While the former can be checked quite easily using a "real" parser (and even with the current one), the latter is quite tricky, as every semantic check is.

The current implementation mainly checks that certain tags are present for certain elements (like a @package tag for each class) and that annotations like @copyright have correct values. Besides that, it checks whether all parameters of a method are documented, whether the documented types match the type hints (where available) and whether the order is correct. Although this does not sound like much, we were amazed how many small and bigger issues with the documentation it has already found.
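The parameter checks described above could look roughly like the following sketch. Again, this is an illustration and not the real implementation: SampleMailer and checkParamDocs() are hypothetical names, the type comparison is a plain case-insensitive string match, and it assumes PHP 7.1 or later for ReflectionNamedType (the original tool predates that API).

```php
<?php
/**
 * Hypothetical sample class with deliberately flawed documentation.
 */
class SampleMailer
{
    /**
     * @param string $subject
     * @param string $to
     */
    public function send( array $to, $subject, $body )
    {
    }
}

/**
 * Returns a list of issues found in the @param tags of a method.
 * Hypothetical helper, not part of the real tool.
 */
function checkParamDocs( ReflectionMethod $method )
{
    $issues = array();
    $doc    = $method->getDocComment();

    // Collect documented parameters as type/name pairs, in doc block order.
    preg_match_all(
        '/@param\s+(\S+)\s+\$(\w+)/',
        $doc === false ? '' : $doc,
        $matches,
        PREG_SET_ORDER
    );
    $documented = array();
    foreach ( $matches as $match )
    {
        $documented[] = array( 'type' => $match[1], 'name' => $match[2] );
    }

    foreach ( $method->getParameters() as $position => $param )
    {
        $name = $param->getName();

        // Find the position of this parameter in the doc block, if any.
        $docPosition = null;
        foreach ( $documented as $i => $entry )
        {
            if ( $entry['name'] === $name )
            {
                $docPosition = $i;
                break;
            }
        }
        if ( $docPosition === null )
        {
            $issues[] = sprintf( 'Parameter $%s is not documented.', $name );
            continue;
        }
        if ( $docPosition !== $position )
        {
            $issues[] = sprintf( 'Parameter $%s is documented out of order.', $name );
        }

        // Compare the documented type with the type hint, where one exists.
        $type = $param->getType();
        if ( $type instanceof ReflectionNamedType
            && strcasecmp( $type->getName(), $documented[$docPosition]['type'] ) !== 0 )
        {
            $issues[] = sprintf(
                'Parameter $%s is documented as %s but hinted as %s.',
                $name,
                $documented[$docPosition]['type'],
                $type->getName()
            );
        }
    }
    return $issues;
}

$issues = checkParamDocs( new ReflectionMethod( 'SampleMailer', 'send' ) );
```

For the sample method this reports the swapped order of $to and $subject, the mismatch between the documented type and the array type hint of $to, and the missing documentation for $body. Documented types such as "array|null" would need a smarter comparison than the one used here.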

Since this proof-of-concept implementation works quite well, I started implementing a real parser for the docs to get a better tree structure and perform more valuable checks. But this is at a very early stage and not publicly available yet. Anyway, although the current main script is very eZ Components specific, the whole thing might be valuable to others, too, which is the main reason for this blog post.

You can check the script out of our SVN, where it is called docanalysis.php. Adjusting this stuff to your own project should be easy. Hope this is valuable for someone.



