
Why documentation matters

If you have kept an eye on Planet-PHP during the past few days, you will have seen a discussion emerging about the point of doc blocks and documentation in general. Travis Swicegood states quite an interesting point of view. His main point is that programming languages themselves are declarative, and that it is somewhat stupid to describe the purpose of a method in plain English if the programming language has already described it. Although I had never thought in this direction before, I can see a valid point here. Partly.

I basically agree with Travis that programming languages are declarative and that (in the usual case) you shouldn't need much documentation for a class if you name your methods and attributes properly. At least for an experienced programmer, the names should be obvious in most cases. But at least in PHP there are some major aspects that are modelled in code and yet are not obvious from the outside:

The most obvious example is exceptions. Unlike Java with its checked exceptions, PHP does not require that an exception is either caught or declared to be thrown by the method. If you are going to use a library, you therefore have no way to see which method throws which exceptions without proper documentation. Another point is read-only and write-only properties. Since PHP does not have a modifier for this so far, you need to go the way of interceptors (__get() and __set()). Properties which are "declared" using overloading are not as obvious as real attributes, because they simply are not a declarative construct of the language itself. You therefore need documentation here to mark them up for the user.
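To make both cases concrete, here is a minimal sketch of how a doc block can carry what the code cannot declare. The class names Settings and SettingNotFoundException and the virtual property $count are made up for this illustration; @throws and @property-read are the usual phpDocumentor-style tags.

```php
<?php

/**
 * Hypothetical exception, used only for this illustration.
 */
class SettingNotFoundException extends Exception
{
}

/**
 * Hypothetical settings container.
 *
 * @property-read int $count Number of stored settings (virtual, served by __get()).
 */
class Settings
{
    /** @var array Stored values, indexed by name. */
    private $values = array();

    /**
     * Stores a value under the given name.
     *
     * @param string $name
     * @param string $value
     * @return void
     */
    public function set($name, $value)
    {
        $this->values[$name] = $value;
    }

    /**
     * Returns the value stored under the given name.
     *
     * @param string $name
     * @return string
     * @throws SettingNotFoundException If no value is stored under $name.
     */
    public function get($name)
    {
        if (!array_key_exists($name, $this->values)) {
            throw new SettingNotFoundException("No setting named '$name'.");
        }
        return $this->values[$name];
    }

    /**
     * Interceptor that serves the virtual property $count.
     *
     * @param string $property
     * @return mixed
     * @throws LogicException If the property does not exist.
     */
    public function __get($property)
    {
        if ($property === 'count') {
            return count($this->values);
        }
        throw new LogicException("No such property: $property");
    }

    /**
     * Interceptor that rejects every write, which makes $count read-only.
     *
     * @param string $property
     * @param mixed $value
     * @return void
     * @throws LogicException Always.
     */
    public function __set($property, $value)
    {
        throw new LogicException("Property $property is read-only.");
    }
}
```

Without the doc blocks, nothing in the signatures of get() or of the class itself would tell a user that an exception may be thrown or that $count exists at all.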

But even if you use neither exceptions nor interceptors to define properties, you still have a lot of cases where naming alone is not sufficient to let a foreign user see at a glance what happens. Simple things, like the classes that are used in relation with another class, are often not obvious. Just think of the factory pattern, dependency injection and similar. Additionally, if you only have the code itself, navigation is quite hard in a large code base. Browsing directory structures to look for what you need is no fun, especially if concepts are used that you are not yet familiar with.

Another (and probably the most important) point is simply the difference between people. I have been doing PHP for over 6 years now, I've dealt with a lot of libraries in PHP and have developed many applications. I'm heavily affected by what I've done and I've been inspired by a lot of people and experiences. But you are different! Are you sure that my thoughts are as obvious to you as they are to me? No, you can never be sure of that. And that's a major reason why I will have to explain my thoughts to you in a richer language than PHP to make you understand the overall concept behind what I do. This is quite natural and will not change any time soon, because even if programming languages are declarative, they only have a very limited syntax. Surely you can describe all that in a programming language, too (you do! what else is your complete code base than your concepts and thoughts written down?). But to understand it all, you would need to read the complete code and not only the prototypes.

I think I have made my point quite clear so far. But Travis still has a valid point. Inline API docs are not what he is looking for when it comes to "documentation". I agree with him. No, I'm not contradicting myself; as I said in the last paragraph, people are simply different. While I consider inline API documentation an important part of any code, because it gives me an overall reference of what is available, it is not what I look at first.

While thinking about Travis's points, I analyzed my own way of working with documentation. When I start with something new (a new library, for example), the first thing I want to know is "How does this work in general?". I'm sure a lot of developers feel the same and don't want to dig into API docs for hours and hours before they have the big picture and understand the overall concept.

So, what is basically needed to get started is a tutorial. I consider tutorials the most important part of any documentation. A simple step-by-step introduction to a practical usage example gives you more than 1000 lines of inline docs when getting started with something new, because you gather the main points you are interested in with very little learning effort:

  • What is this?

  • What can I do with it?

  • What is the overall concept behind it?

  • How am I expected to use it?

These 4 questions reflect the main information I typically want when digging into something new, and I actually cannot imagine anything that helps me better here than a tutorial.

OK, after reading the tutorial for a specific component, what comes next? The next step is to actually face the problem I want to solve using the library. Yes, that is basically what I want to do: solve a problem. I usually start by copying some code from the tutorial which I consider valid for the solution of my problem. Now I have to adjust that piece of code to suit my needs and make it do what I actually want it to do. At this point API documentation enters the game. What I need is a reference for a specific class or method, to see at a glance what I can do to adjust the behaviour. The most important questions now are:

  • What are the options of an object and how do they affect the object's behaviour?

  • What are the parameters of a method and how do they affect the behaviour of the method?

  • What are related classes and methods which could probably solve my problem more easily or in a better way?

  • Which side effects do I have to expect if I use this class / method?

While the first 2 points should mostly be covered by the code itself (you know, programming languages are declarative), the latter 2 are mostly not. They point to 2 more important jobs of API documentation: showing the relations between classes and methods, and surfacing some of the background information that is usually hidden somewhere deep inside the code.
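As a rough sketch of those latter 2 points, the following doc blocks name a related class and spell out a side effect that the method signature alone does not reveal. FileCache and MemoryCache are hypothetical classes invented for this example, and the file naming scheme is an arbitrary assumption.

```php
<?php

/**
 * Hypothetical in-memory cache, used only for this illustration.
 */
class MemoryCache
{
    /** @var array Cached data, indexed by key. */
    private $entries = array();

    /**
     * Stores $data under $key. No side effects outside this object.
     *
     * @param string $key
     * @param string $data
     * @return void
     */
    public function store($key, $data)
    {
        $this->entries[$key] = $data;
    }
}

/**
 * Hypothetical file-backed cache, used only for this illustration.
 *
 * @see MemoryCache For a variant that never touches the file system.
 */
class FileCache
{
    /** @var string Directory holding the cache files, without trailing slash. */
    private $directory;

    /**
     * @param string $directory Existing, writable directory for cache files.
     */
    public function __construct($directory)
    {
        $this->directory = rtrim($directory, '/');
    }

    /**
     * Stores $data under $key.
     *
     * Side effect: creates or overwrites the file
     * "<directory>/<md5 of key>.cache" on disk.
     *
     * @param string $key
     * @param string $data
     * @return void
     * @see FileCache::delete() To remove the file again.
     */
    public function store($key, $data)
    {
        file_put_contents($this->path($key), $data);
    }

    /**
     * Removes the entry for $key, if there is one.
     *
     * Side effect: deletes the matching cache file from disk.
     *
     * @param string $key
     * @return void
     */
    public function delete($key)
    {
        $path = $this->path($key);
        if (is_file($path)) {
            unlink($path);
        }
    }

    /**
     * Builds the file name used for $key.
     *
     * @param string $key
     * @return string
     */
    private function path($key)
    {
        return $this->directory . '/' . md5($key) . '.cache';
    }
}
```

Both store() methods look identical from the outside; only the documentation tells you that one of them writes to disk and that an alternative exists.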

So far so good, I managed to achieve what I wanted. If I reached this point with a reasonable amount of time, the documentation of the project cannot be that bad, can it? What comes next? I probably will not use the same classes again very soon. But the time will come when I return for one of 2 reasons:

  • The code I've written needs maintenance

  • I need to solve a similar problem and want to use the library again

What is different now compared to the first time I needed the documentation? I already have the big picture of the component in mind. I also have some working code at hand, which I wrote myself and which should therefore be understandable to me (inline docs help! ;). The tutorial might be helpful now to recall some facts. But the API docs are more important now, again.

So, why am I telling you all this? Basically to show you my view on the documentation issue. I agree that API docs are not the solution to all of your problems. A well written tutorial helps much more in the first steps than any API docs can. And I think that is basically the same point Travis makes when he says "Give me a unit test any day over a three paragraph docblock". Exploring new terrain by example is much more effective than digging into pages of API description. But that does not mean API docs are useless. They are very important for the further steps and should make up for the limited declarativity a programming language has due to its limited syntax.

Finally, I'd like to refer to the documentation of eZ Components. I think we manage very well to provide all of the above. For each component we have different forms of documentation online. A tutorial for the main functionality of a component exists (for example the ConsoleTools tutorial). Besides that, we ship additional code examples with each component's source, for people who prefer reading PHP over English. ;) The complete source of eZ Components is documented inline and the documentation is rendered online (e.g. ConsoleTools). The API docs are enhanced by a lot of useful features, like a special markup for the most important classes and some example code in the class docs. Also quite nice in my eyes is the linking between tutorials and API docs, which gets you detailed information about a specific class or method right from the tutorial with 1 click. Finally, and especially important for Travis ;), eZ Components are fully unit tested.

Is there anything more you would expect? Tell me, I'm curious! :)

Comments

I have the same problems as Travis and Tobias. I have not yet found a single solution to all the issues with inline and external docs. I think that one person in the team should be responsible for documentation. When I design, it's all obvious to me. Someone else will know better what needs clarification. When a third person comes in contact with my design, they will also be able to tell if the first two missed anything obvious. So, like anything in life, readjustment will be required.

Balance is important too. Do not document everything just because it's important to document. Finding the right balance requires experience. We have to live with that fact.

Is the problem really documentation, or is it people/social/educational? This issue came up because someone must have complained. I think it's important we realize that the other person is striving for the same goal as we are, and we have to find a solution that both can live with. Just because we prefer one kind of documentation does not mean everyone else must feel the same way. Just as Tobias mentioned, we are different.

Christian Roy at 2007-04-14

I think you (and many others) missed some of the points Travis was trying to make in his post.

Unit tests are specifications of the behaviour of your code. If they are well written they will suffice as documentation; no they are no replacement for good old end user documentation, but compared to docblocks they are a huge improvement.

Mark Twain at 2007-04-14

I actually got his point, but nevertheless I disagree. Unit tests give you a great view of how a method is expected to work. But they are actually not very readable. Sure, for a simple class like the examples from the PHPUnit docs this might work perfectly. But in a complex environment, you have several problems when reading the tests. Testing a specific class there usually requires a lot of setup work, probably has mock objects in place, and sometimes even some hacks are used to perform the actual testing. Besides that, you usually have a lot of test cases for a single method if it performs something more complex than calculating 2+3.

Anyway, as I said, I personally prefer tutorials over reading unit tests, and I let everyone use whatever they find most useful as documentation. And since I consider unit tests very important, there should be no issue for you guys in using them as documentation. :)

Toby at 2007-04-14

Never said unit tests were to replace tutorials.

Wiring of mocks could easily be moved into external helper methods. Test code is as important as other code, so if it is full of hacks I would consider that a major issue. The same goes for test cases which are hard to read: if you cannot read them, their weaknesses will be hard to track down (and no, code coverage is not everything).

Mark Twain at 2007-04-14

Hehe, we should just keep it as I said: It's a matter of taste. I personally don't like digging into unit tests for documentation purposes.

Besides that, I think you got me wrong in some places above. "sometimes even some hacks" is completely different from "full of hacks". But that's not part of this discussion. The same applies to the fact that I never stated "code coverage is everything". :)

Toby at 2007-04-14

You said "fully tested" ;-)

OK, so we can agree on disagreeing then :-D

at 2007-04-14

Where the "fully tested" was not referring to a code coverage value. ;)

Toby at 2007-04-14

Toby - great response. I too would love to see more narrative (or as you put it, tutorial) style documentation. I think it is much more useful. To me, the perfect example of this is PicoContainer in Java. They have a One Minute Description, Two Minute Tutorial, and a Five Minute Introduction. You can sit down and, with 8 minutes of uninterrupted reading time, go through the basics of what this does and whether you need it. I've yet to see any PHP project that has that down. Projects I participate in included.

Just to respond to the two points of exceptions and attributes. I do agree that since PHP doesn't have a hard and fast rule for handling them, documentation does become a somewhat necessary evil. I would, however, put the documentation into unit tests in the form of test method names and use something like testdox in PHPUnit3 to generate the documentation. Assuming a well written test, this provides you with documentation that was validated in addition to just being written.

Travis Swicegood at 2007-04-15