schlitt.info - php, photography and private stuff
~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~

:Author: Tobias Schlitt
:Date: Wed, 19 Nov 2008 23:29:46 +0100
:Revision: 1
:Copyright: CC by-nc-sa

=========================
Why code coverage matters
=========================

I'm a fan of `PHPUnit`__ `code coverage reports`__. I can see a lot of developers out there shiver at that sentence, because in their opinion code coverage reports for unit tests are nonsense and cannot give you any hint about the quality of a test suite. I see it a bit differently. Sure, a high code coverage rate never proves that code is well tested (unless you wrote the code and the tests yourself). But the other way around works: a low code coverage rate definitely means that the test suite is not sufficient. Let me dig a bit deeper into code coverage and what it gives you.

.. __: http://phpunit.de
.. __: http://static.phpunit.de/ezc/

The PHPUnit code coverage report indicates how many lines of the code under test have been executed during the test run. It shows the figure for each directory (aggregated from the files and directories it contains) and for each file of the tested code, and uses color to indicate whether the coverage rate is high, medium or low.
Besides that, it shows you the source code of each of your files and indicates which lines were executed, which were not, and which are unreachable. So, basically, you can check which lines of the code under test are covered by the unit tests and, more importantly, which are not. A covered line does not mean that the line is properly tested, but an uncovered line definitely means that it is not tested at all. For you as a developer, the latter fact is quite important, since it points you to test cases you still need to create.

Creating a unit test is, overall, not easy. To create a proper unit test you need to think of several things: First, you need to know what a specific method should be doing. That is kind of easy if you wrote (or, usually, will write) the method yourself, but can be kind of hard if you didn't. Second, you need to know what the preconditions of the method are (which attributes the method accesses, what their values must be to achieve the desired result, which other objects must be instantiated, ...). Third, you need to know how the method should behave for different input parameters (and by "behaviour" I mean both success and failure).

Once you have collected all this information, you can start writing the test cases for your method. It may still be hard if the method performs a more complex operation, since you need to think of all possible (and in most cases also the seemingly impossible) combinations your method can be called with. But I'm sure that, if you are not lazy, you can figure out some sensible combinations of preconditions, input parameters and expected results.

But how can you figure out whether you really covered all necessary cases? Well, that is even harder than creating the test cases, because you usually think in limited dimensions and cannot think of every way a co-developer (or user, if you're developing a library) may try to use your method.
Anyway, this is where code coverage comes into play. Although it cannot tell you how a potential user of your method might try to interact with it, it can give you a hint about which code you have not tested so far. If you see red lines in the code coverage report, you definitely missed some cases that you had in mind while developing the method but never tested. Let me give a small example: ::

    class DiceGamePlayer
    {
        public $playerName;

        // ... some more attributes and methods here...

        public function throwDice( $diceSize )
        {
            // Reject dice sizes that do not make sense.
            if ( !is_int( $diceSize ) && $diceSize < 2 )
            {
                throw new RuntimeException(
                    "A dice with $diceSize values does not make sense! Player {$this->playerName} must be cheating!"
                );
            }
            return mt_rand( 1, $diceSize );
        }
    }

This tiny method could come from a dice game that uses several different dice (as role-playing games usually do). It is quite small, so it should be easy to think of unit test cases. The method obviously has the attribute $playerName as its only precondition and expects an integer value of at least 2 as its parameter. The desired return value is an integer between 1 and the given dice size.

A typical test case would be to pass it the number 6 and expect it to return an integer between 1 and 6. With that, you already have the desired functionality of the method covered. At least, you might think so. But a look at the code coverage will show that you only covered 50% of the (executable) code, because you only checked the behaviour for a correct input value and did not test the method with an incorrect one. That means more test cases are necessary to fully test the method. Since the method expects an integer, it would be a good idea to test whether the exception is thrown if you give it a string value like "foo". Another look at the code coverage will show that 100% of the code is now tested... and this is the point where code coverage cannot help you any further.
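
As a rough sketch, the two test cases described so far could look like the following plain PHP script. It deliberately avoids PHPUnit so that it runs on its own. The default player name is a made-up value, and the invalid input is the numeric string "1" rather than "foo", because how a non-numeric string compares to an integer depends on the PHP version.

```php
<?php
// The class from the example, repeated so that the script is self-contained.
// The condition is kept exactly as in the article.
class DiceGamePlayer
{
    public $playerName = 'Alice'; // hypothetical value, for illustration only

    public function throwDice( $diceSize )
    {
        if ( !is_int( $diceSize ) && $diceSize < 2 )
        {
            throw new RuntimeException(
                "A dice with $diceSize values does not make sense! Player {$this->playerName} must be cheating!"
            );
        }
        return mt_rand( 1, $diceSize );
    }
}

$player = new DiceGamePlayer();

// Test case 1: a valid dice size must yield an integer between 1 and 6.
$result = $player->throwDice( 6 );
$validInputOk = is_int( $result ) && $result >= 1 && $result <= 6;

// Test case 2: a non-integer input must raise the RuntimeException.
$invalidInputOk = false;
try
{
    $player->throwDice( "1" );
}
catch ( RuntimeException $e )
{
    $invalidInputOk = true;
}

echo ( $validInputOk && $invalidInputOk )
    ? "both test cases pass\n"
    : "a test case failed\n";
```

Both checks succeed and every line of throwDice() is executed, which is the situation a coverage report would show as 100%.
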
Although you have now covered the full code, there is still a bug in it. Another test case is necessary: What happens if you give the method the integer 1? This is an integer, but it is smaller than 2, so the new test case would expect the RuntimeException to be thrown. This test case would actually fail, because the check is incorrect. You would need to replace the logical AND operator with a logical OR to make it work correctly.

So why do I tell you all this, if code coverage cannot help you here? Well, it already helped you by showing that a portion of the code was not tested. But that does not mean you can switch off your brain at this point. You still need to think about all possible cases this part of the code should cover. Still, the code coverage report gave you a good hint that you had failed to test something at all.

What is the conclusion of all this? First of all: writing sensible unit tests is not easy and requires, at the very least, a good imagination. A high code coverage rate does not mean that your test cases are good or that every possibility is tested. But a low code coverage number always means that you failed to test parts of your code. The report itself can then give you sensible hints about which parts of the code still need explicit testing. And if you are capable of writing sensible test cases for these parts and are sure (though you can never be sure!) that you already wrote sensible test cases for the covered parts, a high code coverage rate can also indicate a good test suite.

I think a much better way of assessing the quality of your test suite is so-called "`code mutation`__", which will hopefully be developed for PHPUnit during this year's Google Summer of Code. A code mutation tool changes small bits of the code under test, over and over again, until a test case fails.
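
The idea can be illustrated by hand with the dice example. The following sketch assumes the validity check has been pulled out into a function of its own. isInvalidDiceSize(), its mutant and suitePasses() are hypothetical names made up for this illustration; they are not part of PHPUnit or of any mutation tool. The "mutant" differs from the correct check in a single operator, and a mutant "survives" if the test suite still passes against it.

```php
<?php
// Correct check (hypothetical helper): reject anything that is not an
// integer of at least 2.
function isInvalidDiceSize( $diceSize )
{
    return !is_int( $diceSize ) || $diceSize < 2;
}

// Hand-made mutant: the same check with || changed to &&.
function isInvalidDiceSizeMutant( $diceSize )
{
    return !is_int( $diceSize ) && $diceSize < 2;
}

// Runs a list of array( input, expected ) pairs against a check and
// reports whether every expectation holds.
function suitePasses( $check, array $cases )
{
    foreach ( $cases as $case )
    {
        list( $input, $expected ) = $case;
        if ( $check( $input ) !== $expected )
        {
            return false;
        }
    }
    return true;
}

// Suite without and with the "integer 1" case.
$weakSuite   = array( array( 6, false ), array( "1", true ) );
$strongSuite = array( array( 6, false ), array( "1", true ), array( 1, true ) );

$originalPassesWeak   = suitePasses( 'isInvalidDiceSize', $weakSuite );
$originalPassesStrong = suitePasses( 'isInvalidDiceSize', $strongSuite );
$mutantSurvivesWeak   = suitePasses( 'isInvalidDiceSizeMutant', $weakSuite );
$mutantSurvivesStrong = suitePasses( 'isInvalidDiceSizeMutant', $strongSuite );

var_dump( $originalPassesWeak, $originalPassesStrong );
var_dump( $mutantSurvivesWeak, $mutantSurvivesStrong );
```

The suite without the integer 1 lets the mutant survive; adding that single case kills it, while the correct check passes both suites.
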
If your test suite bails out on (almost) any change in the tested code, this indicates that you have covered a lot of imaginable cases and, in that sense, that you tested well (remember that you can never be really sure!).

.. __: http://www.phpunit.de/wiki/Ideas#SupportforMutationTesting

For now, I can only recommend checking your code coverage report to see where you really failed to test code, and then switching on your brain to create sensible test cases for these parts. I'm pretty sure that this will already raise the quality of your code a lot. At least, this is what I have experienced again and again when looking at my code coverage reports and creating test cases for the parts that were not executed. Rest assured that you can clean up a lot of tiny (and possibly larger) bugs this way, and that you can avoid breaking BC much better than before.

So long, happy testing! :)

Comments
========

- Lukas at Thu, 12 Apr 2007 10:25:22 +0200

  I think code coverage is also nice just for motivation. You see this nice percentage go up every time you add a test. And when you add new code it goes down, so you might feel inclined to bring the number back up.

- Sebs at Thu, 12 Apr 2007 12:45:02 +0200

  Yeah, code mutation replaces test case know-how for some people. Sorry, but the whole article misses one important fact: cost. Real-life test suites on complex software very often seem to have minimal code coverage, because the testing effort (in terms of money) is too great if you'd like to have 100% code coverage. We work a lot with tests; actually, my suite tells me: 26/26 test cases complete: 1050 passes, 0 fails and 3 exceptions. We do a job here to earn money, and these tests help us to ensure we are delivering a high-quality product within an estimated amount of time, developer resources and money. As long as we still find bugs with creativity and a little bit of G.
  Erwin Thaller's know-how (http://www.amazon.de/Software-Test-Verifikation-Validation-Georg-Thaller/dp/3882291982/ref=pd_bbs_5/028-5037639-0784531?ie=UTF8&s=books&qid=1176374153&sr=8-5) I don't mind a code coverage run at all. It is too much of a false friend that has nothing to do with reality. Not that I say you're 100% wrong, but you forget the factors time to market and resources.

- Philippe Gamache at Thu, 12 Apr 2007 16:04:40 +0200

  (!is_int( $diceSize ) && $diceSize < 2) There is an error here... It can't be a non-integer and smaller than 2 at the same time (OK, a float can, but that should be an error too). (!is_int( $diceSize ) || $diceSize < 2) is more like it.

- Toby at Thu, 12 Apr 2007 17:10:40 +0200

  I think this approach is the greatest mistake made in software development nowadays. If you need to develop any kind of software for a customer (no matter if it is a completely custom program or standard software you are distributing), the price is mostly calculated from the pure development effort. Most companies save money in the areas of planning, designing and testing. This definitely allows you to come to market with a very low price. The major problem occurs later, when your customers complain about bugs and you have to debug them. This usually results in much more effort than if you had done proper testing before. Writing tests in parallel to (or even before) writing the actual code is much less pain than debugging later. And for most bugs that are found by a user, you pay the fixing price yourself in terms of your warranty.

- Toby at Thu, 12 Apr 2007 17:12:21 +0200

  Please read my text carefully again; I already hinted at this error. It was made intentionally to show the limits of what code coverage can do for you. :)

- noel darlow at Fri, 13 Apr 2007 06:05:35 +0200

  The TDD cycle is to write a test expressing a real implementation-agnostic requirement, write just enough code to make it pass, and then move on to the next requirement. Refactor later.
  There would never be any code not covered by tests. It's a nice way to explore new territory: little steps, one at a time.

- Toby at Fri, 13 Apr 2007 09:12:34 +0200

  Yes, this is true TDD. Although this works in some cases, you will find a lot of cases where you need to write tests for existing software (e.g. with the rapid prototyping approach).

- sf at Fri, 13 Apr 2007 09:31:31 +0200

  Rightful note. Thx!

- Sebs at Fri, 13 Apr 2007 15:38:02 +0200

  I was not clear enough: after a certain amount of experience, I say "coding" and mean the whole process, including testing and debugging.

- Travis Swicegood at Fri, 13 Apr 2007 18:47:09 +0200

  Interesting article. Often I hear code coverage treated as an end in itself, instead of being used as a tool as you've outlined here. One point that was missed here is the benefit of testing a legacy system. In many cases, an acceptance/functional test is more appropriate for a legacy system, as the underlying code will be refactored at some point. For this purpose you don't care whether there is 100% test coverage, so long as the required functionality is tested. That type of test does not aim for 100% coverage, but rather for just enough to make sure that the requirements of function X are met. At that point, you might have 10% test coverage simply because of cruft. Focusing on testing features that aren't necessary spends coding (as Sebs describes it) effort where it isn't really needed. The article does bring up a good point, though. When inheriting/using an unfamiliar system that needs to be built upon, code coverage definitely offers a useful tool for determining what needs to be tested and where.

- Soren at Mon, 16 Apr 2007 22:21:39 +0200

  A good article, thanks. My experience with code coverage is a mixed bag. I have recently been assigned to a project where the other "main" programmer has, since day one, highlighted at every meeting how high he is keeping his code coverage.
  However, when I started looking at the code, his code coverage was almost equivalent to the amount of simple getters and setters; none of the critical code was tested at all. While this is no problem to me (if I didn't have to work on the project, that is), I think the real issue is that non-tech people are easily amazed by graphs and numbers, as if they were the truth about the state of the code.

- Toby at Mon, 16 Apr 2007 22:25:34 +0200

  I see the problem. That is why I wrote this article: code coverage is often misused and also misunderstood. Thanks for your feedback! :)