Python. Good, bad, evil -3-: Flow control exceptions

My grandma is a wise lady. She told me many useful mantras which I should repeat every night before going to bed. One of these is "Never use exceptions for flow control". And she is so right about this. It is so true, in fact, that every OOP newbie should get this mantra tattooed on his hand. During my experiments with Python I sadly found it violated in a central place: the implementation of iterators.

This entry is part of my series Python. Good, bad, evil, where I discuss my experiences with Python from a PHP developer's point of view. Find references to the successor episodes in the trackbacks of this post. The previous episodes are:

  1. Missing braces

  2. Native sets

Exceptions

Wikipedia defines

Unlike signals and event handlers that are part of the normal program flow, exceptions are typically used to signal that something went wrong (e.g. a division by zero occurred or a required file was not found).

which gives a very clear impression of what exceptions are and what they are not. An exception breaks the current control flow of an application immediately and jumps out of the current scope, without processing any subsequent instructions. And it does this recursively: if you throw (or in Python raise) an exception in a nested function or method call, these calls are terminated one after another and the exception bubbles up until it is either caught or reaches the global application scope, where it typically results in a fatal error. Therefore: Exceptions are meant for error handling, not for normal flow control.
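A minimal sketch of this bubbling, written in Python 3 syntax. The function names are made up purely for illustration:

```python
def inner():
    # Raising aborts inner() immediately; nothing after this line would run.
    raise ValueError("something went wrong")


def middle():
    inner()
    # This line is skipped: the exception passes straight through middle().
    return "middle finished"


def outer():
    try:
        return middle()
    except ValueError as error:
        # The exception bubbled up two call levels and is caught here.
        return "caught: " + str(error)


result = outer()
print(result)
```

Without the try/except in outer(), the exception would travel on to the top level and terminate the script with a traceback.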

Most programming languages allow you to catch (or in Python except) exceptions, which enables the programmer to react even to fatal errors and make the application terminate gracefully. This is especially useful to present a meaningful error message to the user or to continue processing other modules of the application, if they are not affected by the error.

Exceptions, as the name states, indicate an exceptional state of your application or at least a certain module of it. An exceptional state is a state which does not allow further graceful processing of the desired action. A short example to illustrate what this means:

Imagine your application contains a module A that reads records from a database. Imagine further that the connection to the database cannot be established due to a failure of the database server. This is an exceptional state for module A, since its purpose is to read from the database and it cannot fulfill this purpose. The module cannot perform any further operations, so it can throw an exception. But maybe you also have a module B, which does not need a database connection. In that case, your application might still be partially functional, since module B can still operate properly. If, in contrast to this, module C depends on the result of module A, this module would also be in an exceptional state: it does not receive the necessary data due to the exceptional state in module A.
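The module scenario could look roughly like the following Python 3 sketch. All names (DatabaseUnavailable, the module functions, run_application) are hypothetical and only serve the illustration:

```python
class DatabaseUnavailable(Exception):
    """Hypothetical error type: module A cannot reach its database."""


def module_a_read_records(server_up):
    # Module A only exists to read records, so a dead server is exceptional for it.
    if not server_up:
        raise DatabaseUnavailable("cannot connect to database server")
    return ["record 1", "record 2"]


def module_b_render_static_page():
    # Module B does not need the database at all.
    return "static page"


def module_c_count_records(server_up):
    # Module C depends on module A, so A's exception passes through it.
    return len(module_a_read_records(server_up))


def run_application(server_up):
    results = {}
    modules = (
        ("B", module_b_render_static_page),
        ("C", lambda: module_c_count_records(server_up)),
    )
    for name, module in modules:
        try:
            results[name] = module()
        except DatabaseUnavailable as error:
            # Only the affected module fails; the others keep working.
            results[name] = "unavailable: " + str(error)
    return results
```

With the server down, module B still delivers its result while module C reports the error it inherited from module A.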

So, why shouldn't exceptions be used to control the flow of your application? Wouldn't it be cool, if you could simply jump out of a chain of nested method calls without manually returning the desired result from each involved method? No, it would not! And that for several reasons:

  • Using exceptions for flow control makes your code as un-maintainable as using goto. Tracing the control flow of your code gets hard. You can never see where a thrown exception might be caught and, vice versa, you don't know where a caught exception was actually thrown.

  • The semantics of your program are messed up. Exceptions are meant for error handling. If you mix both ways of dealing with them, it is not clear what an exception is meant for in a certain place. Even if you only use exceptions for flow control, this is on the one hand stupid, since you lose the power of exceptions for error handling, and on the other hand you confuse other people looking at your code a lot.

  • Because exceptions are made for error situations, they cause the generation of a stack trace. This typically includes a list of the function / method stack (that's where the name comes from), but can also contain parameter and other variable dumps. Generating stack traces therefore wastes computation time and memory, if you use exceptions outside of error conditions.
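To make the first two points a bit more tangible, here is a small Python 3 sketch of the same search written twice: once abusing an exception to jump out of nested calls, once with plain return values. The Found exception and all function names are made up for this illustration, and this is not the PHP snippet mentioned below:

```python
class Found(Exception):
    """Hypothetical exception misused to carry a result out of nested calls."""

    def __init__(self, value):
        super().__init__(value)
        self.value = value


def search_row_abusing(row, wanted):
    for cell in row:
        if cell == wanted:
            raise Found(cell)  # not an error, just a shortcut out of the loops


def search_grid_abusing(grid, wanted):
    try:
        for row in grid:
            search_row_abusing(row, wanted)
    except Found as found:
        return found.value
    return None


def search_row(row, wanted):
    for cell in row:
        if cell == wanted:
            return cell
    return None


def search_grid(grid, wanted):
    # The result is handed back explicitly, level by level.
    for row in grid:
        hit = search_row(row, wanted)
        if hit is not None:
            return hit
    return None
```

Both variants return the same results, but in the first one nothing in search_row_abusing() tells you where the "result" ends up, and a reader of search_grid_abusing() has to hunt for the place where Found is raised.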

I put a little example of PHP code into my php-snippets repository on Github to demonstrate these issues a bit. Besides the really ugly code, a quick benchmark [1] revealed that the code that uses exceptions for flow control takes about twice as much time and consumes roughly three times as much memory as the correct implementation. I assume things are similar in Python.

Using exceptions for flow control is discussed in several other places on the web, too:

The essence of this rant: Never forget my grandma!

Iterators

Let's come back to the original topic of this post, after the excursion on exceptions. Python and PHP both give you the tools to easily make an object iterable. The following examples show a simple, stupid implementation that iterates over the first 5 even numbers, in both languages. Realizing such an iteration is possible in many other, much easier, ways without involving OOP at all. But you know, examples …

PHP

In PHP, you need to implement the Iterator interface with your class, to allow iteration on its instances:

    <?php

    class EvenIterator implements Iterator
    {
        private $i;

        public function current()
        {
            return $this->i;
        }

        public function next()
        {
            $this->i += 2;
        }

        public function valid()
        {
            return ( $this->i < 11 );
        }

        public function rewind()
        {
            $this->i = 2;
        }

        public function key()
        {
            return ( $this->i / 2 );
        }
    }

    ?>

The method current() needs to return the element currently selected by the iterator, while next() advances to the next element. valid() is used to determine whether the iterator has reached its end; in that case it needs to return false. The rewind() method is called before an iteration is started, to set the iterator to its initial state. Keep these last two facts in mind for the Python code in a bit. Since the PHP foreach loop can handle iteration keys and values at the same time, the key() method is expected to return the key of the currently selected element. This can be an integer or string for arrays, but for Iterator implementations also any other kind of variable.

An iteration in PHP follows these steps:

  1. Call rewind() to reset the iterator.

  2. Call valid() to check if there is a current element.

  3. Call current() to receive the current element.

  4. Call key() to receive the current key.

  5. Call next() to advance the iterator.

  6. Call valid() to check if there is a current element.

  7. Call current() to receive the current element.
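The steps above can be sketched as a plain loop. To stay in one language for the examples, this is a Python 3 transcription of the PHP class, and php_foreach() is a hypothetical helper that merely mimics what foreach does with an Iterator. It is not how PHP implements it internally:

```python
class PhpStyleEvenIterator:
    """Python transcription of the PHP EvenIterator, for illustration only."""

    def rewind(self):
        self.i = 2

    def valid(self):
        return self.i < 11

    def current(self):
        return self.i

    def key(self):
        return self.i // 2

    def next(self):
        self.i += 2


def php_foreach(iterator):
    """Collect (key, value) pairs the way a PHP foreach would visit them."""
    pairs = []
    iterator.rewind()                 # step 1
    while iterator.valid():           # steps 2 and 6
        value = iterator.current()    # steps 3 and 7
        key = iterator.key()          # step 4
        pairs.append((key, value))
        iterator.next()               # step 5
    return pairs
```

Because rewind() is called at the start of every loop, running php_foreach() twice on the same object visits the same elements both times. No exception is involved in signaling the end of the iteration.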

The following code illustrates the usage of the iterator implementation shown above:

    <?php

    $itr = new EvenIterator();

    echo "First:\n";
    foreach ( $itr as $key => $val )
    {
        echo "$val\n";
    }

    echo "Second:\n";
    foreach ( $itr as $key => $val )
    {
        echo "$val\n";
    }

    ?>

The output of this code snippet is, as you might have guessed, the following:

    First:
    2
    4
    6
    8
    10
    Second:
    2
    4
    6
    8
    10

Note for later that two iterations are performed here, using the very same iterator object.

Python

To implement an iterator in Python, you only need to implement two methods instead of the 5 you need in PHP. That looks much easier at first glance:

    class EvenIterator:
        _i = 0

        def __iter__(self):
            return self

        def next(self):
            self._i += 2
            if self._i > 11:
                raise StopIteration
            return self._i

Python does not handle keys in its for loop by default, so the key() method can obviously be left out. In addition, Python's next() method covers the actions of both current() and next() in PHP. A Python iteration cannot access the selected element twice, but only access elements while advancing. This is a little less flexible than in PHP, since you cannot call current() manually in a loop, but for most applications this works fine.

I was wondering why there is no need for a rewind() method or similar, which would be called at the beginning of a loop. You will see in a bit what this means for iterators in Python.

Still, there is another magic method implemented in the class above, which is required by the protocol: __iter__() returns the iterator for an object when it is used in a for loop. The idea here is that a class does not necessarily have to be the iterator for its own elements. If the class encapsulates an array that should be iterated whenever an instance of the class is iterated, you can simply return an iterator for that array, iter(array), from __iter__().
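A minimal sketch of such a wrapper class in Python 3. The class name Playlist and its contents are invented for the example. Note that __iter__() has to return an iterator, so the wrapped list is passed through the built-in iter() rather than returned directly:

```python
class Playlist:
    """Hypothetical container that wraps a list but is not itself an iterator."""

    def __init__(self, titles):
        self._titles = list(titles)

    def __iter__(self):
        # Delegate to the list: iter() returns a fresh list iterator.
        return iter(self._titles)


playlist = Playlist(["intro", "verse", "outro"])
titles = [title for title in playlist]
print(titles)
```

The class itself needs no next() method here, since the list iterator does all the work.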

In the example shown, the class itself implements the iterator, so this special method simply returns the instance it was called on.

So far, so good. Then what does all this have to do with abusing exceptions for flow control? As you might have noticed: Python does not have a valid() method. To indicate that the iteration is finished, you need to raise an exception of type StopIteration (sic!).
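What a for loop does with this exception can be spelled out by hand. The following is a rough Python 3 sketch (there the method is called __next__() and is invoked through the built-in next(); in the Python 2 code of this post it is simply next()). The helper name manual_for is made up:

```python
def manual_for(iterable):
    """Roughly what `for value in iterable: collected.append(value)` does."""
    collected = []
    iterator = iter(iterable)  # calls iterable.__iter__()
    while True:
        try:
            value = next(iterator)  # calls iterator.__next__()
        except StopIteration:
            break  # the exception is the "no more elements" signal
        collected.append(value)
    return collected


print(manual_for([2, 4, 6, 8, 10]))
```

So every regular, completely successful loop ends with an exception being raised and caught behind the scenes.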

Come on Python guys, you really make my grandma cry! Is it so hard to realize this functionality through implementing a second method which performs the check and returns a boolean? Do you really need to teach the kids that using exceptions for flow control is a good idea? If an iterator reaches its end, this is by no means an exceptional state for it. An iterator is actually expected to always reach an end at some point.

OK, enough bashing for now. Let's look into using the iterator from above:

    itr = EvenIterator()

    print "First:"
    for v in itr:
        print v

    print "Second:"
    for w in itr:
        print w

Naively I would expect the above code to yield the exact same result as the corresponding PHP version. However, this assumption is critically wrong:

    First:
    2
    4
    6
    8
    10
    Second:

Looking again at the code, it is logical that the second iteration does not take place at all. The internal state of the iterator is still at the end, and since there is no way for the iterating for loop to reset this state, the second iteration cycle terminates as soon as it starts.

One could now argue that I should have reset the iterator's state right after terminating the first iteration round. At least that was my intuition. However, the Python docs consider such an implementation broken:

The intention of the protocol is that once an iterator’s next() method raises StopIteration, it will continue to do so on subsequent calls. Implementations that do not obey this property are deemed broken.
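Python's built-in iterators behave exactly as the quoted passage demands, which a few lines of Python 3 can show (the variable names are arbitrary):

```python
iterator = iter([2, 4])
seen = [next(iterator), next(iterator)]  # consume both elements

raised = 0
for _ in range(3):
    try:
        next(iterator)
    except StopIteration:
        # An exhausted iterator raises again on every further call.
        raised += 1

print(seen, raised)
```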

So, the only way to achieve the desired behavior without manual interaction between iterations would be to clone (in Python copy.copy()) the object in the __iter__() method, before returning it. Surely, this would yield higher memory consumption and eat up some execution time for the copying process, even though copy.copy() only creates a shallow copy.
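A sketch of this cloning approach, under a few assumptions: it is written for Python 3 (hence __next__(), with next kept as an alias for the Python 2 spelling), the class is renamed to EvenNumbers to keep it apart from the example above, and the counter lives in an instance attribute:

```python
import copy


class EvenNumbers:
    def __init__(self):
        self._i = 0

    def __iter__(self):
        # Hand out a shallow copy, so this instance itself is never advanced.
        return copy.copy(self)

    def __next__(self):
        self._i += 2
        if self._i > 10:
            raise StopIteration
        return self._i

    next = __next__  # Python 2 spelling of the same method


numbers = EvenNumbers()
first = list(numbers)
second = list(numbers)
print(first, second)
```

Each loop now works on its own copy, so both passes see all five numbers, and every copy still keeps raising StopIteration once it is exhausted.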

Conclusion

Python has at least one very bad example of poor OOP design in its deepest heart. Even the idea of showing programmers how to use exceptions for flow control in the main language docs gives my grandma a heart attack. Aside from that big thing, I don't get the point of the above piece of documentation. Maybe I overlooked something? Implementing an iterator in PHP is more work and one can argue about the pros and cons of the interface design. However, it is a clean OOP approach for realizing an iterator.

[1] Yes, benchmarks suck, I know.

Comments

I think these are the "old-style" iterators in Python. New iterators are using yield. (see also itertools)

Peter Smit at 2010-04-01

Yield is used in generators, there is a slight difference between a generator and simply iterating over it.

Harro at 2010-04-01

Yes, I'm aware of yield. AFAIK it's just a different way of implementing iterators. I did not find any note about a deprecation of the shown method. Can someone point me to a hint there, if it is really the case? Thx!

Toby at 2010-04-01

just 2 cents not related to the essence of the article: - starting from "So, why shouldn't exceptions be used to control the flow of your application?" everything is in italics here, at least under Opera :) - where's a typo in "iterator's"??

Tomek at 2010-04-01

Tomek,

Could you send me a screenshot via email, please? It looks fine in Firefox and some Opera version I tried at a friend's computer.

The typo is the 's. 's is used as a shortcut for "is". In the given case it should be "iterators" not "iterator's". At least AFAIK.

Regards, Toby

Toby at 2010-04-01

I've switched to python and liked it. But why they say it is full OOP, if you cannot make type hinting? And no interface keyword? Well, multiple inheritance is there...

BTW, I like this post

python_tired at 2010-04-01

Rob at 2010-04-02

Rob,

thanks for correcting my English misunderstanding. I removed the offending typo hint.

Regards, Toby

Toby at 2010-04-02

Hi Toby,

Excellent articles!

We must admit that ipython is really much more useful than PEAR::PHP_Shell.

Part1:

The first part looked like an article i had just read, by a Python programmer about PHP: http://spyced.blogspot.com/2005/06/why-php-sucks.html

Anyway, many disagree with the first article of the series, the "missing braces" (and parenthesis and semicolons and -> ...). All coders i've seen coming from PHP or C++ to Python or Go have an impression of "fresh air" - except you of course :P

Note that it is possible to reduce the annoyance of special characters with modified keyboard layouts like dvorak-code or dvorak-jpic.

Part2:

SPL is still a young extension and should be used with care, some BC breaks are committed and not added to the PHP5.3 BC breaks list. It is also poorly documented and ArrayObject for example is really a pain at the beginning (before you learn all its flaws).

So, is it really fair to compare SPL?

Part3:

Unfortunately, only a handful of Python users will understand your point about exceptions. Even after a few years with Python, it still annoys me to be forced to "catch an exception" just to control the flow .... What happened to the standard "return -1" ? is your grandma hiding it?

Actually, you've only got (unfortunately again) half the truth!

Because Python users are too busy using exceptions for flow control: they actually (almost) never check for sanity of arguments! ie. If a string is passed instead of a float, the error will be revealed much later in the code .... I don't get why Python users don't do these simple sanity checks when they know that not doing it will make it harder for users to debug. That's obviously the "evil" part :P

Finally, i hope you got the easter egg, try "import this" in the python console.

jpic at 2010-05-05