Python. Good, bad, evil -3-: Flow control exceptions
My grandma is a wise lady. She told me many useful mantras which I should repeat every night before going to bed. One of these is Never use Exceptions for flow control. And she is so right with this. It is even so true, that every OOP newbie should get this mantra tattooed on his hand. During my experiments with Python I sadly found it violated in a central place: When it comes to implementing an iterator.
This entry is part of my series Python. Good, bad, evil, where I discuss my experiences with Python from a PHP developers point of view. Find references to the successor episodes in the trackbacks of this post. The previous episodes are:
Exceptions
Wikipedia defines
Unlike signals and event handlers that are part of the normal program flow, exceptions are typically used to signal that something went wrong (e.g. a division by zero occurred or a required file was not found).
what gives a very clear impression of what exceptions are and what not. An exception breaks the current control flow of an application immediately and jumps out of the current scope, without further processing of subsequent instructions. And it does this recursively. So if you throw
(or in Python raise
) nd exception in a nested function or method call, these calls are recursively terminated and the exception bubbles up until it reaches the global application scope, and there typically results in a fatal error, or it is caught. Therefore: Exceptions are meant for error handling, not for normal flow control.
Most programming languages allow you to catch
(or in Python except
) exceptions, which enables the programmer to react even on fatal errors and make the application terminate gracefully. This is especially useful to present a meaningful error message to the user or to continue processing other modules of the application, if they are not affected by the error.
Exceptions, as the name states, indicate an exceptional state of your application or at least a certain module of it. An exceptional state is a state which does not allow further graceful processing of the desire action. A short example to illustrate what this means:
Imagine your application contains a module A that reads records from a database. Imagine further, that the connection to the database can not be established due to a failure of the database server. This is an exceptional state for module A, since its purpose is to read from the database and it cannot fulfill this purpose. The module cannot perform further operation so it can throw an exception. But maybe you also have a module B, which does not need a database connection. In that case, your application might still be partially functional, since module B can still operate properly. If, in contrast to this, module C depends on the result of module A, this module would also be in an exceptional state. It does not receive the necessary data due to an exceptional state in module A.
So, why shouldn't exceptions be used to control the flow of your application? Wouldn't it be cool, if you could simply jump out of a chain of nested method calls without manually returning the desired result from each involved method? No, it would not! And that for several reasons:
Using exceptions for flow control makes your code as un-maintainable as using
goto
. Tracing the control flow of your code gets hard. You can never see, where the result of an exception might be caught and vice versa, you don't know where a caught exception might be actually thrown.The semantics of your program is messed up. Exceptions are meant for error handling. If you mix both ways of dealing with them, it is not clear, what an exception is meant for in a certain place. Even if you only use exceptions for flow control. This on the one hand stupid, since you loose the power of exceptions for error handling, on the other hand you confuse other people looking at your code a lot.
Because exceptions are made for error situations, they cause the generation of a stack trace. This typically includes a list of the function / method stack (that's where the name comes from), but can also contain parameter and other variable dumps. Generating stack traces therefore wastes computation time and memory, if you use exceptions outside of error conditions.
I put a little example of PHP code into my php-snippets repository on Github to demonstrate these issues a bit. Beside the really ugly code, a quick benchmark [1] revealed that the code that uses exceptions for flow control takes about twice as much time and consumes roughly three times more memory as the correct implementation. I assume things is similar in Python.
Using exceptions for flow control is discussed in several other places on the web, too:
This article in the MSN blogs discusses the performance issues with exceptions
Javas PMD has an extra rule to detect exception abuse for flow control
The essence of this rant: Never forget my grandma!
Iterators
Let's come back to the original topic of this post, after the excursion on exceptions. Python, as well as PHP, both give you the tools to easily make an object iteratable. The following examples show you a simple, stupid implementation on how to iterate over the first 5 even numbers, in both languages. Realizing such an iteration is possible in many other, much easier, ways without involving OOP at all. But you know, examples …
PHP
In PHP, you need to implement the Iterator interface with your class, to allow iteration on its instances:
<?php
class EvenIterator implements Iterator
{
private $i;
public function current()
{
return $this->i;
}
public function next()
{
$this->i += 2;
}
public function valid()
{
return ( $this->i < 11 );
}
public function rewind()
{
$this->i = 2;
}
public function key()
{
return ( $this->i / 2 );
}
}
?>
The method current()
needs to return the element currently selected by the iterator, while next()
advances to the next element. valid()
is used to determine if the iterator reached its end. In this case it needs to return false
. The rewind()
method is called before an iteration is started, to set the iterator to its initial state. Remember the last two facts for looking at the Python code in a bit. Since the PHP foreach
loop can handle iteration keys and values at the same time, the key()
method is expected to return the key of the currently selected element. This can be an integer or string for arrays, but for Iterator
implementations also any other kind of variable.
An iteration in PHP follows these steps:
Call
rewind()
to reset the iterator.Call
valid()
to check if there is a current element.Call
current()
to receive the current element.Call
key()
to receive the current key.Call
next()
to advance the iterator.Call
valid()
to check if there is a current element.Call
current()
to receive the current element.…
The following code illustrates the usage of the iterator implementation shown above:
<?php
$itr = new EvenIterator();
echo "First:\n";
foreach ( $itr as $key => $val )
{
echo "$val\n";
}
echo "Second:\n";
foreach ( $itr as $key => $val )
{
echo "$val\n";
}
?>
The output of this code snippet is, as you might have guessed, the following:
First:
2
4
6
8
10
Second:
2
4
6
8
10
Note for later, that two iterations are performed here, using the very same iterator object.
Python
To implement an iterator in Python, you only need to implement two methods instead of the 5 you need in PHP. That looks much easier at a first glance:
class EvenIterator:
_i = 0
def __iter__(self):
return self
def next(self):
self._i += 2
if self._i > 11:
raise StopIteration
return self._i
Python does not handle keys in its for
loop by default, so the key()
method can obviously be left out. In addition, Pythons next()
method covers the actions of current()
and next()
in PHP. A Python iteration cannot access the selected element twice, but only access elements while advancing. This is a little less flexible than with PHP, since you cannot call current()
manually in a loop, but for most application this works fine.
I was wondering, why there is no need for a rewind()
method or similar, which is called at the beginning of a loop. You will see in a bit, what this means for iterators in Python.
Still, there is another magic method implemented in the class above, which is needed according to the requirements: __iter__()
returns the iterator for an object, when it is used in the for
loop. The sense here is, that a class must not necessarily be the iterator for its elements. If the class encapsulates an array, you want to be iterated if an instance of the class is iterated, you can simple return the array fro __iter__()
.
In the shown example, the iterator should be implemented by the class itself, therefore this special method returns just the signaled instance.
So far so well. Then what does this all have to do with abusing exception for flow control? As you might have noted: Python does not have a valid()
method. To indicate that the iteration is finished, you need to raise an
exception of type StopIteration
(sic!).
Come on Python guys, you really make my grandma cry! Is it so hard to realize this functionality through implementing a second method which performs the check and returns a boolean? Do you really need to teach the kids that using exceptions for flow control is a good idea? If an iterator reaches its end, this is by no means an exceptional state for it. An iterator is actually expected to always reach an end at some point.
OK, enough bashing for now. Let's look into using the iterator from above:
itr = EvenIterator()
print "First:"
for v in itr:
print v
print "Second:"
for w in itr:
print w
Naively I would expect the above code to yield the exact same result as the corresponding PHP version. However, this assumption is critically wrong:
First:
2
4
6
8
10
Second:
Looking again at the code, it is logical that the second iteration does not take place at all. The internal state of the iterator is still at the end and since there is now way for the iterating for
loop to reset this state, the second iteration cycle terminates as it starts.
One could now argue, that I should have reset the iterators state right after terminating the first iteration round. At least that was my intuition. However, the Python docs consider such an implementation broken:
The intention of the protocol is that once an iterator’s next() method raises StopIteration, it will continue to do so on subsequent calls. Implementations that do not obey this property are deemed broken.
So, the only way to achieve the desired behavior without manual interaction between iteration would be to clone (in Python copy()
) the object in the __iter__()
method, before returning it. Surely, this would yield higher memory consumption and eat up some execution time for the copying process, even if Python uses copy on write.
Conclusion
Python has at least one very bad example for poor OOP design in its deepest heart. Even the idea of showing programmers how to use exceptions for flow control in the main language docs makes my grandma get a heart attack. Aside of that big thing, I don't get the point of the above piece of documentation. Maybe I overlooked something? Implementing an iterator in PHP is more work and one can start to argue on the pros and cons of the interface design. However, it is a clean OOP approach for realizing an iterator.
- 1
- Yes, benchmarks suck, I know.
Comments