FAQ
I observe that Python libraries primarily use exceptions for error
handling rather than error codes.

In the article API Design Matters by Michi Henning

Communications of the ACM
Vol. 52 No. 5, Pages 46-56
10.1145/1506409.1506424
http://cacm.acm.org/magazines/2009/5/24646-api-design-matters/fulltext

It says "Another popular design flaw--namely, throwing exceptions for
expected outcomes--also causes inefficiencies because catching and
handling exceptions is almost always slower than testing a return
value."

My observation contradicts the above statement by Henning. If
my observation is wrong, please just ignore my question below.

Otherwise, could some Python expert explain to me why exceptions are
widely used for error handling in Python? Is it because efficiency
is not the primary goal of Python?


  • Chris Rebert at Jan 1, 2010 at 5:24 am

    On Thu, Dec 31, 2009 at 8:47 PM, Peng Yu wrote:
    I observe that Python libraries primarily use exceptions for error
    handling rather than error codes.

    In the article API Design Matters by Michi Henning

    Communications of the ACM
    Vol. 52 No. 5, Pages 46-56
    10.1145/1506409.1506424
    http://cacm.acm.org/magazines/2009/5/24646-api-design-matters/fulltext

    It says "Another popular design flaw--namely, throwing exceptions for
    expected outcomes--also causes inefficiencies because catching and
    handling exceptions is almost always slower than testing a return
    value."

    My observation contradicts the above statement by Henning. If
    my observation is wrong, please just ignore my question below.

    Otherwise, could some Python expert explain to me why exceptions are
    widely used for error handling in Python? Is it because efficiency
    is not the primary goal of Python?
    Correct; programmer efficiency is a more important goal for Python instead.
    Python is ~60-100x slower than C;[1] if someone is worried by the
    inefficiency caused by exceptions, then they're using completely the
    wrong language.

    Cheers,
    Chris
  • Peng Yu at Jan 2, 2010 at 1:36 am

    On Thu, Dec 31, 2009 at 11:24 PM, Chris Rebert wrote:
    On Thu, Dec 31, 2009 at 8:47 PM, Peng Yu wrote:
    I observe that Python libraries primarily use exceptions for error
    handling rather than error codes.

    In the article API Design Matters by Michi Henning

    Communications of the ACM
    Vol. 52 No. 5, Pages 46-56
    10.1145/1506409.1506424
    http://cacm.acm.org/magazines/2009/5/24646-api-design-matters/fulltext

    It says "Another popular design flaw--namely, throwing exceptions for
    expected outcomes--also causes inefficiencies because catching and
    handling exceptions is almost always slower than testing a return
    value."

    My observation contradicts the above statement by Henning. If
    my observation is wrong, please just ignore my question below.

    Otherwise, could some Python expert explain to me why exceptions are
    widely used for error handling in Python? Is it because efficiency
    is not the primary goal of Python?
    Correct; programmer efficiency is a more important goal for Python instead.
    Python is ~60-100x slower than C;[1] if someone is worried by the
    inefficiency caused by exceptions, then they're using completely the
    wrong language.
    Could somebody let me know how Python calls and exceptions are
    dispatched? Is there a reference for it?
  • Stephen Hansen at Jan 2, 2010 at 8:05 am

    On Fri, Jan 1, 2010 at 5:36 PM, Peng Yu wrote:

    Otherwise, could some Python expert explain to me why exceptions are
    widely used for error handling in Python? Is it because efficiency
    is not the primary goal of Python?
    Correct; programmer efficiency is a more important goal for Python instead.
    Python is ~60-100x slower than C;[1] if someone is worried by the
    inefficiency caused by exceptions, then they're using completely the
    wrong language.
    Could somebody let me know how Python calls and exceptions are
    dispatched? Is there a reference for it?
    I don't quite understand what you're asking here, but it sounds almost like
    you're looking at the question from an incorrect POV. "Exceptions" are a
    general sort of concept in computer science and various computer programming
    languages, but they are not at all equal from one language to another. The
    document you referenced was speaking to a particular implementation of the
    concept, and speaking to particular characteristics of that language's
    implementation. Even though it's not talking just about, say, C, C#, Java, or
    anything -- it's speaking from a certain POV of a certain class of
    languages.

    In Python, setting up an exception handler -- the 'try' clause -- costs
    virtually nothing. It's about equivalent to having a 'pass' statement in
    there. If you do a test for every iteration of some activity, you're
    incurring a non-negligible cost each time. If you're performing an action
    and "usually" (to varying definitions of 'usually') it's going to succeed --
    then that test will result in far more cost in time than using a try/except
    clause in Python.

    Because of the implementation of exceptions in Python, you only pay a more
    expensive cost /if/ that exception is thrown and handled. If it's very likely
    that in a situation an exception would be thrown, then yes -- you should
    probably test first... if that exception-catch is so expensive as to be
    important to you. In most cases, it's not. In the vast majority of cases,
    this is premature optimization and often adds additional weight to the test,
    as you have to protect against race conditions. (As an aside, in many cases
    using exceptions actually helps you with a wider problem of preventing race
    conditions. It's not a cure-all by any means, but it helps.)

    If someone specifies a file, the chances are -- the file is there. It's
    cheaper for you to just try to open it than to test if it's there first,
    because if you do the test, then the likely circumstance (the file exists)
    is running code to test that case /every time/. Whereas if you just try to
    open it, in the likely circumstance -- it works. In the exceptional
    circumstance, it might cost more than if you had tested first... but that's
    an exceptional circumstance and therefore rarer.
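
    As a rough sketch of the trade-off described above (the file name here is
    just a placeholder, not something from the thread):

    import os

    path = "data.txt"  # placeholder name, purely for illustration

    # LBYL: every call pays for the existence test, and the file can still
    # disappear between the test and the open() call.
    if os.path.exists(path):
        with open(path) as f:
            data = f.read()
    else:
        data = ""

    # EAFP: the try block itself costs roughly as much as a pass statement;
    # the expensive path only runs in the rare case the file is missing.
    try:
        with open(path) as f:
            data = f.read()
    except IOError:  # on Python 3 this also catches FileNotFoundError
        data = ""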

    --S
  • Ulrich Eckhardt at Jan 2, 2010 at 12:01 pm

    Peng Yu wrote:
    Could somebody let me know how Python calls and exceptions are
    dispatched? Is there a reference for it?
    I'm not a Python expert, but I have read some parts of the implementation.
    Hopefully someone steps up if I misrepresent things here...

    In order to understand Python exception handling, take a look at various C
    function implementations. You will see that they commonly return a pointer
    to a Python object (PyObject*), even if it is a pointer to the 'None'
    singleton. So, a function in Python _always_ returns something, even if it
    is 'None'.

    If, at the C level, a function doesn't return anything (i.e. it returns a
    C NULL pointer), that means that the function raised an exception. Checking
    this pointer is pretty easy: typically you check it, clean up and return
    NULL yourself. Further functions for manipulating the exception stack and
    declarations of exception types and singletons are found in pyerrors.h (in
    Python 2.5, at least).

    I mentioned an "exception stack" above, though I'm not 100% sure if that is
    the proper term. I think that exceptions can be stacked upon each other
    (e.g. an HTTPD throwing a high-level RequestError when it encounters a low-
    level IOError) and that that is also how the backtrace is implemented, but
    I'm not sure about that.


    Hopefully someone can confirm or correct me here, and I hope this helped you.

    Cheers!

    Uli
  • Martin v. Loewis at Jan 2, 2010 at 12:52 pm

    I mentioned an "exception stack" above, though I'm not 100% sure if that is
    the proper term. I think that exceptions can be stacked upon each other
    (e.g. an HTTPD throwing a high-level RequestError when it encounters a low-
    level IOError) and that that is also how the backtrace is implemented, but
    I'm not sure about that.
    Not exactly. In this scenario, the IOError exception gets caught, its
    entire traceback discarded, and an entirely new exception RequestError
    gets raised (that has no connection to the original IOError anymore,
    unless the httpd code explicitly links the two).

    Instead, the traceback objects are created for a single exception.
    They are essentially the same as the call stack, just in reverse
    order (so that you get the "most recent call last" traceback output).
    Each traceback object links to a frame object and to the next traceback object.
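
    A small illustrative sketch (not from the original mail) of the chain
    Martin describes, using invented function names:

    import sys

    def inner():
        raise ValueError("boom")

    def outer():
        inner()

    try:
        outer()
    except ValueError:
        tb = sys.exc_info()[2]          # head of the traceback chain
        while tb is not None:
            frame = tb.tb_frame
            print("%s, line %d" % (frame.f_code.co_name, tb.tb_lineno))
            tb = tb.tb_next             # one call deeper, matching the
                                        # "most recent call last" output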

    Regards,
    Martin
  • Diez B. Roggisch at Jan 2, 2010 at 12:05 pm

    Peng Yu schrieb:
    On Thu, Dec 31, 2009 at 11:24 PM, Chris Rebert wrote:
    On Thu, Dec 31, 2009 at 8:47 PM, Peng Yu wrote:
    I observe that Python libraries primarily use exceptions for error
    handling rather than error codes.

    In the article API Design Matters by Michi Henning

    Communications of the ACM
    Vol. 52 No. 5, Pages 46-56
    10.1145/1506409.1506424
    http://cacm.acm.org/magazines/2009/5/24646-api-design-matters/fulltext

    It says "Another popular design flaw--namely, throwing exceptions for
    expected outcomes--also causes inefficiencies because catching and
    handling exceptions is almost always slower than testing a return
    value."

    My observation contradicts the above statement by Henning. If
    my observation is wrong, please just ignore my question below.

    Otherwise, could some Python expert explain to me why exceptions are
    widely used for error handling in Python? Is it because efficiency
    is not the primary goal of Python?
    Correct; programmer efficiency is a more important goal for Python instead.
    Python is ~60-100x slower than C;[1] if someone is worried by the
    inefficiency caused by exceptions, then they're using completely the
    wrong language.
    Could somebody let me know how Python calls and exceptions are
    dispatched? Is there a reference for it?
    The source?

    http://python.org/ftp/python/2.6.4/Python-2.6.4.tgz

    These are really deep internals that - if they really concern you - need
    intensive study, not casual reading of introductory documents. IMHO
    you shouldn't worry, but then, there are a lot of things you seem to care
    about that I wouldn't... :)

    Diez
  • Peng Yu at Jan 2, 2010 at 3:04 pm

    On Sat, Jan 2, 2010 at 6:05 AM, Diez B. Roggisch wrote:
    Peng Yu schrieb:
    On Thu, Dec 31, 2009 at 11:24 PM, Chris Rebert wrote:
    On Thu, Dec 31, 2009 at 8:47 PM, Peng Yu wrote:

    I observe that Python libraries primarily use exceptions for error
    handling rather than error codes.

    In the article API Design Matters by Michi Henning

    Communications of the ACM
    Vol. 52 No. 5, Pages 46-56
    10.1145/1506409.1506424
    http://cacm.acm.org/magazines/2009/5/24646-api-design-matters/fulltext

    It says "Another popular design flaw--namely, throwing exceptions for
    expected outcomes--also causes inefficiencies because catching and
    handling exceptions is almost always slower than testing a return
    value."

    My observation contradicts the above statement by Henning. If
    my observation is wrong, please just ignore my question below.

    Otherwise, could some Python expert explain to me why exceptions are
    widely used for error handling in Python? Is it because efficiency
    is not the primary goal of Python?
    Correct; programmer efficiency is a more important goal for Python
    instead.
    Python is ~60-100x slower than C;[1] if someone is worried by the
    inefficiency caused by exceptions, then they're using completely the
    wrong language.
    Could somebody let me know how Python calls and exceptions are
    dispatched? Is there a reference for it?
    The source?

    http://python.org/ftp/python/2.6.4/Python-2.6.4.tgz

    These are really deep internals that - if they really concern you - need
    intensive study, not casual reading of introductory documents. IMHO you
    shouldn't worry, but then, there are a lot of things you seem to care
    about that I wouldn't... :)
    For my own interest, I want to understand the runtime behavior of Python
    and what details make it much slower. Although people choose Python
    for its programming efficiency, sometimes the runtime still
    matters. This is an important aspect of the language. I'm wondering
    why this is not even documented. Why does everybody have to go to the
    source code to understand it?

    Are you sure that there is no document that describes how Python
    works internally (including exceptions)?
  • Terry Reedy at Jan 2, 2010 at 9:48 pm

    On 1/2/2010 10:04 AM, Peng Yu wrote:

    For my own interest, I want to understand the runtime behavior of Python
    That depends on the implementation.
    and what details make it much slower.
    A language feature that slows all implementations is the dynamic
    name/slot binding and resolution. Any implementation can be made faster
    by restricting the dynamism (which makes the implementation implement a
    subset of Python).
    Although people choose Python
    for its programming efficiency, sometimes the runtime still
    matters.
    There is no 'the' runtime. Whether or not there even *is* a runtime, as
    usually understood, is a matter of the implementation.

    This is an important aspect of the language.
    It is an aspect of each implementation, of which there are now more than
    one.

    Terry Jan Reedy
  • Martin v. Loewis at Jan 2, 2010 at 3:17 pm

    For my own interest, I want to understand the runtime behavior of Python
    and what details make it much slower. Although people choose Python
    for its programming efficiency, sometimes the runtime still
    matters. This is an important aspect of the language. I'm wondering
    why this is not even documented. Why does everybody have to go to the
    source code to understand it?
    There are two answers to this question:

    a) Because the source is the most precise and most complete way of
    documenting it. Any higher-level documentation would necessarily be
    incomplete.
    b) Because nobody has contributed documentation.

    The two causes correlate: because writing documentation of VM
    internals takes a lot of effort and is of questionable use, nobody
    has written any.
    Are you sure that there is no document that describes how Python
    works internally (including exceptions)?
    Such documents certainly exist, but not as part of the Python
    distribution. See

    http://wiki.python.org/moin/CPythonVmInternals

    for one such document.

    Regards,
    Martin
  • Benjamin Kaplan at Jan 1, 2010 at 5:26 am

    On Thu, Dec 31, 2009 at 11:47 PM, Peng Yu wrote:
    I observe that Python libraries primarily use exceptions for error
    handling rather than error codes.

    In the article API Design Matters by Michi Henning

    Communications of the ACM
    Vol. 52 No. 5, Pages 46-56
    10.1145/1506409.1506424
    http://cacm.acm.org/magazines/2009/5/24646-api-design-matters/fulltext

    It says "Another popular design flaw--namely, throwing exceptions for
    expected outcomes--also causes inefficiencies because catching and
    handling exceptions is almost always slower than testing a return
    value."

    My observation contradicts the above statement by Henning. If
    my observation is wrong, please just ignore my question below.

    Otherwise, could some Python expert explain to me why exceptions are
    widely used for error handling in Python? Is it because efficiency
    is not the primary goal of Python?
    Read the quote again: "Another popular design flaw--namely, throwing
    exceptions *for expected outcomes*".
    In Python, throwing exceptions for expected outcomes is considered
    very bad form (well, except for StopIteration but that should almost
    never be handled directly by the programmer).

    To answer why people recommend using "Easier to Ask Forgiveness than
    Permission" as opposed to "Look Before You Leap": Because of the way
    it's implemented, Python works quite differently from most languages.
    An attribute look-up is rather expensive because it's a hash table
    look-up at run time. Wrapping a small piece of code in a try block,
    however, isn't (relatively) very expensive at all in Python. It's only
    catching the exception that's expensive, but if you're catching the
    exception, something has gone wrong anyway and performance isn't your
    biggest issue.
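
    A minimal sketch of the two styles contrasted above (the class and
    attribute names are invented for illustration):

    class Config(object):
        def __init__(self, name=None):
            if name is not None:
                self.name = name

    cfg = Config("example")

    # LBYL: the hasattr() test is itself an attribute lookup, paid on every call.
    if hasattr(cfg, "name"):
        name = cfg.name
    else:
        name = "unknown"

    # EAFP: the try block is nearly free; the cost only appears in the rare
    # case that the attribute really is missing.
    try:
        name = cfg.name
    except AttributeError:
        name = "unknown"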
  • Steven D'Aprano at Jan 1, 2010 at 8:26 am

    On Thu, 31 Dec 2009 20:47:49 -0800, Peng Yu wrote:

    I observe that Python libraries primarily use exceptions for error handling
    rather than error codes.

    In the article API Design Matters by Michi Henning

    Communications of the ACM
    Vol. 52 No. 5, Pages 46-56
    10.1145/1506409.1506424
    http://cacm.acm.org/magazines/2009/5/24646-api-design-matters/fulltext

    It says "Another popular design flaw--namely, throwing exceptions for
    expected outcomes--also causes inefficiencies because catching and
    handling exceptions is almost always slower than testing a return
    value."
    This is very, very wrong.

    Firstly, notice that the author doesn't compare the same thing. He
    compares "catching AND HANDLING" the exception (emphasis added) with
    *only* testing a return value. Of course it is faster to test a value and
    do nothing, than it is to catch an exception and then handle the
    exception. That's an unfair comparison, and that alone shows that the
    author is biased against exceptions.

    But it's also wrong. If you call a function one million times, and catch
    an exception ONCE (because exceptions are rare) that is likely to be
    much, much faster than testing a return code one million times.

    Before you can talk about which strategy is faster, you need to
    understand your problem. When exceptions are rare (in CPython, about one
    in ten or rarer) then try...except is faster than testing each time. The
    exact cut-off depends on how expensive the test is, and how much work
    gets done before the exception is raised. Using exceptions is only slow
    if they are common.

    But the most important reason for preferring exceptions is that the
    alternatives are error-prone! Testing error codes is the anti-pattern,
    not catching exceptions.

    See, for example:

    http://c2.com/cgi/wiki?UseExceptionsInsteadOfErrorValues
    http://c2.com/cgi/wiki?ExceptionsAreOurFriends
    http://c2.com/cgi/wiki?AvoidExceptionsWheneverPossible

    Despite the title of that last page, it has many excellent arguments for
    why exceptions are better than the alternatives.

    (Be warned: the c2 wiki is filled with Java and C++ programmers who
    mistake the work-arounds for quirks of their language as general design
    principles. For example, because exceptions in Java are even more
    expensive and slow than in Python, you will find lots of Java coders
    saying "don't use exceptions" instead of "don't use exceptions IN JAVA".)

    There are many problems with using error codes:

    * They complicate your code. Instead of returning the result you care
    about, you have to return a status code and the return result you care
    about. Even worse is to have a single global variable to hold the status
    of the last function call!

    * Nobody can agree whether the status code means the function call
    failed, or the function call succeeded.

    * If the function call failed, what do you return as the result code?

    * You can't be sure that the caller will remember to check the status
    code. In fact, you can be sure that the caller WILL forget sometimes!
    (This is human nature.) This leads to the frequent problem that by the
    time a caller checks the status code, the original error has been lost
    and the program is working with garbage.

    * Even if you remember to check the status code, it complicates the code,
    makes it less readable, confuses the intent of the code, and often leads
    to the Arrow Anti-pattern: http://c2.com/cgi/wiki?ArrowAntiPattern

    That last argument is critical. Exceptions exist to make correct
    code easier to write, understand and maintain.

    Python uses special result codes in at least two places:

    str.find(s) returns -1 if s is not in the string
    re.match() returns None if the regular expression fails to match

    Both of these are error-prone. Consider a naive way of getting the
    fractional part of a float string:
    >>> s = "234.567"
    >>> print s[s.find('.')+1:]
    567

    But see:
    >>> s = "234"
    >>> print s[s.find('.')+1:]
    234

    You need something like:

    p = s.find('.')
    if p == -1:
        print ''
    else:
        print s[p+1:]



    Similarly, we cannot safely do this in Python:

    >>> import re
    >>> re.match(r'\d+', '123abcd').group()
    '123'
    >>> re.match(r'\d+', 'abcd').group()
    Traceback (most recent call last):
      File "<stdin>", line 1, in <module>
    AttributeError: 'NoneType' object has no attribute 'group'

    You need to do this:

    mo = re.match(r'\d+', '123abcd')
    if mo is not None:  # or just `if mo:` will work
        mo.group()


    Exceptions are about making it easier to have correct code. They're also
    about making it easier to have readable code. Which is easier to read,
    easier to understand and easier to debug?

    x = function(1, 2, 3)
    if x != -1:
        y = function(x, 1, 2)
        if y != -1:
            z = function(y, x, 1)
            if z != -1:
                print "result is", z
            else:
                print "an error occurred"
        else:
            print "an error occurred"
    else:
        print "an error occurred"


    versus:


    try:
        x = function(1, 2, 3)
        y = function(x, 1, 2)
        print "result is", function(y, x, 1)
    except ValueError:
        print "an error occurred"



    In Python, setting up the try...except block is very fast, about as fast
    as a plain "pass" statement, but actually catching the exception is quite
    slow. So let's compare string.find (which returns an error result) and
    string.index (which raises an exception):

    >>> from timeit import Timer
    >>> setup = "source = 'abcd'*100 + 'e'"
    >>> min(Timer("p = source.index('e')", setup).repeat())
    1.1308379173278809
    >>> min(Timer("p = source.find('e')", setup).repeat())
    1.2237567901611328

    There's hardly any difference at all, and in fact index is slightly
    faster. But what about if there's an exceptional case?

    >>> min(Timer("""
    ... try:
    ...     p = source.index('z')
    ... except ValueError:
    ...     pass
    ... """, setup).repeat())
    3.5699808597564697
    >>> min(Timer("""
    ... p = source.find('z')
    ... if p == -1:
    ...     pass
    ... """, setup).repeat())
    1.7874350070953369


    So in Python, catching the exception is slower, in this case about twice
    as slow. But remember that the "if p == -1" test is not free. It might be
    cheap, but it does take time. If you call find() enough times, and every
    single time you then test the result returned, that extra cost may be
    more expensive than catching a rare exception.

    The general rule in Python is:

    * if the exceptional event is rare (say, on average, less than about one
    time in ten) then use a try...except and catch the exception;

    * but if it is very common (more than one time in ten) then it is faster
    to do a test.

    My observation contradicts the above statement by Henning. If my
    observation is wrong, please just ignore my question below.

    Otherwise, could some Python expert explain to me why exceptions are
    widely used for error handling in Python? Is it because efficiency
    is not the primary goal of Python?
    Yes.

    Python's aim is to be fast *enough*, without necessarily being as fast as
    possible.

    Python aims to be readable, and to be easy to write correct, bug-free
    code.


    --
    Steven
  • Aahz at Jan 1, 2010 at 8:43 am
    In article <mailman.300.1262323578.28905.python-list at python.org>,
    Benjamin Kaplan wrote:
    In Python, throwing exceptions for expected outcomes is considered
    very bad form [...]
    Who says that? I certainly don't.
    --
    Aahz (aahz at pythoncraft.com) <*> http://www.pythoncraft.com/

    Weinberg's Second Law: If builders built buildings the way programmers wrote
    programs, then the first woodpecker that came along would destroy civilization.
  • Jonathan Gardner at Jan 1, 2010 at 10:43 am

    On Jan 1, 12:43 am, a... at pythoncraft.com (Aahz) wrote:
    In article <mailman.300.1262323578.28905.python-l... at python.org>,
    Benjamin Kaplan wrote:
    In Python, throwing exceptions for expected outcomes is considered
    very bad form [...]
    Who says that? I certainly don't.
    Agreed.

    int("asdf") is supposed to return what, exactly? Any language that
    tries to return an int is horribly broken.
  • Steven D'Aprano at Jan 1, 2010 at 2:24 pm

    On Fri, 01 Jan 2010 02:43:21 -0800, Jonathan Gardner wrote:
    On Jan 1, 12:43 am, a... at pythoncraft.com (Aahz) wrote:
    In article <mailman.300.1262323578.28905.python-l... at python.org>,
    Benjamin Kaplan wrote:
    In Python, throwing exceptions for expected outcomes is considered
    very bad form [...]
    Who says that? I certainly don't.
    Agreed.

    int("asdf") is supposed to return what, exactly? Any language that tries
    to return an int is horribly broken.

    [sarcasm]
    No no, the right way to deal with that is have int("asdf") return some
    arbitrary bit pattern, and expect the user to check a global variable to
    see whether the function returned a valid result or not. That's much
    better than catching an exception!
    [/sarcasm]


    --
    Steven
  • Mel at Jan 1, 2010 at 4:06 pm

    Steven D'Aprano wrote:
    On Fri, 01 Jan 2010 02:43:21 -0800, Jonathan Gardner wrote:
    On Jan 1, 12:43 am, a... at pythoncraft.com (Aahz) wrote:
    In article <mailman.300.1262323578.28905.python-l... at python.org>,
    Benjamin Kaplan wrote:
    In Python, throwing exceptions for expected outcomes is considered
    very bad form [...]
    Who says that? I certainly don't.
    Agreed.

    int("asdf") is supposed to return what, exactly? Any language that tries
    to return an int is horribly broken.

    [sarcasm]
    No no, the right way to deal with that is have int("asdf") return some
    arbitrary bit pattern, and expect the user to check a global variable to
    see whether the function returned a valid result or not. That's much
    better than catching an exception!
    [/sarcasm]
    Or the other way around, as in C (I suspect the original ACM article assumed
    C.) Look at the legion of C library subroutines that return only 0 for good
    or -1 for bad, and do all their real work in side-effects (through pointers
    as function arguments.) Python is a big improvement: use the function
    return values for the payload, and push the out-of-band "omyghod" response
    into an Exception.

    Mel.
  • Martin v. Loewis at Jan 1, 2010 at 10:34 am

    Peng Yu wrote:
    I observe that Python libraries primarily use exceptions for error
    handling rather than error codes. [...]
    It says "Another popular design flaw--namely, throwing exceptions for
    expected outcomes--also causes inefficiencies because catching and
    handling exceptions is almost always slower than testing a return
    value."

    My observation contradicts the above statement by Henning. If
    my observation is wrong, please just ignore my question below.
    Your observation is not wrong, but, as Benjamin already explained,
    you are misinterpreting Michi Henning's statement. He doesn't condemn
    exception handling per se, but only for the handling of *expected*
    outcomes. He would consider using exceptions fine for *exceptional*
    outcomes, and that is exactly the way they are used in the Python API.

    Notice that in cases where the failure may be expected, Python
    also offers variants that avoid the exception:
    - if you look into a dictionary, expecting that a key may not
    be there, a regular access, d[k], may give a KeyError. However,
    alternatively, you can use d.get(k, default) which raises no
    exception, and you can test "k in d" in advance.
    - if you open a file, not knowing whether it will be there,
    you get an IOError. However, you can use os.path.exists in
    advance to determine whether the file is present (and create
    it if it's not).

    So, in these cases, it is a choice of the user to determine whether
    the error case is exceptional or not.
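
    A short sketch of the dictionary variants listed above, with an invented
    dictionary and key:

    d = {"spam": 1}

    # Exception-raising access: fine when a missing key really is exceptional.
    try:
        v = d["eggs"]
    except KeyError:
        v = 0

    # Non-raising alternatives for when the missing key is an expected outcome.
    v = d.get("eggs", 0)     # supply a default, no exception raised
    if "eggs" in d:          # or test in advance
        v = d["eggs"]
    else:
        v = 0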

    Regards,
    Martin
  • Andreas Waldenburger at Jan 1, 2010 at 2:01 pm

    On Fri, 01 Jan 2010 11:34:19 +0100 "Martin v. Loewis" wrote:

    Your observation is not wrong, but, as Benjamin already explained,
    you are misinterpreting Michi Henning's statement. He doesn't condemn
    exception handling per se, but only for the handling of *expected*
    outcomes. He would consider using exceptions fine for *exceptional*
    output, and that is exactly the way they are used in the Python API.
    May I point out at this point that "exceptional" does not mean
    "unexpected"? You catch exceptions, not unexpectations. An exception
    is rare, but not surprising. Case in point: StopIteration.

    To put it differently: When you write "catch DeadParrot", you certainly
    expect to get a DeadParrot once in a while -- why else would you get it
    in your head to try and catch it? An unexpected exception is the one
    that crashes your program.

    /W

    --
    INVALID? DE!
  • Aahz at Jan 2, 2010 at 7:02 am
    In article <4B3DCFAB.3030909 at v.loewis.de>,
    Martin v. Loewis wrote:
    Notice that in cases where the failure may be expected, Python
    also offers variants that avoid the exception:
    - if you look into a dictionary, expecting that a key may not
    be there, a regular access, d[k], may give a KeyError. However,
    alternatively, you can use d.get(k, default) which raises no
    exception, and you can test "k in d" in advance.
    - if you open a file, not knowing whether it will be there,
    you get an IOError. However, you can use os.path.exists in
    advance to determine whether the file is present (and create
    it if it's not).
    But you *still* need to catch IOError: someone might delete the file
    after the test. Figuring out how to deal with race conditions is one of
    the main reasons Alex Martelli advocates EAFP over LBYL.

    Of course, in the real world, you often end up wanting to do it both
    ways.
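
    A sketch of that point, with an invented file name: even after the LBYL
    test, the open() still has to be guarded.

    import os

    path = "settings.cfg"  # placeholder name

    if os.path.exists(path):          # LBYL test...
        try:
            with open(path) as f:     # ...but the file may already be gone here,
                cfg = f.read()
        except IOError:               # so the exception must be handled anyway
            cfg = ""
    else:
        cfg = ""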
    --
    Aahz (aahz at pythoncraft.com) <*> http://www.pythoncraft.com/

    Weinberg's Second Law: If builders built buildings the way programmers wrote
    programs, then the first woodpecker that came along would destroy civilization.
  • Steven D'Aprano at Jan 1, 2010 at 2:49 pm

    On Fri, 01 Jan 2010 00:26:09 -0500, Benjamin Kaplan wrote:
    On Thu, Dec 31, 2009 at 11:47 PM, Peng Yu wrote:
    I observe that Python libraries primarily use exceptions for error
    handling rather than error codes.

    In the article API Design Matters by Michi Henning

    Communications of the ACM
    Vol. 52 No. 5, Pages 46-56
    10.1145/1506409.1506424
    http://cacm.acm.org/magazines/2009/5/24646-api-design-matters/fulltext

    It says "Another popular design flaw--namely, throwing exceptions for
    expected outcomes--also causes inefficiencies because catching and
    handling exceptions is almost always slower than testing a return
    value."

    My observation contradicts the above statement by Henning. If my
    observation is wrong, please just ignore my question below.

    Otherwise, could some Python expert explain to me why exceptions are
    widely used for error handling in Python? Is it because efficiency
    is not the primary goal of Python?
    Read the quote again: "Another popular design flaw--namely, throwing
    exceptions *for expected outcomes*".
    In Python, throwing exceptions for expected outcomes is considered very
    bad form (well, except for StopIteration but that should almost never be
    handled directly by the programmer).

    Exceptions are *exceptional*, not "errors" or "unexpected". They are
    exceptional because they aren't the "normal" case, but that doesn't mean
    they are surprising or unexpected. Are you surprised that your "for x in
    range(1000)" loop comes to an end? Of course you are not -- it is
    completely expected, even though less than 1% of the iterations are the
    last loop. The end of the sequence is EXCEPTIONAL but not UNEXPECTED.

    If you program without expecting that keys can sometimes be missing from
    dictionaries (KeyError), or that you might sometimes have to deal with a
    list index that is out of range (IndexError), or that the denominator in
    a division might be zero (ZeroDivisionError), then you must be writing
    really buggy code. None of these things are unexpected, but they are all
    exceptional.

    The urllib2 module defines an HTTPError class, which does double-duty as
    both an exception and a valid HTTP response. If you're doing any HTTP
    programming, you better expect to deal with HTTP 301, 302 etc. codes, or
    at least trust that the library you use will transparently handle them
    for you.

    To answer why people recommend using "Easier to Ask Forgiveness than
    Permission" as opposed to "Look Before You Leap": Because of the way
    it's implemented, Python works quite differently from most languages. An
    attribute look-up is rather expensive because it's a hash table look-up
    at run time. Wrapping a small piece of code in a try block, however,
    isn't (relatively) very expensive at all in Python.
    It's not just relatively inexpensive, it's absolutely inexpensive: it
    costs about as much as a pass statement in CPython, which is pretty much
    as cheap as it gets. (If anyone can demonstrate a cheaper operation
    available from pure Python, I'd love to see it.)

    It's only catching the exception that's expensive,
    True.

    but if you're catching the exception,
    something has gone wrong anyway and performance isn't your biggest
    issue.

    The second try...except clause in the urllib2 module reads:

    try:
        kind = int(kind)
    except ValueError:
        pass

    In this case, the error is simply ignored. Nothing has gone wrong.


    Here's an example from my own code: I have an API where you pass a
    mapping (either a dict or a list of (key, value) tuples) to a function.
    So I do this:

    try:
        it = obj.iteritems()
    except AttributeError:
        it = obj
    for key, value in it:
        do_stuff()



    There's nothing wrong with catching exceptions.


    --
    Steven
  • Benjamin Kaplan at Jan 1, 2010 at 4:02 pm

    On Fri, Jan 1, 2010 at 9:49 AM, Steven D'Aprano wrote:

    Exceptions are *exceptional*, not "errors" or "unexpected". They are
    exceptional because they aren't the "normal" case, but that doesn't mean
    they are surprising or unexpected. Are you surprised that your "for x in
    range(1000)" loop comes to an end? Of course you are not -- it is
    completely expected, even though less than 1% of the iterations are the
    last loop. The end of the sequence is EXCEPTIONAL but not UNEXPECTED.
    Sorry if my word choice was confusing -- I was trying to point out that
    in Python, you don't test errors for your typical conditions, but for
    ones that you know still exist but don't plan on occurring often.
  • Steven D'Aprano at Jan 1, 2010 at 4:26 pm

    On Fri, 01 Jan 2010 11:02:28 -0500, Benjamin Kaplan wrote:

    I was trying to point out that in
    Python, you don't test errors for your typical conditions, but for ones
    that you know still exist but don't plan on occurring often.
    I'm sorry, but that makes no sense to me at all. I don't understand what
    you are trying to say.


    You do understand that exceptions aren't just for errors? They are raised
    under specific circumstances. Whether that circumstance is an error or
    not is entirely up to the caller.


    try:
        n = mylist.index('fault')
    except ValueError:
        print "All is good, no fault detected"
    else:
        print "Your data contains a fault in position", n



    People get hung up on the idea that exceptions == errors, but they
    aren't. They *may* be errors, and that is one common use, but that
    depends entirely on the caller.


    --
    Steven
  • Martin v. Loewis at Jan 1, 2010 at 5:18 pm

    You do understand that exceptions aren't just for errors? They are raised
    under specific circumstances. Whether that circumstance is an error or
    not is entirely up to the caller.
    I think that's a fairly narrow definition of the word error, and
    probably also the source of confusion in this thread.

    ISTM that there is a long tradition of giving different meaning to
    the word "error" in computing. For example, the Unix man pages
    list various conditions as "errors" purely by their outcome, and
    completely ignoring whether the caller would consider the result
    erroneous - ISTM that a system call reports an "error" iff it is
    "unsuccessful".

    By that (common) usage of "error", it is a condition determined by
    the callee, not the caller (i.e. callee could not successfully
    complete the operation). In that sense, it is indeed equivalent
    to Python's usage of exceptions, which are also determined by the
    callee, and typically also in cases where successful completion is
    not possible. Whether these cases are "exceptional" in the word's
    sense (i.e. deviating from the norm) would have to be decided by
    the application, again (which would set the norm).

    Regards,
    Martin
  • Lie Ryan at Jan 1, 2010 at 3:24 pm

    On 1/1/2010 3:47 PM, Peng Yu wrote:
    I observe that Python libraries primarily use exceptions for error
    handling rather than error codes.

    In the article API Design Matters by Michi Henning

    Communications of the ACM
    Vol. 52 No. 5, Pages 46-56
    10.1145/1506409.1506424
    http://cacm.acm.org/magazines/2009/5/24646-api-design-matters/fulltext

    It says "Another popular design flaw--namely, throwing exceptions for
    expected outcomes--also causes inefficiencies because catching and
    handling exceptions is almost always slower than testing a return
    value."

    My observation contradicts the above statement by Henning. If
    my observation is wrong, please just ignore my question below.

    Otherwise, could some Python expert explain to me why exceptions are
    widely used for error handling in Python?
    Simple, when an exception is thrown and I don't catch it, the exception
    terminates the program immediately and I get a traceback showing the
    point of failure. When I return an error value and I don't check for it,
    I pass errors silently and get a traceback forty-two lines later
    when trying to use the resources I failed to acquire forty-two lines prior.
    Is it because efficiency
    is not the primary goal of Python?
    Efficiency is not the primary goal of Python, but since Python encourages
    EAFP (Easier to Ask Forgiveness than Permission), the design decisions
    chosen make setting up a try block much cheaper than in a language
    designed around LBYL (Look Before You Leap) and return codes.
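
    A toy sketch of that difference, with invented function names:

    def acquire_by_code(name):
        return None                    # error-code style: failure is just None

    def acquire_by_exception(name):
        raise IOError("cannot acquire %r" % name)   # failure cannot be ignored

    handle = acquire_by_code("db")
    # ... forty-two lines later ...
    # handle.read()          # blows up here with a confusing AttributeError,
                             # far from where things actually went wrong

    # acquire_by_exception("db")   # uncommented, this stops immediately with
                                   # a traceback pointing at the real failure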
  • Michi at Jan 3, 2010 at 9:44 pm

    On Jan 1, 2:47 pm, Peng Yu wrote:
    In the article API Design Matters by Michi Henning

    Communications of the ACM
    Vol. 52 No. 5, Pages 46-56
    10.1145/1506409.1506424
    http://cacm.acm.org/magazines/2009/5/24646-api-design-matters/fulltext

    It says "Another popular design flaw--namely, throwing exceptions for
    expected outcomes--also causes inefficiencies because catching and
    handling exceptions is almost always slower than testing a return
    value."

    My observation contradicts the above statement by Henning. If
    my observation is wrong, please just ignore my question below.
    Seeing that quite a few people have put their own interpretation on
    what I wrote, I figured I'll post a clarification.

    The quoted sentence appears in a section of the article that deals
    with efficiency. I point out in that section that bad APIs often have
    a price not just in terms of usability and defect rate, but that they
    are often inefficient as well. (For example, wrapper APIs often
    require additional memory allocations and/or data copies.) Incorrect
    use of exceptions also incurs an efficiency penalty.

    In many language implementations, exception handling is expensive;
    significantly more expensive than testing a return value. Consider the
    following:

    int x;
    try {
        x = func();
    } catch (SomeException) {
        doSomething();
        return;
    }
    doSomethingElse();

    Here is the alternative without exceptions. (func() returns
    SpecialValue instead of throwing.)

    int x;
    x = func();
    if (x == SpecialValue) {
        doSomething();
        return;
    }
    doSomethingElse();

    In many language implementations, the second version is considerably
    faster, especially when the exception may be thrown from deep in the
    bowels of func(), possibly many frames down the call tree.

    If func() throws an exception for something that routinely occurs in
    the normal use of the API, the extra cost can be noticeable. Note that
    I am not advocating not to use exceptions. I *am* advocating to not
    throw exceptions for conditions that are not exceptional.

    The classic example of this is lookup functions that, for example,
    retrieve the value of an environment variable, do a table lookup, or
    similar. Many such APIs throw an exception when the lookup fails
    because the key isn't in the table. However, very often, looking for
    something that isn't there is a common case, such as when looking for
    a value and, if the value isn't present already, adding it. Here is an
    example of this:

    KeyType k = ...;
    ValueType v;

    try {
        v = collection.lookup(k);
    } catch (NotFoundException) {
        collection.add(k, defaultValue);
        v = defaultValue;
    }
    doSomethingWithValue(v);

    The same code if collection doesn't throw when I look up something
    that isn't there:

    KeyType k = ...;
    ValueType v;

    v = collection.lookup(k);
    if (v == null) {
        collection.add(k, defaultValue);
        v = defaultValue;
    }
    doSomethingWithValue(v);

    The problem is that, if I do something like this in a loop, and the
    loop is performance-critical, the exception version can cause a
    significant penalty.

    As the API designer, when I make the choice between returning a
    special value to indicate some condition, or throwing an exception, I
    should consider the following questions:

    * Is the special condition such that, under most conceivable
    circumstances, the caller will treat the condition as an unexpected
    error?

    * Is it appropriate to force the caller to deal with the condition in
    a catch-handler?

    * If the caller fails to explicitly deal with the condition, is it
    appropriate to terminate the program?

    Only if the answer to these questions is "yes" is it appropriate to
    throw an exception. Note the third question, which is often forgotten.
    By throwing an exception, I not only force the caller to handle the
    exception with a catch-handler (as opposed to leaving the choice to
    the caller), I also force the caller to *always* handle the exception:
    if the caller wants to ignore the condition, he/she still has to write
    a catch-handler and failure to do so terminates the program.

    Apart from the potential performance penalty, throwing exceptions for
    expected outcomes is bad also because it forces a try-catch block on
    the caller. One example of this is the .NET socket API: if I do non-
    blocking I/O on a socket, I get an exception if no data is ready for
    reading (which is the common and expected case), and I get a zero
    return value if the connection was lost (which is the uncommon and
    unexpected case).

    In other words, the .NET API gets this completely the wrong way round.
    Code that needs to do non-blocking reads from a socket turns into a
    proper mess as a result because the outcome of a read() call is tri-
    state:

    * Data was available and returned: no exception

    * No data available: exception

    * Connection lost: no exception

    Because such code normally lives in a loop that decrements a byte
    count until the expected number of bytes have been read, the control
    flow becomes really awkward because the successful case must be dealt
    with in both the try block and the catch handler, and the error
    condition must be dealt with in the try block as well.

    If the API did what it should, namely, throw an exception when the
    connection is lost, and not throw when I do a read (whether data was
    ready or not), the code would be far simpler and far more
    maintainable.

    At no point did I ever advocate not to use exception handling.
    Exceptions are the correct mechanism to handle errors. However, what
    is considered an error is very much in the eye of the beholder. As the
    API creator, if I indicate errors with exceptions, I make a policy
    decision about what is an error and what is not. It behooves me to be
    conservative in that policy: I should throw exceptions only for
    conditions that are unlikely to arise during routine and normal use of
    the API.

    Cheers,

    Michi.
  • MRAB at Jan 3, 2010 at 11:28 pm

    Michi wrote:
    On Jan 1, 2:47 pm, Peng Yu wrote:
    In the article API Design Matters by Michi Henning

    Communications of the ACM
    Vol. 52 No. 5, Pages 46-56
    10.1145/1506409.1506424
    http://cacm.acm.org/magazines/2009/5/24646-api-design-matters/fulltext

    It says "Another popular design flaw--namely, throwing exceptions for
    expected outcomes--also causes inefficiencies because catching and
    handling exceptions is almost always slower than testing a return
    value."

    My observation contradicts the above statement by Henning. If
    my observation is wrong, please just ignore my question below.
    Seeing that quite a few people have put their own interpretation on
    what I wrote, I figured I'll post a clarification.

    The quoted sentence appears in a section of the article that deals
    with efficiency. I point out in that section that bad APIs often have
    a price not just in terms of usability and defect rate, but that they
    are often inefficient as well. (For example, wrapper APIs often
    require additional memory allocations and/or data copies.) Incorrect
    use of exceptions also incurs an efficiency penalty.

    In many language implementations, exception handling is expensive;
    significantly more expensive than testing a return value. Consider the
    following:

    int x;
    try {
        x = func();
    } catch (SomeException) {
        doSomething();
        return;
    }
    doSomethingElse();

    Here is the alternative without exceptions. (func() returns
    SpecialValue instead of throwing.)

    int x;
    x = func();
    if (x == SpecialValue) {
        doSomething();
        return;
    }
    doSomethingElse();

    In many language implementations, the second version is considerably
    faster, especially when the exception may be thrown from deep in the
    bowels of func(), possibly many frames down the call tree.

    If func() throws an exception for something that routinely occurs in
    the normal use of the API, the extra cost can be noticeable. Note that
    I am not advocating not to use exceptions. I *am* advocating to not
    throw exceptions for conditions that are not exceptional.

    The classic example of this is lookup functions that, for example,
    retrieve the value of an environment variable, do a table lookup, or
    similar. Many such APIs throw an exception when the lookup fails
    because the key isn't in the table. However, very often, looking for
    something that isn't there is a common case, such as when looking for
    a value and, if the value isn't present already, adding it. Here is an
    example of this:

    KeyType k = ...;
    ValueType v;

    try {
        v = collection.lookup(k);
    } catch (NotFoundException) {
        collection.add(k, defaultValue);
        v = defaultValue;
    }
    doSomethingWithValue(v);

    The same code if collection doesn't throw when I look up something
    that isn't there:

    KeyType k = ...;
    ValueType v;

    v = collection.lookup(k);
    if (v == null) {
        collection.add(k, defaultValue);
        v = defaultValue;
    }
    doSomethingWithValue(v);

    The problem is that, if I do something like this in a loop, and the
    loop is performance-critical, the exception version can cause a
    significant penalty.
    In Python, of course, there's a method for this: setdefault.
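
    For instance, a minimal sketch with an invented key and default:

    collection = {}
    default_value = 0

    # setdefault does the lookup-or-insert in one step: no exception to catch
    # and no separate membership test.
    v = collection.setdefault("key", default_value)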
    As the API designer, when I make the choice between returning a
    special value to indicate some condition, or throwing an exception, I
    should consider the following questions:

    * Is the special condition such that, under most conceivable
    circumstances, the caller will treat the condition as an unexpected
    error?

    * Is it appropriate to force the caller to deal with the condition in
    a catch-handler?

    * If the caller fails to explicitly deal with the condition, is it
    appropriate to terminate the program?

    Only if the answer to these questions is "yes" is it appropriate to
    throw an exception. Note the third question, which is often forgotten.
    By throwing an exception, I not only force the caller to handle the
    exception with a catch-handler (as opposed to leaving the choice to
    the caller), I also force the caller to *always* handle the exception:
    if the caller wants to ignore the condition, he/she still has to write
    a catch-handler and failure to do so terminates the program.

    Apart from the potential performance penalty, throwing exceptions for
    expected outcomes is bad also because it forces a try-catch block on
    the caller. One example of this is the .NET socket API: if I do non-
    blocking I/O on a socket, I get an exception if no data is ready for
    reading (which is the common and expected case), and I get a zero
    return value if the connection was lost (which is the uncommon and
    unexpected case).

    In other words, the .NET API gets this completely the wrong way round.
    Code that needs to do non-blocking reads from a socket turns into a
    proper mess as a result because the outcome of a read() call is tri-
    state:

    * Data was available and returned: no exception

    * No data available: exception

    * Connection lost: no exception

    Because such code normally lives in a loop that decrements a byte
    count until the expected number of bytes have been read, the control
    flow becomes really awkward because the successful case must be dealt
    with in both the try block and the catch handler, and the error
    condition must be dealt with in the try block as well.

    If the API did what it should, namely, throw an exception when the
    connection is lost, and not throw when I do a read (whether data was
    ready or not), the code would be far simpler and far more
    maintainable.

    At no point did I ever advocate not to use exception handling.
    Exceptions are the correct mechanism to handle errors. However, what
    is considered an error is very much in the eye of the beholder. As the
    API creator, if I indicate errors with exceptions, I make a policy
    decision about what is an error and what is not. It behooves me to be
    conservative in that policy: I should throw exceptions only for
    conditions that are unlikely to arise during routine and normal use of
    the API.
    In another area, string slicing in C# uses the Substring method, where
    you provide the start position and number of characters. If the start
    index is out of bounds (it must be >= 0 and < length) or the string is
    too short, then it throws an exception. In practice I find Python's
    behaviour easier to use (and the code is shorter too!).

    C# also misses Python's trick (in Python 2.6 and above) of giving string
    instances a format method, instead making it a static method of the string
    class, so you need to write:

    string.Format(format_string, ...)

    instead of Python's:

    format_string.format(...)

    On the other hand, C#'s equivalent of raw strings treats backslashes
    always as normal characters. I think it's the only feature of C#'s
    string handling that I prefer to Python's.
  • Steven D'Aprano at Jan 4, 2010 at 3:30 am

    On Sun, 03 Jan 2010 13:44:29 -0800, Michi wrote:


    The quoted sentence appears in a section of the article that deals with
    efficiency. I point out in that section that bad APIs often have a price
    not just in terms of usability and defect rate, but that they are often
    inefficient as well.
    This is very true, but good APIs often trade off increased usability and
    reduced defect rate against machine efficiency too. In fact, I would
    argue that this is a general design principle of programming languages:
    since correctness and programmer productivity are almost always more
    important than machine efficiency, the long-term trend across virtually
    all languages is to increase correctness and productivity even if doing
    so costs some extra CPU cycles.


    (For example, wrapper APIs often require additional
    memory allocations and/or data copies.) Incorrect use of exceptions also
    incurs an efficiency penalty.
    And? *Correct* use of exceptions also incur a penalty. So does the use of
    functions. Does this imply that putting code in functions is a poor API?
    Certainly not.

    In many language implementations, exception handling is expensive;
    significantly more expensive than testing a return value.
    And in some it is less expensive.

    But no matter how much more expensive, there will always be a cut-off
    point where it is cheaper on average to suffer the cost of handling an
    exception than it is to make unnecessary tests.

    In Python, for dictionary key access, that cut-off is approximately at
    one failure per ten or twenty attempts. So unless you expect more than
    one in ten attempts to lead to a failure, testing first is actually a
    pessimization, not an optimization.



    Consider the following:

    int x;
    try {
        x = func();
    } catch (SomeException) {
        doSomething();
        return;
    }
    doSomethingElse();

    Here is the alternative without exceptions. (func() returns SpecialValue
    instead of throwing.)

    int x;
    x = func();
    if (x == SpecialValue) {
        doSomething();
        return;
    }
    doSomethingElse();

    In some, limited, cases you might be able to use the magic return value
    strategy, but this invariably leads to lost programmer productivity, more
    complex code, lowered readability and usability, and more defects,
    because programmers will invariably neglect to test for the special value:

    int x;
    x = func();
    doSomething(x);
    return;

    Or worse, they will write doSomething() so that it too needs to know
    about SpecialValue, and so do all the functions it calls. Instead of
    dealing with the failure in one place, you can end up having to deal with
    it in a dozen places.


    But even worse is the common case that SpecialValue is a legal value when
    passed to doSomething, and you end up with the error propagating deep
    into the application before being found. Or even worse, it is never found
    at all, and the application simply does the wrong thing.



    In many language implementations, the second version is considerably
    faster, especially when the exception may be thrown from deep in the
    bowels of func(), possibly many frames down the call tree.
    This is a classic example of premature optimization. Unless such
    inefficiency can be demonstrated to actually matter, then you do nobody
    any favours by preferring the API that leads to more defects on the basis
    of *assumed* efficiency.

    If your test for a special value is 100 times faster than handling the
    exception, and exceptions occur only one time in 1000, then using a
    strategy of testing for a special value is actually ten times slower on
    average than catching an exception.
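
    The back-of-the-envelope arithmetic, with made-up unit costs (the try
    set-up itself is assumed to be essentially free, as it is in CPython):

    test_cost = 1.0          # paid on every single call
    handle_cost = 100.0      # paid only when the exception actually fires
    failure_rate = 1.0 / 1000

    average_testing = test_cost
    average_exceptions = failure_rate * handle_cost

    print average_testing / average_exceptions   # 10.0: testing is 10x slower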


    If func() throws an exception for something that routinely occurs in the
    normal use of the API, the extra cost can be noticeable.
    "Can be". But it also might not be noticeable at all.


    [...]
    Here is an example of this:

    KeyType k = ...;
    ValueType v;

    try {
        v = collection.lookup(k);
    } catch (NotFoundException) {
        collection.add(k, defaultValue);
        v = defaultValue;
    }
    doSomethingWithValue(v);

    The same code if collection doesn't throw when I look up something that
    isn't there:

    KeyType k = ...;
    ValueType v;

    v = collection.lookup(k);
    if (v == null) {
        collection.add(k, defaultValue);
        v = defaultValue;
    }
    doSomethingWithValue(v);

    The problem is that, if I do something like this in a loop, and the loop
    is performance-critical, the exception version can cause a significant
    penalty.

    No, the real problems are:

    (1) The caller has to remember to check the return result for the magic
    value. Failure to do so leads to bugs, in some cases, serious and hard-to-
    find bugs.

    (2) If missing keys are rare enough, the cost of all those unnecessary
    tests will out-weigh the saving of avoiding catching the exception. "Rare
    enough" may still be very common: in the case of Python, the cross-over
    point is approximately 1 time in 15.

    (3) Your collection now cannot use the magic value as a legitimate value.

    This last one can be *very* problematic. In the early 1990s, I was
    programming using a callback API that could only return an integer. The
    standard way of indicating an error was to return -1. But what happens if
    -1 is a legitimate return value, e.g. for a maths function? The solution
    used was to have the function create a global variable holding a flag:

    result = function(args)
    if result == -1:
        if globalErrorState == -1:
            print "An error occurred"
            exit
    doSomething(result)


    That is simply horrible.


    As the API designer, when I make the choice between returning a special
    value to indicate some condition, or throwing an exception, I should
    consider the following questions:

    * Is the special condition such that, under most conceivable
    circumstances, the caller will treat the condition as an unexpected
    error?
    Wrong.

    It doesn't matter whether it is an error or not. They are called
    EXCEPTIONS, not ERRORS. What matters is that it is an exceptional case.
    Whether that exceptional case is an error condition or not is dependent
    on the application.


    * Is it appropriate to force the caller to deal with the condition in
    a catch-handler?

    * If the caller fails to explicitly deal with the condition, is it
    appropriate to terminate the program?

    Only if the answer to these questions is "yes" is it appropriate to
    throw an exception. Note the third question, which is often forgotten.
    By throwing an exception, I not only force the caller to handle the
    exception with a catch-handler (as opposed to leaving the choice to the
    caller), I also force the caller to *always* handle the exception: if
    the caller wants to ignore the condition, he/she still has to write a
    catch-handler and failure to do so terminates the program.
    That's a feature of exceptions, not a problem.


    Apart from the potential performance penalty, throwing exceptions for
    expected outcomes is bad also because it forces a try-catch block on the
    caller.
    But it's okay to force an `if (result==MagicValue)` test instead?

    Look, the caller has to deal with exceptional cases (which may include
    error conditions) one way or the other. If you don't deal with them at
    all, your code will core dump, or behave incorrectly, or something. If
    the caller fails to deal with the exceptional case, it is better to cause
    an exception that terminates the application immediately than it is to
    allow the application to generate incorrect results.


    One example of this is the .NET socket API: if I do non-
    blocking I/O on a socket, I get an exception if no data is ready for
    reading (which is the common and expected case), and I get a zero return
    value if the connection was lost (which is the uncommon and unexpected
    case).

    In other words, the .NET API gets this completely the wrong way round.
    Well we can agree on that!

    If the API did what it should, namely, throw an exception when the
    connection is lost, and not throw when I do a read (whether data was
    ready or not), the code would be far simpler and far more maintainable.

    At no point did I ever advocate not to use exception handling.
    Exceptions are the correct mechanism to handle errors. However, what is
    considered an error is very much in the eye of the beholder. As the API
    creator, if I indicate errors with exceptions, I make a policy decision
    about what is an error and what is not. It behooves me to be
    conservative in that policy: I should throw exceptions only for
    conditions that are unlikely to arise during routine and normal use of
    the API.
    But lost connections *are* routine and normal. Hopefully they are rare.



    --
    Steven
  • Roy Smith at Jan 4, 2010 at 3:36 am
    In article <pan.2010.01.04.03.30.41 at REMOVE.THIS.cybersource.com.au>,
    Steven D'Aprano wrote:
    This last one can be *very* problematic. In the early 1990s, I was
    programming using a callback API that could only return an integer. The
    standard way of indicating an error was to return -1. But what happens if
    -1 is a legitimate return value, e.g. for a maths function?
    One of the truly nice features of Python is the universally distinguished
    value, None.
  • Steven D'Aprano at Jan 4, 2010 at 6:55 am

    On Sun, 03 Jan 2010 22:36:44 -0500, Roy Smith wrote:

    In article <pan.2010.01.04.03.30.41 at REMOVE.THIS.cybersource.com.au>,
    Steven D'Aprano wrote:
    This last one can be *very* problematic. In the early 1990s, I was
    programming using a callback API that could only return an integer. The
    standard way of indicating an error was to return -1. But what happens
    if -1 is a legitimate return value, e.g. for a maths function?
    One of the truly nice features of Python is the universally
    distinguished value, None.

    What happens if you need to return None as a legitimate value?


    Here's a good example: iterating over a list. Python generates an
    exception when you hit the end of the list. If instead, Python returned
    None when the index is out of bounds, you couldn't store None in a list
    without breaking code.

    So we produce a special sentinel object EndOfSequence. Now we can't do
    this:

    for obj in ["", 12, None, EndOfSequence, [], {}]:
        print dir(obj) # or some other useful operation

    The fundamental flaw of using magic values is that, there will always be
    some application where you want to use the magic value as a non-special
    value, and then you're screwed.

    This is why, for instance, it's difficult for C strings to contain a null
    byte, and there are problems with text files on DOS and CP/M (and Windows
    under some circumstances) that contain a ^Z byte.
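
    Python's own iteration protocol avoids the trap by signalling exhaustion
    out-of-band with StopIteration rather than with an in-band sentinel, so
    None stays a perfectly ordinary element (a minimal Python 2 sketch):

    items = ["", 12, None, [], {}]
    it = iter(items)
    try:
        while True:
            obj = it.next()          # Python 2 iterator protocol
            print dir(obj)           # or some other useful operation
    except StopIteration:
        pass                         # the real end of the list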



    --
    Steven
  • Michi at Jan 4, 2010 at 9:34 pm

    On Jan 4, 1:30 pm, Steven D'Aprano wrote:

    This is very true, but good APIs often trade-off increased usability and
    reduced defect rate against machine efficiency too. In fact, I would
    argue that this is a general design principle of programming languages:
    since correctness and programmer productivity are almost always more
    important than machine efficiency, the long-term trend across virtually
    all languages is to increase correctness and productivity even if doing
    so costs some extra CPU cycles.
    Yes, I agree with that in general. Correctness and productivity are
    more important, as a rule, and should be given priority.
    (For example, wrapper APIs often require additional
    memory allocations and/or data copies.) Incorrect use of exceptions also
    incurs an efficiency penalty.
    And? *Correct* use of exceptions also incur a penalty. So does the use of
    functions. Does this imply that putting code in functions is a poor API?
    Certainly not.
    It does imply that incorrect use of exceptions incurs an unnecessary
    performance penalty, no more, no less, just as incorrect use of
    wrappers incurs an unnecessary performance penalty.
    But no matter how much more expensive, there will always be a cut-off
    point where it is cheaper on average to suffer the cost of handling an
    exception than it is to make unnecessary tests.

    In Python, for dictionary key access, that cut-off is approximately at
    one failure per ten or twenty attempts. So unless you expect more than
    one in ten attempts to lead to a failure, testing first is actually a
    pessimation, not an optimization.
    What this really comes down to is how frequently or infrequently a
    particular condition arises before that condition should be considered
    an exceptional condition rather than a normal one. It also relates to
    how the set of conditions partitions into "normal" conditions and
    "abnormal" conditions. The difficulty for the API designer is to make
    these choices correctly.
    In some, limited, cases you might be able to use the magic return value
    strategy, but this invariably leads to lost programmer productivity, more
    complex code, lowered readability and usability, and more defects,
    because programmers will invariably neglect to test for the special value:
    I disagree here, to the extent that, whether something is an error or
    not can very much depend on the circumstances in which the API is
    used. The collection case is a very typical example. Whether failing
    to locate a value in a collection is an error very much depends on
    what the collection is used for. In some cases, it's a hard error
    (because it might, for example, imply that internal program state has
    been corrupted); in other cases, not finding a value is perfectly
    normal.

    For the API designer, the problem is that an API that throws an
    exception when it should not sucks just as much as an API that doesn't
    throw an exception when it should. For general-purpose APIs, such as a
    collection API, as the designer, I usually cannot know. As I said
    elsewhere in the article, general-purpose APIs should be policy-free,
    and special-purpose APIs should be policy-rich. As the designer, the
    more I know about the circumstances in which the API will be used, the
    more fascist I can be in the design and bolt down the API more in
    terms of static and run-time safety.

    Wanting to ignore a return value from a function is perfectly normal
    and legitimate in many cases. However, if a function throws instead of
    returning a value, ignoring that value becomes more difficult for the
    caller and can exact a performance penalty that may be unacceptable
    to the caller. The problem really is that, at the time the API is
    designed, there often is no way to tell whether this will actually be
    the case; in turn, no matter whether I choose to throw an exception or
    return an error code, it will be wrong for some people some of the
    time.
    This is a classic example of premature optimization. Unless such
    inefficiency can be demonstrated to actually matter, then you do nobody
    any favours by preferring the API that leads to more defects on the basis
    of *assumed* efficiency.
    I agree with the concern about premature optimisation. However, I
    don't agree with a blanket statement that special return values always
    and unconditionally lead to more defects. Returning to the .NET non-
    blocking I/O example, the fact that the API throws an exception when
    it shouldn't very much complicates the code and introduces a lot of
    extra control logic that is much more likely to be wrong than a simple
    if-then-else statement. As I said, throwing an exception when none
    should be thrown can be just as harmful as the opposite case.
    It doesn't matter whether it is an error or not. They are called
    EXCEPTIONS, not ERRORS. What matters is that it is an exceptional case.
    Whether that exceptional case is an error condition or not is dependent
    on the application.
    Exactly. To me, that implies that making something an exception that,
    to the caller, shouldn't be is just as inconvenient as the other way
    around.
    * Is it appropriate to force the caller to deal with the condition in
    a catch-handler?
    * If the caller fails to explicitly deal with the condition, is it
    appropriate to terminate the program?
    Only if the answer to these questions is "yes" is it appropriate to
    throw an exception. Note the third question, which is often forgotten.
    By throwing an exception, I not only force the caller to handle the
    exception with a catch-handler (as opposed to leaving the choice to the
    caller), I also force the caller to *always* handle the exception: if
    the caller wants to ignore the condition, he/she still has to write a
    catch-handler and failure to do so terminates the program.
    That's a feature of exceptions, not a problem.
    Yes, and I didn't say that it is a problem. However, making the wrong
    choice for the use of the feature is a problem, just as making the
    wrong choice for not using the feature is.
    Apart from the potential performance penalty, throwing exceptions for
    expected outcomes is bad also because it forces a try-catch block on the
    caller.
    But it's okay to force a `if (result==MagicValue)` test instead?
    Yes, in some cases it is. For example:

    int numBytes;
    int fd = open(...);
    while ((numBytes = read(fd, ...)) > 0)
    {
        // process data...
    }

    Would you prefer to see EOF indicated by an exception rather than a
    zero return value? I wouldn't.
    Look, the caller has to deal with exceptional cases (which may include
    error conditions) one way or the other. If you don't deal with them at
    all, your code will core dump, or behave incorrectly, or something. If
    the caller fails to deal with the exceptional case, it is better to cause
    an exception that terminates the application immediately than it is to
    allow the application to generate incorrect results.
    I agree that failing to deal with exceptional cases causes problems. I
    also agree that exceptions, in general, are better than error codes
    because they are less likely to go unnoticed. But, as I said, it
    really depends on the caller whether something should be an exception
    or not.

    The core problem isn't whether exceptions are good or bad in a
    particular case, but that most APIs make this an either-or choice. For
    example, if I had an API that allowed me to choose at run time whether
    an exception will be thrown for a particular condition, I could adapt
    that API to my needs, instead of being stuck with whatever the
    designer came up with.

    There are many ways this could be done. For example, I could have a
    find() operation on a collection that throws if a value isn't found,
    and I could have findNoThrow() if I want a sentinel value returned.
    Or, the API could offer a callback hook that decides at run time
    whether to throw or not. (There are many other possible ways to do
    this, such as setting the behaviour at construction time, or by having
    different collection types with different behaviours.)

    The point is that a more flexible API is likely to be more useful than
    one that sets a single exception policy for everyone.
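
    A minimal Python sketch of such a dual API (the Collection class, the
    findNoThrow name and the NotFoundError exception are all invented here,
    purely for illustration; it simply mirrors dict[k] versus dict.get(k)):

    class NotFoundError(KeyError):
        pass

    class Collection(object):
        def __init__(self):
            self._data = {}

        def add(self, key, value):
            self._data[key] = value

        def find(self, key):
            # Throwing variant: a missing key is treated as exceptional.
            try:
                return self._data[key]
            except KeyError:
                raise NotFoundError(key)

        def findNoThrow(self, key, default=None):
            # Sentinel variant: a missing key just returns a default.
            return self._data.get(key, default)
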
    As the API
    creator, if I indicate errors with exceptions, I make a policy decision
    about what is an error and what is not. It behooves me to be
    conservative in that policy: I should throw exceptions only for
    conditions that are unlikely to arise during routine and normal use of
    the API.
    But lost connections *are* routine and normal. Hopefully they are rare.
    In the context of my example, they are not. The range of behaviours
    naturally falls into these categories:

    * No data ready
    * Data ready
    * EOF
    * Socket error

    The first three cases are the "normal" ones; they operate on the same
    program state and they are completely expected: while reading a
    message off the wire, the program will almost certainly encounter the
    first two conditions and, if there is no error, it will always
    encounter the EOF condition. The fourth case is the unexpected one, in
    the sense that this case will often not arise at all. That's not to
    say that lost connections aren't routine; they are. But, when a
    connection is lost, the program has to do different things and operate
    on different state than when the connection stays up. This strongly
    suggests that the first three conditions should be dealt with by
    return values and/or out parameters, and the fourth condition should
    be dealt with as an exception.

    Cheers,

    Michi.
  • MRAB at Jan 4, 2010 at 11:18 pm

    Michi wrote:
    On Jan 4, 1:30 pm, Steven D'Aprano
    wrote:
    [snip]
    * Is it appropriate to force the caller to deal with the condition in
    a catch-handler?
    * If the caller fails to explicitly deal with the condition, is it
    appropriate to terminate the program?
    Only if the answer to these questions is "yes" is it appropriate to
    throw an exception. Note the third question, which is often forgotten.
    By throwing an exception, I not only force the caller to handle the
    exception with a catch-handler (as opposed to leaving the choice to the
    caller), I also force the caller to *always* handle the exception: if
    the caller wants to ignore the condition, he/she still has to write a
    catch-handler and failure to do so terminates the program.
    That's a feature of exceptions, not a problem.
    Yes, and didn't say that it is a problem. However, making the wrong
    choice for the use of the feature is a problem, just as making the
    wrong choice for not using the feature is.
    Apart from the potential performance penalty, throwing exceptions for
    expected outcomes is bad also because it forces a try-catch block on the
    caller.
    But it's okay to force a `if (result==MagicValue)` test instead?
    Yes, in some cases it is. For example:

    int numBytes;
    int fd = open(...);
    while ((numBytes = read(fd, ...)) > 0)
    {
        // process data...
    }

    Would you prefer to see EOF indicated by an exception rather than a
    zero return value? I wouldn't.
    I wouldn't consider zero to be a magic value in this case. Returning a
    negative number if an error occurred would be magic. A better comparison
    might be str.find vs str.index, the former returning a magic value -1.
    Which is used more often?
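
    For reference, the contrast on a miss (both are standard string methods,
    shown here on a throwaway literal):

    s = "hello world"
    print s.find("xyz")        # -1: the magic-value style
    try:
        print s.index("xyz")
    except ValueError:
        print "not found"      # the exception style
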
    Look, the caller has to deal with exceptional cases (which may include
    error conditions) one way or the other. If you don't deal with them at
    all, your code will core dump, or behave incorrectly, or something. If
    the caller fails to deal with the exceptional case, it is better to cause
    an exception that terminates the application immediately than it is to
    allow the application to generate incorrect results.
    I agree that failing to deal with exceptional cases causes problems. I
    also agree that exceptions, in general, are better than error codes
    because they are less likely to go unnoticed. But, as I said, it
    really depends on the caller whether something should be an exception
    or not.

    The core problem isn't whether exceptions are good or bad in a
    particular case, but that most APIs make this an either-or choice. For
    example, if I had an API that allowed me to choose at run time whether
    an exception will be thrown for a particular condition, I could adapt
    that API to my needs, instead of being stuck with whatever the
    designer came up with.

    There are many ways this could be done. For example, I could have a
    find() operation on a collection that throws if a value isn't found,
    and I could have findNoThrow() if I want a sentinel value returned.
    Or, the API could offer a callback hook that decides at run time
    whether to throw or not. (There are many other possible ways to do
    this, such as setting the behaviour at construction time, or by having
    different collection types with different behaviours.)
    Or find() could have an extra keyword argument, e.g.
    string.find(substring, default=-1), although that should probably be
    string.index(substring, default=-1) as a replacement for
    string.find(substring).
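
    A small sketch of what such a default-aware lookup could look like as a
    plain wrapper function (the name and the sentinel are invented here;
    str.index itself takes no default argument):

    _MISSING = object()

    def index(s, sub, default=_MISSING):
        # Like str.index, but return `default` instead of raising
        # ValueError when a default has been supplied.
        try:
            return s.index(sub)
        except ValueError:
            if default is _MISSING:
                raise
            return default

    print index("hello", "ell")        # 1
    print index("hello", "xyz", -1)    # -1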
  • Steven D'Aprano at Jan 4, 2010 at 11:44 pm

    On Mon, 04 Jan 2010 13:34:34 -0800, Michi wrote:

    On Jan 4, 1:30 pm, Steven D'Aprano
    wrote:
    This is very true, but good APIs often trade-off increased usability
    and reduced defect rate against machine efficiency too. In fact, I
    would argue that this is a general design principle of programming
    languages: since correctness and programmer productivity are almost
    always more important than machine efficiency, the long-term trend
    across virtually all languages is to increase correctness and
    productivity even if doing so costs some extra CPU cycles.
    Yes, I agree with that in general. Correctness and productivity are more
    important, as a rule, and should be given priority.
    I'm glad we agree on that, but I wonder why you previously emphasised
    machine efficiency so much, and correctness almost not at all, in your
    previous post?

    (For example, wrapper APIs often require additional memory
    allocations and/or data copies.) Incorrect use of exceptions also
    incurs an efficiency penalty.
    And? *Correct* use of exceptions also incur a penalty. So does the use
    of functions. Does this imply that putting code in functions is a poor
    API? Certainly not.
    It does imply that incorrect use of exceptions incurs an unnecessary
    performance penalty, no more, no less, just as incorrect use of wrappers
    incurs an unnecessary performance penalty.
    If all your argument is that we shouldn't write crappy APIs, then I
    agree with you completely. The .NET example you gave previously is a good
    example of an API that is simply poor: using exceptions isn't a panacea
    that magically makes code better. So I can't disagree that using
    exceptions badly incurs an unnecessary performance penalty, but it also
    incurs an unnecessary penalty against correctness and programmer
    productivity.

    But no matter how much more expensive, there will always be a cut-off
    point where it is cheaper on average to suffer the cost of handling an
    exception than it is to make unnecessary tests.

    In Python, for dictionary key access, that cut-off is approximately at
    one failure per ten or twenty attempts. So unless you expect more than
    one in ten attempts to lead to a failure, testing first is actually a
    pessimation, not an optimization.
    What this really comes down to is how frequently or infrequently a
    particular condition arises before that condition should be considered
    an exceptional condition rather than a normal one. It also relates to
    how the set of conditions partitions into "normal" conditions and
    "abnormal" conditions. The difficulty for the API designer is to make
    these choices correctly.
    The first case is impossible for the API designer to predict, although
    she may be able to make some educated estimates based on experience. For
    instance I know that when I search a string for a substring, "on average"
    I expect to find the substring present more often than not. I've put "on
    average" in scare-quotes because it's not a statistical average at all,
    but a human expectation -- a prejudice in fact. I *expect* to have
    searching succeed more often than fail, not because I actually know how
    many searches succeed and fail, but because I think of searching for an
    item to "naturally" find the item. But if I actually profiled my code in
    use on real data, who knows what ratio of success/failure I would find?

    In the second case, the decision of what counts as "ordinary" and what
    counts as "exceptional" should, in general, be rather obvious. (That's
    not to discount the possibility of unobvious cases, but that's probably a
    sign that the function is too complex and tries to do too much.) Take the
    simplest description of what the function is supposed to do: (e.g. "find
    the offset of a substring in a source string"). That's the ordinary case,
    and should be returned. Is there anything else that the function may do?
    (E.g. fail to find the substring because it isn't there.) Then that's an
    exceptional case.

    (There may be other exceptional cases, which is another reason to prefer
    exceptions to magic return values. In general, it's much easier to deal
    with multiple exception types than it is to test for multiple magic
    return values. Consider a function that returns a pointer. You can return
    null to indicate an error. What if you want to distinguish between two
    different error states? What about ten error states?)
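
    A sketch of that point with invented exception names: several failure
    states can be told apart without reserving several magic values.

    class LookupFailed(Exception): pass
    class KeyMissing(LookupFailed): pass
    class KeyExpired(LookupFailed): pass

    def fetch(cache, key):
        if key not in cache:
            raise KeyMissing(key)
        value, fresh = cache[key]
        if not fresh:
            raise KeyExpired(key)
        return value

    cache = {"a": (42, True), "b": (7, False)}
    for key in ["a", "b", "c"]:
        try:
            print fetch(cache, key)
        except KeyMissing, e:
            print "missing:", e
        except KeyExpired, e:
            print "stale:", e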

    I argue that as designers, we should default to raising an exception and
    only choose otherwise if there is a good reason not to. As we agreed
    earlier, exceptions (in general) are better for correctness and
    productivity, which in turn are (in general) more important than machine
    efficiency. The implication of this is that in general, we should prefer
    exceptions, and only avoid them when necessary. Your argument seems to be
    that we should avoid exceptions by default, and only use them if
    unavoidable. I think that is backwards.

    In some, limited, cases you might be able to use the magic return value
    strategy, but this invariably leads to lost programmer productivity,
    more complex code, lowered readability and usability, and more defects,
    because programmers will invariably neglect to test for the special
    value:
    I disagree here, to the extent that, whether something is an error or
    not can very much depend on the circumstances in which the API is used.
    That's certainly true: a missing key (for example) may be an error, or a
    present key may be an error, or neither may be an error, just different
    branches of an algorithm. That's an application-specific decision. But I
    don't see how that relates to my claim that magic return values are less
    robust and usable than exceptions. Whether it is an error or not, it
    still needs to be handled. If the caller neglects to handle the special
    case, an exception-based strategy will almost certainly lead to the
    application halting (hopefully leading to a harmless bug report rather
    than the crash of a billion-dollar space probe), but a magic return value
    will very often lead to the application silently generating invalid
    results.

    [...]
    Wanting to ignore a return value from a function is perfectly normal and
    legitimate in many cases.
    I wouldn't say that's normal. If you don't care about the function's
    result, why are you calling it? For the side-effects? In languages that
    support procedures, such mutator functions should be written as
    procedures that don't return anything. For languages that don't, like
    Python, they should be written as de-facto procedures, always return
    None, and allow the user to pretend that nothing was returned.

    That is to say, ignoring the return value is acceptable as a work-around
    for the lack of true procedures. But even there, procedures necessarily
    operate by side-effect, and side-effects should be avoided as much as
    possible. So I would say, ideally, wanting to ignore the return value
    should be exceptionally rare.

    However, if a function throws instead of
    returning a value, ignoring that value becomes more difficult for the
    caller and can extract a performance penalty that may be unacceptable to
    the caller.
    There's that premature micro-optimization again.

    The problem really is that, at the time the API is designed,
    there often is no way to tell whether this will actually be the case; in
    turn, no matter whether I choose to throw an exception or return an
    error code, it will be wrong for some people some of the time.
    I've been wondering when you would reach the conclusion that an API
    should offer both forms. For example, Python offers both key-lookup that
    raises exceptions (dict[key]) and key-lookup that doesn't (dict.get(key)).

    The danger of this is that it complicates the API, leads to a more
    complex implementation, and may result in duplicated code (if the two
    functions have independent implementations). But if you don't duplicate
    the code, then the assumed performance benefit of magic return values
    over exceptions might very well be completely negated:

    def get(self, key):
        # This is not the real Python dict.get implementation!
        # This is merely an illustration of how it *could* be.
        try:
            return self[key]
        except KeyError:
            return None


    This just emphasises the importance of not optimising code by assumption.
    If you haven't *measured* the speed of a function you don't know whether
    it will be faster or slower than catching an exception.

    You will note that the above has nothing to do with the API, but is
    entirely an implementation decision. This to me demonstrates that the
    question of machine efficiency is irrelevant to API design.

    This is a classic example of premature optimization. Unless such
    inefficiency can be demonstrated to actually matter, then you do nobody
    any favours by preferring the API that leads to more defects on the
    basis of *assumed* efficiency.
    I agree with the concern about premature optimisation. However, I don't
    agree with a blanket statement that special return values always and
    unconditionally lead to more defects.
    I can't say that they *always* lead to more defects, since that also
    depends on the competence of the caller, but I will say that as a general
    principle, they should be *expected* to lead to more defects.

    Returning to the .NET non-
    blocking I/O example, the fact that the API throws an exception when it
    shouldn't very much complicates the code and introduces a lot of extra
    control logic that is much more likely to be wrong than a simple
    if-then-else statement. As I said, throwing an exception when none
    should be thrown can be just as harmful as the opposite case.
    In this case, it's worse than that -- they use a special return value
    when there should be an exception, and an exception when there should be
    an ordinary, non-special value (an empty string, if I recall correctly).

    It doesn't matter whether it is an error or not. They are called
    EXCEPTIONS, not ERRORS. What matters is that it is an exceptional case.
    Whether that exceptional case is an error condition or not is dependent
    on the application.
    Exactly. To me, that implies that making something an exception that, to
    the caller, shouldn't be is just as inconvenient as the other way
    around.
    Well, obviously I agree that you should only make things be an exception
    if they actually should be an exception. I don't quite see where the
    implication is -- I find myself in the curious position of agreeing with
    your conclusion while questioning your reasoning, as if you had said
    something like:

    All cats have four legs, therefore cats are mammals.

    Apart from the potential performance penalty, throwing exceptions for
    expected outcomes is bad also because it forces a try-catch block on
    the caller.
    But it's okay to force a `if (result==MagicValue)` test instead?
    Yes, in some cases it is. For example:

    int numBytes;
    int fd = open(...);
    while ((numBytes = read(fd, ...)) > 0) {
        // process data...
    }

    Would you prefer to see EOF indicated by an exception rather than a zero
    return value? I wouldn't.
    Why not? Assuming this is a blocking read, once you hit EOF you will
    never recover from it. Is this about the micro-optimisation again? Disc
    IO is almost certainly a thousand times slower than any exception you
    could catch here.

    In Python, we *do* use exceptions for file reads. An explicit read
    returns an empty string, and we might write:


    f = open(filename)
    while 1:
        block = f.read(buffersize)
        if not block:
            f.close()
            break
        process(block)


    This would arguably be easier to write and read, and demonstrates the
    intent of the while loop better:

    f = open(filename)
    try:
        while 1:
            process(f.read(buffersize))
    except EOFError:
        f.close()

    (But the above doesn't work, because an explicit read doesn't raise an
    exception.)

    However, there's another idiom for reading a file which does use an
    exception: line-by-line reading.

    f = open(filename)
    for line in f:
        process(line)
    f.close()

    Because iterating over the file generates a StopIteration when EOF is
    reached, the for loop automatically breaks. If you wanted to handle that
    by hand, something like this should work (but is unnecessary, because
    Python already does it for you):


    f = open(filename)
    try:
        while 1:
            process(f.next())
    except StopIteration:
        f.close()


    [...]
    The core problem isn't whether exceptions are good or bad in a
    particular case, but that most APIs make this an either-or choice. For
    example, if I had an API that allowed me to choose at run time whether
    an exception will be thrown for a particular condition, I could adapt
    that API to my needs, instead of being stuck with whatever the designer
    came up with.

    There are many ways this could be done. For example, I could have a
    find() operation on a collection that throws if a value isn't found, and
    I could have findNoThrow() if I want a sentinel value returned. Or, the
    API could offer a callback hook that decides at run time whether to
    throw or not. (There are many other possible ways to do this, such as
    setting the behaviour at construction time, or by having different
    collection types with different behaviours.)

    The point is that a more flexible API is likely to be more useful than
    one that sets a single exception policy for everyone.

    This has costs of its own. The costs of developer education -- learning
    about, memorising, and deciding between such multiple APIs does not come
    for free. The costs of developing and maintaining the multiple functions.
    The risks of duplicated code in the implementation. The cost of writing
    documentation. A bloated API is not free of costs.


    As the API
    creator, if I indicate errors with exceptions, I make a policy
    decision about what is an error and what is not. It behooves me to be
    conservative in that policy: I should throw exceptions only for
    conditions that are unlikely to arise during routine and normal use
    of the API.
    But lost connections *are* routine and normal. Hopefully they are rare.
    In the context of my example, they are not. The range of behaviours
    naturally falls into these categories:

    * No data ready
    * Data ready
    * EOF
    * Socket error
    Right -- that fourth example is one of the NATURAL categories that any
    half-way decent developer needs to be aware of. When you say something
    isn't natural, and then immediately contradict yourself, that's a sign
    you need to think about what you really mean :)

    The first three cases are the "normal" ones; they operate on the same
    program state and they are completely expected: while reading a message
    off the wire, the program will almost certainly encounter the first two
    conditions and, if there is no error, it will always encounter the EOF
    condition.
    I would call these the ordinary cases, as opposed to the exceptional
    cases.

    The fourth case is the unexpected one, in the sense that this
    case will often not arise at all.
    But it is still expected -- you have to expect that you might get a
    socket error, and code accordingly.

    That's not to say that lost connections aren't routine; they are.
    Right -- we actually agree on this, we just disagree on the terminology.
    I believe that talking about "normal" and "errors" is misleading. Better
    is to talk about "ordinary" and "exceptional".

    But, when a connection is lost,
    the program has to do different things and operate on different state
    than when the connection stays up. This strongly suggests that the first
    three conditions should be dealt with by return values and/or out
    parameters, and the fourth condition should be dealt with as an
    exception.
    Agreed.




    --
    Steven
  • Michi at Jan 6, 2010 at 10:15 pm

    On Jan 5, 9:44 am, Steven D'Aprano <st... at REMOVE-THIS-cybersource.com.au> wrote:

    I'm glad we agree on that, but I wonder why you previously emphasised
    machine efficiency so much, and correctness almost not at all, in your
    previous post?
    Uh? Because the original poster quoted one small paragraph out of a
    large article and that paragraph happened to deal with this particular
    (and minor) point of API design?
    If all you're argument is that we shouldn't write crappy APIs, then I
    agree with you completely.
    Well, yes: the article was precisely about that. And the point about
    exception efficiency was a minor side remark in that article.
    Your argument seems to be
    that we should avoid exceptions by default, and only use them if
    unavoidable. I think that is backwards.
    I never made that argument. After 25 years as a software engineer, I
    well and truly have come to appreciate exceptions as a superior form
    of error handling. I simply stated that throwing an exception when
    none should be thrown is a pain and often inefficient on top of that.
    That's all, really.
    I wouldn't say that's normal. If you don't care about the function's
    result, why are you calling it? For the side-effects?
    printf returns a value that is almost always ignored. And, yes, a
    function such as printf is inevitably called for its side effects. We
    could argue that printf is misdesigned (I would): the return value is
    not useful, otherwise it would be used more. And if printf threw an
    exception when something didn't work, that would be appropriate
    because it fails so rarely that, if it does fail, I probably want to
    know.
    However, if a function throws instead of
    returning a value, ignoring that value becomes more difficult for the
    caller and can extract a performance penalty that may be unacceptable to
    the caller.
    There's that premature micro-optimization again.
    Let's be clear here: the entire discussion is about *inappropriate*
    use of exceptions. This isn't a premature optimisation. It's about
    deciding when an exception is appropriate and when not. If I throw an
    exception when I shouldn't, I make the API harder to use *and* less
    efficient. The real crime isn't the loss of efficiency though, it's
    the inappropriate exception.
    I've been wondering when you would reach the conclusion that an API
    should offer both forms. For example, Python offers both key-lookup that
    raises exceptions (dict[key]) and key-lookup that doesn't (dict.get(key)).

    The danger of this is that it complicates the API, leads to a more
    complex implementation, and may result in duplicated code (if the two
    functions have independent implementations).
    Offering a choice in some form can be appropriate for some APIs. I'm
    not advocating it as a panacea, and I'm aware of the down-side in
    increased complexity, learning curve, etc. (BTW, the article discusses
    this issue in some detail.)
    Well, obviously I agree that you should only make things be an exception
    if they actually should be an exception. I don't quite see where the
    implication is
    In the context of the original article, I argued that throwing
    exceptions that are inappropriate is one of the many things that API
    designers get wrong. To many people, that's stating the obvious. The
    number of APIs that still make exactly this mistake suggests that the
    point is worth making though.

    Anyway, some of the early posts implied that I was arguing against
    exception handling per-se because exceptions can be less efficient. I
    responded to correct that misconception. What the article really said
    is that throwing an exception when none should be thrown is bad API
    design, and inefficient to boot. I stand by that statement.

    Cheers,

    Michi.
  • R0g at Jan 5, 2010 at 2:31 am

    Michi wrote:
    On Jan 4, 1:30 pm, Steven D'Aprano
    wrote:
    In some, limited, cases you might be able to use the magic return value
    strategy, but this invariably leads to lost programmer productivity, more
    complex code, lowered readability and usability, and more defects,
    because programmers will invariably neglect to test for the special value:
    I disagree here, to the extent that, whether something is an error or
    not can very much depend on the circumstances in which the API is
    used. The collection case is a very typical example. Whether failing
    to locate a value in a collection is an error very much depends on
    what the collection is used for. In some cases, it's a hard error
    (because it might, for example, imply that internal program state has
    been corrupted); in other cases, not finding a value is perfectly
    normal.


    A pattern I have used a few times is that of returning an explicit
    success/failure code alongside whatever the function normally returns.
    While subsequent programmers might not intuit the need to test for
    (implicit) "magic" return values they ought to notice if they start
    getting tuples back where they expected scalars...

    def foo(x):
        if x > 0:
            return True, x*x
        else:
            return False, "Bad value of x in foo: " + str(x)

    ok, value = foo(-1)
    if ok:
        print "foo of x is", value
    else:
        print "ERROR:", value


    Roger.
  • Lie Ryan at Jan 5, 2010 at 7:02 am

    On 1/5/2010 1:31 PM, r0g wrote:
    Michi wrote:
    On Jan 4, 1:30 pm, Steven D'Aprano
    wrote:
    In some, limited, cases you might be able to use the magic return value
    strategy, but this invariably leads to lost programmer productivity, more
    complex code, lowered readability and usability, and more defects,
    because programmers will invariably neglect to test for the special value:
    I disagree here, to the extent that, whether something is an error or
    not can very much depend on the circumstances in which the API is
    used. The collection case is a very typical example. Whether failing
    to locate a value in a collection is an error very much depends on
    what the collection is used for. In some cases, it's a hard error
    (because it might, for example, imply that internal program state has
    been corrupted); in other cases, not finding a value is perfectly
    normal.


    A pattern I have used a few times is that of returning an explicit
    success/failure code alongside whatever the function normally returns.
    While subsequent programmers might not intuit the need to test for
    (implicit) "magic" return values they ought to notice if they start
    getting tuples back where they expected scalars...

    def foo(x):
        if x > 0:
            return True, x*x
        else:
            return False, "Bad value of x in foo: " + str(x)

    ok, value = foo(-1)
    if ok:
        print "foo of x is", value
    else:
        print "ERROR:", value
    Except that that is a reinvention of try-wheel:

    class MathError(ValueError):
        pass    # a custom exception for the example

    def foo(x):
        if x > 0:
            return x*x
        else:
            raise MathError("Bad value of x in foo: %s" % x)

    try:
        print foo(-1)
    except MathError, e:
        print "ERROR: System integrity is doubted"

    Or rather, that is perhaps a good example of when to use 'assert'. If
    the domain of foo() is positive integers, calling foo with -1 is a bug in
    the caller, not in foo().

    I have been looking at Haskell recently, and the way the pure functional
    language handles exceptions and I/O gives me a distinct new "insight":
    exceptions can be thought of as a special return value that is
    implicitly wrapped and unwrapped up the call stack until it is
    explicitly handled.
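
    In Python terms, that insight can be sketched by propagating a tagged
    (ok, value) pair by hand, which is roughly what the runtime does for you
    when an exception unwinds the stack (function names are made up):

    def parse_int(text):
        try:
            return True, int(text)
        except ValueError:
            return False, "not a number: %r" % text

    def doubled(text):
        ok, result = parse_int(text)
        if not ok:
            return False, result       # explicitly re-wrap the failure
        return True, result * 2

    print doubled("21")                # (True, 42)
    print doubled("spam")              # (False, "not a number: 'spam'")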
  • R0g at Jan 5, 2010 at 9:07 am

    Lie Ryan wrote:
    On 1/5/2010 1:31 PM, r0g wrote:
    Michi wrote:
    On Jan 4, 1:30 pm, Steven D'Aprano
    wrote:
    <snip>
    A pattern I have used a few times is that of returning an explicit
    success/failure code alongside whatever the function normally returns.
    While subsequent programmers might not intuit the need to test for
    (implicit) "magic" return values they ought to notice if they start
    getting tuples back where they expected scalars...

    def foo(x):
        if x > 0:
            return True, x*x
        else:
            return False, "Bad value of x in foo: " + str(x)

    ok, value = foo(-1)
    if ok:
        print "foo of x is", value
    else:
        print "ERROR:", value
    Except that that is a reinvention of try-wheel:

    True, but there's more than one way to skin a cat! Mine's faster if you
    expect a high rate of failures (over 15%).



    def foo(x):
        if x > 0:
            return x*x
        else:
            raise MathError("Bad value of x in foo: %s" % x)

    try:
        print foo(-1)
    except MathError, e:
        print "ERROR: System integrity is doubted"

    or rather; that is perhaps a good example of when to use 'assert'. If
    the domain of foo() is positive integers, calling -1 on foo is a bug in
    the caller, not foo().

    Maybe, although I recently learned on here that one can't rely on assert
    statements in production code; their intended use is really to aid
    debugging and testing.

    Besides, that was just a toy example.


    I have been looking at Haskell recently and the way the pure functional
    language handled exceptions and I/O gives me a new distinct "insight"
    that exceptions can be thought of as a special return value that is
    implicitly wrapped and unwrapped up the call stack until it is
    explicitly handled.


    Yes, there are some very interesting paradigms coming out of functional
    programming but, unless you're a maths major, functional languages are a
    long way off being productivity tools! Elegant: yes, provable: maybe,
    practical for everyday coding: not by a long shot!


    Roger.
  • Chris Rebert at Jan 5, 2010 at 9:19 am
    <much snippage>
    On Tue, Jan 5, 2010 at 1:07 AM, r0g wrote:
    Lie Ryan wrote:
    I have been looking at Haskell recently and the way the pure functional
    language handled exceptions and I/O gives me a new distinct "insight"
    that exceptions can be thought of as a special return value that is
    implicitly wrapped and unwrapped up the call stack until it is
    explicitly handled.
    Yes there's some very interesting paradigms coming out of functional
    programming but, unless you're a maths major, functional languages are a
    long way off being productivity tools! Elegant: yes, provable: maybe,
    practical for everyday coding: not by a long shot!
    Methinks the authors of Real World Haskell (excellent read btw) have a
    bone to pick with you.

    Cheers,
    Chris
  • R0g at Jan 5, 2010 at 10:22 am

    Chris Rebert wrote:
    <much snippage>
    On Tue, Jan 5, 2010 at 1:07 AM, r0g wrote:
    Lie Ryan wrote:
    I have been looking at Haskell recently and the way the pure functional
    language handled exceptions and I/O gives me a new distinct "insight"
    that exceptions can be thought of as a special return value that is
    implicitly wrapped and unwrapped up the call stack until it is
    explicitly handled.
    Yes there's some very interesting paradigms coming out of functional
    programming but, unless you're a maths major, functional languages are a
    long way off being productivity tools! Elegant: yes, provable: maybe,
    practical for everyday coding: not by a long shot!
    Methinks the authors of Real World Haskell (excellent read btw) have a
    bone to pick with you.

    Cheers,
    Chris
    --
    http://blog.rebertia.com

    LOL, it seems things have come a long way since ML! I'm impressed how
    many useful libraries Haskell has, and that they've included
    IF-THEN-ELSE in the syntax! :) For all its advantages I still think you
    need to be fundamentally cleverer to write the same programs in a
    functional language than in an old-fashioned "English like" language.

    Maybe I'm just mistrusting of the new school though and you'll see me on
    comp.lang.haskell in a few years having to eat my own monads!

    Roger.
  • Dave Angel at Jan 5, 2010 at 11:59 am

    r0g wrote:
    <snip>

    Maybe, although I recently learned on here that one can't rely on assert
    statements in production code, their intended use is to aid debugging
    and testing really.
    Hopefully, what you learned is that you can't use assert() in production
    code to validate user data. It's fine to use it to validate program
    logic, because that shouldn't still need testing in production.

    <snip>

    DaveA
  • R0g at Jan 5, 2010 at 1:06 pm

    Dave Angel wrote:

    r0g wrote:
    <snip>

    Maybe, although I recently learned on here that one can't rely on assert
    statements in production code, their intended use is to aid debugging
    and testing really.
    Hopefully, what you learned is that you can't use assert() in production
    code to validate user data. It's fine to use it to validate program
    logic, because that shouldn't still need testing in production.

    <snip>

    DaveA


    Well maybe I didn't quite get it then, could you explain a bit further?

    My understanding was that asserts aren't executed at all if python is
    started with the -O or -OO option, or run through an optimizer. If
    that's the case how can you expect it to validate anything at all in
    production? Do you mean for debugging in situ or something? Could you
    maybe give me an example scenario to illustrate your point?

    Cheers,

    Roger.
  • Steven D'Aprano at Jan 5, 2010 at 2:02 pm

    On Tue, 05 Jan 2010 13:06:20 +0000, r0g wrote:

    Dave Angel wrote:

    r0g wrote:
    <snip>

    Maybe, although I recently learned on here that one can't rely on
    assert
    statements in production code, their intended use is to aid debugging
    and testing really.
    Hopefully, what you learned is that you can't use assert() in
    production code to validate user data. It's fine to use it to validate
    program logic, because that shouldn't still need testing in production.

    <snip>

    DaveA


    Well maybe I didn't quite get it then, could you explain a bit further?

    My understanding was that asserts aren't executed at all if python is
    started with the -O or -OO option,
    Correct.

    or run through an optimizer.
    I don't know what you mean by that.

    If
    that's the case how can you expect it to validate anything at all in
    production?
    The asserts still operate so long as you don't use the -O switch.
    Do you mean for debugging in situ or something? Could you
    maybe give me an example scenario to illustrate your point?

    There are at least two sorts of validation that you will generally need
    to perform: validating user data, and validating your program logic.

    You *always* need to validate user data (contents of user-editable config
    files, command line arguments, data files, text they type into fields,
    etc.) because you have no control over what they put into that. So you
    shouldn't use assert for validating user data except for quick-and-dirty
    scripts you intend to use once and throw away.

    Program logic, on the other hand, theoretically shouldn't need to be
    validated at all, because we, the programmers, are very clever and
    naturally never make mistakes. Since we never make mistakes, any logic
    validation we do is pointless and a waste of time, and therefore we
    should be able to optimise it away to save time.

    *cough*

    Since in reality we're not that clever and do make mistakes, we actually
    do want to do some such program validation, but with the option to
    optimise it away. Hence the assert statement.

    So, a totally made-up example:


    def function(x, y):
        if x < 0:
            raise ValueError("x must be zero or positive")
        if y > 0:
            raise ValueError("y must be zero or negative")
        z = x*y
        assert z < 0, "expected product of +ve and -ve number to be -ve"
        return 1.0/(z-1)



    This example cunningly demonstrates:

    (1) Using explicit test-and-raise for ensuring that user-supplied
    arguments are always validated;

    (2) Using an assertion to test your program logic;

    (3) That the assertion in fact will catch an error in the program logic,
    since if you pass x or y equal to zero, the assertion will fail.


    Any time you are tempted to write a comment saying "This can't happen,
    but we check for it just in case", that is a perfect candidate for an
    assertion. Since it can't happen, it doesn't matter if it goes away with
    the -O flag; but since we're imperfect, and we want to cover ourselves
    just in case it does happen, we perform the test when not optimized.
    From my own code, I have a global constant:
    UNICODE_NUMERALS = u'\uff10\uff11\uff12\uff13\uff14\uff15\uff16\uff17\uff18\uff19'


    And then to make sure I haven't missed any:

    assert len(UNICODE_NUMERALS) == 10


    In another function, I validate a mapping {key:value} to ensure that all
    the values are unique:

    seen_values = set()
    for k, v in mapping.items():
        if v in seen_values:
            raise ValueError('duplicate value %s' % k)
        seen_values.add(v)
    # If we get here without error, then the mapping contains no
    # duplicate values.
    assert len(seen_values) == len(mapping)


    The assertion acts as a double-check on my logic, not the data. If my
    logic is wrong (perhaps there is a way to get past the for-loop while
    there is a duplicate?) then the assertion will catch it.


    --
    Steven
  • D'Arcy J.M. Cain at Jan 5, 2010 at 2:19 pm

    On 05 Jan 2010 14:02:50 GMT Steven D'Aprano wrote:
    shouldn't use assert for validating user data except for quick-and-dirty
    scripts you intend to use once and throw away.
    A mythical beast that has yet to be spotted in the wild.

    --
    D'Arcy J.M. Cain <darcy at druid.net> | Democracy is three wolves
    http://www.druid.net/darcy/ | and a sheep voting on
    +1 416 425 1212 (DoD#0082) (eNTP) | what's for dinner.
  • Steve Holden at Jan 5, 2010 at 5:44 pm

    D'Arcy J.M. Cain wrote:
    On 05 Jan 2010 14:02:50 GMT
    Steven D'Aprano wrote:
    shouldn't use assert for validating user data except for quick-and-dirty
    scripts you intend to use once and throw away.
    A mythcial beast that has yet to be spotted in the wild.
    Not true (he wrote, picking nits). Such programs are written all the
    time. The fact that they invariably get used more often than intended
    doesn't negate the intentions of the author. ;-)

    regards
    Steve
    --
    Steve Holden +1 571 484 6266 +1 800 494 3119
    PyCon is coming! Atlanta, Feb 2010 http://us.pycon.org/
    Holden Web LLC http://www.holdenweb.com/
    UPCOMING EVENTS: http://holdenweb.eventbrite.com/
  • R0g at Jan 5, 2010 at 2:48 pm

    Steven D'Aprano wrote:
    On Tue, 05 Jan 2010 13:06:20 +0000, r0g wrote:
    <snip>
    Well maybe I didn't quite get it then, could you explain a bit further?

    My understanding was that asserts aren't executed at all if python is
    started with the -O or -OO option,
    Correct.

    or run through an optimizer.
    I don't know what you mean by that.

    I've never used them but I heard there are optimizers for python
    (Psyco?). I assumed these would do everything -O does and more,
    including losing the asserts.


    If
    that's the case how can you expect it to validate anything at all in
    production?
    The asserts still operate so long as you don't use the -O switch.
    Do you mean for debugging in situ or something? Could you
    maybe give me an example scenario to illustrate your point?

    There are at least two sorts of validation that you will generally need
    to perform: validating user data, and validating your program logic.
    <snipped very detailed and clear response, thanks :)>


    Cool, that's what I thought i.e. you can't rely on asserts being there
    so don't use them for anything critical but it's still a good idea to
    use them for logic/consistency checking in production code as, should
    you be running your production code unoptimised, it might catch
    something you'd otherwise miss.
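
    For example (a minimal sketch; the file name is made up), the very same
    assert that fires on a normal run simply disappears under -O:

    # sanity.py
    def mean(values):
        assert len(values) > 0, "logic error: mean() called with an empty list"
        return sum(values) / float(len(values))

    print mean([])    # "python sanity.py"    -> AssertionError from the check
                      # "python -O sanity.py" -> ZeroDivisionError further down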

    Thanks for responding in such detail :)

    Roger.
  • Lie Ryan at Jan 5, 2010 at 5:10 pm

    On 1/6/2010 1:48 AM, r0g wrote:
    Steven D'Aprano wrote:
    On Tue, 05 Jan 2010 13:06:20 +0000, r0g wrote:
    If
    that's the case how can you expect it to validate anything at all in
    production?
    The asserts still operate so long as you don't use the -O switch.
    Do you mean for debugging in situ or something? Could you
    maybe give me an example scenario to illustrate your point?

    There are at least two sorts of validation that you will generally need
    to perform: validating user data, and validating your program logic.
    <snipped very detailed and clear response, thanks :)>


    Cool, that's what I thought i.e. you can't rely on asserts being there
    so don't use them for anything critical but it's still a good idea to
    use them for logic/consistency checking in production code as, should
    you be running your production code unoptimised, it might catch
    something you'd otherwise miss.
    Steven described the traditional approach to using assertions; another
    approach to deciding when to use assertions is the one inspired by the
    Design-by-Contract paradigm. DbC extends the traditional approach by
    focusing on writing a contract (instead of writing assertions) and
    generating assertions[1] to validate the contract. Just like assertions,
    these contracts are meant to be removed in production releases.

    In Design-by-Contract, only code that interacts with the outside world
    (e.g. getting user/file/network input, etc.) needs to do any sort of
    validation. Code that doesn't interact directly with the outside world
    only needs to have a "contract", and is simplified by *not* needing argument
    checking, since the function relies on the caller obeying the
    contract[2] and never calling it with an invalid input.

    DbC uses assertions[1] liberally, unlike the traditional approach, which
    is much more conservative about using assertions.

    [1] or explicit language support which is just syntax sugar for assertions
    [2] of course, on a debug release, the contract validation code will
    still be enforced to catch logic/consistency bugs that cause the violation
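
    To make that concrete, here is a minimal sketch of the DbC-flavoured style
    written with plain assert statements (Python has no built-in contract
    syntax, and the function names below are made up):

    def read_percentage(text):
        # Boundary code: it deals with the outside world, so it validates
        # explicitly and raises -- this check is never optimised away.
        value = float(text)                 # may raise ValueError on bad input
        if not 0.0 <= value <= 100.0:
            raise ValueError("percentage must be between 0 and 100")
        return value

    def scale(percentage):
        # Internal code: relies on callers honouring the contract.
        # Precondition (stripped under -O):
        assert 0.0 <= percentage <= 100.0, "caller broke the contract"
        result = percentage / 100.0
        # Postcondition (also stripped under -O):
        assert 0.0 <= result <= 1.0
        return result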
  • R0g at Jan 6, 2010 at 1:03 am

    Lie Ryan wrote:
    On 1/6/2010 1:48 AM, r0g wrote:
    Steven D'Aprano wrote:
    On Tue, 05 Jan 2010 13:06:20 +0000, r0g wrote:
    If
    that's the case how can you expect it to validate anything at all in
    production?
    The asserts still operate so long as you don't use the -O switch.
    <snip>
    checking, since the function relies on the caller obeying the
    contract[2] and never calling it with an invalid input.

    DbC uses assertions[1] liberally, unlike the traditional approach, which
    is much more conservative about using assertions.

    [1] or explicit language support which is just syntax sugar for assertions
    [2] of course, on a debug release, the contract validation code will
    still be enforced to catch logic/consistency bugs that cause the violation

    Thanks for the responses Steven/Dave/Lie, that's some really insightful
    stuff :)

    Roger.
  • Dave Angel at Jan 5, 2010 at 3:34 pm

    r0g wrote:
    Dave Angel wrote:
    r0g wrote:
    <snip>

    Maybe, although I recently learned on here that one can't rely on assert
    statements in production code, their intended use is to aid debugging
    and testing really.

    Hopefully, what you learned is that you can't use assert() in production
    code to validate user data. It's fine to use it to validate program
    logic, because that shouldn't still need testing in production.

    <snip>

    DaveA


    Well maybe I didn't quite get it then, could you explain a bit further?

    My understanding was that asserts aren't executed at all if python is
    started with the -O or -OO option, or run through an optimizer. If
    that's the case how can you expect it to validate anything at all in
    production? Do you mean for debugging in situ or something? Could you
    maybe give me an example scenario to illustrate your point?

    Cheers,

    Roger.
    You understand the -O and -OO options fine. But the point is that you
    should not rely on assert() for anything that still needs to be checked
    after the code has been properly debugged and shipped to the user. For
    errors in the user's data you use if statements and raise statements, and
    print to stderr, or GUI dialog boxes, or whatever
    mechanism you use to tell your user. But those errors are ones caused
    by his data, not by your buggy code. And the message tells him what's
    wrong with his data, not that you encountered a negative value for some
    low level function.

    I agree with Steve's pessimistic view of the state of most released
    software. But if you view a particular internal check as useful for
    production, then it should be coded in another mechanism, not in
    assert. Go ahead and write one, with a UI that's appropriate for your
    particular application. But it should do a lot more than assert does,
    including telling the user your contact information to call for support.

    def production_assert(expression, message):
        if not expression:
            dialog_box("Serious internal bug, call NNN-NNN-NNNN immediately",
                       message)


    For an overly simplified example showing user validation and an assert:

    import sys

    def main():
        try:
            text = raw_input("Enter your age, between 1 and 22 ")
            age = int(text)
        except ValueError, e:
            age = -1
        if not 1 <= age <= 22:    # not an assert
            print "Age must be between 1 and 22"
            print "Run program again"
            sys.exit(2)
        grade = calc_grade(age)
        print "Your grade is probably", grade

    table = [0, 0, 0, 0, 0, "K", "First", "2nd", 3]

    def calc_grade(age):
        """ calculate a probable grade value, given an
        integer age between 1 and 22, inclusive
        """
        assert(1 <= age <= len(table))
        grade = table[age]    # assume I have a fixed-length table for this
        return grade

    main()

    Note a few things. One, I have a bug, in that the table isn't as big as
    the limit I'm checking for. With defensive coding, I'd have another
    assert for that, or even have the table size be available as a global
    constant (all uppers) so that everyone's in synch on the upper limit.
    But in any case, the test suite would be checking to make sure the code
    worked for 1, for 22, for a couple of values in between, and that a
    proper error response happened when a non-integer was entered, or one
    outside of the range. That all happens in separate code, not something
    in this file. And the test suite is run after every change to the
    sources, and certainly before release to production.
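
    As a sketch of that defensive variant (the constant name is made up): keep
    the upper limit in one place, and let a module-level assert catch the
    mismatch the moment the module is loaded:

    MAX_AGE = 22
    table = [0, 0, 0, 0, 0, "K", "First", "2nd", 3]
    # With the short table above, this fires immediately and exposes the bug:
    assert len(table) == MAX_AGE + 1, "grade table does not cover ages 0..MAX_AGE"

    def calc_grade(age):
        assert 1 <= age <= MAX_AGE
        return table[age]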

    Next, see the docstring. It establishes a precondition for the
    function. Since the function is called only by me (not the user), any
    preconditions can be checked with an assert. An assert without a
    supporting comment (or docstring) is almost worthless.
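
    For instance (a trivial, made-up contrast), the second form at least tells
    the maintainer what was being assumed:

    items = []                    # stand-in data, just for the illustration
    n = len(items)
    assert n >= 0                 # bare assert: a failure tells you nothing
    assert n >= 0, "n is a length; it can never be negative unless len() lies"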

    And finally, notice that I check the user's input *before* passing it on
    to any uncontrolled code. So any asserts after that cannot fire, unless
    I have a bug which was not caught during testing.

    All opinions my own, of course.

    DaveA
  • Steven D'Aprano at Jan 5, 2010 at 8:26 am

    On Tue, 05 Jan 2010 02:31:34 +0000, r0g wrote:

    A pattern I have used a few times is that of returning an explicit
    success/failure code alongside whatever the function normally returns.
    That doesn't work for languages that can only return a single result,
    e.g. C or Pascal. You can fake it by creating a struct that contains a
    flag and the result you want, but that means doubling the number of data
    types you deal with.

    While subsequent programmers might not intuit the need to test for
    (implicit) "magic" return values they ought to notice if they start
    getting tuples back where they expected scalars...
    What if they're expecting tuples as the result?


    def foo(x):
        if x > 0:
            return True, x*x
        else:
            return False, "Bad value of x in foo:", str(x)

    ok, value = foo(-1)
    Oops, that gives:

    ValueError: too many values to unpack


    because you've returned three items instead of two. When an idiom is easy
    to get wrong, it's time to think hard about it.


    if ok:
        print "foo of x is", value
    else:
        print "ERROR:", value

    Whenever I come across a function that returns a flag and a result, I
    never know whether the flag comes first or second. Should I write:

    flag, result = foo(x)

    or

    result, flag = foo(x)



    I've seen APIs that do both.

    And I never know if the flag should be interpreted as a success or a
    failure. Should I write:

    ok, result = foo(x)
    if ok: process(result)
    else: fail()

    or


    err, result = foo(x)
    if err: fail()
    else: process(result)


    Again, I've seen APIs that do both.

    And if the flag indicates failure, what should go into result? An error
    code? An error message? That's impossible for statically-typed languages,
    unless they have variant records or the function normally returns a
    string.

    And even if you dismiss all those concerns, it still hurts readability by
    obfuscating the code. Consider somebody who wants to do this:

    result = foo(bar(x))

    but instead has to do this:


    flag, result = bar(x)
    if flag: # I think this means success
        flag, result = foo(x) # oops, I meant result

    Again, it's error-prone and messy. Imagine writing:


    flag, a = sin(x)
    if flag:
        flag, b = sqrt(x)
        if flag:
            flag, c = cos(b)
            if flag:
                flag, d = exp(a + c)
                if flag:
                    flag, e = log(x)
                    if flag:
                        # Finally, the result we want!!!
                        flag, y = d/e
                        if not flag:
                            fail(y)
                    else:
                        fail(e)
                else:
                    fail(d)
            else:
                fail(c)
        else:
            fail(b)
    else:
        fail(a)



    Compare that to the way with exceptions:

    y = exp(sin(x) + cos(sqrt(x)))/log(x)


    Which would you prefer?
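
    (And when the failure does need handling, one try/except around the whole
    expression replaces all six nested flag checks -- a sketch, in the same
    Python 2 style as the rest of the thread:)

    from math import exp, sin, cos, sqrt, log

    def compute(x):
        try:
            return exp(sin(x) + cos(sqrt(x))) / log(x)
        except (ValueError, ZeroDivisionError), err:
            # e.g. sqrt/log of a negative number, or log(1) == 0 in the divisor
            print "computation failed:", err
            return None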




    --
    Steven
  • R0g at Jan 5, 2010 at 9:52 am

    Steven D'Aprano wrote:
    On Tue, 05 Jan 2010 02:31:34 +0000, r0g wrote:

    A pattern I have used a few times is that of returning an explicit
    success/failure code alongside whatever the function normally returns.
    That doesn't work for languages that can only return a single result,
    e.g. C or Pascal. You can fake it by creating a struct that contains a
    flag and the result you want, but that means doubling the number of data
    types you deal with.

    No, but that's why I try not to use languages where you can only return
    a single result; I always found that an arbitrary and annoying
    constraint to have. It leads to ugly practices like "magic" return values
    in C or explicitly packing things into hashtables like PHP, yuk!

    While subsequent programmers might not intuit the need to test for
    (implicit) "magic" return values they ought to notice if they start
    getting tuples back where they expected scalars...
    What if they're expecting tuples as the result?


    def foo(x):
        if x > 0:
            return True, x*x
        else:
            return False, "Bad value of x in foo:", str(x)

    ok, value = foo(-1)
    Oops, that gives:

    ValueError: too many values to unpack


    because you've returned three items instead of two. When an idiom is easy
    to get wrong, it's time to think hard about it.

    That seems pretty clear to me: "too many values to unpack" means either
    I've not given it enough variables to unpack the result into, or I've
    returned too many things. That would take a couple of seconds to notice
    and fix. In fact I was trying to make the point that it would be quite
    noticeable if a function returned more things than the programmer was
    expecting; this illustrates that quite well :)



    if ok:
        print "foo of x is", value
    else:
        print "ERROR:", value

    Whenever I come across a function that returns a flag and a result, I
    never know whether the flag comes first or second. Should I write:

    Flag then result, isn't it obvious? The whole point of returning a flag
    AND a result is that you test the flag to know what to do with the
    result, which implies a natural order. Of course it doesn't matter
    technically which way you do it; make a convention and stick to it. If
    you get perpetually confused as to the order of parameters then you'd
    better avoid this kind of thing; can't say as I've ever had a problem
    with it though.



    And I never know if the flag should be interpreted as a success or a
    failure. Should I write:

    ok, result = foo(x)
    if ok: process(result)
    else: fail()


    Yes. That would be my strong preference anyway. Naturally you can do it
    the other way round if you like, as long as you document it properly in
    your API. As you say, different APIs do it differently... Unix has a
    convention of returning 0 on success, but Unix has to encapsulate a lot
    in that "error code", which is a bit of an anachronism these days. I'd
    argue in favour of remaining positive and using names like ok or
    success; this is closer to the familiar paradigm of checking that a
    result does not evaluate to false before using it...

    name = ""
    if name:
    print name




    And if the flag indicates failure, what should go into result? An error
    code? An error message? That's impossible for statically-typed languages,
    unless they have variant records or the function normally returns a
    string.

    Yeah, in my example it's an error message. Maybe I shouldn't have used
    the word "pattern" above though as it has overtones of "universally
    applicable" which it clearly isn't.


    Again, it's error-prone and messy. Imagine writing:


    flag, a = sin(x)
    if flag:
        flag, b = sqrt(x)
        if flag: <snip>
    Compare that to the way with exceptions:

    y = exp(sin(x) + cos(sqrt(x)))/log(x)


    Which would you prefer?
    LOL, straw man is straw!

    You know full well I'm not suggesting every function return a flag; that
    would be silly. There's no reason returning a flag and a value shouldn't
    be quite readable, and there may be times when it's preferable to raising
    an exception.

    I use exceptions a lot as they're often the right tool for the job, and
    they seem pleasingly Pythonic, but from time to time they can be too slow
    or verbose; where's the sense in forcing yourself to use them then?

    Roger.
  • Paul Rudin at Jan 5, 2010 at 9:54 am

    r0g <aioe.org at technicalbloke.com> writes:

    Steven D'Aprano wrote:
    On Tue, 05 Jan 2010 02:31:34 +0000, r0g wrote:

    A pattern I have used a few times is that of returning an explicit
    success/failure code alongside whatever the function normally returns.
    That doesn't work for languages that can only return a single result,
    e.g. C or Pascal. You can fake it by creating a struct that contains a
    flag and the result you want, but that means doubling the number of data
    types you deal with.

    No, but that's why I try not to use languages where you can only return
    a single result; I always found that an arbitrary and annoying
    constraint to have. It leads to ugly practices like "magic" return values
    in C or explicitly packing things into hashtables like PHP, yuk!
    Doesn't python just return a single result? (I know it can be a tuple and
    assignment statements will unpack a tuple for you.)
  • R0g at Jan 5, 2010 at 10:31 am

    Paul Rudin wrote:
    r0g <aioe.org at technicalbloke.com> writes:
    Steven D'Aprano wrote:
    On Tue, 05 Jan 2010 02:31:34 +0000, r0g wrote:

    A pattern I have used a few times is that of returning an explicit
    success/failure code alongside whatever the function normally returns.
    That doesn't work for languages that can only return a single result,
    e.g. C or Pascal. You can fake it by creating a struct that contains a
    flag and the result you want, but that means doubling the number of data
    types you deal with.
    No, but that's why I try not to use languages where you can only return
    a single result; I always found that an arbitrary and annoying
    constraint to have. It leads to ugly practices like "magic" return values
    in C or explicitly packing things into hashtables like PHP, yuk!
    Doesn't python just return a single result? (I know it can be a tuple and
    assignment statements will unpack a tuple for you.)

    Yes, it returns a tuple if you return more than one value; it just has a
    lovely syntax for it. In static languages you'd need to manually create
    a new array or struct, pack your return vars into it and unpack them on
    the other side. That's something I'd be happy never to see again; sadly
    I have to write in PHP sometimes :(
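
    (A tiny illustration of the return-a-tuple-and-unpack idiom, with made-up
    names:)

    def min_max(values):
        return min(values), max(values)    # really one return value: a tuple

    lo, hi = min_max([3, 1, 4, 1, 5])      # unpacked straight into two names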

    Roger.
