Whilst considering a port of old code to Python 3, I see that in several
places we are using type comparisons to control processing of user
instances (as opposed to instances of built-in types, e.g. float, int, str).

I find that the obvious alternatives are not as fast as the current
code (func0 below). On my machine isinstance seems slower than type for
some reason. My 2.6 timings are:

C:\Tmp>\Python\lib\timeit.py -s"import t;v=t.X()" t.func0(v)
1000000 loops, best of 3: 0.348 usec per loop

C:\Tmp>\Python\lib\timeit.py -s"import t;v=t.X()" t.func1(v)
1000000 loops, best of 3: 0.747 usec per loop

C:\Tmp>\Python\lib\timeit.py -s"import t;v=t.X()" t.func2(v)
1000000 loops, best of 3: 0.378 usec per loop

C:\Tmp>\Python\lib\timeit.py -s"import t;v=t.X()" t.func3(v)
1000000 loops, best of 3: 0.33 usec per loop

C:\Tmp>\Python\lib\timeit.py -s"import t;v=t.X()" t.func0(1)
1000000 loops, best of 3: 0.477 usec per loop

C:\Tmp>\Python\lib\timeit.py -s"import t;v=t.X()" t.func1(1)
1000000 loops, best of 3: 1.14 usec per loop

C:\Tmp>\Python\lib\timeit.py -s"import t;v=t.X()" t.func2(1)
1000000 loops, best of 3: 1.16 usec per loop

C:\Tmp>\Python\lib\timeit.py -s"import t;v=t.X()" t.func3(1)
1000000 loops, best of 3: 1.14 usec per loop

So func3 seems to be the fastest option when the first test matches,
but it is poor when it doesn't. Can anyone suggest a better way to
determine if an object is a user instance?

##############################
from types import InstanceType
class X:
    __X__=True

class V(X):
    pass

def func0(ob):
    t=type(ob)
    if t is InstanceType:
        pass
    elif t in (float, int):
        pass
    else:
        pass

def func1(ob):
    if isinstance(ob,X):
        pass
    elif type(ob) in (float, int):
        pass
    else:
        pass

def func2(ob):
    if getattr(ob,'__X__',False):
        pass
    elif type(ob) in (float, int):
        pass
    else:
        pass

def func3(ob):
    if hasattr(ob,'__X__'):
        pass
    elif type(ob) in (float, int):
        pass
    else:
        pass
##############################

--
Robin Becker


  • Steven D'Aprano at Feb 1, 2009 at 8:15 am

    Robin Becker wrote:

    > Whilst considering a port of old code to python 3 I see that in several
    > places we are using type comparisons to control processing of user
    > instances (as opposed to instances of built in types eg float, int, str)
    >
    > I find that the obvious alternatives are not as fast as the current
    > code; func0 below. On my machine isinstance seems slower than type for
    > some reason. My 2.6 timings are

    First question is, why do you care that it's slower? The difference between
    the fastest and slowest functions is 1.16-0.33 = 0.83 microsecond. If you
    call the slowest function one million times, your code will run less than a
    second longer.

    Does that really matter, or are you engaged in premature optimization? In
    your test functions, the branches all execute "pass". Your real code
    probably calls other functions, makes calculations, etc, which will all
    take time. Probably milliseconds rather than microseconds. I suspect you're
    concerned about a difference of 0.1 of a percent, of one small part of your
    entire application. Unless you have profiled your code and this really is a
    bottleneck, I recommend you worry more about making your code readable and
    maintainable than worrying about micro-optimisations.
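
To put a number on how much of a processing step the type check actually costs, one option is to time the check and the whole step separately. The following is only a sketch for Python 3: `classify`, `process`, and `per_call_seconds` are hypothetical names, and the body of `process` is an assumed stand-in for whatever the real branches do.

```python
import timeit


class X:
    """Hypothetical stand-in for the user class in the original post."""


def classify(ob):
    # isinstance-based dispatch (the func1 style from the thread).
    if isinstance(ob, X):
        return "user"
    elif isinstance(ob, (float, int)):
        return "number"
    return "other"


def process(ob):
    # Assumed stand-in for the real per-object work; replace with actual code.
    kind = classify(ob)
    return kind + repr(ob) * 10


def per_call_seconds(func, arg, number=100_000):
    # Best of three runs divided by the loop count, like timeit's CLI output.
    best = min(timeit.repeat(lambda: func(arg), number=number, repeat=3))
    return best / number


check_time = per_call_seconds(classify, 1)
total_time = per_call_seconds(process, 1)
print("type check: %.3f usec per call" % (check_time * 1e6))
print("whole step: %.3f usec per call" % (total_time * 1e6))
print("share spent in the check: %.1f%%" % (100 * check_time / total_time))
```

The absolute numbers depend on the machine and interpreter; the share on the last line is what indicates whether the check is worth optimizing at all.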

    Even more important than being readable is being *correct*, and I believe
    that your code has some unexpected failure modes (bugs). See below:


    > so func 3 seems to be the fastest option for the case when the first
    > test matches, but is poor when it doesn't. Can anyone suggest a better
    > way to determine if an object is a user instance?
    >
    > ##############################
    > from types import InstanceType

    I believe this will go away in Python 3, as all classes will be New Style
    classes.

    > class X:
    >     __X__=True

    This is an Old Style class in Python 2.x, and a New Style class in Python 3.

    Using hasattr(ob, '__X__') is a curious way of detecting what you want. I
    suppose it could be argued that it is a variety of duck-typing: "if it has
    a duck's bill, it must be a duck". (Unless it is a platypus, of course.)
    However, attribute names with leading and trailing double-underscores are
    reserved for use as "special methods". You should rename it to something
    more appropriate: _MAGIC_LABEL, say.

    > class V(X):
    >     pass
    >
    > def func0(ob):
    >     t=type(ob)
    >     if t is InstanceType:
    >         pass

    This test is too broad. It will succeed for an instance of *any* old-style
    class, not just X and V instances. That's probably not what you want.

    It will also fail if ob is an instance of a New Style class. Remember that
    in Python 3, all classes become new-style.

    >     elif t in (float, int):
    >         pass

    This test will fail if ob is an instance of a subclass of float or int.
    That's almost certainly the wrong behavior. A better way of writing that is:

        elif issubclass(t, (float, int)):
            pass

    >     else:
    >         pass
    >
    > def func1(ob):
    >     if isinstance(ob,X):
    >         pass

    If you have to do type checking, that's the recommended way of doing so.

    >     elif type(ob) in (float, int):
    >         pass

    The usual way to write that is:

        if isinstance(ob, (float, int)):
            pass
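
As an illustration of the subclass point, here is a minimal Python 3 sketch that puts the exact-type comparison next to the isinstance version. `Celsius`, `classify_exact`, and `classify` are hypothetical names introduced for this example; X and V mirror the classes in the original post without the marker attribute.

```python
class X:
    pass


class V(X):
    pass


class Celsius(float):
    """Hypothetical float subclass, used only to show the subclass case."""


def classify_exact(ob):
    # Exact type comparison, as in the original elif branches.
    if type(ob) in (float, int):
        return "number"
    return "other"


def classify(ob):
    # isinstance accepts instances of subclasses as well.
    if isinstance(ob, X):
        return "user"
    elif isinstance(ob, (float, int)):
        return "number"
    return "other"


print(classify(V()))                    # instance of a subclass of X
print(classify_exact(Celsius(21.5)))    # exact comparison misses the subclass
print(classify(Celsius(21.5)))          # isinstance does not
print(classify(True))                   # bool is a subclass of int
```

One consequence worth knowing: because bool is a subclass of int, `isinstance(True, (float, int))` is true, so booleans land in the numeric branch unless they are tested for first.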



    Hope this helps,


    --
    Steven
  • Steven D'Aprano at Feb 1, 2009 at 9:32 am

    Paul Rubin wrote:

    > Steven D'Aprano <steve at pearwood.info> writes:
    >> First question is, why do you care that it's slower? The difference
    >> between the fastest and slowest functions is 1.16-0.33 = 0.83
    >> microsecond.
    > That's a 71% speedup, pretty good if you ask me.

    Don't you care that the code is demonstrably incorrect? The OP is
    investigating options to use in Python 3, but the fastest method will fail,
    because the "type is InstanceType" test will no longer work. (I believe the
    fastest method, as given, is incorrect even in Python 2.x, as it will
    accept ANY old-style class instead of just the relevant X or V classes.)

    That reminds me of something that happened to my wife some years ago: she
    was in a van with her band's roadies, and one asked the driver "Are you
    sure you know where you're going?", to which the driver replied, "Who
    cares? We're making great time." (True story.)

    If you're going to accept incorrect code in order to save time, then I can
    write even faster code:

        def func4(ob):
            pass

    Try beating that for speed!

    >> If you call the slowest function one million times, your code will
    >> run less than a second longer.
    > What if you call it a billion times, or a trillion times, or a
    > quadrillion times, you see where this is going?

    It doesn't matter. The proportion of time saved will remain the same. If you
    run it a trillion times, you'll save 12 minutes in a calculation that takes
    278 hours to run. Big Effing Deal. Saving such trivial amounts of time is
    not worth the cost of hard-to-read or incorrect code.

    Of course, if you have profiled your code and discovered that *significant*
    amounts of time are being used in type-testing, *then* such a
    micro-optimization may be worth doing. But I already allowed for that:

    "Does that really matter...?"
    (the answer could be Yes)

    "Unless you have profiled your code and this really is a bottleneck ..."
    (it could be)

    > If you're testing
    > 100-digit numbers, there are an awful lot of them before you run out.

    Yes. So what? Once you've tested them, then what? If *all* you are doing
    with them is testing them, your application is pretty boring. Even a print
    statement afterwards is going to take 1000 times longer than doing the
    type-test. In any useful application, the amount of time used in
    type-testing is almost surely going to be a small fraction of the total
    runtime. A 71% speedup on 50% of the runtime is significant; but a 71%
    speedup on 0.1% of the total execution time is not.
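
The arithmetic behind these figures can be checked with a short sketch. It rests on one reading of the numbers, which is an assumption: the "71%" is the reduction in the time of the check itself, (1.16 - 0.33) / 1.16, and the 278 hours is the total run (a trillion iterations at roughly one microsecond each) with 0.1% of it spent in type-testing. `overall_saving` is a hypothetical helper name.

```python
def overall_saving(fraction_in_check, reduction=0.71):
    """Fraction of total runtime saved when only the type check gets faster.

    fraction_in_check: share of total runtime spent in the check (0..1).
    reduction: share of the check's own time that is removed (0..1).
    """
    return fraction_in_check * reduction


# Check takes half of the runtime: a large overall gain.
print("50%% of runtime:  %.1f%% saved" % (100 * overall_saving(0.5)))

# Check takes 0.1% of the runtime: a negligible overall gain.
print("0.1%% of runtime: %.3f%% saved" % (100 * overall_saving(0.001)))

# Assumed total of 278 hours with 0.1% spent in type-testing.
total_hours = 278
saved_minutes = total_hours * 60 * overall_saving(0.001)
print("saved on a %d-hour run: %.1f minutes" % (total_hours, saved_minutes))
```

Under those assumptions the saving comes out at a little under 12 minutes, which matches the figure quoted in the post.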



    --
    Steven
  • Robin Becker at Feb 1, 2009 at 12:33 pm

    Steven D'Aprano wrote:

    > Paul Rubin wrote:
    >> Steven D'Aprano <steve at pearwood.info> writes:
    >>> First question is, why do you care that it's slower? The difference
    >>> between the fastest and slowest functions is 1.16-0.33 = 0.83
    >>> microsecond.
    >> That's a 71% speedup, pretty good if you ask me.
    > Don't you care that the code is demonstrably incorrect? The OP is
    > investigating options to use in Python 3, but the fastest method will fail,
    > because the "type is InstanceType" test will no longer work. (I believe the
    > fastest method, as given, is incorrect even in Python 2.x, as it will
    > accept ANY old-style class instead of just the relevant X or V classes.)

    I'm not clear why this is true? Not all instances will have the __X__
    attribute or has something else changed in Python3?

    The original code was intended to be called with only a subset of all
    class instances being passed as argument; as currently written it was
    unsafe because an instance of an arbitrary old class would pass into
    branch 1. Of course it will still be unsafe as arbitrary instances end
    up in branch 3

    The intent is to firm up the set of cases being accepted in the first
    branch. The problem is that when all instances are new style then
    there's no easy check for the other acceptable arguments eg float,int,
    str etc, as I see it, the instances must be of a known class or have a
    distinguishing attribute.

    As for the timing, when I tried the effect of func1 on our unit tests I
    noticed that it slowed the whole test suite by 0.5%. Luckily func 3
    style improved things by about 0.3% so that's what I'm going for.
    --
    Robin Becker
  • Steven D'Aprano at Feb 1, 2009 at 4:02 pm

    Robin Becker wrote:

    > Steven D'Aprano wrote:
    >> Paul Rubin wrote:
    >>> Steven D'Aprano <steve at pearwood.info> writes:
    >>>> First question is, why do you care that it's slower? The difference
    >>>> between the fastest and slowest functions is 1.16-0.33 = 0.83
    >>>> microsecond.
    >>> That's a 71% speedup, pretty good if you ask me.
    >> Don't you care that the code is demonstrably incorrect? The OP is
    >> investigating options to use in Python 3, but the fastest method will
    >> fail, because the "type is InstanceType" test will no longer work. (I
    >> believe the fastest method, as given, is incorrect even in Python 2.x, as
    >> it will accept ANY old-style class instead of just the relevant X or V
    >> classes.)
    > I'm not clear why this is true? Not all instances will have the __X__
    > attribute or has something else changed in Python3?

    The func0() test doesn't look for __X__.

    > The original code was intended to be called with only a subset of all
    > class instances being passed as argument; as currently written it was
    > unsafe because an instance of an arbitrary old class would pass into
    > branch 1. Of course it will still be unsafe as arbitrary instances end
    > up in branch 3
    >
    > The intent is to firm up the set of cases being accepted in the first
    > branch. The problem is that when all instances are new style then
    > there's no easy check for the other acceptable arguments eg float,int,
    > str etc,

    Of course there is.

        isinstance(ob, (float, int))

    is the easy, and correct, way to check if ob is a float or int.

    > as I see it, the instances must be of a known class or have a
    > distinguishing attribute.

    Are you sure you need to check for different types in the first place? Just
    how polymorphic is your code, really? It's hard to judge because I don't
    know what your code actually does.

    > As for the timing, when I tried the effect of func1 on our unit tests I
    > noticed that it slowed the whole test suite by 0.5%.

    An entire half a percent slower. Wow.

    That's like one minute versus one minute and 0.3 second. Or one hour, versus
    one hour and 18 seconds. I find it very difficult to get worked up over
    such small differences. I think you're guilty of premature optimization:
    wasting time and energy trying to speed up parts of the code that are
    trivial. (Of course I could be wrong, but I doubt it.)


    > Luckily func 3
    > style improved things by about 0.3% so that's what I'm going for.

    I would call that the worst solution. Not only are you storing an attribute
    which is completely redundant (instances already know what type they are,
    you don't need to manually store a badge on them to mark them as an
    instance of a class), but you're looking up this attribute only to
    immediately throw away the value you get. The only excuse for this extra
    redirection would be if it were significantly faster. But it isn't: you
    said it yourself, 0.3% speed up. That's like 60 seconds versus 59.82
    seconds.
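
The redundancy argument can be made concrete with a small Python 3 sketch. The marker is renamed to `_MAGIC_LABEL` as suggested earlier in the thread; `Platypus`, `is_user_by_marker`, and `is_user_by_type` are hypothetical names used only for illustration.

```python
class X:
    _MAGIC_LABEL = True  # hand-maintained badge marking "one of ours"


class V(X):
    pass


class Platypus:
    """Hypothetical unrelated class that happens to carry the same attribute."""
    _MAGIC_LABEL = True


def is_user_by_marker(ob):
    # func3 style: only asks whether the attribute exists, ignores its value.
    return hasattr(ob, "_MAGIC_LABEL")


def is_user_by_type(ob):
    # func1 style: relies on what the instance already knows about its class.
    return isinstance(ob, X)


for ob in (X(), V(), Platypus(), 1.5):
    print(type(ob).__name__, is_user_by_marker(ob), is_user_by_type(ob))
```

Both checks agree for X and V instances, but only the marker check accepts the unrelated class, so the attribute duplicates information the class hierarchy already holds while being easier to satisfy by accident.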


    --
    Steven

Discussion Overview
group: python-list @
categories: python
posted: Jan 31, '09 at 4:39p
active: Feb 1, '09 at 4:02p
posts: 5
users: 2
website: python.org

2 users in discussion

Steven D'Aprano: 3 posts Robin Becker: 2 posts
