FAQ

[Python] Python is far from a top performer according to benchmark test...

Carl
Jan 9, 2004 at 9:05 pm
"Nine Language Performance Round-up: Benchmarking Math & File I/O"
http://www.osnews.com/story.php?news_idV02

I think this is an unfair comparison! I wouldn't dream of developing a
numerical application in Python without using prebuilt numerical libraries
and data objects such as dictionaries and lists.

I have been experimenting with numerical algorithms in Python with a heavy
use of the Numeric module. My experience is that Python is quite fast in
comparison with (and sometimes as fast as) traditional languages such as C
or C++.
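The kind of comparison Carl describes can be sketched as follows (an illustrative example, not code from the thread; NumPy stands in here for the Numeric module he mentions). The point is that the vectorized call dispatches the whole loop to compiled code in one step:

```python
# Illustrative sketch: a pure-Python loop vs. a vectorized Numeric-style
# call. NumPy is used as the modern successor to the Numeric module.
import numpy as np

def dot_loop(a, b):
    """Dot product with an explicit Python loop (interpreter overhead per element)."""
    total = 0.0
    for x, y in zip(a, b):
        total += x * y
    return total

def dot_vectorized(a, b):
    """Same computation dispatched to compiled code in a single call."""
    return float(np.dot(np.asarray(a), np.asarray(b)))

a = [0.5 * i for i in range(1000)]
b = [2.0] * 1000
assert abs(dot_loop(a, b) - dot_vectorized(a, b)) < 1e-6
```

On arrays of any serious size, the vectorized version runs at close to C speed, which is the basis of Carl's claim.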

The greatest advantage of Python is the great increase in productivity and
the far smaller number of bugs that result from the very clean and
compact structure Python invites you to produce. Sometimes it amazes me how
fast I can produce a working algorithm in Python. The step from an
algorithmic outline on paper to working code is very short. The
interactive nature of the Python console invites numerical experimentation
and data exploration. This wasn't mentioned in the article - what a pity!

Carl

53 responses


  • Irmen de Jong at Jan 9, 2004 at 9:08 pm

    Carl wrote:

    "Nine Language Performance Round-up: Benchmarking Math & File I/O"
    http://www.osnews.com/story.php?news_idV02
    This benchmark is beaten to a pulp in the discussion about it on Slashdot.
    It's a really stupid benchmark (as most benchmarks are) IMNSHO.

    I mean, using Python's arbitrary-precision long object for 'math'
    and comparing it to the other languages' long /machine types/... come on.
    And that's just one of the flaws.

    --Irmen
  • Krzysztof Stachlewski at Jan 9, 2004 at 9:13 pm
    "Carl" <phleum_nospam at chello.se> wrote in message
    news:ryELb.238$tK2.228 at amstwist00...
    I have been experimenting with numerical algorithms in Python with a heavy
    use of the Numeric module. My experience is that Python is quite fast in
    comparison with (and sometimes as fast as) traditional languages such as C
    or C++.
    With "heavy use of the Numeric module" you were calling functions
    written in C. So how can you say that Python is fast,
    when C code is doing all the work?

    Stach
  • Samuel Walters at Jan 9, 2004 at 9:36 pm

    Thus Spake Krzysztof Stachlewski On the now historical date of Fri, 09
    Jan 2004 22:13:58 +0100|
    "Carl" <phleum_nospam at chello.se> wrote in message
    news:ryELb.238$tK2.228 at amstwist00...
    I have been experimenting with numerical algorithms in Python with a
    heavy use of the Numeric module. My experience is that Python is quite
    fast in comparison with (and sometimes as fast as) traditional languages
    such as C or C++.
    With "heavy use of the Numeric module" you were calling functions written in
    C. So how can you say that Python is fast, when C code is doing all the
    work?
    Because python works best as a glue layer coordinating outside libraries
    and functionality. I think the true strength of python comes from a
    one-two punch of "Fast and pretty implementation with easy interface to
    lower level tools." When Python is not the right tool, we code our
    solution with the right tool and then use Python to glue all the right
    tools together. For numerical processing, C is the right tool and Python
    is not. Therefore, no one tried to use the wrong tool; they just used the
    right tool and gave it Python bindings so that Python could act as a
    coordinator.

    I read the benchmark and I think it doesn't measure Python in its target
    area. That's like taking a world-class marathon runner and wondering why
    he doesn't compete well in a figure-skating event.

    Sam Walters.


    --
    Never forget the halloween documents.
    http://www.opensource.org/halloween/
    """ Where will Microsoft try to drag you today?
    Do you really want to go there?"""
  • Lothar Scholz at Jan 10, 2004 at 5:29 am
    Samuel Walters <swalters_usenet at yahoo.com> wrote in message news:<pan.2004.01.09.21.35.40.132608 at yahoo.com>...
    For numerical processing, C is the right tool,
    Definitely not, you don't want a pointer language when using numerical
    processing: use Fortran.
  • Samuel Walters at Jan 11, 2004 at 6:09 am

    Thus Spake Lothar Scholz On the now historical date of Fri, 09 Jan 2004
    21:29:56 -0800|
    Samuel Walters <swalters_usenet at yahoo.com> wrote in message
    news:<pan.2004.01.09.21.35.40.132608 at yahoo.com>...
    For numerical processing, C is the right tool,
    Definitely not, you don't want a pointer language when using numerical
    processing: use Fortran.
    Hmm. I feel misunderstood. I'm going to try to clarify, but if I'm the
    one doing the misunderstanding, feel free to give me a good old-fashioned
    usenet style bitchslapping back to the stone age.

    First off: Truth in advertising.
    I know very little about numeric processing, and even less about Fortran.
    It's true that my background is in mathematics, but in *pure* mathematics
    where pointer-based languages tend to be helpful, not hurtful. I chose
    pure mathematics precisely because it eschews the grubby sort of shortcuts
    that applied mathematics uses. In other words, I didn't have the proper
    sort of mathematical intuition for it, so I chose pure, where my kind of
    intuition works well. (In the end, this was to my detriment. All the
    interesting problems are in applied math!)

    As I see it, when one considers which language is best for one's needs,
    one considers a couple of things:

    1) Does the language have the semantics I want?
    2) Does the language have the primitives I need?
    3) Can I *easily* build any missing or suboptimal primitives?

    One would assume that Fortran has the proper semantics for numerical
    processing because it seems to have been wildly successful for a long
    period of time. It would appear that Python has the proper semantics for
    numerical processing because a significant number of people are using it
    for that, and they'd be using something else if Python caused them too
    many headaches.

    Fortran naturally comes with the primitives for numerical processing,
    because numerical processing is its stated goal. ("FORmula TRANslation")
    Python doesn't seem to have the native and optimal primitives for
    numerical processing, so that leads to point three.

    Whether one uses Fortran, Python, or any other language, all primitives
    are eventually implemented in either C or assembly. At some point or
    another, we end up scraping bare metal and silicon to get our answers.
    The question then becomes, "How easily can I extend this language to fit
    my needs." NumPy is evidence that at least a few people said "Easily
    enough." I don't know how extensible Fortran is, but I would guess "very"
    since I've heard of it being applied in many domains other than numerical
    processing. (OpenFirmware, for example.)

    So, I guess that my point is that C might not be the right language for
    doing numerical processing, but it seems the right language for
    implementing the primitives of numerical processing. Those primitives
    should, of course, be designed in such a manner that their behaviors are
    not muddied by pointer issues.

    Moving on:
    I think Python's strength flows from the three criteria for choosing a
    language. Its semantics seem to naturally fit the way a programmer
    thinks about problems. All the algorithmic primitives are there for
    expressing oneself naturally and easily. Where the primitives don't exist,
    it's easy to bind outside primitives into the system seamlessly. One of
    the joys of Python is that C extension libraries almost never feel bolted
    on. They feel like an intimate part of the language itself. Part of that
    is the blood, sweat and tears of the library implementors, but much of it
    is also the elegance of Python.

    As far as the straw-poll goes, I think it's a good question to ask, and
    that the answer is important, but we also need to figure out where else we
    can ask this question. The problem with asking such a question solely on
    c.l.p is that everyone here has either decided that optimization in python
    isn't enough of an issue to bother them, or hasn't made up their
    mind yet. Those who have decided that optimization in python is a problem
    have already gone elsewhere. Perhaps a better question to ask is "Who has
    decided that Python is too slow for their needs, what prompted that
    decision and are the issues they had worth addressing?"

    Sam Walters.


    --
    Never forget the halloween documents.
    http://www.opensource.org/halloween/
    """ Where will Microsoft try to drag you today?
    Do you really want to go there?"""



  • Rainer Deyke at Jan 11, 2004 at 6:46 am

    Samuel Walters wrote:
    So, I guess that my point is that C might not be the right language
    for doing numerical processing, but it seems the right language for
    implementing the primitives of numerical processing.
    The issue with C is that it is too slow for implementing those primitives
    (in part due to pointer aliasing issues). Fortran is considerably faster.


    --
    Rainer Deyke - rainerd at eldwood.com - http://eldwood.com
  • Samuel Walters at Jan 11, 2004 at 10:40 am

    Thus Spake Rainer Deyke On the now historical date of Sun, 11 Jan 2004
    06:46:50 +0000|
    Samuel Walters wrote:
    So, I guess that my point is that C might not be the right language for
    doing numerical processing, but it seems the right language for
    implementing the primitives of numerical processing.
    The issue with C is that it is too slow for implementing those
    primitives (in part due to pointer aliasing issues). Fortran is
    considerably faster.
    I stand corrected.

    Please help me to understand the situation better.

    I went digging for technical documents, but thus far haven't found many
    useful ones. It seems everyone but me already understands pointer
    aliasing models, so they might discuss them, but they don't explain them.
    I am limited in part by my understanding of compilers and also by my
    understanding of Fortran. Here is what I have gathered so far:

    Fortran lacks a stack for function calls. This promotes speed, but
    prevents recursive functions. (Most recursive functions can efficiently be
    written as loops, though, so this shouldn't be considered a hindrance.)
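Sam's aside about recursion can be made concrete with the standard rewrite (a sketch in Python rather than Fortran, for readability): any simple recursion like this one translates directly into a loop, which is why the restriction in old FORTRAN was rarely a practical obstacle.

```python
# The same function written recursively and as a loop; both compute n!.
def fact_recursive(n):
    return 1 if n <= 1 else n * fact_recursive(n - 1)

def fact_loop(n):
    result = 1
    for k in range(2, n + 1):  # accumulate the product iteratively
        result *= k
    return result

assert fact_recursive(10) == fact_loop(10) == 3628800
```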

    Fortran passes all arguments by reference. (This is the peppiest way to do
    it, especially with static allocation)

    Fortran 77 lacks explicit pointers and favors static allocation. This
    allows for the compiler to apply powerful automatic optimization.

    Fortran 90 added explicit pointers, but required them to only be pointed
    at specific kinds of objects, and only when those particular objects are
    declared as targets for pointers. This allows the compiler to still apply
    powerful automatic optimizations to code. I'm a bit hazy as to whether
    Fortran 90 uses static or dynamic allocation, or a combination of both,
    and whether it permits recursion.

    These pointers not only reference location, but also dimension and stride.
    Stride is implicit in C pointer declarations (by virtue of compile-time
    knowledge of the data type pointed to) but dimension is not.

    Fortran's extensions for parallel programming have been standardized, and
    the language itself makes it easy to decide how to parallelize procedures
    at compile time. Thus, it is especially favored for numeric computation on
    big iron with lots of parallelism.

    Now, for C:

    Because of dynamic allocation on the stack and the heap, there is no
    compile-time knowledge of where a variable will live, which adds an extra
    layer of reference for even static variables. This also invalidates many
    of the optimizations used by Fortran compilers.

    C lacks many of the fundamental array handling semantics and primitives
    that Fortran programs rely on. Implementing them in C is a real PITA.

    C memory allocation is just plain goofy in comparison to Fortran.

    To sum up:
    Fortran sacrifices generality and dynamism for compile-time knowledge
    about data, and deeply optimizes based on that knowledge.

    C sacrifices speed for the sake of generality and dynamism.

    Please correct me or help me flesh out my ideas. Please don't skimp on
    the low-level details, I've done my fair share of assembly programming, so
    what I don't understand, I'll probably be able to find out by googling a
    bit.

    Some other interesting things I found out:

    There are two projects that allow interfacing between Python and Fortran:
    F2Py
    http://cens.ioc.ee/projects/f2py2e/
    PyFortran
    http://sourceforge.net/projects/pyfortran

    Fortran amply supports interfaces to C and C++

    Fortran is compiled. (Doh! and I thought it was interpreted.)

    There are lots of debates on whether C++ will ever be as fast as Fortran.
    The consensus seems to be "Only if you use the right compiler with the
    right switches and are insanely careful about how you program. IOW, don't
    bother; just use Fortran if you want to do numeric processing."

    Well, there's another language to add to my list of languages to learn. It
    seems to be "The Right Tool" for a great many applications, it interfaces
    well with other languages, and it's extremely portable. Chances are, I'll
    end up using it somewhere somehow someday. Now. To find some Fortran
    tutorials.

    Thanks in advance for any of your knowledge and wisdom you are willing to
    confer upon me.

    Sam Walters.

    --
    Never forget the halloween documents.
    http://www.opensource.org/halloween/
    """ Where will Microsoft try to drag you today?
    Do you really want to go there?"""
  • John J. Lee at Jan 11, 2004 at 2:37 pm

    Samuel Walters <swalters_usenet at yahoo.com> writes:
    Thus Spake Rainer Deyke On the now historical date of Sun, 11 Jan 2004
    06:46:50 +0000|
    [...]
    The issue with C is that it is too slow for implementing those
    primitives (in part due to pointer aliasing issues). Fortran is
    considerably faster.
    I stand corrected.

    Please help me to understand the situation better.

    I went digging for technical documents, but thus far haven't found many
    useful ones. It seems everyone but me already understands pointer
    aliasing models, so they might discuss them, but they don't explain them.
    [...]

    (haven't read all your post, so sorry if I'm telling you stuff you
    already know)

    Pointer aliasing is just the state of affairs where two pointers refer
    to a single region of memory. Fortran compilers have more information
    about aliasing than do C compilers, so can make more assumptions at
    compilation time.
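John's definition has a simple Python-level analogue (my illustration, not from the post): two references into the same underlying buffer. A write through one is visible through the other, and it is exactly this possibility that a C compiler must conservatively assume for any two pointers, while a Fortran compiler may assume its arguments do not alias:

```python
# Two memoryviews aliasing one bytearray: the Python-level picture of
# two pointers referring to a single region of memory.
buf = bytearray(b"\x01\x02\x03\x04")
src = memoryview(buf)   # one "pointer" into the buffer
dst = memoryview(buf)   # a second, aliasing "pointer"

dst[0] = 99             # a write through one view...
assert src[0] == 99     # ...is visible through the other: they alias
```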

    Have you tried comp.lang.fortran, comp.lang.c++, comp.lang.c, etc?

    http://www.google.com/groups?q=pointer+aliasing+FAQ+group:comp.lang.fortran&hl=en&lr=&ie=UTF-8&group=comp.lang.fortran&selm92May27.175805.26097%40newshost.lanl.gov&rnum=3

    http://tinyurl.com/2v3v5


    John
  • Dan Bishop at Jan 12, 2004 at 9:10 am
    Samuel Walters <swalters_usenet at yahoo.com> wrote in message news:<pan.2004.01.11.10.38.45.810669 at yahoo.com>...
    Thus Spake Rainer Deyke On the now historical date of Sun, 11 Jan 2004
    06:46:50 +0000|
    Samuel Walters wrote:
    [Fortran is faster than C.] ...
    I went digging for technical documents, but thus far haven't found many
    useful ones. It seems everyone but me already understands pointer
    aliasing models, so they might discuss them, but they don't explain them.
    I am limited in part by my understanding of compilers and also by my
    understanding of Fortran. Here is what I have gathered so far:

    Fortran passes all arguments by reference. (This is the peppiest way to do
    it, especially with static allocation)
    Btw, for some early compilers, this was the case even with literals,
    which meant that code like

    CALL FOO(4)
    PRINT *, 4

    could print something other than 4.
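Python has a milder cousin of this hazard (a sketch of mine, not from the post): arguments are passed as object references, so a callee that mutates a mutable argument changes what the caller sees. Unlike those early Fortran compilers, though, Python's immutable ints make "changing 4" impossible:

```python
# Mutating a passed-in list is visible to the caller...
def foo(values):
    values[0] = -1          # mutates the caller's object in place

nums = [4, 5, 6]
foo(nums)
assert nums == [-1, 5, 6]

# ...but rebinding an immutable argument is purely local:
def bar(x):
    x = -1                  # rebinds the local name only

n = 4
bar(n)
assert n == 4               # the caller's 4 is still 4
```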
    ...I'm a bit hazy as to whether
    Fortran 90 uses static or dynamic allocation, or a combination of both,
    You can use both, at least for arrays.
    and whether it permits recursion.
    Fortran 90 does permit recursion, although you have to explicitly
    declare functions as "recursive".
    Now, for C: ...
    C lacks many of the fundamental array handling semantics and primitives
    that Fortran programs rely on. Implementing them in C is a real PITA.
    This is one of my least favorite things about C.
    C memory allocation is just plain goofy in comparison to Fortran.
    And even worse in comparison to Python ;-)
  • David M. Cooke at Jan 11, 2004 at 11:03 pm

    At some point, Samuel Walters wrote:
    Whether one uses Fortran, Python, or any other language, all primitives
    are eventually implemented in either C or assembly. At some point or
    another, we end up scraping bare metal and silicon to get our answers.
    The question then becomes, "How easily can I extend this language to fit
    my needs." NumPy is evidence that at least a few people said "Easily
    enough." I don't know how extensible Fortran is, but I would guess "very"
    since I've heard of it being applied in many domains other than numerical
    processing. (OpenFirmware, for example.)
    You're confusing Fortran with Forth, which is a stack-based language,
    much like Postscript, or RPL used on HP 48 calculators.

    These days, I doubt Fortran is used for anything but numerical processing.

    --
    \/|<
    /--------------------------------------------------------------------------\
    David M. Cooke
    cookedm(at)physics(dot)mcmaster(dot)ca
  • Frithiof Andreas Jensen at Jan 12, 2004 at 10:52 am
    "Samuel Walters" <swalters_usenet at yahoo.com> wrote in message
    news:pan.2004.01.11.06.08.18.867825 at yahoo.com...
    As I see it, when one considers which language is best for one's needs,
    one considers a couple of things:

    1) Does the language have the semantics I want?
    2) Does the language have the primitives I need?
    3) Can I *easily* build any missing or suboptimal primitives?
    True.
    One would assume that Fortran has the proper semantics for numerical
    processing because it seems to have been wildly successful for a long
    period of time.
    That, in my opinion, is wrong: Fortran is successful because it was there
    first!

    There exists a very large set of actively supported and proven libraries,
    NAG f.ex., which nobody will ever bother to port to another language just
    for the sake of it, and Fortran has been around for so long that it is well
    understood how best to optimise and compile Fortran code. It is easy enough
    to link with NAG if one needs to use it.
    Fortran naturally comes with the primitives for numerical processing,
    because numerical processing is its stated goal. ("FORmula TRANslation")
    ...or maybe the name sounded cool ;-)
    Whether one uses Fortran, Python, or any other language, all primitives
    are eventually implemented in either C or assembly. At some point or
    another, we end up scraping bare metal and silicon to get our answers.
    Exactly - Fortran in itself does not do anything that another language
    cannot do as well. It is just the case that Fortran is better understood
    when applied to numeric processing than other languages, because more
    "numerics" people used it than any other language.

    On DSP architectures, f.ex., I doubt that one would have better performance
    using Fortran in comparison with the C/C++ tools DSPs usually ship with -
    because DSPs were "born" when C/C++ was hot.

    A lot of real, serious DSP work is done in Matlab - thus skipping the issue
    of language choice and getting right on to getting the job done. This is good
    IMO.
  • Aahz at Jan 13, 2004 at 4:42 am
    In article <bttu62$gdh$1 at newstree.wise.edt.ericsson.se>,
    Frithiof Andreas Jensen wrote:
    That, in my opinion, is wrong: Fortran is successful because it was there
    first!
    Which brings up the obvious question: why isn't Lisp successful?
    --
    Aahz (aahz at pythoncraft.com) <*> http://www.pythoncraft.com/

    A: No.
    Q: Is top-posting okay?
  • Michele Simionato at Jan 13, 2004 at 12:24 pm
    aahz at pythoncraft.com (Aahz) wrote in message news:<btvsvl$pu7$1 at panix3.panix.com>...
    In article <bttu62$gdh$1 at newstree.wise.edt.ericsson.se>,
    Frithiof Andreas Jensen wrote:
    That, in my opinion, is wrong: Fortran is successful because it was there
    first!
    Which brings up the obvious question: why isn't Lisp successful?
    But it is successful! Look at people working on AI, theory of programming
    languages, etc: they will (presumably) know Lisp. OTOH, look at people
    working on number crunching: they will (presumably) know Fortran.
    It turns out that the number of people working on numerical analysis (or
    using numerical tools) is much larger than the number of people working on
    abstract things, so you will have more people knowing Fortran than people
    knowing Lisp. But this fact alone does not mean that one language is more
    successful than the other in its application niche. You could compare
    Python and Perl (or Ruby) and say that one is more successful than
    the other, but Lisp and Fortran have different audiences and you cannot
    estimate their success just by counting users.

    Just my 2 eurocents,

    Michele
  • Jacek Generowicz at Jan 13, 2004 at 1:14 pm

    michele.simionato at poste.it (Michele Simionato) writes:

    But it is successful! Look at people working on AI, theory of programming
    languages, etc: they will (presumably) know Lisp. OTOH, look at people
    working on number crunching: they will (presumably) know Fortran.
    It turns out that the number of people working on numerical analysis (or
    using numerical tools) is much larger than the number of people working on
    abstract things, so you will have more people knowing Fortran than people
    knowing Lisp. But this fact alone does not mean that one language is more
    successfull than the other in its application niche. You could compare
    Python and Perl (or Ruby) and say that one is more successful than
    the other, but Lisp and Fortran have different audiences and you cannot
    estimate their success just as number of users.
    Lest anyone infer that Lisp has an "application niche" consisting of
    AI and theory of programming languages ... take a look at

    http://www.franz.com/success/

    and glance at the column on the left.

    I suspect that Aahz' working definition of "successful" had more to do
    with success in terms of popularity, rather than success in terms of
    technical excellence: please remember that quality and popularity are
    very weakly correlated.

    If you want to analyze the popularity of a technology, you will get
    far better insight by studying the sociological and historical
    contexts surrounding it rather than its actual technical merits.

    For example, how many readers of this post will be surprised to learn
    that (most) Common Lisp implementations compile to efficient native
    machine code, that Common Lisp has an ANSI standard which includes
    very powerful support for object-oriented programming (to name but two
    features that everybody "knows" it doesn't have) ?

    Go on, raise your hand if you thought that "Lisp" is a slow,
    interpreted functional language.

    You will need to take, amongst many other things, the abundance of
    such (non-)facts into consideration if you want to understand Lisp's
    lack of "success".

    Similarly, the "success" of C++ probably has more to do with having
    introduced OOP to the C programmers of the world than with its
    suitability for doing OOP.

    --
    ...Please don't assume Lisp is only useful for Animation and Graphics,
    AI, Bioinformatics, B2B and E-Commerce, Data Mining, EDA/Semiconductor
    applications, Expert Systems, Finance, Intelligent Agents, Knowledge
    Management, Mechanical CAD, Modeling and Simulation, Natural Language,
    Optimization, Research, Risk Analysis, Scheduling, Telecom, and Web
    Authoring just because these are the only things they happened to
    list. -- Kent Pitman
  • Andrew Dalke at Jan 13, 2004 at 9:43 pm
    Jacek Generowicz, quoting Kent Pitman
    --
    ...Please don't assume Lisp is only useful for Animation and Graphics,
    AI, Bioinformatics, ... just because these are the only things they
    happened to list.
    And as I continue to point out, *Python* is more often used in
    bioinformatics than Lisp, and Perl dominates that field, followed
    by C/C++, with Java a distant third.

    Andrew
    dalke at dalkescientific.com
  • Michele Simionato at Jan 14, 2004 at 6:24 am
    Jacek Generowicz <jacek.generowicz at cern.ch> wrote in message news:<tyf65fgdm81.fsf at pcepsft001.cern.ch>...
    Lest anyone infer that Lisp has an "application niche" consisting of
    AI and theory of programming languages ... take a look at

    http://www.franz.com/success/

    and glance at the column on the left.
    Okay, let's restate my point in this way: if you need a great deal of
    programming power (which, I agree, is not only needed in A.I. & similia),
    then Lisp is a good choice. Most of the people in the world don't need
    that much programming power, though. They may need a great deal of
    numerical power, in which case they use Fortran. Or they may need moderate
    programming power and moderate numerical power (such as in bioinformatics),
    and then they use Perl or Python.
    I suspect that Aahz' working definition of "successful" had more to do
    with success in terms of popularity, rather than success in terms of
    technical excellence: please remember that quality and popularity are
    very weakly correlated.

    If you want to analyze the popularity of a technology, you will get
    far better insight by studying the sociological and historical
    contexts surrounding it rather than its actual technical merits.
    I completely agree.
    For example, how many readers of this post will be surprised to learn
    that (most) Common Lisp implementations compile to efficient native
    machine code, that Common Lisp has an ANSI standard which includes
    very powerful support for object-oriented programming (to name but two
    features that everybody "knows" it doesn't have) ?

    Go on, raise your hand if you thought that "Lisp" is a slow,
    interpreted functional language.
    Never thought so. IMHO people don't use Lisp because they don't need it,
    not because they think it is a slow, interpreted functional language.
    There are simpler alternative languages that are good enough for most
    people and more suitable in terms of libraries (i.e. Fortran for
    numerics, Perl for bioinformatics). Still, Lisp is successful for a
    certain audience (I concede, not restricted to A.I. only, but rather
    a small one anyway). So, it is successful but not popular. This was my
    point, in contrast to Aahz's view, and I think we agree.


    Michele Simionato
  • Jacek Generowicz at Jan 14, 2004 at 9:38 am

    michele.simionato at poste.it (Michele Simionato) writes:

    Jacek Generowicz <jacek.generowicz at cern.ch> wrote in message news:<tyf65fgdm81.fsf at pcepsft001.cern.ch>...
    Go on, raise your hand if you thought that "Lisp" is a slow,
    interpreted functional language.
    Never thought so.
    Michele,

    My post was not aimed at you specifically. It was aimed at lurkers who
    might infer that Lisp is only used in AI (or whatever), or who might
    have some unfounded assumptions about Lisp which would be reinforced
    by your post.

    I was quite sure that my post would not be telling _you_ much you
    didn't already know.
  • Lothar Scholz at Jan 12, 2004 at 1:42 pm
    Samuel Walters <swalters_usenet at yahoo.com> wrote in message news:<pan.2004.01.11.06.08.18.867825 at yahoo.com>...
    I know very little about numeric processing, and even less about Fortran.
    It's true that my background is in mathematics, but in *pure* mathematics
    where pointer-based languages tend to be helpful, not hurtful.
    Okay, it seems that you don't know a lot about compiler writing.

    A C compiler only knows a little bit about the context, so it must
    always assume that data inside a structure member can be referenced from
    another place via an alias pointer.

    Fortran does not have this problem, so a lot of optimizations can be
    done and values can be held in registers for a much longer time,
    resulting in much greater speed.

    Remember that on supercomputers a 25% speed enhancement (which is
    roughly how much faster a good Fortran compiler can be than C) can
    mean a few million dollars of saved hardware costs. The coding time is
    not important compared to the running time. So real hard numerics are
    always done in Fortran.

    GNU Fortran is a stupid project because it translates the Fortran code
    to C.


    Python for hardcore numerics, even with PyNumerics, is simply a very
    bad solution.
  • David M. Cooke at Jan 12, 2004 at 3:58 pm

    At some point, llothar at web.de (Lothar Scholz) wrote:

    Samuel Walters <swalters_usenet at yahoo.com> wrote in message news:<pan.2004.01.11.06.08.18.867825 at yahoo.com>...
    I know very little about numeric processing, and even less about Fortran.
    It's true that my background is in mathematics, but in *pure* mathematics
    where pointer-based languages tend to be helpful, not hurtful.
    Okay, it seems that you don't know a lot about compiler writing.

    A C compiler only knows a little bit about the context, so it must
    always assume that data inside a structure member can be referenced from
    another place via an alias pointer.

    Fortran does not have this problem, so a lot of optimizations can be
    done and values can be held in registers for a much longer time,
    resulting in much greater speed.

    Remember that on supercomputers a 25% speed enhancement (which is
    roughly how much faster a good Fortran compiler can be than C) can
    mean a few million dollars of saved hardware costs. The coding time is
    not important compared to the running time. So real hard numerics are
    always done in Fortran.

    GNU Fortran is a stupid project because it translates the Fortran code
    to C.
    Err, no. You're thinking of f2c. GNU Fortran uses the same backend as
    the GNU C compiler, in that it translates to the same intermediate
    language (sort of assembly language) on which optimizations are done.
    But the C front end and the Fortran front end are separate.

    The advantage of GNU Fortran is that it's *portable*. It runs on everything
    that GCC works on -- which is a lot. This makes a difference when
    you're developing (like on my Apple iBook running Linux). And it looks
    like the G95 project is humming along nicely.
    Python for hardcore numerics, even with PyNumerics, is simply a very
    bad solution.
    You're not limited to pure Python.

    No one's saying you must use only C, or only Fortran, or only Python.
    Use the best tool for the job. Python just has the advantage that
    mixing C and Fortran into it is easy. Write your big, number-crunching
    code as a Fortran routine, wrap it with f2py, then add a Python script
    around it to run it. Presto, no fiddling with Fortran for reading in
    input files, or output files (depending on your routine). You can even
    write a GUI in Python if you wish. Or add a webserver.
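
    The division of labor David describes can be sketched in miniature with
    only the standard library. With f2py the kernel below would be your own
    compiled Fortran routine; here math.fsum (implemented in C) stands in
    for it so the sketch is runnable without a Fortran compiler. All names
    are illustrative, not from any real project.

```python
import math

def kernel(values):
    # Stand-in for a compiled number-crunching routine, e.g. the
    # function you would get from "f2py -c -m mymod crunch.f90".
    # math.fsum runs entirely in C.
    return math.fsum(values)

def run(input_lines):
    # Python handles the glue: parsing input, calling the fast
    # kernel, formatting output.
    values = [float(line) for line in input_lines if line.strip()]
    return "total=%g" % kernel(values)

print(run(["1.5", "2.5", "", "4.0"]))  # total=8
```

    The point is structural: the inner loop lives in compiled code, while
    reading files, argument handling, and any GUI or web front end stay in
    Python.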

    --
    \/|<
    /--------------------------------------------------------------------------\
    David M. Cooke
    cookedm(at)physics(dot)mcmaster(dot)ca
  • Lothar Scholz at Jan 12, 2004 at 7:09 pm
    cookedm+news at physics.mcmaster.ca (David M. Cooke) wrote in message news:<qnk8ykdp3a0.fsf at arbutus.physics.mcmaster.ca>...
    At some point, llothar at web.de (Lothar Scholz) wrote:

    Samuel Walters <swalters_usenet at yahoo.com> wrote in message news:<pan.2004.01.11.06.08.18.867825 at yahoo.com>...
    I know very little about numeric processing, and even less about Fortran.
    It's true that my background is in mathematics, but in *pure* mathematics
    where pointer-based languages tend to be helpful, not hurtful.
    Okay, it seems that you don't know a lot about compiler writing.

    A C compiler knows only a little about the context, so it must
    always assume that data inside a structure member can be referenced
    from elsewhere via an aliasing pointer.

    Fortran does not have this problem, so a lot of optimizations can be
    done and values can be held in registers for much longer,
    resulting in much greater speed.

    Remember that on supercomputers a 25% speed enhancement (which a good
    Fortran compiler gains over C) can mean a few million dollars of
    saved hardware costs. The coding time is not important compared to the
    running time. So really hard numerics are always done in Fortran.

    GNU Fortran is a stupid project because it translates the Fortran code
    to C.
    Err, no. You're thinking of f2c. GNU Fortran uses the same backend as
    the GNU C compiler, in that it translates to the same intermediate
    language (sort of assembly language) on which optimizations are done.
    But the C front end and the Fortran front end are separate.
    But the front end is, as far as I know, only responsible for syntax
    analysis. All code generation (control-flow analysis, register
    allocation, etc.) is done on the intermediate three-register language.
    And the hints you could get from the Fortran language are simply unused.

    But I must say that I have not used Fortran for any real programming
    in the last 15 years (since I studied physics). I only wanted to point
    out that a lot of languages are by design faster than C. Fortran is
    one of them. You can find articles on the web where the same is
    explained for Lisp and functional languages.

    C and C++ are not good languages when it comes to compiler
    optimization.
  • Jeff Epler at Jan 12, 2004 at 4:24 pm

    On Mon, Jan 12, 2004 at 05:42:22AM -0800, Lothar Scholz wrote:
    Python for hardcore numerics, even with PyNumerics, is simply a very
    bad solution.
    I think that many of us are not in a situation where we have to make our
    program run fast enough to save a million on a supercomputer, but where
    we have to make our program run soon enough to save a few thousands or
    maybe tens of thousands on programmer time.

    ObLink: http://cens.ioc.ee/projects/f2py2e/

    Jeff
  • Robin Becker at Jan 12, 2004 at 10:42 pm
    In article <6ee58e07.0401120542.5d79090d at posting.google.com>, Lothar
    Scholz <llothar at web.de> writes
    ....
    Fortran does not have this problem so a lot of optimizations can be
    done and values can be hold in registers for a much longer time,
    resulting in much greater speed.
    I'm not sure I agree with the above. Aliases could certainly occur in
    Fortran 77; I haven't used 90, so I can't say for sure.
    Remember that on supercomputers a 25% speed enhancement (which a good
    Fortran compiler gains over C) can mean a few million dollars of
    saved hardware costs. The coding time is not important compared to the
    running time. So really hard numerics are always done in Fortran.
    --
    Robin Becker
  • Tim Peters at Jan 12, 2004 at 11:10 pm
    [Lothar Scholz]
    Fortran does not have this problem so a lot of optimizations can be
    done and values can be hold in registers for a much longer time,
    resulting in much greater speed.
    [Robin Becker]
    I'm not sure I agree with the above. Aliases could certainly occur in
    Fortran 77; I haven't used 90, so I can't say for sure.
    That's the magic of Fortran: the F77 standard says (in part):

    If a subprogram reference causes a dummy argument in the
    referenced subprogram to become associated with another
    dummy argument in the referenced subprogram, neither
    dummy argument may become defined during execution of
    that subprogram.

    It basically says you can alias all you want, so long as you only read the
    aliased entities and don't modify them. In effect, if you do anything with
    aliases that would inhibit optimizations that assume there isn't any
    aliasing, then it's your *program* that's not legitimate Fortran. The
    Fortran standard has lots of text "like that", imposing (often unenforceable)
    restrictions on conforming programs for the benefit of optimizing compilers.
    That was the right choice for Fortran's audience.
  • Robin Becker at Jan 13, 2004 at 12:50 am
    In article <mailman.310.1073949030.12720.python-list at python.org>, Tim
    Peters <tim.one at comcast.net> writes
    .......
    That's the magic of Fortran: the F77 standard says (in part):

    If a subprogram reference causes a dummy argument in the
    referenced subprogram to become associated with another
    dummy argument in the referenced subprogram, neither
    dummy argument may become defined during execution of
    that subprogram.

    It basically says you can alias all you want, so long as you only read the
    aliased entities and don't modify them. In effect, if you do anything with
    aliases that would inhibit optimizations that assume there isn't any
    aliasing, then it's your *program* that's not legitimate Fortran. The
    Fortran standard has lots of text "like that", imposing (often unenforceable)
    restrictions on conforming programs for the benefit of optimizing compilers.
    That was the right choice for Fortran's audience.
    This was also my understanding. The difficulty is that humans can't do
    the correct analysis in their heads for all but the simplest
    programs. I seem to remember that almost all the compilers I used had
    mechanisms for turning off the most aggressive optimisations, so if the
    card deck suddenly started working with them off then you could try to
    figure out what was wrong. Another nastiness was that by putting prints
    in you often disrupted the optimisations, and the values you got printed
    seemed to indicate everything was fine.
    -Common blocks are an invention of the Devil-ly yrs-
    Robin Becker
  • Skip Montanaro at Jan 10, 2004 at 1:50 pm
    QOTW perhaps?

    Sam> I read the benchmark and I think it doesn't measure Python in its
    Sam> target area. That's like taking a world-class marathon runner and
    Sam> wondering why he doesn't compete well in a figure-skating event.

    Skip
  • Samuel Walters at Jan 11, 2004 at 7:29 am

    Thus Spake Skip Montanaro On the now historical date of Sat, 10 Jan 2004
    07:50:09 -0600|
    QOTW perhaps?

    Sam> I read the benchmark and I think it doesn't measure Python in its
    Sam> target area. That's like taking a world-class marathon runner and
    Sam> wondering why he doesn't compete well in a figure-skating event.

    Skip
    *garsh*

    I feel flattered. *blush*

    You know, I sadly spent quite a bit of time debating which simile to use
    there. I wandered around the house wondering what to put there.

    Some rejected ideas:
    "It's like asking Kasparov why he didn't win the weight-lifting
    competition."

    "That's like asking a world-class marathon runner why he doesn't compete
    well in a weight-lifting competition."

    "That's like asking a world-class weightlifter why they didn't do well in
    the figure skating competition." (I almost used this one because it
    conjures images of a burly Russian weight-lifter floundering on ice
    skates. Very Monty-Python-esque.)

    I chose the one I did in case I needed to later state that "Both figure
    skating and marathon running are aerobic sports, but that doesn't mean
    that the skills involved are the same."


    Now, I feel compelled to justify my statement.

    Let's look closely at the benchmarks and try to figure out if there's a
    good reason why we fell down where we did.

    We did poorly at Math, and competitively at I/O.

    I'm reminded of Antoine de Saint-Exupéry saying "A designer knows he has
    achieved perfection not when there is nothing left to add, but when there
    is nothing left to take away." While not part of the Zen of Python, this
    seems to be an unstated principle of Python's design. It seems to focus
    on the bare minimum of what's needed for elegant expression of algorithms,
    and leave any extravagances to importable libraries.

    Why, then, are integers and longs part of Python itself, and not part of a
    library? Well, we need such constructs for counters, loops and indexes.
    Both range and xrange are evidence of this. Were it not for this, I
    daresay that we'd have at least argued the necessity of keeping these
    things in the language itself.
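
    Irmen's complaint at the top of the thread (the benchmark timed
    Python's arbitrary-precision long against the other languages'
    machine types) is easy to see interactively; this illustrative
    snippet is the editor's, not from the thread:

```python
# Python integers are arbitrary precision: "long math" in the
# benchmark is genuine bignum arithmetic, not a 32- or 64-bit
# machine word as in the C or Java versions.
a = 2 ** 100
print(a)           # 1267650600228229401496703205376
print(a + 1 > a)   # True: no silent wraparound at 2**31 or 2**63
```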

    Floats are a pragmatic convenience, because it's nice to be able to throw
    around the odd floating point number when you need to. Trig functions are
    housed in a separate library and notice that we didn't do too shabby there.

    I/O is one of our strengths, because we understand that most programs are
    not algorithmically bound, but rather I/O bound. I/O is a big
    bottle-neck, so we should be damn good at it. The fastest assembly
    program won't do much good if it's always waiting on the disk-drive.

    Perhaps our simplicity is the reason we hear so many Lisp'ers vocally
    complaining. While more pragmatic than Lisp, Python is definitely edging
    into the "Lambda Calculus Zone" that Lisp'ers have historically been the
    proud sole-occupants of. After all, until Python, when one wanted a
    nearly theoretical programming experience, one either went to C/Assembly
    (Turing Machine Metaphor) or Lisp (Lambda Calculus Metaphor.)

    Python is being used in so many introductory programming courses for the
    very reason that it so purely fits the way a programmer thinks, while
    still managing to be pragmatic. It allows for a natural introduction to
    some of the hardest concepts: Pointers/Reference, Namespaces, Objects and
    Legibility. Each of these concepts is difficult to learn if you are first
    indoctrinated into an environment without them. In my attempts to learn
    C++, I initially felt like I was beating my head up against a wall trying
    to learn what an object was and why one would use them. I have since
    observed that people coming from a strongly functional programming
    background have the same experience, while those with no functional
    programming dogma in them find objects quite a natural concept. The same
    thing is true of the other concepts I mentioned. If you have them, it's
    easy to work without them. If you don't, you'll flounder trying to pick
    them up. Think about how easy it is to pick out a C programmer from their
    Python coding style.

    The one important concept I didn't mention is message-passing. This is an
    important, but much less used concept. It is the domain of Smalltalk and
    Ruby. I've looked some at Ruby, and lurk their Usenet group. From what I
    can tell, Ruby takes almost the same philosophy as Python, except where we
    think namespaces are a honking great idea, they think message-passing is a
    honking great idea. The nice thing about message-passing is that if you
    have all the other concepts of OO down, message passing seems natural and
    is not terribly difficult to "fake" when it's the only missing OO
    primitive. This is why C++, while not a message-based OO language, is
    used so often in GUI's, an inherently message-based domain. This is also
    why we have such a nice, broad choice of GUI toolkits under Python despite
    lacking a message primitive.
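
    "Faking" message passing when it isn't a language primitive can be
    sketched in a few lines of Python: dispatch a message by name with
    getattr, the way Smalltalk or Ruby would resolve it natively. All
    names here are illustrative, not from any real GUI toolkit.

```python
class Button:
    def __init__(self, label):
        self.label = label

    def clicked(self):
        return "%s was clicked" % self.label

def send(receiver, message, *args):
    # Message passing reduced to dynamic attribute lookup plus a call;
    # unknown messages are handled instead of raising AttributeError.
    handler = getattr(receiver, message, None)
    if handler is None:
        return "%s does not understand %r" % (
            receiver.__class__.__name__, message)
    return handler(*args)

b = Button("OK")
print(send(b, "clicked"))   # OK was clicked
print(send(b, "resized"))   # Button does not understand 'resized'
```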


    Well, I've blathered enough on this topic.
    I hope, at least, that I've said something worthwhile.
    Though, I doubt I've said anything that hasn't been said better before.

    Caffeine, Boredom and Usenet are a dangerous mix.

    Sam Walters.


    --
    Never forget the halloween documents.
    http://www.opensource.org/halloween/
    """ Where will Microsoft try to drag you today?
    Do you really want to go there?"""
  • Ganesan R at Jan 11, 2004 at 11:43 am

    "Samuel" == Samuel Walters <swalters_usenet at yahoo.com> writes:
    I/O is one of our strengths, because we understand that most programs are
    not algorithmically bound, but rather I/O bound. I/O is a big
    bottle-neck, so we should be damn good at it. The fastest assembly
    program won't do much good if it's always waiting on the disk-drive.
    Actually, Python is much slower than Perl for I/O. See the thread titled
    "Python IO Performance?" in groups.google.com for a thread started by me on
    this topic. I am a full time C programmer but do write occasional
    Python/Perl for professional/personal use.

    To answer the original question about what percentage of time I spend
    optimizing my Python programs - probably never. However, I did switch back to
    using Perl for most of my text processing needs. For one program that was
    intended to look up patterns in a gzipped word list, performance of the
    original Python version was horribly slow. Instead of rewriting it in Perl,
    I simply opened a pipe to zgrep and did post processing in python. This
    turned out to be much faster - I don't remember how much faster, but I
    remember waiting for the output from the pure python version while the
    python+zgrep hybrid results were almost instantaneous.
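
    Ganesan's hybrid can be sketched as follows: let an external tool do
    the heavy scanning and post-process its output in Python. In his case
    the command was zgrep over a gzipped word list; to keep this sketch
    runnable anywhere, a tiny Python subprocess stands in for zgrep.

```python
import subprocess
import sys

def pipe_lines(cmd):
    # Stream the command's stdout line by line, like reading a file.
    proc = subprocess.Popen(cmd, stdout=subprocess.PIPE)
    try:
        for raw in proc.stdout:
            yield raw.decode().rstrip("\n")
    finally:
        proc.stdout.close()
        proc.wait()

# Real use would be: pipe_lines(["zgrep", pattern, "words.gz"])
stand_in = [sys.executable, "-c", "print('alpha'); print('beta')"]
hits = [line for line in pipe_lines(stand_in) if "a" in line]
print(hits)  # ['alpha', 'beta']
```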

    Ganesan

    --
    Ganesan R
  • Peter Hansen at Jan 12, 2004 at 2:13 pm

    Ganesan R wrote:
    "Samuel" == Samuel Walters <swalters_usenet at yahoo.com> writes:
    I/O is one of our strengths, because we understand that most programs are
    not algorithmically bound, but rather I/O bound. I/O is a big
    bottle-neck, so we should be damn good at it. The fastest assembly
    program won't do much good if it's always waiting on the disk-drive.
    Actually, Python is much slower than Perl for I/O. See the thread titled
    "Python IO Performance?" in groups.google.com for a thread started by me on
    this topic. I am a full time C programmer but do write occasional
    Python/Perl for professional/personal use.

    To answer the original question about what percentage of time I spend
    optimizing my Python programs - probably never. However, I did switch back to
    using Perl for most of my text processing needs. For one program that was
    intended to look up patterns in a gzipped word list, performance of the
    original Python version was horribly slow. Instead of rewriting it in Perl,
    I simply opened a pipe to zgrep and did post processing in python. This
    turned out to be much faster - I don't remember how much faster, but I
    remember waiting for the output from the pure python version while the
    python+zgrep hybrid results were almost instantaneous.
    I didn't consider this sort of thing in my poll, but I'd have to say you
    actually *are* optimizing your Python programs, even if you did it by falling
    back on another language...

    -Peter
  • JanC at Jan 9, 2004 at 9:39 pm

    "Krzysztof Stachlewski" <stach at fr.USUN.pl> schreef:

    With "heavy use of Numeric module" you were calling functions
    written in C. So how can you say that Python is fast,
    when C code is doing all the work.
    I think all (or at least most) of the tested compilers, VMs, etc. were
    written in C/C++, and thus are using libraries written in C/C++...

    --
    JanC

    "Be strict when sending and tolerant when receiving."
    RFC 1958 - Architectural Principles of the Internet - section 3.9
  • Krzysztof Stachlewski at Jan 9, 2004 at 11:11 pm
    "JanC" <usenet_spam at janc.invalid> wrote in message
    news:Xns946BE66E88E82JanC at 213.119.4.35...
    "Krzysztof Stachlewski" <stach at fr.USUN.pl> schreef:

    I think all (or at least most) of the tested compilers, VMs, etc. were
    written in C/C++, and thus are using libraries written in C/C++...
    Well, yes.
    Whether or not you can say that a piece of code
    is *really* implemented in a chosen language and not
    in "the language that this language is implemented in" ;-)
    is a matter of scale.

    I just think that the Numeric package
    is not the best example of the speed of Python itself.

    Stach
  • Jeff Epler at Jan 9, 2004 at 9:43 pm
    numarray is probably the perfect example of getting extremely good
    performance from a few simple constructs coded in C. (numarray-0.8 is
    less than 50k lines of C code, including comments and blank lines, so
    it's also very simple given the amount of "bang" it provides)
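
    Jeff's point can be shown in miniature: the same reduction written
    once as an explicit Python loop and once as a single call into
    C-coded machinery (the built-in sum). The C-backed version typically
    wins because the per-element work happens outside the interpreter
    loop. The example and any timings it prints are illustrative only.

```python
import timeit

data = list(range(100000))

def py_sum(xs):
    total = 0
    for x in xs:       # each iteration executes interpreter bytecode
        total += x
    return total

# Same answer either way; only where the loop runs differs.
assert py_sum(data) == sum(data)

t_py = timeit.timeit(lambda: py_sum(data), number=20)
t_c = timeit.timeit(lambda: sum(data), number=20)
print("explicit loop: %.3fs, built-in sum: %.3fs" % (t_py, t_c))
```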

    All your important logic is in Python, you're in no way stuck worrying
    about buffer lengths, when to call free(), and all those other things
    that drive me crazy when I try to write in C.

    If I had to execute my Python programs without executing any code
    written in "C" behind the scenes, well, I'd be stuck. Of course, the
    situation is about the same for any language you care to name, and if
    not then substitute "machine code".

    Jeff
  • Carl at Jan 9, 2004 at 9:45 pm

    Krzysztof Stachlewski wrote:

    "Carl" <phleum_nospam at chello.se> wrote in message
    news:ryELb.238$tK2.228 at amstwist00...
    I have been experimenting with numerical algorithms in Python with a
    heavy use of the Numeric module. My experience is that Python is quite
    fast in comparison with (and sometimes as fast as) traditional languages
    such as C or C++.
    With "heavy use of Numeric module" you were calling functions
    written in C. So how can you say that Python is fast,
    when C code is doing all the work.

    Stach
    Well, I guess you are right!

    What I meant was that when I run my Python algorithm it runs "almost as fast
    as" when I run similar code in C or C++. This is of course due to the
    highly efficient Numeric module. However, the point is that Python is a
    viable language from a numerical perspective.

    Carl
  • Tim Churches at Jan 9, 2004 at 9:52 pm

    On Sat, 2004-01-10 at 08:13, Krzysztof Stachlewski wrote:
    "Carl" <phleum_nospam at chello.se> wrote in message
    news:ryELb.238$tK2.228 at amstwist00...
    I have been experimenting with numerical algorithms in Python with a heavy
    use of the Numeric module. My experience is that Python is quite fast in
    comparison with (and sometimes as fast as) traditional languages such as C
    or C++.
    With "heavy use of Numeric module" you were calling functions
    written in C. So how can you say that Python is fast,
    when C code is doing all the work.
    Well, yes, but the Python VM is also written in C, and every time you
    make a call to a dictionary, or list etc, it is C code which is doing
    all the work. If you like, you could say that Python is a set of
    extremely clever wrappers around a bunch of optimised C code - but that
    rather diminishes the achievement of Python. I dare say that the
    Microsoft .NET and Visual Basic VMs are also written in C/C++, so you
    could say the same things about them - but I don't think that is a
    useful perspective.
    --

    Tim C

    PGP/GnuPG Key 1024D/EAF993D0 available from keyservers everywhere
    or at http://members.optushome.com.au/tchur/pubkey.asc
    Key fingerprint = 8C22 BF76 33BA B3B5 1D5B EB37 7891 46A9 EAF9 93D0


  • Terry Reedy at Jan 10, 2004 at 4:03 am
    "Krzysztof Stachlewski" <stach at fr.USUN.pl> wrote in message
    news:btn5db$2f8$1 at absinth.dialog.net.pl...
    "Carl" <phleum_nospam at chello.se> wrote in message
    news:ryELb.238$tK2.228 at amstwist00...
    With "heavy use of Numeric module" you were calling functions
    written in C. So how can you say that Python is fast,
    when C code is doing all the work.
    Well gee. *All* of the functions exposed in builtins *and* in built-in
    modules are also written in C. So are all the methods of builtin types and
    all the hidden functions (some exposed in the C API), including the
    compilation and interpretation. So how can anyone even talk about the
    speed of Python, when C code is doing all the work, whether quickly or
    slowly!

    [and in another post]
    I just think that the Numeric package is not the best example
    of the speed of Python itself.
    But what is 'Python' itself? I think you are making a false distinction.
    Numerical Python and other scientific code driving C and Fortran functions
    was a, if not *the*, killer app for Python when I learned it about 7 years
    ago. It was so important to the early success of Python, such as it was,
    that the slice object was added just for its use.

    Terry J. Reedy
  • Iwan van der Kleyn at Jan 9, 2004 at 9:44 pm

    Carl wrote:
    I think this is an unfair comparison! I wouldn't dream of developing a
    numerical application in Python without using prebuilt numerical libraries
    and data objects such as dictionaries and lists.
    Well, that may be true. And many applications spend most of their time
    in fast libraries anyway (GUI/DB/app servers) etc. Nice examples are Boa
    Constructor and Eric3 which are fully implemented in Python and run
    comfortably fast, thanks to the WxWindows and Qt libraries.

    But I really don't swallow the argument. In the few years that I've
    been working with Python, I've encountered performance bottlenecks on
    several occasions. And often a solution through partial implementation in C or
    improvements in the algorithm is just not feasible. To paraphrase Kent
    Beck: in real life you make your program run, then make it work, and
    then your manager says you've run out of time. You just cannot also
    make it run as fast as you would like.

    In other words: it would be nice if Python on average would run faster
    so the need for optimisation would lessen. Psyco is a nice option to
    have, though not as clear cut as I thought it would be:

    http://www-106.ibm.com/developerworks/linux/library/l-psyco.html
    The greatest advantage of Python is the great increase in productivity and
    the generation of a much smaller number of bugs due to the very clean and
    compact structure Python invites you to produce.
    So dogma dictates. And I've found it to be true on many occasions, if
    not all. BUT, the famed Python Productivity Gain is very difficult to
    quantify. And for me that's a BIG but. I'm trying to push Python within
    my company. Nicely presented "performance benchmarks" go down well with
    management, because those are quantities which are supposedly
    understood. And Python does not come across very positively in that
    respect.

    Regards,

    Iwan
  • Peter Hansen at Jan 9, 2004 at 10:09 pm

    Iwan van der Kleyn wrote:
    In other words: it would be nice if Python on average would run faster
    so the need for optimisation would lessen.
    I disagree with the above. My opinion has long been that Python runs
    adequately fast and that few people should need to spend much time on
    optimization. Maybe that sort of view should be put to the test.

    This is my "straw poll" question:

    Do you spend a "significant" amount of time actually optimizing your
    Python applications? (Significant is here defined as "more than five
    percent of your time", which is for example two hours a week in a
    40-hour work week.)

    Note the distinction between "actually optimizing" and "worrying about
    optimization" and such things. If you pause briefly during coding and
    rewrite a line to use a more efficient idiom, I don't consider that to be
    "optimization" for purposes of this question. Optimization would require
    roughly (a) noticing that performance was inadequate or actually profiling
    your code, and (b) rewriting specifically to get adequate performance.
    Algorithmic improvements that you would make regardless of implementation
    language do not qualify, and wasting time optimizing a script that you
    run once a year so it takes ten seconds instead of fifteen also does not
    qualify because you certainly didn't need to do it...

    Yes or no answers suffice, but feel free to follow up with a paragraph
    qualifying your answer (or quantifying it!). :-)

    -Peter
  • Dave Brueck at Jan 9, 2004 at 10:47 pm

    Peter wrote:
    Iwan van der Kleyn wrote:
    In other words: it would be nice if Python on average would run faster
    so the need for optimisation would lessen.
    I disagree with the above. My opinion has long been that Python runs
    adequately fast and that few people should need to spend much time on
    optimization. Maybe that sort of view should be put to the test.

    This is my "straw poll" question:

    Do you spend a "significant" amount of time actually optimizing your
    Python applications? (Significant is here defined as "more than five
    percent of your time", which is for example two hours a week in a
    40-hour work week.)
    Yay, straw poll! ;-)

    I program in Python full-time and spend approximately zero hours
    optimizing. In the past two years I can think of two instances in which I went
    into heavy optimization mode: one was for a web server that needed to handle
    hundreds of requests per second and the other was a log processor that needed
    to parse and process several gigabytes of log data per hour.

    In the server I added a tiny C extension to make use of the Linux sendfile API,
    all the other optimizations were algorithmic. In the log processor all the
    optimizations ended up being either algorithmic or doing fewer dumb things
    (like recomputing cacheable data).
  • Mike C. Fletcher at Jan 9, 2004 at 11:19 pm

    Dave Brueck wrote:
    Peter wrote:
    ...
    This is my "straw poll" question:

    Do you spend a "significant" amount of time actually optimizing your
    Python applications? (Significant is here defined as "more than five
    percent of your time", which is for example two hours a week in a
    40-hour work week.)
    Yay, straw poll! ;-)
    Indeed, yay ;) .

    I am often found optimising. The total time spent on it is probably
    somewhere around 1-2% of total (after all, I do still *use* Python, it's
    not like it's killing me). A lot of my projects are speed-sensitive
    (e.g. OpenGLContext trying to give interactive frame-rates in 3D
    scenegraph rendering, SimpleParse trying to create really fast parsers,
    web-sites trying to process 10s of thousands of records into summary
    views interactively), but I don't think I'm *that* far from the norm in
    the amount of time spent on optimisation.

    I'd be surprised if there's not some difference between the experience
    of library programmers and application programmers in this regard.
    Library developers tend to need to focus more effort on performance to
    allow users of the libraries some inefficiencies in their usage. That
    effort tends to be more in the planning stages, though, making sure
    you've chosen good algorithms, estimating work-loads and required
    throughput. You try to make sure the library is robust and fast by
    designing it that way, not by trying to optimise it after the fact. You
    plan for worst-case scenarios, and as a result you become "worried"
    about performance. An application developer is normally going to use
    optimised libraries and just run with it, only needing to deal with
    unacceptable performance if it happens to show up in their particular
    project.

    Anyway, back to it,
    Mike

    _______________________________________
    Mike C. Fletcher
    Designer, VR Plumber, Coder
    http://members.rogers.com/mcfletch/
  • Rene Pijlman at Jan 10, 2004 at 12:12 am
    Mike C. Fletcher:
    I am often found optimising.
    By an unoptimized or by an optimizing coworker? :-)

    --
    René Pijlman
  • Mike C. Fletcher at Jan 10, 2004 at 12:35 am

    Rene Pijlman wrote:
    Mike C. Fletcher:

    I am often found optimising.
    By an unoptimized or by an optimizing coworker? :-)
    Well, in a perfect world, it would be by a lovely and svelte lady,
    missing her husband and seeking to drag me off to connubial bliss (we'll
    assume for propriety's sake that the husband would be me), but in my sad
    and lonely condition, it's normally I who finds myself late at night,
    feverishly optimising ;) :) .

    All a man really needs to be happy is work, and a woman to drag him away
    from it, or so I'm told,
    Mike(y)

    _______________________________________
    Mike C. Fletcher
    Designer, VR Plumber, Coder
    http://members.rogers.com/mcfletch/
  • Jp Calderone at Jan 10, 2004 at 12:57 am

    On Fri, Jan 09, 2004 at 05:09:37PM -0500, Peter Hansen wrote:
    Iwan van der Kleyn wrote:
    In other words: it would be nice if Python on average would run faster
    so the need for optimisation would lessen.
    I disagree with the above. My opinion has long been that Python runs
    adequately fast and that few people should need to spend much time on
    optimization. Maybe that sort of view should be put to the test.

    This is my "straw poll" question:

    Do you spend a "significant" amount of time actually optimizing your
    Python applications? (Significant is here defined as "more than five
    percent of your time", which is for example two hours a week in a
    40-hour work week.)
    No.

    Jp
  • Ken at Jan 10, 2004 at 2:40 am

    Do you spend a "significant" amount of time actually optimizing your
    Python applications? (Significant is here defined as "more than five
    percent of your time", which is for example two hours a week in a
    40-hour work week.)
    Some of them.

    I'm using Python for data transformations: some feeds are small and
    easily handled by Python; the large ones (10 million rows per file),
    however, require a bit of thought spent on performance. That said,
    this isn't exactly Python optimization - more like shifting high-level
    pieces around in the architecture: merge two files, or do a binary
    lookup (nested-loop join) on one? etc...
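
    The architectural choice Ken mentions can be sketched in miniature:
    joining a feed against reference data either with a dictionary lookup
    or with a merge join over sorted inputs. The data and field layout
    below are made up for illustration.

```python
def join_by_dict(feed, reference):
    # Build a lookup table once, then probe it per feed row.
    ref = dict(reference)
    return [(key, val, ref[key]) for key, val in feed if key in ref]

def join_by_merge(feed, reference):
    # Merge join: sort both inputs by key, then scan each once.
    # Attractive when the inputs are too large to hold in a dict.
    out = []
    it = iter(sorted(reference))
    rkey, rval = next(it, (None, None))
    for key, val in sorted(feed):
        while rkey is not None and rkey < key:
            rkey, rval = next(it, (None, None))
        if rkey == key:
            out.append((key, val, rval))
    return out

feed = [(2, "b"), (1, "a"), (3, "c")]
reference = [(1, "one"), (3, "three"), (4, "four")]
assert join_by_dict(feed, reference) == join_by_merge(feed, reference)
print(join_by_merge(feed, reference))  # [(1, 'a', 'one'), (3, 'c', 'three')]
```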

    To make matters worse we just implemented a metadata-driven
    transformation engine entirely written in python. It'll work great on
    the small files, but the large ones...

    Luckily, the nature of this application lends itself towards
    distributed processing - so my plan is to:
    1. check out psyco for the metadata-driven tool
    2. partition the feeds across multiple servers
    3. rewrite performance-intensive functions in c

    But I think I'll get by with just options #1 and #2: we're using
    python and it's working well - exactly because it is so adaptable.
    The cost in performance is inconsequential in this case compared to
    the maintainability.
  • Brian Kelley at Jan 10, 2004 at 4:19 am

    This is my "straw poll" question:

    Do you spend a "significant" amount of time actually optimizing your
    Python applications? (Significant is here defined as "more than five
    percent of your time", which is for example two hours a week in a
    40-hour work week.)
    Yes and No :) I find that optimizing algorithms is a lot more
    beneficial than optimizing code. Let me give a small example. I have
    written a chemoinformatics engine (http://frowns.sourceforge.net/) and
    one of its features is substructure searching: finding whether a
    graph is embedded in another graph. This is an NP-complete problem. A
    company recently compared their technique in terms of speed and
    correctness to frowns. According to them, frowns was 99% correct
    but 1000x slower. (Why they were marketing their system, built over
    5+ man-years, against mine, built over 3 months, I will never
    understand.)

    Now, 1000x is a *lot* slower. However, when used in practice in a
    database setting, my system has a quick mechanism that can reject false
    matches very quickly. This is standard practice for chemistry databases
    by the way. All of a sudden the 1000x difference becomes almost
    meaningless. For a given search across 300000+ compounds, my system
    takes 1.2 seconds and theirs takes 25 minutes. Using my prefiltering
    scheme their system takes 0.7 seconds. Now my code didn't change at
    all, only the way it was used changed.
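
    The filter-then-verify pattern described above can be sketched in a few
    lines; the label-set "fingerprint" below is a toy stand-in for a real
    chemistry screen, and all names are illustrative, not frowns' API:

```python
# Filter-then-verify: a cheap, conservative screen rejects most candidates
# so the expensive exact matcher runs only on the few survivors.
# "Graphs" here are just strings of atom labels -- a toy model.

def fingerprint(graph):
    # Hypothetical cheap summary: the set of labels appearing in the graph.
    # Conservative: if the query's labels aren't a subset of the target's,
    # no embedding is possible, so rejection is always safe.
    return set(graph)

def exact_subgraph_match(query, target):
    # Stand-in for the expensive NP-complete embedding test.
    return set(query) <= set(target)

def search(query, database):
    q_fp = fingerprint(query)
    hits = []
    for target in database:
        if not q_fp <= fingerprint(target):
            continue  # rejected by the cheap screen; expensive test skipped
        if exact_subgraph_match(query, target):
            hits.append(target)
    return hits

print(search("CO", ["CCO", "CCN", "CO"]))  # ['CCO', 'CO']
```

    The screen must never reject a true match; it may pass false candidates,
    since the exact matcher runs on the survivors anyway. That is why the raw
    speed of the exact matcher stops mattering much once most of the database
    is screened out.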

    I could, of course, generate an example that takes me much longer but
    the average case is a whole lot better. My system is free though, so my
    users tend not to mind (or quite honestly, expect) as much :)

    Brian
  • Sean 'Shaleh' Perry at Jan 10, 2004 at 8:03 am

    On Friday 09 January 2004 14:09, Peter Hansen wrote:
    Yes or no answers suffice, but feel free to follow up with a paragraph
    qualifying your answer (or quantifying it!). :-)
    not any longer.

    As I learned on this and the tutor list, writing code in Pythonic style tends
    to also result in code being fast enough. Most of my early problems resulted
    from trying to write C in Python.
  • Tim Delaney at Jan 10, 2004 at 12:29 pm
    From: "Peter Hansen" <peter at engcorp.com>
    This is my "straw poll" question:

    Do you spend a "significant" amount of time actually optimizing your
    Python applications? (Significant is here defined as "more than five
    percent of your time", which is for example two hours a week in a
    40-hour work week.)
    I have to say that I do, but I'm also dealing with datasets up to about
    500MB in the worst case - but about 10-20MB in the normal case.

    In most cases, the optimisations are things like only doing a single pass
    over any file, etc. A naive prototype often doesn't scale well when I need
    to deal with large datasets. Often using psyco will be sufficient to get me
    over the hurdle (see the psyco thread) but sometimes not.
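
    The single-pass idea is simple but worth illustrating: accumulate every
    statistic in one sweep rather than re-reading the file once per
    statistic (the names here are illustrative, not Tim's actual code):

```python
# One pass over the data, accumulating everything at once, instead of
# a separate pass per statistic.

def single_pass_stats(lines):
    count = 0      # number of records
    total = 0      # total characters seen
    longest = ""   # longest record so far
    for line in lines:
        count += 1
        total += len(line)
        if len(line) > len(longest):
            longest = line
    return count, total, longest

# Works on any iterable of lines, e.g. an open file object:
# with open("feed.txt") as f:
#     n, chars, widest = single_pass_stats(f)
print(single_pass_stats(["ab", "abcd", "a"]))  # (3, 7, 'abcd')
```

    For a 500MB file the difference between one pass and three is the
    difference between reading 500MB and reading 1.5GB.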

    Tim Delaney
  • Matthias at Jan 12, 2004 at 9:34 am

    Peter Hansen <peter at engcorp.com> writes:

    This is my "straw poll" question:

    Do you spend a "significant" amount of time actually optimizing your
    Python applications? (Significant is here defined as "more than five
    percent of your time", which is for example two hours a week in a
    40-hour work week.)
    I was working on an image processing application and was looking for a
    quick prototyping language. I was ready to accept a 10-fold decrease
    in execution speed w.r.t. C/C++. With python+psyco, I experienced a
    1000-fold decrease.

    So I started re-writing parts of my program in C. Execution speed now
    increased, but productivity was as low as before (actually writing the
    programs directly in C++ felt somewhat more natural). Often it
    happened that I prototyped an algorithm in python, started the
    program, implemented the algorithm in C as an extension module and
    before the python algorithm had finished I got the result from the
    C-algorithm. :-(

    I've tried Numeric, but my code was mostly not suitable for
    vectorization and I did not like Numeric's pointer semantics.

    So my answer to the question above is NO, I don't spend significant
    times optimizing python code as I do not use python for
    computationally intensive calculations any more. My alternatives are
    Matlab and (sometimes) Common Lisp or Scheme or Haskell.



  • Paul Rubin at Jan 12, 2004 at 9:49 am
    Subject: Division oddity

    Robin Becker <robin at jessikat.fsnet.co.uk> writes:
    Python 2.3.2 (#49, Oct 2 2003, 20:02:00) [MSC v.1200 32 bit (Intel)] on
    win32
    Type "help", "copyright", "credits" or "license" for more information.
    from __future__ import division
    eval('1/2')
    0.5
    so I guess pythonwin is broken in this respect.
    Huh? You get the expected result with both python and pythonwin. Neither
    one is broken. The surprising result is from input(), not eval():

    $ python
    Python 2.2.2 (#1, Feb 24 2003, 19:13:11)
    [GCC 3.2.2 20030222 (Red Hat Linux 3.2.2-4)] on linux2
    Type "help", "copyright", "credits" or "license" for more information.
    from __future__ import division
    print input('expression: ')
    expression: 1/2
    0
    print eval('1/2')
    0.5
    >>>

    That's because input evaluates its string in the context of a
    different module which was compiled with old-style division. I don't
    know of other functions that use eval like that, and input() should
    be deprecated or eliminated anyway, so this isn't a big deal.
  • Peter Hansen at Jan 12, 2004 at 2:15 pm

    Paul Rubin wrote:
    Peter Hansen <peter at engcorp.com> writes:
    This is my "straw poll" question:

    Do you spend a "significant" amount of time actually optimizing your
    Python applications? (Significant is here defined as "more than five
    percent of your time", which is for example two hours a week in a
    40-hour work week.)
    Yes, absolutely.
    Algorithmic improvements that you would make regardless of implementation
    language do not qualify, and wasting time optimizing a script that you
    run once a year so it takes ten seconds instead of fifteen also does not
    qualify because you certainly didn't need to do it...
    Sometimes I'll take the time to implement a fancy algorithm in Python
    where in a faster language I could use brute force and still be fast
    enough. I'd count that as an optimization.
    I would count it as optimization as well, which is why I qualified my
    comment with "regardless of implementation language". Clearly in your
    case that clause does not apply.

    -Peter
  • Dave Brueck at Jan 9, 2004 at 10:40 pm

    Iwan:
    The greatest advantage of Python is the great increase in productivity and
    the generation of a much smaller number of bugs due to the very clean and
    compact structure Python invites you to produce.
    So dogma dictates. And I've found it to be true on many occasions, if
    not all. BUT, the famed Python Productivity Gain is very difficult to
    quantify. And for me that's a BIG but. I'm trying to push Python within
    my company. Nicely presented "performance benchmarks" go down well with
    management, because those are quantities which are supposedly
    understood.
    Understood, perhaps, but very often irrelevant. That being the case, using
    performance benchmarks to argue your case is a weak approach.

    If you're talking to management, talk to them about something they care about,
    like money. For most programs it's hard to translate performance improvements
    into money: e.g. it's hard to assert that doubling the speed of your
    spell-checker implementation will increase sales. There are, of course,
    exceptions, but even then there's no guarantee that management would still
    prefer performance above other factors if given a choice.

    It's much more powerful to speak about reduced time to market, for example. Or
    the ability to compete against companies with legions of programmers. Or the
    decreased time it takes to turn ideas into implemented features (especially
    when it's your competitor that came up with the idea). Or a lower cost of
    changing directions technologically. Etc.

    -Dave
  • Tim Churches at Jan 9, 2004 at 9:44 pm

    On Sat, 2004-01-10 at 08:05, Carl wrote:
    "Nine Language Performance Round-up: Benchmarking Math & File I/O"
    http://www.osnews.com/story.php?news_idV02

    I think this is an unfair comparison! I wouldn't dream of developing a
    numerical application in Python without using prebuilt numerical libraries
    and data objects such as dictionaries and lists.
    Benchmarks like those reported are nearly worthless, but nevertheless,
    it would be interesting to re-run them using Numeric Python and/or
    Pyrex.

    I notice that the author of those benchmarks, Christopher W.
    Cowell-Shah, has a PhD in philosophy. Perhaps Python's very own
    philosophy PhD, David Mertz, might like to repeat the benchmarking
    exercise for one of his columns, but including manipulation of more
    realistic data structures such as lists, arrays and dictionaries, as
    Carl suggests. I'm sure it would be a popular article, and provide a
    counterpoint to the good Dr Mertz's previous articles on Psyco.

    --

    Tim C

    PGP/GnuPG Key 1024D/EAF993D0 available from keyservers everywhere
    or at http://members.optushome.com.au/tchur/pubkey.asc
    Key fingerprint = 8C22 BF76 33BA B3B5 1D5B EB37 7891 46A9 EAF9 93D0


  • JanC at Jan 9, 2004 at 10:06 pm

    Tim Churches <tchur at optushome.com.au> schreef:

    I notice that the author of those benchmarks, Christopher W.
    Cowell-Shah, has a PhD in philosophy. Perhaps Python's very own
    philosophy PhD, David Mertz, might like to repeat the benchmarking
    exercise for one of his columns, but including manipulation of more
    realistic data structures such as lists, arrays and dictionaries, as
    Carl suggests.
    And then don't forget to publish it on /. or nobody sees it... ;-)

    --
    JanC

    "Be strict when sending and tolerant when receiving."
    RFC 1958 - Architectural Principles of the Internet - section 3.9
