FAQ
Hi all,

I have just released version 0.2 of Shed Skin, an experimental
(restricted) Python-to-C++ compiler (http://shedskin.googlecode.com).
It comes with 7 new example programs (for a total of 40 example
programs, at over 12,000 lines) and several important improvements/bug
fixes. See http://code.google.com/p/shedskin/wiki/ReleaseNotes for the
full changelog.

The new example programs consist of Disco, an elegant go player (see
http://shed-skin.blogspot.com/2009/07/disco-elegant-python-go-player.html),
a larger Voronoi implementation at 800 lines, a TSP algorithm
simulating ant colonies, a nicer neural network algorithm and three
compressors (Lempel-Ziv, huffman block, and arithmetic).

Other than bug fixes for these programs, this release adds some
important optimizations. First and foremost, inlining was greatly
improved, resulting in potential speedups across the board. Second,
loops such as 'for a, b in enumerate/zip(sequence[, sequence])' should
now be dramatically faster (also inside list comprehensions), by
avoiding allocation of intermediate tuples. Finally, basic list
slicing should now be much faster.


Please try it out!
Mark Dufour.
--
"One of my most productive days was throwing away 1000 lines of code"
- Ken Thompson

Search Discussions

  • William Dode at Jul 20, 2009 at 11:56 am

    On 19-07-2009, Mark Dufour wrote:
    Hi all,

    I have just released version 0.2 of Shed Skin, an experimental
    (restricted) Python-to-C++ compiler (http://shedskin.googlecode.com).
    I just tested it with a litle game, to find the places of horse on
    a board 5x5. The result is :

    c 5s
    gcj 7s
    java 7s
    shedskin 8s
    python + psyco 18s
    cython avec malloc *int 18s
    cython 55s avec [] python
    python 303s (5m3s)

    --
    William Dod? - http://flibuste.net
    Informaticien Ind?pendant
  • Stefan Behnel at Jul 20, 2009 at 1:22 pm

    William Dode wrote:
    On 19-07-2009, Mark Dufour wrote:
    I have just released version 0.2 of Shed Skin, an experimental
    (restricted) Python-to-C++ compiler (http://shedskin.googlecode.com).
    I just tested it with a litle game, to find the places of horse on
    a board 5x5. The result is :

    [...]
    shedskin 8s
    python + psyco 18s
    cython avec malloc *int 18s
    cython 55s avec [] python
    python 303s (5m3s)
    Note that both Psyco and Cython make a lot less assumptions about Python
    code than Shed Skin does. Psyco has the advantage of just needing to jump
    in when it finds out that it can help, so it's the most broadly compatible
    of the three. But Cython also supports quite a large corpus of dynamic
    Python code by now. Shed Skin has a lot of restrictions, many of which are
    by design. It's not intended to compile dynamic code, and I think that's a
    good thing, because that's what makes it fast for the code that it
    supports. Getting the same speed in Cython requires a bit more explicit
    typing, simply because Cython does not assume these restrictions.

    I think that all three have their raison d'?tre, and it currently looks
    like all three are there to stay and to keep growing better. And I'm also
    happy to read that some optimisations jump from one to the other. ;)

    Stefan
  • Bearophile at Jul 20, 2009 at 4:26 pm

    William Dode':
    I just tested it with a litle game, to find the places of horse on
    a board 5x5. The result is :

    c 5s
    gcj 7s
    java 7s
    shedskin 8s
    python + psyco 18s
    cython avec malloc *int 18s
    cython 55s avec [] python
    python 303s (5m3s)
    Nice timings, can you please show me the Python, Java and C code
    versions? I may do more tests.
    The purpose of all those "example programs" in ShedSkin is to find
    bugs and to find details to speedup.

    Bye,
    bearophile
  • Srepmub at Jul 20, 2009 at 7:20 pm

    Nice timings, can you please show me the Python, Java and C code
    versions? I may do more tests.
    also, which shedskin options did you use? did you use -bw, to disable
    bounds and wrap-around checking? this can make quite a difference for
    code that does a lot of indexing. if the code uses random numbers,
    then -r can also make a big difference, to use C rand(), instead of
    Python compatible random numbers.

    and which C++ compiler flags did you use? the default -O2, or
    something like -O3 -fomit-frame-pointer -msse2..?


    thanks,
    mark.
  • James Matthews at Jul 20, 2009 at 9:01 pm
    I like this, I am going to run this as a test. I also want to see the source
    code on how they compile the dynamic variables.
    On Mon, Jul 20, 2009 at 10:20 PM, srepmub wrote:

    Nice timings, can you please show me the Python, Java and C code
    versions? I may do more tests.
    also, which shedskin options did you use? did you use -bw, to disable
    bounds and wrap-around checking? this can make quite a difference for
    code that does a lot of indexing. if the code uses random numbers,
    then -r can also make a big difference, to use C rand(), instead of
    Python compatible random numbers.

    and which C++ compiler flags did you use? the default -O2, or
    something like -O3 -fomit-frame-pointer -msse2..?


    thanks,
    mark.
    --
    http://mail.python.org/mailman/listinfo/python-list


    --
    http://www.jewelerslounge.com
    -------------- next part --------------
    An HTML attachment was scrubbed...
    URL: <http://mail.python.org/pipermail/python-list/attachments/20090721/24cdf130/attachment.htm>
  • William Dode at Jul 21, 2009 at 8:19 am

    On 20-07-2009, srepmub wrote:
    Nice timings, can you please show me the Python, Java and C code
    versions? I may do more tests.
    Of course, the codes are here :

    http://hg.flibuste.net/libre/games/cheval

    Like you'll see, i tried to use exactly the same code for each langage.
    also, which shedskin options did you use? did you use -bw, to disable
    bounds and wrap-around checking? this can make quite a difference for
    code that does a lot of indexing. if the code uses random numbers,
    then -r can also make a big difference, to use C rand(), instead of
    Python compatible random numbers.

    and which C++ compiler flags did you use? the default -O2, or
    something like -O3 -fomit-frame-pointer -msse2..?
    I used the default, shedksin cheval.py; make
    shedskin 0.2

    With -bw and -O3 -fomit-frame-pointer -msse2 i have 5.5s (instead of 8)

    Let me know if you find better.

    --
    William Dod? - http://flibuste.net
    Informaticien Ind?pendant
  • Srepmub at Jul 21, 2009 at 12:25 pm

    With -bw and -O3 -fomit-frame-pointer -msse2 i have 5.5s (instead of 8)

    Let me know if you find better.
    thanks. now I'm wondering how fast does the C version become with
    these flags..? :-)


    mark.
  • William Dode at Jul 21, 2009 at 12:44 pm

    On 21-07-2009, srepmub wrote:
    With -bw and -O3 -fomit-frame-pointer -msse2 i have 5.5s (instead of 8)

    Let me know if you find better.
    thanks. now I'm wondering how fast does the C version become with
    these flags..? :-)
    I don't see any difference...

    --
    William Dod? - http://flibuste.net
    Informaticien Ind?pendant
  • Bearophile at Jul 21, 2009 at 3:08 pm

    William Dode':
    http://hg.flibuste.net/libre/games/cheval
    Like you'll see, i tried to use exactly the same code for each langage.
    It's a cute solver.

    Few more versions of mine:

    #1, a Psyco version of mine:
    http://codepad.org/9m5rf7kX

    #2, unrolled Psyco version:
    http://codepad.org/gKFLu34M

    #3, a quick D (D1) version:
    http://codepad.org/Tk9FL7Xk

    Timings (no printing), seconds, best of 3:
    #1: 4.79
    #2: 3.67
    #3: 1.10
    Your Psyco version: 13.37
    Your C version, compiled with GCC 4.3.2, -s -O3 -fomit-frame-pointer:
    3.79

    I have timed the whole running time of the programs, with an external
    timer, so my timings include the start of the PythonVM and the final
    cleanup of the GC.

    Please, feel free to time my code again, so we can compare with your
    other timings (Java, etc) better.

    I have used Psyco 1.6 final and Python 2.6.2 on a Core2 CPU at 2 GHz.

    To compile the D1 code you can use the free DMD compiler for D1, I
    have used version 1.042, www.digitalmars.com/d/download.html ).

    In such benchmarks it's often better to start from the fastest low
    level code (written in an imperative language as C/D/C++, or a version
    in a functional language like Clean or OCaML) and then use it to
    create higher level versions (in Python, etc). This offers you a
    baseline timing, and usually the small differences of the higher level
    code compared to the original C/Clean code don't invalidate the test.

    I have changed only small things in the program, as you can see the
    Psyco program #2 is faster than your C code :-) If you learn how to
    use it well, Psyco1.6 can often lead to very good performance. But lot
    of people don't know how to use Psyco well (I too am ignorant: I don't
    know how to profile Python programs yet).

    Bye,
    bearophile
  • William Dode at Jul 22, 2009 at 11:38 am
    I updated the script (python, c and java) with your unrolled version
    + somes litle thinks.

    I also tried with python3.1, unladen Q2, ironpython1.1.1

    Unfortunately it doesn't work more with shedskin, i'll see on the
    shedskin group...

    c 1.85s
    gcj 2.15s
    java 2.8s
    python2.5 + psyco 3.1s
    unladen-2009Q2 145s (2m45)
    python2.5 254s (4m14s)
    python3.1 300s (5m)
    ironpython1.1.1 680s (11m20)


    --
    William Dod? - http://flibuste.net
    Informaticien Ind?pendant
  • George Sakkis at Jul 22, 2009 at 2:55 pm

    On Jul 22, 7:38?am, William Dode wrote:

    I updated the script (python, c and java) with your unrolled version
    + somes litle thinks.

    I also tried with python3.1, unladen Q2, ironpython1.1.1

    Unfortunately it doesn't work more with shedskin, i'll see on the
    shedskin group...

    c 1.85s
    gcj 2.15s
    java 2.8s
    python2.5 + psyco 3.1s
    unladen-2009Q2 145s (2m45)
    python2.5 254s (4m14s)
    python3.1 300s (5m)
    ironpython1.1.1 680s (11m20)
    Cool; it would be interesting to see the numbers for Jython and Boo as
    well if it's not too much effort.

    George
  • William Dode at Jul 22, 2009 at 4:45 pm

    On 22-07-2009, George Sakkis wrote:
    On Jul 22, 7:38?am, William Dode wrote:

    I updated the script (python, c and java) with your unrolled version
    + somes litle thinks.

    I also tried with python3.1, unladen Q2, ironpython1.1.1

    Unfortunately it doesn't work more with shedskin, i'll see on the
    shedskin group...

    c 1.85s
    gcj 2.15s
    java 2.8s
    python2.5 + psyco 3.1s
    unladen-2009Q2 145s (2m45)
    python2.5 254s (4m14s)
    python3.1 300s (5m)
    ironpython1.1.1 680s (11m20)
    Cool; it would be interesting to see the numbers for Jython and Boo as
    well if it's not too much effort.
    I just tried with jython, but oddly it's faster without array.
    Thanks to Mark, i could compile to shedskin again.
    And add somes little improvements...

    c 1.65s
    gcj 1.9s
    java 2.4s
    python2.5 + psyco 2.9s
    shedskin 3.4s
    unladen-2009Q2 125s (2m05)
    Jython 2.2.1 on java1.6.0_12 176s (without array, like shedskin)
    Jython 2.2.1 on java1.6.0_12 334s (with array)
    python2.5 215s (3m35s)
    python3.1 246s (4m06s)
    ironpython1.1.1 512 (8m32s)


    --
    William Dod? - http://flibuste.net
    Informaticien Ind?pendant
  • William Dode at Jul 22, 2009 at 5:55 pm

    On 22-07-2009, William Dode wrote:

    c 1.65s
    gcj 1.9s
    java 2.4s
    python2.5 + psyco 2.9s
    shedskin 3.4s
    with -bw i have 2.6s
    unladen-2009Q2 125s (2m05)
    Jython 2.2.1 on java1.6.0_12 176s (without array, like shedskin)
    Jython 2.2.1 on java1.6.0_12 334s (with array)
    python2.5 215s (3m35s)
    python3.1 246s (4m06s)
    ironpython1.1.1 512 (8m32s)
    somebody can test with ironpython on windows ?

    Anyway, it's very impressive. I wonder if unladen will be so close in
    the futur.

    --
    William Dod? - http://flibuste.net
    Informaticien Ind?pendant
  • Brian at Jul 22, 2009 at 11:51 pm

    On Wed, Jul 22, 2009 at 11:55 AM, William Dode wrote:
    On 22-07-2009, William Dode wrote:

    c 1.65s
    gcj 1.9s
    java 2.4s
    python2.5 + psyco 2.9s
    shedskin 3.4s
    with -bw i have 2.6s
    unladen-2009Q2 125s (2m05)
    Jython 2.2.1 on java1.6.0_12 176s (without array, like shedskin)
    Jython 2.2.1 on java1.6.0_12 334s (with array)
    python2.5 215s (3m35s)
    python3.1 246s (4m06s)
    ironpython1.1.1 512 (8m32s)
    somebody can test with ironpython on windows ?

    Anyway, it's very impressive. I wonder if unladen will be so close in
    the futur.

    --
    William Dod? - http://flibuste.net
    Informaticien Ind?pendant
    I had time to run a few tests. Since psyco doesn't work on 64 bit and the
    developer now works on pypy I used that.

    Intel Core i7 920 @ 2.66GHz
    Kubuntu Jaunty Jackal

    c gcc 4.3.3 0.77s
    gcj 4.3.3 0.81s
    java 1.6 0.99s
    shedskin 1.63s
    jython 2.2.1 85.37s
    cpython 2.6.2 93.26s
    pypy 1.1.0 1612.15s

    <http://mail.python.org/mailman/listinfo/python-list>
    -------------- next part --------------
    An HTML attachment was scrubbed...
    URL: <http://mail.python.org/pipermail/python-list/attachments/20090722/3d82420a/attachment.htm>
  • George Sakkis at Jul 22, 2009 at 8:27 pm

    On Jul 22, 12:45?pm, William Dode wrote:
    On 22-07-2009, George Sakkis wrote:


    On Jul 22, 7:38?am, William Dode wrote:

    I updated the script (python, c and java) with your unrolled version
    + somes litle thinks.
    I also tried with python3.1, unladen Q2, ironpython1.1.1
    Unfortunately it doesn't work more with shedskin, i'll see on the
    shedskin group...
    c 1.85s
    gcj 2.15s
    java 2.8s
    python2.5 + psyco 3.1s
    unladen-2009Q2 145s (2m45)
    python2.5 254s (4m14s)
    python3.1 300s (5m)
    ironpython1.1.1 680s (11m20)
    Cool; it would be interesting to see the numbers for Jython and Boo as
    well if it's not too much effort.
    I just tried with jython, but oddly it's faster without array.
    FYI Jython 2.5 was released last month, you may want to try this
    instead of 2.2.

    George
  • Bearophile at Jul 27, 2009 at 1:09 am

    William Dode':
    I updated the script (python, c and java) with your unrolled version
    + somes litle thinks. [...]
    c 1.85s
    gcj 2.15s
    java 2.8s
    python2.5 + psyco 3.1s
    unladen-2009Q2 145s (2m45)
    python2.5 254s (4m14s)
    python3.1 300s (5m)
    ironpython1.1.1 680s (11m20)
    Sorry for being late, I was away.

    In your last C version this code is useless because the C compiler is
    able to perform such simple optimization by itself (but probably
    Python isn't able, so if you want the code to be the similar in all
    versions it may be better to keep it):

    shift_0=shift[0];
    shift_1=shift[1];
    shift_2=shift[2];
    shift_3=shift[3];
    shift_4=shift[4];
    shift_5=shift[5];
    shift_6=shift[6];
    shift_7=shift[7];


    This part in the Python code is useless:

    shift_0 = shift[0]
    shift_1 = shift[1]
    shift_2 = shift[2]
    shift_3 = shift[3]
    shift_4 = shift[4]
    shift_5 = shift[5]
    shift_6 = shift[6]
    shift_7 = shift[7]

    Because later you copy values locally anyway:

    def solve(nb, x, y,
    SIDE=SIDE, SQR_SIDE=SQR_SIDE, circuit=circuit,
    shift_0=shift_0,
    shift_1=shift_1,
    shift_2=shift_2,
    shift_3=shift_3,
    shift_4=shift_4,
    shift_5=shift_5,
    shift_6=shift_6,
    shift_7=shift_7,
    ):

    So doing something like this is probably enough:

    def solve(nb, x, y,
    SIDE=SIDE, SQR_SIDE=SQR_SIDE, circuit=circuit,
    shift_0=shift[0],
    shift_1=shift[1],
    shift_2=shift[2],
    shift_3=shift[3],
    shift_4=shift[4],
    shift_5=shift[5],
    shift_6=shift[6],
    shift_7=shift[7],
    ):

    In low-level languages like C unrolling has to be done with care, to
    avoid slowing down the code.

    I have tried your latest C version using your compiler options, my
    MinGW based on GCC 4.3.2 produces a crash at runtime. Using LLVM-GCC
    it runs in 1.31 seconds. The D version is a bit less optimized than
    your last C versions, yet using DMD it runs in 1.08-1.10 seconds.
    Let's see if someone is able to write a C version faster than that D
    code :-)

    Have you have compiled/read my D version? In the D version you may
    have missed that I did use an extra trick: unsigned integers, so it
    needs just two tests to see if a number is in the 0-5, 0-5 square :-)
    Note that Pyd, the Python-D bridge, may work with the latest DMD
    version still (and it works if you use a bit older DMD compiler):
    http://pyd.dsource.org/

    Bye,
    bearophile
  • William Dode at Jul 27, 2009 at 11:16 am

    On 27-07-2009, Bearophile wrote:
    William Dode':
    I updated the script (python, c and java) with your unrolled version
    + somes litle thinks. [...]
    c 1.85s
    gcj 2.15s
    java 2.8s
    python2.5 + psyco 3.1s
    unladen-2009Q2 145s (2m45)
    python2.5 254s (4m14s)
    python3.1 300s (5m)
    ironpython1.1.1 680s (11m20)
    Sorry for being late, I was away.

    In your last C version this code is useless because the C compiler is
    able to perform such simple optimization by itself (but probably
    Python isn't able, so if you want the code to be the similar in all
    versions it may be better to keep it):
    I wanted so, specialy if it doesn't change a lot of the result (the
    difference is so small that i cannot see it)...

    ...
    I have tried your latest C version using your compiler options, my
    MinGW based on GCC 4.3.2 produces a crash at runtime.
    Maybe because of -msse2 ?
    Using LLVM-GCC
    it runs in 1.31 seconds. The D version is a bit less optimized than
    your last C versions, yet using DMD it runs in 1.08-1.10 seconds.
    Let's see if someone is able to write a C version faster than that D
    code :-)

    Have you have compiled/read my D version? In the D version you may
    have missed that I did use an extra trick: unsigned integers, so it
    needs just two tests to see if a number is in the 0-5, 0-5 square :-)
    I didn't see, fine ! But the difference is also too small to see...
    Note that Pyd, the Python-D bridge, may work with the latest DMD
    version still (and it works if you use a bit older DMD compiler):
    http://pyd.dsource.org/
    I completly don't know anything about D... When i see the result of
    psyco or shedskin, i'm affraid i'll not look somewhere else soon !

    However, i'd like to see a lisp implementation of this...

    bye

    --
    William Dod? - http://flibuste.net
    Informaticien Ind?pendant
  • Skip at Jul 20, 2009 at 6:26 pm
    William> c 5s
    William> gcj 7s
    William> java 7s
    William> shedskin 8s
    William> python + psyco 18s
    William> cython avec malloc *int 18s
    William> cython 55s avec [] python
    William> python 303s (5m3s)

    I read just enough French to know that "avec" means "with", but I don't
    understand the difference between "avec malloc *int" and "avec []". Can you
    explain please?

    Thx,

    --
    Skip Montanaro - skip at pobox.com - http://www.smontanaro.net/
    That's more than a dress. That's an Audrey Hepburn movie. -- Jerry Maguire
  • Bearophile at Jul 20, 2009 at 6:51 pm

    Skip Montanaro:
    I read just enough French to know that "avec" means "with", but I don't
    understand the difference between "avec malloc *int" and "avec []". ?Can you
    explain please?
    Maybe it's the time difference between using a Python list from Cython
    and using a C "array" allocated with a malloc from Cython.

    Bye,
    bearophile
  • William Dode at Jul 21, 2009 at 8:20 am

    On 20-07-2009, Bearophile wrote:
    Skip Montanaro:
    I read just enough French to know that "avec" means "with", but I don't
    understand the difference between "avec malloc *int" and "avec []". ?Can you
    explain please?
    Maybe it's the time difference between using a Python list from Cython
    and using a C "array" allocated with a malloc from Cython.
    yes, it's this

    --
    William Dod? - http://flibuste.net
    Informaticien Ind?pendant
  • Greg at Jul 22, 2009 at 12:35 am

    William Dode wrote:

    I just tested it with a litle game, to find the places of horse on
    a board 5x5. The result is :

    cython avec malloc *int 18s
    Posting benchmark times for Pyrex or Cython is pretty
    meaningless without showing the exact code that was
    used, since times can vary enormously depending on
    how much you C-ify things.

    --
    Greg
  • Bearophile at Jul 22, 2009 at 6:48 am

    greg:
    Posting benchmark times for Pyrex or Cython is pretty
    meaningless without showing the exact code that was
    used, since times can vary enormously depending on
    how much you C-ify things.
    Was this link, shown by William, not enough?
    http://hg.flibuste.net/libre/games/cheval/file/46797c3a5136/chevalx.pyx#l1

    Bye,
    bearophile
  • Kurt Smith at Jul 23, 2009 at 1:25 am

    On Wed, Jul 22, 2009 at 2:48 AM, Bearophilewrote:
    greg:
    Posting benchmark times for Pyrex or Cython is pretty
    meaningless without showing the exact code that was
    used, since times can vary enormously depending on
    how much you C-ify things.
    Was this link, shown by William, not enough?
    http://hg.flibuste.net/libre/games/cheval/file/46797c3a5136/chevalx.pyx#l1
    I took a stab at converting the recent psyco-optimized code to cython,
    and got a speedup.

    gcj 4.3.3 1.39s
    gcc 4.3.3 1.55s
    cython 11.2 1.91s
    psyco 1.94s
    javac 1.5.0_19 2.00s
    python 2.5.4 168.37s

    It was just a matter of cdef-ing all the arrays & integers --
    bearophile already did the hard work :-)

    Here's the cython code; all the others are from the repo.

    #################################################
    DEF NMOVES = 8
    DEF SIDE = 5
    DEF SQR_SIDE = SIDE * SIDE

    cdef int circuit[SQR_SIDE]
    cdef int nsolutions = 0

    cdef int movex[NMOVES]
    cdef int movey[NMOVES]
    py_movex = [-1,-2,-2,-1,+1,+2,+2,+1]
    py_movey = [-2,-1,+1,+2,+2,+1,-1,-2]
    for i in range(NMOVES):
    movex[i] = py_movex[i]
    movey[i] = py_movey[i]
    shift = [x * SIDE + y for x,y in zip(py_movex, py_movey)]
    cdef int shift_0 = shift[0]
    cdef int shift_1 = shift[1]
    cdef int shift_2 = shift[2]
    cdef int shift_3 = shift[3]
    cdef int shift_4 = shift[4]
    cdef int shift_5 = shift[5]
    cdef int shift_6 = shift[6]
    cdef int shift_7 = shift[7]

    def showCircuit():
    print
    for x in xrange(SIDE):
    x_SIDE = x * SIDE
    for y in xrange(SIDE):
    if SQR_SIDE < 100:
    print "%02d " % circuit[x_SIDE + y],
    else:
    print "%03d " % circuit[x_SIDE + y],
    print

    cdef void solve(int nb, int x, int y,
    int SIDE=SIDE, int SQR_SIDE=SQR_SIDE, int *circuit=circuit,
    int shift_0=shift_0,
    int shift_1=shift_1,
    int shift_2=shift_2,
    int shift_3=shift_3,
    int shift_4=shift_4,
    int shift_5=shift_5,
    int shift_6=shift_6,
    int shift_7=shift_7,
    ):
    global nsolutions

    cdef int newx, newy
    cdef int pos = x * SIDE + y
    circuit[pos] = nb
    if nb == SQR_SIDE:
    #showCircuit()
    nsolutions += 1
    circuit[pos] = 0
    return

    newx = x + -1
    if newx >= 0 and newx < SIDE:
    newy = y + -2
    if newy >= 0 and newy < SIDE and not circuit[pos + shift_0]:
    solve(nb+1, newx, newy)

    newx = x + -2
    if newx >= 0 and newx < SIDE:
    newy = y + -1
    if newy >= 0 and newy < SIDE and not circuit[pos + shift_1]:
    solve(nb+1, newx, newy)

    newx = x + -2
    if newx >= 0 and newx < SIDE:
    newy = y + 1
    if newy >= 0 and newy < SIDE and not circuit[pos + shift_2]:
    solve(nb+1, newx, newy)

    newx = x + -1
    if newx >= 0 and newx < SIDE:
    newy = y + 2
    if newy >= 0 and newy < SIDE and not circuit[pos + shift_3]:
    solve(nb+1, newx, newy)

    newx = x + 1
    if newx >= 0 and newx < SIDE:
    newy = y + 2
    if newy >= 0 and newy < SIDE and not circuit[pos + shift_4]:
    solve(nb+1, newx, newy)

    newx = x + 2
    if newx >= 0 and newx < SIDE:
    newy = y + 1
    if newy >= 0 and newy < SIDE and not circuit[pos + shift_5]:
    solve(nb+1, newx, newy)

    newx = x + 2
    if newx >= 0 and newx < SIDE:
    newy = y + -1
    if newy >= 0 and newy < SIDE and not circuit[pos + shift_6]:
    solve(nb+1, newx, newy)

    newx = x + 1
    if newx >= 0 and newx < SIDE:
    newy = y + -2
    if newy >= 0 and newy < SIDE and not circuit[pos + shift_7]:
    solve(nb+1, newx, newy)

    circuit[pos] = 0

    def main():
    print "Search for side=%d" % SIDE
    cdef int x,y
    for x in range(SIDE):
    for y in range(SIDE):
    solve(1, x, y);
    print "\n%dx%d case, %d solutions." % (SIDE, SIDE, nsolutions)

    def run():
    import time
    s=time.time()
    main()
    print time.time()-s
    #################################################
  • William Dode at Jul 23, 2009 at 6:44 am

    On 23-07-2009, Kurt Smith wrote:
    On Wed, Jul 22, 2009 at 2:48 AM, Bearophilewrote:
    greg:
    Posting benchmark times for Pyrex or Cython is pretty
    meaningless without showing the exact code that was
    used, since times can vary enormously depending on
    how much you C-ify things.
    Was this link, shown by William, not enough?
    http://hg.flibuste.net/libre/games/cheval/file/46797c3a5136/chevalx.pyx#l1
    I took a stab at converting the recent psyco-optimized code to cython,
    and got a speedup.
    thanks, i updated my repo with litle more tips.

    --
    William Dod? - http://flibuste.net
    Informaticien Ind?pendant
  • Greg at Jul 24, 2009 at 11:24 pm

    Bearophile wrote:

    Was this link, shown by William, not enough?
    http://hg.flibuste.net/libre/games/cheval/file/46797c3a5136/chevalx.pyx#l1
    Yes, sorry, I posted too soon.

    --
    Greg
  • Nick Craig-Wood at Jul 21, 2009 at 12:29 pm

    Mark Dufour wrote:
    I have just released version 0.2 of Shed Skin, an experimental
    (restricted) Python-to-C++ compiler (http://shedskin.googlecode.com).
    It comes with 7 new example programs (for a total of 40 example
    programs, at over 12,000 lines) and several important improvements/bug
    fixes. See http://code.google.com/p/shedskin/wiki/ReleaseNotes for the
    full changelog.
    Cool!

    I tried it on a program I wrote to solve a puzzle (Rush Hour).
    (Actually it solves the meta-puzzle - trying to make the hardest
    possible Rush Hour puzzle.)

    After a bit of twiddling (remove psyco and profiling) I got it to
    start compiling, but unfortunately it compiled for about an hour (in
    iterative type analysis..) used up 600 MB of RAM printing an '*' every
    10 minutes or so then gave an error message and gave up.

    Unfortunately I shut the window by accident, but the error message was
    something about not being able to resolve types I think with a list of
    20 or so unresolved types.

    Can you give a hint as to how I debug this? I presume my program has
    some instances of non static types which is causing the problem, but
    it is going to be a very long debugging session if it takes me an hour
    each cycle ;-)

    The program is about 700 lines of python (excluding comments).

    Thanks

    Nick
    --
    Nick Craig-Wood <nick at craig-wood.com> -- http://www.craig-wood.com/nick
  • Bearophile at Jul 21, 2009 at 2:46 pm

    Nick Craig-Wood:
    Can you give a hint as to how I debug this? ?I presume my program has
    some instances of non static types which is causing the problem, but
    it is going to be a very long debugging session if it takes me an hour
    each cycle ;-)

    The program is about 700 lines of python (excluding comments).
    You can show us the Python (SSPython) code, and we can try to find the
    problem. Sometimes there's no simple ways to solve such problems.

    Generally for not very large progrograms if SS doesn't compile in
    about a minute or so then it's gone in infinite loop (there's a
    compilation flag that avoids some infinite loops, try it).

    Bye,
    bearophile
  • Terry Reedy at Jul 21, 2009 at 6:26 pm

    Nick Craig-Wood wrote:
    I tried it on a program I wrote to solve a puzzle (Rush Hour).
    (Actually it solves the meta-puzzle - trying to make the hardest
    possible Rush Hour puzzle.)

    After a bit of twiddling (remove psyco and profiling) I got it to
    start compiling, but unfortunately it compiled for about an hour (in
    iterative type analysis..) used up 600 MB of RAM printing an '*' every
    10 minutes or so then gave an error message and gave up.

    Unfortunately I shut the window by accident, but the error message was
    something about not being able to resolve types I think with a list of
    20 or so unresolved types.

    Can you give a hint as to how I debug this? I presume my program has
    some instances of non static types which is causing the problem, but
    it is going to be a very long debugging session if it takes me an hour
    each cycle ;-)

    The program is about 700 lines of python (excluding comments).
    Split it into pieces and compile each separately.
    Recurse.

    tjr
  • Srepmub at Jul 22, 2009 at 12:55 pm
    please send any program that doesn't work with shedskin (it still is
    experimental after all) to me, or create an issue at
    shedskin.googlecode.com, and I will have a look at the problem.


    thanks,
    mark.
  • William Dode at Jul 22, 2009 at 12:59 pm

    On 22-07-2009, srepmub wrote:
    please send any program that doesn't work with shedskin (it still is
    experimental after all) to me, or create an issue at
    shedskin.googlecode.com, and I will have a look at the problem.
    I did it, on the discussion group
    http://groups.google.com/group/shedskin-discuss/browse_thread/thread/c1f47a7c21897b44

    --
    William Dod? - http://flibuste.net
    Informaticien Ind?pendant
  • Nick Craig-Wood at Jul 23, 2009 at 11:30 am

    srepmub wrote:
    please send any program that doesn't work with shedskin (it still is
    experimental after all) to me, or create an issue at
    shedskin.googlecode.com, and I will have a look at the problem.
    I divided and conquered the program as suggested and eventually I got
    it to compile and run correctly :-)

    I learnt that if you have lots of variables with indeterminate type
    then shedskin takes a very long time indeed before blowing up!

    I also learnt that shedskin doesn't support the idiom I'd been using
    for creating shallow copies, namely the Board.__new__(Board) below.
    shedskin compiles it ok, but the C++ won't compile complaning about
    not being able to find __init__ methods

    Producing these warnings

    *WARNING* rush_hour_solver_cut_down.py:71: class 'Vehicle' has no method '__new__'
    *WARNING* rush_hour_solver_cut_down.py:72: variable 'new' has no type
    *WARNING* rush_hour_solver_cut_down.py:236: variable 'new_vehicle' has no type

    And these compile errors

    rush_hour_solver_cut_down.cpp:94: error: ?__new__? is not a member of ?__rush_hour_solver_cut_down__::Vehicle?
    rush_hour_solver_cut_down.cpp:95: error: expected type-specifier before ?;? token
    rush_hour_solver_cut_down.cpp: In member function ?void* __rush_hour_solver_cut_down__::Board::move(int, int)?:
    rush_hour_solver_cut_down.cpp:276: error: ?void*? is not a pointer-to-object type
    rush_hour_solver_cut_down.cpp:276: error: ?void*? is not a pointer-to-object type
    rush_hour_solver_cut_down.cpp:279: error: ?void*? is not a pointer-to-object type
    rush_hour_solver_cut_down.cpp:279: error: ?void*? is not a pointer-to-object type
    rush_hour_solver_cut_down.cpp:281: error: invalid conversion from ?void*? to ?__rush_hour_solver_cut_down__::Vehicle*?


    def copy(self):
    new = Board.__new__(Board)
    new.me_x = self.me_x
    new.me_y = self.me_y
    new.depth = self.depth
    new.parent = self
    new.best_child = None
    new.board = [self.board[i][:] for i in range(WIDTH)]
    new.rep = self.rep[:]
    new.vehicles = self.vehicles[:]
    return new

    I changed to using copy.copy which did work, but I couldn't name my
    copy methods "copy" otherwise I got this error from the C++ compile

    rush_hour_solver_cut_down.cpp: In member function '__rush_hour_solver_cut_down__::Vehicle* __rush_hour_solver_cut_down__::Vehicle::copy()':
    rush_hour_solver_cut_down.cpp:94: error: no matching function for call to '__rush_hour_solver_cut_down__::Vehicle::copy(__rush_hour_solver_cut_down__::Vehicle* const)'
    rush_hour_solver_cut_down.cpp:89: note: candidates are: __rush_hour_solver_cut_down__::Vehicle* __rush_hour_solver_cut_down__::Vehicle::copy()
    rush_hour_solver_cut_down.cpp: In member function '__rush_hour_solver_cut_down__::Board* __rush_hour_solver_cut_down__::Board::copy()':
    rush_hour_solver_cut_down.cpp:135: error: no matching function for call to '__rush_hour_solver_cut_down__::Board::copy(__rush_hour_solver_cut_down__::Board* const)'
    rush_hour_solver_cut_down.cpp:129: note: candidates are: __rush_hour_solver_cut_down__::Board* __rush_hour_solver_cut_down__::Board::copy()

    So I renamed them to pycopy, and they ended up looking like

    def pycopy(self):
    new = copy(self)
    new.parent = self
    new.best_child = None
    new.board = [self.board[i][:] for i in range(WIDTH)]
    new.rep = self.rep[:]
    new.vehicles = self.vehicles[:]
    return new

    After all that - some timing results!

    Python: 9.3 seconds
    Psyco: 5.8 seconds
    ShedSkin: 1.0 seconds

    Impressive!

    I put the code http://www.craig-wood.com/nick/pub/rush_hour_solver_cut_down.py

    I left in the commented out bits of code I had to change.

    This is only part of the project (375 lines) - it solves Rush Hour
    boards. There is another part which I haven't attempted to compile
    yet which finds the most difficult possible boards using a combination
    of back tracking and a genetic algorithm.

    --
    Nick Craig-Wood <nick at craig-wood.com> -- http://www.craig-wood.com/nick

Related Discussions

Discussion Navigation
viewthread | post
Discussion Overview
grouppython-list @
categoriespython
postedJul 19, '09 at 2:30p
activeJul 27, '09 at 11:16a
posts32
users12
websitepython.org

People

Translate

site design / logo © 2022 Grokbase