Tim Peters wrote:
> [see the main list on idle leaks]
> if-pystone-works-ship-it-ly y'rs - tim

Well, at the request of uncle Timmy, here it is, although it's very early.

A preview of stackless Python can be found under

ftp://ftp.pns.cc/pub/stackless_990606.zip

Current status:
The main interpreter is completely stackless.
Just for fun, I've set max recursion depth to 30000,
so just try it.
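As an illustration of what the stackless change buys (a hypothetical sketch in modern Python, not code from the patch): in standard CPython each Python-level call also recurses on the C stack, so the recursion limit has to stay small; decoupling the two is what makes a limit like 30000 practical.

```python
import sys

# Hypothetical illustration (modern Python, not from the 1999 patch):
# in the standard interpreter, Python recursion is tied to the C stack,
# so the recursion limit defaults to a small, safe value.  A stackless
# interpreter removes that coupling, which is what makes a limit of
# 30000 practical.
def depth(n):
    # simple linear recursion, one Python frame per step
    if n == 0:
        return 0
    return 1 + depth(n - 1)

sys.setrecursionlimit(6000)   # a modest bump; the patch tried 30000
result = depth(5000)          # exceeds the default limit of 1000
```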

PyStone does of course run. My measurements were about
3-5 percent slower than with standard Python. I think
this is quite fair.

As a side effect, the exec statement now behaves
better than before:
exec <anything>
without globals and locals now updates the current
environment, which previously worked only for exec "string".
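A rough sketch of the behaviour being described, in modern Python (where exec is a function rather than a statement): exec of a pre-compiled code object, not only a string, updates the namespace it is given.

```python
# Sketch of the behaviour described above, in modern Python (where exec
# is a function rather than a statement): exec of a compiled code
# object, i.e. "exec <anything>", not just a string, updates the
# namespace it is given.
ns = {}
code = compile("x = 6 * 7", "<example>", "exec")
exec(code, ns)         # a code object, not a string
exec("y = x + 1", ns)  # the string form, for comparison
```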

Most of the Run_<thing> functions are stackless as well.

Almost all cases could be treated tail-recursively.

I have just begun to work on the builtins, and there is
a very bloody, new-born stackless map, which seems to
behave quite well. (It is just an hour old, so don't blame me
if I didn't get all the refcounts right.)

This is a first special case, since I *had* to build a
tiny interpreter from the old map code.
Still quite hacky, but not so bad.
It creates its own frame and bails out whenever it needs
to call the interpreter. If not, it stays in the loop.

Since this one is so fresh, the old map is still
there, and the new one has the name "map_nr".
As a little bonus, map_nr now also shows up in a
traceback. I've set the line number to the iteration count.
Beware, this is just a proof of concept and will
most probably change.

Further plans:
I will make the other builtins stackless as well
(reduce, filter), also the simple tail-recursive
ones which I didn't do now due to lack of time.

I think I will *not* think about stackless imports.
After looking into this for a while, I think this
is rather hairy, and also not necessary.

On extensions:
There will be a coroutine extension in a few days.
This is now nearly a no-brainer, since I did the
stackless Python with exactly that in mind.
This is the real fruit I'm after, so please
let me pick it :)

Documentation:
Besides the few new comments, there is nothing yet.

Diff files:
Sorry, there are no diffs but just the modified
files. I had no time to do them now. All files stem
from the official Python 1.5.2 release.

You might wonder about the version:
In order to support extension modules which rely on
some special new features of frames, I decided
to name this Python "1.5.42", since I believe
it will be useful to at least "four two" people. :-)

I consider this an Alpha 1 version.

fearing the feedback :-) ciao - chris

--
Christian Tismer :^)
Applied Biometrics GmbH : Have a break! Take a ride on Python's
Kaiserin-Augusta-Allee 101 : *Starship* http://starship.python.net
10553 Berlin : PGP key -> http://wwwkeys.pgp.net
PGP Fingerprint E182 71C7 1A9D 66E9 9D15 D3CC D4D7 93E2 1FAE F6DF
we're tired of banana software - shipped green, ripens at home


From da@ski.org Mon Jun 7 17:43:09 1999
From: da@ski.org (David Ascher)
Date: Mon, 7 Jun 1999 09:43:09 -0700 (Pacific Daylight Time)
Subject: [Python-Dev] ActiveState & fork & Perl
Message-ID: <Pine.WNT.4.04.9906070938570.193-100000@rigoletto.ski.org>

In case you haven't heard about it, ActiveState has recently signed a
contract with Microsoft to do some work on Perl on win32.

One interesting aspect of this for Python is the specific work being
performed. From the FAQ on this joint effort, one gets, under "What is
the scope of the work that is being done?":

fork()

This implementation of fork() will clone the running interpreter
and create a new interpreter with its own thread, but running in the
same process space. The goal is to achieve functional equivalence to
fork() on UNIX systems without suffering the performance hit of the
process creation overhead on Win32 platforms.

Emulating fork() within a single process needs the ability to run
multiple interpreters concurrently in separate threads. Perl version
5.005 has experimental support for this in the form of the PERL_OBJECT
build option, but it has some shortcomings. PERL_OBJECT needs a C++
compiler, and currently only works on Windows. ActiveState will be
working to provide revamped support for the PERL_OBJECT
functionality that will run on every platform that Perl will build on,
and will no longer require C++ to work. This means that other operating
systems that lack fork() but have support for threads (such as VMS and
MacOS) will benefit from this aspect of the work.

Any guesses as to whether we could hijack this work if/when it is released
as Open Source?
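For readers unfamiliar with the call being emulated, here is a minimal sketch of the Unix fork() idiom the FAQ describes (POSIX-only, ordinary os.fork usage; nothing here is from the ActiveState work):

```python
import os

# Minimal sketch of the Unix fork() idiom being emulated (POSIX only;
# ordinary os.fork usage, not the ActiveState work).  fork() returns
# twice: 0 in the child, the child's pid in the parent.
def run_in_child(func):
    pid = os.fork()
    if pid == 0:          # child: do the work, then exit immediately
        func()
        os._exit(0)
    _, status = os.waitpid(pid, 0)   # parent: reap the child
    return status

status = run_in_child(lambda: None)  # 0 on a clean child exit
```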

--david



From guido@CNRI.Reston.VA.US Mon Jun 7 17:49:27 1999
From: guido@CNRI.Reston.VA.US (Guido van Rossum)
Date: Mon, 07 Jun 1999 12:49:27 -0400
Subject: [Python-Dev] ActiveState & fork & Perl
In-Reply-To: Your message of "Mon, 07 Jun 1999 09:43:09 PDT."
<Pine.WNT.4.04.9906070938570.193-100000@rigoletto.ski.org>
References: <Pine.WNT.4.04.9906070938570.193-100000@rigoletto.ski.org>
Message-ID: <199906071649.MAA12619@eric.cnri.reston.va.us>
> In case you haven't heard about it, ActiveState has recently signed a
> contract with Microsoft to do some work on Perl on win32.

Have I ever heard of it! :-) David Grove pulled me into one of his
bouts of paranoia. I think he's calmed down for the moment.

> One interesting aspect of this for Python is the specific work being
> performed. From the FAQ on this joint effort, one gets, under "What is
> the scope of the work that is being done?":

> fork()
>
> This implementation of fork() will clone the running interpreter
> and create a new interpreter with its own thread, but running in the
> same process space. The goal is to achieve functional equivalence to
> fork() on UNIX systems without suffering the performance hit of the
> process creation overhead on Win32 platforms.
>
> Emulating fork() within a single process needs the ability to run
> multiple interpreters concurrently in separate threads. Perl version
> 5.005 has experimental support for this in the form of the PERL_OBJECT
> build option, but it has some shortcomings. PERL_OBJECT needs a C++
> compiler, and currently only works on Windows. ActiveState will be
> working to provide revamped support for the PERL_OBJECT
> functionality that will run on every platform that Perl will build on,
> and will no longer require C++ to work. This means that other operating
> systems that lack fork() but have support for threads (such as VMS and
> MacOS) will benefit from this aspect of the work.
>
> Any guesses as to whether we could hijack this work if/when it is released
> as Open Source?
When I saw this, my own response was simply "those poor Perl suckers
are relying too much on fork()." Am I wrong, and is this also a habit
of Python programmers?

Anyway, I doubt that we could use their code, as it undoubtedly
refers to reimplementing fork() at the Perl level, not at the C level
(which would be much harder).

--Guido van Rossum (home page: http://www.python.org/~guido/)


From da@ski.org Mon Jun 7 17:51:45 1999
From: da@ski.org (David Ascher)
Date: Mon, 7 Jun 1999 09:51:45 -0700 (Pacific Daylight Time)
Subject: [Python-Dev] ActiveState & fork & Perl
In-Reply-To: <199906071649.MAA12619@eric.cnri.reston.va.us>
Message-ID: <Pine.WNT.4.04.9906070949490.193-100000@rigoletto.ski.org>
On Mon, 7 Jun 1999, Guido van Rossum wrote:

> When I saw this, my own response was simply "those poor Perl suckers
> are relying too much on fork()." Am I wrong, and is this also a habit
> of Python programmers?

Well, I find the fork() model to be a very simple one to use, much easier
to manage than threads or full-fledged IPC. So, while I don't rely on it
in any crucial way, it's quite convenient at times.

--david



From guido@CNRI.Reston.VA.US Mon Jun 7 17:56:22 1999
From: guido@CNRI.Reston.VA.US (Guido van Rossum)
Date: Mon, 07 Jun 1999 12:56:22 -0400
Subject: [Python-Dev] ActiveState & fork & Perl
In-Reply-To: Your message of "Mon, 07 Jun 1999 09:51:45 PDT."
<Pine.WNT.4.04.9906070949490.193-100000@rigoletto.ski.org>
References: <Pine.WNT.4.04.9906070949490.193-100000@rigoletto.ski.org>
Message-ID: <199906071656.MAA12642@eric.cnri.reston.va.us>
> Well, I find the fork() model to be a very simple one to use, much easier
> to manage than threads or full-fledged IPC. So, while I don't rely on it
> in any crucial way, it's quite convenient at times.

Can you give a typical example where you use it, or is this just a gut
feeling?

It's also dangerous -- e.g. unexpected errors may percolate down the
wrong stack (many mailman bugs had to do with forking), GUI apps
generally won't be cloned, and some extension libraries don't like to
be cloned either (e.g. ILU).

--Guido van Rossum (home page: http://www.python.org/~guido/)



From da@ski.org Mon Jun 7 18:02:31 1999
From: da@ski.org (David Ascher)
Date: Mon, 7 Jun 1999 10:02:31 -0700 (Pacific Daylight Time)
Subject: [Python-Dev] ActiveState & fork & Perl
In-Reply-To: <199906071656.MAA12642@eric.cnri.reston.va.us>
Message-ID: <Pine.WNT.4.04.9906070957270.193-100000@rigoletto.ski.org>
On Mon, 7 Jun 1999, Guido van Rossum wrote:

> Can you give a typical example where you use it, or is this just a gut
> feeling?

Well, the latest example was that I wanted to spawn a Python process to do
viewing of NumPy arrays with Tk from within the Python interactive shell
(without using a shell wrapper). It's trivial with a fork(), and
non-trivial with threads. The solution I finally settled on was to branch
based on OS and do threads where threads are available and fork()
otherwise. Likely 2.05 times as many errors as with a single solution =).
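The branch being described might look something like this (a hedged sketch in modern Python; `run` is a stand-in for the hypothetical viewer code being spawned):

```python
import os
import threading

# Hedged sketch of the workaround described above: fork() where it
# exists, a thread elsewhere.  run is a stand-in for the hypothetical
# viewer code being spawned.
def spawn(run):
    if hasattr(os, "fork"):            # Unix path
        pid = os.fork()
        if pid == 0:                   # child: do the work, then exit
            run()
            os._exit(0)
        os.waitpid(pid, 0)             # parent: reap the child
        return "fork"
    t = threading.Thread(target=run)   # Windows/Mac path
    t.start()
    t.join()
    return "thread"

mode = spawn(lambda: None)
```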
> It's also dangerous -- e.g. unexpected errors may percolate down the
> wrong stack (many mailman bugs had to do with forking), GUI apps
> generally won't be cloned, and some extension libraries don't like to
> be cloned either (e.g. ILU).

More dangerous than threads? Bwaaahaahaa! =). fork() might be
"deceivingly simple in appearance", I grant you that. But sometimes
that's good enough.

It's also possible that fork() without all of its process-handling
relatives isn't useful enough to warrant the effort.

--david



From bwarsaw@cnri.reston.va.us (Barry A. Warsaw) Mon Jun 7 18:05:20 1999
From: bwarsaw@cnri.reston.va.us (Barry A. Warsaw)
Date: Mon, 7 Jun 1999 13:05:20 -0400 (EDT)
Subject: [Python-Dev] ActiveState & fork & Perl
References: <Pine.WNT.4.04.9906070949490.193-100000@rigoletto.ski.org>
<199906071656.MAA12642@eric.cnri.reston.va.us>
Message-ID: <14171.64464.805578.325069@anthem.cnri.reston.va.us>
"Guido" == Guido van Rossum <guido@cnri.reston.va.us> writes:
Guido> It's also dangerous -- e.g. unexpected errors may percolate
Guido> down the wrong stack (many mailman bugs had to do with
Guido> forking), GUI apps generally won't be cloned, and some
Guido> extension libraries don't like to be cloned either
Guido> (e.g. ILU).

Rambling mode on...

Okay, so you can't guarantee that fork will be everywhere you might
want to run an application. For example, that's one of the main
reasons Mailman hasn't been ported off of Un*x. But you also can't
guarantee that threads will be everywhere either. One of the things
I'd (eventually) like to do is to re-architect Mailman so that it uses
a threaded central server instead of the current one-shot process
model. But there's been debate among the developers because 1)
threads aren't supported everywhere, and 2) thread support isn't
built-in by default anyway.

I wonder if it's feasible or useful to promote threading support in
Python? Thoughts would include building threads in by default if
possible on the platform, integrating Greg's free threading mods,
etc. Providing more integrated support for threads might encourage
programmers to reach for that particular tool instead of fork, which
is crude, but pretty damn handy and easy to use.

Rambling mode off...

-Barry


From jim@digicool.com Mon Jun 7 18:07:59 1999
From: jim@digicool.com (Jim Fulton)
Date: Mon, 07 Jun 1999 13:07:59 -0400
Subject: [Python-Dev] ActiveState & fork & Perl
References: <Pine.WNT.4.04.9906070949490.193-100000@rigoletto.ski.org>
Message-ID: <375BFC6F.BF779796@digicool.com>


David Ascher wrote:
> On Mon, 7 Jun 1999, Guido van Rossum wrote:
>
> > When I saw this, my own response was simply "those poor Perl suckers
> > are relying too much on fork()." Am I wrong, and is this also a habit
> > of Python programmers?
>
> Well, I find the fork() model to be a very simple one to use, much easier
> to manage than threads or full-fledged IPC. So, while I don't rely on it
> in any crucial way, it's quite convenient at times.

Interesting. I prefer threads because they eliminate the *need*
for IPC. I find locks and the various interesting things you can build
from them to be much easier to deal with and more elegant than IPC.
I wonder if the Perl folks are also going to emulate doing IPC in the
same process. Hee hee. :)

Jim

--
Jim Fulton mailto:jim@digicool.com Python Powered!
Technical Director (888) 344-4332 http://www.python.org
Digital Creations http://www.digicool.com http://www.zope.org

Under US Code Title 47, Sec.227(b)(1)(C), Sec.227(a)(2)(B) This email
address may not be added to any commercial mail list with out my
permission. Violation of my privacy with advertising or SPAM will
result in a suit for a MINIMUM of $500 damages/incident, $1500 for
repeats.


From da@ski.org Mon Jun 7 18:10:56 1999
From: da@ski.org (David Ascher)
Date: Mon, 7 Jun 1999 10:10:56 -0700 (Pacific Daylight Time)
Subject: [Python-Dev] ActiveState & fork & Perl
In-Reply-To: <14171.64464.805578.325069@anthem.cnri.reston.va.us>
Message-ID: <Pine.WNT.4.04.9906071006030.193-100000@rigoletto.ski.org>
On Mon, 7 Jun 1999, Barry A. Warsaw wrote:

> I wonder if it's feasible or useful to promote threading support in
> Python? Thoughts would include building threads in by default if
> possible on the platform,

That seems a good idea to me. It's a relatively safe thing to enable by
default, no?

> Providing more integrated support for threads might encourage
> programmers to reach for that particular tool instead of fork, which
> is crude, but pretty damn handy and easy to use.

While we're at it, it'd be nice if we could provide a better answer when
someone asks (as "they" often do) "how do I program with threads in
Python" than our usual "the way you'd do it in C". Threading tutorials
are very hard to come by, I've found (I got the O'Reilly book on multi-threaded
programming in Win32, but it's such a monster I've barely looked at it). I
suggest that we allocate about 10% of TimBot's time to that task. If
necessary, we can upgrade it to a dual-CPU setup. With Greg's threading
patches, we could even get it to run on both CPUs efficiently. It could
write about itself. <unplug>

--david



From akuchlin@mems-exchange.org Mon Jun 7 18:20:15 1999
From: akuchlin@mems-exchange.org (Andrew M. Kuchling)
Date: Mon, 7 Jun 1999 13:20:15 -0400 (EDT)
Subject: [Python-Dev] ActiveState & fork & Perl
In-Reply-To: <Pine.WNT.4.04.9906071006030.193-100000@rigoletto.ski.org>
References: <14171.64464.805578.325069@anthem.cnri.reston.va.us>
<Pine.WNT.4.04.9906071006030.193-100000@rigoletto.ski.org>
Message-ID: <14171.65359.306743.276505@amarok.cnri.reston.va.us>

David Ascher writes:
> While we're at it, it'd be nice if we could provide a better answer when
> someone asks (as "they" often do) "how do I program with threads in
> Python" than our usual "the way you'd do it in C". Threading tutorials
> are very hard to come by, I've found (I got the ORA multi-threaded

Agreed; I'd love to see a HOWTO on thread programming. I really
liked Andrew Birrell's introduction to threads for Modula-3; see
http://gatekeeper.dec.com/pub/DEC/SRC/research-reports/abstracts/src-rr-035.html
(Postscript and PDF versions available.) Translating its approach to
Python would be an excellent starting point.

--
A.M. Kuchling http://starship.python.net/crew/amk/
"If you had stayed with us, we could have given you life until death."
"Don't I get that anyway?"
-- Stheno and Lyta Hall, in SANDMAN #61: "The Kindly Ones:5"



From guido@CNRI.Reston.VA.US Mon Jun 7 18:24:45 1999
From: guido@CNRI.Reston.VA.US (Guido van Rossum)
Date: Mon, 07 Jun 1999 13:24:45 -0400
Subject: [Python-Dev] ActiveState & fork & Perl
In-Reply-To: Your message of "Mon, 07 Jun 1999 13:20:15 EDT."
<14171.65359.306743.276505@amarok.cnri.reston.va.us>
References: <14171.64464.805578.325069@anthem.cnri.reston.va.us> <Pine.WNT.4.04.9906071006030.193-100000@rigoletto.ski.org>
<14171.65359.306743.276505@amarok.cnri.reston.va.us>
Message-ID: <199906071724.NAA12743@eric.cnri.reston.va.us>
David Ascher writes:
> > While we're at it, it'd be nice if we could provide a better answer when
> > someone asks (as "they" often do) "how do I program with threads in
> > Python" than our usual "the way you'd do it in C". Threading tutorials
> > are very hard to come by, I've found (I got the ORA multi-threaded

Andrew Kuchling chimes in:
> Agreed; I'd love to see a HOWTO on thread programming. I really
> liked Andrew Birrell's introduction to threads for Modula-3; see
> http://gatekeeper.dec.com/pub/DEC/SRC/research-reports/abstracts/src-rr-035.html
> (Postscript and PDF versions available.) Translating its approach to
> Python would be an excellent starting point.

Another idea is for someone to finish the thread tutorial that I
started early 1998 (and never finished because I realized that it
needed the threading module and some thread-safety patches to urllib
for the examples I had in mind to work). It's actually on the website
(but unlinked-to): http://www.python.org/doc/essays/threads.html

--Guido van Rossum (home page: http://www.python.org/~guido/)


From jeremy@cnri.reston.va.us Mon Jun 7 18:28:57 1999
From: jeremy@cnri.reston.va.us (Jeremy Hylton)
Date: Mon, 7 Jun 1999 13:28:57 -0400 (EDT)
Subject: [Python-Dev] ActiveState & fork & Perl
In-Reply-To: <199906071724.NAA12743@eric.cnri.reston.va.us>
References: <14171.64464.805578.325069@anthem.cnri.reston.va.us>
<Pine.WNT.4.04.9906071006030.193-100000@rigoletto.ski.org>
<14171.65359.306743.276505@amarok.cnri.reston.va.us>
<199906071724.NAA12743@eric.cnri.reston.va.us>
Message-ID: <14172.289.552901.264826@bitdiddle.cnri.reston.va.us>

Indeed, it might be better to start with the threading module for the
first tutorial. While I'm also a fan of Birrell's paper, it would
encourage people to start with the low-level thread module, instead of
the higher-level threading module.

So the right answer, of course, is to do both!

Jeremy


From bwarsaw@python.org Mon Jun 7 18:36:05 1999
From: bwarsaw@python.org (Barry A. Warsaw)
Date: Mon, 7 Jun 1999 13:36:05 -0400 (EDT)
Subject: [Python-Dev] ActiveState & fork & Perl
References: <14171.64464.805578.325069@anthem.cnri.reston.va.us>
<Pine.WNT.4.04.9906071006030.193-100000@rigoletto.ski.org>
Message-ID: <14172.773.807413.412693@anthem.cnri.reston.va.us>
"DA" == David Ascher <da@ski.org> writes:
I wonder if it's feasible or useful to promote threading
support in Python? Thoughts would include building threads in
by default if possible on the platform,
DA> That seems a good idea to me. It's a relatively safe thing to
DA> enable by default, no?

Don't know how hard it would be to write the appropriate configure
tests, but then again, if it was easy I'd'a figured Guido would have
done it already.

A simple thing would be to change the default sense of "Do we build in
thread support?". Make this true by default, and add a
--without-threads configure flag people can use to turn them off.
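The proposed interface might look like this (a sketch only; the flag name is the proposal's, not an option that existed in Python 1.5.2):

```shell
# Sketch of the proposed configure interface; the flag name is the
# proposal's, not an option that existed in Python 1.5.2.
./configure                     # threads on by default
./configure --without-threads   # explicit opt-out
make
```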

-Barry


From skip@mojam.com (Skip Montanaro) Mon Jun 7 23:37:38 1999
From: skip@mojam.com (Skip Montanaro)
Date: Mon, 7 Jun 1999 18:37:38 -0400 (EDT)
Subject: [Python-Dev] ActiveState & fork & Perl
In-Reply-To: <14172.773.807413.412693@anthem.cnri.reston.va.us>
References: <14171.64464.805578.325069@anthem.cnri.reston.va.us>
<Pine.WNT.4.04.9906071006030.193-100000@rigoletto.ski.org>
<14172.773.807413.412693@anthem.cnri.reston.va.us>
Message-ID: <14172.18567.631562.79944@cm-24-29-94-19.nycap.rr.com>

BAW> A simple thing would be to change the default sense of "Do we build
BAW> in thread support?". Make this true by default, and add a
BAW> --without-threads configure flag people can use to turn them off.

True enough, but as Guido pointed out, enabling threads by default would
immediately make the Mac a second-class citizen. Test cases and demos would
eventually find their way into the distribution that Mac users could not
run, etc., etc. It may not account for a huge fraction of the Python
development seats, but it seems a shame to leave it out in the cold. Has
there been an assessment of how hard it would be to add thread support to
the Mac? On a scale of 1 to 10 (1: we know how, but it's not implemented
because nobody's needed it so far, 10: drilling for oil on the sun would be
easier), how hard would it be? I assume Jack Jansen is on this list. Jack,
any thoughts? Alpha code? Pre-alpha code?

Skip Montanaro | Mojam: "Uniting the World of Music" http://www.mojam.com/
skip@mojam.com | Musi-Cal: http://www.musi-cal.com/
518-372-5583



From da@ski.org Mon Jun 7 23:43:32 1999
From: da@ski.org (David Ascher)
Date: Mon, 7 Jun 1999 15:43:32 -0700 (Pacific Daylight Time)
Subject: [Python-Dev] ActiveState & fork & Perl
In-Reply-To: <14172.18567.631562.79944@cm-24-29-94-19.nycap.rr.com>
Message-ID: <Pine.WNT.4.04.9906071539310.193-100000@rigoletto.ski.org>
On Mon, 7 Jun 1999, Skip Montanaro wrote:

> True enough, but as Guido pointed out, enabling threads by default would
> immediately make the Mac a second-class citizen. Test cases and demos would
> eventually find their way into the distribution that Mac users could not
> run, etc., etc. It may not account for a huge fraction of the Python
> development seats, but it seems a shame to leave it out in the cold.

I'm not sure I buy that argument. There are already thread demos in the
current directory, and no one complains. The windows builds are already
threaded by default, and it's not caused any problems that I know of.
Think of it like enabling the *new* module. =)

> Has there been an assessment of how hard it would be to add thread
> support to the Mac?

That's an interesting question, especially since ActiveState lists it as a
machine w/ threads and w/o fork().

--david



From skip@mojam.com (Skip Montanaro) Mon Jun 7 23:49:12 1999
From: skip@mojam.com (Skip Montanaro)
Date: Mon, 7 Jun 1999 18:49:12 -0400 (EDT)
Subject: [Python-Dev] ActiveState & fork & Perl
In-Reply-To: <Pine.WNT.4.04.9906071539310.193-100000@rigoletto.ski.org>
References: <14172.18567.631562.79944@cm-24-29-94-19.nycap.rr.com>
<Pine.WNT.4.04.9906071539310.193-100000@rigoletto.ski.org>
Message-ID: <14172.19387.516361.546698@cm-24-29-94-19.nycap.rr.com>

David> I'm not sure I buy that argument. Think of it like enabling the
David> *new* module. =)

That's not quite the same thing. The new module simply exposes some
normally closed-from-Python-code data structures to the Python programmer.
Enabling threads requires some support from the underlying runtime system.
If that was already in place, I suspect the Mac binaries would come with the
thread module enabled by default, yes?

Skip


From da@ski.org Mon Jun 7 23:58:22 1999
From: da@ski.org (David Ascher)
Date: Mon, 7 Jun 1999 15:58:22 -0700 (Pacific Daylight Time)
Subject: [Python-Dev] ActiveState & fork & Perl
In-Reply-To: <14172.19387.516361.546698@cm-24-29-94-19.nycap.rr.com>
Message-ID: <Pine.WNT.4.04.9906071550440.193-100000@rigoletto.ski.org>
On Mon, 7 Jun 1999, Skip Montanaro wrote:

> That's not quite the same thing. The new module simply exposes some
> normally closed-from-Python-code data structures to the Python programmer.
> Enabling threads requires some support from the underlying runtime system.
> If that was already in place, I suspect the Mac binaries would come with the
> thread module enabled by default, yes?

I'm not denying that. It's just that there are lots of things which fall
into that category, like (to take a pointed example =), os.fork(). We
don't have a --with-fork configure flag. We expose to the Python
programmer all of the underlying OS that is 'wrapped' as long as it's
reasonably portable. I think that most unices + win32 is a reasonable
approximation of 'reasonably portable'. And in fact, this change might
motivate someone with Mac fervor to explore adding Python support of Mac
threads.

--david



From gmcm@hypernet.com Tue Jun 8 01:01:56 1999
From: gmcm@hypernet.com (Gordon McMillan)
Date: Mon, 7 Jun 1999 19:01:56 -0500
Subject: [Python-Dev] ActiveState & fork & Perl
In-Reply-To: <Pine.WNT.4.04.9906071539310.193-100000@rigoletto.ski.org>
References: <14172.18567.631562.79944@cm-24-29-94-19.nycap.rr.com>
Message-ID: <1283322126-63868517@hypernet.com>

David Ascher wrote:
> On Mon, 7 Jun 1999, Skip Montanaro wrote:
>
> > True enough, but as Guido pointed out, enabling threads by default would
> > immediately make the Mac a second-class citizen.
>
> I'm not sure I buy that argument. There are already thread demos in
> the current directory, and no one complains. The windows builds are
> already threaded by default, and it's not caused any problems that I
> know of. Think of it like enabling the *new* module. =)
>
> > Has there been an assessment of how hard it would be to add thread
> > support to the Mac?
>
> That's an interesting question, especially since ActiveState lists
> it as a machine w/ threads and w/o fork().

Not a Mac programmer, but I recall that when Steve Jobs came back,
they published a schedule that said threads would be available a
couple releases down the road. Schedules only move one way, so I'd
guess ActiveState is premature.

Perhaps Christian's stackless Python would enable green threads...

(And there are a number of things in the standard distribution which
don't work on Windows, either; fork and select()ing on file fds).

- Gordon


From skip@mojam.com (Skip Montanaro) Tue Jun 8 00:06:34 1999
From: skip@mojam.com (Skip Montanaro)
Date: Mon, 7 Jun 1999 19:06:34 -0400 (EDT)
Subject: [Python-Dev] ActiveState & fork & Perl
In-Reply-To: <Pine.WNT.4.04.9906071550440.193-100000@rigoletto.ski.org>
References: <14172.19387.516361.546698@cm-24-29-94-19.nycap.rr.com>
<Pine.WNT.4.04.9906071550440.193-100000@rigoletto.ski.org>
Message-ID: <14172.20567.40217.703269@cm-24-29-94-19.nycap.rr.com>

David> I think that most unices + win32 is a reasonable approximation of
David> 'reasonably portable'. And in fact, this change might motivate
David> someone with Mac fervor to explore adding Python support of Mac
David> threads.

One can hope... ;-)

Skip



From MHammond@skippinet.com.au Tue Jun 8 00:06:37 1999
From: MHammond@skippinet.com.au (Mark Hammond)
Date: Tue, 8 Jun 1999 09:06:37 +1000
Subject: [Python-Dev] ActiveState & fork & Perl
In-Reply-To: <199906071649.MAA12619@eric.cnri.reston.va.us>
Message-ID: <000501beb13a$9eec2c10$0801a8c0@bobcat>
> > In case you haven't heard about it, ActiveState has
> > recently signed a
> > contract with Microsoft to do some work on Perl on win32.
>
> Have I ever heard of it! :-) David Grove pulled me into one of his
> bouts of paranoia. I think he's calmed down for the moment.

It sounds like a :-), but I'm afraid I don't understand that reference.

When I first heard this, two things sprung to mind:
a) Why shouldn't Python push for a similar deal?
b) Something more interesting in the MS/Python space is happening anyway,
so nyah nya nya ;-)

Getting some modest funds to (say) put together and maintain single
core+win32 installers to place on the NT resource kit could only help
Python.

Sometimes I wish we had a few less good programmers, and a few more good
marketing type people ;-)
> Anyway, I doubt that we could use their code, as it undoubtedly
> refers to reimplementing fork() at the Perl level, not at the C level
> (which would be much harder).

Excuse my ignorance, but how hard would it be to simulate/emulate/ovulate
fork using the Win32 extensions? Python has basically all of the native
Win32 process API exposed, and writing a "fork" in Python that only forked
Python scripts (for example) may be feasible and not too difficult.

It would have obvious limitations, including the fact that it is not
available standard with Python on Windows (just like a working popen now
:-) but if we could follow the old 80-20 rule, and catch 80% of the uses
with 20% of the effort it may be worth investigating.

My knowledge of fork is limited to muttering "something about cloning the
current process", so I may be naive in the extreme - but is this feasible?

Mark.



From fredrik@pythonware.com Tue Jun 8 00:21:15 1999
From: fredrik@pythonware.com (Fredrik Lundh)
Date: Tue, 8 Jun 1999 01:21:15 +0200
Subject: [Python-Dev] ActiveState & fork & Perl
References: <000501beb13a$9eec2c10$0801a8c0@bobcat>
Message-ID: <001601beb13c$70ff5b90$f29b12c2@pythonware.com>

Mark wrote:
> Excuse my ignorance, but how hard would it be to simulate/emulate/ovulate
> fork using the Win32 extensions? Python has basically all of the native
> Win32 process API exposed, and writing a "fork" in Python that only forked
> Python scripts (for example) may be feasible and not too difficult.
>
> It would have obvious limitations, including the fact that it is not
> available standard with Python on Windows (just like a working popen now
> :-) but if we could follow the old 80-20 rule, and catch 80% of the uses
> with 20% of the effort it may be worth investigating.
>
> My knowledge of fork is limited to muttering "something about cloning the
> current process", so I may be naive in the extreme - but is this feasible?

as an aside, GvR added Windows' "spawn" API in 1.5.2,
so you can at least emulate some common variants of
fork+exec. this means that if someone writes a spawn
for Unix, we would at least catch >0% of the uses with
~0% of the effort ;-)
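The spawn-based emulation of fork+exec mentioned above might look like this in modern Python (os.spawnv has since grown Unix support too; this is ordinary library usage, not code from 1.5.2):

```python
import os
import sys

# Sketch of the fork+exec pattern via the spawn API mentioned above.
# os.spawnv now exists on both Windows and Unix; with os.P_WAIT the
# parent blocks until the child exits and gets its exit status back.
def spawn_python(snippet):
    """Run a snippet of Python in a fresh process; return its exit status."""
    return os.spawnv(os.P_WAIT, sys.executable,
                     [sys.executable, "-c", snippet])

status = spawn_python("import sys; sys.exit(0)")
```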

fwiw, I'm more interested in the "unicode all the way
down" parts of the activestate windows project. more
on that later.

</F>



From gstein@lyra.org Tue Jun 8 00:10:38 1999
From: gstein@lyra.org (Greg Stein)
Date: Mon, 07 Jun 1999 16:10:38 -0700
Subject: [Python-Dev] ActiveState & fork & Perl
References: <Pine.WNT.4.04.9906071550440.193-100000@rigoletto.ski.org>
Message-ID: <375C516E.76EC8ED4@lyra.org>

David Ascher wrote:
> ...
> I'm not denying that. It's just that there are lots of things which fall
> into that category, like (to take a pointed example =), os.fork(). We
> don't have a --with-fork configure flag. We expose to the Python
> programmer all of the underlying OS that is 'wrapped' as long as it's
> reasonably portable. I think that most unices + win32 is a reasonable
> approximation of 'reasonably portable'. And in fact, this change might
> motivate someone with Mac fervor to explore adding Python support of Mac
> threads.

Agreed. Python isn't a least-common-denominator language. It tries to
make things easy for people. Why should we kill all platforms because of
a lack in one? Having threads by default will make a lot of things much
simpler (in terms of knowing the default platform). Can't tell you how
many times I curse to find that the default RedHat distribution (as of
5.x) did not use threads, even though they are well-supported on Linux.

And about stuff creeping into the distribution: gee... does that mean
that SocketServer doesn't work on the Mac? Threads *and* fork are not
available on Python/Mac, so all you would get is a single-threaded
server. icky. I can't see how adding threads to other platforms will
*hurt* the Macintosh platform... it can only help others.
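The SocketServer design being alluded to can be sketched like this (modern module name socketserver; it was SocketServer in 1.5): concurrency comes from mix-ins, so ThreadingMixIn needs thread support, ForkingMixIn needs fork(), and with neither you get a one-request-at-a-time server.

```python
import socket
import socketserver
import threading

# Sketch of the point above: SocketServer gets its concurrency from
# mix-ins.  ThreadingMixIn needs thread support, ForkingMixIn needs
# fork(); with neither, requests are handled one at a time.  (Modern
# module name shown; it was SocketServer in Python 1.5.)
class EchoHandler(socketserver.StreamRequestHandler):
    def handle(self):
        # echo one line back to the client
        self.wfile.write(self.rfile.readline())

class ThreadedEchoServer(socketserver.ThreadingMixIn, socketserver.TCPServer):
    daemon_threads = True

server = ThreadedEchoServer(("127.0.0.1", 0), EchoHandler)
threading.Thread(target=server.serve_forever, daemon=True).start()

client = socket.create_connection(server.server_address)
client.sendall(b"ping\n")
reply = b""
while not reply.endswith(b"\n"):   # read until the echoed line is complete
    reply += client.recv(100)
client.close()
server.shutdown()
```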

About the only reason that I can see to *not* make them the default is
the slight speed loss. But that seems a bit bogus, as the interpreter
loop doesn't spend *that* much time mucking with the interp_lock to
allow thread switches. There have also been some real good suggestions
for making it take near-zero time until you actually create that second
thread.

Cheers,
-g

--
Greg Stein, http://www.lyra.org/


From fredrik@pythonware.com Tue Jun 8 00:26:08 1999
From: fredrik@pythonware.com (Fredrik Lundh)
Date: Tue, 8 Jun 1999 01:26:08 +0200
Subject: [Python-Dev] ActiveState & fork & Perl
References: <14172.18567.631562.79944@cm-24-29-94-19.nycap.rr.com> <1283322126-63868517@hypernet.com>
Message-ID: <002a01beb13d$1fa23c80$f29b12c2@pythonware.com>
Not a Mac programmer, but I recall that when Steve Jobs came back,
they published a schedule that said threads would be available a
couple releases down the road. Schedules only move one way, so I'd
guess ActiveState is premature.
http://www.computerworld.com/home/print.nsf/all/990531AAFA
Perhaps Christian's stackless Python would enable green threads...

(And there are a number of things in the standard distribution which
don't work on Windows, either: fork, and select()ing on file fds.)
time to implement channels? (Tcl's unified abstraction
for all kinds of streams that you could theoretically use
something like select on. sockets, pipes, asynchronous
disk I/O, etc).

does select really work on ordinary files under Unix,
btw?
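one way to check, on a Unix box (this sketch assumes a POSIX system;
on Windows, select() only handles sockets):

```python
import os, select, tempfile

# select() on a plain file descriptor: legal on POSIX, but the fd is
# always reported ready, so it can't signal "data not here yet" --
# the readiness model doesn't really apply to ordinary files.
fd, path = tempfile.mkstemp()
try:
    readable, writable, _ = select.select([fd], [fd], [], 0)
    print("readable:", fd in readable, "writable:", fd in writable)
finally:
    os.close(fd)
    os.remove(path)
```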

</F>



From fredrik@pythonware.com Tue Jun 8 00:30:57 1999
From: fredrik@pythonware.com (Fredrik Lundh)
Date: Tue, 8 Jun 1999 01:30:57 +0200
Subject: [Python-Dev] ActiveState & fork & Perl
Message-ID: <003501beb13d$cbadf7d0$f29b12c2@pythonware.com>

I wrote:
Not a Mac programmer, but I recall that when Steve Jobs came back,
they published a schedule that said threads would be available a
couple releases down the road. Schedules only move one way, so I'd
guess ActiveState is premature.
http://www.computerworld.com/home/print.nsf/all/990531AAFA
which was just my way of saying "did he perhaps
refer to OS X?".

or are they adding real threads to good old MacOS too?

</F>



From fredrik@pythonware.com Tue Jun 8 00:38:02 1999
From: fredrik@pythonware.com (Fredrik Lundh)
Date: Tue, 8 Jun 1999 01:38:02 +0200
Subject: [Python-Dev] ActiveState & fork & Perl
References: <Pine.WNT.4.04.9906071550440.193-100000@rigoletto.ski.org> <375C516E.76EC8ED4@lyra.org>
Message-ID: <003f01beb13e$c95a2750$f29b12c2@pythonware.com>
Having threads by default will make a lot of things much simpler
(in terms of knowing the default platform). Can't tell you how
many times I curse to find that the default RedHat distribution
(as of 5.x) did not use threads, even though they are well-
supported on Linux.
I have a vague memory that once upon a time, the standard
X libraries shipped with RedHat weren't thread safe, and
Tkinter didn't work if you compiled Python with threads.

but I might be wrong and/or that may have changed...

</F>



From MHammond@skippinet.com.au Tue Jun 8 00:42:38 1999
From: MHammond@skippinet.com.au (Mark Hammond)
Date: Tue, 8 Jun 1999 09:42:38 +1000
Subject: [Python-Dev] ActiveState & fork & Perl
In-Reply-To: <003501beb13d$cbadf7d0$f29b12c2@pythonware.com>
Message-ID: <000801beb13f$6e118310$0801a8c0@bobcat>
which was just my way of saying "did he perhaps
refer to OS X?".

or are they adding real threads to good old MacOS too?
Oh, /F, please don't start adding annotations to your collection of
incredibly obscure URLs - takes away half the fun ;-)

Mark.



From gstein@lyra.org Tue Jun 8 01:01:41 1999
From: gstein@lyra.org (Greg Stein)
Date: Mon, 07 Jun 1999 17:01:41 -0700
Subject: [Python-Dev] ActiveState & fork & Perl
References: <Pine.WNT.4.04.9906071550440.193-100000@rigoletto.ski.org> <375C516E.76EC8ED4@lyra.org> <003f01beb13e$c95a2750$f29b12c2@pythonware.com>
Message-ID: <375C5D65.6E6CD6F@lyra.org>

Fredrik Lundh wrote:
Having threads by default will make a lot of things much simpler
(in terms of knowing the default platform). Can't tell you how
many times I curse to find that the default RedHat distribution
(as of 5.x) did not use threads, even though they are well-
supported on Linux.
I have a vague memory that once upon a time, the standard
X libraries shipped with RedHat weren't thread safe, and
Tkinter didn't work if you compiled Python with threads.

but I might be wrong and/or that may have changed...
Yes, it has changed. RedHat now ships with a thread-safe X so that they
can use GTK and Gnome (which use threads quite a bit).

There may be other limitations, however, as I haven't tried to do any
threaded GUI programming, especially on a recent RedHat (I'm using a
patched/hacked RH 4.1 system). RedHat 6.0 may even ship with a threaded
Python, but I dunno...

-g

--
Greg Stein, http://www.lyra.org/


From da@ski.org Tue Jun 8 01:43:27 1999
From: da@ski.org (David Ascher)
Date: Mon, 7 Jun 1999 17:43:27 -0700 (Pacific Daylight Time)
Subject: [Python-Dev] ActiveState & fork & Perl
In-Reply-To: <000501beb13a$9eec2c10$0801a8c0@bobcat>
Message-ID: <Pine.WNT.4.05.9906071735160.102-100000@david.ski.org>
On Tue, 8 Jun 1999, Mark Hammond wrote:

When I first heard this, two things sprung to mind:
a) Why shouldn't Python push for a similar deal?
b) Something more interesting in the MS/Python space is happening anyway,
so nyah nya nya ;-)

Getting some modest funds to (say) put together and maintain single
core+win32 installers to place on the NT resource kit could only help
Python.
How much money are we talking about (no, I'm not offering =)?

I wonder if one problem we have is that the folks with $$'s don't want to
advertise that they have $$'s because they don't want to be swamped with
vultures (and because "that isn't done"), and the people with skills but
no $$'s don't want to advertise that fact for a variety of reasons
(modesty, fear of being labeled 'commercial', fear of exposing that
they're not 100% busy, so "can't be good", etc.).

I've been wondering if a broker service like sourceXchange for Python
could work -- whether there are enough people who want something done to
Python and are willing to pay for an Open Source project (and whether there
are enough "worker bees", although I suspect there are). I can think of
several items on various TODO lists which could probably be tackled this
way. (doing things *within* sourceXchange is clearly a possibility in the
long term -- in the short term they seem focused on Linux, but time will
tell).

Guido, you're probably the point-man for such 'angels' -- do you get those
kinds of requests periodically? How about you, Mark?

One thing that ActiveState has going for it which doesn't exist in the
Python world is a corporate entity devoted to software development and
distribution. PPSI is a support company, or at least markets itself that
way.

--david





From gstein@lyra.org Tue Jun 8 02:05:15 1999
From: gstein@lyra.org (Greg Stein)
Date: Mon, 07 Jun 1999 18:05:15 -0700
Subject: [Python-Dev] ActiveState & fork & Perl
References: <Pine.WNT.4.05.9906071735160.102-100000@david.ski.org>
Message-ID: <375C6C4B.617138AB@lyra.org>

David Ascher wrote:
On Tue, 8 Jun 1999, Mark Hammond wrote:

When I first heard this, two things sprung to mind:
a) Why shouldn't Python push for a similar deal?
As David points out, I believe this is simply because ActiveState is
unique in their business type, products, and model. We don't have
anything like that in the Python world (although Pythonware could
theoretically go in a similar direction).
...
I've been wondering if a broker service like sourceXchange for Python
could work -- whether there are enough people who want something done to
Python and are willing to pay for an Open Source project (and whether there
are enough "worker bees", although I suspect there are). I can think of
several items on various TODO lists which could probably be tackled this
way. (doing things *within* sourceXchange is clearly a possibility in the
long term -- in the short term they seem focused on Linux, but time will
tell).
sourceXchange should work fine. I don't see it being Linux-only by any
means. Heck, the server is a FreeBSD box, and Brian Behlendorf comes
from the Apache world (and is a FreeBSD guy mostly).
Guido, you're probably the point-man for such 'angels' -- do you get those
kinds of requests periodically? How about you, Mark?

One thing that ActiveState has going for it which doesn't exist in the
Python world is a corporate entity devoted to software development and
distribution. PPSI is a support company, or at least markets itself that
way.
Yup. That's all we are. We are specifically avoiding any attempts to be
a product company. ActiveState is all about products and support-type
products.

I met with Dick Hardt (ActiveState founder/president) just a couple
weeks ago. Great guy. We spoke about ActiveState, what they're doing,
and what they'd like to do. They might be looking for good Python
people, too...

Cheers,
-g

--
Greg Stein, http://www.lyra.org/


From akuchlin@mems-exchange.org Tue Jun 8 02:22:59 1999
From: akuchlin@mems-exchange.org (Andrew Kuchling)
Date: Mon, 7 Jun 1999 21:22:59 -0400 (EDT)
Subject: [Python-Dev] ActiveState & fork & Perl
In-Reply-To: <14172.18567.631562.79944@cm-24-29-94-19.nycap.rr.com>
References: <14171.64464.805578.325069@anthem.cnri.reston.va.us>
<Pine.WNT.4.04.9906071006030.193-100000@rigoletto.ski.org>
<14172.773.807413.412693@anthem.cnri.reston.va.us>
<14172.18567.631562.79944@cm-24-29-94-19.nycap.rr.com>
Message-ID: <14172.28787.399827.929220@newcnri.cnri.reston.va.us>

Skip Montanaro writes:
True enough, but as Guido pointed out, enabling threads by default would
immediately make the Mac a second-class citizen. Test cases and demos would
One possibility might be NSPR, the Netscape Portable Runtime,
which provides platform-independent threads and I/O on Mac, Win32, and
Unix. Perhaps a thread implementation could be written that sat on
top of NSPR, in addition to the existing pthreads implementation.
See http://www.mozilla.org/docs/refList/refNSPR/.

(You'd probably only use NSPR on the Mac, though; there seems no
point in adding another layer of complexity to Unix and Windows.)

--
A.M. Kuchling http://starship.python.net/crew/amk/
When religion abandons poetic utterance, it cuts its own throat.
-- Robertson Davies, _Marchbanks' Garland_



From tim_one@email.msn.com Tue Jun 8 02:24:47 1999
From: tim_one@email.msn.com (Tim Peters)
Date: Mon, 7 Jun 1999 21:24:47 -0400
Subject: [Python-Dev] ActiveState & fork & Perl
In-Reply-To: <Pine.WNT.4.04.9906070938570.193-100000@rigoletto.ski.org>
Message-ID:

[David Ascher]
In case you haven't heard about it, ActiveState has recently signed a
contract with Microsoft to do some work on Perl on win32.
I'm astonished at the reaction this has provoked "out there". Here:

D:\Python>perl -v

This is perl, version 5.001

Unofficial patchlevel 1m.

Copyright 1987-1994, Larry Wall
Win32 port Copyright (c) 1995 Microsoft Corporation. All rights reserved.
Developed by hip communications inc., http://info.hip.com/info/

Perl for Win32 Build 107
Built Apr 16 1996@14:47:22
Perl may be copied only under the terms of either the Artistic License or the
GNU General Public License, which may be found in the Perl 5.0 source kit.

D:\Python>

Notice the MS copyright? From 1995?! Perl for Win32 has *always* been
funded by MS, even back when half of ActiveState was named "hip
communications" <0.5 wink>. Thank Perl's dominance in CGI scripting -- MS
couldn't sell NT Server if it didn't run Perl. MS may be vicious, but
they're not stupid <wink>.
...
fork()
...
Any guesses as to whether we could hijack this work if/when it is released
as Open Source?
It's proven impossible so far to reuse anything from the Perl source -- the
code is an incestuous nightmare. From time to time the Perl-Porters talk
about splitting some of it into reusable libraries, but that never happens;
and the less they feel Perl's dominance is assured, the less they even talk
about it.

So I'm pessimistic (what else is new <wink>?). I'd rather see the work put
into threads anyway. The "Mac OS" problem will go away eventually; time to
turn the suckers on by default.

it's-not-like-millions-of-programmers-will-start-writing-thread-code-then-
who-don't-now-ly y'rs - tim




From guido@CNRI.Reston.VA.US Tue Jun 8 02:34:59 1999
From: guido@CNRI.Reston.VA.US (Guido van Rossum)
Date: Mon, 07 Jun 1999 21:34:59 -0400
Subject: [Python-Dev] ActiveState & fork & Perl
In-Reply-To: Your message of "Mon, 07 Jun 1999 19:01:56 CDT."
<1283322126-63868517@hypernet.com>
References: <14172.18567.631562.79944@cm-24-29-94-19.nycap.rr.com>
<1283322126-63868517@hypernet.com>
Message-ID: <199906080134.VAA13480@eric.cnri.reston.va.us>
Perhaps Christian's stackless Python would enable green threads...
This has been suggested before... While this seems possible at first,
all blocking I/O calls would have to be redone to pass control to the
thread scheduler, before this would be useful -- a huge task!

I believe SunOS 4.x's LWP (light-weight processes) library used this
method. It was a drop-in replacement for the standard libc,
containing changed versions of all system calls. I recall that there
were one or two missing, which of course upset the posix module
because it references almost *all* system calls...

--Guido van Rossum (home page: http://www.python.org/~guido/)


From tim_one@email.msn.com Tue Jun 8 02:38:38 1999
From: tim_one@email.msn.com (Tim Peters)
Date: Mon, 7 Jun 1999 21:38:38 -0400
Subject: [Python-Dev] ActiveState & fork & Perl
In-Reply-To: <003501beb13d$cbadf7d0$f29b12c2@pythonware.com>
Message-ID:

[/F]
http://www.computerworld.com/home/print.nsf/all/990531AAFA

which was just my way of saying "did he perhaps
refer to OS X?".

or are they adding real threads to good old MacOS too?
Dragon is doing a port of its speech recog software to "good old MacOS" and
"OS X", and best we can tell the former is as close to an impossible target
as we've ever seen. OS X looks like a pleasant romp, in comparison. I
don't think they're going to do anything with "good old MacOS" except let it
die.

it-was-a-reasonable-architecture-15-years-ago-ly y'rs - tim




From gstein@lyra.org Tue Jun 8 02:31:08 1999
From: gstein@lyra.org (Greg Stein)
Date: Mon, 07 Jun 1999 18:31:08 -0700
Subject: [Python-Dev] ActiveState & fork & Perl
References: <14171.64464.805578.325069@anthem.cnri.reston.va.us>
<Pine.WNT.4.04.9906071006030.193-100000@rigoletto.ski.org>
<14172.773.807413.412693@anthem.cnri.reston.va.us>
<14172.18567.631562.79944@cm-24-29-94-19.nycap.rr.com> <14172.28787.399827.929220@newcnri.cnri.reston.va.us>
Message-ID: <375C725C.5A86D05B@lyra.org>

Andrew Kuchling wrote:
Skip Montanaro writes:
True enough, but as Guido pointed out, enabling threads by default would
immediately make the Mac a second-class citizen. Test cases and demos would
One possibility might be NSPR, the Netscape Portable Runtime,
which provides platform-independent threads and I/O on Mac, Win32, and
Unix. Perhaps a thread implementation could be written that sat on
top of NSPR, in addition to the existing pthreads implementation.
See http://www.mozilla.org/docs/refList/refNSPR/.

(You'd probably only use NSPR on the Mac, though; there seems no
point in adding another layer of complexity to Unix and Windows.)
NSPR is licensed under the MPL, which is quite a bit more restrictive
than Python's license. Of course, you could separately point Mac users
to it to say "if you get NSPR, then you can have threads".

Apache ran into the licensing issue and punted NSPR in favor of a
home-grown runtime (which is not as ambitious as NSPR).

Cheers,
-g

--
Greg Stein, http://www.lyra.org/


From gmcm@hypernet.com Tue Jun 8 03:37:34 1999
From: gmcm@hypernet.com (Gordon McMillan)
Date: Mon, 7 Jun 1999 21:37:34 -0500
Subject: [Python-Dev] ActiveState & fork & Perl
In-Reply-To: <002a01beb13d$1fa23c80$f29b12c2@pythonware.com>
Message-ID: <1283312788-64430290@hypernet.com>

Fredrik Lundh writes:
time to implement channels? (Tcl's unified abstraction
for all kinds of streams that you could theoretically use
something like select on. sockets, pipes, asynchronous
disk I/O, etc).
I have mixed feelings about those types of things. I've recently run
across a number of them in some C/C++ libs.

On the "pro" side, they can give acceptable behavior and adequate
performance and thus suffice for the majority of use.

On the "con" side, they're usually an order of magnitude slower than
the raw interface, don't quite behave correctly in borderline
situations, and tend to produce "One True Path" believers.

Of course, so do OSes, editors, languages, GUIs, browsers and colas.
does select really work on ordinary files under Unix,
btw?
Sorry, should've said "where a socket is a real fd" or some such...

just-like-God-intended-ly y'rs

- Gordon


From guido@CNRI.Reston.VA.US Tue Jun 8 02:46:40 1999
From: guido@CNRI.Reston.VA.US (Guido van Rossum)
Date: Mon, 07 Jun 1999 21:46:40 -0400
Subject: [Python-Dev] ActiveState & fork & Perl
In-Reply-To: Your message of "Mon, 07 Jun 1999 17:43:27 PDT."
<Pine.WNT.4.05.9906071735160.102-100000@david.ski.org>
References: <Pine.WNT.4.05.9906071735160.102-100000@david.ski.org>
Message-ID: <199906080146.VAA13572@eric.cnri.reston.va.us>
Guido, you're probably the point-man for such 'angels' -- do you get those
kinds of requests periodically?
No, as far as I recall, nobody has ever offered me money for Python code
to be donated to the body of open source. People sometimes seek to
hire me, but primarily to further their highly competitive proprietary
business goals...

--Guido van Rossum (home page: http://www.python.org/~guido/)


From gstein@lyra.org Tue Jun 8 02:41:32 1999
From: gstein@lyra.org (Greg Stein)
Date: Mon, 07 Jun 1999 18:41:32 -0700
Subject: [Python-Dev] licensing
Message-ID: <375C74CC.2947E4AE@lyra.org>

Speaking of licensing issues...

I seem to have read somewhere that the two Medusa files are under a
separate license. Although, reading the files now, it seems they are
not.

The issue that I'm really raising is that Python should ship with a
single license that covers everything. Otherwise, it will become very
complicated for somebody to figure out which pieces fall under what
restrictions.

Is there anything in the distribution that is different than the normal
license?

For example, can I take the async modules and build a commercial product
on them?

Cheers,
-g

--
Greg Stein, http://www.lyra.org/


From guido@CNRI.Reston.VA.US Tue Jun 8 02:56:03 1999
From: guido@CNRI.Reston.VA.US (Guido van Rossum)
Date: Mon, 07 Jun 1999 21:56:03 -0400
Subject: [Python-Dev] ActiveState & fork & Perl
In-Reply-To: Your message of "Tue, 08 Jun 1999 09:06:37 +1000."
<000501beb13a$9eec2c10$0801a8c0@bobcat>
References: <000501beb13a$9eec2c10$0801a8c0@bobcat>
Message-ID:

[me]
Have I ever heard of it! :-) David Grove pulled me into one of his
bouts of paranoia. I think he's calmed down for the moment.
[Mark]
It sounds like a :-), but I'm afraid I don't understand that reference.
David Grove occasionally posts to Perl lists with accusations that
ActiveState is making Perl proprietary. He once announced a program
editor to the Python list which upon inspection by me didn't contain
any Python support, for which I flamed him. He then explained to me
that he was in a hurry because ActiveState was taking over the Perl
world. A couple of days ago, I received an email from him (part of a
conversation on the perl5porters list apparently) where he warned me
that ActiveState was planning a similar takeover of Python. After
some comments from tchrist ("he's a loon") I decided to ignore David.
Sometimes I wish we had a few less good programmers, and a few more good
marketing type people ;-)
Ditto... It sure ain't me!
Excuse my ignorance, but how hard would it be to simulate/emulate/ovulate :-)
fork using the Win32 extensions? Python has basically all of the native
Win32 process API exposed, and writing a "fork" in Python that only forked
Python scripts (for example) may be feasible and not too difficult.

It would have obvious limitations, including the fact that it is not
available standard with Python on Windows (just like a working popen now
:-) but if we could follow the old 80-20 rule, and catch 80% of the uses
with 20% of the effort it may be worth investigating.

My knowledge of fork is limited to muttering "something about cloning the
current process", so I may be naive in the extreme - but is this feasible?
I think it's not needed that much, but David has argued otherwise. I
haven't heard much support either way from others. But I think it
would be a huge task, because it would require taking control of all
file descriptors (given the semantics that upon fork, file descriptors
are shared, but if one half closes an fd it is still open in the other
half).
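Those semantics -- a shared underlying open file, but independently
closable descriptors -- are easy to see on a platform that already has
fork. A POSIX-only sketch, written in present-day syntax:

```python
import os

# After fork, parent and child each hold their own descriptor for the
# same underlying pipe; closing one copy does not close the other.
r, w = os.pipe()
pid = os.fork()
if pid == 0:                        # child
    os.close(r)                     # child drops its read end...
    os.write(w, b"still open here")
    os.close(w)
    os._exit(0)
else:                               # parent
    os.close(w)                     # ...parent drops its write end,
    data = os.read(r, 64)           # yet the child's write still arrives
    os.close(r)
    os.waitpid(pid, 0)
    print(data.decode())
```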

--Guido van Rossum (home page: http://www.python.org/~guido/)


From gmcm@hypernet.com Tue Jun 8 03:58:59 1999
From: gmcm@hypernet.com (Gordon McMillan)
Date: Mon, 7 Jun 1999 21:58:59 -0500
Subject: [Python-Dev] ActiveState & fork & Perl
In-Reply-To: <000e01beb14f$a16d9a40$aaa02299@tim>
References: <003501beb13d$cbadf7d0$f29b12c2@pythonware.com>
Message-ID:

[Tim]
Dragon is doing a port of its speech recog software to "good old
MacOS" and "OS X", and best we can tell the former is as close to an
impossible target as we've ever seen. OS X looks like a pleasant
romp, in comparison. I don't think they're going to do anything
with "good old MacOS" except let it die.

it-was-a-reasonable-architecture-15-years-ago-ly y'rs - tim
Don't Macs have another CPU in the keyboard already? Maybe you could
just require a special microphone <wink wink nudge nudge>.

that's-not-a-mini-tower-that's-a-um--subwoofer-ly y'rs

- Gordon


From guido@CNRI.Reston.VA.US Tue Jun 8 03:09:02 1999
From: guido@CNRI.Reston.VA.US (Guido van Rossum)
Date: Mon, 07 Jun 1999 22:09:02 -0400
Subject: [Python-Dev] licensing
In-Reply-To: Your message of "Mon, 07 Jun 1999 18:41:32 PDT."
<375C74CC.2947E4AE@lyra.org>
References: <375C74CC.2947E4AE@lyra.org>
Message-ID: <199906080209.WAA13806@eric.cnri.reston.va.us>
Speaking of licensing issues...

I seem to have read somewhere that the two Medusa files are under a
separate license. Although, reading the files now, it seems they are
not.

The issue that I'm really raising is that Python should ship with a
single license that covers everything. Otherwise, it will become very
complicated for somebody to figure out which pieces fall under what
restrictions.

Is there anything in the distribution that is different than the normal
license?
There are pieces with different licenses but they only differ in the
names of the beneficiaries, not in the conditions (although the words
aren't always exactly the same). As far as I can tell, this is the
situation for asyncore.py and asynchat.py: they have a copyright
notice of their own (see the 1.5.2 source for the exact text) with Sam
Rushing's copyright.
For example, can I take the async modules and build a commercial product
on them?
As far as I know, yes. Sam Rushing promised me this when he gave them
to me for inclusion. (I've had a complaint that they aren't the
latest -- can someone confirm this?)

--Guido van Rossum (home page: http://www.python.org/~guido/)


From MHammond@skippinet.com.au Tue Jun 8 04:11:57 1999
From: MHammond@skippinet.com.au (Mark Hammond)
Date: Tue, 8 Jun 1999 13:11:57 +1000
Subject: [Python-Dev] ActiveState & fork & Perl
In-Reply-To: <199906080156.VAA13612@eric.cnri.reston.va.us>
Message-ID:

[Please don't copy this out of this list :-]
world. A couple of days ago, I received an email from him (part of a
conversation on the perl5porters list apparently) where he warned me
that ActiveState was planning a similar takeover of Python. After
some comments from tchrist ("he's a loon") I decided to ignore David.
I believe this to be true - at least "take over" in the same way they have
"taken over" Perl. I have it on very good authority that Active State's
medium term business plan includes expanding out of Perl alone, and Python
is very high on their list. I also believe they would like to recruit
people to help with this goal. They are of the opinion that Python alone
could not support such a business quite yet, so attaching it to existing
infrastructure could fly. On one hand I tend to agree, but on the other
hand I think that we do a pretty damn good job as it is, so maybe Python
could fly all alone?

And I've got to say that personally, such an offer would be highly
attractive. Depending on the terms (and I must admit I have not had a good
look at the ActiveState Perl licenses) this could provide a real boost to
the Python world. If the business model is open source software with
paid-for support, it seems a win-win situation to me. However, it is very
unclear to me, and the industry, that this model alone can work generally.
A business-plan that involves withholding sources or technologies until a
fee has been paid certainly moves quickly away from win-win to, to quote
Guido, "highly competitive proprietary business goals".

May be some interesting times ahead. For some time now I have meant to
pass this on to PPSI as a heads-up, just in case they intend playing in that
space in the future. So consider this it ;-)

Mark.



From gstein@lyra.org Tue Jun 8 04:13:42 1999
From: gstein@lyra.org (Greg Stein)
Date: Mon, 07 Jun 1999 20:13:42 -0700
Subject: [Python-Dev] ActiveState & fork & Perl
References: <000b01beb15c$abd84ea0$0801a8c0@bobcat>
Message-ID: <375C8A66.56B3F26B@lyra.org>

Mark Hammond wrote:
[Please dont copy this out of this list :-]
It's in the archives now... :-)
...[well-said comments about open source and businesses]...

May be some interesting times ahead. For some time now I have meant to
pass this on to PPSI as a heads-up, just in case they intend playing in that
space in the future. So consider this it ;-)
I've already met Dick Hardt and spoken with him at length. Both on an
individual basis, and as the President of PPSI. Nothing to report...
(yet)

Cheers,
-g

p.s. PPSI is a bit different, as we intend to fill the "support gap"
rather than move into real products; ActiveState does products, along
with support type stuff and other miscellaneous (I don't recall Dick's
list offhand).

--
Greg Stein, http://www.lyra.org/


From tim_one@email.msn.com Tue Jun 8 06:14:36 1999
From: tim_one@email.msn.com (Tim Peters)
Date: Tue, 8 Jun 1999 01:14:36 -0400
Subject: [Python-Dev] ActiveState & fork & Perl
In-Reply-To: <000b01beb15c$abd84ea0$0801a8c0@bobcat>
Message-ID:

[MarkH]
...
And Ive got to say that personally, such an offer would be highly
attractive. Depending on the terms (and I must admit I have not
had a good look at the ActiveState Perl licenses) this could provide
a real boost to the Python world.
I find the ActivePerl license to be quite confusing:

http://www.activestate.com/ActivePerl/commlic.htm

It appears to say flatly that you can't distribute it yourself, although
other pages on the site say "sure, go ahead!". Also seems to imply you
can't modify their code (they explicitly allow you to install patches
obtained from ActiveState -- but that's all they mention).

OTOH, they did a wonderful job on the Perl for Win32 port (a difficult port
in the face of an often-hostile Perl community), and gave all the code back
to the Perl folk. I've got no complaints about them so far.
If the business model is open source software with paid-for support, it
seems a win-win situation to me.
"Part of our business model is to sell value added, proprietary
components."; e.g., they sell a Perl Development Kit for $100, and so on.
Fine by me! If I could sell tabnanny ... well, I wouldn't do that to anyone
<wink>.

would-like-to-earn-$1-from-python-before-he-dies-ly y'rs - tim




From skip@mojam.com (Skip Montanaro) Tue Jun 8 06:37:22 1999
From: skip@mojam.com (Skip Montanaro) (Skip Montanaro)
Date: Tue, 8 Jun 1999 01:37:22 -0400 (EDT)
Subject: [Python-Dev] ActiveState & fork & Perl
In-Reply-To: <375C516E.76EC8ED4@lyra.org>
References: <Pine.WNT.4.04.9906071550440.193-100000@rigoletto.ski.org>
<375C516E.76EC8ED4@lyra.org>
Message-ID: <14172.43513.272949.148790@cm-24-29-94-19.nycap.rr.com>

Greg> About the only reason that I can see to *not* make them the
Greg> default is the slight speed loss. But that seems a bit bogus, as
Greg> the interpreter loop doesn't spend *that* much time mucking with
Greg> the interp_lock to allow thread switches. There have also been
Greg> some real good suggestions for making it take near-zero time until
Greg> you actually create that second thread.

Okay, everyone has convinced me that holding threads hostage to the Mac is a
red herring. I have other fish to fry. (It's 1:30AM and I haven't had
dinner yet. Can you tell? ;-)

Is there a way with configure to determine whether or not particular Unix
variants should have threads enabled or not? If so, I think that's the way
to go. I think it would be unfortunate to enable it by default, have it
appear to work on some known to be unsupported platforms, but then bite the
programmer in an inconvenient place at an inconvenient time.

Such a self-deciding configure script should exit with some information
about thread enablement:

Yes, we support threads on RedHat Linux 6.0.

No, you stinking Minix user, you will never have threads.

Rhapsody, huh? I never heard of that. Some weird OS from Sunnyvale,
you say? I don't know how to do threads there yet, but when you
figure it out, send patches along to python-dev@python.org.

Of course, users should be able to override anything using --with-thread or
--without-thread and possibly specify compile-time and link-time flags through
arguments or the environment.
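However configure decides, the result is easy to probe from a running
interpreter; a small sketch (the pair of module names is a guess spanning
interpreter versions, not something settled in this thread):

```python
def threads_available():
    # A threadless build simply lacks the low-level thread module;
    # probe both spellings since the name varies across versions.
    for name in ("thread", "_thread"):
        try:
            __import__(name)
            return True
        except ImportError:
            pass
    return False

print("interpreter has threads:", threads_available())
```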

Skip Montanaro | Mojam: "Uniting the World of Music" http://www.mojam.com/
skip@mojam.com | Musi-Cal: http://www.musi-cal.com/
518-372-5583



From skip@mojam.com (Skip Montanaro) Tue Jun 8 06:49:19 1999
From: skip@mojam.com (Skip Montanaro) (Skip Montanaro)
Date: Tue, 8 Jun 1999 01:49:19 -0400 (EDT)
Subject: [Python-Dev] ActiveState & fork & Perl
In-Reply-To: <000b01beb15c$abd84ea0$0801a8c0@bobcat>
References: <199906080156.VAA13612@eric.cnri.reston.va.us>
<000b01beb15c$abd84ea0$0801a8c0@bobcat>
Message-ID: <14172.44596.528927.548722@cm-24-29-94-19.nycap.rr.com>

Okay, folks. I must have missed the memo. Who are ActiveState and
sourceXchange? I can't be the only person on python-dev who never heard of
either of them before this evening. I guess I'm the only one who's not shy
about exposing their ignorance.

but-i-can-tell-you-where-to-find-spare-parts-for-your-Triumph-ly y'rs,

Skip Montanaro
518-372-5583
See my car: http://www.musi-cal.com/~skip/


From da@ski.org Tue Jun 8 07:12:11 1999
From: da@ski.org (David Ascher)
Date: Mon, 7 Jun 1999 23:12:11 -0700 (Pacific Daylight Time)
Subject: [Python-Dev] ActiveState & fork & Perl
In-Reply-To: <14172.44596.528927.548722@cm-24-29-94-19.nycap.rr.com>
Message-ID: <Pine.WNT.4.05.9906072309400.198-100000@david.ski.org>
Okay, folks. I must have missed the memo. Who are ActiveState and
sourceXchange? I can't be the only person on python-dev who never heard of
either of them before this evening. I guess I'm the only one who's not shy
about exposing their ignorance.
Well, one answer is to look at www.activestate.com and
www.sourcexchange.com, of course =)

ActiveState "does" the win32 perl port, for money. (it's a little
controversial within the Perl community, which has inherited some of RMS's
"Microsoft OS? Ha!" attitude).

sourceXchange is aiming to match open source programmers with companies
who want open source work done for $$'s, in a 'market' format. It was
started by Brian Behlendorf, now at O'Reilly, and of Apache fame.

Go get dinner. =)

--david




From rushing@nightmare.com Tue Jun 8 01:10:18 1999
From: rushing@nightmare.com (Sam Rushing)
Date: Mon, 7 Jun 1999 17:10:18 -0700 (PDT)
Subject: [Python-Dev] licensing
In-Reply-To: <9403621@toto.iv>
Message-ID: <14172.23937.83700.673653@seattle.nightmare.com>

Guido van Rossum writes:
Greg Stein writes:
For example, can I take the async modules and build a commercial
product on them?
Yes, my intent was that they go under the normal Python 'do what thou
wilt' license. If I goofed in any way, please let me know!
As far as I know, yes. Sam Rushing promised me this when he gave
them to me for inclusion. (I've had a complaint that they aren't
the latest -- can someone confirm this?)
Guilty as charged. I've been tweaking them a bit lately, for
performance, but anyone can grab the very latest versions out of the
medusa CVS repository:

CVSROOT=:pserver:medusa@seattle.nightmare.com:/usr/local/cvsroot
(the password is 'medusa')

Or download one of the snapshots.

BTW, those particular files have always had the Python
copyright/license.

-Sam



From gstein@lyra.org Tue Jun 8 08:09:00 1999
From: gstein@lyra.org (Greg Stein)
Date: Tue, 08 Jun 1999 00:09:00 -0700
Subject: [Python-Dev] licensing
References: <14172.23937.83700.673653@seattle.nightmare.com>
Message-ID: <375CC18C.1DB5E9F2@lyra.org>

Sam Rushing wrote:
> Greg Stein writes:
> > For example, can I take the async modules and build a commercial
> > product on them?
> Yes, my intent was that they go under the normal Python 'do what thou
> wilt' license. If I goofed in any way, please let me know!

Nope... you haven't goofed. I was thrown off when a certain person
(nudge, nudge) goofed in their upcoming book, which I recently reviewed.

thx!
-g

--
Greg Stein, http://www.lyra.org/


From fredrik@pythonware.com Tue Jun 8 09:08:08 1999
From: fredrik@pythonware.com (Fredrik Lundh)
Date: Tue, 8 Jun 1999 10:08:08 +0200
Subject: [Python-Dev] licensing
References: <375C74CC.2947E4AE@lyra.org>
Message-ID: <00c501beb186$0c6d3450$f29b12c2@pythonware.com>
> I seem to have read somewhere that the two Medusa files are under a
> separate license. Although, reading the files now, it seems they are
> not.

the medusa server has a restrictive license, but the asyncore
and asynchat modules use the standard Python license, with
Sam Rushing as the copyright owner. just use the source...

> The issue that I'm really raising is that Python should ship with a
> single license that covers everything. Otherwise, it will become very
> complicated for somebody to figure out which pieces fall under what
> restrictions.
>
> Is there anything in the distribution that is different than the normal
> license?
>
> For example, can I take the async modules and build a commercial product
> on them?

surely hope so -- we're using them in everything we do.

and my upcoming book is 60% about doing weird things with
tkinter, and 40% about doing weird things with asynclib...

</F>



From MHammond@skippinet.com.au Tue Jun 8 09:46:33 1999
From: MHammond@skippinet.com.au (Mark Hammond)
Date: Tue, 8 Jun 1999 18:46:33 +1000
Subject: [Python-Dev] licensing
In-Reply-To: <375CC18C.1DB5E9F2@lyra.org>
Message-ID: <001101beb18b$6a049bd0$0801a8c0@bobcat>
> Nope... you haven't goofed. I was thrown off when a certain person
> (nudge, nudge) goofed in their upcoming book, which I
> recently reviewed.

I now feel for the other Mark and David, Aaron, et al. Our book is out
of date in a number of ways before the tech reviewers even saw it.

Medusa wasn't a good example - I should have known better when I wrote it.
But Pythonwin is a _real_ problem. Just as I start writing the book, Neil
sends me a really cool editor control and it leads me down a path of
IDLE/Pythonwin integration.

So almost _everything_ I have already written on "IDEs for Python" is
already out of date - and printing is not scheduled for a number of months.

[This may help explain to Guido and Tim my recent fervour in this area - I
want to get the "new look" Pythonwin ready for the book. I just yesterday
got a dockable interactive window happening. Now adding a splitter window
to each window to expose a pyclbr based tree control and then it is time to
stop (and re-write that chapter :-]

Mark.



From fredrik@pythonware.com Tue Jun 8 11:25:47 1999
From: fredrik@pythonware.com (Fredrik Lundh)
Date: Tue, 8 Jun 1999 12:25:47 +0200
Subject: [Python-Dev] ActiveState & fork & Perl
References: <Pine.WNT.4.05.9906071735160.102-100000@david.ski.org>
Message-ID:

> (modesty, fear of being labeled 'commercial', fear of exposing that
> they're not 100% busy, so "can't be good", etc.).

fwiw, we're seeing an endless stream of mails from moral
crusaders even before we have opened the little Python-
Ware shoppe (coming soon, coming soon). some of them
are quite nasty, to say the least...

I usually tell them to raise their concerns on c.l.python
instead. they never do.

> One thing that ActiveState has going for it which doesn't exist in the
> Python world is a corporate entity devoted to software development and
> distribution.
saying that there is NO such entity is a bit harsh, I think ;-)

but different "scripting" companies are using different
strategies, for various reasons. Scriptics, ActiveState,
PythonWare, UserLand, Harlequin, Rebol, etc. are all
doing similar things, but in different ways (due to markets,
existing communities, and probably most important:
different funding strategies). But we're all corporate
entities devoted to software development...

...

by the way, if someone thinks there's no money in Python,
consider this:

---

Google is looking to expand its operations and needs
talented engineers to develop the next generation
search engine. If you have a need to bring order to
a chaotic web, contact us.

Requirements:

Several years of industry or hobby-based experience
B.S. in Computer Science or equivalent (M.S. a plus)
Extensive experience programming in C or C++
Extensive experience programming in the UNIX environment
Knowledge of TCP/IP and network programming
Experience developing/designing large software systems
Experience programming in Python a plus

---

Google Inc., a year-old Internet search-engine company,
said it has attracted $25 million in venture-capital funding
and will add two of Silicon Valley's best-known financiers,
Michael Moritz and L. John Doerr, to its board.

Even by Internet standards, Google has attracted an unusually
large amount of money for a company still in its infancy.

---

looks like anyone on this list could get a cool Python job
for an unusually over-funded startup within minutes ;-)

</F>



From skip@mojam.com Tue Jun 8 12:12:02 1999
From: skip@mojam.com (Skip Montanaro)
Date: Tue, 8 Jun 1999 07:12:02 -0400 (EDT)
Subject: [Python-Dev] ActiveState & fork & Perl
In-Reply-To: <004d01beb199$4fe171c0$f29b12c2@pythonware.com>
References: <Pine.WNT.4.05.9906071735160.102-100000@david.ski.org>
<004d01beb199$4fe171c0$f29b12c2@pythonware.com>
Message-ID: <14172.63947.54638.275348@cm-24-29-94-19.nycap.rr.com>

Fredrik> Even by Internet standards, Google has attracted an unusually
Fredrik> large amount of money for a company still in its infancy.

And it's a damn good search engine to boot, so I think it probably deserves
the funding (most of it will, I suspect, be used to muscle its way into a
crowded market). It is *always* my first stop when I need a general-purpose
search engine these days. I never use InfoSeek/Go, Lycos or HotBot for
anything other than to check that Musi-Cal is still in their database.

Skip Montanaro | Mojam: "Uniting the World of Music" http://www.mojam.com/
skip@mojam.com | Musi-Cal: http://www.musi-cal.com/
518-372-5583


From guido@CNRI.Reston.VA.US Tue Jun 8 13:46:51 1999
From: guido@CNRI.Reston.VA.US (Guido van Rossum)
Date: Tue, 08 Jun 1999 08:46:51 -0400
Subject: [Python-Dev] ActiveState & fork & Perl
In-Reply-To: Your message of "Tue, 08 Jun 1999 01:37:22 EDT."
<14172.43513.272949.148790@cm-24-29-94-19.nycap.rr.com>
References: <Pine.WNT.4.04.9906071550440.193-100000@rigoletto.ski.org> <375C516E.76EC8ED4@lyra.org>
<14172.43513.272949.148790@cm-24-29-94-19.nycap.rr.com>
Message-ID: <199906081246.IAA14302@eric.cnri.reston.va.us>
> Is there a way with configure to determine whether or not particular Unix
> variants should have threads enabled or not? If so, I think that's the way
> to go. I think it would be unfortunate to enable it by default, have it
> appear to work on some known to be unsupported platforms, but then bite the
> programmer in an inconvenient place at an inconvenient time.
That's not so much the problem, if you can get a threaded program to
compile and link that probably means sufficient support exists. There
currently are checks in the configure script that try to find out
which thread library to use -- these could be expanded to disable
threads when none of the known ones work.
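The same idea as a runtime sketch, assuming today's `threading` module rather than the configure-time C checks under discussion (the name `threads_work` is made up):

```python
import threading

def threads_work(timeout=5.0):
    # Minimal runtime probe: if a spawned thread runs its target and
    # joins within the timeout, basic thread support is functional
    # on this build/platform.
    ran = []
    t = threading.Thread(target=ran.append, args=(True,))
    t.start()
    t.join(timeout)
    return bool(ran) and not t.is_alive()
```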

Anybody care enough to try hacking configure.in, or should I add this
to my tired TODO list?

--Guido van Rossum (home page: http://www.python.org/~guido/)


From jack@oratrix.nl Tue Jun 8 13:47:44 1999
From: jack@oratrix.nl (Jack Jansen)
Date: Tue, 08 Jun 1999 14:47:44 +0200
Subject: [Python-Dev] ActiveState & fork & Perl
In-Reply-To: Message by Andrew Kuchling <akuchlin@mems-exchange.org> ,
Mon, 7 Jun 1999 21:22:59 -0400 (EDT) , <14172.28787.399827.929220@newcnri.cnri.reston.va.us>
Message-ID: <19990608124745.3136B303120@snelboot.oratrix.nl>
> One possibility might be NSPR, the Netscape Portable Runtime,
> which provides platform-independent threads and I/O on Mac, Win32, and
> Unix. Perhaps a thread implementation could be written that sat on
> top of NSPR, in addition to the existing pthreads implementation.
> See http://www.mozilla.org/docs/refList/refNSPR/.

NSPR looks rather promising! Does anyone have any experience with it? What
I'd also be interested in is how it interacts with the "real" I/O
system, i.e. can you mix and match NSPR calls with normal os calls, or will
that break things?

The latter is important for Python, because there are lots of external
libraries, and while some are user-built (image libraries, gdbm, etc) and
could conceivably be converted to use NSPR others are not...
--
Jack Jansen | ++++ stop the execution of Mumia Abu-Jamal ++++
Jack.Jansen@oratrix.com | ++++ if you agree copy these lines to your sig ++++
www.oratrix.nl/~jack | see http://www.xs4all.nl/~tank/spg-l/sigaction.htm




From guido@CNRI.Reston.VA.US Tue Jun 8 14:28:02 1999
From: guido@CNRI.Reston.VA.US (Guido van Rossum)
Date: Tue, 08 Jun 1999 09:28:02 -0400
Subject: [Python-Dev] Python-dev archives going public
In-Reply-To: Your message of "Mon, 07 Jun 1999 20:13:42 PDT."
<375C8A66.56B3F26B@lyra.org>
References: <000b01beb15c$abd84ea0$0801a8c0@bobcat>
<375C8A66.56B3F26B@lyra.org>
Message-ID: <199906081328.JAA14584@eric.cnri.reston.va.us>
> [Please don't copy this out of this list :-]

It's in the archives now... :-)

Which reminds me... A while ago, Greg made some noises about the
archives being public, and temporarily I made them private. In the
following brief flurry of messages everybody who spoke up said they
preferred the archives to be public (even though the list remains
invitation-only). But I never made the change back, waiting for Greg
to agree, but after returning from his well deserved tequila-splashed
vacation, he never gave a peep about this, and I "conveniently
forgot".

I still like the archives to be public. I hope Mark's remark there
was a joke?

--Guido van Rossum (home page: http://www.python.org/~guido/)


From MHammond@skippinet.com.au Tue Jun 8 14:38:03 1999
From: MHammond@skippinet.com.au (Mark Hammond)
Date: Tue, 8 Jun 1999 23:38:03 +1000
Subject: [Python-Dev] Python-dev archives going public
In-Reply-To: <199906081328.JAA14584@eric.cnri.reston.va.us>
Message-ID: <003101beb1b4$22786de0$0801a8c0@bobcat>
> I still like the archives to be public. I hope Mark's remark there
> was a joke?

Well, not really a joke, but I am not so naive as to think this is a
"private" forum even in the absence of archives.

What I meant was closer to "please don't make public statements based
purely on this information". I never agreed to keep it private, but by the
same token didn't want to start the rumour mills and get bad press for
either Dick or us ;-)

Mark.



From bwarsaw@cnri.reston.va.us Tue Jun 8 16:09:24 1999
From: bwarsaw@cnri.reston.va.us (Barry A. Warsaw)
Date: Tue, 8 Jun 1999 11:09:24 -0400 (EDT)
Subject: [Python-Dev] ActiveState & fork & Perl
References: <Pine.WNT.4.04.9906071550440.193-100000@rigoletto.ski.org>
<375C516E.76EC8ED4@lyra.org>
<14172.43513.272949.148790@cm-24-29-94-19.nycap.rr.com>
<199906081246.IAA14302@eric.cnri.reston.va.us>
Message-ID: <14173.12836.616873.953134@anthem.cnri.reston.va.us>
"Guido" == Guido van Rossum <guido@cnri.reston.va.us> writes:
Guido> Anybody care enough to try hacking configure.in, or should
Guido> I add this to my tired TODO list?

I'll give it a look. I've done enough autoconf hacking that it
shouldn't be too hard. I also need to get my string meths changes
into the tree...

-Barry


From gstein@lyra.org Tue Jun 8 19:11:56 1999
From: gstein@lyra.org (Greg Stein)
Date: Tue, 08 Jun 1999 11:11:56 -0700
Subject: [Python-Dev] Python-dev archives going public
References: <000b01beb15c$abd84ea0$0801a8c0@bobcat>
<375C8A66.56B3F26B@lyra.org> <199906081328.JAA14584@eric.cnri.reston.va.us>
Message-ID: <375D5CEC.340E2531@lyra.org>

Guido van Rossum wrote:
> [Please don't copy this out of this list :-]
> It's in the archives now... :-)
> Which reminds me... A while ago, Greg made some noises about the
> archives being public, and temporarily I made them private. In the
> following brief flurry of messages everybody who spoke up said they
> preferred the archives to be public (even though the list remains
> invitation-only). But I never made the change back, waiting for Greg
> to agree, but after returning from his well deserved tequila-splashed
> vacation, he never gave a peep about this, and I "conveniently
> forgot".
I appreciate the consideration, but figured it was a done deal based on
feedback.

My only consideration in keeping them private was the basic, human fact
that people could feel left out. For example, if they read the archives,
thought it was neat, and attempted to subscribe only to be refused. It
is a bit easier to avoid engendering those bad feelings if the archives
aren't public.

Cheers,
-g

--
Greg Stein, http://www.lyra.org/


From jim@digicool.com Tue Jun 8 19:41:11 1999
From: jim@digicool.com (Jim Fulton)
Date: Tue, 08 Jun 1999 18:41:11 +0000
Subject: [Python-Dev] Python-dev archives going public
References: <000b01beb15c$abd84ea0$0801a8c0@bobcat>
<375C8A66.56B3F26B@lyra.org> <199906081328.JAA14584@eric.cnri.reston.va.us> <375D5CEC.340E2531@lyra.org>
Message-ID: <375D63C7.6BB6697E@digicool.com>

Greg Stein wrote:
> My only consideration in keeping them private was the basic, human fact
> that people could feel left out. For example, if they read the archives,
> thought it was neat, and attempted to subscribe only to be refused. It
> is a bit easier to avoid engendering those bad feelings if the archives
> aren't public.
I agree.

Jim

--
Jim Fulton mailto:jim@digicool.com
Technical Director (540) 371-6909 Python Powered!
Digital Creations http://www.digicool.com http://www.python.org

Under US Code Title 47, Sec.227(b)(1)(C), Sec.227(a)(2)(B) This email
address may not be added to any commercial mail list with out my
permission. Violation of my privacy with advertising or SPAM will
result in a suit for a MINIMUM of $500 damages/incident, $1500 for
repeats.


From tismer@appliedbiometrics.com Tue Jun 8 20:37:21 1999
From: tismer@appliedbiometrics.com (Christian Tismer)
Date: Tue, 08 Jun 1999 21:37:21 +0200
Subject: [Python-Dev] Stackless Preview
References: <000901beafcc$424ec400$639e2299@tim> <375AD1DC.19C1C0F6@appliedbiometrics.com>
Message-ID: <375D70F1.37007192@appliedbiometrics.com>


Christian Tismer wrote:
> [a lot]
> fearing the feedback :-) ciao - chris

I expected everything but forgot to fear "no feedback". :-)

About 5 or 6 people seem to have taken the .zip
file. Now I'm wondering why nobody complains.
Was my code so wonderful, so disgustingly bad,
or is this just boring :-?

If it's none of the three above, I'd be happy to get
a hint if I should continue, or if and what I should
change.

Maybe it would make sense to add some documentation now,
and also to come up with an application which makes
use of the stackless implementation, since right now there
is not much to wonder about other than that it seems to work :-)

yes-call-me-impatient - ly chris

--
Christian Tismer :^)
Applied Biometrics GmbH : Have a break! Take a ride on Python's
Kaiserin-Augusta-Allee 101 : *Starship* http://starship.python.net
10553 Berlin : PGP key -> http://wwwkeys.pgp.net
PGP Fingerprint E182 71C7 1A9D 66E9 9D15 D3CC D4D7 93E2 1FAE F6DF
we're tired of banana software - shipped green, ripens at home


From jeremy@cnri.reston.va.us Tue Jun 8 21:09:15 1999
From: jeremy@cnri.reston.va.us (Jeremy Hylton)
Date: Tue, 8 Jun 1999 16:09:15 -0400 (EDT)
Subject: [Python-Dev] Stackless Preview
In-Reply-To: <375D70F1.37007192@appliedbiometrics.com>
References: <000901beafcc$424ec400$639e2299@tim>
<375AD1DC.19C1C0F6@appliedbiometrics.com>
<375D70F1.37007192@appliedbiometrics.com>
Message-ID: <14173.30485.113456.830246@bitdiddle.cnri.reston.va.us>
"CT" == Christian Tismer <tismer@appliedbiometrics.com> writes:
CT> Christian Tismer wrote: [a lot]
CT> > fearing the feedback :-) ciao - chris
CT> I expected everything but forgot to fear "no feedback". :-)

CT> About 5 or 6 people seem to have taken the .zip file. Now I'm
CT> wondering why nobody complains. Was my code so wonderful, so
CT> disgustingly bad, or is this just boring :-?

CT> If it's none of the three above, I'd be happy to get a hint if I
CT> should continue, or if and what I should change.

I'm one of the silent 5 or 6. My reasons fall under "None of the
above." They are three in number:
1. No time (the perennial excuse; next 2 weeks are quite hectic)
2. I tried to use ndiff to compare old and new ceval.c, but
ran into some problems with that tool. (Tim, it looks
like the line endings are identical -- all '\012'.)
3. Wasn't sure what to look at first

My only suggestion would be to have an executive summary. If there
was a short README file -- no more than 150 lines -- that described
the essentials of the approach and told me what to look at first, I
would be able to comment more quickly.

Jeremy


From tismer@appliedbiometrics.com Tue Jun 8 21:15:04 1999
From: tismer@appliedbiometrics.com (Christian Tismer)
Date: Tue, 08 Jun 1999 22:15:04 +0200
Subject: [Python-Dev] Stackless Preview
References: <000901beafcc$424ec400$639e2299@tim>
<375AD1DC.19C1C0F6@appliedbiometrics.com>
<375D70F1.37007192@appliedbiometrics.com> <14173.30485.113456.830246@bitdiddle.cnri.reston.va.us>
Message-ID: <375D79C8.90B3E721@appliedbiometrics.com>


Jeremy Hylton wrote:
> [...]
> I'm one of the silent 5 or 6. My reasons fall under "None of the
> above." They are three in number:
> 1. No time (the perennial excuse; next 2 weeks are quite hectic)
> 2. I tried to use ndiff to compare old and new ceval.c, but
> ran into some problems with that tool. (Tim, it looks
> like the line endings are identical -- all '\012'.)

Yes, there are a lot of changes.
As a hint: windiff from VC++ does a great job here.
You can see both sources in one view, in a very readable
colored form.

> 3. Wasn't sure what to look at first
>
> My only suggestion would be to have an executive summary. If there
> was a short README file -- no more than 150 lines -- that described
> the essentials of the approach and told me what to look at first, I
> would be able to comment more quickly.

Thanks a lot. Will do this tomorrow morning as my first task.

feeling much better - ciao - chris

--
Christian Tismer :^)
Applied Biometrics GmbH : Have a break! Take a ride on Python's
Kaiserin-Augusta-Allee 101 : *Starship* http://starship.python.net
10553 Berlin : PGP key -> http://wwwkeys.pgp.net
PGP Fingerprint E182 71C7 1A9D 66E9 9D15 D3CC D4D7 93E2 1FAE F6DF
we're tired of banana software - shipped green, ripens at home


From Vladimir.Marangozov@inrialpes.fr Tue Jun 8 23:29:27 1999
From: Vladimir.Marangozov@inrialpes.fr (Vladimir Marangozov)
Date: Wed, 9 Jun 1999 00:29:27 +0200 (DFT)
Subject: [Python-Dev] ActiveState & fork & Perl
In-Reply-To: <19990608124745.3136B303120@snelboot.oratrix.nl> from "Jack Jansen" at "Jun 8, 99 02:47:44 pm"
Message-ID: <199906082229.AAA48646@pukapuka.inrialpes.fr>

Jack Jansen wrote:
> NSPR looks rather promising! Does anyone have any experience with it? What
> I'd also be interested in is how it interacts with the "real" I/O
> system, i.e. can you mix and match NSPR calls with normal os calls, or will
> that break things?

I've looked at it in the past. From memory, NSPR is a fairly big chunk of
code and it seemed to me that it's self-contained for lots of system stuff.
Don't know about I/O, but I played with it to replace the BSD malloc it uses
with pymalloc, and I was pleased to see the resulting speed & mem stats after
rebuilding one of the past Mozilla distribs. This is all the experience I have
with it.

> The latter is important for Python, because there are lots of external
> libraries, and while some are user-built (image libraries, gdbm, etc) and
> could conceivably be converted to use NSPR others are not...

I guess that this one would be hard...

--
Vladimir MARANGOZOV | Vladimir.Marangozov@inrialpes.fr
http://sirac.inrialpes.fr/~marangoz | tel:(+33-4)76615277 fax:76615252


From Vladimir.Marangozov@inrialpes.fr Tue Jun 8 23:45:48 1999
From: Vladimir.Marangozov@inrialpes.fr (Vladimir Marangozov)
Date: Wed, 9 Jun 1999 00:45:48 +0200 (DFT)
Subject: [Python-Dev] Stackless Preview
In-Reply-To: <14173.30485.113456.830246@bitdiddle.cnri.reston.va.us> from "Jeremy Hylton" at "Jun 8, 99 04:09:15 pm"
Message-ID: <199906082245.AAA48828@pukapuka.inrialpes.fr>

Jeremy Hylton wrote:
> CT> If it's none of the three above, I'd be happy to get a hint if I
> CT> should continue, or if and what I should change.
>
> I'm one of the silent 5 or 6. My reasons fall under "None of the
> above." They are three in number:
> ...
> My only suggestion would be to have an executive summary. If there
> was a short README file -- no more than 150 lines -- that described
> the essentials of the approach and told me what to look at first, I
> would be able to comment more quickly.

Same here + a small wish: please save me the stripping of the ^M
line endings typical for MSW, so that I can load the files directly in
Xemacs on a Unix box. Otherwise, like Jeremy, I was a bit lost trying
to read ceval.c, which is already too hairy.

--
Vladimir MARANGOZOV | Vladimir.Marangozov@inrialpes.fr
http://sirac.inrialpes.fr/~marangoz | tel:(+33-4)76615277 fax:76615252


From tim_one@email.msn.com Wed Jun 9 03:27:37 1999
From: tim_one@email.msn.com (Tim Peters)
Date: Tue, 8 Jun 1999 22:27:37 -0400
Subject: [Python-Dev] Stackless Preview
In-Reply-To: <199906082245.AAA48828@pukapuka.inrialpes.fr>
Message-ID:

[Vladimir Marangozov]
> ...
> please save me the stripping of the ^M line endings typical for MSW,
> so that I can load the files directly in Xemacs on a Unix box.
Vlad, get linefix.py from Python FTP contrib's System area; converts among
Unix, Windows and Mac line conventions; to Unix by default. For that
matter, do a global replace of ^M in Emacs <wink>.
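A minimal sketch of the conversion a tool like linefix.py performs, assuming nothing beyond bytes methods (the name `to_unix` is made up):

```python
def to_unix(data: bytes) -> bytes:
    # Normalize Windows (\r\n) and old Mac (\r) line endings to Unix (\n).
    # Order matters: replace \r\n first so the lone-\r pass doesn't
    # double-convert Windows endings.
    return data.replace(b"\r\n", b"\n").replace(b"\r", b"\n")
```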

buncha-lazy-whiners<wink>-ly y'rs - tim




From tim_one@email.msn.com Wed Jun 9 03:27:35 1999
From: tim_one@email.msn.com (Tim Peters)
Date: Tue, 8 Jun 1999 22:27:35 -0400
Subject: [Python-Dev] Stackless Preview
In-Reply-To: <14173.30485.113456.830246@bitdiddle.cnri.reston.va.us>
Message-ID:

[Christian Tismer]
> ...
> If it's none of the three above, I'd be happy to get a hint if I
> should continue, or if and what I should change.
Sorry, Chris! Just a case of "no time" here. Of *course* you should
continue, and Guido should pop in with an encouraging word too -- or a
"forget it". I think this design opens the doors to a world of interesting
ideas, but that's based on informed prejudice rather than careful study of
your code. Cheer up: if everyone thought you were a lame ass, we all would
have studied your code intensely by now <wink>.

[Jeremy]
> 2. I tried to use ndiff to compare old and new ceval.c, but
> ran into some problems with that tool. (Tim, it looks
> like the line endings are identical -- all '\012'.)
Then let's treat this like a real bug <wink>: which version of Python did
you use? And ship me the files in a tarball (I'll find a way to extract
them intact). And does that specific Python+ndiff combo work OK on *other*
files? Or does it fail to find any lines in common no matter what you feed
it (a 1-line test case would be a real help <wink>)?

I couldn't provoke a problem with the stock 1.5.2 ndiff under the stock
1.5.2 Windows Python, using the then-current CVS snapshot of ceval.c as
file1 and the ceval.c from Christian's stackless_990606.zip file as file2.
Both files have \r\n line endings for me, though (one thanks to CVS line
translation, and the other thanks to WinZip line translation).
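For anyone reproducing such a comparison today: ndiff's algorithm later landed in the standard library as `difflib`, so a quick sanity check on two small inputs looks like this (sample data is mine):

```python
import difflib

# Two small "files" as lists of lines, endings kept, as ndiff expects.
a = "spam\neggs\nham\n".splitlines(keepends=True)
b = "spam\neggs\nbacon\n".splitlines(keepends=True)

# ndiff-style output: '  ' common, '- ' only in a, '+ ' only in b.
diff = list(difflib.ndiff(a, b))
```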

or-were-you-running-ndiff-under-the-stackless-python<wink>?-ly y'rs - tim




From tim_one@email.msn.com Wed Jun 9 03:27:40 1999
From: tim_one@email.msn.com (Tim Peters)
Date: Tue, 8 Jun 1999 22:27:40 -0400
Subject: [Python-Dev] licensing
In-Reply-To: <001101beb18b$6a049bd0$0801a8c0@bobcat>
Message-ID:

[Mark Hammond]
> ...
> [This may help explain to Guido and Tim my recent fervour in this area
> - I want to get the "new look" Pythonwin ready for the book. I just
> yesterday got a dockable interactive window happening. Now adding a
> splitter window to each window to expose a pyclbr based tree control and
> then it is time to stop (and re-write that chapter :-]
All right! Do get the latest CVS versions of these files: pyclbr has been
sped up a lot over the past two days, and is much less likely to get baffled
now. And AutoIndent.py now defaults usetabs to 1 (which, of course, means
it still uses spaces in new files <wink>).




From guido@CNRI.Reston.VA.US Wed Jun 9 04:31:11 1999
From: guido@CNRI.Reston.VA.US (Guido van Rossum)
Date: Tue, 08 Jun 1999 23:31:11 -0400
Subject: [Python-Dev] Splitting up the PVM
In-Reply-To: Your message of "Tue, 08 Jun 1999 22:27:35 EDT."
<000c01beb21f$a2bd5540$2fa22299@tim>
References: <000c01beb21f$a2bd5540$2fa22299@tim>
Message-ID: <199906090331.XAA23066@eric.cnri.reston.va.us>

Tim wrote:
> Sorry, Chris! Just a case of "no time" here. Of *course* you
> should continue, and Guido should pop in with an encouraging word
> too -- or a "forget it". I think this design opens the doors to a
> world of interesting ideas, but that's based on informed prejudice
> rather than careful study of your code. Cheer up: if everyone
> thought you were a lame ass, we all would have studied your code
> intensely by now <wink>.
No time here either...

I did try to have a quick peek and my first impression is that it's
*very* tricky code! You know what I think of that...

Here's what I think we should do first (I've mentioned this before but
nobody cheered me on :-). I'd like to see this as the basis for 1.6.

We should structurally split the Python Virtual Machine and related
code up into different parts -- both at the source code level and at
the runtime level. The core PVM becomes a replaceable component, and
so do a few other parts like the parser, the bytecode compiler, the
import code, and the interactive read-eval-print loop. Most object
implementations are shared between all -- or at least the interfaces
are interchangeable. Clearly, a few object types are specific to one
or another PVM (e.g. frames). The collection of builtins is also a
separate component (though some builtins may again be specific to a
PVM -- details, details!).

The goal of course, is to create a market for 3rd party components
here, e.g. Chris' flat PVM, or Skip's bytecode optimizer, or Greg's
importer, and so on.

Thoughts?
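Purely as a sketch of the component idea, with made-up names and nothing from any real Python internals: each replaceable part registers behind a tiny interface, and the runtime picks an implementation by name.

```python
# Hypothetical registry for swappable interpreter components.
REGISTRY = {}

def register(kind, name):
    # Decorator: file a component class under (kind, name).
    def deco(cls):
        REGISTRY[(kind, name)] = cls
        return cls
    return deco

def get_component(kind, name):
    # Instantiate the registered implementation.
    return REGISTRY[(kind, name)]()

@register("pvm", "classic")
class ClassicPVM:
    def run(self, code):
        return "classic ran %r" % (code,)

@register("pvm", "flat")
class FlatPVM:
    def run(self, code):
        return "flat ran %r" % (code,)
```

Third parties would add entries without touching the core dispatch, which is the "market for components" in miniature.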

--Guido van Rossum (home page: http://www.python.org/~guido/)


From da@ski.org Wed Jun 9 04:37:36 1999
From: da@ski.org (David Ascher)
Date: Tue, 8 Jun 1999 20:37:36 -0700 (Pacific Daylight Time)
Subject: [Python-Dev] Splitting up the PVM
In-Reply-To: <199906090331.XAA23066@eric.cnri.reston.va.us>
Message-ID: <Pine.WNT.4.05.9906082035550.157-100000@david.ski.org>
On Tue, 8 Jun 1999, Guido van Rossum wrote:

> We should structurally split the Python Virtual Machine and related
> code up into different parts -- both at the source code level and at
> the runtime level. The core PVM becomes a replaceable component, and
> so do a few other parts like the parser, the bytecode compiler, the
> import code, and the interactive read-eval-print loop.
>
> The goal of course, is to create a market for 3rd party components
> here, e.g. Chris' flat PVM, or Skip's bytecode optimizer, or Greg's
> importer, and so on.
>
> Thoughts?
If I understand it correctly, it means that I can fit in a third-party
read-eval-print loop, which is my biggest area of frustration with the
current internal structure. Sounds like a plan to me, and one which (lucky
for me) I'm not qualified for!

--david





From skip@mojam.com Wed Jun 9 04:45:33 1999
From: skip@mojam.com (Skip Montanaro)
Date: Tue, 8 Jun 1999 23:45:33 -0400 (EDT)
Subject: [Python-Dev] Stackless Preview
In-Reply-To: <375D70F1.37007192@appliedbiometrics.com>
References: <000901beafcc$424ec400$639e2299@tim>
<375AD1DC.19C1C0F6@appliedbiometrics.com>
<375D70F1.37007192@appliedbiometrics.com>
Message-ID: <14173.58054.869171.927699@cm-24-29-94-19.nycap.rr.com>

Chris> If it's none of the three above, I'd be happy to get a hint if I
Chris> should continue, or if and what I should change.

Chris,

My vote is for you to keep at it. I haven't looked at it because I have
absolutely zero free time available. This will probably continue until at
least the end of July, perhaps until Labor Day. Big doings at Musi-Cal and
in the Montanaro household (look for an area code change in a month or so).

Skip Montanaro | Mojam: "Uniting the World of Music" http://www.mojam.com/
skip@mojam.com | Musi-Cal: http://www.musi-cal.com/
518-372-5583


From tismer@appliedbiometrics.com Wed Jun 9 13:58:40 1999
From: tismer@appliedbiometrics.com (Christian Tismer)
Date: Wed, 09 Jun 1999 14:58:40 +0200
Subject: [Python-Dev] Splitting up the PVM
References: <000c01beb21f$a2bd5540$2fa22299@tim> <199906090331.XAA23066@eric.cnri.reston.va.us>
Message-ID: <375E6500.307EF39E@appliedbiometrics.com>


Guido van Rossum wrote:
> Tim wrote:
> > Sorry, Chris! Just a case of "no time" here. Of *course* you
> > should continue, and Guido should pop in with an encouraging word
> > too -- or a "forget it". I think this design opens the doors to a
> > world of interesting ideas, but that's based on informed prejudice
> > rather than careful study of your code. Cheer up: if everyone
> > thought you were a lame ass, we all would have studied your code
> > intensely by now <wink>.
>
> No time here either...
>
> I did try to have a quick peek and my first impression is that it's
> *very* tricky code! You know what I think of that...
Thanks for looking into it, and thanks for saying it's tricky.
Since I have not yet supplied proper documentation, this
impression is bound to come up.

But it is really not true. The code is not tricky
but straightforward and consistent, once one has understood
what it means to work without a stack, under the constraint
of avoiding too many changes. I didn't want to rewrite
the world; I just added the tiny missing bits.

I will write up my documentation now, and you will
understand what the difficulties were. These will not
vanish; "stackless" is a brainteaser. My problem was not how
to change the code; in the end it was how to change
my brain. Now everything is just obvious.
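To make the "work without a stack" idea concrete, here is a toy trampoline in Python (my own illustration, not the actual stackless code): each step returns either a result or a thunk for the next call, and one flat loop drives everything, so call depth never consumes C stack.

```python
def countdown(n, acc):
    # Each "call" is returned as a thunk instead of being made directly,
    # so the real recursion depth stays constant.
    if n == 0:
        return ("done", acc)
    return ("call", lambda: countdown(n - 1, acc + 1))

def trampoline(step):
    # The flat dispatch loop: run one step at a time, never nesting calls.
    tag, value = step
    while tag == "call":
        tag, value = value()
    return value

# Depths far beyond the usual C-stack recursion limit are fine:
# trampoline(countdown(100000, 0)) == 100000
```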
> Here's what I think we should do first (I've mentioned this before but
> nobody cheered me on :-). I'd like to see this as the basis for 1.6.
>
> We should structurally split the Python Virtual Machine and related
> code up into different parts -- both at the source code level and at
> the runtime level. The core PVM becomes a replaceable component, and
> so do a few other parts like the parser, the bytecode compiler, the
> import code, and the interactive read-eval-print loop. Most object
> implementations are shared between all -- or at least the interfaces
> are interchangeable. Clearly, a few object types are specific to one
> or another PVM (e.g. frames). The collection of builtins is also a
> separate component (though some builtins may again be specific to a
> PVM -- details, details!).
Good idea, and a lot of work.
Having different frames for different PVM's was too much for
me. Instead, I tried to adjust frames in a way that a lot
of machines can work with.

I tried to show the concept of having different VM's by
implementing a stackless map. Stackless map is a very tiny one
which uses frames again (and yes, this was really hacked).
Well, different frame flavors would make sense, perhaps.

But I have a central routine which handles all calls to frames,
and this is what I think is needed. I already *have*
pluggable interpreters here, since a function can produce
a frame which is bound to an interpreter, and push it
to the frame stack.
The goal of course, is to create a market for 3rd party components
here, e.g. Chris' flat PVM, or Skip's bytecode optimizer, or Greg's
importer, and so on.
I'm with that component goal, of course. Much work,
not for one person, but great.

I don't think it makes sense to make a flat PVM
pluggable, though. I would start with a flat PVM, since that opens
a world of possibilities. You can hardly plug flatness in
after you have started with the wrong stack layout. Vice versa,
plugging in the old machine would be possible.

later - chris

--
Christian Tismer :^)
Applied Biometrics GmbH : Have a break! Take a ride on Python's
Kaiserin-Augusta-Allee 101 : *Starship* http://starship.python.net
10553 Berlin : PGP key -> http://wwwkeys.pgp.net
PGP Fingerprint E182 71C7 1A9D 66E9 9D15 D3CC D4D7 93E2 1FAE F6DF
we're tired of banana software - shipped green, ripens at home


From tismer@appliedbiometrics.com Wed Jun 9 14:08:38 1999
From: tismer@appliedbiometrics.com (Christian Tismer)
Date: Wed, 09 Jun 1999 15:08:38 +0200
Subject: [Python-Dev] Stackless Preview
References: <000c01beb21f$a2bd5540$2fa22299@tim>
Message-ID: <375E6756.370BA78E@appliedbiometrics.com>


Tim Peters wrote:
[Christian Tismer]
...
If it's none of the three above, I'd be happy to get a hint if I
should continue, or if and what I should change.
Sorry, Chris! Just a case of "no time" here. Of *course* you should
continue, and Guido should pop in with an encouraging word too -- or a
"forget it".
Yup, I know this time problem only too well.
Well, I think I got something in between. I was warned
before, so I didn't try to write final code, but I managed
to prove the concept.

I *will* continue, regardless of what anybody says.
or-were-you-running-ndiff-under-the-stackless-python<wink>?-ly y'rs - tim
I didn't use ndiff, but regular "diff", and it worked.
But since there is not much change to the code, yet
some significant change to the control flow, I found
the diff output too confusing. Windiff was always open
when I wrote that, to be sure that I didn't trample
on things which I didn't want to mess up. A good tool!

ciao - chris

--
Christian Tismer :^)
Applied Biometrics GmbH : Have a break! Take a ride on Python's
Kaiserin-Augusta-Allee 101 : *Starship* http://starship.python.net
10553 Berlin : PGP key -> http://wwwkeys.pgp.net
PGP Fingerprint E182 71C7 1A9D 66E9 9D15 D3CC D4D7 93E2 1FAE F6DF
we're tired of banana software - shipped green, ripens at home


From bwarsaw@cnri.reston.va.us (Barry A. Warsaw) Wed Jun 9 15:48:34 1999
From: bwarsaw@cnri.reston.va.us (Barry A. Warsaw) (Barry A. Warsaw)
Date: Wed, 9 Jun 1999 10:48:34 -0400 (EDT)
Subject: [Python-Dev] Stackless Preview
References: <199906082245.AAA48828@pukapuka.inrialpes.fr>
<000d01beb21f$a3daac20$2fa22299@tim>
Message-ID: <14174.32450.29368.914458@anthem.cnri.reston.va.us>
"TP" == Tim Peters <tim_one@email.msn.com> writes:
TP> Vlad, get linefix.py from Python FTP contrib's System area;
TP> converts among Unix, Windows and Mac line conventions; to Unix
TP> by default. For that matter, do a global replace of ^M in
TP> Emacs <wink>.

I forgot to follow up to Vlad's original message, but in XEmacs (dunno
about FSFmacs), you can visit DOS-eol files without seeing the ^M's.
You will see a "DOS" in the modeline, and when you go to write the
file it'll ask you if you want to write it in "plain text". I use
XEmacs all the time to convert between DOS-eol and
eol-The-Way-God-Intended :)

To enable this, add the following to your .emacs file:

(require 'crypt)

-Barry


From tismer@appliedbiometrics.com Wed Jun 9 18:58:52 1999
From: tismer@appliedbiometrics.com (Christian Tismer)
Date: Wed, 09 Jun 1999 19:58:52 +0200
Subject: [Python-Dev] First Draft on Stackless Python
References: <199906082245.AAA48828@pukapuka.inrialpes.fr>
<000d01beb21f$a3daac20$2fa22299@tim> <14174.32450.29368.914458@anthem.cnri.reston.va.us>
Message-ID: <375EAB5C.138D32CF@appliedbiometrics.com>

Howdy,

I've begun with a first draft on Stackless Python.
Didn't have enough time to finish it, but something might
already be useful. (Should I better drop the fish idea?)
Will write the rest tomorrow.

ciao - chris

http://www.pns.cc/stackless/stackless.htm

--
Christian Tismer :^)
Applied Biometrics GmbH : Have a break! Take a ride on Python's
Kaiserin-Augusta-Allee 101 : *Starship* http://starship.python.net
10553 Berlin : PGP key -> http://wwwkeys.pgp.net
PGP Fingerprint E182 71C7 1A9D 66E9 9D15 D3CC D4D7 93E2 1FAE F6DF
we're tired of banana software - shipped green, ripens at home


From tim_one@email.msn.com Thu Jun 10 06:25:11 1999
From: tim_one@email.msn.com (Tim Peters)
Date: Thu, 10 Jun 1999 01:25:11 -0400
Subject: [Python-Dev] Splitting up the PVM
In-Reply-To: <375E6500.307EF39E@appliedbiometrics.com>
Message-ID: [Christian Tismer, replying to Guido's enthusiasm <wink>]
Thanks for looking into it, thanks for saying it's tricky.
Since I have failed to supply proper documentation yet, this
impression is bound to come up.

But it is really not true. The code is not tricky
but just straightforward and consistent, once one has understood
what it means to work without a stack, under the precondition
of avoiding too many changes. I didn't want to rewrite
the world, and I just added the tiny missing bits.

I will write up my documentation now, and you will
understand what the difficulties were. These will not
vanish, "stackless" is a brainteaser. My problem was not how
to change the code, but finally it was how to change
my brain. Now everything is just obvious.
FWIW, I believe you! There's something *inherently* tricky about
maintaining the effect of a stack without using the stack C supplies
implicitly, and from all you've said and what I've learned of your code, it
really isn't the code that's tricky here. You're making formerly-hidden
connections explicit, which means more stuff is visible, but also means more
power and flexibility *because* "more stuff is visible".

Agree too that this clearly <wink> moves in the direction of making the VM
pluggable.
...
I *will* continue, regardless of what anybody says.
Ah, if that's how this works, then STOP! Immediately! Don't you dare waste
more of our time with this crap <wink>.

want-some-money?-ly y'rs - tim




From tim_one@email.msn.com Thu Jun 10 06:44:50 1999
From: tim_one@email.msn.com (Tim Peters)
Date: Thu, 10 Jun 1999 01:44:50 -0400
Subject: [Python-Dev] Splitting up the PVM
In-Reply-To: <199906090331.XAA23066@eric.cnri.reston.va.us>
Message-ID: [Guido van Rossum]
...
Here's what I think we should do first (I've mentioned this before but
nobody cheered me on :-). I'd like to see this as the basis for 1.6.

We should structurally split the Python Virtual Machine and related
code up into different parts -- both at the source code level and at
the runtime level. The core PVM becomes a replaceable component, and
so do a few other parts like the parser, the bytecode compiler, the
import code, and the interactive read-eval-print loop. Most object
implementations are shared between all -- or at least the interfaces
are interchangeable. Clearly, a few object types are specific to one
or another PVM (e.g. frames). The collection of builtins is also a
separate component (though some builtins may again be specific to a
PVM -- details, details!).

The goal of course, is to create a market for 3rd party components
here, e.g. Chris' flat PVM, or Skip's bytecode optimizer, or Greg's
importer, and so on.

Thoughts?
The idea of major subsystems getting reworked to conform to well-defined and
well-controlled interfaces is certainly appealing.

I'm just more comfortable squeezing another 1.7% out of list.sort() <0.9
wink>.

trying-to-reduce-my-ambitions-to-match-my-time-ly y'rs - tim




From jack@oratrix.nl Thu Jun 10 09:49:31 1999
From: jack@oratrix.nl (Jack Jansen)
Date: Thu, 10 Jun 1999 10:49:31 +0200
Subject: [Python-Dev] Splitting up the PVM
In-Reply-To: Message by Guido van Rossum <guido@cnri.reston.va.us> ,
Tue, 08 Jun 1999 23:31:11 -0400 , <199906090331.XAA23066@eric.cnri.reston.va.us>
Message-ID: <19990610084931.55882303120@snelboot.oratrix.nl>
Here's what I think we should do first (I've mentioned this before but
nobody cheered me on :-).
Go, Guido, GO!!!!

What I'd like in the split you propose is to see which of the items would be
implementable in Python, and try to do the split in such a way that such a
Python implementation isn't ruled out.

Am I correct in guessing that after factoring out the components you mention
the only things that aren't in a "replaceable component" are the builtin
objects, and a little runtime glue (malloc and such)?
--
Jack Jansen | ++++ stop the execution of Mumia Abu-Jamal ++++
Jack.Jansen@oratrix.com | ++++ if you agree copy these lines to your sig ++++
www.oratrix.nl/~jack | see http://www.xs4all.nl/~tank/spg-l/sigaction.htm




From tismer@appliedbiometrics.com Thu Jun 10 13:16:20 1999
From: tismer@appliedbiometrics.com (Christian Tismer)
Date: Thu, 10 Jun 1999 14:16:20 +0200
Subject: [Python-Dev] Splitting up the PVM
References: <001401beb301$9cf20b00$af9e2299@tim>
Message-ID: <375FAC94.D17D43A7@appliedbiometrics.com>


Tim Peters wrote:
[Christian Tismer, replying to Guido's enthusiasm <wink>]
...
I will write up my documentation now, and you will
still under some work :)
understand what the difficulties were. These will not
vanish, "stackless" is a brainteaser. My problem was not how
to change the code, but finally it was how to change
my brain. Now everything is just obvious.
FWIW, I believe you! There's something *inherently* tricky about
maintaining the effect of a stack without using the stack C supplies
implicitly, and from all you've said and what I've learned of your code, it
really isn't the code that's tricky here. You're making formerly-hidden
connections explicit, which means more stuff is visible, but also means more
power and flexibility *because* "more stuff is visible".
I knew you would understand me. Feeling much, much better now :-))

After this is finalized, restartable exceptions
might be interesting to explore. No, Chris, do the doco...
I *will* continue, regardless of what anybody says.
Ah, if that's how this works, then STOP! Immediately! Don't you dare waste
more of our time with this crap <wink>.
Thanks, you fired me a continuation.

Here's the way to get me into an endless loop:
Give me an unsolvable problem and claim I can't do that. :)

(just realized that I'm just another pluggable interpreter)
want-some-money?-ly y'rs - tim
No, but meet you at least once in my life.

--
Christian Tismer :^)
Applied Biometrics GmbH : Have a break! Take a ride on Python's
Kaiserin-Augusta-Allee 101 : *Starship* http://starship.python.net
10553 Berlin : PGP key -> http://wwwkeys.pgp.net
PGP Fingerprint E182 71C7 1A9D 66E9 9D15 D3CC D4D7 93E2 1FAE F6DF
we're tired of banana software - shipped green, ripens at home


From arw@ifu.net Thu Jun 10 14:40:51 1999
From: arw@ifu.net (Aaron Watters)
Date: Thu, 10 Jun 1999 09:40:51 -0400
Subject: [Python-Dev] stackable ints [stupid idea (ignore) :v]
Message-ID: <375FC062.62850DE5@ifu.net>

While we're talking about stacks...

I've always considered it a major shame that Python ints and floats
and chars and stuff have anything to do with dynamic allocation, and I
always suspected it might be a major performance boost if there was
some way they could be manipulated without the need for dynamic
memory management.

One conceivable alternative approach would change the basic manipulation
of objects so that instead of representing objects via pyobject pointers
everywhere, we represent them using two "slots" in a structure for each
object, one of which is a type descriptor pointer and the other
being a (void *) which could contain the data directly for small objects
such as ints, floats, and chars. In this case, for example, integer
addition would never require any memory management, as it shouldn't,
I think, in a perfect world.

I.e., instead of

C-stack or static:         Heap:
(pyobject *) ------------> (refcount, typedescr, data ...)

in general you get

(typedescr,
 repr* ------------------> (refcount, data, ...)
)

or for small objects like ints and floats and chars simply

(typedescr,
 value)

with no dereferencing or memory management required. My feeling is
that common things like arithmetic and indexing lists of integers and
stuff could be much faster under this approach, since it reduces memory
management overhead and fragmentation, dereferencing, etc...

One bad thing, of course, is that this might be a drastic assault on the
way existing code works... Unless I'm just not being creative enough
with my thinking. Is this a good idea? If so, is there any way to add
it to the interpreter without breaking extension modules and everything
else? If Python 2.0 will break stuff anyway, would this be a good change
to the internals?

Curious... -- Aaron Watters

ps: I suppose another gotcha is "when do you do increfs/decrefs?"
because they no longer make sense for ints in this case... maybe add
a flag to the type descriptor, "increfable", and assume that the
typedescriptors are always in the CPU cache (?). This would slow down
increfs by a couple of cycles... Would it be worth it? Only the
benchmark knows... Another fix would be to put the refcount on the
static side with no speed penalty

(typedescr,
 repr* ------------------> data
 refcount
)

but would that be wasteful of space?
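[The two-slot scheme sketched above can be modelled in a few lines of
Python. This is purely an illustration; the names are hypothetical, and
the Python model itself of course still allocates tuples -- only the C
scheme being modelled would avoid the heap.]

```python
# Illustrative model of the two-slot representation: a "reference" is a
# (typedescr, payload) pair.  For small immediates the payload *is* the
# value, so arithmetic never has to touch the heap or any refcounts.
INT = "int"        # hypothetical type descriptor for inline integers
BOXED = "boxed"    # everything else: payload references a heap object

def make_int(i):
    return (INT, i)              # value lives in the slot itself

def add(a, b):
    (ta, va), (tb, vb) = a, b
    if ta is INT and tb is INT:
        return (INT, va + vb)    # pure slot arithmetic, no boxing
    raise TypeError("unsupported operand types")

print(add(make_int(2), make_int(3)))
```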





From guido@CNRI.Reston.VA.US Thu Jun 10 14:45:51 1999
From: guido@CNRI.Reston.VA.US (Guido van Rossum)
Date: Thu, 10 Jun 1999 09:45:51 -0400
Subject: [Python-Dev] Splitting up the PVM
In-Reply-To: Your message of "Thu, 10 Jun 1999 10:49:31 +0200."
<19990610084931.55882303120@snelboot.oratrix.nl>
References: <19990610084931.55882303120@snelboot.oratrix.nl>
Message-ID: [me]
Here's what I think we should do first (I've mentioned this before but
nobody cheered me on :-).
[Jack]
Go, Guido, GO!!!!

What I'd like in the split you propose is to see which of the items would be
implementable in Python, and try to do the split in such a way that such a
Python implementation isn't ruled out.
Indeed. The importing code and the read-eval-print loop are obvious
candidates (in fact IDLE shows how the latter can be done today). I'm
not sure if it makes sense to have a parser/compiler or the VM written
in Python, because of the expected slowdown (plus, the VM would
present a chicken-egg problem :-) although for certain purposes one
might want to do this. An optimizing pass would certainly be a good
candidate.
Am I correct in guessing that after factoring out the components you mention
the only things that aren't in a "replaceable component" are the builtin
objects, and a little runtime glue (malloc and such)?
I guess so (although how much exactly will only become clear when it's
done). I suspect that things like thread-safety and GC policy are also
pervasive.

--Guido van Rossum (home page: http://www.python.org/~guido/)


From guido@CNRI.Reston.VA.US Thu Jun 10 15:11:23 1999
From: guido@CNRI.Reston.VA.US (Guido van Rossum)
Date: Thu, 10 Jun 1999 10:11:23 -0400
Subject: [Python-Dev] stackable ints [stupid idea (ignore) :v]
In-Reply-To: Your message of "Thu, 10 Jun 1999 09:40:51 EDT."
<375FC062.62850DE5@ifu.net>
References: <375FC062.62850DE5@ifu.net>
Message-ID: [Aaron]
I've always considered it a major shame that Python ints and floats
and chars and stuff have anything to do with dynamic allocation, and
I always suspected it might be a major speed performance boost if
there was some way they could be manipulated without the need for
dynamic memory management.
What you're describing is very close to what I recall I once read
about the runtime organization of Icon. Perl may also use a variant
on this (it has fixed-length object headers). On the other hand, I
believe Smalltalks typically use something like the following ABC
trick:

In ABC, we used a variation: objects were represented by pointers as
in Python, except when the low bit was 1, in which case the remaining
31 bits were a "small int". My experience with this approach was that
it probably saved some memory, but perhaps not time (since almost all
operations on objects were slowed down by the check "is it an int?"
before the pointer could be accessed); and that because of this it was
a major hassle in keeping the implementation code correct. There was
always the temptation to make a check early in a piece of code and
then skip the check later on, which sometimes didn't work when objects
switched places. Plus in general the checks made the code less
readable, and it was just one more thing to remember to do.
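The low-bit trick can be sketched with plain integer arithmetic (a
hypothetical model of the scheme described, not ABC's actual code):
pointers are even addresses, so a word with its low bit set is free to
carry a 31-bit integer in the remaining bits.

```python
# Model of the ABC representation: a 32-bit word is a pointer when its
# low bit is 0, and a "small int" when the low bit is 1.
def box_small_int(i):
    return ((i & 0x7FFFFFFF) << 1) | 1   # 31-bit payload, tag bit set

def is_small_int(word):
    return word & 1 == 1                  # the check every operation pays

def unbox_small_int(word):
    v = word >> 1
    # sign-extend the 31-bit payload (bit 30 is the sign bit)
    return v - 0x80000000 if v & 0x40000000 else v

assert unbox_small_int(box_small_int(42)) == 42
assert unbox_small_int(box_small_int(-1)) == -1
```

The `is_small_int` test before every dereference is exactly the
"is it an int?" check whose cost and error-proneness are described above.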

The Icon approach (i.e. yours) seems to require a complete rethinking
of all object implementations and all APIs at the C level -- perhaps
we could think about it for Python 2.0. Some ramifications:

- Uses more memory for highly shared objects (there are as many copies
of the type pointer as there are references).

- Thus, lists take double the memory assuming they reference objects
that also exist elsewhere. This affects the performance of slices
etc.

- On the other hand, a list of ints takes half the memory (given that
most of those ints are not shared).

- *Homogeneous* lists (where all elements have the same type --
i.e. arrays) can be represented more efficiently by having only one
copy of the type pointer. This was an idea for ABC (whose type system
required all container types to be homogeneous) that was never
implemented (because in practice the type check wasn't always applied,
and the top-level namespace used by the interactive command
interpreter violated all the rules).

- Reference count manipulations could be done by a macro (or C++
behind-the-scenes magic using copy constructors and destructors) that
calls a function in the type object -- i.e. each object could decide
on its own reference counting implementation :-)

--Guido van Rossum (home page: http://www.python.org/~guido/)


From bwarsaw@cnri.reston.va.us (Barry A. Warsaw) Thu Jun 10 19:02:30 1999
From: bwarsaw@cnri.reston.va.us (Barry A. Warsaw) (Barry A. Warsaw)
Date: Thu, 10 Jun 1999 14:02:30 -0400 (EDT)
Subject: [Python-Dev] stackable ints [stupid idea (ignore) :v]
References: <375FC062.62850DE5@ifu.net>
<199906101411.KAA29962@eric.cnri.reston.va.us>
Message-ID: <14175.64950.720465.456133@anthem.cnri.reston.va.us>
"Guido" == Guido van Rossum <guido@cnri.reston.va.us> writes:
Guido> In ABC, we used a variation: objects were represented by
Guido> pointers as in Python, except when the low bit was 1, in
Guido> which case the remaining 31 bits were a "small int".

Very similar to how Emacs Lisp manages its type system, which
XEmacs extended. The following is from the XEmacs Internals
documentation[1]. XEmacs' object representation (on a 32 bit machine)
uses the top bit as a GC mark bit, followed by three type tag bits,
followed by a pointer or an integer:

[ 3 3 2 2 2 2 2 2 2 2 2 2 1 1 1 1 1 1 1 1 1 1 0 0 0 0 0 0 0 0 0 0 ]
[ 1 0 9 8 7 6 5 4 3 2 1 0 9 8 7 6 5 4 3 2 1 0 9 8 7 6 5 4 3 2 1 0 ]

 ^  <--->  <------------------------------------------------------>
 |   tag       a pointer to a structure, or an integer
 `---> mark bit

Of the 8 possible types representable by the tag bits, one is a
"record" type, which essentially allows an unlimited (well, 2^32)
number of data types.
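Decoding such a word is a couple of shifts and masks; a sketch of the
layout as described above (not XEmacs' actual macros):

```python
# Decode an XEmacs-style word: bit 31 = GC mark, bits 28..30 = type tag,
# bits 0..27 = a pointer or an integer payload.
MARK_BIT     = 1 << 31
TAG_SHIFT    = 28
TAG_MASK     = 0x7
PAYLOAD_MASK = (1 << 28) - 1

def decode(word):
    mark    = bool(word & MARK_BIT)
    tag     = (word >> TAG_SHIFT) & TAG_MASK
    payload = word & PAYLOAD_MASK
    return mark, tag, payload

# marked, tag 3, payload 42:
assert decode((1 << 31) | (3 << 28) | 42) == (True, 3, 42)
```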

As you might guess there are lots of interesting details and
limitations to this scheme, with lots of interesting macros in the C
code :). Reading and debugging the C implementation gets fun too
(we'll ignore for the moment all the GCPRO'ing going on -- if you
think INCREF/DECREF is trouble prone, hah!).

Whether or not this is at all relevant for Python 2.0, it all seems to
work pretty well in (X)Emacs.
"AW" == Aaron Watters <arw@ifu.net> writes:
AW> ps: I suppose another gotcha is "when do you do
AW> increfs/decrefs?" because they no longer make sense for ints
AW> in this case... maybe add a flag to the type descriptor
AW> "increfable" and assume that the typedescriptors are always in
AW> the CPU cache (?). This would slow down increfs by a couple
AW> cycles... Would it be worth it? Only the benchmark knows...
AW> Another fix would be to put the refcount in the static side
AW> with no speed penalty
(typedescr
repr* ----------------------> data
refcount
)
AW> but would that be wasteful of space?

Once again, you can move the refcount out of the objects, a la
NextStep. Could save space and improve LOC for read-only objects.

-Barry

[1] The Internals documentation comes with XEmacs's Info
documetation. Hit:

C-h i m Internals RET m How RET


From tismer@appliedbiometrics.com Thu Jun 10 20:53:10 1999
From: tismer@appliedbiometrics.com (Christian Tismer)
Date: Thu, 10 Jun 1999 21:53:10 +0200
Subject: [Python-Dev] Stackless Preview
References: <000d01beb21f$a3daac20$2fa22299@tim>
Message-ID: <376017A6.DC619723@appliedbiometrics.com>

Howdy,

I worked a little more on the docs and figured out
that I could use a hint.
http://www.pns.cc/stackless/stackless.htm

Trying to give an example of how coroutines could work,
some weaknesses showed up. I wanted to write a
function coroutine_transfer which swaps two frame
chains. This function should return my unwind token,
but unfortunately in that case a real result would
be needed as well.

Well, I know of several ways out, but it's a matter of
design, and I'd like to find the most elegant solution
for this. Could perhaps someone of those who encouraged
me have a look into the problem? Do I have to add yet
another field for return values and handle that in the
dispatcher?

thanks - chris (tired of thinking)

--
Christian Tismer :^)
Applied Biometrics GmbH : Have a break! Take a ride on Python's
Kaiserin-Augusta-Allee 101 : *Starship* http://starship.python.net
10553 Berlin : PGP key -> http://wwwkeys.pgp.net
PGP Fingerprint E182 71C7 1A9D 66E9 9D15 D3CC D4D7 93E2 1FAE F6DF
we're tired of banana software - shipped green, ripens at home


From bwarsaw@cnri.reston.va.us (Barry A. Warsaw) Fri Jun 11 00:32:26 1999
From: bwarsaw@cnri.reston.va.us (Barry A. Warsaw) (Barry A. Warsaw)
Date: Thu, 10 Jun 1999 19:32:26 -0400 (EDT)
Subject: [Python-Dev] String methods... finally
Message-ID: <14176.19210.146525.172100@anthem.cnri.reston.va.us>

I've finally checked my string methods changes into the source tree,
albeit on a CVS branch (see below). These changes are outgrowths of
discussions we've had on the string-sig, with I think Greg Stein
giving lots of very useful early feedback. I'll call these changes
controversial (hence the branch) because Guido hasn't had much
opportunity to play with them. Now that he -- and you -- can check
them out, I'm sure I'll get lots more feedback!

First, to check them out you need to switch to the string_methods CVS
branch. On Un*x:

cvs update -r string_methods

You might want to do this in a separate tree because this will sticky
tag your tree to this branch. If so, try

cvs checkout -r string_methods python

Here's a brief summary of the changes (as best I can restore the state
-- it's been a while since I actually made all these changes ;)

Strings now have as methods most of the functions that were previously
only in the string module. If you've played with JPython, you've
already had this feature for a while. So you can do:

Python 1.5.2+ (#1, Jun 10 1999, 18:22:14) [GCC 2.8.1] on sunos5
Copyright 1991-1995 Stichting Mathematisch Centrum, Amsterdam
>>> s = 'Hello There Devheads'
>>> s.lower()
'hello there devheads'
>>> s.upper()
'HELLO THERE DEVHEADS'
>>> s.split()
['Hello', 'There', 'Devheads']
>>> 'hello'.upper()
'HELLO'

that sort of thing. Some of the string module functions don't make
sense as string methods, like join, and others never had a C
implementation so weren't added, like center.

Two new methods startswith and endswith act like their Java cousins.

The string module has been rewritten to be completely (I hope)
backwards compatible. No code should break, though they could be
slower. Guido and I decided that was acceptable.

What else? Some cleaning up of the internals based on Greg's
suggestions. A couple of new C API additions. Builtin int(), long(),
and float() have grown a few new features. I believe they are
essentially interchangeable with string.atoi(), string.atol(), and
string.atof() now.

After you guys get to toast me (in either sense of the word) for a
while and these changes settle down, I'll make a wider announcement.

Enjoy,
-Barry


From da@ski.org Fri Jun 11 00:37:54 1999
From: da@ski.org (David Ascher)
Date: Thu, 10 Jun 1999 16:37:54 -0700 (Pacific Daylight Time)
Subject: [Python-Dev] String methods... finally
In-Reply-To: <14176.19210.146525.172100@anthem.cnri.reston.va.us>
Message-ID: <Pine.WNT.4.04.9906101635500.173-100000@rigoletto.ski.org>
On Thu, 10 Jun 1999, Barry A. Warsaw wrote:

I've finally checked my string methods changes into the source tree,

Great!
... others never had a C implementation so weren't added, like center.
I assume that's not a design decision but a "haven't gotten around to it
yet" statement, right?
Two new methods startswith and endswith act like their Java cousins.
aaaah... <sigh of relief>.

--david



From MHammond@skippinet.com.au Fri Jun 11 00:59:17 1999
From: MHammond@skippinet.com.au (Mark Hammond)
Date: Fri, 11 Jun 1999 09:59:17 +1000
Subject: [Python-Dev] String methods... finally
In-Reply-To: <14176.19210.146525.172100@anthem.cnri.reston.va.us>
Message-ID: <003101beb39d$41b1c7c0$0801a8c0@bobcat>
I've finally checked my string methods changes into the source tree,
albeit on a CVS branch (see below). These changes are outgrowths of
Yay!

Would this also be a good opportunity to dust-off the Unicode
implementation the string-sig recently came up with (as implemented by
Fredrik) and get this in as a type?

Although we still have the unresolved issue of how to use PyArg_ParseTuple
etc to convert to/from Unicode and 8bit, it would still be nice to have
Unicode and String objects capable of being used interchangeably at the
Python level.

Of course, the big problem with attempting to test out these sorts of
changes is that you must do so in code that will never see the public for a
good 12 months. I suppose a 1.5.25 is out of the question ;-)

Mark.



From guido@CNRI.Reston.VA.US Fri Jun 11 02:40:07 1999
From: guido@CNRI.Reston.VA.US (Guido van Rossum)
Date: Thu, 10 Jun 1999 21:40:07 -0400
Subject: [Python-Dev] String methods... finally
In-Reply-To: Your message of "Fri, 11 Jun 1999 09:59:17 +1000."
<003101beb39d$41b1c7c0$0801a8c0@bobcat>
References: <003101beb39d$41b1c7c0$0801a8c0@bobcat>
Message-ID: <199906110140.VAA02180@eric.cnri.reston.va.us>
Would this also be a good opportunity to dust-off the Unicode
implementation the string-sig recently came up with (as implemented by
Fredrik) and get this in as a type?

Although we still have the unresolved issue of how to use PyArg_ParseTuple
etc to convert to/from Unicode and 8bit, it would still be nice to have
Unicode and String objects capable of being used interchangeably at the
Python level.
Yes, yes, yes! Even if it's not supported everywhere, at least having
the Unicode type in the source tree would definitely help!
Of course, the big problem with attempting to test out these sorts of
changes is that you must do so in code that will never see the public for a
good 12 months. I suppose a 1.5.25 is out of the question ;-)
We'll see about that...

(I sometimes wished I wasn't in the business of making releases. I've
asked for help with making essential patches to 1.5.2 available but
nobody volunteered... :-( )

--Guido van Rossum (home page: http://www.python.org/~guido/)


From tim_one@email.msn.com Fri Jun 11 04:08:28 1999
From: tim_one@email.msn.com (Tim Peters)
Date: Thu, 10 Jun 1999 23:08:28 -0400
Subject: [Python-Dev] stackable ints [stupid idea (ignore) :v]
In-Reply-To: <14175.64950.720465.456133@anthem.cnri.reston.va.us>
Message-ID: <000a01beb3b7$adda3b20$329e2299@tim>

Jumping in to opine that mixing tag/type bits with native pointers is a
Really Bad Idea. Put the bits on the low end and word-addressed machines
are screwed. Put the bits on the high end and you've made severe
assumptions about how the platform parcels out address space. In any case
you're stuck with ugly macros everywhere.

This technique was pioneered by Lisps, and was beautifully exploited by the
Symbolics Lisp Machine and TI Lisp Explorer hardware. Lisp people don't
want to admit those failed, so continue simulating the HW design by hand at
comparatively sluggish C speed <0.6 wink>.

BTW, I've never heard this approach argued as a speed optimization (except
in the HW implementations): software mask-test-branch around every
inc/dec-ref to exempt ints is a nasty new repeated expense. The original
motivation was to save space, and that back in the days when a 128Mb RAM
chip wasn't even conceivable, let alone under $100 <wink>.

once-wrote-a-functional-language-interpreter-in-8085-assembler-that-ran-
in-24Kb-cuz-that's-all-there-was-but-don't-feel-i-need-to-repeat-the-
experience-today<wink>-ly y'rs - tim




From bwarsaw@python.org Fri Jun 11 04:13:29 1999
From: bwarsaw@python.org (Barry A. Warsaw)
Date: Thu, 10 Jun 1999 23:13:29 -0400 (EDT)
Subject: [Python-Dev] String methods... finally
References: <14176.19210.146525.172100@anthem.cnri.reston.va.us>
<Pine.WNT.4.04.9906101635500.173-100000@rigoletto.ski.org>
Message-ID: <14176.32473.408675.992145@anthem.cnri.reston.va.us>
"DA" == David Ascher <da@ski.org> writes:
... others never had a C implementation so weren't added, like
center.
DA> I assume that's not a design decision but a "haven't gotten
DA> around to it yet" statement, right?

I think we decided that they weren't used enough to implement in
C.
Two new methods startswith and endswith act like their Java
cousins.
DA> aaaah... <sigh of relief>.

Tell me about it!
-Barry


From tim_one@email.msn.com Fri Jun 11 04:33:25 1999
From: tim_one@email.msn.com (Tim Peters)
Date: Thu, 10 Jun 1999 23:33:25 -0400
Subject: [Python-Dev] String methods... finally
In-Reply-To: <14176.19210.146525.172100@anthem.cnri.reston.va.us>
Message-ID: <000b01beb3bb$29ccdaa0$329e2299@tim>
Two new methods startswith and endswith act like their Java cousins.
Barry, suggest that both of these grow optional start and end slice indices.
Why? It's Pythonic <wink>. Really, I'm forever marching over huge strings
a slice-pair at a time, and it's important that searches and matches never
give me false hits due to slobbering over the current slice bounds. regexp
objects in general, and string.find/.rfind in particular, support this
beautifully. Java feels less need since sub-stringing is via cheap
descriptor there. The optional indices wouldn't hurt Java, but would help
Python.

then-again-if-strings-were-so-great-i'd-switch-to-tcl<wink>-ly y'rs - tim




From bwarsaw@cnri.reston.va.us (Barry A. Warsaw) Fri Jun 11 04:41:55 1999
From: bwarsaw@cnri.reston.va.us (Barry A. Warsaw) (Barry A. Warsaw)
Date: Thu, 10 Jun 1999 23:41:55 -0400 (EDT)
Subject: [Python-Dev] stackable ints [stupid idea (ignore) :v]
References: <14175.64950.720465.456133@anthem.cnri.reston.va.us>
<000a01beb3b7$adda3b20$329e2299@tim>
Message-ID: <14176.34179.125397.282079@anthem.cnri.reston.va.us>
"TP" == Tim Peters <tim_one@email.msn.com> writes:
TP> Jumping in to opine that mixing tag/type bits with native
TP> pointers is a Really Bad Idea. Put the bits on the low end
TP> and word-addressed machines are screwed. Put the bits on the
TP> high end and you've made severe assumptions about how the
TP> platform parcels out address space. In any case you're stuck
TP> with ugly macros everywhere.

Ah, so you /have/ read the Emacs source code! I'll agree that it's
just an RBI for Emacs, but for Python, it'd be a RFSI.

TP> This technique was pioneered by Lisps, and was beautifully
TP> exploited by the Symbolics Lisp Machine and TI Lisp Explorer
TP> hardware. Lisp people don't want to admit those failed, so
TP> continue simulating the HW design by hand at comparatively
TP> sluggish C speed <0.6 wink>.

But of course, the ghosts live on at the FSF and xemacs.org (couldn't
tell ya much about how modren <sic> Lisps do it).

-Barry


From skip@mojam.com (Skip Montanaro) Fri Jun 11 05:26:49 1999
From: skip@mojam.com (Skip Montanaro) (Skip Montanaro)
Date: Fri, 11 Jun 1999 00:26:49 -0400 (EDT)
Subject: [Python-Dev] String methods... finally
In-Reply-To: <14176.19210.146525.172100@anthem.cnri.reston.va.us>
References: <14176.19210.146525.172100@anthem.cnri.reston.va.us>
Message-ID: <14176.36294.902853.594510@cm-24-29-94-19.nycap.rr.com>

Barry> Some of the string module functions don't make sense as string
Barry> methods, like join, and others never had a C implementation so
Barry> weren't added, like center.

I take it string.capwords falls into that category. It's one of those
things that's so easy to write in Python, with so little speed gain from
going to C, that it didn't make much sense to add it to the strop module,
right?

I see the following functions in string.py that could reasonably be
methodized:

ljust, rjust, center, expandtabs, capwords

That's not very many, and it would appear that this stuff won't see
widespread use for quite some time. I think for completeness' sake we should
bite the bullet on them.

BTW, I built it and think it is very cool. Tipping my virtual hat to Barry,
I am...

Skip Montanaro | Mojam: "Uniting the World of Music" http://www.mojam.com/
skip@mojam.com | Musi-Cal: http://www.musi-cal.com/
518-372-5583


From skip@mojam.com (Skip Montanaro) Fri Jun 11 05:57:15 1999
From: skip@mojam.com (Skip Montanaro) (Skip Montanaro)
Date: Fri, 11 Jun 1999 00:57:15 -0400 (EDT)
Subject: [Python-Dev] String methods... finally
In-Reply-To: <14176.36294.902853.594510@cm-24-29-94-19.nycap.rr.com>
References: <14176.19210.146525.172100@anthem.cnri.reston.va.us>
<14176.36294.902853.594510@cm-24-29-94-19.nycap.rr.com>
Message-ID: <14176.38521.124491.987817@cm-24-29-94-19.nycap.rr.com>

Skip> I see the following functions in string.py that could reasonably be
Skip> methodized:

Skip> ljust, rjust, center, expandtabs, capwords

It occurred to me just a few minutes after sending my previous message that
it might make sense to make string.join a method for lists and tuples.
They'd obviously have to make the same type checks that string.join does.

That would leave the string/strip modules implementing just a couple
functions.

Skip


From da@ski.org Fri Jun 11 06:09:46 1999
From: da@ski.org (David Ascher)
Date: Thu, 10 Jun 1999 22:09:46 -0700 (Pacific Daylight Time)
Subject: [Python-Dev] String methods... finally
In-Reply-To: <14176.38521.124491.987817@cm-24-29-94-19.nycap.rr.com>
Message-ID: <Pine.WNT.4.05.9906102157440.173-100000@david.ski.org>
On Fri, 11 Jun 1999, Skip Montanaro wrote:

It occurred to me just a few minutes after sending my previous message that
it might make sense to make string.join a method for lists and tuples.
They'd obviously have to make the same type checks that string.join does.
as in:
>>> ['spam!', 'eggs!'].join()
'spam! eggs!'

?

I like the notion, but I think it would naturally migrate towards
genericity, at which point it might be called "reduce", so that:
>>> ['spam!', 'eggs!'].reduce()
'spam!eggs!'
>>> ['spam!', 'eggs!'].reduce(' ')
'spam! eggs!'
>>> [1,2,3].reduce()
6    # 1 + 2 + 3
>>> [1,2,3].reduce(10)
26   # 1 + 10 + 2 + 10 + 3

note that string.join(foo) == foo.reduce(' ')
and string.join(foo, '') == foo.reduce()
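A sketch of the generalized reduce David describes, written here as a free function (the name seq_reduce and the separator handling are assumptions taken from his examples):

```python
def seq_reduce(seq, sep=None):
    # fold the sequence with +, interleaving the separator when one is given
    result = seq[0]
    for item in seq[1:]:
        if sep is not None:
            result = result + sep
        result = result + item
    return result

assert seq_reduce(['spam!', 'eggs!'], ' ') == 'spam! eggs!'
assert seq_reduce([1, 2, 3], 10) == 26   # 1 + 10 + 2 + 10 + 3
```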

--david




From guido@CNRI.Reston.VA.US Fri Jun 11 06:16:29 1999
From: guido@CNRI.Reston.VA.US (Guido van Rossum)
Date: Fri, 11 Jun 1999 01:16:29 -0400
Subject: [Python-Dev] String methods... finally
In-Reply-To: Your message of "Thu, 10 Jun 1999 22:09:46 PDT."
<Pine.WNT.4.05.9906102157440.173-100000@david.ski.org>
References: <Pine.WNT.4.05.9906102157440.173-100000@david.ski.org>
Message-ID: <199906110516.BAA02520@eric.cnri.reston.va.us>
On Fri, 11 Jun 1999, Skip Montanaro wrote:

It occurred to me just a few minutes after sending my previous message that
it might make sense to make string.join a method for lists and tuples.
They'd obviously have to make the same type checks that string.join does.
as in:
>>> ['spam!', 'eggs!'].join()
'spam! eggs!'
Note that this is not as powerful as string.join(); the latter works
on any sequence, not just on lists and tuples. (Though that may not
be a big deal.)

I also find it slightly objectionable that this is a general list
method but only works if the list contains only strings; Dave Ascher's
generalization to reduce() is cute but strikes me as more general
than useful, and the name will forever present a mystery to most
newcomers.

Perhaps join() ought to be a built-in function?

--Guido van Rossum (home page: http://www.python.org/~guido/)


From da@ski.org Fri Jun 11 06:23:06 1999
From: da@ski.org (David Ascher)
Date: Thu, 10 Jun 1999 22:23:06 -0700 (Pacific Daylight Time)
Subject: [Python-Dev] String methods... finally
In-Reply-To: <199906110516.BAA02520@eric.cnri.reston.va.us>
Message-ID: <Pine.WNT.4.05.9906102220450.173-100000@david.ski.org>
On Fri, 11 Jun 1999, Guido van Rossum wrote:

Perhaps join() ought to be a built-in function?
Would it do the moral equivalent of a reduce(operator.add, ...) or of a
string.join?

I think it should do the former (otherwise something about 'string' should
be in the name), and as a consequence I think it shouldn't have the
default whitespace spacer.

cute-but-general'ly y'rs, david



From da@ski.org Fri Jun 11 06:35:42 1999
From: da@ski.org (David Ascher)
Date: Thu, 10 Jun 1999 22:35:42 -0700 (Pacific Daylight Time)
Subject: [Python-Dev] Aside: apply syntax
Message-ID: <Pine.WNT.4.05.9906102232210.173-100000@david.ski.org>

I've seen repeatedly on c.l.p a suggestion to modify the syntax & the core
to allow * and ** in function calls, so that:


class SubFoo(Foo):
    def __init__(self, *args, **kw):
        apply(Foo, (self, ) + args, kw)
    ...

could be written

class SubFoo(Foo):
    def __init__(self, *args, **kw):
        Foo(self, *args, **kw)
    ...
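A runnable sketch of the two spellings side by side (Foo here is a stand-in class, and apply_ emulates the builtin apply(), since the comparison only needs the call semantics; Foo.__init__ is used explicitly so both versions actually run):

```python
def apply_(func, args=(), kw=None):
    # stand-in for the builtin apply()
    return func(*args, **(kw or {}))

class Foo:
    def __init__(self, *args, **kw):
        self.args, self.kw = args, kw

class SubFoo(Foo):
    def __init__(self, *args, **kw):
        # the proposed syntax: forward everything without apply()
        Foo.__init__(self, *args, **kw)

class SubFoo2(Foo):
    def __init__(self, *args, **kw):
        # the apply() spelling being replaced
        apply_(Foo.__init__, (self,) + args, kw)

a, b = SubFoo(1, x=2), SubFoo2(1, x=2)
```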

I really like this notion, but before I poke around trying to see if it's
doable, I'd like to get feedback on whether y'all think it's a good idea
or not. And if someone else wants to do it, feel free -- I am of course
swamped, and I won't get to it until after rich comparisons.

FWIW, apply() is one of my least favorite builtins, aesthetically
speaking.

--david



From da@ski.org Fri Jun 11 06:36:30 1999
From: da@ski.org (David Ascher)
Date: Thu, 10 Jun 1999 22:36:30 -0700 (Pacific Daylight Time)
Subject: [Python-Dev] Re: Aside: apply syntax
In-Reply-To: <Pine.WNT.4.05.9906102232210.173-100000@david.ski.org>
Message-ID: <Pine.WNT.4.05.9906102236050.173-100000@david.ski.org>

On Thu, 10 Jun 1999, David Ascher wrote:

class SubFoo(Foo):
    def __init__(self, *args, **kw):
        apply(Foo, (self, ) + args, kw)
    ...

could be written

class SubFoo(Foo):
    def __init__(self, *args, **kw):
        Foo(self, *args, **kw)
Of course I meant Foo.__init__ in both of the above!

--david



From skip@mojam.com (Skip Montanaro) Fri Jun 11 08:07:09 1999
From: skip@mojam.com (Skip Montanaro) (Skip Montanaro)
Date: Fri, 11 Jun 1999 03:07:09 -0400 (EDT)
Subject: [Python-Dev] String methods... finally
In-Reply-To: <Pine.WNT.4.05.9906102220450.173-100000@david.ski.org>
References: <199906110516.BAA02520@eric.cnri.reston.va.us>
<Pine.WNT.4.05.9906102220450.173-100000@david.ski.org>
Message-ID: <14176.45761.801671.880774@cm-24-29-94-19.nycap.rr.com>

David> I think it should do the former (otherwise something about
David> 'string' should be in the name), and as a consequence I think it
David> shouldn't have the default whitespace spacer.

Perhaps "joinstrings" would be an appropriate name (though it seems
gratuitously long), or join should call str() on non-string elements.

My thought here is that we have left in the string module a couple functions
that ought to be string object methods but aren't yet mostly for convenience
or time constraints, and one (join) that is 99.9% of the time used on lists
or tuples of strings. That leaves a very small handful of methods that
don't naturally fit somewhere else. You can, of course, complete the
picture and add a join method to string objects, which would be useful to
explode them into individual characters. That would complete the
join-as-a-sequence-method picture, I think. If you don't, somebody else (and
not me, cuz I'll know why already!) is bound to ask why capwords, join,
ljust, etc. got left behind in the string module while all the other
functions got promotions to object methods.

Oh, one other thing I forgot. Split (join) and splitfields (joinfields)
used to be different. They've been the same for a long time now, long
enough that I no longer recall how they used to differ. In making the leap
from string module to string methods, I suggest dropping the long names
altogether. There's no particular compatibility reason to keep them and
they're not really any more descriptive than their shorter siblings. It's
not like you'll be preserving backward compatibility for anyone's code by
having them. However, if you release this code to the larger public, then
you'll be stuck with both in perpetuity.

Skip



From fredrik@pythonware.com Fri Jun 11 08:06:58 1999
From: fredrik@pythonware.com (Fredrik Lundh)
Date: Fri, 11 Jun 1999 09:06:58 +0200
Subject: [Python-Dev] String methods... finally
References: <Pine.WNT.4.05.9906102157440.173-100000@david.ski.org> <199906110516.BAA02520@eric.cnri.reston.va.us>
Message-ID: <008701beb3da$5e2db9d0$f29b12c2@pythonware.com>

Guido wrote:
Note that this is not as powerful as string.join(); the latter works
on any sequence, not just on lists and tuples. (Though that may not
be a big deal.)

I also find it slightly objectionable that this is a general list
method but only works if the list contains only strings; Dave Ascher's
generalization to reduce() is cute but strikes me as more general
than useful, and the name will forever present a mystery to most
newcomers.

Perhaps join() ought to be a built-in function?
come to think of it, the last design I came up with (inspired
by a mail from you which I cannot find right now), was this:

def join(sequence, sep=None):
    # built-in
    if not sequence:
        return ""
    return sequence[0].__join__(sequence, sep)

string.join => join

and __join__ methods in the unicode and string classes.
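A rough runnable sketch of that dispatch (the Str subclass is illustrative, and __join__ is the hypothetical protocol under discussion, not an actual Python hook):

```python
def join(sequence, sep=None):
    # proposed built-in: defer to the first element's __join__ hook
    if not sequence:
        return ""
    return sequence[0].__join__(sequence, sep)

class Str(str):
    # what a string type's __join__ method might look like
    def __join__(self, sequence, sep):
        return (sep or "").join(sequence)

parts = [Str("a"), "b", "c"]
```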

Guido?

</F>



From fredrik@pythonware.com Fri Jun 11 08:03:19 1999
From: fredrik@pythonware.com (Fredrik Lundh)
Date: Fri, 11 Jun 1999 09:03:19 +0200
Subject: [Python-Dev] String methods... finally
References: <14176.19210.146525.172100@anthem.cnri.reston.va.us>
Message-ID: <008601beb3da$5e0a7a60$f29b12c2@pythonware.com>

Barry wrote:
Some of the string module functions don't make sense as
string methods, like join, and others never had a C
implementation so weren't added, like center.
fwiw, the Unicode module available from pythonware.com
implements them all, and more importantly, it can be
compiled for either 8-bit or 16-bit characters...

join is a special problem; IIRC, Guido came up with what
I at that time thought was an excellent solution, but I
don't recall what it was right now ;-)

anyway, maybe we should start by figuring out what methods
we really want in there, and then figure out whether we
should have one or two independent string implementations
in the core...

</F>



From mal@lemburg.com Fri Jun 11 09:15:33 1999
From: mal@lemburg.com (M.-A. Lemburg)
Date: Fri, 11 Jun 1999 10:15:33 +0200
Subject: [Python-Dev] String methods... finally
References: <Pine.WNT.4.05.9906102220450.173-100000@david.ski.org>
Message-ID: <3760C5A5.43FB1658@lemburg.com>

David Ascher wrote:
On Fri, 11 Jun 1999, Guido van Rossum wrote:

Perhaps join() ought to be a built-in function?
Would it do the moral equivalent of a reduce(operator.add, ...) or of a
string.join?

I think it should do the former (otherwise something about 'string' should
be in the name), and as a consequence I think it shouldn't have the
default whitespace spacer.
AFAIK, Guido himself proposed something like this on c.l.p a
few months ago. I think something like the following written
in C and optimized for lists of strings might be useful:

def join(sequence,sep=None):
    x = sequence[0]
    if sep:
        for y in sequence[1:]:
            x = x + sep + y
    else:
        for y in sequence[1:]:
            x = x + y
    return x

>>> join(('a','b'))
'ab'
>>> join(('a','b'),' ')
'a b'
>>> join((1,2,3),3)
12
>>> join(((1,2),(3,)))
(1, 2, 3)

Also, while we're at string functions/methods. Some of the stuff
in mxTextTools (see Python Pages link below) might be of general
use as well, e.g. splitat(), splitlines() and charsplit().

--
Marc-Andre Lemburg
______________________________________________________________________
Y2000: 203 days left
Business: http://www.lemburg.com/
Python Pages: http://www.lemburg.com/python/



From guido@CNRI.Reston.VA.US Fri Jun 11 13:31:51 1999
From: guido@CNRI.Reston.VA.US (Guido van Rossum)
Date: Fri, 11 Jun 1999 08:31:51 -0400
Subject: [Python-Dev] Aside: apply syntax
In-Reply-To: Your message of "Thu, 10 Jun 1999 22:35:42 PDT."
<Pine.WNT.4.05.9906102232210.173-100000@david.ski.org>
References: <Pine.WNT.4.05.9906102232210.173-100000@david.ski.org>
Message-ID: <199906111231.IAA02774@eric.cnri.reston.va.us>
I've seen repeatedly on c.l.p a suggestion to modify the syntax & the core
to allow * and ** in function calls, so that:

class SubFoo(Foo):
    def __init__(self, *args, **kw):
        apply(Foo, (self, ) + args, kw)
    ...

could be written

class SubFoo(Foo):
    def __init__(self, *args, **kw):
        Foo(self, *args, **kw)
    ...

I really like this notion, but before I poke around trying to see if it's
doable, I'd like to get feedback on whether y'all think it's a good idea
or not. And if someone else wants to do it, feel free -- I am of course
swamped, and I won't get to it until after rich comparisons.

FWIW, apply() is one of my least favorite builtins, aesthetically
speaking.
I like the idea, but it would mean a major reworking of the grammar
and the parser. Can I persuade you to keep this on ice until 2.0?

--Guido van Rossum (home page: http://www.python.org/~guido/)



From fredrik@pythonware.com Fri Jun 11 13:54:30 1999
From: fredrik@pythonware.com (Fredrik Lundh)
Date: Fri, 11 Jun 1999 14:54:30 +0200
Subject: [Python-Dev] String methods... finally
References: <14176.19210.146525.172100@anthem.cnri.reston.va.us>
Message-ID: <004601beb409$8c535750$f29b12c2@pythonware.com>
Two new methods startswith and endswith act like their Java cousins.
is it just me, or do those method names suck?

begin? starts_with? startsWith? (ouch)
has_prefix?

</F>



From arw@ifu.net Fri Jun 11 14:05:17 1999
From: arw@ifu.net (Aaron Watters)
Date: Fri, 11 Jun 1999 09:05:17 -0400
Subject: [Python-Dev] stackable ints [stupid idea (ignore) :v]
References: <199906110342.XAA07977@python.org>
Message-ID: <3761098D.A56F58A8@ifu.net>

From: "Tim Peters" <tim_one@email.msn.com>
Jumping in to opine that mixing tag/type bits with native pointers is a
Really Bad Idea. Put the bits on the low end and word-addressed machines
are screwed. Put the bits on the high end and you've made severe
assumptions about how the platform parcels out address space. In any case
you're stuck with ugly macros everywhere.
Agreed. Never ever mess with pointers. This mistake has been made over
and over again by each new generation of computer hardware and software
and it's still a mistake.

I thought it would be good to be able to do the following loop with Numeric
arrays

for x in array1:
    array2[x] = array3[x] + array4[x]

without any memory management being involved. Right now, I think the
for loop has to continually dynamically allocate each new x and
intermediate sum (and immediately deallocate them), and that makes the
loop piteously slow. The idea of replacing PyObject *'s with a struct
[typedescr *, data *] was a space/time tradeoff to speed up operations
like the above by eliminating any need for mallocs or other memory
management. I really can't say whether it'd be worth it or not without
some sort of real testing. Just a thought.

-- Aaron Watters




From mal@lemburg.com Fri Jun 11 14:11:20 1999
From: mal@lemburg.com (M.-A. Lemburg)
Date: Fri, 11 Jun 1999 15:11:20 +0200
Subject: [Python-Dev] String methods... finally
References: <14176.19210.146525.172100@anthem.cnri.reston.va.us> <004601beb409$8c535750$f29b12c2@pythonware.com>
Message-ID: <37610AF8.3EC610FD@lemburg.com>

Fredrik Lundh wrote:
Two new methods startswith and endswith act like their Java cousins.
is it just me, or do those method names suck?

begin? starts_with? startsWith? (ouch)
has_prefix?
In mxTextTools I used the names prefix() and suffix() for much
the same thing except that those functions accept a list of
strings and return the (first) matching string instead of
just 1 or 0. Details are available at:

http://starship.skyport.net/~lemburg/mxTextTools.html

--
Marc-Andre Lemburg
______________________________________________________________________
Y2000: 203 days left
Business: http://www.lemburg.com/
Python Pages: http://www.lemburg.com/python/



From guido@CNRI.Reston.VA.US Fri Jun 11 14:58:10 1999
From: guido@CNRI.Reston.VA.US (Guido van Rossum)
Date: Fri, 11 Jun 1999 09:58:10 -0400
Subject: [Python-Dev] String methods... finally
In-Reply-To: Your message of "Fri, 11 Jun 1999 15:11:20 +0200."
<37610AF8.3EC610FD@lemburg.com>
References: <14176.19210.146525.172100@anthem.cnri.reston.va.us> <004601beb409$8c535750$f29b12c2@pythonware.com>
<37610AF8.3EC610FD@lemburg.com>
Message-ID: <199906111358.JAA02836@eric.cnri.reston.va.us>
Two new methods startswith and endswith act like their Java cousins.
is it just me, or do those method names suck?
It's just you.
begin? starts_with? startsWith? (ouch)
has_prefix?
Those are all painful to type, except "begin", which isn't expressive.
In mxTextTools I used the names prefix() and suffix() for much
The problem with those is that it's arbitrary (==> harder to remember)
whether A.prefix(B) means that A is a prefix of B or that A has B for
a prefix.

--Guido van Rossum (home page: http://www.python.org/~guido/)



From mal@lemburg.com Fri Jun 11 15:55:14 1999
From: mal@lemburg.com (M.-A. Lemburg)
Date: Fri, 11 Jun 1999 16:55:14 +0200
Subject: [Python-Dev] String methods... finally
References: <14176.19210.146525.172100@anthem.cnri.reston.va.us> <004601beb409$8c535750$f29b12c2@pythonware.com>
<37610AF8.3EC610FD@lemburg.com> <199906111358.JAA02836@eric.cnri.reston.va.us>
Message-ID: <37612352.227FCA4B@lemburg.com>

Guido van Rossum wrote:
Two new methods startswith and endswith act like their Java cousins.
is it just me, or do those method names suck?
It's just you.
begin? starts_with? startsWith? (ouch)
has_prefix?
Those are all painful to type, except "begin", which isn't expressive.
In mxTextTools I used the names prefix() and suffix() for much
The problem with those is that it's arbitrary (==> harder to remember)
whether A.prefix(B) means that A is a prefix of B or that A has B for
a prefix.
True. These are functions in mxTextTools and take a sequence
as second argument, so the order is clear there... has_prefix()
and has_suffix() would probably be appropriate as methods (you don't
type them that often ;-)

--
Marc-Andre Lemburg
______________________________________________________________________
Y2000: 203 days left
Business: http://www.lemburg.com/
Python Pages: http://www.lemburg.com/python/



From jack@oratrix.nl Fri Jun 11 16:55:36 1999
From: jack@oratrix.nl (Jack Jansen)
Date: Fri, 11 Jun 1999 17:55:36 +0200
Subject: [Python-Dev] Aside: apply syntax
In-Reply-To: Message by Guido van Rossum <guido@cnri.reston.va.us> ,
Fri, 11 Jun 1999 08:31:51 -0400 , <199906111231.IAA02774@eric.cnri.reston.va.us>
Message-ID: <19990611155536.944FA303120@snelboot.oratrix.nl>
class SubFoo(Foo):
    def __init__(self, *args, **kw):
        Foo(self, *args, **kw)
    ...
Guido:
I like the idea, but it would mean a major reworking of the grammar
and the parser. Can I persuade you to keep this on ice until 2.0?
What exactly would the semantics be? While I hate the apply() hoops you have
to jump through nowadays to get this behaviour, I don't fully understand how
this would work in general (as opposed to in this case). For instance, would
    Foo(self, 12, *args, **kw)
be allowed? And
    Foo(self, *args, x, **kw)
?
--
Jack Jansen | ++++ stop the execution of Mumia Abu-Jamal ++++
Jack.Jansen@oratrix.com | ++++ if you agree copy these lines to your sig ++++
www.oratrix.nl/~jack | see http://www.xs4all.nl/~tank/spg-l/sigaction.htm




From da@ski.org Fri Jun 11 17:57:37 1999
From: da@ski.org (David Ascher)
Date: Fri, 11 Jun 1999 09:57:37 -0700 (Pacific Daylight Time)
Subject: [Python-Dev] Aside: apply syntax
In-Reply-To: <199906111231.IAA02774@eric.cnri.reston.va.us>
Message-ID: <Pine.WNT.4.04.9906110957150.289-100000@rigoletto.ski.org>
On Fri, 11 Jun 1999, Guido van Rossum wrote:

I've seen repeatedly on c.l.p a suggestion to modify the syntax & the core
to allow * and ** in function calls, so that:
I like the idea, but it would mean a major reworking of the grammar
and the parser. Can I persuade you to keep this on ice until 2.0?
Sure. That was hard. =)





From da@ski.org Fri Jun 11 18:02:49 1999
From: da@ski.org (David Ascher)
Date: Fri, 11 Jun 1999 10:02:49 -0700 (Pacific Daylight Time)
Subject: [Python-Dev] Aside: apply syntax
In-Reply-To: <19990611155536.944FA303120@snelboot.oratrix.nl>
Message-ID: <Pine.WNT.4.04.9906111000340.289-100000@rigoletto.ski.org>
On Fri, 11 Jun 1999, Jack Jansen wrote:

What exactly would the semantics be? While I hate the apply() hoops you have
to jump through nowadays to get this behaviour, I don't fully understand how
this would work in general (as opposed to in this case). For instance, would
    Foo(self, 12, *args, **kw)
be allowed? And
    Foo(self, *args, x, **kw)
Following the rule used for argument processing now, if it's unambiguous,
it should be allowed, and not otherwise. So, IMHO, the above two should
be allowed, and I suspect

Foo.__init__(self, *args, *args2)

could be too, but

Foo.__init__(self, **kw, **kw2)

should not, as dictionary addition is not allowed.

However, I could live with the more restricted version as well.
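For what it's worth, the cases in question can be checked directly against the generalized call syntax Python eventually adopted (much later, so this is retrospective, not part of the proposal): positional arguments after *args, repeated *, and even repeated ** all became legal in calls, with repeated ** merely requiring distinct keys. A sketch with a recording function:

```python
def f(*args, **kw):
    # record exactly how the arguments arrive
    return args, kw

extra, kwds = (1, 2), {'a': 3}
r1 = f(0, *extra, **kwds)        # like Foo(self, 12, *args, **kw)
r2 = f(0, *extra, 9, **kwds)     # positional after *args
r3 = f(*extra, *extra)           # repeated *
r4 = f(**kwds, **{'b': 4})       # repeated ** (distinct keys required)
```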

--david



From bwarsaw@python.org Fri Jun 11 18:17:20 1999
From: bwarsaw@python.org (Barry A. Warsaw)
Date: Fri, 11 Jun 1999 13:17:20 -0400 (EDT)
Subject: [Python-Dev] String methods... finally
References: <14176.19210.146525.172100@anthem.cnri.reston.va.us>
<000b01beb3bb$29ccdaa0$329e2299@tim>
Message-ID: <14177.17568.637272.328126@anthem.cnri.reston.va.us>
"TP" == Tim Peters <tim_one@email.msn.com> writes:
Two new methods startswith and endswith act like their Java
cousins.
TP> Barry, suggest that both of these grow optional start and end
TP> slice indices.

'Course it'll make the Java implementations of these extra args a
little more work. Right now they just forward off to the underlying
String methods. No biggie though.

I've got new implementations to check in -- let me add a few new tests
to cover 'em and watch your checkin emails.

-Barry


From guido@CNRI.Reston.VA.US Fri Jun 11 18:20:57 1999
From: guido@CNRI.Reston.VA.US (Guido van Rossum)
Date: Fri, 11 Jun 1999 13:20:57 -0400
Subject: [Python-Dev] String methods... finally
In-Reply-To: Your message of "Fri, 11 Jun 1999 13:17:20 EDT."
<14177.17568.637272.328126@anthem.cnri.reston.va.us>
References: <14176.19210.146525.172100@anthem.cnri.reston.va.us> <000b01beb3bb$29ccdaa0$329e2299@tim>
<14177.17568.637272.328126@anthem.cnri.reston.va.us>
Message-ID: <199906111720.NAA03746@eric.cnri.reston.va.us>
From: "Barry A. Warsaw" <bwarsaw@cnri.reston.va.us>

'Course it'll make the Java implementations of these extra args a
little more work. Right now they just forward off to the underlying
String methods. No biggie though.
Which reminds me -- are you tracking this in JPython too?

--Guido van Rossum (home page: http://www.python.org/~guido/)


From bwarsaw@cnri.reston.va.us (Barry A. Warsaw) Fri Jun 11 18:39:41 1999
From: bwarsaw@cnri.reston.va.us (Barry A. Warsaw) (Barry A. Warsaw)
Date: Fri, 11 Jun 1999 13:39:41 -0400 (EDT)
Subject: [Python-Dev] String methods... finally
References: <14176.19210.146525.172100@anthem.cnri.reston.va.us>
<000b01beb3bb$29ccdaa0$329e2299@tim>
<14177.17568.637272.328126@anthem.cnri.reston.va.us>
<199906111720.NAA03746@eric.cnri.reston.va.us>
Message-ID: <14177.18909.980174.55751@anthem.cnri.reston.va.us>
"Guido" == Guido van Rossum <guido@cnri.reston.va.us> writes:
Guido> Which reminds me -- are you tracking this in JPython too?

That's definitely my plan.


From bwarsaw@cnri.reston.va.us (Barry A. Warsaw) Fri Jun 11 18:43:35 1999
From: bwarsaw@cnri.reston.va.us (Barry A. Warsaw) (Barry A. Warsaw)
Date: Fri, 11 Jun 1999 13:43:35 -0400 (EDT)
Subject: [Python-Dev] String methods... finally
References: <199906110516.BAA02520@eric.cnri.reston.va.us>
<Pine.WNT.4.05.9906102220450.173-100000@david.ski.org>
<14176.45761.801671.880774@cm-24-29-94-19.nycap.rr.com>
Message-ID: <14177.19143.463951.778491@anthem.cnri.reston.va.us>
"SM" == Skip Montanaro <skip@mojam.com> writes:
SM> Oh, one other thing I forgot. Split (join) and splitfields
SM> (joinfields) used to be different. They've been the same for
SM> a long time now, long enough that I no longer recall how they
SM> used to differ.

I think it was only in the number of arguments they'd accept (at least
that's what's implied by the module docos).

SM> In making the leap from string module to
SM> string methods, I suggest dropping the long names altogether.

I agree. Thinking about it, I'm also inclined to not include
startswith and endswith in the string module.

-Barry


From da@ski.org Fri Jun 11 18:42:59 1999
From: da@ski.org (David Ascher)
Date: Fri, 11 Jun 1999 10:42:59 -0700 (Pacific Daylight Time)
Subject: [Python-Dev] stackable ints [stupid idea (ignore) :v]
In-Reply-To: <3761098D.A56F58A8@ifu.net>
Message-ID: <Pine.WNT.4.04.9906111040070.289-100000@rigoletto.ski.org>
On Fri, 11 Jun 1999, Aaron Watters wrote:

I thought it would be good to be able to do the following loop with Numeric
arrays

for x in array1:
    array2[x] = array3[x] + array4[x]

without any memory management being involved. Right now, I think the
FYI, I think it should be done by writing:

array2[array1] = array3[array1] + array4[array1]

and doing "the right thing" in NumPy. In other words, I don't think the
core needs to be involved.
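A pure-Python sketch of the fancy-indexing semantics David means (NumPy performs the equivalent in C with no per-element Python objects; take and put here are illustrative stand-ins, not NumPy's API):

```python
def take(a, idx):
    # the read half of a[idx] for a list and a list of indices
    return [a[i] for i in idx]

def put(a, idx, values):
    # the write half: a[idx] = values
    for i, v in zip(idx, values):
        a[i] = v

array1 = [0, 2]
array3 = [10, 11, 12]
array4 = [1, 1, 1]
array2 = [0, 0, 0]
# array2[array1] = array3[array1] + array4[array1]
put(array2, array1,
    [x + y for x, y in zip(take(array3, array1), take(array4, array1))])
```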

--david

PS: I'm in the process of making the NumPy array objects ExtensionClasses,
which will make the above much easier to do.



From bwarsaw@cnri.reston.va.us (Barry A. Warsaw) Fri Jun 11 18:58:36 1999
From: bwarsaw@cnri.reston.va.us (Barry A. Warsaw) (Barry A. Warsaw)
Date: Fri, 11 Jun 1999 13:58:36 -0400 (EDT)
Subject: [Python-Dev] String methods... finally
References: <14176.19210.146525.172100@anthem.cnri.reston.va.us>
<004601beb409$8c535750$f29b12c2@pythonware.com>
Message-ID: <14177.20044.69731.219173@anthem.cnri.reston.va.us>
"FL" == Fredrik Lundh <fredrik@pythonware.com> writes:
Two new methods startswith and endswith act like their Java
cousins.
FL> is it just me, or do those method names suck?

FL> begin? starts_with? startsWith? (ouch)
FL> has_prefix?

The inspiration was Java string objects, while trying to remain as
Pythonic as possible (no mixed case). startswith and endswith don't
seem as bad as issubclass to me :)

-Barry


From bwarsaw@python.org Fri Jun 11 19:06:22 1999
From: bwarsaw@python.org (Barry A. Warsaw)
Date: Fri, 11 Jun 1999 14:06:22 -0400 (EDT)
Subject: [Python-Dev] String methods... finally
References: <14176.19210.146525.172100@anthem.cnri.reston.va.us>
<008601beb3da$5e0a7a60$f29b12c2@pythonware.com>
Message-ID: <14177.20510.818041.110989@anthem.cnri.reston.va.us>
"FL" == Fredrik Lundh <fredrik@pythonware.com> writes:
FL> fwiw, the Unicode module available from pythonware.com
FL> implements them all, and more importantly, it can be
FL> compiled for either 8-bit or 16-bit characters...

Are these separately available? I don't see them under downloads.
Send me a URL, and if I can figure out how to get CVS to add files to
the branch :/, maybe I can check this in so people can play with it.

-Barry


From tismer@appliedbiometrics.com Fri Jun 11 19:17:46 1999
From: tismer@appliedbiometrics.com (Christian Tismer)
Date: Fri, 11 Jun 1999 20:17:46 +0200
Subject: [Python-Dev] stackable ints [stupid idea (ignore) :v]
References: <Pine.WNT.4.04.9906111040070.289-100000@rigoletto.ski.org>
Message-ID: <376152CA.B46A691E@appliedbiometrics.com>


David Ascher wrote:
On Fri, 11 Jun 1999, Aaron Watters wrote:

I thought it would be good to be able to do the following loop with Numeric
arrays

for x in array1:
    array2[x] = array3[x] + array4[x]

without any memory management being involved. Right now, I think the
FYI, I think it should be done by writing:

array2[array1] = array3[array1] + array4[array1]

and doing "the right thing" in NumPy. In other words, I don't think the
core needs to be involved.
For NumPy, this is very ok, dealing with arrays in
an array world.

Without trying to repeat myself, I'd like
to say that I still consider it an unsolved
problem, one worth solving or proving
unsolvable:

How to do simple things in an efficient
way with many tiny Python objects, without
writing an extension, without rethinking a
problem into APL-like style, and without
changing the language.

ciao - chris

--
Christian Tismer :^)
Applied Biometrics GmbH : Have a break! Take a ride on Python's
Kaiserin-Augusta-Allee 101 : *Starship* http://starship.python.net
10553 Berlin : PGP key -> http://wwwkeys.pgp.net
PGP Fingerprint E182 71C7 1A9D 66E9 9D15 D3CC D4D7 93E2 1FAE F6DF
we're tired of banana software - shipped green, ripens at home


From bwarsaw@cnri.reston.va.us (Barry A. Warsaw) Fri Jun 11 19:22:36 1999
From: bwarsaw@cnri.reston.va.us (Barry A. Warsaw) (Barry A. Warsaw)
Date: Fri, 11 Jun 1999 14:22:36 -0400 (EDT)
Subject: [Python-Dev] String methods... finally
References: <199906110516.BAA02520@eric.cnri.reston.va.us>
<Pine.WNT.4.05.9906102220450.173-100000@david.ski.org>
Message-ID: <14177.21484.126155.939932@anthem.cnri.reston.va.us>
Perhaps join() ought to be a built-in function?
IMO, builtin join ought to str()ify all the elements in the sequence,
concatenating the results. That seems an intuitive interpretation of
'join'ing a sequence. Here's my Python prototype:

def join(seq, sep=''):
    if not seq:
        return ''
    x = str(seq[0])
    for y in seq[1:]:
        x = x + sep + str(y)
    return x

Guido?

-Barry


From da@ski.org Fri Jun 11 19:24:34 1999
From: da@ski.org (David Ascher)
Date: Fri, 11 Jun 1999 11:24:34 -0700 (Pacific Daylight Time)
Subject: [Python-Dev] String methods... finally
In-Reply-To: <14177.21484.126155.939932@anthem.cnri.reston.va.us>
Message-ID: <Pine.WNT.4.04.9906111123020.289-100000@rigoletto.ski.org>
On Fri, 11 Jun 1999, Barry A. Warsaw wrote:

IMO, builtin join ought to str()ify all the elements in the sequence,
concatenating the results. That seems an intuitive interpretation of
'join'ing a sequence. Here's my Python prototype:
I don't get it -- why?

I'd expect join(((1,2,3), (4,5,6))) to yield (1,2,3,4,5,6), not anything
involving strings.

--david



From bwarsaw@python.org Fri Jun 11 19:26:48 1999
From: bwarsaw@python.org (Barry A. Warsaw)
Date: Fri, 11 Jun 1999 14:26:48 -0400 (EDT)
Subject: [Python-Dev] String methods... finally
References: <14176.19210.146525.172100@anthem.cnri.reston.va.us>
<14176.36294.902853.594510@cm-24-29-94-19.nycap.rr.com>
Message-ID: <14177.21736.100540.221487@anthem.cnri.reston.va.us>
"SM" == Skip Montanaro <skip@mojam.com> writes:
SM> I see the following functions in string.py that could
SM> reasonably be methodized:

SM> ljust, rjust, center, expandtabs, capwords

Also zfill.

What do you think, are these important enough to add? Maybe we can
just drop in /F's implementation for these.

-Barry


From bwarsaw@cnri.reston.va.us (Barry A. Warsaw) Fri Jun 11 19:34:08 1999
From: bwarsaw@cnri.reston.va.us (Barry A. Warsaw)
Date: Fri, 11 Jun 1999 14:34:08 -0400 (EDT)
Subject: [Python-Dev] String methods... finally
References: <14177.21484.126155.939932@anthem.cnri.reston.va.us>
<Pine.WNT.4.04.9906111123020.289-100000@rigoletto.ski.org>
Message-ID: <14177.22176.328185.872134@anthem.cnri.reston.va.us>
"DA" == David Ascher <da@ski.org> writes:
DA> On Fri, 11 Jun 1999, Barry A. Warsaw wrote:
IMO, builtin join ought to str()ify all the elements in the
sequence, concatenating the results. That seems an intuitive
interpretation of 'join'ing a sequence. Here's my Python
prototype:
DA> I don't get it -- why?

DA> I'd expect join(((1,2,3), (4,5,6))) to yield (1,2,3,4,5,6),
DA> not anything involving strings.

Oh, just because I think it might be useful, and would provide something
that isn't easily provided with other constructs.

Without those semantics join(((1,2,3), (4,5,6))) isn't much different
than (1,2,3) + (4,5,6), or reduce(operator.add, ((1,2,3), (4,5,6))) as
you point out.

Since those latter two are easy enough to come up with, while str()ing
the elements would require painful lambdas, I figured I'd make the new
builtin do something new.

-Barry
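[The "painful lambdas" alternative Barry alludes to, sketched with reduce and map; str_join is a hypothetical name, and reduce is imported from functools in today's Python rather than being a builtin:]

```python
from functools import reduce  # a builtin in 1999-era Python

def str_join(seq, sep=''):
    if not seq:
        return ''
    # map(str, ...) handles the coercion; the lambda threads the separator
    return reduce(lambda a, b: a + sep + b, map(str, seq))
```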


From bwarsaw@cnri.reston.va.us (Barry A. Warsaw) Fri Jun 11 19:36:54 1999
From: bwarsaw@cnri.reston.va.us (Barry A. Warsaw)
Date: Fri, 11 Jun 1999 14:36:54 -0400 (EDT)
Subject: [Python-Dev] String methods... finally
References: <14176.19210.146525.172100@anthem.cnri.reston.va.us>
<14176.36294.902853.594510@cm-24-29-94-19.nycap.rr.com>
<14177.21736.100540.221487@anthem.cnri.reston.va.us>
Message-ID: <14177.22342.320993.969742@anthem.cnri.reston.va.us>

One other thing to think about. Where should these new methods be
documented? I suppose we should reword the appropriate entries in
modules-string and move them to typesseq-strings.

What do you think, Fred?

-Barry


From da@ski.org Fri Jun 11 19:36:32 1999
From: da@ski.org (David Ascher)
Date: Fri, 11 Jun 1999 11:36:32 -0700 (Pacific Daylight Time)
Subject: [Python-Dev] String methods... finally
In-Reply-To: <14177.22176.328185.872134@anthem.cnri.reston.va.us>
Message-ID: <Pine.WNT.4.04.9906111134000.289-100000@rigoletto.ski.org>

On Fri, 11 Jun 1999, Barry A. Warsaw wrote:

Barry:
IMO, builtin join ought to str()ify all the elements in the
sequence, concatenating the results.
Me:
I don't get it -- why?
Barry:
Oh, just because I think it might useful, and would provide something
that isn't easily provided with other constructs.
I do map(str, ...) all the time.

My real concern is that there is nothing about the word 'join' which
implies string conversion. Either call it joinstrings or don't do the
conversion, I say.

--david



From bwarsaw@python.org Fri Jun 11 19:42:27 1999
From: bwarsaw@python.org (Barry A. Warsaw)
Date: Fri, 11 Jun 1999 14:42:27 -0400 (EDT)
Subject: [Python-Dev] String methods... finally
References: <14177.22176.328185.872134@anthem.cnri.reston.va.us>
<Pine.WNT.4.04.9906111134000.289-100000@rigoletto.ski.org>
Message-ID: <14177.22675.716917.331314@anthem.cnri.reston.va.us>
"DA" == David Ascher <da@ski.org> writes:
DA> My real concern is that there is nothing about the word 'join'
DA> which implies string conversion. Either call it joinstrings
DA> or don't do the conversion, I say.

Can you say mapconcat() ? :)

Or instead of join, just call it concat?

-Barry


From da@ski.org Fri Jun 11 19:46:19 1999
From: da@ski.org (David Ascher)
Date: Fri, 11 Jun 1999 11:46:19 -0700 (Pacific Daylight Time)
Subject: [Python-Dev] String methods... finally
In-Reply-To: <14177.22675.716917.331314@anthem.cnri.reston.va.us>
Message-ID: <Pine.WNT.4.04.9906111145120.289-100000@rigoletto.ski.org>
On Fri, 11 Jun 1999, Barry A. Warsaw wrote:

"DA" == David Ascher <da@ski.org> writes:
DA> My real concern is that there is nothing about the word 'join'
DA> which implies string conversion. Either call it joinstrings
DA> or don't do the conversion, I say.

Can you say mapconcat() ? :)

Or instead of join, just call it concat?
Again, no. Concatenating sequences is what I think the + operator does. I
think you need the letters S, T, and R in there... But I'm still not
convinced of its utility.





From guido@CNRI.Reston.VA.US Fri Jun 11 19:51:18 1999
From: guido@CNRI.Reston.VA.US (Guido van Rossum)
Date: Fri, 11 Jun 1999 14:51:18 -0400
Subject: [Python-Dev] join()
Message-ID: <199906111851.OAA04105@eric.cnri.reston.va.us>

Given the heat in this discussion, I'm not sure if I endorse *any* of
the proposals so far any more...

How would Java do this? A static function in the String class,
probably. The Python equivalent is... A function in the string
module. So maybe string.join() it remains.

--Guido van Rossum (home page: http://www.python.org/~guido/)



From bwarsaw@cnri.reston.va.us (Barry A. Warsaw) Fri Jun 11 20:08:11 1999
From: bwarsaw@cnri.reston.va.us (Barry A. Warsaw)
Date: Fri, 11 Jun 1999 15:08:11 -0400 (EDT)
Subject: [Python-Dev] join()
References: <199906111851.OAA04105@eric.cnri.reston.va.us>
Message-ID: <14177.24219.94236.485421@anthem.cnri.reston.va.us>
"Guido" == Guido van Rossum <guido@cnri.reston.va.us> writes:
Guido> Given the heat in this discussion, I'm not sure if I
Guido> endorse *any* of the proposals so far any more...

Oh I dunno. David and I aren't throwing rocks at each other yet :)

Guido> How would Java do this? A static function in the String
Guido> class, probably. The Python equivalent is... A function
Guido> in the string module. So maybe string.join() it remains.

The only reason for making it a builtin would be to avoid pulling in
all of string just to get join. But I guess we need to get some more
experience using the methods before we know whether this is a real
problem or not.

as-good-as-a-from-string-import-join-and-easier-to-implement-ly y'rs,
-Barry


From skip@mojam.com (Skip Montanaro) Fri Jun 11 20:38:33 1999
From: skip@mojam.com (Skip Montanaro)
Date: Fri, 11 Jun 1999 15:38:33 -0400 (EDT)
Subject: [Python-Dev] String methods... finally
In-Reply-To: <14177.21484.126155.939932@anthem.cnri.reston.va.us>
References: <199906110516.BAA02520@eric.cnri.reston.va.us>
<Pine.WNT.4.05.9906102220450.173-100000@david.ski.org>
<14177.21484.126155.939932@anthem.cnri.reston.va.us>
Message-ID: <14177.25698.40807.786489@cm-24-29-94-19.nycap.rr.com>

Barry> IMO, builtin join ought to str()ify all the elements in the
Barry> sequence, concatenating the results. That seems an intuitive
Barry> interpretation of 'join'ing a sequence.

Any reason why join should be a builtin and not a method available just to
sequences? Would there be some valid interpretation of

join( {'a': 1} )
join( 1 )

? If not, I vote for method-hood, not builtin-hood. Seems like you'd avoid
some confusion (and some griping by Graham Matthews about how unpure it is
;-).

Skip


From skip@mojam.com (Skip Montanaro) Fri Jun 11 20:42:11 1999
From: skip@mojam.com (Skip Montanaro)
Date: Fri, 11 Jun 1999 15:42:11 -0400 (EDT)
Subject: [Python-Dev] join()
In-Reply-To: <14177.24219.94236.485421@anthem.cnri.reston.va.us>
References: <199906111851.OAA04105@eric.cnri.reston.va.us>
<14177.24219.94236.485421@anthem.cnri.reston.va.us>
Message-ID: <14177.26195.71364.77248@cm-24-29-94-19.nycap.rr.com>

BAW> The only reason for making it a builtin would be to avoid pulling
BAW> in all of string just to get join.

I still don't understand the motivation for making it a builtin instead of a
method of the types it operates on. Making it a builtin seems very
un-object-oriented to me.

Skip


From guido@CNRI.Reston.VA.US Fri Jun 11 20:44:28 1999
From: guido@CNRI.Reston.VA.US (Guido van Rossum)
Date: Fri, 11 Jun 1999 15:44:28 -0400
Subject: [Python-Dev] join()
In-Reply-To: Your message of "Fri, 11 Jun 1999 15:42:11 EDT."
<14177.26195.71364.77248@cm-24-29-94-19.nycap.rr.com>
References: <199906111851.OAA04105@eric.cnri.reston.va.us> <14177.24219.94236.485421@anthem.cnri.reston.va.us>
<14177.26195.71364.77248@cm-24-29-94-19.nycap.rr.com>
Message-ID: <199906111944.PAA04277@eric.cnri.reston.va.us>
I still don't understand the motivation for making it a builtin instead of a
method of the types it operates on. Making it a builtin seems very
un-object-oriented to me.
Because if you make it a method, every sequence type needs to know
about joining strings. (This wouldn't be a problem in Smalltalk where
sequence types inherit this stuff from an abstract sequence class, but
in Python unfortunately that doesn't exist.)

--Guido van Rossum (home page: http://www.python.org/~guido/)


From da@ski.org Fri Jun 11 21:11:11 1999
From: da@ski.org (David Ascher)
Date: Fri, 11 Jun 1999 13:11:11 -0700 (Pacific Daylight Time)
Subject: [Python-Dev] join()
In-Reply-To: <199906111944.PAA04277@eric.cnri.reston.va.us>
Message-ID: <Pine.WNT.4.04.9906111247410.289-100000@rigoletto.ski.org>
On Fri, 11 Jun 1999, Guido van Rossum wrote:

I still don't understand the motivation for making it a builtin instead of a
method of the types it operates on. Making it a builtin seems very
un-object-oriented to me.
Because if you make it a method, every sequence type needs to know
about joining strings.
It still seems to me that we could do something like /F's proposal, where
sequences can define a join() method that is optimized to do what
string.join does when the first element is a string, by placing the class
method in an instance method of strings, since string joining clearly has
to involve at least one string.

Pseudocode:

class SequenceType:

    def join(self, separator=None):

        if hasattr(self[0], '__join__'):
            # covers all types which can be efficiently joined if homogeneous
            return self[0].__join__(self, separator)

        # for the rest:
        if separator is None:
            return reduce(operator.add, self)

        result = self[0]
        for element in self[1:]:
            result = result + separator + element
        return result

where the above would have to be done in abstract.c, with error handling,
etc. and with strings (regular and unicode) defining efficient __join__'s
as in:

class StringType:
    def join(self, separator):
        raise AttributeError, ...

    def __join__(self, sequence):
        return string.join(sequence)  # obviously not literally that =)

class UnicodeStringType:
    def __join__(self, sequence):
        return unicode.join(sequence)

(in C, of course).

Yes, it's strange to fake class methods with instance methods, but it's
been done before =). Yes, this means expanding what it means to "be a
sequence" -- is that impossible without breaking lots of code?

--david
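[A runnable toy version of the dispatch David sketches; SeqWithJoin and the __join__ hook are illustrative names, not an actual Python API, and the fallback mirrors his pseudocode:]

```python
import operator
from functools import reduce  # a builtin in 1999-era Python

class SeqWithJoin(list):
    # toy stand-in for a generic sequence type's join(), per David's sketch
    def join(self, separator=None):
        # defer to the first element's __join__ if it defines one
        if hasattr(self[0], '__join__'):
            return self[0].__join__(self, separator)
        # no separator: plain reduce with +
        if separator is None:
            return reduce(operator.add, self)
        # otherwise, thread the separator between elements
        result = self[0]
        for element in self[1:]:
            result = result + separator + element
        return result
```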





From gmcm@hypernet.com Fri Jun 11 22:30:10 1999
From: gmcm@hypernet.com (Gordon McMillan)
Date: Fri, 11 Jun 1999 16:30:10 -0500
Subject: [Python-Dev] String methods... finally
In-Reply-To: <Pine.WNT.4.04.9906111145120.289-100000@rigoletto.ski.org>
References: <14177.22675.716917.331314@anthem.cnri.reston.va.us>
Message-ID: <1282985631-84109501@hypernet.com>

David Ascher wrote:
Barry Warsaw wrote:
Or instead of join, just call it concat?
Again, no. Concatenating sequences is what I think the + operator
does. I think you need the letters S, T, and R in there... But I'm
still not convinced of its utility.
But then Q will feel left out, and since Q doesn't go anywhere
without U, pretty soon you'll have the whole damn alphabet in there.

I-draw-the-line-at-$-well-$-&-@-but-definitely-not-#-ly y'rs

- Gordon


From MHammond@skippinet.com.au Fri Jun 11 23:49:29 1999
From: MHammond@skippinet.com.au (Mark Hammond)
Date: Sat, 12 Jun 1999 08:49:29 +1000
Subject: [Python-Dev] String methods... finally
In-Reply-To: <14177.20510.818041.110989@anthem.cnri.reston.va.us>
Message-ID: <006801beb45c$aab5baa0$0801a8c0@bobcat>
Are these separately available? I don't see them under downloads.
Send me a URL, and if I can figure out how to get CVS to add files to
the branch :/, maybe I can check this in so people can play with it.
Fredrik and I have spoken about this. He will dust it off and integrate
some patches in the next few days. He will then send it to me to make sure
the patches I made for Windows CE all made it OK, then one of us will
integrate it with the branch and send it on...

Mark.



From tim_one@email.msn.com Sat Jun 12 01:56:03 1999
From: tim_one@email.msn.com (Tim Peters)
Date: Fri, 11 Jun 1999 20:56:03 -0400
Subject: [Python-Dev] String methods... finally
In-Reply-To: <14177.21736.100540.221487@anthem.cnri.reston.va.us>
Message-ID:

[Skip Montanaro]
I see the following functions in string.py that could
reasonably be methodized:

ljust, rjust, center, expandtabs, capwords

Also zfill.
[Barry A. Warsaw]
What do you think, are these important enough to add?
I think lack-of-surprise (gratuitous orthogonality <wink>) was the motive
here. If Guido could drop string functions in 2.0, which would he be happy
to forget? Give him a head start.

ljust and rjust were used often a long time ago, before the "%" sprintf-like
operator was introduced; don't think I've seen new code use them in years.

center was a nice convenience in the pre-HTML world, but probably never
speed-critical and easy to write yourself.

expandtabs is used frequently in IDLE and even pyclbr.py now. Curiously,
though, they almost never want the tab-expanded string, but rather its len.

capwords could become an absolute nightmare in a Unicode world <0.5 wink>.
Maybe we can just drop in /F's implementation for these.
Sounds like A Plan to me. Wouldn't mourn the passing of the first three.

and-i-even-cried-at-my-father's-funeral<wink>-ly y'rs - tim




From tim_one@email.msn.com Sat Jun 12 07:19:33 1999
From: tim_one@email.msn.com (Tim Peters)
Date: Sat, 12 Jun 1999 02:19:33 -0400
Subject: [Python-Dev] String methods... finally
In-Reply-To: <199906110140.VAA02180@eric.cnri.reston.va.us>
Message-ID:

[GvR]
(I sometimes wished I wasn't in the business of making releases. I've
asked for help with making essential patches to 1.5.2 available but
nobody volunteered... :-( )
It's kinda baffling "out here" -- checkin comments usually say what a patch
does, but rarely make a judgment about a patch's importance. Sorting thru
hundreds of patches without a clue is a pretty hopeless task.

Perhaps future checkins that the checker-inner feels are essential could be
commented as such in a machine-findable way?

an-ounce-of-foresight-is-worth-a-sheet-of-foreskin-or-something-like-that-ly
y'rs - tim




From tim_one@email.msn.com Sat Jun 12 07:19:37 1999
From: tim_one@email.msn.com (Tim Peters)
Date: Sat, 12 Jun 1999 02:19:37 -0400
Subject: [Python-Dev] stackable ints [stupid idea (ignore) :v]
In-Reply-To: <199906101411.KAA29962@eric.cnri.reston.va.us>
Message-ID:

[Aaron, describes a scheme where objects are represented by a fixed-size
(typecode, variant) pair, where if the typecode is e.g. INT or FLOAT the
variant is the value directly instead of a pointer to the value]

[Guido]
What you're describing is very close to what I recall I once read
about the runtime organization of Icon.
At the lowest level it's exactly what Icon does. It does *not* exempt ints
from Icon's flavor of dynamic memory management, but Icon doesn't use
refcounting -- it uses compacting mark-&-sweep across some 5 distinct
regions each with their own finer-grained policies (e.g., strings are
central to Icon and so it manages the string region a little differently;
and Icon coroutines save away pieces of the platform's C stack so need
*very* special treatment).

So:

1) There are no incref/decref expenses anywhere in Icon.

2) Because of compaction, all allocations cost the same and are dirt cheap:
just increment the appropriate region's "avail" pointer by the number of
bytes you need. If there aren't enough bytes, run GC and try again. If
there still aren't enough bytes, Icon usually shuts down (it's not good at
asking the OS for more memory! it carves up its initial memory in pretty
rigid ways, and relies on tricks like comparing storage addresses to speed
M&S and compaction -- those "regions" are in a fixed order relative to each
other, so new memory can't be tacked on to a region except at the low and
high ends).

3) All the expense is in finding and compacting live objects, so in an odd
literal sense cleaning up trash comes for free.

4) Icon has no finalizers, so it doesn't need to identify or preserve
trash -- compaction simply overwrites "the holes" where the trash used to
be.

Icon is nicely implemented, but it's a "self-contained universe" view of the
world and its memory approach makes life hard for the tiny handful of folks
who have *tried* to make it extendable via C. Icon is also purely
procedural -- no OO, no destructors, no resurrection.

Irony: one reason I picked up Python in '91 is that my int-fiddling code
was too slow in Icon! Even Python 0.9.0 ran int algorithms significantly
faster than the 10-years-refined Icon implementation of that time. Never
looked into why, but now that Aaron brought up the issue I find it very
surprising! Those algorithms had a huge rate of int trash creation, but
very few persistent objects, so Icon's M&S should have run like the wind.
And Icon's allocation is dirt-cheap (at least as fast as Python's fastest
special-purpose allocators), and didn't have any refcounting expenses
either.

There's an important lesson *somewhere* in that <wink>. Maybe it was the
fault of Icon's "goal-directed" expression evaluation, constantly asking
"did this int succeed or fail?", "did that add suceed or fail?", etc.
...
The Icon approach (i.e. yours) seems to require a complete rethinking
of all object implementations and all APIs at the C level -- perhaps
we could think about it for Python 2.0. Some ramifications:

- Uses more memory for highly shared objects (there are as many copies
of the type pointer as there are references).
Actually more than that in Icon: if the "variant" part is a pointer, the
first word of the block it points to is also a copy of the typecode (turns
out the redundancy speeds the GC).
- Thus, lists take double the memory assuming they reference objects
that also exist elsewhere. This affects the performance of slices
etc.

- On the other hand, a list of ints takes half the memory (given that
most of those ints are not shared).
Isn't this 2/3 rather than 1/2? I'm picturing a list element today as
essentially a pointer to a type object pointer + int (3 units in all), and a
type object pointer + int (2 units in all) "tomorrow". Throw in refcounts
too and the ratio likely gets closer to 1.
- *Homogeneous* lists (where all elements have the same type --
i.e. arrays) can be represented more efficiently by having only one
copy of the type pointer. This was an idea for ABC (whose type system
required all container types to be homogenous) that was never
implemented (because in practice the type check wasn't always applied,
and the top-level namespace used by the interactive command
interpreter violated all the rules).
Well, Python already has homogeneous int lists (array.array), and while they
save space they suffer in speed due to needing to wrap raw ints "in an
object" upon reference and unwrap them upon storage.
- Reference count manipulations could be done by a macro (or C++
behind-the-scenes magic using copy constructors and destructors) that
calls a function in the type object -- i.e. each object could decide
on its own reference counting implementation :-)
You don't need to switch representations to get that, though, right? That
is, I don't see anything stopping today's type objects from growing
__incref__ and __decref__ slots -- except for common sense <wink>.


An apparent ramification I don't see above that may actually be worth
something <wink>:

- In "i = j + k", the eval stack could contain the ints directly, instead of
pointers to the ints. So fetching the value of i takes two loads (get the
type pointer + the variant) from adjacent stack locations, instead of
today's load-the-pointer + follow-the-pointer (to some other part of
memory); similarly for fetching the value of j. Then the sum can be stored
*directly* into the stack too, without today's need for allocating and
wrapping it in "an int object" first.

Possibly happy variant: on top of the above, *don't* exempt ints from
refcounting. Let 'em incref and decref like everything else. Give them an
initial refcount of max_count/2, and in the exceedingly unlikely event a
decref on an int ever sees zero, the int "destructor" simply resets the
refcount to max_count/2 and is otherwise a nop.

semi-thinking-semi-aloud-ly y'rs - tim




From ping@lfw.org Sat Jun 12 09:05:06 1999
From: ping@lfw.org (Ka-Ping Yee)
Date: Sat, 12 Jun 1999 01:05:06 -0700 (PDT)
Subject: [Python-Dev] String methods... finally
In-Reply-To: <004601beb409$8c535750$f29b12c2@pythonware.com>
Message-ID: <Pine.LNX.3.93.990612005957.107B-100000@skuld.lfw.org>
On Fri, 11 Jun 1999, Fredrik Lundh wrote:
Two new methods startswith and endswith act like their Java cousins.
is it just me, or do those method names suck?

begin? starts_with? startsWith? (ouch)
has_prefix?
I'm quite happy with "startswith" and "endswith". I mean,
they're a bit long, i suppose, but i can't think of anything
better. You definitely want to avoid has_prefix, as that
compounds the has_key vs. hasattr issue.

x.startswith("foo")
x[:3] == "foo"

x.startswith(y)
x[:len(y)] == y

Hmm. I guess it doesn't save you much typing unless y is an
expression. But it's still a lot easier to read.



!ping
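[Ping's slice equivalences, spelled out as quick checks; starts_with and ends_with are hypothetical helper names, with the suffix slice written to stay correct for an empty y:]

```python
# slice-based equivalent of the proposed startswith method
def starts_with(x, y):
    return x[:len(y)] == y

# slice-based endswith; x[len(x) - len(y):] avoids the x[-0:] pitfall
# that a naive x[-len(y):] hits when y is empty
def ends_with(x, y):
    return x[len(x) - len(y):] == y
```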



From ping@lfw.org Sat Jun 12 09:12:38 1999
From: ping@lfw.org (Ka-Ping Yee)
Date: Sat, 12 Jun 1999 01:12:38 -0700 (PDT)
Subject: [Python-Dev] join()
In-Reply-To: <14177.26195.71364.77248@cm-24-29-94-19.nycap.rr.com>
Message-ID: <Pine.LNX.3.93.990612010808.107C-100000@skuld.lfw.org>
On Fri, 11 Jun 1999, Skip Montanaro wrote:

BAW> The only reason for making it a builtin would be to avoid pulling
BAW> in all of string just to get join.

I still don't understand the motivation for making it a builtin instead of a
method of the types it operates on. Making it a builtin seems very
un-object-oriented to me.
Builtin-hood makes it possible for one method to apply to
many types (or a heterogeneous list of things).

I think i'd support the

def join(list, sep=None):
    if sep is None:
        result = list[0]
        for item in list[1:]:
            result = result + item
    else:
        result = list[0]
        for item in list[1:]:
            result = result + sep + item
    return result

idea, basically a reduce(operator.add...) with an optional
separator -- *except* my main issue would be to make sure that
the actual implementation optimizes the case of joining a
list of strings. string.join() currently seems like the last
refuge for those wanting to avoid O(n^2) time when assembling
many small pieces in string buffers, and i don't want to
see it go away.



!ping
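[The quadratic-vs-linear contrast Ping is protecting against, sketched in today's spelling (1.5-era code would write string.join(pieces, '') for the fast path):]

```python
pieces = ['ab'] * 1000

# repeated concatenation re-copies the accumulated string each trip: O(n^2)
slow = ''
for p in pieces:
    slow = slow + p

# join sizes the result once and fills it in a single pass: O(n)
fast = ''.join(pieces)
```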



From fredrik@pythonware.com Sat Jun 12 10:13:59 1999
From: fredrik@pythonware.com (Fredrik Lundh)
Date: Sat, 12 Jun 1999 11:13:59 +0200
Subject: [Python-Dev] String methods... finally
References: <14176.19210.146525.172100@anthem.cnri.reston.va.us><008601beb3da$5e0a7a60$f29b12c2@pythonware.com> <14177.20510.818041.110989@anthem.cnri.reston.va.us>
Message-ID: <00c301beb4b3$e84e3de0$f29b12c2@pythonware.com>
FL> fwiw, the Unicode module available from pythonware.com
FL> implements them all, and more importantly, it can be com-
FL> piled for either 8-bit or 16-bit characters...

Are these separately available? I don't see them under downloads.
Send me a URL, and if I can figure out how to get CVS to add files to
the branch :/, maybe I can check this in so people can play with it.
it's under:
http://www.pythonware.com/madscientist/index.htm

but I've teamed up with Mark H. to update the stuff
a bit, test it with his CE port, and produce a set of
patches. I'm working on this in this very moment.

btw, as for the "missing methods in the string type"
issue, my suggestion is to merge the source code into
a unified string module, which is compiled twice (or
three times, the day we find that we need a 32-bit
string type). don't waste any time cutting and
pasting until we've sorted that one out...

</F>



From fredrik@pythonware.com Sat Jun 12 10:31:08 1999
From: fredrik@pythonware.com (Fredrik Lundh)
Date: Sat, 12 Jun 1999 11:31:08 +0200
Subject: [Python-Dev] String methods... finally
References: <000401beb46e$58b965a0$5ba22299@tim>
Message-ID: <00fb01beb4b6$4df59420$f29b12c2@pythonware.com>
expandtabs is used frequently in IDLE and even pyclbr.py now. Curiously,
though, they almost never want the tab-expanded string, but rather its len.
looked in stropmodule.c lately:

static PyObject *
strop_expandtabs(self, args)
...
/* First pass: determine size of output string */
...
/* Second pass: create output string and fill it */
...

(btw, I originally wrote that code for pythonworks ;-)

how about an "expandtabslength" method?

or maybe we should add lazy evaluation of strings!

</F>



From fredrik@pythonware.com Sat Jun 12 10:49:07 1999
From: fredrik@pythonware.com (Fredrik Lundh)
Date: Sat, 12 Jun 1999 11:49:07 +0200
Subject: [Python-Dev] join()
References: <199906111851.OAA04105@eric.cnri.reston.va.us> <14177.24219.94236.485421@anthem.cnri.reston.va.us>
Message-ID: <014001beb4b9$63f1e820$f29b12c2@pythonware.com>
The only reason for making it a builtin would be to avoid pulling in
all of string just to get join.
another reason is that you might be able to avoid
a unicode module...

</F>



From tismer@appliedbiometrics.com Sat Jun 12 14:27:45 1999
From: tismer@appliedbiometrics.com (Christian Tismer)
Date: Sat, 12 Jun 1999 15:27:45 +0200
Subject: [Python-Dev] More flexible namespaces.
References: <008d01be92b2$c56ef5d0$0801a8c0@bobcat> <199904300300.XAA00608@eric.cnri.reston.va.us>
<37296096.D0C9C2CC@appliedbiometrics.com> <199904301517.LAA01422@eric.cnri.reston.va.us>
Message-ID: <37626051.C1EA8AE0@appliedbiometrics.com>


Guido van Rossum wrote:
From: Christian Tismer <tismer@appliedbiometrics.com>
I'd really like to look into that.
Also I wouldn't worry too much about speed, since this is
such a cool feature. It might even be a speedup in some cases
which otherwise would need more complex handling.

May I have a look?
Sure!

(I've forwarded Christian the files per separate mail.)

I'm also interested in your opinion on how well thought-out and robust
the patches are -- I've never found the time to do a good close
reading of them.
Coming back from the stackless task, which is finished now,
I popped this task from my stack.

I had a look and it seems well-thought and robust so far.
To make a more trustable claim, I would need to build and test it.

Is this still of interest, or should I drop it?
The follow-ups in this thread indicated that the opinions
about flexible namespaces were quite mixed. So,
should I waste time in building and testing or better save it?

chris

--
Christian Tismer :^)
Applied Biometrics GmbH : Have a break! Take a ride on Python's
Kaiserin-Augusta-Allee 101 : *Starship* http://starship.python.net
10553 Berlin : PGP key -> http://wwwkeys.pgp.net
PGP Fingerprint E182 71C7 1A9D 66E9 9D15 D3CC D4D7 93E2 1FAE F6DF
we're tired of banana software - shipped green, ripens at home


From bwarsaw@python.org Sat Jun 12 18:16:28 1999
From: bwarsaw@python.org (Barry A. Warsaw)
Date: Sat, 12 Jun 1999 13:16:28 -0400 (EDT)
Subject: [Python-Dev] String methods... finally
References: <14176.19210.146525.172100@anthem.cnri.reston.va.us>
<008601beb3da$5e0a7a60$f29b12c2@pythonware.com>
<14177.20510.818041.110989@anthem.cnri.reston.va.us>
<00c301beb4b3$e84e3de0$f29b12c2@pythonware.com>
Message-ID: <14178.38380.734976.164568@anthem.cnri.reston.va.us>
"FL" == Fredrik Lundh <fredrik@pythonware.com> writes:
FL> btw, as for the "missing methods in the string type"
FL> issue, my suggestion is to merge the source code into
FL> a unified string module, which is compiled twice (or
FL> three times, the day we find that we need a 32-bit
FL> string type). don't waste any time cutting and
FL> pasting until we've sorted that one out...

Very good. Give me the nod when the sorting algorithm halts.


From tim_one@email.msn.com Sat Jun 12 19:28:13 1999
From: tim_one@email.msn.com (Tim Peters)
Date: Sat, 12 Jun 1999 14:28:13 -0400
Subject: [Python-Dev] String methods... finally
In-Reply-To: <14177.25698.40807.786489@cm-24-29-94-19.nycap.rr.com>
Message-ID:

[Skip Montanaro]
Any reason why join should be a builtin and not a method available just
to sequences? Would there be some valid interpretation of

join( {'a': 1} )
join( 1 )

? If not, I vote for method-hood, not builtin-hood.
Same here, except as a method we've got it twice backwards <wink>: it
should be a string method, but a method of the *separator*:

sep.join(seq)

same as

convert each elt in seq to a string of the same flavor as
sep, then paste the converted strings together with sep
between adjacent elements

So

" ".join(list)

delivers the same result as today's

string.join(map(str, list), " ")

and

L" ".join(list)

does much the same tomorrow but delivers a Unicode string (or is the "L" for
Lundh string <wink>?).

It looks odd at first, but the more I play with it the more I think it's
"the right thing" to do: captures everything that's done today, plus the
most common idiom (mapping str first across the sequence) on top of that,
adapts seamlessly (from the user's view) to new string types, and doesn't
invite uselessly redundant generalization to non-sequence types. One other
attraction perhaps unique to me: I can never remember whether string.join's
default separator is a blank or a null string! Explicit is better than
implicit <wink>.

the-heart-of-a-join-is-the-glue-ly y'rs - tim
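[Tim's proposed semantics, sketched as a hypothetical helper; note the str()-coercion step is spelled out because a bare separator.join, as later adopted, does not coerce its elements:]

```python
def sep_join(sep, seq):
    # Tim's rule: convert each element to a string of sep's flavor,
    # then paste the pieces together with sep between adjacent elements
    return sep.join([str(elt) for elt in seq])
```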




From tim_one@email.msn.com Sat Jun 12 19:28:18 1999
From: tim_one@email.msn.com (Tim Peters)
Date: Sat, 12 Jun 1999 14:28:18 -0400
Subject: [Python-Dev] String methods... finally
In-Reply-To: <00fb01beb4b6$4df59420$f29b12c2@pythonware.com>
Message-ID:

[Tim]
expandtabs is used frequently in IDLE and even pyclbr.py now.
Curiously, though, they almost never want the tab-expanded string,
but rather its len.
[/F]
looked in stropmodule.c lately:

static PyObject *
strop_expandtabs(self, args)
...
/* First pass: determine size of output string */
...
/* Second pass: create output string and fill it */
...

(btw, I originally wrote that code for pythonworks ;-)
Yes, it's nice code! The irony was the source of my "curiously" <wink>.
how about an "expandtabslength" method?
Na, it's very specialized, easy to spell by hand, and even IDLE/pyclbr don't
really need more speed in this area.

From an end-user's view, it's much odder that Python supplies expandtabs but
not the converse string.tabify(string, leadingwhitespaceonly=1, tabwidth=8).
or maybe we should add lazy evaluation of strings!
In the compiler world, there's a famous story about a PL/1 compiler that
blew everyone else out of the water by noticing that the inner loop of a
benchmark extended a string by one character on each trip around, but only
*used* the length of the string. So it skipped code for the quadratic-time
repeated strcats, instead just adding 1 to an int representing the length.

i.e.-lazy-strings-aren't-lazy-enough<wink>-ly y'rs - tim




From tim_one@email.msn.com Sat Jun 12 22:37:08 1999
From: tim_one@email.msn.com (Tim Peters)
Date: Sat, 12 Jun 1999 17:37:08 -0400
Subject: [Python-Dev] stackable ints [stupid idea (ignore) :v]
In-Reply-To: <3761098D.A56F58A8@ifu.net>
Message-ID:

[Aaron Watters]
...
I thought it would be good to be able to do the following loop
with Numeric arrays

for x in array1:
array2[x] = array3[x] + array4[x]

without any memory management being involved. Right now, I think the
for loop has to continually dynamically allocate each new x
Actually not, it just binds x to the sequence of PyObject*'s already in
array1, one at a time. It does bump & drop the refcount on that object a
lot. Also irksome is that it keeps allocating/deallocating a little integer
on each trip, for the under-the-covers loop index! Marc-Andre (I think)
had/has a patch to worm around that, but IIRC it didn't make much difference
(wouldn't expect it to, though -- not if the loop body does any real work).

One thing a smarter Python compiler could do is notice the obvious <snort>:
the *internal* incref/decref operations on the object denoted by x in the
loop above must cancel out, so there's no need to do any of them.
"internal" == those due to the routine actions of the PVM itself, while
pushing and popping the eval stack. Exploiting that is tedious; e.g.,
inventing a pile of opcode variants that do the same thing as today's except
skip an incref here and a decref there.
and an intermediate sum (and immediately deallocate them)
The intermediate sum is allocated each time, but not deallocated (the
pre-existing object at array2[x] *may* be deallocated, though).
and that makes the loop piteously slow.
A lot of things conspire to make it slow. David is certainly right that, in
this particular case, array2[array1] = array3[array1] + etc worms around the
worst of them.
The idea of replacing PyObject *'s with a struct [typedescr *, data *]
was a space/time tradeoff to speed up operations like the above
by eliminating any need for mallocs or other memory management.
Fleshing out details may make it look less attractive. For machines where
ints are no wider than pointers, the "data *" can be replaced with the int
directly and then there's real potential. If for a float the "data*" really
*is* a pointer, though, what does it point *at*? Some dynamically allocated
memory to hold the float appears to be the only answer, and you're right
back at the problem you were hoping to avoid.

Make the "data*" field big enough to hold a Python float directly, and the
descriptor likely zooms to 128 bits (assuming float is IEEE double and the
machine requires natural alignment).

Let's say we do that. Where does the "+" implementation get the 16 bytes it
needs to store its result? The space presumably already exists in the slot
indexed by array2[x], but the "+" implementation has no way to *know* that.
Figuring it out requires non-local analysis, which is quite a few steps
beyond what Python's compiler can do today. Easiest: internal functions
all grow a new PyDescriptor* argument into which they are to write their
result's descriptor. The PVM passes "+" the address of the slot indexed by
array2[x] if it's smart enough; or, if it's not, the address of the stack
slot descriptor into which today's PVM *would* push the result. In the
latter case the PVM would need to copy those 16 bytes into the slot indexed
by array2[x] later.

Neither of those is as simple as it sounds, though, at least because if
array2[x] holds a descriptor with a real pointer in its variant half, the
thing to which it points needs to get decref'ed iff the add succeeds. It
can get very messy!
I really can't say whether it'd be worth it or not without some sort of
real testing. Just a thought.
It's a good thought! Just hard to make real.

but-if-michael-hudson-keeps-hacking-at-bytecodes-and-christian-
keeps-trying-to-prove-he's-crazier-than-michael-by-2001-
we'll-be-able-to-generate-optimized-vector-assembler-for-
it<wink>-ly y'rs - tim




From tim_one@email.msn.com Sat Jun 12 22:37:14 1999
From: tim_one@email.msn.com (Tim Peters)
Date: Sat, 12 Jun 1999 17:37:14 -0400
Subject: [Python-Dev] stackable ints [stupid idea (ignore) :v]
In-Reply-To: <375FC062.62850DE5@ifu.net>
Message-ID:

[Aaron Watters]
...
Another fix would be to put the refcount in the static side with
no speed penalty

(typedescr
repr* ----------------------> data
refcount
)

but would that be wasteful of space?
The killer is for types where repr* is a real pointer:

x = [Whatever()]
y = x[:]

Now we have two physically distinct descriptors pointing at the same thing,
and so also two distinct refcounts for that thing -- impossible to keep them
in synch efficiently; "del y" has no efficient way to find the refcount
hiding in x.

things-and-their-refcounts-are-monogamous-ly y'rs - tim




From bwarsaw@cnri.reston.va.us Sun Jun 13 18:56:33 1999
From: bwarsaw@cnri.reston.va.us (Barry A. Warsaw)
Date: Sun, 13 Jun 1999 13:56:33 -0400 (EDT)
Subject: [Python-Dev] String methods... finally
References: <14177.25698.40807.786489@cm-24-29-94-19.nycap.rr.com>
<000101beb501$55fb9b60$ce9e2299@tim>
Message-ID: <14179.61649.286195.248429@anthem.cnri.reston.va.us>
"TP" == Tim Peters <tim_one@email.msn.com> writes:
TP> Same here, except as a method we've got it twice backwards
TP> <wink>: it should be a string method, but a method of the
TP> *separator*:

TP> sep.join(seq)

TP> same as
convert each elt in seq to a string of the same flavor as
sep, then paste the converted strings together with sep
between adjacent elements
TP> So

TP> " ".join(list)

TP> delivers the same result as today's

TP> string.join(map(str, list), " ")

TP> and

TP> L" ".join(list)

TP> does much the same tomorrow but delivers a Unicode string (or
TP> is the "L" for Lundh string <wink>?).

TP> It looks odd at first, but the more I play with it the more I
TP> think it's "the right thing" to do

At first glance, I like this proposal a lot. I'd be happy to code it
up if David'll stop throwing those rocks. Whether or not they hit me,
they still hurt <snif> :)

-Barry


From tim_one@email.msn.com Sun Jun 13 20:34:57 1999
From: tim_one@email.msn.com (Tim Peters)
Date: Sun, 13 Jun 1999 15:34:57 -0400
Subject: [Python-Dev] String methods... finally
In-Reply-To: <14179.61649.286195.248429@anthem.cnri.reston.va.us>
Message-ID: <000801beb5d3$d1fd06e0$ae9e2299@tim>
"TP" == Tim Peters <tim_one@email.msn.com> writes:
TP> Same here, except as a method we've got it twice backwards
TP> <wink>: it should be a string method, but a method of the
TP> *separator*:

TP> sep.join(seq)

TP> same as
convert each elt in seq to a string of the same flavor as
sep, then paste the converted strings together with sep
between adjacent elements
TP> So

TP> " ".join(list)

TP> delivers the same result as today's

TP> string.join(map(str, list), " ")

TP> and

TP> L" ".join(list)

TP> does much the same tomorrow but delivers a Unicode string (or
TP> is the "L" for Lundh string <wink>?).

TP> It looks odd at first, but the more I play with it the more I
TP> think it's "the right thing" to do
Barry, did it ever occur to you that this fancy Emacs quoting is pig ugly
<wink>?

[Barry A. Warsaw]
At first glance, I like this proposal a lot.
That's a bit scary -- even I didn't like it at first glance. It kept
growing on me, though, especially after a trivial naming trick:

space, tab, null = ' ', '\t', ''

...

sentence = space.join(list)
table = tab.join(list)
squashed = null.join(list)

That's so beautifully self-descriptive I cried! Well, I actually jerked my
leg and stubbed my little toe badly, but it's healing nicely, thank you.
Note the naturalness too of creating zippier bound method objects for the
kinds of join you're doing most often:

spacejoin = ' '.join
tabjoin = '\t'.join

etc. I still like it more the more I play with it.
I'd be happy to code it up if David'll stop throwing those rocks.
David warmed up to it in pvt email (his first response was the expected
one-liner "Wacky!").

Other issues:

+ David may want C.join(T) generalized to other classes C and argument types
T. So far my response to all such generalizations has been "wacky!" <wink>,
but I don't think that bears one way or t'other on whether
StringType.join(SequenceType) makes good sense on its own.

+ string.join(seq) doesn't currently convert seq elements to string type,
and in my vision it would. At least three of us admit to mapping str across
seq anyway before calling string.join, and I think it would be a nice
convenience:

I think there's no confusion because there's nothing sensible
string.join *could* do with a non-string seq element other than
convert it to string. The primary effect of string.join griping
about a non-string seq element today is that my

if not ok:
    sys.__stdout__.write("not ok, args are " + string.join(args) + "\n")

debugging output blows up instead of being helpful <0.8 wink>.

If Guido is opposed to being helpful, though <wink>, the auto-convert bit
isn't essential.
Whether or not they hit me, they still hurt <snif> :)
I know they do, Barry. That's why I never throw rocks at you. If you like,
I'll have a word with David's ISP.

if-this-was-a-flame-war-we're-too-civilized-to-live-long-enough-to-
reproduce-ly y'rs - tim




From da@ski.org Sun Jun 13 20:48:59 1999
From: da@ski.org (David Ascher)
Date: Sun, 13 Jun 1999 12:48:59 -0700 (Pacific Daylight Time)
Subject: [Python-Dev] String methods... finally
In-Reply-To: <14179.61649.286195.248429@anthem.cnri.reston.va.us>
Message-ID: <Pine.WNT.4.05.9906131248170.106-100000@david.ski.org>
On Sun, 13 Jun 1999, Barry A. Warsaw wrote:

At first glance, I like this proposal a lot. I'd be happy to code it
up if David'll stop throwing those rocks. Whether or not they hit me,
they still hurt <snif> :)
I like it too, since you ask. =)

(When you get a chance, could you bring the rocks back? I only have a
limited supply. Thanks).

--david



From guido@CNRI.Reston.VA.US Mon Jun 14 15:46:34 1999
From: guido@CNRI.Reston.VA.US (Guido van Rossum)
Date: Mon, 14 Jun 1999 10:46:34 -0400
Subject: [Python-Dev] String methods... finally
In-Reply-To: Your message of "Sat, 12 Jun 1999 14:28:13 EDT."
<000101beb501$55fb9b60$ce9e2299@tim>
References: <000101beb501$55fb9b60$ce9e2299@tim>
Message-ID: <199906141446.KAA00733@eric.cnri.reston.va.us>
Same here, except as a method we've got it twice backwards <wink>: it
should be a string method, but a method of the *separator*:

sep.join(seq)
Funny, but it does seem right! Barry, go for it...

--Guido van Rossum (home page: http://www.python.org/~guido/)


From klm@digicool.com Mon Jun 14 16:09:58 1999
From: klm@digicool.com (Ken Manheimer)
Date: Mon, 14 Jun 1999 11:09:58 -0400
Subject: [Python-Dev] String methods... finally
Message-ID:

[Skip Montanaro]
I see the following functions in string.py that could
reasonably be methodized:

ljust, rjust, center, expandtabs, capwords

Also zfill.
[Barry A. Warsaw]
What do you think, are these important enough to add?
I think expandtabs is worthwhile. Though i wouldn't say i use it
frequently, when i do use it i'm thankful it's there - it's something
i'm really glad to have precooked, since i'm generally not looking for
the distraction when i do happen to need it...

Ken
klm@digicool.com


From guido@CNRI.Reston.VA.US Mon Jun 14 16:12:33 1999
From: guido@CNRI.Reston.VA.US (Guido van Rossum)
Date: Mon, 14 Jun 1999 11:12:33 -0400
Subject: [Python-Dev] stackable ints [stupid idea (ignore) :v]
In-Reply-To: Your message of "Sat, 12 Jun 1999 02:19:37 EDT."
<000101beb49b$8c27c620$b19e2299@tim>
References: <000101beb49b$8c27c620$b19e2299@tim>
Message-ID:

[me]
- Thus, lists take double the memory assuming they reference objects
that also exist elsewhere. This affects the performance of slices
etc.

- On the other hand, a list of ints takes half the memory (given that
most of those ints are not shared).
[Tim]
Isn't this 2/3 rather than 1/2? I'm picturing a list element today as
essentially a pointer to a type object pointer + int (3 units in all), and a
type object pointer + int (2 units in all) "tomorrow". Throw in refcounts
too and the ratio likely gets closer to 1.
An int is currently 3 units: type, refcnt, value. (The special int
allocator means that there's no malloc overhead.) A list item is one
unit. So a list of N ints is 4N units (+ overhead). In the proposed
scheme, there would be 2 units. That makes a factor of 1/2 for me...
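Guido's accounting, tallied as a toy calculation (a "unit" here is one machine word; the numbers come from the message, nothing is measured):

```python
# Today: each unshared int object is 3 words (type ptr, refcount, value),
# plus 1 word for the pointer slot holding it in the list.
today_per_item = 3 + 1                       # 4 words per list element
# Proposed: the descriptor (type ptr + inline value) lives directly in
# the list slot, so there is no separate int object at all.
proposed_per_item = 2                        # 2 words per list element
ratio = proposed_per_item / today_per_item   # Guido's factor of 1/2
```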
Well, Python already has homogeneous int lists (array.array), and while they
save space they suffer in speed due to needing to wrap raw ints "in an
object" upon reference and unwrap them upon storage.
Which would become faster with the proposed scheme since it would not
require any heap allocation (presuming 2-unit structs can be passed
around as function results).
- Reference count manipulations could be done by a macro (or C++
behind-the-scense magic using copy constructors and destructors) that
calls a function in the type object -- i.e. each object could decide
on its own reference counting implementation :-)
You don't need to switch representations to get that, though, right? That
is, I don't see anything stopping today's type objects from growing
__incref__ and __decref__ slots -- except for common sense <wink>.
Eh, indeed <blush>.
An apparent ramification I don't see above that may actually be worth
something <wink>:

- In "i = j + k", the eval stack could contain the ints directly, instead of
pointers to the ints. So fetching the value of i takes two loads (get the
type pointer + the variant) from adjacent stack locations, instead of
today's load-the-pointer + follow-the-pointer (to some other part of
memory); similarly for fetching the value of j. Then the sum can be stored
*directly* into the stack too, without today's need for allocating and
wrapping it in "an int object" first.
I thought this was assumed all the time? I mentioned "no heap
allocation" above before I read this. I think this is the reason why
it was proposed at all: things for which the value fits in a unit
don't live on the heap at all, *without* playing tricks with pointer
representations.
Possibly happy variant: on top of the above, *don't* exempt ints from
refcounting. Let 'em incref and decref like everything else. Give them an
initial refcount of max_count/2, and in the exceedingly unlikely event a
decref on an int ever sees zero, the int "destructor" simply resets the
refcount to max_count/2 and is otherwise a nop.
Don't get this -- there's no object on the heap to hold the refcnt.

--Guido van Rossum (home page: http://www.python.org/~guido/)


From bwarsaw@python.org Mon Jun 14 19:47:32 1999
From: bwarsaw@python.org (Barry A. Warsaw)
Date: Mon, 14 Jun 1999 14:47:32 -0400 (EDT)
Subject: [Python-Dev] String methods... finally
References: <14179.61649.286195.248429@anthem.cnri.reston.va.us>
<000801beb5d3$d1fd06e0$ae9e2299@tim>
Message-ID: <14181.20036.857729.999835@anthem.cnri.reston.va.us>
"TP" == Tim Peters <tim_one@email.msn.com> writes:
Timbot> Barry, did it ever occur to you that this fancy Emacs
Timbot> quoting is pig ugly <wink>?

wink> + string.join(seq) doesn't currently convert seq elements to
wink> string type, and in my vision it would. At least three of
wink> us admit to mapping str across seq anyway before calling
wink> string.join, and I think it would be a nice convenience:

Check the CVS branch. It does seem pretty cool!


From bwarsaw@cnri.reston.va.us Mon Jun 14 19:48:10 1999
From: bwarsaw@cnri.reston.va.us (Barry A. Warsaw)
Date: Mon, 14 Jun 1999 14:48:10 -0400 (EDT)
Subject: [Python-Dev] String methods... finally
References: <14179.61649.286195.248429@anthem.cnri.reston.va.us>
<Pine.WNT.4.05.9906131248170.106-100000@david.ski.org>
Message-ID: <14181.20074.728230.764485@anthem.cnri.reston.va.us>
"DA" == David Ascher <da@ski.org> writes:
DA> (When you get a chance, could you bring the rocks back? I
DA> only have a limited supply. Thanks).

Sorry, I need them to fill up the empty spaces in my skull.
-Barry


From tim_one@email.msn.com Tue Jun 15 03:50:08 1999
From: tim_one@email.msn.com (Tim Peters)
Date: Mon, 14 Jun 1999 22:50:08 -0400
Subject: [Python-Dev] RE: [Python-Dev] String methods... finally
In-Reply-To: <14181.20036.857729.999835@anthem.cnri.reston.va.us>
Message-ID: <000001beb6d9$c82e7980$069e2299@tim>
wink> + string.join(seq) [etc]
[Barry]
Check the CVS branch. It does seem pretty cool!
It's even more fun to play with than to argue about <wink>. Thank you,
Barry!

A bug:
>>> 'ab'.endswith('b',0,1)    # right
0
>>> 'ab'.endswith('ab',0,1)   # wrong
1
>>> 'ab'.endswith('ab',0,0)   # wrong
1
>>>

Two legit compiler warnings from a previous checkin:

Objects\intobject.c(236) : warning C4013: 'isspace' undefined;
assuming extern returning int
Objects\intobject.c(243) : warning C4013: 'isalnum' undefined;
assuming extern returning int

One docstring glitch ("very" -> "every"):
print ''.join.__doc__
S.join(sequence) -> string

Return a string which is the concatenation of the string representation
of very element in the sequence. The separator between elements is S.
>>>

"-".join("very nice indeed! ly".split()) + " y'rs - tim"




From MHammond@skippinet.com.au Tue Jun 15 04:13:03 1999
From: MHammond@skippinet.com.au (Mark Hammond)
Date: Tue, 15 Jun 1999 13:13:03 +1000
Subject: [Python-Dev] RE: [Python-Dev] RE: [Python-Dev] String methods... finally
In-Reply-To: <000001beb6d9$c82e7980$069e2299@tim>
Message-ID: <00e901beb6dc$fc830d60$0801a8c0@bobcat>
"-".join("very nice indeed! ly".split()) + " y'rs - tim"
<chortle>

But now the IDLE "CallTips" extension seems lame.

Typing
" ".join(
doesn't yield the help, where:
s=" "; s.join(
does :-)

Very cute, I must say. The biggest temptation is going to be, as I
mentioned, avoiding the use of this stuff for "general" code. I'm still
unconvinced the "sep.join" concept is natural, but string methods in
general sure as hell are.

Guido almost hinted that post 1.5.2 interim release(s?) would be
acceptable, so long as he didn't have to do it! I'm tempted to volunteer
to do something for Windows, and if no other platform bigots
volunteer, I won't mind in the least :-) I realize it still needs settling
down, but this is too good to keep to "ourselves" (being CVS-enabled
people) for too long ;-)

Mark.



From tim_one@email.msn.com Tue Jun 15 06:29:03 1999
From: tim_one@email.msn.com (Tim Peters)
Date: Tue, 15 Jun 1999 01:29:03 -0400
Subject: [Python-Dev] RE: [Python-Dev] stackable ints [stupid idea (ignore) :v]
In-Reply-To: <199906141512.LAA00793@eric.cnri.reston.va.us>
Message-ID:

[Guido]
- On the other hand, a list of ints takes half the memory (given that
most of those ints are not shared).
[Tim]
Isn't this 2/3 rather than 1/2? [yadda yadda]
[Guido]
An int is currently 3 units: type, refcnt, value. (The special int
allocator means that there's no malloc overhead.) A list item is one
unit. So a list of N ints is 4N units (+ overhead). In the proposed
scheme, there would be 2 units. That makes a factor of 1/2 for me...
Well, if you count the refcount, sure <wink>.

Moving on, that implies you're not contemplating making the descriptor big enough
to hold a float (else it would still be 4 units assuming natural alignment),
in turn implying that *only* ints would get the space advantage in
lists/tuples? Plus maybe special-casing the snot out of short strings?
Well, Python already has homogeneous int lists (array.array),
and while they save space they suffer in speed ...
Which would become faster with the proposed scheme since it would not
require any heap allocation (presuming 2-unit structs can be passed
around as function results).
They can be in any std (even reasonable) C (or C++). If this gets serious,
though, strongly suggest timing it on important compiler + platform combos,
especially RISC. You can probably *count* on a PyObject* result getting
returned in a register, but depressed C++ compiler jockeys have been known
to treat struct/class returns via an unoptimized chain of copy constructors.
Probably better to allocate "result space" in the caller and pass that via
reference to the callee. With care, you can get the result written into its
final resting place efficiently then, more efficiently than even a gonzo
globally optimizing compiler could figure out (A calls B call C calls D, and
A can tell D exactly where to store the result if it's explicit).
[other ramifications for
"i = j + k"
]
I thought this was assumed all the time?
Apparently it was! At least by you <wink>. Now by me too; no problem.
[refcount-on-int drivel]
Don't get this -- there's no object on the heap to hold the refcnt.
I don't get it either. Desperation? The idea that incref/decref may need
to be treated as virtual methods (in order to exempt ints or other possible
direct values) really disturbs me -- incref/decref happen *all* the time,
explicit integer ops only some of the time. Turning incref/decref into
indirected function calls doesn't sound promising at all. Injecting a
test-branch guard via macro sounds faster but still icky, and especially if
the set of exempt types isn't a singleton.

no-positive-suggestions-just-grousing-ly y'rs - tim




From tim_one@email.msn.com Tue Jun 15 07:17:02 1999
From: tim_one@email.msn.com (Tim Peters)
Date: Tue, 15 Jun 1999 02:17:02 -0400
Subject: [Python-Dev] RE: [Python-Dev] RE: [Python-Dev] RE: [Python-Dev] String methods... finally
In-Reply-To: <00e901beb6dc$fc830d60$0801a8c0@bobcat>
Message-ID:

[Mark Hammond]
...
But now the IDLE "CallTips" extension seems lame.

Typing
" ".join(
doesn't yield the help, where:
s=" "; s.join(
does :-)
No Windows Guy will be stymied by how to hack that! Hint: string literals
always end with one of two characters <wink>.
Very cute, I must say. The biggest temptation is going to be, as I
mentioned, avoiding the use of this stuff for "general" code. I'm still
unconvinced the "sep.join" concept is natural, but string methods in
general sure as hell are.
sep.join bothered me until I gave the separator a name (a la the
"space.join, tab.join", etc examples earlier). Then it looked *achingly*
natural! Using a one-character literal instead still rubs me the wrong way,
although for some reason e.g.

", ".join(seq)

no longer does. I can't account for any of it, but I know what I like
<wink>.
Guido almost hinted that post 1.5.2 interim release(s?) would be
acceptable, so long as he didn't have to do it! I'm tempted to volunteer
to do something for Windows, and if no other platform bigots
volunteer, I won't mind in the least :-) I realize it still
needs settling down, but this is too good to keep to "ourselves" (being
CVS enabled people) for too long ;-)
Yes, I really like the new string methods too! And I want to rewrite all of
IDLE to use them ASAP <wink>.

damn-the-users-let's-go-nuts-ly y'rs - tim




From fredrik@pythonware.com Tue Jun 15 08:10:28 1999
From: fredrik@pythonware.com (Fredrik Lundh)
Date: Tue, 15 Jun 1999 09:10:28 +0200
Subject: [Python-Dev] Re: [Python-Dev] String methods... finally
References: <14179.61649.286195.248429@anthem.cnri.reston.va.us><000801beb5d3$d1fd06e0$ae9e2299@tim> <14181.20036.857729.999835@anthem.cnri.reston.va.us>
Message-ID: <006801beb6fe$27490d80$f29b12c2@pythonware.com>
wink> + string.join(seq) doesn't currently convert seq elements to
wink> string type, and in my vision it would. At least three of
wink> us admit to mapping str across seq anyway before calling
wink> string.join, and I think it would be a nice convenience:
hmm. consider the following:

space = " "
foo = L"foo"
bar = L"bar"
result = space.join((foo, bar))

what should happen if you run this:

a) Python raises an exception
b) result is an ordinary string object
c) result is a unicode string object

</F>



From ping@lfw.org Tue Jun 15 08:24:33 1999
From: ping@lfw.org (Ka-Ping Yee)
Date: Tue, 15 Jun 1999 00:24:33 -0700 (PDT)
Subject: [Python-Dev] Re: [Python-Dev] RE: [Python-Dev] String methods... finally
In-Reply-To: <000001beb6d9$c82e7980$069e2299@tim>
Message-ID: <Pine.LNX.3.93.990615002207.3865E-100000@localhost>
On Mon, 14 Jun 1999, Tim Peters wrote:

A bug:
>>> 'ab'.endswith('b',0,1)    # right
0
>>> 'ab'.endswith('ab',0,1)   # wrong
1
>>> 'ab'.endswith('ab',0,0)   # wrong
1
I assumed you meant that the extra arguments should be slices
on the string being searched, i.e.

specimen.startswith(text, start, end)

is equivalent to

specimen[start:end].startswith(text)

without the overhead of slicing the specimen? Or did i understand
you correctly?
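Ping's equivalence can be written out directly. A hypothetical reference helper (the name is invented, and a real implementation would avoid the slice copy):

```python
def endswith_slice(specimen, text, start=0, end=None):
    # Reference semantics: the optional arguments bound a slice of the
    # specimen, and the match is tested against that slice only.
    if end is None:
        end = len(specimen)
    return specimen[start:end].endswith(text)

# 'ab'[0:1] is just 'a', so neither of Tim's flagged cases can match:
assert endswith_slice('ab', 'b', 0, 1) is False    # Tim's "right" case
assert endswith_slice('ab', 'ab', 0, 1) is False   # the buggy cases
assert endswith_slice('ab', 'ab', 0, 0) is False   # should say no too
```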
Return a string which is the concatenation of the string representation
of very element in the sequence. The separator between elements is S.
"-".join("very nice indeed! ly".split()) + " y'rs - tim"
Yes, i have to agree that this (especially once you name the
separator string) is a pretty nice way to present the "join"
functionality.


!ping

"Is it so small a thing,
To have enjoyed the sun,
To have lived light in the Spring,
To have loved, to have thought, to have done;
To have advanced true friends, and beat down baffling foes--
That we must feign bliss
Of a doubtful future date,
And while we dream on this,
Lose all our present state,
And relegate to worlds... yet distant our repose?"
-- Matthew Arnold



From MHammond@skippinet.com.au Tue Jun 15 09:28:55 1999
From: MHammond@skippinet.com.au (Mark Hammond)
Date: Tue, 15 Jun 1999 18:28:55 +1000
Subject: [Python-Dev] RE: [Python-Dev] Re: [Python-Dev] String methods... finally
In-Reply-To: <006801beb6fe$27490d80$f29b12c2@pythonware.com>
Message-ID: <00f801beb709$1c874b90$0801a8c0@bobcat>
hmm. consider the following:

space = " "
foo = L"foo"
bar = L"bar"
result = space.join((foo, bar))

what should happen if you run this:

a) Python raises an exception
b) result is an ordinary string object
c) result is a unicode string object
Well, we could take this to the extreme, and allow _every_ object to grow a
join method, where join attempts to coerce to the same type.

Thus:
" ".join([L"foo", L"bar"]) -> "foo bar"
L" ".join(["foo", "bar"]) -> L"foo bar"
" ".join([1,2]) -> "1 2"
0.join(['1','2']) -> 102
[].join([...]) # exercise for the reader ;-)

etc.

Mark.



From ping@lfw.org Tue Jun 15 09:50:34 1999
From: ping@lfw.org (Ka-Ping Yee)
Date: Tue, 15 Jun 1999 01:50:34 -0700 (PDT)
Subject: [Python-Dev] Re: [Python-Dev] RE: [Python-Dev] Re: [Python-Dev] String methods... finally
In-Reply-To: <00f801beb709$1c874b90$0801a8c0@bobcat>
Message-ID: <Pine.LNX.3.93.990615014244.4634A-100000@skuld.lfw.org>
On Tue, 15 Jun 1999, Mark Hammond wrote:
hmm. consider the following:

space = " "
foo = L"foo"
bar = L"bar"
result = space.join((foo, bar))

what should happen if you run this:

a) Python raises an exception
b) result is an ordinary string object
c) result is a unicode string object
Well, we could take this to the extreme, and allow _every_ object to grow a
join method, where join attempts to coerce to the same type.
I think i'd agree with Mark's answer for this situation, though
i don't know about adding 'join' methods to other types. I see two
arguments that can be made here:

For b): the result should match the type of the object
on which the method was called. This way the type of
the result more easily determinable by the programmer
or reader. Also, since the type of the result is
immediately known to the "join" code, each member of the
passed-in sequence need only be fetched once, and a
__getitem__-style generator can easily stand in for the
sequence.

For c): the result should match the "biggest" type among
the operands. This behaviour is consistent with what
you would get if you added all the operands together.
Unfortunately this means you have to see all the operands
before you know the type of the result, which means you
either scan twice or convert potentially the whole result.

b) weighs more strongly in my opinion, so i think the right
thing to do is to match the type of the separator.

(But if a Unicode string contains characters outside of
the Latin-1 range, is it supposed to raise an exception
on an attempt to convert to an ordinary string? In that
case, the actual behaviour of the above example would be
a) and i'm not sure if that would get annoying fast.)
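Option (b) is easy to state as code: the separator alone fixes the result type before any element is looked at. A sketch with an invented name, not a proposed API:

```python
def join_as_sep_type(sep, seq):
    # Coerce every element to the separator's own type, so the result
    # type is known up front and each element is fetched exactly once.
    kind = type(sep)
    return sep.join(kind(item) for item in seq)
```

With a plain-string separator the elements are converted to plain strings; with a Unicode separator they would be converted to Unicode, which is Ping's conclusion.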



-- ?!ng

"In the sciences, we are now uniquely privileged to sit side by side
with the giants on whose shoulders we stand."
-- Gerald Holton



From gstein@lyra.org Tue Jun 15 10:05:43 1999
From: gstein@lyra.org (Greg Stein)
Date: Tue, 15 Jun 1999 02:05:43 -0700
Subject: [Python-Dev] Re: String methods... finally
References: <Pine.LNX.3.93.990615014244.4634A-100000@skuld.lfw.org>
Message-ID: <37661767.37D8E370@lyra.org>

Ka-Ping Yee wrote:
...
(But if a Unicode string contains characters outside of
the Latin-1 range, is it supposed to raise an exception
on an attempt to convert to an ordinary string? In that
case, the actual behaviour of the above example would be
a) and i'm not sure if that would get annoying fast.)
I forget the "last word" on this, but (IMO) str(unicode_object) should
return a UTF-8 encoded string.

Cheers,
-g

p.s. what's up with Mailman... it seems to have broken badly on the
[Python-Dev] insertion... I just stripped a bunch of 'em

--
Greg Stein, http://www.lyra.org/


From fredrik@pythonware.com Tue Jun 15 10:48:40 1999
From: fredrik@pythonware.com (Fredrik Lundh)
Date: Tue, 15 Jun 1999 11:48:40 +0200
Subject: [Python-Dev] Re: String methods... finally
References: <Pine.LNX.3.93.990615014244.4634A-100000@skuld.lfw.org>
Message-ID: <003e01beb714$55d7fd80$f29b12c2@pythonware.com>
a) Python raises an exception
b) result is an ordinary string object
c) result is a unicode string object
Well, we could take this to the extreme, and allow _every_ object to grow a
join method, where join attempts to coerce to the same type.
well, I think that unicode strings and ordinary strings
should behave like "strings" where possible, just like
integers, floats, long integers and complex values
behave like "numbers" in many (but not all) situations.

if we make unicode strings easier to mix with ordinary
strings, we don't necessarily have to make integers and
lists easier to mix with strings too...

(people who want that can use Tcl instead ;-)
I think i'd agree with Mark's answer for this situation, though
i don't know about adding 'join' methods to other types. I see two
arguments that can be made here:

For b): the result should match the type of the object
on which the method was called. This way the type of
the result is more easily determinable by the programmer
or reader. Also, since the type of the result is
immediately known to the "join" code, each member of the
passed-in sequence need only be fetched once, and a
__getitem__-style generator can easily stand in for the
sequence.

For c): the result should match the "biggest" type among
the operands. This behaviour is consistent with what
you would get if you added all the operands together.
Unfortunately this means you have to see all the operands
before you know the type of the result, which means you
either scan twice or convert potentially the whole result.

b) weighs more strongly in my opinion, so i think the right
thing to do is to match the type of the separator.

(But if a Unicode string contains characters outside of
the Latin-1 range, is it supposed to raise an exception
on an attempt to convert to an ordinary string? In that
case, the actual behaviour of the above example would be
a) and i'm not sure if that would get annoying fast.)
exactly. there are some major issues hidden in here,
including:

1) what should "str" do for unicode strings?
2) should join really try to convert its arguments?
3) can "str" really raise an exception for a built-in type?
4) should code written by americans fail when used
in other parts of the world?

based on string-sig input, the unicode class currently
solves (1) by returning a UTF-8 encoded version of the
unicode string contents. this was chosen to make sure
that the answer to (3) is "no, never", and that the
answer to (4) is "not always, at least" -- we've had enough of
that, thank you:
http://www.lysator.liu.se/%e5ttabitars/7bit-example.txt

if (1) is a reasonable solution (I think it is), I think the
answer to (2) should be no, based on the rule of least
surprise. Python has always required me to explicitly
state when I want to convert things in a way that may
radically change their meaning. I see little reason to
abandon that in 1.6.
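In modern Python terms, the choice for (1) works because every code point has a UTF-8 spelling, so the conversion can never fail -- e.g. the å in the URL above (an illustration, not 1.6 code):

```python
s = "\xe5"                      # U+00E5, LATIN SMALL LETTER A WITH RING ABOVE
encoded = s.encode("utf-8")     # always succeeds, for any code point
assert encoded == b"\xc3\xa5"   # å becomes two bytes under UTF-8
s.encode("latin-1")             # fine for å, but...
# "\u20ac".encode("latin-1")   # ...would raise UnicodeEncodeError,
#                              # while UTF-8 handles it: b"\xe2\x82\xac"
```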

</F>



From gstein@lyra.org Tue Jun 15 11:01:09 1999
From: gstein@lyra.org (Greg Stein)
Date: Tue, 15 Jun 1999 03:01:09 -0700
Subject: [Python-Dev] Re: [Python-Dev] Re: String methods... finally
References: <Pine.LNX.3.93.990615014244.4634A-100000@skuld.lfw.org> <003e01beb714$55d7fd80$f29b12c2@pythonware.com>
Message-ID: <37662465.682FA81B@lyra.org>

Fredrik Lundh wrote:
...
if (1) is a reasonable solution (I think it is), I think the
answer to (2) should be no, based on the rule of least
surprise. Python has always required me to explicitly
state when I want to convert things in a way that may
radically change their meaning. I see little reason to
abandon that in 1.6.
Especially because it is such a simple translation:

sep.join(sequence)

becomes

sep.join(map(str, sequence))

Very obvious what is happening. It isn't hard to read, and it doesn't
take a lot out of a person to insert that extra phrase.

And hey... people can always do:

def strjoin(sep, seq):
    return sep.join(map(str, seq))

And just use strjoin() everywhere if they hate the typing.

Cheers,
-g

--
Greg Stein, http://www.lyra.org/


From gmcm@hypernet.com Tue Jun 15 14:08:08 1999
From: gmcm@hypernet.com (Gordon McMillan)
Date: Tue, 15 Jun 1999 08:08:08 -0500
Subject: [Python-Dev] Re: [Python-Dev] Re: [Python-Dev] Re: String methods... finally
In-Reply-To: <37662465.682FA81B@lyra.org>
Message-ID: <1282670144-103087754@hypernet.com>

Greg Stein wrote:
...
And hey... people can always do:

def strjoin(sep, seq):
    return sep.join(map(str, seq))

And just use strjoin() everywhere if they hate the typing.
Those who hate typing regard it as great injury that they have to
define this. Of course, they'll gladly type huge long posts on the
subject.

But, I agree. string.join(['a', 'b', 3]) currently barfs.
L" ".join(seq) should complain if seq isn't all unicode, and same for
good old strings.

- Gordon


From guido@CNRI.Reston.VA.US Tue Jun 15 13:39:09 1999
From: guido@CNRI.Reston.VA.US (Guido van Rossum)
Date: Tue, 15 Jun 1999 08:39:09 -0400
Subject: [Python-Dev] Re: [Python-Dev] Re: [Python-Dev] String methods... finally
In-Reply-To: Your message of "Tue, 15 Jun 1999 09:10:28 +0200."
<006801beb6fe$27490d80$f29b12c2@pythonware.com>
References: <14179.61649.286195.248429@anthem.cnri.reston.va.us><000801beb5d3$d1fd06e0$ae9e2299@tim> <14181.20036.857729.999835@anthem.cnri.reston.va.us>
<006801beb6fe$27490d80$f29b12c2@pythonware.com>
Message-ID: <199906151239.IAA02917@eric.cnri.reston.va.us>
hmm. consider the following:

space = " "
foo = L"foo"
bar = L"bar"
result = space.join((foo, bar))

what should happen if you run this:

a) Python raises an exception
b) result is an ordinary string object
c) result is a unicode string object
The same should happen as for L"foo" + " " + L"bar".

--Guido van Rossum (home page: http://www.python.org/~guido/)


From skip@mojam.com (Skip Montanaro) Tue Jun 15 13:50:59 1999
From: skip@mojam.com (Skip Montanaro) (Skip Montanaro)
Date: Tue, 15 Jun 1999 08:50:59 -0400 (EDT)
Subject: [Python-Dev] Re: [Python-Dev] Re: [Python-Dev] Re: [Python-Dev] String methods... finally
In-Reply-To: <199906151239.IAA02917@eric.cnri.reston.va.us>
References: <14179.61649.286195.248429@anthem.cnri.reston.va.us>
<000801beb5d3$d1fd06e0$ae9e2299@tim>
<14181.20036.857729.999835@anthem.cnri.reston.va.us>
<006801beb6fe$27490d80$f29b12c2@pythonware.com>
<199906151239.IAA02917@eric.cnri.reston.va.us>
Message-ID: <14182.19420.462788.15633@cm-24-29-94-19.nycap.rr.com>

Guido> The same should happen as for L"foo" + " " + L"bar".

Remind me again, please. What mnemonic is "L" supposed to evoke? Long?
Lundh? Are we talking about Unicode strings? If so, why not "U"?

Apologies for my increased density.

Skip Montanaro | Mojam: "Uniting the World of Music" http://www.mojam.com/
skip@mojam.com | Musi-Cal: http://www.musi-cal.com/
518-372-5583



From jack@oratrix.nl Tue Jun 15 13:58:05 1999
From: jack@oratrix.nl (Jack Jansen)
Date: Tue, 15 Jun 1999 14:58:05 +0200
Subject: [Python-Dev] Re: [Python-Dev] Re: [Python-Dev] Re: [Python-Dev] String methods...
finally
In-Reply-To: Message by Guido van Rossum <guido@cnri.reston.va.us> ,
Tue, 15 Jun 1999 08:39:09 -0400 , <199906151239.IAA02917@eric.cnri.reston.va.us>
Message-ID: <19990615125805.8CF03303120@snelboot.oratrix.nl>
The same should happen as for L"foo" + " " + L"bar".
This is probably the most reasonable solution. Unfortunately it breaks Mark's
truly novel suggestion that 0.join(1, 2) becomes 102, but I guess we'll have
to live with that:-)
--
Jack Jansen | ++++ stop the execution of Mumia Abu-Jamal ++++
Jack.Jansen@oratrix.com | ++++ if you agree copy these lines to your sig ++++
www.oratrix.nl/~jack | see http://www.xs4all.nl/~tank/spg-l/sigaction.htm




From fredrik@pythonware.com Tue Jun 15 15:28:17 1999
From: fredrik@pythonware.com (Fredrik Lundh)
Date: Tue, 15 Jun 1999 16:28:17 +0200
Subject: [Python-Dev] Re: [Python-Dev] Re: [Python-Dev] String methods... finally
References: <14179.61649.286195.248429@anthem.cnri.reston.va.us><000801beb5d3$d1fd06e0$ae9e2299@tim> <14181.20036.857729.999835@anthem.cnri.reston.va.us> <006801beb6fe$27490d80$f29b12c2@pythonware.com> <199906151239.IAA02917@eric.cnri.reston.va.us>
Message-ID: <00c201beb73b$5fa27b70$f29b12c2@pythonware.com>
hmm. consider the following:

space = " "
foo = L"foo"
bar = L"bar"
result = space.join((foo, bar))

what should happen if you run this:

a) Python raises an exception
b) result is an ordinary string object
c) result is a unicode string object
The same should happen as for L"foo" + " " + L"bar".
which is?

(alright; for the moment, it's (a) for both:
import unicode
u = unicode.unicode
u("foo") + u(" ") + u("bar")
Traceback (innermost last):
File "<stdin>", line 1, in ?
TypeError: illegal argument type for built-in operation
u("foo") + " " + u("bar")
Traceback (innermost last):
File "<stdin>", line 1, in ?
TypeError: illegal argument type for built-in operation
u(" ").join(("foo", "bar"))
Traceback (innermost last):
File "<stdin>", line 1, in ?
TypeError: first argument must be sequence of unicode strings

but that can of course be changed...)

</F>



From guido@CNRI.Reston.VA.US Tue Jun 15 15:38:32 1999
From: guido@CNRI.Reston.VA.US (Guido van Rossum)
Date: Tue, 15 Jun 1999 10:38:32 -0400
Subject: [Python-Dev] Re: [Python-Dev] Re: [Python-Dev] Re: [Python-Dev] String methods... finally
In-Reply-To: Your message of "Tue, 15 Jun 1999 16:28:17 +0200."
<00c201beb73b$5fa27b70$f29b12c2@pythonware.com>
References: <14179.61649.286195.248429@anthem.cnri.reston.va.us><000801beb5d3$d1fd06e0$ae9e2299@tim> <14181.20036.857729.999835@anthem.cnri.reston.va.us> <006801beb6fe$27490d80$f29b12c2@pythonware.com> <199906151239.IAA02917@eric.cnri.reston.va.us>
<00c201beb73b$5fa27b70$f29b12c2@pythonware.com>
Message-ID: <199906151438.KAA03355@eric.cnri.reston.va.us>
The same should happen as for L"foo" + " " + L"bar".
which is?
Whatever it is -- I think we did a lot of reasoning about this, and
perhaps we're not quite done -- but I truly believe that whatever is
decided, join() should follow.

--Guido van Rossum (home page: http://www.python.org/~guido/)


From bwarsaw@cnri.reston.va.us (Barry A. Warsaw) Tue Jun 15 16:28:11 1999
From: bwarsaw@cnri.reston.va.us (Barry A. Warsaw) (Barry A. Warsaw)
Date: Tue, 15 Jun 1999 11:28:11 -0400 (EDT)
Subject: [Python-Dev] Re: String methods... finally
References: <Pine.LNX.3.93.990615014244.4634A-100000@skuld.lfw.org>
<37661767.37D8E370@lyra.org>
Message-ID: <14182.28939.509040.125174@anthem.cnri.reston.va.us>
"GS" == Greg Stein <gstein@lyra.org> writes:
GS> p.s. what's up with Mailman... it seems to have broken badly
GS> on the [Python-Dev] insertion... I just stripped a bunch of
GS> 'em

Harald Meland just checked in a fix for this, which I'm installing
now, so the breakage should be just temporary.

-Barry


From tim_one@email.msn.com Tue Jun 15 16:33:38 1999
From: tim_one@email.msn.com (Tim Peters)
Date: Tue, 15 Jun 1999 11:33:38 -0400
Subject: [Python-Dev] String methods... finally
In-Reply-To: <006801beb6fe$27490d80$f29b12c2@pythonware.com>
Message-ID: <000601beb744$70c6f9e0$979e2299@tim>
hmm. consider the following:

space = " "
foo = L"foo"
bar = L"bar"
result = space.join((foo, bar))

what should happen if you run this:

a) Python raises an exception
b) result is an ordinary string object
c) result is a unicode string object
The proposal said #b, or, in general, that the resulting string be of the
same flavor as the separator.




From tim_one@email.msn.com Tue Jun 15 16:33:40 1999
From: tim_one@email.msn.com (Tim Peters)
Date: Tue, 15 Jun 1999 11:33:40 -0400
Subject: [Python-Dev] RE: [Python-Dev] String methods... finally
In-Reply-To: <Pine.LNX.3.93.990615002207.3865E-100000@localhost>
Message-ID: <000701beb744$71e450c0$979e2299@tim>
A bug:

>>> 'ab'.endswith('b',0,1)   # right
0
>>> 'ab'.endswith('ab',0,1)  # wrong
1
>>> 'ab'.endswith('ab',0,0)  # wrong
1
[Ka-Ping]
I assumed you meant that the extra arguments should be slices
on the string being searched, i.e.

specimen.startswith(text, start, end)

is equivalent to

specimen[start:end].startswith(text)

without the overhead of slicing the specimen? Or did i understand
you correctly?
Yes, and e.g. 'ab'[0:1] == 'a', which does not end with 'ab'. So these are
inconsistent today, and the second is a bug:

>>> 'ab'[0:1].endswith('ab')
0
>>> 'ab'.endswith('ab', 0, 1)
1
>>>

Or did I misunderstand you <wink>?
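[Editorial sketch: Ka-Ping's slice-equivalence reading can be written out as a plain function in later Python syntax. This is a hypothetical helper for illustration, not the actual C implementation under discussion.]

```python
def endswith_sliced(specimen, text, start=0, end=None):
    """Return whether specimen[start:end] ends with text.

    Sketch of the semantics under discussion: the optional start/end
    arguments act like a slice of the specimen, without the overhead
    of actually building the sliced copy.
    """
    if end is None:
        end = len(specimen)
    # Normalize out-of-range and negative indices the way slicing does.
    start, end, _ = slice(start, end).indices(len(specimen))
    if end - start < len(text):
        # The slice is shorter than text, so it cannot end with it;
        # this is exactly the case the buggy endswith gets wrong.
        return False
    return specimen[end - len(text):end] == text
```

Under this rule 'ab'.endswith('ab', 0, 1) is false, consistent with 'ab'[0:1].endswith('ab').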




From gward@cnri.reston.va.us Tue Jun 15 16:41:39 1999
From: gward@cnri.reston.va.us (Greg Ward)
Date: Tue, 15 Jun 1999 11:41:39 -0400
Subject: [Python-Dev] Re: [Python-Dev] Re: [Python-Dev] Re: [Python-Dev] String methods... finally
In-Reply-To: <19990615125805.8CF03303120@snelboot.oratrix.nl>; from Jack Jansen on Tue, Jun 15, 1999 at 02:58:05PM +0200
References: <guido@cnri.reston.va.us> <19990615125805.8CF03303120@snelboot.oratrix.nl>
Message-ID: <19990615114139.A3697@cnri.reston.va.us>

On 15 June 1999, Jack Jansen said:
The same should happen as for L"foo" + " " + L"bar".
This is probably the most reasonable solution. Unfortunately it breaks Mark's
truly novel suggestion that 0.join(1, 2) becomes 102, but I guess we'll have
to live with that:-)
Careful -- it actually works this way in Perl (well, except that join
isn't a method of strings...):

$ perl -de 1
[...]
DB<2> $sep = 0

DB<3> @list = (1, 2)

DB<4> p join ($sep, @list)
102

Cool! Who needs type-checking anyways?

Greg
--
Greg Ward - software developer gward@cnri.reston.va.us
Corporation for National Research Initiatives
1895 Preston White Drive voice: +1-703-620-8990
Reston, Virginia, USA 20191-5434 fax: +1-703-620-0913


From tim_one@email.msn.com Tue Jun 15 16:58:48 1999
From: tim_one@email.msn.com (Tim Peters)
Date: Tue, 15 Jun 1999 11:58:48 -0400
Subject: [Python-Dev] Re: [Python-Dev] String methods... finally
In-Reply-To: <199906151239.IAA02917@eric.cnri.reston.va.us>
Message-ID: <000901beb747$f4531840$979e2299@tim>
space = " "
foo = L"foo"
bar = L"bar"
result = space.join((foo, bar))
The same should happen as for L"foo" + " " + L"bar".
Then " ".join([" ", 42]) should blow up, and auto-conversion for non-string
types needs to be removed from the implementation.

The attraction of auto-conversion for me is that I had never once seen
string.join blow up where the exception revealed a conceptual error; in
every case conversion to string was the intent, and an obvious one at that.
Just anal nagging. How about dropping Unicode instead <wink>?

Anyway, I'm already on record as saying auto-convert wasn't essential, and
join should first and foremost make good sense for string arguments.

off-to-work-ly y'rs - tim




From MHammond@skippinet.com.au Tue Jun 15 23:29:32 1999
From: MHammond@skippinet.com.au (Mark Hammond)
Date: Wed, 16 Jun 1999 08:29:32 +1000
Subject: [Python-Dev] Re: String methods... finally
In-Reply-To: <003e01beb714$55d7fd80$f29b12c2@pythonware.com>
Message-ID: <010101beb77e$8af64430$0801a8c0@bobcat>
well, I think that unicode strings and ordinary strings
should behave like "strings" where possible, just like
integers, floats, long integers and complex values
behave like "numbers" in many (but not all) situations.
I obviously missed a few smileys in my post. I was serious that:
L" ".join -> Unicode result
" ".join -> String result

and even
" ".join([1,2]) -> "1 2"

But integers and lists growing "join" methods was a little tongue in cheek
:-)

Mark.



From da@ski.org Tue Jun 15 23:48:41 1999
From: da@ski.org (David Ascher)
Date: Tue, 15 Jun 1999 15:48:41 -0700 (Pacific Daylight Time)
Subject: [Python-Dev] mmap
Message-ID: <Pine.WNT.4.04.9906071040550.193-100000@rigoletto.ski.org>

Another topic: what are the chances of adding the mmap module to the core
distribution? It's restricted to a smallish set of platforms (modern
Unices and Win32, I think), but it's quite small, and would be a nice
thing to have available in the core, IMHO.

(btw, the buffer object needs more documentation)

--david






From MHammond@skippinet.com.au Tue Jun 15 23:53:00 1999
From: MHammond@skippinet.com.au (Mark Hammond)
Date: Wed, 16 Jun 1999 08:53:00 +1000
Subject: [Python-Dev] String methods... finally
In-Reply-To: <000901beb747$f4531840$979e2299@tim>
Message-ID: <010201beb781$d1febf30$0801a8c0@bobcat>

[Before I start: Skip mentioned "why L, not U". I know C/C++ uses L,
presumably to denote a "long" string (presumably keeping the analogy
between int and long ints). I guess Java has no such indicator, being
native Unicode?

Is there any sort of agreement that Python will use L"..." to denote
Unicode strings? I would be happy with it.

Also, should:
print L"foo" -> 'foo'
and
print `L"foo"` -> L'foo'

I would like to know if there is agreement for this, so I can change the
Pythonwin implementation of Unicode now to make things more seamless later.
]
space = " "
foo = L"foo"
bar = L"bar"
result = space.join((foo, bar))
The same should happen as for L"foo" + " " + L"bar".
I must admit Guido's position has real appeal, even if just from a
documentation POV. Eg, join can be defined as:

sep.join([s1, ..., sn])
Returns s1 + sep + s2 + sep + ... + sep + sn

Nice and simple to define and understand. Thus, if you can't add 2 items,
you can't join them.

Assuming the Unicode changes allow us to say:
assert " " == L" ", "eek"
assert L" " + "" == L" "
assert " " + L"" == L" " # or even if this == " "

Then this still works well in a Unicode environment; Unicode and strings
could be mixed in the list, and as long as you understand what L" " + ""
returns, you will understand immediately what the result of join() is going
to be.
The attraction of auto-conversion for me is that I had never once seen
string.join blow up where the exception revealed a conceptual
error; in
every case conversion to string was the intent, and an
obvious one at that.
OTOH, my gut tells me this is better - that an implicit conversion to the
separator type be performed. Also, it appears that this technique will
never surprise anyone in a bad way. It seems the rule above, while simple,
basically means "sep.join can only take string/Unicode objects", as all
other objects will currently fail the add test. So, given that our rule is
that the objects must all be strings, how can it hurt to help the user
conform?
off-to-work-ly y'rs - tim
where-i-should-be-instead-of-writing-rambling-mails-ly,

Mark.



From guido@CNRI.Reston.VA.US Tue Jun 15 23:54:42 1999
From: guido@CNRI.Reston.VA.US (Guido van Rossum)
Date: Tue, 15 Jun 1999 18:54:42 -0400
Subject: [Python-Dev] mmap
In-Reply-To: Your message of "Tue, 15 Jun 1999 15:48:41 PDT."
<Pine.WNT.4.04.9906071040550.193-100000@rigoletto.ski.org>
References: <Pine.WNT.4.04.9906071040550.193-100000@rigoletto.ski.org>
Message-ID: <199906152254.SAA05114@eric.cnri.reston.va.us>
Another topic: what are the chances of adding the mmap module to the core
distribution? It's restricted to a smallish set of platforms (modern
Unices and Win32, I think), but it's quite small, and would be a nice
thing to have available in the core, IMHO.
If it works on Linux, Solaris, Irix and Windows, and is reasonably
clean, I'll take it. Please send it.
(btw, the buffer object needs more documentation)
That's for Jack & Greg...

--Guido van Rossum (home page: http://www.python.org/~guido/)


From guido@CNRI.Reston.VA.US Wed Jun 16 00:04:17 1999
From: guido@CNRI.Reston.VA.US (Guido van Rossum)
Date: Tue, 15 Jun 1999 19:04:17 -0400
Subject: [Python-Dev] String methods... finally
In-Reply-To: Your message of "Wed, 16 Jun 1999 08:53:00 +1000."
<010201beb781$d1febf30$0801a8c0@bobcat>
References: <010201beb781$d1febf30$0801a8c0@bobcat>
Message-ID: <199906152304.TAA05136@eric.cnri.reston.va.us>
Is there any sort of agreement that Python will use L"..." to denote
Unicode strings? I would be happy with it.
I don't know of any agreement, but it makes sense.
Also, should:
print L"foo" -> 'foo'
and
print `L"foo"` -> L'foo'
Yes, I think this should be the way. Exactly what happens to
non-ASCII characters is up to the implementation.

Do we have agreement on escapes like \xDDDD? Should \uDDDD be added?

The difference between the two is that according to the ANSI C
standard, which I follow rather strictly for string literals,
'\xABCDEF' is a single character whose value is the lower bits
(however many fit in a char) of 0xABCDEF; this makes it cumbersome to
write a string consisting of a hex escape followed by a digit or
letter a-f or A-F; you would have to use another hex escape or split
the literal in two, like this: "\xABCD" "EF". (This is true for 8-bit
chars as well as for long char in ANSI C.) The \u escape takes up to
4 bytes but is not ANSI C. In Java, \u has the additional funny
property that it is recognized *everywhere* in the source code, not
just in string literals, and I believe that this complicates the
interpretation of things like "\\uffff" (is the \uffff interpreted
before regular string \ processing happens?). I don't think we ought
to copy this behavior, although JPython users or developers might
disagree. (I don't know anyone who *uses* Unicode strings much, so
it's hard to gauge the importance of these issues.)

--Guido van Rossum (home page: http://www.python.org/~guido/)


From gmcm@hypernet.com Wed Jun 16 01:09:15 1999
From: gmcm@hypernet.com (Gordon McMillan)
Date: Tue, 15 Jun 1999 19:09:15 -0500
Subject: [Python-Dev] String methods... finally
In-Reply-To: <199906152304.TAA05136@eric.cnri.reston.va.us>
References: Your message of "Wed, 16 Jun 1999 08:53:00 +1000." <010201beb781$d1febf30$0801a8c0@bobcat>
Message-ID: <1282630485-105472998@hypernet.com>

Guido asks:
Do we have agreement on escapes like \xDDDD? Should \uDDDD be
added?
... The \u escape
takes up to 4 bytes but is not ANSI C.
How do endian issues fit in with \u?

- Gordon


From guido@CNRI.Reston.VA.US Wed Jun 16 00:20:07 1999
From: guido@CNRI.Reston.VA.US (Guido van Rossum)
Date: Tue, 15 Jun 1999 19:20:07 -0400
Subject: [Python-Dev] String methods... finally
In-Reply-To: Your message of "Tue, 15 Jun 1999 19:09:15 CDT."
<1282630485-105472998@hypernet.com>
References: Your message of "Wed, 16 Jun 1999 08:53:00 +1000." <010201beb781$d1febf30$0801a8c0@bobcat>
<1282630485-105472998@hypernet.com>
Message-ID: <199906152320.TAA05211@eric.cnri.reston.va.us>
How do endian issues fit in with \u?
I would assume that it uses the same rules as hex and octal numeric
literals: these are always *written* in big-endian notation, since
that is also what we use for decimal numbers. Thus, on a
little-endian machine, the short integer 0x1234 would be stored as the
bytes {0x34, 0x12} and so would the string literal "\x1234".

--Guido van Rossum (home page: http://www.python.org/~guido/)


From bwarsaw@cnri.reston.va.us (Barry A. Warsaw) Wed Jun 16 00:27:44 1999
From: bwarsaw@cnri.reston.va.us (Barry A. Warsaw) (Barry A. Warsaw)
Date: Tue, 15 Jun 1999 19:27:44 -0400 (EDT)
Subject: [Python-Dev] String methods... finally
References: <000901beb747$f4531840$979e2299@tim>
<010201beb781$d1febf30$0801a8c0@bobcat>
Message-ID: <14182.57712.380574.385164@anthem.cnri.reston.va.us>
"MH" == Mark Hammond <MHammond@skippinet.com.au> writes:
MH> OTOH, my gut tells me this is better - that an implicit
MH> conversion to the separator type be performed.

Right now, the implementation of join uses PyObject_Str() to str-ify
the elements in the sequence. I can't remember, but in our Unicode
worldview doesn't PyObject_Str() return a narrowed string if it can,
and raise an exception if not? So maybe narrow-string's join
shouldn't be doing it this way because that'll autoconvert to the
separator's type, which breaks the symmetry.

OTOH, we could promote sep to the type of sequence[0] and forward the
call to its join if it were a widestring. That should retain the
symmetry.

-Barry


From bwarsaw@cnri.reston.va.us (Barry A. Warsaw) Wed Jun 16 00:46:24 1999
From: bwarsaw@cnri.reston.va.us (Barry A. Warsaw) (Barry A. Warsaw)
Date: Tue, 15 Jun 1999 19:46:24 -0400 (EDT)
Subject: [Python-Dev] String methods... finally
References: <010201beb781$d1febf30$0801a8c0@bobcat>
<199906152304.TAA05136@eric.cnri.reston.va.us>
Message-ID: <14182.58832.140587.711978@anthem.cnri.reston.va.us>
"Guido" == Guido van Rossum <guido@cnri.reston.va.us> writes:
Guido> Should \uDDDD be added?

That'd be nice! :)

Guido> In Java, \u has the additional funny property that it is
Guido> recognized *everywhere* in the source code, not just in
Guido> string literals, and I believe that this complicates the
Guido> interpretation of things like "\\uffff" (is the \uffff
Guido> interpreted before regular string \ processing happens?).

No. JLS section 3.3 says[1]

In addition to the processing implied by the grammar, for each raw
input character that is a backslash \, input processing must
consider how many other \ characters contiguously precede it,
separating it from a non-\ character or the start of the input
stream. If this number is even, then the \ is eligible to begin a
Unicode escape; if the number is odd, then the \ is not eligible
to begin a Unicode escape.

and this is borne out by example.

-------------------- snip snip --------------------Uni.java
public class Uni
{
static public void main(String[] args) {
System.out.println("\\u00a9");
System.out.println("\u00a9");
}
}
-------------------- snip snip --------------------outputs
\u00a9
©
-------------------- snip snip --------------------
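[Editorial sketch: the even/odd rule Barry quotes from JLS 3.3 can be written as a small scanner. A hypothetical helper in later Python syntax, for illustration only.]

```python
def unicode_escape_starts(source):
    """Return indices of backslashes eligible to begin a \\u escape.

    Per the JLS rule quoted above: a backslash is eligible only if an
    even number of contiguous backslashes immediately precede it.
    """
    starts = []
    run = 0  # length of the contiguous backslash run before this char
    for i, ch in enumerate(source):
        if ch == '\\':
            if run % 2 == 0:
                starts.append(i)
            run += 1
        else:
            run = 0
    return starts
```

In the first println the source holds two backslashes before the "u", so the second backslash is not eligible and no escape is processed; in the second, the lone backslash begins the escape -- matching the two output lines above.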

-Barry

[1] http://java.sun.com/docs/books/jls/html/3.doc.html#44591

PS. it is wonderful having the JLS online :)


From ping@lfw.org Tue Jun 15 17:05:40 1999
From: ping@lfw.org (Ka-Ping Yee)
Date: Tue, 15 Jun 1999 09:05:40 -0700 (PDT)
Subject: [Python-Dev] Re: [Python-Dev] Re: [Python-Dev] Re: [Python-Dev] String methods... finally
In-Reply-To: <19990615114139.A3697@cnri.reston.va.us>
Message-ID: <Pine.LNX.3.93.990615090000.4634D-100000@skuld.lfw.org>
On Tue, 15 Jun 1999, Greg Ward wrote:
Careful -- it actually works this way in Perl (well, except that join
isn't a method of strings...):

$ perl -de 1
[...]
DB<2> $sep = 0

DB<3> @list = (1, 2)

DB<4> p join ($sep, @list)
102

Cool! Who needs type-checking anyways?
Cool! So then
def f(x): return x ** 2
...
def g(x): return x - 5
...
h = join((f, g))
...
h(8)
59

Right? Right?

(Just kidding.)



-- ?!ng

"Any nitwit can understand computers. Many do."
-- Ted Nelson



From tim_one@email.msn.com Wed Jun 16 05:02:46 1999
From: tim_one@email.msn.com (Tim Peters)
Date: Wed, 16 Jun 1999 00:02:46 -0400
Subject: [Python-Dev] String methods... finally
In-Reply-To: <199906152304.TAA05136@eric.cnri.reston.va.us>
Message-ID: <000401beb7ad$175193c0$2ca22299@tim>

[Guido]
Do we have agreement on escapes like \xDDDD?
I think we have to agree to leave that alone -- it affects what e.g. the
regular expression parser does too.
Should \uDDDD be added?
Yes, but only in string literals. You don't want to be within 10 miles of
Barry if you tell him that Emacs pymode has to treat the Unicode escape for
a newline as if it were-- as Java treats it outside literals --an actual
line break <0.01 wink>.
...
The \u escape takes up to 4 bytes
Not in Java: it requires exactly 4 hex characters after == exactly 2 bytes,
and it's an error if it's followed by fewer than 4 hex characters. That's a
good rule (simple!), while ANSI C's is too clumsy to live with if people
want to take Unicode seriously.

So what does it mean for a Unicode escape to appear in a non-L string?

aha-the-secret-escape-to-ucs4<wink>-ly y'rs - tim




From tim_one@email.msn.com Wed Jun 16 05:02:44 1999
From: tim_one@email.msn.com (Tim Peters)
Date: Wed, 16 Jun 1999 00:02:44 -0400
Subject: [Python-Dev] String methods... finally
In-Reply-To: <010201beb781$d1febf30$0801a8c0@bobcat>
Message-ID:

[MarkH agonizes over whether to auto-convert or not]

Well, the rule *could* be that the result type is the widest string type
among the separator and the sequences' string elements (if any), and other
types convert to the result type along the way. I'd be more specific,
except I'm not sure which flavor of string str() returns (or, indeed,
whether that's up to each __str__ implementation). In any case, widening to
Unicode should always be possible, and if "widest wins" it doesn't require a
multi-pass algorithm regardless (although the partial result so far may need
to be widened once -- but that's true even if auto-convert of non-string
types isn't implemented).

Or, IOW,
sep.join([a, b, c]) == f(a) + sep + f(b) + sep + f(c)

where I don't know how to spell f, but f(x) *means*

x' = if x has a string type then x else x.__str__()
return x' coerced to the widest string type seen so far

So I think everyone can get what they want -- except that those who want
auto-convert are at direct odds with those who prefer to wag Guido's fingers
and go "tsk, tsk, we know what you want but you didn't say 'please' so your
program dies" <wink>.
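[Editorial sketch: the "widest wins" rule can be modelled by letting bytes stand in for the 1999 narrow string and str for the Unicode string -- purely an assumption for illustration, in later Python syntax, not what either type did then. Latin-1 as the widening encoding is likewise an assumption.]

```python
def widest_join(sep, items):
    """Join with 'widest wins': the result has the widest string type
    seen among separator and elements; non-string elements are
    str()-ified; narrow data is widened (Latin-1 here, an assumption).
    items is assumed to be a sequence, since it is scanned twice."""
    def widen(x, wide):
        if isinstance(x, bytes):                 # 'narrow' string
            return x.decode('latin-1') if wide else x
        if isinstance(x, str):                   # 'wide' string
            return x
        s = str(x)                               # auto-convert others
        return s if wide else s.encode('latin-1')
    wide = isinstance(sep, str) or any(isinstance(x, str) for x in items)
    sep = widen(sep, wide)
    return sep.join(widen(x, wide) for x in items)
```

One wide element anywhere widens the whole result, so a narrow separator with a mixed sequence yields a wide string, while an all-narrow call stays narrow.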

master-of-fair-summaries-ly y'rs - tim




From mal@lemburg.com Wed Jun 16 09:29:27 1999
From: mal@lemburg.com (M.-A. Lemburg)
Date: Wed, 16 Jun 1999 10:29:27 +0200
Subject: [Python-Dev] String methods... finally
References: <010201beb781$d1febf30$0801a8c0@bobcat> <199906152304.TAA05136@eric.cnri.reston.va.us>
Message-ID: <37676067.62E272F4@lemburg.com>

Guido van Rossum wrote:
Is there any sort of agreement that Python will use L"..." to denote
Unicode strings? I would be happy with it.
I don't know of any agreement, but it makes sense.
The u"..." looks more intuitive to me. While inheriting C/C++
constructs usually makes sense I think usage in the C community
is not that wide-spread yet and for a Python freak, the small u will
definitely remind him of Unicode whereas the L will stand for
(nearly) unlimited length/precision.

Not that this is important, but...

--
Marc-Andre Lemburg
______________________________________________________________________
Y2000: 198 days left
Business: http://www.lemburg.com/
Python Pages: http://www.lemburg.com/python/



From fredrik@pythonware.com Wed Jun 16 10:53:23 1999
From: fredrik@pythonware.com (Fredrik Lundh)
Date: Wed, 16 Jun 1999 11:53:23 +0200
Subject: [Python-Dev] String methods... finally
References: <000401beb7ad$175193c0$2ca22299@tim>
Message-ID: <00f701beb7de$cdb422f0$f29b12c2@pythonware.com>
The \u escape takes up to 4 bytes
Not in Java: it requires exactly 4 hex characters after == exactly 2 bytes,
and it's an error if it's followed by fewer than 4 hex characters. That's a
good rule (simple!), while ANSI C's is too clumsy to live with if people
want to take Unicode seriously.

So what does it mean for a Unicode escape to appear in a non-L string?
my suggestion is to store it as UTF-8; see the patches
included in the unicode package for details.

this also means that an u-string literal (L-string, whatever)
could be stored as an 8-bit string internally. and that the
following two are equivalent:

string = u"foo"
string = unicode("foo")

also note that:

unicode(str(u"whatever")) == u"whatever"
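[Editorial sketch: in later Python spelling, with explicit codecs standing in for the proposed str()/unicode() behaviour, the round trip looks like this -- a sketch assuming UTF-8 as the 8-bit form, as in Fredrik's patches.]

```python
s = "åttabitars"                        # a Unicode string
as_8bit = s.encode("utf-8")             # what str(u"...") would return
assert as_8bit.decode("utf-8") == s     # unicode(str(u"...")) == u"..."
```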

...

on the other hand, this means that we have at least four
major "arrays of bytes or characters" thingies mapped on
two data types:

the old string type is used for:

-- plain old 8-bit strings (ascii, iso-latin-1, whatever)
-- byte buffers containing arbitrary data
-- unicode strings stored as 8-bit characters, using
the UTF-8 encoding.

and the unicode string type is used for:

-- unicode strings stored as 16-bit characters

is this reasonable?

...

yet another question is how to deal with source code.
is a python 1.6 source file written in ASCII, ISO Latin 1,
or UTF-8?

speaking from a non-us standpoint, it would be really
cool if you could write Python sources in UTF-8...

</F>



From gstein@lyra.org Wed Jun 16 11:13:45 1999
From: gstein@lyra.org (Greg Stein)
Date: Wed, 16 Jun 1999 03:13:45 -0700 (PDT)
Subject: [Python-Dev] mmap
In-Reply-To: <199906152254.SAA05114@eric.cnri.reston.va.us>
Message-ID: <Pine.LNX.3.95.990616030347.12890A-100000@ns1.lyra.org>
On Tue, 15 Jun 1999, Guido van Rossum wrote:
Another topic: what are the chances of adding the mmap module to the core
distribution? It's restricted to a smallish set of platforms (modern
Unices and Win32, I think), but it's quite small, and would be a nice
thing to have available in the core, IMHO.
If it works on Linux, Solaris, Irix and Windows, and is reasonably
clean, I'll take it. Please send it.
Actually, my preference is to see a change to open() rather than a whole
new module. For example, let's say that you open a file, specifying
memory-mapping. Then you create a buffer against that file:

f = open('foo','rm') # 'm' means mem-map
b = buffer(f)
print b[100:200]
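[Editorial sketch: for comparison, the mmap interface that later shipped uses a constructor on a file descriptor rather than an open() mode flag; Greg's slicing example translates roughly as follows, in later Python syntax.]

```python
import mmap
import os
import tempfile

# Create a small file to map (hypothetical setup for the example).
fd, path = tempfile.mkstemp()
os.write(fd, b"x" * 300)

m = mmap.mmap(fd, 0, access=mmap.ACCESS_READ)  # length 0 maps whole file
print(m[100:200])   # slicing works much as in the buffer example above

m.close()
os.close(fd)
os.remove(path)
```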

Disclaimer: I haven't looked at the mmap modules (AMK's and Mark's) to see
what capabilities are in there. They may not be expressible solely as open()
changes. (adding add'l params for mmap flags might be another way to
handle this)

I'd like to see mmap native in Python. I won't push, though, until I can
run a test to see what kind of savings will occur when you mmap a .pyc
file and open PyBuffer objects against the thing for the code bytes. My
hypothesis is that you can reduce the working set of Python (i.e. amortize
the cost of a .pyc's code over several processes by mmap'ing it); this
depends on the proportion of code in the pyc relative to "other" stuff.
(btw, the buffer object needs more documentation)
That's for Jack & Greg...
Quite true. My bad :-( ... That would go into the API doc, I guess... I'll
put this on a todo list, but it could be a little while.

Cheers,
-g

--
Greg Stein, http://www.lyra.org/




From fredrik@pythonware.com Wed Jun 16 11:53:29 1999
From: fredrik@pythonware.com (Fredrik Lundh)
Date: Wed, 16 Jun 1999 12:53:29 +0200
Subject: [Python-Dev] mmap
References: <Pine.LNX.3.95.990616030347.12890A-100000@ns1.lyra.org>
Message-ID: <015b01beb7e6$79b61610$f29b12c2@pythonware.com>

Greg wrote:
Actually, my preference is to see a change to open() rather than a whole
new module. For example, let's say that you open a file, specifying
memory-mapping. Then you create a buffer against that file:

f = open('foo','rm') # 'm' means mem-map
b = buffer(f)
print b[100:200]

Disclaimer: I haven't looked at the mmap modules (AMK's and Mark's) to see
what capabilities are in there. They may not be expressible solely as open()
changes. (adding add'l params for mmap flags might be another way to
handle this)

I'd like to see mmap native in Python. I won't push, though, until I can
run a test to see what kind of savings will occur when you mmap a .pyc
file and open PyBuffer objects against the thing for the code bytes. My
hypothesis is that you can reduce the working set of Python (i.e. amortize
the cost of a .pyc's code over several processes by mmap'ing it); this
depends on the proportion of code in the pyc relative to "other" stuff.
yes, yes, yes!

my good friend the mad scientist (the guy who writes code,
not the flaming cult-ridden brainwashed script kiddie) has
considered writing a whole new "abstract file" backend, to
entirely get rid of stdio in the Python core. some potential
advantages:

-- performance (some stdio implementations are slow)
-- portability (stdio doesn't exist on some platforms!)
-- opens up for cool extensions (memory mapping,
pluggable file handlers, etc).

should I tell him to start hacking?

or is this the same thing as PyBuffer/buffer (I've implemented
PyBuffer support for the unicode class, but that doesn't mean
that I understand how it works...)

</F>

PS. someone once told me that Perl goes "below" the standard
file I/O system. does anyone here know if that's true, and
perhaps even explain how they're doing that...



From guido@CNRI.Reston.VA.US Wed Jun 16 13:19:10 1999
From: guido@CNRI.Reston.VA.US (Guido van Rossum)
Date: Wed, 16 Jun 1999 08:19:10 -0400
Subject: [Python-Dev] mmap
In-Reply-To: Your message of "Wed, 16 Jun 1999 03:13:45 PDT."
<Pine.LNX.3.95.990616030347.12890A-100000@ns1.lyra.org>
References: <Pine.LNX.3.95.990616030347.12890A-100000@ns1.lyra.org>
Message-ID:

[me]
> If it works on Linux, Solaris, Irix and Windows, and is reasonably
> clean, I'll take it. Please send it.

[Greg]
> Actually, my preference is to see a change to open() rather than a whole
> new module. For example, let's say that you open a file, specifying
> memory-mapping. Then you create a buffer against that file:
>
> f = open('foo','rm') # 'm' means mem-map
> b = buffer(f)
> print b[100:200]

Buh. Changes of this kind to builtins are painful, especially since
we expect that this feature may or may not be supported. And imagine
the poor reader who comes across this for the first time...

What's wrong with

import mmap
f = mmap.open('foo', 'r')

???
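(Editor's aside: for reference, the user-level mmap module that eventually shipped in the standard library works roughly along the lines Guido sketches here; the file contents and sizes below are invented for illustration.)

```python
# Map an open file's descriptor and slice the mapping like a buffer,
# using the stdlib mmap module (as it later shipped). The temp-file
# setup is only scaffolding for the example.
import mmap
import os
import tempfile

fd, path = tempfile.mkstemp()
os.write(fd, b"x" * 300)
os.close(fd)

with open(path, "rb") as f:
    m = mmap.mmap(f.fileno(), 0, access=mmap.ACCESS_READ)  # length 0 = whole file
    chunk = m[100:200]   # 100 bytes read straight out of the mapping
    m.close()
os.remove(path)
print(chunk == b"x" * 100)   # True
```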
[Greg]
> I'd like to see mmap native in Python. I won't push, though, until I can
> run a test to see what kind of savings will occur when you mmap a .pyc
> file and open PyBuffer objects against the thing for the code bytes. My
> hypothesis is that you can reduce the working set of Python (i.e. amortize
> the cost of a .pyc's code over several processes by mmap'ing it); this
> depends on the proportion of code in the pyc relative to "other" stuff.

We've been through this before. I still doubt it will help much.
Anyway, it's a completely independent feature from making the mmap
module (any mmap module) available to users.

--Guido van Rossum (home page: http://www.python.org/~guido/)


From guido@CNRI.Reston.VA.US Wed Jun 16 13:24:26 1999
From: guido@CNRI.Reston.VA.US (Guido van Rossum)
Date: Wed, 16 Jun 1999 08:24:26 -0400
Subject: [Python-Dev] mmap
In-Reply-To: Your message of "Wed, 16 Jun 1999 12:53:29 +0200."
<015b01beb7e6$79b61610$f29b12c2@pythonware.com>
References: <Pine.LNX.3.95.990616030347.12890A-100000@ns1.lyra.org>
<015b01beb7e6$79b61610$f29b12c2@pythonware.com>
Message-ID: <199906161224.IAA05815@eric.cnri.reston.va.us>
> my good friend the mad scientist (the guy who writes code,
> not the flaming cult-ridden brainwashed script kiddie) has
> considered writing a whole new "abstract file" backend, to
> entirely get rid of stdio in the Python core. some potential
> advantages:
>
> -- performance (some stdio implementations are slow)
> -- portability (stdio doesn't exist on some platforms!)

You have this backwards -- you'd have to port the abstract backend
first! Also don't forget that a *good* stdio might be using all sorts
of platform-specific tricks that you'd have to copy to match its
performance.

> -- opens up for cool extensions (memory mapping,
> pluggable file handlers, etc).
>
> should I tell him to start hacking?

Tcl/Tk does this. I see some advantages (e.g. you have more control
over and knowledge of how much data is buffered) but also some
disadvantages (more work to port, harder to use from C), plus tons of
changes needed in the rest of Python. I'd say wait until Python 2.0
and let's keep stdio for 1.6.

> PS. someone once told me that Perl goes "below" the standard
> file I/O system. does anyone here know if that's true, and
> perhaps even explain how they're doing that...

Probably just means that they use the C equivalent of os.open() and
friends.

--Guido van Rossum (home page: http://www.python.org/~guido/)



From gward@cnri.reston.va.us Wed Jun 16 13:25:34 1999
From: gward@cnri.reston.va.us (Greg Ward)
Date: Wed, 16 Jun 1999 08:25:34 -0400
Subject: [Python-Dev] mmap
In-Reply-To: <015b01beb7e6$79b61610$f29b12c2@pythonware.com>; from Fredrik Lundh on Wed, Jun 16, 1999 at 12:53:29PM +0200
References: <Pine.LNX.3.95.990616030347.12890A-100000@ns1.lyra.org> <015b01beb7e6$79b61610$f29b12c2@pythonware.com>
Message-ID: <19990616082533.A4142@cnri.reston.va.us>

On 16 June 1999, Fredrik Lundh said:
> my good friend the mad scientist (the guy who writes code,
> not the flaming cult-ridden brainwashed script kiddie) has
> considered writing a whole new "abstract file" backend, to
> entirely get rid of stdio in the Python core. some potential
> advantages: [...]
>
> PS. someone once told me that Perl goes "below" the standard
> file I/O system. does anyone here know if that's true, and
> perhaps even explain how they're doing that...
My understanding (mainly from folklore -- peeking into the Perl source
has been known to turn otherwise staid, solid programmers into raving
lunatics) is that yes, Perl does grovel around in the internals of stdio
implementations to wring a few extra cycles out.

However, what's probably of more interest to you -- I mean your mad
scientist alter ego -- is Perl's I/O abstraction layer: a couple of
years ago, somebody hacked up Perl's guts to do basically what you're
proposing for Python. The main result was a half-baked, unfinished (at
least as of last summer, when I actually asked an expert in person at
the Perl Conference) way of building Perl with AT&T's sfio library
instead of stdio. I think the other things you mentioned, eg. more
natural support for memory-mapped files, have also been bandied about as
advantages of this scheme.

The main problem with Perl's I/O abstraction layer is that extension
modules now have to call e.g. PerlIO_open(), PerlIO_printf(), etc. in
place of their stdio counterparts. Surprise surprise, many extension
modules have not adapted to the new way of doing things, even though
it's been in Perl since version 5.003 (I think). Even more
surprisingly, the fourth-party C libraries that those extension modules
often interface to haven't switched to using Perl's I/O abstraction
layer. This doesn't make a whit of difference if Perl is built in
either the "standard way" (no abstraction layer, just direct stdio) or
with the abstraction layer on top of stdio. But as soon as some poor
fool decides Perl on top of sfio would be neat, lots of extension
modules break -- their I/O calls go nowhere.

I'm sure there is some sneaky way to make it all work using sfio's
binary compatibility layer and some clever macros. This might even have
been done. However, AFAIK it's not been documented anywhere.

This is not merely to bitch about unfinished business in the Perl core;
it's to warn you that others have walked down the road you propose to
tread, and there may be potholes. Now if the Python source really does
get even more modularized for 1.6, you might have a much easier job of
it. ("Modular" is not the word that jumps to mind when one looks at the
Perl source code.)

Greg

/*
* "Far below them they saw the white waters pour into a foaming bowl, and
* then swirl darkly about a deep oval basin in the rocks, until they found
* their way out again through a narrow gate, and flowed away, fuming and
* chattering, into calmer and more level reaches."
*/
-- Tolkien, by way of perl/doio.c

--
Greg Ward - software developer gward@cnri.reston.va.us
Corporation for National Research Initiatives
1895 Preston White Drive voice: +1-703-620-8990
Reston, Virginia, USA 20191-5434 fax: +1-703-620-0913


From beazley@cs.uchicago.edu Wed Jun 16 14:23:32 1999
From: beazley@cs.uchicago.edu (David Beazley)
Date: Wed, 16 Jun 1999 08:23:32 -0500 (CDT)
Subject: [Python-Dev] mmap
References: <Pine.LNX.3.95.990616030347.12890A-100000@ns1.lyra.org>
<015b01beb7e6$79b61610$f29b12c2@pythonware.com>
Message-ID: <199906161323.IAA28642@gargoyle.cs.uchicago.edu>

Fredrik Lundh writes:
> my good friend the mad scientist (the guy who writes code,
> not the flaming cult-ridden brainwashed script kiddie) has
> considered writing a whole new "abstract file" backend, to
> entirely get rid of stdio in the Python core. some potential
> advantages:
>
> -- performance (some stdio implementations are slow)
> -- portability (stdio doesn't exist on some platforms!)
> -- opens up for cool extensions (memory mapping,
> pluggable file handlers, etc).
>
> should I tell him to start hacking?
I am not in favor of obscuring Python's I/O model too much. When
working with C extensions, it is critical to have access to normal I/O
mechanisms such as 'FILE *' or integer file descriptors. If you hide
all of this behind some sort of abstract I/O layer, it's going to make
life hell for extension writers unless you also provide a way to get
access to the raw underlying data structures. This is a major gripe
I have with the Tcl channel model--namely, there seems to be no easy
way to unravel a Tcl channel into a raw file-descriptor for use in C
(unless I'm being dense and have missed some simple way to do it).

Also, what platforms are we talking about here? I've never come
across any normal machine that had a C compiler, but did not have stdio.
Is this really a serious problem?

Cheers,

Dave



From MHammond@skippinet.com.au Wed Jun 16 14:47:44 1999
From: MHammond@skippinet.com.au (Mark Hammond)
Date: Wed, 16 Jun 1999 23:47:44 +1000
Subject: [Python-Dev] mmap
In-Reply-To: <19990616082533.A4142@cnri.reston.va.us>
Message-ID:

[Greg writes]
> The main problem with Perl's I/O abstraction layer is that extension
> modules now have to call e.g. PerlIO_open(), PerlIO_printf(), etc. in
> place of their stdio counterparts. Surprise surprise, many extension
Interestingly, Python _nearly_ suffers this problem now. Although Python
does use native FILE pointers, this scheme still assumes that Python and
the extensions all use the same stdio.

I understand that on most Unix system this can be taken for granted.
However, to be truly cross-platform, this assumption may not be valid. A
case in point is (surprise surprise :-) Windows. Windows has a number of C
RTL options, and Python and its extensions must be careful to select the
one that shares FILE * and the heap across separately compiled and linked
modules. In fact, Windows comes with an excellent debug version of the C
RTL, but this gets in Python's way - if even one (but not all) Python
extension attempts to use these debugging features, we die in a big way.

and-dont-even-talk-to-me-about-Windows-CE ly,

Mark.



From bwarsaw@cnri.reston.va.us (Barry A. Warsaw) Wed Jun 16 15:42:01 1999
From: bwarsaw@cnri.reston.va.us (Barry A. Warsaw) (Barry A. Warsaw)
Date: Wed, 16 Jun 1999 10:42:01 -0400 (EDT)
Subject: [Python-Dev] String methods... finally
References: <010201beb781$d1febf30$0801a8c0@bobcat>
<199906152304.TAA05136@eric.cnri.reston.va.us>
<37676067.62E272F4@lemburg.com>
Message-ID: <14183.47033.656933.642197@anthem.cnri.reston.va.us>
"M" == M <mal@lemburg.com> writes:
M> The u"..." looks more intuitive too me. While inheriting C/C++
M> constructs usually makes sense I think usage in the C community
M> is not that wide-spread yet and for a Python freak, the small u
M> will definitely remind him of Unicode whereas the L will stand
M> for (nearly) unlimited length/precision.

I don't think I've ever seen C code with L"..." strings in them.
Here's my list in no particular order.

U"..." -- reminds Java/JPython users of Unicode. Alternative
mnemonic: Unamerican-strings

L"..." -- long-strings, Lundh-strings, ...

W"..." -- wide-strings, Warsaw-strings (just trying to take credit
where credit's not due :), what-the-heck-are-these?-strings

H"..." -- happy-strings, Hammond-strings,
hey-you-just-made-my-extension-module-crash-strings

F"..." -- funky-stuff-in-these-hyar-strings

A"..." -- ain't-strings

S"..." -- strange-strings, silly-strings

M> Not that this is important, but...

Agreed.

-Barry


From fredrik@pythonware.com Wed Jun 16 20:11:02 1999
From: fredrik@pythonware.com (Fredrik Lundh)
Date: Wed, 16 Jun 1999 21:11:02 +0200
Subject: [Python-Dev] mmap
References: <Pine.LNX.3.95.990616030347.12890A-100000@ns1.lyra.org> <015b01beb7e6$79b61610$f29b12c2@pythonware.com> <19990616082533.A4142@cnri.reston.va.us>
Message-ID: <001901beb82b$fab54200$f29b12c2@pythonware.com>

Greg Ward wrote:
> This is not merely to bitch about unfinished business in the Perl core;
> it's to warn you that others have walked down the road you propose to
> tread, and there may be potholes.

oh, the mad scientist has rushed down that road a few
times before. we'll see if he's prepared to do that again;
it sure won't happen before the unicode stuff is in place...

</F>



From fredrik@pythonware.com Wed Jun 16 20:16:56 1999
From: fredrik@pythonware.com (Fredrik Lundh)
Date: Wed, 16 Jun 1999 21:16:56 +0200
Subject: [Python-Dev] mmap
References: <Pine.LNX.3.95.990616030347.12890A-100000@ns1.lyra.org> <015b01beb7e6$79b61610$f29b12c2@pythonware.com> <199906161224.IAA05815@eric.cnri.reston.va.us>
Message-ID: <004a01beb82e$36ba54a0$f29b12c2@pythonware.com>
> > -- performance (some stdio implementations are slow)
> > -- portability (stdio doesn't exist on some platforms!)
>
> You have this backwards -- you'd have to port the abstract backend
> first! Also don't forget that a *good* stdio might be using all sorts
> of platform-specific tricks that you'd have to copy to match its
> performance.

well, if the backend layer is good enough, I don't
think a stdio-based standard version will be much
slower than today's stdio-only implementation.

> > PS. someone once told me that Perl goes "below" the standard
> > file I/O system. does anyone here know if that's true, and
> > perhaps even explain how they're doing that...
>
> Probably just means that they use the C equivalent of os.open() and
> friends.

hopefully. my original source described this as
"digging around in the innards of the stdio package"
(and so did greg). and the same source claimed it
wasn't yet ported to Linux. sounds weird, to say
the least, but maybe he referred to that sfio
package greg mentioned. I'll do some digging,
but not today.

</F>



From fredrik@pythonware.com Wed Jun 16 20:27:02 1999
From: fredrik@pythonware.com (Fredrik Lundh)
Date: Wed, 16 Jun 1999 21:27:02 +0200
Subject: [Python-Dev] mmap
References: <Pine.LNX.3.95.990616030347.12890A-100000@ns1.lyra.org><015b01beb7e6$79b61610$f29b12c2@pythonware.com> <199906161323.IAA28642@gargoyle.cs.uchicago.edu>
Message-ID: <004b01beb82e$36d44540$f29b12c2@pythonware.com>

David Beazley wrote:
> I am not in favor of obscuring Python's I/O model too much. When
> working with C extensions, it is critical to have access to normal I/O
> mechanisms such as 'FILE *' or integer file descriptors. If you hide
> all of this behind some sort of abstract I/O layer, it's going to make
> life hell for extension writers unless you also provide a way to get
> access to the raw underlying data structures. This is a major gripe
> I have with the Tcl channel model--namely, there seems to be no easy
> way to unravel a Tcl channel into a raw file-descriptor for use in C
> (unless I'm being dense and have missed some simple way to do it).
>
> Also, what platforms are we talking about here? I've never come
> across any normal machine that had a C compiler, but did not have stdio.
> Is this really a serious problem?
in a way, it is a problem today under Windows (in other
words, on most of the machines where Python is used
today). it's very easy to end up with different DLL's using
different stdio implementations, resulting in all kinds of
strange errors. a rewrite could use OS-level handles
instead, and get rid of that problem.

not to mention Windows CE (iirc, Mark had to write his
own stdio-ish package for the CE port), maybe PalmOS,
BeOS's BFile's, and all the other upcoming platforms which
will make Windows look like a fairly decent Unix clone ;-)

...

and in Python, any decent extension writer should write
code that works with arbitrary file objects, right? "if it
cannot deal with StringIO objects, it's broken"...
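(Editor's aside: a minimal sketch of the "works with arbitrary file objects" discipline Fredrik is half-joking about; the count_lines helper is invented for illustration.)

```python
# Code that sticks to the file protocol (here, just read()) cannot tell
# a StringIO from a real file -- Fredrik's litmus test for a well-behaved
# consumer of file objects. count_lines is a hypothetical helper.
from io import StringIO  # the StringIO module, in 1.5-era Python

def count_lines(fileobj):
    # relies only on the read() protocol, never on a real FILE*
    return fileobj.read().count("\n")

n = count_lines(StringIO("a\nb\nc\n"))
print(n)   # 3 -- same result as for a real file with that content
```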

</F>



From beazley@cs.uchicago.edu Wed Jun 16 20:53:23 1999
From: beazley@cs.uchicago.edu (David Beazley)
Date: Wed, 16 Jun 1999 14:53:23 -0500 (CDT)
Subject: [Python-Dev] mmap
References: <Pine.LNX.3.95.990616030347.12890A-100000@ns1.lyra.org>
<015b01beb7e6$79b61610$f29b12c2@pythonware.com>
<199906161323.IAA28642@gargoyle.cs.uchicago.edu>
<004b01beb82e$36d44540$f29b12c2@pythonware.com>
Message-ID: <199906161953.OAA04527@gargoyle.cs.uchicago.edu>

Fredrik Lundh writes:
> and in Python, any decent extension writer should write
> code that works with arbitrary file objects, right? "if it
> cannot deal with StringIO objects, it's broken"...
I disagree. Given that a lot of people use Python as a glue language
for interfacing with legacy codes, it is unacceptable for extensions
to be forced to use some sort of funky non-standard I/O abstraction.
Unless you are volunteering to rewrite all of these codes to use the
new I/O model, you are always going to need access (in one way or
another) to plain old 'FILE *' and integer file descriptors. Of
course, one can always just provide a function like

FILE *PyFile_AsFile(PyObject *o)

That takes an I/O object and returns a 'FILE *' where supported. (Of
course, if it's not supported, then it doesn't matter if this function
is missing since any extension that needs a 'FILE *' wouldn't work
anyways).

Cheers,

Dave






From fredrik@pythonware.com Wed Jun 16 21:04:54 1999
From: fredrik@pythonware.com (Fredrik Lundh)
Date: Wed, 16 Jun 1999 22:04:54 +0200
Subject: [Python-Dev] mmap
References: <Pine.LNX.3.95.990616030347.12890A-100000@ns1.lyra.org><015b01beb7e6$79b61610$f29b12c2@pythonware.com><199906161323.IAA28642@gargoyle.cs.uchicago.edu><004b01beb82e$36d44540$f29b12c2@pythonware.com> <199906161953.OAA04527@gargoyle.cs.uchicago.edu>
Message-ID: <009d01beb833$80d15d40$f29b12c2@pythonware.com>
> > and in Python, any decent extension writer should write
> > code that works with arbitrary file objects, right? "if it
> > cannot deal with StringIO objects, it's broken"...
>
> I disagree. Given that a lot of people use Python as a glue language
> for interfacing with legacy codes, it is unacceptable for extensions
> to be forced to use some sort of funky non-standard I/O abstraction.
oh, you're right, of course. should have added that extra smiley
to that last line. cut and paste from this mail if necessary: ;-)
> Unless you are volunteering to rewrite all of these codes to use the
> new I/O model, you are always going to need access (in one way or
> another) to plain old 'FILE *' and integer file descriptors. Of
> course, one can always just provide a function like
>
> FILE *PyFile_AsFile(PyObject *o)
>
> That takes an I/O object and returns a 'FILE *' where supported.
exactly my idea. when scanning the code, PyFile_AsFile immediately
popped up as a potential pothole (if you need the fileno, there's
already a method for that in the "standard file object interface").
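(Editor's aside: a sketch of the fileno() route mentioned here: every real file object exposes its OS-level descriptor, which is what a C extension can use without ever touching FILE *. The paths below are invented for illustration.)

```python
# Read through the raw descriptor obtained from a file object's
# fileno() method, bypassing the object's own buffering layer.
import os
import tempfile

fd, path = tempfile.mkstemp()
os.write(fd, b"hello")
os.close(fd)

f = open(path, "rb")
raw = f.fileno()          # integer descriptor; usable from C as well
data = os.read(raw, 5)    # reads via the descriptor, not via f
f.close()
os.remove(path)
print(data)
```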

btw, an "abstract file object" could actually make it much easier
to support arbitrary file objects from C/C++ extensions. just map
the calls back to Python. or add a tp_file slot, and things get
really interesting...
> (Of course, if it's not supported, then it doesn't matter if this
> function is missing since any extension that needs a 'FILE *' wouldn't
> work anyways).
yup. I suspect some legacy code may have a hard time running
under CE et al. but of course, with a little macro trickery,
nothing stops you from recompiling such code so it uses Python's
new "abstract file... okay, okay, I'll stop now ;-)

</F>



From beazley@cs.uchicago.edu Wed Jun 16 21:13:42 1999
From: beazley@cs.uchicago.edu (David Beazley)
Date: Wed, 16 Jun 1999 15:13:42 -0500 (CDT)
Subject: [Python-Dev] mmap
References: <Pine.LNX.3.95.990616030347.12890A-100000@ns1.lyra.org>
<015b01beb7e6$79b61610$f29b12c2@pythonware.com>
<199906161323.IAA28642@gargoyle.cs.uchicago.edu>
<004b01beb82e$36d44540$f29b12c2@pythonware.com>
<199906161953.OAA04527@gargoyle.cs.uchicago.edu>
<009d01beb833$80d15d40$f29b12c2@pythonware.com>
Message-ID: <199906162013.PAA04781@gargoyle.cs.uchicago.edu>

Fredrik Lundh writes:
> > > and in Python, any decent extension writer should write
> > > code that works with arbitrary file objects, right? "if it
> > > cannot deal with StringIO objects, it's broken"...
> >
> > I disagree. Given that a lot of people use Python as a glue language
> > for interfacing with legacy codes, it is unacceptable for extensions
> > to be forced to use some sort of funky non-standard I/O abstraction.
>
> oh, you're right, of course. should have added that extra smiley
> to that last line. cut and paste from this mail if necessary: ;-)

Good. You had me worried there for a second :-).

> yup. I suspect some legacy code may have a hard time running
> under CE et al. but of course, with a little macro trickery,
> nothing stops you from recompiling such code so it uses Python's
> new "abstract file... okay, okay, I'll stop now ;-)

Macro trickery? Oh yes, we could use that too... (one can never
have too much macro trickery if you ask me :-)

Cheers,

Dave





From arw@ifu.net Thu Jun 17 15:12:16 1999
From: arw@ifu.net (Aaron Watters)
Date: Thu, 17 Jun 1999 10:12:16 -0400
Subject: [Python-Dev] stackable ints [stupid idea (ignore) :v]
Message-ID: <37690240.66F601E1@ifu.net>
> no-positive-suggestions-just-grousing-ly y'rs - tim

On the contrary. I think this is definitively a bad idea. Retracted.
A double negative is a positive.
-- Aaron Watters

==
"Criticism serves the same purpose as pain. It's not pleasant
but it suggests that something is wrong."
-- Churchill (paraphrased from memory)



From da@ski.org Thu Jun 17 18:50:20 1999
From: da@ski.org (David Ascher)
Date: Thu, 17 Jun 1999 10:50:20 -0700 (Pacific Daylight Time)
Subject: [Python-Dev] org.python.org
Message-ID: <Pine.WNT.4.04.9906171045590.306-100000@rigoletto.ski.org>

Not all that revolutionary, but an interesting migration path. FWIW, I
think the underlying issue is a real one. We're starting to have more and
more conflicts, even among package names. (Of course the symlink solution
doesn't work on Win32, but that's a detail =).

--david

---------- Forwarded message ----------
Date: Thu, 17 Jun 1999 13:44:33 -0400 (EDT)
From: Andy Dustman <adustman@comstar.net>
To: Gordon McMillan <gmcm@hypernet.com>
Cc: M.-A. Lemburg <mal@lemburg.com>, Crew List <crew@server.python.net>
Subject: Re: [Crew] Wizards' Resolution to Zope/PIL/mxDateTime conflict?
On Thu, 17 Jun 1999, Gordon McMillan wrote:

> M.A.L. wrote:
> > Or maybe we should start the com.domain.mypackage thing ASAP.
>
> I know many are against this proposal (makes Python look Feudal?
> Reminds people of the J language?), but I think it's the only thing
> that makes sense. It does mean you have to do some ugly things to get
> Pickle working properly.

Actually, it can be done very easily. I just tried this, in fact:

$ cd /usr/lib/python1.5
$ mkdir -p org/python
$ (cd org/python; ln -s ../.. core)
$ touch __init__.py org/__init__.py org/python/__init__.py

>>> from org.python.core import rfc822
>>> import profile

So this seems to make things nice and backwards compatible. My only
concern was having __init__.py in /usr/lib/python1.5, but this doesn't
seem to break anything. Of course, if you are using some trendy new
atrocity like Windoze, this might not work.

--
andy dustman | programmer/analyst | comstar communications corporation
telephone: 770.485.6025 / 706.549.7689 | icq: 32922760 | pgp: 0xc72f3f1d
_______________________________________________
Crew maillist - Crew@starship.python.net
http://starship.python.net/mailman/listinfo/crew



From gmcm@hypernet.com Thu Jun 17 20:36:49 1999
From: gmcm@hypernet.com (Gordon McMillan)
Date: Thu, 17 Jun 1999 14:36:49 -0500
Subject: [Python-Dev] org.python.org
In-Reply-To: <Pine.WNT.4.04.9906171045590.306-100000@rigoletto.ski.org>
Message-ID: <1282474031-114884629@hypernet.com>

David forwards from Starship Crew list:
Not all that revolutionary, but an interesting migration path.
FWIW, I think the underlying issue is a real one. We're starting to
have more and more conflicts, even among package names. (Of course
the symlink solution doesn't work on Win32, but that's a detail =).

--david

---------- Forwarded message ----------
Date: Thu, 17 Jun 1999 13:44:33 -0400 (EDT)
From: Andy Dustman <adustman@comstar.net>
To: Gordon McMillan <gmcm@hypernet.com>
Cc: M.-A. Lemburg <mal@lemburg.com>, Crew List
<crew@server.python.net> Subject: Re: [Crew] Wizards' Resolution to
Zope/PIL/mxDateTime conflict?
On Thu, 17 Jun 1999, Gordon McMillan wrote:

M.A.L. wrote:
Or maybe we should start the com.domain.mypackage thing ASAP.
I know many are against this proposal (makes Python look Feudal?
Reminds people of the J language?), but I think it's the only thing
that makes sense. It does mean you have to do some ugly things to get
Pickle working properly.
Actually, it can be done very easily. I just tried this, in fact:

$ cd /usr/lib/python1.5
$ mkdir -p org/python
$ (cd org/python; ln -s ../.. core)
$ touch __init__.py org/__init__.py org/python/__init__.py

>>> from org.python.core import rfc822
>>> import profile
So this seems to make things nice and backwards compatible. My only
concern was having __init__.py in /usr/lib/python1.5, but this
doesn't seem to break anything. Of course, if you are using some
trendy new atrocity like Windoze, this might not work.
In vanilla cases it's backwards compatible. I try packag-izing almost
everything I install. Sometimes it works, sometimes it doesn't.

In your example, rfc822 uses only builtins at the top level. Its
main will import os. Would that work if os lived in org.python.core?

Though I really don't think we need to packagize the std distr, (if
that happens, I would think it would be for a different reason).

The 2 main problems I run across in packagizing things are
intra-package imports (where M.A.L's proposal for relative names in
dotted imports might ease the pain) and Pickle / cPickle (where the
ugliness of the workarounds has often made me drop back to marshal).
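(Editor's aside: a sketch of roughly where relative imports ended up in later Python: one leading dot names the *current* package and two dots its parent, which counts one step differently from the "single dot = parent package" proposal Gordon refers to. The package layout below is invented for illustration.)

```python
# Build a tiny package tree on disk, then exercise later-Python
# relative imports from inside A.B.modC: "." = current package (A.B),
# ".." = parent package (A).
import os
import sys
import tempfile
import textwrap

root = tempfile.mkdtemp()

def write(relpath, text=""):
    full = os.path.join(root, relpath)
    os.makedirs(os.path.dirname(full), exist_ok=True)
    with open(full, "w") as fp:
        fp.write(textwrap.dedent(text))

write("A/__init__.py")
write("A/modA.py", "WHO = 'A.modA'\n")
write("A/B/__init__.py")
write("A/B/modD.py", "WHO = 'A.B.modD'\n")
write("A/B/modC.py", """\
    from . import modD    # sibling module A.B.modD
    from .. import modA   # the parent package's A.modA
    SEEN = (modD.WHO, modA.WHO)
""")

sys.path.insert(0, root)
from A.B import modC
print(modC.SEEN)   # ('A.B.modD', 'A.modA')
```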


- Gordon


From MHammond@skippinet.com.au Fri Jun 18 09:31:21 1999
From: MHammond@skippinet.com.au (Mark Hammond)
Date: Fri, 18 Jun 1999 18:31:21 +1000
Subject: [Python-Dev] Merge the string_methods tag?
Message-ID: <015601beb964$f37a4fa0$0801a8c0@bobcat>

I've been running the string_methods tag (term?) under CVS for quite some
time now, and it seems to work perfectly. I admit that I haven't stressed
the string methods much, but I feel confident that Barry's patches haven't
broken existing string code.

Also, I find using that tag with CVS a bit of a pain. A few updates have
been checked into the main branch, and you tend to miss these (it's a pity
CVS can't be told "only these files are affected by this tag, so the rest
should follow the main branch." I know I can do that personally, but that
means I personally need to know all files possibly affected by the branch.)
Anyway, I digress...

I propose that these extensions be merged into the main branch. The main
advantage is that we force more people to bash on it, rather than allowing
them to make that choice <wink>. If the Unicode type is also considered
highly experimental, we can make a new tag for that change, but that is
really quite independent of the string methods.

Mark.



From fredrik@pythonware.com Fri Jun 18 09:56:47 1999
From: fredrik@pythonware.com (Fredrik Lundh)
Date: Fri, 18 Jun 1999 10:56:47 +0200
Subject: [Python-Dev] cvs problems
References: <015601beb964$f37a4fa0$0801a8c0@bobcat>
Message-ID: <001d01beb968$7fd47540$f29b12c2@pythonware.com>

maybe not the right forum, but I suppose everyone
here is using CVS, so...

...could anyone explain why I keep getting this error?

$ cvs -z6 up -P -d
...
cvs server: Updating dist/src/Tools/ht2html
cvs [server aborted]: cannot open directory /projects/cvsroot/python/dist/src/Tools/ht2html: No such
file or directory

it used to work...

</F>



From tismer@appliedbiometrics.com Fri Jun 18 10:47:15 1999
From: tismer@appliedbiometrics.com (Christian Tismer)
Date: Fri, 18 Jun 1999 11:47:15 +0200
Subject: [Python-Dev] Flat Python in Linux Weekly
Message-ID: <376A15A3.3968EADE@appliedbiometrics.com>

Howdy,

Who would have thought this...

Linux Weekly took notice.

http://lwn.net/bigpage.phtml

derangedly yours - chris

--
Christian Tismer :^)
Applied Biometrics GmbH : Have a break! Take a ride on Python's
Kaiserin-Augusta-Allee 101 : *Starship* http://starship.python.net
10553 Berlin : PGP key -> http://wwwkeys.pgp.net
PGP Fingerprint E182 71C7 1A9D 66E9 9D15 D3CC D4D7 93E2 1FAE F6DF
we're tired of banana software - shipped green, ripens at home


From mal@lemburg.com Fri Jun 18 11:05:52 1999
From: mal@lemburg.com (M.-A. Lemburg)
Date: Fri, 18 Jun 1999 12:05:52 +0200
Subject: [Python-Dev] Relative package imports
Message-ID: <376A1A00.3099DE99@lemburg.com>

Although David has already copy-posted a message regarding this issue
to the list, I would like to restate the problem to get a discussion
going (and then maybe take it to c.l.p for general flaming ;).

The problem we have run into on starship is that some well-known
packages have introduced naming conflicts leading to the unfortunate
situation that they can't be all installed on the same default
path:

1. Zope has a module named DateTime which also is the base name of the
package mxDateTime.

2. Both Zope and PIL have a top-level module named ImageFile.py
(different ones of course).

Now the problem is how to resolve these issues. One possibility
is turning Zope and PIL into proper packages altogether. To
ease this transition, one would need a way to specify relative
intra-package imports and a way to tell pickle where to look
for modules/packages.

The next problem we'd probably run into sooner or later is that
there are quite a few useful top-level modules with generic
names that will conflict with package names and other modules
with the same name.

I guess we'd need at least four things to overcome this situation once
and for all ;-):

1. Provide a way to do relative imports, e.g. a single dot could
be interpreted as "parent package":

modA.py
modD.py
[A]
    modA.py
    modB.py
    [B]
        modC.py
        modD.py

In modC.py:

from modD import *      (works as usual: import A.B.modD)
from .modA import *     (imports A.modA)
from ..modA import *    (imports the top-level modA)

2. Establish a general vendor based naming scheme much like the one
used in the Java world:

from org.python.core import time,os,string
from org.zope.core import *
from com.lemburg import DateTime
from com.pythonware import PIL

3. Add a way to prevent double imports of the same file.

This is the major gripe I have with pickle currently, because
intra-package imports often lead to package modules being imported
twice, leading to many strange problems (e.g. splitting class
hierarchies, problems with isinstance() and issubclass(), etc.), e.g.

>>> from org.python.core import UserDict
>>> u = UserDict.UserDict()
>>> import UserDict
>>> v = UserDict.UserDict()

Now u and v will point to two different classes:

>>> u.__class__
<class org.python.core.UserDict.UserDict at 80d3b48>
>>> v.__class__
<class UserDict.UserDict at 80aed18>

4. Add some kind of redirection or lookup hook to pickle et al.
so that imports done during unpickling can be redirected to the
correct (possibly renamed) package.
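(Editor's aside: the double-import effect described in point 3 is easy to reproduce whenever the same source is reachable under two module names. A minimal sketch, with invented names:)

```python
# The same class definition imported under two module names produces
# two distinct class objects, so isinstance() fails across them --
# exactly the split-hierarchy problem described above.
import os
import sys
import tempfile

root = tempfile.mkdtemp()
os.makedirs(os.path.join(root, "pkg"))
open(os.path.join(root, "pkg", "__init__.py"), "w").close()

src = "class Thing:\n    pass\n"
for rel in ("dupmod.py", os.path.join("pkg", "dupmod.py")):
    with open(os.path.join(root, rel), "w") as fp:
        fp.write(src)   # identical code under two importable names

sys.path.insert(0, root)
import dupmod
from pkg import dupmod as pkg_dupmod

t = dupmod.Thing()
print(dupmod.Thing is pkg_dupmod.Thing)   # False: two separate classes
print(isinstance(t, pkg_dupmod.Thing))    # False: the hierarchy has split
```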

--
Marc-Andre Lemburg
______________________________________________________________________
Y2000: 196 days left
Business: http://www.lemburg.com/
Python Pages: http://www.lemburg.com/python/



From fredrik@pythonware.com Fri Jun 18 11:47:49 1999
From: fredrik@pythonware.com (Fredrik Lundh)
Date: Fri, 18 Jun 1999 12:47:49 +0200
Subject: [Python-Dev] Flat Python in Linux Weekly
References: <376A15A3.3968EADE@appliedbiometrics.com>
Message-ID: <001901beb978$0312a440$f29b12c2@pythonware.com>

flat eric, flat beat, flat python?

http://www.flateric-online.de

(best viewed through babelfish.altavista.com,
of course ;-)

should-flat-eric-in-the-routeroute-route-along-ly yrs /F



From fredrik@pythonware.com Fri Jun 18 11:51:21 1999
From: fredrik@pythonware.com (Fredrik Lundh)
Date: Fri, 18 Jun 1999 12:51:21 +0200
Subject: [Python-Dev] Relative package imports
References: <376A1A00.3099DE99@lemburg.com>
Message-ID: <001f01beb978$8177aab0$f29b12c2@pythonware.com>
> 2. Both Zope and PIL have a top-level module named ImageFile.py
> (different ones of course).
>
> Now the problem is how to resolve these issues. One possibility
> is turning Zope and PIL into proper packages altogether. To
> ease this transition, one would need a way to specify relative
> intra-package imports and a way to tell pickle where to look
> for modules/packages.
fwiw, PIL 1.0b1 can already be used as a package, but you
have to explicitly import the file format handlers you need:


from PIL import Image
import PIL.GifImagePlugin
import PIL.PngImagePlugin
import PIL.JpegImagePlugin

etc. this has been fixed in PIL 1.0 final.

</F>



From guido@CNRI.Reston.VA.US Fri Jun 18 15:51:16 1999
From: guido@CNRI.Reston.VA.US (Guido van Rossum)
Date: Fri, 18 Jun 1999 10:51:16 -0400
Subject: [Python-Dev] Merge the string_methods tag?
In-Reply-To: Your message of "Fri, 18 Jun 1999 18:31:21 +1000."
<015601beb964$f37a4fa0$0801a8c0@bobcat>
References: <015601beb964$f37a4fa0$0801a8c0@bobcat>
Message-ID: <199906181451.KAA11549@eric.cnri.reston.va.us>

> I've been running the string_methods tag (term?) under CVS for quite some
> time now, and it seems to work perfectly. I admit that I haven't stressed
> the string methods much, but I feel confident that Barry's patches haven't
> broken existing string code.
>
> Also, I find using that tag with CVS a bit of a pain. A few updates have
> been checked into the main branch, and you tend to miss these (it's a pity
> CVS can't be told "only these files are affected by this tag, so the rest
> should follow the main branch." I know I can do that personally, but that
> means I personally need to know all files possibly affected by the branch.)
> Anyway, I digress...
>
> I propose that these extensions be merged into the main branch. The main
> advantage is that we force more people to bash on it, rather than allowing
> them to make that choice <wink>. If the Unicode type is also considered
> highly experimental, we can make a new tag for that change, but that is
> really quite independent of the string methods.

Hmm... This would make it hard to make a patch release for 1.5.2
(possibly called 1.5.3?). I *really* don't want the string methods to
end up in a release yet -- there are too many rough edges (e.g. some
missing methods, should join str() or not, etc.).

I admit that managing CVS branches is painful. We may find that it
works better to create a branch for patch releases and to do all new
development on the main release... But right now I don't want to
change anything yet.

In any case Barry just went on vacation so we'll have to wait 10
days...

--Guido van Rossum (home page: http://www.python.org/~guido/)


From guido@CNRI.Reston.VA.US Fri Jun 18 15:55:45 1999
From: guido@CNRI.Reston.VA.US (Guido van Rossum)
Date: Fri, 18 Jun 1999 10:55:45 -0400
Subject: [Python-Dev] cvs problems
In-Reply-To: Your message of "Fri, 18 Jun 1999 10:56:47 +0200."
<001d01beb968$7fd47540$f29b12c2@pythonware.com>
References: <015601beb964$f37a4fa0$0801a8c0@bobcat>
<001d01beb968$7fd47540$f29b12c2@pythonware.com>
Message-ID: <199906181455.KAA11564@eric.cnri.reston.va.us>

> maybe not the right forum, but I suppose everyone
> here is using CVS, so...
>
> ...could anyone explain why I keep getting this error?
>
> $ cvs -z6 up -P -d
> ...
> cvs server: Updating dist/src/Tools/ht2html
> cvs [server aborted]: cannot open directory /projects/cvsroot/python/dist/src/Tools/ht2html: No such
> file or directory
>
> it used to work...

EXPLANATION: For some reason that directory existed on the mirror
server but not in the master CVS repository. It was created once
but quickly deleted -- apparently not quickly enough to prevent it
from leaking to the slave. Then we did a global resync from the master
to the mirror, and that wiped out the mirror version. Good riddance.

FIX: Edit Tools/CVS/Entries and delete the line that mentions ht2html,
then do another cvs update.

--Guido van Rossum (home page: http://www.python.org/~guido/)


From tim_one@email.msn.com Fri Jun 18 16:41:54 1999
From: tim_one@email.msn.com (Tim Peters)
Date: Fri, 18 Jun 1999 11:41:54 -0400
Subject: [Python-Dev] cvs problems
In-Reply-To: <001d01beb968$7fd47540$f29b12c2@pythonware.com>
Message-ID:

[/F]
> ...could anyone explain why I keep getting this error?
>
> $ cvs -z6 up -P -d
> ...
> cvs server: Updating dist/src/Tools/ht2html
> cvs [server aborted]: cannot open directory
> /projects/cvsroot/python/dist/src/Tools/ht2html: No such
> file or directory
>
> it used to work...
It stopped working a week ago Thursday, and Guido & Barry know about it.
The directory in question vanished from the server under mysterious
circumstances. You can get going again by deleting the ht2html line in your
local Tools/CVS/Entries file.




From da@ski.org Fri Jun 18 18:09:27 1999
From: da@ski.org (David Ascher)
Date: Fri, 18 Jun 1999 10:09:27 -0700 (Pacific Daylight Time)
Subject: [Python-Dev] automatic wildcard expansion on Win32
Message-ID: <Pine.WNT.4.04.9906181005230.255-100000@rigoletto.ski.org>

A python-help poster finally convinced me that there was a way to enable
automatic wildcard expansion on win32. This is done by linking in
"setargv.obj" along with all of the other MS libs. Quick testing shows
that it works.

Is this a feature we want to add? I can see both sides of that coin.

--david

PS: I saw a RISKS digest posting last week which had a horror story about
wildcard expansion on some flavor of Windows. The person had two files
with long filenames:

verylongfile1.txt
and
verylongfile2.txt

But Win32 stored them in 8.3 format, so they were stored as
verylo~2.txt
and
verylo~1.txt

(Yes, the 1 and 2 were swapped!). So when he did

del *1.txt

he removed the wrong file. Neat, eh?

(This is actually relevant -- it's possible that setargv.obj and glob.glob
could give different answers).

--david




From guido@CNRI.Reston.VA.US Fri Jun 18 19:09:29 1999
From: guido@CNRI.Reston.VA.US (Guido van Rossum)
Date: Fri, 18 Jun 1999 14:09:29 -0400
Subject: [Python-Dev] automatic wildcard expansion on Win32
In-Reply-To: Your message of "Fri, 18 Jun 1999 10:09:27 PDT."
<Pine.WNT.4.04.9906181005230.255-100000@rigoletto.ski.org>
References: <Pine.WNT.4.04.9906181005230.255-100000@rigoletto.ski.org>
Message-ID: <199906181809.OAA12090@eric.cnri.reston.va.us>

> A python-help poster finally convinced me that there was a way to enable
> automatic wildcard expansion on win32. This is done by linking in
> "setargv.obj" along with all of the other MS libs. Quick testing shows
> that it works.
>
> Is this a feature we want to add? I can see both sides of that coin.

I don't see big drawbacks except minor b/w compat problems.

Should it be done for both python.exe and pythonw.exe?

--Guido van Rossum (home page: http://www.python.org/~guido/)


From da@ski.org Fri Jun 18 21:06:09 1999
From: da@ski.org (David Ascher)
Date: Fri, 18 Jun 1999 13:06:09 -0700 (Pacific Daylight Time)
Subject: [Python-Dev] automatic wildcard expansion on Win32
In-Reply-To: <199906181809.OAA12090@eric.cnri.reston.va.us>
Message-ID: <Pine.WNT.4.04.9906181305550.313-100000@rigoletto.ski.org>
On Fri, 18 Jun 1999, Guido van Rossum wrote:

> I don't see big drawbacks except minor b/w compat problems.
>
> Should it be done for both python.exe and pythonw.exe?

Sure.





From MHammond@skippinet.com.au Sat Jun 19 01:56:42 1999
From: MHammond@skippinet.com.au (Mark Hammond)
Date: Sat, 19 Jun 1999 10:56:42 +1000
Subject: [Python-Dev] automatic wildcard expansion on Win32
In-Reply-To: <Pine.WNT.4.04.9906181005230.255-100000@rigoletto.ski.org>
Message-ID: <016e01beb9ee$99e1a710$0801a8c0@bobcat>

> A python-help poster finally convinced me that there was a way to enable
> automatic wildcard expansion on win32. This is done by linking in
> "setargv.obj" along with all of the other MS libs. Quick testing shows
> that it works.

This has existed since I have been using C on Windows.

I personally would vote against it. AFAIK, common wisdom on Windows is
not to use this. Indeed, if people felt that this behaviour was an
improvement, MS would have enabled it by default at some stage over the
last 10 years it has existed, and provided a way of disabling it!

This behaviour causes subtle side effects -- effects Unix users are well
aware of, since every single tool there uses it. Do the tricks needed to
get a literal wildcard down to the program even exist on Windows? Will
any Windows users know what they are?

IMO, Windows "fixed" the Unix behaviour by dropping this, and made a
concession to die-hards by providing a rarely used way of enabling it.
Windows C programmers don't expect it, VB programmers don't expect it,
even batch file programmers don't expect it. I don't think we should use it.
> (This is actually relevant -- it's possible that setargv.obj
> and glob.glob could give different answers).
Exactly. As may win32api.FindFiles(). Give the user the wildcard, and let
them make sense of it. The trivial case of using glob() is so simple I
don't believe it's worth hiding. Your horror story of the incorrect file
being deleted could then only be blamed on the application, not on Python!
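
Mark's suggestion -- hand the program the raw pattern and expand it with
glob() only when the application wants to -- takes just a few lines; a
sketch (the helper name is hypothetical):

```python
import glob

def expand_args(argv):
    """Expand wildcard arguments explicitly, like a Unix shell would.

    Patterns that match nothing are passed through unchanged, so the
    application (not Python) decides how to handle them.
    """
    out = []
    for arg in argv:
        matches = sorted(glob.glob(arg))
        out.extend(matches if matches else [arg])
    return out
```

The application stays in control: deleting `*1.txt` would then operate on
whatever glob() actually matched, not on whatever the 8.3 short names
happened to be.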

Mark.



From tim_one@email.msn.com Sat Jun 19 02:00:46 1999
From: tim_one@email.msn.com (Tim Peters)
Date: Fri, 18 Jun 1999 21:00:46 -0400
Subject: [Python-Dev] automatic wildcard expansion on Win32
In-Reply-To: <Pine.WNT.4.04.9906181005230.255-100000@rigoletto.ski.org>
Message-ID:

[David Ascher]
> A python-help poster finally convinced me that there was a way to enable
> automatic wildcard expansion on win32. This is done by linking in
> "setargv.obj" along with all of the other MS libs. Quick testing shows
> that it works.
>
> Is this a feature we want to add? I can see both sides of that coin.

The only real drawback I see is that we're then under some obligation to
document Python's behavior. Which is then inherited from the MS
setargv.obj, which is in turn only partially documented in developer-only
docs, and incorrectly documented at that.
> PS: I saw a RISKS digest posting last week which had a horror story about
> wildcard expansion on some flavor of Windows. The person had two files
> with long filenames:
>
> verylongfile1.txt
> and
> verylongfile2.txt
>
> But Win32 stored them in 8.3 format, so they were stored as
> verylo~2.txt
> and
> verylo~1.txt
>
> (Yes, the 1 and 2 were swapped!). So when he did
>
> del *1.txt
>
> he removed the wrong file. Neat, eh?
>
> (This is actually relevant -- it's possible that setargv.obj and
> glob.glob could give different answers).

Yes, and e.g. it works this way under Win95:

D:\Python>dir *~*

Volume in drive D is DISK1PART2
Volume Serial Number is 1DFF-0F59
Directory of D:\Python

PYCLBR~1 PAT 5,765 06-07-99 11:41p pyclbr.patch
KJBUCK~1 PYD 34,304 03-31-98 3:07a kjbuckets.pyd
WIN32C~1 <DIR> 05-16-99 12:10a win32comext
PYTHON~1 <DIR> 05-16-99 12:10a Pythonwin
TEXTTO~1 <DIR> 01-15-99 11:35p TextTools
UNWISE~1 EXE 109,056 07-03-97 8:35a UnWisePW32.exe
3 file(s) 149,125 bytes
3 dir(s) 1,502,511,104 bytes free

Here's the same thing in an argv-spewing console app whipped up to link
setargv.obj:

D:\Python>garp\debug\garp *~*
0: D:\PYTHON\GARP\DEBUG\GARP.EXE
1: kjbuckets.pyd
2: pyclbr.patch
3: Pythonwin
4: TextTools
5: UnWisePW32.exe
6: win32comext

D:\Python>

setargv.obj is apparently consistent with what native wildcard expansion
does (although you won't find that promise made anywhere!), and it's
definitely surprising in the presence of non-8.3 names. The quoting rules
too are impossible to explain, seemingly random:

D:\Python>garp\debug\garp "\\a\\"
0: D:\PYTHON\GARP\DEBUG\GARP.EXE
1: \\a\

D:\Python>

Before I was on the Help list, I used to believe it would work to just say
"well, it does what Windows does" <wink>.

magnification-of-ignorance-ly y'rs - tim




From tim_one@email.msn.com Sat Jun 19 02:26:42 1999
From: tim_one@email.msn.com (Tim Peters)
Date: Fri, 18 Jun 1999 21:26:42 -0400
Subject: [Python-Dev] automatic wildcard expansion on Win32
In-Reply-To: <016e01beb9ee$99e1a710$0801a8c0@bobcat>
Message-ID:

[MarkH, with *the* killer argument <0.3 wink>]
> Your horror story of the incorrect file being deleted could then
> only be blamed on the application, not on Python!

Sold!

Some years ago in the Perl world, they solved this by making regular old
perl.exe not expand wildcards on Windows, but also supplying perlglob.exe
which did.

Don't know what they're doing today, but they apparently changed their minds
at least once, as the couple-years-old version of perl.exe on my machine
does do wildcard expansion, and does the wrong (i.e., the Windows <wink>)
thing.

screw-it-ly y'rs - tim




From tim_one@email.msn.com Sat Jun 19 19:45:16 1999
From: tim_one@email.msn.com (Tim Peters)
Date: Sat, 19 Jun 1999 14:45:16 -0400
Subject: [Python-Dev] stackable ints [stupid idea (ignore) :v]
In-Reply-To: <199906101411.KAA29962@eric.cnri.reston.va.us>
Message-ID: <000801beba83$df719e80$c49e2299@tim>

Backtracking:

[Aaron]
> I've always considered it a major shame that Python ints and floats
> and chars and stuff have anything to do with dynamic allocation ...

[Guido]
> What you're describing is very close to what I recall I once read
> about the runtime organization of Icon. Perl may also use a variant
> on this (it has fixed-length object headers). ...
I've rarely been able to make sense of Perl's source code, but gave it
another try anyway. An hour later I gave up unenlightened, so cruised the
web. Turns out there's a *terrific* writeup of Perl's type representation
at:

http://home.sol.no/~aas/perl/guts/

Pictures and everything <wink>. Header is 3 words: An 8-bit "type" field,
24 baffling flag bits (e.g., flag #14 is "BREAK -- refcnt is artificially
low"(!)), 32 refcount bits, and a 32-bit pointer field.

Appears that the pointer field is always a real (although possibly NULL)
pointer. Plain ints have type code SvIV, and the pointer then points to a
bogus address, but where that address + 3 words points to the actual integer
value. Why? Because then they can use the same offset to get to the int as
when the type is SvPVIV, which is the combined string/integer type, and
needs three words (to point to the string start address, current len and
allocated len) in addition to the integer value at the end. So why is the
integer value at the end? So the same offsets work for the SvPV type, which
is solely a string descriptor. So why is it important that SvPVIV, SvPV and
SvIV all have the same layout? So that either of the latter types can be
dynamically "upgraded" to SvPVIV (when a string is converted to int or vice
versa; Perl then holds on to both representations internally) by plugging in
a new type code and fiddling some of the baffling flag bits.

Brr. I have no idea how they manage to keep Perl running!

and-not-entirely-sure-that-they-do-ly y'rs - tim




From mal@lemburg.com Mon Jun 21 10:54:50 1999
From: mal@lemburg.com (M.-A. Lemburg)
Date: Mon, 21 Jun 1999 11:54:50 +0200
Subject: [Python-Dev] Relative package imports
References: <376A1A00.3099DE99@lemburg.com>
Message-ID: <376E0BEA.60F22945@lemburg.com>

It seems that there is not much interest in the topic...

I'll be offline for the next two weeks -- maybe someone could
pick the thread up and toss it around a bit while I'm away.

Thanks,
--
Marc-Andre Lemburg
______________________________________________________________________
Y2000: 193 days left
Business: http://www.lemburg.com/
Python Pages: http://www.lemburg.com/python/



From MHammond@skippinet.com.au Mon Jun 21 12:23:34 1999
From: MHammond@skippinet.com.au (Mark Hammond)
Date: Mon, 21 Jun 1999 21:23:34 +1000
Subject: [Python-Dev] Relative package imports
In-Reply-To: <376E0BEA.60F22945@lemburg.com>
Message-ID: <000501bebbd8$80f56b10$0801a8c0@bobcat>

> It seems that there is not much interest in the topic...
>
> I'll be offline for the next two weeks -- maybe someone could
> pick the thread up and toss it around a bit while I'm away.

OK - here are my 2c on it:

Unless I am mistaken, this problem could be solved with 2 steps:
* Code moves to Python packages.
* The standard Python library moves to a package.

If all non-trivial Python programs used packages, and some agreement on a
standard namespace could be met, I think it would be addressed. There was
a thread on the newsgroup about the potential naming of the standard
library.

You did state as much in your proposal - indeed, you state "to ease the
transition". Personally, I don't think it is worth it, mainly because we
end up with a half-baked scheme purely for the transition, but one that
can never be removed.

To me, the question is one of:

* Why aren't Zope/PIL capable of being used as packages?
* If they are (as I understand to be the case), why do people choose not
to use them as such, or why do the authors not recommend this?
* Is there a deficiency in the package scheme that makes it hard to use?
E.g., should the "__" that ni used for the parent package be reinstated?
Mark.



From fredrik@pythonware.com Mon Jun 21 13:41:27 1999
From: fredrik@pythonware.com (Fredrik Lundh)
Date: Mon, 21 Jun 1999 14:41:27 +0200
Subject: [Python-Dev] Relative package imports
References: <000501bebbd8$80f56b10$0801a8c0@bobcat>
Message-ID: <006501bebbe3$6189e570$f29b12c2@pythonware.com>

Mark Hammond wrote:
> * Why aren't Zope/PIL capable of being used as packages?
PIL can be used as a package ("from PIL import Image"), assuming
that it's installed under a directory in your path. there's one
problem in 1.0b1, though: you have to explicitly import the file
format handlers you need:

import PIL.JpegImagePlugin
import PIL.PngImagePlugin

this has been fixed in 1.0 final.
> * If they are (as I understand to be the case), why do people choose not
> to use them as such, or why do the authors not recommend this?
inertia, and compatibility concerns. we've decided that all
official material related to PIL 1.0 will use the old syntax (and
all 1.X releases will be possible to install using the PIL.pth
approach). too many users out there...

now, PIL 2.0 is a completely different thing...
> * Is there a deficiency in the package scheme that makes it hard to use?
not that I'm aware...

</F>



From mal@lemburg.com Mon Jun 21 15:36:58 1999
From: mal@lemburg.com (M.-A. Lemburg)
Date: Mon, 21 Jun 1999 16:36:58 +0200
Subject: [Python-Dev] Relative package imports
References: <000501bebbd8$80f56b10$0801a8c0@bobcat>
Message-ID: <376E4E0A.3B714BAB@lemburg.com>

Mark Hammond wrote:
> > It seems that there is not much interest in the topic...
> >
> > I'll be offline for the next two weeks -- maybe someone could
> > pick the thread up and toss it around a bit while I'm away.
> OK - here are my 2c on it:
>
> Unless I am mistaken, this problem could be solved with 2 steps:
> * Code moves to Python packages.
> * The standard Python library moves to a package.
>
> If all non-trivial Python programs used packages, and some agreement on a
> standard namespace could be met, I think it would be addressed. There was
> a thread on the newsgroup about the potential naming of the standard
> library.
>
> You did state as much in your proposal - indeed, you state "to ease the
> transition". Personally, I don't think it is worth it, mainly because we
> end up with a half-baked scheme purely for the transition, but one that
> can never be removed.

With "easing the transition" I meant introducing a way to do relative
package imports: you don't need relative imports if you can be sure
that the package name will never change (with a fixed naming scheme,
a la com.domain.product.package...). The smarter import mechanism is
needed to work around the pickle problems you face (because pickle
uses absolute package names).
> To me, the question is one of:
>
> * Why aren't Zope/PIL capable of being used as packages?
> * If they are (as I understand to be the case), why do people choose not
> to use them as such, or why do the authors not recommend this?
> * Is there a deficiency in the package scheme that makes it hard to use?
> E.g., should the "__" that ni used for the parent package be reinstated?

I guess this would help a great deal, although I personally
wouldn't like yet another underscore in the language. Simply leave
the name empty, as in '.submodule' or '..subpackage.submodule'.

Cheers,
--
Marc-Andre Lemburg
______________________________________________________________________
Y2000: 193 days left
Business: http://www.lemburg.com/
Python Pages: http://www.lemburg.com/python/



From guido@CNRI.Reston.VA.US Mon Jun 21 23:44:24 1999
From: guido@CNRI.Reston.VA.US (Guido van Rossum)
Date: Mon, 21 Jun 1999 18:44:24 -0400
Subject: [Python-Dev] automatic wildcard expansion on Win32
In-Reply-To: Your message of "Fri, 18 Jun 1999 21:26:42 EDT."
<000701beb9f2$c95b9880$a69e2299@tim>
References: <000701beb9f2$c95b9880$a69e2299@tim>
Message-ID: <199906212244.SAA18866@eric.cnri.reston.va.us>

> Some years ago in the Perl world, they solved this by making regular old
> perl.exe not expand wildcards on Windows, but also supplying perlglob.exe
> which did.

This seems a reasonable way out. Just like we have pythonw.exe, we
could add pythong.exe and pythongw.exe (or pythonwg.exe?). I guess
it's time for a README.txt file to be installed explaining all the
different executables... By default the g versions would not be used
unless invoked explicitly.

--Guido van Rossum (home page: http://www.python.org/~guido/)


From Vladimir.Marangozov@inrialpes.fr Thu Jun 24 13:23:48 1999
From: Vladimir.Marangozov@inrialpes.fr (Vladimir Marangozov)
Date: Thu, 24 Jun 1999 14:23:48 +0200 (DFT)
Subject: [Python-Dev] ob_refcnt access
Message-ID: <199906241223.OAA46222@pukapuka.inrialpes.fr>

How about introducing internal macros for explicit ob_refcnt accesses
in the core? Actually, there are a number of places where one can see
"op->ob_refcnt" logic, which could be replaced with _Py_GETREF(op),
_Py_SETREF(op, n) thus decoupling completely the low level refcount
management defined in object.h:

#define _Py_GETREF(op) (((PyObject *)op)->ob_refcnt)
#define _Py_SETREF(op, n) (((PyObject *)op)->ob_refcnt = (n))

Comments?
I've contributed myself to the mess in intobject.c & floatobject.c, so
I thought that such macros would make the code cleaner.


Here's the current state of affairs:

python/dist/src>find . -name "*.[c]" -exec grep ob_refcnt {} \; -print

(void *) v, ((PyObject *) v)->ob_refcnt))
./Modules/_tkinter.c
if (self->arg->ob_refcnt > 1) { \
if (ob->ob_refcnt < 2 || self->fast)
if (args->ob_refcnt > 1) {
./Modules/cPickle.c
if (--inst->ob_refcnt > 0) {
./Objects/classobject.c
if (result->ob_refcnt == 1)
./Objects/fileobject.c
if (PyFloat_Check(p) && p->ob_refcnt != 0)
if (!PyFloat_Check(p) || p->ob_refcnt == 0) {
if (PyFloat_Check(p) && p->ob_refcnt != 0) {
p, p->ob_refcnt, buf);
./Objects/floatobject.c
if (PyInt_Check(p) && p->ob_refcnt != 0)
if (!PyInt_Check(p) || p->ob_refcnt == 0) {
if (PyInt_Check(p) && p->ob_refcnt != 0)
p, p->ob_refcnt, p->ob_ival);
./Objects/intobject.c
assert(v->ob_refcnt == 1); /* Since v will be used as accumulator! */
./Objects/longobject.c
if (op->ob_refcnt <= 0)
op->ob_refcnt, (long)op);
op->ob_refcnt = 1;
if (op->ob_refcnt < 0)
fprintf(fp, "[%d] ", op->ob_refcnt);
./Objects/object.c
if (!PyString_Check(v) || v->ob_refcnt != 1) {
if (key->ob_refcnt == 2 && key == value) {
./Objects/stringobject.c
if (!PyTuple_Check(op) || op->ob_refcnt != 1) {
if (v == NULL || !PyTuple_Check(v) || v->ob_refcnt != 1) {
./Objects/tupleobject.c
if (PyList_Check(seq) && seq->ob_refcnt == 1) {
if (args->ob_refcnt > 1) {
./Python/bltinmodule.c
if (value->ob_refcnt != 1)
./Python/import.c
return PyInt_FromLong((long) arg->ob_refcnt);
./Python/sysmodule.c


--
Vladimir MARANGOZOV | Vladimir.Marangozov@inrialpes.fr
http://sirac.inrialpes.fr/~marangoz | tel:(+33-4)76615277 fax:76615252


From guido@CNRI.Reston.VA.US Thu Jun 24 16:30:45 1999
From: guido@CNRI.Reston.VA.US (Guido van Rossum)
Date: Thu, 24 Jun 1999 11:30:45 -0400
Subject: [Python-Dev] ob_refcnt access
In-Reply-To: Your message of "Thu, 24 Jun 1999 14:23:48 +0200."
<199906241223.OAA46222@pukapuka.inrialpes.fr>
References: <199906241223.OAA46222@pukapuka.inrialpes.fr>
Message-ID: <199906241530.LAA27887@eric.cnri.reston.va.us>

> How about introducing internal macros for explicit ob_refcnt accesses
> in the core?

What problem does this solve?

> Actually, there are a number of places where one can see
> "op->ob_refcnt" logic, which could be replaced with _Py_GETREF(op),
> _Py_SETREF(op, n) thus decoupling completely the low level refcount
> management defined in object.h:
>
> #define _Py_GETREF(op) (((PyObject *)op)->ob_refcnt)
> #define _Py_SETREF(op, n) (((PyObject *)op)->ob_refcnt = (n))

Why the cast? It loses some type-safety, e.g. _Py_GETREF(0) will now
cause a core dump instead of a compile-time error.

> Comments?

I don't see how it's cleaner or saves typing:

op->ob_refcnt
_Py_GETREF(op)

op->ob_refcnt = 1
_Py_SETREF(op, 1)

--Guido van Rossum (home page: http://www.python.org/~guido/)


From Vladimir.Marangozov@inrialpes.fr Thu Jun 24 17:33:31 1999
From: Vladimir.Marangozov@inrialpes.fr (Vladimir Marangozov)
Date: Thu, 24 Jun 1999 18:33:31 +0200 (DFT)
Subject: [Python-Dev] Re: ob_refcnt access
In-Reply-To: <no.id> from "marangoz" at "Jun 24, 99 02:23:47 pm"
Message-ID: <199906241633.SAA44314@pukapuka.inrialpes.fr>

marangoz wrote:

> How about introducing internal macros for explicit ob_refcnt accesses
> in the core? Actually, there are a number of places where one can see
> "op->ob_refcnt" logic, which could be replaced with _Py_GETREF(op),
> _Py_SETREF(op, n) thus decoupling completely the low level refcount
> management defined in object.h:
>
> #define _Py_GETREF(op) (((PyObject *)op)->ob_refcnt)
> #define _Py_SETREF(op, n) (((PyObject *)op)->ob_refcnt = (n))
>
> Comments?

Of course, the above should be ((PyObject *)(op))->ob_refcnt. Also, I
forgot to mention that if this detail doesn't hurt code aesthetics, one
(I) could experiment more easily with all sorts of weird things with
refcounting...

I formulated the same wish for malloc & friends some time ago, that is,
use everywhere in the core PyMem_MALLOC, PyMem_FREE etc, which would be
defined for now as malloc, free, but nobody seems to be very excited
about a smooth transition to other kinds of malloc. Hence, I reiterate
this wish, 'cause switching to macros means preparing the code for the
future, even if in the future it remains intact ;-).

Defining these basic interfaces is clearly Guido's job :-) as he points
out in his summary of the last Open Source summit, but nevertheless,
I'm raising the issue to let him see what other people think about this
and allow him to make decisions easier :-)

--
Vladimir MARANGOZOV | Vladimir.Marangozov@inrialpes.fr
http://sirac.inrialpes.fr/~marangoz | tel:(+33-4)76615277 fax:76615252


From ping@lfw.org Thu Jun 24 18:29:19 1999
From: ping@lfw.org (Ka-Ping Yee)
Date: Thu, 24 Jun 1999 10:29:19 -0700 (PDT)
Subject: [Python-Dev] ob_refcnt access
In-Reply-To: <199906241530.LAA27887@eric.cnri.reston.va.us>
Message-ID: <Pine.LNX.3.93.990624101635.11020A-100000@localhost>
On Thu, 24 Jun 1999, Guido van Rossum wrote:
> > How about introducing internal macros for explicit ob_refcnt accesses
> > in the core?
> What problem does this solve?

I assume Vladimir was trying to leave the door open for further
ob_refcnt manipulation hooks later, like having objects manage
their own refcounts. Until there's an actual problem to solve
that requires this, though, i'm not sure it's necessary. Are
there obvious reasons to want to allow this?

* * *

While we're talking about refcounts and all, i've had the
argument quite successfully made to me that a reasonably
written garbage collector can be both (a) simple and (b) more
efficient than refcounting. Having spent a good number of
work days doing nothing but debugging crashes by tracing
refcounting bugs, i was easily converted into a believer
once a friend dispelled the notion that garbage collectors
were either slow or horribly complicated. I had always been
scared of them before, but less so now.

Is an incremental GC being considered for a future Python?
I've idly been pondering various tricks by which it could be
made to work with existing extension modules -- here are some
possibilities:

1. Keep the refcounts and let existing code do the usual
thing; introduce a new variant of PyObject_NEW that
puts an object into the "gc-able" pool rather than the
"refcounted" pool.

2. Have Py_DECREF and Py_INCREF just do nothing, and let
the garbage collector guess from the contents of the
structure where the pointers are. (I'm told it's
possible to do this safely, since you can only have
false positives, never false negatives.)

3. Have Py_DECREF and Py_INCREF just do nothing, and ask
the extension module to just provide (in its type
object) a table of where the pointers are in its struct.

And so on; mix and match. What are everyone's thoughts on this one?
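
Option 3 -- types declaring where their pointers live, so the collector
never has to guess -- is easy to sketch as a toy mark-and-sweep in Python
(all names here are hypothetical; a real collector would read offsets out
of C structs, not attribute names):

```python
class Node:
    # The per-type "table of where the pointers are" from option 3.
    _ptr_fields = ("left", "right")

    def __init__(self):
        self.left = None
        self.right = None

def mark(roots):
    """Trace from the roots, following only declared pointer fields."""
    live, stack = set(), list(roots)
    while stack:
        obj = stack.pop()
        if id(obj) in live:
            continue
        live.add(id(obj))
        for field in obj._ptr_fields:
            child = getattr(obj, field)
            if child is not None:
                stack.append(child)
    return live

def sweep(heap, live):
    """Keep only objects reached during the mark phase."""
    return [obj for obj in heap if id(obj) in live]
```

Note that, unlike refcounting, this reclaims unreachable cycles: a
self-referencing object that no root can reach is simply never marked.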



-- ?!ng

"All models are wrong; some models are useful."
-- George Box



From tim_one@email.msn.com Fri Jun 25 07:38:11 1999
From: tim_one@email.msn.com (Tim Peters)
Date: Fri, 25 Jun 1999 02:38:11 -0400
Subject: [Python-Dev] ob_refcnt access
In-Reply-To: <Pine.LNX.3.93.990624101635.11020A-100000@localhost>
Message-ID:

[Ka-Ping Yee, opines about GC]

Ping, I think you're not getting any responses because this has been beaten
to death on c.l.py over the last month (for the 53rd time, no less <wink>).

A hefty percentage of CPython users *like* the reliably timely destruction
refcounting yields, and some clearly rely on it.

Guido recently (10 June) posted the start of a "add GC on top of RC" scheme,
in a thread with the unlikely name "fork()". The combination of cycles,
destructors and resurrection is quite difficult to handle in a way both
principled and useful (Java's way is principled but by most accounts
unhelpful to the point of uselessness).

Python experience with the Boehm collector can be found in the FAQ; note
that the Boehm collector deals with finalizers in cycles by letting cycles
with finalizers leak!
> ...
> While we're talking about refcounts and all, i've had the
> argument quite successfully made to me that a reasonably
> written garbage collector can be both (a) simple and (b) more
> efficient than refcounting.
That's a dubious claim. Sophisticated mark-and-sweep (with or without
compaction) is almost universally acknowledged to beat RC, but simple M&S
has terrible cache behavior (you fill up the address space before reclaiming
anything, then leap all over the address space repeatedly cleaning it up).
Don't discount that in Python, unlike in most other languages, the simple
loop

for i in xrange(1000000):
    pass

creates a huge amount of trash at a furious pace. Under RC it can happily
reuse the same little bit of storage each time around.
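
Tim's point -- under refcounting each temporary dies immediately, so the
allocator hands the same slot back on the next iteration -- is easy to
observe in today's CPython (this is an implementation detail, not a
language guarantee):

```python
# Each object() dies as soon as it is explicitly deleted, so CPython's
# allocator can reuse the very same block on the next iteration.
addrs = set()
for _ in range(1000):
    obj = object()
    addrs.add(id(obj))
    del obj          # refcount hits zero; the slot is immediately free
# Far fewer distinct addresses than iterations -- the storage is recycled.
assert len(addrs) < 100
```

Under a pure mark-and-sweep scheme, the same loop would instead fill
memory with dead objects until the next collection.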
> Having spent a good number of work days doing nothing but debugging
> crashes by tracing refcounting bugs,
Yes, we can trade that for tracking down M&S bugs <0.5 wink> -- instead of
INCREF/DECREF macros, you end up with M&S macros marking regions where the
collector must not be run (because you're in a temporarily "inconsistent"
state). That's under sophisticated M&S, and it's an absolute nightmare
when you miss a pair (the bugs only show up "sometimes", and not always
the same ways -- it depends on when M&S happens to run, and "how
inconsistent" you happen to be at the time).
> ...
> And so on; mix and match. What are everyone's thoughts on this one?
I think Python probably needs to clean up cycles, but by some variant of
Guido's scheme on top of RC; I very much dislike the property of his scheme
that objects with destructors may get destroyed without their destructors
getting invoked, but it seems hard to fix.

Alternatives include Java's scheme (which really has nothing going for it
other than that Java does it <0.3 wink>); Scheme's "guardian" scheme (which
would let the user "get at" cyclic trash with destructors, but refuses to do
anything with them on its own); following Boehm by saying that cycles with
destructors are immortal; following goofier historical precedent by e.g.
destroying such objects in reverse order of creation; or maybe just raising
an exception if a trash cycle containing a destructor is found.

All of those seem a comparative pain to implement, with Java's being the
most painful -- and quite possibly the least satisfying!

it's-a-whale-of-a-lot-easier-in-a-self-contained-universe-or-even-an-
all-c-one-ly y'rs - tim




From Vladimir.Marangozov@inrialpes.fr Fri Jun 25 12:27:43 1999
From: Vladimir.Marangozov@inrialpes.fr (Vladimir Marangozov)
Date: Fri, 25 Jun 1999 13:27:43 +0200 (DFT)
Subject: [Python-Dev] Re: ob_refcnt access (fwd)
Message-ID: <199906251127.NAA27464@pukapuka.inrialpes.fr>

FYI, my second message on this issue didn't reach the list because
of a stupid error of mine, so Guido and I exchanged two mails
in private. His response to the msg below was that he thinks
that tweaking the refcount scheme at this level wouldn't contribute
much and that he doesn't intend to change anything on this until 2.0
which will be rewritten from scratch.

Besides, if I want to satisfy my curiosity in hacking the refcounts
I can do it with a small patch because I've already located the places
where the ob_refcnt slot is accessed directly.


----- Forwarded message -----

From Vladimir.Marangozov@inrialpes.fr Thu Jun 24 17:33:31 1999
From: Vladimir.Marangozov@inrialpes.fr (Vladimir.Marangozov@inrialpes.fr)
Date: Thu, 24 Jun 1999 18:33:31 +0200 (DFT)
Subject: ob_refcnt access
In-Reply-To: <no.id> from "marangoz" at "Jun 24, 99 02:23:47 pm"

marangoz wrote:

> How about introducing internal macros for explicit ob_refcnt accesses
> in the core? Actually, there are a number of places where one can see
> "op->ob_refcnt" logic, which could be replaced with _Py_GETREF(op),
> _Py_SETREF(op, n) thus decoupling completely the low level refcount
> management defined in object.h:
>
> #define _Py_GETREF(op) (((PyObject *)op)->ob_refcnt)
> #define _Py_SETREF(op, n) (((PyObject *)op)->ob_refcnt = (n))
>
> Comments?

Of course, the above should be (PyObject *)(op)->ob_refcnt. Also, I forgot
to mention that if this detail doesn't hurt code aesthetics, one (I) could
experiment more easily all sort of weird things with refcounting...

I formulated the same wish for malloc & friends some time ago, that is,
use everywhere in the core PyMem_MALLOC, PyMem_FREE etc, which would be
defined for now as malloc, free, but nobody seems to be very excited
about a smooth transition to other kinds of malloc. Hence, I reiterate
this wish, 'cause switching to macros means preparing the code for the
future, even if in the future it remains intact ;-).

Defining these basic interfaces is clearly Guido's job :-) as he points
out in his summary of the last Open Source summit, but nevertheless,
I'm raising the issue to let him see what other people think about this
and allow him to make decisions easier :-)

--
Vladimir MARANGOZOV | Vladimir.Marangozov@inrialpes.fr
http://sirac.inrialpes.fr/~marangoz | tel:(+33-4)76615277 fax:76615252

----- End of forwarded message -----

--
Vladimir MARANGOZOV | Vladimir.Marangozov@inrialpes.fr
http://sirac.inrialpes.fr/~marangoz | tel:(+33-4)76615277 fax:76615252


From tismer@appliedbiometrics.com Fri Jun 25 19:47:51 1999
From: tismer@appliedbiometrics.com (Christian Tismer)
Date: Fri, 25 Jun 1999 20:47:51 +0200
Subject: [Python-Dev] Re: ob_refcnt access (fwd)
References: <199906251127.NAA27464@pukapuka.inrialpes.fr>
Message-ID: <3773CED7.B87D055C@appliedbiometrics.com>


Vladimir Marangozov wrote:
> FYI, my second message on this issue didn't reach the list because
> of a stupid error of mine, so Guido and I exchanged two mails
> in private. His response to the msg below was that he thinks
> that tweaking the refcount scheme at this level wouldn't contribute
> much and that he doesn't intend to change anything on this until 2.0
> which will be rewritten from scratch.
>
> Besides, if I want to satisfy my curiosity in hacking the refcounts
> I can do it with a small patch because I've already located the places
> where the ob_refcnt slot is accessed directly.
Well, one Euro on that issue:
> #define _Py_GETREF(op) (((PyObject *)op)->ob_refcnt)
> #define _Py_SETREF(op, n) (((PyObject *)op)->ob_refcnt = (n))
>
> Comments?
>
> Of course, the above should be (PyObject *)(op)->ob_refcnt. Also, I forgot
> to mention that if this detail doesn't hurt code aesthetics, one (I) could
> experiment more easily all sort of weird things with refcounting...
I think, if at all, this should be done with no typecast, to stay safe.
As long as every PyObject has a refcount, this would be correct
and checked by the compiler. Why lose it?

#define _Py_GETREF(op) ((op)->ob_refcnt)

This carries the same semantics, the same compiler check, but
adds a level of abstraction for future changes.
> I formulated the same wish for malloc & friends some time ago, that is,
> use everywhere in the core PyMem_MALLOC, PyMem_FREE etc, which would be
> defined for now as malloc, free, but nobody seems to be very excited
> about a smooth transition to other kinds of malloc. Hence, I reiterate
> this wish, 'cause switching to macros means preparing the code for the
> future, even if in the future it remains intact ;-).
I wish to incref this wish by mine.
In order to be able to try different memory allocation
strategies, I would go even further and give every
object type its own allocation macro which carries
info about the object type about to be allocated.
This costs nothing but a little macro expansion
for the C compiler, but would allow trying
new schemes without always patching the Python source.

ciao - chris

--
Christian Tismer :^)
Applied Biometrics GmbH : Have a break! Take a ride on Python's
Kaiserin-Augusta-Allee 101 : *Starship* http://starship.python.net
10553 Berlin : PGP key -> http://wwwkeys.pgp.net
PGP Fingerprint E182 71C7 1A9D 66E9 9D15 D3CC D4D7 93E2 1FAE F6DF
we're tired of banana software - shipped green, ripens at home


From tismer@appliedbiometrics.com Fri Jun 25 19:56:39 1999
From: tismer@appliedbiometrics.com (Christian Tismer)
Date: Fri, 25 Jun 1999 20:56:39 +0200
Subject: [Python-Dev] ob_refcnt access
References: <000c01bebed5$4b8d1040$d29e2299@tim>
Message-ID: <3773D0E7.458E00F1@appliedbiometrics.com>


Tim Peters wrote:
> [Ka-Ping Yee, opines about GC]
>
> Ping, I think you're not getting any responses because this has been beaten
> to death on c.l.py over the last month (for the 53rd time, no less <wink>).
>
> A hefty percentage of CPython users *like* the reliably timely destruction
> refcounting yields, and some clearly rely on it.

[GC issue dropped, I know the thread]

I know how much of a pain in the .. proper refcounting can be.
Sometimes, after long debugging, I wished it would go. But
finally, I think it is a *really good thing* to have to do
proper refcounting.
The reason is that this causes a lot of discipline, which
improves the whole program. I guess with GC always there,
quite a number of errors stay undetected.

I can say this, since I have been through a week of debugging
now, and I can now publish

full blown first class continuations for Python

yes I'm happy - chris

--
Christian Tismer :^)
Applied Biometrics GmbH : Have a break! Take a ride on Python's
Kaiserin-Augusta-Allee 101 : *Starship* http://starship.python.net
10553 Berlin : PGP key -> http://wwwkeys.pgp.net
PGP Fingerprint E182 71C7 1A9D 66E9 9D15 D3CC D4D7 93E2 1FAE F6DF
we're tired of banana software - shipped green, ripens at home


From skip@mojam.com (Skip Montanaro) Sun Jun 27 23:11:28 1999
From: skip@mojam.com (Skip Montanaro)
Date: Sun, 27 Jun 1999 18:11:28 -0400 (EDT)
Subject: [Python-Dev] Merge the string_methods tag?
In-Reply-To: <199906181451.KAA11549@eric.cnri.reston.va.us>
References: <015601beb964$f37a4fa0$0801a8c0@bobcat>
<199906181451.KAA11549@eric.cnri.reston.va.us>
Message-ID: <14198.41195.968785.37763@cm-24-29-94-19.nycap.rr.com>

Guido> Hmm... This would make it hard to make a patch release for 1.5.2
Guido> (possible called 1.5.3?). I *really* don't want the string
Guido> methods to end up in a release yet -- there are too many rough
Guido> edges (e.g. some missing methods, should join str() or not,
Guido> etc.).

Sorry for the delayed response. I've been out of town. When Barry returns
would it be possible to merge the string methods in conditionally (#ifdef
STRING_METHODS) and add a --with-string-methods configure option? How hard
would it be to modify string.py, stringobject.c and stropmodule.c to carry
that around?

Skip Montanaro | http://www.mojam.com/
skip@mojam.com | http://www.musi-cal.com/~skip/
518-372-5583


From tim_one@email.msn.com Mon Jun 28 03:27:06 1999
From: tim_one@email.msn.com (Tim Peters)
Date: Sun, 27 Jun 1999 22:27:06 -0400
Subject: [Python-Dev] ob_refcnt access
In-Reply-To: <3773D0E7.458E00F1@appliedbiometrics.com>
Message-ID:

[Christian Tismer]
> ...
> I can say this, since I have been through a week of debugging
> now, and I can now publish
>
>     full blown first class continuations for Python
>
> yes I'm happy - chris
You should be! So how come nobody else is <wink/frown>?

Let's fire some imagination here: without the stinkin' C stack snaking its
way thru everything, then with the exception of external system objects
(like open files), the full state of a running Python program is comprised
of objects Python understands and controls.

So with some amount of additional pain we could pickle them. And unpickle
them. Painlessly checkpoint a long computation for possible restarting?
Freeze a program while it's running on your mainframe, download it to your
laptop and resume it while you're on the road? Ship a bug report with the
computation frozen right before the error occurs? Take an app with gobs of
expensive initialization, freeze it after it's "finally ready to go", and
ship the latter instead? Capture the state of an interactive session for
later resumption? Etc.

Not saying those are easy, but getting the C stack out of the way means they
move from impossible to plausible. Maybe it would help get past the
Schemeophobia <wink> if, instead of calling them "continuations", you called
'em "platform-independent potentially picklable threads".

pippt-sounds-as-good-as-it-reads<wink>-ly y'rs - tim




From tim_one@email.msn.com Mon Jun 28 04:13:15 1999
From: tim_one@email.msn.com (Tim Peters)
Date: Sun, 27 Jun 1999 23:13:15 -0400
Subject: [Python-Dev] ActiveState & fork & Perl
Message-ID: <000601bec114$2a2929c0$e19e2299@tim>

Moving back in time ...

[GordonM]
> Perhaps Christian's stackless Python would enable green threads...

[Guido]
> This has been suggested before...  While this seems possible at first,
> all blocking I/O calls would have to be redone to pass control to the
> thread scheduler, before this would be useful -- a huge task!
I didn't understand this. If I/O calls are left alone, and a green thread
hit one, the whole program just sits there waiting for the call to complete,
right?

But if the same thing happens using "real threads" today, the same thing
happens today anyway <wink>. That is, if a thread doesn't release the
global lock before a blocking call today, the whole program just sits there
etc.

Or do you have some other kind of problem in mind here?

unconvincedly y'rs - tim




From MHammond@skippinet.com.au Mon Jun 28 05:29:29 1999
From: MHammond@skippinet.com.au (Mark Hammond)
Date: Mon, 28 Jun 1999 14:29:29 +1000
Subject: [Python-Dev] ob_refcnt access
In-Reply-To: <000501bec10d$b6f1fb40$e19e2299@tim>
Message-ID: <003301bec11e$d0cfc6d0$0801a8c0@bobcat>
> > yes I'm happy - chris
>
> You should be!  So how come nobody else is <wink/frown>?
Im a little unhappy as this will break the Active Debugging stuff - ie, the
ability for Python, Java, Perl, VBScript etc to all exist in the same
process, each calling each other, and each being debuggable (makes a
_great_ demo :-)

Im not _really_ unhappy, Im just throwing this in as an FYI.

The Active Debugging interfaces need some way of sorting a call stack. As
many languages may be participating in a debugging session, there is no
implicit ordering available. Inter-language calls are not made via the
debugger, so it has no chance to intercept.

So the solution MS came up with was, surprise surprise, the machine stack!
:-) The assumption is that all languages will make _some_ use of the
stack, so they ask a language to report its "stack base address" and "stack
size". Using this information, the debugger sorts into the correct call
sequence.

Indeed, getting this information (even the half of it I did manage :-) was
painful, and hard to get right.

Ahh, the joys of bleeding-edge technologies :-)
> Let's fire some imagination here: without the stinkin' C
> stack snaking its
I tried, and look what happened :-)  Seriously, some of this stuff would be
way cool.

But I also understand completely the silence on this issue.  When the
thread started, there was much discussion about exactly what the hell these
continuation/coroutine thingies even were.  However, there were precious
few real-world examples where they could be used.  A few academic,
theoretical places, but the only real contender I have seen brought up was
Medusa.  There were certainly no clear examples of "as soon as we have
this, I could change abc to take advantage, and this would give us the very
cool xyz"

So, if anyone else is feeling at all like me about this issue, they are
feeling all warm and fuzzy knowing that a few smart people are giving us
the facility to do something we hope we never, ever have to do. :-)

Mark.



From rushing@nightmare.com Mon Jun 28 10:53:21 1999
From: rushing@nightmare.com (Sam Rushing)
Date: Mon, 28 Jun 1999 02:53:21 -0700 (PDT)
Subject: [Python-Dev] ob_refcnt access
In-Reply-To: <41219828@toto.iv>
Message-ID: <14199.13497.439332.366329@seattle.nightmare.com>

Mark Hammond writes:
> I tried, and look what happened :-)  Seriously, some of this stuff
> would be way cool.
>
> But I also understand completely the silence on this issue.  When
> the thread started, there was much discussion about exactly what
> the hell these continuation/coroutine thingies even were.  However,
> there were precious few real-world examples where they could be
> used.  A few academic, theoretical places, but the only real
> contender I have seen brought up was Medusa.  There were certainly
> no clear examples of "as soon as we have this, I could change abc
> to take advantage, and this would give us the very cool xyz"
Part of the problem is that we didn't have the feature to play with.
Many of the possibilities are showing up now that it's here...

The basic advantage to coroutines is they allow you to turn any
event-driven/state-machine problem into one that is managed with
'normal' control state; i.e., for loops, while loops, nested procedure
calls, etc...

Here are a few possible real-world uses:

=================================================
Parsing. I remember a discussion from a few years back about the
distinction between 'push' and 'pull' model parsers. Coroutines let
you have it both ways; you can write a parser in the most natural way
(pull), but use it as a 'push'; i.e. for a web browser.

=================================================
"http sessions". A single 'thread' of control that is re-entered
whenever a hit from a particular user ('session') comes in to the web
server:

[Apologies to those that have already seen this cheezy example]

def ecommerce (session):
    session.login() # sends a login form, waits for it to return
    basket = []
    while 1:
        item = session.shop_for_item()
        if item:
            basket.append (item)
        else:
            break
    if basket:
        session.get_shipping_info()
        session.get_payment_info()
        session.transact()

'session.shop_for_item()' will resume the main coroutine, which will
resume this coroutine only when a new hit comes in from that
session/user, and 'return' this hit to the while loop.

I have a little web server that uses this idea to play blackjack:

http://www.nightmare.com:7777/
http://www.nightmare.com/stuff/blackjack_httpd.py

[though I'm a little fuzzy on the rules].

Rather than building a state machine that keeps track of where the
user has been, and what they're doing, you can keep all the state in
local variables (like 'basket' above) - in other words, it's a much
more natural style of programming.
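For readers coming to this thread later: Sam's session coroutine can be
approximated with a Python generator (a language feature added well after
this discussion).  Each yield suspends the "session" until the next hit is
fed in, and state like `basket` lives in ordinary local variables:

```python
def ecommerce_session():
    # Each yield suspends the session until the next "hit" arrives;
    # the loop and the basket are plain local control state.
    basket = []
    while True:
        item = yield "shop_for_item"   # resumed with the user's choice
        if item is None:
            break
        basket.append(item)
    return basket

# Drive the coroutine as hits arrive:
session = ecommerce_session()
next(session)                  # run to the first suspension point
session.send("book")
session.send("mug")

result = None
try:
    session.send(None)         # user checks out
except StopIteration as done:
    result = done.value
print(result)                  # ['book', 'mug']
```

No explicit state machine tracks where the user is; resuming the generator
is the whole bookkeeping.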

=================================================
One of the areas I'm most excited about is GUI coding. All GUI's are
event driven. All GUI code is therefore written in a really twisted,
state-machine fashion; interactions are very complex. OO helps a bit,
but doesn't change the basic difficulty - past a certain point
interesting things become too complex to try...

Mr. Fuchs' paper ("Escaping the event loop: an alternative control
structure for multi-threaded GUIs") does a much better job of
describing this than I can:

http://cs.nyu.edu/phd_students/fuchs/
http://cs.nyu.edu/phd_students/fuchs/gui.ps

=================================================
Tim's example of 'dumping' a computation in the middle and storing it
on disk (or sending it over a network), is not a fantasy... I have a
'stackless' Scheme system that does this right now.

=================================================
Ok, final example. Isn't there an interface in Python to call a
certain function after every so many vm insns? Using coroutines you
could hook into this and provide non-preemptive 'threads' for those
platforms that don't have them. [And the whole thing would be written
in Python, not in C!]
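A hedged sketch of this last idea, using generators as the non-preemptive
"threads" and an explicit yield in place of the every-N-instructions hook
Sam mentions (the names here are invented for illustration):

```python
from collections import deque

def scheduler(tasks):
    # Round-robin over generator-based "threads": each yield is a
    # voluntary switch point; the vm-insn hook would make switches
    # automatic instead of explicit.
    ready = deque(tasks)
    while ready:
        task = ready.popleft()
        try:
            next(task)
            ready.append(task)   # still alive: reschedule at the back
        except StopIteration:
            pass                 # thread finished

trace = []

def worker(name, steps):
    for i in range(steps):
        trace.append((name, i))
        yield                    # hand control back to the scheduler

scheduler([worker("a", 2), worker("b", 3)])
print(trace)   # [('a', 0), ('b', 0), ('a', 1), ('b', 1), ('b', 2)]
```

The whole scheduler is a dozen lines of Python, which is exactly the
appeal: no C, no system threads.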

=================================================
> So, if anyone else is feeling at all like me about this issue, they
> are feeling all warm and fuzzy knowing that a few smart people are
> giving us the facility to do something we hope we never, ever have
> to do. :-)
"When the only tool you have is a hammer, everything looks like a
nail". I saw the guys over in the Scheme shop cutting wood with a
power saw; now I feel like a schmuck with my hand saw.

You are right to be frightened by the strangeness of the underlying
machinery; hopefully a simple and easy-to-understand interface can be
built for the C level as well as Python. I think Christian's 'frame
dispatcher' is fairly clear, and not *that* much of a departure from
the current VM; it's amazing to me how little work really had to be
done!

-Sam



From tismer@appliedbiometrics.com Mon Jun 28 13:07:33 1999
From: tismer@appliedbiometrics.com (Christian Tismer)
Date: Mon, 28 Jun 1999 14:07:33 +0200
Subject: [Python-Dev] ob_refcnt access
References: <003301bec11e$d0cfc6d0$0801a8c0@bobcat>
Message-ID: <37776585.17B78DD1@appliedbiometrics.com>


Mark Hammond wrote:
> > > yes I'm happy - chris
> >
> > You should be!  So how come nobody else is <wink/frown>?
(to Tim) I believe this is simply because following me
would force people to change their way of thinking.
I am through this already, but it was hard for me.
And after accepting to be stackless, there is no way
to go back. Today I'm wondering about my past:
"how could I think of stacks when thinking of programs?"
This is so wrong.

The truth is: Programs are just some data, part of it called code,
part of it is local state, and! its future of computation.
Out, over, roger. All the rest is artificial showstoppers.
> Im a little unhappy as this will break the Active Debugging stuff - ie, the
> ability for Python, Java, Perl, VBScript etc to all exist in the same
> process, each calling each other, and each being debuggable (makes a
> _great_ demo :-)
>
> Im not _really_ unhappy, Im just throwing this in as an FYI.
Well, yet I see no problem.
> The Active Debugging interfaces need some way of sorting a call stack.  As
> many languages may be participating in a debugging session, there is no
> implicit ordering available.  Inter-language calls are not made via the
> debugger, so it has no chance to intercept.
>
> So the solution MS came up with was, surprise surprise, the machine stack!
> :-)  The assumption is that all languages will make _some_ use of the
> stack, so they ask a language to report its "stack base address" and "stack
> size".  Using this information, the debugger sorts into the correct call
> sequence.
Now, I can give it a machine stack. There is just a frame dispatcher
sitting on the stack, and it grabs frames from the current thread
state.
> Indeed, getting this information (even the half of it I did manage :-) was
> painful, and hard to get right.
I would have to see the AX interface. But for sure there will
be some method hooks with which I can tell AX how to walk
the frame chain.
And why don't I simply publish frames as COM objects?
This would give you much more than everything else,
I guess.

BTW, as it is now, there is no need to use AX debugging
for Python, since Python can do it alone now. Of course
it makes sense to have it all in the AX environment.
You will be able to modify a running program's local
variables, its evaluation stack, change its code,
change where it returns to; all of this is doable.

> ...
> But I also understand completely the silence on this issue.  When the
> thread started, there was much discussion about exactly what the hell these
> continuation/coroutine thingies even were.  However, there were precious
> few real-world examples where they could be used.  A few academic,
> theoretical places, but the only real contender I have seen brought up was
> Medusa.  There were certainly no clear examples of "as soon as we have
> this, I could change abc to take advantage, and this would give us the very
> cool xyz"
The problem for me was that I had no real understanding of what
I was doing, actually.  I implemented continuations without an
idea of how they work.  But Tim and Sam said they were the most
powerful control structure possible, so I used all my time
to find this out.  Now I'm beginning to understand.
And my continuation based coroutine example turns out to
be twenty lines of Python code. Coming soon, after I served
my whining customers.
> So, if anyone else is feeling at all like me about this issue, they are
> feeling all warm and fuzzy knowing that a few smart people are giving us
> the facility to do something we hope we never, ever have to do. :-)
Think of it as just a flare gun in your hands. By reading the fine
print, you will realize that you actually hold an atom bomb,
with a little code taming it for you. :-)

back-to-the-future - ly y'rs - chris

--
Christian Tismer :^)
Applied Biometrics GmbH : Have a break! Take a ride on Python's
Kaiserin-Augusta-Allee 101 : *Starship* http://starship.python.net
10553 Berlin : PGP key -> http://wwwkeys.pgp.net
PGP Fingerprint E182 71C7 1A9D 66E9 9D15 D3CC D4D7 93E2 1FAE F6DF
we're tired of banana software - shipped green, ripens at home


From skip@mojam.com (Skip Montanaro) Mon Jun 28 14:13:31 1999
From: skip@mojam.com (Skip Montanaro)
Date: Mon, 28 Jun 1999 09:13:31 -0400 (EDT)
Subject: [Python-Dev] ActiveState & fork & Perl
In-Reply-To: <000601bec114$2a2929c0$e19e2299@tim>
References: <000601bec114$2a2929c0$e19e2299@tim>
Message-ID: <14199.29856.78030.445795@cm-24-29-94-19.nycap.rr.com>

Still trying to make the brain shift from out-of-town to back-to-work...

Tim> [GordonM]
Tim> Perhaps Christian's stackless Python would enable green threads...
What's a green thread?

Skip


From fredrik@pythonware.com Mon Jun 28 14:37:30 1999
From: fredrik@pythonware.com (Fredrik Lundh)
Date: Mon, 28 Jun 1999 15:37:30 +0200
Subject: [Python-Dev] ActiveState & fork & Perl
References: <000601bec114$2a2929c0$e19e2299@tim> <14199.29856.78030.445795@cm-24-29-94-19.nycap.rr.com>
Message-ID: <00ca01bec16b$5eef11e0$f29b12c2@secret.pythonware.com>
> What's a green thread?
a user-level thread (essentially what you can implement
yourself by swapping stacks, etc). it's enough to write
smoothly running threaded programs, but not enough to
support true concurrency on multiple processors.

also see:
http://www.sun.com/solaris/java/wp-java/4.html

</F>



From tismer@appliedbiometrics.com Mon Jun 28 17:11:43 1999
From: tismer@appliedbiometrics.com (Christian Tismer)
Date: Mon, 28 Jun 1999 18:11:43 +0200
Subject: [Python-Dev] ActiveState & fork & Perl
References: <000601bec114$2a2929c0$e19e2299@tim> <14199.29856.78030.445795@cm-24-29-94-19.nycap.rr.com>
Message-ID: <37779EBF.A146D355@appliedbiometrics.com>


Skip Montanaro wrote:
> Still trying to make the brain shift from out-of-town to back-to-work...
>
> Tim> [GordonM]
> Tim> Perhaps Christian's stackless Python would enable green threads...
>
> What's a green thread?
Nano-Threads. Threadless threads, solely Python driven, no system
threads needed but possible. Think of the "big" system threads
where each can run any number of tiny Python threads.

Powered by snake oil - ciao - chris

--
Christian Tismer :^)
Applied Biometrics GmbH : Have a break! Take a ride on Python's
Kaiserin-Augusta-Allee 101 : *Starship* http://starship.python.net
10553 Berlin : PGP key -> http://wwwkeys.pgp.net
PGP Fingerprint E182 71C7 1A9D 66E9 9D15 D3CC D4D7 93E2 1FAE F6DF
we're tired of banana software - shipped green, ripens at home


From akuchlin@mems-exchange.org Mon Jun 28 18:55:16 1999
From: akuchlin@mems-exchange.org (Andrew M. Kuchling)
Date: Mon, 28 Jun 1999 13:55:16 -0400 (EDT)
Subject: [Python-Dev] Paul Prescod: add Expat to 1.6
Message-ID: <14199.46852.932030.576094@amarok.cnri.reston.va.us>


Paul Prescod sent the following note to the XML-SIG mailing list.
Thoughts?

--amk


----- Forwarded message -----

Message-ID: <37779C32.780A9134@prescod.net>
From: Paul Prescod <paul@prescod.net>
Sender: xml-sig-admin@python.org
To: "xml-sig@python.org" <xml-sig@python.org>
Subject: [XML-SIG] [Fwd: Re: parsers for Palm?]
Date: Mon, 28 Jun 1999 12:00:50 -0400

> Expat 1.1 added a compile-time option to allow a smaller (and slightly
> slower) parser.  With this option on Win32 it compiles into a single DLL
> that compresses to 23k.  Is that too large for Palm?
>
> James
Wow. I didn't notice that Expat was so small now.

I think that we should certainly move for Python 1.6 to include eXpat and
easysax. At compile time, Unix Python users could choose whether they want
small or fast. For Windows we could just make both DLLs available (though
only the small one would be built-in to the distribution). 23K for
something as significant as massively-accelerated XML seems like a small
price. Note that this 23k includes full Unicode support and is completely
ANSI C, just like Python. Also, I understand that it now supports internal
and external, general and parameter entities. In other words, almost
everything except validation!

Opinions?

Paul Prescod

_______________________________________________
XML-SIG maillist - XML-SIG@python.org
http://www.python.org/mailman/listinfo/xml-sig


----- End of forwarded message -----


From guido@CNRI.Reston.VA.US Mon Jun 28 20:35:04 1999
From: guido@CNRI.Reston.VA.US (Guido van Rossum)
Date: Mon, 28 Jun 1999 15:35:04 -0400
Subject: [Python-Dev] Paul Prescod: add Expat to 1.6
In-Reply-To: Your message of "Mon, 28 Jun 1999 13:55:16 EDT."
<14199.46852.932030.576094@amarok.cnri.reston.va.us>
References: <14199.46852.932030.576094@amarok.cnri.reston.va.us>
Message-ID: <199906281935.PAA01439@eric.cnri.reston.va.us>
> Paul Prescod sent the following note to the XML-SIG mailing list.
> Thoughts?
I don't know any of the acronyms, and I'm busy writing a funding
proposal plus two talks for the Monterey conference, so I don't have
any thoughts to spare at the moment. Perhaps someone could present
the case with some more background info?  (It does sound intriguing,
but then again I'm not sure how many people *really* need to parse XML
-- it doesn't strike me as something of the same generality as regular
expressions yet.)

--Guido van Rossum (home page: http://www.python.org/~guido/)



From jim@digicool.com Mon Jun 28 20:51:00 1999
From: jim@digicool.com (Jim Fulton)
Date: Mon, 28 Jun 1999 15:51:00 -0400
Subject: [Python-Dev] Paul Prescod: add Expat to 1.6
References: <14199.46852.932030.576094@amarok.cnri.reston.va.us>
Message-ID: <3777D224.6936B890@digicool.com>


"Andrew M. Kuchling" wrote:
> Paul Prescod sent the following note to the XML-SIG mailing list.
> Thoughts?
When I brought up some ideas for adding a separate validation mechanism
for PyExpat, some folks suggested that I should look at some other C
libraries, including one from the ILU folks and some other one
that I can't remember the name of off hand. Should we (used loosely ;)
look into the other libraries before including expat in the Python dist?

Jim

--
Jim Fulton mailto:jim@digicool.com Python Powered!
Technical Director (888) 344-4332 http://www.python.org
Digital Creations http://www.digicool.com http://www.zope.org

Under US Code Title 47, Sec.227(b)(1)(C), Sec.227(a)(2)(B) This email
address may not be added to any commercial mail list with out my
permission. Violation of my privacy with advertising or SPAM will
result in a suit for a MINIMUM of $500 damages/incident, $1500 for
repeats.


From guido@CNRI.Reston.VA.US Mon Jun 28 21:07:50 1999
From: guido@CNRI.Reston.VA.US (Guido van Rossum)
Date: Mon, 28 Jun 1999 16:07:50 -0400
Subject: [Python-Dev] ob_refcnt access
In-Reply-To: Your message of "Mon, 28 Jun 1999 02:53:21 PDT."
<14199.13497.439332.366329@seattle.nightmare.com>
References: <14199.13497.439332.366329@seattle.nightmare.com>
Message-ID: <199906282007.QAA01570@eric.cnri.reston.va.us>
> Part of the problem is that we didn't have the feature to play with.
> Many of the possibilities are showing up now that it's here...
>
> The basic advantage to coroutines is they allow you to turn any
> event-driven/state-machine problem into one that is managed with
> 'normal' control state; i.e., for loops, while loops, nested procedure
> calls, etc...
>
> Here are a few possible real-world uses:
Thanks, Sam! Very useful collection of suggestions. (How come I'm
not surprised to see these coming from you ;-)

--Guido van Rossum (home page: http://www.python.org/~guido/)


From akuchlin@mems-exchange.org Mon Jun 28 21:08:42 1999
From: akuchlin@mems-exchange.org (Andrew M. Kuchling)
Date: Mon, 28 Jun 1999 16:08:42 -0400 (EDT)
Subject: [Python-Dev] Paul Prescod: add Expat to 1.6
In-Reply-To: <199906281935.PAA01439@eric.cnri.reston.va.us>
References: <14199.46852.932030.576094@amarok.cnri.reston.va.us>
<199906281935.PAA01439@eric.cnri.reston.va.us>
Message-ID: <14199.54858.464165.381344@amarok.cnri.reston.va.us>

Guido van Rossum writes:
> any thoughts to spare at the moment.  Perhaps someone could present
> the case with some more background info?  (It does sound intriguing,
Paul is probably suggesting this so that Python comes with a fast,
standardized XML parser out of the box. On the other hand, where do
you draw the line? Paul suggests including PyExpat and easySAX (a
small SAX implementation), but why not full SAX, and why not DOM?

My personal leaning is that we can get more bang for the buck by
working on the Distutils effort, so that installing a package like
PyExpat becomes much easier, rather than piling more things into the
core distribution.

--
A.M. Kuchling http://starship.python.net/crew/amk/
The Law, in its majestic equality, forbids the rich, as well as the poor, to
sleep under the bridges, to beg in the streets, and to steal bread.
-- Anatole France



From guido@CNRI.Reston.VA.US Mon Jun 28 21:17:41 1999
From: guido@CNRI.Reston.VA.US (Guido van Rossum)
Date: Mon, 28 Jun 1999 16:17:41 -0400
Subject: [Python-Dev] ActiveState & fork & Perl
In-Reply-To: Your message of "Sun, 27 Jun 1999 23:13:15 EDT."
<000601bec114$2a2929c0$e19e2299@tim>
References: <000601bec114$2a2929c0$e19e2299@tim>
Message-ID:

[Tim]
Moving back in time ...

[GordonM]
Perhaps Christian's stackless Python would enable green threads...

[Guido]
This has been suggested before... While this seems possible at first,
all blocking I/O calls would have to be redone to pass control to the
thread scheduler, before this would be useful -- a huge task!
I didn't understand this. If I/O calls are left alone, and a green thread
hit one, the whole program just sits there waiting for the call to complete,
right?

But if the same thing happens using "real threads" today, the same thing
happens today anyway <wink>. That is, if a thread doesn't release the
global lock before a blocking call today, the whole program just sits there
etc.

Or do you have some other kind of problem in mind here?
OK, I'll explain. Suppose there's a wrapper for a read() call whose
essential code looks like this:

Py_BEGIN_ALLOW_THREADS
n = read(fd, buffer, size);
Py_END_ALLOW_THREADS

When the read() call is made, other threads can run. However in green
threads (e.g. using Christian's stackless Python, where a thread
switcher is easily added) the whole program would block at this
point. The way to fix this is to have a way to tell the scheduler
"come back to this thread when there's input ready on this fd". The
scheduler has to combine such calls from all threads into a single
giant select. It gets more complicated when you have blocking I/O
wrapped in library functions, e.g. gethostbyname() or fread(). Then,
you need to have a way to implement sleep() by talking to the thread
scheduler (remember, this is the thread scheduler we have to write
ourselves). Oh, and of course the thread scheduler must also have a
select() lookalike API so I can still implement the select module.
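The scheme Guido sketches — park a green thread on an fd and fold every parked fd into one select() — can be sketched with modern generators (which did not exist in 1999; all names and details here are illustrative, not Christian's actual API):

```python
import os
import select

def scheduler(tasks):
    """Round-robin a list of generator 'threads'. A task that yields
    ('read', fd) is parked until select() reports fd readable -- the
    'single giant select' combining requests from all threads."""
    waiting = {}                      # fd -> parked task
    runnable = list(tasks)
    while runnable or waiting:
        next_round = []
        for task in runnable:
            try:
                request = task.send(None)
            except StopIteration:
                continue              # task finished
            if request and request[0] == 'read':
                waiting[request[1]] = task
            else:
                next_round.append(task)
        if waiting:
            # block only when no other thread is runnable
            timeout = 0 if next_round else None
            ready, _, _ = select.select(list(waiting), [], [], timeout)
            for fd in ready:
                next_round.append(waiting.pop(fd))
        runnable = next_round

# a tiny demonstration with a pipe
r, w = os.pipe()
os.write(w, b'hi')
got = []

def reader():
    yield ('read', r)                 # "come back when r is ready"
    got.append(os.read(r, 2))

scheduler([reader()])
assert got == [b'hi']
```

A real system would also need the 'write' case, sleep() timeouts, and wrappers for calls like gethostbyname() that cannot be expressed as an fd wait — exactly the "huge task" Guido warns about.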

Does this help? Or am I misunderstanding your complaint? Or is a
<wink> missing?

--Guido van Rossum (home page: http://www.python.org/~guido/)


From guido@CNRI.Reston.VA.US Mon Jun 28 21:23:57 1999
From: guido@CNRI.Reston.VA.US (Guido van Rossum)
Date: Mon, 28 Jun 1999 16:23:57 -0400
Subject: [Python-Dev] ob_refcnt access
In-Reply-To: Your message of "Sun, 27 Jun 1999 22:27:06 EDT."
<000501bec10d$b6f1fb40$e19e2299@tim>
References: <000501bec10d$b6f1fb40$e19e2299@tim>
Message-ID: <199906282023.QAA01605@eric.cnri.reston.va.us>
yes I'm happy - chris
You should be! So how come nobody else is <wink/frown>?
Chris and I have been through this in private, but it seems that as
long as I don't fess up in public I'm afraid it will come back and
I'll get pressure coming at me to endorse Chris' code.

I have no problem with the general concept (see my response to Sam's
post of exciting examples).

But I have a problem with a megapatch like this that affects many
places including very sensitive areas like the main loop in ceval.c.
The problem is simply that I know this is very intricate code, and I
can't accept a patch of this scale to this code before I understand
every little detail of the patch. I'm just too worried otherwise that
there's a reference count bug in it that will very subtly break stuff
and that will take forever to track down; I feel that when I finally
have the time to actually understand the whole patch I'll be able to
prevent that (famous last words).

Please don't expect action or endorsement of Chris' patch from me any
time soon, I'm too busy. However I'd love it if others used the patch
in a real system and related their experiences regarding performance,
stability etc.

--Guido van Rossum (home page: http://www.python.org/~guido/)


From skip@mojam.com (Skip Montanaro) Mon Jun 28 21:24:46 1999
From: skip@mojam.com (Skip Montanaro) (Skip Montanaro)
Date: Mon, 28 Jun 1999 16:24:46 -0400 (EDT)
Subject: [Python-Dev] Paul Prescod: add Expat to 1.6
In-Reply-To: <14199.54858.464165.381344@amarok.cnri.reston.va.us>
References: <14199.46852.932030.576094@amarok.cnri.reston.va.us>
<199906281935.PAA01439@eric.cnri.reston.va.us>
<14199.54858.464165.381344@amarok.cnri.reston.va.us>
Message-ID: <14199.55737.544299.718558@cm-24-29-94-19.nycap.rr.com>

Andrew> My personal leaning is that we can get more bang for the buck by
Andrew> working on the Distutils effort, so that installing a package
Andrew> like PyExpat becomes much easier, rather than piling more things
Andrew> into the core distribution.

Amen to that. See Guido's note and my response regarding soundex in the
Doc-SIG. Perhaps you could get away with a very small core distribution
that only contained the stuff necessary to pull everything else from the net
via http or ftp...

Skip



From bwarsaw@cnri.reston.va.us (Barry A. Warsaw) Mon Jun 28 22:20:05 1999
From: bwarsaw@cnri.reston.va.us (Barry A. Warsaw) (Barry A. Warsaw)
Date: Mon, 28 Jun 1999 17:20:05 -0400 (EDT)
Subject: [Python-Dev] Merge the string_methods tag?
References: <015601beb964$f37a4fa0$0801a8c0@bobcat>
<199906181451.KAA11549@eric.cnri.reston.va.us>
<14198.41195.968785.37763@cm-24-29-94-19.nycap.rr.com>
Message-ID: <14199.59141.447168.107784@anthem.cnri.reston.va.us>
"SM" == Skip Montanaro <skip@mojam.com> writes:
SM> Sorry for the delayed response. I've been out of town. When
SM> Barry returns would it be possible to merge the string methods
SM> in conditionally (#ifdef STRING_METHODS) and add a
SM> --with-string-methods configure option? How hard would it be
SM> to modify string.py, stringobject.c and stropmodule.c to carry
SM> that around?

How clean do you want this separation to be? Just disabling the
actual string methods would be easy, and I'm sure I can craft a
string.py that would work in either case (remember stropmodule.c
wasn't even touched). There are a few other miscellaneous changes
mostly having to do with some code cleaning, but those are probably
small (and uncontroversial?) enough that they can either stay in, or
be easily understood and accepted (optimistic aren't I? :) by Guido
during the merge.

I'll see what I can put together in the next 1/2 hour or so.

-Barry


From skip@mojam.com (Skip Montanaro) Mon Jun 28 22:37:03 1999
From: skip@mojam.com (Skip Montanaro) (Skip Montanaro)
Date: Mon, 28 Jun 1999 17:37:03 -0400 (EDT)
Subject: [Python-Dev] Merge the string_methods tag?
In-Reply-To: <14199.59141.447168.107784@anthem.cnri.reston.va.us>
References: <015601beb964$f37a4fa0$0801a8c0@bobcat>
<199906181451.KAA11549@eric.cnri.reston.va.us>
<14198.41195.968785.37763@cm-24-29-94-19.nycap.rr.com>
<14199.59141.447168.107784@anthem.cnri.reston.va.us>
Message-ID: <14199.59831.285431.966321@cm-24-29-94-19.nycap.rr.com>
"BAW" == Barry A Warsaw <bwarsaw@cnri.reston.va.us> writes:
"SM" == Skip Montanaro <skip@mojam.com> writes:
SM> would it be possible to merge the string methods in conditionally
SM> (#ifdef STRING_METHODS) ...

BAW> How clean do you want this separation to be? Just disabling the
BAW> actual string methods would be easy, and I'm sure I can craft a
BAW> string.py that would work in either case (remember stropmodule.c
BAW> wasn't even touched).

Barry,

I would be happy with having to manually #define STRING_METHODS in
stringobject.c. Forget about the configure flag at first. I think the main
point for experimenters like myself is that it is a hell of a lot easier to
twiddle a #define than to try merging different CVS branches to get access
to the functionality. Most of us have probably advanced far enough on the
Emacs, vi or Notepad learning curves to handle that change, while most of us
are probably not CVS wizards.

Once it's in the main CVS branch, you can announce the change or not on the
main list as you see fit (perhaps on python-dev sooner and on python-list
later after some more experience has been gained with the patches).

Skip


From tismer@appliedbiometrics.com Mon Jun 28 22:41:28 1999
From: tismer@appliedbiometrics.com (Christian Tismer)
Date: Mon, 28 Jun 1999 23:41:28 +0200
Subject: [Python-Dev] ob_refcnt access
References: <000501bec10d$b6f1fb40$e19e2299@tim> <199906282023.QAA01605@eric.cnri.reston.va.us>
Message-ID: <3777EC08.42C15478@appliedbiometrics.com>


Guido van Rossum wrote:
yes I'm happy - chris
You should be! So how come nobody else is <wink/frown>?
Chris and I have been through this in private, but it seems that as
long as I don't fess up in public I'm afraid it will come back and
I'll get pressure coming at me to endorse Chris' code.
Please let me add a few comments.
I have no problem with the general concept (see my response to Sam's
post of exciting examples).
This is the most valuable statement I can get.
And see below.
But I have a problem with a megapatch like this that affects many
places including very sensitive areas like the main loop in ceval.c.
Actually it is a rather small patch, but the implicit semantic
change is rather hefty.
The problem is simply that I know this is very intricate code, and I
can't accept a patch of this scale to this code before I understand
every little detail of the patch. I'm just too worried otherwise that
there's a reference count bug in it that will very subtly break stuff
and that will take forever to track down; I feel that when I finally
have the time to actually understand the whole patch I'll be able to
prevent that (famous last words).
I never expected to see this patch go into Python right now.
The current public version is an alpha 0.2. Meanwhile I have
0.3, with again new patches, and a completely reworked policy
of frame refcounting.
Even worse, there is a nightmare of more work which I simply
had no time for. All the instance and object code must be
carefully changed, since they still need to call back in
a recursive way. This is hard to change until I have a better
mechanism to generate all the callbacks. For instance, I cannot
switch tasks in an __init__ at this time, although I can
do so in regular methods. But this is all half-baked.

In other words, the danger is by far not over, but still
in the growing phase. I believe I should work on and maintain
this until I'm convinced that there are no more refcount
bugs than before, and until I have eliminated every recursion
that has a serious impact. This is still months of work.

When I release the final version, I will pay $100 to the first
person who finds a refcount bug which I introduced. But not
before. I don't want to waste Guido's time, and for sure not
now with this bloody fresh code.

What I needed to know is whether I am on the right track or
if I'm wasting my time. But since I have users already,
it is no waste at all. What I really could use is some
hints about API design.

Guido, thank you for Python - chris

--
Christian Tismer :^)
Applied Biometrics GmbH : Have a break! Take a ride on Python's
Kaiserin-Augusta-Allee 101 : *Starship* http://starship.python.net
10553 Berlin : PGP key -> http://wwwkeys.pgp.net
PGP Fingerprint E182 71C7 1A9D 66E9 9D15 D3CC D4D7 93E2 1FAE F6DF
we're tired of banana software - shipped green, ripens at home


From bwarsaw@python.org Mon Jun 28 23:04:05 1999
From: bwarsaw@python.org (Barry A. Warsaw)
Date: Mon, 28 Jun 1999 18:04:05 -0400 (EDT)
Subject: [Python-Dev] Merge the string_methods tag?
References: <015601beb964$f37a4fa0$0801a8c0@bobcat>
<199906181451.KAA11549@eric.cnri.reston.va.us>
<14198.41195.968785.37763@cm-24-29-94-19.nycap.rr.com>
<14199.59141.447168.107784@anthem.cnri.reston.va.us>
<14199.59831.285431.966321@cm-24-29-94-19.nycap.rr.com>
Message-ID: <14199.61781.695240.71428@anthem.cnri.reston.va.us>
"SM" == Skip Montanaro <skip@mojam.com> writes:
SM> I would be happy with having to manually #define
SM> STRING_METHODS in stringobject.c. Forget about the configure
SM> flag at first.

Oh, I agree -- I wasn't going to add the configure flag anyway :)
What I meant was how much of my changes should be ifdef-out-able?
Just the methods on string objects? All my changes?

-Barry


From skip@mojam.com (Skip Montanaro) Mon Jun 28 23:30:55 1999
From: skip@mojam.com (Skip Montanaro) (Skip Montanaro)
Date: Mon, 28 Jun 1999 18:30:55 -0400 (EDT)
Subject: [Python-Dev] Merge the string_methods tag?
In-Reply-To: <14199.61781.695240.71428@anthem.cnri.reston.va.us>
References: <015601beb964$f37a4fa0$0801a8c0@bobcat>
<199906181451.KAA11549@eric.cnri.reston.va.us>
<14198.41195.968785.37763@cm-24-29-94-19.nycap.rr.com>
<14199.59141.447168.107784@anthem.cnri.reston.va.us>
<14199.59831.285431.966321@cm-24-29-94-19.nycap.rr.com>
<14199.61781.695240.71428@anthem.cnri.reston.va.us>
Message-ID: <14199.63115.58129.480522@cm-24-29-94-19.nycap.rr.com>

BAW> Oh, I agree -- I wasn't going to add the configure flag anyway :)
BAW> What I meant was how much of my changes should be ifdef-out-able?
BAW> Just the methods on string objects? All my changes?

Well, when the CPP macro is undefined, the behavior from Python should be
unchanged, yes? Am I missing something? There are string methods and what
else involved in the changes? If string.py has to test to see if
"".capitalize yields an AttributeError to decide what to do, I think that
sort of change will be simple enough to accommodate. Any new code that gets
well-exercised now before string methods become widely available is all to
the good in my opinion. It's not fixing something that ain't broke, more
like laying the groundwork for new directions.

Skip


From bwarsaw@python.org Tue Jun 29 00:04:55 1999
From: bwarsaw@python.org (Barry A. Warsaw)
Date: Mon, 28 Jun 1999 19:04:55 -0400 (EDT)
Subject: [Python-Dev] Merge the string_methods tag?
References: <015601beb964$f37a4fa0$0801a8c0@bobcat>
<199906181451.KAA11549@eric.cnri.reston.va.us>
<14198.41195.968785.37763@cm-24-29-94-19.nycap.rr.com>
<14199.59141.447168.107784@anthem.cnri.reston.va.us>
<14199.59831.285431.966321@cm-24-29-94-19.nycap.rr.com>
<14199.61781.695240.71428@anthem.cnri.reston.va.us>
<14199.63115.58129.480522@cm-24-29-94-19.nycap.rr.com>
Message-ID: <14199.65431.161001.730247@anthem.cnri.reston.va.us>
"SM" == Skip Montanaro <skip@mojam.com> writes:
SM> Well, when the CPP macro is undefined, the behavior from
SM> Python should be unchanged, yes? Am I missing something?
SM> There are string methods and what else involved in the
SM> changes?

There are a few additions to the C API, but these probably don't need
to be ifdef'd, since they don't change the existing semantics or
interfaces.

abstract.c has some code cleaning and reorganization, but the public
API and semantics should be unchanged.

Builtin long() and int() have grown an extra optional argument, which
specifies the base to use. If this extra argument isn't given then
they should work the same as in the main branch. Should we ifdef out
the extra argument?
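The extra optional argument Barry mentions is the base, and it survives unchanged in today's int() (long() was later folded into int):

```python
# int() with an explicit base -- the optional argument grown
# on the string_methods branch
assert int('ff', 16) == 255
assert int('777', 8) == 511
assert int('101', 2) == 5
```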

SM> If string.py has to test to see if "".capitalize yields an
SM> AttributeError to decide what to do, I think that sort of
SM> change will be simple enough to accommodate.

Basically what I've got is to move the main-branch string.py to
stringold.py and if you get an attribute error on ''.upper I do a
"from stringold import *". I've also got some hackarounds for
test_string.py to make it work with or without string methods.
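The probe-and-fall-back trick Barry describes fits in a few lines; this is only a sketch of the idea (stringold is the renamed main-branch module from his message, and the probe attribute is arbitrary):

```python
# string.py compatibility shim: use string methods when the build
# provides them, otherwise fall back to the old implementation.
try:
    ''.upper                      # probe: AttributeError on old builds
    def upper(s):
        return s.upper()          # delegate to the new string method
except AttributeError:
    from stringold import *       # pre-string-methods interpreter
```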

SM> Any new code that gets well-exercised now before string
SM> methods become widely available is all to the good in my
SM> opinion. It's not fixing something that ain't broke, more
SM> like laying the groundwork for new directions.

Agreed. I'll check my changes in shortly. The ifdef will only
disable the string methods. long() and int() will still accept the
option argument.

Stay tuned,
-Barry


From tim_one@email.msn.com Tue Jun 29 05:16:34 1999
From: tim_one@email.msn.com (Tim Peters)
Date: Tue, 29 Jun 1999 00:16:34 -0400
Subject: Fake threads (was [Python-Dev] ActiveState & fork & Perl)
In-Reply-To: <199906282017.QAA01592@eric.cnri.reston.va.us>
Message-ID:

[Tim, claims not to understand Guido's
While this seems possible at first, all blocking I/O calls would
have to be redone to pass control to the thread scheduler, before
this would be useful -- a huge task!
]

[Guido replies, sketching an elaborate scheme for making threads that
are fake nevertheless act like real threads in the particular case of
potentially blocking I/O calls]
...
However in green threads (e.g. using Christian's stackless Python,
where a thread switcher is easily added) the whole program would block
at this point. The way to fix this is [very painful <wink>].
...
Does this help? Or am I misunderstanding your complaint? Or is a
<wink> missing?
No missing wink; I think it hinges on a confusion about the meaning of your
original word "useful".

Threads can be very useful purely as a means for algorithm structuring, due
to independent control flows. Indeed, I use threads in Python most often
these days without any hope or even *use* for potential parallelism
(overlapped I/O or otherwise). It's the only non-brain-busting way to write
code now that requires advanced control of the iterator, generator,
coroutine, or even independent-agents-in-a-pipeline flavors.
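Generators (added in Python 2.2, well after this thread) now provide exactly this "independent control flow" without OS threads; a minimal two-stage pipeline as a sketch:

```python
def produce():
    # one independent control flow: keeps its own loop state
    for i in range(5):
        yield i * i

def scale(source, factor):
    # a second agent in the pipeline, with its own local state
    for value in source:
        yield value * factor

# two cooperating "threads of control", no scheduler or locks needed
assert list(scale(produce(), 10)) == [0, 10, 40, 90, 160]
```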

Fake threads would allow code like that to run portably, and also likely
faster than with the overheads of OS-level threads. For pedagogical and
debugging purposes too, fake threads could be very much friendlier than the
real thing. Heck, we could even run them on a friendly old Macintosh
<wink>.

If all fake threads block when any hits an I/O call, waiting for the latter
to return, we're no worse off than in a single-threaded program. Being
"fake threads", it *is* a single-threaded program, so it's not even a
surprise <wink>.

Maybe in your

Py_BEGIN_ALLOW_THREADS
n = read(fd, buffer, size);
Py_END_ALLOW_THREADS

you're assuming that some other Python thread needs to run in order for the
read implementation to find something to read? Then that's a dead program
for sure, as it would be for a single-threaded run today too. I can live
with that! I don't expect fake threads to act like real threads in all
cases.

My assumption was that the BEGIN/END macros would do nothing under fake
threads -- since there isn't a real thread backing it up, a fake thread
can't yield in the middle of random C code (Python has no way to
capture/restore the C state). I didn't picture fake threads working except
as a Python-level feature, with context switches limited to bytecode
boundaries (which a stackless ceval can handle with ease; the macro context
switch above is "in the middle of" some bytecode's interpretation, and while
"green threads" may be interested in simulating that, Tim's "fake
threads" aren't).

different-threads-for-different-heads-ly y'rs - tim




From guido@CNRI.Reston.VA.US Tue Jun 29 13:01:30 1999
From: guido@CNRI.Reston.VA.US (Guido van Rossum)
Date: Tue, 29 Jun 1999 08:01:30 -0400
Subject: Fake threads (was [Python-Dev] ActiveState & fork & Perl)
In-Reply-To: Your message of "Tue, 29 Jun 1999 00:16:34 EDT."
<000201bec1e6$2c496940$229e2299@tim>
References: <000201bec1e6$2c496940$229e2299@tim>
Message-ID:

[Tim, claims not to understand Guido's
While this seems possible at first, all blocking I/O calls would
have to be redone to pass control to the thread scheduler, before
this would be useful -- a huge task!
]

[Guido replies, sketching an elaborate scheme for making threads that
are fake nevertheless act like real threads in the particular case of
potentially blocking I/O calls]
[Tim responds, explaining that without this threads are quite useful.]

I guess it's all in the perspective. 99.99% of all thread apps I've
ever written use threads primarily to overlap I/O -- if there wasn't
I/O to overlap I wouldn't use a thread. I think I share this
perspective with most of the thread community (after all, threads
originate in the OS world where they were invented as a replacement
for I/O completion routines).

(And no, I don't use threads to get the use of multiple CPUs, since I
almost never have had more than one of those. And no, I wasn't
expecting the read() to be fed from another thread.)

As far as I can tell, all the examples you give are easily done using
coroutines. Can we call whatever you're asking for coroutines instead
of fake threads?

I think that when you mention threads, green or otherwise colored,
most people who are at all familiar with the concept will assume they
provide I/O overlapping, except perhaps when they grew up in the
parallel machine world. Certainly all examples I give in my
never-completed thread tutorial (still available at
http://www.python.org/doc/essays/threads.html) use I/O as the primary
motivator -- this kind of example appeals to simple souls
(e.g. downloading more than one file in parallel, which they probably
have already seen in action in their web browser), as opposed to
generators or pipelines or coroutines (for which you need to have some
programming theory background to appreciate the powerful abstraction
possibilities they give).

Another good use of threads (suggested by Sam) is for GUI programming.
An old GUI system, News by David Rosenthal at Sun, used threads
programmed in PostScript -- very elegant (and it failed for other
reasons -- if only he had used Python instead :-).

On the other hand, having written lots of GUI code using Tkinter, the
event-driven version doesn't feel so bad to me. Threads would be nice
when doing things like rubberbanding, but I generally agree with
Ousterhout's premise that event-based GUI programming is more reliable
than thread-based. Every time your Netscape freezes you can bet
there's a threading bug somewhere in the code.

--Guido van Rossum (home page: http://www.python.org/~guido/)


From gmcm@hypernet.com Wed Jun 30 01:03:37 1999
From: gmcm@hypernet.com (Gordon McMillan)
Date: Tue, 29 Jun 1999 19:03:37 -0500
Subject: Fake threads (was [Python-Dev] ActiveState & fork & Perl)
In-Reply-To: <199906291201.IAA02535@eric.cnri.reston.va.us>
References: Your message of "Tue, 29 Jun 1999 00:16:34 EDT." <000201bec1e6$2c496940$229e2299@tim>
Message-ID: <1281421591-30373695@hypernet.com>

I've been out of town, too (not with Skip), but I'll jump back in
here...

[Guido]
When the read() call is made, other threads can run. However in
green threads (e.g. using Christian's stackless Python, where a
thread switcher is easily added) the whole program would block at
this point. The way to fix this is to have a way to tell the
scheduler "come back to this thread when there's input ready on
this fd". The scheduler has to combine such calls from all
threads into a single giant select. It gets more complicated when
you have blocking I/O
I suppose, in the best of all possible worlds, this is true. But I'm
fairly sure there are a number of well-used green thread
implementations which go only part way - eg, if this is a
"selectable" fd, do a select with a timeout of 0 on this one fd and
choose to read/write or swap accordingly. That's a fair amount of
bang for the buck, I think...
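Gordon's partial measure — a zero-timeout select on just the one fd before the call — is cheap to sketch (illustrative, not taken from any particular green-thread implementation):

```python
import os
import select

def can_read(fd):
    # timeout 0 means poll and return immediately: a green-thread
    # scheduler can swap to another thread instead of blocking here
    readable, _, _ = select.select([fd], [], [], 0)
    return bool(readable)

r, w = os.pipe()
assert not can_read(r)    # nothing written yet
os.write(w, b'x')
assert can_read(r)        # now a read would not block
```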

[Tim]
Threads can be very useful purely as a means for algorithm
structuring, due to independent control flows.
Spoken like a true schizo, Tim me boyos! Actually, you and Guido are
saying almost the same thing - threads are useful when more than one
thing is "driving" your processing. It's just that in the real world,
that's almost always I/O, not some sick, tortured internal
dialogue...

I think the real question is: how useful would this be on a Mac? On
Win31? (I'll answer that - useful, though I've finally got my last
Win31 client to promise to upgrade, RSN <hack, cough>).


- Gordon


From MHammond@skippinet.com.au Wed Jun 30 00:47:26 1999
From: MHammond@skippinet.com.au (Mark Hammond)
Date: Wed, 30 Jun 1999 09:47:26 +1000
Subject: [Python-Dev] Is Python Free Software, free software, Open Source, open source, etc?
Message-ID: <006f01bec289$bf1e3a90$0801a8c0@bobcat>

This probably isn't the correct list, but I really don't want to start a
philosophical discussion - hopefully people here are both "in the know" and
able to resist a huge thread :-)

Especially given the recent slashdot flamefest between RMS and ESR, I
thought it worth getting correct.

I just read a statement early in our book - "Python is an Open Source tool,
...".

Is this "near enough"? Should I avoid this term in preference for
something more generic (ie, even simply dropping the caps?) - but the
OS(tm) idea seems doomed anyway...

Just-hoping-to-avoid-flame-mail-from-rabid-devotees-of-either-religion :-)

Mark.



From da@ski.org Wed Jun 30 07:16:01 1999
From: da@ski.org (David Ascher)
Date: Tue, 29 Jun 1999 23:16:01 -0700 (Pacific Daylight Time)
Subject: [Python-Dev] Is Python Free Software, free software, Open Source,
open source, etc?
In-Reply-To: <006f01bec289$bf1e3a90$0801a8c0@bobcat>
Message-ID: <Pine.WNT.4.05.9906292311490.213-100000@david.ski.org>
On Wed, 30 Jun 1999, Mark Hammond wrote:

I just read a statement early in our book - "Python is an Open Source tool,
...".

Is this "near enough"? Should I avoid this term in preference for
something more generic (ie, even simply dropping the caps?) - but the
OS(tm) idea seems doomed anyway...
It's not certified Open Source, but my understanding is that ESR believes
the Python license would qualify if GvR applied for certification.

BTW, you won't be able to avoid flames about something or other, and given
that you're writing a Win32 book, you'll be flamed by both pseudo-ESRs and
pseudo-RMSs, all Anonymous Cowards. =)

--david



From fredrik@pythonware.com Wed Jun 30 09:42:15 1999
From: fredrik@pythonware.com (Fredrik Lundh)
Date: Wed, 30 Jun 1999 10:42:15 +0200
Subject: [Python-Dev] Is Python Free Software, free software, Open Source,open source, etc?
References: <Pine.WNT.4.05.9906292311490.213-100000@david.ski.org>
Message-ID: <012601bec2d4$74c315b0$f29b12c2@secret.pythonware.com>
BTW, you won't be able to avoid flames about something or other, and given
that you're writing a Win32 book, you'll be flamed by both pseudo-ESRs and
pseudo-RMSs, all Anonymous Cowards. =)
just check the latest "learning python" review
on Amazon...

surely proves that perlers are weird people ;-)

</F>



From guido@CNRI.Reston.VA.US Wed Jun 30 13:06:21 1999
From: guido@CNRI.Reston.VA.US (Guido van Rossum)
Date: Wed, 30 Jun 1999 08:06:21 -0400
Subject: [Python-Dev] Is Python Free Software, free software, Open Source, open source, etc?
In-Reply-To: Your message of "Tue, 29 Jun 1999 23:16:01 PDT."
<Pine.WNT.4.05.9906292311490.213-100000@david.ski.org>
References: <Pine.WNT.4.05.9906292311490.213-100000@david.ski.org>
Message-ID: <199906301206.IAA04619@eric.cnri.reston.va.us>
On Wed, 30 Jun 1999, Mark Hammond wrote:

I just read a statement early in our book - "Python is an Open Source tool,
...".

Is this "near enough"? Should I avoid this term in preference for
something more generic (ie, even simply dropping the caps?) - but the
OS(tm) idea seems doomed anyway...
It's not certified Open Source, but my understanding is that ESR believes
the Python license would qualify if GvR applied for certification.
I did, months ago, and haven't heard back yet. My current policy is
to drop the initial caps and say "open source" -- most people don't
know the difference anyway.
BTW, you won't be able to avoid flames about something or other, and given
that you're writing a Win32 book, you'll be flamed by both pseudo-ESRs and
pseudo-RMSs, all Anonymous Cowards. =)
I don't have the time to read slashdot -- can anyone summarize what
ESR and RMS were flaming about?

--Guido van Rossum (home page: http://www.python.org/~guido/)


From fredrik@pythonware.com Wed Jun 30 13:22:09 1999
From: fredrik@pythonware.com (Fredrik Lundh)
Date: Wed, 30 Jun 1999 14:22:09 +0200
Subject: [Python-Dev] Is Python Free Software, free software, Open Source, open source, etc?
References: <Pine.WNT.4.05.9906292311490.213-100000@david.ski.org> <199906301206.IAA04619@eric.cnri.reston.va.us>
Message-ID: <000701bec2f3$2df78430$f29b12c2@secret.pythonware.com>
I did, months ago, and haven't heard back yet. My current policy is
to drop the initial caps and say "open source" -- most people don't
know the difference anyway.
and "Open Source" cannot be trademarked anyway...
I don't have the time to read slashdot -- can anyone summarize what
ESR and RMS were flaming about?
the usual; RMS wrote in saying that 1) he's not part of the
open source movement, 2) open source folks don't understand
the real meaning of the word freedom, and 3) he's
not a communist. ESR's response is here:

http://www.tuxedo.org/~esr/writings/shut-up-and-show-them.html

... OSI's tactics work. That's the easy part of the lesson.
The hard part is that the FSF's tactics don't work, and
never did. ... So the next time RMS, or anybody else,
urges you to "talk about freedom", I urge you to reply
"Shut up and show them the code."

imo, the best thing is of course to ignore them both, and
continue to ship great stuff under a truly open license...

</F>



From bwarsaw@cnri.reston.va.us (Barry A. Warsaw) Wed Jun 30 13:54:06 1999
From: bwarsaw@cnri.reston.va.us (Barry A. Warsaw) (Barry A. Warsaw)
Date: Wed, 30 Jun 1999 08:54:06 -0400 (EDT)
Subject: [Python-Dev] Is Python Free Software, free software, Open Source, open source, etc?
References: <Pine.WNT.4.05.9906292311490.213-100000@david.ski.org>
<199906301206.IAA04619@eric.cnri.reston.va.us>
<000701bec2f3$2df78430$f29b12c2@secret.pythonware.com>
Message-ID: <14202.4974.162380.284749@anthem.cnri.reston.va.us>
"FL" == Fredrik Lundh <fredrik@pythonware.com> writes:
FL> imo, the best thing is of course to ignore them both, and
FL> continue to ship great stuff under a truly open license...

Agreed, of course. I think given the current state of affairs
(i.e. the non-trademarkability of "Open Source", but also the mind
share that little-oh, little-ess has gotten), we should say that
Python (and JPython) are "open source" projects and let people make up
their own minds about what that means.

waiting-for-guido's-inevitable-faq-entry-ly y'rs,
-Barry


From tismer@appliedbiometrics.com Tue Jun 29 19:17:51 1999
From: tismer@appliedbiometrics.com (Christian Tismer)
Date: Tue, 29 Jun 1999 20:17:51 +0200
Subject: Fake threads (was [Python-Dev] ActiveState & fork & Perl)
References: <000201bec1e6$2c496940$229e2299@tim> <199906291201.IAA02535@eric.cnri.reston.va.us>
Message-ID: <37790DCF.7C0E8FA@appliedbiometrics.com>


Guido van Rossum wrote:
[Guido and Tim, different opinions named misunderstanding :]

I guess it's all in the perspective. 99.99% of all thread apps I've
ever written use threads primarily to overlap I/O -- if there wasn't
I/O to overlap I wouldn't use a thread. I think I share this
perspective with most of the thread community (after all, threads
originate in the OS world where they were invented as a replacement
for I/O completion routines).

(And no, I don't use threads to get the use of multiple CPUs, since I
almost never have had more than one of those. And no, I wasn't
expecting the read() to be fed from another thread.)

As far as I can tell, all the examples you give are easily done using
coroutines. Can we call whatever you're asking for coroutines instead
of fake threads?
I don't think this would match it.
These threads can be implemented by coroutines which always
run apart, and have some scheduling running.

When there is polled I/O available, they can of course
give a threaded feeling. If an application polls the
kbhit function instead of reading, the other "threads"
can run nicely.
Can be quite useful for very small computers like CE.

Many years before, I had my own threads under Turbo Pascal
(I had no idea that these are called so). Ok, this was
DOS, but it was enough of threading to have a "process"
which smoothly updated a graphics screen, while another
(single! :) "process" wrote data to the disk, a third one
handled keyboard input, and a fourth drove a multichannel
A/D sampling device.

Oops, I just realized that these were *true* threads.
The disk process would not run smoothly, I agree. All
the rest would be fine with green threads.

...
On the other hand, having written lots of GUI code using Tkinter, the
event-driven version doesn't feel so bad to me. Threads would be nice
when doing things like rubberbanding, but I generally agree with
Ousterhout's premise that event-based GUI programming is more reliable
than thread-based. Every time your Netscape freezes you can bet
there's a threading bug somewhere in the code.
Right. But with a traceback instead of a machine hang,
this could be more attractive to do. Green threads/coroutines
are incredibly fast (one C call per switch). And since they have
local state, you can save most of the attribute lookups which
are needed with event based programming.
(But this is all theory until we tried it).
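Christian's "local state instead of attribute lookups" point can be made concrete with a modern sketch (generators arrived well after this thread; the names here are invented for illustration):

```python
# Event-based style: state survives between events only as attributes.
class ClickCounter:
    def __init__(self):
        self.count = 0
    def on_click(self):
        self.count += 1           # attribute lookup on every event
        return self.count

# Coroutine style: the same state is a plain local variable that
# survives across suspensions.
def click_counter():
    count = 0
    while True:
        count += 1                # ordinary local, no lookup
        yield count

handler = ClickCounter()
events = click_counter()
assert handler.on_click() == next(events) == 1
assert handler.on_click() == next(events) == 2
```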

ciao - chris

--
Christian Tismer :^)
Applied Biometrics GmbH : Have a break! Take a ride on Python's
Kaiserin-Augusta-Allee 101 : *Starship* http://starship.python.net
10553 Berlin : PGP key -> http://wwwkeys.pgp.net
PGP Fingerprint E182 71C7 1A9D 66E9 9D15 D3CC D4D7 93E2 1FAE F6DF
we're tired of banana software - shipped green, ripens at home
