[Python] package similar to XML::Simple

Paulo Pinto
Jan 28, 2004 at 9:53 am
Hi,


does anyone know of a Python package that
is able to load XML like the XML::Simple
Perl package does?

For those that don't know it, this package
maps the XML file to a dictionary.

Of course I can build such a package myself
but it would be better if it already exists :)

--
Paulo Pinto

47 responses

  • Pierre N at Jan 28, 2004 at 12:02 pm
    I'm using pyRXP, and it's great.
    It's using one tuple, not dictionaries.
    Very very fast.
    By the way I'm just starting using this package, anybody met any
    problems with pyRXP?

    -- Pierre

  • Harald Massa at Jan 28, 2004 at 12:03 pm
    Paulo Pinto wrote:
    does anyone know of a Python package that
    is able to load XML like the XML::Simple
    Perl package does?
    Good question! I know of at least three packages that do something similar:

    - Fredrik Lundh's elementtree
    - David Mertz's Gnosis XML utilities
    - handyxml

    Just google for them.
  • Peter Hansen at Jan 28, 2004 at 2:24 pm

    Paulo Pinto wrote:
    does anyone know of a Python package that
    is able to load XML like the XML::Simple
    Perl package does?

    For those that don't know it, this package
    maps the XML file to a dictionary.
    A simple dictionary is insufficient to represent XML in general,
    so perhaps you're talking about a subset of XML, maybe with no
    attributes, and where the order of the child elements doesn't
    matter? Or something else?

    Or do you really mean something like a multiply-nested
    dictionary, perhaps with lists as well?
    Of course I can build such a package myself
    but it would be better if it already exists :)
    We were able to build something similar by stripping down
    Fredrik Lundh's elementtree until we had little more than the
    calls to the expat parser (i.e. we used his source as a tutorial
    on using expat :-), so if this is something like the XML-subset
    I mention above, you could do it in an hour or so from scratch
    if you knew Python well.
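
A minimal sketch of the "multiply-nested dictionary, perhaps with lists" idea, assuming the standard library's xml.etree.ElementTree (the descendant of the elementtree package mentioned in this thread) rather than a hand-rolled expat wrapper. The function names and the "content" key are hypothetical, and this does not claim to reproduce what XML::Simple or handyxml actually do:

```python
import xml.etree.ElementTree as ET


def element_to_data(element):
    """Convert an element into nested dicts, lists and strings.

    Attributes and child elements share one dictionary, repeated child
    tags are collected into a list, and an element holding only text
    becomes a plain string. Ordering between different child tags and
    mixed content are lost, which is the limitation raised above.
    """
    data = dict(element.attrib)
    for child in element:
        value = element_to_data(child)
        if child.tag in data:
            # Second or later occurrence of this tag: promote to a list.
            if not isinstance(data[child.tag], list):
                data[child.tag] = [data[child.tag]]
            data[child.tag].append(value)
        else:
            data[child.tag] = value
    text = (element.text or "").strip()
    if not data:
        return text
    if text:
        data["content"] = text  # hypothetical key for text next to children
    return data


def xml_to_data(source_text):
    """Parse an XML string and return the nested structure of its root."""
    return element_to_data(ET.fromstring(source_text))


sample = (
    "<config debug='1'>"
    "<server>alpha</server><server>beta</server><port>80</port>"
    "</config>"
)
config = xml_to_data(sample)
print(config)
# {'debug': '1', 'server': ['alpha', 'beta'], 'port': '80'}
```

Note that the root tag itself is dropped and every value stays a string; whether that is acceptable depends on the subset of XML you actually need to read.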

    -Peter
  • Paulo Pinto at Jan 30, 2004 at 1:12 pm
    I mean multiple nested dictionaries with lists.

    But handyxml seems to solve my problem.

    Thanks, guys

  • Uche Ogbuji at Feb 10, 2004 at 7:32 am
    Pierre N <pierren at mac.com> wrote in message news:<mailman.904.1075287732.12720.python-list at python.org>...
    I'm using pyRXP, and it's great.
    It's using one tuple, not dictionaries.
    Very very fast.
    By the way I'm just starting using this package, anybody met any
    problems with pyRXP?
    I did. It's not an XML parser :-(. It does not accept character
    entities such as &#8230; (the example that bit me), giving meaningless
    "error" messages along the lines: "not a valid 8-bit XML character".
    If you need an XML parser, use PyRXPU, which comes in ReportLab CVS
    only. It is not as fast as PyRXP, but conformant in my testing, and
    the point of XML is conformance, not speed at all costs. If you want
    speed at all costs, use CSV or some other plain text format.

    I'm writing at length about this unfortunate PyRXP situation in my
    next ORA python/XML column (expected Weds).
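
For comparison, here is a small sketch of what a conforming parser does with that character reference. It assumes the standard library's expat-backed xml.etree.ElementTree, which is not one of the packages under discussion; the sample document is made up for illustration:

```python
import xml.etree.ElementTree as ET

# &#8230; is a character reference to U+2026 HORIZONTAL ELLIPSIS.
document = "<note>Wait for it&#8230;</note>"
root = ET.fromstring(document)

# The parser resolves the reference to the actual Unicode character.
print(ord(root.text[-1]))  # 8230
```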

    --Uche
    http://uche.ogbuji.net
  • Peter Hansen at Feb 10, 2004 at 4:35 pm

    Uche Ogbuji wrote:
    Pierre N <pierren at mac.com> wrote in message news:<mailman.904.1075287732.12720.python-list at python.org>...
    I'm using pyRXP, and it's great.
    It's using one tuple, not dictionaries.
    Very very fast.
    By the way I'm just starting using this package, anybody met any
    problems with pyRXP?
    I did. It's not an XML parser :-(. It does not accept character
    entities such as &#8230; (the example that bit me), giving meaningless
    "error" messages along the lines: "not a valid 8-bit XML character".
    If you need an XML parser, use PyRXPU, which comes in ReportLab CVS
    only. It is not as fast as PyRXP, but conformant in my testing, and
    the point of XML is conformance, not speed at all costs. If you want
    speed at all costs, use CSV or some other plain text format.
    Hmm... so it's your opinion that *all* XML parsers must handle *all*
    aspects of XML? If not, I think you should back off on the criticism
    of PyRXP as being "not an XML parser" and simply point out that it
    doesn't handle all aspects of XML because it is intended to provide
    a very fast/heavily optimized approach to parsing only certain kinds
    of XML. It's a valid choice to do so, though of course if PyRXP is
    promoted as a "full" XML solution that might be inaccurate.

    -Peter
  • Brian Quinlan at Feb 10, 2004 at 6:50 pm

    Peter:
    Hmm... so it's your opinion that *all* XML parsers must handle *all*
    aspects of XML?
    If it isn't Uche's opinion then it is mine.
    If not, I think you should back off on the criticism
    of PyRXP as being "not an XML parser" and simply point out that it
    doesn't handle all aspects of XML because it is intended to provide
    a very fast/heavily optimized approach to parsing only certain kinds
    of XML.
    Then it is not an XML parser. It is a "foo" parser, where "foo" is a subset of
    XML.
    It's a valid choice to do so, though of course if PyRXP is
    promoted as a "full" XML solution that might be inaccurate.
    I have no problem with the existence of PyRXP. It just isn't an XML parser.

    Cheers,
    Brian
  • Martin v. Löwis at Feb 10, 2004 at 7:43 pm

    Peter Hansen wrote:
    Hmm... so it's your opinion that *all* XML parsers must handle *all*
    aspects of XML? If not, I think you should back off on the criticism
    of PyRXP as being "not an XML parser" and simply point out that it
    doesn't handle all aspects of XML because it is intended to provide
    a very fast/heavily optimized approach to parsing only certain kinds
    of XML.
    I am not Uche, but I think that all XML parsers should conform to the
    XML recommendation (and treat deviations from the XML recommendation
    as bugs).

    This is not the same as handling all aspects of XML, since the XML
    recommendation makes certain aspects optional. Processing character
    references is not one of them (but e.g. validation is).
    It's a valid choice to do so, though of course if PyRXP is
    promoted as a "full" XML solution that might be inaccurate.
    Packages may help processing only selected XML documents, and they
    may also support documents which are not XML. However, in neither
    case should they call themselves "XML parsers". "XML-like parsers"
    or "XML subset parsers" might be more appropriate.

    Regards,
    Martin
  • Uche Ogbuji at Feb 11, 2004 at 6:44 am
    "Martin v. Löwis" <martin at v.loewis.de> wrote in message news:<c0bc8v$ibu$01$1 at news.t-online.com>...
    Peter Hansen wrote:
    Hmm... so it's your opinion that *all* XML parsers must handle *all*
    aspects of XML? If not, I think you should back off on the criticism
    of PyRXP as being "not an XML parser" and simply point out that it
    doesn't handle all aspects of XML because it is intended to provide
    a very fast/heavily optimized approach to parsing only certain kinds
    of XML.
    I am not Uche, but I think that all XML parsers should conform to the
    XML recommendation (and treat deviations from the XML recommendation
    as bugs).

    This is not the same as handling all aspects of XML, since the XML
    recommendation makes certain aspects optional. Processing character
    references is not one of them (but e.g. validation is).
    It's a valid choice to do so, though of course if PyRXP is
    promoted as a "full" XML solution that might be inaccurate.
    Packages may help processing only selected XML documents, and they
    may also support documents which are not XML. However, in neither
    case should they call themselves "XML parsers". "XML-like parsers"
    or "XML subset parsers" might be more appropriate.
    I wouldn't argue with calling PyRXP an "XML-like parser".

    Because until very recently I thought that PyRXP was an XML parser, I
    was extremely taken aback when I ran afoul of PyRXP's brazen character
    non-conformance. As an example of the danger in this non-conformance,
    PyRXP refused to parse the very first well-formed XML document I gave
    it. And I'm (mostly) a native English speaker. True XML parsers
    strive for interoperability for a reason. Not doing so pretty much
    negates the value of XML.

    I was even more taken aback to read that the PyRXP developers refused
    to make the simple fix needed for conformance. I think it is
    essential to point out that a tool that refuses XML conformance cannot
    go about calling itself an XML parser.


    --Uche
    http://uche.ogbuji.net
  • Peter Hansen at Feb 11, 2004 at 2:20 pm

    Uche Ogbuji wrote:
    I was even more taken aback to read that the PyRXP developers refused
    to make the simple fix needed for conformance.
    This is a very relevant data point that was missing in the discussion
    until now.

    Given that situation, I'd agree that labelling PyRXP simply an "XML parser"
    without qualification is misleading and wrong.

    -Peter
  • Uche Ogbuji at Feb 11, 2004 at 6:30 am
    Peter Hansen <peter at engcorp.com> wrote in message news:<40290854.15BB5CF0 at engcorp.com>...
    Uche Ogbuji wrote:
    Pierre N <pierren at mac.com> wrote in message news:<mailman.904.1075287732.12720.python-list at python.org>...
    I'm using pyRXP, and it's great.
    It's using one tuple, not dictionaries.
    Very very fast.
    By the way I'm just starting using this package, anybody met any
    problems with pyRXP?
    I did. It's not an XML parser :-(. It does not accept character
    entities such as &#8230; (the example that bit me), giving meaningless
    "error" messages along the lines: "not a valid 8-bit XML character".
    If you need an XML parser, use PyRXPU, which comes in ReportLab CVS
    only. It is not as fast as PyRXP, but conformant in my testing, and
    the point of XML is conformance, not speed at all costs. If you want
    speed at all costs, use CSV or some other plain text format.
    Hmm... so it's your opinion that *all* XML parsers must handle *all*
    aspects of XML?
    XML is clear on what a Parser *must* support. The full character
    production is one of those things. From XML 1.0, section 2.2:

    Character Range
    [2] Char ::= #x9 | #xA | #xD | [#x20-#xD7FF] | [#xE000-#xFFFD] |
    [#x10000-#x10FFFF]

    There is no "option" to not support characters greater than #xFF. XML
    parsers *can* leave off handling some aspects of XML, external DTD
    subsets, for example, but you can not be as fundamentally
    non-conformant as PyRXP and still call yourself an XML parser.

    This is not just an academic matter. There are a *vast* number of
    useful and heavily-used characters of code point higher than U+FF and
    if parsers decided on a whim to pick and choose what to support the
    result would be complete and utter chaos.
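
The Char production quoted above translates almost directly into a range check. A minimal sketch; the function name is hypothetical, and it covers only XML 1.0 production [2], not the XML 1.1 rules:

```python
def is_xml_char(code_point):
    """Return True if the code point matches XML 1.0 production [2] Char."""
    return (
        code_point in (0x9, 0xA, 0xD)
        or 0x20 <= code_point <= 0xD7FF
        or 0xE000 <= code_point <= 0xFFFD
        or 0x10000 <= code_point <= 0x10FFFF
    )


# U+2026 (the ellipsis from earlier in the thread) is well above #xFF
# and is still a legal XML character.
print(is_xml_char(0x2026))  # True
print(is_xml_char(0x0))     # False
```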

    If not, I think you should back off on the criticism
    of PyRXP as being "not an XML parser" and simply point out that it
    doesn't handle all aspects of XML because it is intended to provide
    a very fast/heavily optimized approach to parsing only certain kinds
    of XML. It's a valid choice to do so, though of course if PyRXP is
    promoted as a "full" XML solution that might be inaccurate.
    PyRXP is not an XML parser. It's that simple. I stand by that very
    strong statement, and I'd be surprised if any XML expert refuses to
    corroborate it.

    I do want to point out that PyRXPU does seem to be a proper XML
    parser, and is what people should use instead if they like the
    ReportLab products.

    Of course if you don't really need an XML parser, feel free to use
    PyRXP. Just don't call it what it isn't.

    --Uche
    http://uche.ogbuji.net
  • Uche Ogbuji at Feb 10, 2004 at 7:33 am
    Paulo Pinto <paulo.pinto at cern.ch> wrote in message news:<bv80qq$kbg$1 at sunnews.cern.ch>...
    Hi,


    does anyone know of a Python package that
    is able to load XML like the XML::Simple
    Perl package does?

    For those that don't know it, this package
    maps the XML file to a dictionary.

    Of course I can build such a package myself
    but it would be better if it already exists :)
    FWIW: http://www.xml.com/pub/a/2004/01/14/py-xml.html

    --Uche
    http://uche.ogbuji.net
  • Peter Hansen at Feb 10, 2004 at 7:32 pm

    Brian Quinlan wrote:
    Peter:
    Hmm... so it's your opinion that *all* XML parsers must handle *all*
    aspects of XML?
    If it isn't Uche's opinion then it is mine.
    If not, I think you should back off on the criticism
    of PyRXP as being "not an XML parser" and simply point out that it
    doesn't handle all aspects of XML because it is intended to provide
    a very fast/heavily optimized approach to parsing only certain kinds
    of XML.
    Then it is not an XML parser. It is a "foo" parser, where "foo" is a subset of
    XML.
    It's a valid choice to do so, though of course if PyRXP is
    promoted as a "full" XML solution that might be inaccurate.
    I have no problem with the existence of PyRXP. It just isn't an XML parser.
    Then there are very few XML parsers in the world, if one includes, say,
    namespaces and validation as part of XML.

    -Peter
  • Brian Quinlan at Feb 10, 2004 at 7:55 pm

    Peter Hansen wrote:
    I have no problem with the existence of PyRXP. It just isn't an XML
    parser.
    Then there are very few XML parsers in the world, if one includes, say,
    namespaces and validation as part of XML.
    An XML parser must be able to parse all (1) well-formed XML documents. The
    W3C XML recommendation provides a definition for well-formed XML documents,
    and can be found here:
    http://www.w3.org/TR/2004/REC-xml-20040204/

    (1) I'll qualify this a bit, since the size of XML documents is unbounded
    and computer resources are not.

    Cheers,
    Brian
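
One way to probe the "must parse all well-formed documents" criterion is to hand sample documents to a parser and see whether it accepts them. A rough sketch, assuming the standard library's expat-backed xml.etree.ElementTree as the reference parser; the helper name is hypothetical, and the check only reports that parser's verdict (it does no validation and is not a conformance test suite):

```python
import xml.etree.ElementTree as ET


def is_well_formed(text):
    """Return True if the reference parser accepts the document."""
    try:
        ET.fromstring(text)
    except ET.ParseError:
        return False
    return True


print(is_well_formed("<a><b/></a>"))      # True
print(is_well_formed("<a><b></a>"))       # False: mismatched tags
print(is_well_formed("<a>&#8230;</a>"))   # True: character reference above #xFF
```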
  • Martin v. Löwis at Feb 10, 2004 at 8:26 pm

    Peter Hansen wrote:
    Then there are very few XML parsers in the world, if one includes, say,
    namespaces and validation as part of XML.
    Namespaces are clearly *not* part of the XML recommendation (but part
    of the XML namespaces recommendation). Validation is optional in the
    XML recommendation. Character references are not.

    Regards,
    Martin
  • Peter Hansen at Feb 10, 2004 at 8:46 pm

    "Martin v. Löwis" wrote:
    Peter Hansen wrote:
    Then there are very few XML parsers in the world, if one includes, say,
    namespaces and validation as part of XML.
    Namespaces are clearly *not* part of the XML recommendation (but part
    of the XML namespaces recommendation). Validation is optional in the
    XML recommendation. Character references are not.
    See my reply to Brian Q... I can accept that, but read more into Uche's
    objection than he probably meant. It now seems rather over the top to
    write an article lambasting the product for something that would normally
    just be considered a simple bug/defect in the software, unless the
    maintainers have in effect refused to fix it.

    -Peter
  • Martin v. Löwis at Feb 10, 2004 at 9:07 pm

    Peter Hansen wrote:

    See my reply to Brian Q... I can accept that, but read more into Uche's
    objection than he probably meant. It now seems rather over the top to
    write an article lambasting the product for something that would normally
    just be considered a simple bug/defect in the software, unless the
    maintainers have in effect refused to fix it.
    If a parser prints a message "not a valid 8-bit XML character", then it
    can't be an XML parser. XML does not have a notion even remotely related
    to "8-bit characters".

    Regards,
    Martin
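
To illustrate the distinction being drawn here: XML characters are Unicode code points, and byte width only enters through the document's encoding. The sketch below assumes the standard library's xml.etree.ElementTree and a made-up sample document; it says nothing about how PyRXP works internally:

```python
import xml.etree.ElementTree as ET

ellipsis = "\u2026"

# The same character takes three bytes in UTF-8 ...
utf8_bytes = ellipsis.encode("utf-8")

# ... and has no representation at all in an 8-bit encoding such as Latin-1.
try:
    ellipsis.encode("iso-8859-1")
    fits_in_latin1 = True
except UnicodeEncodeError:
    fits_in_latin1 = False

# A document stored in an 8-bit encoding can still carry the character
# through a character reference, and the parser hands back Unicode text.
document = b"<?xml version='1.0' encoding='ISO-8859-1'?><a>caf\xe9&#8230;</a>"
text = ET.fromstring(document).text

print(len(utf8_bytes), fits_in_latin1, len(text))  # 3 False 5
```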
  • Peter Hansen at Feb 10, 2004 at 9:29 pm

    "Martin v. Löwis" wrote:
    Peter Hansen wrote:
    See my reply to Brian Q... I can accept that, but read more into Uche's
    objection than he probably meant. It now seems rather over the top to
    write an article lambasting the product for something that would normally
    just be considered a simple bug/defect in the software, unless the
    maintainers have in effect refused to fix it.
    If a parser prints a message "not a valid 8-bit XML character", then it
    can't be an XML parser. XML does not have a notion even remotely related
    to "8-bit characters".
    Aren't we talking about semantics now, and rather tritely at that? So
    don't call it an XML parser, yet. Wait until it no longer emits that
    message, whatever you want. I'd call that throwing out the baby with the
    bath-water, if it means (a) nobody reports that as a bug, and (b) someone
    who could benefit from it even with that defect fails to do so because
    it's illegal and immoral to label it an "XML parser", much as it might
    want to be one, until it's perfect.

    -Peter
  • Brian Quinlan at Feb 10, 2004 at 9:52 pm

    Peter Hansen wrote:
    Aren't we talking about semantics now, and rather tritely at that? So
    don't call it an XML parser, yet. Wait until it no longer emits that
    message, whatever you want. I'd call that throwing out the baby with the
    bath-water, if it means (a) nobody reports that as a bug, and (b) someone
    who could benefit from it even with that defect fails to do so because
    it's illegal and immoral to label it an "XML parser", much as it might
    want to be one, until it's perfect.
    I don't think that anyone is arguing against pyRXP per se; I'm sure it has
    its niche among people working with certain XML subsets.

    We are arguing that the fact that pyRXP even has a concept of an 8-bit XML
    character means that it is probably pretty far from being a real XML parser.

    Cheers,
    Brian
  • Martin v. Löwis at Feb 11, 2004 at 5:16 pm

    Peter Hansen wrote:
    If a parser prints a message "not a valid 8-bit XML character", then it
    can't be an XML parser. XML does not have a notion even remotely related
    to "8-bit characters".

    Aren't we talking about semantics now, and rather tritely at that?
    Not at all. This message indicates a lack of understanding of basic
    principles of XML on the side of the authors of this error message.
    One of the beauties of XML is interoperability and portability. Any
    implementation that chooses to subset XML will sooner or later learn
    that the mere idea of subsetting XML is misguided, and will give
    up all limitations. Any user of such an implementation will get bitten
    by the limitations sooner or later, and it is telling that Uche got
    bitten at the very first document that he passed to the tool.

    My favorite example of a (successful) correction of attitude is
    XML-RPC. The original XML-RPC spec said that the string type is
    used to represent "ASCII strings". People were asking whether this
    is a constraint on the XML documents, and David Winer was responding
    that you can use "full XML" in XML-RPC, with no restrictions. People
    then were asking what else he meant by "ASCII strings", and eventually,
    he simply removed the "ASCII" classification in the spec, thereby
    allowing all characters that are allowed in XML.
    I'd call that throwing out the baby with the
    bath-water, if it means (a) nobody reports that as a bug, and (b) someone
    who could benefit from it even with that defect fails to do so because
    it's illegal and immoral to label it an "XML parser", much as it might
    want to be one, until it's perfect.
    I don't use that package, so I'm in no position to report bugs on it.
    With what I know now, I would discourage usage of the package at its
    current state even for people who "could benefit from it".

    Regards,
    Martin
  • Peter Hansen at Feb 11, 2004 at 5:43 pm

    "Martin v. Löwis" wrote:
    Peter Hansen wrote:
    If a parser prints a message "not a valid 8-bit XML character", then it
    can't be an XML parser. XML does not have a notion even remotely related
    to "8-bit characters".

    Aren't we talking about semantics now, and rather tritely at that?
    Not at all. This message indicates a lack of understanding of basic
    principles of XML on the side of the authors of this error message.
    You missed the point I was trying to make Martin, though perhaps it
    is clear now from other messages.

    Basically, if I tried to write an XML parser, and distributed it as
    such, and you found that it didn't handle this use case, you would
    be somewhat unfair to write an article claiming that my XML parser was
    in fact not an XML parser. It has a bug in it, that's all. I'd fix it!

    The fact that the PyRXP maintainers have apparently refused to fix this
    problem *does* justify the complaint Uche is making, and I support him
    in that now. I wasn't aware that anyone had even tried reporting the
    problem in the first place and was objecting to an apparent overreaction.

    If the presence of a bug in a program means that one cannot label the
    program as being what it is intended to be, then all software would have
    to be released with disclaimers like "this is supposed to be a Python
    interpreter, and might be someday, but isn't yet because there are some
    rare cases where it doesn't correctly interpret Python".

    -still-waiting-for-a-real-python-interpreter-to-be-released-ly y'rs,
    Peter
  • Alan Kennedy at Feb 11, 2004 at 7:39 pm
    [Martin v. Löwis]
    If a parser prints a message "not a valid 8-bit XML character",
    then it can't be an XML parser. XML does not have a notion even
    remotely related to "8-bit characters".
    [Peter Hansen]
    Aren't we talking about semantics now, and rather tritely at that?
    [Martin v. Löwis]
    Not at all. This message indicates a lack of understanding of basic
    principles of XML on the side of the authors of this error message.
    [Peter Hansen]
    You missed the point I was trying to make Martin, though perhaps it
    is clear now from other messages.

    Basically, if I tried to write an XML parser, and distributed it as
    such, and you found that it didn't handle this use case, you would
    be somewhat unfair to write an article claiming that my XML parser was
    in fact not an XML parser. It has a bug in it, that's all. I'd fix it!
    I think that the point Martin is making (and one which I wouldn't dare
    disagree with him on ;-) is that the unwillingness to comply 100% with
    the XML spec was a *design decision* on behalf of the PyRXP authors,
    not a bug.

    Choosing to actively ignore parts of a standard eliminates the right
    to claim standards-compliance, IMHO. Standards are there for the
    express purpose of encouraging interoperability. If a software
    designer wants to sacrifice a part of that standard for performance
    reasons, or reasons of code complexity, testing difficulty, etc, then
    their software is not a complete implementation of the standard, and
    should not claim to be so.

    happy-with-the-real-but-possibly-flawed-python-interpreters-i-have-ly
    y'rs,

    --
    alan kennedy
    ------------------------------------------------------
    check http headers here: http://xhaus.com/headers
    email alan: http://xhaus.com/contact/alan
  • Peter Hansen at Feb 11, 2004 at 7:52 pm

    Alan Kennedy wrote:
    I think that the point Martin is making (and one which I wouldn't dare
    disagree with him on ;-) is that the unwillingness to comply 100% with
    the XML spec was a *design decision* on behalf of the PyRXP authors,
    not a bug.
    Well, if that were the case, it would have helped if someone, anyone,
    had said so at the time they said "not an XML parser". Perhaps I should
    have inferred earlier, from such comments, that they knew that this was
    a firm decision by the authors, and not a simple bug.
    Choosing to actively ignore parts of a standard eliminates the right
    to claim standards-compliance, IMHO. Standards are there for the
    express purpose of encouraging interoperability. If a software
    designer wants to sacrifice a part of that standard for performance
    reasons, or reasons of code complexity, testing difficulty, etc, then
    their software is not a complete implementation of the standard, and
    should not claim to be so.
    Completely agreed!

    Hmm... makes me want to check their web site, to see what this is really
    about:

    '''RXP is a very fast validating XML parser written by Richard Tobin of
    the University of Edinburgh. It complies fully with the W3C test suites
    (although we have compiled it without Unicode support for the time being).
    We would like to thank Richard Tobin and Henry Thompson of the Language
    Technology Group for making this code available to the world.
    '''

    Seems pretty self-explanatory to me. Might even be why, when I downloaded
    and tried to use it (and got good results) a year or two ago, I had no
    qualms about using it. Clearly stated, and to the point, except that one
    is left to make the small connection between "compiled without Unicode
    support" and "doesn't handle character entities". (Or is it that it
    handles character entities, but not those beyond 127? Probably moot.)

    Doesn't this imply that anyone, at any time, could choose to recompile
    *with* Unicode support, which is presumably _in place_ but just optionally
    left out of the standard distribution?

    So it's neither a bug, nor a design decision, but a packaging choice.

    I think I'm back to saying that "not an XML parser!!!!" is a bit of an
    unfair reaction, given how open they are about the situation.

    -Peter
  • Anton Vredegoor at Feb 12, 2004 at 11:02 am

    Peter Hansen wrote:
    I think I'm back to saying that "not an XML parser!!!!" is a bit of an
    unfair reaction, given how open they are about the situation.
    Somehow this thread reminds me of a certain Monty Python scene
    involving the sale of a parrot ;-)

    Anton
  • Alan Kennedy at Feb 12, 2004 at 11:49 am
    [Anton Vredegoor]
    Somehow this thread reminds me of a certain Monty Python scene
    involving the sale of a parrot ;-)
    Nice one Anton :-D

    For those unfamiliar with the Parrot sketch,

    http://www.jumpstation.ca/recroom/comedy/python/petshop.html

    For those who already know it, I just had to point this out

    http://www.wackyplanet.com/monpytdeadpa.html

    it's-not-dead-it's-pining-for-iso-10646-ly y'rs.

    --
    alan kennedy
    ------------------------------------------------------
    check http headers here: http://xhaus.com/headers
    email alan: http://xhaus.com/contact/alan
  • Uche Ogbuji at Feb 12, 2004 at 2:29 pm

    Peter Hansen:

    Hmm... makes me want to check their web site, to see what this is really
    about:

    '''RXP is a very fast validating XML parser written by Richard Tobin of
    the University of Edinburgh. It complies fully with the W3C test suites
    (although we have compiled it without Unicode support for the time being).
    We would like to thank Richard Tobin and Henry Thompson of the Language
    Technology Group for making this code available to the world.
    '''

    Seems pretty self-explanatory to me. Might even be why, when I downloaded
    and tried to use it (and got good results) a year or two ago, I had no
    qualms about using it. Clearly stated, and to the point, except that one
    is left to make the small connection between "compiled without Unicode
    support" and "doesn't handle character entities". (Or is it that it
    handles character entities, but not those beyond 127? Probably moot.)

    Doesn't this imply that anyone, at any time, could choose to recompile
    *with* Unicode support, which is presumably _in place_ but just optionally
    left out of the standard distribution?

    So it's neither a bug, nor a design decision, but a packaging choice.

    I think I'm back to saying that "not an XML parser!!!!" is a bit of an
    unfair reaction, given how open they are about the situation.
    *sigh*. I don't know how many more times and ways I can say this. One
    more time and I'm done unless a new, salient point comes up.

    There *is* a packaging of PyRXP that is XML compliant. It's called
    PyRXPU. It is precisely a compiling of PyRXP with Unicode support
    plus output of Unicode objects in the resulting data structure (which
    is my recommendation for XML processing).

    So once more: AFAICT PyRXPU is an XML parser. PyRXP is certainly not
    an XML parser. The substrate RXP is not an XML parser either when
    compiled without Unicode support and although I respect Thompson and
    Tobin as much as I do the PyRXP developers, they were really confusing
    themselves and others when they said "It complies fully with the W3C
    test suites (although we have compiled it without Unicode support for
    the time being)."

    Several times early on when this issue was brought up, the PyRXP
    developers in effect said approximately: We need it to be fast, so we
    won't be doing anything to make it conformant because we know doing so
    would slow it down. This is a pretty poisonous attitude when claiming
    to support a standard, and what makes this even worse is that the
    PyRXP Web page starts out saying:

    "...PyRXP...the fastest validating XML parser available for Python,
    and quite possibly anywhere :-)."

    And then goes on to justify that statement with a "benchmark" of PyRXP
    against other XML parsers without mentioning the inconvenient fact
    that PyRXP is *not* an XML parser, and that building it so that it is
    would drop it in the benchmarks somewhat. (Not that I know who should
    really care because unless you're using 4DOM or minidom all the
    options are in the same order of magnitude: if you want to wring out
    the last odd drop of CPU--and you probably don't need to--then you
    should be using neither XML nor Python).

    Are you seriously telling me that in the face of all this, my
    criticism, strongly worded as it is, is unfair?

    My main aim here is to make it well known that PyRXP is not an XML
    parser. It won't trouble me if people continue to use it as currently
    packaged. I just want to make sure they know they are not using what
    they may think they are.

    Once again: PyRXPU (contributed, tellingly, by someone outside the
    PyRXP core team) is the right build of PyRXP if you need an XML
    parser. The bad news is that it's only available from ReportLab CVS.
    My article is now out and includes details for obtaining PyRXPU:

    http://www.xml.com/pub/a/2004/02/11/py-xml.html

    --Uche
    http://uche.ogbuji.net
  • Peter Hansen at Feb 12, 2004 at 6:38 pm
    Uche Ogbuji wrote:
    [snip]
    Are you seriously telling me that in the face of all this, my
    criticism, strongly worded as it is, is unfair? [snip]
    The bad news is that it's only available from ReportLab CVS.
    My article is now out and includes details for obtaining PyRXPU:
    Let's just say that by mentioning pyRXPU as an alternative packaging
    which does support Unicode, and by saying after testing it out that
    "It's good to have this confidence that PyRXPU is a conforming XML parser,"
    I don't have any serious complaint about the criticism.

    Nice article, by the way. And thanks for taking the time to research and
    write it!

    -Peter
  • Richard Tobin at Mar 11, 2004 at 11:59 pm

    So once more: AFAICT PyRXPU is an XML parser. PyRXP is certainly not
    an XML parser. The substrate RXP is not an XML parser either when
    compiled without Unicode support and although I respect Thompson and
    Tobin as much as I do the PyRXP developers, they were really confusing
    themselves and others when they said "It complies fully with the W3C
    test suites (although we have compiled it without Unicode support for
    the time being)."
    Sorry to respond to a thread long after its sell-by date.

    Just for the record, the statement above was made by the PyRXP people,
    not us. RXP's 8-bit mode exists because it was originally written to
    replace a "normalized SGML" parser in an existing (8-bit) application.
    I wouldn't recommend compiling it in that mode for any except the most
    constrained applications.

    -- Richard
  • Robin Becker at Mar 12, 2004 at 8:41 am
    In article <c2qugb$1qlh$1 at pc-news.cogsci.ed.ac.uk>, Richard Tobin
    <richard at cogsci.ed.ac.uk> writes
    So once more: AFAICT PyRXPU is an XML parser. PyRXP is certainly not
    an XML parser. The substrate RXP is not an XML parser either when
    compiled without Unicode support and although I respect Thompson and
    Tobin as much as I do the PyRXP developers, they were really confusing
    themselves and others when they said "It complies fully with the W3C
    test suites (although we have compiled it without Unicode support for
    the time being)."
    Sorry to respond to a thread long after its sell-by date.

    Just for the record, the statement above was made by the PyRXP people,
    not us. RXP's 8-bit mode exists because it was originally written to
    replace a "normalized SGML" parser in an existing (8-bit) application.
    I wouldn't recommend compiling it in that mode for any except the most
    constrained applications.

    -- Richard
    I guess that's why there are now two versions pyRXP & pyRXPU. Personally
    I find XML pretty awful as a lingua franca, but everyone else seems to
    think it's the new sliced bread. As to whether RXP is a good 'parser' my
    latest test of pyRXPU (RXP 1.3.0) with James Clark's XML Test Cases
    version 1998-11-18 gives

    Ran 373 tests in 1.282s

    FAILED (failures=2, errors=1)

    Clearly I have some way to go and we'll have to work harder.
    --
    Robin Becker
  • Uche Ogbuji at Mar 15, 2004 at 4:29 am
    richard at cogsci.ed.ac.uk (Richard Tobin) wrote in message news:<c2qugb$1qlh$1 at pc-news.cogsci.ed.ac.uk>...
    So once more: AFAICT PyRXPU is an XML parser. PyRXP is certainly not
    an XML parser. The substrate RXP is not an XML parser either when
    compiled without Unicode support and although I respect Thompson and
    Tobin as much as I do the PyRXP developers, they were really confusing
    themselves and others when they said "It complies fully with the W3C
    test suites (although we have compiled it without Unicode support for
    the time being)."
    Sorry to respond to a thread long after its sell-by date.

    Just for the record, the statement above was made by the PyRXP people,
    not us.
    My apologies for the misattribution.

    --Uche


  • Stuart Bishop at Feb 12, 2004 at 11:31 am

    On 12/02/2004, at 4:43 AM, Peter Hansen wrote:

    Not at all. This message indicates a lack of understanding of basic
    principles of XML on the side of the authors of this error message.
    You missed the point I was trying to make Martin, though perhaps it
    is clear now from other messages.

    Basically, if I tried to write an XML parser, and distributed it as
    such, and you found that it didn't handle this use case, you would
    be somewhat unfair to write an article claiming that my XML parser was
    in fact not an XML parser. It has a bug in it, that's all. I'd fix
    it!
    There is a big difference between a bug and not implementing
    a large chunk of the spec for performance.
    The fact that the PyRXP maintainers have apparently refused to fix this
    problem *does* justify the complaint Uche is making, and I support him
    in that now. I wasn't aware that anyone had even tried reporting the
    problem in the first place and was objecting to an apparent
    overreaction.
    This isn't quite correct. ReportLab were more than happy to
    take my patch on board, and if you install pyRXP from CVS you
    get *two* parsers (pyRXP and pyRXPU). I personally don't
    consider pyRXP an XML parser, as it is deliberately lacking what
    is in my opinion the single most useful feature of XML (unambiguous
    and universal Unicode support). For what Reportlab and others use
    it for, however, it is quite sufficient.

    It is unfortunate that the release on ReportLab's web site
    is way out of date (possibly over a year), but they may have
    more pressing concerns, or possibly nobody has even asked.
    Not that it is really *their* problem - the code is GPL and
    *anybody* with the time could release a fresh package.
    If the presence of a bug in a program means that one cannot label the
    program as being what it is intended to be, then all software would
    have
    to be released with disclaimers like "this is supposed to be a Python
    interpreter, and might be someday, but isn't yet because there are some
    rare cases where it doesn't correctly interpret Python".
    Classifying a lack of Unicode support in an XML parser as a
    'bug' is ridiculous. I must admit that I was a bit miffed when
    I first tried out 'the fastest validating XML parser around'
    and found out it couldn't validate XHTML, ONIX or anything
    except trivial examples I coded up - I got the impression that
    someone who didn't understand XML had gotten a bit overexcited.
    When I looked into it further, it became clear that pyRXP was
    not what I would call an XML parser *by design* and that
    ReportLab's definition of XML was slightly more flexible than
    mine :-)


    --
    Stuart Bishop <stuart at stuartbishop.net>
    http://www.stuartbishop.net/
  • Uche Ogbuji at Feb 12, 2004 at 7:30 pm
    Stuart Bishop <stuart.b at commonground.com.au> wrote in message news:<mailman.6.1076585559.698.python-list at python.org>...

    On 12/02/2004, at 4:43 AM, Peter Hansen wrote:

    The fact that the PyRXP maintainers have apparently refused to fix this
    problem *does* justify the complaint Uche is making, and I support him
    in that now. I wasn't aware that anyone had even tried reporting the
    problem in the first place and was objecting to an apparent
    overreaction.
    This isn't quite correct. ReportLab were more than happy to
    take my patch on board, and if you install pyRXP from CVS you
    get *two* parsers (pyRXP and pyRXPU). I personally don't
    consider pyRXP an XML parser, as it is deliberately lacking what
    is in my opinion the single most useful feature of XML (unambiguous
    and universal Unicode support). For what Reportlab and others use
    it for, however, it is quite sufficient.

    It is unfortunate that the release on ReportLab's web site
    is way out of date (possibly over a year), but they may have
    more pressing concerns, or possibly nobody has even asked.
    Not that it is really *their* problem - the code is GPL and
    *anybody* with the time could release a fresh package.
    First of all, Stuart, I'd like to thank you greatly for PyRXPU. It
    was very much needed and provides another truly compliant option for
    Python-XML processing. The more the merrier but having non-conformant
    options such as the default packaging of PyRXP is IMO detrimental.

    I too hope that PyRXPU becomes the default in future packaging, and
    that PyRXP is clearly marked as a super-fast option for those who
    don't really need a true XML parser. Not that I know why such a group
    would not just use plain old super-super-fast delimited ASCII.

    I did point out PyRXPU multiple times in this thread and credited you
    for it.

    If the presence of a bug in a program means that one cannot label the
    program as being what it is intended to be, then all software would have
    to be released with disclaimers like "this is supposed to be a Python
    interpreter, and might be someday, but isn't yet because there are some
    rare cases where it doesn't correctly interpret Python".
    Classifying a lack of Unicode support in an XML parser as a
    'bug' is ridiculous.
    My point exactly. It would be like calling it a "bug" if something
    that called itself Python only accepted tabs for indentation rather
    than spaces. Such a thing just simply wouldn't be Python.

    I must admit that I was a bit miffed when
    I first tried out 'the fastest validating XML parser around'
    and found out it couldn't validate XHTML, ONIX or anything
    except trivial examples I coded up - I got the impression that
    someone who didn't understand XML had gotten a bit overexcited.
    When I looked into it further, it became clear that pyRXP was
    not what I would call an XML parser *by design* and that
    ReportLab's definition of XML was slightly more flexible than
    mine :-)
    And we're grateful for PyRXPU which does meet the only valid
    definition of XML. :-)

    --Uche
    http://uche.ogbuji.net
  • Paul Boddie at Feb 13, 2004 at 8:49 am
    uche at ogbuji.net (Uche Ogbuji) wrote in message news:<d116fbae.0402121130.6561b358 at posting.google.com>...
    Stuart Bishop <stuart.b at commonground.com.au> wrote in message news:<mailman.6.1076585559.698.python-list at python.org>...
    Classifying a lack of Unicode support in an XML parser as a
    'bug' is ridiculous.
    My point exactly. It would be like calling it a "bug" if something
    that called itself Python only accepted tabs for indentation rather
    than spaces. Such a thing just simply wouldn't be Python.
    I'm not sure that I can accept that particular analogy, and at least
    there would be reasonable workarounds to get one's space-indented code
    working under tab-only Python. A better analogy would be a version of
    Python which only accepted identifiers in lower case, or which was
    case-insensitive.

    Meanwhile, having actually read your article (so it isn't just
    speculation about what you've said on my part) I think you were
    absolutely justified in pointing this issue out with PyRXP. There are
    always going to be various "appendage measurement" competitions when
    it comes to XML toolkit performance, but in my mind there are few
    things worse in programming than suffering various elevated claims
    about a piece of software whilst having to work up against some very
    serious limitations. In any case, libxml2 is surely the big winner
    when it comes to conformance plus performance, anyway.

    Paul
  • Stuart Bishop at Feb 13, 2004 at 3:52 am

    On 13/02/2004, at 3:20 AM, Peter Hansen wrote:

    Do you mean it doesn't handle additional things that are part of
    the core XML standard? Or are we still talking about *just* the
    lack of Unicode? I thought what I read on their web site made
    it clear that it *does* support Unicode, but it was simply
    not enabled in the current compilation. If that's true, then
    clearly it *is* an XML parser *by design*.
    The RXP library can be compiled in 8bit or 16bit mode.
    The 0.9 release of pyRXP built the RXP library in 8 bit mode,
    and the Python glue was 8 bit only. Building RXP in 16bit mode
    with the 8bit-only glue, unsurprisingly, caused everything to
    fail miserably.

    The patch I submitted (integrated sometime around 0.96 I think
    - no official release) involved updating the glue so that Python
    could happily get Unicode objects from RXP's 16bit representation
    (RXP works internally in UTF16), and a disgusting hack to setup.py
    so that it would build both the 8bit and Unicode versions of the
    library and wrapper from the same source.
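
    The gap between the 8-bit and 16-bit builds can be illustrated without
    RXP at all. The sketch below is plain Python 3 (an assumption; the thread
    predates it) and uses only the built-in codecs. It shows that an 8-bit
    encoding such as Latin-1 cannot hold every character an XML document may
    contain, while UTF-16, the representation RXP is described above as using
    internally, round-trips cleanly to a Python Unicode string. The sample
    text and the helper name fits_in_8_bits are made up for illustration.

```python
# Sample text with one character inside Latin-1 (ö) and one outside it (€).
SAMPLE = "L\u00f6wis pays 5 \u20ac"


def fits_in_8_bits(text):
    """Return True if every character can be stored in one Latin-1 byte."""
    try:
        text.encode("latin-1")
        return True
    except UnicodeEncodeError:
        return False


# UTF-16 can carry the whole string, and decoding restores the same text.
utf16_bytes = SAMPLE.encode("utf-16-le")
round_tripped = utf16_bytes.decode("utf-16-le")

print(fits_in_8_bits("L\u00f6wis"))  # the ö alone is representable
print(fits_in_8_bits(SAMPLE))        # the € is not
print(round_tripped == SAMPLE)
```

    An 8-bit-only wrapper has nowhere to put the second kind of character,
    which is roughly why a separate Unicode build was needed.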

    --
    Stuart Bishop <stuart at stuartbishop.net>
    http://www.stuartbishop.net/
  • Jarek Zgoda at Feb 13, 2004 at 9:22 pm

    Stuart Bishop <stuart.b at commonground.com.au> pisze:

    Classifying a lack of Unicode support in an XML parser as a
    'bug' is ridiculous.
    Lack of Unicode support in an XML parser makes it unusable for most part
    of humanity. It's much worse than bug, it's a mistake.
  • Peter Hansen at Feb 13, 2004 at 10:30 pm

    Jarek Zgoda wrote:
    Stuart Bishop <stuart.b at commonground.com.au> pisze:
    Classifying a lack of Unicode support in an XML parser as a
    'bug' is ridiculous.
    Lack of Unicode support in an XML parser makes it unusable for most part
    of humanity. It's much worse than bug, it's a mistake.
    Most of humanity doesn't use XML, so that's silly.

    And for the rest of us that do, *many* don't use it for textual data,
    but merely as a standard structured way of representing simple non-Unicode
    data, and for that pyRXP is intended to be quite suitable.

    For those who need non-ASCII input, pyRXP is clearly documented as
    being unsuitable.

    -Peter
  • Jarek Zgoda at Feb 13, 2004 at 10:50 pm

    Peter Hansen <peter at engcorp.com> pisze:

    Classifying a lack of Unicode support in an XML parser as a
    'bug' is ridiculous.
    Lack of Unicode support in an XML parser makes it unusable for most part
    of humanity. It's much worse than bug, it's a mistake.
    Most of humanity doesn't use XML, so that's silly.
    Most of humanity doesn't use ASCII.
  • Rainer Deyke at Feb 13, 2004 at 11:04 pm

    Jarek Zgoda wrote:
    Peter Hansen <peter at engcorp.com> pisze:
    Most of humanity doesn't use XML, so that's silly.
    Most of humanity doesn't use ASCII.
    Most of humanity doesn't use computers.


    --
    Rainer Deyke - rainerd at eldwood.com - http://eldwood.com
  • Jarek Zgoda at Feb 13, 2004 at 11:11 pm

    Rainer Deyke <rainerd at eldwood.com> pisze:

    Most of humanity doesn't use XML, so that's silly.
    Most of humanity doesn't use ASCII.
    Most of humanity doesn't use computers.
    So... Should we write programs by drawing circles on the sand?

    (Just wanted to say that most of humanity can't read, but surely this
    isn't true in XXI century.)
  • Josiah Carlson at Feb 14, 2004 at 12:54 am

    Lack of Unicode support in an XML parser makes it unusable for
    most part of humanity. It's much worse than bug, it's a mistake.
    Most of humanity doesn't use XML, so that's silly.
    Most of humanity doesn't use ASCII.
    Most of humanity doesn't use computers.
    So... Should we write programs by drawing circles on the sand?
    Of course not. But your initial assertion that the library is useless
    to the majority of humanity, because it does not support unicode, is false.

    I find the library useless because I don't use XML (explicitly), either
    for storage or for IPC.

    As for your comment about ASCII... Last time I checked, TCP/IP was
    designed with the idea of 8-bit bytes and the ASCII character set (which
    is why you see references to NULL, \r, \n, etc.). A large portion of
    internet protocols (http, telnet, ftp, gopher, nntp, etc.), used for
    communicating over TCP/IP, also refer to the same ASCII character set.

    Considering the implementations of compilers for the C and C++
    programming languages, those operating systems written using C and C++,
    most likely have source code stored in ASCII (I doubt you could find a
    major OS with non-ASCII characters that is written in C/C++). This
    would include Microsoft, Linux, Apple, Sun, SGI, etc. I'll leave it up
    to you to come up with the use percentages.

    On the other hand, we could talk about embedded systems (which dwarfs
    the PC industry), but there you'll also find ASCII, because the
    compilers for the 8,16,32 bit processors in embedded systems, are
    sitting on some standard machine using Windows or *nix, both of which
    were written in C/C++, with source code stored in ASCII format.

    It is funny how ASCII is everywhere.
    - Josiah
  • JanC at Feb 14, 2004 at 3:07 am

    Josiah Carlson <jcarlson at nospam.uci.edu> schreef:

    It is funny how ASCII is everywhere.
    No: supersets of ASCII are everywhere.
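
    A small sketch of what "superset" means here, assuming Python 3 and its
    built-in codecs (the sample protocol line is invented): pure ASCII bytes
    decode to the same text under UTF-8, Latin-1 and cp1252, but the three
    stop agreeing as soon as a character outside ASCII appears.

```python
ASCII_LINE = "GET /index.html HTTP/1.0\r\n"
raw = ASCII_LINE.encode("ascii")

# Any pure-ASCII byte string decodes identically under these supersets.
SUPERSETS = ("utf-8", "latin-1", "cp1252")
same_everywhere = all(raw.decode(codec) == ASCII_LINE for codec in SUPERSETS)

# Outside the ASCII range the encodings differ from one another.
e_acute = "\u00e9"
utf8_form = e_acute.encode("utf-8")      # two bytes
latin1_form = e_acute.encode("latin-1")  # one byte

print(same_everywhere, utf8_form, latin1_form)
```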

    --
    JanC

    "Be strict when sending and tolerant when receiving."
    RFC 1958 - Architectural Principles of the Internet - section 3.9
  • Jarek Zgoda at Feb 14, 2004 at 8:24 am

    Josiah Carlson <jcarlson at nospam.uci.edu> pisze:

    Most of humanity doesn't use computers.
    So... Should we write programs by drawing circles on the sand?
    Of course not. But your initial assertion that the library is useless
    to the majority of humanity, because it does not support unicode, is false.

    I find the library useless because I don't use XML (explicitly), either
    for storage or for IPC.
    Sure, I didn't take into account other contexts of usability. ;)
  • Alan Kennedy at Feb 14, 2004 at 1:28 pm
    [Peter Hansen]
    For those who need non-ASCII input, pyRXP is clearly documented as
    being unsuitable.
    So they should remove the following false and misleading statement
    from their web page:

    "pyRXP version 0.9, the fastest validating XML parser available for
    Python, and quite possibly anywhere :-)"

    http://www.reportlab.org/pyrxp.html

    As we've clearly established, PyRXP is not an XML parser, PyRXPU is an
    XML parser. But is PyRXPU the "fastest validating XML parser"?

    They should place one of the following statements on the page

    "pyRXP version 0.9, the fastest validating not-XML parser available
    for Python, and quite possibly anywhere"

    or

    "pyRXPU version CVS_YYYY_MM_DD, is a validating XML parser for Python"

    --
    alan kennedy
    ------------------------------------------------------
    check http headers here: http://xhaus.com/headers
    email alan: http://xhaus.com/contact/alan
  • Uche Ogbuji at Feb 11, 2004 at 6:59 am
    Peter Hansen <peter at engcorp.com> wrote in message news:<40294322.7AF2C6D7 at engcorp.com>...
    "Martin v. Löwis" wrote:
    Peter Hansen wrote:
    Then there are very few XML parsers in the world, if one includes, say,
    namespaces and validation as part of XML.
    Namespaces are clearly *not* part of the XML recommendation (but part
    of the XML namespaces recommendation). Validation is optional in the
    XML recommendation. Character references are not.
    See my reply to Brian Q... I can accept that, but read more into Uche's
    objection than he probably meant. It now seems rather over the top to
    write an article lambasting the product for something that would normally
    just be considered a simple bug/defect in the software, unless the
    maintainers have in effect refused to fix it.
    I agree. I would not be making my point so strongly if I had not in
    fact read several times just such a refusal by the PyRXP authors to
    fix the non-conformance. When I first came across the error, my first
    instinct was to report the error and wait for a fix. Then I did some
    googling and saw that others had reported the error and the maintainers
    did in fact refuse to fix it.

    Recently Stuart Bishop contributed a conforming variant of PyRXP,
    called PyRXPU. PyRXPU is an XML parser. PyRXP is not. This
    point has not been made clearly enough by the PyRXP developers. I
    think it's great that ReportLab is contributing to the diversity of
    XML tools in Python and I encourage people to try PyRXPU.
    Unfortunately it is only available from ReportLab CVS, but I'd expect
    this will change soon, and for now at least it's easy to build.

    --Uche
    http://uche.ogbuji.net
  • Peter Hansen at Feb 10, 2004 at 8:13 pm

    Brian Quinlan wrote:
    Peter Hansen wrote:
    I have no problem with the existence of PyRXP. It just isn't an XML
    parser.
    Then there are very few XML parsers in the world, if one includes, say,
    namespaces and validation as part of XML.
    An XML parser must be able to parse all (1) well-formed XML documents. The
    W3C XML recommendation provides a definition for well-formed XML documents,
    and can be found here:
    http://www.w3.org/TR/2004/REC-xml-20040204/

    (1) I'll qualify this a bit since the size of XML documents is unbounded
    and computer resources are not.
    Okay, granted. I'll withdraw my comments. I read too much into
    Uche Ogbuji's objections, thinking he had problems with PyRXP on more
    fronts than just the character entities question. If that's the only
    defect in PyRXP, I suspect it's still somewhat ahead of a lot of other
    "XML parsers" anyway, though clearly imperfect and arguably not yet
    deserving of the label "XML parser".

    I wonder whether this has been reported as a bug to Reportlab, however.
    Maybe they simply didn't happen to have a test case that covered that
    particular use of the parser... and they might be happy to fix it.
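
    A test case of the sort mentioned here could be as small as the sketch
    below. It is written against the standard library's
    xml.etree.ElementTree rather than pyRXP (an assumption, so that the
    example runs on any Python 3 install); the sample document and the
    helper name text_of_root are invented for illustration. The document is
    pure ASCII on the wire, yet a conforming parser has to turn the
    character reference into a character that does not fit in 8 bits.

```python
import xml.etree.ElementTree as ET

# Well-formed XML whose only non-ASCII content is a character reference.
DOC = b'<?xml version="1.0" encoding="utf-8"?><price>5 &#x20AC;</price>'


def text_of_root(document):
    """Parse the document and return the text of its root element."""
    return ET.fromstring(document).text


parsed = text_of_root(DOC)

# &#x20AC; should come back as the single character U+20AC (euro sign).
print(repr(parsed))
```

    Pointing the same document at another parser and comparing the result
    with "5 \u20ac" is one way to check this particular conformance point.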

    -Peter
  • Chris Herborth at Feb 16, 2004 at 1:16 pm

    Paulo Pinto wrote:

    does anyone know of a Python package that
    is able to load XML like the XML::Simple
    Perl package does?
    Despite all of the, uh, _discussion_ in this thread, I'd like to thank you
    folks for pointing out pyRXP... I hadn't found that before, and if I can
    whip up a pyRXP -> DOM2 translator, it will fit my needs _perfectly_.

    Thanks!

    --
    Chris Herborth chrish at cryptocard.com
    Documentation Overlord, CRYPTOCard Corp. http://www.cryptocard.com/
    Never send a monster to do the work of an evil scientist.
  • Paul Boddie at Feb 17, 2004 at 9:39 am
    Chris Herborth <chrish at cryptocard.com> wrote in message news:<Wj3Yb.3914$Cd6.174692 at news20.bellglobal.com>...
    Despite all of the, uh, _discussion_ in this thread, I'd like to thank you
    folks for pointing out pyRXP... I hadn't found that before, and if I can
    whip up a pyRXP -> DOM2 translator, it will fit my needs _perfectly_.
    Well, if it is true what people claim about dictionaries and tuples
    being faster than objects, then you may see any supposed performance
    advantage claimed by the PyRXP proponents just dissolve away as you
    instantiate all those nodes. But as I noted with respect to "double
    wrapping" libxml2, if you can restrict yourself to very few high-level
    operations through those layers, and then invoke various "native"
    methods directly, then it could still be worth it.
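
    To make the cost concrete, here is a rough sketch of such a wrapping
    layer in Python 3. It assumes the four-element tuple layout (name,
    attributes, children, spare) that the pyRXP documentation describes for
    its tuple tree; the sample tree is written by hand rather than produced
    by pyRXP, and Node and to_nodes are hypothetical names, not part of any
    library or of DOM.

```python
class Node:
    """Minimal element node for illustration; not a DOM implementation."""

    __slots__ = ("name", "attrs", "children")

    def __init__(self, name, attrs, children):
        self.name = name
        self.attrs = attrs
        self.children = children


def to_nodes(item):
    """Recursively wrap a tuple tree; text children (strings) pass through."""
    if isinstance(item, str):
        return item
    name, attrs, children, _spare = item
    return Node(name, attrs or {}, [to_nodes(child) for child in children or []])


# Hand-written sample in the assumed layout, standing in for parser output.
TREE = ("doc", {"lang": "en"}, [("title", None, ["Hello"], None), " tail"], None)

root = to_nodes(TREE)
print(root.name, root.attrs, root.children[0].children)
```

    Every element costs one extra object allocation on top of the tuple the
    parser already built, which is the overhead being described.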

    Paul

Related Discussions