Hi All,


Is there any module available in the Python standard library for XML binding? If not, are there any other suggestions?


Which is better for parsing large files?
1. XML binding
2. Creating our own classes




Thanks,
Palpandi


  • Burak Arslan at Sep 3, 2015 at 7:54 pm
    Hello,

    On 09/03/15 19:54, Palpandi wrote:
    > Hi All,
    >
    > Is there any module available in the Python standard library for XML binding? If not, are there any other suggestions?

    lxml is the right XML library to use. You can use lxml's objectify or Spyne.


    Here are some examples:


    http://stackoverflow.com/questions/19545067/python-joining-and-writing-xml-etrees-trees-stored-in-a-list

    > Which is better for parsing large files?
    > 1. XML binding
    > 2. Creating our own classes

    If you're dealing with huge files, I suggest using just lxml and working
    with the raw data. Deserializing XML into Python classes sure is nicer,
    but it has a performance overhead that becomes more and more visible as
    the amount of data you deal with grows.


    Best,
    Burak
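
One way to follow Burak's advice for huge files is to stream the document instead of building the whole tree. The sketch below uses `iterparse` from the standard library's `xml.etree.ElementTree`; `lxml.etree` provides an `iterparse` that accepts the same arguments used here. The `<log>`/`<entry>` layout, the `size` attribute, and the `total_size` helper are made up for illustration.

```python
import io
import xml.etree.ElementTree as ET  # lxml.etree.iterparse accepts the same arguments used here


def total_size(source, tag="entry"):
    """Sum the hypothetical 'size' attribute of every <entry>, one element at a time."""
    total = 0
    for _event, elem in ET.iterparse(source, events=("end",)):
        if elem.tag == tag:
            total += int(elem.get("size", "0"))
            # Release the element's attributes, text, and children once it has been read.
            elem.clear()
    return total


# A tiny in-memory stand-in for a large file on disk.
sample = io.BytesIO(b"<log><entry size='10'/><entry size='32'/></log>")
print(total_size(sample))
```

The emptied elements stay attached to the root, so for very long documents it may also be worth removing them from their parent; that detail is left out here.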
  • Lorenzo Sutton at Sep 4, 2015 at 8:11 am
    Hi,

    On 03/09/2015 21:54, Burak Arslan wrote:
    > Hello,
    > On 09/03/15 19:54, Palpandi wrote:
    >> Hi All,
    >>
    >> Is there any module available in the Python standard library for XML binding? If not, are there any other suggestions?
    > lxml is the right XML library to use. You can use lxml's objectify or Spyne.

    I second lxml.


    [...]
    >> Which is better for parsing large files?

    How large is large?


    I have used lxml (coupled with pygtk) with very good results on XML
    files up to around 250 MB.


    Lorenzo.
  • Palpandi at Sep 4, 2015 at 5:21 am
    Thanks Burak.


    lxml is good, but it does not support Python 2.5. Is there any other option?
  • Chris Angelico at Sep 4, 2015 at 5:36 am

    On Fri, Sep 4, 2015 at 3:21 PM, Palpandi wrote:
    > Thanks Burak.
    >
    > lxml is good, but it does not support Python 2.5. Is there any other option?

    The latest version isn't. But PyPI has an older version which is:


    https://pypi.python.org/pypi/lxml/3.3.6


    You should be able to install that into a Python 2.5. Though if you
    possibly can, I would recommend upgrading to 2.7.


    ChrisA
  • Laura Creighton at Sep 4, 2015 at 6:46 am

    In a message of Thu, 03 Sep 2015 22:21:29 -0700, Palpandi writes:
    > Thanks Burak.
    >
    > lxml is good, but it does not support Python 2.5. Is there any other option?
    > --
    > https://mail.python.org/mailman/listinfo/python-list

    Check which Python you have. If it is 2.6 or more recent, use lxml.
    If you have 2.5, use the slower ElementTree.
    https://docs.python.org/2/library/xml.etree.elementtree.html


    If you pay attention to this
    http://lxml.de/compatibility.html
    you can mostly get away with writing your code once.


    Laura
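
The "write your code once" idea usually comes down to a guarded import, so the same code runs on lxml where it is installed and on the standard library's ElementTree elsewhere. A minimal sketch, assuming only the API subset that both share; the `<library>` document and the `titles` helper are invented for the example.

```python
try:
    from lxml import etree  # used when lxml is installed
except ImportError:
    import xml.etree.ElementTree as etree  # standard-library fallback


def titles(xml_text):
    """Return the text of every <title> under <book>, using calls both libraries provide."""
    root = etree.fromstring(xml_text)
    return [node.text for node in root.findall("book/title")]


doc = "<library><book><title>A</title></book><book><title>B</title></book></library>"
print(titles(doc))
```

Anything outside the shared subset (XSD validation or full XPath, for example) would still need lxml, so code written this way has to stay within what ElementTree offers.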
  • Laura Creighton at Sep 4, 2015 at 6:54 am

    In a message of Fri, 04 Sep 2015 08:46:33 +0200, Laura Creighton writes:
    > In a message of Thu, 03 Sep 2015 22:21:29 -0700, Palpandi writes:
    >> Thanks Burak.
    >>
    >> lxml is good, but it does not support Python 2.5. Is there any other option?
    > Check which Python you have. If it is 2.6 or more recent, use lxml.
    > If you have 2.5, use the slower ElementTree.
    > https://docs.python.org/2/library/xml.etree.elementtree.html
    >
    > If you pay attention to this
    > http://lxml.de/compatibility.html
    > you can mostly get away with writing your code once.
    >
    > Laura

    I didn't know about the old versions still available from pip. That
    is probably a better idea.


    Laura
  • Palpandi at Sep 7, 2015 at 1:42 pm
    Hi All,


    Is it better to use pyxb than lxml?


    What are the advantages of lxml and pyxb?


    Thanks,
    Palpandi
  • Dieter at Sep 9, 2015 at 8:20 am

    Palpandi <palpandi111@gmail.com> writes:


    > Is it better to use pyxb than lxml?
    >
    > What are the advantages of lxml and pyxb?

    "pyxb" has a different aim than "lxml".


    "lxml" is a general purpose library to process XML documents.
    It gives you an interface to the document's resources (elements,
    attributes, comments, processing instructions) on a low level,
    independent of the document type.


    "pyxb" is different: there, you start with an XML schema description.
    You use "pyxb" to generate Python bindings for this schema.
    With such a binding generated, "pyxb" can parse XML documents
    following a known XML schema into the corresponding binding.
    The binding objects expose child (XML) elements and (XML) attributes as
    attributes of the binding object. Thus, the Python interface
    (as defined by the binding) is highly dependent on the type (aka XML schema)
    of the document.




    I use "lxml" for either simple XML processing or when the XML documents
    are not described by an XML schema. I use "pyxb" when the XML documents
    have an associated complex schema and the processing is rather complex.
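
To make the contrast concrete, here is roughly what a schema-specific binding looks like from the caller's side. This is a hand-written stand-in built on the standard library, not pyxb's generated code or its API; the `Address` class, its fields, and the `<address>` document are hypothetical.

```python
import xml.etree.ElementTree as ET


class Address(object):
    """Hand-written stand-in for a generated binding class (hypothetical schema)."""

    def __init__(self, street=None, city=None, kind=None):
        self.street = street  # child element <street>
        self.city = city      # child element <city>
        self.kind = kind      # XML attribute kind="..."

    @classmethod
    def from_xml(cls, text):
        """Parse an <address> document into an Address object."""
        root = ET.fromstring(text)
        return cls(
            street=root.findtext("street"),
            city=root.findtext("city"),
            kind=root.get("kind"),
        )

    def to_xml(self):
        """Serialize the object back to an <address> document."""
        root = ET.Element("address")
        if self.kind is not None:
            root.set("kind", self.kind)
        ET.SubElement(root, "street").text = self.street
        ET.SubElement(root, "city").text = self.city
        return ET.tostring(root, encoding="unicode")


home = Address.from_xml(
    '<address kind="home"><street>1 Main St</street><city>Springfield</city></address>'
)
print(home.city, home.kind)
print(home.to_xml())
```

Code using `Address` never performs element or attribute lookups itself, which is the convenience a generated binding offers, at the cost of being tied to one document type.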
  • Stefan Behnel at Sep 9, 2015 at 9:44 am

    dieter wrote on 09.09.2015 at 10:20:
    > Palpandi writes:
    >> Is it better to use pyxb than lxml?
    >>
    >> What are the advantages of lxml and pyxb?
    > "pyxb" has a different aim than "lxml".
    >
    > "lxml" is a general purpose library to process XML documents.
    > It gives you an interface to the document's resources (elements,
    > attributes, comments, processing instructions) on a low level,
    > independent of the document type.

    lxml's toolbox is actually larger than that. There's also lxml.objectify
    which provides a Python object interface to the XML tree, similar to what
    data binding would give you. And you can stick your own Element object
    implementations into it if you feel a need to simplify the API itself
    and/or adapt it to a given document format.


    http://lxml.de/objectify.html


    Stefan
  • Dieter at Sep 10, 2015 at 6:30 am

    Stefan Behnel <stefan_ml@behnel.de> writes:


    > dieter wrote on 09.09.2015 at 10:20:
    >> Palpandi writes:
    >>> Is it better to use pyxb than lxml?
    >>>
    >>> What are the advantages of lxml and pyxb?
    >> "pyxb" has a different aim than "lxml".
    >>
    >> "lxml" is a general purpose library to process XML documents.
    >> It gives you an interface to the document's resources (elements,
    >> attributes, comments, processing instructions) on a low level,
    >> independent of the document type.
    > lxml's toolbox is actually larger than that. There's also lxml.objectify
    > which provides a Python object interface to the XML tree, similar to what
    > data binding would give you. And you can stick your own Element object
    > implementations into it if you feel a need to simplify the API itself
    > and/or adapt it to a given document format.
    >
    > http://lxml.de/objectify.html

    This is nice - but still quite far from the schema support of "pyxb".


    The "pyxb" binding generation generates a Python class for each type
    defined in the schema. You just instantiate such a class, populate
    the resulting object (in the normal Python way) and either use
    it in the construction of larger objects or serialize it as XML -- no
    need to worry about special construction ("objectify.DataElement",
    "objectify.SubElement", ...), no need to worry about XML namespaces.
  • Harirammanohar159 at Sep 9, 2015 at 9:00 am

    On Thursday, 3 September 2015 22:25:06 UTC+5:30, Palpandi wrote:
    > Hi All,
    >
    > Is there any module available in the Python standard library for XML binding? If not, are there any other suggestions?
    >
    > Which is better for parsing large files?
    > 1. XML binding
    > 2. Creating our own classes
    >
    > Thanks,
    > Palpandi

    Hey, you can use a standard library package itself; xml.etree.ElementTree works well.
  • Michele Simionato at Sep 11, 2015 at 6:47 am

    On Thursday, September 3, 2015 at 6:55:06 PM UTC+2, Palpandi wrote:
    > Hi All,
    >
    > Is there any module available in the Python standard library for XML binding? If not, are there any other suggestions?
    >
    > Which is better for parsing large files?
    > 1. XML binding
    > 2. Creating our own classes
    >
    > Thanks,
    > Palpandi

    I am one who has just abandoned lxml for xml.etree in the standard library. The reasons for that change were:


    1. we had issues compiling/installing lxml on different platforms
    2. we had instabilities from one version to the other for what we wanted to do (XML validation with an XSD schema)


    In the end we solved everything by replacing the XSD validation with a custom validation (which we had to do anyway). At that point the need for lxml disappeared and ElementTree did everything we wanted, except reporting the line number of the error for invalid files, which was easy to add. It was also smaller, easier to understand, and easier to customize. Its speed was more than enough.


    However I hear that everybody else is happy with lxml, so YMMV.




       Michele
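
Michele does not say how the line reporting was added. For the narrower case of input that is not well-formed, the standard library already reports a position: `ParseError` carries a `position` attribute holding a (line, column) pair. A small sketch, with `parse_or_report` as a hypothetical helper name; line numbers for custom validation errors would need extra work that is not shown here.

```python
import xml.etree.ElementTree as ET


def parse_or_report(text):
    """Return (root, None) on success, or (None, message) if the XML is not well-formed."""
    try:
        return ET.fromstring(text), None
    except ET.ParseError as err:
        line, column = err.position  # location where the parser reported the problem
        return None, "line %d, column %d: %s" % (line, column, err)


# The closing tag on the third line does not match the open <b>.
root, problem = parse_or_report("<a>\n<b>\n</a>")
print(problem)
```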

Discussion Overview
group: python-list
categories: python
posted: Sep 3, '15 at 4:54p
active: Sep 11, '15 at 6:47a
posts: 13
users: 9
website: python.org