Hi everyone,

I've got a little project that requires me to parse a simple XML
file. The file comes from the browser history of Apple's Safari and
includes the URL that was visited, the title of the Web page, the
date and time it was last visited, and the total number of times it
was visited. Here's a snippet:

<?xml version="1.0" encoding="UTF-8"?>
<!DOCTYPE plist PUBLIC "-//Apple Computer//DTD PLIST 1.0//EN" "http://www.apple.com/DTDs/PropertyList-1.0.dtd">
<plist version="1.0">
<dict>
<key>WebHistoryDates</key>
<array>
<dict>
<key></key>
<string>http://www.hopkins.k12.mn.us/pages/eisenhower/eisenhower.lasso</string>
<key>displayTitle</key>
<string>Welcome to Eisenhower Elementary</string>
<key>lastVisitedDate</key>
<string>156350702.0</string>
<key>title</key>
<string>Welcome to Eisenhower Elementary</string>
<key>visitCount</key>
<integer>285</integer>
</dict>
</array>
<key>WebHistoryFileVersion</key>
<integer>1</integer>
</dict>
</plist>

Let's say that instead of one <dict>, the XML file had 100 of them. I
want to generate a simple table of URLs and the dates they were last
visited. I can handle everything except parsing the XML file and
extracting the information.

Anyone have any pointers?

-Tim

--
Tim Wilson
Twin Cities, Minnesota, USA
Educational technology guy, Linux and OS X fan, Grad. student, Daddy
mailto: wilson at visi.com aim: tis270 blog: http://technosavvy.org


  • Kent Johnson at Dec 16, 2005 at 11:26 am

    Tim Wilson wrote:
    Hi everyone,

    I've got a little project that requires me to parse a simple XML
    file. The file comes from the browser history of Apple's Safari and
    includes the URL that was visited, the title of the Web page, the
    date and time it was last visited, and the total number of times it
    was visited. Here's a snippet:

    <?xml version="1.0" encoding="UTF-8"?>
    <!DOCTYPE plist PUBLIC "-//Apple Computer//DTD PLIST 1.0//EN" "http://www.apple.com/DTDs/PropertyList-1.0.dtd">
    <plist version="1.0"> <snip>
    </plist>

    Let's say that instead of one <dict>, the XML file had 100 of them. I
    want to generate a simple table of URLs and the dates they were last
    visited. I can handle everything except parsing the XML file and
    extracting the information.

    These look promising:
    http://online.effbot.org/2005_03_01_archive.htm#elementplist
    http://www.shearersoftware.com/software/developers/plist/

    though the empty <key></key> for the URL might be a problem. The "dict" entry in the
    effbot version could be changed to

        "dict": lambda x:
            dict((x[i].text or 'url', x[i+1].text) for i in range(0, len(x), 2)),

    so that an empty key becomes 'url'. That works as long as each <dict> has only one empty
    key and that key really is the URL.
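
To make the `or 'url'` trick concrete, here is a minimal sketch of a table-driven plist converter. It is not the effbot code: it is a recursive variant of the same idea, and the names `CONVERTERS`, `convert`, and `SAMPLE` are hypothetical. It covers only the plist tags that appear in the snippet, plus a few simple scalar types.

```python
import xml.etree.ElementTree as ET

# Trimmed copy of the snippet from the question (bytes, so the XML
# declaration with its encoding attribute is accepted as-is).
SAMPLE = b"""<?xml version="1.0" encoding="UTF-8"?>
<!DOCTYPE plist PUBLIC "-//Apple Computer//DTD PLIST 1.0//EN" "http://www.apple.com/DTDs/PropertyList-1.0.dtd">
<plist version="1.0">
<dict>
  <key>WebHistoryDates</key>
  <array>
    <dict>
      <key></key>
      <string>http://www.hopkins.k12.mn.us/pages/eisenhower/eisenhower.lasso</string>
      <key>lastVisitedDate</key>
      <string>156350702.0</string>
      <key>visitCount</key>
      <integer>285</integer>
    </dict>
  </array>
  <key>WebHistoryFileVersion</key>
  <integer>1</integer>
</dict>
</plist>
"""


def convert(elem):
    """Turn one plist element into a plain Python value."""
    return CONVERTERS[elem.tag](elem)


# Hypothetical converter table: one function per plist tag.
CONVERTERS = {
    "plist": lambda x: convert(x[0]),
    "array": lambda x: [convert(child) for child in x],
    # Children of <dict> alternate key, value, key, value, ...
    # An empty <key></key> has text None, so `or "url"` renames it.
    "dict": lambda x: dict(
        (x[i].text or "url", convert(x[i + 1])) for i in range(0, len(x), 2)
    ),
    "string": lambda x: x.text or "",
    "integer": lambda x: int(x.text),
    "real": lambda x: float(x.text),
    "true": lambda x: True,
    "false": lambda x: False,
}

history = convert(ET.fromstring(SAMPLE))
for entry in history["WebHistoryDates"]:
    print(entry["url"], entry["lastVisitedDate"])
```

As noted above, the renaming is only safe when each `<dict>` has exactly one empty key. With two empty keys the second value would silently overwrite the first.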

    If these don't work, you can use ElementTree to parse the file and walk the resulting
    tree yourself to pull out the data.
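
A minimal sketch of that do-it-yourself ElementTree walk, producing the URL/date table from the question. It rests on two assumptions that the thread does not confirm: that the empty key always marks the URL, and that `lastVisitedDate` counts seconds from 2001-01-01 00:00:00 UTC (the reference date Apple uses for its timestamps). `SAMPLE`, `MAC_EPOCH`, and `history_rows` are hypothetical names.

```python
import xml.etree.ElementTree as ET
from datetime import datetime, timedelta, timezone

# Assumption: lastVisitedDate is seconds since 2001-01-01 00:00:00 UTC.
MAC_EPOCH = datetime(2001, 1, 1, tzinfo=timezone.utc)

# Trimmed copy of the snippet from the question.
SAMPLE = b"""<?xml version="1.0" encoding="UTF-8"?>
<plist version="1.0">
<dict>
  <key>WebHistoryDates</key>
  <array>
    <dict>
      <key></key>
      <string>http://www.hopkins.k12.mn.us/pages/eisenhower/eisenhower.lasso</string>
      <key>lastVisitedDate</key>
      <string>156350702.0</string>
      <key>visitCount</key>
      <integer>285</integer>
    </dict>
  </array>
</dict>
</plist>
"""


def history_rows(root):
    """Yield (url, last_visited) for every history entry under <plist>."""
    # root is <plist>; each entry is a <dict> inside the top-level <array>.
    for entry in root.findall("./dict/array/dict"):
        children = list(entry)
        fields = {}
        # Pair each <key> with the value element that follows it.
        for key, value in zip(children[0::2], children[1::2]):
            fields[key.text or "url"] = value.text
        seconds = float(fields["lastVisitedDate"])
        yield fields["url"], MAC_EPOCH + timedelta(seconds=seconds)


rows = list(history_rows(ET.fromstring(SAMPLE)))
for url, when in rows:
    print(f"{when:%Y-%m-%d %H:%M}  {url}")
```

For a real history file, `ET.parse(path).getroot()` would replace `ET.fromstring(SAMPLE)`. Under the epoch assumption, the sample entry works out to 15 December 2005, which fits the date of this thread.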

    Kent

Discussion Overview
group: tutor
categories: python
posted: Dec 16, '05 at 6:18a
active: Dec 16, '05 at 11:26a
posts: 2
users: 2
website: python.org