Nov 12, 2007 at 7:36 pm
In the standard Python install (Windows 2.5, at least), there are a couple of example scripts you might find useful:
Crawls specified URL, checking for broken links.
Variant on the above that archives the specified site locally, including images, though you could probably limit it to HTML easily enough.
I haven't used either extensively, but they appear to work as advertised. It should be easy to modify one and tie it into the MySQLdb extension: http://sourceforge.net/projects/mysql-python
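If a single page is all that is needed, a minimal sketch using only the standard library might look like the following. It targets Python 3 rather than the 2.5 install mentioned above, and it is not taken from the bundled scripts. The names `fetch_html`, `extract_links`, `store_page`, and the `pages` table are assumptions made for illustration. `store_page` assumes a DB-API connection that uses the `%s` parameter style, as MySQLdb does.

```python
from html.parser import HTMLParser
from urllib.parse import urljoin
from urllib.request import urlopen


class LinkCollector(HTMLParser):
    """Collects the href of every <a> tag, resolved against base_url."""

    def __init__(self, base_url):
        super().__init__()
        self.base_url = base_url
        self.links = []

    def handle_starttag(self, tag, attrs):
        # HTMLParser lowercases tag and attribute names before calling this.
        if tag != "a":
            return
        for name, value in attrs:
            if name == "href" and value:
                self.links.append(urljoin(self.base_url, value))


def fetch_html(url, timeout=10):
    """Download one page and return its HTML as text."""
    with urlopen(url, timeout=timeout) as response:
        # Fall back to UTF-8 when the server does not declare a charset.
        charset = response.headers.get_content_charset() or "utf-8"
        return response.read().decode(charset, errors="replace")


def extract_links(html, base_url):
    """Return the absolute URLs of all links found in the HTML text."""
    parser = LinkCollector(base_url)
    parser.feed(html)
    parser.close()
    return parser.links


def store_page(conn, url, html):
    """Save one page through a DB-API connection.

    Assumes a hypothetical table pages(url, html) already exists and that
    the driver uses the %s parameter style (MySQLdb does).
    """
    cur = conn.cursor()
    cur.execute("INSERT INTO pages (url, html) VALUES (%s, %s)", (url, html))
    conn.commit()
    cur.close()
```

Calling `fetch_html` on the target URL gives the raw HTML, and passing that text to `extract_links` gives the outgoing links if the crawl ever needs to go further than one page. Passing parameters to `execute` separately, rather than building the SQL string by hand, leaves the quoting of the HTML to the driver.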
Technical Art Director
From: python-list-bounces+adam=volition-inc.com at python.org [mailto:python-list-bounces+adam=volition-inc.com at python.org] On Behalf Of Fabian López
Sent: Monday, November 12, 2007 12:33 PM
To: Python-list at python.org
Subject: crawler in python and mysql
I would like to write code that crawls a URL and retrieves all of its HTML. I have noticed that there are various open-source web crawlers, but they are far more extensive than what I need. I only need to crawl one URL, and I don't know whether it is as easy as using an HTML parser. Is it? Which libraries would you recommend?