Dear Pythoners,
I have written a Python application which authenticates a user, reads a
webpage, searches for patterns, and builds a database (in my case it's a
list of dictionaries, each with a fixed set of keys).
Entering the username and password for authentication and the final display
of the results are handled by a GUI.
I was able to isolate that the major chunk of run time is spent opening
web pages, reading from them, and extracting text.
I wanted to know if there is a way to call the functions concurrently.

Here is my pseudocode:

from urllib import urlencode     # Python 2; queries, base_url and the
from urllib2 import urlopen      # get_key*_data helpers are defined elsewhere

database = []                    # list of dictionaries, one record per page

while queries:                   # in a while loop, one iteration per query
    params = queries.pop(0)
    url = base_url + '?' + urlencode(params)  # build the url from the query parameters
    temp = urlopen(url).read()                # fetch the whole page as one string
    record = {}
    record['key1'] = get_key1_data(temp)      # each function gets the entire page text
    record['key2'] = get_key2_data(temp)
    # ...
    record['keyn'] = get_keyn_data(temp)
    database.append(record)                   # building a list of dictionaries

My question is: can I have the functions get_key1_data to get_keyn_data
run concurrently? I ask because these functions are not dependent on one
another; they all work on the same downloaded page text and independently
pull out the data that populates the final database.

I'd appreciate your help with this one.

Warm Regards,
Abhijeet.

  • Chris Angelico at Apr 8, 2011 at 7:25 am

    On Fri, Apr 8, 2011 at 5:04 PM, Abhijeet Mahagaonkar wrote:
    I was able to isolate that the major chunk of run time is spent opening
    web pages, reading from them, and extracting text.
    I wanted to know if there is a way to call the functions concurrently.

    So, to clarify: you have code that's loading lots of separate pages,
    and the time is spent waiting for the internet? If you're saturating
    your connection, then this won't help, but if they're all small pages
    and they're coming over the internet, then yes, you certainly CAN
    fetch them concurrently. As the Perl folks say, There's More Than One
    Way To Do It; one is to spawn a thread for each request, then collect
    up all the results at the end. Look up the 'threading' module for
    details:

    http://docs.python.org/library/threading.html

    It should also be possible to directly use asynchronous I/O and
    select(), but I couldn't see a way to do that with urllib/urllib2. If
    you're using sockets directly, this ought to be an option.

    I don't know what's the most Pythonesque option, but if you already
    have specific Python code for each of your functions, it's probably
    going to be easiest to spawn threads for them all.
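
    A minimal sketch of that thread-per-request idea, for illustration only
    (the example.com URLs are placeholders and Python 2's urllib2 is assumed):

    import threading
    import urllib2

    urls = ['http://example.com/page1', 'http://example.com/page2']  # placeholders
    results = {}                  # url -> page text, filled in by the worker threads
    lock = threading.Lock()

    def fetch(url):
        text = urllib2.urlopen(url).read()   # each thread blocks on its own download
        with lock:
            results[url] = text

    threads = [threading.Thread(target=fetch, args=(url,)) for url in urls]
    for t in threads:
        t.start()
    for t in threads:
        t.join()                  # wait for every download to finish
    # 'results' now maps each url to its page contents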

    Chris Angelico
    Threading fan ever since he met OS/2 in 1993 or so
  • MRAB at Apr 8, 2011 at 2:41 pm
    On 08/04/2011 08:25, Chris Angelico wrote:
    [snip]
    I don't know what's the most Pythonesque option, but if you already
    have specific Python code for each of your functions, it's probably
    going to be easiest to spawn threads for them all.
    "Pythonesque" refers to "Monty Python's Flying Circus". The word you
    want is "Pythonic".
  • Chris Angelico at Apr 8, 2011 at 3:31 pm

    On Sat, Apr 9, 2011 at 12:41 AM, MRAB wrote:
    On 08/04/2011 08:25, Chris Angelico wrote:
    [snip]
    I don't know what's the most Pythonesque option, but if you already
    have specific Python code for each of your functions, it's probably
    going to be easiest to spawn threads for them all.
    "Pythonesque" refers to "Monty Python's Flying Circus". The word you
    want is "Pythonic".
    Whoops! Remind me not to post while sleep-deprived.

    Although... sleep-deprived is my normal state, so that may be a bit tricky.

    Chris Angelico
  • Matt Chaput at Apr 8, 2011 at 3:38 pm

    On 08/04/2011 11:31 AM, Chris Angelico wrote:
    On Sat, Apr 9, 2011 at 12:41 AM, MRAB wrote:
    On 08/04/2011 08:25, Chris Angelico wrote:
    [snip]
    I don't know what's the most Pythonesque option, but if you already
    have specific Python code for each of your functions, it's probably
    going to be easiest to spawn threads for them all.
    "Pythonesque" refers to "Monty Python's Flying Circus". The word you
    want is "Pythonic".
    And the word for referring to the actual snake is "Pythonical" :P
  • Raymond Hettinger at Apr 8, 2011 at 5:59 pm

    On Apr 8, 12:25 am, Chris Angelico wrote:
    On Fri, Apr 8, 2011 at 5:04 PM, Abhijeet Mahagaonkar wrote:
    I was able to isolate that the major chunk of run time is spent opening
    web pages, reading from them, and extracting text.
    I wanted to know if there is a way to call the functions concurrently.

    So, to clarify: you have code that's loading lots of separate pages,
    and the time is spent waiting for the internet? If you're saturating
    your connection, then this won't help, but if they're all small pages
    and they're coming over the internet, then yes, you certainly CAN
    fetch them concurrently. As the Perl folks say, There's More Than One
    Way To Do It; one is to spawn a thread for each request, then collect
    up all the results at the end. Look up the 'threading' module for
    details:

    http://docs.python.org/library/threading.html
    The docs for Python 3.2 have a nice example of downloading multiple
    web pages in parallel:

    http://docs.python.org/py3k/library/concurrent.futures.html#threadpoolexecutor-example
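
    A rough sketch along the lines of that example (illustrative only; the
    URLs and worker count are placeholders, Python 3.2+ assumed):

    import concurrent.futures
    import urllib.request

    URLS = ['http://www.python.org/', 'http://docs.python.org/']  # placeholders

    def load_url(url, timeout=30):
        # fetch one page and return its body as bytes
        return urllib.request.urlopen(url, timeout=timeout).read()

    with concurrent.futures.ThreadPoolExecutor(max_workers=5) as executor:
        # submit every fetch at once; results arrive as each page completes
        future_to_url = {executor.submit(load_url, url): url for url in URLS}
        for future in concurrent.futures.as_completed(future_to_url):
            url = future_to_url[future]
            try:
                data = future.result()
            except Exception as exc:
                print('%r generated an exception: %s' % (url, exc))
            else:
                print('%r page is %d bytes' % (url, len(data)))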

    Raymond
  • Abhijeet Mahagaonkar at Apr 9, 2011 at 1:16 am
    That's awesome. It's time I migrate to 3 :)
    On Fri, Apr 8, 2011 at 11:29 PM, Raymond Hettinger wrote:
    On Apr 8, 12:25 am, Chris Angelico wrote:
    On Fri, Apr 8, 2011 at 5:04 PM, Abhijeet Mahagaonkar wrote:
    I was able to isolate that the major chunk of run time is spent opening
    web pages, reading from them, and extracting text.
    I wanted to know if there is a way to call the functions concurrently.

    So, to clarify: you have code that's loading lots of separate pages,
    and the time is spent waiting for the internet? If you're saturating
    your connection, then this won't help, but if they're all small pages
    and they're coming over the internet, then yes, you certainly CAN
    fetch them concurrently. As the Perl folks say, There's More Than One
    Way To Do It; one is to spawn a thread for each request, then collect
    up all the results at the end. Look up the 'threading' module for
    details:

    http://docs.python.org/library/threading.html
    The docs for Python 3.2 have a nice example of downloading multiple
    web pages in parallel:


    http://docs.python.org/py3k/library/concurrent.futures.html#threadpoolexecutor-example

    Raymond
