This looks like a tornado problem, but trust me, it is almost entirely
about how the multiprocessing module works.

I borrowed the idea from http://gist.github.com/312676 to implement an
async db query web service using tornado.

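import multiprocessing
import tornado.web

def async_func(sql_command, arg1, arg2, arg3):
    '''
    do the actual query job
    '''
    ...
    # data is the result of executing sql_command
    return data

# Create the pool after async_func is defined so forked workers can find it.
p = multiprocessing.Pool(4)

class QueryHandler(tornado.web.RequestHandler):
    ...
    @tornado.web.asynchronous
    def get(self):
        ...
        # Pass the bound method so the callback receives this handler as self.
        p.apply_async(async_func, [sql_command, arg1, arg2, arg3],
                      callback=self.callback_func)

    def callback_func(self, data):
        # Note: the pool calls this from a helper thread in the parent
        # process, not from tornado's IOLoop thread.
        self.write(data)
        self.finish()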

So the workflow is like this:

get() --> hand the query request to a subprocess, which runs
async_func() --> when async_func() returns, callback_func receives its
return value as the input argument and sends the query result to the
client.

The problem is that the result of sql_command, which in our case is
stored in the variable "data", might be too big to hold in memory all
at once. Can I return from the async function early, say immediately
after the query returns the first result set, and then stream the
results to the browser? In other words, can async_func somehow notify
callback_func to prepare to receive the data before async_func
actually returns?


  • Philip Semanchuk at Mar 8, 2011 at 11:34 pm

    On Mar 8, 2011, at 3:25 PM, Sheng wrote:

    > This looks like a tornado problem, but trust me, it is almost all
    > about the mechanism of multiprocessing module.
    > [snip]
    > In other words, can async_func somehow notify callback_func to
    > prepare receiving the data before async_func actually returns?

    Hi Sheng,
    Have you looked at multiprocessing.Queue objects?


    HTH
    Philip
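
A minimal sketch of the multiprocessing.Queue idea, written for Python 3 rather than the Python 2 of the original thread. The names run_query, stream_results, and SENTINEL are hypothetical, the rows are made up in place of a real database cursor, and the "fork" start method is an assumption that only holds on POSIX systems. The worker puts chunks on a bounded queue instead of returning everything at once, so the parent can start consuming before the worker finishes:

```python
import multiprocessing

SENTINEL = None  # hypothetical end-of-stream marker


def run_query(sql_command, out_queue, chunk_size=2):
    """Worker process: push result rows in chunks instead of returning them all."""
    # Stand-in data; a real worker would read from a database cursor,
    # for example with cursor.fetchmany(chunk_size).
    rows = [(i, "row %d" % i) for i in range(5)]
    for start in range(0, len(rows), chunk_size):
        out_queue.put(rows[start:start + chunk_size])
    out_queue.put(SENTINEL)  # tell the parent that no more chunks will follow


def stream_results(sql_command):
    """Parent process: yield each chunk as soon as the worker produces it."""
    # Assumption: POSIX "fork" start method, matching the thread's "fork a subprocess".
    ctx = multiprocessing.get_context("fork")
    q = ctx.Queue(maxsize=4)  # bounded, so the worker cannot run far ahead
    worker = ctx.Process(target=run_query, args=(sql_command, q))
    worker.start()
    while True:
        chunk = q.get()
        if chunk is SENTINEL:
            break
        yield chunk
    worker.join()


if __name__ == "__main__":
    for chunk in stream_results("SELECT ..."):
        print(chunk)
```

In a web handler, the loop body would write and flush each chunk to the client instead of printing it. The bounded queue keeps memory use in the parent limited to a few chunks.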
  • John Nagle at Mar 8, 2011 at 11:53 pm

    On 3/8/2011 3:34 PM, Philip Semanchuk wrote:
    > On Mar 8, 2011, at 3:25 PM, Sheng wrote:
    >> [snip]
    >> In other words, can async_func somehow notify callback_func to
    >> prepare receiving the data before async_func actually returns?
    >
    > Hi Sheng,
    > Have you looked at multiprocessing.Queue objects?

    Make sure that, having made a request of the database, you
    quickly read all the results. Until you finish the transaction,
    the database has locks set, and other transactions may stall.
    "Streaming" out to a network connection while still reading from
    the database is undesirable.

    If you're doing really big SELECTs, consider using LIMIT and
    OFFSET in SQL to break them up into smaller chunks, especially
    if the user is paging through the results.

    John Nagle
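
The LIMIT/OFFSET suggestion could look roughly like the sketch below. It uses the standard-library sqlite3 module with an in-memory database purely for illustration; the items table, its columns, and the helper names iter_pages and make_demo_db are hypothetical and not from the original posts. Each page is a separate short query, so no long-running read is held open while earlier pages are sent to the client:

```python
import sqlite3


def iter_pages(conn, page_size):
    """Yield rows of the hypothetical `items` table one page at a time."""
    offset = 0
    while True:
        # ORDER BY keeps the paging stable from one query to the next.
        rows = conn.execute(
            "SELECT id, name FROM items ORDER BY id LIMIT ? OFFSET ?",
            (page_size, offset),
        ).fetchall()
        if not rows:
            break
        yield rows
        offset += page_size


def make_demo_db():
    """Build a small in-memory database so the sketch can run on its own."""
    conn = sqlite3.connect(":memory:")
    conn.execute("CREATE TABLE items (id INTEGER PRIMARY KEY, name TEXT)")
    conn.executemany(
        "INSERT INTO items (name) VALUES (?)",
        [("item %d" % i,) for i in range(5)],
    )
    conn.commit()
    return conn


if __name__ == "__main__":
    for page in iter_pages(make_demo_db(), page_size=2):
        print(page)
```

Only one page of rows is held in memory at a time. Note that rows inserted or deleted between page queries can shift the offsets, which is one trade-off of this approach.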
  • Sheng at Mar 9, 2011 at 3:22 pm
    Hi Philip,

    multiprocessing.Queue is used to transfer data between processes; how
    could it help solve my problem? Thanks!

    Sheng
  • Philip Semanchuk at Mar 9, 2011 at 5:54 pm

    On Mar 9, 2011, at 10:22 AM, Sheng wrote:

    > Hi Philip,
    >
    > multiprocessing.Queue is used to transfer data between processes, how
    > it could be helpful for solving my problem? Thanks!

    I misunderstood -- I thought transferring data between processes *was* your problem. If both of your functions are in the same process, I don't understand how multiprocessing figures into it at all.

    If you want a function to start returning results before that function completes, and you want those results to be processed by other code *in the same process*, then you'll have to use threads. A Queue object for threads exists in the standard library too. You might find that useful.

    HTH
    Philip
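
The thread-based variant might be sketched as follows, again in Python 3, where the thread-safe queue lives in the queue module (it was named Queue in Python 2). The names produce_rows, consume_rows, and _DONE are hypothetical, and the rows are made up in place of a real cursor. The producer thread hands rows over one by one, so the consumer processes them before the producer has finished:

```python
import queue
import threading

_DONE = object()  # hypothetical end-of-stream marker


def produce_rows(out_queue):
    """Producer thread: hand over each row as soon as it is available."""
    for i in range(5):  # stand-in for iterating over a database cursor
        out_queue.put("row %d" % i)
    out_queue.put(_DONE)


def consume_rows():
    """Consumer: process rows while the producer is still running."""
    q = queue.Queue(maxsize=2)  # bounded, so at most two rows are buffered
    producer = threading.Thread(target=produce_rows, args=(q,))
    producer.start()
    received = []
    while True:
        item = q.get()
        if item is _DONE:
            break
        received.append(item)  # a web handler would write and flush here
    producer.join()
    return received


if __name__ == "__main__":
    print(consume_rows())
```

Because both sides run in the same process, the sentinel can be compared by identity, and no pickling of the rows is needed.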


Discussion Overview
group: python-list @
categories: python
posted: Mar 8, '11 at 8:25p
active: Mar 9, '11 at 5:54p
posts: 5
users: 3
website: python.org