Hello all,

I have a CMS-like application using Catalyst for which about 40-50% of requests
contain large file uploads. I'm using a pretty standard two-tiered setup to
avoid tying up heavy mod_perl procs for slow client downloads.

Now I'd like to avoid tying up the heavy procs for slow uploads. Many suggest
running upload handlers as CGI rather than mod_perl because the 1s startup time
is negligible compared to the time required to upload. I have tried this and it
works, but it still leaves a 60+ MB process sitting around for 2 minutes while a
dial-up user uploads a huge image.
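
To put a rough number on that cost, here is a back-of-the-envelope sketch in Python. The 60 MB and 2 minute figures come from the paragraph above; the arrival rate is purely an assumption for illustration, and each process's memory is treated as unshared, which is a worst case.

```python
def uploads_in_flight(arrivals_per_minute, upload_minutes):
    """Little's law: average number in progress = arrival rate x time each one takes."""
    return arrivals_per_minute * upload_minutes


PROCESS_MB = 60          # from the post: a 60+ MB process per upload
UPLOAD_MINUTES = 2       # from the post: about 2 minutes per slow upload
UPLOADS_PER_MINUTE = 6   # ASSUMPTION: one slow upload starts every 10 seconds

in_flight = uploads_in_flight(UPLOADS_PER_MINUTE, UPLOAD_MINUTES)
memory_mb = in_flight * PROCESS_MB  # memory held by processes that are mostly waiting
```

Under that assumed rate, a dozen heavy processes would be parked on slow clients at any moment, which is why moving the waiting into something lightweight is attractive.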

I'm thinking of splitting file upload handling out to a simple CGI app (no
Catalyst code) that serializes the request, including uploaded files, to a tmp
directory. It would then redirect to a Catalyst-handled URL matching the
serialized request. Then I can validate the submission in Catalyst and display
success or error using my views, etc.
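
A minimal sketch of the spool-and-redirect idea, written in Python purely for illustration (the real thing would be a small Perl CGI plus a Catalyst controller). The function names, the manifest.json layout and the token format are all assumptions made for this sketch, not part of any existing module.

```python
import json
import secrets
import tempfile
from pathlib import Path


def spool_request(spool_dir, fields, uploads):
    """Write form fields and uploaded files under a fresh token directory.

    fields  -- dict of ordinary form values
    uploads -- dict mapping field name to (client filename, bytes)
    Returns the token the front end would place in the redirect URL.
    """
    token = secrets.token_hex(16)  # unguessable, and safe as a directory name
    req_dir = Path(spool_dir) / token
    req_dir.mkdir(parents=True)
    manifest = {"fields": fields, "uploads": {}}
    for i, (name, (filename, data)) in enumerate(uploads.items()):
        stored = f"upload-{i}.bin"  # never use the client-supplied name on disk
        (req_dir / stored).write_bytes(data)
        manifest["uploads"][name] = {"filename": filename, "stored": stored}
    # Written last, so its presence marks the spooled request as complete.
    (req_dir / "manifest.json").write_text(json.dumps(manifest))
    return token


def load_request(spool_dir, token):
    """Read a spooled request back; the application side would call this."""
    if not (len(token) == 32 and all(c in "0123456789abcdef" for c in token)):
        raise ValueError("malformed token")  # guards against path traversal
    req_dir = Path(spool_dir) / token
    manifest = json.loads((req_dir / "manifest.json").read_text())
    uploads = {
        name: (meta["filename"], (req_dir / meta["stored"]).read_bytes())
        for name, meta in manifest["uploads"].items()
    }
    return manifest["fields"], uploads


# Demo: spool a fake upload, then read it back as the application would.
demo_dir = tempfile.mkdtemp()
demo_token = spool_request(
    demo_dir, {"title": "Logo"}, {"image": ("logo.png", b"fake image bytes")}
)
demo_fields, demo_uploads = load_request(demo_dir, demo_token)
```

The front end would redirect to something like a URL ending in the token, and the application would validate the token before touching the filesystem. Cleaning up stale spool directories is left out of the sketch.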

Is there something like this in existence already? I searched Google and CPAN
but didn't find anything promising. I'd appreciate any tips or suggestions. If
I do implement it from scratch, would there be any interest in the CGI and a
Catalyst base controller being released on CPAN?

Best,
Brian Kirkbride


  • Peter Karman at Apr 12, 2007 at 6:40 pm

    Brian Kirkbride scribbled on 4/12/07 12:16 PM:

    > Is there something like this in existence already? I searched Google
    > and CPAN but didn't find anything promising. I'd appreciate any tips or
    > suggestions. If I do implement it from scratch, would there be any
    > interest in the CGI and a Catalyst base controller being released on CPAN?

    I'd be interested in collaborating on something if you end up rolling your own.
    We were looking at this recently:

    http://trac.lighttpd.net/trac/wiki/Docs%3AModUploadProgress

    and doing something similar to what you're describing.

    --
    Peter Karman . http://peknet.com/ . peter@peknet.com
  • Aristotle Pagaltzis at Apr 12, 2007 at 8:25 pm
    Hi Brian,

    * Brian Kirkbride [2007-04-12 19:25]:
    > I have a CMS-like application using Catalyst for which about
    > 40-50% of requests contain large file uploads. I'm using a
    > pretty standard two-tiered setup to avoid tying up heavy
    > mod_perl procs for slow client downloads.
    >
    > [...]
    >
    > I'm thinking of splitting file upload handling out to a
    > simple CGI app (no Catalyst code) that serializes the request,
    > including uploaded files, to a tmp directory. It would then
    > redirect to a Catalyst-handled URL matching the serialized
    > request.

    My suggestion is to do something much more transparent.

    Use a CGI script on the front-end webserver to accept uploads.
    When the upload completes, the CGI script in turn makes an
    HTTP request to the application webserver to upload the
    file, and returns the app's response to the client.

    That way, the actual application itself does not need to change
    at all. There is no serialisation mechanism, no extra code in the
    application to handle such a special upload mechanism, no need
    for configuration of filesystem paths or other shared resources,
    nothing. The handling is isolated on the front-end webserver.

    This way, a programming consideration becomes a mere deployment
    consideration.

    HTTP is great like that.
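
The accept-then-replay approach described above can be sketched roughly as follows. This is Python for illustration only, not the Perl CGI itself: buffer_upload, forward_upload and the stand-in backend are hypothetical names, and the body is held in memory where a real front end would more likely spool large files to disk.

```python
import io
import threading
import urllib.error
import urllib.request
from http.server import BaseHTTPRequestHandler, HTTPServer


def buffer_upload(stream, content_length, chunk_size=64 * 1024):
    """Read the whole request body from a (possibly slow) client."""
    buf = io.BytesIO()
    remaining = content_length
    while remaining > 0:
        chunk = stream.read(min(chunk_size, remaining))
        if not chunk:
            raise IOError("client closed the connection mid-upload")
        buf.write(chunk)
        remaining -= len(chunk)
    return buf.getvalue()


def forward_upload(backend_url, body, content_type):
    """Replay the buffered body to the application server in one fast request."""
    req = urllib.request.Request(
        backend_url, data=body, method="POST",
        headers={"Content-Type": content_type},
    )
    # Talk to the backend directly, ignoring any proxy environment variables.
    opener = urllib.request.build_opener(urllib.request.ProxyHandler({}))
    try:
        with opener.open(req) as resp:
            return resp.status, resp.read()
    except urllib.error.HTTPError as err:
        # Error responses from the app are relayed to the client as well.
        return err.code, err.read()


class EchoSizeHandler(BaseHTTPRequestHandler):
    """Stand-in for the application server: reports how many bytes arrived."""

    def do_POST(self):
        received = self.rfile.read(int(self.headers["Content-Length"]))
        reply = f"got {len(received)} bytes".encode()
        self.send_response(200)
        self.send_header("Content-Type", "text/plain")
        self.send_header("Content-Length", str(len(reply)))
        self.end_headers()
        self.wfile.write(reply)

    def log_message(self, *args):  # keep the demo quiet
        pass


backend = HTTPServer(("127.0.0.1", 0), EchoSizeHandler)  # port 0: any free port
threading.Thread(target=backend.serve_forever, daemon=True).start()

slow_client = io.BytesIO(b"x" * 100_000)  # pretend this trickles in slowly
body = buffer_upload(slow_client, 100_000)
status, reply = forward_upload(
    f"http://127.0.0.1:{backend.server_port}/upload", body, "application/octet-stream"
)
backend.shutdown()
backend.server_close()
```

The point of the design is that the heavy application process only sees the second, fast request; all the waiting on the slow client happens in the small front-end script.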

    > Is there something like this in existence already? I searched
    > Google and CPAN but didn't find anything promising. I'd
    > appreciate any tips or suggestions. If I do implement it from
    > scratch, would there be any interest in the CGI and a Catalyst
    > base controller being released on CPAN?

    Sometimes, there is no code to do something because no code is
    needed to do it. :-)

    Regards,
    --
    Aristotle Pagaltzis // <http://plasmasturm.org/>
  • Brian Kirkbride at Apr 12, 2007 at 10:03 pm

    Wonderful, thank you for the insight Aristotle. I will try Perrin's suggestion
    of using an upload-buffering proxy first, falling back on a simple CGI proxy
    like this if need be.
  • Perrin Harkins at Apr 12, 2007 at 8:48 pm

    On 4/12/07, Brian Kirkbride wrote:
    > Now I'd like to avoid tying up the heavy procs for slow uploads. Many suggest
    > to run upload handlers as CGI rather than mod_perl because the 1s startup time
    > is negligible compared to the time required to upload. I have tried this and it
    > works, but it still creates a 60+ MB process to sit around for 2 minutes while a
    > dial-up user uploads a huge image.

    Yeah, I was reading that and thinking that you'll probably use more
    memory for CGI, since you still have to run perl but you don't get any
    copy-on-write benefit. It could be offset by loading as few modules
    as possible into the CGI, but that seems kind of ugly.

    A better solution would be a proxy that buffers uploads. I thought
    mod_proxy would do this for you. Are you certain that it doesn't? If
    so, take a look at the other proxy options out there.

    - Perrin
  • Brian Kirkbride at Apr 12, 2007 at 10:04 pm

    Perrin,

    It doesn't look like mod_proxy, which I am currently using, supports this. It
    looks like Perlbal does, however. Any other suggestions? I'd like to avoid
    Squid, as it would be overkill for this application.

    Thanks!
  • Perrin Harkins at Apr 12, 2007 at 10:11 pm

    On 4/12/07, Brian Kirkbride wrote:
    > It doesn't look like mod_proxy, which I am currently using, supports this. It
    > looks like Perlbal does however. Any other suggestions?

    Since I always use mod_proxy, I can't vouch for any others. I've seen
    people mention nginx, pound, and pen.

    It might also be worth asking about this on the mod_proxy list, or
    checking the archives.

    - Perrin
  • Aristotle Pagaltzis at Apr 13, 2007 at 5:21 pm

    * Perrin Harkins [2007-04-12 23:20]:
    > I've seen people mention nginx, pound, and pen.

    I also just ran across Varnish:

    * <http://varnish.projects.linpro.no/>
    > Varnish is a state-of-the-art, high-performance HTTP
    > accelerator. Varnish is targeted primarily at the FreeBSD 6 and
    > Linux 2.6 platforms, and will take full advantage of the
    > virtual memory system and advanced I/O features offered by
    > these operating systems.

    Apparently it does not manage its own cache in order to avoid
    fighting the kernel VMM and takes extra pains to avoid memcpy
    operations to minimise inter-CPU cache thrashing, esp. in multi-
    processor systems.

    I don't know if it does upload buffering or how it stacks up against
    the competition.

    Regards,
    --
    Aristotle Pagaltzis // <http://plasmasturm.org/>

Discussion Overview
group: catalyst @
categories: catalyst, perl
posted: Apr 12, '07 at 6:16p
active: Apr 13, '07 at 5:21p
posts: 8
users: 4
website: catalystframework.org
irc: #catalyst
