Attached is a patch that tries to speed up the prep_buildtree script, which
our configure script runs for VPATH builds.

The idea is to ask `find` to emit directory listing in depth-first order so
that the `mkdir -p` will create the deepest directory first and any
subsequent `mkdir -p` on an intermediate directory will not have to do
anything.
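
As a rough illustration of the idea (a minimal sketch, not the actual
prep_buildtree code; the `sourcetree` and `buildtree` names and the sample
directories are made up for the example, and `mktemp -d` is assumed to be
available), the loop below feeds the output of `find -depth` to `mkdir -p`:

```shell
#!/bin/sh
# Hypothetical sketch of the ordering idea; not the real prep_buildtree.
set -e
work=$(mktemp -d)
trap 'rm -rf "$work"' EXIT
sourcetree="$work/src"
buildtree="$work/build"
mkdir -p "$sourcetree/a/b/c" "$sourcetree/a/d" "$buildtree"

# With -depth, find lists a directory's contents before the directory
# itself, so ./a/b/c is printed before ./a/b, and ./a/b before ./a.
( cd "$sourcetree" && find . -depth -type d -print ) > "$work/dirs.txt"

while read -r dir; do
    # The first mkdir -p on a deep path also creates its parents; later
    # calls on those parents find them already present.
    mkdir -p "$buildtree/$dir"
done < "$work/dirs.txt"
```

Because the deepest paths come first, the later `mkdir -p` calls on `./a/b`
and `./a` have nothing left to create.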

Currently I am seeing a performance improvement of only about 500 ms for
this script; say 11.8 seconds vs. 11.3 seconds. But I distinctly remember
that yesterday I saw an improvement of 11% on the same virtual machine,
averaged over multiple runs; 42 sec vs. 37 sec. It might be that the host OS
or my Linux virtual machine was loaded at that time and the filesystem could
not cache enough inodes.

It seems this would improve performance in general, but more so under load,
when you actually need it. I am not sure whether the -depth option is
supported by all incarnations of 'find'.

I have been away from Postgres development for quite a while, so I would
appreciate it if someone could tell me whether such a patch should be
submitted to a commitfest (since this is not actually a source patch).

Regards,
--
gurjeet.singh
@ EnterpriseDB - The Enterprise Postgres Company
http://www.EnterpriseDB.com

singh.gurjeet@{ gmail | yahoo }.com
Twitter/Skype: singh_gurjeet

Mail sent from my BlackLaptop device

  • Robert Haas at Sep 27, 2010 at 1:03 pm

    On Sun, Sep 26, 2010 at 10:15 PM, Gurjeet Singh wrote:
    I have been away from Postgres development for quite a while, so would
    appreciate if someone could tell me if such a patch should be submitted for
    commitfest (since this is not actually a source patch).
    By all means add it to the open CF.

    https://commitfest.postgresql.org/action/commitfest_view/open

    --
    Robert Haas
    EnterpriseDB: http://www.enterprisedb.com
    The Enterprise Postgres Company
  • Gurjeet Singh at Sep 27, 2010 at 2:11 pm

    On Mon, Sep 27, 2010 at 3:02 PM, Robert Haas wrote:
    On Sun, Sep 26, 2010 at 10:15 PM, Gurjeet Singh wrote:
    I have been away from Postgres development for quite a while, so would
    appreciate if someone could tell me if such a patch should be submitted for
    commitfest (since this is not actually a source patch).
    By all means add it to the open CF.

    https://commitfest.postgresql.org/action/commitfest_view/open
    When trying to submit new patch, the 'CommitFest' drop-down has just one
    entry '(None Selected)', and 'Submit' would refuse to go through without a
    topic.

    What should be the value of 'Message-ID for original patch' ?
    the URL: http://archives.postgresql.org/pgsql-hackers/2010-09/msg01837.php
    or the ID: AANLkTinw0HL+jQmrtwXC9Y2tqhcfHFgFekxyyfYGvQrB@mail.gmail.com

    Regards,
    --
    gurjeet.singh
    @ EnterpriseDB - The Enterprise Postgres Company
    http://www.EnterpriseDB.com

    singh.gurjeet@{ gmail | yahoo }.com
    Twitter/Skype: singh_gurjeet

    Mail sent from my BlackLaptop device
  • Robert Haas at Sep 27, 2010 at 2:12 pm

    On Mon, Sep 27, 2010 at 10:09 AM, Gurjeet Singh wrote:
    On Mon, Sep 27, 2010 at 3:02 PM, Robert Haas wrote:

    On Sun, Sep 26, 2010 at 10:15 PM, Gurjeet Singh <singh.gurjeet@gmail.com>
    wrote:
    I have been away from Postgres development for quite a while, so would
    appreciate if someone could tell me if such a patch should be submitted
    for
    commitfest (since this is not actually a source patch).
    By all means add it to the open CF.

    https://commitfest.postgresql.org/action/commitfest_view/open
    When trying to submit new patch, the 'CommitFest' drop-down has just one
    entry '(None Selected)', and 'Submit' would refuse to go through without a
    topic.
    Oh, blah. I just added a Miscellaneous topic. You can add others
    yourself, just look on the CF page for "CommitFest topics".
    What should be the value of 'Message-ID for original patch' ?
    the URL: http://archives.postgresql.org/pgsql-hackers/2010-09/msg01837.php
    or the ID: AANLkTinw0HL+jQmrtwXC9Y2tqhcfHFgFekxyyfYGvQrB@mail.gmail.com
    The latter.

    --
    Robert Haas
    EnterpriseDB: http://www.enterprisedb.com
    The Enterprise Postgres Company
  • Andrew Dunstan at Sep 27, 2010 at 2:15 pm

    On 09/27/2010 10:11 AM, Robert Haas wrote:
    What should be the value of 'Message-ID for original patch' ?
    the URL: http://archives.postgresql.org/pgsql-hackers/2010-09/msg01837.php
    or the ID: AANLkTinw0HL+jQmrtwXC9Y2tqhcfHFgFekxyyfYGvQrB@mail.gmail.com
    The latter.
    Could this perhaps be made clearer on the page, perhaps with an example?
    It confused me recently too.

    cheers

    andrew
  • Gurjeet Singh at Sep 27, 2010 at 2:19 pm
    On Mon, Sep 27, 2010 at 4:15 PM, Andrew Dunstan wrote:
    On 09/27/2010 10:11 AM, Robert Haas wrote:


    What should be the value of 'Message-ID for original patch' ?
    the URL:
    http://archives.postgresql.org/pgsql-hackers/2010-09/msg01837.php
    or the ID: AANLkTinw0HL+jQmrtwXC9Y2tqhcfHFgFekxyyfYGvQrB@mail.gmail.com
    The latter.
    Could this perhaps be made clearer on the page, perhaps with an example? It
    confused me recently too.
    Or maybe populate the drop-down with every available topic from previous
    CFs, and add an additional '(Add new topic)' to the drop-down which would
    take you to topic creation page.

    Regards,
    --
    gurjeet.singh
    @ EnterpriseDB - The Enterprise Postgres Company
    http://www.EnterpriseDB.com

    singh.gurjeet@{ gmail | yahoo }.com
    Twitter/Skype: singh_gurjeet

    Mail sent from my BlackLaptop device
  • Robert Haas at Sep 27, 2010 at 2:41 pm

    On Mon, Sep 27, 2010 at 10:18 AM, Gurjeet Singh wrote:
    On Mon, Sep 27, 2010 at 4:15 PM, Andrew Dunstan wrote:

    On 09/27/2010 10:11 AM, Robert Haas wrote:

    What should be the value of 'Message-ID for original patch' ?
    the URL:
    http://archives.postgresql.org/pgsql-hackers/2010-09/msg01837.php
    or the ID: AANLkTinw0HL+jQmrtwXC9Y2tqhcfHFgFekxyyfYGvQrB@mail.gmail.com
    The latter.
    Could this perhaps be made clearer on the page, perhaps with an example?
    It confused me recently too.
    Or maybe populate the drop-down with every available topic from previous
    CFs, and add an additional '(Add new topic)' to the drop-down which would
    take you to topic creation page.
    Andrew's question seemed to be about the message-ID. I agree the
    topic thing is confusing, though. I'm wondering if it would be
    sufficient to do the following - if no topics are available, instead of
    showing the form, it says something like:

    No topics have been created for this CommitFest yet. Before adding
    your patch, you must add one or more items to the <link>topic
    list</link>.

    --
    Robert Haas
    EnterpriseDB: http://www.enterprisedb.com
    The Enterprise Postgres Company
  • Tom Lane at Sep 27, 2010 at 2:54 pm

    Robert Haas writes:
    Andrew's question seemed to be about the message-ID. I agree the
    topic thing is confusing, though. I'm wondering if it would be
    sufficient to do the following - if no topic are available, instead of
    showing the form, it says something like:
    No topics have been created for this CommitFest yet. Before adding
    your patch, you must add one or more items to the <link>topic
    list</link>.
    I liked the idea of pre-populating with the historical set of topics.
    If you encourage the first few submitters to a new CF to invent their
    own topic categories without any guidance, you're going to get some
    crazy topics.

    regards, tom lane
  • Robert Haas at Sep 27, 2010 at 3:04 pm

    On Mon, Sep 27, 2010 at 10:54 AM, Tom Lane wrote:
    Robert Haas <robertmhaas@gmail.com> writes:
    Andrew's question seemed to be about the message-ID.  I agree the
    topic thing is confusing, though.  I'm wondering if it would be
    sufficient to do the following - if no topic are available, instead of
    showing the form, it says something like:
    No topics have been created for this CommitFest yet.  Before adding
    your patch, you must add one or more items to the <link>topic
    list</link>.
    I liked the idea of pre-populating with the historical set of topics.
    If you encourage the first few submitters to a new CF to invent their
    own topic categories without any guidance, you're going to get some
    crazy topics.
    Well, the historical set of topics varies from CommitFest to
    CommitFest, by design. There are some that recur pretty regularly, of
    course, like Security, Performance, and Miscellaneous. But not every
    CF will have a section for ECPG or Refactoring, for example. In one
    CF, we may have six ECPG patches, so ECPG gets its own topic; in
    another CF, 1 ECPG patch + 2 libpq patches + 1 psql patch get merged
    together under a section called Interfaces. This generally makes it
    easier to group things in ways that are useful in practice than a
    fixed list of topics, so I'm in favor of keeping it that way.

    This is surely a surmountable issue but the exact right thing to do is
    not altogether obvious to me.

    --
    Robert Haas
    EnterpriseDB: http://www.enterprisedb.com
    The Enterprise Postgres Company
  • Tom Lane at Sep 27, 2010 at 3:10 pm

    Robert Haas writes:
    Well, the historical set of topics varies from CommitFest to
    CommitFest, by design. There are some that recur pretty regularly, of
    course, like Security, Performance, and Miscellaneous. But not every
    CF will have a section for ECPG or Refactoring, for example. In one
    CF, we may have six ECPG patches, so ECPG gets its own topic; in
    another CF, 1 ECPG patch + 2 libpq patches + 1 psql patch get merged
    together under a section called Interfaces. This generally makes it
    easier to group things in ways that are useful in practice than a
    fixed list of topics, so I'm in favor of keeping it that way.
    If it's intentional that the topic for the same patch might vary
    depending on what else is submitted in the same CF, then I think that
    asking submitters to select topics is the wrong thing from the get-go.
    The patches should be uncategorized initially, and then someone like the
    CF manager should group them into topics after-the-fact.

    regards, tom lane
  • Robert Haas at Sep 27, 2010 at 3:16 pm

    On Mon, Sep 27, 2010 at 11:08 AM, Tom Lane wrote:
    Robert Haas <robertmhaas@gmail.com> writes:
    Well, the historical set of topics varies from CommitFest to
    CommitFest, by design.  There are some that recur pretty regularly, of
    course, like Security, Performance, and Miscellaneous.  But not every
    CF will have a section for ECPG or Refactoring, for example.  In one
    CF, we may have six ECPG patches, so ECPG gets its own topic; in
    another CF, 1 ECPG patch + 2 libpq patches + 1 psql patch get merged
    together under a section called Interfaces.  This generally makes it
    easier to group things in ways that are useful in practice than a
    fixed list of topics, so I'm in favor of keeping it that way.
    If it's intentional that the topic for the same patch might vary
    depending on what else is submitted in the same CF, then I think that
    asking submitters to select topics is the wrong thing from the get-go.
    The patches should be uncategorized initially, and then someone like the
    CF manager should group them into topics after-the-fact.
    That's actually not a bad idea, although it would require a bit of
    hacking given the way the schema is currently set up. The current
    system has been working well enough that I'm inclined to do something
    simpler for the present, like maybe just auto-create Miscellaneous for
    each new CF. That would have more or less the same effect for about
    one-tenth the work.

    --
    Robert Haas
    EnterpriseDB: http://www.enterprisedb.com
    The Enterprise Postgres Company
  • Robert Haas at Sep 27, 2010 at 4:38 pm

    On Mon, Sep 27, 2010 at 11:15 AM, Robert Haas wrote:
    On Mon, Sep 27, 2010 at 11:08 AM, Tom Lane wrote:
    Robert Haas <robertmhaas@gmail.com> writes:
    Well, the historical set of topics varies from CommitFest to
    CommitFest, by design.  There are some that recur pretty regularly, of
    course, like Security, Performance, and Miscellaneous.  But not every
    CF will have a section for ECPG or Refactoring, for example.  In one
    CF, we may have six ECPG patches, so ECPG gets its own topic; in
    another CF, 1 ECPG patch + 2 libpq patches + 1 psql patch get merged
    together under a section called Interfaces.  This generally makes it
    easier to group things in ways that are useful in practice than a
    fixed list of topics, so I'm in favor of keeping it that way.
    If it's intentional that the topic for the same patch might vary
    depending on what else is submitted in the same CF, then I think that
    asking submitters to select topics is the wrong thing from the get-go.
    The patches should be uncategorized initially, and then someone like the
    CF manager should group them into topics after-the-fact.
    That's actually not a bad idea, although it would require a bit of
    hacking given the way the schema is currently set up.  The current
    system has been working well enough that I'm inclined to do something
    simpler for the present, like maybe just auto-create Miscellaneous for
    each new CF.  That would have more or less the same effect for about
    one-tenth the work.
    Eh, on further review, I decided to do something even simpler still,
    which is to say unbreak the warning message that's supposed to appear
    in this case. Doing one of the things listed above is probably
    better, but this took approximately 60 seconds, so let's wait and see
    whether it helps. If not, I'll whack it around some more.

    --
    Robert Haas
    EnterpriseDB: http://www.enterprisedb.com
    The Enterprise Postgres Company
  • Kevin Grittner at Sep 27, 2010 at 3:06 pm

    Tom Lane wrote:

    I liked the idea of pre-populating with the historical set of
    topics.
    +1

    -Kevin
  • Robert Haas at Sep 27, 2010 at 2:39 pm

    On Mon, Sep 27, 2010 at 10:15 AM, Andrew Dunstan wrote:
    On 09/27/2010 10:11 AM, Robert Haas wrote:

    What should be the value of 'Message-ID for original patch' ?
    the URL:
    http://archives.postgresql.org/pgsql-hackers/2010-09/msg01837.php
    or the ID: AANLkTinw0HL+jQmrtwXC9Y2tqhcfHFgFekxyyfYGvQrB@mail.gmail.com
    The latter.
    Could this perhaps be made clearer on the page, perhaps with an example? It
    confused me recently too.
    Can you suggest something more specific?

    --
    Robert Haas
    EnterpriseDB: http://www.enterprisedb.com
    The Enterprise Postgres Company
  • Andrew Dunstan at Sep 27, 2010 at 3:15 pm

    On 09/27/2010 10:39 AM, Robert Haas wrote:
    On Mon, Sep 27, 2010 at 10:15 AM, Andrew Dunstan wrote:
    On 09/27/2010 10:11 AM, Robert Haas wrote:
    What should be the value of 'Message-ID for original patch' ?
    the URL:
    http://archives.postgresql.org/pgsql-hackers/2010-09/msg01837.php
    or the ID: AANLkTinw0HL+jQmrtwXC9Y2tqhcfHFgFekxyyfYGvQrB@mail.gmail.com
    The latter.
    Could this perhaps be made clearer on the page, perhaps with an example? It
    confused me recently too.
    Can you suggest something more specific?
    Well, it could say something like:

    The Message-ID can be found in the headers of the relevant email to
    the pgsql-hackers mailing list, and also in the mailing list
    archives at http://archives.postgresql.org. It looks something like
    this (the format varies somewhat depending on the sender's Mail User
    Agent): AANLkTinw0HL+jQmrtwXC9Y2tqhcfHFgFekxyyfYGvQrB@mail.gmail.com

    That would certainly have given me, and I suspect Gurjeet, enough clue.

    cheers

    andrew
  • Robert Haas at Sep 27, 2010 at 3:17 pm

    On Mon, Sep 27, 2010 at 11:15 AM, Andrew Dunstan wrote:

    On 09/27/2010 10:39 AM, Robert Haas wrote:

    On Mon, Sep 27, 2010 at 10:15 AM, Andrew Dunstan wrote:

    On 09/27/2010 10:11 AM, Robert Haas wrote:

    What should be the value of 'Message-ID for original patch' ?
    the URL:
    http://archives.postgresql.org/pgsql-hackers/2010-09/msg01837.php
    or the ID: AANLkTinw0HL+jQmrtwXC9Y2tqhcfHFgFekxyyfYGvQrB@mail.gmail.com

    The latter.

    Could this perhaps be made clearer on the page, perhaps with an example? It
    confused me recently too.

    Can you suggest something more specific?


    Well, it could say something like:

    The Message-ID can be found in the headers of the relevant email to the
    pgsql-hackers mailing list, and also in the mailing list archives at
    http://archives.postgresql.org. It looks something like this (the format
    varies somewhat depending on the sender's Mail User Agent):
    AANLkTinw0HL+jQmrtwXC9Y2tqhcfHFgFekxyyfYGvQrB@mail.gmail.com

    That would certainly have given me, and I suspect Gurjeet, enough clue.
    Where on the page would you suggest that we put that text?

    --
    Robert Haas
    EnterpriseDB: http://www.enterprisedb.com
    The Enterprise Postgres Company
  • Andrew Dunstan at Sep 27, 2010 at 3:28 pm

    On 09/27/2010 11:16 AM, Robert Haas wrote:
    On Mon, Sep 27, 2010 at 11:15 AM, Andrew Dunstan wrote:
    On 09/27/2010 10:39 AM, Robert Haas wrote:

    On Mon, Sep 27, 2010 at 10:15 AM, Andrew Dunstan <andrew@dunslane.net>
    wrote:

    On 09/27/2010 10:11 AM, Robert Haas wrote:

    What should be the value of 'Message-ID for original patch' ?
    the URL:
    http://archives.postgresql.org/pgsql-hackers/2010-09/msg01837.php
    or the ID: AANLkTinw0HL+jQmrtwXC9Y2tqhcfHFgFekxyyfYGvQrB@mail.gmail.com

    The latter.

    Could this perhaps be made clearer on the page, perhaps with an example? It
    confused me recently too.

    Can you suggest something more specific?


    Well, it could say something like:

    The Message-ID can be found in the headers of the relevant email to the
    pgsql-hackers mailing list, and also in the mailing list archives at
    http://archives.postgresql.org. It looks something like this (the format
    varies somewhat depending on the sender's Mail User Agent):
    AANLkTinw0HL+jQmrtwXC9Y2tqhcfHFgFekxyyfYGvQrB@mail.gmail.com

    That would certainly have given me, and I suspect Gurjeet, enough clue.
    Where on the page would you suggest that we put that text?
    Following "Enter your comments below. If you wish your comment to
    reference a message from the mailing list archives, enter the message ID
    into the space provided." The point is that because the app nicely turns
    the Message-id into a URL that links to the archives, it's not entirely
    clear whether the user needs to enter the whole URL or not.

    Another way to handle this might be to extract it from an http URL if
    given one.

    It's a minor nit - it didn't seem worth raising at the time, and I only
    commented when I saw that someone else had the same small confusion I
    had had.


    cheers

    andrew
  • Alvaro Herrera at Sep 27, 2010 at 3:33 pm

    Excerpts from Andrew Dunstan's message of lun sep 27 11:28:33 -0400 2010:

    Another way to handle this might be to extract it from an http URL if
    given one.
    +1 for this approach

    --
    Álvaro Herrera <alvherre@commandprompt.com>
    The PostgreSQL Company - Command Prompt, Inc.
    PostgreSQL Replication, Consulting, Custom Development, 24x7 support
  • Robert Haas at Sep 27, 2010 at 4:36 pm

    On Mon, Sep 27, 2010 at 11:28 AM, Andrew Dunstan wrote:
    Could this perhaps be made clearer on the page, perhaps with an example?
    It confused me recently too.

    Can you suggest something more specific?

    Well, it could say something like:

    The Message-ID can be found in the headers of the relevant email to the
    pgsql-hackers mailing list, and also in the mailing list archives at
    http://archives.postgresql.org. It looks something like this (the format
    varies somewhat depending on the sender's Mail User Agent):
    AANLkTinw0HL+jQmrtwXC9Y2tqhcfHFgFekxyyfYGvQrB@mail.gmail.com

    That would certainly have given me, and I suspect Gurjeet, enough clue.
    Where on the page would you suggest that we put that text?
    Following "Enter your comments below. If you wish your comment to reference
    a message from the mailing list archives, enter the message ID into the
    space provided." Done.
    Another way to handle this might be to extract it from an http URL if given
    one.
    I'm going to leave this idea for another day, though it's not a bad
    one if someone feels motivated to write the code.

    --
    Robert Haas
    EnterpriseDB: http://www.enterprisedb.com
    The Enterprise Postgres Company
  • Alvaro Herrera at Sep 27, 2010 at 3:28 pm

    Excerpts from Gurjeet Singh's message of dom sep 26 22:15:59 -0400 2010:

    Currently I am seeing a performance improvement of this script by only about
    500 ms; say 11.8 seconds vs. 11.3 secs. But I remember distinctly that
    yesterday I was able to see an improvement of 11% on the same virtual
    machine, averaged on multiple runs; 42 sec vs 37 sec. It might be the case
    that the host OS or my Linux virtual machine were loaded at that time and
    the filesystem could not cache enough inodes.
    Hmm. On my otherwise idle desktop machine, I can't measure a difference.
    But this machine has enough RAM for inode cache.

    With patch:

    real 0m3.092s
    user 0m0.900s
    sys 0m2.220s

    real 0m3.116s
    user 0m0.928s
    sys 0m2.176s

    real 0m3.128s
    user 0m1.040s
    sys 0m2.108s

    Without patch:

    real 0m3.109s
    user 0m0.852s
    sys 0m2.180s

    real 0m3.101s
    user 0m0.884s
    sys 0m2.264s

    real 0m3.121s
    user 0m0.968s
    sys 0m2.140s
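
To put the "real" times above side by side, here is a small sketch that
averages the three runs of each variant. The `mean` helper is a hypothetical
name introduced only for this example; the input numbers are copied from the
timings listed above.

```shell
#!/bin/sh
# Average the wall-clock ("real") times reported above, in seconds.
# mean is a hypothetical helper, not part of any script in this thread.
mean() {
    printf '%s\n' "$@" | awk '{ s += $1 } END { printf "%.3f\n", s / NR }'
}

with_patch=$(mean 3.092 3.116 3.128)
without_patch=$(mean 3.109 3.101 3.121)

echo "with patch:    $with_patch s"
echo "without patch: $without_patch s"
```

The two means land within a couple of milliseconds of each other, which is
well inside the run-to-run spread of the individual measurements.
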
    Seems like it would improve performance in general, but more so under load
    conditions when you actually need it. I am not sure if -depth option is
    supported by all incarnations of 'find'.
    --
    Álvaro Herrera <alvherre@commandprompt.com>
    The PostgreSQL Company - Command Prompt, Inc.
    PostgreSQL Replication, Consulting, Custom Development, 24x7 support
  • Tom Lane at Sep 27, 2010 at 5:21 pm

    Alvaro Herrera writes:
    Excerpts from Gurjeet Singh's message of dom sep 26 22:15:59 -0400 2010:
    Currently I am seeing a performance improvement of this script by only about
    500 ms; say 11.8 seconds vs. 11.3 secs. But I remember distinctly that
    yesterday I was able to see an improvement of 11% on the same virtual
    machine, averaged on multiple runs; 42 sec vs 37 sec. It might be the case
    that the host OS or my Linux virtual machine were loaded at that time and
    the filesystem could not cache enough inodes.
    Hmm. On my otherwise idle desktop machine, I can't measure a difference.
    Yeah, this seems like something that would have at best an
    environment-specific effect. I'm not convinced that it couldn't make
    things worse in some cases ...

    regards, tom lane
  • Gurjeet Singh at Sep 27, 2010 at 6:20 pm

    On Mon, Sep 27, 2010 at 7:21 PM, Tom Lane wrote:

    Alvaro Herrera <alvherre@commandprompt.com> writes:
    Excerpts from Gurjeet Singh's message of dom sep 26 22:15:59 -0400 2010:
    Currently I am seeing a performance improvement of this script by only
    about
    500 ms; say 11.8 seconds vs. 11.3 secs. But I remember distinctly that
    yesterday I was able to see an improvement of 11% on the same virtual
    machine, averaged on multiple runs; 42 sec vs 37 sec. It might be the
    case
    that the host OS or my Linux virtual machine were loaded at that time
    and
    the filesystem could not cache enough inodes.
    Hmm. On my otherwise idle desktop machine, I can't measure a difference.
    Yeah, this seems like something that would have at best an
    environment-specific effect. I'm not convinced that it couldn't make
    things worse in some cases ...
    I can't think of any obvious cases where this might hurt. I am unable to
    reproduce the 11% improvement, but I did see that dramatic change which
    prompted me for the patch. On the contrary, nothing so far suggests that it
    could hurt configure times.

    Regards,
    --
    gurjeet.singh
    @ EnterpriseDB - The Enterprise Postgres Company
    http://www.EnterpriseDB.com

    singh.gurjeet@{ gmail | yahoo }.com
    Twitter/Skype: singh_gurjeet

    Mail sent from my BlackLaptop device
  • Greg Smith at Nov 19, 2010 at 4:52 am

    Gurjeet Singh wrote:
    Seems like it would improve performance in general, but more so under
    load conditions when you actually need it. I am not sure if -depth
    option is supported by all incarnations of 'find'.
    Given the way directory writes are cached by the filesystem, I'm not
    sure why the load at the time matters so much. If what you mean by that
    is you're low on memory, that would make more sense to me.

    Anyway, "-depth" is in POSIX as of 2001. It seems to be in all the
    major SysV UNIX variants going back further than that, and of course
    it's in GNU find. But it looks like BSD derived systems from before
    that POSIX standard originally called this "-d" instead. So there's
    some potential for this to not work on older systems; it works fine on
    my test FreeBSD 7.2 system. Maybe someone who has access to some
    ancient BSD-ish system can try this out? The simplest test case similar
    to what the patch adds seems to be whether this runs and returns
    subdirectories before their parent, in depth-first order:

    $ find / -depth -type d -print

    If that fails somewhere, it may turn out to require another configure
    check just to determine whether you can use this configuration time
    optimization. That's certainly possible to add to the patch if it got
    committed and turns out to break one of the buildfarm members. It seems
    to me like this whole thing may be a bit more work than it's worth,
    given this is a fairly small and difficult to reproduce speedup in only
    one stage of the build process. I'd think that if configure takes
    longer than it has to because the system is heavily loaded, the amount
    by which compilation time suffers from that would always dwarf this
    component of total build time. But if this was slow enough at some
    point to motivate you to write a patch for it, maybe that assumption is
    wrong.
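
For what such a check might look like, here is a minimal sketch of a probe
that tests whether the local `find` accepts `-depth` and actually lists
children before parents. It is only an illustration, not an existing
configure test: the `find_depth_opt` variable and the scratch directory
layout are assumptions made up for the example, and `mktemp -d` is assumed
to be available.

```shell
#!/bin/sh
# Hypothetical probe: does this find support -depth with post-order output?
probe=$(mktemp -d)
trap 'rm -rf "$probe"' EXIT
mkdir -p "$probe/x/y"

# If -depth is rejected, find prints nothing on stdout and $first is empty.
# If it works, the deepest directory (./x/y) must be listed first.
first=$(cd "$probe" && find . -depth -type d -print 2>/dev/null | head -n 1)

if [ "$first" = "./x/y" ]; then
    find_depth_opt="-depth"
else
    find_depth_opt=""   # fall back to plain find ordering
fi

echo "find_depth_opt='$find_depth_opt'"
```

A script could then pass `$find_depth_opt` to `find` unquoted, so that an
empty value simply disappears from the command line.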

    --
    Greg Smith 2ndQuadrant US greg@2ndQuadrant.com Baltimore, MD
    PostgreSQL Training, Services and Support www.2ndQuadrant.us
  • Alvaro Herrera at Nov 19, 2010 at 12:50 pm

    Excerpts from Greg Smith's message of vie nov 19 01:52:34 -0300 2010:

    I'd think that if configure takes
    longer than it has to because the system is heavily loaded, the amount
    compilation time is going to suffer from that would always dwarf this
    component of total build time. But if this was slow enough at some
    point to motivate you to write a patch for it, maybe that assumption is
    wrong.
    What if instead of -depth you do something like
    find the_args | sort -r
    ? If you find a way to filter out the "parents" that you know have
    already been created, you could also cut down on the number of mkdir -p
    calls, which could result in a larger speedup. And maybe we should
    remove the test -d. Also, the `expr` call could be substituted by
    ${item##$sourcedir}, which is supposed to be a POSIX shell feature
    according to
    http://www.unix.org/whitepapers/shdiffs.html and
    http://www.opengroup.org/onlinepubs/009695399/utilities/xcu_chap02.html

    In short, there are plenty of optimization opportunities for this script
    without having to involve nonstandard constructs.
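
A minimal sketch of how these suggestions could fit together, under stated
assumptions: `sourcedir`, `builddir`, and the sample tree are hypothetical,
and the parent filter shown is just one simple heuristic, not something
proposed in this thread. It uses `${item#"$sourcedir"}` with a single `#`,
which behaves the same as `##` when the prefix is a quoted literal string.

```shell
#!/bin/sh
# Hypothetical sketch: reverse-sorted find output, POSIX prefix stripping
# instead of expr, and skipping parents that mkdir -p already created.
set -e
work=$(mktemp -d)
trap 'rm -rf "$work"' EXIT
sourcedir="$work/src"
builddir="$work/build"
mkdir -p "$sourcedir/a/b/c" "$sourcedir/a/d" "$builddir"

# Reverse byte order puts a/b/c ahead of a/b and a.
find "$sourcedir" -type d -print | LC_ALL=C sort -r > "$work/dirs.txt"

calls=0
prev=""
while read -r item; do
    rel=${item#"$sourcedir"}     # "/a/b/c", or "" for sourcedir itself
    case "$prev/" in
        "$rel"/*) continue ;;    # rel is an ancestor of the last created dir
    esac
    mkdir -p "$builddir$rel"
    calls=$((calls + 1))
    prev=$rel
done < "$work/dirs.txt"

echo "mkdir -p calls: $calls"
```

The filter only compares against the most recently created directory, so it
can still issue a redundant `mkdir -p` in some orderings, but it never skips
a directory that has not already been created.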

    --
    Álvaro Herrera <alvherre@commandprompt.com>
    The PostgreSQL Company - Command Prompt, Inc.
    PostgreSQL Replication, Consulting, Custom Development, 24x7 support
  • Robert Haas at Nov 21, 2010 at 5:31 pm

    On Fri, Nov 19, 2010 at 7:50 AM, Alvaro Herrera wrote:
    Excerpts from Greg Smith's message of vie nov 19 01:52:34 -0300 2010:
    I'd think that if configure takes
    longer than it has to because the system is heavily loaded, the amount
    compilation time is going to suffer from that would always dwarf this
    component of total build time.  But if this was slow enough at some
    point to motivate you to write a patch for it, maybe that assumption is
    wrong.
    What if instead of -depth you do something like
    find the_args | sort -r
    ?  If you find a way to filter out the "parents" that you know have
    already been created, you could also cut down on the number of mkdir -p
    calls, which could result in a larger speedup.  And maybe we should
    remove the test -d.  Also, the `expr` call could be substituted by
    ${item##$sourcedir}, which is supposed to be a POSIX shell feature
    according to
    http://www.unix.org/whitepapers/shdiffs.html and
    http://www.opengroup.org/onlinepubs/009695399/utilities/xcu_chap02.html

    In short, there are plenty of optimization opportunities for this script
    without having to involve nonstandard constructs.
    It seems that we have a general consensus that, aside from any
    portability concerns (which so far seem to be mostly theoretical),
    there's little to no evidence that it is a consistent win from a
    performance standpoint. Alvaro wasn't able to demonstrate a win at
    all, Tom theorized - albeit without evidence - that it might be a loss
    under some circumstances, and Gurjeet (the OP) could only reproduce
    about a ~4% speedup, amounting to 500 ms (although he did see an ~11%
    speedup, amounting to 5 s, on one occasion). So I agree with Greg
    Smith's comments a couple of days ago - it seems like this may not be
    worth worrying about. I'm going to mark this Returned with Feedback
    for now, though of course it can come back to life if more evidence
    that this is the right thing to do comes to light.

    --
    Robert Haas
    EnterpriseDB: http://www.enterprisedb.com
    The Enterprise PostgreSQL Company

Discussion Overview
group: pgsql-hackers
categories: postgresql
posted: Sep 27, '10 at 2:16a
active: Nov 21, '10 at 5:31p
posts: 25
users: 7
website: postgresql.org...
irc: #postgresql
