Improving prep_buildtree used in VPATH builds
Attached is the patch that tries to speedup prep_buildtree script, which is
used in VPATH builds, from our configure script.
The idea is to ask `find` to emit directory listing in depth-first order so
that the `mkdir -p` will create the deepest directory first and any
subsequent `mkdir -p` on an intermediate directory will not have to do
anything.
Currently I am seeing a performance improvement of this script by only about
500 ms; say 11.8 seconds vs. 11.3 secs. But I remember distinctly that
yesterday I was able to see an improvement of 11% on the same virtual
machine, averaged on multiple runs; 42 sec vs 37 sec. It might be the case
that the host OS or my Linux virtual machine were loaded at that time and
the filesystem could not cache enough inodes.
Seems like it would improve performance in general, but more so under load
conditions when you actually need it. I am not sure if -depth option is
supported by all incarnations of 'find'.
I have been away from Postgres development for quite a while, so would
appreciate if someone could tell me if such a patch should be submitted for
commitfest (since this is not actually a source patch).
Regards,
--
gurjeet.singh
@ EnterpriseDB - The Enterprise Postgres Company
http://www.EnterpriseDB.com
singh.gurjeet@{ gmail | yahoo }.com
Twitter/Skype: singh_gurjeet
Mail sent from my BlackLaptop device
Attachments:
prep_buildtree.patchapplication/octet-stream; name=prep_buildtree.patchDownload+1-1
On Sun, Sep 26, 2010 at 10:15 PM, Gurjeet Singh <singh.gurjeet@gmail.com> wrote:
I have been away from Postgres development for quite a while, so would
appreciate if someone could tell me if such a patch should be submitted for
commitfest (since this is not actually a source patch).
By all means add it to the open CF.
https://commitfest.postgresql.org/action/commitfest_view/open
--
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise Postgres Company
On Mon, Sep 27, 2010 at 3:02 PM, Robert Haas <robertmhaas@gmail.com> wrote:
On Sun, Sep 26, 2010 at 10:15 PM, Gurjeet Singh <singh.gurjeet@gmail.com>
wrote:I have been away from Postgres development for quite a while, so would
appreciate if someone could tell me if such a patch should be submittedfor
commitfest (since this is not actually a source patch).
By all means add it to the open CF.
https://commitfest.postgresql.org/action/commitfest_view/open
When trying to submit new patch, the 'CommitFest' drop-down has just one
entry '(None Selected)', and 'Submit' would refuse to go through without a
topic.
What should be the value of 'Message-ID for original patch' ?
the URL: http://archives.postgresql.org/pgsql-hackers/2010-09/msg01837.php
or the ID: AANLkTinw0HL+jQmrtwXC9Y2tqhcfHFgFekxyyfYGvQrB@mail.gmail.com<AANLkTinw0HL%2BjQmrtwXC9Y2tqhcfHFgFekxyyfYGvQrB@mail.gmail.com>
Regards,
--
gurjeet.singh
@ EnterpriseDB - The Enterprise Postgres Company
http://www.EnterpriseDB.com
singh.gurjeet@{ gmail | yahoo }.com
Twitter/Skype: singh_gurjeet
Mail sent from my BlackLaptop device
On Mon, Sep 27, 2010 at 10:09 AM, Gurjeet Singh <singh.gurjeet@gmail.com> wrote:
On Mon, Sep 27, 2010 at 3:02 PM, Robert Haas <robertmhaas@gmail.com> wrote:
On Sun, Sep 26, 2010 at 10:15 PM, Gurjeet Singh <singh.gurjeet@gmail.com>
wrote:I have been away from Postgres development for quite a while, so would
appreciate if someone could tell me if such a patch should be submitted
for
commitfest (since this is not actually a source patch).By all means add it to the open CF.
https://commitfest.postgresql.org/action/commitfest_view/open
When trying to submit new patch, the 'CommitFest' drop-down has just one
entry '(None Selected)', and 'Submit' would refuse to go through without a
topic.
Oh, blah. I just added a Miscellaneous topic. You can add others
yourself, just look on the CF page for "CommitFest topics".
What should be the value of 'Message-ID for original patch' ?
the URL: http://archives.postgresql.org/pgsql-hackers/2010-09/msg01837.php
or the ID: AANLkTinw0HL+jQmrtwXC9Y2tqhcfHFgFekxyyfYGvQrB@mail.gmail.com
The latter.
--
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise Postgres Company
On 09/27/2010 10:11 AM, Robert Haas wrote:
What should be the value of 'Message-ID for original patch' ?
the URL: http://archives.postgresql.org/pgsql-hackers/2010-09/msg01837.php
or the ID: AANLkTinw0HL+jQmrtwXC9Y2tqhcfHFgFekxyyfYGvQrB@mail.gmail.comThe latter.
Could this perhaps be made clearer on the page, perhaps with an example?
It confused me recently too.
cheers
andrew
On Mon, Sep 27, 2010 at 4:15 PM, Andrew Dunstan <andrew@dunslane.net> wrote:
On 09/27/2010 10:11 AM, Robert Haas wrote:
What should be the value of 'Message-ID for original patch' ?
the URL:
http://archives.postgresql.org/pgsql-hackers/2010-09/msg01837.php
or the ID: AANLkTinw0HL+jQmrtwXC9Y2tqhcfHFgFekxyyfYGvQrB@mail.gmail.com<AANLkTinw0HL%2BjQmrtwXC9Y2tqhcfHFgFekxyyfYGvQrB@mail.gmail.com>The latter.
Could this perhaps be made clearer on the page, perhaps with an example? It
confused me recently too.
Or maybe populate the drop-down with every available topic from previous
CFs, and add an additional '(Add new topic)' to the drop-down which would
take you to topic creation page.
Regards,
--
gurjeet.singh
@ EnterpriseDB - The Enterprise Postgres Company
http://www.EnterpriseDB.com
singh.gurjeet@{ gmail | yahoo }.com
Twitter/Skype: singh_gurjeet
Mail sent from my BlackLaptop device
On Mon, Sep 27, 2010 at 10:15 AM, Andrew Dunstan <andrew@dunslane.net> wrote:
On 09/27/2010 10:11 AM, Robert Haas wrote:
What should be the value of 'Message-ID for original patch' ?
the URL:
http://archives.postgresql.org/pgsql-hackers/2010-09/msg01837.php
or the ID: AANLkTinw0HL+jQmrtwXC9Y2tqhcfHFgFekxyyfYGvQrB@mail.gmail.comThe latter.
Could this perhaps be made clearer on the page, perhaps with an example? It
confused me recently too.
Can you suggest something more specific?
--
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise Postgres Company
On Mon, Sep 27, 2010 at 10:18 AM, Gurjeet Singh <singh.gurjeet@gmail.com> wrote:
On Mon, Sep 27, 2010 at 4:15 PM, Andrew Dunstan <andrew@dunslane.net> wrote:
On 09/27/2010 10:11 AM, Robert Haas wrote:
What should be the value of 'Message-ID for original patch' ?
the URL:
http://archives.postgresql.org/pgsql-hackers/2010-09/msg01837.php
or the ID: AANLkTinw0HL+jQmrtwXC9Y2tqhcfHFgFekxyyfYGvQrB@mail.gmail.comThe latter.
Could this perhaps be made clearer on the page, perhaps with an example?
It confused me recently too.Or maybe populate the drop-down with every available topic from previous
CFs, and add an additional '(Add new topic)' to the drop-down which would
take you to topic creation page.
Andrew's question seemed to be about the message-ID. I agree the
topic thing is confusing, though. I'm wondering if it would be
sufficient to do the following - if no topic are available, instead of
showing the form, it says something like:
No topics have been created for this CommitFest yet. Before adding
your patch, you must add one or more items to the <link>topic
list</link>.
--
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise Postgres Company
Robert Haas <robertmhaas@gmail.com> writes:
Andrew's question seemed to be about the message-ID. I agree the
topic thing is confusing, though. I'm wondering if it would be
sufficient to do the following - if no topic are available, instead of
showing the form, it says something like:
No topics have been created for this CommitFest yet. Before adding
your patch, you must add one or more items to the <link>topic
list</link>.
I liked the idea of pre-populating with the historical set of topics.
If you encourage the first few submitters to a new CF to invent their
own topic categories without any guidance, you're going to get some
crazy topics.
regards, tom lane
On Mon, Sep 27, 2010 at 10:54 AM, Tom Lane <tgl@sss.pgh.pa.us> wrote:
Robert Haas <robertmhaas@gmail.com> writes:
Andrew's question seemed to be about the message-ID. I agree the
topic thing is confusing, though. I'm wondering if it would be
sufficient to do the following - if no topic are available, instead of
showing the form, it says something like:No topics have been created for this CommitFest yet. Before adding
your patch, you must add one or more items to the <link>topic
list</link>.I liked the idea of pre-populating with the historical set of topics.
If you encourage the first few submitters to a new CF to invent their
own topic categories without any guidance, you're going to get some
crazy topics.
Well, the historical set of topics varies from CommitFest to
CommitFest, by design. There are some that recur pretty regularly, of
course, like Security, Performance, and Miscellaneous. But not every
CF will have a section for ECPG or Refactoring, for example. In one
CF, we may have six ECPG patches, so ECPG gets its own topic; in
another CF, 1 ECPG patch + 2 libpq patches + 1 psql patch get merged
together under a section called Interfaces. This generally makes it
easier to group things in ways that are useful in practice than a
fixed list of topics, so I'm in favor of keeping it that way.
This is surely a surmountable issue but the exact right thing to do is
not altogether obvious to me.
--
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise Postgres Company
Tom Lane <tgl@sss.pgh.pa.us> wrote:
I liked the idea of pre-populating with the historical set of
topics.
+1
-Kevin
Robert Haas <robertmhaas@gmail.com> writes:
Well, the historical set of topics varies from CommitFest to
CommitFest, by design. There are some that recur pretty regularly, of
course, like Security, Performance, and Miscellaneous. But not every
CF will have a section for ECPG or Refactoring, for example. In one
CF, we may have six ECPG patches, so ECPG gets its own topic; in
another CF, 1 ECPG patch + 2 libpq patches + 1 psql patch get merged
together under a section called Interfaces. This generally makes it
easier to group things in ways that are useful in practice than a
fixed list of topics, so I'm in favor of keeping it that way.
If it's intentional that the topic for the same patch might vary
depending on what else is submitted in the same CF, then I think that
asking submitters to select topics is the wrong thing from the get-go.
The patches should be uncategorized initially, and then someone like the
CF manager should group them into topics after-the-fact.
regards, tom lane
On 09/27/2010 10:39 AM, Robert Haas wrote:
On Mon, Sep 27, 2010 at 10:15 AM, Andrew Dunstan<andrew@dunslane.net> wrote:
On 09/27/2010 10:11 AM, Robert Haas wrote:
What should be the value of 'Message-ID for original patch' ?
the URL:
http://archives.postgresql.org/pgsql-hackers/2010-09/msg01837.php
or the ID: AANLkTinw0HL+jQmrtwXC9Y2tqhcfHFgFekxyyfYGvQrB@mail.gmail.comThe latter.
Could this perhaps be made clearer on the page, perhaps with an example? It
confused me recently too.Can you suggest something more specific?
Well, it could say something like:
The Message-ID can be found in the headers of the relevant email to
the pgsql-hackers mailing list, and also in the mailing list
archives at http://archives.postgresql.org. It looks something like
this (the format varies somewhat depending on the sender's Mail User
Agent): AANLkTinw0HL+jQmrtwXC9Y2tqhcfHFgFekxyyfYGvQrB@mail.gmail.com
That would certainly have given me, and I suspect Gurjeet, enough clue.
cheers
andrew
On Mon, Sep 27, 2010 at 11:08 AM, Tom Lane <tgl@sss.pgh.pa.us> wrote:
Robert Haas <robertmhaas@gmail.com> writes:
Well, the historical set of topics varies from CommitFest to
CommitFest, by design. There are some that recur pretty regularly, of
course, like Security, Performance, and Miscellaneous. But not every
CF will have a section for ECPG or Refactoring, for example. In one
CF, we may have six ECPG patches, so ECPG gets its own topic; in
another CF, 1 ECPG patch + 2 libpq patches + 1 psql patch get merged
together under a section called Interfaces. This generally makes it
easier to group things in ways that are useful in practice than a
fixed list of topics, so I'm in favor of keeping it that way.If it's intentional that the topic for the same patch might vary
depending on what else is submitted in the same CF, then I think that
asking submitters to select topics is the wrong thing from the get-go.
The patches should be uncategorized initially, and then someone like the
CF manager should group them into topics after-the-fact.
That's actually not a bad idea, although it would require a bit of
hacking given the way the schema is currently set up. The current
system has been working well enough that I'm inclined to do something
simpler for the present, like maybe just auto-create MIscellaneous for
each new CF. That would have more or less the same effect for about
one-tenth the work.
--
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise Postgres Company
On Mon, Sep 27, 2010 at 11:15 AM, Andrew Dunstan <andrew@dunslane.net> wrote:
On 09/27/2010 10:39 AM, Robert Haas wrote:
On Mon, Sep 27, 2010 at 10:15 AM, Andrew Dunstan <andrew@dunslane.net>
wrote:On 09/27/2010 10:11 AM, Robert Haas wrote:
What should be the value of 'Message-ID for original patch' ?
the URL:
http://archives.postgresql.org/pgsql-hackers/2010-09/msg01837.php
or the ID: AANLkTinw0HL+jQmrtwXC9Y2tqhcfHFgFekxyyfYGvQrB@mail.gmail.comThe latter.
Could this perhaps be made clearer on the page, perhaps with an example? It
confused me recently too.Can you suggest something more specific?
Well, it could say something like:
The Message-ID can be found in the headers of the relevant email to the
pgsql-hackers mailing list, and also in the mailing list archives at
http://archives.postgresql.org. It looks something like this (the format
varies somewhat depending on the sender's Mail User Agent):
AANLkTinw0HL+jQmrtwXC9Y2tqhcfHFgFekxyyfYGvQrB@mail.gmail.comThat would certainly have given me, and I suspect Gurjeet, enough clue.
Where on the page would you suggest that we put that text?
--
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise Postgres Company
Excerpts from Gurjeet Singh's message of dom sep 26 22:15:59 -0400 2010:
Currently I am seeing a performance improvement of this script by only about
500 ms; say 11.8 seconds vs. 11.3 secs. But I remember distinctly that
yesterday I was able to see an improvement of 11% on the same virtual
machine, averaged on multiple runs; 42 sec vs 37 sec. It might be the case
that the host OS or my Linux virtual machine were loaded at that time and
the filesystem could not cache enough inodes.
Hmm. On my otherwise idle desktop machine, I can't measure a difference.
But this machine has enough RAM for inode cache.
With patch:
real 0m3.092s
user 0m0.900s
sys 0m2.220s
real 0m3.116s
user 0m0.928s
sys 0m2.176s
real 0m3.128s
user 0m1.040s
sys 0m2.108s
Without patch:
real 0m3.109s
user 0m0.852s
sys 0m2.180s
real 0m3.101s
user 0m0.884s
sys 0m2.264s
real 0m3.121s
user 0m0.968s
sys 0m2.140s
Seems like it would improve performance in general, but more so under load
conditions when you actually need it. I am not sure if -depth option is
supported by all incarnations of 'find'.
--
Álvaro Herrera <alvherre@commandprompt.com>
The PostgreSQL Company - Command Prompt, Inc.
PostgreSQL Replication, Consulting, Custom Development, 24x7 support
On 09/27/2010 11:16 AM, Robert Haas wrote:
On Mon, Sep 27, 2010 at 11:15 AM, Andrew Dunstan<andrew@dunslane.net> wrote:
On 09/27/2010 10:39 AM, Robert Haas wrote:
On Mon, Sep 27, 2010 at 10:15 AM, Andrew Dunstan<andrew@dunslane.net>
wrote:On 09/27/2010 10:11 AM, Robert Haas wrote:
What should be the value of 'Message-ID for original patch' ?
the URL:
http://archives.postgresql.org/pgsql-hackers/2010-09/msg01837.php
or the ID: AANLkTinw0HL+jQmrtwXC9Y2tqhcfHFgFekxyyfYGvQrB@mail.gmail.comThe latter.
Could this perhaps be made clearer on the page, perhaps with an example? It
confused me recently too.Can you suggest something more specific?
Well, it could say something like:
The Message-ID can be found in the headers of the relevant email to the
pgsql-hackers mailing list, and also in the mailing list archives at
http://archives.postgresql.org. It looks something like this (the format
varies somewhat depending on the sender's Mail User Agent):
AANLkTinw0HL+jQmrtwXC9Y2tqhcfHFgFekxyyfYGvQrB@mail.gmail.comThat would certainly have given me, and I suspect Gurjeet, enough clue.
Where on the page would you suggest that we put that text?
Following "Enter your comments below. If you wish your comment to
reference a message from the mailing list archives, enter the message ID
into the space provided." The point is that because the app nicely turns
the Message-id into a URL that links to the archives, it's not entirely
clear whether the user needs to enter the whole URL or not.
Another way to handle this might be to extract it from an http URL if
given one.
It's a minor nit - it didn't seem worth raising at the time, and I only
commented when I saw that someone else had the same small confusion I
had had.
cheers
andrew
Excerpts from Andrew Dunstan's message of lun sep 27 11:28:33 -0400 2010:
Another way to handle this might be to extract it from an http URL if
given one.
+1 for this approach
--
Álvaro Herrera <alvherre@commandprompt.com>
The PostgreSQL Company - Command Prompt, Inc.
PostgreSQL Replication, Consulting, Custom Development, 24x7 support
On Mon, Sep 27, 2010 at 11:28 AM, Andrew Dunstan <andrew@dunslane.net> wrote:
Could this perhaps be made clearer on the page, perhaps with an example?
It confused me recently too.Can you suggest something more specific?
Well, it could say something like:
The Message-ID can be found in the headers of the relevant email to the
pgsql-hackers mailing list, and also in the mailing list archives at
http://archives.postgresql.org. It looks something like this (the format
varies somewhat depending on the sender's Mail User Agent):
AANLkTinw0HL+jQmrtwXC9Y2tqhcfHFgFekxyyfYGvQrB@mail.gmail.comThat would certainly have given me, and I suspect Gurjeet, enough clue.
Where on the page would you suggest that we put that text?
Following "Enter your comments below. If you wish your comment to reference
a message from the mailing list archives, enter the message ID into the
space provided."
Done.
Another way to handle this might be to extract it from an http URL if given
one.
I'm going to leave this idea for another day, thought it's not a bad
one if someone feels motivated to write the code.
--
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise Postgres Company
On Mon, Sep 27, 2010 at 11:15 AM, Robert Haas <robertmhaas@gmail.com> wrote:
On Mon, Sep 27, 2010 at 11:08 AM, Tom Lane <tgl@sss.pgh.pa.us> wrote:
Robert Haas <robertmhaas@gmail.com> writes:
Well, the historical set of topics varies from CommitFest to
CommitFest, by design. There are some that recur pretty regularly, of
course, like Security, Performance, and Miscellaneous. But not every
CF will have a section for ECPG or Refactoring, for example. In one
CF, we may have six ECPG patches, so ECPG gets its own topic; in
another CF, 1 ECPG patch + 2 libpq patches + 1 psql patch get merged
together under a section called Interfaces. This generally makes it
easier to group things in ways that are useful in practice than a
fixed list of topics, so I'm in favor of keeping it that way.If it's intentional that the topic for the same patch might vary
depending on what else is submitted in the same CF, then I think that
asking submitters to select topics is the wrong thing from the get-go.
The patches should be uncategorized initially, and then someone like the
CF manager should group them into topics after-the-fact.That's actually not a bad idea, although it would require a bit of
hacking given the way the schema is currently set up. The current
system has been working well enough that I'm inclined to do something
simpler for the present, like maybe just auto-create MIscellaneous for
each new CF. That would have more or less the same effect for about
one-tenth the work.
Eh, on further review, I decided to do something even simpler still,
which is to say unbreak the warning message that's supposed to appear
in this case. Doing one of the things listed above is probably
better, but this took approximately 60 seconds, so let's wait and see
whether it helps. If not, I'll whack it around some more.
--
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise Postgres Company