Avoid repeated PQfnumber() in pg_dump
Hi,
When reviewing some pg_dump related code, I found some existsing code
(getTableAttrs() and dumpEnumType()) invoke PQfnumber() repeatedly which seems
unnecessary.
Example
-----
for (int j = 0; j < ntups; j++)
{
if (j + 1 != atoi(PQgetvalue(res, j, PQfnumber(res, "attnum"))))
-----
Since PQfnumber() is not a cheap function, I think we'd better invoke
PQfnumber() out of the loop like the attatched patch.
After applying this change, I can see about 8% performance gain in my test environment
when dump table definitions which have many columns.
Best regards,
Hou zhijie
Attachments:
0001-Avoid-repeated-PQfnumber.patchapplication/octet-stream; name=0001-Avoid-repeated-PQfnumber.patchDownload+71-24
On Wed, Jul 14, 2021 at 08:54:32AM +0000, houzj.fnst@fujitsu.com wrote:
Since PQfnumber() is not a cheap function, I think we'd better invoke
PQfnumber() out of the loop like the attatched patch.
+1
Please add to the next CF
--
Justin
On 14 Jul 2021, at 10:54, houzj.fnst@fujitsu.com wrote:
Since PQfnumber() is not a cheap function, I think we'd better invoke
PQfnumber() out of the loop like the attatched patch.
Looks good on a quick readthrough, and I didn't see any other similar codepaths
in pg_dump on top of what you've fixed.
After applying this change, I can see about 8% performance gain in my test environment
when dump table definitions which have many columns.
Out of curiosity, how many columns are "many columns"?
--
Daniel Gustafsson https://vmware.com/
On July 15, 2021 5:35 AM Daniel Gustafsson <daniel@yesql.se> wrote:
On 14 Jul 2021, at 10:54, houzj.fnst@fujitsu.com wrote:
Since PQfnumber() is not a cheap function, I think we'd better invoke
PQfnumber() out of the loop like the attatched patch.Looks good on a quick readthrough, and I didn't see any other similar
codepaths in pg_dump on top of what you've fixed.
Thanks for reviewing the patch.
Added to the CF: https://commitfest.postgresql.org/34/3254/
After applying this change, I can see about 8% performance gain in my
test environment when dump table definitions which have many columns.Out of curiosity, how many columns are "many columns"?
I tried dump 10 table definitions while each table has 1000 columns
(maybe not real world case).
Best regards,
houzj
On 15 Jul 2021, at 04:51, houzj.fnst@fujitsu.com wrote:
On July 15, 2021 5:35 AM Daniel Gustafsson <daniel@yesql.se> wrote:
Out of curiosity, how many columns are "many columns"?
I tried dump 10 table definitions while each table has 1000 columns
(maybe not real world case).
While unlikely to be common, very wide tables aren’t unheard of. Either way, I
think it has merit to pull out the PQfnumber before the loop even if it’s a
wash performance wise for many users, as it’s a pattern used elsewhere in
pg_dump.
--
Daniel Gustafsson https://vmware.com/
On 7/15/21, 12:48 PM, "Daniel Gustafsson" <daniel@yesql.se> wrote:
While unlikely to be common, very wide tables aren’t unheard of. Either way, I
think it has merit to pull out the PQfnumber before the loop even if it’s a
wash performance wise for many users, as it’s a pattern used elsewhere in
pg_dump.
+1
The patch looks good to me. I am marking it as ready-for-committer.
Nathan
On 23 Jul 2021, at 01:39, Bossart, Nathan <bossartn@amazon.com> wrote:
The patch looks good to me. I am marking it as ready-for-committer.
I took another look at this today and pushed it after verifying with a pgindent
run. Thanks!
--
Daniel Gustafsson https://vmware.com/