Avoid repeated PQfnumber() in pg_dump

Started by Zhijie Hou (Fujitsu)almost 5 years ago8 messageshackers
Jump to latest
#1Zhijie Hou (Fujitsu)
houzj.fnst@fujitsu.com

Hi,

When reviewing some pg_dump related code, I found some existsing code
(getTableAttrs() and dumpEnumType()) invoke PQfnumber() repeatedly which seems
unnecessary.

Example
-----
for (int j = 0; j < ntups; j++)
{
if (j + 1 != atoi(PQgetvalue(res, j, PQfnumber(res, "attnum"))))
-----

Since PQfnumber() is not a cheap function, I think we'd better invoke
PQfnumber() out of the loop like the attatched patch.

After applying this change, I can see about 8% performance gain in my test environment
when dump table definitions which have many columns.

Best regards,
Hou zhijie

Attachments:

0001-Avoid-repeated-PQfnumber.patchapplication/octet-stream; name=0001-Avoid-repeated-PQfnumber.patchDownload+71-24
#2Justin Pryzby
pryzby@telsasoft.com
In reply to: Zhijie Hou (Fujitsu) (#1)
Re: Avoid repeated PQfnumber() in pg_dump

On Wed, Jul 14, 2021 at 08:54:32AM +0000, houzj.fnst@fujitsu.com wrote:

Since PQfnumber() is not a cheap function, I think we'd better invoke
PQfnumber() out of the loop like the attatched patch.

+1

Please add to the next CF

--
Justin

#3Daniel Gustafsson
daniel@yesql.se
In reply to: Zhijie Hou (Fujitsu) (#1)
Re: Avoid repeated PQfnumber() in pg_dump

On 14 Jul 2021, at 10:54, houzj.fnst@fujitsu.com wrote:

Since PQfnumber() is not a cheap function, I think we'd better invoke
PQfnumber() out of the loop like the attatched patch.

Looks good on a quick readthrough, and I didn't see any other similar codepaths
in pg_dump on top of what you've fixed.

After applying this change, I can see about 8% performance gain in my test environment
when dump table definitions which have many columns.

Out of curiosity, how many columns are "many columns"?

--
Daniel Gustafsson https://vmware.com/

#4Zhijie Hou (Fujitsu)
houzj.fnst@fujitsu.com
In reply to: Daniel Gustafsson (#3)
RE: Avoid repeated PQfnumber() in pg_dump

On July 15, 2021 5:35 AM Daniel Gustafsson <daniel@yesql.se> wrote:

On 14 Jul 2021, at 10:54, houzj.fnst@fujitsu.com wrote:

Since PQfnumber() is not a cheap function, I think we'd better invoke
PQfnumber() out of the loop like the attatched patch.

Looks good on a quick readthrough, and I didn't see any other similar
codepaths in pg_dump on top of what you've fixed.

Thanks for reviewing the patch.
Added to the CF: https://commitfest.postgresql.org/34/3254/

After applying this change, I can see about 8% performance gain in my
test environment when dump table definitions which have many columns.

Out of curiosity, how many columns are "many columns"?

I tried dump 10 table definitions while each table has 1000 columns
(maybe not real world case).

Best regards,
houzj

#5Daniel Gustafsson
daniel@yesql.se
In reply to: Zhijie Hou (Fujitsu) (#4)
Re: Avoid repeated PQfnumber() in pg_dump

On 15 Jul 2021, at 04:51, houzj.fnst@fujitsu.com wrote:
On July 15, 2021 5:35 AM Daniel Gustafsson <daniel@yesql.se> wrote:

Out of curiosity, how many columns are "many columns"?

I tried dump 10 table definitions while each table has 1000 columns
(maybe not real world case).

While unlikely to be common, very wide tables aren’t unheard of. Either way, I
think it has merit to pull out the PQfnumber before the loop even if it’s a
wash performance wise for many users, as it’s a pattern used elsewhere in
pg_dump.

--
Daniel Gustafsson https://vmware.com/

#6Nathan Bossart
nathandbossart@gmail.com
In reply to: Daniel Gustafsson (#5)
Re: Avoid repeated PQfnumber() in pg_dump

On 7/15/21, 12:48 PM, "Daniel Gustafsson" <daniel@yesql.se> wrote:

While unlikely to be common, very wide tables aren’t unheard of. Either way, I
think it has merit to pull out the PQfnumber before the loop even if it’s a
wash performance wise for many users, as it’s a pattern used elsewhere in
pg_dump.

+1

The patch looks good to me. I am marking it as ready-for-committer.

Nathan

#7Daniel Gustafsson
daniel@yesql.se
In reply to: Nathan Bossart (#6)
Re: Avoid repeated PQfnumber() in pg_dump

On 23 Jul 2021, at 01:39, Bossart, Nathan <bossartn@amazon.com> wrote:

The patch looks good to me. I am marking it as ready-for-committer.

I took another look at this today and pushed it after verifying with a pgindent
run. Thanks!

--
Daniel Gustafsson https://vmware.com/

#8Nathan Bossart
nathandbossart@gmail.com
In reply to: Daniel Gustafsson (#7)
Re: Avoid repeated PQfnumber() in pg_dump

On 8/27/21, 7:27 AM, "Daniel Gustafsson" <daniel@yesql.se> wrote:

I took another look at this today and pushed it after verifying with a pgindent
run. Thanks!

Thank you!

Nathan