Avoid repeated PQfnumber() in pg_dump

Started by houzj.fnst@fujitsu.comover 4 years ago8 messages
#1houzj.fnst@fujitsu.com
houzj.fnst@fujitsu.com
1 attachment(s)

Hi,

When reviewing some pg_dump related code, I found some existsing code
(getTableAttrs() and dumpEnumType()) invoke PQfnumber() repeatedly which seems
unnecessary.

Example
-----
for (int j = 0; j < ntups; j++)
{
if (j + 1 != atoi(PQgetvalue(res, j, PQfnumber(res, "attnum"))))
-----

Since PQfnumber() is not a cheap function, I think we'd better invoke
PQfnumber() out of the loop like the attatched patch.

After applying this change, I can see about 8% performance gain in my test environment
when dump table definitions which have many columns.

Best regards,
Hou zhijie

Attachments:

0001-Avoid-repeated-PQfnumber.patchapplication/octet-stream; name=0001-Avoid-repeated-PQfnumber.patchDownload
From fe41c4c6660f91627f183c7a2721297eb0ff2a69 Mon Sep 17 00:00:00 2001
From: "houzj.fnst" <houzj.fnst@fujitsu.com>
Date: Wed, 14 Jul 2021 16:01:28 +0800
Subject: [PATCH] dump

---
 src/bin/pg_dump/pg_dump.c | 94 +++++++++++++++++++++++++++++----------
 1 file changed, 71 insertions(+), 23 deletions(-)

diff --git a/src/bin/pg_dump/pg_dump.c b/src/bin/pg_dump/pg_dump.c
index 53693ec177..7328d5e4d9 100644
--- a/src/bin/pg_dump/pg_dump.c
+++ b/src/bin/pg_dump/pg_dump.c
@@ -8659,6 +8659,26 @@ getTableAttrs(Archive *fout, TableInfo *tblinfo, int numTables)
 {
 	DumpOptions *dopt = fout->dopt;
 	PQExpBuffer q = createPQExpBuffer();
+	int			i_attnum;
+	int			i_attname;
+	int			i_atttypname;
+	int			i_atttypmod;
+	int			i_attstattarget;
+	int			i_attstorage;
+	int			i_typstorage;
+	int			i_attidentity;
+	int			i_attgenerated;
+	int			i_attisdropped;
+	int			i_attlen;
+	int			i_attalign;
+	int			i_attislocal;
+	int			i_attnotnull;
+	int			i_attoptions;
+	int			i_attcollation;
+	int			i_attcompression;
+	int			i_attfdwoptions;
+	int			i_attmissingval;
+	int			i_atthasdef;
 
 	for (int i = 0; i < numTables; i++)
 	{
@@ -8802,32 +8822,53 @@ getTableAttrs(Archive *fout, TableInfo *tblinfo, int numTables)
 		tbinfo->attrdefs = (AttrDefInfo **) pg_malloc(ntups * sizeof(AttrDefInfo *));
 		hasdefaults = false;
 
+		i_attnum = PQfnumber(res, "attnum");
+		i_attname = PQfnumber(res, "attname");
+		i_atttypname = PQfnumber(res, "atttypname");
+		i_atttypmod = PQfnumber(res, "atttypmod");
+		i_attstattarget = PQfnumber(res, "attstattarget");
+		i_attstorage = PQfnumber(res, "attstorage");
+		i_typstorage = PQfnumber(res, "typstorage");
+		i_attidentity = PQfnumber(res, "attidentity");
+		i_attgenerated = PQfnumber(res, "attgenerated");
+		i_attisdropped = PQfnumber(res, "attisdropped");
+		i_attlen = PQfnumber(res, "attlen");
+		i_attalign = PQfnumber(res, "attalign");
+		i_attislocal = PQfnumber(res, "attislocal");
+		i_attnotnull = PQfnumber(res, "attnotnull");
+		i_attoptions = PQfnumber(res, "attoptions");
+		i_attcollation = PQfnumber(res, "attcollation");
+		i_attcompression = PQfnumber(res, "attcompression");
+		i_attfdwoptions = PQfnumber(res, "attfdwoptions");
+		i_attmissingval = PQfnumber(res, "attmissingval");
+		i_atthasdef = PQfnumber(res, "atthasdef");
+
 		for (int j = 0; j < ntups; j++)
 		{
-			if (j + 1 != atoi(PQgetvalue(res, j, PQfnumber(res, "attnum"))))
+			if (j + 1 != atoi(PQgetvalue(res, j, i_attnum)))
 				fatal("invalid column numbering in table \"%s\"",
 					  tbinfo->dobj.name);
-			tbinfo->attnames[j] = pg_strdup(PQgetvalue(res, j, PQfnumber(res, "attname")));
-			tbinfo->atttypnames[j] = pg_strdup(PQgetvalue(res, j, PQfnumber(res, "atttypname")));
-			tbinfo->atttypmod[j] = atoi(PQgetvalue(res, j, PQfnumber(res, "atttypmod")));
-			tbinfo->attstattarget[j] = atoi(PQgetvalue(res, j, PQfnumber(res, "attstattarget")));
-			tbinfo->attstorage[j] = *(PQgetvalue(res, j, PQfnumber(res, "attstorage")));
-			tbinfo->typstorage[j] = *(PQgetvalue(res, j, PQfnumber(res, "typstorage")));
-			tbinfo->attidentity[j] = *(PQgetvalue(res, j, PQfnumber(res, "attidentity")));
-			tbinfo->attgenerated[j] = *(PQgetvalue(res, j, PQfnumber(res, "attgenerated")));
+			tbinfo->attnames[j] = pg_strdup(PQgetvalue(res, j, i_attname));
+			tbinfo->atttypnames[j] = pg_strdup(PQgetvalue(res, j, i_atttypname));
+			tbinfo->atttypmod[j] = atoi(PQgetvalue(res, j, i_atttypmod));
+			tbinfo->attstattarget[j] = atoi(PQgetvalue(res, j, i_attstattarget));
+			tbinfo->attstorage[j] = *(PQgetvalue(res, j, i_attstorage));
+			tbinfo->typstorage[j] = *(PQgetvalue(res, j, i_typstorage));
+			tbinfo->attidentity[j] = *(PQgetvalue(res, j, i_attidentity));
+			tbinfo->attgenerated[j] = *(PQgetvalue(res, j, i_attgenerated));
 			tbinfo->needs_override = tbinfo->needs_override || (tbinfo->attidentity[j] == ATTRIBUTE_IDENTITY_ALWAYS);
-			tbinfo->attisdropped[j] = (PQgetvalue(res, j, PQfnumber(res, "attisdropped"))[0] == 't');
-			tbinfo->attlen[j] = atoi(PQgetvalue(res, j, PQfnumber(res, "attlen")));
-			tbinfo->attalign[j] = *(PQgetvalue(res, j, PQfnumber(res, "attalign")));
-			tbinfo->attislocal[j] = (PQgetvalue(res, j, PQfnumber(res, "attislocal"))[0] == 't');
-			tbinfo->notnull[j] = (PQgetvalue(res, j, PQfnumber(res, "attnotnull"))[0] == 't');
-			tbinfo->attoptions[j] = pg_strdup(PQgetvalue(res, j, PQfnumber(res, "attoptions")));
-			tbinfo->attcollation[j] = atooid(PQgetvalue(res, j, PQfnumber(res, "attcollation")));
-			tbinfo->attcompression[j] = *(PQgetvalue(res, j, PQfnumber(res, "attcompression")));
-			tbinfo->attfdwoptions[j] = pg_strdup(PQgetvalue(res, j, PQfnumber(res, "attfdwoptions")));
-			tbinfo->attmissingval[j] = pg_strdup(PQgetvalue(res, j, PQfnumber(res, "attmissingval")));
+			tbinfo->attisdropped[j] = (PQgetvalue(res, j, i_attisdropped)[0] == 't');
+			tbinfo->attlen[j] = atoi(PQgetvalue(res, j, i_attlen));
+			tbinfo->attalign[j] = *(PQgetvalue(res, j, i_attalign));
+			tbinfo->attislocal[j] = (PQgetvalue(res, j, i_attislocal)[0] == 't');
+			tbinfo->notnull[j] = (PQgetvalue(res, j, i_attnotnull)[0] == 't');
+			tbinfo->attoptions[j] = pg_strdup(PQgetvalue(res, j, i_attoptions));
+			tbinfo->attcollation[j] = atooid(PQgetvalue(res, j, i_attcollation));
+			tbinfo->attcompression[j] = *(PQgetvalue(res, j, i_attcompression));
+			tbinfo->attfdwoptions[j] = pg_strdup(PQgetvalue(res, j, i_attfdwoptions));
+			tbinfo->attmissingval[j] = pg_strdup(PQgetvalue(res, j, i_attmissingval));
 			tbinfo->attrdefs[j] = NULL; /* fix below */
-			if (PQgetvalue(res, j, PQfnumber(res, "atthasdef"))[0] == 't')
+			if (PQgetvalue(res, j, i_atthasdef)[0] == 't')
 				hasdefaults = true;
 			/* these flags will be set in flagInhAttrs() */
 			tbinfo->inhNotNull[j] = false;
@@ -10675,6 +10716,8 @@ dumpEnumType(Archive *fout, const TypeInfo *tyinfo)
 	char	   *qtypname;
 	char	   *qualtypname;
 	char	   *label;
+	int			i_enumlabel;
+	int			i_oid;
 
 	if (fout->remoteVersion >= 90100)
 		appendPQExpBuffer(query, "SELECT oid, enumlabel "
@@ -10712,10 +10755,12 @@ dumpEnumType(Archive *fout, const TypeInfo *tyinfo)
 
 	if (!dopt->binary_upgrade)
 	{
+		i_enumlabel = PQfnumber(res, "enumlabel");
+
 		/* Labels with server-assigned oids */
 		for (i = 0; i < num; i++)
 		{
-			label = PQgetvalue(res, i, PQfnumber(res, "enumlabel"));
+			label = PQgetvalue(res, i, i_enumlabel);
 			if (i > 0)
 				appendPQExpBufferChar(q, ',');
 			appendPQExpBufferStr(q, "\n    ");
@@ -10727,11 +10772,14 @@ dumpEnumType(Archive *fout, const TypeInfo *tyinfo)
 
 	if (dopt->binary_upgrade)
 	{
+		i_oid = PQfnumber(res, "oid");
+		i_enumlabel = PQfnumber(res, "enumlabel");
+
 		/* Labels with dump-assigned (preserved) oids */
 		for (i = 0; i < num; i++)
 		{
-			enum_oid = atooid(PQgetvalue(res, i, PQfnumber(res, "oid")));
-			label = PQgetvalue(res, i, PQfnumber(res, "enumlabel"));
+			enum_oid = atooid(PQgetvalue(res, i, i_oid));
+			label = PQgetvalue(res, i, i_enumlabel);
 
 			if (i == 0)
 				appendPQExpBufferStr(q, "\n-- For binary upgrade, must preserve pg_enum oids\n");
-- 
2.18.4

#2Justin Pryzby
pryzby@telsasoft.com
In reply to: houzj.fnst@fujitsu.com (#1)
Re: Avoid repeated PQfnumber() in pg_dump

On Wed, Jul 14, 2021 at 08:54:32AM +0000, houzj.fnst@fujitsu.com wrote:

Since PQfnumber() is not a cheap function, I think we'd better invoke
PQfnumber() out of the loop like the attatched patch.

+1

Please add to the next CF

--
Justin

#3Daniel Gustafsson
daniel@yesql.se
In reply to: houzj.fnst@fujitsu.com (#1)
Re: Avoid repeated PQfnumber() in pg_dump

On 14 Jul 2021, at 10:54, houzj.fnst@fujitsu.com wrote:

Since PQfnumber() is not a cheap function, I think we'd better invoke
PQfnumber() out of the loop like the attatched patch.

Looks good on a quick readthrough, and I didn't see any other similar codepaths
in pg_dump on top of what you've fixed.

After applying this change, I can see about 8% performance gain in my test environment
when dump table definitions which have many columns.

Out of curiosity, how many columns are "many columns"?

--
Daniel Gustafsson https://vmware.com/

#4houzj.fnst@fujitsu.com
houzj.fnst@fujitsu.com
In reply to: Daniel Gustafsson (#3)
RE: Avoid repeated PQfnumber() in pg_dump

On July 15, 2021 5:35 AM Daniel Gustafsson <daniel@yesql.se> wrote:

On 14 Jul 2021, at 10:54, houzj.fnst@fujitsu.com wrote:

Since PQfnumber() is not a cheap function, I think we'd better invoke
PQfnumber() out of the loop like the attatched patch.

Looks good on a quick readthrough, and I didn't see any other similar
codepaths in pg_dump on top of what you've fixed.

Thanks for reviewing the patch.
Added to the CF: https://commitfest.postgresql.org/34/3254/

After applying this change, I can see about 8% performance gain in my
test environment when dump table definitions which have many columns.

Out of curiosity, how many columns are "many columns"?

I tried dump 10 table definitions while each table has 1000 columns
(maybe not real world case).

Best regards,
houzj

#5Daniel Gustafsson
daniel@yesql.se
In reply to: houzj.fnst@fujitsu.com (#4)
Re: Avoid repeated PQfnumber() in pg_dump

On 15 Jul 2021, at 04:51, houzj.fnst@fujitsu.com wrote:
On July 15, 2021 5:35 AM Daniel Gustafsson <daniel@yesql.se> wrote:

Out of curiosity, how many columns are "many columns"?

I tried dump 10 table definitions while each table has 1000 columns
(maybe not real world case).

While unlikely to be common, very wide tables aren’t unheard of. Either way, I
think it has merit to pull out the PQfnumber before the loop even if it’s a
wash performance wise for many users, as it’s a pattern used elsewhere in
pg_dump.

--
Daniel Gustafsson https://vmware.com/

#6Bossart, Nathan
bossartn@amazon.com
In reply to: Daniel Gustafsson (#5)
Re: Avoid repeated PQfnumber() in pg_dump

On 7/15/21, 12:48 PM, "Daniel Gustafsson" <daniel@yesql.se> wrote:

While unlikely to be common, very wide tables aren’t unheard of. Either way, I
think it has merit to pull out the PQfnumber before the loop even if it’s a
wash performance wise for many users, as it’s a pattern used elsewhere in
pg_dump.

+1

The patch looks good to me. I am marking it as ready-for-committer.

Nathan

#7Daniel Gustafsson
daniel@yesql.se
In reply to: Bossart, Nathan (#6)
Re: Avoid repeated PQfnumber() in pg_dump

On 23 Jul 2021, at 01:39, Bossart, Nathan <bossartn@amazon.com> wrote:

The patch looks good to me. I am marking it as ready-for-committer.

I took another look at this today and pushed it after verifying with a pgindent
run. Thanks!

--
Daniel Gustafsson https://vmware.com/

#8Bossart, Nathan
bossartn@amazon.com
In reply to: Daniel Gustafsson (#7)
Re: Avoid repeated PQfnumber() in pg_dump

On 8/27/21, 7:27 AM, "Daniel Gustafsson" <daniel@yesql.se> wrote:

I took another look at this today and pushed it after verifying with a pgindent
run. Thanks!

Thank you!

Nathan