reduce size of fmgr_builtins array

Started by John Naylorabout 6 years ago4 messages
#1John Naylor
john.naylor@2ndquadrant.com
1 attachment(s)

Hi all,

Currently, we include the function name string in each FmgrBuiltin
struct, whose size is 24 bytes on 64 bit platforms. As far as I can
tell, the name is usually unused, so the attached (WIP, untested)
patch stores it separately, reducing this struct to 16 bytes.

We can go one step further and allocate the names as a single
character string, reducing the binary size. It doesn't help much to
store offsets, since there are ~40k characters, requiring 32-bit
offsets. If we instead compute the offset on the fly from stored name
lengths, we can use 8-bit values, saving 19kB of space in the binary
over using string pointers.

--
John Naylor https://www.2ndQuadrant.com/
PostgreSQL Development, 24x7 Support, Remote DBA, Training & Services

Attachments:

v1-fmgr-func-names.patchapplication/octet-stream; name=v1-fmgr-func-names.patchDownload
 src/backend/utils/Gen_fmgrtab.pl | 26 +++++++++++++++++++++++++-
 src/backend/utils/fmgr/fmgr.c    |  6 ++++--
 src/include/utils/fmgrtab.h      |  7 ++++++-
 3 files changed, 35 insertions(+), 4 deletions(-)

diff --git a/src/backend/utils/Gen_fmgrtab.pl b/src/backend/utils/Gen_fmgrtab.pl
index 80e99189e4..c0c4da5b29 100644
--- a/src/backend/utils/Gen_fmgrtab.pl
+++ b/src/backend/utils/Gen_fmgrtab.pl
@@ -207,7 +207,7 @@ my $fmgr_count       = 0;
 foreach my $s (sort { $a->{oid} <=> $b->{oid} } @fmgr)
 {
 	print $tfh
-	  "  { $s->{oid}, $s->{nargs}, $bmap{$s->{strict}}, $bmap{$s->{retset}}, \"$s->{prosrc}\", $s->{prosrc} }";
+	  "  { $s->{oid}, $s->{nargs}, $bmap{$s->{strict}}, $bmap{$s->{retset}}, $s->{prosrc} }";
 
 	$fmgr_builtin_oid_index[ $s->{oid} ] = $fmgr_count++;
 	$last_builtin_oid = $s->{oid};
@@ -258,6 +258,30 @@ for (my $i = 0; $i <= $last_builtin_oid; $i++)
 print $tfh "};\n";
 
 
+# Emit the string containing all the builtin function names.
+print $tfh qq|\nconst char fmgr_builtin_name_string[] =\n|;
+foreach my $s (sort { $a->{oid} <=> $b->{oid} } @fmgr)
+{
+	print $tfh qq|  "$s->{prosrc}\\0"\n|;
+}
+print $tfh ";\n";
+
+
+# Emit an array of lengths which will be used to calculate the
+# index into the function name string.
+printf $tfh "\nconst uint8 fmgr_builtin_name_lengths[] = {\n";
+my $length = 0;
+foreach my $s (sort { $a->{oid} <=> $b->{oid} } @fmgr)
+{
+	print $tfh "  $length,\n";
+
+	# Get the length of the current function name,
+	# and add 1 for the null terminator.
+	$length = length($s->{prosrc}) + 1;
+}
+print $tfh "};\n";
+
+
 # And add the file footers.
 print $ofh "\n#endif\t\t\t\t\t\t\t/* FMGROIDS_H */\n";
 print $pfh "\n#endif\t\t\t\t\t\t\t/* FMGRPROTOS_H */\n";
diff --git a/src/backend/utils/fmgr/fmgr.c b/src/backend/utils/fmgr/fmgr.c
index 2ce7a866c9..06d2d98747 100644
--- a/src/backend/utils/fmgr/fmgr.c
+++ b/src/backend/utils/fmgr/fmgr.c
@@ -97,11 +97,13 @@ fmgr_isbuiltin(Oid id)
 static const FmgrBuiltin *
 fmgr_lookupByName(const char *name)
 {
-	int			i;
+	int			i,
+				offset = 0;
 
 	for (i = 0; i < fmgr_nbuiltins; i++)
 	{
-		if (strcmp(name, fmgr_builtins[i].funcName) == 0)
+		offset += fmgr_builtin_name_lengths[i];
+		if (strcmp(name, fmgr_builtin_name_string + offset) == 0)
 			return fmgr_builtins + i;
 	}
 	return NULL;
diff --git a/src/include/utils/fmgrtab.h b/src/include/utils/fmgrtab.h
index e981f34934..07287d6f95 100644
--- a/src/include/utils/fmgrtab.h
+++ b/src/include/utils/fmgrtab.h
@@ -28,12 +28,17 @@ typedef struct
 	short		nargs;			/* 0..FUNC_MAX_ARGS, or -1 if variable count */
 	bool		strict;			/* T if function is "strict" */
 	bool		retset;			/* T if function returns a set */
-	const char *funcName;		/* C name of the function */
 	PGFunction	func;			/* pointer to compiled function */
 } FmgrBuiltin;
 
 extern const FmgrBuiltin fmgr_builtins[];
 
+/* names of builtin functions as a single character array */
+extern const char fmgr_builtin_name_string[];
+
+/* lengths of builtin function names, for computing the offset into the name string */
+extern const uint8 fmgr_builtin_name_lengths[];
+
 extern const int fmgr_nbuiltins;	/* number of entries in table */
 
 extern const Oid fmgr_last_builtin_oid; /* highest function OID in table */
#2John Naylor
john.naylor@2ndquadrant.com
In reply to: John Naylor (#1)
1 attachment(s)
Re: reduce size of fmgr_builtins array

I wrote:

Currently, we include the function name string in each FmgrBuiltin
struct, whose size is 24 bytes on 64 bit platforms. As far as I can
tell, the name is usually unused, so the attached (WIP, untested)
patch stores it separately, reducing this struct to 16 bytes.

We can go one step further and allocate the names as a single
character string, reducing the binary size. It doesn't help much to
store offsets, since there are ~40k characters, requiring 32-bit
offsets. If we instead compute the offset on the fly from stored name
lengths, we can use 8-bit values, saving 19kB of space in the binary
over using string pointers.

I tested with the attached C function to make sure
fmgr_internal_function() still returned the correct answer. I assume
this is not a performance critical function, but I still wanted to see
if there was a visible performance regression. I get this when calling
fmgr_internal_function() 100k times:

master: 833ms
patch: 886ms

The point of the patch is to increase the likelihood of
fmgr_isbuiltin() finding the fmgr_builtins[] element in L1 cache. It
seems harder to put a number on that for a realistic workload, but
reducing the array size by 1/3 couldn't hurt. I'll go ahead and add
this to the commitfest.

--
John Naylor https://www.2ndQuadrant.com/
PostgreSQL Development, 24x7 Support, Remote DBA, Training & Services

Attachments:

drive_builtin_by_name.capplication/octet-stream; name=drive_builtin_by_name.cDownload
#3Heikki Linnakangas
hlinnaka@iki.fi
In reply to: John Naylor (#2)
Re: reduce size of fmgr_builtins array

On 02/01/2020 01:15, John Naylor wrote:

I wrote:

Currently, we include the function name string in each FmgrBuiltin
struct, whose size is 24 bytes on 64 bit platforms. As far as I can
tell, the name is usually unused, so the attached (WIP, untested)
patch stores it separately, reducing this struct to 16 bytes.

We can go one step further and allocate the names as a single
character string, reducing the binary size. It doesn't help much to
store offsets, since there are ~40k characters, requiring 32-bit
offsets. If we instead compute the offset on the fly from stored name
lengths, we can use 8-bit values, saving 19kB of space in the binary
over using string pointers.

I tested with the attached C function to make sure
fmgr_internal_function() still returned the correct answer. I assume
this is not a performance critical function, but I still wanted to see
if there was a visible performance regression. I get this when calling
fmgr_internal_function() 100k times:

master: 833ms
patch: 886ms

Hmm. I was actually expecting this to slightly speed up
fmgr_internal_function(), now that all the names fit in a smaller amount
of cache. I guess there are more branches or a data dependency or
something now. I'm not too worried about that though. If it mattered, we
should switch to binary searching the array.

The point of the patch is to increase the likelihood of
fmgr_isbuiltin() finding the fmgr_builtins[] element in L1 cache. It
seems harder to put a number on that for a realistic workload, but
reducing the array size by 1/3 couldn't hurt.

Yeah. Nevertheless, it would be nice to be able to demonstrate the
benefit in some test, at least. It feels hard to justify committing a
performance patch if we can't show the benefit. Otherwise, we should
just try to keep it as simple as possible, to optimize for readability.

A similar approach was actually discussed a couple of years back:
/messages/by-id/bd13812c-c4ae-3788-5b28-5633beed2929@iki.fi.
The conclusion then was that it's not worth the trouble or the code
complication. So I think this patch is Rejected, unless you can come up
with a test case that concretely shows the benefit.

- Heikki

#4John Naylor
john.naylor@2ndquadrant.com
In reply to: Heikki Linnakangas (#3)
Re: reduce size of fmgr_builtins array

On Tue, Jan 7, 2020 at 9:08 PM Heikki Linnakangas <hlinnaka@iki.fi> wrote:

Yeah. Nevertheless, it would be nice to be able to demonstrate the
benefit in some test, at least. It feels hard to justify committing a
performance patch if we can't show the benefit. Otherwise, we should
just try to keep it as simple as possible, to optimize for readability.

A similar approach was actually discussed a couple of years back:
/messages/by-id/bd13812c-c4ae-3788-5b28-5633beed2929@iki.fi.
The conclusion then was that it's not worth the trouble or the code
complication. So I think this patch is Rejected, unless you can come up
with a test case that concretely shows the benefit.

Thanks for reviewing! As expected, a microbenchmark didn't show a
difference. I could try profiling in some workload, but I don't think
the benefit would be worth the effort involved. I've marked it
rejected.

--
John Naylor https://www.2ndQuadrant.com/
PostgreSQL Development, 24x7 Support, Remote DBA, Training & Services