Binary search in fmgr_isbuiltin() is a bottleneck.

Started by Andres Freundover 8 years ago33 messageshackers
Jump to latest
#1Andres Freund
andres@anarazel.de

Hi,

Surprising myself I discovered that in workloads that do a large number
of fmgr_info* lookups, fmgr_isbuiltin() is actually quite the
bottleneck.

In my development build we have 2765 builtin functions, stored in a 88KB
array. Apparently the ~12 steps are quite noticeable. Generally binary
searches have quite a poor memory access pattern...

In the workload I playing around with right now (producing this wall of
performance fixes) the main source of lookups is
printtup_prepare_info(), which does a fmgr_info for every column. If you
have a large number of columns ...

I think we could conceivable deduplicate the output functions for
printtup to one FmgrInfo per type? I'd assume that it'd be good for
things besides reducing the overhead - alowing the respective function
to reuse fn_extra might be quite good. I've not implemented that idea
at this point, I'm not 100% what the best way to do so is without also
causing slowdowns.

Another idea would be to have an array of FmgrBuiltin*, that we index by
oid. That'd not be super small though, given that the space for function
oids is sparse.

Thus what I've instead done is replacing the binary search in
fmgr_isbuiltin() with a simplehash.h style hashtable. After that the
lookup is still visible in the profile, but far less prominent.

I'd like to move the hash creation out of fmgr_isbuiltin (to avoid
having to check whether it's already created), but there's currently no
convenient place to create the hash from. Now that we don't rely on
the sortedness of fmgrtab.c we could remove a few lines from
Gen_fmgrtab.pl, but I don't quite see the advantage. If we were
interested in a faster by-name lookup we could sort it by name, but
that'd be better solved by another hashtable...

I was wondering about also replacing the C function hash with a
simplehash, but given that I've not seen this in profiles, I've not
bothered so far.

Greetings,

Andres Freund

Attachments:

0004-Add-inline-murmurhash32-int32-function.patchtext/x-diff; charset=us-asciiDownload+20-19
0005-Replace-binary-search-in-fmgr_isbuiltin-with-hashtab.patchtext/x-diff; charset=us-asciiDownload+47-17
#2Jeevan Ladhe
jeevan.ladhe@enterprisedb.com
In reply to: Andres Freund (#1)
Re: Binary search in fmgr_isbuiltin() is a bottleneck.

Hi Andres,

Another idea would be to have an array of FmgrBuiltin*, that we index by

oid. That'd not be super small though, given that the space for function
oids is sparse.

I totally agree here, as the oids are very much scattered having an
array is not feasible here.

Thus what I've instead done is replacing the binary search in
fmgr_isbuiltin() with a simplehash.h style hashtable. After that the
lookup is still visible in the profile, but far less prominent.

I'd like to move the hash creation out of fmgr_isbuiltin (to avoid
having to check whether it's already created), but there's currently no
convenient place to create the hash from. Now that we don't rely on
the sortedness of fmgrtab.c we could remove a few lines from
Gen_fmgrtab.pl, but I don't quite see the advantage. If we were
interested in a faster by-name lookup we could sort it by name, but
that'd be better solved by another hashtable...

I looked into these patches.
Seems patch 004 is already committed, commit id:
791961f59b792fbd4f0a992d3ccab47298e79103

About patch 0005:
The patch still applies cleanly.
There are no failures in ‘make check’

+ /* TODO: it'd be better if this were done separately */
+ if (unlikely(oid2builtins == NULL))
  {
- int i = (high + low) / 2;
- const FmgrBuiltin *ptr = &fmgr_builtins[i];
+ int i;
- if (id == ptr->foid)
- return ptr;
- else if (id > ptr->foid)
- low = i + 1;
- else
- high = i - 1;
+ oid2builtins = oid2builtins_create(TopMemoryContext,
+   fmgr_nbuiltins,
+   NULL);
+ for (i = 0; i < fmgr_nbuiltins; i++)
+ {
+ const FmgrBuiltin *ptr = &fmgr_builtins[i];
+ bool found;
+
+ entry = oid2builtins_insert(oid2builtins, ptr->foid, &found);
+ Assert(!found);
+ entry->builtin = ptr;
+ }

As Andres has already pointed, may be we want to move above code in a
separate
function, and just call that function here in case the hash is not already
built.

Further I am thinking about doing some performance testing, Andres can you
please
point me how did you test it and what perf numbers you saw for this
particular patch(0005).

Regards,
Jeevan Ladhe

#3tushar
tushar.ahuja@enterprisedb.com
In reply to: Andres Freund (#1)
Re: Binary search in fmgr_isbuiltin() is a bottleneck.

On 09/14/2017 12:21 PM, Andres Freund wrote:

Hi,

Surprising myself I discovered that in workloads that do a large number
of fmgr_info* lookups, fmgr_isbuiltin() is actually quite the
bottleneck.

After discussion with Jeevan Ladhe, we created a sql query which contain
lots of inbuild function and tested that against pgbench��� with� master
v/s patch and found an improvement

Virtual Machine configuration - Centos 6.5 x64 / 16 GB RAM / 8 VCPU core
processor

pgbench -c 8 -j 8 -f /tmy/mytest.sql� -T 300 postgres

PG Head������������ -�� tps = 5309.810807 (excluding connections
establishing).
PG HEAD+patch -� tps =� 5751.745767(8.32+% vs. head)

pgbench -c 8 -j 8 -f /tmp/mytest.sql�� -T 500 postgres

PG Head������������ -� tps = 7701.176220(excluding connections
establishing).
PG HEAD+patch -� tps = 7953.934043(3.27+% vs. head)

--
regards,tushar
EnterpriseDB https://www.enterprisedb.com/
The Enterprise PostgreSQL Company

Attachments:

mytest.sqlapplication/sql; name=mytest.sqlDownload
#4Robert Haas
robertmhaas@gmail.com
In reply to: Jeevan Ladhe (#2)
Re: Binary search in fmgr_isbuiltin() is a bottleneck.

On Mon, Sep 25, 2017 at 8:42 AM, Jeevan Ladhe
<jeevan.ladhe@enterprisedb.com> wrote:

As Andres has already pointed, may be we want to move above code in a
separate
function, and just call that function here in case the hash is not already
built.

No, I think what Andres is saying is that we ought to build the hash
table before we ever reach this function, so that we don't have to
have a branch here to check whether it's been done. I don't see why
that's particularly hard -- it can be jammed into the startup sequence
someplace early, I assume. In EXEC_BACKEND builds it will have to be
redone in each child, but that's just a matter of sticking a call into
SubPostmasterMain() as well as PostMasterMain().

Aside from that issue, this seems like a pretty boring patch. If a
hash table is faster than a binary search, then it is. Using
simplehash makes sense for this application, I think, and I'm not
really sure what else there is to say.

--
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company

--
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers

#5Robert Haas
robertmhaas@gmail.com
In reply to: Robert Haas (#4)
Re: Binary search in fmgr_isbuiltin() is a bottleneck.

On Wed, Sep 27, 2017 at 11:18 AM, Robert Haas <robertmhaas@gmail.com> wrote:

No, I think what Andres is saying is that we ought to build the hash
table before we ever reach this function, so that we don't have to
have a branch here to check whether it's been done. I don't see why
that's particularly hard -- it can be jammed into the startup sequence
someplace early, I assume. In EXEC_BACKEND builds it will have to be
redone in each child, but that's just a matter of sticking a call into
SubPostmasterMain() as well as PostMasterMain().

I suppose an even better approach would be to build a perfect hash
table at compile time so that nothing needs to be built at run-time at
all, but I'm not sure it's worth the trouble.

--
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company

--
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers

#6Tom Lane
tgl@sss.pgh.pa.us
In reply to: Robert Haas (#5)
Re: Binary search in fmgr_isbuiltin() is a bottleneck.

Robert Haas <robertmhaas@gmail.com> writes:

I suppose an even better approach would be to build a perfect hash
table at compile time so that nothing needs to be built at run-time at
all, but I'm not sure it's worth the trouble.

Yeah, I was wondering about that too. It would likely mean adding a
compile time dependency on gperf or similar tool, but we could take
our standard approach of shipping the output in tarballs, so that only
developers would really need to install that tool.

Rebuilding a constant table during every backend start seems like a
pretty brute-force answer.

regards, tom lane

--
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers

#7Andres Freund
andres@anarazel.de
In reply to: Tom Lane (#6)
Re: Binary search in fmgr_isbuiltin() is a bottleneck.

On 2017-09-27 11:50:56 -0400, Tom Lane wrote:

Robert Haas <robertmhaas@gmail.com> writes:

I suppose an even better approach would be to build a perfect hash
table at compile time so that nothing needs to be built at run-time at
all, but I'm not sure it's worth the trouble.

Yeah, I was wondering about that too. It would likely mean adding a
compile time dependency on gperf or similar tool, but we could take
our standard approach of shipping the output in tarballs, so that only
developers would really need to install that tool.

I'd been wondering about that too, but I'm not sure I buy that it's
worth the effort. The only real argument I see is that there's probably
multiple cases where it'd be potentially beneficial, not just here.

Rebuilding a constant table during every backend start seems like a
pretty brute-force answer.

We could relatively easily move it to be once-per-postmaster start for
!EXEC_BACKEND builds. Constantly doing expensive binary searches is also
pretty brute force ;)

I've been wondering about not actually eagerly filling that hashtable,
but using it for pretty much all of fmgr.c - but that seems like a good
chunk more work...

Greetings,

Andres Freund

--
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers

#8Robert Haas
robertmhaas@gmail.com
In reply to: Andres Freund (#7)
Re: Binary search in fmgr_isbuiltin() is a bottleneck.

On Wed, Sep 27, 2017 at 1:00 PM, Andres Freund <andres@anarazel.de> wrote:

We could relatively easily move it to be once-per-postmaster start for
!EXEC_BACKEND builds. Constantly doing expensive binary searches is also
pretty brute force ;)

I think one useful way to look at it might be -

How many fmgr searches does a backend need to do in order to make up
for the time spent building the hash table?

How many fmgr searches, if any, do we do during backend startup as a
matter of course?

If we're going to make up the time that it takes to build the hash
table by the time the user runs a handful of queries, there's really
no point in stressing about this. The use case where somebody
repeatedly connects and disconnects without running any significant
number of real queries is not an important one.

--
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company

--
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers

#9Andres Freund
andres@anarazel.de
In reply to: Robert Haas (#8)
Re: Binary search in fmgr_isbuiltin() is a bottleneck.

On 2017-09-27 13:28:22 -0400, Robert Haas wrote:

On Wed, Sep 27, 2017 at 1:00 PM, Andres Freund <andres@anarazel.de> wrote:

We could relatively easily move it to be once-per-postmaster start for
!EXEC_BACKEND builds. Constantly doing expensive binary searches is also
pretty brute force ;)

I think one useful way to look at it might be -

How many fmgr searches does a backend need to do in order to make up
for the time spent building the hash table?

How many fmgr searches, if any, do we do during backend startup as a
matter of course?

If we're going to make up the time that it takes to build the hash
table by the time the user runs a handful of queries, there's really
no point in stressing about this. The use case where somebody
repeatedly connects and disconnects without running any significant
number of real queries is not an important one.

I don't think you can even measure the overhead of building the
table. This is inserting ~8k rows in an accurately sized hashtable - a
vanishingly small amount of time in comparison to the backend startup
time (and even more so postmaster startup). My measurement shows it
takes about 0.4 ms to build (gdb in, query, reset oid2builtins = 0,
query - repeat a couple times).

Greetings,

Andres Freund

--
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers

#10Tom Lane
tgl@sss.pgh.pa.us
In reply to: Andres Freund (#7)
Re: Binary search in fmgr_isbuiltin() is a bottleneck.

Andres Freund <andres@anarazel.de> writes:

On 2017-09-27 11:50:56 -0400, Tom Lane wrote:

Robert Haas <robertmhaas@gmail.com> writes:

I suppose an even better approach would be to build a perfect hash
table at compile time so that nothing needs to be built at run-time at
all, but I'm not sure it's worth the trouble.

Yeah, I was wondering about that too. It would likely mean adding a
compile time dependency on gperf or similar tool, but we could take
our standard approach of shipping the output in tarballs, so that only
developers would really need to install that tool.

I'd been wondering about that too, but I'm not sure I buy that it's
worth the effort. The only real argument I see is that there's probably
multiple cases where it'd be potentially beneficial, not just here.

The other question that ought to be answered is whether a gperf hash
table would be faster. In principle it could be because of
guaranteed-no-collisions, but I have no experience with how fast the
constructed hash functions might be compared to our regular one.

To me, the real takeaway from this thread is that fmgr_isbuiltin()
needs optimization at all; I'd have guessed it didn't matter. But
now that we know that it does, it's worth looking hard at what we
can squeeze out of it.

regards, tom lane

--
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers

#11Andres Freund
andres@anarazel.de
In reply to: Tom Lane (#10)
Re: Binary search in fmgr_isbuiltin() is a bottleneck.

On 2017-09-27 13:46:50 -0400, Tom Lane wrote:

Andres Freund <andres@anarazel.de> writes:

I'd been wondering about that too, but I'm not sure I buy that it's
worth the effort. The only real argument I see is that there's probably
multiple cases where it'd be potentially beneficial, not just here.

The other question that ought to be answered is whether a gperf hash
table would be faster. In principle it could be because of
guaranteed-no-collisions, but I have no experience with how fast the
constructed hash functions might be compared to our regular one.

The patch uses murmurhash32, i.e. a short and fast hash, already for
performance, and it shows up in profiles.

Ugh, hacking together a quick input file for gperf, I'm *far* from
convinced. The generated code does multiple lookups in significantly
sized arrays, and assumes string input. The latter seems like a complete
dealbreaker, and there doesn't seem to be an option to turn it off.

Greetings,

Andres Freund

Attachments:

fmgrhash.itext/plain; charset=us-asciiDownload
fmgrhash.ctext/plain; charset=us-asciiDownload
#12Tom Lane
tgl@sss.pgh.pa.us
In reply to: Andres Freund (#11)
Re: Binary search in fmgr_isbuiltin() is a bottleneck.

Andres Freund <andres@anarazel.de> writes:

On 2017-09-27 13:46:50 -0400, Tom Lane wrote:

The other question that ought to be answered is whether a gperf hash
table would be faster.

Ugh, hacking together a quick input file for gperf, I'm *far* from
convinced. The generated code does multiple lookups in significantly
sized arrays, and assumes string input. The latter seems like a complete
dealbreaker, and there doesn't seem to be an option to turn it off.

Ugh. I'd never actually used gperf, and now I know why not ;-)

However, that's just the first tool that came to mind. Wikipedia's
article on perfect hashes links to our man Jenkins:

http://burtleburtle.net/bob/hash/perfect.html

which looks pretty promising.

regards, tom lane

--
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers

#13Andres Freund
andres@anarazel.de
In reply to: Tom Lane (#12)
Re: Binary search in fmgr_isbuiltin() is a bottleneck.

On 2017-09-27 14:40:20 -0400, Tom Lane wrote:

Andres Freund <andres@anarazel.de> writes:

On 2017-09-27 13:46:50 -0400, Tom Lane wrote:

The other question that ought to be answered is whether a gperf hash
table would be faster.

Ugh, hacking together a quick input file for gperf, I'm *far* from
convinced. The generated code does multiple lookups in significantly
sized arrays, and assumes string input. The latter seems like a complete
dealbreaker, and there doesn't seem to be an option to turn it off.

Ugh. I'd never actually used gperf, and now I know why not ;-)

Heh.

However, that's just the first tool that came to mind. Wikipedia's
article on perfect hashes links to our man Jenkins:

http://burtleburtle.net/bob/hash/perfect.html

which looks pretty promising.

OTOH, that'd pretty much mean we'd have to include this code in our tree
- we can't really expect everyone adding a function to download &
compile this manually. Honestly before going there I'd rather just have
an oid indexed array, computed at compile time.

I've the slight feeling of forgoing the good for the perfect here...

Greetings,

Andres Freund

--
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers

#14Tom Lane
tgl@sss.pgh.pa.us
In reply to: Andres Freund (#13)
Re: Binary search in fmgr_isbuiltin() is a bottleneck.

Andres Freund <andres@anarazel.de> writes:

Honestly before going there I'd rather just have
an oid indexed array, computed at compile time.

Yeah, I'd been kind of wondering about that approach too. We could have,
say, a table of int16s indexed by OIDs from 0 to 9999, containing zero or
an index into the table of FmgrBuiltin structs. So 20000 bytes of
constant data, and O(negligible) lookup time other than possible cache
misses on this table. But a dynahash-ish hash table built for 2800+
entries would probably be about that size ...

regards, tom lane

--
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers

#15Andres Freund
andres@anarazel.de
In reply to: Tom Lane (#14)
Re: Binary search in fmgr_isbuiltin() is a bottleneck.

On 2017-09-27 14:58:36 -0400, Tom Lane wrote:

Andres Freund <andres@anarazel.de> writes:

Honestly before going there I'd rather just have
an oid indexed array, computed at compile time.

Yeah, I'd been kind of wondering about that approach too. We could have,
say, a table of int16s indexed by OIDs from 0 to 9999, containing zero or
an index into the table of FmgrBuiltin structs. So 20000 bytes of
constant data, and O(negligible) lookup time other than possible cache
misses on this table. But a dynahash-ish hash table built for 2800+
entries would probably be about that size ...

Well dynahash is *way* too slow for this. But that's pretty much what
the simplehash approach is already doing, anyway. Right now I think the
correct approach would be to just add an fmgr_startup() function, called
by postmaster / backend startup if EXEC_BACKEND.

Greetings,

Andres Freund

--
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers

#16Tom Lane
tgl@sss.pgh.pa.us
In reply to: Andres Freund (#15)
Re: Binary search in fmgr_isbuiltin() is a bottleneck.

Andres Freund <andres@anarazel.de> writes:

On 2017-09-27 14:58:36 -0400, Tom Lane wrote:

Yeah, I'd been kind of wondering about that approach too. We could have,
say, a table of int16s indexed by OIDs from 0 to 9999, containing zero or
an index into the table of FmgrBuiltin structs. So 20000 bytes of
constant data, and O(negligible) lookup time other than possible cache
misses on this table. But a dynahash-ish hash table built for 2800+
entries would probably be about that size ...

Well dynahash is *way* too slow for this. But that's pretty much what
the simplehash approach is already doing, anyway. Right now I think the
correct approach would be to just add an fmgr_startup() function, called
by postmaster / backend startup if EXEC_BACKEND.

Yeah, constructing an index table of that sort on top of the existing
FmgrBuiltin array could be done cheaply enough at startup. It irks me
slightly that it's not part of the read-only text segment, but I can't
say that there's any really measurable impact.

regards, tom lane

--
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers

#17Andres Freund
andres@anarazel.de
In reply to: Tom Lane (#16)
Re: Binary search in fmgr_isbuiltin() is a bottleneck.

Hi,

On 2017-09-27 15:06:15 -0400, Tom Lane wrote:

Yeah, constructing an index table of that sort on top of the existing
FmgrBuiltin array could be done cheaply enough at startup. It irks me
slightly that it's not part of the read-only text segment, but I can't
say that there's any really measurable impact.

I don't think this case is significant enough to make it worthwhile, but
if we'd find one that is, we certainly could add code that builds the
hash's array once in memory, then serializes that into a .h file, which
then is included into the code. I can't immediately see more of these
coming up, but who knows?

Greetings,

Andres Freund

--
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers

#18Robert Haas
robertmhaas@gmail.com
In reply to: Andres Freund (#9)
Re: Binary search in fmgr_isbuiltin() is a bottleneck.

On Wed, Sep 27, 2017 at 1:40 PM, Andres Freund <andres@anarazel.de> wrote:

I don't think you can even measure the overhead of building the
table. This is inserting ~8k rows in an accurately sized hashtable - a
vanishingly small amount of time in comparison to the backend startup
time (and even more so postmaster startup). My measurement shows it
takes about 0.4 ms to build (gdb in, query, reset oid2builtins = 0,
query - repeat a couple times).

0.4ms isn't negligible as a fraction of backend startup time, is it?
I think backend startup time is a few milliseconds.

$ echo '\set x 1' > x.txt
$ pgbench -n -C -c 1 -f x.txt -T 10
transaction type: x.txt
scaling factor: 1
query mode: simple
number of clients: 1
number of threads: 1
duration: 10 s
number of transactions actually processed: 5091
latency average = 1.965 ms
tps = 508.866931 (including connections establishing)
tps = 12909.303693 (excluding connections establishing)

--
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company

--
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers

#19Andres Freund
andres@anarazel.de
In reply to: Robert Haas (#18)
Re: Binary search in fmgr_isbuiltin() is a bottleneck.

On 2017-09-27 15:30:45 -0400, Robert Haas wrote:

On Wed, Sep 27, 2017 at 1:40 PM, Andres Freund <andres@anarazel.de> wrote:

I don't think you can even measure the overhead of building the
table. This is inserting ~8k rows in an accurately sized hashtable - a
vanishingly small amount of time in comparison to the backend startup
time (and even more so postmaster startup). My measurement shows it
takes about 0.4 ms to build (gdb in, query, reset oid2builtins = 0,
query - repeat a couple times).

0.4ms isn't negligible as a fraction of backend startup time, is it?

Well, on linux you'd only have this on postmaster startup.

I think backend startup time is a few milliseconds.

$ echo '\set x 1' > x.txt
$ pgbench -n -C -c 1 -f x.txt -T 10
transaction type: x.txt
scaling factor: 1
query mode: simple
number of clients: 1
number of threads: 1
duration: 10 s
number of transactions actually processed: 5091
latency average = 1.965 ms
tps = 508.866931 (including connections establishing)
tps = 12909.303693 (excluding connections establishing)

I had tried this with an actual simplistic query, and the difference was
either nonexistant, or below in the noise. I didn't do a pgbench run
that doesn't actually do anything in the backend - doesn't seem like a
meaningful thing to measure?

Greetings,

Andres Freund

--
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers

#20Tom Lane
tgl@sss.pgh.pa.us
In reply to: Andres Freund (#17)
Re: Binary search in fmgr_isbuiltin() is a bottleneck.

Andres Freund <andres@anarazel.de> writes:

On 2017-09-27 15:06:15 -0400, Tom Lane wrote:

Yeah, constructing an index table of that sort on top of the existing
FmgrBuiltin array could be done cheaply enough at startup. It irks me
slightly that it's not part of the read-only text segment, but I can't
say that there's any really measurable impact.

I don't think this case is significant enough to make it worthwhile, but
if we'd find one that is, we certainly could add code that builds the
hash's array once in memory, then serializes that into a .h file, which
then is included into the code. I can't immediately see more of these
coming up, but who knows?

Actually ... a more defensible reason for having a precomputed constant
table is that it removes any question about where is a safe place in the
initialization sequence to inject "fmgr_startup". That would clearly
have to go before anything that could conceivably try to call a SQL
function. On the other hand, it has to go after elog.c setup (in case
you get e.g. a malloc failure), which means you've now created a reason
why it will never be safe for elog.c to include any fmgr calls. Maybe
that's unsafe anyway, but I'd just as soon not introduce constraints of
that kind just because we're too lazy to do this optimization properly.
There definitely are places in startup that assume they can call
built-in functions (relying on fmgr_isbuiltin) long before most of the
transactional infrastructure is up.

ISTM it shouldn't be that hard to get Gen_fmgrtab.pl to emit an index
array of the sort we're talking about, along with the FmgrBuiltin array
it already prints out. I'm the world's worst Perl programmer but
I'm happy to take a stab at it if you don't want to.

regards, tom lane

--
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers

#21tushar
tushar.ahuja@enterprisedb.com
In reply to: tushar (#3)
#22Tom Lane
tgl@sss.pgh.pa.us
In reply to: Tom Lane (#20)
#23Andres Freund
andres@anarazel.de
In reply to: Tom Lane (#20)
#24Tom Lane
tgl@sss.pgh.pa.us
In reply to: Andres Freund (#23)
#25Andres Freund
andres@anarazel.de
In reply to: Tom Lane (#24)
#26Tom Lane
tgl@sss.pgh.pa.us
In reply to: Andres Freund (#25)
#27Andres Freund
andres@anarazel.de
In reply to: Tom Lane (#26)
#28Tom Lane
tgl@sss.pgh.pa.us
In reply to: Andres Freund (#27)
#29Andres Freund
andres@anarazel.de
In reply to: Tom Lane (#28)
#30Andres Freund
andres@anarazel.de
In reply to: Andres Freund (#29)
#31Heikki Linnakangas
heikki.linnakangas@enterprisedb.com
In reply to: Andres Freund (#30)
#32Andres Freund
andres@anarazel.de
In reply to: Heikki Linnakangas (#31)
#33Tom Lane
tgl@sss.pgh.pa.us
In reply to: Andres Freund (#32)