reducing the footprint of ScanKeyword (was Re: Large writable variables)

Started by John Naylor · over 7 years ago · 99 messages · pgsql-hackers
#1 John Naylor
john.naylor@enterprisedb.com

On 10/15/18, Tom Lane <tgl@sss.pgh.pa.us> wrote:

Andres Freund <andres@anarazel.de> writes:

On 2018-10-15 16:36:26 -0400, Tom Lane wrote:

We could possibly fix these by changing the data structure so that
what's in a ScanKeywords entry is an offset into some giant string
constant somewhere. No idea how that would affect performance, but
I do notice that we could reduce the sizeof(ScanKeyword), which can't
hurt.

Yea, that might even help performancewise. Alternatively we could change
ScanKeyword to store the keyword name inline, but that'd be a measurable
size increase...

Yeah. It also seems like doing it this way would improve locality of
access: the pieces of the giant string would presumably be in the same
order as the ScanKeywords entries, whereas with the current setup,
who knows where the compiler has put 'em or in what order.

We'd need some tooling to generate the constants that way, though;
I can't see how to make it directly from kwlist.h.

A few months ago I was looking into faster search algorithms for
ScanKeywordLookup(), so this is interesting to me. While an optimal
full replacement would be a lot of work, the above ideas are much less
invasive and would still have some benefit. Unless anyone intends to
work on this, I'd like to flesh out the offset-into-giant-string
approach a bit further:

Since there are several callers of the current approach that don't use
the core keyword list, we'd have to keep the existing struct and
lookup function, to keep the complexity manageable. Once we have an
offset-based struct and function, it makes sense to use it for all
searches of core keywords. This includes not only the core scanner,
but also adt/ruleutils.c, fe_utils/string_utils.c, and
ecpg/preproc/keywords.c.

There would need to be a header with offsets replacing name strings,
generated from parser/kwlist.h, maybe kwlist_offset.h. It'd probably
be convenient if it was emitted into the common/ dir. The giant string
would likely need its own header (kwlist_string.h?).

Since PL/pgSQL uses the core scanner, we'd need to use offsets in its
reserved_keywords[], too. Those don't change much, so we can probably
get away with hard-coding the offsets and the giant string in that
case. (If that's not acceptable, we could separate that out to
pl_reserved_kwlist.h and reuse the above tooling to generate
pl_reserved_kwlist_{offset,string}.h, but that's more complex.)

The rest should be just a SMOP. Any issues I left out?
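[Editor's note: a minimal sketch of the offset-into-giant-string lookup being proposed here. The keyword set, array names, and function name are illustrative, not the actual PostgreSQL tables.]

```c
#include <string.h>

/*
 * Each table entry is an offset into one shared, NUL-separated string
 * instead of a char pointer, shrinking the per-keyword footprint and
 * keeping the names adjacent in memory.  Keywords must be sorted so
 * binary search works.
 */
static const char kw_string[] =
	"abort\0"
	"absolute\0"
	"access";

static const unsigned short kw_offsets[] = {0, 6, 15};

#define NUM_KEYWORDS 3

/* Binary search over the offset table; returns index, or -1 on a miss. */
static int
keyword_lookup(const char *text)
{
	int			low = 0,
				high = NUM_KEYWORDS - 1;

	while (low <= high)
	{
		int			mid = (low + high) / 2;
		int			cmp = strcmp(text, kw_string + kw_offsets[mid]);

		if (cmp == 0)
			return mid;
		if (cmp < 0)
			high = mid - 1;
		else
			low = mid + 1;
	}
	return -1;
}
```

In the real patch the string and offset array would be emitted by a generator script from kwlist.h rather than written by hand.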

-John Naylor

#2 Tom Lane
tgl@sss.pgh.pa.us
In reply to: John Naylor (#1)
Re: reducing the footprint of ScanKeyword (was Re: Large writable variables)

John Naylor <jcnaylor@gmail.com> writes:

A few months ago I was looking into faster search algorithms for
ScanKeywordLookup(), so this is interesting to me. While an optimal
full replacement would be a lot of work, the above ideas are much less
invasive and would still have some benefit. Unless anyone intends to
work on this, I'd like to flesh out the offset-into-giant-string
approach a bit further:

Have at it...

Since PL/pgSQL uses the core scanner, we'd need to use offsets in its
reserved_keywords[], too. Those don't change much, so we can probably
get away with hard-coding the offsets and the giant string in that
case. (If that's not acceptable, we could separate that out to
pl_reserved_kwlist.h and reuse the above tooling to generate
pl_reserved_kwlist_{offset,string}.h, but that's more complex.)

plpgsql isn't as stable as all that: people propose new syntax for it
all the time. I do not think a hand-maintained array would be pleasant
at all.

Also, wouldn't we also adopt this technology for its unreserved keywords,
too?

regards, tom lane

#3 John Naylor
john.naylor@enterprisedb.com
In reply to: Tom Lane (#2)
Re: reducing the footprint of ScanKeyword (was Re: Large writable variables)

On 12/17/18, Tom Lane <tgl@sss.pgh.pa.us> wrote:

John Naylor <jcnaylor@gmail.com> writes:

Since PL/pgSQL uses the core scanner, we'd need to use offsets in its
reserved_keywords[], too. Those don't change much, so we can probably
get away with hard-coding the offsets and the giant string in that
case. (If that's not acceptable, we could separate that out to
pl_reserved_kwlist.h and reuse the above tooling to generate
pl_reserved_kwlist_{offset,string}.h, but that's more complex.)

plpgsql isn't as stable as all that: people propose new syntax for it
all the time. I do not think a hand-maintained array would be pleasant
at all.

Okay.

Also, wouldn't we also adopt this technology for its unreserved keywords,
too?

We wouldn't be forced to, but there might be other reasons to do so.
Were you thinking of code consistency (within pl_scanner.c or
globally)? Or something else?

If we did adopt this setup for plpgsql unreserved keywords,
ecpg/preproc/ecpg_keywords.c and ecpg/preproc/c_keywords.c would be
left using the current ScanKeyword struct for search. Using offset
search for all 5 types of keywords would be globally consistent, but
it also means additional headers, generated headers, and makefile
rules.

-John Naylor

#4 Tom Lane
tgl@sss.pgh.pa.us
In reply to: John Naylor (#3)
Re: reducing the footprint of ScanKeyword (was Re: Large writable variables)

John Naylor <jcnaylor@gmail.com> writes:

On 12/17/18, Tom Lane <tgl@sss.pgh.pa.us> wrote:

Also, wouldn't we also adopt this technology for its unreserved keywords,
too?

We wouldn't be forced to, but there might be other reasons to do so.
Were you thinking of code consistency (within pl_scanner.c or
globally)? Or something else?

If we did adopt this setup for plpgsql unreserved keywords,
ecpg/preproc/ecpg_keywords.c and ecpg/preproc/c_keywords.c would be
left using the current ScanKeyword struct for search. Using offset
search for all 5 types of keywords would be globally consistent, but
it also means additional headers, generated headers, and makefile
rules.

I'd be kind of inclined to convert all uses of ScanKeyword to the new way,
if only for consistency's sake. On the other hand, I'm not the one
volunteering to do the work.

regards, tom lane

#5 John Naylor
john.naylor@enterprisedb.com
In reply to: Tom Lane (#4)
Re: reducing the footprint of ScanKeyword (was Re: Large writable variables)

On 12/18/18, Tom Lane <tgl@sss.pgh.pa.us> wrote:

I'd be kind of inclined to convert all uses of ScanKeyword to the new way,
if only for consistency's sake. On the other hand, I'm not the one
volunteering to do the work.

That's reasonable, as long as the design is nailed down first. Along
those lines, attached is a heavily WIP patch that only touches plpgsql
unreserved keywords, to test out the new methodology in a limited
area. After settling APIs and name/directory bikeshedding, I'll move
on to the other four keyword types.

There's a new Perl script, src/common/gen_keywords.pl, which takes
pl_unreserved_kwlist.h as input and outputs
pl_unreserved_kwlist_offset.h and pl_unreserved_kwlist_string.h. The
output headers are not installed or symlinked anywhere. Since the
input keyword lists will never be #included directly, they might be
better as .txt files, like errcodes.txt. If we went that far, we might
also remove the PG_KEYWORD macros (they'd still be in the output
files) and rename parser/kwlist.h to common/core_kwlist.txt. There's
also a case for not changing things unnecessarily, especially if
there's ever a new reason to include the base keyword list directly.

To keep the other keyword types functional, I had to add a separate
new struct ScanKeywordOffset and new function
ScanKeywordLookupOffset(), so the patch is a bit messier than the
final will be. With a 4-byte offset, ScanKeywordOffset is 8 bytes,
down from 12, and is now a power of 2.

I used the global .gitignore, but maybe that's an abuse of it.

Make check passes, but I don't know how well it stresses keyword use.
I'll create a commitfest entry soon.

-John Naylor

Attachments:

wip-01-use-offset-based-scankeywords.patch (text/x-patch, +355 −95)

#6 Andrew Gierth
andrew@tao11.riddles.org.uk
In reply to: John Naylor (#5)
Re: reducing the footprint of ScanKeyword (was Re: Large writable variables)

"John" == John Naylor <jcnaylor@gmail.com> writes:

On 12/18/18, Tom Lane <tgl@sss.pgh.pa.us> wrote:

I'd be kind of inclined to convert all uses of ScanKeyword to the
new way, if only for consistency's sake. On the other hand, I'm not
the one volunteering to do the work.

John> That's reasonable, as long as the design is nailed down first.
John> Along those lines, attached is a heavily WIP patch that only
John> touches plpgsql unreserved keywords, to test out the new
John> methodology in a limited area. After settling APIs and
John> name/directory bikeshedding, I'll move on to the other four
John> keyword types.

Is there any particular reason not to go further and use a perfect hash
function for the lookup, rather than binary search?

--
Andrew (irc:RhodiumToad)

#7 Andres Freund
andres@anarazel.de
In reply to: Andrew Gierth (#6)
Re: reducing the footprint of ScanKeyword (was Re: Large writable variables)

Hi,

On 2018-12-20 00:54:39 +0000, Andrew Gierth wrote:

"John" == John Naylor <jcnaylor@gmail.com> writes:

On 12/18/18, Tom Lane <tgl@sss.pgh.pa.us> wrote:

I'd be kind of inclined to convert all uses of ScanKeyword to the
new way, if only for consistency's sake. On the other hand, I'm not
the one volunteering to do the work.

John> That's reasonable, as long as the design is nailed down first.
John> Along those lines, attached is a heavily WIP patch that only
John> touches plpgsql unreserved keywords, to test out the new
John> methodology in a limited area. After settling APIs and
John> name/directory bikeshedding, I'll move on to the other four
John> keyword types.

Is there any particular reason not to go further and use a perfect hash
function for the lookup, rather than binary search?

The last time I looked into perfect hash functions, it wasn't easy to
find a generator that competed with a decent normal hashtable (in
particular gperf's are very unconvincing). The added tooling is a
concern imo. OTOH, we're comparing not with a hashtable, but a binary
search, where the latter will usually lose. Wonder if we shouldn't
generate a serialized non-perfect hashtable instead. The lookup code for
a read-only hashtable without concern for adversarial input is pretty
trivial.
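[Editor's note: a self-contained sketch of the read-only hashtable idea Andres describes, using open addressing with linear probing. The hash function, table size, and keyword set are illustrative; in the proposal the table would be precomputed by the build script and serialized as a constant initializer, whereas here it is filled at runtime to keep the example runnable.]

```c
#include <stddef.h>
#include <string.h>

#define TABLE_SIZE 8			/* power of two, roughly 2x the key count */

static const char *kw_table[TABLE_SIZE];	/* NULL = empty slot */

/* Simple multiplicative string hash (illustrative, not PostgreSQL's). */
static unsigned
hash_string(const char *s)
{
	unsigned	h = 0;

	while (*s)
		h = h * 31 + (unsigned char) *s++;
	return h;
}

/* Stand-in for the build-time step that would emit the table as a constant. */
static void
build_table(const char *const *keywords, int n)
{
	for (int i = 0; i < n; i++)
	{
		unsigned	slot = hash_string(keywords[i]) & (TABLE_SIZE - 1);

		while (kw_table[slot] != NULL)
			slot = (slot + 1) & (TABLE_SIZE - 1);	/* linear probe */
		kw_table[slot] = keywords[i];
	}
}

/* The trivial read-only lookup: probe until a match or an empty slot. */
static int
table_contains(const char *text)
{
	unsigned	slot = hash_string(text) & (TABLE_SIZE - 1);

	while (kw_table[slot] != NULL)
	{
		if (strcmp(kw_table[slot], text) == 0)
			return 1;
		slot = (slot + 1) & (TABLE_SIZE - 1);
	}
	return 0;
}
```

Since the table is never modified after generation and the input is not adversarial, no deletion or resize logic is needed, which is what keeps the lookup code so small.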

Greetings,

Andres Freund

#8 Tom Lane
tgl@sss.pgh.pa.us
In reply to: Andrew Gierth (#6)
Re: reducing the footprint of ScanKeyword (was Re: Large writable variables)

Andrew Gierth <andrew@tao11.riddles.org.uk> writes:

Is there any particular reason not to go further and use a perfect hash
function for the lookup, rather than binary search?

Tooling? I seem to recall having looked at gperf and deciding that it
pretty much sucked, so it's not real clear to me what we would use.

regards, tom lane

#9 John Naylor
john.naylor@enterprisedb.com
In reply to: Andrew Gierth (#6)
Re: reducing the footprint of ScanKeyword (was Re: Large writable variables)

On 12/19/18, Andrew Gierth <andrew@tao11.riddles.org.uk> wrote:

Is there any particular reason not to go further and use a perfect hash
function for the lookup, rather than binary search?

When I was investigating faster algorithms, I ruled out gperf based on
discussions in the archives. The approach here has modest goals and
shouldn't be too invasive.

With the makefile support and separate keyword files in place, that'll
be one less thing to do if we ever decide to replace binary search.
The giant string will likely be useful as well.

Since we're on the subject, I think some kind of trie would be ideal
performance-wise, but a large amount of work. The nice thing about a
trie is that it can be faster than a hash table for a key miss. I
found a paper that described some space-efficient trie variations [1],
but we'd likely have to code the algorithm and a way to emit a C code
representation of it. I've found some libraries, but those would have
more of the same practical difficulties that gperf had.

[1]: https://infoscience.epfl.ch/record/64394/files/triesearches.pdf

-John Naylor

#10 Tom Lane
tgl@sss.pgh.pa.us
In reply to: John Naylor (#5)
Re: reducing the footprint of ScanKeyword (was Re: Large writable variables)

John Naylor <jcnaylor@gmail.com> writes:

On 12/18/18, Tom Lane <tgl@sss.pgh.pa.us> wrote:

I'd be kind of inclined to convert all uses of ScanKeyword to the new way,
if only for consistency's sake. On the other hand, I'm not the one
volunteering to do the work.

That's reasonable, as long as the design is nailed down first. Along
those lines, attached is a heavily WIP patch that only touches plpgsql
unreserved keywords, to test out the new methodology in a limited
area. After settling APIs and name/directory bikeshedding, I'll move
on to the other four keyword types.

Let the bikeshedding begin ...

There's a new Perl script, src/common/gen_keywords.pl,

I'd be inclined to put the script in src/tools, I think. IMO src/common
is for code that actually gets built into our executables.

which takes
pl_unreserved_kwlist.h as input and outputs
pl_unreserved_kwlist_offset.h and pl_unreserved_kwlist_string.h.

I wonder whether we'd not be better off producing just one output
file, in which we have the offsets emitted as PG_KEYWORD macros
and then the giant string emitted as a macro definition, ie
something like

#define PG_KEYWORD_STRING \
	"absolute\0" \
	"alias\0" \
	...

That simplifies the Makefile-hacking, at least, and it possibly gives
callers more flexibility about what they actually want to do with the
string.

The
output headers are not installed or symlinked anywhere. Since the
input keyword lists will never be #included directly, they might be
better as .txt files, like errcodes.txt. If we went that far, we might
also remove the PG_KEYWORD macros (they'd still be in the output
files) and rename parser/kwlist.h to common/core_kwlist.txt. There's
also a case for not changing things unnecessarily, especially if
there's ever a new reason to include the base keyword list directly.

I'm for "not change things unnecessarily". People might well be
scraping the keyword list out of parser/kwlist.h for other purposes
right now --- indeed, it's defined the way it is exactly to let
people do that. I don't see a good reason to force them to redo
whatever tooling they have that depends on that. So let's build
kwlist_offsets.h alongside that, but not change kwlist.h itself.

To keep the other keyword types functional, I had to add a separate
new struct ScanKeywordOffset and new function
ScanKeywordLookupOffset(), so the patch is a bit messier than the
final will be.

Check.

I used the global .gitignore, but maybe that's an abuse of it.

Yeah, I'd say it is.

+# TODO: Error out if the keyword names are not in ASCII order.

+many for including such a check.

Also note that we don't require people to have Perl installed when
building from a tarball. Therefore, these derived headers must get
built during "make distprep" and removed by maintainer-clean but
not distclean. I think this also has some implications for VPATH
builds, but as long as you follow the pattern used for other
derived header files (e.g. fmgroids.h), you should be fine.

regards, tom lane

#11 John Naylor
john.naylor@enterprisedb.com
In reply to: Tom Lane (#10)
Re: reducing the footprint of ScanKeyword (was Re: Large writable variables)

On 12/20/18, Tom Lane <tgl@sss.pgh.pa.us> wrote:

I'd be inclined to put the script in src/tools, I think. IMO src/common
is for code that actually gets built into our executables.

Done.

which takes
pl_unreserved_kwlist.h as input and outputs
pl_unreserved_kwlist_offset.h and pl_unreserved_kwlist_string.h.

I wonder whether we'd not be better off producing just one output
file, in which we have the offsets emitted as PG_KEYWORD macros
and then the giant string emitted as a macro definition, ie
something like

#define PG_KEYWORD_STRING \
	"absolute\0" \
	"alias\0" \
	...

That simplifies the Makefile-hacking, at least, and it possibly gives
callers more flexibility about what they actually want to do with the
string.

Okay, I tried that. Since the script is turning one header into
another, I borrowed the "*_d.h" nomenclature from the catalogs. Using
a single file required some #ifdef hacks in the output file. Maybe
there's a cleaner way to do this, but I don't know what it is.

Using a single file also gave me another idea: Take value and category
out of ScanKeyword, and replace them with an index into another array
containing those, which will only be accessed in the event of a hit.
That would shrink ScanKeyword to 4 bytes (offset, index), further
increasing locality of reference. Might not be worth it, but I can try
it after moving on to the core scanner.

I'm for "not change things unnecessarily". People might well be
scraping the keyword list out of parser/kwlist.h for other purposes
right now --- indeed, it's defined the way it is exactly to let
people do that. I don't see a good reason to force them to redo
whatever tooling they have that depends on that. So let's build
kwlist_offsets.h alongside that, but not change kwlist.h itself.

Done.

I used the global .gitignore, but maybe that's an abuse of it.

Yeah, I'd say it is.

Moved.

+# TODO: Error out if the keyword names are not in ASCII order.

+many for including such a check.

Done.

Also note that we don't require people to have Perl installed when
building from a tarball. Therefore, these derived headers must get
built during "make distprep" and removed by maintainer-clean but
not distclean. I think this also has some implications for VPATH
builds, but as long as you follow the pattern used for other
derived header files (e.g. fmgroids.h), you should be fine.

Done. I also blindly added support for MSVC.

-John Naylor

Attachments:

v2-wip-use-offset-based-scankeywords.patch (text/x-patch, +356 −95)

#12 Tom Lane
tgl@sss.pgh.pa.us
In reply to: John Naylor (#11)
Re: reducing the footprint of ScanKeyword (was Re: Large writable variables)

John Naylor <jcnaylor@gmail.com> writes:

Using a single file also gave me another idea: Take value and category
out of ScanKeyword, and replace them with an index into another array
containing those, which will only be accessed in the event of a hit.
That would shrink ScanKeyword to 4 bytes (offset, index), further
increasing locality of reference. Might not be worth it, but I can try
it after moving on to the core scanner.

I like that idea a *lot*, actually, because it offers the opportunity
to decouple this mechanism from all assumptions about what the
auxiliary data for a keyword is. Basically, we'd redefine
ScanKeywordLookup as having the API "given a string, return a
keyword index if it is a keyword, -1 if it isn't"; then the caller
would use the keyword index to look up the auxiliary data in a table
that it owns, and ScanKeywordLookup doesn't know about at all.

So that leads to a design like this: the master data is in a header
that's just like kwlist.h is today, except now we are thinking of
PG_KEYWORD as an N-argument macro not necessarily exactly 3 arguments.
The Perl script reads that, paying attention only to the first argument
of the macro calls, and outputs a file containing, say,

static const uint16 kw_offsets[] = { 0, 6, 15, ... };

static const char kw_strings[] =
"abort\0"
"absolute\0"
...
;

(it'd be a good idea to have a switch that allows specifying the
prefix of these constant names). Then ScanKeywordLookup has the
signature

int ScanKeywordLookup(const char *string_to_lookup,
                      const char *kw_strings,
                      const uint16 *kw_offsets,
                      int num_keywords);

and a file using this stuff looks something like

/* Payload data for keywords */
typedef struct MyKeyword
{
    int16    value;
    int16    category;
} MyKeyword;

#define PG_KEYWORD(kwname, value, category) {value, category},

static const MyKeyword MyKeywords[] = {
#include "kwlist.h"
};

/* String lookup table for keywords */
#include "kwlist_d.h"

/* Lookup code looks about like this: */
kwnum = ScanKeywordLookup(str,
                          kw_strings,
                          kw_offsets,
                          lengthof(kw_offsets));
if (kwnum >= 0)
    ... look into MyKeywords[kwnum] for info ...

Aside from being arguably better from the locality-of-reference
standpoint, this gets us out of the weird ifdef'ing you've got in
the v2 patch. The kwlist_d.h headers can be very ordinary headers.

regards, tom lane

#13 Andres Freund
andres@anarazel.de
In reply to: Tom Lane (#12)
Re: reducing the footprint of ScanKeyword (was Re: Large writable variables)

Hi,

On 2018-12-22 12:20:00 -0500, Tom Lane wrote:

John Naylor <jcnaylor@gmail.com> writes:

Using a single file also gave me another idea: Take value and category
out of ScanKeyword, and replace them with an index into another array
containing those, which will only be accessed in the event of a hit.
That would shrink ScanKeyword to 4 bytes (offset, index), further
increasing locality of reference. Might not be worth it, but I can try
it after moving on to the core scanner.

I like that idea a *lot*, actually, because it offers the opportunity
to decouple this mechanism from all assumptions about what the
auxiliary data for a keyword is.

OTOH, it doubles or triples the number of cachelines accessed when
encountering a keyword. The fraction of keywords to not-keywords in SQL
makes me wonder whether that makes it a good deal.

Greetings,

Andres Freund

#14 Tom Lane
tgl@sss.pgh.pa.us
In reply to: Andres Freund (#13)
Re: reducing the footprint of ScanKeyword (was Re: Large writable variables)

Andres Freund <andres@anarazel.de> writes:

On 2018-12-22 12:20:00 -0500, Tom Lane wrote:

I like that idea a *lot*, actually, because it offers the opportunity
to decouple this mechanism from all assumptions about what the
auxiliary data for a keyword is.

OTOH, it doubles or triples the number of cachelines accessed when
encountering a keyword.

Compared to what? The current situation in that regard is a mess.

Also, AFAICS this proposal involves the least amount of data touched
during the lookup phase of anything we've discussed, so I do not even
accept that your criticism is correct. One extra cacheline fetch
to get the aux data for a particular keyword after the search is not
going to tip the scales away from this being a win.

regards, tom lane

#15 John Naylor
john.naylor@enterprisedb.com
In reply to: Tom Lane (#12)
Re: reducing the footprint of ScanKeyword (was Re: Large writable variables)

On 12/22/18, Tom Lane <tgl@sss.pgh.pa.us> wrote:

John Naylor <jcnaylor@gmail.com> writes:

Using a single file also gave me another idea: Take value and category
out of ScanKeyword, and replace them with an index into another array
containing those, which will only be accessed in the event of a hit.
That would shrink ScanKeyword to 4 bytes (offset, index), further
increasing locality of reference. Might not be worth it, but I can try
it after moving on to the core scanner.

I like that idea a *lot*, actually, because it offers the opportunity
to decouple this mechanism from all assumptions about what the
auxiliary data for a keyword is.

Okay, in that case I went ahead and did it for WIP v3.

(it'd be a good idea to have a switch that allows specifying the
prefix of these constant names).

Done as an optional switch, and tested, but not yet used in favor of
the previous method as a fallback. I'll probably do it in the final
version to keep lines below 80, and to add 'core_' to the core keyword
vars.

/* Payload data for keywords */
typedef struct MyKeyword
{
int16 value;
int16 category;
} MyKeyword;

I tweaked this a bit to

typedef struct ScanKeywordAux
{
    int16    value;      /* grammar's token code */
    char     category;   /* see codes above */
} ScanKeywordAux;

It seems that category was only 2 bytes to make ScanKeyword a power of
2 (of course that was on 32 bit machines and doesn't hold true
anymore). Using char will save another few hundred bytes in the core
scanner. Since we're only accessing this once per identifier, we may
not need to worry so much about memory alignment.

Aside from being arguably better from the locality-of-reference
standpoint, this gets us out of the weird ifdef'ing you've got in
the v2 patch. The kwlist_d.h headers can be very ordinary headers.

Yeah, that's a nice (and for me unexpected) bonus.

-John Naylor

Attachments:

v3-wip-use-offset-based-scankeywords.patch (text/x-patch, +361 −103)

#16 Robert Haas
robertmhaas@gmail.com
In reply to: Andres Freund (#7)
Re: reducing the footprint of ScanKeyword (was Re: Large writable variables)

On Wed, Dec 19, 2018 at 8:01 PM Andres Freund <andres@anarazel.de> wrote:

The last time I looked into perfect hash functions, it wasn't easy to
find a generator that competed with a decent normal hashtable (in
particular gperf's are very unconvincing). The added tooling is a
concern imo. OTOH, we're comparing not with a hashtable, but a binary
search, where the latter will usually lose. Wonder if we shouldn't
generate a serialized non-perfect hashtable instead. The lookup code for
a read-only hashtable without concern for adversarial input is pretty
trivial.

I wonder if we could do something really simple like a lookup based on
the first character of the scan keyword. It looks to me like there are
440 keywords right now, and the most common starting letter is 'c',
which is the first letter of 51 keywords. So dispatching based on the
first letter clips at least 3 steps off the binary search. I don't
know whether that's enough to be worthwhile, but it's probably pretty
simple to implement.
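[Editor's note: a sketch of the first-letter dispatch Robert suggests. Because the keyword list is sorted, every starting letter owns a contiguous slice of it, so a small boundary table narrows the binary search to that slice. The six-entry keyword set and both tables are illustrative, not the real 440-entry list; a generator script would emit them.]

```c
#include <string.h>

static const char *const keywords[] = {
	"abort", "absolute", "cascade", "check", "create", "drop"
};

/*
 * first[c] .. first[c + 1] - 1 is the slice of entries whose keyword
 * starts with letter 'a' + c; first[26] is the total count.
 */
static const int first[27] = {
	0, 2, 2, 5, 6, 6, 6, 6, 6, 6, 6, 6, 6, 6,
	6, 6, 6, 6, 6, 6, 6, 6, 6, 6, 6, 6, 6
};

/* Dispatch on the first letter, then binary-search the slice. */
static int
keyword_lookup(const char *text)
{
	int			low,
				high;

	if (text[0] < 'a' || text[0] > 'z')
		return -1;
	low = first[text[0] - 'a'];
	high = first[text[0] - 'a' + 1] - 1;

	while (low <= high)
	{
		int			mid = (low + high) / 2;
		int			cmp = strcmp(text, keywords[mid]);

		if (cmp == 0)
			return mid;
		if (cmp < 0)
			high = mid - 1;
		else
			low = mid + 1;
	}
	return -1;
}
```

With 51 keywords under the busiest letter, the worst-case search drops from 9 comparisons to about 6, at the cost of one extra 27-entry table.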

I'm not sure that I understand quite what you have in mind for a
serialized non-perfect hashtable. Are you thinking that we'd just
construct a simplehash and serialize it?

--
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company

#17 Tom Lane
tgl@sss.pgh.pa.us
In reply to: Robert Haas (#16)
Re: reducing the footprint of ScanKeyword (was Re: Large writable variables)

Robert Haas <robertmhaas@gmail.com> writes:

On Wed, Dec 19, 2018 at 8:01 PM Andres Freund <andres@anarazel.de> wrote:

The last time I looked into perfect hash functions, it wasn't easy to
find a generator that competed with a decent normal hashtable (in
particular gperf's are very unconvincing). The added tooling is a
concern imo. OTOH, we're comparing not with a hashtable, but a binary
search, where the latter will usually lose. Wonder if we shouldn't
generate a serialized non-perfect hashtable instead. The lookup code for
a read-only hashtable without concern for adversarial input is pretty
trivial.

I wonder if we could do something really simple like a lookup based on
the first character of the scan keyword. It looks to me like there are
440 keywords right now, and the most common starting letter is 'c',
which is the first letter of 51 keywords. So dispatching based on the
first letter clips at least 3 steps off the binary search. I don't
know whether that's enough to be worthwhile, but it's probably pretty
simple to implement.

I think there's a lot of goalpost-moving going on here. The original
idea was to trim the physical size of the data structure, as stated
in the thread subject, and just reap whatever cache benefits we got
along the way from that. I am dubious that we actually have any
performance problem in this code that needs a big dollop of added
complexity to fix.

In my hands, the only part of the low-level parsing code that commonly
shows up as interesting in profiles is the Bison engine. That's probably
because the grammar tables are circa half a megabyte and blow out cache
pretty badly :-(. I don't know of any way to make that better,
unfortunately. I suspect that it's just going to get worse, because
people keep submitting additions to the grammar.

regards, tom lane

#18 David Fetter
david@fetter.org
In reply to: Tom Lane (#17)
Re: reducing the footprint of ScanKeyword (was Re: Large writable variables)

On Wed, Dec 26, 2018 at 11:22:39AM -0500, Tom Lane wrote:

In my hands, the only part of the low-level parsing code that
commonly shows up as interesting in profiles is the Bison engine.

Should we be considering others? As I understand it, steps have been
made in this field since yacc was originally designed. Is LALR
actually suitable for languages like SQL, or is it just there for
historical reasons?

Best,
David.
--
David Fetter <david(at)fetter(dot)org> http://fetter.org/
Phone: +1 415 235 3778

Remember to vote!
Consider donating to Postgres: http://www.postgresql.org/about/donate

#19 Robert Haas
robertmhaas@gmail.com
In reply to: Tom Lane (#17)
Re: reducing the footprint of ScanKeyword (was Re: Large writable variables)

On Wed, Dec 26, 2018 at 11:22 AM Tom Lane <tgl@sss.pgh.pa.us> wrote:

I think there's a lot of goalpost-moving going on here. The original
idea was to trim the physical size of the data structure, as stated
in the thread subject, and just reap whatever cache benefits we got
along the way from that. I am dubious that we actually have any
performance problem in this code that needs a big dollop of added
complexity to fix.

I have seen ScanKeywordLookup show up in profiles quite often and
fairly prominently -- like several percent of total runtime. I'm not
trying to impose requirements on John's patch, and I agree that
reducing the physical size of the structure is a good step whether
anything else is done or not. However, I don't see that as a reason to
shut down further discussion of other possible improvements. If his
patch makes this disappear from profiles, cool, but if it doesn't,
then sooner or later somebody's going to want to do more.

FWIW, my bet is this helps but isn't enough to get rid of the problem
completely. A 9-step binary search has got to be slower than a really
well-optimized hash table lookup. In a perfect world the latter
touches the cache line containing the keyword -- which presumably is
already in cache since we just scanned it -- then computes a hash
value without touching any other cache lines -- and then goes straight
to the right entry. So it touches ONE new cache line. That might be a
level of optimization that's hard to achieve in practice, but I don't
think it's crazy to want to get there.

In my hands, the only part of the low-level parsing code that commonly
shows up as interesting in profiles is the Bison engine. That's probably
because the grammar tables are circa half a megabyte and blow out cache
pretty badly :-(. I don't know of any way to make that better,
unfortunately. I suspect that it's just going to get worse, because
people keep submitting additions to the grammar.

I'm kinda surprised that you haven't seen ScanKeywordLookup() in
there, but I agree with you that the size of the main parser tables is
a real issue, and that there's no easy solution. At various times
there has been discussion of using some other parser generator, and
I've also toyed with the idea of writing one specifically for
PostgreSQL. Unfortunately, it seems like bison is all but
unmaintained; the alternatives are immature and have limited adoption
and limited community; and writing something from scratch is a ton of
work. :-(

--
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company

#20 Tom Lane
tgl@sss.pgh.pa.us
In reply to: David Fetter (#18)
Re: reducing the footprint of ScanKeyword (was Re: Large writable variables)

David Fetter <david@fetter.org> writes:

On Wed, Dec 26, 2018 at 11:22:39AM -0500, Tom Lane wrote:

In my hands, the only part of the low-level parsing code that
commonly shows up as interesting in profiles is the Bison engine.

Should we be considering others?

We've looked around before, IIRC, and not really seen any arguably
better tools.

regards, tom lane
