automatically generating node support functions

Started by Peter Eisentrautalmost 5 years ago89 messageshackers
Jump to latest
#1Peter Eisentraut
peter_e@gmx.net

I wrote a script to automatically generate the node support functions
(copy, equal, out, and read, as well as the node tags enum) from the
struct definitions.

The first eight patches are to clean up various inconsistencies to make
parsing or generation easier.

The interesting stuff is in patch 0009.

For each of the four node support files, it creates two include files,
e.g., copyfuncs.inc1.c and copyfuncs.inc2.c to include in the main file.
All the scaffolding of the main file stays in place.

In this patch, I have only ifdef'ed out the code to could be removed,
mainly so that it won't constantly have merge conflicts. Eventually,
that should all be changed to delete the code. When we do that, some
code comments should probably be preserved elsewhere, so that will need
another pass of consideration.

I have tried to mostly make the coverage of the output match what is
currently there. For example, one could do out/read coverage of utility
statement nodes easily with this, but I have manually excluded those for
now. The reason is mainly that it's easier to diff the before and
after, and adding a bunch of stuff like this might require a separate
analysis and review.

Subtyping (TidScan -> Scan) is supported.

For the hard cases, you can just write a manual function and exclude
generating one.

For the not so hard cases, there is a way of annotating struct fields to
get special behaviors. For example, pg_node_attr(equal_ignore) has the
field ignored in equal functions.

There are a couple of additional minor issues mentioned in the script
source. But basically, it all seems to work.

Attachments:

v1-0001-Rename-NodeTag-of-ExprState.patchtext/plain; charset=UTF-8; name=v1-0001-Rename-NodeTag-of-ExprState.patch; x-mac-creator=0; x-mac-type=0Download+3-4
v1-0002-Rename-argument-of-_outValue.patchtext/plain; charset=UTF-8; name=v1-0002-Rename-argument-of-_outValue.patch; x-mac-creator=0; x-mac-type=0Download+8-9
v1-0003-Rename-some-node-support-functions-for-consistenc.patchtext/plain; charset=UTF-8; name=v1-0003-Rename-some-node-support-functions-for-consistenc.patch; x-mac-creator=0; x-mac-type=0Download+20-21
v1-0004-Change-SeqScan-node-to-contain-Scan-node.patchtext/plain; charset=UTF-8; name=v1-0004-Change-SeqScan-node-to-contain-Scan-node.patch; x-mac-creator=0; x-mac-type=0Download+15-13
v1-0005-Change-NestPath-node-to-contain-JoinPath-node.patchtext/plain; charset=UTF-8; name=v1-0005-Change-NestPath-node-to-contain-JoinPath-node.patch; x-mac-creator=0; x-mac-type=0Download+51-46
v1-0006-Add-missing-enum-tags-in-enums-used-in-nodes.patchtext/plain; charset=UTF-8; name=v1-0006-Add-missing-enum-tags-in-enums-used-in-nodes.patch; x-mac-creator=0; x-mac-type=0Download+4-5
v1-0007-Check-the-size-in-COPY_POINTER_FIELD.patchtext/plain; charset=UTF-8; name=v1-0007-Check-the-size-in-COPY_POINTER_FIELD.patch; x-mac-creator=0; x-mac-type=0Download+21-34
v1-0008-Remove-T_MemoryContext.patchtext/plain; charset=UTF-8; name=v1-0008-Remove-T_MemoryContext.patch; x-mac-creator=0; x-mac-type=0Download+0-2
v1-0009-Add-script-to-generate-node-support-functions.patchtext/plain; charset=UTF-8; name=v1-0009-Add-script-to-generate-node-support-functions.patch; x-mac-creator=0; x-mac-type=0Download+857-75
v1-0010-XXX-Debugging-support.patchtext/plain; charset=UTF-8; name=v1-0010-XXX-Debugging-support.patch; x-mac-creator=0; x-mac-type=0Download+2-3
#2David Rowley
dgrowleyml@gmail.com
In reply to: Peter Eisentraut (#1)
Re: automatically generating node support functions

On Tue, 8 Jun 2021 at 08:28, Peter Eisentraut
<peter.eisentraut@enterprisedb.com> wrote:

I wrote a script to automatically generate the node support functions
(copy, equal, out, and read, as well as the node tags enum) from the
struct definitions.

Thanks for working on this. I agree that it would be nice to see
improvements in this area.

It's almost 2 years ago now, but I'm wondering if you saw what Andres
proposed in [1]/messages/by-id/20190828234136.fk2ndqtld3onfrrp@alap3.anarazel.de? The idea was basically to make a metadata array of
the node structs so that, instead of having to output large amounts of
.c code to do read/write/copy/equals, instead just have small
functions that loop over the elements in the array for the given
struct and perform the required operation based on the type.

There were still quite a lot of unsolved problems, for example, how to
determine the length of arrays so that we know how many bytes to
compare in equal funcs. I had a quick look at what you've got and
see you've got a solution for that by looking at the last "int" field
before the array and using that. (I wonder if you'd be better to use
something more along the lines of your pg_node_attr() for that?)

There's quite a few advantages having the metadata array rather than
the current approach:

1. We don't need to compile 4 huge .c files and link them into the
postgres binary. I imagine this will make the binary a decent amount
smaller.
2. We can easily add more operations on nodes. e.g serialize nodes
for sending plans to parallel workers. or generating a hash value so
we can store node types in a hash table.

One disadvantage would be what Andres mentioned in [2]/messages/by-id/20190920051857.2fhnvhvx4qdddviz@alap3.anarazel.de. He found
around a 5% performance regression. However, looking at the
NodeTypeComponents struct in [1]/messages/by-id/20190828234136.fk2ndqtld3onfrrp@alap3.anarazel.de, we might be able to speed it up
further by shrinking that struct down a bit and just storing an uint16
position into a giant char array which contains all of the field
names. I imagine they wouldn't take more than 64k. fieldtype could see
a similar change. That would take the NodeTypeComponents struct from
26 bytes down to 14 bytes, which means about double the number of
field metadata we could fit on a cache line.

Do you have any thoughts about that approach instead?

David

[1]: /messages/by-id/20190828234136.fk2ndqtld3onfrrp@alap3.anarazel.de
[2]: /messages/by-id/20190920051857.2fhnvhvx4qdddviz@alap3.anarazel.de

#3Peter Eisentraut
peter_e@gmx.net
In reply to: David Rowley (#2)
Re: automatically generating node support functions

On 08.06.21 15:40, David Rowley wrote:

It's almost 2 years ago now, but I'm wondering if you saw what Andres
proposed in [1]? The idea was basically to make a metadata array of
the node structs so that, instead of having to output large amounts of
.c code to do read/write/copy/equals, instead just have small
functions that loop over the elements in the array for the given
struct and perform the required operation based on the type.

That project was technologically impressive, but it seemed to have
significant hurdles to overcome before it can be useful. My proposal is
usable and useful today. And it doesn't prevent anyone from working on
a more sophisticated solution.

There were still quite a lot of unsolved problems, for example, how to
determine the length of arrays so that we know how many bytes to
compare in equal funcs. I had a quick look at what you've got and
see you've got a solution for that by looking at the last "int" field
before the array and using that. (I wonder if you'd be better to use
something more along the lines of your pg_node_attr() for that?)

I considered that, but since the convention seemed to work everywhere, I
left it. But it wouldn't be hard to change.

#4Andres Freund
andres@anarazel.de
In reply to: Peter Eisentraut (#3)
Re: automatically generating node support functions

Hi,

On 2021-06-08 19:45:58 +0200, Peter Eisentraut wrote:

On 08.06.21 15:40, David Rowley wrote:

It's almost 2 years ago now, but I'm wondering if you saw what Andres
proposed in [1]? The idea was basically to make a metadata array of
the node structs so that, instead of having to output large amounts of
.c code to do read/write/copy/equals, instead just have small
functions that loop over the elements in the array for the given
struct and perform the required operation based on the type.

That project was technologically impressive, but it seemed to have
significant hurdles to overcome before it can be useful. My proposal is
usable and useful today. And it doesn't prevent anyone from working on a
more sophisticated solution.

I think it's short-sighted to further and further go down the path of
parsing "kind of C" without just using a proper C parser. But leaving
that aside, a big part of the promise of the approach in that thread
isn't actually tied to the specific way the type information is
collected: The perl script could output something like the "node type
metadata" I generated in that patchset, and then we don't need the large
amount of generated code and can much more economically add additional
operations handling node types.

Greetings,

Andres Freund

#5Tom Lane
tgl@sss.pgh.pa.us
In reply to: Andres Freund (#4)
Re: automatically generating node support functions

Andres Freund <andres@anarazel.de> writes:

On 2021-06-08 19:45:58 +0200, Peter Eisentraut wrote:

On 08.06.21 15:40, David Rowley wrote:

It's almost 2 years ago now, but I'm wondering if you saw what Andres
proposed in [1]?

That project was technologically impressive, but it seemed to have
significant hurdles to overcome before it can be useful. My proposal is
usable and useful today. And it doesn't prevent anyone from working on a
more sophisticated solution.

I think it's short-sighted to further and further go down the path of
parsing "kind of C" without just using a proper C parser. But leaving
that aside, a big part of the promise of the approach in that thread
isn't actually tied to the specific way the type information is
collected: The perl script could output something like the "node type
metadata" I generated in that patchset, and then we don't need the large
amount of generated code and can much more economically add additional
operations handling node types.

I think the main reason that the previous patch went nowhere was general
resistance to making developers install something as complicated as
libclang --- that could be a big lift on non-mainstream platforms.
So IMO it's a feature not a bug that Peter's approach just uses a perl
script. OTOH, the downstream aspects of your patch did seem appealing.
So I'd like to see a merger of the two approaches, using perl for the
data extraction and then something like what you'd done. Maybe that's
the same thing you're saying.

I also see Peter's point that committing what he has here might be
a reasonable first step on that road. Getting the data extraction
right is a big chunk of the job, and what we do with it afterward
could be improved later.

regards, tom lane

#6Andres Freund
andres@anarazel.de
In reply to: Tom Lane (#5)
Re: automatically generating node support functions

Hi,

On 2021-07-14 17:42:10 -0400, Tom Lane wrote:

I think the main reason that the previous patch went nowhere was general
resistance to making developers install something as complicated as
libclang --- that could be a big lift on non-mainstream platforms.

I'm still not particularly convinced it's and issue - I was suggesting
we commit the resulting metadata, so libclang would only be needed when
modifying node types. And even in case one needs to desperately modify
node types on a system without access to libclang, for an occasionally
small change one could just modify the committed metadata structs
manually.

So IMO it's a feature not a bug that Peter's approach just uses a perl
script. OTOH, the downstream aspects of your patch did seem appealing.
So I'd like to see a merger of the two approaches, using perl for the
data extraction and then something like what you'd done. Maybe that's
the same thing you're saying.

Yes, that's what I was trying to say. I'm still doubtful it's a great
idea to go further down the "weird subset of C parsed by regexes" road,
but I can live with it. If Peter could generate something roughly like
the metadata I emitted, I'd rebase my node functions ontop of that.

I also see Peter's point that committing what he has here might be
a reasonable first step on that road. Getting the data extraction
right is a big chunk of the job, and what we do with it afterward
could be improved later.

To me that seems likely to just cause churn without saving much
effort. The needed information isn't really the same between generating
the node functions as text and collecting the metadata for "generic node
functions", and none of the output is the same.

Greetings,

Andres Freund

#7Peter Eisentraut
peter_e@gmx.net
In reply to: Peter Eisentraut (#1)
Re: automatically generating node support functions

On 07.06.21 22:27, Peter Eisentraut wrote:

I wrote a script to automatically generate the node support functions
(copy, equal, out, and read, as well as the node tags enum) from the
struct definitions.

The first eight patches are to clean up various inconsistencies to make
parsing or generation easier.

Are there any concerns about the patches 0001 through 0008? Otherwise,
maybe we could get those out of the way.

#8Tom Lane
tgl@sss.pgh.pa.us
In reply to: Peter Eisentraut (#7)
Re: automatically generating node support functions

Peter Eisentraut <peter.eisentraut@enterprisedb.com> writes:

The first eight patches are to clean up various inconsistencies to make
parsing or generation easier.

Are there any concerns about the patches 0001 through 0008? Otherwise,
maybe we could get those out of the way.

I looked through those and don't have any complaints (though I just
eyeballed them, I didn't see what a compiler would say). I see
you pushed a couple of them already.

regards, tom lane

#9Peter Eisentraut
peter_e@gmx.net
In reply to: Peter Eisentraut (#7)
Re: automatically generating node support functions

Here is another set of preparatory patches that clean up various special
cases and similar in the node support.

0001-Remove-T_Expr.patch

Removes unneeded T_Expr.

0002-Add-COPY_ARRAY_FIELD-and-COMPARE_ARRAY_FIELD.patch
0003-Add-WRITE_INDEX_ARRAY.patch

These add macros to handle a few cases that were previously hand-coded.

0004-Make-node-output-prefix-match-node-structure-name.patch

Some nodes' output/read functions use a label that is slightly different
from their node name, e.g., "NOTIFY" instead of "NOTIFYSTMT". This
cleans that up so that an automated approach doesn't have to deal with
these special cases.

0005-Add-Cardinality-typedef.patch

Adds a typedef Cardinality for double fields that store an estimated row
or other count. Works alongside Cost and Selectivity.

This is useful because it appears that the serialization format for
these float fields depends on their intent: Cardinality => %.0f, Cost =>
%.2f, Selectivity => %.4f. The only remaining exception is allvisfrac,
which uses %.6f. Maybe that could also be a Selectivity, but I left it
as is. I think this improves the clarity in this area.

Attachments:

0001-Remove-T_Expr.patchtext/plain; charset=UTF-8; name=0001-Remove-T_Expr.patch; x-mac-creator=0; x-mac-type=0Download+0-2
0002-Add-COPY_ARRAY_FIELD-and-COMPARE_ARRAY_FIELD.patchtext/plain; charset=UTF-8; name=0002-Add-COPY_ARRAY_FIELD-and-COMPARE_ARRAY_FIELD.patch; x-mac-creator=0; x-mac-type=0Download+14-5
0003-Add-WRITE_INDEX_ARRAY.patchtext/plain; charset=UTF-8; name=0003-Add-WRITE_INDEX_ARRAY.patch; x-mac-creator=0; x-mac-type=0Download+8-9
0004-Make-node-output-prefix-match-node-structure-name.patchtext/plain; charset=UTF-8; name=0004-Make-node-output-prefix-match-node-structure-name.patch; x-mac-creator=0; x-mac-type=0Download+22-23
0005-Add-Cardinality-typedef.patchtext/plain; charset=UTF-8; name=0005-Add-Cardinality-typedef.patch; x-mac-creator=0; x-mac-type=0Download+26-26
#10Jacob Champion
jacob.champion@enterprisedb.com
In reply to: Peter Eisentraut (#9)
Re: automatically generating node support functions

On Tue, 2021-08-17 at 16:36 +0200, Peter Eisentraut wrote:

Here is another set of preparatory patches that clean up various special
cases and similar in the node support.

0001-Remove-T_Expr.patch

Removes unneeded T_Expr.

0002-Add-COPY_ARRAY_FIELD-and-COMPARE_ARRAY_FIELD.patch
0003-Add-WRITE_INDEX_ARRAY.patch

These add macros to handle a few cases that were previously hand-coded.

These look sane to me.

0004-Make-node-output-prefix-match-node-structure-name.patch

Some nodes' output/read functions use a label that is slightly different
from their node name, e.g., "NOTIFY" instead of "NOTIFYSTMT". This
cleans that up so that an automated approach doesn't have to deal with
these special cases.

Is there any concern about the added serialization length, or is that
trivial in practice? The one that particularly caught my eye is
RANGETBLENTRY, which was previously RTE. But I'm not very well-versed
in all the places these strings can be generated and stored.

0005-Add-Cardinality-typedef.patch

Adds a typedef Cardinality for double fields that store an estimated row
or other count. Works alongside Cost and Selectivity.

Should RangeTblEntry.enrtuples also be a Cardinality?

--Jacob

#11Peter Eisentraut
peter_e@gmx.net
In reply to: Jacob Champion (#10)
Re: automatically generating node support functions

On 02.09.21 20:53, Jacob Champion wrote:

0004-Make-node-output-prefix-match-node-structure-name.patch

Some nodes' output/read functions use a label that is slightly different
from their node name, e.g., "NOTIFY" instead of "NOTIFYSTMT". This
cleans that up so that an automated approach doesn't have to deal with
these special cases.

Is there any concern about the added serialization length, or is that
trivial in practice? The one that particularly caught my eye is
RANGETBLENTRY, which was previously RTE. But I'm not very well-versed
in all the places these strings can be generated and stored.

These are just matters of taste. Let's wait a bit more to see if anyone
is concerned.

0005-Add-Cardinality-typedef.patch

Adds a typedef Cardinality for double fields that store an estimated row
or other count. Works alongside Cost and Selectivity.

Should RangeTblEntry.enrtuples also be a Cardinality?

Yes, I'll add that.

#12Noah Misch
noah@leadboat.com
In reply to: Peter Eisentraut (#11)
Re: automatically generating node support functions

On Tue, Sep 07, 2021 at 10:57:02AM +0200, Peter Eisentraut wrote:

On 02.09.21 20:53, Jacob Champion wrote:

0004-Make-node-output-prefix-match-node-structure-name.patch

Some nodes' output/read functions use a label that is slightly different
from their node name, e.g., "NOTIFY" instead of "NOTIFYSTMT". This
cleans that up so that an automated approach doesn't have to deal with
these special cases.

Is there any concern about the added serialization length, or is that
trivial in practice? The one that particularly caught my eye is
RANGETBLENTRY, which was previously RTE. But I'm not very well-versed
in all the places these strings can be generated and stored.

These are just matters of taste. Let's wait a bit more to see if anyone is
concerned.

I am not concerned about changing the serialization length this much. The
format is already quite verbose, and this change is small relative to that
existing verbosity.

#13Peter Eisentraut
peter_e@gmx.net
In reply to: Peter Eisentraut (#9)
Re: automatically generating node support functions

On 17.08.21 16:36, Peter Eisentraut wrote:

Here is another set of preparatory patches that clean up various special
cases and similar in the node support.

This set of patches has been committed. I'll close this commit fest
entry and come back with the main patch series in the future.

#14Peter Eisentraut
peter_e@gmx.net
In reply to: Peter Eisentraut (#13)
Re: automatically generating node support functions

On 15.09.21 21:01, Peter Eisentraut wrote:

On 17.08.21 16:36, Peter Eisentraut wrote:

Here is another set of preparatory patches that clean up various
special cases and similar in the node support.

This set of patches has been committed.  I'll close this commit fest
entry and come back with the main patch series in the future.

Here is an updated version of my original patch, so we have something to
continue the discussion around. This takes into account all the
preparatory patches that have been committed in the meantime. I have
also changed it so that the array size of a pointer is now explicitly
declared using pg_node_attr(array_size(N)) instead of picking the most
recent scalar field, which was admittedly hacky. I have also added MSVC
build support and made the Perl code more portable, so that the cfbot
doesn't have to be sad.

Attachments:

v2-0001-Automatically-generate-node-support-functions.patchtext/plain; charset=UTF-8; name=v2-0001-Automatically-generate-node-support-functions.patch; x-mac-creator=0; x-mac-type=0Download+954-134
#15Corey Huinker
corey.huinker@gmail.com
In reply to: Peter Eisentraut (#14)
Re: automatically generating node support functions

build support and made the Perl code more portable, so that the cfbot
doesn't have to be sad.

Was this also the reason for doing the output with print statements rather
than using one of the templating libraries? I'm mostly just curious, and
certainly don't want it to get in the way of working code.

#16Peter Eisentraut
peter_e@gmx.net
In reply to: Corey Huinker (#15)
Re: automatically generating node support functions

On 12.10.21 03:06, Corey Huinker wrote:

build support and made the Perl code more portable, so that the cfbot
doesn't have to be sad.

Was this also the reason for doing the output with print statements
rather than using one of the templating libraries? I'm mostly just
curious, and certainly don't want it to get in the way of working code.

Unless there is a templating library that ships with Perl (>= 5.8.3,
apparently now), this seems impractical.

#17Andrew Dunstan
andrew@dunslane.net
In reply to: Peter Eisentraut (#14)
Re: automatically generating node support functions

On 10/11/21 10:22 AM, Peter Eisentraut wrote:

On 15.09.21 21:01, Peter Eisentraut wrote:

On 17.08.21 16:36, Peter Eisentraut wrote:

Here is another set of preparatory patches that clean up various
special cases and similar in the node support.

This set of patches has been committed.  I'll close this commit fest
entry and come back with the main patch series in the future.

Here is an updated version of my original patch, so we have something
to continue the discussion around.  This takes into account all the
preparatory patches that have been committed in the meantime.  I have
also changed it so that the array size of a pointer is now explicitly
declared using pg_node_attr(array_size(N)) instead of picking the most
recent scalar field, which was admittedly hacky.  I have also added
MSVC build support and made the Perl code more portable, so that the
cfbot doesn't have to be sad.

I haven't been through the whole thing, but I did notice this: the
comment stripping code looks rather fragile. I think it would blow up if
there were a continuation line not starting with  qr/\s*\*/. It's a lot
simpler and more robust to do this if you slurp the file in whole.
Here's what we do in the buildfarm code:

    my $src = file_contents($_);
# strip C comments
    # We used to use the recipe in perlfaq6 but there is actually no point.
    # We don't need to keep the quoted string values anyway, and
    # on some platforms the complex regex causes perl to barf and crash.
    $src =~ s{/\*.*?\*/}{}gs;

After you've done that splitting it into lines is pretty simple.

cheers

andrew

--
Andrew Dunstan
EDB: https://www.enterprisedb.com

#18Peter Eisentraut
peter_e@gmx.net
In reply to: Andrew Dunstan (#17)
Re: automatically generating node support functions

On 12.10.21 15:52, Andrew Dunstan wrote:

I haven't been through the whole thing, but I did notice this: the
comment stripping code looks rather fragile. I think it would blow up if
there were a continuation line not starting with  qr/\s*\*/. It's a lot
simpler and more robust to do this if you slurp the file in whole.
Here's what we do in the buildfarm code:

    my $src = file_contents($_);
# strip C comments
    # We used to use the recipe in perlfaq6 but there is actually no point.
    # We don't need to keep the quoted string values anyway, and
    # on some platforms the complex regex causes perl to barf and crash.
    $src =~ s{/\*.*?\*/}{}gs;

After you've done that splitting it into lines is pretty simple.

Here is an updated patch, with some general rebasing, and the above
improvement. It now also generates #include lines necessary in
copyfuncs etc. to pull in all the node types it operates on.

Further, I have looked more into the "metadata" approach discussed in
[0]: /messages/by-id/20190828234136.fk2ndqtld3onfrrp@alap3.anarazel.de
structures my script produces. You just loop over all the node types
and print stuff and keep a few counters. I don't plan to work on that
at this time, but I just wanted to point out that if people wanted to
move into that direction, my patch wouldn't be in the way.

[0]: /messages/by-id/20190828234136.fk2ndqtld3onfrrp@alap3.anarazel.de
/messages/by-id/20190828234136.fk2ndqtld3onfrrp@alap3.anarazel.de

Attachments:

v3-0001-Automatically-generate-node-support-functions.patchtext/plain; charset=UTF-8; name=v3-0001-Automatically-generate-node-support-functions.patchDownload+976-148
#19Peter Eisentraut
peter_e@gmx.net
In reply to: Peter Eisentraut (#18)
Re: automatically generating node support functions

Rebased patch to resolve some merge conflicts

Show quoted text

On 29.12.21 12:08, Peter Eisentraut wrote:

On 12.10.21 15:52, Andrew Dunstan wrote:

I haven't been through the whole thing, but I did notice this: the
comment stripping code looks rather fragile. I think it would blow up if
there were a continuation line not starting with  qr/\s*\*/. It's a lot
simpler and more robust to do this if you slurp the file in whole.
Here's what we do in the buildfarm code:

     my $src = file_contents($_);
     # strip C comments
     # We used to use the recipe in perlfaq6 but there is actually no
point.
     # We don't need to keep the quoted string values anyway, and
     # on some platforms the complex regex causes perl to barf and crash.
     $src =~ s{/\*.*?\*/}{}gs;

After you've done that splitting it into lines is pretty simple.

Here is an updated patch, with some general rebasing, and the above
improvement.  It now also generates #include lines necessary in
copyfuncs etc. to pull in all the node types it operates on.

Further, I have looked more into the "metadata" approach discussed in
[0].  It's pretty easy to generate that kind of output from the data
structures my script produces.  You just loop over all the node types
and print stuff and keep a few counters.  I don't plan to work on that
at this time, but I just wanted to point out that if people wanted to
move into that direction, my patch wouldn't be in the way.

[0]:
/messages/by-id/20190828234136.fk2ndqtld3onfrrp@alap3.anarazel.de

Attachments:

v4-0001-Automatically-generate-node-support-functions.patchtext/plain; charset=UTF-8; name=v4-0001-Automatically-generate-node-support-functions.patchDownload+977-149
#20Peter Eisentraut
peter_e@gmx.net
In reply to: Peter Eisentraut (#19)
Re: automatically generating node support functions

What do people think about this patch now?

I have received some feedback on several small technical issues, which
have all been fixed. This patch has been around for several commit
fests now and AFAICT, nothing has broken it. This is just to indicate
that the parsing isn't as flimsy as one might fear.

One thing thing that is waiting behind this patch is that you currently
cannot put utility commands into parse-time SQL functions, because there
is no full out/read support for those. This patch would fix that
problem. (There is a little bit of additional work necessary, but I
have that mostly worked out in a separate branch.)

Show quoted text

On 24.01.22 16:15, Peter Eisentraut wrote:

Rebased patch to resolve some merge conflicts

On 29.12.21 12:08, Peter Eisentraut wrote:

On 12.10.21 15:52, Andrew Dunstan wrote:

I haven't been through the whole thing, but I did notice this: the
comment stripping code looks rather fragile. I think it would blow up if
there were a continuation line not starting with  qr/\s*\*/. It's a lot
simpler and more robust to do this if you slurp the file in whole.
Here's what we do in the buildfarm code:

     my $src = file_contents($_);
     # strip C comments
     # We used to use the recipe in perlfaq6 but there is actually no
point.
     # We don't need to keep the quoted string values anyway, and
     # on some platforms the complex regex causes perl to barf and
crash.
     $src =~ s{/\*.*?\*/}{}gs;

After you've done that splitting it into lines is pretty simple.

Here is an updated patch, with some general rebasing, and the above
improvement.  It now also generates #include lines necessary in
copyfuncs etc. to pull in all the node types it operates on.

Further, I have looked more into the "metadata" approach discussed in
[0].  It's pretty easy to generate that kind of output from the data
structures my script produces.  You just loop over all the node types
and print stuff and keep a few counters.  I don't plan to work on that
at this time, but I just wanted to point out that if people wanted to
move into that direction, my patch wouldn't be in the way.

[0]:
/messages/by-id/20190828234136.fk2ndqtld3onfrrp@alap3.anarazel.de

#21Tom Lane
tgl@sss.pgh.pa.us
In reply to: Peter Eisentraut (#20)
#22Andres Freund
andres@anarazel.de
In reply to: Tom Lane (#21)
#23Tom Lane
tgl@sss.pgh.pa.us
In reply to: Andres Freund (#22)
#24Andres Freund
andres@anarazel.de
In reply to: Tom Lane (#23)
#25Tom Lane
tgl@sss.pgh.pa.us
In reply to: Andres Freund (#24)
#26Andres Freund
andres@anarazel.de
In reply to: Tom Lane (#25)
#27Peter Eisentraut
peter_e@gmx.net
In reply to: Tom Lane (#21)
#28David Rowley
dgrowleyml@gmail.com
In reply to: Peter Eisentraut (#27)
#29Peter Eisentraut
peter_e@gmx.net
In reply to: David Rowley (#28)
#30Tom Lane
tgl@sss.pgh.pa.us
In reply to: Peter Eisentraut (#29)
#31Peter Eisentraut
peter_e@gmx.net
In reply to: Tom Lane (#30)
#32Alvaro Herrera
alvherre@2ndquadrant.com
In reply to: Peter Eisentraut (#27)
#33Tom Lane
tgl@sss.pgh.pa.us
In reply to: Alvaro Herrera (#32)
#34Peter Eisentraut
peter_e@gmx.net
In reply to: Tom Lane (#33)
#35Peter Eisentraut
peter_e@gmx.net
In reply to: Alvaro Herrera (#32)
#36Alvaro Herrera
alvherre@2ndquadrant.com
In reply to: Peter Eisentraut (#35)
#37Peter Eisentraut
peter_e@gmx.net
In reply to: Peter Eisentraut (#29)
#38Tom Lane
tgl@sss.pgh.pa.us
In reply to: Peter Eisentraut (#37)
#39Peter Eisentraut
peter_e@gmx.net
In reply to: Tom Lane (#38)
#40Tom Lane
tgl@sss.pgh.pa.us
In reply to: Peter Eisentraut (#39)
#41Tom Lane
tgl@sss.pgh.pa.us
In reply to: Tom Lane (#40)
#42Peter Eisentraut
peter_e@gmx.net
In reply to: Tom Lane (#40)
#43Peter Eisentraut
peter_e@gmx.net
In reply to: Tom Lane (#41)
#44Tom Lane
tgl@sss.pgh.pa.us
In reply to: Peter Eisentraut (#42)
#45Tom Lane
tgl@sss.pgh.pa.us
In reply to: Tom Lane (#44)
#46Peter Eisentraut
peter_e@gmx.net
In reply to: Tom Lane (#44)
#47Tom Lane
tgl@sss.pgh.pa.us
In reply to: Peter Eisentraut (#46)
#48Peter Eisentraut
peter_e@gmx.net
In reply to: Tom Lane (#47)
#49Alvaro Herrera
alvherre@2ndquadrant.com
In reply to: Peter Eisentraut (#46)
#50Tom Lane
tgl@sss.pgh.pa.us
In reply to: Alvaro Herrera (#49)
#51Tom Lane
tgl@sss.pgh.pa.us
In reply to: Peter Eisentraut (#48)
#52Tom Lane
tgl@sss.pgh.pa.us
In reply to: Tom Lane (#51)
#53Peter Eisentraut
peter_e@gmx.net
In reply to: Tom Lane (#51)
#54Tom Lane
tgl@sss.pgh.pa.us
In reply to: Peter Eisentraut (#53)
#55Tom Lane
tgl@sss.pgh.pa.us
In reply to: Peter Eisentraut (#53)
#56Andres Freund
andres@anarazel.de
In reply to: Peter Eisentraut (#53)
#57Tom Lane
tgl@sss.pgh.pa.us
In reply to: Andres Freund (#56)
#58Peter Eisentraut
peter_e@gmx.net
In reply to: Tom Lane (#57)
#59Tom Lane
tgl@sss.pgh.pa.us
In reply to: Peter Eisentraut (#58)
#60Tom Lane
tgl@sss.pgh.pa.us
In reply to: Peter Eisentraut (#58)
#61Tom Lane
tgl@sss.pgh.pa.us
In reply to: Tom Lane (#59)
#62Andres Freund
andres@anarazel.de
In reply to: Tom Lane (#61)
#63Tom Lane
tgl@sss.pgh.pa.us
In reply to: Tom Lane (#60)
#64Robert Haas
robertmhaas@gmail.com
In reply to: Tom Lane (#63)
#65Andres Freund
andres@anarazel.de
In reply to: Tom Lane (#63)
#66Tom Lane
tgl@sss.pgh.pa.us
In reply to: Robert Haas (#64)
#67Tom Lane
tgl@sss.pgh.pa.us
In reply to: Andres Freund (#65)
#68Robert Haas
robertmhaas@gmail.com
In reply to: Tom Lane (#66)
#69Andres Freund
andres@anarazel.de
In reply to: Tom Lane (#66)
#70Andres Freund
andres@anarazel.de
In reply to: Robert Haas (#68)
#71Tom Lane
tgl@sss.pgh.pa.us
In reply to: Andres Freund (#70)
#72Tom Lane
tgl@sss.pgh.pa.us
In reply to: Andres Freund (#69)
#73Tom Lane
tgl@sss.pgh.pa.us
In reply to: Tom Lane (#67)
#74Andres Freund
andres@anarazel.de
In reply to: Tom Lane (#72)
#75Tom Lane
tgl@sss.pgh.pa.us
In reply to: Andres Freund (#74)
#76Andres Freund
andres@anarazel.de
In reply to: Tom Lane (#75)
#77Tom Lane
tgl@sss.pgh.pa.us
In reply to: Andres Freund (#76)
#78Andres Freund
andres@anarazel.de
In reply to: Tom Lane (#77)
#79Peter Eisentraut
peter_e@gmx.net
In reply to: Tom Lane (#63)
#80Tom Lane
tgl@sss.pgh.pa.us
In reply to: Peter Eisentraut (#79)
#81Tom Lane
tgl@sss.pgh.pa.us
In reply to: Andres Freund (#78)
#82Amit Kapila
amit.kapila16@gmail.com
In reply to: Peter Eisentraut (#79)
#83Tom Lane
tgl@sss.pgh.pa.us
In reply to: Amit Kapila (#82)
#84Amit Kapila
amit.kapila16@gmail.com
In reply to: Tom Lane (#83)
#85Tom Lane
tgl@sss.pgh.pa.us
In reply to: Amit Kapila (#84)
#86Amit Kapila
amit.kapila16@gmail.com
In reply to: Tom Lane (#85)
#87Tom Lane
tgl@sss.pgh.pa.us
In reply to: Amit Kapila (#86)
#88Tom Lane
tgl@sss.pgh.pa.us
In reply to: Tom Lane (#87)
#89Amit Kapila
amit.kapila16@gmail.com
In reply to: Tom Lane (#88)