unrecognized node type while displaying a Path due to dangling pointer

Started by Jeevan Chalkeover 2 years ago15 messages

jeevan.chalke@enterprisedb.com

over 2 years ago

Hello,

While debugging one issue, we have added the below line in
postgresGetForeignPlan() to see the foreignrel details.

@@ -1238,6 +1238,8 @@ postgresGetForeignPlan(PlannerInfo *root,
bool has_limit = false;
ListCell *lc;

+   elog(INFO, "foreignrel: %s", nodeToString(foreignrel));
+

And with this change, we ran the postgres_fdw regression (executing
sql/postgres_fdw.sql) suite. We observed the below warnings that seem
strange.

+WARNING:  could not dump unrecognized node type: 0
+WARNING:  could not dump unrecognized node type: 26072088
+WARNING:  could not dump unrecognized node type: 26438448
+WARNING:  could not dump unrecognized node type: 368

Of course, with multiple runs, we see some random node types listed above.
Thanks to my colleague Suraj Kharage for this and for working parallel with
me.

Does anybody have any idea about these?

After debugging one random query from the above-failed case, what we have
observed is (we might be wrong, but worth noting here):

1. This warning ended up while displaying RelOptInfo->pathlist.
2. In create_ordered_paths(), input_rel has two paths, and it loops over
both paths to get the best-sorted path.
3. First path was unsorted, and thus we add a sort node on top of it, and
adds that to the ordered_rel.
4. However, 2nd path was already sorted and passed as is to the add_path().
5. add_path() decides to reject this new path on some metrics. However, in
the end, it pfree() this passed in path. It seems wrong as its references
do present elsewhere. For example, in the first path's parent rels path
list.
6. So, while displaying the parent's path, we end up with these warnings.

I tried to get a fix for this but no luck so far.
One approach was to copy the path before passing it to the add_path().
However, there is no easy way to copy a path due to its circular references.

To see whether this warning goes or not, I have commented code in add_path()
that does pfree() on the new_path. And with that, I don't see any warnings.
But removing that code doesn't seem to be the correct fix.

Suggestions?

Thanks

--
Jeevan Chalke

*Senior Staff SDE, Database Architect, and ManagerProduct Development*

edbpostgres.com

Alvaro Herrera

alvherre@alvh.no-ip.org

over 2 years ago

In reply to: Jeevan Chalke (#1)

Re: unrecognized node type while displaying a Path due to dangling pointer

On 2023-Jul-11, Jeevan Chalke wrote:

4. However, 2nd path was already sorted and passed as is to the add_path().
5. add_path() decides to reject this new path on some metrics. However, in
the end, it pfree() this passed in path. It seems wrong as its references
do present elsewhere. For example, in the first path's parent rels path
list.
6. So, while displaying the parent's path, we end up with these warnings.

In other words, this is use-after-free, with add_path freeing the
passed-in Path pointer, but one particular case in which this Path is
still used afterwards.

I tried to get a fix for this but no luck so far.

I proposed to add an add_path_extended() function that adds 'bool
free_input_path' argument, and pass it false in that one place in
create_ordered_paths.

--
Álvaro Herrera 48°01'N 7°57'E — https://www.EnterpriseDB.com/

Jeevan Chalke

jeevan.chalke@enterprisedb.com

over 2 years ago

In reply to: Alvaro Herrera (#2)

Re: unrecognized node type while displaying a Path due to dangling pointer

On Tue, Jul 11, 2023 at 1:19 PM Alvaro Herrera <alvherre@alvh.no-ip.org>
wrote:

On 2023-Jul-11, Jeevan Chalke wrote:

4. However, 2nd path was already sorted and passed as is to the

add_path().

5. add_path() decides to reject this new path on some metrics. However,

in

the end, it pfree() this passed in path. It seems wrong as its references
do present elsewhere. For example, in the first path's parent rels path
list.
6. So, while displaying the parent's path, we end up with these warnings.

In other words, this is use-after-free, with add_path freeing the
passed-in Path pointer, but one particular case in which this Path is
still used afterwards.

I tried to get a fix for this but no luck so far.

I proposed to add an add_path_extended() function that adds 'bool
free_input_path' argument, and pass it false in that one place in
create_ordered_paths.

Yeah, this can be a way.

However, I am thinking the other way around now. What if we first added the
unmodified input path as it is to the ordered_rel first?

If we do so, then while adding the next path, add_path() may decide to
remove the older one as the newer path is the best one. The remove_old
logic in add_path() will free the path (unsorted one), and we end up with
the same error.

And if we conditionally remove that path (remove_old logic one), then we
need to pass false in every add_path() call in create_ordered_paths().

Am I missing something?

Thanks

--
Álvaro Herrera 48°01'N 7°57'E —
https://www.EnterpriseDB.com/

--
Jeevan Chalke

*Senior Staff SDE, Database Architect, and ManagerProduct Development*

edbpostgres.com

Jeevan Chalke

jeevan.chalke@enterprisedb.com

over 2 years ago

In reply to: Jeevan Chalke (#3)

1 attachment(s)

Re: unrecognized node type while displaying a Path due to dangling pointer

On Tue, Jul 11, 2023 at 2:58 PM Jeevan Chalke <
jeevan.chalke@enterprisedb.com> wrote:

On Tue, Jul 11, 2023 at 1:19 PM Alvaro Herrera <alvherre@alvh.no-ip.org>
wrote:

On 2023-Jul-11, Jeevan Chalke wrote:

4. However, 2nd path was already sorted and passed as is to the

add_path().

5. add_path() decides to reject this new path on some metrics. However,

in

the end, it pfree() this passed in path. It seems wrong as its

references

do present elsewhere. For example, in the first path's parent rels path
list.
6. So, while displaying the parent's path, we end up with these

warnings.

In other words, this is use-after-free, with add_path freeing the
passed-in Path pointer, but one particular case in which this Path is
still used afterwards.

I tried to get a fix for this but no luck so far.

I proposed to add an add_path_extended() function that adds 'bool
free_input_path' argument, and pass it false in that one place in
create_ordered_paths.

Yeah, this can be a way.

However, I am thinking the other way around now. What if we first added
the unmodified input path as it is to the ordered_rel first?

If we do so, then while adding the next path, add_path() may decide to
remove the older one as the newer path is the best one. The remove_old
logic in add_path() will free the path (unsorted one), and we end up with
the same error.

And if we conditionally remove that path (remove_old logic one), then we
need to pass false in every add_path() call in create_ordered_paths().

Attached patch.

Am I missing something?

Thanks

--
Álvaro Herrera 48°01'N 7°57'E —
https://www.EnterpriseDB.com/

--
Jeevan Chalke

*Senior Staff SDE, Database Architect, and ManagerProduct Development*

edbpostgres.com

--
Jeevan Chalke

*Senior Staff SDE, Database Architect, and ManagerProduct Development*

edbpostgres.com

Attachments:

fix_add_path.patchapplication/octet-stream; name=fix_add_path.patchDownload

commit f9c753286554cc2fb6e96dcfe6641247216d7107
Author: Jeevan Chalke <jeevan.chalke@enterprisedb.com>
Date:   Tue Jul 11 14:39:58 2023 +0530

    Conditionally free the path in add_path()
    
    The passed-in path or already added path in the rel is freed if we
    reject it or are not needed anymore.  However, that path may have some
    references elsewhere, and thus it is not safe to always free.  We have
    that case when add_path() is called to add an ordered path that is
    already sorted but got rejected.  To avoid this, add the free_path
    argument to add_path() which is used for conditionally freeing the
    path.
    
    To avoid API breakage, add a wrapper instead, named add_path_extended.
    
    Reported by me, fix suggested by Alvaro Herrera.

diff --git a/src/backend/optimizer/plan/planner.c b/src/backend/optimizer/plan/planner.c
index 0e12fde..d0d7c3e 100644
--- a/src/backend/optimizer/plan/planner.c
+++ b/src/backend/optimizer/plan/planner.c
@@ -5162,7 +5162,7 @@ create_ordered_paths(PlannerInfo *root,
 			sorted_path = apply_projection_to_path(root, ordered_rel,
 												   sorted_path, target);
 
-		add_path(ordered_rel, sorted_path);
+		add_path_extended(ordered_rel, sorted_path, false);
 	}
 
 	/*
@@ -5212,7 +5212,7 @@ create_ordered_paths(PlannerInfo *root,
 				path = apply_projection_to_path(root, ordered_rel,
 												path, target);
 
-			add_path(ordered_rel, path);
+			add_path_extended(ordered_rel, path, false);
 		}
 
 		/*
@@ -5271,7 +5271,7 @@ create_ordered_paths(PlannerInfo *root,
 					sorted_path = apply_projection_to_path(root, ordered_rel,
 														   sorted_path, target);
 
-				add_path(ordered_rel, sorted_path);
+				add_path_extended(ordered_rel, sorted_path, false);
 			}
 		}
 	}
diff --git a/src/backend/optimizer/util/pathnode.c b/src/backend/optimizer/util/pathnode.c
index 5f55968..262084a 100644
--- a/src/backend/optimizer/util/pathnode.c
+++ b/src/backend/optimizer/util/pathnode.c
@@ -362,6 +362,16 @@ set_cheapest(RelOptInfo *parent_rel)
 
 /*
  * add_path
+ *	  See add_path_extended()
+ */
+void
+add_path(RelOptInfo *parent_rel, Path *new_path)
+{
+	add_path_extended(parent_rel, new_path, true);
+}
+
+/*
+ * add_path_extended
  *	  Consider a potential implementation path for the specified parent rel,
  *	  and add it to the rel's pathlist if it is worthy of consideration.
  *	  A path is worthy if it has a better sort order (better pathkeys) or
@@ -415,11 +425,12 @@ set_cheapest(RelOptInfo *parent_rel)
  *
  * 'parent_rel' is the relation entry to which the path corresponds.
  * 'new_path' is a potential path for parent_rel.
+ * 'free_path' if true, pfree() the path if rejected or not needed.
  *
  * Returns nothing, but modifies parent_rel->pathlist.
  */
 void
-add_path(RelOptInfo *parent_rel, Path *new_path)
+add_path_extended(RelOptInfo *parent_rel, Path *new_path, bool free_path)
 {
 	bool		accept_new = true;	/* unless we find a superior old path */
 	int			insert_at = 0;	/* where to insert new item */
@@ -590,7 +601,7 @@ add_path(RelOptInfo *parent_rel, Path *new_path)
 			/*
 			 * Delete the data pointed-to by the deleted cell, if possible
 			 */
-			if (!IsA(old_path, IndexPath))
+			if (free_path && !IsA(old_path, IndexPath))
 				pfree(old_path);
 		}
 		else
@@ -618,7 +629,7 @@ add_path(RelOptInfo *parent_rel, Path *new_path)
 	else
 	{
 		/* Reject and recycle the new path */
-		if (!IsA(new_path, IndexPath))
+		if (free_path && !IsA(new_path, IndexPath))
 			pfree(new_path);
 	}
 }
diff --git a/src/include/optimizer/pathnode.h b/src/include/optimizer/pathnode.h
index 001e75b..567e70e 100644
--- a/src/include/optimizer/pathnode.h
+++ b/src/include/optimizer/pathnode.h
@@ -27,6 +27,8 @@ extern int	compare_fractional_path_costs(Path *path1, Path *path2,
 										  double fraction);
 extern void set_cheapest(RelOptInfo *parent_rel);
 extern void add_path(RelOptInfo *parent_rel, Path *new_path);
+extern void add_path_extended(RelOptInfo *parent_rel, Path *new_path,
+							  bool free_path);
 extern bool add_path_precheck(RelOptInfo *parent_rel,
 							  Cost startup_cost, Cost total_cost,
 							  List *pathkeys, Relids required_outer);

Tom Lane

tgl@sss.pgh.pa.us

over 2 years ago

In reply to: Jeevan Chalke (#4)

Re: unrecognized node type while displaying a Path due to dangling pointer

Jeevan Chalke <jeevan.chalke@enterprisedb.com> writes:

Attached patch.

I would be astonished if this fixes anything. The code still doesn't
know which paths are referenced by which other ones, and so the place
where we free a previously-added path can't know what to do.

I've speculated about adding some form of reference counting to paths
(maybe just a "pin" flag rather than a full refcount) so that we could
be smarter about this. The existing kluge for "don't free IndexPaths"
could be replaced by setting the pin mark on any IndexPath that we
make a bitmap path from. Up to now it hasn't seemed necessary to
generalize that hack, but maybe it's time. Can you show a concrete
case where we are freeing a still-referenced path?

regards, tom lane

Jeevan Chalke

jeevan.chalke@enterprisedb.com

over 2 years ago

In reply to: Tom Lane (#5)

Re: unrecognized node type while displaying a Path due to dangling pointer

Hi Tom,

On Tue, Jul 11, 2023 at 4:30 PM Tom Lane <tgl@sss.pgh.pa.us> wrote:

Jeevan Chalke <jeevan.chalke@enterprisedb.com> writes:

Attached patch.

I would be astonished if this fixes anything. The code still doesn't
know which paths are referenced by which other ones, and so the place
where we free a previously-added path can't know what to do.

I've speculated about adding some form of reference counting to paths
(maybe just a "pin" flag rather than a full refcount) so that we could
be smarter about this. The existing kluge for "don't free IndexPaths"
could be replaced by setting the pin mark on any IndexPath that we
make a bitmap path from. Up to now it hasn't seemed necessary to
generalize that hack, but maybe it's time. Can you show a concrete
case where we are freeing a still-referenced path?

As mentioned earlier, while debugging some issues, we have put an elog
displaying the foreignrel contents using nodeToString(). Like below:

@@ -1238,6 +1238,8 @@ postgresGetForeignPlan(PlannerInfo *root,
bool has_limit = false;
ListCell *lc;

+   elog(INFO, "foreignrel: %s", nodeToString(foreignrel));
+

And ran the postgres_fdw regression and observed many warnings saying "could
not dump unrecognized node type". Here are the queries retrieved and
adjusted from postgres_fdw.sql

CREATE EXTENSION postgres_fdw;
CREATE SERVER loopback FOREIGN DATA WRAPPER postgres_fdw OPTIONS (dbname
'postgres', port '5432');
CREATE USER MAPPING FOR CURRENT_USER SERVER loopback;
CREATE TABLE t1 (c1 int NOT NULL, c2 int NOT NULL, CONSTRAINT t1_pkey
PRIMARY KEY (c1));
INSERT INTO t1 SELECT id, id % 10 FROM generate_series(1, 1000) id;
ANALYZE t1;
CREATE FOREIGN TABLE ft2 (c1 int NOT NULL, c2 int NOT NULL) SERVER loopback
OPTIONS (schema_name 'public', table_name 't1');

explain (verbose, costs off)
select c2, sum(c1) from ft2 group by c2 having avg(c1) < 500 and sum(c1) <
49800 order by c2;

With the above elog() in place, we can see the warning. And the pathlist
has a second path as empty ({}). Which got freed but has a reference in
this foreignrel.

Thanks

regards, tom lane

--
Jeevan Chalke

*Senior Staff SDE, Database Architect, and ManagerProduct Development*

edbpostgres.com

Tom Lane

tgl@sss.pgh.pa.us

over 2 years ago

In reply to: Jeevan Chalke (#6)

Re: unrecognized node type while displaying a Path due to dangling pointer

So what's going on here is that create_ordered_paths() does this:

foreach(lc, input_rel->pathlist)
{
Path *input_path = (Path *) lfirst(lc);

if (/* input path is suitably sorted already */)
sorted_path = input_path;
else
/* build a sorted path atop input_path */

/* Add projection step if needed */
if (sorted_path->pathtarget != target)
sorted_path = apply_projection_to_path(root, ordered_rel,
sorted_path, target);

add_path(ordered_rel, sorted_path);
}

Thus, if the input RelOptInfo has a path that already has the correct
ordering and output target, we'll try to add that path directly to
the output RelOptInfo. This is cheating in at least two ways:

1. The path's parent link isn't right: it still points to the input rel.

2. As per Jeevan's report, we now potentially have two different links
to the path. add_path could reject and free the path immediately,
or it could do so later while comparing it to some path offered later
for the output RelOptInfo, and either case leads to a dangling pointer
in the input RelOptInfo's pathlist.

Now, the reason we get away with #2 is that nothing looks at the lower
RelOptInfo's pathlist anymore after create_ordered_paths: we will only
be interested in paths that contribute to a surviving Path in the
output RelOptInfo, and those will be linked directly from the upper
Path. However, that's clearly kind of fragile, plus it's a bit
surprising that nobody has complained about #1.

We could probably fix this by creating a rule that you *must*
wrap a Path for a lower RelOptInfo into some sort of wrapper
Path before offering it as a candidate for an upper RelOptInfo.
(This could be cross-checked by having add_path Assert that
new_path->parent == parent_rel. The wrapper could be a do-nothing
ProjectionPath, perhaps.) But I think there are multiple places
taking similar shortcuts, so I'm a bit worried about how much overhead
we'll add for what seems likely to be only a debugging annoyance.

A low-cost fix perhaps could be to unlink the lower rel's whole
path list (set input_rel->pathlist = NIL, also zero the related
fields such as cheapest_path) once we've finished selecting the
paths we want for the upper rel. That's not great for debuggability
either, but maybe it's the most appropriate compromise.

I don't recall how clearly I understood this while writing the
upper-planner-pathification patch years ago. I think I did
realize the code was cheating, but if so I failed to document
it, so far as I can see.

regards, tom lane

David Rowley

dgrowleyml@gmail.com

over 2 years ago

In reply to: Tom Lane (#7)

Re: unrecognized node type while displaying a Path due to dangling pointer

On Wed, 12 Jul 2023 at 08:46, Tom Lane <tgl@sss.pgh.pa.us> wrote:

A low-cost fix perhaps could be to unlink the lower rel's whole
path list (set input_rel->pathlist = NIL, also zero the related
fields such as cheapest_path) once we've finished selecting the
paths we want for the upper rel. That's not great for debuggability
either, but maybe it's the most appropriate compromise.

I've not taken the time to fully understand this, but from reading the
thread, I'm not immediately understanding why we can't just shallow
copy the Path from the other RelOptInfo and replace the parent before
using it in the upper RelOptInfo. Can you explain?

David

Tom Lane

tgl@sss.pgh.pa.us

over 2 years ago

In reply to: David Rowley (#8)

Re: unrecognized node type while displaying a Path due to dangling pointer

David Rowley <dgrowleyml@gmail.com> writes:

I've not taken the time to fully understand this, but from reading the
thread, I'm not immediately understanding why we can't just shallow
copy the Path from the other RelOptInfo and replace the parent before
using it in the upper RelOptInfo. Can you explain?

I did think about that, but "shallow copy a Path" seems nontrivial
because the Path structs are all different sizes. Maybe it is worth
building some infrastructure to support that?

regards, tom lane

#10

David Rowley

dgrowleyml@gmail.com

over 2 years ago

In reply to: Tom Lane (#9)

Re: unrecognized node type while displaying a Path due to dangling pointer

On Wed, 12 Jul 2023 at 14:23, Tom Lane <tgl@sss.pgh.pa.us> wrote:

I did think about that, but "shallow copy a Path" seems nontrivial
because the Path structs are all different sizes. Maybe it is worth
building some infrastructure to support that?

It seems a reasonable thing to have to do. It seems the minimum thing
we could do to ensure each Path is only mentioned in at most 1
RelOptInfo.

I see GetExistingLocalJoinPath() in foreign.c might be related to this
problem, per:

* If the inner or outer subpath of the chosen path is a ForeignScan, we
* replace it with its outer subpath. For this reason, and also because the
* planner might free the original path later, the path returned by this
* function is a shallow copy of the original. There's no need to copy
* the substructure, so we don't.

so that function could probably disappear if we had this.

David

#11

David Rowley

dgrowleyml@gmail.com

over 2 years ago

In reply to: David Rowley (#10)

1 attachment(s)

Re: unrecognized node type while displaying a Path due to dangling pointer

On Wed, 12 Jul 2023 at 14:50, David Rowley <dgrowleyml@gmail.com> wrote:

On Wed, 12 Jul 2023 at 14:23, Tom Lane <tgl@sss.pgh.pa.us> wrote:

I did think about that, but "shallow copy a Path" seems nontrivial
because the Path structs are all different sizes. Maybe it is worth
building some infrastructure to support that?

It seems a reasonable thing to have to do. It seems the minimum thing
we could do to ensure each Path is only mentioned in at most 1
RelOptInfo.

I've attached a draft patch which adds copyObjectFlat() and supports
all Node types asides from the ones mentioned in @extra_tags in
gen_node_support.pl. This did require including all the node header
files in copyfuncs.c, which that file seems to have avoided until now.

I also didn't do anything about ExtensibleNode types. I assume just
copying the ExtensibleNode isn't good enough. To flat copy the actual
node I think would require adding a new function to
ExtensibleNodeMethods.

I was also unsure what we should do when shallow copying a List. The
problem there is if we just do a shallow copy, a repalloc() on the
elements array would end up pfreeing memory that might be used by a
shallow copied clone. Perhaps List is not unique in that regard?
Maybe the solution there is to add a special case and list_copy()
Lists like what is done in copyObjectImpl().

I'm hoping the attached patch will at least assist in moving the
discussion along.

David

Attachments:

flat_copy_node.patchtext/plain; charset=US-ASCII; name=flat_copy_node.patchDownload

diff --git a/src/backend/nodes/.gitignore b/src/backend/nodes/.gitignore
index 0c14b5697b..91cbd2cf24 100644
--- a/src/backend/nodes/.gitignore
+++ b/src/backend/nodes/.gitignore
@@ -1,4 +1,5 @@
 /node-support-stamp
+/nodesizes.h
 /nodetags.h
 /*funcs.funcs.c
 /*funcs.switch.c
diff --git a/src/backend/nodes/Makefile b/src/backend/nodes/Makefile
index 0a95e683d0..46ce62f828 100644
--- a/src/backend/nodes/Makefile
+++ b/src/backend/nodes/Makefile
@@ -99,4 +99,4 @@ queryjumblefuncs.o: queryjumblefuncs.c queryjumblefuncs.funcs.c queryjumblefuncs
 readfuncs.o:  readfuncs.c readfuncs.funcs.c readfuncs.switch.c | node-support-stamp
 
 maintainer-clean: clean
-	rm -f node-support-stamp $(addsuffix funcs.funcs.c,copy equal out queryjumble read) $(addsuffix funcs.switch.c,copy equal out queryjumble read) nodetags.h
+	rm -f node-support-stamp $(addsuffix funcs.funcs.c,copy equal out queryjumble read) $(addsuffix funcs.switch.c,copy equal out queryjumble read) nodesizes.h nodetags.h
diff --git a/src/backend/nodes/copyfuncs.c b/src/backend/nodes/copyfuncs.c
index f2568ff5e6..ccd4cbfa01 100644
--- a/src/backend/nodes/copyfuncs.c
+++ b/src/backend/nodes/copyfuncs.c
@@ -15,9 +15,54 @@
 
 #include "postgres.h"
 
+#include "access/amapi.h"
+#include "access/tableam.h"
+#include "access/tsmapi.h"
+#include "commands/event_trigger.h"
+#include "commands/trigger.h"
+#include "foreign/fdwapi.h"
 #include "miscadmin.h"
+#include "nodes/execnodes.h"
+#include "nodes/extensible.h"
+#include "nodes/miscnodes.h"
+#include "nodes/parsenodes.h"
+#include "nodes/pathnodes.h"
+#include "nodes/plannodes.h"
+#include "nodes/replnodes.h"
+#include "nodes/supportnodes.h"
+#include "nodes/tidbitmap.h"
 #include "utils/datum.h"
 
+static const Size flat_node_sizes[] = {
+	0, /* T_Invalid */
+#include "nodes/nodesizes.h"
+};
+
+/*
+ * copyObjectFlat
+ *		Allocate a new copy of the Node type denoted by 'from' and flat copy the
+ *		contents of it into the newly allocated node and return it.
+ */
+void *
+copyObjectFlat(const void *from)
+{
+	Size		size;
+	void	   *retval;
+	NodeTag		tag = nodeTag(from);
+
+	if ((unsigned int) tag >= lengthof(flat_node_sizes))
+	{
+		elog(ERROR, "unrecognized node type: %d", (int) tag);
+		return NULL;
+	}
+
+	/* XXX how to handle ExtensibleNodes? Can we just deep copy? */
+	size = flat_node_sizes[tag];
+	retval = palloc(size);
+	memcpy(retval, from, size);
+
+	return retval;
+}
 
 /*
  * Macros to simplify copying of different kinds of fields.  Use these
diff --git a/src/backend/nodes/gen_node_support.pl b/src/backend/nodes/gen_node_support.pl
index 72c7963578..e9a759c5a0 100644
--- a/src/backend/nodes/gen_node_support.pl
+++ b/src/backend/nodes/gen_node_support.pl
@@ -2,6 +2,7 @@
 #----------------------------------------------------------------------
 #
 # Generate node support files:
+# - nodesizes.h
 # - nodetags.h
 # - copyfuncs
 # - equalfuncs
@@ -599,6 +600,28 @@ my $header_comment =
  */
 ';
 
+# nodesizes.h
+
+push @output_files, 'nodesizes.h';
+open my $ns, '>', "$output_path/nodesizes.h$tmpext"
+  or die "$output_path/nodesizes.h$tmpext: $!";
+
+printf $ns $header_comment, 'nodesizes.h';
+
+foreach my $n (@node_types)
+{
+	next if elem $n, @abstract_types;
+	if (defined $manual_nodetag_number{$n})
+	{
+		print $ns "\tsizeof(T_${n}),\n";
+	}
+	else
+	{
+		print $ns "\tsizeof(${n}),\n";
+	}
+}
+
+close $ns;
 
 # nodetags.h
 
diff --git a/src/backend/optimizer/plan/planner.c b/src/backend/optimizer/plan/planner.c
index 44efb1f4eb..2bfcddbce2 100644
--- a/src/backend/optimizer/plan/planner.c
+++ b/src/backend/optimizer/plan/planner.c
@@ -5143,7 +5143,15 @@ create_ordered_paths(PlannerInfo *root,
 												input_path->pathkeys, &presorted_keys);
 
 		if (is_sorted)
-			sorted_path = input_path;
+		{
+			/*
+			 * Perform a flat copy of the already-sorted node so as not to reference an
+			 * existing Path from another RelOptInfo.  The add_path() call below may
+			 * pfree this path, which would be problematic when it's still referenced
+			 * by input_rel.
+			 */
+			sorted_path = copyObjectFlat(input_path);
+		}
 		else
 		{
 			/*
diff --git a/src/include/Makefile b/src/include/Makefile
index 5d213187e2..283ae311b3 100644
--- a/src/include/Makefile
+++ b/src/include/Makefile
@@ -44,6 +44,7 @@ install: all installdirs
 	$(INSTALL_DATA) pg_config.h     '$(DESTDIR)$(includedir_server)'
 	$(INSTALL_DATA) pg_config_ext.h '$(DESTDIR)$(includedir_server)'
 	$(INSTALL_DATA) pg_config_os.h  '$(DESTDIR)$(includedir_server)'
+	$(INSTALL_DATA) nodes/nodesizes.h '$(DESTDIR)$(includedir_server)/nodes'
 	$(INSTALL_DATA) nodes/nodetags.h '$(DESTDIR)$(includedir_server)/nodes'
 	$(INSTALL_DATA) utils/errcodes.h '$(DESTDIR)$(includedir_server)/utils'
 	$(INSTALL_DATA) utils/fmgroids.h '$(DESTDIR)$(includedir_server)/utils'
@@ -75,7 +76,7 @@ clean:
 	rm -f storage/lwlocknames.h utils/probes.h utils/wait_event_types.h
 	rm -f catalog/schemapg.h catalog/system_fk_info.h
 	rm -f catalog/pg_*_d.h catalog/header-stamp
-	rm -f nodes/nodetags.h nodes/header-stamp
+	rm -f nodes/nodesizes.h nodes/nodetags.h nodes/header-stamp
 
 distclean maintainer-clean: clean
 	rm -f pg_config.h pg_config_ext.h pg_config_os.h stamp-h stamp-ext-h
diff --git a/src/include/nodes/meson.build b/src/include/nodes/meson.build
index 626dc696d5..cc25d99701 100644
--- a/src/include/nodes/meson.build
+++ b/src/include/nodes/meson.build
@@ -31,7 +31,7 @@ foreach i : node_support_input_i
 endforeach
 
 node_support_output = [
-  'nodetags.h',
+  'nodesizes.h', 'nodetags.h',
   'outfuncs.funcs.c', 'outfuncs.switch.c',
   'readfuncs.funcs.c', 'readfuncs.switch.c',
   'copyfuncs.funcs.c', 'copyfuncs.switch.c',
@@ -39,6 +39,7 @@ node_support_output = [
   'queryjumblefuncs.funcs.c', 'queryjumblefuncs.switch.c',
 ]
 node_support_install = [
+  dir_include_server / 'nodes',
   dir_include_server / 'nodes',
   false, false,
   false, false,
diff --git a/src/include/nodes/nodes.h b/src/include/nodes/nodes.h
index f8e8fe699a..a325e15dcd 100644
--- a/src/include/nodes/nodes.h
+++ b/src/include/nodes/nodes.h
@@ -235,6 +235,7 @@ extern int16 *readAttrNumberCols(int numCols);
 /*
  * nodes/copyfuncs.c
  */
+extern void *copyObjectFlat(const void *from);
 extern void *copyObjectImpl(const void *from);
 
 /* cast result back to argument type, if supported by compiler */
diff --git a/src/tools/msvc/Solution.pm b/src/tools/msvc/Solution.pm
index 1cbc857e35..52b71e5f84 100644
--- a/src/tools/msvc/Solution.pm
+++ b/src/tools/msvc/Solution.pm
@@ -847,6 +847,14 @@ EOF
 		close($f);
 	}
 
+	if (IsNewer(
+			'src/include/nodes/nodesizes.h',
+			'src/backend/nodes/nodesizes.h'))
+	{
+		copyFile('src/backend/nodes/nodesizes.h',
+			'src/include/nodes/nodesizes.h');
+	}
+
 	if (IsNewer(
 			'src/include/nodes/nodetags.h',
 			'src/backend/nodes/nodetags.h'))
diff --git a/src/tools/msvc/clean.bat b/src/tools/msvc/clean.bat
index 7cb23ea894..f38e0f8dd2 100755
--- a/src/tools/msvc/clean.bat
+++ b/src/tools/msvc/clean.bat
@@ -41,6 +41,7 @@ REM Delete files created with GenerateFiles() in Solution.pm
 if exist src\include\pg_config.h del /q src\include\pg_config.h
 if exist src\include\pg_config_ext.h del /q src\include\pg_config_ext.h
 if exist src\include\pg_config_os.h del /q src\include\pg_config_os.h
+if exist src\include\nodes\nodesizes.h del /q src\include\nodes\nodesizes.h
 if exist src\include\nodes\nodetags.h del /q src\include\nodes\nodetags.h
 if exist src\include\utils\errcodes.h del /q src\include\utils\errcodes.h
 if exist src\include\utils\fmgroids.h del /q src\include\utils\fmgroids.h
diff --git a/src/tools/pginclude/cpluspluscheck b/src/tools/pginclude/cpluspluscheck
index 4e09c4686b..a5f999c5da 100755
--- a/src/tools/pginclude/cpluspluscheck
+++ b/src/tools/pginclude/cpluspluscheck
@@ -97,6 +97,10 @@ do
 	# sepgsql.h depends on headers that aren't there on most platforms.
 	test "$f" = contrib/sepgsql/sepgsql.h && continue
 
+	# nodesizes.h cannot be included standalone: it's just a code fragment.
+	test "$f" = src/include/nodes/nodesizes.h && continue
+	test "$f" = src/backend/nodes/nodesizes.h && continue
+
 	# nodetags.h cannot be included standalone: it's just a code fragment.
 	test "$f" = src/include/nodes/nodetags.h && continue
 	test "$f" = src/backend/nodes/nodetags.h && continue
diff --git a/src/tools/pginclude/headerscheck b/src/tools/pginclude/headerscheck
index 8dee1b5670..18cb54dbb1 100755
--- a/src/tools/pginclude/headerscheck
+++ b/src/tools/pginclude/headerscheck
@@ -92,6 +92,10 @@ do
 	# sepgsql.h depends on headers that aren't there on most platforms.
 	test "$f" = contrib/sepgsql/sepgsql.h && continue
 
+	# nodesizes.h cannot be included standalone: it's just a code fragment.
+	test "$f" = src/include/nodes/nodesizes.h && continue
+	test "$f" = src/backend/nodes/nodesizes.h && continue
+
 	# nodetags.h cannot be included standalone: it's just a code fragment.
 	test "$f" = src/include/nodes/nodetags.h && continue
 	test "$f" = src/backend/nodes/nodetags.h && continue

#12

Tom Lane

tgl@sss.pgh.pa.us

over 2 years ago

In reply to: David Rowley (#11)

Re: unrecognized node type while displaying a Path due to dangling pointer

David Rowley <dgrowleyml@gmail.com> writes:

On Wed, 12 Jul 2023 at 14:50, David Rowley <dgrowleyml@gmail.com> wrote:

On Wed, 12 Jul 2023 at 14:23, Tom Lane <tgl@sss.pgh.pa.us> wrote:

I did think about that, but "shallow copy a Path" seems nontrivial
because the Path structs are all different sizes. Maybe it is worth
building some infrastructure to support that?

It seems a reasonable thing to have to do. It seems the minimum thing
we could do to ensure each Path is only mentioned in at most 1
RelOptInfo.

...
I also didn't do anything about ExtensibleNode types. I assume just
copying the ExtensibleNode isn't good enough. To flat copy the actual
node I think would require adding a new function to
ExtensibleNodeMethods.

Yeah, the problem I've got with this approach is that flat-copying
FDW and Custom paths would require extending the respective APIs.
While that's a perfectly reasonable ask if we only need to do this
in HEAD, it would be a nonstarter for released branches. Is it
okay to only fix this issue in HEAD?

I was also unsure what we should do when shallow copying a List.

The proposal is to shallow-copy a Path node. List is not a kind
of Path, so how does List get into it? (Lists below Paths would
not get copied, by definition.)

regards, tom lane

#13

David Rowley

dgrowleyml@gmail.com

over 2 years ago

In reply to: Tom Lane (#12)

Re: unrecognized node type while displaying a Path due to dangling pointer

On Mon, 17 Jul 2023 at 15:31, Tom Lane <tgl@sss.pgh.pa.us> wrote:

I also didn't do anything about ExtensibleNode types. I assume just
copying the ExtensibleNode isn't good enough. To flat copy the actual
node I think would require adding a new function to
ExtensibleNodeMethods.

Yeah, the problem I've got with this approach is that flat-copying
FDW and Custom paths would require extending the respective APIs.
While that's a perfectly reasonable ask if we only need to do this
in HEAD, it would be a nonstarter for released branches. Is it
okay to only fix this issue in HEAD?

CustomPaths, I didn't think about those. That certainly makes it more
complex. I also now see the header comment for struct CustomPath
mentioning that we don't copy Paths:

* Core code must avoid assuming that the CustomPath is only as large as
* the structure declared here; providers are allowed to make it the first
* element in a larger structure. (Since the planner never copies Paths,
* this doesn't add any complication.) However, for consistency with the
* FDW case, we provide a "custom_private" field in CustomPath; providers
* may prefer to use that rather than define another struct type.

Are there any legitimate reasons to look at the input_rel's pathlist
again aside from debugging? I can't think of any. Perhaps back
branches can be fixed by just emptying the path lists and NULLifying
the cheapest paths as you mentioned last week.

I was also unsure what we should do when shallow copying a List.

The proposal is to shallow-copy a Path node. List is not a kind
of Path, so how does List get into it? (Lists below Paths would
not get copied, by definition.)

The patch contained infrastructure to copy any Node type. Not just
Paths. Perhaps that's more than what's needed, but it seemed more
effort to limit it just to Path types than to make it "work" for all
Node types.

David.

#14

Ashutosh Bapat

ashutosh.bapat.oss@gmail.com

over 2 years ago

In reply to: David Rowley (#13)

Re: unrecognized node type while displaying a Path due to dangling pointer

On Tue, Jul 18, 2023 at 5:05 AM David Rowley <dgrowleyml@gmail.com> wrote:

On Mon, 17 Jul 2023 at 15:31, Tom Lane <tgl@sss.pgh.pa.us> wrote:

I also didn't do anything about ExtensibleNode types. I assume just
copying the ExtensibleNode isn't good enough. To flat copy the actual
node I think would require adding a new function to
ExtensibleNodeMethods.

Yeah, the problem I've got with this approach is that flat-copying
FDW and Custom paths would require extending the respective APIs.
While that's a perfectly reasonable ask if we only need to do this
in HEAD, it would be a nonstarter for released branches. Is it
okay to only fix this issue in HEAD?

CustomPaths, I didn't think about those. That certainly makes it more
complex. I also now see the header comment for struct CustomPath
mentioning that we don't copy Paths:

Somewhere upthread Tom suggested using a dummy projection path. Add a
projection path on top of input path and add the projection path to
output rel's list. That will work right?

There's some shallow copying code in reparameterize_path_by_childrel()
but that's very specific to the purpose there and doesn't consider
Custom or Foreign paths.

--
Best Wishes,
Ashutosh Bapat

#15

Ashutosh Bapat

ashutosh.bapat.oss@gmail.com

over 2 years ago

In reply to: Tom Lane (#5)

Re: unrecognized node type while displaying a Path due to dangling pointer

On Tue, Jul 11, 2023 at 4:30 PM Tom Lane <tgl@sss.pgh.pa.us> wrote:

Jeevan Chalke <jeevan.chalke@enterprisedb.com> writes:

Attached patch.

I would be astonished if this fixes anything. The code still doesn't
know which paths are referenced by which other ones, and so the place
where we free a previously-added path can't know what to do.

I've speculated about adding some form of reference counting to paths
(maybe just a "pin" flag rather than a full refcount) so that we could
be smarter about this. The existing kluge for "don't free IndexPaths"
could be replaced by setting the pin mark on any IndexPath that we
make a bitmap path from. Up to now it hasn't seemed necessary to
generalize that hack, but maybe it's time. Can you show a concrete
case where we are freeing a still-referenced path?

Set of patches in [1]/messages/by-id/CAExHW5tUcVsBkq9qT=L5vYz4e-cwQNw=KAGJrtSyzOp3F=XacA@mail.gmail.com add infrastructure to reference, link and unlink
paths.The patches are raw and have some TODOs there. But I think that
infrastructure will solve this problem as a side effect. Please take a
look and let me know if this is as per your speculation. It's more
than just pinning though.

The patch set uses references to free memory consumed by paths which
remain unused. The memory consumed is substantial when partitionwise
join is used and there are thousands of partitions.

[1]: /messages/by-id/CAExHW5tUcVsBkq9qT=L5vYz4e-cwQNw=KAGJrtSyzOp3F=XacA@mail.gmail.com

--
Best Wishes,
Ashutosh Bapat