CustomScan in a larger structure (RE: CustomScan support on readfuncs.c)

Started by Kouhei Kaigaiabout 10 years ago20 messages

kaigai@ak.jp.nec.com

about 10 years ago

2 attachment(s)

It is a relevant topic of readfuncs support for custom-scan.

Unlike CustomPath and CustomScanState, we don't allow custom-scan
provider to define own and larger structure that embeds CustomScan
at head of the structure, to support copyObject().
Thus, custom-scan provider needs to have own logic to pack and
unpack private fields, like:
https://github.com/pg-strom/devel/blob/master/src/gpujoin.c#L112

At present, only create_customscan_plan() and _copyCustomScan()
can produce CustomScan node.
The create_customscan_plan() invokes PlanCustomPath callback,
thus, CSP can return a pointer of CustomScan field within
a larger structure. So, we don't adjust this interface.
On the other hands, _copyCustomScan() allocates a new node using
makeNode(CustomScan), thus, here is no space for the larger
structure if CSP wants to use to keep private values without
packing and unpacking.
Only CSP knows exact size of the larger structure, so all we can
do is ask CSP to allocate a new node and copy its private fields.
This patch newly adds NodeCopyCustomScan callback to support
copyObject.

Also, v9.6 will support nodeToString/stringToNode on plan nodes.
So, we need to re-define the role of TextOutCustomScan callback
that is also defined at v9.5.
If CSP extends the CustomScan to save private values on the extra
area, only CSP knows what values and which data types are stored,
thus, all we can do is ask CSP to serialize and deserialize its
private fields without inconsistency.
Even though TextOutCustomScan callback shall be once ripped out
to support readfuncs.c, a pair of TextOut and TextRead callback
will re-define its responsibility again; when CSP defines
a larger structure that embeds CustomScan, a pair of these
callbacks are used to serialize/deserialize the entire custom
defined objects.

The attached patches are for v9.5 and v9.6. The v9.6 patch
assumes the patch for readfuncs support is already applied.
The v9.5 patch assumes the latest master.

Thanks,
--
NEC Business Creation Division / PG-Strom Project
KaiGai Kohei <kaigai@ak.jp.nec.com>

Show quoted text

-----Original Message-----
From: pgsql-hackers-owner@postgresql.org
[mailto:pgsql-hackers-owner@postgresql.org] On Behalf Of Kouhei Kaigai
Sent: Tuesday, November 10, 2015 11:47 AM
To: Robert Haas
Cc: Amit Kapila; Andres Freund; pgsql-hackers
Subject: Re: [HACKERS] CustomScan support on readfuncs.c

On Sat, Nov 7, 2015 at 6:38 AM, Kouhei Kaigai <kaigai@ak.jp.nec.com> wrote:

Are you suggesting to pass the object name on software build time?

Yes. That's what test_shm_mq and worker_spi already do:

sprintf(worker.bgw_library_name, "test_shm_mq");

OK, I ripped out the portion around dfmgr.c.

If so, it may load incorrect libraries when DBA renamed its filename.
At least, PostgreSQL cannot prevent DBA to rename library filenames
even if they try to deploy "pg_strom.so", "pg_strom.20151106.so" and
"pg_strom.20151030_bug.so" on same directory.

I agree. But that's not a new problem that needs to be solved by this
patch. It already exists - see above.

It still may be a potential problem, even though a corner case.
I'll try to implement an internal API to know "what is my name".

The attached main patch (custom-scan-on-readfuncs.v3.patch) once
removes TextOutCustomScan, thus all the serialized tokens become
known to the core backend, and add _readCustomScan() at readfuncs.c.
It deserializes CustomScan nodes from cstring tokens, especially,
reloads the pointer of CustomScanMethods tables using a pair of
library name and symbol name.
Thus, it also requires custom scan providers to keep the methods
table visible from external objects.

Thanks,
--
NEC Business Creation Division / PG-Strom Project
KaiGai Kohei <kaigai@ak.jp.nec.com>

Attachments:

custom-scan-on-copyfuncs-9.6devel.v3.patchapplication/octet-stream; name=custom-scan-on-copyfuncs-9.6devel.v3.patchDownload

 doc/src/sgml/custom-scan.sgml | 55 +++++++++++++++++++++++++++++++++++++------
 src/backend/nodes/copyfuncs.c | 17 ++++++++++++-
 src/backend/nodes/outfuncs.c  |  7 ++++++
 src/backend/nodes/readfuncs.c | 22 ++++++++++++++++-
 src/include/nodes/plannodes.h | 14 ++++++++---
 5 files changed, 103 insertions(+), 12 deletions(-)

diff --git a/doc/src/sgml/custom-scan.sgml b/doc/src/sgml/custom-scan.sgml
index d042adb..820f8d2 100644
--- a/doc/src/sgml/custom-scan.sgml
+++ b/doc/src/sgml/custom-scan.sgml
@@ -185,7 +185,9 @@ typedef struct CustomScan
     this scan is replacing a join, it will have only one member.
     <structfield>methods</> must point to a (usually statically allocated)
     object implementing the required custom scan methods, which are further
-    detailed below.
+    detailed below. Also note that this field shall be serialized using
+    a pair of library name and symbol name, thus, method variable has to
+    be resolvable by linker.
   </para>
 
   <para>
@@ -196,12 +198,12 @@ typedef struct CustomScan
   </para>
 
   <para>
-   Plan trees must be able to be duplicated using <function>copyObject</>,
-   so all the data stored within the <quote>custom</> fields must consist of
-   nodes that that function can handle.  Furthermore, custom scan providers
-   cannot substitute a larger structure that embeds
-   a <structname>CustomScan</> for the structure itself, as would be possible
-   for a <structname>CustomPath</> or <structname>CustomScanState</>.
+   Plan tree has to be copied using <function>copyObject</>, serialized
+   using <function>nodeToString</> and deserialized using
+   <function>stringToNode</>. In case when custom scan provider defines
+   its own structure that embeds <structname>CustomScan</> on the head,
+   for better private field handling, it has to implement the relevant and
+   optional callbacks.
   </para>
 
   <sect2 id="custom-scan-plan-callbacks">
@@ -220,6 +222,45 @@ Node *(*CreateCustomScanState) (CustomScan *cscan);
     the <function>BeginCustomScan</> callback will be invoked to give the
     custom scan provider a chance to do whatever else is needed.
    </para>
+
+   <para>
+<programlisting>
+void (*TextOutCustomScan) (StringInfo str, const CustomScan *cscan);
+</programlisting>
+    Generate extra output when <function>nodeToString</> is invoked towards
+    the custom scan node. This callback is optional, however, custom scan
+    provider has to implement if it defines larger structure that embeds
+    <structname>CustomScan</> on the head, to save own private fields.
+    In this case, all the callback has to dump are private fields because
+    the core backend dumps knows fields in <structname>CustomScan</>.
+   </para>
+
+   <para>
+<programlisting>
+CustomScan *(*TextReadCustomScan) (void);
+</programlisting>
+    Allocate a larger structure that embeds <structname>CustomScan</> then
+    reconstruct its private fields on the extra fields of this structure;
+    according to the output by <function>TextOutCustomScan</> callback.
+    This callback is optional, however, custom scan provider has to
+    implement if it defines its own structure instead of bare
+    <structname>CustomScan</>.
+    This callback shall be invoked under <function>stringToNode</> context,
+    so <function>pg_strtok</> will give the next token.
+   </para>
+
+   <para>
+<programlisting>
+CustomScan *(*NodeCopyCustomScan) (const CustomScan *from);
+</programlisting>
+    Allocate a larger structure that embeds <structname>CustomScan</> then
+    copies its private fields from the original node.
+    This callback is optional, however, custom scan provider has to
+    implement if it defines its own structure instead of bare
+    <structname>CustomScan</>.
+    All this callback has to care about is its private fields. The known
+    fields in <structname>CustomScan</> shall be copied by the core backend.
+   </para>
   </sect2>
  </sect1>
 
diff --git a/src/backend/nodes/copyfuncs.c b/src/backend/nodes/copyfuncs.c
index c176ff9..34517cd 100644
--- a/src/backend/nodes/copyfuncs.c
+++ b/src/backend/nodes/copyfuncs.c
@@ -661,7 +661,22 @@ _copyForeignScan(const ForeignScan *from)
 static CustomScan *
 _copyCustomScan(const CustomScan *from)
 {
-	CustomScan *newnode = makeNode(CustomScan);
+	const CustomScanMethods *methods = from->methods;
+	CustomScan	   *newnode;
+
+	/*
+	 * In case when custom scan provider defines a larger structure that
+	 * deploys CustomScan at the head, only custom scan provider knows
+	 * exact size of the new node to be allocated, so we ask them to
+	 * allocate the new node and copy private fields we don't know.
+	 */
+	if (!methods->NodeCopyCustomScan)
+		newnode = makeNode(CustomScan);
+	else
+	{
+		newnode = methods->NodeCopyCustomScan(from);
+		Assert(IsA(newnode, CustomScan));
+	}
 
 	/*
 	 * copy node superclass fields
diff --git a/src/backend/nodes/outfuncs.c b/src/backend/nodes/outfuncs.c
index af6674c..43fbd81 100644
--- a/src/backend/nodes/outfuncs.c
+++ b/src/backend/nodes/outfuncs.c
@@ -617,6 +617,13 @@ _outCustomScan(StringInfo str, const CustomScan *node)
 	_outToken(str, node->methods->methodsLibraryName);
 	appendStringInfoChar(str, ' ');
 	_outToken(str, node->methods->methodsSymbolName);
+	/*
+	 * Custom scan provider can/must dump out private fields, if it defines
+	 * a larger structure to save its private fields. It also has to be
+	 * reconstructable using relevant TextReadCustomScan callback.
+	 */
+	if (node->methods->TextOutCustomScan)
+		node->methods->TextOutCustomScan(str, node);
 }
 
 static void
diff --git a/src/backend/nodes/readfuncs.c b/src/backend/nodes/readfuncs.c
index d143dac..52faf9e 100644
--- a/src/backend/nodes/readfuncs.c
+++ b/src/backend/nodes/readfuncs.c
@@ -1812,11 +1812,14 @@ _readForeignScan(void)
 static CustomScan *
 _readCustomScan(void)
 {
-	READ_LOCALS(CustomScan);
+	READ_TEMP_LOCALS();
+	CustomScan	local_temp;
+	CustomScan *local_node = &local_temp;
 	char	   *library_name;
 	char	   *symbol_name;
 	const CustomScanMethods *methods;
 
+	NodeSetTag(local_node, T_CustomScan);
 	ReadCommonScan(&local_node->scan);
 
 	READ_UINT_FIELD(flags);
@@ -1839,6 +1842,23 @@ _readCustomScan(void)
 		load_external_function(library_name, symbol_name, true, NULL);
 	Assert(strcmp(methods->methodsLibraryName, library_name) == 0 &&
 		   strcmp(methods->methodsSymbolName, symbol_name) == 0);
+
+	/*
+	 * Then, read private fields if any. If custom scan provider defines
+	 * a larger structure to save its private data, it should be dumpped
+	 * out by TextOutCustomScan callback and also should be reconstructable
+	 * by TextReadCustomScan. In this case, only custom scan provider knows
+	 * exact size of the structure, so TextReadCustomScan callback has to
+	 * allocate the result node not only reading private data fields.
+	 */
+	if (!methods->TextReadCustomScan)
+		local_node = makeNode(CustomScan);
+	else
+	{
+		local_node = methods->TextReadCustomScan();
+		Assert(IsA(local_node, CustomScan));
+	}
+	memcpy(local_node, &local_temp, offsetof(CustomScan, methods));
 	local_node->methods = methods;
 
 	READ_DONE();
diff --git a/src/include/nodes/plannodes.h b/src/include/nodes/plannodes.h
index 5ecc2d1..acaf4bd 100644
--- a/src/include/nodes/plannodes.h
+++ b/src/include/nodes/plannodes.h
@@ -542,9 +542,10 @@ typedef struct ForeignScan
  * custom_private, custom_scan_tlist, and custom_relids fields.  The
  * convention of setting scan.scanrelid to zero for joins applies as well.
  *
- * Note that since Plan trees can be copied, custom scan providers *must*
- * fit all plan data they need into those fields; embedding CustomScan in
- * a larger struct will not work.
+ * Note that Plan tree can be copied, written and read with usual node
+ * functions. If custom scan provider defines a larger structure that
+ * embeds CustomScan at the head, it *must* provide relevant callbacks
+ * to support copyObject, nodeToString and stringToNode.
  * ----------------
  */
 struct CustomScan;
@@ -557,6 +558,13 @@ typedef struct CustomScanMethods
 
 	/* Create execution state (CustomScanState) from a CustomScan plan node */
 	Node	   *(*CreateCustomScanState) (struct CustomScan *cscan);
+	/* Optional: output private field if structure is extended */
+	void		(*TextOutCustomScan) (StringInfo str,
+									  const struct CustomScan *cscan);
+	/* Optional: reconstruct the node if structure is extended */
+	struct CustomScan *(*TextReadCustomScan) (void);
+	/* Optional: duplicate CustomScan node if any additional private fields */
+	struct CustomScan *(*NodeCopyCustomScan) (const struct CustomScan *from);
 } CustomScanMethods;
 
 typedef struct CustomScan

custom-scan-on-copyfuncs-9.5beta.v3.patchapplication/octet-stream; name=custom-scan-on-copyfuncs-9.5beta.v3.patchDownload

 doc/src/sgml/custom-scan.sgml | 42 ++++++++++++++++++++++++++++--------------
 src/backend/nodes/copyfuncs.c | 17 ++++++++++++++++-
 src/backend/nodes/outfuncs.c  |  5 +++++
 src/include/nodes/plannodes.h | 13 ++++++++-----
 4 files changed, 57 insertions(+), 20 deletions(-)

diff --git a/doc/src/sgml/custom-scan.sgml b/doc/src/sgml/custom-scan.sgml
index a229326..507586b 100644
--- a/doc/src/sgml/custom-scan.sgml
+++ b/doc/src/sgml/custom-scan.sgml
@@ -183,7 +183,9 @@ typedef struct CustomScan
     this scan is replacing a join, it will have only one member.
     <structfield>methods</> must point to a (usually statically allocated)
     object implementing the required custom scan methods, which are further
-    detailed below.
+    detailed below. Also note that this field shall be serialized using
+    a pair of library name and symbol name, thus, method variable has to
+    be resolvable by linker.
   </para>
 
   <para>
@@ -194,12 +196,11 @@ typedef struct CustomScan
   </para>
 
   <para>
-   Plan trees must be able to be duplicated using <function>copyObject</>,
-   so all the data stored within the <quote>custom</> fields must consist of
-   nodes that that function can handle.  Furthermore, custom scan providers
-   cannot substitute a larger structure that embeds
-   a <structname>CustomScan</> for the structure itself, as would be possible
-   for a <structname>CustomPath</> or <structname>CustomScanState</>.
+   Plan tree has to be copied using <function>copyObject</> and written
+   using <function>nodeToString</>. In case when custom scan provider defines
+   its own structure that embeds <structname>CustomScan</> on the head,
+   for better private field handling, it has to implement the relevant and
+   optional callbacks.
   </para>
 
   <sect2 id="custom-scan-plan-callbacks">
@@ -221,14 +222,27 @@ Node *(*CreateCustomScanState) (CustomScan *cscan);
 
    <para>
 <programlisting>
-void (*TextOutCustomScan) (StringInfo str,
-                           const CustomScan *node);
+void (*TextOutCustomScan) (StringInfo str, const CustomScan *cscan);
 </programlisting>
-    Generate additional output when <function>nodeToString</> is invoked on
-    this custom plan node.  This callback is optional.  Since
-    <function>nodeToString</> will automatically dump all fields in the
-    structure, including the substructure of the <quote>custom</> fields,
-    there is usually not much need for this callback.
+    Generate extra output when <function>nodeToString</> is invoked towards
+    the custom scan node. This callback is optional, however, custom scan
+    provider has to implement if it defines larger structure that embeds
+    <structname>CustomScan</> on the head, to save own private fields.
+    In this case, all the callback has to dump are private fields because
+    the core backend dumps knows fields in <structname>CustomScan</>.
+   </para>
+
+   <para>
+<programlisting>
+CustomScan *(*NodeCopyCustomScan) (const CustomScan *from);
+</programlisting>
+    Allocate a larger structure that embeds <structname>CustomScan</> then
+    copies its private fields from the original node.
+    This callback is optional, however, custom scan provider has to
+    implement if it defines its own structure instead of bare
+    <structname>CustomScan</>.
+    All this callback has to care about is its private fields. The known
+    fields in <structname>CustomScan</> shall be copied by the core backend.
    </para>
   </sect2>
  </sect1>
diff --git a/src/backend/nodes/copyfuncs.c b/src/backend/nodes/copyfuncs.c
index e19fee4..53c7d90 100644
--- a/src/backend/nodes/copyfuncs.c
+++ b/src/backend/nodes/copyfuncs.c
@@ -637,7 +637,22 @@ _copyForeignScan(const ForeignScan *from)
 static CustomScan *
 _copyCustomScan(const CustomScan *from)
 {
-	CustomScan *newnode = makeNode(CustomScan);
+	const CustomScanMethods *methods = from->methods;
+	CustomScan	   *newnode;
+
+	/*
+	 * In case when custom scan provider defines a larger structure that
+	 * deploys CustomScan at the head, only custom scan provider knows
+	 * exact size of the new node to be allocated, so we ask them to
+	 * allocate the new node and copy private fields we don't know.
+	 */
+	if (!methods->NodeCopyCustomScan)
+		newnode = makeNode(CustomScan);
+	else
+	{
+		newnode = methods->NodeCopyCustomScan(from);
+		Assert(IsA(newnode, CustomScan));
+	}
 
 	/*
 	 * copy node superclass fields
diff --git a/src/backend/nodes/outfuncs.c b/src/backend/nodes/outfuncs.c
index b39c772..4877ccf 100644
--- a/src/backend/nodes/outfuncs.c
+++ b/src/backend/nodes/outfuncs.c
@@ -599,6 +599,11 @@ _outCustomScan(StringInfo str, const CustomScan *node)
 	WRITE_BITMAPSET_FIELD(custom_relids);
 	appendStringInfoString(str, " :methods ");
 	_outToken(str, node->methods->CustomName);
+	/*
+	 * Custom scan provider can/must dump out private fields, if it defines
+	 * a larger structure to save its private fields. It also has to be
+	 * reconstructable using relevant TextReadCustomScan callback.
+	 */
 	if (node->methods->TextOutCustomScan)
 		node->methods->TextOutCustomScan(str, node);
 }
diff --git a/src/include/nodes/plannodes.h b/src/include/nodes/plannodes.h
index 7a6a3fe..27ad64b 100644
--- a/src/include/nodes/plannodes.h
+++ b/src/include/nodes/plannodes.h
@@ -539,9 +539,10 @@ typedef struct ForeignScan
  * custom_private, custom_scan_tlist, and custom_relids fields.  The
  * convention of setting scan.scanrelid to zero for joins applies as well.
  *
- * Note that since Plan trees can be copied, custom scan providers *must*
- * fit all plan data they need into those fields; embedding CustomScan in
- * a larger struct will not work.
+ * Note that Plan tree can be copied and written with usual node functions.
+ * If custom scan provider defines a larger structure that embeds CustomScan
+ * at the head, it *must* provide relevant callbacks to support copyObject
+ * and nodeToString.
  * ----------------
  */
 struct CustomScan;
@@ -552,9 +553,11 @@ typedef struct CustomScanMethods
 
 	/* Create execution state (CustomScanState) from a CustomScan plan node */
 	Node	   *(*CreateCustomScanState) (struct CustomScan *cscan);
-	/* Optional: print custom_xxx fields in some special way */
+	/* Optional: output private field if structure is extended */
 	void		(*TextOutCustomScan) (StringInfo str,
-											  const struct CustomScan *node);
+									  const struct CustomScan *cscan);
+	/* Optional: duplicate CustomScan node if any additional private fields */
+	struct CustomScan *(*NodeCopyCustomScan) (const struct CustomScan *from);
 } CustomScanMethods;
 
 typedef struct CustomScan

Robert Haas

robertmhaas@gmail.com

about 10 years ago

In reply to: Kouhei Kaigai (#1)

Re: CustomScan in a larger structure (RE: CustomScan support on readfuncs.c)

On Mon, Nov 9, 2015 at 10:26 PM, Kouhei Kaigai <kaigai@ak.jp.nec.com> wrote:

It is a relevant topic of readfuncs support for custom-scan.

Unlike CustomPath and CustomScanState, we don't allow custom-scan
provider to define own and larger structure that embeds CustomScan
at head of the structure, to support copyObject().
Thus, custom-scan provider needs to have own logic to pack and
unpack private fields, like:
https://github.com/pg-strom/devel/blob/master/src/gpujoin.c#L112

At present, only create_customscan_plan() and _copyCustomScan()
can produce CustomScan node.
The create_customscan_plan() invokes PlanCustomPath callback,
thus, CSP can return a pointer of CustomScan field within
a larger structure. So, we don't adjust this interface.
On the other hands, _copyCustomScan() allocates a new node using
makeNode(CustomScan), thus, here is no space for the larger
structure if CSP wants to use to keep private values without
packing and unpacking.
Only CSP knows exact size of the larger structure, so all we can
do is ask CSP to allocate a new node and copy its private fields.
This patch newly adds NodeCopyCustomScan callback to support
copyObject.

Also, v9.6 will support nodeToString/stringToNode on plan nodes.
So, we need to re-define the role of TextOutCustomScan callback
that is also defined at v9.5.
If CSP extends the CustomScan to save private values on the extra
area, only CSP knows what values and which data types are stored,
thus, all we can do is ask CSP to serialize and deserialize its
private fields without inconsistency.
Even though TextOutCustomScan callback shall be once ripped out
to support readfuncs.c, a pair of TextOut and TextRead callback
will re-define its responsibility again; when CSP defines
a larger structure that embeds CustomScan, a pair of these
callbacks are used to serialize/deserialize the entire custom
defined objects.

I don't see this as being a particularly good idea. The same issue
exists for FDWs, and we're just living with it in that case. There
was talk previously of improving it, and maybe some day somebody will,
but I can't get excited about it. Writing serialization and
deserialization code is annoying but completely insufficient, in my
mind, to justify the hassle of creating a system for this that will be
used by very, very little code.

If we do want to improve it, I'm not sure this is the way to go,
either. I think there could be other designs where we focus on making
the serialization and deserialization options better, rather than
letting people tack stuff onto the struct.

--
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company

--
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers

Andres Freund

andres@anarazel.de

about 10 years ago

In reply to: Robert Haas (#2)

Re: CustomScan in a larger structure (RE: CustomScan support on readfuncs.c)

On 2015-11-11 14:59:33 -0500, Robert Haas wrote:

I don't see this as being a particularly good idea. The same issue
exists for FDWs, and we're just living with it in that case.

It's absolutely horrible there. I don't see why that's a justification
for much. To deal with the lack of extensible copy/out/readfuncs I've
just had to copy the entirety of readfuncs.c into an extension. Or you
build replacements for those (as e.g. postgres_fdw essentially has
done).

If we do want to improve it, I'm not sure this is the way to go,
either. I think there could be other designs where we focus on making
the serialization and deserialization options better, rather than
letting people tack stuff onto the struct.

Just better serialization doesn't actually help all that much. Being
able to conveniently access data directly, i.e. as fields in a struct,
makes code rather more readable. And in many cases more
efficient. Having to serialize internal datastructures unconditionally,
just so copyfuncs.c works if actually used, makes for a fair amount of
inefficiency (forced deserialization, even when not copying) and uglier
code.

Andres

--
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers

Robert Haas

robertmhaas@gmail.com

about 10 years ago

In reply to: Andres Freund (#3)

Re: CustomScan in a larger structure (RE: CustomScan support on readfuncs.c)

On Wed, Nov 11, 2015 at 3:29 PM, Andres Freund <andres@anarazel.de> wrote:

On 2015-11-11 14:59:33 -0500, Robert Haas wrote:

I don't see this as being a particularly good idea. The same issue
exists for FDWs, and we're just living with it in that case.

It's absolutely horrible there. I don't see why that's a justification
for much. To deal with the lack of extensible copy/out/readfuncs I've
just had to copy the entirety of readfuncs.c into an extension. Or you
build replacements for those (as e.g. postgres_fdw essentially has
done).

If we do want to improve it, I'm not sure this is the way to go,
either. I think there could be other designs where we focus on making
the serialization and deserialization options better, rather than
letting people tack stuff onto the struct.

Just better serialization doesn't actually help all that much. Being
able to conveniently access data directly, i.e. as fields in a struct,
makes code rather more readable. And in many cases more
efficient. Having to serialize internal datastructures unconditionally,
just so copyfuncs.c works if actually used, makes for a fair amount of
inefficiency (forced deserialization, even when not copying) and uglier
code.

Fair enough, but I'm still not very interested in having something for
CustomScan only.

--
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company

--
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers

Tom Lane

tgl@sss.pgh.pa.us

about 10 years ago

In reply to: Robert Haas (#4)

Re: CustomScan in a larger structure (RE: CustomScan support on readfuncs.c)

Robert Haas <robertmhaas@gmail.com> writes:

On Wed, Nov 11, 2015 at 3:29 PM, Andres Freund <andres@anarazel.de> wrote:

Just better serialization doesn't actually help all that much. Being
able to conveniently access data directly, i.e. as fields in a struct,
makes code rather more readable. And in many cases more
efficient. Having to serialize internal datastructures unconditionally,
just so copyfuncs.c works if actually used, makes for a fair amount of
inefficiency (forced deserialization, even when not copying) and uglier
code.

Fair enough, but I'm still not very interested in having something for
CustomScan only.

I'm with Robert here. The fact is that postgres_fdw and other FDWs are
living just fine with the restrictions we put in to allow copyfuncs.c to
copy ForeignScan, and there's been no sufficient justification offered
as to why CustomScan can't play by those rules.

Yes, it would be nice to be able to use a more-tailored struct definition,
but it's not *necessary*. We should not contort the core system
enormously in order to provide a bit more notational cleanliness to
extensions.

I'd be fine with fixing this if a nice solution were to present itself,
but so far nothing's been offered that wasn't horrid.

BTW, the "inefficiency" argument is junk, unless you provide some hard
evidence of a case where it hurts a lot. If we were forcing people to
use serialized representations at execution time, it would be meaningful,
but we're not; you can convert as needed at executor setup time.

regards, tom lane

--
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers

Kouhei Kaigai

kaigai@ak.jp.nec.com

about 10 years ago

In reply to: Tom Lane (#5)

Re: CustomScan in a larger structure (RE: CustomScan support on readfuncs.c)

On Wed, Nov 11, 2015 at 3:29 PM, Andres Freund <andres@anarazel.de> wrote:

On 2015-11-11 14:59:33 -0500, Robert Haas wrote:

I don't see this as being a particularly good idea. The same issue
exists for FDWs, and we're just living with it in that case.

It's absolutely horrible there. I don't see why that's a justification
for much. To deal with the lack of extensible copy/out/readfuncs I've
just had to copy the entirety of readfuncs.c into an extension. Or you
build replacements for those (as e.g. postgres_fdw essentially has
done).

If we do want to improve it, I'm not sure this is the way to go,
either. I think there could be other designs where we focus on making
the serialization and deserialization options better, rather than
letting people tack stuff onto the struct.

Just better serialization doesn't actually help all that much. Being
able to conveniently access data directly, i.e. as fields in a struct,
makes code rather more readable. And in many cases more
efficient. Having to serialize internal datastructures unconditionally,
just so copyfuncs.c works if actually used, makes for a fair amount of
inefficiency (forced deserialization, even when not copying) and uglier
code.

Fair enough, but I'm still not very interested in having something for
CustomScan only.

I agree with we have no reason why only custom-scan is allowed to have
serialize/deserialize capability. I can implement an equivalent stuff
for foreign-scan also, and it is helpful for extension authors, especially,
who tries to implement remote join feature because FDW driver has to
keep larger number of private fields than scan.

Also, what is the reason why we allow extensions to define a larger
structure which contains CustomPath or CustomScanState? It seems to
me that these types are not (fully) supported by the current copyfuncs,
outfuncs and readfuncs, aren't it?
(Although outfuncs.c supports path-nodes, thus CustomPathMethods has
TextOut callback but no copy/read handler at this moment.)

Thanks,
--
NEC Business Creation Division / PG-Strom Project
KaiGai Kohei <kaigai@ak.jp.nec.com>

--
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers

Import Notes

Resolved by subject fallback

Kouhei Kaigai

kaigai@ak.jp.nec.com

about 10 years ago

In reply to: Tom Lane (#5)

Re: CustomScan in a larger structure (RE: CustomScan support on readfuncs.c)

Robert Haas <robertmhaas@gmail.com> writes:

On Wed, Nov 11, 2015 at 3:29 PM, Andres Freund <andres@anarazel.de> wrote:

Just better serialization doesn't actually help all that much. Being
able to conveniently access data directly, i.e. as fields in a struct,
makes code rather more readable. And in many cases more
efficient. Having to serialize internal datastructures unconditionally,
just so copyfuncs.c works if actually used, makes for a fair amount of
inefficiency (forced deserialization, even when not copying) and uglier
code.

Fair enough, but I'm still not very interested in having something for
CustomScan only.

I'm with Robert here. The fact is that postgres_fdw and other FDWs are
living just fine with the restrictions we put in to allow copyfuncs.c to
copy ForeignScan, and there's been no sufficient justification offered
as to why CustomScan can't play by those rules.

So, I'd like to propose copy/out/read callbacks for both of ForeignScan
and CustomScan.

Yes, it would be nice to be able to use a more-tailored struct definition,
but it's not *necessary*. We should not contort the core system
enormously in order to provide a bit more notational cleanliness to
extensions.

I'd be fine with fixing this if a nice solution were to present itself,
but so far nothing's been offered that wasn't horrid.

BTW, the "inefficiency" argument is junk, unless you provide some hard
evidence of a case where it hurts a lot. If we were forcing people to
use serialized representations at execution time, it would be meaningful,
but we're not; you can convert as needed at executor setup time.

Indeed, it is a "nice to have" feature. We can survive without luxury
goods but more expensive tax.

Once an extension tries to implement own join logic, it shall have
much larger number of private fields than usual scan logic.
Andres got a shock when he looked at GpuJoin code of PG-Strom before
because of its pack/unpack routine at:
https://github.com/pg-strom/devel/blob/master/src/gpujoin.c#L112
The custom_private/fdw_private have to be safe to copyObject(), thus,
we have to pack/unpack private variables to/from a List structure.
When we design serialization/deserialization of Plan nodes, it means
ForeignScan and CustomScan has to pay double cost for this.

I never say it is a "necessary" feature, but "very nice to have".

Thanks,
--
NEC Business Creation Division / PG-Strom Project
KaiGai Kohei <kaigai@ak.jp.nec.com>

--
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers

Kouhei Kaigai

kaigai@ak.jp.nec.com

about 10 years ago

In reply to: Kouhei Kaigai (#7)

1 attachment(s)

Re: CustomScan in a larger structure (RE: CustomScan support on readfuncs.c)

On Wed, Nov 11, 2015 at 3:29 PM, Andres Freund <andres@anarazel.de> wrote:

On 2015-11-11 14:59:33 -0500, Robert Haas wrote:

I don't see this as being a particularly good idea. The same issue
exists for FDWs, and we're just living with it in that case.

It's absolutely horrible there. I don't see why that's a justification
for much. To deal with the lack of extensible copy/out/readfuncs I've
just had to copy the entirety of readfuncs.c into an extension. Or you
build replacements for those (as e.g. postgres_fdw essentially has
done).

If we do want to improve it, I'm not sure this is the way to go,
either. I think there could be other designs where we focus on making
the serialization and deserialization options better, rather than
letting people tack stuff onto the struct.

Just better serialization doesn't actually help all that much. Being
able to conveniently access data directly, i.e. as fields in a struct,
makes code rather more readable. And in many cases more
efficient. Having to serialize internal datastructures unconditionally,
just so copyfuncs.c works if actually used, makes for a fair amount of
inefficiency (forced deserialization, even when not copying) and uglier
code.

Fair enough, but I'm still not very interested in having something for
CustomScan only.

I agree with we have no reason why only custom-scan is allowed to have
serialize/deserialize capability. I can implement an equivalent stuff
for foreign-scan also, and it is helpful for extension authors, especially,
who tries to implement remote join feature because FDW driver has to
keep larger number of private fields than scan.

I tried to make a patch to support serialization/deserialization on both
of ForeignScan and CustomScan, but have not tested yet.

One exceptional stuff in ForeignScan is ForeignPath in outfuncs.c, because
ForeignPath itself does not have identifier to get callback functions (it
is kept in RelOptInfo; that is sufficient in planning phase), thus, we have
no way to write out if ForeignPath is a part of larger structure.
We ought to ignore it at this point. How about your opinion?

Thanks,
--
NEC Business Creation Division / PG-Strom Project
KaiGai Kohei <kaigai@ak.jp.nec.com>

Attachments:

pgsql-v9.6-foreign-custom-serialization.v1.patchapplication/octet-stream; name=pgsql-v9.6-foreign-custom-serialization.v1.patchDownload

 doc/src/sgml/custom-scan.sgml | 70 +++++++++++++++++++++++++++++++++++++++----
 doc/src/sgml/fdwhandler.sgml  | 64 +++++++++++++++++++++++++++++++++++++++
 src/backend/nodes/copyfuncs.c | 33 ++++++++++++++++++--
 src/backend/nodes/outfuncs.c  | 15 ++++++++++
 src/backend/nodes/readfuncs.c | 38 +++++++++++++++++++++--
 src/include/foreign/fdwapi.h  | 10 +++++++
 src/include/nodes/plannodes.h | 15 ++++++++--
 7 files changed, 233 insertions(+), 12 deletions(-)

diff --git a/doc/src/sgml/custom-scan.sgml b/doc/src/sgml/custom-scan.sgml
index 5bba125..03f1d87 100644
--- a/doc/src/sgml/custom-scan.sgml
+++ b/doc/src/sgml/custom-scan.sgml
@@ -197,11 +197,31 @@ typedef struct CustomScan
 
   <para>
    Plan trees must be able to be duplicated using <function>copyObject</>,
-   so all the data stored within the <quote>custom</> fields must consist of
-   nodes that that function can handle.  Furthermore, custom scan providers
-   cannot substitute a larger structure that embeds
-   a <structname>CustomScan</> for the structure itself, as would be possible
-   for a <structname>CustomPath</> or <structname>CustomScanState</>.
+   serialized using <function>nodeToString</> and deserialized using
+   <function>stringToNode</>.
+   <structname>CustomScan</> can be located on head of larger structure
+   to keep private fields of custom scan provider.
+   Thus, if custom scan provider implements <function>TextOutCustomScan</>
+   to dump its private fields its own way, it also has responsibility of
+   <function>TextReadCustomScan</> to reconstruct the private structure.
+  </para>
+  <para>
+   Like any other fields, <structfield>methods</> pointer of the
+   <structname>CustomScanMethods</> also has to be restored, however,
+   nobody can ensure a particular shared library that contains the custom
+   scan provider is loaded exactly same address, for example, when
+   <structname>CustomScan</> node is deserialized on the background worker
+   under the control of <structname>Gather</>.
+
+   Thus, here is a few additional interface contract. The first one is
+   symbol name of the <structname>CustomScanMethods</> table has to be
+   visible to linker; that means the variable should not be declared as
+   static, because <function>stringToNode</> reconstructs the
+   <structfield>methods</> pointer using a pair of library and symbol
+   name. The second one is, <structname>CustomScanMethods</> table also
+   has to be initialized using <function>INIT_CUSTOM_SCAN_METHODS</>
+   macro on <function>_PG_init</>, to assign exact library and symbol
+   name to be referenced.
   </para>
 
   <sect2 id="custom-scan-plan-callbacks">
@@ -220,6 +240,46 @@ Node *(*CreateCustomScanState) (CustomScan *cscan);
     the <function>BeginCustomScan</> callback will be invoked to give the
     custom scan provider a chance to do whatever else is needed.
    </para>
+
+   <para>
+<programlisting>
+void (*TextOutCustomScan) (StringInfo str,
+                           const CustomScan *node);
+</programlisting>
+    Generate additional output when <function>nodeToString</> is invoked on
+    this custom plan node.  This callback is optional.  Since
+    <function>nodeToString</> will automatically dump all fields in the
+    structure, including the substructure of the <quote>custom</> fields,
+    there is usually not much need for this callback.
+   </para>
+
+   <para>
+<programlisting>
+CustomScan *(*TextReadCustomScan)(void);
+</programlisting>
+    Allocate a <structname>CustomScan</> or larger structure embedding
+    <structname>CustomScan</>, and read its private fields generated by
+    <function>TextOutCustomScan</> callback.
+    This callback is optional, however, must be implemented if custom
+    scan provider makes additional output for support of plan-tree
+    serialization and deserialization.
+    This callback shall be invoked under <function>stringToNode</>
+    context, so <function>pg_strtok</> will give the next token.
+   </para>
+
+   <para>
+<programlisting>
+CustomScan *(*NodeCopyCustomScan)(const struct CustomScan *from);
+</programlisting>
+    Allocate a <structname>CustomScan</> or larger structure embedding
+    <structname>CustomScan</>, and copies its private fields from the
+    original node.
+    This callback is optional, however, must be implemented if custom
+    scan provider extends <structname>CustomScan</> structure to save
+    its private fields for safety of <function>copyObject</>.
+    The common fields are duplicated by the backend, so all extension
+    needs to focus on is its private fields, if exists.
+   </para>
   </sect2>
  </sect1>
 
diff --git a/doc/src/sgml/fdwhandler.sgml b/doc/src/sgml/fdwhandler.sgml
index 1533a6b..bbe798c 100644
--- a/doc/src/sgml/fdwhandler.sgml
+++ b/doc/src/sgml/fdwhandler.sgml
@@ -195,6 +195,70 @@ GetForeignPlan (PlannerInfo *root,
     <para>
 <programlisting>
 void
+TextOutForeignScan(StringInfo str, const ForeignScan *node);
+</programlisting>
+     Write out private fields if FDW-driver defines a larger structure that
+     contains a <structname>ForeignScan</> plan node, and wants to save its
+     private data on the extra area rather than <structfield>fdw_private</>.
+    </para>
+    <para>
+     The core backend already wrote out the known fields of
+     <structname>ForeignScan</> on the supplied <structname>StringInfo</>,
+     then, this callback enables to append some extra private information.
+    </para>
+    <para>
+     This callback is optional, however, it is required to define the
+     relevant <function>TextReadForeignScan</> callback also, to match
+     serialization with deserialization.
+    </para>
+
+    <para>
+<programlisting>
+ForeignScan *
+TextReadForeignScan(const ForeignScan *node);
+</programlisting>
+     Allocates a node then read in private fields if FDW-driver defines
+     a larger structure that contains a <structname>ForeignScan</> plan
+     node, and wants to save its private data on the extra area rather
+     than <structfield>fdw_private</>.
+    </para>
+    <para>
+     The core backend already read in the known fields of
+     <structname>ForeignScan</> from input stream; we can pick up tokens
+     using <function>pg_strtok</> because this callback shall be invoked
+     under the <function>stringToNode</> context.
+    </para>
+    <para>
+     This callback is optional, however, it is required to define the
+     relevant <function>TextOutForeignScan</> callback also, to match
+     serialization with deserialization.
+    </para>
+
+    <para>
+<programlisting>
+ForeignScan *
+NodeCopyForeignScan(const ForeignScan *from);
+</programlisting>
+     Allocates a node object then duplicate private fields if FDW-driver
+     defines a larger structure that contains a <structname>ForeignScan</>
+     plan node, and wants to save its private data on the extra area rather
+     than <structfield>fdw_private</>.
+    </para>
+    <para>
+     The core backend will copy the known fields of <structname>ForeignScan</>,
+     thus, all this function needs to do is allocation of the own larger
+     structure and fill up the private fields.
+    </para>
+    <para>
+     This callback is optional, but should be implemented if FDW-driver
+     wants to save its private fields on the extra field in a larger
+     structure, like <function>TextOutForeignScan</> or
+     <function>TextReadForeignScan</>.
+    </para>
+
+    <para>
+<programlisting>
+void
 BeginForeignScan (ForeignScanState *node,
                   int eflags);
 </programlisting>
diff --git a/src/backend/nodes/copyfuncs.c b/src/backend/nodes/copyfuncs.c
index 26264cb..9eb1fd6 100644
--- a/src/backend/nodes/copyfuncs.c
+++ b/src/backend/nodes/copyfuncs.c
@@ -22,6 +22,7 @@
 
 #include "postgres.h"
 
+#include "foreign/fdwapi.h"
 #include "miscadmin.h"
 #include "nodes/plannodes.h"
 #include "nodes/relation.h"
@@ -635,7 +636,21 @@ _copyWorkTableScan(const WorkTableScan *from)
 static ForeignScan *
 _copyForeignScan(const ForeignScan *from)
 {
-	ForeignScan *newnode = makeNode(ForeignScan);
+	FdwRoutine	*fdwroutine = GetFdwRoutineByServerId(from->fs_server);
+	ForeignScan *newnode;
+
+	/*
+	 * If FDW driver extends ForeignScan node to save extra private fields,
+	 * it is role of the FDW driver to allocate a new node that includes
+	 * the private fields.
+	 */
+	if (!fdwroutine->NodeCopyForeignScan)
+		newnode = makeNode(ForeignScan);
+	else
+	{
+		newnode = fdwroutine->NodeCopyForeignScan(from);
+		Assert(IsA(newnode, ForeignScan));
+	}
 
 	/*
 	 * copy node superclass fields
@@ -662,7 +677,21 @@ _copyForeignScan(const ForeignScan *from)
 static CustomScan *
 _copyCustomScan(const CustomScan *from)
 {
-	CustomScan *newnode = makeNode(CustomScan);
+	const CustomScanMethods *methods = from->methods;
+	CustomScan *newnode;
+
+	/*
+	 * If custom-scan provider extends CustomScan node to save its private
+	 * data, it is role of the provider to allocate a new node that includes
+	 * the private fields.
+	 */
+	if (!methods->NodeCopyCustomScan)
+		newnode = makeNode(CustomScan);
+	else
+	{
+		newnode = methods->NodeCopyCustomScan(from);
+		Assert(IsA(newnode, CustomScan));
+	}
 
 	/*
 	 * copy node superclass fields
diff --git a/src/backend/nodes/outfuncs.c b/src/backend/nodes/outfuncs.c
index 012c14b..937dc95 100644
--- a/src/backend/nodes/outfuncs.c
+++ b/src/backend/nodes/outfuncs.c
@@ -25,6 +25,7 @@
 
 #include <ctype.h>
 
+#include "foreign/fdwapi.h"
 #include "lib/stringinfo.h"
 #include "nodes/plannodes.h"
 #include "nodes/relation.h"
@@ -587,6 +588,8 @@ _outWorkTableScan(StringInfo str, const WorkTableScan *node)
 static void
 _outForeignScan(StringInfo str, const ForeignScan *node)
 {
+	FdwRoutine *routines = GetFdwRoutineByServerId(node->fs_server);
+
 	WRITE_NODE_TYPE("FOREIGNSCAN");
 
 	_outScanInfo(str, (const Scan *) node);
@@ -598,6 +601,14 @@ _outForeignScan(StringInfo str, const ForeignScan *node)
 	WRITE_NODE_FIELD(fdw_recheck_quals);
 	WRITE_BITMAPSET_FIELD(fs_relids);
 	WRITE_BOOL_FIELD(fsSystemCol);
+
+	/*
+	 * Optional: it allows FDW driver to dump extra private field, if supplied 
+	 * ForeignScan node is located within a larger structure that contains
+	 * various private fields of individual FDW.
+	 */
+	if (routines->TextOutForeignScan)
+		routines->TextOutForeignScan(str, node);
 }
 
 static void
@@ -618,6 +629,10 @@ _outCustomScan(StringInfo str, const CustomScan *node)
 	_outToken(str, node->methods->LibraryName);
 	appendStringInfoChar(str, ' ');
 	_outToken(str, node->methods->SymbolName);
+
+	/* Also, private fields if any */
+	if (node->methods->TextOutCustomScan)
+		node->methods->TextOutCustomScan(str, node);
 }
 
 static void
diff --git a/src/backend/nodes/readfuncs.c b/src/backend/nodes/readfuncs.c
index 222e2ed..f9a19d8 100644
--- a/src/backend/nodes/readfuncs.c
+++ b/src/backend/nodes/readfuncs.c
@@ -29,6 +29,7 @@
 #include <math.h>
 
 #include "fmgr.h"
+#include "foreign/fdwapi.h"
 #include "nodes/parsenodes.h"
 #include "nodes/plannodes.h"
 #include "nodes/readfuncs.h"
@@ -1792,8 +1793,12 @@ _readWorkTableScan(void)
 static ForeignScan *
 _readForeignScan(void)
 {
-	READ_LOCALS(ForeignScan);
+	READ_TEMP_LOCALS();
+	ForeignScan		local_temp;
+	ForeignScan	   *local_node;
+	FdwRoutine	   *fdwroutine;
 
+	NodeSetTag(local_node, T_ForeignScan);
 	ReadCommonScan(&local_node->scan);
 
 	READ_OID_FIELD(fs_server);
@@ -1804,6 +1809,17 @@ _readForeignScan(void)
 	READ_BITMAPSET_FIELD(fs_relids);
 	READ_BOOL_FIELD(fsSystemCol);
 
+	fdwroutine = GetFdwRoutineByServerId(local_node->fs_server);
+	if (!fdwroutine->TextReadForeignScan)
+		local_node = makeNode(ForeignScan);
+	else
+	{
+		local_node = fdwroutine->TextReadForeignScan(&local_temp);
+		Assert(IsA(local_node, ForeignScan));
+	}
+	/* move the common fields of ForeignScan */
+	memcpy(local_node, &local_temp, sizeof(ForeignScan));
+
 	READ_DONE();
 }
 
@@ -1813,11 +1829,14 @@ _readForeignScan(void)
 static CustomScan *
 _readCustomScan(void)
 {
-	READ_LOCALS(CustomScan);
+	READ_TEMP_LOCALS();
+	CustomScan	local_temp;
+	CustomScan *local_node = &local_temp;
 	char	   *library_name;
 	char	   *symbol_name;
 	const CustomScanMethods *methods;
 
+	NodeSetTag(local_node, T_CustomScan);
 	ReadCommonScan(&local_node->scan);
 
 	READ_UINT_FIELD(flags);
@@ -1842,6 +1861,21 @@ _readCustomScan(void)
 		   strcmp(methods->SymbolName, symbol_name) == 0);
 	local_node->methods = methods;
 
+	/*
+	 * Then, read private data fields and reconstruct a struct that
+	 * contains CustomScan on its head, if any. Elsewhere, construct
+	 * usual CustomScan node.
+	 */
+	if (!methods->TextReadCustomScan)
+		local_node = makeNode(CustomScan);
+	else
+	{
+		local_node = methods->TextReadCustomScan(&local_temp);
+		Assert(IsA(local_node, CustomScan));
+	}
+	/* move the common field of CustomScan */
+	memcpy(local_node, &local_temp, sizeof(CustomScan));
+
 	READ_DONE();
 }
 
diff --git a/src/include/foreign/fdwapi.h b/src/include/foreign/fdwapi.h
index 69b48b4..224d44c 100644
--- a/src/include/foreign/fdwapi.h
+++ b/src/include/foreign/fdwapi.h
@@ -38,6 +38,13 @@ typedef ForeignScan *(*GetForeignPlan_function) (PlannerInfo *root,
 															 List *tlist,
 														 List *scan_clauses);
 
+typedef void (*TextOutForeignScan_function) (StringInfo str,
+											 const ForeignScan *node);
+
+typedef ForeignScan *(*TextReadForeignScan_function) (const ForeignScan *node);
+
+typedef ForeignScan *(*NodeCopyForeignScan_function) (const ForeignScan *from);
+
 typedef void (*BeginForeignScan_function) (ForeignScanState *node,
 													   int eflags);
 
@@ -136,6 +143,9 @@ typedef struct FdwRoutine
 	GetForeignRelSize_function GetForeignRelSize;
 	GetForeignPaths_function GetForeignPaths;
 	GetForeignPlan_function GetForeignPlan;
+	TextOutForeignScan_function TextOutForeignScan;
+	TextReadForeignScan_function TextReadForeignScan;
+	NodeCopyForeignScan_function NodeCopyForeignScan;
 	BeginForeignScan_function BeginForeignScan;
 	IterateForeignScan_function IterateForeignScan;
 	ReScanForeignScan_function ReScanForeignScan;
diff --git a/src/include/nodes/plannodes.h b/src/include/nodes/plannodes.h
index 37086c6..18639e6 100644
--- a/src/include/nodes/plannodes.h
+++ b/src/include/nodes/plannodes.h
@@ -547,9 +547,10 @@ typedef struct ForeignScan
  * custom_private, custom_scan_tlist, and custom_relids fields.  The
  * convention of setting scan.scanrelid to zero for joins applies as well.
  *
- * Note that since Plan trees can be copied, custom scan providers *must*
- * fit all plan data they need into those fields; embedding CustomScan in
- * a larger struct will not work.
+ * Note that Plan trees can be copied, displayed and read using node
+ * functions, thus, custom scan provider *must* support relevant callbacks
+ * if provider wants to embed CustomScan in a larger struct to save private
+ * fields.
  * ----------------
  */
 struct CustomScan;
@@ -562,6 +563,14 @@ typedef struct CustomScanMethods
 
 	/* Create execution state (CustomScanState) from a CustomScan plan node */
 	Node	   *(*CreateCustomScanState) (struct CustomScan *cscan);
+
+	/* Optional: print private fields in some special way */
+	void		(*TextOutCustomScan) (StringInfo str,
+									  const struct CustomScan *node);
+	/* Optional: read the private fields printed in special way */
+	struct CustomScan *(*TextReadCustomScan)(const struct CustomScan *node);
+	/* Optional: copy the original including private fields */
+	struct CustomScan *(*NodeCopyCustomScan)(const struct CustomScan *from);
 } CustomScanMethods;
 
 typedef struct CustomScan

Import Notes

Resolved by subject fallback

Robert Haas

robertmhaas@gmail.com

about 10 years ago

In reply to: Kouhei Kaigai (#6)

Re: CustomScan in a larger structure (RE: CustomScan support on readfuncs.c)

On Wed, Nov 11, 2015 at 11:13 PM, Kouhei Kaigai <kaigai@ak.jp.nec.com> wrote:

I agree with we have no reason why only custom-scan is allowed to have
serialize/deserialize capability. I can implement an equivalent stuff
for foreign-scan also, and it is helpful for extension authors, especially,
who tries to implement remote join feature because FDW driver has to
keep larger number of private fields than scan.

Also, what is the reason why we allow extensions to define a larger
structure which contains CustomPath or CustomScanState? It seems to
me that these types are not (fully) supported by the current copyfuncs,
outfuncs and readfuncs, aren't it?
(Although outfuncs.c supports path-nodes, thus CustomPathMethods has
TextOut callback but no copy/read handler at this moment.)

I feel like we're sort of plunging blindly down a path of trying to
make certain parts of the system extensible and pluggable without
really having a clear vision of where we want to end up. Somehow I
feel like we ought to start by thinking about a framework for *node*
extensibility; then, within that, we can talk about what that means
for particular types of nodes.

For example, suppose do something like this:

typedef struct
{
NodeTag tag;
char *extnodename;
} ExtensibleNode;

typedef stuct
{
char *extnodename;
char *library_name;
char *function_name;
Size node_size;
ExtensibleNode *(*nodeCopy)(ExtensibleNode *);
bool (*nodeEqual)(ExtensibleNode *, ExtensibleNode *);
void (*nodeOut)(StringInfo str, ExtensibleNode *);
ExtensibleNode (*nodeRead)(void);
} ExtensibleNodeMethods;

extern void RegisterExtensibleNodeMethods(ExtensibleNodeMethods *);
extern ExtensibleNodeMethods *GetExtensibleNodeMethods(char *extnodename);

This provides a generic infrastructure so that we don't have to keep
inventing the same stuff over and over, rather than coming up with one
way to do it for ForeignScans and another way for CustomScan and
another way for CustomScanState.

--
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company

--
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers

#10

Kouhei Kaigai

kaigai@ak.jp.nec.com

about 10 years ago

In reply to: Robert Haas (#9)

Re: CustomScan in a larger structure (RE: CustomScan support on readfuncs.c)

On Wed, Nov 11, 2015 at 11:13 PM, Kouhei Kaigai <kaigai@ak.jp.nec.com> wrote:

I agree with we have no reason why only custom-scan is allowed to have
serialize/deserialize capability. I can implement an equivalent stuff
for foreign-scan also, and it is helpful for extension authors, especially,
who tries to implement remote join feature because FDW driver has to
keep larger number of private fields than scan.

Also, what is the reason why we allow extensions to define a larger
structure which contains CustomPath or CustomScanState? It seems to
me that these types are not (fully) supported by the current copyfuncs,
outfuncs and readfuncs, aren't it?
(Although outfuncs.c supports path-nodes, thus CustomPathMethods has
TextOut callback but no copy/read handler at this moment.)

I feel like we're sort of plunging blindly down a path of trying to
make certain parts of the system extensible and pluggable without
really having a clear vision of where we want to end up. Somehow I
feel like we ought to start by thinking about a framework for *node*
extensibility; then, within that, we can talk about what that means
for particular types of nodes.

For example, suppose do something like this:

typedef struct
{
NodeTag tag;
char *extnodename;
} ExtensibleNode;

typedef stuct
{
char *extnodename;
char *library_name;
char *function_name;
Size node_size;
ExtensibleNode *(*nodeCopy)(ExtensibleNode *);
bool (*nodeEqual)(ExtensibleNode *, ExtensibleNode *);
void (*nodeOut)(StringInfo str, ExtensibleNode *);
ExtensibleNode (*nodeRead)(void);
} ExtensibleNodeMethods;

extern void RegisterExtensibleNodeMethods(ExtensibleNodeMethods *);
extern ExtensibleNodeMethods *GetExtensibleNodeMethods(char *extnodename);

This provides a generic infrastructure so that we don't have to keep
inventing the same stuff over and over, rather than coming up with one
way to do it for ForeignScans and another way for CustomScan and
another way for CustomScanState.

Wow! I love the idea.

I guess we will put a pointer to static ExtensibleNodeMethods structure
on ForeignScan, CustomScan, CustomScanState and etc...

Like:
typedef struct ForeignScan
{
Scan scan;
Oid fs_server; /* OID of foreign server */
List *fdw_exprs; /* expressions that FDW may evaluate */
List *fdw_private; /* private data for FDW */
List *fdw_scan_tlist; /* optional tlist describing scan tuple */
List *fdw_recheck_quals; /* original quals not in scan.plan.qual */
Bitmapset *fs_relids; /* RTIs generated by this scan */
bool fsSystemCol; /* true if any "system column" is needed */
+ const ExtensibleNodeMethods *methods;
} ForeignScan;

We may be able to put ExtensibleNodeMethods on the head of CustomScanMethods
structure in case of CustomScan.

Let me write down the steps for deserialization. Please correct me if it is
not what you are thinking.
First, stringToNode picks up a token that shall have node label, like
"SEQSCAN", "CUSTOMSCAN" and others. Once we can identify the following
tokens belong to CustomScan node, it can make advance the pg_strtok pointer
until "methods" field. The method field is consists of a pair of library_name
and symbol_name, then it tries to load the library and resolve the symbol.
Even if library was not loaded on the worker side, this step allows to ensure
the library gets loaded, thus, PG_init() can RegisterExtensibleNodeMethods.
Next, it looks up the table of extensible nodes by a pair of extnodename,
library_name and symbol_name, to get ExtensibleNodeMethods table in this
process address space.
Once it can get nodeRead() callback, we can continue deserialization of
private fields.

I have two concerns on the above proposition.
1) 'node_size' field implies user defined structure to have fixed length.
Extension may want to use variable length structure. The example below
shows GpuJoinState has multiple innerState structure according to the
number of inner relations joined at once.
https://github.com/pg-strom/devel/blob/master/src/gpujoin.c#L240

2) It may be valuable if 'nodeRead(void)' can take an argument of the
knows/common portion of ForeignScan and etc... It allows extension
to adjust number of private fields according to the known part.
For example, I can expect flexible length private fields according
to the length of custom_subplans list.

How about your thought?

Thanks,
--
NEC Business Creation Division / PG-Strom Project
KaiGai Kohei <kaigai@ak.jp.nec.com>

--
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers

Import Notes

Resolved by subject fallback

#11

Robert Haas

robertmhaas@gmail.com

about 10 years ago

In reply to: Kouhei Kaigai (#10)

Re: CustomScan in a larger structure (RE: CustomScan support on readfuncs.c)

On Mon, Nov 16, 2015 at 9:03 PM, Kouhei Kaigai <kaigai@ak.jp.nec.com> wrote:

I guess we will put a pointer to static ExtensibleNodeMethods structure
on ForeignScan, CustomScan, CustomScanState and etc...

I think that makes it confusing. What I'd prefer to do is have only
the name stored in the node, and then people who need methods can look
up those methods based on the name.

Let me write down the steps for deserialization. Please correct me if it is
not what you are thinking.
First, stringToNode picks up a token that shall have node label, like
"SEQSCAN", "CUSTOMSCAN" and others. Once we can identify the following
tokens belong to CustomScan node, it can make advance the pg_strtok pointer
until "methods" field. The method field is consists of a pair of library_name
and symbol_name, then it tries to load the library and resolve the symbol.

No. It reads the "extnodename" field (or whatever we decide to call
it) and calls GetExtensibleNodeMethods() on that name. That name
returns the appropriate structure. C libraries that get loaded into
the server can register their extensible node types at _PG_init()
time.

Even if library was not loaded on the worker side, this step allows to ensure
the library gets loaded, thus, PG_init() can RegisterExtensibleNodeMethods.

A parallel worker will load the same loadable modules which the master
has got loaded. If you use some other kind of background worker, of
course, you're on your own.

I have two concerns on the above proposition.
1) 'node_size' field implies user defined structure to have fixed length.
Extension may want to use variable length structure. The example below
shows GpuJoinState has multiple innerState structure according to the
number of inner relations joined at once.
https://github.com/pg-strom/devel/blob/master/src/gpujoin.c#L240

OK, we can replace the node_size field with an allocator callback:
Node (*nodeAlloc)(void) or something like that.

2) It may be valuable if 'nodeRead(void)' can take an argument of the
knows/common portion of ForeignScan and etc... It allows extension
to adjust number of private fields according to the known part.
For example, I can expect flexible length private fields according
to the length of custom_subplans list.

I don't understand this bit.

--
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company

--
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers

#12

Kouhei Kaigai

kaigai@ak.jp.nec.com

about 10 years ago

In reply to: Robert Haas (#11)

Re: CustomScan in a larger structure (RE: CustomScan support on readfuncs.c)

On Mon, Nov 16, 2015 at 9:03 PM, Kouhei Kaigai <kaigai@ak.jp.nec.com> wrote:

I guess we will put a pointer to static ExtensibleNodeMethods structure
on ForeignScan, CustomScan, CustomScanState and etc...

I think that makes it confusing. What I'd prefer to do is have only
the name stored in the node, and then people who need methods can look
up those methods based on the name.

OK. It is equivalent.

Let me write down the steps for deserialization. Please correct me if it is
not what you are thinking.
First, stringToNode picks up a token that shall have node label, like
"SEQSCAN", "CUSTOMSCAN" and others. Once we can identify the following
tokens belong to CustomScan node, it can make advance the pg_strtok pointer
until "methods" field. The method field is consists of a pair of library_name
and symbol_name, then it tries to load the library and resolve the symbol.

No. It reads the "extnodename" field (or whatever we decide to call
it) and calls GetExtensibleNodeMethods() on that name. That name
returns the appropriate structure. C libraries that get loaded into
the server can register their extensible node types at _PG_init()
time.

OK, I got.

Even if library was not loaded on the worker side, this step allows to ensure
the library gets loaded, thus, PG_init() can RegisterExtensibleNodeMethods.

A parallel worker will load the same loadable modules which the master
has got loaded. If you use some other kind of background worker, of
course, you're on your own.

Sorry, I oversight it. And SerializeLibraryState() and RestoreLibraryState()
indeed allow background workers even if out of parallel context to restore
the library state easily.

I have two concerns on the above proposition.
1) 'node_size' field implies user defined structure to have fixed length.
Extension may want to use variable length structure. The example below
shows GpuJoinState has multiple innerState structure according to the
number of inner relations joined at once.
https://github.com/pg-strom/devel/blob/master/src/gpujoin.c#L240

OK, we can replace the node_size field with an allocator callback:
Node (*nodeAlloc)(void) or something like that.

2) It may be valuable if 'nodeRead(void)' can take an argument of the
knows/common portion of ForeignScan and etc... It allows extension
to adjust number of private fields according to the known part.
For example, I can expect flexible length private fields according
to the length of custom_subplans list.

I don't understand this bit.

The above two points are for the case if and when extension want to use
variable length fields for its private fields.
So, nodeAlloc() callback is not a perfect answer for the use case because
length of the variable length fields shall be (usually) determined by the
value of another fields (like num_inner_rels, num_gpu_devices, ...) thus
we cannot determine the correct length before read.

Let's assume an custom-scan extension that wants to have:

typedef struct {
CustomScan cscan;
int priv_value_1;
long priv_value_2;
extra_info_t extra_subplan_info[FLEXIBLE_ARRAY_MEMBER];
/* its length equal to number of sub-plans */
} ExampleScan;

The "extnodename" within CustomScan allows to pull callback functions
to handle read node from the serialized format.
However, nodeAlloc() callback cannot determine the correct length
(= number of sub-plans in this example) prior to read 'cscan' part.

So, I'd like to suggest _readCustomScan (and other extendable nodes
also) read tokens on local CustomScan variable once, then call
Node *(nodeRead)(Node *local_node)
to allocate entire ExampleScan node and read other private fields.

The local_node is already filled up by the core backend prior to
the callback invocation, so extension can know how many sub-plans
are expected thus how many private tokens shall appear.
It also means extension can know exact length of the ExampleScan
node, so it can allocate the node as expected then fill up
remaining private tokens.

Thanks,
--
NEC Business Creation Division / PG-Strom Project
KaiGai Kohei <kaigai@ak.jp.nec.com>

--
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers

Import Notes

Resolved by subject fallback

#13

Robert Haas

robertmhaas@gmail.com

about 10 years ago

In reply to: Kouhei Kaigai (#12)

Re: CustomScan in a larger structure (RE: CustomScan support on readfuncs.c)

On Thu, Nov 19, 2015 at 10:11 PM, Kouhei Kaigai <kaigai@ak.jp.nec.com> wrote:

The above two points are for the case if and when extension want to use
variable length fields for its private fields.
So, nodeAlloc() callback is not a perfect answer for the use case because
length of the variable length fields shall be (usually) determined by the
value of another fields (like num_inner_rels, num_gpu_devices, ...) thus
we cannot determine the correct length before read.

Let's assume an custom-scan extension that wants to have:

typedef struct {
CustomScan cscan;
int priv_value_1;
long priv_value_2;
extra_info_t extra_subplan_info[FLEXIBLE_ARRAY_MEMBER];
/* its length equal to number of sub-plans */
} ExampleScan;

The "extnodename" within CustomScan allows to pull callback functions
to handle read node from the serialized format.
However, nodeAlloc() callback cannot determine the correct length
(= number of sub-plans in this example) prior to read 'cscan' part.

So, I'd like to suggest _readCustomScan (and other extendable nodes
also) read tokens on local CustomScan variable once, then call
Node *(nodeRead)(Node *local_node)
to allocate entire ExampleScan node and read other private fields.

The local_node is already filled up by the core backend prior to
the callback invocation, so extension can know how many sub-plans
are expected thus how many private tokens shall appear.
It also means extension can know exact length of the ExampleScan
node, so it can allocate the node as expected then fill up
remaining private tokens.

On second thought, I think we should insist that nodes have to be
fixed-size. This sounds like a mess. If the node needs a variable
amount of storage for something, it can store a pointer to a
separately-allocated array someplace inside of it. That's what
existing nodes do, and it works fine.

--
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company

--
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers

#14

Kouhei Kaigai

kaigai@ak.jp.nec.com

about 10 years ago

In reply to: Robert Haas (#13)

Re: CustomScan in a larger structure (RE: CustomScan support on readfuncs.c)

On Thu, Nov 19, 2015 at 10:11 PM, Kouhei Kaigai <kaigai@ak.jp.nec.com> wrote:

The above two points are for the case if and when extension want to use
variable length fields for its private fields.
So, nodeAlloc() callback is not a perfect answer for the use case because
length of the variable length fields shall be (usually) determined by the
value of another fields (like num_inner_rels, num_gpu_devices, ...) thus
we cannot determine the correct length before read.

Let's assume an custom-scan extension that wants to have:

typedef struct {
CustomScan cscan;
int priv_value_1;
long priv_value_2;
extra_info_t extra_subplan_info[FLEXIBLE_ARRAY_MEMBER];
/* its length equal to number of sub-plans */
} ExampleScan;

The "extnodename" within CustomScan allows to pull callback functions
to handle read node from the serialized format.
However, nodeAlloc() callback cannot determine the correct length
(= number of sub-plans in this example) prior to read 'cscan' part.

So, I'd like to suggest _readCustomScan (and other extendable nodes
also) read tokens on local CustomScan variable once, then call
Node *(nodeRead)(Node *local_node)
to allocate entire ExampleScan node and read other private fields.

The local_node is already filled up by the core backend prior to
the callback invocation, so extension can know how many sub-plans
are expected thus how many private tokens shall appear.
It also means extension can know exact length of the ExampleScan
node, so it can allocate the node as expected then fill up
remaining private tokens.

On second thought, I think we should insist that nodes have to be
fixed-size. This sounds like a mess. If the node needs a variable
amount of storage for something, it can store a pointer to a
separately-allocated array someplace inside of it. That's what
existing nodes do, and it works fine.

OK, let's have "node_size" instead of nodeAlloc callback and other
stuffs.

Please wait for a patch.
--
NEC Business Creation Division / PG-Strom Project
KaiGai Kohei <kaigai@ak.jp.nec.com>

--
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers

Import Notes

Resolved by subject fallback

#15

Kouhei Kaigai

kaigai@ak.jp.nec.com

about 10 years ago

In reply to: Kouhei Kaigai (#14)

Re: CustomScan in a larger structure (RE: CustomScan support on readfuncs.c)

-----Original Message-----
From: pgsql-hackers-owner@postgresql.org
[mailto:pgsql-hackers-owner@postgresql.org] On Behalf Of Kouhei Kaigai
Sent: Saturday, November 21, 2015 8:05 AM
To: Robert Haas
Cc: Andres Freund; Amit Kapila; pgsql-hackers
Subject: Re: CustomScan in a larger structure (RE: [HACKERS] CustomScan support
on readfuncs.c)

On Thu, Nov 19, 2015 at 10:11 PM, Kouhei Kaigai <kaigai@ak.jp.nec.com> wrote:

The above two points are for the case if and when extension want to use
variable length fields for its private fields.
So, nodeAlloc() callback is not a perfect answer for the use case because
length of the variable length fields shall be (usually) determined by the
value of another fields (like num_inner_rels, num_gpu_devices, ...) thus
we cannot determine the correct length before read.

Let's assume an custom-scan extension that wants to have:

typedef struct {
CustomScan cscan;
int priv_value_1;
long priv_value_2;
extra_info_t extra_subplan_info[FLEXIBLE_ARRAY_MEMBER];
/* its length equal to number of sub-plans */
} ExampleScan;

The "extnodename" within CustomScan allows to pull callback functions
to handle read node from the serialized format.
However, nodeAlloc() callback cannot determine the correct length
(= number of sub-plans in this example) prior to read 'cscan' part.

So, I'd like to suggest _readCustomScan (and other extendable nodes
also) read tokens on local CustomScan variable once, then call
Node *(nodeRead)(Node *local_node)
to allocate entire ExampleScan node and read other private fields.

The local_node is already filled up by the core backend prior to
the callback invocation, so extension can know how many sub-plans
are expected thus how many private tokens shall appear.
It also means extension can know exact length of the ExampleScan
node, so it can allocate the node as expected then fill up
remaining private tokens.

On second thought, I think we should insist that nodes have to be
fixed-size. This sounds like a mess. If the node needs a variable
amount of storage for something, it can store a pointer to a
separately-allocated array someplace inside of it. That's what
existing nodes do, and it works fine.

OK, let's have "node_size" instead of nodeAlloc callback and other
stuffs.

Please wait for a patch.

I'm now implementing. The above design perfectly works on ForeignScan.
On the other hands, I'd like to have deeper consideration for CustomScan.

My recent patch adds LibraryName and SymbolName on CustomScanMethods
to lookup the method table even if library is not loaded yet.
However, this ExtensibleNodeMethods relies custom scan provider shall
be loaded, by parallel infrastructure, prior to the deserialization.
It means extension has a chance to register itself as well.

My idea is, redefine CustomScanMethod as follows:

typedef struct ExtensibleNodeMethods
{
const char *extnodename;
Size node_size;
Node *(*nodeCopy)(const Node *from);
bool (*nodeEqual)(const Node *a, const Node *b);
void (*nodeOut)(struct StringInfoData *str, const Node *node);
void (*nodeRead)(Node *node);
} ExtensibleNodeMethods;

typedef struct CustomScanMethods
{
union {
const char *CustomName;
ExtensibleNodeMethods xnode;
};
/* Create execution state (CustomScanState) from a CustomScan plan node */
Node *(*CreateCustomScanState) (struct CustomScan *cscan);
} CustomScanMethods;

It allows to pull CustomScanMethods using GetExtensibleNodeMethods
by CustomName; that is alias of extnodename in ExtensibleNodeMethods.
Thus, we don't need to care about LibraryName and SymbolName.

This kind of trick is not needed for ForeignScan because all the pointer
stuff are reproducible by catalog, however, method table of CustomScan
is not.

How about your opinion?

Thanks,
--
NEC Business Creation Division / PG-Strom Project
KaiGai Kohei <kaigai@ak.jp.nec.com>

--
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers

#16

Robert Haas

robertmhaas@gmail.com

about 10 years ago

In reply to: Kouhei Kaigai (#15)

Re: CustomScan in a larger structure (RE: CustomScan support on readfuncs.c)

On Thu, Nov 26, 2015 at 5:27 AM, Kouhei Kaigai <kaigai@ak.jp.nec.com> wrote:

I'm now implementing. The above design perfectly works on ForeignScan.
On the other hands, I'd like to have deeper consideration for CustomScan.

My recent patch adds LibraryName and SymbolName on CustomScanMethods
to lookup the method table even if library is not loaded yet.
However, this ExtensibleNodeMethods relies custom scan provider shall
be loaded, by parallel infrastructure, prior to the deserialization.
It means extension has a chance to register itself as well.

My idea is, redefine CustomScanMethod as follows:

typedef struct ExtensibleNodeMethods
{
const char *extnodename;
Size node_size;
Node *(*nodeCopy)(const Node *from);
bool (*nodeEqual)(const Node *a, const Node *b);
void (*nodeOut)(struct StringInfoData *str, const Node *node);
void (*nodeRead)(Node *node);
} ExtensibleNodeMethods;

typedef struct CustomScanMethods
{
union {
const char *CustomName;
ExtensibleNodeMethods xnode;
};
/* Create execution state (CustomScanState) from a CustomScan plan node */
Node *(*CreateCustomScanState) (struct CustomScan *cscan);
} CustomScanMethods;

It allows to pull CustomScanMethods using GetExtensibleNodeMethods
by CustomName; that is alias of extnodename in ExtensibleNodeMethods.
Thus, we don't need to care about LibraryName and SymbolName.

This kind of trick is not needed for ForeignScan because all the pointer
stuff are reproducible by catalog, however, method table of CustomScan
is not.

How about your opinion?

Anonymous unions are not C89, so this is a no-go.

I think we should just drop CustomName and replace it with
ExtensibleNodeMethods. That will break things for third-party code
that is using the existing CustomScan stuff, but there's a good chance
that nobody other than you has written any such code. And even if
someone has, updating it for this change will not be very difficult.

--
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company

--
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers

#17

Kouhei Kaigai

kaigai@ak.jp.nec.com

about 10 years ago

In reply to: Robert Haas (#16)

2 attachment(s)

Re: CustomScan in a larger structure (RE: CustomScan support on readfuncs.c)

On Thu, Nov 26, 2015 at 5:27 AM, Kouhei Kaigai <kaigai@ak.jp.nec.com> wrote:

I'm now implementing. The above design perfectly works on ForeignScan.
On the other hands, I'd like to have deeper consideration for CustomScan.

My recent patch adds LibraryName and SymbolName on CustomScanMethods
to lookup the method table even if library is not loaded yet.
However, this ExtensibleNodeMethods relies custom scan provider shall
be loaded, by parallel infrastructure, prior to the deserialization.
It means extension has a chance to register itself as well.

My idea is, redefine CustomScanMethod as follows:

typedef struct ExtensibleNodeMethods
{
const char *extnodename;
Size node_size;
Node *(*nodeCopy)(const Node *from);
bool (*nodeEqual)(const Node *a, const Node *b);
void (*nodeOut)(struct StringInfoData *str, const Node *node);
void (*nodeRead)(Node *node);
} ExtensibleNodeMethods;

typedef struct CustomScanMethods
{
union {
const char *CustomName;
ExtensibleNodeMethods xnode;
};
/* Create execution state (CustomScanState) from a CustomScan plan node

*/

Node *(*CreateCustomScanState) (struct CustomScan *cscan);
} CustomScanMethods;

It allows to pull CustomScanMethods using GetExtensibleNodeMethods
by CustomName; that is alias of extnodename in ExtensibleNodeMethods.
Thus, we don't need to care about LibraryName and SymbolName.

This kind of trick is not needed for ForeignScan because all the pointer
stuff are reproducible by catalog, however, method table of CustomScan
is not.

How about your opinion?

Anonymous unions are not C89, so this is a no-go.

I think we should just drop CustomName and replace it with
ExtensibleNodeMethods. That will break things for third-party code
that is using the existing CustomScan stuff, but there's a good chance
that nobody other than you has written any such code. And even if
someone has, updating it for this change will not be very difficult.

Thanks, I have same opinion.
At this moment, CustomName is not utilized so much except for EXPLAIN
and debug output, so it is not a hard stuff to replace this field by
extnodename, even if custom scan provider does not have own structure
thus no callbacks of node operations are defined.

The attached patch adds 'extnodename' fields on ForeignPath and
ForeignScan, also embeds ExtensibleNodeMethods in CustomPathMethods,
CustomScanMethods and CustomExecMethods.

Its handlers in copyfuncs.c were enhanced to utilize the callback
to allocate a larger structure and copy private fields.
Handlers in outfuncs.c and readfuncs.c have to be symmetric. The
core backend writes out 'extnodename' prior to any private fields,
then we can identify the callback to process rest of tokens for
private fields.

RegisterExtensibleNodeMethods() tracks a pair of 'extnodename'
and ExtensibleNodeMethods itself. It saves the pointer of the
method table, but not duplicate, because _readCustomScan()
cast the method pulled by 'extnodename' thus registered table
has to be a static variable on extension; that means extension
never update ExtensibleNodeMethods once registered.

The one other patch is my test using PG-Strom, however, I didn't
have a test on FDW extensions yet.

Thanks,
--
NEC Business Creation Division / PG-Strom Project
KaiGai Kohei <kaigai@ak.jp.nec.com>

Attachments:

pgstrom-custom-private.test.patchapplication/octet-stream; name=pgstrom-custom-private.test.patchDownload

 Makefile           |   2 +-
 src/cuda_control.c |   2 +-
 src/datastore.c    |   2 +-
 src/gpuscan.c      | 172 +++++++++++++++++++++++++++++++++++++++++++++++------
 src/main.c         |  10 ++--
 5 files changed, 163 insertions(+), 25 deletions(-)

diff --git a/Makefile b/Makefile
index 90dc685..a41322d 100644
--- a/Makefile
+++ b/Makefile
@@ -18,7 +18,7 @@ PG_VERSION_NUM=$(shell $(PG_CONFIG) --version | awk '{print $$NF}'	\
 # Source file of CPU portion
 STROM_OBJS = main.o codegen.o datastore.o aggfuncs.o \
 		cuda_control.o cuda_program.o cuda_mmgr.o \
-		gpuscan.o gpujoin.o gpupreagg.o gpusort.o
+		gpuscan.o #gpujoin.o gpupreagg.o gpusort.o
 
 # Source file of GPU portion
 CUDA_OBJS = cuda_common.o \
diff --git a/src/cuda_control.c b/src/cuda_control.c
index 6b5f74c..a012a89 100644
--- a/src/cuda_control.c
+++ b/src/cuda_control.c
@@ -1599,7 +1599,7 @@ pgstrom_fetch_gputask(GpuTaskState *gts)
 				{
 					gts->scan_done = true;
 					elog(DEBUG1, "scan done (%s)",
-						 gts->css.methods->CustomName);
+						 gts->css.methods->xnode.extnodename);
 					break;
 				}
 				dlist_push_tail(&gts->pending_tasks, &gtask->chain);
diff --git a/src/datastore.c b/src/datastore.c
index 84147cf..d22d846 100644
--- a/src/datastore.c
+++ b/src/datastore.c
@@ -286,7 +286,7 @@ pgstrom_get_bulkload_density(Plan *child_plan)
 	 * bulk-output. So, we need to walk down if child node has bulk-
 	 * input.
 	 */
-	while (pgstrom_plan_is_gpujoin_bulkinput(child_plan))
+	while (false) //pgstrom_plan_is_gpujoin_bulkinput(child_plan))
 	{
 		Plan	   *curr_plan = child_plan;
 
diff --git a/src/gpuscan.c b/src/gpuscan.c
index 0ad6315..318b234 100644
--- a/src/gpuscan.c
+++ b/src/gpuscan.c
@@ -19,6 +19,7 @@
 #include "access/xact.h"
 #include "catalog/pg_namespace.h"
 #include "miscadmin.h"
+#include "nodes/readfuncs.h"
 #include "optimizer/cost.h"
 #include "optimizer/pathnode.h"
 #include "optimizer/paths.h"
@@ -99,6 +100,15 @@ deform_gpuscan_info(Plan *plan)
 	return result;
 }
 
+typedef struct {
+	CustomScan	cscan;
+	const char *kern_source;
+	int32		extra_flags;
+	List	   *used_params;
+	List	   *used_vars;
+	List	   *dev_quals;
+} GpuScan;
+
 typedef struct
 {
 	GpuTask			task;
@@ -502,7 +512,8 @@ create_gpuscan_plan(PlannerInfo *root,
 					List *clauses,
 					List *custom_children)
 {
-	CustomScan	   *cscan;
+//	CustomScan	   *cscan;
+	GpuScan		   *gscan;
 	GpuScanInfo		gs_info;
 	List		   *host_quals = NIL;
 	List		   *dev_quals = NIL;
@@ -515,6 +526,8 @@ create_gpuscan_plan(PlannerInfo *root,
 	Assert(rel->rtekind == RTE_RELATION);
 	Assert(custom_children == NIL);
 
+	elog(INFO, "best_path => %s", nodeToString(best_path));
+
 	/*
 	 * Distribution of clauses into device executable and others.
 	 *
@@ -543,23 +556,28 @@ create_gpuscan_plan(PlannerInfo *root,
 	/*
 	 * Construction of GpuScanPlan node; on top of CustomPlan node
 	 */
-	cscan = makeNode(CustomScan);
-	cscan->scan.plan.targetlist = tlist;
-	cscan->scan.plan.qual = host_quals;
-	cscan->scan.plan.lefttree = NULL;
-	cscan->scan.plan.righttree = NULL;
-	cscan->scan.scanrelid = rel->relid;
+	gscan = (GpuScan *) newNode(sizeof(GpuScan), T_CustomScan);
+	gscan->cscan.scan.plan.targetlist = tlist;
+	gscan->cscan.scan.plan.qual = host_quals;
+	gscan->cscan.scan.plan.lefttree = NULL;
+	gscan->cscan.scan.plan.righttree = NULL;
+	gscan->cscan.scan.scanrelid = rel->relid;
 
 	gs_info.kern_source = kern_source;
 	gs_info.extra_flags = context.extra_flags | DEVKERNEL_NEEDS_GPUSCAN;
 	gs_info.used_params = context.used_params;
 	gs_info.used_vars = context.used_vars;
 	gs_info.dev_quals = dev_quals;
-	form_gpuscan_info(cscan, &gs_info);
-	cscan->flags = best_path->flags;
-	cscan->methods = &gpuscan_plan_methods;
-
-	return &cscan->scan.plan;
+	form_gpuscan_info(&gscan->cscan, &gs_info);
+	gscan->cscan.flags = best_path->flags;
+	gscan->cscan.methods = &gpuscan_plan_methods;
+	gscan->kern_source = gs_info.kern_source;
+	gscan->extra_flags = gs_info.extra_flags;
+	gscan->used_params = gs_info.used_params;
+	gscan->used_vars = gs_info.used_vars;
+	gscan->dev_quals = gs_info.dev_quals;
+
+	return &gscan->cscan.scan.plan;
 }
 
 /*
@@ -616,6 +634,100 @@ pgstrom_gpuscan_setup_bulkslot(PlanState *outer_planstate,
 	*p_bulk_slot = css->ss.ss_ScanTupleSlot;
 }
 
+static void
+gpuscan_node_copy(Node *_newnode, const Node *_oldnode)
+{
+	GpuScan		   *newnode = (GpuScan *) _newnode;
+	const GpuScan  *oldnode = (const GpuScan *) _oldnode;
+
+	newnode->kern_source = (oldnode->kern_source != NULL
+							? pstrdup(oldnode->kern_source)
+							: NULL);
+	newnode->extra_flags = oldnode->extra_flags;
+	newnode->used_params = copyObject(oldnode->used_params);
+	newnode->used_vars   = copyObject(oldnode->used_vars);
+	newnode->dev_quals   = copyObject(oldnode->dev_quals);
+}
+
+static void
+_outToken(StringInfo str, const char *s)
+{
+	if (s == NULL || *s == '\0')
+	{
+		appendStringInfoString(str, "<>");
+		return;
+	}
+
+	/*
+	 * Look for characters or patterns that are treated specially by read.c
+	 * (either in pg_strtok() or in nodeRead()), and therefore need a
+	 * protective backslash.
+	 */
+	/* These characters only need to be quoted at the start of the string */
+	if (*s == '<' ||
+		*s == '\"' ||
+		isdigit((unsigned char) *s) ||
+		((*s == '+' || *s == '-') &&
+		 (isdigit((unsigned char) s[1]) || s[1] == '.')))
+		appendStringInfoChar(str, '\\');
+	while (*s)
+	{
+		/* These chars must be backslashed anywhere in the string */
+		if (*s == ' ' || *s == '\n' || *s == '\t' ||
+			*s == '(' || *s == ')' || *s == '{' || *s == '}' ||
+			*s == '\\')
+			appendStringInfoChar(str, '\\');
+		appendStringInfoChar(str, *s++);
+	}
+}
+
+static void
+gpuscan_node_out(StringInfo str, const Node *node)
+{
+	const GpuScan  *gscan = (const GpuScan *) node;
+
+	appendStringInfo(str, " :kern_source ");
+	_outToken(str, gscan->kern_source);
+
+	appendStringInfo(str, " :extra_flags %u", gscan->extra_flags);
+
+	appendStringInfo(str, " :used_params %s",
+					 nodeToString(gscan->used_params));
+	appendStringInfo(str, " :used_vars %s",
+					 nodeToString(gscan->used_vars));
+	appendStringInfo(str, " :dev_quals %s",
+					 nodeToString(gscan->dev_quals));
+}
+
+static void
+gpuscan_node_read(Node *node)
+{
+	GpuScan	   *gscan = (GpuScan *) node;
+	char	   *token;
+	int			length;
+
+	/* :kern_source */
+	token = pg_strtok(&length);
+	token = pg_strtok(&length);
+	gscan->kern_source = (length == 0 ? NULL : debackslash(token, length));
+	/* :extra_flags */
+	token = pg_strtok(&length);
+	token = pg_strtok(&length);
+	gscan->extra_flags = (unsigned int ) strtoul(token, NULL, 10);
+	/* :used_params */
+	token = pg_strtok(&length);
+	token = pg_strtok(&length);
+	gscan->used_params = nodeRead(token, length);
+	/* :used_vars */
+	token = pg_strtok(&length);
+	token = pg_strtok(&length);
+	gscan->used_vars = nodeRead(token, length);
+	/* :dev_quals */
+	token = pg_strtok(&length);
+	token = pg_strtok(&length);
+	gscan->dev_quals = nodeRead(token, length);
+}
+
 /*
  * gpuscan_create_scan_state
  *
@@ -646,6 +758,15 @@ gpuscan_begin(CustomScanState *node, EState *estate, int eflags)
 	GpuContext	   *gcontext = NULL;
 	GpuScanState   *gss = (GpuScanState *) node;
 	GpuScanInfo	   *gs_info = deform_gpuscan_info(node->ss.ps.plan);
+	char		   *test_str;
+	Node		   *test_node;
+
+	/* test for serialization/deserialization */
+	elog(INFO, "test-1: %s", nodeToString(node->ss.ps.plan));
+	test_str = nodeToString(copyObject(node->ss.ps.plan));
+	elog(INFO, "test-2: %s", test_str);
+	test_node = stringToNode(test_str);
+	elog(INFO, "test-3: %s", nodeToString(test_node));
 
 	/* gpuscan should not have inner/outer plan right now */
 	Assert(outerPlan(node) == NULL);
@@ -962,35 +1083,52 @@ pgstrom_init_gpuscan(void)
 
 	/* setup path methods */
 	memset(&gpuscan_path_methods, 0, sizeof(gpuscan_path_methods));
-	gpuscan_path_methods.CustomName			= "GpuScan";
+	gpuscan_path_methods.xnode.extnodename  = "GpuScanPath";
+	gpuscan_path_methods.xnode.node_size    = sizeof(GpuScanPath);
 	gpuscan_path_methods.PlanCustomPath		= create_gpuscan_plan;
+	RegisterExtensibleNodeMethods(&gpuscan_path_methods.xnode);
 
 	/* setup plan methods */
 	memset(&gpuscan_plan_methods, 0, sizeof(gpuscan_plan_methods));
-	gpuscan_plan_methods.CustomName			= "GpuScan";
+	gpuscan_plan_methods.xnode.extnodename  = "GpuScan";
+	gpuscan_plan_methods.xnode.node_size    = sizeof(GpuScan);
+	gpuscan_plan_methods.xnode.nodeCopy     = gpuscan_node_copy;
+	gpuscan_plan_methods.xnode.nodeOut      = gpuscan_node_out;
+	gpuscan_plan_methods.xnode.nodeRead     = gpuscan_node_read;
 	gpuscan_plan_methods.CreateCustomScanState = gpuscan_create_scan_state;
+	RegisterExtensibleNodeMethods(&gpuscan_plan_methods.xnode);
 
 	memset(&bulkscan_plan_methods, 0, sizeof(bulkscan_plan_methods));
-	bulkscan_plan_methods.CustomName		= "BulkScan";
+	bulkscan_plan_methods.xnode.extnodename = "BulkScan";
+	bulkscan_plan_methods.xnode.node_size   = sizeof(GpuScan);
+	bulkscan_plan_methods.xnode.nodeCopy    = gpuscan_node_copy;
+	bulkscan_plan_methods.xnode.nodeOut     = gpuscan_node_out;
+	bulkscan_plan_methods.xnode.nodeRead    = gpuscan_node_read;
 	bulkscan_plan_methods.CreateCustomScanState = gpuscan_create_scan_state;
+	RegisterExtensibleNodeMethods(&bulkscan_plan_methods.xnode);
 
 	/* setup exec methods */
 	memset(&gpuscan_exec_methods, 0, sizeof(gpuscan_exec_methods));
-	gpuscan_exec_methods.c.CustomName         = "GpuScan";
+	gpuscan_exec_methods.c.xnode.extnodename  = "GpuScanState";
+	gpuscan_exec_methods.c.xnode.node_size    = sizeof(GpuScanState);
 	gpuscan_exec_methods.c.BeginCustomScan    = gpuscan_begin;
 	gpuscan_exec_methods.c.ExecCustomScan     = gpuscan_exec;
 	gpuscan_exec_methods.c.EndCustomScan      = gpuscan_end;
 	gpuscan_exec_methods.c.ReScanCustomScan   = gpuscan_rescan;
 	gpuscan_exec_methods.c.ExplainCustomScan  = gpuscan_explain;
 	gpuscan_exec_methods.ExecCustomBulk       = gpuscan_exec_bulk;
+	RegisterExtensibleNodeMethods(&gpuscan_exec_methods.c.xnode);
 
-	bulkscan_exec_methods.c.CustomName        = "BulkScan";
+	memset(&bulkscan_exec_methods, 0, sizeof(bulkscan_exec_methods));
+	bulkscan_exec_methods.c.xnode.extnodename = "BulkScanState";
+	bulkscan_exec_methods.c.xnode.node_size   = sizeof(GpuScanState);
 	bulkscan_exec_methods.c.BeginCustomScan   = gpuscan_begin;
 	bulkscan_exec_methods.c.ExecCustomScan    = gpuscan_exec;
 	bulkscan_exec_methods.c.EndCustomScan     = gpuscan_end;
 	bulkscan_exec_methods.c.ReScanCustomScan  = gpuscan_rescan;
 	bulkscan_exec_methods.c.ExplainCustomScan = gpuscan_explain;
 	bulkscan_exec_methods.ExecCustomBulk      = gpuscan_exec_bulk;
+	RegisterExtensibleNodeMethods(&bulkscan_exec_methods.c.xnode);
 
 	/* hook registration */
 	set_rel_pathlist_next = set_rel_pathlist_hook;
diff --git a/src/main.c b/src/main.c
index e0c2c96..7e2efe9 100644
--- a/src/main.c
+++ b/src/main.c
@@ -208,7 +208,7 @@ pgstrom_recursive_grafter(PlannedStmt *pstmt, Plan *parent, Plan **p_curr_plan)
 			 * Try to inject GpuPreAgg plan if cost of the aggregate plan
 			 * is enough expensive to justify preprocess by GPU.
 			 */
-			pgstrom_try_insert_gpupreagg(pstmt, (Agg *) plan);
+			//pgstrom_try_insert_gpupreagg(pstmt, (Agg *) plan);
 			break;
 
 		case T_SubqueryScan:
@@ -311,7 +311,7 @@ pgstrom_recursive_grafter(PlannedStmt *pstmt, Plan *parent, Plan **p_curr_plan)
 			 * Try to replace Sort node by GpuSort node if cost of
 			 * the alternative plan is enough reasonable to replace.
 			 */
-			pgstrom_try_insert_gpusort(pstmt, p_curr_plan);
+			//pgstrom_try_insert_gpusort(pstmt, p_curr_plan);
 			break;
 
 		default:
@@ -386,9 +386,9 @@ _PG_init(void)
 
 	/* registration of custom-scan providers */
 	pgstrom_init_gpuscan();
-	pgstrom_init_gpujoin();
-	pgstrom_init_gpupreagg();
-	pgstrom_init_gpusort();
+//	pgstrom_init_gpujoin();
+//	pgstrom_init_gpupreagg();
+//	pgstrom_init_gpusort();
 
 	/* miscellaneous initializations */
 	pgstrom_init_misc_guc();

pgsql-v9.6-custom-private.v2.patchapplication/octet-stream; name=pgsql-v9.6-custom-private.v2.patchDownload

 doc/src/sgml/custom-scan.sgml     | 115 +++++++++++++++++++++++++++++---------
 src/backend/commands/explain.c    |   2 +-
 src/backend/executor/nodeCustom.c |   4 +-
 src/backend/nodes/copyfuncs.c     |  19 +++++--
 src/backend/nodes/nodes.c         |  82 +++++++++++++++++++++++++++
 src/backend/nodes/outfuncs.c      |  30 +++++++---
 src/backend/nodes/readfuncs.c     |  46 ++++++++++-----
 src/include/nodes/execnodes.h     |   2 +-
 src/include/nodes/nodes.h         |  18 ++++++
 src/include/nodes/plannodes.h     |   5 +-
 src/include/nodes/relation.h      |   6 +-
 11 files changed, 269 insertions(+), 60 deletions(-)

diff --git a/doc/src/sgml/custom-scan.sgml b/doc/src/sgml/custom-scan.sgml
index 5bba125..933a9bd 100644
--- a/doc/src/sgml/custom-scan.sgml
+++ b/doc/src/sgml/custom-scan.sgml
@@ -78,14 +78,16 @@ typedef struct CustomPath
     nodes used by this custom-path node; these will be transformed into
     <structname>Plan</> nodes by planner.
     <structfield>custom_private</> can be used to store the custom path's
-    private data.  Private data should be stored in a form that can be handled
-    by <literal>nodeToString</>, so that debugging routines that attempt to
-    print the custom path will work as designed.  <structfield>methods</> must
-    point to a (usually statically allocated) object implementing the required
-    custom path methods, of which there is currently only one.  The
-    <structfield>LibraryName</> and <structfield>SymbolName</> fields must also
-    be initialized so that the dynamic loader can resolve them to locate the
-    method table.
+    private data. Private data should be stored in a form that can be handled
+    by <literal>nodeToString</>. <structfield>methods</> must point to a
+    (usually statically allocated) object implementing the required custom
+    path methods, of which there is currently only one.
+    The first field of <structname>CustomPathMethods</> is
+    <structname>ExtensibleNodeMethods</>. If custom scan provider defines
+    own structure larger than <structname>CustomPathMethods</> to keep
+    private information on the extra area, it must support callbacks of
+    the <structname>ExtensibleNodeMethods</> to treat these private fields.
+    See <xref linkend="custom-private-structure"> for more details.
   </para>
 
   <para>
@@ -125,19 +127,6 @@ Plan *(*PlanCustomPath) (PlannerInfo *root,
     be a <literal>CustomScan</> object, which the callback must allocate and
     initialize.  See <xref linkend="custom-scan-plan"> for more details.
    </para>
-
-   <para>
-<programlisting>
-void (*TextOutCustomPath) (StringInfo str,
-                           const CustomPath *node);
-</programlisting>
-    Generate additional output when <function>nodeToString</> is invoked on
-    this custom path.  This callback is optional.  Since
-    <function>nodeToString</> will automatically dump all fields in the
-    structure that it can see, including <structfield>custom_private</>, this
-    is only useful if the <structname>CustomPath</> is actually embedded in a
-    larger struct containing additional fields.
-   </para>
   </sect2>
  </sect1>
 
@@ -198,10 +187,15 @@ typedef struct CustomScan
   <para>
    Plan trees must be able to be duplicated using <function>copyObject</>,
    so all the data stored within the <quote>custom</> fields must consist of
-   nodes that that function can handle.  Furthermore, custom scan providers
-   cannot substitute a larger structure that embeds
-   a <structname>CustomScan</> for the structure itself, as would be possible
-   for a <structname>CustomPath</> or <structname>CustomScanState</>.
+   nodes that that function can handle. In case when custom scan providers
+   can substitute a larger structure that embeds a <structname>CustomScan</>
+   for the structure itself, as would be possible for <structname>CustomPath</>
+   or <structname>CustomScanState</>.
+   If custom scan provider keeps its private fields in the extra area of the
+   above own defined structure, it also have to implement each callbacks of
+   <structname>ExtensibleNodeMethods</> to support serialization and
+   deserialization.
+   See <xref linkend="custom-scan-plan"> for more details.
   </para>
 
   <sect2 id="custom-scan-plan-callbacks">
@@ -328,4 +322,75 @@ void (*ExplainCustomScan) (CustomScanState *node,
    </para>
   </sect2>
  </sect1>
+
+ <sect1 id="custom-private-structure">
+  <title>Private fields with own structure</title>
+  <para>
+   Some node types, like <structname>CustomScan</> or
+   <structname>ForeignScan</>, allows extension to have itself as a part of
+   larger structure for their private fields with arbitrary forms.
+   In this case, extension has to ensure its private fields are kept
+   correctly on node operations, like copy, serialization and deserialization.
+  </para>
+  <para>
+   <structname>ExtensibleNodeMethods</> is a collection of callbacks that
+   enable extension to get control on node operations, to process its private
+   fields.
+   All the <structname>ExtensibleNodeMethods</> implementation has to be
+   registered with a unique name. It is key to lookup proper collection of
+   the callbacks later. Usually, it shall be done on <funcation>_PG_init</>.
+  </para>
+  <para>
+   Specification of the individual callbacks are below.
+  </para>
+  <para>
+<programlisting>
+const char *extnodename;
+</programlisting>
+   A unique name of this collection of the node operations callback.
+   It should not be duplicated with one preliminary registered.
+  </para>
+  <para>
+<programlisting>
+Size    node_size;
+</programlisting>
+   Size of the self define structure. Of course, it has to be larger
+   than or equal to the structure to be embedded in.
+  </para>
+  <para>
+<programlisting>
+void    nodeCopy(Node *newnode, const Node *oldnode);
+</programlisting>
+   This callback copies the private fields on <varname>oldnode</> to
+   the <varname>newnode</>. Extension does not need to copy the known
+   fields in the structure that is embedded in; it is the responsibility
+   of the core backend.
+  </para>
+  <para>
+<programlisting>
+Size    nodeEqual(const Node *a, const Node *b);
+</programlisting>
+   This callback compares private fields on <varname>a</> and <varname>b</>,
+   then returns <literal>true</> if equivalent. Elsewhere, <literal>false</>.
+  </para>
+  <para>
+<programlisting>
+Size    nodeOut(StringInfo str, const Node *node);
+</programlisting>
+   This callback serializes the private fields in text form, then put this
+   string on the supplied <structname>StringInfo</>. All extension has to
+   write out is its own private fields. The known fields on the structure
+   embedded in are written out by the core backend.
+  </para>
+  <para>
+<programlisting>
+Size    nodeRead(Node *node);
+</programlisting>
+   This callback deserializes the private fields, written by the above
+   <function>nodeOut</>, on the supplied <varname>node</> object.
+   So, these callbacks have to be always symmetric.
+   It shall be invoked in the context of <function>stringToNode</>, thus
+   <function>pg_strtok</> will fetch next token to read.
+  </para>
+ </sect1>
 </chapter>
diff --git a/src/backend/commands/explain.c b/src/backend/commands/explain.c
index 183d3d9..2986308 100644
--- a/src/backend/commands/explain.c
+++ b/src/backend/commands/explain.c
@@ -891,7 +891,7 @@ ExplainNode(PlanState *planstate, List *ancestors,
 			break;
 		case T_CustomScan:
 			sname = "Custom Scan";
-			custom_name = ((CustomScan *) plan)->methods->CustomName;
+			custom_name = ((CustomScan *) plan)->methods->xnode.extnodename;
 			if (custom_name)
 				pname = psprintf("Custom Scan (%s)", custom_name);
 			else
diff --git a/src/backend/executor/nodeCustom.c b/src/backend/executor/nodeCustom.c
index 0a022df..553fe96 100644
--- a/src/backend/executor/nodeCustom.c
+++ b/src/backend/executor/nodeCustom.c
@@ -145,7 +145,7 @@ ExecCustomMarkPos(CustomScanState *node)
 		ereport(ERROR,
 				(errcode(ERRCODE_FEATURE_NOT_SUPPORTED),
 				 errmsg("custom-scan \"%s\" does not support MarkPos",
-						node->methods->CustomName)));
+						node->methods->xnode.extnodename)));
 	node->methods->MarkPosCustomScan(node);
 }
 
@@ -156,6 +156,6 @@ ExecCustomRestrPos(CustomScanState *node)
 		ereport(ERROR,
 				(errcode(ERRCODE_FEATURE_NOT_SUPPORTED),
 				 errmsg("custom-scan \"%s\" does not support MarkPos",
-						node->methods->CustomName)));
+						node->methods->xnode.extnodename)));
 	node->methods->RestrPosCustomScan(node);
 }
diff --git a/src/backend/nodes/copyfuncs.c b/src/backend/nodes/copyfuncs.c
index 26264cb..c4bb76e 100644
--- a/src/backend/nodes/copyfuncs.c
+++ b/src/backend/nodes/copyfuncs.c
@@ -635,8 +635,12 @@ _copyWorkTableScan(const WorkTableScan *from)
 static ForeignScan *
 _copyForeignScan(const ForeignScan *from)
 {
-	ForeignScan *newnode = makeNode(ForeignScan);
-
+	const ExtensibleNodeMethods *methods
+		= GetExtensibleNodeMethods(from->extnodename, true);
+	ForeignScan *newnode = (ForeignScan *) newNode(!methods
+												   ? sizeof(ForeignScan)
+												   : methods->node_size,
+												   T_ForeignScan);
 	/*
 	 * copy node superclass fields
 	 */
@@ -645,6 +649,7 @@ _copyForeignScan(const ForeignScan *from)
 	/*
 	 * copy remainder of node
 	 */
+	COPY_STRING_FIELD(extnodename);
 	COPY_SCALAR_FIELD(fs_server);
 	COPY_NODE_FIELD(fdw_exprs);
 	COPY_NODE_FIELD(fdw_private);
@@ -652,6 +657,8 @@ _copyForeignScan(const ForeignScan *from)
 	COPY_NODE_FIELD(fdw_recheck_quals);
 	COPY_BITMAPSET_FIELD(fs_relids);
 	COPY_SCALAR_FIELD(fsSystemCol);
+	if (methods && methods->nodeCopy)
+		methods->nodeCopy((Node *) newnode, (const Node *) from);
 
 	return newnode;
 }
@@ -662,8 +669,9 @@ _copyForeignScan(const ForeignScan *from)
 static CustomScan *
 _copyCustomScan(const CustomScan *from)
 {
-	CustomScan *newnode = makeNode(CustomScan);
-
+	const ExtensibleNodeMethods *methods = &from->methods->xnode;
+	CustomScan *newnode = (CustomScan *) newNode(methods->node_size,
+												 T_CustomScan);
 	/*
 	 * copy node superclass fields
 	 */
@@ -686,6 +694,9 @@ _copyCustomScan(const CustomScan *from)
 	 */
 	COPY_SCALAR_FIELD(methods);
 
+	if (methods->nodeCopy)
+		methods->nodeCopy((Node *) newnode, (const Node *) from);
+
 	return newnode;
 }
 
diff --git a/src/backend/nodes/nodes.c b/src/backend/nodes/nodes.c
index 9c0b3fe..f8dd993 100644
--- a/src/backend/nodes/nodes.c
+++ b/src/backend/nodes/nodes.c
@@ -19,6 +19,9 @@
 #include "postgres.h"
 
 #include "nodes/nodes.h"
+#include "nodes/pg_list.h"
+#include "utils/hsearch.h"
+#include "utils/memutils.h"
 
 /*
  * Support for newNode() macro
@@ -29,3 +32,82 @@
  */
 
 Node	   *newNodeMacroHolder;
+
+/*
+ * Extensible node support - We allow extension to have own structure that
+ * contains well known structure (like ForeignScan or CustomScan), to keep
+ * their private data fields. These structure also have to support
+ * serialization / deserialization using node functions. A collection of
+ * callback enables to handle private fields also, not only well known
+ * portion.
+ */
+#define XNODES_NUM_SLOTS	256
+static List	  **xnodes_methods_slots = NULL;
+
+void
+RegisterExtensibleNodeMethods(const ExtensibleNodeMethods *methods)
+{
+	uint32			hash;
+	int				index;
+	ListCell	   *lc;
+	MemoryContext	oldcxt;
+
+	if (!xnodes_methods_slots)
+		xnodes_methods_slots = MemoryContextAllocZero(TopMemoryContext,
+										  sizeof(List *) * XNODES_NUM_SLOTS);
+
+	hash = hash_any(methods->extnodename, strlen(methods->extnodename));
+	index = hash % XNODES_NUM_SLOTS;
+
+	/* duplication check */
+	foreach (lc, xnodes_methods_slots[index])
+	{
+		const ExtensibleNodeMethods *curr = lfirst(lc);
+
+		if (strcmp(methods->extnodename, curr->extnodename) == 0)
+			ereport(ERROR,
+					(errcode(ERRCODE_DUPLICATE_OBJECT),
+					 errmsg("ExtensibleNodeMethods \"%s\" already exists",
+							methods->extnodename)));
+	}
+	/* ok, register this entry */
+	oldcxt = MemoryContextSwitchTo(TopMemoryContext);
+	xnodes_methods_slots[index] = lappend(xnodes_methods_slots[index],
+										  (void *)methods);
+	MemoryContextSwitchTo(oldcxt);
+}
+
+const ExtensibleNodeMethods *
+GetExtensibleNodeMethods(const char *extnodename, bool missing_ok)
+{
+	const ExtensibleNodeMethods *methods = NULL;
+
+	if (!extnodename)
+		return NULL;
+
+	if (xnodes_methods_slots)
+	{
+		uint32		hash = hash_any(extnodename, strlen(extnodename));
+		int			index = hash % XNODES_NUM_SLOTS;
+		ListCell   *lc;
+
+		/* search the registered table */
+		foreach (lc, xnodes_methods_slots[index])
+		{
+			const ExtensibleNodeMethods *curr = lfirst(lc);
+
+			if (strcmp(curr->extnodename, extnodename) == 0)
+			{
+				methods = curr;
+				break;
+			}
+		}
+	}
+
+	if (!methods && !missing_ok)
+		ereport(ERROR,
+				(errcode(ERRCODE_UNDEFINED_OBJECT),
+				 errmsg("ExtensibleNodeMethods \"%s\" was not registered",
+						extnodename)));
+	return methods;
+}
diff --git a/src/backend/nodes/outfuncs.c b/src/backend/nodes/outfuncs.c
index 012c14b..3903aea 100644
--- a/src/backend/nodes/outfuncs.c
+++ b/src/backend/nodes/outfuncs.c
@@ -587,6 +587,9 @@ _outWorkTableScan(StringInfo str, const WorkTableScan *node)
 static void
 _outForeignScan(StringInfo str, const ForeignScan *node)
 {
+	const ExtensibleNodeMethods *methods
+		= GetExtensibleNodeMethods(node->extnodename, true);
+
 	WRITE_NODE_TYPE("FOREIGNSCAN");
 
 	_outScanInfo(str, (const Scan *) node);
@@ -598,11 +601,15 @@ _outForeignScan(StringInfo str, const ForeignScan *node)
 	WRITE_NODE_FIELD(fdw_recheck_quals);
 	WRITE_BITMAPSET_FIELD(fs_relids);
 	WRITE_BOOL_FIELD(fsSystemCol);
+	if (methods && methods->nodeOut)
+		methods->nodeOut(str, (const Node *) node);
 }
 
 static void
 _outCustomScan(StringInfo str, const CustomScan *node)
 {
+	const ExtensibleNodeMethods *methods = &node->methods->xnode;
+
 	WRITE_NODE_TYPE("CUSTOMSCAN");
 
 	_outScanInfo(str, (const Scan *) node);
@@ -613,11 +620,12 @@ _outCustomScan(StringInfo str, const CustomScan *node)
 	WRITE_NODE_FIELD(custom_private);
 	WRITE_NODE_FIELD(custom_scan_tlist);
 	WRITE_BITMAPSET_FIELD(custom_relids);
-	/* Dump library and symbol name instead of raw pointer */
+	/* Dump CustomName; alias of extnodename */
 	appendStringInfoString(str, " :methods ");
-	_outToken(str, node->methods->LibraryName);
-	appendStringInfoChar(str, ' ');
-	_outToken(str, node->methods->SymbolName);
+	_outToken(str, methods->extnodename);
+	/* Dump private fields if any */
+	if (methods && methods->nodeOut)
+		methods->nodeOut(str, (const Node *) node);
 }
 
 static void
@@ -1679,16 +1687,24 @@ _outTidPath(StringInfo str, const TidPath *node)
 static void
 _outForeignPath(StringInfo str, const ForeignPath *node)
 {
+	const ExtensibleNodeMethods *methods
+		= GetExtensibleNodeMethods(node->extnodename, true);
+
 	WRITE_NODE_TYPE("FOREIGNPATH");
 
 	_outPathInfo(str, (const Path *) node);
 
+	WRITE_STRING_FIELD(extnodename);
 	WRITE_NODE_FIELD(fdw_private);
+	if (methods && methods->nodeOut)
+		methods->nodeOut(str, (Node *) node);
 }
 
 static void
 _outCustomPath(StringInfo str, const CustomPath *node)
 {
+	const ExtensibleNodeMethods *methods = &node->methods->xnode;
+
 	WRITE_NODE_TYPE("CUSTOMPATH");
 
 	_outPathInfo(str, (const Path *) node);
@@ -1697,9 +1713,9 @@ _outCustomPath(StringInfo str, const CustomPath *node)
 	WRITE_NODE_FIELD(custom_paths);
 	WRITE_NODE_FIELD(custom_private);
 	appendStringInfoString(str, " :methods ");
-	_outToken(str, node->methods->CustomName);
-	if (node->methods->TextOutCustomPath)
-		node->methods->TextOutCustomPath(str, node);
+	_outToken(str, methods->extnodename);
+	if (methods && methods->nodeOut)
+		methods->nodeOut(str, (Node *)node);
 }
 
 static void
diff --git a/src/backend/nodes/readfuncs.c b/src/backend/nodes/readfuncs.c
index 222e2ed..00e81b7 100644
--- a/src/backend/nodes/readfuncs.c
+++ b/src/backend/nodes/readfuncs.c
@@ -1792,10 +1792,12 @@ _readWorkTableScan(void)
 static ForeignScan *
 _readForeignScan(void)
 {
+	const ExtensibleNodeMethods *methods;
 	READ_LOCALS(ForeignScan);
 
 	ReadCommonScan(&local_node->scan);
 
+	READ_STRING_FIELD(extnodename);
 	READ_OID_FIELD(fs_server);
 	READ_NODE_FIELD(fdw_exprs);
 	READ_NODE_FIELD(fdw_private);
@@ -1804,6 +1806,19 @@ _readForeignScan(void)
 	READ_BITMAPSET_FIELD(fs_relids);
 	READ_BOOL_FIELD(fsSystemCol);
 
+	methods = GetExtensibleNodeMethods(local_node->extnodename, true);
+	if (methods && methods->nodeRead)
+	{
+		ForeignScan	   *local_temp = palloc0(methods->node_size);
+
+		Assert(methods->node_size >= sizeof(ForeignScan));
+
+		memcpy(local_temp, local_node, sizeof(ForeignScan));
+		methods->nodeRead((Node *) local_temp);
+
+		pfree(local_node);
+		local_node = local_temp;
+	}
 	READ_DONE();
 }
 
@@ -1813,10 +1828,9 @@ _readForeignScan(void)
 static CustomScan *
 _readCustomScan(void)
 {
+	const ExtensibleNodeMethods *methods;
+	const char	   *extnodename;
 	READ_LOCALS(CustomScan);
-	char	   *library_name;
-	char	   *symbol_name;
-	const CustomScanMethods *methods;
 
 	ReadCommonScan(&local_node->scan);
 
@@ -1831,18 +1845,24 @@ _readCustomScan(void)
 	 * Reconstruction of methods using library and symbol name
 	 */
 	token = pg_strtok(&length);		/* skip methods: */
-	token = pg_strtok(&length);		/* LibraryName */
-	library_name = nullable_string(token, length);
-	token = pg_strtok(&length);		/* SymbolName */
-	symbol_name = nullable_string(token, length);
+	token = pg_strtok(&length);		/* CustomName (alias of extnodename) */
+	extnodename = nullable_string(token, length);
+	methods = GetExtensibleNodeMethods(extnodename, false);
 
-	methods = (const CustomScanMethods *)
-		load_external_function(library_name, symbol_name, true, NULL);
-	Assert(strcmp(methods->LibraryName, library_name) == 0 &&
-		   strcmp(methods->SymbolName, symbol_name) == 0);
-	local_node->methods = methods;
+	if (methods->nodeRead)
+	{
+		CustomScan	   *local_temp = palloc0(methods->node_size);
 
-	READ_DONE();
+		Assert(methods->node_size >= sizeof(CustomScan));
+		memcpy(local_temp, local_node, sizeof(CustomScan));
+		methods->nodeRead((Node *) local_temp);
+
+		pfree(local_node);
+		local_node = local_temp;
+	}
+	local_node->methods = (const CustomScanMethods *) methods;
+
+    READ_DONE();
 }
 
 /*
diff --git a/src/include/nodes/execnodes.h b/src/include/nodes/execnodes.h
index eb3591a..9e79049 100644
--- a/src/include/nodes/execnodes.h
+++ b/src/include/nodes/execnodes.h
@@ -1607,7 +1607,7 @@ struct CustomScanState;
 
 typedef struct CustomExecMethods
 {
-	const char *CustomName;
+	ExtensibleNodeMethods xnode;	/* collection of node operations */
 
 	/* Executor methods: mark/restore are optional, the rest are required */
 	void		(*BeginCustomScan) (struct CustomScanState *node,
diff --git a/src/include/nodes/nodes.h b/src/include/nodes/nodes.h
index 94bdb7c..0ac4565 100644
--- a/src/include/nodes/nodes.h
+++ b/src/include/nodes/nodes.h
@@ -542,6 +542,24 @@ extern void *copyObject(const void *obj);
  */
 extern bool equal(const void *a, const void *b);
 
+/*
+ * ExtensibleNodeMethods - a collection of node operation callbacks.
+ */
+struct StringInfoData;		/* not to include stringinfo.h here */
+
+typedef struct ExtensibleNodeMethods
+{
+	const char *extnodename;
+	Size		node_size;
+	void	  (*nodeCopy)(Node *newnode, const Node *oldnode);
+	bool	  (*nodeEqual)(const Node *a, const Node *b);
+	void	  (*nodeOut)(struct StringInfoData *str, const Node *node);
+	void	  (*nodeRead)(Node *node);
+} ExtensibleNodeMethods;
+
+extern void RegisterExtensibleNodeMethods(const ExtensibleNodeMethods *method);
+extern const ExtensibleNodeMethods *GetExtensibleNodeMethods(const char *name,
+															 bool missing_ok);
 
 /*
  * Typedefs for identifying qualifier selectivities and plan costs as such.
diff --git a/src/include/nodes/plannodes.h b/src/include/nodes/plannodes.h
index 37086c6..9b99f65 100644
--- a/src/include/nodes/plannodes.h
+++ b/src/include/nodes/plannodes.h
@@ -530,6 +530,7 @@ typedef struct WorkTableScan
 typedef struct ForeignScan
 {
 	Scan		scan;
+	const char *extnodename;	/* collection of node operation callbacks */
 	Oid			fs_server;		/* OID of foreign server */
 	List	   *fdw_exprs;		/* expressions that FDW may evaluate */
 	List	   *fdw_private;	/* private data for FDW */
@@ -556,9 +557,7 @@ struct CustomScan;
 
 typedef struct CustomScanMethods
 {
-	const char *CustomName;
-	const char *LibraryName;
-	const char *SymbolName;
+	ExtensibleNodeMethods xnode;	/* collection of node operations */
 
 	/* Create execution state (CustomScanState) from a CustomScan plan node */
 	Node	   *(*CreateCustomScanState) (struct CustomScan *cscan);
diff --git a/src/include/nodes/relation.h b/src/include/nodes/relation.h
index 9a0dd28..8e4d735 100644
--- a/src/include/nodes/relation.h
+++ b/src/include/nodes/relation.h
@@ -909,6 +909,7 @@ typedef struct TidPath
 typedef struct ForeignPath
 {
 	Path		path;
+	const char *extnodename;
 	List	   *fdw_private;
 } ForeignPath;
 
@@ -937,7 +938,7 @@ struct CustomPath;
 
 typedef struct CustomPathMethods
 {
-	const char *CustomName;
+	ExtensibleNodeMethods xnode;	/* collection of node operations */
 
 	/* Convert Path to a Plan */
 	struct Plan *(*PlanCustomPath) (PlannerInfo *root,
@@ -946,9 +947,6 @@ typedef struct CustomPathMethods
 												List *tlist,
 												List *clauses,
 												List *custom_plans);
-	/* Optional: print additional fields besides "private" */
-	void		(*TextOutCustomPath) (StringInfo str,
-											  const struct CustomPath *node);
 } CustomPathMethods;
 
 typedef struct CustomPath

Import Notes

Resolved by subject fallback

#18

Andres Freund

andres@anarazel.de

about 10 years ago

In reply to: Kouhei Kaigai (#17)

Re: CustomScan in a larger structure (RE: CustomScan support on readfuncs.c)

On 2015-12-02 08:52:20 +0000, Kouhei Kaigai wrote:

diff --git a/src/backend/nodes/copyfuncs.c b/src/backend/nodes/copyfuncs.c
index 26264cb..c4bb76e 100644
--- a/src/backend/nodes/copyfuncs.c
+++ b/src/backend/nodes/copyfuncs.c
@@ -635,8 +635,12 @@ _copyWorkTableScan(const WorkTableScan *from)
static ForeignScan *
_copyForeignScan(const ForeignScan *from)
{
-	ForeignScan *newnode = makeNode(ForeignScan);
-
+	const ExtensibleNodeMethods *methods
+		= GetExtensibleNodeMethods(from->extnodename, true);
+	ForeignScan *newnode = (ForeignScan *) newNode(!methods
+												   ? sizeof(ForeignScan)
+												   : methods->node_size,
+												   T_ForeignScan);

Changing the FDW API seems to put this squarely into 9.6
territory. Agreed?

/*
* copy node superclass fields
*/
@@ -645,6 +649,7 @@ _copyForeignScan(const ForeignScan *from)
/*
* copy remainder of node
*/
+	COPY_STRING_FIELD(extnodename);
COPY_SCALAR_FIELD(fs_server);
COPY_NODE_FIELD(fdw_exprs);
COPY_NODE_FIELD(fdw_private);
@@ -652,6 +657,8 @@ _copyForeignScan(const ForeignScan *from)
COPY_NODE_FIELD(fdw_recheck_quals);
COPY_BITMAPSET_FIELD(fs_relids);
COPY_SCALAR_FIELD(fsSystemCol);
+	if (methods && methods->nodeCopy)
+		methods->nodeCopy((Node *) newnode, (const Node *) from);

I wonder if we shouldn't instead "recurse" into a generic routine for
extensible nodes here.

+void
+RegisterExtensibleNodeMethods(const ExtensibleNodeMethods *methods)
+{
+	uint32			hash;
+	int				index;
+	ListCell	   *lc;
+	MemoryContext	oldcxt;
+
+	if (!xnodes_methods_slots)
+		xnodes_methods_slots = MemoryContextAllocZero(TopMemoryContext,
+										  sizeof(List *) * XNODES_NUM_SLOTS);
+
+	hash = hash_any(methods->extnodename, strlen(methods->extnodename));
+	index = hash % XNODES_NUM_SLOTS;
+
+	/* duplication check */
+	foreach (lc, xnodes_methods_slots[index])
+	{
+		const ExtensibleNodeMethods *curr = lfirst(lc);
+
+		if (strcmp(methods->extnodename, curr->extnodename) == 0)
+			ereport(ERROR,
+					(errcode(ERRCODE_DUPLICATE_OBJECT),
+					 errmsg("ExtensibleNodeMethods \"%s\" already exists",
+							methods->extnodename)));
+	}
+	/* ok, register this entry */
+	oldcxt = MemoryContextSwitchTo(TopMemoryContext);
+	xnodes_methods_slots[index] = lappend(xnodes_methods_slots[index],
+										  (void *)methods);
+	MemoryContextSwitchTo(oldcxt);
+}

Why aren't you using dynahash here, and instead reimplement a hashtable?

static ForeignScan *
_readForeignScan(void)
{
+ const ExtensibleNodeMethods *methods;
READ_LOCALS(ForeignScan);

ReadCommonScan(&local_node->scan);

+ READ_STRING_FIELD(extnodename);
READ_OID_FIELD(fs_server);
READ_NODE_FIELD(fdw_exprs);
READ_NODE_FIELD(fdw_private);
@@ -1804,6 +1806,19 @@ _readForeignScan(void)
READ_BITMAPSET_FIELD(fs_relids);
READ_BOOL_FIELD(fsSystemCol);
+	methods = GetExtensibleNodeMethods(local_node->extnodename, true);
+	if (methods && methods->nodeRead)
+	{
+		ForeignScan	   *local_temp = palloc0(methods->node_size);
+
+		Assert(methods->node_size >= sizeof(ForeignScan));
+
+		memcpy(local_temp, local_node, sizeof(ForeignScan));
+		methods->nodeRead((Node *) local_temp);
+
+		pfree(local_node);
+		local_node = local_temp;
+	}
READ_DONE();
}

This imo pretty clearly shows why the 'extensible node' handling should
be delegated to separate routines.

I wonder if we shouldn't, before committing this, at least write a PoC
implementation of actual extensible nodes. I.e. new node types that are
registered at runtime by extensions. ISTM those are relatively closely
linked, and the above doesn't yet support them nicely afaics.

Greetings,

Andres Freund

--
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers

#19

Robert Haas

robertmhaas@gmail.com

about 10 years ago

In reply to: Andres Freund (#18)

Re: CustomScan in a larger structure (RE: CustomScan support on readfuncs.c)

On Wed, Dec 2, 2015 at 4:06 AM, Andres Freund <andres@anarazel.de> wrote:

On 2015-12-02 08:52:20 +0000, Kouhei Kaigai wrote:

diff --git a/src/backend/nodes/copyfuncs.c b/src/backend/nodes/copyfuncs.c
index 26264cb..c4bb76e 100644
--- a/src/backend/nodes/copyfuncs.c
+++ b/src/backend/nodes/copyfuncs.c
@@ -635,8 +635,12 @@ _copyWorkTableScan(const WorkTableScan *from)
static ForeignScan *
_copyForeignScan(const ForeignScan *from)
{
-     ForeignScan *newnode = makeNode(ForeignScan);
-
+     const ExtensibleNodeMethods *methods
+             = GetExtensibleNodeMethods(from->extnodename, true);
+     ForeignScan *newnode = (ForeignScan *) newNode(!methods
+                                                                                                ? sizeof(ForeignScan)
+                                                                                                : methods->node_size,
+                                                                                                T_ForeignScan);

Changing the FDW API seems to put this squarely into 9.6
territory. Agreed?

I don't think anybody thought this patch was 9.5 material. I didn't,
anyway. The FDW changes for the join-pushdown-EPQ problem are another
issue, but that can be discussed on the other thread.

/*
* copy node superclass fields
*/
@@ -645,6 +649,7 @@ _copyForeignScan(const ForeignScan *from)
/*
* copy remainder of node
*/
+     COPY_STRING_FIELD(extnodename);
COPY_SCALAR_FIELD(fs_server);
COPY_NODE_FIELD(fdw_exprs);
COPY_NODE_FIELD(fdw_private);
@@ -652,6 +657,8 @@ _copyForeignScan(const ForeignScan *from)
COPY_NODE_FIELD(fdw_recheck_quals);
COPY_BITMAPSET_FIELD(fs_relids);
COPY_SCALAR_FIELD(fsSystemCol);
+     if (methods && methods->nodeCopy)
+             methods->nodeCopy((Node *) newnode, (const Node *) from);

I wonder if we shouldn't instead "recurse" into a generic routine for
extensible nodes here.

I don't know what that means.

+void
+RegisterExtensibleNodeMethods(const ExtensibleNodeMethods *methods)
+{
+     uint32                  hash;
+     int                             index;
+     ListCell           *lc;
+     MemoryContext   oldcxt;
+
+     if (!xnodes_methods_slots)
+             xnodes_methods_slots = MemoryContextAllocZero(TopMemoryContext,
+                                                                               sizeof(List *) * XNODES_NUM_SLOTS);
+
+     hash = hash_any(methods->extnodename, strlen(methods->extnodename));
+     index = hash % XNODES_NUM_SLOTS;
+
+     /* duplication check */
+     foreach (lc, xnodes_methods_slots[index])
+     {
+             const ExtensibleNodeMethods *curr = lfirst(lc);
+
+             if (strcmp(methods->extnodename, curr->extnodename) == 0)
+                     ereport(ERROR,
+                                     (errcode(ERRCODE_DUPLICATE_OBJECT),
+                                      errmsg("ExtensibleNodeMethods \"%s\" already exists",
+                                                     methods->extnodename)));
+     }
+     /* ok, register this entry */
+     oldcxt = MemoryContextSwitchTo(TopMemoryContext);
+     xnodes_methods_slots[index] = lappend(xnodes_methods_slots[index],
+                                                                               (void *)methods);
+     MemoryContextSwitchTo(oldcxt);
+}

Why aren't you using dynahash here, and instead reimplement a hashtable?

I had thought we might assume that the number of extensible nodes was
small enough that we could use an array for this and just use linear
search. But if we want to keep the door open to having enough
extensible nodes that this would be inefficient, then I agree that
dynahash is better than a hand-rolled hash table.

static ForeignScan *
_readForeignScan(void)
{
+ const ExtensibleNodeMethods *methods;
READ_LOCALS(ForeignScan);

ReadCommonScan(&local_node->scan);

+ READ_STRING_FIELD(extnodename);
READ_OID_FIELD(fs_server);
READ_NODE_FIELD(fdw_exprs);
READ_NODE_FIELD(fdw_private);
@@ -1804,6 +1806,19 @@ _readForeignScan(void)
READ_BITMAPSET_FIELD(fs_relids);
READ_BOOL_FIELD(fsSystemCol);
+     methods = GetExtensibleNodeMethods(local_node->extnodename, true);
+     if (methods && methods->nodeRead)
+     {
+             ForeignScan        *local_temp = palloc0(methods->node_size);
+
+             Assert(methods->node_size >= sizeof(ForeignScan));
+
+             memcpy(local_temp, local_node, sizeof(ForeignScan));
+             methods->nodeRead((Node *) local_temp);
+
+             pfree(local_node);
+             local_node = local_temp;
+     }
READ_DONE();
}
This imo pretty clearly shows why the 'extensible node' handling should
be delegated to separate routines.

This does look ugly. I'm not sure off-hand how to fix it.

I wonder if we shouldn't, before committing this, at least write a PoC
implementation of actual extensible nodes. I.e. new node types that are
registered at runtime by extensions. ISTM those are relatively closely
linked, and the above doesn't yet support them nicely afaics.

How is what you have in mind different from the pgstrom patch KaiGai attached?

--
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company

--
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers

#20

Kouhei Kaigai

kaigai@ak.jp.nec.com

about 10 years ago

In reply to: Robert Haas (#19)

Re: CustomScan in a larger structure (RE: CustomScan support on readfuncs.c)

On Wed, Dec 2, 2015 at 4:06 AM, Andres Freund <andres@anarazel.de> wrote:
On 2015-12-02 08:52:20 +0000, Kouhei Kaigai wrote:
diff --git a/src/backend/nodes/copyfuncs.c b/src/backend/nodes/copyfuncs.c
index 26264cb..c4bb76e 100644
--- a/src/backend/nodes/copyfuncs.c
+++ b/src/backend/nodes/copyfuncs.c
@@ -635,8 +635,12 @@ _copyWorkTableScan(const WorkTableScan *from)
static ForeignScan *
_copyForeignScan(const ForeignScan *from)
{
-     ForeignScan *newnode = makeNode(ForeignScan);
-
+     const ExtensibleNodeMethods *methods
+             = GetExtensibleNodeMethods(from->extnodename, true);
+     ForeignScan *newnode = (ForeignScan *) newNode(!methods
+
? sizeof(ForeignScan)

+
: methods->node_size,

+

T_ForeignScan);

Changing the FDW API seems to put this squarely into 9.6
territory. Agreed?

I don't think anybody thought this patch was 9.5 material. I didn't,
anyway. The FDW changes for the join-pushdown-EPQ problem are another
issue, but that can be discussed on the other thread.

I don't expect it is 9.5 feature.
The name of attached patch was "pgsql-v9.6-custom-private.v2.patch".

/*
* copy node superclass fields
*/
@@ -645,6 +649,7 @@ _copyForeignScan(const ForeignScan *from)
/*
* copy remainder of node
*/
+     COPY_STRING_FIELD(extnodename);
COPY_SCALAR_FIELD(fs_server);
COPY_NODE_FIELD(fdw_exprs);
COPY_NODE_FIELD(fdw_private);
@@ -652,6 +657,8 @@ _copyForeignScan(const ForeignScan *from)
COPY_NODE_FIELD(fdw_recheck_quals);
COPY_BITMAPSET_FIELD(fs_relids);
COPY_SCALAR_FIELD(fsSystemCol);
+     if (methods && methods->nodeCopy)
+             methods->nodeCopy((Node *) newnode, (const Node *) from);
I wonder if we shouldn't instead "recurse" into a generic routine for
extensible nodes here.
I don't know what that means.
+void
+RegisterExtensibleNodeMethods(const ExtensibleNodeMethods *methods)
+{
+     uint32                  hash;
+     int                             index;
+     ListCell           *lc;
+     MemoryContext   oldcxt;
+
+     if (!xnodes_methods_slots)
+             xnodes_methods_slots =
MemoryContextAllocZero(TopMemoryContext,

+

sizeof(List *) * XNODES_NUM_SLOTS);
+
+     hash = hash_any(methods->extnodename, strlen(methods->extnodename));
+     index = hash % XNODES_NUM_SLOTS;
+
+     /* duplication check */
+     foreach (lc, xnodes_methods_slots[index])
+     {
+             const ExtensibleNodeMethods *curr = lfirst(lc);
+
+             if (strcmp(methods->extnodename, curr->extnodename) == 0)
+                     ereport(ERROR,
+                                     (errcode(ERRCODE_DUPLICATE_OBJECT),
+                                      errmsg("ExtensibleNodeMethods
\"%s\" already exists",

+

methods->extnodename)));
+     }
+     /* ok, register this entry */
+     oldcxt = MemoryContextSwitchTo(TopMemoryContext);
+     xnodes_methods_slots[index] = lappend(xnodes_methods_slots[index],
+
(void *)methods);

+ MemoryContextSwitchTo(oldcxt);
+}

Why aren't you using dynahash here, and instead reimplement a hashtable?

I had thought we might assume that the number of extensible nodes was
small enough that we could use an array for this and just use linear
search. But if we want to keep the door open to having enough
extensible nodes that this would be inefficient, then I agree that
dynahash is better than a hand-rolled hash table.

Hmm. I also expected the number of extensible nodes are not so much,
so self-defined hash table is sufficient. However, it is not strong
reason. Let me revise the hash implementation.

static ForeignScan *
_readForeignScan(void)
{
+ const ExtensibleNodeMethods *methods;
READ_LOCALS(ForeignScan);

ReadCommonScan(&local_node->scan);

+ READ_STRING_FIELD(extnodename);
READ_OID_FIELD(fs_server);
READ_NODE_FIELD(fdw_exprs);
READ_NODE_FIELD(fdw_private);
@@ -1804,6 +1806,19 @@ _readForeignScan(void)
READ_BITMAPSET_FIELD(fs_relids);
READ_BOOL_FIELD(fsSystemCol);
+     methods = GetExtensibleNodeMethods(local_node->extnodename, true);
+     if (methods && methods->nodeRead)
+     {
+             ForeignScan        *local_temp =
palloc0(methods->node_size);
+
+             Assert(methods->node_size >= sizeof(ForeignScan));
+
+             memcpy(local_temp, local_node, sizeof(ForeignScan));
+             methods->nodeRead((Node *) local_temp);
+
+             pfree(local_node);
+             local_node = local_temp;
+     }
READ_DONE();
}
This imo pretty clearly shows why the 'extensible node' handling should
be delegated to separate routines.
This does look ugly. I'm not sure off-hand how to fix it.

The problem is that we cannot know what extensible node will fit on the
node currently we read unless we don't read extnodename. It also means we
cannot know exact size of the structure prior to the read of extnodename.

I can agree that the above code is never graceful, however, it is not an
avoidable restriction.

I wonder if we shouldn't, before committing this, at least write a PoC
implementation of actual extensible nodes. I.e. new node types that are
registered at runtime by extensions. ISTM those are relatively closely
linked, and the above doesn't yet support them nicely afaics.

How is what you have in mind different from the pgstrom patch KaiGai attached?

This example make node copy by copyObject, then serialize and deserialize
the CustomScan node including extra private fields. I attached it for the
purpose of PoC.

Thanks,
--
NEC Business Creation Division / PG-Strom Project
KaiGai Kohei <kaigai@ak.jp.nec.com>

--
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers

Import Notes

Resolved by subject fallback