pgsql: Modify tqueue infrastructure to support transient record types.

Started by Robert Haasabout 10 years ago6 messages
#1Robert Haas
rhaas@postgresql.org

Modify tqueue infrastructure to support transient record types.

Commit 4a4e6893aa080b9094dadbe0e65f8a75fee41ac6, which introduced this
mechanism, failed to account for the fact that the RECORD pseudo-type
uses transient typmods that are only meaningful within a single
backend. Transferring such tuples without modification between two
cooperating backends does not work. This commit installs a system
for passing the tuple descriptors over the same shm_mq being used to
send the tuples themselves. The two sides might not assign the same
transient typmod to any given tuple descriptor, so we must also
substitute the appropriate receiver-side typmod for the one used by
the sender. That adds some CPU overhead, but still seems better than
being unable to pass records between cooperating parallel processes.

Along the way, move the logic for handling multiple tuple queues from
tqueue.c to nodeGather.c; tqueue.c now provides a TupleQueueReader,
which reads from a single queue, rather than a TupleQueueFunnel, which
potentially reads from multiple queues. This change was suggested
previously as a way to make sure that nodeGather.c rather than tqueue.c
had policy control over the order in which to read from queues, but
it wasn't clear to me until now how good an idea it was. typmod
mapping needs to be performed separately for each queue, and it is
much simpler if the tqueue.c code handles that and leaves multiplexing
multiple queues to higher layers of the stack.

Branch
------
master

Details
-------
http://git.postgresql.org/pg/commitdiff/6e71dd7ce9766582da453f493bc371d64977282f

Modified Files
--------------
src/backend/executor/nodeGather.c | 138 ++++--
src/backend/executor/tqueue.c | 983 ++++++++++++++++++++++++++++++++-----
src/include/executor/tqueue.h | 12 +-
src/include/nodes/execnodes.h | 4 +-
src/tools/pgindent/typedefs.list | 2 +-
5 files changed, 986 insertions(+), 153 deletions(-)

--
Sent via pgsql-committers mailing list (pgsql-committers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-committers

#2Kevin Grittner
kgrittn@ymail.com
In reply to: Robert Haas (#1)
Re: pgsql: Modify tqueue infrastructure to support transient record types.

On Friday, November 6, 2015 3:59 PM, Robert Haas <rhaas@postgresql.org> wrote:

Modified Files
--------------

src/backend/executor/tqueue.c | 983 ++++++++++++++++++++++++++++++++-----

This patch created some referenced local variables:

tqueue.c: In function ‘tqueueReceiveSlot’:
tqueue.c:124:18: error: variable ‘tup’ set but not used [-Werror=unused-but-set-variable]
HeapTupleHeader tup;
^
tqueue.c: In function ‘TupleQueueRemapTuple’:
tqueue.c:612:8: error: variable ‘dirty’ set but not used [-Werror=unused-but-set-variable]
bool dirty = false;
^
cc1: all warnings being treated as errors

--
Kevin Grittner
EDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company

--
Sent via pgsql-committers mailing list (pgsql-committers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-committers

#3Amit Kapila
amit.kapila16@gmail.com
In reply to: Robert Haas (#1)
1 attachment(s)
Re: pgsql: Modify tqueue infrastructure to support transient record types.

On Sat, Nov 7, 2015 at 3:29 AM, Robert Haas <rhaas@postgresql.org> wrote:

Modify tqueue infrastructure to support transient record types.

I am getting below compiler warning on Windows:
tqueue.c(662): warning C4715: 'TupleQueueRemap' : not all control paths
return a value

Attached patch fixes the problem.

With Regards,
Amit Kapila.
EnterpriseDB: http://www.enterprisedb.com

Attachments:

fix_win_compiler_warning_v1.patchapplication/octet-stream; name=fix_win_compiler_warning_v1.patchDownload
diff --git a/src/backend/executor/tqueue.c b/src/backend/executor/tqueue.c
index 7699d98..efb3901 100644
--- a/src/backend/executor/tqueue.c
+++ b/src/backend/executor/tqueue.c
@@ -656,9 +656,12 @@ TupleQueueRemap(TupleQueueReader *reader, RemapClass remapclass, Datum value)
 
 		case TQUEUE_REMAP_RECORD:
 			return TupleQueueRemapRecord(reader, value);
-	}
 
-	elog(ERROR, "unknown remap class: %d", (int) remapclass);
+		default:
+			/* shouldn't happen */
+			elog(ERROR, "unknown remap class: %d", (int) remapclass);
+			PG_RETURN_DATUM(0);
+	}
 }
 
 /*
#4Amit Kapila
amit.kapila16@gmail.com
In reply to: Robert Haas (#1)
1 attachment(s)
Re: pgsql: Modify tqueue infrastructure to support transient record types.

On Sat, Nov 7, 2015 at 3:29 AM, Robert Haas <rhaas@postgresql.org> wrote:

Modify tqueue infrastructure to support transient record types.

+static HeapTuple
+gather_readnext(GatherState *gatherstate)
+{
..
+       if (readerdone)
+       {
+           DestroyTupleQueueReader(reader);
+           --gatherstate->nreaders;
+           if (gatherstate->nreaders == 0)
+           {
+               ExecShutdownGather(gatherstate);
+               return NULL;
+           }
..
}

I think after readers are done, it's not good to call ShutdownGather,
because it will destroy the parallel context as well and the same is
needed for the cases when after the readers are done we still need
to access dsm, like for rescan and for scanning the data from local
node.

Here, we should just shutdown the workers and that is what we were
doing previous to this commit. Attached patch fixes this problem.

With Regards,
Amit Kapila.
EnterpriseDB: http://www.enterprisedb.com

Attachments:

fix_gather_shutdown_workers_v1.patchapplication/octet-stream; name=fix_gather_shutdown_workers_v1.patchDownload
diff --git a/src/backend/executor/nodeGather.c b/src/backend/executor/nodeGather.c
index 850c67e..5139404 100644
--- a/src/backend/executor/nodeGather.c
+++ b/src/backend/executor/nodeGather.c
@@ -341,7 +341,7 @@ gather_readnext(GatherState *gatherstate)
 			--gatherstate->nreaders;
 			if (gatherstate->nreaders == 0)
 			{
-				ExecShutdownGather(gatherstate);
+				ExecShutdownGatherWorkers(gatherstate);
 				return NULL;
 			}
 			else
#5Robert Haas
robertmhaas@gmail.com
In reply to: Amit Kapila (#3)
Re: [COMMITTERS] pgsql: Modify tqueue infrastructure to support transient record types.

On Mon, Nov 9, 2015 at 4:06 AM, Amit Kapila <amit.kapila16@gmail.com> wrote:

On Sat, Nov 7, 2015 at 3:29 AM, Robert Haas <rhaas@postgresql.org> wrote:

Modify tqueue infrastructure to support transient record types.

I am getting below compiler warning on Windows:
tqueue.c(662): warning C4715: 'TupleQueueRemap' : not all control paths
return a value

Well, that sucks. Apparently, your Windows compiler thinks
elog(ERROR, ...) might return. Apparently
b853eb97182079dcd30b4f52576bd5d6c275ee71 wasn't good enough for your
compiler; I wonder why not.

Attached patch fixes the problem.

I don't want to do that, because adding a default: case to the switch
will prevent the compiler from complaining if somebody introduces
another enum value in the future. Instead, we can just add a dummy
return at the end of the function. I'll go do that.

--
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company

--
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers

#6Robert Haas
robertmhaas@gmail.com
In reply to: Amit Kapila (#4)
Re: pgsql: Modify tqueue infrastructure to support transient record types.

On Mon, Nov 9, 2015 at 8:18 AM, Amit Kapila <amit.kapila16@gmail.com> wrote:

On Sat, Nov 7, 2015 at 3:29 AM, Robert Haas <rhaas@postgresql.org> wrote:

Modify tqueue infrastructure to support transient record types.

+static HeapTuple
+gather_readnext(GatherState *gatherstate)
+{
..
+       if (readerdone)
+       {
+           DestroyTupleQueueReader(reader);
+           --gatherstate->nreaders;
+           if (gatherstate->nreaders == 0)
+           {
+               ExecShutdownGather(gatherstate);
+               return NULL;
+           }
..
}

I think after readers are done, it's not good to call ShutdownGather,
because it will destroy the parallel context as well and the same is
needed for the cases when after the readers are done we still need
to access dsm, like for rescan and for scanning the data from local
node.

Here, we should just shutdown the workers and that is what we were
doing previous to this commit. Attached patch fixes this problem.

Oops. Rebasing fail. Committed.

--
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company

--
Sent via pgsql-committers mailing list (pgsql-committers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-committers