CREATE TABLE with parallel workers, 10.0?
Hello,
(I'm posting to hackers since I got no response on the general list.)
I use Postgres + PostGIS quite heavily, and recently have been taking full
advantage of the new parallelism in 9.6. I'm now running queries in a few
hours that would otherwise take more than a day.
However, parallelism is disabled for all queries that perform writes (as
documented). I would normally run "CREATE TABLE AS [some super-expensive
query]", but since that can't use parallelism I'm using the \o option in
psql, creating the table separately, and then \copy-ing in the results.
That works, but "CREATE TABLE AS" would be more convenient.
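For concreteness, that workaround looks roughly like the following sketch (the file name, format settings, and column definitions are just placeholders, and details like NULL handling are glossed over):
\pset format unaligned
\pset fieldsep '\t'
\pset tuples_only on
\o /tmp/results.tsv
SELECT ... ;                        -- the expensive query, which can use parallel workers
\o
CREATE TABLE results (...);         -- column definitions matching the query output
\copy results FROM '/tmp/results.tsv'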
Are there plans in 10.0 to allow parallelism in queries that write, or at
least in "CREATE TABLE AS" queries? (Support in materialized views would be
great, too!)
Thanks,
Joshua Chamberlain
On Wed, Feb 15, 2017 at 2:24 PM, Joshua Chamberlain <josh@zephyri.co> wrote:
Are there plans in 10.0 to allow parallelism in queries that write, or at
least in "CREATE TABLE AS" queries? (Support in materialized views would be
great, too!)
Nope. There are no patches for now.
--
Michael
On Tue, Feb 14, 2017 at 09:24:47PM -0800, Joshua Chamberlain wrote:
Hello,
(I'm posting to hackers since I got no response on the general
list.)
I use Postgres + PostGIS quite heavily, and recently have been
taking full advantage of the new parallelism in 9.6. I'm now running
queries in a few hours that would otherwise take more than a day.
However, parallelism is disabled for all queries that perform writes
(as documented). I would normally run "CREATE TABLE AS [some
super-expensive query]", but since that can't use parallelism I'm
using the \o option in psql, creating the table separately, and then
\copy-ing in the results. That works, but "CREATE TABLE AS" would
be more convenient.
How about creating a temp view?
CREATE TEMPORARY VIEW foo_tv AS [your gigantic query goes here];
CREATE TABLE foo (LIKE foo_tv);
INSERT INTO foo SELECT * FROM foo_tv;
Are there plans in 10.0 to allow parallelism in queries that write,
or at least in "CREATE TABLE AS" queries? (Support in materialized
views would be great, too!)
Patches are always welcome, and there's one more commitfest to go
before 10.
Best,
David.
--
David Fetter <david(at)fetter(dot)org> http://fetter.org/
Phone: +1 415 235 3778 AIM: dfetter666 Yahoo!: dfetter
Skype: davidfetter XMPP: david(dot)fetter(at)gmail(dot)com
Remember to vote!
Consider donating to Postgres: http://www.postgresql.org/about/donate
On Wed, Feb 15, 2017 at 12:24 AM, Joshua Chamberlain <josh@zephyri.co> wrote:
Hello,
(I'm posting to hackers since I got no response on the general list.)
I use Postgres + PostGIS quite heavily, and recently have been taking full
advantage of the new parallelism in 9.6. I'm now running queries in a few
hours that would otherwise take more than a day.
However, parallelism is disabled for all queries that perform writes (as
documented). I would normally run "CREATE TABLE AS [some super-expensive
query]", but since that can't use parallelism I'm using the \o option in
psql, creating the table separately, and then \copy-ing in the results. That
works, but "CREATE TABLE AS" would be more convenient.
Are there plans in 10.0 to allow parallelism in queries that write, or at
least in "CREATE TABLE AS" queries? (Support in materialized views would be
great, too!)
Somebody else asked me about this in private email. Nobody at
EnterpriseDB is working on this right now, and I don't think anybody
else is working on it either. There are several ways to do it, but
none of them are entirely easy and the ones likely to perform better
are less easy.
I think that what we need to do to make this work is come up with a
way to fix the interaction between group locking (which allows
multiple processes to hold mutually conflicting locks at the same time
if they are in a parallel group) and relation extension (which uses
heavyweight locking to prevent multiple processes from trying to
extend the same relation at the same time). I think that perhaps the
right way to solve that problem is to come up with a way of providing
mutual exclusion around relation extension that doesn't need the
heavyweight lock manager. Relation extension locks don't need
multiple lock modes or deadlock detection or any of the other
frammishes that the heavyweight lock manager provides, but they do
need to be fast, which the heavyweight lock manager isn't especially
good at. So moving this out of the heavyweight lock manager might be
a way to solve two problems at once.
There's also a hazard related to GIN indexes since
e2c79e14d998cd31f860854bc9210b37b457bb01, which introduced a new use
of Page locks, which have a similar kind of problem. We got rid of
the major existing use of page locks in
6d46f4783efe457f74816a75173eb23ed8930020, which extirpated them from
hash indexes, and I was kind of hoping they could go away altogether,
but we can't do that as long as GIN is using them.
Anyway, if we solve those problems, we can allow inserts (not updates
or deletes, those have other problems, principally relating to combo
CIDs) in parallel mode, which would make it possible to allow the
kinds of things you are asking about here. Then you could fix things
so that each worker generates a subset of the tuples and inserts the
ones it generates. You'd end up with a parallel plan but no Gather
node! The workers could feed tuples directly to a suitable
DestReceiver, which would be really spiffy.
The other way of fixing this problem is to have each worker generate a
subset of the tuples and funnel them all back to the leader through a
Gather node; the leader then does all the inserts. That avoids having
to solve the problems mentioned above, but it probably doesn't perform
nearly as well.
--
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company
On Thu, Feb 16, 2017 at 12:48 AM, Robert Haas <robertmhaas@gmail.com> wrote:
On Wed, Feb 15, 2017 at 12:24 AM, Joshua Chamberlain <josh@zephyri.co> wrote:
Hello,
(I'm posting to hackers since I got no response on the general list.)
I use Postgres + PostGIS quite heavily, and recently have been taking full
advantage of the new parallelism in 9.6. I'm now running queries in a few
hours that would otherwise take more than a day.
However, parallelism is disabled for all queries that perform writes (as
documented). I would normally run "CREATE TABLE AS [some super-expensive
query]", but since that can't use parallelism I'm using the \o option in
psql, creating the table separately, and then \copy-ing in the results. That
works, but "CREATE TABLE AS" would be more convenient.
Are there plans in 10.0 to allow parallelism in queries that write, or at
least in "CREATE TABLE AS" queries? (Support in materialized views would be
great, too!)
Somebody else asked me about this in private email. Nobody at
EnterpriseDB is working on this right now, and I don't think anybody
else is working on it either. There are several ways to do it, but
none of them are entirely easy and the ones likely to perform better
are less easy.
I think that what we need to do to make this work is come up with a
way to fix the interaction between group locking (which allows
multiple processes to hold mutually conflicting locks at the same time
if they are in a parallel group) and relation extension (which uses
heavyweight locking to prevent multiple processes from trying to
extend the same relation at the same time). I think that perhaps the
right way to solve that problem is to come up with a way of providing
mutual exclusion around relation extension that doesn't need the
heavyweight lock manager. Relation extension locks don't need
multiple lock modes or deadlock detection or any of the other
frammishes that the heavyweight lock manager provides, but they do
need to be fast, which the heavyweight lock manager isn't especially
good at. So moving this out of the heavyweight lock manager might be
a way to solve two problems at once.
There's also a hazard related to GIN indexes since
e2c79e14d998cd31f860854bc9210b37b457bb01, which introduced a new use
of Page locks, which have a similar kind of problem. We got rid of
the major existing use of page locks in
6d46f4783efe457f74816a75173eb23ed8930020, which extirpated them from
hash indexes, and I was kind of hoping they could go away altogether,
but we can't do that as long as GIN is using them.
Anyway, if we solve those problems, we can allow inserts (not updates
or deletes, those have other problems, principally relating to combo
CIDs) in parallel mode, which would make it possible to allow the
kinds of things you are asking about here. Then you could fix things
so that each worker generates a subset of the tuples and inserts the
ones it generates. You'd end up with a parallel plan but no Gather
node! The workers could feed tuples directly to a suitable
DestReceiver, which would be really spiffy.
The other way of fixing this problem is to have each worker generate a
subset of the tuples and funnel them all back to the leader through a
Gather node; the leader then does all the inserts. That avoids having
to solve the problems mentioned above, but it probably doesn't perform
nearly as well.
How about supporting something like this: the backend does the write operations,
while the workers produce the results. This may not perform as well in all cases
as having all the parallel workers do the writes, but it could be useful for some
scenarios, such as CREATE MATERIALIZED VIEW.
Following are explain plans with minimal code changes to allow write operations.
I haven't verified all the scenarios. How about supporting write operations as
below, and later enhancing it so that the parallel workers also do the writes?
A POC patch is attached.
postgres=# explain create materialized view mat_view as select * from tbl where f1 =10;
                               QUERY PLAN
------------------------------------------------------------------------
 Gather  (cost=1000.00..37458.43 rows=1 width=214)
   Workers Planned: 2
   ->  Parallel Seq Scan on tbl  (cost=0.00..36458.33 rows=1 width=214)
         Filter: (f1 = 10)
(4 rows)

postgres=# explain insert into tbl select * from tbl where f1 = 10;
                                     QUERY PLAN
------------------------------------------------------------------------------------
 Insert on tbl  (cost=1000.00..37458.43 rows=1 width=214)
   ->  Gather  (cost=1000.00..37458.43 rows=1 width=214)
         Workers Planned: 2
         ->  Parallel Seq Scan on tbl tbl_1  (cost=0.00..36458.33 rows=1 width=214)
               Filter: (f1 = 10)
(5 rows)

postgres=# explain update tbl set f1 = 10 where f1 = 10;
                                  QUERY PLAN
------------------------------------------------------------------------------
 Update on tbl  (cost=1000.00..37458.43 rows=1 width=220)
   ->  Gather  (cost=1000.00..37458.43 rows=1 width=220)
         Workers Planned: 2
         ->  Parallel Seq Scan on tbl  (cost=0.00..36458.33 rows=1 width=220)
               Filter: (f1 = 10)
(5 rows)
Regards,
Hari Babu
Fujitsu Australia
Attachment: parallel_writer_poc_1.patch
diff --git a/src/backend/access/heap/heapam.c b/src/backend/access/heap/heapam.c
index af25836..0a3367b 100644
--- a/src/backend/access/heap/heapam.c
+++ b/src/backend/access/heap/heapam.c
@@ -2568,15 +2568,13 @@ heap_prepare_insert(Relation relation, HeapTuple tup, TransactionId xid,
CommandId cid, int options)
{
/*
- * For now, parallel operations are required to be strictly read-only.
- * Unlike heap_update() and heap_delete(), an insert should never create a
- * combo CID, so it might be possible to relax this restriction, but not
- * without more thought and testing.
+ * For now, parallel operations are required to be strictly read-only in
+ * parallel worker.
*/
- if (IsInParallelMode())
+ if (IsParallelWorker())
ereport(ERROR,
(errcode(ERRCODE_INVALID_TRANSACTION_STATE),
- errmsg("cannot insert tuples during a parallel operation")));
+ errmsg("cannot insert tuples in a parallel worker")));
if (relation->rd_rel->relhasoids)
{
@@ -3023,10 +3021,10 @@ heap_delete(Relation relation, ItemPointer tid,
* Other workers might need that combocid for visibility checks, and we
* have no provision for broadcasting it to them.
*/
- if (IsInParallelMode())
+ if (IsParallelWorker())
ereport(ERROR,
(errcode(ERRCODE_INVALID_TRANSACTION_STATE),
- errmsg("cannot delete tuples during a parallel operation")));
+ errmsg("cannot delete tuples in a parallel worker")));
block = ItemPointerGetBlockNumber(tid);
buffer = ReadBuffer(relation, block);
@@ -3495,10 +3493,10 @@ heap_update(Relation relation, ItemPointer otid, HeapTuple newtup,
* Other workers might need that combocid for visibility checks, and we
* have no provision for broadcasting it to them.
*/
- if (IsInParallelMode())
+ if (IsParallelWorker())
ereport(ERROR,
(errcode(ERRCODE_INVALID_TRANSACTION_STATE),
- errmsg("cannot update tuples during a parallel operation")));
+ errmsg("cannot update tuples in a parallel worker")));
/*
* Fetch the list of attributes to be checked for HOT update. This is
diff --git a/src/backend/access/transam/varsup.c b/src/backend/access/transam/varsup.c
index 42fc351..acd24ab 100644
--- a/src/backend/access/transam/varsup.c
+++ b/src/backend/access/transam/varsup.c
@@ -15,6 +15,7 @@
#include "access/clog.h"
#include "access/commit_ts.h"
+#include "access/parallel.h"
#include "access/subtrans.h"
#include "access/transam.h"
#include "access/xact.h"
@@ -51,10 +52,10 @@ GetNewTransactionId(bool isSubXact)
/*
* Workers synchronize transaction state at the beginning of each parallel
- * operation, so we can't account for new XIDs after that point.
+ * operation, so we can't account for new XIDs in a parallel worker.
*/
- if (IsInParallelMode())
- elog(ERROR, "cannot assign TransactionIds during a parallel operation");
+ if (IsParallelWorker())
+ elog(ERROR, "cannot assign TransactionIds in a parallel worker");
/*
* During bootstrap initialization, we return the special bootstrap
diff --git a/src/backend/access/transam/xact.c b/src/backend/access/transam/xact.c
index 82f9a3c..cafe387 100644
--- a/src/backend/access/transam/xact.c
+++ b/src/backend/access/transam/xact.c
@@ -497,10 +497,10 @@ AssignTransactionId(TransactionState s)
/*
* Workers synchronize transaction state at the beginning of each parallel
- * operation, so we can't account for new XIDs at this point.
+ * operation, so we can't account for new XIDs in parallel worker.
*/
- if (IsInParallelMode() || IsParallelWorker())
- elog(ERROR, "cannot assign XIDs during a parallel operation");
+ if (IsParallelWorker())
+ elog(ERROR, "cannot assign XIDs in a parallel worker");
/*
* Ensure parent(s) have XIDs, so that a child always has an XID later
diff --git a/src/backend/commands/copy.c b/src/backend/commands/copy.c
index 949844d..5099cad 100644
--- a/src/backend/commands/copy.c
+++ b/src/backend/commands/copy.c
@@ -1524,7 +1524,7 @@ BeginCopy(ParseState *pstate,
}
/* plan the query */
- plan = pg_plan_query(query, 0, NULL);
+ plan = pg_plan_query(query, CURSOR_OPT_PARALLEL_OK, NULL);
/*
* With row level security and a user using "COPY relation TO", we
diff --git a/src/backend/commands/explain.c b/src/backend/commands/explain.c
index c9e0a3e..25e6e9e 100644
--- a/src/backend/commands/explain.c
+++ b/src/backend/commands/explain.c
@@ -397,7 +397,7 @@ ExplainOneUtility(Node *utilityStmt, IntoClause *into, ExplainState *es,
rewritten = QueryRewrite(castNode(Query, copyObject(ctas->query)));
Assert(list_length(rewritten) == 1);
ExplainOneQuery(castNode(Query, linitial(rewritten)),
- 0, ctas->into, es,
+ CURSOR_OPT_PARALLEL_OK, ctas->into, es,
queryString, params);
}
else if (IsA(utilityStmt, DeclareCursorStmt))
diff --git a/src/backend/executor/execMain.c b/src/backend/executor/execMain.c
index a666391..97b50d9 100644
--- a/src/backend/executor/execMain.c
+++ b/src/backend/executor/execMain.c
@@ -1585,11 +1585,9 @@ ExecutePlan(EState *estate,
/*
* If a tuple count was supplied, we must force the plan to run without
- * parallelism, because we might exit early. Also disable parallelism
- * when writing into a relation, because no database changes are allowed
- * in parallel mode.
+ * parallelism, because we might exit early.
*/
- if (numberTuples || dest->mydest == DestIntoRel)
+ if (numberTuples)
use_parallel_mode = false;
if (use_parallel_mode)
diff --git a/src/backend/optimizer/plan/planner.c b/src/backend/optimizer/plan/planner.c
index 3d33d46..7deaba2 100644
--- a/src/backend/optimizer/plan/planner.c
+++ b/src/backend/optimizer/plan/planner.c
@@ -242,7 +242,7 @@ standard_planner(Query *parse, int cursorOptions, ParamListInfo boundParams)
if ((cursorOptions & CURSOR_OPT_PARALLEL_OK) != 0 &&
IsUnderPostmaster &&
dynamic_shared_memory_type != DSM_IMPL_NONE &&
- parse->commandType == CMD_SELECT &&
+ parse->commandType != CMD_UTILITY &&
!parse->hasModifyingCTE &&
max_parallel_workers_per_gather > 0 &&
!IsParallelWorker() &&
Hi,
On 2017-02-15 08:48:44 -0500, Robert Haas wrote:
We got rid of the major existing use of page locks in
6d46f4783efe457f74816a75173eb23ed8930020, which extirpated them from
hash indexes, and I was kind of hoping they could go away altogether,
but we can't do that as long as GIN is using them.
Learned a new word today.
Anyway, if we solve those problems, we can allow inserts (not updates
or deletes, those have other problems, principally relating to combo
CIDs) in parallel mode, which would make it possible to allow the
kinds of things you are asking about here.
I don't think general INSERTs are safe, if you consider unique indexes
and foreign keys (both setting xmax in the simple cases and multixacts
are likely to be problematic).
The other way of fixing this problem is to have each worker generate a
subset of the tuples and funnel them all back to the leader through a
Gather node; the leader then does all the inserts. That avoids having
to solve the problems mentioned above, but it probably doesn't perform
nearly as well.
I think it'd already be tremendously useful however. I think it'd not
be an unreasonable first step. It'd be a good fallback that'd be useful
for !insert and such anyway.
Regards,
Andres
All,
* Andres Freund (andres@anarazel.de) wrote:
On 2017-02-15 08:48:44 -0500, Robert Haas wrote:
The other way of fixing this problem is to have each worker generate a
subset of the tuples and funnel them all back to the leader through a
Gather node; the leader then does all the inserts. That avoids having
to solve the problems mentioned above, but it probably doesn't perform
nearly as well.
I think it'd already be tremendously useful however. I think it'd not
be an unreasonable first step. It'd be a good fallback that'd be useful
for !insert and such anyway.
Absolutely. I had always figured this would be what we would do first,
before coming up with something more clever down the road. In
particular, this allows filters to be pushed down and performed in
parallel, which may significantly reduce the amount of data passed back
up to the leader.
In many cases, I expect this would work just as well, if not better,
than trying to actually do writes in parallel.
Thanks!
Stephen
On February 15, 2017 5:20:20 PM PST, Stephen Frost <sfrost@snowman.net> wrote:
In many cases, I expect this would work just as well, if not better,
than trying to actually do writes in parallel.
Why? IPCing tuples around is quite expensive. Or do you mean because it'll be more suitable because of the possible plans?
Andres
--
Sent from my Android device with K-9 Mail. Please excuse my brevity.
Andres,
* Andres Freund (andres@anarazel.de) wrote:
On February 15, 2017 5:20:20 PM PST, Stephen Frost <sfrost@snowman.net> wrote:
In many cases, I expect this would work just as well, if not better,
than trying to actually do writes in parallel.
Why? IPCing tuples around is quite expensive. Or do you mean because it'll be more suitable because of the possible plans?
Because I've seen some serious problems when trying to have multiple
processes writing into the same relation due to the relation extension
lock, cases where it was much faster to have each process write into its
own table. Admittedly, we've improved things there, so perhaps this isn't
an issue any longer, but we also don't yet really know what the locking
implementation looks like for having multiple parallel workers
writing into the same relation, so it may be that sending a few records
back to the leader is cheaper than working out the locking to allow
parallel workers to write to the same relation, or at least not any more
expensive.
Thanks!
Stephen
On 2017-02-15 20:28:43 -0500, Stephen Frost wrote:
Andres,
* Andres Freund (andres@anarazel.de) wrote:
On February 15, 2017 5:20:20 PM PST, Stephen Frost <sfrost@snowman.net> wrote:
In many cases, I expect this would work just as well, if not better,
than trying to actually do writes in parallel.
Why? IPCing tuples around is quite expensive. Or do you mean because it'll be more suitable because of the possible plans?
Because I've seen some serious problems when trying to have multiple
processes writing into the same relation due to the relation extension
lock, cases where it was much faster to have each process write into its
own table. Admittedly, we've improved things there, so perhaps this isn't
an issue any longer, but we also don't yet really know what the locking
implementation looks like for having multiple parallel workers
writing into the same relation, so it may be that sending a few records
back to the leader is cheaper than working out the locking to allow
parallel workers to write to the same relation, or at least not any more
expensive.
I quite seriously doubt that you will get enough rows to the master via
tuplequeues that it'll be faster than inserting on the workers, even
with such scalability problems present. Even before the 9.6
improvements, and even more so after.
Greetings,
Andres Freund
Andres,
* Andres Freund (andres@anarazel.de) wrote:
On 2017-02-15 20:28:43 -0500, Stephen Frost wrote:
* Andres Freund (andres@anarazel.de) wrote:
On February 15, 2017 5:20:20 PM PST, Stephen Frost <sfrost@snowman.net> wrote:
In many cases, I expect this would work just as well, if not better,
than trying to actually do writes in parallel.
Why? IPCing tuples around is quite expensive. Or do you mean because it'll be more suitable because of the possible plans?
Because I've seen some serious problems when trying to have multiple
processes writing into the same relation due to the relation extension
lock, cases where it was much faster to have each process write into its
own table. Admittedly, we've improved things there, so perhaps this isn't
an issue any longer, but we also don't yet really know what the locking
implementation looks like for having multiple parallel workers
writing into the same relation, so it may be that sending a few records
back to the leader is cheaper than working out the locking to allow
parallel workers to write to the same relation, or at least not any more
expensive.
I quite seriously doubt that you will get enough rows to the master via
tuplequeues that it'll be faster than inserting on the workers, even
with such scalability problems present. Even before the 9.6
improvements, and even more so after.
Perhaps, but until we've got a system worked out for having the workers
do the writes, we aren't getting anything. Being able to have the
leader do the writes based on the tuples fed back from the workers is
clearly better than nothing.
Thanks!
Stephen
On 2017-02-15 20:35:16 -0500, Stephen Frost wrote:
Perhaps, but until we've got a system worked out for having the workers
do the writes, we aren't getting anything. Being able to have the
leader do the writes based on the tuples fed back from the workers is
clearly better than nothing.
Never argued differently.
On Wed, Feb 15, 2017 at 8:13 PM, Andres Freund <andres@anarazel.de> wrote:
I don't think general INSERTs are safe, if you consider unique indexes
and foreign keys (both setting xmax in the simple cases and multixacts
are likely to be problematic).
There's no real problem with setting xmax or creating multixacts - I
think - but there's definitely a problem if an INSERT can lead to the
creation of a new combo CID, because we have no way at present of
transmitting the new combo CID mapping to the workers, and if a worker
sees a combo CID for which it doesn't have a mapping, the world blows
up. Do you think an insert can trigger combo CID creation?
(Of course, now that we've got DSA, it wouldn't be nearly as hard to
fix the combo CID synchronization problem.)
--
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company