ERROR: ORDER/GROUP BY expression not found in targetlist
Hi,
What is going on here?
postgres=# create table logs as select generate_series(1,
1000000)::text as data;
SELECT 1000000
postgres=# insert into logs select * from logs;
INSERT 0 1000000
postgres=# insert into logs select * from logs;
INSERT 0 2000000
postgres=# insert into logs select * from logs;
INSERT 0 4000000
postgres=# insert into logs select * from logs;
INSERT 0 8000000
postgres=# insert into logs select * from logs;
INSERT 0 16000000
postgres=# analyze logs;
ANALYZE
postgres=# set max_parallel_workers_per_gather = 0;
SET
postgres=# explain select length(data) from logs group by length(data);
┌────────────────────────────────────────────────────────────────────────────┐
│ QUERY PLAN │
├────────────────────────────────────────────────────────────────────────────┤
│ Group (cost=5843157.07..6005642.13 rows=993989 width=4) │
│ Group Key: (length(data)) │
│ -> Sort (cost=5843157.07..5923157.11 rows=32000018 width=4) │
│ Sort Key: (length(data)) │
│ -> Seq Scan on logs (cost=0.00..541593.22 rows=32000018 width=4) │
└────────────────────────────────────────────────────────────────────────────┘
(5 rows)
postgres=# set max_parallel_workers_per_gather = 2;
SET
postgres=# explain select length(data) from logs group by length(data);
ERROR: ORDER/GROUP BY expression not found in targetlist
--
Thomas Munro
http://www.enterprisedb.com
--
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers
Hi,
I tried to run tpc-h queries, but some queries failed by the error on last week.
Subject: Re: [HACKERS] ORDER/GROUP BY expression not found in targetlist
Date: Thu, 09 Jun 2016 12:08:01 +0900
Today, I try it again by changing max_parallel_workers_per_gather parameter.
The result of Q1 is bellow. Is this bug in the Open items on wiki?
-------------
postgres=# set max_parallel_workers_per_gather = 0;
SET
postgres=# \i queries/1.explain.sql
QUERY PLAN
-------------------------------------------------------------------------------------------------------------------------------------
Limit (cost=43474.03..43474.03 rows=1 width=236) (actual time=1039.583..1039.583 rows=1 loops=1)
-> Sort (cost=43474.03..43474.04 rows=6 width=236) (actual time=1039.583..1039.583 rows=1 loops=1)
Sort Key: l_returnflag, l_linestatus
Sort Method: top-N heapsort Memory: 25kB
-> HashAggregate (cost=43473.83..43474.00 rows=6 width=236) (actual time=1039.529..1039.534 rows=4 loops=1)
Group Key: l_returnflag, l_linestatus
-> Seq Scan on lineitem (cost=0.00..19668.15 rows=595142 width=25) (actual time=0.048..125.332 rows=595224 loops=1)
Filter: (l_shipdate <= '1998-09-22 00:00:00'::timestamp without time zone)
Rows Removed by Filter: 5348
Planning time: 0.180 ms
Execution time: 1039.758 ms
(11 rows)
postgres=# set max_parallel_workers_per_gather = default;
SET
postgres=# \i queries/1.explain.sql
ERROR: ORDER/GROUP BY expression not found in targetlist
-------------
Regards,
Tatsuro Yamada
NTT OSS Center
On 2016/06/13 12:39, Thomas Munro wrote:
Hi,
What is going on here?
postgres=# create table logs as select generate_series(1,
1000000)::text as data;
SELECT 1000000
postgres=# insert into logs select * from logs;
INSERT 0 1000000
postgres=# insert into logs select * from logs;
INSERT 0 2000000
postgres=# insert into logs select * from logs;
INSERT 0 4000000
postgres=# insert into logs select * from logs;
INSERT 0 8000000
postgres=# insert into logs select * from logs;
INSERT 0 16000000
postgres=# analyze logs;
ANALYZE
postgres=# set max_parallel_workers_per_gather = 0;
SET
postgres=# explain select length(data) from logs group by length(data);
┌────────────────────────────────────────────────────────────────────────────┐
│ QUERY PLAN │
├────────────────────────────────────────────────────────────────────────────┤
│ Group (cost=5843157.07..6005642.13 rows=993989 width=4) │
│ Group Key: (length(data)) │
│ -> Sort (cost=5843157.07..5923157.11 rows=32000018 width=4) │
│ Sort Key: (length(data)) │
│ -> Seq Scan on logs (cost=0.00..541593.22 rows=32000018 width=4) │
└────────────────────────────────────────────────────────────────────────────┘
(5 rows)postgres=# set max_parallel_workers_per_gather = 2;
SET
postgres=# explain select length(data) from logs group by length(data);
ERROR: ORDER/GROUP BY expression not found in targetlist
--
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers
On Mon, Jun 13, 2016 at 4:16 PM, Tatsuro Yamada
<yamada.tatsuro@lab.ntt.co.jp> wrote:
I tried to run tpc-h queries, but some queries failed by the error on last
week.Subject: Re: [HACKERS] ORDER/GROUP BY expression not found in targetlist
Date: Thu, 09 Jun 2016 12:08:01 +0900
Right, I saw that thread which involved the same error message:
/messages/by-id/20160526021235.w4nq7k3gnheg7vit@alap3.anarazel.de
... but that seems to be a different problem than the one you and I
have seen, it involved repeated references to columns in the tlist.
It was fixed with this commit:
commit aeb9ae6457865c8949641d71a9523374d843a418
Author: Tom Lane <tgl@sss.pgh.pa.us>
Date: Thu May 26 14:52:24 2016 -0400
Disable physical tlist if any Var would need multiple sortgroupref labels.
Today, I try it again by changing max_parallel_workers_per_gather parameter.
The result of Q1 is bellow. Is this bug in the Open items on wiki?
I don't see it on the Open Issues list.
--
Thomas Munro
http://www.enterprisedb.com
--
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers
On 13 June 2016 at 15:39, Thomas Munro <thomas.munro@enterprisedb.com> wrote:
What is going on here?
...
postgres=# set max_parallel_workers_per_gather = 2;
SET
postgres=# explain select length(data) from logs group by length(data);
ERROR: ORDER/GROUP BY expression not found in targetlist
Seems like this was caused by 04ae11f62e643e07c411c4935ea6af46cb112aa9
I missed the discussion on this commit, so I'll go look for that now.
--
David Rowley http://www.2ndQuadrant.com/
PostgreSQL Development, 24x7 Support, Training & Services
--
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers
Hi,
Subject: Re: [HACKERS] ORDER/GROUP BY expression not found in targetlist
Date: Thu, 09 Jun 2016 12:08:01 +0900Right, I saw that thread which involved the same error message:
/messages/by-id/20160526021235.w4nq7k3gnheg7vit@alap3.anarazel.de
... but that seems to be a different problem than the one you and I
have seen, it involved repeated references to columns in the tlist.
It was fixed with this commit:commit aeb9ae6457865c8949641d71a9523374d843a418
Author: Tom Lane <tgl@sss.pgh.pa.us>
Date: Thu May 26 14:52:24 2016 -0400Disable physical tlist if any Var would need multiple sortgroupref labels.
I use this version:f721e94 to run tpc-h on last week.
This patch is commited at Jun 8. If it fixed, I didn't get the error.
PG96beta1
commit: f721e94b5f360391fc3ffe183bf697a0441e9184
-----
commit f721e94b5f360391fc3ffe183bf697a0441e9184
Author: Robert Haas <rhaas@postgresql.org>
Date: Wed Jun 8 08:37:06 2016 -0400
Fix typo.
Amit Langote
-----
I got mistake to write an e-mail to -hackers on last week. :-<
I should have written this.
The bug has not fixed by Tom Lane's patch: commit aeb9ae6.
Because I got the same error using tpc-h.
Today, I try it again by changing max_parallel_workers_per_gather parameter.
The result of Q1 is bellow. Is this bug in the Open items on wiki?I don't see it on the Open Issues list.
I checked the list, but the bug is not listed.
https://wiki.postgresql.org/wiki/PostgreSQL_9.6_Open_Items
Regards,
Tatsuro Yamada
NTT OSS Center
--
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers
On Mon, Jun 13, 2016 at 2:42 PM, Tatsuro Yamada
<yamada.tatsuro@lab.ntt.co.jp> wrote:
I got mistake to write an e-mail to -hackers on last week. :-<
I should have written this.The bug has not fixed by Tom Lane's patch: commit aeb9ae6.
Because I got the same error using tpc-h.
This looks like a different regression..
Today, I try it again by changing max_parallel_workers_per_gather
parameter.
The result of Q1 is bellow. Is this bug in the Open items on wiki?I don't see it on the Open Issues list.
I checked the list, but the bug is not listed.
https://wiki.postgresql.org/wiki/PostgreSQL_9.6_Open_Items
And the winner is:
04ae11f62e643e07c411c4935ea6af46cb112aa9 is the first bad commit
commit 04ae11f62e643e07c411c4935ea6af46cb112aa9
Author: Robert Haas <rhaas@postgresql.org>
Date: Fri Jun 3 14:27:33 2016 -0400
I am adding that to the list of open items.
--
Michael
--
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers
On Mon, Jun 13, 2016 at 11:05 AM, David Rowley <david.rowley@2ndquadrant.com>
wrote:
On 13 June 2016 at 15:39, Thomas Munro <thomas.munro@enterprisedb.com>
wrote:
What is going on here?
...
postgres=# set max_parallel_workers_per_gather = 2;
SET
postgres=# explain select length(data) from logs group by length(data);
ERROR: ORDER/GROUP BY expression not found in targetlistSeems like this was caused by 04ae11f62e643e07c411c4935ea6af46cb112aa9
In create_grouping_paths(), we are building partial_grouping_path and same
is used for gather path and other grouping paths (for partial paths).
However, we don't use it for partial path list and sort path due to which
path target for Sort path is different from what we have expected. Is
there a problem in applying partial_grouping_path for partial pathlist?
Attached patch just does that and I don't see error with patch.
With Regards,
Amit Kapila.
EnterpriseDB: http://www.enterprisedb.com
Attachments:
apply_partial_pathtarget_partial_pathlist_v1.patchapplication/octet-stream; name=apply_partial_pathtarget_partial_pathlist_v1.patchDownload+11-0
On 2016/06/13 15:52, Michael Paquier wrote:
On Mon, Jun 13, 2016 at 2:42 PM, Tatsuro Yamada
<yamada.tatsuro@lab.ntt.co.jp> wrote:I got mistake to write an e-mail to -hackers on last week. :-<
I should have written this.The bug has not fixed by Tom Lane's patch: commit aeb9ae6.
Because I got the same error using tpc-h.This looks like a different regression..
I understand it now, thanks. :-)
I checked the list, but the bug is not listed.
https://wiki.postgresql.org/wiki/PostgreSQL_9.6_Open_ItemsAnd the winner is:
04ae11f62e643e07c411c4935ea6af46cb112aa9 is the first bad commit
commit 04ae11f62e643e07c411c4935ea6af46cb112aa9
Author: Robert Haas <rhaas@postgresql.org>
Date: Fri Jun 3 14:27:33 2016 -0400I am adding that to the list of open items.
Oh...
I'll try to run tpc-h if I got a new patch which fixes the bug. :)
Thanks,
Tatsuro Yamada
NTT OSS Center
--
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers
Hi,
I applied your patch and run tpc-h.
Then I got new errors on Q4,12,17 and 19.
ERROR: Aggref found in non-Agg plan node.
See bellow,
----------------------------------
postgres=# \i queries/4.explain.sql
ERROR: Aggref found in non-Agg plan node
STATEMENT: explain analyze select
o_orderpriority,
count(*) as order_count
from
orders
where
o_orderdate >= date '1993-10-01'
and o_orderdate < date '1993-10-01' + interval '3' month
and exists (
select
*
from
lineitem
where
l_orderkey = o_orderkey
and l_commitdate < l_receiptdate
)
group by
o_orderpriority
order by
o_orderpriority
LIMIT 1;
----------------------------------
Regards,
Tatsuro Yamada
NTT OSS Center
On 2016/06/13 16:18, Amit Kapila wrote:
On Mon, Jun 13, 2016 at 11:05 AM, David Rowley <david.rowley@2ndquadrant.com <mailto:david.rowley@2ndquadrant.com>> wrote:
On 13 June 2016 at 15:39, Thomas Munro <thomas.munro@enterprisedb.com <mailto:thomas.munro@enterprisedb.com>> wrote:
What is going on here?
...
postgres=# set max_parallel_workers_per_gather = 2;
SET
postgres=# explain select length(data) from logs group by length(data);
ERROR: ORDER/GROUP BY expression not found in targetlistSeems like this was caused by 04ae11f62e643e07c411c4935ea6af46cb112aa9
In create_grouping_paths(), we are building partial_grouping_path and same is used for gather path and other grouping paths (for partial paths). However, we don't use it for partial path list and sort path due to which path target for Sort path is different from what we have expected. Is there a problem in applying partial_grouping_path for partial pathlist? Attached patch just does that and I don't see error with patch.
With Regards,
Amit Kapila.
EnterpriseDB: http://www.enterprisedb.com <http://www.enterprisedb.com/>
--
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers
On Mon, Jun 13, 2016 at 3:18 AM, Amit Kapila <amit.kapila16@gmail.com> wrote:
In create_grouping_paths(), we are building partial_grouping_path and same
is used for gather path and other grouping paths (for partial paths).
However, we don't use it for partial path list and sort path due to which
path target for Sort path is different from what we have expected. Is there
a problem in applying partial_grouping_path for partial pathlist?
Attached patch just does that and I don't see error with patch.
It doesn't seem like a good idea to destructive modify
input_rel->partial_pathlist. Applying the projection to each path
before using it would probably be better.
--
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company
--
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers
Robert Haas <robertmhaas@gmail.com> writes:
On Mon, Jun 13, 2016 at 3:18 AM, Amit Kapila <amit.kapila16@gmail.com> wrote:
In create_grouping_paths(), we are building partial_grouping_path and same
is used for gather path and other grouping paths (for partial paths).
However, we don't use it for partial path list and sort path due to which
path target for Sort path is different from what we have expected. Is there
a problem in applying partial_grouping_path for partial pathlist?
Attached patch just does that and I don't see error with patch.
It doesn't seem like a good idea to destructive modify
input_rel->partial_pathlist. Applying the projection to each path
before using it would probably be better.
I think the real question here is why the code removed by 04ae11f62
was wrong. It was unsafe to use apply_projection_to_path, certainly,
but using create_projection_path directly would have avoided the
stated problem. And it's very unclear that this new patch doesn't
bring back that bug in a different place.
I am not very happy that neither 04ae11f62 nor this patch include
any regression test case proving that a problem existed and has
been fixed.
regards, tom lane
--
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers
On Mon, Jun 13, 2016 at 7:51 PM, Tom Lane <tgl@sss.pgh.pa.us> wrote:
Robert Haas <robertmhaas@gmail.com> writes:
On Mon, Jun 13, 2016 at 3:18 AM, Amit Kapila <amit.kapila16@gmail.com>
wrote:
In create_grouping_paths(), we are building partial_grouping_path and
same
is used for gather path and other grouping paths (for partial paths).
However, we don't use it for partial path list and sort path due to
which
path target for Sort path is different from what we have expected. Is
there
a problem in applying partial_grouping_path for partial pathlist?
Attached patch just does that and I don't see error with patch.It doesn't seem like a good idea to destructive modify
input_rel->partial_pathlist. Applying the projection to each path
before using it would probably be better.I think the real question here is why the code removed by 04ae11f62
was wrong. It was unsafe to use apply_projection_to_path, certainly,
but using create_projection_path directly would have avoided the
stated problem. And it's very unclear that this new patch doesn't
bring back that bug in a different place.
This new patch still doesn't seem to be right, but it won't bring back the
original problem because apply_projection_to_path will be only done if
grouped_rel is parallel_safe which means it doesn't have any
parallel-unsafe or parallel-restricted clause in quals or target list.
I am not very happy that neither 04ae11f62 nor this patch include
any regression test case proving that a problem existed and has
been fixed.
It is slightly tricky to write a reproducible parallel-query test, but
point taken and I think we should try to have a test unless such a test is
really time consuming.
With Regards,
Amit Kapila.
EnterpriseDB: http://www.enterprisedb.com
On Mon, Jun 13, 2016 at 7:17 PM, Robert Haas <robertmhaas@gmail.com> wrote:
On Mon, Jun 13, 2016 at 3:18 AM, Amit Kapila <amit.kapila16@gmail.com>
wrote:
In create_grouping_paths(), we are building partial_grouping_path and
same
is used for gather path and other grouping paths (for partial paths).
However, we don't use it for partial path list and sort path due to
which
path target for Sort path is different from what we have expected. Is
there
a problem in applying partial_grouping_path for partial pathlist?
Attached patch just does that and I don't see error with patch.It doesn't seem like a good idea to destructive modify
input_rel->partial_pathlist. Applying the projection to each path
before using it would probably be better.
Do you mean to have it when we generate a complete GroupAgg Path atop of
the cheapest partial path?
With Regards,
Amit Kapila.
EnterpriseDB: http://www.enterprisedb.com
Amit Kapila <amit.kapila@enterprisedb.com> writes:
On Mon, Jun 13, 2016 at 7:51 PM, Tom Lane <tgl@sss.pgh.pa.us> wrote:
I think the real question here is why the code removed by 04ae11f62
was wrong. It was unsafe to use apply_projection_to_path, certainly,
but using create_projection_path directly would have avoided the
stated problem. And it's very unclear that this new patch doesn't
bring back that bug in a different place.
This new patch still doesn't seem to be right, but it won't bring back the
original problem because apply_projection_to_path will be only done if
grouped_rel is parallel_safe which means it doesn't have any
parallel-unsafe or parallel-restricted clause in quals or target list.
The problem cited in 04ae11f62's commit message is that
apply_projection_to_path would overwrite the paths' pathtargets in-place,
causing problems if they'd been used for other purposes elsewhere. I do
not share your confidence that using apply_projection_to_path within
create_grouping_paths is free of such a hazard.
regards, tom lane
--
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers
Amit Kapila <amit.kapila@enterprisedb.com> writes:
It is slightly tricky to write a reproducible parallel-query test, but
point taken and I think we should try to have a test unless such a test is
really time consuming.
BTW, decent regression tests could be written without the need to create
enormous tables if the minimum rel size in create_plain_partial_paths()
could be configured to something less than 1000 blocks. I think it's
fairly crazy that that arbitrary constant is hard-wired anyway. Should
we make it a GUC?
regards, tom lane
--
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers
I wrote:
Amit Kapila <amit.kapila@enterprisedb.com> writes:
It is slightly tricky to write a reproducible parallel-query test, but
point taken and I think we should try to have a test unless such a test is
really time consuming.
BTW, decent regression tests could be written without the need to create
enormous tables if the minimum rel size in create_plain_partial_paths()
could be configured to something less than 1000 blocks. I think it's
fairly crazy that that arbitrary constant is hard-wired anyway. Should
we make it a GUC?
Just as an experiment to see what would happen, I did
- int parallel_threshold = 1000;
+ int parallel_threshold = 1;
and ran the regression tests. I got a core dump in the window.sql test:
Program terminated with signal 11, Segmentation fault.
#0 0x0000000000664dbc in make_partialgroup_input_target (root=0x1795018,
input_rel=0x17957a8, target=0x17bf228, rollup_lists=0x0,
rollup_groupclauses=0x0) at planner.c:4307
4307 Index sgref = final_target->sortgrouprefs[i];
(gdb) bt
#0 0x0000000000664dbc in make_partialgroup_input_target (root=0x1795018,
input_rel=0x17957a8, target=0x17bf228, rollup_lists=0x0,
rollup_groupclauses=0x0) at planner.c:4307
#1 create_grouping_paths (root=0x1795018, input_rel=0x17957a8,
target=0x17bf228, rollup_lists=0x0, rollup_groupclauses=0x0)
at planner.c:3420
#2 0x0000000000667405 in grouping_planner (root=0x1795018,
inheritance_update=0 '\000', tuple_fraction=0) at planner.c:1794
#3 0x0000000000668c80 in subquery_planner (glob=<value optimized out>,
parse=0x1703580, parent_root=<value optimized out>,
hasRecursion=<value optimized out>, tuple_fraction=0) at planner.c:769
#4 0x0000000000668ea5 in standard_planner (parse=0x1703580,
cursorOptions=256, boundParams=0x0) at planner.c:308
#5 0x00000000006691b6 in planner (parse=<value optimized out>,
cursorOptions=<value optimized out>, boundParams=<value optimized out>)
at planner.c:178
#6 0x00000000006fb069 in pg_plan_query (querytree=0x1703580,
cursorOptions=256, boundParams=0x0) at postgres.c:798
(gdb) p debug_query_string
$1 = 0x1702078 "SELECT SUM(COUNT(f1)) OVER () FROM int4_tbl WHERE f1=42;"
which I think may be another manifestation of the failure-to-apply-proper-
pathtarget issue we're looking at in this thread. Or maybe it's just
an unjustified assumption in make_partialgroup_input_target that the
input path must always have some sortgrouprefs assigned.
Before getting to that point, there was also an unexplainable plan change:
*** /home/postgres/pgsql/src/test/regress/expected/aggregates.out Thu Apr 7 21:13:14 2016
--- /home/postgres/pgsql/src/test/regress/results/aggregates.out Mon Jun 13 11:54:01 2016
***************
*** 577,590 ****
explain (costs off)
select max(unique1) from tenk1 where unique1 > 42000;
! QUERY PLAN
! ---------------------------------------------------------------------------
! Result
! InitPlan 1 (returns $0)
! -> Limit
! -> Index Only Scan Backward using tenk1_unique1 on tenk1
! Index Cond: ((unique1 IS NOT NULL) AND (unique1 > 42000))
! (5 rows)
select max(unique1) from tenk1 where unique1 > 42000;
max
--- 577,588 ----
explain (costs off)
select max(unique1) from tenk1 where unique1 > 42000;
! QUERY PLAN
! ----------------------------------------------------
! Aggregate
! -> Index Only Scan using tenk1_unique1 on tenk1
! Index Cond: (unique1 > 42000)
! (3 rows)
select max(unique1) from tenk1 where unique1 > 42000;
max
I would not be surprised at a change to a parallel-query plan, but there's
no parallelism here, so what happened? This looks like a bug to me.
(Also, doing this query without COSTS OFF shows that the newly selected
plan actually has a greater estimated cost than the expected plan, which
makes it definitely a bug.)
At this point I'm pretty firmly convinced that we should have a way to
run the regression tests with parallel scans considered for even very
small tables. If someone doesn't want that way to be a GUC, you'd better
propose another solution.
regards, tom lane
--
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers
On Mon, Jun 13, 2016 at 11:02 AM, Tom Lane <tgl@sss.pgh.pa.us> wrote:
Amit Kapila <amit.kapila@enterprisedb.com> writes:
It is slightly tricky to write a reproducible parallel-query test, but
point taken and I think we should try to have a test unless such a test is
really time consuming.BTW, decent regression tests could be written without the need to create
enormous tables if the minimum rel size in create_plain_partial_paths()
could be configured to something less than 1000 blocks. I think it's
fairly crazy that that arbitrary constant is hard-wired anyway. Should
we make it a GUC?
That was proposed before, and I didn't do it mostly because I couldn't
think of a name for it that didn't sound unbelievably corny. Also,
the whole way that algorithm works is kind of a hack and probably
needs to be overhauled entirely in some future release. I'm worried
about having the words "backward compatibility" thrown in my face when
it's time to improve this logic. But aside from those two issues I'm
OK with exposing a knob.
--
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company
--
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers
Robert Haas <robertmhaas@gmail.com> writes:
On Mon, Jun 13, 2016 at 11:02 AM, Tom Lane <tgl@sss.pgh.pa.us> wrote:
BTW, decent regression tests could be written without the need to create
enormous tables if the minimum rel size in create_plain_partial_paths()
could be configured to something less than 1000 blocks. I think it's
fairly crazy that that arbitrary constant is hard-wired anyway. Should
we make it a GUC?
That was proposed before, and I didn't do it mostly because I couldn't
think of a name for it that didn't sound unbelievably corny.
min_parallel_relation_size, or min_parallelizable_relation_size, or
something like that?
Also,
the whole way that algorithm works is kind of a hack and probably
needs to be overhauled entirely in some future release. I'm worried
about having the words "backward compatibility" thrown in my face when
it's time to improve this logic. But aside from those two issues I'm
OK with exposing a knob.
I agree it's a hack, and I don't want to expose anything about the
number-of-workers scaling behavior, for precisely that reason. But a
threshold on the size of a table to consider parallel scans for at all
doesn't seem unreasonable.
regards, tom lane
--
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers
On Mon, Jun 13, 2016 at 1:06 PM, Tom Lane <tgl@sss.pgh.pa.us> wrote:
Robert Haas <robertmhaas@gmail.com> writes:
On Mon, Jun 13, 2016 at 11:02 AM, Tom Lane <tgl@sss.pgh.pa.us> wrote:
BTW, decent regression tests could be written without the need to create
enormous tables if the minimum rel size in create_plain_partial_paths()
could be configured to something less than 1000 blocks. I think it's
fairly crazy that that arbitrary constant is hard-wired anyway. Should
we make it a GUC?That was proposed before, and I didn't do it mostly because I couldn't
think of a name for it that didn't sound unbelievably corny.min_parallel_relation_size, or min_parallelizable_relation_size, or
something like that?
Sure.
Also,
the whole way that algorithm works is kind of a hack and probably
needs to be overhauled entirely in some future release. I'm worried
about having the words "backward compatibility" thrown in my face when
it's time to improve this logic. But aside from those two issues I'm
OK with exposing a knob.I agree it's a hack, and I don't want to expose anything about the
number-of-workers scaling behavior, for precisely that reason. But a
threshold on the size of a table to consider parallel scans for at all
doesn't seem unreasonable.
OK.
--
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company
--
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers
I wrote:
... there was also an unexplainable plan change:
*** /home/postgres/pgsql/src/test/regress/expected/aggregates.out Thu Apr 7 21:13:14 2016 --- /home/postgres/pgsql/src/test/regress/results/aggregates.out Mon Jun 13 11:54:01 2016 *************** *** 577,590 ****
explain (costs off)
select max(unique1) from tenk1 where unique1 > 42000;
! QUERY PLAN
! ---------------------------------------------------------------------------
! Result
! InitPlan 1 (returns $0)
! -> Limit
! -> Index Only Scan Backward using tenk1_unique1 on tenk1
! Index Cond: ((unique1 IS NOT NULL) AND (unique1 > 42000))
! (5 rows)
select max(unique1) from tenk1 where unique1 > 42000; max --- 577,588 ----
explain (costs off)
select max(unique1) from tenk1 where unique1 > 42000;
! QUERY PLAN
! ----------------------------------------------------
! Aggregate
! -> Index Only Scan using tenk1_unique1 on tenk1
! Index Cond: (unique1 > 42000)
! (3 rows)
select max(unique1) from tenk1 where unique1 > 42000;
max
I would not be surprised at a change to a parallel-query plan, but there's
no parallelism here, so what happened? This looks like a bug to me.
(Also, doing this query without COSTS OFF shows that the newly selected
plan actually has a greater estimated cost than the expected plan, which
makes it definitely a bug.)
I looked into this and found that the costs are considered fuzzily the
same, and then add_path prefers the slightly-worse path on the grounds
that it is marked parallel_safe while the MinMaxAgg path is not. It seems
to me that there is some fuzzy thinking going on there. On exactly what
grounds is a path to be preferred merely because it is parallel safe, and
not actually parallelized? Or perhaps the question to ask is whether a
MinMaxAgg path can be marked parallel-safe.
regards, tom lane
--
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers