ORDER/GROUP BY expression not found in targetlist
Hi,
trying to reproduce a performance problem I just found:
=# CREATE TABLE twocol(col01 int, col02 int);
=# SELECT DISTINCT col01, col02, col01 FROM twocol ;
ERROR: XX000: ORDER/GROUP BY expression not found in targetlist
LOCATION: get_sortgroupref_tle, tlist.c:341
which appears to be a 9.6 regression, presumable fallout from the path
restructuring.
Greetings,
Andres Freund
--
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers
On Wed, May 25, 2016 at 7:12 PM, Andres Freund <andres@anarazel.de> wrote:
=# CREATE TABLE twocol(col01 int, col02 int);
=# SELECT DISTINCT col01, col02, col01 FROM twocol ;
ERROR: XX000: ORDER/GROUP BY expression not found in targetlist
LOCATION: get_sortgroupref_tle, tlist.c:341which appears to be a 9.6 regression, presumable fallout from the path
restructuring.
It's surprising that SQL Smith didn't catch something with such simple
steps to reproduce.
--
Peter Geoghegan
--
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers
Peter Geoghegan writes:
On Wed, May 25, 2016 at 7:12 PM, Andres Freund <andres@anarazel.de> wrote:
=# CREATE TABLE twocol(col01 int, col02 int);
=# SELECT DISTINCT col01, col02, col01 FROM twocol ;
ERROR: XX000: ORDER/GROUP BY expression not found in targetlist
LOCATION: get_sortgroupref_tle, tlist.c:341which appears to be a 9.6 regression, presumable fallout from the path
restructuring.It's surprising that SQL Smith didn't catch something with such simple
steps to reproduce.
I removed distinct relatively early because it causes a large part of
queries to fail due to it not finding an equality operator it likes. It
seems to be more picky about the equality operator than, say, joins.
I'm sure it has a good reason to do so?
regression=> select distinct f1 from path_tbl;
ERROR: could not identify an equality operator for type path
LINE 1: select distinct f1 from path_tbl;
regression=> \do =
-[ RECORD 38 ]-+----------------------------
Schema | pg_catalog
Name | =
Left arg type | path
Right arg type | path
Result type | boolean
Description | equal
--
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers
Andreas Seltenreich <seltenreich@gmx.de> writes:
Peter Geoghegan writes:
It's surprising that SQL Smith didn't catch something with such simple
steps to reproduce.
I removed distinct relatively early because it causes a large part of
queries to fail due to it not finding an equality operator it likes. It
seems to be more picky about the equality operator than, say, joins.
I'm sure it has a good reason to do so?
It's looking for an operator that is known to be semantically equality,
by virtue of being the equality member of a btree or hash opclass.
Type path has no such opclass unfortunately. But when you write "a = b"
that just looks for an operator named "=".
regards, tom lane
--
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers
Tom Lane writes:
Andreas Seltenreich <seltenreich@gmx.de> writes:
Peter Geoghegan writes:
It's surprising that SQL Smith didn't catch something with such simple
steps to reproduce.I removed distinct relatively early because it causes a large part of
queries to fail due to it not finding an equality operator it likes. It
seems to be more picky about the equality operator than, say, joins.
I'm sure it has a good reason to do so?It's looking for an operator that is known to be semantically equality,
by virtue of being the equality member of a btree or hash opclass.
Type path has no such opclass unfortunately.
As do lots of data types in the regression db while still having an
operator providing semantic equivalence. I was hoping for someone to
question that status quo. Naively I'd say an equivalence flag is
missing in the catalog that is independent of opclasses.
regards
Andreas
--
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers
Andreas Seltenreich <seltenreich@gmx.de> writes:
Tom Lane writes:
It's looking for an operator that is known to be semantically equality,
by virtue of being the equality member of a btree or hash opclass.
Type path has no such opclass unfortunately.
As do lots of data types in the regression db while still having an
operator providing semantic equivalence. I was hoping for someone to
question that status quo. Naively I'd say an equivalence flag is
missing in the catalog that is independent of opclasses.
[ shrug... ] I see little wrong with that status quo. For this
particular use-case, there are two ways we could implement DISTINCT: one
of them requires sorting, and the other requires hashing. So you would
need to provide that opclass infrastructure even if there were some other
way of identifying the operator that means equality.
Type path and the other geometric types lack any natural sort order so
it's hard to imagine making a default btree opclass for them. But a
default hash opclass might not be out of reach, given an exact equality
operator.
Another problem with the geometric types is that long ago somebody
invented "=" operators for most of them that have little to do with what
anyone would consider equality. The "path = path" operator just compares
whether the paths have the same number of points. A lot of the other ones
compare areas. It'd be hard to justify marking any of them as default
equality for the type.
regards, tom lane
--
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers
Andres Freund <andres@anarazel.de> writes:
trying to reproduce a performance problem I just found:
=# CREATE TABLE twocol(col01 int, col02 int);
=# SELECT DISTINCT col01, col02, col01 FROM twocol ;
ERROR: XX000: ORDER/GROUP BY expression not found in targetlist
LOCATION: get_sortgroupref_tle, tlist.c:341
which appears to be a 9.6 regression, presumable fallout from the path
restructuring.
Huh. The problem is that createplan.c is trying to apply the
physical-tlist optimization to the seqscan underneath the aggregate
node. That means that the output from the seqscan is just
"col01, col02", which means that col01 can only be decorated with
a single ressortgroupref ... but there are two ressortgrouprefs
for it as far as the groupClause is concerned. Only one gets applied
to the seqscan's tlist, and then later we fail because we don't find
the other one there. Conclusions:
* we need to back off the physical-tlist optimization in this case
* the code that transfers sortgroupref labels onto a tlist probably
ought to notice and complain if it's asked to put inconsistent labels
onto the same column.
I'm a little surprised that it's not discarding the third grouping
item as redundant ... but that's probably not something to mess with
right now. Prior versions don't appear to do that either.
regards, tom lane
--
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers
Hi,
I got same error by TPC-H: Q1,4,8,12 and 17.
I've attached results of the queries.
TPC-H (thanks to Tomas Vondra)
https://github.com/tvondra/pg_tpch
Datasize
Scale Factor: 1
PG96beta1
commit: f721e94b5f360391fc3ffe183bf697a0441e9184
Regards,
Tatsuro Yamada
NTT OSS Center
Show quoted text
On 2016/05/27 2:22, Tom Lane wrote:
Andres Freund <andres@anarazel.de> writes:
trying to reproduce a performance problem I just found:
=# CREATE TABLE twocol(col01 int, col02 int);
=# SELECT DISTINCT col01, col02, col01 FROM twocol ;
ERROR: XX000: ORDER/GROUP BY expression not found in targetlist
LOCATION: get_sortgroupref_tle, tlist.c:341which appears to be a 9.6 regression, presumable fallout from the path
restructuring.Huh. The problem is that createplan.c is trying to apply the
physical-tlist optimization to the seqscan underneath the aggregate
node. That means that the output from the seqscan is just
"col01, col02", which means that col01 can only be decorated with
a single ressortgroupref ... but there are two ressortgrouprefs
for it as far as the groupClause is concerned. Only one gets applied
to the seqscan's tlist, and then later we fail because we don't find
the other one there. Conclusions:* we need to back off the physical-tlist optimization in this case
* the code that transfers sortgroupref labels onto a tlist probably
ought to notice and complain if it's asked to put inconsistent labels
onto the same column.I'm a little surprised that it's not discarding the third grouping
item as redundant ... but that's probably not something to mess with
right now. Prior versions don't appear to do that either.regards, tom lane