query_is_distinct_for does not take into account set returning functions

Started by David Rowleyover 11 years ago2 messages
#1David Rowley
dgrowleyml@gmail.com
1 attachment(s)

Over here ->
/messages/by-id/6351.1404663344@sss.pgh.pa.us Tom
noted that create_unique_path did not check for set returning functions.

Tom Wrote:

I notice that create_unique_path is not paying attention to the question
of whether the subselect's tlist contains SRFs or volatile functions.
It's possible that that's a pre-existing bug.

I looked at this a bit and I can confirm that it does not behave as it
should do. Take the following as an example:

create table x (id int primary key);
create table y (n int not null);

insert into x values(1);
insert into y values(1);

select * from x where (id,id) in(select n,generate_series(1,2) / 10 + 1 g
from y);
id
----
1
(1 row)

select * from x where (id,id) in(select n,generate_series(1,2) / 10 + 1 g
from y group by n);
id
----
1
1
(2 rows)

The 2nd query does group by n, so query_is_distinct_for returns true,
therefore the outer query think's it's ok to perform an INNER JOIN rather
than a SEMI join, which is this case produces an extra record.

I think we should probably include the logic to test for set returning
functions into query_is_distinct_for.

The attached fixes the problem.

Regards

David Rowley

Attachments:

query_is_distinct_for_fix.patchapplication/octet-stream; name=query_is_distinct_for_fix.patchDownload
diff --git a/src/backend/optimizer/util/pathnode.c b/src/backend/optimizer/util/pathnode.c
index f701954..d9f4776 100644
--- a/src/backend/optimizer/util/pathnode.c
+++ b/src/backend/optimizer/util/pathnode.c
@@ -1473,6 +1473,15 @@ query_is_distinct_for(Query *query, List *colnos, List *opids)
 	Assert(list_length(colnos) == list_length(opids));
 
 	/*
+	 * The presence of set returning functions in the query's target list can
+	 * cause multiple rows per group to exist, so we'll need to play it safe
+	 * here and report that the query is not distinct when set returning
+	 * functions exist.
+	 */
+	if (expression_returns_set((Node *) query->targetList))
+		return false;
+
+	/*
 	 * DISTINCT (including DISTINCT ON) guarantees uniqueness if all the
 	 * columns in the DISTINCT clause appear in colnos and operator semantics
 	 * match.
#2Tom Lane
tgl@sss.pgh.pa.us
In reply to: David Rowley (#1)
Re: query_is_distinct_for does not take into account set returning functions

David Rowley <dgrowleyml@gmail.com> writes:

I think we should probably include the logic to test for set returning
functions into query_is_distinct_for.

It strikes me that there's only a problem if the SRF is in a tlist entry
that is not one of the DISTINCT or GROUP BY columns, respectively. It
may not be worth the extra complexity to figure that out, though.

regards, tom lane

--
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers