parseCheckAggregates vs. assign_query_collations

Started by Andrew Gierthalmost 7 years ago3 messages
#1Andrew Gierth
andrew@tao11.riddles.org.uk

Looking into a bug report on the -general list about grouping sets,
which turns out to be an issue of collation assignment: if the query has

CASE GROUPING(expr) WHEN 1 ...

then the expression is rejected as not matching the one in the GROUP BY
clause, because CASE already assigned collations to the expression (as a
special case in its transform function) while the rest of the query
hasn't yet had them assigned, because parseCheckAggregates gets run
before assign_query_collations.

I'll be looking into this in detail later, but right now, cam anyone
think of any reason why parseCheckAggregates couldn't be moved to after
assign_query_collations?

--
Andrew (irc:RhodiumToad)

#2Tom Lane
tgl@sss.pgh.pa.us
In reply to: Andrew Gierth (#1)
Re: parseCheckAggregates vs. assign_query_collations

Andrew Gierth <andrew@tao11.riddles.org.uk> writes:

Looking into a bug report on the -general list about grouping sets,
which turns out to be an issue of collation assignment: if the query has
CASE GROUPING(expr) WHEN 1 ...
then the expression is rejected as not matching the one in the GROUP BY
clause, because CASE already assigned collations to the expression (as a
special case in its transform function) while the rest of the query
hasn't yet had them assigned, because parseCheckAggregates gets run
before assign_query_collations.

Bleah.

I'll be looking into this in detail later, but right now, cam anyone
think of any reason why parseCheckAggregates couldn't be moved to after
assign_query_collations?

I never particularly liked assign_query_collations, as a matter of overall
system design. I'd prefer to nuke that and instead require collation
assignment to be done per-expression, ie at the end of transformExpr and
similar places. Now that we've seen this example, it's fairly clear why
collation assignment really should be considered an integral part of
expression parsing. Who's to say there aren't more gotchas of this sort
waiting to bite us? Also, if it were integrated into transformExpr as
it should have been to begin with, we would not have the need for quite
so many random calls to assign_expr_collations, with consequent bugs of
omission, cf 7a28e9aa0.

regards, tom lane

#3Andrew Gierth
andrew@tao11.riddles.org.uk
In reply to: Tom Lane (#2)
Re: parseCheckAggregates vs. assign_query_collations

"Tom" == Tom Lane <tgl@sss.pgh.pa.us> writes:

I'll be looking into this in detail later, but right now, cam anyone
think of any reason why parseCheckAggregates couldn't be moved to
after assign_query_collations?

Tom> I never particularly liked assign_query_collations, as a matter of
Tom> overall system design. I'd prefer to nuke that and instead require
Tom> collation assignment to be done per-expression, ie at the end of
Tom> transformExpr and similar places. Now that we've seen this
Tom> example, it's fairly clear why collation assignment really should
Tom> be considered an integral part of expression parsing. Who's to say
Tom> there aren't more gotchas of this sort waiting to bite us? Also,
Tom> if it were integrated into transformExpr as it should have been to
Tom> begin with, we would not have the need for quite so many random
Tom> calls to assign_expr_collations, with consequent bugs of omission,
Tom> cf 7a28e9aa0.

Sure, this might be the right approach going forward. But right now we
need a back-patchable fix, and the above sounds a bit too intrusive for
that.

Turns out the issue can be reproduced without grouping sets too:

select case a||'' when '1' then 0 else 1 end
from (select (select random()::text) as a) s group by a||'';
ERROR: column "s.a" must appear in the GROUP BY clause or be used in an aggregate function

select case when a||'' = '1' then 0 else 1 end
from (select (select random()::text) as a) s group by a||''; -- works

--
Andrew (irc:RhodiumToad)