COALESCE with single argument looks like identity function

Started by Maksim Milyutinabout 1 year ago5 messageshackers
Jump to latest
#1Maksim Milyutin
maksim.milyutin@tantorlabs.ru

Hello everyone!

I've noticed that COALESCE function doesn't converge to argument
expression if it is alone in argument list of COALESCE as part
simplification routine for expressions in planner. This might suppress
further useful transformations when non-strict ops are required from
some expression like converging OUTER JOIN to INNER one with WHERE qual
containing COALESCE over single column from inner side.

The patch of transformation in question for COALESCE is attached.

--
Best regard,
Maksim Milyutin

Attachments:

v1-0001-Simplify-COALESCE-with-single-argument.patchtext/x-patch; charset=UTF-8; name=v1-0001-Simplify-COALESCE-with-single-argument.patchDownload+2-1
#2Tom Lane
tgl@sss.pgh.pa.us
In reply to: Maksim Milyutin (#1)
Re: COALESCE with single argument looks like identity function

Maksim Milyutin <maksim.milyutin@tantorlabs.ru> writes:

I've noticed that COALESCE function doesn't converge to argument
expression if it is alone in argument list of COALESCE as part
simplification routine for expressions in planner. This might suppress
further useful transformations when non-strict ops are required from
some expression like converging OUTER JOIN to INNER one with WHERE qual
containing COALESCE over single column from inner side.

Seems like a reasonable idea --- it's probably a rare case, but the
check is cheap enough. I'd add some comments though.

Please add this to the open commitfest so we don't lose track of it.

regards, tom lane

#3Maksim Milyutin
maksim.milyutin@tantorlabs.ru
In reply to: Tom Lane (#2)
Re: COALESCE with single argument looks like identity function

On 4/11/25 17:00, Tom Lane wrote:

Maksim Milyutin<maksim.milyutin@tantorlabs.ru> writes:

I've noticed that COALESCE function doesn't converge to argument
expression if it is alone in argument list of COALESCE as part
simplification routine for expressions in planner. This might suppress
further useful transformations when non-strict ops are required from
some expression like converging OUTER JOIN to INNER one with WHERE qual
containing COALESCE over single column from inner side.

Seems like a reasonable idea --- it's probably a rare case, but the
check is cheap enough. I'd add some comments though.

Thanks for your comments.

Please add this to the open commitfest so we don't lose track of it.

Done. In regression tests I've replaced all COALESCEs with single
argument to ones with dummy second argument to preserve coalesce calls
as AFAICS their usages are intentional for wrapping attributes to
generate PHVs above.

Also I've noticed the issue in query (in join.sql test suite):

SELECT 1 FROM group_tbl t1
    LEFT JOIN (SELECT a c1, *COALESCE(a)* c2 FROM group_tbl t2) s ON TRUE
GROUP BY s.c1, s.c2

repeatable t2.a in GROUP BY clauses are not converged to single appearance:

                 QUERY PLAN
--------------------------------------------
 Group
   Group Key: t2.a, *t2.a*
   ->  Sort
         Sort Key: t2.a, *t2.a*
         ->  Nested Loop Left Join
               ->  Seq Scan on group_tbl t1
               ->  Seq Scan on group_tbl t2

IMO the cause is in PHV surrounding s.c2 that differentiates its
internal expression with the same first grouping key.

--
Best regard,
Maksim Milyutin

#4Maksim Milyutin
maksim.milyutin@tantorlabs.ru
In reply to: Maksim Milyutin (#3)
Re: COALESCE with single argument looks like identity function

Updated patchset is attached

On 4/14/25 17:25, Maksim Milyutin wrote:

On 4/11/25 17:00, Tom Lane wrote:

Maksim Milyutin<maksim.milyutin@tantorlabs.ru> writes:

I've noticed that COALESCE function doesn't converge to argument
expression if it is alone in argument list of COALESCE as part
simplification routine for expressions in planner. This might suppress
further useful transformations when non-strict ops are required from
some expression like converging OUTER JOIN to INNER one with WHERE qual
containing COALESCE over single column from inner side.

Seems like a reasonable idea --- it's probably a rare case, but the
check is cheap enough. I'd add some comments though.

Thanks for your comments.

Please add this to the open commitfest so we don't lose track of it.

Done. In regression tests I've replaced all COALESCEs with single
argument to ones with dummy second argument to preserve coalesce calls
as AFAICS their usages are intentional for wrapping attributes to
generate PHVs above.

Also I've noticed the issue in query (in join.sql test suite):

SELECT 1 FROM group_tbl t1
    LEFT JOIN (SELECT a c1, *COALESCE(a)* c2 FROM group_tbl t2) s ON TRUE
GROUP BY s.c1, s.c2

repeatable t2.a in GROUP BY clauses are not converged to single
appearance:

                 QUERY PLAN
--------------------------------------------
 Group
   Group Key: t2.a, *t2.a*
   ->  Sort
         Sort Key: t2.a, *t2.a*
         ->  Nested Loop Left Join
               ->  Seq Scan on group_tbl t1
               ->  Seq Scan on group_tbl t2

IMO the cause is in PHV surrounding s.c2 that differentiates its
internal expression with the same first grouping key.

--
Best regard,
Maksim Milyutin

Attachments:

v1-0002-Adjust-regression-tests.patchtext/x-patch; charset=UTF-8; name=v1-0002-Adjust-regression-tests.patchDownload+48-44
v1-0001-Simplify-COALESCE-with-single-argument.patchtext/x-patch; charset=UTF-8; name=v1-0001-Simplify-COALESCE-with-single-argument.patchDownload+7-1
#5Tom Lane
tgl@sss.pgh.pa.us
In reply to: Maksim Milyutin (#4)
Re: COALESCE with single argument looks like identity function

Maksim Milyutin <maksim.milyutin@tantorlabs.ru> writes:

Updated patchset is attached

Pushed with minor adjustments. Mainly, I didn't entirely trust
your substitutions of, eg, "COALESCE(q1)" to "COALESCE(q1, 0)".
That would produce a different result if q1 were NULL. I'm not
sure that that actually occurs in these regression tests, or
that it would affect the intent of the tests anyway. But we can
avoid having to think hard about that question by instead writing
"COALESCE(q1, q1)" and so on. That does provably give the same
result as before.

regards, tom lane