Pg17 Crash in Planning (Arrays + Casting + UDF)

Started by Paul Ramseyover 1 year ago6 messages
#1Paul Ramsey
pramsey@cleverelephant.ca

Hackers,

This extremely odd case [2]https://lists.osgeo.org/pipermail/postgis-devel/2024-October/030428.html came in via a report using a lot of PostGIS functions, but it can be reconfigured into a pure-PostgreSQL crasher [1]https://trac.osgeo.org/postgis/ticket/5793#comment:8.

CREATE TABLE n (i integer);

CREATE OR REPLACE FUNCTION add(integer)
RETURNS integer
AS 'SELECT 2 * $1 + 4 * $1'
LANGUAGE 'sql' IMMUTABLE STRICT PARALLEL SAFE;

SELECT add(array_length(array_agg(i)::numeric[],1)::integer) FROM n;

The stack trace shows it doesn’t get past planning, and in fact it doesn’t care if the table has data in it or not.

ATB,

P

[1]: https://trac.osgeo.org/postgis/ticket/5793#comment:8
[2]: https://lists.osgeo.org/pipermail/postgis-devel/2024-October/030428.html

#2Joe Conway
mail@joeconway.com
In reply to: Paul Ramsey (#1)
Re: Pg17 Crash in Planning (Arrays + Casting + UDF)

On 10/9/24 14:52, Paul Ramsey wrote:

Hackers,

This extremely odd case [2] came in via a report using a lot of PostGIS functions, but it can be reconfigured into a pure-PostgreSQL crasher [1].

CREATE TABLE n (i integer);

CREATE OR REPLACE FUNCTION add(integer)
RETURNS integer
AS 'SELECT 2 * $1 + 4 * $1'
LANGUAGE 'sql' IMMUTABLE STRICT PARALLEL SAFE;

SELECT add(array_length(array_agg(i)::numeric[],1)::integer) FROM n;

The stack trace shows it doesn’t get past planning, and in fact it doesn’t care if the table has data in it or not.

I can duplicate the crash on master and 17 stable branches, but not pg16
FWIW. That is as far as I have looked so far.

--
Joe Conway
PostgreSQL Contributors Team
RDS Open Source Databases
Amazon Web Services: https://aws.amazon.com

#3Tom Lane
tgl@sss.pgh.pa.us
In reply to: Paul Ramsey (#1)
Re: Pg17 Crash in Planning (Arrays + Casting + UDF)

Paul Ramsey <pramsey@cleverelephant.ca> writes:

This extremely odd case [2] came in via a report using a lot of PostGIS functions, but it can be reconfigured into a pure-PostgreSQL crasher [1].

Thanks for the report! Looks like estimate_array_length() is
incautiously assuming that the "root" pointer it receives will
never be NULL.

The overall code path here is eval_const_expressions ->
simplify_function -> cost_qual_eval -> estimate_array_length,
and the proximate cause of root being NULL is that
simplify_function/inline_function don't take a root pointer,
so they pass NULL root to cost_qual_eval.

We could change their signatures ... but it's explicitly documented
that eval_const_expressions allows NULL for root, so there would
presumably still be code paths that'd fail. It looks like the only
safe fix is to ensure that estimate_array_length will cope with NULL
for root.

regards, tom lane

#4Nathan Bossart
nathandbossart@gmail.com
In reply to: Tom Lane (#3)
Re: Pg17 Crash in Planning (Arrays + Casting + UDF)

On Wed, Oct 09, 2024 at 04:21:53PM -0400, Tom Lane wrote:

Paul Ramsey <pramsey@cleverelephant.ca> writes:

This extremely odd case [2] came in via a report using a lot of PostGIS
functions, but it can be reconfigured into a pure-PostgreSQL crasher
[1].

Thanks for the report! Looks like estimate_array_length() is
incautiously assuming that the "root" pointer it receives will
never be NULL.

Yup, git-bisect points me to commit 9391f71.

We could change their signatures ... but it's explicitly documented
that eval_const_expressions allows NULL for root, so there would
presumably still be code paths that'd fail. It looks like the only
safe fix is to ensure that estimate_array_length will cope with NULL
for root.

Do you mean something like this?

-    else if (arrayexpr)
+    else if (arrayexpr && root != NULL)

That'd at least be no worse than how it worked before
estimate_array_length() tried to use statistics, so that seems reasonable
to me.

--
nathan

#5Tom Lane
tgl@sss.pgh.pa.us
In reply to: Nathan Bossart (#4)
Re: Pg17 Crash in Planning (Arrays + Casting + UDF)

Nathan Bossart <nathandbossart@gmail.com> writes:

Do you mean something like this?

-    else if (arrayexpr)
+    else if (arrayexpr && root != NULL)

That'd at least be no worse than how it worked before
estimate_array_length() tried to use statistics, so that seems reasonable
to me.

Yeah, exactly, just fall back to the old behavior if no root.

regards, tom lane

#6Fredrik Widlert
fredrik.widlert@digpro.se
In reply to: Tom Lane (#5)
Re: Pg17 Crash in Planning (Arrays + Casting + UDF)

As the original reporter on the PostGIS mailing list, I want to thank
everyone both
on that list and on the Postgres list for the very quick response.

Having the problem confirmed and a ticket opened in 2 minutes is really
impressive.

My report contained a very simplified version of our crashing query, but
I've now also
verified that the fix

-    else if (arrayexpr)
+    else if (arrayexpr && root != NULL)

also stops our original big query from crashing.

Thanks!
/Fredrik