BUG #16959: Unnesting null from string_to_array silently removes whole rows from result
The following bug has been logged on the website:
Bug reference: 16959
Logged by: Pete O'Such
Email address: posuch@gmail.com
PostgreSQL version: 11.7
Operating system: Ubuntu (and also Red Hat)
Description:
Applying unnest() to the output of string_to_array() operating on a null
silently drops the entire row from the query results. I'm agnostic
regarding what would be a proper resulting value within the column, but the
silent omission of the whole row from the result is a problem to me.
Sample data:
create table test_rows as
SELECT * FROM (VALUES (1, null), (2, 'second')) AS t (num,letter);
SELECT 2
Query with the unexpected result (I expected 2 rows):
select num, unnest(string_to_array(letter, ',')) from test_rows;
num | unnest
----+--------
2 | second
(1 row)
Correct or expected behavior from a similar operation:
select num, unnest(array[letter]) from test_rows;
num | unnest
----+--------
1 |
2 | second
(2 rows)
Thanks,
Pete O'Such
PG Bug reporting form <noreply@postgresql.org> writes:
Sample data:
create table test_rows as
SELECT * FROM (VALUES (1, null), (2, 'second')) AS t (num,letter);
Query with the unexpected result (I expected 2 rows):
select num, unnest(string_to_array(letter, ',')) from test_rows;
num | unnest
----+--------
2 | second
(1 row)
Well, you could perhaps argue that string_to_array with NULL input
should produce an empty array rather than a NULL. But UNNEST()
would produce zero rows in either case, and I fail to see why you
find that surprising, much less buggy. It would be a bug if it
manufactured a value out of nothing.
Having said that, you could inject the value you prefer using
COALESCE, say
# select num, unnest(coalesce(string_to_array(letter, ','), '{""}')) from test_rows;
num | unnest
-----+--------
1 |
2 | second
(2 rows)
Alternatively, perhaps you'd consider a lateral left join to be
less-surprising behavior:
# select num, u from test_rows left join lateral unnest(string_to_array(letter, ',')) u on true;
num | u
-----+--------
1 |
2 | second
(2 rows)
The behavior you're getting from SRF-in-the-targetlist is basically
equivalent to a lateral plain join, rather than left join. See
https://www.postgresql.org/docs/current/xfunc-sql.html#XFUNC-SQL-FUNCTIONS-RETURNING-SET
regards, tom lane
Thank you for the answers. I applied your first suggestion and of course
it worked well.
There's an implicit question in your email, regarding why I would think of
this outcome as a bug. Not knowing as much of the internals, after
discovering that rows had gone missing my list of things to check was
roughly: no joins, no where clause, no having clause, no grouping, no
distinct, no distinct on, and no union/intersect/except. After that, I was
down to pure trial and error to find the issue.
I get the message that the outcome was obvious to you. For me it was
startling to have a function suppress the entire row, absent those other
query elements. Even having read the note on 9.19, I struggle to see that
as a warning that all rows may disappear. I also wonder how that outcome
is consistent with this:
\pset null 'nuLL'
select 1, split_part('adfsgasf', '234', 3);
?column? | split_part
----------+------------
1 |
(1 row)
Even if it's perfectly sensible to you, I was caught off guard and I think
a note in the documentation alerting readers to this behavior would go a
long way in saving others from the prolonged confusion that I experienced.
Thanks again,
Pete O'Such
On Sun, Apr 11, 2021 at 8:46 PM Tom Lane <tgl@sss.pgh.pa.us> wrote:
Show quoted text
PG Bug reporting form <noreply@postgresql.org> writes:
Sample data:
create table test_rows as
SELECT * FROM (VALUES (1, null), (2, 'second')) AS t (num,letter);Query with the unexpected result (I expected 2 rows):
select num, unnest(string_to_array(letter, ',')) from test_rows;
num | unnest
----+--------
2 | second
(1 row)Well, you could perhaps argue that string_to_array with NULL input
should produce an empty array rather than a NULL. But UNNEST()
would produce zero rows in either case, and I fail to see why you
find that surprising, much less buggy. It would be a bug if it
manufactured a value out of nothing.Having said that, you could inject the value you prefer using
COALESCE, say# select num, unnest(coalesce(string_to_array(letter, ','), '{""}')) from
test_rows;
num | unnest
-----+--------
1 |
2 | second
(2 rows)Alternatively, perhaps you'd consider a lateral left join to be
less-surprising behavior:# select num, u from test_rows left join lateral
unnest(string_to_array(letter, ',')) u on true;
num | u
-----+--------
1 |
2 | second
(2 rows)The behavior you're getting from SRF-in-the-targetlist is basically
equivalent to a lateral plain join, rather than left join. Seehttps://www.postgresql.org/docs/current/xfunc-sql.html#XFUNC-SQL-FUNCTIONS-RETURNING-SET
regards, tom lane