BUG #16959: Unnesting null from string_to_array silently removes whole rows from result

Started by PG Bug reporting formabout 5 years ago3 messagesbugs
Jump to latest
#1PG Bug reporting form
noreply@postgresql.org

The following bug has been logged on the website:

Bug reference: 16959
Logged by: Pete O'Such
Email address: posuch@gmail.com
PostgreSQL version: 11.7
Operating system: Ubuntu (and also Red Hat)
Description:

Applying unnest() to the output of string_to_array() operating on a null
silently drops the entire row from the query results. I'm agnostic
regarding what would be a proper resulting value within the column, but the
silent omission of the whole row from the result is a problem to me.

Sample data:

create table test_rows as
SELECT * FROM (VALUES (1, null), (2, 'second')) AS t (num,letter);
SELECT 2

Query with the unexpected result (I expected 2 rows):

select num, unnest(string_to_array(letter, ',')) from test_rows;
num | unnest
----+--------
2 | second
(1 row)

Correct or expected behavior from a similar operation:

select num, unnest(array[letter]) from test_rows;
num | unnest
----+--------
1 |
2 | second
(2 rows)

Thanks,
Pete O'Such

#2Tom Lane
tgl@sss.pgh.pa.us
In reply to: PG Bug reporting form (#1)
Re: BUG #16959: Unnesting null from string_to_array silently removes whole rows from result

PG Bug reporting form <noreply@postgresql.org> writes:

Sample data:
create table test_rows as
SELECT * FROM (VALUES (1, null), (2, 'second')) AS t (num,letter);

Query with the unexpected result (I expected 2 rows):
select num, unnest(string_to_array(letter, ',')) from test_rows;
num | unnest
----+--------
2 | second
(1 row)

Well, you could perhaps argue that string_to_array with NULL input
should produce an empty array rather than a NULL. But UNNEST()
would produce zero rows in either case, and I fail to see why you
find that surprising, much less buggy. It would be a bug if it
manufactured a value out of nothing.

Having said that, you could inject the value you prefer using
COALESCE, say

# select num, unnest(coalesce(string_to_array(letter, ','), '{""}')) from test_rows;
num | unnest
-----+--------
1 |
2 | second
(2 rows)

Alternatively, perhaps you'd consider a lateral left join to be
less-surprising behavior:

# select num, u from test_rows left join lateral unnest(string_to_array(letter, ',')) u on true;
num | u
-----+--------
1 |
2 | second
(2 rows)

The behavior you're getting from SRF-in-the-targetlist is basically
equivalent to a lateral plain join, rather than left join. See

https://www.postgresql.org/docs/current/xfunc-sql.html#XFUNC-SQL-FUNCTIONS-RETURNING-SET

regards, tom lane

#3Pete O'Such
posuch@gmail.com
In reply to: Tom Lane (#2)
Re: BUG #16959: Unnesting null from string_to_array silently removes whole rows from result

Thank you for the answers. I applied your first suggestion and of course
it worked well.

There's an implicit question in your email, regarding why I would think of
this outcome as a bug. Not knowing as much of the internals, after
discovering that rows had gone missing my list of things to check was
roughly: no joins, no where clause, no having clause, no grouping, no
distinct, no distinct on, and no union/intersect/except. After that, I was
down to pure trial and error to find the issue.

I get the message that the outcome was obvious to you. For me it was
startling to have a function suppress the entire row, absent those other
query elements. Even having read the note on 9.19, I struggle to see that
as a warning that all rows may disappear. I also wonder how that outcome
is consistent with this:

\pset null 'nuLL'
select 1, split_part('adfsgasf', '234', 3);
?column? | split_part
----------+------------
1 |
(1 row)

Even if it's perfectly sensible to you, I was caught off guard and I think
a note in the documentation alerting readers to this behavior would go a
long way in saving others from the prolonged confusion that I experienced.

Thanks again,
Pete O'Such

On Sun, Apr 11, 2021 at 8:46 PM Tom Lane <tgl@sss.pgh.pa.us> wrote:

Show quoted text

PG Bug reporting form <noreply@postgresql.org> writes:

Sample data:
create table test_rows as
SELECT * FROM (VALUES (1, null), (2, 'second')) AS t (num,letter);

Query with the unexpected result (I expected 2 rows):
select num, unnest(string_to_array(letter, ',')) from test_rows;
num | unnest
----+--------
2 | second
(1 row)

Well, you could perhaps argue that string_to_array with NULL input
should produce an empty array rather than a NULL. But UNNEST()
would produce zero rows in either case, and I fail to see why you
find that surprising, much less buggy. It would be a bug if it
manufactured a value out of nothing.

Having said that, you could inject the value you prefer using
COALESCE, say

# select num, unnest(coalesce(string_to_array(letter, ','), '{""}')) from
test_rows;
num | unnest
-----+--------
1 |
2 | second
(2 rows)

Alternatively, perhaps you'd consider a lateral left join to be
less-surprising behavior:

# select num, u from test_rows left join lateral
unnest(string_to_array(letter, ',')) u on true;
num | u
-----+--------
1 |
2 | second
(2 rows)

The behavior you're getting from SRF-in-the-targetlist is basically
equivalent to a lateral plain join, rather than left join. See

https://www.postgresql.org/docs/current/xfunc-sql.html#XFUNC-SQL-FUNCTIONS-RETURNING-SET

regards, tom lane