inconsistent composite type null handling in plpgsql out variable

Started by Merlin Moncureover 16 years ago8 messagesbugs
Jump to latest
#1Merlin Moncure
mmoncure@gmail.com

Today I ran into a problem relating to $subject. plpgsql's handling
of 'null' composite types is not consistent with what you get in sql:

create table foo(a text, b text);
create table bar(id int, f foo);
insert into bar values (1, ('a', 'b'));

create or replace function f(_foo out foo) returns foo as
$$
declare
r record;
begin
select coalesce(_foo, f) as f
from bar where id = 1 into r;

raise notice '%', r.f;

select f
from bar where id = 1 into r;

raise notice '%', r.f;
return;
end;
$$ language plpgsql;

plpgsql is not assigning the value to _foo in the first case because
coalesce is not treating the _foo as null for some reason.

I'm going to take this opportunity to editorialize a bit:
In this case postgresql is not applying either the insane sql standard
definition of null (non null composite type with fields null) or the
sane definition that is mostly used (a composite type may itself be
null independently of its fields). This leads to some very weird
behaviors, for example 'coalesce(foo, something)' and 'case when foo
is null then something else foo end' can give different answers.

postgresql treats non-null composite types with null fields as null in
terms of the 'IS NULL' operator but as non null every where else
(coalesce, STRICT, etc). IMO, the IS NULL behavior was hacked in for
compliance with the SQL standard. In the wider context of how we do
things, IS NULL simply lies to you. This isn't (IMO) really a big
deal, but when correcting the above behavior, which standard should we
apply?

merlin

#2Tom Lane
tgl@sss.pgh.pa.us
In reply to: Merlin Moncure (#1)
Re: inconsistent composite type null handling in plpgsql out variable

Merlin Moncure <mmoncure@gmail.com> writes:

Today I ran into a problem relating to $subject. plpgsql's handling
of 'null' composite types is not consistent with what you get in sql:

Hm. It looks like plpgsql treats composite arguments as being "row"
instead of "record" variables, which has some possible efficiency
benefits but there's no way to represent the whole row being null, as
opposed to existing but having all null fields. Not sure how painful
it'd be to change to "record" representation.

regards, tom lane

#3Kevin Grittner
Kevin.Grittner@wicourts.gov
In reply to: Merlin Moncure (#1)
Re: inconsistent composite type null handling in plpgsql out variable

Merlin Moncure <mmoncure@gmail.com> wrote:

This leads to some very weird behaviors, for example 'coalesce(foo,
something)' and 'case when foo is null then something else foo end'
can give different answers.

Quite apart from the issue you're pursuing, this is another example of
how the COALESCE predicate in PostgreSQL is not compliant with the
standard, where it is *defined as* an abbreviation of the CASE
predicate.

I might be persuaded otherwise by a reference to the standard, but my
understanding is that the CASE predicate should be conceptually
similar to the "? :" predicate in C. Does anyone else feel that these
aren't implemented quite right in PostgreSQL?

-Kevin

#4Merlin Moncure
mmoncure@gmail.com
In reply to: Kevin Grittner (#3)
Re: inconsistent composite type null handling in plpgsql out variable

On Fri, Aug 28, 2009 at 1:38 PM, Kevin
Grittner<Kevin.Grittner@wicourts.gov> wrote:

Merlin Moncure <mmoncure@gmail.com> wrote:

This leads to some very weird behaviors, for example 'coalesce(foo,
something)' and 'case when foo is null then something else foo end'
can give different answers.

Quite apart from the issue you're pursuing, this is another example of
how the COALESCE predicate in PostgreSQL is not compliant with the
standard, where it is *defined as* an abbreviation of the CASE
predicate.

I might be persuaded otherwise by a reference to the standard, but my
understanding is that the CASE predicate should be conceptually
similar to the "? :" predicate in C.  Does anyone else feel that these
aren't implemented quite right in PostgreSQL?

I agree with you...it's a mess. Here's what I'm thinking:

1) 'is null', coalesce, STRICT, PQgetisnull, etc should all behave in
consistent manner (and ideally should use the same code paths)
2) make a decision on composite types:
3) If we decide the sql standard is correct, so that (null, null) is
null == true, then we should observe rule 1 and make things work in
consistent way. This means, for example, that null::foo and (null,
null)::foo should not be distinct.
4) If we decide to continue to ignore the standard, so that null::foo
is distinct from (null, null)::foo (which is basically how things work
now), then IS NULL as currently implemented is wrong and should be
changed.
5) plpgsql has a lot of corner cases where composite type behavior is
different from sql...POLS violations. For example, to assign a type
from a type selected in a query, sometimes you have to do (foo).* and
sometimes you have to proxy it through a record variable. input and
output arguments are especially vexing.

merlin

#5Sam Mason
sam@samason.me.uk
In reply to: Merlin Moncure (#4)
Re: inconsistent composite type null handling in plpgsql out variable

On Fri, Aug 28, 2009 at 02:06:02PM -0400, Merlin Moncure wrote:

3) If we decide the sql standard is correct, so that (null, null) is
null == true, then we should observe rule 1 and make things work in
consistent way. This means, for example, that null::foo and (null,
null)::foo should not be distinct.

The more awkward case (to me anyway) is that the standard says (1,NULL)
IS NULL should evaluate to TRUE.

I'd never noticed the ROW / RECORD dichotomy before; could one of these
be made SQL compatible and the other use more sane semantics?

--
Sam http://samason.me.uk/

#6Pavel Stehule
pavel.stehule@gmail.com
In reply to: Sam Mason (#5)
Re: inconsistent composite type null handling in plpgsql out variable

2009/8/31 Sam Mason <sam@samason.me.uk>:

On Fri, Aug 28, 2009 at 02:06:02PM -0400, Merlin Moncure wrote:

3) If we decide the sql standard is correct, so that (null, null) is
null == true, then we should observe rule 1 and make things work in
consistent way.  This means, for example, that null::foo and (null,
null)::foo should not be distinct.

The more awkward case (to me anyway) is that the standard says (1,NULL)
IS NULL should evaluate to TRUE.

what?

only (NULL, NULL) IS NULL is true

regards
Pavel Stehule

p.s. what isn't consistent (maybe - there are more possible interpretations) is

(NULL, NULL) IS DISTINCT FROM NULL is true

Show quoted text

I'd never noticed the ROW / RECORD dichotomy before; could one of these
be made SQL compatible and the other use more sane semantics?

--
 Sam  http://samason.me.uk/

--
Sent via pgsql-bugs mailing list (pgsql-bugs@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-bugs

#7Sam Mason
sam@samason.me.uk
In reply to: Pavel Stehule (#6)
Re: inconsistent composite type null handling in plpgsql out variable

On Mon, Aug 31, 2009 at 07:26:59PM +0200, Pavel Stehule wrote:

2009/8/31 Sam Mason <sam@samason.me.uk>:

The more awkward case (to me anyway) is that the standard says (1,NULL)
IS NULL should evaluate to TRUE.

what?

only (NULL, NULL) IS NULL is true

Bah, sorry you're right! I was rattling my favorite tin and getting
mixed up with the behavior with IS NOT NULL, the negation of which
would say this row is null. I.e:

SELECT NOT (1,NULL) IS NOT NULL;

evaluates to TRUE. I think the consensus is that we should continue to
follow the spec on this, but I was getting confused as to which operator
contains the EXISTS and FORALL operator. I.e. a value "v" IS NULL iff
all elements of "v" are not 'the null value', whereas "v" IS NOT NULL
iff an element of "v" is 'the null value'.

p.s. what isn't consistent (maybe - there are more possible
interpretations) is

(NULL, NULL) IS DISTINCT FROM NULL is true

Yup, I'd agree with Merlin that a ROW consisting entirely of 'null
values' should itself be 'the null value' (to use the terminology from
the copy of the SQL spec I'm reading). I think this should also work
recursively:

SELECT ROW(ROW(NULL)) IS DISTINCT FROM NULL;

should return FALSE, in my understanding.

--
Sam http://samason.me.uk/

#8Pavel Stehule
pavel.stehule@gmail.com
In reply to: Sam Mason (#7)
Re: inconsistent composite type null handling in plpgsql out variable

2009/9/1 Sam Mason <sam@samason.me.uk>:

On Mon, Aug 31, 2009 at 07:26:59PM +0200, Pavel Stehule wrote:

2009/8/31 Sam Mason <sam@samason.me.uk>:

The more awkward case (to me anyway) is that the standard says (1,NULL)
IS NULL should evaluate to TRUE.

what?

only (NULL, NULL) IS NULL is true

Bah, sorry you're right!  I was rattling my favorite tin and getting
mixed up with the behavior with IS NOT NULL, the negation of which
would say this row is null.  I.e:

 SELECT NOT (1,NULL) IS NOT NULL;

evaluates to TRUE.  I think the consensus is that we should continue to
follow the spec on this, but I was getting confused as to which operator
contains the EXISTS and FORALL operator.  I.e. a value "v" IS NULL iff
all elements of "v" are not 'the null value', whereas "v" IS NOT NULL
iff an element of "v" is 'the null value'.

p.s. what isn't consistent (maybe - there are more possible
interpretations) is

(NULL, NULL) IS DISTINCT FROM NULL is true

Yup, I'd agree with Merlin that a ROW consisting entirely of 'null
values' should itself be 'the null value' (to use the terminology from
the copy of the SQL spec I'm reading).  I think this should also work
recursively:

 SELECT ROW(ROW(NULL)) IS DISTINCT FROM NULL;

should return FALSE, in my understanding.

it's question. You ask, is it (NULL, NULL) same as NULL. Without some
reduction - ROW(NULL, NULL) is really different than NULL.

Pavel

Show quoted text

--
 Sam  http://samason.me.uk/

--
Sent via pgsql-bugs mailing list (pgsql-bugs@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-bugs