Parameter NOT NULL to CREATE DOMAIN not the same as CHECK (VALUE IS NOT NULL)

Started by PG Doc comments formabout 1 year ago6 messages
#1PG Doc comments form
noreply@postgresql.org

The following documentation comment has been logged on the website:

Page: https://www.postgresql.org/docs/17/sql-createdomain.html
Description:

The manual claims:

The syntax NOT NULL in this command is a PostgreSQL extension. (A
standard-conforming way to write the same would be CHECK (VALUE IS NOT
NULL). […])

But both variants differ when composite types are involved:

CREATE TYPE complex AS (real float8, imag float8);

CREATE DOMAIN d1 AS complex NOT NULL;
CREATE DOMAIN d2 AS complex CHECK (VALUE IS NOT NULL);

SELECT '(,)'::d1; -- allowed
SELECT '(,)'::d2; -- not allowed

Kind Regards,
Jan Behrens

#2Bruce Momjian
bruce@momjian.us
In reply to: PG Doc comments form (#1)
Re: Parameter NOT NULL to CREATE DOMAIN not the same as CHECK (VALUE IS NOT NULL)

On Fri, Jan 3, 2025 at 01:39:44PM +0000, PG Doc comments form wrote:

The following documentation comment has been logged on the website:

Page: https://www.postgresql.org/docs/17/sql-createdomain.html
Description:

The manual claims:

The syntax NOT NULL in this command is a PostgreSQL extension. (A
standard-conforming way to write the same would be CHECK (VALUE IS NOT
NULL). […])

But both variants differ when composite types are involved:

CREATE TYPE complex AS (real float8, imag float8);

CREATE DOMAIN d1 AS complex NOT NULL;
CREATE DOMAIN d2 AS complex CHECK (VALUE IS NOT NULL);

SELECT '(,)'::d1; -- allowed
SELECT '(,)'::d2; -- not allowed

(Theead moved to hackers since there are correctness questions here.)

Wow, I dug into this and found even more problems. First, your examples
in master return what you reported:

CREATE TYPE complex AS (real float8, imag float8);

CREATE DOMAIN d1 AS complex NOT NULL;
CREATE DOMAIN d2 AS complex CHECK (VALUE IS NOT NULL);

--> SELECT '(,)'::d1; -- allowed
d1
-----
(,)

SELECT '(,)'::d2; -- not allowed
ERROR: value for domain d2 violates check constraint "d2_check"

I added some tests without DOMAINs and found further problems,
specifically the ability to put a value that tests IS NULL as true in a
NOT NULL composite-type column, and not honoring WHERE col IS NULL
tests:

CREATE TABLE complex_test (col complex NOT NULL);

-- accepts NULL in a NOT NULL column
INSERT INTO complex_test VALUES ('(,)');

-- proof it is NULL
SELECT col, col IS NULL FROM complex_test;
col | ?column?
-----+----------
(,) | t

-- NOT NULL column returns NULL value
SELECT col, col IS NULL FROM complex_test WHERE col IS NOT NULL;
col | ?column?
-----+----------
(,) | t

EXPLAIN SELECT col, col IS NULL FROM complex_test WHERE col IS NOT NULL;
QUERY PLAN
-----------------------------------------------------------------
Seq Scan on complex_test (cost=0.00..23.60 rows=1360 width=33)

-- IS NULL does not return NULL value
SELECT col, col IS NULL FROM complex_test WHERE col IS NULL;
col | ?column?
-----+----------

-- optimization in PG 17 prevents any comparison to NULL
EXPLAIN SELECT col, col IS NULL FROM complex_test WHERE col IS NULL;
QUERY PLAN
-------------------------------------------
Result (cost=0.00..0.00 rows=0 width=33)
One-Time Filter: false

With the optimizations in PG 17 for NULL checks:

https://www.postgresql.org/docs/17/release-17.html#RELEASE-17-OPTIMIZER
--> Remove IS NOT NULL restrictions from queries on NOT NULL columns and
--> eliminate scans on NOT NULL columns if IS NULL is specified.

I see different output in pre-PG 17, so I would say this got worse in PG
17+ because I think the IS NULL and IS NOT NULL are being removed during
optimization. Notice the IS [NOT] NULL checks that appear in the
EXPLAIN output below, but not above:

SELECT col, col IS NULL FROM complex_test WHERE col IS NOT NULL;
col | ?column?
-----+----------

EXPLAIN SELECT col, col IS NULL FROM complex_test WHERE col IS NOT NULL;
QUERY PLAN
-----------------------------------------------------------------
Seq Scan on complex_test (cost=0.00..23.60 rows=1353 width=33)
Filter: (col IS NOT NULL)

SELECT col, col IS NULL FROM complex_test WHERE col IS NULL;
col | ?column?
-----+----------
(,) | t

EXPLAIN SELECT col, col IS NULL FROM complex_test WHERE col IS NULL;
QUERY PLAN
--------------------------------------------------------------
Seq Scan on complex_test (cost=0.00..23.60 rows=7 width=33)
Filter: (col IS NULL)

I think this needs some serious research.

--
Bruce Momjian <bruce@momjian.us> https://momjian.us
EDB https://enterprisedb.com

Do not let urgent matters crowd out time for investment in the future.

#3Tom Lane
tgl@sss.pgh.pa.us
In reply to: Bruce Momjian (#2)
Re: Parameter NOT NULL to CREATE DOMAIN not the same as CHECK (VALUE IS NOT NULL)

Bruce Momjian <bruce@momjian.us> writes:

I think this needs some serious research.

We've discussed this topic before. The spec's definition of IS [NOT]
NULL for composite values is bizarre to say the least. I think
there's been an intentional choice to keep most NOT NULL checks
"simple", that is we look at the overall value's isnull bit and
don't probe any deeper than that.

If the optimizations added in v17 changed existing behavior,
I agree that's bad. We should probably fix it so that those
are only applied when argisrow is false.

regards, tom lane

#4Bruce Momjian
bruce@momjian.us
In reply to: Tom Lane (#3)
1 attachment(s)
Re: Parameter NOT NULL to CREATE DOMAIN not the same as CHECK (VALUE IS NOT NULL)

On Wed, Jan 8, 2025 at 04:24:34PM -0500, Tom Lane wrote:

Bruce Momjian <bruce@momjian.us> writes:

I think this needs some serious research.

We've discussed this topic before. The spec's definition of IS [NOT]
NULL for composite values is bizarre to say the least. I think
there's been an intentional choice to keep most NOT NULL checks
"simple", that is we look at the overall value's isnull bit and
don't probe any deeper than that.

If the optimizations added in v17 changed existing behavior,
I agree that's bad. We should probably fix it so that those
are only applied when argisrow is false.

Okay, makes sense. Do we have any sense of whether the docs can be
improved in this area, based on the original email report? Here is a
proposed patch for that.

--
Bruce Momjian <bruce@momjian.us> https://momjian.us
EDB https://enterprisedb.com

Do not let urgent matters crowd out time for investment in the future.

Attachments:

domain.difftext/x-diff; charset=us-asciiDownload
diff --git a/doc/src/sgml/ref/create_domain.sgml b/doc/src/sgml/ref/create_domain.sgml
index ce555203486..c111285a69c 100644
--- a/doc/src/sgml/ref/create_domain.sgml
+++ b/doc/src/sgml/ref/create_domain.sgml
@@ -283,7 +283,8 @@ CREATE TABLE us_snail_addy (
   <para>
    The syntax <literal>NOT NULL</literal> in this command is a
    <productname>PostgreSQL</productname> extension.  (A standard-conforming
-   way to write the same would be <literal>CHECK (VALUE IS NOT
+   way to write the same for non-composite data types would be
+   <literal>CHECK (VALUE IS NOT
    NULL)</literal>.  However, per <xref linkend="sql-createdomain-notes"/>,
    such constraints are best avoided in practice anyway.)  The
    <literal>NULL</literal> <quote>constraint</quote> is a
#5Bruce Momjian
bruce@momjian.us
In reply to: Tom Lane (#3)
1 attachment(s)
Re: Parameter NOT NULL to CREATE DOMAIN not the same as CHECK (VALUE IS NOT NULL)

On Wed, Jan 8, 2025 at 04:24:34PM -0500, Tom Lane wrote:

Bruce Momjian <bruce@momjian.us> writes:

I think this needs some serious research.

We've discussed this topic before. The spec's definition of IS [NOT]
NULL for composite values is bizarre to say the least. I think
there's been an intentional choice to keep most NOT NULL checks
"simple", that is we look at the overall value's isnull bit and
don't probe any deeper than that.

If the optimizations added in v17 changed existing behavior,
I agree that's bad. We should probably fix it so that those
are only applied when argisrow is false.

I have developed the attached patch using your argisrow suggestion which
fixes the test I posted. Is this something we should backpatch?

--
Bruce Momjian <bruce@momjian.us> https://momjian.us
EDB https://enterprisedb.com

Do not let urgent matters crowd out time for investment in the future.

Attachments:

domain.difftext/x-diff; charset=us-asciiDownload
diff --git a/doc/src/sgml/ref/create_domain.sgml b/doc/src/sgml/ref/create_domain.sgml
index ce555203486..c111285a69c 100644
--- a/doc/src/sgml/ref/create_domain.sgml
+++ b/doc/src/sgml/ref/create_domain.sgml
@@ -283,7 +283,8 @@ CREATE TABLE us_snail_addy (
   <para>
    The syntax <literal>NOT NULL</literal> in this command is a
    <productname>PostgreSQL</productname> extension.  (A standard-conforming
-   way to write the same would be <literal>CHECK (VALUE IS NOT
+   way to write the same for non-composite data types would be
+   <literal>CHECK (VALUE IS NOT
    NULL)</literal>.  However, per <xref linkend="sql-createdomain-notes"/>,
    such constraints are best avoided in practice anyway.)  The
    <literal>NULL</literal> <quote>constraint</quote> is a
diff --git a/src/backend/optimizer/plan/initsplan.c b/src/backend/optimizer/plan/initsplan.c
index 2cb0ae6d659..e291547112d 100644
--- a/src/backend/optimizer/plan/initsplan.c
+++ b/src/backend/optimizer/plan/initsplan.c
@@ -3100,6 +3100,13 @@ restriction_is_always_true(PlannerInfo *root,
 		if (nulltest->nulltesttype != IS_NOT_NULL)
 			return false;
 
+		/*
+		 * Empty rows can appear NULL in some contexts and NOT NULL in others,
+		 * so avoid this optimization for row expressions.
+		 */
+		if (nulltest->argisrow)
+			return false;
+
 		return expr_is_nonnullable(root, nulltest->arg);
 	}
 
@@ -3149,6 +3156,13 @@ restriction_is_always_false(PlannerInfo *root,
 		if (nulltest->nulltesttype != IS_NULL)
 			return false;
 
+		/*
+		 * Empty rows can appear NULL in some contexts and NOT NULL in others,
+		 * so avoid this optimization for row expressions.
+		 */
+		if (nulltest->argisrow)
+			return false;
+
 		return expr_is_nonnullable(root, nulltest->arg);
 	}
 
#6Bruce Momjian
bruce@momjian.us
In reply to: Bruce Momjian (#5)
Re: Parameter NOT NULL to CREATE DOMAIN not the same as CHECK (VALUE IS NOT NULL)

On Thu, Jan 30, 2025 at 08:26:54AM -0500, Bruce Momjian wrote:

On Wed, Jan 8, 2025 at 04:24:34PM -0500, Tom Lane wrote:

Bruce Momjian <bruce@momjian.us> writes:

I think this needs some serious research.

We've discussed this topic before. The spec's definition of IS [NOT]
NULL for composite values is bizarre to say the least. I think
there's been an intentional choice to keep most NOT NULL checks
"simple", that is we look at the overall value's isnull bit and
don't probe any deeper than that.

If the optimizations added in v17 changed existing behavior,
I agree that's bad. We should probably fix it so that those
are only applied when argisrow is false.

I have developed the attached patch using your argisrow suggestion which
fixes the test I posted. Is this something we should backpatch?

Patch applied. This fix will appear in the next minor PG 17 release.

--
Bruce Momjian <bruce@momjian.us> https://momjian.us
EDB https://enterprisedb.com

Do not let urgent matters crowd out time for investment in the future.