using index to speedup add not null constraints to a table
hi.
context: i was trying to speedup add check constraint then
Sergei Kornilov sent me a link: [1]/messages/by-id/9878.1511970441@sss.pgh.pa.us.
so i studied that thread, then tried to implement $SUBJECT.
obviously not using SPI at all.
the logic is mentioned [2]/messages/by-id/CA+Tgmoa5NKz8iGW_9v7wz=-+zQFu=E4SZoaTaU1znLaEXRYp-Q@mail.gmail.com:
"""
we could try an index scan - via index_beginscan() /
index_getnext() / index_endscan() - trying to pull exactly one null
from the index, and judge whether or not there are nulls present based
only whether we get one. This would be a lot cheaper than scanning a
large table, but it needs some careful thought because of visibility
issues.
"""
Currently, if the leading key column of an index is the same as the column with
the NOT NULL constraint, then $SUBJECT applies.
so the following test case, the $SUBJECT applies:
create index t_idx_ab on t(a,b);
alter table t add constraint t1 not null a;
However, query:
alter table t add constraint t1 not null b;
$SUBEJCT does not apply, since "b" is not the leading column of the index.
(It is possible that this could be implemented. So I missed something....)
This approach will not work for partitioned tables, as ALTER TABLE ALTER COLUMN
SET EXPRESSION may trigger an index rebuild. We cannot perform an index scan
if the index will be rebuilt later. In the future, if we determine that the
column being rebuilt in the index is the same as the column to which the NOT
NULL constraint is being added, then $SUBJECT can also be applied to partitioned
tables.
based on [3]/messages/by-id/900056D1-32DF-4927-8251-3E0C0DC407FD@anarazel.de, I wrote some isolation tests to address concurrency issues.
however since add not-null constraint takes ACCESS EXCLUSIVE lock,
so there is less anomaly can happen?
PERFORMANCE:
--100% bloat and zero null bloat value:
drop table if exists t \; create unlogged table t(a int, b int, c int)
\; create index t_idx_a on t(a);
insert into t select g, g+1 from generate_series(1,1_000_000) g;
delete from t;
alter table t add constraint t1 not null a;
with patch Time: 1.873 ms
master Time: 648.312 ms
comments are welcome.
[1]: /messages/by-id/9878.1511970441@sss.pgh.pa.us
[2]: /messages/by-id/CA+Tgmoa5NKz8iGW_9v7wz=-+zQFu=E4SZoaTaU1znLaEXRYp-Q@mail.gmail.com
[3]: /messages/by-id/900056D1-32DF-4927-8251-3E0C0DC407FD@anarazel.de
Attachments:
v1-0001-using-index-to-speedup-add-not-null-constraint-to.patchtext/x-patch; charset=US-ASCII; name=v1-0001-using-index-to-speedup-add-not-null-constraint-to.patchDownload+306-10
hi.
In v1 I didn't do the `git add` for newly created isolation test related files.
so the cfbot for isolation tests failed.
v1 with index:
create index t_idx_ab on t(a,b);
we cannot fast add a not-null constraint for column b.
with the attached v2 patch, now we can do that.
v2, isolation test also adds other session drop index scarenio.
Attachments:
v2-0001-using-index-to-speedup-add-not-null-constraint-to.patchtext/x-patch; charset=US-ASCII; name=v2-0001-using-index-to-speedup-add-not-null-constraint-to.patchDownload+516-11
hi
rebased new patch attached.
I also did some cosmetic changes. comments refined.
make sure using index_scan mechanism to fast check column not-null can
only be used via btree index.
isolation tests are simplified.
Attachments:
v3-0001-using-index-to-speedup-add-not-null-constraint-to.patchtext/x-patch; charset=US-ASCII; name=v3-0001-using-index-to-speedup-add-not-null-constraint-to.patchDownload+497-12
On Wed, Feb 5, 2025 at 4:24 PM jian he <jian.universality@gmail.com> wrote:
rebased new patch attached.
I also did some cosmetic changes. comments refined.
make sure using index_scan mechanism to fast check column not-null can
only be used via btree index.
isolation tests are simplified.
I realized that my previous patch was quite wrong,
we should not do indexscan verify individual not-null constraints on phase2.
So a new patch is attached,
the main idea is Phase2 collects all to be added not-null constraints
to AlteredTableInfo->constraints.
then in Phase3 check, can we use index to fast check not-null
constraint or not.
To minimize concurrency issues, using an index scan to quickly validate
NOT NULL constraints requires strict conditions in Phase3:
* No table rewrite
* No table scan
* Each NOT NULL constraint must have a suitable supporting index for
fast checking
* The table must already hold an AccessExclusiveLock
* The DDL must not involve creating any new indexes
I don't have any good ideas to do the regress tests.
I use
ereport(NOTICE,
errmsg("all not-null constraints on relation
\"%s\" are validated by index scan",
RelationGetRelationName(oldrel)));
to do the tests.
for example:
create temp table t2 (x int, y int, z int, primary key (x, y));
create unique index t2_z_uidx on t2(z);
alter table t2 alter column z set not null;
NOTICE: all not-null constraints on relation "t2" are validated by index scan
ALTER TABLE
Attachments:
v4-0001-using-indexscan-to-speedup-add-not-null-constrain.patchtext/x-patch; charset=US-ASCII; name=v4-0001-using-indexscan-to-speedup-add-not-null-constrain.patchDownload+566-7
On Fri, Apr 18, 2025 at 4:07 PM jian he <jian.universality@gmail.com> wrote:
I don't have any good ideas to do the regress tests.
I use
ereport(NOTICE,
errmsg("all not-null constraints on relation
\"%s\" are validated by index scan",
RelationGetRelationName(oldrel)));
to do the tests.
for tests, just found out i can imitate
src/test/modules/test_misc/t/001_constraint_validation.pl,
So I created a file:
src/test/modules/test_misc/t/008_indexscan_validate_notnull.pl
for TAP tests.
Attachments:
v5-0001-using-indexscan-to-speedup-add-not-null-constraints.patchtext/x-patch; charset=US-ASCII; name=v5-0001-using-indexscan-to-speedup-add-not-null-constraints.patchDownload+591-7
On 2025-Apr-28, jian he wrote:
for tests, just found out i can imitate
src/test/modules/test_misc/t/001_constraint_validation.pl,So I created a file:
src/test/modules/test_misc/t/008_indexscan_validate_notnull.pl
for TAP tests.
Seems reasonable, didn't look at it in detail. I think you don't have
any tests where you try to set multiple columns as NOT NULL in a single
ALTER TABLE command; I think this is worth having. Something like
CREATE TABLE foo (col1 int, col2 int, col3 int);
... create indexes on col1 and col2 ...
alter table foo set col1 not null,
set col3 not null,
add constraint bla not null b;
and stuff like that.
--
Álvaro Herrera 48°01'N 7°57'E — https://www.EnterpriseDB.com/
"If you have nothing to say, maybe you need just the right tool to help you
not say it." (New York Times, about Microsoft PowerPoint)
Hi,
On 2025-04-28 12:36:14 +0800, jian he wrote:
On Fri, Apr 18, 2025 at 4:07 PM jian he <jian.universality@gmail.com> wrote:
I don't have any good ideas to do the regress tests.
I use
ereport(NOTICE,
errmsg("all not-null constraints on relation
\"%s\" are validated by index scan",
RelationGetRelationName(oldrel)));
to do the tests.for tests, just found out i can imitate
src/test/modules/test_misc/t/001_constraint_validation.pl,So I created a file:
src/test/modules/test_misc/t/008_indexscan_validate_notnull.pl
for TAP tests.
The tests have not passed in a few weeks:
https://cirrus-ci.com/github/postgresql-cfbot/postgresql/cf%2F5444
Greetings,
Andres Freund
On Mon, Apr 28, 2025 at 4:40 PM Álvaro Herrera <alvherre@kurilemu.de> wrote:
On 2025-Apr-28, jian he wrote:
for tests, just found out i can imitate
src/test/modules/test_misc/t/001_constraint_validation.pl,So I created a file:
src/test/modules/test_misc/t/008_indexscan_validate_notnull.pl
for TAP tests.Seems reasonable, didn't look at it in detail. I think you don't have
any tests where you try to set multiple columns as NOT NULL in a single
ALTER TABLE command; I think this is worth having. Something likeCREATE TABLE foo (col1 int, col2 int, col3 int);
... create indexes on col1 and col2 ...
alter table foo set col1 not null,
set col3 not null,
add constraint bla not null b;
and stuff like that.
newly added tests covered that.
now the test coverage is quite good, IMHO.
also rebased to address test failure.
Attachments:
v6-0001-using-indexscan-to-speedup-add-not-null-constrain.patchapplication/x-patch; name=v6-0001-using-indexscan-to-speedup-add-not-null-constrain.patchDownload+630-6
Can you please rebase this? It stopped applying a week ago.
Thanks!
--
Álvaro Herrera Breisgau, Deutschland — https://www.EnterpriseDB.com/
"Sallah, I said NO camels! That's FIVE camels; can't you count?"
(Indiana Jones)
On Fri, Oct 17, 2025 at 3:57 AM Álvaro Herrera <alvherre@kurilemu.de> wrote:
Can you please rebase this? It stopped applying a week ago.
hi.
rebase and minor polishment.
Attachments:
v7-0001-using-indexscan-to-speedup-add-not-null-constraints.patchtext/x-patch; charset=US-ASCII; name=v7-0001-using-indexscan-to-speedup-add-not-null-constraints.patchDownload+636-7
Hello
I rebased this patch to review it. Overall, I think this is a great
idea. Here are some comments on the patch, which I hope are not
anything major.
I'm not really sure that I agree with the way ATRewriteTable works now.
It seems that we scan the list of columns, and verify each one for nulls,
but abort midway if a column is generated. Surely we should check for
generated columns first, to avoid wasting time verifying earlier
columns in case a later column is generated?
That said, we do check for notnull_virtual_attrs to be NIL. It seems to
me that this implies that the checked columns are not generated. In
other words, the test for whether columns are generated is unnecessary
and confusing, and in which case you don't have to change anything, just
remove the "if (attr->attgenerated == ATTRIBUTE_GENERATED_VIRTUAL)"
inner block.
However, if we do this, then I think computing notnull_attrs is
pointless. So we should only change the order: do this new check first,
and if we find that any new not-null column is on a generated column,
then we compute both notnull_virtual_attrs and notnull_attrs. No?
The separation of labor between index_check_notnull() and
index_check_notnull_internal() seems a bit pointless. Why not have a
single routine that does both things? Though, to be honest, it does
look like the former should live in tablecmds.c instead of
execIndexing.c. (But if you do split things that way, then you need to
make index_check_notnull_internal extern).
The tests look a bit excessive. Why do we need an isolation test for
this? And it looks to me like the other test could be in
src/test/regress rather than be a TAP test. After all, that's what you
have the ereport(DEBUG1) there, isn't it?
"veritify" doesn't exist. The word is "verify".
--
Álvaro Herrera 48°01'N 7°57'E — https://www.EnterpriseDB.com/
Syntax error: function hell() needs an argument.
Please choose what hell you want to involve.
Attachments:
v8-0001-using-indexscan-to-speedup-add-not-null-constrain.patchtext/x-diff; charset=utf-8Download+632-7
On 2026-Jan-13, Álvaro Herrera wrote:
However, if we do this, then I think computing notnull_attrs is
pointless. So we should only change the order: do this new check
first, and if we find that any new not-null column is on a generated
column, then we compute both notnull_virtual_attrs and notnull_attrs.
No?
Oh, another thing we should do is have a first pass that verifies
whether all columns have an appropriate index, without scanning any of
them; only if we verify that they all have one (and no generated column
is involved) then we start scanning the indexes. Otherwise we waste
time scanning one index and verify that it contains no null values, only
to realize that the next column does not have an appropriate index to
use, and thus we must scan the table. Then the first index scan is
wasted work.
--
Álvaro Herrera 48°01'N 7°57'E — https://www.EnterpriseDB.com/
"The saddest aspect of life right now is that science gathers knowledge faster
than society gathers wisdom." (Isaac Asimov)
hi.
---------------------
create unlogged table t2_copy(col1 int, col2 int, col3 int, col4 int,
col5 int) with (autovacuum_enabled = off, vacuum_index_cleanup=off);
insert into t2_copy select g, g, g, g,g from generate_series(1, 10_000_000) g;
set enable_seqscan to off;
set enable_bitmapscan to off;
set max_parallel_workers=0;
set max_parallel_workers_per_gather=0;
create index t2_copy_idx on t2_copy(col1, col2, col3, col4, col5);
explain (costs off, analyze) select from t2_copy where col1 is NULL or
col2 is NULL or col3 is NULL or col4 is null or col5 is null \watch
i=0.1 c=10
drop index t2_copy_idx;
create index t2_copy_idx1 on t2_copy(col1);
create index t2_copy_idx2 on t2_copy(col2);
create index t2_copy_idx3 on t2_copy(col3);
create index t2_copy_idx4 on t2_copy(col4);
create index t2_copy_idx5 on t2_copy(col5);
explain (costs off, analyze) select from t2_copy where col1 is NULL or
col2 is NULL or col3 is NULL or col4 is null or col5 is null \watch
i=0.1 c=10
------------------------------------------
By comparing the above two EXPLAIN, I found out that there will be regression
for using one multiple column indexes fast verify mutiple not-null constraints.
create unlogged table t3_copy(col1 int, col2 int, col3 int, col4 int);
create index t3_copy_idx on t3_copy(col1, col2, col3, col4);
alter table t3_copy add not null col1, add not null col2, add not null
col3, add not null col4;
In this case, scanning the t3_copy_idx index to check these four NOT NULL
constraints is actually slower than a full table scan.
Therefore, I further restricted the use of the index-scan mechanism for fast NOT
NULL verification to scenarios where the attribute being checked is the leading
column of the index.
So, for the above example, to let
alter table t3_copy add not null col1, add not null col2, add not null
col3, add not null col4;
utilize indexscan mechanism, we require four indexes, each index leading column
is corresponding to the not-null constraint attribute.
I have not yet addressed all of your other comments, as this
represents a more of a
major change. Therefore, I am submitting the attached patch first.
index_check_notnull function, in v8 it is:
+ /* collect index attnums while loop */
+ for (int i = 0; i < index->indnkeyatts; i++)
+ {
+ attr = TupleDescAttr(tupdesc, (index->indkey.values[i] - 1));
+
+
+ }
in v9, it's:
+ /* collect index attnums while loop */
+ attr = TupleDescAttr(tupdesc, (index->indkey.values[0] - 1));
+
I am not sure that regression tests would be helpful, since the test hinges on
whether ereport(DEBUG1, ...) is reached.
I am not sure about the concurrency implications; therefore, isolation tests are
attached. maybe it's not necessary.
Attachments:
v9-0001-using-indexscan-to-speedup-add-not-null-constraints.patchtext/x-patch; charset=US-ASCII; name=v9-0001-using-indexscan-to-speedup-add-not-null-constraints.patchDownload+606-7
On 2026-Jan-16, jian he wrote:
In this case, scanning the t3_copy_idx index to check these four NOT NULL
constraints is actually slower than a full table scan.Therefore, I further restricted the use of the index-scan mechanism for fast NOT
NULL verification to scenarios where the attribute being checked is the leading
column of the index.
Makes sense.
I have not yet addressed all of your other comments, as this
represents a more of a
major change. Therefore, I am submitting the attached patch first.
OK, this means we're still waiting then.
Please do see my v8 patch. It contained other cleanups as well.
--
Álvaro Herrera 48°01'N 7°57'E — https://www.EnterpriseDB.com/
"La gente vulgar sólo piensa en pasar el tiempo;
el que tiene talento, en aprovecharlo"
hi.
Please check the attached v10.
Turns out regress test is doable, I just need to
SET client_min_messages TO DEBUG1;
It is now implemented as a single static function, index_check_notnull, located
in tablecmd.c. I believe I have addressed all of your other points as well.
Attachments:
v10-0001-using-indexscan-to-speedup-add-not-null-constraints.patchtext/x-patch; charset=US-ASCII; name=v10-0001-using-indexscan-to-speedup-add-not-null-constraints.patchDownload+389-7
Attachments:
v11-0001-using-indexscan-to-speedup-add-not-null-constraints.patchtext/x-patch; charset=US-ASCII; name=v11-0001-using-indexscan-to-speedup-add-not-null-constraints.patchDownload+391-7
On Wed, Feb 4, 2026 at 8:52 PM jian he <jian.universality@gmail.com> wrote:
hi.
v11 is attached.
Mainly fixes regress test failures.
I have tested locally on https://cirrus-ci.com/build/5064590561640448
this time, the attachment should have no regress test failure.
Attachments:
v12-0001-using-indexscan-to-speedup-add-not-null-constraints.patchtext/x-patch; charset=US-ASCII; name=v12-0001-using-indexscan-to-speedup-add-not-null-constraints.patchDownload+392-7
On Fri, Feb 6, 2026 at 12:58 PM jian he <jian.universality@gmail.com> wrote:
+-- should produce a debug message containing "have been validated by
using index scan"
+ ALTER TABLE tp_notnull ALTER COLUMN b SET NOT NULL, ALTER COLUMN b
SET DATA TYPE BIGINT;
DEBUG: rehashing catalog cache id 64 for pg_class; 257 tups, 128 buckets
The above DEBUG message make me realize that
+SET client_min_messages TO 'debug1';
+--debug message should contain "have been validated by using index scan"
+ALTER TABLE tp_notnull ALTER COLUMN b SET NOT NULL;
+DEBUG: all new not-null constraints on relation "tp_notnull_1" have
been validated by using index scan
+DEBUG: all new not-null constraints on relation "tp_notnull_2" have
been validated by using index scan
Using the above regression test to check if certain
``ereport(DEBUG1,`` is reached is unstable.
Therefore, we fall back to the previously used TAP tests to ensure
ereport(DEBUG1,
errmsg_internal("all new not-null constraints on relation
\"%s\" have been validated by using index scan",
RelationGetRelationName(oldrel)));
is being reached.
Rebased and made some minor comment adjustments; no big change.
Attachments:
v13-0001-using-indexscan-to-speedup-add-not-null-constraints.patchtext/x-patch; charset=US-ASCII; name=v13-0001-using-indexscan-to-speedup-add-not-null-constraints.patchDownload+396-6
Hello,
On 2026-Apr-07, jian he wrote:
Rebased and made some minor comment adjustments; no big change.
I think you may have not looked at my review at
/messages/by-id/202601131715.cheonohttbyj@alvherre.pgsql
because as far as I can tell, this new version has essentially the same
structure as before that review.
--
Álvaro Herrera Breisgau, Deutschland — https://www.EnterpriseDB.com/
On Tue, Apr 7, 2026 at 6:02 PM Álvaro Herrera <alvherre@kurilemu.de> wrote:
Hello,
I think you may have not looked at my review at
/messages/by-id/202601131715.cheonohttbyj@alvherre.pgsql
because as far as I can tell, this new version has essentially the same
structure as before that review.
Hi.
I guess the point I haven't addressed yet is:
However, if we do this, then I think computing notnull_attrs is
pointless. So we should only change the order: do this new check first,
and if we find that any new not-null column is on a generated column,
then we compute both notnull_virtual_attrs and notnull_attrs. No?
The attached patch implements this reordering,
first do index_check_notnull then compute both notnull_virtual_attrs
and notnull_attrs.