Optimization of partial index creation for a new column

Started by Александр Асафов4 months ago3 messageshackers
Jump to latest
#1Александр Асафов
asafofalexander@gmail.com

Hello hackers,
I have a feature suggestion to optimize the creation of a partial
index. In some cases, it is possible to skip a full table scan. For
example:

BEGIN;
ALTER TABLE test ADD COLUMN (newcol int);
CREATE INDEX newindex ON test(newcol) WHERE newcol IS NOT NULL;
COMMIT;

In this case, the values of all columns in the new table will be NULL,
and there is no point in checking the conditions for each row. The
same optimization can be done for default values in NOT NULL columns.

How complex would a patch with this optimization be? Will it be
necessary to add any attributes to the table metadata, or can we check
that the previous command was CREATE TABLE when creating an index?

#2Andres Freund
andres@anarazel.de
In reply to: Александр Асафов (#1)
Re: Optimization of partial index creation for a new column

Hi,

On 2025-12-14 23:13:49 +0300, Александр Асафов wrote:

I have a feature suggestion to optimize the creation of a partial
index. In some cases, it is possible to skip a full table scan. For
example:

BEGIN;
ALTER TABLE test ADD COLUMN (newcol int);
CREATE INDEX newindex ON test(newcol) WHERE newcol IS NOT NULL;
COMMIT;

In this case, the values of all columns in the new table will be NULL,
and there is no point in checking the conditions for each row. The
same optimization can be done for default values in NOT NULL columns.

How complex would a patch with this optimization be? Will it be
necessary to add any attributes to the table metadata, or can we check
that the previous command was CREATE TABLE when creating an index?

I doubt this is feasible, or at least that it's going to be worth the
complexity. You'd have to make sure that the transaction didn't actually
insert any rows, and update some in-memory or catalog state if it did. With
subtransactions etc that could get somewhat complicated.

Greetings,

Andres Freund

#3Tom Lane
tgl@sss.pgh.pa.us
In reply to: Andres Freund (#2)
Re: Optimization of partial index creation for a new column

Andres Freund <andres@anarazel.de> writes:

On 2025-12-14 23:13:49 +0300, Александр Асафов wrote:

I have a feature suggestion to optimize the creation of a partial
index. In some cases, it is possible to skip a full table scan. For
example:
...
How complex would a patch with this optimization be? Will it be
necessary to add any attributes to the table metadata, or can we check
that the previous command was CREATE TABLE when creating an index?

I doubt this is feasible, or at least that it's going to be worth the
complexity. You'd have to make sure that the transaction didn't actually
insert any rows, and update some in-memory or catalog state if it did. With
subtransactions etc that could get somewhat complicated.

Even if it could be made to work, I think it likely would be a net
negative for performance. You're suggesting that we add mechanism to
track whether any insertion/update has happened since an ALTER ADD
COLUMN, which means new overhead for every INSERT and UPDATE. Maybe
not a huge amount of overhead, but probably not trivial. That
overhead would only be repaid in the rather unlikely case where a
suitable index gets added soon after the ALTER. Unless you can find
some substantially-more-common scenarios where the overhead would
produce a benefit, it's hard to believe we'd accept such a patch.

regards, tom lane