Ignoring BRIN for HOT updates seems broken
Hi,
while working on some BRIN stuff, I realized that (my) commit 5753d4ee320b,
which makes HOT ignore BRIN indexes, is likely broken. Consider this example:
----------------------------------------------------------------------
CREATE TABLE t (a INT) WITH (fillfactor = 10);
INSERT INTO t SELECT i
FROM generate_series(0,100000) s(i);
CREATE INDEX ON t USING BRIN (a);
UPDATE t SET a = 0 WHERE random() < 0.01;
SET enable_seqscan = off;
EXPLAIN (ANALYZE, COSTS OFF, TIMING OFF) SELECT * FROM t WHERE a = 0;
SET enable_seqscan = on;
EXPLAIN (ANALYZE, COSTS OFF, TIMING OFF) SELECT * FROM t WHERE a = 0;
----------------------------------------------------------------------
which unfortunately produces this:
                           QUERY PLAN
---------------------------------------------------------------
 Bitmap Heap Scan on t (actual rows=23 loops=1)
   Recheck Cond: (a = 0)
   Rows Removed by Index Recheck: 2793
   Heap Blocks: lossy=128
   ->  Bitmap Index Scan on t_a_idx (actual rows=1280 loops=1)
         Index Cond: (a = 0)
 Planning Time: 0.049 ms
 Execution Time: 0.424 ms
(8 rows)
SET
                QUERY PLAN
-----------------------------------------
 Seq Scan on t (actual rows=995 loops=1)
   Filter: (a = 0)
   Rows Removed by Filter: 99006
 Planning Time: 0.027 ms
 Execution Time: 7.670 ms
(5 rows)
That is, the index fails to return some of the rows :-(
I don't remember the exact reasoning behind the commit, but the commit
message justifies the change like this:
There are no index pointers to individual tuples in BRIN, and the
page range summary will be updated anyway as it relies on visibility
info.
AFAICS that's a misunderstanding of how BRIN uses the visibility map, or
rather does not use it. In particular, bringetbitmap() does not look at the
VM at all, so it'll produce an incomplete bitmap.
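To see both halves of that on the example above, here's a quick sketch. It
assumes the pageinspect extension is available and that block 2 is the first
regular BRIN data page; the block number is a guess and may differ:
----------------------------------------------------------------------
-- confirm the UPDATE went through the HOT path
SELECT pg_stat_force_next_flush();
SELECT pg_stat_get_tuples_hot_updated('t'::regclass::oid);

-- look at the range summaries directly; with the default
-- pages_per_range = 128 only the first range can contain a = 0, the
-- later ranges keep their old min/max and never learn about the new
-- zeroes written by the HOT update
CREATE EXTENSION IF NOT EXISTS pageinspect;
SELECT itemoffset, blknum, value
  FROM brin_page_items(get_raw_page('t_a_idx', 2), 't_a_idx')
 ORDER BY blknum;
----------------------------------------------------------------------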
So it seems I made a boo boo here. Luckily, this is a PG15 commit, not a
live issue. I don't quite see if this can be salvaged - I'll think about
this a bit more, but it'll probably end with a revert.
regards
--
Tomas Vondra
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company
On Sat, 28 May 2022 at 16:51, Tomas Vondra
<tomas.vondra@enterprisedb.com> wrote:
[...]
So it seems I made a boo boo here. Luckily, this is a PG15 commit, not a
live issue. I don't quite see if this can be salvaged - I'll think about
this a bit more, but it'll probably end with a revert.
The principle behind 'amhotblocking' (only blocking HOT updates) seems
correct, except that the HOT flag bit is also used as a way to block
the propagation of new values to existing indexes.
A better abstraction would be 'amSummarizes[Block]', in which updates
that modify only columns included solely in summarizing indexes still
allow HOT, but will still see an update call to all (relevant?)
summarizing indexes. That should still improve performance
significantly for the relevant cases.
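To make the intended semantics concrete, a sketch of what a build with that
abstraction should produce for the example upthread; the comments describe
the proposed behaviour, not what current master does:
----------------------------------------------------------------------
-- an UPDATE touching only a column that is indexed solely by the
-- summarizing BRIN index should still go through the HOT path ...
UPDATE t SET a = 0 WHERE random() < 0.01;
SELECT pg_stat_force_next_flush();
SELECT pg_stat_get_tuples_hot_updated('t'::regclass::oid) > 0 AS used_hot;

-- ... and the BRIN summaries should be refreshed as part of the
-- update, so the index scan and the seq scan agree on the result
SET enable_seqscan = off;
SELECT count(*) FROM t WHERE a = 0;
SET enable_seqscan = on;
SELECT count(*) FROM t WHERE a = 0;  -- expected: same count as above
----------------------------------------------------------------------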
-Matthias
On 5/28/22 21:24, Matthias van de Meent wrote:
[...]
The principle behind 'amhotblocking' (only blocking HOT updates) seems
correct, except that the HOT flag bit is also used as a way to block
the propagation of new values to existing indexes.
A better abstraction would be 'amSummarizes[Block]', in which updates
that modify only columns included solely in summarizing indexes still
allow HOT, but will still see an update call to all (relevant?)
summarizing indexes. That should still improve performance
significantly for the relevant cases.
Yeah, I think that might/should work. We could still create the HOT
chain, but we'd have to update the BRIN indexes. But that seems like a
fairly complicated change to be done this late for PG15.
regards
--
Tomas Vondra
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company
On Sat, 28 May 2022 at 22:51, Tomas Vondra
<tomas.vondra@enterprisedb.com> wrote:
[...]
Yeah, I think that might/should work. We could still create the HOT
chain, but we'd have to update the BRIN indexes. But that seems like a
fairly complicated change to be done this late for PG15.
Here's an example patch for that (based on a branch derived from
master @ 5bb2b6ab). A nod to the authors of the pHOT patch: it is
closely related and was informative about how this could/should impact
the AM APIs -- this patch does something similar (though not exactly
the same) by updating only selected indexes.
Note that this is an ABI change in some critical places -- I'm not
sure it's OK to commit a fix like this into PG15 unless we really
don't want to revert 5753d4ee320b.
Also of note is that this still updates _all_ summarizing indexes, not
only those covering the columns changed by the tuple update. Improving
on that is left to a different implementation.
The patch includes a new regression test based on your example, which
fails on master but succeeds after applying the patch.
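Condensed, the modified brin_hot test in src/test/regress/sql/brin.sql
amounts to the following; on unpatched master the final SELECT presumably
comes back empty, because the bitmap scan misses the HOT-updated row:
----------------------------------------------------------------------
CREATE TABLE brin_hot (
  id integer PRIMARY KEY,
  val integer NOT NULL
) WITH (autovacuum_enabled = off, fillfactor = 10);
INSERT INTO brin_hot SELECT *, 0 FROM generate_series(1, 100000);
CREATE INDEX val_brin ON brin_hot USING brin(val);
UPDATE brin_hot SET val = -3 WHERE id = 42;

-- ensure pending stats are flushed, then check the update was HOT
SELECT pg_stat_force_next_flush();
SELECT pg_stat_get_tuples_hot_updated('brin_hot'::regclass::oid);

-- VACUUM and ANALYZE so that the BRIN index is used for the next query
VACUUM FREEZE ANALYZE brin_hot;
EXPLAIN (COSTS OFF) SELECT * FROM brin_hot WHERE val < 0;
SELECT * FROM brin_hot WHERE val < 0;  -- with the patch: (42, -3)

DROP TABLE brin_hot;
----------------------------------------------------------------------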
-Matthias
Attachments:
v1-0001-Rework-5753d4ee-s-amhotblocking-infrastructure-re.patch (application/x-patch)
From fdcb789a52d6f63c035984da7006a276c7c946e1 Mon Sep 17 00:00:00 2001
From: Matthias van de Meent <boekewurm+postgres@gmail.com>
Date: Mon, 30 May 2022 15:55:02 +0200
Subject: [PATCH v1] Rework 5753d4ee's amhotblocking infrastructure -- replaced
with amsummarizing
Pain points of 'hot blocking': it is a strange name for an indexam-specific
API, as HOT is a heap-specific optimization that is not necessarily shared
across table AMs. The specific feature that allows HOT to still be applied
when there is an index on an updated column would be that the index provides
a 'summary of indexed values on page X'.
The specific feature for those indexes thus becomes whether the index is
'summarizing' the data of the block in a way that doesn't allow point tuple
lookups, such that the values returned by the AM are either all tuples of a
page, or no tuples of that page.
The new code in 5753d4ee320b did not take into account that even summarizing
indexes do need to receive the new value of their indexed columns, even if
the specific tuple would never be indexed as is -- the summary of the index
must still be updated with the new value, lest the summary become invalid.
This patch adds that path -- summarizing indexes will now receive an update
even if HOT is applied, through an ABI-breaking change in table_tuple_update.
Reported-By: https://www.postgresql.org/message-id/05ebcb44-f383-86e3-4f31-0a97a55634cf%40enterprisedb.com
---
doc/src/sgml/indexam.sgml | 12 ++++----
src/backend/access/brin/brin.c | 2 +-
src/backend/access/gin/ginutil.c | 2 +-
src/backend/access/gist/gist.c | 2 +-
src/backend/access/hash/hash.c | 2 +-
src/backend/access/heap/heapam.c | 13 +++++++++
src/backend/access/heap/heapam_handler.c | 19 ++++++++++--
src/backend/access/nbtree/nbtree.c | 2 +-
src/backend/access/spgist/spgutils.c | 2 +-
src/backend/access/table/tableam.c | 2 +-
src/backend/catalog/index.c | 9 ++++--
src/backend/catalog/indexing.c | 18 ++++++++++--
src/backend/commands/copyfrom.c | 5 ++--
src/backend/commands/indexcmds.c | 10 +++++--
src/backend/executor/execIndexing.c | 6 +++-
src/backend/executor/execReplication.c | 9 +++---
src/backend/executor/nodeModifyTable.c | 13 +++++----
src/backend/nodes/makefuncs.c | 4 ++-
src/backend/utils/cache/relcache.c | 24 ++++++++++++---
src/include/access/amapi.h | 4 +--
src/include/access/htup_details.h | 29 +++++++++++++++++++
src/include/access/tableam.h | 19 ++++++++++--
src/include/executor/executor.h | 3 +-
src/include/nodes/execnodes.h | 2 ++
src/include/nodes/makefuncs.h | 4 ++-
src/include/utils/rel.h | 1 +
src/include/utils/relcache.h | 3 +-
.../modules/dummy_index_am/dummy_index_am.c | 2 +-
src/test/regress/expected/brin.out | 21 ++++++++++++--
src/test/regress/sql/brin.sql | 10 +++++--
30 files changed, 202 insertions(+), 52 deletions(-)
diff --git a/doc/src/sgml/indexam.sgml b/doc/src/sgml/indexam.sgml
index d4163c96e9..f815da817f 100644
--- a/doc/src/sgml/indexam.sgml
+++ b/doc/src/sgml/indexam.sgml
@@ -126,8 +126,9 @@ typedef struct IndexAmRoutine
bool amcaninclude;
/* does AM use maintenance_work_mem? */
bool amusemaintenanceworkmem;
- /* does AM block HOT update? */
- bool amhotblocking;
+ /* does AM summarize tuples, with at least all tuples in the block
+ * summarized in one summary */
+ bool amsummarizing;
/* OR of parallel vacuum flags */
uint8 amparallelvacuumoptions;
/* type of data stored in index, or InvalidOid if variable */
@@ -249,9 +250,10 @@ typedef struct IndexAmRoutine
</para>
<para>
- The <structfield>amhotblocking</structfield> flag indicates whether the
- access method blocks <acronym>HOT</acronym> when an indexed attribute is
- updated. Access methods without pointers to individual tuples (like
+ The <structfield>amsummarizing</structfield> flag indicates whether the
+ access method summarizes the indexed tuples, with summarizing granularity
+ of at least per block.
+ Access methods without pointers to individual tuples (like
<acronym>BRIN</acronym>) may allow <acronym>HOT</acronym> even in this
case. This does not apply to attributes referenced in index predicates;
an update of such attribute always disables <acronym>HOT</acronym>.
diff --git a/src/backend/access/brin/brin.c b/src/backend/access/brin/brin.c
index 4366010768..550314e6a7 100644
--- a/src/backend/access/brin/brin.c
+++ b/src/backend/access/brin/brin.c
@@ -108,7 +108,7 @@ brinhandler(PG_FUNCTION_ARGS)
amroutine->amcanparallel = false;
amroutine->amcaninclude = false;
amroutine->amusemaintenanceworkmem = false;
- amroutine->amhotblocking = false;
+ amroutine->amsummarizing = true;
amroutine->amparallelvacuumoptions =
VACUUM_OPTION_PARALLEL_CLEANUP;
amroutine->amkeytype = InvalidOid;
diff --git a/src/backend/access/gin/ginutil.c b/src/backend/access/gin/ginutil.c
index 3d15701a01..52b5928b58 100644
--- a/src/backend/access/gin/ginutil.c
+++ b/src/backend/access/gin/ginutil.c
@@ -56,7 +56,7 @@ ginhandler(PG_FUNCTION_ARGS)
amroutine->amcanparallel = false;
amroutine->amcaninclude = false;
amroutine->amusemaintenanceworkmem = true;
- amroutine->amhotblocking = true;
+ amroutine->amsummarizing = false;
amroutine->amparallelvacuumoptions =
VACUUM_OPTION_PARALLEL_BULKDEL | VACUUM_OPTION_PARALLEL_CLEANUP;
amroutine->amkeytype = InvalidOid;
diff --git a/src/backend/access/gist/gist.c b/src/backend/access/gist/gist.c
index 8c6c744ab7..a7d0a3c21e 100644
--- a/src/backend/access/gist/gist.c
+++ b/src/backend/access/gist/gist.c
@@ -78,7 +78,7 @@ gisthandler(PG_FUNCTION_ARGS)
amroutine->amcanparallel = false;
amroutine->amcaninclude = true;
amroutine->amusemaintenanceworkmem = false;
- amroutine->amhotblocking = true;
+ amroutine->amsummarizing = false;
amroutine->amparallelvacuumoptions =
VACUUM_OPTION_PARALLEL_BULKDEL | VACUUM_OPTION_PARALLEL_COND_CLEANUP;
amroutine->amkeytype = InvalidOid;
diff --git a/src/backend/access/hash/hash.c b/src/backend/access/hash/hash.c
index fd1a7119b6..2afa658b0e 100644
--- a/src/backend/access/hash/hash.c
+++ b/src/backend/access/hash/hash.c
@@ -75,7 +75,7 @@ hashhandler(PG_FUNCTION_ARGS)
amroutine->amcanparallel = false;
amroutine->amcaninclude = false;
amroutine->amusemaintenanceworkmem = false;
- amroutine->amhotblocking = true;
+ amroutine->amsummarizing = false;
amroutine->amparallelvacuumoptions =
VACUUM_OPTION_PARALLEL_BULKDEL;
amroutine->amkeytype = INT4OID;
diff --git a/src/backend/access/heap/heapam.c b/src/backend/access/heap/heapam.c
index 7421851027..d15eaf85de 100644
--- a/src/backend/access/heap/heapam.c
+++ b/src/backend/access/heap/heapam.c
@@ -3123,6 +3123,7 @@ heap_update(Relation relation, ItemPointer otid, HeapTuple newtup,
TM_Result result;
TransactionId xid = GetCurrentTransactionId();
Bitmapset *hot_attrs;
+ Bitmapset *sum_attrs;
Bitmapset *key_attrs;
Bitmapset *id_attrs;
Bitmapset *interesting_attrs;
@@ -3145,6 +3146,7 @@ heap_update(Relation relation, ItemPointer otid, HeapTuple newtup,
bool have_tuple_lock = false;
bool iscombo;
bool use_hot_update = false;
+ bool summarized_update = false;
bool key_intact;
bool all_visible_cleared = false;
bool all_visible_cleared_new = false;
@@ -3191,11 +3193,13 @@ heap_update(Relation relation, ItemPointer otid, HeapTuple newtup,
* relcache flush happening midway through.
*/
hot_attrs = RelationGetIndexAttrBitmap(relation, INDEX_ATTR_BITMAP_HOT_BLOCKING);
+ sum_attrs = RelationGetIndexAttrBitmap(relation, INDEX_ATTR_BITMAP_SUMMARIZED);
key_attrs = RelationGetIndexAttrBitmap(relation, INDEX_ATTR_BITMAP_KEY);
id_attrs = RelationGetIndexAttrBitmap(relation,
INDEX_ATTR_BITMAP_IDENTITY_KEY);
interesting_attrs = NULL;
interesting_attrs = bms_add_members(interesting_attrs, hot_attrs);
+ interesting_attrs = bms_add_members(interesting_attrs, sum_attrs);
interesting_attrs = bms_add_members(interesting_attrs, key_attrs);
interesting_attrs = bms_add_members(interesting_attrs, id_attrs);
@@ -3506,6 +3510,7 @@ l2:
if (vmbuffer != InvalidBuffer)
ReleaseBuffer(vmbuffer);
bms_free(hot_attrs);
+ bms_free(sum_attrs);
bms_free(key_attrs);
bms_free(id_attrs);
bms_free(modified_attrs);
@@ -3827,7 +3832,11 @@ l2:
* changed.
*/
if (!bms_overlap(modified_attrs, hot_attrs))
+ {
use_hot_update = true;
+ if (bms_overlap(modified_attrs, sum_attrs))
+ summarized_update = true;
+ }
}
else
{
@@ -3987,10 +3996,14 @@ l2:
heap_freetuple(heaptup);
}
+ if (summarized_update)
+ HeapTupleHeaderSetSummaryUpdate(newtup->t_data);
+
if (old_key_tuple != NULL && old_key_copied)
heap_freetuple(old_key_tuple);
bms_free(hot_attrs);
+ bms_free(sum_attrs);
bms_free(key_attrs);
bms_free(id_attrs);
bms_free(modified_attrs);
diff --git a/src/backend/access/heap/heapam_handler.c b/src/backend/access/heap/heapam_handler.c
index 444f027149..0bfce50b35 100644
--- a/src/backend/access/heap/heapam_handler.c
+++ b/src/backend/access/heap/heapam_handler.c
@@ -314,7 +314,7 @@ static TM_Result
heapam_tuple_update(Relation relation, ItemPointer otid, TupleTableSlot *slot,
CommandId cid, Snapshot snapshot, Snapshot crosscheck,
bool wait, TM_FailureData *tmfd,
- LockTupleMode *lockmode, bool *update_indexes)
+ LockTupleMode *lockmode, TU_UpdateIndexes *update_indexes)
{
bool shouldFree = true;
HeapTuple tuple = ExecFetchSlotHeapTuple(slot, true, &shouldFree);
@@ -334,9 +334,22 @@ heapam_tuple_update(Relation relation, ItemPointer otid, TupleTableSlot *slot,
* Note: heap_update returns the tid (location) of the new tuple in the
* t_self field.
*
- * If it's a HOT update, we mustn't insert new index entries.
+ * If it's a HOT update, we mustn't insert new index entries. If the update
+ * was summarized, we must update only those summarizing indexes.
*/
- *update_indexes = result == TM_Ok && !HeapTupleIsHeapOnly(tuple);
+ if (result != TM_Ok)
+ *update_indexes = TUUI_None;
+ else if (!HeapTupleIsHeapOnly(tuple))
+ *update_indexes = TUUI_All;
+ else if (HeapTupleHeaderIsHOTWithSummaryUpdate(tuple->t_data))
+ {
+ *update_indexes = TUUI_Summarizing;
+
+ /* Clear temporary bits */
+ HeapTupleHeaderClearSummaryUpdate(tuple->t_data);
+ }
+ else
+ *update_indexes = TUUI_None;
if (shouldFree)
pfree(tuple);
diff --git a/src/backend/access/nbtree/nbtree.c b/src/backend/access/nbtree/nbtree.c
index 95da2c46bf..2f32ee720f 100644
--- a/src/backend/access/nbtree/nbtree.c
+++ b/src/backend/access/nbtree/nbtree.c
@@ -118,7 +118,7 @@ bthandler(PG_FUNCTION_ARGS)
amroutine->amcanparallel = true;
amroutine->amcaninclude = true;
amroutine->amusemaintenanceworkmem = false;
- amroutine->amhotblocking = true;
+ amroutine->amsummarizing = false;
amroutine->amparallelvacuumoptions =
VACUUM_OPTION_PARALLEL_BULKDEL | VACUUM_OPTION_PARALLEL_COND_CLEANUP;
amroutine->amkeytype = InvalidOid;
diff --git a/src/backend/access/spgist/spgutils.c b/src/backend/access/spgist/spgutils.c
index a171ca8a08..cf2a11ef70 100644
--- a/src/backend/access/spgist/spgutils.c
+++ b/src/backend/access/spgist/spgutils.c
@@ -62,7 +62,7 @@ spghandler(PG_FUNCTION_ARGS)
amroutine->amcanparallel = false;
amroutine->amcaninclude = true;
amroutine->amusemaintenanceworkmem = false;
- amroutine->amhotblocking = true;
+ amroutine->amsummarizing = false;
amroutine->amparallelvacuumoptions =
VACUUM_OPTION_PARALLEL_BULKDEL | VACUUM_OPTION_PARALLEL_COND_CLEANUP;
amroutine->amkeytype = InvalidOid;
diff --git a/src/backend/access/table/tableam.c b/src/backend/access/table/tableam.c
index b3d1a6c3f8..b6a403aa16 100644
--- a/src/backend/access/table/tableam.c
+++ b/src/backend/access/table/tableam.c
@@ -346,7 +346,7 @@ void
simple_table_tuple_update(Relation rel, ItemPointer otid,
TupleTableSlot *slot,
Snapshot snapshot,
- bool *update_indexes)
+ TU_UpdateIndexes *update_indexes)
{
TM_Result result;
TM_FailureData tmfd;
diff --git a/src/backend/catalog/index.c b/src/backend/catalog/index.c
index fd389c28d8..85fe9f715d 100644
--- a/src/backend/catalog/index.c
+++ b/src/backend/catalog/index.c
@@ -1371,7 +1371,8 @@ index_concurrently_create_copy(Relation heapRelation, Oid oldIndexId,
oldInfo->ii_Unique,
oldInfo->ii_NullsNotDistinct,
false, /* not ready for inserts */
- true);
+ true,
+ indexRelation->rd_indam->amsummarizing);
/*
* Extract the list of column names and the column numbers for the new
@@ -2422,7 +2423,8 @@ BuildIndexInfo(Relation index)
indexStruct->indisunique,
indexStruct->indnullsnotdistinct,
indexStruct->indisready,
- false);
+ false,
+ index->rd_indam->amsummarizing);
/* fill in attribute numbers */
for (i = 0; i < numAtts; i++)
@@ -2482,7 +2484,8 @@ BuildDummyIndexInfo(Relation index)
indexStruct->indisunique,
indexStruct->indnullsnotdistinct,
indexStruct->indisready,
- false);
+ false,
+ index->rd_indam->amsummarizing);
/* fill in attribute numbers */
for (i = 0; i < numAtts; i++)
diff --git a/src/backend/catalog/indexing.c b/src/backend/catalog/indexing.c
index 0b92093322..08efbf458e 100644
--- a/src/backend/catalog/indexing.c
+++ b/src/backend/catalog/indexing.c
@@ -82,6 +82,7 @@ CatalogIndexInsert(CatalogIndexState indstate, HeapTuple heapTuple)
IndexInfo **indexInfoArray;
Datum values[INDEX_MAX_KEYS];
bool isnull[INDEX_MAX_KEYS];
+ bool onlySummarized = false;
/*
* HOT update does not require index inserts. But with asserts enabled we
@@ -90,7 +91,17 @@ CatalogIndexInsert(CatalogIndexState indstate, HeapTuple heapTuple)
*/
#ifndef USE_ASSERT_CHECKING
if (HeapTupleIsHeapOnly(heapTuple))
- return;
+ {
+ if (HeapTupleHeaderIsHOTWithSummaryUpdate(heapTuple->t_data))
+ {
+ HeapTupleHeaderClearSummaryUpdate(heapTuple->t_data);
+ onlySummarized = true;
+ }
+ else
+ {
+ return;
+ }
+ }
#endif
/*
@@ -135,13 +146,16 @@ CatalogIndexInsert(CatalogIndexState indstate, HeapTuple heapTuple)
/* see earlier check above */
#ifdef USE_ASSERT_CHECKING
- if (HeapTupleIsHeapOnly(heapTuple))
+ if (HeapTupleIsHeapOnly(heapTuple) && !onlySummarized)
{
Assert(!ReindexIsProcessingIndex(RelationGetRelid(index)));
continue;
}
#endif /* USE_ASSERT_CHECKING */
+ if (onlySummarized && !indexInfo->ii_Summarizing)
+ continue;
+
/*
* FormIndexDatum fills in its values and isnull parameters with the
* appropriate values for the column(s) of the index.
diff --git a/src/backend/commands/copyfrom.c b/src/backend/commands/copyfrom.c
index 35a1d3a774..c79711b0d5 100644
--- a/src/backend/commands/copyfrom.c
+++ b/src/backend/commands/copyfrom.c
@@ -347,7 +347,7 @@ CopyMultiInsertBufferFlush(CopyMultiInsertInfo *miinfo,
recheckIndexes =
ExecInsertIndexTuples(resultRelInfo,
buffer->slots[i], estate, false, false,
- NULL, NIL);
+ NULL, NIL, false);
ExecARInsertTriggers(estate, resultRelInfo,
slots[i], recheckIndexes,
cstate->transition_capture);
@@ -1104,7 +1104,8 @@ CopyFrom(CopyFromState cstate)
false,
false,
NULL,
- NIL);
+ NIL,
+ false);
}
/* AFTER ROW INSERT Triggers */
diff --git a/src/backend/commands/indexcmds.c b/src/backend/commands/indexcmds.c
index cd30f15eba..af1a979760 100644
--- a/src/backend/commands/indexcmds.c
+++ b/src/backend/commands/indexcmds.c
@@ -177,6 +177,7 @@ CheckIndexCompatible(Oid oldId,
Form_pg_am accessMethodForm;
IndexAmRoutine *amRoutine;
bool amcanorder;
+ bool amsummarizing;
int16 *coloptions;
IndexInfo *indexInfo;
int numberOfAttributes;
@@ -215,6 +216,7 @@ CheckIndexCompatible(Oid oldId,
ReleaseSysCache(tuple);
amcanorder = amRoutine->amcanorder;
+ amsummarizing = amRoutine->amsummarizing;
/*
* Compute the operator classes, collations, and exclusion operators for
@@ -226,7 +228,8 @@ CheckIndexCompatible(Oid oldId,
* ii_NumIndexKeyAttrs with same value.
*/
indexInfo = makeIndexInfo(numberOfAttributes, numberOfAttributes,
- accessMethodId, NIL, NIL, false, false, false, false);
+ accessMethodId, NIL, NIL, false, false,
+ false, false, amsummarizing);
typeObjectId = (Oid *) palloc(numberOfAttributes * sizeof(Oid));
collationObjectId = (Oid *) palloc(numberOfAttributes * sizeof(Oid));
classObjectId = (Oid *) palloc(numberOfAttributes * sizeof(Oid));
@@ -531,6 +534,7 @@ DefineIndex(Oid relationId,
Form_pg_am accessMethodForm;
IndexAmRoutine *amRoutine;
bool amcanorder;
+ bool amissummarizing;
amoptions_function amoptions;
bool partitioned;
bool safe_index;
@@ -838,6 +842,7 @@ DefineIndex(Oid relationId,
amcanorder = amRoutine->amcanorder;
amoptions = amRoutine->amoptions;
+ amissummarizing = amRoutine->amsummarizing;
pfree(amRoutine);
ReleaseSysCache(tuple);
@@ -869,7 +874,8 @@ DefineIndex(Oid relationId,
stmt->unique,
stmt->nulls_not_distinct,
!concurrent,
- concurrent);
+ concurrent,
+ amissummarizing);
typeObjectId = (Oid *) palloc(numberOfAttributes * sizeof(Oid));
collationObjectId = (Oid *) palloc(numberOfAttributes * sizeof(Oid));
diff --git a/src/backend/executor/execIndexing.c b/src/backend/executor/execIndexing.c
index 0cb0b8f111..f81274bf89 100644
--- a/src/backend/executor/execIndexing.c
+++ b/src/backend/executor/execIndexing.c
@@ -287,7 +287,8 @@ ExecInsertIndexTuples(ResultRelInfo *resultRelInfo,
bool update,
bool noDupErr,
bool *specConflict,
- List *arbiterIndexes)
+ List *arbiterIndexes,
+ bool onlySummarizing)
{
ItemPointer tupleid = &slot->tts_tid;
List *result = NIL;
@@ -343,6 +344,9 @@ ExecInsertIndexTuples(ResultRelInfo *resultRelInfo,
if (!indexInfo->ii_ReadyForInserts)
continue;
+ if (onlySummarizing && !indexInfo->ii_Summarizing)
+ continue;
+
/* Check for partial index */
if (indexInfo->ii_Predicate != NIL)
{
diff --git a/src/backend/executor/execReplication.c b/src/backend/executor/execReplication.c
index b000645d48..1a7c928e1e 100644
--- a/src/backend/executor/execReplication.c
+++ b/src/backend/executor/execReplication.c
@@ -445,7 +445,7 @@ ExecSimpleRelationInsert(ResultRelInfo *resultRelInfo,
if (resultRelInfo->ri_NumIndices > 0)
recheckIndexes = ExecInsertIndexTuples(resultRelInfo,
slot, estate, false, false,
- NULL, NIL);
+ NULL, NIL, false);
/* AFTER ROW INSERT Triggers */
ExecARInsertTriggers(estate, resultRelInfo, slot,
@@ -493,7 +493,7 @@ ExecSimpleRelationUpdate(ResultRelInfo *resultRelInfo,
if (!skip_tuple)
{
List *recheckIndexes = NIL;
- bool update_indexes;
+ TU_UpdateIndexes update_indexes;
/* Compute stored generated columns */
if (rel->rd_att->constr &&
@@ -510,10 +510,11 @@ ExecSimpleRelationUpdate(ResultRelInfo *resultRelInfo,
simple_table_tuple_update(rel, tid, slot, estate->es_snapshot,
&update_indexes);
- if (resultRelInfo->ri_NumIndices > 0 && update_indexes)
+ if (resultRelInfo->ri_NumIndices > 0 && update_indexes != TUUI_None)
recheckIndexes = ExecInsertIndexTuples(resultRelInfo,
slot, estate, true, false,
- NULL, NIL);
+ NULL, NIL,
+ update_indexes == TUUI_Summarizing);
/* AFTER ROW UPDATE Triggers */
ExecARUpdateTriggers(estate, resultRelInfo,
diff --git a/src/backend/executor/nodeModifyTable.c b/src/backend/executor/nodeModifyTable.c
index 171575cd73..a981479ae0 100644
--- a/src/backend/executor/nodeModifyTable.c
+++ b/src/backend/executor/nodeModifyTable.c
@@ -130,8 +130,8 @@ typedef struct ModifyTableContext
typedef struct UpdateContext
{
bool updated; /* did UPDATE actually occur? */
- bool updateIndexes; /* index update required? */
bool crossPartUpdate; /* was it a cross-partition update? */
+ TU_UpdateIndexes updateIndexes; /* Which index updates are required? */
} UpdateContext;
@@ -1024,7 +1024,8 @@ ExecInsert(ModifyTableContext *context,
recheckIndexes = ExecInsertIndexTuples(resultRelInfo,
slot, estate, false, true,
&specConflict,
- arbiterIndexes);
+ arbiterIndexes,
+ false);
/* adjust the tuple's state accordingly */
table_tuple_complete_speculative(resultRelationDesc, slot,
@@ -1063,7 +1064,8 @@ ExecInsert(ModifyTableContext *context,
if (resultRelInfo->ri_NumIndices > 0)
recheckIndexes = ExecInsertIndexTuples(resultRelInfo,
slot, estate, false,
- false, NULL, NIL);
+ false, NULL, NIL,
+ false);
}
}
@@ -1994,11 +1996,12 @@ ExecUpdateEpilogue(ModifyTableContext *context, UpdateContext *updateCxt,
ModifyTableState *mtstate = context->mtstate;
/* insert index entries for tuple if necessary */
- if (resultRelInfo->ri_NumIndices > 0 && updateCxt->updateIndexes)
+ if (resultRelInfo->ri_NumIndices > 0 && updateCxt->updateIndexes != TUUI_None)
recheckIndexes = ExecInsertIndexTuples(resultRelInfo,
slot, context->estate,
true, false,
- NULL, NIL);
+ NULL, NIL,
+ updateCxt->updateIndexes == TUUI_Summarizing);
/* AFTER ROW UPDATE Triggers */
ExecARUpdateTriggers(context->estate, resultRelInfo,
diff --git a/src/backend/nodes/makefuncs.c b/src/backend/nodes/makefuncs.c
index 41e26a0fe6..29d7517067 100644
--- a/src/backend/nodes/makefuncs.c
+++ b/src/backend/nodes/makefuncs.c
@@ -742,7 +742,8 @@ make_ands_implicit(Expr *clause)
*/
IndexInfo *
makeIndexInfo(int numattrs, int numkeyattrs, Oid amoid, List *expressions,
- List *predicates, bool unique, bool nulls_not_distinct, bool isready, bool concurrent)
+ List *predicates, bool unique, bool nulls_not_distinct,
+ bool isready, bool concurrent, bool summarizing)
{
IndexInfo *n = makeNode(IndexInfo);
@@ -756,6 +757,7 @@ makeIndexInfo(int numattrs, int numkeyattrs, Oid amoid, List *expressions,
n->ii_CheckedUnchanged = false;
n->ii_IndexUnchanged = false;
n->ii_Concurrent = concurrent;
+ n->ii_Summarizing = summarizing;
/* expressions */
n->ii_Expressions = expressions;
diff --git a/src/backend/utils/cache/relcache.c b/src/backend/utils/cache/relcache.c
index 43f14c233d..bbc01994d8 100644
--- a/src/backend/utils/cache/relcache.c
+++ b/src/backend/utils/cache/relcache.c
@@ -2443,6 +2443,7 @@ RelationDestroyRelation(Relation relation, bool remember_tupdesc)
bms_free(relation->rd_pkattr);
bms_free(relation->rd_idattr);
bms_free(relation->rd_hotblockingattr);
+ bms_free(relation->rd_summarizedattr);
if (relation->rd_pubdesc)
pfree(relation->rd_pubdesc);
if (relation->rd_options)
@@ -5108,6 +5109,7 @@ RelationGetIndexAttrBitmap(Relation relation, IndexAttrBitmapKind attrKind)
Bitmapset *pkindexattrs; /* columns in the primary index */
Bitmapset *idindexattrs; /* columns in the replica identity */
Bitmapset *hotblockingattrs; /* columns with HOT blocking indexes */
+ Bitmapset *summarizedattrs; /* columns with summarizing indexes */
List *indexoidlist;
List *newindexoidlist;
Oid relpkindex;
@@ -5128,6 +5130,8 @@ RelationGetIndexAttrBitmap(Relation relation, IndexAttrBitmapKind attrKind)
return bms_copy(relation->rd_idattr);
case INDEX_ATTR_BITMAP_HOT_BLOCKING:
return bms_copy(relation->rd_hotblockingattr);
+ case INDEX_ATTR_BITMAP_SUMMARIZED:
+ return bms_copy(relation->rd_summarizedattr);
default:
elog(ERROR, "unknown attrKind %u", attrKind);
}
@@ -5171,6 +5175,7 @@ restart:
pkindexattrs = NULL;
idindexattrs = NULL;
hotblockingattrs = NULL;
+ summarizedattrs = NULL;
foreach(l, indexoidlist)
{
Oid indexOid = lfirst_oid(l);
@@ -5235,9 +5240,12 @@ restart:
*/
if (attrnum != 0)
{
- if (indexDesc->rd_indam->amhotblocking)
+ if (indexDesc->rd_indam->amsummarizing)
+ summarizedattrs = bms_add_member(summarizedattrs,
+ attrnum - FirstLowInvalidHeapAttributeNumber);
+ else
hotblockingattrs = bms_add_member(hotblockingattrs,
- attrnum - FirstLowInvalidHeapAttributeNumber);
+ attrnum - FirstLowInvalidHeapAttributeNumber);
if (isKey && i < indexDesc->rd_index->indnkeyatts)
uindexattrs = bms_add_member(uindexattrs,
@@ -5254,12 +5262,14 @@ restart:
}
/* Collect all attributes used in expressions, too */
- if (indexDesc->rd_indam->amhotblocking)
+ if (indexDesc->rd_indam->amsummarizing)
+ pull_varattnos(indexExpressions, 1, &summarizedattrs);
+ else
pull_varattnos(indexExpressions, 1, &hotblockingattrs);
/*
* Collect all attributes in the index predicate, too. We have to ignore
- * amhotblocking flag, because the row might become indexable, in which
+ * amsummarizing flag, because the row might become indexable, in which
* case we have to add it to the index.
*/
pull_varattnos(indexPredicate, 1, &hotblockingattrs);
@@ -5291,6 +5301,7 @@ restart:
bms_free(pkindexattrs);
bms_free(idindexattrs);
bms_free(hotblockingattrs);
+ bms_free(summarizedattrs);
goto restart;
}
@@ -5304,6 +5315,8 @@ restart:
relation->rd_idattr = NULL;
bms_free(relation->rd_hotblockingattr);
relation->rd_hotblockingattr = NULL;
+ bms_free(relation->rd_summarizedattr);
+ relation->rd_summarizedattr = NULL;
/*
* Now save copies of the bitmaps in the relcache entry. We intentionally
@@ -5317,6 +5330,7 @@ restart:
relation->rd_pkattr = bms_copy(pkindexattrs);
relation->rd_idattr = bms_copy(idindexattrs);
relation->rd_hotblockingattr = bms_copy(hotblockingattrs);
+ relation->rd_summarizedattr = bms_copy(summarizedattrs);
relation->rd_attrsvalid = true;
MemoryContextSwitchTo(oldcxt);
@@ -5331,6 +5345,8 @@ restart:
return idindexattrs;
case INDEX_ATTR_BITMAP_HOT_BLOCKING:
return hotblockingattrs;
+ case INDEX_ATTR_BITMAP_SUMMARIZED:
+ return summarizedattrs;
default:
elog(ERROR, "unknown attrKind %u", attrKind);
return NULL;
diff --git a/src/include/access/amapi.h b/src/include/access/amapi.h
index a382551a98..ba357bd34d 100644
--- a/src/include/access/amapi.h
+++ b/src/include/access/amapi.h
@@ -244,8 +244,8 @@ typedef struct IndexAmRoutine
bool amcaninclude;
/* does AM use maintenance_work_mem? */
bool amusemaintenanceworkmem;
- /* does AM block HOT update? */
- bool amhotblocking;
+ /* does AM store information only at (at least) block level? */
+ bool amsummarizing;
/* OR of parallel vacuum flags. See vacuum.h for flags. */
uint8 amparallelvacuumoptions;
/* type of data stored in index, or InvalidOid if variable */
diff --git a/src/include/access/htup_details.h b/src/include/access/htup_details.h
index 51a60eda08..80d086076b 100644
--- a/src/include/access/htup_details.h
+++ b/src/include/access/htup_details.h
@@ -290,6 +290,16 @@ struct HeapTupleHeaderData
*/
#define HEAP_TUPLE_HAS_MATCH HEAP_ONLY_TUPLE /* tuple has a join match */
+/*
+ * HEAP_TUPLE_SUMMARIZING_UPDATED is a temporary flag used to signal that
+ * of the indexed columns, only columns used in summarizing indexes were
+ * updated. It is only used on the in-memory newly inserted updated tuple,
+ * which can't have been HOT updated at this point, so this should never
+ * pose an issue.
+ */
+#define HEAP_TUPLE_SUMMARIZING_UPDATED HEAP_HOT_UPDATED
+
+
/*
* HeapTupleHeader accessor macros
*
@@ -539,6 +549,25 @@ do { \
(((tup)->t_infomask & HEAP_HASEXTERNAL) != 0)
+#define HeapTupleHeaderIsHOTWithSummaryUpdate(tup) \
+( \
+ ((tup)->t_infomask2 & HEAP_ONLY_TUPLE) != 0 && \
+ ((tup)->t_infomask2 & HEAP_TUPLE_SUMMARIZING_UPDATED) != 0 \
+)
+
+#define HeapTupleHeaderSetSummaryUpdate(tup) \
+( \
+ (tup)->t_infomask2 |= HEAP_TUPLE_SUMMARIZING_UPDATED \
+)
+
+#define HeapTupleHeaderClearSummaryUpdate(tup) \
+( \
+ (tup)->t_infomask2 &= ~HEAP_TUPLE_SUMMARIZING_UPDATED \
+)
+
+
+
+
/*
* BITMAPLEN(NATTS) -
* Computes size of null bitmap given number of data columns.
diff --git a/src/include/access/tableam.h b/src/include/access/tableam.h
index fe869c6c18..5ead2ec0fd 100644
--- a/src/include/access/tableam.h
+++ b/src/include/access/tableam.h
@@ -102,6 +102,19 @@ typedef enum TM_Result
TM_WouldBlock
} TM_Result;
+/*
+ * Result codes for table_update(..., update_indexes*..).
+ * Used to determine which indexes to update.
+ */
+typedef enum TU_UpdateIndexes {
+ /* No indexed columns were updated (incl. TID addressing of tuple) */
+ TUUI_None = 0,
+ /* A non-summarizing indexed column was updated, or the TID has changed */
+ TUUI_All = 1,
+ /* Only summarized columns were updated, TID is unchanged */
+ TUUI_Summarizing = 2
+} TU_UpdateIndexes;
+
/*
* When table_tuple_update, table_tuple_delete, or table_tuple_lock fail
* because the target tuple is already outdated, they fill in this struct to
@@ -526,7 +539,7 @@ typedef struct TableAmRoutine
bool wait,
TM_FailureData *tmfd,
LockTupleMode *lockmode,
- bool *update_indexes);
+ TU_UpdateIndexes *update_indexes);
/* see table_tuple_lock() for reference about parameters */
TM_Result (*tuple_lock) (Relation rel,
@@ -1506,7 +1519,7 @@ static inline TM_Result
table_tuple_update(Relation rel, ItemPointer otid, TupleTableSlot *slot,
CommandId cid, Snapshot snapshot, Snapshot crosscheck,
bool wait, TM_FailureData *tmfd, LockTupleMode *lockmode,
- bool *update_indexes)
+ TU_UpdateIndexes *update_indexes)
{
return rel->rd_tableam->tuple_update(rel, otid, slot,
cid, snapshot, crosscheck,
@@ -2029,7 +2042,7 @@ extern void simple_table_tuple_delete(Relation rel, ItemPointer tid,
Snapshot snapshot);
extern void simple_table_tuple_update(Relation rel, ItemPointer otid,
TupleTableSlot *slot, Snapshot snapshot,
- bool *update_indexes);
+ TU_UpdateIndexes *update_indexes);
/* ----------------------------------------------------------------------------
diff --git a/src/include/executor/executor.h b/src/include/executor/executor.h
index 873772f188..518bbf98d0 100644
--- a/src/include/executor/executor.h
+++ b/src/include/executor/executor.h
@@ -617,7 +617,8 @@ extern List *ExecInsertIndexTuples(ResultRelInfo *resultRelInfo,
TupleTableSlot *slot, EState *estate,
bool update,
bool noDupErr,
- bool *specConflict, List *arbiterIndexes);
+ bool *specConflict, List *arbiterIndexes,
+ bool onlySummarizing);
extern bool ExecCheckIndexConstraints(ResultRelInfo *resultRelInfo,
TupleTableSlot *slot,
EState *estate, ItemPointer conflictTid,
diff --git a/src/include/nodes/execnodes.h b/src/include/nodes/execnodes.h
index 94b191f8ae..801a4356e3 100644
--- a/src/include/nodes/execnodes.h
+++ b/src/include/nodes/execnodes.h
@@ -146,6 +146,7 @@ typedef struct ExprState
* IndexUnchanged aminsert hint, cached for retail inserts
* Concurrent are we doing a concurrent index build?
* BrokenHotChain did we detect any broken HOT chains?
+ * Summarizing is it summarizing?
* ParallelWorkers # of workers requested (excludes leader)
* Am Oid of index AM
* AmCache private cache area for index AM
@@ -179,6 +180,7 @@ typedef struct IndexInfo
bool ii_IndexUnchanged;
bool ii_Concurrent;
bool ii_BrokenHotChain;
+ bool ii_Summarizing;
int ii_ParallelWorkers;
Oid ii_Am;
void *ii_AmCache;
diff --git a/src/include/nodes/makefuncs.h b/src/include/nodes/makefuncs.h
index c717468eb3..b631c363ba 100644
--- a/src/include/nodes/makefuncs.h
+++ b/src/include/nodes/makefuncs.h
@@ -96,7 +96,9 @@ extern List *make_ands_implicit(Expr *clause);
extern IndexInfo *makeIndexInfo(int numattrs, int numkeyattrs, Oid amoid,
List *expressions, List *predicates,
- bool unique, bool nulls_not_distinct, bool isready, bool concurrent);
+ bool unique, bool nulls_not_distinct,
+ bool isready, bool concurrent,
+ bool summarizing);
extern DefElem *makeDefElem(char *name, Node *arg, int location);
extern DefElem *makeDefElemExtended(char *nameSpace, char *name, Node *arg,
diff --git a/src/include/utils/rel.h b/src/include/utils/rel.h
index eadbd00904..d41bb79881 100644
--- a/src/include/utils/rel.h
+++ b/src/include/utils/rel.h
@@ -160,6 +160,7 @@ typedef struct RelationData
Bitmapset *rd_pkattr; /* cols included in primary key */
Bitmapset *rd_idattr; /* included in replica identity index */
Bitmapset *rd_hotblockingattr; /* cols blocking HOT update */
+ Bitmapset *rd_summarizedattr; /* cols indexed by block-or-larger summarizing indexes */
PublicationDesc *rd_pubdesc; /* publication descriptor, or NULL */
diff --git a/src/include/utils/relcache.h b/src/include/utils/relcache.h
index 86dddbd975..e7cdd11e7b 100644
--- a/src/include/utils/relcache.h
+++ b/src/include/utils/relcache.h
@@ -58,7 +58,8 @@ typedef enum IndexAttrBitmapKind
INDEX_ATTR_BITMAP_KEY,
INDEX_ATTR_BITMAP_PRIMARY_KEY,
INDEX_ATTR_BITMAP_IDENTITY_KEY,
- INDEX_ATTR_BITMAP_HOT_BLOCKING
+ INDEX_ATTR_BITMAP_HOT_BLOCKING,
+ INDEX_ATTR_BITMAP_SUMMARIZED
} IndexAttrBitmapKind;
extern Bitmapset *RelationGetIndexAttrBitmap(Relation relation,
diff --git a/src/test/modules/dummy_index_am/dummy_index_am.c b/src/test/modules/dummy_index_am/dummy_index_am.c
index 22578b6246..3785c6c9bc 100644
--- a/src/test/modules/dummy_index_am/dummy_index_am.c
+++ b/src/test/modules/dummy_index_am/dummy_index_am.c
@@ -298,7 +298,7 @@ dihandler(PG_FUNCTION_ARGS)
amroutine->amcanparallel = false;
amroutine->amcaninclude = false;
amroutine->amusemaintenanceworkmem = false;
- amroutine->amhotblocking = true;
+ amroutine->amsummarizing = false;
amroutine->amparallelvacuumoptions = VACUUM_OPTION_NO_PARALLEL;
amroutine->amkeytype = InvalidOid;
diff --git a/src/test/regress/expected/brin.out b/src/test/regress/expected/brin.out
index ed7879f583..8f4698aae4 100644
--- a/src/test/regress/expected/brin.out
+++ b/src/test/regress/expected/brin.out
@@ -607,8 +607,8 @@ SELECT COUNT(*) FROM brin_hot_2 WHERE a = 2 AND b = 100;
CREATE TABLE brin_hot (
id integer PRIMARY KEY,
val integer NOT NULL
-) WITH (autovacuum_enabled = off, fillfactor = 70);
-INSERT INTO brin_hot SELECT *, 0 FROM generate_series(1, 235);
+) WITH (autovacuum_enabled = off, fillfactor = 10);
+INSERT INTO brin_hot SELECT *, 0 FROM generate_series(1, 100000);
CREATE INDEX val_brin ON brin_hot using brin(val);
UPDATE brin_hot SET val = -3 WHERE id = 42;
-- ensure pending stats are flushed
@@ -624,4 +624,21 @@ SELECT pg_stat_get_tuples_hot_updated('brin_hot'::regclass::oid);
1
(1 row)
+-- VACUUM and ANALYZE so that we use the BRIN index on the next query
+VACUUM FREEZE ANALYZE brin_hot;
+EXPLAIN (COSTS OFF) SELECT * FROM brin_hot WHERE val < 0;
+ QUERY PLAN
+-------------------------------------
+ Bitmap Heap Scan on brin_hot
+ Recheck Cond: (val < 0)
+ -> Bitmap Index Scan on val_brin
+ Index Cond: (val < 0)
+(4 rows)
+
+SELECT * FROM brin_hot WHERE val < 0;
+ id | val
+----+-----
+ 42 | -3
+(1 row)
+
DROP TABLE brin_hot;
diff --git a/src/test/regress/sql/brin.sql b/src/test/regress/sql/brin.sql
index 920e053249..9feb0f47ed 100644
--- a/src/test/regress/sql/brin.sql
+++ b/src/test/regress/sql/brin.sql
@@ -532,9 +532,9 @@ SELECT COUNT(*) FROM brin_hot_2 WHERE a = 2 AND b = 100;
CREATE TABLE brin_hot (
id integer PRIMARY KEY,
val integer NOT NULL
-) WITH (autovacuum_enabled = off, fillfactor = 70);
+) WITH (autovacuum_enabled = off, fillfactor = 10);
-INSERT INTO brin_hot SELECT *, 0 FROM generate_series(1, 235);
+INSERT INTO brin_hot SELECT *, 0 FROM generate_series(1, 100000);
CREATE INDEX val_brin ON brin_hot using brin(val);
UPDATE brin_hot SET val = -3 WHERE id = 42;
@@ -544,4 +544,10 @@ SELECT pg_stat_force_next_flush();
SELECT pg_stat_get_tuples_hot_updated('brin_hot'::regclass::oid);
+-- VACUUM and ANALYZE so that we use the BRIN index on the next query
+VACUUM FREEZE ANALYZE brin_hot;
+
+EXPLAIN (COSTS OFF) SELECT * FROM brin_hot WHERE val < 0;
+SELECT * FROM brin_hot WHERE val < 0;
+
DROP TABLE brin_hot;
--
2.30.2
Hi,
On 2022-05-30 17:22:35 +0200, Matthias van de Meent wrote:
[...]
Note that this is an ABI change in some critical places -- I'm not
sure it's OK to commit a fix like this into PG15 unless we really
don't want to revert 5753d4ee320b.
This seems like a pretty clear-cut case for reverting and retrying in
16. There's plenty of subtlety in this area (as evidenced by this thread and
the index/reindex concurrently breakage), and building infrastructure post
beta1 isn't exactly conducive to careful analysis and testing.
Greetings,
Andres Freund
On Sat, May 28, 2022 at 4:51 PM Tomas Vondra
<tomas.vondra@enterprisedb.com> wrote:
Yeah, I think that might/should work. We could still create the HOT
chain, but we'd have to update the BRIN indexes. But that seems like a
fairly complicated change to be done this late for PG15.
Yeah, I think a revert is better for now. But I agree that the basic
idea seems salvageable. I think that the commit message is correct
when it states that "When determining whether an index update may be
skipped by using HOT, we can ignore attributes indexed only by BRIN
indexes." However, that doesn't mean that we can ignore the need to
update those indexes. In that regard, the commit message makes it
sound like all is well, because it states that "the page range summary
will be updated anyway", which reads to me as if the indexes are in fact
getting updated. Your example, however, seems to show that the indexes
are not getting updated.
--
Robert Haas
EDB: http://www.enterprisedb.com
On 6/1/22 22:38, Robert Haas wrote:
[...]
Yeah, I think a revert is better for now. But I agree that the basic
idea seems salvageable.
Yeah, agreed :-( We can probably salvage some of the idea, but it's far
too late for major reworks in PG15.
Attached is a patch reverting both commits (5753d4ee32 and fe60b67250).
This changes the IndexAmRoutine struct, so it's an ABI break. That's not
great post-beta :-( In principle we might also leave amhotblocking in
the struct but ignore it in the code (and treat it as false), but that
seems weird and it's going to be a pain when backpatching. Opinions?
regards
--
Tomas Vondra
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company
Attachments:
0001-Revert-changes-in-HOT-handling-of-BRIN-indexes.patch (text/x-patch)
From 22911ff0df0284bd9c97d4971ab10332444c528d Mon Sep 17 00:00:00 2001
From: Tomas Vondra <tomas.vondra@postgresql.org>
Date: Mon, 6 Jun 2022 08:21:36 +0200
Subject: [PATCH] Revert changes in HOT handling of BRIN indexes
This reverts commits 5753d4ee32 and fe60b67250 that modified HOT to
ignore BRIN indexes. The commit message for 5753d4ee32 claims that:
When determining whether an index update may be skipped by using
HOT, we can ignore attributes indexed only by BRIN indexes. There
are no index pointers to individual tuples in BRIN, and the page
range summary will be updated anyway as it relies on visibility
info.
This is partially incorrect - it's true BRIN indexes don't point to
individual tuples, so HOT chains are not an issue, but the visibility
info is not sufficient to keep the index up to date. This can easily
result in corrupted indexes, as demonstrated in the hackers thread.
This does not mean relaxing the HOT restrictions for BRIN is a lost
cause, but the two aspects (allowing HOT chains and updating the page
range summaries) need to be handled separately. That requires major
changes, and it's too late for that in the current dev cycle.
Reported-by: Tomas Vondra
Discussion: https://postgr.es/m/05ebcb44-f383-86e3-4f31-0a97a55634cf@enterprisedb.com
---
doc/src/sgml/indexam.sgml | 11 -
src/backend/access/brin/brin.c | 1 -
src/backend/access/gin/ginutil.c | 1 -
src/backend/access/gist/gist.c | 1 -
src/backend/access/hash/hash.c | 1 -
src/backend/access/heap/heapam.c | 2 +-
src/backend/access/nbtree/nbtree.c | 1 -
src/backend/access/spgist/spgutils.c | 1 -
src/backend/utils/cache/relcache.c | 53 ++--
src/include/access/amapi.h | 2 -
src/include/utils/rel.h | 3 +-
src/include/utils/relcache.h | 4 +-
.../modules/dummy_index_am/dummy_index_am.c | 1 -
src/test/regress/expected/brin.out | 58 -----
src/test/regress/expected/stats.out | 242 ------------------
src/test/regress/sql/brin.sql | 36 ---
src/test/regress/sql/stats.sql | 111 --------
17 files changed, 27 insertions(+), 502 deletions(-)
diff --git a/doc/src/sgml/indexam.sgml b/doc/src/sgml/indexam.sgml
index d4163c96e9f..cf359fa9ffd 100644
--- a/doc/src/sgml/indexam.sgml
+++ b/doc/src/sgml/indexam.sgml
@@ -126,8 +126,6 @@ typedef struct IndexAmRoutine
bool amcaninclude;
/* does AM use maintenance_work_mem? */
bool amusemaintenanceworkmem;
- /* does AM block HOT update? */
- bool amhotblocking;
/* OR of parallel vacuum flags */
uint8 amparallelvacuumoptions;
/* type of data stored in index, or InvalidOid if variable */
@@ -248,15 +246,6 @@ typedef struct IndexAmRoutine
null, independently of <structfield>amoptionalkey</structfield>.
</para>
- <para>
- The <structfield>amhotblocking</structfield> flag indicates whether the
- access method blocks <acronym>HOT</acronym> when an indexed attribute is
- updated. Access methods without pointers to individual tuples (like
- <acronym>BRIN</acronym>) may allow <acronym>HOT</acronym> even in this
- case. This does not apply to attributes referenced in index predicates;
- an update of such attribute always disables <acronym>HOT</acronym>.
- </para>
-
</sect1>
<sect1 id="index-functions">
diff --git a/src/backend/access/brin/brin.c b/src/backend/access/brin/brin.c
index 0de1441dc6d..e88f7efa7e4 100644
--- a/src/backend/access/brin/brin.c
+++ b/src/backend/access/brin/brin.c
@@ -108,7 +108,6 @@ brinhandler(PG_FUNCTION_ARGS)
amroutine->amcanparallel = false;
amroutine->amcaninclude = false;
amroutine->amusemaintenanceworkmem = false;
- amroutine->amhotblocking = false;
amroutine->amparallelvacuumoptions =
VACUUM_OPTION_PARALLEL_CLEANUP;
amroutine->amkeytype = InvalidOid;
diff --git a/src/backend/access/gin/ginutil.c b/src/backend/access/gin/ginutil.c
index 3d15701a01e..20f470648be 100644
--- a/src/backend/access/gin/ginutil.c
+++ b/src/backend/access/gin/ginutil.c
@@ -56,7 +56,6 @@ ginhandler(PG_FUNCTION_ARGS)
amroutine->amcanparallel = false;
amroutine->amcaninclude = false;
amroutine->amusemaintenanceworkmem = true;
- amroutine->amhotblocking = true;
amroutine->amparallelvacuumoptions =
VACUUM_OPTION_PARALLEL_BULKDEL | VACUUM_OPTION_PARALLEL_CLEANUP;
amroutine->amkeytype = InvalidOid;
diff --git a/src/backend/access/gist/gist.c b/src/backend/access/gist/gist.c
index 8c6c744ab74..5866c6aaaf7 100644
--- a/src/backend/access/gist/gist.c
+++ b/src/backend/access/gist/gist.c
@@ -78,7 +78,6 @@ gisthandler(PG_FUNCTION_ARGS)
amroutine->amcanparallel = false;
amroutine->amcaninclude = true;
amroutine->amusemaintenanceworkmem = false;
- amroutine->amhotblocking = true;
amroutine->amparallelvacuumoptions =
VACUUM_OPTION_PARALLEL_BULKDEL | VACUUM_OPTION_PARALLEL_COND_CLEANUP;
amroutine->amkeytype = InvalidOid;
diff --git a/src/backend/access/hash/hash.c b/src/backend/access/hash/hash.c
index fd1a7119b6c..c361509d68d 100644
--- a/src/backend/access/hash/hash.c
+++ b/src/backend/access/hash/hash.c
@@ -75,7 +75,6 @@ hashhandler(PG_FUNCTION_ARGS)
amroutine->amcanparallel = false;
amroutine->amcaninclude = false;
amroutine->amusemaintenanceworkmem = false;
- amroutine->amhotblocking = true;
amroutine->amparallelvacuumoptions =
VACUUM_OPTION_PARALLEL_BULKDEL;
amroutine->amkeytype = INT4OID;
diff --git a/src/backend/access/heap/heapam.c b/src/backend/access/heap/heapam.c
index 74218510276..637de1116c9 100644
--- a/src/backend/access/heap/heapam.c
+++ b/src/backend/access/heap/heapam.c
@@ -3190,7 +3190,7 @@ heap_update(Relation relation, ItemPointer otid, HeapTuple newtup,
* Note that we get copies of each bitmap, so we need not worry about
* relcache flush happening midway through.
*/
- hot_attrs = RelationGetIndexAttrBitmap(relation, INDEX_ATTR_BITMAP_HOT_BLOCKING);
+ hot_attrs = RelationGetIndexAttrBitmap(relation, INDEX_ATTR_BITMAP_ALL);
key_attrs = RelationGetIndexAttrBitmap(relation, INDEX_ATTR_BITMAP_KEY);
id_attrs = RelationGetIndexAttrBitmap(relation,
INDEX_ATTR_BITMAP_IDENTITY_KEY);
diff --git a/src/backend/access/nbtree/nbtree.c b/src/backend/access/nbtree/nbtree.c
index 06131f23d4b..9b730f303fb 100644
--- a/src/backend/access/nbtree/nbtree.c
+++ b/src/backend/access/nbtree/nbtree.c
@@ -114,7 +114,6 @@ bthandler(PG_FUNCTION_ARGS)
amroutine->amcanparallel = true;
amroutine->amcaninclude = true;
amroutine->amusemaintenanceworkmem = false;
- amroutine->amhotblocking = true;
amroutine->amparallelvacuumoptions =
VACUUM_OPTION_PARALLEL_BULKDEL | VACUUM_OPTION_PARALLEL_COND_CLEANUP;
amroutine->amkeytype = InvalidOid;
diff --git a/src/backend/access/spgist/spgutils.c b/src/backend/access/spgist/spgutils.c
index a171ca8a08a..2c661fcf96f 100644
--- a/src/backend/access/spgist/spgutils.c
+++ b/src/backend/access/spgist/spgutils.c
@@ -62,7 +62,6 @@ spghandler(PG_FUNCTION_ARGS)
amroutine->amcanparallel = false;
amroutine->amcaninclude = true;
amroutine->amusemaintenanceworkmem = false;
- amroutine->amhotblocking = true;
amroutine->amparallelvacuumoptions =
VACUUM_OPTION_PARALLEL_BULKDEL | VACUUM_OPTION_PARALLEL_COND_CLEANUP;
amroutine->amkeytype = InvalidOid;
diff --git a/src/backend/utils/cache/relcache.c b/src/backend/utils/cache/relcache.c
index 60e72f9e8bf..0e8fda97f86 100644
--- a/src/backend/utils/cache/relcache.c
+++ b/src/backend/utils/cache/relcache.c
@@ -2439,10 +2439,10 @@ RelationDestroyRelation(Relation relation, bool remember_tupdesc)
list_free_deep(relation->rd_fkeylist);
list_free(relation->rd_indexlist);
list_free(relation->rd_statlist);
+ bms_free(relation->rd_indexattr);
bms_free(relation->rd_keyattr);
bms_free(relation->rd_pkattr);
bms_free(relation->rd_idattr);
- bms_free(relation->rd_hotblockingattr);
if (relation->rd_pubdesc)
pfree(relation->rd_pubdesc);
if (relation->rd_options)
@@ -5104,10 +5104,10 @@ RelationGetIndexPredicate(Relation relation)
Bitmapset *
RelationGetIndexAttrBitmap(Relation relation, IndexAttrBitmapKind attrKind)
{
+ Bitmapset *indexattrs; /* indexed columns */
Bitmapset *uindexattrs; /* columns in unique indexes */
Bitmapset *pkindexattrs; /* columns in the primary index */
Bitmapset *idindexattrs; /* columns in the replica identity */
- Bitmapset *hotblockingattrs; /* columns with HOT blocking indexes */
List *indexoidlist;
List *newindexoidlist;
Oid relpkindex;
@@ -5116,18 +5116,18 @@ RelationGetIndexAttrBitmap(Relation relation, IndexAttrBitmapKind attrKind)
MemoryContext oldcxt;
/* Quick exit if we already computed the result. */
- if (relation->rd_attrsvalid)
+ if (relation->rd_indexattr != NULL)
{
switch (attrKind)
{
+ case INDEX_ATTR_BITMAP_ALL:
+ return bms_copy(relation->rd_indexattr);
case INDEX_ATTR_BITMAP_KEY:
return bms_copy(relation->rd_keyattr);
case INDEX_ATTR_BITMAP_PRIMARY_KEY:
return bms_copy(relation->rd_pkattr);
case INDEX_ATTR_BITMAP_IDENTITY_KEY:
return bms_copy(relation->rd_idattr);
- case INDEX_ATTR_BITMAP_HOT_BLOCKING:
- return bms_copy(relation->rd_hotblockingattr);
default:
elog(ERROR, "unknown attrKind %u", attrKind);
}
@@ -5158,7 +5158,7 @@ restart:
relreplindex = relation->rd_replidindex;
/*
- * For each index, add referenced attributes to appropriate bitmaps.
+ * For each index, add referenced attributes to indexattrs.
*
* Note: we consider all indexes returned by RelationGetIndexList, even if
* they are not indisready or indisvalid. This is important because an
@@ -5167,10 +5167,10 @@ restart:
* CONCURRENTLY is far enough along that we should ignore the index, it
* won't be returned at all by RelationGetIndexList.
*/
+ indexattrs = NULL;
uindexattrs = NULL;
pkindexattrs = NULL;
idindexattrs = NULL;
- hotblockingattrs = NULL;
foreach(l, indexoidlist)
{
Oid indexOid = lfirst_oid(l);
@@ -5235,9 +5235,8 @@ restart:
*/
if (attrnum != 0)
{
- if (indexDesc->rd_indam->amhotblocking)
- hotblockingattrs = bms_add_member(hotblockingattrs,
- attrnum - FirstLowInvalidHeapAttributeNumber);
+ indexattrs = bms_add_member(indexattrs,
+ attrnum - FirstLowInvalidHeapAttributeNumber);
if (isKey && i < indexDesc->rd_index->indnkeyatts)
uindexattrs = bms_add_member(uindexattrs,
@@ -5254,15 +5253,10 @@ restart:
}
/* Collect all attributes used in expressions, too */
- if (indexDesc->rd_indam->amhotblocking)
- pull_varattnos(indexExpressions, 1, &hotblockingattrs);
+ pull_varattnos(indexExpressions, 1, &indexattrs);
- /*
- * Collect all attributes in the index predicate, too. We have to
- * ignore amhotblocking flag, because the row might become indexable,
- * in which case we have to add it to the index.
- */
- pull_varattnos(indexPredicate, 1, &hotblockingattrs);
+ /* Collect all attributes in the index predicate, too */
+ pull_varattnos(indexPredicate, 1, &indexattrs);
index_close(indexDesc, AccessShareLock);
}
@@ -5290,46 +5284,46 @@ restart:
bms_free(uindexattrs);
bms_free(pkindexattrs);
bms_free(idindexattrs);
- bms_free(hotblockingattrs);
+ bms_free(indexattrs);
goto restart;
}
/* Don't leak the old values of these bitmaps, if any */
+ bms_free(relation->rd_indexattr);
+ relation->rd_indexattr = NULL;
bms_free(relation->rd_keyattr);
relation->rd_keyattr = NULL;
bms_free(relation->rd_pkattr);
relation->rd_pkattr = NULL;
bms_free(relation->rd_idattr);
relation->rd_idattr = NULL;
- bms_free(relation->rd_hotblockingattr);
- relation->rd_hotblockingattr = NULL;
/*
* Now save copies of the bitmaps in the relcache entry. We intentionally
- * set rd_attrsvalid last, because that's what signals validity of the
- * values; if we run out of memory before making that copy, we won't leave
- * the relcache entry looking like the other ones are valid but empty.
+ * set rd_indexattr last, because that's the one that signals validity of
+ * the values; if we run out of memory before making that copy, we won't
+ * leave the relcache entry looking like the other ones are valid but
+ * empty.
*/
oldcxt = MemoryContextSwitchTo(CacheMemoryContext);
relation->rd_keyattr = bms_copy(uindexattrs);
relation->rd_pkattr = bms_copy(pkindexattrs);
relation->rd_idattr = bms_copy(idindexattrs);
- relation->rd_hotblockingattr = bms_copy(hotblockingattrs);
- relation->rd_attrsvalid = true;
+ relation->rd_indexattr = bms_copy(indexattrs);
MemoryContextSwitchTo(oldcxt);
/* We return our original working copy for caller to play with */
switch (attrKind)
{
+ case INDEX_ATTR_BITMAP_ALL:
+ return indexattrs;
case INDEX_ATTR_BITMAP_KEY:
return uindexattrs;
case INDEX_ATTR_BITMAP_PRIMARY_KEY:
return pkindexattrs;
case INDEX_ATTR_BITMAP_IDENTITY_KEY:
return idindexattrs;
- case INDEX_ATTR_BITMAP_HOT_BLOCKING:
- return hotblockingattrs;
default:
elog(ERROR, "unknown attrKind %u", attrKind);
return NULL;
@@ -6250,11 +6244,10 @@ load_relcache_init_file(bool shared)
rel->rd_indexlist = NIL;
rel->rd_pkindex = InvalidOid;
rel->rd_replidindex = InvalidOid;
- rel->rd_attrsvalid = false;
+ rel->rd_indexattr = NULL;
rel->rd_keyattr = NULL;
rel->rd_pkattr = NULL;
rel->rd_idattr = NULL;
- rel->rd_hotblockingattr = NULL;
rel->rd_pubdesc = NULL;
rel->rd_statvalid = false;
rel->rd_statlist = NIL;
diff --git a/src/include/access/amapi.h b/src/include/access/amapi.h
index 0b89f399f08..1dc674d2305 100644
--- a/src/include/access/amapi.h
+++ b/src/include/access/amapi.h
@@ -244,8 +244,6 @@ typedef struct IndexAmRoutine
bool amcaninclude;
/* does AM use maintenance_work_mem? */
bool amusemaintenanceworkmem;
- /* does AM block HOT update? */
- bool amhotblocking;
/* OR of parallel vacuum flags. See vacuum.h for flags. */
uint8 amparallelvacuumoptions;
/* type of data stored in index, or InvalidOid if variable */
diff --git a/src/include/utils/rel.h b/src/include/utils/rel.h
index 90b3c49bc12..1896a9a06d1 100644
--- a/src/include/utils/rel.h
+++ b/src/include/utils/rel.h
@@ -155,11 +155,10 @@ typedef struct RelationData
List *rd_statlist; /* list of OIDs of extended stats */
/* data managed by RelationGetIndexAttrBitmap: */
- bool rd_attrsvalid; /* are bitmaps of attrs valid? */
+ Bitmapset *rd_indexattr; /* identifies columns used in indexes */
Bitmapset *rd_keyattr; /* cols that can be ref'd by foreign keys */
Bitmapset *rd_pkattr; /* cols included in primary key */
Bitmapset *rd_idattr; /* included in replica identity index */
- Bitmapset *rd_hotblockingattr; /* cols blocking HOT update */
PublicationDesc *rd_pubdesc; /* publication descriptor, or NULL */
diff --git a/src/include/utils/relcache.h b/src/include/utils/relcache.h
index 86dddbd975d..c93d8654bb9 100644
--- a/src/include/utils/relcache.h
+++ b/src/include/utils/relcache.h
@@ -55,10 +55,10 @@ extern bytea **RelationGetIndexAttOptions(Relation relation, bool copy);
typedef enum IndexAttrBitmapKind
{
+ INDEX_ATTR_BITMAP_ALL,
INDEX_ATTR_BITMAP_KEY,
INDEX_ATTR_BITMAP_PRIMARY_KEY,
- INDEX_ATTR_BITMAP_IDENTITY_KEY,
- INDEX_ATTR_BITMAP_HOT_BLOCKING
+ INDEX_ATTR_BITMAP_IDENTITY_KEY
} IndexAttrBitmapKind;
extern Bitmapset *RelationGetIndexAttrBitmap(Relation relation,
diff --git a/src/test/modules/dummy_index_am/dummy_index_am.c b/src/test/modules/dummy_index_am/dummy_index_am.c
index 22578b6246e..a0894ff9860 100644
--- a/src/test/modules/dummy_index_am/dummy_index_am.c
+++ b/src/test/modules/dummy_index_am/dummy_index_am.c
@@ -298,7 +298,6 @@ dihandler(PG_FUNCTION_ARGS)
amroutine->amcanparallel = false;
amroutine->amcaninclude = false;
amroutine->amusemaintenanceworkmem = false;
- amroutine->amhotblocking = true;
amroutine->amparallelvacuumoptions = VACUUM_OPTION_NO_PARALLEL;
amroutine->amkeytype = InvalidOid;
diff --git a/src/test/regress/expected/brin.out b/src/test/regress/expected/brin.out
index 96cbb5de4ef..ae4c424e79f 100644
--- a/src/test/regress/expected/brin.out
+++ b/src/test/regress/expected/brin.out
@@ -567,61 +567,3 @@ SELECT * FROM brintest_3 WHERE b < '0';
DROP TABLE brintest_3;
RESET enable_seqscan;
--- Test handling of index predicates - updating attributes in predicates
--- should block HOT even for BRIN. We update a row that was not indexed
--- due to the index predicate, and becomes indexable.
-CREATE TABLE brin_hot_2 (a int, b int);
-INSERT INTO brin_hot_2 VALUES (1, 100);
-CREATE INDEX ON brin_hot_2 USING brin (b) WHERE a = 2;
-UPDATE brin_hot_2 SET a = 2;
-EXPLAIN (COSTS OFF) SELECT * FROM brin_hot_2 WHERE a = 2 AND b = 100;
- QUERY PLAN
------------------------------------
- Seq Scan on brin_hot_2
- Filter: ((a = 2) AND (b = 100))
-(2 rows)
-
-SELECT COUNT(*) FROM brin_hot_2 WHERE a = 2 AND b = 100;
- count
--------
- 1
-(1 row)
-
-SET enable_seqscan = off;
-EXPLAIN (COSTS OFF) SELECT * FROM brin_hot_2 WHERE a = 2 AND b = 100;
- QUERY PLAN
----------------------------------------------
- Bitmap Heap Scan on brin_hot_2
- Recheck Cond: ((b = 100) AND (a = 2))
- -> Bitmap Index Scan on brin_hot_2_b_idx
- Index Cond: (b = 100)
-(4 rows)
-
-SELECT COUNT(*) FROM brin_hot_2 WHERE a = 2 AND b = 100;
- count
--------
- 1
-(1 row)
-
--- test BRIN index doesn't block HOT update
-CREATE TABLE brin_hot (
- id integer PRIMARY KEY,
- val integer NOT NULL
-) WITH (autovacuum_enabled = off, fillfactor = 70);
-INSERT INTO brin_hot SELECT *, 0 FROM generate_series(1, 235);
-CREATE INDEX val_brin ON brin_hot using brin(val);
-UPDATE brin_hot SET val = -3 WHERE id = 42;
--- ensure pending stats are flushed
-SELECT pg_stat_force_next_flush();
- pg_stat_force_next_flush
---------------------------
-
-(1 row)
-
-SELECT pg_stat_get_tuples_hot_updated('brin_hot'::regclass::oid);
- pg_stat_get_tuples_hot_updated
---------------------------------
- 1
-(1 row)
-
-DROP TABLE brin_hot;
diff --git a/src/test/regress/expected/stats.out b/src/test/regress/expected/stats.out
index 6b233ff4c05..5b0ebf090f4 100644
--- a/src/test/regress/expected/stats.out
+++ b/src/test/regress/expected/stats.out
@@ -554,246 +554,4 @@ SELECT pg_stat_get_live_tuples(:drop_stats_test_subxact_oid);
DROP TABLE trunc_stats_test, trunc_stats_test1, trunc_stats_test2, trunc_stats_test3, trunc_stats_test4;
DROP TABLE prevstats;
------
--- Test that various stats views are being properly populated
------
--- Test that sessions is incremented when a new session is started in pg_stat_database
-SELECT sessions AS db_stat_sessions FROM pg_stat_database WHERE datname = (SELECT current_database()) \gset
-\c
-SELECT sessions > :db_stat_sessions FROM pg_stat_database WHERE datname = (SELECT current_database());
- ?column?
-----------
- t
-(1 row)
-
--- Test pg_stat_bgwriter checkpointer-related stats, together with pg_stat_wal
-SELECT checkpoints_req AS rqst_ckpts_before FROM pg_stat_bgwriter \gset
--- Test pg_stat_wal
-SELECT wal_bytes AS wal_bytes_before FROM pg_stat_wal \gset
-CREATE TABLE test_stats_temp AS SELECT 17;
-DROP TABLE test_stats_temp;
--- Checkpoint twice: The checkpointer reports stats after reporting completion
--- of the checkpoint. But after a second checkpoint we'll see at least the
--- results of the first.
-CHECKPOINT;
-CHECKPOINT;
-SELECT checkpoints_req > :rqst_ckpts_before FROM pg_stat_bgwriter;
- ?column?
-----------
- t
-(1 row)
-
-SELECT wal_bytes > :wal_bytes_before FROM pg_stat_wal;
- ?column?
-----------
- t
-(1 row)
-
------
--- Test that resetting stats works for reset timestamp
------
--- Test that reset_slru with a specified SLRU works.
-SELECT stats_reset AS slru_commit_ts_reset_ts FROM pg_stat_slru WHERE name = 'CommitTs' \gset
-SELECT stats_reset AS slru_notify_reset_ts FROM pg_stat_slru WHERE name = 'Notify' \gset
-SELECT pg_stat_reset_slru('CommitTs');
- pg_stat_reset_slru
---------------------
-
-(1 row)
-
-SELECT stats_reset > :'slru_commit_ts_reset_ts'::timestamptz FROM pg_stat_slru WHERE name = 'CommitTs';
- ?column?
-----------
- t
-(1 row)
-
-SELECT stats_reset AS slru_commit_ts_reset_ts FROM pg_stat_slru WHERE name = 'CommitTs' \gset
--- Test that multiple SLRUs are reset when no specific SLRU provided to reset function
-SELECT pg_stat_reset_slru(NULL);
- pg_stat_reset_slru
---------------------
-
-(1 row)
-
-SELECT stats_reset > :'slru_commit_ts_reset_ts'::timestamptz FROM pg_stat_slru WHERE name = 'CommitTs';
- ?column?
-----------
- t
-(1 row)
-
-SELECT stats_reset > :'slru_notify_reset_ts'::timestamptz FROM pg_stat_slru WHERE name = 'Notify';
- ?column?
-----------
- t
-(1 row)
-
--- Test that reset_shared with archiver specified as the stats type works
-SELECT stats_reset AS archiver_reset_ts FROM pg_stat_archiver \gset
-SELECT pg_stat_reset_shared('archiver');
- pg_stat_reset_shared
-----------------------
-
-(1 row)
-
-SELECT stats_reset > :'archiver_reset_ts'::timestamptz FROM pg_stat_archiver;
- ?column?
-----------
- t
-(1 row)
-
-SELECT stats_reset AS archiver_reset_ts FROM pg_stat_archiver \gset
--- Test that reset_shared with bgwriter specified as the stats type works
-SELECT stats_reset AS bgwriter_reset_ts FROM pg_stat_bgwriter \gset
-SELECT pg_stat_reset_shared('bgwriter');
- pg_stat_reset_shared
-----------------------
-
-(1 row)
-
-SELECT stats_reset > :'bgwriter_reset_ts'::timestamptz FROM pg_stat_bgwriter;
- ?column?
-----------
- t
-(1 row)
-
-SELECT stats_reset AS bgwriter_reset_ts FROM pg_stat_bgwriter \gset
--- Test that reset_shared with wal specified as the stats type works
-SELECT stats_reset AS wal_reset_ts FROM pg_stat_wal \gset
-SELECT pg_stat_reset_shared('wal');
- pg_stat_reset_shared
-----------------------
-
-(1 row)
-
-SELECT stats_reset > :'wal_reset_ts'::timestamptz FROM pg_stat_wal;
- ?column?
-----------
- t
-(1 row)
-
-SELECT stats_reset AS wal_reset_ts FROM pg_stat_wal \gset
--- Test that reset_shared with no specified stats type doesn't reset anything
-SELECT pg_stat_reset_shared(NULL);
- pg_stat_reset_shared
-----------------------
-
-(1 row)
-
-SELECT stats_reset = :'archiver_reset_ts'::timestamptz FROM pg_stat_archiver;
- ?column?
-----------
- t
-(1 row)
-
-SELECT stats_reset = :'bgwriter_reset_ts'::timestamptz FROM pg_stat_bgwriter;
- ?column?
-----------
- t
-(1 row)
-
-SELECT stats_reset = :'wal_reset_ts'::timestamptz FROM pg_stat_wal;
- ?column?
-----------
- t
-(1 row)
-
--- Test that reset works for pg_stat_database
--- Since pg_stat_database stats_reset starts out as NULL, reset it once first so we have something to compare it to
-SELECT pg_stat_reset();
- pg_stat_reset
----------------
-
-(1 row)
-
-SELECT stats_reset AS db_reset_ts FROM pg_stat_database WHERE datname = (SELECT current_database()) \gset
-SELECT pg_stat_reset();
- pg_stat_reset
----------------
-
-(1 row)
-
-SELECT stats_reset > :'db_reset_ts'::timestamptz FROM pg_stat_database WHERE datname = (SELECT current_database());
- ?column?
-----------
- t
-(1 row)
-
-----
--- pg_stat_get_snapshot_timestamp behavior
-----
-BEGIN;
-SET LOCAL stats_fetch_consistency = snapshot;
--- no snapshot yet, return NULL
-SELECT pg_stat_get_snapshot_timestamp();
- pg_stat_get_snapshot_timestamp
---------------------------------
-
-(1 row)
-
--- any attempt at accessing stats will build snapshot
-SELECT pg_stat_get_function_calls(0);
- pg_stat_get_function_calls
-----------------------------
-
-(1 row)
-
-SELECT pg_stat_get_snapshot_timestamp() >= NOW();
- ?column?
-----------
- t
-(1 row)
-
--- shows NULL again after clearing
-SELECT pg_stat_clear_snapshot();
- pg_stat_clear_snapshot
-------------------------
-
-(1 row)
-
-SELECT pg_stat_get_snapshot_timestamp();
- pg_stat_get_snapshot_timestamp
---------------------------------
-
-(1 row)
-
-COMMIT;
-----
--- pg_stat_have_stats behavior
-----
--- fixed-numbered stats exist
-SELECT pg_stat_have_stats('bgwriter', 0, 0);
- pg_stat_have_stats
---------------------
- t
-(1 row)
-
--- unknown stats kinds error out
-SELECT pg_stat_have_stats('zaphod', 0, 0);
-ERROR: invalid statistics kind: "zaphod"
--- db stats have objoid 0
-SELECT pg_stat_have_stats('database', (SELECT oid FROM pg_database WHERE datname = current_database()), 1);
- pg_stat_have_stats
---------------------
- f
-(1 row)
-
-SELECT pg_stat_have_stats('database', (SELECT oid FROM pg_database WHERE datname = current_database()), 0);
- pg_stat_have_stats
---------------------
- t
-(1 row)
-
--- ensure that stats accessors handle NULL input correctly
-SELECT pg_stat_get_replication_slot(NULL);
- pg_stat_get_replication_slot
-------------------------------
-
-(1 row)
-
-SELECT pg_stat_get_subscription_stats(NULL);
- pg_stat_get_subscription_stats
---------------------------------
-
-(1 row)
-
-- End of Stats Test
diff --git a/src/test/regress/sql/brin.sql b/src/test/regress/sql/brin.sql
index eec73b7fbe2..33a30fcf777 100644
--- a/src/test/regress/sql/brin.sql
+++ b/src/test/regress/sql/brin.sql
@@ -509,39 +509,3 @@ SELECT * FROM brintest_3 WHERE b < '0';
DROP TABLE brintest_3;
RESET enable_seqscan;
-
--- Test handling of index predicates - updating attributes in predicates
--- should block HOT even for BRIN. We update a row that was not indexed
--- due to the index predicate, and becomes indexable.
-CREATE TABLE brin_hot_2 (a int, b int);
-INSERT INTO brin_hot_2 VALUES (1, 100);
-CREATE INDEX ON brin_hot_2 USING brin (b) WHERE a = 2;
-
-UPDATE brin_hot_2 SET a = 2;
-
-EXPLAIN (COSTS OFF) SELECT * FROM brin_hot_2 WHERE a = 2 AND b = 100;
-SELECT COUNT(*) FROM brin_hot_2 WHERE a = 2 AND b = 100;
-
-SET enable_seqscan = off;
-
-EXPLAIN (COSTS OFF) SELECT * FROM brin_hot_2 WHERE a = 2 AND b = 100;
-SELECT COUNT(*) FROM brin_hot_2 WHERE a = 2 AND b = 100;
-
-
--- test BRIN index doesn't block HOT update
-CREATE TABLE brin_hot (
- id integer PRIMARY KEY,
- val integer NOT NULL
-) WITH (autovacuum_enabled = off, fillfactor = 70);
-
-INSERT INTO brin_hot SELECT *, 0 FROM generate_series(1, 235);
-CREATE INDEX val_brin ON brin_hot using brin(val);
-
-UPDATE brin_hot SET val = -3 WHERE id = 42;
-
--- ensure pending stats are flushed
-SELECT pg_stat_force_next_flush();
-
-SELECT pg_stat_get_tuples_hot_updated('brin_hot'::regclass::oid);
-
-DROP TABLE brin_hot;
diff --git a/src/test/regress/sql/stats.sql b/src/test/regress/sql/stats.sql
index 096f00ce8be..3f3cf8fb56b 100644
--- a/src/test/regress/sql/stats.sql
+++ b/src/test/regress/sql/stats.sql
@@ -285,115 +285,4 @@ SELECT pg_stat_get_live_tuples(:drop_stats_test_subxact_oid);
DROP TABLE trunc_stats_test, trunc_stats_test1, trunc_stats_test2, trunc_stats_test3, trunc_stats_test4;
DROP TABLE prevstats;
-
-
------
--- Test that various stats views are being properly populated
------
-
--- Test that sessions is incremented when a new session is started in pg_stat_database
-SELECT sessions AS db_stat_sessions FROM pg_stat_database WHERE datname = (SELECT current_database()) \gset
-\c
-SELECT sessions > :db_stat_sessions FROM pg_stat_database WHERE datname = (SELECT current_database());
-
--- Test pg_stat_bgwriter checkpointer-related stats, together with pg_stat_wal
-SELECT checkpoints_req AS rqst_ckpts_before FROM pg_stat_bgwriter \gset
-
--- Test pg_stat_wal
-SELECT wal_bytes AS wal_bytes_before FROM pg_stat_wal \gset
-
-CREATE TABLE test_stats_temp AS SELECT 17;
-DROP TABLE test_stats_temp;
-
--- Checkpoint twice: The checkpointer reports stats after reporting completion
--- of the checkpoint. But after a second checkpoint we'll see at least the
--- results of the first.
-CHECKPOINT;
-CHECKPOINT;
-
-SELECT checkpoints_req > :rqst_ckpts_before FROM pg_stat_bgwriter;
-SELECT wal_bytes > :wal_bytes_before FROM pg_stat_wal;
-
-
------
--- Test that resetting stats works for reset timestamp
------
-
--- Test that reset_slru with a specified SLRU works.
-SELECT stats_reset AS slru_commit_ts_reset_ts FROM pg_stat_slru WHERE name = 'CommitTs' \gset
-SELECT stats_reset AS slru_notify_reset_ts FROM pg_stat_slru WHERE name = 'Notify' \gset
-SELECT pg_stat_reset_slru('CommitTs');
-SELECT stats_reset > :'slru_commit_ts_reset_ts'::timestamptz FROM pg_stat_slru WHERE name = 'CommitTs';
-SELECT stats_reset AS slru_commit_ts_reset_ts FROM pg_stat_slru WHERE name = 'CommitTs' \gset
-
--- Test that multiple SLRUs are reset when no specific SLRU provided to reset function
-SELECT pg_stat_reset_slru(NULL);
-SELECT stats_reset > :'slru_commit_ts_reset_ts'::timestamptz FROM pg_stat_slru WHERE name = 'CommitTs';
-SELECT stats_reset > :'slru_notify_reset_ts'::timestamptz FROM pg_stat_slru WHERE name = 'Notify';
-
--- Test that reset_shared with archiver specified as the stats type works
-SELECT stats_reset AS archiver_reset_ts FROM pg_stat_archiver \gset
-SELECT pg_stat_reset_shared('archiver');
-SELECT stats_reset > :'archiver_reset_ts'::timestamptz FROM pg_stat_archiver;
-SELECT stats_reset AS archiver_reset_ts FROM pg_stat_archiver \gset
-
--- Test that reset_shared with bgwriter specified as the stats type works
-SELECT stats_reset AS bgwriter_reset_ts FROM pg_stat_bgwriter \gset
-SELECT pg_stat_reset_shared('bgwriter');
-SELECT stats_reset > :'bgwriter_reset_ts'::timestamptz FROM pg_stat_bgwriter;
-SELECT stats_reset AS bgwriter_reset_ts FROM pg_stat_bgwriter \gset
-
--- Test that reset_shared with wal specified as the stats type works
-SELECT stats_reset AS wal_reset_ts FROM pg_stat_wal \gset
-SELECT pg_stat_reset_shared('wal');
-SELECT stats_reset > :'wal_reset_ts'::timestamptz FROM pg_stat_wal;
-SELECT stats_reset AS wal_reset_ts FROM pg_stat_wal \gset
-
--- Test that reset_shared with no specified stats type doesn't reset anything
-SELECT pg_stat_reset_shared(NULL);
-SELECT stats_reset = :'archiver_reset_ts'::timestamptz FROM pg_stat_archiver;
-SELECT stats_reset = :'bgwriter_reset_ts'::timestamptz FROM pg_stat_bgwriter;
-SELECT stats_reset = :'wal_reset_ts'::timestamptz FROM pg_stat_wal;
-
--- Test that reset works for pg_stat_database
-
--- Since pg_stat_database stats_reset starts out as NULL, reset it once first so we have something to compare it to
-SELECT pg_stat_reset();
-SELECT stats_reset AS db_reset_ts FROM pg_stat_database WHERE datname = (SELECT current_database()) \gset
-SELECT pg_stat_reset();
-SELECT stats_reset > :'db_reset_ts'::timestamptz FROM pg_stat_database WHERE datname = (SELECT current_database());
-
-
-----
--- pg_stat_get_snapshot_timestamp behavior
-----
-BEGIN;
-SET LOCAL stats_fetch_consistency = snapshot;
--- no snapshot yet, return NULL
-SELECT pg_stat_get_snapshot_timestamp();
--- any attempt at accessing stats will build snapshot
-SELECT pg_stat_get_function_calls(0);
-SELECT pg_stat_get_snapshot_timestamp() >= NOW();
--- shows NULL again after clearing
-SELECT pg_stat_clear_snapshot();
-SELECT pg_stat_get_snapshot_timestamp();
-COMMIT;
-
-----
--- pg_stat_have_stats behavior
-----
--- fixed-numbered stats exist
-SELECT pg_stat_have_stats('bgwriter', 0, 0);
--- unknown stats kinds error out
-SELECT pg_stat_have_stats('zaphod', 0, 0);
--- db stats have objoid 0
-SELECT pg_stat_have_stats('database', (SELECT oid FROM pg_database WHERE datname = current_database()), 1);
-SELECT pg_stat_have_stats('database', (SELECT oid FROM pg_database WHERE datname = current_database()), 0);
-
-
--- ensure that stats accessors handle NULL input correctly
-SELECT pg_stat_get_replication_slot(NULL);
-SELECT pg_stat_get_subscription_stats(NULL);
-
-
-- End of Stats Test
--
2.34.3
On Mon, Jun 06, 2022 at 09:08:08AM +0200, Tomas Vondra wrote:
Attached is a patch reverting both commits (5753d4ee32 and fe60b67250).
This changes the IndexAmRoutine struct, so it's an ABI break. That's not
great post-beta :-( In principle we might also leave amhotblocking in
the struct but ignore it in the code (and treat it as false), but that
seems weird and it's going to be a pain when backpatching. Opinions?
I don't think that you need to worry about ABI breakages now in beta,
because that's the period of time where we can still change things and
shape the code in its best way for prime time. It depends on the
change, of course, but what you are doing, by removing the field,
looks right to me here.
--
Michael
On 6/6/22 09:28, Michael Paquier wrote:
On Mon, Jun 06, 2022 at 09:08:08AM +0200, Tomas Vondra wrote:
Attached is a patch reverting both commits (5753d4ee32 and fe60b67250).
This changes the IndexAmRoutine struct, so it's an ABI break. That's not
great post-beta :-( In principle we might also leave amhotblocking in
the struct but ignore it in the code (and treat it as false), but that
seems weird and it's going to be a pain when backpatching. Opinions?
I don't think that you need to worry about ABI breakages now in beta,
because that's the period of time where we can still change things and
shape the code in its best way for prime time. It depends on the
change, of course, but what you are doing, by removing the field,
looks right to me here.
I've pushed the revert. Let's try again for PG16.
regards
--
Tomas Vondra
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company
On Thu, 16 Jun 2022 at 15:05, Tomas Vondra
<tomas.vondra@enterprisedb.com> wrote:
I've pushed the revert. Let's try again for PG16.
As we discussed in person at the developer meeting, here's a patch to
try again for PG16.
It combines the committed patches with my fix, and adds some
additional comments and polish. I am confident the code is correct,
but not that it is clean (see the commit message of the patch for
details).
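For anyone skimming the thread, here is a minimal SQL sketch of the behavior
the patch aims for (table and index names are only illustrative, not part of
the patch; the shape follows the brin_hot regression test). With the patch
applied, the UPDATE below should count as a HOT update while the BRIN index
still finds the new value:

CREATE TABLE brin_hot_demo (id int PRIMARY KEY, val int NOT NULL)
  WITH (autovacuum_enabled = off, fillfactor = 70);
INSERT INTO brin_hot_demo SELECT i, 0 FROM generate_series(1, 235) s(i);
CREATE INDEX ON brin_hot_demo USING brin (val);

UPDATE brin_hot_demo SET val = -3 WHERE id = 42;

-- expect 1: the update used HOT despite the BRIN index on val
SELECT pg_stat_force_next_flush();
SELECT pg_stat_get_tuples_hot_updated('brin_hot_demo'::regclass::oid);

-- expect 1: the summarizing index was still updated, so it sees the new value
SET enable_seqscan = off;
SELECT COUNT(*) FROM brin_hot_demo WHERE val = -3;
RESET enable_seqscan;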
Kind regards,
Matthias van de Meent
PS. I'm adding this to the commitfest
Original patch thread:
/messages/by-id/CAFp7QwpMRGcDAQumN7onN9HjrJ3u4X3ZRXdGFT0K5G2JWvnbWg@mail.gmail.com
Other relevant:
/messages/by-id/CA+TgmoZOgdoAFH9HatRwuydOZkMdyPi=97rNhsu=hQBBYs+gXQ@mail.gmail.com
Attachments:
v1-0001-Ignore-BRIN-indexes-when-checking-for-HOT-updates.patch
From 402a07d45b9aae70f8a01edcce059eaa13783360 Mon Sep 17 00:00:00 2001
From: Tomas Vondra <tomas.vondra@postgresql.org>
Date: Tue, 30 Nov 2021 19:15:14 +0100
Subject: [PATCH v1] Ignore BRIN indexes when checking for HOT updates
When determining whether an index update may be skipped by using HOT, we
can ignore attributes indexed by block summarizing indexes without
references to individual tuples that need to be cleaned up.
This also removes rd_indexattr list, and replaces it with rd_attrsvalid
flag. The list was not used anywhere, and a simple flag is sufficient.
A new type TU_UpdateIndexes is invented to provide a signal to the executor
to determine which indexes to update - no indexes, all indexes, or only
the summarizing indexes.
One otherwise unused bit in the heap tuple header is (ab)used to signal
that the HOT update would still update at least one summarizing index.
The bit is cleared immediately
Original patch by Josef Simanek, various fixes and improvements by
Tomas Vondra and me.
Authors: Josef Simanek, Tomas Vondra, Matthias van de Meent
Reviewed-by: Tomas Vondra, Alvaro Herrera
---
doc/src/sgml/indexam.sgml | 13 +++
src/backend/access/brin/brin.c | 1 +
src/backend/access/gin/ginutil.c | 1 +
src/backend/access/gist/gist.c | 1 +
src/backend/access/hash/hash.c | 1 +
src/backend/access/heap/heapam.c | 15 ++-
src/backend/access/heap/heapam_handler.c | 21 +++-
src/backend/access/nbtree/nbtree.c | 1 +
src/backend/access/spgist/spgutils.c | 1 +
src/backend/access/table/tableam.c | 2 +-
src/backend/catalog/index.c | 9 +-
src/backend/catalog/indexing.c | 29 ++++-
src/backend/commands/copyfrom.c | 5 +-
src/backend/commands/indexcmds.c | 10 +-
src/backend/executor/execIndexing.c | 37 ++++--
src/backend/executor/execReplication.c | 9 +-
src/backend/executor/nodeModifyTable.c | 13 ++-
src/backend/nodes/makefuncs.c | 4 +-
src/backend/utils/cache/relcache.c | 62 +++++++---
src/include/access/amapi.h | 2 +
src/include/access/htup_details.h | 29 +++++
src/include/access/tableam.h | 19 ++-
src/include/executor/executor.h | 3 +-
src/include/nodes/execnodes.h | 2 +
src/include/nodes/makefuncs.h | 4 +-
src/include/utils/rel.h | 4 +-
src/include/utils/relcache.h | 5 +-
.../modules/dummy_index_am/dummy_index_am.c | 1 +
src/test/regress/expected/stats.out | 110 ++++++++++++++++++
src/test/regress/sql/stats.sql | 82 ++++++++++++-
30 files changed, 431 insertions(+), 65 deletions(-)
diff --git a/doc/src/sgml/indexam.sgml b/doc/src/sgml/indexam.sgml
index 4f83970c85..897419ec95 100644
--- a/doc/src/sgml/indexam.sgml
+++ b/doc/src/sgml/indexam.sgml
@@ -127,6 +127,9 @@ typedef struct IndexAmRoutine
bool amcaninclude;
/* does AM use maintenance_work_mem? */
bool amusemaintenanceworkmem;
+ /* does AM summarize tuples, with at least all tuples in the block
+ * summarized in one summary */
+ bool amsummarizing;
/* OR of parallel vacuum flags */
uint8 amparallelvacuumoptions;
/* type of data stored in index, or InvalidOid if variable */
@@ -247,6 +250,16 @@ typedef struct IndexAmRoutine
null, independently of <structfield>amoptionalkey</structfield>.
</para>
+ <para>
+ The <structfield>amsummarizing</structfield> flag indicates whether the
+ access method summarizes the indexed tuples, with summarizing granularity
+ of at least per block.
+ Access methods that do not point to individual tuples, but to block ranges
+ (like <acronym>BRIN</acronym>), may allow the <acronym>HOT</acronym> optimization
+ to continue. This does not apply to attributes referenced in index
+ predicates; an update of such an attribute always disables <acronym>HOT</acronym>.
+ </para>
+
</sect1>
<sect1 id="index-functions">
diff --git a/src/backend/access/brin/brin.c b/src/backend/access/brin/brin.c
index de1427a1e0..c4bdd0e7b0 100644
--- a/src/backend/access/brin/brin.c
+++ b/src/backend/access/brin/brin.c
@@ -109,6 +109,7 @@ brinhandler(PG_FUNCTION_ARGS)
amroutine->amcanparallel = false;
amroutine->amcaninclude = false;
amroutine->amusemaintenanceworkmem = false;
+ amroutine->amsummarizing = true;
amroutine->amparallelvacuumoptions =
VACUUM_OPTION_PARALLEL_CLEANUP;
amroutine->amkeytype = InvalidOid;
diff --git a/src/backend/access/gin/ginutil.c b/src/backend/access/gin/ginutil.c
index f05128ecf5..03fec1704e 100644
--- a/src/backend/access/gin/ginutil.c
+++ b/src/backend/access/gin/ginutil.c
@@ -56,6 +56,7 @@ ginhandler(PG_FUNCTION_ARGS)
amroutine->amcanparallel = false;
amroutine->amcaninclude = false;
amroutine->amusemaintenanceworkmem = true;
+ amroutine->amsummarizing = false;
amroutine->amparallelvacuumoptions =
VACUUM_OPTION_PARALLEL_BULKDEL | VACUUM_OPTION_PARALLEL_CLEANUP;
amroutine->amkeytype = InvalidOid;
diff --git a/src/backend/access/gist/gist.c b/src/backend/access/gist/gist.c
index ba394f08f6..ea72bcce1b 100644
--- a/src/backend/access/gist/gist.c
+++ b/src/backend/access/gist/gist.c
@@ -78,6 +78,7 @@ gisthandler(PG_FUNCTION_ARGS)
amroutine->amcanparallel = false;
amroutine->amcaninclude = true;
amroutine->amusemaintenanceworkmem = false;
+ amroutine->amsummarizing = false;
amroutine->amparallelvacuumoptions =
VACUUM_OPTION_PARALLEL_BULKDEL | VACUUM_OPTION_PARALLEL_COND_CLEANUP;
amroutine->amkeytype = InvalidOid;
diff --git a/src/backend/access/hash/hash.c b/src/backend/access/hash/hash.c
index eb258337d6..fc5d97f606 100644
--- a/src/backend/access/hash/hash.c
+++ b/src/backend/access/hash/hash.c
@@ -75,6 +75,7 @@ hashhandler(PG_FUNCTION_ARGS)
amroutine->amcanparallel = false;
amroutine->amcaninclude = false;
amroutine->amusemaintenanceworkmem = false;
+ amroutine->amsummarizing = false;
amroutine->amparallelvacuumoptions =
VACUUM_OPTION_PARALLEL_BULKDEL;
amroutine->amkeytype = INT4OID;
diff --git a/src/backend/access/heap/heapam.c b/src/backend/access/heap/heapam.c
index 7eb79cee58..99668c3eae 100644
--- a/src/backend/access/heap/heapam.c
+++ b/src/backend/access/heap/heapam.c
@@ -2929,6 +2929,7 @@ heap_update(Relation relation, ItemPointer otid, HeapTuple newtup,
TM_Result result;
TransactionId xid = GetCurrentTransactionId();
Bitmapset *hot_attrs;
+ Bitmapset *sum_attrs;
Bitmapset *key_attrs;
Bitmapset *id_attrs;
Bitmapset *interesting_attrs;
@@ -2951,6 +2952,7 @@ heap_update(Relation relation, ItemPointer otid, HeapTuple newtup,
bool have_tuple_lock = false;
bool iscombo;
bool use_hot_update = false;
+ bool summarized_update = false;
bool key_intact;
bool all_visible_cleared = false;
bool all_visible_cleared_new = false;
@@ -2996,12 +2998,14 @@ heap_update(Relation relation, ItemPointer otid, HeapTuple newtup,
* Note that we get copies of each bitmap, so we need not worry about
* relcache flush happening midway through.
*/
- hot_attrs = RelationGetIndexAttrBitmap(relation, INDEX_ATTR_BITMAP_ALL);
+ hot_attrs = RelationGetIndexAttrBitmap(relation, INDEX_ATTR_BITMAP_HOT_BLOCKING);
+ sum_attrs = RelationGetIndexAttrBitmap(relation, INDEX_ATTR_BITMAP_SUMMARIZED);
key_attrs = RelationGetIndexAttrBitmap(relation, INDEX_ATTR_BITMAP_KEY);
id_attrs = RelationGetIndexAttrBitmap(relation,
INDEX_ATTR_BITMAP_IDENTITY_KEY);
interesting_attrs = NULL;
interesting_attrs = bms_add_members(interesting_attrs, hot_attrs);
+ interesting_attrs = bms_add_members(interesting_attrs, sum_attrs);
interesting_attrs = bms_add_members(interesting_attrs, key_attrs);
interesting_attrs = bms_add_members(interesting_attrs, id_attrs);
@@ -3312,6 +3316,7 @@ l2:
if (vmbuffer != InvalidBuffer)
ReleaseBuffer(vmbuffer);
bms_free(hot_attrs);
+ bms_free(sum_attrs);
bms_free(key_attrs);
bms_free(id_attrs);
bms_free(modified_attrs);
@@ -3633,7 +3638,11 @@ l2:
* changed.
*/
if (!bms_overlap(modified_attrs, hot_attrs))
+ {
use_hot_update = true;
+ if (bms_overlap(modified_attrs, sum_attrs))
+ summarized_update = true;
+ }
}
else
{
@@ -3793,10 +3802,14 @@ l2:
heap_freetuple(heaptup);
}
+ if (summarized_update)
+ HeapTupleHeaderSetSummaryUpdate(newtup->t_data);
+
if (old_key_tuple != NULL && old_key_copied)
heap_freetuple(old_key_tuple);
bms_free(hot_attrs);
+ bms_free(sum_attrs);
bms_free(key_attrs);
bms_free(id_attrs);
bms_free(modified_attrs);
diff --git a/src/backend/access/heap/heapam_handler.c b/src/backend/access/heap/heapam_handler.c
index c4b1916d36..2072eb351d 100644
--- a/src/backend/access/heap/heapam_handler.c
+++ b/src/backend/access/heap/heapam_handler.c
@@ -314,7 +314,7 @@ static TM_Result
heapam_tuple_update(Relation relation, ItemPointer otid, TupleTableSlot *slot,
CommandId cid, Snapshot snapshot, Snapshot crosscheck,
bool wait, TM_FailureData *tmfd,
- LockTupleMode *lockmode, bool *update_indexes)
+ LockTupleMode *lockmode, TU_UpdateIndexes *update_indexes)
{
bool shouldFree = true;
HeapTuple tuple = ExecFetchSlotHeapTuple(slot, true, &shouldFree);
@@ -334,9 +334,24 @@ heapam_tuple_update(Relation relation, ItemPointer otid, TupleTableSlot *slot,
* Note: heap_update returns the tid (location) of the new tuple in the
* t_self field.
*
- * If it's a HOT update, we mustn't insert new index entries.
+ * If it is a HOT update, we must not insert new index entries into all
+ * indexes. However, if it updates columns covered by summarizing indexes, we
+ * must still update those indexes, lest we fail to update their summaries
+ * and get incorrect results (for example, minmax bounds of the block may change).
*/
- *update_indexes = result == TM_Ok && !HeapTupleIsHeapOnly(tuple);
+ if (result != TM_Ok)
+ *update_indexes = TUUI_None;
+ else if (!HeapTupleIsHeapOnly(tuple))
+ *update_indexes = TUUI_All;
+ else if (HeapTupleHeaderIsHOTWithSummaryUpdate(tuple->t_data))
+ {
+ *update_indexes = TUUI_Summarizing;
+
+ /* Clear temporary bits */
+ HeapTupleHeaderClearSummaryUpdate(tuple->t_data);
+ }
+ else
+ *update_indexes = TUUI_None;
if (shouldFree)
pfree(tuple);
diff --git a/src/backend/access/nbtree/nbtree.c b/src/backend/access/nbtree/nbtree.c
index 1cc88da032..681c30b0d8 100644
--- a/src/backend/access/nbtree/nbtree.c
+++ b/src/backend/access/nbtree/nbtree.c
@@ -114,6 +114,7 @@ bthandler(PG_FUNCTION_ARGS)
amroutine->amcanparallel = true;
amroutine->amcaninclude = true;
amroutine->amusemaintenanceworkmem = false;
+ amroutine->amsummarizing = false;
amroutine->amparallelvacuumoptions =
VACUUM_OPTION_PARALLEL_BULKDEL | VACUUM_OPTION_PARALLEL_COND_CLEANUP;
amroutine->amkeytype = InvalidOid;
diff --git a/src/backend/access/spgist/spgutils.c b/src/backend/access/spgist/spgutils.c
index 3761f2c193..4e7ff1d160 100644
--- a/src/backend/access/spgist/spgutils.c
+++ b/src/backend/access/spgist/spgutils.c
@@ -62,6 +62,7 @@ spghandler(PG_FUNCTION_ARGS)
amroutine->amcanparallel = false;
amroutine->amcaninclude = true;
amroutine->amusemaintenanceworkmem = false;
+ amroutine->amsummarizing = false;
amroutine->amparallelvacuumoptions =
VACUUM_OPTION_PARALLEL_BULKDEL | VACUUM_OPTION_PARALLEL_COND_CLEANUP;
amroutine->amkeytype = InvalidOid;
diff --git a/src/backend/access/table/tableam.c b/src/backend/access/table/tableam.c
index ef0d34fcee..a5e6c92f35 100644
--- a/src/backend/access/table/tableam.c
+++ b/src/backend/access/table/tableam.c
@@ -345,7 +345,7 @@ void
simple_table_tuple_update(Relation rel, ItemPointer otid,
TupleTableSlot *slot,
Snapshot snapshot,
- bool *update_indexes)
+ TU_UpdateIndexes *update_indexes)
{
TM_Result result;
TM_FailureData tmfd;
diff --git a/src/backend/catalog/index.c b/src/backend/catalog/index.c
index 41b16cb89b..e2fd035f44 100644
--- a/src/backend/catalog/index.c
+++ b/src/backend/catalog/index.c
@@ -1370,7 +1370,8 @@ index_concurrently_create_copy(Relation heapRelation, Oid oldIndexId,
oldInfo->ii_Unique,
oldInfo->ii_NullsNotDistinct,
false, /* not ready for inserts */
- true);
+ true,
+ indexRelation->rd_indam->amsummarizing);
/*
* Extract the list of column names and the column numbers for the new
@@ -2442,7 +2443,8 @@ BuildIndexInfo(Relation index)
indexStruct->indisunique,
indexStruct->indnullsnotdistinct,
indexStruct->indisready,
- false);
+ false,
+ index->rd_indam->amsummarizing);
/* fill in attribute numbers */
for (i = 0; i < numAtts; i++)
@@ -2502,7 +2504,8 @@ BuildDummyIndexInfo(Relation index)
indexStruct->indisunique,
indexStruct->indnullsnotdistinct,
indexStruct->indisready,
- false);
+ false,
+ index->rd_indam->amsummarizing);
/* fill in attribute numbers */
for (i = 0; i < numAtts; i++)
diff --git a/src/backend/catalog/indexing.c b/src/backend/catalog/indexing.c
index bb7cc3601c..262a246e44 100644
--- a/src/backend/catalog/indexing.c
+++ b/src/backend/catalog/indexing.c
@@ -82,15 +82,27 @@ CatalogIndexInsert(CatalogIndexState indstate, HeapTuple heapTuple)
IndexInfo **indexInfoArray;
Datum values[INDEX_MAX_KEYS];
bool isnull[INDEX_MAX_KEYS];
+ bool onlySummarized = false;
/*
- * HOT update does not require index inserts. But with asserts enabled we
- * want to check that it'd be legal to currently insert into the
- * table/index.
+ * A HOT update may be a 'summary update', for which we still need to
+ * update the summarizing indexes.
*/
+ if (HeapTupleHeaderIsHOTWithSummaryUpdate(heapTuple->t_data))
+ {
+ HeapTupleHeaderClearSummaryUpdate(heapTuple->t_data);
+ onlySummarized = true;
+ }
#ifndef USE_ASSERT_CHECKING
- if (HeapTupleIsHeapOnly(heapTuple))
+ else if (HeapTupleIsHeapOnly(heapTuple))
+ {
+ /*
+ * Normal HOT update does not require index inserts. But with
+ * asserts enabled we want to check that it'd be legal to
+ * currently insert into the table/index.
+ */
return;
+ }
#endif
/*
@@ -135,13 +147,20 @@ CatalogIndexInsert(CatalogIndexState indstate, HeapTuple heapTuple)
/* see earlier check above */
#ifdef USE_ASSERT_CHECKING
- if (HeapTupleIsHeapOnly(heapTuple))
+ if (HeapTupleIsHeapOnly(heapTuple) && !onlySummarized)
{
Assert(!ReindexIsProcessingIndex(RelationGetRelid(index)));
continue;
}
#endif /* USE_ASSERT_CHECKING */
+ /*
+ * Skip insertions into non-summarizing indexes if we only need
+ * to update summarizing indexes
+ */
+ if (onlySummarized && !indexInfo->ii_Summarizing)
+ continue;
+
/*
* FormIndexDatum fills in its values and isnull parameters with the
* appropriate values for the column(s) of the index.
diff --git a/src/backend/commands/copyfrom.c b/src/backend/commands/copyfrom.c
index af52faca6d..564520289a 100644
--- a/src/backend/commands/copyfrom.c
+++ b/src/backend/commands/copyfrom.c
@@ -435,7 +435,7 @@ CopyMultiInsertBufferFlush(CopyMultiInsertInfo *miinfo,
recheckIndexes =
ExecInsertIndexTuples(resultRelInfo,
buffer->slots[i], estate, false,
- false, NULL, NIL);
+ false, NULL, NIL, false);
ExecARInsertTriggers(estate, resultRelInfo,
slots[i], recheckIndexes,
cstate->transition_capture);
@@ -1254,7 +1254,8 @@ CopyFrom(CopyFromState cstate)
false,
false,
NULL,
- NIL);
+ NIL,
+ false);
}
/* AFTER ROW INSERT Triggers */
diff --git a/src/backend/commands/indexcmds.c b/src/backend/commands/indexcmds.c
index 16ec0b114e..ff48f44c66 100644
--- a/src/backend/commands/indexcmds.c
+++ b/src/backend/commands/indexcmds.c
@@ -184,6 +184,7 @@ CheckIndexCompatible(Oid oldId,
Form_pg_am accessMethodForm;
IndexAmRoutine *amRoutine;
bool amcanorder;
+ bool amsummarizing;
int16 *coloptions;
IndexInfo *indexInfo;
int numberOfAttributes;
@@ -222,6 +223,7 @@ CheckIndexCompatible(Oid oldId,
ReleaseSysCache(tuple);
amcanorder = amRoutine->amcanorder;
+ amsummarizing = amRoutine->amsummarizing;
/*
* Compute the operator classes, collations, and exclusion operators for
@@ -232,7 +234,8 @@ CheckIndexCompatible(Oid oldId,
* ii_NumIndexKeyAttrs with same value.
*/
indexInfo = makeIndexInfo(numberOfAttributes, numberOfAttributes,
- accessMethodId, NIL, NIL, false, false, false, false);
+ accessMethodId, NIL, NIL, false, false,
+ false, false, amsummarizing);
typeObjectId = palloc_array(Oid, numberOfAttributes);
collationObjectId = palloc_array(Oid, numberOfAttributes);
classObjectId = palloc_array(Oid, numberOfAttributes);
@@ -550,6 +553,7 @@ DefineIndex(Oid relationId,
Form_pg_am accessMethodForm;
IndexAmRoutine *amRoutine;
bool amcanorder;
+ bool amissummarizing;
amoptions_function amoptions;
bool partitioned;
bool safe_index;
@@ -866,6 +870,7 @@ DefineIndex(Oid relationId,
amcanorder = amRoutine->amcanorder;
amoptions = amRoutine->amoptions;
+ amissummarizing = amRoutine->amsummarizing;
pfree(amRoutine);
ReleaseSysCache(tuple);
@@ -897,7 +902,8 @@ DefineIndex(Oid relationId,
stmt->unique,
stmt->nulls_not_distinct,
!concurrent,
- concurrent);
+ concurrent,
+ amissummarizing);
typeObjectId = palloc_array(Oid, numberOfAttributes);
collationObjectId = palloc_array(Oid, numberOfAttributes);
diff --git a/src/backend/executor/execIndexing.c b/src/backend/executor/execIndexing.c
index 6e88e72813..0533711219 100644
--- a/src/backend/executor/execIndexing.c
+++ b/src/backend/executor/execIndexing.c
@@ -259,15 +259,24 @@ ExecCloseIndices(ResultRelInfo *resultRelInfo)
* into all the relations indexing the result relation
* when a heap tuple is inserted into the result relation.
*
- * When 'update' is true, executor is performing an UPDATE
- * that could not use an optimization like heapam's HOT (in
- * more general terms a call to table_tuple_update() took
- * place and set 'update_indexes' to true). Receiving this
- * hint makes us consider if we should pass down the
- * 'indexUnchanged' hint in turn. That's something that we
- * figure out for each index_insert() call iff 'update' is
- * true. (When 'update' is false we already know not to pass
- * the hint to any index.)
+ * When 'update' is true and 'onlySummarizing' is false,
+ * executor is performing an UPDATE that could not use an
+ * optimization like heapam's HOT (in more general terms a
+ * call to table_tuple_update() took place and set
+ * 'update_indexes' to TUUI_All). Receiving this hint makes
+ * us consider if we should pass down the 'indexUnchanged'
+ * hint in turn. That's something that we figure out for
+ * each index_insert() call iff 'update' is true.
+ * (When 'update' is false we already know not to pass the
+ * hint to any index.)
+ *
+ * If onlySummarizing is set, an equivalent optimization to
+ * HOT has been applied and any updated columns are indexed
+ * only by summarizing indexes (or in more general terms a
+ * call to table_tuple_update() took place and set
+ * 'update_indexes' to TUUI_Summarizing). We can (and must)
+ * therefore only update the indexes that have
+ * 'amsummarizing' = true.
*
* Unique and exclusion constraints are enforced at the same
* time. This returns a list of index OIDs for any unique or
@@ -287,7 +296,8 @@ ExecInsertIndexTuples(ResultRelInfo *resultRelInfo,
bool update,
bool noDupErr,
bool *specConflict,
- List *arbiterIndexes)
+ List *arbiterIndexes,
+ bool onlySummarizing)
{
ItemPointer tupleid = &slot->tts_tid;
List *result = NIL;
@@ -343,6 +353,13 @@ ExecInsertIndexTuples(ResultRelInfo *resultRelInfo,
if (!indexInfo->ii_ReadyForInserts)
continue;
+ /*
+ * Skip processing of non-summarizing indexes if we only need
+ * to update summarizing indexes
+ */
+ if (onlySummarizing && !indexInfo->ii_Summarizing)
+ continue;
+
/* Check for partial index */
if (indexInfo->ii_Predicate != NIL)
{
diff --git a/src/backend/executor/execReplication.c b/src/backend/executor/execReplication.c
index c484f5c301..aca8484e96 100644
--- a/src/backend/executor/execReplication.c
+++ b/src/backend/executor/execReplication.c
@@ -445,7 +445,7 @@ ExecSimpleRelationInsert(ResultRelInfo *resultRelInfo,
if (resultRelInfo->ri_NumIndices > 0)
recheckIndexes = ExecInsertIndexTuples(resultRelInfo,
slot, estate, false, false,
- NULL, NIL);
+ NULL, NIL, false);
/* AFTER ROW INSERT Triggers */
ExecARInsertTriggers(estate, resultRelInfo, slot,
@@ -493,7 +493,7 @@ ExecSimpleRelationUpdate(ResultRelInfo *resultRelInfo,
if (!skip_tuple)
{
List *recheckIndexes = NIL;
- bool update_indexes;
+ TU_UpdateIndexes update_indexes;
/* Compute stored generated columns */
if (rel->rd_att->constr &&
@@ -510,10 +510,11 @@ ExecSimpleRelationUpdate(ResultRelInfo *resultRelInfo,
simple_table_tuple_update(rel, tid, slot, estate->es_snapshot,
&update_indexes);
- if (resultRelInfo->ri_NumIndices > 0 && update_indexes)
+ if (resultRelInfo->ri_NumIndices > 0 && update_indexes != TUUI_None)
recheckIndexes = ExecInsertIndexTuples(resultRelInfo,
slot, estate, true, false,
- NULL, NIL);
+ NULL, NIL,
+ update_indexes == TUUI_Summarizing);
/* AFTER ROW UPDATE Triggers */
ExecARUpdateTriggers(estate, resultRelInfo,
diff --git a/src/backend/executor/nodeModifyTable.c b/src/backend/executor/nodeModifyTable.c
index a94d7f86e5..8320a602e8 100644
--- a/src/backend/executor/nodeModifyTable.c
+++ b/src/backend/executor/nodeModifyTable.c
@@ -125,8 +125,8 @@ typedef struct ModifyTableContext
typedef struct UpdateContext
{
bool updated; /* did UPDATE actually occur? */
- bool updateIndexes; /* index update required? */
bool crossPartUpdate; /* was it a cross-partition update? */
+ TU_UpdateIndexes updateIndexes; /* Which index updates are required? */
/*
* Lock mode to acquire on the latest tuple version before performing
@@ -1106,7 +1106,8 @@ ExecInsert(ModifyTableContext *context,
recheckIndexes = ExecInsertIndexTuples(resultRelInfo,
slot, estate, false, true,
&specConflict,
- arbiterIndexes);
+ arbiterIndexes,
+ false);
/* adjust the tuple's state accordingly */
table_tuple_complete_speculative(resultRelationDesc, slot,
@@ -1145,7 +1146,8 @@ ExecInsert(ModifyTableContext *context,
if (resultRelInfo->ri_NumIndices > 0)
recheckIndexes = ExecInsertIndexTuples(resultRelInfo,
slot, estate, false,
- false, NULL, NIL);
+ false, NULL, NIL,
+ false);
}
}
@@ -2108,11 +2110,12 @@ ExecUpdateEpilogue(ModifyTableContext *context, UpdateContext *updateCxt,
ModifyTableState *mtstate = context->mtstate;
/* insert index entries for tuple if necessary */
- if (resultRelInfo->ri_NumIndices > 0 && updateCxt->updateIndexes)
+ if (resultRelInfo->ri_NumIndices > 0 && updateCxt->updateIndexes != TUUI_None)
recheckIndexes = ExecInsertIndexTuples(resultRelInfo,
slot, context->estate,
true, false,
- NULL, NIL);
+ NULL, NIL,
+ updateCxt->updateIndexes == TUUI_Summarizing);
/* AFTER ROW UPDATE Triggers */
ExecARUpdateTriggers(context->estate, resultRelInfo,
diff --git a/src/backend/nodes/makefuncs.c b/src/backend/nodes/makefuncs.c
index fe67baf142..73ad67597e 100644
--- a/src/backend/nodes/makefuncs.c
+++ b/src/backend/nodes/makefuncs.c
@@ -743,7 +743,8 @@ make_ands_implicit(Expr *clause)
*/
IndexInfo *
makeIndexInfo(int numattrs, int numkeyattrs, Oid amoid, List *expressions,
- List *predicates, bool unique, bool nulls_not_distinct, bool isready, bool concurrent)
+ List *predicates, bool unique, bool nulls_not_distinct,
+ bool isready, bool concurrent, bool summarizing)
{
IndexInfo *n = makeNode(IndexInfo);
@@ -757,6 +758,7 @@ makeIndexInfo(int numattrs, int numkeyattrs, Oid amoid, List *expressions,
n->ii_CheckedUnchanged = false;
n->ii_IndexUnchanged = false;
n->ii_Concurrent = concurrent;
+ n->ii_Summarizing = summarizing;
/* expressions */
n->ii_Expressions = expressions;
diff --git a/src/backend/utils/cache/relcache.c b/src/backend/utils/cache/relcache.c
index 13f7987373..fe260e416a 100644
--- a/src/backend/utils/cache/relcache.c
+++ b/src/backend/utils/cache/relcache.c
@@ -2440,10 +2440,11 @@ RelationDestroyRelation(Relation relation, bool remember_tupdesc)
list_free_deep(relation->rd_fkeylist);
list_free(relation->rd_indexlist);
list_free(relation->rd_statlist);
- bms_free(relation->rd_indexattr);
bms_free(relation->rd_keyattr);
bms_free(relation->rd_pkattr);
bms_free(relation->rd_idattr);
+ bms_free(relation->rd_hotblockingattr);
+ bms_free(relation->rd_summarizedattr);
if (relation->rd_pubdesc)
pfree(relation->rd_pubdesc);
if (relation->rd_options)
@@ -5167,10 +5168,11 @@ RelationGetIndexPredicate(Relation relation)
Bitmapset *
RelationGetIndexAttrBitmap(Relation relation, IndexAttrBitmapKind attrKind)
{
- Bitmapset *indexattrs; /* indexed columns */
Bitmapset *uindexattrs; /* columns in unique indexes */
Bitmapset *pkindexattrs; /* columns in the primary index */
Bitmapset *idindexattrs; /* columns in the replica identity */
+ Bitmapset *hotblockingattrs; /* columns with HOT blocking indexes */
+ Bitmapset *summarizedattrs; /* columns with summarizing indexes */
List *indexoidlist;
List *newindexoidlist;
Oid relpkindex;
@@ -5179,18 +5181,20 @@ RelationGetIndexAttrBitmap(Relation relation, IndexAttrBitmapKind attrKind)
MemoryContext oldcxt;
/* Quick exit if we already computed the result. */
- if (relation->rd_indexattr != NULL)
+ if (relation->rd_attrsvalid)
{
switch (attrKind)
{
- case INDEX_ATTR_BITMAP_ALL:
- return bms_copy(relation->rd_indexattr);
case INDEX_ATTR_BITMAP_KEY:
return bms_copy(relation->rd_keyattr);
case INDEX_ATTR_BITMAP_PRIMARY_KEY:
return bms_copy(relation->rd_pkattr);
case INDEX_ATTR_BITMAP_IDENTITY_KEY:
return bms_copy(relation->rd_idattr);
+ case INDEX_ATTR_BITMAP_HOT_BLOCKING:
+ return bms_copy(relation->rd_hotblockingattr);
+ case INDEX_ATTR_BITMAP_SUMMARIZED:
+ return bms_copy(relation->rd_summarizedattr);
default:
elog(ERROR, "unknown attrKind %u", attrKind);
}
@@ -5230,10 +5234,11 @@ restart:
* CONCURRENTLY is far enough along that we should ignore the index, it
* won't be returned at all by RelationGetIndexList.
*/
- indexattrs = NULL;
uindexattrs = NULL;
pkindexattrs = NULL;
idindexattrs = NULL;
+ hotblockingattrs = NULL;
+ summarizedattrs = NULL;
foreach(l, indexoidlist)
{
Oid indexOid = lfirst_oid(l);
@@ -5298,8 +5303,12 @@ restart:
*/
if (attrnum != 0)
{
- indexattrs = bms_add_member(indexattrs,
- attrnum - FirstLowInvalidHeapAttributeNumber);
+ if (indexDesc->rd_indam->amsummarizing)
+ summarizedattrs = bms_add_member(summarizedattrs,
+ attrnum - FirstLowInvalidHeapAttributeNumber);
+ else
+ hotblockingattrs = bms_add_member(hotblockingattrs,
+ attrnum - FirstLowInvalidHeapAttributeNumber);
if (isKey && i < indexDesc->rd_index->indnkeyatts)
uindexattrs = bms_add_member(uindexattrs,
@@ -5316,10 +5325,17 @@ restart:
}
/* Collect all attributes used in expressions, too */
- pull_varattnos(indexExpressions, 1, &indexattrs);
+ if (indexDesc->rd_indam->amsummarizing)
+ pull_varattnos(indexExpressions, 1, &summarizedattrs);
+ else
+ pull_varattnos(indexExpressions, 1, &hotblockingattrs);
- /* Collect all attributes in the index predicate, too */
- pull_varattnos(indexPredicate, 1, &indexattrs);
+ /*
+ * Collect all attributes in the index predicate, too. We have to ignore
+ * amsummarizing flag, because the row might become indexable, in which
+ * case we have to add it to the index.
+ */
+ pull_varattnos(indexPredicate, 1, &hotblockingattrs);
index_close(indexDesc, AccessShareLock);
}
@@ -5347,24 +5363,28 @@ restart:
bms_free(uindexattrs);
bms_free(pkindexattrs);
bms_free(idindexattrs);
- bms_free(indexattrs);
+ bms_free(hotblockingattrs);
+ bms_free(summarizedattrs);
goto restart;
}
/* Don't leak the old values of these bitmaps, if any */
- bms_free(relation->rd_indexattr);
- relation->rd_indexattr = NULL;
+ relation->rd_attrsvalid = false;
bms_free(relation->rd_keyattr);
relation->rd_keyattr = NULL;
bms_free(relation->rd_pkattr);
relation->rd_pkattr = NULL;
bms_free(relation->rd_idattr);
relation->rd_idattr = NULL;
+ bms_free(relation->rd_hotblockingattr);
+ relation->rd_hotblockingattr = NULL;
+ bms_free(relation->rd_summarizedattr);
+ relation->rd_summarizedattr = NULL;
/*
* Now save copies of the bitmaps in the relcache entry. We intentionally
- * set rd_indexattr last, because that's the one that signals validity of
+ * set rd_attrsvalid last, because that's the one that signals validity of
* the values; if we run out of memory before making that copy, we won't
* leave the relcache entry looking like the other ones are valid but
* empty.
@@ -5373,20 +5393,24 @@ restart:
relation->rd_keyattr = bms_copy(uindexattrs);
relation->rd_pkattr = bms_copy(pkindexattrs);
relation->rd_idattr = bms_copy(idindexattrs);
- relation->rd_indexattr = bms_copy(indexattrs);
+ relation->rd_hotblockingattr = bms_copy(hotblockingattrs);
+ relation->rd_summarizedattr = bms_copy(summarizedattrs);
+ relation->rd_attrsvalid = true;
MemoryContextSwitchTo(oldcxt);
/* We return our original working copy for caller to play with */
switch (attrKind)
{
- case INDEX_ATTR_BITMAP_ALL:
- return indexattrs;
case INDEX_ATTR_BITMAP_KEY:
return uindexattrs;
case INDEX_ATTR_BITMAP_PRIMARY_KEY:
return pkindexattrs;
case INDEX_ATTR_BITMAP_IDENTITY_KEY:
return idindexattrs;
+ case INDEX_ATTR_BITMAP_HOT_BLOCKING:
+ return hotblockingattrs;
+ case INDEX_ATTR_BITMAP_SUMMARIZED:
+ return summarizedattrs;
default:
elog(ERROR, "unknown attrKind %u", attrKind);
return NULL;
@@ -6307,7 +6331,7 @@ load_relcache_init_file(bool shared)
rel->rd_indexlist = NIL;
rel->rd_pkindex = InvalidOid;
rel->rd_replidindex = InvalidOid;
- rel->rd_indexattr = NULL;
+ rel->rd_attrsvalid = false;
rel->rd_keyattr = NULL;
rel->rd_pkattr = NULL;
rel->rd_idattr = NULL;
diff --git a/src/include/access/amapi.h b/src/include/access/amapi.h
index 4f1f67b4d0..281039ef67 100644
--- a/src/include/access/amapi.h
+++ b/src/include/access/amapi.h
@@ -244,6 +244,8 @@ typedef struct IndexAmRoutine
bool amcaninclude;
/* does AM use maintenance_work_mem? */
bool amusemaintenanceworkmem;
+ /* does AM store tuple information only at block granularity? */
+ bool amsummarizing;
/* OR of parallel vacuum flags. See vacuum.h for flags. */
uint8 amparallelvacuumoptions;
/* type of data stored in index, or InvalidOid if variable */
diff --git a/src/include/access/htup_details.h b/src/include/access/htup_details.h
index e01f4f35c8..331eec915d 100644
--- a/src/include/access/htup_details.h
+++ b/src/include/access/htup_details.h
@@ -291,6 +291,16 @@ struct HeapTupleHeaderData
*/
#define HEAP_TUPLE_HAS_MATCH HEAP_ONLY_TUPLE /* tuple has a join match */
+/*
+ * HEAP_TUPLE_SUMMARIZING_UPDATED is a temporary flag used to signal that
+ * of the indexed columns, only columns used in summarizing indexes were
+ * updated. It is only set on the in-memory copy of the new tuple version
+ * created by the update, which cannot itself have been HOT-updated at this
+ * point, so reusing the bit should never pose an issue.
+ */
+#define HEAP_TUPLE_SUMMARIZING_UPDATED HEAP_HOT_UPDATED
+
+
/*
* HeapTupleHeader accessor macros
*
@@ -543,6 +553,25 @@ StaticAssertDecl(MaxOffsetNumber < SpecTokenOffsetNumber,
(((tup)->t_infomask & HEAP_HASEXTERNAL) != 0)
+#define HeapTupleHeaderIsHOTWithSummaryUpdate(tup) \
+( \
+ ((tup)->t_infomask2 & HEAP_ONLY_TUPLE) != 0 && \
+ ((tup)->t_infomask2 & HEAP_TUPLE_SUMMARIZING_UPDATED) != 0 \
+)
+
+#define HeapTupleHeaderSetSummaryUpdate(tup) \
+( \
+ (tup)->t_infomask2 |= HEAP_TUPLE_SUMMARIZING_UPDATED \
+)
+
+#define HeapTupleHeaderClearSummaryUpdate(tup) \
+( \
+ (tup)->t_infomask2 &= ~HEAP_TUPLE_SUMMARIZING_UPDATED \
+)
+
+
+
+
/*
* BITMAPLEN(NATTS) -
* Computes size of null bitmap given number of data columns.
diff --git a/src/include/access/tableam.h b/src/include/access/tableam.h
index 652e96f1b0..db444abb35 100644
--- a/src/include/access/tableam.h
+++ b/src/include/access/tableam.h
@@ -102,6 +102,19 @@ typedef enum TM_Result
TM_WouldBlock
} TM_Result;
+/*
+ * Result codes for table_update(..., update_indexes*..).
+ * Used to determine which indexes to update.
+ */
+typedef enum TU_UpdateIndexes {
+ /* No indexed columns were updated (incl. TID addressing of tuple) */
+ TUUI_None = 0,
+ /* A non-summarizing indexed column was updated, or the TID has changed */
+ TUUI_All = 1,
+ /* Only summarized columns were updated, TID is unchanged */
+ TUUI_Summarizing = 2
+} TU_UpdateIndexes;
+
/*
* When table_tuple_update, table_tuple_delete, or table_tuple_lock fail
* because the target tuple is already outdated, they fill in this struct to
@@ -526,7 +539,7 @@ typedef struct TableAmRoutine
bool wait,
TM_FailureData *tmfd,
LockTupleMode *lockmode,
- bool *update_indexes);
+ TU_UpdateIndexes *update_indexes);
/* see table_tuple_lock() for reference about parameters */
TM_Result (*tuple_lock) (Relation rel,
@@ -1514,7 +1527,7 @@ static inline TM_Result
table_tuple_update(Relation rel, ItemPointer otid, TupleTableSlot *slot,
CommandId cid, Snapshot snapshot, Snapshot crosscheck,
bool wait, TM_FailureData *tmfd, LockTupleMode *lockmode,
- bool *update_indexes)
+ TU_UpdateIndexes *update_indexes)
{
return rel->rd_tableam->tuple_update(rel, otid, slot,
cid, snapshot, crosscheck,
@@ -2038,7 +2051,7 @@ extern void simple_table_tuple_delete(Relation rel, ItemPointer tid,
Snapshot snapshot);
extern void simple_table_tuple_update(Relation rel, ItemPointer otid,
TupleTableSlot *slot, Snapshot snapshot,
- bool *update_indexes);
+ TU_UpdateIndexes *update_indexes);
/* ----------------------------------------------------------------------------
diff --git a/src/include/executor/executor.h b/src/include/executor/executor.h
index e7e25c057e..551c5d7ae0 100644
--- a/src/include/executor/executor.h
+++ b/src/include/executor/executor.h
@@ -620,7 +620,8 @@ extern List *ExecInsertIndexTuples(ResultRelInfo *resultRelInfo,
TupleTableSlot *slot, EState *estate,
bool update,
bool noDupErr,
- bool *specConflict, List *arbiterIndexes);
+ bool *specConflict, List *arbiterIndexes,
+ bool onlySummarizing);
extern bool ExecCheckIndexConstraints(ResultRelInfo *resultRelInfo,
TupleTableSlot *slot,
EState *estate, ItemPointer conflictTid,
diff --git a/src/include/nodes/execnodes.h b/src/include/nodes/execnodes.h
index 20f4c8b35f..3f1b8818a1 100644
--- a/src/include/nodes/execnodes.h
+++ b/src/include/nodes/execnodes.h
@@ -161,6 +161,7 @@ typedef struct ExprState
* IndexUnchanged aminsert hint, cached for retail inserts
* Concurrent are we doing a concurrent index build?
* BrokenHotChain did we detect any broken HOT chains?
+ * Summarizing is it a summarizing index?
* ParallelWorkers # of workers requested (excludes leader)
* Am Oid of index AM
* AmCache private cache area for index AM
@@ -194,6 +195,7 @@ typedef struct IndexInfo
bool ii_IndexUnchanged;
bool ii_Concurrent;
bool ii_BrokenHotChain;
+ bool ii_Summarizing;
int ii_ParallelWorkers;
Oid ii_Am;
void *ii_AmCache;
diff --git a/src/include/nodes/makefuncs.h b/src/include/nodes/makefuncs.h
index 80f1d5336b..64651c9b00 100644
--- a/src/include/nodes/makefuncs.h
+++ b/src/include/nodes/makefuncs.h
@@ -96,7 +96,9 @@ extern List *make_ands_implicit(Expr *clause);
extern IndexInfo *makeIndexInfo(int numattrs, int numkeyattrs, Oid amoid,
List *expressions, List *predicates,
- bool unique, bool nulls_not_distinct, bool isready, bool concurrent);
+ bool unique, bool nulls_not_distinct,
+ bool isready, bool concurrent,
+ bool summarizing);
extern DefElem *makeDefElem(char *name, Node *arg, int location);
extern DefElem *makeDefElemExtended(char *nameSpace, char *name, Node *arg,
diff --git a/src/include/utils/rel.h b/src/include/utils/rel.h
index 67f994cb3e..2294cb1a6a 100644
--- a/src/include/utils/rel.h
+++ b/src/include/utils/rel.h
@@ -156,10 +156,12 @@ typedef struct RelationData
List *rd_statlist; /* list of OIDs of extended stats */
/* data managed by RelationGetIndexAttrBitmap: */
- Bitmapset *rd_indexattr; /* identifies columns used in indexes */
+ bool rd_attrsvalid; /* are bitmaps of attrs valid? */
Bitmapset *rd_keyattr; /* cols that can be ref'd by foreign keys */
Bitmapset *rd_pkattr; /* cols included in primary key */
Bitmapset *rd_idattr; /* included in replica identity index */
+ Bitmapset *rd_hotblockingattr; /* cols blocking HOT update */
+ Bitmapset *rd_summarizedattr; /* cols indexed by block-or-larger summarizing indexes */
PublicationDesc *rd_pubdesc; /* publication descriptor, or NULL */
diff --git a/src/include/utils/relcache.h b/src/include/utils/relcache.h
index 88460f21c5..beeb28b83c 100644
--- a/src/include/utils/relcache.h
+++ b/src/include/utils/relcache.h
@@ -56,10 +56,11 @@ extern bytea **RelationGetIndexAttOptions(Relation relation, bool copy);
typedef enum IndexAttrBitmapKind
{
- INDEX_ATTR_BITMAP_ALL,
INDEX_ATTR_BITMAP_KEY,
INDEX_ATTR_BITMAP_PRIMARY_KEY,
- INDEX_ATTR_BITMAP_IDENTITY_KEY
+ INDEX_ATTR_BITMAP_IDENTITY_KEY,
+ INDEX_ATTR_BITMAP_HOT_BLOCKING,
+ INDEX_ATTR_BITMAP_SUMMARIZED
} IndexAttrBitmapKind;
extern Bitmapset *RelationGetIndexAttrBitmap(Relation relation,
diff --git a/src/test/modules/dummy_index_am/dummy_index_am.c b/src/test/modules/dummy_index_am/dummy_index_am.c
index dfb1ebb846..c14e0abe0c 100644
--- a/src/test/modules/dummy_index_am/dummy_index_am.c
+++ b/src/test/modules/dummy_index_am/dummy_index_am.c
@@ -296,6 +296,7 @@ dihandler(PG_FUNCTION_ARGS)
amroutine->amcanparallel = false;
amroutine->amcaninclude = false;
amroutine->amusemaintenanceworkmem = false;
+ amroutine->amsummarizing = false;
amroutine->amparallelvacuumoptions = VACUUM_OPTION_NO_PARALLEL;
amroutine->amkeytype = InvalidOid;
diff --git a/src/test/regress/expected/stats.out b/src/test/regress/expected/stats.out
index 937b2101b3..728c474cfa 100644
--- a/src/test/regress/expected/stats.out
+++ b/src/test/regress/expected/stats.out
@@ -1354,4 +1354,114 @@ SELECT :io_stats_post_reset < :io_stats_pre_reset;
t
(1 row)
+-- test BRIN index doesn't block HOT update
+CREATE TABLE brin_hot (
+ id integer PRIMARY KEY,
+ val integer NOT NULL
+) WITH (autovacuum_enabled = off, fillfactor = 70);
+INSERT INTO brin_hot SELECT *, 0 FROM generate_series(1, 235);
+CREATE INDEX val_brin ON brin_hot using brin(val);
+CREATE FUNCTION wait_for_hot_stats() RETURNS void AS $$
+DECLARE
+ start_time timestamptz := clock_timestamp();
+ updated bool;
+BEGIN
+ -- we don't want to wait forever; loop will exit after 30 seconds
+ FOR i IN 1 .. 300 LOOP
+ SELECT (pg_stat_get_tuples_hot_updated('brin_hot'::regclass::oid) > 0) INTO updated;
+ EXIT WHEN updated;
+
+ -- wait a little
+ PERFORM pg_sleep_for('100 milliseconds');
+ -- reset stats snapshot so we can test again
+ PERFORM pg_stat_clear_snapshot();
+ END LOOP;
+ -- report time waited in postmaster log (where it won't change test output)
+ RAISE log 'wait_for_hot_stats delayed % seconds',
+ EXTRACT(epoch FROM clock_timestamp() - start_time);
+END
+$$ LANGUAGE plpgsql;
+UPDATE brin_hot SET val = -3 WHERE id = 42;
+-- We can't just call wait_for_hot_stats() at this point, because we only
+-- transmit stats when the session goes idle, and we probably didn't
+-- transmit the last couple of counts yet thanks to the rate-limiting logic
+-- in pgstat_report_stat(). But instead of waiting for the rate limiter's
+-- timeout to elapse, let's just start a new session. The old one will
+-- then send its stats before dying.
+\c -
+SELECT wait_for_hot_stats();
+ wait_for_hot_stats
+--------------------
+
+(1 row)
+
+SELECT pg_stat_get_tuples_hot_updated('brin_hot'::regclass::oid);
+ pg_stat_get_tuples_hot_updated
+--------------------------------
+ 1
+(1 row)
+
+DROP TABLE brin_hot;
+DROP FUNCTION wait_for_hot_stats();
+-- Test handling of index predicates - updating attributes in predicates
+-- should block HOT even for BRIN. We update a row that was not indexed
+-- due to the index predicate, so that it becomes indexable.
+CREATE TABLE brin_hot_2 (a int, b int);
+INSERT INTO brin_hot_2 VALUES (1, 100);
+CREATE INDEX ON brin_hot_2 USING brin (b) WHERE a = 2;
+UPDATE brin_hot_2 SET a = 2;
+EXPLAIN (COSTS OFF) SELECT * FROM brin_hot_2 WHERE a = 2 AND b = 100;
+ QUERY PLAN
+-----------------------------------
+ Seq Scan on brin_hot_2
+ Filter: ((a = 2) AND (b = 100))
+(2 rows)
+
+SELECT COUNT(*) FROM brin_hot_2 WHERE a = 2 AND b = 100;
+ count
+-------
+ 1
+(1 row)
+
+SET enable_seqscan = off;
+EXPLAIN (COSTS OFF) SELECT * FROM brin_hot_2 WHERE a = 2 AND b = 100;
+ QUERY PLAN
+---------------------------------------------
+ Bitmap Heap Scan on brin_hot_2
+ Recheck Cond: ((b = 100) AND (a = 2))
+ -> Bitmap Index Scan on brin_hot_2_b_idx
+ Index Cond: (b = 100)
+(4 rows)
+
+SELECT COUNT(*) FROM brin_hot_2 WHERE a = 2 AND b = 100;
+ count
+-------
+ 1
+(1 row)
+
+DROP TABLE brin_hot_2;
+-- Test that updates to indexed columns are still propagated to the
+-- BRIN column.
+-- https://postgr.es/m/05ebcb44-f383-86e3-4f31-0a97a55634cf@enterprisedb.com
+CREATE TABLE brin_hot_3 (a int, filler text) WITH (fillfactor = 10);
+INSERT INTO brin_hot_3 SELECT 1, repeat(' ', 500) FROM generate_series(1, 20);
+CREATE INDEX ON brin_hot_3 USING brin (a) WITH (pages_per_range = 1);
+UPDATE brin_hot_3 SET a = 2;
+EXPLAIN (COSTS OFF) SELECT * FROM brin_hot_3 WHERE a = 2;
+ QUERY PLAN
+---------------------------------------------
+ Bitmap Heap Scan on brin_hot_3
+ Recheck Cond: (a = 2)
+ -> Bitmap Index Scan on brin_hot_3_a_idx
+ Index Cond: (a = 2)
+(4 rows)
+
+SELECT COUNT(*) FROM brin_hot_3 WHERE a = 2;
+ count
+-------
+ 20
+(1 row)
+
+DROP TABLE brin_hot_3;
+SET enable_seqscan = on;
-- End of Stats Test
diff --git a/src/test/regress/sql/stats.sql b/src/test/regress/sql/stats.sql
index 74e592aa8a..e113f6906c 100644
--- a/src/test/regress/sql/stats.sql
+++ b/src/test/regress/sql/stats.sql
@@ -535,7 +535,6 @@ SET enable_seqscan TO on;
SELECT pg_stat_get_replication_slot(NULL);
SELECT pg_stat_get_subscription_stats(NULL);
-
-- Test that the following operations are tracked in pg_stat_io:
-- - reads of target blocks into shared buffers
-- - writes of shared buffers to permanent storage
@@ -678,4 +677,85 @@ SELECT sum(evictions) + sum(reuses) + sum(extends) + sum(fsyncs) + sum(reads) +
FROM pg_stat_io \gset
SELECT :io_stats_post_reset < :io_stats_pre_reset;
+
+-- test BRIN index doesn't block HOT update
+CREATE TABLE brin_hot (
+ id integer PRIMARY KEY,
+ val integer NOT NULL
+) WITH (autovacuum_enabled = off, fillfactor = 70);
+
+INSERT INTO brin_hot SELECT *, 0 FROM generate_series(1, 235);
+CREATE INDEX val_brin ON brin_hot using brin(val);
+
+CREATE FUNCTION wait_for_hot_stats() RETURNS void AS $$
+DECLARE
+ start_time timestamptz := clock_timestamp();
+ updated bool;
+BEGIN
+ -- we don't want to wait forever; loop will exit after 30 seconds
+ FOR i IN 1 .. 300 LOOP
+ SELECT (pg_stat_get_tuples_hot_updated('brin_hot'::regclass::oid) > 0) INTO updated;
+ EXIT WHEN updated;
+
+ -- wait a little
+ PERFORM pg_sleep_for('100 milliseconds');
+ -- reset stats snapshot so we can test again
+ PERFORM pg_stat_clear_snapshot();
+ END LOOP;
+ -- report time waited in postmaster log (where it won't change test output)
+ RAISE log 'wait_for_hot_stats delayed % seconds',
+ EXTRACT(epoch FROM clock_timestamp() - start_time);
+END
+$$ LANGUAGE plpgsql;
+
+UPDATE brin_hot SET val = -3 WHERE id = 42;
+
+-- We can't just call wait_for_hot_stats() at this point, because we only
+-- transmit stats when the session goes idle, and we probably didn't
+-- transmit the last couple of counts yet thanks to the rate-limiting logic
+-- in pgstat_report_stat(). But instead of waiting for the rate limiter's
+-- timeout to elapse, let's just start a new session. The old one will
+-- then send its stats before dying.
+\c -
+
+SELECT wait_for_hot_stats();
+SELECT pg_stat_get_tuples_hot_updated('brin_hot'::regclass::oid);
+
+DROP TABLE brin_hot;
+DROP FUNCTION wait_for_hot_stats();
+
+-- Test handling of index predicates - updating attributes in predicates
+-- should block HOT even for BRIN. We update a row that was not indexed
+-- due to the index predicate, so that it becomes indexable.
+CREATE TABLE brin_hot_2 (a int, b int);
+INSERT INTO brin_hot_2 VALUES (1, 100);
+CREATE INDEX ON brin_hot_2 USING brin (b) WHERE a = 2;
+
+UPDATE brin_hot_2 SET a = 2;
+
+EXPLAIN (COSTS OFF) SELECT * FROM brin_hot_2 WHERE a = 2 AND b = 100;
+SELECT COUNT(*) FROM brin_hot_2 WHERE a = 2 AND b = 100;
+
+SET enable_seqscan = off;
+
+EXPLAIN (COSTS OFF) SELECT * FROM brin_hot_2 WHERE a = 2 AND b = 100;
+SELECT COUNT(*) FROM brin_hot_2 WHERE a = 2 AND b = 100;
+
+DROP TABLE brin_hot_2;
+
+-- Test that updates to indexed columns are still propagated to the
+-- BRIN column.
+-- https://postgr.es/m/05ebcb44-f383-86e3-4f31-0a97a55634cf@enterprisedb.com
+CREATE TABLE brin_hot_3 (a int, filler text) WITH (fillfactor = 10);
+INSERT INTO brin_hot_3 SELECT 1, repeat(' ', 500) FROM generate_series(1, 20);
+CREATE INDEX ON brin_hot_3 USING brin (a) WITH (pages_per_range = 1);
+UPDATE brin_hot_3 SET a = 2;
+
+EXPLAIN (COSTS OFF) SELECT * FROM brin_hot_3 WHERE a = 2;
+SELECT COUNT(*) FROM brin_hot_3 WHERE a = 2;
+
+DROP TABLE brin_hot_3;
+
+SET enable_seqscan = on;
+
-- End of Stats Test
--
2.33.0.windows.2
Hi,
On 2/19/23 02:03, Matthias van de Meent wrote:
On Thu, 16 Jun 2022 at 15:05, Tomas Vondra
<tomas.vondra@enterprisedb.com> wrote:
I've pushed the revert. Let's try again for PG16.
As we discussed in person at the developer meeting, here's a patch to
try again for PG16.
It combines the committed patches with my fix, and adds some
additional comments and polish. I am confident the code is correct,
but not that it is clean (see the commit message of the patch for
details).
Thanks for the patch. I took a quick look, and I agree it seems correct,
and fairly clean too. Which places do you think need cleanup/improvement?
AFAICS some of the code comes from the original (reverted) patch, so
that should be fairly non-controversial. The two new bits seem to be
TU_UpdateIndexes and HEAP_TUPLE_SUMMARIZING_UPDATED.
I have some minor review comments regarding TU_UpdateIndexes, but in
principle it's fine - we need to track/pass the flag somehow, and this
is reasonable IMHO.
I'm not entirely sure about HEAP_TUPLE_SUMMARIZING_UPDATED yet. It's
pretty much a counter-part to TU_UpdateIndexes - until now we've only
had HOT vs. non-HOT, and one bit in header (HEAP_HOT_UPDATED) was
sufficient for that. But now we need 3 states, so an extra bit is
needed. That's fine, and using another bit in the header makes sense.
The commit message says the bit is "otherwise unused" but after a while
I realized it's just an "alias" for HEAP_HOT_UPDATED - I guess it means
it's unused in the places that need to set it, right? I wonder if
something can be confused by this - thinking it's a regular HOT update,
and doing something wrong.
Do we have some precedent for using a header bit like this? Something
that'd set a bit on in-memory tuple only to reset it shortly after?
Does it make sense to add asserts that'd ensure we can't set the bit
twice? Like a code setting both HEAP_HOT_UPDATED and the new flag?
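For illustration, what I had in mind is roughly this - just a sketch of a
defensive variant of the setter macro from the patch; the Assert is not
part of the patch:

#define HeapTupleHeaderSetSummaryUpdate(tup) \
do { \
	/* the aliased bit must not already be set for its "real" meaning */ \
	Assert(((tup)->t_infomask2 & HEAP_HOT_UPDATED) == 0); \
	(tup)->t_infomask2 |= HEAP_TUPLE_SUMMARIZING_UPDATED; \
} while (0)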
A couple more minor comments after eye-balling the patch:
* I think heap_update would benefit from a couple more comments, e.g.
the comment before calculating sum_attrs should probably mention the
summarization optimization.
* heapam_tuple_update is probably the one place that I find not quite readable.
* I don't understand why the TU_UpdateIndexes fields are prefixed TUUI_?
Why not just use TU_?
* indexam.sgml says:
Access methods that do not point to individual tuples, but to (like
I guess "page range" (or something like that) is missing.
Note: I wonder how difficult would it be to also deal with attributes in
predicates. IIRC if the predicate is false, we can ignore the index, but
the consensus back then was it's too expensive as it can't be done using
the bitmaps and requires evaluating the expression, etc. But maybe there
are ways to work around that by first checking everything except for the
index predicates, and only when we still think HOT is possible we would
check the predicates. Tables usually have only a couple partial indexes,
so this might be a win. Not something this patch should/needs to do, of
course.
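To sketch the idea (GetPartialIndexInfos() and PredicateMatches() are
hypothetical helpers, only meant to show the shape of the check, not
something that exists today):

static bool
hot_update_still_possible(Relation rel, Bitmapset *modified_attrs,
						  HeapTuple oldtup, HeapTuple newtup)
{
	Bitmapset  *hot_attrs;
	ListCell   *lc;

	/* step 1: the cheap bitmap check we already do */
	hot_attrs = RelationGetIndexAttrBitmap(rel, INDEX_ATTR_BITMAP_HOT_BLOCKING);
	if (bms_overlap(modified_attrs, hot_attrs))
		return false;

	/*
	 * step 2: only now evaluate partial-index predicates, requiring each
	 * predicate to give the same answer for the old and new tuple version
	 */
	foreach(lc, GetPartialIndexInfos(rel))
	{
		IndexInfo  *ii = (IndexInfo *) lfirst(lc);

		if (PredicateMatches(ii, rel, oldtup) != PredicateMatches(ii, rel, newtup))
			return false;
	}

	return true;
}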
* bikeshedding: rel.h says
Bitmapset *rd_summarizedattr; /* cols indexed by block-or-larger
summarizing indexes */
I think the "block-or-larger" bit is unnecessary. I think the crucial
bit is the index does not contain pointers to individual tuples.
Similarly for indexam.sgml, which talks about "at least all tuples in
one block".
regards
--
Tomas Vondra
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company
Hi,
On Sun, 19 Feb 2023 at 16:04, Tomas Vondra
<tomas.vondra@enterprisedb.com> wrote:
Hi,
On 2/19/23 02:03, Matthias van de Meent wrote:
On Thu, 16 Jun 2022 at 15:05, Tomas Vondra
<tomas.vondra@enterprisedb.com> wrote:
I've pushed the revert. Let's try again for PG16.
As we discussed in person at the developer meeting, here's a patch to
try again for PG16.
It combines the committed patches with my fix, and adds some
additional comments and polish. I am confident the code is correct,
but not that it is clean (see the commit message of the patch for
details).
Thanks for the patch. I took a quick look, and I agree it seems correct,
and fairly clean too.
Thanks. Based on the feedback, attached is v2 of the patch, with the
following significant changes:
- We don't store the columns mentioned in predicates of summarized
indexes in the hot-blocking columns bitmap anymore; they are stored in
the summarized columns bitmap instead. This further reduces the chance
of failing to apply HOT with summarizing indexes.
- The heap tuple header bit for summarizing updates in inserted tuples
is replaced with passing an out parameter. This simplifies the logic
and decreases the chances of accidentally storing incorrect data.
Responses to feedback below.
Which places do you think need cleanup/improvement?
I wasn't confident about the use of HEAP_TUPLE_SUMMARIZING_UPDATED -
it's not a nice way to signal what indexes to update. This has been
updated in the attached patch.
AFAICS some of the code comes from the original (reverted) patch, so
that should be fairly non-controversial. The two new bits seem to be
TU_UpdateIndexes and HEAP_TUPLE_SUMMARIZING_UPDATED.
Correct.
I have some minor review comments regarding TU_UpdateIndexes, but in
principle it's fine - we need to track/pass the flag somehow, and this
is reasonable IMHO.
I'm not entirely sure about HEAP_TUPLE_SUMMARIZING_UPDATED yet.
This is the part that I wasn't sure about either. I don't really like
the way it was implemented (temporary in-memory only bits in the tuple
header), but I also couldn't find an amazing alternative back in the
v15 beta window when I wrote the original fix for the now-reverted
commit. I've updated this to utilize 'out parameters' instead.
Although this change requires some more function signature changes, I
think it's better overall.
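To illustrate, this is roughly how the new flag flows from
table_tuple_update() to ExecInsertIndexTuples() with v2 (condensed; the
real code splits this across ExecUpdateAct and ExecUpdateEpilogue, and
rel, otid, slot, cid, snapshot, resultRelInfo and estate are assumed to
be in scope):

	TM_Result	result;
	TM_FailureData tmfd;
	LockTupleMode lockmode;
	TU_UpdateIndexes update_indexes;
	List	   *recheckIndexes = NIL;

	result = table_tuple_update(rel, otid, slot, cid,
								snapshot, InvalidSnapshot,
								true /* wait */ ,
								&tmfd, &lockmode, &update_indexes);

	if (result == TM_Ok && update_indexes != TU_None)
		recheckIndexes = ExecInsertIndexTuples(resultRelInfo, slot, estate,
											   true /* update */ , false,
											   NULL, NIL,
											   update_indexes == TU_Summarizing);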
It's
pretty much a counter-part to TU_UpdateIndexes - until now we've only
had HOT vs. non-HOT, and one bit in header (HEAP_HOT_UPDATED) was
sufficient for that. But now we need 3 states, so an extra bit is
needed. That's fine, and using another bit in the header makes sense.
The commit message says the bit is "otherwise unused" but after a while
I realized it's just an "alias" for HEAP_HOT_UPDATED - I guess it means
it's unused in the places that need to set it, right? I wonder if
something can be confused by this - thinking it's a regular HOT update,
and doing something wrong.
Yes. A newly inserted tuple, whether created from an update or a fresh
insert, can't already have been HOT-updated, so the bit is only
available (not in use for meaningful operations) in the in-memory
tuple processing path of new tuple insertion (be it update or actual
insert).
Do we have some precedent for using a header bit like this? Something
that'd set a bit on in-memory tuple only to reset it shortly after?
I can't find any, but I also haven't looked very far.
Does it make sense to add asserts that'd ensure we can't set the bit
twice? Like a code setting both HEAP_HOT_UPDATED and the new flag?
I'm doubtful of that, as this is basically a HOT chain intermediate
tuple being returned (but only in memory), instead of the normal
freshly inserted HOT tuple that's the end of a HOT chain. Anyway, that
code has been removed in the attached patch.
A couple more minor comments after eye-balling the patch:
* I think heap_update would benefit from a couple more comments, e.g.
the comment before calculating sum_attrs should probably mention the
summarization optimization.
Done.
* heapam_tuple_update is probably the one place that I find not quite readable.
Updated.
* I don't understand why the TU_UpdateIndexes fields are prefixed TUUI_?
Why not just use TU_?
I was under the (after checking, mistaken) impression that we already
had an enum that used the TU_* prefix. This has been updated.
* indexam.sgml says:
Access methods that do not point to individual tuples, but to (like
I guess "page range" (or something like that) is missing.
Fixed
Note: I wonder how difficult would it be to also deal with attributes in
predicates. IIRC if the predicate is false, we can ignore the index, but
the consensus back then was it's too expensive as it can't be done using
the bitmaps and requires evaluating the expression, etc. But maybe there
are ways to work around that by first checking everything except for the
index predicates, and only when we still think HOT is possible we would
check the predicates. Tables usually have only a couple partial indexes,
so this might be a win. Not something this patch should/needs to do, of
course.
Yes, I think that could be considered separately.
* bikeshedding: rel.h says
Bitmapset *rd_summarizedattr; /* cols indexed by block-or-larger
summarizing indexes */
I think the "block-or-larger" bit is unnecessary. I think the crucial
bit is the index does not contain pointers to individual tuples.
Similarly for indexam.sgml, which talks about "at least all tuples in
one block".
That makes sense, fixed.
Kind regards,
Matthias van de Meent
Attachments:
v2-0001-Ignore-BRIN-indexes-when-checking-for-HOT-updates.patch
From caafec48982ad1446eb6f060c8c071713d12122c Mon Sep 17 00:00:00 2001
From: Matthias van de Meent <boekewurm+postgres@gmail.com>
Date: Mon, 20 Feb 2023 18:38:30 +0100
Subject: [PATCH v2] Ignore BRIN indexes when checking for HOT updates
When determining whether an index update may be skipped by using HOT, we
can ignore attributes indexed by block summarizing indexes without
references to individual tuples that need to be cleaned up.
This also removes rd_indexattr list, and replaces it with rd_attrsvalid
flag. The list was not used anywhere, and a simple flag is sufficient.
A new type TU_UpdateIndexes is invented to provide a signal to the executor
to determine which indexes to update - no indexes, all indexes, or only
the summarizing indexes.
One otherwise unused bit in the heap tuple header is (ab)used to signal
that the HOT update would still update at least one summarizing index.
The bit is cleared immediately
Original patch by Josef Simanek, various fixes and improvements by
Tomas Vondra and me.
Authors: Josef Simanek, Tomas Vondra, Matthias van de Meent
Reviewed-by: Tomas Vondra, Alvaro Herrera
---
doc/src/sgml/indexam.sgml | 13 +++
src/backend/access/brin/brin.c | 1 +
src/backend/access/gin/ginutil.c | 1 +
src/backend/access/gist/gist.c | 1 +
src/backend/access/hash/hash.c | 1 +
src/backend/access/heap/heapam.c | 48 +++++++-
src/backend/access/heap/heapam_handler.c | 19 ++-
src/backend/access/nbtree/nbtree.c | 1 +
src/backend/access/spgist/spgutils.c | 1 +
src/backend/access/table/tableam.c | 2 +-
src/backend/catalog/index.c | 9 +-
src/backend/catalog/indexing.c | 35 ++++--
src/backend/commands/copyfrom.c | 5 +-
src/backend/commands/indexcmds.c | 10 +-
src/backend/executor/execIndexing.c | 37 ++++--
src/backend/executor/execReplication.c | 9 +-
src/backend/executor/nodeModifyTable.c | 13 ++-
src/backend/nodes/makefuncs.c | 7 +-
src/backend/utils/cache/relcache.c | 73 ++++++++----
src/include/access/amapi.h | 2 +
src/include/access/heapam.h | 5 +-
src/include/access/tableam.h | 19 ++-
src/include/executor/executor.h | 3 +-
src/include/nodes/execnodes.h | 2 +
src/include/nodes/makefuncs.h | 4 +-
src/include/utils/rel.h | 4 +-
src/include/utils/relcache.h | 5 +-
.../modules/dummy_index_am/dummy_index_am.c | 1 +
src/test/regress/expected/stats.out | 110 ++++++++++++++++++
src/test/regress/sql/stats.sql | 82 ++++++++++++-
30 files changed, 445 insertions(+), 78 deletions(-)
diff --git a/doc/src/sgml/indexam.sgml b/doc/src/sgml/indexam.sgml
index 4f83970c85..897419ec95 100644
--- a/doc/src/sgml/indexam.sgml
+++ b/doc/src/sgml/indexam.sgml
@@ -127,6 +127,9 @@ typedef struct IndexAmRoutine
bool amcaninclude;
/* does AM use maintenance_work_mem? */
bool amusemaintenanceworkmem;
+ /* does AM summarize tuples, with at least all tuples in the block
+ * summarized in one summary */
+ bool amsummarizing;
/* OR of parallel vacuum flags */
uint8 amparallelvacuumoptions;
/* type of data stored in index, or InvalidOid if variable */
@@ -247,6 +250,16 @@ typedef struct IndexAmRoutine
null, independently of <structfield>amoptionalkey</structfield>.
</para>
+ <para>
+ The <structfield>amsummarizing</structfield> flag indicates whether the
+ access method summarizes the indexed tuples, with summarizing granularity
+ of at least per block.
+ Access methods that do not point to individual tuples, but to block ranges
+ (like <acronym>BRIN</acronym>), may allow the <acronym>HOT</acronym> optimization
+ to continue. This does not apply to attributes referenced in index
+ predicates; an update of such an attribute always disables <acronym>HOT</acronym>.
+ </para>
+
</sect1>
<sect1 id="index-functions">
diff --git a/src/backend/access/brin/brin.c b/src/backend/access/brin/brin.c
index b5a5fa7b33..53e4721a54 100644
--- a/src/backend/access/brin/brin.c
+++ b/src/backend/access/brin/brin.c
@@ -109,6 +109,7 @@ brinhandler(PG_FUNCTION_ARGS)
amroutine->amcanparallel = false;
amroutine->amcaninclude = false;
amroutine->amusemaintenanceworkmem = false;
+ amroutine->amsummarizing = true;
amroutine->amparallelvacuumoptions =
VACUUM_OPTION_PARALLEL_CLEANUP;
amroutine->amkeytype = InvalidOid;
diff --git a/src/backend/access/gin/ginutil.c b/src/backend/access/gin/ginutil.c
index f05128ecf5..03fec1704e 100644
--- a/src/backend/access/gin/ginutil.c
+++ b/src/backend/access/gin/ginutil.c
@@ -56,6 +56,7 @@ ginhandler(PG_FUNCTION_ARGS)
amroutine->amcanparallel = false;
amroutine->amcaninclude = false;
amroutine->amusemaintenanceworkmem = true;
+ amroutine->amsummarizing = false;
amroutine->amparallelvacuumoptions =
VACUUM_OPTION_PARALLEL_BULKDEL | VACUUM_OPTION_PARALLEL_CLEANUP;
amroutine->amkeytype = InvalidOid;
diff --git a/src/backend/access/gist/gist.c b/src/backend/access/gist/gist.c
index ba394f08f6..ea72bcce1b 100644
--- a/src/backend/access/gist/gist.c
+++ b/src/backend/access/gist/gist.c
@@ -78,6 +78,7 @@ gisthandler(PG_FUNCTION_ARGS)
amroutine->amcanparallel = false;
amroutine->amcaninclude = true;
amroutine->amusemaintenanceworkmem = false;
+ amroutine->amsummarizing = false;
amroutine->amparallelvacuumoptions =
VACUUM_OPTION_PARALLEL_BULKDEL | VACUUM_OPTION_PARALLEL_COND_CLEANUP;
amroutine->amkeytype = InvalidOid;
diff --git a/src/backend/access/hash/hash.c b/src/backend/access/hash/hash.c
index eb258337d6..fc5d97f606 100644
--- a/src/backend/access/hash/hash.c
+++ b/src/backend/access/hash/hash.c
@@ -75,6 +75,7 @@ hashhandler(PG_FUNCTION_ARGS)
amroutine->amcanparallel = false;
amroutine->amcaninclude = false;
amroutine->amusemaintenanceworkmem = false;
+ amroutine->amsummarizing = false;
amroutine->amparallelvacuumoptions =
VACUUM_OPTION_PARALLEL_BULKDEL;
amroutine->amkeytype = INT4OID;
diff --git a/src/backend/access/heap/heapam.c b/src/backend/access/heap/heapam.c
index 7eb79cee58..89fdfe5353 100644
--- a/src/backend/access/heap/heapam.c
+++ b/src/backend/access/heap/heapam.c
@@ -2924,11 +2924,13 @@ simple_heap_delete(Relation relation, ItemPointer tid)
TM_Result
heap_update(Relation relation, ItemPointer otid, HeapTuple newtup,
CommandId cid, Snapshot crosscheck, bool wait,
- TM_FailureData *tmfd, LockTupleMode *lockmode)
+ TM_FailureData *tmfd, LockTupleMode *lockmode,
+ TU_UpdateIndexes *update_indexes)
{
TM_Result result;
TransactionId xid = GetCurrentTransactionId();
Bitmapset *hot_attrs;
+ Bitmapset *sum_attrs;
Bitmapset *key_attrs;
Bitmapset *id_attrs;
Bitmapset *interesting_attrs;
@@ -2951,6 +2953,7 @@ heap_update(Relation relation, ItemPointer otid, HeapTuple newtup,
bool have_tuple_lock = false;
bool iscombo;
bool use_hot_update = false;
+ bool summarized_update = false;
bool key_intact;
bool all_visible_cleared = false;
bool all_visible_cleared_new = false;
@@ -2996,12 +2999,16 @@ heap_update(Relation relation, ItemPointer otid, HeapTuple newtup,
* Note that we get copies of each bitmap, so we need not worry about
* relcache flush happening midway through.
*/
- hot_attrs = RelationGetIndexAttrBitmap(relation, INDEX_ATTR_BITMAP_ALL);
+ hot_attrs = RelationGetIndexAttrBitmap(relation,
+ INDEX_ATTR_BITMAP_HOT_BLOCKING);
+ sum_attrs = RelationGetIndexAttrBitmap(relation,
+ INDEX_ATTR_BITMAP_SUMMARIZED);
key_attrs = RelationGetIndexAttrBitmap(relation, INDEX_ATTR_BITMAP_KEY);
id_attrs = RelationGetIndexAttrBitmap(relation,
INDEX_ATTR_BITMAP_IDENTITY_KEY);
interesting_attrs = NULL;
interesting_attrs = bms_add_members(interesting_attrs, hot_attrs);
+ interesting_attrs = bms_add_members(interesting_attrs, sum_attrs);
interesting_attrs = bms_add_members(interesting_attrs, key_attrs);
interesting_attrs = bms_add_members(interesting_attrs, id_attrs);
@@ -3311,7 +3318,10 @@ l2:
UnlockTupleTuplock(relation, &(oldtup.t_self), *lockmode);
if (vmbuffer != InvalidBuffer)
ReleaseBuffer(vmbuffer);
+ *update_indexes = TU_None;
+
bms_free(hot_attrs);
+ bms_free(sum_attrs);
bms_free(key_attrs);
bms_free(id_attrs);
bms_free(modified_attrs);
@@ -3633,7 +3643,19 @@ l2:
* changed.
*/
if (!bms_overlap(modified_attrs, hot_attrs))
+ {
use_hot_update = true;
+
+ /*
+ * If none of the columns that are used in hot-blocking indexes
+ * were updated, we can apply HOT, but we do still need to check
+ * if we need to update the summarizing indexes, and update those
+ * indexes if the columns were updated, or we may fail to detect
+ * e.g. value bound changes in BRIN minmax indexes.
+ */
+ if (bms_overlap(modified_attrs, sum_attrs))
+ summarized_update = true;
+ }
}
else
{
@@ -3793,10 +3815,27 @@ l2:
heap_freetuple(heaptup);
}
+ /*
+ * If it is a HOT update, the update may still need to update summarized
+ * indexes, lest we fail to update those summaries and get incorrect
+ * results (for example, minmax bounds of the block may change with this
+ * update).
+ */
+ if (use_hot_update)
+ {
+ if (summarized_update)
+ *update_indexes = TU_Summarizing;
+ else
+ *update_indexes = TU_None;
+ }
+ else
+ *update_indexes = TU_All;
+
if (old_key_tuple != NULL && old_key_copied)
heap_freetuple(old_key_tuple);
bms_free(hot_attrs);
+ bms_free(sum_attrs);
bms_free(key_attrs);
bms_free(id_attrs);
bms_free(modified_attrs);
@@ -3960,7 +3999,8 @@ HeapDetermineColumnsInfo(Relation relation,
* via ereport().
*/
void
-simple_heap_update(Relation relation, ItemPointer otid, HeapTuple tup)
+simple_heap_update(Relation relation, ItemPointer otid, HeapTuple tup,
+ TU_UpdateIndexes *update_indexes)
{
TM_Result result;
TM_FailureData tmfd;
@@ -3969,7 +4009,7 @@ simple_heap_update(Relation relation, ItemPointer otid, HeapTuple tup)
result = heap_update(relation, otid, tup,
GetCurrentCommandId(true), InvalidSnapshot,
true /* wait for commit */ ,
- &tmfd, &lockmode);
+ &tmfd, &lockmode, update_indexes);
switch (result)
{
case TM_SelfModified:
diff --git a/src/backend/access/heap/heapam_handler.c b/src/backend/access/heap/heapam_handler.c
index c4b1916d36..a1d7d91ff7 100644
--- a/src/backend/access/heap/heapam_handler.c
+++ b/src/backend/access/heap/heapam_handler.c
@@ -314,7 +314,7 @@ static TM_Result
heapam_tuple_update(Relation relation, ItemPointer otid, TupleTableSlot *slot,
CommandId cid, Snapshot snapshot, Snapshot crosscheck,
bool wait, TM_FailureData *tmfd,
- LockTupleMode *lockmode, bool *update_indexes)
+ LockTupleMode *lockmode, TU_UpdateIndexes *update_indexes)
{
bool shouldFree = true;
HeapTuple tuple = ExecFetchSlotHeapTuple(slot, true, &shouldFree);
@@ -325,7 +325,7 @@ heapam_tuple_update(Relation relation, ItemPointer otid, TupleTableSlot *slot,
tuple->t_tableOid = slot->tts_tableOid;
result = heap_update(relation, otid, tuple, cid, crosscheck, wait,
- tmfd, lockmode);
+ tmfd, lockmode, update_indexes);
ItemPointerCopy(&tuple->t_self, &slot->tts_tid);
/*
@@ -334,9 +334,20 @@ heapam_tuple_update(Relation relation, ItemPointer otid, TupleTableSlot *slot,
* Note: heap_update returns the tid (location) of the new tuple in the
* t_self field.
*
- * If it's a HOT update, we mustn't insert new index entries.
+ * If the update is not HOT, we must update all indexes. If the update
+ * is HOT, it could be that we updated summarized columns, so we either
+ * update only summarized indexes, or none at all.
*/
- *update_indexes = result == TM_Ok && !HeapTupleIsHeapOnly(tuple);
+ if (result != TM_Ok)
+ {
+ Assert(*update_indexes == TU_None);
+ *update_indexes = TU_None;
+ }
+ else if (!HeapTupleIsHeapOnly(tuple))
+ Assert(*update_indexes == TU_All);
+ else
+ Assert(*update_indexes == TU_Summarizing ||
+ *update_indexes == TU_None);
if (shouldFree)
pfree(tuple);
diff --git a/src/backend/access/nbtree/nbtree.c b/src/backend/access/nbtree/nbtree.c
index 1cc88da032..681c30b0d8 100644
--- a/src/backend/access/nbtree/nbtree.c
+++ b/src/backend/access/nbtree/nbtree.c
@@ -114,6 +114,7 @@ bthandler(PG_FUNCTION_ARGS)
amroutine->amcanparallel = true;
amroutine->amcaninclude = true;
amroutine->amusemaintenanceworkmem = false;
+ amroutine->amsummarizing = false;
amroutine->amparallelvacuumoptions =
VACUUM_OPTION_PARALLEL_BULKDEL | VACUUM_OPTION_PARALLEL_COND_CLEANUP;
amroutine->amkeytype = InvalidOid;
diff --git a/src/backend/access/spgist/spgutils.c b/src/backend/access/spgist/spgutils.c
index 3761f2c193..4e7ff1d160 100644
--- a/src/backend/access/spgist/spgutils.c
+++ b/src/backend/access/spgist/spgutils.c
@@ -62,6 +62,7 @@ spghandler(PG_FUNCTION_ARGS)
amroutine->amcanparallel = false;
amroutine->amcaninclude = true;
amroutine->amusemaintenanceworkmem = false;
+ amroutine->amsummarizing = false;
amroutine->amparallelvacuumoptions =
VACUUM_OPTION_PARALLEL_BULKDEL | VACUUM_OPTION_PARALLEL_COND_CLEANUP;
amroutine->amkeytype = InvalidOid;
diff --git a/src/backend/access/table/tableam.c b/src/backend/access/table/tableam.c
index ef0d34fcee..a5e6c92f35 100644
--- a/src/backend/access/table/tableam.c
+++ b/src/backend/access/table/tableam.c
@@ -345,7 +345,7 @@ void
simple_table_tuple_update(Relation rel, ItemPointer otid,
TupleTableSlot *slot,
Snapshot snapshot,
- bool *update_indexes)
+ TU_UpdateIndexes *update_indexes)
{
TM_Result result;
TM_FailureData tmfd;
diff --git a/src/backend/catalog/index.c b/src/backend/catalog/index.c
index 41b16cb89b..e2fd035f44 100644
--- a/src/backend/catalog/index.c
+++ b/src/backend/catalog/index.c
@@ -1370,7 +1370,8 @@ index_concurrently_create_copy(Relation heapRelation, Oid oldIndexId,
oldInfo->ii_Unique,
oldInfo->ii_NullsNotDistinct,
false, /* not ready for inserts */
- true);
+ true,
+ indexRelation->rd_indam->amsummarizing);
/*
* Extract the list of column names and the column numbers for the new
@@ -2442,7 +2443,8 @@ BuildIndexInfo(Relation index)
indexStruct->indisunique,
indexStruct->indnullsnotdistinct,
indexStruct->indisready,
- false);
+ false,
+ index->rd_indam->amsummarizing);
/* fill in attribute numbers */
for (i = 0; i < numAtts; i++)
@@ -2502,7 +2504,8 @@ BuildDummyIndexInfo(Relation index)
indexStruct->indisunique,
indexStruct->indnullsnotdistinct,
indexStruct->indisready,
- false);
+ false,
+ index->rd_indam->amsummarizing);
/* fill in attribute numbers */
for (i = 0; i < numAtts; i++)
diff --git a/src/backend/catalog/indexing.c b/src/backend/catalog/indexing.c
index bb7cc3601c..a387eccdc4 100644
--- a/src/backend/catalog/indexing.c
+++ b/src/backend/catalog/indexing.c
@@ -72,7 +72,8 @@ CatalogCloseIndexes(CatalogIndexState indstate)
* This is effectively a cut-down version of ExecInsertIndexTuples.
*/
static void
-CatalogIndexInsert(CatalogIndexState indstate, HeapTuple heapTuple)
+CatalogIndexInsert(CatalogIndexState indstate, HeapTuple heapTuple,
+ TU_UpdateIndexes updateIndexes)
{
int i;
int numIndexes;
@@ -82,6 +83,7 @@ CatalogIndexInsert(CatalogIndexState indstate, HeapTuple heapTuple)
IndexInfo **indexInfoArray;
Datum values[INDEX_MAX_KEYS];
bool isnull[INDEX_MAX_KEYS];
+ bool onlySummarized = updateIndexes == TU_Summarizing;
/*
* HOT update does not require index inserts. But with asserts enabled we
@@ -89,10 +91,13 @@ CatalogIndexInsert(CatalogIndexState indstate, HeapTuple heapTuple)
* table/index.
*/
#ifndef USE_ASSERT_CHECKING
- if (HeapTupleIsHeapOnly(heapTuple))
+ if (HeapTupleIsHeapOnly(heapTuple) && !onlySummarized)
return;
#endif
+ if (onlySummarized)
+ Assert(HeapTupleIsHeapOnly(heapTuple));
+
/*
* Get information from the state structure. Fall out if nothing to do.
*/
@@ -135,13 +140,20 @@ CatalogIndexInsert(CatalogIndexState indstate, HeapTuple heapTuple)
/* see earlier check above */
#ifdef USE_ASSERT_CHECKING
- if (HeapTupleIsHeapOnly(heapTuple))
+ if (HeapTupleIsHeapOnly(heapTuple) && !onlySummarized)
{
Assert(!ReindexIsProcessingIndex(RelationGetRelid(index)));
continue;
}
#endif /* USE_ASSERT_CHECKING */
+ /*
+ * Skip insertions into non-summarizing indexes if we only need
+ * to update summarizing indexes
+ */
+ if (onlySummarized && !indexInfo->ii_Summarizing)
+ continue;
+
/*
* FormIndexDatum fills in its values and isnull parameters with the
* appropriate values for the column(s) of the index.
@@ -228,7 +240,7 @@ CatalogTupleInsert(Relation heapRel, HeapTuple tup)
simple_heap_insert(heapRel, tup);
- CatalogIndexInsert(indstate, tup);
+ CatalogIndexInsert(indstate, tup, TU_All);
CatalogCloseIndexes(indstate);
}
@@ -248,7 +260,7 @@ CatalogTupleInsertWithInfo(Relation heapRel, HeapTuple tup,
simple_heap_insert(heapRel, tup);
- CatalogIndexInsert(indstate, tup);
+ CatalogIndexInsert(indstate, tup, TU_All);
}
/*
@@ -279,7 +291,7 @@ CatalogTuplesMultiInsertWithInfo(Relation heapRel, TupleTableSlot **slot,
tuple = ExecFetchSlotHeapTuple(slot[i], true, &should_free);
tuple->t_tableOid = slot[i]->tts_tableOid;
- CatalogIndexInsert(indstate, tuple);
+ CatalogIndexInsert(indstate, tuple, TU_All);
if (should_free)
heap_freetuple(tuple);
@@ -301,14 +313,15 @@ void
CatalogTupleUpdate(Relation heapRel, ItemPointer otid, HeapTuple tup)
{
CatalogIndexState indstate;
+ TU_UpdateIndexes updateIndexes = TU_All;
CatalogTupleCheckConstraints(heapRel, tup);
indstate = CatalogOpenIndexes(heapRel);
- simple_heap_update(heapRel, otid, tup);
+ simple_heap_update(heapRel, otid, tup, &updateIndexes);
- CatalogIndexInsert(indstate, tup);
+ CatalogIndexInsert(indstate, tup, updateIndexes);
CatalogCloseIndexes(indstate);
}
@@ -324,11 +337,13 @@ void
CatalogTupleUpdateWithInfo(Relation heapRel, ItemPointer otid, HeapTuple tup,
CatalogIndexState indstate)
{
+ TU_UpdateIndexes updateIndexes = TU_All;
+
CatalogTupleCheckConstraints(heapRel, tup);
- simple_heap_update(heapRel, otid, tup);
+ simple_heap_update(heapRel, otid, tup, &updateIndexes);
- CatalogIndexInsert(indstate, tup);
+ CatalogIndexInsert(indstate, tup, updateIndexes);
}
/*
diff --git a/src/backend/commands/copyfrom.c b/src/backend/commands/copyfrom.c
index af52faca6d..564520289a 100644
--- a/src/backend/commands/copyfrom.c
+++ b/src/backend/commands/copyfrom.c
@@ -435,7 +435,7 @@ CopyMultiInsertBufferFlush(CopyMultiInsertInfo *miinfo,
recheckIndexes =
ExecInsertIndexTuples(resultRelInfo,
buffer->slots[i], estate, false,
- false, NULL, NIL);
+ false, NULL, NIL, false);
ExecARInsertTriggers(estate, resultRelInfo,
slots[i], recheckIndexes,
cstate->transition_capture);
@@ -1254,7 +1254,8 @@ CopyFrom(CopyFromState cstate)
false,
false,
NULL,
- NIL);
+ NIL,
+ false);
}
/* AFTER ROW INSERT Triggers */
diff --git a/src/backend/commands/indexcmds.c b/src/backend/commands/indexcmds.c
index 16ec0b114e..ff48f44c66 100644
--- a/src/backend/commands/indexcmds.c
+++ b/src/backend/commands/indexcmds.c
@@ -184,6 +184,7 @@ CheckIndexCompatible(Oid oldId,
Form_pg_am accessMethodForm;
IndexAmRoutine *amRoutine;
bool amcanorder;
+ bool amsummarizing;
int16 *coloptions;
IndexInfo *indexInfo;
int numberOfAttributes;
@@ -222,6 +223,7 @@ CheckIndexCompatible(Oid oldId,
ReleaseSysCache(tuple);
amcanorder = amRoutine->amcanorder;
+ amsummarizing = amRoutine->amsummarizing;
/*
* Compute the operator classes, collations, and exclusion operators for
@@ -232,7 +234,8 @@ CheckIndexCompatible(Oid oldId,
* ii_NumIndexKeyAttrs with same value.
*/
indexInfo = makeIndexInfo(numberOfAttributes, numberOfAttributes,
- accessMethodId, NIL, NIL, false, false, false, false);
+ accessMethodId, NIL, NIL, false, false,
+ false, false, amsummarizing);
typeObjectId = palloc_array(Oid, numberOfAttributes);
collationObjectId = palloc_array(Oid, numberOfAttributes);
classObjectId = palloc_array(Oid, numberOfAttributes);
@@ -550,6 +553,7 @@ DefineIndex(Oid relationId,
Form_pg_am accessMethodForm;
IndexAmRoutine *amRoutine;
bool amcanorder;
+ bool amissummarizing;
amoptions_function amoptions;
bool partitioned;
bool safe_index;
@@ -866,6 +870,7 @@ DefineIndex(Oid relationId,
amcanorder = amRoutine->amcanorder;
amoptions = amRoutine->amoptions;
+ amissummarizing = amRoutine->amsummarizing;
pfree(amRoutine);
ReleaseSysCache(tuple);
@@ -897,7 +902,8 @@ DefineIndex(Oid relationId,
stmt->unique,
stmt->nulls_not_distinct,
!concurrent,
- concurrent);
+ concurrent,
+ amissummarizing);
typeObjectId = palloc_array(Oid, numberOfAttributes);
collationObjectId = palloc_array(Oid, numberOfAttributes);
diff --git a/src/backend/executor/execIndexing.c b/src/backend/executor/execIndexing.c
index 6e88e72813..da28e5e40c 100644
--- a/src/backend/executor/execIndexing.c
+++ b/src/backend/executor/execIndexing.c
@@ -259,15 +259,24 @@ ExecCloseIndices(ResultRelInfo *resultRelInfo)
* into all the relations indexing the result relation
* when a heap tuple is inserted into the result relation.
*
- * When 'update' is true, executor is performing an UPDATE
- * that could not use an optimization like heapam's HOT (in
- * more general terms a call to table_tuple_update() took
- * place and set 'update_indexes' to true). Receiving this
- * hint makes us consider if we should pass down the
- * 'indexUnchanged' hint in turn. That's something that we
- * figure out for each index_insert() call iff 'update' is
- * true. (When 'update' is false we already know not to pass
- * the hint to any index.)
+ * When 'update' is true and 'onlySummarizing' is false,
+ * executor is performing an UPDATE that could not use an
+ * optimization like heapam's HOT (in more general terms a
+ * call to table_tuple_update() took place and set
+ * 'update_indexes' to TU_All). Receiving this hint makes
+ * us consider if we should pass down the 'indexUnchanged'
+ * hint in turn. That's something that we figure out for
+ * each index_insert() call iff 'update' is true.
+ * (When 'update' is false we already know not to pass the
+ * hint to any index.)
+ *
+ * If onlySummarizing is set, an equivalent optimization to
+ * HOT has been applied and any updated columns are indexed
+ * only by summarizing indexes (or in more general terms a
+ * call to table_tuple_update() took place and set
+ * 'update_indexes' to TU_Summarizing). We can (and must)
+ * therefore only update the indexes that have
+ * 'amsummarizing' = true.
*
* Unique and exclusion constraints are enforced at the same
* time. This returns a list of index OIDs for any unique or
@@ -287,7 +296,8 @@ ExecInsertIndexTuples(ResultRelInfo *resultRelInfo,
bool update,
bool noDupErr,
bool *specConflict,
- List *arbiterIndexes)
+ List *arbiterIndexes,
+ bool onlySummarizing)
{
ItemPointer tupleid = &slot->tts_tid;
List *result = NIL;
@@ -343,6 +353,13 @@ ExecInsertIndexTuples(ResultRelInfo *resultRelInfo,
if (!indexInfo->ii_ReadyForInserts)
continue;
+ /*
+ * Skip processing of non-summarizing indexes if we only
+ * update summarizing indexes
+ */
+ if (onlySummarizing && !indexInfo->ii_Summarizing)
+ continue;
+
/* Check for partial index */
if (indexInfo->ii_Predicate != NIL)
{
diff --git a/src/backend/executor/execReplication.c b/src/backend/executor/execReplication.c
index c484f5c301..4c01ef63cb 100644
--- a/src/backend/executor/execReplication.c
+++ b/src/backend/executor/execReplication.c
@@ -445,7 +445,7 @@ ExecSimpleRelationInsert(ResultRelInfo *resultRelInfo,
if (resultRelInfo->ri_NumIndices > 0)
recheckIndexes = ExecInsertIndexTuples(resultRelInfo,
slot, estate, false, false,
- NULL, NIL);
+ NULL, NIL, false);
/* AFTER ROW INSERT Triggers */
ExecARInsertTriggers(estate, resultRelInfo, slot,
@@ -493,7 +493,7 @@ ExecSimpleRelationUpdate(ResultRelInfo *resultRelInfo,
if (!skip_tuple)
{
List *recheckIndexes = NIL;
- bool update_indexes;
+ TU_UpdateIndexes update_indexes;
/* Compute stored generated columns */
if (rel->rd_att->constr &&
@@ -510,10 +510,11 @@ ExecSimpleRelationUpdate(ResultRelInfo *resultRelInfo,
simple_table_tuple_update(rel, tid, slot, estate->es_snapshot,
&update_indexes);
- if (resultRelInfo->ri_NumIndices > 0 && update_indexes)
+ if (resultRelInfo->ri_NumIndices > 0 && update_indexes != TU_None)
recheckIndexes = ExecInsertIndexTuples(resultRelInfo,
slot, estate, true, false,
- NULL, NIL);
+ NULL, NIL,
+ update_indexes == TU_Summarizing);
/* AFTER ROW UPDATE Triggers */
ExecARUpdateTriggers(estate, resultRelInfo,
diff --git a/src/backend/executor/nodeModifyTable.c b/src/backend/executor/nodeModifyTable.c
index a94d7f86e5..099ccda95a 100644
--- a/src/backend/executor/nodeModifyTable.c
+++ b/src/backend/executor/nodeModifyTable.c
@@ -125,8 +125,8 @@ typedef struct ModifyTableContext
typedef struct UpdateContext
{
bool updated; /* did UPDATE actually occur? */
- bool updateIndexes; /* index update required? */
bool crossPartUpdate; /* was it a cross-partition update? */
+ TU_UpdateIndexes updateIndexes; /* Which index updates are required? */
/*
* Lock mode to acquire on the latest tuple version before performing
@@ -1106,7 +1106,8 @@ ExecInsert(ModifyTableContext *context,
recheckIndexes = ExecInsertIndexTuples(resultRelInfo,
slot, estate, false, true,
&specConflict,
- arbiterIndexes);
+ arbiterIndexes,
+ false);
/* adjust the tuple's state accordingly */
table_tuple_complete_speculative(resultRelationDesc, slot,
@@ -1145,7 +1146,8 @@ ExecInsert(ModifyTableContext *context,
if (resultRelInfo->ri_NumIndices > 0)
recheckIndexes = ExecInsertIndexTuples(resultRelInfo,
slot, estate, false,
- false, NULL, NIL);
+ false, NULL, NIL,
+ false);
}
}
@@ -2108,11 +2110,12 @@ ExecUpdateEpilogue(ModifyTableContext *context, UpdateContext *updateCxt,
ModifyTableState *mtstate = context->mtstate;
/* insert index entries for tuple if necessary */
- if (resultRelInfo->ri_NumIndices > 0 && updateCxt->updateIndexes)
+ if (resultRelInfo->ri_NumIndices > 0 && updateCxt->updateIndexes != TU_None)
recheckIndexes = ExecInsertIndexTuples(resultRelInfo,
slot, context->estate,
true, false,
- NULL, NIL);
+ NULL, NIL,
+ updateCxt->updateIndexes == TU_Summarizing);
/* AFTER ROW UPDATE Triggers */
ExecARUpdateTriggers(context->estate, resultRelInfo,
diff --git a/src/backend/nodes/makefuncs.c b/src/backend/nodes/makefuncs.c
index fe67baf142..f23f8b7349 100644
--- a/src/backend/nodes/makefuncs.c
+++ b/src/backend/nodes/makefuncs.c
@@ -743,7 +743,8 @@ make_ands_implicit(Expr *clause)
*/
IndexInfo *
makeIndexInfo(int numattrs, int numkeyattrs, Oid amoid, List *expressions,
- List *predicates, bool unique, bool nulls_not_distinct, bool isready, bool concurrent)
+ List *predicates, bool unique, bool nulls_not_distinct,
+ bool isready, bool concurrent, bool summarizing)
{
IndexInfo *n = makeNode(IndexInfo);
@@ -757,6 +758,10 @@ makeIndexInfo(int numattrs, int numkeyattrs, Oid amoid, List *expressions,
n->ii_CheckedUnchanged = false;
n->ii_IndexUnchanged = false;
n->ii_Concurrent = concurrent;
+ n->ii_Summarizing = summarizing;
+
+ /* summarizing indexes cannot contain non-key attributes */
+ Assert(!summarizing || numkeyattrs == numattrs);
/* expressions */
n->ii_Expressions = expressions;
diff --git a/src/backend/utils/cache/relcache.c b/src/backend/utils/cache/relcache.c
index 13f7987373..092c5ed8c7 100644
--- a/src/backend/utils/cache/relcache.c
+++ b/src/backend/utils/cache/relcache.c
@@ -2440,10 +2440,11 @@ RelationDestroyRelation(Relation relation, bool remember_tupdesc)
list_free_deep(relation->rd_fkeylist);
list_free(relation->rd_indexlist);
list_free(relation->rd_statlist);
- bms_free(relation->rd_indexattr);
bms_free(relation->rd_keyattr);
bms_free(relation->rd_pkattr);
bms_free(relation->rd_idattr);
+ bms_free(relation->rd_hotblockingattr);
+ bms_free(relation->rd_summarizedattr);
if (relation->rd_pubdesc)
pfree(relation->rd_pubdesc);
if (relation->rd_options)
@@ -5167,10 +5168,11 @@ RelationGetIndexPredicate(Relation relation)
Bitmapset *
RelationGetIndexAttrBitmap(Relation relation, IndexAttrBitmapKind attrKind)
{
- Bitmapset *indexattrs; /* indexed columns */
Bitmapset *uindexattrs; /* columns in unique indexes */
Bitmapset *pkindexattrs; /* columns in the primary index */
Bitmapset *idindexattrs; /* columns in the replica identity */
+ Bitmapset *hotblockingattrs; /* columns with HOT blocking indexes */
+ Bitmapset *summarizedattrs; /* columns with summarizing indexes */
List *indexoidlist;
List *newindexoidlist;
Oid relpkindex;
@@ -5179,18 +5181,20 @@ RelationGetIndexAttrBitmap(Relation relation, IndexAttrBitmapKind attrKind)
MemoryContext oldcxt;
/* Quick exit if we already computed the result. */
- if (relation->rd_indexattr != NULL)
+ if (relation->rd_attrsvalid)
{
switch (attrKind)
{
- case INDEX_ATTR_BITMAP_ALL:
- return bms_copy(relation->rd_indexattr);
case INDEX_ATTR_BITMAP_KEY:
return bms_copy(relation->rd_keyattr);
case INDEX_ATTR_BITMAP_PRIMARY_KEY:
return bms_copy(relation->rd_pkattr);
case INDEX_ATTR_BITMAP_IDENTITY_KEY:
return bms_copy(relation->rd_idattr);
+ case INDEX_ATTR_BITMAP_HOT_BLOCKING:
+ return bms_copy(relation->rd_hotblockingattr);
+ case INDEX_ATTR_BITMAP_SUMMARIZED:
+ return bms_copy(relation->rd_summarizedattr);
default:
elog(ERROR, "unknown attrKind %u", attrKind);
}
@@ -5230,10 +5234,11 @@ restart:
* CONCURRENTLY is far enough along that we should ignore the index, it
* won't be returned at all by RelationGetIndexList.
*/
- indexattrs = NULL;
uindexattrs = NULL;
pkindexattrs = NULL;
idindexattrs = NULL;
+ hotblockingattrs = NULL;
+ summarizedattrs = NULL;
foreach(l, indexoidlist)
{
Oid indexOid = lfirst_oid(l);
@@ -5291,15 +5296,25 @@ restart:
/*
* Since we have covering indexes with non-key columns, we must
* handle them accurately here. non-key columns must be added into
- * indexattrs, since they are in index, and HOT-update shouldn't
- * miss them. Obviously, non-key columns couldn't be referenced by
+ * hotblockingattrs, since they are in index, and HOT-update
+ * shouldn't miss them.
+ *
+ * Summarizing indexes do not block HOT, but do need to be updated
+ * when the column value changes, thus require a separate
+ * attribute bitmapset.
+ *
+ * Obviously, non-key columns couldn't be referenced by
* foreign key or identity key. Hence we do not include them into
* uindexattrs, pkindexattrs and idindexattrs bitmaps.
*/
if (attrnum != 0)
{
- indexattrs = bms_add_member(indexattrs,
- attrnum - FirstLowInvalidHeapAttributeNumber);
+ if (indexDesc->rd_indam->amsummarizing)
+ summarizedattrs = bms_add_member(summarizedattrs,
+ attrnum - FirstLowInvalidHeapAttributeNumber);
+ else
+ hotblockingattrs = bms_add_member(hotblockingattrs,
+ attrnum - FirstLowInvalidHeapAttributeNumber);
if (isKey && i < indexDesc->rd_index->indnkeyatts)
uindexattrs = bms_add_member(uindexattrs,
@@ -5316,10 +5331,18 @@ restart:
}
/* Collect all attributes used in expressions, too */
- pull_varattnos(indexExpressions, 1, &indexattrs);
+ if (indexDesc->rd_indam->amsummarizing)
+ pull_varattnos(indexExpressions, 1, &summarizedattrs);
+ else
+ pull_varattnos(indexExpressions, 1, &hotblockingattrs);
- /* Collect all attributes in the index predicate, too */
- pull_varattnos(indexPredicate, 1, &indexattrs);
+ /*
+ * Collect all attributes in the index predicate, too.
+ */
+ if (indexDesc->rd_indam->amsummarizing)
+ pull_varattnos(indexPredicate, 1, &summarizedattrs);
+ else
+ pull_varattnos(indexPredicate, 1, &hotblockingattrs);
index_close(indexDesc, AccessShareLock);
}
@@ -5347,24 +5370,28 @@ restart:
bms_free(uindexattrs);
bms_free(pkindexattrs);
bms_free(idindexattrs);
- bms_free(indexattrs);
+ bms_free(hotblockingattrs);
+ bms_free(summarizedattrs);
goto restart;
}
/* Don't leak the old values of these bitmaps, if any */
- bms_free(relation->rd_indexattr);
- relation->rd_indexattr = NULL;
+ relation->rd_attrsvalid = false;
bms_free(relation->rd_keyattr);
relation->rd_keyattr = NULL;
bms_free(relation->rd_pkattr);
relation->rd_pkattr = NULL;
bms_free(relation->rd_idattr);
relation->rd_idattr = NULL;
+ bms_free(relation->rd_hotblockingattr);
+ relation->rd_hotblockingattr = NULL;
+ bms_free(relation->rd_summarizedattr);
+ relation->rd_summarizedattr = NULL;
/*
* Now save copies of the bitmaps in the relcache entry. We intentionally
- * set rd_indexattr last, because that's the one that signals validity of
+ * set rd_attrsvalid last, because that's the one that signals validity of
* the values; if we run out of memory before making that copy, we won't
* leave the relcache entry looking like the other ones are valid but
* empty.
@@ -5373,20 +5400,24 @@ restart:
relation->rd_keyattr = bms_copy(uindexattrs);
relation->rd_pkattr = bms_copy(pkindexattrs);
relation->rd_idattr = bms_copy(idindexattrs);
- relation->rd_indexattr = bms_copy(indexattrs);
+ relation->rd_hotblockingattr = bms_copy(hotblockingattrs);
+ relation->rd_summarizedattr = bms_copy(summarizedattrs);
+ relation->rd_attrsvalid = true;
MemoryContextSwitchTo(oldcxt);
/* We return our original working copy for caller to play with */
switch (attrKind)
{
- case INDEX_ATTR_BITMAP_ALL:
- return indexattrs;
case INDEX_ATTR_BITMAP_KEY:
return uindexattrs;
case INDEX_ATTR_BITMAP_PRIMARY_KEY:
return pkindexattrs;
case INDEX_ATTR_BITMAP_IDENTITY_KEY:
return idindexattrs;
+ case INDEX_ATTR_BITMAP_HOT_BLOCKING:
+ return hotblockingattrs;
+ case INDEX_ATTR_BITMAP_SUMMARIZED:
+ return summarizedattrs;
default:
elog(ERROR, "unknown attrKind %u", attrKind);
return NULL;
@@ -6307,7 +6338,7 @@ load_relcache_init_file(bool shared)
rel->rd_indexlist = NIL;
rel->rd_pkindex = InvalidOid;
rel->rd_replidindex = InvalidOid;
- rel->rd_indexattr = NULL;
+ rel->rd_attrsvalid = false;
rel->rd_keyattr = NULL;
rel->rd_pkattr = NULL;
rel->rd_idattr = NULL;
diff --git a/src/include/access/amapi.h b/src/include/access/amapi.h
index 4f1f67b4d0..281039ef67 100644
--- a/src/include/access/amapi.h
+++ b/src/include/access/amapi.h
@@ -244,6 +244,8 @@ typedef struct IndexAmRoutine
bool amcaninclude;
/* does AM use maintenance_work_mem? */
bool amusemaintenanceworkmem;
+ /* does AM store tuple information only at block granularity? */
+ bool amsummarizing;
/* OR of parallel vacuum flags. See vacuum.h for flags. */
uint8 amparallelvacuumoptions;
/* type of data stored in index, or InvalidOid if variable */
diff --git a/src/include/access/heapam.h b/src/include/access/heapam.h
index 8d74d1b7e3..faf5026519 100644
--- a/src/include/access/heapam.h
+++ b/src/include/access/heapam.h
@@ -249,7 +249,8 @@ extern void heap_abort_speculative(Relation relation, ItemPointer tid);
extern TM_Result heap_update(Relation relation, ItemPointer otid,
HeapTuple newtup,
CommandId cid, Snapshot crosscheck, bool wait,
- struct TM_FailureData *tmfd, LockTupleMode *lockmode);
+ struct TM_FailureData *tmfd, LockTupleMode *lockmode,
+ TU_UpdateIndexes *update_indexes);
extern TM_Result heap_lock_tuple(Relation relation, HeapTuple tuple,
CommandId cid, LockTupleMode mode, LockWaitPolicy wait_policy,
bool follow_updates,
@@ -275,7 +276,7 @@ extern bool heap_tuple_needs_eventual_freeze(HeapTupleHeader tuple);
extern void simple_heap_insert(Relation relation, HeapTuple tup);
extern void simple_heap_delete(Relation relation, ItemPointer tid);
extern void simple_heap_update(Relation relation, ItemPointer otid,
- HeapTuple tup);
+ HeapTuple tup, TU_UpdateIndexes *update_indexes);
extern TransactionId heap_index_delete_tuples(Relation rel,
TM_IndexDeleteOp *delstate);
diff --git a/src/include/access/tableam.h b/src/include/access/tableam.h
index 652e96f1b0..f31d7693ec 100644
--- a/src/include/access/tableam.h
+++ b/src/include/access/tableam.h
@@ -102,6 +102,19 @@ typedef enum TM_Result
TM_WouldBlock
} TM_Result;
+/*
+ * Result codes for table_update(..., update_indexes*..).
+ * Used to determine which indexes to update.
+ */
+typedef enum TU_UpdateIndexes {
+ /* No indexed columns were updated (incl. TID addressing of tuple) */
+ TU_None = 0,
+ /* A non-summarizing indexed column was updated, or the TID has changed */
+ TU_All = 1,
+ /* Only summarized columns were updated, TID is unchanged */
+ TU_Summarizing = 2
+} TU_UpdateIndexes;
+
/*
* When table_tuple_update, table_tuple_delete, or table_tuple_lock fail
* because the target tuple is already outdated, they fill in this struct to
@@ -526,7 +539,7 @@ typedef struct TableAmRoutine
bool wait,
TM_FailureData *tmfd,
LockTupleMode *lockmode,
- bool *update_indexes);
+ TU_UpdateIndexes *update_indexes);
/* see table_tuple_lock() for reference about parameters */
TM_Result (*tuple_lock) (Relation rel,
@@ -1514,7 +1527,7 @@ static inline TM_Result
table_tuple_update(Relation rel, ItemPointer otid, TupleTableSlot *slot,
CommandId cid, Snapshot snapshot, Snapshot crosscheck,
bool wait, TM_FailureData *tmfd, LockTupleMode *lockmode,
- bool *update_indexes)
+ TU_UpdateIndexes *update_indexes)
{
return rel->rd_tableam->tuple_update(rel, otid, slot,
cid, snapshot, crosscheck,
@@ -2038,7 +2051,7 @@ extern void simple_table_tuple_delete(Relation rel, ItemPointer tid,
Snapshot snapshot);
extern void simple_table_tuple_update(Relation rel, ItemPointer otid,
TupleTableSlot *slot, Snapshot snapshot,
- bool *update_indexes);
+ TU_UpdateIndexes *update_indexes);
/* ----------------------------------------------------------------------------
diff --git a/src/include/executor/executor.h b/src/include/executor/executor.h
index e7e25c057e..551c5d7ae0 100644
--- a/src/include/executor/executor.h
+++ b/src/include/executor/executor.h
@@ -620,7 +620,8 @@ extern List *ExecInsertIndexTuples(ResultRelInfo *resultRelInfo,
TupleTableSlot *slot, EState *estate,
bool update,
bool noDupErr,
- bool *specConflict, List *arbiterIndexes);
+ bool *specConflict, List *arbiterIndexes,
+ bool onlySummarizing);
extern bool ExecCheckIndexConstraints(ResultRelInfo *resultRelInfo,
TupleTableSlot *slot,
EState *estate, ItemPointer conflictTid,
diff --git a/src/include/nodes/execnodes.h b/src/include/nodes/execnodes.h
index 20f4c8b35f..3f1b8818a1 100644
--- a/src/include/nodes/execnodes.h
+++ b/src/include/nodes/execnodes.h
@@ -161,6 +161,7 @@ typedef struct ExprState
* IndexUnchanged aminsert hint, cached for retail inserts
* Concurrent are we doing a concurrent index build?
* BrokenHotChain did we detect any broken HOT chains?
+ * Summarizing is it summarizing?
* ParallelWorkers # of workers requested (excludes leader)
* Am Oid of index AM
* AmCache private cache area for index AM
@@ -194,6 +195,7 @@ typedef struct IndexInfo
bool ii_IndexUnchanged;
bool ii_Concurrent;
bool ii_BrokenHotChain;
+ bool ii_Summarizing;
int ii_ParallelWorkers;
Oid ii_Am;
void *ii_AmCache;
diff --git a/src/include/nodes/makefuncs.h b/src/include/nodes/makefuncs.h
index 80f1d5336b..64651c9b00 100644
--- a/src/include/nodes/makefuncs.h
+++ b/src/include/nodes/makefuncs.h
@@ -96,7 +96,9 @@ extern List *make_ands_implicit(Expr *clause);
extern IndexInfo *makeIndexInfo(int numattrs, int numkeyattrs, Oid amoid,
List *expressions, List *predicates,
- bool unique, bool nulls_not_distinct, bool isready, bool concurrent);
+ bool unique, bool nulls_not_distinct,
+ bool isready, bool concurrent,
+ bool summarizing);
extern DefElem *makeDefElem(char *name, Node *arg, int location);
extern DefElem *makeDefElemExtended(char *nameSpace, char *name, Node *arg,
diff --git a/src/include/utils/rel.h b/src/include/utils/rel.h
index 67f994cb3e..c0ddddb2f0 100644
--- a/src/include/utils/rel.h
+++ b/src/include/utils/rel.h
@@ -156,10 +156,12 @@ typedef struct RelationData
List *rd_statlist; /* list of OIDs of extended stats */
/* data managed by RelationGetIndexAttrBitmap: */
- Bitmapset *rd_indexattr; /* identifies columns used in indexes */
+ bool rd_attrsvalid; /* are bitmaps of attrs valid? */
Bitmapset *rd_keyattr; /* cols that can be ref'd by foreign keys */
Bitmapset *rd_pkattr; /* cols included in primary key */
Bitmapset *rd_idattr; /* included in replica identity index */
+ Bitmapset *rd_hotblockingattr; /* cols blocking HOT update */
+ Bitmapset *rd_summarizedattr; /* cols indexed by summarizing indexes */
PublicationDesc *rd_pubdesc; /* publication descriptor, or NULL */
diff --git a/src/include/utils/relcache.h b/src/include/utils/relcache.h
index 88460f21c5..beeb28b83c 100644
--- a/src/include/utils/relcache.h
+++ b/src/include/utils/relcache.h
@@ -56,10 +56,11 @@ extern bytea **RelationGetIndexAttOptions(Relation relation, bool copy);
typedef enum IndexAttrBitmapKind
{
- INDEX_ATTR_BITMAP_ALL,
INDEX_ATTR_BITMAP_KEY,
INDEX_ATTR_BITMAP_PRIMARY_KEY,
- INDEX_ATTR_BITMAP_IDENTITY_KEY
+ INDEX_ATTR_BITMAP_IDENTITY_KEY,
+ INDEX_ATTR_BITMAP_HOT_BLOCKING,
+ INDEX_ATTR_BITMAP_SUMMARIZED
} IndexAttrBitmapKind;
extern Bitmapset *RelationGetIndexAttrBitmap(Relation relation,
diff --git a/src/test/modules/dummy_index_am/dummy_index_am.c b/src/test/modules/dummy_index_am/dummy_index_am.c
index dfb1ebb846..c14e0abe0c 100644
--- a/src/test/modules/dummy_index_am/dummy_index_am.c
+++ b/src/test/modules/dummy_index_am/dummy_index_am.c
@@ -296,6 +296,7 @@ dihandler(PG_FUNCTION_ARGS)
amroutine->amcanparallel = false;
amroutine->amcaninclude = false;
amroutine->amusemaintenanceworkmem = false;
+ amroutine->amsummarizing = false;
amroutine->amparallelvacuumoptions = VACUUM_OPTION_NO_PARALLEL;
amroutine->amkeytype = InvalidOid;
diff --git a/src/test/regress/expected/stats.out b/src/test/regress/expected/stats.out
index 937b2101b3..728c474cfa 100644
--- a/src/test/regress/expected/stats.out
+++ b/src/test/regress/expected/stats.out
@@ -1354,4 +1354,114 @@ SELECT :io_stats_post_reset < :io_stats_pre_reset;
t
(1 row)
+-- test BRIN index doesn't block HOT update
+CREATE TABLE brin_hot (
+ id integer PRIMARY KEY,
+ val integer NOT NULL
+) WITH (autovacuum_enabled = off, fillfactor = 70);
+INSERT INTO brin_hot SELECT *, 0 FROM generate_series(1, 235);
+CREATE INDEX val_brin ON brin_hot using brin(val);
+CREATE FUNCTION wait_for_hot_stats() RETURNS void AS $$
+DECLARE
+ start_time timestamptz := clock_timestamp();
+ updated bool;
+BEGIN
+ -- we don't want to wait forever; loop will exit after 30 seconds
+ FOR i IN 1 .. 300 LOOP
+ SELECT (pg_stat_get_tuples_hot_updated('brin_hot'::regclass::oid) > 0) INTO updated;
+ EXIT WHEN updated;
+
+ -- wait a little
+ PERFORM pg_sleep_for('100 milliseconds');
+ -- reset stats snapshot so we can test again
+ PERFORM pg_stat_clear_snapshot();
+ END LOOP;
+ -- report time waited in postmaster log (where it won't change test output)
+ RAISE log 'wait_for_hot_stats delayed % seconds',
+ EXTRACT(epoch FROM clock_timestamp() - start_time);
+END
+$$ LANGUAGE plpgsql;
+UPDATE brin_hot SET val = -3 WHERE id = 42;
+-- We can't just call wait_for_hot_stats() at this point, because we only
+-- transmit stats when the session goes idle, and we probably didn't
+-- transmit the last couple of counts yet thanks to the rate-limiting logic
+-- in pgstat_report_stat(). But instead of waiting for the rate limiter's
+-- timeout to elapse, let's just start a new session. The old one will
+-- then send its stats before dying.
+\c -
+SELECT wait_for_hot_stats();
+ wait_for_hot_stats
+--------------------
+
+(1 row)
+
+SELECT pg_stat_get_tuples_hot_updated('brin_hot'::regclass::oid);
+ pg_stat_get_tuples_hot_updated
+--------------------------------
+ 1
+(1 row)
+
+DROP TABLE brin_hot;
+DROP FUNCTION wait_for_hot_stats();
+-- Test handling of index predicates - updating attributes in predicates
+-- should block HOT even for BRIN. We update a row that was not indexed
+-- due to the index predicate, and becomes indexable.
+CREATE TABLE brin_hot_2 (a int, b int);
+INSERT INTO brin_hot_2 VALUES (1, 100);
+CREATE INDEX ON brin_hot_2 USING brin (b) WHERE a = 2;
+UPDATE brin_hot_2 SET a = 2;
+EXPLAIN (COSTS OFF) SELECT * FROM brin_hot_2 WHERE a = 2 AND b = 100;
+ QUERY PLAN
+-----------------------------------
+ Seq Scan on brin_hot_2
+ Filter: ((a = 2) AND (b = 100))
+(2 rows)
+
+SELECT COUNT(*) FROM brin_hot_2 WHERE a = 2 AND b = 100;
+ count
+-------
+ 1
+(1 row)
+
+SET enable_seqscan = off;
+EXPLAIN (COSTS OFF) SELECT * FROM brin_hot_2 WHERE a = 2 AND b = 100;
+ QUERY PLAN
+---------------------------------------------
+ Bitmap Heap Scan on brin_hot_2
+ Recheck Cond: ((b = 100) AND (a = 2))
+ -> Bitmap Index Scan on brin_hot_2_b_idx
+ Index Cond: (b = 100)
+(4 rows)
+
+SELECT COUNT(*) FROM brin_hot_2 WHERE a = 2 AND b = 100;
+ count
+-------
+ 1
+(1 row)
+
+DROP TABLE brin_hot_2;
+-- Test that updates to indexed columns are still propagated to the
+-- BRIN index.
+-- https://postgr.es/m/05ebcb44-f383-86e3-4f31-0a97a55634cf@enterprisedb.com
+CREATE TABLE brin_hot_3 (a int, filler text) WITH (fillfactor = 10);
+INSERT INTO brin_hot_3 SELECT 1, repeat(' ', 500) FROM generate_series(1, 20);
+CREATE INDEX ON brin_hot_3 USING brin (a) WITH (pages_per_range = 1);
+UPDATE brin_hot_3 SET a = 2;
+EXPLAIN (COSTS OFF) SELECT * FROM brin_hot_3 WHERE a = 2;
+ QUERY PLAN
+---------------------------------------------
+ Bitmap Heap Scan on brin_hot_3
+ Recheck Cond: (a = 2)
+ -> Bitmap Index Scan on brin_hot_3_a_idx
+ Index Cond: (a = 2)
+(4 rows)
+
+SELECT COUNT(*) FROM brin_hot_3 WHERE a = 2;
+ count
+-------
+ 20
+(1 row)
+
+DROP TABLE brin_hot_3;
+SET enable_seqscan = on;
-- End of Stats Test
diff --git a/src/test/regress/sql/stats.sql b/src/test/regress/sql/stats.sql
index 74e592aa8a..e113f6906c 100644
--- a/src/test/regress/sql/stats.sql
+++ b/src/test/regress/sql/stats.sql
@@ -535,7 +535,6 @@ SET enable_seqscan TO on;
SELECT pg_stat_get_replication_slot(NULL);
SELECT pg_stat_get_subscription_stats(NULL);
-
-- Test that the following operations are tracked in pg_stat_io:
-- - reads of target blocks into shared buffers
-- - writes of shared buffers to permanent storage
@@ -678,4 +677,85 @@ SELECT sum(evictions) + sum(reuses) + sum(extends) + sum(fsyncs) + sum(reads) +
FROM pg_stat_io \gset
SELECT :io_stats_post_reset < :io_stats_pre_reset;
+
+-- test BRIN index doesn't block HOT update
+CREATE TABLE brin_hot (
+ id integer PRIMARY KEY,
+ val integer NOT NULL
+) WITH (autovacuum_enabled = off, fillfactor = 70);
+
+INSERT INTO brin_hot SELECT *, 0 FROM generate_series(1, 235);
+CREATE INDEX val_brin ON brin_hot using brin(val);
+
+CREATE FUNCTION wait_for_hot_stats() RETURNS void AS $$
+DECLARE
+ start_time timestamptz := clock_timestamp();
+ updated bool;
+BEGIN
+ -- we don't want to wait forever; loop will exit after 30 seconds
+ FOR i IN 1 .. 300 LOOP
+ SELECT (pg_stat_get_tuples_hot_updated('brin_hot'::regclass::oid) > 0) INTO updated;
+ EXIT WHEN updated;
+
+ -- wait a little
+ PERFORM pg_sleep_for('100 milliseconds');
+ -- reset stats snapshot so we can test again
+ PERFORM pg_stat_clear_snapshot();
+ END LOOP;
+ -- report time waited in postmaster log (where it won't change test output)
+ RAISE log 'wait_for_hot_stats delayed % seconds',
+ EXTRACT(epoch FROM clock_timestamp() - start_time);
+END
+$$ LANGUAGE plpgsql;
+
+UPDATE brin_hot SET val = -3 WHERE id = 42;
+
+-- We can't just call wait_for_hot_stats() at this point, because we only
+-- transmit stats when the session goes idle, and we probably didn't
+-- transmit the last couple of counts yet thanks to the rate-limiting logic
+-- in pgstat_report_stat(). But instead of waiting for the rate limiter's
+-- timeout to elapse, let's just start a new session. The old one will
+-- then send its stats before dying.
+\c -
+
+SELECT wait_for_hot_stats();
+SELECT pg_stat_get_tuples_hot_updated('brin_hot'::regclass::oid);
+
+DROP TABLE brin_hot;
+DROP FUNCTION wait_for_hot_stats();
+
+-- Test handling of index predicates - updating attributes in predicates
+-- should block HOT even for BRIN. We update a row that was not indexed
+-- due to the index predicate, and becomes indexable.
+CREATE TABLE brin_hot_2 (a int, b int);
+INSERT INTO brin_hot_2 VALUES (1, 100);
+CREATE INDEX ON brin_hot_2 USING brin (b) WHERE a = 2;
+
+UPDATE brin_hot_2 SET a = 2;
+
+EXPLAIN (COSTS OFF) SELECT * FROM brin_hot_2 WHERE a = 2 AND b = 100;
+SELECT COUNT(*) FROM brin_hot_2 WHERE a = 2 AND b = 100;
+
+SET enable_seqscan = off;
+
+EXPLAIN (COSTS OFF) SELECT * FROM brin_hot_2 WHERE a = 2 AND b = 100;
+SELECT COUNT(*) FROM brin_hot_2 WHERE a = 2 AND b = 100;
+
+DROP TABLE brin_hot_2;
+
+-- Test that updates to indexed columns are still propagated to the
+-- BRIN index.
+-- https://postgr.es/m/05ebcb44-f383-86e3-4f31-0a97a55634cf@enterprisedb.com
+CREATE TABLE brin_hot_3 (a int, filler text) WITH (fillfactor = 10);
+INSERT INTO brin_hot_3 SELECT 1, repeat(' ', 500) FROM generate_series(1, 20);
+CREATE INDEX ON brin_hot_3 USING brin (a) WITH (pages_per_range = 1);
+UPDATE brin_hot_3 SET a = 2;
+
+EXPLAIN (COSTS OFF) SELECT * FROM brin_hot_3 WHERE a = 2;
+SELECT COUNT(*) FROM brin_hot_3 WHERE a = 2;
+
+DROP TABLE brin_hot_3;
+
+SET enable_seqscan = on;
+
-- End of Stats Test
--
2.39.0
On 2/20/23 19:15, Matthias van de Meent wrote:
Hi,
On Sun, 19 Feb 2023 at 16:04, Tomas Vondra
<tomas.vondra@enterprisedb.com> wrote:
Hi,
On 2/19/23 02:03, Matthias van de Meent wrote:
On Thu, 16 Jun 2022 at 15:05, Tomas Vondra
<tomas.vondra@enterprisedb.com> wrote:
I've pushed the revert. Let's try again for PG16.
As we discussed in person at the developer meeting, here's a patch to
try again for PG16.
It combines the committed patches with my fix, and adds some
additional comments and polish. I am confident the code is correct,
but not that it is clean (see the commit message of the patch for
details).
Thanks for the patch. I took a quick look, and I agree it seems correct,
and fairly clean too.
Thanks. Based on feedback, attached is v2 of the patch, with these
significant changes:
- We don't store the columns we mention in predicates of summarized
indexes in the hotblocking bitmap anymore; they are stored in the
summarized columns bitmap instead. This further reduces the chance of
failing to apply HOT with summarizing indexes.
Interesting idea. I need to think about the correctness, but AFAICS it
should work. Do we have any tests covering such cases?
I see both v1 and v2 had exactly this
src/test/regress/expected/stats.out | 110 ++++++++++++++++++
src/test/regress/sql/stats.sql | 82 ++++++++++++-
so I guess there are no new tests testing this for BRIN with predicates.
We should probably add some ...
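For illustration, here is a rough, untested sketch of what such a test might
look like (table name and details are invented; it assumes the v2 semantics,
where predicate columns of summarizing indexes no longer block HOT):
----------------------------------------------------------------------
-- update only the predicate column of a partial BRIN index, then check
-- that the index still returns the row, and whether the update was HOT
CREATE TABLE brin_hot_pred (a int, b int)
  WITH (autovacuum_enabled = off, fillfactor = 50);
INSERT INTO brin_hot_pred SELECT 1, i FROM generate_series(1, 100) s(i);
CREATE INDEX ON brin_hot_pred USING brin (b) WHERE a = 2;
UPDATE brin_hot_pred SET a = 2 WHERE b = 50;
SET enable_seqscan = off;
SELECT count(*) FROM brin_hot_pred WHERE a = 2 AND b = 50;
SET enable_seqscan = on;
-- once stats have been reported (using the same wait_for_hot_stats() dance
-- as the brin_hot test), verify whether the update was in fact HOT
SELECT pg_stat_get_tuples_hot_updated('brin_hot_pred'::regclass::oid) > 0;
----------------------------------------------------------------------
The expected output - and whether the update should be HOT at all - of course
depends on which semantics we settle on for predicate columns.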
- The heaptuple header bit for summarized update in inserted tuples is
replaced with passing an out parameter. This simplifies the logic and
decreases chances of accidentally storing incorrect data.
OK.
0002 proposes a minor RelationGetIndexPredicate() tweak, getting rid of
the repeated if/else branches. Feel free to discard, if you think the v2
approach is better.
regards
--
Tomas Vondra
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company
Attachments:
v3-0001-Ignore-BRIN-indexes-when-checking-for-HOT-updates.patch
From b238e9c4308e9f2b7f0f2ca068b1217d1f906604 Mon Sep 17 00:00:00 2001
From: Matthias van de Meent <boekewurm+postgres@gmail.com>
Date: Mon, 20 Feb 2023 18:38:30 +0100
Subject: [PATCH v3 1/2] Ignore BRIN indexes when checking for HOT updates
When determining whether an index update may be skipped by using HOT, we
can ignore attributes indexed by block summarizing indexes without
references to individual tuples that need to be cleaned up.
This also removes the rd_indexattr bitmap and replaces it with an rd_attrsvalid
flag. The bitmap itself was not used anywhere, and a simple flag is sufficient.
A new type TU_UpdateIndexes is invented to provide a signal to the executor
to determine which indexes to update - no indexes, all indexes, or only
the summarizing indexes.
One otherwise unused bit in the heap tuple header is (ab)used to signal
that the HOT update would still update at least one summarizing index.
The bit is cleared immediately
Original patch by Josef Simanek, various fixes and improvements by
Tomas Vondra and me.
Authors: Josef Simanek, Tomas Vondra, Matthias van de Meent
Reviewed-by: Tomas Vondra, Alvaro Herrera
---
doc/src/sgml/indexam.sgml | 13 +++
src/backend/access/brin/brin.c | 1 +
src/backend/access/gin/ginutil.c | 1 +
src/backend/access/gist/gist.c | 1 +
src/backend/access/hash/hash.c | 1 +
src/backend/access/heap/heapam.c | 48 +++++++-
src/backend/access/heap/heapam_handler.c | 19 ++-
src/backend/access/nbtree/nbtree.c | 1 +
src/backend/access/spgist/spgutils.c | 1 +
src/backend/access/table/tableam.c | 2 +-
src/backend/catalog/index.c | 9 +-
src/backend/catalog/indexing.c | 35 ++++--
src/backend/commands/copyfrom.c | 5 +-
src/backend/commands/indexcmds.c | 10 +-
src/backend/executor/execIndexing.c | 37 ++++--
src/backend/executor/execReplication.c | 9 +-
src/backend/executor/nodeModifyTable.c | 13 ++-
src/backend/nodes/makefuncs.c | 7 +-
src/backend/utils/cache/relcache.c | 73 ++++++++----
src/include/access/amapi.h | 2 +
src/include/access/heapam.h | 5 +-
src/include/access/tableam.h | 19 ++-
src/include/executor/executor.h | 3 +-
src/include/nodes/execnodes.h | 2 +
src/include/nodes/makefuncs.h | 4 +-
src/include/utils/rel.h | 4 +-
src/include/utils/relcache.h | 5 +-
.../modules/dummy_index_am/dummy_index_am.c | 1 +
src/test/regress/expected/stats.out | 110 ++++++++++++++++++
src/test/regress/sql/stats.sql | 82 ++++++++++++-
30 files changed, 445 insertions(+), 78 deletions(-)
diff --git a/doc/src/sgml/indexam.sgml b/doc/src/sgml/indexam.sgml
index 4f83970c851..897419ec959 100644
--- a/doc/src/sgml/indexam.sgml
+++ b/doc/src/sgml/indexam.sgml
@@ -127,6 +127,9 @@ typedef struct IndexAmRoutine
bool amcaninclude;
/* does AM use maintenance_work_mem? */
bool amusemaintenanceworkmem;
+ /* does AM summarize tuples, with at least all tuples in the block
+ * summarized in one summary */
+ bool amsummarizing;
/* OR of parallel vacuum flags */
uint8 amparallelvacuumoptions;
/* type of data stored in index, or InvalidOid if variable */
@@ -247,6 +250,16 @@ typedef struct IndexAmRoutine
null, independently of <structfield>amoptionalkey</structfield>.
</para>
+ <para>
+ The <structfield>amsummarizing</structfield> flag indicates whether the
+ access method summarizes the indexed tuples, with summarizing granularity
+ of at least per block.
+ Access methods that do not point to individual tuples, but to block ranges (like
+ <acronym>BRIN</acronym>), may allow the <acronym>HOT</acronym> optimization
+ to continue. This does not apply to attributes referenced in index
+ predicates; an update of such an attribute always disables <acronym>HOT</acronym>.
+ </para>
+
</sect1>
<sect1 id="index-functions">
diff --git a/src/backend/access/brin/brin.c b/src/backend/access/brin/brin.c
index b5a5fa7b334..53e4721a54e 100644
--- a/src/backend/access/brin/brin.c
+++ b/src/backend/access/brin/brin.c
@@ -109,6 +109,7 @@ brinhandler(PG_FUNCTION_ARGS)
amroutine->amcanparallel = false;
amroutine->amcaninclude = false;
amroutine->amusemaintenanceworkmem = false;
+ amroutine->amsummarizing = true;
amroutine->amparallelvacuumoptions =
VACUUM_OPTION_PARALLEL_CLEANUP;
amroutine->amkeytype = InvalidOid;
diff --git a/src/backend/access/gin/ginutil.c b/src/backend/access/gin/ginutil.c
index f05128ecf50..03fec1704e9 100644
--- a/src/backend/access/gin/ginutil.c
+++ b/src/backend/access/gin/ginutil.c
@@ -56,6 +56,7 @@ ginhandler(PG_FUNCTION_ARGS)
amroutine->amcanparallel = false;
amroutine->amcaninclude = false;
amroutine->amusemaintenanceworkmem = true;
+ amroutine->amsummarizing = false;
amroutine->amparallelvacuumoptions =
VACUUM_OPTION_PARALLEL_BULKDEL | VACUUM_OPTION_PARALLEL_CLEANUP;
amroutine->amkeytype = InvalidOid;
diff --git a/src/backend/access/gist/gist.c b/src/backend/access/gist/gist.c
index ba394f08f61..ea72bcce1bc 100644
--- a/src/backend/access/gist/gist.c
+++ b/src/backend/access/gist/gist.c
@@ -78,6 +78,7 @@ gisthandler(PG_FUNCTION_ARGS)
amroutine->amcanparallel = false;
amroutine->amcaninclude = true;
amroutine->amusemaintenanceworkmem = false;
+ amroutine->amsummarizing = false;
amroutine->amparallelvacuumoptions =
VACUUM_OPTION_PARALLEL_BULKDEL | VACUUM_OPTION_PARALLEL_COND_CLEANUP;
amroutine->amkeytype = InvalidOid;
diff --git a/src/backend/access/hash/hash.c b/src/backend/access/hash/hash.c
index eb258337d69..fc5d97f606e 100644
--- a/src/backend/access/hash/hash.c
+++ b/src/backend/access/hash/hash.c
@@ -75,6 +75,7 @@ hashhandler(PG_FUNCTION_ARGS)
amroutine->amcanparallel = false;
amroutine->amcaninclude = false;
amroutine->amusemaintenanceworkmem = false;
+ amroutine->amsummarizing = false;
amroutine->amparallelvacuumoptions =
VACUUM_OPTION_PARALLEL_BULKDEL;
amroutine->amkeytype = INT4OID;
diff --git a/src/backend/access/heap/heapam.c b/src/backend/access/heap/heapam.c
index 7eb79cee58d..89fdfe53532 100644
--- a/src/backend/access/heap/heapam.c
+++ b/src/backend/access/heap/heapam.c
@@ -2924,11 +2924,13 @@ simple_heap_delete(Relation relation, ItemPointer tid)
TM_Result
heap_update(Relation relation, ItemPointer otid, HeapTuple newtup,
CommandId cid, Snapshot crosscheck, bool wait,
- TM_FailureData *tmfd, LockTupleMode *lockmode)
+ TM_FailureData *tmfd, LockTupleMode *lockmode,
+ TU_UpdateIndexes *update_indexes)
{
TM_Result result;
TransactionId xid = GetCurrentTransactionId();
Bitmapset *hot_attrs;
+ Bitmapset *sum_attrs;
Bitmapset *key_attrs;
Bitmapset *id_attrs;
Bitmapset *interesting_attrs;
@@ -2951,6 +2953,7 @@ heap_update(Relation relation, ItemPointer otid, HeapTuple newtup,
bool have_tuple_lock = false;
bool iscombo;
bool use_hot_update = false;
+ bool summarized_update = false;
bool key_intact;
bool all_visible_cleared = false;
bool all_visible_cleared_new = false;
@@ -2996,12 +2999,16 @@ heap_update(Relation relation, ItemPointer otid, HeapTuple newtup,
* Note that we get copies of each bitmap, so we need not worry about
* relcache flush happening midway through.
*/
- hot_attrs = RelationGetIndexAttrBitmap(relation, INDEX_ATTR_BITMAP_ALL);
+ hot_attrs = RelationGetIndexAttrBitmap(relation,
+ INDEX_ATTR_BITMAP_HOT_BLOCKING);
+ sum_attrs = RelationGetIndexAttrBitmap(relation,
+ INDEX_ATTR_BITMAP_SUMMARIZED);
key_attrs = RelationGetIndexAttrBitmap(relation, INDEX_ATTR_BITMAP_KEY);
id_attrs = RelationGetIndexAttrBitmap(relation,
INDEX_ATTR_BITMAP_IDENTITY_KEY);
interesting_attrs = NULL;
interesting_attrs = bms_add_members(interesting_attrs, hot_attrs);
+ interesting_attrs = bms_add_members(interesting_attrs, sum_attrs);
interesting_attrs = bms_add_members(interesting_attrs, key_attrs);
interesting_attrs = bms_add_members(interesting_attrs, id_attrs);
@@ -3311,7 +3318,10 @@ l2:
UnlockTupleTuplock(relation, &(oldtup.t_self), *lockmode);
if (vmbuffer != InvalidBuffer)
ReleaseBuffer(vmbuffer);
+ *update_indexes = TU_None;
+
bms_free(hot_attrs);
+ bms_free(sum_attrs);
bms_free(key_attrs);
bms_free(id_attrs);
bms_free(modified_attrs);
@@ -3633,7 +3643,19 @@ l2:
* changed.
*/
if (!bms_overlap(modified_attrs, hot_attrs))
+ {
use_hot_update = true;
+
+ /*
+ * If none of the columns that are used in hot-blocking indexes
+ * were updated, we can apply HOT, but we do still need to check
+ * if we need to update the summarizing indexes, and update those
+ * indexes if the columns were updated, or we may fail to detect
+ * e.g. value bound changes in BRIN minmax indexes.
+ */
+ if (bms_overlap(modified_attrs, sum_attrs))
+ summarized_update = true;
+ }
}
else
{
@@ -3793,10 +3815,27 @@ l2:
heap_freetuple(heaptup);
}
+ /*
+ * If it is a HOT update, the update may still need to update summarized
+ * indexes, lest we fail to update those summaries and get incorrect
+ * results (for example, minmax bounds of the block may change with this
+ * update).
+ */
+ if (use_hot_update)
+ {
+ if (summarized_update)
+ *update_indexes = TU_Summarizing;
+ else
+ *update_indexes = TU_None;
+ }
+ else
+ *update_indexes = TU_All;
+
if (old_key_tuple != NULL && old_key_copied)
heap_freetuple(old_key_tuple);
bms_free(hot_attrs);
+ bms_free(sum_attrs);
bms_free(key_attrs);
bms_free(id_attrs);
bms_free(modified_attrs);
@@ -3960,7 +3999,8 @@ HeapDetermineColumnsInfo(Relation relation,
* via ereport().
*/
void
-simple_heap_update(Relation relation, ItemPointer otid, HeapTuple tup)
+simple_heap_update(Relation relation, ItemPointer otid, HeapTuple tup,
+ TU_UpdateIndexes *update_indexes)
{
TM_Result result;
TM_FailureData tmfd;
@@ -3969,7 +4009,7 @@ simple_heap_update(Relation relation, ItemPointer otid, HeapTuple tup)
result = heap_update(relation, otid, tup,
GetCurrentCommandId(true), InvalidSnapshot,
true /* wait for commit */ ,
- &tmfd, &lockmode);
+ &tmfd, &lockmode, update_indexes);
switch (result)
{
case TM_SelfModified:
diff --git a/src/backend/access/heap/heapam_handler.c b/src/backend/access/heap/heapam_handler.c
index c4b1916d36e..a1d7d91ff77 100644
--- a/src/backend/access/heap/heapam_handler.c
+++ b/src/backend/access/heap/heapam_handler.c
@@ -314,7 +314,7 @@ static TM_Result
heapam_tuple_update(Relation relation, ItemPointer otid, TupleTableSlot *slot,
CommandId cid, Snapshot snapshot, Snapshot crosscheck,
bool wait, TM_FailureData *tmfd,
- LockTupleMode *lockmode, bool *update_indexes)
+ LockTupleMode *lockmode, TU_UpdateIndexes *update_indexes)
{
bool shouldFree = true;
HeapTuple tuple = ExecFetchSlotHeapTuple(slot, true, &shouldFree);
@@ -325,7 +325,7 @@ heapam_tuple_update(Relation relation, ItemPointer otid, TupleTableSlot *slot,
tuple->t_tableOid = slot->tts_tableOid;
result = heap_update(relation, otid, tuple, cid, crosscheck, wait,
- tmfd, lockmode);
+ tmfd, lockmode, update_indexes);
ItemPointerCopy(&tuple->t_self, &slot->tts_tid);
/*
@@ -334,9 +334,20 @@ heapam_tuple_update(Relation relation, ItemPointer otid, TupleTableSlot *slot,
* Note: heap_update returns the tid (location) of the new tuple in the
* t_self field.
*
- * If it's a HOT update, we mustn't insert new index entries.
+ * If the update is not HOT, we must update all indexes. If the update
+ * is HOT, it could be that we updated summarized columns, so we either
+ * update only summarized indexes, or none at all.
*/
- *update_indexes = result == TM_Ok && !HeapTupleIsHeapOnly(tuple);
+ if (result != TM_Ok)
+ {
+ Assert(*update_indexes == TU_None);
+ *update_indexes = TU_None;
+ }
+ else if (!HeapTupleIsHeapOnly(tuple))
+ Assert(*update_indexes == TU_All);
+ else
+ Assert(*update_indexes == TU_Summarizing ||
+ *update_indexes == TU_None);
if (shouldFree)
pfree(tuple);
diff --git a/src/backend/access/nbtree/nbtree.c b/src/backend/access/nbtree/nbtree.c
index 1cc88da032d..681c30b0d8d 100644
--- a/src/backend/access/nbtree/nbtree.c
+++ b/src/backend/access/nbtree/nbtree.c
@@ -114,6 +114,7 @@ bthandler(PG_FUNCTION_ARGS)
amroutine->amcanparallel = true;
amroutine->amcaninclude = true;
amroutine->amusemaintenanceworkmem = false;
+ amroutine->amsummarizing = false;
amroutine->amparallelvacuumoptions =
VACUUM_OPTION_PARALLEL_BULKDEL | VACUUM_OPTION_PARALLEL_COND_CLEANUP;
amroutine->amkeytype = InvalidOid;
diff --git a/src/backend/access/spgist/spgutils.c b/src/backend/access/spgist/spgutils.c
index 3761f2c193b..4e7ff1d1603 100644
--- a/src/backend/access/spgist/spgutils.c
+++ b/src/backend/access/spgist/spgutils.c
@@ -62,6 +62,7 @@ spghandler(PG_FUNCTION_ARGS)
amroutine->amcanparallel = false;
amroutine->amcaninclude = true;
amroutine->amusemaintenanceworkmem = false;
+ amroutine->amsummarizing = false;
amroutine->amparallelvacuumoptions =
VACUUM_OPTION_PARALLEL_BULKDEL | VACUUM_OPTION_PARALLEL_COND_CLEANUP;
amroutine->amkeytype = InvalidOid;
diff --git a/src/backend/access/table/tableam.c b/src/backend/access/table/tableam.c
index ef0d34fceee..a5e6c92f35e 100644
--- a/src/backend/access/table/tableam.c
+++ b/src/backend/access/table/tableam.c
@@ -345,7 +345,7 @@ void
simple_table_tuple_update(Relation rel, ItemPointer otid,
TupleTableSlot *slot,
Snapshot snapshot,
- bool *update_indexes)
+ TU_UpdateIndexes *update_indexes)
{
TM_Result result;
TM_FailureData tmfd;
diff --git a/src/backend/catalog/index.c b/src/backend/catalog/index.c
index 41b16cb89bc..e2fd035f445 100644
--- a/src/backend/catalog/index.c
+++ b/src/backend/catalog/index.c
@@ -1370,7 +1370,8 @@ index_concurrently_create_copy(Relation heapRelation, Oid oldIndexId,
oldInfo->ii_Unique,
oldInfo->ii_NullsNotDistinct,
false, /* not ready for inserts */
- true);
+ true,
+ indexRelation->rd_indam->amsummarizing);
/*
* Extract the list of column names and the column numbers for the new
@@ -2442,7 +2443,8 @@ BuildIndexInfo(Relation index)
indexStruct->indisunique,
indexStruct->indnullsnotdistinct,
indexStruct->indisready,
- false);
+ false,
+ index->rd_indam->amsummarizing);
/* fill in attribute numbers */
for (i = 0; i < numAtts; i++)
@@ -2502,7 +2504,8 @@ BuildDummyIndexInfo(Relation index)
indexStruct->indisunique,
indexStruct->indnullsnotdistinct,
indexStruct->indisready,
- false);
+ false,
+ index->rd_indam->amsummarizing);
/* fill in attribute numbers */
for (i = 0; i < numAtts; i++)
diff --git a/src/backend/catalog/indexing.c b/src/backend/catalog/indexing.c
index bb7cc3601c7..a387eccdc40 100644
--- a/src/backend/catalog/indexing.c
+++ b/src/backend/catalog/indexing.c
@@ -72,7 +72,8 @@ CatalogCloseIndexes(CatalogIndexState indstate)
* This is effectively a cut-down version of ExecInsertIndexTuples.
*/
static void
-CatalogIndexInsert(CatalogIndexState indstate, HeapTuple heapTuple)
+CatalogIndexInsert(CatalogIndexState indstate, HeapTuple heapTuple,
+ TU_UpdateIndexes updateIndexes)
{
int i;
int numIndexes;
@@ -82,6 +83,7 @@ CatalogIndexInsert(CatalogIndexState indstate, HeapTuple heapTuple)
IndexInfo **indexInfoArray;
Datum values[INDEX_MAX_KEYS];
bool isnull[INDEX_MAX_KEYS];
+ bool onlySummarized = updateIndexes == TU_Summarizing;
/*
* HOT update does not require index inserts. But with asserts enabled we
@@ -89,10 +91,13 @@ CatalogIndexInsert(CatalogIndexState indstate, HeapTuple heapTuple)
* table/index.
*/
#ifndef USE_ASSERT_CHECKING
- if (HeapTupleIsHeapOnly(heapTuple))
+ if (HeapTupleIsHeapOnly(heapTuple) && !onlySummarized)
return;
#endif
+ if (onlySummarized)
+ Assert(HeapTupleIsHeapOnly(heapTuple));
+
/*
* Get information from the state structure. Fall out if nothing to do.
*/
@@ -135,13 +140,20 @@ CatalogIndexInsert(CatalogIndexState indstate, HeapTuple heapTuple)
/* see earlier check above */
#ifdef USE_ASSERT_CHECKING
- if (HeapTupleIsHeapOnly(heapTuple))
+ if (HeapTupleIsHeapOnly(heapTuple) && !onlySummarized)
{
Assert(!ReindexIsProcessingIndex(RelationGetRelid(index)));
continue;
}
#endif /* USE_ASSERT_CHECKING */
+ /*
+ * Skip insertions into non-summarizing indexes if we only need
+ * to update summarizing indexes
+ */
+ if (onlySummarized && !indexInfo->ii_Summarizing)
+ continue;
+
/*
* FormIndexDatum fills in its values and isnull parameters with the
* appropriate values for the column(s) of the index.
@@ -228,7 +240,7 @@ CatalogTupleInsert(Relation heapRel, HeapTuple tup)
simple_heap_insert(heapRel, tup);
- CatalogIndexInsert(indstate, tup);
+ CatalogIndexInsert(indstate, tup, TU_All);
CatalogCloseIndexes(indstate);
}
@@ -248,7 +260,7 @@ CatalogTupleInsertWithInfo(Relation heapRel, HeapTuple tup,
simple_heap_insert(heapRel, tup);
- CatalogIndexInsert(indstate, tup);
+ CatalogIndexInsert(indstate, tup, TU_All);
}
/*
@@ -279,7 +291,7 @@ CatalogTuplesMultiInsertWithInfo(Relation heapRel, TupleTableSlot **slot,
tuple = ExecFetchSlotHeapTuple(slot[i], true, &should_free);
tuple->t_tableOid = slot[i]->tts_tableOid;
- CatalogIndexInsert(indstate, tuple);
+ CatalogIndexInsert(indstate, tuple, TU_All);
if (should_free)
heap_freetuple(tuple);
@@ -301,14 +313,15 @@ void
CatalogTupleUpdate(Relation heapRel, ItemPointer otid, HeapTuple tup)
{
CatalogIndexState indstate;
+ TU_UpdateIndexes updateIndexes = TU_All;
CatalogTupleCheckConstraints(heapRel, tup);
indstate = CatalogOpenIndexes(heapRel);
- simple_heap_update(heapRel, otid, tup);
+ simple_heap_update(heapRel, otid, tup, &updateIndexes);
- CatalogIndexInsert(indstate, tup);
+ CatalogIndexInsert(indstate, tup, updateIndexes);
CatalogCloseIndexes(indstate);
}
@@ -324,11 +337,13 @@ void
CatalogTupleUpdateWithInfo(Relation heapRel, ItemPointer otid, HeapTuple tup,
CatalogIndexState indstate)
{
+ TU_UpdateIndexes updateIndexes = TU_All;
+
CatalogTupleCheckConstraints(heapRel, tup);
- simple_heap_update(heapRel, otid, tup);
+ simple_heap_update(heapRel, otid, tup, &updateIndexes);
- CatalogIndexInsert(indstate, tup);
+ CatalogIndexInsert(indstate, tup, updateIndexes);
}
/*
diff --git a/src/backend/commands/copyfrom.c b/src/backend/commands/copyfrom.c
index af52faca6d4..564520289a8 100644
--- a/src/backend/commands/copyfrom.c
+++ b/src/backend/commands/copyfrom.c
@@ -435,7 +435,7 @@ CopyMultiInsertBufferFlush(CopyMultiInsertInfo *miinfo,
recheckIndexes =
ExecInsertIndexTuples(resultRelInfo,
buffer->slots[i], estate, false,
- false, NULL, NIL);
+ false, NULL, NIL, false);
ExecARInsertTriggers(estate, resultRelInfo,
slots[i], recheckIndexes,
cstate->transition_capture);
@@ -1254,7 +1254,8 @@ CopyFrom(CopyFromState cstate)
false,
false,
NULL,
- NIL);
+ NIL,
+ false);
}
/* AFTER ROW INSERT Triggers */
diff --git a/src/backend/commands/indexcmds.c b/src/backend/commands/indexcmds.c
index 16ec0b114e6..ff48f44c66f 100644
--- a/src/backend/commands/indexcmds.c
+++ b/src/backend/commands/indexcmds.c
@@ -184,6 +184,7 @@ CheckIndexCompatible(Oid oldId,
Form_pg_am accessMethodForm;
IndexAmRoutine *amRoutine;
bool amcanorder;
+ bool amsummarizing;
int16 *coloptions;
IndexInfo *indexInfo;
int numberOfAttributes;
@@ -222,6 +223,7 @@ CheckIndexCompatible(Oid oldId,
ReleaseSysCache(tuple);
amcanorder = amRoutine->amcanorder;
+ amsummarizing = amRoutine->amsummarizing;
/*
* Compute the operator classes, collations, and exclusion operators for
@@ -232,7 +234,8 @@ CheckIndexCompatible(Oid oldId,
* ii_NumIndexKeyAttrs with same value.
*/
indexInfo = makeIndexInfo(numberOfAttributes, numberOfAttributes,
- accessMethodId, NIL, NIL, false, false, false, false);
+ accessMethodId, NIL, NIL, false, false,
+ false, false, amsummarizing);
typeObjectId = palloc_array(Oid, numberOfAttributes);
collationObjectId = palloc_array(Oid, numberOfAttributes);
classObjectId = palloc_array(Oid, numberOfAttributes);
@@ -550,6 +553,7 @@ DefineIndex(Oid relationId,
Form_pg_am accessMethodForm;
IndexAmRoutine *amRoutine;
bool amcanorder;
+ bool amissummarizing;
amoptions_function amoptions;
bool partitioned;
bool safe_index;
@@ -866,6 +870,7 @@ DefineIndex(Oid relationId,
amcanorder = amRoutine->amcanorder;
amoptions = amRoutine->amoptions;
+ amissummarizing = amRoutine->amsummarizing;
pfree(amRoutine);
ReleaseSysCache(tuple);
@@ -897,7 +902,8 @@ DefineIndex(Oid relationId,
stmt->unique,
stmt->nulls_not_distinct,
!concurrent,
- concurrent);
+ concurrent,
+ amissummarizing);
typeObjectId = palloc_array(Oid, numberOfAttributes);
collationObjectId = palloc_array(Oid, numberOfAttributes);
diff --git a/src/backend/executor/execIndexing.c b/src/backend/executor/execIndexing.c
index 6e88e72813f..da28e5e40ca 100644
--- a/src/backend/executor/execIndexing.c
+++ b/src/backend/executor/execIndexing.c
@@ -259,15 +259,24 @@ ExecCloseIndices(ResultRelInfo *resultRelInfo)
* into all the relations indexing the result relation
* when a heap tuple is inserted into the result relation.
*
- * When 'update' is true, executor is performing an UPDATE
- * that could not use an optimization like heapam's HOT (in
- * more general terms a call to table_tuple_update() took
- * place and set 'update_indexes' to true). Receiving this
- * hint makes us consider if we should pass down the
- * 'indexUnchanged' hint in turn. That's something that we
- * figure out for each index_insert() call iff 'update' is
- * true. (When 'update' is false we already know not to pass
- * the hint to any index.)
+ * When 'update' is true and 'onlySummarizing' is false,
+ * executor is performing an UPDATE that could not use an
+ * optimization like heapam's HOT (in more general terms a
+ * call to table_tuple_update() took place and set
+ * 'update_indexes' to TU_All). Receiving this hint makes
+ * us consider if we should pass down the 'indexUnchanged'
+ * hint in turn. That's something that we figure out for
+ * each index_insert() call iff 'update' is true.
+ * (When 'update' is false we already know not to pass the
+ * hint to any index.)
+ *
+ * If onlySummarizing is set, an equivalent optimization to
+ * HOT has been applied and any updated columns are indexed
+ * only by summarizing indexes (or in more general terms a
+ * call to table_tuple_update() took place and set
+ * 'update_indexes' to TU_Summarizing). We can (and must)
+ * therefore only update the indexes that have
+ * 'amsummarizing' = true.
*
* Unique and exclusion constraints are enforced at the same
* time. This returns a list of index OIDs for any unique or
@@ -287,7 +296,8 @@ ExecInsertIndexTuples(ResultRelInfo *resultRelInfo,
bool update,
bool noDupErr,
bool *specConflict,
- List *arbiterIndexes)
+ List *arbiterIndexes,
+ bool onlySummarizing)
{
ItemPointer tupleid = &slot->tts_tid;
List *result = NIL;
@@ -343,6 +353,13 @@ ExecInsertIndexTuples(ResultRelInfo *resultRelInfo,
if (!indexInfo->ii_ReadyForInserts)
continue;
+ /*
+ * Skip processing of non-summarizing indexes if we only
+ * update summarizing indexes
+ */
+ if (onlySummarizing && !indexInfo->ii_Summarizing)
+ continue;
+
/* Check for partial index */
if (indexInfo->ii_Predicate != NIL)
{
diff --git a/src/backend/executor/execReplication.c b/src/backend/executor/execReplication.c
index c484f5c3019..4c01ef63cb1 100644
--- a/src/backend/executor/execReplication.c
+++ b/src/backend/executor/execReplication.c
@@ -445,7 +445,7 @@ ExecSimpleRelationInsert(ResultRelInfo *resultRelInfo,
if (resultRelInfo->ri_NumIndices > 0)
recheckIndexes = ExecInsertIndexTuples(resultRelInfo,
slot, estate, false, false,
- NULL, NIL);
+ NULL, NIL, false);
/* AFTER ROW INSERT Triggers */
ExecARInsertTriggers(estate, resultRelInfo, slot,
@@ -493,7 +493,7 @@ ExecSimpleRelationUpdate(ResultRelInfo *resultRelInfo,
if (!skip_tuple)
{
List *recheckIndexes = NIL;
- bool update_indexes;
+ TU_UpdateIndexes update_indexes;
/* Compute stored generated columns */
if (rel->rd_att->constr &&
@@ -510,10 +510,11 @@ ExecSimpleRelationUpdate(ResultRelInfo *resultRelInfo,
simple_table_tuple_update(rel, tid, slot, estate->es_snapshot,
&update_indexes);
- if (resultRelInfo->ri_NumIndices > 0 && update_indexes)
+ if (resultRelInfo->ri_NumIndices > 0 && update_indexes != TU_None)
recheckIndexes = ExecInsertIndexTuples(resultRelInfo,
slot, estate, true, false,
- NULL, NIL);
+ NULL, NIL,
+ update_indexes == TU_Summarizing);
/* AFTER ROW UPDATE Triggers */
ExecARUpdateTriggers(estate, resultRelInfo,
diff --git a/src/backend/executor/nodeModifyTable.c b/src/backend/executor/nodeModifyTable.c
index a94d7f86e54..099ccda95ac 100644
--- a/src/backend/executor/nodeModifyTable.c
+++ b/src/backend/executor/nodeModifyTable.c
@@ -125,8 +125,8 @@ typedef struct ModifyTableContext
typedef struct UpdateContext
{
bool updated; /* did UPDATE actually occur? */
- bool updateIndexes; /* index update required? */
bool crossPartUpdate; /* was it a cross-partition update? */
+ TU_UpdateIndexes updateIndexes; /* Which index updates are required? */
/*
* Lock mode to acquire on the latest tuple version before performing
@@ -1106,7 +1106,8 @@ ExecInsert(ModifyTableContext *context,
recheckIndexes = ExecInsertIndexTuples(resultRelInfo,
slot, estate, false, true,
&specConflict,
- arbiterIndexes);
+ arbiterIndexes,
+ false);
/* adjust the tuple's state accordingly */
table_tuple_complete_speculative(resultRelationDesc, slot,
@@ -1145,7 +1146,8 @@ ExecInsert(ModifyTableContext *context,
if (resultRelInfo->ri_NumIndices > 0)
recheckIndexes = ExecInsertIndexTuples(resultRelInfo,
slot, estate, false,
- false, NULL, NIL);
+ false, NULL, NIL,
+ false);
}
}
@@ -2108,11 +2110,12 @@ ExecUpdateEpilogue(ModifyTableContext *context, UpdateContext *updateCxt,
ModifyTableState *mtstate = context->mtstate;
/* insert index entries for tuple if necessary */
- if (resultRelInfo->ri_NumIndices > 0 && updateCxt->updateIndexes)
+ if (resultRelInfo->ri_NumIndices > 0 && updateCxt->updateIndexes != TU_None)
recheckIndexes = ExecInsertIndexTuples(resultRelInfo,
slot, context->estate,
true, false,
- NULL, NIL);
+ NULL, NIL,
+ updateCxt->updateIndexes == TU_Summarizing);
/* AFTER ROW UPDATE Triggers */
ExecARUpdateTriggers(context->estate, resultRelInfo,
diff --git a/src/backend/nodes/makefuncs.c b/src/backend/nodes/makefuncs.c
index fe67baf1420..f23f8b73492 100644
--- a/src/backend/nodes/makefuncs.c
+++ b/src/backend/nodes/makefuncs.c
@@ -743,7 +743,8 @@ make_ands_implicit(Expr *clause)
*/
IndexInfo *
makeIndexInfo(int numattrs, int numkeyattrs, Oid amoid, List *expressions,
- List *predicates, bool unique, bool nulls_not_distinct, bool isready, bool concurrent)
+ List *predicates, bool unique, bool nulls_not_distinct,
+ bool isready, bool concurrent, bool summarizing)
{
IndexInfo *n = makeNode(IndexInfo);
@@ -757,6 +758,10 @@ makeIndexInfo(int numattrs, int numkeyattrs, Oid amoid, List *expressions,
n->ii_CheckedUnchanged = false;
n->ii_IndexUnchanged = false;
n->ii_Concurrent = concurrent;
+ n->ii_Summarizing = summarizing;
+
+ /* summarizing indexes cannot contain non-key attributes */
+ Assert(!summarizing || numkeyattrs == numattrs);
/* expressions */
n->ii_Expressions = expressions;
diff --git a/src/backend/utils/cache/relcache.c b/src/backend/utils/cache/relcache.c
index 13f79873733..092c5ed8c7f 100644
--- a/src/backend/utils/cache/relcache.c
+++ b/src/backend/utils/cache/relcache.c
@@ -2440,10 +2440,11 @@ RelationDestroyRelation(Relation relation, bool remember_tupdesc)
list_free_deep(relation->rd_fkeylist);
list_free(relation->rd_indexlist);
list_free(relation->rd_statlist);
- bms_free(relation->rd_indexattr);
bms_free(relation->rd_keyattr);
bms_free(relation->rd_pkattr);
bms_free(relation->rd_idattr);
+ bms_free(relation->rd_hotblockingattr);
+ bms_free(relation->rd_summarizedattr);
if (relation->rd_pubdesc)
pfree(relation->rd_pubdesc);
if (relation->rd_options)
@@ -5167,10 +5168,11 @@ RelationGetIndexPredicate(Relation relation)
Bitmapset *
RelationGetIndexAttrBitmap(Relation relation, IndexAttrBitmapKind attrKind)
{
- Bitmapset *indexattrs; /* indexed columns */
Bitmapset *uindexattrs; /* columns in unique indexes */
Bitmapset *pkindexattrs; /* columns in the primary index */
Bitmapset *idindexattrs; /* columns in the replica identity */
+ Bitmapset *hotblockingattrs; /* columns with HOT blocking indexes */
+ Bitmapset *summarizedattrs; /* columns with summarizing indexes */
List *indexoidlist;
List *newindexoidlist;
Oid relpkindex;
@@ -5179,18 +5181,20 @@ RelationGetIndexAttrBitmap(Relation relation, IndexAttrBitmapKind attrKind)
MemoryContext oldcxt;
/* Quick exit if we already computed the result. */
- if (relation->rd_indexattr != NULL)
+ if (relation->rd_attrsvalid)
{
switch (attrKind)
{
- case INDEX_ATTR_BITMAP_ALL:
- return bms_copy(relation->rd_indexattr);
case INDEX_ATTR_BITMAP_KEY:
return bms_copy(relation->rd_keyattr);
case INDEX_ATTR_BITMAP_PRIMARY_KEY:
return bms_copy(relation->rd_pkattr);
case INDEX_ATTR_BITMAP_IDENTITY_KEY:
return bms_copy(relation->rd_idattr);
+ case INDEX_ATTR_BITMAP_HOT_BLOCKING:
+ return bms_copy(relation->rd_hotblockingattr);
+ case INDEX_ATTR_BITMAP_SUMMARIZED:
+ return bms_copy(relation->rd_summarizedattr);
default:
elog(ERROR, "unknown attrKind %u", attrKind);
}
@@ -5230,10 +5234,11 @@ restart:
* CONCURRENTLY is far enough along that we should ignore the index, it
* won't be returned at all by RelationGetIndexList.
*/
- indexattrs = NULL;
uindexattrs = NULL;
pkindexattrs = NULL;
idindexattrs = NULL;
+ hotblockingattrs = NULL;
+ summarizedattrs = NULL;
foreach(l, indexoidlist)
{
Oid indexOid = lfirst_oid(l);
@@ -5291,15 +5296,25 @@ restart:
/*
* Since we have covering indexes with non-key columns, we must
* handle them accurately here. non-key columns must be added into
- * indexattrs, since they are in index, and HOT-update shouldn't
- * miss them. Obviously, non-key columns couldn't be referenced by
+ * hotblockingattrs, since they are in index, and HOT-update
+ * shouldn't miss them.
+ *
+ * Summarizing indexes do not block HOT, but do need to be updated
+ * when the column value changes, thus require a separate
+ * attribute bitmapset.
+ *
+ * Obviously, non-key columns couldn't be referenced by
* foreign key or identity key. Hence we do not include them into
* uindexattrs, pkindexattrs and idindexattrs bitmaps.
*/
if (attrnum != 0)
{
- indexattrs = bms_add_member(indexattrs,
- attrnum - FirstLowInvalidHeapAttributeNumber);
+ if (indexDesc->rd_indam->amsummarizing)
+ summarizedattrs = bms_add_member(summarizedattrs,
+ attrnum - FirstLowInvalidHeapAttributeNumber);
+ else
+ hotblockingattrs = bms_add_member(hotblockingattrs,
+ attrnum - FirstLowInvalidHeapAttributeNumber);
if (isKey && i < indexDesc->rd_index->indnkeyatts)
uindexattrs = bms_add_member(uindexattrs,
@@ -5316,10 +5331,18 @@ restart:
}
/* Collect all attributes used in expressions, too */
- pull_varattnos(indexExpressions, 1, &indexattrs);
+ if (indexDesc->rd_indam->amsummarizing)
+ pull_varattnos(indexExpressions, 1, &summarizedattrs);
+ else
+ pull_varattnos(indexExpressions, 1, &hotblockingattrs);
- /* Collect all attributes in the index predicate, too */
- pull_varattnos(indexPredicate, 1, &indexattrs);
+ /*
+ * Collect all attributes in the index predicate, too.
+ */
+ if (indexDesc->rd_indam->amsummarizing)
+ pull_varattnos(indexPredicate, 1, &summarizedattrs);
+ else
+ pull_varattnos(indexPredicate, 1, &hotblockingattrs);
index_close(indexDesc, AccessShareLock);
}
@@ -5347,24 +5370,28 @@ restart:
bms_free(uindexattrs);
bms_free(pkindexattrs);
bms_free(idindexattrs);
- bms_free(indexattrs);
+ bms_free(hotblockingattrs);
+ bms_free(summarizedattrs);
goto restart;
}
/* Don't leak the old values of these bitmaps, if any */
- bms_free(relation->rd_indexattr);
- relation->rd_indexattr = NULL;
+ relation->rd_attrsvalid = false;
bms_free(relation->rd_keyattr);
relation->rd_keyattr = NULL;
bms_free(relation->rd_pkattr);
relation->rd_pkattr = NULL;
bms_free(relation->rd_idattr);
relation->rd_idattr = NULL;
+ bms_free(relation->rd_hotblockingattr);
+ relation->rd_hotblockingattr = NULL;
+ bms_free(relation->rd_summarizedattr);
+ relation->rd_summarizedattr = NULL;
/*
* Now save copies of the bitmaps in the relcache entry. We intentionally
- * set rd_indexattr last, because that's the one that signals validity of
+ * set rd_attrsvalid last, because that's the one that signals validity of
* the values; if we run out of memory before making that copy, we won't
* leave the relcache entry looking like the other ones are valid but
* empty.
@@ -5373,20 +5400,24 @@ restart:
relation->rd_keyattr = bms_copy(uindexattrs);
relation->rd_pkattr = bms_copy(pkindexattrs);
relation->rd_idattr = bms_copy(idindexattrs);
- relation->rd_indexattr = bms_copy(indexattrs);
+ relation->rd_hotblockingattr = bms_copy(hotblockingattrs);
+ relation->rd_summarizedattr = bms_copy(summarizedattrs);
+ relation->rd_attrsvalid = true;
MemoryContextSwitchTo(oldcxt);
/* We return our original working copy for caller to play with */
switch (attrKind)
{
- case INDEX_ATTR_BITMAP_ALL:
- return indexattrs;
case INDEX_ATTR_BITMAP_KEY:
return uindexattrs;
case INDEX_ATTR_BITMAP_PRIMARY_KEY:
return pkindexattrs;
case INDEX_ATTR_BITMAP_IDENTITY_KEY:
return idindexattrs;
+ case INDEX_ATTR_BITMAP_HOT_BLOCKING:
+ return hotblockingattrs;
+ case INDEX_ATTR_BITMAP_SUMMARIZED:
+ return summarizedattrs;
default:
elog(ERROR, "unknown attrKind %u", attrKind);
return NULL;
@@ -6307,7 +6338,7 @@ load_relcache_init_file(bool shared)
rel->rd_indexlist = NIL;
rel->rd_pkindex = InvalidOid;
rel->rd_replidindex = InvalidOid;
- rel->rd_indexattr = NULL;
+ rel->rd_attrsvalid = false;
rel->rd_keyattr = NULL;
rel->rd_pkattr = NULL;
rel->rd_idattr = NULL;
diff --git a/src/include/access/amapi.h b/src/include/access/amapi.h
index 4f1f67b4d03..281039ef673 100644
--- a/src/include/access/amapi.h
+++ b/src/include/access/amapi.h
@@ -244,6 +244,8 @@ typedef struct IndexAmRoutine
bool amcaninclude;
/* does AM use maintenance_work_mem? */
bool amusemaintenanceworkmem;
+ /* does AM store tuple information only at block granularity? */
+ bool amsummarizing;
/* OR of parallel vacuum flags. See vacuum.h for flags. */
uint8 amparallelvacuumoptions;
/* type of data stored in index, or InvalidOid if variable */
diff --git a/src/include/access/heapam.h b/src/include/access/heapam.h
index 8d74d1b7e30..faf50265191 100644
--- a/src/include/access/heapam.h
+++ b/src/include/access/heapam.h
@@ -249,7 +249,8 @@ extern void heap_abort_speculative(Relation relation, ItemPointer tid);
extern TM_Result heap_update(Relation relation, ItemPointer otid,
HeapTuple newtup,
CommandId cid, Snapshot crosscheck, bool wait,
- struct TM_FailureData *tmfd, LockTupleMode *lockmode);
+ struct TM_FailureData *tmfd, LockTupleMode *lockmode,
+ TU_UpdateIndexes *update_indexes);
extern TM_Result heap_lock_tuple(Relation relation, HeapTuple tuple,
CommandId cid, LockTupleMode mode, LockWaitPolicy wait_policy,
bool follow_updates,
@@ -275,7 +276,7 @@ extern bool heap_tuple_needs_eventual_freeze(HeapTupleHeader tuple);
extern void simple_heap_insert(Relation relation, HeapTuple tup);
extern void simple_heap_delete(Relation relation, ItemPointer tid);
extern void simple_heap_update(Relation relation, ItemPointer otid,
- HeapTuple tup);
+ HeapTuple tup, TU_UpdateIndexes *update_indexes);
extern TransactionId heap_index_delete_tuples(Relation rel,
TM_IndexDeleteOp *delstate);
diff --git a/src/include/access/tableam.h b/src/include/access/tableam.h
index 652e96f1b0b..f31d7693ecd 100644
--- a/src/include/access/tableam.h
+++ b/src/include/access/tableam.h
@@ -102,6 +102,19 @@ typedef enum TM_Result
TM_WouldBlock
} TM_Result;
+/*
+ * Result codes for table_update(..., update_indexes*..).
+ * Used to determine which indexes to update.
+ */
+typedef enum TU_UpdateIndexes {
+ /* No indexed columns were updated (incl. TID addressing of tuple) */
+ TU_None = 0,
+ /* A non-summarizing indexed column was updated, or the TID has changed */
+ TU_All = 1,
+ /* Only summarized columns were updated, TID is unchanged */
+ TU_Summarizing = 2
+} TU_UpdateIndexes;
+
/*
* When table_tuple_update, table_tuple_delete, or table_tuple_lock fail
* because the target tuple is already outdated, they fill in this struct to
@@ -526,7 +539,7 @@ typedef struct TableAmRoutine
bool wait,
TM_FailureData *tmfd,
LockTupleMode *lockmode,
- bool *update_indexes);
+ TU_UpdateIndexes *update_indexes);
/* see table_tuple_lock() for reference about parameters */
TM_Result (*tuple_lock) (Relation rel,
@@ -1514,7 +1527,7 @@ static inline TM_Result
table_tuple_update(Relation rel, ItemPointer otid, TupleTableSlot *slot,
CommandId cid, Snapshot snapshot, Snapshot crosscheck,
bool wait, TM_FailureData *tmfd, LockTupleMode *lockmode,
- bool *update_indexes)
+ TU_UpdateIndexes *update_indexes)
{
return rel->rd_tableam->tuple_update(rel, otid, slot,
cid, snapshot, crosscheck,
@@ -2038,7 +2051,7 @@ extern void simple_table_tuple_delete(Relation rel, ItemPointer tid,
Snapshot snapshot);
extern void simple_table_tuple_update(Relation rel, ItemPointer otid,
TupleTableSlot *slot, Snapshot snapshot,
- bool *update_indexes);
+ TU_UpdateIndexes *update_indexes);
/* ----------------------------------------------------------------------------
diff --git a/src/include/executor/executor.h b/src/include/executor/executor.h
index e7e25c057ef..551c5d7ae0c 100644
--- a/src/include/executor/executor.h
+++ b/src/include/executor/executor.h
@@ -620,7 +620,8 @@ extern List *ExecInsertIndexTuples(ResultRelInfo *resultRelInfo,
TupleTableSlot *slot, EState *estate,
bool update,
bool noDupErr,
- bool *specConflict, List *arbiterIndexes);
+ bool *specConflict, List *arbiterIndexes,
+ bool onlySummarizing);
extern bool ExecCheckIndexConstraints(ResultRelInfo *resultRelInfo,
TupleTableSlot *slot,
EState *estate, ItemPointer conflictTid,
diff --git a/src/include/nodes/execnodes.h b/src/include/nodes/execnodes.h
index 20f4c8b35f3..3f1b8818a1f 100644
--- a/src/include/nodes/execnodes.h
+++ b/src/include/nodes/execnodes.h
@@ -161,6 +161,7 @@ typedef struct ExprState
* IndexUnchanged aminsert hint, cached for retail inserts
* Concurrent are we doing a concurrent index build?
* BrokenHotChain did we detect any broken HOT chains?
+ * Summarizing is it summarizing?
* ParallelWorkers # of workers requested (excludes leader)
* Am Oid of index AM
* AmCache private cache area for index AM
@@ -194,6 +195,7 @@ typedef struct IndexInfo
bool ii_IndexUnchanged;
bool ii_Concurrent;
bool ii_BrokenHotChain;
+ bool ii_Summarizing;
int ii_ParallelWorkers;
Oid ii_Am;
void *ii_AmCache;
diff --git a/src/include/nodes/makefuncs.h b/src/include/nodes/makefuncs.h
index 80f1d5336bb..64651c9b00b 100644
--- a/src/include/nodes/makefuncs.h
+++ b/src/include/nodes/makefuncs.h
@@ -96,7 +96,9 @@ extern List *make_ands_implicit(Expr *clause);
extern IndexInfo *makeIndexInfo(int numattrs, int numkeyattrs, Oid amoid,
List *expressions, List *predicates,
- bool unique, bool nulls_not_distinct, bool isready, bool concurrent);
+ bool unique, bool nulls_not_distinct,
+ bool isready, bool concurrent,
+ bool summarizing);
extern DefElem *makeDefElem(char *name, Node *arg, int location);
extern DefElem *makeDefElemExtended(char *nameSpace, char *name, Node *arg,
diff --git a/src/include/utils/rel.h b/src/include/utils/rel.h
index 67f994cb3e2..c0ddddb2f0d 100644
--- a/src/include/utils/rel.h
+++ b/src/include/utils/rel.h
@@ -156,10 +156,12 @@ typedef struct RelationData
List *rd_statlist; /* list of OIDs of extended stats */
/* data managed by RelationGetIndexAttrBitmap: */
- Bitmapset *rd_indexattr; /* identifies columns used in indexes */
+ bool rd_attrsvalid; /* are bitmaps of attrs valid? */
Bitmapset *rd_keyattr; /* cols that can be ref'd by foreign keys */
Bitmapset *rd_pkattr; /* cols included in primary key */
Bitmapset *rd_idattr; /* included in replica identity index */
+ Bitmapset *rd_hotblockingattr; /* cols blocking HOT update */
+ Bitmapset *rd_summarizedattr; /* cols indexed by summarizing indexes */
PublicationDesc *rd_pubdesc; /* publication descriptor, or NULL */
diff --git a/src/include/utils/relcache.h b/src/include/utils/relcache.h
index 88460f21c56..beeb28b83cb 100644
--- a/src/include/utils/relcache.h
+++ b/src/include/utils/relcache.h
@@ -56,10 +56,11 @@ extern bytea **RelationGetIndexAttOptions(Relation relation, bool copy);
typedef enum IndexAttrBitmapKind
{
- INDEX_ATTR_BITMAP_ALL,
INDEX_ATTR_BITMAP_KEY,
INDEX_ATTR_BITMAP_PRIMARY_KEY,
- INDEX_ATTR_BITMAP_IDENTITY_KEY
+ INDEX_ATTR_BITMAP_IDENTITY_KEY,
+ INDEX_ATTR_BITMAP_HOT_BLOCKING,
+ INDEX_ATTR_BITMAP_SUMMARIZED
} IndexAttrBitmapKind;
extern Bitmapset *RelationGetIndexAttrBitmap(Relation relation,
diff --git a/src/test/modules/dummy_index_am/dummy_index_am.c b/src/test/modules/dummy_index_am/dummy_index_am.c
index dfb1ebb846a..c14e0abe0c6 100644
--- a/src/test/modules/dummy_index_am/dummy_index_am.c
+++ b/src/test/modules/dummy_index_am/dummy_index_am.c
@@ -296,6 +296,7 @@ dihandler(PG_FUNCTION_ARGS)
amroutine->amcanparallel = false;
amroutine->amcaninclude = false;
amroutine->amusemaintenanceworkmem = false;
+ amroutine->amsummarizing = false;
amroutine->amparallelvacuumoptions = VACUUM_OPTION_NO_PARALLEL;
amroutine->amkeytype = InvalidOid;
diff --git a/src/test/regress/expected/stats.out b/src/test/regress/expected/stats.out
index 937b2101b33..728c474cfa3 100644
--- a/src/test/regress/expected/stats.out
+++ b/src/test/regress/expected/stats.out
@@ -1354,4 +1354,114 @@ SELECT :io_stats_post_reset < :io_stats_pre_reset;
t
(1 row)
+-- test BRIN index doesn't block HOT update
+CREATE TABLE brin_hot (
+ id integer PRIMARY KEY,
+ val integer NOT NULL
+) WITH (autovacuum_enabled = off, fillfactor = 70);
+INSERT INTO brin_hot SELECT *, 0 FROM generate_series(1, 235);
+CREATE INDEX val_brin ON brin_hot using brin(val);
+CREATE FUNCTION wait_for_hot_stats() RETURNS void AS $$
+DECLARE
+ start_time timestamptz := clock_timestamp();
+ updated bool;
+BEGIN
+ -- we don't want to wait forever; loop will exit after 30 seconds
+ FOR i IN 1 .. 300 LOOP
+ SELECT (pg_stat_get_tuples_hot_updated('brin_hot'::regclass::oid) > 0) INTO updated;
+ EXIT WHEN updated;
+
+ -- wait a little
+ PERFORM pg_sleep_for('100 milliseconds');
+ -- reset stats snapshot so we can test again
+ PERFORM pg_stat_clear_snapshot();
+ END LOOP;
+ -- report time waited in postmaster log (where it won't change test output)
+ RAISE log 'wait_for_hot_stats delayed % seconds',
+ EXTRACT(epoch FROM clock_timestamp() - start_time);
+END
+$$ LANGUAGE plpgsql;
+UPDATE brin_hot SET val = -3 WHERE id = 42;
+-- We can't just call wait_for_hot_stats() at this point, because we only
+-- transmit stats when the session goes idle, and we probably didn't
+-- transmit the last couple of counts yet thanks to the rate-limiting logic
+-- in pgstat_report_stat(). But instead of waiting for the rate limiter's
+-- timeout to elapse, let's just start a new session. The old one will
+-- then send its stats before dying.
+\c -
+SELECT wait_for_hot_stats();
+ wait_for_hot_stats
+--------------------
+
+(1 row)
+
+SELECT pg_stat_get_tuples_hot_updated('brin_hot'::regclass::oid);
+ pg_stat_get_tuples_hot_updated
+--------------------------------
+ 1
+(1 row)
+
+DROP TABLE brin_hot;
+DROP FUNCTION wait_for_hot_stats();
+-- Test handling of index predicates - updating attributes in predicates
+-- should block HOT even for BRIN. We update a row that was not indexed
+-- due to the index predicate, and becomes indexable.
+CREATE TABLE brin_hot_2 (a int, b int);
+INSERT INTO brin_hot_2 VALUES (1, 100);
+CREATE INDEX ON brin_hot_2 USING brin (b) WHERE a = 2;
+UPDATE brin_hot_2 SET a = 2;
+EXPLAIN (COSTS OFF) SELECT * FROM brin_hot_2 WHERE a = 2 AND b = 100;
+ QUERY PLAN
+-----------------------------------
+ Seq Scan on brin_hot_2
+ Filter: ((a = 2) AND (b = 100))
+(2 rows)
+
+SELECT COUNT(*) FROM brin_hot_2 WHERE a = 2 AND b = 100;
+ count
+-------
+ 1
+(1 row)
+
+SET enable_seqscan = off;
+EXPLAIN (COSTS OFF) SELECT * FROM brin_hot_2 WHERE a = 2 AND b = 100;
+ QUERY PLAN
+---------------------------------------------
+ Bitmap Heap Scan on brin_hot_2
+ Recheck Cond: ((b = 100) AND (a = 2))
+ -> Bitmap Index Scan on brin_hot_2_b_idx
+ Index Cond: (b = 100)
+(4 rows)
+
+SELECT COUNT(*) FROM brin_hot_2 WHERE a = 2 AND b = 100;
+ count
+-------
+ 1
+(1 row)
+
+DROP TABLE brin_hot_2;
+-- Test that updates to indexed columns are still propagated to the
+-- BRIN column.
+-- https://postgr.es/m/05ebcb44-f383-86e3-4f31-0a97a55634cf@enterprisedb.com
+CREATE TABLE brin_hot_3 (a int, filler text) WITH (fillfactor = 10);
+INSERT INTO brin_hot_3 SELECT 1, repeat(' ', 500) FROM generate_series(1, 20);
+CREATE INDEX ON brin_hot_3 USING brin (a) WITH (pages_per_range = 1);
+UPDATE brin_hot_3 SET a = 2;
+EXPLAIN (COSTS OFF) SELECT * FROM brin_hot_3 WHERE a = 2;
+ QUERY PLAN
+---------------------------------------------
+ Bitmap Heap Scan on brin_hot_3
+ Recheck Cond: (a = 2)
+ -> Bitmap Index Scan on brin_hot_3_a_idx
+ Index Cond: (a = 2)
+(4 rows)
+
+SELECT COUNT(*) FROM brin_hot_3 WHERE a = 2;
+ count
+-------
+ 20
+(1 row)
+
+DROP TABLE brin_hot_3;
+SET enable_seqscan = on;
-- End of Stats Test
diff --git a/src/test/regress/sql/stats.sql b/src/test/regress/sql/stats.sql
index 74e592aa8af..e113f6906c9 100644
--- a/src/test/regress/sql/stats.sql
+++ b/src/test/regress/sql/stats.sql
@@ -535,7 +535,6 @@ SET enable_seqscan TO on;
SELECT pg_stat_get_replication_slot(NULL);
SELECT pg_stat_get_subscription_stats(NULL);
-
-- Test that the following operations are tracked in pg_stat_io:
-- - reads of target blocks into shared buffers
-- - writes of shared buffers to permanent storage
@@ -678,4 +677,85 @@ SELECT sum(evictions) + sum(reuses) + sum(extends) + sum(fsyncs) + sum(reads) +
FROM pg_stat_io \gset
SELECT :io_stats_post_reset < :io_stats_pre_reset;
+
+-- test BRIN index doesn't block HOT update
+CREATE TABLE brin_hot (
+ id integer PRIMARY KEY,
+ val integer NOT NULL
+) WITH (autovacuum_enabled = off, fillfactor = 70);
+
+INSERT INTO brin_hot SELECT *, 0 FROM generate_series(1, 235);
+CREATE INDEX val_brin ON brin_hot using brin(val);
+
+CREATE FUNCTION wait_for_hot_stats() RETURNS void AS $$
+DECLARE
+ start_time timestamptz := clock_timestamp();
+ updated bool;
+BEGIN
+ -- we don't want to wait forever; loop will exit after 30 seconds
+ FOR i IN 1 .. 300 LOOP
+ SELECT (pg_stat_get_tuples_hot_updated('brin_hot'::regclass::oid) > 0) INTO updated;
+ EXIT WHEN updated;
+
+ -- wait a little
+ PERFORM pg_sleep_for('100 milliseconds');
+ -- reset stats snapshot so we can test again
+ PERFORM pg_stat_clear_snapshot();
+ END LOOP;
+ -- report time waited in postmaster log (where it won't change test output)
+ RAISE log 'wait_for_hot_stats delayed % seconds',
+ EXTRACT(epoch FROM clock_timestamp() - start_time);
+END
+$$ LANGUAGE plpgsql;
+
+UPDATE brin_hot SET val = -3 WHERE id = 42;
+
+-- We can't just call wait_for_hot_stats() at this point, because we only
+-- transmit stats when the session goes idle, and we probably didn't
+-- transmit the last couple of counts yet thanks to the rate-limiting logic
+-- in pgstat_report_stat(). But instead of waiting for the rate limiter's
+-- timeout to elapse, let's just start a new session. The old one will
+-- then send its stats before dying.
+\c -
+
+SELECT wait_for_hot_stats();
+SELECT pg_stat_get_tuples_hot_updated('brin_hot'::regclass::oid);
+
+DROP TABLE brin_hot;
+DROP FUNCTION wait_for_hot_stats();
+
+-- Test handling of index predicates - updating attributes in predicates
+-- should block HOT even for BRIN. We update a row that was not indexed
+-- due to the index predicate, and becomes indexable.
+CREATE TABLE brin_hot_2 (a int, b int);
+INSERT INTO brin_hot_2 VALUES (1, 100);
+CREATE INDEX ON brin_hot_2 USING brin (b) WHERE a = 2;
+
+UPDATE brin_hot_2 SET a = 2;
+
+EXPLAIN (COSTS OFF) SELECT * FROM brin_hot_2 WHERE a = 2 AND b = 100;
+SELECT COUNT(*) FROM brin_hot_2 WHERE a = 2 AND b = 100;
+
+SET enable_seqscan = off;
+
+EXPLAIN (COSTS OFF) SELECT * FROM brin_hot_2 WHERE a = 2 AND b = 100;
+SELECT COUNT(*) FROM brin_hot_2 WHERE a = 2 AND b = 100;
+
+DROP TABLE brin_hot_2;
+
+-- Test that updates to indexed columns are still propagated to the
+-- BRIN column.
+-- https://postgr.es/m/05ebcb44-f383-86e3-4f31-0a97a55634cf@enterprisedb.com
+CREATE TABLE brin_hot_3 (a int, filler text) WITH (fillfactor = 10);
+INSERT INTO brin_hot_3 SELECT 1, repeat(' ', 500) FROM generate_series(1, 20);
+CREATE INDEX ON brin_hot_3 USING brin (a) WITH (pages_per_range = 1);
+UPDATE brin_hot_3 SET a = 2;
+
+EXPLAIN (COSTS OFF) SELECT * FROM brin_hot_3 WHERE a = 2;
+SELECT COUNT(*) FROM brin_hot_3 WHERE a = 2;
+
+DROP TABLE brin_hot_3;
+
+SET enable_seqscan = on;
+
-- End of Stats Test
--
2.39.1
v3-0002-tweaks.patch
From 319de1400bff8a803afce6046d413e3febc75cbb Mon Sep 17 00:00:00 2001
From: Tomas Vondra <tomas.vondra@postgresql.org>
Date: Wed, 22 Feb 2023 12:38:30 +0100
Subject: [PATCH v3 2/2] tweaks
---
src/backend/utils/cache/relcache.c | 26 +++++++++++---------------
1 file changed, 11 insertions(+), 15 deletions(-)
diff --git a/src/backend/utils/cache/relcache.c b/src/backend/utils/cache/relcache.c
index 092c5ed8c7f..cd0f6e2a5ee 100644
--- a/src/backend/utils/cache/relcache.c
+++ b/src/backend/utils/cache/relcache.c
@@ -5251,6 +5251,7 @@ restart:
bool isKey; /* candidate key */
bool isPK; /* primary key */
bool isIDKey; /* replica identity index */
+ Bitmapset **attrs;
indexDesc = index_open(indexOid, AccessShareLock);
@@ -5288,6 +5289,11 @@ restart:
/* Is this index the configured (or default) replica identity? */
isIDKey = (indexOid == relreplindex);
+ if (indexDesc->rd_indam->amsummarizing)
+ attrs = &summarizedattrs;
+ else
+ attrs = &hotblockingattrs;
+
/* Collect simple attribute references */
for (i = 0; i < indexDesc->rd_index->indnatts; i++)
{
@@ -5298,7 +5304,7 @@ restart:
* handle them accurately here. non-key columns must be added into
* hotblockingattrs, since they are in index, and HOT-update
* shouldn't miss them.
- *
+ *
* Summarizing indexes do not block HOT, but do need to be updated
* when the column value changes, thus require a separate
* attribute bitmapset.
@@ -5309,12 +5315,8 @@ restart:
*/
if (attrnum != 0)
{
- if (indexDesc->rd_indam->amsummarizing)
- summarizedattrs = bms_add_member(summarizedattrs,
- attrnum - FirstLowInvalidHeapAttributeNumber);
- else
- hotblockingattrs = bms_add_member(hotblockingattrs,
- attrnum - FirstLowInvalidHeapAttributeNumber);
+ *attrs = bms_add_member(*attrs,
+ attrnum - FirstLowInvalidHeapAttributeNumber);
if (isKey && i < indexDesc->rd_index->indnkeyatts)
uindexattrs = bms_add_member(uindexattrs,
@@ -5331,18 +5333,12 @@ restart:
}
/* Collect all attributes used in expressions, too */
- if (indexDesc->rd_indam->amsummarizing)
- pull_varattnos(indexExpressions, 1, &summarizedattrs);
- else
- pull_varattnos(indexExpressions, 1, &hotblockingattrs);
+ pull_varattnos(indexExpressions, 1, attrs);
/*
* Collect all attributes in the index predicate, too.
*/
- if (indexDesc->rd_indam->amsummarizing)
- pull_varattnos(indexPredicate, 1, &summarizedattrs);
- else
- pull_varattnos(indexPredicate, 1, &hotblockingattrs);
+ pull_varattnos(indexPredicate, 1, attrs);
index_close(indexDesc, AccessShareLock);
}
--
2.39.1
On Wed, 22 Feb 2023 at 13:15, Tomas Vondra
<tomas.vondra@enterprisedb.com> wrote:
On 2/20/23 19:15, Matthias van de Meent wrote:
Thanks. Based on feedback, attached is v2 of the patch, with these
significant changes:

- We don't store the columns we mention in predicates of summarized
indexes in the hotblocking column anymore, they are stored in the
summarized columns bitmap instead. This further reduces the chance of
failing to apply HOT with summarizing indexes.

Interesting idea. I need to think about the correctness, but AFAICS it
should work. Do we have any tests covering such cases?

There is a test that checks that an update to the predicated column
does update the index (on table brin_hot_2). However, the description
was out of date, so I've updated that in v4.

- The heaptuple header bit for summarized update in inserted tuples is
replaced with passing an out parameter. This simplifies the logic and
decreases chances of accidentally storing incorrect data.

OK.

0002 proposes a minor RelationGetIndexPredicate() tweak, getting rid of
the repeated if/else branches. Feel free to discard, if you think the v2
approach is better.

I agree that this is better; it's included in v4 of the patch, as attached.
Kind regards,
Matthias van de Meent.
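For anyone who wants to try the predicate-column case by hand rather than
through the regression suite, a rough sketch along the lines of the
brin_hot_2 test could look like this (the object names here are made up
for illustration):
----------------------------------------------------------------------
CREATE TABLE brin_hot_demo (a INT, b INT);
INSERT INTO brin_hot_demo VALUES (1, 100);
CREATE INDEX brin_hot_demo_b ON brin_hot_demo USING BRIN (b) WHERE a = 2;

-- the update only touches the predicate column of a summarizing index,
-- so it may be done as HOT, but the row must still become visible
-- through the partial BRIN index
UPDATE brin_hot_demo SET a = 2;

SET enable_seqscan = off;
EXPLAIN (COSTS OFF) SELECT * FROM brin_hot_demo WHERE a = 2 AND b = 100;
SELECT count(*) FROM brin_hot_demo WHERE a = 2 AND b = 100;  -- expect 1
RESET enable_seqscan;
DROP TABLE brin_hot_demo;
----------------------------------------------------------------------
If the bitmap scan on the partial index returns the row, the predicate
column was handled correctly; with the v2 approach such columns were
still tracked in the hot-blocking set instead.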
Attachments:
v4-0001-Ignore-BRIN-indexes-when-checking-for-HOT-updates.patch
From c7047eb4fbbbb57c147bb6ac887969e94bcc2f1e Mon Sep 17 00:00:00 2001
From: Matthias van de Meent <boekewurm+postgres@gmail.com>
Date: Mon, 20 Feb 2023 18:38:30 +0100
Subject: [PATCH v4] Ignore BRIN indexes when checking for HOT updates
When determining whether an index update may be skipped by using HOT, we
can ignore attributes indexed by block summarizing indexes without
references to individual tuples that need to be cleaned up.
This also removes rd_indexattr list, and replaces it with rd_attrsvalid
flag. The list was not used anywhere, and a simple flag is sufficient.
A new type TU_UpdateIndexes is invented to provide a signal to the executor
to determine which indexes to update - no indexes, all indexes, or only
the summarizing indexes.
One otherwise unused bit in the heap tuple header is (ab)used to signal
that the HOT update would still update at least one summarizing index.
The bit is cleared immediately
Original patch by Josef Simanek, various fixes and improvements by
Tomas Vondra and me.
Authors: Josef Simanek, Tomas Vondra, Matthias van de Meent
Reviewed-by: Tomas Vondra, Alvaro Herrera
---
doc/src/sgml/indexam.sgml | 13 ++
src/backend/access/brin/brin.c | 1 +
src/backend/access/gin/ginutil.c | 1 +
src/backend/access/gist/gist.c | 1 +
src/backend/access/hash/hash.c | 1 +
src/backend/access/heap/heapam.c | 48 +++++++-
src/backend/access/heap/heapam_handler.c | 19 ++-
src/backend/access/nbtree/nbtree.c | 1 +
src/backend/access/spgist/spgutils.c | 1 +
src/backend/access/table/tableam.c | 2 +-
src/backend/catalog/index.c | 9 +-
src/backend/catalog/indexing.c | 35 ++++--
src/backend/commands/copyfrom.c | 5 +-
src/backend/commands/indexcmds.c | 10 +-
src/backend/executor/execIndexing.c | 37 ++++--
src/backend/executor/execReplication.c | 9 +-
src/backend/executor/nodeModifyTable.c | 13 +-
src/backend/nodes/makefuncs.c | 7 +-
src/backend/utils/cache/relcache.c | 69 +++++++----
src/include/access/amapi.h | 2 +
src/include/access/heapam.h | 5 +-
src/include/access/tableam.h | 19 ++-
src/include/executor/executor.h | 3 +-
src/include/nodes/execnodes.h | 2 +
src/include/nodes/makefuncs.h | 4 +-
src/include/utils/rel.h | 4 +-
src/include/utils/relcache.h | 5 +-
.../modules/dummy_index_am/dummy_index_am.c | 1 +
src/test/regress/expected/stats.out | 111 ++++++++++++++++++
src/test/regress/sql/stats.sql | 83 ++++++++++++-
30 files changed, 443 insertions(+), 78 deletions(-)
diff --git a/doc/src/sgml/indexam.sgml b/doc/src/sgml/indexam.sgml
index 4f83970c85..897419ec95 100644
--- a/doc/src/sgml/indexam.sgml
+++ b/doc/src/sgml/indexam.sgml
@@ -127,6 +127,9 @@ typedef struct IndexAmRoutine
bool amcaninclude;
/* does AM use maintenance_work_mem? */
bool amusemaintenanceworkmem;
+ /* does AM summarize tuples, with at least all tuples in the block
+ * summarized in one summary */
+ bool amsummarizing;
/* OR of parallel vacuum flags */
uint8 amparallelvacuumoptions;
/* type of data stored in index, or InvalidOid if variable */
@@ -247,6 +250,16 @@ typedef struct IndexAmRoutine
null, independently of <structfield>amoptionalkey</structfield>.
</para>
+ <para>
+ The <structfield>amsummarizing</structfield> flag indicates whether the
+ access method summarizes the indexed tuples, with summarizing granularity
+ of at least per block.
+ Access methods that do not point to individual tuples, but to block ranges
+ (like <acronym>BRIN</acronym>), may allow the <acronym>HOT</acronym> optimization
+ to continue. This does not apply to attributes referenced in index
+ predicates; an update of such an attribute always disables <acronym>HOT</acronym>.
+ </para>
+
</sect1>
<sect1 id="index-functions">
diff --git a/src/backend/access/brin/brin.c b/src/backend/access/brin/brin.c
index b5a5fa7b33..53e4721a54 100644
--- a/src/backend/access/brin/brin.c
+++ b/src/backend/access/brin/brin.c
@@ -109,6 +109,7 @@ brinhandler(PG_FUNCTION_ARGS)
amroutine->amcanparallel = false;
amroutine->amcaninclude = false;
amroutine->amusemaintenanceworkmem = false;
+ amroutine->amsummarizing = true;
amroutine->amparallelvacuumoptions =
VACUUM_OPTION_PARALLEL_CLEANUP;
amroutine->amkeytype = InvalidOid;
diff --git a/src/backend/access/gin/ginutil.c b/src/backend/access/gin/ginutil.c
index f05128ecf5..03fec1704e 100644
--- a/src/backend/access/gin/ginutil.c
+++ b/src/backend/access/gin/ginutil.c
@@ -56,6 +56,7 @@ ginhandler(PG_FUNCTION_ARGS)
amroutine->amcanparallel = false;
amroutine->amcaninclude = false;
amroutine->amusemaintenanceworkmem = true;
+ amroutine->amsummarizing = false;
amroutine->amparallelvacuumoptions =
VACUUM_OPTION_PARALLEL_BULKDEL | VACUUM_OPTION_PARALLEL_CLEANUP;
amroutine->amkeytype = InvalidOid;
diff --git a/src/backend/access/gist/gist.c b/src/backend/access/gist/gist.c
index ba394f08f6..ea72bcce1b 100644
--- a/src/backend/access/gist/gist.c
+++ b/src/backend/access/gist/gist.c
@@ -78,6 +78,7 @@ gisthandler(PG_FUNCTION_ARGS)
amroutine->amcanparallel = false;
amroutine->amcaninclude = true;
amroutine->amusemaintenanceworkmem = false;
+ amroutine->amsummarizing = false;
amroutine->amparallelvacuumoptions =
VACUUM_OPTION_PARALLEL_BULKDEL | VACUUM_OPTION_PARALLEL_COND_CLEANUP;
amroutine->amkeytype = InvalidOid;
diff --git a/src/backend/access/hash/hash.c b/src/backend/access/hash/hash.c
index eb258337d6..fc5d97f606 100644
--- a/src/backend/access/hash/hash.c
+++ b/src/backend/access/hash/hash.c
@@ -75,6 +75,7 @@ hashhandler(PG_FUNCTION_ARGS)
amroutine->amcanparallel = false;
amroutine->amcaninclude = false;
amroutine->amusemaintenanceworkmem = false;
+ amroutine->amsummarizing = false;
amroutine->amparallelvacuumoptions =
VACUUM_OPTION_PARALLEL_BULKDEL;
amroutine->amkeytype = INT4OID;
diff --git a/src/backend/access/heap/heapam.c b/src/backend/access/heap/heapam.c
index 7eb79cee58..89fdfe5353 100644
--- a/src/backend/access/heap/heapam.c
+++ b/src/backend/access/heap/heapam.c
@@ -2924,11 +2924,13 @@ simple_heap_delete(Relation relation, ItemPointer tid)
TM_Result
heap_update(Relation relation, ItemPointer otid, HeapTuple newtup,
CommandId cid, Snapshot crosscheck, bool wait,
- TM_FailureData *tmfd, LockTupleMode *lockmode)
+ TM_FailureData *tmfd, LockTupleMode *lockmode,
+ TU_UpdateIndexes *update_indexes)
{
TM_Result result;
TransactionId xid = GetCurrentTransactionId();
Bitmapset *hot_attrs;
+ Bitmapset *sum_attrs;
Bitmapset *key_attrs;
Bitmapset *id_attrs;
Bitmapset *interesting_attrs;
@@ -2951,6 +2953,7 @@ heap_update(Relation relation, ItemPointer otid, HeapTuple newtup,
bool have_tuple_lock = false;
bool iscombo;
bool use_hot_update = false;
+ bool summarized_update = false;
bool key_intact;
bool all_visible_cleared = false;
bool all_visible_cleared_new = false;
@@ -2996,12 +2999,16 @@ heap_update(Relation relation, ItemPointer otid, HeapTuple newtup,
* Note that we get copies of each bitmap, so we need not worry about
* relcache flush happening midway through.
*/
- hot_attrs = RelationGetIndexAttrBitmap(relation, INDEX_ATTR_BITMAP_ALL);
+ hot_attrs = RelationGetIndexAttrBitmap(relation,
+ INDEX_ATTR_BITMAP_HOT_BLOCKING);
+ sum_attrs = RelationGetIndexAttrBitmap(relation,
+ INDEX_ATTR_BITMAP_SUMMARIZED);
key_attrs = RelationGetIndexAttrBitmap(relation, INDEX_ATTR_BITMAP_KEY);
id_attrs = RelationGetIndexAttrBitmap(relation,
INDEX_ATTR_BITMAP_IDENTITY_KEY);
interesting_attrs = NULL;
interesting_attrs = bms_add_members(interesting_attrs, hot_attrs);
+ interesting_attrs = bms_add_members(interesting_attrs, sum_attrs);
interesting_attrs = bms_add_members(interesting_attrs, key_attrs);
interesting_attrs = bms_add_members(interesting_attrs, id_attrs);
@@ -3311,7 +3318,10 @@ l2:
UnlockTupleTuplock(relation, &(oldtup.t_self), *lockmode);
if (vmbuffer != InvalidBuffer)
ReleaseBuffer(vmbuffer);
+ *update_indexes = TU_None;
+
bms_free(hot_attrs);
+ bms_free(sum_attrs);
bms_free(key_attrs);
bms_free(id_attrs);
bms_free(modified_attrs);
@@ -3633,7 +3643,19 @@ l2:
* changed.
*/
if (!bms_overlap(modified_attrs, hot_attrs))
+ {
use_hot_update = true;
+
+ /*
+ * If none of the columns that are used in hot-blocking indexes
+ * were updated, we can apply HOT, but we do still need to check
+ * if we need to update the summarizing indexes, and update those
+ * indexes if the columns were updated, or we may fail to detect
+ * e.g. value bound changes in BRIN minmax indexes.
+ */
+ if (bms_overlap(modified_attrs, sum_attrs))
+ summarized_update = true;
+ }
}
else
{
@@ -3793,10 +3815,27 @@ l2:
heap_freetuple(heaptup);
}
+ /*
+ * If it is a HOT update, the update may still need to update summarized
+ * indexes, lest we fail to update those summaries and get incorrect
+ * results (for example, minmax bounds of the block may change with this
+ * update).
+ */
+ if (use_hot_update)
+ {
+ if (summarized_update)
+ *update_indexes = TU_Summarizing;
+ else
+ *update_indexes = TU_None;
+ }
+ else
+ *update_indexes = TU_All;
+
if (old_key_tuple != NULL && old_key_copied)
heap_freetuple(old_key_tuple);
bms_free(hot_attrs);
+ bms_free(sum_attrs);
bms_free(key_attrs);
bms_free(id_attrs);
bms_free(modified_attrs);
@@ -3960,7 +3999,8 @@ HeapDetermineColumnsInfo(Relation relation,
* via ereport().
*/
void
-simple_heap_update(Relation relation, ItemPointer otid, HeapTuple tup)
+simple_heap_update(Relation relation, ItemPointer otid, HeapTuple tup,
+ TU_UpdateIndexes *update_indexes)
{
TM_Result result;
TM_FailureData tmfd;
@@ -3969,7 +4009,7 @@ simple_heap_update(Relation relation, ItemPointer otid, HeapTuple tup)
result = heap_update(relation, otid, tup,
GetCurrentCommandId(true), InvalidSnapshot,
true /* wait for commit */ ,
- &tmfd, &lockmode);
+ &tmfd, &lockmode, update_indexes);
switch (result)
{
case TM_SelfModified:
diff --git a/src/backend/access/heap/heapam_handler.c b/src/backend/access/heap/heapam_handler.c
index c4b1916d36..a1d7d91ff7 100644
--- a/src/backend/access/heap/heapam_handler.c
+++ b/src/backend/access/heap/heapam_handler.c
@@ -314,7 +314,7 @@ static TM_Result
heapam_tuple_update(Relation relation, ItemPointer otid, TupleTableSlot *slot,
CommandId cid, Snapshot snapshot, Snapshot crosscheck,
bool wait, TM_FailureData *tmfd,
- LockTupleMode *lockmode, bool *update_indexes)
+ LockTupleMode *lockmode, TU_UpdateIndexes *update_indexes)
{
bool shouldFree = true;
HeapTuple tuple = ExecFetchSlotHeapTuple(slot, true, &shouldFree);
@@ -325,7 +325,7 @@ heapam_tuple_update(Relation relation, ItemPointer otid, TupleTableSlot *slot,
tuple->t_tableOid = slot->tts_tableOid;
result = heap_update(relation, otid, tuple, cid, crosscheck, wait,
- tmfd, lockmode);
+ tmfd, lockmode, update_indexes);
ItemPointerCopy(&tuple->t_self, &slot->tts_tid);
/*
@@ -334,9 +334,20 @@ heapam_tuple_update(Relation relation, ItemPointer otid, TupleTableSlot *slot,
* Note: heap_update returns the tid (location) of the new tuple in the
* t_self field.
*
- * If it's a HOT update, we mustn't insert new index entries.
+ * If the update is not HOT, we must update all indexes. If the update
+ * is HOT, it could be that we updated summarized columns, so we either
+ * update only summarized indexes, or none at all.
*/
- *update_indexes = result == TM_Ok && !HeapTupleIsHeapOnly(tuple);
+ if (result != TM_Ok)
+ {
+ Assert(*update_indexes == TU_None);
+ *update_indexes = TU_None;
+ }
+ else if (!HeapTupleIsHeapOnly(tuple))
+ Assert(*update_indexes == TU_All);
+ else
+ Assert(*update_indexes == TU_Summarizing ||
+ *update_indexes == TU_None);
if (shouldFree)
pfree(tuple);
diff --git a/src/backend/access/nbtree/nbtree.c b/src/backend/access/nbtree/nbtree.c
index 1cc88da032..681c30b0d8 100644
--- a/src/backend/access/nbtree/nbtree.c
+++ b/src/backend/access/nbtree/nbtree.c
@@ -114,6 +114,7 @@ bthandler(PG_FUNCTION_ARGS)
amroutine->amcanparallel = true;
amroutine->amcaninclude = true;
amroutine->amusemaintenanceworkmem = false;
+ amroutine->amsummarizing = false;
amroutine->amparallelvacuumoptions =
VACUUM_OPTION_PARALLEL_BULKDEL | VACUUM_OPTION_PARALLEL_COND_CLEANUP;
amroutine->amkeytype = InvalidOid;
diff --git a/src/backend/access/spgist/spgutils.c b/src/backend/access/spgist/spgutils.c
index 3761f2c193..4e7ff1d160 100644
--- a/src/backend/access/spgist/spgutils.c
+++ b/src/backend/access/spgist/spgutils.c
@@ -62,6 +62,7 @@ spghandler(PG_FUNCTION_ARGS)
amroutine->amcanparallel = false;
amroutine->amcaninclude = true;
amroutine->amusemaintenanceworkmem = false;
+ amroutine->amsummarizing = false;
amroutine->amparallelvacuumoptions =
VACUUM_OPTION_PARALLEL_BULKDEL | VACUUM_OPTION_PARALLEL_COND_CLEANUP;
amroutine->amkeytype = InvalidOid;
diff --git a/src/backend/access/table/tableam.c b/src/backend/access/table/tableam.c
index ef0d34fcee..a5e6c92f35 100644
--- a/src/backend/access/table/tableam.c
+++ b/src/backend/access/table/tableam.c
@@ -345,7 +345,7 @@ void
simple_table_tuple_update(Relation rel, ItemPointer otid,
TupleTableSlot *slot,
Snapshot snapshot,
- bool *update_indexes)
+ TU_UpdateIndexes *update_indexes)
{
TM_Result result;
TM_FailureData tmfd;
diff --git a/src/backend/catalog/index.c b/src/backend/catalog/index.c
index 41b16cb89b..e2fd035f44 100644
--- a/src/backend/catalog/index.c
+++ b/src/backend/catalog/index.c
@@ -1370,7 +1370,8 @@ index_concurrently_create_copy(Relation heapRelation, Oid oldIndexId,
oldInfo->ii_Unique,
oldInfo->ii_NullsNotDistinct,
false, /* not ready for inserts */
- true);
+ true,
+ indexRelation->rd_indam->amsummarizing);
/*
* Extract the list of column names and the column numbers for the new
@@ -2442,7 +2443,8 @@ BuildIndexInfo(Relation index)
indexStruct->indisunique,
indexStruct->indnullsnotdistinct,
indexStruct->indisready,
- false);
+ false,
+ index->rd_indam->amsummarizing);
/* fill in attribute numbers */
for (i = 0; i < numAtts; i++)
@@ -2502,7 +2504,8 @@ BuildDummyIndexInfo(Relation index)
indexStruct->indisunique,
indexStruct->indnullsnotdistinct,
indexStruct->indisready,
- false);
+ false,
+ index->rd_indam->amsummarizing);
/* fill in attribute numbers */
for (i = 0; i < numAtts; i++)
diff --git a/src/backend/catalog/indexing.c b/src/backend/catalog/indexing.c
index bb7cc3601c..a387eccdc4 100644
--- a/src/backend/catalog/indexing.c
+++ b/src/backend/catalog/indexing.c
@@ -72,7 +72,8 @@ CatalogCloseIndexes(CatalogIndexState indstate)
* This is effectively a cut-down version of ExecInsertIndexTuples.
*/
static void
-CatalogIndexInsert(CatalogIndexState indstate, HeapTuple heapTuple)
+CatalogIndexInsert(CatalogIndexState indstate, HeapTuple heapTuple,
+ TU_UpdateIndexes updateIndexes)
{
int i;
int numIndexes;
@@ -82,6 +83,7 @@ CatalogIndexInsert(CatalogIndexState indstate, HeapTuple heapTuple)
IndexInfo **indexInfoArray;
Datum values[INDEX_MAX_KEYS];
bool isnull[INDEX_MAX_KEYS];
+ bool onlySummarized = updateIndexes == TU_Summarizing;
/*
* HOT update does not require index inserts. But with asserts enabled we
@@ -89,10 +91,13 @@ CatalogIndexInsert(CatalogIndexState indstate, HeapTuple heapTuple)
* table/index.
*/
#ifndef USE_ASSERT_CHECKING
- if (HeapTupleIsHeapOnly(heapTuple))
+ if (HeapTupleIsHeapOnly(heapTuple) && !onlySummarized)
return;
#endif
+ if (onlySummarized)
+ Assert(HeapTupleIsHeapOnly(heapTuple));
+
/*
* Get information from the state structure. Fall out if nothing to do.
*/
@@ -135,13 +140,20 @@ CatalogIndexInsert(CatalogIndexState indstate, HeapTuple heapTuple)
/* see earlier check above */
#ifdef USE_ASSERT_CHECKING
- if (HeapTupleIsHeapOnly(heapTuple))
+ if (HeapTupleIsHeapOnly(heapTuple) && !onlySummarized)
{
Assert(!ReindexIsProcessingIndex(RelationGetRelid(index)));
continue;
}
#endif /* USE_ASSERT_CHECKING */
+ /*
+ * Skip insertions into non-summarizing indexes if we only need
+ * to update summarizing indexes
+ */
+ if (onlySummarized && !indexInfo->ii_Summarizing)
+ continue;
+
/*
* FormIndexDatum fills in its values and isnull parameters with the
* appropriate values for the column(s) of the index.
@@ -228,7 +240,7 @@ CatalogTupleInsert(Relation heapRel, HeapTuple tup)
simple_heap_insert(heapRel, tup);
- CatalogIndexInsert(indstate, tup);
+ CatalogIndexInsert(indstate, tup, TU_All);
CatalogCloseIndexes(indstate);
}
@@ -248,7 +260,7 @@ CatalogTupleInsertWithInfo(Relation heapRel, HeapTuple tup,
simple_heap_insert(heapRel, tup);
- CatalogIndexInsert(indstate, tup);
+ CatalogIndexInsert(indstate, tup, TU_All);
}
/*
@@ -279,7 +291,7 @@ CatalogTuplesMultiInsertWithInfo(Relation heapRel, TupleTableSlot **slot,
tuple = ExecFetchSlotHeapTuple(slot[i], true, &should_free);
tuple->t_tableOid = slot[i]->tts_tableOid;
- CatalogIndexInsert(indstate, tuple);
+ CatalogIndexInsert(indstate, tuple, TU_All);
if (should_free)
heap_freetuple(tuple);
@@ -301,14 +313,15 @@ void
CatalogTupleUpdate(Relation heapRel, ItemPointer otid, HeapTuple tup)
{
CatalogIndexState indstate;
+ TU_UpdateIndexes updateIndexes = TU_All;
CatalogTupleCheckConstraints(heapRel, tup);
indstate = CatalogOpenIndexes(heapRel);
- simple_heap_update(heapRel, otid, tup);
+ simple_heap_update(heapRel, otid, tup, &updateIndexes);
- CatalogIndexInsert(indstate, tup);
+ CatalogIndexInsert(indstate, tup, updateIndexes);
CatalogCloseIndexes(indstate);
}
@@ -324,11 +337,13 @@ void
CatalogTupleUpdateWithInfo(Relation heapRel, ItemPointer otid, HeapTuple tup,
CatalogIndexState indstate)
{
+ TU_UpdateIndexes updateIndexes = TU_All;
+
CatalogTupleCheckConstraints(heapRel, tup);
- simple_heap_update(heapRel, otid, tup);
+ simple_heap_update(heapRel, otid, tup, &updateIndexes);
- CatalogIndexInsert(indstate, tup);
+ CatalogIndexInsert(indstate, tup, updateIndexes);
}
/*
diff --git a/src/backend/commands/copyfrom.c b/src/backend/commands/copyfrom.c
index af52faca6d..564520289a 100644
--- a/src/backend/commands/copyfrom.c
+++ b/src/backend/commands/copyfrom.c
@@ -435,7 +435,7 @@ CopyMultiInsertBufferFlush(CopyMultiInsertInfo *miinfo,
recheckIndexes =
ExecInsertIndexTuples(resultRelInfo,
buffer->slots[i], estate, false,
- false, NULL, NIL);
+ false, NULL, NIL, false);
ExecARInsertTriggers(estate, resultRelInfo,
slots[i], recheckIndexes,
cstate->transition_capture);
@@ -1254,7 +1254,8 @@ CopyFrom(CopyFromState cstate)
false,
false,
NULL,
- NIL);
+ NIL,
+ false);
}
/* AFTER ROW INSERT Triggers */
diff --git a/src/backend/commands/indexcmds.c b/src/backend/commands/indexcmds.c
index 16ec0b114e..ff48f44c66 100644
--- a/src/backend/commands/indexcmds.c
+++ b/src/backend/commands/indexcmds.c
@@ -184,6 +184,7 @@ CheckIndexCompatible(Oid oldId,
Form_pg_am accessMethodForm;
IndexAmRoutine *amRoutine;
bool amcanorder;
+ bool amsummarizing;
int16 *coloptions;
IndexInfo *indexInfo;
int numberOfAttributes;
@@ -222,6 +223,7 @@ CheckIndexCompatible(Oid oldId,
ReleaseSysCache(tuple);
amcanorder = amRoutine->amcanorder;
+ amsummarizing = amRoutine->amsummarizing;
/*
* Compute the operator classes, collations, and exclusion operators for
@@ -232,7 +234,8 @@ CheckIndexCompatible(Oid oldId,
* ii_NumIndexKeyAttrs with same value.
*/
indexInfo = makeIndexInfo(numberOfAttributes, numberOfAttributes,
- accessMethodId, NIL, NIL, false, false, false, false);
+ accessMethodId, NIL, NIL, false, false,
+ false, false, amsummarizing);
typeObjectId = palloc_array(Oid, numberOfAttributes);
collationObjectId = palloc_array(Oid, numberOfAttributes);
classObjectId = palloc_array(Oid, numberOfAttributes);
@@ -550,6 +553,7 @@ DefineIndex(Oid relationId,
Form_pg_am accessMethodForm;
IndexAmRoutine *amRoutine;
bool amcanorder;
+ bool amissummarizing;
amoptions_function amoptions;
bool partitioned;
bool safe_index;
@@ -866,6 +870,7 @@ DefineIndex(Oid relationId,
amcanorder = amRoutine->amcanorder;
amoptions = amRoutine->amoptions;
+ amissummarizing = amRoutine->amsummarizing;
pfree(amRoutine);
ReleaseSysCache(tuple);
@@ -897,7 +902,8 @@ DefineIndex(Oid relationId,
stmt->unique,
stmt->nulls_not_distinct,
!concurrent,
- concurrent);
+ concurrent,
+ amissummarizing);
typeObjectId = palloc_array(Oid, numberOfAttributes);
collationObjectId = palloc_array(Oid, numberOfAttributes);
diff --git a/src/backend/executor/execIndexing.c b/src/backend/executor/execIndexing.c
index 6e88e72813..da28e5e40c 100644
--- a/src/backend/executor/execIndexing.c
+++ b/src/backend/executor/execIndexing.c
@@ -259,15 +259,24 @@ ExecCloseIndices(ResultRelInfo *resultRelInfo)
* into all the relations indexing the result relation
* when a heap tuple is inserted into the result relation.
*
- * When 'update' is true, executor is performing an UPDATE
- * that could not use an optimization like heapam's HOT (in
- * more general terms a call to table_tuple_update() took
- * place and set 'update_indexes' to true). Receiving this
- * hint makes us consider if we should pass down the
- * 'indexUnchanged' hint in turn. That's something that we
- * figure out for each index_insert() call iff 'update' is
- * true. (When 'update' is false we already know not to pass
- * the hint to any index.)
+ * When 'update' is true and 'onlySummarizing' is false,
+ * executor is performing an UPDATE that could not use an
+ * optimization like heapam's HOT (in more general terms a
+ * call to table_tuple_update() took place and set
+ * 'update_indexes' to TU_All). Receiving this hint makes
+ * us consider if we should pass down the 'indexUnchanged'
+ * hint in turn. That's something that we figure out for
+ * each index_insert() call iff 'update' is true.
+ * (When 'update' is false we already know not to pass the
+ * hint to any index.)
+ *
+ * If onlySummarizing is set, an equivalent optimization to
+ * HOT has been applied and any updated columns are indexed
+ * only by summarizing indexes (or in more general terms a
+ * call to table_tuple_update() took place and set
+ * 'update_indexes' to TU_Summarizing). We can (and must)
+ * therefore only update the indexes that have
+ * 'amsummarizing' = true.
*
* Unique and exclusion constraints are enforced at the same
* time. This returns a list of index OIDs for any unique or
@@ -287,7 +296,8 @@ ExecInsertIndexTuples(ResultRelInfo *resultRelInfo,
bool update,
bool noDupErr,
bool *specConflict,
- List *arbiterIndexes)
+ List *arbiterIndexes,
+ bool onlySummarizing)
{
ItemPointer tupleid = &slot->tts_tid;
List *result = NIL;
@@ -343,6 +353,13 @@ ExecInsertIndexTuples(ResultRelInfo *resultRelInfo,
if (!indexInfo->ii_ReadyForInserts)
continue;
+ /*
+ * Skip processing of non-summarizing indexes if we only
+ * update summarizing indexes
+ */
+ if (onlySummarizing && !indexInfo->ii_Summarizing)
+ continue;
+
/* Check for partial index */
if (indexInfo->ii_Predicate != NIL)
{
diff --git a/src/backend/executor/execReplication.c b/src/backend/executor/execReplication.c
index c484f5c301..4c01ef63cb 100644
--- a/src/backend/executor/execReplication.c
+++ b/src/backend/executor/execReplication.c
@@ -445,7 +445,7 @@ ExecSimpleRelationInsert(ResultRelInfo *resultRelInfo,
if (resultRelInfo->ri_NumIndices > 0)
recheckIndexes = ExecInsertIndexTuples(resultRelInfo,
slot, estate, false, false,
- NULL, NIL);
+ NULL, NIL, false);
/* AFTER ROW INSERT Triggers */
ExecARInsertTriggers(estate, resultRelInfo, slot,
@@ -493,7 +493,7 @@ ExecSimpleRelationUpdate(ResultRelInfo *resultRelInfo,
if (!skip_tuple)
{
List *recheckIndexes = NIL;
- bool update_indexes;
+ TU_UpdateIndexes update_indexes;
/* Compute stored generated columns */
if (rel->rd_att->constr &&
@@ -510,10 +510,11 @@ ExecSimpleRelationUpdate(ResultRelInfo *resultRelInfo,
simple_table_tuple_update(rel, tid, slot, estate->es_snapshot,
&update_indexes);
- if (resultRelInfo->ri_NumIndices > 0 && update_indexes)
+ if (resultRelInfo->ri_NumIndices > 0 && update_indexes != TU_None)
recheckIndexes = ExecInsertIndexTuples(resultRelInfo,
slot, estate, true, false,
- NULL, NIL);
+ NULL, NIL,
+ update_indexes == TU_Summarizing);
/* AFTER ROW UPDATE Triggers */
ExecARUpdateTriggers(estate, resultRelInfo,
diff --git a/src/backend/executor/nodeModifyTable.c b/src/backend/executor/nodeModifyTable.c
index a94d7f86e5..099ccda95a 100644
--- a/src/backend/executor/nodeModifyTable.c
+++ b/src/backend/executor/nodeModifyTable.c
@@ -125,8 +125,8 @@ typedef struct ModifyTableContext
typedef struct UpdateContext
{
bool updated; /* did UPDATE actually occur? */
- bool updateIndexes; /* index update required? */
bool crossPartUpdate; /* was it a cross-partition update? */
+ TU_UpdateIndexes updateIndexes; /* Which index updates are required? */
/*
* Lock mode to acquire on the latest tuple version before performing
@@ -1106,7 +1106,8 @@ ExecInsert(ModifyTableContext *context,
recheckIndexes = ExecInsertIndexTuples(resultRelInfo,
slot, estate, false, true,
&specConflict,
- arbiterIndexes);
+ arbiterIndexes,
+ false);
/* adjust the tuple's state accordingly */
table_tuple_complete_speculative(resultRelationDesc, slot,
@@ -1145,7 +1146,8 @@ ExecInsert(ModifyTableContext *context,
if (resultRelInfo->ri_NumIndices > 0)
recheckIndexes = ExecInsertIndexTuples(resultRelInfo,
slot, estate, false,
- false, NULL, NIL);
+ false, NULL, NIL,
+ false);
}
}
@@ -2108,11 +2110,12 @@ ExecUpdateEpilogue(ModifyTableContext *context, UpdateContext *updateCxt,
ModifyTableState *mtstate = context->mtstate;
/* insert index entries for tuple if necessary */
- if (resultRelInfo->ri_NumIndices > 0 && updateCxt->updateIndexes)
+ if (resultRelInfo->ri_NumIndices > 0 && updateCxt->updateIndexes != TU_None)
recheckIndexes = ExecInsertIndexTuples(resultRelInfo,
slot, context->estate,
true, false,
- NULL, NIL);
+ NULL, NIL,
+ updateCxt->updateIndexes == TU_Summarizing);
/* AFTER ROW UPDATE Triggers */
ExecARUpdateTriggers(context->estate, resultRelInfo,
diff --git a/src/backend/nodes/makefuncs.c b/src/backend/nodes/makefuncs.c
index fe67baf142..f23f8b7349 100644
--- a/src/backend/nodes/makefuncs.c
+++ b/src/backend/nodes/makefuncs.c
@@ -743,7 +743,8 @@ make_ands_implicit(Expr *clause)
*/
IndexInfo *
makeIndexInfo(int numattrs, int numkeyattrs, Oid amoid, List *expressions,
- List *predicates, bool unique, bool nulls_not_distinct, bool isready, bool concurrent)
+ List *predicates, bool unique, bool nulls_not_distinct,
+ bool isready, bool concurrent, bool summarizing)
{
IndexInfo *n = makeNode(IndexInfo);
@@ -757,6 +758,10 @@ makeIndexInfo(int numattrs, int numkeyattrs, Oid amoid, List *expressions,
n->ii_CheckedUnchanged = false;
n->ii_IndexUnchanged = false;
n->ii_Concurrent = concurrent;
+ n->ii_Summarizing = summarizing;
+
+ /* summarizing indexes cannot contain non-key attributes */
+ Assert(!summarizing || numkeyattrs == numattrs);
/* expressions */
n->ii_Expressions = expressions;
diff --git a/src/backend/utils/cache/relcache.c b/src/backend/utils/cache/relcache.c
index 13f7987373..cd0f6e2a5e 100644
--- a/src/backend/utils/cache/relcache.c
+++ b/src/backend/utils/cache/relcache.c
@@ -2440,10 +2440,11 @@ RelationDestroyRelation(Relation relation, bool remember_tupdesc)
list_free_deep(relation->rd_fkeylist);
list_free(relation->rd_indexlist);
list_free(relation->rd_statlist);
- bms_free(relation->rd_indexattr);
bms_free(relation->rd_keyattr);
bms_free(relation->rd_pkattr);
bms_free(relation->rd_idattr);
+ bms_free(relation->rd_hotblockingattr);
+ bms_free(relation->rd_summarizedattr);
if (relation->rd_pubdesc)
pfree(relation->rd_pubdesc);
if (relation->rd_options)
@@ -5167,10 +5168,11 @@ RelationGetIndexPredicate(Relation relation)
Bitmapset *
RelationGetIndexAttrBitmap(Relation relation, IndexAttrBitmapKind attrKind)
{
- Bitmapset *indexattrs; /* indexed columns */
Bitmapset *uindexattrs; /* columns in unique indexes */
Bitmapset *pkindexattrs; /* columns in the primary index */
Bitmapset *idindexattrs; /* columns in the replica identity */
+ Bitmapset *hotblockingattrs; /* columns with HOT blocking indexes */
+ Bitmapset *summarizedattrs; /* columns with summarizing indexes */
List *indexoidlist;
List *newindexoidlist;
Oid relpkindex;
@@ -5179,18 +5181,20 @@ RelationGetIndexAttrBitmap(Relation relation, IndexAttrBitmapKind attrKind)
MemoryContext oldcxt;
/* Quick exit if we already computed the result. */
- if (relation->rd_indexattr != NULL)
+ if (relation->rd_attrsvalid)
{
switch (attrKind)
{
- case INDEX_ATTR_BITMAP_ALL:
- return bms_copy(relation->rd_indexattr);
case INDEX_ATTR_BITMAP_KEY:
return bms_copy(relation->rd_keyattr);
case INDEX_ATTR_BITMAP_PRIMARY_KEY:
return bms_copy(relation->rd_pkattr);
case INDEX_ATTR_BITMAP_IDENTITY_KEY:
return bms_copy(relation->rd_idattr);
+ case INDEX_ATTR_BITMAP_HOT_BLOCKING:
+ return bms_copy(relation->rd_hotblockingattr);
+ case INDEX_ATTR_BITMAP_SUMMARIZED:
+ return bms_copy(relation->rd_summarizedattr);
default:
elog(ERROR, "unknown attrKind %u", attrKind);
}
@@ -5230,10 +5234,11 @@ restart:
* CONCURRENTLY is far enough along that we should ignore the index, it
* won't be returned at all by RelationGetIndexList.
*/
- indexattrs = NULL;
uindexattrs = NULL;
pkindexattrs = NULL;
idindexattrs = NULL;
+ hotblockingattrs = NULL;
+ summarizedattrs = NULL;
foreach(l, indexoidlist)
{
Oid indexOid = lfirst_oid(l);
@@ -5246,6 +5251,7 @@ restart:
bool isKey; /* candidate key */
bool isPK; /* primary key */
bool isIDKey; /* replica identity index */
+ Bitmapset **attrs;
indexDesc = index_open(indexOid, AccessShareLock);
@@ -5283,6 +5289,11 @@ restart:
/* Is this index the configured (or default) replica identity? */
isIDKey = (indexOid == relreplindex);
+ if (indexDesc->rd_indam->amsummarizing)
+ attrs = &summarizedattrs;
+ else
+ attrs = &hotblockingattrs;
+
/* Collect simple attribute references */
for (i = 0; i < indexDesc->rd_index->indnatts; i++)
{
@@ -5291,15 +5302,21 @@ restart:
/*
* Since we have covering indexes with non-key columns, we must
* handle them accurately here. non-key columns must be added into
- * indexattrs, since they are in index, and HOT-update shouldn't
- * miss them. Obviously, non-key columns couldn't be referenced by
+ * hotblockingattrs, since they are in index, and HOT-update
+ * shouldn't miss them.
+ *
+ * Summarizing indexes do not block HOT, but do need to be updated
+ * when the column value changes, thus require a separate
+ * attribute bitmapset.
+ *
+ * Obviously, non-key columns couldn't be referenced by
* foreign key or identity key. Hence we do not include them into
* uindexattrs, pkindexattrs and idindexattrs bitmaps.
*/
if (attrnum != 0)
{
- indexattrs = bms_add_member(indexattrs,
- attrnum - FirstLowInvalidHeapAttributeNumber);
+ *attrs = bms_add_member(*attrs,
+ attrnum - FirstLowInvalidHeapAttributeNumber);
if (isKey && i < indexDesc->rd_index->indnkeyatts)
uindexattrs = bms_add_member(uindexattrs,
@@ -5316,10 +5333,12 @@ restart:
}
/* Collect all attributes used in expressions, too */
- pull_varattnos(indexExpressions, 1, &indexattrs);
+ pull_varattnos(indexExpressions, 1, attrs);
- /* Collect all attributes in the index predicate, too */
- pull_varattnos(indexPredicate, 1, &indexattrs);
+ /*
+ * Collect all attributes in the index predicate, too.
+ */
+ pull_varattnos(indexPredicate, 1, attrs);
index_close(indexDesc, AccessShareLock);
}
@@ -5347,24 +5366,28 @@ restart:
bms_free(uindexattrs);
bms_free(pkindexattrs);
bms_free(idindexattrs);
- bms_free(indexattrs);
+ bms_free(hotblockingattrs);
+ bms_free(summarizedattrs);
goto restart;
}
/* Don't leak the old values of these bitmaps, if any */
- bms_free(relation->rd_indexattr);
- relation->rd_indexattr = NULL;
+ relation->rd_attrsvalid = false;
bms_free(relation->rd_keyattr);
relation->rd_keyattr = NULL;
bms_free(relation->rd_pkattr);
relation->rd_pkattr = NULL;
bms_free(relation->rd_idattr);
relation->rd_idattr = NULL;
+ bms_free(relation->rd_hotblockingattr);
+ relation->rd_hotblockingattr = NULL;
+ bms_free(relation->rd_summarizedattr);
+ relation->rd_summarizedattr = NULL;
/*
* Now save copies of the bitmaps in the relcache entry. We intentionally
- * set rd_indexattr last, because that's the one that signals validity of
+ * set rd_attrsvalid last, because that's the one that signals validity of
* the values; if we run out of memory before making that copy, we won't
* leave the relcache entry looking like the other ones are valid but
* empty.
@@ -5373,20 +5396,24 @@ restart:
relation->rd_keyattr = bms_copy(uindexattrs);
relation->rd_pkattr = bms_copy(pkindexattrs);
relation->rd_idattr = bms_copy(idindexattrs);
- relation->rd_indexattr = bms_copy(indexattrs);
+ relation->rd_hotblockingattr = bms_copy(hotblockingattrs);
+ relation->rd_summarizedattr = bms_copy(summarizedattrs);
+ relation->rd_attrsvalid = true;
MemoryContextSwitchTo(oldcxt);
/* We return our original working copy for caller to play with */
switch (attrKind)
{
- case INDEX_ATTR_BITMAP_ALL:
- return indexattrs;
case INDEX_ATTR_BITMAP_KEY:
return uindexattrs;
case INDEX_ATTR_BITMAP_PRIMARY_KEY:
return pkindexattrs;
case INDEX_ATTR_BITMAP_IDENTITY_KEY:
return idindexattrs;
+ case INDEX_ATTR_BITMAP_HOT_BLOCKING:
+ return hotblockingattrs;
+ case INDEX_ATTR_BITMAP_SUMMARIZED:
+ return summarizedattrs;
default:
elog(ERROR, "unknown attrKind %u", attrKind);
return NULL;
@@ -6307,7 +6334,7 @@ load_relcache_init_file(bool shared)
rel->rd_indexlist = NIL;
rel->rd_pkindex = InvalidOid;
rel->rd_replidindex = InvalidOid;
- rel->rd_indexattr = NULL;
+ rel->rd_attrsvalid = false;
rel->rd_keyattr = NULL;
rel->rd_pkattr = NULL;
rel->rd_idattr = NULL;
diff --git a/src/include/access/amapi.h b/src/include/access/amapi.h
index 4f1f67b4d0..281039ef67 100644
--- a/src/include/access/amapi.h
+++ b/src/include/access/amapi.h
@@ -244,6 +244,8 @@ typedef struct IndexAmRoutine
bool amcaninclude;
/* does AM use maintenance_work_mem? */
bool amusemaintenanceworkmem;
+ /* does AM store tuple information only at block granularity? */
+ bool amsummarizing;
/* OR of parallel vacuum flags. See vacuum.h for flags. */
uint8 amparallelvacuumoptions;
/* type of data stored in index, or InvalidOid if variable */
diff --git a/src/include/access/heapam.h b/src/include/access/heapam.h
index 8d74d1b7e3..faf5026519 100644
--- a/src/include/access/heapam.h
+++ b/src/include/access/heapam.h
@@ -249,7 +249,8 @@ extern void heap_abort_speculative(Relation relation, ItemPointer tid);
extern TM_Result heap_update(Relation relation, ItemPointer otid,
HeapTuple newtup,
CommandId cid, Snapshot crosscheck, bool wait,
- struct TM_FailureData *tmfd, LockTupleMode *lockmode);
+ struct TM_FailureData *tmfd, LockTupleMode *lockmode,
+ TU_UpdateIndexes *update_indexes);
extern TM_Result heap_lock_tuple(Relation relation, HeapTuple tuple,
CommandId cid, LockTupleMode mode, LockWaitPolicy wait_policy,
bool follow_updates,
@@ -275,7 +276,7 @@ extern bool heap_tuple_needs_eventual_freeze(HeapTupleHeader tuple);
extern void simple_heap_insert(Relation relation, HeapTuple tup);
extern void simple_heap_delete(Relation relation, ItemPointer tid);
extern void simple_heap_update(Relation relation, ItemPointer otid,
- HeapTuple tup);
+ HeapTuple tup, TU_UpdateIndexes *update_indexes);
extern TransactionId heap_index_delete_tuples(Relation rel,
TM_IndexDeleteOp *delstate);
diff --git a/src/include/access/tableam.h b/src/include/access/tableam.h
index 652e96f1b0..f31d7693ec 100644
--- a/src/include/access/tableam.h
+++ b/src/include/access/tableam.h
@@ -102,6 +102,19 @@ typedef enum TM_Result
TM_WouldBlock
} TM_Result;
+/*
+ * Result codes for table_update(..., update_indexes*..).
+ * Used to determine which indexes to update.
+ */
+typedef enum TU_UpdateIndexes {
+ /* No indexed columns were updated (incl. TID addressing of tuple) */
+ TU_None = 0,
+ /* A non-summarizing indexed column was updated, or the TID has changed */
+ TU_All = 1,
+ /* Only summarized columns were updated, TID is unchanged */
+ TU_Summarizing = 2
+} TU_UpdateIndexes;
+
/*
* When table_tuple_update, table_tuple_delete, or table_tuple_lock fail
* because the target tuple is already outdated, they fill in this struct to
@@ -526,7 +539,7 @@ typedef struct TableAmRoutine
bool wait,
TM_FailureData *tmfd,
LockTupleMode *lockmode,
- bool *update_indexes);
+ TU_UpdateIndexes *update_indexes);
/* see table_tuple_lock() for reference about parameters */
TM_Result (*tuple_lock) (Relation rel,
@@ -1514,7 +1527,7 @@ static inline TM_Result
table_tuple_update(Relation rel, ItemPointer otid, TupleTableSlot *slot,
CommandId cid, Snapshot snapshot, Snapshot crosscheck,
bool wait, TM_FailureData *tmfd, LockTupleMode *lockmode,
- bool *update_indexes)
+ TU_UpdateIndexes *update_indexes)
{
return rel->rd_tableam->tuple_update(rel, otid, slot,
cid, snapshot, crosscheck,
@@ -2038,7 +2051,7 @@ extern void simple_table_tuple_delete(Relation rel, ItemPointer tid,
Snapshot snapshot);
extern void simple_table_tuple_update(Relation rel, ItemPointer otid,
TupleTableSlot *slot, Snapshot snapshot,
- bool *update_indexes);
+ TU_UpdateIndexes *update_indexes);
/* ----------------------------------------------------------------------------
diff --git a/src/include/executor/executor.h b/src/include/executor/executor.h
index e7e25c057e..551c5d7ae0 100644
--- a/src/include/executor/executor.h
+++ b/src/include/executor/executor.h
@@ -620,7 +620,8 @@ extern List *ExecInsertIndexTuples(ResultRelInfo *resultRelInfo,
TupleTableSlot *slot, EState *estate,
bool update,
bool noDupErr,
- bool *specConflict, List *arbiterIndexes);
+ bool *specConflict, List *arbiterIndexes,
+ bool onlySummarizing);
extern bool ExecCheckIndexConstraints(ResultRelInfo *resultRelInfo,
TupleTableSlot *slot,
EState *estate, ItemPointer conflictTid,
diff --git a/src/include/nodes/execnodes.h b/src/include/nodes/execnodes.h
index 20f4c8b35f..3f1b8818a1 100644
--- a/src/include/nodes/execnodes.h
+++ b/src/include/nodes/execnodes.h
@@ -161,6 +161,7 @@ typedef struct ExprState
* IndexUnchanged aminsert hint, cached for retail inserts
* Concurrent are we doing a concurrent index build?
* BrokenHotChain did we detect any broken HOT chains?
+ * Summarizing is it summarizing?
* ParallelWorkers # of workers requested (excludes leader)
* Am Oid of index AM
* AmCache private cache area for index AM
@@ -194,6 +195,7 @@ typedef struct IndexInfo
bool ii_IndexUnchanged;
bool ii_Concurrent;
bool ii_BrokenHotChain;
+ bool ii_Summarizing;
int ii_ParallelWorkers;
Oid ii_Am;
void *ii_AmCache;
diff --git a/src/include/nodes/makefuncs.h b/src/include/nodes/makefuncs.h
index 80f1d5336b..64651c9b00 100644
--- a/src/include/nodes/makefuncs.h
+++ b/src/include/nodes/makefuncs.h
@@ -96,7 +96,9 @@ extern List *make_ands_implicit(Expr *clause);
extern IndexInfo *makeIndexInfo(int numattrs, int numkeyattrs, Oid amoid,
List *expressions, List *predicates,
- bool unique, bool nulls_not_distinct, bool isready, bool concurrent);
+ bool unique, bool nulls_not_distinct,
+ bool isready, bool concurrent,
+ bool summarizing);
extern DefElem *makeDefElem(char *name, Node *arg, int location);
extern DefElem *makeDefElemExtended(char *nameSpace, char *name, Node *arg,
diff --git a/src/include/utils/rel.h b/src/include/utils/rel.h
index 67f994cb3e..c0ddddb2f0 100644
--- a/src/include/utils/rel.h
+++ b/src/include/utils/rel.h
@@ -156,10 +156,12 @@ typedef struct RelationData
List *rd_statlist; /* list of OIDs of extended stats */
/* data managed by RelationGetIndexAttrBitmap: */
- Bitmapset *rd_indexattr; /* identifies columns used in indexes */
+ bool rd_attrsvalid; /* are bitmaps of attrs valid? */
Bitmapset *rd_keyattr; /* cols that can be ref'd by foreign keys */
Bitmapset *rd_pkattr; /* cols included in primary key */
Bitmapset *rd_idattr; /* included in replica identity index */
+ Bitmapset *rd_hotblockingattr; /* cols blocking HOT update */
+ Bitmapset *rd_summarizedattr; /* cols indexed by summarizing indexes */
PublicationDesc *rd_pubdesc; /* publication descriptor, or NULL */
diff --git a/src/include/utils/relcache.h b/src/include/utils/relcache.h
index 88460f21c5..beeb28b83c 100644
--- a/src/include/utils/relcache.h
+++ b/src/include/utils/relcache.h
@@ -56,10 +56,11 @@ extern bytea **RelationGetIndexAttOptions(Relation relation, bool copy);
typedef enum IndexAttrBitmapKind
{
- INDEX_ATTR_BITMAP_ALL,
INDEX_ATTR_BITMAP_KEY,
INDEX_ATTR_BITMAP_PRIMARY_KEY,
- INDEX_ATTR_BITMAP_IDENTITY_KEY
+ INDEX_ATTR_BITMAP_IDENTITY_KEY,
+ INDEX_ATTR_BITMAP_HOT_BLOCKING,
+ INDEX_ATTR_BITMAP_SUMMARIZED
} IndexAttrBitmapKind;
extern Bitmapset *RelationGetIndexAttrBitmap(Relation relation,
diff --git a/src/test/modules/dummy_index_am/dummy_index_am.c b/src/test/modules/dummy_index_am/dummy_index_am.c
index dfb1ebb846..c14e0abe0c 100644
--- a/src/test/modules/dummy_index_am/dummy_index_am.c
+++ b/src/test/modules/dummy_index_am/dummy_index_am.c
@@ -296,6 +296,7 @@ dihandler(PG_FUNCTION_ARGS)
amroutine->amcanparallel = false;
amroutine->amcaninclude = false;
amroutine->amusemaintenanceworkmem = false;
+ amroutine->amsummarizing = false;
amroutine->amparallelvacuumoptions = VACUUM_OPTION_NO_PARALLEL;
amroutine->amkeytype = InvalidOid;
diff --git a/src/test/regress/expected/stats.out b/src/test/regress/expected/stats.out
index 937b2101b3..edc26dea97 100644
--- a/src/test/regress/expected/stats.out
+++ b/src/test/regress/expected/stats.out
@@ -1354,4 +1354,115 @@ SELECT :io_stats_post_reset < :io_stats_pre_reset;
t
(1 row)
+-- test BRIN index doesn't block HOT update
+CREATE TABLE brin_hot (
+ id integer PRIMARY KEY,
+ val integer NOT NULL
+) WITH (autovacuum_enabled = off, fillfactor = 70);
+INSERT INTO brin_hot SELECT *, 0 FROM generate_series(1, 235);
+CREATE INDEX val_brin ON brin_hot using brin(val);
+CREATE FUNCTION wait_for_hot_stats() RETURNS void AS $$
+DECLARE
+ start_time timestamptz := clock_timestamp();
+ updated bool;
+BEGIN
+ -- we don't want to wait forever; loop will exit after 30 seconds
+ FOR i IN 1 .. 300 LOOP
+ SELECT (pg_stat_get_tuples_hot_updated('brin_hot'::regclass::oid) > 0) INTO updated;
+ EXIT WHEN updated;
+
+ -- wait a little
+ PERFORM pg_sleep_for('100 milliseconds');
+ -- reset stats snapshot so we can test again
+ PERFORM pg_stat_clear_snapshot();
+ END LOOP;
+ -- report time waited in postmaster log (where it won't change test output)
+ RAISE log 'wait_for_hot_stats delayed % seconds',
+ EXTRACT(epoch FROM clock_timestamp() - start_time);
+END
+$$ LANGUAGE plpgsql;
+UPDATE brin_hot SET val = -3 WHERE id = 42;
+-- We can't just call wait_for_hot_stats() at this point, because we only
+-- transmit stats when the session goes idle, and we probably didn't
+-- transmit the last couple of counts yet thanks to the rate-limiting logic
+-- in pgstat_report_stat(). But instead of waiting for the rate limiter's
+-- timeout to elapse, let's just start a new session. The old one will
+-- then send its stats before dying.
+\c -
+SELECT wait_for_hot_stats();
+ wait_for_hot_stats
+--------------------
+
+(1 row)
+
+SELECT pg_stat_get_tuples_hot_updated('brin_hot'::regclass::oid);
+ pg_stat_get_tuples_hot_updated
+--------------------------------
+ 1
+(1 row)
+
+DROP TABLE brin_hot;
+DROP FUNCTION wait_for_hot_stats();
+-- Test handling of index predicates - updating attributes in predicates
+-- should not block HOT when summarizing indexes are involved. We update
+-- a row that was not indexed due to the index predicate, and becomes
+-- indexable - the HOT-updated tuple is forwarded to the BRIN index.
+CREATE TABLE brin_hot_2 (a int, b int);
+INSERT INTO brin_hot_2 VALUES (1, 100);
+CREATE INDEX ON brin_hot_2 USING brin (b) WHERE a = 2;
+UPDATE brin_hot_2 SET a = 2;
+EXPLAIN (COSTS OFF) SELECT * FROM brin_hot_2 WHERE a = 2 AND b = 100;
+ QUERY PLAN
+-----------------------------------
+ Seq Scan on brin_hot_2
+ Filter: ((a = 2) AND (b = 100))
+(2 rows)
+
+SELECT COUNT(*) FROM brin_hot_2 WHERE a = 2 AND b = 100;
+ count
+-------
+ 1
+(1 row)
+
+SET enable_seqscan = off;
+EXPLAIN (COSTS OFF) SELECT * FROM brin_hot_2 WHERE a = 2 AND b = 100;
+ QUERY PLAN
+---------------------------------------------
+ Bitmap Heap Scan on brin_hot_2
+ Recheck Cond: ((b = 100) AND (a = 2))
+ -> Bitmap Index Scan on brin_hot_2_b_idx
+ Index Cond: (b = 100)
+(4 rows)
+
+SELECT COUNT(*) FROM brin_hot_2 WHERE a = 2 AND b = 100;
+ count
+-------
+ 1
+(1 row)
+
+DROP TABLE brin_hot_2;
+-- Test that updates to indexed columns are still propagated to the
+-- BRIN column.
+-- https://postgr.es/m/05ebcb44-f383-86e3-4f31-0a97a55634cf@enterprisedb.com
+CREATE TABLE brin_hot_3 (a int, filler text) WITH (fillfactor = 10);
+INSERT INTO brin_hot_3 SELECT 1, repeat(' ', 500) FROM generate_series(1, 20);
+CREATE INDEX ON brin_hot_3 USING brin (a) WITH (pages_per_range = 1);
+UPDATE brin_hot_3 SET a = 2;
+EXPLAIN (COSTS OFF) SELECT * FROM brin_hot_3 WHERE a = 2;
+ QUERY PLAN
+---------------------------------------------
+ Bitmap Heap Scan on brin_hot_3
+ Recheck Cond: (a = 2)
+ -> Bitmap Index Scan on brin_hot_3_a_idx
+ Index Cond: (a = 2)
+(4 rows)
+
+SELECT COUNT(*) FROM brin_hot_3 WHERE a = 2;
+ count
+-------
+ 20
+(1 row)
+
+DROP TABLE brin_hot_3;
+SET enable_seqscan = on;
-- End of Stats Test
diff --git a/src/test/regress/sql/stats.sql b/src/test/regress/sql/stats.sql
index 74e592aa8a..67e0473021 100644
--- a/src/test/regress/sql/stats.sql
+++ b/src/test/regress/sql/stats.sql
@@ -535,7 +535,6 @@ SET enable_seqscan TO on;
SELECT pg_stat_get_replication_slot(NULL);
SELECT pg_stat_get_subscription_stats(NULL);
-
-- Test that the following operations are tracked in pg_stat_io:
-- - reads of target blocks into shared buffers
-- - writes of shared buffers to permanent storage
@@ -678,4 +677,86 @@ SELECT sum(evictions) + sum(reuses) + sum(extends) + sum(fsyncs) + sum(reads) +
FROM pg_stat_io \gset
SELECT :io_stats_post_reset < :io_stats_pre_reset;
+
+-- test BRIN index doesn't block HOT update
+CREATE TABLE brin_hot (
+ id integer PRIMARY KEY,
+ val integer NOT NULL
+) WITH (autovacuum_enabled = off, fillfactor = 70);
+
+INSERT INTO brin_hot SELECT *, 0 FROM generate_series(1, 235);
+CREATE INDEX val_brin ON brin_hot using brin(val);
+
+CREATE FUNCTION wait_for_hot_stats() RETURNS void AS $$
+DECLARE
+ start_time timestamptz := clock_timestamp();
+ updated bool;
+BEGIN
+ -- we don't want to wait forever; loop will exit after 30 seconds
+ FOR i IN 1 .. 300 LOOP
+ SELECT (pg_stat_get_tuples_hot_updated('brin_hot'::regclass::oid) > 0) INTO updated;
+ EXIT WHEN updated;
+
+ -- wait a little
+ PERFORM pg_sleep_for('100 milliseconds');
+ -- reset stats snapshot so we can test again
+ PERFORM pg_stat_clear_snapshot();
+ END LOOP;
+ -- report time waited in postmaster log (where it won't change test output)
+ RAISE log 'wait_for_hot_stats delayed % seconds',
+ EXTRACT(epoch FROM clock_timestamp() - start_time);
+END
+$$ LANGUAGE plpgsql;
+
+UPDATE brin_hot SET val = -3 WHERE id = 42;
+
+-- We can't just call wait_for_hot_stats() at this point, because we only
+-- transmit stats when the session goes idle, and we probably didn't
+-- transmit the last couple of counts yet thanks to the rate-limiting logic
+-- in pgstat_report_stat(). But instead of waiting for the rate limiter's
+-- timeout to elapse, let's just start a new session. The old one will
+-- then send its stats before dying.
+\c -
+
+SELECT wait_for_hot_stats();
+SELECT pg_stat_get_tuples_hot_updated('brin_hot'::regclass::oid);
+
+DROP TABLE brin_hot;
+DROP FUNCTION wait_for_hot_stats();
+
+-- Test handling of index predicates - updating attributes in predicates
+-- should not block HOT when summarizing indexes are involved. We update
+-- a row that was not indexed due to the index predicate, and becomes
+-- indexable - the HOT-updated tuple is forwarded to the BRIN index.
+CREATE TABLE brin_hot_2 (a int, b int);
+INSERT INTO brin_hot_2 VALUES (1, 100);
+CREATE INDEX ON brin_hot_2 USING brin (b) WHERE a = 2;
+
+UPDATE brin_hot_2 SET a = 2;
+
+EXPLAIN (COSTS OFF) SELECT * FROM brin_hot_2 WHERE a = 2 AND b = 100;
+SELECT COUNT(*) FROM brin_hot_2 WHERE a = 2 AND b = 100;
+
+SET enable_seqscan = off;
+
+EXPLAIN (COSTS OFF) SELECT * FROM brin_hot_2 WHERE a = 2 AND b = 100;
+SELECT COUNT(*) FROM brin_hot_2 WHERE a = 2 AND b = 100;
+
+DROP TABLE brin_hot_2;
+
+-- Test that updates to indexed columns are still propagated to the
+-- BRIN column.
+-- https://postgr.es/m/05ebcb44-f383-86e3-4f31-0a97a55634cf@enterprisedb.com
+CREATE TABLE brin_hot_3 (a int, filler text) WITH (fillfactor = 10);
+INSERT INTO brin_hot_3 SELECT 1, repeat(' ', 500) FROM generate_series(1, 20);
+CREATE INDEX ON brin_hot_3 USING brin (a) WITH (pages_per_range = 1);
+UPDATE brin_hot_3 SET a = 2;
+
+EXPLAIN (COSTS OFF) SELECT * FROM brin_hot_3 WHERE a = 2;
+SELECT COUNT(*) FROM brin_hot_3 WHERE a = 2;
+
+DROP TABLE brin_hot_3;
+
+SET enable_seqscan = on;
+
-- End of Stats Test
--
2.39.0
On Wed, 22 Feb 2023 at 14:14, Matthias van de Meent
<boekewurm+postgres@gmail.com> wrote:
On Wed, 22 Feb 2023 at 13:15, Tomas Vondra
<tomas.vondra@enterprisedb.com> wrote:
On 2/20/23 19:15, Matthias van de Meent wrote:
Thanks. Based on feedback, attached is v2 of the patch, with these
significant changes:
- We don't store the columns we mention in predicates of summarized
indexes in the hotblocking column anymore; they are stored in the
summarized columns bitmap instead. This further reduces the chance of
failing to apply HOT with summarizing indexes.
Interesting idea. I need to think about the correctness, but AFAICS it
should work. Do we have any tests covering such cases?
There is a test that checks that an update to the predicated column
does update the index (on table brin_hot_2). However, the description
was out of date, so I've updated that in v4.
- The heaptuple header bit for summarized update in inserted tuples is
replaced with passing an out parameter. This simplifies the logic and
decreases the chances of accidentally storing incorrect data.
OK.
0002 proposes a minor RelationGetIndexPredicate() tweak, getting rid of
the repeated if/else branches. Feel free to discard it if you think the
v2 approach is better.
I agree that this is better; it's included in v4 of the patch, as attached.
I think that the v4 patch solves all comments up to now; and
considering that most of this patch was committed but then reverted
due to an issue in v15, and that said issue is fixed in this patch,
I'm marking this as ready for committer.
Tomas, would you be up for that?
Kind regards,
Matthias van de Meent
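
For readers skimming the thread, a minimal sketch of the out-parameter
idea described in the message above. This is a self-contained toy in
plain C, not the patch itself: the real heap_update() works with
attribute bitmapsets, and it also reports TU_All whenever the update
cannot be performed as HOT for other reasons (e.g. lack of free space
on the page); only the column-based part of the decision is modelled
here.

#include <stdio.h>
#include <stdbool.h>

/* Toy model of the three-way signal: which indexes still need to be
 * updated after a heap update.  The enum values mirror the patch. */
typedef enum
{
    TU_None = 0,        /* HOT update, no index maintenance at all */
    TU_All = 1,         /* regular update, insert into every index */
    TU_Summarizing = 2  /* HOT update, but summarizing (BRIN) indexes
                         * must still be refreshed */
} TU_UpdateIndexes;

static TU_UpdateIndexes
decide(bool modified_hot_blocking_col, bool modified_summarized_col)
{
    if (modified_hot_blocking_col)
        return TU_All;
    if (modified_summarized_col)
        return TU_Summarizing;
    return TU_None;
}

int
main(void)
{
    /* prints "1 2 0", i.e. TU_All, TU_Summarizing, TU_None */
    printf("%d %d %d\n",
           decide(true, false),
           decide(false, true),
           decide(false, false));
    return 0;
}
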
On 3/8/23 23:31, Matthias van de Meent wrote:
On Wed, 22 Feb 2023 at 14:14, Matthias van de Meent
<boekewurm+postgres@gmail.com> wrote:
On Wed, 22 Feb 2023 at 13:15, Tomas Vondra
<tomas.vondra@enterprisedb.com> wrote:
On 2/20/23 19:15, Matthias van de Meent wrote:
Thanks. Based on feedback, attached is v2 of the patch, with these
significant changes:
- We don't store the columns we mention in predicates of summarized
indexes in the hotblocking column anymore; they are stored in the
summarized columns bitmap instead. This further reduces the chance of
failing to apply HOT with summarizing indexes.
Interesting idea. I need to think about the correctness, but AFAICS it
should work. Do we have any tests covering such cases?
There is a test that checks that an update to the predicated column
does update the index (on table brin_hot_2). However, the description
was out of date, so I've updated that in v4.
- The heaptuple header bit for summarized update in inserted tuples is
replaced with passing an out parameter. This simplifies the logic and
decreases the chances of accidentally storing incorrect data.
OK.
0002 proposes a minor RelationGetIndexPredicate() tweak, getting rid of
the repeated if/else branches. Feel free to discard it if you think the
v2 approach is better.
I agree that this is better; it's included in v4 of the patch, as attached.
I think that the v4 patch solves all comments up to now; and
considering that most of this patch was committed but then reverted
due to an issue in v15, and that said issue is fixed in this patch,
I'm marking this as ready for committer.
Tomas, would you be up for that?
Thanks for the patch. I started looking at it yesterday, and I think
it's 99% RFC. I think it's correct and I only have some minor comments
(see the 0002 patch):
1) There were still a couple of minor wording issues in the sgml docs.
2) bikeshedding: I added a bunch of "()" to various conditions; I think
it makes them clearer.
3) This seems like a rather odd way to write a conditional Assert:
    if (onlySummarized)
        Assert(HeapTupleIsHeapOnly(heapTuple));
Would it be better to do a composed Assert(!(onlySummarized && !...))
or something similar? (A toy illustration follows this list.)
4) A couple comments and minor tweaks.
5) Undoing a couple unnecessary changes (whitespace, ...).
6) Proper formatting of TU_UpdateIndexes enum.
7) Comment in RelationGetIndexAttrBitmap() is misleading, as it still
references hotblockingattrs, even though it may update summarizedattrs
in some cases.
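To make (3) concrete, here is a self-contained toy program (not
PostgreSQL code; check() and its boolean parameters are hypothetical
stand-ins for onlySummarized and HeapTupleIsHeapOnly(heapTuple) in
CatalogIndexInsert()) showing that the conditional form and the
composed form fire under exactly the same condition:

#include <assert.h>
#include <stdbool.h>

static void
check(bool only_summarized, bool heap_only)
{
    /* conditional form, as in the v5-0001 patch */
    if (only_summarized)
        assert(heap_only);

    /* composed form: "only-summarized implies heap-only" */
    assert(!(only_summarized && !heap_only));
}

int
main(void)
{
    check(false, false);
    check(false, true);
    check(true, true);
    /* check(true, false) would trip both assertions */
    return 0;
}

The composed spelling reads as "only updating summarizing indexes
implies a heap-only tuple", which seems to match the intent of the
check.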
If you agree with these changes, I'll get it committed.
regards
--
Tomas Vondra
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company
Attachments:
v5-0001-Ignore-BRIN-indexes-when-checking-for-HOT-updates.patch (text/x-patch)
From 4459b7b325843d2dfcfad42d645ae5d3b47d784d Mon Sep 17 00:00:00 2001
From: Tomas Vondra <tomas@2ndquadrant.com>
Date: Tue, 14 Mar 2023 01:11:38 +0100
Subject: [PATCH v5 1/2] Ignore BRIN indexes when checking for HOT updates
When determining whether an index update may be skipped by using HOT, we
can ignore attributes indexed by block summarizing indexes without
references to individual tuples that need to be cleaned up.
This also removes rd_indexattr list, and replaces it with rd_attrsvalid
flag. The list was not used anywhere, and a simple flag is sufficient.
A new type TU_UpdateIndexes is invented to provide a signal to the executor
to determine which indexes to update - no indexes, all indexes, or only
the summarizing indexes.
One otherwise unused bit in the heap tuple header is (ab)used to signal
that the HOT update would still update at least one summarizing index.
The bit is cleared immediately
Original patch by Josef Simanek, various fixes and improvements by
Tomas Vondra and me.
Authors: Josef Simanek, Tomas Vondra, Matthias van de Meent
Reviewed-by: Tomas Vondra, Alvaro Herrera
---
doc/src/sgml/indexam.sgml | 13 ++
src/backend/access/brin/brin.c | 1 +
src/backend/access/gin/ginutil.c | 1 +
src/backend/access/gist/gist.c | 1 +
src/backend/access/hash/hash.c | 1 +
src/backend/access/heap/heapam.c | 48 +++++++-
src/backend/access/heap/heapam_handler.c | 19 ++-
src/backend/access/nbtree/nbtree.c | 1 +
src/backend/access/spgist/spgutils.c | 1 +
src/backend/access/table/tableam.c | 2 +-
src/backend/catalog/index.c | 9 +-
src/backend/catalog/indexing.c | 35 ++++--
src/backend/commands/copyfrom.c | 5 +-
src/backend/commands/indexcmds.c | 10 +-
src/backend/executor/execIndexing.c | 37 ++++--
src/backend/executor/execReplication.c | 9 +-
src/backend/executor/nodeModifyTable.c | 13 +-
src/backend/nodes/makefuncs.c | 7 +-
src/backend/utils/cache/relcache.c | 69 +++++++----
src/include/access/amapi.h | 2 +
src/include/access/heapam.h | 5 +-
src/include/access/tableam.h | 19 ++-
src/include/executor/executor.h | 3 +-
src/include/nodes/execnodes.h | 2 +
src/include/nodes/makefuncs.h | 4 +-
src/include/utils/rel.h | 4 +-
src/include/utils/relcache.h | 5 +-
.../modules/dummy_index_am/dummy_index_am.c | 1 +
src/test/regress/expected/stats.out | 111 ++++++++++++++++++
src/test/regress/sql/stats.sql | 83 ++++++++++++-
30 files changed, 443 insertions(+), 78 deletions(-)
diff --git a/doc/src/sgml/indexam.sgml b/doc/src/sgml/indexam.sgml
index 4f83970c85..897419ec95 100644
--- a/doc/src/sgml/indexam.sgml
+++ b/doc/src/sgml/indexam.sgml
@@ -127,6 +127,9 @@ typedef struct IndexAmRoutine
bool amcaninclude;
/* does AM use maintenance_work_mem? */
bool amusemaintenanceworkmem;
+ /* does AM summarize tuples, with at least all tuples in the block
+ * summarized in one summary */
+ bool amsummarizing;
/* OR of parallel vacuum flags */
uint8 amparallelvacuumoptions;
/* type of data stored in index, or InvalidOid if variable */
@@ -247,6 +250,16 @@ typedef struct IndexAmRoutine
null, independently of <structfield>amoptionalkey</structfield>.
</para>
+ <para>
+ The <structfield>amsummarizing</structfield> flag indicates whether the
+ access method summarizes the indexed tuples, with summarizing granularity
+ of at least per block.
+ Access methods that do not point to individual tuples, but to (like
+ <acronym>BRIN</acronym>), may allow the <acronym>HOT</acronym> optimization
+ to continue. This does not apply to attributes referenced in index
+ predicates, an update of such attribute always disables <acronym>HOT</acronym>.
+ </para>
+
</sect1>
<sect1 id="index-functions">
diff --git a/src/backend/access/brin/brin.c b/src/backend/access/brin/brin.c
index b5a5fa7b33..53e4721a54 100644
--- a/src/backend/access/brin/brin.c
+++ b/src/backend/access/brin/brin.c
@@ -109,6 +109,7 @@ brinhandler(PG_FUNCTION_ARGS)
amroutine->amcanparallel = false;
amroutine->amcaninclude = false;
amroutine->amusemaintenanceworkmem = false;
+ amroutine->amsummarizing = true;
amroutine->amparallelvacuumoptions =
VACUUM_OPTION_PARALLEL_CLEANUP;
amroutine->amkeytype = InvalidOid;
diff --git a/src/backend/access/gin/ginutil.c b/src/backend/access/gin/ginutil.c
index f05128ecf5..03fec1704e 100644
--- a/src/backend/access/gin/ginutil.c
+++ b/src/backend/access/gin/ginutil.c
@@ -56,6 +56,7 @@ ginhandler(PG_FUNCTION_ARGS)
amroutine->amcanparallel = false;
amroutine->amcaninclude = false;
amroutine->amusemaintenanceworkmem = true;
+ amroutine->amsummarizing = false;
amroutine->amparallelvacuumoptions =
VACUUM_OPTION_PARALLEL_BULKDEL | VACUUM_OPTION_PARALLEL_CLEANUP;
amroutine->amkeytype = InvalidOid;
diff --git a/src/backend/access/gist/gist.c b/src/backend/access/gist/gist.c
index ba394f08f6..ea72bcce1b 100644
--- a/src/backend/access/gist/gist.c
+++ b/src/backend/access/gist/gist.c
@@ -78,6 +78,7 @@ gisthandler(PG_FUNCTION_ARGS)
amroutine->amcanparallel = false;
amroutine->amcaninclude = true;
amroutine->amusemaintenanceworkmem = false;
+ amroutine->amsummarizing = false;
amroutine->amparallelvacuumoptions =
VACUUM_OPTION_PARALLEL_BULKDEL | VACUUM_OPTION_PARALLEL_COND_CLEANUP;
amroutine->amkeytype = InvalidOid;
diff --git a/src/backend/access/hash/hash.c b/src/backend/access/hash/hash.c
index eb258337d6..fc5d97f606 100644
--- a/src/backend/access/hash/hash.c
+++ b/src/backend/access/hash/hash.c
@@ -75,6 +75,7 @@ hashhandler(PG_FUNCTION_ARGS)
amroutine->amcanparallel = false;
amroutine->amcaninclude = false;
amroutine->amusemaintenanceworkmem = false;
+ amroutine->amsummarizing = false;
amroutine->amparallelvacuumoptions =
VACUUM_OPTION_PARALLEL_BULKDEL;
amroutine->amkeytype = INT4OID;
diff --git a/src/backend/access/heap/heapam.c b/src/backend/access/heap/heapam.c
index 4f50e0dd34..cf4b917eb4 100644
--- a/src/backend/access/heap/heapam.c
+++ b/src/backend/access/heap/heapam.c
@@ -2924,11 +2924,13 @@ simple_heap_delete(Relation relation, ItemPointer tid)
TM_Result
heap_update(Relation relation, ItemPointer otid, HeapTuple newtup,
CommandId cid, Snapshot crosscheck, bool wait,
- TM_FailureData *tmfd, LockTupleMode *lockmode)
+ TM_FailureData *tmfd, LockTupleMode *lockmode,
+ TU_UpdateIndexes *update_indexes)
{
TM_Result result;
TransactionId xid = GetCurrentTransactionId();
Bitmapset *hot_attrs;
+ Bitmapset *sum_attrs;
Bitmapset *key_attrs;
Bitmapset *id_attrs;
Bitmapset *interesting_attrs;
@@ -2951,6 +2953,7 @@ heap_update(Relation relation, ItemPointer otid, HeapTuple newtup,
bool have_tuple_lock = false;
bool iscombo;
bool use_hot_update = false;
+ bool summarized_update = false;
bool key_intact;
bool all_visible_cleared = false;
bool all_visible_cleared_new = false;
@@ -2996,12 +2999,16 @@ heap_update(Relation relation, ItemPointer otid, HeapTuple newtup,
* Note that we get copies of each bitmap, so we need not worry about
* relcache flush happening midway through.
*/
- hot_attrs = RelationGetIndexAttrBitmap(relation, INDEX_ATTR_BITMAP_ALL);
+ hot_attrs = RelationGetIndexAttrBitmap(relation,
+ INDEX_ATTR_BITMAP_HOT_BLOCKING);
+ sum_attrs = RelationGetIndexAttrBitmap(relation,
+ INDEX_ATTR_BITMAP_SUMMARIZED);
key_attrs = RelationGetIndexAttrBitmap(relation, INDEX_ATTR_BITMAP_KEY);
id_attrs = RelationGetIndexAttrBitmap(relation,
INDEX_ATTR_BITMAP_IDENTITY_KEY);
interesting_attrs = NULL;
interesting_attrs = bms_add_members(interesting_attrs, hot_attrs);
+ interesting_attrs = bms_add_members(interesting_attrs, sum_attrs);
interesting_attrs = bms_add_members(interesting_attrs, key_attrs);
interesting_attrs = bms_add_members(interesting_attrs, id_attrs);
@@ -3311,7 +3318,10 @@ l2:
UnlockTupleTuplock(relation, &(oldtup.t_self), *lockmode);
if (vmbuffer != InvalidBuffer)
ReleaseBuffer(vmbuffer);
+ *update_indexes = TU_None;
+
bms_free(hot_attrs);
+ bms_free(sum_attrs);
bms_free(key_attrs);
bms_free(id_attrs);
bms_free(modified_attrs);
@@ -3633,7 +3643,19 @@ l2:
* changed.
*/
if (!bms_overlap(modified_attrs, hot_attrs))
+ {
use_hot_update = true;
+
+ /*
+ * If none of the columns that are used in hot-blocking indexes
+ * were updated, we can apply HOT, but we do still need to check
+ * if we need to update the summarizing indexes, and update those
+ * indexes if the columns were updated, or we may fail to detect
+ * e.g. value bound changes in BRIN minmax indexes.
+ */
+ if (bms_overlap(modified_attrs, sum_attrs))
+ summarized_update = true;
+ }
}
else
{
@@ -3793,10 +3815,27 @@ l2:
heap_freetuple(heaptup);
}
+ /*
+ * If it is a HOT update, the update may still need to update summarized
+ * indexes, lest we fail to update those summaries and get incorrect
+ * results (for example, minmax bounds of the block may change with this
+ * update).
+ */
+ if (use_hot_update)
+ {
+ if (summarized_update)
+ *update_indexes = TU_Summarizing;
+ else
+ *update_indexes = TU_None;
+ }
+ else
+ *update_indexes = TU_All;
+
if (old_key_tuple != NULL && old_key_copied)
heap_freetuple(old_key_tuple);
bms_free(hot_attrs);
+ bms_free(sum_attrs);
bms_free(key_attrs);
bms_free(id_attrs);
bms_free(modified_attrs);
@@ -3951,7 +3990,8 @@ HeapDetermineColumnsInfo(Relation relation,
* via ereport().
*/
void
-simple_heap_update(Relation relation, ItemPointer otid, HeapTuple tup)
+simple_heap_update(Relation relation, ItemPointer otid, HeapTuple tup,
+ TU_UpdateIndexes *update_indexes)
{
TM_Result result;
TM_FailureData tmfd;
@@ -3960,7 +4000,7 @@ simple_heap_update(Relation relation, ItemPointer otid, HeapTuple tup)
result = heap_update(relation, otid, tup,
GetCurrentCommandId(true), InvalidSnapshot,
true /* wait for commit */ ,
- &tmfd, &lockmode);
+ &tmfd, &lockmode, update_indexes);
switch (result)
{
case TM_SelfModified:
diff --git a/src/backend/access/heap/heapam_handler.c b/src/backend/access/heap/heapam_handler.c
index c4b1916d36..a1d7d91ff7 100644
--- a/src/backend/access/heap/heapam_handler.c
+++ b/src/backend/access/heap/heapam_handler.c
@@ -314,7 +314,7 @@ static TM_Result
heapam_tuple_update(Relation relation, ItemPointer otid, TupleTableSlot *slot,
CommandId cid, Snapshot snapshot, Snapshot crosscheck,
bool wait, TM_FailureData *tmfd,
- LockTupleMode *lockmode, bool *update_indexes)
+ LockTupleMode *lockmode, TU_UpdateIndexes *update_indexes)
{
bool shouldFree = true;
HeapTuple tuple = ExecFetchSlotHeapTuple(slot, true, &shouldFree);
@@ -325,7 +325,7 @@ heapam_tuple_update(Relation relation, ItemPointer otid, TupleTableSlot *slot,
tuple->t_tableOid = slot->tts_tableOid;
result = heap_update(relation, otid, tuple, cid, crosscheck, wait,
- tmfd, lockmode);
+ tmfd, lockmode, update_indexes);
ItemPointerCopy(&tuple->t_self, &slot->tts_tid);
/*
@@ -334,9 +334,20 @@ heapam_tuple_update(Relation relation, ItemPointer otid, TupleTableSlot *slot,
* Note: heap_update returns the tid (location) of the new tuple in the
* t_self field.
*
- * If it's a HOT update, we mustn't insert new index entries.
+ * If the update is not HOT, we must update all indexes. If the update
+ * is HOT, it could be that we updated summarized columns, so we either
+ * update only summarized indexes, or none at all.
*/
- *update_indexes = result == TM_Ok && !HeapTupleIsHeapOnly(tuple);
+ if (result != TM_Ok)
+ {
+ Assert(*update_indexes == TU_None);
+ *update_indexes = TU_None;
+ }
+ else if (!HeapTupleIsHeapOnly(tuple))
+ Assert(*update_indexes == TU_All);
+ else
+ Assert(*update_indexes == TU_Summarizing ||
+ *update_indexes == TU_None);
if (shouldFree)
pfree(tuple);
diff --git a/src/backend/access/nbtree/nbtree.c b/src/backend/access/nbtree/nbtree.c
index 3f7b541e9d..a68dd07534 100644
--- a/src/backend/access/nbtree/nbtree.c
+++ b/src/backend/access/nbtree/nbtree.c
@@ -114,6 +114,7 @@ bthandler(PG_FUNCTION_ARGS)
amroutine->amcanparallel = true;
amroutine->amcaninclude = true;
amroutine->amusemaintenanceworkmem = false;
+ amroutine->amsummarizing = false;
amroutine->amparallelvacuumoptions =
VACUUM_OPTION_PARALLEL_BULKDEL | VACUUM_OPTION_PARALLEL_COND_CLEANUP;
amroutine->amkeytype = InvalidOid;
diff --git a/src/backend/access/spgist/spgutils.c b/src/backend/access/spgist/spgutils.c
index 3761f2c193..4e7ff1d160 100644
--- a/src/backend/access/spgist/spgutils.c
+++ b/src/backend/access/spgist/spgutils.c
@@ -62,6 +62,7 @@ spghandler(PG_FUNCTION_ARGS)
amroutine->amcanparallel = false;
amroutine->amcaninclude = true;
amroutine->amusemaintenanceworkmem = false;
+ amroutine->amsummarizing = false;
amroutine->amparallelvacuumoptions =
VACUUM_OPTION_PARALLEL_BULKDEL | VACUUM_OPTION_PARALLEL_COND_CLEANUP;
amroutine->amkeytype = InvalidOid;
diff --git a/src/backend/access/table/tableam.c b/src/backend/access/table/tableam.c
index ef0d34fcee..a5e6c92f35 100644
--- a/src/backend/access/table/tableam.c
+++ b/src/backend/access/table/tableam.c
@@ -345,7 +345,7 @@ void
simple_table_tuple_update(Relation rel, ItemPointer otid,
TupleTableSlot *slot,
Snapshot snapshot,
- bool *update_indexes)
+ TU_UpdateIndexes *update_indexes)
{
TM_Result result;
TM_FailureData tmfd;
diff --git a/src/backend/catalog/index.c b/src/backend/catalog/index.c
index 7777e7ec77..33e3d0ec05 100644
--- a/src/backend/catalog/index.c
+++ b/src/backend/catalog/index.c
@@ -1383,7 +1383,8 @@ index_concurrently_create_copy(Relation heapRelation, Oid oldIndexId,
oldInfo->ii_Unique,
oldInfo->ii_NullsNotDistinct,
false, /* not ready for inserts */
- true);
+ true,
+ indexRelation->rd_indam->amsummarizing);
/*
* Extract the list of column names and the column numbers for the new
@@ -2455,7 +2456,8 @@ BuildIndexInfo(Relation index)
indexStruct->indisunique,
indexStruct->indnullsnotdistinct,
indexStruct->indisready,
- false);
+ false,
+ index->rd_indam->amsummarizing);
/* fill in attribute numbers */
for (i = 0; i < numAtts; i++)
@@ -2515,7 +2517,8 @@ BuildDummyIndexInfo(Relation index)
indexStruct->indisunique,
indexStruct->indnullsnotdistinct,
indexStruct->indisready,
- false);
+ false,
+ index->rd_indam->amsummarizing);
/* fill in attribute numbers */
for (i = 0; i < numAtts; i++)
diff --git a/src/backend/catalog/indexing.c b/src/backend/catalog/indexing.c
index bb7cc3601c..a387eccdc4 100644
--- a/src/backend/catalog/indexing.c
+++ b/src/backend/catalog/indexing.c
@@ -72,7 +72,8 @@ CatalogCloseIndexes(CatalogIndexState indstate)
* This is effectively a cut-down version of ExecInsertIndexTuples.
*/
static void
-CatalogIndexInsert(CatalogIndexState indstate, HeapTuple heapTuple)
+CatalogIndexInsert(CatalogIndexState indstate, HeapTuple heapTuple,
+ TU_UpdateIndexes updateIndexes)
{
int i;
int numIndexes;
@@ -82,6 +83,7 @@ CatalogIndexInsert(CatalogIndexState indstate, HeapTuple heapTuple)
IndexInfo **indexInfoArray;
Datum values[INDEX_MAX_KEYS];
bool isnull[INDEX_MAX_KEYS];
+ bool onlySummarized = updateIndexes == TU_Summarizing;
/*
* HOT update does not require index inserts. But with asserts enabled we
@@ -89,10 +91,13 @@ CatalogIndexInsert(CatalogIndexState indstate, HeapTuple heapTuple)
* table/index.
*/
#ifndef USE_ASSERT_CHECKING
- if (HeapTupleIsHeapOnly(heapTuple))
+ if (HeapTupleIsHeapOnly(heapTuple) && !onlySummarized)
return;
#endif
+ if (onlySummarized)
+ Assert(HeapTupleIsHeapOnly(heapTuple));
+
/*
* Get information from the state structure. Fall out if nothing to do.
*/
@@ -135,13 +140,20 @@ CatalogIndexInsert(CatalogIndexState indstate, HeapTuple heapTuple)
/* see earlier check above */
#ifdef USE_ASSERT_CHECKING
- if (HeapTupleIsHeapOnly(heapTuple))
+ if (HeapTupleIsHeapOnly(heapTuple) && !onlySummarized)
{
Assert(!ReindexIsProcessingIndex(RelationGetRelid(index)));
continue;
}
#endif /* USE_ASSERT_CHECKING */
+ /*
+ * Skip insertions into non-summarizing indexes if we only need
+ * to update summarizing indexes
+ */
+ if (onlySummarized && !indexInfo->ii_Summarizing)
+ continue;
+
/*
* FormIndexDatum fills in its values and isnull parameters with the
* appropriate values for the column(s) of the index.
@@ -228,7 +240,7 @@ CatalogTupleInsert(Relation heapRel, HeapTuple tup)
simple_heap_insert(heapRel, tup);
- CatalogIndexInsert(indstate, tup);
+ CatalogIndexInsert(indstate, tup, TU_All);
CatalogCloseIndexes(indstate);
}
@@ -248,7 +260,7 @@ CatalogTupleInsertWithInfo(Relation heapRel, HeapTuple tup,
simple_heap_insert(heapRel, tup);
- CatalogIndexInsert(indstate, tup);
+ CatalogIndexInsert(indstate, tup, TU_All);
}
/*
@@ -279,7 +291,7 @@ CatalogTuplesMultiInsertWithInfo(Relation heapRel, TupleTableSlot **slot,
tuple = ExecFetchSlotHeapTuple(slot[i], true, &should_free);
tuple->t_tableOid = slot[i]->tts_tableOid;
- CatalogIndexInsert(indstate, tuple);
+ CatalogIndexInsert(indstate, tuple, TU_All);
if (should_free)
heap_freetuple(tuple);
@@ -301,14 +313,15 @@ void
CatalogTupleUpdate(Relation heapRel, ItemPointer otid, HeapTuple tup)
{
CatalogIndexState indstate;
+ TU_UpdateIndexes updateIndexes = TU_All;
CatalogTupleCheckConstraints(heapRel, tup);
indstate = CatalogOpenIndexes(heapRel);
- simple_heap_update(heapRel, otid, tup);
+ simple_heap_update(heapRel, otid, tup, &updateIndexes);
- CatalogIndexInsert(indstate, tup);
+ CatalogIndexInsert(indstate, tup, updateIndexes);
CatalogCloseIndexes(indstate);
}
@@ -324,11 +337,13 @@ void
CatalogTupleUpdateWithInfo(Relation heapRel, ItemPointer otid, HeapTuple tup,
CatalogIndexState indstate)
{
+ TU_UpdateIndexes updateIndexes = TU_All;
+
CatalogTupleCheckConstraints(heapRel, tup);
- simple_heap_update(heapRel, otid, tup);
+ simple_heap_update(heapRel, otid, tup, &updateIndexes);
- CatalogIndexInsert(indstate, tup);
+ CatalogIndexInsert(indstate, tup, updateIndexes);
}
/*
diff --git a/src/backend/commands/copyfrom.c b/src/backend/commands/copyfrom.c
index 321a7fad85..80bca79cd0 100644
--- a/src/backend/commands/copyfrom.c
+++ b/src/backend/commands/copyfrom.c
@@ -435,7 +435,7 @@ CopyMultiInsertBufferFlush(CopyMultiInsertInfo *miinfo,
recheckIndexes =
ExecInsertIndexTuples(resultRelInfo,
buffer->slots[i], estate, false,
- false, NULL, NIL);
+ false, NULL, NIL, false);
ExecARInsertTriggers(estate, resultRelInfo,
slots[i], recheckIndexes,
cstate->transition_capture);
@@ -1248,7 +1248,8 @@ CopyFrom(CopyFromState cstate)
false,
false,
NULL,
- NIL);
+ NIL,
+ false);
}
/* AFTER ROW INSERT Triggers */
diff --git a/src/backend/commands/indexcmds.c b/src/backend/commands/indexcmds.c
index 16ec0b114e..ff48f44c66 100644
--- a/src/backend/commands/indexcmds.c
+++ b/src/backend/commands/indexcmds.c
@@ -184,6 +184,7 @@ CheckIndexCompatible(Oid oldId,
Form_pg_am accessMethodForm;
IndexAmRoutine *amRoutine;
bool amcanorder;
+ bool amsummarizing;
int16 *coloptions;
IndexInfo *indexInfo;
int numberOfAttributes;
@@ -222,6 +223,7 @@ CheckIndexCompatible(Oid oldId,
ReleaseSysCache(tuple);
amcanorder = amRoutine->amcanorder;
+ amsummarizing = amRoutine->amsummarizing;
/*
* Compute the operator classes, collations, and exclusion operators for
@@ -232,7 +234,8 @@ CheckIndexCompatible(Oid oldId,
* ii_NumIndexKeyAttrs with same value.
*/
indexInfo = makeIndexInfo(numberOfAttributes, numberOfAttributes,
- accessMethodId, NIL, NIL, false, false, false, false);
+ accessMethodId, NIL, NIL, false, false,
+ false, false, amsummarizing);
typeObjectId = palloc_array(Oid, numberOfAttributes);
collationObjectId = palloc_array(Oid, numberOfAttributes);
classObjectId = palloc_array(Oid, numberOfAttributes);
@@ -550,6 +553,7 @@ DefineIndex(Oid relationId,
Form_pg_am accessMethodForm;
IndexAmRoutine *amRoutine;
bool amcanorder;
+ bool amissummarizing;
amoptions_function amoptions;
bool partitioned;
bool safe_index;
@@ -866,6 +870,7 @@ DefineIndex(Oid relationId,
amcanorder = amRoutine->amcanorder;
amoptions = amRoutine->amoptions;
+ amissummarizing = amRoutine->amsummarizing;
pfree(amRoutine);
ReleaseSysCache(tuple);
@@ -897,7 +902,8 @@ DefineIndex(Oid relationId,
stmt->unique,
stmt->nulls_not_distinct,
!concurrent,
- concurrent);
+ concurrent,
+ amissummarizing);
typeObjectId = palloc_array(Oid, numberOfAttributes);
collationObjectId = palloc_array(Oid, numberOfAttributes);
diff --git a/src/backend/executor/execIndexing.c b/src/backend/executor/execIndexing.c
index 6e88e72813..da28e5e40c 100644
--- a/src/backend/executor/execIndexing.c
+++ b/src/backend/executor/execIndexing.c
@@ -259,15 +259,24 @@ ExecCloseIndices(ResultRelInfo *resultRelInfo)
* into all the relations indexing the result relation
* when a heap tuple is inserted into the result relation.
*
- * When 'update' is true, executor is performing an UPDATE
- * that could not use an optimization like heapam's HOT (in
- * more general terms a call to table_tuple_update() took
- * place and set 'update_indexes' to true). Receiving this
- * hint makes us consider if we should pass down the
- * 'indexUnchanged' hint in turn. That's something that we
- * figure out for each index_insert() call iff 'update' is
- * true. (When 'update' is false we already know not to pass
- * the hint to any index.)
+ * When 'update' is true and 'onlySummarizing' is false,
+ * executor is performing an UPDATE that could not use an
+ * optimization like heapam's HOT (in more general terms a
+ * call to table_tuple_update() took place and set
+ * 'update_indexes' to TUUI_All). Receiving this hint makes
+ * us consider if we should pass down the 'indexUnchanged'
+ * hint in turn. That's something that we figure out for
+ * each index_insert() call iff 'update' is true.
+ * (When 'update' is false we already know not to pass the
+ * hint to any index.)
+ *
+ * If onlySummarizing is set, an equivalent optimization to
+ * HOT has been applied and any updated columns are indexed
+ * only by summarizing indexes (or in more general terms a
+ * call to table_tuple_update() took place and set
+ * 'update_indexes' to TUUI_Summarizing). We can (and must)
+ * therefore only update the indexes that have
+ * 'amsummarizing' = true.
*
* Unique and exclusion constraints are enforced at the same
* time. This returns a list of index OIDs for any unique or
@@ -287,7 +296,8 @@ ExecInsertIndexTuples(ResultRelInfo *resultRelInfo,
bool update,
bool noDupErr,
bool *specConflict,
- List *arbiterIndexes)
+ List *arbiterIndexes,
+ bool onlySummarizing)
{
ItemPointer tupleid = &slot->tts_tid;
List *result = NIL;
@@ -343,6 +353,13 @@ ExecInsertIndexTuples(ResultRelInfo *resultRelInfo,
if (!indexInfo->ii_ReadyForInserts)
continue;
+ /*
+ * Skip processing of non-summarizing indexes if we only
+ * update summarizing indexes
+ */
+ if (onlySummarizing && !indexInfo->ii_Summarizing)
+ continue;
+
/* Check for partial index */
if (indexInfo->ii_Predicate != NIL)
{
diff --git a/src/backend/executor/execReplication.c b/src/backend/executor/execReplication.c
index 4f5083a598..36196e4d94 100644
--- a/src/backend/executor/execReplication.c
+++ b/src/backend/executor/execReplication.c
@@ -445,7 +445,7 @@ ExecSimpleRelationInsert(ResultRelInfo *resultRelInfo,
if (resultRelInfo->ri_NumIndices > 0)
recheckIndexes = ExecInsertIndexTuples(resultRelInfo,
slot, estate, false, false,
- NULL, NIL);
+ NULL, NIL, false);
/* AFTER ROW INSERT Triggers */
ExecARInsertTriggers(estate, resultRelInfo, slot,
@@ -493,7 +493,7 @@ ExecSimpleRelationUpdate(ResultRelInfo *resultRelInfo,
if (!skip_tuple)
{
List *recheckIndexes = NIL;
- bool update_indexes;
+ TU_UpdateIndexes update_indexes;
/* Compute stored generated columns */
if (rel->rd_att->constr &&
@@ -510,10 +510,11 @@ ExecSimpleRelationUpdate(ResultRelInfo *resultRelInfo,
simple_table_tuple_update(rel, tid, slot, estate->es_snapshot,
&update_indexes);
- if (resultRelInfo->ri_NumIndices > 0 && update_indexes)
+ if (resultRelInfo->ri_NumIndices > 0 && update_indexes != TU_None)
recheckIndexes = ExecInsertIndexTuples(resultRelInfo,
slot, estate, true, false,
- NULL, NIL);
+ NULL, NIL,
+ update_indexes == TU_Summarizing);
/* AFTER ROW UPDATE Triggers */
ExecARUpdateTriggers(estate, resultRelInfo,
diff --git a/src/backend/executor/nodeModifyTable.c b/src/backend/executor/nodeModifyTable.c
index 3fa2b930a5..3d0efebacc 100644
--- a/src/backend/executor/nodeModifyTable.c
+++ b/src/backend/executor/nodeModifyTable.c
@@ -110,8 +110,8 @@ typedef struct ModifyTableContext
typedef struct UpdateContext
{
bool updated; /* did UPDATE actually occur? */
- bool updateIndexes; /* index update required? */
bool crossPartUpdate; /* was it a cross-partition update? */
+ TU_UpdateIndexes updateIndexes; /* Which index updates are required? */
/*
* Lock mode to acquire on the latest tuple version before performing
@@ -1099,7 +1099,8 @@ ExecInsert(ModifyTableContext *context,
recheckIndexes = ExecInsertIndexTuples(resultRelInfo,
slot, estate, false, true,
&specConflict,
- arbiterIndexes);
+ arbiterIndexes,
+ false);
/* adjust the tuple's state accordingly */
table_tuple_complete_speculative(resultRelationDesc, slot,
@@ -1138,7 +1139,8 @@ ExecInsert(ModifyTableContext *context,
if (resultRelInfo->ri_NumIndices > 0)
recheckIndexes = ExecInsertIndexTuples(resultRelInfo,
slot, estate, false,
- false, NULL, NIL);
+ false, NULL, NIL,
+ false);
}
}
@@ -2118,11 +2120,12 @@ ExecUpdateEpilogue(ModifyTableContext *context, UpdateContext *updateCxt,
List *recheckIndexes = NIL;
/* insert index entries for tuple if necessary */
- if (resultRelInfo->ri_NumIndices > 0 && updateCxt->updateIndexes)
+ if (resultRelInfo->ri_NumIndices > 0 && updateCxt->updateIndexes != TU_None)
recheckIndexes = ExecInsertIndexTuples(resultRelInfo,
slot, context->estate,
true, false,
- NULL, NIL);
+ NULL, NIL,
+ updateCxt->updateIndexes == TU_Summarizing);
/* AFTER ROW UPDATE Triggers */
ExecARUpdateTriggers(context->estate, resultRelInfo,
diff --git a/src/backend/nodes/makefuncs.c b/src/backend/nodes/makefuncs.c
index fe67baf142..f23f8b7349 100644
--- a/src/backend/nodes/makefuncs.c
+++ b/src/backend/nodes/makefuncs.c
@@ -743,7 +743,8 @@ make_ands_implicit(Expr *clause)
*/
IndexInfo *
makeIndexInfo(int numattrs, int numkeyattrs, Oid amoid, List *expressions,
- List *predicates, bool unique, bool nulls_not_distinct, bool isready, bool concurrent)
+ List *predicates, bool unique, bool nulls_not_distinct,
+ bool isready, bool concurrent, bool summarizing)
{
IndexInfo *n = makeNode(IndexInfo);
@@ -757,6 +758,10 @@ makeIndexInfo(int numattrs, int numkeyattrs, Oid amoid, List *expressions,
n->ii_CheckedUnchanged = false;
n->ii_IndexUnchanged = false;
n->ii_Concurrent = concurrent;
+ n->ii_Summarizing = summarizing;
+
+ /* summarizing indexes cannot contain non-key attributes */
+ Assert(!summarizing || numkeyattrs == numattrs);
/* expressions */
n->ii_Expressions = expressions;
diff --git a/src/backend/utils/cache/relcache.c b/src/backend/utils/cache/relcache.c
index 13f7987373..cd0f6e2a5e 100644
--- a/src/backend/utils/cache/relcache.c
+++ b/src/backend/utils/cache/relcache.c
@@ -2440,10 +2440,11 @@ RelationDestroyRelation(Relation relation, bool remember_tupdesc)
list_free_deep(relation->rd_fkeylist);
list_free(relation->rd_indexlist);
list_free(relation->rd_statlist);
- bms_free(relation->rd_indexattr);
bms_free(relation->rd_keyattr);
bms_free(relation->rd_pkattr);
bms_free(relation->rd_idattr);
+ bms_free(relation->rd_hotblockingattr);
+ bms_free(relation->rd_summarizedattr);
if (relation->rd_pubdesc)
pfree(relation->rd_pubdesc);
if (relation->rd_options)
@@ -5167,10 +5168,11 @@ RelationGetIndexPredicate(Relation relation)
Bitmapset *
RelationGetIndexAttrBitmap(Relation relation, IndexAttrBitmapKind attrKind)
{
- Bitmapset *indexattrs; /* indexed columns */
Bitmapset *uindexattrs; /* columns in unique indexes */
Bitmapset *pkindexattrs; /* columns in the primary index */
Bitmapset *idindexattrs; /* columns in the replica identity */
+ Bitmapset *hotblockingattrs; /* columns with HOT blocking indexes */
+ Bitmapset *summarizedattrs; /* columns with summarizing indexes */
List *indexoidlist;
List *newindexoidlist;
Oid relpkindex;
@@ -5179,18 +5181,20 @@ RelationGetIndexAttrBitmap(Relation relation, IndexAttrBitmapKind attrKind)
MemoryContext oldcxt;
/* Quick exit if we already computed the result. */
- if (relation->rd_indexattr != NULL)
+ if (relation->rd_attrsvalid)
{
switch (attrKind)
{
- case INDEX_ATTR_BITMAP_ALL:
- return bms_copy(relation->rd_indexattr);
case INDEX_ATTR_BITMAP_KEY:
return bms_copy(relation->rd_keyattr);
case INDEX_ATTR_BITMAP_PRIMARY_KEY:
return bms_copy(relation->rd_pkattr);
case INDEX_ATTR_BITMAP_IDENTITY_KEY:
return bms_copy(relation->rd_idattr);
+ case INDEX_ATTR_BITMAP_HOT_BLOCKING:
+ return bms_copy(relation->rd_hotblockingattr);
+ case INDEX_ATTR_BITMAP_SUMMARIZED:
+ return bms_copy(relation->rd_summarizedattr);
default:
elog(ERROR, "unknown attrKind %u", attrKind);
}
@@ -5230,10 +5234,11 @@ restart:
* CONCURRENTLY is far enough along that we should ignore the index, it
* won't be returned at all by RelationGetIndexList.
*/
- indexattrs = NULL;
uindexattrs = NULL;
pkindexattrs = NULL;
idindexattrs = NULL;
+ hotblockingattrs = NULL;
+ summarizedattrs = NULL;
foreach(l, indexoidlist)
{
Oid indexOid = lfirst_oid(l);
@@ -5246,6 +5251,7 @@ restart:
bool isKey; /* candidate key */
bool isPK; /* primary key */
bool isIDKey; /* replica identity index */
+ Bitmapset **attrs;
indexDesc = index_open(indexOid, AccessShareLock);
@@ -5283,6 +5289,11 @@ restart:
/* Is this index the configured (or default) replica identity? */
isIDKey = (indexOid == relreplindex);
+ if (indexDesc->rd_indam->amsummarizing)
+ attrs = &summarizedattrs;
+ else
+ attrs = &hotblockingattrs;
+
/* Collect simple attribute references */
for (i = 0; i < indexDesc->rd_index->indnatts; i++)
{
@@ -5291,15 +5302,21 @@ restart:
/*
* Since we have covering indexes with non-key columns, we must
* handle them accurately here. non-key columns must be added into
- * indexattrs, since they are in index, and HOT-update shouldn't
- * miss them. Obviously, non-key columns couldn't be referenced by
+ * hotblockingattrs, since they are in index, and HOT-update
+ * shouldn't miss them.
+ *
+ * Summarizing indexes do not block HOT, but do need to be updated
+ * when the column value changes, thus require a separate
+ * attribute bitmapset.
+ *
+ * Obviously, non-key columns couldn't be referenced by
* foreign key or identity key. Hence we do not include them into
* uindexattrs, pkindexattrs and idindexattrs bitmaps.
*/
if (attrnum != 0)
{
- indexattrs = bms_add_member(indexattrs,
- attrnum - FirstLowInvalidHeapAttributeNumber);
+ *attrs = bms_add_member(*attrs,
+ attrnum - FirstLowInvalidHeapAttributeNumber);
if (isKey && i < indexDesc->rd_index->indnkeyatts)
uindexattrs = bms_add_member(uindexattrs,
@@ -5316,10 +5333,12 @@ restart:
}
/* Collect all attributes used in expressions, too */
- pull_varattnos(indexExpressions, 1, &indexattrs);
+ pull_varattnos(indexExpressions, 1, attrs);
- /* Collect all attributes in the index predicate, too */
- pull_varattnos(indexPredicate, 1, &indexattrs);
+ /*
+ * Collect all attributes in the index predicate, too.
+ */
+ pull_varattnos(indexPredicate, 1, attrs);
index_close(indexDesc, AccessShareLock);
}
@@ -5347,24 +5366,28 @@ restart:
bms_free(uindexattrs);
bms_free(pkindexattrs);
bms_free(idindexattrs);
- bms_free(indexattrs);
+ bms_free(hotblockingattrs);
+ bms_free(summarizedattrs);
goto restart;
}
/* Don't leak the old values of these bitmaps, if any */
- bms_free(relation->rd_indexattr);
- relation->rd_indexattr = NULL;
+ relation->rd_attrsvalid = false;
bms_free(relation->rd_keyattr);
relation->rd_keyattr = NULL;
bms_free(relation->rd_pkattr);
relation->rd_pkattr = NULL;
bms_free(relation->rd_idattr);
relation->rd_idattr = NULL;
+ bms_free(relation->rd_hotblockingattr);
+ relation->rd_hotblockingattr = NULL;
+ bms_free(relation->rd_summarizedattr);
+ relation->rd_summarizedattr = NULL;
/*
* Now save copies of the bitmaps in the relcache entry. We intentionally
- * set rd_indexattr last, because that's the one that signals validity of
+ * set rd_attrsvalid last, because that's the one that signals validity of
* the values; if we run out of memory before making that copy, we won't
* leave the relcache entry looking like the other ones are valid but
* empty.
@@ -5373,20 +5396,24 @@ restart:
relation->rd_keyattr = bms_copy(uindexattrs);
relation->rd_pkattr = bms_copy(pkindexattrs);
relation->rd_idattr = bms_copy(idindexattrs);
- relation->rd_indexattr = bms_copy(indexattrs);
+ relation->rd_hotblockingattr = bms_copy(hotblockingattrs);
+ relation->rd_summarizedattr = bms_copy(summarizedattrs);
+ relation->rd_attrsvalid = true;
MemoryContextSwitchTo(oldcxt);
/* We return our original working copy for caller to play with */
switch (attrKind)
{
- case INDEX_ATTR_BITMAP_ALL:
- return indexattrs;
case INDEX_ATTR_BITMAP_KEY:
return uindexattrs;
case INDEX_ATTR_BITMAP_PRIMARY_KEY:
return pkindexattrs;
case INDEX_ATTR_BITMAP_IDENTITY_KEY:
return idindexattrs;
+ case INDEX_ATTR_BITMAP_HOT_BLOCKING:
+ return hotblockingattrs;
+ case INDEX_ATTR_BITMAP_SUMMARIZED:
+ return summarizedattrs;
default:
elog(ERROR, "unknown attrKind %u", attrKind);
return NULL;
@@ -6307,7 +6334,7 @@ load_relcache_init_file(bool shared)
rel->rd_indexlist = NIL;
rel->rd_pkindex = InvalidOid;
rel->rd_replidindex = InvalidOid;
- rel->rd_indexattr = NULL;
+ rel->rd_attrsvalid = false;
rel->rd_keyattr = NULL;
rel->rd_pkattr = NULL;
rel->rd_idattr = NULL;
diff --git a/src/include/access/amapi.h b/src/include/access/amapi.h
index 4f1f67b4d0..281039ef67 100644
--- a/src/include/access/amapi.h
+++ b/src/include/access/amapi.h
@@ -244,6 +244,8 @@ typedef struct IndexAmRoutine
bool amcaninclude;
/* does AM use maintenance_work_mem? */
bool amusemaintenanceworkmem;
+ /* does AM store tuple information only at block granularity? */
+ bool amsummarizing;
/* OR of parallel vacuum flags. See vacuum.h for flags. */
uint8 amparallelvacuumoptions;
/* type of data stored in index, or InvalidOid if variable */
diff --git a/src/include/access/heapam.h b/src/include/access/heapam.h
index 8d74d1b7e3..faf5026519 100644
--- a/src/include/access/heapam.h
+++ b/src/include/access/heapam.h
@@ -249,7 +249,8 @@ extern void heap_abort_speculative(Relation relation, ItemPointer tid);
extern TM_Result heap_update(Relation relation, ItemPointer otid,
HeapTuple newtup,
CommandId cid, Snapshot crosscheck, bool wait,
- struct TM_FailureData *tmfd, LockTupleMode *lockmode);
+ struct TM_FailureData *tmfd, LockTupleMode *lockmode,
+ TU_UpdateIndexes *update_indexes);
extern TM_Result heap_lock_tuple(Relation relation, HeapTuple tuple,
CommandId cid, LockTupleMode mode, LockWaitPolicy wait_policy,
bool follow_updates,
@@ -275,7 +276,7 @@ extern bool heap_tuple_needs_eventual_freeze(HeapTupleHeader tuple);
extern void simple_heap_insert(Relation relation, HeapTuple tup);
extern void simple_heap_delete(Relation relation, ItemPointer tid);
extern void simple_heap_update(Relation relation, ItemPointer otid,
- HeapTuple tup);
+ HeapTuple tup, TU_UpdateIndexes *update_indexes);
extern TransactionId heap_index_delete_tuples(Relation rel,
TM_IndexDeleteOp *delstate);
diff --git a/src/include/access/tableam.h b/src/include/access/tableam.h
index 652e96f1b0..f31d7693ec 100644
--- a/src/include/access/tableam.h
+++ b/src/include/access/tableam.h
@@ -102,6 +102,19 @@ typedef enum TM_Result
TM_WouldBlock
} TM_Result;
+/*
+ * Result codes for table_update(..., update_indexes*..).
+ * Used to determine which indexes to update.
+ */
+typedef enum TU_UpdateIndexes {
+ /* No indexed columns were updated (incl. TID addressing of tuple) */
+ TU_None = 0,
+ /* A non-summarizing indexed column was updated, or the TID has changed */
+ TU_All = 1,
+ /* Only summarized columns were updated, TID is unchanged */
+ TU_Summarizing = 2
+} TU_UpdateIndexes;
+
/*
* When table_tuple_update, table_tuple_delete, or table_tuple_lock fail
* because the target tuple is already outdated, they fill in this struct to
@@ -526,7 +539,7 @@ typedef struct TableAmRoutine
bool wait,
TM_FailureData *tmfd,
LockTupleMode *lockmode,
- bool *update_indexes);
+ TU_UpdateIndexes *update_indexes);
/* see table_tuple_lock() for reference about parameters */
TM_Result (*tuple_lock) (Relation rel,
@@ -1514,7 +1527,7 @@ static inline TM_Result
table_tuple_update(Relation rel, ItemPointer otid, TupleTableSlot *slot,
CommandId cid, Snapshot snapshot, Snapshot crosscheck,
bool wait, TM_FailureData *tmfd, LockTupleMode *lockmode,
- bool *update_indexes)
+ TU_UpdateIndexes *update_indexes)
{
return rel->rd_tableam->tuple_update(rel, otid, slot,
cid, snapshot, crosscheck,
@@ -2038,7 +2051,7 @@ extern void simple_table_tuple_delete(Relation rel, ItemPointer tid,
Snapshot snapshot);
extern void simple_table_tuple_update(Relation rel, ItemPointer otid,
TupleTableSlot *slot, Snapshot snapshot,
- bool *update_indexes);
+ TU_UpdateIndexes *update_indexes);
/* ----------------------------------------------------------------------------
diff --git a/src/include/executor/executor.h b/src/include/executor/executor.h
index 946abc0051..dbd77050c7 100644
--- a/src/include/executor/executor.h
+++ b/src/include/executor/executor.h
@@ -620,7 +620,8 @@ extern List *ExecInsertIndexTuples(ResultRelInfo *resultRelInfo,
TupleTableSlot *slot, EState *estate,
bool update,
bool noDupErr,
- bool *specConflict, List *arbiterIndexes);
+ bool *specConflict, List *arbiterIndexes,
+ bool onlySummarizing);
extern bool ExecCheckIndexConstraints(ResultRelInfo *resultRelInfo,
TupleTableSlot *slot,
EState *estate, ItemPointer conflictTid,
diff --git a/src/include/nodes/execnodes.h b/src/include/nodes/execnodes.h
index bc67cb9ed8..ae3608cd93 100644
--- a/src/include/nodes/execnodes.h
+++ b/src/include/nodes/execnodes.h
@@ -161,6 +161,7 @@ typedef struct ExprState
* IndexUnchanged aminsert hint, cached for retail inserts
* Concurrent are we doing a concurrent index build?
* BrokenHotChain did we detect any broken HOT chains?
+ * Summarizing is it summarizing?
* ParallelWorkers # of workers requested (excludes leader)
* Am Oid of index AM
* AmCache private cache area for index AM
@@ -194,6 +195,7 @@ typedef struct IndexInfo
bool ii_IndexUnchanged;
bool ii_Concurrent;
bool ii_BrokenHotChain;
+ bool ii_Summarizing;
int ii_ParallelWorkers;
Oid ii_Am;
void *ii_AmCache;
diff --git a/src/include/nodes/makefuncs.h b/src/include/nodes/makefuncs.h
index 80f1d5336b..64651c9b00 100644
--- a/src/include/nodes/makefuncs.h
+++ b/src/include/nodes/makefuncs.h
@@ -96,7 +96,9 @@ extern List *make_ands_implicit(Expr *clause);
extern IndexInfo *makeIndexInfo(int numattrs, int numkeyattrs, Oid amoid,
List *expressions, List *predicates,
- bool unique, bool nulls_not_distinct, bool isready, bool concurrent);
+ bool unique, bool nulls_not_distinct,
+ bool isready, bool concurrent,
+ bool summarizing);
extern DefElem *makeDefElem(char *name, Node *arg, int location);
extern DefElem *makeDefElemExtended(char *nameSpace, char *name, Node *arg,
diff --git a/src/include/utils/rel.h b/src/include/utils/rel.h
index 67f994cb3e..c0ddddb2f0 100644
--- a/src/include/utils/rel.h
+++ b/src/include/utils/rel.h
@@ -156,10 +156,12 @@ typedef struct RelationData
List *rd_statlist; /* list of OIDs of extended stats */
/* data managed by RelationGetIndexAttrBitmap: */
- Bitmapset *rd_indexattr; /* identifies columns used in indexes */
+ bool rd_attrsvalid; /* are bitmaps of attrs valid? */
Bitmapset *rd_keyattr; /* cols that can be ref'd by foreign keys */
Bitmapset *rd_pkattr; /* cols included in primary key */
Bitmapset *rd_idattr; /* included in replica identity index */
+ Bitmapset *rd_hotblockingattr; /* cols blocking HOT update */
+ Bitmapset *rd_summarizedattr; /* cols indexed by summarizing indexes */
PublicationDesc *rd_pubdesc; /* publication descriptor, or NULL */
diff --git a/src/include/utils/relcache.h b/src/include/utils/relcache.h
index 88460f21c5..beeb28b83c 100644
--- a/src/include/utils/relcache.h
+++ b/src/include/utils/relcache.h
@@ -56,10 +56,11 @@ extern bytea **RelationGetIndexAttOptions(Relation relation, bool copy);
typedef enum IndexAttrBitmapKind
{
- INDEX_ATTR_BITMAP_ALL,
INDEX_ATTR_BITMAP_KEY,
INDEX_ATTR_BITMAP_PRIMARY_KEY,
- INDEX_ATTR_BITMAP_IDENTITY_KEY
+ INDEX_ATTR_BITMAP_IDENTITY_KEY,
+ INDEX_ATTR_BITMAP_HOT_BLOCKING,
+ INDEX_ATTR_BITMAP_SUMMARIZED
} IndexAttrBitmapKind;
extern Bitmapset *RelationGetIndexAttrBitmap(Relation relation,
diff --git a/src/test/modules/dummy_index_am/dummy_index_am.c b/src/test/modules/dummy_index_am/dummy_index_am.c
index dfb1ebb846..c14e0abe0c 100644
--- a/src/test/modules/dummy_index_am/dummy_index_am.c
+++ b/src/test/modules/dummy_index_am/dummy_index_am.c
@@ -296,6 +296,7 @@ dihandler(PG_FUNCTION_ARGS)
amroutine->amcanparallel = false;
amroutine->amcaninclude = false;
amroutine->amusemaintenanceworkmem = false;
+ amroutine->amsummarizing = false;
amroutine->amparallelvacuumoptions = VACUUM_OPTION_NO_PARALLEL;
amroutine->amkeytype = InvalidOid;
diff --git a/src/test/regress/expected/stats.out b/src/test/regress/expected/stats.out
index 186c296299..a668bd2e48 100644
--- a/src/test/regress/expected/stats.out
+++ b/src/test/regress/expected/stats.out
@@ -1358,4 +1358,115 @@ SELECT :io_stats_post_reset < :io_stats_pre_reset;
t
(1 row)
+-- test BRIN index doesn't block HOT update
+CREATE TABLE brin_hot (
+ id integer PRIMARY KEY,
+ val integer NOT NULL
+) WITH (autovacuum_enabled = off, fillfactor = 70);
+INSERT INTO brin_hot SELECT *, 0 FROM generate_series(1, 235);
+CREATE INDEX val_brin ON brin_hot using brin(val);
+CREATE FUNCTION wait_for_hot_stats() RETURNS void AS $$
+DECLARE
+ start_time timestamptz := clock_timestamp();
+ updated bool;
+BEGIN
+ -- we don't want to wait forever; loop will exit after 30 seconds
+ FOR i IN 1 .. 300 LOOP
+ SELECT (pg_stat_get_tuples_hot_updated('brin_hot'::regclass::oid) > 0) INTO updated;
+ EXIT WHEN updated;
+
+ -- wait a little
+ PERFORM pg_sleep_for('100 milliseconds');
+ -- reset stats snapshot so we can test again
+ PERFORM pg_stat_clear_snapshot();
+ END LOOP;
+ -- report time waited in postmaster log (where it won't change test output)
+ RAISE log 'wait_for_hot_stats delayed % seconds',
+ EXTRACT(epoch FROM clock_timestamp() - start_time);
+END
+$$ LANGUAGE plpgsql;
+UPDATE brin_hot SET val = -3 WHERE id = 42;
+-- We can't just call wait_for_hot_stats() at this point, because we only
+-- transmit stats when the session goes idle, and we probably didn't
+-- transmit the last couple of counts yet thanks to the rate-limiting logic
+-- in pgstat_report_stat(). But instead of waiting for the rate limiter's
+-- timeout to elapse, let's just start a new session. The old one will
+-- then send its stats before dying.
+\c -
+SELECT wait_for_hot_stats();
+ wait_for_hot_stats
+--------------------
+
+(1 row)
+
+SELECT pg_stat_get_tuples_hot_updated('brin_hot'::regclass::oid);
+ pg_stat_get_tuples_hot_updated
+--------------------------------
+ 1
+(1 row)
+
+DROP TABLE brin_hot;
+DROP FUNCTION wait_for_hot_stats();
+-- Test handling of index predicates - updating attributes in predicates
+-- should not block HOT when summarizing indexes are involved. We update
+-- a row that was not indexed due to the index predicate, and becomes
+-- indexable - the HOT-updated tuple is forwarded to the BRIN index.
+CREATE TABLE brin_hot_2 (a int, b int);
+INSERT INTO brin_hot_2 VALUES (1, 100);
+CREATE INDEX ON brin_hot_2 USING brin (b) WHERE a = 2;
+UPDATE brin_hot_2 SET a = 2;
+EXPLAIN (COSTS OFF) SELECT * FROM brin_hot_2 WHERE a = 2 AND b = 100;
+ QUERY PLAN
+-----------------------------------
+ Seq Scan on brin_hot_2
+ Filter: ((a = 2) AND (b = 100))
+(2 rows)
+
+SELECT COUNT(*) FROM brin_hot_2 WHERE a = 2 AND b = 100;
+ count
+-------
+ 1
+(1 row)
+
+SET enable_seqscan = off;
+EXPLAIN (COSTS OFF) SELECT * FROM brin_hot_2 WHERE a = 2 AND b = 100;
+ QUERY PLAN
+---------------------------------------------
+ Bitmap Heap Scan on brin_hot_2
+ Recheck Cond: ((b = 100) AND (a = 2))
+ -> Bitmap Index Scan on brin_hot_2_b_idx
+ Index Cond: (b = 100)
+(4 rows)
+
+SELECT COUNT(*) FROM brin_hot_2 WHERE a = 2 AND b = 100;
+ count
+-------
+ 1
+(1 row)
+
+DROP TABLE brin_hot_2;
+-- Test that updates to indexed columns are still propagated to the
+-- BRIN column.
+-- https://postgr.es/m/05ebcb44-f383-86e3-4f31-0a97a55634cf@enterprisedb.com
+CREATE TABLE brin_hot_3 (a int, filler text) WITH (fillfactor = 10);
+INSERT INTO brin_hot_3 SELECT 1, repeat(' ', 500) FROM generate_series(1, 20);
+CREATE INDEX ON brin_hot_3 USING brin (a) WITH (pages_per_range = 1);
+UPDATE brin_hot_3 SET a = 2;
+EXPLAIN (COSTS OFF) SELECT * FROM brin_hot_3 WHERE a = 2;
+ QUERY PLAN
+---------------------------------------------
+ Bitmap Heap Scan on brin_hot_3
+ Recheck Cond: (a = 2)
+ -> Bitmap Index Scan on brin_hot_3_a_idx
+ Index Cond: (a = 2)
+(4 rows)
+
+SELECT COUNT(*) FROM brin_hot_3 WHERE a = 2;
+ count
+-------
+ 20
+(1 row)
+
+DROP TABLE brin_hot_3;
+SET enable_seqscan = on;
-- End of Stats Test
diff --git a/src/test/regress/sql/stats.sql b/src/test/regress/sql/stats.sql
index d7f873cfc9..c8caa4db38 100644
--- a/src/test/regress/sql/stats.sql
+++ b/src/test/regress/sql/stats.sql
@@ -535,7 +535,6 @@ SET enable_seqscan TO on;
SELECT pg_stat_get_replication_slot(NULL);
SELECT pg_stat_get_subscription_stats(NULL);
-
-- Test that the following operations are tracked in pg_stat_io:
-- - reads of target blocks into shared buffers
-- - writes of shared buffers to permanent storage
@@ -682,4 +681,86 @@ SELECT sum(evictions) + sum(reuses) + sum(extends) + sum(fsyncs) + sum(reads) +
FROM pg_stat_io \gset
SELECT :io_stats_post_reset < :io_stats_pre_reset;
+
+-- test BRIN index doesn't block HOT update
+CREATE TABLE brin_hot (
+ id integer PRIMARY KEY,
+ val integer NOT NULL
+) WITH (autovacuum_enabled = off, fillfactor = 70);
+
+INSERT INTO brin_hot SELECT *, 0 FROM generate_series(1, 235);
+CREATE INDEX val_brin ON brin_hot using brin(val);
+
+CREATE FUNCTION wait_for_hot_stats() RETURNS void AS $$
+DECLARE
+ start_time timestamptz := clock_timestamp();
+ updated bool;
+BEGIN
+ -- we don't want to wait forever; loop will exit after 30 seconds
+ FOR i IN 1 .. 300 LOOP
+ SELECT (pg_stat_get_tuples_hot_updated('brin_hot'::regclass::oid) > 0) INTO updated;
+ EXIT WHEN updated;
+
+ -- wait a little
+ PERFORM pg_sleep_for('100 milliseconds');
+ -- reset stats snapshot so we can test again
+ PERFORM pg_stat_clear_snapshot();
+ END LOOP;
+ -- report time waited in postmaster log (where it won't change test output)
+ RAISE log 'wait_for_hot_stats delayed % seconds',
+ EXTRACT(epoch FROM clock_timestamp() - start_time);
+END
+$$ LANGUAGE plpgsql;
+
+UPDATE brin_hot SET val = -3 WHERE id = 42;
+
+-- We can't just call wait_for_hot_stats() at this point, because we only
+-- transmit stats when the session goes idle, and we probably didn't
+-- transmit the last couple of counts yet thanks to the rate-limiting logic
+-- in pgstat_report_stat(). But instead of waiting for the rate limiter's
+-- timeout to elapse, let's just start a new session. The old one will
+-- then send its stats before dying.
+\c -
+
+SELECT wait_for_hot_stats();
+SELECT pg_stat_get_tuples_hot_updated('brin_hot'::regclass::oid);
+
+DROP TABLE brin_hot;
+DROP FUNCTION wait_for_hot_stats();
+
+-- Test handling of index predicates - updating attributes in predicates
+-- should not block HOT when summarizing indexes are involved. We update
+-- a row that was not indexed due to the index predicate, and becomes
+-- indexable - the HOT-updated tuple is forwarded to the BRIN index.
+CREATE TABLE brin_hot_2 (a int, b int);
+INSERT INTO brin_hot_2 VALUES (1, 100);
+CREATE INDEX ON brin_hot_2 USING brin (b) WHERE a = 2;
+
+UPDATE brin_hot_2 SET a = 2;
+
+EXPLAIN (COSTS OFF) SELECT * FROM brin_hot_2 WHERE a = 2 AND b = 100;
+SELECT COUNT(*) FROM brin_hot_2 WHERE a = 2 AND b = 100;
+
+SET enable_seqscan = off;
+
+EXPLAIN (COSTS OFF) SELECT * FROM brin_hot_2 WHERE a = 2 AND b = 100;
+SELECT COUNT(*) FROM brin_hot_2 WHERE a = 2 AND b = 100;
+
+DROP TABLE brin_hot_2;
+
+-- Test that updates to indexed columns are still propagated to the
+-- BRIN column.
+-- https://postgr.es/m/05ebcb44-f383-86e3-4f31-0a97a55634cf@enterprisedb.com
+CREATE TABLE brin_hot_3 (a int, filler text) WITH (fillfactor = 10);
+INSERT INTO brin_hot_3 SELECT 1, repeat(' ', 500) FROM generate_series(1, 20);
+CREATE INDEX ON brin_hot_3 USING brin (a) WITH (pages_per_range = 1);
+UPDATE brin_hot_3 SET a = 2;
+
+EXPLAIN (COSTS OFF) SELECT * FROM brin_hot_3 WHERE a = 2;
+SELECT COUNT(*) FROM brin_hot_3 WHERE a = 2;
+
+DROP TABLE brin_hot_3;
+
+SET enable_seqscan = on;
+
-- End of Stats Test
--
2.39.2
v5-0002-review-comments-and-tweaks.patch (text/x-patch)
From 944e318a1d00b1875d068b18b4b45743f2888e11 Mon Sep 17 00:00:00 2001
From: Tomas Vondra <tomas@2ndquadrant.com>
Date: Tue, 14 Mar 2023 14:36:30 +0100
Subject: [PATCH v5 2/2] review comments and tweaks
---
doc/src/sgml/indexam.sgml | 4 ++--
src/backend/access/heap/heapam_handler.c | 4 ++--
src/backend/catalog/indexing.c | 15 +++++++++++++--
src/backend/executor/execReplication.c | 4 ++--
src/backend/executor/nodeModifyTable.c | 4 ++--
src/backend/nodes/makefuncs.c | 2 +-
src/backend/utils/cache/relcache.c | 12 +++++++++---
src/include/access/tableam.h | 8 +++++++-
src/include/nodes/execnodes.h | 2 +-
src/test/regress/sql/stats.sql | 1 +
10 files changed, 40 insertions(+), 16 deletions(-)
diff --git a/doc/src/sgml/indexam.sgml b/doc/src/sgml/indexam.sgml
index 897419ec95..29ece7c42e 100644
--- a/doc/src/sgml/indexam.sgml
+++ b/doc/src/sgml/indexam.sgml
@@ -254,8 +254,8 @@ typedef struct IndexAmRoutine
The <structfield>amsummarizing</structfield> flag indicates whether the
access method summarizes the indexed tuples, with summarizing granularity
of at least per block.
- Access methods that do not point to individual tuples, but to (like
- <acronym>BRIN</acronym>), may allow the <acronym>HOT</acronym> optimization
+ Access methods that do not point to individual tuples, but to block ranges
+ (like <acronym>BRIN</acronym>), may allow the <acronym>HOT</acronym> optimization
to continue. This does not apply to attributes referenced in index
predicates, an update of such attribute always disables <acronym>HOT</acronym>.
</para>
diff --git a/src/backend/access/heap/heapam_handler.c b/src/backend/access/heap/heapam_handler.c
index a1d7d91ff7..1ce7c6b971 100644
--- a/src/backend/access/heap/heapam_handler.c
+++ b/src/backend/access/heap/heapam_handler.c
@@ -346,8 +346,8 @@ heapam_tuple_update(Relation relation, ItemPointer otid, TupleTableSlot *slot,
else if (!HeapTupleIsHeapOnly(tuple))
Assert(*update_indexes == TU_All);
else
- Assert(*update_indexes == TU_Summarizing ||
- *update_indexes == TU_None);
+ Assert((*update_indexes == TU_Summarizing) ||
+ (*update_indexes == TU_None));
if (shouldFree)
pfree(tuple);
diff --git a/src/backend/catalog/indexing.c b/src/backend/catalog/indexing.c
index a387eccdc4..f00e077cc2 100644
--- a/src/backend/catalog/indexing.c
+++ b/src/backend/catalog/indexing.c
@@ -83,7 +83,7 @@ CatalogIndexInsert(CatalogIndexState indstate, HeapTuple heapTuple,
IndexInfo **indexInfoArray;
Datum values[INDEX_MAX_KEYS];
bool isnull[INDEX_MAX_KEYS];
- bool onlySummarized = updateIndexes == TU_Summarizing;
+ bool onlySummarized = (updateIndexes == TU_Summarizing);
/*
* HOT update does not require index inserts. But with asserts enabled we
@@ -95,6 +95,14 @@ CatalogIndexInsert(CatalogIndexState indstate, HeapTuple heapTuple,
return;
#endif
+ /* XXX This is a bit weird way to make a conditional assert. Maybe it'd be
+ * better to write it like this:
+ *
+ * Assert(!(onlySummarized && !HeapTupleIsHeapOnly(heapTuple)));
+ *
+ * With a comment "When only updating summarized indexes, it has to be
+ * a HOT-only tuple" as explanation.
+ */
if (onlySummarized)
Assert(HeapTupleIsHeapOnly(heapTuple));
@@ -149,7 +157,10 @@ CatalogIndexInsert(CatalogIndexState indstate, HeapTuple heapTuple,
/*
* Skip insertions into non-summarizing indexes if we only need
- * to update summarizing indexes
+ * to update summarizing indexes.
+ *
+ * XXX I wonder if we could create a BRIN index on a catalog, so
+ * that this actually triggers during testing.
*/
if (onlySummarized && !indexInfo->ii_Summarizing)
continue;
diff --git a/src/backend/executor/execReplication.c b/src/backend/executor/execReplication.c
index 36196e4d94..c0a20a015b 100644
--- a/src/backend/executor/execReplication.c
+++ b/src/backend/executor/execReplication.c
@@ -510,11 +510,11 @@ ExecSimpleRelationUpdate(ResultRelInfo *resultRelInfo,
simple_table_tuple_update(rel, tid, slot, estate->es_snapshot,
&update_indexes);
- if (resultRelInfo->ri_NumIndices > 0 && update_indexes != TU_None)
+ if (resultRelInfo->ri_NumIndices > 0 && (update_indexes != TU_None))
recheckIndexes = ExecInsertIndexTuples(resultRelInfo,
slot, estate, true, false,
NULL, NIL,
- update_indexes == TU_Summarizing);
+ (update_indexes == TU_Summarizing));
/* AFTER ROW UPDATE Triggers */
ExecARUpdateTriggers(estate, resultRelInfo,
diff --git a/src/backend/executor/nodeModifyTable.c b/src/backend/executor/nodeModifyTable.c
index 3d0efebacc..3a67389508 100644
--- a/src/backend/executor/nodeModifyTable.c
+++ b/src/backend/executor/nodeModifyTable.c
@@ -2120,12 +2120,12 @@ ExecUpdateEpilogue(ModifyTableContext *context, UpdateContext *updateCxt,
List *recheckIndexes = NIL;
/* insert index entries for tuple if necessary */
- if (resultRelInfo->ri_NumIndices > 0 && updateCxt->updateIndexes != TU_None)
+ if (resultRelInfo->ri_NumIndices > 0 && (updateCxt->updateIndexes != TU_None))
recheckIndexes = ExecInsertIndexTuples(resultRelInfo,
slot, context->estate,
true, false,
NULL, NIL,
- updateCxt->updateIndexes == TU_Summarizing);
+ (updateCxt->updateIndexes == TU_Summarizing));
/* AFTER ROW UPDATE Triggers */
ExecARUpdateTriggers(context->estate, resultRelInfo,
diff --git a/src/backend/nodes/makefuncs.c b/src/backend/nodes/makefuncs.c
index f23f8b7349..216383ca23 100644
--- a/src/backend/nodes/makefuncs.c
+++ b/src/backend/nodes/makefuncs.c
@@ -761,7 +761,7 @@ makeIndexInfo(int numattrs, int numkeyattrs, Oid amoid, List *expressions,
n->ii_Summarizing = summarizing;
/* summarizing indexes cannot contain non-key attributes */
- Assert(!summarizing || numkeyattrs == numattrs);
+ Assert(!summarizing || (numkeyattrs == numattrs));
/* expressions */
n->ii_Expressions = expressions;
diff --git a/src/backend/utils/cache/relcache.c b/src/backend/utils/cache/relcache.c
index cd0f6e2a5e..fc8cc5b5de 100644
--- a/src/backend/utils/cache/relcache.c
+++ b/src/backend/utils/cache/relcache.c
@@ -5289,6 +5289,11 @@ restart:
/* Is this index the configured (or default) replica identity? */
isIDKey = (indexOid == relreplindex);
+ /*
+ * If the index is summarizing, it doesn't block HOT updates, but we
+ * may still need to update it (if the attributes were modified). So
+ * decide which bitmap we'll update in the following loop.
+ */
if (indexDesc->rd_indam->amsummarizing)
attrs = &summarizedattrs;
else
@@ -5305,6 +5310,9 @@ restart:
* hotblockingattrs, since they are in index, and HOT-update
* shouldn't miss them.
*
+ * XXX This is misleading, because it talks about hotblockingattrs
+ * but then it adds stuff into attrs.
+ *
* Summarizing indexes do not block HOT, but do need to be updated
* when the column value changes, thus require a separate
* attribute bitmapset.
@@ -5335,9 +5343,7 @@ restart:
/* Collect all attributes used in expressions, too */
pull_varattnos(indexExpressions, 1, attrs);
- /*
- * Collect all attributes in the index predicate, too.
- */
+ /* Collect all attributes in the index predicate, too */
pull_varattnos(indexPredicate, 1, attrs);
index_close(indexDesc, AccessShareLock);
diff --git a/src/include/access/tableam.h b/src/include/access/tableam.h
index f31d7693ec..36c5835628 100644
--- a/src/include/access/tableam.h
+++ b/src/include/access/tableam.h
@@ -105,12 +105,18 @@ typedef enum TM_Result
/*
* Result codes for table_update(..., update_indexes*..).
* Used to determine which indexes to update.
+ *
+ * XXX Why do we assign explicit values to the members, instead of just letting
+ * it up to the enum (just like for TM_Result)?
*/
-typedef enum TU_UpdateIndexes {
+typedef enum TU_UpdateIndexes
+{
/* No indexed columns were updated (incl. TID addressing of tuple) */
TU_None = 0,
+
/* A non-summarizing indexed column was updated, or the TID has changed */
TU_All = 1,
+
/* Only summarized columns were updated, TID is unchanged */
TU_Summarizing = 2
} TU_UpdateIndexes;
diff --git a/src/include/nodes/execnodes.h b/src/include/nodes/execnodes.h
index ae3608cd93..d97f5a8e7d 100644
--- a/src/include/nodes/execnodes.h
+++ b/src/include/nodes/execnodes.h
@@ -161,7 +161,7 @@ typedef struct ExprState
* IndexUnchanged aminsert hint, cached for retail inserts
* Concurrent are we doing a concurrent index build?
* BrokenHotChain did we detect any broken HOT chains?
- * Summarizing is it summarizing?
+ * Summarizing is it a summarizing index?
* ParallelWorkers # of workers requested (excludes leader)
* Am Oid of index AM
* AmCache private cache area for index AM
diff --git a/src/test/regress/sql/stats.sql b/src/test/regress/sql/stats.sql
index c8caa4db38..8b946d05cc 100644
--- a/src/test/regress/sql/stats.sql
+++ b/src/test/regress/sql/stats.sql
@@ -535,6 +535,7 @@ SET enable_seqscan TO on;
SELECT pg_stat_get_replication_slot(NULL);
SELECT pg_stat_get_subscription_stats(NULL);
+
-- Test that the following operations are tracked in pg_stat_io:
-- - reads of target blocks into shared buffers
-- - writes of shared buffers to permanent storage
--
2.39.2
On Tue, 14 Mar 2023 at 14:49, Tomas Vondra
<tomas.vondra@enterprisedb.com> wrote:
On 3/8/23 23:31, Matthias van de Meent wrote:
On Wed, 22 Feb 2023 at 14:14, Matthias van de Meent
I think that the v4 patch solves all comments up to now; and
considering that most of this patch was committed but then reverted
due to an issue in v15, and that said issue is fixed in this patch,
I'm marking this as ready for committer.

Tomas, would you be up for that?
Thanks for the patch. I started looking at it yesterday, and I think
it's 99% RFC. I think it's correct and I only have some minor comments,
(see the 0002 patch):

1) There were still a couple minor wording issues in the sgml docs.
2) bikeshedding: I added a bunch of "()" to various conditions, I think
it makes it clearer.
Sure
3) This seems a bit weird way to write a conditional Assert:

    if (onlySummarized)
        Assert(HeapTupleIsHeapOnly(heapTuple));

Better to do a composed Assert(!(onlySummarized && !...)) or something?
I don't like this double negation, as it adds significant parsing
complexity to the statement. If I had gone with a single Assert()
statement, I'd have used the following:

    Assert((!onlySummarized) || HeapTupleIsHeapOnly(heapTuple));

because in the code section above that, the HOT + !onlySummarized case
is an early exit.
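
For reference, a minimal standalone sketch (plain C with <assert.h>, not
PostgreSQL code; the isHeapOnly flag merely stands in for what
HeapTupleIsHeapOnly(heapTuple) would return) showing that all three
formulations assert the same condition:

    #include <assert.h>
    #include <stdbool.h>

    int
    main(void)
    {
        /* Toy stand-ins for the real values discussed above. */
        bool    onlySummarized = true;
        bool    isHeapOnly = true;  /* stand-in for HeapTupleIsHeapOnly() */

        /* 1) the conditional Assert as written in the patch */
        if (onlySummarized)
            assert(isHeapOnly);

        /* 2) the composed double-negation form */
        assert(!(onlySummarized && !isHeapOnly));

        /* 3) the implication form: A implies B  ==  !A || B */
        assert(!onlySummarized || isHeapOnly);

        return 0;
    }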
4) A couple comments and minor tweaks.
5) Undoing a couple unnecessary changes (whitespace, ...).
6) Proper formatting of TU_UpdateIndexes enum.
Alright
+ *
+ * XXX Why do we assign explicit values to the members, instead of just letting
+ * it up to the enum (just like for TM_Result)?
This was from the v15 beta window, to reduce the difference between
bool and TU_UpdateIndexes. With pg16, that can be dropped.
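
As a small illustration (standalone C, with names suffixed so they don't
clash with the real enum), dropping the explicit values would leave the
encoding unchanged, since C enumerators default to 0, 1, 2, ...:

    #include <stdio.h>

    /* Same shape as TU_UpdateIndexes, but without explicit values. */
    typedef enum TU_UpdateIndexesSketch
    {
        TU_None_Sketch,         /* defaults to 0 */
        TU_All_Sketch,          /* defaults to 1 */
        TU_Summarizing_Sketch   /* defaults to 2 */
    } TU_UpdateIndexesSketch;

    int
    main(void)
    {
        /* prints 0 1 2, matching the explicit values in the patch */
        printf("%d %d %d\n",
               TU_None_Sketch, TU_All_Sketch, TU_Summarizing_Sketch);
        return 0;
    }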
7) Comment in RelationGetIndexAttrBitmap() is misleading, as it still
references hotblockingattrs, even though it may update summarizedattrs
in some cases.
How about

    Since we have covering indexes with non-key columns, we must
    handle them accurately here. Non-key columns must be added into
    the hotblocking or summarizing attrs bitmap, since they are in
    the index, and update shouldn't miss them.

instead for that section?
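
To make the distinction concrete, a simplified standalone sketch (not the
actual relcache.c code; the AttrSet type and helpers are invented for
illustration) of the routing that the relcache.c hunk above performs:

    #include <stdbool.h>
    #include <stdio.h>

    /* Toy attribute set, standing in for a Bitmapset. */
    typedef struct AttrSet
    {
        int     natts;
        int     atts[32];
    } AttrSet;

    static void
    add_attr(AttrSet *set, int attno)
    {
        set->atts[set->natts++] = attno;
    }

    /*
     * Collect an index's attributes into either the summarized or the
     * hot-blocking set, depending on whether the index AM is summarizing
     * (e.g. BRIN). Summarizing indexes don't block HOT, but still need to
     * be updated when their columns change.
     */
    static void
    collect_index_attrs(bool amsummarizing, const int *indexatts, int nindexatts,
                        AttrSet *hotblockingattrs, AttrSet *summarizedattrs)
    {
        AttrSet    *attrs = amsummarizing ? summarizedattrs : hotblockingattrs;

        for (int i = 0; i < nindexatts; i++)
            add_attr(attrs, indexatts[i]);
    }

    int
    main(void)
    {
        AttrSet     hot = {0}, summ = {0};
        int         brin_atts[] = {1};      /* BRIN index on column 1 */
        int         btree_atts[] = {2};     /* btree index on column 2 */

        collect_index_attrs(true, brin_atts, 1, &hot, &summ);
        collect_index_attrs(false, btree_atts, 1, &hot, &summ);

        printf("hot-blocking attrs: %d, summarized attrs: %d\n",
               hot.natts, summ.natts);      /* prints 1, 1 */
        return 0;
    }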
If you agree with these changes, I'll get it committed.
Yes, thanks!
Kind regards,
Matthias van de Meent
On 3/14/23 15:41, Matthias van de Meent wrote:
On Tue, 14 Mar 2023 at 14:49, Tomas Vondra
<tomas.vondra@enterprisedb.com> wrote:...
If you agree with these changes, I'll get it committed.
Yes, thanks!
I've tweaked the patch per the last round of comments, cleaned up the
commit message a bit (it still talked about unused bit in tuple header
and so on), and pushed it.
Thanks for fixing the issues that got the patch reverted last year!
regards
--
Tomas Vondra
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company
On Mon, 20 Mar 2023 at 11:11, Tomas Vondra
<tomas.vondra@enterprisedb.com> wrote:
On 3/14/23 15:41, Matthias van de Meent wrote:
On Tue, 14 Mar 2023 at 14:49, Tomas Vondra
<tomas.vondra@enterprisedb.com> wrote:...
If you agree with these changes, I'll get it committed.
Yes, thanks!
I've tweaked the patch per the last round of comments, cleaned up the
commit message a bit (it still talked about unused bit in tuple header
and so on), and pushed it.

Thanks for fixing the issues that got the patch reverted last year!
Thanks for helping get this in!
Kind regards,
Matthias van de Meent.
On Mon, 20 Mar 2023 at 11:24, Matthias van de Meent
<boekewurm+postgres@gmail.com> wrote:
On Mon, 20 Mar 2023 at 11:11, Tomas Vondra
<tomas.vondra@enterprisedb.com> wrote:

On 3/14/23 15:41, Matthias van de Meent wrote:
On Tue, 14 Mar 2023 at 14:49, Tomas Vondra
<tomas.vondra@enterprisedb.com> wrote:...
If you agree with these changes, I'll get it committed.
Yes, thanks!
I've tweaked the patch per the last round of comments, cleaned up the
commit message a bit (it still talked about unused bit in tuple header
and so on), and pushed it.

Thanks for fixing the issues that got the patch reverted last year!
Thanks for helping get this in!
Thanks for fixing the problems!
Kind regards,
Matthias van de Meent.