Showing primitive index scan count in EXPLAIN ANALYZE (for skip scan and SAOP scans)
Attached patch has EXPLAIN ANALYZE display the total number of
primitive index scans for all 3 kinds of index scan node. This is
useful for index scans that happen to use SAOP arrays. It also seems
almost essential to offer this kind of instrumentation for the skip
scan patch [1]. Skip scan works by reusing all of the Postgres 17 work
(see commit 5bf748b8) to skip over irrelevant sections of a composite
index with a low cardinality leading column, so it has all the same
issues.
One reason to have this patch is to differentiate between similar
cases involving simple SAOP arrays. The user will have some reasonable
way of determining how a query such as this:
pg@regression:5432 [2070325]=# explain (analyze, buffers, costs off,
summary off)
select
abalance
from
pgbench_accounts
where
aid in (1, 2, 3, 4, 5);
┌──────────────────────────────────────────────────────────────────────────────────────────────────────┐
│                                              QUERY PLAN                                               │
├──────────────────────────────────────────────────────────────────────────────────────────────────────┤
│ Index Scan using pgbench_accounts_pkey on pgbench_accounts (actual time=0.007..0.008 rows=5 loops=1) │
│   Index Cond: (aid = ANY ('{1,2,3,4,5}'::integer[]))                                                 │
│   Primitive Index Scans: 1                                                                           │
│   Buffers: shared hit=4                                                                              │
└──────────────────────────────────────────────────────────────────────────────────────────────────────┘
(4 rows)
...differs from a similar query, such as this:
pg@regression:5432 [2070325]=# explain (analyze, buffers, costs off,
summary off)
select
abalance
from
pgbench_accounts
where
aid in (1000, 2000, 3000, 4000, 5000);
┌──────────────────────────────────────────────────────────────────────────────────────────────────────┐
│                                              QUERY PLAN                                               │
├──────────────────────────────────────────────────────────────────────────────────────────────────────┤
│ Index Scan using pgbench_accounts_pkey on pgbench_accounts (actual time=0.006..0.012 rows=5 loops=1) │
│   Index Cond: (aid = ANY ('{1000,2000,3000,4000,5000}'::integer[]))                                  │
│   Primitive Index Scans: 5                                                                           │
│   Buffers: shared hit=20                                                                             │
└──────────────────────────────────────────────────────────────────────────────────────────────────────┘
(4 rows)
Another reason to have this patch is consistency. We're only showing
the user the number of times we've incremented
pg_stat_user_tables.idx_scan in each case. The fact that
pg_stat_user_tables.idx_scan counts primitive index scans like this is
nothing new. That issue was only documented quite recently, as part of
the Postgres 17 work, and it seems quite misleading. It's consistent,
but not necessarily unsurprising. Making it readily apparent that
there is more than one primitive index scan involved here makes the
issue less surprising.
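To make that concrete, here's a rough (untested) sketch of how to watch
the existing counter move, using pg_stat_user_indexes rather than
pg_stat_user_tables so that only pgbench_accounts_pkey is counted, and
pg_stat_force_next_flush() just to avoid waiting out the usual stats
reporting delay:

-- Sketch only: the existing cumulative counter already advances once
-- per primitive index scan, so the second pgbench query shown above
-- should bump idx_scan by 5, not by 1
select idx_scan from pg_stat_user_indexes
where indexrelname = 'pgbench_accounts_pkey';

select abalance from pgbench_accounts
where aid in (1000, 2000, 3000, 4000, 5000);

select pg_stat_force_next_flush(); -- flush this backend's pending stats

select idx_scan from pg_stat_user_indexes
where indexrelname = 'pgbench_accounts_pkey';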
Skip scan
---------
Here's an example with this EXPLAIN ANALYZE patch applied on top of my
skip scan patch [1], using the tenk1 table left behind when the
standard regression tests are run:
pg@regression:5432 [2070865]=# create index on tenk1 (four, stringu1);
CREATE INDEX
pg@regression:5432 [2070865]=# explain (analyze, buffers, costs off,
summary off)
select
stringu1
from
tenk1
where
-- Omitted: the leading column on "four"
stringu1 = 'YGAAAA';
┌───────────────────────────────────────────────────────────────────────────────────────────────────┐
│                                             QUERY PLAN                                             │
├───────────────────────────────────────────────────────────────────────────────────────────────────┤
│ Index Only Scan using tenk1_four_stringu1_idx on tenk1 (actual time=0.011..0.017 rows=15 loops=1) │
│   Index Cond: (stringu1 = 'YGAAAA'::name)                                                         │
│   Heap Fetches: 0                                                                                  │
│   Primitive Index Scans: 5                                                                         │
│   Buffers: shared hit=11                                                                           │
└───────────────────────────────────────────────────────────────────────────────────────────────────┘
(5 rows)
Notice that there are 5 primitive index scans here. That's what I'd
expect, given that there are exactly 4 distinct "logical subindexes"
implied by our use of a leading column on "four" as the scan's skip
column. Under the hood, an initial primitive index scan locates the
lowest "four" value. There are then 4 additional primitive index scans
to locate the next "four" value (needed when the current "four" value
gets past the value's "stringu1 = 'YGAAAA'" tuples).
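To spell out that arithmetic (just an illustrative query -- it has
nothing to do with the patch itself):

-- tenk1.four has exactly 4 distinct values (0..3).  One initial descent,
-- plus one more descent per "find the next value" step (including the
-- final step that finds no further values), gives the 5 primitive scans
-- shown above.
select count(distinct four) as ndistinct,
       count(distinct four) + 1 as expected_primitive_scans
from tenk1;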
Obviously, the cardinality of the leading column predicts the number
of primitive index scans at runtime. But the relationship can be much
more complicated than this example may suggest.
Skewness matters, too. Small clusters of index tuples with unique
leading column values will greatly increase column
cardinality/ndistinct, without a commensurate increase in the cost of
a skip scan (that skips using that column). Those small clusters of
unique values will appear on the same few leaf pages. It follows that
they cannot substantially increase the number of primitive scans
required at runtime -- they'll just be read all together at once.
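If it helps to make that concrete, here's a deliberately artificial
(and untested) setup along those lines -- 10 common leading values plus
a long tail of unique outliers that all land on a handful of leaf pages
at the high end of the index. The table and index names are made up,
and the EXPLAIN obviously assumes the skip scan patch is applied:

-- ndistinct for "a" is high (roughly 1,010), but the ~1,000 outlier
-- values only contribute one index tuple each, clustered together on a
-- few leaf pages.  A skip scan of skewed_a_b_idx for "b = 42" shouldn't
-- need many more primitive index scans than a 10-value leading column
-- would imply.
create table skewed (a int, b int);
insert into skewed
select case when g % 1000 = 0 then g else g % 10 end, g
from generate_series(1, 1000000) g;
create index skewed_a_b_idx on skewed (a, b);
vacuum analyze skewed;

explain (analyze, buffers, costs off, summary off)
select * from skewed where b = 42;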
An important goal of my design for skip scan is that we avoid the need
for special index paths within the optimizer. Whether or not we skip
is always a runtime decision (when a skippable index attribute exists
at all). The optimizer needs to know about skipping for costing
purposes only -- all of the required optimizer changes are in
selfuncs.c. That's why you didn't see some kind of special new index
scan node here -- you just saw the number of primitive index scans.
My motivation for working on this EXPLAIN ANALYZE patch is primarily
skip scan. I don't think that it necessarily matters, though. I think
that this patch can be treated as independent work. It would have been
weird not to bring up skip scan even once here, though.
Presentation design choices
---------------------------
I've used the term "primitive index scan" for this. That is the
existing user-visible terminology [2], though I suppose that that
could be revisited now.
Another quasi-arbitrary design choice: I don't break out primitive
index scans for scan nodes with multiple loops (e.g., the inner side
of a nested loop join). The count of primitive scans accumulates
across index_rescan calls. I did things this way because it felt
slightly more logical to follow what we show for "Buffers" --
primitive index scans are another physical cost. I'm certainly not
opposed to doing that part differently. It doesn't have to be one or
the other (could break it out both ways), if people think that the
added verbosity is worth it.
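As a concrete (though untested) illustration of what that choice means:
with something like the following, and assuming the planner picks a
nested loop with a parameterized inner index scan on t2, the inner
node's "Primitive Index Scans" would be the total accumulated across
all of its loops/rescans, just like "Buffers" -- not a per-loop figure:

-- Sketch only; whether this actually produces a nested loop is up to
-- the planner.  The inner index scan is rescanned once per outer row,
-- and the proposed counter keeps accumulating across those rescans.
explain (analyze, buffers, costs off, summary off)
select t1.unique1, t2.stringu1
from tenk1 t1
join tenk1 t2 on t2.unique1 = t1.tenthous
where t1.unique1 < 10;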
I think that we shouldn't be counting calls to _bt_first as a
primitive index scan unless they either call _bt_search or
_bt_endpoint to descend the index (in the case of nbtree scans). This
means that cases where we detect a contradictory qual in
_bt_preprocess_keys should count as having zero primitive index scans.
That is technically an independent thing, though it seems far more
logical to just do it that way.
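For example, with the patch as written, a case like this one (where
nbtree preprocessing can prove that no rows could possibly match, so
_bt_first returns without ever calling _bt_search or _bt_endpoint)
simply doesn't display a "Primitive Index Scans" line at all, since the
count is zero -- and it no longer increments pg_stat_*_indexes.idx_scan
either:

-- Self-contradictory qual: _bt_preprocess_keys detects that aid can't
-- be both 1 and 2, so the index is never descended
explain (analyze, buffers, costs off, summary off)
select abalance from pgbench_accounts
where aid = 1 and aid = 2;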
Actually, I think that there might be existing bugs on HEAD, with
parallel index scan -- I think we might be overcounting. We're not
properly accounting for the fact that parallel workers usually don't
perform a primitive index scan when their backend calls into
_bt_first. I wonder if I should address that separately, as a bug
fix...
[1]: /messages/by-id/CAH2-Wzmn1YsLzOGgjAQZdn1STSG_y8qP__vggTaPAYXJP+G4bw@mail.gmail.com
[2]: https://www.postgresql.org/docs/devel/monitoring-stats.html#MONITORING-PG-STAT-ALL-INDEXES-VIEW -- see "Note" box
--
Peter Geoghegan
Attachments:
v1-0001-Show-primitive-scan-count-in-EXPLAIN-ANALYZE.patch (application/octet-stream)
From f6fb654d92073de6c8141bfad022278817b7920e Mon Sep 17 00:00:00 2001
From: Peter Geoghegan <pg@bowt.ie>
Date: Wed, 14 Aug 2024 13:50:23 -0400
Subject: [PATCH v1] Show primitive scan count in EXPLAIN ANALYZE.
Also stop counting the case where nbtree detects contradictory quals as
a distinct primitive index scan (do so neither in EXPLAIN ANALYZE nor in
the pg_stat_*_indexes.idx_scan stats).
---
src/include/access/relscan.h | 3 +
src/backend/access/brin/brin.c | 1 +
src/backend/access/gin/ginscan.c | 1 +
src/backend/access/gist/gistget.c | 2 +
src/backend/access/hash/hashsearch.c | 1 +
src/backend/access/index/genam.c | 1 +
src/backend/access/nbtree/nbtree.c | 13 +++++
src/backend/access/nbtree/nbtsearch.c | 9 ++-
src/backend/access/spgist/spgscan.c | 1 +
src/backend/commands/explain.c | 39 +++++++++++++
doc/src/sgml/bloom.sgml | 2 +
doc/src/sgml/perform.sgml | 8 +++
doc/src/sgml/ref/explain.sgml | 1 +
doc/src/sgml/rules.sgml | 1 +
src/test/regress/expected/brin_multi.out | 27 ++++++---
src/test/regress/expected/memoize.out | 50 +++++++++++-----
src/test/regress/expected/partition_prune.out | 57 ++++++++++++++-----
src/test/regress/expected/select.out | 3 +-
src/test/regress/sql/memoize.sql | 6 +-
src/test/regress/sql/partition_prune.sql | 4 ++
20 files changed, 189 insertions(+), 41 deletions(-)
diff --git a/src/include/access/relscan.h b/src/include/access/relscan.h
index 521043304..87619ec55 100644
--- a/src/include/access/relscan.h
+++ b/src/include/access/relscan.h
@@ -130,6 +130,9 @@ typedef struct IndexScanDescData
bool xactStartedInRecovery; /* prevents killing/seeing killed
* tuples */
+ /* index access method instrumentation output state */
+ uint64 nprimscans; /* # of primitive index scans */
+
/* index access method's private state */
void *opaque; /* access-method-specific info */
diff --git a/src/backend/access/brin/brin.c b/src/backend/access/brin/brin.c
index 6467bed60..7cd4c2e57 100644
--- a/src/backend/access/brin/brin.c
+++ b/src/backend/access/brin/brin.c
@@ -581,6 +581,7 @@ bringetbitmap(IndexScanDesc scan, TIDBitmap *tbm)
opaque = (BrinOpaque *) scan->opaque;
bdesc = opaque->bo_bdesc;
pgstat_count_index_scan(idxRel);
+ scan->nprimscans++;
/*
* We need to know the size of the table so that we know how long to
diff --git a/src/backend/access/gin/ginscan.c b/src/backend/access/gin/ginscan.c
index af24d3854..b031d8228 100644
--- a/src/backend/access/gin/ginscan.c
+++ b/src/backend/access/gin/ginscan.c
@@ -436,6 +436,7 @@ ginNewScanKey(IndexScanDesc scan)
MemoryContextSwitchTo(oldCtx);
pgstat_count_index_scan(scan->indexRelation);
+ scan->nprimscans++;
}
void
diff --git a/src/backend/access/gist/gistget.c b/src/backend/access/gist/gistget.c
index b35b8a975..e452008e4 100644
--- a/src/backend/access/gist/gistget.c
+++ b/src/backend/access/gist/gistget.c
@@ -625,6 +625,7 @@ gistgettuple(IndexScanDesc scan, ScanDirection dir)
GISTSearchItem fakeItem;
pgstat_count_index_scan(scan->indexRelation);
+ scan->nprimscans++;
so->firstCall = false;
so->curPageData = so->nPageData = 0;
@@ -750,6 +751,7 @@ gistgetbitmap(IndexScanDesc scan, TIDBitmap *tbm)
return 0;
pgstat_count_index_scan(scan->indexRelation);
+ scan->nprimscans++;
/* Begin the scan by processing the root page */
so->curPageData = so->nPageData = 0;
diff --git a/src/backend/access/hash/hashsearch.c b/src/backend/access/hash/hashsearch.c
index 0d99d6abc..a827c4052 100644
--- a/src/backend/access/hash/hashsearch.c
+++ b/src/backend/access/hash/hashsearch.c
@@ -298,6 +298,7 @@ _hash_first(IndexScanDesc scan, ScanDirection dir)
HashScanPosItem *currItem;
pgstat_count_index_scan(rel);
+ scan->nprimscans++;
/*
* We do not support hash scans with no index qualification, because we
diff --git a/src/backend/access/index/genam.c b/src/backend/access/index/genam.c
index de751e8e4..58a447494 100644
--- a/src/backend/access/index/genam.c
+++ b/src/backend/access/index/genam.c
@@ -116,6 +116,7 @@ RelationGetIndexScan(Relation indexRelation, int nkeys, int norderbys)
scan->xactStartedInRecovery = TransactionStartedDuringRecovery();
scan->ignore_killed_tuples = !scan->xactStartedInRecovery;
+ scan->nprimscans = 0; /* not reset by index_rescan */
scan->opaque = NULL;
scan->xs_itup = NULL;
diff --git a/src/backend/access/nbtree/nbtree.c b/src/backend/access/nbtree/nbtree.c
index 686a3206f..906b1be51 100644
--- a/src/backend/access/nbtree/nbtree.c
+++ b/src/backend/access/nbtree/nbtree.c
@@ -70,6 +70,7 @@ typedef struct BTParallelScanDescData
BTPS_State btps_pageStatus; /* indicates whether next page is
* available for scan. see above for
* possible states of parallel scan. */
+ uint64 btps_nprimscans; /* instrumentation */
slock_t btps_mutex; /* protects above variables, btps_arrElems */
ConditionVariable btps_cv; /* used to synchronize parallel scan */
@@ -551,6 +552,7 @@ btinitparallelscan(void *target)
SpinLockInit(&bt_target->btps_mutex);
bt_target->btps_scanPage = InvalidBlockNumber;
bt_target->btps_pageStatus = BTPARALLEL_NOT_INITIALIZED;
+ bt_target->btps_nprimscans = 0;
ConditionVariableInit(&bt_target->btps_cv);
}
@@ -576,6 +578,7 @@ btparallelrescan(IndexScanDesc scan)
SpinLockAcquire(&btscan->btps_mutex);
btscan->btps_scanPage = InvalidBlockNumber;
btscan->btps_pageStatus = BTPARALLEL_NOT_INITIALIZED;
+ /* deliberately don't reset btps_nprimscans (matches index_rescan) */
SpinLockRelease(&btscan->btps_mutex);
}
@@ -680,6 +683,11 @@ _bt_parallel_seize(IndexScanDesc scan, BlockNumber *pageno, bool first)
* We have successfully seized control of the scan for the purpose
* of advancing it to a new page!
*/
+ if (first && btscan->btps_pageStatus == BTPARALLEL_NOT_INITIALIZED)
+ {
+ /* count the first primitive scan for this btrescan */
+ btscan->btps_nprimscans++;
+ }
btscan->btps_pageStatus = BTPARALLEL_ADVANCING;
*pageno = btscan->btps_scanPage;
exit_loop = true;
@@ -747,11 +755,15 @@ _bt_parallel_done(IndexScanDesc scan)
* already
*/
SpinLockAcquire(&btscan->btps_mutex);
+ btscan->btps_nprimscans += scan->nprimscans;
+ scan->nprimscans = btscan->btps_nprimscans;
if (btscan->btps_pageStatus != BTPARALLEL_DONE)
{
btscan->btps_pageStatus = BTPARALLEL_DONE;
status_changed = true;
}
+ /* Copy the authoritative shared primitive scan counter to local field */
+ scan->nprimscans = btscan->btps_nprimscans;
SpinLockRelease(&btscan->btps_mutex);
/* wake up all the workers associated with this parallel scan */
@@ -785,6 +797,7 @@ _bt_parallel_primscan_schedule(IndexScanDesc scan, BlockNumber prev_scan_page)
{
btscan->btps_scanPage = InvalidBlockNumber;
btscan->btps_pageStatus = BTPARALLEL_NEED_PRIMSCAN;
+ btscan->btps_nprimscans++;
/* Serialize scan's current array keys */
for (int i = 0; i < so->numArrayKeys; i++)
diff --git a/src/backend/access/nbtree/nbtsearch.c b/src/backend/access/nbtree/nbtsearch.c
index 2551df8a6..596de2889 100644
--- a/src/backend/access/nbtree/nbtsearch.c
+++ b/src/backend/access/nbtree/nbtsearch.c
@@ -896,8 +896,6 @@ _bt_first(IndexScanDesc scan, ScanDirection dir)
Assert(!BTScanPosIsValid(so->currPos));
- pgstat_count_index_scan(rel);
-
/*
* Examine the scan keys and eliminate any redundant keys; also mark the
* keys that must be matched to continue the scan.
@@ -960,6 +958,13 @@ _bt_first(IndexScanDesc scan, ScanDirection dir)
_bt_start_array_keys(scan, dir);
}
+ /*
+ * We've established that we'll either call _bt_search or _bt_endpoint.
+ * Count this as a primitive index scan.
+ */
+ pgstat_count_index_scan(rel);
+ scan->nprimscans++;
+
/*----------
* Examine the scan keys to discover where we need to start the scan.
*
diff --git a/src/backend/access/spgist/spgscan.c b/src/backend/access/spgist/spgscan.c
index 03293a781..6b8486ae9 100644
--- a/src/backend/access/spgist/spgscan.c
+++ b/src/backend/access/spgist/spgscan.c
@@ -423,6 +423,7 @@ spgrescan(IndexScanDesc scan, ScanKey scankey, int nscankeys,
/* count an indexscan for stats */
pgstat_count_index_scan(scan->indexRelation);
+ scan->nprimscans++;
}
void
diff --git a/src/backend/commands/explain.c b/src/backend/commands/explain.c
index 5771aabf4..39e7234e3 100644
--- a/src/backend/commands/explain.c
+++ b/src/backend/commands/explain.c
@@ -13,6 +13,7 @@
*/
#include "postgres.h"
+#include "access/relscan.h"
#include "access/xact.h"
#include "catalog/pg_type.h"
#include "commands/createas.h"
@@ -89,6 +90,7 @@ static void show_plan_tlist(PlanState *planstate, List *ancestors,
static void show_expression(Node *node, const char *qlabel,
PlanState *planstate, List *ancestors,
bool useprefix, ExplainState *es);
+static void show_indexscan_nprimscans(PlanState *planstate, ExplainState *es);
static void show_qual(List *qual, const char *qlabel,
PlanState *planstate, List *ancestors,
bool useprefix, ExplainState *es);
@@ -1980,6 +1982,8 @@ ExplainNode(PlanState *planstate, List *ancestors,
if (plan->qual)
show_instrumentation_count("Rows Removed by Filter", 1,
planstate, es);
+ if (es->analyze)
+ show_indexscan_nprimscans(planstate, es);
break;
case T_IndexOnlyScan:
show_scan_qual(((IndexOnlyScan *) plan)->indexqual,
@@ -1994,12 +1998,17 @@ ExplainNode(PlanState *planstate, List *ancestors,
show_instrumentation_count("Rows Removed by Filter", 1,
planstate, es);
if (es->analyze)
+ {
ExplainPropertyFloat("Heap Fetches", NULL,
planstate->instrument->ntuples2, 0, es);
+ show_indexscan_nprimscans(planstate, es);
+ }
break;
case T_BitmapIndexScan:
show_scan_qual(((BitmapIndexScan *) plan)->indexqualorig,
"Index Cond", planstate, ancestors, es);
+ if (es->analyze)
+ show_indexscan_nprimscans(planstate, es);
break;
case T_BitmapHeapScan:
show_scan_qual(((BitmapHeapScan *) plan)->bitmapqualorig,
@@ -2509,6 +2518,36 @@ show_expression(Node *node, const char *qlabel,
ExplainPropertyText(qlabel, exprstr, es);
}
+/*
+ * Show the number of primitive index scans within an IndexScan node,
+ * IndexOnlyScan node, or BitmapIndexScan node
+ */
+static void
+show_indexscan_nprimscans(PlanState *planstate, ExplainState *es)
+{
+ Plan *plan = planstate->plan;
+ struct IndexScanDescData *scanDesc = NULL;
+
+ switch (nodeTag(plan))
+ {
+ case T_IndexScan:
+ scanDesc = ((IndexScanState *) planstate)->iss_ScanDesc;
+ break;
+ case T_IndexOnlyScan:
+ scanDesc = ((IndexOnlyScanState *) planstate)->ioss_ScanDesc;
+ break;
+ case T_BitmapIndexScan:
+ scanDesc = ((BitmapIndexScanState *) planstate)->biss_ScanDesc;
+ break;
+ default:
+ break;
+ }
+
+ if (scanDesc && scanDesc->nprimscans > 0)
+ ExplainPropertyUInteger("Primitive Index Scans", NULL,
+ scanDesc->nprimscans, es);
+}
+
/*
* Show a qualifier expression (which is a List with implicit AND semantics)
*/
diff --git a/doc/src/sgml/bloom.sgml b/doc/src/sgml/bloom.sgml
index 19f2b172c..5f06030d8 100644
--- a/doc/src/sgml/bloom.sgml
+++ b/doc/src/sgml/bloom.sgml
@@ -170,6 +170,7 @@ CREATE INDEX
Heap Blocks: exact=28
-> Bitmap Index Scan on bloomidx (cost=0.00..1792.00 rows=2 width=0) (actual time=0.356..0.356 rows=29 loops=1)
Index Cond: ((i2 = 898732) AND (i5 = 123451))
+ Primitive Index Scans: 1
Planning Time: 0.099 ms
Execution Time: 0.408 ms
(8 rows)
@@ -202,6 +203,7 @@ CREATE INDEX
-> BitmapAnd (cost=24.34..24.34 rows=2 width=0) (actual time=0.027..0.027 rows=0 loops=1)
-> Bitmap Index Scan on btreeidx5 (cost=0.00..12.04 rows=500 width=0) (actual time=0.026..0.026 rows=0 loops=1)
Index Cond: (i5 = 123451)
+ Primitive Index Scans: 1
-> Bitmap Index Scan on btreeidx2 (cost=0.00..12.04 rows=500 width=0) (never executed)
Index Cond: (i2 = 898732)
Planning Time: 0.491 ms
diff --git a/doc/src/sgml/perform.sgml b/doc/src/sgml/perform.sgml
index ff689b652..987e5b47f 100644
--- a/doc/src/sgml/perform.sgml
+++ b/doc/src/sgml/perform.sgml
@@ -702,8 +702,10 @@ WHERE t1.unique1 < 10 AND t1.unique2 = t2.unique2;
Heap Blocks: exact=10
-> Bitmap Index Scan on tenk1_unique1 (cost=0.00..4.36 rows=10 width=0) (actual time=0.004..0.004 rows=10 loops=1)
Index Cond: (unique1 < 10)
+ Primitive Index Scans: 1
-> Index Scan using tenk2_unique2 on tenk2 t2 (cost=0.29..7.90 rows=1 width=244) (actual time=0.003..0.003 rows=1 loops=10)
Index Cond: (unique2 = t1.unique2)
+ Primitive Index Scans: 1
Planning Time: 0.485 ms
Execution Time: 0.073 ms
</screen>
@@ -754,6 +756,7 @@ WHERE t1.unique1 < 100 AND t1.unique2 = t2.unique2 ORDER BY t1.fivethous;
Heap Blocks: exact=90
-> Bitmap Index Scan on tenk1_unique1 (cost=0.00..5.04 rows=100 width=0) (actual time=0.013..0.013 rows=100 loops=1)
Index Cond: (unique1 < 100)
+ Primitive Index Scans: 1
Planning Time: 0.187 ms
Execution Time: 3.036 ms
</screen>
@@ -819,6 +822,7 @@ EXPLAIN ANALYZE SELECT * FROM polygon_tbl WHERE f1 @> polygon '(0.5,2.0)';
-------------------------------------------------------------------&zwsp;-------------------------------------------------------
Index Scan using gpolygonind on polygon_tbl (cost=0.13..8.15 rows=1 width=85) (actual time=0.074..0.074 rows=0 loops=1)
Index Cond: (f1 @> '((0.5,2))'::polygon)
+ Primitive Index Scans: 1
Rows Removed by Index Recheck: 1
Planning Time: 0.039 ms
Execution Time: 0.098 ms
@@ -848,9 +852,11 @@ EXPLAIN (ANALYZE, BUFFERS) SELECT * FROM tenk1 WHERE unique1 < 100 AND unique
Buffers: shared hit=4 read=3
-> Bitmap Index Scan on tenk1_unique1 (cost=0.00..5.04 rows=100 width=0) (actual time=0.027..0.027 rows=100 loops=1)
Index Cond: (unique1 < 100)
+ Primitive Index Scans: 1
Buffers: shared hit=2
-> Bitmap Index Scan on tenk1_unique2 (cost=0.00..19.78 rows=999 width=0) (actual time=0.070..0.070 rows=999 loops=1)
Index Cond: (unique2 > 9000)
+ Primitive Index Scans: 1
Buffers: shared hit=2 read=3
Planning:
Buffers: shared hit=3
@@ -883,6 +889,7 @@ EXPLAIN ANALYZE UPDATE tenk1 SET hundred = hundred + 1 WHERE unique1 < 100;
Heap Blocks: exact=90
-> Bitmap Index Scan on tenk1_unique1 (cost=0.00..5.04 rows=100 width=0) (actual time=0.031..0.031 rows=100 loops=1)
Index Cond: (unique1 < 100)
+ Primitive Index Scans: 1
Planning Time: 0.151 ms
Execution Time: 1.856 ms
@@ -1019,6 +1026,7 @@ EXPLAIN ANALYZE SELECT * FROM tenk1 WHERE unique1 < 100 AND unique2 > 9000
Index Cond: (unique2 > 9000)
Filter: (unique1 < 100)
Rows Removed by Filter: 287
+ Primitive Index Scans: 1
Planning Time: 0.077 ms
Execution Time: 0.086 ms
</screen>
diff --git a/doc/src/sgml/ref/explain.sgml b/doc/src/sgml/ref/explain.sgml
index db9d3a854..859007ca0 100644
--- a/doc/src/sgml/ref/explain.sgml
+++ b/doc/src/sgml/ref/explain.sgml
@@ -502,6 +502,7 @@ EXPLAIN ANALYZE EXECUTE query(100, 200);
Batches: 1 Memory Usage: 24kB
-> Index Scan using test_pkey on test (cost=0.29..10.27 rows=99 width=8) (actual time=0.009..0.025 rows=99 loops=1)
Index Cond: ((id > 100) AND (id < 200))
+ Primitive Index Scans: 1
Planning Time: 0.244 ms
Execution Time: 0.073 ms
(7 rows)
diff --git a/doc/src/sgml/rules.sgml b/doc/src/sgml/rules.sgml
index 7a928bd7b..873132c30 100644
--- a/doc/src/sgml/rules.sgml
+++ b/doc/src/sgml/rules.sgml
@@ -1046,6 +1046,7 @@ SELECT count(*) FROM words WHERE word = 'caterpiler';
-> Index Only Scan using wrd_word on wrd (cost=0.42..4.44 rows=1 width=0) (actual time=0.039..0.039 rows=0 loops=1)
Index Cond: (word = 'caterpiler'::text)
Heap Fetches: 0
+ Primitive Index Scans: 1
Planning time: 0.164 ms
Execution time: 0.117 ms
</programlisting>
diff --git a/src/test/regress/expected/brin_multi.out b/src/test/regress/expected/brin_multi.out
index ae9ce9d8e..7e5c0b456 100644
--- a/src/test/regress/expected/brin_multi.out
+++ b/src/test/regress/expected/brin_multi.out
@@ -853,7 +853,8 @@ SELECT * FROM brin_date_test WHERE a = '2023-01-01'::date;
Recheck Cond: (a = '2023-01-01'::date)
-> Bitmap Index Scan on brin_date_test_a_idx (actual rows=0 loops=1)
Index Cond: (a = '2023-01-01'::date)
-(4 rows)
+ Primitive Index Scans: 1
+(5 rows)
DROP TABLE brin_date_test;
RESET enable_seqscan;
@@ -872,7 +873,8 @@ SELECT * FROM brin_timestamp_test WHERE a = '2023-01-01'::timestamp;
Recheck Cond: (a = '2023-01-01 00:00:00'::timestamp without time zone)
-> Bitmap Index Scan on brin_timestamp_test_a_idx (actual rows=0 loops=1)
Index Cond: (a = '2023-01-01 00:00:00'::timestamp without time zone)
-(4 rows)
+ Primitive Index Scans: 1
+(5 rows)
EXPLAIN (ANALYZE, TIMING OFF, COSTS OFF, SUMMARY OFF)
SELECT * FROM brin_timestamp_test WHERE a = '1900-01-01'::timestamp;
@@ -882,7 +884,8 @@ SELECT * FROM brin_timestamp_test WHERE a = '1900-01-01'::timestamp;
Recheck Cond: (a = '1900-01-01 00:00:00'::timestamp without time zone)
-> Bitmap Index Scan on brin_timestamp_test_a_idx (actual rows=0 loops=1)
Index Cond: (a = '1900-01-01 00:00:00'::timestamp without time zone)
-(4 rows)
+ Primitive Index Scans: 1
+(5 rows)
DROP TABLE brin_timestamp_test;
RESET enable_seqscan;
@@ -900,7 +903,8 @@ SELECT * FROM brin_date_test WHERE a = '2023-01-01'::date;
Recheck Cond: (a = '2023-01-01'::date)
-> Bitmap Index Scan on brin_date_test_a_idx (actual rows=0 loops=1)
Index Cond: (a = '2023-01-01'::date)
-(4 rows)
+ Primitive Index Scans: 1
+(5 rows)
EXPLAIN (ANALYZE, TIMING OFF, COSTS OFF, SUMMARY OFF)
SELECT * FROM brin_date_test WHERE a = '1900-01-01'::date;
@@ -910,7 +914,8 @@ SELECT * FROM brin_date_test WHERE a = '1900-01-01'::date;
Recheck Cond: (a = '1900-01-01'::date)
-> Bitmap Index Scan on brin_date_test_a_idx (actual rows=0 loops=1)
Index Cond: (a = '1900-01-01'::date)
-(4 rows)
+ Primitive Index Scans: 1
+(5 rows)
DROP TABLE brin_date_test;
RESET enable_seqscan;
@@ -929,7 +934,8 @@ SELECT * FROM brin_interval_test WHERE a = '-30 years'::interval;
Recheck Cond: (a = '@ 30 years ago'::interval)
-> Bitmap Index Scan on brin_interval_test_a_idx (actual rows=0 loops=1)
Index Cond: (a = '@ 30 years ago'::interval)
-(4 rows)
+ Primitive Index Scans: 1
+(5 rows)
EXPLAIN (ANALYZE, TIMING OFF, COSTS OFF, SUMMARY OFF)
SELECT * FROM brin_interval_test WHERE a = '30 years'::interval;
@@ -939,7 +945,8 @@ SELECT * FROM brin_interval_test WHERE a = '30 years'::interval;
Recheck Cond: (a = '@ 30 years'::interval)
-> Bitmap Index Scan on brin_interval_test_a_idx (actual rows=0 loops=1)
Index Cond: (a = '@ 30 years'::interval)
-(4 rows)
+ Primitive Index Scans: 1
+(5 rows)
DROP TABLE brin_interval_test;
RESET enable_seqscan;
@@ -957,7 +964,8 @@ SELECT * FROM brin_interval_test WHERE a = '-30 years'::interval;
Recheck Cond: (a = '@ 30 years ago'::interval)
-> Bitmap Index Scan on brin_interval_test_a_idx (actual rows=0 loops=1)
Index Cond: (a = '@ 30 years ago'::interval)
-(4 rows)
+ Primitive Index Scans: 1
+(5 rows)
EXPLAIN (ANALYZE, TIMING OFF, COSTS OFF, SUMMARY OFF)
SELECT * FROM brin_interval_test WHERE a = '30 years'::interval;
@@ -967,7 +975,8 @@ SELECT * FROM brin_interval_test WHERE a = '30 years'::interval;
Recheck Cond: (a = '@ 30 years'::interval)
-> Bitmap Index Scan on brin_interval_test_a_idx (actual rows=0 loops=1)
Index Cond: (a = '@ 30 years'::interval)
-(4 rows)
+ Primitive Index Scans: 1
+(5 rows)
DROP TABLE brin_interval_test;
RESET enable_seqscan;
diff --git a/src/test/regress/expected/memoize.out b/src/test/regress/expected/memoize.out
index 96906104d..ecad4b67e 100644
--- a/src/test/regress/expected/memoize.out
+++ b/src/test/regress/expected/memoize.out
@@ -22,8 +22,10 @@ begin
ln := regexp_replace(ln, 'Evictions: 0', 'Evictions: Zero');
ln := regexp_replace(ln, 'Evictions: \d+', 'Evictions: N');
ln := regexp_replace(ln, 'Memory Usage: \d+', 'Memory Usage: N');
- ln := regexp_replace(ln, 'Heap Fetches: \d+', 'Heap Fetches: N');
- ln := regexp_replace(ln, 'loops=\d+', 'loops=N');
+ ln := regexp_replace(ln, 'Heap Fetches: \d+', 'Heap Fetches: N');
+ ln := regexp_replace(ln, 'loops=\d+', 'loops=N');
+ ln := regexp_replace(ln, 'Primitive Index Scans: 0', 'Primitive Index Scans: Zero');
+ ln := regexp_replace(ln, 'Primitive Index Scans: \d+', 'Primitive Index Scans: N');
return next ln;
end loop;
end;
@@ -49,7 +51,8 @@ WHERE t2.unique1 < 1000;', false);
-> Index Only Scan using tenk1_unique1 on tenk1 t1 (actual rows=1 loops=N)
Index Cond: (unique1 = t2.twenty)
Heap Fetches: N
-(12 rows)
+ Primitive Index Scans: N
+(13 rows)
-- And check we get the expected results.
SELECT COUNT(*),AVG(t1.unique1) FROM tenk1 t1
@@ -80,7 +83,8 @@ WHERE t1.unique1 < 1000;', false);
-> Index Only Scan using tenk1_unique1 on tenk1 t2 (actual rows=1 loops=N)
Index Cond: (unique1 = t1.twenty)
Heap Fetches: N
-(12 rows)
+ Primitive Index Scans: N
+(13 rows)
-- And check we get the expected results.
SELECT COUNT(*),AVG(t2.unique1) FROM tenk1 t1,
@@ -106,6 +110,7 @@ WHERE t1.unique1 < 10;', false);
-> Nested Loop Left Join (actual rows=20 loops=N)
-> Index Scan using tenk1_unique1 on tenk1 t1 (actual rows=10 loops=N)
Index Cond: (unique1 < 10)
+ Primitive Index Scans: N
-> Memoize (actual rows=2 loops=N)
Cache Key: t1.two
Cache Mode: binary
@@ -115,7 +120,8 @@ WHERE t1.unique1 < 10;', false);
Rows Removed by Filter: 2
-> Index Scan using tenk1_unique1 on tenk1 t2_1 (actual rows=4 loops=N)
Index Cond: (unique1 < 4)
-(13 rows)
+ Primitive Index Scans: N
+(15 rows)
-- And check we get the expected results.
SELECT COUNT(*),AVG(t2.t1two) FROM tenk1 t1 LEFT JOIN
@@ -149,7 +155,8 @@ WHERE s.c1 = s.c2 AND t1.unique1 < 1000;', false);
Filter: ((t1.two + 1) = unique1)
Rows Removed by Filter: 9999
Heap Fetches: N
-(13 rows)
+ Primitive Index Scans: N
+(14 rows)
-- And check we get the expected results.
SELECT COUNT(*), AVG(t1.twenty) FROM tenk1 t1 LEFT JOIN
@@ -249,7 +256,8 @@ ON t1.x = t2.t::numeric AND t1.t::numeric = t2.x;', false);
Index Cond: (x = (t1.t)::numeric)
Filter: (t1.x = (t)::numeric)
Heap Fetches: N
-(10 rows)
+ Primitive Index Scans: N
+(11 rows)
DROP TABLE expr_key;
-- Reduce work_mem and hash_mem_multiplier so that we see some cache evictions
@@ -276,7 +284,8 @@ WHERE t2.unique1 < 1200;', true);
-> Index Only Scan using tenk1_unique1 on tenk1 t1 (actual rows=1 loops=N)
Index Cond: (unique1 = t2.thousand)
Heap Fetches: N
-(12 rows)
+ Primitive Index Scans: N
+(13 rows)
CREATE TABLE flt (f float);
CREATE INDEX flt_f_idx ON flt (f);
@@ -291,6 +300,7 @@ SELECT * FROM flt f1 INNER JOIN flt f2 ON f1.f = f2.f;', false);
Nested Loop (actual rows=4 loops=N)
-> Index Only Scan using flt_f_idx on flt f1 (actual rows=2 loops=N)
Heap Fetches: N
+ Primitive Index Scans: N
-> Memoize (actual rows=2 loops=N)
Cache Key: f1.f
Cache Mode: logical
@@ -298,7 +308,8 @@ SELECT * FROM flt f1 INNER JOIN flt f2 ON f1.f = f2.f;', false);
-> Index Only Scan using flt_f_idx on flt f2 (actual rows=2 loops=N)
Index Cond: (f = f1.f)
Heap Fetches: N
-(10 rows)
+ Primitive Index Scans: N
+(12 rows)
-- Ensure memoize operates in binary mode
SELECT explain_memoize('
@@ -308,6 +319,7 @@ SELECT * FROM flt f1 INNER JOIN flt f2 ON f1.f >= f2.f;', false);
Nested Loop (actual rows=4 loops=N)
-> Index Only Scan using flt_f_idx on flt f1 (actual rows=2 loops=N)
Heap Fetches: N
+ Primitive Index Scans: N
-> Memoize (actual rows=2 loops=N)
Cache Key: f1.f
Cache Mode: binary
@@ -315,7 +327,8 @@ SELECT * FROM flt f1 INNER JOIN flt f2 ON f1.f >= f2.f;', false);
-> Index Only Scan using flt_f_idx on flt f2 (actual rows=2 loops=N)
Index Cond: (f <= f1.f)
Heap Fetches: N
-(10 rows)
+ Primitive Index Scans: N
+(12 rows)
DROP TABLE flt;
-- Exercise Memoize in binary mode with a large fixed width type and a
@@ -340,7 +353,8 @@ SELECT * FROM strtest s1 INNER JOIN strtest s2 ON s1.n >= s2.n;', false);
Hits: 3 Misses: 3 Evictions: Zero Overflows: 0 Memory Usage: NkB
-> Index Scan using strtest_n_idx on strtest s2 (actual rows=4 loops=N)
Index Cond: (n <= s1.n)
-(8 rows)
+ Primitive Index Scans: N
+(9 rows)
-- Ensure we get 3 hits and 3 misses
SELECT explain_memoize('
@@ -355,7 +369,8 @@ SELECT * FROM strtest s1 INNER JOIN strtest s2 ON s1.t >= s2.t;', false);
Hits: 3 Misses: 3 Evictions: Zero Overflows: 0 Memory Usage: NkB
-> Index Scan using strtest_t_idx on strtest s2 (actual rows=4 loops=N)
Index Cond: (t <= s1.t)
-(8 rows)
+ Primitive Index Scans: N
+(9 rows)
DROP TABLE strtest;
-- Ensure memoize works with partitionwise join
@@ -376,6 +391,7 @@ SELECT * FROM prt t1 INNER JOIN prt t2 ON t1.a = t2.a;', false);
-> Nested Loop (actual rows=16 loops=N)
-> Index Only Scan using iprt_p1_a on prt_p1 t1_1 (actual rows=4 loops=N)
Heap Fetches: N
+ Primitive Index Scans: N
-> Memoize (actual rows=4 loops=N)
Cache Key: t1_1.a
Cache Mode: logical
@@ -383,9 +399,11 @@ SELECT * FROM prt t1 INNER JOIN prt t2 ON t1.a = t2.a;', false);
-> Index Only Scan using iprt_p1_a on prt_p1 t2_1 (actual rows=4 loops=N)
Index Cond: (a = t1_1.a)
Heap Fetches: N
+ Primitive Index Scans: N
-> Nested Loop (actual rows=16 loops=N)
-> Index Only Scan using iprt_p2_a on prt_p2 t1_2 (actual rows=4 loops=N)
Heap Fetches: N
+ Primitive Index Scans: N
-> Memoize (actual rows=4 loops=N)
Cache Key: t1_2.a
Cache Mode: logical
@@ -393,7 +411,8 @@ SELECT * FROM prt t1 INNER JOIN prt t2 ON t1.a = t2.a;', false);
-> Index Only Scan using iprt_p2_a on prt_p2 t2_2 (actual rows=4 loops=N)
Index Cond: (a = t1_2.a)
Heap Fetches: N
-(21 rows)
+ Primitive Index Scans: N
+(25 rows)
-- Ensure memoize works with parameterized union-all Append path
SET enable_partitionwise_join TO off;
@@ -406,6 +425,7 @@ ON t1.a = t2.a;', false);
Nested Loop (actual rows=16 loops=N)
-> Index Only Scan using iprt_p1_a on prt_p1 t1 (actual rows=4 loops=N)
Heap Fetches: N
+ Primitive Index Scans: N
-> Memoize (actual rows=4 loops=N)
Cache Key: t1.a
Cache Mode: logical
@@ -414,10 +434,12 @@ ON t1.a = t2.a;', false);
-> Index Only Scan using iprt_p1_a on prt_p1 (actual rows=4 loops=N)
Index Cond: (a = t1.a)
Heap Fetches: N
+ Primitive Index Scans: N
-> Index Only Scan using iprt_p2_a on prt_p2 (actual rows=0 loops=N)
Index Cond: (a = t1.a)
Heap Fetches: N
-(14 rows)
+ Primitive Index Scans: N
+(17 rows)
DROP TABLE prt;
RESET enable_partitionwise_join;
diff --git a/src/test/regress/expected/partition_prune.out b/src/test/regress/expected/partition_prune.out
index 7a03b4e36..8f5a16f93 100644
--- a/src/test/regress/expected/partition_prune.out
+++ b/src/test/regress/expected/partition_prune.out
@@ -2340,6 +2340,10 @@ begin
ln := regexp_replace(ln, 'Workers Launched: \d+', 'Workers Launched: N');
ln := regexp_replace(ln, 'actual rows=\d+ loops=\d+', 'actual rows=N loops=N');
ln := regexp_replace(ln, 'Rows Removed by Filter: \d+', 'Rows Removed by Filter: N');
+ perform regexp_matches(ln, 'Primitive Index Scans: \d+');
+ if found then
+ continue;
+ end if;
return next ln;
end loop;
end;
@@ -2692,12 +2696,13 @@ select * from ab where a = (select max(a) from lprt_a) and b = (select max(a)-1
Filter: (b = (InitPlan 2).col1)
-> Bitmap Index Scan on ab_a3_b2_a_idx (actual rows=0 loops=1)
Index Cond: (a = (InitPlan 1).col1)
+ Primitive Index Scans: 1
-> Bitmap Heap Scan on ab_a3_b3 ab_9 (never executed)
Recheck Cond: (a = (InitPlan 1).col1)
Filter: (b = (InitPlan 2).col1)
-> Bitmap Index Scan on ab_a3_b3_a_idx (never executed)
Index Cond: (a = (InitPlan 1).col1)
-(52 rows)
+(53 rows)
-- Test run-time partition pruning with UNION ALL parents
explain (analyze, costs off, summary off, timing off)
@@ -2713,6 +2718,7 @@ select * from (select * from ab where a = 1 union all select * from ab) ab where
Filter: (b = (InitPlan 1).col1)
-> Bitmap Index Scan on ab_a1_b1_a_idx (actual rows=0 loops=1)
Index Cond: (a = 1)
+ Primitive Index Scans: 1
-> Bitmap Heap Scan on ab_a1_b2 ab_12 (never executed)
Recheck Cond: (a = 1)
Filter: (b = (InitPlan 1).col1)
@@ -2741,7 +2747,7 @@ select * from (select * from ab where a = 1 union all select * from ab) ab where
Filter: (b = (InitPlan 1).col1)
-> Seq Scan on ab_a3_b3 ab_9 (never executed)
Filter: (b = (InitPlan 1).col1)
-(37 rows)
+(38 rows)
-- A case containing a UNION ALL with a non-partitioned child.
explain (analyze, costs off, summary off, timing off)
@@ -2757,6 +2763,7 @@ select * from (select * from ab where a = 1 union all (values(10,5)) union all s
Filter: (b = (InitPlan 1).col1)
-> Bitmap Index Scan on ab_a1_b1_a_idx (actual rows=0 loops=1)
Index Cond: (a = 1)
+ Primitive Index Scans: 1
-> Bitmap Heap Scan on ab_a1_b2 ab_12 (never executed)
Recheck Cond: (a = 1)
Filter: (b = (InitPlan 1).col1)
@@ -2787,7 +2794,7 @@ select * from (select * from ab where a = 1 union all (values(10,5)) union all s
Filter: (b = (InitPlan 1).col1)
-> Seq Scan on ab_a3_b3 ab_9 (never executed)
Filter: (b = (InitPlan 1).col1)
-(39 rows)
+(40 rows)
-- Another UNION ALL test, but containing a mix of exec init and exec run-time pruning.
create table xy_1 (x int, y int);
@@ -2858,16 +2865,19 @@ update ab_a1 set b = 3 from ab where ab.a = 1 and ab.a = ab_a1.a;');
Recheck Cond: (a = 1)
-> Bitmap Index Scan on ab_a1_b1_a_idx (actual rows=0 loops=1)
Index Cond: (a = 1)
+ Primitive Index Scans: 1
-> Bitmap Heap Scan on ab_a1_b2 ab_a1_2 (actual rows=1 loops=1)
Recheck Cond: (a = 1)
Heap Blocks: exact=1
-> Bitmap Index Scan on ab_a1_b2_a_idx (actual rows=1 loops=1)
Index Cond: (a = 1)
+ Primitive Index Scans: 1
-> Bitmap Heap Scan on ab_a1_b3 ab_a1_3 (actual rows=0 loops=1)
Recheck Cond: (a = 1)
Heap Blocks: exact=1
-> Bitmap Index Scan on ab_a1_b3_a_idx (actual rows=1 loops=1)
Index Cond: (a = 1)
+ Primitive Index Scans: 1
-> Materialize (actual rows=1 loops=1)
Storage: Memory Maximum Storage: NkB
-> Append (actual rows=1 loops=1)
@@ -2875,17 +2885,20 @@ update ab_a1 set b = 3 from ab where ab.a = 1 and ab.a = ab_a1.a;');
Recheck Cond: (a = 1)
-> Bitmap Index Scan on ab_a1_b1_a_idx (actual rows=0 loops=1)
Index Cond: (a = 1)
+ Primitive Index Scans: 1
-> Bitmap Heap Scan on ab_a1_b2 ab_2 (actual rows=1 loops=1)
Recheck Cond: (a = 1)
Heap Blocks: exact=1
-> Bitmap Index Scan on ab_a1_b2_a_idx (actual rows=1 loops=1)
Index Cond: (a = 1)
+ Primitive Index Scans: 1
-> Bitmap Heap Scan on ab_a1_b3 ab_3 (actual rows=0 loops=1)
Recheck Cond: (a = 1)
Heap Blocks: exact=1
-> Bitmap Index Scan on ab_a1_b3_a_idx (actual rows=1 loops=1)
Index Cond: (a = 1)
-(37 rows)
+ Primitive Index Scans: 1
+(43 rows)
table ab;
a | b
@@ -2961,8 +2974,10 @@ select * from tbl1 join tprt on tbl1.col1 > tprt.col1;
-> Append (actual rows=3 loops=2)
-> Index Scan using tprt1_idx on tprt_1 (actual rows=2 loops=2)
Index Cond: (col1 < tbl1.col1)
+ Primitive Index Scans: 2
-> Index Scan using tprt2_idx on tprt_2 (actual rows=2 loops=1)
Index Cond: (col1 < tbl1.col1)
+ Primitive Index Scans: 1
-> Index Scan using tprt3_idx on tprt_3 (never executed)
Index Cond: (col1 < tbl1.col1)
-> Index Scan using tprt4_idx on tprt_4 (never executed)
@@ -2971,7 +2986,7 @@ select * from tbl1 join tprt on tbl1.col1 > tprt.col1;
Index Cond: (col1 < tbl1.col1)
-> Index Scan using tprt6_idx on tprt_6 (never executed)
Index Cond: (col1 < tbl1.col1)
-(15 rows)
+(17 rows)
explain (analyze, costs off, summary off, timing off)
select * from tbl1 join tprt on tbl1.col1 = tprt.col1;
@@ -2984,6 +2999,7 @@ select * from tbl1 join tprt on tbl1.col1 = tprt.col1;
Index Cond: (col1 = tbl1.col1)
-> Index Scan using tprt2_idx on tprt_2 (actual rows=1 loops=2)
Index Cond: (col1 = tbl1.col1)
+ Primitive Index Scans: 2
-> Index Scan using tprt3_idx on tprt_3 (never executed)
Index Cond: (col1 = tbl1.col1)
-> Index Scan using tprt4_idx on tprt_4 (never executed)
@@ -2992,7 +3008,7 @@ select * from tbl1 join tprt on tbl1.col1 = tprt.col1;
Index Cond: (col1 = tbl1.col1)
-> Index Scan using tprt6_idx on tprt_6 (never executed)
Index Cond: (col1 = tbl1.col1)
-(15 rows)
+(16 rows)
select tbl1.col1, tprt.col1 from tbl1
inner join tprt on tbl1.col1 > tprt.col1
@@ -3027,17 +3043,20 @@ select * from tbl1 inner join tprt on tbl1.col1 > tprt.col1;
-> Append (actual rows=5 loops=5)
-> Index Scan using tprt1_idx on tprt_1 (actual rows=2 loops=5)
Index Cond: (col1 < tbl1.col1)
+ Primitive Index Scans: 5
-> Index Scan using tprt2_idx on tprt_2 (actual rows=3 loops=4)
Index Cond: (col1 < tbl1.col1)
+ Primitive Index Scans: 4
-> Index Scan using tprt3_idx on tprt_3 (actual rows=1 loops=2)
Index Cond: (col1 < tbl1.col1)
+ Primitive Index Scans: 2
-> Index Scan using tprt4_idx on tprt_4 (never executed)
Index Cond: (col1 < tbl1.col1)
-> Index Scan using tprt5_idx on tprt_5 (never executed)
Index Cond: (col1 < tbl1.col1)
-> Index Scan using tprt6_idx on tprt_6 (never executed)
Index Cond: (col1 < tbl1.col1)
-(15 rows)
+(18 rows)
explain (analyze, costs off, summary off, timing off)
select * from tbl1 inner join tprt on tbl1.col1 = tprt.col1;
@@ -3050,15 +3069,17 @@ select * from tbl1 inner join tprt on tbl1.col1 = tprt.col1;
Index Cond: (col1 = tbl1.col1)
-> Index Scan using tprt2_idx on tprt_2 (actual rows=1 loops=2)
Index Cond: (col1 = tbl1.col1)
+ Primitive Index Scans: 2
-> Index Scan using tprt3_idx on tprt_3 (actual rows=0 loops=3)
Index Cond: (col1 = tbl1.col1)
+ Primitive Index Scans: 3
-> Index Scan using tprt4_idx on tprt_4 (never executed)
Index Cond: (col1 = tbl1.col1)
-> Index Scan using tprt5_idx on tprt_5 (never executed)
Index Cond: (col1 = tbl1.col1)
-> Index Scan using tprt6_idx on tprt_6 (never executed)
Index Cond: (col1 = tbl1.col1)
-(15 rows)
+(17 rows)
select tbl1.col1, tprt.col1 from tbl1
inner join tprt on tbl1.col1 > tprt.col1
@@ -3122,7 +3143,8 @@ select * from tbl1 join tprt on tbl1.col1 < tprt.col1;
Index Cond: (col1 > tbl1.col1)
-> Index Scan using tprt6_idx on tprt_6 (actual rows=1 loops=1)
Index Cond: (col1 > tbl1.col1)
-(15 rows)
+ Primitive Index Scans: 1
+(16 rows)
select tbl1.col1, tprt.col1 from tbl1
inner join tprt on tbl1.col1 < tprt.col1
@@ -3484,10 +3506,12 @@ explain (analyze, costs off, summary off, timing off) execute mt_q1(15);
-> Index Scan using ma_test_p2_b_idx on ma_test_p2 ma_test_1 (actual rows=1 loops=1)
Filter: ((a >= $1) AND ((a % 10) = 5))
Rows Removed by Filter: 9
+ Primitive Index Scans: 1
-> Index Scan using ma_test_p3_b_idx on ma_test_p3 ma_test_2 (actual rows=1 loops=1)
Filter: ((a >= $1) AND ((a % 10) = 5))
Rows Removed by Filter: 9
-(9 rows)
+ Primitive Index Scans: 1
+(11 rows)
execute mt_q1(15);
a
@@ -3505,7 +3529,8 @@ explain (analyze, costs off, summary off, timing off) execute mt_q1(25);
-> Index Scan using ma_test_p3_b_idx on ma_test_p3 ma_test_1 (actual rows=1 loops=1)
Filter: ((a >= $1) AND ((a % 10) = 5))
Rows Removed by Filter: 9
-(6 rows)
+ Primitive Index Scans: 1
+(7 rows)
execute mt_q1(25);
a
@@ -3553,13 +3578,16 @@ explain (analyze, costs off, summary off, timing off) select * from ma_test wher
-> Limit (actual rows=1 loops=1)
-> Index Scan using ma_test_p2_b_idx on ma_test_p2 (actual rows=1 loops=1)
Index Cond: (b IS NOT NULL)
+ Primitive Index Scans: 1
-> Index Scan using ma_test_p1_b_idx on ma_test_p1 ma_test_1 (never executed)
Filter: (a >= (InitPlan 2).col1)
-> Index Scan using ma_test_p2_b_idx on ma_test_p2 ma_test_2 (actual rows=10 loops=1)
Filter: (a >= (InitPlan 2).col1)
+ Primitive Index Scans: 1
-> Index Scan using ma_test_p3_b_idx on ma_test_p3 ma_test_3 (actual rows=10 loops=1)
Filter: (a >= (InitPlan 2).col1)
-(14 rows)
+ Primitive Index Scans: 1
+(17 rows)
reset enable_seqscan;
reset enable_sort;
@@ -4130,13 +4158,16 @@ select * from rangep where b IN((select 1),(select 2)) order by a;
Sort Key: rangep_2.a
-> Index Scan using rangep_0_to_100_1_a_idx on rangep_0_to_100_1 rangep_2 (actual rows=0 loops=1)
Filter: (b = ANY (ARRAY[(InitPlan 1).col1, (InitPlan 2).col1]))
+ Primitive Index Scans: 1
-> Index Scan using rangep_0_to_100_2_a_idx on rangep_0_to_100_2 rangep_3 (actual rows=0 loops=1)
Filter: (b = ANY (ARRAY[(InitPlan 1).col1, (InitPlan 2).col1]))
+ Primitive Index Scans: 1
-> Index Scan using rangep_0_to_100_3_a_idx on rangep_0_to_100_3 rangep_4 (never executed)
Filter: (b = ANY (ARRAY[(InitPlan 1).col1, (InitPlan 2).col1]))
-> Index Scan using rangep_100_to_200_a_idx on rangep_100_to_200 rangep_5 (actual rows=0 loops=1)
Filter: (b = ANY (ARRAY[(InitPlan 1).col1, (InitPlan 2).col1]))
-(15 rows)
+ Primitive Index Scans: 1
+(18 rows)
reset enable_sort;
drop table rangep;
diff --git a/src/test/regress/expected/select.out b/src/test/regress/expected/select.out
index 33a6dceb0..905252092 100644
--- a/src/test/regress/expected/select.out
+++ b/src/test/regress/expected/select.out
@@ -764,7 +764,8 @@ select * from onek2 where unique2 = 11 and stringu1 = 'ATAAAA';
Index Scan using onek2_u2_prtl on onek2 (actual rows=1 loops=1)
Index Cond: (unique2 = 11)
Filter: (stringu1 = 'ATAAAA'::name)
-(3 rows)
+ Primitive Index Scans: 1
+(4 rows)
explain (costs off)
select unique2 from onek2 where unique2 = 11 and stringu1 = 'ATAAAA';
diff --git a/src/test/regress/sql/memoize.sql b/src/test/regress/sql/memoize.sql
index 059bec5f4..303be8f69 100644
--- a/src/test/regress/sql/memoize.sql
+++ b/src/test/regress/sql/memoize.sql
@@ -23,8 +23,10 @@ begin
ln := regexp_replace(ln, 'Evictions: 0', 'Evictions: Zero');
ln := regexp_replace(ln, 'Evictions: \d+', 'Evictions: N');
ln := regexp_replace(ln, 'Memory Usage: \d+', 'Memory Usage: N');
- ln := regexp_replace(ln, 'Heap Fetches: \d+', 'Heap Fetches: N');
- ln := regexp_replace(ln, 'loops=\d+', 'loops=N');
+ ln := regexp_replace(ln, 'Heap Fetches: \d+', 'Heap Fetches: N');
+ ln := regexp_replace(ln, 'loops=\d+', 'loops=N');
+ ln := regexp_replace(ln, 'Primitive Index Scans: 0', 'Primitive Index Scans: Zero');
+ ln := regexp_replace(ln, 'Primitive Index Scans: \d+', 'Primitive Index Scans: N');
return next ln;
end loop;
end;
diff --git a/src/test/regress/sql/partition_prune.sql b/src/test/regress/sql/partition_prune.sql
index 442428d93..1b2c89274 100644
--- a/src/test/regress/sql/partition_prune.sql
+++ b/src/test/regress/sql/partition_prune.sql
@@ -573,6 +573,10 @@ begin
ln := regexp_replace(ln, 'Workers Launched: \d+', 'Workers Launched: N');
ln := regexp_replace(ln, 'actual rows=\d+ loops=\d+', 'actual rows=N loops=N');
ln := regexp_replace(ln, 'Rows Removed by Filter: \d+', 'Rows Removed by Filter: N');
+ perform regexp_matches(ln, 'Primitive Index Scans: \d+');
+ if found then
+ continue;
+ end if;
return next ln;
end loop;
end;
--
2.45.2
On Thu, 15 Aug 2024 at 21:23, Peter Geoghegan <pg@bowt.ie> wrote:
Attached patch has EXPLAIN ANALYZE display the total number of
primitive index scans for all 3 kinds of index scan node. This is
useful for index scans that happen to use SAOP arrays. It also seems
almost essential to offer this kind of instrumentation for the skip
scan patch [1]. Skip scan works by reusing all of the Postgres 17 work
(see commit 5bf748b8) to skip over irrelevant sections of a composite
index with a low cardinality leading column, so it has all the same
issues.
Did you notice the patch over at [0], where additional diagnostic
EXPLAIN output for btrees is being discussed, too? I'm asking, because
I'm not very convinced that 'primitive scans' are a useful metric
across all (or even: most) index AMs (e.g. BRIN probably never will
have a 'primitive scans' metric that differs from the loop count), so
maybe this would better be implemented in that framework?
Kind regards,
Matthias van de Meent
Neon (https://neon.tech)
[0]: /messages/by-id/TYWPR01MB10982D24AFA7CDC273445BFF0B1DC2@TYWPR01MB10982.jpnprd01.prod.outlook.com
Hi! Thank you for your work on this subject!
On 15.08.2024 22:22, Peter Geoghegan wrote:
Attached patch has EXPLAIN ANALYZE display the total number of
primitive index scans for all 3 kinds of index scan node. This is
useful for index scans that happen to use SAOP arrays. It also seems
almost essential to offer this kind of instrumentation for the skip
scan patch [1]. Skip scan works by reusing all of the Postgres 17 work
(see commit 5bf748b8) to skip over irrelevant sections of a composite
index with a low cardinality leading column, so it has all the same
issues.
I think that it is enough to pass the IndexScanDesc parameter to the
function - this saves us from having to define the planstate type twice.
For this reason, I suggest some changes that I think may improve your patch.
To be honest, I don't quite understand how information in explain
analyze about the number of primitive index scans used
will help me improve my database system as a user. Perhaps I'm missing
something.
Maybe it can tell me which columns are best to create an index on or
something like that?
Could you explain it to me, please?
--
Regards,
Alena Rybakina
Postgres Professional: http://www.postgrespro.com
The Russian Postgres Company
Attachments:
diff.no-cfbot (text/plain; charset=UTF-8)
diff --git a/src/backend/commands/explain.c b/src/backend/commands/explain.c
index 39e7234e3ac..11e0ee96a1d 100644
--- a/src/backend/commands/explain.c
+++ b/src/backend/commands/explain.c
@@ -1983,7 +1983,7 @@ ExplainNode(PlanState *planstate, List *ancestors,
show_instrumentation_count("Rows Removed by Filter", 1,
planstate, es);
if (es->analyze)
- show_indexscan_nprimscans(planstate, es);
+ show_indexscan_nprimscans(((IndexScanState *) planstate)->iss_ScanDesc, es);
break;
case T_IndexOnlyScan:
show_scan_qual(((IndexOnlyScan *) plan)->indexqual,
@@ -2001,14 +2001,14 @@ ExplainNode(PlanState *planstate, List *ancestors,
{
ExplainPropertyFloat("Heap Fetches", NULL,
planstate->instrument->ntuples2, 0, es);
- show_indexscan_nprimscans(planstate, es);
+ show_indexscan_nprimscans(((IndexOnlyScanState *) planstate)->iss_ScanDesc, es);
}
break;
case T_BitmapIndexScan:
show_scan_qual(((BitmapIndexScan *) plan)->indexqualorig,
"Index Cond", planstate, ancestors, es);
if (es->analyze)
- show_indexscan_nprimscans(planstate, es);
+ show_indexscan_nprimscans(((BitmapIndexScanState *) planstate)->iss_ScanDesc, es);
break;
case T_BitmapHeapScan:
show_scan_qual(((BitmapHeapScan *) plan)->bitmapqualorig,
@@ -2523,26 +2523,8 @@ show_expression(Node *node, const char *qlabel,
* IndexOnlyScan node, or BitmapIndexScan node
*/
static void
-show_indexscan_nprimscans(PlanState *planstate, ExplainState *es)
+show_indexscan_nprimscans(IndexScanDescData *scanDesc, ExplainState *es)
{
- Plan *plan = planstate->plan;
- struct IndexScanDescData *scanDesc = NULL;
-
- switch (nodeTag(plan))
- {
- case T_IndexScan:
- scanDesc = ((IndexScanState *) planstate)->iss_ScanDesc;
- break;
- case T_IndexOnlyScan:
- scanDesc = ((IndexOnlyScanState *) planstate)->ioss_ScanDesc;
- break;
- case T_BitmapIndexScan:
- scanDesc = ((BitmapIndexScanState *) planstate)->biss_ScanDesc;
- break;
- default:
- break;
- }
-
if (scanDesc && scanDesc->nprimscans > 0)
ExplainPropertyUInteger("Primitive Index Scans", NULL,
scanDesc->nprimscans, es);
On Thu, Aug 15, 2024 at 4:34 PM Matthias van de Meent
<boekewurm+postgres@gmail.com> wrote:
Attached patch has EXPLAIN ANALYZE display the total number of
primitive index scans for all 3 kinds of index scan node. This is
useful for index scans that happen to use SAOP arrays. It also seems
almost essential to offer this kind of instrumentation for the skip
scan patch [1]. Skip scan works by reusing all of the Postgres 17 work
(see commit 5bf748b8) to skip over irrelevant sections of a composite
index with a low cardinality leading column, so it has all the same
issues.
Did you notice the patch over at [0], where additional diagnostic
EXPLAIN output for btrees is being discussed, too?
To be clear, for those that haven't been paying attention to that
other thread: that other EXPLAIN patch (the one authored by Masahiro
Ikeda) surfaces information about a distinction that the skip scan
patch renders obsolete. That is, the skip scan patch makes all "Non
Key Filter" quals into quals that can relocate the scan to a later
leaf page by starting a new primitive index scan. Technically, skip
scan removes the concept that that patch calls "Non Key Filter"
altogether.
Note that this isn't the same thing as making that other patch
obsolete. Skip scan renders the whole concept of "Non Key Filter"
obsolete *in name only*. You might prefer to think of it as making
that whole concept squishy. Just because we can theoretically use the
leading column to skip doesn't mean we actually will. It isn't an
either/or thing. We might skip during some parts of a scan, but not
during other parts.
It's just not clear how to handle those sorts of fuzzy distinctions
right now. It does seem worth pursuing, but I see no conflict.
I'm asking, because
I'm not very convinced that 'primitive scans' are a useful metric
across all (or even: most) index AMs (e.g. BRIN probably never will
have a 'primitive scans' metric that differs from the loop count), so
maybe this would better be implemented in that framework?
What do you mean by "within that framework"? They seem orthogonal?
It's true that BRIN index scans will probably never show more than a
single primitive index scan. I don't think that the same is true of
any other index AM, though. Don't they all support SAOPs, albeit
non-natively?
The important question is: what do you want to do about cases like the
BRIN case? Our choices are all fairly obvious choices. We can be
selective, and *not* show this information when a set of heuristics
indicate that it's not relevant. This is fairly straightforward to
implement. Which do you prefer: overall consistency, or less
verbosity?
Personally I think that the consistency argument works in favor of
displaying this information for every kind of index scan. That's a
hopelessly subjective position, though.
--
Peter Geoghegan
On Thu, Aug 15, 2024 at 4:58 PM Alena Rybakina
<a.rybakina@postgrespro.ru> wrote:
I think that it is enough to pass the IndexScanDesc parameter to the function - this saves us from having to define the planstate type twice.
For this reason, I suggest some changes that I think may improve your patch.
Perhaps it's a little better that way. I'll consider it.
To be honest, I don't quite understand how information in explain analyze about the number of primitive index scans used
will help me improve my database system as a user. Perhaps I'm missing something.
There is probably no typical case. The patch shows implementation
details, which need to be interpreted in the context of a particular
problem.
Maybe the problem is that some of the heuristics added by one of my
nbtree patches interact relatively badly with some real world query.
It would be presumptuous of me to say that that will never happen.
Maybe it can tell me which columns are best to create an index on or something like that?
That's definitely going to be important in the case of skip scan.
Simply showing the user that the index scan skips at all will make
them aware that there are missing index columns. That could be a sign
that they'd be better off not using skip scan at all, by creating a
new index that suits the particular query (by not having the extra
skipped column).
It's almost always possible to beat skip scan by creating a new index
-- whether or not it's worth the trouble/expense of maintaining a
whole new index is the important question. Is this particular query
the most important query *to the business*, for whatever reason? Or is
having merely adequate performance acceptable?
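To make that trade-off concrete, here's a rough sketch -- the table,
index, and value are all made up for illustration, and none of this
comes from the patch itself:
create table events (region int, event_id int);
create index events_region_event_idx on events (region, event_id);
-- with only the composite index available, a qual on event_id alone has to
-- rely on skip scan, which we'd expect to search once per distinct region
explain (analyze, costs off, summary off)
select * from events where event_id = 12345;
-- a purpose-built single-column index turns that into one search, at the
-- ongoing cost of maintaining a whole extra index
create index events_event_idx on events (event_id);
The search count is what tells you which side of that trade-off a
given query actually lands on.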
Your OR-to-SAOP-rewrite patch effectively makes two or more bitmap
index scans into one single continuous index scan. Or...does it? The
true number of (primitive) index scans might be "the same" as it was
before (without your patch), or there might really only be one
(primitive) index scan with your patch. Or it might be anywhere in
between those two extremes. Users will benefit from knowing where on
this continuum a particular index scan falls. It's just useful to know
where time is spent.
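As a rough illustration (using the regression-test table tenk1; the
exact plan shapes depend on the planner, and on whether the
OR-to-SAOP rewrite kicks in):
explain (analyze, costs off, summary off)
select * from tenk1 where thousand = 42 or thousand = 43;
-- versus the equivalent array form, which is one index scan node:
explain (analyze, costs off, summary off)
select * from tenk1 where thousand = any ('{42,43}'::int[]);
-- with the patch applied, the count shown for the array-form node tells you
-- whether it really descended the index once, or once per array element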
Knowing this information might even allow the user to create a new
multicolumn index, with columns in an order better suited to an
affected query. It's not so much the cost of descending the index
multiple times that we need to worry about here, even though that's
what we're talking about counting here. Varying index column order
could make an index scan faster by increasing locality. Locality is
usually very important. Fewer index scans are a good proxy for greater
locality.
It's easiest to understand what I mean about locality with an example.
An index on (a, b) is good for queries with quals such as "where a =
42 and b in (1,2,3,4,5,6,7,8,9)" if it allows such a query to only
access one or two leaf pages, where all of the "b" values of interest
live side by side. Obviously that won't be true if it's the other way
around -- if the typical qual looks more like "where b = 7 and a in
(1,2,3,4,5,6,7,8,9)". This is the difference between 1 primitive
index scan and 9 primitive index scans -- quite a big difference. Note
that the main cost we need to worry about here *isn't* the cost of
descending the index. It's mostly the cost of reading the leaf pages.
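To reproduce the shape of that comparison, here's a rough sketch with
a made-up table (names and data distribution are illustrative only;
the exact counts and buffer numbers will vary):
create table locality_demo (a int, b int, filler text);
insert into locality_demo
  select (random() * 99)::int, (random() * 9)::int, repeat('x', 100)
  from generate_series(1, 100000);
create index locality_demo_a_b_idx on locality_demo (a, b);
vacuum analyze locality_demo;
-- array on the low-order column: the matching entries sit side by side in
-- the index, so expect a single descent and few leaf page reads
explain (analyze, buffers, costs off, summary off)
select * from locality_demo where a = 42 and b in (1,2,3,4,5,6,7,8,9);
-- array on the leading column: each element lands in a different part of
-- the index, so expect up to nine descents and many more buffer hits
explain (analyze, buffers, costs off, summary off)
select * from locality_demo where b = 7 and a in (1,2,3,4,5,6,7,8,9);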
--
Peter Geoghegan
On Thu, 15 Aug 2024 at 23:10, Peter Geoghegan <pg@bowt.ie> wrote:
On Thu, Aug 15, 2024 at 4:34 PM Matthias van de Meent
<boekewurm+postgres@gmail.com> wrote:
Attached patch has EXPLAIN ANALYZE display the total number of
primitive index scans for all 3 kinds of index scan node. This is
useful for index scans that happen to use SAOP arrays. It also seems
almost essential to offer this kind of instrumentation for the skip
scan patch [1]. Skip scan works by reusing all of the Postgres 17 work
(see commit 5bf748b8) to skip over irrelevant sections of a composite
index with a low cardinality leading column, so it has all the same
issues.
Did you notice the patch over at [0], where additional diagnostic
EXPLAIN output for btrees is being discussed, too?
To be clear, for those that haven't been paying attention to that
other thread: that other EXPLAIN patch (the one authored by Masahiro
Ikeda) surfaces information about a distinction that the skip scan
patch renders obsolete. That is, the skip scan patch makes all "Non
Key Filter" quals into quals that can relocate the scan to a later
leaf page by starting a new primitive index scan. Technically, skip
scan removes the concept that that patch calls "Non Key Filter"
altogether.
Note that this isn't the same thing as making that other patch
obsolete. Skip scan renders the whole concept of "Non Key Filter"
obsolete *in name only*. You might prefer to think of it as making
that whole concept squishy. Just because we can theoretically use the
leading column to skip doesn't mean we actually will. It isn't an
either/or thing. We might skip during some parts of a scan, but not
during other parts.
Yes.
It's just not clear how to handle those sorts of fuzzy distinctions
right now. It does seem worth pursuing, but I see no conflict.
I'm asking, because
I'm not very convinced that 'primitive scans' are a useful metric
across all (or even: most) index AMs (e.g. BRIN probably never will
have a 'primitive scans' metric that differs from the loop count), so
maybe this would better be implemented in that framework?
What do you mean by "within that framework"? They seem orthogonal?
What I meant was putting this 'primitive scans' info into the
AM-specific explain callback as seen in the latest patch version.
It's true that BRIN index scans will probably never show more than a
single primitive index scan. I don't think that the same is true of
any other index AM, though. Don't they all support SAOPs, albeit
non-natively?
Not always. For Bitmap Index Scan the node's functions can allow
non-native SAOP support (it ORs the bitmaps), but normal indexes
without SAOP support won't get SAOP-functionality from the IS/IOS
node's infrastructure; it'll need to be added as a Filter.
The important question is: what do you want to do about cases like the
BRIN case? Our choices are all fairly obvious choices. We can be
selective, and *not* show this information when a set of heuristics
indicate that it's not relevant. This is fairly straightforward to
implement. Which do you prefer: overall consistency, or less
verbosity?
Consistency, I suppose. But adding explain attributes left and right
in Index Scan's explain output when and where every index type needs
them doesn't scale, so I'd put index-specific output into its own
system (see the linked thread for more rationale). And, in this case,
the use case seems quite index-specific, at least for IS/IOS nodes.
Personally I think that the consistency argument works in favor of
displaying this information for every kind of index scan.
Agreed, assuming "this information" is indeed shared (and useful)
across all AMs.
This made me notice that you add a new metric that should generally be
exactly the same as pg_stat_all_indexes.idx_scan (you mention the
same). Can't you pull that data, instead of inventing a new place
every AM needs to touch for its metrics?
Kind regards,
Matthias van de Meent
Neon (https://neon.tech)
On Thu, Aug 15, 2024 at 5:47 PM Matthias van de Meent
<boekewurm+postgres@gmail.com> wrote:
I'm asking, because
I'm not very convinced that 'primitive scans' are a useful metric
across all (or even: most) index AMs (e.g. BRIN probably never will
have a 'primitive scans' metric that differs from the loop count), so
maybe this would better be implemented in that framework?
What do you mean by "within that framework"? They seem orthogonal?
What I meant was putting this 'primitive scans' info into the
AM-specific explain callback as seen in the latest patch version.
I don't see how that could work. This is fundamentally information
that is only known when the query has fully finished execution.
Again, this is already something that we track at the whole-table
level, within pg_stat_user_tables.idx_scan. It's already considered
index AM agnostic information, in that sense.
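A rough sketch of what I mean, using the regression-test table tenk1
(the absolute counter value depends on stats flushing and on whatever
else has touched the index):
select idx_scan from pg_stat_all_indexes where indexrelname = 'tenk1_unique1';
explain (analyze, costs off, summary off)
select * from tenk1 where unique1 in (1, 3000, 6000, 9000);
-- once the backend's pending statistics have been reported, idx_scan should
-- have advanced by the same amount that the patched EXPLAIN output shows
-- for the index scan node
select idx_scan from pg_stat_all_indexes where indexrelname = 'tenk1_unique1';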
It's true that BRIN index scans will probably never show more than a
single primitive index scan. I don't think that the same is true of
any other index AM, though. Don't they all support SAOPs, albeit
non-natively?
Not always. For Bitmap Index Scan the node's functions can allow
non-native SAOP support (it ORs the bitmaps), but normal indexes
without SAOP support won't get SAOP-functionality from the IS/IOS
node's infrastructure; it'll need to be added as a Filter.
Again, what do you want me to do about it? Almost anything is possible
in principle, and can be implemented without great difficulty. But you
have to clearly say what you want, and why you want it.
Yeah, non-native SAOP index scans are always bitmap scans. In the case
of GIN, there are only lossy/bitmap index scans, anyway -- can't see
that ever changing. In the case of GiST, we could in the future add
native SAOP support, so do we really want to be inconsistent in what
we show now? (Tom said something about that recently, in fact.)
I don't hate the idea of selectively not showing this information (for
BRIN, say). Just as I don't hate the idea of totally omitting
"loops=1" in the common case where we couldn't possibly be more than
one loop in practice. It's just that I don't think that it's worth it,
on balance. Not all redundancy is bad.
The important question is: what do you want to do about cases like the
BRIN case? Our choices are all fairly obvious choices. We can be
selective, and *not* show this information when a set of heuristics
indicate that it's not relevant. This is fairly straightforward to
implement. Which do you prefer: overall consistency, or less
verbosity?
Consistency, I suppose. But adding explain attributes left and right
in Index Scan's explain output when and where every index type needs
them doesn't scale, so I'd put index-specific output into its own
system (see the linked thread for more rationale).
I can't argue with that. I just don't think it's directly relevant.
And, in this case,
the use case seems quite index-specific, at least for IS/IOS nodes.
I disagree. It's an existing concept, exposed in system views, and now
in EXPLAIN ANALYZE. It's precisely that -- nothing more, nothing less.
The fact that it tends to be much more useful in the case of nbtree
(at least for now) makes this no less true.
This made me notice that you add a new metric that should generally be
exactly the same as pg_stat_all_indexes.idx_scan (you mention the
same).
I didn't imagine that that part was subtle.
Can't you pull that data, instead of inventing a new place
every AM needs to touch for its metrics?
No. At least not in a way that's scoped to a particular index scan.
--
Peter Geoghegan
On Thu, Aug 15, 2024 at 3:22 PM Peter Geoghegan <pg@bowt.ie> wrote:
Attached patch has EXPLAIN ANALYZE display the total number of
primitive index scans for all 3 kinds of index scan node.
Attached is v2, which fixes bitrot.
v2 also uses new terminology. EXPLAIN ANALYZE will now show "Index
Searches: N", not "Primitive Index Scans: N". Although there is
limited precedent for using the primitive scan terminology, I think
that it's a bit unwieldy.
No other notable changes.
--
Peter Geoghegan
Attachments:
v2-0001-Show-index-search-count-in-EXPLAIN-ANALYZE.patch (application/octet-stream)
From 919fe73e5ef53cc4a8dd1afdd138c9280d9234ac Mon Sep 17 00:00:00 2001
From: Peter Geoghegan <pg@bowt.ie>
Date: Wed, 14 Aug 2024 13:50:23 -0400
Subject: [PATCH v2] Show index search count in EXPLAIN ANALYZE.
Also stop counting the case where nbtree detects contradictory quals as
a distinct index search (do so neither in EXPLAIN ANALYZE nor in the
pg_stat_*_indexes.idx_scan stats).
Author: Peter Geoghegan <pg@bowt.ie>
Discussion: https://postgr.es/m/CAH2-WzkRqvaqR2CTNqTZP0z6FuL4-3ED6eQB0yx38XBNj1v-4Q@mail.gmail.com
---
src/include/access/relscan.h | 3 +
src/backend/access/brin/brin.c | 1 +
src/backend/access/gin/ginscan.c | 1 +
src/backend/access/gist/gistget.c | 2 +
src/backend/access/hash/hashsearch.c | 1 +
src/backend/access/index/genam.c | 1 +
src/backend/access/nbtree/nbtree.c | 11 ++++
src/backend/access/nbtree/nbtsearch.c | 9 ++-
src/backend/access/spgist/spgscan.c | 1 +
src/backend/commands/explain.c | 39 +++++++++++++
doc/src/sgml/bloom.sgml | 2 +
doc/src/sgml/monitoring.sgml | 12 +++-
doc/src/sgml/perform.sgml | 8 +++
doc/src/sgml/ref/explain.sgml | 1 +
doc/src/sgml/rules.sgml | 1 +
src/test/regress/expected/brin_multi.out | 27 ++++++---
src/test/regress/expected/memoize.out | 50 +++++++++++-----
src/test/regress/expected/partition_prune.out | 57 ++++++++++++++-----
src/test/regress/expected/select.out | 3 +-
src/test/regress/sql/memoize.sql | 6 +-
src/test/regress/sql/partition_prune.sql | 4 ++
21 files changed, 196 insertions(+), 44 deletions(-)
diff --git a/src/include/access/relscan.h b/src/include/access/relscan.h
index 521043304..b992d4080 100644
--- a/src/include/access/relscan.h
+++ b/src/include/access/relscan.h
@@ -130,6 +130,9 @@ typedef struct IndexScanDescData
bool xactStartedInRecovery; /* prevents killing/seeing killed
* tuples */
+ /* index access method instrumentation output state */
+ uint64 nsearches; /* # of index searches */
+
/* index access method's private state */
void *opaque; /* access-method-specific info */
diff --git a/src/backend/access/brin/brin.c b/src/backend/access/brin/brin.c
index 6467bed60..749d8b845 100644
--- a/src/backend/access/brin/brin.c
+++ b/src/backend/access/brin/brin.c
@@ -581,6 +581,7 @@ bringetbitmap(IndexScanDesc scan, TIDBitmap *tbm)
opaque = (BrinOpaque *) scan->opaque;
bdesc = opaque->bo_bdesc;
pgstat_count_index_scan(idxRel);
+ scan->nsearches++;
/*
* We need to know the size of the table so that we know how long to
diff --git a/src/backend/access/gin/ginscan.c b/src/backend/access/gin/ginscan.c
index af24d3854..594478116 100644
--- a/src/backend/access/gin/ginscan.c
+++ b/src/backend/access/gin/ginscan.c
@@ -436,6 +436,7 @@ ginNewScanKey(IndexScanDesc scan)
MemoryContextSwitchTo(oldCtx);
pgstat_count_index_scan(scan->indexRelation);
+ scan->nsearches++;
}
void
diff --git a/src/backend/access/gist/gistget.c b/src/backend/access/gist/gistget.c
index b35b8a975..36f1435cb 100644
--- a/src/backend/access/gist/gistget.c
+++ b/src/backend/access/gist/gistget.c
@@ -625,6 +625,7 @@ gistgettuple(IndexScanDesc scan, ScanDirection dir)
GISTSearchItem fakeItem;
pgstat_count_index_scan(scan->indexRelation);
+ scan->nsearches++;
so->firstCall = false;
so->curPageData = so->nPageData = 0;
@@ -750,6 +751,7 @@ gistgetbitmap(IndexScanDesc scan, TIDBitmap *tbm)
return 0;
pgstat_count_index_scan(scan->indexRelation);
+ scan->nsearches++;
/* Begin the scan by processing the root page */
so->curPageData = so->nPageData = 0;
diff --git a/src/backend/access/hash/hashsearch.c b/src/backend/access/hash/hashsearch.c
index 0d99d6abc..927ba1039 100644
--- a/src/backend/access/hash/hashsearch.c
+++ b/src/backend/access/hash/hashsearch.c
@@ -298,6 +298,7 @@ _hash_first(IndexScanDesc scan, ScanDirection dir)
HashScanPosItem *currItem;
pgstat_count_index_scan(rel);
+ scan->nsearches++;
/*
* We do not support hash scans with no index qualification, because we
diff --git a/src/backend/access/index/genam.c b/src/backend/access/index/genam.c
index 43c95d610..5f4544724 100644
--- a/src/backend/access/index/genam.c
+++ b/src/backend/access/index/genam.c
@@ -116,6 +116,7 @@ RelationGetIndexScan(Relation indexRelation, int nkeys, int norderbys)
scan->xactStartedInRecovery = TransactionStartedDuringRecovery();
scan->ignore_killed_tuples = !scan->xactStartedInRecovery;
+ scan->nsearches = 0; /* not reset by index_rescan */
scan->opaque = NULL;
scan->xs_itup = NULL;
diff --git a/src/backend/access/nbtree/nbtree.c b/src/backend/access/nbtree/nbtree.c
index 686a3206f..c1cc757c9 100644
--- a/src/backend/access/nbtree/nbtree.c
+++ b/src/backend/access/nbtree/nbtree.c
@@ -70,6 +70,7 @@ typedef struct BTParallelScanDescData
BTPS_State btps_pageStatus; /* indicates whether next page is
* available for scan. see above for
* possible states of parallel scan. */
+ uint64 btps_nsearches; /* instrumentation */
slock_t btps_mutex; /* protects above variables, btps_arrElems */
ConditionVariable btps_cv; /* used to synchronize parallel scan */
@@ -551,6 +552,7 @@ btinitparallelscan(void *target)
SpinLockInit(&bt_target->btps_mutex);
bt_target->btps_scanPage = InvalidBlockNumber;
bt_target->btps_pageStatus = BTPARALLEL_NOT_INITIALIZED;
+ bt_target->btps_nsearches = 0;
ConditionVariableInit(&bt_target->btps_cv);
}
@@ -576,6 +578,7 @@ btparallelrescan(IndexScanDesc scan)
SpinLockAcquire(&btscan->btps_mutex);
btscan->btps_scanPage = InvalidBlockNumber;
btscan->btps_pageStatus = BTPARALLEL_NOT_INITIALIZED;
+ /* deliberately don't reset btps_nsearches (matches index_rescan) */
SpinLockRelease(&btscan->btps_mutex);
}
@@ -680,6 +683,11 @@ _bt_parallel_seize(IndexScanDesc scan, BlockNumber *pageno, bool first)
* We have successfully seized control of the scan for the purpose
* of advancing it to a new page!
*/
+ if (first && btscan->btps_pageStatus == BTPARALLEL_NOT_INITIALIZED)
+ {
+ /* count the first primitive scan for this btrescan */
+ btscan->btps_nsearches++;
+ }
btscan->btps_pageStatus = BTPARALLEL_ADVANCING;
*pageno = btscan->btps_scanPage;
exit_loop = true;
@@ -752,6 +760,8 @@ _bt_parallel_done(IndexScanDesc scan)
btscan->btps_pageStatus = BTPARALLEL_DONE;
status_changed = true;
}
+ /* Copy the authoritative shared primitive scan counter to local field */
+ scan->nsearches = btscan->btps_nsearches;
SpinLockRelease(&btscan->btps_mutex);
/* wake up all the workers associated with this parallel scan */
@@ -785,6 +795,7 @@ _bt_parallel_primscan_schedule(IndexScanDesc scan, BlockNumber prev_scan_page)
{
btscan->btps_scanPage = InvalidBlockNumber;
btscan->btps_pageStatus = BTPARALLEL_NEED_PRIMSCAN;
+ btscan->btps_nsearches++;
/* Serialize scan's current array keys */
for (int i = 0; i < so->numArrayKeys; i++)
diff --git a/src/backend/access/nbtree/nbtsearch.c b/src/backend/access/nbtree/nbtsearch.c
index 2551df8a6..4b91a192e 100644
--- a/src/backend/access/nbtree/nbtsearch.c
+++ b/src/backend/access/nbtree/nbtsearch.c
@@ -896,8 +896,6 @@ _bt_first(IndexScanDesc scan, ScanDirection dir)
Assert(!BTScanPosIsValid(so->currPos));
- pgstat_count_index_scan(rel);
-
/*
* Examine the scan keys and eliminate any redundant keys; also mark the
* keys that must be matched to continue the scan.
@@ -960,6 +958,13 @@ _bt_first(IndexScanDesc scan, ScanDirection dir)
_bt_start_array_keys(scan, dir);
}
+ /*
+ * We've established that we'll either call _bt_search or _bt_endpoint.
+ * Count this as a primitive index scan/index search.
+ */
+ pgstat_count_index_scan(rel);
+ scan->nsearches++;
+
/*----------
* Examine the scan keys to discover where we need to start the scan.
*
diff --git a/src/backend/access/spgist/spgscan.c b/src/backend/access/spgist/spgscan.c
index 03293a781..9138fc03a 100644
--- a/src/backend/access/spgist/spgscan.c
+++ b/src/backend/access/spgist/spgscan.c
@@ -423,6 +423,7 @@ spgrescan(IndexScanDesc scan, ScanKey scankey, int nscankeys,
/* count an indexscan for stats */
pgstat_count_index_scan(scan->indexRelation);
+ scan->nsearches++;
}
void
diff --git a/src/backend/commands/explain.c b/src/backend/commands/explain.c
index 11df4a04d..d40e1dea3 100644
--- a/src/backend/commands/explain.c
+++ b/src/backend/commands/explain.c
@@ -13,6 +13,7 @@
*/
#include "postgres.h"
+#include "access/relscan.h"
#include "access/xact.h"
#include "catalog/pg_type.h"
#include "commands/createas.h"
@@ -89,6 +90,7 @@ static void show_plan_tlist(PlanState *planstate, List *ancestors,
static void show_expression(Node *node, const char *qlabel,
PlanState *planstate, List *ancestors,
bool useprefix, ExplainState *es);
+static void show_indexscan_nsearches(PlanState *planstate, ExplainState *es);
static void show_qual(List *qual, const char *qlabel,
PlanState *planstate, List *ancestors,
bool useprefix, ExplainState *es);
@@ -1984,6 +1986,8 @@ ExplainNode(PlanState *planstate, List *ancestors,
if (plan->qual)
show_instrumentation_count("Rows Removed by Filter", 1,
planstate, es);
+ if (es->analyze)
+ show_indexscan_nsearches(planstate, es);
break;
case T_IndexOnlyScan:
show_scan_qual(((IndexOnlyScan *) plan)->indexqual,
@@ -1998,12 +2002,17 @@ ExplainNode(PlanState *planstate, List *ancestors,
show_instrumentation_count("Rows Removed by Filter", 1,
planstate, es);
if (es->analyze)
+ {
ExplainPropertyFloat("Heap Fetches", NULL,
planstate->instrument->ntuples2, 0, es);
+ show_indexscan_nsearches(planstate, es);
+ }
break;
case T_BitmapIndexScan:
show_scan_qual(((BitmapIndexScan *) plan)->indexqualorig,
"Index Cond", planstate, ancestors, es);
+ if (es->analyze)
+ show_indexscan_nsearches(planstate, es);
break;
case T_BitmapHeapScan:
show_scan_qual(((BitmapHeapScan *) plan)->bitmapqualorig,
@@ -2513,6 +2522,36 @@ show_expression(Node *node, const char *qlabel,
ExplainPropertyText(qlabel, exprstr, es);
}
+/*
+ * Show the number of index searches within an IndexScan node, IndexOnlyScan
+ * node, or BitmapIndexScan node
+ */
+static void
+show_indexscan_nsearches(PlanState *planstate, ExplainState *es)
+{
+ Plan *plan = planstate->plan;
+ struct IndexScanDescData *scanDesc = NULL;
+
+ switch (nodeTag(plan))
+ {
+ case T_IndexScan:
+ scanDesc = ((IndexScanState *) planstate)->iss_ScanDesc;
+ break;
+ case T_IndexOnlyScan:
+ scanDesc = ((IndexOnlyScanState *) planstate)->ioss_ScanDesc;
+ break;
+ case T_BitmapIndexScan:
+ scanDesc = ((BitmapIndexScanState *) planstate)->biss_ScanDesc;
+ break;
+ default:
+ break;
+ }
+
+ if (scanDesc && scanDesc->nsearches > 0)
+ ExplainPropertyUInteger("Index Searches", NULL,
+ scanDesc->nsearches, es);
+}
+
/*
* Show a qualifier expression (which is a List with implicit AND semantics)
*/
diff --git a/doc/src/sgml/bloom.sgml b/doc/src/sgml/bloom.sgml
index 19f2b172c..8744020eb 100644
--- a/doc/src/sgml/bloom.sgml
+++ b/doc/src/sgml/bloom.sgml
@@ -170,6 +170,7 @@ CREATE INDEX
Heap Blocks: exact=28
-> Bitmap Index Scan on bloomidx (cost=0.00..1792.00 rows=2 width=0) (actual time=0.356..0.356 rows=29 loops=1)
Index Cond: ((i2 = 898732) AND (i5 = 123451))
+ Index Searches: 1
Planning Time: 0.099 ms
Execution Time: 0.408 ms
(8 rows)
@@ -202,6 +203,7 @@ CREATE INDEX
-> BitmapAnd (cost=24.34..24.34 rows=2 width=0) (actual time=0.027..0.027 rows=0 loops=1)
-> Bitmap Index Scan on btreeidx5 (cost=0.00..12.04 rows=500 width=0) (actual time=0.026..0.026 rows=0 loops=1)
Index Cond: (i5 = 123451)
+ Index Searches: 1
-> Bitmap Index Scan on btreeidx2 (cost=0.00..12.04 rows=500 width=0) (never executed)
Index Cond: (i2 = 898732)
Planning Time: 0.491 ms
diff --git a/doc/src/sgml/monitoring.sgml b/doc/src/sgml/monitoring.sgml
index 55417a6fa..487851994 100644
--- a/doc/src/sgml/monitoring.sgml
+++ b/doc/src/sgml/monitoring.sgml
@@ -4077,12 +4077,18 @@ description | Waiting for a newly initialized WAL file to reach durable storage
Queries that use certain <acronym>SQL</acronym> constructs to search for
rows matching any value out of a list or array of multiple scalar values
(see <xref linkend="functions-comparisons"/>) perform multiple
- <quote>primitive</quote> index scans (up to one primitive scan per scalar
- value) during query execution. Each internal primitive index scan
- increments <structname>pg_stat_all_indexes</structname>.<structfield>idx_scan</structfield>,
+ index searches (up to one index search per scalar value) during query
+ execution. Each internal index search increments
+ <structname>pg_stat_all_indexes</structname>.<structfield>idx_scan</structfield>,
so it's possible for the count of index scans to significantly exceed the
total number of index scan executor node executions.
</para>
+ <para>
+ <command>EXPLAIN ANALYZE</command> breaks down the total number of index
+ searches performed by each index scan node. <literal>Index Searches: N</literal>
+ indicates the total number of searches across <emphasis>all</emphasis>
+ executor node executions/loops.
+ </para>
</note>
</sect2>
diff --git a/doc/src/sgml/perform.sgml b/doc/src/sgml/perform.sgml
index ff689b652..bd8d9c2ce 100644
--- a/doc/src/sgml/perform.sgml
+++ b/doc/src/sgml/perform.sgml
@@ -702,8 +702,10 @@ WHERE t1.unique1 < 10 AND t1.unique2 = t2.unique2;
Heap Blocks: exact=10
-> Bitmap Index Scan on tenk1_unique1 (cost=0.00..4.36 rows=10 width=0) (actual time=0.004..0.004 rows=10 loops=1)
Index Cond: (unique1 < 10)
+ Index Searches: 1
-> Index Scan using tenk2_unique2 on tenk2 t2 (cost=0.29..7.90 rows=1 width=244) (actual time=0.003..0.003 rows=1 loops=10)
Index Cond: (unique2 = t1.unique2)
+ Index Searches: 1
Planning Time: 0.485 ms
Execution Time: 0.073 ms
</screen>
@@ -754,6 +756,7 @@ WHERE t1.unique1 < 100 AND t1.unique2 = t2.unique2 ORDER BY t1.fivethous;
Heap Blocks: exact=90
-> Bitmap Index Scan on tenk1_unique1 (cost=0.00..5.04 rows=100 width=0) (actual time=0.013..0.013 rows=100 loops=1)
Index Cond: (unique1 < 100)
+ Index Searches: 1
Planning Time: 0.187 ms
Execution Time: 3.036 ms
</screen>
@@ -819,6 +822,7 @@ EXPLAIN ANALYZE SELECT * FROM polygon_tbl WHERE f1 @> polygon '(0.5,2.0)';
-------------------------------------------------------------------&zwsp;-------------------------------------------------------
Index Scan using gpolygonind on polygon_tbl (cost=0.13..8.15 rows=1 width=85) (actual time=0.074..0.074 rows=0 loops=1)
Index Cond: (f1 @> '((0.5,2))'::polygon)
+ Index Searches: 1
Rows Removed by Index Recheck: 1
Planning Time: 0.039 ms
Execution Time: 0.098 ms
@@ -848,9 +852,11 @@ EXPLAIN (ANALYZE, BUFFERS) SELECT * FROM tenk1 WHERE unique1 < 100 AND unique
Buffers: shared hit=4 read=3
-> Bitmap Index Scan on tenk1_unique1 (cost=0.00..5.04 rows=100 width=0) (actual time=0.027..0.027 rows=100 loops=1)
Index Cond: (unique1 < 100)
+ Index Searches: 1
Buffers: shared hit=2
-> Bitmap Index Scan on tenk1_unique2 (cost=0.00..19.78 rows=999 width=0) (actual time=0.070..0.070 rows=999 loops=1)
Index Cond: (unique2 > 9000)
+ Index Searches: 1
Buffers: shared hit=2 read=3
Planning:
Buffers: shared hit=3
@@ -883,6 +889,7 @@ EXPLAIN ANALYZE UPDATE tenk1 SET hundred = hundred + 1 WHERE unique1 < 100;
Heap Blocks: exact=90
-> Bitmap Index Scan on tenk1_unique1 (cost=0.00..5.04 rows=100 width=0) (actual time=0.031..0.031 rows=100 loops=1)
Index Cond: (unique1 < 100)
+ Index Searches: 1
Planning Time: 0.151 ms
Execution Time: 1.856 ms
@@ -1019,6 +1026,7 @@ EXPLAIN ANALYZE SELECT * FROM tenk1 WHERE unique1 < 100 AND unique2 > 9000
Index Cond: (unique2 > 9000)
Filter: (unique1 < 100)
Rows Removed by Filter: 287
+ Index Searches: 1
Planning Time: 0.077 ms
Execution Time: 0.086 ms
</screen>
diff --git a/doc/src/sgml/ref/explain.sgml b/doc/src/sgml/ref/explain.sgml
index db9d3a854..0ceb93070 100644
--- a/doc/src/sgml/ref/explain.sgml
+++ b/doc/src/sgml/ref/explain.sgml
@@ -502,6 +502,7 @@ EXPLAIN ANALYZE EXECUTE query(100, 200);
Batches: 1 Memory Usage: 24kB
-> Index Scan using test_pkey on test (cost=0.29..10.27 rows=99 width=8) (actual time=0.009..0.025 rows=99 loops=1)
Index Cond: ((id > 100) AND (id < 200))
+ Index Searches: 1
Planning Time: 0.244 ms
Execution Time: 0.073 ms
(7 rows)
diff --git a/doc/src/sgml/rules.sgml b/doc/src/sgml/rules.sgml
index 7a928bd7b..17112971f 100644
--- a/doc/src/sgml/rules.sgml
+++ b/doc/src/sgml/rules.sgml
@@ -1046,6 +1046,7 @@ SELECT count(*) FROM words WHERE word = 'caterpiler';
-> Index Only Scan using wrd_word on wrd (cost=0.42..4.44 rows=1 width=0) (actual time=0.039..0.039 rows=0 loops=1)
Index Cond: (word = 'caterpiler'::text)
Heap Fetches: 0
+ Index Searches: 1
Planning time: 0.164 ms
Execution time: 0.117 ms
</programlisting>
diff --git a/src/test/regress/expected/brin_multi.out b/src/test/regress/expected/brin_multi.out
index ae9ce9d8e..c24d56007 100644
--- a/src/test/regress/expected/brin_multi.out
+++ b/src/test/regress/expected/brin_multi.out
@@ -853,7 +853,8 @@ SELECT * FROM brin_date_test WHERE a = '2023-01-01'::date;
Recheck Cond: (a = '2023-01-01'::date)
-> Bitmap Index Scan on brin_date_test_a_idx (actual rows=0 loops=1)
Index Cond: (a = '2023-01-01'::date)
-(4 rows)
+ Index Searches: 1
+(5 rows)
DROP TABLE brin_date_test;
RESET enable_seqscan;
@@ -872,7 +873,8 @@ SELECT * FROM brin_timestamp_test WHERE a = '2023-01-01'::timestamp;
Recheck Cond: (a = '2023-01-01 00:00:00'::timestamp without time zone)
-> Bitmap Index Scan on brin_timestamp_test_a_idx (actual rows=0 loops=1)
Index Cond: (a = '2023-01-01 00:00:00'::timestamp without time zone)
-(4 rows)
+ Index Searches: 1
+(5 rows)
EXPLAIN (ANALYZE, TIMING OFF, COSTS OFF, SUMMARY OFF)
SELECT * FROM brin_timestamp_test WHERE a = '1900-01-01'::timestamp;
@@ -882,7 +884,8 @@ SELECT * FROM brin_timestamp_test WHERE a = '1900-01-01'::timestamp;
Recheck Cond: (a = '1900-01-01 00:00:00'::timestamp without time zone)
-> Bitmap Index Scan on brin_timestamp_test_a_idx (actual rows=0 loops=1)
Index Cond: (a = '1900-01-01 00:00:00'::timestamp without time zone)
-(4 rows)
+ Index Searches: 1
+(5 rows)
DROP TABLE brin_timestamp_test;
RESET enable_seqscan;
@@ -900,7 +903,8 @@ SELECT * FROM brin_date_test WHERE a = '2023-01-01'::date;
Recheck Cond: (a = '2023-01-01'::date)
-> Bitmap Index Scan on brin_date_test_a_idx (actual rows=0 loops=1)
Index Cond: (a = '2023-01-01'::date)
-(4 rows)
+ Index Searches: 1
+(5 rows)
EXPLAIN (ANALYZE, TIMING OFF, COSTS OFF, SUMMARY OFF)
SELECT * FROM brin_date_test WHERE a = '1900-01-01'::date;
@@ -910,7 +914,8 @@ SELECT * FROM brin_date_test WHERE a = '1900-01-01'::date;
Recheck Cond: (a = '1900-01-01'::date)
-> Bitmap Index Scan on brin_date_test_a_idx (actual rows=0 loops=1)
Index Cond: (a = '1900-01-01'::date)
-(4 rows)
+ Index Searches: 1
+(5 rows)
DROP TABLE brin_date_test;
RESET enable_seqscan;
@@ -929,7 +934,8 @@ SELECT * FROM brin_interval_test WHERE a = '-30 years'::interval;
Recheck Cond: (a = '@ 30 years ago'::interval)
-> Bitmap Index Scan on brin_interval_test_a_idx (actual rows=0 loops=1)
Index Cond: (a = '@ 30 years ago'::interval)
-(4 rows)
+ Index Searches: 1
+(5 rows)
EXPLAIN (ANALYZE, TIMING OFF, COSTS OFF, SUMMARY OFF)
SELECT * FROM brin_interval_test WHERE a = '30 years'::interval;
@@ -939,7 +945,8 @@ SELECT * FROM brin_interval_test WHERE a = '30 years'::interval;
Recheck Cond: (a = '@ 30 years'::interval)
-> Bitmap Index Scan on brin_interval_test_a_idx (actual rows=0 loops=1)
Index Cond: (a = '@ 30 years'::interval)
-(4 rows)
+ Index Searches: 1
+(5 rows)
DROP TABLE brin_interval_test;
RESET enable_seqscan;
@@ -957,7 +964,8 @@ SELECT * FROM brin_interval_test WHERE a = '-30 years'::interval;
Recheck Cond: (a = '@ 30 years ago'::interval)
-> Bitmap Index Scan on brin_interval_test_a_idx (actual rows=0 loops=1)
Index Cond: (a = '@ 30 years ago'::interval)
-(4 rows)
+ Index Searches: 1
+(5 rows)
EXPLAIN (ANALYZE, TIMING OFF, COSTS OFF, SUMMARY OFF)
SELECT * FROM brin_interval_test WHERE a = '30 years'::interval;
@@ -967,7 +975,8 @@ SELECT * FROM brin_interval_test WHERE a = '30 years'::interval;
Recheck Cond: (a = '@ 30 years'::interval)
-> Bitmap Index Scan on brin_interval_test_a_idx (actual rows=0 loops=1)
Index Cond: (a = '@ 30 years'::interval)
-(4 rows)
+ Index Searches: 1
+(5 rows)
DROP TABLE brin_interval_test;
RESET enable_seqscan;
diff --git a/src/test/regress/expected/memoize.out b/src/test/regress/expected/memoize.out
index df2ca5ba4..b9e457857 100644
--- a/src/test/regress/expected/memoize.out
+++ b/src/test/regress/expected/memoize.out
@@ -22,8 +22,10 @@ begin
ln := regexp_replace(ln, 'Evictions: 0', 'Evictions: Zero');
ln := regexp_replace(ln, 'Evictions: \d+', 'Evictions: N');
ln := regexp_replace(ln, 'Memory Usage: \d+', 'Memory Usage: N');
- ln := regexp_replace(ln, 'Heap Fetches: \d+', 'Heap Fetches: N');
- ln := regexp_replace(ln, 'loops=\d+', 'loops=N');
+ ln := regexp_replace(ln, 'Heap Fetches: \d+', 'Heap Fetches: N');
+ ln := regexp_replace(ln, 'loops=\d+', 'loops=N');
+ ln := regexp_replace(ln, 'Index Searches: 0', 'Index Searches: Zero');
+ ln := regexp_replace(ln, 'Index Searches: \d+', 'Index Searches: N');
return next ln;
end loop;
end;
@@ -49,7 +51,8 @@ WHERE t2.unique1 < 1000;', false);
-> Index Only Scan using tenk1_unique1 on tenk1 t1 (actual rows=1 loops=N)
Index Cond: (unique1 = t2.twenty)
Heap Fetches: N
-(12 rows)
+ Index Searches: N
+(13 rows)
-- And check we get the expected results.
SELECT COUNT(*),AVG(t1.unique1) FROM tenk1 t1
@@ -80,7 +83,8 @@ WHERE t1.unique1 < 1000;', false);
-> Index Only Scan using tenk1_unique1 on tenk1 t2 (actual rows=1 loops=N)
Index Cond: (unique1 = t1.twenty)
Heap Fetches: N
-(12 rows)
+ Index Searches: N
+(13 rows)
-- And check we get the expected results.
SELECT COUNT(*),AVG(t2.unique1) FROM tenk1 t1,
@@ -106,6 +110,7 @@ WHERE t1.unique1 < 10;', false);
-> Nested Loop Left Join (actual rows=20 loops=N)
-> Index Scan using tenk1_unique1 on tenk1 t1 (actual rows=10 loops=N)
Index Cond: (unique1 < 10)
+ Index Searches: N
-> Memoize (actual rows=2 loops=N)
Cache Key: t1.two
Cache Mode: binary
@@ -115,7 +120,8 @@ WHERE t1.unique1 < 10;', false);
Rows Removed by Filter: 2
-> Index Scan using tenk1_unique1 on tenk1 t2_1 (actual rows=4 loops=N)
Index Cond: (unique1 < 4)
-(13 rows)
+ Index Searches: N
+(15 rows)
-- And check we get the expected results.
SELECT COUNT(*),AVG(t2.t1two) FROM tenk1 t1 LEFT JOIN
@@ -149,7 +155,8 @@ WHERE s.c1 = s.c2 AND t1.unique1 < 1000;', false);
Filter: ((t1.two + 1) = unique1)
Rows Removed by Filter: 9999
Heap Fetches: N
-(13 rows)
+ Index Searches: N
+(14 rows)
-- And check we get the expected results.
SELECT COUNT(*), AVG(t1.twenty) FROM tenk1 t1 LEFT JOIN
@@ -249,7 +256,8 @@ ON t1.x = t2.t::numeric AND t1.t::numeric = t2.x;', false);
Index Cond: (x = (t1.t)::numeric)
Filter: (t1.x = (t)::numeric)
Heap Fetches: N
-(10 rows)
+ Index Searches: N
+(11 rows)
DROP TABLE expr_key;
-- Reduce work_mem and hash_mem_multiplier so that we see some cache evictions
@@ -276,7 +284,8 @@ WHERE t2.unique1 < 1200;', true);
-> Index Only Scan using tenk1_unique1 on tenk1 t1 (actual rows=1 loops=N)
Index Cond: (unique1 = t2.thousand)
Heap Fetches: N
-(12 rows)
+ Index Searches: N
+(13 rows)
CREATE TABLE flt (f float);
CREATE INDEX flt_f_idx ON flt (f);
@@ -291,6 +300,7 @@ SELECT * FROM flt f1 INNER JOIN flt f2 ON f1.f = f2.f;', false);
Nested Loop (actual rows=4 loops=N)
-> Index Only Scan using flt_f_idx on flt f1 (actual rows=2 loops=N)
Heap Fetches: N
+ Index Searches: N
-> Memoize (actual rows=2 loops=N)
Cache Key: f1.f
Cache Mode: logical
@@ -298,7 +308,8 @@ SELECT * FROM flt f1 INNER JOIN flt f2 ON f1.f = f2.f;', false);
-> Index Only Scan using flt_f_idx on flt f2 (actual rows=2 loops=N)
Index Cond: (f = f1.f)
Heap Fetches: N
-(10 rows)
+ Index Searches: N
+(12 rows)
-- Ensure memoize operates in binary mode
SELECT explain_memoize('
@@ -308,6 +319,7 @@ SELECT * FROM flt f1 INNER JOIN flt f2 ON f1.f >= f2.f;', false);
Nested Loop (actual rows=4 loops=N)
-> Index Only Scan using flt_f_idx on flt f1 (actual rows=2 loops=N)
Heap Fetches: N
+ Index Searches: N
-> Memoize (actual rows=2 loops=N)
Cache Key: f1.f
Cache Mode: binary
@@ -315,7 +327,8 @@ SELECT * FROM flt f1 INNER JOIN flt f2 ON f1.f >= f2.f;', false);
-> Index Only Scan using flt_f_idx on flt f2 (actual rows=2 loops=N)
Index Cond: (f <= f1.f)
Heap Fetches: N
-(10 rows)
+ Index Searches: N
+(12 rows)
DROP TABLE flt;
-- Exercise Memoize in binary mode with a large fixed width type and a
@@ -342,7 +355,8 @@ SELECT * FROM strtest s1 INNER JOIN strtest s2 ON s1.n >= s2.n;', false);
Hits: 3 Misses: 3 Evictions: Zero Overflows: 0 Memory Usage: NkB
-> Index Scan using strtest_n_idx on strtest s2 (actual rows=4 loops=N)
Index Cond: (n <= s1.n)
-(10 rows)
+ Index Searches: N
+(11 rows)
-- Ensure we get 3 hits and 3 misses
SELECT explain_memoize('
@@ -359,7 +373,8 @@ SELECT * FROM strtest s1 INNER JOIN strtest s2 ON s1.t >= s2.t;', false);
Hits: 3 Misses: 3 Evictions: Zero Overflows: 0 Memory Usage: NkB
-> Index Scan using strtest_t_idx on strtest s2 (actual rows=4 loops=N)
Index Cond: (t <= s1.t)
-(10 rows)
+ Index Searches: N
+(11 rows)
DROP TABLE strtest;
-- Ensure memoize works with partitionwise join
@@ -380,6 +395,7 @@ SELECT * FROM prt t1 INNER JOIN prt t2 ON t1.a = t2.a;', false);
-> Nested Loop (actual rows=16 loops=N)
-> Index Only Scan using iprt_p1_a on prt_p1 t1_1 (actual rows=4 loops=N)
Heap Fetches: N
+ Index Searches: N
-> Memoize (actual rows=4 loops=N)
Cache Key: t1_1.a
Cache Mode: logical
@@ -387,9 +403,11 @@ SELECT * FROM prt t1 INNER JOIN prt t2 ON t1.a = t2.a;', false);
-> Index Only Scan using iprt_p1_a on prt_p1 t2_1 (actual rows=4 loops=N)
Index Cond: (a = t1_1.a)
Heap Fetches: N
+ Index Searches: N
-> Nested Loop (actual rows=16 loops=N)
-> Index Only Scan using iprt_p2_a on prt_p2 t1_2 (actual rows=4 loops=N)
Heap Fetches: N
+ Index Searches: N
-> Memoize (actual rows=4 loops=N)
Cache Key: t1_2.a
Cache Mode: logical
@@ -397,7 +415,8 @@ SELECT * FROM prt t1 INNER JOIN prt t2 ON t1.a = t2.a;', false);
-> Index Only Scan using iprt_p2_a on prt_p2 t2_2 (actual rows=4 loops=N)
Index Cond: (a = t1_2.a)
Heap Fetches: N
-(21 rows)
+ Index Searches: N
+(25 rows)
-- Ensure memoize works with parameterized union-all Append path
SET enable_partitionwise_join TO off;
@@ -410,6 +429,7 @@ ON t1.a = t2.a;', false);
Nested Loop (actual rows=16 loops=N)
-> Index Only Scan using iprt_p1_a on prt_p1 t1 (actual rows=4 loops=N)
Heap Fetches: N
+ Index Searches: N
-> Memoize (actual rows=4 loops=N)
Cache Key: t1.a
Cache Mode: logical
@@ -418,10 +438,12 @@ ON t1.a = t2.a;', false);
-> Index Only Scan using iprt_p1_a on prt_p1 (actual rows=4 loops=N)
Index Cond: (a = t1.a)
Heap Fetches: N
+ Index Searches: N
-> Index Only Scan using iprt_p2_a on prt_p2 (actual rows=0 loops=N)
Index Cond: (a = t1.a)
Heap Fetches: N
-(14 rows)
+ Index Searches: N
+(17 rows)
DROP TABLE prt;
RESET enable_partitionwise_join;
diff --git a/src/test/regress/expected/partition_prune.out b/src/test/regress/expected/partition_prune.out
index 7a03b4e36..65f8387d3 100644
--- a/src/test/regress/expected/partition_prune.out
+++ b/src/test/regress/expected/partition_prune.out
@@ -2340,6 +2340,10 @@ begin
ln := regexp_replace(ln, 'Workers Launched: \d+', 'Workers Launched: N');
ln := regexp_replace(ln, 'actual rows=\d+ loops=\d+', 'actual rows=N loops=N');
ln := regexp_replace(ln, 'Rows Removed by Filter: \d+', 'Rows Removed by Filter: N');
+ perform regexp_matches(ln, 'Index Searches: \d+');
+ if found then
+ continue;
+ end if;
return next ln;
end loop;
end;
@@ -2692,12 +2696,13 @@ select * from ab where a = (select max(a) from lprt_a) and b = (select max(a)-1
Filter: (b = (InitPlan 2).col1)
-> Bitmap Index Scan on ab_a3_b2_a_idx (actual rows=0 loops=1)
Index Cond: (a = (InitPlan 1).col1)
+ Index Searches: 1
-> Bitmap Heap Scan on ab_a3_b3 ab_9 (never executed)
Recheck Cond: (a = (InitPlan 1).col1)
Filter: (b = (InitPlan 2).col1)
-> Bitmap Index Scan on ab_a3_b3_a_idx (never executed)
Index Cond: (a = (InitPlan 1).col1)
-(52 rows)
+(53 rows)
-- Test run-time partition pruning with UNION ALL parents
explain (analyze, costs off, summary off, timing off)
@@ -2713,6 +2718,7 @@ select * from (select * from ab where a = 1 union all select * from ab) ab where
Filter: (b = (InitPlan 1).col1)
-> Bitmap Index Scan on ab_a1_b1_a_idx (actual rows=0 loops=1)
Index Cond: (a = 1)
+ Index Searches: 1
-> Bitmap Heap Scan on ab_a1_b2 ab_12 (never executed)
Recheck Cond: (a = 1)
Filter: (b = (InitPlan 1).col1)
@@ -2741,7 +2747,7 @@ select * from (select * from ab where a = 1 union all select * from ab) ab where
Filter: (b = (InitPlan 1).col1)
-> Seq Scan on ab_a3_b3 ab_9 (never executed)
Filter: (b = (InitPlan 1).col1)
-(37 rows)
+(38 rows)
-- A case containing a UNION ALL with a non-partitioned child.
explain (analyze, costs off, summary off, timing off)
@@ -2757,6 +2763,7 @@ select * from (select * from ab where a = 1 union all (values(10,5)) union all s
Filter: (b = (InitPlan 1).col1)
-> Bitmap Index Scan on ab_a1_b1_a_idx (actual rows=0 loops=1)
Index Cond: (a = 1)
+ Index Searches: 1
-> Bitmap Heap Scan on ab_a1_b2 ab_12 (never executed)
Recheck Cond: (a = 1)
Filter: (b = (InitPlan 1).col1)
@@ -2787,7 +2794,7 @@ select * from (select * from ab where a = 1 union all (values(10,5)) union all s
Filter: (b = (InitPlan 1).col1)
-> Seq Scan on ab_a3_b3 ab_9 (never executed)
Filter: (b = (InitPlan 1).col1)
-(39 rows)
+(40 rows)
-- Another UNION ALL test, but containing a mix of exec init and exec run-time pruning.
create table xy_1 (x int, y int);
@@ -2858,16 +2865,19 @@ update ab_a1 set b = 3 from ab where ab.a = 1 and ab.a = ab_a1.a;');
Recheck Cond: (a = 1)
-> Bitmap Index Scan on ab_a1_b1_a_idx (actual rows=0 loops=1)
Index Cond: (a = 1)
+ Index Searches: 1
-> Bitmap Heap Scan on ab_a1_b2 ab_a1_2 (actual rows=1 loops=1)
Recheck Cond: (a = 1)
Heap Blocks: exact=1
-> Bitmap Index Scan on ab_a1_b2_a_idx (actual rows=1 loops=1)
Index Cond: (a = 1)
+ Index Searches: 1
-> Bitmap Heap Scan on ab_a1_b3 ab_a1_3 (actual rows=0 loops=1)
Recheck Cond: (a = 1)
Heap Blocks: exact=1
-> Bitmap Index Scan on ab_a1_b3_a_idx (actual rows=1 loops=1)
Index Cond: (a = 1)
+ Index Searches: 1
-> Materialize (actual rows=1 loops=1)
Storage: Memory Maximum Storage: NkB
-> Append (actual rows=1 loops=1)
@@ -2875,17 +2885,20 @@ update ab_a1 set b = 3 from ab where ab.a = 1 and ab.a = ab_a1.a;');
Recheck Cond: (a = 1)
-> Bitmap Index Scan on ab_a1_b1_a_idx (actual rows=0 loops=1)
Index Cond: (a = 1)
+ Index Searches: 1
-> Bitmap Heap Scan on ab_a1_b2 ab_2 (actual rows=1 loops=1)
Recheck Cond: (a = 1)
Heap Blocks: exact=1
-> Bitmap Index Scan on ab_a1_b2_a_idx (actual rows=1 loops=1)
Index Cond: (a = 1)
+ Index Searches: 1
-> Bitmap Heap Scan on ab_a1_b3 ab_3 (actual rows=0 loops=1)
Recheck Cond: (a = 1)
Heap Blocks: exact=1
-> Bitmap Index Scan on ab_a1_b3_a_idx (actual rows=1 loops=1)
Index Cond: (a = 1)
-(37 rows)
+ Index Searches: 1
+(43 rows)
table ab;
a | b
@@ -2961,8 +2974,10 @@ select * from tbl1 join tprt on tbl1.col1 > tprt.col1;
-> Append (actual rows=3 loops=2)
-> Index Scan using tprt1_idx on tprt_1 (actual rows=2 loops=2)
Index Cond: (col1 < tbl1.col1)
+ Index Searches: 2
-> Index Scan using tprt2_idx on tprt_2 (actual rows=2 loops=1)
Index Cond: (col1 < tbl1.col1)
+ Index Searches: 1
-> Index Scan using tprt3_idx on tprt_3 (never executed)
Index Cond: (col1 < tbl1.col1)
-> Index Scan using tprt4_idx on tprt_4 (never executed)
@@ -2971,7 +2986,7 @@ select * from tbl1 join tprt on tbl1.col1 > tprt.col1;
Index Cond: (col1 < tbl1.col1)
-> Index Scan using tprt6_idx on tprt_6 (never executed)
Index Cond: (col1 < tbl1.col1)
-(15 rows)
+(17 rows)
explain (analyze, costs off, summary off, timing off)
select * from tbl1 join tprt on tbl1.col1 = tprt.col1;
@@ -2984,6 +2999,7 @@ select * from tbl1 join tprt on tbl1.col1 = tprt.col1;
Index Cond: (col1 = tbl1.col1)
-> Index Scan using tprt2_idx on tprt_2 (actual rows=1 loops=2)
Index Cond: (col1 = tbl1.col1)
+ Index Searches: 2
-> Index Scan using tprt3_idx on tprt_3 (never executed)
Index Cond: (col1 = tbl1.col1)
-> Index Scan using tprt4_idx on tprt_4 (never executed)
@@ -2992,7 +3008,7 @@ select * from tbl1 join tprt on tbl1.col1 = tprt.col1;
Index Cond: (col1 = tbl1.col1)
-> Index Scan using tprt6_idx on tprt_6 (never executed)
Index Cond: (col1 = tbl1.col1)
-(15 rows)
+(16 rows)
select tbl1.col1, tprt.col1 from tbl1
inner join tprt on tbl1.col1 > tprt.col1
@@ -3027,17 +3043,20 @@ select * from tbl1 inner join tprt on tbl1.col1 > tprt.col1;
-> Append (actual rows=5 loops=5)
-> Index Scan using tprt1_idx on tprt_1 (actual rows=2 loops=5)
Index Cond: (col1 < tbl1.col1)
+ Index Searches: 5
-> Index Scan using tprt2_idx on tprt_2 (actual rows=3 loops=4)
Index Cond: (col1 < tbl1.col1)
+ Index Searches: 4
-> Index Scan using tprt3_idx on tprt_3 (actual rows=1 loops=2)
Index Cond: (col1 < tbl1.col1)
+ Index Searches: 2
-> Index Scan using tprt4_idx on tprt_4 (never executed)
Index Cond: (col1 < tbl1.col1)
-> Index Scan using tprt5_idx on tprt_5 (never executed)
Index Cond: (col1 < tbl1.col1)
-> Index Scan using tprt6_idx on tprt_6 (never executed)
Index Cond: (col1 < tbl1.col1)
-(15 rows)
+(18 rows)
explain (analyze, costs off, summary off, timing off)
select * from tbl1 inner join tprt on tbl1.col1 = tprt.col1;
@@ -3050,15 +3069,17 @@ select * from tbl1 inner join tprt on tbl1.col1 = tprt.col1;
Index Cond: (col1 = tbl1.col1)
-> Index Scan using tprt2_idx on tprt_2 (actual rows=1 loops=2)
Index Cond: (col1 = tbl1.col1)
+ Index Searches: 2
-> Index Scan using tprt3_idx on tprt_3 (actual rows=0 loops=3)
Index Cond: (col1 = tbl1.col1)
+ Index Searches: 3
-> Index Scan using tprt4_idx on tprt_4 (never executed)
Index Cond: (col1 = tbl1.col1)
-> Index Scan using tprt5_idx on tprt_5 (never executed)
Index Cond: (col1 = tbl1.col1)
-> Index Scan using tprt6_idx on tprt_6 (never executed)
Index Cond: (col1 = tbl1.col1)
-(15 rows)
+(17 rows)
select tbl1.col1, tprt.col1 from tbl1
inner join tprt on tbl1.col1 > tprt.col1
@@ -3122,7 +3143,8 @@ select * from tbl1 join tprt on tbl1.col1 < tprt.col1;
Index Cond: (col1 > tbl1.col1)
-> Index Scan using tprt6_idx on tprt_6 (actual rows=1 loops=1)
Index Cond: (col1 > tbl1.col1)
-(15 rows)
+ Index Searches: 1
+(16 rows)
select tbl1.col1, tprt.col1 from tbl1
inner join tprt on tbl1.col1 < tprt.col1
@@ -3484,10 +3506,12 @@ explain (analyze, costs off, summary off, timing off) execute mt_q1(15);
-> Index Scan using ma_test_p2_b_idx on ma_test_p2 ma_test_1 (actual rows=1 loops=1)
Filter: ((a >= $1) AND ((a % 10) = 5))
Rows Removed by Filter: 9
+ Index Searches: 1
-> Index Scan using ma_test_p3_b_idx on ma_test_p3 ma_test_2 (actual rows=1 loops=1)
Filter: ((a >= $1) AND ((a % 10) = 5))
Rows Removed by Filter: 9
-(9 rows)
+ Index Searches: 1
+(11 rows)
execute mt_q1(15);
a
@@ -3505,7 +3529,8 @@ explain (analyze, costs off, summary off, timing off) execute mt_q1(25);
-> Index Scan using ma_test_p3_b_idx on ma_test_p3 ma_test_1 (actual rows=1 loops=1)
Filter: ((a >= $1) AND ((a % 10) = 5))
Rows Removed by Filter: 9
-(6 rows)
+ Index Searches: 1
+(7 rows)
execute mt_q1(25);
a
@@ -3553,13 +3578,16 @@ explain (analyze, costs off, summary off, timing off) select * from ma_test wher
-> Limit (actual rows=1 loops=1)
-> Index Scan using ma_test_p2_b_idx on ma_test_p2 (actual rows=1 loops=1)
Index Cond: (b IS NOT NULL)
+ Index Searches: 1
-> Index Scan using ma_test_p1_b_idx on ma_test_p1 ma_test_1 (never executed)
Filter: (a >= (InitPlan 2).col1)
-> Index Scan using ma_test_p2_b_idx on ma_test_p2 ma_test_2 (actual rows=10 loops=1)
Filter: (a >= (InitPlan 2).col1)
+ Index Searches: 1
-> Index Scan using ma_test_p3_b_idx on ma_test_p3 ma_test_3 (actual rows=10 loops=1)
Filter: (a >= (InitPlan 2).col1)
-(14 rows)
+ Index Searches: 1
+(17 rows)
reset enable_seqscan;
reset enable_sort;
@@ -4130,13 +4158,16 @@ select * from rangep where b IN((select 1),(select 2)) order by a;
Sort Key: rangep_2.a
-> Index Scan using rangep_0_to_100_1_a_idx on rangep_0_to_100_1 rangep_2 (actual rows=0 loops=1)
Filter: (b = ANY (ARRAY[(InitPlan 1).col1, (InitPlan 2).col1]))
+ Index Searches: 1
-> Index Scan using rangep_0_to_100_2_a_idx on rangep_0_to_100_2 rangep_3 (actual rows=0 loops=1)
Filter: (b = ANY (ARRAY[(InitPlan 1).col1, (InitPlan 2).col1]))
+ Index Searches: 1
-> Index Scan using rangep_0_to_100_3_a_idx on rangep_0_to_100_3 rangep_4 (never executed)
Filter: (b = ANY (ARRAY[(InitPlan 1).col1, (InitPlan 2).col1]))
-> Index Scan using rangep_100_to_200_a_idx on rangep_100_to_200 rangep_5 (actual rows=0 loops=1)
Filter: (b = ANY (ARRAY[(InitPlan 1).col1, (InitPlan 2).col1]))
-(15 rows)
+ Index Searches: 1
+(18 rows)
reset enable_sort;
drop table rangep;
diff --git a/src/test/regress/expected/select.out b/src/test/regress/expected/select.out
index 33a6dceb0..02797a259 100644
--- a/src/test/regress/expected/select.out
+++ b/src/test/regress/expected/select.out
@@ -764,7 +764,8 @@ select * from onek2 where unique2 = 11 and stringu1 = 'ATAAAA';
Index Scan using onek2_u2_prtl on onek2 (actual rows=1 loops=1)
Index Cond: (unique2 = 11)
Filter: (stringu1 = 'ATAAAA'::name)
-(3 rows)
+ Index Searches: 1
+(4 rows)
explain (costs off)
select unique2 from onek2 where unique2 = 11 and stringu1 = 'ATAAAA';
diff --git a/src/test/regress/sql/memoize.sql b/src/test/regress/sql/memoize.sql
index 059bec5f4..33e6c4b67 100644
--- a/src/test/regress/sql/memoize.sql
+++ b/src/test/regress/sql/memoize.sql
@@ -23,8 +23,10 @@ begin
ln := regexp_replace(ln, 'Evictions: 0', 'Evictions: Zero');
ln := regexp_replace(ln, 'Evictions: \d+', 'Evictions: N');
ln := regexp_replace(ln, 'Memory Usage: \d+', 'Memory Usage: N');
- ln := regexp_replace(ln, 'Heap Fetches: \d+', 'Heap Fetches: N');
- ln := regexp_replace(ln, 'loops=\d+', 'loops=N');
+ ln := regexp_replace(ln, 'Heap Fetches: \d+', 'Heap Fetches: N');
+ ln := regexp_replace(ln, 'loops=\d+', 'loops=N');
+ ln := regexp_replace(ln, 'Index Searches: 0', 'Index Searches: Zero');
+ ln := regexp_replace(ln, 'Index Searches: \d+', 'Index Searches: N');
return next ln;
end loop;
end;
diff --git a/src/test/regress/sql/partition_prune.sql b/src/test/regress/sql/partition_prune.sql
index 442428d93..085e746af 100644
--- a/src/test/regress/sql/partition_prune.sql
+++ b/src/test/regress/sql/partition_prune.sql
@@ -573,6 +573,10 @@ begin
ln := regexp_replace(ln, 'Workers Launched: \d+', 'Workers Launched: N');
ln := regexp_replace(ln, 'actual rows=\d+ loops=\d+', 'actual rows=N loops=N');
ln := regexp_replace(ln, 'Rows Removed by Filter: \d+', 'Rows Removed by Filter: N');
+ perform regexp_matches(ln, 'Index Searches: \d+');
+ if found then
+ continue;
+ end if;
return next ln;
end loop;
end;
--
2.45.2
On Tue, Aug 27, 2024 at 11:16 AM Peter Geoghegan <pg@bowt.ie> wrote:
On Thu, Aug 15, 2024 at 3:22 PM Peter Geoghegan <pg@bowt.ie> wrote:
Attached patch has EXPLAIN ANALYZE display the total number of
primitive index scans for all 3 kinds of index scan node.
Attached is v2, which fixes bitrot.
v2 also uses new terminology. EXPLAIN ANALYZE will now show "Index
Searches: N", not "Primitive Index Scans: N". Although there is
limited precedent for using the primitive scan terminology, I think
that it's a bit unwieldy.
I do like "Index Searches" better than "Primitive Index Scans."
But I think Matthias had some good points about this being
btree-specific. I'm not sure whether he was completely correct, but
you seemed to just dismiss his argument and say "well, that can't be
done," which doesn't seem convincing to me at all. If, for non-btree
indexes, the number of index searches will always be the same as the
loop count, then surely there is some way to avoid cluttering the
output for non-btree indexes with output that can never be of any use.
--
Robert Haas
EDB: http://www.enterprisedb.com
On Tue, Aug 27, 2024 at 1:45 PM Robert Haas <robertmhaas@gmail.com> wrote:
I do like "Index Searches" better than "Primitive Index Scans."
But I think Matthias had some good points about this being
btree-specific.
It's not B-Tree specific -- not really. Any index scan that can at
least non-natively support ScalarArrayOps (i.e. SAOP scans that the
executor manages using ExecIndexEvalArrayKeys() + bitmap scans) will
show information that is exactly equivalent to what B-Tree will show,
given a similar ScalarArrayOps query.
There is at best one limited sense in which the information shown is
B-Tree specific: it tends to be more interesting in the case of B-Tree
index scans. You cannot trivially derive the number based on the
number of array keys for B-Tree scans, since nbtree is now clever
about not needlessly searching the index anew. It's quite possible
that other index AMs will in the future be enhanced in about the same
way as nbtree was in commit 5bf748b86b, at which point even this will
no longer apply. (Tom speculated about adding something like that to
GiST recently).
I'm not sure whether he was completely correct, but
you seemed to just dismiss his argument and say "well, that can't be
done," which doesn't seem convincing to me at all.
To be clear, any variation that you can think of *can* be done without
much difficulty. I thought that Matthias was unclear about what he
even wanted, is all.
The problem isn't that there aren't any alternatives. The problem, if
any, is that there are a huge number of slightly different
alternatives. There are hopelessly subjective questions about what the
best trade-off between redundancy and consistency is. I'm absolutely
not set on doing things in exactly the way I've laid out.
What do you think should be done? Note that the number of loops
matters here, in addition to the number of SAOP primitive
scans/searches. If you want to suppress the information shown in the
typical "nsearches == 1" case, what does that mean for the less common
"nsearches == 0" case?
If, for non-btree
indexes, the number of index searches will always be the same as the
loop count, then surely there is some way to avoid cluttering the
output for non-btree indexes with output that can never be of any use.
Even if we assume that a given index/index AM will never use SAOPs,
it's still possible to show more than one "Index Search" per executor
node execution. For example, when an index scan node is the inner side
of a nestloop join.
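A rough sketch of that case, again using tenk1 (the planner may of
course pick a different join strategy):
explain (analyze, costs off, summary off)
select *
from tenk1 t1
join tenk1 t2 on t2.unique2 = t1.unique2
where t1.unique1 < 10;
-- the inner index scan on tenk1_unique2 is rescanned once per outer row
-- (loops=10), so its search count is driven by the join, not by any arrays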
I see value in making it obvious to users when and how
pg_stat_all_indexes.idx_scan advances. Being able to easily relate it
to EXPLAIN ANALYZE output is useful, independent of whether or not
SAOPs happen to be used. That's probably the single best argument in
favor of showing "Index Searches: N" unconditionally. But I'm
certainly not going to refuse to budge over that.
--
Peter Geoghegan
Peter Geoghegan <pg@bowt.ie> writes:
I see value in making it obvious to users when and how
pg_stat_all_indexes.idx_scan advances. Being able to easily relate it
to EXPLAIN ANALYZE output is useful, independent of whether or not
SAOPs happen to be used. That's probably the single best argument in
favor of showing "Index Searches: N" unconditionally. But I'm
certainly not going to refuse to budge over that.
TBH, I'm afraid that this patch basically is exposing numbers that
nobody but Peter Geoghegan and maybe two or three other hackers
will understand, and even fewer people will find useful (since the
how-many-primitive-scans behavior is not something users have any
control over, IIUC). I doubt that "it lines up with
pg_stat_all_indexes.idx_scan" is enough to justify the additional
clutter in EXPLAIN. Maybe we should be going the other direction
and trying to make pg_stat_all_indexes count in a less detailed but
less surprising way, ie once per indexscan plan node invocation.
regards, tom lane
On Tue, Aug 27, 2024 at 3:04 PM Tom Lane <tgl@sss.pgh.pa.us> wrote:
TBH, I'm afraid that this patch basically is exposing numbers that
nobody but Peter Geoghegan and maybe two or three other hackers
will understand, and even fewer people will find useful (since the
how-many-primitive-scans behavior is not something users have any
control over, IIUC).
You can make about the same argument against showing "Buffers". It's
not really something that you can address directly, either. It's
helpful only in the context of a specific problem.
I doubt that "it lines up with
pg_stat_all_indexes.idx_scan" is enough to justify the additional
clutter in EXPLAIN.
The scheme laid out in the patch is just a starting point for
discussion. I just think that it's particularly important that we have
this for skip scan -- that's the part that I feel strongly about.
With skip scan in place, every scan of the kind we'd currently call a
"full index scan" will be eligible to skip. Whether and to what extent
we actually skip is determined at runtime. We really need some way of
determining how much skipping has taken place. (There are many
disadvantages to having a dedicated skip scan index path, which I can
go into if you want.)
Maybe we should be going the other direction
and trying to make pg_stat_all_indexes count in a less detailed but
less surprising way, ie once per indexscan plan node invocation.
Is that less surprising, though? I think that it's more surprising.
--
Peter Geoghegan
On Fri, 16 Aug 2024 at 00:34, Peter Geoghegan <pg@bowt.ie> wrote:
On Thu, Aug 15, 2024 at 5:47 PM Matthias van de Meent
<boekewurm+postgres@gmail.com> wrote:
I'm asking, because
I'm not very convinced that 'primitive scans' are a useful metric
across all (or even: most) index AMs (e.g. BRIN probably never will
have a 'primitive scans' metric that differs from the loop count), so
maybe this would better be implemented in that framework?
What do you mean by "within that framework"? They seem orthogonal?
What I meant was putting this 'primitive scans' info into the
AM-specific explain callback as seen in the latest patch version.
I don't see how that could work. This is fundamentally information
that is only known when the query has fully finished execution.
If the counter was put into the BTScanOpaque, rather than the
IndexScanDesc, then this could trivially be used in an explain AM
callback, as IndexScanDesc and ->opaque are both still available while
building the explain output. As a result, it wouldn't bloat the
IndexScanDesc for other index AMs who might not be interested in this
metric.
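For illustration, a rough sketch of that arrangement (the per-AM EXPLAIN
callback shown here is hypothetical -- nothing like it exists today --
and the struct/function names are made up for the example):

/* Counter kept in the AM's private scan state; in practice it would be
 * appended to BTScanOpaqueData and bumped in _bt_first(), right next to
 * the existing pgstat_count_index_scan() call. */
typedef struct BTScanOpaqueSketch
{
	/* ... existing nbtree scan state ... */
	uint64		nsearches;		/* # of descents of the index */
} BTScanOpaqueSketch;

/* Hypothetical per-AM EXPLAIN callback: the IndexScanDesc and its
 * ->opaque pointer are still valid while EXPLAIN output is built, so the
 * counter can simply be read back out (needs access/relscan.h and
 * commands/explain.h). */
static void
btexplain(IndexScanDesc scan, ExplainState *es)
{
	BTScanOpaqueSketch *so = (BTScanOpaqueSketch *) scan->opaque;

	ExplainPropertyUInteger("Index Searches", NULL, so->nsearches, es);
}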
Again, this is already something that we track at the whole-table
level, within pg_stat_user_tables.idx_scan. It's already considered
index AM agnostic information, in that sense.
That's true, but for most indexes there is a 1:1 relationship between
loops and idx_scan counts, with only btree behaving differently in that
regard. Not to say it isn't an important insight for btree, but just
that it seems to be only relevant for btree and no other index I can
think of right now.
It's true that BRIN index scans will probably never show more than a
single primitive index scan. I don't think that the same is true of
any other index AM, though. Don't they all support SAOPs, albeit
non-natively?
Not always. For Bitmap Index Scan the node's functions can allow
non-native SAOP support (it ORs the bitmaps), but normal indexes
without SAOP support won't get SAOP-functionality from the IS/IOS
node's infrastructure, it'll need to be added as Filter.
Again, what do you want me to do about it? Almost anything is possible
in principle, and can be implemented without great difficulty. But you
have to clearly say what you want, and why you want it.
I don't want anything, or anything done about it, but your statement
that all index AMs support SAOP (potentially non-natively) is not
true, as the non-native SAOP support is only for bitmap index scans,
and index AMs aren't guaranteed to support bitmap index scans (e.g.
pgvector's IVFFLAT and HNSW are good examples, as they only support
amgettuple).
Yeah, non-native SAOP index scans are always bitmap scans. In the case
of GIN, there are only lossy/bitmap index scans, anyway -- can't see
that ever changing.
GIN had amgettuple-based index scans until the fastinsert path was
added, and with some work (I don't think it needs to be a lot) the
feature can probably be returned to the AM. The GIN internals would
probably only need relatively few changes, as they already seem to
mostly use precise TID-based scans - the only addition would be a
filter that prohibits returning tuples that were previously returned
while scanning the fastinsert path during the normal index scan.
And, in this case,
the use case seems quite index-specific, at least for IS/IOS nodes.
I disagree. It's an existing concept, exposed in system views, and now
in EXPLAIN ANALYZE. It's precisely that -- nothing more, nothing less.
To be precise, it is not precisely that, because it's a different
counter that an AM must update when the pgstat data is updated if it
wants the explain output to reflect the stats counter accurately. When
an AM forgets to update one of these metrics (or fails to realize they
have to both be updated) then they'd be out-of-sync. I'd prefer if an
AM didn't have to account for its statistics in more than one place.
This made me notice that you add a new metric that should generally be
exactly the same as pg_stat_all_indexes.idx_scan (you mention the
same).
I didn't imagine that that part was subtle.
It wasn't, but it was not present in the first two paragraphs of the
mail, which I had only skimmed when I sent my first reply (as you
maybe could see indicated by the quote). That's why it took me until
my second reply to realise these were considered to be equivalent,
especially after I noticed the header file changes where you added a
new metric rather than pulling data from existing stats.
Can't you pull that data, instead of inventing a new place
every AM needs to touch for its metrics?
No. At least not in a way that's scoped to a particular index scan.
Similar per-node counter data is pulled for the global (!) counters of
pgBufferUsage, so why would it be less possible to gather such metrics
for just one index's stats here? While I do think it won't be easy to
find a good way to integrate this into EXPLAIN's Instrumentation, I
imagine other systems (e.g. table scans) may benefit from a better
integration and explanation of pgstat statistics in EXPLAIN, too. E.g.
I'd love to be able to explain how many times which function was
called in a plan's projections, and what the relevant time expenditure
for those functions is in my plans. This data is available with
track_functions enabled, and diffing in the execution nodes should
allow this to be shown in EXPLAIN output. It'd certainly be more
expensive than not doing the analysis, but I believe that's what
EXPLAIN options are for - you can show a more detailed analysis at the
cost of increased overhead in the plan execution.
Alternatively, you could update the patch so that only the field in
IndexScan would need to be updated by the index AM by making the
executor responsible to update the relation's stats at once at the end
of the query with the data from the IndexScanDesc.
Kind regards,
Matthias van de Meent
Neon (https://neon.tech)
On Tue, Aug 27, 2024 at 5:03 PM Matthias van de Meent
<boekewurm+postgres@gmail.com> wrote:
If the counter was put into the BTScanOpaque, rather than the
IndexScanDesc, then this could trivially be used in an explain AM
callback, as IndexScanDesc and ->opaque are both still available while
building the explain output.
Right, "trivial". Except in that it requires inventing a whole new
general purpose infrastructure. Meanwhile, Tom is arguing against even
showing this very basic information in EXPLAIN ANALYZE.
You see the problem?
As a result, it wouldn't bloat the
IndexScanDesc for other index AMs who might not be interested in this
metric.
Why do you persist with the idea that it isn't useful for other index
AMs? I mean it literally works in exactly the same way! It's literally
indistinguishable to users, and works in a way that's consistent with
historical behavior/definitions.
I don't want anything, or anything done about it, but your statement
that all index AMs support SAOP (potentially non-natively) is not
true, as the non-native SAOP support is only for bitmap index scans,
and index AMs aren't guaranteed to support bitmap index scans (e.g.
pgvector's IVFFLAT and HNSW are good examples, as they only support
amgettuple).
Yes, there are some very minor exceptions -- index AMs where even
non-native SAOPs won't be used. What difference does it make?
And, in this case,
the use case seems quite index-specific, at least for IS/IOS nodes.
I disagree. It's an existing concept, exposed in system views, and now
in EXPLAIN ANALYZE. It's precisely that -- nothing more, nothing less.
To be precise, it is not precisely that, because it's a different
counter that an AM must update when the pgstat data is updated if it
wants the explain output to reflect the stats counter accurately.
Why does that matter? I could easily move the counter to the opaque
struct, but that would make the patch longer and more complicated, for
absolutely no benefit.
When an AM forgets to update one of these metrics (or fails to realize they
have to both be updated) then they'd be out-of-sync. I'd prefer if an
AM didn't have to account for its statistics in more than one place.
I could easily change the pgstat_count_index_scan macro so that index
AMs were forced to do both, or neither. (Not that this is a real
problem.)
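Something like this, say (just a sketch with a made-up name, assuming the
scan->nsearches field from the patch):

/* Hypothetical combined entry point: an index AM couldn't bump the
 * relation-level pgstat counter without also bumping the per-scan
 * instrumentation counter, or vice versa. */
#define index_count_search(scan) \
	do { \
		pgstat_count_index_scan((scan)->indexRelation); \
		(scan)->nsearches++; \
	} while (0)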
Can't you pull that data, instead of inventing a new place
every AM needs to touch for its metrics?
No. At least not in a way that's scoped to a particular index scan.
Similar per-node counter data is pulled for the global (!) counters of
pgBufferUsage, so why would it be less possible to gather such metrics
for just one index's stats here?
I told you why already, when we talked about this privately: there is
no guarantee that it's the index indicated by the scan
instrumentation. For example, due to syscache lookups. There's also
the question of how we maintain the count for things like nestloop
joins, where invocations of different index scan nodes may be freely
woven together. So it just won't work.
Besides, I thought that you wanted me to use some new field in
BTScanOpaque? But now you want me to use a global counter. Which is
it?
While I do think it won't be easy to
find a good way to integrate this into EXPLAIN's Instrumentation, I
imagine other systems (e.g. table scans) may benefit from a better
integration and explanation of pgstat statistics in EXPLAIN, too. E.g.
I'd love to be able to explain how many times which function was
called in a plan's projections, and what the relevant time expenditure
for those functions is in my plans.
Seems completely unrelated.
Alternatively, you could update the patch so that only the field in
IndexScan would need to be updated by the index AM by making the
executor responsible to update the relation's stats at once at the end
of the query with the data from the IndexScanDesc.
I don't understand why this is an alternative to the other thing that
you said. Or even why it's desirable.
--
Peter Geoghegan
On Tue, 27 Aug 2024 at 23:40, Peter Geoghegan <pg@bowt.ie> wrote:
On Tue, Aug 27, 2024 at 5:03 PM Matthias van de Meent
<boekewurm+postgres@gmail.com> wrote:
If the counter was put into the BTScanOpaque, rather than the
IndexScanDesc, then this could trivially be used in an explain AM
callback, as IndexScanDesc and ->opaque are both still available while
building the explain output.
Right, "trivial". Except in that it requires inventing a whole new
general purpose infrastructure.
Which seems to be in the process of being invented already elsewhere.
Meanwhile, Tom is arguing against even
showing this very basic information in EXPLAIN ANALYZE.
You see the problem?
I think Tom's main issue is additional clutter when running just plain
`explain analyze`, and he'd probably be fine with it if this was gated
behind e.g. VERBOSE or a new "get me the AM's view of this node"
-flag.
As a result, it wouldn't bloat the
IndexScanDesc for other index AMs who might not be interested in this
metric.
Why do you persist with the idea that it isn't useful for other index
AMs?
Because
- there are no other index AMs that would show a count that's
different from loops (Yes, I'm explicitly ignoring bitmapscan's synthetic SAOP)
- because there is already a place where this info is stored.
I mean it literally works in exactly the same way! It's literally
indistinguishable to users, and works in a way that's consistent with
historical behavior/definitions.
Historically, no statistics/explain-only info is stored in the
IndexScanDesc; all data inside that struct would remain relevant even if
EXPLAIN were removed from the codebase. The same is true for
TableScanDesc.
Now, you want to add this metadata to the struct. I'm quite hesitant
to start walking on such a surface, as it might just be a slippery
slope.
I don't want anything, or anything done about it, but your statement
that all index AMs support SAOP (potentially non-natively) is not
true, as the non-native SAOP support is only for bitmap index scans,
and index AMs aren't guaranteed to support bitmap index scans (e.g.
pgvector's IVFFLAT and HNSW are good examples, as they only support
amgettuple).
Yes, there are some very minor exceptions -- index AMs where even
non-native SAOPs won't be used. What difference does it make?
That not just some, but most index types have no interesting
performance numbers indicated by the count.
And, in this case,
the use case seems quite index-specific, at least for IS/IOS nodes.
I disagree. It's an existing concept, exposed in system views, and now
in EXPLAIN ANALYZE. It's precisely that -- nothing more, nothing less.
To be precise, it is not precisely that, because it's a different
counter that an AM must update when the pgstat data is updated if it
wants the explain output to reflect the stats counter accurately.
Why does that matter?
Because to me it seems like one more thing an existing index AM's
author needs to needlessly add to its index.
When an AM forgets to update one of these metrics (or fails to realize they
have to both be updated) then they'd be out-of-sync. I'd prefer if an
AM didn't have to account for its statistics in more than one place.
I could easily change the pgstat_count_index_scan macro so that index
AMs were forced to do both, or neither. (Not that this is a real
problem.)
That'd be one way to reduce the chances of accidental bugs, which
seems like a good start.
Can't you pull that data, instead of inventing a new place
every AM needs to touch for its metrics?
No. At least not in a way that's scoped to a particular index scan.
Similar per-node counter data is pulled for the global (!) counters of
pgBufferUsage, so why would it be less possible to gather such metrics
for just one index's stats here?
I told you why already, when we talked about this privately: there is
no guarantee that it's the index indicated by the scan
instrumentation.
For the pgstat entry in rel->pgstat_info, it is _exactly_ guaranteed
to be the index of the IndexScan node. pgBufferUsage happens to be
global, but pgstat_info is gathered at the relation level.
For example, due to syscache lookups.
Sure, if we're executing a query on catalogs, looking at the index's
numscans might count multiple index scans if the index scan needs to
access that same catalog table's data through that same catalog index,
but in those cases I think it's an acceptable count difference.
There's also
the question of how we maintain the count for things like nestloop
joins, where invocations of different index scan nodes may be freely
woven together. So it just won't work.
Gathering usage counters on interleaving execution nodes has been done
for pgBufferUsage, so I don't see how it just won't work. To me, it
seems very realistically possible.
Besides, I thought that you wanted me to use some new field in
BTScanOpaque? But now you want me to use a global counter. Which is
it?
If you think it's important to have this info on all indexes then I'd
prefer the pgstat approach over adding a field in IndexScanDescData.
If instead you think that this is primarily important to expose for
nbtree index scans, then I'd prefer putting it in the BTSO using e.g.
the index AM analyze hook approach, as I think that's much more
elegant than this.
While I do think it won't be easy to
find a good way to integrate this into EXPLAIN's Instrumentation, I
imagine other systems (e.g. table scans) may benefit from a better
integration and explanation of pgstat statistics in EXPLAIN, too. E.g.
I'd love to be able to explain how many times which function was
called in a plan's projections, and what the relevant time expenditure
for those functions is in my plans.
Seems completely unrelated.
I'd call "exposing function's pgstat data in explain" at least
somewhat related to "exposing indexes' pgstat data in explain".
Alternatively, you could update the patch so that only the field in
IndexScan would need to be updated by the index AM by making the
executor responsible to update the relation's stats at once at the end
of the query with the data from the IndexScanDesc.
I don't understand why this is an alternative to the other thing that
you said. Or even why it's desirable.
I think it would be preferred over requiring Index AMs to maintain 2
fields in 2 very different locations but in the same way with the same
update pattern. With the mentioned change, they'd only have to keep
the ISD's numscans updated with rescans (or, _bt_first/_bt_search's).
Your alternative approach of making pgstat_count_index_scan update
both would probably have the same desired effect of requiring the AM
author to only mind this one entry point for counting index scan
stats.
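To illustrate the first option (the executor flushes the ISD counter into
pgstat when the node shuts down) -- only a sketch of the idea, not working
code; the helper name is made up and the pgstat field names are
approximate:

/* The AM only maintains scan->nsearches.  When the node shuts down, the
 * executor folds the accumulated count into the index's backend-local
 * pgstat entry, so the AM has a single counter to keep up to date. */
static void
flush_index_search_stats(IndexScanDesc scan)
{
	Relation	indexRel = scan->indexRelation;

	if (indexRel->pgstat_info != NULL)
		indexRel->pgstat_info->counts.numscans += scan->nsearches;
}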
Kind regards,
Matthias van de Meent
Neon (https://neon.tech)
On Tue, Aug 27, 2024 at 7:22 PM Matthias van de Meent
<boekewurm+postgres@gmail.com> wrote:
On Tue, 27 Aug 2024 at 23:40, Peter Geoghegan <pg@bowt.ie> wrote:
Right, "trivial". Except in that it requires inventing a whole new
general purpose infrastructure.
Which seems to be in the process of being invented already elsewhere.
None of this stuff about implementation details really matters if
there isn't agreement on what actual user-visible behavior we want.
We're very far from that right now.
Meanwhile, Tom is arguing against even
showing this very basic information in EXPLAIN ANALYZE.
You see the problem?
I think Tom's main issue is additional clutter when running just plain
`explain analyze`, and he'd probably be fine with it if this was gated
behind e.g. VERBOSE or a new "get me the AM's view of this node"
-flag.
I'm not at all confident that you're right about that.
I mean it literally works in exactly the same way! It's literally
indistinguishable to users, and works in a way that's consistent with
historical behavior/definitions.
Historically, no statistics/explain-only info is stored in the
IndexScanDesc; all data inside that struct would remain relevant even if
EXPLAIN were removed from the codebase. The same is true for
TableScanDesc.
Please try to separate questions about user-visible behavior from
questions about the implementation. Here you're answering a point I'm
making about user visible behavior with a point about where the
counter goes. It's just not relevant. At all.
Now, you want to add this metadata to the struct. I'm quite hesitant
to start walking on such a surface, as it might just be a slippery
slope.
I don't know why you seem to assume that it's inevitable that we'll
get a huge amount of similar EXPLAIN ANALYZE instrumentation, of which
this is just the start. It isn't. It's far from clear that even
something like my patch will get in.
Seems completely unrelated.
I'd call "exposing function's pgstat data in explain" at least
somewhat related to "exposing indexes' pgstat data in explain".
Not in any practical sense.
--
Peter Geoghegan
On Wed, 28 Aug 2024 at 01:42, Peter Geoghegan <pg@bowt.ie> wrote:
On Tue, Aug 27, 2024 at 7:22 PM Matthias van de Meent
<boekewurm+postgres@gmail.com> wrote:
On Tue, 27 Aug 2024 at 23:40, Peter Geoghegan <pg@bowt.ie> wrote:
Right, "trivial". Except in that it requires inventing a whole new
general purpose infrastructure.
Which seems to be in the process of being invented already elsewhere.
None of this stuff about implementation details really matters if
there isn't agreement on what actual user-visible behavior we want.
We're very far from that right now.
I'd expect the value to only be displayed for more verbose outputs
(such as under VERBOSE, or another option, or an as of yet
unimplemented unnamed "get me AM-specific info" option), and only if
it differed from nloops or if the index scan is otherwise interesting
and would benefit from showing this data, which would require AM
involvement to check if the scan is "interesting".
E.g. I think it's "interesting" to see only 1 index search per loop for
an index SAOP (with array >>1 attribute, or parameterized), but not at
all interesting to see 1 index search per loop for a scan with a single
equality scankey on the only key attribute: if it were anything else
that'd be an indication of serious issues (and we'd show it, because
it wouldn't be 1 search per loop).
and works in a way that's consistent with
historical behavior/definitions.
Historically, no statistics/explain-only info is stored in the
IndexScanDesc; all data inside that struct would remain relevant even if
EXPLAIN were removed from the codebase. The same is true for
TableScanDesc.
Please try to separate questions about user-visible behavior from
questions about the implementation. Here you're answering a point I'm
making about user visible behavior with a point about where the
counter goes. It's just not relevant. At all.
I thought you were talking about type definitions with your
'definitions', but apparently not. What were you referring to with
"consistent with historical behavior/definitions"?
Now, you want to add this metadata to the struct. I'm quite hesitant
to start walking on such a surface, as it might just be a slippery
slope.
I don't know why you seem to assume that it's inevitable that we'll
get a huge amount of similar EXPLAIN ANALYZE instrumentation, of which
this is just the start. It isn't. It's far from clear that even
something like my patch will get in.
It doesn't have to be a huge amount, but I'd be extremely careful
setting a precedent where scandescs will have space reserved for data
that can be derived from other fields, and is also used by
approximately 0% of queries in any production workload (except when
autoanalyze is enabled, in which case there are other systems that
could probably gather this data).
Kind regards,
Matthias van de Meent
Neon (https://neon.tech)
On Tue, Aug 27, 2024 at 3:04 PM Tom Lane <tgl@sss.pgh.pa.us> wrote:
Peter Geoghegan <pg@bowt.ie> writes:
I see value in making it obvious to users when and how
pg_stat_all_indexes.idx_scan advances. Being able to easily relate it
to EXPLAIN ANALYZE output is useful, independent of whether or not
SAOPs happen to be used. That's probably the single best argument in
favor of showing "Index Searches: N" unconditionally. But I'm
certainly not going to refuse to budge over that.
TBH, I'm afraid that this patch basically is exposing numbers that
nobody but Peter Geoghegan and maybe two or three other hackers
will understand, and even fewer people will find useful (since the
how-many-primitive-scans behavior is not something users have any
control over, IIUC). I doubt that "it lines up with
pg_stat_all_indexes.idx_scan" is enough to justify the additional
clutter in EXPLAIN. Maybe we should be going the other direction
and trying to make pg_stat_all_indexes count in a less detailed but
less surprising way, ie once per indexscan plan node invocation.
I kind of had that reaction too initially, but I think that was mostly
because "Primitive Index Scans" seemed extremely unclear. I think
"Index Searches" is pretty comprehensible, honestly. Why shouldn't
someone be able to figure out what that means?
Might make sense to restrict this to VERBOSE mode, too.
--
Robert Haas
EDB: http://www.enterprisedb.com
On Tue, Aug 27, 2024 at 7:22 PM Matthias van de Meent
<boekewurm+postgres@gmail.com> wrote:
Besides, I thought that you wanted me to use some new field in
BTScanOpaque? But now you want me to use a global counter. Which is
it?
If you think it's important to have this info on all indexes then I'd
prefer the pgstat approach over adding a field in IndexScanDescData.
If instead you think that this is primarily important to expose for
nbtree index scans, then I'd prefer putting it in the BTSO using e.g.
the index AM analyze hook approach, as I think that's much more
elegant than this.
I agree with this analysis. I don't see why IndexScanDesc would ever
be the right place for this.
--
Robert Haas
EDB: http://www.enterprisedb.com
On Wed, Aug 28, 2024 at 9:35 AM Robert Haas <robertmhaas@gmail.com> wrote:
If you think it's important to have this info on all indexes then I'd
prefer the pgstat approach over adding a field in IndexScanDescData.
If instead you think that this is primarily important to expose for
nbtree index scans, then I'd prefer putting it in the BTSO using e.g.
the index AM analyze hook approach, as I think that's much more
elegant than this.
I agree with this analysis. I don't see why IndexScanDesc would ever
be the right place for this.
Then what do you think is the right place?
There's no simple way to get to the planstate instrumentation from
within an index scan. You could do it by passing it down as an
argument to either ambeginscan or amrescan. But, realistically, it'd
probably be better to just add a pointer to the instrumentation to the
IndexScanDesc passed to amrescan. That's very close to what I've done
already.
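Roughly (hypothetical names, shown only to illustrate the shape of that
alternative, not the posted patch):

/* A small instrumentation struct owned by the executor node ... */
typedef struct IndexScanInstrumentation
{
	uint64		nsearches;		/* # of index searches, for EXPLAIN */
} IndexScanInstrumentation;

/* ... whose address the executor stashes in the IndexScanDesc before
 * calling amrescan, leaving the AM to do scan->instrument->nsearches++
 * right next to its existing pgstat_count_index_scan() call. */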
--
Peter Geoghegan
On Wed, Aug 28, 2024 at 9:41 AM Peter Geoghegan <pg@bowt.ie> wrote:
On Wed, Aug 28, 2024 at 9:35 AM Robert Haas <robertmhaas@gmail.com> wrote:
If you think it's important to have this info on all indexes then I'd
prefer the pgstat approach over adding a field in IndexScanDescData.
If instead you think that this is primarily important to expose for
nbtree index scans, then I'd prefer putting it in the BTSO using e.g.
the index AM analyze hook approach, as I think that's much more
elegant than this.
I agree with this analysis. I don't see why IndexScanDesc would ever
be the right place for this.
Then what do you think is the right place?
The paragraph that I agreed with and quoted in my reply, and that you
then quoted in your reply to me, appears to me to address that exact
question.
--
Robert Haas
EDB: http://www.enterprisedb.com
On Wed, Aug 28, 2024 at 9:49 AM Robert Haas <robertmhaas@gmail.com> wrote:
On Wed, Aug 28, 2024 at 9:41 AM Peter Geoghegan <pg@bowt.ie> wrote:
On Wed, Aug 28, 2024 at 9:35 AM Robert Haas <robertmhaas@gmail.com> wrote:
If you think it's important to have this info on all indexes then I'd
prefer the pgstat approach over adding a field in IndexScanDescData.
If instead you think that this is primarily important to expose for
nbtree index scans, then I'd prefer putting it in the BTSO using e.g.
the index AM analyze hook approach, as I think that's much more
elegant than this.
I agree with this analysis. I don't see why IndexScanDesc would ever
be the right place for this.
Then what do you think is the right place?
The paragraph that I agreed with and quoted in my reply, and that you
then quoted in your reply to me, appears to me to address that exact
question.
Are you talking about adding global counters, in the style of pgBufferUsage?
Or are you talking about adding it to BTSO? If it's the latter, then
why isn't that at least as bad? It's just the IndexScanDesc thing, but
with an additional indirection.
--
Peter Geoghegan
On Wed, Aug 28, 2024 at 9:25 AM Robert Haas <robertmhaas@gmail.com> wrote:
Might make sense to restrict this to VERBOSE mode, too.
If we have to make the new output appear selectively, I'd prefer to do
it this way.
There are lots of small problems with selectively displaying less/no
information based on rules applied against the number of index
searches/loops/whatever. While that general approach works quite well
in the case of the "Buffers" instrumentation, it won't really work
here. After all, the base case is that there is one index search per
index scan node -- not zero searches.
--
Peter Geoghegan
On Wed, Aug 28, 2024 at 9:25 AM Robert Haas <robertmhaas@gmail.com> wrote:
On Tue, Aug 27, 2024 at 3:04 PM Tom Lane <tgl@sss.pgh.pa.us> wrote:
I kind of had that reaction too initially, but I think that was mostly
because "Primitive Index Scans" seemed extremely unclear. I think
"Index Searches" is pretty comprehensible, honestly. Why shouldn't
someone be able to figure out what that means?
Worth noting that Lukas Fittl made a point of prominently highlighting
the issue with how this works when he explained the Postgres 17 nbtree
work:
https://pganalyze.com/blog/5mins-postgres-17-faster-btree-index-scans
And no, I wasn't asked to give any input to the blog post. Lukas has a
general interest in making the system easier to understand for
ordinary users. Presumably that's why he zeroed in on this one aspect
of the work. It's far from an esoteric implementation detail.
--
Peter Geoghegan
On Wed, 28 Aug 2024 at 15:53, Peter Geoghegan <pg@bowt.ie> wrote:
On Wed, Aug 28, 2024 at 9:49 AM Robert Haas <robertmhaas@gmail.com> wrote:
On Wed, Aug 28, 2024 at 9:41 AM Peter Geoghegan <pg@bowt.ie> wrote:
On Wed, Aug 28, 2024 at 9:35 AM Robert Haas <robertmhaas@gmail.com> wrote:
If you think it's important to have this info on all indexes then I'd
prefer the pgstat approach over adding a field in IndexScanDescData.
If instead you think that this is primarily important to expose for
nbtree index scans, then I'd prefer putting it in the BTSO using e.g.
the index AM analyze hook approach, as I think that's much more
elegant than this.
I agree with this analysis. I don't see why IndexScanDesc would ever
be the right place for this.
Then what do you think is the right place?
The paragraph that I agreed with and quoted in my reply, and that you
then quoted in your reply to me, appears to me to address that exact
question.
Are you talking about adding global counters, in the style of pgBufferUsage?
My pgstat approach would be that ExecIndexScan (plus ExecIOS and
ExecBitmapIS) could record the current state of relevant fields from
node->ss.ss_currentRelation->pgstat_info, and diff them with the
recorded values at the end of that node's execution, pushing the
result into e.g. Instrumentation; the diffing is similar to what
happens in InstrStartNode() and InstrStopNode() but for the relation's
pgstat_info instead of pgBufferUsage and pgWalUsage. Alternatively
this could happen in ExecProcNodeInstr, but it'd need some more
special-casing to make sure it only addresses (index) relation scan
nodes.
By pulling the stats directly from Relation->pgstat_info, no catalog
index scans are counted if they aren't also the index which is subject
to that [Bitmap]Index[Only]Scan.
In effect, this would need no changes in AM code, as this would "just"
pull the data from those statistics which are already being updated in
AM code.
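As a sketch (the Instr* helpers and iss_* fields are made up, only the
plain index scan case is shown, and it reads the index relation's pgstat
entry rather than the table's):

/* Read the index relation's backend-local numscans counter. */
static inline uint64
index_numscans(Relation indexRel)
{
	return indexRel->pgstat_info ? indexRel->pgstat_info->counts.numscans : 0;
}

/* Snapshot the counter when the node starts running ... */
void
InstrStartIndexScan(IndexScanState *node)
{
	node->iss_NsearchesStart = index_numscans(node->iss_RelationDesc);
}

/* ... and accumulate the delta when it stops, much like InstrStartNode()
 * and InstrStopNode() already do for pgBufferUsage and pgWalUsage. */
void
InstrStopIndexScan(IndexScanState *node)
{
	node->iss_NsearchesTotal +=
		index_numscans(node->iss_RelationDesc) - node->iss_NsearchesStart;
}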
Kind regards,
Matthias van de Meent
Neon (https://neon.tech)
Hi!
On 27.08.2024 18:15, Peter Geoghegan wrote:
On Thu, Aug 15, 2024 at 3:22 PM Peter Geoghegan <pg@bowt.ie> wrote:
Attached patch has EXPLAIN ANALYZE display the total number of
primitive index scans for all 3 kinds of index scan node.
Attached is v2, which fixes bitrot.
v2 also uses new terminology. EXPLAIN ANALYZE will now show "Index
Searches: N", not "Primitive Index Scans: N". Although there is
limited precedent for using the primitive scan terminology, I think
that it's a bit unwieldy.
No other notable changes.
While reviewing the thread again, I noticed that the patch no longer
applied cleanly (it had conflicts). I fixed that. The updated version is
in the show_primitive_index.diff file.
You should look at why the test results in stats.out changed. To be
honest, I haven't investigated this deeply yet.
diff -U3
/home/alena/postgrespro__copy10/src/test/regress/expected/stats.out
/home/alena/postgrespro__copy10/src/test/regress/results/stats.out
--- /home/alena/postgrespro__copy10/src/test/regress/expected/stats.out
2024-11-09 17:45:03.812313004 +0300
+++ /home/alena/postgrespro__copy10/src/test/regress/results/stats.out
2024-11-09 18:05:02.129524219 +0300
@@ -673,7 +673,7 @@
FROM pg_stat_all_tables WHERE relid = 'test_last_scan'::regclass;
seq_scan | seq_ok | idx_scan | idx_ok
----------+--------+----------+--------
- 2 | t | 1 | t
+ 2 | t | 2 | t
(1 row)
-- fetch timestamps from before the next test
@@ -716,7 +716,7 @@
FROM pg_stat_all_tables WHERE relid = 'test_last_scan'::regclass;
seq_scan | seq_ok | idx_scan | idx_ok
----------+--------+----------+--------
- 2 | t | 2 | t
+ 2 | t | 4 | t
(1 row)
-- fetch timestamps from before the next test
@@ -761,7 +761,7 @@
FROM pg_stat_all_tables WHERE relid = 'test_last_scan'::regclass;
seq_scan | seq_ok | idx_scan | idx_ok
----------+--------+----------+--------
- 2 | t | 3 | t
+ 2 | t | 6 | t
(1 row)
I noticed that the "Index Searches" cases shown in the regression tests
are only for partitioned tables; maybe you should add some tests for
regular tables like tenk1.
In general, I support the initiative to display this information in the
query plan output. I think it is necessary for finding the reasons for
low query performance.
--
Regards,
Alena Rybakina
Postgres Professional
Attachments:
show_primitive_index.diff (text/x-patch)
diff --git a/doc/src/sgml/bloom.sgml b/doc/src/sgml/bloom.sgml
index 19f2b172cc4..8744020eb0d 100644
--- a/doc/src/sgml/bloom.sgml
+++ b/doc/src/sgml/bloom.sgml
@@ -170,6 +170,7 @@ CREATE INDEX
Heap Blocks: exact=28
-> Bitmap Index Scan on bloomidx (cost=0.00..1792.00 rows=2 width=0) (actual time=0.356..0.356 rows=29 loops=1)
Index Cond: ((i2 = 898732) AND (i5 = 123451))
+ Index Searches: 1
Planning Time: 0.099 ms
Execution Time: 0.408 ms
(8 rows)
@@ -202,6 +203,7 @@ CREATE INDEX
-> BitmapAnd (cost=24.34..24.34 rows=2 width=0) (actual time=0.027..0.027 rows=0 loops=1)
-> Bitmap Index Scan on btreeidx5 (cost=0.00..12.04 rows=500 width=0) (actual time=0.026..0.026 rows=0 loops=1)
Index Cond: (i5 = 123451)
+ Index Searches: 1
-> Bitmap Index Scan on btreeidx2 (cost=0.00..12.04 rows=500 width=0) (never executed)
Index Cond: (i2 = 898732)
Planning Time: 0.491 ms
diff --git a/doc/src/sgml/monitoring.sgml b/doc/src/sgml/monitoring.sgml
index 331315f8d3c..014a66ef785 100644
--- a/doc/src/sgml/monitoring.sgml
+++ b/doc/src/sgml/monitoring.sgml
@@ -4180,12 +4180,18 @@ description | Waiting for a newly initialized WAL file to reach durable storage
Queries that use certain <acronym>SQL</acronym> constructs to search for
rows matching any value out of a list or array of multiple scalar values
(see <xref linkend="functions-comparisons"/>) perform multiple
- <quote>primitive</quote> index scans (up to one primitive scan per scalar
- value) during query execution. Each internal primitive index scan
- increments <structname>pg_stat_all_indexes</structname>.<structfield>idx_scan</structfield>,
+ index searches (up to one index search per scalar value) during query
+ execution. Each internal index search increments
+ <structname>pg_stat_all_indexes</structname>.<structfield>idx_scan</structfield>,
so it's possible for the count of index scans to significantly exceed the
total number of index scan executor node executions.
</para>
+ <para>
+ <command>EXPLAIN ANALYZE</command> breaks down the total number of index
+ searches performed by each index scan node. <literal>Index Searches: N</literal>
+ indicates the total number of searches across <emphasis>all</emphasis>
+ executor node executions/loops.
+ </para>
</note>
</sect2>
diff --git a/doc/src/sgml/perform.sgml b/doc/src/sgml/perform.sgml
index cd12b9ce48b..301bad786b7 100644
--- a/doc/src/sgml/perform.sgml
+++ b/doc/src/sgml/perform.sgml
@@ -727,8 +727,10 @@ WHERE t1.unique1 < 10 AND t1.unique2 = t2.unique2;
Heap Blocks: exact=10
-> Bitmap Index Scan on tenk1_unique1 (cost=0.00..4.36 rows=10 width=0) (actual time=0.004..0.004 rows=10 loops=1)
Index Cond: (unique1 < 10)
+ Index Searches: 1
-> Index Scan using tenk2_unique2 on tenk2 t2 (cost=0.29..7.90 rows=1 width=244) (actual time=0.003..0.003 rows=1 loops=10)
Index Cond: (unique2 = t1.unique2)
+ Index Searches: 1
Planning Time: 0.485 ms
Execution Time: 0.073 ms
</screen>
@@ -779,6 +781,7 @@ WHERE t1.unique1 < 100 AND t1.unique2 = t2.unique2 ORDER BY t1.fivethous;
Heap Blocks: exact=90
-> Bitmap Index Scan on tenk1_unique1 (cost=0.00..5.04 rows=100 width=0) (actual time=0.013..0.013 rows=100 loops=1)
Index Cond: (unique1 < 100)
+ Index Searches: 1
Planning Time: 0.187 ms
Execution Time: 3.036 ms
</screen>
@@ -844,6 +847,7 @@ EXPLAIN ANALYZE SELECT * FROM polygon_tbl WHERE f1 @> polygon '(0.5,2.0)';
-------------------------------------------------------------------&zwsp;-------------------------------------------------------
Index Scan using gpolygonind on polygon_tbl (cost=0.13..8.15 rows=1 width=85) (actual time=0.074..0.074 rows=0 loops=1)
Index Cond: (f1 @> '((0.5,2))'::polygon)
+ Index Searches: 1
Rows Removed by Index Recheck: 1
Planning Time: 0.039 ms
Execution Time: 0.098 ms
@@ -873,9 +877,11 @@ EXPLAIN (ANALYZE, BUFFERS) SELECT * FROM tenk1 WHERE unique1 < 100 AND unique
Buffers: shared hit=4 read=3
-> Bitmap Index Scan on tenk1_unique1 (cost=0.00..5.04 rows=100 width=0) (actual time=0.027..0.027 rows=100 loops=1)
Index Cond: (unique1 < 100)
+ Index Searches: 1
Buffers: shared hit=2
-> Bitmap Index Scan on tenk1_unique2 (cost=0.00..19.78 rows=999 width=0) (actual time=0.070..0.070 rows=999 loops=1)
Index Cond: (unique2 > 9000)
+ Index Searches: 1
Buffers: shared hit=2 read=3
Planning:
Buffers: shared hit=3
@@ -908,6 +914,7 @@ EXPLAIN ANALYZE UPDATE tenk1 SET hundred = hundred + 1 WHERE unique1 < 100;
Heap Blocks: exact=90
-> Bitmap Index Scan on tenk1_unique1 (cost=0.00..5.04 rows=100 width=0) (actual time=0.031..0.031 rows=100 loops=1)
Index Cond: (unique1 < 100)
+ Index Searches: 1
Planning Time: 0.151 ms
Execution Time: 1.856 ms
@@ -1044,6 +1051,7 @@ EXPLAIN ANALYZE SELECT * FROM tenk1 WHERE unique1 < 100 AND unique2 > 9000
Index Cond: (unique2 > 9000)
Filter: (unique1 < 100)
Rows Removed by Filter: 287
+ Index Searches: 1
Planning Time: 0.077 ms
Execution Time: 0.086 ms
</screen>
diff --git a/doc/src/sgml/ref/explain.sgml b/doc/src/sgml/ref/explain.sgml
index db9d3a8549a..0ceb9307071 100644
--- a/doc/src/sgml/ref/explain.sgml
+++ b/doc/src/sgml/ref/explain.sgml
@@ -502,6 +502,7 @@ EXPLAIN ANALYZE EXECUTE query(100, 200);
Batches: 1 Memory Usage: 24kB
-> Index Scan using test_pkey on test (cost=0.29..10.27 rows=99 width=8) (actual time=0.009..0.025 rows=99 loops=1)
Index Cond: ((id > 100) AND (id < 200))
+ Index Searches: 1
Planning Time: 0.244 ms
Execution Time: 0.073 ms
(7 rows)
diff --git a/doc/src/sgml/rules.sgml b/doc/src/sgml/rules.sgml
index 7a928bd7b90..17112971fec 100644
--- a/doc/src/sgml/rules.sgml
+++ b/doc/src/sgml/rules.sgml
@@ -1046,6 +1046,7 @@ SELECT count(*) FROM words WHERE word = 'caterpiler';
-> Index Only Scan using wrd_word on wrd (cost=0.42..4.44 rows=1 width=0) (actual time=0.039..0.039 rows=0 loops=1)
Index Cond: (word = 'caterpiler'::text)
Heap Fetches: 0
+ Index Searches: 1
Planning time: 0.164 ms
Execution time: 0.117 ms
</programlisting>
diff --git a/src/backend/access/brin/brin.c b/src/backend/access/brin/brin.c
index c0b978119ac..52939d3175c 100644
--- a/src/backend/access/brin/brin.c
+++ b/src/backend/access/brin/brin.c
@@ -585,6 +585,7 @@ bringetbitmap(IndexScanDesc scan, TIDBitmap *tbm)
opaque = (BrinOpaque *) scan->opaque;
bdesc = opaque->bo_bdesc;
pgstat_count_index_scan(idxRel);
+ scan->nsearches++;
/*
* We need to know the size of the table so that we know how long to
diff --git a/src/backend/access/gin/ginscan.c b/src/backend/access/gin/ginscan.c
index f2fd62afbbf..5e423e155f5 100644
--- a/src/backend/access/gin/ginscan.c
+++ b/src/backend/access/gin/ginscan.c
@@ -436,6 +436,7 @@ ginNewScanKey(IndexScanDesc scan)
MemoryContextSwitchTo(oldCtx);
pgstat_count_index_scan(scan->indexRelation);
+ scan->nsearches++;
}
void
diff --git a/src/backend/access/gist/gistget.c b/src/backend/access/gist/gistget.c
index b35b8a97577..36f1435cb62 100644
--- a/src/backend/access/gist/gistget.c
+++ b/src/backend/access/gist/gistget.c
@@ -625,6 +625,7 @@ gistgettuple(IndexScanDesc scan, ScanDirection dir)
GISTSearchItem fakeItem;
pgstat_count_index_scan(scan->indexRelation);
+ scan->nsearches++;
so->firstCall = false;
so->curPageData = so->nPageData = 0;
@@ -750,6 +751,7 @@ gistgetbitmap(IndexScanDesc scan, TIDBitmap *tbm)
return 0;
pgstat_count_index_scan(scan->indexRelation);
+ scan->nsearches++;
/* Begin the scan by processing the root page */
so->curPageData = so->nPageData = 0;
diff --git a/src/backend/access/hash/hashsearch.c b/src/backend/access/hash/hashsearch.c
index 0d99d6abc86..927ba103907 100644
--- a/src/backend/access/hash/hashsearch.c
+++ b/src/backend/access/hash/hashsearch.c
@@ -298,6 +298,7 @@ _hash_first(IndexScanDesc scan, ScanDirection dir)
HashScanPosItem *currItem;
pgstat_count_index_scan(rel);
+ scan->nsearches++;
/*
* We do not support hash scans with no index qualification, because we
diff --git a/src/backend/access/index/genam.c b/src/backend/access/index/genam.c
index 60c61039d66..daf161684a0 100644
--- a/src/backend/access/index/genam.c
+++ b/src/backend/access/index/genam.c
@@ -118,6 +118,7 @@ RelationGetIndexScan(Relation indexRelation, int nkeys, int norderbys)
scan->xactStartedInRecovery = TransactionStartedDuringRecovery();
scan->ignore_killed_tuples = !scan->xactStartedInRecovery;
+ scan->nsearches = 0; /* not reset by index_rescan */
scan->opaque = NULL;
scan->xs_itup = NULL;
diff --git a/src/backend/access/nbtree/nbtree.c b/src/backend/access/nbtree/nbtree.c
index dd76fe1da90..bc106566b96 100644
--- a/src/backend/access/nbtree/nbtree.c
+++ b/src/backend/access/nbtree/nbtree.c
@@ -69,6 +69,7 @@ typedef struct BTParallelScanDescData
BTPS_State btps_pageStatus; /* indicates whether next page is
* available for scan. see above for
* possible states of parallel scan. */
+ uint64 btps_nsearches; /* instrumentation */
slock_t btps_mutex; /* protects above variables, btps_arrElems */
ConditionVariable btps_cv; /* used to synchronize parallel scan */
@@ -552,6 +553,7 @@ btinitparallelscan(void *target)
bt_target->btps_nextScanPage = InvalidBlockNumber;
bt_target->btps_lastCurrPage = InvalidBlockNumber;
bt_target->btps_pageStatus = BTPARALLEL_NOT_INITIALIZED;
+ bt_target->btps_nsearches = 0;
ConditionVariableInit(&bt_target->btps_cv);
}
@@ -578,6 +580,7 @@ btparallelrescan(IndexScanDesc scan)
btscan->btps_nextScanPage = InvalidBlockNumber;
btscan->btps_lastCurrPage = InvalidBlockNumber;
btscan->btps_pageStatus = BTPARALLEL_NOT_INITIALIZED;
+ /* deliberately don't reset btps_nsearches (matches index_rescan) */
SpinLockRelease(&btscan->btps_mutex);
}
@@ -705,6 +708,11 @@ _bt_parallel_seize(IndexScanDesc scan, BlockNumber *next_scan_page,
* We have successfully seized control of the scan for the purpose
* of advancing it to a new page!
*/
+ if (first && btscan->btps_pageStatus == BTPARALLEL_NOT_INITIALIZED)
+ {
+ /* count the first primitive scan for this btrescan */
+ btscan->btps_nsearches++;
+ }
btscan->btps_pageStatus = BTPARALLEL_ADVANCING;
Assert(btscan->btps_nextScanPage != P_NONE);
*next_scan_page = btscan->btps_nextScanPage;
@@ -805,6 +813,8 @@ _bt_parallel_done(IndexScanDesc scan)
btscan->btps_pageStatus = BTPARALLEL_DONE;
status_changed = true;
}
+ /* Copy the authoritative shared primitive scan counter to local field */
+ scan->nsearches = btscan->btps_nsearches;
SpinLockRelease(&btscan->btps_mutex);
/* wake up all the workers associated with this parallel scan */
@@ -839,6 +849,7 @@ _bt_parallel_primscan_schedule(IndexScanDesc scan, BlockNumber curr_page)
btscan->btps_nextScanPage = InvalidBlockNumber;
btscan->btps_lastCurrPage = InvalidBlockNumber;
btscan->btps_pageStatus = BTPARALLEL_NEED_PRIMSCAN;
+ btscan->btps_nsearches++;
/* Serialize scan's current array keys */
for (int i = 0; i < so->numArrayKeys; i++)
diff --git a/src/backend/access/nbtree/nbtsearch.c b/src/backend/access/nbtree/nbtsearch.c
index 2786a8564f2..9c493bc8fd4 100644
--- a/src/backend/access/nbtree/nbtsearch.c
+++ b/src/backend/access/nbtree/nbtsearch.c
@@ -964,6 +964,13 @@ _bt_first(IndexScanDesc scan, ScanDirection dir)
_bt_start_array_keys(scan, dir);
}
+ /*
+ * We've established that we'll either call _bt_search or _bt_endpoint.
+ * Count this as a primitive index scan/index search.
+ */
+ pgstat_count_index_scan(rel);
+ scan->nsearches++;
+
/*
* Count an indexscan for stats, now that we know that we'll call
* _bt_search/_bt_endpoint below
diff --git a/src/backend/access/spgist/spgscan.c b/src/backend/access/spgist/spgscan.c
index 3017861859f..be668abf220 100644
--- a/src/backend/access/spgist/spgscan.c
+++ b/src/backend/access/spgist/spgscan.c
@@ -421,6 +421,7 @@ spgrescan(IndexScanDesc scan, ScanKey scankey, int nscankeys,
/* count an indexscan for stats */
pgstat_count_index_scan(scan->indexRelation);
+ scan->nsearches++;
}
void
diff --git a/src/backend/commands/explain.c b/src/backend/commands/explain.c
index 7c0fd63b2f0..87f55e99212 100644
--- a/src/backend/commands/explain.c
+++ b/src/backend/commands/explain.c
@@ -13,6 +13,7 @@
*/
#include "postgres.h"
+#include "access/relscan.h"
#include "access/xact.h"
#include "catalog/pg_type.h"
#include "commands/createas.h"
@@ -88,6 +89,7 @@ static void show_plan_tlist(PlanState *planstate, List *ancestors,
static void show_expression(Node *node, const char *qlabel,
PlanState *planstate, List *ancestors,
bool useprefix, ExplainState *es);
+static void show_indexscan_nsearches(PlanState *planstate, ExplainState *es);
static void show_qual(List *qual, const char *qlabel,
PlanState *planstate, List *ancestors,
bool useprefix, ExplainState *es);
@@ -2107,6 +2109,8 @@ ExplainNode(PlanState *planstate, List *ancestors,
if (plan->qual)
show_instrumentation_count("Rows Removed by Filter", 1,
planstate, es);
+ if (es->analyze)
+ show_indexscan_nsearches(planstate, es);
break;
case T_IndexOnlyScan:
show_scan_qual(((IndexOnlyScan *) plan)->indexqual,
@@ -2121,12 +2125,17 @@ ExplainNode(PlanState *planstate, List *ancestors,
show_instrumentation_count("Rows Removed by Filter", 1,
planstate, es);
if (es->analyze)
+ {
ExplainPropertyFloat("Heap Fetches", NULL,
planstate->instrument->ntuples2, 0, es);
+ show_indexscan_nsearches(planstate, es);
+ }
break;
case T_BitmapIndexScan:
show_scan_qual(((BitmapIndexScan *) plan)->indexqualorig,
"Index Cond", planstate, ancestors, es);
+ if (es->analyze)
+ show_indexscan_nsearches(planstate, es);
break;
case T_BitmapHeapScan:
show_scan_qual(((BitmapHeapScan *) plan)->bitmapqualorig,
@@ -2645,6 +2654,36 @@ show_expression(Node *node, const char *qlabel,
ExplainPropertyText(qlabel, exprstr, es);
}
+/*
+ * Show the number of index searches within an IndexScan node, IndexOnlyScan
+ * node, or BitmapIndexScan node
+ */
+static void
+show_indexscan_nsearches(PlanState *planstate, ExplainState *es)
+{
+ Plan *plan = planstate->plan;
+ struct IndexScanDescData *scanDesc = NULL;
+
+ switch (nodeTag(plan))
+ {
+ case T_IndexScan:
+ scanDesc = ((IndexScanState *) planstate)->iss_ScanDesc;
+ break;
+ case T_IndexOnlyScan:
+ scanDesc = ((IndexOnlyScanState *) planstate)->ioss_ScanDesc;
+ break;
+ case T_BitmapIndexScan:
+ scanDesc = ((BitmapIndexScanState *) planstate)->biss_ScanDesc;
+ break;
+ default:
+ break;
+ }
+
+ if (scanDesc && scanDesc->nsearches > 0)
+ ExplainPropertyUInteger("Index Searches", NULL,
+ scanDesc->nsearches, es);
+}
+
/*
* Show a qualifier expression (which is a List with implicit AND semantics)
*/
diff --git a/src/include/access/relscan.h b/src/include/access/relscan.h
index e1884acf493..7b4180db5aa 100644
--- a/src/include/access/relscan.h
+++ b/src/include/access/relscan.h
@@ -153,6 +153,9 @@ typedef struct IndexScanDescData
bool xactStartedInRecovery; /* prevents killing/seeing killed
* tuples */
+ /* index access method instrumentation output state */
+ uint64 nsearches; /* # of index searches */
+
/* index access method's private state */
void *opaque; /* access-method-specific info */
diff --git a/src/test/regress/expected/brin_multi.out b/src/test/regress/expected/brin_multi.out
index ae9ce9d8ecf..c24d560077e 100644
--- a/src/test/regress/expected/brin_multi.out
+++ b/src/test/regress/expected/brin_multi.out
@@ -853,7 +853,8 @@ SELECT * FROM brin_date_test WHERE a = '2023-01-01'::date;
Recheck Cond: (a = '2023-01-01'::date)
-> Bitmap Index Scan on brin_date_test_a_idx (actual rows=0 loops=1)
Index Cond: (a = '2023-01-01'::date)
-(4 rows)
+ Index Searches: 1
+(5 rows)
DROP TABLE brin_date_test;
RESET enable_seqscan;
@@ -872,7 +873,8 @@ SELECT * FROM brin_timestamp_test WHERE a = '2023-01-01'::timestamp;
Recheck Cond: (a = '2023-01-01 00:00:00'::timestamp without time zone)
-> Bitmap Index Scan on brin_timestamp_test_a_idx (actual rows=0 loops=1)
Index Cond: (a = '2023-01-01 00:00:00'::timestamp without time zone)
-(4 rows)
+ Index Searches: 1
+(5 rows)
EXPLAIN (ANALYZE, TIMING OFF, COSTS OFF, SUMMARY OFF)
SELECT * FROM brin_timestamp_test WHERE a = '1900-01-01'::timestamp;
@@ -882,7 +884,8 @@ SELECT * FROM brin_timestamp_test WHERE a = '1900-01-01'::timestamp;
Recheck Cond: (a = '1900-01-01 00:00:00'::timestamp without time zone)
-> Bitmap Index Scan on brin_timestamp_test_a_idx (actual rows=0 loops=1)
Index Cond: (a = '1900-01-01 00:00:00'::timestamp without time zone)
-(4 rows)
+ Index Searches: 1
+(5 rows)
DROP TABLE brin_timestamp_test;
RESET enable_seqscan;
@@ -900,7 +903,8 @@ SELECT * FROM brin_date_test WHERE a = '2023-01-01'::date;
Recheck Cond: (a = '2023-01-01'::date)
-> Bitmap Index Scan on brin_date_test_a_idx (actual rows=0 loops=1)
Index Cond: (a = '2023-01-01'::date)
-(4 rows)
+ Index Searches: 1
+(5 rows)
EXPLAIN (ANALYZE, TIMING OFF, COSTS OFF, SUMMARY OFF)
SELECT * FROM brin_date_test WHERE a = '1900-01-01'::date;
@@ -910,7 +914,8 @@ SELECT * FROM brin_date_test WHERE a = '1900-01-01'::date;
Recheck Cond: (a = '1900-01-01'::date)
-> Bitmap Index Scan on brin_date_test_a_idx (actual rows=0 loops=1)
Index Cond: (a = '1900-01-01'::date)
-(4 rows)
+ Index Searches: 1
+(5 rows)
DROP TABLE brin_date_test;
RESET enable_seqscan;
@@ -929,7 +934,8 @@ SELECT * FROM brin_interval_test WHERE a = '-30 years'::interval;
Recheck Cond: (a = '@ 30 years ago'::interval)
-> Bitmap Index Scan on brin_interval_test_a_idx (actual rows=0 loops=1)
Index Cond: (a = '@ 30 years ago'::interval)
-(4 rows)
+ Index Searches: 1
+(5 rows)
EXPLAIN (ANALYZE, TIMING OFF, COSTS OFF, SUMMARY OFF)
SELECT * FROM brin_interval_test WHERE a = '30 years'::interval;
@@ -939,7 +945,8 @@ SELECT * FROM brin_interval_test WHERE a = '30 years'::interval;
Recheck Cond: (a = '@ 30 years'::interval)
-> Bitmap Index Scan on brin_interval_test_a_idx (actual rows=0 loops=1)
Index Cond: (a = '@ 30 years'::interval)
-(4 rows)
+ Index Searches: 1
+(5 rows)
DROP TABLE brin_interval_test;
RESET enable_seqscan;
@@ -957,7 +964,8 @@ SELECT * FROM brin_interval_test WHERE a = '-30 years'::interval;
Recheck Cond: (a = '@ 30 years ago'::interval)
-> Bitmap Index Scan on brin_interval_test_a_idx (actual rows=0 loops=1)
Index Cond: (a = '@ 30 years ago'::interval)
-(4 rows)
+ Index Searches: 1
+(5 rows)
EXPLAIN (ANALYZE, TIMING OFF, COSTS OFF, SUMMARY OFF)
SELECT * FROM brin_interval_test WHERE a = '30 years'::interval;
@@ -967,7 +975,8 @@ SELECT * FROM brin_interval_test WHERE a = '30 years'::interval;
Recheck Cond: (a = '@ 30 years'::interval)
-> Bitmap Index Scan on brin_interval_test_a_idx (actual rows=0 loops=1)
Index Cond: (a = '@ 30 years'::interval)
-(4 rows)
+ Index Searches: 1
+(5 rows)
DROP TABLE brin_interval_test;
RESET enable_seqscan;
diff --git a/src/test/regress/expected/memoize.out b/src/test/regress/expected/memoize.out
index f6b8329cd61..510aa99c527 100644
--- a/src/test/regress/expected/memoize.out
+++ b/src/test/regress/expected/memoize.out
@@ -22,8 +22,10 @@ begin
ln := regexp_replace(ln, 'Evictions: 0', 'Evictions: Zero');
ln := regexp_replace(ln, 'Evictions: \d+', 'Evictions: N');
ln := regexp_replace(ln, 'Memory Usage: \d+', 'Memory Usage: N');
- ln := regexp_replace(ln, 'Heap Fetches: \d+', 'Heap Fetches: N');
- ln := regexp_replace(ln, 'loops=\d+', 'loops=N');
+ ln := regexp_replace(ln, 'Heap Fetches: \d+', 'Heap Fetches: N');
+ ln := regexp_replace(ln, 'loops=\d+', 'loops=N');
+ ln := regexp_replace(ln, 'Index Searches: 0', 'Index Searches: Zero');
+ ln := regexp_replace(ln, 'Index Searches: \d+', 'Index Searches: N');
return next ln;
end loop;
end;
@@ -49,7 +51,8 @@ WHERE t2.unique1 < 1000;', false);
-> Index Only Scan using tenk1_unique1 on tenk1 t1 (actual rows=1 loops=N)
Index Cond: (unique1 = t2.twenty)
Heap Fetches: N
-(12 rows)
+ Index Searches: N
+(13 rows)
-- And check we get the expected results.
SELECT COUNT(*),AVG(t1.unique1) FROM tenk1 t1
@@ -80,7 +83,8 @@ WHERE t1.unique1 < 1000;', false);
-> Index Only Scan using tenk1_unique1 on tenk1 t2 (actual rows=1 loops=N)
Index Cond: (unique1 = t1.twenty)
Heap Fetches: N
-(12 rows)
+ Index Searches: N
+(13 rows)
-- And check we get the expected results.
SELECT COUNT(*),AVG(t2.unique1) FROM tenk1 t1,
@@ -106,6 +110,7 @@ WHERE t1.unique1 < 10;', false);
-> Nested Loop Left Join (actual rows=20 loops=N)
-> Index Scan using tenk1_unique1 on tenk1 t1 (actual rows=10 loops=N)
Index Cond: (unique1 < 10)
+ Index Searches: N
-> Memoize (actual rows=2 loops=N)
Cache Key: t1.two
Cache Mode: binary
@@ -115,7 +120,8 @@ WHERE t1.unique1 < 10;', false);
Rows Removed by Filter: 2
-> Index Scan using tenk1_unique1 on tenk1 t2_1 (actual rows=4 loops=N)
Index Cond: (unique1 < 4)
-(13 rows)
+ Index Searches: N
+(15 rows)
-- And check we get the expected results.
SELECT COUNT(*),AVG(t2.t1two) FROM tenk1 t1 LEFT JOIN
@@ -149,7 +155,8 @@ WHERE s.c1 = s.c2 AND t1.unique1 < 1000;', false);
Filter: ((t1.two + 1) = unique1)
Rows Removed by Filter: 9999
Heap Fetches: N
-(13 rows)
+ Index Searches: N
+(14 rows)
-- And check we get the expected results.
SELECT COUNT(*), AVG(t1.twenty) FROM tenk1 t1 LEFT JOIN
@@ -219,7 +226,8 @@ ON t1.x = t2.t::numeric AND t1.t::numeric = t2.x;', false);
Index Cond: (x = (t1.t)::numeric)
Filter: (t1.x = (t)::numeric)
Heap Fetches: N
-(10 rows)
+ Index Searches: N
+(11 rows)
DROP TABLE expr_key;
-- Reduce work_mem and hash_mem_multiplier so that we see some cache evictions
@@ -246,7 +254,8 @@ WHERE t2.unique1 < 1200;', true);
-> Index Only Scan using tenk1_unique1 on tenk1 t1 (actual rows=1 loops=N)
Index Cond: (unique1 = t2.thousand)
Heap Fetches: N
-(12 rows)
+ Index Searches: N
+(13 rows)
CREATE TABLE flt (f float);
CREATE INDEX flt_f_idx ON flt (f);
@@ -261,6 +270,7 @@ SELECT * FROM flt f1 INNER JOIN flt f2 ON f1.f = f2.f;', false);
Nested Loop (actual rows=4 loops=N)
-> Index Only Scan using flt_f_idx on flt f1 (actual rows=2 loops=N)
Heap Fetches: N
+ Index Searches: N
-> Memoize (actual rows=2 loops=N)
Cache Key: f1.f
Cache Mode: logical
@@ -268,7 +278,8 @@ SELECT * FROM flt f1 INNER JOIN flt f2 ON f1.f = f2.f;', false);
-> Index Only Scan using flt_f_idx on flt f2 (actual rows=2 loops=N)
Index Cond: (f = f1.f)
Heap Fetches: N
-(10 rows)
+ Index Searches: N
+(12 rows)
-- Ensure memoize operates in binary mode
SELECT explain_memoize('
@@ -278,6 +289,7 @@ SELECT * FROM flt f1 INNER JOIN flt f2 ON f1.f >= f2.f;', false);
Nested Loop (actual rows=4 loops=N)
-> Index Only Scan using flt_f_idx on flt f1 (actual rows=2 loops=N)
Heap Fetches: N
+ Index Searches: N
-> Memoize (actual rows=2 loops=N)
Cache Key: f1.f
Cache Mode: binary
@@ -285,7 +297,8 @@ SELECT * FROM flt f1 INNER JOIN flt f2 ON f1.f >= f2.f;', false);
-> Index Only Scan using flt_f_idx on flt f2 (actual rows=2 loops=N)
Index Cond: (f <= f1.f)
Heap Fetches: N
-(10 rows)
+ Index Searches: N
+(12 rows)
DROP TABLE flt;
-- Exercise Memoize in binary mode with a large fixed width type and a
@@ -311,7 +324,8 @@ SELECT * FROM strtest s1 INNER JOIN strtest s2 ON s1.n >= s2.n;', false);
Hits: 3 Misses: 3 Evictions: Zero Overflows: 0 Memory Usage: NkB
-> Index Scan using strtest_n_idx on strtest s2 (actual rows=4 loops=N)
Index Cond: (n <= s1.n)
-(9 rows)
+ Index Searches: N
+(10 rows)
-- Ensure we get 3 hits and 3 misses
SELECT explain_memoize('
@@ -327,7 +341,8 @@ SELECT * FROM strtest s1 INNER JOIN strtest s2 ON s1.t >= s2.t;', false);
Hits: 3 Misses: 3 Evictions: Zero Overflows: 0 Memory Usage: NkB
-> Index Scan using strtest_t_idx on strtest s2 (actual rows=4 loops=N)
Index Cond: (t <= s1.t)
-(9 rows)
+ Index Searches: N
+(10 rows)
DROP TABLE strtest;
-- Ensure memoize works with partitionwise join
@@ -348,6 +363,7 @@ SELECT * FROM prt t1 INNER JOIN prt t2 ON t1.a = t2.a;', false);
-> Nested Loop (actual rows=16 loops=N)
-> Index Only Scan using iprt_p1_a on prt_p1 t1_1 (actual rows=4 loops=N)
Heap Fetches: N
+ Index Searches: N
-> Memoize (actual rows=4 loops=N)
Cache Key: t1_1.a
Cache Mode: logical
@@ -355,9 +371,11 @@ SELECT * FROM prt t1 INNER JOIN prt t2 ON t1.a = t2.a;', false);
-> Index Only Scan using iprt_p1_a on prt_p1 t2_1 (actual rows=4 loops=N)
Index Cond: (a = t1_1.a)
Heap Fetches: N
+ Index Searches: N
-> Nested Loop (actual rows=16 loops=N)
-> Index Only Scan using iprt_p2_a on prt_p2 t1_2 (actual rows=4 loops=N)
Heap Fetches: N
+ Index Searches: N
-> Memoize (actual rows=4 loops=N)
Cache Key: t1_2.a
Cache Mode: logical
@@ -365,7 +383,8 @@ SELECT * FROM prt t1 INNER JOIN prt t2 ON t1.a = t2.a;', false);
-> Index Only Scan using iprt_p2_a on prt_p2 t2_2 (actual rows=4 loops=N)
Index Cond: (a = t1_2.a)
Heap Fetches: N
-(21 rows)
+ Index Searches: N
+(25 rows)
-- Ensure memoize works with parameterized union-all Append path
SET enable_partitionwise_join TO off;
@@ -378,6 +397,7 @@ ON t1.a = t2.a;', false);
Nested Loop (actual rows=16 loops=N)
-> Index Only Scan using iprt_p1_a on prt_p1 t1 (actual rows=4 loops=N)
Heap Fetches: N
+ Index Searches: N
-> Memoize (actual rows=4 loops=N)
Cache Key: t1.a
Cache Mode: logical
@@ -386,10 +406,12 @@ ON t1.a = t2.a;', false);
-> Index Only Scan using iprt_p1_a on prt_p1 (actual rows=4 loops=N)
Index Cond: (a = t1.a)
Heap Fetches: N
+ Index Searches: N
-> Index Only Scan using iprt_p2_a on prt_p2 (actual rows=0 loops=N)
Index Cond: (a = t1.a)
Heap Fetches: N
-(14 rows)
+ Index Searches: N
+(17 rows)
DROP TABLE prt;
RESET enable_partitionwise_join;
diff --git a/src/test/regress/expected/partition_prune.out b/src/test/regress/expected/partition_prune.out
index 7a03b4e3607..65f8387d3df 100644
--- a/src/test/regress/expected/partition_prune.out
+++ b/src/test/regress/expected/partition_prune.out
@@ -2340,6 +2340,10 @@ begin
ln := regexp_replace(ln, 'Workers Launched: \d+', 'Workers Launched: N');
ln := regexp_replace(ln, 'actual rows=\d+ loops=\d+', 'actual rows=N loops=N');
ln := regexp_replace(ln, 'Rows Removed by Filter: \d+', 'Rows Removed by Filter: N');
+ perform regexp_matches(ln, 'Index Searches: \d+');
+ if found then
+ continue;
+ end if;
return next ln;
end loop;
end;
@@ -2692,12 +2696,13 @@ select * from ab where a = (select max(a) from lprt_a) and b = (select max(a)-1
Filter: (b = (InitPlan 2).col1)
-> Bitmap Index Scan on ab_a3_b2_a_idx (actual rows=0 loops=1)
Index Cond: (a = (InitPlan 1).col1)
+ Index Searches: 1
-> Bitmap Heap Scan on ab_a3_b3 ab_9 (never executed)
Recheck Cond: (a = (InitPlan 1).col1)
Filter: (b = (InitPlan 2).col1)
-> Bitmap Index Scan on ab_a3_b3_a_idx (never executed)
Index Cond: (a = (InitPlan 1).col1)
-(52 rows)
+(53 rows)
-- Test run-time partition pruning with UNION ALL parents
explain (analyze, costs off, summary off, timing off)
@@ -2713,6 +2718,7 @@ select * from (select * from ab where a = 1 union all select * from ab) ab where
Filter: (b = (InitPlan 1).col1)
-> Bitmap Index Scan on ab_a1_b1_a_idx (actual rows=0 loops=1)
Index Cond: (a = 1)
+ Index Searches: 1
-> Bitmap Heap Scan on ab_a1_b2 ab_12 (never executed)
Recheck Cond: (a = 1)
Filter: (b = (InitPlan 1).col1)
@@ -2741,7 +2747,7 @@ select * from (select * from ab where a = 1 union all select * from ab) ab where
Filter: (b = (InitPlan 1).col1)
-> Seq Scan on ab_a3_b3 ab_9 (never executed)
Filter: (b = (InitPlan 1).col1)
-(37 rows)
+(38 rows)
-- A case containing a UNION ALL with a non-partitioned child.
explain (analyze, costs off, summary off, timing off)
@@ -2757,6 +2763,7 @@ select * from (select * from ab where a = 1 union all (values(10,5)) union all s
Filter: (b = (InitPlan 1).col1)
-> Bitmap Index Scan on ab_a1_b1_a_idx (actual rows=0 loops=1)
Index Cond: (a = 1)
+ Index Searches: 1
-> Bitmap Heap Scan on ab_a1_b2 ab_12 (never executed)
Recheck Cond: (a = 1)
Filter: (b = (InitPlan 1).col1)
@@ -2787,7 +2794,7 @@ select * from (select * from ab where a = 1 union all (values(10,5)) union all s
Filter: (b = (InitPlan 1).col1)
-> Seq Scan on ab_a3_b3 ab_9 (never executed)
Filter: (b = (InitPlan 1).col1)
-(39 rows)
+(40 rows)
-- Another UNION ALL test, but containing a mix of exec init and exec run-time pruning.
create table xy_1 (x int, y int);
@@ -2858,16 +2865,19 @@ update ab_a1 set b = 3 from ab where ab.a = 1 and ab.a = ab_a1.a;');
Recheck Cond: (a = 1)
-> Bitmap Index Scan on ab_a1_b1_a_idx (actual rows=0 loops=1)
Index Cond: (a = 1)
+ Index Searches: 1
-> Bitmap Heap Scan on ab_a1_b2 ab_a1_2 (actual rows=1 loops=1)
Recheck Cond: (a = 1)
Heap Blocks: exact=1
-> Bitmap Index Scan on ab_a1_b2_a_idx (actual rows=1 loops=1)
Index Cond: (a = 1)
+ Index Searches: 1
-> Bitmap Heap Scan on ab_a1_b3 ab_a1_3 (actual rows=0 loops=1)
Recheck Cond: (a = 1)
Heap Blocks: exact=1
-> Bitmap Index Scan on ab_a1_b3_a_idx (actual rows=1 loops=1)
Index Cond: (a = 1)
+ Index Searches: 1
-> Materialize (actual rows=1 loops=1)
Storage: Memory Maximum Storage: NkB
-> Append (actual rows=1 loops=1)
@@ -2875,17 +2885,20 @@ update ab_a1 set b = 3 from ab where ab.a = 1 and ab.a = ab_a1.a;');
Recheck Cond: (a = 1)
-> Bitmap Index Scan on ab_a1_b1_a_idx (actual rows=0 loops=1)
Index Cond: (a = 1)
+ Index Searches: 1
-> Bitmap Heap Scan on ab_a1_b2 ab_2 (actual rows=1 loops=1)
Recheck Cond: (a = 1)
Heap Blocks: exact=1
-> Bitmap Index Scan on ab_a1_b2_a_idx (actual rows=1 loops=1)
Index Cond: (a = 1)
+ Index Searches: 1
-> Bitmap Heap Scan on ab_a1_b3 ab_3 (actual rows=0 loops=1)
Recheck Cond: (a = 1)
Heap Blocks: exact=1
-> Bitmap Index Scan on ab_a1_b3_a_idx (actual rows=1 loops=1)
Index Cond: (a = 1)
-(37 rows)
+ Index Searches: 1
+(43 rows)
table ab;
a | b
@@ -2961,8 +2974,10 @@ select * from tbl1 join tprt on tbl1.col1 > tprt.col1;
-> Append (actual rows=3 loops=2)
-> Index Scan using tprt1_idx on tprt_1 (actual rows=2 loops=2)
Index Cond: (col1 < tbl1.col1)
+ Index Searches: 2
-> Index Scan using tprt2_idx on tprt_2 (actual rows=2 loops=1)
Index Cond: (col1 < tbl1.col1)
+ Index Searches: 1
-> Index Scan using tprt3_idx on tprt_3 (never executed)
Index Cond: (col1 < tbl1.col1)
-> Index Scan using tprt4_idx on tprt_4 (never executed)
@@ -2971,7 +2986,7 @@ select * from tbl1 join tprt on tbl1.col1 > tprt.col1;
Index Cond: (col1 < tbl1.col1)
-> Index Scan using tprt6_idx on tprt_6 (never executed)
Index Cond: (col1 < tbl1.col1)
-(15 rows)
+(17 rows)
explain (analyze, costs off, summary off, timing off)
select * from tbl1 join tprt on tbl1.col1 = tprt.col1;
@@ -2984,6 +2999,7 @@ select * from tbl1 join tprt on tbl1.col1 = tprt.col1;
Index Cond: (col1 = tbl1.col1)
-> Index Scan using tprt2_idx on tprt_2 (actual rows=1 loops=2)
Index Cond: (col1 = tbl1.col1)
+ Index Searches: 2
-> Index Scan using tprt3_idx on tprt_3 (never executed)
Index Cond: (col1 = tbl1.col1)
-> Index Scan using tprt4_idx on tprt_4 (never executed)
@@ -2992,7 +3008,7 @@ select * from tbl1 join tprt on tbl1.col1 = tprt.col1;
Index Cond: (col1 = tbl1.col1)
-> Index Scan using tprt6_idx on tprt_6 (never executed)
Index Cond: (col1 = tbl1.col1)
-(15 rows)
+(16 rows)
select tbl1.col1, tprt.col1 from tbl1
inner join tprt on tbl1.col1 > tprt.col1
@@ -3027,17 +3043,20 @@ select * from tbl1 inner join tprt on tbl1.col1 > tprt.col1;
-> Append (actual rows=5 loops=5)
-> Index Scan using tprt1_idx on tprt_1 (actual rows=2 loops=5)
Index Cond: (col1 < tbl1.col1)
+ Index Searches: 5
-> Index Scan using tprt2_idx on tprt_2 (actual rows=3 loops=4)
Index Cond: (col1 < tbl1.col1)
+ Index Searches: 4
-> Index Scan using tprt3_idx on tprt_3 (actual rows=1 loops=2)
Index Cond: (col1 < tbl1.col1)
+ Index Searches: 2
-> Index Scan using tprt4_idx on tprt_4 (never executed)
Index Cond: (col1 < tbl1.col1)
-> Index Scan using tprt5_idx on tprt_5 (never executed)
Index Cond: (col1 < tbl1.col1)
-> Index Scan using tprt6_idx on tprt_6 (never executed)
Index Cond: (col1 < tbl1.col1)
-(15 rows)
+(18 rows)
explain (analyze, costs off, summary off, timing off)
select * from tbl1 inner join tprt on tbl1.col1 = tprt.col1;
@@ -3050,15 +3069,17 @@ select * from tbl1 inner join tprt on tbl1.col1 = tprt.col1;
Index Cond: (col1 = tbl1.col1)
-> Index Scan using tprt2_idx on tprt_2 (actual rows=1 loops=2)
Index Cond: (col1 = tbl1.col1)
+ Index Searches: 2
-> Index Scan using tprt3_idx on tprt_3 (actual rows=0 loops=3)
Index Cond: (col1 = tbl1.col1)
+ Index Searches: 3
-> Index Scan using tprt4_idx on tprt_4 (never executed)
Index Cond: (col1 = tbl1.col1)
-> Index Scan using tprt5_idx on tprt_5 (never executed)
Index Cond: (col1 = tbl1.col1)
-> Index Scan using tprt6_idx on tprt_6 (never executed)
Index Cond: (col1 = tbl1.col1)
-(15 rows)
+(17 rows)
select tbl1.col1, tprt.col1 from tbl1
inner join tprt on tbl1.col1 > tprt.col1
@@ -3122,7 +3143,8 @@ select * from tbl1 join tprt on tbl1.col1 < tprt.col1;
Index Cond: (col1 > tbl1.col1)
-> Index Scan using tprt6_idx on tprt_6 (actual rows=1 loops=1)
Index Cond: (col1 > tbl1.col1)
-(15 rows)
+ Index Searches: 1
+(16 rows)
select tbl1.col1, tprt.col1 from tbl1
inner join tprt on tbl1.col1 < tprt.col1
@@ -3484,10 +3506,12 @@ explain (analyze, costs off, summary off, timing off) execute mt_q1(15);
-> Index Scan using ma_test_p2_b_idx on ma_test_p2 ma_test_1 (actual rows=1 loops=1)
Filter: ((a >= $1) AND ((a % 10) = 5))
Rows Removed by Filter: 9
+ Index Searches: 1
-> Index Scan using ma_test_p3_b_idx on ma_test_p3 ma_test_2 (actual rows=1 loops=1)
Filter: ((a >= $1) AND ((a % 10) = 5))
Rows Removed by Filter: 9
-(9 rows)
+ Index Searches: 1
+(11 rows)
execute mt_q1(15);
a
@@ -3505,7 +3529,8 @@ explain (analyze, costs off, summary off, timing off) execute mt_q1(25);
-> Index Scan using ma_test_p3_b_idx on ma_test_p3 ma_test_1 (actual rows=1 loops=1)
Filter: ((a >= $1) AND ((a % 10) = 5))
Rows Removed by Filter: 9
-(6 rows)
+ Index Searches: 1
+(7 rows)
execute mt_q1(25);
a
@@ -3553,13 +3578,16 @@ explain (analyze, costs off, summary off, timing off) select * from ma_test wher
-> Limit (actual rows=1 loops=1)
-> Index Scan using ma_test_p2_b_idx on ma_test_p2 (actual rows=1 loops=1)
Index Cond: (b IS NOT NULL)
+ Index Searches: 1
-> Index Scan using ma_test_p1_b_idx on ma_test_p1 ma_test_1 (never executed)
Filter: (a >= (InitPlan 2).col1)
-> Index Scan using ma_test_p2_b_idx on ma_test_p2 ma_test_2 (actual rows=10 loops=1)
Filter: (a >= (InitPlan 2).col1)
+ Index Searches: 1
-> Index Scan using ma_test_p3_b_idx on ma_test_p3 ma_test_3 (actual rows=10 loops=1)
Filter: (a >= (InitPlan 2).col1)
-(14 rows)
+ Index Searches: 1
+(17 rows)
reset enable_seqscan;
reset enable_sort;
@@ -4130,13 +4158,16 @@ select * from rangep where b IN((select 1),(select 2)) order by a;
Sort Key: rangep_2.a
-> Index Scan using rangep_0_to_100_1_a_idx on rangep_0_to_100_1 rangep_2 (actual rows=0 loops=1)
Filter: (b = ANY (ARRAY[(InitPlan 1).col1, (InitPlan 2).col1]))
+ Index Searches: 1
-> Index Scan using rangep_0_to_100_2_a_idx on rangep_0_to_100_2 rangep_3 (actual rows=0 loops=1)
Filter: (b = ANY (ARRAY[(InitPlan 1).col1, (InitPlan 2).col1]))
+ Index Searches: 1
-> Index Scan using rangep_0_to_100_3_a_idx on rangep_0_to_100_3 rangep_4 (never executed)
Filter: (b = ANY (ARRAY[(InitPlan 1).col1, (InitPlan 2).col1]))
-> Index Scan using rangep_100_to_200_a_idx on rangep_100_to_200 rangep_5 (actual rows=0 loops=1)
Filter: (b = ANY (ARRAY[(InitPlan 1).col1, (InitPlan 2).col1]))
-(15 rows)
+ Index Searches: 1
+(18 rows)
reset enable_sort;
drop table rangep;
diff --git a/src/test/regress/expected/select.out b/src/test/regress/expected/select.out
index 33a6dceb0e3..02797a259de 100644
--- a/src/test/regress/expected/select.out
+++ b/src/test/regress/expected/select.out
@@ -764,7 +764,8 @@ select * from onek2 where unique2 = 11 and stringu1 = 'ATAAAA';
Index Scan using onek2_u2_prtl on onek2 (actual rows=1 loops=1)
Index Cond: (unique2 = 11)
Filter: (stringu1 = 'ATAAAA'::name)
-(3 rows)
+ Index Searches: 1
+(4 rows)
explain (costs off)
select unique2 from onek2 where unique2 = 11 and stringu1 = 'ATAAAA';
diff --git a/src/test/regress/sql/memoize.sql b/src/test/regress/sql/memoize.sql
index 2eaeb1477ac..9afe205e063 100644
--- a/src/test/regress/sql/memoize.sql
+++ b/src/test/regress/sql/memoize.sql
@@ -23,8 +23,10 @@ begin
ln := regexp_replace(ln, 'Evictions: 0', 'Evictions: Zero');
ln := regexp_replace(ln, 'Evictions: \d+', 'Evictions: N');
ln := regexp_replace(ln, 'Memory Usage: \d+', 'Memory Usage: N');
- ln := regexp_replace(ln, 'Heap Fetches: \d+', 'Heap Fetches: N');
- ln := regexp_replace(ln, 'loops=\d+', 'loops=N');
+ ln := regexp_replace(ln, 'Heap Fetches: \d+', 'Heap Fetches: N');
+ ln := regexp_replace(ln, 'loops=\d+', 'loops=N');
+ ln := regexp_replace(ln, 'Index Searches: 0', 'Index Searches: Zero');
+ ln := regexp_replace(ln, 'Index Searches: \d+', 'Index Searches: N');
return next ln;
end loop;
end;
diff --git a/src/test/regress/sql/partition_prune.sql b/src/test/regress/sql/partition_prune.sql
index 442428d937c..085e746af35 100644
--- a/src/test/regress/sql/partition_prune.sql
+++ b/src/test/regress/sql/partition_prune.sql
@@ -573,6 +573,10 @@ begin
ln := regexp_replace(ln, 'Workers Launched: \d+', 'Workers Launched: N');
ln := regexp_replace(ln, 'actual rows=\d+ loops=\d+', 'actual rows=N loops=N');
ln := regexp_replace(ln, 'Rows Removed by Filter: \d+', 'Rows Removed by Filter: N');
+ perform regexp_matches(ln, 'Index Searches: \d+');
+ if found then
+ continue;
+ end if;
return next ln;
end loop;
end;
On Sat, Nov 9, 2024 at 12:37 PM Alena Rybakina
<a.rybakina@postgrespro.ru> wrote:
> I noticed that the "Index Searches" cases shown in the regression tests are only for partitioned tables; maybe you should also add some tests for regular tables, like tenk1.
I allowed the patch on this thread to bitrot, but I've been
maintaining this same patch as part of the skip scan patchset.
Attached is the latest version of this patch (technically it is the
first patch in the skip scan patch series), just to keep things
passing on the CFTester app.
I haven't done anything about the implementation (still using a
counter that lives in IndexScanDesc) due to a lack of clarity on
what'll work best. Hopefully discussion of those aspects of this patch
will pick up again soon.
Note that I have changed the patch to divide "Index Searches:" by
nloops, since Tomas Vondra seemed to want it that way (personally I
don't feel strongly either way). That's one behavioral change relative
to all of the versions of the patch posted to this thread so far.
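To make that concrete, here's the relevant fragment from the
partition_prune expected output as it appears in the attached v14 (the
scan is executed twice, and performs one search per execution):

   ->  Index Scan using tprt2_idx on tprt_2 (actual rows=1 loops=2)
         Index Cond: (col1 = tbl1.col1)
         Index Searches: 1

Earlier revisions reported the raw total for that same scan, so the
same fragment used to read "Index Searches: 2".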
> In general, I support the initiative to display this information in the query plan output. I think it is necessary for diagnosing the causes of poor query performance.
I just know that if Postgres 18 has skip scan but doesn't have basic
instrumentation of the number of index searches in EXPLAIN ANALYZE
when skip scan is in use, we're going to get lots of complaints about
it. It'll be very different from the current status quo. My main
motivation here is to avoid complaints about the behavior of skip scan
being completely opaque to users.
I think that the same issue could also happen with your OR
transformation patch, if we don't get this EXPLAIN ANALYZE
instrumentation. Users will still naturally want to know if a query
"WHERE a = 2 OR a = 4 OR a = 6" required only one index search during
its index scan, or if it required as many as 3 searches. They can
already see this information with a BitmapOr-based plan today. Why
wouldn't they expect to continue to see the same information (or
similar information) when the index searches happen to be coordinated
by the index scan node/index AM itself?
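To illustrate (this is a mocked-up sketch, not output from a real run,
and the table/index names are invented), with the OR transformation
rewriting those quals into a SAOP array, the plan itself could answer
that question directly:

   explain (analyze, costs off, summary off)
   select * from tbl where a = 2 or a = 4 or a = 6;

    Index Scan using tbl_a_idx on tbl (actual rows=3 loops=1)
      Index Cond: (a = ANY ('{2,4,6}'::integer[]))
      Index Searches: 3

Depending on how close together the matching values happen to be in
the index, that last line might just as easily read "Index Searches:
1", which is exactly the distinction users will want to be able to
see.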
--
Peter Geoghegan
Attachments:
v14-0001-Show-index-search-count-in-EXPLAIN-ANALYZE.patch
From d1ab58f20a481fc159ec05eab41213d6a6827c4f Mon Sep 17 00:00:00 2001
From: Peter Geoghegan <pg@bowt.ie>
Date: Wed, 14 Aug 2024 13:50:23 -0400
Subject: [PATCH v14 1/2] Show index search count in EXPLAIN ANALYZE.
Expose the information tracked by pg_stat_*_indexes.idx_scan to EXPLAIN
ANALYZE output. This is particularly useful for index scans that use
ScalarArrayOp quals, where the number of index scans isn't predictable
in advance with optimizations like the ones added to nbtree by commit
5bf748b8.
This information is made more important still by an upcoming patch that
adds skip scan optimizations to nbtree. The patch implements skip scan
by generating "skip arrays" during nbtree preprocessing, which makes the
relationship between the total number of primitive index scans and the
scan qual looser still. The new instrumentation will help users to
understand how effective these skip scan optimizations are in practice.
Author: Peter Geoghegan <pg@bowt.ie>
Reviewed-By: Tomas Vondra <tomas@vondra.me>
Discussion: https://postgr.es/m/CAH2-Wz=PKR6rB7qbx+Vnd7eqeB5VTcrW=iJvAsTsKbdG+kW_UA@mail.gmail.com
Discussion: https://postgr.es/m/CAH2-WzkRqvaqR2CTNqTZP0z6FuL4-3ED6eQB0yx38XBNj1v-4Q@mail.gmail.com
---
src/include/access/relscan.h | 3 +
src/backend/access/brin/brin.c | 1 +
src/backend/access/gin/ginscan.c | 1 +
src/backend/access/gist/gistget.c | 2 +
src/backend/access/hash/hashsearch.c | 1 +
src/backend/access/index/genam.c | 1 +
src/backend/access/nbtree/nbtree.c | 11 ++
src/backend/access/nbtree/nbtsearch.c | 1 +
src/backend/access/spgist/spgscan.c | 1 +
src/backend/commands/explain.c | 46 ++++++++
doc/src/sgml/bloom.sgml | 6 +-
doc/src/sgml/monitoring.sgml | 12 ++-
doc/src/sgml/perform.sgml | 8 ++
doc/src/sgml/ref/explain.sgml | 3 +-
doc/src/sgml/rules.sgml | 1 +
src/test/regress/expected/brin_multi.out | 27 +++--
src/test/regress/expected/memoize.out | 50 ++++++---
src/test/regress/expected/partition_prune.out | 100 +++++++++++++++---
src/test/regress/expected/select.out | 3 +-
src/test/regress/sql/memoize.sql | 6 +-
src/test/regress/sql/partition_prune.sql | 4 +
21 files changed, 242 insertions(+), 46 deletions(-)
diff --git a/src/include/access/relscan.h b/src/include/access/relscan.h
index e1884acf4..7b4180db5 100644
--- a/src/include/access/relscan.h
+++ b/src/include/access/relscan.h
@@ -153,6 +153,9 @@ typedef struct IndexScanDescData
bool xactStartedInRecovery; /* prevents killing/seeing killed
* tuples */
+ /* index access method instrumentation output state */
+ uint64 nsearches; /* # of index searches */
+
/* index access method's private state */
void *opaque; /* access-method-specific info */
diff --git a/src/backend/access/brin/brin.c b/src/backend/access/brin/brin.c
index c0b978119..52939d317 100644
--- a/src/backend/access/brin/brin.c
+++ b/src/backend/access/brin/brin.c
@@ -585,6 +585,7 @@ bringetbitmap(IndexScanDesc scan, TIDBitmap *tbm)
opaque = (BrinOpaque *) scan->opaque;
bdesc = opaque->bo_bdesc;
pgstat_count_index_scan(idxRel);
+ scan->nsearches++;
/*
* We need to know the size of the table so that we know how long to
diff --git a/src/backend/access/gin/ginscan.c b/src/backend/access/gin/ginscan.c
index f2fd62afb..5e423e155 100644
--- a/src/backend/access/gin/ginscan.c
+++ b/src/backend/access/gin/ginscan.c
@@ -436,6 +436,7 @@ ginNewScanKey(IndexScanDesc scan)
MemoryContextSwitchTo(oldCtx);
pgstat_count_index_scan(scan->indexRelation);
+ scan->nsearches++;
}
void
diff --git a/src/backend/access/gist/gistget.c b/src/backend/access/gist/gistget.c
index b35b8a975..36f1435cb 100644
--- a/src/backend/access/gist/gistget.c
+++ b/src/backend/access/gist/gistget.c
@@ -625,6 +625,7 @@ gistgettuple(IndexScanDesc scan, ScanDirection dir)
GISTSearchItem fakeItem;
pgstat_count_index_scan(scan->indexRelation);
+ scan->nsearches++;
so->firstCall = false;
so->curPageData = so->nPageData = 0;
@@ -750,6 +751,7 @@ gistgetbitmap(IndexScanDesc scan, TIDBitmap *tbm)
return 0;
pgstat_count_index_scan(scan->indexRelation);
+ scan->nsearches++;
/* Begin the scan by processing the root page */
so->curPageData = so->nPageData = 0;
diff --git a/src/backend/access/hash/hashsearch.c b/src/backend/access/hash/hashsearch.c
index 0d99d6abc..927ba1039 100644
--- a/src/backend/access/hash/hashsearch.c
+++ b/src/backend/access/hash/hashsearch.c
@@ -298,6 +298,7 @@ _hash_first(IndexScanDesc scan, ScanDirection dir)
HashScanPosItem *currItem;
pgstat_count_index_scan(rel);
+ scan->nsearches++;
/*
* We do not support hash scans with no index qualification, because we
diff --git a/src/backend/access/index/genam.c b/src/backend/access/index/genam.c
index 60c61039d..ec259012b 100644
--- a/src/backend/access/index/genam.c
+++ b/src/backend/access/index/genam.c
@@ -118,6 +118,7 @@ RelationGetIndexScan(Relation indexRelation, int nkeys, int norderbys)
scan->xactStartedInRecovery = TransactionStartedDuringRecovery();
scan->ignore_killed_tuples = !scan->xactStartedInRecovery;
+ scan->nsearches = 0; /* deliberately not reset by index_rescan */
scan->opaque = NULL;
scan->xs_itup = NULL;
diff --git a/src/backend/access/nbtree/nbtree.c b/src/backend/access/nbtree/nbtree.c
index dd76fe1da..8bbb3d734 100644
--- a/src/backend/access/nbtree/nbtree.c
+++ b/src/backend/access/nbtree/nbtree.c
@@ -69,6 +69,7 @@ typedef struct BTParallelScanDescData
BTPS_State btps_pageStatus; /* indicates whether next page is
* available for scan. see above for
* possible states of parallel scan. */
+ uint64 btps_nsearches; /* counts index searches for EXPLAIN ANALYZE */
slock_t btps_mutex; /* protects above variables, btps_arrElems */
ConditionVariable btps_cv; /* used to synchronize parallel scan */
@@ -552,6 +553,7 @@ btinitparallelscan(void *target)
bt_target->btps_nextScanPage = InvalidBlockNumber;
bt_target->btps_lastCurrPage = InvalidBlockNumber;
bt_target->btps_pageStatus = BTPARALLEL_NOT_INITIALIZED;
+ bt_target->btps_nsearches = 0;
ConditionVariableInit(&bt_target->btps_cv);
}
@@ -578,6 +580,7 @@ btparallelrescan(IndexScanDesc scan)
btscan->btps_nextScanPage = InvalidBlockNumber;
btscan->btps_lastCurrPage = InvalidBlockNumber;
btscan->btps_pageStatus = BTPARALLEL_NOT_INITIALIZED;
+ /* deliberately don't reset btps_nsearches (matches index_rescan) */
SpinLockRelease(&btscan->btps_mutex);
}
@@ -705,6 +708,11 @@ _bt_parallel_seize(IndexScanDesc scan, BlockNumber *next_scan_page,
* We have successfully seized control of the scan for the purpose
* of advancing it to a new page!
*/
+ if (first && btscan->btps_pageStatus == BTPARALLEL_NOT_INITIALIZED)
+ {
+ /* count the first primitive scan for this btrescan */
+ btscan->btps_nsearches++;
+ }
btscan->btps_pageStatus = BTPARALLEL_ADVANCING;
Assert(btscan->btps_nextScanPage != P_NONE);
*next_scan_page = btscan->btps_nextScanPage;
@@ -805,6 +813,8 @@ _bt_parallel_done(IndexScanDesc scan)
btscan->btps_pageStatus = BTPARALLEL_DONE;
status_changed = true;
}
+ /* Copy the authoritative shared primitive scan counter to local field */
+ scan->nsearches = btscan->btps_nsearches;
SpinLockRelease(&btscan->btps_mutex);
/* wake up all the workers associated with this parallel scan */
@@ -839,6 +849,7 @@ _bt_parallel_primscan_schedule(IndexScanDesc scan, BlockNumber curr_page)
btscan->btps_nextScanPage = InvalidBlockNumber;
btscan->btps_lastCurrPage = InvalidBlockNumber;
btscan->btps_pageStatus = BTPARALLEL_NEED_PRIMSCAN;
+ btscan->btps_nsearches++;
/* Serialize scan's current array keys */
for (int i = 0; i < so->numArrayKeys; i++)
diff --git a/src/backend/access/nbtree/nbtsearch.c b/src/backend/access/nbtree/nbtsearch.c
index 2786a8564..79329aaa9 100644
--- a/src/backend/access/nbtree/nbtsearch.c
+++ b/src/backend/access/nbtree/nbtsearch.c
@@ -969,6 +969,7 @@ _bt_first(IndexScanDesc scan, ScanDirection dir)
* _bt_search/_bt_endpoint below
*/
pgstat_count_index_scan(rel);
+ scan->nsearches++;
/*----------
* Examine the scan keys to discover where we need to start the scan.
diff --git a/src/backend/access/spgist/spgscan.c b/src/backend/access/spgist/spgscan.c
index 301786185..be668abf2 100644
--- a/src/backend/access/spgist/spgscan.c
+++ b/src/backend/access/spgist/spgscan.c
@@ -421,6 +421,7 @@ spgrescan(IndexScanDesc scan, ScanKey scankey, int nscankeys,
/* count an indexscan for stats */
pgstat_count_index_scan(scan->indexRelation);
+ scan->nsearches++;
}
void
diff --git a/src/backend/commands/explain.c b/src/backend/commands/explain.c
index 7c0fd63b2..6cb5ebcd2 100644
--- a/src/backend/commands/explain.c
+++ b/src/backend/commands/explain.c
@@ -13,6 +13,7 @@
*/
#include "postgres.h"
+#include "access/relscan.h"
#include "access/xact.h"
#include "catalog/pg_type.h"
#include "commands/createas.h"
@@ -88,6 +89,7 @@ static void show_plan_tlist(PlanState *planstate, List *ancestors,
static void show_expression(Node *node, const char *qlabel,
PlanState *planstate, List *ancestors,
bool useprefix, ExplainState *es);
+static void show_indexscan_nsearches(PlanState *planstate, ExplainState *es);
static void show_qual(List *qual, const char *qlabel,
PlanState *planstate, List *ancestors,
bool useprefix, ExplainState *es);
@@ -2098,6 +2100,7 @@ ExplainNode(PlanState *planstate, List *ancestors,
case T_IndexScan:
show_scan_qual(((IndexScan *) plan)->indexqualorig,
"Index Cond", planstate, ancestors, es);
+ show_indexscan_nsearches(planstate, es);
if (((IndexScan *) plan)->indexqualorig)
show_instrumentation_count("Rows Removed by Index Recheck", 2,
planstate, es);
@@ -2111,6 +2114,7 @@ ExplainNode(PlanState *planstate, List *ancestors,
case T_IndexOnlyScan:
show_scan_qual(((IndexOnlyScan *) plan)->indexqual,
"Index Cond", planstate, ancestors, es);
+ show_indexscan_nsearches(planstate, es);
if (((IndexOnlyScan *) plan)->recheckqual)
show_instrumentation_count("Rows Removed by Index Recheck", 2,
planstate, es);
@@ -2127,6 +2131,7 @@ ExplainNode(PlanState *planstate, List *ancestors,
case T_BitmapIndexScan:
show_scan_qual(((BitmapIndexScan *) plan)->indexqualorig,
"Index Cond", planstate, ancestors, es);
+ show_indexscan_nsearches(planstate, es);
break;
case T_BitmapHeapScan:
show_scan_qual(((BitmapHeapScan *) plan)->bitmapqualorig,
@@ -2645,6 +2650,47 @@ show_expression(Node *node, const char *qlabel,
ExplainPropertyText(qlabel, exprstr, es);
}
+/*
+ * Show the number of index searches within an IndexScan node, IndexOnlyScan
+ * node, or BitmapIndexScan node
+ */
+static void
+show_indexscan_nsearches(PlanState *planstate, ExplainState *es)
+{
+ Plan *plan = planstate->plan;
+ struct IndexScanDescData *scanDesc = NULL;
+ uint64 nsearches = 0;
+ double nloops;
+
+ if (!es->analyze)
+ return;
+
+ nloops = planstate->instrument->nloops;
+
+ switch (nodeTag(plan))
+ {
+ case T_IndexScan:
+ scanDesc = ((IndexScanState *) planstate)->iss_ScanDesc;
+ break;
+ case T_IndexOnlyScan:
+ scanDesc = ((IndexOnlyScanState *) planstate)->ioss_ScanDesc;
+ break;
+ case T_BitmapIndexScan:
+ scanDesc = ((BitmapIndexScanState *) planstate)->biss_ScanDesc;
+ break;
+ default:
+ break;
+ }
+
+ if (scanDesc)
+ nsearches = scanDesc->nsearches;
+
+ if (nloops > 0)
+ ExplainPropertyFloat("Index Searches", NULL, nsearches / nloops, 0, es);
+ else
+ ExplainPropertyFloat("Index Searches", NULL, 0.0, 0, es);
+}
+
/*
* Show a qualifier expression (which is a List with implicit AND semantics)
*/
diff --git a/doc/src/sgml/bloom.sgml b/doc/src/sgml/bloom.sgml
index 19f2b172c..92b13f539 100644
--- a/doc/src/sgml/bloom.sgml
+++ b/doc/src/sgml/bloom.sgml
@@ -170,9 +170,10 @@ CREATE INDEX
Heap Blocks: exact=28
-> Bitmap Index Scan on bloomidx (cost=0.00..1792.00 rows=2 width=0) (actual time=0.356..0.356 rows=29 loops=1)
Index Cond: ((i2 = 898732) AND (i5 = 123451))
+ Index Searches: 1
Planning Time: 0.099 ms
Execution Time: 0.408 ms
-(8 rows)
+(9 rows)
</programlisting>
</para>
@@ -202,11 +203,12 @@ CREATE INDEX
-> BitmapAnd (cost=24.34..24.34 rows=2 width=0) (actual time=0.027..0.027 rows=0 loops=1)
-> Bitmap Index Scan on btreeidx5 (cost=0.00..12.04 rows=500 width=0) (actual time=0.026..0.026 rows=0 loops=1)
Index Cond: (i5 = 123451)
+ Index Searches: 1
-> Bitmap Index Scan on btreeidx2 (cost=0.00..12.04 rows=500 width=0) (never executed)
Index Cond: (i2 = 898732)
Planning Time: 0.491 ms
Execution Time: 0.055 ms
-(9 rows)
+(10 rows)
</programlisting>
Although this query runs much faster than with either of the single
indexes, we pay a penalty in index size. Each of the single-column
diff --git a/doc/src/sgml/monitoring.sgml b/doc/src/sgml/monitoring.sgml
index 331315f8d..014a66ef7 100644
--- a/doc/src/sgml/monitoring.sgml
+++ b/doc/src/sgml/monitoring.sgml
@@ -4180,12 +4180,18 @@ description | Waiting for a newly initialized WAL file to reach durable storage
Queries that use certain <acronym>SQL</acronym> constructs to search for
rows matching any value out of a list or array of multiple scalar values
(see <xref linkend="functions-comparisons"/>) perform multiple
- <quote>primitive</quote> index scans (up to one primitive scan per scalar
- value) during query execution. Each internal primitive index scan
- increments <structname>pg_stat_all_indexes</structname>.<structfield>idx_scan</structfield>,
+ index searches (up to one index search per scalar value) during query
+ execution. Each internal index search increments
+ <structname>pg_stat_all_indexes</structname>.<structfield>idx_scan</structfield>,
so it's possible for the count of index scans to significantly exceed the
total number of index scan executor node executions.
</para>
+ <para>
+ <command>EXPLAIN ANALYZE</command> breaks down the total number of index
+ searches performed by each index scan node. <literal>Index Searches: N</literal>
+ indicates the total number of searches across <emphasis>all</emphasis>
+ executor node executions/loops.
+ </para>
</note>
</sect2>
diff --git a/doc/src/sgml/perform.sgml b/doc/src/sgml/perform.sgml
index cd12b9ce4..482715397 100644
--- a/doc/src/sgml/perform.sgml
+++ b/doc/src/sgml/perform.sgml
@@ -727,8 +727,10 @@ WHERE t1.unique1 < 10 AND t1.unique2 = t2.unique2;
Heap Blocks: exact=10
-> Bitmap Index Scan on tenk1_unique1 (cost=0.00..4.36 rows=10 width=0) (actual time=0.004..0.004 rows=10 loops=1)
Index Cond: (unique1 < 10)
+ Index Searches: 1
-> Index Scan using tenk2_unique2 on tenk2 t2 (cost=0.29..7.90 rows=1 width=244) (actual time=0.003..0.003 rows=1 loops=10)
Index Cond: (unique2 = t1.unique2)
+ Index Searches: 1
Planning Time: 0.485 ms
Execution Time: 0.073 ms
</screen>
@@ -779,6 +781,7 @@ WHERE t1.unique1 < 100 AND t1.unique2 = t2.unique2 ORDER BY t1.fivethous;
Heap Blocks: exact=90
-> Bitmap Index Scan on tenk1_unique1 (cost=0.00..5.04 rows=100 width=0) (actual time=0.013..0.013 rows=100 loops=1)
Index Cond: (unique1 < 100)
+ Index Searches: 1
Planning Time: 0.187 ms
Execution Time: 3.036 ms
</screen>
@@ -844,6 +847,7 @@ EXPLAIN ANALYZE SELECT * FROM polygon_tbl WHERE f1 @> polygon '(0.5,2.0)';
-------------------------------------------------------------------&zwsp;-------------------------------------------------------
Index Scan using gpolygonind on polygon_tbl (cost=0.13..8.15 rows=1 width=85) (actual time=0.074..0.074 rows=0 loops=1)
Index Cond: (f1 @> '((0.5,2))'::polygon)
+ Index Searches: 1
Rows Removed by Index Recheck: 1
Planning Time: 0.039 ms
Execution Time: 0.098 ms
@@ -873,9 +877,11 @@ EXPLAIN (ANALYZE, BUFFERS) SELECT * FROM tenk1 WHERE unique1 < 100 AND unique
Buffers: shared hit=4 read=3
-> Bitmap Index Scan on tenk1_unique1 (cost=0.00..5.04 rows=100 width=0) (actual time=0.027..0.027 rows=100 loops=1)
Index Cond: (unique1 < 100)
+ Index Searches: 1
Buffers: shared hit=2
-> Bitmap Index Scan on tenk1_unique2 (cost=0.00..19.78 rows=999 width=0) (actual time=0.070..0.070 rows=999 loops=1)
Index Cond: (unique2 > 9000)
+ Index Searches: 1
Buffers: shared hit=2 read=3
Planning:
Buffers: shared hit=3
@@ -908,6 +914,7 @@ EXPLAIN ANALYZE UPDATE tenk1 SET hundred = hundred + 1 WHERE unique1 < 100;
Heap Blocks: exact=90
-> Bitmap Index Scan on tenk1_unique1 (cost=0.00..5.04 rows=100 width=0) (actual time=0.031..0.031 rows=100 loops=1)
Index Cond: (unique1 < 100)
+ Index Searches: 1
Planning Time: 0.151 ms
Execution Time: 1.856 ms
@@ -1042,6 +1049,7 @@ EXPLAIN ANALYZE SELECT * FROM tenk1 WHERE unique1 < 100 AND unique2 > 9000
Limit (cost=0.29..14.33 rows=2 width=244) (actual time=0.051..0.071 rows=2 loops=1)
-> Index Scan using tenk1_unique2 on tenk1 (cost=0.29..70.50 rows=10 width=244) (actual time=0.051..0.070 rows=2 loops=1)
Index Cond: (unique2 > 9000)
+ Index Searches: 1
Filter: (unique1 < 100)
Rows Removed by Filter: 287
Planning Time: 0.077 ms
diff --git a/doc/src/sgml/ref/explain.sgml b/doc/src/sgml/ref/explain.sgml
index db9d3a854..e042638b7 100644
--- a/doc/src/sgml/ref/explain.sgml
+++ b/doc/src/sgml/ref/explain.sgml
@@ -502,9 +502,10 @@ EXPLAIN ANALYZE EXECUTE query(100, 200);
Batches: 1 Memory Usage: 24kB
-> Index Scan using test_pkey on test (cost=0.29..10.27 rows=99 width=8) (actual time=0.009..0.025 rows=99 loops=1)
Index Cond: ((id > 100) AND (id < 200))
+ Index Searches: 1
Planning Time: 0.244 ms
Execution Time: 0.073 ms
-(7 rows)
+(8 rows)
</programlisting>
</para>
diff --git a/doc/src/sgml/rules.sgml b/doc/src/sgml/rules.sgml
index 7a928bd7b..7a00e4c0e 100644
--- a/doc/src/sgml/rules.sgml
+++ b/doc/src/sgml/rules.sgml
@@ -1045,6 +1045,7 @@ SELECT count(*) FROM words WHERE word = 'caterpiler';
Aggregate (cost=4.44..4.45 rows=1 width=0) (actual time=0.042..0.042 rows=1 loops=1)
-> Index Only Scan using wrd_word on wrd (cost=0.42..4.44 rows=1 width=0) (actual time=0.039..0.039 rows=0 loops=1)
Index Cond: (word = 'caterpiler'::text)
+ Index Searches: 1
Heap Fetches: 0
Planning time: 0.164 ms
Execution time: 0.117 ms
diff --git a/src/test/regress/expected/brin_multi.out b/src/test/regress/expected/brin_multi.out
index ae9ce9d8e..c24d56007 100644
--- a/src/test/regress/expected/brin_multi.out
+++ b/src/test/regress/expected/brin_multi.out
@@ -853,7 +853,8 @@ SELECT * FROM brin_date_test WHERE a = '2023-01-01'::date;
Recheck Cond: (a = '2023-01-01'::date)
-> Bitmap Index Scan on brin_date_test_a_idx (actual rows=0 loops=1)
Index Cond: (a = '2023-01-01'::date)
-(4 rows)
+ Index Searches: 1
+(5 rows)
DROP TABLE brin_date_test;
RESET enable_seqscan;
@@ -872,7 +873,8 @@ SELECT * FROM brin_timestamp_test WHERE a = '2023-01-01'::timestamp;
Recheck Cond: (a = '2023-01-01 00:00:00'::timestamp without time zone)
-> Bitmap Index Scan on brin_timestamp_test_a_idx (actual rows=0 loops=1)
Index Cond: (a = '2023-01-01 00:00:00'::timestamp without time zone)
-(4 rows)
+ Index Searches: 1
+(5 rows)
EXPLAIN (ANALYZE, TIMING OFF, COSTS OFF, SUMMARY OFF)
SELECT * FROM brin_timestamp_test WHERE a = '1900-01-01'::timestamp;
@@ -882,7 +884,8 @@ SELECT * FROM brin_timestamp_test WHERE a = '1900-01-01'::timestamp;
Recheck Cond: (a = '1900-01-01 00:00:00'::timestamp without time zone)
-> Bitmap Index Scan on brin_timestamp_test_a_idx (actual rows=0 loops=1)
Index Cond: (a = '1900-01-01 00:00:00'::timestamp without time zone)
-(4 rows)
+ Index Searches: 1
+(5 rows)
DROP TABLE brin_timestamp_test;
RESET enable_seqscan;
@@ -900,7 +903,8 @@ SELECT * FROM brin_date_test WHERE a = '2023-01-01'::date;
Recheck Cond: (a = '2023-01-01'::date)
-> Bitmap Index Scan on brin_date_test_a_idx (actual rows=0 loops=1)
Index Cond: (a = '2023-01-01'::date)
-(4 rows)
+ Index Searches: 1
+(5 rows)
EXPLAIN (ANALYZE, TIMING OFF, COSTS OFF, SUMMARY OFF)
SELECT * FROM brin_date_test WHERE a = '1900-01-01'::date;
@@ -910,7 +914,8 @@ SELECT * FROM brin_date_test WHERE a = '1900-01-01'::date;
Recheck Cond: (a = '1900-01-01'::date)
-> Bitmap Index Scan on brin_date_test_a_idx (actual rows=0 loops=1)
Index Cond: (a = '1900-01-01'::date)
-(4 rows)
+ Index Searches: 1
+(5 rows)
DROP TABLE brin_date_test;
RESET enable_seqscan;
@@ -929,7 +934,8 @@ SELECT * FROM brin_interval_test WHERE a = '-30 years'::interval;
Recheck Cond: (a = '@ 30 years ago'::interval)
-> Bitmap Index Scan on brin_interval_test_a_idx (actual rows=0 loops=1)
Index Cond: (a = '@ 30 years ago'::interval)
-(4 rows)
+ Index Searches: 1
+(5 rows)
EXPLAIN (ANALYZE, TIMING OFF, COSTS OFF, SUMMARY OFF)
SELECT * FROM brin_interval_test WHERE a = '30 years'::interval;
@@ -939,7 +945,8 @@ SELECT * FROM brin_interval_test WHERE a = '30 years'::interval;
Recheck Cond: (a = '@ 30 years'::interval)
-> Bitmap Index Scan on brin_interval_test_a_idx (actual rows=0 loops=1)
Index Cond: (a = '@ 30 years'::interval)
-(4 rows)
+ Index Searches: 1
+(5 rows)
DROP TABLE brin_interval_test;
RESET enable_seqscan;
@@ -957,7 +964,8 @@ SELECT * FROM brin_interval_test WHERE a = '-30 years'::interval;
Recheck Cond: (a = '@ 30 years ago'::interval)
-> Bitmap Index Scan on brin_interval_test_a_idx (actual rows=0 loops=1)
Index Cond: (a = '@ 30 years ago'::interval)
-(4 rows)
+ Index Searches: 1
+(5 rows)
EXPLAIN (ANALYZE, TIMING OFF, COSTS OFF, SUMMARY OFF)
SELECT * FROM brin_interval_test WHERE a = '30 years'::interval;
@@ -967,7 +975,8 @@ SELECT * FROM brin_interval_test WHERE a = '30 years'::interval;
Recheck Cond: (a = '@ 30 years'::interval)
-> Bitmap Index Scan on brin_interval_test_a_idx (actual rows=0 loops=1)
Index Cond: (a = '@ 30 years'::interval)
-(4 rows)
+ Index Searches: 1
+(5 rows)
DROP TABLE brin_interval_test;
RESET enable_seqscan;
diff --git a/src/test/regress/expected/memoize.out b/src/test/regress/expected/memoize.out
index f6b8329cd..a8bc0bfd7 100644
--- a/src/test/regress/expected/memoize.out
+++ b/src/test/regress/expected/memoize.out
@@ -22,8 +22,10 @@ begin
ln := regexp_replace(ln, 'Evictions: 0', 'Evictions: Zero');
ln := regexp_replace(ln, 'Evictions: \d+', 'Evictions: N');
ln := regexp_replace(ln, 'Memory Usage: \d+', 'Memory Usage: N');
- ln := regexp_replace(ln, 'Heap Fetches: \d+', 'Heap Fetches: N');
- ln := regexp_replace(ln, 'loops=\d+', 'loops=N');
+ ln := regexp_replace(ln, 'Heap Fetches: \d+', 'Heap Fetches: N');
+ ln := regexp_replace(ln, 'loops=\d+', 'loops=N');
+ ln := regexp_replace(ln, 'Index Searches: 0', 'Index Searches: Zero');
+ ln := regexp_replace(ln, 'Index Searches: \d+', 'Index Searches: N');
return next ln;
end loop;
end;
@@ -48,8 +50,9 @@ WHERE t2.unique1 < 1000;', false);
Hits: 980 Misses: 20 Evictions: Zero Overflows: 0 Memory Usage: NkB
-> Index Only Scan using tenk1_unique1 on tenk1 t1 (actual rows=1 loops=N)
Index Cond: (unique1 = t2.twenty)
+ Index Searches: N
Heap Fetches: N
-(12 rows)
+(13 rows)
-- And check we get the expected results.
SELECT COUNT(*),AVG(t1.unique1) FROM tenk1 t1
@@ -79,8 +82,9 @@ WHERE t1.unique1 < 1000;', false);
Hits: 980 Misses: 20 Evictions: Zero Overflows: 0 Memory Usage: NkB
-> Index Only Scan using tenk1_unique1 on tenk1 t2 (actual rows=1 loops=N)
Index Cond: (unique1 = t1.twenty)
+ Index Searches: N
Heap Fetches: N
-(12 rows)
+(13 rows)
-- And check we get the expected results.
SELECT COUNT(*),AVG(t2.unique1) FROM tenk1 t1,
@@ -106,6 +110,7 @@ WHERE t1.unique1 < 10;', false);
-> Nested Loop Left Join (actual rows=20 loops=N)
-> Index Scan using tenk1_unique1 on tenk1 t1 (actual rows=10 loops=N)
Index Cond: (unique1 < 10)
+ Index Searches: N
-> Memoize (actual rows=2 loops=N)
Cache Key: t1.two
Cache Mode: binary
@@ -115,7 +120,8 @@ WHERE t1.unique1 < 10;', false);
Rows Removed by Filter: 2
-> Index Scan using tenk1_unique1 on tenk1 t2_1 (actual rows=4 loops=N)
Index Cond: (unique1 < 4)
-(13 rows)
+ Index Searches: N
+(15 rows)
-- And check we get the expected results.
SELECT COUNT(*),AVG(t2.t1two) FROM tenk1 t1 LEFT JOIN
@@ -146,10 +152,11 @@ WHERE s.c1 = s.c2 AND t1.unique1 < 1000;', false);
Cache Mode: binary
Hits: 998 Misses: 2 Evictions: Zero Overflows: 0 Memory Usage: NkB
-> Index Only Scan using tenk1_unique1 on tenk1 t2 (actual rows=1 loops=N)
+ Index Searches: N
Filter: ((t1.two + 1) = unique1)
Rows Removed by Filter: 9999
Heap Fetches: N
-(13 rows)
+(14 rows)
-- And check we get the expected results.
SELECT COUNT(*), AVG(t1.twenty) FROM tenk1 t1 LEFT JOIN
@@ -217,9 +224,10 @@ ON t1.x = t2.t::numeric AND t1.t::numeric = t2.x;', false);
Hits: 20 Misses: 20 Evictions: Zero Overflows: 0 Memory Usage: NkB
-> Index Only Scan using expr_key_idx_x_t on expr_key t2 (actual rows=2 loops=N)
Index Cond: (x = (t1.t)::numeric)
+ Index Searches: N
Filter: (t1.x = (t)::numeric)
Heap Fetches: N
-(10 rows)
+(11 rows)
DROP TABLE expr_key;
-- Reduce work_mem and hash_mem_multiplier so that we see some cache evictions
@@ -245,8 +253,9 @@ WHERE t2.unique1 < 1200;', true);
Hits: N Misses: N Evictions: N Overflows: 0 Memory Usage: NkB
-> Index Only Scan using tenk1_unique1 on tenk1 t1 (actual rows=1 loops=N)
Index Cond: (unique1 = t2.thousand)
+ Index Searches: N
Heap Fetches: N
-(12 rows)
+(13 rows)
CREATE TABLE flt (f float);
CREATE INDEX flt_f_idx ON flt (f);
@@ -260,6 +269,7 @@ SELECT * FROM flt f1 INNER JOIN flt f2 ON f1.f = f2.f;', false);
-------------------------------------------------------------------------------
Nested Loop (actual rows=4 loops=N)
-> Index Only Scan using flt_f_idx on flt f1 (actual rows=2 loops=N)
+ Index Searches: N
Heap Fetches: N
-> Memoize (actual rows=2 loops=N)
Cache Key: f1.f
@@ -267,8 +277,9 @@ SELECT * FROM flt f1 INNER JOIN flt f2 ON f1.f = f2.f;', false);
Hits: 1 Misses: 1 Evictions: Zero Overflows: 0 Memory Usage: NkB
-> Index Only Scan using flt_f_idx on flt f2 (actual rows=2 loops=N)
Index Cond: (f = f1.f)
+ Index Searches: N
Heap Fetches: N
-(10 rows)
+(12 rows)
-- Ensure memoize operates in binary mode
SELECT explain_memoize('
@@ -277,6 +288,7 @@ SELECT * FROM flt f1 INNER JOIN flt f2 ON f1.f >= f2.f;', false);
-------------------------------------------------------------------------------
Nested Loop (actual rows=4 loops=N)
-> Index Only Scan using flt_f_idx on flt f1 (actual rows=2 loops=N)
+ Index Searches: N
Heap Fetches: N
-> Memoize (actual rows=2 loops=N)
Cache Key: f1.f
@@ -284,8 +296,9 @@ SELECT * FROM flt f1 INNER JOIN flt f2 ON f1.f >= f2.f;', false);
Hits: 0 Misses: 2 Evictions: Zero Overflows: 0 Memory Usage: NkB
-> Index Only Scan using flt_f_idx on flt f2 (actual rows=2 loops=N)
Index Cond: (f <= f1.f)
+ Index Searches: N
Heap Fetches: N
-(10 rows)
+(12 rows)
DROP TABLE flt;
-- Exercise Memoize in binary mode with a large fixed width type and a
@@ -311,7 +324,8 @@ SELECT * FROM strtest s1 INNER JOIN strtest s2 ON s1.n >= s2.n;', false);
Hits: 3 Misses: 3 Evictions: Zero Overflows: 0 Memory Usage: NkB
-> Index Scan using strtest_n_idx on strtest s2 (actual rows=4 loops=N)
Index Cond: (n <= s1.n)
-(9 rows)
+ Index Searches: N
+(10 rows)
-- Ensure we get 3 hits and 3 misses
SELECT explain_memoize('
@@ -327,7 +341,8 @@ SELECT * FROM strtest s1 INNER JOIN strtest s2 ON s1.t >= s2.t;', false);
Hits: 3 Misses: 3 Evictions: Zero Overflows: 0 Memory Usage: NkB
-> Index Scan using strtest_t_idx on strtest s2 (actual rows=4 loops=N)
Index Cond: (t <= s1.t)
-(9 rows)
+ Index Searches: N
+(10 rows)
DROP TABLE strtest;
-- Ensure memoize works with partitionwise join
@@ -347,6 +362,7 @@ SELECT * FROM prt t1 INNER JOIN prt t2 ON t1.a = t2.a;', false);
Append (actual rows=32 loops=N)
-> Nested Loop (actual rows=16 loops=N)
-> Index Only Scan using iprt_p1_a on prt_p1 t1_1 (actual rows=4 loops=N)
+ Index Searches: N
Heap Fetches: N
-> Memoize (actual rows=4 loops=N)
Cache Key: t1_1.a
@@ -354,9 +370,11 @@ SELECT * FROM prt t1 INNER JOIN prt t2 ON t1.a = t2.a;', false);
Hits: 3 Misses: 1 Evictions: Zero Overflows: 0 Memory Usage: NkB
-> Index Only Scan using iprt_p1_a on prt_p1 t2_1 (actual rows=4 loops=N)
Index Cond: (a = t1_1.a)
+ Index Searches: N
Heap Fetches: N
-> Nested Loop (actual rows=16 loops=N)
-> Index Only Scan using iprt_p2_a on prt_p2 t1_2 (actual rows=4 loops=N)
+ Index Searches: N
Heap Fetches: N
-> Memoize (actual rows=4 loops=N)
Cache Key: t1_2.a
@@ -364,8 +382,9 @@ SELECT * FROM prt t1 INNER JOIN prt t2 ON t1.a = t2.a;', false);
Hits: 3 Misses: 1 Evictions: Zero Overflows: 0 Memory Usage: NkB
-> Index Only Scan using iprt_p2_a on prt_p2 t2_2 (actual rows=4 loops=N)
Index Cond: (a = t1_2.a)
+ Index Searches: N
Heap Fetches: N
-(21 rows)
+(25 rows)
-- Ensure memoize works with parameterized union-all Append path
SET enable_partitionwise_join TO off;
@@ -377,6 +396,7 @@ ON t1.a = t2.a;', false);
-------------------------------------------------------------------------------------
Nested Loop (actual rows=16 loops=N)
-> Index Only Scan using iprt_p1_a on prt_p1 t1 (actual rows=4 loops=N)
+ Index Searches: N
Heap Fetches: N
-> Memoize (actual rows=4 loops=N)
Cache Key: t1.a
@@ -385,11 +405,13 @@ ON t1.a = t2.a;', false);
-> Append (actual rows=4 loops=N)
-> Index Only Scan using iprt_p1_a on prt_p1 (actual rows=4 loops=N)
Index Cond: (a = t1.a)
+ Index Searches: N
Heap Fetches: N
-> Index Only Scan using iprt_p2_a on prt_p2 (actual rows=0 loops=N)
Index Cond: (a = t1.a)
+ Index Searches: N
Heap Fetches: N
-(14 rows)
+(17 rows)
DROP TABLE prt;
RESET enable_partitionwise_join;
diff --git a/src/test/regress/expected/partition_prune.out b/src/test/regress/expected/partition_prune.out
index 7a03b4e36..e3e99272a 100644
--- a/src/test/regress/expected/partition_prune.out
+++ b/src/test/regress/expected/partition_prune.out
@@ -2340,6 +2340,10 @@ begin
ln := regexp_replace(ln, 'Workers Launched: \d+', 'Workers Launched: N');
ln := regexp_replace(ln, 'actual rows=\d+ loops=\d+', 'actual rows=N loops=N');
ln := regexp_replace(ln, 'Rows Removed by Filter: \d+', 'Rows Removed by Filter: N');
+ perform regexp_matches(ln, 'Index Searches: \d+');
+ if found then
+ continue;
+ end if;
return next ln;
end loop;
end;
@@ -2657,47 +2661,56 @@ select * from ab where a = (select max(a) from lprt_a) and b = (select max(a)-1
Filter: (b = (InitPlan 2).col1)
-> Bitmap Index Scan on ab_a1_b1_a_idx (never executed)
Index Cond: (a = (InitPlan 1).col1)
+ Index Searches: 0
-> Bitmap Heap Scan on ab_a1_b2 ab_2 (never executed)
Recheck Cond: (a = (InitPlan 1).col1)
Filter: (b = (InitPlan 2).col1)
-> Bitmap Index Scan on ab_a1_b2_a_idx (never executed)
Index Cond: (a = (InitPlan 1).col1)
+ Index Searches: 0
-> Bitmap Heap Scan on ab_a1_b3 ab_3 (never executed)
Recheck Cond: (a = (InitPlan 1).col1)
Filter: (b = (InitPlan 2).col1)
-> Bitmap Index Scan on ab_a1_b3_a_idx (never executed)
Index Cond: (a = (InitPlan 1).col1)
+ Index Searches: 0
-> Bitmap Heap Scan on ab_a2_b1 ab_4 (never executed)
Recheck Cond: (a = (InitPlan 1).col1)
Filter: (b = (InitPlan 2).col1)
-> Bitmap Index Scan on ab_a2_b1_a_idx (never executed)
Index Cond: (a = (InitPlan 1).col1)
+ Index Searches: 0
-> Bitmap Heap Scan on ab_a2_b2 ab_5 (never executed)
Recheck Cond: (a = (InitPlan 1).col1)
Filter: (b = (InitPlan 2).col1)
-> Bitmap Index Scan on ab_a2_b2_a_idx (never executed)
Index Cond: (a = (InitPlan 1).col1)
+ Index Searches: 0
-> Bitmap Heap Scan on ab_a2_b3 ab_6 (never executed)
Recheck Cond: (a = (InitPlan 1).col1)
Filter: (b = (InitPlan 2).col1)
-> Bitmap Index Scan on ab_a2_b3_a_idx (never executed)
Index Cond: (a = (InitPlan 1).col1)
+ Index Searches: 0
-> Bitmap Heap Scan on ab_a3_b1 ab_7 (never executed)
Recheck Cond: (a = (InitPlan 1).col1)
Filter: (b = (InitPlan 2).col1)
-> Bitmap Index Scan on ab_a3_b1_a_idx (never executed)
Index Cond: (a = (InitPlan 1).col1)
+ Index Searches: 0
-> Bitmap Heap Scan on ab_a3_b2 ab_8 (actual rows=0 loops=1)
Recheck Cond: (a = (InitPlan 1).col1)
Filter: (b = (InitPlan 2).col1)
-> Bitmap Index Scan on ab_a3_b2_a_idx (actual rows=0 loops=1)
Index Cond: (a = (InitPlan 1).col1)
+ Index Searches: 1
-> Bitmap Heap Scan on ab_a3_b3 ab_9 (never executed)
Recheck Cond: (a = (InitPlan 1).col1)
Filter: (b = (InitPlan 2).col1)
-> Bitmap Index Scan on ab_a3_b3_a_idx (never executed)
Index Cond: (a = (InitPlan 1).col1)
-(52 rows)
+ Index Searches: 0
+(61 rows)
-- Test run-time partition pruning with UNION ALL parents
explain (analyze, costs off, summary off, timing off)
@@ -2713,16 +2726,19 @@ select * from (select * from ab where a = 1 union all select * from ab) ab where
Filter: (b = (InitPlan 1).col1)
-> Bitmap Index Scan on ab_a1_b1_a_idx (actual rows=0 loops=1)
Index Cond: (a = 1)
+ Index Searches: 1
-> Bitmap Heap Scan on ab_a1_b2 ab_12 (never executed)
Recheck Cond: (a = 1)
Filter: (b = (InitPlan 1).col1)
-> Bitmap Index Scan on ab_a1_b2_a_idx (never executed)
Index Cond: (a = 1)
+ Index Searches: 0
-> Bitmap Heap Scan on ab_a1_b3 ab_13 (never executed)
Recheck Cond: (a = 1)
Filter: (b = (InitPlan 1).col1)
-> Bitmap Index Scan on ab_a1_b3_a_idx (never executed)
Index Cond: (a = 1)
+ Index Searches: 0
-> Seq Scan on ab_a1_b1 ab_1 (actual rows=0 loops=1)
Filter: (b = (InitPlan 1).col1)
-> Seq Scan on ab_a1_b2 ab_2 (never executed)
@@ -2741,7 +2757,7 @@ select * from (select * from ab where a = 1 union all select * from ab) ab where
Filter: (b = (InitPlan 1).col1)
-> Seq Scan on ab_a3_b3 ab_9 (never executed)
Filter: (b = (InitPlan 1).col1)
-(37 rows)
+(40 rows)
-- A case containing a UNION ALL with a non-partitioned child.
explain (analyze, costs off, summary off, timing off)
@@ -2757,16 +2773,19 @@ select * from (select * from ab where a = 1 union all (values(10,5)) union all s
Filter: (b = (InitPlan 1).col1)
-> Bitmap Index Scan on ab_a1_b1_a_idx (actual rows=0 loops=1)
Index Cond: (a = 1)
+ Index Searches: 1
-> Bitmap Heap Scan on ab_a1_b2 ab_12 (never executed)
Recheck Cond: (a = 1)
Filter: (b = (InitPlan 1).col1)
-> Bitmap Index Scan on ab_a1_b2_a_idx (never executed)
Index Cond: (a = 1)
+ Index Searches: 0
-> Bitmap Heap Scan on ab_a1_b3 ab_13 (never executed)
Recheck Cond: (a = 1)
Filter: (b = (InitPlan 1).col1)
-> Bitmap Index Scan on ab_a1_b3_a_idx (never executed)
Index Cond: (a = 1)
+ Index Searches: 0
-> Result (actual rows=0 loops=1)
One-Time Filter: (5 = (InitPlan 1).col1)
-> Seq Scan on ab_a1_b1 ab_1 (actual rows=0 loops=1)
@@ -2787,7 +2806,7 @@ select * from (select * from ab where a = 1 union all (values(10,5)) union all s
Filter: (b = (InitPlan 1).col1)
-> Seq Scan on ab_a3_b3 ab_9 (never executed)
Filter: (b = (InitPlan 1).col1)
-(39 rows)
+(42 rows)
-- Another UNION ALL test, but containing a mix of exec init and exec run-time pruning.
create table xy_1 (x int, y int);
@@ -2858,16 +2877,19 @@ update ab_a1 set b = 3 from ab where ab.a = 1 and ab.a = ab_a1.a;');
Recheck Cond: (a = 1)
-> Bitmap Index Scan on ab_a1_b1_a_idx (actual rows=0 loops=1)
Index Cond: (a = 1)
+ Index Searches: 1
-> Bitmap Heap Scan on ab_a1_b2 ab_a1_2 (actual rows=1 loops=1)
Recheck Cond: (a = 1)
Heap Blocks: exact=1
-> Bitmap Index Scan on ab_a1_b2_a_idx (actual rows=1 loops=1)
Index Cond: (a = 1)
+ Index Searches: 1
-> Bitmap Heap Scan on ab_a1_b3 ab_a1_3 (actual rows=0 loops=1)
Recheck Cond: (a = 1)
Heap Blocks: exact=1
-> Bitmap Index Scan on ab_a1_b3_a_idx (actual rows=1 loops=1)
Index Cond: (a = 1)
+ Index Searches: 1
-> Materialize (actual rows=1 loops=1)
Storage: Memory Maximum Storage: NkB
-> Append (actual rows=1 loops=1)
@@ -2875,17 +2897,20 @@ update ab_a1 set b = 3 from ab where ab.a = 1 and ab.a = ab_a1.a;');
Recheck Cond: (a = 1)
-> Bitmap Index Scan on ab_a1_b1_a_idx (actual rows=0 loops=1)
Index Cond: (a = 1)
+ Index Searches: 1
-> Bitmap Heap Scan on ab_a1_b2 ab_2 (actual rows=1 loops=1)
Recheck Cond: (a = 1)
Heap Blocks: exact=1
-> Bitmap Index Scan on ab_a1_b2_a_idx (actual rows=1 loops=1)
Index Cond: (a = 1)
+ Index Searches: 1
-> Bitmap Heap Scan on ab_a1_b3 ab_3 (actual rows=0 loops=1)
Recheck Cond: (a = 1)
Heap Blocks: exact=1
-> Bitmap Index Scan on ab_a1_b3_a_idx (actual rows=1 loops=1)
Index Cond: (a = 1)
-(37 rows)
+ Index Searches: 1
+(43 rows)
table ab;
a | b
@@ -2961,17 +2986,23 @@ select * from tbl1 join tprt on tbl1.col1 > tprt.col1;
-> Append (actual rows=3 loops=2)
-> Index Scan using tprt1_idx on tprt_1 (actual rows=2 loops=2)
Index Cond: (col1 < tbl1.col1)
+ Index Searches: 1
-> Index Scan using tprt2_idx on tprt_2 (actual rows=2 loops=1)
Index Cond: (col1 < tbl1.col1)
+ Index Searches: 1
-> Index Scan using tprt3_idx on tprt_3 (never executed)
Index Cond: (col1 < tbl1.col1)
+ Index Searches: 0
-> Index Scan using tprt4_idx on tprt_4 (never executed)
Index Cond: (col1 < tbl1.col1)
+ Index Searches: 0
-> Index Scan using tprt5_idx on tprt_5 (never executed)
Index Cond: (col1 < tbl1.col1)
+ Index Searches: 0
-> Index Scan using tprt6_idx on tprt_6 (never executed)
Index Cond: (col1 < tbl1.col1)
-(15 rows)
+ Index Searches: 0
+(21 rows)
explain (analyze, costs off, summary off, timing off)
select * from tbl1 join tprt on tbl1.col1 = tprt.col1;
@@ -2982,17 +3013,23 @@ select * from tbl1 join tprt on tbl1.col1 = tprt.col1;
-> Append (actual rows=1 loops=2)
-> Index Scan using tprt1_idx on tprt_1 (never executed)
Index Cond: (col1 = tbl1.col1)
+ Index Searches: 0
-> Index Scan using tprt2_idx on tprt_2 (actual rows=1 loops=2)
Index Cond: (col1 = tbl1.col1)
+ Index Searches: 1
-> Index Scan using tprt3_idx on tprt_3 (never executed)
Index Cond: (col1 = tbl1.col1)
+ Index Searches: 0
-> Index Scan using tprt4_idx on tprt_4 (never executed)
Index Cond: (col1 = tbl1.col1)
+ Index Searches: 0
-> Index Scan using tprt5_idx on tprt_5 (never executed)
Index Cond: (col1 = tbl1.col1)
+ Index Searches: 0
-> Index Scan using tprt6_idx on tprt_6 (never executed)
Index Cond: (col1 = tbl1.col1)
-(15 rows)
+ Index Searches: 0
+(21 rows)
select tbl1.col1, tprt.col1 from tbl1
inner join tprt on tbl1.col1 > tprt.col1
@@ -3027,17 +3064,23 @@ select * from tbl1 inner join tprt on tbl1.col1 > tprt.col1;
-> Append (actual rows=5 loops=5)
-> Index Scan using tprt1_idx on tprt_1 (actual rows=2 loops=5)
Index Cond: (col1 < tbl1.col1)
+ Index Searches: 1
-> Index Scan using tprt2_idx on tprt_2 (actual rows=3 loops=4)
Index Cond: (col1 < tbl1.col1)
+ Index Searches: 1
-> Index Scan using tprt3_idx on tprt_3 (actual rows=1 loops=2)
Index Cond: (col1 < tbl1.col1)
+ Index Searches: 1
-> Index Scan using tprt4_idx on tprt_4 (never executed)
Index Cond: (col1 < tbl1.col1)
+ Index Searches: 0
-> Index Scan using tprt5_idx on tprt_5 (never executed)
Index Cond: (col1 < tbl1.col1)
+ Index Searches: 0
-> Index Scan using tprt6_idx on tprt_6 (never executed)
Index Cond: (col1 < tbl1.col1)
-(15 rows)
+ Index Searches: 0
+(21 rows)
explain (analyze, costs off, summary off, timing off)
select * from tbl1 inner join tprt on tbl1.col1 = tprt.col1;
@@ -3048,17 +3091,23 @@ select * from tbl1 inner join tprt on tbl1.col1 = tprt.col1;
-> Append (actual rows=1 loops=5)
-> Index Scan using tprt1_idx on tprt_1 (never executed)
Index Cond: (col1 = tbl1.col1)
+ Index Searches: 0
-> Index Scan using tprt2_idx on tprt_2 (actual rows=1 loops=2)
Index Cond: (col1 = tbl1.col1)
+ Index Searches: 1
-> Index Scan using tprt3_idx on tprt_3 (actual rows=0 loops=3)
Index Cond: (col1 = tbl1.col1)
+ Index Searches: 1
-> Index Scan using tprt4_idx on tprt_4 (never executed)
Index Cond: (col1 = tbl1.col1)
+ Index Searches: 0
-> Index Scan using tprt5_idx on tprt_5 (never executed)
Index Cond: (col1 = tbl1.col1)
+ Index Searches: 0
-> Index Scan using tprt6_idx on tprt_6 (never executed)
Index Cond: (col1 = tbl1.col1)
-(15 rows)
+ Index Searches: 0
+(21 rows)
select tbl1.col1, tprt.col1 from tbl1
inner join tprt on tbl1.col1 > tprt.col1
@@ -3112,17 +3161,23 @@ select * from tbl1 join tprt on tbl1.col1 < tprt.col1;
-> Append (actual rows=1 loops=1)
-> Index Scan using tprt1_idx on tprt_1 (never executed)
Index Cond: (col1 > tbl1.col1)
+ Index Searches: 0
-> Index Scan using tprt2_idx on tprt_2 (never executed)
Index Cond: (col1 > tbl1.col1)
+ Index Searches: 0
-> Index Scan using tprt3_idx on tprt_3 (never executed)
Index Cond: (col1 > tbl1.col1)
+ Index Searches: 0
-> Index Scan using tprt4_idx on tprt_4 (never executed)
Index Cond: (col1 > tbl1.col1)
+ Index Searches: 0
-> Index Scan using tprt5_idx on tprt_5 (never executed)
Index Cond: (col1 > tbl1.col1)
+ Index Searches: 0
-> Index Scan using tprt6_idx on tprt_6 (actual rows=1 loops=1)
Index Cond: (col1 > tbl1.col1)
-(15 rows)
+ Index Searches: 1
+(21 rows)
select tbl1.col1, tprt.col1 from tbl1
inner join tprt on tbl1.col1 < tprt.col1
@@ -3144,17 +3199,23 @@ select * from tbl1 join tprt on tbl1.col1 = tprt.col1;
-> Append (actual rows=0 loops=1)
-> Index Scan using tprt1_idx on tprt_1 (never executed)
Index Cond: (col1 = tbl1.col1)
+ Index Searches: 0
-> Index Scan using tprt2_idx on tprt_2 (never executed)
Index Cond: (col1 = tbl1.col1)
+ Index Searches: 0
-> Index Scan using tprt3_idx on tprt_3 (never executed)
Index Cond: (col1 = tbl1.col1)
+ Index Searches: 0
-> Index Scan using tprt4_idx on tprt_4 (never executed)
Index Cond: (col1 = tbl1.col1)
+ Index Searches: 0
-> Index Scan using tprt5_idx on tprt_5 (never executed)
Index Cond: (col1 = tbl1.col1)
+ Index Searches: 0
-> Index Scan using tprt6_idx on tprt_6 (never executed)
Index Cond: (col1 = tbl1.col1)
-(15 rows)
+ Index Searches: 0
+(21 rows)
select tbl1.col1, tprt.col1 from tbl1
inner join tprt on tbl1.col1 = tprt.col1
@@ -3482,12 +3543,14 @@ explain (analyze, costs off, summary off, timing off) execute mt_q1(15);
Sort Key: ma_test.b
Subplans Removed: 1
-> Index Scan using ma_test_p2_b_idx on ma_test_p2 ma_test_1 (actual rows=1 loops=1)
+ Index Searches: 1
Filter: ((a >= $1) AND ((a % 10) = 5))
Rows Removed by Filter: 9
-> Index Scan using ma_test_p3_b_idx on ma_test_p3 ma_test_2 (actual rows=1 loops=1)
+ Index Searches: 1
Filter: ((a >= $1) AND ((a % 10) = 5))
Rows Removed by Filter: 9
-(9 rows)
+(11 rows)
execute mt_q1(15);
a
@@ -3503,9 +3566,10 @@ explain (analyze, costs off, summary off, timing off) execute mt_q1(25);
Sort Key: ma_test.b
Subplans Removed: 2
-> Index Scan using ma_test_p3_b_idx on ma_test_p3 ma_test_1 (actual rows=1 loops=1)
+ Index Searches: 1
Filter: ((a >= $1) AND ((a % 10) = 5))
Rows Removed by Filter: 9
-(6 rows)
+(7 rows)
execute mt_q1(25);
a
@@ -3553,13 +3617,17 @@ explain (analyze, costs off, summary off, timing off) select * from ma_test wher
-> Limit (actual rows=1 loops=1)
-> Index Scan using ma_test_p2_b_idx on ma_test_p2 (actual rows=1 loops=1)
Index Cond: (b IS NOT NULL)
+ Index Searches: 1
-> Index Scan using ma_test_p1_b_idx on ma_test_p1 ma_test_1 (never executed)
+ Index Searches: 0
Filter: (a >= (InitPlan 2).col1)
-> Index Scan using ma_test_p2_b_idx on ma_test_p2 ma_test_2 (actual rows=10 loops=1)
+ Index Searches: 1
Filter: (a >= (InitPlan 2).col1)
-> Index Scan using ma_test_p3_b_idx on ma_test_p3 ma_test_3 (actual rows=10 loops=1)
+ Index Searches: 1
Filter: (a >= (InitPlan 2).col1)
-(14 rows)
+(18 rows)
reset enable_seqscan;
reset enable_sort;
@@ -4129,14 +4197,18 @@ select * from rangep where b IN((select 1),(select 2)) order by a;
-> Merge Append (actual rows=0 loops=1)
Sort Key: rangep_2.a
-> Index Scan using rangep_0_to_100_1_a_idx on rangep_0_to_100_1 rangep_2 (actual rows=0 loops=1)
+ Index Searches: 1
Filter: (b = ANY (ARRAY[(InitPlan 1).col1, (InitPlan 2).col1]))
-> Index Scan using rangep_0_to_100_2_a_idx on rangep_0_to_100_2 rangep_3 (actual rows=0 loops=1)
+ Index Searches: 1
Filter: (b = ANY (ARRAY[(InitPlan 1).col1, (InitPlan 2).col1]))
-> Index Scan using rangep_0_to_100_3_a_idx on rangep_0_to_100_3 rangep_4 (never executed)
+ Index Searches: 0
Filter: (b = ANY (ARRAY[(InitPlan 1).col1, (InitPlan 2).col1]))
-> Index Scan using rangep_100_to_200_a_idx on rangep_100_to_200 rangep_5 (actual rows=0 loops=1)
+ Index Searches: 1
Filter: (b = ANY (ARRAY[(InitPlan 1).col1, (InitPlan 2).col1]))
-(15 rows)
+(19 rows)
reset enable_sort;
drop table rangep;
diff --git a/src/test/regress/expected/select.out b/src/test/regress/expected/select.out
index 33a6dceb0..b7cf35b9a 100644
--- a/src/test/regress/expected/select.out
+++ b/src/test/regress/expected/select.out
@@ -763,8 +763,9 @@ select * from onek2 where unique2 = 11 and stringu1 = 'ATAAAA';
-----------------------------------------------------------------
Index Scan using onek2_u2_prtl on onek2 (actual rows=1 loops=1)
Index Cond: (unique2 = 11)
+ Index Searches: 1
Filter: (stringu1 = 'ATAAAA'::name)
-(3 rows)
+(4 rows)
explain (costs off)
select unique2 from onek2 where unique2 = 11 and stringu1 = 'ATAAAA';
diff --git a/src/test/regress/sql/memoize.sql b/src/test/regress/sql/memoize.sql
index 2eaeb1477..9afe205e0 100644
--- a/src/test/regress/sql/memoize.sql
+++ b/src/test/regress/sql/memoize.sql
@@ -23,8 +23,10 @@ begin
ln := regexp_replace(ln, 'Evictions: 0', 'Evictions: Zero');
ln := regexp_replace(ln, 'Evictions: \d+', 'Evictions: N');
ln := regexp_replace(ln, 'Memory Usage: \d+', 'Memory Usage: N');
- ln := regexp_replace(ln, 'Heap Fetches: \d+', 'Heap Fetches: N');
- ln := regexp_replace(ln, 'loops=\d+', 'loops=N');
+ ln := regexp_replace(ln, 'Heap Fetches: \d+', 'Heap Fetches: N');
+ ln := regexp_replace(ln, 'loops=\d+', 'loops=N');
+ ln := regexp_replace(ln, 'Index Searches: 0', 'Index Searches: Zero');
+ ln := regexp_replace(ln, 'Index Searches: \d+', 'Index Searches: N');
return next ln;
end loop;
end;
diff --git a/src/test/regress/sql/partition_prune.sql b/src/test/regress/sql/partition_prune.sql
index 442428d93..085e746af 100644
--- a/src/test/regress/sql/partition_prune.sql
+++ b/src/test/regress/sql/partition_prune.sql
@@ -573,6 +573,10 @@ begin
ln := regexp_replace(ln, 'Workers Launched: \d+', 'Workers Launched: N');
ln := regexp_replace(ln, 'actual rows=\d+ loops=\d+', 'actual rows=N loops=N');
ln := regexp_replace(ln, 'Rows Removed by Filter: \d+', 'Rows Removed by Filter: N');
+ perform regexp_matches(ln, 'Index Searches: \d+');
+ if found then
+ continue;
+ end if;
return next ln;
end loop;
end;
--
2.45.2
On 09.11.2024 21:46, Peter Geoghegan wrote:
On Sat, Nov 9, 2024 at 12:37 PM Alena Rybakina
<a.rybakina@postgrespro.ru> wrote:
I noticed that the "Index Searches" cases shown in the regression tests are only for partitioned tables; maybe you should add some tests for regular tables like tenk1.
I allowed the patch on this thread to bitrot, but I've been
maintaining this same patch as part of the skip scan patchset.
Attached is the latest version of this patch (technically this is the
first patch in the skip scan patch series). Just to keep things
passing on the CFTester app.
Thank you)
I haven't done anything about the implementation (still using a
counter that lives in IndexScanDesc) due to a lack of clarity on
what'll work best.
To be honest, I'm still researching this, and I don't yet have an
opinion on which placement of the counter would be more suitable.
Hopefully discussion of those aspects of this patch
will pick up again soon.
I hope so too.
Note that I have changed the patch to divide "Index Searches:" by
nloops, since Tomas Vondra seemed to want to do it that way
(personally I don't feel strongly about that either way). So that's
one behavioral change, not seen in any of the versions of the patch
that have been posted to this thread so far.
Or maybe I was affected by fatigue, but I don’t understand this point,
to be honest. I see from the documentation and your first letter that it
specifies how many times in total the tuple search would be performed
during the index execution. Is that not quite right?
The documentation:
<para>
<command>EXPLAIN ANALYZE</command> breaks down the total number of
index
searches performed by each index scan node. <literal>Index
Searches: N</literal>
indicates the total number of searches across <emphasis>all</emphasis>
executor node executions/loops.
</para>
In general, I support the initiative to display this information in the query plan output. I think it is necessary for finding the reasons for low query performance.
I just know that if Postgres 18 has skip scan, but doesn't have basic
instrumentation of the number of index searches in EXPLAIN ANALYZE
when skip scan is in use, we're going to get lots of complaints about
it. It'll be very different from the current status quo. My main
motivation here is to avoid complaints about the behavior of skip scan
being completely opaque to users.
Yes, we can expect users to be concerned about this, but it is wrong not
to display information about it at all. The right thing to do is to see
the problem and try to solve it in the future.
I think this patch is the first step towards a solution, right?
It may also encourage the user to consider other options for solving
this problem if it significantly harms their performance, such as
avoiding the index scan (for example, by using the pg_hint_plan
extension), building a view over the table, or something else.
I think that the same issue could also happen with your OR
transformation patch, if we don't get this EXPLAIN ANALYZE
instrumentation. Users will still naturally want to know if a query
"WHERE a = 2 OR a = 4 OR a = 6" required only one index search during
its index scan, or if it required as many as 3 searches. They can
already see this information with a BitmapOr-based plan, today.
Why wouldn't they expect to continue to see the same information (or
similar information) when the index searches happen to be coordinated
by the index scan node/index AM itself?
To be honest, I don't quite understand this. Can you please explain in
more detail?
--
Regards,
Alena Rybakina
Postgres Professional
On Sun, Nov 10, 2024 at 2:00 PM Alena Rybakina
<a.rybakina@postgrespro.ru> wrote:
Or maybe I was affected by fatigue, but I don’t understand this point, to be honest. I see from the documentation and your first letter that it specifies how many times in total the tuple search would be performed during the index execution. Is that not quite right?
Well, nodes that appear on the inner side of a nested loop join (and
in a few other contexts) generally have their row counts (and a few
other things) divided by the total number of executions. The idea is
that we're showing the average across all executions of the node -- if
the user wants the true absolute number, they're expected to multiply
nrows by nloops themselves. This is slightly controversial behavior,
but it is long established (weirdly, we never divide by nloops for
"Buffers").
Initial versions of my patch didn't do this. The latest version does
divide like this, though. In general it isn't all that likely that an
inner index scan would have more than a single primitive index scan,
in any case, so which particular behavior I use here (divide vs don't
divide) is not something that I feel strongly about.
Why wouldn't they expect to continue to see the same information (or
similar information) when the index searches happen to be coordinated
by the index scan node/index AM itself?
To be honest, I don't quite understand this. Can you please explain in more detail?
I just meant that your OR transformation patch is another case where
we shouldn't obscure the count of primitive index scans.
It would be inconsistent of us to allow users to see the number of
index scans today (without your patch), while denying users the
ability to see essentially the same information in the future (with
your patch). The fact that an index scan has its own executor node
today and won't have one tomorrow shouldn't in itself affect
instrumentation of the number of (primitive) index scans shown by
EXPLAIN ANALYZE (it certainly won't affect the instrumentation within
the pg_stat_all_indexes view, as things stand, even without my patch).
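As a rough sketch of the two plan shapes (hypothetical table t with a
btree index t_a_idx on column a; output heavily abbreviated):

    -- Today, a BitmapOr-based plan exposes each index search as its own node:
    explain (analyze, costs off)
    select * from t where a = 2 or a = 4 or a = 6;
    --    ->  BitmapOr
    --          ->  Bitmap Index Scan on t_a_idx
    --                Index Cond: (a = 2)
    --          ->  Bitmap Index Scan on t_a_idx
    --                Index Cond: (a = 4)
    --          ->  Bitmap Index Scan on t_a_idx
    --                Index Cond: (a = 6)

    -- Once the OR clauses are rewritten into a single SAOP qual, the same
    -- detail is only visible through the field added by this patch:
    --    Index Scan using t_a_idx on t
    --      Index Cond: (a = ANY ('{2,4,6}'::integer[]))
    --      Index Searches: 3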
--
Peter Geoghegan
Sorry it took me so long to answer, I had some minor health complications
On 12.11.2024 23:00, Peter Geoghegan wrote:
On Sun, Nov 10, 2024 at 2:00 PM Alena Rybakina
<a.rybakina@postgrespro.ru> wrote:
Or maybe I was affected by fatigue, but I don’t understand this point, to be honest. I see from the documentation and your first letter that it specifies how many times in total the tuple search would be performed during the index execution. Is that not quite right?
Well, nodes that appear on the inner side of a nested loop join (and
in a few other contexts) generally have their row counts (and a few
other things) divided by the total number of executions. The idea is
that we're showing the average across all executions of the node -- if
the user wants the true absolute number, they're expected to multiply
nrows by nloops themselves. This is slightly controversial behavior,
but it is long established (weirdly, we never divide by nloops for
"Buffers").
I understand what you mean. I have run into this situation before, when
I saw far more actual rows reported than there could possibly be, which
was caused by the number of scanned tuples per loop. [0]
[0] /messages/by-id/9f4a159b-f527-465f-b82e-38b4b7df812f@postgrespro.ru
Initial versions of my patch didn't do this. The latest version does
divide like this, though. In general it isn't all that likely that an
inner index scan would have more than a single primitive index scan,
in any case, so which particular behavior I use here (divide vs don't
divide) is not something that I feel strongly about.
I think we should divide them, because dividing the total buffer usage
by the number of loops gives the user the average buffer consumption per
loop. This gives them a clearer picture of the resource intensity per
basic unit of work.
--
Regards,
Alena Rybakina
Postgres Professional
On Wed, 27 Nov 2024 at 14:22, Alena Rybakina <a.rybakina@postgrespro.ru> wrote:
Sorry it took me so long to answer, I had some minor health complications
On 12.11.2024 23:00, Peter Geoghegan wrote:
On Sun, Nov 10, 2024 at 2:00 PM Alena Rybakina
<a.rybakina@postgrespro.ru> wrote:
Or maybe I was affected by fatigue, but I don’t understand this point, to be honest. I see from the documentation and your first letter that it specifies how many times in total the tuple search would be performed during the index execution. Is that not quite right?
Well, nodes that appear on the inner side of a nested loop join (and
in a few other contexts) generally have their row counts (and a few
other things) divided by the total number of executions. The idea is
that we're showing the average across all executions of the node -- if
the user wants the true absolute number, they're expected to multiply
nrows by nloops themselves. This is slightly controversial behavior,
but it is long established (weirdly, we never divide by nloops for
"Buffers").I understood what you mean and I faced this situation before when I saw extremely more number of actual rows that could be and it was caused by the number of scanned tuples per cycles. [0]
[0] /messages/by-id/9f4a159b-f527-465f-b82e-38b4b7df812f@postgrespro.ru
Initial versions of my patch didn't do this. The latest version does
divide like this, though. In general it isn't all that likely that an
inner index scan would have more than a single primitive index scan,
in any case, so which particular behavior I use here (divide vs don't
divide) is not something that I feel strongly about.
I think we should divide them because by dividing the total buffer usage by the number of loops, user finds the average buffer consumption per loop. This gives them a clearer picture of the resource intensity per basic unit of work.
I disagree; I think the whole "dividing by number of loops and
rounding up to integer" was the wrong choice for tuple count, as that
makes it difficult if not impossible to determine the actual produced
count when it's less than the number of loops. Data is lost in the
rounding/processing, and I don't want to have lost that data.
Same applies for ~scans~ searches: if we do an index search, we should
show it in the count as a total sum, not a partially processed value. If
a user is interested in per-loop values, then they can derive those from
the data they're presented with; but that isn't true when we present
only the divided-and-rounded value.
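A small worked example of that information loss: suppose an inner index
scan is executed 7 times (loops=7) and performs 3 index searches in
total. The patch's ExplainPropertyFloat("Index Searches", NULL,
nsearches / nloops, 0, es) call rounds 3/7 to zero decimal places and
displays "Index Searches: 0"; multiplying that back by nloops gives 0,
so the true total of 3 is unrecoverable. Showing the raw total of 3
loses nothing, since 3/7 can always be computed from it.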
Kind regards,
Matthias van de Meent
Neon (https://neon.tech)
Hi!
On 27.11.2024 16:36, Matthias van de Meent wrote:
On Wed, 27 Nov 2024 at 14:22, Alena Rybakina <a.rybakina@postgrespro.ru> wrote:
Sorry it took me so long to answer, I had some minor health complications
On 12.11.2024 23:00, Peter Geoghegan wrote:
On Sun, Nov 10, 2024 at 2:00 PM Alena Rybakina
<a.rybakina@postgrespro.ru> wrote:
Or maybe I was affected by fatigue, but I don’t understand this point, to be honest. I see from the documentation and your first letter that it specifies how many times in total the tuple search would be performed during the index execution. Is that not quite right?
Well, nodes that appear on the inner side of a nested loop join (and
in a few other contexts) generally have their row counts (and a few
other things) divided by the total number of executions. The idea is
that we're showing the average across all executions of the node -- if
the user wants the true absolute number, they're expected to multiply
nrows by nloops themselves. This is slightly controversial behavior,
but it is long established (weirdly, we never divide by nloops for
"Buffers").I understood what you mean and I faced this situation before when I saw extremely more number of actual rows that could be and it was caused by the number of scanned tuples per cycles. [0]
[0] /messages/by-id/9f4a159b-f527-465f-b82e-38b4b7df812f@postgrespro.ru
Initial versions of my patch didn't do this. The latest version does
divide like this, though. In general it isn't all that likely that an
inner index scan would have more than a single primitive index scan,
in any case, so which particular behavior I use here (divide vs don't
divide) is not something that I feel strongly about.
I think we should divide them because by dividing the total buffer usage by the number of loops, user finds the average buffer consumption per loop. This gives them a clearer picture of the resource intensity per basic unit of work.
I disagree; I think the whole "dividing by number of loops and
rounding up to integer" was the wrong choice for tuple count, as that
makes it difficult if not impossible to determine the actual produced
count when it's less than the number of loops. Data is lost in the
rounding/processing, and I don't want to have lost that data.
Same applies for ~scans~ searches: If we do an index search, we should
show it in the count as total sum, not partial processed value. If a
user is interested in per-loopcount values, then they can derive that
value from the data they're presented with; but that isn't true when
we present only the divided-and-rounded value.
To be honest, I don't understand how that would be helpful, because the
distribution of buffer usage is uneven from loop to loop, isn't it?
I thought that dividing the buffer usage by the number of loops helps us
normalize the metric to account for the repeated iterations. This gives
us a clearer picture of the resource intensity per basic unit of work,
rather than just the overall total. Each loop may consume a different
amount of buffer space, but by averaging it out, we smooth those
fluctuations into a more representative measure.
Moreover, this would not correspond to the metric right next to it: the
number of rows processed by the inner node. Wouldn't a user reading the
query plan be confused by such a discrepancy?
--
Regards,
Alena Rybakina
Postgres Professional
On Thu, 28 Nov 2024 at 22:09, Alena Rybakina <a.rybakina@postgrespro.ru> wrote:
Hi!
On 27.11.2024 16:36, Matthias van de Meent wrote:
On Wed, 27 Nov 2024 at 14:22, Alena Rybakina <a.rybakina@postgrespro.ru> wrote:
Sorry it took me so long to answer, I had some minor health complications
On 12.11.2024 23:00, Peter Geoghegan wrote:
On Sun, Nov 10, 2024 at 2:00 PM Alena Rybakina
<a.rybakina@postgrespro.ru> wrote:
Or maybe I was affected by fatigue, but I don’t understand this point, to be honest. I see from the documentation and your first letter that it specifies how many times in total the tuple search would be performed during the index execution. Is that not quite right?
Well, nodes that appear on the inner side of a nested loop join (and
in a few other contexts) generally have their row counts (and a few
other things) divided by the total number of executions. The idea is
that we're showing the average across all executions of the node -- if
the user wants the true absolute number, they're expected to multiply
nrows by nloops themselves. This is slightly controversial behavior,
but it is long established (weirdly, we never divide by nloops for
"Buffers").I understood what you mean and I faced this situation before when I saw extremely more number of actual rows that could be and it was caused by the number of scanned tuples per cycles. [0]
[0] /messages/by-id/9f4a159b-f527-465f-b82e-38b4b7df812f@postgrespro.ru
Initial versions of my patch didn't do this. The latest version does
divide like this, though. In general it isn't all that likely that an
inner index scan would have more than a single primitive index scan,
in any case, so which particular behavior I use here (divide vs don't
divide) is not something that I feel strongly about.
I think we should divide them because by dividing the total buffer usage by the number of loops, user finds the average buffer consumption per loop. This gives them a clearer picture of the resource intensity per basic unit of work.
I disagree; I think the whole "dividing by number of loops and
rounding up to integer" was the wrong choice for tuple count, as that
makes it difficult if not impossible to determine the actual produced
count when it's less than the number of loops. Data is lost in the
rounding/processing, and I don't want to have lost that data.
Same applies for ~scans~ searches: If we do an index search, we should
show it in the count as total sum, not partial processed value. If a
user is interested in per-loopcount values, then they can derive that
value from the data they're presented with; but that isn't true when
we present only the divided-and-rounded value.
To be honest, I didn't understand how it will be helpful because there
is an uneven distribution of buffer usage from cycle to cycle, isn't it?
I'm sorry, I don't quite understand what you mean by cycle here.
I thought that the dividing memory on number of cycles helps us to
normalize the metric to account for the repeated iterations. This gives
us a clearer picture of the resource intensity per basic unit of work,
rather than just the overall total. Each loop may consume a different
amount of buffer space, but by averaging it out, we're smoothing those
fluctuations into a more representative measure.
The issue I see here is that users can get those numbers from raw
results, but they can't get the raw (more accurate) data from the
current output; if we only show processed data (like the 'rows' metric
in text output, which is a divided-and-rounded value) you can't get
the original data back with good confidence.
E.g., I have a table 'twentyone' with values 1..21, and I left join it
on a table 'ten' with values 1..10. The current text explain output
-once the planner is convinced to execute (nested loop left join
(seqscan 'thousand'), (index scan 'ten'))- will show that the index
scan path produced 0 rows, which is clearly wrong, and I can't get the
original value back with accuracy by multiplying rows with loops due
to the rounding.
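A rough sketch of that scenario (table and index names here are
illustrative, and the planner may need some convincing to pick this
plan shape):

    create table twentyone as select generate_series(1, 21) as a;
    create table ten as select generate_series(1, 10) as a;
    create index on ten (a);
    set enable_hashjoin = off; set enable_mergejoin = off; set enable_seqscan = off;
    explain (analyze, costs off)
    select * from twentyone t21 left join ten t10 on t21.a = t10.a;
    -- The inner index scan on "ten" returns 10 rows in total over 21
    -- loops, but 10/21 rounds to 0, so it is reported as
    -- "actual rows=0 loops=21" even though rows were produced.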
Moreover, this does not correspond to another metric that is nearby -
the number of lines processed by the algorithm for the inner node.
It doesn't have much correspondence to that anyway, as we don't count
index tuples that were accessed but didn't match the index quals, nor
heap tuples filtered out by rechecks, in the `rows` metric.
Will
not the user who evaluates the query plan be confused by such a discrepancy?
I think users will be more confused about a discrepancy between buffer
accesses and index searches (which are more closely related to each
other) than a discrepancy between index searches and
rounded-average-number-of-tuples-produced-per-loop, or the discrepancy
between not-quite-average-tuples-produced-per-loop vs the "Heap
Fetches" counter of an IndexOnlyScan, etc.
Kind regards,
Matthias van de Meent
Neon (https://neon.tech)
On Sat, Nov 9, 2024 at 1:46 PM Peter Geoghegan <pg@bowt.ie> wrote:
On Sat, Nov 9, 2024 at 12:37 PM Alena Rybakina
<a.rybakina@postgrespro.ru> wrote:
I noticed that the "Index Searches" cases shown in the regression tests are only for partitioned tables; maybe you should add some tests for regular tables like tenk1.
I allowed the patch on this thread to bitrot, but I've been
maintaining this same patch as part of the skip scan patchset.
Attached is the latest version of this patch (technically this is the
first patch in the skip scan patch series). Just to keep things
passing on the CFTester app.
Attached revision just fixes bitrot.
The patch stopped applying against HEAD cleanly due to recent work
that made EXPLAIN ANALYZE show buffers output by default.
--
Peter Geoghegan
Attachments:
v19-0001-Show-index-search-count-in-EXPLAIN-ANALYZE.patch (application/octet-stream)
From 05b40acb6bac4c7b80746ee9f3a1f716a01f8803 Mon Sep 17 00:00:00 2001
From: Peter Geoghegan <pg@bowt.ie>
Date: Wed, 14 Aug 2024 13:50:23 -0400
Subject: [PATCH v19 1/3] Show index search count in EXPLAIN ANALYZE.
Expose the information tracked by pg_stat_*_indexes.idx_scan to EXPLAIN
ANALYZE output. This is particularly useful for index scans that use
ScalarArrayOp quals, where the number of index scans isn't predictable
in advance with optimizations like the ones added to nbtree by commit
5bf748b8.
This information is made more important still by an upcoming patch that
adds skip scan optimizations to nbtree. The patch implements skip scan
by generating "skip arrays" during nbtree preprocessing, which makes the
relationship between the total number of primitive index scans and the
scan qual looser still. The new instrumentation will help users to
understand how effective these skip scan optimizations are in practice.
Author: Peter Geoghegan <pg@bowt.ie>
Reviewed-By: Tomas Vondra <tomas@vondra.me>
Reviewed-By: Masahiro Ikeda <ikedamsh@oss.nttdata.com>
Discussion: https://postgr.es/m/CAH2-Wz=PKR6rB7qbx+Vnd7eqeB5VTcrW=iJvAsTsKbdG+kW_UA@mail.gmail.com
Discussion: https://postgr.es/m/CAH2-WzkRqvaqR2CTNqTZP0z6FuL4-3ED6eQB0yx38XBNj1v-4Q@mail.gmail.com
---
src/include/access/relscan.h | 3 +
src/backend/access/brin/brin.c | 1 +
src/backend/access/gin/ginscan.c | 1 +
src/backend/access/gist/gistget.c | 2 +
src/backend/access/hash/hashsearch.c | 1 +
src/backend/access/index/genam.c | 1 +
src/backend/access/nbtree/nbtree.c | 11 ++
src/backend/access/nbtree/nbtsearch.c | 1 +
src/backend/access/spgist/spgscan.c | 1 +
src/backend/commands/explain.c | 46 ++++++++
contrib/bloom/blscan.c | 1 +
doc/src/sgml/bloom.sgml | 7 +-
doc/src/sgml/monitoring.sgml | 12 ++-
doc/src/sgml/perform.sgml | 8 ++
doc/src/sgml/ref/explain.sgml | 3 +-
doc/src/sgml/rules.sgml | 1 +
src/test/regress/expected/brin_multi.out | 27 +++--
src/test/regress/expected/memoize.out | 50 ++++++---
src/test/regress/expected/partition_prune.out | 100 +++++++++++++++---
src/test/regress/expected/select.out | 3 +-
src/test/regress/sql/memoize.sql | 6 +-
src/test/regress/sql/partition_prune.sql | 4 +
22 files changed, 244 insertions(+), 46 deletions(-)
diff --git a/src/include/access/relscan.h b/src/include/access/relscan.h
index e1884acf4..7b4180db5 100644
--- a/src/include/access/relscan.h
+++ b/src/include/access/relscan.h
@@ -153,6 +153,9 @@ typedef struct IndexScanDescData
bool xactStartedInRecovery; /* prevents killing/seeing killed
* tuples */
+ /* index access method instrumentation output state */
+ uint64 nsearches; /* # of index searches */
+
/* index access method's private state */
void *opaque; /* access-method-specific info */
diff --git a/src/backend/access/brin/brin.c b/src/backend/access/brin/brin.c
index 3aedec882..2fd284fc0 100644
--- a/src/backend/access/brin/brin.c
+++ b/src/backend/access/brin/brin.c
@@ -585,6 +585,7 @@ bringetbitmap(IndexScanDesc scan, TIDBitmap *tbm)
opaque = (BrinOpaque *) scan->opaque;
bdesc = opaque->bo_bdesc;
pgstat_count_index_scan(idxRel);
+ scan->nsearches++;
/*
* We need to know the size of the table so that we know how long to
diff --git a/src/backend/access/gin/ginscan.c b/src/backend/access/gin/ginscan.c
index f2fd62afb..5e423e155 100644
--- a/src/backend/access/gin/ginscan.c
+++ b/src/backend/access/gin/ginscan.c
@@ -436,6 +436,7 @@ ginNewScanKey(IndexScanDesc scan)
MemoryContextSwitchTo(oldCtx);
pgstat_count_index_scan(scan->indexRelation);
+ scan->nsearches++;
}
void
diff --git a/src/backend/access/gist/gistget.c b/src/backend/access/gist/gistget.c
index b35b8a975..36f1435cb 100644
--- a/src/backend/access/gist/gistget.c
+++ b/src/backend/access/gist/gistget.c
@@ -625,6 +625,7 @@ gistgettuple(IndexScanDesc scan, ScanDirection dir)
GISTSearchItem fakeItem;
pgstat_count_index_scan(scan->indexRelation);
+ scan->nsearches++;
so->firstCall = false;
so->curPageData = so->nPageData = 0;
@@ -750,6 +751,7 @@ gistgetbitmap(IndexScanDesc scan, TIDBitmap *tbm)
return 0;
pgstat_count_index_scan(scan->indexRelation);
+ scan->nsearches++;
/* Begin the scan by processing the root page */
so->curPageData = so->nPageData = 0;
diff --git a/src/backend/access/hash/hashsearch.c b/src/backend/access/hash/hashsearch.c
index 0d99d6abc..927ba1039 100644
--- a/src/backend/access/hash/hashsearch.c
+++ b/src/backend/access/hash/hashsearch.c
@@ -298,6 +298,7 @@ _hash_first(IndexScanDesc scan, ScanDirection dir)
HashScanPosItem *currItem;
pgstat_count_index_scan(rel);
+ scan->nsearches++;
/*
* We do not support hash scans with no index qualification, because we
diff --git a/src/backend/access/index/genam.c b/src/backend/access/index/genam.c
index 4b4ebff6a..a7adf9709 100644
--- a/src/backend/access/index/genam.c
+++ b/src/backend/access/index/genam.c
@@ -118,6 +118,7 @@ RelationGetIndexScan(Relation indexRelation, int nkeys, int norderbys)
scan->xactStartedInRecovery = TransactionStartedDuringRecovery();
scan->ignore_killed_tuples = !scan->xactStartedInRecovery;
+ scan->nsearches = 0; /* deliberately not reset by index_rescan */
scan->opaque = NULL;
scan->xs_itup = NULL;
diff --git a/src/backend/access/nbtree/nbtree.c b/src/backend/access/nbtree/nbtree.c
index 77afa1489..2441eaaaa 100644
--- a/src/backend/access/nbtree/nbtree.c
+++ b/src/backend/access/nbtree/nbtree.c
@@ -69,6 +69,7 @@ typedef struct BTParallelScanDescData
BTPS_State btps_pageStatus; /* indicates whether next page is
* available for scan. see above for
* possible states of parallel scan. */
+ uint64 btps_nsearches; /* counts index searches for EXPLAIN ANALYZE */
slock_t btps_mutex; /* protects above variables, btps_arrElems */
ConditionVariable btps_cv; /* used to synchronize parallel scan */
@@ -552,6 +553,7 @@ btinitparallelscan(void *target)
bt_target->btps_nextScanPage = InvalidBlockNumber;
bt_target->btps_lastCurrPage = InvalidBlockNumber;
bt_target->btps_pageStatus = BTPARALLEL_NOT_INITIALIZED;
+ bt_target->btps_nsearches = 0;
ConditionVariableInit(&bt_target->btps_cv);
}
@@ -578,6 +580,7 @@ btparallelrescan(IndexScanDesc scan)
btscan->btps_nextScanPage = InvalidBlockNumber;
btscan->btps_lastCurrPage = InvalidBlockNumber;
btscan->btps_pageStatus = BTPARALLEL_NOT_INITIALIZED;
+ /* deliberately don't reset btps_nsearches (matches index_rescan) */
SpinLockRelease(&btscan->btps_mutex);
}
@@ -705,6 +708,11 @@ _bt_parallel_seize(IndexScanDesc scan, BlockNumber *next_scan_page,
* We have successfully seized control of the scan for the purpose
* of advancing it to a new page!
*/
+ if (first && btscan->btps_pageStatus == BTPARALLEL_NOT_INITIALIZED)
+ {
+ /* count the first primitive scan for this btrescan */
+ btscan->btps_nsearches++;
+ }
btscan->btps_pageStatus = BTPARALLEL_ADVANCING;
Assert(btscan->btps_nextScanPage != P_NONE);
*next_scan_page = btscan->btps_nextScanPage;
@@ -805,6 +813,8 @@ _bt_parallel_done(IndexScanDesc scan)
btscan->btps_pageStatus = BTPARALLEL_DONE;
status_changed = true;
}
+ /* Copy the authoritative shared primitive scan counter to local field */
+ scan->nsearches = btscan->btps_nsearches;
SpinLockRelease(&btscan->btps_mutex);
/* wake up all the workers associated with this parallel scan */
@@ -839,6 +849,7 @@ _bt_parallel_primscan_schedule(IndexScanDesc scan, BlockNumber curr_page)
btscan->btps_nextScanPage = InvalidBlockNumber;
btscan->btps_lastCurrPage = InvalidBlockNumber;
btscan->btps_pageStatus = BTPARALLEL_NEED_PRIMSCAN;
+ btscan->btps_nsearches++;
/* Serialize scan's current array keys */
for (int i = 0; i < so->numArrayKeys; i++)
diff --git a/src/backend/access/nbtree/nbtsearch.c b/src/backend/access/nbtree/nbtsearch.c
index 0cd046613..82bb93d1a 100644
--- a/src/backend/access/nbtree/nbtsearch.c
+++ b/src/backend/access/nbtree/nbtsearch.c
@@ -970,6 +970,7 @@ _bt_first(IndexScanDesc scan, ScanDirection dir)
* _bt_search/_bt_endpoint below
*/
pgstat_count_index_scan(rel);
+ scan->nsearches++;
/*----------
* Examine the scan keys to discover where we need to start the scan.
diff --git a/src/backend/access/spgist/spgscan.c b/src/backend/access/spgist/spgscan.c
index 301786185..be668abf2 100644
--- a/src/backend/access/spgist/spgscan.c
+++ b/src/backend/access/spgist/spgscan.c
@@ -421,6 +421,7 @@ spgrescan(IndexScanDesc scan, ScanKey scankey, int nscankeys,
/* count an indexscan for stats */
pgstat_count_index_scan(scan->indexRelation);
+ scan->nsearches++;
}
void
diff --git a/src/backend/commands/explain.c b/src/backend/commands/explain.c
index a201ed308..33ea21f38 100644
--- a/src/backend/commands/explain.c
+++ b/src/backend/commands/explain.c
@@ -13,6 +13,7 @@
*/
#include "postgres.h"
+#include "access/relscan.h"
#include "access/xact.h"
#include "catalog/pg_type.h"
#include "commands/createas.h"
@@ -88,6 +89,7 @@ static void show_plan_tlist(PlanState *planstate, List *ancestors,
static void show_expression(Node *node, const char *qlabel,
PlanState *planstate, List *ancestors,
bool useprefix, ExplainState *es);
+static void show_indexscan_nsearches(PlanState *planstate, ExplainState *es);
static void show_qual(List *qual, const char *qlabel,
PlanState *planstate, List *ancestors,
bool useprefix, ExplainState *es);
@@ -2105,6 +2107,7 @@ ExplainNode(PlanState *planstate, List *ancestors,
case T_IndexScan:
show_scan_qual(((IndexScan *) plan)->indexqualorig,
"Index Cond", planstate, ancestors, es);
+ show_indexscan_nsearches(planstate, es);
if (((IndexScan *) plan)->indexqualorig)
show_instrumentation_count("Rows Removed by Index Recheck", 2,
planstate, es);
@@ -2118,6 +2121,7 @@ ExplainNode(PlanState *planstate, List *ancestors,
case T_IndexOnlyScan:
show_scan_qual(((IndexOnlyScan *) plan)->indexqual,
"Index Cond", planstate, ancestors, es);
+ show_indexscan_nsearches(planstate, es);
if (((IndexOnlyScan *) plan)->recheckqual)
show_instrumentation_count("Rows Removed by Index Recheck", 2,
planstate, es);
@@ -2134,6 +2138,7 @@ ExplainNode(PlanState *planstate, List *ancestors,
case T_BitmapIndexScan:
show_scan_qual(((BitmapIndexScan *) plan)->indexqualorig,
"Index Cond", planstate, ancestors, es);
+ show_indexscan_nsearches(planstate, es);
break;
case T_BitmapHeapScan:
show_scan_qual(((BitmapHeapScan *) plan)->bitmapqualorig,
@@ -2652,6 +2657,47 @@ show_expression(Node *node, const char *qlabel,
ExplainPropertyText(qlabel, exprstr, es);
}
+/*
+ * Show the number of index searches within an IndexScan node, IndexOnlyScan
+ * node, or BitmapIndexScan node
+ */
+static void
+show_indexscan_nsearches(PlanState *planstate, ExplainState *es)
+{
+ Plan *plan = planstate->plan;
+ struct IndexScanDescData *scanDesc = NULL;
+ uint64 nsearches = 0;
+ double nloops;
+
+ if (!es->analyze)
+ return;
+
+ nloops = planstate->instrument->nloops;
+
+ switch (nodeTag(plan))
+ {
+ case T_IndexScan:
+ scanDesc = ((IndexScanState *) planstate)->iss_ScanDesc;
+ break;
+ case T_IndexOnlyScan:
+ scanDesc = ((IndexOnlyScanState *) planstate)->ioss_ScanDesc;
+ break;
+ case T_BitmapIndexScan:
+ scanDesc = ((BitmapIndexScanState *) planstate)->biss_ScanDesc;
+ break;
+ default:
+ break;
+ }
+
+ if (scanDesc)
+ nsearches = scanDesc->nsearches;
+
+ if (nloops > 0)
+ ExplainPropertyFloat("Index Searches", NULL, nsearches / nloops, 0, es);
+ else
+ ExplainPropertyFloat("Index Searches", NULL, 0.0, 0, es);
+}
+
/*
* Show a qualifier expression (which is a List with implicit AND semantics)
*/
diff --git a/contrib/bloom/blscan.c b/contrib/bloom/blscan.c
index 0c5fb725e..f77f716f0 100644
--- a/contrib/bloom/blscan.c
+++ b/contrib/bloom/blscan.c
@@ -116,6 +116,7 @@ blgetbitmap(IndexScanDesc scan, TIDBitmap *tbm)
bas = GetAccessStrategy(BAS_BULKREAD);
npages = RelationGetNumberOfBlocks(scan->indexRelation);
pgstat_count_index_scan(scan->indexRelation);
+ scan->nsearches++;
for (blkno = BLOOM_HEAD_BLKNO; blkno < npages; blkno++)
{
diff --git a/doc/src/sgml/bloom.sgml b/doc/src/sgml/bloom.sgml
index 6a8a60b8c..4cddfaae1 100644
--- a/doc/src/sgml/bloom.sgml
+++ b/doc/src/sgml/bloom.sgml
@@ -174,9 +174,10 @@ CREATE INDEX
-> Bitmap Index Scan on bloomidx (cost=0.00..178436.00 rows=1 width=0) (actual time=20.005..20.005 rows=2300 loops=1)
Index Cond: ((i2 = 898732) AND (i5 = 123451))
Buffers: shared hit=19608
+ Index Searches: 1
Planning Time: 0.099 ms
Execution Time: 22.632 ms
-(10 rows)
+(11 rows)
</programlisting>
</para>
@@ -209,12 +210,14 @@ CREATE INDEX
-> Bitmap Index Scan on btreeidx5 (cost=0.00..4.52 rows=11 width=0) (actual time=0.026..0.026 rows=7 loops=1)
Index Cond: (i5 = 123451)
Buffers: shared hit=3
+ Index Searches: 1
-> Bitmap Index Scan on btreeidx2 (cost=0.00..4.52 rows=11 width=0) (actual time=0.007..0.007 rows=8 loops=1)
Index Cond: (i2 = 898732)
Buffers: shared hit=3
+ Index Searches: 1
Planning Time: 0.264 ms
Execution Time: 0.047 ms
-(13 rows)
+(15 rows)
</programlisting>
Although this query runs much faster than with either of the single
indexes, we pay a penalty in index size. Each of the single-column
diff --git a/doc/src/sgml/monitoring.sgml b/doc/src/sgml/monitoring.sgml
index 840d7f816..c68eb770e 100644
--- a/doc/src/sgml/monitoring.sgml
+++ b/doc/src/sgml/monitoring.sgml
@@ -4198,12 +4198,18 @@ description | Waiting for a newly initialized WAL file to reach durable storage
Queries that use certain <acronym>SQL</acronym> constructs to search for
rows matching any value out of a list or array of multiple scalar values
(see <xref linkend="functions-comparisons"/>) perform multiple
- <quote>primitive</quote> index scans (up to one primitive scan per scalar
- value) during query execution. Each internal primitive index scan
- increments <structname>pg_stat_all_indexes</structname>.<structfield>idx_scan</structfield>,
+ index searches (up to one index search per scalar value) during query
+ execution. Each internal index search increments
+ <structname>pg_stat_all_indexes</structname>.<structfield>idx_scan</structfield>,
so it's possible for the count of index scans to significantly exceed the
total number of index scan executor node executions.
</para>
+ <para>
+ <command>EXPLAIN ANALYZE</command> breaks down the total number of index
+ searches performed by each index scan node. <literal>Index Searches: N</literal>
+ indicates the total number of searches across <emphasis>all</emphasis>
+ executor node executions/loops.
+ </para>
</note>
</sect2>
diff --git a/doc/src/sgml/perform.sgml b/doc/src/sgml/perform.sgml
index a502a2aab..34225c010 100644
--- a/doc/src/sgml/perform.sgml
+++ b/doc/src/sgml/perform.sgml
@@ -730,9 +730,11 @@ WHERE t1.unique1 < 10 AND t1.unique2 = t2.unique2;
-> Bitmap Index Scan on tenk1_unique1 (cost=0.00..4.36 rows=10 width=0) (actual time=0.004..0.004 rows=10 loops=1)
Index Cond: (unique1 < 10)
Buffers: shared hit=2
+ Index Searches: 1
-> Index Scan using tenk2_unique2 on tenk2 t2 (cost=0.29..7.90 rows=1 width=244) (actual time=0.003..0.003 rows=1 loops=10)
Index Cond: (unique2 = t1.unique2)
Buffers: shared hit=24 read=6
+ Index Searches: 1
Planning:
Buffers: shared hit=15 dirtied=9
Planning Time: 0.485 ms
@@ -791,6 +793,7 @@ WHERE t1.unique1 < 100 AND t1.unique2 = t2.unique2 ORDER BY t1.fivethous;
-> Bitmap Index Scan on tenk1_unique1 (cost=0.00..5.04 rows=100 width=0) (actual time=0.013..0.013 rows=100 loops=1)
Index Cond: (unique1 < 100)
Buffers: shared hit=2
+ Index Searches: 1
Planning:
Buffers: shared hit=12
Planning Time: 0.187 ms
@@ -860,6 +863,7 @@ EXPLAIN ANALYZE SELECT * FROM polygon_tbl WHERE f1 @> polygon '(0.5,2.0)';
-------------------------------------------------------------------&zwsp;-------------------------------------------------------
Index Scan using gpolygonind on polygon_tbl (cost=0.13..8.15 rows=1 width=85) (actual time=0.074..0.074 rows=0 loops=1)
Index Cond: (f1 @> '((0.5,2))'::polygon)
+ Index Searches: 1
Rows Removed by Index Recheck: 1
Buffers: shared hit=1
Planning Time: 0.039 ms
@@ -894,8 +898,10 @@ EXPLAIN (ANALYZE, BUFFERS OFF) SELECT * FROM tenk1 WHERE unique1 < 100 AND un
-> BitmapAnd (cost=25.07..25.07 rows=10 width=0) (actual time=0.100..0.101 rows=0 loops=1)
-> Bitmap Index Scan on tenk1_unique1 (cost=0.00..5.04 rows=100 width=0) (actual time=0.027..0.027 rows=100 loops=1)
Index Cond: (unique1 < 100)
+ Index Searches: 1
-> Bitmap Index Scan on tenk1_unique2 (cost=0.00..19.78 rows=999 width=0) (actual time=0.070..0.070 rows=999 loops=1)
Index Cond: (unique2 > 9000)
+ Index Searches: 1
Planning Time: 0.162 ms
Execution Time: 0.143 ms
</screen>
@@ -924,6 +930,7 @@ EXPLAIN ANALYZE UPDATE tenk1 SET hundred = hundred + 1 WHERE unique1 < 100;
-> Bitmap Index Scan on tenk1_unique1 (cost=0.00..5.04 rows=100 width=0) (actual time=0.031..0.031 rows=100 loops=1)
Index Cond: (unique1 < 100)
Buffers: shared read=2
+ Index Searches: 1
Planning Time: 0.151 ms
Execution Time: 1.856 ms
@@ -1059,6 +1066,7 @@ EXPLAIN ANALYZE SELECT * FROM tenk1 WHERE unique1 < 100 AND unique2 > 9000
Buffers: shared hit=16
-> Index Scan using tenk1_unique2 on tenk1 (cost=0.29..70.50 rows=10 width=244) (actual time=0.051..0.070 rows=2 loops=1)
Index Cond: (unique2 > 9000)
+ Index Searches: 1
Filter: (unique1 < 100)
Rows Removed by Filter: 287
Buffers: shared hit=16
diff --git a/doc/src/sgml/ref/explain.sgml b/doc/src/sgml/ref/explain.sgml
index 6361a14e6..6fc9bfedc 100644
--- a/doc/src/sgml/ref/explain.sgml
+++ b/doc/src/sgml/ref/explain.sgml
@@ -506,9 +506,10 @@ EXPLAIN ANALYZE EXECUTE query(100, 200);
-> Index Scan using test_pkey on test (cost=0.29..10.27 rows=99 width=8) (actual time=0.009..0.025 rows=99 loops=1)
Index Cond: ((id > 100) AND (id < 200))
Buffers: shared hit=4
+ Index Searches: 1
Planning Time: 0.244 ms
Execution Time: 0.073 ms
-(9 rows)
+(10 rows)
</programlisting>
</para>
diff --git a/doc/src/sgml/rules.sgml b/doc/src/sgml/rules.sgml
index 7a928bd7b..7a00e4c0e 100644
--- a/doc/src/sgml/rules.sgml
+++ b/doc/src/sgml/rules.sgml
@@ -1045,6 +1045,7 @@ SELECT count(*) FROM words WHERE word = 'caterpiler';
Aggregate (cost=4.44..4.45 rows=1 width=0) (actual time=0.042..0.042 rows=1 loops=1)
-> Index Only Scan using wrd_word on wrd (cost=0.42..4.44 rows=1 width=0) (actual time=0.039..0.039 rows=0 loops=1)
Index Cond: (word = 'caterpiler'::text)
+ Index Searches: 1
Heap Fetches: 0
Planning time: 0.164 ms
Execution time: 0.117 ms
diff --git a/src/test/regress/expected/brin_multi.out b/src/test/regress/expected/brin_multi.out
index f2d146581..a5df4107a 100644
--- a/src/test/regress/expected/brin_multi.out
+++ b/src/test/regress/expected/brin_multi.out
@@ -853,7 +853,8 @@ SELECT * FROM brin_date_test WHERE a = '2023-01-01'::date;
Recheck Cond: (a = '2023-01-01'::date)
-> Bitmap Index Scan on brin_date_test_a_idx (actual rows=0 loops=1)
Index Cond: (a = '2023-01-01'::date)
-(4 rows)
+ Index Searches: 1
+(5 rows)
DROP TABLE brin_date_test;
RESET enable_seqscan;
@@ -872,7 +873,8 @@ SELECT * FROM brin_timestamp_test WHERE a = '2023-01-01'::timestamp;
Recheck Cond: (a = '2023-01-01 00:00:00'::timestamp without time zone)
-> Bitmap Index Scan on brin_timestamp_test_a_idx (actual rows=0 loops=1)
Index Cond: (a = '2023-01-01 00:00:00'::timestamp without time zone)
-(4 rows)
+ Index Searches: 1
+(5 rows)
EXPLAIN (ANALYZE, TIMING OFF, COSTS OFF, SUMMARY OFF, BUFFERS OFF)
SELECT * FROM brin_timestamp_test WHERE a = '1900-01-01'::timestamp;
@@ -882,7 +884,8 @@ SELECT * FROM brin_timestamp_test WHERE a = '1900-01-01'::timestamp;
Recheck Cond: (a = '1900-01-01 00:00:00'::timestamp without time zone)
-> Bitmap Index Scan on brin_timestamp_test_a_idx (actual rows=0 loops=1)
Index Cond: (a = '1900-01-01 00:00:00'::timestamp without time zone)
-(4 rows)
+ Index Searches: 1
+(5 rows)
DROP TABLE brin_timestamp_test;
RESET enable_seqscan;
@@ -900,7 +903,8 @@ SELECT * FROM brin_date_test WHERE a = '2023-01-01'::date;
Recheck Cond: (a = '2023-01-01'::date)
-> Bitmap Index Scan on brin_date_test_a_idx (actual rows=0 loops=1)
Index Cond: (a = '2023-01-01'::date)
-(4 rows)
+ Index Searches: 1
+(5 rows)
EXPLAIN (ANALYZE, TIMING OFF, COSTS OFF, SUMMARY OFF, BUFFERS OFF)
SELECT * FROM brin_date_test WHERE a = '1900-01-01'::date;
@@ -910,7 +914,8 @@ SELECT * FROM brin_date_test WHERE a = '1900-01-01'::date;
Recheck Cond: (a = '1900-01-01'::date)
-> Bitmap Index Scan on brin_date_test_a_idx (actual rows=0 loops=1)
Index Cond: (a = '1900-01-01'::date)
-(4 rows)
+ Index Searches: 1
+(5 rows)
DROP TABLE brin_date_test;
RESET enable_seqscan;
@@ -929,7 +934,8 @@ SELECT * FROM brin_interval_test WHERE a = '-30 years'::interval;
Recheck Cond: (a = '@ 30 years ago'::interval)
-> Bitmap Index Scan on brin_interval_test_a_idx (actual rows=0 loops=1)
Index Cond: (a = '@ 30 years ago'::interval)
-(4 rows)
+ Index Searches: 1
+(5 rows)
EXPLAIN (ANALYZE, TIMING OFF, COSTS OFF, SUMMARY OFF, BUFFERS OFF)
SELECT * FROM brin_interval_test WHERE a = '30 years'::interval;
@@ -939,7 +945,8 @@ SELECT * FROM brin_interval_test WHERE a = '30 years'::interval;
Recheck Cond: (a = '@ 30 years'::interval)
-> Bitmap Index Scan on brin_interval_test_a_idx (actual rows=0 loops=1)
Index Cond: (a = '@ 30 years'::interval)
-(4 rows)
+ Index Searches: 1
+(5 rows)
DROP TABLE brin_interval_test;
RESET enable_seqscan;
@@ -957,7 +964,8 @@ SELECT * FROM brin_interval_test WHERE a = '-30 years'::interval;
Recheck Cond: (a = '@ 30 years ago'::interval)
-> Bitmap Index Scan on brin_interval_test_a_idx (actual rows=0 loops=1)
Index Cond: (a = '@ 30 years ago'::interval)
-(4 rows)
+ Index Searches: 1
+(5 rows)
EXPLAIN (ANALYZE, TIMING OFF, COSTS OFF, SUMMARY OFF, BUFFERS OFF)
SELECT * FROM brin_interval_test WHERE a = '30 years'::interval;
@@ -967,7 +975,8 @@ SELECT * FROM brin_interval_test WHERE a = '30 years'::interval;
Recheck Cond: (a = '@ 30 years'::interval)
-> Bitmap Index Scan on brin_interval_test_a_idx (actual rows=0 loops=1)
Index Cond: (a = '@ 30 years'::interval)
-(4 rows)
+ Index Searches: 1
+(5 rows)
DROP TABLE brin_interval_test;
RESET enable_seqscan;
diff --git a/src/test/regress/expected/memoize.out b/src/test/regress/expected/memoize.out
index 5ecf971da..d9df5c0e1 100644
--- a/src/test/regress/expected/memoize.out
+++ b/src/test/regress/expected/memoize.out
@@ -22,8 +22,10 @@ begin
ln := regexp_replace(ln, 'Evictions: 0', 'Evictions: Zero');
ln := regexp_replace(ln, 'Evictions: \d+', 'Evictions: N');
ln := regexp_replace(ln, 'Memory Usage: \d+', 'Memory Usage: N');
- ln := regexp_replace(ln, 'Heap Fetches: \d+', 'Heap Fetches: N');
- ln := regexp_replace(ln, 'loops=\d+', 'loops=N');
+ ln := regexp_replace(ln, 'Heap Fetches: \d+', 'Heap Fetches: N');
+ ln := regexp_replace(ln, 'loops=\d+', 'loops=N');
+ ln := regexp_replace(ln, 'Index Searches: 0', 'Index Searches: Zero');
+ ln := regexp_replace(ln, 'Index Searches: \d+', 'Index Searches: N');
return next ln;
end loop;
end;
@@ -48,8 +50,9 @@ WHERE t2.unique1 < 1000;', false);
Hits: 980 Misses: 20 Evictions: Zero Overflows: 0 Memory Usage: NkB
-> Index Only Scan using tenk1_unique1 on tenk1 t1 (actual rows=1 loops=N)
Index Cond: (unique1 = t2.twenty)
+ Index Searches: N
Heap Fetches: N
-(12 rows)
+(13 rows)
-- And check we get the expected results.
SELECT COUNT(*),AVG(t1.unique1) FROM tenk1 t1
@@ -79,8 +82,9 @@ WHERE t1.unique1 < 1000;', false);
Hits: 980 Misses: 20 Evictions: Zero Overflows: 0 Memory Usage: NkB
-> Index Only Scan using tenk1_unique1 on tenk1 t2 (actual rows=1 loops=N)
Index Cond: (unique1 = t1.twenty)
+ Index Searches: N
Heap Fetches: N
-(12 rows)
+(13 rows)
-- And check we get the expected results.
SELECT COUNT(*),AVG(t2.unique1) FROM tenk1 t1,
@@ -106,6 +110,7 @@ WHERE t1.unique1 < 10;', false);
-> Nested Loop Left Join (actual rows=20 loops=N)
-> Index Scan using tenk1_unique1 on tenk1 t1 (actual rows=10 loops=N)
Index Cond: (unique1 < 10)
+ Index Searches: N
-> Memoize (actual rows=2 loops=N)
Cache Key: t1.two
Cache Mode: binary
@@ -115,7 +120,8 @@ WHERE t1.unique1 < 10;', false);
Rows Removed by Filter: 2
-> Index Scan using tenk1_unique1 on tenk1 t2_1 (actual rows=4 loops=N)
Index Cond: (unique1 < 4)
-(13 rows)
+ Index Searches: N
+(15 rows)
-- And check we get the expected results.
SELECT COUNT(*),AVG(t2.t1two) FROM tenk1 t1 LEFT JOIN
@@ -146,10 +152,11 @@ WHERE s.c1 = s.c2 AND t1.unique1 < 1000;', false);
Cache Mode: binary
Hits: 998 Misses: 2 Evictions: Zero Overflows: 0 Memory Usage: NkB
-> Index Only Scan using tenk1_unique1 on tenk1 t2 (actual rows=1 loops=N)
+ Index Searches: N
Filter: ((t1.two + 1) = unique1)
Rows Removed by Filter: 9999
Heap Fetches: N
-(13 rows)
+(14 rows)
-- And check we get the expected results.
SELECT COUNT(*), AVG(t1.twenty) FROM tenk1 t1 LEFT JOIN
@@ -217,9 +224,10 @@ ON t1.x = t2.t::numeric AND t1.t::numeric = t2.x;', false);
Hits: 20 Misses: 20 Evictions: Zero Overflows: 0 Memory Usage: NkB
-> Index Only Scan using expr_key_idx_x_t on expr_key t2 (actual rows=2 loops=N)
Index Cond: (x = (t1.t)::numeric)
+ Index Searches: N
Filter: (t1.x = (t)::numeric)
Heap Fetches: N
-(10 rows)
+(11 rows)
DROP TABLE expr_key;
-- Reduce work_mem and hash_mem_multiplier so that we see some cache evictions
@@ -245,8 +253,9 @@ WHERE t2.unique1 < 1200;', true);
Hits: N Misses: N Evictions: N Overflows: 0 Memory Usage: NkB
-> Index Only Scan using tenk1_unique1 on tenk1 t1 (actual rows=1 loops=N)
Index Cond: (unique1 = t2.thousand)
+ Index Searches: N
Heap Fetches: N
-(12 rows)
+(13 rows)
CREATE TABLE flt (f float);
CREATE INDEX flt_f_idx ON flt (f);
@@ -260,6 +269,7 @@ SELECT * FROM flt f1 INNER JOIN flt f2 ON f1.f = f2.f;', false);
-------------------------------------------------------------------------------
Nested Loop (actual rows=4 loops=N)
-> Index Only Scan using flt_f_idx on flt f1 (actual rows=2 loops=N)
+ Index Searches: N
Heap Fetches: N
-> Memoize (actual rows=2 loops=N)
Cache Key: f1.f
@@ -267,8 +277,9 @@ SELECT * FROM flt f1 INNER JOIN flt f2 ON f1.f = f2.f;', false);
Hits: 1 Misses: 1 Evictions: Zero Overflows: 0 Memory Usage: NkB
-> Index Only Scan using flt_f_idx on flt f2 (actual rows=2 loops=N)
Index Cond: (f = f1.f)
+ Index Searches: N
Heap Fetches: N
-(10 rows)
+(12 rows)
-- Ensure memoize operates in binary mode
SELECT explain_memoize('
@@ -277,6 +288,7 @@ SELECT * FROM flt f1 INNER JOIN flt f2 ON f1.f >= f2.f;', false);
-------------------------------------------------------------------------------
Nested Loop (actual rows=4 loops=N)
-> Index Only Scan using flt_f_idx on flt f1 (actual rows=2 loops=N)
+ Index Searches: N
Heap Fetches: N
-> Memoize (actual rows=2 loops=N)
Cache Key: f1.f
@@ -284,8 +296,9 @@ SELECT * FROM flt f1 INNER JOIN flt f2 ON f1.f >= f2.f;', false);
Hits: 0 Misses: 2 Evictions: Zero Overflows: 0 Memory Usage: NkB
-> Index Only Scan using flt_f_idx on flt f2 (actual rows=2 loops=N)
Index Cond: (f <= f1.f)
+ Index Searches: N
Heap Fetches: N
-(10 rows)
+(12 rows)
DROP TABLE flt;
-- Exercise Memoize in binary mode with a large fixed width type and a
@@ -311,7 +324,8 @@ SELECT * FROM strtest s1 INNER JOIN strtest s2 ON s1.n >= s2.n;', false);
Hits: 3 Misses: 3 Evictions: Zero Overflows: 0 Memory Usage: NkB
-> Index Scan using strtest_n_idx on strtest s2 (actual rows=4 loops=N)
Index Cond: (n <= s1.n)
-(9 rows)
+ Index Searches: N
+(10 rows)
-- Ensure we get 3 hits and 3 misses
SELECT explain_memoize('
@@ -327,7 +341,8 @@ SELECT * FROM strtest s1 INNER JOIN strtest s2 ON s1.t >= s2.t;', false);
Hits: 3 Misses: 3 Evictions: Zero Overflows: 0 Memory Usage: NkB
-> Index Scan using strtest_t_idx on strtest s2 (actual rows=4 loops=N)
Index Cond: (t <= s1.t)
-(9 rows)
+ Index Searches: N
+(10 rows)
DROP TABLE strtest;
-- Ensure memoize works with partitionwise join
@@ -347,6 +362,7 @@ SELECT * FROM prt t1 INNER JOIN prt t2 ON t1.a = t2.a;', false);
Append (actual rows=32 loops=N)
-> Nested Loop (actual rows=16 loops=N)
-> Index Only Scan using iprt_p1_a on prt_p1 t1_1 (actual rows=4 loops=N)
+ Index Searches: N
Heap Fetches: N
-> Memoize (actual rows=4 loops=N)
Cache Key: t1_1.a
@@ -354,9 +370,11 @@ SELECT * FROM prt t1 INNER JOIN prt t2 ON t1.a = t2.a;', false);
Hits: 3 Misses: 1 Evictions: Zero Overflows: 0 Memory Usage: NkB
-> Index Only Scan using iprt_p1_a on prt_p1 t2_1 (actual rows=4 loops=N)
Index Cond: (a = t1_1.a)
+ Index Searches: N
Heap Fetches: N
-> Nested Loop (actual rows=16 loops=N)
-> Index Only Scan using iprt_p2_a on prt_p2 t1_2 (actual rows=4 loops=N)
+ Index Searches: N
Heap Fetches: N
-> Memoize (actual rows=4 loops=N)
Cache Key: t1_2.a
@@ -364,8 +382,9 @@ SELECT * FROM prt t1 INNER JOIN prt t2 ON t1.a = t2.a;', false);
Hits: 3 Misses: 1 Evictions: Zero Overflows: 0 Memory Usage: NkB
-> Index Only Scan using iprt_p2_a on prt_p2 t2_2 (actual rows=4 loops=N)
Index Cond: (a = t1_2.a)
+ Index Searches: N
Heap Fetches: N
-(21 rows)
+(25 rows)
-- Ensure memoize works with parameterized union-all Append path
SET enable_partitionwise_join TO off;
@@ -377,6 +396,7 @@ ON t1.a = t2.a;', false);
-------------------------------------------------------------------------------------
Nested Loop (actual rows=16 loops=N)
-> Index Only Scan using iprt_p1_a on prt_p1 t1 (actual rows=4 loops=N)
+ Index Searches: N
Heap Fetches: N
-> Memoize (actual rows=4 loops=N)
Cache Key: t1.a
@@ -385,11 +405,13 @@ ON t1.a = t2.a;', false);
-> Append (actual rows=4 loops=N)
-> Index Only Scan using iprt_p1_a on prt_p1 (actual rows=4 loops=N)
Index Cond: (a = t1.a)
+ Index Searches: N
Heap Fetches: N
-> Index Only Scan using iprt_p2_a on prt_p2 (actual rows=0 loops=N)
Index Cond: (a = t1.a)
+ Index Searches: N
Heap Fetches: N
-(14 rows)
+(17 rows)
DROP TABLE prt;
RESET enable_partitionwise_join;
diff --git a/src/test/regress/expected/partition_prune.out b/src/test/regress/expected/partition_prune.out
index c52bc40e8..b053891b6 100644
--- a/src/test/regress/expected/partition_prune.out
+++ b/src/test/regress/expected/partition_prune.out
@@ -2340,6 +2340,10 @@ begin
ln := regexp_replace(ln, 'Workers Launched: \d+', 'Workers Launched: N');
ln := regexp_replace(ln, 'actual rows=\d+ loops=\d+', 'actual rows=N loops=N');
ln := regexp_replace(ln, 'Rows Removed by Filter: \d+', 'Rows Removed by Filter: N');
+ perform regexp_matches(ln, 'Index Searches: \d+');
+ if found then
+ continue;
+ end if;
return next ln;
end loop;
end;
@@ -2657,47 +2661,56 @@ select * from ab where a = (select max(a) from lprt_a) and b = (select max(a)-1
Filter: (b = (InitPlan 2).col1)
-> Bitmap Index Scan on ab_a1_b1_a_idx (never executed)
Index Cond: (a = (InitPlan 1).col1)
+ Index Searches: 0
-> Bitmap Heap Scan on ab_a1_b2 ab_2 (never executed)
Recheck Cond: (a = (InitPlan 1).col1)
Filter: (b = (InitPlan 2).col1)
-> Bitmap Index Scan on ab_a1_b2_a_idx (never executed)
Index Cond: (a = (InitPlan 1).col1)
+ Index Searches: 0
-> Bitmap Heap Scan on ab_a1_b3 ab_3 (never executed)
Recheck Cond: (a = (InitPlan 1).col1)
Filter: (b = (InitPlan 2).col1)
-> Bitmap Index Scan on ab_a1_b3_a_idx (never executed)
Index Cond: (a = (InitPlan 1).col1)
+ Index Searches: 0
-> Bitmap Heap Scan on ab_a2_b1 ab_4 (never executed)
Recheck Cond: (a = (InitPlan 1).col1)
Filter: (b = (InitPlan 2).col1)
-> Bitmap Index Scan on ab_a2_b1_a_idx (never executed)
Index Cond: (a = (InitPlan 1).col1)
+ Index Searches: 0
-> Bitmap Heap Scan on ab_a2_b2 ab_5 (never executed)
Recheck Cond: (a = (InitPlan 1).col1)
Filter: (b = (InitPlan 2).col1)
-> Bitmap Index Scan on ab_a2_b2_a_idx (never executed)
Index Cond: (a = (InitPlan 1).col1)
+ Index Searches: 0
-> Bitmap Heap Scan on ab_a2_b3 ab_6 (never executed)
Recheck Cond: (a = (InitPlan 1).col1)
Filter: (b = (InitPlan 2).col1)
-> Bitmap Index Scan on ab_a2_b3_a_idx (never executed)
Index Cond: (a = (InitPlan 1).col1)
+ Index Searches: 0
-> Bitmap Heap Scan on ab_a3_b1 ab_7 (never executed)
Recheck Cond: (a = (InitPlan 1).col1)
Filter: (b = (InitPlan 2).col1)
-> Bitmap Index Scan on ab_a3_b1_a_idx (never executed)
Index Cond: (a = (InitPlan 1).col1)
+ Index Searches: 0
-> Bitmap Heap Scan on ab_a3_b2 ab_8 (actual rows=0 loops=1)
Recheck Cond: (a = (InitPlan 1).col1)
Filter: (b = (InitPlan 2).col1)
-> Bitmap Index Scan on ab_a3_b2_a_idx (actual rows=0 loops=1)
Index Cond: (a = (InitPlan 1).col1)
+ Index Searches: 1
-> Bitmap Heap Scan on ab_a3_b3 ab_9 (never executed)
Recheck Cond: (a = (InitPlan 1).col1)
Filter: (b = (InitPlan 2).col1)
-> Bitmap Index Scan on ab_a3_b3_a_idx (never executed)
Index Cond: (a = (InitPlan 1).col1)
-(52 rows)
+ Index Searches: 0
+(61 rows)
-- Test run-time partition pruning with UNION ALL parents
explain (analyze, costs off, summary off, timing off, buffers off)
@@ -2713,16 +2726,19 @@ select * from (select * from ab where a = 1 union all select * from ab) ab where
Filter: (b = (InitPlan 1).col1)
-> Bitmap Index Scan on ab_a1_b1_a_idx (actual rows=0 loops=1)
Index Cond: (a = 1)
+ Index Searches: 1
-> Bitmap Heap Scan on ab_a1_b2 ab_12 (never executed)
Recheck Cond: (a = 1)
Filter: (b = (InitPlan 1).col1)
-> Bitmap Index Scan on ab_a1_b2_a_idx (never executed)
Index Cond: (a = 1)
+ Index Searches: 0
-> Bitmap Heap Scan on ab_a1_b3 ab_13 (never executed)
Recheck Cond: (a = 1)
Filter: (b = (InitPlan 1).col1)
-> Bitmap Index Scan on ab_a1_b3_a_idx (never executed)
Index Cond: (a = 1)
+ Index Searches: 0
-> Seq Scan on ab_a1_b1 ab_1 (actual rows=0 loops=1)
Filter: (b = (InitPlan 1).col1)
-> Seq Scan on ab_a1_b2 ab_2 (never executed)
@@ -2741,7 +2757,7 @@ select * from (select * from ab where a = 1 union all select * from ab) ab where
Filter: (b = (InitPlan 1).col1)
-> Seq Scan on ab_a3_b3 ab_9 (never executed)
Filter: (b = (InitPlan 1).col1)
-(37 rows)
+(40 rows)
-- A case containing a UNION ALL with a non-partitioned child.
explain (analyze, costs off, summary off, timing off, buffers off)
@@ -2757,16 +2773,19 @@ select * from (select * from ab where a = 1 union all (values(10,5)) union all s
Filter: (b = (InitPlan 1).col1)
-> Bitmap Index Scan on ab_a1_b1_a_idx (actual rows=0 loops=1)
Index Cond: (a = 1)
+ Index Searches: 1
-> Bitmap Heap Scan on ab_a1_b2 ab_12 (never executed)
Recheck Cond: (a = 1)
Filter: (b = (InitPlan 1).col1)
-> Bitmap Index Scan on ab_a1_b2_a_idx (never executed)
Index Cond: (a = 1)
+ Index Searches: 0
-> Bitmap Heap Scan on ab_a1_b3 ab_13 (never executed)
Recheck Cond: (a = 1)
Filter: (b = (InitPlan 1).col1)
-> Bitmap Index Scan on ab_a1_b3_a_idx (never executed)
Index Cond: (a = 1)
+ Index Searches: 0
-> Result (actual rows=0 loops=1)
One-Time Filter: (5 = (InitPlan 1).col1)
-> Seq Scan on ab_a1_b1 ab_1 (actual rows=0 loops=1)
@@ -2787,7 +2806,7 @@ select * from (select * from ab where a = 1 union all (values(10,5)) union all s
Filter: (b = (InitPlan 1).col1)
-> Seq Scan on ab_a3_b3 ab_9 (never executed)
Filter: (b = (InitPlan 1).col1)
-(39 rows)
+(42 rows)
-- Another UNION ALL test, but containing a mix of exec init and exec run-time pruning.
create table xy_1 (x int, y int);
@@ -2858,16 +2877,19 @@ update ab_a1 set b = 3 from ab where ab.a = 1 and ab.a = ab_a1.a;');
Recheck Cond: (a = 1)
-> Bitmap Index Scan on ab_a1_b1_a_idx (actual rows=0 loops=1)
Index Cond: (a = 1)
+ Index Searches: 1
-> Bitmap Heap Scan on ab_a1_b2 ab_a1_2 (actual rows=1 loops=1)
Recheck Cond: (a = 1)
Heap Blocks: exact=1
-> Bitmap Index Scan on ab_a1_b2_a_idx (actual rows=1 loops=1)
Index Cond: (a = 1)
+ Index Searches: 1
-> Bitmap Heap Scan on ab_a1_b3 ab_a1_3 (actual rows=0 loops=1)
Recheck Cond: (a = 1)
Heap Blocks: exact=1
-> Bitmap Index Scan on ab_a1_b3_a_idx (actual rows=1 loops=1)
Index Cond: (a = 1)
+ Index Searches: 1
-> Materialize (actual rows=1 loops=1)
Storage: Memory Maximum Storage: NkB
-> Append (actual rows=1 loops=1)
@@ -2875,17 +2897,20 @@ update ab_a1 set b = 3 from ab where ab.a = 1 and ab.a = ab_a1.a;');
Recheck Cond: (a = 1)
-> Bitmap Index Scan on ab_a1_b1_a_idx (actual rows=0 loops=1)
Index Cond: (a = 1)
+ Index Searches: 1
-> Bitmap Heap Scan on ab_a1_b2 ab_2 (actual rows=1 loops=1)
Recheck Cond: (a = 1)
Heap Blocks: exact=1
-> Bitmap Index Scan on ab_a1_b2_a_idx (actual rows=1 loops=1)
Index Cond: (a = 1)
+ Index Searches: 1
-> Bitmap Heap Scan on ab_a1_b3 ab_3 (actual rows=0 loops=1)
Recheck Cond: (a = 1)
Heap Blocks: exact=1
-> Bitmap Index Scan on ab_a1_b3_a_idx (actual rows=1 loops=1)
Index Cond: (a = 1)
-(37 rows)
+ Index Searches: 1
+(43 rows)
table ab;
a | b
@@ -2961,17 +2986,23 @@ select * from tbl1 join tprt on tbl1.col1 > tprt.col1;
-> Append (actual rows=3 loops=2)
-> Index Scan using tprt1_idx on tprt_1 (actual rows=2 loops=2)
Index Cond: (col1 < tbl1.col1)
+ Index Searches: 1
-> Index Scan using tprt2_idx on tprt_2 (actual rows=2 loops=1)
Index Cond: (col1 < tbl1.col1)
+ Index Searches: 1
-> Index Scan using tprt3_idx on tprt_3 (never executed)
Index Cond: (col1 < tbl1.col1)
+ Index Searches: 0
-> Index Scan using tprt4_idx on tprt_4 (never executed)
Index Cond: (col1 < tbl1.col1)
+ Index Searches: 0
-> Index Scan using tprt5_idx on tprt_5 (never executed)
Index Cond: (col1 < tbl1.col1)
+ Index Searches: 0
-> Index Scan using tprt6_idx on tprt_6 (never executed)
Index Cond: (col1 < tbl1.col1)
-(15 rows)
+ Index Searches: 0
+(21 rows)
explain (analyze, costs off, summary off, timing off, buffers off)
select * from tbl1 join tprt on tbl1.col1 = tprt.col1;
@@ -2982,17 +3013,23 @@ select * from tbl1 join tprt on tbl1.col1 = tprt.col1;
-> Append (actual rows=1 loops=2)
-> Index Scan using tprt1_idx on tprt_1 (never executed)
Index Cond: (col1 = tbl1.col1)
+ Index Searches: 0
-> Index Scan using tprt2_idx on tprt_2 (actual rows=1 loops=2)
Index Cond: (col1 = tbl1.col1)
+ Index Searches: 1
-> Index Scan using tprt3_idx on tprt_3 (never executed)
Index Cond: (col1 = tbl1.col1)
+ Index Searches: 0
-> Index Scan using tprt4_idx on tprt_4 (never executed)
Index Cond: (col1 = tbl1.col1)
+ Index Searches: 0
-> Index Scan using tprt5_idx on tprt_5 (never executed)
Index Cond: (col1 = tbl1.col1)
+ Index Searches: 0
-> Index Scan using tprt6_idx on tprt_6 (never executed)
Index Cond: (col1 = tbl1.col1)
-(15 rows)
+ Index Searches: 0
+(21 rows)
select tbl1.col1, tprt.col1 from tbl1
inner join tprt on tbl1.col1 > tprt.col1
@@ -3027,17 +3064,23 @@ select * from tbl1 inner join tprt on tbl1.col1 > tprt.col1;
-> Append (actual rows=5 loops=5)
-> Index Scan using tprt1_idx on tprt_1 (actual rows=2 loops=5)
Index Cond: (col1 < tbl1.col1)
+ Index Searches: 1
-> Index Scan using tprt2_idx on tprt_2 (actual rows=3 loops=4)
Index Cond: (col1 < tbl1.col1)
+ Index Searches: 1
-> Index Scan using tprt3_idx on tprt_3 (actual rows=1 loops=2)
Index Cond: (col1 < tbl1.col1)
+ Index Searches: 1
-> Index Scan using tprt4_idx on tprt_4 (never executed)
Index Cond: (col1 < tbl1.col1)
+ Index Searches: 0
-> Index Scan using tprt5_idx on tprt_5 (never executed)
Index Cond: (col1 < tbl1.col1)
+ Index Searches: 0
-> Index Scan using tprt6_idx on tprt_6 (never executed)
Index Cond: (col1 < tbl1.col1)
-(15 rows)
+ Index Searches: 0
+(21 rows)
explain (analyze, costs off, summary off, timing off, buffers off)
select * from tbl1 inner join tprt on tbl1.col1 = tprt.col1;
@@ -3048,17 +3091,23 @@ select * from tbl1 inner join tprt on tbl1.col1 = tprt.col1;
-> Append (actual rows=1 loops=5)
-> Index Scan using tprt1_idx on tprt_1 (never executed)
Index Cond: (col1 = tbl1.col1)
+ Index Searches: 0
-> Index Scan using tprt2_idx on tprt_2 (actual rows=1 loops=2)
Index Cond: (col1 = tbl1.col1)
+ Index Searches: 1
-> Index Scan using tprt3_idx on tprt_3 (actual rows=0 loops=3)
Index Cond: (col1 = tbl1.col1)
+ Index Searches: 1
-> Index Scan using tprt4_idx on tprt_4 (never executed)
Index Cond: (col1 = tbl1.col1)
+ Index Searches: 0
-> Index Scan using tprt5_idx on tprt_5 (never executed)
Index Cond: (col1 = tbl1.col1)
+ Index Searches: 0
-> Index Scan using tprt6_idx on tprt_6 (never executed)
Index Cond: (col1 = tbl1.col1)
-(15 rows)
+ Index Searches: 0
+(21 rows)
select tbl1.col1, tprt.col1 from tbl1
inner join tprt on tbl1.col1 > tprt.col1
@@ -3112,17 +3161,23 @@ select * from tbl1 join tprt on tbl1.col1 < tprt.col1;
-> Append (actual rows=1 loops=1)
-> Index Scan using tprt1_idx on tprt_1 (never executed)
Index Cond: (col1 > tbl1.col1)
+ Index Searches: 0
-> Index Scan using tprt2_idx on tprt_2 (never executed)
Index Cond: (col1 > tbl1.col1)
+ Index Searches: 0
-> Index Scan using tprt3_idx on tprt_3 (never executed)
Index Cond: (col1 > tbl1.col1)
+ Index Searches: 0
-> Index Scan using tprt4_idx on tprt_4 (never executed)
Index Cond: (col1 > tbl1.col1)
+ Index Searches: 0
-> Index Scan using tprt5_idx on tprt_5 (never executed)
Index Cond: (col1 > tbl1.col1)
+ Index Searches: 0
-> Index Scan using tprt6_idx on tprt_6 (actual rows=1 loops=1)
Index Cond: (col1 > tbl1.col1)
-(15 rows)
+ Index Searches: 1
+(21 rows)
select tbl1.col1, tprt.col1 from tbl1
inner join tprt on tbl1.col1 < tprt.col1
@@ -3144,17 +3199,23 @@ select * from tbl1 join tprt on tbl1.col1 = tprt.col1;
-> Append (actual rows=0 loops=1)
-> Index Scan using tprt1_idx on tprt_1 (never executed)
Index Cond: (col1 = tbl1.col1)
+ Index Searches: 0
-> Index Scan using tprt2_idx on tprt_2 (never executed)
Index Cond: (col1 = tbl1.col1)
+ Index Searches: 0
-> Index Scan using tprt3_idx on tprt_3 (never executed)
Index Cond: (col1 = tbl1.col1)
+ Index Searches: 0
-> Index Scan using tprt4_idx on tprt_4 (never executed)
Index Cond: (col1 = tbl1.col1)
+ Index Searches: 0
-> Index Scan using tprt5_idx on tprt_5 (never executed)
Index Cond: (col1 = tbl1.col1)
+ Index Searches: 0
-> Index Scan using tprt6_idx on tprt_6 (never executed)
Index Cond: (col1 = tbl1.col1)
-(15 rows)
+ Index Searches: 0
+(21 rows)
select tbl1.col1, tprt.col1 from tbl1
inner join tprt on tbl1.col1 = tprt.col1
@@ -3482,12 +3543,14 @@ explain (analyze, costs off, summary off, timing off, buffers off) execute mt_q1
Sort Key: ma_test.b
Subplans Removed: 1
-> Index Scan using ma_test_p2_b_idx on ma_test_p2 ma_test_1 (actual rows=1 loops=1)
+ Index Searches: 1
Filter: ((a >= $1) AND ((a % 10) = 5))
Rows Removed by Filter: 9
-> Index Scan using ma_test_p3_b_idx on ma_test_p3 ma_test_2 (actual rows=1 loops=1)
+ Index Searches: 1
Filter: ((a >= $1) AND ((a % 10) = 5))
Rows Removed by Filter: 9
-(9 rows)
+(11 rows)
execute mt_q1(15);
a
@@ -3503,9 +3566,10 @@ explain (analyze, costs off, summary off, timing off, buffers off) execute mt_q1
Sort Key: ma_test.b
Subplans Removed: 2
-> Index Scan using ma_test_p3_b_idx on ma_test_p3 ma_test_1 (actual rows=1 loops=1)
+ Index Searches: 1
Filter: ((a >= $1) AND ((a % 10) = 5))
Rows Removed by Filter: 9
-(6 rows)
+(7 rows)
execute mt_q1(25);
a
@@ -3553,13 +3617,17 @@ explain (analyze, costs off, summary off, timing off, buffers off) select * from
-> Limit (actual rows=1 loops=1)
-> Index Scan using ma_test_p2_b_idx on ma_test_p2 (actual rows=1 loops=1)
Index Cond: (b IS NOT NULL)
+ Index Searches: 1
-> Index Scan using ma_test_p1_b_idx on ma_test_p1 ma_test_1 (never executed)
+ Index Searches: 0
Filter: (a >= (InitPlan 2).col1)
-> Index Scan using ma_test_p2_b_idx on ma_test_p2 ma_test_2 (actual rows=10 loops=1)
+ Index Searches: 1
Filter: (a >= (InitPlan 2).col1)
-> Index Scan using ma_test_p3_b_idx on ma_test_p3 ma_test_3 (actual rows=10 loops=1)
+ Index Searches: 1
Filter: (a >= (InitPlan 2).col1)
-(14 rows)
+(18 rows)
reset enable_seqscan;
reset enable_sort;
@@ -4129,14 +4197,18 @@ select * from rangep where b IN((select 1),(select 2)) order by a;
-> Merge Append (actual rows=0 loops=1)
Sort Key: rangep_2.a
-> Index Scan using rangep_0_to_100_1_a_idx on rangep_0_to_100_1 rangep_2 (actual rows=0 loops=1)
+ Index Searches: 1
Filter: (b = ANY (ARRAY[(InitPlan 1).col1, (InitPlan 2).col1]))
-> Index Scan using rangep_0_to_100_2_a_idx on rangep_0_to_100_2 rangep_3 (actual rows=0 loops=1)
+ Index Searches: 1
Filter: (b = ANY (ARRAY[(InitPlan 1).col1, (InitPlan 2).col1]))
-> Index Scan using rangep_0_to_100_3_a_idx on rangep_0_to_100_3 rangep_4 (never executed)
+ Index Searches: 0
Filter: (b = ANY (ARRAY[(InitPlan 1).col1, (InitPlan 2).col1]))
-> Index Scan using rangep_100_to_200_a_idx on rangep_100_to_200 rangep_5 (actual rows=0 loops=1)
+ Index Searches: 1
Filter: (b = ANY (ARRAY[(InitPlan 1).col1, (InitPlan 2).col1]))
-(15 rows)
+(19 rows)
reset enable_sort;
drop table rangep;
diff --git a/src/test/regress/expected/select.out b/src/test/regress/expected/select.out
index 88911ca2b..cde2eac73 100644
--- a/src/test/regress/expected/select.out
+++ b/src/test/regress/expected/select.out
@@ -763,8 +763,9 @@ select * from onek2 where unique2 = 11 and stringu1 = 'ATAAAA';
-----------------------------------------------------------------
Index Scan using onek2_u2_prtl on onek2 (actual rows=1 loops=1)
Index Cond: (unique2 = 11)
+ Index Searches: 1
Filter: (stringu1 = 'ATAAAA'::name)
-(3 rows)
+(4 rows)
explain (costs off)
select unique2 from onek2 where unique2 = 11 and stringu1 = 'ATAAAA';
diff --git a/src/test/regress/sql/memoize.sql b/src/test/regress/sql/memoize.sql
index d5aab4e56..2197b0ac5 100644
--- a/src/test/regress/sql/memoize.sql
+++ b/src/test/regress/sql/memoize.sql
@@ -23,8 +23,10 @@ begin
ln := regexp_replace(ln, 'Evictions: 0', 'Evictions: Zero');
ln := regexp_replace(ln, 'Evictions: \d+', 'Evictions: N');
ln := regexp_replace(ln, 'Memory Usage: \d+', 'Memory Usage: N');
- ln := regexp_replace(ln, 'Heap Fetches: \d+', 'Heap Fetches: N');
- ln := regexp_replace(ln, 'loops=\d+', 'loops=N');
+ ln := regexp_replace(ln, 'Heap Fetches: \d+', 'Heap Fetches: N');
+ ln := regexp_replace(ln, 'loops=\d+', 'loops=N');
+ ln := regexp_replace(ln, 'Index Searches: 0', 'Index Searches: Zero');
+ ln := regexp_replace(ln, 'Index Searches: \d+', 'Index Searches: N');
return next ln;
end loop;
end;
diff --git a/src/test/regress/sql/partition_prune.sql b/src/test/regress/sql/partition_prune.sql
index d67598d5c..5492bd6e9 100644
--- a/src/test/regress/sql/partition_prune.sql
+++ b/src/test/regress/sql/partition_prune.sql
@@ -573,6 +573,10 @@ begin
ln := regexp_replace(ln, 'Workers Launched: \d+', 'Workers Launched: N');
ln := regexp_replace(ln, 'actual rows=\d+ loops=\d+', 'actual rows=N loops=N');
ln := regexp_replace(ln, 'Rows Removed by Filter: \d+', 'Rows Removed by Filter: N');
+ perform regexp_matches(ln, 'Index Searches: \d+');
+ if found then
+ continue;
+ end if;
return next ln;
end loop;
end;
--
2.45.2
On Wed, Nov 27, 2024 at 8:36 AM Matthias van de Meent
<boekewurm+postgres@gmail.com> wrote:
I think we should divide them because by dividing the total buffer usage by the number of loops, user finds the average buffer consumption per loop. This gives them a clearer picture of the resource intensity per basic unit of work.
I disagree; I think the whole "dividing by number of loops and
rounding up to integer" was the wrong choice for tuple count, as that
makes it difficult if not impossible to determine the actual produced
count when it's less than the number of loops. Data is lost in the
rounding/processing, and I don't want to have lost that data.
I think that you're definitely right about this. I changed my mind (or
changed it back to my original position) recently, when I noticed how
bad the problem was with parallel index scans: nloops generally comes
from the number of workers (including the leader) for parallel scans,
and so it wasn't that hard to see "Index Searches: 0" with the latest
version (the version that started to divide by nloops). Obviously,
that behavior is completely ridiculous. Let's not do that.
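To make the arithmetic behind that concrete (a made-up illustration, not
real EXPLAIN output -- the worker count and search count are invented):

-- A parallel index scan driven by the leader plus 2 workers has nloops = 3.
-- If the scan performs a single index search in total, dividing by nloops
-- and displaying the result as an integer reports zero searches:
SELECT round(1 / 3.0) AS "Index Searches";  -- shows 0, despite a real search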
The precedent to follow here is "Heap Fetches: N" (in the context of
index-only scans), which also doesn't divide by nloops. Likely because
the same sorts of issues arise with heap fetches.
Same applies for ~scans~ searches: If we do an index search, we should
show it in the count as total sum, not partial processed value. If a
user is interested in per-loopcount values, then they can derive that
value from the data they're presented with; but that isn't true when
we present only the divided-and-rounded value.
I recently came across a good example of how showing "Index Searches:
N" is likely to be useful in the context of nested loop joins. The
example comes from the recently committed ORs-to-SAOP join
transformation work (commit 627d6341).
If I run the test case (taken from src/test/regress/sql/join.sql) with
EXPLAIN ANALYZE, the output confirms that the optimization added by
that commit works particularly well:
pg@regression:5432 [205457]=# explain (analyze, costs off, SUMMARY off)
select count(*)
from tenk1 t1, tenk1 t2
where t2.thousand = t1.tenthous or t2.thousand = t1.unique1 or
t2.thousand = t1.unique2;
┌──────────────────────────────────────────────────────────────────────────────────────────────────────────────────┐
│ QUERY PLAN
│
├──────────────────────────────────────────────────────────────────────────────────────────────────────────────────┤
│ Aggregate (actual time=13.761..13.762 rows=1 loops=1)
│
│ Buffers: shared hit=24201
│
│ -> Nested Loop (actual time=0.011..12.928 rows=20000 loops=1)
│
│ Buffers: shared hit=24201
│
│ -> Seq Scan on tenk1 t1 (actual time=0.004..0.507
rows=10000 loops=1) │
│ Buffers: shared hit=345
│
│ -> Index Only Scan using tenk1_thous_tenthous on tenk1 t2
(actual time=0.001..0.001 rows=2 loops=10000) │
│ Index Cond: (thousand = ANY (ARRAY[t1.tenthous,
t1.unique1, t1.unique2])) │
│ Index Searches: 11885
│
│ Heap Fetches: 0
│
│ Buffers: shared hit=23856
│
└──────────────────────────────────────────────────────────────────────────────────────────────────────────────────┘
(11 rows)
As you can see, there are 10,000 executions of the inner index-only
scan here, which has a SAOP qual whose array will always have 3 array
elements. That means that the best possible case is 10,000 index
searches, and the worst possible case is 30,000 index searches. We
actually see "Index Searches: 11885" -- not bad!
The main factor that gets us pretty close to that best possible case
is a certain kind of redundancy: many individual inner index scans
have duplicate array elements, allowing nbtree preprocessing to shrink
the array as it is sorted and deduplicated -- the array used
during many individual inner scan executions has as few as one or two
array elements. Another contributing factor is the prevalence of "out
of bounds" array elements: many individual SAOP arrays/inner scans
have 2 array elements that are both greater than 1,000. That'll allow
nbtree to get away with needing only one index search for all
out-of-bounds array elements. That is, it allows nbtree to determine
that all out-of-bounds elements can't possibly have any matches using
only one index search (a search that lands on the rightmost leaf page,
where no matches for any out-of-bounds element will be found).
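Both effects are easier to picture with hand-written examples (a sketch
only: the constants are invented, and the expected behavior is simply what
the preceding explanation predicts, not captured EXPLAIN output):

-- Duplicate array elements: preprocessing sorts and deduplicates the array,
-- so {42, 42, 42} collapses to a single element and the scan should need
-- only one index search.
SELECT count(*) FROM tenk1 WHERE thousand = ANY (ARRAY[42, 42, 42]);

-- Out-of-bounds array elements: "thousand" only holds values 0..999, so 5000
-- and 9000 both lie past the end of the tenk1_thous_tenthous index. A single
-- search (landing on the rightmost leaf page) should be enough to rule both
-- of them out.
SELECT count(*) FROM tenk1 WHERE thousand = ANY (ARRAY[1, 5000, 9000]);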
Of course, this can only be surmised from the EXPLAIN ANALYZE output
shown because I went back to not dividing by nloops within explain.c.
A huge amount of useful information would be lost in cases like this
if we divide by nloops. So, again, let's not do it that way.
It'd be just as easy to surmise what's going on here if the inner
index scan happened to be a plain index scan. That would make the
"Buffers" output include heap buffer hits, which would usually make it
impossible to infer how many of the "Buffers hit" came from the index
structure. My analysis didn't rely on "Buffers" at all, though (only
on "Index Searches: 11885" + "loops=10000"), so everything I pointed
out would be just as readily apparent.
--
Peter Geoghegan
On 17.02.2025 20:56, Peter Geoghegan wrote:
On Wed, Nov 27, 2024 at 8:36 AM Matthias van de Meent
<boekewurm+postgres@gmail.com> wrote:
I think we should divide them because by dividing the total buffer usage by the number of loops, user finds the average buffer consumption per loop. This gives them a clearer picture of the resource intensity per basic unit of work.
I disagree; I think the whole "dividing by number of loops and
rounding up to integer" was the wrong choice for tuple count, as that
makes it difficult if not impossible to determine the actual produced
count when it's less than the number of loops. Data is lost in the
rounding/processing, and I don't want to have lost that data.
I think that you're definitely right about this. I changed my mind (or
changed it back to my original position) recently, when I noticed how
bad the problem was with parallel index scans: nloops generally comes
from the number of workers (including the leader) for parallel scans,
and so it wasn't that hard to see "Index Searches: 0" with the latest
version (the version that started to divide by nloops). Obviously,
that behavior is completely ridiculous. Let's not do that.
The precedent to follow here is "Heap Fetches: N" (in the context of
index-only scans), which also doesn't divide by nloops. Likely because
the same sorts of issues arise with heap fetches.
Yes, you are right, I agree with both of you.
--
Regards,
Alena Rybakina
Postgres Professional
On Wed, Aug 28, 2024 at 9:52 AM Peter Geoghegan <pg@bowt.ie> wrote:
On Wed, Aug 28, 2024 at 9:49 AM Robert Haas <robertmhaas@gmail.com> wrote:
I agree with this analysis. I don't see why IndexScanDesc would ever
be the right place for this.
Then what do you think is the right place?
The paragraph that I agreed with and quoted in my reply, and that you
then quoted in your reply to me, appears to me to address that exact
question.
Are you talking about adding global counters, in the style of pgBufferUsage?
Or are you talking about adding it to BTSO? If it's the latter, then
why isn't that at least as bad? It's just the IndexScanDesc thing, but
with an additional indirection.
I need more feedback about this. I don't understand your perspective here.
If I commit the skip scan patch, but don't have something like this
instrumentation in place, it seems quite likely that users will
complain about how opaque its behavior is. While having this
instrumentation isn't quite a blocker to committing the skip scan
patch, it's not too far off, either. I want to be pragmatic. Any
approach that's deemed acceptable is fine by me, provided it
implements approximately the same behavior as the patch that I wrote
implements.
Where is this state that tracks the number of index searches going to
live, if not in IndexScanDesc? I get why you don't particularly care
for that. But I don't understand what the alternative you favor looks
like.
--
Peter Geoghegan
On Mon, Feb 17, 2025 at 5:44 PM Peter Geoghegan <pg@bowt.ie> wrote:
I need more feedback about this. I don't understand your perspective here.
Attached is a version of the patch that will apply cleanly against
HEAD. (This is from v26 of my skip scan patch, which is why I've
skipped so many version numbers compared to the last patch posted on
this thread.)
I still haven't changed anything about the implementation, since this
is just to keep CFTester happy.
--
Peter Geoghegan
Attachments:
v26-0001-Show-index-search-count-in-EXPLAIN-ANALYZE.patch (application/octet-stream)
From e3ccac132272a41d8303b488254f6cb55a200f79 Mon Sep 17 00:00:00 2001
From: Peter Geoghegan <pg@bowt.ie>
Date: Wed, 14 Aug 2024 13:50:23 -0400
Subject: [PATCH v26 1/6] Show index search count in EXPLAIN ANALYZE.
Expose the information tracked by pg_stat_*_indexes.idx_scan to EXPLAIN
ANALYZE output. This is particularly useful for index scans that use
ScalarArrayOp quals, where the number of index scans isn't predictable
in advance with optimizations like the ones added to nbtree by commit
5bf748b8. It's also likely to make behavior that will be introduced by
an upcoming patch to add skip scan optimizations easier to understand.
Author: Peter Geoghegan <pg@bowt.ie>
Reviewed-By: Tomas Vondra <tomas@vondra.me>
Reviewed-By: Masahiro Ikeda <ikedamsh@oss.nttdata.com>
Discussion: https://postgr.es/m/CAH2-Wz=PKR6rB7qbx+Vnd7eqeB5VTcrW=iJvAsTsKbdG+kW_UA@mail.gmail.com
Discussion: https://postgr.es/m/CAH2-WzkRqvaqR2CTNqTZP0z6FuL4-3ED6eQB0yx38XBNj1v-4Q@mail.gmail.com
---
src/include/access/relscan.h | 3 +
src/backend/access/brin/brin.c | 1 +
src/backend/access/gin/ginscan.c | 1 +
src/backend/access/gist/gistget.c | 2 +
src/backend/access/hash/hashsearch.c | 1 +
src/backend/access/index/genam.c | 1 +
src/backend/access/nbtree/nbtree.c | 15 +++
src/backend/access/nbtree/nbtsearch.c | 1 +
src/backend/access/spgist/spgscan.c | 1 +
src/backend/commands/explain.c | 40 +++++++
contrib/bloom/blscan.c | 1 +
doc/src/sgml/bloom.sgml | 6 +-
doc/src/sgml/monitoring.sgml | 27 +++--
doc/src/sgml/perform.sgml | 7 ++
doc/src/sgml/ref/explain.sgml | 3 +-
doc/src/sgml/rules.sgml | 1 +
src/test/regress/expected/brin_multi.out | 27 +++--
src/test/regress/expected/memoize.out | 50 ++++++---
src/test/regress/expected/partition_prune.out | 100 +++++++++++++++---
src/test/regress/expected/select.out | 3 +-
src/test/regress/sql/memoize.sql | 6 +-
src/test/regress/sql/partition_prune.sql | 4 +
22 files changed, 252 insertions(+), 49 deletions(-)
diff --git a/src/include/access/relscan.h b/src/include/access/relscan.h
index dc6e01842..0d1ad8379 100644
--- a/src/include/access/relscan.h
+++ b/src/include/access/relscan.h
@@ -147,6 +147,9 @@ typedef struct IndexScanDescData
bool xactStartedInRecovery; /* prevents killing/seeing killed
* tuples */
+ /* index access method instrumentation output state */
+ uint64 nsearches; /* # of index searches */
+
/* index access method's private state */
void *opaque; /* access-method-specific info */
diff --git a/src/backend/access/brin/brin.c b/src/backend/access/brin/brin.c
index 75a65ec9c..9f146c12a 100644
--- a/src/backend/access/brin/brin.c
+++ b/src/backend/access/brin/brin.c
@@ -591,6 +591,7 @@ bringetbitmap(IndexScanDesc scan, TIDBitmap *tbm)
opaque = (BrinOpaque *) scan->opaque;
bdesc = opaque->bo_bdesc;
pgstat_count_index_scan(idxRel);
+ scan->nsearches++;
/*
* We need to know the size of the table so that we know how long to
diff --git a/src/backend/access/gin/ginscan.c b/src/backend/access/gin/ginscan.c
index 63ded6301..8c1bbf366 100644
--- a/src/backend/access/gin/ginscan.c
+++ b/src/backend/access/gin/ginscan.c
@@ -437,6 +437,7 @@ ginNewScanKey(IndexScanDesc scan)
MemoryContextSwitchTo(oldCtx);
pgstat_count_index_scan(scan->indexRelation);
+ scan->nsearches++;
}
void
diff --git a/src/backend/access/gist/gistget.c b/src/backend/access/gist/gistget.c
index cc40e928e..609e85fda 100644
--- a/src/backend/access/gist/gistget.c
+++ b/src/backend/access/gist/gistget.c
@@ -625,6 +625,7 @@ gistgettuple(IndexScanDesc scan, ScanDirection dir)
GISTSearchItem fakeItem;
pgstat_count_index_scan(scan->indexRelation);
+ scan->nsearches++;
so->firstCall = false;
so->curPageData = so->nPageData = 0;
@@ -750,6 +751,7 @@ gistgetbitmap(IndexScanDesc scan, TIDBitmap *tbm)
return 0;
pgstat_count_index_scan(scan->indexRelation);
+ scan->nsearches++;
/* Begin the scan by processing the root page */
so->curPageData = so->nPageData = 0;
diff --git a/src/backend/access/hash/hashsearch.c b/src/backend/access/hash/hashsearch.c
index a3a1fccf3..c4f730437 100644
--- a/src/backend/access/hash/hashsearch.c
+++ b/src/backend/access/hash/hashsearch.c
@@ -298,6 +298,7 @@ _hash_first(IndexScanDesc scan, ScanDirection dir)
HashScanPosItem *currItem;
pgstat_count_index_scan(rel);
+ scan->nsearches++;
/*
* We do not support hash scans with no index qualification, because we
diff --git a/src/backend/access/index/genam.c b/src/backend/access/index/genam.c
index 07bae342e..369e39bc4 100644
--- a/src/backend/access/index/genam.c
+++ b/src/backend/access/index/genam.c
@@ -118,6 +118,7 @@ RelationGetIndexScan(Relation indexRelation, int nkeys, int norderbys)
scan->xactStartedInRecovery = TransactionStartedDuringRecovery();
scan->ignore_killed_tuples = !scan->xactStartedInRecovery;
+ scan->nsearches = 0; /* deliberately not reset by index_rescan */
scan->opaque = NULL;
scan->xs_itup = NULL;
diff --git a/src/backend/access/nbtree/nbtree.c b/src/backend/access/nbtree/nbtree.c
index 45ea6afba..ca2a6e8f8 100644
--- a/src/backend/access/nbtree/nbtree.c
+++ b/src/backend/access/nbtree/nbtree.c
@@ -70,6 +70,7 @@ typedef struct BTParallelScanDescData
BTPS_State btps_pageStatus; /* indicates whether next page is
* available for scan. see above for
* possible states of parallel scan. */
+ uint64 btps_nsearches; /* counts index searches for EXPLAIN ANALYZE */
slock_t btps_mutex; /* protects above variables, btps_arrElems */
ConditionVariable btps_cv; /* used to synchronize parallel scan */
@@ -557,6 +558,7 @@ btinitparallelscan(void *target)
bt_target->btps_nextScanPage = InvalidBlockNumber;
bt_target->btps_lastCurrPage = InvalidBlockNumber;
bt_target->btps_pageStatus = BTPARALLEL_NOT_INITIALIZED;
+ bt_target->btps_nsearches = 0;
ConditionVariableInit(&bt_target->btps_cv);
}
@@ -583,6 +585,7 @@ btparallelrescan(IndexScanDesc scan)
btscan->btps_nextScanPage = InvalidBlockNumber;
btscan->btps_lastCurrPage = InvalidBlockNumber;
btscan->btps_pageStatus = BTPARALLEL_NOT_INITIALIZED;
+ /* deliberately don't reset btps_nsearches (matches index_rescan) */
SpinLockRelease(&btscan->btps_mutex);
}
@@ -710,6 +713,11 @@ _bt_parallel_seize(IndexScanDesc scan, BlockNumber *next_scan_page,
* We have successfully seized control of the scan for the purpose
* of advancing it to a new page!
*/
+ if (first && btscan->btps_pageStatus == BTPARALLEL_NOT_INITIALIZED)
+ {
+ /* count the first primitive scan for this btrescan */
+ btscan->btps_nsearches++;
+ }
btscan->btps_pageStatus = BTPARALLEL_ADVANCING;
Assert(btscan->btps_nextScanPage != P_NONE);
*next_scan_page = btscan->btps_nextScanPage;
@@ -810,6 +818,12 @@ _bt_parallel_done(IndexScanDesc scan)
btscan->btps_pageStatus = BTPARALLEL_DONE;
status_changed = true;
}
+
+ /*
+ * Don't use local primscan counter -- overwrite it with the authoritative
+ * primscan counter, which we maintain in shared memory
+ */
+ scan->nsearches = btscan->btps_nsearches;
SpinLockRelease(&btscan->btps_mutex);
/* wake up all the workers associated with this parallel scan */
@@ -844,6 +858,7 @@ _bt_parallel_primscan_schedule(IndexScanDesc scan, BlockNumber curr_page)
btscan->btps_nextScanPage = InvalidBlockNumber;
btscan->btps_lastCurrPage = InvalidBlockNumber;
btscan->btps_pageStatus = BTPARALLEL_NEED_PRIMSCAN;
+ btscan->btps_nsearches++;
/* Serialize scan's current array keys */
for (int i = 0; i < so->numArrayKeys; i++)
diff --git a/src/backend/access/nbtree/nbtsearch.c b/src/backend/access/nbtree/nbtsearch.c
index 472ce06f1..941b4eaaf 100644
--- a/src/backend/access/nbtree/nbtsearch.c
+++ b/src/backend/access/nbtree/nbtsearch.c
@@ -950,6 +950,7 @@ _bt_first(IndexScanDesc scan, ScanDirection dir)
* _bt_search/_bt_endpoint below
*/
pgstat_count_index_scan(rel);
+ scan->nsearches++;
/*----------
* Examine the scan keys to discover where we need to start the scan.
diff --git a/src/backend/access/spgist/spgscan.c b/src/backend/access/spgist/spgscan.c
index 53f910e9d..8554b4535 100644
--- a/src/backend/access/spgist/spgscan.c
+++ b/src/backend/access/spgist/spgscan.c
@@ -421,6 +421,7 @@ spgrescan(IndexScanDesc scan, ScanKey scankey, int nscankeys,
/* count an indexscan for stats */
pgstat_count_index_scan(scan->indexRelation);
+ scan->nsearches++;
}
void
diff --git a/src/backend/commands/explain.c b/src/backend/commands/explain.c
index 4271dd48e..15ca901ca 100644
--- a/src/backend/commands/explain.c
+++ b/src/backend/commands/explain.c
@@ -13,6 +13,7 @@
*/
#include "postgres.h"
+#include "access/relscan.h"
#include "access/xact.h"
#include "catalog/pg_type.h"
#include "commands/createas.h"
@@ -88,6 +89,7 @@ static void show_plan_tlist(PlanState *planstate, List *ancestors,
static void show_expression(Node *node, const char *qlabel,
PlanState *planstate, List *ancestors,
bool useprefix, ExplainState *es);
+static void show_indexscan_nsearches(PlanState *planstate, ExplainState *es);
static void show_qual(List *qual, const char *qlabel,
PlanState *planstate, List *ancestors,
bool useprefix, ExplainState *es);
@@ -2113,6 +2115,7 @@ ExplainNode(PlanState *planstate, List *ancestors,
case T_IndexScan:
show_scan_qual(((IndexScan *) plan)->indexqualorig,
"Index Cond", planstate, ancestors, es);
+ show_indexscan_nsearches(planstate, es);
if (((IndexScan *) plan)->indexqualorig)
show_instrumentation_count("Rows Removed by Index Recheck", 2,
planstate, es);
@@ -2126,6 +2129,7 @@ ExplainNode(PlanState *planstate, List *ancestors,
case T_IndexOnlyScan:
show_scan_qual(((IndexOnlyScan *) plan)->indexqual,
"Index Cond", planstate, ancestors, es);
+ show_indexscan_nsearches(planstate, es);
if (((IndexOnlyScan *) plan)->recheckqual)
show_instrumentation_count("Rows Removed by Index Recheck", 2,
planstate, es);
@@ -2142,6 +2146,7 @@ ExplainNode(PlanState *planstate, List *ancestors,
case T_BitmapIndexScan:
show_scan_qual(((BitmapIndexScan *) plan)->indexqualorig,
"Index Cond", planstate, ancestors, es);
+ show_indexscan_nsearches(planstate, es);
break;
case T_BitmapHeapScan:
show_scan_qual(((BitmapHeapScan *) plan)->bitmapqualorig,
@@ -2660,6 +2665,41 @@ show_expression(Node *node, const char *qlabel,
ExplainPropertyText(qlabel, exprstr, es);
}
+/*
+ * Show the number of index searches within an IndexScan node, IndexOnlyScan
+ * node, or BitmapIndexScan node
+ */
+static void
+show_indexscan_nsearches(PlanState *planstate, ExplainState *es)
+{
+ Plan *plan = planstate->plan;
+ struct IndexScanDescData *scanDesc = NULL;
+ uint64 nsearches = 0;
+
+ if (!es->analyze)
+ return;
+
+ switch (nodeTag(plan))
+ {
+ case T_IndexScan:
+ scanDesc = ((IndexScanState *) planstate)->iss_ScanDesc;
+ break;
+ case T_IndexOnlyScan:
+ scanDesc = ((IndexOnlyScanState *) planstate)->ioss_ScanDesc;
+ break;
+ case T_BitmapIndexScan:
+ scanDesc = ((BitmapIndexScanState *) planstate)->biss_ScanDesc;
+ break;
+ default:
+ break;
+ }
+
+ if (scanDesc)
+ nsearches = scanDesc->nsearches;
+
+ ExplainPropertyFloat("Index Searches", NULL, nsearches, 0, es);
+}
+
/*
* Show a qualifier expression (which is a List with implicit AND semantics)
*/
diff --git a/contrib/bloom/blscan.c b/contrib/bloom/blscan.c
index bf801fe78..472169d61 100644
--- a/contrib/bloom/blscan.c
+++ b/contrib/bloom/blscan.c
@@ -116,6 +116,7 @@ blgetbitmap(IndexScanDesc scan, TIDBitmap *tbm)
bas = GetAccessStrategy(BAS_BULKREAD);
npages = RelationGetNumberOfBlocks(scan->indexRelation);
pgstat_count_index_scan(scan->indexRelation);
+ scan->nsearches++;
for (blkno = BLOOM_HEAD_BLKNO; blkno < npages; blkno++)
{
diff --git a/doc/src/sgml/bloom.sgml b/doc/src/sgml/bloom.sgml
index 663a0a4a6..9d9cf6df9 100644
--- a/doc/src/sgml/bloom.sgml
+++ b/doc/src/sgml/bloom.sgml
@@ -173,10 +173,11 @@ CREATE INDEX
Buffers: shared hit=21864
-> Bitmap Index Scan on bloomidx (cost=0.00..178436.00 rows=1 width=0) (actual time=20.005..20.005 rows=2300.00 loops=1)
Index Cond: ((i2 = 898732) AND (i5 = 123451))
+ Index Searches: 1
Buffers: shared hit=19608
Planning Time: 0.099 ms
Execution Time: 22.632 ms
-(10 rows)
+(11 rows)
</programlisting>
</para>
@@ -211,10 +212,11 @@ CREATE INDEX
Buffers: shared hit=3
-> Bitmap Index Scan on btreeidx2 (cost=0.00..4.52 rows=11 width=0) (actual time=0.007..0.007 rows=8.00 loops=1)
Index Cond: (i2 = 898732)
+ Index Searches: 1
Buffers: shared hit=3
Planning Time: 0.264 ms
Execution Time: 0.047 ms
-(13 rows)
+(15 rows)
</programlisting>
Although this query runs much faster than with either of the single
indexes, we pay a penalty in index size. Each of the single-column
diff --git a/doc/src/sgml/monitoring.sgml b/doc/src/sgml/monitoring.sgml
index 9178f1d34..55e3b382f 100644
--- a/doc/src/sgml/monitoring.sgml
+++ b/doc/src/sgml/monitoring.sgml
@@ -4228,16 +4228,31 @@ description | Waiting for a newly initialized WAL file to reach durable storage
<note>
<para>
- Queries that use certain <acronym>SQL</acronym> constructs to search for
- rows matching any value out of a list or array of multiple scalar values
- (see <xref linkend="functions-comparisons"/>) perform multiple
- <quote>primitive</quote> index scans (up to one primitive scan per scalar
- value) during query execution. Each internal primitive index scan
- increments <structname>pg_stat_all_indexes</structname>.<structfield>idx_scan</structfield>,
+ Index scans may sometimes perform multiple index searches per execution.
+ Each index search increments <structname>pg_stat_all_indexes</structname>.<structfield>idx_scan</structfield>,
so it's possible for the count of index scans to significantly exceed the
total number of index scan executor node executions.
</para>
+ <para>
+ This can happen with queries that use certain <acronym>SQL</acronym>
+ constructs to search for rows matching any value out of a list or array of
+ multiple scalar values (see <xref linkend="functions-comparisons"/>). It
+ can also happen with queries that have
+ <literal><replaceable>expression</replaceable> =
+ <replaceable>value1</replaceable> OR
+ <replaceable>expression</replaceable> = <replaceable>value2</replaceable>
+ ...</literal> constructs when the query planner transforms the constructs
+ into an equivalent multi-valued array representation.
+ </para>
</note>
+ <tip>
+ <para>
+ <command>EXPLAIN ANALYZE</command> outputs the total number of index
+ searches performed by each index scan node. <literal>Index Searches: N</literal>
+ indicates the total number of searches across <emphasis>all</emphasis>
+ executor node executions/loops.
+ </para>
+ </tip>
</sect2>
diff --git a/doc/src/sgml/perform.sgml b/doc/src/sgml/perform.sgml
index be4b49f62..9c51f5869 100644
--- a/doc/src/sgml/perform.sgml
+++ b/doc/src/sgml/perform.sgml
@@ -729,9 +729,11 @@ WHERE t1.unique1 < 10 AND t1.unique2 = t2.unique2;
Buffers: shared hit=3 read=5 written=4
-> Bitmap Index Scan on tenk1_unique1 (cost=0.00..4.36 rows=10 width=0) (actual time=0.004..0.004 rows=10.00 loops=1)
Index Cond: (unique1 < 10)
+ Index Searches: 1
Buffers: shared hit=2
-> Index Scan using tenk2_unique2 on tenk2 t2 (cost=0.29..7.90 rows=1 width=244) (actual time=0.003..0.003 rows=1 loops=10)
Index Cond: (unique2 = t1.unique2)
+ Index Searches: 10
Buffers: shared hit=24 read=6
Planning:
Buffers: shared hit=15 dirtied=9
@@ -790,6 +792,7 @@ WHERE t1.unique1 < 100 AND t1.unique2 = t2.unique2 ORDER BY t1.fivethous;
Buffers: shared hit=92
-> Bitmap Index Scan on tenk1_unique1 (cost=0.00..5.04 rows=100 width=0) (actual time=0.013..0.013 rows=100.00 loops=1)
Index Cond: (unique1 < 100)
+ Index Searches: 1
Buffers: shared hit=2
Planning:
Buffers: shared hit=12
@@ -861,6 +864,7 @@ EXPLAIN ANALYZE SELECT * FROM polygon_tbl WHERE f1 @> polygon '(0.5,2.0)';
Index Scan using gpolygonind on polygon_tbl (cost=0.13..8.15 rows=1 width=85) (actual time=0.074..0.074 rows=0.00 loops=1)
Index Cond: (f1 @> '((0.5,2))'::polygon)
Rows Removed by Index Recheck: 1
+ Index Searches: 1
Buffers: shared hit=1
Planning Time: 0.039 ms
Execution Time: 0.098 ms
@@ -896,6 +900,7 @@ EXPLAIN (ANALYZE, BUFFERS OFF) SELECT * FROM tenk1 WHERE unique1 < 100 AND un
Index Cond: (unique1 < 100)
-> Bitmap Index Scan on tenk1_unique2 (cost=0.00..19.78 rows=999 width=0) (actual time=0.070..0.070 rows=999.00 loops=1)
Index Cond: (unique2 > 9000)
+ Index Searches: 1
Planning Time: 0.162 ms
Execution Time: 0.143 ms
</screen>
@@ -923,6 +928,7 @@ EXPLAIN ANALYZE UPDATE tenk1 SET hundred = hundred + 1 WHERE unique1 < 100;
Buffers: shared hit=4 read=2
-> Bitmap Index Scan on tenk1_unique1 (cost=0.00..5.04 rows=100 width=0) (actual time=0.031..0.031 rows=100.00 loops=1)
Index Cond: (unique1 < 100)
+ Index Searches: 1
Buffers: shared read=2
Planning Time: 0.151 ms
Execution Time: 1.856 ms
@@ -1061,6 +1067,7 @@ EXPLAIN ANALYZE SELECT * FROM tenk1 WHERE unique1 < 100 AND unique2 > 9000
Index Cond: (unique2 > 9000)
Filter: (unique1 < 100)
Rows Removed by Filter: 287
+ Index Searches: 1
Buffers: shared hit=16
Planning Time: 0.077 ms
Execution Time: 0.086 ms
diff --git a/doc/src/sgml/ref/explain.sgml b/doc/src/sgml/ref/explain.sgml
index 7daddf03e..e2507bb9d 100644
--- a/doc/src/sgml/ref/explain.sgml
+++ b/doc/src/sgml/ref/explain.sgml
@@ -507,9 +507,10 @@ EXPLAIN ANALYZE EXECUTE query(100, 200);
-> Index Scan using test_pkey on test (cost=0.29..10.27 rows=99 width=8) (actual time=0.009..0.025 rows=99.00 loops=1)
Index Cond: ((id > 100) AND (id < 200))
Buffers: shared hit=4
+ Index Searches: 1
Planning Time: 0.244 ms
Execution Time: 0.073 ms
-(9 rows)
+(10 rows)
</programlisting>
</para>
diff --git a/doc/src/sgml/rules.sgml b/doc/src/sgml/rules.sgml
index 1d9924a2a..7ddab09a5 100644
--- a/doc/src/sgml/rules.sgml
+++ b/doc/src/sgml/rules.sgml
@@ -1045,6 +1045,7 @@ SELECT count(*) FROM words WHERE word = 'caterpiler';
Aggregate (cost=4.44..4.45 rows=1 width=0) (actual time=0.042..0.042 rows=1.00 loops=1)
-> Index Only Scan using wrd_word on wrd (cost=0.42..4.44 rows=1 width=0) (actual time=0.039..0.039 rows=0.00 loops=1)
Index Cond: (word = 'caterpiler'::text)
+ Index Searches: 1
Heap Fetches: 0
Planning time: 0.164 ms
Execution time: 0.117 ms
diff --git a/src/test/regress/expected/brin_multi.out b/src/test/regress/expected/brin_multi.out
index 991b7eaca..cb5b5e53e 100644
--- a/src/test/regress/expected/brin_multi.out
+++ b/src/test/regress/expected/brin_multi.out
@@ -853,7 +853,8 @@ SELECT * FROM brin_date_test WHERE a = '2023-01-01'::date;
Recheck Cond: (a = '2023-01-01'::date)
-> Bitmap Index Scan on brin_date_test_a_idx (actual rows=0.00 loops=1)
Index Cond: (a = '2023-01-01'::date)
-(4 rows)
+ Index Searches: 1
+(5 rows)
DROP TABLE brin_date_test;
RESET enable_seqscan;
@@ -872,7 +873,8 @@ SELECT * FROM brin_timestamp_test WHERE a = '2023-01-01'::timestamp;
Recheck Cond: (a = '2023-01-01 00:00:00'::timestamp without time zone)
-> Bitmap Index Scan on brin_timestamp_test_a_idx (actual rows=0.00 loops=1)
Index Cond: (a = '2023-01-01 00:00:00'::timestamp without time zone)
-(4 rows)
+ Index Searches: 1
+(5 rows)
EXPLAIN (ANALYZE, TIMING OFF, COSTS OFF, SUMMARY OFF, BUFFERS OFF)
SELECT * FROM brin_timestamp_test WHERE a = '1900-01-01'::timestamp;
@@ -882,7 +884,8 @@ SELECT * FROM brin_timestamp_test WHERE a = '1900-01-01'::timestamp;
Recheck Cond: (a = '1900-01-01 00:00:00'::timestamp without time zone)
-> Bitmap Index Scan on brin_timestamp_test_a_idx (actual rows=0.00 loops=1)
Index Cond: (a = '1900-01-01 00:00:00'::timestamp without time zone)
-(4 rows)
+ Index Searches: 1
+(5 rows)
DROP TABLE brin_timestamp_test;
RESET enable_seqscan;
@@ -900,7 +903,8 @@ SELECT * FROM brin_date_test WHERE a = '2023-01-01'::date;
Recheck Cond: (a = '2023-01-01'::date)
-> Bitmap Index Scan on brin_date_test_a_idx (actual rows=0.00 loops=1)
Index Cond: (a = '2023-01-01'::date)
-(4 rows)
+ Index Searches: 1
+(5 rows)
EXPLAIN (ANALYZE, TIMING OFF, COSTS OFF, SUMMARY OFF, BUFFERS OFF)
SELECT * FROM brin_date_test WHERE a = '1900-01-01'::date;
@@ -910,7 +914,8 @@ SELECT * FROM brin_date_test WHERE a = '1900-01-01'::date;
Recheck Cond: (a = '1900-01-01'::date)
-> Bitmap Index Scan on brin_date_test_a_idx (actual rows=0.00 loops=1)
Index Cond: (a = '1900-01-01'::date)
-(4 rows)
+ Index Searches: 1
+(5 rows)
DROP TABLE brin_date_test;
RESET enable_seqscan;
@@ -929,7 +934,8 @@ SELECT * FROM brin_interval_test WHERE a = '-30 years'::interval;
Recheck Cond: (a = '@ 30 years ago'::interval)
-> Bitmap Index Scan on brin_interval_test_a_idx (actual rows=0.00 loops=1)
Index Cond: (a = '@ 30 years ago'::interval)
-(4 rows)
+ Index Searches: 1
+(5 rows)
EXPLAIN (ANALYZE, TIMING OFF, COSTS OFF, SUMMARY OFF, BUFFERS OFF)
SELECT * FROM brin_interval_test WHERE a = '30 years'::interval;
@@ -939,7 +945,8 @@ SELECT * FROM brin_interval_test WHERE a = '30 years'::interval;
Recheck Cond: (a = '@ 30 years'::interval)
-> Bitmap Index Scan on brin_interval_test_a_idx (actual rows=0.00 loops=1)
Index Cond: (a = '@ 30 years'::interval)
-(4 rows)
+ Index Searches: 1
+(5 rows)
DROP TABLE brin_interval_test;
RESET enable_seqscan;
@@ -957,7 +964,8 @@ SELECT * FROM brin_interval_test WHERE a = '-30 years'::interval;
Recheck Cond: (a = '@ 30 years ago'::interval)
-> Bitmap Index Scan on brin_interval_test_a_idx (actual rows=0.00 loops=1)
Index Cond: (a = '@ 30 years ago'::interval)
-(4 rows)
+ Index Searches: 1
+(5 rows)
EXPLAIN (ANALYZE, TIMING OFF, COSTS OFF, SUMMARY OFF, BUFFERS OFF)
SELECT * FROM brin_interval_test WHERE a = '30 years'::interval;
@@ -967,7 +975,8 @@ SELECT * FROM brin_interval_test WHERE a = '30 years'::interval;
Recheck Cond: (a = '@ 30 years'::interval)
-> Bitmap Index Scan on brin_interval_test_a_idx (actual rows=0.00 loops=1)
Index Cond: (a = '@ 30 years'::interval)
-(4 rows)
+ Index Searches: 1
+(5 rows)
DROP TABLE brin_interval_test;
RESET enable_seqscan;
diff --git a/src/test/regress/expected/memoize.out b/src/test/regress/expected/memoize.out
index 22f2d3284..e83ad3b5d 100644
--- a/src/test/regress/expected/memoize.out
+++ b/src/test/regress/expected/memoize.out
@@ -22,8 +22,10 @@ begin
ln := regexp_replace(ln, 'Evictions: 0', 'Evictions: Zero');
ln := regexp_replace(ln, 'Evictions: \d+', 'Evictions: N');
ln := regexp_replace(ln, 'Memory Usage: \d+', 'Memory Usage: N');
- ln := regexp_replace(ln, 'Heap Fetches: \d+', 'Heap Fetches: N');
- ln := regexp_replace(ln, 'loops=\d+', 'loops=N');
+ ln := regexp_replace(ln, 'Heap Fetches: \d+', 'Heap Fetches: N');
+ ln := regexp_replace(ln, 'loops=\d+', 'loops=N');
+ ln := regexp_replace(ln, 'Index Searches: 0', 'Index Searches: Zero');
+ ln := regexp_replace(ln, 'Index Searches: \d+', 'Index Searches: N');
return next ln;
end loop;
end;
@@ -48,8 +50,9 @@ WHERE t2.unique1 < 1000;', false);
Hits: 980 Misses: 20 Evictions: Zero Overflows: 0 Memory Usage: NkB
-> Index Only Scan using tenk1_unique1 on tenk1 t1 (actual rows=1.00 loops=N)
Index Cond: (unique1 = t2.twenty)
+ Index Searches: N
Heap Fetches: N
-(12 rows)
+(13 rows)
-- And check we get the expected results.
SELECT COUNT(*),AVG(t1.unique1) FROM tenk1 t1
@@ -79,8 +82,9 @@ WHERE t1.unique1 < 1000;', false);
Hits: 980 Misses: 20 Evictions: Zero Overflows: 0 Memory Usage: NkB
-> Index Only Scan using tenk1_unique1 on tenk1 t2 (actual rows=1.00 loops=N)
Index Cond: (unique1 = t1.twenty)
+ Index Searches: N
Heap Fetches: N
-(12 rows)
+(13 rows)
-- And check we get the expected results.
SELECT COUNT(*),AVG(t2.unique1) FROM tenk1 t1,
@@ -106,6 +110,7 @@ WHERE t1.unique1 < 10;', false);
-> Nested Loop Left Join (actual rows=20.00 loops=N)
-> Index Scan using tenk1_unique1 on tenk1 t1 (actual rows=10.00 loops=N)
Index Cond: (unique1 < 10)
+ Index Searches: N
-> Memoize (actual rows=2.00 loops=N)
Cache Key: t1.two
Cache Mode: binary
@@ -115,7 +120,8 @@ WHERE t1.unique1 < 10;', false);
Rows Removed by Filter: 2
-> Index Scan using tenk1_unique1 on tenk1 t2_1 (actual rows=4.00 loops=N)
Index Cond: (unique1 < 4)
-(13 rows)
+ Index Searches: N
+(15 rows)
-- And check we get the expected results.
SELECT COUNT(*),AVG(t2.t1two) FROM tenk1 t1 LEFT JOIN
@@ -146,10 +152,11 @@ WHERE s.c1 = s.c2 AND t1.unique1 < 1000;', false);
Cache Mode: binary
Hits: 998 Misses: 2 Evictions: Zero Overflows: 0 Memory Usage: NkB
-> Index Only Scan using tenk1_unique1 on tenk1 t2 (actual rows=1.00 loops=N)
+ Index Searches: N
Filter: ((t1.two + 1) = unique1)
Rows Removed by Filter: 9999
Heap Fetches: N
-(13 rows)
+(14 rows)
-- And check we get the expected results.
SELECT COUNT(*), AVG(t1.twenty) FROM tenk1 t1 LEFT JOIN
@@ -217,9 +224,10 @@ ON t1.x = t2.t::numeric AND t1.t::numeric = t2.x;', false);
Hits: 20 Misses: 20 Evictions: Zero Overflows: 0 Memory Usage: NkB
-> Index Only Scan using expr_key_idx_x_t on expr_key t2 (actual rows=2.00 loops=N)
Index Cond: (x = (t1.t)::numeric)
+ Index Searches: N
Filter: (t1.x = (t)::numeric)
Heap Fetches: N
-(10 rows)
+(11 rows)
DROP TABLE expr_key;
-- Reduce work_mem and hash_mem_multiplier so that we see some cache evictions
@@ -245,8 +253,9 @@ WHERE t2.unique1 < 1200;', true);
Hits: N Misses: N Evictions: N Overflows: 0 Memory Usage: NkB
-> Index Only Scan using tenk1_unique1 on tenk1 t1 (actual rows=1.00 loops=N)
Index Cond: (unique1 = t2.thousand)
+ Index Searches: N
Heap Fetches: N
-(12 rows)
+(13 rows)
CREATE TABLE flt (f float);
CREATE INDEX flt_f_idx ON flt (f);
@@ -260,6 +269,7 @@ SELECT * FROM flt f1 INNER JOIN flt f2 ON f1.f = f2.f;', false);
----------------------------------------------------------------------------------
Nested Loop (actual rows=4.00 loops=N)
-> Index Only Scan using flt_f_idx on flt f1 (actual rows=2.00 loops=N)
+ Index Searches: N
Heap Fetches: N
-> Memoize (actual rows=2.00 loops=N)
Cache Key: f1.f
@@ -267,8 +277,9 @@ SELECT * FROM flt f1 INNER JOIN flt f2 ON f1.f = f2.f;', false);
Hits: 1 Misses: 1 Evictions: Zero Overflows: 0 Memory Usage: NkB
-> Index Only Scan using flt_f_idx on flt f2 (actual rows=2.00 loops=N)
Index Cond: (f = f1.f)
+ Index Searches: N
Heap Fetches: N
-(10 rows)
+(12 rows)
-- Ensure memoize operates in binary mode
SELECT explain_memoize('
@@ -277,6 +288,7 @@ SELECT * FROM flt f1 INNER JOIN flt f2 ON f1.f >= f2.f;', false);
----------------------------------------------------------------------------------
Nested Loop (actual rows=4.00 loops=N)
-> Index Only Scan using flt_f_idx on flt f1 (actual rows=2.00 loops=N)
+ Index Searches: N
Heap Fetches: N
-> Memoize (actual rows=2.00 loops=N)
Cache Key: f1.f
@@ -284,8 +296,9 @@ SELECT * FROM flt f1 INNER JOIN flt f2 ON f1.f >= f2.f;', false);
Hits: 0 Misses: 2 Evictions: Zero Overflows: 0 Memory Usage: NkB
-> Index Only Scan using flt_f_idx on flt f2 (actual rows=2.00 loops=N)
Index Cond: (f <= f1.f)
+ Index Searches: N
Heap Fetches: N
-(10 rows)
+(12 rows)
DROP TABLE flt;
-- Exercise Memoize in binary mode with a large fixed width type and a
@@ -311,7 +324,8 @@ SELECT * FROM strtest s1 INNER JOIN strtest s2 ON s1.n >= s2.n;', false);
Hits: 3 Misses: 3 Evictions: Zero Overflows: 0 Memory Usage: NkB
-> Index Scan using strtest_n_idx on strtest s2 (actual rows=4.00 loops=N)
Index Cond: (n <= s1.n)
-(9 rows)
+ Index Searches: N
+(10 rows)
-- Ensure we get 3 hits and 3 misses
SELECT explain_memoize('
@@ -327,7 +341,8 @@ SELECT * FROM strtest s1 INNER JOIN strtest s2 ON s1.t >= s2.t;', false);
Hits: 3 Misses: 3 Evictions: Zero Overflows: 0 Memory Usage: NkB
-> Index Scan using strtest_t_idx on strtest s2 (actual rows=4.00 loops=N)
Index Cond: (t <= s1.t)
-(9 rows)
+ Index Searches: N
+(10 rows)
DROP TABLE strtest;
-- Ensure memoize works with partitionwise join
@@ -347,6 +362,7 @@ SELECT * FROM prt t1 INNER JOIN prt t2 ON t1.a = t2.a;', false);
Append (actual rows=32.00 loops=N)
-> Nested Loop (actual rows=16.00 loops=N)
-> Index Only Scan using iprt_p1_a on prt_p1 t1_1 (actual rows=4.00 loops=N)
+ Index Searches: N
Heap Fetches: N
-> Memoize (actual rows=4.00 loops=N)
Cache Key: t1_1.a
@@ -354,9 +370,11 @@ SELECT * FROM prt t1 INNER JOIN prt t2 ON t1.a = t2.a;', false);
Hits: 3 Misses: 1 Evictions: Zero Overflows: 0 Memory Usage: NkB
-> Index Only Scan using iprt_p1_a on prt_p1 t2_1 (actual rows=4.00 loops=N)
Index Cond: (a = t1_1.a)
+ Index Searches: N
Heap Fetches: N
-> Nested Loop (actual rows=16.00 loops=N)
-> Index Only Scan using iprt_p2_a on prt_p2 t1_2 (actual rows=4.00 loops=N)
+ Index Searches: N
Heap Fetches: N
-> Memoize (actual rows=4.00 loops=N)
Cache Key: t1_2.a
@@ -364,8 +382,9 @@ SELECT * FROM prt t1 INNER JOIN prt t2 ON t1.a = t2.a;', false);
Hits: 3 Misses: 1 Evictions: Zero Overflows: 0 Memory Usage: NkB
-> Index Only Scan using iprt_p2_a on prt_p2 t2_2 (actual rows=4.00 loops=N)
Index Cond: (a = t1_2.a)
+ Index Searches: N
Heap Fetches: N
-(21 rows)
+(25 rows)
-- Ensure memoize works with parameterized union-all Append path
SET enable_partitionwise_join TO off;
@@ -377,6 +396,7 @@ ON t1.a = t2.a;', false);
----------------------------------------------------------------------------------------
Nested Loop (actual rows=16.00 loops=N)
-> Index Only Scan using iprt_p1_a on prt_p1 t1 (actual rows=4.00 loops=N)
+ Index Searches: N
Heap Fetches: N
-> Memoize (actual rows=4.00 loops=N)
Cache Key: t1.a
@@ -385,11 +405,13 @@ ON t1.a = t2.a;', false);
-> Append (actual rows=4.00 loops=N)
-> Index Only Scan using iprt_p1_a on prt_p1 (actual rows=4.00 loops=N)
Index Cond: (a = t1.a)
+ Index Searches: N
Heap Fetches: N
-> Index Only Scan using iprt_p2_a on prt_p2 (actual rows=0.00 loops=N)
Index Cond: (a = t1.a)
+ Index Searches: N
Heap Fetches: N
-(14 rows)
+(17 rows)
DROP TABLE prt;
RESET enable_partitionwise_join;
diff --git a/src/test/regress/expected/partition_prune.out b/src/test/regress/expected/partition_prune.out
index d95d2395d..1bd64ca6a 100644
--- a/src/test/regress/expected/partition_prune.out
+++ b/src/test/regress/expected/partition_prune.out
@@ -2369,6 +2369,10 @@ begin
ln := regexp_replace(ln, 'Workers Launched: \d+', 'Workers Launched: N');
ln := regexp_replace(ln, 'actual rows=\d+(?:\.\d+)? loops=\d+', 'actual rows=N loops=N');
ln := regexp_replace(ln, 'Rows Removed by Filter: \d+', 'Rows Removed by Filter: N');
+ perform regexp_matches(ln, 'Index Searches: \d+');
+ if found then
+ continue;
+ end if;
return next ln;
end loop;
end;
@@ -2686,47 +2690,56 @@ select * from ab where a = (select max(a) from lprt_a) and b = (select max(a)-1
Filter: (b = (InitPlan 2).col1)
-> Bitmap Index Scan on ab_a1_b1_a_idx (never executed)
Index Cond: (a = (InitPlan 1).col1)
+ Index Searches: 0
-> Bitmap Heap Scan on ab_a1_b2 ab_2 (never executed)
Recheck Cond: (a = (InitPlan 1).col1)
Filter: (b = (InitPlan 2).col1)
-> Bitmap Index Scan on ab_a1_b2_a_idx (never executed)
Index Cond: (a = (InitPlan 1).col1)
+ Index Searches: 0
-> Bitmap Heap Scan on ab_a1_b3 ab_3 (never executed)
Recheck Cond: (a = (InitPlan 1).col1)
Filter: (b = (InitPlan 2).col1)
-> Bitmap Index Scan on ab_a1_b3_a_idx (never executed)
Index Cond: (a = (InitPlan 1).col1)
+ Index Searches: 0
-> Bitmap Heap Scan on ab_a2_b1 ab_4 (never executed)
Recheck Cond: (a = (InitPlan 1).col1)
Filter: (b = (InitPlan 2).col1)
-> Bitmap Index Scan on ab_a2_b1_a_idx (never executed)
Index Cond: (a = (InitPlan 1).col1)
+ Index Searches: 0
-> Bitmap Heap Scan on ab_a2_b2 ab_5 (never executed)
Recheck Cond: (a = (InitPlan 1).col1)
Filter: (b = (InitPlan 2).col1)
-> Bitmap Index Scan on ab_a2_b2_a_idx (never executed)
Index Cond: (a = (InitPlan 1).col1)
+ Index Searches: 0
-> Bitmap Heap Scan on ab_a2_b3 ab_6 (never executed)
Recheck Cond: (a = (InitPlan 1).col1)
Filter: (b = (InitPlan 2).col1)
-> Bitmap Index Scan on ab_a2_b3_a_idx (never executed)
Index Cond: (a = (InitPlan 1).col1)
+ Index Searches: 0
-> Bitmap Heap Scan on ab_a3_b1 ab_7 (never executed)
Recheck Cond: (a = (InitPlan 1).col1)
Filter: (b = (InitPlan 2).col1)
-> Bitmap Index Scan on ab_a3_b1_a_idx (never executed)
Index Cond: (a = (InitPlan 1).col1)
+ Index Searches: 0
-> Bitmap Heap Scan on ab_a3_b2 ab_8 (actual rows=0.00 loops=1)
Recheck Cond: (a = (InitPlan 1).col1)
Filter: (b = (InitPlan 2).col1)
-> Bitmap Index Scan on ab_a3_b2_a_idx (actual rows=0.00 loops=1)
Index Cond: (a = (InitPlan 1).col1)
+ Index Searches: 1
-> Bitmap Heap Scan on ab_a3_b3 ab_9 (never executed)
Recheck Cond: (a = (InitPlan 1).col1)
Filter: (b = (InitPlan 2).col1)
-> Bitmap Index Scan on ab_a3_b3_a_idx (never executed)
Index Cond: (a = (InitPlan 1).col1)
-(52 rows)
+ Index Searches: 0
+(61 rows)
-- Test run-time partition pruning with UNION ALL parents
explain (analyze, costs off, summary off, timing off, buffers off)
@@ -2742,16 +2755,19 @@ select * from (select * from ab where a = 1 union all select * from ab) ab where
Filter: (b = (InitPlan 1).col1)
-> Bitmap Index Scan on ab_a1_b1_a_idx (actual rows=0.00 loops=1)
Index Cond: (a = 1)
+ Index Searches: 1
-> Bitmap Heap Scan on ab_a1_b2 ab_12 (never executed)
Recheck Cond: (a = 1)
Filter: (b = (InitPlan 1).col1)
-> Bitmap Index Scan on ab_a1_b2_a_idx (never executed)
Index Cond: (a = 1)
+ Index Searches: 0
-> Bitmap Heap Scan on ab_a1_b3 ab_13 (never executed)
Recheck Cond: (a = 1)
Filter: (b = (InitPlan 1).col1)
-> Bitmap Index Scan on ab_a1_b3_a_idx (never executed)
Index Cond: (a = 1)
+ Index Searches: 0
-> Seq Scan on ab_a1_b1 ab_1 (actual rows=0.00 loops=1)
Filter: (b = (InitPlan 1).col1)
-> Seq Scan on ab_a1_b2 ab_2 (never executed)
@@ -2770,7 +2786,7 @@ select * from (select * from ab where a = 1 union all select * from ab) ab where
Filter: (b = (InitPlan 1).col1)
-> Seq Scan on ab_a3_b3 ab_9 (never executed)
Filter: (b = (InitPlan 1).col1)
-(37 rows)
+(40 rows)
-- A case containing a UNION ALL with a non-partitioned child.
explain (analyze, costs off, summary off, timing off, buffers off)
@@ -2786,16 +2802,19 @@ select * from (select * from ab where a = 1 union all (values(10,5)) union all s
Filter: (b = (InitPlan 1).col1)
-> Bitmap Index Scan on ab_a1_b1_a_idx (actual rows=0.00 loops=1)
Index Cond: (a = 1)
+ Index Searches: 1
-> Bitmap Heap Scan on ab_a1_b2 ab_12 (never executed)
Recheck Cond: (a = 1)
Filter: (b = (InitPlan 1).col1)
-> Bitmap Index Scan on ab_a1_b2_a_idx (never executed)
Index Cond: (a = 1)
+ Index Searches: 0
-> Bitmap Heap Scan on ab_a1_b3 ab_13 (never executed)
Recheck Cond: (a = 1)
Filter: (b = (InitPlan 1).col1)
-> Bitmap Index Scan on ab_a1_b3_a_idx (never executed)
Index Cond: (a = 1)
+ Index Searches: 0
-> Result (actual rows=0.00 loops=1)
One-Time Filter: (5 = (InitPlan 1).col1)
-> Seq Scan on ab_a1_b1 ab_1 (actual rows=0.00 loops=1)
@@ -2816,7 +2835,7 @@ select * from (select * from ab where a = 1 union all (values(10,5)) union all s
Filter: (b = (InitPlan 1).col1)
-> Seq Scan on ab_a3_b3 ab_9 (never executed)
Filter: (b = (InitPlan 1).col1)
-(39 rows)
+(42 rows)
-- Another UNION ALL test, but containing a mix of exec init and exec run-time pruning.
create table xy_1 (x int, y int);
@@ -2887,16 +2906,19 @@ update ab_a1 set b = 3 from ab where ab.a = 1 and ab.a = ab_a1.a;');
Recheck Cond: (a = 1)
-> Bitmap Index Scan on ab_a1_b1_a_idx (actual rows=0.00 loops=1)
Index Cond: (a = 1)
+ Index Searches: 1
-> Bitmap Heap Scan on ab_a1_b2 ab_a1_2 (actual rows=1.00 loops=1)
Recheck Cond: (a = 1)
Heap Blocks: exact=1
-> Bitmap Index Scan on ab_a1_b2_a_idx (actual rows=1.00 loops=1)
Index Cond: (a = 1)
+ Index Searches: 1
-> Bitmap Heap Scan on ab_a1_b3 ab_a1_3 (actual rows=0.00 loops=1)
Recheck Cond: (a = 1)
Heap Blocks: exact=1
-> Bitmap Index Scan on ab_a1_b3_a_idx (actual rows=1.00 loops=1)
Index Cond: (a = 1)
+ Index Searches: 1
-> Materialize (actual rows=1.00 loops=1)
Storage: Memory Maximum Storage: NkB
-> Append (actual rows=1.00 loops=1)
@@ -2904,17 +2926,20 @@ update ab_a1 set b = 3 from ab where ab.a = 1 and ab.a = ab_a1.a;');
Recheck Cond: (a = 1)
-> Bitmap Index Scan on ab_a1_b1_a_idx (actual rows=0.00 loops=1)
Index Cond: (a = 1)
+ Index Searches: 1
-> Bitmap Heap Scan on ab_a1_b2 ab_2 (actual rows=1.00 loops=1)
Recheck Cond: (a = 1)
Heap Blocks: exact=1
-> Bitmap Index Scan on ab_a1_b2_a_idx (actual rows=1.00 loops=1)
Index Cond: (a = 1)
+ Index Searches: 1
-> Bitmap Heap Scan on ab_a1_b3 ab_3 (actual rows=0.00 loops=1)
Recheck Cond: (a = 1)
Heap Blocks: exact=1
-> Bitmap Index Scan on ab_a1_b3_a_idx (actual rows=1.00 loops=1)
Index Cond: (a = 1)
-(37 rows)
+ Index Searches: 1
+(43 rows)
table ab;
a | b
@@ -2990,17 +3015,23 @@ select * from tbl1 join tprt on tbl1.col1 > tprt.col1;
-> Append (actual rows=3.00 loops=2)
-> Index Scan using tprt1_idx on tprt_1 (actual rows=2.00 loops=2)
Index Cond: (col1 < tbl1.col1)
+ Index Searches: 2
-> Index Scan using tprt2_idx on tprt_2 (actual rows=2.00 loops=1)
Index Cond: (col1 < tbl1.col1)
+ Index Searches: 1
-> Index Scan using tprt3_idx on tprt_3 (never executed)
Index Cond: (col1 < tbl1.col1)
+ Index Searches: 0
-> Index Scan using tprt4_idx on tprt_4 (never executed)
Index Cond: (col1 < tbl1.col1)
+ Index Searches: 0
-> Index Scan using tprt5_idx on tprt_5 (never executed)
Index Cond: (col1 < tbl1.col1)
+ Index Searches: 0
-> Index Scan using tprt6_idx on tprt_6 (never executed)
Index Cond: (col1 < tbl1.col1)
-(15 rows)
+ Index Searches: 0
+(21 rows)
explain (analyze, costs off, summary off, timing off, buffers off)
select * from tbl1 join tprt on tbl1.col1 = tprt.col1;
@@ -3011,17 +3042,23 @@ select * from tbl1 join tprt on tbl1.col1 = tprt.col1;
-> Append (actual rows=1.00 loops=2)
-> Index Scan using tprt1_idx on tprt_1 (never executed)
Index Cond: (col1 = tbl1.col1)
+ Index Searches: 0
-> Index Scan using tprt2_idx on tprt_2 (actual rows=1.00 loops=2)
Index Cond: (col1 = tbl1.col1)
+ Index Searches: 2
-> Index Scan using tprt3_idx on tprt_3 (never executed)
Index Cond: (col1 = tbl1.col1)
+ Index Searches: 0
-> Index Scan using tprt4_idx on tprt_4 (never executed)
Index Cond: (col1 = tbl1.col1)
+ Index Searches: 0
-> Index Scan using tprt5_idx on tprt_5 (never executed)
Index Cond: (col1 = tbl1.col1)
+ Index Searches: 0
-> Index Scan using tprt6_idx on tprt_6 (never executed)
Index Cond: (col1 = tbl1.col1)
-(15 rows)
+ Index Searches: 0
+(21 rows)
select tbl1.col1, tprt.col1 from tbl1
inner join tprt on tbl1.col1 > tprt.col1
@@ -3056,17 +3093,23 @@ select * from tbl1 inner join tprt on tbl1.col1 > tprt.col1;
-> Append (actual rows=4.60 loops=5)
-> Index Scan using tprt1_idx on tprt_1 (actual rows=2.00 loops=5)
Index Cond: (col1 < tbl1.col1)
+ Index Searches: 5
-> Index Scan using tprt2_idx on tprt_2 (actual rows=2.75 loops=4)
Index Cond: (col1 < tbl1.col1)
+ Index Searches: 4
-> Index Scan using tprt3_idx on tprt_3 (actual rows=1.00 loops=2)
Index Cond: (col1 < tbl1.col1)
+ Index Searches: 2
-> Index Scan using tprt4_idx on tprt_4 (never executed)
Index Cond: (col1 < tbl1.col1)
+ Index Searches: 0
-> Index Scan using tprt5_idx on tprt_5 (never executed)
Index Cond: (col1 < tbl1.col1)
+ Index Searches: 0
-> Index Scan using tprt6_idx on tprt_6 (never executed)
Index Cond: (col1 < tbl1.col1)
-(15 rows)
+ Index Searches: 0
+(21 rows)
explain (analyze, costs off, summary off, timing off, buffers off)
select * from tbl1 inner join tprt on tbl1.col1 = tprt.col1;
@@ -3077,17 +3120,23 @@ select * from tbl1 inner join tprt on tbl1.col1 = tprt.col1;
-> Append (actual rows=0.60 loops=5)
-> Index Scan using tprt1_idx on tprt_1 (never executed)
Index Cond: (col1 = tbl1.col1)
+ Index Searches: 0
-> Index Scan using tprt2_idx on tprt_2 (actual rows=1.00 loops=2)
Index Cond: (col1 = tbl1.col1)
+ Index Searches: 2
-> Index Scan using tprt3_idx on tprt_3 (actual rows=0.33 loops=3)
Index Cond: (col1 = tbl1.col1)
+ Index Searches: 3
-> Index Scan using tprt4_idx on tprt_4 (never executed)
Index Cond: (col1 = tbl1.col1)
+ Index Searches: 0
-> Index Scan using tprt5_idx on tprt_5 (never executed)
Index Cond: (col1 = tbl1.col1)
+ Index Searches: 0
-> Index Scan using tprt6_idx on tprt_6 (never executed)
Index Cond: (col1 = tbl1.col1)
-(15 rows)
+ Index Searches: 0
+(21 rows)
select tbl1.col1, tprt.col1 from tbl1
inner join tprt on tbl1.col1 > tprt.col1
@@ -3141,17 +3190,23 @@ select * from tbl1 join tprt on tbl1.col1 < tprt.col1;
-> Append (actual rows=1.00 loops=1)
-> Index Scan using tprt1_idx on tprt_1 (never executed)
Index Cond: (col1 > tbl1.col1)
+ Index Searches: 0
-> Index Scan using tprt2_idx on tprt_2 (never executed)
Index Cond: (col1 > tbl1.col1)
+ Index Searches: 0
-> Index Scan using tprt3_idx on tprt_3 (never executed)
Index Cond: (col1 > tbl1.col1)
+ Index Searches: 0
-> Index Scan using tprt4_idx on tprt_4 (never executed)
Index Cond: (col1 > tbl1.col1)
+ Index Searches: 0
-> Index Scan using tprt5_idx on tprt_5 (never executed)
Index Cond: (col1 > tbl1.col1)
+ Index Searches: 0
-> Index Scan using tprt6_idx on tprt_6 (actual rows=1.00 loops=1)
Index Cond: (col1 > tbl1.col1)
-(15 rows)
+ Index Searches: 1
+(21 rows)
select tbl1.col1, tprt.col1 from tbl1
inner join tprt on tbl1.col1 < tprt.col1
@@ -3173,17 +3228,23 @@ select * from tbl1 join tprt on tbl1.col1 = tprt.col1;
-> Append (actual rows=0.00 loops=1)
-> Index Scan using tprt1_idx on tprt_1 (never executed)
Index Cond: (col1 = tbl1.col1)
+ Index Searches: 0
-> Index Scan using tprt2_idx on tprt_2 (never executed)
Index Cond: (col1 = tbl1.col1)
+ Index Searches: 0
-> Index Scan using tprt3_idx on tprt_3 (never executed)
Index Cond: (col1 = tbl1.col1)
+ Index Searches: 0
-> Index Scan using tprt4_idx on tprt_4 (never executed)
Index Cond: (col1 = tbl1.col1)
+ Index Searches: 0
-> Index Scan using tprt5_idx on tprt_5 (never executed)
Index Cond: (col1 = tbl1.col1)
+ Index Searches: 0
-> Index Scan using tprt6_idx on tprt_6 (never executed)
Index Cond: (col1 = tbl1.col1)
-(15 rows)
+ Index Searches: 0
+(21 rows)
select tbl1.col1, tprt.col1 from tbl1
inner join tprt on tbl1.col1 = tprt.col1
@@ -3511,12 +3572,14 @@ explain (analyze, costs off, summary off, timing off, buffers off) execute mt_q1
Sort Key: ma_test.b
Subplans Removed: 1
-> Index Scan using ma_test_p2_b_idx on ma_test_p2 ma_test_1 (actual rows=1.00 loops=1)
+ Index Searches: 1
Filter: ((a >= $1) AND ((a % 10) = 5))
Rows Removed by Filter: 9
-> Index Scan using ma_test_p3_b_idx on ma_test_p3 ma_test_2 (actual rows=1.00 loops=1)
+ Index Searches: 1
Filter: ((a >= $1) AND ((a % 10) = 5))
Rows Removed by Filter: 9
-(9 rows)
+(11 rows)
execute mt_q1(15);
a
@@ -3532,9 +3595,10 @@ explain (analyze, costs off, summary off, timing off, buffers off) execute mt_q1
Sort Key: ma_test.b
Subplans Removed: 2
-> Index Scan using ma_test_p3_b_idx on ma_test_p3 ma_test_1 (actual rows=1.00 loops=1)
+ Index Searches: 1
Filter: ((a >= $1) AND ((a % 10) = 5))
Rows Removed by Filter: 9
-(6 rows)
+(7 rows)
execute mt_q1(25);
a
@@ -3582,13 +3646,17 @@ explain (analyze, costs off, summary off, timing off, buffers off) select * from
-> Limit (actual rows=1.00 loops=1)
-> Index Scan using ma_test_p2_b_idx on ma_test_p2 (actual rows=1.00 loops=1)
Index Cond: (b IS NOT NULL)
+ Index Searches: 1
-> Index Scan using ma_test_p1_b_idx on ma_test_p1 ma_test_1 (never executed)
+ Index Searches: 0
Filter: (a >= (InitPlan 2).col1)
-> Index Scan using ma_test_p2_b_idx on ma_test_p2 ma_test_2 (actual rows=10.00 loops=1)
+ Index Searches: 1
Filter: (a >= (InitPlan 2).col1)
-> Index Scan using ma_test_p3_b_idx on ma_test_p3 ma_test_3 (actual rows=10.00 loops=1)
+ Index Searches: 1
Filter: (a >= (InitPlan 2).col1)
-(14 rows)
+(18 rows)
reset enable_seqscan;
reset enable_sort;
@@ -4158,14 +4226,18 @@ select * from rangep where b IN((select 1),(select 2)) order by a;
-> Merge Append (actual rows=0.00 loops=1)
Sort Key: rangep_2.a
-> Index Scan using rangep_0_to_100_1_a_idx on rangep_0_to_100_1 rangep_2 (actual rows=0.00 loops=1)
+ Index Searches: 1
Filter: (b = ANY (ARRAY[(InitPlan 1).col1, (InitPlan 2).col1]))
-> Index Scan using rangep_0_to_100_2_a_idx on rangep_0_to_100_2 rangep_3 (actual rows=0.00 loops=1)
+ Index Searches: 1
Filter: (b = ANY (ARRAY[(InitPlan 1).col1, (InitPlan 2).col1]))
-> Index Scan using rangep_0_to_100_3_a_idx on rangep_0_to_100_3 rangep_4 (never executed)
+ Index Searches: 0
Filter: (b = ANY (ARRAY[(InitPlan 1).col1, (InitPlan 2).col1]))
-> Index Scan using rangep_100_to_200_a_idx on rangep_100_to_200 rangep_5 (actual rows=0.00 loops=1)
+ Index Searches: 1
Filter: (b = ANY (ARRAY[(InitPlan 1).col1, (InitPlan 2).col1]))
-(15 rows)
+(19 rows)
reset enable_sort;
drop table rangep;
diff --git a/src/test/regress/expected/select.out b/src/test/regress/expected/select.out
index cd79abc35..aa55b6a82 100644
--- a/src/test/regress/expected/select.out
+++ b/src/test/regress/expected/select.out
@@ -763,8 +763,9 @@ select * from onek2 where unique2 = 11 and stringu1 = 'ATAAAA';
--------------------------------------------------------------------
Index Scan using onek2_u2_prtl on onek2 (actual rows=1.00 loops=1)
Index Cond: (unique2 = 11)
+ Index Searches: 1
Filter: (stringu1 = 'ATAAAA'::name)
-(3 rows)
+(4 rows)
explain (costs off)
select unique2 from onek2 where unique2 = 11 and stringu1 = 'ATAAAA';
diff --git a/src/test/regress/sql/memoize.sql b/src/test/regress/sql/memoize.sql
index d5aab4e56..2197b0ac5 100644
--- a/src/test/regress/sql/memoize.sql
+++ b/src/test/regress/sql/memoize.sql
@@ -23,8 +23,10 @@ begin
ln := regexp_replace(ln, 'Evictions: 0', 'Evictions: Zero');
ln := regexp_replace(ln, 'Evictions: \d+', 'Evictions: N');
ln := regexp_replace(ln, 'Memory Usage: \d+', 'Memory Usage: N');
- ln := regexp_replace(ln, 'Heap Fetches: \d+', 'Heap Fetches: N');
- ln := regexp_replace(ln, 'loops=\d+', 'loops=N');
+ ln := regexp_replace(ln, 'Heap Fetches: \d+', 'Heap Fetches: N');
+ ln := regexp_replace(ln, 'loops=\d+', 'loops=N');
+ ln := regexp_replace(ln, 'Index Searches: 0', 'Index Searches: Zero');
+ ln := regexp_replace(ln, 'Index Searches: \d+', 'Index Searches: N');
return next ln;
end loop;
end;
diff --git a/src/test/regress/sql/partition_prune.sql b/src/test/regress/sql/partition_prune.sql
index 5f36d589b..4a2c74b08 100644
--- a/src/test/regress/sql/partition_prune.sql
+++ b/src/test/regress/sql/partition_prune.sql
@@ -588,6 +588,10 @@ begin
ln := regexp_replace(ln, 'Workers Launched: \d+', 'Workers Launched: N');
ln := regexp_replace(ln, 'actual rows=\d+(?:\.\d+)? loops=\d+', 'actual rows=N loops=N');
ln := regexp_replace(ln, 'Rows Removed by Filter: \d+', 'Rows Removed by Filter: N');
+ perform regexp_matches(ln, 'Index Searches: \d+');
+ if found then
+ continue;
+ end if;
return next ln;
end loop;
end;
--
2.47.2
On Mon, Feb 17, 2025 at 5:44 PM Peter Geoghegan <pg@bowt.ie> wrote:
I need more feedback about this. I don't understand your perspective here.
If I commit the skip scan patch, but don't have something like this
instrumentation in place, it seems quite likely that users will
complain about how opaque its behavior is. While having this
instrumentation isn't quite a blocker to committing the skip scan
patch, it's not too far off, either. I want to be pragmatic. Any
approach that's deemed acceptable is fine by me, provided it
implements approximately the same behavior as the patch that I wrote
implements.
Where is this state that tracks the number of index searches going to
live, if not in IndexScanDesc? I get why you don't particularly care
for that. But I don't understand what the alternative you favor looks
like.
+1 for having some instrumentation. I do not agree with Tom that these
are numbers that only Peter Geoghegan and 2-3 other people will ever
understand. I grant that it's not going to make sense to everyone, but
the number of people to whom it will make sense I would guess is
probably in the hundreds or the thousands rather than the single
digits. Good documentation could help.
So, where should we store that information?
The thing that's odd about using IndexScanDesc is that it doesn't
contain any other instrumentation currently, or at least not that I
can see. Everything else that EXPLAIN prints for index scans is
printed by show_instrumentation_count() from planstate->instrument. So
it seems reasonable to think maybe this should be part of
planstate->instrument, too, but there seem to be at least two problems
with that idea. First, that struct has just four counters (ntuples,
ntuples2, nfiltered1, nfiltered2). For an index-only scan, all four of
them are in use -- a plain index scan does not use ntuples2 -- but I
think this optimization can apply to both index and index-only scans,
so we just don't have room. Second, those existing counters are used
for things that we can count in the executor, but the executor won't
know how many index searches occur down inside the AM. So I see the
following possibilities:
1. Add a new field to struct Instrumentation. Make
index_getnext_slot() and possibly other similar functions return a
count of index searches via a new parameter, and use that to bump the
new struct Instrumentation counter. It's a little ugly to have to add
a special-purpose parameter for this, but it doesn't look like there
are tons and tons of calls so maybe it's OK.
2. Add a new field to BufferUsage. Then the AM can bump this field and
explain.c will know about it the same way it knows about other changes
to pgBufferUsage. However, this is not really about buffer usage and
the new field seems utterly unlike the other things that are in that
structure, so this seems really bad to me.
3. Add a new field to IndexScanDesc, as you originally proposed. This
seems similar to #1: it's still shoveling instrumentation data around
in a way that we don't currently, but instead of shoveling it through
a new parameter, we shovel it through a new structure member. Either
way, the new thing (parameter or structure member) doesn't really look
like it belongs with what's already there, so it seems like
conservation of ugliness.
4. Add a new field to the btree-specific structure referenced by the
IndexScanDesc's opaque pointer. I think this is what Matthias was
proposing. It doesn't seem particularly hard to implement, and seems
similar to #1 and #3.
It is not clear to me that any of #1, #3, and #4 are radically better
than any of the other ones, with the following exception: it would be
a poor idea to choose #4 over #3 if this field will ultimately be used
for a bunch of different AMs, and a poor idea to choose #3 over #4 if
it's always going to be interesting only for btree. I'll defer to you
on which of those things is the case, but with the request that you
think about what is practically likely to happen and not advocate too
vigorously based on an argument that makes prominent use of the phrase
"in theory". To be honest, I don't really like any of these options
very much: they all seem a tad inelegant. But sometimes that is a
necessary evil when inventing something new. I believe if I were
implementing this myself I would probably try #1 first; if that ended
up seeming too ugly, then I would fall back to #3 or #4.
Does that help?
--
Robert Haas
EDB: http://www.enterprisedb.com
On Thu, Feb 27, 2025 at 3:42 PM Robert Haas <robertmhaas@gmail.com> wrote:
+1 for having some instrumentation. I do not agree with Tom that these
are numbers that only Peter Geoghegan and 2-3 other people will ever
understand. I grant that it's not going to make sense to everyone, but
the number of people to whom it will make sense I would guess is
probably in the hundreds or the thousands rather than the single
digits.
I agree that it's likely to be of interest to only a minority of
users. But a fairly large minority.
Good documentation could help.
It's easy to produce an example that makes intuitive sense. For
example, with skip scan that has a qual such as "WHERE a BETWEEN 1 and
5 AND b = 12345", it is likely that EXPLAIN ANALYZE will show "Index
Searches: 5" -- one search per "a" value. Such an example might be
more useful than my original pgbench_accounts example.
Do you think that that would help?
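For concreteness, here's a rough sketch of the kind of documentation
example that might work (hypothetical table, index, and data -- not output
from any real run):
create table skiptest (a int, b int);
create index skiptest_a_b_idx on skiptest (a, b);
insert into skiptest select i % 5 + 1, i from generate_series(1, 100000) i;
explain (analyze, costs off, summary off)
select * from skiptest where a between 1 and 5 and b = 12345;
-- With skip scan applied, the scan of skiptest_a_b_idx would be expected
-- to show "Index Searches: 5" -- one search per distinct "a" value in the
-- range.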
So, where should we store that information?
The thing that's odd about using IndexScanDesc is that it doesn't
contain any other instrumentation currently, or at least not that I
can see.
It is unique right now, but perhaps only because this is the first
piece of instrumentation that:
A. We must track at the level of an individual index scan -- not at
the level of an index relation, like pgstat_count_index_scan(), nor at
the level of the whole system, like BufferUsage state.
AND
B. Requires that we count something that fundamentally lives inside
the index AM -- something that we cannot reasonably track/infer from
the executor proper (more on that below, in my response to your scheme
#1).
First, that struct has just four counters (ntuples,
ntuples2, nfiltered1, nfiltered2). For an index-only scan, all four of
them are in use -- a plain index scan does not use ntuples2 -- but I
think this optimization can apply to both index and index-only scans,
so we just don't have room.
Right, index-only scans have exactly the same requirements as plain
index scans/bitmap index scans.
1. Add a new field to struct Instrumentation. Make
index_getnext_slot() and possibly other similar functions return a
count of index searches via a new parameter, and use that to bump the
new struct Instrumentation counter. It's a little ugly to have to add
a special-purpose parameter for this, but it doesn't look like there
are tons and tons of calls so maybe it's OK.
That seems like the strongest possible alternative to the original
scheme used in the current draft patch (scheme #3).
This scheme #1 has the same issue as scheme #3, though: it still
requires an integer counter that tracks the number of index searches
(something a bit simpler than that, such as a boolean flag set once
per amgettuple call, won't do). This is due to there being no fixed
limit on the number of index searches required during any single
amgettuple call: in general the index AM may perform quite a few index
searches before it is able to return the first tuple to the scan (or
before it can return the next tuple).
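As an illustration (a made-up table and made-up array values, shown only
to make the access pattern concrete):
create table sparse_keys (k int primary key);
insert into sparse_keys select i * 10 from generate_series(1, 100000) i;
select * from sparse_keys
where k = any('{5, 500005, 999995, 1000000}'::int[]);
-- The first three array elements match nothing and fall on leaf pages that
-- are far apart, so the scan would likely perform several index searches
-- within a single amgettuple call before it can return its only matching
-- tuple (k = 1000000).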
The only difference that I can see between scheme #1 and scheme #3 is
that the former requires 2 counters instead of just 1. And, we'd still
need to have 1 out of the 2 counters located either in IndexScanDesc
itself (just like scheme #3), or located in some other struct that can
at least be accessed through IndexScanDesc (like the index AM opaque
state, per scheme #4). After all, *every* piece of state known to any
amgettuple routine must ultimately come from IndexScanDesc (or from
backend global state, per scheme #2).
(Actually, I suppose it is technically possible to avoid storing
anything in IndexScanDesc by inventing another amgettuple argument,
just for this. That seems like a distinction without a difference,
though.)
2. Add a new field to BufferUsage. Then the AM can bump this field and
explain.c will know about it the same way it knows about other changes
to pgBufferUsage. However, this is not really about buffer usage and
the new field seems utterly unlike the other things that are in that
structure, so this seems really bad to me.
I agree. The use of global variables seems quite inappropriate for
something like this. It'll result in wrong output whenever an index
scan's qual uses an InitPlan that is itself a plan containing an index
scan (this is already an issue with "Buffers:" instrumentation, but it
would be much worse here).
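For example (a made-up query shape; the tables and columns here are
hypothetical):
create table orders (order_id int primary key, total numeric);
create table order_lines (order_id int, line_no int,
                          primary key (order_id, line_no));
select * from orders
where order_id = (select max(order_id) from order_lines);
-- The uncorrelated subquery becomes an InitPlan, and its MAX() will often
-- be computed by its own index scan on order_lines.  With one global
-- counter, index searches performed while evaluating the InitPlan could be
-- attributed to the outer scan on orders -- the same kind of bleed-through
-- that "Buffers:" already suffers from, only worse here.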
3. Add a new field to IndexScanDesc, as you originally proposed. This
seems similar to #1: it's still shoveling instrumentation data around
in a way that we don't currently, but instead of shoveling it through
a new parameter, we shovel it through a new structure member. Either
way, the new thing (parameter or structure member) doesn't really look
like it belongs with what's already there, so it seems like
conservation of ugliness.
Perhaps a comment noting why the new counter lives in IndexScanDesc would help?
4. Add a new field to the btree-specific structure referenced by the
IndexScanDesc's opaque pointer. I think this is what Matthias was
proposing. It doesn't seem particularly hard to implement, and seems
similar to #1 and #3.
It definitely isn't hard to implement. But...
It is not clear to me that any of #1, #3, and #4 are radically better
than any of the other ones, with the following exception: it would be
a poor idea to choose #4 over #3 if this field will ultimately be used
for a bunch of different AMs, and a poor idea to choose #3 over #4 if
it's always going to be interesting only for btree.
...the requirements here really are 100% index-AM-generic. So I just
don't see any advantage to scheme #4.
All that I propose to do here is to display the information that is
already tracked by
pgstat_count_index_scan()/pg_stat_user_tables.idx_scan (at the level
of each index relation) when EXPLAIN ANALYZE is run (at the level of
each index scan). I'm not inventing a new concept; I'm extending an
existing index-AM-generic approach. It just so happens that there is a
greater practical need for that information within B-Tree scans.
Admittedly, the actual amount of code the patch adds to nbtree is
slightly more than it'll add to other index AMs (all of which require
exactly one added line of code). But that's only because nbtree alone
supports parallel index scans. Were it not for that, nbtree would also
require only a single additional line of code.
I'll defer to you on which of those things is the case, but with the request that you
think about what is practically likely to happen and not advocate too
vigorously based on an argument that makes prominent use of the phrase
"in theory".
Right now, the only instrumentation that lives inside index AMs is
pgstat_count_index_scan(), which works at the relation level (not the
scan level).
Matthias may well be right that we'll eventually want to add more
stuff like this. For example, maybe we'd show the number of index
tuples that we evaluated against the scan's qual that were not
actually returned to the executor proper. Then we really would be
inventing a whole new concept -- but a concept that was also
index-AM-neutral. I think that that's likely to be true generally, for
all manner of possible instrumentation improvements that live inside
index AMs -- so we'll likely end up putting more fields in
IndexScanDesc for that stuff.
To be honest, I don't really like any of these options
very much: they all seem a tad inelegant. But sometimes that is a
necessary evil when inventing something new. I believe if I were
implementing this myself I would probably try #1 first; if that ended
up seeming too ugly, then I would fall back to #3 or #4.
I do get that. I hope that you don't think that I've failed to take
your feedback on board.
I ended up at #3 only through yak-shaving/trial-and-error coding.
Hopefully my reasoning makes sense.
Does that help?
Yes, it does -- hugely. I'll need to think about it some more. The
documentation definitely needs more work, at a minimum.
Thanks for the review!
--
Peter Geoghegan
On Thu, Feb 27, 2025 at 7:57 PM Peter Geoghegan <pg@bowt.ie> wrote:
On Thu, Feb 27, 2025 at 3:42 PM Robert Haas <robertmhaas@gmail.com> wrote:
Good documentation could help.
Attached revision adds an example that shows how "Index Searches: N"
can vary. This appears in "14.1.2. EXPLAIN ANALYZE".
Other changes in this revision:
* Improved commit message.
* We now consistently show "Index Searches: N" after all other
scan-related output, so that it will reliably appear immediately before the
"Buffers: " line.
This seemed slightly better, since it is often useful to consider
these two numbers together.
My current plan is to commit this patch on Wednesday or Thursday,
barring any objections.
Thanks
--
Peter Geoghegan
Attachments:
0001-Show-index-search-count-in-EXPLAIN-ANALYZE.patch (application/octet-stream)
From a5832ee06f93ae6d57773e4dec409abf1748449b Mon Sep 17 00:00:00 2001
From: Peter Geoghegan <pg@bowt.ie>
Date: Wed, 14 Aug 2024 13:50:23 -0400
Subject: [PATCH] Show index search count in EXPLAIN ANALYZE.
Expose the count of index searches/index descents in EXPLAIN ANALYZE's
output for index scan nodes. This information is particularly useful
with scans that use ScalarArrayOp quals, where the number of index scans
isn't predictable in advance (at least not with optimizations like the
one added to nbtree by Postgres 17 commit 5bf748b8). It will also be
useful when EXPLAIN ANALYZE shows details of an nbtree index scan that
uses skip scan optimizations set to be introduced by an upcoming patch.
The instrumentation works by teaching index AMs to increment a new
nsearches counter whenever a new index search begins. The counter is
incremented at exactly the same point that index AMs must already
increment the index's pg_stat_*_indexes.idx_scan counter (since we're
counting exactly the same event, though at a finer granularity).
There was much debate on the best place to store the new counter. We
settled on storing it in the index scan descriptor (IndexScanDescData).
This approach is unique among all the different approaches used to track
query execution costs, but the requirements seem to leave us with no
better alternative. The requirements are themselves unique (at least
right now): we must track the number of searches at the scan granularity
(at the level of the scan, not the level of the system/relation), from
within each index AM's amgettuple/amgetbitmap routine (since it isn't
possible to track it from the executor proper).
Author: Peter Geoghegan <pg@bowt.ie>
Reviewed-By: Tomas Vondra <tomas@vondra.me>
Reviewed-By: Robert Haas <robertmhaas@gmail.com>
Reviewed-By: Masahiro Ikeda <ikedamsh@oss.nttdata.com>
Discussion: https://postgr.es/m/CAH2-Wz=PKR6rB7qbx+Vnd7eqeB5VTcrW=iJvAsTsKbdG+kW_UA@mail.gmail.com
Discussion: https://postgr.es/m/CAH2-WzkRqvaqR2CTNqTZP0z6FuL4-3ED6eQB0yx38XBNj1v-4Q@mail.gmail.com
---
src/include/access/relscan.h | 3 +
src/backend/access/brin/brin.c | 1 +
src/backend/access/gin/ginscan.c | 1 +
src/backend/access/gist/gistget.c | 2 +
src/backend/access/hash/hashsearch.c | 1 +
src/backend/access/index/genam.c | 1 +
src/backend/access/nbtree/nbtree.c | 15 +++
src/backend/access/nbtree/nbtsearch.c | 1 +
src/backend/access/spgist/spgscan.c | 1 +
src/backend/commands/explain.c | 40 +++++++
contrib/bloom/blscan.c | 1 +
doc/src/sgml/bloom.sgml | 7 +-
doc/src/sgml/monitoring.sgml | 26 +++--
doc/src/sgml/perform.sgml | 60 +++++++++++
doc/src/sgml/ref/explain.sgml | 3 +-
doc/src/sgml/rules.sgml | 2 +
src/test/regress/expected/brin_multi.out | 27 +++--
src/test/regress/expected/memoize.out | 50 ++++++---
src/test/regress/expected/partition_prune.out | 100 +++++++++++++++---
src/test/regress/expected/select.out | 3 +-
src/test/regress/sql/memoize.sql | 6 +-
src/test/regress/sql/partition_prune.sql | 4 +
22 files changed, 306 insertions(+), 49 deletions(-)
diff --git a/src/include/access/relscan.h b/src/include/access/relscan.h
index dc6e01842..8b0552f76 100644
--- a/src/include/access/relscan.h
+++ b/src/include/access/relscan.h
@@ -150,6 +150,9 @@ typedef struct IndexScanDescData
/* index access method's private state */
void *opaque; /* access-method-specific info */
+ /* instrumentation that is maintained by index access method */
+ uint64 nsearches; /* total # of index searches */
+
/*
* In an index-only scan, a successful amgettuple call must fill either
* xs_itup (and xs_itupdesc) or xs_hitup (and xs_hitupdesc) to provide the
diff --git a/src/backend/access/brin/brin.c b/src/backend/access/brin/brin.c
index 75a65ec9c..9f146c12a 100644
--- a/src/backend/access/brin/brin.c
+++ b/src/backend/access/brin/brin.c
@@ -591,6 +591,7 @@ bringetbitmap(IndexScanDesc scan, TIDBitmap *tbm)
opaque = (BrinOpaque *) scan->opaque;
bdesc = opaque->bo_bdesc;
pgstat_count_index_scan(idxRel);
+ scan->nsearches++;
/*
* We need to know the size of the table so that we know how long to
diff --git a/src/backend/access/gin/ginscan.c b/src/backend/access/gin/ginscan.c
index 63ded6301..8c1bbf366 100644
--- a/src/backend/access/gin/ginscan.c
+++ b/src/backend/access/gin/ginscan.c
@@ -437,6 +437,7 @@ ginNewScanKey(IndexScanDesc scan)
MemoryContextSwitchTo(oldCtx);
pgstat_count_index_scan(scan->indexRelation);
+ scan->nsearches++;
}
void
diff --git a/src/backend/access/gist/gistget.c b/src/backend/access/gist/gistget.c
index cc40e928e..609e85fda 100644
--- a/src/backend/access/gist/gistget.c
+++ b/src/backend/access/gist/gistget.c
@@ -625,6 +625,7 @@ gistgettuple(IndexScanDesc scan, ScanDirection dir)
GISTSearchItem fakeItem;
pgstat_count_index_scan(scan->indexRelation);
+ scan->nsearches++;
so->firstCall = false;
so->curPageData = so->nPageData = 0;
@@ -750,6 +751,7 @@ gistgetbitmap(IndexScanDesc scan, TIDBitmap *tbm)
return 0;
pgstat_count_index_scan(scan->indexRelation);
+ scan->nsearches++;
/* Begin the scan by processing the root page */
so->curPageData = so->nPageData = 0;
diff --git a/src/backend/access/hash/hashsearch.c b/src/backend/access/hash/hashsearch.c
index a3a1fccf3..c4f730437 100644
--- a/src/backend/access/hash/hashsearch.c
+++ b/src/backend/access/hash/hashsearch.c
@@ -298,6 +298,7 @@ _hash_first(IndexScanDesc scan, ScanDirection dir)
HashScanPosItem *currItem;
pgstat_count_index_scan(rel);
+ scan->nsearches++;
/*
* We do not support hash scans with no index qualification, because we
diff --git a/src/backend/access/index/genam.c b/src/backend/access/index/genam.c
index 07bae342e..0cabfa5de 100644
--- a/src/backend/access/index/genam.c
+++ b/src/backend/access/index/genam.c
@@ -119,6 +119,7 @@ RelationGetIndexScan(Relation indexRelation, int nkeys, int norderbys)
scan->ignore_killed_tuples = !scan->xactStartedInRecovery;
scan->opaque = NULL;
+ scan->nsearches = 0;
scan->xs_itup = NULL;
scan->xs_itupdesc = NULL;
diff --git a/src/backend/access/nbtree/nbtree.c b/src/backend/access/nbtree/nbtree.c
index 45ea6afba..eabd54b8c 100644
--- a/src/backend/access/nbtree/nbtree.c
+++ b/src/backend/access/nbtree/nbtree.c
@@ -70,6 +70,7 @@ typedef struct BTParallelScanDescData
BTPS_State btps_pageStatus; /* indicates whether next page is
* available for scan. see above for
* possible states of parallel scan. */
+ uint64 btps_nsearches; /* tracked for IndexScanDescData.nsearches */
slock_t btps_mutex; /* protects above variables, btps_arrElems */
ConditionVariable btps_cv; /* used to synchronize parallel scan */
@@ -557,6 +558,7 @@ btinitparallelscan(void *target)
bt_target->btps_nextScanPage = InvalidBlockNumber;
bt_target->btps_lastCurrPage = InvalidBlockNumber;
bt_target->btps_pageStatus = BTPARALLEL_NOT_INITIALIZED;
+ bt_target->btps_nsearches = 0;
ConditionVariableInit(&bt_target->btps_cv);
}
@@ -583,6 +585,7 @@ btparallelrescan(IndexScanDesc scan)
btscan->btps_nextScanPage = InvalidBlockNumber;
btscan->btps_lastCurrPage = InvalidBlockNumber;
btscan->btps_pageStatus = BTPARALLEL_NOT_INITIALIZED;
+ /* deliberately don't reset btps_nsearches (matches index_rescan) */
SpinLockRelease(&btscan->btps_mutex);
}
@@ -676,6 +679,7 @@ _bt_parallel_seize(IndexScanDesc scan, BlockNumber *next_scan_page,
{
/* Can start scheduled primitive scan right away, so do so */
btscan->btps_pageStatus = BTPARALLEL_ADVANCING;
+ btscan->btps_nsearches++;
for (int i = 0; i < so->numArrayKeys; i++)
{
BTArrayKeyInfo *array = &so->arrayKeys[i];
@@ -712,6 +716,11 @@ _bt_parallel_seize(IndexScanDesc scan, BlockNumber *next_scan_page,
*/
btscan->btps_pageStatus = BTPARALLEL_ADVANCING;
Assert(btscan->btps_nextScanPage != P_NONE);
+ if (btscan->btps_nextScanPage == InvalidBlockNumber)
+ {
+ Assert(first);
+ btscan->btps_nsearches++;
+ }
*next_scan_page = btscan->btps_nextScanPage;
*last_curr_page = btscan->btps_lastCurrPage;
exit_loop = true;
@@ -810,6 +819,12 @@ _bt_parallel_done(IndexScanDesc scan)
btscan->btps_pageStatus = BTPARALLEL_DONE;
status_changed = true;
}
+
+ /*
+ * Don't use local nsearches counter -- overwrite it with the nsearches
+ * counter that we've been maintaining in shared memory
+ */
+ scan->nsearches = btscan->btps_nsearches;
SpinLockRelease(&btscan->btps_mutex);
/* wake up all the workers associated with this parallel scan */
diff --git a/src/backend/access/nbtree/nbtsearch.c b/src/backend/access/nbtree/nbtsearch.c
index 472ce06f1..941b4eaaf 100644
--- a/src/backend/access/nbtree/nbtsearch.c
+++ b/src/backend/access/nbtree/nbtsearch.c
@@ -950,6 +950,7 @@ _bt_first(IndexScanDesc scan, ScanDirection dir)
* _bt_search/_bt_endpoint below
*/
pgstat_count_index_scan(rel);
+ scan->nsearches++;
/*----------
* Examine the scan keys to discover where we need to start the scan.
diff --git a/src/backend/access/spgist/spgscan.c b/src/backend/access/spgist/spgscan.c
index 53f910e9d..8554b4535 100644
--- a/src/backend/access/spgist/spgscan.c
+++ b/src/backend/access/spgist/spgscan.c
@@ -421,6 +421,7 @@ spgrescan(IndexScanDesc scan, ScanKey scankey, int nscankeys,
/* count an indexscan for stats */
pgstat_count_index_scan(scan->indexRelation);
+ scan->nsearches++;
}
void
diff --git a/src/backend/commands/explain.c b/src/backend/commands/explain.c
index 4271dd48e..8d81357bf 100644
--- a/src/backend/commands/explain.c
+++ b/src/backend/commands/explain.c
@@ -13,6 +13,7 @@
*/
#include "postgres.h"
+#include "access/relscan.h"
#include "access/xact.h"
#include "catalog/pg_type.h"
#include "commands/createas.h"
@@ -137,6 +138,7 @@ static void show_recursive_union_info(RecursiveUnionState *rstate,
static void show_memoize_info(MemoizeState *mstate, List *ancestors,
ExplainState *es);
static void show_hashagg_info(AggState *aggstate, ExplainState *es);
+static void show_indexsearches_info(PlanState *planstate, ExplainState *es);
static void show_tidbitmap_info(BitmapHeapScanState *planstate,
ExplainState *es);
static void show_instrumentation_count(const char *qlabel, int which,
@@ -2122,6 +2124,7 @@ ExplainNode(PlanState *planstate, List *ancestors,
if (plan->qual)
show_instrumentation_count("Rows Removed by Filter", 1,
planstate, es);
+ show_indexsearches_info(planstate, es);
break;
case T_IndexOnlyScan:
show_scan_qual(((IndexOnlyScan *) plan)->indexqual,
@@ -2138,10 +2141,12 @@ ExplainNode(PlanState *planstate, List *ancestors,
if (es->analyze)
ExplainPropertyFloat("Heap Fetches", NULL,
planstate->instrument->ntuples2, 0, es);
+ show_indexsearches_info(planstate, es);
break;
case T_BitmapIndexScan:
show_scan_qual(((BitmapIndexScan *) plan)->indexqualorig,
"Index Cond", planstate, ancestors, es);
+ show_indexsearches_info(planstate, es);
break;
case T_BitmapHeapScan:
show_scan_qual(((BitmapHeapScan *) plan)->bitmapqualorig,
@@ -3881,6 +3886,41 @@ show_hashagg_info(AggState *aggstate, ExplainState *es)
}
}
+/*
+ * Show the total number of index searches performed by a
+ * IndexScan/IndexOnlyScan/BitmapIndexScan node
+ */
+static void
+show_indexsearches_info(PlanState *planstate, ExplainState *es)
+{
+ Plan *plan = planstate->plan;
+ struct IndexScanDescData *scanDesc = NULL;
+ uint64 nsearches = 0;
+
+ if (!es->analyze)
+ return;
+
+ switch (nodeTag(plan))
+ {
+ case T_IndexScan:
+ scanDesc = ((IndexScanState *) planstate)->iss_ScanDesc;
+ break;
+ case T_IndexOnlyScan:
+ scanDesc = ((IndexOnlyScanState *) planstate)->ioss_ScanDesc;
+ break;
+ case T_BitmapIndexScan:
+ scanDesc = ((BitmapIndexScanState *) planstate)->biss_ScanDesc;
+ break;
+ default:
+ break;
+ }
+
+ if (scanDesc)
+ nsearches = scanDesc->nsearches;
+
+ ExplainPropertyUInteger("Index Searches", NULL, nsearches, es);
+}
+
/*
* Show exact/lossy pages for a BitmapHeapScan node
*/
diff --git a/contrib/bloom/blscan.c b/contrib/bloom/blscan.c
index bf801fe78..472169d61 100644
--- a/contrib/bloom/blscan.c
+++ b/contrib/bloom/blscan.c
@@ -116,6 +116,7 @@ blgetbitmap(IndexScanDesc scan, TIDBitmap *tbm)
bas = GetAccessStrategy(BAS_BULKREAD);
npages = RelationGetNumberOfBlocks(scan->indexRelation);
pgstat_count_index_scan(scan->indexRelation);
+ scan->nsearches++;
for (blkno = BLOOM_HEAD_BLKNO; blkno < npages; blkno++)
{
diff --git a/doc/src/sgml/bloom.sgml b/doc/src/sgml/bloom.sgml
index 663a0a4a6..ec5d07767 100644
--- a/doc/src/sgml/bloom.sgml
+++ b/doc/src/sgml/bloom.sgml
@@ -173,10 +173,11 @@ CREATE INDEX
Buffers: shared hit=21864
-> Bitmap Index Scan on bloomidx (cost=0.00..178436.00 rows=1 width=0) (actual time=20.005..20.005 rows=2300.00 loops=1)
Index Cond: ((i2 = 898732) AND (i5 = 123451))
+ Index Searches: 1
Buffers: shared hit=19608
Planning Time: 0.099 ms
Execution Time: 22.632 ms
-(10 rows)
+(11 rows)
</programlisting>
</para>
@@ -208,13 +209,15 @@ CREATE INDEX
Buffers: shared hit=6
-> Bitmap Index Scan on btreeidx5 (cost=0.00..4.52 rows=11 width=0) (actual time=0.026..0.026 rows=7.00 loops=1)
Index Cond: (i5 = 123451)
+ Index Searches: 1
Buffers: shared hit=3
-> Bitmap Index Scan on btreeidx2 (cost=0.00..4.52 rows=11 width=0) (actual time=0.007..0.007 rows=8.00 loops=1)
Index Cond: (i2 = 898732)
+ Index Searches: 1
Buffers: shared hit=3
Planning Time: 0.264 ms
Execution Time: 0.047 ms
-(13 rows)
+(15 rows)
</programlisting>
Although this query runs much faster than with either of the single
indexes, we pay a penalty in index size. Each of the single-column
diff --git a/doc/src/sgml/monitoring.sgml b/doc/src/sgml/monitoring.sgml
index 9178f1d34..675d2bf42 100644
--- a/doc/src/sgml/monitoring.sgml
+++ b/doc/src/sgml/monitoring.sgml
@@ -4228,16 +4228,30 @@ description | Waiting for a newly initialized WAL file to reach durable storage
<note>
<para>
- Queries that use certain <acronym>SQL</acronym> constructs to search for
- rows matching any value out of a list or array of multiple scalar values
- (see <xref linkend="functions-comparisons"/>) perform multiple
- <quote>primitive</quote> index scans (up to one primitive scan per scalar
- value) during query execution. Each internal primitive index scan
- increments <structname>pg_stat_all_indexes</structname>.<structfield>idx_scan</structfield>,
+ Index scans may sometimes perform multiple index searches per execution.
+ Each index search increments <structname>pg_stat_all_indexes</structname>.<structfield>idx_scan</structfield>,
so it's possible for the count of index scans to significantly exceed the
total number of index scan executor node executions.
</para>
+ <para>
+ This can happen with queries that use certain <acronym>SQL</acronym>
+ constructs to search for rows matching any value out of a list or array of
+ multiple scalar values (see <xref linkend="functions-comparisons"/>). It
+ can also happen to queries with
+ <literal><replaceable>expression</replaceable> =
+ <replaceable>value1</replaceable> OR
+ <replaceable>expression</replaceable> = <replaceable>value2</replaceable>
+ ...</literal> constructs, though only when the optimizer transforms the
+ constructs into an equivalent multi-valued array representation.
+ </para>
</note>
+ <tip>
+ <para>
+ <command>EXPLAIN ANALYZE</command> outputs the total number of index
+ searches performed by each index scan node. See
+ <xref linkend="using-explain-analyze"/> for an example.
+ </para>
+ </tip>
</sect2>
diff --git a/doc/src/sgml/perform.sgml b/doc/src/sgml/perform.sgml
index be4b49f62..37d9bd365 100644
--- a/doc/src/sgml/perform.sgml
+++ b/doc/src/sgml/perform.sgml
@@ -729,9 +729,11 @@ WHERE t1.unique1 < 10 AND t1.unique2 = t2.unique2;
Buffers: shared hit=3 read=5 written=4
-> Bitmap Index Scan on tenk1_unique1 (cost=0.00..4.36 rows=10 width=0) (actual time=0.004..0.004 rows=10.00 loops=1)
Index Cond: (unique1 < 10)
+ Index Searches: 1
Buffers: shared hit=2
-> Index Scan using tenk2_unique2 on tenk2 t2 (cost=0.29..7.90 rows=1 width=244) (actual time=0.003..0.003 rows=1 loops=10)
Index Cond: (unique2 = t1.unique2)
+ Index Searches: 10
Buffers: shared hit=24 read=6
Planning:
Buffers: shared hit=15 dirtied=9
@@ -790,6 +792,7 @@ WHERE t1.unique1 < 100 AND t1.unique2 = t2.unique2 ORDER BY t1.fivethous;
Buffers: shared hit=92
-> Bitmap Index Scan on tenk1_unique1 (cost=0.00..5.04 rows=100 width=0) (actual time=0.013..0.013 rows=100.00 loops=1)
Index Cond: (unique1 < 100)
+ Index Searches: 1
Buffers: shared hit=2
Planning:
Buffers: shared hit=12
@@ -805,6 +808,58 @@ WHERE t1.unique1 < 100 AND t1.unique2 = t2.unique2 ORDER BY t1.fivethous;
shown.)
</para>
+ <para>
+ Index Scan nodes (as well as Bitmap Index Scan and Index-Only Scan nodes)
+ show an <quote>Index Searches</quote> line that reports the total number
+ of searches across <emphasis>all</emphasis> node
+ executions/<literal>loops</literal>:
+
+<screen>
+EXPLAIN ANALYZE SELECT * FROM tenk1 WHERE thousand IN (1, 500, 700, 999);
+ QUERY PLAN
+-------------------------------------------------------------------&zwsp;---------------------------------------------------------
+ Bitmap Heap Scan on tenk1 (cost=9.45..73.44 rows=40 width=244) (actual time=0.012..0.028 rows=40.00 loops=1)
+ Recheck Cond: (thousand = ANY ('{1,500,700,999}'::integer[]))
+ Heap Blocks: exact=39
+ Buffers: shared hit=47
+ -> Bitmap Index Scan on tenk1_thous_tenthous (cost=0.00..9.44 rows=40 width=0) (actual time=0.009..0.009 rows=40.00 loops=1)
+ Index Cond: (thousand = ANY ('{1,500,700,999}'::integer[]))
+ Index Searches: 4
+ Buffers: shared hit=8
+ Planning Time: 0.037 ms
+ Execution Time: 0.034 ms
+</screen>
+
+ Here we see a Bitmap Index Scan node that needed 4 separate index
+ searches. The scan had to search the index from the
+ <structname>tenk1_thous_tenthous</structname> index root page once per
+ <type>integer</type> value from the predicate's <literal>IN</literal>
+ construct. However, the number of index searches often won't have such a
+ simple correspondence to the query predicate:
+
+<screen>
+EXPLAIN ANALYZE SELECT * FROM tenk1 WHERE thousand IN (1, 2, 3, 4);
+ QUERY PLAN
+----------------------------------------------------------------------------------------------------------------------------------
+ Bitmap Heap Scan on tenk1 (cost=9.45..73.44 rows=40 width=244) (actual time=0.009..0.019 rows=40.00 loops=1)
+ Recheck Cond: (thousand = ANY ('{1,2,3,4}'::integer[]))
+ Heap Blocks: exact=38
+ Buffers: shared hit=40
+ -> Bitmap Index Scan on tenk1_thous_tenthous (cost=0.00..9.44 rows=40 width=0) (actual time=0.005..0.005 rows=40.00 loops=1)
+ Index Cond: (thousand = ANY ('{1,2,3,4}'::integer[]))
+ Index Searches: 1
+ Buffers: shared hit=2
+ Planning Time: 0.029 ms
+ Execution Time: 0.026 ms
+</screen>
+
+ This variant of our <literal>IN</literal> query performed only 1 index
+ search. It spent less time traversing the index (compared to the original
+ query) because its <literal>IN</literal> construct uses values matching
+ index tuples stored next to each other, on the same
+ <structname>tenk1_thous_tenthous</structname> index leaf page.
+ </para>
+
<para>
Another type of extra information is the number of rows removed by a
filter condition:
@@ -861,6 +916,7 @@ EXPLAIN ANALYZE SELECT * FROM polygon_tbl WHERE f1 @> polygon '(0.5,2.0)';
Index Scan using gpolygonind on polygon_tbl (cost=0.13..8.15 rows=1 width=85) (actual time=0.074..0.074 rows=0.00 loops=1)
Index Cond: (f1 @> '((0.5,2))'::polygon)
Rows Removed by Index Recheck: 1
+ Index Searches: 1
Buffers: shared hit=1
Planning Time: 0.039 ms
Execution Time: 0.098 ms
@@ -894,8 +950,10 @@ EXPLAIN (ANALYZE, BUFFERS OFF) SELECT * FROM tenk1 WHERE unique1 < 100 AND un
-> BitmapAnd (cost=25.07..25.07 rows=10 width=0) (actual time=0.100..0.101 rows=0.00 loops=1)
-> Bitmap Index Scan on tenk1_unique1 (cost=0.00..5.04 rows=100 width=0) (actual time=0.027..0.027 rows=100.00 loops=1)
Index Cond: (unique1 < 100)
+ Index Searches: 1
-> Bitmap Index Scan on tenk1_unique2 (cost=0.00..19.78 rows=999 width=0) (actual time=0.070..0.070 rows=999.00 loops=1)
Index Cond: (unique2 > 9000)
+ Index Searches: 1
Planning Time: 0.162 ms
Execution Time: 0.143 ms
</screen>
@@ -923,6 +981,7 @@ EXPLAIN ANALYZE UPDATE tenk1 SET hundred = hundred + 1 WHERE unique1 < 100;
Buffers: shared hit=4 read=2
-> Bitmap Index Scan on tenk1_unique1 (cost=0.00..5.04 rows=100 width=0) (actual time=0.031..0.031 rows=100.00 loops=1)
Index Cond: (unique1 < 100)
+ Index Searches: 1
Buffers: shared read=2
Planning Time: 0.151 ms
Execution Time: 1.856 ms
@@ -1061,6 +1120,7 @@ EXPLAIN ANALYZE SELECT * FROM tenk1 WHERE unique1 < 100 AND unique2 > 9000
Index Cond: (unique2 > 9000)
Filter: (unique1 < 100)
Rows Removed by Filter: 287
+ Index Searches: 1
Buffers: shared hit=16
Planning Time: 0.077 ms
Execution Time: 0.086 ms
diff --git a/doc/src/sgml/ref/explain.sgml b/doc/src/sgml/ref/explain.sgml
index 7daddf03e..9ed1061b7 100644
--- a/doc/src/sgml/ref/explain.sgml
+++ b/doc/src/sgml/ref/explain.sgml
@@ -506,10 +506,11 @@ EXPLAIN ANALYZE EXECUTE query(100, 200);
Buffers: shared hit=4
-> Index Scan using test_pkey on test (cost=0.29..10.27 rows=99 width=8) (actual time=0.009..0.025 rows=99.00 loops=1)
Index Cond: ((id > 100) AND (id < 200))
+ Index Searches: 1
Buffers: shared hit=4
Planning Time: 0.244 ms
Execution Time: 0.073 ms
-(9 rows)
+(10 rows)
</programlisting>
</para>
diff --git a/doc/src/sgml/rules.sgml b/doc/src/sgml/rules.sgml
index 1d9924a2a..8467d961f 100644
--- a/doc/src/sgml/rules.sgml
+++ b/doc/src/sgml/rules.sgml
@@ -1046,6 +1046,7 @@ SELECT count(*) FROM words WHERE word = 'caterpiler';
-> Index Only Scan using wrd_word on wrd (cost=0.42..4.44 rows=1 width=0) (actual time=0.039..0.039 rows=0.00 loops=1)
Index Cond: (word = 'caterpiler'::text)
Heap Fetches: 0
+ Index Searches: 1
Planning time: 0.164 ms
Execution time: 0.117 ms
</programlisting>
@@ -1090,6 +1091,7 @@ SELECT word FROM words ORDER BY word <-> 'caterpiler' LIMIT 10;
Limit (cost=0.29..1.06 rows=10 width=10) (actual time=187.222..188.257 rows=10.00 loops=1)
-> Index Scan using wrd_trgm on wrd (cost=0.29..37020.87 rows=479829 width=10) (actual time=187.219..188.252 rows=10.00 loops=1)
Order By: (word <-> 'caterpiler'::text)
+ Index Searches: 1
Planning time: 0.196 ms
Execution time: 198.640 ms
</programlisting>
diff --git a/src/test/regress/expected/brin_multi.out b/src/test/regress/expected/brin_multi.out
index 991b7eaca..cb5b5e53e 100644
--- a/src/test/regress/expected/brin_multi.out
+++ b/src/test/regress/expected/brin_multi.out
@@ -853,7 +853,8 @@ SELECT * FROM brin_date_test WHERE a = '2023-01-01'::date;
Recheck Cond: (a = '2023-01-01'::date)
-> Bitmap Index Scan on brin_date_test_a_idx (actual rows=0.00 loops=1)
Index Cond: (a = '2023-01-01'::date)
-(4 rows)
+ Index Searches: 1
+(5 rows)
DROP TABLE brin_date_test;
RESET enable_seqscan;
@@ -872,7 +873,8 @@ SELECT * FROM brin_timestamp_test WHERE a = '2023-01-01'::timestamp;
Recheck Cond: (a = '2023-01-01 00:00:00'::timestamp without time zone)
-> Bitmap Index Scan on brin_timestamp_test_a_idx (actual rows=0.00 loops=1)
Index Cond: (a = '2023-01-01 00:00:00'::timestamp without time zone)
-(4 rows)
+ Index Searches: 1
+(5 rows)
EXPLAIN (ANALYZE, TIMING OFF, COSTS OFF, SUMMARY OFF, BUFFERS OFF)
SELECT * FROM brin_timestamp_test WHERE a = '1900-01-01'::timestamp;
@@ -882,7 +884,8 @@ SELECT * FROM brin_timestamp_test WHERE a = '1900-01-01'::timestamp;
Recheck Cond: (a = '1900-01-01 00:00:00'::timestamp without time zone)
-> Bitmap Index Scan on brin_timestamp_test_a_idx (actual rows=0.00 loops=1)
Index Cond: (a = '1900-01-01 00:00:00'::timestamp without time zone)
-(4 rows)
+ Index Searches: 1
+(5 rows)
DROP TABLE brin_timestamp_test;
RESET enable_seqscan;
@@ -900,7 +903,8 @@ SELECT * FROM brin_date_test WHERE a = '2023-01-01'::date;
Recheck Cond: (a = '2023-01-01'::date)
-> Bitmap Index Scan on brin_date_test_a_idx (actual rows=0.00 loops=1)
Index Cond: (a = '2023-01-01'::date)
-(4 rows)
+ Index Searches: 1
+(5 rows)
EXPLAIN (ANALYZE, TIMING OFF, COSTS OFF, SUMMARY OFF, BUFFERS OFF)
SELECT * FROM brin_date_test WHERE a = '1900-01-01'::date;
@@ -910,7 +914,8 @@ SELECT * FROM brin_date_test WHERE a = '1900-01-01'::date;
Recheck Cond: (a = '1900-01-01'::date)
-> Bitmap Index Scan on brin_date_test_a_idx (actual rows=0.00 loops=1)
Index Cond: (a = '1900-01-01'::date)
-(4 rows)
+ Index Searches: 1
+(5 rows)
DROP TABLE brin_date_test;
RESET enable_seqscan;
@@ -929,7 +934,8 @@ SELECT * FROM brin_interval_test WHERE a = '-30 years'::interval;
Recheck Cond: (a = '@ 30 years ago'::interval)
-> Bitmap Index Scan on brin_interval_test_a_idx (actual rows=0.00 loops=1)
Index Cond: (a = '@ 30 years ago'::interval)
-(4 rows)
+ Index Searches: 1
+(5 rows)
EXPLAIN (ANALYZE, TIMING OFF, COSTS OFF, SUMMARY OFF, BUFFERS OFF)
SELECT * FROM brin_interval_test WHERE a = '30 years'::interval;
@@ -939,7 +945,8 @@ SELECT * FROM brin_interval_test WHERE a = '30 years'::interval;
Recheck Cond: (a = '@ 30 years'::interval)
-> Bitmap Index Scan on brin_interval_test_a_idx (actual rows=0.00 loops=1)
Index Cond: (a = '@ 30 years'::interval)
-(4 rows)
+ Index Searches: 1
+(5 rows)
DROP TABLE brin_interval_test;
RESET enable_seqscan;
@@ -957,7 +964,8 @@ SELECT * FROM brin_interval_test WHERE a = '-30 years'::interval;
Recheck Cond: (a = '@ 30 years ago'::interval)
-> Bitmap Index Scan on brin_interval_test_a_idx (actual rows=0.00 loops=1)
Index Cond: (a = '@ 30 years ago'::interval)
-(4 rows)
+ Index Searches: 1
+(5 rows)
EXPLAIN (ANALYZE, TIMING OFF, COSTS OFF, SUMMARY OFF, BUFFERS OFF)
SELECT * FROM brin_interval_test WHERE a = '30 years'::interval;
@@ -967,7 +975,8 @@ SELECT * FROM brin_interval_test WHERE a = '30 years'::interval;
Recheck Cond: (a = '@ 30 years'::interval)
-> Bitmap Index Scan on brin_interval_test_a_idx (actual rows=0.00 loops=1)
Index Cond: (a = '@ 30 years'::interval)
-(4 rows)
+ Index Searches: 1
+(5 rows)
DROP TABLE brin_interval_test;
RESET enable_seqscan;
diff --git a/src/test/regress/expected/memoize.out b/src/test/regress/expected/memoize.out
index 22f2d3284..71bc65968 100644
--- a/src/test/regress/expected/memoize.out
+++ b/src/test/regress/expected/memoize.out
@@ -22,8 +22,10 @@ begin
ln := regexp_replace(ln, 'Evictions: 0', 'Evictions: Zero');
ln := regexp_replace(ln, 'Evictions: \d+', 'Evictions: N');
ln := regexp_replace(ln, 'Memory Usage: \d+', 'Memory Usage: N');
- ln := regexp_replace(ln, 'Heap Fetches: \d+', 'Heap Fetches: N');
- ln := regexp_replace(ln, 'loops=\d+', 'loops=N');
+ ln := regexp_replace(ln, 'Heap Fetches: \d+', 'Heap Fetches: N');
+ ln := regexp_replace(ln, 'loops=\d+', 'loops=N');
+ ln := regexp_replace(ln, 'Index Searches: 0', 'Index Searches: Zero');
+ ln := regexp_replace(ln, 'Index Searches: \d+', 'Index Searches: N');
return next ln;
end loop;
end;
@@ -49,7 +51,8 @@ WHERE t2.unique1 < 1000;', false);
-> Index Only Scan using tenk1_unique1 on tenk1 t1 (actual rows=1.00 loops=N)
Index Cond: (unique1 = t2.twenty)
Heap Fetches: N
-(12 rows)
+ Index Searches: N
+(13 rows)
-- And check we get the expected results.
SELECT COUNT(*),AVG(t1.unique1) FROM tenk1 t1
@@ -80,7 +83,8 @@ WHERE t1.unique1 < 1000;', false);
-> Index Only Scan using tenk1_unique1 on tenk1 t2 (actual rows=1.00 loops=N)
Index Cond: (unique1 = t1.twenty)
Heap Fetches: N
-(12 rows)
+ Index Searches: N
+(13 rows)
-- And check we get the expected results.
SELECT COUNT(*),AVG(t2.unique1) FROM tenk1 t1,
@@ -106,6 +110,7 @@ WHERE t1.unique1 < 10;', false);
-> Nested Loop Left Join (actual rows=20.00 loops=N)
-> Index Scan using tenk1_unique1 on tenk1 t1 (actual rows=10.00 loops=N)
Index Cond: (unique1 < 10)
+ Index Searches: N
-> Memoize (actual rows=2.00 loops=N)
Cache Key: t1.two
Cache Mode: binary
@@ -115,7 +120,8 @@ WHERE t1.unique1 < 10;', false);
Rows Removed by Filter: 2
-> Index Scan using tenk1_unique1 on tenk1 t2_1 (actual rows=4.00 loops=N)
Index Cond: (unique1 < 4)
-(13 rows)
+ Index Searches: N
+(15 rows)
-- And check we get the expected results.
SELECT COUNT(*),AVG(t2.t1two) FROM tenk1 t1 LEFT JOIN
@@ -149,7 +155,8 @@ WHERE s.c1 = s.c2 AND t1.unique1 < 1000;', false);
Filter: ((t1.two + 1) = unique1)
Rows Removed by Filter: 9999
Heap Fetches: N
-(13 rows)
+ Index Searches: N
+(14 rows)
-- And check we get the expected results.
SELECT COUNT(*), AVG(t1.twenty) FROM tenk1 t1 LEFT JOIN
@@ -219,7 +226,8 @@ ON t1.x = t2.t::numeric AND t1.t::numeric = t2.x;', false);
Index Cond: (x = (t1.t)::numeric)
Filter: (t1.x = (t)::numeric)
Heap Fetches: N
-(10 rows)
+ Index Searches: N
+(11 rows)
DROP TABLE expr_key;
-- Reduce work_mem and hash_mem_multiplier so that we see some cache evictions
@@ -246,7 +254,8 @@ WHERE t2.unique1 < 1200;', true);
-> Index Only Scan using tenk1_unique1 on tenk1 t1 (actual rows=1.00 loops=N)
Index Cond: (unique1 = t2.thousand)
Heap Fetches: N
-(12 rows)
+ Index Searches: N
+(13 rows)
CREATE TABLE flt (f float);
CREATE INDEX flt_f_idx ON flt (f);
@@ -261,6 +270,7 @@ SELECT * FROM flt f1 INNER JOIN flt f2 ON f1.f = f2.f;', false);
Nested Loop (actual rows=4.00 loops=N)
-> Index Only Scan using flt_f_idx on flt f1 (actual rows=2.00 loops=N)
Heap Fetches: N
+ Index Searches: N
-> Memoize (actual rows=2.00 loops=N)
Cache Key: f1.f
Cache Mode: logical
@@ -268,7 +278,8 @@ SELECT * FROM flt f1 INNER JOIN flt f2 ON f1.f = f2.f;', false);
-> Index Only Scan using flt_f_idx on flt f2 (actual rows=2.00 loops=N)
Index Cond: (f = f1.f)
Heap Fetches: N
-(10 rows)
+ Index Searches: N
+(12 rows)
-- Ensure memoize operates in binary mode
SELECT explain_memoize('
@@ -278,6 +289,7 @@ SELECT * FROM flt f1 INNER JOIN flt f2 ON f1.f >= f2.f;', false);
Nested Loop (actual rows=4.00 loops=N)
-> Index Only Scan using flt_f_idx on flt f1 (actual rows=2.00 loops=N)
Heap Fetches: N
+ Index Searches: N
-> Memoize (actual rows=2.00 loops=N)
Cache Key: f1.f
Cache Mode: binary
@@ -285,7 +297,8 @@ SELECT * FROM flt f1 INNER JOIN flt f2 ON f1.f >= f2.f;', false);
-> Index Only Scan using flt_f_idx on flt f2 (actual rows=2.00 loops=N)
Index Cond: (f <= f1.f)
Heap Fetches: N
-(10 rows)
+ Index Searches: N
+(12 rows)
DROP TABLE flt;
-- Exercise Memoize in binary mode with a large fixed width type and a
@@ -311,7 +324,8 @@ SELECT * FROM strtest s1 INNER JOIN strtest s2 ON s1.n >= s2.n;', false);
Hits: 3 Misses: 3 Evictions: Zero Overflows: 0 Memory Usage: NkB
-> Index Scan using strtest_n_idx on strtest s2 (actual rows=4.00 loops=N)
Index Cond: (n <= s1.n)
-(9 rows)
+ Index Searches: N
+(10 rows)
-- Ensure we get 3 hits and 3 misses
SELECT explain_memoize('
@@ -327,7 +341,8 @@ SELECT * FROM strtest s1 INNER JOIN strtest s2 ON s1.t >= s2.t;', false);
Hits: 3 Misses: 3 Evictions: Zero Overflows: 0 Memory Usage: NkB
-> Index Scan using strtest_t_idx on strtest s2 (actual rows=4.00 loops=N)
Index Cond: (t <= s1.t)
-(9 rows)
+ Index Searches: N
+(10 rows)
DROP TABLE strtest;
-- Ensure memoize works with partitionwise join
@@ -348,6 +363,7 @@ SELECT * FROM prt t1 INNER JOIN prt t2 ON t1.a = t2.a;', false);
-> Nested Loop (actual rows=16.00 loops=N)
-> Index Only Scan using iprt_p1_a on prt_p1 t1_1 (actual rows=4.00 loops=N)
Heap Fetches: N
+ Index Searches: N
-> Memoize (actual rows=4.00 loops=N)
Cache Key: t1_1.a
Cache Mode: logical
@@ -355,9 +371,11 @@ SELECT * FROM prt t1 INNER JOIN prt t2 ON t1.a = t2.a;', false);
-> Index Only Scan using iprt_p1_a on prt_p1 t2_1 (actual rows=4.00 loops=N)
Index Cond: (a = t1_1.a)
Heap Fetches: N
+ Index Searches: N
-> Nested Loop (actual rows=16.00 loops=N)
-> Index Only Scan using iprt_p2_a on prt_p2 t1_2 (actual rows=4.00 loops=N)
Heap Fetches: N
+ Index Searches: N
-> Memoize (actual rows=4.00 loops=N)
Cache Key: t1_2.a
Cache Mode: logical
@@ -365,7 +383,8 @@ SELECT * FROM prt t1 INNER JOIN prt t2 ON t1.a = t2.a;', false);
-> Index Only Scan using iprt_p2_a on prt_p2 t2_2 (actual rows=4.00 loops=N)
Index Cond: (a = t1_2.a)
Heap Fetches: N
-(21 rows)
+ Index Searches: N
+(25 rows)
-- Ensure memoize works with parameterized union-all Append path
SET enable_partitionwise_join TO off;
@@ -378,6 +397,7 @@ ON t1.a = t2.a;', false);
Nested Loop (actual rows=16.00 loops=N)
-> Index Only Scan using iprt_p1_a on prt_p1 t1 (actual rows=4.00 loops=N)
Heap Fetches: N
+ Index Searches: N
-> Memoize (actual rows=4.00 loops=N)
Cache Key: t1.a
Cache Mode: logical
@@ -386,10 +406,12 @@ ON t1.a = t2.a;', false);
-> Index Only Scan using iprt_p1_a on prt_p1 (actual rows=4.00 loops=N)
Index Cond: (a = t1.a)
Heap Fetches: N
+ Index Searches: N
-> Index Only Scan using iprt_p2_a on prt_p2 (actual rows=0.00 loops=N)
Index Cond: (a = t1.a)
Heap Fetches: N
-(14 rows)
+ Index Searches: N
+(17 rows)
DROP TABLE prt;
RESET enable_partitionwise_join;
diff --git a/src/test/regress/expected/partition_prune.out b/src/test/regress/expected/partition_prune.out
index d95d2395d..34f2b0b8d 100644
--- a/src/test/regress/expected/partition_prune.out
+++ b/src/test/regress/expected/partition_prune.out
@@ -2369,6 +2369,10 @@ begin
ln := regexp_replace(ln, 'Workers Launched: \d+', 'Workers Launched: N');
ln := regexp_replace(ln, 'actual rows=\d+(?:\.\d+)? loops=\d+', 'actual rows=N loops=N');
ln := regexp_replace(ln, 'Rows Removed by Filter: \d+', 'Rows Removed by Filter: N');
+ perform regexp_matches(ln, 'Index Searches: \d+');
+ if found then
+ continue;
+ end if;
return next ln;
end loop;
end;
@@ -2686,47 +2690,56 @@ select * from ab where a = (select max(a) from lprt_a) and b = (select max(a)-1
Filter: (b = (InitPlan 2).col1)
-> Bitmap Index Scan on ab_a1_b1_a_idx (never executed)
Index Cond: (a = (InitPlan 1).col1)
+ Index Searches: 0
-> Bitmap Heap Scan on ab_a1_b2 ab_2 (never executed)
Recheck Cond: (a = (InitPlan 1).col1)
Filter: (b = (InitPlan 2).col1)
-> Bitmap Index Scan on ab_a1_b2_a_idx (never executed)
Index Cond: (a = (InitPlan 1).col1)
+ Index Searches: 0
-> Bitmap Heap Scan on ab_a1_b3 ab_3 (never executed)
Recheck Cond: (a = (InitPlan 1).col1)
Filter: (b = (InitPlan 2).col1)
-> Bitmap Index Scan on ab_a1_b3_a_idx (never executed)
Index Cond: (a = (InitPlan 1).col1)
+ Index Searches: 0
-> Bitmap Heap Scan on ab_a2_b1 ab_4 (never executed)
Recheck Cond: (a = (InitPlan 1).col1)
Filter: (b = (InitPlan 2).col1)
-> Bitmap Index Scan on ab_a2_b1_a_idx (never executed)
Index Cond: (a = (InitPlan 1).col1)
+ Index Searches: 0
-> Bitmap Heap Scan on ab_a2_b2 ab_5 (never executed)
Recheck Cond: (a = (InitPlan 1).col1)
Filter: (b = (InitPlan 2).col1)
-> Bitmap Index Scan on ab_a2_b2_a_idx (never executed)
Index Cond: (a = (InitPlan 1).col1)
+ Index Searches: 0
-> Bitmap Heap Scan on ab_a2_b3 ab_6 (never executed)
Recheck Cond: (a = (InitPlan 1).col1)
Filter: (b = (InitPlan 2).col1)
-> Bitmap Index Scan on ab_a2_b3_a_idx (never executed)
Index Cond: (a = (InitPlan 1).col1)
+ Index Searches: 0
-> Bitmap Heap Scan on ab_a3_b1 ab_7 (never executed)
Recheck Cond: (a = (InitPlan 1).col1)
Filter: (b = (InitPlan 2).col1)
-> Bitmap Index Scan on ab_a3_b1_a_idx (never executed)
Index Cond: (a = (InitPlan 1).col1)
+ Index Searches: 0
-> Bitmap Heap Scan on ab_a3_b2 ab_8 (actual rows=0.00 loops=1)
Recheck Cond: (a = (InitPlan 1).col1)
Filter: (b = (InitPlan 2).col1)
-> Bitmap Index Scan on ab_a3_b2_a_idx (actual rows=0.00 loops=1)
Index Cond: (a = (InitPlan 1).col1)
+ Index Searches: 1
-> Bitmap Heap Scan on ab_a3_b3 ab_9 (never executed)
Recheck Cond: (a = (InitPlan 1).col1)
Filter: (b = (InitPlan 2).col1)
-> Bitmap Index Scan on ab_a3_b3_a_idx (never executed)
Index Cond: (a = (InitPlan 1).col1)
-(52 rows)
+ Index Searches: 0
+(61 rows)
-- Test run-time partition pruning with UNION ALL parents
explain (analyze, costs off, summary off, timing off, buffers off)
@@ -2742,16 +2755,19 @@ select * from (select * from ab where a = 1 union all select * from ab) ab where
Filter: (b = (InitPlan 1).col1)
-> Bitmap Index Scan on ab_a1_b1_a_idx (actual rows=0.00 loops=1)
Index Cond: (a = 1)
+ Index Searches: 1
-> Bitmap Heap Scan on ab_a1_b2 ab_12 (never executed)
Recheck Cond: (a = 1)
Filter: (b = (InitPlan 1).col1)
-> Bitmap Index Scan on ab_a1_b2_a_idx (never executed)
Index Cond: (a = 1)
+ Index Searches: 0
-> Bitmap Heap Scan on ab_a1_b3 ab_13 (never executed)
Recheck Cond: (a = 1)
Filter: (b = (InitPlan 1).col1)
-> Bitmap Index Scan on ab_a1_b3_a_idx (never executed)
Index Cond: (a = 1)
+ Index Searches: 0
-> Seq Scan on ab_a1_b1 ab_1 (actual rows=0.00 loops=1)
Filter: (b = (InitPlan 1).col1)
-> Seq Scan on ab_a1_b2 ab_2 (never executed)
@@ -2770,7 +2786,7 @@ select * from (select * from ab where a = 1 union all select * from ab) ab where
Filter: (b = (InitPlan 1).col1)
-> Seq Scan on ab_a3_b3 ab_9 (never executed)
Filter: (b = (InitPlan 1).col1)
-(37 rows)
+(40 rows)
-- A case containing a UNION ALL with a non-partitioned child.
explain (analyze, costs off, summary off, timing off, buffers off)
@@ -2786,16 +2802,19 @@ select * from (select * from ab where a = 1 union all (values(10,5)) union all s
Filter: (b = (InitPlan 1).col1)
-> Bitmap Index Scan on ab_a1_b1_a_idx (actual rows=0.00 loops=1)
Index Cond: (a = 1)
+ Index Searches: 1
-> Bitmap Heap Scan on ab_a1_b2 ab_12 (never executed)
Recheck Cond: (a = 1)
Filter: (b = (InitPlan 1).col1)
-> Bitmap Index Scan on ab_a1_b2_a_idx (never executed)
Index Cond: (a = 1)
+ Index Searches: 0
-> Bitmap Heap Scan on ab_a1_b3 ab_13 (never executed)
Recheck Cond: (a = 1)
Filter: (b = (InitPlan 1).col1)
-> Bitmap Index Scan on ab_a1_b3_a_idx (never executed)
Index Cond: (a = 1)
+ Index Searches: 0
-> Result (actual rows=0.00 loops=1)
One-Time Filter: (5 = (InitPlan 1).col1)
-> Seq Scan on ab_a1_b1 ab_1 (actual rows=0.00 loops=1)
@@ -2816,7 +2835,7 @@ select * from (select * from ab where a = 1 union all (values(10,5)) union all s
Filter: (b = (InitPlan 1).col1)
-> Seq Scan on ab_a3_b3 ab_9 (never executed)
Filter: (b = (InitPlan 1).col1)
-(39 rows)
+(42 rows)
-- Another UNION ALL test, but containing a mix of exec init and exec run-time pruning.
create table xy_1 (x int, y int);
@@ -2887,16 +2906,19 @@ update ab_a1 set b = 3 from ab where ab.a = 1 and ab.a = ab_a1.a;');
Recheck Cond: (a = 1)
-> Bitmap Index Scan on ab_a1_b1_a_idx (actual rows=0.00 loops=1)
Index Cond: (a = 1)
+ Index Searches: 1
-> Bitmap Heap Scan on ab_a1_b2 ab_a1_2 (actual rows=1.00 loops=1)
Recheck Cond: (a = 1)
Heap Blocks: exact=1
-> Bitmap Index Scan on ab_a1_b2_a_idx (actual rows=1.00 loops=1)
Index Cond: (a = 1)
+ Index Searches: 1
-> Bitmap Heap Scan on ab_a1_b3 ab_a1_3 (actual rows=0.00 loops=1)
Recheck Cond: (a = 1)
Heap Blocks: exact=1
-> Bitmap Index Scan on ab_a1_b3_a_idx (actual rows=1.00 loops=1)
Index Cond: (a = 1)
+ Index Searches: 1
-> Materialize (actual rows=1.00 loops=1)
Storage: Memory Maximum Storage: NkB
-> Append (actual rows=1.00 loops=1)
@@ -2904,17 +2926,20 @@ update ab_a1 set b = 3 from ab where ab.a = 1 and ab.a = ab_a1.a;');
Recheck Cond: (a = 1)
-> Bitmap Index Scan on ab_a1_b1_a_idx (actual rows=0.00 loops=1)
Index Cond: (a = 1)
+ Index Searches: 1
-> Bitmap Heap Scan on ab_a1_b2 ab_2 (actual rows=1.00 loops=1)
Recheck Cond: (a = 1)
Heap Blocks: exact=1
-> Bitmap Index Scan on ab_a1_b2_a_idx (actual rows=1.00 loops=1)
Index Cond: (a = 1)
+ Index Searches: 1
-> Bitmap Heap Scan on ab_a1_b3 ab_3 (actual rows=0.00 loops=1)
Recheck Cond: (a = 1)
Heap Blocks: exact=1
-> Bitmap Index Scan on ab_a1_b3_a_idx (actual rows=1.00 loops=1)
Index Cond: (a = 1)
-(37 rows)
+ Index Searches: 1
+(43 rows)
table ab;
a | b
@@ -2990,17 +3015,23 @@ select * from tbl1 join tprt on tbl1.col1 > tprt.col1;
-> Append (actual rows=3.00 loops=2)
-> Index Scan using tprt1_idx on tprt_1 (actual rows=2.00 loops=2)
Index Cond: (col1 < tbl1.col1)
+ Index Searches: 2
-> Index Scan using tprt2_idx on tprt_2 (actual rows=2.00 loops=1)
Index Cond: (col1 < tbl1.col1)
+ Index Searches: 1
-> Index Scan using tprt3_idx on tprt_3 (never executed)
Index Cond: (col1 < tbl1.col1)
+ Index Searches: 0
-> Index Scan using tprt4_idx on tprt_4 (never executed)
Index Cond: (col1 < tbl1.col1)
+ Index Searches: 0
-> Index Scan using tprt5_idx on tprt_5 (never executed)
Index Cond: (col1 < tbl1.col1)
+ Index Searches: 0
-> Index Scan using tprt6_idx on tprt_6 (never executed)
Index Cond: (col1 < tbl1.col1)
-(15 rows)
+ Index Searches: 0
+(21 rows)
explain (analyze, costs off, summary off, timing off, buffers off)
select * from tbl1 join tprt on tbl1.col1 = tprt.col1;
@@ -3011,17 +3042,23 @@ select * from tbl1 join tprt on tbl1.col1 = tprt.col1;
-> Append (actual rows=1.00 loops=2)
-> Index Scan using tprt1_idx on tprt_1 (never executed)
Index Cond: (col1 = tbl1.col1)
+ Index Searches: 0
-> Index Scan using tprt2_idx on tprt_2 (actual rows=1.00 loops=2)
Index Cond: (col1 = tbl1.col1)
+ Index Searches: 2
-> Index Scan using tprt3_idx on tprt_3 (never executed)
Index Cond: (col1 = tbl1.col1)
+ Index Searches: 0
-> Index Scan using tprt4_idx on tprt_4 (never executed)
Index Cond: (col1 = tbl1.col1)
+ Index Searches: 0
-> Index Scan using tprt5_idx on tprt_5 (never executed)
Index Cond: (col1 = tbl1.col1)
+ Index Searches: 0
-> Index Scan using tprt6_idx on tprt_6 (never executed)
Index Cond: (col1 = tbl1.col1)
-(15 rows)
+ Index Searches: 0
+(21 rows)
select tbl1.col1, tprt.col1 from tbl1
inner join tprt on tbl1.col1 > tprt.col1
@@ -3056,17 +3093,23 @@ select * from tbl1 inner join tprt on tbl1.col1 > tprt.col1;
-> Append (actual rows=4.60 loops=5)
-> Index Scan using tprt1_idx on tprt_1 (actual rows=2.00 loops=5)
Index Cond: (col1 < tbl1.col1)
+ Index Searches: 5
-> Index Scan using tprt2_idx on tprt_2 (actual rows=2.75 loops=4)
Index Cond: (col1 < tbl1.col1)
+ Index Searches: 4
-> Index Scan using tprt3_idx on tprt_3 (actual rows=1.00 loops=2)
Index Cond: (col1 < tbl1.col1)
+ Index Searches: 2
-> Index Scan using tprt4_idx on tprt_4 (never executed)
Index Cond: (col1 < tbl1.col1)
+ Index Searches: 0
-> Index Scan using tprt5_idx on tprt_5 (never executed)
Index Cond: (col1 < tbl1.col1)
+ Index Searches: 0
-> Index Scan using tprt6_idx on tprt_6 (never executed)
Index Cond: (col1 < tbl1.col1)
-(15 rows)
+ Index Searches: 0
+(21 rows)
explain (analyze, costs off, summary off, timing off, buffers off)
select * from tbl1 inner join tprt on tbl1.col1 = tprt.col1;
@@ -3077,17 +3120,23 @@ select * from tbl1 inner join tprt on tbl1.col1 = tprt.col1;
-> Append (actual rows=0.60 loops=5)
-> Index Scan using tprt1_idx on tprt_1 (never executed)
Index Cond: (col1 = tbl1.col1)
+ Index Searches: 0
-> Index Scan using tprt2_idx on tprt_2 (actual rows=1.00 loops=2)
Index Cond: (col1 = tbl1.col1)
+ Index Searches: 2
-> Index Scan using tprt3_idx on tprt_3 (actual rows=0.33 loops=3)
Index Cond: (col1 = tbl1.col1)
+ Index Searches: 3
-> Index Scan using tprt4_idx on tprt_4 (never executed)
Index Cond: (col1 = tbl1.col1)
+ Index Searches: 0
-> Index Scan using tprt5_idx on tprt_5 (never executed)
Index Cond: (col1 = tbl1.col1)
+ Index Searches: 0
-> Index Scan using tprt6_idx on tprt_6 (never executed)
Index Cond: (col1 = tbl1.col1)
-(15 rows)
+ Index Searches: 0
+(21 rows)
select tbl1.col1, tprt.col1 from tbl1
inner join tprt on tbl1.col1 > tprt.col1
@@ -3141,17 +3190,23 @@ select * from tbl1 join tprt on tbl1.col1 < tprt.col1;
-> Append (actual rows=1.00 loops=1)
-> Index Scan using tprt1_idx on tprt_1 (never executed)
Index Cond: (col1 > tbl1.col1)
+ Index Searches: 0
-> Index Scan using tprt2_idx on tprt_2 (never executed)
Index Cond: (col1 > tbl1.col1)
+ Index Searches: 0
-> Index Scan using tprt3_idx on tprt_3 (never executed)
Index Cond: (col1 > tbl1.col1)
+ Index Searches: 0
-> Index Scan using tprt4_idx on tprt_4 (never executed)
Index Cond: (col1 > tbl1.col1)
+ Index Searches: 0
-> Index Scan using tprt5_idx on tprt_5 (never executed)
Index Cond: (col1 > tbl1.col1)
+ Index Searches: 0
-> Index Scan using tprt6_idx on tprt_6 (actual rows=1.00 loops=1)
Index Cond: (col1 > tbl1.col1)
-(15 rows)
+ Index Searches: 1
+(21 rows)
select tbl1.col1, tprt.col1 from tbl1
inner join tprt on tbl1.col1 < tprt.col1
@@ -3173,17 +3228,23 @@ select * from tbl1 join tprt on tbl1.col1 = tprt.col1;
-> Append (actual rows=0.00 loops=1)
-> Index Scan using tprt1_idx on tprt_1 (never executed)
Index Cond: (col1 = tbl1.col1)
+ Index Searches: 0
-> Index Scan using tprt2_idx on tprt_2 (never executed)
Index Cond: (col1 = tbl1.col1)
+ Index Searches: 0
-> Index Scan using tprt3_idx on tprt_3 (never executed)
Index Cond: (col1 = tbl1.col1)
+ Index Searches: 0
-> Index Scan using tprt4_idx on tprt_4 (never executed)
Index Cond: (col1 = tbl1.col1)
+ Index Searches: 0
-> Index Scan using tprt5_idx on tprt_5 (never executed)
Index Cond: (col1 = tbl1.col1)
+ Index Searches: 0
-> Index Scan using tprt6_idx on tprt_6 (never executed)
Index Cond: (col1 = tbl1.col1)
-(15 rows)
+ Index Searches: 0
+(21 rows)
select tbl1.col1, tprt.col1 from tbl1
inner join tprt on tbl1.col1 = tprt.col1
@@ -3513,10 +3574,12 @@ explain (analyze, costs off, summary off, timing off, buffers off) execute mt_q1
-> Index Scan using ma_test_p2_b_idx on ma_test_p2 ma_test_1 (actual rows=1.00 loops=1)
Filter: ((a >= $1) AND ((a % 10) = 5))
Rows Removed by Filter: 9
+ Index Searches: 1
-> Index Scan using ma_test_p3_b_idx on ma_test_p3 ma_test_2 (actual rows=1.00 loops=1)
Filter: ((a >= $1) AND ((a % 10) = 5))
Rows Removed by Filter: 9
-(9 rows)
+ Index Searches: 1
+(11 rows)
execute mt_q1(15);
a
@@ -3534,7 +3597,8 @@ explain (analyze, costs off, summary off, timing off, buffers off) execute mt_q1
-> Index Scan using ma_test_p3_b_idx on ma_test_p3 ma_test_1 (actual rows=1.00 loops=1)
Filter: ((a >= $1) AND ((a % 10) = 5))
Rows Removed by Filter: 9
-(6 rows)
+ Index Searches: 1
+(7 rows)
execute mt_q1(25);
a
@@ -3582,13 +3646,17 @@ explain (analyze, costs off, summary off, timing off, buffers off) select * from
-> Limit (actual rows=1.00 loops=1)
-> Index Scan using ma_test_p2_b_idx on ma_test_p2 (actual rows=1.00 loops=1)
Index Cond: (b IS NOT NULL)
+ Index Searches: 1
-> Index Scan using ma_test_p1_b_idx on ma_test_p1 ma_test_1 (never executed)
Filter: (a >= (InitPlan 2).col1)
+ Index Searches: 0
-> Index Scan using ma_test_p2_b_idx on ma_test_p2 ma_test_2 (actual rows=10.00 loops=1)
Filter: (a >= (InitPlan 2).col1)
+ Index Searches: 1
-> Index Scan using ma_test_p3_b_idx on ma_test_p3 ma_test_3 (actual rows=10.00 loops=1)
Filter: (a >= (InitPlan 2).col1)
-(14 rows)
+ Index Searches: 1
+(18 rows)
reset enable_seqscan;
reset enable_sort;
@@ -4159,13 +4227,17 @@ select * from rangep where b IN((select 1),(select 2)) order by a;
Sort Key: rangep_2.a
-> Index Scan using rangep_0_to_100_1_a_idx on rangep_0_to_100_1 rangep_2 (actual rows=0.00 loops=1)
Filter: (b = ANY (ARRAY[(InitPlan 1).col1, (InitPlan 2).col1]))
+ Index Searches: 1
-> Index Scan using rangep_0_to_100_2_a_idx on rangep_0_to_100_2 rangep_3 (actual rows=0.00 loops=1)
Filter: (b = ANY (ARRAY[(InitPlan 1).col1, (InitPlan 2).col1]))
+ Index Searches: 1
-> Index Scan using rangep_0_to_100_3_a_idx on rangep_0_to_100_3 rangep_4 (never executed)
Filter: (b = ANY (ARRAY[(InitPlan 1).col1, (InitPlan 2).col1]))
+ Index Searches: 0
-> Index Scan using rangep_100_to_200_a_idx on rangep_100_to_200 rangep_5 (actual rows=0.00 loops=1)
Filter: (b = ANY (ARRAY[(InitPlan 1).col1, (InitPlan 2).col1]))
-(15 rows)
+ Index Searches: 1
+(19 rows)
reset enable_sort;
drop table rangep;
diff --git a/src/test/regress/expected/select.out b/src/test/regress/expected/select.out
index cd79abc35..bab0cc93f 100644
--- a/src/test/regress/expected/select.out
+++ b/src/test/regress/expected/select.out
@@ -764,7 +764,8 @@ select * from onek2 where unique2 = 11 and stringu1 = 'ATAAAA';
Index Scan using onek2_u2_prtl on onek2 (actual rows=1.00 loops=1)
Index Cond: (unique2 = 11)
Filter: (stringu1 = 'ATAAAA'::name)
-(3 rows)
+ Index Searches: 1
+(4 rows)
explain (costs off)
select unique2 from onek2 where unique2 = 11 and stringu1 = 'ATAAAA';
diff --git a/src/test/regress/sql/memoize.sql b/src/test/regress/sql/memoize.sql
index d5aab4e56..2197b0ac5 100644
--- a/src/test/regress/sql/memoize.sql
+++ b/src/test/regress/sql/memoize.sql
@@ -23,8 +23,10 @@ begin
ln := regexp_replace(ln, 'Evictions: 0', 'Evictions: Zero');
ln := regexp_replace(ln, 'Evictions: \d+', 'Evictions: N');
ln := regexp_replace(ln, 'Memory Usage: \d+', 'Memory Usage: N');
- ln := regexp_replace(ln, 'Heap Fetches: \d+', 'Heap Fetches: N');
- ln := regexp_replace(ln, 'loops=\d+', 'loops=N');
+ ln := regexp_replace(ln, 'Heap Fetches: \d+', 'Heap Fetches: N');
+ ln := regexp_replace(ln, 'loops=\d+', 'loops=N');
+ ln := regexp_replace(ln, 'Index Searches: 0', 'Index Searches: Zero');
+ ln := regexp_replace(ln, 'Index Searches: \d+', 'Index Searches: N');
return next ln;
end loop;
end;
diff --git a/src/test/regress/sql/partition_prune.sql b/src/test/regress/sql/partition_prune.sql
index 5f36d589b..4a2c74b08 100644
--- a/src/test/regress/sql/partition_prune.sql
+++ b/src/test/regress/sql/partition_prune.sql
@@ -588,6 +588,10 @@ begin
ln := regexp_replace(ln, 'Workers Launched: \d+', 'Workers Launched: N');
ln := regexp_replace(ln, 'actual rows=\d+(?:\.\d+)? loops=\d+', 'actual rows=N loops=N');
ln := regexp_replace(ln, 'Rows Removed by Filter: \d+', 'Rows Removed by Filter: N');
+ perform regexp_matches(ln, 'Index Searches: \d+');
+ if found then
+ continue;
+ end if;
return next ln;
end loop;
end;
--
2.47.2
On Thu, Feb 27, 2025 at 7:58 PM Peter Geoghegan <pg@bowt.ie> wrote:
It's easy to produce an example that makes intuitive sense. For
example, with skip scan that has a qual such as "WHERE a BETWEEN 1 and
5 AND b = 12345", it is likely that EXPLAIN ANALYZE will show "Index
Searches: 5" -- one search per "a" value. Such an example might be
more useful than my original pgbench_accounts example.
Do you think that that would help?
Yes.
It is unique right now, but perhaps only because this is the first
piece of instrumentation that:
Yeah, possible.
Perhaps a comment noting why the new counter lives in IndexScanDesc would help?
+1.
I do get that. I hope that you don't think that I've failed to take
your feedback on board.
To the contrary, I appreciate you taking the time to listen to my opinion.
--
Robert Haas
EDB: http://www.enterprisedb.com
Committed just now. Thanks again.
On Mon, Mar 3, 2025 at 4:01 PM Robert Haas <robertmhaas@gmail.com> wrote:
On Thu, Feb 27, 2025 at 7:58 PM Peter Geoghegan <pg@bowt.ie> wrote:
It's easy to produce an example that makes intuitive sense. For
example, with skip scan that has a qual such as "WHERE a BETWEEN 1 and
5 AND b = 12345", it is likely that EXPLAIN ANALYZE will show "Index
Searches: 5" -- one search per "a" value. Such an example might be
more useful than my original pgbench_accounts example.
Do you think that that would help?
Yes.
As you might have seen already, I added an example involving SAOPs to
"14.1.2. EXPLAIN ANALYZE". I have a TODO item about adding an
additional example involving skip scan immediately afterwards, as part
of the skip scan patch.
Perhaps a comment noting why the new counter lives in IndexScanDesc would help?
+1.
Added an IndexScanDesc comment about this to the committed version.
--
Peter Geoghegan
On Wed, Mar 5, 2025 at 9:37 AM Peter Geoghegan <pg@bowt.ie> wrote:
Committed just now. Thanks again.
I had to revert this for now, due to issues with debug_parallel_query.
Apologies for the inconvenience.
The immediate problem is that when the parallel leader doesn't
participate, there is no valid IndexScanDescData in planstate to work
off of. There isn't an obvious way to get to shared memory from the
leader process, since that all goes through the
IndexScanDescData.parallel_scan -- there is nothing that points to
shared memory in any of the relevant planstate structs (namely
IndexScanState, IndexOnlyScanState, and BitmapIndexScanState). I was
hoping that you'd be able to provide some guidance on how best to fix
this.
I think that the problem here is similar to the problem with hash
joins and their HashInstrumentation struct -- at least in the
parallel-oblivious case. Here are the points of similarity:
* The information in question is for the node execution as a whole --
it is orthogonal to what might have happened in each individual
worker, and displays the same basic operation-level stats. It is
independent of whether or not the scan happened to use parallel
workers or not.
* For the most part when running a parallel hash join it doesn't
matter what worker EXPLAIN gets its stats from -- they should all
agree on the details (in the parallel-oblivious case, though the
parallel-aware case is still fairly similar). Comments in
show_hash_info explain this.
* However, there are important exceptions: cases where the parallel
leader didn't participate at all, or showed up late, never building
its own hash table. We have to be prepared to get the information from
all workers, iff the leader doesn't have it.
I failed to account for this last point. I wonder if I can fix this
using an approach like the one from bugfix commit 5bcf389ecf. Note
that show_hash_info has since changed; at the time of the commit we
only had parallel oblivious hash joins, so it made sense to loop
through SharedHashInfo for workers and go with the details taken from
the first worker that successfully built a hash table (the hash tables
must be identical anyway).
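For a counter, the analogous fix is to sum across all of the workers
rather than take the first valid entry. Roughly like this (a sketch
only -- the struct and function names are placeholders, and it assumes
EXPLAIN can get at a per-worker array in shared memory, which is
exactly the sticking point discussed below):

typedef struct IndexScanInstrumentation
{
    uint64      nsearches;
} IndexScanInstrumentation;

typedef struct SharedIndexScanInstrumentation
{
    int         num_workers;
    IndexScanInstrumentation winstrument[FLEXIBLE_ARRAY_MEMBER];
} SharedIndexScanInstrumentation;

/* leader-side merge, in the same spirit as show_hash_info() */
static uint64
total_index_searches(uint64 leader_nsearches,
                     SharedIndexScanInstrumentation *shared)
{
    uint64      nsearches = leader_nsearches;

    /* leader contributes 0 when it never ran the scan itself */
    if (shared != NULL)
    {
        for (int i = 0; i < shared->num_workers; i++)
            nsearches += shared->winstrument[i].nsearches;
    }

    return nsearches;
}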
As I said, a sticking point for this approach is that there is no
existing way to get to someplace in shared memory from the parallel
leader when it never participated. Parallel index scans have their
ParallelIndexScanDesc state stored when they call
index_beginscan_parallel. But that's not happening in a parallel
leader that never participates. Parallel hash join doesn't have that
problem, I think, because the leader will reliably get a pointer to
shared state when ExecHashInitializeDSM() is called. As comments in
its ExecParallelInitializeDSM caller put it, ExecHashInitializeDSM is
called "even when not parallel-aware, for EXPLAIN ANALYZE" -- this
makes it like a few other kinds of nodes, but not like index scan
nodes.
--
Peter Geoghegan
On Thu, Mar 6, 2025 at 1:17 PM Peter Geoghegan <pg@bowt.ie> wrote:
The immediate problem is that when the parallel leader doesn't
participate, there is no valid IndexScanDescData in planstate to work
off of. There isn't an obvious way to get to shared memory from the
leader process, since that all goes through the
IndexScanDescData.parallel_scan -- there is nothing that points to
shared memory in any of the relevant planstate structs (namely
IndexScanState, IndexOnlyScanState, and BitmapIndexScanState). I was
hoping that you'd be able to provide some guidance on how best to fix
this.
Hmm, it seems weird to me that you can't get a hold of that structure.
Why can't you just go find it in the DSM?
--
Robert Haas
EDB: http://www.enterprisedb.com
On Thu, Mar 6, 2025 at 1:54 PM Robert Haas <robertmhaas@gmail.com> wrote:
Hmm, it seems weird to me that you can't get a hold of that structure.
Why can't you just go find it in the DSM?
Sorry, I was unclear.
One reason is that there isn't necessarily anything to find.
Certainly, when I try this out with a debugger, even the B-Tree scan
doesn't have doesn't even have IndexScanDescData.parallel_scan set. It
isn't actually a parallel B-Tree scan. It is a
serial/non-parallel-aware index scan that is run from a parallel
worker, and feeds its output into a gather merge node despite all
this.
--
Peter Geoghegan
On Thu, Mar 6, 2025 at 1:58 PM Peter Geoghegan <pg@bowt.ie> wrote:
On Thu, Mar 6, 2025 at 1:54 PM Robert Haas <robertmhaas@gmail.com> wrote:
Hmm, it seems weird to me that you can't get a hold of that structure.
Why can't you just go find it in the DSM?
Sorry, I was unclear.
One reason is that there isn't necessarily anything to find.
Certainly, when I try this out with a debugger, even the B-Tree scan
doesn't even have IndexScanDescData.parallel_scan set. It
isn't actually a parallel B-Tree scan. It is a
serial/non-parallel-aware index scan that is run from a parallel
worker, and feeds its output into a gather merge node despite all
this.
Well, I think this calls the basic design into question. We discussed
putting this into IndexScanDescData as a convenient way of piping it
through to EXPLAIN, but what I think we have now discovered is that
there isn't actually convenient at all, because every process has its
own IndexScanDescData and the leader sees only its own. It seems like
what you need is to have each process accumulate stats locally, and
then at the end total them up. Maybe show_sort_info() has some useful
precedent, since that's also a bit of node-specific instrumentation,
and it seems to know what to do about workers.
--
Robert Haas
EDB: http://www.enterprisedb.com
On Thu, Mar 6, 2025 at 2:12 PM Robert Haas <robertmhaas@gmail.com> wrote:
Well, I think this calls the basic design into question. We discussed
putting this into IndexScanDescData as a convenient way of piping it
through to EXPLAIN, but what I think we have now discovered is that
it isn't actually convenient at all, because every process has its
own IndexScanDescData and the leader sees only its own.
I agree that it isn't convenient. But there's an inescapable need to
pass *something* down to amgettuple. Everything that we currently pass
to amgettuple goes through the IndexScanDesc arg (its only other arg
is ScanDirection), which isn't a bad reason to put this here too.
So I still think that we need to either store something like nsearches
in IndexScanDescData, or store a pointer to some other struct that
contains the nsearches field (and possibly other fields, in the
future). The only alternative is to change the amgettuple signature
(e.g., pass down planstate), which doesn't seem like an improvement.
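To illustrate the constraint (a sketch only -- the amgettuple typedef
is the one from amapi.h, but the "instrument" field and the helper
function are just shorthand for the kind of thing I have in mind, not
committed code):

/* the AM callback only ever sees the scan descriptor and a direction */
typedef bool (*amgettuple_function) (IndexScanDesc scan,
                                     ScanDirection direction);

/*
 * So an AM's "begin a new index search" path can only bump the counter
 * through the scan descriptor, one way or another:
 */
static inline void
count_index_search(IndexScanDesc scan)
{
    if (scan->instrument)       /* hypothetical field, may be NULL */
        scan->instrument->nsearches++;
}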
It seems like
what you need is to have each process accumulate stats locally, and
then at the end total them up. Maybe show_sort_info() has some useful
precedent, since that's also a bit of node-specific instrumentation,
and it seems to know what to do about workers.
That seems similar to the hash join case I looked at.
I think that the main problem with the reverted patch isn't that it
uses IndexScanDescData -- that detail is almost inevitable. The main
problem is that it failed to teach
nodeIndexscan.c/nodeIndexonlyscan.c/nodeBitmapIndexscan.c to place the
IndexScanDescData.nsearches counter somewhere that explain.c could
reliably get at later. That'd probably be easier if
IndexScanDescData.nsearches was a pointer instead of a raw integer.
Thanks
--
Peter Geoghegan
On Thu, Mar 6, 2025 at 2:12 PM Robert Haas <robertmhaas@gmail.com> wrote:
Maybe show_sort_info() has some useful
precedent, since that's also a bit of node-specific instrumentation,
and it seems to know what to do about workers.
What do you think of the attached WIP patch, which does things this
way? Does this seem like the right general direction to you?
Unfortunately, my new approach seems to require quite a bit more code,
including adding new parallel query functions for bitmap index scans
(which previously didn't require anything like this at all). I can
probably simplify it some more, but likely not by much.
I now put a pointer to an instrumentation struct in IndexScanDescData.
The pointer always points to local memory: specifically, it points to
a dedicated field in each of the 3 supported executor node planstate
structs. Each worker copies its local instrumentation struct
into a dedicated space in shared memory, at the point that
ExecIndexScanRetrieveInstrumentation/ExecIndexOnlyScanRetrieveInstrumentation/ExecBitmapIndexScanRetrieveInstrumentation
is called (though only when running during EXPLAIN ANALYZE). Once we
get to explain.c, we take more or less the same approach already used
for things like sort nodes and hash join nodes.
Obviously, this revised version of the patch passes all tests when the
tests are run with debug_parallel_query=regress.
--
Peter Geoghegan
Attachments:
v1oftake2-0001-Show-index-search-count-in-EXPLAIN-ANALYZE.patch (application/octet-stream)
From 8d4a69fa623b29675495c64c7cc157d08b60afcb Mon Sep 17 00:00:00 2001
From: Peter Geoghegan <pg@bowt.ie>
Date: Wed, 5 Mar 2025 09:36:48 -0500
Subject: [PATCH v1oftake2] Show index search count in EXPLAIN ANALYZE, take 2.
Expose the count of index searches/index descents in EXPLAIN ANALYZE's
output for index scan nodes. This information is particularly useful
with scans that use ScalarArrayOp quals, where the number of index scans
isn't predictable in advance (at least not with optimizations like the
one added to nbtree by Postgres 17 commit 5bf748b8). It will also be
useful when EXPLAIN ANALYZE shows details of an nbtree index scan that
uses skip scan optimizations set to be introduced by an upcoming patch.
The instrumentation in this revised version of the patch works by
teaching index AMs to increment a new nsearches counter whenever a new
index search begins. The counter is incremented at exactly the same
point that index AMs already increment the pg_stat_*_indexes.idx_scan
counter (we're counting the same event, but at the scan level rather
than the relation level). Parallel index scans have parallel workers
copy the counter into shared memory.
This approach doesn't match the approach used when tracking other index
scan specific costs (e.g., "Rows Removed by Filter:"). It is similar to
the approach used in other cases where we must track costs that are only
readily accessible inside an access method, and not from the executor
(e.g., "Heap Blocks:" output for a Bitmap Heap Scan). It is inherently
necessary to maintain a counter that can be incremented multiple times
during a single amgettuple call (or amgetbitmap call), and directly
exposing PlanState.instrument to index access methods seems unappealing.
Author: Peter Geoghegan <pg@bowt.ie>
Reviewed-By: Robert Haas <robertmhaas@gmail.com>
Reviewed-By: Tomas Vondra <tomas@vondra.me>
Reviewed-By: Masahiro Ikeda <ikedamsh@oss.nttdata.com>
Reviewed-By: Matthias van de Meent <boekewurm+postgres@gmail.com>
Discussion: https://postgr.es/m/CAH2-Wz=PKR6rB7qbx+Vnd7eqeB5VTcrW=iJvAsTsKbdG+kW_UA@mail.gmail.com
Discussion: https://postgr.es/m/CAH2-WzkRqvaqR2CTNqTZP0z6FuL4-3ED6eQB0yx38XBNj1v-4Q@mail.gmail.com
---
src/include/access/genam.h | 32 +++-
src/include/access/relscan.h | 11 +-
src/include/executor/nodeBitmapIndexscan.h | 6 +
src/include/executor/nodeIndexonlyscan.h | 1 +
src/include/executor/nodeIndexscan.h | 1 +
src/include/nodes/execnodes.h | 22 +++
src/backend/access/brin/brin.c | 2 +
src/backend/access/gin/ginscan.c | 2 +
src/backend/access/gist/gistget.c | 4 +
src/backend/access/hash/hashsearch.c | 2 +
src/backend/access/heap/heapam_handler.c | 2 +-
src/backend/access/index/genam.c | 5 +-
src/backend/access/index/indexam.c | 38 +++--
src/backend/access/nbtree/nbtree.c | 10 +-
src/backend/access/nbtree/nbtsearch.c | 2 +
src/backend/access/spgist/spgscan.c | 2 +
src/backend/commands/explain.c | 71 +++++++++
src/backend/executor/execIndexing.c | 2 +-
src/backend/executor/execParallel.c | 69 ++++++---
src/backend/executor/execReplication.c | 2 +-
src/backend/executor/nodeBitmapIndexscan.c | 112 ++++++++++++++
src/backend/executor/nodeIndexonlyscan.c | 138 ++++++++++++++----
src/backend/executor/nodeIndexscan.c | 136 +++++++++++++----
src/backend/utils/adt/selfuncs.c | 2 +-
contrib/bloom/blscan.c | 2 +
doc/src/sgml/bloom.sgml | 7 +-
doc/src/sgml/monitoring.sgml | 28 +++-
doc/src/sgml/perform.sgml | 60 ++++++++
doc/src/sgml/ref/explain.sgml | 3 +-
doc/src/sgml/rules.sgml | 2 +
src/test/regress/expected/brin_multi.out | 27 ++--
src/test/regress/expected/memoize.out | 49 +++++--
src/test/regress/expected/partition_prune.out | 100 +++++++++++--
src/test/regress/expected/select.out | 3 +-
src/test/regress/sql/memoize.sql | 5 +-
src/test/regress/sql/partition_prune.sql | 4 +
36 files changed, 807 insertions(+), 157 deletions(-)
diff --git a/src/include/access/genam.h b/src/include/access/genam.h
index 1be873957..087e127d2 100644
--- a/src/include/access/genam.h
+++ b/src/include/access/genam.h
@@ -85,6 +85,26 @@ typedef struct IndexBulkDeleteResult
BlockNumber pages_free; /* # pages available for reuse */
} IndexBulkDeleteResult;
+/*
+ * Data structure for reporting index scan statistics that are maintained by
+ * index scans. Note that IndexScanInstrumentation can't contain any pointers
+ * because we sometimes put it in shared memory.
+ */
+typedef struct IndexScanInstrumentation
+{
+ uint64 nsearches;
+} IndexScanInstrumentation;
+
+/* ----------------
+ * Shared memory container for per-worker index scan information
+ * ----------------
+ */
+typedef struct SharedIndexScanInstrumentation
+{
+ int num_workers;
+ IndexScanInstrumentation winstrument[FLEXIBLE_ARRAY_MEMBER];
+} SharedIndexScanInstrumentation;
+
/* Typedef for callback function to determine if a tuple is bulk-deletable */
typedef bool (*IndexBulkDeleteCallback) (ItemPointer itemptr, void *state);
@@ -157,9 +177,11 @@ extern void index_insert_cleanup(Relation indexRelation,
extern IndexScanDesc index_beginscan(Relation heapRelation,
Relation indexRelation,
Snapshot snapshot,
+ IndexScanInstrumentation *instrument,
int nkeys, int norderbys);
extern IndexScanDesc index_beginscan_bitmap(Relation indexRelation,
Snapshot snapshot,
+ IndexScanInstrumentation *instrument,
int nkeys);
extern void index_rescan(IndexScanDesc scan,
ScanKey keys, int nkeys,
@@ -168,14 +190,18 @@ extern void index_endscan(IndexScanDesc scan);
extern void index_markpos(IndexScanDesc scan);
extern void index_restrpos(IndexScanDesc scan);
extern Size index_parallelscan_estimate(Relation indexRelation,
- int nkeys, int norderbys, Snapshot snapshot);
+ int nkeys, int norderbys, Snapshot snapshot,
+ bool instrument, int nworkers,
+ Size *instroffset);
extern void index_parallelscan_initialize(Relation heapRelation,
Relation indexRelation, Snapshot snapshot,
- ParallelIndexScanDesc target);
+ ParallelIndexScanDesc target,
+ Size ps_offset_ins);
extern void index_parallelrescan(IndexScanDesc scan);
extern IndexScanDesc index_beginscan_parallel(Relation heaprel,
Relation indexrel, int nkeys, int norderbys,
- ParallelIndexScanDesc pscan);
+ ParallelIndexScanDesc pscan,
+ IndexScanInstrumentation *instrument);
extern ItemPointer index_getnext_tid(IndexScanDesc scan,
ScanDirection direction);
struct TupleTableSlot;
diff --git a/src/include/access/relscan.h b/src/include/access/relscan.h
index dc6e01842..2fb97620c 100644
--- a/src/include/access/relscan.h
+++ b/src/include/access/relscan.h
@@ -123,6 +123,8 @@ typedef struct IndexFetchTableData
Relation rel;
} IndexFetchTableData;
+struct IndexScanInstrumentation;
+
/*
* We use the same IndexScanDescData structure for both amgettuple-based
* and amgetbitmap-based index scans. Some fields are only relevant in
@@ -150,6 +152,12 @@ typedef struct IndexScanDescData
/* index access method's private state */
void *opaque; /* access-method-specific info */
+ /*
+ * Instrumentation counters that are maintained by every index access
+ * method, for all scan types
+ */
+ struct IndexScanInstrumentation *instrument;
+
/*
* In an index-only scan, a successful amgettuple call must fill either
* xs_itup (and xs_itupdesc) or xs_hitup (and xs_hitupdesc) to provide the
@@ -188,7 +196,8 @@ typedef struct ParallelIndexScanDescData
{
RelFileLocator ps_locator; /* physical table relation to scan */
RelFileLocator ps_indexlocator; /* physical index relation to scan */
- Size ps_offset; /* Offset in bytes of am specific structure */
+ Size ps_offset_am; /* Offset in bytes of am specific structure */
+ Size ps_offset_ins; /* Offset to SharedIndexScanInstrumentation */
char ps_snapshot_data[FLEXIBLE_ARRAY_MEMBER];
} ParallelIndexScanDescData;
diff --git a/src/include/executor/nodeBitmapIndexscan.h b/src/include/executor/nodeBitmapIndexscan.h
index b51cb184e..b6a5ae25e 100644
--- a/src/include/executor/nodeBitmapIndexscan.h
+++ b/src/include/executor/nodeBitmapIndexscan.h
@@ -14,11 +14,17 @@
#ifndef NODEBITMAPINDEXSCAN_H
#define NODEBITMAPINDEXSCAN_H
+#include "access/parallel.h"
#include "nodes/execnodes.h"
extern BitmapIndexScanState *ExecInitBitmapIndexScan(BitmapIndexScan *node, EState *estate, int eflags);
extern Node *MultiExecBitmapIndexScan(BitmapIndexScanState *node);
extern void ExecEndBitmapIndexScan(BitmapIndexScanState *node);
extern void ExecReScanBitmapIndexScan(BitmapIndexScanState *node);
+extern void ExecBitmapIndexScanEstimate(BitmapIndexScanState *node, ParallelContext *pcxt);
+extern void ExecBitmapIndexScanInitializeDSM(BitmapIndexScanState *node, ParallelContext *pcxt);
+extern void ExecBitmapIndexScanInitializeWorker(BitmapIndexScanState *node,
+ ParallelWorkerContext *pwcxt);
+extern void ExecBitmapIndexScanRetrieveInstrumentation(BitmapIndexScanState *node);
#endif /* NODEBITMAPINDEXSCAN_H */
diff --git a/src/include/executor/nodeIndexonlyscan.h b/src/include/executor/nodeIndexonlyscan.h
index c27d8eb6d..ae85dee6d 100644
--- a/src/include/executor/nodeIndexonlyscan.h
+++ b/src/include/executor/nodeIndexonlyscan.h
@@ -32,5 +32,6 @@ extern void ExecIndexOnlyScanReInitializeDSM(IndexOnlyScanState *node,
ParallelContext *pcxt);
extern void ExecIndexOnlyScanInitializeWorker(IndexOnlyScanState *node,
ParallelWorkerContext *pwcxt);
+extern void ExecIndexOnlyScanRetrieveInstrumentation(IndexOnlyScanState *node);
#endif /* NODEINDEXONLYSCAN_H */
diff --git a/src/include/executor/nodeIndexscan.h b/src/include/executor/nodeIndexscan.h
index 1c63d0615..08f0a148d 100644
--- a/src/include/executor/nodeIndexscan.h
+++ b/src/include/executor/nodeIndexscan.h
@@ -28,6 +28,7 @@ extern void ExecIndexScanInitializeDSM(IndexScanState *node, ParallelContext *pc
extern void ExecIndexScanReInitializeDSM(IndexScanState *node, ParallelContext *pcxt);
extern void ExecIndexScanInitializeWorker(IndexScanState *node,
ParallelWorkerContext *pwcxt);
+extern void ExecIndexScanRetrieveInstrumentation(IndexScanState *node);
/*
* These routines are exported to share code with nodeIndexonlyscan.c and
diff --git a/src/include/nodes/execnodes.h b/src/include/nodes/execnodes.h
index a323fa98b..4fafc7fb2 100644
--- a/src/include/nodes/execnodes.h
+++ b/src/include/nodes/execnodes.h
@@ -1679,7 +1679,11 @@ typedef struct
* RuntimeKeysReady true if runtime Skeys have been computed
* RuntimeContext expr context for evaling runtime Skeys
* RelationDesc index relation descriptor
+ * Instrument index scan instrumentation
* ScanDesc index scan descriptor
+ * ParallelScanDesc parallel index scan descriptor
+ * SharedInfo statistics for parallel workers
+ * InstrOffset offset to SharedInfo in shared memory allocation
*
* ReorderQueue tuples that need reordering due to re-check
* ReachedEnd have we fetched all tuples from index already?
@@ -1705,7 +1709,11 @@ typedef struct IndexScanState
bool iss_RuntimeKeysReady;
ExprContext *iss_RuntimeContext;
Relation iss_RelationDesc;
+ IndexScanInstrumentation iss_Instrument;
struct IndexScanDescData *iss_ScanDesc;
+ ParallelIndexScanDesc iss_ParallelScanDesc;
+ SharedIndexScanInstrumentation *iss_SharedInfo;
+ Size iss_InstrOffset;
/* These are needed for re-checking ORDER BY expr ordering */
pairingheap *iss_ReorderQueue;
@@ -1731,7 +1739,11 @@ typedef struct IndexScanState
* RuntimeKeysReady true if runtime Skeys have been computed
* RuntimeContext expr context for evaling runtime Skeys
* RelationDesc index relation descriptor
+ * Instrument index scan instrumentation
* ScanDesc index scan descriptor
+ * ParallelScanDesc parallel index scan descriptor
+ * SharedInfo statistics for parallel workers
+ * InstrOffset offset to SharedInfo in shared memory allocation
* TableSlot slot for holding tuples fetched from the table
* VMBuffer buffer in use for visibility map testing, if any
* PscanLen size of parallel index-only scan descriptor
@@ -1752,7 +1764,11 @@ typedef struct IndexOnlyScanState
bool ioss_RuntimeKeysReady;
ExprContext *ioss_RuntimeContext;
Relation ioss_RelationDesc;
+ IndexScanInstrumentation ioss_Instrument;
struct IndexScanDescData *ioss_ScanDesc;
+ ParallelIndexScanDesc ioss_ParallelScanDesc;
+ SharedIndexScanInstrumentation *ioss_SharedInfo;
+ Size ioss_InstrOffset;
TupleTableSlot *ioss_TableSlot;
Buffer ioss_VMBuffer;
Size ioss_PscanLen;
@@ -1773,7 +1789,10 @@ typedef struct IndexOnlyScanState
* RuntimeKeysReady true if runtime Skeys have been computed
* RuntimeContext expr context for evaling runtime Skeys
* RelationDesc index relation descriptor
+ * Instrument index scan instrumentation
* ScanDesc index scan descriptor
+ * ParallelScanDesc parallel index scan descriptor
+ * SharedInfo statistics for parallel workers
* ----------------
*/
typedef struct BitmapIndexScanState
@@ -1789,7 +1808,10 @@ typedef struct BitmapIndexScanState
bool biss_RuntimeKeysReady;
ExprContext *biss_RuntimeContext;
Relation biss_RelationDesc;
+ IndexScanInstrumentation biss_Instrument;
struct IndexScanDescData *biss_ScanDesc;
+ ParallelIndexScanDesc biss_ParallelScanDesc;
+ SharedIndexScanInstrumentation *biss_SharedInfo;
} BitmapIndexScanState;
/* ----------------
diff --git a/src/backend/access/brin/brin.c b/src/backend/access/brin/brin.c
index b01009c5d..737ad6388 100644
--- a/src/backend/access/brin/brin.c
+++ b/src/backend/access/brin/brin.c
@@ -592,6 +592,8 @@ bringetbitmap(IndexScanDesc scan, TIDBitmap *tbm)
opaque = (BrinOpaque *) scan->opaque;
bdesc = opaque->bo_bdesc;
pgstat_count_index_scan(idxRel);
+ if (scan->instrument)
+ scan->instrument->nsearches++;
/*
* We need to know the size of the table so that we know how long to
diff --git a/src/backend/access/gin/ginscan.c b/src/backend/access/gin/ginscan.c
index 84aa14594..f6cdd098a 100644
--- a/src/backend/access/gin/ginscan.c
+++ b/src/backend/access/gin/ginscan.c
@@ -442,6 +442,8 @@ ginNewScanKey(IndexScanDesc scan)
MemoryContextSwitchTo(oldCtx);
pgstat_count_index_scan(scan->indexRelation);
+ if (scan->instrument)
+ scan->instrument->nsearches++;
}
void
diff --git a/src/backend/access/gist/gistget.c b/src/backend/access/gist/gistget.c
index cc40e928e..387d99723 100644
--- a/src/backend/access/gist/gistget.c
+++ b/src/backend/access/gist/gistget.c
@@ -625,6 +625,8 @@ gistgettuple(IndexScanDesc scan, ScanDirection dir)
GISTSearchItem fakeItem;
pgstat_count_index_scan(scan->indexRelation);
+ if (scan->instrument)
+ scan->instrument->nsearches++;
so->firstCall = false;
so->curPageData = so->nPageData = 0;
@@ -750,6 +752,8 @@ gistgetbitmap(IndexScanDesc scan, TIDBitmap *tbm)
return 0;
pgstat_count_index_scan(scan->indexRelation);
+ if (scan->instrument)
+ scan->instrument->nsearches++;
/* Begin the scan by processing the root page */
so->curPageData = so->nPageData = 0;
diff --git a/src/backend/access/hash/hashsearch.c b/src/backend/access/hash/hashsearch.c
index a3a1fccf3..92c15a65b 100644
--- a/src/backend/access/hash/hashsearch.c
+++ b/src/backend/access/hash/hashsearch.c
@@ -298,6 +298,8 @@ _hash_first(IndexScanDesc scan, ScanDirection dir)
HashScanPosItem *currItem;
pgstat_count_index_scan(rel);
+ if (scan->instrument)
+ scan->instrument->nsearches++;
/*
* We do not support hash scans with no index qualification, because we
diff --git a/src/backend/access/heap/heapam_handler.c b/src/backend/access/heap/heapam_handler.c
index e78682c3c..d74f0fbc5 100644
--- a/src/backend/access/heap/heapam_handler.c
+++ b/src/backend/access/heap/heapam_handler.c
@@ -749,7 +749,7 @@ heapam_relation_copy_for_cluster(Relation OldHeap, Relation NewHeap,
tableScan = NULL;
heapScan = NULL;
- indexScan = index_beginscan(OldHeap, OldIndex, SnapshotAny, 0, 0);
+ indexScan = index_beginscan(OldHeap, OldIndex, SnapshotAny, NULL, 0, 0);
index_rescan(indexScan, NULL, 0, NULL, 0);
}
else
diff --git a/src/backend/access/index/genam.c b/src/backend/access/index/genam.c
index 07bae342e..886c05655 100644
--- a/src/backend/access/index/genam.c
+++ b/src/backend/access/index/genam.c
@@ -119,6 +119,7 @@ RelationGetIndexScan(Relation indexRelation, int nkeys, int norderbys)
scan->ignore_killed_tuples = !scan->xactStartedInRecovery;
scan->opaque = NULL;
+ scan->instrument = NULL;
scan->xs_itup = NULL;
scan->xs_itupdesc = NULL;
@@ -446,7 +447,7 @@ systable_beginscan(Relation heapRelation,
}
sysscan->iscan = index_beginscan(heapRelation, irel,
- snapshot, nkeys, 0);
+ snapshot, NULL, nkeys, 0);
index_rescan(sysscan->iscan, idxkey, nkeys, NULL, 0);
sysscan->scan = NULL;
@@ -711,7 +712,7 @@ systable_beginscan_ordered(Relation heapRelation,
}
sysscan->iscan = index_beginscan(heapRelation, indexRelation,
- snapshot, nkeys, 0);
+ snapshot, NULL, nkeys, 0);
index_rescan(sysscan->iscan, idxkey, nkeys, NULL, 0);
sysscan->scan = NULL;
diff --git a/src/backend/access/index/indexam.c b/src/backend/access/index/indexam.c
index 8b1f55543..ca5ef2794 100644
--- a/src/backend/access/index/indexam.c
+++ b/src/backend/access/index/indexam.c
@@ -256,6 +256,7 @@ IndexScanDesc
index_beginscan(Relation heapRelation,
Relation indexRelation,
Snapshot snapshot,
+ IndexScanInstrumentation *instrument,
int nkeys, int norderbys)
{
IndexScanDesc scan;
@@ -270,6 +271,7 @@ index_beginscan(Relation heapRelation,
*/
scan->heapRelation = heapRelation;
scan->xs_snapshot = snapshot;
+ scan->instrument = instrument;
/* prepare to fetch index matches from table */
scan->xs_heapfetch = table_index_fetch_begin(heapRelation);
@@ -286,6 +288,7 @@ index_beginscan(Relation heapRelation,
IndexScanDesc
index_beginscan_bitmap(Relation indexRelation,
Snapshot snapshot,
+ IndexScanInstrumentation *instrument,
int nkeys)
{
IndexScanDesc scan;
@@ -299,6 +302,7 @@ index_beginscan_bitmap(Relation indexRelation,
* up by RelationGetIndexScan.
*/
scan->xs_snapshot = snapshot;
+ scan->instrument = instrument;
return scan;
}
@@ -451,9 +455,11 @@ index_restrpos(IndexScanDesc scan)
*/
Size
index_parallelscan_estimate(Relation indexRelation, int nkeys, int norderbys,
- Snapshot snapshot)
+ Snapshot snapshot, bool instrument, int nworkers,
+ Size *instroffset)
{
Size nbytes;
+ Size ninstrbytes;
Assert(snapshot != InvalidSnapshot);
@@ -462,6 +468,7 @@ index_parallelscan_estimate(Relation indexRelation, int nkeys, int norderbys,
nbytes = offsetof(ParallelIndexScanDescData, ps_snapshot_data);
nbytes = add_size(nbytes, EstimateSnapshotSpace(snapshot));
nbytes = MAXALIGN(nbytes);
+ *instroffset = 0;
/*
* If amestimateparallelscan is not provided, assume there is no
@@ -472,6 +479,15 @@ index_parallelscan_estimate(Relation indexRelation, int nkeys, int norderbys,
nbytes = add_size(nbytes,
indexRelation->rd_indam->amestimateparallelscan(nkeys,
norderbys));
+ if (!instrument || nworkers == 0)
+ return nbytes;
+
+ *instroffset = MAXALIGN(nbytes);
+ ninstrbytes = mul_size(nworkers, sizeof(IndexScanInstrumentation));
+ ninstrbytes = add_size(ninstrbytes, offsetof(SharedIndexScanInstrumentation, winstrument));
+ ninstrbytes = MAXALIGN(ninstrbytes);
+
+ nbytes = add_size(nbytes, ninstrbytes);
return nbytes;
}
@@ -488,21 +504,22 @@ index_parallelscan_estimate(Relation indexRelation, int nkeys, int norderbys,
*/
void
index_parallelscan_initialize(Relation heapRelation, Relation indexRelation,
- Snapshot snapshot, ParallelIndexScanDesc target)
+ Snapshot snapshot, ParallelIndexScanDesc target,
+ Size ps_offset_ins)
{
- Size offset;
+ Size ps_offset_am;
Assert(snapshot != InvalidSnapshot);
RELATION_CHECKS;
- offset = add_size(offsetof(ParallelIndexScanDescData, ps_snapshot_data),
- EstimateSnapshotSpace(snapshot));
- offset = MAXALIGN(offset);
+ ps_offset_am = add_size(offsetof(ParallelIndexScanDescData, ps_snapshot_data),
+ EstimateSnapshotSpace(snapshot));
+ ps_offset_am = MAXALIGN(ps_offset_am);
target->ps_locator = heapRelation->rd_locator;
target->ps_indexlocator = indexRelation->rd_locator;
- target->ps_offset = offset;
+ target->ps_offset_am = ps_offset_am;
SerializeSnapshot(snapshot, target->ps_snapshot_data);
/* aminitparallelscan is optional; assume no-op if not provided by AM */
@@ -510,9 +527,10 @@ index_parallelscan_initialize(Relation heapRelation, Relation indexRelation,
{
void *amtarget;
- amtarget = OffsetToPointer(target, offset);
+ amtarget = OffsetToPointer(target, ps_offset_am);
indexRelation->rd_indam->aminitparallelscan(amtarget);
}
+ target->ps_offset_ins = ps_offset_ins;
}
/* ----------------
@@ -539,7 +557,8 @@ index_parallelrescan(IndexScanDesc scan)
*/
IndexScanDesc
index_beginscan_parallel(Relation heaprel, Relation indexrel, int nkeys,
- int norderbys, ParallelIndexScanDesc pscan)
+ int norderbys, ParallelIndexScanDesc pscan,
+ IndexScanInstrumentation *instrument)
{
Snapshot snapshot;
IndexScanDesc scan;
@@ -558,6 +577,7 @@ index_beginscan_parallel(Relation heaprel, Relation indexrel, int nkeys,
*/
scan->heapRelation = heaprel;
scan->xs_snapshot = snapshot;
+ scan->instrument = instrument;
/* prepare to fetch index matches from table */
scan->xs_heapfetch = table_index_fetch_begin(heaprel);
diff --git a/src/backend/access/nbtree/nbtree.c b/src/backend/access/nbtree/nbtree.c
index 136e9408a..3151c2dc7 100644
--- a/src/backend/access/nbtree/nbtree.c
+++ b/src/backend/access/nbtree/nbtree.c
@@ -573,7 +573,7 @@ btparallelrescan(IndexScanDesc scan)
Assert(parallel_scan);
btscan = (BTParallelScanDesc) OffsetToPointer(parallel_scan,
- parallel_scan->ps_offset);
+ parallel_scan->ps_offset_am);
/*
* In theory, we don't need to acquire the spinlock here, because there
@@ -651,7 +651,7 @@ _bt_parallel_seize(IndexScanDesc scan, BlockNumber *next_scan_page,
}
btscan = (BTParallelScanDesc) OffsetToPointer(parallel_scan,
- parallel_scan->ps_offset);
+ parallel_scan->ps_offset_am);
while (1)
{
@@ -759,7 +759,7 @@ _bt_parallel_release(IndexScanDesc scan, BlockNumber next_scan_page,
Assert(BlockNumberIsValid(next_scan_page));
btscan = (BTParallelScanDesc) OffsetToPointer(parallel_scan,
- parallel_scan->ps_offset);
+ parallel_scan->ps_offset_am);
SpinLockAcquire(&btscan->btps_mutex);
btscan->btps_nextScanPage = next_scan_page;
@@ -798,7 +798,7 @@ _bt_parallel_done(IndexScanDesc scan)
return;
btscan = (BTParallelScanDesc) OffsetToPointer(parallel_scan,
- parallel_scan->ps_offset);
+ parallel_scan->ps_offset_am);
/*
* Mark the parallel scan as done, unless some other process did so
@@ -836,7 +836,7 @@ _bt_parallel_primscan_schedule(IndexScanDesc scan, BlockNumber curr_page)
Assert(so->numArrayKeys);
btscan = (BTParallelScanDesc) OffsetToPointer(parallel_scan,
- parallel_scan->ps_offset);
+ parallel_scan->ps_offset_am);
SpinLockAcquire(&btscan->btps_mutex);
if (btscan->btps_lastCurrPage == curr_page &&
diff --git a/src/backend/access/nbtree/nbtsearch.c b/src/backend/access/nbtree/nbtsearch.c
index 472ce06f1..3676d3d5d 100644
--- a/src/backend/access/nbtree/nbtsearch.c
+++ b/src/backend/access/nbtree/nbtsearch.c
@@ -950,6 +950,8 @@ _bt_first(IndexScanDesc scan, ScanDirection dir)
* _bt_search/_bt_endpoint below
*/
pgstat_count_index_scan(rel);
+ if (scan->instrument)
+ scan->instrument->nsearches++;
/*----------
* Examine the scan keys to discover where we need to start the scan.
diff --git a/src/backend/access/spgist/spgscan.c b/src/backend/access/spgist/spgscan.c
index 53f910e9d..25893050c 100644
--- a/src/backend/access/spgist/spgscan.c
+++ b/src/backend/access/spgist/spgscan.c
@@ -421,6 +421,8 @@ spgrescan(IndexScanDesc scan, ScanKey scankey, int nscankeys,
/* count an indexscan for stats */
pgstat_count_index_scan(scan->indexRelation);
+ if (scan->instrument)
+ scan->instrument->nsearches++;
}
void
diff --git a/src/backend/commands/explain.c b/src/backend/commands/explain.c
index d8a7232ce..8ca78f219 100644
--- a/src/backend/commands/explain.c
+++ b/src/backend/commands/explain.c
@@ -125,6 +125,7 @@ static void show_recursive_union_info(RecursiveUnionState *rstate,
static void show_memoize_info(MemoizeState *mstate, List *ancestors,
ExplainState *es);
static void show_hashagg_info(AggState *aggstate, ExplainState *es);
+static void show_indexsearches_info(PlanState *planstate, ExplainState *es);
static void show_tidbitmap_info(BitmapHeapScanState *planstate,
ExplainState *es);
static void show_instrumentation_count(const char *qlabel, int which,
@@ -2096,6 +2097,7 @@ ExplainNode(PlanState *planstate, List *ancestors,
if (plan->qual)
show_instrumentation_count("Rows Removed by Filter", 1,
planstate, es);
+ show_indexsearches_info(planstate, es);
break;
case T_IndexOnlyScan:
show_scan_qual(((IndexOnlyScan *) plan)->indexqual,
@@ -2112,10 +2114,12 @@ ExplainNode(PlanState *planstate, List *ancestors,
if (es->analyze)
ExplainPropertyFloat("Heap Fetches", NULL,
planstate->instrument->ntuples2, 0, es);
+ show_indexsearches_info(planstate, es);
break;
case T_BitmapIndexScan:
show_scan_qual(((BitmapIndexScan *) plan)->indexqualorig,
"Index Cond", planstate, ancestors, es);
+ show_indexsearches_info(planstate, es);
break;
case T_BitmapHeapScan:
show_scan_qual(((BitmapHeapScan *) plan)->bitmapqualorig,
@@ -3855,6 +3859,73 @@ show_hashagg_info(AggState *aggstate, ExplainState *es)
}
}
+/*
+ * Show the total number of index searches performed by a
+ * IndexScan/IndexOnlyScan/BitmapIndexScan node
+ */
+static void
+show_indexsearches_info(PlanState *planstate, ExplainState *es)
+{
+ Plan *plan = planstate->plan;
+ SharedIndexScanInstrumentation *SharedInfo = NULL;
+ uint64 nsearches = 0;
+
+ if (!es->analyze)
+ return;
+
+ /*
+ * Collect stats from the local process, even when it's a parallel query
+ */
+ switch (nodeTag(plan))
+ {
+ case T_IndexScan:
+ {
+ IndexScanState *indexstate = ((IndexScanState *) planstate);
+
+ nsearches = indexstate->iss_Instrument.nsearches;
+ SharedInfo = indexstate->iss_SharedInfo;
+ break;
+ }
+ case T_IndexOnlyScan:
+ {
+ IndexOnlyScanState *indexstate = ((IndexOnlyScanState *) planstate);
+
+ nsearches = indexstate->ioss_Instrument.nsearches;
+ SharedInfo = indexstate->ioss_SharedInfo;
+ break;
+ }
+ case T_BitmapIndexScan:
+ {
+ BitmapIndexScanState *indexstate = ((BitmapIndexScanState *) planstate);
+
+ nsearches = indexstate->biss_Instrument.nsearches;
+ SharedInfo = indexstate->biss_SharedInfo;
+ break;
+ }
+ default:
+ break;
+ }
+
+ /*
+ * Merge results from workers into local process counters.
+ *
+ * In a parallel query, the leader process may or may not have run the
+ * index scan. Therefore we have to be prepared to get instrumentation
+ * data from all participants.
+ */
+ if (SharedInfo)
+ {
+ for (int i = 0; i < SharedInfo->num_workers; ++i)
+ {
+ IndexScanInstrumentation *winstrument = &SharedInfo->winstrument[i];
+
+ nsearches += winstrument->nsearches;
+ }
+ }
+
+ ExplainPropertyUInteger("Index Searches", NULL, nsearches, es);
+}
+
/*
* Show exact/lossy pages for a BitmapHeapScan node
*/
diff --git a/src/backend/executor/execIndexing.c b/src/backend/executor/execIndexing.c
index 742f3f8c0..e3fe9b78b 100644
--- a/src/backend/executor/execIndexing.c
+++ b/src/backend/executor/execIndexing.c
@@ -816,7 +816,7 @@ check_exclusion_or_unique_constraint(Relation heap, Relation index,
retry:
conflict = false;
found_self = false;
- index_scan = index_beginscan(heap, index, &DirtySnapshot, indnkeyatts, 0);
+ index_scan = index_beginscan(heap, index, &DirtySnapshot, NULL, indnkeyatts, 0);
index_rescan(index_scan, scankeys, indnkeyatts, NULL, 0);
while (index_getnext_slot(index_scan, ForwardScanDirection, existing_slot))
diff --git a/src/backend/executor/execParallel.c b/src/backend/executor/execParallel.c
index 1bedb8083..38c2833bc 100644
--- a/src/backend/executor/execParallel.c
+++ b/src/backend/executor/execParallel.c
@@ -28,6 +28,7 @@
#include "executor/nodeAgg.h"
#include "executor/nodeAppend.h"
#include "executor/nodeBitmapHeapscan.h"
+#include "executor/nodeBitmapIndexscan.h"
#include "executor/nodeCustom.h"
#include "executor/nodeForeignscan.h"
#include "executor/nodeHash.h"
@@ -244,14 +245,19 @@ ExecParallelEstimate(PlanState *planstate, ExecParallelEstimateContext *e)
e->pcxt);
break;
case T_IndexScanState:
- if (planstate->plan->parallel_aware)
- ExecIndexScanEstimate((IndexScanState *) planstate,
- e->pcxt);
+ /* even when not parallel-aware, for EXPLAIN ANALYZE */
+ ExecIndexScanEstimate((IndexScanState *) planstate,
+ e->pcxt);
+ break;
+ case T_BitmapIndexScanState:
+ /* even when not parallel-aware, for EXPLAIN ANALYZE */
+ ExecBitmapIndexScanEstimate((BitmapIndexScanState *) planstate,
+ e->pcxt);
break;
case T_IndexOnlyScanState:
- if (planstate->plan->parallel_aware)
- ExecIndexOnlyScanEstimate((IndexOnlyScanState *) planstate,
- e->pcxt);
+ /* even when not parallel-aware, for EXPLAIN ANALYZE */
+ ExecIndexOnlyScanEstimate((IndexOnlyScanState *) planstate,
+ e->pcxt);
break;
case T_ForeignScanState:
if (planstate->plan->parallel_aware)
@@ -468,14 +474,17 @@ ExecParallelInitializeDSM(PlanState *planstate,
d->pcxt);
break;
case T_IndexScanState:
- if (planstate->plan->parallel_aware)
- ExecIndexScanInitializeDSM((IndexScanState *) planstate,
- d->pcxt);
+ /* even when not parallel-aware, for EXPLAIN ANALYZE */
+ ExecIndexScanInitializeDSM((IndexScanState *) planstate, d->pcxt);
+ break;
+ case T_BitmapIndexScanState:
+ /* even when not parallel-aware, for EXPLAIN ANALYZE */
+ ExecBitmapIndexScanInitializeDSM((BitmapIndexScanState *) planstate, d->pcxt);
break;
case T_IndexOnlyScanState:
- if (planstate->plan->parallel_aware)
- ExecIndexOnlyScanInitializeDSM((IndexOnlyScanState *) planstate,
- d->pcxt);
+ /* even when not parallel-aware, for EXPLAIN ANALYZE */
+ ExecIndexOnlyScanInitializeDSM((IndexOnlyScanState *) planstate,
+ d->pcxt);
break;
case T_ForeignScanState:
if (planstate->plan->parallel_aware)
@@ -969,14 +978,13 @@ ExecParallelReInitializeDSM(PlanState *planstate,
pcxt);
break;
case T_IndexScanState:
- if (planstate->plan->parallel_aware)
- ExecIndexScanReInitializeDSM((IndexScanState *) planstate,
- pcxt);
+ /* even when not parallel-aware, for EXPLAIN ANALYZE */
+ ExecIndexScanReInitializeDSM((IndexScanState *) planstate, pcxt);
break;
case T_IndexOnlyScanState:
- if (planstate->plan->parallel_aware)
- ExecIndexOnlyScanReInitializeDSM((IndexOnlyScanState *) planstate,
- pcxt);
+ /* even when not parallel-aware, for EXPLAIN ANALYZE */
+ ExecIndexOnlyScanReInitializeDSM((IndexOnlyScanState *) planstate,
+ pcxt);
break;
case T_ForeignScanState:
if (planstate->plan->parallel_aware)
@@ -1063,6 +1071,15 @@ ExecParallelRetrieveInstrumentation(PlanState *planstate,
/* Perform any node-type-specific work that needs to be done. */
switch (nodeTag(planstate))
{
+ case T_IndexScanState:
+ ExecIndexScanRetrieveInstrumentation((IndexScanState *) planstate);
+ break;
+ case T_BitmapIndexScanState:
+ ExecBitmapIndexScanRetrieveInstrumentation((BitmapIndexScanState *) planstate);
+ break;
+ case T_IndexOnlyScanState:
+ ExecIndexOnlyScanRetrieveInstrumentation((IndexOnlyScanState *) planstate);
+ break;
case T_SortState:
ExecSortRetrieveInstrumentation((SortState *) planstate);
break;
@@ -1330,14 +1347,18 @@ ExecParallelInitializeWorker(PlanState *planstate, ParallelWorkerContext *pwcxt)
ExecSeqScanInitializeWorker((SeqScanState *) planstate, pwcxt);
break;
case T_IndexScanState:
- if (planstate->plan->parallel_aware)
- ExecIndexScanInitializeWorker((IndexScanState *) planstate,
- pwcxt);
+ /* even when not parallel-aware, for EXPLAIN ANALYZE */
+ ExecIndexScanInitializeWorker((IndexScanState *) planstate, pwcxt);
+ break;
+ case T_BitmapIndexScanState:
+ /* even when not parallel-aware, for EXPLAIN ANALYZE */
+ ExecBitmapIndexScanInitializeWorker((BitmapIndexScanState *) planstate,
+ pwcxt);
break;
case T_IndexOnlyScanState:
- if (planstate->plan->parallel_aware)
- ExecIndexOnlyScanInitializeWorker((IndexOnlyScanState *) planstate,
- pwcxt);
+ /* even when not parallel-aware, for EXPLAIN ANALYZE */
+ ExecIndexOnlyScanInitializeWorker((IndexOnlyScanState *) planstate,
+ pwcxt);
break;
case T_ForeignScanState:
if (planstate->plan->parallel_aware)
diff --git a/src/backend/executor/execReplication.c b/src/backend/executor/execReplication.c
index 5cef54f00..b52031b41 100644
--- a/src/backend/executor/execReplication.c
+++ b/src/backend/executor/execReplication.c
@@ -202,7 +202,7 @@ RelationFindReplTupleByIndex(Relation rel, Oid idxoid,
skey_attoff = build_replindex_scan_key(skey, rel, idxrel, searchslot);
/* Start an index scan. */
- scan = index_beginscan(rel, idxrel, &snap, skey_attoff, 0);
+ scan = index_beginscan(rel, idxrel, &snap, NULL, skey_attoff, 0);
retry:
found = false;
diff --git a/src/backend/executor/nodeBitmapIndexscan.c b/src/backend/executor/nodeBitmapIndexscan.c
index 0b32c3a02..1595741aa 100644
--- a/src/backend/executor/nodeBitmapIndexscan.c
+++ b/src/backend/executor/nodeBitmapIndexscan.c
@@ -183,6 +183,21 @@ ExecEndBitmapIndexScan(BitmapIndexScanState *node)
indexRelationDesc = node->biss_RelationDesc;
indexScanDesc = node->biss_ScanDesc;
+ /*
+ * When ending a parallel worker, copy the statistics gathered by the
+ * worker back into shared memory so that it can be picked up by the main
+ * process to report in EXPLAIN ANALYZE.
+ */
+ if (node->biss_SharedInfo != NULL && IsParallelWorker())
+ {
+ IndexScanInstrumentation *winstrument;
+
+ Assert(ParallelWorkerNumber <= node->biss_SharedInfo->num_workers);
+ winstrument = &node->biss_SharedInfo->winstrument[ParallelWorkerNumber];
+ memcpy(winstrument, &node->biss_Instrument,
+ sizeof(IndexScanInstrumentation));
+ }
+
/*
* close the index relation (no-op if we didn't open it)
*/
@@ -217,6 +232,7 @@ ExecInitBitmapIndexScan(BitmapIndexScan *node, EState *estate, int eflags)
/* normally we don't make the result bitmap till runtime */
indexstate->biss_result = NULL;
+ indexstate->biss_SharedInfo = NULL;
/*
* We do not open or lock the base relation here. We assume that an
@@ -302,6 +318,7 @@ ExecInitBitmapIndexScan(BitmapIndexScan *node, EState *estate, int eflags)
indexstate->biss_ScanDesc =
index_beginscan_bitmap(indexstate->biss_RelationDesc,
estate->es_snapshot,
+ &indexstate->biss_Instrument,
indexstate->biss_NumScanKeys);
/*
@@ -319,3 +336,98 @@ ExecInitBitmapIndexScan(BitmapIndexScan *node, EState *estate, int eflags)
*/
return indexstate;
}
+
+/* ----------------------------------------------------------------
+ * ExecBitmapIndexScanEstimate
+ *
+ * Compute the amount of space we'll need in the parallel
+ * query DSM, and inform pcxt->estimator about our needs.
+ * ----------------------------------------------------------------
+ */
+void
+ExecBitmapIndexScanEstimate(BitmapIndexScanState *node, ParallelContext *pcxt)
+{
+ Size size;
+
+ /* don't need this if not instrumenting or no workers */
+ if (!node->ss.ps.instrument || pcxt->nworkers == 0)
+ return;
+
+ size = mul_size(pcxt->nworkers, sizeof(IndexScanInstrumentation));
+ size = add_size(size, offsetof(SharedIndexScanInstrumentation, winstrument));
+ shm_toc_estimate_chunk(&pcxt->estimator, size);
+ shm_toc_estimate_keys(&pcxt->estimator, 1);
+}
+
+/* ----------------------------------------------------------------
+ * ExecBitmapIndexScanInitializeDSM
+ *
+ * Set up parallel bitmap index scan shared instrumentation.
+ * ----------------------------------------------------------------
+ */
+void
+ExecBitmapIndexScanInitializeDSM(BitmapIndexScanState *node,
+ ParallelContext *pcxt)
+{
+ Size size;
+
+ /* don't need this if not instrumenting or no workers */
+ if (!node->ss.ps.instrument || pcxt->nworkers == 0)
+ return;
+
+ size = offsetof(SharedIndexScanInstrumentation, winstrument) +
+ pcxt->nworkers * sizeof(IndexScanInstrumentation);
+ node->biss_SharedInfo =
+ (SharedIndexScanInstrumentation *) shm_toc_allocate(pcxt->toc,
+ size);
+
+ /* Each per-worker area must start out as zeroes. */
+ memset(node->biss_SharedInfo, 0, size);
+
+ node->biss_SharedInfo->num_workers = pcxt->nworkers;
+ shm_toc_insert(pcxt->toc, node->ss.ps.plan->plan_node_id,
+ node->biss_SharedInfo);
+}
+
+/* ----------------------------------------------------------------
+ * ExecBitmapIndexScanInitializeWorker
+ *
+ * Copy relevant information from TOC into planstate.
+ * ----------------------------------------------------------------
+ */
+void
+ExecBitmapIndexScanInitializeWorker(BitmapIndexScanState *node,
+ ParallelWorkerContext *pwcxt)
+{
+ /* don't need this if not instrumenting */
+ if (!node->ss.ps.instrument)
+ return;
+
+ /*
+ * Find our entry in the shared area, and set up a pointer to it
+ */
+ node->biss_SharedInfo = (SharedIndexScanInstrumentation *)
+ shm_toc_lookup(pwcxt->toc, node->ss.ps.plan->plan_node_id, false);
+}
+
+/* ----------------------------------------------------------------
+ * ExecBitmapIndexScanRetrieveInstrumentation
+ *
+ * Transfer bitmap index scan statistics from DSM to private memory.
+ * ----------------------------------------------------------------
+ */
+void
+ExecBitmapIndexScanRetrieveInstrumentation(BitmapIndexScanState *node)
+{
+ SharedIndexScanInstrumentation *SharedInfo = node->biss_SharedInfo;
+ size_t size;
+
+ if (SharedInfo == NULL)
+ return;
+
+ /* Replace node->shared_info with a copy in backend-local memory. */
+ size = offsetof(SharedIndexScanInstrumentation, winstrument) +
+ SharedInfo->num_workers * sizeof(IndexScanInstrumentation);
+ node->biss_SharedInfo = palloc(size);
+ memcpy(node->biss_SharedInfo, SharedInfo, size);
+}
diff --git a/src/backend/executor/nodeIndexonlyscan.c b/src/backend/executor/nodeIndexonlyscan.c
index e66352331..f393cd3a9 100644
--- a/src/backend/executor/nodeIndexonlyscan.c
+++ b/src/backend/executor/nodeIndexonlyscan.c
@@ -92,6 +92,7 @@ IndexOnlyNext(IndexOnlyScanState *node)
scandesc = index_beginscan(node->ss.ss_currentRelation,
node->ioss_RelationDesc,
estate->es_snapshot,
+ &node->ioss_Instrument,
node->ioss_NumScanKeys,
node->ioss_NumOrderByKeys);
@@ -413,6 +414,21 @@ ExecEndIndexOnlyScan(IndexOnlyScanState *node)
node->ioss_VMBuffer = InvalidBuffer;
}
+ /*
+ * When ending a parallel worker, copy the statistics gathered by the
+ * worker back into shared memory so that it can be picked up by the main
+ * process to report in EXPLAIN ANALYZE.
+ */
+ if (node->ioss_SharedInfo != NULL && IsParallelWorker())
+ {
+ IndexScanInstrumentation *winstrument;
+
+ Assert(ParallelWorkerNumber <= node->ioss_SharedInfo->num_workers);
+ winstrument = &node->ioss_SharedInfo->winstrument[ParallelWorkerNumber];
+ memcpy(winstrument, &node->ioss_Instrument,
+ sizeof(IndexScanInstrumentation));
+ }
+
/*
* close the index relation (no-op if we didn't open it)
*/
@@ -593,6 +609,8 @@ ExecInitIndexOnlyScan(IndexOnlyScan *node, EState *estate, int eflags)
indexstate->ioss_RuntimeKeysReady = false;
indexstate->ioss_RuntimeKeys = NULL;
indexstate->ioss_NumRuntimeKeys = 0;
+ indexstate->ioss_ParallelScanDesc = NULL;
+ indexstate->ioss_SharedInfo = NULL;
/*
* build the index scan keys from the index qualification
@@ -711,7 +729,10 @@ ExecIndexOnlyScanEstimate(IndexOnlyScanState *node,
node->ioss_PscanLen = index_parallelscan_estimate(node->ioss_RelationDesc,
node->ioss_NumScanKeys,
node->ioss_NumOrderByKeys,
- estate->es_snapshot);
+ estate->es_snapshot,
+ pcxt->nworkers,
+ node->ss.ps.instrument != NULL,
+ &node->ioss_InstrOffset);
shm_toc_estimate_chunk(&pcxt->estimator, node->ioss_PscanLen);
shm_toc_estimate_keys(&pcxt->estimator, 1);
}
@@ -727,31 +748,50 @@ ExecIndexOnlyScanInitializeDSM(IndexOnlyScanState *node,
ParallelContext *pcxt)
{
EState *estate = node->ss.ps.state;
+ Size size;
ParallelIndexScanDesc piscan;
piscan = shm_toc_allocate(pcxt->toc, node->ioss_PscanLen);
index_parallelscan_initialize(node->ss.ss_currentRelation,
node->ioss_RelationDesc,
estate->es_snapshot,
- piscan);
+ piscan, node->ioss_InstrOffset);
shm_toc_insert(pcxt->toc, node->ss.ps.plan->plan_node_id, piscan);
- node->ioss_ScanDesc =
- index_beginscan_parallel(node->ss.ss_currentRelation,
- node->ioss_RelationDesc,
- node->ioss_NumScanKeys,
- node->ioss_NumOrderByKeys,
- piscan);
- node->ioss_ScanDesc->xs_want_itup = true;
- node->ioss_VMBuffer = InvalidBuffer;
+ node->ioss_ParallelScanDesc = piscan;
+ if (node->ss.ps.plan->parallel_aware)
+ {
+ node->ioss_ScanDesc =
+ index_beginscan_parallel(node->ss.ss_currentRelation,
+ node->ioss_RelationDesc,
+ node->ioss_NumScanKeys,
+ node->ioss_NumOrderByKeys,
+ piscan,
+ &node->ioss_Instrument);
+ node->ioss_ScanDesc->xs_want_itup = true;
+ node->ioss_VMBuffer = InvalidBuffer;
- /*
- * If no run-time keys to calculate or they are ready, go ahead and pass
- * the scankeys to the index AM.
- */
- if (node->ioss_NumRuntimeKeys == 0 || node->ioss_RuntimeKeysReady)
- index_rescan(node->ioss_ScanDesc,
- node->ioss_ScanKeys, node->ioss_NumScanKeys,
- node->ioss_OrderByKeys, node->ioss_NumOrderByKeys);
+ /*
+ * If no run-time keys to calculate or they are ready, go ahead and
+ * pass the scankeys to the index AM.
+ */
+ if (node->ioss_NumRuntimeKeys == 0 || node->ioss_RuntimeKeysReady)
+ index_rescan(node->ioss_ScanDesc,
+ node->ioss_ScanKeys, node->ioss_NumScanKeys,
+ node->ioss_OrderByKeys, node->ioss_NumOrderByKeys);
+ }
+
+ /* don't need this if not instrumenting */
+ if (node->ioss_InstrOffset == 0)
+ return;
+
+ size = offsetof(SharedIndexScanInstrumentation, winstrument) +
+ pcxt->nworkers * sizeof(IndexScanInstrumentation);
+ node->ioss_SharedInfo = (SharedIndexScanInstrumentation *)
+ OffsetToPointer(piscan, piscan->ps_offset_ins);
+
+ /* Each per-worker area must start out as zeroes. */
+ memset(node->ioss_SharedInfo, 0, size);
+ node->ioss_SharedInfo->num_workers = pcxt->nworkers;
}
/* ----------------------------------------------------------------
@@ -780,20 +820,56 @@ ExecIndexOnlyScanInitializeWorker(IndexOnlyScanState *node,
ParallelIndexScanDesc piscan;
piscan = shm_toc_lookup(pwcxt->toc, node->ss.ps.plan->plan_node_id, false);
- node->ioss_ScanDesc =
- index_beginscan_parallel(node->ss.ss_currentRelation,
- node->ioss_RelationDesc,
- node->ioss_NumScanKeys,
- node->ioss_NumOrderByKeys,
- piscan);
- node->ioss_ScanDesc->xs_want_itup = true;
+ if (node->ss.ps.plan->parallel_aware)
+ {
+ node->ioss_ScanDesc =
+ index_beginscan_parallel(node->ss.ss_currentRelation,
+ node->ioss_RelationDesc,
+ node->ioss_NumScanKeys,
+ node->ioss_NumOrderByKeys,
+ piscan,
+ &node->ioss_Instrument);
+ node->ioss_ScanDesc->xs_want_itup = true;
+
+ /*
+ * If no run-time keys to calculate or they are ready, go ahead and
+ * pass the scankeys to the index AM.
+ */
+ if (node->ioss_NumRuntimeKeys == 0 || node->ioss_RuntimeKeysReady)
+ index_rescan(node->ioss_ScanDesc,
+ node->ioss_ScanKeys, node->ioss_NumScanKeys,
+ node->ioss_OrderByKeys, node->ioss_NumOrderByKeys);
+ }
+
+ /* don't need this if not instrumenting */
+ if (piscan->ps_offset_ins == 0)
+ return;
/*
- * If no run-time keys to calculate or they are ready, go ahead and pass
- * the scankeys to the index AM.
+ * Find our entry in the shared area, and set up a pointer to it
*/
- if (node->ioss_NumRuntimeKeys == 0 || node->ioss_RuntimeKeysReady)
- index_rescan(node->ioss_ScanDesc,
- node->ioss_ScanKeys, node->ioss_NumScanKeys,
- node->ioss_OrderByKeys, node->ioss_NumOrderByKeys);
+ node->ioss_SharedInfo = (SharedIndexScanInstrumentation *)
+ OffsetToPointer(piscan, piscan->ps_offset_ins);
+}
+
+/* ----------------------------------------------------------------
+ * ExecIndexOnlyScanRetrieveInstrumentation
+ *
+ * Transfer index-only statistics from DSM to private memory.
+ * ----------------------------------------------------------------
+ */
+void
+ExecIndexOnlyScanRetrieveInstrumentation(IndexOnlyScanState *node)
+{
+ SharedIndexScanInstrumentation *SharedInfo = node->ioss_SharedInfo;
+ size_t size;
+
+ if (SharedInfo == NULL)
+ return;
+
+ /* Replace node->shared_info with a copy in backend-local memory. */
+ size = offsetof(SharedIndexScanInstrumentation, winstrument) +
+ SharedInfo->num_workers * sizeof(IndexScanInstrumentation);
+ node->ioss_SharedInfo = palloc(size);
+ memcpy(node->ioss_SharedInfo, SharedInfo, size);
}
diff --git a/src/backend/executor/nodeIndexscan.c b/src/backend/executor/nodeIndexscan.c
index c30b9c2c1..b3b3c33ba 100644
--- a/src/backend/executor/nodeIndexscan.c
+++ b/src/backend/executor/nodeIndexscan.c
@@ -109,6 +109,7 @@ IndexNext(IndexScanState *node)
scandesc = index_beginscan(node->ss.ss_currentRelation,
node->iss_RelationDesc,
estate->es_snapshot,
+ &node->iss_Instrument,
node->iss_NumScanKeys,
node->iss_NumOrderByKeys);
@@ -204,6 +205,7 @@ IndexNextWithReorder(IndexScanState *node)
scandesc = index_beginscan(node->ss.ss_currentRelation,
node->iss_RelationDesc,
estate->es_snapshot,
+ &node->iss_Instrument,
node->iss_NumScanKeys,
node->iss_NumOrderByKeys);
@@ -793,6 +795,21 @@ ExecEndIndexScan(IndexScanState *node)
indexRelationDesc = node->iss_RelationDesc;
indexScanDesc = node->iss_ScanDesc;
+ /*
+ * When ending a parallel worker, copy the statistics gathered by the
+ * worker back into shared memory so that it can be picked up by the main
+ * process to report in EXPLAIN ANALYZE.
+ */
+ if (node->iss_SharedInfo != NULL && IsParallelWorker())
+ {
+ IndexScanInstrumentation *winstrument;
+
+ Assert(ParallelWorkerNumber <= node->iss_SharedInfo->num_workers);
+ winstrument = &node->iss_SharedInfo->winstrument[ParallelWorkerNumber];
+ memcpy(winstrument, &node->iss_Instrument,
+ sizeof(IndexScanInstrumentation));
+ }
+
/*
* close the index relation (no-op if we didn't open it)
*/
@@ -960,6 +977,8 @@ ExecInitIndexScan(IndexScan *node, EState *estate, int eflags)
indexstate->iss_RuntimeKeysReady = false;
indexstate->iss_RuntimeKeys = NULL;
indexstate->iss_NumRuntimeKeys = 0;
+ indexstate->iss_ParallelScanDesc = NULL;
+ indexstate->iss_SharedInfo = NULL;
/*
* build the index scan keys from the index qualification
@@ -1646,7 +1665,10 @@ ExecIndexScanEstimate(IndexScanState *node,
node->iss_PscanLen = index_parallelscan_estimate(node->iss_RelationDesc,
node->iss_NumScanKeys,
node->iss_NumOrderByKeys,
- estate->es_snapshot);
+ estate->es_snapshot,
+ pcxt->nworkers,
+ node->ss.ps.instrument != NULL,
+ &node->iss_InstrOffset);
shm_toc_estimate_chunk(&pcxt->estimator, node->iss_PscanLen);
shm_toc_estimate_keys(&pcxt->estimator, 1);
}
@@ -1662,29 +1684,49 @@ ExecIndexScanInitializeDSM(IndexScanState *node,
ParallelContext *pcxt)
{
EState *estate = node->ss.ps.state;
+ Size size;
ParallelIndexScanDesc piscan;
piscan = shm_toc_allocate(pcxt->toc, node->iss_PscanLen);
index_parallelscan_initialize(node->ss.ss_currentRelation,
node->iss_RelationDesc,
estate->es_snapshot,
- piscan);
- shm_toc_insert(pcxt->toc, node->ss.ps.plan->plan_node_id, piscan);
- node->iss_ScanDesc =
- index_beginscan_parallel(node->ss.ss_currentRelation,
- node->iss_RelationDesc,
- node->iss_NumScanKeys,
- node->iss_NumOrderByKeys,
- piscan);
+ piscan, node->iss_InstrOffset);
- /*
- * If no run-time keys to calculate or they are ready, go ahead and pass
- * the scankeys to the index AM.
- */
- if (node->iss_NumRuntimeKeys == 0 || node->iss_RuntimeKeysReady)
- index_rescan(node->iss_ScanDesc,
- node->iss_ScanKeys, node->iss_NumScanKeys,
- node->iss_OrderByKeys, node->iss_NumOrderByKeys);
+ shm_toc_insert(pcxt->toc, node->ss.ps.plan->plan_node_id, piscan);
+ node->iss_ParallelScanDesc = piscan;
+ if (node->ss.ps.plan->parallel_aware)
+ {
+ node->iss_ScanDesc =
+ index_beginscan_parallel(node->ss.ss_currentRelation,
+ node->iss_RelationDesc,
+ node->iss_NumScanKeys,
+ node->iss_NumOrderByKeys,
+ piscan,
+ &node->iss_Instrument);
+
+ /*
+ * If no run-time keys to calculate or they are ready, go ahead and
+ * pass the scankeys to the index AM.
+ */
+ if (node->iss_NumRuntimeKeys == 0 || node->iss_RuntimeKeysReady)
+ index_rescan(node->iss_ScanDesc,
+ node->iss_ScanKeys, node->iss_NumScanKeys,
+ node->iss_OrderByKeys, node->iss_NumOrderByKeys);
+ }
+
+ /* don't need this if not instrumenting */
+ if (node->iss_InstrOffset == 0)
+ return;
+
+ size = offsetof(SharedIndexScanInstrumentation, winstrument) +
+ pcxt->nworkers * sizeof(IndexScanInstrumentation);
+ node->iss_SharedInfo = (SharedIndexScanInstrumentation *)
+ OffsetToPointer(piscan, piscan->ps_offset_ins);
+
+ /* Each per-worker area must start out as zeroes. */
+ memset(node->iss_SharedInfo, 0, size);
+ node->iss_SharedInfo->num_workers = pcxt->nworkers;
}
/* ----------------------------------------------------------------
@@ -1713,19 +1755,55 @@ ExecIndexScanInitializeWorker(IndexScanState *node,
ParallelIndexScanDesc piscan;
piscan = shm_toc_lookup(pwcxt->toc, node->ss.ps.plan->plan_node_id, false);
- node->iss_ScanDesc =
- index_beginscan_parallel(node->ss.ss_currentRelation,
- node->iss_RelationDesc,
- node->iss_NumScanKeys,
- node->iss_NumOrderByKeys,
- piscan);
+ if (node->ss.ps.plan->parallel_aware)
+ {
+ node->iss_ScanDesc =
+ index_beginscan_parallel(node->ss.ss_currentRelation,
+ node->iss_RelationDesc,
+ node->iss_NumScanKeys,
+ node->iss_NumOrderByKeys,
+ piscan,
+ &node->iss_Instrument);
+
+ /*
+ * If no run-time keys to calculate or they are ready, go ahead and
+ * pass the scankeys to the index AM.
+ */
+ if (node->iss_NumRuntimeKeys == 0 || node->iss_RuntimeKeysReady)
+ index_rescan(node->iss_ScanDesc,
+ node->iss_ScanKeys, node->iss_NumScanKeys,
+ node->iss_OrderByKeys, node->iss_NumOrderByKeys);
+ }
+
+ /* don't need this if not instrumenting */
+ if (piscan->ps_offset_ins == 0)
+ return;
/*
- * If no run-time keys to calculate or they are ready, go ahead and pass
- * the scankeys to the index AM.
+ * Find our entry in the shared area, and set up a pointer to it
*/
- if (node->iss_NumRuntimeKeys == 0 || node->iss_RuntimeKeysReady)
- index_rescan(node->iss_ScanDesc,
- node->iss_ScanKeys, node->iss_NumScanKeys,
- node->iss_OrderByKeys, node->iss_NumOrderByKeys);
+ node->iss_SharedInfo = (SharedIndexScanInstrumentation *)
+ OffsetToPointer(piscan, piscan->ps_offset_ins);
+}
+
+/* ----------------------------------------------------------------
+ * ExecIndexScanRetrieveInstrumentation
+ *
+ * Transfer index scan statistics from DSM to private memory.
+ * ----------------------------------------------------------------
+ */
+void
+ExecIndexScanRetrieveInstrumentation(IndexScanState *node)
+{
+ SharedIndexScanInstrumentation *SharedInfo = node->iss_SharedInfo;
+ size_t size;
+
+ if (SharedInfo == NULL)
+ return;
+
+ /* Replace node->shared_info with a copy in backend-local memory. */
+ size = offsetof(SharedIndexScanInstrumentation, winstrument) +
+ SharedInfo->num_workers * sizeof(IndexScanInstrumentation);
+ node->iss_SharedInfo = palloc(size);
+ memcpy(node->iss_SharedInfo, SharedInfo, size);
}
diff --git a/src/backend/utils/adt/selfuncs.c b/src/backend/utils/adt/selfuncs.c
index c2918c9c8..b4dc91c7c 100644
--- a/src/backend/utils/adt/selfuncs.c
+++ b/src/backend/utils/adt/selfuncs.c
@@ -6376,7 +6376,7 @@ get_actual_variable_endpoint(Relation heapRel,
GlobalVisTestFor(heapRel));
index_scan = index_beginscan(heapRel, indexRel,
- &SnapshotNonVacuumable,
+ &SnapshotNonVacuumable, NULL,
1, 0);
/* Set it up for index-only scan */
index_scan->xs_want_itup = true;
diff --git a/contrib/bloom/blscan.c b/contrib/bloom/blscan.c
index bf801fe78..d072f47fe 100644
--- a/contrib/bloom/blscan.c
+++ b/contrib/bloom/blscan.c
@@ -116,6 +116,8 @@ blgetbitmap(IndexScanDesc scan, TIDBitmap *tbm)
bas = GetAccessStrategy(BAS_BULKREAD);
npages = RelationGetNumberOfBlocks(scan->indexRelation);
pgstat_count_index_scan(scan->indexRelation);
+ if (scan->instrument)
+ scan->instrument->nsearches++;
for (blkno = BLOOM_HEAD_BLKNO; blkno < npages; blkno++)
{
diff --git a/doc/src/sgml/bloom.sgml b/doc/src/sgml/bloom.sgml
index 663a0a4a6..ec5d07767 100644
--- a/doc/src/sgml/bloom.sgml
+++ b/doc/src/sgml/bloom.sgml
@@ -173,10 +173,11 @@ CREATE INDEX
Buffers: shared hit=21864
-> Bitmap Index Scan on bloomidx (cost=0.00..178436.00 rows=1 width=0) (actual time=20.005..20.005 rows=2300.00 loops=1)
Index Cond: ((i2 = 898732) AND (i5 = 123451))
+ Index Searches: 1
Buffers: shared hit=19608
Planning Time: 0.099 ms
Execution Time: 22.632 ms
-(10 rows)
+(11 rows)
</programlisting>
</para>
@@ -208,13 +209,15 @@ CREATE INDEX
Buffers: shared hit=6
-> Bitmap Index Scan on btreeidx5 (cost=0.00..4.52 rows=11 width=0) (actual time=0.026..0.026 rows=7.00 loops=1)
Index Cond: (i5 = 123451)
+ Index Searches: 1
Buffers: shared hit=3
-> Bitmap Index Scan on btreeidx2 (cost=0.00..4.52 rows=11 width=0) (actual time=0.007..0.007 rows=8.00 loops=1)
Index Cond: (i2 = 898732)
+ Index Searches: 1
Buffers: shared hit=3
Planning Time: 0.264 ms
Execution Time: 0.047 ms
-(13 rows)
+(15 rows)
</programlisting>
Although this query runs much faster than with either of the single
indexes, we pay a penalty in index size. Each of the single-column
diff --git a/doc/src/sgml/monitoring.sgml b/doc/src/sgml/monitoring.sgml
index 16646f560..fd9bdd884 100644
--- a/doc/src/sgml/monitoring.sgml
+++ b/doc/src/sgml/monitoring.sgml
@@ -4234,16 +4234,32 @@ description | Waiting for a newly initialized WAL file to reach durable storage
<note>
<para>
- Queries that use certain <acronym>SQL</acronym> constructs to search for
- rows matching any value out of a list or array of multiple scalar values
- (see <xref linkend="functions-comparisons"/>) perform multiple
- <quote>primitive</quote> index scans (up to one primitive scan per scalar
- value) during query execution. Each internal primitive index scan
- increments <structname>pg_stat_all_indexes</structname>.<structfield>idx_scan</structfield>,
+ Index scans may sometimes perform multiple index searches per execution.
+ Each index search increments <structname>pg_stat_all_indexes</structname>.<structfield>idx_scan</structfield>,
so it's possible for the count of index scans to significantly exceed the
total number of index scan executor node executions.
</para>
+ <para>
+ This can happen with queries that use certain <acronym>SQL</acronym>
+ constructs to search for rows matching any value out of a list or array of
+ multiple scalar values (see <xref linkend="functions-comparisons"/>). It
+ can also happen to queries with a
+ <literal><replaceable>column_name</replaceable> =
+ <replaceable>value1</replaceable> OR
+ <replaceable>column_name</replaceable> =
+ <replaceable>value2</replaceable> ...</literal> construct, though only
+ when the optimizer transforms the construct into an equivalent
+ multi-valued array representation.
+ </para>
</note>
+ <tip>
+ <para>
+ <command>EXPLAIN ANALYZE</command> outputs the total number of index
+ searches performed by each index scan node. See
+ <xref linkend="using-explain-analyze"/> for an example demonstrating how
+ this works.
+ </para>
+ </tip>
</sect2>
diff --git a/doc/src/sgml/perform.sgml b/doc/src/sgml/perform.sgml
index 91feb59ab..b4bb03253 100644
--- a/doc/src/sgml/perform.sgml
+++ b/doc/src/sgml/perform.sgml
@@ -729,9 +729,11 @@ WHERE t1.unique1 < 10 AND t1.unique2 = t2.unique2;
Buffers: shared hit=3 read=5 written=4
-> Bitmap Index Scan on tenk1_unique1 (cost=0.00..4.36 rows=10 width=0) (actual time=0.004..0.004 rows=10.00 loops=1)
Index Cond: (unique1 < 10)
+ Index Searches: 1
Buffers: shared hit=2
-> Index Scan using tenk2_unique2 on tenk2 t2 (cost=0.29..7.90 rows=1 width=244) (actual time=0.003..0.003 rows=1.00 loops=10)
Index Cond: (unique2 = t1.unique2)
+ Index Searches: 10
Buffers: shared hit=24 read=6
Planning:
Buffers: shared hit=15 dirtied=9
@@ -790,6 +792,7 @@ WHERE t1.unique1 < 100 AND t1.unique2 = t2.unique2 ORDER BY t1.fivethous;
Buffers: shared hit=92
-> Bitmap Index Scan on tenk1_unique1 (cost=0.00..5.04 rows=100 width=0) (actual time=0.013..0.013 rows=100.00 loops=1)
Index Cond: (unique1 < 100)
+ Index Searches: 1
Buffers: shared hit=2
Planning:
Buffers: shared hit=12
@@ -805,6 +808,58 @@ WHERE t1.unique1 < 100 AND t1.unique2 = t2.unique2 ORDER BY t1.fivethous;
shown.)
</para>
+ <para>
+ Index Scan nodes (as well as Bitmap Index Scan and Index-Only Scan nodes)
+ show an <quote>Index Searches</quote> line that reports the total number
+ of searches across <emphasis>all</emphasis> node
+ executions/<literal>loops</literal>:
+
+<screen>
+EXPLAIN ANALYZE SELECT * FROM tenk1 WHERE thousand IN (1, 500, 700, 999);
+ QUERY PLAN
+-------------------------------------------------------------------&zwsp;---------------------------------------------------------
+ Bitmap Heap Scan on tenk1 (cost=9.45..73.44 rows=40 width=244) (actual time=0.012..0.028 rows=40.00 loops=1)
+ Recheck Cond: (thousand = ANY ('{1,500,700,999}'::integer[]))
+ Heap Blocks: exact=39
+ Buffers: shared hit=47
+ -> Bitmap Index Scan on tenk1_thous_tenthous (cost=0.00..9.44 rows=40 width=0) (actual time=0.009..0.009 rows=40.00 loops=1)
+ Index Cond: (thousand = ANY ('{1,500,700,999}'::integer[]))
+ Index Searches: 4
+ Buffers: shared hit=8
+ Planning Time: 0.037 ms
+ Execution Time: 0.034 ms
+</screen>
+
+ Here we see a Bitmap Index Scan node that needed 4 separate index
+ searches. The scan had to search the index from the
+ <structname>tenk1_thous_tenthous</structname> index root page once per
+ <type>integer</type> value from the predicate's <literal>IN</literal>
+ construct. However, the number of index searches often won't have such a
+ simple correspondence to the query predicate:
+
+<screen>
+EXPLAIN ANALYZE SELECT * FROM tenk1 WHERE thousand IN (1, 2, 3, 4);
+ QUERY PLAN
+----------------------------------------------------------------------------------------------------------------------------------
+ Bitmap Heap Scan on tenk1 (cost=9.45..73.44 rows=40 width=244) (actual time=0.009..0.019 rows=40.00 loops=1)
+ Recheck Cond: (thousand = ANY ('{1,2,3,4}'::integer[]))
+ Heap Blocks: exact=38
+ Buffers: shared hit=40
+ -> Bitmap Index Scan on tenk1_thous_tenthous (cost=0.00..9.44 rows=40 width=0) (actual time=0.005..0.005 rows=40.00 loops=1)
+ Index Cond: (thousand = ANY ('{1,2,3,4}'::integer[]))
+ Index Searches: 1
+ Buffers: shared hit=2
+ Planning Time: 0.029 ms
+ Execution Time: 0.026 ms
+</screen>
+
+ This variant of our <literal>IN</literal> query performed only 1 index
+ search. It spent less time traversing the index (compared to the original
+ query) because its <literal>IN</literal> construct uses values matching
+ index tuples stored next to each other, on the same
+ <structname>tenk1_thous_tenthous</structname> index leaf page.
+ </para>
+
<para>
Another type of extra information is the number of rows removed by a
filter condition:
@@ -861,6 +916,7 @@ EXPLAIN ANALYZE SELECT * FROM polygon_tbl WHERE f1 @> polygon '(0.5,2.0)';
Index Scan using gpolygonind on polygon_tbl (cost=0.13..8.15 rows=1 width=85) (actual time=0.074..0.074 rows=0.00 loops=1)
Index Cond: (f1 @> '((0.5,2))'::polygon)
Rows Removed by Index Recheck: 1
+ Index Searches: 1
Buffers: shared hit=1
Planning Time: 0.039 ms
Execution Time: 0.098 ms
@@ -894,8 +950,10 @@ EXPLAIN (ANALYZE, BUFFERS OFF) SELECT * FROM tenk1 WHERE unique1 < 100 AND un
-> BitmapAnd (cost=25.07..25.07 rows=10 width=0) (actual time=0.100..0.101 rows=0.00 loops=1)
-> Bitmap Index Scan on tenk1_unique1 (cost=0.00..5.04 rows=100 width=0) (actual time=0.027..0.027 rows=100.00 loops=1)
Index Cond: (unique1 < 100)
+ Index Searches: 1
-> Bitmap Index Scan on tenk1_unique2 (cost=0.00..19.78 rows=999 width=0) (actual time=0.070..0.070 rows=999.00 loops=1)
Index Cond: (unique2 > 9000)
+ Index Searches: 1
Planning Time: 0.162 ms
Execution Time: 0.143 ms
</screen>
@@ -923,6 +981,7 @@ EXPLAIN ANALYZE UPDATE tenk1 SET hundred = hundred + 1 WHERE unique1 < 100;
Buffers: shared hit=4 read=2
-> Bitmap Index Scan on tenk1_unique1 (cost=0.00..5.04 rows=100 width=0) (actual time=0.031..0.031 rows=100.00 loops=1)
Index Cond: (unique1 < 100)
+ Index Searches: 1
Buffers: shared read=2
Planning Time: 0.151 ms
Execution Time: 1.856 ms
@@ -1061,6 +1120,7 @@ EXPLAIN ANALYZE SELECT * FROM tenk1 WHERE unique1 < 100 AND unique2 > 9000
Index Cond: (unique2 > 9000)
Filter: (unique1 < 100)
Rows Removed by Filter: 287
+ Index Searches: 1
Buffers: shared hit=16
Planning Time: 0.077 ms
Execution Time: 0.086 ms
diff --git a/doc/src/sgml/ref/explain.sgml b/doc/src/sgml/ref/explain.sgml
index 7daddf03e..9ed1061b7 100644
--- a/doc/src/sgml/ref/explain.sgml
+++ b/doc/src/sgml/ref/explain.sgml
@@ -506,10 +506,11 @@ EXPLAIN ANALYZE EXECUTE query(100, 200);
Buffers: shared hit=4
-> Index Scan using test_pkey on test (cost=0.29..10.27 rows=99 width=8) (actual time=0.009..0.025 rows=99.00 loops=1)
Index Cond: ((id > 100) AND (id < 200))
+ Index Searches: 1
Buffers: shared hit=4
Planning Time: 0.244 ms
Execution Time: 0.073 ms
-(9 rows)
+(10 rows)
</programlisting>
</para>
diff --git a/doc/src/sgml/rules.sgml b/doc/src/sgml/rules.sgml
index 1d9924a2a..8467d961f 100644
--- a/doc/src/sgml/rules.sgml
+++ b/doc/src/sgml/rules.sgml
@@ -1046,6 +1046,7 @@ SELECT count(*) FROM words WHERE word = 'caterpiler';
-> Index Only Scan using wrd_word on wrd (cost=0.42..4.44 rows=1 width=0) (actual time=0.039..0.039 rows=0.00 loops=1)
Index Cond: (word = 'caterpiler'::text)
Heap Fetches: 0
+ Index Searches: 1
Planning time: 0.164 ms
Execution time: 0.117 ms
</programlisting>
@@ -1090,6 +1091,7 @@ SELECT word FROM words ORDER BY word <-> 'caterpiler' LIMIT 10;
Limit (cost=0.29..1.06 rows=10 width=10) (actual time=187.222..188.257 rows=10.00 loops=1)
-> Index Scan using wrd_trgm on wrd (cost=0.29..37020.87 rows=479829 width=10) (actual time=187.219..188.252 rows=10.00 loops=1)
Order By: (word <-> 'caterpiler'::text)
+ Index Searches: 1
Planning time: 0.196 ms
Execution time: 198.640 ms
</programlisting>
diff --git a/src/test/regress/expected/brin_multi.out b/src/test/regress/expected/brin_multi.out
index 991b7eaca..cb5b5e53e 100644
--- a/src/test/regress/expected/brin_multi.out
+++ b/src/test/regress/expected/brin_multi.out
@@ -853,7 +853,8 @@ SELECT * FROM brin_date_test WHERE a = '2023-01-01'::date;
Recheck Cond: (a = '2023-01-01'::date)
-> Bitmap Index Scan on brin_date_test_a_idx (actual rows=0.00 loops=1)
Index Cond: (a = '2023-01-01'::date)
-(4 rows)
+ Index Searches: 1
+(5 rows)
DROP TABLE brin_date_test;
RESET enable_seqscan;
@@ -872,7 +873,8 @@ SELECT * FROM brin_timestamp_test WHERE a = '2023-01-01'::timestamp;
Recheck Cond: (a = '2023-01-01 00:00:00'::timestamp without time zone)
-> Bitmap Index Scan on brin_timestamp_test_a_idx (actual rows=0.00 loops=1)
Index Cond: (a = '2023-01-01 00:00:00'::timestamp without time zone)
-(4 rows)
+ Index Searches: 1
+(5 rows)
EXPLAIN (ANALYZE, TIMING OFF, COSTS OFF, SUMMARY OFF, BUFFERS OFF)
SELECT * FROM brin_timestamp_test WHERE a = '1900-01-01'::timestamp;
@@ -882,7 +884,8 @@ SELECT * FROM brin_timestamp_test WHERE a = '1900-01-01'::timestamp;
Recheck Cond: (a = '1900-01-01 00:00:00'::timestamp without time zone)
-> Bitmap Index Scan on brin_timestamp_test_a_idx (actual rows=0.00 loops=1)
Index Cond: (a = '1900-01-01 00:00:00'::timestamp without time zone)
-(4 rows)
+ Index Searches: 1
+(5 rows)
DROP TABLE brin_timestamp_test;
RESET enable_seqscan;
@@ -900,7 +903,8 @@ SELECT * FROM brin_date_test WHERE a = '2023-01-01'::date;
Recheck Cond: (a = '2023-01-01'::date)
-> Bitmap Index Scan on brin_date_test_a_idx (actual rows=0.00 loops=1)
Index Cond: (a = '2023-01-01'::date)
-(4 rows)
+ Index Searches: 1
+(5 rows)
EXPLAIN (ANALYZE, TIMING OFF, COSTS OFF, SUMMARY OFF, BUFFERS OFF)
SELECT * FROM brin_date_test WHERE a = '1900-01-01'::date;
@@ -910,7 +914,8 @@ SELECT * FROM brin_date_test WHERE a = '1900-01-01'::date;
Recheck Cond: (a = '1900-01-01'::date)
-> Bitmap Index Scan on brin_date_test_a_idx (actual rows=0.00 loops=1)
Index Cond: (a = '1900-01-01'::date)
-(4 rows)
+ Index Searches: 1
+(5 rows)
DROP TABLE brin_date_test;
RESET enable_seqscan;
@@ -929,7 +934,8 @@ SELECT * FROM brin_interval_test WHERE a = '-30 years'::interval;
Recheck Cond: (a = '@ 30 years ago'::interval)
-> Bitmap Index Scan on brin_interval_test_a_idx (actual rows=0.00 loops=1)
Index Cond: (a = '@ 30 years ago'::interval)
-(4 rows)
+ Index Searches: 1
+(5 rows)
EXPLAIN (ANALYZE, TIMING OFF, COSTS OFF, SUMMARY OFF, BUFFERS OFF)
SELECT * FROM brin_interval_test WHERE a = '30 years'::interval;
@@ -939,7 +945,8 @@ SELECT * FROM brin_interval_test WHERE a = '30 years'::interval;
Recheck Cond: (a = '@ 30 years'::interval)
-> Bitmap Index Scan on brin_interval_test_a_idx (actual rows=0.00 loops=1)
Index Cond: (a = '@ 30 years'::interval)
-(4 rows)
+ Index Searches: 1
+(5 rows)
DROP TABLE brin_interval_test;
RESET enable_seqscan;
@@ -957,7 +964,8 @@ SELECT * FROM brin_interval_test WHERE a = '-30 years'::interval;
Recheck Cond: (a = '@ 30 years ago'::interval)
-> Bitmap Index Scan on brin_interval_test_a_idx (actual rows=0.00 loops=1)
Index Cond: (a = '@ 30 years ago'::interval)
-(4 rows)
+ Index Searches: 1
+(5 rows)
EXPLAIN (ANALYZE, TIMING OFF, COSTS OFF, SUMMARY OFF, BUFFERS OFF)
SELECT * FROM brin_interval_test WHERE a = '30 years'::interval;
@@ -967,7 +975,8 @@ SELECT * FROM brin_interval_test WHERE a = '30 years'::interval;
Recheck Cond: (a = '@ 30 years'::interval)
-> Bitmap Index Scan on brin_interval_test_a_idx (actual rows=0.00 loops=1)
Index Cond: (a = '@ 30 years'::interval)
-(4 rows)
+ Index Searches: 1
+(5 rows)
DROP TABLE brin_interval_test;
RESET enable_seqscan;
diff --git a/src/test/regress/expected/memoize.out b/src/test/regress/expected/memoize.out
index 22f2d3284..38dfaf021 100644
--- a/src/test/regress/expected/memoize.out
+++ b/src/test/regress/expected/memoize.out
@@ -22,8 +22,9 @@ begin
ln := regexp_replace(ln, 'Evictions: 0', 'Evictions: Zero');
ln := regexp_replace(ln, 'Evictions: \d+', 'Evictions: N');
ln := regexp_replace(ln, 'Memory Usage: \d+', 'Memory Usage: N');
- ln := regexp_replace(ln, 'Heap Fetches: \d+', 'Heap Fetches: N');
- ln := regexp_replace(ln, 'loops=\d+', 'loops=N');
+ ln := regexp_replace(ln, 'Heap Fetches: \d+', 'Heap Fetches: N');
+ ln := regexp_replace(ln, 'loops=\d+', 'loops=N');
+ ln := regexp_replace(ln, 'Index Searches: \d+', 'Index Searches: N');
return next ln;
end loop;
end;
@@ -49,7 +50,8 @@ WHERE t2.unique1 < 1000;', false);
-> Index Only Scan using tenk1_unique1 on tenk1 t1 (actual rows=1.00 loops=N)
Index Cond: (unique1 = t2.twenty)
Heap Fetches: N
-(12 rows)
+ Index Searches: N
+(13 rows)
-- And check we get the expected results.
SELECT COUNT(*),AVG(t1.unique1) FROM tenk1 t1
@@ -80,7 +82,8 @@ WHERE t1.unique1 < 1000;', false);
-> Index Only Scan using tenk1_unique1 on tenk1 t2 (actual rows=1.00 loops=N)
Index Cond: (unique1 = t1.twenty)
Heap Fetches: N
-(12 rows)
+ Index Searches: N
+(13 rows)
-- And check we get the expected results.
SELECT COUNT(*),AVG(t2.unique1) FROM tenk1 t1,
@@ -106,6 +109,7 @@ WHERE t1.unique1 < 10;', false);
-> Nested Loop Left Join (actual rows=20.00 loops=N)
-> Index Scan using tenk1_unique1 on tenk1 t1 (actual rows=10.00 loops=N)
Index Cond: (unique1 < 10)
+ Index Searches: N
-> Memoize (actual rows=2.00 loops=N)
Cache Key: t1.two
Cache Mode: binary
@@ -115,7 +119,8 @@ WHERE t1.unique1 < 10;', false);
Rows Removed by Filter: 2
-> Index Scan using tenk1_unique1 on tenk1 t2_1 (actual rows=4.00 loops=N)
Index Cond: (unique1 < 4)
-(13 rows)
+ Index Searches: N
+(15 rows)
-- And check we get the expected results.
SELECT COUNT(*),AVG(t2.t1two) FROM tenk1 t1 LEFT JOIN
@@ -149,7 +154,8 @@ WHERE s.c1 = s.c2 AND t1.unique1 < 1000;', false);
Filter: ((t1.two + 1) = unique1)
Rows Removed by Filter: 9999
Heap Fetches: N
-(13 rows)
+ Index Searches: N
+(14 rows)
-- And check we get the expected results.
SELECT COUNT(*), AVG(t1.twenty) FROM tenk1 t1 LEFT JOIN
@@ -219,7 +225,8 @@ ON t1.x = t2.t::numeric AND t1.t::numeric = t2.x;', false);
Index Cond: (x = (t1.t)::numeric)
Filter: (t1.x = (t)::numeric)
Heap Fetches: N
-(10 rows)
+ Index Searches: N
+(11 rows)
DROP TABLE expr_key;
-- Reduce work_mem and hash_mem_multiplier so that we see some cache evictions
@@ -246,7 +253,8 @@ WHERE t2.unique1 < 1200;', true);
-> Index Only Scan using tenk1_unique1 on tenk1 t1 (actual rows=1.00 loops=N)
Index Cond: (unique1 = t2.thousand)
Heap Fetches: N
-(12 rows)
+ Index Searches: N
+(13 rows)
CREATE TABLE flt (f float);
CREATE INDEX flt_f_idx ON flt (f);
@@ -261,6 +269,7 @@ SELECT * FROM flt f1 INNER JOIN flt f2 ON f1.f = f2.f;', false);
Nested Loop (actual rows=4.00 loops=N)
-> Index Only Scan using flt_f_idx on flt f1 (actual rows=2.00 loops=N)
Heap Fetches: N
+ Index Searches: N
-> Memoize (actual rows=2.00 loops=N)
Cache Key: f1.f
Cache Mode: logical
@@ -268,7 +277,8 @@ SELECT * FROM flt f1 INNER JOIN flt f2 ON f1.f = f2.f;', false);
-> Index Only Scan using flt_f_idx on flt f2 (actual rows=2.00 loops=N)
Index Cond: (f = f1.f)
Heap Fetches: N
-(10 rows)
+ Index Searches: N
+(12 rows)
-- Ensure memoize operates in binary mode
SELECT explain_memoize('
@@ -278,6 +288,7 @@ SELECT * FROM flt f1 INNER JOIN flt f2 ON f1.f >= f2.f;', false);
Nested Loop (actual rows=4.00 loops=N)
-> Index Only Scan using flt_f_idx on flt f1 (actual rows=2.00 loops=N)
Heap Fetches: N
+ Index Searches: N
-> Memoize (actual rows=2.00 loops=N)
Cache Key: f1.f
Cache Mode: binary
@@ -285,7 +296,8 @@ SELECT * FROM flt f1 INNER JOIN flt f2 ON f1.f >= f2.f;', false);
-> Index Only Scan using flt_f_idx on flt f2 (actual rows=2.00 loops=N)
Index Cond: (f <= f1.f)
Heap Fetches: N
-(10 rows)
+ Index Searches: N
+(12 rows)
DROP TABLE flt;
-- Exercise Memoize in binary mode with a large fixed width type and a
@@ -311,7 +323,8 @@ SELECT * FROM strtest s1 INNER JOIN strtest s2 ON s1.n >= s2.n;', false);
Hits: 3 Misses: 3 Evictions: Zero Overflows: 0 Memory Usage: NkB
-> Index Scan using strtest_n_idx on strtest s2 (actual rows=4.00 loops=N)
Index Cond: (n <= s1.n)
-(9 rows)
+ Index Searches: N
+(10 rows)
-- Ensure we get 3 hits and 3 misses
SELECT explain_memoize('
@@ -327,7 +340,8 @@ SELECT * FROM strtest s1 INNER JOIN strtest s2 ON s1.t >= s2.t;', false);
Hits: 3 Misses: 3 Evictions: Zero Overflows: 0 Memory Usage: NkB
-> Index Scan using strtest_t_idx on strtest s2 (actual rows=4.00 loops=N)
Index Cond: (t <= s1.t)
-(9 rows)
+ Index Searches: N
+(10 rows)
DROP TABLE strtest;
-- Ensure memoize works with partitionwise join
@@ -348,6 +362,7 @@ SELECT * FROM prt t1 INNER JOIN prt t2 ON t1.a = t2.a;', false);
-> Nested Loop (actual rows=16.00 loops=N)
-> Index Only Scan using iprt_p1_a on prt_p1 t1_1 (actual rows=4.00 loops=N)
Heap Fetches: N
+ Index Searches: N
-> Memoize (actual rows=4.00 loops=N)
Cache Key: t1_1.a
Cache Mode: logical
@@ -355,9 +370,11 @@ SELECT * FROM prt t1 INNER JOIN prt t2 ON t1.a = t2.a;', false);
-> Index Only Scan using iprt_p1_a on prt_p1 t2_1 (actual rows=4.00 loops=N)
Index Cond: (a = t1_1.a)
Heap Fetches: N
+ Index Searches: N
-> Nested Loop (actual rows=16.00 loops=N)
-> Index Only Scan using iprt_p2_a on prt_p2 t1_2 (actual rows=4.00 loops=N)
Heap Fetches: N
+ Index Searches: N
-> Memoize (actual rows=4.00 loops=N)
Cache Key: t1_2.a
Cache Mode: logical
@@ -365,7 +382,8 @@ SELECT * FROM prt t1 INNER JOIN prt t2 ON t1.a = t2.a;', false);
-> Index Only Scan using iprt_p2_a on prt_p2 t2_2 (actual rows=4.00 loops=N)
Index Cond: (a = t1_2.a)
Heap Fetches: N
-(21 rows)
+ Index Searches: N
+(25 rows)
-- Ensure memoize works with parameterized union-all Append path
SET enable_partitionwise_join TO off;
@@ -378,6 +396,7 @@ ON t1.a = t2.a;', false);
Nested Loop (actual rows=16.00 loops=N)
-> Index Only Scan using iprt_p1_a on prt_p1 t1 (actual rows=4.00 loops=N)
Heap Fetches: N
+ Index Searches: N
-> Memoize (actual rows=4.00 loops=N)
Cache Key: t1.a
Cache Mode: logical
@@ -386,10 +405,12 @@ ON t1.a = t2.a;', false);
-> Index Only Scan using iprt_p1_a on prt_p1 (actual rows=4.00 loops=N)
Index Cond: (a = t1.a)
Heap Fetches: N
+ Index Searches: N
-> Index Only Scan using iprt_p2_a on prt_p2 (actual rows=0.00 loops=N)
Index Cond: (a = t1.a)
Heap Fetches: N
-(14 rows)
+ Index Searches: N
+(17 rows)
DROP TABLE prt;
RESET enable_partitionwise_join;
diff --git a/src/test/regress/expected/partition_prune.out b/src/test/regress/expected/partition_prune.out
index d95d2395d..34f2b0b8d 100644
--- a/src/test/regress/expected/partition_prune.out
+++ b/src/test/regress/expected/partition_prune.out
@@ -2369,6 +2369,10 @@ begin
ln := regexp_replace(ln, 'Workers Launched: \d+', 'Workers Launched: N');
ln := regexp_replace(ln, 'actual rows=\d+(?:\.\d+)? loops=\d+', 'actual rows=N loops=N');
ln := regexp_replace(ln, 'Rows Removed by Filter: \d+', 'Rows Removed by Filter: N');
+ perform regexp_matches(ln, 'Index Searches: \d+');
+ if found then
+ continue;
+ end if;
return next ln;
end loop;
end;
@@ -2686,47 +2690,56 @@ select * from ab where a = (select max(a) from lprt_a) and b = (select max(a)-1
Filter: (b = (InitPlan 2).col1)
-> Bitmap Index Scan on ab_a1_b1_a_idx (never executed)
Index Cond: (a = (InitPlan 1).col1)
+ Index Searches: 0
-> Bitmap Heap Scan on ab_a1_b2 ab_2 (never executed)
Recheck Cond: (a = (InitPlan 1).col1)
Filter: (b = (InitPlan 2).col1)
-> Bitmap Index Scan on ab_a1_b2_a_idx (never executed)
Index Cond: (a = (InitPlan 1).col1)
+ Index Searches: 0
-> Bitmap Heap Scan on ab_a1_b3 ab_3 (never executed)
Recheck Cond: (a = (InitPlan 1).col1)
Filter: (b = (InitPlan 2).col1)
-> Bitmap Index Scan on ab_a1_b3_a_idx (never executed)
Index Cond: (a = (InitPlan 1).col1)
+ Index Searches: 0
-> Bitmap Heap Scan on ab_a2_b1 ab_4 (never executed)
Recheck Cond: (a = (InitPlan 1).col1)
Filter: (b = (InitPlan 2).col1)
-> Bitmap Index Scan on ab_a2_b1_a_idx (never executed)
Index Cond: (a = (InitPlan 1).col1)
+ Index Searches: 0
-> Bitmap Heap Scan on ab_a2_b2 ab_5 (never executed)
Recheck Cond: (a = (InitPlan 1).col1)
Filter: (b = (InitPlan 2).col1)
-> Bitmap Index Scan on ab_a2_b2_a_idx (never executed)
Index Cond: (a = (InitPlan 1).col1)
+ Index Searches: 0
-> Bitmap Heap Scan on ab_a2_b3 ab_6 (never executed)
Recheck Cond: (a = (InitPlan 1).col1)
Filter: (b = (InitPlan 2).col1)
-> Bitmap Index Scan on ab_a2_b3_a_idx (never executed)
Index Cond: (a = (InitPlan 1).col1)
+ Index Searches: 0
-> Bitmap Heap Scan on ab_a3_b1 ab_7 (never executed)
Recheck Cond: (a = (InitPlan 1).col1)
Filter: (b = (InitPlan 2).col1)
-> Bitmap Index Scan on ab_a3_b1_a_idx (never executed)
Index Cond: (a = (InitPlan 1).col1)
+ Index Searches: 0
-> Bitmap Heap Scan on ab_a3_b2 ab_8 (actual rows=0.00 loops=1)
Recheck Cond: (a = (InitPlan 1).col1)
Filter: (b = (InitPlan 2).col1)
-> Bitmap Index Scan on ab_a3_b2_a_idx (actual rows=0.00 loops=1)
Index Cond: (a = (InitPlan 1).col1)
+ Index Searches: 1
-> Bitmap Heap Scan on ab_a3_b3 ab_9 (never executed)
Recheck Cond: (a = (InitPlan 1).col1)
Filter: (b = (InitPlan 2).col1)
-> Bitmap Index Scan on ab_a3_b3_a_idx (never executed)
Index Cond: (a = (InitPlan 1).col1)
-(52 rows)
+ Index Searches: 0
+(61 rows)
-- Test run-time partition pruning with UNION ALL parents
explain (analyze, costs off, summary off, timing off, buffers off)
@@ -2742,16 +2755,19 @@ select * from (select * from ab where a = 1 union all select * from ab) ab where
Filter: (b = (InitPlan 1).col1)
-> Bitmap Index Scan on ab_a1_b1_a_idx (actual rows=0.00 loops=1)
Index Cond: (a = 1)
+ Index Searches: 1
-> Bitmap Heap Scan on ab_a1_b2 ab_12 (never executed)
Recheck Cond: (a = 1)
Filter: (b = (InitPlan 1).col1)
-> Bitmap Index Scan on ab_a1_b2_a_idx (never executed)
Index Cond: (a = 1)
+ Index Searches: 0
-> Bitmap Heap Scan on ab_a1_b3 ab_13 (never executed)
Recheck Cond: (a = 1)
Filter: (b = (InitPlan 1).col1)
-> Bitmap Index Scan on ab_a1_b3_a_idx (never executed)
Index Cond: (a = 1)
+ Index Searches: 0
-> Seq Scan on ab_a1_b1 ab_1 (actual rows=0.00 loops=1)
Filter: (b = (InitPlan 1).col1)
-> Seq Scan on ab_a1_b2 ab_2 (never executed)
@@ -2770,7 +2786,7 @@ select * from (select * from ab where a = 1 union all select * from ab) ab where
Filter: (b = (InitPlan 1).col1)
-> Seq Scan on ab_a3_b3 ab_9 (never executed)
Filter: (b = (InitPlan 1).col1)
-(37 rows)
+(40 rows)
-- A case containing a UNION ALL with a non-partitioned child.
explain (analyze, costs off, summary off, timing off, buffers off)
@@ -2786,16 +2802,19 @@ select * from (select * from ab where a = 1 union all (values(10,5)) union all s
Filter: (b = (InitPlan 1).col1)
-> Bitmap Index Scan on ab_a1_b1_a_idx (actual rows=0.00 loops=1)
Index Cond: (a = 1)
+ Index Searches: 1
-> Bitmap Heap Scan on ab_a1_b2 ab_12 (never executed)
Recheck Cond: (a = 1)
Filter: (b = (InitPlan 1).col1)
-> Bitmap Index Scan on ab_a1_b2_a_idx (never executed)
Index Cond: (a = 1)
+ Index Searches: 0
-> Bitmap Heap Scan on ab_a1_b3 ab_13 (never executed)
Recheck Cond: (a = 1)
Filter: (b = (InitPlan 1).col1)
-> Bitmap Index Scan on ab_a1_b3_a_idx (never executed)
Index Cond: (a = 1)
+ Index Searches: 0
-> Result (actual rows=0.00 loops=1)
One-Time Filter: (5 = (InitPlan 1).col1)
-> Seq Scan on ab_a1_b1 ab_1 (actual rows=0.00 loops=1)
@@ -2816,7 +2835,7 @@ select * from (select * from ab where a = 1 union all (values(10,5)) union all s
Filter: (b = (InitPlan 1).col1)
-> Seq Scan on ab_a3_b3 ab_9 (never executed)
Filter: (b = (InitPlan 1).col1)
-(39 rows)
+(42 rows)
-- Another UNION ALL test, but containing a mix of exec init and exec run-time pruning.
create table xy_1 (x int, y int);
@@ -2887,16 +2906,19 @@ update ab_a1 set b = 3 from ab where ab.a = 1 and ab.a = ab_a1.a;');
Recheck Cond: (a = 1)
-> Bitmap Index Scan on ab_a1_b1_a_idx (actual rows=0.00 loops=1)
Index Cond: (a = 1)
+ Index Searches: 1
-> Bitmap Heap Scan on ab_a1_b2 ab_a1_2 (actual rows=1.00 loops=1)
Recheck Cond: (a = 1)
Heap Blocks: exact=1
-> Bitmap Index Scan on ab_a1_b2_a_idx (actual rows=1.00 loops=1)
Index Cond: (a = 1)
+ Index Searches: 1
-> Bitmap Heap Scan on ab_a1_b3 ab_a1_3 (actual rows=0.00 loops=1)
Recheck Cond: (a = 1)
Heap Blocks: exact=1
-> Bitmap Index Scan on ab_a1_b3_a_idx (actual rows=1.00 loops=1)
Index Cond: (a = 1)
+ Index Searches: 1
-> Materialize (actual rows=1.00 loops=1)
Storage: Memory Maximum Storage: NkB
-> Append (actual rows=1.00 loops=1)
@@ -2904,17 +2926,20 @@ update ab_a1 set b = 3 from ab where ab.a = 1 and ab.a = ab_a1.a;');
Recheck Cond: (a = 1)
-> Bitmap Index Scan on ab_a1_b1_a_idx (actual rows=0.00 loops=1)
Index Cond: (a = 1)
+ Index Searches: 1
-> Bitmap Heap Scan on ab_a1_b2 ab_2 (actual rows=1.00 loops=1)
Recheck Cond: (a = 1)
Heap Blocks: exact=1
-> Bitmap Index Scan on ab_a1_b2_a_idx (actual rows=1.00 loops=1)
Index Cond: (a = 1)
+ Index Searches: 1
-> Bitmap Heap Scan on ab_a1_b3 ab_3 (actual rows=0.00 loops=1)
Recheck Cond: (a = 1)
Heap Blocks: exact=1
-> Bitmap Index Scan on ab_a1_b3_a_idx (actual rows=1.00 loops=1)
Index Cond: (a = 1)
-(37 rows)
+ Index Searches: 1
+(43 rows)
table ab;
a | b
@@ -2990,17 +3015,23 @@ select * from tbl1 join tprt on tbl1.col1 > tprt.col1;
-> Append (actual rows=3.00 loops=2)
-> Index Scan using tprt1_idx on tprt_1 (actual rows=2.00 loops=2)
Index Cond: (col1 < tbl1.col1)
+ Index Searches: 2
-> Index Scan using tprt2_idx on tprt_2 (actual rows=2.00 loops=1)
Index Cond: (col1 < tbl1.col1)
+ Index Searches: 1
-> Index Scan using tprt3_idx on tprt_3 (never executed)
Index Cond: (col1 < tbl1.col1)
+ Index Searches: 0
-> Index Scan using tprt4_idx on tprt_4 (never executed)
Index Cond: (col1 < tbl1.col1)
+ Index Searches: 0
-> Index Scan using tprt5_idx on tprt_5 (never executed)
Index Cond: (col1 < tbl1.col1)
+ Index Searches: 0
-> Index Scan using tprt6_idx on tprt_6 (never executed)
Index Cond: (col1 < tbl1.col1)
-(15 rows)
+ Index Searches: 0
+(21 rows)
explain (analyze, costs off, summary off, timing off, buffers off)
select * from tbl1 join tprt on tbl1.col1 = tprt.col1;
@@ -3011,17 +3042,23 @@ select * from tbl1 join tprt on tbl1.col1 = tprt.col1;
-> Append (actual rows=1.00 loops=2)
-> Index Scan using tprt1_idx on tprt_1 (never executed)
Index Cond: (col1 = tbl1.col1)
+ Index Searches: 0
-> Index Scan using tprt2_idx on tprt_2 (actual rows=1.00 loops=2)
Index Cond: (col1 = tbl1.col1)
+ Index Searches: 2
-> Index Scan using tprt3_idx on tprt_3 (never executed)
Index Cond: (col1 = tbl1.col1)
+ Index Searches: 0
-> Index Scan using tprt4_idx on tprt_4 (never executed)
Index Cond: (col1 = tbl1.col1)
+ Index Searches: 0
-> Index Scan using tprt5_idx on tprt_5 (never executed)
Index Cond: (col1 = tbl1.col1)
+ Index Searches: 0
-> Index Scan using tprt6_idx on tprt_6 (never executed)
Index Cond: (col1 = tbl1.col1)
-(15 rows)
+ Index Searches: 0
+(21 rows)
select tbl1.col1, tprt.col1 from tbl1
inner join tprt on tbl1.col1 > tprt.col1
@@ -3056,17 +3093,23 @@ select * from tbl1 inner join tprt on tbl1.col1 > tprt.col1;
-> Append (actual rows=4.60 loops=5)
-> Index Scan using tprt1_idx on tprt_1 (actual rows=2.00 loops=5)
Index Cond: (col1 < tbl1.col1)
+ Index Searches: 5
-> Index Scan using tprt2_idx on tprt_2 (actual rows=2.75 loops=4)
Index Cond: (col1 < tbl1.col1)
+ Index Searches: 4
-> Index Scan using tprt3_idx on tprt_3 (actual rows=1.00 loops=2)
Index Cond: (col1 < tbl1.col1)
+ Index Searches: 2
-> Index Scan using tprt4_idx on tprt_4 (never executed)
Index Cond: (col1 < tbl1.col1)
+ Index Searches: 0
-> Index Scan using tprt5_idx on tprt_5 (never executed)
Index Cond: (col1 < tbl1.col1)
+ Index Searches: 0
-> Index Scan using tprt6_idx on tprt_6 (never executed)
Index Cond: (col1 < tbl1.col1)
-(15 rows)
+ Index Searches: 0
+(21 rows)
explain (analyze, costs off, summary off, timing off, buffers off)
select * from tbl1 inner join tprt on tbl1.col1 = tprt.col1;
@@ -3077,17 +3120,23 @@ select * from tbl1 inner join tprt on tbl1.col1 = tprt.col1;
-> Append (actual rows=0.60 loops=5)
-> Index Scan using tprt1_idx on tprt_1 (never executed)
Index Cond: (col1 = tbl1.col1)
+ Index Searches: 0
-> Index Scan using tprt2_idx on tprt_2 (actual rows=1.00 loops=2)
Index Cond: (col1 = tbl1.col1)
+ Index Searches: 2
-> Index Scan using tprt3_idx on tprt_3 (actual rows=0.33 loops=3)
Index Cond: (col1 = tbl1.col1)
+ Index Searches: 3
-> Index Scan using tprt4_idx on tprt_4 (never executed)
Index Cond: (col1 = tbl1.col1)
+ Index Searches: 0
-> Index Scan using tprt5_idx on tprt_5 (never executed)
Index Cond: (col1 = tbl1.col1)
+ Index Searches: 0
-> Index Scan using tprt6_idx on tprt_6 (never executed)
Index Cond: (col1 = tbl1.col1)
-(15 rows)
+ Index Searches: 0
+(21 rows)
select tbl1.col1, tprt.col1 from tbl1
inner join tprt on tbl1.col1 > tprt.col1
@@ -3141,17 +3190,23 @@ select * from tbl1 join tprt on tbl1.col1 < tprt.col1;
-> Append (actual rows=1.00 loops=1)
-> Index Scan using tprt1_idx on tprt_1 (never executed)
Index Cond: (col1 > tbl1.col1)
+ Index Searches: 0
-> Index Scan using tprt2_idx on tprt_2 (never executed)
Index Cond: (col1 > tbl1.col1)
+ Index Searches: 0
-> Index Scan using tprt3_idx on tprt_3 (never executed)
Index Cond: (col1 > tbl1.col1)
+ Index Searches: 0
-> Index Scan using tprt4_idx on tprt_4 (never executed)
Index Cond: (col1 > tbl1.col1)
+ Index Searches: 0
-> Index Scan using tprt5_idx on tprt_5 (never executed)
Index Cond: (col1 > tbl1.col1)
+ Index Searches: 0
-> Index Scan using tprt6_idx on tprt_6 (actual rows=1.00 loops=1)
Index Cond: (col1 > tbl1.col1)
-(15 rows)
+ Index Searches: 1
+(21 rows)
select tbl1.col1, tprt.col1 from tbl1
inner join tprt on tbl1.col1 < tprt.col1
@@ -3173,17 +3228,23 @@ select * from tbl1 join tprt on tbl1.col1 = tprt.col1;
-> Append (actual rows=0.00 loops=1)
-> Index Scan using tprt1_idx on tprt_1 (never executed)
Index Cond: (col1 = tbl1.col1)
+ Index Searches: 0
-> Index Scan using tprt2_idx on tprt_2 (never executed)
Index Cond: (col1 = tbl1.col1)
+ Index Searches: 0
-> Index Scan using tprt3_idx on tprt_3 (never executed)
Index Cond: (col1 = tbl1.col1)
+ Index Searches: 0
-> Index Scan using tprt4_idx on tprt_4 (never executed)
Index Cond: (col1 = tbl1.col1)
+ Index Searches: 0
-> Index Scan using tprt5_idx on tprt_5 (never executed)
Index Cond: (col1 = tbl1.col1)
+ Index Searches: 0
-> Index Scan using tprt6_idx on tprt_6 (never executed)
Index Cond: (col1 = tbl1.col1)
-(15 rows)
+ Index Searches: 0
+(21 rows)
select tbl1.col1, tprt.col1 from tbl1
inner join tprt on tbl1.col1 = tprt.col1
@@ -3513,10 +3574,12 @@ explain (analyze, costs off, summary off, timing off, buffers off) execute mt_q1
-> Index Scan using ma_test_p2_b_idx on ma_test_p2 ma_test_1 (actual rows=1.00 loops=1)
Filter: ((a >= $1) AND ((a % 10) = 5))
Rows Removed by Filter: 9
+ Index Searches: 1
-> Index Scan using ma_test_p3_b_idx on ma_test_p3 ma_test_2 (actual rows=1.00 loops=1)
Filter: ((a >= $1) AND ((a % 10) = 5))
Rows Removed by Filter: 9
-(9 rows)
+ Index Searches: 1
+(11 rows)
execute mt_q1(15);
a
@@ -3534,7 +3597,8 @@ explain (analyze, costs off, summary off, timing off, buffers off) execute mt_q1
-> Index Scan using ma_test_p3_b_idx on ma_test_p3 ma_test_1 (actual rows=1.00 loops=1)
Filter: ((a >= $1) AND ((a % 10) = 5))
Rows Removed by Filter: 9
-(6 rows)
+ Index Searches: 1
+(7 rows)
execute mt_q1(25);
a
@@ -3582,13 +3646,17 @@ explain (analyze, costs off, summary off, timing off, buffers off) select * from
-> Limit (actual rows=1.00 loops=1)
-> Index Scan using ma_test_p2_b_idx on ma_test_p2 (actual rows=1.00 loops=1)
Index Cond: (b IS NOT NULL)
+ Index Searches: 1
-> Index Scan using ma_test_p1_b_idx on ma_test_p1 ma_test_1 (never executed)
Filter: (a >= (InitPlan 2).col1)
+ Index Searches: 0
-> Index Scan using ma_test_p2_b_idx on ma_test_p2 ma_test_2 (actual rows=10.00 loops=1)
Filter: (a >= (InitPlan 2).col1)
+ Index Searches: 1
-> Index Scan using ma_test_p3_b_idx on ma_test_p3 ma_test_3 (actual rows=10.00 loops=1)
Filter: (a >= (InitPlan 2).col1)
-(14 rows)
+ Index Searches: 1
+(18 rows)
reset enable_seqscan;
reset enable_sort;
@@ -4159,13 +4227,17 @@ select * from rangep where b IN((select 1),(select 2)) order by a;
Sort Key: rangep_2.a
-> Index Scan using rangep_0_to_100_1_a_idx on rangep_0_to_100_1 rangep_2 (actual rows=0.00 loops=1)
Filter: (b = ANY (ARRAY[(InitPlan 1).col1, (InitPlan 2).col1]))
+ Index Searches: 1
-> Index Scan using rangep_0_to_100_2_a_idx on rangep_0_to_100_2 rangep_3 (actual rows=0.00 loops=1)
Filter: (b = ANY (ARRAY[(InitPlan 1).col1, (InitPlan 2).col1]))
+ Index Searches: 1
-> Index Scan using rangep_0_to_100_3_a_idx on rangep_0_to_100_3 rangep_4 (never executed)
Filter: (b = ANY (ARRAY[(InitPlan 1).col1, (InitPlan 2).col1]))
+ Index Searches: 0
-> Index Scan using rangep_100_to_200_a_idx on rangep_100_to_200 rangep_5 (actual rows=0.00 loops=1)
Filter: (b = ANY (ARRAY[(InitPlan 1).col1, (InitPlan 2).col1]))
-(15 rows)
+ Index Searches: 1
+(19 rows)
reset enable_sort;
drop table rangep;
diff --git a/src/test/regress/expected/select.out b/src/test/regress/expected/select.out
index cd79abc35..bab0cc93f 100644
--- a/src/test/regress/expected/select.out
+++ b/src/test/regress/expected/select.out
@@ -764,7 +764,8 @@ select * from onek2 where unique2 = 11 and stringu1 = 'ATAAAA';
Index Scan using onek2_u2_prtl on onek2 (actual rows=1.00 loops=1)
Index Cond: (unique2 = 11)
Filter: (stringu1 = 'ATAAAA'::name)
-(3 rows)
+ Index Searches: 1
+(4 rows)
explain (costs off)
select unique2 from onek2 where unique2 = 11 and stringu1 = 'ATAAAA';
diff --git a/src/test/regress/sql/memoize.sql b/src/test/regress/sql/memoize.sql
index d5aab4e56..c0d47fa87 100644
--- a/src/test/regress/sql/memoize.sql
+++ b/src/test/regress/sql/memoize.sql
@@ -23,8 +23,9 @@ begin
ln := regexp_replace(ln, 'Evictions: 0', 'Evictions: Zero');
ln := regexp_replace(ln, 'Evictions: \d+', 'Evictions: N');
ln := regexp_replace(ln, 'Memory Usage: \d+', 'Memory Usage: N');
- ln := regexp_replace(ln, 'Heap Fetches: \d+', 'Heap Fetches: N');
- ln := regexp_replace(ln, 'loops=\d+', 'loops=N');
+ ln := regexp_replace(ln, 'Heap Fetches: \d+', 'Heap Fetches: N');
+ ln := regexp_replace(ln, 'loops=\d+', 'loops=N');
+ ln := regexp_replace(ln, 'Index Searches: \d+', 'Index Searches: N');
return next ln;
end loop;
end;
diff --git a/src/test/regress/sql/partition_prune.sql b/src/test/regress/sql/partition_prune.sql
index 5f36d589b..4a2c74b08 100644
--- a/src/test/regress/sql/partition_prune.sql
+++ b/src/test/regress/sql/partition_prune.sql
@@ -588,6 +588,10 @@ begin
ln := regexp_replace(ln, 'Workers Launched: \d+', 'Workers Launched: N');
ln := regexp_replace(ln, 'actual rows=\d+(?:\.\d+)? loops=\d+', 'actual rows=N loops=N');
ln := regexp_replace(ln, 'Rows Removed by Filter: \d+', 'Rows Removed by Filter: N');
+ perform regexp_matches(ln, 'Index Searches: \d+');
+ if found then
+ continue;
+ end if;
return next ln;
end loop;
end;
--
2.47.2
On Fri, Mar 7, 2025 at 12:18 PM Peter Geoghegan <pg@bowt.ie> wrote:
> What do you think of the attached WIP patch, which does things this
> way? Does this seem like the right general direction to you?
Attached is a more refined version of this patch, which is
substantially the same as the version I posted yesterday.
My current plan is to commit this on Tuesday or Wednesday, barring any
objections.
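For anyone skimming the attached patch: the per-AM side of the
instrumentation is mechanical. Each index AM bumps the new counter right
after its existing pgstat_count_index_scan() call, along these lines
(condensed from the nbtree/hash/GiST/GIN hunks in the patch):

    pgstat_count_index_scan(rel);
    /* also count this search for EXPLAIN ANALYZE, when instrumentation is on */
    if (scan->instrument)
        scan->instrument->nsearches++;

EXPLAIN ANALYZE then reports the leader's local counter plus the
per-worker copies that each worker stashes in shared memory when its
scan node shuts down.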
Thanks
--
Peter Geoghegan
Attachments:
v27-0001-Show-index-search-count-in-EXPLAIN-ANALYZE-take-.patch (application/octet-stream)
From c139ebfdd3703585ae46cdc5348a23ac528908bc Mon Sep 17 00:00:00 2001
From: Peter Geoghegan <pg@bowt.ie>
Date: Wed, 5 Mar 2025 09:36:48 -0500
Subject: [PATCH v27 1/7] Show index search count in EXPLAIN ANALYZE, take 2.
Expose the count of index searches/index descents in EXPLAIN ANALYZE's
output for index scan/index-only scan/bitmap index scan nodes. This
information is particularly useful with scans that use ScalarArrayOp
quals, where the number of index scans isn't predictable (at least not
with optimizations like the one added by Postgres 17 commit 5bf748b8).
It will also be useful when EXPLAIN ANALYZE shows details of an nbtree
index scan that uses skip scan optimizations set to be introduced by an
upcoming patch.
The instrumentation works by teaching all index AMs to increment a new
nsearches counter whenever a new index search begins. The counter is
incremented at exactly the same point that index AMs already increment
the pg_stat_*_indexes.idx_scan counter (we're counting the same event,
but at the scan level rather than the relation level). Parallel index
scans have parallel workers copy the counter into shared memory, even
when parallel workers run an index scan node that isn't parallel aware.
(This addresses an oversight in an earlier committed version that was
immediately reverted in commit d00107cd).
Our approach doesn't match the approach used when tracking other index
scan specific costs (e.g., "Rows Removed by Filter:"). It is similar to
the approach used in other cases where we must track costs that are only
readily accessible inside an access method, and not from the executor
(e.g., "Heap Blocks:" output for a Bitmap Heap Scan). It is inherently
necessary to maintain a counter that can be incremented multiple times
during a single amgettuple call (or amgetbitmap call), which makes
passing down PlanState.instrument to amgettuple routines unappealing.
Index access methods work off of a dedicated instrumentation struct,
which could easily be expanded to track other kinds of execution costs.
Author: Peter Geoghegan <pg@bowt.ie>
Reviewed-By: Robert Haas <robertmhaas@gmail.com>
Reviewed-By: Tomas Vondra <tomas@vondra.me>
Reviewed-By: Masahiro Ikeda <ikedamsh@oss.nttdata.com>
Reviewed-By: Matthias van de Meent <boekewurm+postgres@gmail.com>
Discussion: https://postgr.es/m/CAH2-Wz=PKR6rB7qbx+Vnd7eqeB5VTcrW=iJvAsTsKbdG+kW_UA@mail.gmail.com
Discussion: https://postgr.es/m/CAH2-WzkRqvaqR2CTNqTZP0z6FuL4-3ED6eQB0yx38XBNj1v-4Q@mail.gmail.com
---
src/include/access/genam.h | 33 ++++-
src/include/access/relscan.h | 11 +-
src/include/executor/nodeBitmapIndexscan.h | 6 +
src/include/executor/nodeIndexonlyscan.h | 1 +
src/include/executor/nodeIndexscan.h | 1 +
src/include/nodes/execnodes.h | 16 ++
src/backend/access/brin/brin.c | 2 +
src/backend/access/gin/ginscan.c | 2 +
src/backend/access/gist/gistget.c | 4 +
src/backend/access/hash/hashsearch.c | 2 +
src/backend/access/heap/heapam_handler.c | 2 +-
src/backend/access/index/genam.c | 5 +-
src/backend/access/index/indexam.c | 62 +++++---
src/backend/access/nbtree/nbtree.c | 10 +-
src/backend/access/nbtree/nbtsearch.c | 2 +
src/backend/access/spgist/spgscan.c | 2 +
src/backend/commands/explain.c | 63 ++++++++
src/backend/executor/execIndexing.c | 2 +-
src/backend/executor/execParallel.c | 59 +++++---
src/backend/executor/execReplication.c | 2 +-
src/backend/executor/nodeBitmapIndexscan.c | 111 ++++++++++++++
src/backend/executor/nodeIndexonlyscan.c | 138 +++++++++++++-----
src/backend/executor/nodeIndexscan.c | 136 +++++++++++++----
src/backend/utils/adt/selfuncs.c | 2 +-
contrib/bloom/blscan.c | 2 +
doc/src/sgml/bloom.sgml | 7 +-
doc/src/sgml/monitoring.sgml | 28 +++-
doc/src/sgml/perform.sgml | 60 ++++++++
doc/src/sgml/ref/explain.sgml | 3 +-
doc/src/sgml/rules.sgml | 2 +
src/test/regress/expected/brin_multi.out | 27 ++--
src/test/regress/expected/memoize.out | 49 +++++--
src/test/regress/expected/partition_prune.out | 100 +++++++++++--
src/test/regress/expected/select.out | 3 +-
src/test/regress/sql/memoize.sql | 5 +-
src/test/regress/sql/partition_prune.sql | 4 +
src/tools/pgindent/typedefs.list | 2 +
37 files changed, 803 insertions(+), 163 deletions(-)
diff --git a/src/include/access/genam.h b/src/include/access/genam.h
index 1be873957..437b31457 100644
--- a/src/include/access/genam.h
+++ b/src/include/access/genam.h
@@ -85,6 +85,27 @@ typedef struct IndexBulkDeleteResult
BlockNumber pages_free; /* # pages available for reuse */
} IndexBulkDeleteResult;
+/*
+ * Data structure for reporting index scan statistics that are maintained by
+ * index scans. Note that IndexScanInstrumentation can't contain any pointers
+ * because it might need to be copied into a SharedIndexScanInstrumentation.
+ */
+typedef struct IndexScanInstrumentation
+{
+ /* Index search count (increment after calling pgstat_count_index_scan) */
+ uint64 nsearches;
+} IndexScanInstrumentation;
+
+/* ----------------
+ * Shared memory container for per-worker index scan information
+ * ----------------
+ */
+typedef struct SharedIndexScanInstrumentation
+{
+ int num_workers;
+ IndexScanInstrumentation winstrument[FLEXIBLE_ARRAY_MEMBER];
+} SharedIndexScanInstrumentation;
+
/* Typedef for callback function to determine if a tuple is bulk-deletable */
typedef bool (*IndexBulkDeleteCallback) (ItemPointer itemptr, void *state);
@@ -157,9 +178,11 @@ extern void index_insert_cleanup(Relation indexRelation,
extern IndexScanDesc index_beginscan(Relation heapRelation,
Relation indexRelation,
Snapshot snapshot,
+ IndexScanInstrumentation *instrument,
int nkeys, int norderbys);
extern IndexScanDesc index_beginscan_bitmap(Relation indexRelation,
Snapshot snapshot,
+ IndexScanInstrumentation *instrument,
int nkeys);
extern void index_rescan(IndexScanDesc scan,
ScanKey keys, int nkeys,
@@ -168,14 +191,18 @@ extern void index_endscan(IndexScanDesc scan);
extern void index_markpos(IndexScanDesc scan);
extern void index_restrpos(IndexScanDesc scan);
extern Size index_parallelscan_estimate(Relation indexRelation,
- int nkeys, int norderbys, Snapshot snapshot);
+ int nkeys, int norderbys, Snapshot snapshot,
+ bool instrument, int nworkers,
+ Size *instroffset);
extern void index_parallelscan_initialize(Relation heapRelation,
Relation indexRelation, Snapshot snapshot,
- ParallelIndexScanDesc target);
+ ParallelIndexScanDesc target,
+ Size ps_offset_ins);
extern void index_parallelrescan(IndexScanDesc scan);
extern IndexScanDesc index_beginscan_parallel(Relation heaprel,
Relation indexrel, int nkeys, int norderbys,
- ParallelIndexScanDesc pscan);
+ ParallelIndexScanDesc pscan,
+ IndexScanInstrumentation *instrument);
extern ItemPointer index_getnext_tid(IndexScanDesc scan,
ScanDirection direction);
struct TupleTableSlot;
diff --git a/src/include/access/relscan.h b/src/include/access/relscan.h
index dc6e01842..1b65d44db 100644
--- a/src/include/access/relscan.h
+++ b/src/include/access/relscan.h
@@ -123,6 +123,8 @@ typedef struct IndexFetchTableData
Relation rel;
} IndexFetchTableData;
+struct IndexScanInstrumentation;
+
/*
* We use the same IndexScanDescData structure for both amgettuple-based
* and amgetbitmap-based index scans. Some fields are only relevant in
@@ -150,6 +152,12 @@ typedef struct IndexScanDescData
/* index access method's private state */
void *opaque; /* access-method-specific info */
+ /*
+ * Instrumentation counters that are maintained by every index access
+ * method, for all scan types (except when instrument is set to NULL)
+ */
+ struct IndexScanInstrumentation *instrument;
+
/*
* In an index-only scan, a successful amgettuple call must fill either
* xs_itup (and xs_itupdesc) or xs_hitup (and xs_hitupdesc) to provide the
@@ -188,7 +196,8 @@ typedef struct ParallelIndexScanDescData
{
RelFileLocator ps_locator; /* physical table relation to scan */
RelFileLocator ps_indexlocator; /* physical index relation to scan */
- Size ps_offset; /* Offset in bytes of am specific structure */
+ Size ps_offset_am; /* Offset in bytes to am-specific structure */
+ Size ps_offset_ins; /* Offset to SharedIndexScanInstrumentation */
char ps_snapshot_data[FLEXIBLE_ARRAY_MEMBER];
} ParallelIndexScanDescData;
diff --git a/src/include/executor/nodeBitmapIndexscan.h b/src/include/executor/nodeBitmapIndexscan.h
index b51cb184e..b6a5ae25e 100644
--- a/src/include/executor/nodeBitmapIndexscan.h
+++ b/src/include/executor/nodeBitmapIndexscan.h
@@ -14,11 +14,17 @@
#ifndef NODEBITMAPINDEXSCAN_H
#define NODEBITMAPINDEXSCAN_H
+#include "access/parallel.h"
#include "nodes/execnodes.h"
extern BitmapIndexScanState *ExecInitBitmapIndexScan(BitmapIndexScan *node, EState *estate, int eflags);
extern Node *MultiExecBitmapIndexScan(BitmapIndexScanState *node);
extern void ExecEndBitmapIndexScan(BitmapIndexScanState *node);
extern void ExecReScanBitmapIndexScan(BitmapIndexScanState *node);
+extern void ExecBitmapIndexScanEstimate(BitmapIndexScanState *node, ParallelContext *pcxt);
+extern void ExecBitmapIndexScanInitializeDSM(BitmapIndexScanState *node, ParallelContext *pcxt);
+extern void ExecBitmapIndexScanInitializeWorker(BitmapIndexScanState *node,
+ ParallelWorkerContext *pwcxt);
+extern void ExecBitmapIndexScanRetrieveInstrumentation(BitmapIndexScanState *node);
#endif /* NODEBITMAPINDEXSCAN_H */
diff --git a/src/include/executor/nodeIndexonlyscan.h b/src/include/executor/nodeIndexonlyscan.h
index c27d8eb6d..ae85dee6d 100644
--- a/src/include/executor/nodeIndexonlyscan.h
+++ b/src/include/executor/nodeIndexonlyscan.h
@@ -32,5 +32,6 @@ extern void ExecIndexOnlyScanReInitializeDSM(IndexOnlyScanState *node,
ParallelContext *pcxt);
extern void ExecIndexOnlyScanInitializeWorker(IndexOnlyScanState *node,
ParallelWorkerContext *pwcxt);
+extern void ExecIndexOnlyScanRetrieveInstrumentation(IndexOnlyScanState *node);
#endif /* NODEINDEXONLYSCAN_H */
diff --git a/src/include/executor/nodeIndexscan.h b/src/include/executor/nodeIndexscan.h
index 1c63d0615..08f0a148d 100644
--- a/src/include/executor/nodeIndexscan.h
+++ b/src/include/executor/nodeIndexscan.h
@@ -28,6 +28,7 @@ extern void ExecIndexScanInitializeDSM(IndexScanState *node, ParallelContext *pc
extern void ExecIndexScanReInitializeDSM(IndexScanState *node, ParallelContext *pcxt);
extern void ExecIndexScanInitializeWorker(IndexScanState *node,
ParallelWorkerContext *pwcxt);
+extern void ExecIndexScanRetrieveInstrumentation(IndexScanState *node);
/*
* These routines are exported to share code with nodeIndexonlyscan.c and
diff --git a/src/include/nodes/execnodes.h b/src/include/nodes/execnodes.h
index a323fa98b..c0a983718 100644
--- a/src/include/nodes/execnodes.h
+++ b/src/include/nodes/execnodes.h
@@ -1680,6 +1680,8 @@ typedef struct
* RuntimeContext expr context for evaling runtime Skeys
* RelationDesc index relation descriptor
* ScanDesc index scan descriptor
+ * Instrument local index scan instrumentation
+ * SharedInfo statistics for parallel workers
*
* ReorderQueue tuples that need reordering due to re-check
* ReachedEnd have we fetched all tuples from index already?
@@ -1689,6 +1691,7 @@ typedef struct
* OrderByTypByVals is the datatype of order by expression pass-by-value?
* OrderByTypLens typlens of the datatypes of order by expressions
* PscanLen size of parallel index scan descriptor
+ * PscanInstrOffset offset to SharedInfo (only used in leader)
* ----------------
*/
typedef struct IndexScanState
@@ -1706,6 +1709,8 @@ typedef struct IndexScanState
ExprContext *iss_RuntimeContext;
Relation iss_RelationDesc;
struct IndexScanDescData *iss_ScanDesc;
+ IndexScanInstrumentation iss_Instrument;
+ SharedIndexScanInstrumentation *iss_SharedInfo;
/* These are needed for re-checking ORDER BY expr ordering */
pairingheap *iss_ReorderQueue;
@@ -1716,6 +1721,7 @@ typedef struct IndexScanState
bool *iss_OrderByTypByVals;
int16 *iss_OrderByTypLens;
Size iss_PscanLen;
+ Size iss_PscanInstrOffset;
} IndexScanState;
/* ----------------
@@ -1732,9 +1738,12 @@ typedef struct IndexScanState
* RuntimeContext expr context for evaling runtime Skeys
* RelationDesc index relation descriptor
* ScanDesc index scan descriptor
+ * Instrument local index scan instrumentation
+ * SharedInfo statistics for parallel workers
* TableSlot slot for holding tuples fetched from the table
* VMBuffer buffer in use for visibility map testing, if any
* PscanLen size of parallel index-only scan descriptor
+ * PscanInstrOffset offset to SharedInfo (only used in leader)
* NameCStringAttNums attnums of name typed columns to pad to NAMEDATALEN
* NameCStringCount number of elements in the NameCStringAttNums array
* ----------------
@@ -1753,9 +1762,12 @@ typedef struct IndexOnlyScanState
ExprContext *ioss_RuntimeContext;
Relation ioss_RelationDesc;
struct IndexScanDescData *ioss_ScanDesc;
+ IndexScanInstrumentation ioss_Instrument;
+ SharedIndexScanInstrumentation *ioss_SharedInfo;
TupleTableSlot *ioss_TableSlot;
Buffer ioss_VMBuffer;
Size ioss_PscanLen;
+ Size ioss_PscanInstrOffset;
AttrNumber *ioss_NameCStringAttNums;
int ioss_NameCStringCount;
} IndexOnlyScanState;
@@ -1774,6 +1786,8 @@ typedef struct IndexOnlyScanState
* RuntimeContext expr context for evaling runtime Skeys
* RelationDesc index relation descriptor
* ScanDesc index scan descriptor
+ * Instrument local index scan instrumentation
+ * SharedInfo statistics for parallel workers
* ----------------
*/
typedef struct BitmapIndexScanState
@@ -1790,6 +1804,8 @@ typedef struct BitmapIndexScanState
ExprContext *biss_RuntimeContext;
Relation biss_RelationDesc;
struct IndexScanDescData *biss_ScanDesc;
+ IndexScanInstrumentation biss_Instrument;
+ SharedIndexScanInstrumentation *biss_SharedInfo;
} BitmapIndexScanState;
/* ----------------
diff --git a/src/backend/access/brin/brin.c b/src/backend/access/brin/brin.c
index b01009c5d..737ad6388 100644
--- a/src/backend/access/brin/brin.c
+++ b/src/backend/access/brin/brin.c
@@ -592,6 +592,8 @@ bringetbitmap(IndexScanDesc scan, TIDBitmap *tbm)
opaque = (BrinOpaque *) scan->opaque;
bdesc = opaque->bo_bdesc;
pgstat_count_index_scan(idxRel);
+ if (scan->instrument)
+ scan->instrument->nsearches++;
/*
* We need to know the size of the table so that we know how long to
diff --git a/src/backend/access/gin/ginscan.c b/src/backend/access/gin/ginscan.c
index 84aa14594..f6cdd098a 100644
--- a/src/backend/access/gin/ginscan.c
+++ b/src/backend/access/gin/ginscan.c
@@ -442,6 +442,8 @@ ginNewScanKey(IndexScanDesc scan)
MemoryContextSwitchTo(oldCtx);
pgstat_count_index_scan(scan->indexRelation);
+ if (scan->instrument)
+ scan->instrument->nsearches++;
}
void
diff --git a/src/backend/access/gist/gistget.c b/src/backend/access/gist/gistget.c
index cc40e928e..387d99723 100644
--- a/src/backend/access/gist/gistget.c
+++ b/src/backend/access/gist/gistget.c
@@ -625,6 +625,8 @@ gistgettuple(IndexScanDesc scan, ScanDirection dir)
GISTSearchItem fakeItem;
pgstat_count_index_scan(scan->indexRelation);
+ if (scan->instrument)
+ scan->instrument->nsearches++;
so->firstCall = false;
so->curPageData = so->nPageData = 0;
@@ -750,6 +752,8 @@ gistgetbitmap(IndexScanDesc scan, TIDBitmap *tbm)
return 0;
pgstat_count_index_scan(scan->indexRelation);
+ if (scan->instrument)
+ scan->instrument->nsearches++;
/* Begin the scan by processing the root page */
so->curPageData = so->nPageData = 0;
diff --git a/src/backend/access/hash/hashsearch.c b/src/backend/access/hash/hashsearch.c
index a3a1fccf3..92c15a65b 100644
--- a/src/backend/access/hash/hashsearch.c
+++ b/src/backend/access/hash/hashsearch.c
@@ -298,6 +298,8 @@ _hash_first(IndexScanDesc scan, ScanDirection dir)
HashScanPosItem *currItem;
pgstat_count_index_scan(rel);
+ if (scan->instrument)
+ scan->instrument->nsearches++;
/*
* We do not support hash scans with no index qualification, because we
diff --git a/src/backend/access/heap/heapam_handler.c b/src/backend/access/heap/heapam_handler.c
index e78682c3c..d74f0fbc5 100644
--- a/src/backend/access/heap/heapam_handler.c
+++ b/src/backend/access/heap/heapam_handler.c
@@ -749,7 +749,7 @@ heapam_relation_copy_for_cluster(Relation OldHeap, Relation NewHeap,
tableScan = NULL;
heapScan = NULL;
- indexScan = index_beginscan(OldHeap, OldIndex, SnapshotAny, 0, 0);
+ indexScan = index_beginscan(OldHeap, OldIndex, SnapshotAny, NULL, 0, 0);
index_rescan(indexScan, NULL, 0, NULL, 0);
}
else
diff --git a/src/backend/access/index/genam.c b/src/backend/access/index/genam.c
index 07bae342e..886c05655 100644
--- a/src/backend/access/index/genam.c
+++ b/src/backend/access/index/genam.c
@@ -119,6 +119,7 @@ RelationGetIndexScan(Relation indexRelation, int nkeys, int norderbys)
scan->ignore_killed_tuples = !scan->xactStartedInRecovery;
scan->opaque = NULL;
+ scan->instrument = NULL;
scan->xs_itup = NULL;
scan->xs_itupdesc = NULL;
@@ -446,7 +447,7 @@ systable_beginscan(Relation heapRelation,
}
sysscan->iscan = index_beginscan(heapRelation, irel,
- snapshot, nkeys, 0);
+ snapshot, NULL, nkeys, 0);
index_rescan(sysscan->iscan, idxkey, nkeys, NULL, 0);
sysscan->scan = NULL;
@@ -711,7 +712,7 @@ systable_beginscan_ordered(Relation heapRelation,
}
sysscan->iscan = index_beginscan(heapRelation, indexRelation,
- snapshot, nkeys, 0);
+ snapshot, NULL, nkeys, 0);
index_rescan(sysscan->iscan, idxkey, nkeys, NULL, 0);
sysscan->scan = NULL;
diff --git a/src/backend/access/index/indexam.c b/src/backend/access/index/indexam.c
index 8b1f55543..073d58bd2 100644
--- a/src/backend/access/index/indexam.c
+++ b/src/backend/access/index/indexam.c
@@ -256,6 +256,7 @@ IndexScanDesc
index_beginscan(Relation heapRelation,
Relation indexRelation,
Snapshot snapshot,
+ IndexScanInstrumentation *instrument,
int nkeys, int norderbys)
{
IndexScanDesc scan;
@@ -270,6 +271,7 @@ index_beginscan(Relation heapRelation,
*/
scan->heapRelation = heapRelation;
scan->xs_snapshot = snapshot;
+ scan->instrument = instrument;
/* prepare to fetch index matches from table */
scan->xs_heapfetch = table_index_fetch_begin(heapRelation);
@@ -286,6 +288,7 @@ index_beginscan(Relation heapRelation,
IndexScanDesc
index_beginscan_bitmap(Relation indexRelation,
Snapshot snapshot,
+ IndexScanInstrumentation *instrument,
int nkeys)
{
IndexScanDesc scan;
@@ -299,6 +302,7 @@ index_beginscan_bitmap(Relation indexRelation,
* up by RelationGetIndexScan.
*/
scan->xs_snapshot = snapshot;
+ scan->instrument = instrument;
return scan;
}
@@ -448,20 +452,26 @@ index_restrpos(IndexScanDesc scan)
/*
* index_parallelscan_estimate - estimate shared memory for parallel scan
+ *
+ * Sets *instroffset to the offset into shared memory at which the caller
+ * should store the scan's SharedIndexScanInstrumentation state. This is set
+ * to 0 when no instrumentation is required/allocated.
*/
Size
index_parallelscan_estimate(Relation indexRelation, int nkeys, int norderbys,
- Snapshot snapshot)
+ Snapshot snapshot, bool instrument, int nworkers,
+ Size *instroffset)
{
- Size nbytes;
+ Size nscanbytes;
+ Size ninstrbytes;
Assert(snapshot != InvalidSnapshot);
RELATION_CHECKS;
- nbytes = offsetof(ParallelIndexScanDescData, ps_snapshot_data);
- nbytes = add_size(nbytes, EstimateSnapshotSpace(snapshot));
- nbytes = MAXALIGN(nbytes);
+ nscanbytes = offsetof(ParallelIndexScanDescData, ps_snapshot_data);
+ nscanbytes = add_size(nscanbytes, EstimateSnapshotSpace(snapshot));
+ nscanbytes = MAXALIGN(nscanbytes);
/*
* If amestimateparallelscan is not provided, assume there is no
@@ -469,11 +479,25 @@ index_parallelscan_estimate(Relation indexRelation, int nkeys, int norderbys,
* it's easy enough to cater to it here.)
*/
if (indexRelation->rd_indam->amestimateparallelscan != NULL)
- nbytes = add_size(nbytes,
- indexRelation->rd_indam->amestimateparallelscan(nkeys,
- norderbys));
+ nscanbytes = add_size(nscanbytes,
+ indexRelation->rd_indam->amestimateparallelscan(nkeys,
+ norderbys));
+ if (!instrument || nworkers == 0)
+ {
+ *instroffset = 0; /* i.e. no instrumentation */
+ return nscanbytes;
+ }
- return nbytes;
+ *instroffset = MAXALIGN(nscanbytes); /* set *instroffset to start of
+ * SharedIndexScanInstrumentation */
+
+ /* determine space required for instrumentation */
+ ninstrbytes = mul_size(nworkers, sizeof(IndexScanInstrumentation));
+ ninstrbytes = add_size(ninstrbytes,
+ offsetof(SharedIndexScanInstrumentation, winstrument));
+ ninstrbytes = MAXALIGN(ninstrbytes);
+
+ return add_size(nscanbytes, ninstrbytes);
}
/*
@@ -488,21 +512,22 @@ index_parallelscan_estimate(Relation indexRelation, int nkeys, int norderbys,
*/
void
index_parallelscan_initialize(Relation heapRelation, Relation indexRelation,
- Snapshot snapshot, ParallelIndexScanDesc target)
+ Snapshot snapshot, ParallelIndexScanDesc target,
+ Size ps_offset_ins)
{
- Size offset;
+ Size ps_offset_am;
Assert(snapshot != InvalidSnapshot);
RELATION_CHECKS;
- offset = add_size(offsetof(ParallelIndexScanDescData, ps_snapshot_data),
- EstimateSnapshotSpace(snapshot));
- offset = MAXALIGN(offset);
+ ps_offset_am = add_size(offsetof(ParallelIndexScanDescData, ps_snapshot_data),
+ EstimateSnapshotSpace(snapshot));
+ ps_offset_am = MAXALIGN(ps_offset_am);
target->ps_locator = heapRelation->rd_locator;
target->ps_indexlocator = indexRelation->rd_locator;
- target->ps_offset = offset;
+ target->ps_offset_am = ps_offset_am;
SerializeSnapshot(snapshot, target->ps_snapshot_data);
/* aminitparallelscan is optional; assume no-op if not provided by AM */
@@ -510,9 +535,10 @@ index_parallelscan_initialize(Relation heapRelation, Relation indexRelation,
{
void *amtarget;
- amtarget = OffsetToPointer(target, offset);
+ amtarget = OffsetToPointer(target, ps_offset_am);
indexRelation->rd_indam->aminitparallelscan(amtarget);
}
+ target->ps_offset_ins = ps_offset_ins;
}
/* ----------------
@@ -539,7 +565,8 @@ index_parallelrescan(IndexScanDesc scan)
*/
IndexScanDesc
index_beginscan_parallel(Relation heaprel, Relation indexrel, int nkeys,
- int norderbys, ParallelIndexScanDesc pscan)
+ int norderbys, ParallelIndexScanDesc pscan,
+ IndexScanInstrumentation *instrument)
{
Snapshot snapshot;
IndexScanDesc scan;
@@ -558,6 +585,7 @@ index_beginscan_parallel(Relation heaprel, Relation indexrel, int nkeys,
*/
scan->heapRelation = heaprel;
scan->xs_snapshot = snapshot;
+ scan->instrument = instrument;
/* prepare to fetch index matches from table */
scan->xs_heapfetch = table_index_fetch_begin(heaprel);
diff --git a/src/backend/access/nbtree/nbtree.c b/src/backend/access/nbtree/nbtree.c
index 25188a644..c0a8833e0 100644
--- a/src/backend/access/nbtree/nbtree.c
+++ b/src/backend/access/nbtree/nbtree.c
@@ -574,7 +574,7 @@ btparallelrescan(IndexScanDesc scan)
Assert(parallel_scan);
btscan = (BTParallelScanDesc) OffsetToPointer(parallel_scan,
- parallel_scan->ps_offset);
+ parallel_scan->ps_offset_am);
/*
* In theory, we don't need to acquire the LWLock here, because there
@@ -652,7 +652,7 @@ _bt_parallel_seize(IndexScanDesc scan, BlockNumber *next_scan_page,
}
btscan = (BTParallelScanDesc) OffsetToPointer(parallel_scan,
- parallel_scan->ps_offset);
+ parallel_scan->ps_offset_am);
while (1)
{
@@ -760,7 +760,7 @@ _bt_parallel_release(IndexScanDesc scan, BlockNumber next_scan_page,
Assert(BlockNumberIsValid(next_scan_page));
btscan = (BTParallelScanDesc) OffsetToPointer(parallel_scan,
- parallel_scan->ps_offset);
+ parallel_scan->ps_offset_am);
LWLockAcquire(&btscan->btps_lock, LW_EXCLUSIVE);
btscan->btps_nextScanPage = next_scan_page;
@@ -799,7 +799,7 @@ _bt_parallel_done(IndexScanDesc scan)
return;
btscan = (BTParallelScanDesc) OffsetToPointer(parallel_scan,
- parallel_scan->ps_offset);
+ parallel_scan->ps_offset_am);
/*
* Mark the parallel scan as done, unless some other process did so
@@ -837,7 +837,7 @@ _bt_parallel_primscan_schedule(IndexScanDesc scan, BlockNumber curr_page)
Assert(so->numArrayKeys);
btscan = (BTParallelScanDesc) OffsetToPointer(parallel_scan,
- parallel_scan->ps_offset);
+ parallel_scan->ps_offset_am);
LWLockAcquire(&btscan->btps_lock, LW_EXCLUSIVE);
if (btscan->btps_lastCurrPage == curr_page &&
diff --git a/src/backend/access/nbtree/nbtsearch.c b/src/backend/access/nbtree/nbtsearch.c
index 6b2f464aa..22b27d01d 100644
--- a/src/backend/access/nbtree/nbtsearch.c
+++ b/src/backend/access/nbtree/nbtsearch.c
@@ -950,6 +950,8 @@ _bt_first(IndexScanDesc scan, ScanDirection dir)
* _bt_search/_bt_endpoint below
*/
pgstat_count_index_scan(rel);
+ if (scan->instrument)
+ scan->instrument->nsearches++;
/*----------
* Examine the scan keys to discover where we need to start the scan.
diff --git a/src/backend/access/spgist/spgscan.c b/src/backend/access/spgist/spgscan.c
index 53f910e9d..25893050c 100644
--- a/src/backend/access/spgist/spgscan.c
+++ b/src/backend/access/spgist/spgscan.c
@@ -421,6 +421,8 @@ spgrescan(IndexScanDesc scan, ScanKey scankey, int nscankeys,
/* count an indexscan for stats */
pgstat_count_index_scan(scan->indexRelation);
+ if (scan->instrument)
+ scan->instrument->nsearches++;
}
void
diff --git a/src/backend/commands/explain.c b/src/backend/commands/explain.c
index d8a7232ce..4b06275f5 100644
--- a/src/backend/commands/explain.c
+++ b/src/backend/commands/explain.c
@@ -125,6 +125,7 @@ static void show_recursive_union_info(RecursiveUnionState *rstate,
static void show_memoize_info(MemoizeState *mstate, List *ancestors,
ExplainState *es);
static void show_hashagg_info(AggState *aggstate, ExplainState *es);
+static void show_indexsearches_info(PlanState *planstate, ExplainState *es);
static void show_tidbitmap_info(BitmapHeapScanState *planstate,
ExplainState *es);
static void show_instrumentation_count(const char *qlabel, int which,
@@ -2096,6 +2097,7 @@ ExplainNode(PlanState *planstate, List *ancestors,
if (plan->qual)
show_instrumentation_count("Rows Removed by Filter", 1,
planstate, es);
+ show_indexsearches_info(planstate, es);
break;
case T_IndexOnlyScan:
show_scan_qual(((IndexOnlyScan *) plan)->indexqual,
@@ -2112,10 +2114,12 @@ ExplainNode(PlanState *planstate, List *ancestors,
if (es->analyze)
ExplainPropertyFloat("Heap Fetches", NULL,
planstate->instrument->ntuples2, 0, es);
+ show_indexsearches_info(planstate, es);
break;
case T_BitmapIndexScan:
show_scan_qual(((BitmapIndexScan *) plan)->indexqualorig,
"Index Cond", planstate, ancestors, es);
+ show_indexsearches_info(planstate, es);
break;
case T_BitmapHeapScan:
show_scan_qual(((BitmapHeapScan *) plan)->bitmapqualorig,
@@ -3855,6 +3859,65 @@ show_hashagg_info(AggState *aggstate, ExplainState *es)
}
}
+/*
+ * Show the total number of index searches performed by a
+ * IndexScan/IndexOnlyScan/BitmapIndexScan node
+ */
+static void
+show_indexsearches_info(PlanState *planstate, ExplainState *es)
+{
+ Plan *plan = planstate->plan;
+ SharedIndexScanInstrumentation *SharedInfo = NULL;
+ uint64 nsearches = 0;
+
+ if (!es->analyze)
+ return;
+
+ /* Initialize counters with stats from the local process first */
+ switch (nodeTag(plan))
+ {
+ case T_IndexScan:
+ {
+ IndexScanState *indexstate = ((IndexScanState *) planstate);
+
+ nsearches = indexstate->iss_Instrument.nsearches;
+ SharedInfo = indexstate->iss_SharedInfo;
+ break;
+ }
+ case T_IndexOnlyScan:
+ {
+ IndexOnlyScanState *indexstate = ((IndexOnlyScanState *) planstate);
+
+ nsearches = indexstate->ioss_Instrument.nsearches;
+ SharedInfo = indexstate->ioss_SharedInfo;
+ break;
+ }
+ case T_BitmapIndexScan:
+ {
+ BitmapIndexScanState *indexstate = ((BitmapIndexScanState *) planstate);
+
+ nsearches = indexstate->biss_Instrument.nsearches;
+ SharedInfo = indexstate->biss_SharedInfo;
+ break;
+ }
+ default:
+ break;
+ }
+
+ /* Next get the sum of the counters set within each and every process */
+ if (SharedInfo)
+ {
+ for (int i = 0; i < SharedInfo->num_workers; ++i)
+ {
+ IndexScanInstrumentation *winstrument = &SharedInfo->winstrument[i];
+
+ nsearches += winstrument->nsearches;
+ }
+ }
+
+ ExplainPropertyUInteger("Index Searches", NULL, nsearches, es);
+}
+
/*
* Show exact/lossy pages for a BitmapHeapScan node
*/
diff --git a/src/backend/executor/execIndexing.c b/src/backend/executor/execIndexing.c
index 742f3f8c0..e3fe9b78b 100644
--- a/src/backend/executor/execIndexing.c
+++ b/src/backend/executor/execIndexing.c
@@ -816,7 +816,7 @@ check_exclusion_or_unique_constraint(Relation heap, Relation index,
retry:
conflict = false;
found_self = false;
- index_scan = index_beginscan(heap, index, &DirtySnapshot, indnkeyatts, 0);
+ index_scan = index_beginscan(heap, index, &DirtySnapshot, NULL, indnkeyatts, 0);
index_rescan(index_scan, scankeys, indnkeyatts, NULL, 0);
while (index_getnext_slot(index_scan, ForwardScanDirection, existing_slot))
diff --git a/src/backend/executor/execParallel.c b/src/backend/executor/execParallel.c
index 1bedb8083..e9337a97d 100644
--- a/src/backend/executor/execParallel.c
+++ b/src/backend/executor/execParallel.c
@@ -28,6 +28,7 @@
#include "executor/nodeAgg.h"
#include "executor/nodeAppend.h"
#include "executor/nodeBitmapHeapscan.h"
+#include "executor/nodeBitmapIndexscan.h"
#include "executor/nodeCustom.h"
#include "executor/nodeForeignscan.h"
#include "executor/nodeHash.h"
@@ -244,14 +245,19 @@ ExecParallelEstimate(PlanState *planstate, ExecParallelEstimateContext *e)
e->pcxt);
break;
case T_IndexScanState:
- if (planstate->plan->parallel_aware)
- ExecIndexScanEstimate((IndexScanState *) planstate,
- e->pcxt);
+ /* even when not parallel-aware, for EXPLAIN ANALYZE */
+ ExecIndexScanEstimate((IndexScanState *) planstate,
+ e->pcxt);
break;
case T_IndexOnlyScanState:
- if (planstate->plan->parallel_aware)
- ExecIndexOnlyScanEstimate((IndexOnlyScanState *) planstate,
- e->pcxt);
+ /* even when not parallel-aware, for EXPLAIN ANALYZE */
+ ExecIndexOnlyScanEstimate((IndexOnlyScanState *) planstate,
+ e->pcxt);
+ break;
+ case T_BitmapIndexScanState:
+ /* even when not parallel-aware, for EXPLAIN ANALYZE */
+ ExecBitmapIndexScanEstimate((BitmapIndexScanState *) planstate,
+ e->pcxt);
break;
case T_ForeignScanState:
if (planstate->plan->parallel_aware)
@@ -468,14 +474,17 @@ ExecParallelInitializeDSM(PlanState *planstate,
d->pcxt);
break;
case T_IndexScanState:
- if (planstate->plan->parallel_aware)
- ExecIndexScanInitializeDSM((IndexScanState *) planstate,
- d->pcxt);
+ /* even when not parallel-aware, for EXPLAIN ANALYZE */
+ ExecIndexScanInitializeDSM((IndexScanState *) planstate, d->pcxt);
break;
case T_IndexOnlyScanState:
- if (planstate->plan->parallel_aware)
- ExecIndexOnlyScanInitializeDSM((IndexOnlyScanState *) planstate,
- d->pcxt);
+ /* even when not parallel-aware, for EXPLAIN ANALYZE */
+ ExecIndexOnlyScanInitializeDSM((IndexOnlyScanState *) planstate,
+ d->pcxt);
+ break;
+ case T_BitmapIndexScanState:
+ /* even when not parallel-aware, for EXPLAIN ANALYZE */
+ ExecBitmapIndexScanInitializeDSM((BitmapIndexScanState *) planstate, d->pcxt);
break;
case T_ForeignScanState:
if (planstate->plan->parallel_aware)
@@ -1002,6 +1011,7 @@ ExecParallelReInitializeDSM(PlanState *planstate,
ExecHashJoinReInitializeDSM((HashJoinState *) planstate,
pcxt);
break;
+ case T_BitmapIndexScanState:
case T_HashState:
case T_SortState:
case T_IncrementalSortState:
@@ -1063,6 +1073,15 @@ ExecParallelRetrieveInstrumentation(PlanState *planstate,
/* Perform any node-type-specific work that needs to be done. */
switch (nodeTag(planstate))
{
+ case T_IndexScanState:
+ ExecIndexScanRetrieveInstrumentation((IndexScanState *) planstate);
+ break;
+ case T_IndexOnlyScanState:
+ ExecIndexOnlyScanRetrieveInstrumentation((IndexOnlyScanState *) planstate);
+ break;
+ case T_BitmapIndexScanState:
+ ExecBitmapIndexScanRetrieveInstrumentation((BitmapIndexScanState *) planstate);
+ break;
case T_SortState:
ExecSortRetrieveInstrumentation((SortState *) planstate);
break;
@@ -1330,14 +1349,18 @@ ExecParallelInitializeWorker(PlanState *planstate, ParallelWorkerContext *pwcxt)
ExecSeqScanInitializeWorker((SeqScanState *) planstate, pwcxt);
break;
case T_IndexScanState:
- if (planstate->plan->parallel_aware)
- ExecIndexScanInitializeWorker((IndexScanState *) planstate,
- pwcxt);
+ /* even when not parallel-aware, for EXPLAIN ANALYZE */
+ ExecIndexScanInitializeWorker((IndexScanState *) planstate, pwcxt);
break;
case T_IndexOnlyScanState:
- if (planstate->plan->parallel_aware)
- ExecIndexOnlyScanInitializeWorker((IndexOnlyScanState *) planstate,
- pwcxt);
+ /* even when not parallel-aware, for EXPLAIN ANALYZE */
+ ExecIndexOnlyScanInitializeWorker((IndexOnlyScanState *) planstate,
+ pwcxt);
+ break;
+ case T_BitmapIndexScanState:
+ /* even when not parallel-aware, for EXPLAIN ANALYZE */
+ ExecBitmapIndexScanInitializeWorker((BitmapIndexScanState *) planstate,
+ pwcxt);
break;
case T_ForeignScanState:
if (planstate->plan->parallel_aware)
diff --git a/src/backend/executor/execReplication.c b/src/backend/executor/execReplication.c
index 5cef54f00..b52031b41 100644
--- a/src/backend/executor/execReplication.c
+++ b/src/backend/executor/execReplication.c
@@ -202,7 +202,7 @@ RelationFindReplTupleByIndex(Relation rel, Oid idxoid,
skey_attoff = build_replindex_scan_key(skey, rel, idxrel, searchslot);
/* Start an index scan. */
- scan = index_beginscan(rel, idxrel, &snap, skey_attoff, 0);
+ scan = index_beginscan(rel, idxrel, &snap, NULL, skey_attoff, 0);
retry:
found = false;
diff --git a/src/backend/executor/nodeBitmapIndexscan.c b/src/backend/executor/nodeBitmapIndexscan.c
index 0b32c3a02..e3e4dd36f 100644
--- a/src/backend/executor/nodeBitmapIndexscan.c
+++ b/src/backend/executor/nodeBitmapIndexscan.c
@@ -183,6 +183,21 @@ ExecEndBitmapIndexScan(BitmapIndexScanState *node)
indexRelationDesc = node->biss_RelationDesc;
indexScanDesc = node->biss_ScanDesc;
+ /*
+ * When ending a parallel worker, copy the statistics gathered by the
+ * worker back into shared memory so that it can be picked up by the main
+ * process to report in EXPLAIN ANALYZE
+ */
+ if (node->biss_SharedInfo != NULL && IsParallelWorker())
+ {
+ IndexScanInstrumentation *winstrument;
+
+ Assert(ParallelWorkerNumber <= node->biss_SharedInfo->num_workers);
+ winstrument = &node->biss_SharedInfo->winstrument[ParallelWorkerNumber];
+ memcpy(winstrument, &node->biss_Instrument,
+ sizeof(IndexScanInstrumentation));
+ }
+
/*
* close the index relation (no-op if we didn't open it)
*/
@@ -217,6 +232,7 @@ ExecInitBitmapIndexScan(BitmapIndexScan *node, EState *estate, int eflags)
/* normally we don't make the result bitmap till runtime */
indexstate->biss_result = NULL;
+ indexstate->biss_SharedInfo = NULL;
/*
* We do not open or lock the base relation here. We assume that an
@@ -302,6 +318,7 @@ ExecInitBitmapIndexScan(BitmapIndexScan *node, EState *estate, int eflags)
indexstate->biss_ScanDesc =
index_beginscan_bitmap(indexstate->biss_RelationDesc,
estate->es_snapshot,
+ &indexstate->biss_Instrument,
indexstate->biss_NumScanKeys);
/*
@@ -319,3 +336,97 @@ ExecInitBitmapIndexScan(BitmapIndexScan *node, EState *estate, int eflags)
*/
return indexstate;
}
+
+/* ----------------------------------------------------------------
+ * ExecBitmapIndexScanEstimate
+ *
+ * Compute the amount of space we'll need in the parallel
+ * query DSM, and inform pcxt->estimator about our needs.
+ * ----------------------------------------------------------------
+ */
+void
+ExecBitmapIndexScanEstimate(BitmapIndexScanState *node, ParallelContext *pcxt)
+{
+ Size size;
+
+ /*
+ * Parallel bitmap index scans are not supported, but we still need to
+ * store the scan's instrumentation in shared memory during parallel query
+ */
+ if (!node->ss.ps.instrument || pcxt->nworkers == 0)
+ return;
+
+ size = mul_size(pcxt->nworkers, sizeof(IndexScanInstrumentation));
+ size = add_size(size, offsetof(SharedIndexScanInstrumentation, winstrument));
+ shm_toc_estimate_chunk(&pcxt->estimator, size);
+ shm_toc_estimate_keys(&pcxt->estimator, 1);
+}
+
+/* ----------------------------------------------------------------
+ * ExecBitmapIndexScanInitializeDSM
+ *
+ * Set up parallel bitmap index scan shared instrumentation.
+ * ----------------------------------------------------------------
+ */
+void
+ExecBitmapIndexScanInitializeDSM(BitmapIndexScanState *node,
+ ParallelContext *pcxt)
+{
+ Size size;
+
+ /* Only here to set up SharedInfo instrumentation */
+ if (!node->ss.ps.instrument || pcxt->nworkers == 0)
+ return;
+
+ size = offsetof(SharedIndexScanInstrumentation, winstrument) +
+ pcxt->nworkers * sizeof(IndexScanInstrumentation);
+ node->biss_SharedInfo =
+ (SharedIndexScanInstrumentation *) shm_toc_allocate(pcxt->toc,
+ size);
+ shm_toc_insert(pcxt->toc, node->ss.ps.plan->plan_node_id,
+ node->biss_SharedInfo);
+
+ /* Each per-worker area must start out as zeroes */
+ memset(node->biss_SharedInfo, 0, size);
+ node->biss_SharedInfo->num_workers = pcxt->nworkers;
+}
+
+/* ----------------------------------------------------------------
+ * ExecBitmapIndexScanInitializeWorker
+ *
+ * Copy relevant information from TOC into planstate.
+ * ----------------------------------------------------------------
+ */
+void
+ExecBitmapIndexScanInitializeWorker(BitmapIndexScanState *node,
+ ParallelWorkerContext *pwcxt)
+{
+ /* Only here to set up SharedInfo instrumentation */
+ if (!node->ss.ps.instrument)
+ return;
+
+ node->biss_SharedInfo = (SharedIndexScanInstrumentation *)
+ shm_toc_lookup(pwcxt->toc, node->ss.ps.plan->plan_node_id, false);
+}
+
+/* ----------------------------------------------------------------
+ * ExecBitmapIndexScanRetrieveInstrumentation
+ *
+ * Transfer bitmap index scan statistics from DSM to private memory.
+ * ----------------------------------------------------------------
+ */
+void
+ExecBitmapIndexScanRetrieveInstrumentation(BitmapIndexScanState *node)
+{
+ SharedIndexScanInstrumentation *SharedInfo = node->biss_SharedInfo;
+ size_t size;
+
+ if (SharedInfo == NULL)
+ return;
+
+ /* Replace node->shared_info with a copy in backend-local memory */
+ size = offsetof(SharedIndexScanInstrumentation, winstrument) +
+ SharedInfo->num_workers * sizeof(IndexScanInstrumentation);
+ node->biss_SharedInfo = palloc(size);
+ memcpy(node->biss_SharedInfo, SharedInfo, size);
+}
diff --git a/src/backend/executor/nodeIndexonlyscan.c b/src/backend/executor/nodeIndexonlyscan.c
index e66352331..4f4229793 100644
--- a/src/backend/executor/nodeIndexonlyscan.c
+++ b/src/backend/executor/nodeIndexonlyscan.c
@@ -92,6 +92,7 @@ IndexOnlyNext(IndexOnlyScanState *node)
scandesc = index_beginscan(node->ss.ss_currentRelation,
node->ioss_RelationDesc,
estate->es_snapshot,
+ &node->ioss_Instrument,
node->ioss_NumScanKeys,
node->ioss_NumOrderByKeys);
@@ -413,6 +414,21 @@ ExecEndIndexOnlyScan(IndexOnlyScanState *node)
node->ioss_VMBuffer = InvalidBuffer;
}
+ /*
+ * When ending a parallel worker, copy the statistics gathered by the
+ * worker back into shared memory so that it can be picked up by the main
+ * process to report in EXPLAIN ANALYZE
+ */
+ if (node->ioss_SharedInfo != NULL && IsParallelWorker())
+ {
+ IndexScanInstrumentation *winstrument;
+
+ Assert(ParallelWorkerNumber <= node->ioss_SharedInfo->num_workers);
+ winstrument = &node->ioss_SharedInfo->winstrument[ParallelWorkerNumber];
+ memcpy(winstrument, &node->ioss_Instrument,
+ sizeof(IndexScanInstrumentation));
+ }
+
/*
* close the index relation (no-op if we didn't open it)
*/
@@ -593,6 +609,7 @@ ExecInitIndexOnlyScan(IndexOnlyScan *node, EState *estate, int eflags)
indexstate->ioss_RuntimeKeysReady = false;
indexstate->ioss_RuntimeKeys = NULL;
indexstate->ioss_NumRuntimeKeys = 0;
+ indexstate->ioss_SharedInfo = NULL;
/*
* build the index scan keys from the index qualification
@@ -711,7 +728,10 @@ ExecIndexOnlyScanEstimate(IndexOnlyScanState *node,
node->ioss_PscanLen = index_parallelscan_estimate(node->ioss_RelationDesc,
node->ioss_NumScanKeys,
node->ioss_NumOrderByKeys,
- estate->es_snapshot);
+ estate->es_snapshot,
+													   node->ss.ps.instrument != NULL,
+													   pcxt->nworkers,
+ &node->ioss_PscanInstrOffset);
shm_toc_estimate_chunk(&pcxt->estimator, node->ioss_PscanLen);
shm_toc_estimate_keys(&pcxt->estimator, 1);
}
@@ -727,31 +747,49 @@ ExecIndexOnlyScanInitializeDSM(IndexOnlyScanState *node,
ParallelContext *pcxt)
{
EState *estate = node->ss.ps.state;
+ Size size;
ParallelIndexScanDesc piscan;
piscan = shm_toc_allocate(pcxt->toc, node->ioss_PscanLen);
index_parallelscan_initialize(node->ss.ss_currentRelation,
node->ioss_RelationDesc,
estate->es_snapshot,
- piscan);
+ piscan, node->ioss_PscanInstrOffset);
shm_toc_insert(pcxt->toc, node->ss.ps.plan->plan_node_id, piscan);
- node->ioss_ScanDesc =
- index_beginscan_parallel(node->ss.ss_currentRelation,
- node->ioss_RelationDesc,
- node->ioss_NumScanKeys,
- node->ioss_NumOrderByKeys,
- piscan);
- node->ioss_ScanDesc->xs_want_itup = true;
- node->ioss_VMBuffer = InvalidBuffer;
+ if (node->ss.ps.plan->parallel_aware)
+ {
+ node->ioss_ScanDesc =
+ index_beginscan_parallel(node->ss.ss_currentRelation,
+ node->ioss_RelationDesc,
+ node->ioss_NumScanKeys,
+ node->ioss_NumOrderByKeys,
+ piscan,
+ &node->ioss_Instrument);
+ node->ioss_ScanDesc->xs_want_itup = true;
+ node->ioss_VMBuffer = InvalidBuffer;
- /*
- * If no run-time keys to calculate or they are ready, go ahead and pass
- * the scankeys to the index AM.
- */
- if (node->ioss_NumRuntimeKeys == 0 || node->ioss_RuntimeKeysReady)
- index_rescan(node->ioss_ScanDesc,
- node->ioss_ScanKeys, node->ioss_NumScanKeys,
- node->ioss_OrderByKeys, node->ioss_NumOrderByKeys);
+ /*
+ * If no run-time keys to calculate or they are ready, go ahead and
+ * pass the scankeys to the index AM.
+ */
+ if (node->ioss_NumRuntimeKeys == 0 || node->ioss_RuntimeKeysReady)
+ index_rescan(node->ioss_ScanDesc,
+ node->ioss_ScanKeys, node->ioss_NumScanKeys,
+ node->ioss_OrderByKeys, node->ioss_NumOrderByKeys);
+ }
+
+ /* Done if SharedInfo instrumentation space isn't required */
+ if (node->ioss_PscanInstrOffset == 0)
+ return;
+
+ size = offsetof(SharedIndexScanInstrumentation, winstrument) +
+ pcxt->nworkers * sizeof(IndexScanInstrumentation);
+ node->ioss_SharedInfo = (SharedIndexScanInstrumentation *)
+ OffsetToPointer(piscan, piscan->ps_offset_ins);
+
+ /* Each per-worker area must start out as zeroes */
+ memset(node->ioss_SharedInfo, 0, size);
+ node->ioss_SharedInfo->num_workers = pcxt->nworkers;
}
/* ----------------------------------------------------------------
@@ -764,6 +802,7 @@ void
ExecIndexOnlyScanReInitializeDSM(IndexOnlyScanState *node,
ParallelContext *pcxt)
{
+ Assert(node->ss.ps.plan->parallel_aware);
index_parallelrescan(node->ioss_ScanDesc);
}
@@ -780,20 +819,53 @@ ExecIndexOnlyScanInitializeWorker(IndexOnlyScanState *node,
ParallelIndexScanDesc piscan;
piscan = shm_toc_lookup(pwcxt->toc, node->ss.ps.plan->plan_node_id, false);
- node->ioss_ScanDesc =
- index_beginscan_parallel(node->ss.ss_currentRelation,
- node->ioss_RelationDesc,
- node->ioss_NumScanKeys,
- node->ioss_NumOrderByKeys,
- piscan);
- node->ioss_ScanDesc->xs_want_itup = true;
+ if (node->ss.ps.plan->parallel_aware)
+ {
+ node->ioss_ScanDesc =
+ index_beginscan_parallel(node->ss.ss_currentRelation,
+ node->ioss_RelationDesc,
+ node->ioss_NumScanKeys,
+ node->ioss_NumOrderByKeys,
+ piscan,
+ &node->ioss_Instrument);
+ node->ioss_ScanDesc->xs_want_itup = true;
- /*
- * If no run-time keys to calculate or they are ready, go ahead and pass
- * the scankeys to the index AM.
- */
- if (node->ioss_NumRuntimeKeys == 0 || node->ioss_RuntimeKeysReady)
- index_rescan(node->ioss_ScanDesc,
- node->ioss_ScanKeys, node->ioss_NumScanKeys,
- node->ioss_OrderByKeys, node->ioss_NumOrderByKeys);
+ /*
+ * If no run-time keys to calculate or they are ready, go ahead and
+ * pass the scankeys to the index AM.
+ */
+ if (node->ioss_NumRuntimeKeys == 0 || node->ioss_RuntimeKeysReady)
+ index_rescan(node->ioss_ScanDesc,
+ node->ioss_ScanKeys, node->ioss_NumScanKeys,
+ node->ioss_OrderByKeys, node->ioss_NumOrderByKeys);
+ }
+
+ /* Done if SharedInfo instrumentation space isn't required */
+ if (piscan->ps_offset_ins == 0)
+ return;
+
+ node->ioss_SharedInfo = (SharedIndexScanInstrumentation *)
+ OffsetToPointer(piscan, piscan->ps_offset_ins);
+}
+
+/* ----------------------------------------------------------------
+ * ExecIndexOnlyScanRetrieveInstrumentation
+ *
+ * Transfer index-only scan statistics from DSM to private memory.
+ * ----------------------------------------------------------------
+ */
+void
+ExecIndexOnlyScanRetrieveInstrumentation(IndexOnlyScanState *node)
+{
+ SharedIndexScanInstrumentation *SharedInfo = node->ioss_SharedInfo;
+ size_t size;
+
+ if (SharedInfo == NULL)
+ return;
+
+ /* Replace node->shared_info with a copy in backend-local memory */
+ size = offsetof(SharedIndexScanInstrumentation, winstrument) +
+ SharedInfo->num_workers * sizeof(IndexScanInstrumentation);
+ node->ioss_SharedInfo = palloc(size);
+ memcpy(node->ioss_SharedInfo, SharedInfo, size);
}
diff --git a/src/backend/executor/nodeIndexscan.c b/src/backend/executor/nodeIndexscan.c
index c30b9c2c1..fdc3304cf 100644
--- a/src/backend/executor/nodeIndexscan.c
+++ b/src/backend/executor/nodeIndexscan.c
@@ -109,6 +109,7 @@ IndexNext(IndexScanState *node)
scandesc = index_beginscan(node->ss.ss_currentRelation,
node->iss_RelationDesc,
estate->es_snapshot,
+ &node->iss_Instrument,
node->iss_NumScanKeys,
node->iss_NumOrderByKeys);
@@ -204,6 +205,7 @@ IndexNextWithReorder(IndexScanState *node)
scandesc = index_beginscan(node->ss.ss_currentRelation,
node->iss_RelationDesc,
estate->es_snapshot,
+ &node->iss_Instrument,
node->iss_NumScanKeys,
node->iss_NumOrderByKeys);
@@ -793,6 +795,21 @@ ExecEndIndexScan(IndexScanState *node)
indexRelationDesc = node->iss_RelationDesc;
indexScanDesc = node->iss_ScanDesc;
+ /*
+ * When ending a parallel worker, copy the statistics gathered by the
+ * worker back into shared memory so that it can be picked up by the main
+ * process to report in EXPLAIN ANALYZE
+ */
+ if (node->iss_SharedInfo != NULL && IsParallelWorker())
+ {
+ IndexScanInstrumentation *winstrument;
+
+ Assert(ParallelWorkerNumber <= node->iss_SharedInfo->num_workers);
+ winstrument = &node->iss_SharedInfo->winstrument[ParallelWorkerNumber];
+ memcpy(winstrument, &node->iss_Instrument,
+ sizeof(IndexScanInstrumentation));
+ }
+
/*
* close the index relation (no-op if we didn't open it)
*/
@@ -960,6 +977,7 @@ ExecInitIndexScan(IndexScan *node, EState *estate, int eflags)
indexstate->iss_RuntimeKeysReady = false;
indexstate->iss_RuntimeKeys = NULL;
indexstate->iss_NumRuntimeKeys = 0;
+ indexstate->iss_SharedInfo = NULL;
/*
* build the index scan keys from the index qualification
@@ -1646,7 +1664,10 @@ ExecIndexScanEstimate(IndexScanState *node,
node->iss_PscanLen = index_parallelscan_estimate(node->iss_RelationDesc,
node->iss_NumScanKeys,
node->iss_NumOrderByKeys,
- estate->es_snapshot);
+ estate->es_snapshot,
+													  node->ss.ps.instrument != NULL,
+													  pcxt->nworkers,
+ &node->iss_PscanInstrOffset);
shm_toc_estimate_chunk(&pcxt->estimator, node->iss_PscanLen);
shm_toc_estimate_keys(&pcxt->estimator, 1);
}
@@ -1662,29 +1683,48 @@ ExecIndexScanInitializeDSM(IndexScanState *node,
ParallelContext *pcxt)
{
EState *estate = node->ss.ps.state;
+ Size size;
ParallelIndexScanDesc piscan;
piscan = shm_toc_allocate(pcxt->toc, node->iss_PscanLen);
index_parallelscan_initialize(node->ss.ss_currentRelation,
node->iss_RelationDesc,
estate->es_snapshot,
- piscan);
- shm_toc_insert(pcxt->toc, node->ss.ps.plan->plan_node_id, piscan);
- node->iss_ScanDesc =
- index_beginscan_parallel(node->ss.ss_currentRelation,
- node->iss_RelationDesc,
- node->iss_NumScanKeys,
- node->iss_NumOrderByKeys,
- piscan);
+ piscan, node->iss_PscanInstrOffset);
- /*
- * If no run-time keys to calculate or they are ready, go ahead and pass
- * the scankeys to the index AM.
- */
- if (node->iss_NumRuntimeKeys == 0 || node->iss_RuntimeKeysReady)
- index_rescan(node->iss_ScanDesc,
- node->iss_ScanKeys, node->iss_NumScanKeys,
- node->iss_OrderByKeys, node->iss_NumOrderByKeys);
+ shm_toc_insert(pcxt->toc, node->ss.ps.plan->plan_node_id, piscan);
+ if (node->ss.ps.plan->parallel_aware)
+ {
+ node->iss_ScanDesc =
+ index_beginscan_parallel(node->ss.ss_currentRelation,
+ node->iss_RelationDesc,
+ node->iss_NumScanKeys,
+ node->iss_NumOrderByKeys,
+ piscan,
+ &node->iss_Instrument);
+
+ /*
+ * If no run-time keys to calculate or they are ready, go ahead and
+ * pass the scankeys to the index AM.
+ */
+ if (node->iss_NumRuntimeKeys == 0 || node->iss_RuntimeKeysReady)
+ index_rescan(node->iss_ScanDesc,
+ node->iss_ScanKeys, node->iss_NumScanKeys,
+ node->iss_OrderByKeys, node->iss_NumOrderByKeys);
+ }
+
+ /* Done if shared memory contains no instrumentation state */
+ if (node->iss_PscanInstrOffset == 0)
+ return;
+
+ size = offsetof(SharedIndexScanInstrumentation, winstrument) +
+ pcxt->nworkers * sizeof(IndexScanInstrumentation);
+ node->iss_SharedInfo = (SharedIndexScanInstrumentation *)
+ OffsetToPointer(piscan, piscan->ps_offset_ins);
+
+ /* Each per-worker area must start out as zeroes */
+ memset(node->iss_SharedInfo, 0, size);
+ node->iss_SharedInfo->num_workers = pcxt->nworkers;
}
/* ----------------------------------------------------------------
@@ -1697,6 +1737,7 @@ void
ExecIndexScanReInitializeDSM(IndexScanState *node,
ParallelContext *pcxt)
{
+ Assert(node->ss.ps.plan->parallel_aware);
index_parallelrescan(node->iss_ScanDesc);
}
@@ -1713,19 +1754,52 @@ ExecIndexScanInitializeWorker(IndexScanState *node,
ParallelIndexScanDesc piscan;
piscan = shm_toc_lookup(pwcxt->toc, node->ss.ps.plan->plan_node_id, false);
- node->iss_ScanDesc =
- index_beginscan_parallel(node->ss.ss_currentRelation,
- node->iss_RelationDesc,
- node->iss_NumScanKeys,
- node->iss_NumOrderByKeys,
- piscan);
+ if (node->ss.ps.plan->parallel_aware)
+ {
+ node->iss_ScanDesc =
+ index_beginscan_parallel(node->ss.ss_currentRelation,
+ node->iss_RelationDesc,
+ node->iss_NumScanKeys,
+ node->iss_NumOrderByKeys,
+ piscan,
+ &node->iss_Instrument);
- /*
- * If no run-time keys to calculate or they are ready, go ahead and pass
- * the scankeys to the index AM.
- */
- if (node->iss_NumRuntimeKeys == 0 || node->iss_RuntimeKeysReady)
- index_rescan(node->iss_ScanDesc,
- node->iss_ScanKeys, node->iss_NumScanKeys,
- node->iss_OrderByKeys, node->iss_NumOrderByKeys);
+ /*
+ * If no run-time keys to calculate or they are ready, go ahead and
+ * pass the scankeys to the index AM.
+ */
+ if (node->iss_NumRuntimeKeys == 0 || node->iss_RuntimeKeysReady)
+ index_rescan(node->iss_ScanDesc,
+ node->iss_ScanKeys, node->iss_NumScanKeys,
+ node->iss_OrderByKeys, node->iss_NumOrderByKeys);
+ }
+
+ /* Done if shared memory contains no instrumentation state */
+ if (piscan->ps_offset_ins == 0)
+ return;
+
+ node->iss_SharedInfo = (SharedIndexScanInstrumentation *)
+ OffsetToPointer(piscan, piscan->ps_offset_ins);
+}
+
+/* ----------------------------------------------------------------
+ * ExecIndexScanRetrieveInstrumentation
+ *
+ * Transfer index scan statistics from DSM to private memory.
+ * ----------------------------------------------------------------
+ */
+void
+ExecIndexScanRetrieveInstrumentation(IndexScanState *node)
+{
+ SharedIndexScanInstrumentation *SharedInfo = node->iss_SharedInfo;
+ size_t size;
+
+ if (SharedInfo == NULL)
+ return;
+
+ /* Replace node->shared_info with a copy in backend-local memory */
+ size = offsetof(SharedIndexScanInstrumentation, winstrument) +
+ SharedInfo->num_workers * sizeof(IndexScanInstrumentation);
+ node->iss_SharedInfo = palloc(size);
+ memcpy(node->iss_SharedInfo, SharedInfo, size);
}
diff --git a/src/backend/utils/adt/selfuncs.c b/src/backend/utils/adt/selfuncs.c
index c2918c9c8..b4dc91c7c 100644
--- a/src/backend/utils/adt/selfuncs.c
+++ b/src/backend/utils/adt/selfuncs.c
@@ -6376,7 +6376,7 @@ get_actual_variable_endpoint(Relation heapRel,
GlobalVisTestFor(heapRel));
index_scan = index_beginscan(heapRel, indexRel,
- &SnapshotNonVacuumable,
+ &SnapshotNonVacuumable, NULL,
1, 0);
/* Set it up for index-only scan */
index_scan->xs_want_itup = true;
diff --git a/contrib/bloom/blscan.c b/contrib/bloom/blscan.c
index bf801fe78..d072f47fe 100644
--- a/contrib/bloom/blscan.c
+++ b/contrib/bloom/blscan.c
@@ -116,6 +116,8 @@ blgetbitmap(IndexScanDesc scan, TIDBitmap *tbm)
bas = GetAccessStrategy(BAS_BULKREAD);
npages = RelationGetNumberOfBlocks(scan->indexRelation);
pgstat_count_index_scan(scan->indexRelation);
+ if (scan->instrument)
+ scan->instrument->nsearches++;
for (blkno = BLOOM_HEAD_BLKNO; blkno < npages; blkno++)
{
diff --git a/doc/src/sgml/bloom.sgml b/doc/src/sgml/bloom.sgml
index 663a0a4a6..ec5d07767 100644
--- a/doc/src/sgml/bloom.sgml
+++ b/doc/src/sgml/bloom.sgml
@@ -173,10 +173,11 @@ CREATE INDEX
Buffers: shared hit=21864
-> Bitmap Index Scan on bloomidx (cost=0.00..178436.00 rows=1 width=0) (actual time=20.005..20.005 rows=2300.00 loops=1)
Index Cond: ((i2 = 898732) AND (i5 = 123451))
+ Index Searches: 1
Buffers: shared hit=19608
Planning Time: 0.099 ms
Execution Time: 22.632 ms
-(10 rows)
+(11 rows)
</programlisting>
</para>
@@ -208,13 +209,15 @@ CREATE INDEX
Buffers: shared hit=6
-> Bitmap Index Scan on btreeidx5 (cost=0.00..4.52 rows=11 width=0) (actual time=0.026..0.026 rows=7.00 loops=1)
Index Cond: (i5 = 123451)
+ Index Searches: 1
Buffers: shared hit=3
-> Bitmap Index Scan on btreeidx2 (cost=0.00..4.52 rows=11 width=0) (actual time=0.007..0.007 rows=8.00 loops=1)
Index Cond: (i2 = 898732)
+ Index Searches: 1
Buffers: shared hit=3
Planning Time: 0.264 ms
Execution Time: 0.047 ms
-(13 rows)
+(15 rows)
</programlisting>
Although this query runs much faster than with either of the single
indexes, we pay a penalty in index size. Each of the single-column
diff --git a/doc/src/sgml/monitoring.sgml b/doc/src/sgml/monitoring.sgml
index 16646f560..fd9bdd884 100644
--- a/doc/src/sgml/monitoring.sgml
+++ b/doc/src/sgml/monitoring.sgml
@@ -4234,16 +4234,32 @@ description | Waiting for a newly initialized WAL file to reach durable storage
<note>
<para>
- Queries that use certain <acronym>SQL</acronym> constructs to search for
- rows matching any value out of a list or array of multiple scalar values
- (see <xref linkend="functions-comparisons"/>) perform multiple
- <quote>primitive</quote> index scans (up to one primitive scan per scalar
- value) during query execution. Each internal primitive index scan
- increments <structname>pg_stat_all_indexes</structname>.<structfield>idx_scan</structfield>,
+ Index scans may sometimes perform multiple index searches per execution.
+ Each index search increments <structname>pg_stat_all_indexes</structname>.<structfield>idx_scan</structfield>,
so it's possible for the count of index scans to significantly exceed the
total number of index scan executor node executions.
</para>
+ <para>
+ This can happen with queries that use certain <acronym>SQL</acronym>
+ constructs to search for rows matching any value out of a list or array of
+ multiple scalar values (see <xref linkend="functions-comparisons"/>). It
+ can also happen to queries with a
+ <literal><replaceable>column_name</replaceable> =
+ <replaceable>value1</replaceable> OR
+ <replaceable>column_name</replaceable> =
+ <replaceable>value2</replaceable> ...</literal> construct, though only
+ when the optimizer transforms the construct into an equivalent
+ multi-valued array representation.
+ </para>
</note>
+ <tip>
+ <para>
+ <command>EXPLAIN ANALYZE</command> outputs the total number of index
+ searches performed by each index scan node. See
+ <xref linkend="using-explain-analyze"/> for an example demonstrating how
+ this works.
+ </para>
+ </tip>
</sect2>
diff --git a/doc/src/sgml/perform.sgml b/doc/src/sgml/perform.sgml
index 91feb59ab..b4bb03253 100644
--- a/doc/src/sgml/perform.sgml
+++ b/doc/src/sgml/perform.sgml
@@ -729,9 +729,11 @@ WHERE t1.unique1 < 10 AND t1.unique2 = t2.unique2;
Buffers: shared hit=3 read=5 written=4
-> Bitmap Index Scan on tenk1_unique1 (cost=0.00..4.36 rows=10 width=0) (actual time=0.004..0.004 rows=10.00 loops=1)
Index Cond: (unique1 < 10)
+ Index Searches: 1
Buffers: shared hit=2
-> Index Scan using tenk2_unique2 on tenk2 t2 (cost=0.29..7.90 rows=1 width=244) (actual time=0.003..0.003 rows=1.00 loops=10)
Index Cond: (unique2 = t1.unique2)
+ Index Searches: 10
Buffers: shared hit=24 read=6
Planning:
Buffers: shared hit=15 dirtied=9
@@ -790,6 +792,7 @@ WHERE t1.unique1 < 100 AND t1.unique2 = t2.unique2 ORDER BY t1.fivethous;
Buffers: shared hit=92
-> Bitmap Index Scan on tenk1_unique1 (cost=0.00..5.04 rows=100 width=0) (actual time=0.013..0.013 rows=100.00 loops=1)
Index Cond: (unique1 < 100)
+ Index Searches: 1
Buffers: shared hit=2
Planning:
Buffers: shared hit=12
@@ -805,6 +808,58 @@ WHERE t1.unique1 < 100 AND t1.unique2 = t2.unique2 ORDER BY t1.fivethous;
shown.)
</para>
+ <para>
+ Index Scan nodes (as well as Bitmap Index Scan and Index-Only Scan nodes)
+ show an <quote>Index Searches</quote> line that reports the total number
+ of searches across <emphasis>all</emphasis> node
+ executions/<literal>loops</literal>:
+
+<screen>
+EXPLAIN ANALYZE SELECT * FROM tenk1 WHERE thousand IN (1, 500, 700, 999);
+ QUERY PLAN
+-------------------------------------------------------------------&zwsp;---------------------------------------------------------
+ Bitmap Heap Scan on tenk1 (cost=9.45..73.44 rows=40 width=244) (actual time=0.012..0.028 rows=40.00 loops=1)
+ Recheck Cond: (thousand = ANY ('{1,500,700,999}'::integer[]))
+ Heap Blocks: exact=39
+ Buffers: shared hit=47
+ -> Bitmap Index Scan on tenk1_thous_tenthous (cost=0.00..9.44 rows=40 width=0) (actual time=0.009..0.009 rows=40.00 loops=1)
+ Index Cond: (thousand = ANY ('{1,500,700,999}'::integer[]))
+ Index Searches: 4
+ Buffers: shared hit=8
+ Planning Time: 0.037 ms
+ Execution Time: 0.034 ms
+</screen>
+
+ Here we see a Bitmap Index Scan node that needed 4 separate index
+ searches. The scan had to search the index from the
+ <structname>tenk1_thous_tenthous</structname> index root page once per
+ <type>integer</type> value from the predicate's <literal>IN</literal>
+ construct. However, the number of index searches often won't have such a
+ simple correspondence to the query predicate:
+
+<screen>
+EXPLAIN ANALYZE SELECT * FROM tenk1 WHERE thousand IN (1, 2, 3, 4);
+ QUERY PLAN
+----------------------------------------------------------------------------------------------------------------------------------
+ Bitmap Heap Scan on tenk1 (cost=9.45..73.44 rows=40 width=244) (actual time=0.009..0.019 rows=40.00 loops=1)
+ Recheck Cond: (thousand = ANY ('{1,2,3,4}'::integer[]))
+ Heap Blocks: exact=38
+ Buffers: shared hit=40
+ -> Bitmap Index Scan on tenk1_thous_tenthous (cost=0.00..9.44 rows=40 width=0) (actual time=0.005..0.005 rows=40.00 loops=1)
+ Index Cond: (thousand = ANY ('{1,2,3,4}'::integer[]))
+ Index Searches: 1
+ Buffers: shared hit=2
+ Planning Time: 0.029 ms
+ Execution Time: 0.026 ms
+</screen>
+
+ This variant of our <literal>IN</literal> query performed only 1 index
+ search. It spent less time traversing the index (compared to the original
+ query) because its <literal>IN</literal> construct uses values matching
+ index tuples stored next to each other, on the same
+ <structname>tenk1_thous_tenthous</structname> index leaf page.
+ </para>
+
<para>
Another type of extra information is the number of rows removed by a
filter condition:
@@ -861,6 +916,7 @@ EXPLAIN ANALYZE SELECT * FROM polygon_tbl WHERE f1 @> polygon '(0.5,2.0)';
Index Scan using gpolygonind on polygon_tbl (cost=0.13..8.15 rows=1 width=85) (actual time=0.074..0.074 rows=0.00 loops=1)
Index Cond: (f1 @> '((0.5,2))'::polygon)
Rows Removed by Index Recheck: 1
+ Index Searches: 1
Buffers: shared hit=1
Planning Time: 0.039 ms
Execution Time: 0.098 ms
@@ -894,8 +950,10 @@ EXPLAIN (ANALYZE, BUFFERS OFF) SELECT * FROM tenk1 WHERE unique1 < 100 AND un
-> BitmapAnd (cost=25.07..25.07 rows=10 width=0) (actual time=0.100..0.101 rows=0.00 loops=1)
-> Bitmap Index Scan on tenk1_unique1 (cost=0.00..5.04 rows=100 width=0) (actual time=0.027..0.027 rows=100.00 loops=1)
Index Cond: (unique1 < 100)
+ Index Searches: 1
-> Bitmap Index Scan on tenk1_unique2 (cost=0.00..19.78 rows=999 width=0) (actual time=0.070..0.070 rows=999.00 loops=1)
Index Cond: (unique2 > 9000)
+ Index Searches: 1
Planning Time: 0.162 ms
Execution Time: 0.143 ms
</screen>
@@ -923,6 +981,7 @@ EXPLAIN ANALYZE UPDATE tenk1 SET hundred = hundred + 1 WHERE unique1 < 100;
Buffers: shared hit=4 read=2
-> Bitmap Index Scan on tenk1_unique1 (cost=0.00..5.04 rows=100 width=0) (actual time=0.031..0.031 rows=100.00 loops=1)
Index Cond: (unique1 < 100)
+ Index Searches: 1
Buffers: shared read=2
Planning Time: 0.151 ms
Execution Time: 1.856 ms
@@ -1061,6 +1120,7 @@ EXPLAIN ANALYZE SELECT * FROM tenk1 WHERE unique1 < 100 AND unique2 > 9000
Index Cond: (unique2 > 9000)
Filter: (unique1 < 100)
Rows Removed by Filter: 287
+ Index Searches: 1
Buffers: shared hit=16
Planning Time: 0.077 ms
Execution Time: 0.086 ms
diff --git a/doc/src/sgml/ref/explain.sgml b/doc/src/sgml/ref/explain.sgml
index 7daddf03e..9ed1061b7 100644
--- a/doc/src/sgml/ref/explain.sgml
+++ b/doc/src/sgml/ref/explain.sgml
@@ -506,10 +506,11 @@ EXPLAIN ANALYZE EXECUTE query(100, 200);
Buffers: shared hit=4
-> Index Scan using test_pkey on test (cost=0.29..10.27 rows=99 width=8) (actual time=0.009..0.025 rows=99.00 loops=1)
Index Cond: ((id > 100) AND (id < 200))
+ Index Searches: 1
Buffers: shared hit=4
Planning Time: 0.244 ms
Execution Time: 0.073 ms
-(9 rows)
+(10 rows)
</programlisting>
</para>
diff --git a/doc/src/sgml/rules.sgml b/doc/src/sgml/rules.sgml
index 1d9924a2a..8467d961f 100644
--- a/doc/src/sgml/rules.sgml
+++ b/doc/src/sgml/rules.sgml
@@ -1046,6 +1046,7 @@ SELECT count(*) FROM words WHERE word = 'caterpiler';
-> Index Only Scan using wrd_word on wrd (cost=0.42..4.44 rows=1 width=0) (actual time=0.039..0.039 rows=0.00 loops=1)
Index Cond: (word = 'caterpiler'::text)
Heap Fetches: 0
+ Index Searches: 1
Planning time: 0.164 ms
Execution time: 0.117 ms
</programlisting>
@@ -1090,6 +1091,7 @@ SELECT word FROM words ORDER BY word <-> 'caterpiler' LIMIT 10;
Limit (cost=0.29..1.06 rows=10 width=10) (actual time=187.222..188.257 rows=10.00 loops=1)
-> Index Scan using wrd_trgm on wrd (cost=0.29..37020.87 rows=479829 width=10) (actual time=187.219..188.252 rows=10.00 loops=1)
Order By: (word <-> 'caterpiler'::text)
+ Index Searches: 1
Planning time: 0.196 ms
Execution time: 198.640 ms
</programlisting>
diff --git a/src/test/regress/expected/brin_multi.out b/src/test/regress/expected/brin_multi.out
index 991b7eaca..cb5b5e53e 100644
--- a/src/test/regress/expected/brin_multi.out
+++ b/src/test/regress/expected/brin_multi.out
@@ -853,7 +853,8 @@ SELECT * FROM brin_date_test WHERE a = '2023-01-01'::date;
Recheck Cond: (a = '2023-01-01'::date)
-> Bitmap Index Scan on brin_date_test_a_idx (actual rows=0.00 loops=1)
Index Cond: (a = '2023-01-01'::date)
-(4 rows)
+ Index Searches: 1
+(5 rows)
DROP TABLE brin_date_test;
RESET enable_seqscan;
@@ -872,7 +873,8 @@ SELECT * FROM brin_timestamp_test WHERE a = '2023-01-01'::timestamp;
Recheck Cond: (a = '2023-01-01 00:00:00'::timestamp without time zone)
-> Bitmap Index Scan on brin_timestamp_test_a_idx (actual rows=0.00 loops=1)
Index Cond: (a = '2023-01-01 00:00:00'::timestamp without time zone)
-(4 rows)
+ Index Searches: 1
+(5 rows)
EXPLAIN (ANALYZE, TIMING OFF, COSTS OFF, SUMMARY OFF, BUFFERS OFF)
SELECT * FROM brin_timestamp_test WHERE a = '1900-01-01'::timestamp;
@@ -882,7 +884,8 @@ SELECT * FROM brin_timestamp_test WHERE a = '1900-01-01'::timestamp;
Recheck Cond: (a = '1900-01-01 00:00:00'::timestamp without time zone)
-> Bitmap Index Scan on brin_timestamp_test_a_idx (actual rows=0.00 loops=1)
Index Cond: (a = '1900-01-01 00:00:00'::timestamp without time zone)
-(4 rows)
+ Index Searches: 1
+(5 rows)
DROP TABLE brin_timestamp_test;
RESET enable_seqscan;
@@ -900,7 +903,8 @@ SELECT * FROM brin_date_test WHERE a = '2023-01-01'::date;
Recheck Cond: (a = '2023-01-01'::date)
-> Bitmap Index Scan on brin_date_test_a_idx (actual rows=0.00 loops=1)
Index Cond: (a = '2023-01-01'::date)
-(4 rows)
+ Index Searches: 1
+(5 rows)
EXPLAIN (ANALYZE, TIMING OFF, COSTS OFF, SUMMARY OFF, BUFFERS OFF)
SELECT * FROM brin_date_test WHERE a = '1900-01-01'::date;
@@ -910,7 +914,8 @@ SELECT * FROM brin_date_test WHERE a = '1900-01-01'::date;
Recheck Cond: (a = '1900-01-01'::date)
-> Bitmap Index Scan on brin_date_test_a_idx (actual rows=0.00 loops=1)
Index Cond: (a = '1900-01-01'::date)
-(4 rows)
+ Index Searches: 1
+(5 rows)
DROP TABLE brin_date_test;
RESET enable_seqscan;
@@ -929,7 +934,8 @@ SELECT * FROM brin_interval_test WHERE a = '-30 years'::interval;
Recheck Cond: (a = '@ 30 years ago'::interval)
-> Bitmap Index Scan on brin_interval_test_a_idx (actual rows=0.00 loops=1)
Index Cond: (a = '@ 30 years ago'::interval)
-(4 rows)
+ Index Searches: 1
+(5 rows)
EXPLAIN (ANALYZE, TIMING OFF, COSTS OFF, SUMMARY OFF, BUFFERS OFF)
SELECT * FROM brin_interval_test WHERE a = '30 years'::interval;
@@ -939,7 +945,8 @@ SELECT * FROM brin_interval_test WHERE a = '30 years'::interval;
Recheck Cond: (a = '@ 30 years'::interval)
-> Bitmap Index Scan on brin_interval_test_a_idx (actual rows=0.00 loops=1)
Index Cond: (a = '@ 30 years'::interval)
-(4 rows)
+ Index Searches: 1
+(5 rows)
DROP TABLE brin_interval_test;
RESET enable_seqscan;
@@ -957,7 +964,8 @@ SELECT * FROM brin_interval_test WHERE a = '-30 years'::interval;
Recheck Cond: (a = '@ 30 years ago'::interval)
-> Bitmap Index Scan on brin_interval_test_a_idx (actual rows=0.00 loops=1)
Index Cond: (a = '@ 30 years ago'::interval)
-(4 rows)
+ Index Searches: 1
+(5 rows)
EXPLAIN (ANALYZE, TIMING OFF, COSTS OFF, SUMMARY OFF, BUFFERS OFF)
SELECT * FROM brin_interval_test WHERE a = '30 years'::interval;
@@ -967,7 +975,8 @@ SELECT * FROM brin_interval_test WHERE a = '30 years'::interval;
Recheck Cond: (a = '@ 30 years'::interval)
-> Bitmap Index Scan on brin_interval_test_a_idx (actual rows=0.00 loops=1)
Index Cond: (a = '@ 30 years'::interval)
-(4 rows)
+ Index Searches: 1
+(5 rows)
DROP TABLE brin_interval_test;
RESET enable_seqscan;
diff --git a/src/test/regress/expected/memoize.out b/src/test/regress/expected/memoize.out
index 22f2d3284..38dfaf021 100644
--- a/src/test/regress/expected/memoize.out
+++ b/src/test/regress/expected/memoize.out
@@ -22,8 +22,9 @@ begin
ln := regexp_replace(ln, 'Evictions: 0', 'Evictions: Zero');
ln := regexp_replace(ln, 'Evictions: \d+', 'Evictions: N');
ln := regexp_replace(ln, 'Memory Usage: \d+', 'Memory Usage: N');
- ln := regexp_replace(ln, 'Heap Fetches: \d+', 'Heap Fetches: N');
- ln := regexp_replace(ln, 'loops=\d+', 'loops=N');
+ ln := regexp_replace(ln, 'Heap Fetches: \d+', 'Heap Fetches: N');
+ ln := regexp_replace(ln, 'loops=\d+', 'loops=N');
+ ln := regexp_replace(ln, 'Index Searches: \d+', 'Index Searches: N');
return next ln;
end loop;
end;
@@ -49,7 +50,8 @@ WHERE t2.unique1 < 1000;', false);
-> Index Only Scan using tenk1_unique1 on tenk1 t1 (actual rows=1.00 loops=N)
Index Cond: (unique1 = t2.twenty)
Heap Fetches: N
-(12 rows)
+ Index Searches: N
+(13 rows)
-- And check we get the expected results.
SELECT COUNT(*),AVG(t1.unique1) FROM tenk1 t1
@@ -80,7 +82,8 @@ WHERE t1.unique1 < 1000;', false);
-> Index Only Scan using tenk1_unique1 on tenk1 t2 (actual rows=1.00 loops=N)
Index Cond: (unique1 = t1.twenty)
Heap Fetches: N
-(12 rows)
+ Index Searches: N
+(13 rows)
-- And check we get the expected results.
SELECT COUNT(*),AVG(t2.unique1) FROM tenk1 t1,
@@ -106,6 +109,7 @@ WHERE t1.unique1 < 10;', false);
-> Nested Loop Left Join (actual rows=20.00 loops=N)
-> Index Scan using tenk1_unique1 on tenk1 t1 (actual rows=10.00 loops=N)
Index Cond: (unique1 < 10)
+ Index Searches: N
-> Memoize (actual rows=2.00 loops=N)
Cache Key: t1.two
Cache Mode: binary
@@ -115,7 +119,8 @@ WHERE t1.unique1 < 10;', false);
Rows Removed by Filter: 2
-> Index Scan using tenk1_unique1 on tenk1 t2_1 (actual rows=4.00 loops=N)
Index Cond: (unique1 < 4)
-(13 rows)
+ Index Searches: N
+(15 rows)
-- And check we get the expected results.
SELECT COUNT(*),AVG(t2.t1two) FROM tenk1 t1 LEFT JOIN
@@ -149,7 +154,8 @@ WHERE s.c1 = s.c2 AND t1.unique1 < 1000;', false);
Filter: ((t1.two + 1) = unique1)
Rows Removed by Filter: 9999
Heap Fetches: N
-(13 rows)
+ Index Searches: N
+(14 rows)
-- And check we get the expected results.
SELECT COUNT(*), AVG(t1.twenty) FROM tenk1 t1 LEFT JOIN
@@ -219,7 +225,8 @@ ON t1.x = t2.t::numeric AND t1.t::numeric = t2.x;', false);
Index Cond: (x = (t1.t)::numeric)
Filter: (t1.x = (t)::numeric)
Heap Fetches: N
-(10 rows)
+ Index Searches: N
+(11 rows)
DROP TABLE expr_key;
-- Reduce work_mem and hash_mem_multiplier so that we see some cache evictions
@@ -246,7 +253,8 @@ WHERE t2.unique1 < 1200;', true);
-> Index Only Scan using tenk1_unique1 on tenk1 t1 (actual rows=1.00 loops=N)
Index Cond: (unique1 = t2.thousand)
Heap Fetches: N
-(12 rows)
+ Index Searches: N
+(13 rows)
CREATE TABLE flt (f float);
CREATE INDEX flt_f_idx ON flt (f);
@@ -261,6 +269,7 @@ SELECT * FROM flt f1 INNER JOIN flt f2 ON f1.f = f2.f;', false);
Nested Loop (actual rows=4.00 loops=N)
-> Index Only Scan using flt_f_idx on flt f1 (actual rows=2.00 loops=N)
Heap Fetches: N
+ Index Searches: N
-> Memoize (actual rows=2.00 loops=N)
Cache Key: f1.f
Cache Mode: logical
@@ -268,7 +277,8 @@ SELECT * FROM flt f1 INNER JOIN flt f2 ON f1.f = f2.f;', false);
-> Index Only Scan using flt_f_idx on flt f2 (actual rows=2.00 loops=N)
Index Cond: (f = f1.f)
Heap Fetches: N
-(10 rows)
+ Index Searches: N
+(12 rows)
-- Ensure memoize operates in binary mode
SELECT explain_memoize('
@@ -278,6 +288,7 @@ SELECT * FROM flt f1 INNER JOIN flt f2 ON f1.f >= f2.f;', false);
Nested Loop (actual rows=4.00 loops=N)
-> Index Only Scan using flt_f_idx on flt f1 (actual rows=2.00 loops=N)
Heap Fetches: N
+ Index Searches: N
-> Memoize (actual rows=2.00 loops=N)
Cache Key: f1.f
Cache Mode: binary
@@ -285,7 +296,8 @@ SELECT * FROM flt f1 INNER JOIN flt f2 ON f1.f >= f2.f;', false);
-> Index Only Scan using flt_f_idx on flt f2 (actual rows=2.00 loops=N)
Index Cond: (f <= f1.f)
Heap Fetches: N
-(10 rows)
+ Index Searches: N
+(12 rows)
DROP TABLE flt;
-- Exercise Memoize in binary mode with a large fixed width type and a
@@ -311,7 +323,8 @@ SELECT * FROM strtest s1 INNER JOIN strtest s2 ON s1.n >= s2.n;', false);
Hits: 3 Misses: 3 Evictions: Zero Overflows: 0 Memory Usage: NkB
-> Index Scan using strtest_n_idx on strtest s2 (actual rows=4.00 loops=N)
Index Cond: (n <= s1.n)
-(9 rows)
+ Index Searches: N
+(10 rows)
-- Ensure we get 3 hits and 3 misses
SELECT explain_memoize('
@@ -327,7 +340,8 @@ SELECT * FROM strtest s1 INNER JOIN strtest s2 ON s1.t >= s2.t;', false);
Hits: 3 Misses: 3 Evictions: Zero Overflows: 0 Memory Usage: NkB
-> Index Scan using strtest_t_idx on strtest s2 (actual rows=4.00 loops=N)
Index Cond: (t <= s1.t)
-(9 rows)
+ Index Searches: N
+(10 rows)
DROP TABLE strtest;
-- Ensure memoize works with partitionwise join
@@ -348,6 +362,7 @@ SELECT * FROM prt t1 INNER JOIN prt t2 ON t1.a = t2.a;', false);
-> Nested Loop (actual rows=16.00 loops=N)
-> Index Only Scan using iprt_p1_a on prt_p1 t1_1 (actual rows=4.00 loops=N)
Heap Fetches: N
+ Index Searches: N
-> Memoize (actual rows=4.00 loops=N)
Cache Key: t1_1.a
Cache Mode: logical
@@ -355,9 +370,11 @@ SELECT * FROM prt t1 INNER JOIN prt t2 ON t1.a = t2.a;', false);
-> Index Only Scan using iprt_p1_a on prt_p1 t2_1 (actual rows=4.00 loops=N)
Index Cond: (a = t1_1.a)
Heap Fetches: N
+ Index Searches: N
-> Nested Loop (actual rows=16.00 loops=N)
-> Index Only Scan using iprt_p2_a on prt_p2 t1_2 (actual rows=4.00 loops=N)
Heap Fetches: N
+ Index Searches: N
-> Memoize (actual rows=4.00 loops=N)
Cache Key: t1_2.a
Cache Mode: logical
@@ -365,7 +382,8 @@ SELECT * FROM prt t1 INNER JOIN prt t2 ON t1.a = t2.a;', false);
-> Index Only Scan using iprt_p2_a on prt_p2 t2_2 (actual rows=4.00 loops=N)
Index Cond: (a = t1_2.a)
Heap Fetches: N
-(21 rows)
+ Index Searches: N
+(25 rows)
-- Ensure memoize works with parameterized union-all Append path
SET enable_partitionwise_join TO off;
@@ -378,6 +396,7 @@ ON t1.a = t2.a;', false);
Nested Loop (actual rows=16.00 loops=N)
-> Index Only Scan using iprt_p1_a on prt_p1 t1 (actual rows=4.00 loops=N)
Heap Fetches: N
+ Index Searches: N
-> Memoize (actual rows=4.00 loops=N)
Cache Key: t1.a
Cache Mode: logical
@@ -386,10 +405,12 @@ ON t1.a = t2.a;', false);
-> Index Only Scan using iprt_p1_a on prt_p1 (actual rows=4.00 loops=N)
Index Cond: (a = t1.a)
Heap Fetches: N
+ Index Searches: N
-> Index Only Scan using iprt_p2_a on prt_p2 (actual rows=0.00 loops=N)
Index Cond: (a = t1.a)
Heap Fetches: N
-(14 rows)
+ Index Searches: N
+(17 rows)
DROP TABLE prt;
RESET enable_partitionwise_join;
diff --git a/src/test/regress/expected/partition_prune.out b/src/test/regress/expected/partition_prune.out
index d95d2395d..34f2b0b8d 100644
--- a/src/test/regress/expected/partition_prune.out
+++ b/src/test/regress/expected/partition_prune.out
@@ -2369,6 +2369,10 @@ begin
ln := regexp_replace(ln, 'Workers Launched: \d+', 'Workers Launched: N');
ln := regexp_replace(ln, 'actual rows=\d+(?:\.\d+)? loops=\d+', 'actual rows=N loops=N');
ln := regexp_replace(ln, 'Rows Removed by Filter: \d+', 'Rows Removed by Filter: N');
+ perform regexp_matches(ln, 'Index Searches: \d+');
+ if found then
+ continue;
+ end if;
return next ln;
end loop;
end;
@@ -2686,47 +2690,56 @@ select * from ab where a = (select max(a) from lprt_a) and b = (select max(a)-1
Filter: (b = (InitPlan 2).col1)
-> Bitmap Index Scan on ab_a1_b1_a_idx (never executed)
Index Cond: (a = (InitPlan 1).col1)
+ Index Searches: 0
-> Bitmap Heap Scan on ab_a1_b2 ab_2 (never executed)
Recheck Cond: (a = (InitPlan 1).col1)
Filter: (b = (InitPlan 2).col1)
-> Bitmap Index Scan on ab_a1_b2_a_idx (never executed)
Index Cond: (a = (InitPlan 1).col1)
+ Index Searches: 0
-> Bitmap Heap Scan on ab_a1_b3 ab_3 (never executed)
Recheck Cond: (a = (InitPlan 1).col1)
Filter: (b = (InitPlan 2).col1)
-> Bitmap Index Scan on ab_a1_b3_a_idx (never executed)
Index Cond: (a = (InitPlan 1).col1)
+ Index Searches: 0
-> Bitmap Heap Scan on ab_a2_b1 ab_4 (never executed)
Recheck Cond: (a = (InitPlan 1).col1)
Filter: (b = (InitPlan 2).col1)
-> Bitmap Index Scan on ab_a2_b1_a_idx (never executed)
Index Cond: (a = (InitPlan 1).col1)
+ Index Searches: 0
-> Bitmap Heap Scan on ab_a2_b2 ab_5 (never executed)
Recheck Cond: (a = (InitPlan 1).col1)
Filter: (b = (InitPlan 2).col1)
-> Bitmap Index Scan on ab_a2_b2_a_idx (never executed)
Index Cond: (a = (InitPlan 1).col1)
+ Index Searches: 0
-> Bitmap Heap Scan on ab_a2_b3 ab_6 (never executed)
Recheck Cond: (a = (InitPlan 1).col1)
Filter: (b = (InitPlan 2).col1)
-> Bitmap Index Scan on ab_a2_b3_a_idx (never executed)
Index Cond: (a = (InitPlan 1).col1)
+ Index Searches: 0
-> Bitmap Heap Scan on ab_a3_b1 ab_7 (never executed)
Recheck Cond: (a = (InitPlan 1).col1)
Filter: (b = (InitPlan 2).col1)
-> Bitmap Index Scan on ab_a3_b1_a_idx (never executed)
Index Cond: (a = (InitPlan 1).col1)
+ Index Searches: 0
-> Bitmap Heap Scan on ab_a3_b2 ab_8 (actual rows=0.00 loops=1)
Recheck Cond: (a = (InitPlan 1).col1)
Filter: (b = (InitPlan 2).col1)
-> Bitmap Index Scan on ab_a3_b2_a_idx (actual rows=0.00 loops=1)
Index Cond: (a = (InitPlan 1).col1)
+ Index Searches: 1
-> Bitmap Heap Scan on ab_a3_b3 ab_9 (never executed)
Recheck Cond: (a = (InitPlan 1).col1)
Filter: (b = (InitPlan 2).col1)
-> Bitmap Index Scan on ab_a3_b3_a_idx (never executed)
Index Cond: (a = (InitPlan 1).col1)
-(52 rows)
+ Index Searches: 0
+(61 rows)
-- Test run-time partition pruning with UNION ALL parents
explain (analyze, costs off, summary off, timing off, buffers off)
@@ -2742,16 +2755,19 @@ select * from (select * from ab where a = 1 union all select * from ab) ab where
Filter: (b = (InitPlan 1).col1)
-> Bitmap Index Scan on ab_a1_b1_a_idx (actual rows=0.00 loops=1)
Index Cond: (a = 1)
+ Index Searches: 1
-> Bitmap Heap Scan on ab_a1_b2 ab_12 (never executed)
Recheck Cond: (a = 1)
Filter: (b = (InitPlan 1).col1)
-> Bitmap Index Scan on ab_a1_b2_a_idx (never executed)
Index Cond: (a = 1)
+ Index Searches: 0
-> Bitmap Heap Scan on ab_a1_b3 ab_13 (never executed)
Recheck Cond: (a = 1)
Filter: (b = (InitPlan 1).col1)
-> Bitmap Index Scan on ab_a1_b3_a_idx (never executed)
Index Cond: (a = 1)
+ Index Searches: 0
-> Seq Scan on ab_a1_b1 ab_1 (actual rows=0.00 loops=1)
Filter: (b = (InitPlan 1).col1)
-> Seq Scan on ab_a1_b2 ab_2 (never executed)
@@ -2770,7 +2786,7 @@ select * from (select * from ab where a = 1 union all select * from ab) ab where
Filter: (b = (InitPlan 1).col1)
-> Seq Scan on ab_a3_b3 ab_9 (never executed)
Filter: (b = (InitPlan 1).col1)
-(37 rows)
+(40 rows)
-- A case containing a UNION ALL with a non-partitioned child.
explain (analyze, costs off, summary off, timing off, buffers off)
@@ -2786,16 +2802,19 @@ select * from (select * from ab where a = 1 union all (values(10,5)) union all s
Filter: (b = (InitPlan 1).col1)
-> Bitmap Index Scan on ab_a1_b1_a_idx (actual rows=0.00 loops=1)
Index Cond: (a = 1)
+ Index Searches: 1
-> Bitmap Heap Scan on ab_a1_b2 ab_12 (never executed)
Recheck Cond: (a = 1)
Filter: (b = (InitPlan 1).col1)
-> Bitmap Index Scan on ab_a1_b2_a_idx (never executed)
Index Cond: (a = 1)
+ Index Searches: 0
-> Bitmap Heap Scan on ab_a1_b3 ab_13 (never executed)
Recheck Cond: (a = 1)
Filter: (b = (InitPlan 1).col1)
-> Bitmap Index Scan on ab_a1_b3_a_idx (never executed)
Index Cond: (a = 1)
+ Index Searches: 0
-> Result (actual rows=0.00 loops=1)
One-Time Filter: (5 = (InitPlan 1).col1)
-> Seq Scan on ab_a1_b1 ab_1 (actual rows=0.00 loops=1)
@@ -2816,7 +2835,7 @@ select * from (select * from ab where a = 1 union all (values(10,5)) union all s
Filter: (b = (InitPlan 1).col1)
-> Seq Scan on ab_a3_b3 ab_9 (never executed)
Filter: (b = (InitPlan 1).col1)
-(39 rows)
+(42 rows)
-- Another UNION ALL test, but containing a mix of exec init and exec run-time pruning.
create table xy_1 (x int, y int);
@@ -2887,16 +2906,19 @@ update ab_a1 set b = 3 from ab where ab.a = 1 and ab.a = ab_a1.a;');
Recheck Cond: (a = 1)
-> Bitmap Index Scan on ab_a1_b1_a_idx (actual rows=0.00 loops=1)
Index Cond: (a = 1)
+ Index Searches: 1
-> Bitmap Heap Scan on ab_a1_b2 ab_a1_2 (actual rows=1.00 loops=1)
Recheck Cond: (a = 1)
Heap Blocks: exact=1
-> Bitmap Index Scan on ab_a1_b2_a_idx (actual rows=1.00 loops=1)
Index Cond: (a = 1)
+ Index Searches: 1
-> Bitmap Heap Scan on ab_a1_b3 ab_a1_3 (actual rows=0.00 loops=1)
Recheck Cond: (a = 1)
Heap Blocks: exact=1
-> Bitmap Index Scan on ab_a1_b3_a_idx (actual rows=1.00 loops=1)
Index Cond: (a = 1)
+ Index Searches: 1
-> Materialize (actual rows=1.00 loops=1)
Storage: Memory Maximum Storage: NkB
-> Append (actual rows=1.00 loops=1)
@@ -2904,17 +2926,20 @@ update ab_a1 set b = 3 from ab where ab.a = 1 and ab.a = ab_a1.a;');
Recheck Cond: (a = 1)
-> Bitmap Index Scan on ab_a1_b1_a_idx (actual rows=0.00 loops=1)
Index Cond: (a = 1)
+ Index Searches: 1
-> Bitmap Heap Scan on ab_a1_b2 ab_2 (actual rows=1.00 loops=1)
Recheck Cond: (a = 1)
Heap Blocks: exact=1
-> Bitmap Index Scan on ab_a1_b2_a_idx (actual rows=1.00 loops=1)
Index Cond: (a = 1)
+ Index Searches: 1
-> Bitmap Heap Scan on ab_a1_b3 ab_3 (actual rows=0.00 loops=1)
Recheck Cond: (a = 1)
Heap Blocks: exact=1
-> Bitmap Index Scan on ab_a1_b3_a_idx (actual rows=1.00 loops=1)
Index Cond: (a = 1)
-(37 rows)
+ Index Searches: 1
+(43 rows)
table ab;
a | b
@@ -2990,17 +3015,23 @@ select * from tbl1 join tprt on tbl1.col1 > tprt.col1;
-> Append (actual rows=3.00 loops=2)
-> Index Scan using tprt1_idx on tprt_1 (actual rows=2.00 loops=2)
Index Cond: (col1 < tbl1.col1)
+ Index Searches: 2
-> Index Scan using tprt2_idx on tprt_2 (actual rows=2.00 loops=1)
Index Cond: (col1 < tbl1.col1)
+ Index Searches: 1
-> Index Scan using tprt3_idx on tprt_3 (never executed)
Index Cond: (col1 < tbl1.col1)
+ Index Searches: 0
-> Index Scan using tprt4_idx on tprt_4 (never executed)
Index Cond: (col1 < tbl1.col1)
+ Index Searches: 0
-> Index Scan using tprt5_idx on tprt_5 (never executed)
Index Cond: (col1 < tbl1.col1)
+ Index Searches: 0
-> Index Scan using tprt6_idx on tprt_6 (never executed)
Index Cond: (col1 < tbl1.col1)
-(15 rows)
+ Index Searches: 0
+(21 rows)
explain (analyze, costs off, summary off, timing off, buffers off)
select * from tbl1 join tprt on tbl1.col1 = tprt.col1;
@@ -3011,17 +3042,23 @@ select * from tbl1 join tprt on tbl1.col1 = tprt.col1;
-> Append (actual rows=1.00 loops=2)
-> Index Scan using tprt1_idx on tprt_1 (never executed)
Index Cond: (col1 = tbl1.col1)
+ Index Searches: 0
-> Index Scan using tprt2_idx on tprt_2 (actual rows=1.00 loops=2)
Index Cond: (col1 = tbl1.col1)
+ Index Searches: 2
-> Index Scan using tprt3_idx on tprt_3 (never executed)
Index Cond: (col1 = tbl1.col1)
+ Index Searches: 0
-> Index Scan using tprt4_idx on tprt_4 (never executed)
Index Cond: (col1 = tbl1.col1)
+ Index Searches: 0
-> Index Scan using tprt5_idx on tprt_5 (never executed)
Index Cond: (col1 = tbl1.col1)
+ Index Searches: 0
-> Index Scan using tprt6_idx on tprt_6 (never executed)
Index Cond: (col1 = tbl1.col1)
-(15 rows)
+ Index Searches: 0
+(21 rows)
select tbl1.col1, tprt.col1 from tbl1
inner join tprt on tbl1.col1 > tprt.col1
@@ -3056,17 +3093,23 @@ select * from tbl1 inner join tprt on tbl1.col1 > tprt.col1;
-> Append (actual rows=4.60 loops=5)
-> Index Scan using tprt1_idx on tprt_1 (actual rows=2.00 loops=5)
Index Cond: (col1 < tbl1.col1)
+ Index Searches: 5
-> Index Scan using tprt2_idx on tprt_2 (actual rows=2.75 loops=4)
Index Cond: (col1 < tbl1.col1)
+ Index Searches: 4
-> Index Scan using tprt3_idx on tprt_3 (actual rows=1.00 loops=2)
Index Cond: (col1 < tbl1.col1)
+ Index Searches: 2
-> Index Scan using tprt4_idx on tprt_4 (never executed)
Index Cond: (col1 < tbl1.col1)
+ Index Searches: 0
-> Index Scan using tprt5_idx on tprt_5 (never executed)
Index Cond: (col1 < tbl1.col1)
+ Index Searches: 0
-> Index Scan using tprt6_idx on tprt_6 (never executed)
Index Cond: (col1 < tbl1.col1)
-(15 rows)
+ Index Searches: 0
+(21 rows)
explain (analyze, costs off, summary off, timing off, buffers off)
select * from tbl1 inner join tprt on tbl1.col1 = tprt.col1;
@@ -3077,17 +3120,23 @@ select * from tbl1 inner join tprt on tbl1.col1 = tprt.col1;
-> Append (actual rows=0.60 loops=5)
-> Index Scan using tprt1_idx on tprt_1 (never executed)
Index Cond: (col1 = tbl1.col1)
+ Index Searches: 0
-> Index Scan using tprt2_idx on tprt_2 (actual rows=1.00 loops=2)
Index Cond: (col1 = tbl1.col1)
+ Index Searches: 2
-> Index Scan using tprt3_idx on tprt_3 (actual rows=0.33 loops=3)
Index Cond: (col1 = tbl1.col1)
+ Index Searches: 3
-> Index Scan using tprt4_idx on tprt_4 (never executed)
Index Cond: (col1 = tbl1.col1)
+ Index Searches: 0
-> Index Scan using tprt5_idx on tprt_5 (never executed)
Index Cond: (col1 = tbl1.col1)
+ Index Searches: 0
-> Index Scan using tprt6_idx on tprt_6 (never executed)
Index Cond: (col1 = tbl1.col1)
-(15 rows)
+ Index Searches: 0
+(21 rows)
select tbl1.col1, tprt.col1 from tbl1
inner join tprt on tbl1.col1 > tprt.col1
@@ -3141,17 +3190,23 @@ select * from tbl1 join tprt on tbl1.col1 < tprt.col1;
-> Append (actual rows=1.00 loops=1)
-> Index Scan using tprt1_idx on tprt_1 (never executed)
Index Cond: (col1 > tbl1.col1)
+ Index Searches: 0
-> Index Scan using tprt2_idx on tprt_2 (never executed)
Index Cond: (col1 > tbl1.col1)
+ Index Searches: 0
-> Index Scan using tprt3_idx on tprt_3 (never executed)
Index Cond: (col1 > tbl1.col1)
+ Index Searches: 0
-> Index Scan using tprt4_idx on tprt_4 (never executed)
Index Cond: (col1 > tbl1.col1)
+ Index Searches: 0
-> Index Scan using tprt5_idx on tprt_5 (never executed)
Index Cond: (col1 > tbl1.col1)
+ Index Searches: 0
-> Index Scan using tprt6_idx on tprt_6 (actual rows=1.00 loops=1)
Index Cond: (col1 > tbl1.col1)
-(15 rows)
+ Index Searches: 1
+(21 rows)
select tbl1.col1, tprt.col1 from tbl1
inner join tprt on tbl1.col1 < tprt.col1
@@ -3173,17 +3228,23 @@ select * from tbl1 join tprt on tbl1.col1 = tprt.col1;
-> Append (actual rows=0.00 loops=1)
-> Index Scan using tprt1_idx on tprt_1 (never executed)
Index Cond: (col1 = tbl1.col1)
+ Index Searches: 0
-> Index Scan using tprt2_idx on tprt_2 (never executed)
Index Cond: (col1 = tbl1.col1)
+ Index Searches: 0
-> Index Scan using tprt3_idx on tprt_3 (never executed)
Index Cond: (col1 = tbl1.col1)
+ Index Searches: 0
-> Index Scan using tprt4_idx on tprt_4 (never executed)
Index Cond: (col1 = tbl1.col1)
+ Index Searches: 0
-> Index Scan using tprt5_idx on tprt_5 (never executed)
Index Cond: (col1 = tbl1.col1)
+ Index Searches: 0
-> Index Scan using tprt6_idx on tprt_6 (never executed)
Index Cond: (col1 = tbl1.col1)
-(15 rows)
+ Index Searches: 0
+(21 rows)
select tbl1.col1, tprt.col1 from tbl1
inner join tprt on tbl1.col1 = tprt.col1
@@ -3513,10 +3574,12 @@ explain (analyze, costs off, summary off, timing off, buffers off) execute mt_q1
-> Index Scan using ma_test_p2_b_idx on ma_test_p2 ma_test_1 (actual rows=1.00 loops=1)
Filter: ((a >= $1) AND ((a % 10) = 5))
Rows Removed by Filter: 9
+ Index Searches: 1
-> Index Scan using ma_test_p3_b_idx on ma_test_p3 ma_test_2 (actual rows=1.00 loops=1)
Filter: ((a >= $1) AND ((a % 10) = 5))
Rows Removed by Filter: 9
-(9 rows)
+ Index Searches: 1
+(11 rows)
execute mt_q1(15);
a
@@ -3534,7 +3597,8 @@ explain (analyze, costs off, summary off, timing off, buffers off) execute mt_q1
-> Index Scan using ma_test_p3_b_idx on ma_test_p3 ma_test_1 (actual rows=1.00 loops=1)
Filter: ((a >= $1) AND ((a % 10) = 5))
Rows Removed by Filter: 9
-(6 rows)
+ Index Searches: 1
+(7 rows)
execute mt_q1(25);
a
@@ -3582,13 +3646,17 @@ explain (analyze, costs off, summary off, timing off, buffers off) select * from
-> Limit (actual rows=1.00 loops=1)
-> Index Scan using ma_test_p2_b_idx on ma_test_p2 (actual rows=1.00 loops=1)
Index Cond: (b IS NOT NULL)
+ Index Searches: 1
-> Index Scan using ma_test_p1_b_idx on ma_test_p1 ma_test_1 (never executed)
Filter: (a >= (InitPlan 2).col1)
+ Index Searches: 0
-> Index Scan using ma_test_p2_b_idx on ma_test_p2 ma_test_2 (actual rows=10.00 loops=1)
Filter: (a >= (InitPlan 2).col1)
+ Index Searches: 1
-> Index Scan using ma_test_p3_b_idx on ma_test_p3 ma_test_3 (actual rows=10.00 loops=1)
Filter: (a >= (InitPlan 2).col1)
-(14 rows)
+ Index Searches: 1
+(18 rows)
reset enable_seqscan;
reset enable_sort;
@@ -4159,13 +4227,17 @@ select * from rangep where b IN((select 1),(select 2)) order by a;
Sort Key: rangep_2.a
-> Index Scan using rangep_0_to_100_1_a_idx on rangep_0_to_100_1 rangep_2 (actual rows=0.00 loops=1)
Filter: (b = ANY (ARRAY[(InitPlan 1).col1, (InitPlan 2).col1]))
+ Index Searches: 1
-> Index Scan using rangep_0_to_100_2_a_idx on rangep_0_to_100_2 rangep_3 (actual rows=0.00 loops=1)
Filter: (b = ANY (ARRAY[(InitPlan 1).col1, (InitPlan 2).col1]))
+ Index Searches: 1
-> Index Scan using rangep_0_to_100_3_a_idx on rangep_0_to_100_3 rangep_4 (never executed)
Filter: (b = ANY (ARRAY[(InitPlan 1).col1, (InitPlan 2).col1]))
+ Index Searches: 0
-> Index Scan using rangep_100_to_200_a_idx on rangep_100_to_200 rangep_5 (actual rows=0.00 loops=1)
Filter: (b = ANY (ARRAY[(InitPlan 1).col1, (InitPlan 2).col1]))
-(15 rows)
+ Index Searches: 1
+(19 rows)
reset enable_sort;
drop table rangep;
diff --git a/src/test/regress/expected/select.out b/src/test/regress/expected/select.out
index cd79abc35..bab0cc93f 100644
--- a/src/test/regress/expected/select.out
+++ b/src/test/regress/expected/select.out
@@ -764,7 +764,8 @@ select * from onek2 where unique2 = 11 and stringu1 = 'ATAAAA';
Index Scan using onek2_u2_prtl on onek2 (actual rows=1.00 loops=1)
Index Cond: (unique2 = 11)
Filter: (stringu1 = 'ATAAAA'::name)
-(3 rows)
+ Index Searches: 1
+(4 rows)
explain (costs off)
select unique2 from onek2 where unique2 = 11 and stringu1 = 'ATAAAA';
diff --git a/src/test/regress/sql/memoize.sql b/src/test/regress/sql/memoize.sql
index d5aab4e56..c0d47fa87 100644
--- a/src/test/regress/sql/memoize.sql
+++ b/src/test/regress/sql/memoize.sql
@@ -23,8 +23,9 @@ begin
ln := regexp_replace(ln, 'Evictions: 0', 'Evictions: Zero');
ln := regexp_replace(ln, 'Evictions: \d+', 'Evictions: N');
ln := regexp_replace(ln, 'Memory Usage: \d+', 'Memory Usage: N');
- ln := regexp_replace(ln, 'Heap Fetches: \d+', 'Heap Fetches: N');
- ln := regexp_replace(ln, 'loops=\d+', 'loops=N');
+ ln := regexp_replace(ln, 'Heap Fetches: \d+', 'Heap Fetches: N');
+ ln := regexp_replace(ln, 'loops=\d+', 'loops=N');
+ ln := regexp_replace(ln, 'Index Searches: \d+', 'Index Searches: N');
return next ln;
end loop;
end;
diff --git a/src/test/regress/sql/partition_prune.sql b/src/test/regress/sql/partition_prune.sql
index 5f36d589b..4a2c74b08 100644
--- a/src/test/regress/sql/partition_prune.sql
+++ b/src/test/regress/sql/partition_prune.sql
@@ -588,6 +588,10 @@ begin
ln := regexp_replace(ln, 'Workers Launched: \d+', 'Workers Launched: N');
ln := regexp_replace(ln, 'actual rows=\d+(?:\.\d+)? loops=\d+', 'actual rows=N loops=N');
ln := regexp_replace(ln, 'Rows Removed by Filter: \d+', 'Rows Removed by Filter: N');
+ perform regexp_matches(ln, 'Index Searches: \d+');
+ if found then
+ continue;
+ end if;
return next ln;
end loop;
end;
diff --git a/src/tools/pgindent/typedefs.list b/src/tools/pgindent/typedefs.list
index 984006099..d60ae7c72 100644
--- a/src/tools/pgindent/typedefs.list
+++ b/src/tools/pgindent/typedefs.list
@@ -1238,6 +1238,7 @@ IndexPath
IndexRuntimeKeyInfo
IndexScan
IndexScanDesc
+IndexScanInstrumentation
IndexScanState
IndexStateFlagsAction
IndexStmt
@@ -2666,6 +2667,7 @@ SharedExecutorInstrumentation
SharedFileSet
SharedHashInfo
SharedIncrementalSortInfo
+SharedIndexScanInstrumentation
SharedInvalCatalogMsg
SharedInvalCatcacheMsg
SharedInvalRelcacheMsg
--
2.47.2
Hi,
On 2025-03-08 11:47:25 -0500, Peter Geoghegan wrote:
> My current plan is to commit this on Tuesday or Wednesday, barring any
> objections.
A minor question about this patch: Was there a particular reason it added the
index-specific instrumentation information inline in IndexScanState etc.? Of
course, the amount of memory right now is rather trivial, so that is not an
issue memory-usage-wise. Is that the reason?
The background for my question is that I was looking at what it would take to
track the index and table buffer usage separately for
IndexScanState/IndexOnlyScanState, and IndexScanInstrumentation seems to be
predestined for that information. But it seems a bit too much memory to just
keep a BufferUsage around even when analyze isn't used.
Greetings,
Andres Freund
PS: Another thing that I think we ought to track is the number of fetches from
the table that missed, but that's not really related to my question here or
this thread...
On Wed, Jul 23, 2025 at 1:50 PM Andres Freund <andres@anarazel.de> wrote:
> A minor question about this patch: Was there a particular reason it added the
> index-specific instrumentation information inline in IndexScanState etc.? Of
> course, the amount of memory right now is rather trivial, so that is not an
> issue memory-usage-wise. Is that the reason?
There was no very good reason behind my choice to do things that way.
I wanted to minimize the amount of churn in files like
nodeIndexScan.c. It was almost an arbitrary choice.
> The background for my question is that I was looking at what it would take to
> track the index and table buffer usage separately for
> IndexScanState/IndexOnlyScanState, and IndexScanInstrumentation seems to be
> predestined for that information. But it seems a bit too much memory to just
> keep a BufferUsage around even when analyze isn't used.
Offhand, I'd say that it would almost certainly be okay to switch over
to using dynamic allocation for IndexScanInstrumentation, instead of
storing it inline in IndexScanState/IndexOnlyScanState. That way you
could add many more fields to IndexScanInstrumentation without
creating any memory bloat problems in the common case where the scan
isn't running under EXPLAIN ANALYZE.
--
Peter Geoghegan
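A sketch of the dynamic-allocation approach suggested above, purely for
illustration (the struct shown is a simplified stand-in for the one the patch
adds, and the function name and allocation site are assumptions):

#include "postgres.h"

#include "nodes/execnodes.h"

/* simplified stand-in for the struct added by the patch */
typedef struct IndexScanInstrumentationSketch
{
    uint64      nsearches;      /* number of index searches */
    /* room to grow, e.g. separate index/heap BufferUsage counters */
} IndexScanInstrumentationSketch;

/*
 * Hypothetical allocation policy, e.g. for ExecInitIndexScan(): only
 * pay for the struct when the plan is being instrumented (EXPLAIN
 * ANALYZE), so adding more fields costs nothing for ordinary queries.
 */
static IndexScanInstrumentationSketch *
maybe_alloc_instrumentation(EState *estate)
{
    if (estate->es_instrument)
        return palloc0(sizeof(IndexScanInstrumentationSketch));

    return NULL;
}

Node code would then guard every counter update with a NULL test, much like
the existing "if (scan->instrument)" check in the patch's blscan.c hunk.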