Improve EXPLAIN output for multicolumn B-Tree Index
Hi,
Regarding the multicolumn B-Tree Index, I'm considering
whether we can enhance the EXPLAIN output. There have been requests
for this from our customers.
As the documentation says, we need to use it carefully.
The exact rule is that equality constraints on leading columns,
plus any inequality constraints on the first column that does
not have an equality constraint, will be used to limit the portion
of the index that is scanned.
https://www.postgresql.org/docs/17/indexes-multicolumn.html
However, it's not easy to confirm whether multicolumn indexes are
being used efficiently, because we need to compare the index
definitions and query conditions individually.
For instance, just by looking at the following EXPLAIN result, we
can't determine whether the index is being used efficiently or not
at a glance. Indeed, the current index definition is not suitable
for the query, so the cost is significantly high.
=# EXPLAIN (VERBOSE, ANALYZE) SELECT * FROM test WHERE id1 = 1 AND id3 = 101;
QUERY PLAN
----------------------------------------------------------------------------------------------------------------------------
Index Scan using test_idx on public.test (cost=0.42..12754.76 rows=1 width=18) (actual time=0.033..54.115 rows=1 loops=1)
Output: id1, id2, id3, value
Index Cond: ((test.id1 = 1) AND (test.id3 = 101)) -- Is it efficient or not?
Planning Time: 0.145 ms
Execution Time: 54.150 ms
(6 rows)
So, I'd like to improve the output to be more user-friendly.
# Idea
I'm considering adding new information, "Index Bound Cond", which specifies
what quals will be used for the boundary condition of the B-Tree index.
(Since this is just my current idea, I'm open to changing the output.)
Here is an example output.
-- prepare for the test
CREATE TABLE test (id1 int, id2 int, id3 int, value varchar(32));
CREATE INDEX test_idx ON test(id1, id2, id3); -- multicolumn B-Tree index
INSERT INTO test (SELECT i % 2, i, i, 'hello' FROM generate_series(1,1000000) s(i));
ANALYZE;
-- explain
=# EXPLAIN (VERBOSE, ANALYZE) SELECT * FROM test WHERE id1 = 1 AND id2 = 101;
QUERY PLAN
-----------------------------------------------------------------------------------------------------------------------
Index Scan using test_idx on public.test (cost=0.42..8.45 rows=1 width=18) (actual time=0.046..0.047 rows=1 loops=1)
Output: id1, id2, id3, value
Index Cond: ((test.id1 = 1) AND (test.id2 = 101))
Index Bound Cond: ((test.id1 = 1) AND (test.id2 = 101)) -- The B-Tree index is used efficiently.
Planning Time: 0.124 ms
Execution Time: 0.076 ms
(6 rows)
=# EXPLAIN (VERBOSE, ANALYZE) SELECT * FROM test WHERE id1 = 1 AND id3 = 101;
QUERY PLAN
----------------------------------------------------------------------------------------------------------------------------
Index Scan using test_idx on public.test (cost=0.42..12754.76 rows=1 width=18) (actual time=0.033..54.115 rows=1 loops=1)
Output: id1, id2, id3, value
Index Cond: ((test.id1 = 1) AND (test.id3 = 101))
Index Bound Cond: (test.id1 = 1) -- The B-tree index is *not* used efficiently
-- compared to the previous execution conditions,
-- because it differs from "Index Cond".
Planning Time: 0.145 ms
Execution Time: 54.150 ms
(6 rows)
# PoC patch
The PoC patch makes the following changes:
* Adds a new variable related to bound conditions
to IndexPath, IndexScan, IndexOnlyScan, and BitmapIndexScan
* Adds quals for bound conditions to IndexPath when estimating cost, since
the B-Tree index considers the boundary condition in btcostestimate()
* Adds quals for bound conditions to the output of EXPLAIN
Thank you for reading my suggestion. Please feel free to comment.
* Is this feature useful? Is there a possibility it will be accepted?
* Are there any other ideas for determining if multicolumn indexes are
being used efficiently? Although I considered calculating the efficiency using
pg_statio_all_indexes.idx_blks_read and pg_stat_all_indexes.idx_tup_read,
I believe improving the EXPLAIN output is better because it can be output
per query and it's more user-friendly.
* Is "Index Bound Cond" the proper term? I also considered changing
"Index Cond" to only show quals for the boundary condition and adding
a new term "Index Filter".
* Would it be better to add new interfaces to Index AM? Is there any case
to output the EXPLAIN for each index context? At least, I think it's worth
considering whether it's good for amcostestimate() to modify the
IndexPath directly as the PoC patch does.
Regards,
--
Masahiro Ikeda
NTT DATA CORPORATION
Attachments:
v1-0001-PoC-Add-new-information-Index-Bound-Cond-to-EXPLAIN-.patch
On Fri, Jun 21, 2024 at 12:42 PM <Masahiro.Ikeda@nttdata.com> wrote:
I am unable to decide whether reporting the bound quals alone is enough
to judge the efficiency of an index without knowing the difference between
index tuple selectivity and heap tuple selectivity. That difference
seems to be a better indicator of index efficiency, whereas the bound quals
will help debug the inefficiency, if any.
Also, do we want to report bound quals even if they are the same as index
conditions or just when they are different?
--
Best Wishes,
Ashutosh Bapat
On Fri, 21 Jun 2024 07:12:25 +0000
<Masahiro.Ikeda@nttdata.com> wrote:
* Is this feature useful? Is there a possibility it will be accepted?
I think adding such information to EXPLAIN output is useful because it
will help users confirm the effect of a multicolumn index on a certain query
and decide whether to leave, drop, or recreate the index, and so on.
* Are there any other ideas for determining if multicolumn indexes are
being used efficiently? Although I considered calculating the efficiency using
pg_statio_all_indexes.idx_blks_read and pg_stat_all_indexes.idx_tup_read,
I believe improving the EXPLAIN output is better because it can be output
per query and it's more user-friendly.
It seems to me that improving EXPLAIN is a natural way to show information
on query optimization, like index scans.
* Is "Index Bound Cond" the proper term? I also considered changing
"Index Cond" to only show quals for the boundary condition and adding
a new term "Index Filter".
"Index Bound Cond" does not seem intuitive to me, because I could not find
a description explaining what it means in the documentation. I like
"Index Filter", which implies the index has to be scanned.
* Would it be better to add new interfaces to Index AM? Is there any case
to output the EXPLAIN for each index context? At least, I think it's worth
considering whether it's good for amcostestimate() to modify the
IndexPath directly as the PoC patch does.
I am not sure it is the best way to modify IndexPath in amcostestimate(), but
I don't have better ideas for now.
Regards,
Yugo Nagata
Regards,
--
Masahiro Ikeda
NTT DATA CORPORATION
--
Yugo NAGATA <nagata@sraoss.co.jp>
I am unable to decide whether reporting the bound quals alone is enough to judge the efficiency of an index without knowing the difference between index tuple selectivity and heap tuple selectivity. That difference seems to be a better indicator of index efficiency, whereas the bound quals will help debug the inefficiency, if any.
Also, do we want to report bound quals even if they are the same as index conditions or just when they are different?
Thank you for your comment. After receiving it, I thought it would be better to also report information that makes the difference in selectivity understandable. One idea I had is to output the number of index tuples inefficiently extracted, like "Rows Removed by Filter". Users can then check the selectivity and efficiency by looking at that number.
Also, I thought it would be better to change the way bound quals are reported to align with "Filter". I think it would be better not to output them when the bound quals are the same as the index conditions.
In my local PoC patch, I have modified the output as follows, what do you think?
=# EXPLAIN (VERBOSE, ANALYZE) SELECT * FROM test WHERE id1 = 1 AND id2 = 101;
QUERY PLAN
-------------------------------------------------------------------------------------------------------------------------
Index Scan using test_idx on ikedamsh.test (cost=0.42..8.45 rows=1 width=18) (actual time=0.082..0.086 rows=1 loops=1)
Output: id1, id2, id3, value
Index Cond: ((test.id1 = 1) AND (test.id2 = 101)) -- If it’s efficient, the output won’t change.
Planning Time: 5.088 ms
Execution Time: 0.162 ms
(5 rows)
=# EXPLAIN (VERBOSE, ANALYZE) SELECT * FROM test WHERE id1 = 1 AND id3 = 101;
QUERY PLAN
-------------------------------------------------------------------------------------------------------------------------------
Index Scan using test_idx on ikedamsh.test (cost=0.42..12630.10 rows=1 width=18) (actual time=0.175..279.819 rows=1 loops=1)
Output: id1, id2, id3, value
Index Cond: (test.id1 = 1) -- Change the output. Show only the bound quals.
Index Filter: (test.id3 = 101) -- New. Output quals which are not used as the bound quals
Rows Removed by Index Filter: 499999 -- New. Output when ANALYZE option is specified
Planning Time: 0.354 ms
Execution Time: 279.908 ms
(7 rows)
Regards,
--
Masahiro Ikeda
NTT DATA CORPORATION
* Is this feature useful? Is there a possibility it will be accepted?
I think adding such information to EXPLAIN output is useful because it will help users
confirm the effect of a multicolumn index on a certain query and decide whether to
leave, drop, or recreate the index, and so on.
Thank you for your comments and for empathizing with the utility of the approach.
* Are there any other ideas for determining if multicolumn indexes are
being used efficiently? Although I considered calculating the
efficiency using pg_statio_all_indexes.idx_blks_read and
pg_stat_all_indexes.idx_tup_read, I believe improving the EXPLAIN output
is better because it can be output per query and it's more user-friendly.
It seems to me that improving EXPLAIN is a natural way to show information on query
optimization, like index scans.
OK, I'll proceed that way.
* Is "Index Bound Cond" the proper term? I also considered changing
"Index Cond" to only show quals for the boundary condition and adding
a new term "Index Filter".
"Index Bound Cond" does not seem intuitive to me, because I could not find a description
explaining what it means in the documentation. I like "Index Filter", which implies the
index has to be scanned.
OK, I think you are right. Even at this point, there are things like 'Filter' and
'Rows Removed by Filter', so it seems natural to align with them. I described a
new output example in the previous email; what do you think of it?
* Would it be better to add new interfaces to Index AM? Is there any
case to output the EXPLAIN for each index context? At least, I think it's
worth considering whether it's good for amcostestimate() to modify the
IndexPath directly as the PoC patch does.
I am not sure it is the best way to modify IndexPath in amcostestimate(), but I don't
have better ideas for now.
OK, I'll consider the best way to change it. In addition, if we add
"Rows Removed by Index Filter", we might need to consider a method to receive the
number of filtered tuples at execution time from the index AM.
Regards,
--
Masahiro Ikeda
NTT DATA CORPORATION
On Mon, 24 Jun 2024 at 04:38, <Masahiro.Ikeda@nttdata.com> wrote:
In my local PoC patch, I have modified the output as follows, what do you think?
=# EXPLAIN (VERBOSE, ANALYZE) SELECT * FROM test WHERE id1 = 1 AND id2 = 101;
QUERY PLAN
-------------------------------------------------------------------------------------------------------------------------
Index Scan using test_idx on ikedamsh.test (cost=0.42..8.45 rows=1 width=18) (actual time=0.082..0.086 rows=1 loops=1)
Output: id1, id2, id3, value
Index Cond: ((test.id1 = 1) AND (test.id2 = 101)) -- If it’s efficient, the output won’t change.
Planning Time: 5.088 ms
Execution Time: 0.162 ms
(5 rows)
=# EXPLAIN (VERBOSE, ANALYZE) SELECT * FROM test WHERE id1 = 1 AND id3 = 101;
QUERY PLAN
-------------------------------------------------------------------------------------------------------------------------------
Index Scan using test_idx on ikedamsh.test (cost=0.42..12630.10 rows=1 width=18) (actual time=0.175..279.819 rows=1 loops=1)
Output: id1, id2, id3, value
Index Cond: (test.id1 = 1) -- Change the output. Show only the bound quals.
Index Filter: (test.id3 = 101) -- New. Output quals which are not used as the bound quals
I think this is too easy to confuse with the pre-existing 'Filter'
condition, which you'll find on indexes with INCLUDE-d columns or
filters on non-index columns.
Furthermore, I think this is probably not helpful (maybe even harmful)
for index types like GIN and BRIN, where index searchkey order is
mostly irrelevant to the index shape and performance.
Finally, does this change the index AM API? Does this add another
scankey argument to ->amrescan?
Rows Removed by Index Filter: 499999 -- New. Output when ANALYZE option is specified
Separate from the Index Cond/Index Filter output changes, I think
this can be useful output, though I'd probably let the AM
specify what kind of filter data to display.
E.g. BRIN may well want to display how many ranges matched the
predicate, vs how many ranges were unsummarized and thus returned; two
conditions which aren't as easy to differentiate, but which can be
important when debugging query performance.
Planning Time: 0.354 ms
Execution Time: 279.908 ms
(7 rows)
Was this a test against the same dataset as the one you'd posted your
measurements of your first patchset with? The execution time seems to
have slowed down quite significantly, so if the test set is the same
then this doesn't bode well for your patchset.
Kind regards,
Matthias van de Meent
+1 for the idea.
On Mon, 24 Jun 2024 at 11:11, Matthias van de Meent
<boekewurm+postgres@gmail.com> wrote:
I think this is too easy to confuse with the pre-existing 'Filter'
condition, which you'll find on indexes with INCLUDE-d columns or
filters on non-index columns.
Why not combine them? And both call them Filter? In a sense this
filtering acts very similarly to INCLUDE-based filtering (for btrees at
least). Although I might be wrong about that, because when I try to
confirm the same perf using the following script I do get quite
different timings (maybe you have an idea what's going on here). But
even if it does mean something slightly different perf wise, I think
using Filter for both is unlikely to confuse anyone. Since, while
allowed, it seems extremely unlikely in practice that someone will use
the same column as part of the indexed columns and as part of the
INCLUDE-d columns (why would you store the same info twice).
CREATE TABLE test (id1 int, id2 int, id3 int, value varchar(32));
INSERT INTO test (SELECT i % 10, i % 1000, i, 'hello' FROM
generate_series(1,1000000) s(i));
vacuum freeze test;
CREATE INDEX test_idx_include ON test(id1, id2) INCLUDE (id3);
ANALYZE test;
EXPLAIN (VERBOSE, ANALYZE, BUFFERS) SELECT id1, id3 FROM test WHERE
id1 = 1 AND id3 = 101;
CREATE INDEX test_idx ON test(id1, id2, id3);
ANALYZE test;
EXPLAIN (VERBOSE, ANALYZE, BUFFERS) SELECT id1, id3 FROM test WHERE
id1 = 1 AND id3 = 101;
QUERY PLAN
───────────────────────────────────────
Index Only Scan using test_idx_include on public.test
(cost=0.42..3557.09 rows=1 width=8) (actual time=0.708..6.639 rows=1
loops=1)
Output: id1, id3
Index Cond: (test.id1 = 1)
Filter: (test.id3 = 101)
Rows Removed by Filter: 99999
Heap Fetches: 0
Buffers: shared hit=1 read=386
Query Identifier: 471139784017641093
Planning:
Buffers: shared hit=8 read=1
Planning Time: 0.091 ms
Execution Time: 6.656 ms
(12 rows)
Time: 7.139 ms
QUERY PLAN
─────────────────────────────────────
Index Only Scan using test_idx on public.test (cost=0.42..2591.77
rows=1 width=8) (actual time=0.238..2.110 rows=1 loops=1)
Output: id1, id3
Index Cond: ((test.id1 = 1) AND (test.id3 = 101))
Heap Fetches: 0
Buffers: shared hit=1 read=386
Query Identifier: 471139784017641093
Planning:
Buffers: shared hit=10 read=1
Planning Time: 0.129 ms
Execution Time: 2.128 ms
(10 rows)
Time: 2.645 ms
On Mon, 24 Jun 2024 at 11:58, Jelte Fennema-Nio <postgres@jeltef.nl> wrote:
+1 for the idea.
On Mon, 24 Jun 2024 at 11:11, Matthias van de Meent
<boekewurm+postgres@gmail.com> wrote:
I think this is too easy to confuse with the pre-existing 'Filter'
condition, which you'll find on indexes with INCLUDE-d columns or
filters on non-index columns.
Why not combine them? And both call them Filter? In a sense this
filtering acts very similarly to INCLUDE-based filtering (for btrees at
least).
It does not really behave similarly: index scan keys (such as the
id3=101 scankey) don't require visibility checks in the btree code,
while the Filter condition _does_ require a visibility check, and
delegates the check to the table AM if the scan isn't Index-Only, or
if the VM didn't show all-visible during the check.
Furthermore, the index could use the scankey to improve the number of
keys to scan using "skip scans"; by realising during a forward scan
that if you've reached tuple (1, 2, 3) and looking for (1, _, 1) you
can skip forward to (1, 3, _), rather than having to go through tuples
(1, 2, 4), (1, 2, 5), ... (1, 2, n). This is not possible for
INCLUDE-d columns, because their datatypes and structure are opaque to
the index AM; the AM cannot assume anything about or do anything with
those values.
Although I might be wrong about that, because when I try to
confirm the same perf using the following script I do get quite
different timings (maybe you have an idea what's going on here). But
even if it does mean something slightly different perf wise, I think
using Filter for both is unlikely to confuse anyone.
I don't want A to be the plan, while showing B' to the user, as the
performance picture for the two may be completely different. And, as I
mentioned upthread, the differences between AMs in the (lack of)
meaning in index column order also makes it quite wrong to generally
separate prefix equalities from the rest of the keys.
Since, while
allowed, it seems extremely unlikely in practice that someone will use
the same column as part of the indexed columns and as part of the
INCLUDE-d columns (why would you store the same info twice).
Yeah, people don't generally include the same index column more than
once in the same index.
CREATE INDEX test_idx_include ON test(id1, id2) INCLUDE (id3);
CREATE INDEX test_idx ON test(id1, id2, id3);
QUERY PLAN
───────────────────────────────────────
Index Only Scan using test_idx_include on public.test
[...]
Time: 7.139 ms
QUERY PLAN
─────────────────────────────────────
Index Only Scan using test_idx on public.test (cost=0.42..2591.77
[...]
Time: 2.645 ms
As you can see, there's a huge difference in performance. Putting both
non-bound and "normal" filter clauses in the same Filter clause will
make it more difficult to explain performance issues based on only the
explain output.
Kind regards,
Matthias van de Meent
Neon (https://neon.tech)
On Mon, 24 Jun 2024 at 13:02, Matthias van de Meent
<boekewurm+postgres@gmail.com> wrote:
It does not really behave similarly: index scan keys (such as the
id3=101 scankey) don't require visibility checks in the btree code,
while the Filter condition _does_ require a visibility check, and
delegates the check to the table AM if the scan isn't Index-Only, or
if the VM didn't show all-visible during the check.
Any chance you could point me in the right direction for the
code/docs/comment about this? I'd like to learn a bit more about why
that is the case, because I didn't realize visibility checks worked
differently for index scan keys and Filter keys.
Furthermore, the index could use the scankey to improve the number of
keys to scan using "skip scans"; by realising during a forward scan
that if you've reached tuple (1, 2, 3) and looking for (1, _, 1) you
can skip forward to (1, 3, _), rather than having to go through tuples
(1, 2, 4), (1, 2, 5), ... (1, 2, n). This is not possible for
INCLUDE-d columns, because their datatypes and structure are opaque to
the index AM; the AM cannot assume anything about or do anything with
those values.
Does Postgres actually support this currently? I thought skip scans
were not available (yet).
I don't want A to be the plan, while showing B' to the user, as the
performance picture for the two may be completely different. And, as I
mentioned upthread, the differences between AMs in the (lack of)
meaning in index column order also makes it quite wrong to generally
separate prefix equalities from the rest of the keys.
Yeah, that makes sense. These specific explain lines probably
only/mostly make sense for btree. So yeah we'd want the index AM to be
able to add some stuff to the explain plan.
As you can see, there's a huge difference in performance. Putting both
non-bound and "normal" filter clauses in the same Filter clause will
make it more difficult to explain performance issues based on only the
explain output.
Fair enough, that's of course the main point of this patch in the
first place: being able to better interpret the explain plan when you
don't have access to the schema. Still I think Filter is the correct
keyword for both, so how about we make it less confusing by making the
current "Filter" more specific by calling it something like "Non-key
Filter" or "INCLUDE Filter" and then call the other something like
"Index Filter" or "Secondary Bound Filter".
On Mon, Jun 24, 2024 at 8:08 AM <Masahiro.Ikeda@nttdata.com> wrote:
I am unable to decide whether reporting the bound quals alone is enough
to judge the efficiency of an index without knowing the difference between
index tuple selectivity and heap tuple selectivity. That difference seems
to be a better indicator of index efficiency, whereas the bound quals will
help debug the inefficiency, if any.
Also, do we want to report bound quals even if they are the same as
index conditions or just when they are different?
Thank you for your comment. After receiving it, I thought it would be
better to also report information that makes the difference in selectivity
understandable. One idea I had is to output the number of index tuples
inefficiently extracted, like "Rows Removed by Filter". Users can then
check the selectivity and efficiency by looking at that number.
Also, I thought it would be better to change the way bound quals are
reported to align with "Filter". I think it would be better not to output
them when the bound quals are the same as the index conditions.
In my local PoC patch, I have modified the output as follows, what do you think?
=# EXPLAIN (VERBOSE, ANALYZE) SELECT * FROM test WHERE id1 = 1 AND id2 = 101;
QUERY PLAN
-------------------------------------------------------------------------------------------------------------------------
Index Scan using test_idx on ikedamsh.test (cost=0.42..8.45 rows=1 width=18) (actual time=0.082..0.086 rows=1 loops=1)
Output: id1, id2, id3, value
Index Cond: ((test.id1 = 1) AND (test.id2 = 101)) -- If it's efficient, the output won't change.
Planning Time: 5.088 ms
Execution Time: 0.162 ms
(5 rows)
This looks fine. We may highlight in the documentation that the lack of Index
bound quals in the EXPLAIN output indicates that they are the same as Index Cond.
Another idea is to use both Index Cond and bound quals as the property name, but
that's too long.
=# EXPLAIN (VERBOSE, ANALYZE) SELECT * FROM test WHERE id1 = 1 AND id3 = 101;
QUERY PLAN
-------------------------------------------------------------------------------------------------------------------------------
Index Scan using test_idx on ikedamsh.test (cost=0.42..12630.10 rows=1 width=18) (actual time=0.175..279.819 rows=1 loops=1)
Output: id1, id2, id3, value
Index Cond: (test.id1 = 1) -- Change the output. Show only the bound quals.
Index Filter: (test.id3 = 101) -- New. Output quals which are not used as the bound quals
Rows Removed by Index Filter: 499999 -- New. Output when ANALYZE option is specified
Planning Time: 0.354 ms
Execution Time: 279.908 ms
(7 rows)
I don't think we want to split these clauses. Index Cond should indicate
the conditions applied to the index scan. Bound quals should be listed
separately even though they will have an intersection with Index Cond. I am
not sure whether Index Filter is the right name; maybe Index Bound Cond.
But I don't know this area enough to make a final call.
About Rows Removed by Index Filter: it's good to provide a number when
ANALYZE is specified, but it will also be better to show what was
estimated. We do that for cost and rows etc., but doing that somewhere in
the plan output may not have a precedent. I think we should try that and
see what others think.
--
Best Wishes,
Ashutosh Bapat
On Mon, 24 Jun 2024 at 14:42, Jelte Fennema-Nio <postgres@jeltef.nl> wrote:
On Mon, 24 Jun 2024 at 13:02, Matthias van de Meent
<boekewurm+postgres@gmail.com> wrote:
It does not really behave similarly: index scan keys (such as the
id3=101 scankey) don't require visibility checks in the btree code,
while the Filter condition _does_ require a visibility check, and
delegates the check to the table AM if the scan isn't Index-Only, or
if the VM didn't show all-visible during the check.
Any chance you could point me in the right direction for the
code/docs/comment about this? I'd like to learn a bit more about why
that is the case, because I didn't realize visibility checks worked
differently for index scan keys and Filter keys.
This can be derived by combining how Filter works (it only filters the
returned live tuples) and how Index-Only scans work (return the index
tuple, unless !ALL_VISIBLE, in which case the heap tuple is
projected). There have been several threads more or less recently that
also touch this topic and closely related topics, e.g. [0] and [1].
Furthermore, the index could use the scankey to improve the number of
keys to scan using "skip scans"; by realising during a forward scan
that if you've reached tuple (1, 2, 3) and looking for (1, _, 1) you
can skip forward to (1, 3, _), rather than having to go through tuples
(1, 2, 4), (1, 2, 5), ... (1, 2, n). This is not possible for
INCLUDE-d columns, because their datatypes and structure are opaque to
the index AM; the AM cannot assume anything about or do anything with
those values.
Does Postgres actually support this currently? I thought skip scans
were not available (yet).
Peter Geoghegan has been working on it as a project after PG17's
IN()-list improvements were committed, and I hear he has the basics
working but the further details need fleshing out.
As you can see, there's a huge difference in performance. Putting both
non-bound and "normal" filter clauses in the same Filter clause will
make it more difficult to explain performance issues based on only the
explain output.
Fair enough, that's of course the main point of this patch in the
first place: being able to better interpret the explain plan when you
don't have access to the schema. Still I think Filter is the correct
keyword for both, so how about we make it less confusing by making the
current "Filter" more specific by calling it something like "Non-key
Filter" or "INCLUDE Filter" and then call the other something like
"Index Filter" or "Secondary Bound Filter".
I'm not sure how debuggable explain plans are without access to the
schema, especially when VERBOSE isn't configured, so I would be
hesitant to accept that as an argument here.
Kind regards,
Matthias van de Meent
Neon (https://neon.tech)
[0]: /messages/by-id/N1xaIrU29uk5YxLyW55MGk5fz9s6V2FNtj54JRaVlFbPixD5z8sJ07Ite5CvbWwik8ZvDG07oSTN-usENLVMq2UAcizVTEd5b-o16ZGDIIU=@yamlcoder.me
[1]: /messages/by-id/cf85f46f-b02f-05b2-5248-5000b894ebab@enterprisedb.com
On Mon, 24 Jun 2024 at 04:38, <Masahiro.Ikeda@nttdata.com> wrote:
In my local PoC patch, I have modified the output as follows, what do you think?
=# EXPLAIN (VERBOSE, ANALYZE) SELECT * FROM test WHERE id1 = 1 AND id2 = 101;
QUERY PLAN
-------------------------------------------------------------------------------------------------------------------------
Index Scan using test_idx on ikedamsh.test (cost=0.42..8.45 rows=1 width=18) (actual time=0.082..0.086 rows=1 loops=1)
Output: id1, id2, id3, value
Index Cond: ((test.id1 = 1) AND (test.id2 = 101)) -- If it's efficient, the output won't change.
Planning Time: 5.088 ms
Execution Time: 0.162 ms
(5 rows)

=# EXPLAIN (VERBOSE, ANALYZE) SELECT * FROM test WHERE id1 = 1 AND id3 = 101;
QUERY PLAN
---------------------------------------------------------------------------------------------------------------------------
Index Scan using test_idx on ikedamsh.test (cost=0.42..12630.10 rows=1 width=18) (actual time=0.175..279.819 rows=1 loops=1)
Output: id1, id2, id3, value
Index Cond: (test.id1 = 1) -- Change the output. Show only the bound quals.
Index Filter: (test.id3 = 101) -- New. Output quals which are not used as the bound quals
I think this is too easy to confuse with the pre-existing 'Filter'
condition, which you'll find on indexes with INCLUDE-d columns or filters on non-index
columns.
Thanks for your comment. I had forgotten about that case.
Furthermore, I think this is probably not helpful (maybe even harmful) for index types
like GIN and BRIN, where index searchkey order is mostly irrelevant to the index shape
and performance.
Yes, I expected that only the B-Tree index would support this feature.
Finally, does this change the index AM API? Does this add another scankey argument to
->amrescan?
Yes, I think so. But since I'd like users to be able to see that an inefficient index
scan will happen even without ANALYZE, I planned to change amcostestimate() for
"Index Filter" and amrescan() for "Rows Removed by Index Filter".
Rows Removed by Index Filter: 499999 -- New. Output when the ANALYZE option is specified
Separate from the changes to Index Cond/Index Filter output changes I think this can
be useful output, though I'd probably let the AM specify what kind of filter data to
display.
E.g. BRIN may well want to display how many ranges matched the predicate, vs how
many ranges were unsummarized and thus returned; two conditions which aren't as
easy to differentiate, but can be important when debugging query performance.
OK, thanks. I understood that it would be nice if we could customize the output to show
information specific to other index types like BRIN.
Planning Time: 0.354 ms
Execution Time: 279.908 ms
(7 rows)

Was this a test against the same dataset as the one you'd posted your measurements of
your first patchset with? The execution time seems to have slowed down quite
significantly, so if the test set is the same then this doesn't bode well for your patchset.
Yes. The reason is that the cache hit ratio was very low since I tested right after
restarting the machine. I should have added the BUFFERS option.
Regards,
--
Masahiro Ikeda
NTT DATA CORPORATION
+1 for the idea.
Thanks! I was interested in the test result that you shared.
Regards,
--
Masahiro Ikeda
NTT DATA CORPORATION
=# EXPLAIN (VERBOSE, ANALYZE) SELECT * FROM test WHERE id1 = 1 AND id3 = 101;
QUERY PLAN
-------------------------------------------------------------------------------------------------------------------------------
Index Scan using test_idx on ikedamsh.test (cost=0.42..12630.10 rows=1 width=18) (actual time=0.175..279.819 rows=1 loops=1)
Output: id1, id2, id3, value
Index Cond: (test.id1 = 1) -- Change the output. Show only the bound quals.
Index Filter: (test.id3 = 101) -- New. Output quals which are not used as the bound quals
Rows Removed by Index Filter: 499999 -- New. Output when ANALYZE option is specified
Planning Time: 0.354 ms
Execution Time: 279.908 ms
(7 rows)

I don't think we want to split these clauses. Index Cond should indicate the conditions applied
to the index scan. Bound quals should be listed separately even though they will have an
intersection with Index Cond. I am not sure whether "Index Filter" is the right name;
maybe "Index Bound Cond". But I don't know this area well enough to make a final call.
OK, I understood that it's better to only add new ones. I think "Index Filter" fits better than
"Index Bound Cond" if we introduce "Rows Removed By Index Filter".
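For intuition about why the bound/non-bound distinction matters, here is a rough Python model (an illustration only; the list of tuples stands in for the index, and all names are invented) of this thread's test table: the bound qual id1 = 1 narrows the scan to a contiguous slice of the index, while the non-bound qual id3 = 101 must be rechecked against every tuple in that slice:

```python
from bisect import bisect_left, bisect_right

# Model of the thread's test data: index on (id1, id2, id3) with
# id1 = i % 2 and id2 = id3 = i, for i in 1..1000000.
index = sorted((i % 2, i, i) for i in range(1, 1_000_001))

# Bound qual (shown as Index Cond): id1 = 1 gives a contiguous slice.
lo = bisect_left(index, (1,))
hi = bisect_right(index, (1, float("inf"), float("inf")))
scanned = index[lo:hi]

# Non-bound qual (shown as Index Filter): id3 = 101 is checked per tuple.
rows = [t for t in scanned if t[2] == 101]

print(len(scanned))              # 500000 index tuples visited ...
print(rows)                      # [(1, 101, 101)] ... to return 1 row
print(len(scanned) - len(rows))  # 499999: "Rows Removed by Index Filter"
```

The removed-rows count here reproduces the 499999 figure from the EXPLAIN sample above.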
About "Rows Removed by Index Filter": it's good to provide a number when ANALYZE is
specified, but it would also be better to show what was estimated. We do that elsewhere
(for cost and rows etc.), but doing it in this part of the plan output may not have a
precedent. I think we should try that and see what others think.
It's interesting! It's an idea that can be applied not only to multi-column indexes, right?
I will consider the implementation and discuss it in a new thread. However, I would like to
focus on the feature to output information about multi-column indexes at first.
Regards,
--
Masahiro Ikeda
NTT DATA CORPORATION
On Mon, 24 Jun 2024 at 14:42, Jelte Fennema-Nio <postgres@jeltef.nl> wrote:
On Mon, 24 Jun 2024 at 13:02, Matthias van de Meent
<boekewurm+postgres@gmail.com> wrote:

It does not really behave similarly: index scan keys (such as the
id3=101 scankey) don't require visibility checks in the btree code,
while the Filter condition _does_ require a visibility check, and
delegates the check to the table AM if the scan isn't Index-Only, or
if the VM didn't show all-visible during the check.

Any chance you could point me in the right direction for the
code/docs/comment about this? I'd like to learn a bit more about why
that is the case, because I didn't realize visibility checks worked
differently for index scan keys and Filter keys.

This can be derived by combining how Filter works (it only filters the returned live tuples)
and how Index-Only scans work (return the index tuple, unless !ALL_VISIBLE, in which
case the heap tuple is projected). There have been several threads more or less
recently that also touch this topic and closely related topics, e.g. [0][1].
Thanks! Now I understand the difference between an INCLUDE-based filter and an index filter.
As you can see, there's a huge difference in performance. Putting
both non-bound and "normal" filter clauses in the same Filter clause
will make it more difficult to explain performance issues based on
only the explain output.

Fair enough, that's of course the main point of this patch in the
first place: being able to better interpret the explain plan when you
don't have access to the schema. Still I think Filter is the correct
keyword for both, so how about we make it less confusing by making the
current "Filter" more specific by calling it something like "Non-key
Filter" or "INCLUDE Filter" and then call the other something like
"Index Filter" or "Secondary Bound Filter".

I'm not sure how debuggable explain plans are without access to the schema, especially
when VERBOSE isn't configured, so I would be hesitant to accept that as an argument
here.
IMHO, it's nice to be able to understand the differences between each
FILTER even without the VERBOSE option. (+1 for Jelte Fennema-Nio's idea)
Even without access to the schema, it would be possible to quickly know if
the plan is not as expected, and I believe there are virtually no disadvantages
to having multiple "XXX FILTER" outputs.
If it's better to output such information only with the VERBOSE option,
what do you think about the following idea?
* When the VERBOSE option is not specified, output as "Filter" in all cases
* When the VERBOSE option is specified, output as "Non-key Filter", "INCLUDE Filter"
and "Index Filter".
In addition, I think it would be good to mention the differences between each filter in
the documentation.
Regards,
--
Masahiro Ikeda
NTT DATA CORPORATION
On Fri, Jun 21, 2024 at 3:12 AM <Masahiro.Ikeda@nttdata.com> wrote:
Regarding the multicolumn B-Tree Index, I'm considering
if we can enhance the EXPLAIN output. There have been requests
for this from our customer.
I agree that this is a real problem -- I'm not surprised to hear that
your customer asked about it.
In the past, we've heard complaints about this problem from Markus Winand, too:
https://use-the-index-luke.com/sql/explain-plan/postgresql/filter-predicates
As it happens I have been thinking about this problem a lot recently.
Specifically the user-facing aspects, what we show in EXPLAIN. It is
relevant to my ongoing work on skip scan:
https://commitfest.postgresql.org/48/5081/
/messages/by-id/CAH2-Wzmn1YsLzOGgjAQZdn1STSG_y8qP__vggTaPAYXJP+G4bw@mail.gmail.com
Unfortunately, my patch will make the situation more complicated for
your patch. I would like to resolve the tension between the two
patches, but I'm not sure how to do that.
If you look at the example query that I included in my introductory
email on the skip scan thread (the query against the sales_mdam_paper
table), you'll see that my patch makes it go much faster. My patch
will effectively "convert" nbtree scan keys that would traditionally
have to use non-index-bound conditions/filter predicates, into
index-bound conditions/access predicates. This all happens at runtime,
during nbtree preprocessing (not during planning).
This may mean that your patch's approach of determining which
columns/scan keys are in which category (bound vs. non-bound) cannot
rely on its current approach of placing each type of clause into one
of two categories inside btcostestimate() -- the view that we see from
btcostestimate() will be made less authoritative by skip scan. What
actually matters is what happens during nbtree preprocessing, inside
_bt_preprocess_keys().
Unfortunately, this is even more complicated than it sounds. It would
be a good idea if we moved _bt_preprocess_keys() to plan time, so that
btcostestimate() operated off of authoritative information, rather
than independently figuring out the same details for the purposes of
costing. We've talked about this before, even [1]. That way your patch
could just work off of this authoritative information. But even that
doesn't necessarily fix the problem.
Note that the skip scan patch makes _bt_preprocess_keys()
*indiscriminately* "convert" *all* scan keys to index bound conditions
-- at least where that's possible at all. There are minor
implementation restrictions that mean that we can't always do that.
But overall, the patch more or less eliminates non-bound index
conditions. That is, it'll be rare to non-existent for nbtree to fail
to mark *any* scan key as SK_BT_REQFWD/SK_BT_REQBKWD. Technically
speaking, non-bound conditions mostly won't exist anymore.
Of course, this doesn't mean that the problem that your patch is
solving will actually go away. I fully expect that the skip scan patch
will merely make some scan keys "required-by-scan/index bound
condition scan keys in name only". Technically they won't be the
problematic kind of index condition, but that won't actually be true
in any practical sense. Because users (like your customer) will still
get full index scans, and be surprised, just like today.
As I explain in my email on the skip scan thread, I believe that the
patch's aggressive approach to "converting" scan keys is an advantage.
The amount of skipping that actually takes place should be decided
dynamically, at runtime. It is a decision that should be made at the
level of individual leaf pages (or small groups of leaf pages), not at
the level of the whole scan. The distinction between index bound
conditions and non-bound conditions becomes much more "squishy", which
is mostly (though not entirely) a good thing.
I really don't know what to do about this. As I said, I agree with the
general idea of this patch -- this is definitely a real problem. And,
I don't pretend that my skip scan patch will actually define the
problem out of existence (except perhaps in those cases that it
actually makes it much faster). Maybe we could make a guess (based on
statistics) whether or not any skip attributes will leave the
lower-order clauses as useful index bound conditions at runtime. But I
don't know...that condition is actually a "continuous" condition now
-- it is not a strict dichotomy (it is not either/or, but rather a
question of degree, perhaps on a scale of 0.0 - 1.0).
It's also possible that we should just do something simple, like your
patch, even though technically it won't really be accurate in cases
where skip scan is used to good effect. Maybe showing the "default
working assumption" about how the scan keys/clauses will behave at
runtime is actually the right thing to do. Maybe I am just
overthinking it.
[1]: /messages/by-id/2587523.1647982549@sss.pgh.pa.us
-- the final full paragraph mentions moving _bt_preprocess_keys() into
the planner
--
Peter Geoghegan
On Thu, 27 Jun 2024 at 22:02, Peter Geoghegan <pg@bowt.ie> wrote:
It's also possible that we should just do something simple, like your
patch, even though technically it won't really be accurate in cases
where skip scan is used to good effect. Maybe showing the "default
working assumption" about how the scan keys/clauses will behave at
runtime is actually the right thing to do. Maybe I am just
overthinking it.
IIUC, you're saying that your skip scan will improve the situation
Masahiro describes dramatically in some/most cases. But it still won't
be as good as a pure index "prefix" scan.
If that's the case then I do think you're overthinking this a bit.
Because then you'd still want to see this difference between the
prefix-scan keys and the skip-scan keys. I think the main thing that
the introduction of the skip scan changes is the name that we should
show, e.g. instead of "Non-key Filter" we might want to call it "Skip
Scan Cond".
I do think though that in addition to a "Skip Scan Filtered" count for
ANALYZE, it would be very nice to also get a "Skip Scan Skipped" count
(if that's possible to measure/estimate somehow). This would allow
users to determine how effective the skip scan was, i.e. were they
able to skip over large swaths of the index? Or did they skip over
nothing because the second column of the index (on which there was no
filter) was unique within the table.
On Thu, Jun 27, 2024 at 4:46 PM Jelte Fennema-Nio <postgres@jeltef.nl> wrote:
On Thu, 27 Jun 2024 at 22:02, Peter Geoghegan <pg@bowt.ie> wrote:
It's also possible that we should just do something simple, like your
patch, even though technically it won't really be accurate in cases
where skip scan is used to good effect. Maybe showing the "default
working assumption" about how the scan keys/clauses will behave at
runtime is actually the right thing to do. Maybe I am just
overthinking it.

IIUC, you're saying that your skip scan will improve the situation
Masahiro describes dramatically in some/most cases.
"Most cases" seems likely to be overstating it. Overall, I doubt that
it makes sense to try to generalize like that.
The breakdown of the cases that we see in the field right now
(whatever it really is) is bound to be strongly influenced by the
current capabilities of Postgres. If something is intolerably slow,
then it just isn't tolerated. If something works adequately, then
users don't usually care why it is so.
But it still won't
be as good as a pure index "prefix" scan.
Typically, no, it won't be. But there's really no telling for sure.
The access patterns for a composite index on '(a, b)' with a qual
"WHERE b = 5" are identical to a qual explicitly written "WHERE a =
any(<every possible value in 'a'>) AND b = 5".
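As a sanity check of that equivalence, here is a toy Python model (illustrative only; the probe loop is not PostgreSQL's implementation): driving the scan as "a = ANY(<every distinct a>) AND b = 5" touches the index once per distinct leading value and yields exactly the rows that a plain filter on b would:

```python
from bisect import bisect_left

# Composite "index" on (a, b): 10 distinct leading values, 100 b's each.
index = sorted((a, b) for a in range(10) for b in range(100))
distinct_a = sorted({t[0] for t in index})

# WHERE a = ANY(<every possible value in 'a'>) AND b = 5:
# one bounded probe per distinct leading value.
probes, matches = 0, []
for a in distinct_a:
    j = bisect_left(index, (a, 5))
    probes += 1
    if j < len(index) and index[j] == (a, 5):
        matches.append(index[j])

# Same rows as the plain "WHERE b = 5" filter over the whole index ...
print(matches == [t for t in index if t[1] == 5])  # True
# ... but reached with 10 probes instead of 1000 tuple visits.
print(probes)  # 10
```

Whether the probing strategy actually wins depends on the cardinality of 'a', which is exactly why the bound/non-bound distinction becomes "squishy" under skip scan.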
If that's the case then I do think you're overthinking this a bit.
Because then you'd still want to see this difference between the
prefix-scan keys and the skip-scan keys. I think the main thing that
the introduction of the skip scan changes is the name that we should
show, e.g. instead of "Non-key Filter" we might want to call it "Skip
Scan Cond"
What about cases where we legitimately have to vary our strategy
during the same index scan? We might very well be able to skip over
many leaf pages when scanning through a low cardinality subset of the
index (low cardinality in respect of a leading column 'a'). Then we
might find that there are long runs on leaf pages where no skipping is
possible.
I don't expect this to be uncommon. I do expect it to happen when the
optimizer wasn't particularly expecting it. Like when a full index
scan was the fastest plan anyway. Or when a skip scan wasn't quite as
good as expected, but nevertheless turned out to be the fastest plan.
I do think though that in addition to a "Skip Scan Filtered" count for
ANALYZE, it would be very nice to also get a "Skip Scan Skipped" count
(if that's possible to measure/estimate somehow). This would allow
users to determine how effective the skip scan was, i.e. were they
able to skip over large swaths of the index? Or did they skip over
nothing because the second column of the index (on which there was no
filter) was unique within the table
Yeah, EXPLAIN ANALYZE should probably be showing something about
skipping. That provides us with a way of telling the user what really
happened, which could help when EXPLAIN output alone turns out to be
quite misleading.
In fact, that'd make sense even today, without skip scan (just with
the 17 work on nbtree SAOP scans). Even with regular SAOP nbtree index
scans, the number of primitive scans is hard to predict, and quite
indicative of what's really going on with the scan.
--
Peter Geoghegan
On Thu, 27 Jun 2024 at 22:02, Peter Geoghegan <pg@bowt.ie> wrote:
Unfortunately, my patch will make the situation more complicated
for your patch. I would like to resolve the tension between the
two patches, but I'm not sure how to do that.
OK. I would like to understand more about your proposed patch. I
have also registered as a reviewer in the commitfests entry.
On 2024-06-28 07:40, Peter Geoghegan wrote:
On Thu, Jun 27, 2024 at 4:46 PM Jelte Fennema-Nio <postgres@jeltef.nl> wrote:
I do think though that in addition to a "Skip Scan Filtered" count for
ANALYZE, it would be very nice to also get a "Skip Scan Skipped" count
(if that's possible to measure/estimate somehow). This would allow
users to determine how effective the skip scan was, i.e. were they
able to skip over large swaths of the index? Or did they skip over
nothing because the second column of the index (on which there was no
filter) was unique within the table.

Yeah, EXPLAIN ANALYZE should probably be showing something about
skipping. That provides us with a way of telling the user what really
happened, which could help when EXPLAIN output alone turns out to be
quite misleading.

In fact, that'd make sense even today, without skip scan (just with
the 17 work on nbtree SAOP scans). Even with regular SAOP nbtree index
scans, the number of primitive scans is hard to predict, and quite
indicative of what's really going on with the scan.
I agree as well.
Although I haven't looked at your patch yet, if it's difficult to know
how it will optimize during the planning phase, it's enough for me to just
show "Skip Scan Cond" (or "Non-Key Filter"). This is because users can
understand that inefficient index scans *may* occur.
If users want more detail, they can execute "EXPLAIN ANALYZE". This will
allow them to understand the execution effectively and determine if there
is any room to optimize the plan by looking at the counter of
"Skip Scan Filtered (or Skip Scan Skipped)".
In terms of the concept of EXPLAIN output, I think runtime partition
pruning is similar: "EXPLAIN without ANALYZE" only shows the possibilities and
"EXPLAIN ANALYZE" shows the actual results.
Regards,
--
Masahiro Ikeda
NTT DATA CORPORATION
On Fri, 28 Jun 2024 at 00:41, Peter Geoghegan <pg@bowt.ie> wrote:
Typically, no, it won't be. But there's really no telling for sure.
The access patterns for a composite index on '(a, b)' with a qual
"WHERE b = 5" are identical to a qual explicitly written "WHERE a =
any(<every possible value in 'a'>) AND b = 5".
Hmm, that's true. But in that case the explain plan gives a clear hint
that something like that might be going on, because you'll see:
Index Cond: a = any(<every possible value in 'a'>) AND b = 5
That does make me think of another way, and maybe more "correct" way,
of representing Masahiro's situation than adding a new "Skip Scan Cond"
row to the EXPLAIN output. We could explicitly include a comparison to
all prefix columns in the Index Cond:
Index Cond: ((test.id1 = 1) AND (test.id2 = ANY) AND (test.id3 = 101))
Or if you want to go with syntactically correct SQL we could do the following:
Index Cond: ((test.id1 = 1) AND ((test.id2 IS NULL) OR (test.id2 IS
NOT NULL)) AND (test.id3 = 101))
An additional benefit this provides is that you now know which
additional column you should use a more specific filter on to speed up
the query. In this case test.id2
OTOH, maybe it's not a great way because actually running that puts
the IS NULL + IS NOT NULL condition in the Filter clause (which honestly
surprises me because I had expected this "always true expression"
would have been optimized away by the planner).
EXPLAIN (VERBOSE, ANALYZE) SELECT id1, id2, id3 FROM test WHERE id1 = 1 AND (id2 IS NULL OR id2 IS NOT NULL) AND id3 = 101;
QUERY PLAN
─────────────────────────────────────────────────────
Index Only Scan using test_idx on public.test (cost=0.42..12809.10 rows=1 width=12) (actual time=0.027..11.234 rows=1 loops=1)
Output: id1, id2, id3
Index Cond: ((test.id1 = 1) AND (test.id3 = 101))
Filter: ((test.id2 IS NULL) OR (test.id2 IS NOT NULL))
What about cases where we legitimately have to vary our strategy
during the same index scan?
Would my new suggestion above address this?
In fact, that'd make sense even today, without skip scan (just with
the 17 work on nbtree SAOP scans). Even with regular SAOP nbtree index
scans, the number of primitive scans is hard to predict, and quite
indicative of what's really going on with the scan.
*googles nbtree SAOP scans and finds the very helpful [1]*
Yes, I feel like this should definitely be part of the ANALYZE output.
Seeing how Lukas has to look at pg_stat_user_tables to get this
information seems quite annoying [2] and only possible on systems that
have no concurrent queries.
So it sounds like we'd want a "Primitive Index Scans" counter in
ANALYZE too, in addition to the number of rows filtered, which (if
we go with my proposal above) should probably be called "Rows Removed
by Index Cond".
[1]: https://www.youtube.com/watch?v=jg2KeSB5DR8
[2]: https://youtu.be/jg2KeSB5DR8?t=188