BUG #16303: A condtion whether an index-only scan is possible includes a wrong
The following bug has been logged on the website:
Bug reference: 16303
Logged by: Horimoto Yasuhiro
Email address: horimoto@clear-code.com
PostgreSQL version: 12.2
Operating system: Debian 10.3
Description:
Hello, developers.
I think that the condition of whether an index-only scan is possible
includes a wrong.
For example, in the following case, the index has no data to return. Because
the query doesn't use specify columns.
However, the query planner choice index-only scan.
create table gist_count_tbl (tsv tsvector);
insert into gist_count_tbl values (null);
create index gist_count_tbl_index on gist_count_tbl using gist (tsv);
vacuum analyze gist_count_tbl;
set enable_seqscan=off;
set enable_bitmapscan=off;
set enable_indexonlyscan=on;
explain (costs off)
select count(*) from gist_count_tbl;
QUERY PLAN
--------------------------------------------------------------------
Aggregate
-> Index Only Scan using gist_count_tbl_index on gist_count_tbl
(2 rows)
In my opinion, we expected that the query planner doesn't choose an
index-only scan in the above case.
In fact, index_canreturn_attrs of
https://github.com/postgres/postgres/blob/master/src/backend/optimizer/path/indxpath.c#L1951
is NULL in the above case.
thanks!
I send a patch for this problem.
thanks!
From: PG Bug reporting form <noreply@postgresql.org>
Subject: BUG #16303: A condtion whether an index-only scan is possible includes a wrong
Date: Mon, 16 Mar 2020 06:02:25 +0000
Show quoted text
The following bug has been logged on the website:
Bug reference: 16303
Logged by: Horimoto Yasuhiro
Email address: horimoto@clear-code.com
PostgreSQL version: 12.2
Operating system: Debian 10.3
Description:Hello, developers.
I think that the condition of whether an index-only scan is possible
includes a wrong.For example, in the following case, the index has no data to return. Because
the query doesn't use specify columns.
However, the query planner choice index-only scan.create table gist_count_tbl (tsv tsvector);
insert into gist_count_tbl values (null);
create index gist_count_tbl_index on gist_count_tbl using gist (tsv);vacuum analyze gist_count_tbl;
set enable_seqscan=off;
set enable_bitmapscan=off;
set enable_indexonlyscan=on;explain (costs off)
select count(*) from gist_count_tbl;
QUERY PLAN
--------------------------------------------------------------------
Aggregate
-> Index Only Scan using gist_count_tbl_index on gist_count_tbl
(2 rows)In my opinion, we expected that the query planner doesn't choose an
index-only scan in the above case.In fact, index_canreturn_attrs of
https://github.com/postgres/postgres/blob/master/src/backend/optimizer/path/indxpath.c#L1951
is NULL in the above case.thanks!
Attachments:
fix_check_index_only_scan.patchtext/x-patch; charset=us-asciiDownload+60-2
On Sunday, March 15, 2020, Horimoto Yasuhiro <horimoto@clear-code.com>
wrote:
I send a patch for this problem.!
From: PG Bug reporting form <noreply@postgresql.org>
Subject: BUG #16303: A condtion whether an index-only scan is possible
includes a wrong
Date: Mon, 16 Mar 2020 06:02:25 +0000I think that the condition of whether an index-only scan is possible
includes a wrong.For example, in the following case, the index has no data to return.
Because
the query doesn't use specify columns.
However, the query planner choice index-only scan.
In my opinion, we expected that the query planner doesn't choose an
index-only scan in the above case.
I don't see a behavioral bug here. I would expect that any index would be
an acceptable match for a query whose set of return columns is the empty
set. The empty set is a subset of all sets.
David J.
Hello,
Thank you for your comments.
I should have explained much more about this case. Sorry.
In this case, gistcanreturn() returns false:
https://github.com/postgres/postgres/blob/master/src/backend/access/gist/gistget.c#L798-L799
Because tsvector_ops provides GIST_COMPRESS_PROC:
https://github.com/postgres/postgres/blob/master/src/include/access/gist.h#L31
https://github.com/postgres/postgres/blob/master/src/include/catalog/pg_amproc.dat#L530-L532
and doesn't provide GIST_FETCH_PROC:and provides GIST_COMPRESS_PROC:
https://github.com/postgres/postgres/blob/master/src/include/access/gist.h#L37
https://github.com/postgres/postgres/blob/master/src/include/catalog/pg_amproc.dat#L524-L543
Reurning false from amcanreturn() function such as gistcanreturn() means the index can't support
index-only scans on the given column:
https://www.postgresql.org/docs/12/index-functions.html
bool
amcanreturn (Relation indexRelation, int attno);
Check whether the index can support index-only scans on the given column, by returning the
indexed column values for an index entry in the form of an IndexTuple.
So I think that choosing index-only scan for this case (false is returned by gistcanreturn() case)
isn't expected behavior.
Background:
We're developing a PostgreSQL index, PGroonga: https://pgroonga.github.io/
It supports index-only scans for most supported types but doesn't support
index-only scan for jsonb type. Because we can't reconstruct the original data
from data in indexed data.
We found the problem I reported with PGroonga:
https://github.com/pgroonga/pgroonga/issues/101
This problem causes an error because PGroonga can't return
IndexScanDesc::xs_itup but IndexScanDesc::xs_want_itup is true.
Because PostgreSQL requests index-only scan even when PGroonga
returns false by amcanreturn().
We've introduced the following workaround:
https://github.com/pgroonga/pgroonga/commit/0390f79f810c44422940c4ef701c186055b4899e
It returns a NULL tuple instead of real tuple data.
IMO, it's better that PostgreSQL doesn't choose index-only scan in this case.
Or index developers need to implement workaround in each index.
thanks!
From: "David G. Johnston" <david.g.johnston@gmail.com>
Subject: Re: BUG #16303: A condtion whether an index-only scan is possible includes a wrong
Date: Sun, 15 Mar 2020 23:35:53 -0700
Show quoted text
On Sunday, March 15, 2020, Horimoto Yasuhiro <horimoto@clear-code.com>
wrote:I send a patch for this problem.!
From: PG Bug reporting form <noreply@postgresql.org>
Subject: BUG #16303: A condtion whether an index-only scan is possible
includes a wrong
Date: Mon, 16 Mar 2020 06:02:25 +0000I think that the condition of whether an index-only scan is possible
includes a wrong.For example, in the following case, the index has no data to return.
Because
the query doesn't use specify columns.
However, the query planner choice index-only scan.In my opinion, we expected that the query planner doesn't choose an
index-only scan in the above case.I don't see a behavioral bug here. I would expect that any index would be
an acceptable match for a query whose set of return columns is the empty
set. The empty set is a subset of all sets.David J.
Horimoto Yasuhiro <horimoto@clear-code.com> writes:
So I think that choosing index-only scan for this case (false is returned by gistcanreturn() case)
isn't expected behavior.
You're mistaken. An index-only scan is not only useful but potentially
quite desirable for queries such as count(*); in the best case it ends
up just counting the index entries without ever visiting the heap at all.
It does look a bit weird that we might select an IOS for an index type
that can't ever return any column values, but so far as I can see it
should work fine. The index isn't being asked to do anything that it
doesn't have to do anyway, ie, return the correct set of heap TIDs.
Testing locally with a GIST index gives the right answers, too.
We found the problem I reported with PGroonga:
https://github.com/pgroonga/pgroonga/issues/101
This problem causes an error because PGroonga can't return
IndexScanDesc::xs_itup but IndexScanDesc::xs_want_itup is true.
Because PostgreSQL requests index-only scan even when PGroonga
returns false by amcanreturn().
You should be returning an empty, zero-column tuple (either itup
or htup format) in that situation. The limit of "I don't have
any columns I can return" is to form a tuple with no columns,
not to fail to form a tuple.
regards, tom lane
Hello,
Thank you for your explanation.
I understand what you say.
Thank you,
From: Tom Lane <tgl@sss.pgh.pa.us>
Subject: Re: BUG #16303: A condtion whether an index-only scan is possible includes a wrong
Date: Mon, 16 Mar 2020 11:20:43 -0400
Show quoted text
Horimoto Yasuhiro <horimoto@clear-code.com> writes:
So I think that choosing index-only scan for this case (false is returned by gistcanreturn() case)
isn't expected behavior.You're mistaken. An index-only scan is not only useful but potentially
quite desirable for queries such as count(*); in the best case it ends
up just counting the index entries without ever visiting the heap at all.It does look a bit weird that we might select an IOS for an index type
that can't ever return any column values, but so far as I can see it
should work fine. The index isn't being asked to do anything that it
doesn't have to do anyway, ie, return the correct set of heap TIDs.
Testing locally with a GIST index gives the right answers, too.We found the problem I reported with PGroonga:
https://github.com/pgroonga/pgroonga/issues/101This problem causes an error because PGroonga can't return
IndexScanDesc::xs_itup but IndexScanDesc::xs_want_itup is true.
Because PostgreSQL requests index-only scan even when PGroonga
returns false by amcanreturn().You should be returning an empty, zero-column tuple (either itup
or htup format) in that situation. The limit of "I don't have
any columns I can return" is to form a tuple with no columns,
not to fail to form a tuple.regards, tom lane