BUG #16303: A condtion whether an index-only scan is possible includes a wrong

Started by PG Bug reporting formabout 6 years ago6 messagesbugs
Jump to latest
#1PG Bug reporting form
noreply@postgresql.org

The following bug has been logged on the website:

Bug reference: 16303
Logged by: Horimoto Yasuhiro
Email address: horimoto@clear-code.com
PostgreSQL version: 12.2
Operating system: Debian 10.3
Description:

Hello, developers.

I think that the condition of whether an index-only scan is possible
includes a wrong.

For example, in the following case, the index has no data to return. Because
the query doesn't use specify columns.
However, the query planner choice index-only scan.

create table gist_count_tbl (tsv tsvector);
insert into gist_count_tbl values (null);
create index gist_count_tbl_index on gist_count_tbl using gist (tsv);

vacuum analyze gist_count_tbl;

set enable_seqscan=off;
set enable_bitmapscan=off;
set enable_indexonlyscan=on;

explain (costs off)
select count(*) from gist_count_tbl;
QUERY PLAN
--------------------------------------------------------------------
Aggregate
-> Index Only Scan using gist_count_tbl_index on gist_count_tbl
(2 rows)

In my opinion, we expected that the query planner doesn't choose an
index-only scan in the above case.

In fact, index_canreturn_attrs of
https://github.com/postgres/postgres/blob/master/src/backend/optimizer/path/indxpath.c#L1951
is NULL in the above case.

thanks!

#2Horimoto Yasuhiro
horimoto@clear-code.com
In reply to: PG Bug reporting form (#1)
Re: BUG #16303: A condtion whether an index-only scan is possible includes a wrong

I send a patch for this problem.

thanks!

From: PG Bug reporting form <noreply@postgresql.org>
Subject: BUG #16303: A condtion whether an index-only scan is possible includes a wrong
Date: Mon, 16 Mar 2020 06:02:25 +0000

Show quoted text

The following bug has been logged on the website:

Bug reference: 16303
Logged by: Horimoto Yasuhiro
Email address: horimoto@clear-code.com
PostgreSQL version: 12.2
Operating system: Debian 10.3
Description:

Hello, developers.

I think that the condition of whether an index-only scan is possible
includes a wrong.

For example, in the following case, the index has no data to return. Because
the query doesn't use specify columns.
However, the query planner choice index-only scan.

create table gist_count_tbl (tsv tsvector);
insert into gist_count_tbl values (null);
create index gist_count_tbl_index on gist_count_tbl using gist (tsv);

vacuum analyze gist_count_tbl;

set enable_seqscan=off;
set enable_bitmapscan=off;
set enable_indexonlyscan=on;

explain (costs off)
select count(*) from gist_count_tbl;
QUERY PLAN
--------------------------------------------------------------------
Aggregate
-> Index Only Scan using gist_count_tbl_index on gist_count_tbl
(2 rows)

In my opinion, we expected that the query planner doesn't choose an
index-only scan in the above case.

In fact, index_canreturn_attrs of
https://github.com/postgres/postgres/blob/master/src/backend/optimizer/path/indxpath.c#L1951
is NULL in the above case.

thanks!

Attachments:

fix_check_index_only_scan.patchtext/x-patch; charset=us-asciiDownload+60-2
#3David G. Johnston
david.g.johnston@gmail.com
In reply to: Horimoto Yasuhiro (#2)
Re: BUG #16303: A condtion whether an index-only scan is possible includes a wrong

On Sunday, March 15, 2020, Horimoto Yasuhiro <horimoto@clear-code.com>
wrote:

I send a patch for this problem.!

From: PG Bug reporting form <noreply@postgresql.org>
Subject: BUG #16303: A condtion whether an index-only scan is possible
includes a wrong
Date: Mon, 16 Mar 2020 06:02:25 +0000

I think that the condition of whether an index-only scan is possible
includes a wrong.

For example, in the following case, the index has no data to return.

Because

the query doesn't use specify columns.
However, the query planner choice index-only scan.

In my opinion, we expected that the query planner doesn't choose an
index-only scan in the above case.

I don't see a behavioral bug here. I would expect that any index would be
an acceptable match for a query whose set of return columns is the empty
set. The empty set is a subset of all sets.

David J.

#4Horimoto Yasuhiro
horimoto@clear-code.com
In reply to: David G. Johnston (#3)
Re: BUG #16303: A condtion whether an index-only scan is possible includes a wrong

Hello,
Thank you for your comments.

I should have explained much more about this case. Sorry.
In this case, gistcanreturn() returns false:

https://github.com/postgres/postgres/blob/master/src/backend/access/gist/gistget.c#L798-L799

Because tsvector_ops provides GIST_COMPRESS_PROC:

https://github.com/postgres/postgres/blob/master/src/include/access/gist.h#L31
https://github.com/postgres/postgres/blob/master/src/include/catalog/pg_amproc.dat#L530-L532

and doesn't provide GIST_FETCH_PROC:and provides GIST_COMPRESS_PROC:

https://github.com/postgres/postgres/blob/master/src/include/access/gist.h#L37
https://github.com/postgres/postgres/blob/master/src/include/catalog/pg_amproc.dat#L524-L543

Reurning false from amcanreturn() function such as gistcanreturn() means the index can't support
index-only scans on the given column:

https://www.postgresql.org/docs/12/index-functions.html

bool
amcanreturn (Relation indexRelation, int attno);

Check whether the index can support index-only scans on the given column, by returning the
indexed column values for an index entry in the form of an IndexTuple.

So I think that choosing index-only scan for this case (false is returned by gistcanreturn() case)
isn't expected behavior.

Background:
We're developing a PostgreSQL index, PGroonga: https://pgroonga.github.io/
It supports index-only scans for most supported types but doesn't support
index-only scan for jsonb type. Because we can't reconstruct the original data
from data in indexed data.

We found the problem I reported with PGroonga:
https://github.com/pgroonga/pgroonga/issues/101

This problem causes an error because PGroonga can't return
IndexScanDesc::xs_itup but IndexScanDesc::xs_want_itup is true.
Because PostgreSQL requests index-only scan even when PGroonga
returns false by amcanreturn().

We've introduced the following workaround:
https://github.com/pgroonga/pgroonga/commit/0390f79f810c44422940c4ef701c186055b4899e

It returns a NULL tuple instead of real tuple data.

IMO, it's better that PostgreSQL doesn't choose index-only scan in this case.
Or index developers need to implement workaround in each index.

thanks!

From: "David G. Johnston" <david.g.johnston@gmail.com>
Subject: Re: BUG #16303: A condtion whether an index-only scan is possible includes a wrong
Date: Sun, 15 Mar 2020 23:35:53 -0700

Show quoted text

On Sunday, March 15, 2020, Horimoto Yasuhiro <horimoto@clear-code.com>
wrote:

I send a patch for this problem.!

From: PG Bug reporting form <noreply@postgresql.org>
Subject: BUG #16303: A condtion whether an index-only scan is possible
includes a wrong
Date: Mon, 16 Mar 2020 06:02:25 +0000

I think that the condition of whether an index-only scan is possible
includes a wrong.

For example, in the following case, the index has no data to return.

Because

the query doesn't use specify columns.
However, the query planner choice index-only scan.

In my opinion, we expected that the query planner doesn't choose an
index-only scan in the above case.

I don't see a behavioral bug here. I would expect that any index would be
an acceptable match for a query whose set of return columns is the empty
set. The empty set is a subset of all sets.

David J.

#5Tom Lane
tgl@sss.pgh.pa.us
In reply to: Horimoto Yasuhiro (#4)
Re: BUG #16303: A condtion whether an index-only scan is possible includes a wrong

Horimoto Yasuhiro <horimoto@clear-code.com> writes:

So I think that choosing index-only scan for this case (false is returned by gistcanreturn() case)
isn't expected behavior.

You're mistaken. An index-only scan is not only useful but potentially
quite desirable for queries such as count(*); in the best case it ends
up just counting the index entries without ever visiting the heap at all.

It does look a bit weird that we might select an IOS for an index type
that can't ever return any column values, but so far as I can see it
should work fine. The index isn't being asked to do anything that it
doesn't have to do anyway, ie, return the correct set of heap TIDs.
Testing locally with a GIST index gives the right answers, too.

We found the problem I reported with PGroonga:
https://github.com/pgroonga/pgroonga/issues/101

This problem causes an error because PGroonga can't return
IndexScanDesc::xs_itup but IndexScanDesc::xs_want_itup is true.
Because PostgreSQL requests index-only scan even when PGroonga
returns false by amcanreturn().

You should be returning an empty, zero-column tuple (either itup
or htup format) in that situation. The limit of "I don't have
any columns I can return" is to form a tuple with no columns,
not to fail to form a tuple.

regards, tom lane

#6Horimoto Yasuhiro
horimoto@clear-code.com
In reply to: Tom Lane (#5)
Re: BUG #16303: A condtion whether an index-only scan is possible includes a wrong

Hello,

Thank you for your explanation.
I understand what you say.

Thank you,

From: Tom Lane <tgl@sss.pgh.pa.us>
Subject: Re: BUG #16303: A condtion whether an index-only scan is possible includes a wrong
Date: Mon, 16 Mar 2020 11:20:43 -0400

Show quoted text

Horimoto Yasuhiro <horimoto@clear-code.com> writes:

So I think that choosing index-only scan for this case (false is returned by gistcanreturn() case)
isn't expected behavior.

You're mistaken. An index-only scan is not only useful but potentially
quite desirable for queries such as count(*); in the best case it ends
up just counting the index entries without ever visiting the heap at all.

It does look a bit weird that we might select an IOS for an index type
that can't ever return any column values, but so far as I can see it
should work fine. The index isn't being asked to do anything that it
doesn't have to do anyway, ie, return the correct set of heap TIDs.
Testing locally with a GIST index gives the right answers, too.

We found the problem I reported with PGroonga:
https://github.com/pgroonga/pgroonga/issues/101

This problem causes an error because PGroonga can't return
IndexScanDesc::xs_itup but IndexScanDesc::xs_want_itup is true.
Because PostgreSQL requests index-only scan even when PGroonga
returns false by amcanreturn().

You should be returning an empty, zero-column tuple (either itup
or htup format) in that situation. The limit of "I don't have
any columns I can return" is to form a tuple with no columns,
not to fail to form a tuple.

regards, tom lane