回复: BUG #18437: The index scan result is more than the full scan result, and the primary key index has duplicate val

Started by 1165125080about 2 years ago1 messagesbugs
Jump to latest
#11165125080
1165125080@qq.com

>> set enable_indexscan = off;
>> select count(id) from dste_smt.dste_role_t;
>>  count 
>> --------
>>  125680
>> (1 row)
>>
>> But with index only scan, the number of rows is more than the real number.
>> dste_pg_db=# select count(col4) from dste_smt.dste_role_t;
 >> count  
>> --------
 >> 126847
>> (1 row)

> Yes, that must be data corruption.
>
> You'll have to identify and delete duplicate values, then rebuild the indexes.
>
> That *might* be caused by a PostgreSQL bug, and it might well be a bug that
> got fixed since 12.6.  Hard to tell.  You should have applied the latest
> minor release (but that cannot fix the problem now).

After testing, I found that the problem was with the table's vm visibility mapping file.
When I replace the problem table vm file with an all-0 vm file that uses an empty table.

$ hexdump 115327046_vm
0000000 0e3f 0000 0110 9400 0000 0000 0018 2000
0000010 2000 2004 0000 0000 0001 0000 0000 0000
0000020 0000 0000 0000 0000 0000 0000 0000 0000
*
0002000

I found that the index only scan result became the same as the seq scan result.
dste_pg_db=> explain analyze select count(col4) from sch1.tb1;
                                                       QUERY PLAN
------------------------------------------------------------------------------------------------------------------------
Aggregate (cost=2604.43..2604.44 rows=1 width=8) (actual time=98.969..98.970 rows=1 loops=1) 
  -> Index only using dste_col_i_2 on sch1.tb1  (cost=0.42..2291.20 rows=125292 width=2) (actual time=0.137..82.381 rows=125680 loops=1)
         Heap Fetches: 126847
 Planning time: 1.050 ms
 Execution time: 99.057 ms
(6 rows)

dste_pg_db=> select count(col4) from sch1.tb1;
select count(id) from dste_smt.dste_role_t;
 count  
--------
 125680
(1 row)

The problem should be in the vm file, is there any possible reason for this?

------------------ 原始邮件 ------------------
发件人: "Laurenz Albe"<laurenz.albe@cybertec.at&gt;;
发送时间: 2024年4月15日(星期一) 晚上10:18
收件人: "1165125080"<1165125080@qq.com&gt;; "pgsql-bugs"<pgsql-bugs@lists.postgresql.org&gt;;
主题: Re: BUG #18437: The index scan result is more than the full scan result, and the primary key index has duplicate val

2024-04-15 at 13:25 +0000, PG Bug reporting form wrote:
&gt; PostgreSQL version: 12.6
&gt;
&gt; I have a table sch.tb1 with the following structure:
&gt;
&gt; dste_pg_db=# \d sch.tb1
&gt; ...
&gt; Indexes:
&gt;&nbsp;&nbsp;&nbsp;&nbsp; "tb1_id_pkey" PRIMARY KEY, btree (id)
&gt;&nbsp;&nbsp;&nbsp;&nbsp; "dste_col_i_1" btree (col3)
&gt;&nbsp;&nbsp;&nbsp;&nbsp; "dste_col_i_2" btree (col4)
&gt;&nbsp;&nbsp;&nbsp;&nbsp; "dste_col_i_3" btree (col7)
&gt;&nbsp;&nbsp;&nbsp;&nbsp; "dste_col_i_4" btree (col11)
&gt;&nbsp;&nbsp;&nbsp;&nbsp; "dste_col_i_5" btree (col1)
&gt;&nbsp;&nbsp;&nbsp;&nbsp; "dste_col_i_6" btree (col8)
&gt;
&gt;
&gt; Use seq scan,the number of lines is 125680
&gt;
&gt; set enable_indexscan = off;
&gt; select count(id) from dste_smt.dste_role_t;
&gt;&nbsp; count&nbsp;
&gt; --------
&gt;&nbsp; 125680
&gt; (1 row)
&gt;
&gt; But with index only scan, the number of rows is more than the real number.
&gt;
&gt; All indexes are, including primary keys.
&gt;
&gt; set enable_indexscan = on;
&gt; dste_pg_db=# select count(id) from dste_smt.dste_role_t;
&gt;&nbsp; count&nbsp;
&gt; --------
&gt;&nbsp; 125684
&gt; (1 row)

Yes, that must be data corruption.

You'll have to identify and delete duplicate values, then rebuild the indexes.

That *might* be caused by a PostgreSQL bug, and it might well be a bug that
got fixed since 12.6.&nbsp; Hard to tell.&nbsp; You should have applied the latest
minor release (but that cannot fix the problem now).

Without a way to reproduce this in PostgreSQL 12.18, there is little we can do.

Yours,
Laurenz Albe