Index plan returns different results to sequential scan

Started by John Burnsabout 2 years ago10 messagesbugs
Jump to latest
#1John Burns
john@impactdatametrics.com

Hi…

We’ve been using the postcode extension from PGXN for a number of years and it has not caused any problems until now.
The original developer is no longer around, but a couple of minor changes have kept it operating since version 9.

There is an operator (%) and underlying ‘C’ function (postcode_eq_partial) that provides a containment test
e.g. ‘NW10 5AQ’ % ‘NW10’ is true , ‘L17 3PB’ % ‘NW10’ is false etc.

If we run a query against a table with a postcode field but no index using the % operator we get the correct result.

If we run the same query against the table after adding a tree index on the postcode result we get few fewer results.

If we drop and reindex we get a different number of results.

The query is SELECT * FROM XXX where postcode % ’NW10’
To create a sample table — create table XXX ( udprn bigint, postcode postcode )
To Index it CREATE INDEX on XXX(postcode)

The underlying representation of the postcode is a 32 bit integer, so not especially esoteric.

Although the postcode package is probably not especially significant in the scheme of things , the behaviour of indexed versus non-indexed queries is worrisome.

I’ve had to make a couple of minor changes to the postcode package as the Postgres versions have changed, i.e. define TRUE and FALSE in the c header files and change the location of the include files when Postgres 16 arrived, but nothing else.

I’ve attached a sample data file to populate a table that exhibits the behaviour, and the tweaked version of the Postcode package.

Regards, John Burns

Version : PostgreSQL 16.2 (Ubuntu 16.2-1.pgdg22.04+1) on x86_64-pc-linux-gnu, compiled by gcc (Ubuntu 11.4.0-1ubuntu1~22.04) 11.4.0, 64-bit

john@impactdatametrics.com

Attachments:

Archive.zipapplication/zip; name=Archive.zip; x-unix-mode=0644Download+5-3
#2Tomas Vondra
tomas.vondra@2ndquadrant.com
In reply to: John Burns (#1)
Re: Index plan returns different results to sequential scan

On 3/21/24 18:25, John Burns wrote:

Hi…

...

Although the postcode package is probably not especially significant
in the scheme of things , the behaviour of indexed versus non-indexed
queries is worrisome.

Seems like a thinko in the indexing, but that's just a guess.

I’ve had to make a couple of minor changes to the postcode package
as the Postgres versions have changed, i.e. define TRUE and FALSE in
the c header files and change the location of the include files when
Postgres 16 arrived, but nothing else.

Can you share a patch with the tweaks?

I’ve attached a sample data file to populate a table that exhibits
the behaviour, and the tweaked version of the Postcode package.

I don't see any attachments :-(

regards

--
Tomas Vondra
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company

#3Tomas Vondra
tomas.vondra@2ndquadrant.com
In reply to: John Burns (#1)
Re: Index plan returns different results to sequential scan

On 3/21/24 21:08, John Burns wrote:

On 21 Mar 2024, at 20:27, Tomas Vondra <tomas.vondra@enterprisedb.com> wrote:

Sorry about that … the patched version of the package, plus a sample data set is in the attached zip file,

-John

Thanks, I can reproduce the issue. I don't know why is it happening, but
the behavior I see is that the (postcode % 'NW10') query somehow misses
rows with postcodes 'NW10 0AA' - 'NW10 3NL' when executed using index.

I don't see anything obviously wrong in the extension / index pages.

You suggested it used to work OK and then broke. Do you perhaps know at
which version it broke? Or did you use 9.x until now, upgrades to 16 and
realized it's broken?

regards

--
Tomas Vondra
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company

#4Tomas Vondra
tomas.vondra@2ndquadrant.com
In reply to: Tomas Vondra (#3)
Re: Index plan returns different results to sequential scan

On 3/22/24 00:58, Tomas Vondra wrote:

On 3/21/24 21:08, John Burns wrote:

On 21 Mar 2024, at 20:27, Tomas Vondra <tomas.vondra@enterprisedb.com> wrote:

Sorry about that … the patched version of the package, plus a sample data set is in the attached zip file,

-John

Thanks, I can reproduce the issue. I don't know why is it happening, but
the behavior I see is that the (postcode % 'NW10') query somehow misses
rows with postcodes 'NW10 0AA' - 'NW10 3NL' when executed using index.

I don't see anything obviously wrong in the extension / index pages.

You suggested it used to work OK and then broke. Do you perhaps know at
which version it broke? Or did you use 9.x until now, upgrades to 16 and
realized it's broken?

After some bisecting, this seems to be broken since this PG12 commit:

https://git.postgresql.org/gitweb/?p=postgresql.git;a=commit;h=dd299df8189bd00fbe54b72c64f43b6af2ffeccd

==========================================================================
commit dd299df8189bd00fbe54b72c64f43b6af2ffeccd
tree 931ef720687d61cf5e75464fa0b1c1d75fb3f9d3 tree
parent e5adcb789d80ba565ccacb1ed4341a7c29085238 commit | diff
Make heap TID a tiebreaker nbtree index column.

Make nbtree treat all index tuples as having a heap TID attribute.
Index searches can distinguish duplicates by heap TID, since heap TID is
always guaranteed to be unique. This general approach has numerous
benefits for performance, and is prerequisite to teaching VACUUM to
perform "retail index tuple deletion".

...
==========================================================================

I don't know what the problem is, or whether it's a big in PG or in the
postcode extension (I agree the extension is fairly straightforward).

regards

--
Tomas Vondra
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company

In reply to: Tomas Vondra (#4)
Re: Index plan returns different results to sequential scan

On Thu, Mar 21, 2024 at 9:22 PM Tomas Vondra
<tomas.vondra@enterprisedb.com> wrote:

I don't know what the problem is, or whether it's a big in PG or in the
postcode extension (I agree the extension is fairly straightforward).

Could you try running amcheck's bt_index_check function against the index?

--
Peter Geoghegan

#6John Burns
john@impactdatametrics.com
In reply to: Tomas Vondra (#3)
Re: Index plan returns different results to sequential scan

I’m sorry, but I can’t really say when it broke — we suspect when we went from version 14 to 16 — we use this functionality a lot, and we would have noticed long ago if it was wrong.

As additional information, I created a new operator that used another function — wrapping the function postcode_cmp_partial(...) = 0 — and it exhibited the same behaviour.
Not surprised since the postcode_eq_partial is a trivially simple function.

— John Burns

Show quoted text

On 22 Mar 2024, at 00:58, Tomas Vondra <tomas.vondra@enterprisedb.com> wrote:

On 3/21/24 21:08, John Burns wrote:

On 21 Mar 2024, at 20:27, Tomas Vondra <tomas.vondra@enterprisedb.com> wrote:

Sorry about that … the patched version of the package, plus a sample data set is in the attached zip file,

-John

Thanks, I can reproduce the issue. I don't know why is it happening, but
the behavior I see is that the (postcode % 'NW10') query somehow misses
rows with postcodes 'NW10 0AA' - 'NW10 3NL' when executed using index.

I don't see anything obviously wrong in the extension / index pages.

You suggested it used to work OK and then broke. Do you perhaps know at
which version it broke? Or did you use 9.x until now, upgrades to 16 and
realized it's broken?

regards

--
Tomas Vondra
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company

#7Tomas Vondra
tomas.vondra@2ndquadrant.com
In reply to: Peter Geoghegan (#5)
Re: Index plan returns different results to sequential scan

On 3/22/24 02:40, Peter Geoghegan wrote:

On Thu, Mar 21, 2024 at 9:22 PM Tomas Vondra
<tomas.vondra@enterprisedb.com> wrote:

I don't know what the problem is, or whether it's a big in PG or in the
postcode extension (I agree the extension is fairly straightforward).

Could you try running amcheck's bt_index_check function against the index?

I tried (on master). It doesn't report anything:

test=# select bt_index_check('xxx_postcode_idx', 't');
bt_index_check
----------------

(1 row)

I also tried looking at the index using bt_page_items from pageinspect,
and I did not notice anything obviously wrong. But I don't have much
experience with btree at such low level ...

regards

--
Tomas Vondra
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company

#8Tomas Vondra
tomas.vondra@2ndquadrant.com
In reply to: Tomas Vondra (#7)
Re: Index plan returns different results to sequential scan

On 3/22/24 08:43, Tomas Vondra wrote:

On 3/22/24 02:40, Peter Geoghegan wrote:

On Thu, Mar 21, 2024 at 9:22 PM Tomas Vondra
<tomas.vondra@enterprisedb.com> wrote:

I don't know what the problem is, or whether it's a big in PG or in the
postcode extension (I agree the extension is fairly straightforward).

Could you try running amcheck's bt_index_check function against the index?

I tried (on master). It doesn't report anything:

test=# select bt_index_check('xxx_postcode_idx', 't');
bt_index_check
----------------

(1 row)

I also tried looking at the index using bt_page_items from pageinspect,
and I did not notice anything obviously wrong. But I don't have much
experience with btree at such low level ...

I realized I can create an index on e5adcb789, apply dd299df81 and
create another index, and then compare what bt_page_items says for those.

FWIW xxx_postcode_idx (created before e5adcb789) keeps returning the
correct data even after dd299df81 gets applied.

I'll only show 5 rows - the only difference is in the first item (which
I guess is hikey?), all other items are exactly the same.

For the index created on e5adcb789 (that returns the correct results
even with the new build), I get this:

test=# select * from bt_page_items(get_Raw_page('xxx_postcode_idx', 1))
limit 5;
itemoffset | ctid | itemlen | nulls | vars | data
------------+--------+---------+-------+------+-------------------------
1 | (2,15) | 16 | f | f | ce 51 20 50 00 00 00 00
2 | (0,1) | 16 | f | f | 27 48 20 01 00 00 00 00
3 | (0,2) | 16 | f | f | 30 48 20 01 00 00 00 00
4 | (0,3) | 16 | f | f | 41 48 20 01 00 00 00 00
5 | (0,4) | 16 | f | f | 41 48 20 01 00 00 00 00
(5 rows)

test=# select * from bt_page_items(get_Raw_page('xxx_postcode_idx', 2))
limit 5;
itemoffset | ctid | itemlen | nulls | vars | data
------------+--------+---------+-------+------+-------------------------
1 | (4,5) | 16 | f | f | 25 60 20 50 00 00 00 00
2 | (2,15) | 16 | f | f | ce 51 20 50 00 00 00 00
3 | (2,16) | 16 | f | f | d3 51 20 50 00 00 00 00
4 | (2,17) | 16 | f | f | d3 51 20 50 00 00 00 00
5 | (2,18) | 16 | f | f | d3 51 20 50 00 00 00 00
(5 rows)

test=# select * from bt_page_items(get_Raw_page('xxx_postcode_idx', 3))
limit 5;
itemoffset | ctid | itemlen | nulls | vars | data
------------+--------+---------+-------+------+-------------------------
1 | (1,0) | 8 | f | f |
2 | (2,15) | 16 | f | f | ce 51 20 50 00 00 00 00
3 | (4,5) | 16 | f | f | 25 60 20 50 00 00 00 00
(3 rows)

test=# select * from bt_page_items(get_Raw_page('xxx_postcode_idx', 4))
limit 5;
itemoffset | ctid | itemlen | nulls | vars | data
------------+-------+---------+-------+------+-------------------------
1 | (4,5) | 16 | f | f | 25 60 20 50 00 00 00 00
2 | (4,6) | 16 | f | f | 25 60 20 50 00 00 00 00
3 | (4,7) | 16 | f | f | 25 60 20 50 00 00 00 00
4 | (4,8) | 16 | f | f | 25 60 20 50 00 00 00 00
5 | (4,9) | 16 | f | f | 30 60 20 50 00 00 00 00
(5 rows)

while for the index created with dd299df81 (and returning incomplete
results) I get this:

test=# select * from bt_page_items(get_Raw_page('xxx2_postcode_idx', 1))
limit 5;
itemoffset | ctid | itemlen | nulls | vars | data
------------+-------+---------+-------+------+-------------------------
1 | (2,1) | 16 | f | f | ce 51 20 50 00 00 00 00
2 | (0,1) | 16 | f | f | 27 48 20 01 00 00 00 00
3 | (0,2) | 16 | f | f | 30 48 20 01 00 00 00 00
4 | (0,3) | 16 | f | f | 41 48 20 01 00 00 00 00
5 | (0,4) | 16 | f | f | 41 48 20 01 00 00 00 00
(5 rows)

test=# select * from bt_page_items(get_Raw_page('xxx2_postcode_idx', 2))
limit 5;
itemoffset | ctid | itemlen | nulls | vars |
data
------------+----------+---------+-------+------+-------------------------------------------------
1 | (4,4097) | 24 | f | f | 25 60 20 50 00 00 00
00 00 00 00 00 04 00 04 00
2 | (2,15) | 16 | f | f | ce 51 20 50 00 00 00 00
3 | (2,16) | 16 | f | f | d3 51 20 50 00 00 00 00
4 | (2,17) | 16 | f | f | d3 51 20 50 00 00 00 00
5 | (2,18) | 16 | f | f | d3 51 20 50 00 00 00 00
(5 rows)

test=# select * from bt_page_items(get_Raw_page('xxx2_postcode_idx', 3))
limit 5;
itemoffset | ctid | itemlen | nulls | vars |
data
------------+----------+---------+-------+------+-------------------------------------------------
1 | (1,0) | 8 | f | f |
2 | (2,1) | 16 | f | f | ce 51 20 50 00 00 00 00
3 | (4,4097) | 24 | f | f | 25 60 20 50 00 00 00
00 00 00 00 00 04 00 04 00
(3 rows)

test=# select * from bt_page_items(get_Raw_page('xxx2_postcode_idx', 4))
limit 5;
itemoffset | ctid | itemlen | nulls | vars | data
------------+-------+---------+-------+------+-------------------------
1 | (4,5) | 16 | f | f | 25 60 20 50 00 00 00 00
2 | (4,6) | 16 | f | f | 25 60 20 50 00 00 00 00
3 | (4,7) | 16 | f | f | 25 60 20 50 00 00 00 00
4 | (4,8) | 16 | f | f | 25 60 20 50 00 00 00 00
5 | (4,9) | 16 | f | f | 30 60 20 50 00 00 00 00
(5 rows)

regards

--
Tomas Vondra
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company

In reply to: John Burns (#1)
Re: Index plan returns different results to sequential scan

On Thu, Mar 21, 2024 at 2:03 PM John Burns <john@impactdatametrics.com> wrote:

The query is SELECT * FROM XXX where postcode % ’NW10’
To create a sample table — create table XXX ( udprn bigint, postcode postcode )
To Index it CREATE INDEX on XXX(postcode)

The opfamily's % operator uses the B-Tree equality strategy. This
means that it works the same way as = works in most other opfamilies.

I don't see how equality can work reliably here. A query with a
predicate "WHERE my_indexed_postcode_column % ‘NW10’" seems to work by
masking the value stored in the index, throwing away some amount of
suffix bytes in the process. But the values from the index are still
stored in their original order -- the temporarily masked suffix bytes
aren't masked in the index, of course (they're only masked
temporarily, by the cross-type equality operator %).

Wouldn't you need something closer to "WHERE
my_indexed_postcode_column >= ‘NW10’ and my_indexed_postcode_column <
‘NW11’" for this to work reliably?

The relevant rules for btree operator families are described here:

https://www.postgresql.org/docs/devel/btree-behavior.html

Offhand, I suspect that you don't see problems pre-12 B-Tree because
the B-Tree code happened to have been more forgiving of opfamilies
that were broken in this way. Earlier versions treated < and <= as the
same thing in certain contexts.

--
Peter Geoghegan

#10Tomas Vondra
tomas.vondra@2ndquadrant.com
In reply to: Peter Geoghegan (#9)
Re: Index plan returns different results to sequential scan

On 3/23/24 03:24, Peter Geoghegan wrote:

On Thu, Mar 21, 2024 at 2:03 PM John Burns <john@impactdatametrics.com> wrote:

The query is SELECT * FROM XXX where postcode % ’NW10’
To create a sample table — create table XXX ( udprn bigint, postcode postcode )
To Index it CREATE INDEX on XXX(postcode)

The opfamily's % operator uses the B-Tree equality strategy. This
means that it works the same way as = works in most other opfamilies.

I don't see how equality can work reliably here. A query with a
predicate "WHERE my_indexed_postcode_column % ‘NW10’" seems to work by
masking the value stored in the index, throwing away some amount of
suffix bytes in the process. But the values from the index are still
stored in their original order -- the temporarily masked suffix bytes
aren't masked in the index, of course (they're only masked
temporarily, by the cross-type equality operator %).

Wouldn't you need something closer to "WHERE
my_indexed_postcode_column >= ‘NW10’ and my_indexed_postcode_column <
‘NW11’" for this to work reliably?

Yeah, I was not sure how come this could be processed using equality
operator, but I got distracted by bisecting this to a particular commit.
I didn't realize the commit might have changed how much we rely on the
opfamily to do things correctly.

I do think the '%' operator is pretty close to what we do for LIKE with
prefix patterns for text:

explain select * from t where a like 'aa%';
QUERY PLAN
-----------------------------------------------------------------------
Index Only Scan using t_a_idx on t (cost=0.29..4.31 rows=1 width=33)
Index Cond: ((a ~>=~ 'aa'::text) AND (a ~<~ 'ab'::text))
Filter: (a ~~ 'aa%'::text)
(3 rows)

So I guess postcode would need to do something similar - treat the '%'
operator as separate from opclass equality, and define a new support
procedure akin to text_support, translating the '%' into the range quals
that can match the index.

regards

--
Tomas Vondra
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company