Add GiST support for mixed-width integer operators

Started by Paul Jungwirthalmost 2 years ago4 messageshackers
Jump to latest
#1Paul Jungwirth
pj@illuminatedcomputing.com

Hi Hackers,

I noticed that this query wasn't using my GiST index:

postgres=# create extension btree_gist;
CREATE EXTENSION
postgres=# create table t (id bigint, valid_at daterange, exclude using gist (id with =, valid_at
with &&));
CREATE TABLE
postgres=# explain select * from t where id = 5;
QUERY PLAN
---------------------------------------------------
Seq Scan on t (cost=0.00..25.00 rows=6 width=40)
Filter: (id = 5)
(2 rows)

But if I add a cast to bigint, it does:

postgres=# explain select * from t where id = 5::bigint;
QUERY PLAN
---------------------------------------------------------------------------------
Bitmap Heap Scan on t (cost=4.19..13.66 rows=6 width=40)
Recheck Cond: (id = '5'::bigint)
-> Bitmap Index Scan on t_id_valid_at_excl (cost=0.00..4.19 rows=6 width=0)
Index Cond: (id = '5'::bigint)
(4 rows)

There is a StackOverflow question about this with 5 upvotes, so it's not just me who was surprised
by it.[1]https://stackoverflow.com/questions/71788182/postgres-not-using-btree-gist-index

The reason is that btree_gist only creates pg_amop entries for symmetrical operators, unlike btree
which has =(int2,int8), etc. So this commit adds support for all combinations of int2/int4/int8 for
all five btree operators (</<=/=/>=/>). After doing that, my query uses the index without a cast.

One complication is that while btree has just one opfamily for everything (integer_ops), btree_gist
splits things up into gist_int2_ops, gist_int4_ops, and gist_int8_ops. So where to put the
operators? I thought it made the most sense for a larger width to support smaller ones, so I added
=(int2,int8) and =(int4,int8) to gist_int8_ops, and I added =(int2,int4) to gist_int4_ops.

[1]: https://stackoverflow.com/questions/71788182/postgres-not-using-btree-gist-index

Yours,

--
Paul ~{:-)
pj@illuminatedcomputing.com

Attachments:

v1-0001-Use-GiST-index-with-mixed-integer-widths.patchtext/x-patch; charset=UTF-8; name=v1-0001-Use-GiST-index-with-mixed-integer-widths.patchDownload+104-3
#2Andrey Borodin
amborodin@acm.org
In reply to: Paul Jungwirth (#1)
Re: Add GiST support for mixed-width integer operators

On 5 Jul 2024, at 23:46, Paul Jungwirth <pj@illuminatedcomputing.com> wrote:

this commit adds support for all combinations of int2/int4/int8 for all five btree operators (</<=/=/>=/>).

Looks like a nice feature to have.
Would it make sense to do something similar to float8? Or, perhaps, some other types from btree_gist?

Also, are we sure such tests will be stable?
+SET enable_seqscan=on;
+-- It should use the index with a different integer width:
+EXPLAIN (COSTS OFF)
+SELECT count(*) FROM int4tmp WHERE a = smallint '42';
+                   QUERY PLAN                   
+------------------------------------------------
+ Aggregate
+   ->  Bitmap Heap Scan on int4tmp
+         Recheck Cond: (a = '42'::smallint)
+         ->  Bitmap Index Scan on int4idx
+               Index Cond: (a = '42'::smallint)

Best regards, Andrey Borodin.

#3Paul Jungwirth
pj@illuminatedcomputing.com
In reply to: Andrey Borodin (#2)
Re: Add GiST support for mixed-width integer operators

On 7/6/24 05:04, Andrey M. Borodin wrote:>> On 5 Jul 2024, at 23:46, Paul Jungwirth
<pj@illuminatedcomputing.com> wrote:

this commit adds support for all combinations of int2/int4/int8 for all five btree operators (</<=/=/>=/>).

Looks like a nice feature to have.
Would it make sense to do something similar to float8? Or, perhaps, some other types from btree_gist?

Here is another patch adding float4/8 and also date/timestamp/timestamptz, in the same combinations
as btree.

No other types seem like they deserve this treatment. For example btree doesn't mix oids with ints.

Also, are we sure such tests will be stable?

You're right, it was questionable. We hadn't analyzed the table, and after doing that the plan
changes from a bitmap scan to an index-only scan. That makes more sense, and I doubt it will change
now that it's based on statistics.

Yours,

--
Paul ~{:-)
pj@illuminatedcomputing.com

Attachments:

v2-0001-Use-GiST-index-with-mixed-integer-float-timestamp.patchtext/x-patch; charset=UTF-8; name=v2-0001-Use-GiST-index-with-mixed-integer-float-timestamp.patchDownload+227-3
#4Tom Lane
tgl@sss.pgh.pa.us
In reply to: Paul Jungwirth (#3)
Re: Add GiST support for mixed-width integer operators

Paul Jungwirth <pj@illuminatedcomputing.com> writes:

On 7/6/24 05:04, Andrey M. Borodin wrote:>> On 5 Jul 2024, at 23:46, Paul Jungwirth
<pj@illuminatedcomputing.com> wrote:

this commit adds support for all combinations of int2/int4/int8 for all five btree operators (</<=/=/>=/>).

Perhaps I'm missing something, but how can this possibly work without
any changes to the C code?

For example, gbt_int4_consistent assumes that the comparison value
is always an int4. Due to the way we represent Datums in-memory,
this will kind of work if it's really an int2 or int8 --- unless the
comparison value doesn't fit in int4, and then you will get a
completely wrong answer based on a value truncated to int4. (But I
would argue that even the cases where it seems to work are a type
violation, and getting the right answer is accidental if you have not
applied the correct PG_GETARG conversion macro.) Plus, int4-vs-int8
comparisons will fail in very obvious ways, up to and including core
dumps, on 32-bit machines where int8 is pass-by-reference.

Here is another patch adding float4/8 and also date/timestamp/timestamptz, in the same combinations
as btree.

Similar problems, plus comparing timestamp to timestamptz requires a
timezone conversion that this code isn't doing.

I think to make this work, you'd need to define a batch of new
opclass-specific strategy numbers that convey both the kind of
comparison to perform and the type of the RHS value. And then
there would need to be a nontrivial amount of new code in the
consistent functions to deal with cross-type cases.

regards, tom lane