Support functions for range types

Started by Kim Johan Anderssonover 3 years ago4 messagesgeneral
Jump to latest
#1Kim Johan Andersson
kimjand@kimmet.dk

I was surprised by the poor performance when I first tried to use range
types. What I expected was that the following two queries would be
equivalent (see attached script):

postgres=# EXPLAIN ANALYZE SELECT some_number FROM integer_test WHERE
some_number BETWEEN -2 AND 2;

QUERY PLAN
------------------------------------------------------------------------------------------------------------------------------------------------
Index Only Scan using integer_test_some_number_idx on integer_test
(cost=0.28..8.38 rows=5 width=4) (actual time=0.045..0.052 rows=5 loops=1)
Index Cond: ((some_number >= '-2'::integer) AND (some_number <= 2))
Heap Fetches: 5
Planning Time: 0.319 ms
Execution Time: 0.094 ms
(5 rows)

postgres=# EXPLAIN ANALYZE SELECT some_number FROM integer_test WHERE
some_number <@ int4range(-2, 2, '[]');
QUERY PLAN
--------------------------------------------------------------------------------------------------------
Seq Scan on integer_test (cost=0.00..34.01 rows=10 width=4) (actual
time=0.585..1.136 rows=5 loops=1)
Filter: (some_number <@ '[-2,3)'::int4range)
Rows Removed by Filter: 1996
Planning Time: 0.175 ms
Execution Time: 1.164 ms
(5 rows)

But clearly, the planner is not able to use the btree index in the
presence of the range operator.
So I attempted to add support functions for the
'elem_contained_by_range' and 'range_contains_elem' operators (patch
attached):
That gives the following execution plan (applied on
26f7802beb2a4aafa0903f5bedeb7f1fa6f4f358):

QUERY PLAN

-------------------------------------------------------------------------------------------------------------------------------------------------
Index Only Scan using integer_test_some_number_idx on integer_test
(cost=0.28..8.38 rows=10 width=4) (actual time=0.046..0.058 rows=5 loops=1)
Index Cond: ((some_number >= '-2'::integer) AND (some_number < 3))
Heap Fetches: 5
Planning Time: 0.694 ms
Execution Time: 0.114 ms
(5 rows)

That was what I was hoping to see (even though the row estimate is still
a bit off).
Unfortunately this only works for the trivial case where the range is
actually a constant.
The third query in the attached script (range_test.sql) produces the
following plan, where the support function is not useful:

QUERY PLAN

-------------------------------------------------------------------------------------------------------------------------
Nested Loop (cost=0.14..419.56 rows=22 width=12) (actual
time=3.791..36.549 rows=121 loops=1)
Join Filter: (integer_test.some_number <@
int4range(number_q.one_number, number_q.another_number, '[]'::text))
Rows Removed by Join Filter: 21890
CTE number_q
-> Function Scan on generate_series (cost=0.00..0.14 rows=11
width=8) (actual time=0.063..0.076 rows=11 loops=1)
-> CTE Scan on number_q (cost=0.00..0.22 rows=11 width=8) (actual
time=0.071..0.107 rows=11 loops=1)
-> Materialize (cost=0.00..39.02 rows=2001 width=4) (actual
time=0.011..0.516 rows=2001 loops=11)
-> Seq Scan on integer_test (cost=0.00..29.01 rows=2001
width=4) (actual time=0.077..1.043 rows=2001 loops=1)
Planning Time: 3.172 ms
Execution Time: 36.908 ms
(10 rows)

So my question here is, how to go about handling the more interesting
cases, where we are passed a FuncExpr (instead of a Const)?
Is it even possible to return something useful in this case?

As far as I can tell, the support function is being passed a reference
to the range constructor function when the range is not a constant.
But I don't have the insight required to build opclauses that can handle
non-constants.
Any thoughs or pointers on solving this?

Thanks,
Kim Johan Andersson

Attachments:

support.patchtext/plain; charset=UTF-8; name=support.patchDownload+272-3
range_test.sqltext/plain; charset=UTF-8; name=range_test.sqlDownload
#2Tom Lane
tgl@sss.pgh.pa.us
In reply to: Kim Johan Andersson (#1)
Re: Support functions for range types

Kim Johan Andersson <kimjand@kimmet.dk> writes:

So my question here is, how to go about handling the more interesting
cases, where we are passed a FuncExpr (instead of a Const)?
Is it even possible to return something useful in this case?

Doesn't look like it to me. You could check whether the RHS is a
range constructor function call, but there's a big semantic problem:
int4_range(NULL, ...) converts to a range with an infinite bound,
not a null bound. So translating that to "indxvar >= NULL" would
give the wrong answers. And generally speaking, if the argument
isn't a constant then you're not going to be able to be sure that
it doesn't produce NULL.

I guess you could produce something like

indxvar >= coalesce(argument, minimum-value-of-type)

in cases where the data type has an identifiable minimum resp.
maximum value, but that'd make the whole affair annoyingly
data-type-specific. Not sure it's worth going there.

regards, tom lane

#3Kim Johan Andersson
kimjand@kimmet.dk
In reply to: Tom Lane (#2)
Re: Support functions for range types

On 25-09-2022 16:43, Tom Lane wrote:

Doesn't look like it to me. You could check whether the RHS is a
range constructor function call, but there's a big semantic problem:
int4_range(NULL, ...) converts to a range with an infinite bound,
not a null bound. So translating that to "indxvar >= NULL" would
give the wrong answers. And generally speaking, if the argument
isn't a constant then you're not going to be able to be sure that
it doesn't produce NULL.

I was suspecting that it would not be possible. So we can peek at the
constructor parameters, but we won't know the actual value until execution.

I guess you could produce something like

indxvar >= coalesce(argument, minimum-value-of-type)

in cases where the data type has an identifiable minimum resp.
maximum value, but that'd make the whole affair annoyingly
data-type-specific. Not sure it's worth going there.

I was hoping it was possible to rely on the existing range mechanics for
handling the data types. Along the lines of adding nodes dependent on
the result of the constructor call.
But if there is no opportunity to make a dynamic expression suitable for
the index, then I guess it won't be possible to make a really useful
support function for range types.
Thanks for the input.

Regards,
Kim Johan Andersson

#4Laurenz Albe
laurenz.albe@cybertec.at
In reply to: Kim Johan Andersson (#3)
Re: Support functions for range types

On Mon, 2022-09-26 at 06:57 +0200, Kim Johan Andersson wrote:

But if there is no opportunity to make a dynamic expression suitable for
the index, then I guess it won't be possible to make a really useful
support function for range types.

I think it could still be useful if it only deals with constant operands.

Yours,
Laurenz Albe
--
Cybertec | https://www.cybertec-postgresql.com