BitmapOr node not used in plan for ANY/IN but is for sequence of ORs ...
(sent to general users mailing list yesterday - but perhaps this is a more
suitable audience?)
In PG16.4, we have a table of key/pair data (around 30M rows) where there
are about 7 distinct keys and each has a conditional or partial index on
them (the distribution is different for each key/value pair combination).
I've found that when we have a query that uses an OR then those partial
indexes are used but not if the query is written to use ANY/IN, which is
more convenient from a programmer POV (especially any with 3rd party query
generators etc.). Naturally, the result sets returned by the queries are
identical due to the filter semantics of any of the 3 solution variants.
Here's a shareable, MRP;
Is there any trick I can do to get the planner to make use of the
conditional/partial index? Or is this simply an unoptimised code path yet
to be exploited!?
Cheers,
Jim
--
Jim Vanns
Principal Production Engineer
Industrial Light & Magic, London
On 11/13/24 13:08, Jim Vanns wrote:
(sent to general users mailing list yesterday - but perhaps this is a
more suitable audience?)In PG16.4, we have a table of key/pair data (around 30M rows) where
there are about 7 distinct keys and each has a conditional or partial
index on them (the distribution is different for each key/value pair
combination). I've found that when we have a query that uses an OR then
those partial indexes are used but not if the query is written to use
ANY/IN, which is more convenient from a programmer POV (especially any
with 3rd party query generators etc.). Naturally, the result sets
returned by the queries are identical due to the filter semantics of any
of the 3 solution variants.Here's a shareable, MRP;
https://dbfiddle.uk/OKs_7HWv <https://dbfiddle.uk/OKs_7HWv>
Is there any trick I can do to get the planner to make use of the
conditional/partial index? Or is this simply an unoptimised code path
yet to be exploited!?
I believe this is "simply" not implemented, so there's no way to
convince the planner to use these partial indexes.
The proximate cause is that the planner does not treat ANY()/IN() as
equivalent to an OR clause, and does not even consider building the
"bitmap OR" path for those queries. That's what happens at the very
beginning of generate_bitmap_or_paths().
Perhaps we could "expand" the ANY/IN clauses into an OR clause, so that
restriction_is_or_clause() returns "true". But I haven't tried and I'm
sure there'd be more stuff to fix to make this work.
regards
--
Tomas Vondra
Thanks Tomas, that's useful to know.
Cheers
Jim
On Wed, 13 Nov 2024 at 13:13, Tomas Vondra <tomas@vondra.me> wrote:
On 11/13/24 13:08, Jim Vanns wrote:
(sent to general users mailing list yesterday - but perhaps this is a
more suitable audience?)In PG16.4, we have a table of key/pair data (around 30M rows) where
there are about 7 distinct keys and each has a conditional or partial
index on them (the distribution is different for each key/value pair
combination). I've found that when we have a query that uses an OR then
those partial indexes are used but not if the query is written to use
ANY/IN, which is more convenient from a programmer POV (especially any
with 3rd party query generators etc.). Naturally, the result sets
returned by the queries are identical due to the filter semantics of any
of the 3 solution variants.Here's a shareable, MRP;
https://dbfiddle.uk/OKs_7HWv <https://dbfiddle.uk/OKs_7HWv>
Is there any trick I can do to get the planner to make use of the
conditional/partial index? Or is this simply an unoptimised code path
yet to be exploited!?I believe this is "simply" not implemented, so there's no way to
convince the planner to use these partial indexes.The proximate cause is that the planner does not treat ANY()/IN() as
equivalent to an OR clause, and does not even consider building the
"bitmap OR" path for those queries. That's what happens at the very
beginning of generate_bitmap_or_paths().Perhaps we could "expand" the ANY/IN clauses into an OR clause, so that
restriction_is_or_clause() returns "true". But I haven't tried and I'm
sure there'd be more stuff to fix to make this work.regards
--
Tomas Vondra
--
Jim Vanns
Principal Production Engineer
Industrial Light & Magic, London