Question about partial index WHERE clause predicate ordering

Started by Arik Schimmel20 days ago3 messages
#1Arik Schimmel
arik.schimmel@wiz.io

Hi all,
I noticed that when creating a partial index with multiple predicates in
the WHERE clause, the order in which I write the predicates appears to
affect the index build time (specifically the index validation phase).

I created an index with this WHERE clause:

CREATE INDEX CONCURRENTLY idx_v1
ON my_table (tenant_id, jsonb_extract_path_text(data, 'field1'),
object_type, id)
WHERE deleted IS NULL
AND jsonb_extract_path_text(data, 'field1') <> ''
AND object_type = 'SpecificType';

Then I tried reordering the predicates to put expensive operations last:

CREATE INDEX CONCURRENTLY idx_v2
ON my_table (tenant_id, jsonb_extract_path_text(data, 'field1'),
object_type, id)
WHERE deleted IS NULL
AND object_type = 'SpecificType'
AND jsonb_extract_path_text(data, 'field1') <> '';

The second version (idx_v2) was significantly faster to build
Looking at the PostgreSQL source, it appears that:
- Regular query WHERE clauses go through the planner's
order_qual_clauses()
function (in createplan.c), which sorts predicates by cost
- Partial index predicates appear to go through ExecPrepareQual() (in
execExpr.c),
which processes predicates in the given order without reordering

This seems to be a difference between how the planner handles query
predicates versus how the executor handles index predicates.

Is this expected/intended behavior?
Is there a reason partial index predicates aren't reordered by cost?

I'm using Postgres 16.9

Thanks for any insights!
Arik Schimmel

#2Tom Lane
tgl@sss.pgh.pa.us
In reply to: Arik Schimmel (#1)
Re: Question about partial index WHERE clause predicate ordering

Arik Schimmel <arik.schimmel@wiz.io> writes:

Is there a reason partial index predicates aren't reordered by cost?

It hasn't come up AFAIR. I'm dubious that it'd be worth the trouble,
because order_qual_clauses is really quite crude when dealing with
simple expressions. We don't have accurate costing data for most
functions/operators --- they're all just labeled with procost 1 ---
so that the "cost-based ordering" reduces to just counting the
functions. That gets the right answer in your example, but only
accidentally IMO; it has no idea that jsonb_extract_path_text()
is particularly expensive. order_qual_clauses exists mostly to
ensure that subplans get pushed to the end, and that's not relevant
in this context because we don't support those in indexes.

regards, tom lane

#3zengman
zengman@halodbtech.com
In reply to: Arik Schimmel (#1)
Re:Question about partial index WHERE clause predicate ordering

Hi,

From what I recall, I came across a blog post quite some time ago that might be helpful for this topic:
`https://danolivo.substack.com/p/on-expressions-reordering-in-postgres`

--
Regards,
Man Zeng
www.openhalo.org