Index filter instead of index condition w/ IN / ANY queries above certain set size
Hey everyone,
I'm trying to understand when the planner decides to use an index condition
vs an index filter, specifically for x IN / = ANY {set}, and if we can tune
some parameters to move it between these plans.
We have two tables and a query similar to the following fiddle:
https://www.db-fiddle.com/f/g9tZvVk65RRg3XRskQZpXK/1
In the fiddle - we see that the planner uses an index condition up to a set
of 3 elements, and fallbacks to use an index filter when the set has 4 or
more elements;
In our case - which I couldn't easily replicate in the fiddle - the
threshold is a single element - that is, a single element uses the index
condition, and 2 or more elements use an index filter, and the latter is
much slower on our data set.
This ^ also causes a side effect, where IN queries of a single element are
'flattened' to a single element comparison (x = y), but ANY queries aren't
flattened, and are still being compared against a set of one element;
This flattening is what makes our IN queries use the index condition, but
the = ANY(array[one-element]) to use the index filter.
I've tried playing w/ the random_page_cost, create extended statistics and
tune other sys parameters - but it didn't nudge the planner the other way;
Would appreciate if anyone could shed some light on this behavior (code
refs are also welcomed), and if there's any way we could help the planner
move between the plans.
Thanks a ton - appreciate your time !
Danny
On Wed, 2022-11-23 at 10:49 +0200, Danny Shemesh wrote:
I'm trying to understand when the planner decides to use an index condition vs an index filter
I am not sure what you mean by "index filter".
If you could send the result of EXPLAIN (ANALYZE, BUFFERS) for the queries,
that would be most useful.
Yours,
Laurenz Albe
Hey Laurenz, thanks for the prompt response !
What I meant is this - the plan consists of either an index scan that uses
all passed columns in the index condition, or an index scan
that uses only one column as an index condition, with an additional filter
step.
The below are explain (analyze, buffers) outputs of both queries from our
system, redacted:
This one uses = ANY({element}) and has an index cond + filter (which is
less favorable on our data set, performance wise):
EXPLAIN (ANALYZE, BUFFERS)
SELECT distinct(pid) FROM hashes WHERE tid =
'13371337-1337-1337-1337-133713371337' AND hid = any(WITH RECURSIVE cte AS
((SELECT pidh FROM refs WHERE tid = '13371337-1337-1337-1337-133713371337'
AND tidh = ANY('{13391339-1339-1339-1339-133913391339}')
ORDER BY pidh LIMIT 1
) UNION ALL SELECT lateral_join.* FROM cte CROSS JOIN LATERAL (SELECT pidh
FROM refs WHERE tid = '13371337-1337-1337-1337-133713371337' AND tidh =
ANY('{13391339-1339-1339-1339-133913391339}') pidh > cte.pidh ORDER BY pidh
LIMIT 1) lateral_join)
SELECT pidh FROM cte);
Unique (cost=2557.89..2560.30 rows=65 width=16) (actual
time=105369.476..105369.498 rows=37 loops=1)
Buffers: shared hit=336506 read=902254
I/O Timings: read=2423376.942
-> Sort (cost=2557.89..2559.09 rows=482 width=16) (actual
time=105369.474..105369.484 rows=57 loops=1)
Sort Key: hashes.pid
Sort Method: quicksort Memory: 27kB
Buffers: shared hit=336506 read=902254
I/O Timings: read=2423376.942
-> Hash Semi Join (cost=2449.10..2536.41 rows=482 width=16)
(actual time=105358.077..105369.411 rows=57 loops=1)
Hash Cond: (hashes.hid = cte.pidh)
Buffers: shared hit=336503 read=902254
I/O Timings: read=2423376.942
-> Seq Scan on hashes (cost=0.00..73.28 rows=3302 width=32)
(actual time=0.865..11.736 rows=3400 loops=1)
Filter: (tid =
'13371337-1337-1337-1337-133713371337'::uuid)
Buffers: shared read=32
I/O Timings: read=39.112
-> Hash (cost=2447.84..2447.84 rows=101 width=16) (actual
time=105357.200..105357.204 rows=39 loops=1)
Buckets: 1024 Batches: 1 Memory Usage: 10kB
Buffers: shared hit=336503 read=902222
I/O Timings: read=2423337.830
-> CTE Scan on cte (cost=2444.81..2446.83 rows=101
width=16) (actual time=1727.214..105357.126 rows=39 loops=1)
Buffers: shared hit=336503 read=902222
I/O Timings: read=2423337.830
CTE cte
-> Recursive Union (cost=0.56..2444.81
rows=101 width=16) (actual time=1727.212..105357.035 rows=39 loops=1)
Buffers: shared hit=336503 read=902222
I/O Timings: read=2423337.830
-> Limit (cost=0.56..22.00 rows=1
width=16) (actual time=1727.210..1727.210 rows=1 loops=1)
Buffers: shared hit=13051 read=14561
I/O Timings: read=53405.294
-> Index Only Scan using
idx_hashes on refs (cost=0.56..722735.47 rows=33715 width=16) (actual
time=1727.208..1727.208 rows=1 loops=1)
Index Cond: (tid =
'13371337-1337-1337-1337-133713371337'::uuid)
* Filter: (tidh = ANY
('{13391339-1339-1339-1339-133913391339}'::uuid[])) <<<<<<<<<<<<<<<-
Note this line* Rows Removed
by Filter: 109087
Heap Fetches: 16976
Buffers: shared hit=13051
read=14561
I/O Timings: read=53405.294
-> Nested Loop (cost=0.56..242.08
rows=10 width=16) (actual time=2657.169..2657.171 rows=1 loops=39)
Buffers: shared hit=323452
read=887661
I/O Timings: read=2369932.536
-> WorkTable Scan on cte cte_1
(cost=0.00..0.20 rows=10 width=16) (actual time=0.000..0.001 rows=1
loops=39)
-> Limit (cost=0.56..24.17 rows=1
width=16) (actual time=2657.167..2657.167 rows=1 loops=39)
Buffers: shared hit=323452
read=887661
I/O Timings: read=2369932.536
-> Index Only Scan using
idx_hashes on refs refs_1 (cost=0.56..265306.68 rows=11238 width=16)
(actual time=2657.162..2657.162 rows=1 loops=39)
Index Cond: ((tid =
'13371337-1337-1337-1337-133713371337'::uuid) AND (pidh > cte_1.pidh))
* Filter: (tidh = ANY
('{13391339-1339-1339-1339-133913391339}'::uuid[])) <<<<<<<<<<<<<<<-
Note this line* Rows
Removed by Filter: 346024
Heap Fetches: 1506359
Buffers: shared
hit=323452 read=887661
I/O Timings:
read=2369932.536
Planning:
Buffers: shared hit=8
Planning Time: 0.537 ms
Execution Time: 105369.560 ms
Time: 105.378s (1 minute 45 seconds), executed in: 105.378s (1 minute 45
seconds)
Now, this is a plan that only changes the = ANY to IN, where we could note
three things:
1. The IN operator was converted to a plain comparison of x = y
2. The filter step is now gone, and the column now appears in the Index
Cond step
3. The query is 350x ~ faster
EXPLAIN (ANALYZE, BUFFERS)
SELECT distinct(pid) FROM hashes WHERE tid =
'13371337-1337-1337-1337-133713371337' AND hid = any(WITH RECURSIVE cte AS
((SELECT pidh FROM refs WHERE tid = '13371337-1337-1337-1337-133713371337'
AND tidh in('13391339-1339-1339-1339-133913391339')
ORDER BY pidh LIMIT 1
) UNION ALL SELECT lateral_join.* FROM cte CROSS JOIN LATERAL (SELECT pidh
FROM refs WHERE tid = '13371337-1337-1337-1337-133713371337' AND tidh in
('13391339-1339-1339-1339-133913391339') pidh > cte.pidh ORDER BY pidh
LIMIT 1) lateral_join)
SELECT pidh FROM cte);
HashAggregate (cost=854.80..855.45 rows=65 width=16) (actual
time=297.874..297.883 rows=37 loops=1)
Group Key: hashes.pid
Batches: 1 Memory Usage: 24kB
Buffers: shared hit=91843 read=164
I/O Timings: read=154.439
-> Hash Semi Join (cost=766.29..853.60 rows=482 width=16) (actual
time=297.009..297.857 rows=57 loops=1)
Hash Cond: (hashes.hid = cte.pidh)
Buffers: shared hit=91843 read=164
I/O Timings: read=154.439
-> Seq Scan on hashes (cost=0.00..73.28 rows=3302 width=32)
(actual time=0.006..0.449 rows=3400 loops=1)
Filter: (tid = '13371337-1337-1337-1337-133713371337'::uuid)
Buffers: shared hit=32
I/O Timings: read=3.289
-> Hash (cost=765.03..765.03 rows=101 width=16) (actual
time=296.995..296.998 rows=39 loops=1)
Buckets: 1024 Batches: 1 Memory Usage: 10kB
Buffers: shared hit=91811 read=164
I/O Timings: read=151.151
-> CTE Scan on cte (cost=762.00..764.02 rows=101 width=16)
(actual time=1.940..296.958 rows=39 loops=1)
Buffers: shared hit=91811 read=164
I/O Timings: read=151.151
CTE cte
-> Recursive Union (cost=0.56..762.00 rows=101
width=16) (actual time=1.938..296.919 rows=39 loops=1)
Buffers: shared hit=91811 read=164
I/O Timings: read=151.151
-> Limit (cost=0.56..6.83 rows=1 width=16)
(actual time=1.936..1.938 rows=1 loops=1)
Buffers: shared hit=1791
-> Index Only Scan using idx_hashes on
refs (cost=0.56..211281.01 rows=33715 width=16) (actual time=1.936..1.936
rows=1 loops=1)
* Index Cond: ((tid =
'13371337-1337-1337-1337-133713371337'::uuid) AND (tidh =
'13391339-1339-1339-1339-133913391339'::uuid))** <<<<<<<<<<<<<<<- Note
this line*
Heap Fetches: 0
Buffers: shared hit=1791
-> Nested Loop (cost=0.56..75.32 rows=10
width=16) (actual time=7.561..7.562 rows=1 loops=39)
Buffers: shared hit=90020 read=164
I/O Timings: read=151.151
-> WorkTable Scan on cte cte_1
(cost=0.00..0.20 rows=10 width=16) (actual time=0.000..0.000 rows=1
loops=39)
-> Limit (cost=0.56..7.49 rows=1
width=16) (actual time=7.559..7.560 rows=1 loops=39)
Buffers: shared hit=90020 read=164
I/O Timings: read=151.151
-> Index Only Scan using
idx_hashes on refs refs_1 (cost=0.56..77897.32 rows=11238 width=16)
(actual time=7.557..7.557 rows=1 loops=39)
* Index Cond: ((tid =
'13371337-1337-1337-1337-133713371337'::uuid) AND (pidh > cte_1.pidh) AND
(tidh = '13391339-1339-1339-1339-133913391339'::uuid))**
<<<<<<<<<<<<<<<- Note this line*
Heap Fetches: 10
Buffers: shared hit=90020
read=164
I/O Timings: read=151.151
Planning:
Buffers: shared hit=8
Planning Time: 0.505 ms
Execution Time: 297.948 ms
I can also note that if I change the IN expression to compare on 2 or more
items - it is converted to an ANY expression,
and the previous plan is chosen again.
Thanks a ton !
Danny
On Wed, Nov 23, 2022 at 11:22 AM Laurenz Albe <laurenz.albe@cybertec.at>
wrote:
Show quoted text
On Wed, 2022-11-23 at 10:49 +0200, Danny Shemesh wrote:
I'm trying to understand when the planner decides to use an index
condition vs an index filter
I am not sure what you mean by "index filter".
If you could send the result of EXPLAIN (ANALYZE, BUFFERS) for the queries,
that would be most useful.Yours,
Laurenz Albe
On Wed, 2022-11-23 at 15:38 +0200, Danny Shemesh wrote:
-> Limit (cost=0.56..24.17 rows=1 width=16) (actual time=2657.167..2657.167 rows=1 loops=39)
Buffers: shared hit=323452 read=887661
I/O Timings: read=2369932.536
-> Index Only Scan using idx_hashes on refs refs_1 (cost=0.56..265306.68 rows=11238 width=16) (actual time=2657.162..2657.162 rows=1 loops=39)
Index Cond: ((tid = '13371337-1337-1337-1337-133713371337'::uuid) AND (pidh > cte_1.pidh))
Filter: (tidh = ANY ('{13391339-1339-1339-1339-133913391339}'::uuid[])) <<<<<<<<<<<<<<<- Note this line
Rows Removed by Filter: 346024
Heap Fetches: 1506359
Buffers: shared hit=323452 read=887661
I/O Timings: read=2369932.536
PostgreSQL thinks that there are enough such rows that it is cheaper to use the index
that supports the ORDER BY. I don't know why there is a difference between = ANY
and = here, but you can use an expression like "ORDER BY pidh + 0" to avoid that.
Yours,
Laurenz Albe
Danny Shemesh <dany74q@gmail.com> writes:
-> Index Only Scan using
idx_hashes on refs (cost=0.56..722735.47 rows=33715 width=16) (actual
time=1727.208..1727.208 rows=1 loops=1)
Index Cond: (tid =
'13371337-1337-1337-1337-133713371337'::uuid)
* Filter: (tidh = ANY
('{13391339-1339-1339-1339-133913391339}'::uuid[])) <<<<<<<<<<<<<<<-
Note this line* Rows Removed
by Filter: 109087
Heap Fetches: 16976
Buffers: shared hit=13051
read=14561
I/O Timings: read=53405.294
This doesn't match up terribly well with the table definition
you showed before, but I wonder whether tidh is a low-order
index column. If you need to optimize this specific shape
of query you need to pay attention to the index column order, per
https://www.postgresql.org/docs/current/indexes-multicolumn.html
That is, tid and tidh need to be the first two index columns.
regards, tom lane
Hey Laurenz, Tom - thanks again !
that it is cheaper to use the index that supports the ORDER BY
Thing is, that both queries use the exact same index (idx_hashes), but one
uses it w/ the filter and one does not.
This doesn't match up terribly well with the table definition you showed
before
Yeah.. it was a bit hard to reproduce exactly, but the fiddle does showcase
that there's some threshold to the ANY set-size
where it stops using the column in the index condition, and moves it to the
filter step - I thought it might originate
from similar reasons.
but I wonder whether tidh is a low-order index column.
The index indeed uses tidh as a low order column, and it's better to have
it the other way around -
currently, it's: (tid, pidh, tidh) - where (tid, tidh, pidh) would've
probably worked better.
We've already optimized the query itself - but for pure understanding of
the planner decision here,
I'd really still like to understand, if possible, the difference between
ANY and IN,
and why, even though the column order isn't optimal - one plan still
successfully uses the index more efficiently than another.
Any idea where I could zone-in in the source code to look for hints, maybe ?
Appreciate it !
Danny
On Wed, Nov 23, 2022 at 4:29 PM Tom Lane <tgl@sss.pgh.pa.us> wrote:
Show quoted text
Danny Shemesh <dany74q@gmail.com> writes:
-> Index Only Scan using
idx_hashes on refs (cost=0.56..722735.47 rows=33715 width=16) (actual
time=1727.208..1727.208 rows=1 loops=1)
Index Cond: (tid =
'13371337-1337-1337-1337-133713371337'::uuid)
* Filter: (tidh = ANY
('{13391339-1339-1339-1339-133913391339}'::uuid[])) <<<<<<<<<<<<<<<-
Note this line* Rows Removed
by Filter: 109087
Heap Fetches: 16976
Buffers: shared hit=13051
read=14561
I/O Timings: read=53405.294This doesn't match up terribly well with the table definition
you showed before, but I wonder whether tidh is a low-order
index column. If you need to optimize this specific shape
of query you need to pay attention to the index column order, perhttps://www.postgresql.org/docs/current/indexes-multicolumn.html
That is, tid and tidh need to be the first two index columns.
regards, tom lane