BitmapAnd on correlated column?

Started by greigwiseover 6 years ago6 messagesgeneral

greigwise@comcast.net

over 6 years ago

I'm running the following query on Postgres version 10.8:

SELECT count(*) FROM test_table WHERE and id_column_1 IN (9954, 9690, 9689,
9688) AND
id_column_2 IN (75328, 51448, 48060, 48065, 51803, 51449, 51802, 48064,
48061, 48062, 48059, 48063, 121834, 326649, 303167, 303249, 336838, 334172,
303419, 48148, 329727, 320007, 205066, 205065, 75406, 48153, 56298, 51570,
62668, 48150, 110618, 48154, 48147, 110626, 96830, 110624, 110615, 110625,
110621, 110617, 115865, 196128, 179940, 156843, 186994, 125995, 206982,
240753, 245896, 131002, 215463, 303111, 303399, 336923, 325403, 336189,
335278, 335271, 334985, 325014, 59681, 48277, 68237, 48067, 48072, 49379,
60063, 85903, 117843, 51451, 65681, 117848, 117845, 48068, 48066, 48071,
50323, 51450, 48069, 121835, 117844, 48070, 49310, 293247, 134761, 210493,
325404, 117847, 117846, 169399, 187795, 117849, 200056, 309165, 106874,
310255, 117343, 240831, 117842, 270870, 259300, 259291, 307661, 335006,
329259, 119279, 293474, 303516, 119277, 303557, 307101, 302380, 301949,
252399, 51988, 65660, 68236, 51987, 86026, 50557, 82723, 51452, 50550,
50552, 50559, 70009, 50554, 50551, 50555, 50553, 55955, 82766, 119274,
57069, 58184, 111317, 84796, 122502, 119275, 119270, 119273, 89111, 307110,
105797, 105798, 57068, 119280, 122495, 132483, 293249, 140929, 307666,
307660, 210494, 294051, 210507, 211284, 74299, 235551, 119281, 269433,
269434, 269891, 240828, 50556, 307977);

There are 2 indexes, one on id_column_1 and one on id_column_2.

The plan looks like this:

Aggregate (cost=732.87..732.88 rows=1 width=8) (actual time=33.835..33.835
rows=1 loops=1)
-> Bitmap Heap Scan on test_table (cost=701.88..732.82 rows=18 width=0)
(actual time=13.073..31.288 rows=24965 loops=1)
Recheck Cond: ((id_column_1 = ANY
('{9954,9690,9689,9688}'::integer[])) AND (id_column_2 = ANY
('{75328,51448,48060,48065,51803,51449,51802,48064,48061,48062,48059,48063,121834,326649,303167,303249,336838,334172,303419,48148,329727,320007,205066,205065,75406,48153,56298,51570,62668,48150,110618,48154,48147,110626,96830,110624,110615,110625,110621,110617,115865,196128,179940,156843,186994,125995,206982,240753,245896,131002,215463,303111,303399,336923,325403,336189,335278,335271,334985,325014,59681,48277,68237,48067,48072,49379,60063,85903,117843,51451,65681,117848,117845,48068,48066,48071,50323,51450,48069,121835,117844,48070,49310,293247,134761,210493,325404,117847,117846,169399,187795,117849,200056,309165,106874,310255,117343,240831,117842,270870,259300,259291,307661,335006,329259,119279,293474,303516,119277,303557,307101,302380,301949,252399,51988,65660,68236,51987,86026,50557,82723,51452,50550,50552,50559,70009,50554,50551,50555,50553,55955,82766,119274,57069,58184,111317,84796,122502,119275,119270,119273,89111,307110,105797,105798,57068,119280,122495,132483,293249,140929,307666,307660,210494,294051,210507,211284,74299,235551,119281,269433,269434,269891,240828,50556,307977}'::integer[])))
Heap Blocks: exact=20879
-> BitmapAnd (cost=701.88..701.88 rows=18 width=0) (actual
time=9.904..9.904 rows=0 loops=1)
-> Bitmap Index Scan on index_on_col1 (cost=0.00..95.31
rows=7676 width=0) (actual time=5.345..5.345 rows=28346 loops=1)
Index Cond: (id_column_1 = ANY
('{9954,9690,9689,9688}'::integer[]))
-> Bitmap Index Scan on index_on_col2 (cost=0.00..606.32
rows=37941 width=0) (actual time=3.395..3.395 rows=24965 loops=1)
Index Cond: (id_column_2 = ANY
('{75328,51448,48060,48065,51803,51449,51802,48064,48061,48062,48059,48063,121834,326649,303167,303249,336838,334172,303419,48148,329727,320007,205066,205065,75406,48153,56298,51570,62668,48150,110618,48154,48147,110626,96830,110624,110615,110625,110621,110617,115865,196128,179940,156843,186994,125995,206982,240753,245896,131002,215463,303111,303399,336923,325403,336189,335278,335271,334985,325014,59681,48277,68237,48067,48072,49379,60063,85903,117843,51451,65681,117848,117845,48068,48066,48071,50323,51450,48069,121835,117844,48070,49310,293247,134761,210493,325404,117847,117846,169399,187795,117849,200056,309165,106874,310255,117343,240831,117842,270870,259300,259291,307661,335006,329259,119279,293474,303516,119277,303557,307101,302380,301949,252399,51988,65660,68236,51987,86026,50557,82723,51452,50550,50552,50559,70009,50554,50551,50555,50553,55955,82766,119274,57069,58184,111317,84796,122502,119275,119270,119273,89111,307110,105797,105798,57068,119280,122495,132483,293249,140929,307666,307660,210494,294051,210507,211284,74299,235551,119281,269433,269434,269891,240828,50556,307977}'::integer[]))
Planning time: 1.452 ms
Execution time: 34.036 ms

The thing is that id_column_1 is really dependent on id_column_2. So
there's really no point in scanning the index on id_column_1. In fact, if I
remove that in clause for id_column_1 from the query, I get a better plan:

Aggregate (cost=14045.14..14045.15 rows=1 width=8) (actual
time=22.743..22.743 rows=1 loops=1)
-> Index Only Scan using index_on_col2 on test_table
(cost=0.43..13950.28 rows=37941 width=0) (actual time=0.130..19.880
rows=24965 loops=1)
Index Cond: (id_column_2 = ANY
('{75328,51448,48060,48065,51803,51449,51802,48064,48061,48062,48059,48063,121834,326649,303167,303249,336838,334172,303419,48148,329727,320007,205066,205065,75406,48153,56298,51570,62668,48150,110618,48154,48147,110626,96830,110624,110615,110625,110621,110617,115865,196128,179940,156843,186994,125995,206982,240753,245896,131002,215463,303111,303399,336923,325403,336189,335278,335271,334985,325014,59681,48277,68237,48067,48072,49379,60063,85903,117843,51451,65681,117848,117845,48068,48066,48071,50323,51450,48069,121835,117844,48070,49310,293247,134761,210493,325404,117847,117846,169399,187795,117849,200056,309165,106874,310255,117343,240831,117842,270870,259300,259291,307661,335006,329259,119279,293474,303516,119277,303557,307101,302380,301949,252399,51988,65660,68236,51987,86026,50557,82723,51452,50550,50552,50559,70009,50554,50551,50555,50553,55955,82766,119274,57069,58184,111317,84796,122502,119275,119270,119273,89111,307110,105797,105798,57068,119280,122495,132483,293249,140929,307666,307660,210494,294051,210507,211284,74299,235551,119281,269433,269434,269891,240828,50556,307977}'::integer[]))
Heap Fetches: 24965
Planning time: 0.647 ms
Execution time: 22.781 ms

I thought maybe extended statistics would help, so I did this:

create statistics test (dependencies) on id_column_2, id_column_1 from
test_table;
analyze test_table;

But the plan was nearly identical to the first plan with the BitmapAND even
after creating the extended statistics:

Aggregate (cost=738.12..738.13 rows=1 width=8) (actual time=34.207..34.207
rows=1 loops=1)
-> Bitmap Heap Scan on test_table (cost=707.13..738.07 rows=18 width=0)
(actual time=14.230..31.782 rows=24965 loops=1)
Recheck Cond: ((id_column_1 = ANY
('{9954,9690,9689,9688}'::integer[])) AND (id_column_2 = ANY
('{75328,51448,48060,48065,51803,51449,51802,48064,48061,48062,48059,48063,121834,326649,303167,303249,336838,334172,303419,48148,329727,320007,205066,205065,75406,48153,56298,51570,62668,48150,110618,48154,48147,110626,96830,110624,110615,110625,110621,110617,115865,196128,179940,156843,186994,125995,206982,240753,245896,131002,215463,303111,303399,336923,325403,336189,335278,335271,334985,325014,59681,48277,68237,48067,48072,49379,60063,85903,117843,51451,65681,117848,117845,48068,48066,48071,50323,51450,48069,121835,117844,48070,49310,293247,134761,210493,325404,117847,117846,169399,187795,117849,200056,309165,106874,310255,117343,240831,117842,270870,259300,259291,307661,335006,329259,119279,293474,303516,119277,303557,307101,302380,301949,252399,51988,65660,68236,51987,86026,50557,82723,51452,50550,50552,50559,70009,50554,50551,50555,50553,55955,82766,119274,57069,58184,111317,84796,122502,119275,119270,119273,89111,307110,105797,105798,57068,119280,122495,132483,293249,140929,307666,307660,210494,294051,210507,211284,74299,235551,119281,269433,269434,269891,240828,50556,307977}'::integer[])))
Heap Blocks: exact=20879
-> BitmapAnd (cost=707.13..707.13 rows=18 width=0) (actual
time=10.710..10.710 rows=0 loops=1)
-> Bitmap Index Scan on index_on_col1 (cost=0.00..95.58
rows=7711 width=0) (actual time=5.232..5.232 rows=28346 loops=1)
Index Cond: (id_column_1 = ANY
('{9954,9690,9689,9688}'::integer[]))
-> Bitmap Index Scan on index_on_col2 (cost=0.00..611.30
rows=38750 width=0) (actual time=4.010..4.010 rows=24965 loops=1)
Index Cond: (id_column_2 = ANY
('{75328,51448,48060,48065,51803,51449,51802,48064,48061,48062,48059,48063,121834,326649,303167,303249,336838,334172,303419,48148,329727,320007,205066,205065,75406,48153,56298,51570,62668,48150,110618,48154,48147,110626,96830,110624,110615,110625,110621,110617,115865,196128,179940,156843,186994,125995,206982,240753,245896,131002,215463,303111,303399,336923,325403,336189,335278,335271,334985,325014,59681,48277,68237,48067,48072,49379,60063,85903,117843,51451,65681,117848,117845,48068,48066,48071,50323,51450,48069,121835,117844,48070,49310,293247,134761,210493,325404,117847,117846,169399,187795,117849,200056,309165,106874,310255,117343,240831,117842,270870,259300,259291,307661,335006,329259,119279,293474,303516,119277,303557,307101,302380,301949,252399,51988,65660,68236,51987,86026,50557,82723,51452,50550,50552,50559,70009,50554,50551,50555,50553,55955,82766,119274,57069,58184,111317,84796,122502,119275,119270,119273,89111,307110,105797,105798,57068,119280,122495,132483,293249,140929,307666,307660,210494,294051,210507,211284,74299,235551,119281,269433,269434,269891,240828,50556,307977}'::integer[]))
Planning time: 1.073 ms
Execution time: 34.331 ms

So, I'm just wondering if there's anything I can do to influence the
optimize to pick the better plan using just the one index on id_column_2
(aside from re-writing the query).

Thanks,
Greig Wise

--
Sent from: https://www.postgresql-archive.org/PostgreSQL-general-f1843780.html