avoiding seq scan without duplicating

Started by Andrusover 17 years ago3 messagesgeneral
Jump to latest
#1Andrus
kobruleht2@hot.ee

Simple query is slow, performs seq scan while index exists:

explain select count(*)::integer as cnt
from firma2.dok
where dokumnr in (888817,2) and
dokumnr not in (select dokumnr FROM firma2.bilkaib WHERE
alusdok='LF' -- and dokumnr in (888817,2)
)

"Aggregate (cost=152063.71..152063.73 rows=1 width=0)"
" -> Bitmap Heap Scan on dok (cost=152055.67..152063.71 rows=1 width=0)"
" Recheck Cond: ((dokumnr = 888817) OR (dokumnr = 2))"
" Filter: (NOT (hashed subplan))"
" -> BitmapOr (cost=4.01..4.01 rows=2 width=0)"
" -> Bitmap Index Scan on dok_dokumnr_idx (cost=0.00..2.00
rows=1 width=0)"
" Index Cond: (dokumnr = 888817)"
" -> Bitmap Index Scan on dok_dokumnr_idx (cost=0.00..2.00
rows=1 width=0)"
" Index Cond: (dokumnr = 2)"
" SubPlan"
" -> Seq Scan on bilkaib (cost=0.00..152034.41 rows=6902
width=4)"
" Filter: (alusdok = 'LF'::bpchar)"

Index is used if join condition is duplicated in subquery:

explain select count(*)::integer as cnt
from firma2.dok
where dokumnr in (888817,2) and
dokumnr not in (select dokumnr FROM firma2.bilkaib WHERE
alusdok='LF' and dokumnr in (888817,2)
)

"Aggregate (cost=870.45..870.46 rows=1 width=0)"
" -> Bitmap Heap Scan on dok (cost=862.41..870.44 rows=1 width=0)"
" Recheck Cond: ((dokumnr = 888817) OR (dokumnr = 2))"
" Filter: (NOT (hashed subplan))"
" -> BitmapOr (cost=4.01..4.01 rows=2 width=0)"
" -> Bitmap Index Scan on dok_dokumnr_idx (cost=0.00..2.00
rows=1 width=0)"
" Index Cond: (dokumnr = 888817)"
" -> Bitmap Index Scan on dok_dokumnr_idx (cost=0.00..2.00
rows=1 width=0)"
" Index Cond: (dokumnr = 2)"
" SubPlan"
" -> Bitmap Heap Scan on bilkaib (cost=4.77..858.39 rows=3
width=4)"
" Recheck Cond: ((dokumnr = 888817) OR (dokumnr = 2))"
" Filter: (alusdok = 'LF'::bpchar)"
" -> BitmapOr (cost=4.77..4.77 rows=219 width=0)"
" -> Bitmap Index Scan on bilkaib_dokumnr_idx
(cost=0.00..2.38 rows=110 width=0)"
" Index Cond: (dokumnr = 888817)"
" -> Bitmap Index Scan on bilkaib_dokumnr_idx
(cost=0.00..2.38 rows=110 width=0)"
" Index Cond: (dokumnr = 2)"

how to make query fast without repeating join condition two times in query ?

Andurs.

#2Tom Lane
tgl@sss.pgh.pa.us
In reply to: Andrus (#1)
Re: avoiding seq scan without duplicating

"Andrus" <kobruleht2@hot.ee> writes:

Simple query is slow, performs seq scan while index exists:
explain select count(*)::integer as cnt
from firma2.dok
where dokumnr in (888817,2) and
dokumnr not in (select dokumnr FROM firma2.bilkaib WHERE
alusdok='LF' -- and dokumnr in (888817,2)
)

Index is used if join condition is duplicated in subquery:

explain select count(*)::integer as cnt
from firma2.dok
where dokumnr in (888817,2) and
dokumnr not in (select dokumnr FROM firma2.bilkaib WHERE
alusdok='LF' and dokumnr in (888817,2)
)

The proposed transformation is not correct because of the odd behavior
of NOT IN with respect to nulls.

regards, tom lane

#3Andrus
kobruleht2@hot.ee
In reply to: Tom Lane (#2)
Re: avoiding seq scan without duplicating

Tom,

The proposed transformation is not correct because of the odd behavior
of NOT IN with respect to nulls.

Thank you.

In this particular case dokumnr is dok table primary key of type int.
bilkaib.dokumnr can contain nulls but this does not affect to result
probably.

So in this case this conversion is probably correct.

Andrus.