subselect and optimizer

Started by Igor Sysoevover 27 years ago4 messages
#1Igor Sysoev
igor@nitek.ru

I'm using PostgreSQL 6.3.2.

As reported in some messages ago PostgreSQL has problem with
"... where some_field in (select ..." type subqueries.
One of the solutions was to create indecies.
I created two indecies for character(9) fields key and newkey:
create index key_i on bik (key);
create index newkey_i on bik (newkey);
run two quiery explain:

bik=> explain select * from bik where key in (select newkey from bik where
bik=
'044531864');
NOTICE: Apr 21 14:15:41:QUERY PLAN:

Seq Scan on bik (cost=770.92 size=1373 width=113)
SubPlan
-> Seq Scan on bik (cost=770.92 size=1 width=12)

EXPLAIN
bik=> explain select * from bik where key = (select newkey from bik where
bik='
044531864');
NOTICE: Apr 21 14:16:01:QUERY PLAN:

Index Scan on bik (cost=2.05 size=1 width=113)
InitPlan
-> Seq Scan on bik (cost=770.92 size=1 width=12)

EXPLAIN

When I run first query it hang for a long time, at least 10 minutes
(I interrupted it) while second one completed in 1 second.
Table bik has about 13000 rows and 2.6M size.
It seems the problem is that in first queiry plan is "Seq Scan" while
in second is "Index Scan". How it can be fixed ?

with best regards,
Igor Sysoev

#2Vadim B. Mikheev
vadim@sable.krasnoyarsk.su
In reply to: Igor Sysoev (#1)
Re: [HACKERS] subselect and optimizer

Igor Sysoev wrote:

I'm using PostgreSQL 6.3.2.

As reported in some messages ago PostgreSQL has problem with
"... where some_field in (select ..." type subqueries.
One of the solutions was to create indecies.
I created two indecies for character(9) fields key and newkey:
create index key_i on bik (key);
create index newkey_i on bik (newkey);
run two quiery explain:

bik=> explain select * from bik where key in (select newkey from bik where
bik='044531864');
NOTICE: Apr 21 14:15:41:QUERY PLAN:

Seq Scan on bik (cost=770.92 size=1373 width=113)
SubPlan
-> Seq Scan on bik (cost=770.92 size=1 width=12)

^^^
This is very strange. Index Scan should be used here.
I'll try to discover this...

BTW, IN is slow (currently :) - try to create 2-key index on bik (bik, newkey)
and rewrite your query as

select * from bik b1 where EXISTS (select newkey from bik where
bik = '....' and b1.key = newkey)

And let's know... (Note, that index on (newkey, bik) may be more useful
than on (bik, newkey) - it depends on your data).

Vadim

#3Igor Sysoev
igor@nitek.ru
In reply to: Vadim B. Mikheev (#2)
Re: [HACKERS] subselect and optimizer

Vadim wrote:

I'm using PostgreSQL 6.3.2.

As reported in some messages ago PostgreSQL has problem with
"... where some_field in (select ..." type subqueries.
One of the solutions was to create indecies.
I created two indecies for character(9) fields key and newkey:
create index key_i on bik (key);
create index newkey_i on bik (newkey);
run two quiery explain:

bik=>explain select * from bik where key in (select newkey from
bik where bik='044531864');
NOTICE: Apr 21 14:15:41:QUERY PLAN:

Seq Scan on bik (cost=770.92 size=1373 width=113)
SubPlan
-> Seq Scan on bik (cost=770.92 size=1 width=12)

^^^
This is very strange. Index Scan should be used here.
I'll try to discover this...

No, I think it's not strange - I haven't index for bik (bik) so in both
cases
internal select should using Seq Scan. I repeat EXPLAIN from second query
(You
droped it):

------
bik=> explain select * from bik where key = (select newkey from bik
where bik='044531864');
NOTICE: Apr 21 14:16:01:QUERY PLAN:

Index Scan on bik (cost=2.05 size=1 width=113)
InitPlan
-> Seq Scan on bik (cost=770.92 size=1 width=12)

EXPLAIN
-------

Strange is another - outer select in second query using Index Scan (it's
right)
but it doesn't use it in first query.

BTW, IN is slow (currently :) - try to create 2-key index on bik (bik,

newkey)

and rewrite your query as

I tried simple query to check can IN use Index Scan ? EXPLAIN show it can:

--------
bik=> explain select * from bik where key in ('aqQWV+ZG');
NOTICE: Apr 22 10:29:44:QUERY PLAN:

Index Scan on bik (cost=2.05 size=1 width=113)

EXPLAIN
--------

select * from bik b1 where EXISTS (select newkey from bik where
bik = '....' and b1.key = newkey)

And let's know... (Note, that index on (newkey, bik) may be more useful
than on (bik, newkey) - it depends on your data).

Ok, I' will try it now but main problem is that I often need to use LIKE
operator (i.e. bik ~ '31864') in subselect and can't use indecies in this
case.

Igor Sysoev

#4Igor Sysoev
igor@nitek.ru
In reply to: Igor Sysoev (#3)
Re: [HACKERS] subselect and optimizer

Vadim wrote:

BTW, IN is slow (currently :) - try to create 2-key index on bik (bik,

newkey)

and rewrite your query as

select * from bik b1 where EXISTS (select newkey from bik where
bik = '....' and b1.key = newkey)

And let's know... (Note, that index on (newkey, bik) may be more useful
than on (bik, newkey) - it depends on your data).

I had tried - it really works ! I don't even try to use index for bik
(bik).
It works even when I tried "bik ~ '...'".
The one downside is I need to crunch and twist my brains to use
such SQL statemants :). Thank you.

Igor Sysoev