What's the difference?

Started by Victor Wagnerabout 24 years ago2 messagesbugs
Jump to latest
#1Victor Wagner
vitus@ice.ru

As far as I understand, following three queries are exactly equivalent:

select item.item_id, item.title, author.email
from item, author, item_link
where
item.item_id=author.item_id and item_link.active=item.item_id and
item_link.linktype_id='MODERATES' and
item_link.passive='bob_news';

select item.item_id, item.title, author.email
from item natural join author, item_link
where item.item_id=item_link.active and
item_link.linktype_id='MODERATES'
and item_link.passive='bob_news';

select item,item_id, item.title, author.email
from item inner join author on (item.item_id = author.item_id),
item_link
where item.item_id=item_link.active and
item_link.linktype_id='MODERATES'
and item_link.passive='bob_news';

However, first query takes 0.004 seconds to execute
with following execution plan:
Nested Loop (cost=0.00..17.63 rows=1 width=68)
-> Nested Loop (cost=0.00..11.67 rows=1 width=37)
-> Index Scan using active_linked on item_link (cost=0.00..6.01
rows=1 width=10)
-> Index Scan using item_key on item (cost=0.00..5.65 rows=1
width=27)
-> Index Scan using author_key on author (cost=0.00..5.95 rows=1
width=31)

And second two - about 1.5 seconds with following plan
Nested Loop (cost=97.34..10078.92 rows=1 width=68)
-> Index Scan using active_linked on item_link (cost=0.00..6.01 rows=1
width=10)
-> Materialize (cost=10025.58..10025.58 rows=3787 width=58)
-> Hash Join (cost=97.34..10025.58 rows=3787 width=58)
-> Seq Scan on item (cost=0.00..8250.76 rows=108676
width=27)
-> Hash (cost=87.87..87.87 rows=3787 width=31)
-> Seq Scan on author (cost=0.00..87.87 rows=3787
width=31

I've tried to reorder tables in the FROM clause, putting ITEM_LINK first,
but it makes no difference.

Additional information about database:

Cardinality of tables:
item: 108941
autor: 3791
item_link: 132031

Primary key of author and item tables consists of one field item_id,
and there are no other identically named tables.

Primary key of item_link consists of fields active, passive, linktype_id
and there are several supplementary indices. Index active_linked was
created by following command:
create UNIQUE index ACTIVE_LINKED on ITEM_LINK(PASSIVE,LINKTYPE_ID,ACTIVE);

All key fields are VARCHAR(20), database created with encoding WIN, but
these fields contain ascii characters only.

PostgreSQL version 7.2.0.

BTW, if I use item left outer join author, execution plan is simular
with second one and execution time is about 4 seconds. I feel that
it is possible to use execution plan simular with first one,
and make outer join queries fly (as it happens in Oracle).

--
Victor Wagner vitus@ice.ru
Chief Technical Officer Office:7-(095)-748-53-88
Communiware.Net Home: 7-(095)-135-46-61
http://www.communiware.net http://www.ice.ru/~vitus

#2Tom Lane
tgl@sss.pgh.pa.us
In reply to: Victor Wagner (#1)
Re: What's the difference?

Victor Wagner <vitus@ice.ru> writes:

As far as I understand, following three queries are exactly equivalent:

Same results, but the second two constrain the planner's choice of join
order. See

http://www.ca.postgresql.org/users-lounge/docs/7.2/postgres/explicit-joins.html

Whether this is a feature or a bug depends on context...

regards, tom lane