EXCEPT performace

Started by K. Ari Krupnikovover 24 years ago3 messagesgeneral
Jump to latest
#1K. Ari Krupnikov
ari@cogsci.ed.ac.uk

I noticed an unexpected difference in performance between seemingly equivalent queries:

SELECT id FROM short_table EXCEPT SELECT id FROM long_table;

and

SELECT id FROM short_table EXCEPT
(SELECT id FROM short_table JOIN long_table
ON short_table.id = long_table.id);

There is an index on longtable.id; short_table is a temporary table. EXPLAIN
suggests that the index is ignored in the first case but consulted in the second.

Seq Scan on short_table (cost=0.00..2857177.50 rows=1000 width=12)
SubPlan
-> Seq Scan on long_table (cost=0.00..5714.31 rows=200831 width=12)

vs.

Seq Scan on short_table (cost=0.00..44.52 rows=3 width=12)
SubPlan
-> Materialize (cost=14.49..14.49 rows=3 width=24)
-> Nested Loop (cost=0.00..14.49 rows=3 width=24)
-> Seq Scan on short_table (cost=0.00..1.03 rows=3 width=12)
-> Index Scan using long_table_id_ix on long_table (cost=0.00..4.48 rows=1 width=12)

Is there a reason an index cannot be used in an EXCEPT query?

Ari.

#2Stephan Szabo
sszabo@megazone23.bigpanda.com
In reply to: K. Ari Krupnikov (#1)
Re: EXCEPT performace

On Sat, 3 Nov 2001, K. Ari Krupnikov wrote:

I noticed an unexpected difference in performance between seemingly equivalent queries:

SELECT id FROM short_table EXCEPT SELECT id FROM long_table;

and

SELECT id FROM short_table EXCEPT
(SELECT id FROM short_table JOIN long_table
ON short_table.id = long_table.id);

There is an index on longtable.id; short_table is a temporary table. EXPLAIN
suggests that the index is ignored in the first case but consulted in the second.

A complete guess, but EXCEPT may be being rewritten into an IN<subquery>
style query, which would mean it'd have the same performance
issues as that form (see the faq).

If your queries are simple, it may be best to rewrite them as something
like:
select id from short_table where not exists (select id from long_table
where short_table.id=long_table.id);

There may be some null related issues because I don't remember how except
handles nulls, but that will probably run better.

#3Tom Lane
tgl@sss.pgh.pa.us
In reply to: K. Ari Krupnikov (#1)
Re: EXCEPT performace

"K. Ari Krupnikov" <ari@cogsci.ed.ac.uk> writes:

I noticed an unexpected difference in performance between seemingly equivalent queries:
SELECT id FROM short_table EXCEPT SELECT id FROM long_table;

and

SELECT id FROM short_table EXCEPT
(SELECT id FROM short_table JOIN long_table
ON short_table.id = long_table.id);

But they're not equivalent. Consider what happens when NULLs are
present in both tables.

regards, tom lane