Thread: EXCEPT performace
I noticed an unexpected difference in performance between seemingly equivalent queries: SELECT id FROM short_table EXCEPT SELECT id FROM long_table; and SELECT id FROM short_table EXCEPT (SELECT id FROM short_table JOIN long_table ON short_table.id = long_table.id); There is an index on longtable.id; short_table is a temporary table. EXPLAIN suggests that the index is ignored in the first case but consulted in the second. Seq Scan on short_table (cost=0.00..2857177.50 rows=1000 width=12) SubPlan -> Seq Scan on long_table (cost=0.00..5714.31 rows=200831 width=12) vs. Seq Scan on short_table (cost=0.00..44.52 rows=3 width=12) SubPlan -> Materialize (cost=14.49..14.49 rows=3 width=24) -> Nested Loop (cost=0.00..14.49 rows=3 width=24) -> Seq Scan on short_table (cost=0.00..1.03 rows=3 width=12) -> Index Scan using long_table_id_ix on long_table (cost=0.00..4.48 rows=1 width=12) Is there a reason an index cannot be used in an EXCEPT query? Ari.
On Sat, 3 Nov 2001, K. Ari Krupnikov wrote: > I noticed an unexpected difference in performance between seemingly equivalent queries: > > SELECT id FROM short_table EXCEPT SELECT id FROM long_table; > > and > > SELECT id FROM short_table EXCEPT > (SELECT id FROM short_table JOIN long_table > ON short_table.id = long_table.id); > > There is an index on longtable.id; short_table is a temporary table. EXPLAIN > suggests that the index is ignored in the first case but consulted in the second. A complete guess, but EXCEPT may be being rewritten into an IN<subquery> style query, which would mean it'd have the same performance issues as that form (see the faq). If your queries are simple, it may be best to rewrite them as something like: select id from short_table where not exists (select id from long_table where short_table.id=long_table.id); There may be some null related issues because I don't remember how except handles nulls, but that will probably run better.
"K. Ari Krupnikov" <ari@cogsci.ed.ac.uk> writes: > I noticed an unexpected difference in performance between seemingly equivalent queries: > SELECT id FROM short_table EXCEPT SELECT id FROM long_table; > and > SELECT id FROM short_table EXCEPT > (SELECT id FROM short_table JOIN long_table > ON short_table.id = long_table.id); But they're not equivalent. Consider what happens when NULLs are present in both tables. regards, tom lane