Thread: the best way to get some records not in another table
Hi, all Try to get some records not in another table. As the following, please advise which one will be the best way to do. Or is there any other way to do better? SELECT DISTINCT a.c1 FROM test_j2 a WHERE a.c1 NOT IN (SELECT DISTINCT b.c1 FROM test_j1 b); SELECT a.c1 FROM test_j2 a EXCEPT SELECT b.c1 FROM test_j1 b; Jack
> > Try to get some records not in another table. As the following, please > advise which one will be the best way to do. Or is there any other way to do > better? > > SELECT DISTINCT a.c1 > FROM test_j2 a > WHERE a.c1 NOT IN (SELECT DISTINCT b.c1 FROM test_j1 b); > > SELECT a.c1 FROM test_j2 a > EXCEPT > SELECT b.c1 FROM test_j1 b; > IN resp. NOT IN clauses are known to be slow. SELECT DISTINCT a.c1 FROM test_j2 a WHERE NOT EXISTS (SELECT b.c1 FROM test_j1 b WHERE b.c1 = a.c1) ; Can't tell if EXISTS performs better than EXCEPT, have a look at the EXPLAIN output. Regards, Christoph
Hi, According to the following report, I think using "except" would be the best way to do. Thank you! Jack ======================== EXPLAIN SELECT DISTINCT a.c1 FROM test_j2 a WHERE a.c1 NOT IN (SELECT DISTINCT b.c1 FROM test_j1 b); Unique (cost=54544.91..54547.41 rows=50 width=6) -> Sort (cost=54544.91..54546.16 rows=500 width=6) Sort Key:c1 -> Seq Scan on test_j2 a (cost=0.00..54522.50 rows=500 width=6) Filter: (subplan) SubPlan -> Materialize (cost=54.50..54.50 rows=100 width=6) -> Unique (cost=0.00..54.50rows=100 width=6) -> Index Scan using test_j1_pkey on test_j1 b (cost=0.00..52.00 rows=1000 width=6) (9 rows) ======================= EXPLAIN SELECT a.c1 FROM test_j2 a EXCEPT SELECT b.c1 FROM test_j1 b; SetOp Except (cost=149.66..159.66 rows=200 width=6) -> Sort (cost=149.66..154.66 rows=2000 width=6) Sort Key:c1 -> Append (cost=0.00..40.00 rows=2000 width=6) -> Subquery Scan "*SELECT* 1" (cost=0.00..20.00rows=1000 width=6) -> Seq Scan on test_j2 a (cost=0.00..20.00 rows=1000 width=6) -> Subquery Scan "*SELECT* 2" (cost=0.00..20.00 rows=1000 width=6) -> Seq Scan on test_j1 b (cost=0.00..20.00 rows=1000 width=6) (8 rows) ========================= EXPLAIN SELECT DISTINCT a.c1 FROM test_j2 a WHERE NOT EXISTS (SELECT b.c1 FROM test_j1 b WHERE b.c1 = a.c1) ; Unique (cost=3455.91..3458.41 rows=50 width=6) -> Sort (cost=3455.91..3457.16 rows=500 width=6) Sort Key: c1 -> Seq Scan on test_j2 a (cost=0.00..3433.50 rows=500 width=6) Filter: (NOT (subplan)) SubPlan -> Index Scan using test_j1_pkey on test_j1 b (cost=0.00..17.07 rows=5 width=6) Index Cond: (c1 = $0) (8 rows)