Dear Jie Liang,
The common case is that there will only be a few or no orphans. Won't
recreateing and reindicing this massive table (table a) take ages and known
my db out of commission for everyone else? One of the reasons why I do
things is two steps is that this allows me archive the deleted items. This
is, alas, a common (read daily) operation / query.
Leo
"Jie Liang" <jie@stbernard.com> wrote in message
news:7C760DAA511DC74B99E7D22189F786F1906D98@MAIL01.stbernard.com...
> set operation especially NOT EXIST is very slow(for big table),
> I recommand you use a few queries for your propose:
> 1. select * into c from a where join1=b.join1 and join2=b.join2;
> 2. truncate table a;
> 3. insert into a select * from c;
> 4. drop table c;
>
> You don't need index c since you use full table scan anyway.
>
> What is the best way of removing rows which are not in another table?
>
> I have these two tables each with millions of rows/tuples.
> They are joined on two fields:
> CREATE TABLE a
> (
> join1 OID,
> join2 OID,
> --a fair number of other fields
> .....
> )
> CREATE TABLE b
> (
> join1 OID,
> join2 OID,
> --a fair number of other fields
> .....
> )
> There are indices on both of them (on "join1, join2").
> At the moment, I am doing
>
> 1) get items which are orphaned in a.
> CREATE TEMP TABLE orphans as
> SELECT join1, join2
> FROM a
> WHERE NOT EXISTS
> (
> SELECT *
> FROM b
> WHERE
> a.join1 = b.join1 AND
> a.join2 = b.join2
> )
> 2) DELETE FROM a where orphans.join1 = a.join1 and orphans.join2=a.join2
> 3) DROP TABLE orphans
>
> This is very slow. Is there a better way?