Matt Clement <mattpc9@gmail.com> writes:
> Now we have all of the ex_id's that we want to keep. We can use a simple
> NOT IN condition to get all of the rows that should be deleted:
> SELECT * from ex
> WHERE ex_id NOT IN (
> SELECT ex_id from ex t1
> INNER JOIN (
> SELECT c_id, guid, MAX(ts) AS mydate1 FROM ex
> GROUP BY c_id, guid
> ORDER BY guid, MAX(ts);
> ) t2 ON t1.c_id = t2.c_id AND t1.guid = t2.guid AND t1.ts = t2.mydate1
> )
BTW, it's often a good idea to avoid NOT IN in favor of NOT EXISTS.
The principal reason for that probably doesn't arise here, because
I imagine ex.ex_id is never null, but if the output of the sub-select
did contain any nulls then this query wouldn't work at all. That's
because in the presence of nulls, a NOT IN test cannot yield TRUE,
only FALSE or NULL.
Also, at least in PG, NOT EXISTS usually performs better --- it's
hard to optimize NOT IN because of the aforesaid weirdness for nulls.
In short I'd suggest something like
DELETE FROM ex t0
WHERE NOT EXISTS (SELECT 1 FROM ex t1 JOIN
(SELECT c_id, guid, max(ts) mts from ex group by c_id, guid) as t2
ON t1.c_id = t2.c_id AND t1.guid = t2.guid AND t1.ts = t2.mts
AND t0.ex_id = t1.ex_id);
regards, tom lane