On Wed, 7 May 2003, Lucas Adamski wrote:
> I wrote it originally as:
>
> SELECT tracking.pk,events.data1,events.data2 FROM tracking,events WHERE
> tracking.event_fk = event.pk EXCEPT (SELECT events.data1,events.data2 FROM
> events WHERE event.type = 10)
>
> because each of these subqueries restricts the dataset greatly before doing
> the join. I've simplified the actual problem (as the real code has a bunch
> of extraneous stuff that makes it even more obtuse), but essentially, the
> tracking table maintains a record of the last record type that was entered.
> The type is incremented for each batch of events that is loaded. In this
> case, I'm assuming that the latest batch is type=10 (or 5000, or 100000),
> and the tracking table references a small subset of previous events
> (possibly of types 1-9 in this example). This particular query is supposed
> to return all tracking.pk's that are present in the previous batches (types)
> but not in the latest batch (10). I didn't mean to make it quite so obtuse,
> sorry. :)
Maybe something like nominally like (quickly done so possibly wrong
again):
select tracking.pk, events.data1, events.data2 from
tracking,events where not exists (select * from events e where
e.type=10 and e.data1=events.data1 and e.data2=events.data2)
and tracking.event_fk=event.pk
Get all tracking/event combinations, not including those where the data1/2
matches that of an event with type 10.
That might give dups if there are multiple events rows with that pk for
different types (but not 10).