Joris Dobbelsteen writes:
> Try this:
> SELECT distinct export_messages.export_id as id,
> exports.export_id as exports_export_id
> FROM export_messages
> LEFT OUTER JOIN exports ON
> (export_messages.export_id = exports.export_id)
> WHERE exports.export_id IS NOT NULL;
In my case I needed "IS NULL"
Your query worked. Thanks!!!
However.. I find it very strange that just the selects by themselves
produced the same ouput up to limit 100.
SELECT distinct export_messages.export_id as id
FROM export_messages
LEFT OUTER JOIN exports ON (export_messages.export_id = exports.export_id)
limit 100;
and
SELECT distinct export_messages.export_id as id
FROM export_messages
LEFT OUTER JOIN exports ON
(export_messages.export_id = exports.export_id)
WHERE exports.export_id IS NULL limit 100;
Produced the same output.
> At this point you should know whats going wrong...
>
>>>>DELETE FROM export_messages
>>>>WHERE export_id IN
>>>> (SELECT distinct export_messages.export_id as id
>>>> FROM export_messages
>>>> LEFT OUTER JOIN exports
>
> The LEFT OUTER join will at all times return ALL rows from
> export_messages.
> In effect, you generate a list with ALL export_messages.export_id. Thus
> we must conclude that for every row you are trying to delete, the
> condition must evaluate to true.
>
>>>> ON (export_messages.export_id = exports.export_id)
>>>> );
>
>>> Thought: are you sure you are going to delete those rows? In there
>>> cases human verification is usually the way to go, though it takes a
>>> lot of time.
>>
>>If I can't not find a way to do this through SQL I will write
>>a program.
>>The time to try and clean this by hand would be countless
>>hours. There are a lot of records in the child table that do
>>not have a matching record in the parent table.
>
> That's the trade-off: effects of a mistake * chance of a mistake against
> the cost to prevent these.
>
> Hope this helps...
>
> - Joris
>
>
> ---------------------------(end of broadcast)---------------------------
> TIP 9: In versions below 8.0, the planner will ignore your desire to
> choose an index scan if your joining column's datatypes do not
> match