At 10:05 AM 8/7/2008, pgsql-sql-owner@postgresql.org wrote:
>Date: Thu, 7 Aug 2008 09:14:49 -0700
>From: felix@crowfix.com
>To: pgsql-sql@postgresql.org
>Subject: DELETE with JOIN
>Message-ID: <20080807161449.GA19337@crowfix.com>
>
>I want to delete with a join condition. Google shows this is a common
>problem, but the only solutions are either for MySQL or they don't
>work in my situation because there are too many rows selected. I also
>have to make this work on several databases, includeing, grrr, Oracle,
>so non-standard MySQL "solutions" are doubly aggravating.
>
> DELETE FROM a WHERE a.b_id = b.id AND b.second_id = ?
>
>I have tried to do this before and always found a way, usually
>
> DELETE FROM a WHERE a.b_id IN (SELECT id FROM b WHERE second_id =
> ?)
>
>but I have too many rows, millions, in the IN crowd, ha ha, and it
>barfs. EXISTS is no better. At least Oracle barfs, and I haven't got
>to the others yet. I figured I would go with the worst offender
>first, and let me tell you, it is offensive. Dang I wish it were
>postgresql only!
>
>I could write a Dumb Little Test Program (tm) to read in all those IN
>ids and execute a zillion individual DELETE statements, but it would
>be slow as puke and this little delete is going to come up quite often
>now that I have a test program which needs to generate the junky data
>and play with it for several days before deleting it and starting over
>again.
Hi,
Have you tried something where you read in all those "IN id's" and then
group them into blocks (of say 1,000 or 10,000 or whatever number works
best)? Then execute:
DELETE FROM a WHERE a.b_id in ([static_list_of_ids])
Replacing in a loop "[static_list_of_ids]" with each block of 1000 id's
in a comma delimited string? I use this technique sometimes in
middleware and it works pretty well. There's probably a pure-sql
solution in Pg as well but this method should work across any SQL
platform, which seems like one of your requirements.
Steve