Thread: COMMIT within function?
Suppose I have vacuum_values() function, which removes all "no longer referenced" by parent column. Kind of function to be run from time to time to clean table from crud. It looks like this: CREATE FUNCTION vacuum_values() RETURNS void AS $$ DECLARE r RECORD; BEGIN FOR r IN SELECT value_id FROM values NATURAL LEFT JOIN other_tab WHERE other_tab.value_id IS NULL FOR UPDATE OF values LOOP DELETE FROM values WHERE value_id = r.value_id; END LOOP; RETURN; END; $$ LANGUAGE plpgsql; Here, as the query runs against two table values (2 mln. rows) and ther_tab (20 mln. rows) it is relatively slow... However there is a chance that while this query goes, and goes, some rows will become referenced once more... and the DELETE will fail because of FOREIGN KEY, and the whole function will ROLLBACK... Is there a way to force "ignore errors" or something? As far as I checked, I can catch errors, but I don't really can stop the ROLLBACK. There are SAVEPOINTs but I guess they are useful for explicit ROLLBACK TO SAVEPOINT... Of course I can move all this logic outside of backend, and make the backend just 'do' the DELETEs, ignoring errors... But still, it should be doable in the procedural languages aswell..... Regards, Dawid
> Suppose I have vacuum_values() function, which removes all > "no longer referenced" by parent column. Kind of function > to be run from time to time to clean table from crud. > It looks like this: I suppose you have a good reason to not use a foreign key with "ON DELETE CASCADE" ? > FOR r IN SELECT value_id FROM values NATURAL LEFT JOIN > other_tab WHERE other_tab.value_id IS NULL FOR UPDATE OF values LOOP > DELETE FROM values WHERE value_id = r.value_id; > END LOOP; > RETURN; I don't remember the exact syntax (look in the DELETE docs) but you can certainly put a left join inside a delete and do it all at once with only one query, and it'll be faster to boot.
On Mon, 22 Nov 2004 00:16:07 +0100, Pierre-Frédéric Caillaud <lists@boutiquenumerique.com> wrote: > > > Suppose I have vacuum_values() function, which removes all > > "no longer referenced" by parent column. Kind of function > I suppose you have a good reason to not use a foreign key with "ON DELETE > CASCADE" ? Well, the issue here is saving space and speed with lots of repeatable data. Like e-mail addresses, most of them are frequently reused, so instead of a table CREATE TABLE messages (author text, ...); I create two: CREATE TABLE authors (author_id serial PRIMARY KEY, author text UNIQUE NOT NULL); CREATE TABLE messages (author_id integer REFERENCES authors, ...); ...and a matching view, and a function/rule which "invisibly" changes author to author_id whenever data is added (with authors table being updated when necessary). Now, after some time I remove old messages, and some of authors become "unreferenced" (think: From-s of spam messages). It would be nice to vacuum them out. The problem is when one of those authors "shows up" after long absence between our SELECT and actual DELETE. For a busy table (this happen to be one) it is quite possible. :) Ah, and ON DELETE CASCADE would mean I would loose perfectly good messages. Having LOCK on the table is also not-so-good an idea (think: authors with 2mln rows, messags with 20mln rows). > > FOR r IN SELECT value_id FROM values NATURAL LEFT JOIN > > other_tab WHERE other_tab.value_id IS NULL FOR UPDATE OF values LOOP > > DELETE FROM values WHERE value_id = r.value_id; > > END LOOP; > > RETURN; > > I don't remember the exact syntax (look in the DELETE docs) but you can > certainly put a left join inside a delete and do it all at once with only > one query, and it'll be faster to boot. Well, DELETE FROM ... WHERE ... is nice, but it will be explicitly "all-or-nothing", whereas with FUNCTION I have a ghost of hope that it may not be atomic. :) ...and I don't think you can do OUTER JOIN without subselect using DELETE FROM WHERE. Regards, Dawid
On Sun, Nov 21, 2004 at 07:29:26PM +0100, Dawid Kuroczko wrote: > Of course I can move all this logic outside of backend, and make > the backend just 'do' the DELETEs, ignoring errors... But still, > it should be doable in the procedural languages aswell..... In PostgreSQL 8.0, PL/pgSQL functions can trap errors without rolling back the entire transaction: http://developer.postgresql.org/docs/postgres/plpgsql-control-structures.html#PLPGSQL-ERROR-TRAPPING BEGIN DELETE FROM values WHERE value_id = r.value_id; EXCEPTION WHEN foreign_key_violation THEN NULL; END; -- Michael Fuhr http://www.fuhr.org/~mfuhr/
On Sun, 21 Nov 2004 20:10:03 -0700, Michael Fuhr <mike@fuhr.org> wrote: http://developer.postgresql.org/docs/postgres/plpgsql-control-structures.html#PLPGSQL-ERROR-TRAPPING > > BEGIN > DELETE FROM values WHERE value_id = r.value_id; > EXCEPTION > WHEN foreign_key_violation THEN > NULL; > END; Ahh, exactly what I was looking for. :) The thing I didn't notice was that, while exception causes rollback to "BEGIN", it does not mean to the beginning of the function. In other words I didn't nest BEGIN...END blocks and all I got from using exceptions was that they did not show any errors. :) Thank you! I am now enlightened. This works perfect, exactly as I hoped it would. :) Regards, dawid