Re: MOVE - Mailing list pgsql-general
From | PFC |
---|---|
Subject | Re: MOVE |
Date | |
Msg-id | opskmam9mtth1vuj@musicbox Whole thread Raw |
In response to | Re: MOVE (Martijn van Oosterhout <kleptog@svana.org>) |
List | pgsql-general |
> BEGIN; > INSERT INTO archive (...) SELECT ... FROM active WHERE user_id = ...; > DELETE FROM active WHERE user_id = ...; > COMMIT; > > The DELETE can only delete the rows returned by the select, that's the > whole point of transactions... Well, in the case of having a unique index on user_id, and if no-one updates the row between the insert and the delete, it will work ;) And if someone updates it in between, well, the update is not archived, so you have to LOCK your row FOR UPDATE. And if this procedure is called twice at the same time, the rows will be historized twice... etc... Which is precisely why I don't like this approach ! As a side note, I've installed 8.0 rc, and wow. The slow queries feel a lot faster on the command prompt, my small queries have became faster too... very good work ! In the end, I've implemented it with an AFTER DELETE trigger on the 'live' table, which, after the row has been deleted, inserts it in the history table, using the magic variable OLD. This will work because the row is already deleted, thus can't be concurrently updated by another transaction (because a transaction trying to update a row will wait on the lock acquired by the DELETE, and vice versa). So, for ONE row at a time, in a trigger, it works beautifully, thank you postgres ! I find the solution very elegant : when a session expires, it is deleted from the live session table, then the trigger catches it just in time to shove it in the sessions history table, then some other tables like user-to-chatroom connections, which happen to have a user_id referencing into the live sessions table, get the ON DELETE CASCADE and are also purged and historized automatically. I am very happy with this solution BUT it's done one-row-at-a-time, so it's slower than I'd like ! The key is to insert the row deleted from the live table into the history table AFTER it has been deleted, to avoid all funky locks problems. Now consider the following : you must DELETE several items at the same time and historize them. If you INSERT then DELETE: - records can be inserted, updated or deleted between the two. The inserted ones will be historized but not deleted (duplicates !), the deleted ones will be lost forever, unhistorized, the updated ones won't have their updates historized. Not very well for concurrecy ! You can LOCK FOR UPDATE before, this solves the UPDATE and DELETE problem, but not the INSERT problem. You can, of course, lock the entire table, but well, it reminds me too much of the MySQL way. You can also use SERIALIZABLE mode which solves all the problems, but if something goes wrong, everything fails and you have to retry the whole trasaction, whereas a proper lock would be waited on... If there is a primary key in the 'live' table you can SELECT FOR UPDATE into a tamporary table, then delete using the pkeys in the temp table, then insert from the temp table... ugly ! That's why I bother you to have the possibility of DELETE returning the DELETE'd rows ;) It's not very useful if you process one row, but when you process several at a time, it would be really great, because instead of 2*N queries (DELETE+INSERT hidden in a trigger) you'd just do one (INSERT ... DELETE AND SELECT ... FROM ...). Today, if you don't want to do it in a trigger, you have to have a unique index, SELECT FOR UPDATE, INSERT, DELETE, that's three queries per row. In a perfect world, this would be then used in an ON DELETE RULE which would replace the DELETES by deletes inserting the rows into the history table Also I've thought about some other interesting applications, if DELETE returns rows, why not UPDATE or even INSERT ? Many applications use INSERT... then SELECT currval(sequence). I also like to set defaults in the database, like for instance some rows which have timestamp fields defaulting to now() or things like that. I have a tree table with a ltree field which is generated by a trigger from the parent's path and the current row's id. Some other fields are also inherited from the parent. Why not do INSERT INTO ... AND SELECT ... which would return the sequence field, and any other fields which have been initialized by ON INSERT triggers... this would be neat... instead of INSERT, SELECT currval, SELECT .. FROM table WHERE id=... Same thing for on update triggers. You could replace some plpgsql procedures with one query, and what's more important, not worry about locking headaches. Anyway, my problem is solved now with triggers, but I like the idea very much (and Oracle has it) (and Tom once said a DELETE was just more or less like a SELECT)... so ... Regards
pgsql-general by date: