Thread: Finalizing commit taking very long
Hello, I am having a strange latency problem on my instance of Postgres that I don't know how to investigate. I am accessing the db instance using a Java application and the Cayenne mapping framework. Everything works fine, except when it is time to delete a user account (that is a user of the application, not of Postgres). Deleting an account trigger a (sort-of) cascade delete to remove also all the dependent records stored on the db. The cascade constraints are managed by the Cayenne library, and the db receives a list of delete statements for all the rows of the different tables that should be deleted; this is probably not optimal from a db point of view, but is extremely convenient from an application point of view. Anyway, the deletion of all the records is executed without much problems (I had a huge slowdown here a few weeks ago, but I was missing an index on a constraint). Obviously, all the delete statements are grouped into a single transaction; and when it is time to commit this transaction, the db instance takes "forever". Here are some logs taken from the db server: [ lot of delete statements skipped] 2007-10-24 12:13:17 CEST LOG: statement: EXECUTE <unnamed> [PREPARE: DELETE FROM connection.USRCNN WHERE ID_USRCNN = $1] 2007-10-24 12:13:17 CEST LOG: duration: 0.206 ms 2007-10-24 12:13:17 CEST LOG: duration: 0.206 ms statement: EXECUTE <unnamed> [PREPARE: DELETE FROM connection.USRCNN WHERE ID_USRCNN = $1] 2007-10-24 12:13:17 CEST LOG: statement: PREPARE <unnamed> AS DELETE FROM clipperz.USR WHERE ID_USR = $1 2007-10-24 12:13:17 CEST LOG: statement: <BIND> 2007-10-24 12:13:17 CEST LOG: statement: EXECUTE <unnamed> [PREPARE: DELETE FROM clipperz.USR WHERE ID_USR = $1] 2007-10-24 12:13:17 CEST LOG: duration: 0.761 ms 2007-10-24 12:13:17 CEST LOG: duration: 0.761 ms statement: EXECUTE <unnamed> [PREPARE: DELETE FROM clipperz.USR WHERE ID_USR = $1] 2007-10-24 12:13:17 CEST LOG: statement: <BIND> 2007-10-24 12:13:17 CEST LOG: statement: EXECUTE <unnamed> [PREPARE: COMMIT] 2007-10-24 12:13:51 CEST LOG: autovacuum: processing database "clipperz_beta" 2007-10-24 12:14:51 CEST LOG: autovacuum: processing database "clipperz_beta" 2007-10-24 12:15:10 CEST LOG: duration: 113300.147 ms 2007-10-24 12:15:10 CEST LOG: duration: 113300.147 ms statement: EXECUTE <unnamed> [PREPARE: COMMIT] As you may notice, the commit phase takes almost 2 full minutes. :-( How can I understand what is going on on that timeframe in order to try to fix it? Thanks for your attention. Best regards, Giulio Cesare PS: I run my development machine on a MacOS, with Postgres running on a Parallels virtual machine. I don't think this really matters for the above problem, but in case ...
"Giulio Cesare Solaroli" <giulio.cesare@gmail.com> writes: > As you may notice, the commit phase takes almost 2 full minutes. :-( Yow. It's hard to believe that the actual commit (ie, flushing the commit record to WAL) could take more than a fraction of a second. I'm thinking there must be a pile of pre-commit work to do, like a lot of deferred triggers. Do you use deferred foreign keys? If so, the most likely bet is that the DELETE is triggering a lot of deferred FK checks, and these are slow for some reason (maybe another missing index). regards, tom lane
Hello Tom, On 10/24/07, Tom Lane <tgl@sss.pgh.pa.us> wrote: > "Giulio Cesare Solaroli" <giulio.cesare@gmail.com> writes: > > As you may notice, the commit phase takes almost 2 full minutes. :-( > > Yow. It's hard to believe that the actual commit (ie, flushing the > commit record to WAL) could take more than a fraction of a second. > I'm thinking there must be a pile of pre-commit work to do, like a > lot of deferred triggers. Do you use deferred foreign keys? > If so, the most likely bet is that the DELETE is triggering a lot > of deferred FK checks, and these are slow for some reason (maybe > another missing index). I have most (if not all) of my constraint defined with the DEFERRABLE INITIALLY DEFERRED clause. I have done this as I have not a direct control on the order of the SQL statements that the Cayenne library sends to the server, and this will avoid all the constraint violations inside a single transaction. How can I try to isolate the trigger taking so long, in oder to understand which is/are the missing index(es)? Best regards, Giulio Cesare
"Giulio Cesare Solaroli" <giulio.cesare@gmail.com> writes: > How can I try to isolate the trigger taking so long, in oder to > understand which is/are the missing index(es)? Try SET CONSTRAINTS ALL IMMEDIATE and then EXPLAIN ANALYZE the delete. This should cause all the triggers to run within the scope of the EXPLAIN ANALYZE, and you'll be able to see which one(s) are slow. (This assumes you're running a recent release of PG; I think EXPLAIN shows trigger times since 8.1 or so.) regards, tom lane
On 10/24/07, Tom Lane <tgl@sss.pgh.pa.us> wrote: > "Giulio Cesare Solaroli" <giulio.cesare@gmail.com> writes: > > How can I try to isolate the trigger taking so long, in oder to > > understand which is/are the missing index(es)? > > Try SET CONSTRAINTS ALL IMMEDIATE and then EXPLAIN ANALYZE the > delete. This should cause all the triggers to run within the > scope of the EXPLAIN ANALYZE, and you'll be able to see which > one(s) are slow. (This assumes you're running a recent release > of PG; I think EXPLAIN shows trigger times since 8.1 or so.) I was thinking about something similar after writing the last message. Thank you very much for your attention!! Giulio Cesare
Hello Tom, I can confirm that adding the indexes used by the deferred constraint triggers solved the issue. Thank you very much for your suggestions. Best regards, Giulio Cesare On 10/24/07, Giulio Cesare Solaroli <giulio.cesare@gmail.com> wrote: > On 10/24/07, Tom Lane <tgl@sss.pgh.pa.us> wrote: > > "Giulio Cesare Solaroli" <giulio.cesare@gmail.com> writes: > > > How can I try to isolate the trigger taking so long, in oder to > > > understand which is/are the missing index(es)? > > > > Try SET CONSTRAINTS ALL IMMEDIATE and then EXPLAIN ANALYZE the > > delete. This should cause all the triggers to run within the > > scope of the EXPLAIN ANALYZE, and you'll be able to see which > > one(s) are slow. (This assumes you're running a recent release > > of PG; I think EXPLAIN shows trigger times since 8.1 or so.) > > I was thinking about something similar after writing the last message. > > Thank you very much for your attention!! > > Giulio Cesare >