Thread: 8.4.4 locked after power failure
So here's the thing. I got a message from one of the developers, that running 'create temporary sequence xyz;' hangs on the database. That seemed suspicious. I tried running any ddl command, and that hang. No other connections to the database. It turned out that it had a power failure earlier in the morning. That seems ok, but in the past postgresql will always recover fine (at least 8.3.x). This time I had to reindex, and vacuum all user and system catalogues to get system back in order. Any ideas ? Is that something new, is it fixed in any newer releases ? Unfortunately it had to be done rather quick - so I couldn't salvage any data. -- GJ
2011/9/12 Grzegorz Jaśkiewicz <gryzman@gmail.com>: > So here's the thing. I got a message from one of the developers, that > running 'create temporary sequence xyz;' hangs on the database. > That seemed suspicious. I tried running any ddl command, and that hang. > No other connections to the database. > > It turned out that it had a power failure earlier in the morning. That > seems ok, but in the past postgresql will always recover fine (at > least 8.3.x). > This time I had to reindex, and vacuum all user and system catalogues > to get system back in order. > > Any ideas ? Is that something new, is it fixed in any newer releases ? > > > Unfortunately it had to be done rather quick - so I couldn't salvage any data. It seems odd that you could not create a temp sequence but you were able to reindex the entire database. did you confirm you were blocking on a non-granted lock? merlin
2011/9/12 Merlin Moncure <mmoncure@gmail.com>: > > It seems odd that you could not create a temp sequence but you were > able to reindex the entire database. did you confirm you were > blocking on a non-granted lock? I could revacuum/reindex all stuff, only if I had to do the system catalogues first. That seemed a bit suspicious. Also, I had to stop all other connections - because any other locked transaction would prevent it from going on. Oh, the last thing - pg_cancel_backend() couldn't kill the locked backend. I had to restart the whole postgresql using -m immediate. Odd stuff indeed, as I was hoping postgresql will clean things like locks on recovery from crash. Never happened to me with 8.3 -- GJ
I'm guessing I won't get much more from devs , without providing more info here which unfortunately has been lost.
2011/9/13 Grzegorz Jaśkiewicz <gryzman@gmail.com>: > I'm guessing I won't get much more from devs , without providing more > info here which unfortunately has been lost. yup -- you destroyed all the evidence. if it happens again, try posting some more info, particularly what's going on with locks, pg_stat_activity, maybe an strace of any stuck processes, etc. also irc can be a good place to get some quick help if its time sensitive. merlin
2011/9/13 Merlin Moncure <mmoncure@gmail.com>: > 2011/9/13 Grzegorz Jaśkiewicz <gryzman@gmail.com>: >> I'm guessing I won't get much more from devs , without providing more >> info here which unfortunately has been lost. > > yup -- you destroyed all the evidence. if it happens again, try > posting some more info, particularly what's going on with locks, > pg_stat_activity, maybe an strace of any stuck processes, etc. also > irc can be a good place to get some quick help if its time sensitive. See, I was hoping that just the fact that it occurred might trigger some suspicions on the -devel side of things. I know it would in my product. -- GJ
2011/9/13 Grzegorz Jaśkiewicz <gryzman@gmail.com>: > 2011/9/13 Merlin Moncure <mmoncure@gmail.com>: >> 2011/9/13 Grzegorz Jaśkiewicz <gryzman@gmail.com>: >>> I'm guessing I won't get much more from devs , without providing more >>> info here which unfortunately has been lost. >> >> yup -- you destroyed all the evidence. if it happens again, try >> posting some more info, particularly what's going on with locks, >> pg_stat_activity, maybe an strace of any stuck processes, etc. also >> irc can be a good place to get some quick help if its time sensitive. > > See, I was hoping that just the fact that it occurred might trigger > some suspicions on the -devel side of things. I know it would in my > product. sure. In lieu of useful and/or specific information, the #1 suspicion is always going to be 'operator error', because, more often than not, that's what it is. Having seen and read your posts on the various lists for some time now I know that if you report a problem then there likely is a problem, but without more detail this is essentially a fishing expedition :-). merlin
On 09/13/2011 11:04 PM, Grzegorz Jaśkiewicz wrote: > 2011/9/13 Merlin Moncure<mmoncure@gmail.com>: >> 2011/9/13 Grzegorz Jaśkiewicz<gryzman@gmail.com>: >>> I'm guessing I won't get much more from devs , without providing more >>> info here which unfortunately has been lost. >> >> yup -- you destroyed all the evidence. if it happens again, try >> posting some more info, particularly what's going on with locks, >> pg_stat_activity, maybe an strace of any stuck processes, etc. also >> irc can be a good place to get some quick help if its time sensitive. > > See, I was hoping that just the fact that it occurred might trigger > some suspicions on the -devel side of things. I know it would in my > product. Of course it does. How would _you_ go about reproducing it or tracing it so you could even begin to track it down, though? There just isn't enough information to work with. It's like someone contacting you and explaining that they saved a file containing a huge embedded image from your office suite software, but when they went to open the file later it crashed the office suite. They were in a hurry and didn't have the disk space to keep the old one, so they had to redo the work and save over the file. It's good to get the report, but ... what do you do about it? They don't have a faulty file, a reproducible test case, any or any good prospects for working on creating a test case. -- Craig Ringer