Thread: postgres crash
Hi,
I've tried using pg_resetxlog with the -f switch but I don't really know what I'm doing.
I am a complete newb with postgres. In fact, I don't even use it as such but have a problem with a crashed postgresql db associated with using Calendar Server on Mac OS X 10.6 on a Mac Mini. To cut a long story short, I had to reformat my hard drive and restore from a Time Machine backup. Now, when I try to start the calendar server, I get a crash message in the console log and some of the following in the postgres log. There are no dates / times in the log file so I don't know which entries relate to which activities - sorry!
FATAL: the database system is starting up
LOG: startup process (PID 313) was terminated by signal 6: Abort trap
LOG: aborting startup due to startup process failure
LOG: database system was interrupted; last known up at 2011-07-15 07:48:24 BST
LOG: unexpected pageaddr 0/2058000 in log file 0, segment 4, offset 360448
LOG: invalid primary checkpoint record
LOG: unexpected pageaddr 0/2056000 in log file 0, segment 4, offset 352256
LOG: invalid secondary checkpoint record
PANIC: could not locate a valid checkpoint record
FATAL: the database system is starting up
FATAL: the database system is starting up
FATAL: the database system is starting up
FATAL: the database system is starting up
LOG: startup process (PID 646) was terminated by signal 6: Abort trap
LOG: aborting startup due to startup process failure
I've also asked the Calendar Server mailing list but have been requested to come to this forum....
Any hints / tips please?
Many thanks,
Matt
mattfairley@netscape.net writes:
> I am a complete newb with postgres. In fact, I don't even use it as
> such but have a problem with a crashed postgresql db associated with
> using Calendar Server on Mac OS X 10.6 on a Mac Mini. To cut a long
> story short, I had to reformat my hard drive and restore from a Time
> Machine backup.

Yeah, that's not exactly the approved way to back up a Postgres
instance; you're likely to get a collection of files that are somewhat
out-of-sync with each other, which seems to be exactly what this is
about:

> LOG: database system was interrupted; last known up at 2011-07-15 07:48:24 BST
> LOG: unexpected pageaddr 0/2058000 in log file 0, segment 4, offset 360448
> LOG: invalid primary checkpoint record
> LOG: unexpected pageaddr 0/2056000 in log file 0, segment 4, offset 352256
> LOG: invalid secondary checkpoint record
> PANIC: could not locate a valid checkpoint record

> I've tried using pg_resetxlog with the -f switch but I don't really know what I'm doing.

pg_resetxlog is pretty much the only way out, given that you don't have
any other form of backup. But you haven't shown us exactly what you did
or exactly how it failed.

The man page for pg_resetxlog is reasonably thorough, did you read it?
http://www.postgresql.org/docs/9.0/static/app-pgresetxlog.html

(Once you do get it to start again, you'll at minimum want to do a
complete database reindex, and ideally a dump/re-initdb/reload, to try
to make sure there's not lingering database corruption.)

regards, tom lane
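A small worked example may help with reading the "unexpected pageaddr" lines. The sketch below computes the page address that should appear at a given log file, segment and offset, assuming the default 16 MB WAL segment size (a custom build could differ). The function name expected_pageaddr is made up for this illustration.

```shell
#!/bin/sh
# Compute the page address PostgreSQL should find at a WAL location.
# Assumption: default 16 MB WAL segment size.
WAL_SEG_SIZE=$((16 * 1024 * 1024))

# expected_pageaddr LOGFILE SEGMENT OFFSET
# Prints the address as "logfile/offset" in hex, the same notation
# the server uses in its log messages.
expected_pageaddr() {
    printf '%X/%X\n' "$1" $(( $2 * $WAL_SEG_SIZE + $3 ))
}

expected_pageaddr 0 4 360448   # primary checkpoint page   -> 0/4058000
expected_pageaddr 0 4 352256   # secondary checkpoint page -> 0/4056000
```

The server found 0/2058000 and 0/2056000 instead, which are the matching offsets in segment 2 rather than segment 4. One plausible reading, consistent with the out-of-sync explanation above, is that the file for segment 4 still holds older content at those offsets, while the control file points at checkpoints that were written later than that copy of the WAL file.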
Thanks for the reply Tom - I'll give the man page a read. A couple of comments below, just to clarify though.

Regards,
Matt

Sent from my iPhone

On 3 Aug 2011, at 19:27, Tom Lane <tgl@sss.pgh.pa.us> wrote:
> mattfairley@netscape.net writes:
>> I am a complete newb with postgres. In fact, I don't even use it as
>> such but have a problem with a crashed postgresql db associated with
>> using Calendar Server on Mac OS X 10.6 on a Mac Mini. To cut a long
>> story short, I had to reformat my hard drive and restore from a Time
>> Machine backup.
>
> Yeah, that's not exactly the approved way to back up a Postgres
> instance; you're likely to get a collection of files that are somewhat
> out-of-sync with each other, which seems to be exactly what this is
> about:

To clarify - the whole hard drive was restored from the Time Machine backup, not just the db, so I would have thought it would have been OK, that things would have been in sync rather than out of sync. However, due to hard drive corruption, is it possible that the restored database is screwy? From what I can gather, things went a bit wrong with the hard drive about a week before I noticed it.

>> LOG: database system was interrupted; last known up at 2011-07-15 07:48:24 BST
>> LOG: unexpected pageaddr 0/2058000 in log file 0, segment 4, offset 360448
>> LOG: invalid primary checkpoint record
>> LOG: unexpected pageaddr 0/2056000 in log file 0, segment 4, offset 352256
>> LOG: invalid secondary checkpoint record
>> PANIC: could not locate a valid checkpoint record
>
>> I've tried using pg_resetxlog with the -f switch but I don't really know what I'm doing.
>
> pg_resetxlog is pretty much the only way out, given that you don't have
> any other form of backup. But you haven't shown us exactly what you did
> or exactly how it failed.

From memory, I ran:

pg_resetxlog -f /usr/local/pgsql/data

Was that right?

> The man page for pg_resetxlog is reasonably thorough, did you read it?
> http://www.postgresql.org/docs/9.0/static/app-pgresetxlog.html
>
> (Once you do get it to start again, you'll at minimum want to do a
> complete database reindex, and ideally a dump/re-initdb/reload, to try
> to make sure there's not lingering database corruption.)

How do I do all that?

> regards, tom lane
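On the "how do I do all that" question, a rough sketch of the reindex and dump/re-initdb/reload steps might look like the following. It assumes the server has already been brought up after pg_resetxlog, that the PostgreSQL client tools are on the PATH, and that it is run as the OS user that owns the data directory. The old data directory is the one quoted in the thread; the new data directory and the dump file name are hypothetical, and a Calendar Server install may keep its data somewhere else. By default the script only prints the commands; setting APPLY=1 executes them.

```shell
#!/bin/sh
# Sketch only: prints the commands by default; set APPLY=1 to execute them.
PGDATA_OLD=${PGDATA_OLD:-/usr/local/pgsql/data}       # path quoted in the thread
PGDATA_NEW=${PGDATA_NEW:-/usr/local/pgsql/data.new}   # hypothetical new cluster
DUMP_FILE=${DUMP_FILE:-/tmp/all_databases.sql}        # hypothetical dump file

# Print a command, and run it only when APPLY=1.
run() {
    echo "+ $*"
    if [ "${APPLY:-0}" = "1" ]; then "$@"; fi
}

# Minimum step: rebuild every index in every database.
reindex_all() {
    run reindexdb --all
}

# Preferred step: dump everything, create a fresh cluster, reload into it.
dump_and_reload() {
    run pg_dumpall -f "$DUMP_FILE" &&
    run pg_ctl -D "$PGDATA_OLD" stop &&
    run initdb -D "$PGDATA_NEW" &&
    run pg_ctl -D "$PGDATA_NEW" -w start &&
    run psql -d postgres -f "$DUMP_FILE"
}

reindex_all && dump_and_reload
```

The old data directory is left untouched here, so it stays available if the dump turns out to be incomplete; whatever starts the database for Calendar Server would then need to be pointed at the new directory.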
Hi

On 3 August 2011 22:15, Matthew Fairley <mattfairley@netscape.net> wrote:
> Thanks for the reply Tom - I'll give the man page a read. A couple of comments below, just to clarify though.
>
> Regards,
>
> Matt
>
> Sent from my iPhone
>
> On 3 Aug 2011, at 19:27, Tom Lane <tgl@sss.pgh.pa.us> wrote:
>
>> mattfairley@netscape.net writes:
>>> I am a complete newb with postgres. In fact, I don't even use it as
>>> such but have a problem with a crashed postgresql db associated with
>>> using Calendar Server on Mac OS X 10.6 on a Mac Mini. To cut a long
>>> story short, I had to reformat my hard drive and restore from a Time
>>> Machine backup.
>>
>> Yeah, that's not exactly the approved way to back up a Postgres
>> instance; you're likely to get a collection of files that are somewhat
>> out-of-sync with each other, which seems to be exactly what this is
>> about:
>
> To clarify - the whole hard drive was restored from the Time Machine backup, not just the db, so I would have thought it would have been OK, that things would have been in sync rather than out of sync.

That is most likely not good enough. If Time Machine creates a
snapshot of the whole hard drive before copying the files to the
backup location then it would be almost OK (as if you pulled the power
cable out the back of the machine while it was running.) I don't
think Time Machine does that, though. So while it's backing up one of
Postgres' files, the others could still be modified. Then while it
backs up the next one, again others might be modified. So they can be
out of sync with each other as Tom says.

> However, due to hard drive corruption, is it possible that the restored database is screwy?

Yes, but as mentioned above this could also just be a result of the
way Time Machine does the backups.

> From what I can gather, things went a bit wrong with the hard drive about a week before I noticed it.
[...]
>>> I've tried using pg_resetxlog with the -f switch but I don't really know what I'm doing.
>>
>> pg_resetxlog is pretty much the only way out, given that you don't have
>> any other form of backup. But you haven't shown us exactly what you did
>> or exactly how it failed.
>>
> From memory, I ran:
>
> pg_resetxlog -f /usr/local/pgsql/data
>
> Was that right?
[...]

You will need to use some of the other options mentioned in the
documentation. Have a look at that and then ask again if you don't
understand the documentation.

--
Michael Wood <esiotrot@gmail.com>
Michael Wood <esiotrot@gmail.com> writes:
> On 3 August 2011 22:15, Matthew Fairley <mattfairley@netscape.net> wrote:
>> To clarify - the whole hard drive was restored from the Time Machine back up, not just the db, so I would have thought it would have been ok, that things would have been in sync, rather than out of sync.

> That is most likely not good enough. If Time Machine creates a
> snapshot of the whole hard drive before copying the files to the
> backup location then it would be almost OK (as if you pulled the power
> cable out the back of the machine while it was running.) I don't
> think Time Machine does that, though. So while it's backing up one of
> Postgres' files, the others could still be modified. Then while it
> backs up the next one, again others might be modified. So they can be
> out of sync with each other as Tom says.

If you've ever watched Time Machine do its thing, you'll notice that it
actually makes two passes over your drive per backup session --- the
second pass copies any files that changed during the first pass. So
it's not even trying to deliver an exact single-instant filesystem
snapshot. (I doubt OS X has support for such a thing anyway.)

>> From memory, I ran:
>> pg_resetxlog -f /usr/local/pgsql/data
>> Was that right?

> You will need to use some of the other options mentioned in the
> documentation. Have a look at that and then ask again if you don't
> understand the documentation.

He may or may not need any other options. -f might not have been a good
idea though ... it would have been nice to see the output from
pg_resetxlog before that. For that matter, we still haven't seen what
happened after trying pg_resetxlog.

regards, tom lane
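One way to capture the output asked for here, sketched under assumptions: stop the server, take a file-level copy of the data directory, run pg_resetxlog with -n (which only prints the values it would use and changes nothing), and only then run it for real, keeping -f for the case where it refuses to proceed without it. The data directory is the one quoted in the thread; the copy location and the function name are hypothetical. As written the script only prints the commands; setting APPLY=1 executes them.

```shell
#!/bin/sh
# Sketch only: prints the commands by default; set APPLY=1 to execute them.
PGDATA_DIR=${PGDATA_DIR:-/usr/local/pgsql/data}        # path quoted in the thread
PGDATA_COPY=${PGDATA_COPY:-/usr/local/pgsql/data.bak}  # hypothetical safety copy

# Print a command, and run it only when APPLY=1.
run() {
    echo "+ $*"
    if [ "${APPLY:-0}" = "1" ]; then "$@"; fi
}

inspect_then_reset() {
    # Keep an untouched copy first; pg_resetxlog cannot be undone.
    run cp -Rp "$PGDATA_DIR" "$PGDATA_COPY" &&
    # -n: show the values extracted from pg_control, write nothing.
    run pg_resetxlog -n "$PGDATA_DIR" &&
    # Real run without -f; add -f only if this step says it is required.
    run pg_resetxlog "$PGDATA_DIR"
}

inspect_then_reset
```

Posting the output of the -n step, plus whatever the server logs on the next start attempt, would give the list the information that is still missing.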