Thread: Postgres Server crashed
Hi All,
I am a noob in database field.
Lately the postgres server crashed due to space constraints, pg_control file got lost and the server wouldn’t start. Had a old pg_control file placed in $PGDATA/global/ path.
I was able to start the server, however, the data seems to be a very old one (2011 Jan 07) (probably the time where pg_control I had pasted was last updated). How do I recover the rest of the data? The data directory seems to be showing entire data, I am guessing so, because the major 4 tables which seemed to be missed are the only contributors to around 100 GB and the DATAPATH still says 163 GB is utilized.
When trying to take a pg_dump, it says : pg_dump: schema with OID 29257 does not exist, this OID is there in pg_type, but not there in any other catalog tables.
Please help me!!
Regards,
Akash.
DISCLAIMER: The information in this message is confidential and may be legally privileged. It is intended solely for the addressee. Access to this message by anyone else is unauthorized. If you are not the intended recipient, any disclosure, copying, or distribution of the message, or any action or omission taken by you in reliance on it, is prohibited and may be unlawful. Please immediately contact the sender if you have received this message in error. Further, this e-mail may contain viruses and all reasonable precaution to minimize the risk arising there from is taken by OnMobile. OnMobile is not liable for any damage sustained by you as a result of any virus in this e-mail. All applicable virus checks should be carried out by you before opening this e-mail or any attachment thereto.
Thank you - OnMobile Global Limited.
kash Kodibail <akash.kodibail@onmobile.com> wrote: > Lately the postgres server crashed due to space constraints, The very first thing to do is to copy the entire directory structure for the database cluster to somewhere you can keep it safe until the problem has been resolved. If you have not yet done so, I strongly recommend that you do so before trying anything else for recovery. > pg_control file got lost and the server wouldn't start. "got lost"? It seems to me that it would be a Very Bad Thing if an out of disk space situation allowed PostgreSQL to cause that. Should we be looking for a bug, or is there a chance someone deleted this in an attempt to free space? If the latter, is there any chance that other files critical to data integrity were deleted, like in the pg_xlog or pg_clog directories? > Had a old pg_control file placed in $PGDATA/global/ path. > > I was able to start the server, however, the data seems to be a > very old one (2011 Jan 07) (probably the time where pg_control I > had pasted was last updated). And normal maintenance may be viewing newer data as old, due to transaction wrap-around from the old pg_control file, and removing it as part of normal cleanup. So you may have destroyed some of your more recent data by doing that. > How do I recover the rest of the data? Well, as I said above, start by trying to preserve whatever you have now. After you've done that, please provide more information. We don't even know what version of PostgreSQL you're talking about, for example. http://wiki.postgresql.org/wiki/Guide_to_reporting_problems It's somewhat likely that you'll be best off going to your last known good backup, assuming you've been doing regular backups. -Kevin
Thank You Kevin. Details: --------------- Running "select version()" gives me "PostgreSQL 8.4.4 on x86_64-unknown-linux-gnu, compiled by GCC gcc (GCC) 3.4.6 20060404(Red Hat 3.4.6-10), 64-bit" dmesg | head -10 -gives- "Linux version 2.6.18-92.el5 (mockbuild@builder10.centos.org) (gcc version 4.1.2 20071124 (Red Hat4.1.2-42)) #1 SMP Tue Jun 10 18:51:06 EDT 2008" Database mounted on 285 GB NFS HD --------------- I am copying it to a path as and when I write this email. Regular backups is something I am afraid I have not been taking. But this incident sure has taught me a lesson. pg_clog andpg_xlog are untouched. Thanks and Regards, Akash. -----Original Message----- From: Kevin Grittner [mailto:Kevin.Grittner@wicourts.gov] Sent: Monday, October 10, 2011 10:47 PM To: Akash Kodibail; pgsql-admin@postgresql.org Subject: Re: [ADMIN] Postgres Server crashed kash Kodibail <akash.kodibail@onmobile.com> wrote: > Lately the postgres server crashed due to space constraints, The very first thing to do is to copy the entire directory structure for the database cluster to somewhere you can keep it safe until the problem has been resolved. If you have not yet done so, I strongly recommend that you do so before trying anything else for recovery. > pg_control file got lost and the server wouldn't start. "got lost"? It seems to me that it would be a Very Bad Thing if an out of disk space situation allowed PostgreSQL to cause that. Should we be looking for a bug, or is there a chance someone deleted this in an attempt to free space? If the latter, is there any chance that other files critical to data integrity were deleted, like in the pg_xlog or pg_clog directories? > Had a old pg_control file placed in $PGDATA/global/ path. > > I was able to start the server, however, the data seems to be a > very old one (2011 Jan 07) (probably the time where pg_control I > had pasted was last updated). And normal maintenance may be viewing newer data as old, due to transaction wrap-around from the old pg_control file, and removing it as part of normal cleanup. So you may have destroyed some of your more recent data by doing that. > How do I recover the rest of the data? Well, as I said above, start by trying to preserve whatever you have now. After you've done that, please provide more information. We don't even know what version of PostgreSQL you're talking about, for example. http://wiki.postgresql.org/wiki/Guide_to_reporting_problems It's somewhat likely that you'll be best off going to your last known good backup, assuming you've been doing regular backups. -Kevin DISCLAIMER: The information in this message is confidential and may be legally privileged. It is intended solely for theaddressee. Access to this message by anyone else is unauthorized. If you are not the intended recipient, any disclosure,copying, or distribution of the message, or any action or omission taken by you in reliance on it, is prohibitedand may be unlawful. Please immediately contact the sender if you have received this message in error. Further,this e-mail may contain viruses and all reasonable precaution to minimize the risk arising there from is taken byOnMobile. OnMobile is not liable for any damage sustained by you as a result of any virus in this e-mail. All applicablevirus checks should be carried out by you before opening this e-mail or any attachment thereto. Thank you - OnMobile Global Limited.
On Mon, Oct 10, 2011 at 11:17 AM, Kevin Grittner <Kevin.Grittner@wicourts.gov> wrote: > And normal maintenance may be viewing newer data as old, due to > transaction wrap-around from the old pg_control file, and removing > it as part of normal cleanup. So you may have destroyed some of > your more recent data by doing that. At this point, once you have a backup, would you deploy pg_resetxlog?
Scott Marlowe <scott.marlowe@gmail.com> wrote: > Kevin Grittner <Kevin.Grittner@wicourts.gov> wrote: > >> And normal maintenance may be viewing newer data as old, due to >> transaction wrap-around from the old pg_control file, and >> removing it as part of normal cleanup. So you may have destroyed >> some of your more recent data by doing that. > > At this point, once you have a backup, would you deploy > pg_resetxlog? That's what I was thinking. I shudder to think what autovacuum may have done while running with the old pg_control file; but I don't know what else to do at this juncture. Anyone else have a better idea? -Kevin
I was hopeful of restoring the data from data files in $PGDATA path. I read this article about PITR using the recovery.conf,But I am not aware of the pre-requisites and not entirely sure about the concept beneath this. Would it not be possible? Regards, Akash. -----Original Message----- From: Kevin Grittner [mailto:Kevin.Grittner@wicourts.gov] Sent: Tuesday, October 11, 2011 12:33 AM To: Scott Marlowe Cc: Akash Kodibail; pgsql-admin@postgresql.org Subject: Re: [ADMIN] Postgres Server crashed Scott Marlowe <scott.marlowe@gmail.com> wrote: > Kevin Grittner <Kevin.Grittner@wicourts.gov> wrote: > >> And normal maintenance may be viewing newer data as old, due to >> transaction wrap-around from the old pg_control file, and >> removing it as part of normal cleanup. So you may have destroyed >> some of your more recent data by doing that. > > At this point, once you have a backup, would you deploy > pg_resetxlog? That's what I was thinking. I shudder to think what autovacuum may have done while running with the old pg_control file; but I don't know what else to do at this juncture. Anyone else have a better idea? -Kevin DISCLAIMER: The information in this message is confidential and may be legally privileged. It is intended solely for theaddressee. Access to this message by anyone else is unauthorized. If you are not the intended recipient, any disclosure,copying, or distribution of the message, or any action or omission taken by you in reliance on it, is prohibitedand may be unlawful. Please immediately contact the sender if you have received this message in error. Further,this e-mail may contain viruses and all reasonable precaution to minimize the risk arising there from is taken byOnMobile. OnMobile is not liable for any damage sustained by you as a result of any virus in this e-mail. All applicablevirus checks should be carried out by you before opening this e-mail or any attachment thereto. Thank you - OnMobile Global Limited.
On Mon, Oct 10, 2011 at 1:13 PM, Akash Kodibail <akash.kodibail@onmobile.com> wrote: > I was hopeful of restoring the data from data files in $PGDATA path. I read this article about PITR using the recovery.conf,But I am not aware of the pre-requisites and not entirely sure about the concept beneath this. With PITR you set up the system ahead of time to do this. It is not a rescue / recovery procedure for a broken database. I like Kevin fear that by starting it back up with an old pg_Controldata you may have made matters much worse, depending on what autovacuum might have done. Once you have a backup it's time to start experimenting, and I think pg_resetxlog is the best place to start.
Hi Scott, I believe pg_resetxlog has to have calculated and precise parameters: $PGDATA/pg_xlog has file -- 000000010000034A000000C0 of 16Meg $PGDATA/pg_multixact/members has file -- 0000 of 8k $PGDATA/pg_multixact/offsets has file -- 0000 of 8k Can you please help me out in the options that might have to be set for pg_resetxlog. Backing up is still in progress, so will try my luck with pg_resetxlog. Regards, Akash. -----Original Message----- From: Scott Marlowe [mailto:scott.marlowe@gmail.com] Sent: Tuesday, October 11, 2011 12:41 AM To: Akash Kodibail Cc: Kevin Grittner; pgsql-admin@postgresql.org Subject: Re: [ADMIN] Postgres Server crashed On Mon, Oct 10, 2011 at 1:13 PM, Akash Kodibail <akash.kodibail@onmobile.com> wrote: > I was hopeful of restoring the data from data files in $PGDATA path. I read this article about PITR using the recovery.conf,But I am not aware of the pre-requisites and not entirely sure about the concept beneath this. With PITR you set up the system ahead of time to do this. It is not a rescue / recovery procedure for a broken database. I like Kevin fear that by starting it back up with an old pg_Controldata you may have made matters much worse, depending on what autovacuum might have done. Once you have a backup it's time to start experimenting, and I think pg_resetxlog is the best place to start. DISCLAIMER: The information in this message is confidential and may be legally privileged. It is intended solely for theaddressee. Access to this message by anyone else is unauthorized. If you are not the intended recipient, any disclosure,copying, or distribution of the message, or any action or omission taken by you in reliance on it, is prohibitedand may be unlawful. Please immediately contact the sender if you have received this message in error. Further,this e-mail may contain viruses and all reasonable precaution to minimize the risk arising there from is taken byOnMobile. OnMobile is not liable for any damage sustained by you as a result of any virus in this e-mail. All applicablevirus checks should be carried out by you before opening this e-mail or any attachment thereto. Thank you - OnMobile Global Limited.
Akash Kodibail <akash.kodibail@onmobile.com> wrote: > Can you please help me out in the options that might have to be > set for pg_resetxlog. It should normally be able to calculate what it needs if you just point it at the data directory. I would run it with the -n (no operation) flag first, to see what it plans to do. If that looks sane, run again without the -n switch. I recommend that you read this page closely before proceeding: http://www.postgresql.org/docs/8.4/interactive/app-pgresetxlog.html -Kevin
On Mon, Oct 10, 2011 at 2:59 PM, Kevin Grittner <Kevin.Grittner@wicourts.gov> wrote: > Akash Kodibail <akash.kodibail@onmobile.com> wrote: > >> Can you please help me out in the options that might have to be >> set for pg_resetxlog. > > It should normally be able to calculate what it needs if you just > point it at the data directory. I would run it with the -n (no > operation) flag first, to see what it plans to do. If that looks > sane, run again without the -n switch. But it uses pg_controldata to determine what it needs automagically. So... Since his pg_controldata is not the up to date one he's gonna have to figure out what number to give it. I have no clue how to do that at this point.
Hi Scott/Kevin, Everything got recovered with pg_resetxlog with appropriate -m, -x, -O and -l options set appropriately described in "http://www.postgresql.org/docs/8.2/static/app-pgresetxlog.html".We had lost 4 tables, which were insignificant (comparedto losing everything), and did a sanity check on all the rest of the tables, seemed all fine. Please let me knowif you see any pitfalls in the way this was recovered. Now that things are normal, I would definitely look for a good backup, restoration process all set up. Stitch in time savesnine :-) Thanks a ton for all the support. Next destination postgres documentation. :-) Many thanks and Regards, Akash. -----Original Message----- From: Akash Kodibail Sent: Tuesday, October 11, 2011 1:02 AM To: 'Scott Marlowe' Cc: Kevin Grittner; pgsql-admin@postgresql.org Subject: RE: [ADMIN] Postgres Server crashed Hi Scott, I believe pg_resetxlog has to have calculated and precise parameters: $PGDATA/pg_xlog has file -- 000000010000034A000000C0 of 16Meg $PGDATA/pg_multixact/members has file -- 0000 of 8k $PGDATA/pg_multixact/offsets has file -- 0000 of 8k Can you please help me out in the options that might have to be set for pg_resetxlog. Backing up is still in progress, so will try my luck with pg_resetxlog. Regards, Akash. -----Original Message----- From: Scott Marlowe [mailto:scott.marlowe@gmail.com] Sent: Tuesday, October 11, 2011 12:41 AM To: Akash Kodibail Cc: Kevin Grittner; pgsql-admin@postgresql.org Subject: Re: [ADMIN] Postgres Server crashed On Mon, Oct 10, 2011 at 1:13 PM, Akash Kodibail <akash.kodibail@onmobile.com> wrote: > I was hopeful of restoring the data from data files in $PGDATA path. I read this article about PITR using the recovery.conf,But I am not aware of the pre-requisites and not entirely sure about the concept beneath this. With PITR you set up the system ahead of time to do this. It is not a rescue / recovery procedure for a broken database. I like Kevin fear that by starting it back up with an old pg_Controldata you may have made matters much worse, depending on what autovacuum might have done. Once you have a backup it's time to start experimenting, and I think pg_resetxlog is the best place to start. DISCLAIMER: The information in this message is confidential and may be legally privileged. It is intended solely for theaddressee. Access to this message by anyone else is unauthorized. If you are not the intended recipient, any disclosure,copying, or distribution of the message, or any action or omission taken by you in reliance on it, is prohibitedand may be unlawful. Please immediately contact the sender if you have received this message in error. Further,this e-mail may contain viruses and all reasonable precaution to minimize the risk arising there from is taken byOnMobile. OnMobile is not liable for any damage sustained by you as a result of any virus in this e-mail. All applicablevirus checks should be carried out by you before opening this e-mail or any attachment thereto. Thank you - OnMobile Global Limited.