Thread: dumping tables from badly damaged db
Recently I had a problem where a system crash scribbed on some directories, which landed a bunch of files, including a few of the system table files for one of my databases, in lost+found along with a zillion other files. I might be able to find the file for this table/index in lost+found, but how do I know what name to give it in /var/lib/postgres/...? These files are named after OID's which I can't find because I can't connect to the database. If I can't get the file back, is there any way I can dump the data out of the remaining tables? Currently, I can't connect to the database with the following error: psql: FATAL 1: cannot open pg_class_relname_index: No such file or directory I suspect that's very bad, considering the centrality of the pg_class table... -- Brian Ristuccia bristucc@sw.starentnetworks.com
Brian Ristuccia <bristucc@sw.starentnetworks.com> writes: > Recently I had a problem where a system crash scribbed on some directories, > which landed a bunch of files, including a few of the system table files for > one of my databases, in lost+found along with a zillion other files. Ugh. > I might be able to find the file for this table/index in lost+found, but how > do I know what name to give it in /var/lib/postgres/...? I can't think of any reasonably simple way to identify the files by content (this might be something to try to fix in future, but for now you're stuck). Best idea I can think of is to examine "od -c" dumps and try to intuit which file is which. > Currently, I can't connect to the database with the following error: > psql: FATAL 1: cannot open pg_class_relname_index: No such file or directory You might be able to get past this by starting a standalone postgres with the -P command-line option (ignore system indexes). If so, try "select relname, relfilenode from pg_class". With luck that will give you a list of which file name is needed for each table. I'd not counsel trying to do more than that in the standalone backend until you've gotten at least the tables put back together. You do not need to try very hard to recreate the indexes --- you can use REINDEX to rebuild them. Good luck! regards, tom lane
Brian Ristuccia <bristucc@sw.starentnetworks.com> writes: > The standalone backend errors out with: > FATAL 1: _mdfd_getrelnfd: cannot open relation pg_trigger: No such file or > directory Well, if you can identify which of the lost+found files is pg_trigger, you can move it back into place and then try again. (Look for trigger names in the od -c dump...) All the system tables have fixed names (relfilenode values) which you can determine by consulting another database of the same PG version. pg_trigger is 16412 in 7.3, for instance. Lather, rinse, repeat until it comes up ... > My week-old backups are starting to look more and more attractive. I didn't say this was going to be painless. regards, tom lane
On Fri, Oct 31, 2003 at 02:05:57PM -0500, Tom Lane wrote: > > You might be able to get past this by starting a standalone postgres > with the -P command-line option (ignore system indexes). If so, try > "select relname, relfilenode from pg_class". With luck that will give > you a list of which file name is needed for each table. I'd not counsel > trying to do more than that in the standalone backend until you've > gotten at least the tables put back together. > The standalone backend errors out with: FATAL 1: _mdfd_getrelnfd: cannot open relation pg_trigger: No such file or directory I suspect my troubles may be more severe than just a missing index... My week-old backups are starting to look more and more attractive. Any other hints? Thanks. -Brian
On Sun, Nov 02, 2003 at 01:45:39PM -0500, Brian Ristuccia wrote: > > pg_dump: NOTICE: RelationBuildDesc: can't open deprecated_data1: No such file or directory > pg_dump: NOTICE: RelationBuildDesc: can't open unimportant_stuff: No such file or directory > pg_dump: handler procedure for procedural language plpgsql not found > > I can't drop the tables from psql: > > drop TABLE deprecated_data1 ; > ERROR: cannot open deprecated_data1: No such file or directory > Ok. I managed to drop these in the standalone backend with some hackery: backend> drop table useless_junk NOTICE: RelationBuildDesc: can't open useless_junk: No such file or directory ERROR: cannot open useless_junk: No such file or directory So I'd have to go select on pg_class and find the relfilenode, go touch that file so it exists. Try the drop again. Drop any indexes which don't exist. Find the relfilenode for the associated pg_toast table and go touch that file too. Not a lot of fun, but the destroyed tables are gone now. > Am I likely to have success by performing surgery on the pg_table relation > in order to remove these destroyed tables? A simple DROP LANGUAGE did the trick. I can now run pg_dump - I feel much better. -- Brian Ristuccia
On Fri, Oct 31, 2003 at 02:45:57PM -0500, Tom Lane wrote: > Lather, rinse, repeat until it comes up ... > Ok. That excercise was certainly no fun. For the various system tables I was missing, I managed to either find them in lost+found, or I copied the version from template0. The files for some relatively unimportant tables were irrecoverably destroyed. Of course, I realize that my database is probably now in a percarious state. I want to run a pg_dump, drop the db, and then rebuild into a fresh database. Except for selects on the destroyed tables, my database works as normal, so I'm confident I can get my data out. But pg_dump refuses to run. It complains about the destroyed tables and also something about plpgsql: pg_dump: NOTICE: RelationBuildDesc: can't open deprecated_data1: No such file or directory pg_dump: NOTICE: RelationBuildDesc: can't open unimportant_stuff: No such file or directory pg_dump: handler procedure for procedural language plpgsql not found I can't drop the tables from psql: drop TABLE deprecated_data1 ; ERROR: cannot open deprecated_data1: No such file or directory Am I likely to have success by performing surgery on the pg_table relation in order to remove these destroyed tables? Any hints on what I should do about the procedural language issue? -- Brian Ristuccia