Re: after using pg_resetxlog, db lost - Mailing list pgsql-performance
From | Shea,Dan [CIS] |
---|---|
Subject | Re: after using pg_resetxlog, db lost |
Date | |
Msg-id | 644D07D3D59D8F408CD01AC2F833D8C62B9217@cisxa.cmc.int.ec.gc.ca Whole thread Raw |
In response to | after using pg_resetxlog, db lost ("Shea,Dan [CIS]" <Dan.Shea@ec.gc.ca>) |
Responses |
Re: after using pg_resetxlog, db lost
|
List | pgsql-performance |
I determined the largest was 12,293,162 and set it to pg_resetxlog -x 15000000 /var/lib/pgsql/data I am now able to see all the data. I actually checked the log for the previous successfull startup before it the pg_control file was reset and it reported Jun 22 11:55:44 pascal postgres[24993]: [5-1] LOG: next transaction ID: 14820367; next OID: 727013114 So I entered pg_resetxlog -o 750000000 /var/lib/pgsql/data Setting oid value I couldn't set 10000/0, so tried below pg_resetxlog -l 10000,0 /var/lib/pgsql/data This seems to be wrong because the databse is complaining and shutting down Jun 24 15:02:05 murphy postgres[28061]: [6-1] LOG: checkpoint record is at 2710/1000050 Jun 24 15:02:05 murphy postgres[28061]: [7-1] LOG: redo record is at 2710/1000050; undo record is at 0/0; shutdown TRUE Jun 24 15:02:05 murphy postgres[28061]: [8-1] LOG: next transaction ID: 15000010; next OID: 750000000 Jun 24 15:02:05 murphy postgres[28061]: [9-1] LOG: database system was not properly shut down; automatic recovery in progress Jun 24 15:02:05 murphy postgres[28062]: [5-1] FATAL: the database system is starting up Jun 24 15:02:05 murphy postgres[28063]: [5-1] FATAL: the database system is starting up Jun 24 15:02:05 murphy postgres[28061]: [10-1] LOG: redo starts at 2710/1000090 Jun 24 15:02:05 murphy postgres[28061]: [11-1] PANIC: could not access status of transaction 15000030 Jun 24 15:02:05 murphy postgres[28061]: [11-2] DETAIL: could not read from file "/var/lib/pgsql/data/pg_clog/000E" at offset73728: Success Jun 24 15:02:05 murphy postgres[24771]: [5-1] LOG: startup process (PID 28061) was terminated by signal 6 Jun 24 15:02:05 murphy postgres[24771]: [6-1] LOG: aborting startup due to startup process failure Jun 24 15:50:51 murphy sshd(pam_unix)[690]: session opened for user root by (uid=0) Jun 24 15:54:47 murphy su(pam_unix)[1541]: session opened for user postgres by root(uid=0) Jun 24 16:03:47 murphy su(pam_unix)[2911]: session opened for user postgres by root(uid=0) Jun 24 16:03:48 murphy su(pam_unix)[2911]: session closed for user postgres Jun 24 16:03:48 murphy postgres[3182]: [1-1] LOG: could not create IPv6 socket: Address family not supported by protocol Jun 24 16:03:48 murphy postgres[3188]: [2-1] LOG: database system was interrupted while in recovery at 2004-06-24 15:02:05GMT Jun 24 16:03:48 murphy postgres[3188]: [2-2] HINT: This probably means that some data is corrupted and you will have touse the last backup for recovery. Jun 24 16:03:48 murphy postgres[3188]: [3-1] LOG: checkpoint record is at 2710/1000050 Jun 24 16:03:48 murphy postgres[3188]: [4-1] LOG: redo record is at 2710/1000050; undo record is at 0/0; shutdown TRUE Jun 24 16:03:48 murphy postgres[3188]: [5-1] LOG: next transaction ID: 15000010; next OID: 750000000 Jun 24 16:03:48 murphy postgres[3188]: [6-1] LOG: database system was not properly shut down; automatic recovery in progress Jun 24 16:03:48 murphy postgres[3188]: [7-1] LOG: redo starts at 2710/1000090 Jun 24 16:03:48 murphy postgres[3188]: [8-1] PANIC: could not access status of transaction 15000030 Jun 24 16:03:48 murphy postgres[3188]: [8-2] DETAIL: could not read from file "/var/lib/pgsql/data/pg_clog/000E" at offset73728: Success Jun 24 16:03:48 murphy postgres[3182]: [2-1] LOG: startup process (PID 3188) was terminated by signal 6 Jun 24 16:03:48 murphy postgres[3182]: [3-1] LOG: aborting startup due to startup process failure How do I set the xlog properly, or rather to 10000/0? Dan. -----Original Message----- From: Tom Lane [mailto:tgl@sss.pgh.pa.us] Sent: Wednesday, June 23, 2004 11:41 PM To: Shea,Dan [CIS] Cc: pgsql-performance@postgresql.org Subject: Re: [PERFORM] after using pg_resetxlog, db lost "Shea,Dan [CIS]" <Dan.Shea@ec.gc.ca> writes: > Tom I see you from past emails that you reference using -i -f with > pg_filedump. I have tried this, but do not know what I am looking at. What you want to look at is valid XMIN and XMAX values. In this example: > Item 1 -- Length: 196 Offset: 4292 (0x10c4) Flags: USED > XID: min (2) CMIN|XMAX: 211 CMAX|XVAC: 469 > Block Id: 0 linp Index: 1 Attributes: 24 Size: 28 > infomask: 0x2912 (HASVARWIDTH|HASOID|XMIN_COMMITTED|XMAX_INVALID|UPDATED) the infomask shows XMIN_COMMITTED, so xmin (here 2) is valid, but it also shows XMAX_INVALID, so the putative XMAX (211) should be ignored. In general the xmin field should be valid, but xmax shares storage with cmin and so you have to look at the infomask bits to know whether to believe that the cmin/xmax field represents a transaction ID. The cmax/xvac field could also hold a transaction ID. If I had only the above data to go on, I'd guess that the current transaction counter is at least 469. Under normal circumstances, command counter values (cmin or cmax) are unlikely to exceed a few hundred, while the transaction IDs you are looking for are likely to be much larger. So you could get away with just computing the max of *all* the numbers you see in xmin, cmin/xmax, or cmax/cvac, and then using something a million or so bigger for safety factor. regards, tom lane
pgsql-performance by date: