Thread: Data corruption zero a file - help!!
Hi all,
I posted this on the novice mailing list and as yet had no response, hopefully someone here can help.
While we where trying to do a vacuum / pg_dump we encountered the following error:
postgres@db:~$ pg_dumpall -d > dump.pg
pg_dump: dumpClasses(): SQL command failed
pg_dump: Error message from server: ERROR: invalid page header in block
9022921 of relation "gap"
pg_dump: The command was: FETCH 100 FROM _pg_dump_cursor
pg_dumpall: pg_dump failed on database "monashprotein", exiting
Now after doing some searches I managed to work out that the data corruption starts at 902292.137
using this sql:
SELECT * FROM gap WHERE ctid = '(902292,$x)'
Where $x I changed from 1-150.
as mentioned on this post:http://archives.postgresql.org/pgsql-general/2005-11/msg01148.php
Following this post it seems all we need to do is re-zero from this point on. However we're not sure which file to do this in.
I've worked out the database/relation files are
$PGDATA/37958/111685332.* with the max * being 101.
Any help locating which file we need to do the re-zero thing would be really appreciated.
Cheers
Noel
I posted this on the novice mailing list and as yet had no response, hopefully someone here can help.
While we where trying to do a vacuum / pg_dump we encountered the following error:
postgres@db:~$ pg_dumpall -d > dump.pg
pg_dump: dumpClasses(): SQL command failed
pg_dump: Error message from server: ERROR: invalid page header in block
9022921 of relation "gap"
pg_dump: The command was: FETCH 100 FROM _pg_dump_cursor
pg_dumpall: pg_dump failed on database "monashprotein", exiting
Now after doing some searches I managed to work out that the data corruption starts at 902292.137
using this sql:
SELECT * FROM gap WHERE ctid = '(902292,$x)'
Where $x I changed from 1-150.
as mentioned on this post:http://archives.postgresql.org/pgsql-general/2005-11/msg01148.php
Following this post it seems all we need to do is re-zero from this point on. However we're not sure which file to do this in.
I've worked out the database/relation files are
$PGDATA/37958/111685332.* with the max * being 101.
Any help locating which file we need to do the re-zero thing would be really appreciated.
Cheers
Noel
Attachment
On Wed, Mar 01, 2006 at 04:12:53PM +1100, Noel Faux wrote: > Now after doing some searches I managed to work out that the data > corruption starts at 902292.137 > using this sql: > SELECT * FROM gap WHERE ctid = '(902292,$x)' > Where $x I changed from 1-150. > > as mentioned on this > post:http://archives.postgresql.org/pgsql-general/2005-11/msg01148.php > > Following this post it seems all we need to do is re-zero from this > point on. However we're not sure which file to do this in. This earlier message in that thread should be helpful: http://archives.postgresql.org/pgsql-general/2005-11/msg01141.php > I've worked out the database/relation files are > $PGDATA/37958/111685332.* with the max * being 101. Is your table really over 100G? Anyway, if the block size is 8192 then 902292 sould be in the .6 file. If you can spare the time then you might run the dd and od commands that Tom Lane mentions in the above message and post the output. I think the command would be dd bs=8k skip=115860 count=1 if=$PGDATA/37958/111685332.6 | od -x See Tom's message for how I arrived at .6 and 115860 (and verify the math yourself). -- Michael Fuhr
On Tue, Feb 28, 2006 at 10:54:48PM -0700, Michael Fuhr wrote: > Is your table really over 100G? Anyway, if the block size is 8192 > then 902292 sould be in the .6 file. If you can spare the time > then you might run the dd and od commands that Tom Lane mentions > in the above message and post the output. pg_filedump is also handy for examining the backend's data files. http://sources.redhat.com/rhdb/utilities.html -- Michael Fuhr
Thanks for the pointers Michael!
Which config file will tell us how big the bock sizes are?
Cheers
Noel
Michael Fuhr wrote:
Which config file will tell us how big the bock sizes are?
Cheers
Noel
Michael Fuhr wrote:
On Wed, Mar 01, 2006 at 04:12:53PM +1100, Noel Faux wrote:Now after doing some searches I managed to work out that the data corruption starts at 902292.137 using this sql: SELECT * FROM gap WHERE ctid = '(902292,$x)' Where $x I changed from 1-150. as mentioned on this post:http://archives.postgresql.org/pgsql-general/2005-11/msg01148.php Following this post it seems all we need to do is re-zero from this point on. However we're not sure which file to do this in.This earlier message in that thread should be helpful: http://archives.postgresql.org/pgsql-general/2005-11/msg01141.phpI've worked out the database/relation files are $PGDATA/37958/111685332.* with the max * being 101.Is your table really over 100G? Anyway, if the block size is 8192 then 902292 sould be in the .6 file. If you can spare the time then you might run the dd and od commands that Tom Lane mentions in the above message and post the output. I think the command would be dd bs=8k skip=115860 count=1 if=$PGDATA/37958/111685332.6 | od -x See Tom's message for how I arrived at .6 and 115860 (and verify the math yourself).
Attachment
On Fri, Mar 03, 2006 at 09:56:40AM +1100, Noel Faux wrote: > Which config file will tell us how big the bock sizes are? Run the query "SHOW block_size" in the database or use pg_controldata from the shell. It's probably 8192; changing it is done at compile time. -- Michael Fuhr
> Is your table really over 100G? Yeap 600+ million rows. > Anyway, if the block size is 8192 > then 902292 sould be in the .6 file. If you can spare the time > then you might run the dd and od commands that Tom Lane mentions > in the above message and post the output. Here's the output: 0000000 002a 0000 2880 def5 0010 0000 0234 0240 0000020 2000 2001 9fc8 0062 9f90 0062 9f58 0062 0000040 9f20 0062 9ee8 0062 9eb0 0062 9e78 0062 0000060 9e40 0062 9e08 0062 9dd0 0062 9d98 0062 0000100 9d60 0062 9d28 0062 9cf0 0062 9cb8 0062 0000120 9c80 0062 9c48 0062 9c10 0062 9bd8 0062 0000140 9ba0 0062 9b68 0062 9b30 0062 9af8 0062 0000160 9ac0 0062 9a88 0062 9a50 0062 9a18 0062 0000200 99e0 0062 99a8 0062 9970 0062 9938 0062 0000220 9900 0062 98c8 0062 9890 0062 9858 0062 0000240 9820 0062 97e8 0062 97b0 0062 9778 0062 0000260 9740 0062 9708 0062 96d0 0062 9698 0062 0000300 9660 0062 9628 0062 95f0 0062 95b8 0062 0000320 9580 0062 9548 0062 9510 0062 94d8 0062 0000340 94a0 0062 9468 0062 9430 0062 93f8 0062 0000360 93c0 0062 9388 0062 9350 0062 9318 0062 0000400 92e0 0062 92a8 0062 9270 0062 9238 0062 0000420 9200 0062 91c8 0062 9190 0062 9158 0062 0000440 9120 0062 90e8 0062 90b0 0062 9078 0062 0000460 9040 0062 9008 0062 8fd0 0062 8f98 0062 0000500 8f60 0062 8f28 0062 8ef0 0062 8eb8 0062 0000520 8e80 0062 8e48 0062 8e10 0062 8dd8 0062 0000540 8da0 0062 8d68 0062 8d30 0062 8cf8 0062 0000560 8cc0 0062 8c88 0062 8c50 0062 8c18 0062 0000600 8be0 0062 8ba8 0062 8b70 0062 8b38 0062 0000620 8b00 0062 8ac8 0062 8a90 0062 8a58 0062 0000640 8a20 0062 89e8 0062 89b0 0062 8978 0062 0000660 8940 0062 8908 0062 88d0 0062 8898 0062 0000700 8860 0062 8828 0062 87f0 0062 87b8 0062 0000720 8780 0062 8748 0062 8710 0062 86d8 0062 0000740 86a0 0062 8668 0062 8630 0062 85f8 0062 0000760 85c0 0062 8588 0062 8550 0062 8518 0062 0001000 84e0 0062 84a8 0062 8470 0062 8438 0062 0001020 8400 0062 83c8 0062 8390 0062 8358 0062 0001040 8320 0062 82e8 0062 82b0 0062 8278 0062 0001060 8240 0062 0000 0000 0000 0000 0000 0000 0001100 0002 0000 0000 0000 0000 0000 000d c494 0001120 0088 0005 0910 0020 0000 0000 c93c 303b 0001140 bbc9 07ed 41be 0266 0064 0000 000c 0000 0001160 0000 0000 0000 0000 0002 0000 0000 0000 0001200 0000 0000 000d c494 0087 0005 0910 0020 0001220 0000 0000 c93b 303b bbc6 07ed 41be 0266 0001240 001c 0000 0002 0000 0000 0000 0000 0000 0001260 0002 0000 0000 0000 0000 0000 000d c494 0001300 0086 0005 0910 0020 0000 0000 c93a 303b 0001320 bbc0 07ed 41bb 0266 0064 0000 0002 0000 0001340 0000 0000 0000 0000 0002 0000 0000 0000 0001360 0000 0000 000d c494 0085 0005 0910 0020 0001400 0000 0000 c939 303b bbbd 07ed 41bb 0266 0001420 0031 0000 0001 0000 0000 0000 0000 0000 0001440 0002 0000 0000 0000 0000 0000 000d c494 0001460 0084 0005 0910 0020 0000 0000 c938 303b 0001500 bbba 07ed 41ba 0266 0062 0000 0004 0000 0001520 0001 0000 0000 0000 0002 0000 0000 0000 0001540 0000 0000 000d c494 0083 0005 0910 0020 0001560 0000 0000 c937 303b bbb8 07ed 41ba 0266 0001600 0037 0000 0001 0000 0000 0000 0000 0000 0001620 0002 0000 0000 0000 0000 0000 000d c494 0001640 0082 0005 0910 0020 0000 0000 c936 303b 0001660 bbb3 07ed 41b7 0266 006a 0000 0004 0000 0001700 0000 0000 0000 0000 0002 0000 0000 0000 0001720 0000 0000 000d c494 0081 0005 0910 0020 0001740 0000 0000 c935 303b bbb2 07ed 41b7 0266 0001760 0020 0000 0001 0000 0000 0000 0000 0000 0002000 0002 0000 0000 0000 0000 0000 000d c494 0002020 0080 0005 0910 0020 0000 0000 c934 303b 0002040 bbae 07ed 41b6 0266 0062 0000 0003 0000 0002060 0000 0000 0000 0000 0002 0000 0000 0000 0002100 0000 0000 000d c494 007f 0005 0910 0020 0002120 0000 0000 c933 303b bbaa 07ed 41b3 0266 0002140 0023 0000 0002 0000 0000 0000 0000 0000 0002160 0002 0000 0000 0000 0000 0000 000d c494 0002200 007e 0005 0910 0020 0000 0000 c932 303b 0002220 bba6 07ed 41b2 0266 0064 0000 0002 0000 0002240 0000 0000 0000 0000 0002 0000 0000 0000 0002260 0000 0000 000d c494 007d 0005 0910 0020 0002300 0000 0000 c931 303b bba3 07ed 41b2 0266 0002320 0036 0000 0001 0000 0000 0000 0000 0000 0002340 0002 0000 0000 0000 0000 0000 000d c494 0002360 007c 0005 0910 0020 0000 0000 c930 303b 0002400 bba0 07ed 41b0 0266 006a 0000 000b 0000 0002420 0000 0000 0000 0000 0002 0000 0000 0000 0002440 0000 0000 000d c494 007b 0005 0910 0020 0002460 0000 0000 c92f 303b bb9d 07ed 41b0 0266 0002500 0039 0000 0001 0000 0000 0000 0000 0000 0002520 0002 0000 0000 0000 0000 0000 000d c494 0002540 007a 0005 0910 0020 0000 0000 c92e 303b 0002560 bb98 07ed 41ad 0266 006a 0000 000b 0000 0002600 0000 0000 0000 0000 0002 0000 0000 0000 0002620 0000 0000 000d c494 0079 0005 0910 0020 0002640 0000 0000 c92d 303b bb97 07ed 41ad 0266 0002660 0039 0000 0001 0000 0000 0000 0000 0000 0002700 0002 0000 0000 0000 0000 0000 000d c494 0002720 0078 0005 0910 0020 0000 0000 c92c 303b 0002740 bb94 07ed 41ac 0266 0065 0000 0001 0000 0002760 0000 0000 0000 0000 0002 0000 0000 0000 0003000 0000 0000 000d c494 0077 0005 0910 0020 0003020 0000 0000 c92b 303b bb91 07ed 41ac 0266 0003040 0039 0000 0001 0000 0000 0000 0000 0000 0003060 0002 0000 0000 0000 0000 0000 000d c494 0003100 0076 0005 0910 0020 0000 0000 c92a 303b 0003120 bb8d 07ed 41a9 0266 0065 0000 0001 0000 0003140 0000 0000 0000 0000 0002 0000 0000 0000 0003160 0000 0000 000d c494 0075 0005 0910 0020 0003200 0000 0000 c929 303b bb8b 07ed 41a9 0266 0003220 0039 0000 0001 0000 0000 0000 0000 0000 0003240 0002 0000 0000 0000 0000 0000 000d c494 0003260 0074 0005 0910 0020 0000 0000 c928 303b 0003300 bb86 07ed 41a6 0266 0065 0000 0001 0000 0003320 0000 0000 0000 0000 0002 0000 0000 0000 0003340 0000 0000 000d c494 0073 0005 0910 0020 0003360 0000 0000 c927 303b bb85 07ed 41a6 0266 0003400 0039 0000 0001 0000 0000 0000 0000 0000 0003420 0002 0000 0000 0000 0000 0000 000d c494 0003440 0072 0005 0910 0020 0000 0000 c926 303b 0003460 bb7f 07ed 41a3 0266 0065 0000 0001 0000 0003500 0000 0000 0000 0000 0002 0000 0000 0000 0003520 0000 0000 000d c494 0071 0005 0910 0020 0003540 0000 0000 c925 303b bb7d 07ed 41a3 0266 0003560 0039 0000 0001 0000 0000 0000 0000 0000 0003600 0002 0000 0000 0000 0000 0000 000d c494 0003620 0070 0005 0910 0020 0000 0000 c924 303b 0003640 bb76 07ed 41a1 0266 0068 0000 0004 0000 0003660 0000 0000 0000 0000 0002 0000 0000 0000 0003700 0000 0000 000d c494 006f 0005 0910 0020 0003720 0000 0000 c923 303b bb74 07ed 41a1 0266 0003740 0034 0000 0001 0000 0000 0000 0000 0000 0003760 0002 0000 0000 0000 0000 0000 000d c494 0004000 006e 0005 0910 0020 0000 0000 c922 303b 0004020 bb70 07ed 419d 0266 0050 0000 0001 0000 0004040 0001 0000 0000 0000 0002 0000 0000 0000 0004060 0000 0000 000d c494 006d 0005 0910 0020 0004100 0000 0000 c921 303b bb6e 07ed 419d 0266 0004120 0066 0000 0001 0000 0000 0000 0000 0000 0004140 0002 0000 0000 0000 0000 0000 000d c494 0004160 006c 0005 0910 0020 0000 0000 c920 303b 0004200 bb6a 07ed 419b 0266 0065 0000 0001 0000 0004220 0000 0000 0000 0000 0002 0000 0000 0000 0004240 0000 0000 000d c494 006b 0005 0910 0020 0004260 0000 0000 c91f 303b bb67 07ed 419b 0266 0004300 0038 0000 0001 0000 0000 0000 0000 0000 0004320 0002 0000 0000 0000 0000 0000 000d c494 0004340 006a 0005 0910 0020 0000 0000 c91e 303b 0004360 bb64 07ed 4199 0266 0069 0000 0001 0000 0004400 0001 0000 0000 0000 0002 0000 0000 0000 0004420 0000 0000 000d c494 0069 0005 0910 0020 0004440 0000 0000 c91d 303b bb61 07ed 4199 0266 0004460 0060 0000 0005 0000 0000 0000 0000 0000 0004500 0002 0000 0000 0000 0000 0000 000d c494 0004520 0068 0005 0910 0020 0000 0000 c91c 303b 0004540 bb5c 07ed 4196 0266 00f4 0000 0001 0000 0004560 0001 0000 0000 0000 0002 0000 0000 0000 0004600 0000 0000 000d c494 0067 0005 0910 0020 0004620 0000 0000 c91b 303b bb5b 07ed 4196 0266 0004640 00ce 0000 0006 0000 0001 0000 0000 0000 0004660 0002 0000 0000 0000 0000 0000 000d c494 0004700 0066 0005 0910 0020 0000 0000 c91a 303b 0004720 bb59 07ed 4196 0266 00a9 0000 0002 0000 0004740 0001 0000 0000 0000 0002 0000 0000 0000 0004760 0000 0000 000d c494 0065 0005 0910 0020 0005000 0000 0000 c919 303b bb57 07ed 4196 0266 0005020 000c 0000 0001 0000 0000 0000 0000 0000 0005040 0002 0000 0000 0000 0000 0000 000d c494 0005060 0064 0005 0910 0020 0000 0000 c918 303b 0005100 bb51 07ed 4195 0266 0065 0000 0004 0000 0005120 0000 0000 0000 0000 0002 0000 0000 0000 0005140 0000 0000 000d c494 0063 0005 0910 0020 0005160 0000 0000 c917 303b bb4f 07ed 4195 0266 0005200 0035 0000 0001 0000 0000 0000 0000 0000 0005220 0002 0000 0000 0000 0000 0000 000d c494 0005240 0062 0005 0910 0020 0000 0000 c916 303b 0005260 bb4d 07ed 4192 0266 0063 0000 0004 0000 0005300 0000 0000 0000 0000 0002 0000 0000 0000 0005320 0000 0000 000d c494 0061 0005 0910 0020 0005340 0000 0000 c915 303b bb4a 07ed 4192 0266 0005360 003a 0000 0001 0000 0000 0000 0000 0000 0005400 0002 0000 0000 0000 0000 0000 000d c494 0005420 0060 0005 0910 0020 0000 0000 c914 303b 0005440 bb45 07ed 418f 0266 0067 0000 0007 0000 0005460 0000 0000 0000 0000 0002 0000 0000 0000 0005500 0000 0000 000d c494 005f 0005 0910 0020 0005520 0000 0000 c913 303b bb43 07ed 418f 0266 0005540 0019 0000 0002 0000 0000 0000 0000 0000 0005560 0002 0000 0000 0000 0000 0000 000d c494 0005600 005e 0005 0910 0020 0000 0000 c912 303b 0005620 bb3f 07ed 418d 0266 0066 0000 0003 0000 0005640 0000 0000 0000 0000 0002 0000 0000 0000 0005660 0000 0000 000d c494 005d 0005 0910 0020 0005700 0000 0000 c911 303b bb3d 07ed 418d 0266 0005720 0039 0000 0003 0000 0000 0000 0000 0000 0005740 0002 0000 0000 0000 0000 0000 000d c494 0005760 005c 0005 0910 0020 0000 0000 c910 303b 0006000 bb39 07ed 418a 0266 0063 0000 0003 0000 0006020 0000 0000 0000 0000 0002 0000 0000 0000 0006040 0000 0000 000d c494 005b 0005 0910 0020 0006060 0000 0000 c90f 303b bb31 07ed 4188 0266 0006100 0065 0000 000a 0000 0000 0000 0000 0000 0006120 0002 0000 0000 0000 0000 0000 000d c494 0006140 005a 0005 0910 0020 0000 0000 c90e 303b 0006160 bb2e 07ed 4188 0266 0035 0000 0001 0000 0006200 0000 0000 0000 0000 0002 0000 0000 0000 0006220 0000 0000 000d c494 0059 0005 0910 0020 0006240 0000 0000 c90d 303b bb29 07ed 4186 0266 0006260 0067 0000 0001 0000 0000 0000 0000 0000 0006300 0002 0000 0000 0000 0000 0000 000d c494 0006320 0058 0005 0910 0020 0000 0000 c90c 303b 0006340 bb26 07ed 4186 0266 0037 0000 0001 0000 0006360 0000 0000 0000 0000 0002 0000 0000 0000 0006400 0000 0000 000d c494 0057 0005 0910 0020 0006420 0000 0000 c90b 303b bb22 07ed 4183 0266 0006440 0065 0000 0003 0000 0000 0000 0000 0000 0006460 0002 0000 0000 0000 0000 0000 000d c494 0006500 0056 0005 0910 0020 0000 0000 c90a 303b 0006520 bb1f 07ed 4183 0266 0039 0000 0001 0000 0006540 0000 0000 0000 0000 0002 0000 0000 0000 0006560 0000 0000 000d c494 0055 0005 0910 0020 0006600 0000 0000 c909 303b bb1c 07ed 4181 0266 0006620 0039 0000 0001 0000 0000 0000 0000 0000 0006640 0002 0000 0000 0000 0000 0000 000d c494 0006660 0054 0005 0910 0020 0000 0000 c908 303b 0006700 bb19 07ed 417e 0266 0037 0000 0001 0000 0006720 0000 0000 0000 0000 0002 0000 0000 0000 0006740 0000 0000 000d c494 0053 0005 0910 0020 0006760 0000 0000 c907 303b bb14 07ed 417c 0266 0007000 0063 0000 0003 0000 0000 0000 0000 0000 0007020 0002 0000 0000 0000 0000 0000 000d c494 0007040 0052 0005 0910 0020 0000 0000 c906 303b 0007060 bb11 07ed 417c 0266 0039 0000 0001 0000 0007100 0000 0000 0000 0000 0002 0000 0000 0000 0007120 0000 0000 000d c494 0051 0005 0910 0020 0007140 0000 0000 c905 303b bb0d 07ed 417b 0266 0007160 0064 0000 0003 0000 0000 0000 0000 0000 0007200 0002 0000 0000 0000 0000 0000 000d c494 0007220 0050 0005 0910 0020 0000 0000 c904 303b 0007240 bb0a 07ed 417b 0266 0035 0000 0001 0000 0007260 0000 0000 0000 0000 0002 0000 0000 0000 0007300 0000 0000 000d c494 004f 0005 0910 0020 0007320 0000 0000 c903 303b bb06 07ed 4178 0266 0007340 006a 0000 0004 0000 0000 0000 0000 0000 0007360 0002 0000 0000 0000 0000 0000 000d c494 0007400 004e 0005 0910 0020 0000 0000 c902 303b 0007420 bb04 07ed 4178 0266 0033 0000 0002 0000 0007440 0000 0000 0000 0000 0002 0000 0000 0000 0007460 0000 0000 000d c494 004d 0005 0910 0020 0007500 0000 0000 c901 303b bb01 07ed 4178 0266 0007520 0021 0000 0002 0000 0000 0000 0000 0000 0007540 0002 0000 0000 0000 0000 0000 000d c494 0007560 004c 0005 0910 0020 0000 0000 c900 303b 0007600 bafd 07ed 4176 0266 0036 0000 0001 0000 0007620 0000 0000 0000 0000 0002 0000 0000 0000 0007640 0000 0000 000d c494 004b 0005 0910 0020 0007660 0000 0000 c8ff 303b baf7 07ed 4174 0266 0007700 0031 0000 0001 0000 0000 0000 0000 0000 0007720 0002 0000 0000 0000 0000 0000 000d c494 0007740 004a 0005 0910 0020 0000 0000 c8fe 303b 0007760 baf5 07ed 4172 0266 0039 0000 0001 0000 0010000 0000 0000 0000 0000 0002 0000 0000 0000 0010020 0000 0000 000d c494 0049 0005 0910 0020 0010040 0000 0000 c8fd 303b baef 07ed 416f 0266 0010060 0064 0000 000c 0000 0000 0000 0000 0000 0010100 0002 0000 0000 0000 0000 0000 000d c494 0010120 0048 0005 0910 0020 0000 0000 c8fc 303b 0010140 baee 07ed 416f 0266 0018 0000 0002 0000 0010160 0000 0000 0000 0000 0002 0000 0000 0000 0010200 0000 0000 000d c494 0047 0005 0910 0020 0010220 0000 0000 c8fb 303b baea 07ed 416e 0266 0010240 0037 0000 0001 0000 0000 0000 0000 0000 0010260 0002 0000 0000 0000 0000 0000 000d c494 0010300 0046 0005 0910 0020 0000 0000 c8fa 303b 0010320 bae5 07ed 416c 0266 0055 0000 0001 0000 0010340 0001 0000 0000 0000 0002 0000 0000 0000 0010360 0000 0000 000d c494 0045 0005 0910 0020 0010400 0000 0000 c8f9 303b bae3 07ed 416c 0266 0010420 0038 0000 0001 0000 0000 0000 0000 0000 0010440 0002 0000 0000 0000 0000 0000 000d c494 0010460 0044 0005 0910 0020 0000 0000 c8f8 303b 0010500 badf 07ed 416a 0266 0039 0000 0003 0000 0010520 0000 0000 0000 0000 0002 0000 0000 0000 0010540 0000 0000 000d c494 0043 0005 0910 0020 0010560 0000 0000 c8f7 303b badb 07ed 4168 0266 0010600 0062 0000 0002 0000 0000 0000 0000 0000 0010620 0002 0000 0000 0000 0000 0000 000d c494 0010640 0042 0005 0910 0020 0000 0000 c8f6 303b 0010660 bad6 07ed 4166 0266 0036 0000 0001 0000 0010700 0000 0000 0000 0000 0002 0000 0000 0000 0010720 0000 0000 000d c494 0041 0005 0910 0020 0010740 0000 0000 c8f5 303b bad2 07ed 4164 0266 0010760 0064 0000 0002 0000 0000 0000 0000 0000 0011000 0002 0000 0000 0000 0000 0000 000d c494 0011020 0040 0005 0910 0020 0000 0000 c8f4 303b 0011040 bacf 07ed 4164 0266 001f 0000 0001 0000 0011060 0000 0000 0000 0000 0002 0000 0000 0000 0011100 0000 0000 000d c494 003f 0005 0910 0020 0011120 0000 0000 c8f3 303b baca 07ed 4162 0266 0011140 0037 0000 0001 0000 0000 0000 0000 0000 0011160 0002 0000 0000 0000 0000 0000 000d c494 0011200 003e 0005 0910 0020 0000 0000 c8f2 303b 0011220 bac5 07ed 4160 0266 0067 0000 0004 0000 0011240 0000 0000 0000 0000 0002 0000 0000 0000 0011260 0000 0000 000d c494 003d 0005 0910 0020 0011300 0000 0000 c8f1 303b bac1 07ed 415e 0266 0011320 0037 0000 0001 0000 0000 0000 0000 0000 0011340 0002 0000 0000 0000 0000 0000 000d c494 0011360 003c 0005 0910 0020 0000 0000 c8f0 303b 0011400 babc 07ed 415b 0266 0066 0000 0006 0000 0011420 0000 0000 0000 0000 0002 0000 0000 0000 0011440 0000 0000 000d c494 003b 0005 0910 0020 0011460 0000 0000 c8ef 303b baba 07ed 415b 0266 0011500 0039 0000 0001 0000 0000 0000 0000 0000 0011520 0002 0000 0000 0000 0000 0000 000d c494 0011540 003a 0005 0910 0020 0000 0000 c8ee 303b 0011560 bab7 07ed 415a 0266 0064 0000 000b 0000 0011600 0000 0000 0000 0000 0002 0000 0000 0000 0011620 0000 0000 000d c494 0039 0005 0910 0020 0011640 0000 0000 c8ed 303b bab4 07ed 415a 0266 0011660 0020 0000 0002 0000 0000 0000 0000 0000 0011700 0002 0000 0000 0000 0000 0000 000d c494 0011720 0038 0005 0910 0020 0000 0000 c8ec 303b 0011740 bab0 07ed 4158 0266 0063 0000 0003 0000 0011760 0000 0000 0000 0000 0002 0000 0000 0000 0012000 0000 0000 000d c494 0037 0005 0910 0020 0012020 0000 0000 c8eb 303b baaa 07ed 4153 0266 0012040 0062 0000 0002 0000 0000 0000 0000 0000 0012060 0002 0000 0000 0000 0000 0000 000d c494 0012100 0036 0005 0910 0020 0000 0000 c8ea 303b 0012120 baa8 07ed 4153 0266 0037 0000 0001 0000 0012140 0000 0000 0000 0000 0002 0000 0000 0000 0012160 0000 0000 000d c494 0035 0005 0910 0020 0012200 0000 0000 c8e9 303b baa2 07ed 4150 0266 0012220 0065 0000 000a 0000 0000 0000 0000 0000 0012240 0002 0000 0000 0000 0000 0000 000d c494 0012260 0034 0005 0910 0020 0000 0000 c8e8 303b 0012300 ba9f 07ed 414f 0266 0065 0000 0002 0000 0012320 0000 0000 0000 0000 0002 0000 0000 0000 0012340 0000 0000 000d c494 0033 0005 0910 0020 0012360 0000 0000 c8e7 303b ba9c 07ed 414f 0266 0012400 0034 0000 0001 0000 0000 0000 0000 0000 0012420 0002 0000 0000 0000 0000 0000 000d c494 0012440 0032 0005 0910 0020 0000 0000 c8e6 303b 0012460 ba98 07ed 414d 0266 006b 0000 0002 0000 0012500 0000 0000 0000 0000 0002 0000 0000 0000 0012520 0000 0000 000d c494 0031 0005 0910 0020 0012540 0000 0000 c8e5 303b ba95 07ed 414d 0266 0012560 0037 0000 0001 0000 0000 0000 0000 0000 0012600 0002 0000 0000 0000 0000 0000 000d c494 0012620 0030 0005 0910 0020 0000 0000 c8e4 303b 0012640 ba91 07ed 414b 0266 0066 0000 0002 0000 0012660 0001 0000 0000 0000 0002 0000 0000 0000 0012700 0000 0000 000d c494 002f 0005 0910 0020 0012720 0000 0000 c8e3 303b ba8d 07ed 4149 0266 0012740 0064 0000 0003 0000 0001 0000 0000 0000 0012760 0002 0000 0000 0000 0000 0000 000d c494 0013000 002e 0005 0910 0020 0000 0000 c8e2 303b 0013020 ba89 07ed 4148 0266 0037 0000 0001 0000 0013040 0000 0000 0000 0000 0002 0000 0000 0000 0013060 0000 0000 000d c494 002d 0005 0910 0020 0013100 0000 0000 c8e1 303b ba85 07ed 4145 0266 0013120 005e 0000 0003 0000 0001 0000 0000 0000 0013140 0002 0000 0000 0000 0000 0000 000d c494 0013160 002c 0005 0910 0020 0000 0000 c8e0 303b 0013200 ba7f 07ed 4142 0266 0068 0000 0009 0000 0013220 0000 0000 0000 0000 0002 0000 0000 0000 0013240 0000 0000 000d c494 002b 0005 0910 0020 0013260 0000 0000 c8df 303b ba7d 07ed 4142 0266 0013300 0019 0000 0002 0000 0000 0000 0000 0000 0013320 0002 0000 0000 0000 0000 0000 000d c494 0013340 002a 0005 0910 0020 0000 0000 c8de 303b 0013360 ba79 07ed 4140 0266 0062 0000 0008 0000 0013400 0000 0000 0000 0000 0002 0000 0000 0000 0013420 0000 0000 000d c494 0029 0005 0910 0020 0013440 0000 0000 c8dd 303b ba76 07ed 4140 0266 0013460 0039 0000 0001 0000 0000 0000 0000 0000 0013500 0002 0000 0000 0000 0000 0000 000d c494 0013520 0028 0005 0910 0020 0000 0000 c8dc 303b 0013540 ba72 07ed 413f 0266 0067 0000 0004 0000 0013560 0000 0000 0000 0000 0002 0000 0000 0000 0013600 0000 0000 000d c494 0027 0005 0910 0020 0013620 0000 0000 c8db 303b ba6a 07ed 4139 0266 0013640 0063 0000 0004 0000 0000 0000 0000 0000 0013660 0002 0000 0000 0000 0000 0000 000d c494 0013700 0026 0005 0910 0020 0000 0000 c8da 303b 0013720 ba69 07ed 4139 0266 0037 0000 0001 0000 0013740 0000 0000 0000 0000 0002 0000 0000 0000 0013760 0000 0000 000d c494 0025 0005 0910 0020 0014000 0000 0000 c8d9 303b ba63 07ed 4137 0266 0014020 0038 0000 0001 0000 0000 0000 0000 0000 0014040 0002 0000 0000 0000 0000 0000 000d c494 0014060 0024 0005 0910 0020 0000 0000 c8d8 303b 0014100 ba5f 07ed 4135 0266 0035 0000 0001 0000 0014120 0000 0000 0000 0000 0002 0000 0000 0000 0014140 0000 0000 000d c494 0023 0005 0910 0020 0014160 0000 0000 c8d7 303b ba5c 07ed 4132 0266 0014200 0036 0000 0001 0000 0000 0000 0000 0000 0014220 0002 0000 0000 0000 0000 0000 000d c494 0014240 0022 0005 0910 0020 0000 0000 c8d6 303b 0014260 ba56 07ed 412f 0266 0067 0000 0002 0000 0014300 0000 0000 0000 0000 0002 0000 0000 0000 0014320 0000 0000 000d c494 0021 0005 0910 0020 0014340 0000 0000 c8d5 303b ba55 07ed 412f 0266 0014360 0036 0000 0001 0000 0000 0000 0000 0000 0014400 0002 0000 0000 0000 0000 0000 000d c494 0014420 0020 0005 0910 0020 0000 0000 c8d4 303b 0014440 ba51 07ed 412c 0266 0067 0000 0002 0000 0014460 0000 0000 0000 0000 0002 0000 0000 0000 0014500 0000 0000 000d c494 001f 0005 0910 0020 0014520 0000 0000 c8d3 303b ba4e 07ed 412c 0266 0014540 0037 0000 0001 0000 0000 0000 0000 0000 0014560 0002 0000 0000 0000 0000 0000 000d c494 0014600 001e 0005 0910 0020 0000 0000 c8d2 303b 0014620 ba4b 07ed 412a 0266 0066 0000 0004 0000 0014640 0000 0000 0000 0000 0002 0000 0000 0000 0014660 0000 0000 000d c494 001d 0005 0910 0020 0014700 0000 0000 c8d1 303b ba48 07ed 412a 0266 0014720 0038 0000 0001 0000 0000 0000 0000 0000 0014740 0002 0000 0000 0000 0000 0000 000d c494 0014760 001c 0005 0910 0020 0000 0000 c8d0 303b 0015000 ba44 07ed 4127 0266 0069 0000 0003 0000 0015020 0000 0000 0000 0000 0002 0000 0000 0000 0015040 0000 0000 000d c494 001b 0005 0910 0020 0015060 0000 0000 c8cf 303b ba42 07ed 4127 0266 0015100 0039 0000 0003 0000 0000 0000 0000 0000 0015120 0002 0000 0000 0000 0000 0000 000d c494 0015140 001a 0005 0910 0020 0000 0000 c8ce 303b 0015160 ba3d 07ed 4124 0266 0068 0000 0002 0000 0015200 0000 0000 0000 0000 0002 0000 0000 0000 0015220 0000 0000 000d c494 0019 0005 0910 0020 0015240 0000 0000 c8cd 303b ba3b 07ed 4124 0266 0015260 0037 0000 0001 0000 0000 0000 0000 0000 0015300 0002 0000 0000 0000 0000 0000 000d c494 0015320 0018 0005 0910 0020 0000 0000 c8cc 303b 0015340 ba37 07ed 4122 0266 0067 0000 0001 0000 0015360 0000 0000 0000 0000 0002 0000 0000 0000 0015400 0000 0000 000d c494 0017 0005 0910 0020 0015420 0000 0000 c8cb 303b ba35 07ed 4122 0266 0015440 0037 0000 0001 0000 0000 0000 0000 0000 0015460 0002 0000 0000 0000 0000 0000 000d c494 0015500 0016 0005 0910 0020 0000 0000 c8ca 303b 0015520 ba2f 07ed 4121 0266 0039 0000 0001 0000 0015540 0000 0000 0000 0000 0002 0000 0000 0000 0015560 0000 0000 000d c494 0015 0005 0910 0020 0015600 0000 0000 c8c9 303b ba2d 07ed 411f 0266 0015620 0036 0000 0001 0000 0000 0000 0000 0000 0015640 0002 0000 0000 0000 0000 0000 000d c494 0015660 0014 0005 0910 0020 0000 0000 c8c8 303b 0015700 ba2a 07ed 411c 0266 0036 0000 0001 0000 0015720 0000 0000 0000 0000 0002 0000 0000 0000 0015740 0000 0000 000d c494 0013 0005 0910 0020 0015760 0000 0000 c8c7 303b ba26 07ed 411b 0266 0016000 0068 0000 0003 0000 0000 0000 0000 0000 0016020 0002 0000 0000 0000 0000 0000 000d c494 0016040 0012 0005 0910 0020 0000 0000 c8c6 303b 0016060 ba21 07ed 411b 0266 001e 0000 0002 0000 0016100 0000 0000 0000 0000 0002 0000 0000 0000 0016120 0000 0000 000d c494 0011 0005 0910 0020 0016140 0000 0000 c8c5 303b ba1f 07ed 4118 0266 0016160 0035 0000 0001 0000 0000 0000 0000 0000 0016200 0002 0000 0000 0000 0000 0000 000d c494 0016220 0010 0005 0910 0020 0000 0000 c8c4 303b 0016240 ba1b 07ed 4116 0266 0064 0000 0001 0000 0016260 0000 0000 0000 0000 0002 0000 0000 0000 0016300 0000 0000 000d c494 000f 0005 0910 0020 0016320 0000 0000 c8c3 303b ba19 07ed 4116 0266 0016340 0023 0000 0002 0000 0000 0000 0000 0000 0016360 0002 0000 0000 0000 0000 0000 000d c494 0016400 000e 0005 0910 0020 0000 0000 c8c2 303b 0016420 ba15 07ed 4114 0266 0063 0000 0001 0000 0016440 0000 0000 0000 0000 0002 0000 0000 0000 0016460 0000 0000 000d c494 000d 0005 0910 0020 0016500 0000 0000 c8c1 303b ba13 07ed 4114 0266 0016520 0036 0000 0001 0000 0000 0000 0000 0000 0016540 0002 0000 0000 0000 0000 0000 000d c494 0016560 000c 0005 0910 0020 0000 0000 c8c0 303b 0016600 ba0e 07ed 4112 0266 0036 0000 0001 0000 0016620 0000 0000 0000 0000 0002 0000 0000 0000 0016640 0000 0000 000d c494 000b 0005 0910 0020 0016660 0000 0000 c8bf 303b ba0b 07ed 410f 0266 0016700 0036 0000 0001 0000 0000 0000 0000 0000 0016720 0002 0000 0000 0000 0000 0000 000d c494 0016740 000a 0005 0910 0020 0000 0000 c8be 303b 0016760 ba07 07ed 410e 0266 0039 0000 0003 0000 0017000 0000 0000 0000 0000 0002 0000 0000 0000 0017020 0000 0000 000d c494 0009 0005 0910 0020 0017040 0000 0000 c8bd 303b ba03 07ed 410c 0266 0017060 000c 0000 0001 0000 0001 0000 0000 0000 0017100 0002 0000 0000 0000 0000 0000 000d c494 0017120 0008 0005 0910 0020 0000 0000 c8bc 303b 0017140 ba00 07ed 410c 0266 0037 0000 0001 0000 0017160 0000 0000 0000 0000 0002 0000 0000 0000 0017200 0000 0000 000d c494 0007 0005 0910 0020 0017220 0000 0000 c8bb 303b b9fd 07ed 4109 0266 0017240 0062 0000 0007 0000 0000 0000 0000 0000 0017260 0002 0000 0000 0000 0000 0000 000d c494 0017300 0006 0005 0910 0020 0000 0000 c8ba 303b 0017320 b9fb 07ed 4109 0266 003a 0000 0001 0000 0017340 0000 0000 0000 0000 0002 0000 0000 0000 0017360 0000 0000 000d c494 0005 0005 0910 0020 0017400 0000 0000 c8b9 303b b9f8 07ed 4107 0266 0017420 005f 0000 0001 0000 0001 0000 0000 0000 0017440 0002 0000 0000 0000 0000 0000 000d c494 0017460 0004 0005 0910 0020 0000 0000 c8b8 303b 0017500 b9f5 07ed 4104 0266 0064 0000 0005 0000 0017520 0000 0000 0000 0000 0002 0000 0000 0000 0017540 0000 0000 000d c494 0003 0005 0910 0020 0017560 0000 0000 c8b7 303b b9f3 07ed 4104 0266 0017600 0039 0000 0001 0000 0000 0000 0000 0000 0017620 0002 0000 0000 0000 0000 0000 000d c494 0017640 0002 0005 0910 0020 0000 0000 c8b6 303b 0017660 b9ee 07ed 4101 0266 0067 0000 0018 0000 0017700 0000 0000 0000 0000 0002 0000 0000 0000 0017720 0000 0000 000d c494 0001 0005 0910 0020 0017740 0000 0000 c8b5 303b b9ed 07ed 4101 0266 0017760 0035 0000 0003 0000 0000 0000 0000 0000 0020000 Being a complete novice I've got no idea what I'm looking at. Is there anything in the above that looks out of place? Is the next step similar to that of: http://archives.postgresql.org/pgsql-general/2005-11/msg01150.php ? Cheers Noel
Attachment
On Mon, Mar 06, 2006 at 11:57:53AM +1100, Noel Faux wrote: > >Anyway, if the block size is 8192 > >then 902292 sould be in the .6 file. If you can spare the time > >then you might run the dd and od commands that Tom Lane mentions > >in the above message and post the output. > Here's the output: I just noticed that the error in your original message was: > pg_dump: Error message from server: ERROR: invalid page header in block > 9022921 of relation "gap" That's a different block than the one you said you found: > Now after doing some searches I managed to work out that the data > corruption starts at 902292.137 using this sql: > SELECT * FROM gap WHERE ctid = '(902292,$x)' > Where $x I changed from 1-150. I suggested looking in the .6 file based on block 902292; if the real bad block is 9022921 then I think it would be block 110025 in file .68 (again, check the math yourself). -- Michael Fuhr
On Mon, Mar 06, 2006 at 05:17:54PM +1100, Noel Faux wrote: > dd bs=8k skip=115860 count=1 > if=/usr/local/postgresql/postgresql-7.4.8/data/base/37958/111685332.68 | > od -x Wrong block (115860) -- you used the number from my earlier message, which was based on the bad block being 902292. After noticing that the error message said the bad block was 9022921 I corrected both the file number and the block: > >I suggested looking in the .6 file based on block 902292; if the > >real bad block is 9022921 then I think it would be block 110025 > >in file .68 (again, check the math yourself). Try skip=110025. You can use pg_filedump to examine the block in an easier-to-read format; since you're running 7.4 you'd need pg_filedump 3.0. http://sources.redhat.com/rhdb/utilities.html Try running this command: pg_filedump -if -R 110024 110026 /path/111685332.68 That should show the bad block (110025) and the block before and after it. -- Michael Fuhr
On Tue, Mar 07, 2006 at 01:41:44PM +1100, Noel Faux wrote: > Here is the output from the pg_filedump; is there anything which looks > suss and where would we re-zero the data, if that's the next step: [...] > Block 110025 ******************************************************** > <Header> ----- > Block Offset: 0x35b92000 Offsets: Lower 0 (0x0000) > Block: Size 0 Version 24 Upper 2 (0x0002) > LSN: logid 0 recoff 0x00000000 Special 0 (0x0000) > Items: 0 Free Space: 2 > Length (including item array): 24 > > Error: Invalid header information. > > 0000: 00000000 00000000 00000000 00000200 ................ > 0010: 00001800 af459a00 .....E.. > > <Data> ------ > Empty block - no items listed > > <Special Section> ----- > Error: Invalid special section encountered. > Error: Special section points off page. Unable to dump contents. Looks like we've successfully identified the bad block; contrast these header values and the hex dump with the good blocks and you can see at a glance that this one is different. It might be interesting to you (but probably not to us, so don't send the output) to see if the block's contents are recognizable, as though they came from some unrelated file (which might suggest an OS bug). Check your local documentation to see what od/hd/hexdump/whatever options will give you an ASCII dump and use dd to fetch the page and pipe it into that command. Try this (substitute the hd command with whatever works on your system): dd bs=8k skip=110025 count=1 if=/path/file | hd Even if you don't care about the block's current contents, you might want to redirect dd's output to a file to save a copy of the block in case you do ever want to examine it further. And it would be prudent to verify that the data shown by the above dd command matches the data in the pg_filedump output before doing anything destructive. When you're ready to zero the file, shut down the postmaster and run a command like the following (but keep reading before doing so): dd bs=8k seek=110025 conv=notrunc count=1 if=/dev/zero of=/path/file Before running that command I would strongly advise reading the dd manual page on your system to make sure the options are correct and that you understand them. I'd also suggest practicing on a test table: create a table, populate it with arbitrary data, pick a page to zero, identify the file and block, run a command like the above, and verify that the table is intact except for the missing block. Make *sure* you know what you're doing and that the above command works before running it -- if you botch it you might lose a 1G file instead of an 8K block. In one of his messages Tom Lane suggested vacuuming the table after zeroing the bad block to see if vacuum discovers any other bad blocks. During the vacuum you should see a message like this: WARNING: relation "foo" page 110025 is uninitialized --- fixing If you see any other errors or warnings then please post them. -- Michael Fuhr
Ok it worked but we ran into another bad block :(
vacuumdb: vacuuming of database "monashprotein" failed: ERROR: invalid
page header in block 9022937 of relation "gap"
So the command we used was:
dd bs=8k seek=110025 conv=notrunc count=1 if=/dev/zero
of=/usr/local/postgresql/postgresql-7.4.8/data/base/37958/111685332.68
I'm tried to work out the formula for finding the file (i.e. the 111685332.*) to fix and the value to seek to, but as a complete novice I'm lost, any pointers would be a great help. We checked the block size and it's 8192.
Cheers
Noel
Michael Fuhr wrote:
vacuumdb: vacuuming of database "monashprotein" failed: ERROR: invalid
page header in block 9022937 of relation "gap"
So the command we used was:
dd bs=8k seek=110025 conv=notrunc count=1 if=/dev/zero
of=/usr/local/postgresql/postgresql-7.4.8/data/base/37958/111685332.68
I'm tried to work out the formula for finding the file (i.e. the 111685332.*) to fix and the value to seek to, but as a complete novice I'm lost, any pointers would be a great help. We checked the block size and it's 8192.
Cheers
Noel
Michael Fuhr wrote:
On Tue, Mar 07, 2006 at 01:41:44PM +1100, Noel Faux wrote:Here is the output from the pg_filedump; is there anything which looks suss and where would we re-zero the data, if that's the next step:[...]Block 110025 ******************************************************** <Header> ----- Block Offset: 0x35b92000 Offsets: Lower 0 (0x0000) Block: Size 0 Version 24 Upper 2 (0x0002) LSN: logid 0 recoff 0x00000000 Special 0 (0x0000) Items: 0 Free Space: 2 Length (including item array): 24 Error: Invalid header information. 0000: 00000000 00000000 00000000 00000200 ................0010: 00001800 af459a00 .....E.. <Data> ------ Empty block - no items listed <Special Section> ----- Error: Invalid special section encountered. Error: Special section points off page. Unable to dump contents.Looks like we've successfully identified the bad block; contrast these header values and the hex dump with the good blocks and you can see at a glance that this one is different. It might be interesting to you (but probably not to us, so don't send the output) to see if the block's contents are recognizable, as though they came from some unrelated file (which might suggest an OS bug). Check your local documentation to see what od/hd/hexdump/whatever options will give you an ASCII dump and use dd to fetch the page and pipe it into that command. Try this (substitute the hd command with whatever works on your system): dd bs=8k skip=110025 count=1 if=/path/file | hd Even if you don't care about the block's current contents, you might want to redirect dd's output to a file to save a copy of the block in case you do ever want to examine it further. And it would be prudent to verify that the data shown by the above dd command matches the data in the pg_filedump output before doing anything destructive. When you're ready to zero the file, shut down the postmaster and run a command like the following (but keep reading before doing so): dd bs=8k seek=110025 conv=notrunc count=1 if=/dev/zero of=/path/file Before running that command I would strongly advise reading the dd manual page on your system to make sure the options are correct and that you understand them. I'd also suggest practicing on a test table: create a table, populate it with arbitrary data, pick a page to zero, identify the file and block, run a command like the above, and verify that the table is intact except for the missing block. Make *sure* you know what you're doing and that the above command works before running it -- if you botch it you might lose a 1G file instead of an 8K block. In one of his messages Tom Lane suggested vacuuming the table after zeroing the bad block to see if vacuum discovers any other bad blocks. During the vacuum you should see a message like this: WARNING: relation "foo" page 110025 is uninitialized --- fixing If you see any other errors or warnings then please post them.
Attachment
On Thu, Mar 09, 2006 at 11:13:40AM +1100, Noel Faux wrote: > Ok it worked but we ran into another bad block :( > /vacuumdb: vacuuming of database "monashprotein" failed: ERROR: invalid > page header in block 9022937 of relation "gap" > / > So the command we used was: > dd bs=8k seek=110025 conv=notrunc count=1 if=/dev/zero > of=/usr/local/postgresql/postgresql-7.4.8/data/base/37958/111685332.68 > > I'm tried to work out the formula for finding the file (i.e. the > 111685332.*) to fix and the value to seek to, but as a complete novice > I'm lost, any pointers would be a great help. We checked the block size > and it's 8192. The database files are 1G, or 131072 8k blocks. The bad block you zeroed was 9022921; here's how you could have determined the file and block number within that file: test=> SELECT 9022921 / 131072 AS filenum, 9022921 % 131072 AS blocknum; filenum | blocknum ---------+---------- 68 | 110025 (1 row) The new bad block is 9022937 so the query would be: test=> SELECT 9022937 / 131072 AS filenum, 9022937 % 131072 AS blocknum; filenum | blocknum ---------+---------- 68 | 110041 (1 row) If you're running 7.4.8 then consider upgrading to 7.4.12. Offhand I don't know if any bugs have been fixed that might cause the problem you're seeing, but there have been other bug fixes. -- Michael Fuhr
Thanks for all your help Michael, we wish to do a vacuum and dump before the upgrade to 8.02. Do you believe this data corruption is a postgres issue of an OS / hardware issue?
Cheers
Noel
Michael Fuhr wrote:
Cheers
Noel
Michael Fuhr wrote:
On Thu, Mar 09, 2006 at 11:13:40AM +1100, Noel Faux wrote:Ok it worked but we ran into another bad block :( /vacuumdb: vacuuming of database "monashprotein" failed: ERROR: invalid page header in block 9022937 of relation "gap" / So the command we used was: dd bs=8k seek=110025 conv=notrunc count=1 if=/dev/zero of=/usr/local/postgresql/postgresql-7.4.8/data/base/37958/111685332.68 I'm tried to work out the formula for finding the file (i.e. the 111685332.*) to fix and the value to seek to, but as a complete novice I'm lost, any pointers would be a great help. We checked the block size and it's 8192.The database files are 1G, or 131072 8k blocks. The bad block you zeroed was 9022921; here's how you could have determined the file and block number within that file: test=> SELECT 9022921 / 131072 AS filenum, 9022921 % 131072 AS blocknum;filenum | blocknum ---------+---------- 68 | 110025 (1 row) The new bad block is 9022937 so the query would be: test=> SELECT 9022937 / 131072 AS filenum, 9022937 % 131072 AS blocknum;filenum | blocknum ---------+---------- 68 | 110041 (1 row) If you're running 7.4.8 then consider upgrading to 7.4.12. Offhand I don't know if any bugs have been fixed that might cause the problem you're seeing, but there have been other bug fixes.
Attachment
I've been watching the post: Re: [GENERAL] Fixing up a corrupted toast table
In there they mention deletion of the bad rows from the table based on the citid. If I could come up with a def of a back row, would this work, or are there other issues?
Cheers
Noel
Michael Fuhr wrote:
In there they mention deletion of the bad rows from the table based on the citid. If I could come up with a def of a back row, would this work, or are there other issues?
Cheers
Noel
Michael Fuhr wrote:
On Thu, Mar 09, 2006 at 11:13:40AM +1100, Noel Faux wrote:Ok it worked but we ran into another bad block :( /vacuumdb: vacuuming of database "monashprotein" failed: ERROR: invalid page header in block 9022937 of relation "gap" / So the command we used was: dd bs=8k seek=110025 conv=notrunc count=1 if=/dev/zero of=/usr/local/postgresql/postgresql-7.4.8/data/base/37958/111685332.68 I'm tried to work out the formula for finding the file (i.e. the 111685332.*) to fix and the value to seek to, but as a complete novice I'm lost, any pointers would be a great help. We checked the block size and it's 8192.The database files are 1G, or 131072 8k blocks. The bad block you zeroed was 9022921; here's how you could have determined the file and block number within that file: test=> SELECT 9022921 / 131072 AS filenum, 9022921 % 131072 AS blocknum;filenum | blocknum ---------+---------- 68 | 110025 (1 row) The new bad block is 9022937 so the query would be: test=> SELECT 9022937 / 131072 AS filenum, 9022937 % 131072 AS blocknum;filenum | blocknum ---------+---------- 68 | 110041 (1 row) If you're running 7.4.8 then consider upgrading to 7.4.12. Offhand I don't know if any bugs have been fixed that might cause the problem you're seeing, but there have been other bug fixes.
Attachment
On Thu, Mar 09, 2006 at 12:29:17PM +1100, Noel Faux wrote: > Thanks for all your help Michael, we wish to do a vacuum and dump before > the upgrade to 8.02. 8.0.7 and 8.1.3 are the latest versions in their respective branches; those are the versions to run to get the latest bug fixes. > Do you believe this data corruption is a postgres issue of an > OS / hardware issue? Beats me; it could be any or all of them. Certain filesystem and hardware configurations are more prone to data corruption than others, especially in the event of a system crash, so those are among the usual suspects. One reason to look at the data in the bad block is to see what's there: if you see data that obviously came from outside the database then that would tend to exonerate PostgreSQL. -- Michael Fuhr
On Thu, Mar 09, 2006 at 12:37:52PM +1100, Noel Faux wrote: > I've been watching the post: Re: [GENERAL] Fixing up a corrupted toast table > In there they mention deletion of the bad rows from the table based on > the citid. If I could come up with a def of a back row, would this > work, or are there other issues? If you have a corrupt tuple within an otherwise good block then you can try deleting that tuple, but if the block header is corrupt then you have no way of addressing any of that block's tuples. Errors implying a bad tuple include "missing chunk number" and "invalid memory alloc request size"; but "invalid page header in block" means the block itself is bad. -- Michael Fuhr
Given that this seems problem has occurred a number of times for a number I've written a small step by step procedure to address this issue. Is there any other comments you which to add. I was thinking that this should be added to the FAQ / troubleshooting in the docs.
How to repair corrupted data due to "ERROR: invalid page header in block X of relation "Y":
CAUTION this will permanently remove the data defined in the bad block
1. To identify which file(s) the relation is in:
1. <THE SELECT STATEMENTS TO COLLECT THIS DATA> I can't remember how I did it, but will keep hunting for my notes :)
2. so the file(s) are $PGDATA/<databaseDIR>/<relFile>.* The * is any number which is defined below
2. To calculate the * value:
1. SELECT <block> / 131072 AS filenum, <block> % 131072 AS blocknum;
filenum | blocknum
-----------+----------------
<filenum> | <blocknum>
1. 131072 comes from "each database file is 1G, or 131072 * 8k blocks"
2. The block size is determined when compiling postgres
3. Use "SHOW block_size in the database or use pg_controldata from the shell." to confirm this. The default is 8k.
3. Now you need to re-zero this block using the following command:
1. dd bs=8k seek=<blocknum> conv=notrunc count=1 if=/dev/zero of=$PGDATA/base/<databaseDIR>/<relFile>.<filenum>
1. Before you do this it is best to backup the block: "dd bs=8k skip=<blocknum> count=1 if=/path/file | hd"
See this post: http://archives.postgresql.org/pgsql-general/2006-03/msg00002.php
Your thoughts / comments...
Cheers
Noel
Michael Fuhr wrote:
How to repair corrupted data due to "ERROR: invalid page header in block X of relation "Y":
CAUTION this will permanently remove the data defined in the bad block
1. To identify which file(s) the relation is in:
1. <THE SELECT STATEMENTS TO COLLECT THIS DATA> I can't remember how I did it, but will keep hunting for my notes :)
2. so the file(s) are $PGDATA/<databaseDIR>/<relFile>.* The * is any number which is defined below
2. To calculate the * value:
1. SELECT <block> / 131072 AS filenum, <block> % 131072 AS blocknum;
filenum | blocknum
-----------+----------------
<filenum> | <blocknum>
1. 131072 comes from "each database file is 1G, or 131072 * 8k blocks"
2. The block size is determined when compiling postgres
3. Use "SHOW block_size in the database or use pg_controldata from the shell." to confirm this. The default is 8k.
3. Now you need to re-zero this block using the following command:
1. dd bs=8k seek=<blocknum> conv=notrunc count=1 if=/dev/zero of=$PGDATA/base/<databaseDIR>/<relFile>.<filenum>
1. Before you do this it is best to backup the block: "dd bs=8k skip=<blocknum> count=1 if=/path/file | hd"
See this post: http://archives.postgresql.org/pgsql-general/2006-03/msg00002.php
Your thoughts / comments...
Cheers
Noel
Michael Fuhr wrote:
On Thu, Mar 09, 2006 at 12:37:52PM +1100, Noel Faux wrote:I've been watching the post: Re: [GENERAL] Fixing up a corrupted toast table In there they mention deletion of the bad rows from the table based on the citid. If I could come up with a def of a back row, would this work, or are there other issues?If you have a corrupt tuple within an otherwise good block then you can try deleting that tuple, but if the block header is corrupt then you have no way of addressing any of that block's tuples. Errors implying a bad tuple include "missing chunk number" and "invalid memory alloc request size"; but "invalid page header in block" means the block itself is bad.
Attachment
On Thu, Mar 09, 2006 at 03:57:46PM +1100, Noel Faux wrote: > Given that this seems problem has occurred a number of times for a > number I've written a small step by step procedure to address this > issue. Is there any other comments you which to add. I was thinking > that this should be added to the FAQ / troubleshooting in the docs. > > How to repair corrupted data due to "ERROR: invalid page header in block > X of relation "Y": The word "repair" might be misleading. The operation repairs the table in a sense, but as the following caution points out it does so by completing the destruction that something else began. > CAUTION this will permanently remove the data defined in the bad block > > 1. To identify which file(s) the relation is in: > 1. <THE SELECT STATEMENTS TO COLLECT THIS DATA> I can't > remember how I did it, but will keep hunting for my notes :) Hint: pg_class.relfilenode http://www.postgresql.org/docs/8.1/interactive/catalog-pg-class.html > 2. so the file(s) are $PGDATA/<databaseDIR>/<relFile>.* The * > is any number which is defined below You might want to show how to determine <databaseDIR>. The actual location might not be under $PGDATA -- 8.0 and later have tablespaces and earlier versions support alternative locations, so instructions should account for that. Also, relations smaller than 1G won't have any .N files. http://www.postgresql.org/docs/8.1/interactive/manage-ag-tablespaces.html http://www.postgresql.org/docs/8.1/interactive/storage.html http://www.postgresql.org/docs/7.4/interactive/manage-ag-alternate-locs.html > 2. To calculate the * value: > 1. SELECT <block> / 131072 AS filenum, <block> % 131072 AS blocknum; > filenum | blocknum > -----------+---------------- > <filenum> | <blocknum> > 1. 131072 comes from "each database file is 1G, or 131072 > * 8k blocks" The 1G figure obviously applies only to tables that require that much space. If <filenum> comes back zero then you'd use the file without any .N suffix. If the bad block is less than 131072 (or however many other-than-8k blocks fit in 1G) then you needn't bother with the calculation. > 3. Now you need to re-zero this block using the following command: > 1. dd bs=8k seek=<blocknum> conv=notrunc count=1 if=/dev/zero > of=$PGDATA/base/<databaseDIR>/<relFile>.<filenum> I'd recommend testing the command on a throwaway file before working with real data -- "measure twice, cut once" as it were. To gain confidence in what you're doing you could create a test table, populate it with data, corrupt its data file, then zero its bad blocks until you can select all of the remaining data. Playing around in a production database is probably a bad idea; a safer way would be to initdb a test cluster and run a separate postmaster (listening on a different port if you're on the same machine as the real database). It's probably best to shut down the postmaster while you're mucking around with the data files. > 1. Before you do this it is best to backup the block: > "dd bs=8k skip=<blocknum> count=1 if=/path/file | hd" This command doesn't back up the block, it pipes the block into a command that on some systems will display a hex and ASCII dump of the data (some systems will require a command other than hd). You could back up the block by redirecting the dd output to a file instead of piping it into another command. Incidentally, I was looking at your web site and your project might make an interesting case study for the PostgreSQL web site (Community -> In The Real World -> Case studies). http://www.postgresql.org/about/casestudies/ Some users and potential users might be interested in reading about how you're using PostgreSQL with a 100G+ database. Post a message to pgsql-www if you'd be interested in providing a write-up. -- Michael Fuhr
On Thu, Mar 09, 2006 at 03:57:46PM +1100, Noel Faux wrote: > Given that this seems problem has occurred a number of times for a > number I've written a small step by step procedure to address this > issue. Is there any other comments you which to add. I was thinking > that this should be added to the FAQ / troubleshooting in the docs. > > How to repair corrupted data due to "ERROR: invalid page header in block > X of relation "Y": In pgsql-novice Christopher Goodfellow mentioned the zero_damaged_pages option that I had forgotten about. If you don't care about examining the bad pages then you could set this option in a session and execute a statement that hits every page in the file (Tom Lane mentioned VACUUM and SELECT COUNT(*)). Here's the example I posted in reply: test=# select count(*) from foo; ERROR: invalid page header in block 10 of relation "foo" test=# set zero_damaged_pages to on; SET test=# select count(*) from foo; WARNING: invalid page header in block 10 of relation "foo"; zeroing out page WARNING: invalid page header in block 20 of relation "foo"; zeroing out page WARNING: invalid page header in block 30 of relation "foo"; zeroing out page count ------- 9445 (1 row) test=# set zero_damaged_pages to off; SET -- Michael Fuhr
To clarify, when set on, every time it hits this error, postgres will rezero that block?
Michael Fuhr wrote:
Michael Fuhr wrote:
On Thu, Mar 09, 2006 at 03:57:46PM +1100, Noel Faux wrote:Given that this seems problem has occurred a number of times for a number I've written a small step by step procedure to address this issue. Is there any other comments you which to add. I was thinking that this should be added to the FAQ / troubleshooting in the docs. How to repair corrupted data due to "ERROR: invalid page header in block X of relation "Y":In pgsql-novice Christopher Goodfellow mentioned the zero_damaged_pages option that I had forgotten about. If you don't care about examining the bad pages then you could set this option in a session and execute a statement that hits every page in the file (Tom Lane mentioned VACUUM and SELECT COUNT(*)). Here's the example I posted in reply: test=# select count(*) from foo; ERROR: invalid page header in block 10 of relation "foo" test=# set zero_damaged_pages to on; SET test=# select count(*) from foo; WARNING: invalid page header in block 10 of relation "foo"; zeroing out page WARNING: invalid page header in block 20 of relation "foo"; zeroing out page WARNING: invalid page header in block 30 of relation "foo"; zeroing out pagecount ------- 9445 (1 row) test=# set zero_damaged_pages to off; SET
Attachment
Noel Faux <noel.faux@med.monash.edu.au> writes: > To clarify, when set on, every time it hits this error, postgres will > rezero that block? It'll only "re" zero if the page gets dropped from shared memory without there having been any occasion to write it out. Otherwise, the first write will clobber the bad data on disk and that's the end of it. My suggestion to use either VACUUM or SELECT COUNT(*) failed to take that behavior into account --- VACUUM *will* rewrite the page, but a SELECT scan won't dirty the page. So you might consider a SELECT to see how bad the situation is (how many bad pages) and then a VACUUM if you want them cleaned up. regards, tom lane