Re: pg_verify_checksums failure with hash indexes - Mailing list pgsql-hackers
From | Yugo Nagata |
---|---|
Subject | Re: pg_verify_checksums failure with hash indexes |
Date | |
Msg-id | 20180829182526.9a4ab0c3deec9f81c439195b@sraoss.co.jp |
In response to | Re: pg_verify_checksums failure with hash indexes (Michael Banck <michael.banck@credativ.de>) |
List | pgsql-hackers |
On Tue, 28 Aug 2018 15:02:56 +0200
Michael Banck <michael.banck@credativ.de> wrote:

> Hi,
>
> On Tue, Aug 28, 2018 at 11:21:34AM +0200, Peter Eisentraut wrote:
> > This is reproducible with PG11 and PG12:
> >
> > initdb -k data
> > postgres -D data
> >
> > make installcheck
> > # shut down postgres with Ctrl-C
> >
> > pg_verify_checksums data
> >
> > pg_verify_checksums: checksum verification failed in file
> > "data/base/16384/28647", block 65: calculated checksum DC70 but expected 0
> > pg_verify_checksums: checksum verification failed in file
> > "data/base/16384/28649", block 65: calculated checksum 89D8 but expected 0
> > pg_verify_checksums: checksum verification failed in file
> > "data/base/16384/28648", block 65: calculated checksum 9636 but expected 0
> > Checksum scan completed
> > Data checksum version: 1
> > Files scanned: 2493
> > Blocks scanned: 13172
> > Bad checksums: 3
> >
> > The files in question correspond to
> >
> > hash_i4_index
> > hash_name_index
> > hash_txt_index
> >
> > Discuss. ;-)
>
> I took a look at hash_name_index, assuming the others are similar.
>
> Page 65 is the last page, pageinspect barfs on it as well:
>
> regression=# SELECT get_raw_page('hash_name_index', 'main', 65);
> WARNING: page verification failed, calculated checksum 18066 but expected 0
> ERROR: invalid page in block 65 of relation base/16384/28638
>
> The pages before that one from page 35 on are empty:
>
> regression=# SELECT * FROM page_header(get_raw_page('hash_name_index', 'main', 1));
>     lsn    | checksum | flags | lower | upper | special | pagesize | version | prune_xid
> -----------+----------+-------+-------+-------+---------+----------+---------+-----------
>  0/422D890 |     8807 |     0 |   664 |  5616 |    8176 |     8192 |       4 |         0
> (1 Zeile)
> [...]
> regression=# SELECT * FROM page_header(get_raw_page('hash_name_index', 'main', 34));
>     lsn    | checksum | flags | lower | upper | special | pagesize | version | prune_xid
> -----------+----------+-------+-------+-------+---------+----------+---------+-----------
>  0/422C690 |    18153 |     0 |   580 |  5952 |    8176 |     8192 |       4 |         0
> (1 Zeile)
> regression=# SELECT * FROM page_header(get_raw_page('hash_name_index', 'main', 35));
>  lsn | checksum | flags | lower | upper | special | pagesize | version | prune_xid
> -----+----------+-------+-------+-------+---------+----------+---------+-----------
>  0/0 |        0 |     0 |     0 |     0 |       0 |        0 |       0 |         0
> [...]
> regression=# SELECT * FROM page_header(get_raw_page('hash_name_index', 'main', 64));
>  lsn | checksum | flags | lower | upper | special | pagesize | version | prune_xid
> -----+----------+-------+-------+-------+---------+----------+---------+-----------
>  0/0 |        0 |     0 |     0 |     0 |       0 |        0 |       0 |         0
> regression=# SELECT * FROM page_header(get_raw_page('hash_name_index', 'main', 65));
> WARNING: page verification failed, calculated checksum 18066 but expected 0
> ERROR: invalid page in block 65 of relation base/16384/28638
>
> Running pg_filedump on the last two pages results in (not sure the
> "Invalid header information." are legit; neither about the checksum
> failure on block 64):
>
> mba@fock:~/[...]postgresql/build/src/test/regress$ ~/tmp/bin/pg_filedump -R 64 65 -k -f tmp_check/data/base/16384/28638
>
> --8<--
> *******************************************************************
> * PostgreSQL File/Block Formatted Dump Utility - Version 10.1
> *
> * File: tmp_check/data/base/16384/28638
> * Options used: -R 64 65 -k -f
> *
> * Dump created on: Tue Aug 28 14:53:37 2018
> *******************************************************************
>
> Block 64 ********************************************************
> <Header> -----
>  Block Offset: 0x00080000         Offsets: Lower 0 (0x0000)
>  Block: Size 0 Version 0                   Upper 0 (0x0000)
>  LSN: logid 0 recoff 0x00000000            Special 0 (0x0000)
>  Items: 0                         Free Space: 0
>  Checksum: 0x0000 Prune XID: 0x00000000 Flags: 0x0000 ()
>  Length (including item array): 24
>
>  Error: Invalid header information.
>
>  Error: checksum failure: calculated 0xc66a.
>
>  0000: 00000000 00000000 00000000 00000000 ................
>  0010: 00000000 00000000 ........
>
> <Data> ------
>  Empty block - no items listed
>
> <Special Section> -----
>  Error: Invalid special section encountered.
>  Error: Special section points off page. Unable to dump contents.
>
> Block 65 ********************************************************
> <Header> -----
>  Block Offset: 0x00082000         Offsets: Lower 24 (0x0018)
>  Block: Size 8192 Version 4                Upper 8176 (0x1ff0)
>  LSN: logid 0 recoff 0x04229c20            Special 8176 (0x1ff0)
>  Items: 0                         Free Space: 8152
>  Checksum: 0x0000 Prune XID: 0x00000000 Flags: 0x0000 ()
>  Length (including item array): 24
>
>  Error: checksum failure: calculated 0x4692.
>
>  0000: 00000000 209c2204 00000000 1800f01f .... .".........
>  0010: f01f0420 00000000 ... ....
>
> <Data> ------
>  Empty block - no items listed
>
> <Special Section> -----
>  Hash Index Section:
>  Flags: 0x0000 ()
>  Bucket Number: 0xffffffff
>  Blocks: Previous (-1) Next (-1)
>
>  1ff0: ffffffff ffffffff ffffffff 000080ff ................
>
>
> *** End of Requested Range Encountered. Last Block Read: 65 ***
> --8<--
>
> So it seems there is some data on the last page, which makes the zero
> checksum bogus, but I don't know anything about hash indexes. Also maybe
> those empty pages are not initialized correctly? Or maybe the "Invalid
> special section encountered" error means pg_filedump cannot handle hash
> indexes completely.

I saw the same thing in the hash_i4_index case using pageinspect with
checksums disabled. The last page (block 65) has some data in its header.

regression=# select * from page_header(get_raw_page('hash_i4_index',65));
    lsn    | checksum | flags | lower | upper | special | pagesize | version | prune_xid
-----------+----------+-------+-------+-------+---------+----------+---------+-----------
 0/939FE48 |        0 |     0 |    24 |  8176 |    8176 |     8192 |       4 |         0
(1 row)

Looking at the code that verifies checksums, each page is first tested
with PageIsNew(), and if it is new its checksum is not verified, because
new pages are assumed to have no checksum. PageIsNew() decides whether a
page is new by looking at pd_upper (a page with pd_upper equal to 0 is
treated as new). For some reason, the last page has pd_upper set but no
checksum, so the checksum verification fails.

It is not clear to me why the last page has header information, but after
some code investigation I think it happens in _hash_alloc_buckets(). When
a hash table is expanded, smgrextend() adds some blocks to the file. At
that time, it seems that a page with header information is written to the
end of the file (in mdextend()).

I'm not sure how to fix this for now, but I thought it would be worth
sharing my analysis of this issue.

Regards,

-- 
Yugo Nagata <nagata@sraoss.co.jp>
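
To make the PageIsNew() reasoning above concrete, here is a minimal sketch in
Python (not PostgreSQL code) that decodes the 24 header bytes shown in the
pg_filedump output for blocks 64 and 65 and applies the "pd_upper is 0" test.
The helper names parse_header, page_is_new and checksum_is_verified are
hypothetical, and the little-endian field layout is an assumption chosen to
match the hex dump quoted in this message.

```python
import struct

# Assumed layout of the 24-byte page header, read as little-endian to match
# the pg_filedump hex dump quoted in this message: pd_lsn (two uint32), then
# pd_checksum, pd_flags, pd_lower, pd_upper, pd_special and
# pd_pagesize_version (uint16 each), then pd_prune_xid (uint32).
HEADER = struct.Struct("<IIHHHHHHI")
FIELDS = ("xlogid", "xrecoff", "checksum", "flags", "lower", "upper",
          "special", "pagesize_version", "prune_xid")


def parse_header(raw):
    """Decode the first 24 bytes of a page into a dict (hypothetical helper)."""
    return dict(zip(FIELDS, HEADER.unpack(raw[:HEADER.size])))


def page_is_new(hdr):
    """Mirror of the PageIsNew() test: a page counts as new if pd_upper is 0."""
    return hdr["upper"] == 0


def checksum_is_verified(hdr):
    """New pages are skipped; any other page has its checksum compared."""
    return not page_is_new(hdr)


# Header bytes of block 64 (all zeros) and block 65, taken from the dump.
block64 = bytes(24)
block65 = bytes.fromhex("00000000209c2204000000001800f01ff01f042000000000")

for name, raw in (("block 64", block64), ("block 65", block65)):
    hdr = parse_header(raw)
    print(name, "upper =", hdr["upper"], "checksum =", hdr["checksum"],
          "verified =", checksum_is_verified(hdr))
```

Under these assumptions, block 64 is skipped as a new page, while block 65
(pd_upper = 8176, stored checksum 0) is the one whose checksum gets compared,
which is consistent with the failures reported for block 65.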