Thread: pg_filedump strangeness

pg_filedump strangeness

From
Alvaro Herrera
Date:
Hi,

I'm chasing an apparent index corruption problem, and I came across
something I can't quite explain in pg_filedump.  Say I dump a non-leaf
btree index page:

*******************************************************************
* PostgreSQL File/Block Formatted Dump Utility - Version 8.3.0
*
* File: 31141
* Options used: -fi -x -R 6246 
*
* Dump created on: Tue Apr  6 17:40:28 2010
*******************************************************************

Block 6246 ********************************************************
<Header> -----Block Offset: 0x030cc000         Offsets: Lower      36 (0x0024)Block: Size 8192  Version    4
Upper   8120 (0x1fb8)LSN:  logid   1077 recoff 0x45c8b660      Special  8176 (0x1ff0)Items:    3
FreeSpace: 8084TLI: 0x0001  Prune XID: 0x00000000  Flags: 0x0000 ()Length (including item array): 36
 
 0000: 35040000 60b6c845 01000000 2400b81f  5...`..E....$... 0010: f01f0420 00000000 d89f3000 d09f1000  ...
......0.....0020: b89f3000                             ..0.            
 

<Data> ------ Item   1 -- Length:   24  Offset: 8152 (0x1fd8)  Flags: NORMAL Block Id: 6232  linp Index: 1  Size: 24
HasNulls: 32768  Has Varwidths: 0
 
 1fd8: 00005818 01001880 01000000 00000000  ..X............. 1fe8: 80bcc57d 74230100                    ...}t#..

Item   2 -- Length:    8  Offset: 8144 (0x1fd0)  Flags: NORMAL Block Id: 2756  linp Index: 1  Size: 8 Has Nulls: 0  Has
Varwidths:0
 
 1fd0: 0000c40a 01000800                    ........        
Item   3 -- Length:   24  Offset: 8120 (0x1fb8)  Flags: NORMAL Block Id: 6231  linp Index: 1  Size: 24 Has Nulls: 32768
Has Varwidths: 0
 
 1fb8: 00005718 01001880 01000000 00000000  ..W............. 1fc8: 4009cc7f 73230100                    @...s#..



<Special Section> -----BTree Index Section: Flags: 0x0000 () Blocks: Previous (6109)  Next (6305)  Level (1)  CycleId
(0)
 1ff0: dd170000 a1180000 01000000 00000000  ................


*** End of Requested Range Encountered. Last Block Read: 6246 ***


Notice how item 2 has size 8, but regular entries have size 24.  I know
this is related to the high key of this page, but I can't quite figure
out why the short entry is 2 not 1.  Is item 2 just assumed to be
greater than the previous' page high key?

Page's 6109 high key is:
Item   1 -- Length:   24  Offset: 8152 (0x1fd8)  Flags: NORMAL Block Id: 6101  linp Index: 1  Size: 24 Has Nulls: 32768
Has Varwidths: 0
 
 1fd8: 0000d517 01001880 01000000 00000000  ................ 1fe8: 8004f17d 6f230100                    ...}o#..



Note that the data values are integer timestamp without time zone in
little endian byte order.


(The "Has Nulls" bit is somewhat bogus -- it displays 32768 when the
0x8000 bit is on, which is rather surprising.  I'd expect it to display
1).


-- 
Alvaro Herrera                                http://www.CommandPrompt.com/
PostgreSQL Replication, Consulting, Custom Development, 24x7 support


Re: pg_filedump strangeness

From
Tom Lane
Date:
Alvaro Herrera <alvherre@commandprompt.com> writes:
> I'm chasing an apparent index corruption problem, and I came across
> something I can't quite explain in pg_filedump.  Say I dump a non-leaf
> btree index page:

I think this is actually OK.  Remember that in a non-rightmost page,
item 1 is the high key not a data entry.  On the other hand, in a
non-leaf page, we don't bother to store the key for the first downlink
entry, since the associated key is really "minus infinity".  Cf
nbtree/README:
On a non-leaf page, the data items are down-links to child pages withbounding keys.  The key in each data item is the
*lower*bound forkeys on that child page, so logically the key is to the left of thatdownlink.  The high key (if
present)is the upper bound for the lastdownlink.  The first data item on each such page has no lower bound--- or lower
boundof minus infinity, if you prefer.  The comparisonroutines must treat it accordingly.  The actual key stored in
theitemis irrelevant, and need not be stored at all.  This arrangementcorresponds to the fact that an L&Y non-leaf page
hasone more pointerthan key.
 

So item 2 doesn't have a key in it.  The other two items have null
keys, which means they need a null bitmap.  I don't however understand
why there seems to be data as well as a null bitmap in there --- is
this perhaps a two-column index?

> (The "Has Nulls" bit is somewhat bogus -- it displays 32768 when the
> 0x8000 bit is on, which is rather surprising.  I'd expect it to display
> 1).

Yeah, I noticed that too.  Made a note to myself to fix it in the next
revision of pg_filedump, which I suppose I'd better get on with
producing...
        regards, tom lane


Re: pg_filedump strangeness

From
Alvaro Herrera
Date:
Tom Lane wrote:
> Alvaro Herrera <alvherre@commandprompt.com> writes:
> > I'm chasing an apparent index corruption problem, and I came across
> > something I can't quite explain in pg_filedump.  Say I dump a non-leaf
> > btree index page:
> 
> I think this is actually OK.  Remember that in a non-rightmost page,
> item 1 is the high key not a data entry.  On the other hand, in a
> non-leaf page, we don't bother to store the key for the first downlink
> entry, since the associated key is really "minus infinity".  Cf
> nbtree/README:
> 
>     On a non-leaf page, the data items are down-links to child pages with
>     bounding keys.  The key in each data item is the *lower* bound for
>     keys on that child page, so logically the key is to the left of that
>     downlink.  The high key (if present) is the upper bound for the last
>     downlink.  The first data item on each such page has no lower bound
>     --- or lower bound of minus infinity, if you prefer.  The comparison
>     routines must treat it accordingly.  The actual key stored in the
>     item is irrelevant, and need not be stored at all.  This arrangement
>     corresponds to the fact that an L&Y non-leaf page has one more pointer
>     than key.

Ahh, I had forgotten that bit completely.  Thanks.


> So item 2 doesn't have a key in it.  The other two items have null
> keys, which means they need a null bitmap.

Correct.

> I don't however understand
> why there seems to be data as well as a null bitmap in there --- is
> this perhaps a two-column index?

Eh, yeah, it's a two column index, so it's OK.

-- 
Alvaro Herrera                                http://www.CommandPrompt.com/
PostgreSQL Replication, Consulting, Custom Development, 24x7 support


Re: pg_filedump strangeness

From
Alvaro Herrera
Date:
Tom Lane wrote:
> Alvaro Herrera <alvherre@commandprompt.com> writes:
> > I'm chasing an apparent index corruption problem, and I came across
> > something I can't quite explain in pg_filedump.  Say I dump a non-leaf
> > btree index page:
> 
> I think this is actually OK.  Remember that in a non-rightmost page,
> item 1 is the high key not a data entry.

Oh, BTW, this is not what's corrupted about this index -- I just had
trouble following what pg_filedump was reporting.  The corruption is
more subtle: vacuum cannot find the parent page when trying to mark a
page for deletion.

-- 
Alvaro Herrera                                http://www.CommandPrompt.com/
The PostgreSQL Company - Command Prompt, Inc.