Thread: Re: [COMMITTERS] pgsql: Compress GIN posting lists, for smaller index size.

Re: [COMMITTERS] pgsql: Compress GIN posting lists, for smaller index size.

From: Stefan Kaltenbrunner
On 01/22/2014 06:28 PM, Heikki Linnakangas wrote:
> Compress GIN posting lists, for smaller index size.
> 
> GIN posting lists are now encoded using varbyte-encoding, which allows them
> to fit in much smaller space than the straight ItemPointer array format used
> before. The new encoding is used for both the lists stored in-line in entry
> tree items, and in posting tree leaf pages.
> 
> To maintain backwards-compatibility and keep pg_upgrade working, the code
> can still read old-style pages and tuples. Posting tree leaf pages in the
> new format are flagged with GIN_COMPRESSED flag, to distinguish old and new
> format pages. Likewise, entry tree tuples in the new format have a
> GIN_ITUP_COMPRESSED flag set in a bit that was previously unused.
> 
> This patch bumps GIN_CURRENT_VERSION from 1 to 2. New indexes created with
> version 9.4 will therefore have version number 2 in the metapage, while old
> pg_upgraded indexes will have version 1. The code treats them the same, but
> it might come in handy in the future, if we want to drop support for the
> uncompressed format.
> 
> Alexander Korotkov and me. Reviewed by Tomas Vondra and Amit Langote.
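
For context on the "varbyte-encoding" mentioned in the commit message, here is a minimal sketch of a generic variable-byte integer codec. It assumes the common scheme of 7 payload bits per byte, with the high bit marking "more bytes follow". The function names are hypothetical, and this is not the code from ginpostinglist.c; it only illustrates why small values take less space than a fixed-width array entry.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/*
 * Illustrative varbyte codec (hypothetical helper names, not PostgreSQL code).
 * Each output byte carries 7 bits of the value, least significant group first;
 * the high bit is set on every byte except the last.
 */
static size_t
sketch_varbyte_encode(uint64_t val, unsigned char *out)
{
    size_t n = 0;

    while (val > 0x7F)
    {
        out[n++] = (unsigned char) (0x80 | (val & 0x7F));
        val >>= 7;
    }
    out[n++] = (unsigned char) val;   /* final byte, high bit clear */
    return n;                         /* bytes written, at most 10 */
}

/* Decodes one value from "in"; returns the number of bytes consumed. */
static size_t
sketch_varbyte_decode(const unsigned char *in, uint64_t *val)
{
    uint64_t      result = 0;
    unsigned      shift = 0;
    size_t        n = 0;
    unsigned char c;

    do
    {
        assert(shift < 64);           /* guard against malformed input */
        c = in[n++];
        result |= (uint64_t) (c & 0x7F) << shift;
        shift += 7;
    } while (c & 0x80);

    *val = result;
    return n;
}
```

With this scheme a value below 128 occupies a single byte, and 300 occupies two, so a list of mostly small numbers shrinks considerably compared with a fixed-width layout.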


it seems that this commit made spoonbill an unhappy animal:

http://www.pgbuildfarm.org/cgi-bin/show_log.pl?nm=spoonbill&dt=2014-01-23%2000%3A00%3A04



Stefan



Re: [COMMITTERS] pgsql: Compress GIN posting lists, for smaller index size.

From: Heikki Linnakangas
On 01/23/2014 09:18 PM, Stefan Kaltenbrunner wrote:
> On 01/22/2014 06:28 PM, Heikki Linnakangas wrote:
>> Compress GIN posting lists, for smaller index size.
>>
>> GIN posting lists are now encoded using varbyte-encoding, which allows them
>> to fit in much smaller space than the straight ItemPointer array format used
>> before. The new encoding is used for both the lists stored in-line in entry
>> tree items, and in posting tree leaf pages.
>>
>> To maintain backwards-compatibility and keep pg_upgrade working, the code
>> can still read old-style pages and tuples. Posting tree leaf pages in the
>> new format are flagged with GIN_COMPRESSED flag, to distinguish old and new
>> format pages. Likewise, entry tree tuples in the new format have a
>> GIN_ITUP_COMPRESSED flag set in a bit that was previously unused.
>>
>> This patch bumps GIN_CURRENT_VERSION from 1 to 2. New indexes created with
>> version 9.4 will therefore have version number 2 in the metapage, while old
>> pg_upgraded indexes will have version 1. The code treats them the same, but
>> it might come in handy in the future, if we want to drop support for the
>> uncompressed format.
>>
>> Alexander Korotkov and me. Reviewed by Tomas Vondra and Amit Langote.
>
> it seems that this commit made spoonbill an unhappy animal:
>
> http://www.pgbuildfarm.org/cgi-bin/show_log.pl?nm=spoonbill&dt=2014-01-23%2000%3A00%3A04

Hmm, all the Sparcs. Some kind of an alignment issue, perhaps? I will 
investigate..

- Heikki




Heikki Linnakangas <hlinnakangas@vmware.com> writes:
> On 01/23/2014 09:18 PM, Stefan Kaltenbrunner wrote:
>> it seems that this commit made spoonbill an unhappy animal:
>> http://www.pgbuildfarm.org/cgi-bin/show_log.pl?nm=spoonbill&dt=2014-01-23%2000%3A00%3A04

> Hmm, all the Sparcs. Some kind of an alignment issue, perhaps? I will 
> investigate..

My HPUX box, which is also picky about alignment, is unhappy as well.
It's crashing here:

ginPostingListDecode (plist=0xc39efac1, ndecoded=0x7b03bee0)
    at ginpostinglist.c:263
263             return ginPostingListDecodeAllSegments(plist,
(gdb) bt
#0  ginPostingListDecode (plist=0xc39efac1, ndecoded=0x7b03bee0)
    at ginpostinglist.c:263
#1  0x205308 in ginReadTuple (ginstate=0xc39efac1, attnum=48864,
    itup=0x7b03bee0, nitems=0x403ee9a4) at ginentrypage.c:170
#2  0x21074c in startScanEntry (ginstate=0x403ec3ac, entry=0x403ee970)
    at ginget.c:463
#3  0x21086c in startScan (scan=0xc39efac1) at ginget.c:493
#4  0x212c14 in gingetbitmap (fcinfo=0xc39efac1) at ginget.c:1531
#5  0x5ffc50 in FunctionCall2Coll (flinfo=0xc39efac1, collation=2063843040,
    arg1=2063843040, arg2=1077864868) at fmgr.c:1323
#6  0x24ee5c in index_getbitmap (scan=0x40163878, bitmap=0x403ee620)
    at indexam.c:649
#7  0x3b9430 in MultiExecBitmapIndexScan (node=0x40163768)
    at nodeBitmapIndexscan.c:89
#8  0x3a5a3c in MultiExecProcNode (node=0x40163768) at execProcnode.c:562
#9  0x3b8610 in BitmapHeapNext (node=0x401628f0) at nodeBitmapHeapscan.c:104
#10 0x3ae5b0 in ExecScan (node=0x401628f0,
    accessMtd=0x4001a2c2 <DINFINITY+3802>,
    recheckMtd=0x4001a2ca <DINFINITY+3810>) at execScan.c:82
#11 0x3b8e9c in ExecBitmapHeapScan (node=0xc39efac1)
    at nodeBitmapHeapscan.c:441
#12 0x3a56e0 in ExecProcNode (node=0x401628f0) at execProcnode.c:414
...

(gdb) p debug_query_string
$1 = 0x4006d4a8 "SELECT * FROM array_index_op_test WHERE i <@ '{38,34,32,89}' ORDER BY seqno;"

The problem appears to be due to the misaligned "plist" pointer
(0xc39efac1 here).
        regards, tom lane



Re: [COMMITTERS] pgsql: Compress GIN posting lists, for smaller index size.

From: Heikki Linnakangas
On 01/23/2014 10:37 PM, Tom Lane wrote:
> Heikki Linnakangas <hlinnakangas@vmware.com> writes:
>> On 01/23/2014 09:18 PM, Stefan Kaltenbrunner wrote:
>>> it seems that this commit made spoonbill an unhappy animal:
>>> http://www.pgbuildfarm.org/cgi-bin/show_log.pl?nm=spoonbill&dt=2014-01-23%2000%3A00%3A04
>
>> Hmm, all the Sparcs. Some kind of an alignment issue, perhaps? I will
>> investigate..
>
> My HPUX box, which is also picky about alignment, is unhappy as well.
> It's crashing here:
>
> ginPostingListDecode (plist=0xc39efac1, ndecoded=0x7b03bee0)
>      at ginpostinglist.c:263
> 263             return ginPostingListDecodeAllSegments(plist,
> (gdb) bt
> #0  ginPostingListDecode (plist=0xc39efac1, ndecoded=0x7b03bee0)
>      at ginpostinglist.c:263
> #1  0x205308 in ginReadTuple (ginstate=0xc39efac1, attnum=48864,
>      itup=0x7b03bee0, nitems=0x403ee9a4) at ginentrypage.c:170
> #2  0x21074c in startScanEntry (ginstate=0x403ec3ac, entry=0x403ee970)
>      at ginget.c:463
> #3  0x21086c in startScan (scan=0xc39efac1) at ginget.c:493
> #4  0x212c14 in gingetbitmap (fcinfo=0xc39efac1) at ginget.c:1531
> #5  0x5ffc50 in FunctionCall2Coll (flinfo=0xc39efac1, collation=2063843040,
>      arg1=2063843040, arg2=1077864868) at fmgr.c:1323
> #6  0x24ee5c in index_getbitmap (scan=0x40163878, bitmap=0x403ee620)
>      at indexam.c:649
> #7  0x3b9430 in MultiExecBitmapIndexScan (node=0x40163768)
>      at nodeBitmapIndexscan.c:89
> #8  0x3a5a3c in MultiExecProcNode (node=0x40163768) at execProcnode.c:562
> #9  0x3b8610 in BitmapHeapNext (node=0x401628f0) at nodeBitmapHeapscan.c:104
> #10 0x3ae5b0 in ExecScan (node=0x401628f0,
>      accessMtd=0x4001a2c2 <DINFINITY+3802>,
>      recheckMtd=0x4001a2ca <DINFINITY+3810>) at execScan.c:82
> #11 0x3b8e9c in ExecBitmapHeapScan (node=0xc39efac1)
>      at nodeBitmapHeapscan.c:441
> #12 0x3a56e0 in ExecProcNode (node=0x401628f0) at execProcnode.c:414
> ...
>
> (gdb) p debug_query_string
> $1 = 0x4006d4a8 "SELECT * FROM array_index_op_test WHERE i <@ '{38,34,32,89}' ORDER BY seqno;"
>
> The problem appears to be due to the misaligned "plist" pointer
> (0xc39efac1 here).

Ah, thanks! Looks like I removed a SHORTALIGN from ginFormTuple that was 
in fact very much necessary.. Fixed now, let's see if that pacifies the 
sparcs.
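
To illustrate why the missing SHORTALIGN matters, here is a rough sketch of rounding an offset up to a 2-byte boundary. It assumes that short alignment is 2 bytes on the platform; the macro and function names are hypothetical stand-ins modelled on the idea behind SHORTALIGN, not the actual PostgreSQL definitions or the ginFormTuple code.

```c
#include <assert.h>
#include <stdint.h>

/* Assumption for this sketch: shorts must sit on 2-byte boundaries. */
#define SKETCH_ALIGNOF_SHORT 2

/* Returns nonzero if addr is a multiple of the assumed short alignment. */
static int
sketch_is_shortaligned(uintptr_t addr)
{
    return (addr % SKETCH_ALIGNOF_SHORT) == 0;
}

/* Rounds len up to the next multiple of the assumed short alignment. */
static uintptr_t
sketch_shortalign(uintptr_t len)
{
    uintptr_t mask = (uintptr_t) (SKETCH_ALIGNOF_SHORT - 1);
    uintptr_t aligned = (len + mask) & ~mask;

    assert(sketch_is_shortaligned(aligned));
    return aligned;
}
```

The "plist" value from the backtrace, 0xc39efac1, is odd, so sketch_is_shortaligned() reports it as misaligned, and sketch_shortalign() would move it up to 0xc39efac2. Strict-alignment machines such as the ones that failed here trap when code dereferences a 2-byte field at an odd address, whereas x86 tolerates it, which would explain why only those machines crashed.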

- Heikki



Heikki Linnakangas <hlinnakangas@vmware.com> writes:
> On 01/23/2014 10:37 PM, Tom Lane wrote:
>> The problem appears to be due to the misaligned "plist" pointer
>> (0xc39efac1 here).

> Ah, thanks! Looks like I removed a SHORTALIGN from ginFormTuple that was 
> in fact very much necessary.. Fixed now, let's see if that pacifies the 
> sparcs.

My HPPA box is happy again, anyway.  Thanks.
        regards, tom lane