Re: GIN improvements part 1: additional information - Mailing list pgsql-hackers

From Alexander Korotkov
Subject Re: GIN improvements part 1: additional information
Msg-id CAPpHfdvku_rgziJ0GB_JeNszNNeZ8LqOZ1A1-P1ycr3urK0hdA@mail.gmail.com
In response to Re: GIN improvements part 1: additional information  (Alexander Korotkov <aekorotkov@gmail.com>)
List pgsql-hackers
On Tue, Jan 21, 2014 at 4:28 PM, Alexander Korotkov <aekorotkov@gmail.com> wrote:
I noticed that the GIN vacuum redo routine is dead code, except for the data-leaf page handling, because we never remove entries or internal nodes (page deletion is a separate WAL record type). And the data-leaf case is functionally equivalent to heap newpage records. I removed the dead code and made it clearer that it resembles heap newpage.

Attached is yet another version, with more bugs fixed and more comments added and updated. I would appreciate some heavy testing of this patch now. If you could re-run the tests you've been using, that would be great. I've tested the WAL replay by replicating GIN operations over streaming replication. That doesn't guarantee it's correct, but it's a good smoke test.

I tried my test suite, but it hangs in an infinite loop during an index scan. I re-ran it on my laptop with -O0 and found that it crashes on update and vacuum in seemingly random places, such as:
Assert(GinPageIsData(page)); in xlogVacuumPage
Assert(ndecoded == totalpacked); in ginCompressPostingList
I am trying to debug it.

Another question is about dataPlaceToPageLeaf:

while ((Pointer) seg < segend)
{
    if (ginCompareItemPointers(&minNewItem, &seg->first) < 0)
        break;

Shouldn't we step seg back to the previous segment after this loop? If minNewItem is less than seg->first, the new item belongs in the previous segment, not in seg.
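
To make the question concrete, here is a minimal standalone sketch, not the patch's code. It assumes a simplified model in which segments live in a plain array and each segment covers the items from its own first item up to (but not including) the next segment's first item. The names ItemPtr, Segment, cmp_items, loop_stop_index and covering_segment are hypothetical stand-ins invented for this illustration; the real code walks variable-length segments with ginCompareItemPointers.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical stand-in for an item pointer: (block, offset). */
typedef struct
{
    unsigned block;
    unsigned offset;
} ItemPtr;

/* Hypothetical stand-in for a segment: only its first item matters here. */
typedef struct
{
    ItemPtr first;
} Segment;

/* Three-way comparison, playing the role of ginCompareItemPointers. */
int
cmp_items(const ItemPtr *a, const ItemPtr *b)
{
    if (a->block != b->block)
        return a->block < b->block ? -1 : 1;
    if (a->offset != b->offset)
        return a->offset < b->offset ? -1 : 1;
    return 0;
}

/*
 * Mirrors the quoted loop: returns the index of the first segment whose
 * first item is greater than the new item, or nseg if there is none.
 */
size_t
loop_stop_index(const Segment *segs, size_t nseg, const ItemPtr *item)
{
    size_t i = 0;

    while (i < nseg)
    {
        if (cmp_items(item, &segs[i].first) < 0)
            break;
        i++;
    }
    return i;
}

/*
 * Under the assumed model, the segment that should receive the item is
 * the one just before the stop position. An item smaller than every
 * first item falls back to segment 0.
 */
size_t
covering_segment(const Segment *segs, size_t nseg, const ItemPtr *item)
{
    size_t stop;

    assert(nseg > 0);
    stop = loop_stop_index(segs, nseg, item);
    return stop > 0 ? stop - 1 : 0;
}
```

With segments starting at (1,1), (5,1) and (9,1), a new item (6,3) makes the loop stop at index 2, while under this model the item belongs to the segment at index 1. That one-step difference is the adjustment I am asking about.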

------
With best regards,
Alexander Korotkov.  
