Re: GIN improvements part 1: additional information - Mailing list pgsql-hackers

From Alexander Korotkov
Subject Re: GIN improvements part 1: additional information
Date
Msg-id CAPpHfdsW=VXD8BCHycCDFtjbOmd2BRCZmHqDgrHNoE0V1Gjutw@mail.gmail.com
Whole thread Raw
In response to Re: GIN improvements part 1: additional information  (Heikki Linnakangas <hlinnakangas@vmware.com>)
Responses Re: GIN improvements part 1: additional information
List pgsql-hackers
On Mon, Jan 20, 2014 at 10:30 PM, Heikki Linnakangas <hlinnakangas@vmware.com> wrote:
On 01/17/2014 08:49 PM, Alexander Korotkov wrote:
On Fri, Jan 17, 2014 at 10:38 PM, Heikki Linnakangas <
hlinnakangas@vmware.com> wrote:

On 01/17/2014 01:05 PM, Alexander Korotkov wrote:

Seems to be fixed in the attached version of patch.
Another improvement in this version: only left-most segments and further
are re-encoded. Left part of page are left untouched.

I'm looking into this now. A few quick notes:

* Even when you don't re-encode the whole page, you still WAL-log the
whole page. While correct, it'd be a pretty obvious optimization to only
WAL-log the modified part.

Oh, I overlooked it. I wrote code in ginRedoInsertData which finds
appropriate place fro data but didn't notice that I still wrote whole page
to xlog record.

Yeah. I think ginRedoInsertData was too cute to be bug-free. I added an explicit unmodifiedsize field to the WAL record, so that ginRedoInsertData doesn't need to calculate it.


* When new items are appended to the end of the page so that only the last
existing compressed segment is re-encoded, and the page has to be split,
the items aren't divided 50/50 on the pages. The loop that moves segments
destined for the left page to the right won't move any existing, untouched,
segments.

I think this loop will move one segment. However, it's too small.

Implemented this.

I noticed that the gin vacuum redo routine is dead code, except for the data-leaf page handling, because we never remove entries or internal nodes (page deletion is a separate wal record type). And the data-leaf case is functionally equivalent to heap newpage records. I removed the dead code and made it more clear that it resembles heap newpage.

Attached is a yet another version, with more bugs fixed and more comments added and updated. I would appreciate some heavy-testing of this patch now. If you could re-run the tests you've been using, that could be great. I've tested the WAL replay by replicating GIN operations over streaming replication. That doesn't guarantee it's correct, but it's a good smoke test.

I tried my test-suite but it hangs on index scan with infinite loop. I re-tried it on my laptop with -O0. I found it to crash on update and vacuum in some random places like:
Assert(GinPageIsData(page)); in xlogVacuumPage
Assert(ndecoded == totalpacked); in ginCompressPostingList
Trying to debug it.

------
With best regards,
Alexander Korotkov.  

pgsql-hackers by date:

Previous
From: Stephen Frost
Date:
Subject: Re: proposal: hide application_name from other users
Next
From: Michael Paquier
Date:
Subject: Re: ALTER SYSTEM SET typos and fix for temporary file name management