Re: GIN improvements part 1: additional information - Mailing list pgsql-hackers

From Alexander Korotkov
Subject Re: GIN improvements part 1: additional information
Date
Msg-id CAPpHfdvP3G2k04tpCyEA0mAd2e8xOQyuj=2wwAj0UVhB1_oe+g@mail.gmail.com
In response to Re: GIN improvements part 1: additional information  (Heikki Linnakangas <hlinnakangas@vmware.com>)
Responses Re: GIN improvements part 1: additional information
List pgsql-hackers
On Fri, Nov 29, 2013 at 11:17 PM, Heikki Linnakangas <hlinnakangas@vmware.com> wrote:
On 11/29/2013 11:41 AM, Heikki Linnakangas wrote:
On 11/28/2013 09:19 AM, Alexander Korotkov wrote:
On Wed, Nov 27, 2013 at 1:14 AM, Heikki Linnakangas <hlinnakangas@vmware.com> wrote:

On 11/26/13 15:34, Alexander Korotkov wrote:

What are your plans for GIN now? I tried to rebase packed posting lists
onto head, but I found that you've changed the interface of the placeToPage
function. That conflicts with packed posting lists, because
dataPlaceToPageLeaf needs more than an offset number to describe where to
insert an item pointer. Would you like to commit the rework of handling GIN
incomplete splits first?

Yeah, I'm planning to get back to this patch after committing the
incomplete splits patch. I think the refactoring of the WAL-logging that I
did in that patch will simplify this patch, too. I'll take a look at
Michael's latest comments on the incomplete splits patch tomorrow, so I
should get back to this on Thursday or Friday.

Should I try to rebase this patch now, or do you plan to do it yourself? Some
changes, like the "void *insertdata" argument, make me think you have a
particular plan for rebasing this patch, but I didn't quite get it.

Here's rebased version. I'll continue reviewing it now..

Another update. Fixes a bunch of bugs. Mostly introduced by me, but a couple were present in your v16:

* When allocating the entry->list buffer in a scan, it must be large enough for the max number of items that can fit on a compressed page, whether the current page is compressed or not. That's because the same buffer is reused on subsequent pages, which might be compressed.

* When splitting a leaf page during index creation, the patch missed the trick that's present in current master, to choose the split point so that the left page is packed as full as possible. I put that back; it makes newly built indexes somewhat smaller. (I wonder if we should leave some free space for future updates. But that's a separate patch; let's keep the current behavior in this patch.)
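
As an illustration only, the two fixes above can be sketched in a few lines of Python. This is not the GIN code from the patch: the function names, parameters, and item sizes are hypothetical, and the split sketch assumes the items come from a single overflowing page, so whatever does not go left fits on the right page.

```python
def scan_buffer_capacity(max_items_plain_page, max_items_compressed_page):
    """Capacity for a scan buffer that is allocated once and reused on every page.

    Both arguments are hypothetical per-page maximums, not real GIN constants.
    """
    # Size for the worst case over both page formats, not for the page that
    # happens to be current when the buffer is allocated.
    return max(max_items_plain_page, max_items_compressed_page)


def choose_split_point(item_sizes, page_capacity):
    """Return n such that items[:n] go to the left page and items[n:] to the right.

    The left page is packed as full as possible instead of splitting evenly.
    """
    used = 0
    split = 0
    for size in item_sizes:
        if used + size > page_capacity:
            break
        used += size
        split += 1
    # Assumption: the remainder must fit on the right page.
    if sum(item_sizes[split:]) > page_capacity:
        raise ValueError("items do not fit on two pages")
    return split
```

With five items of size 4 and a page capacity of 16, `choose_split_point` puts four items on the left page and one on the right, where an even split would leave both pages partly empty.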

I'll continue reviewing next week..

Good. Thanks for debugging and fixing the bugs.
Can I do anything for this patch now?

------
With best regards,
Alexander Korotkov. 
