Re: gistchoose vs. bloat - Mailing list pgsql-hackers

From Alexander Korotkov
Subject Re: gistchoose vs. bloat
Date
Msg-id CAPpHfds2DifG0wRc79R4ZJ8438q8n5pjbcQk238h+Qp_0PnGBw@mail.gmail.com
Whole thread Raw
In response to Re: gistchoose vs. bloat  (Jeff Davis <pgsql@j-davis.com>)
Responses Re: gistchoose vs. bloat  (Alvaro Herrera <alvherre@2ndquadrant.com>)
List pgsql-hackers
On Mon, Oct 1, 2012 at 5:15 AM, Jeff Davis <pgsql@j-davis.com> wrote:
On Tue, 2012-09-04 at 19:21 +0400, Alexander Korotkov wrote:

> New version of patch is attached. Parameter "randomization" was
> introduced. It controls whether to randomize choose. Choose algorithm
> was rewritten.
>
Review comments:

1. Comment above while loop in gistRelocateBuildBuffersOnSplit needs to
be updated.

Actually, I didn't realize what exact comment you expect. Check if added commend meets you expectations.
 

2. Typo in two places: "if randomization id required".

3. In gistRelocateBuildBuffersOnSplit, shouldn't that be:
     splitPageInfo = &relocationBuffersInfos[bufferIndex];
   not:
     splitPageInfo = &relocationBuffersInfos[i];

Fixed.
 
4. It looks like the randomization is happening while trying to compare
the penalties. I think it may be more readable to separate those two
steps; e.g.

  /* create a mapping whether randomization is on or not */
  for (i = FirstOffsetNumber; i <= maxoff; i = OffsetNumberNext(i))
      offsets[i - FirstOffsetNumber] = i;

  if (randomization)
      /* randomize offsets array */

  for (i = 0; i < maxoff; i++)
  {
     offset = offsets[i];
     ...
  }

That's just an idea; if you think it's more readable as-is (or if I am
misunderstanding) then let me know.

Actually, current implementation comes from idea of creating possible less overhead when randomization is off. I'll try to measure overhead in worst case. If it is low enough then you proposal looks reasonable to me.

------
With best regards,
Alexander Korotkov.
Attachment

pgsql-hackers by date:

Previous
From: Peter Geoghegan
Date:
Subject: Re: Hash id in pg_stat_statements
Next
From: Daniel Farina
Date:
Subject: Re: Hash id in pg_stat_statements