Re: patch for geqo tweaks - Mailing list pgsql-hackers

From Nathan Wagner
Subject Re: patch for geqo tweaks
Date
Msg-id 20151106031425.GA8342@granicus.if.org
In response to Re: patch for geqo tweaks  (Tom Lane <tgl@sss.pgh.pa.us>)
Responses Re: patch for geqo tweaks  (Tom Lane <tgl@sss.pgh.pa.us>)
List pgsql-hackers
On Wed, Nov 04, 2015 at 12:51:52PM -0500, Tom Lane wrote:

> As for the second part, I had to look up Fisher-Yates ;-) but after
> having read Wikipedia's entry about it I think this is a good change.
> The code's shorter and more efficient, and it should mathematically
> provide an equally-unbiased initial shuffle.  It could do with a
> better comment, and I'd be inclined to handle the first element
> outside the loop rather than uselessly computing geqo_randint(0,0),
> but those are trivial changes.

I see you committed a modified version of my patch in commit
59464bd6f928ad0da30502cbe9b54baec9ca2c69.

You changed tour[0] to be hardcoded to 1, but it should be any of
the possible gene numbers from 0 to remainder.  If you want to pull the
geqo_randint(0,0) out of the loop, it would be the last element, not the
first (i.e., where remainder == 0).

We might be able to just skip the last swap, and the loop could be

    for (i = 0; i < num_gene - 1; i++) {

but I'd need to re-read the details of the Fisher-Yates algorithm to be
sure.  It may be that the last swap needs to happen for the shuffle to
be fully random.  In any case, tour[0] certainly shouldn't be hardcoded
to 1.
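
A minimal sketch of the in-place form of the shuffle may help here.
This is not the committed PostgreSQL code: shuffle_tour and
randint_inclusive are hypothetical names, and rand() is only an assumed
stand-in for geqo_randint.  In this form, position i is filled from the
values that have not been placed yet, so tour[0] can end up holding any
gene number, and the final iteration would only swap the last element
with itself, which is why the loop can stop at num_gene - 1.

```c
#include <assert.h>
#include <stdlib.h>

/*
 * Hypothetical stand-in for geqo_randint(): returns an integer in
 * [lower, upper].  The modulo introduces a slight bias, which is
 * acceptable for a sketch but not for production use.
 */
static int
randint_inclusive(int lower, int upper)
{
    return lower + rand() % (upper - lower + 1);
}

/*
 * Fill tour[0 .. num_gene-1] with a random permutation of 1 .. num_gene
 * using an in-place Fisher-Yates shuffle.
 */
static void
shuffle_tour(int *tour, int num_gene)
{
    int i;

    assert(num_gene >= 0);

    /* Start from the identity ordering 1, 2, ..., num_gene. */
    for (i = 0; i < num_gene; i++)
        tour[i] = i + 1;

    /*
     * Position i receives one of the values not yet placed, which sit in
     * tour[i .. num_gene-1].  The loop stops one short: the last step
     * would choose from a single element and swap it with itself.
     */
    for (i = 0; i < num_gene - 1; i++)
    {
        int j = randint_inclusive(i, num_gene - 1);
        int tmp = tour[i];

        tour[i] = tour[j];
        tour[j] = tmp;
    }
}
```

Calling shuffle_tour(tour, 8) on an int array of length 8 leaves each of
1..8 in the array exactly once, and over repeated calls tour[0] is not
pinned to 1.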

-- 
nw


