Re: gistchoose vs. bloat - Mailing list pgsql-hackers

From Tom Lane
Subject Re: gistchoose vs. bloat
Date
Msg-id 25278.1359061656@sss.pgh.pa.us
Whole thread Raw
In response to Re: gistchoose vs. bloat  (Heikki Linnakangas <hlinnakangas@vmware.com>)
List pgsql-hackers
Heikki Linnakangas <hlinnakangas@vmware.com> writes:
> BTW, one thing that I wondered about this: How expensive is random()? 
> I'm assuming not very, but I don't really know. Alexander's patch called 
> random() for every tuple on the page, while I call it only once for each 
> equal-penalty tuple. If it's really cheap, I think my/Tom's patch could 
> be slightly simplified by always initializing keep_current_best with 
> random() at top of the function, and eliminating the -1 "haven't decided 
> yet" state.

I thought about that too, and concluded that random() is probably
expensive enough that we don't want to call it unnecessarily.

> OTOH, if random() is expensive, I note that we only need one 
> bit of randomness, so we could avoid calling random() so often if we 
> utilized all the bits from each random() call.

Meh.  That would hard-wire the decision that the probability of keeping
a best tuple is exactly 0.5.  I'd rather keep the flexibility to tune it
later.  The way your patch is set up, it seems unlikely that it will
call random() very many times per page, so I'm not that concerned about
minimizing the number of calls further.  (Also, in the probably-common
case that there are no exactly equally good alternatives, this would
actually be a net loss since it would add one unnecessary call.)

So basically, Alexander's method requires more random() calls and fewer
penalty() calls than yours, at least in typical cases.  It's hard to say
which is faster without benchmarking, and it would matter which opclass
you were testing on.
        regards, tom lane



pgsql-hackers by date:

Previous
From: Andrew Dunstan
Date:
Subject: Re: BUG #6510: A simple prompt is displayed using wrong charset
Next
From: Tom Lane
Date:
Subject: Re: [BUGS] BUG #6572: The example of SPI_execute is bogus