On Sat, Jul 31, 2010 at 12:32 PM, Tom Lane <tgl@sss.pgh.pa.us> wrote:
> Robert Haas <robertmhaas@gmail.com> writes:
>> On Sat, Jul 31, 2010 at 12:02 PM, Tom Lane <tgl@sss.pgh.pa.us> wrote:
>>> I'm tempted to suggest that making RBNode be a hidden struct containing
>>> a pointer to somebody else's datum is fundamentally the wrong way to
>>> go about things, because the extra void pointer is pure overhead,
>>> and we aren't ever going to be using these things in a context where
>>> memory usage isn't of concern. If we refactored the API so that RBNode
>>> was intended to be the first field of some larger struct, as is done in
>>> dynahash tables for instance, we could eliminate the void pointer and
>>> the palloc inefficiency.
>
>> Even if we do that, is it still going to be too much of a performance
>> regression overall?
>
> Looking back, EntryAccumulator was 56 bytes on 64-bit machines in 8.4
> (it should have been smaller, but a poor choice of field ordering left
> a lot of pad space). Right now it's 32 bytes, and if we stick an RBNode
> field in the front it'd be 64. So that'd be a 14% penalty compared to
> 8.4, as opposed to the present situation which is a 100% penalty (32+80
> bytes per entry). On 32-bit machines the numbers are 32 bytes (8.4)
> versus 20+40 (HEAD) versus 36 bytes (my proposal), so 12.5% penalty
> versus 87.5%. (All of these numbers should be discounted by whatever
> space you want to assume the pass-by-reference key datum takes.)
>
> So it'd definitely be a lot better than now. Maybe there'd be some
> remaining performance gap for non-pathological cases, but I think we
> would be willing to pay that in order to avoid bad behavior for the
> pathological cases. It's difficult to say for sure of course
> without going to the trouble of coding and testing it.
Well, it sounds like a reasonable thing to try, then. You going to
take a crack at it?
--
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise Postgres Company