Re: [PATCH] binary heap implementation - Mailing list pgsql-hackers
From | Robert Haas |
---|---|
Subject | Re: [PATCH] binary heap implementation |
Date | |
Msg-id | CA+TgmoaajxCvcuJBQBx0+Kvz4bq=cuKwxggLi-v+3+NPKgwzGQ@mail.gmail.com Whole thread Raw |
In response to | Re: [PATCH] binary heap implementation (Andres Freund <andres@2ndquadrant.com>) |
Responses |
Re: [PATCH] binary heap implementation
|
List | pgsql-hackers |
On Tue, Nov 20, 2012 at 2:29 PM, Andres Freund <andres@2ndquadrant.com> wrote: > On 2012-11-20 14:03:12 -0500, Robert Haas wrote: >> On Thu, Nov 15, 2012 at 8:56 PM, Abhijit Menon-Sen <ams@2ndquadrant.com> wrote: >> > [ new patch ] >> >> I went over this again today with a view to getting it committed, but >> discovered some compiler warnings that look like this: >> >> warning: cast to pointer from integer of different size >> >> The problem seems to be that the binary heap implementation wants the >> key and value to be a void *, but the way it's being used in >> nodeMergeAppend.c the value is actually an int. I don't think we want >> to commit a binary heap implementation which has an impedance mismatch >> of this type with its only client. The obvious solution seems to be >> to make the key and value each a Datum, and then clients can use >> WhateverGetDatum and DatumGetWhatever to pass whatever built-in data >> type they happen to have. I'm trying that approach now and will post >> an updated patch based on that approach if it seems to work OK and >> nobody suggests something better between now and then. > > Sounds like a good plan, one other alternative would have been declaring > the parameters a size_t (and thus explicitly always big enough to store > a pointer, but also valid to store inline data) instead of using a > void*. But given there is DatumGetPointer/PointerGetDatum et al, using > the postgres-y datatype seems fine to me. > > In the LCR case those really are pointers, so preserving that case is > important to me ;), but your proposal does that, so ... On further reflection, I'm wondering if it wouldn't be a smarter idea to just get rid of the binaryheap_node concept altogether. Instead, when you allocate a binary heap, you specify an "element size" as well as a capacity. Then, the binary heap can truly treat the elements as an opaque array of bytes rather than a struct of any particular type. Of course, the disadvantage of this approach is that it's likely to be slightly slower, because we'll need to do a multiplication by a variable rather than simple bit shift. But the extra flexibility might be worth it, because for example for merge append this is kind of inefficient from a storage perspective. We only really need N+1 pointers, but we've got storage for 2N. With the above change plus a user-specified third argument for the comparator function (passed as yet another argument to binaryheap_allocate) we could get around that while preserving the option for LCR or any other client code to do as it sees fit. Thoughts? -- Robert Haas EnterpriseDB: http://www.enterprisedb.com The Enterprise PostgreSQL Company
pgsql-hackers by date: