Thread: Small patch: Change calling convention for ShmemInitHash (and fix possible bug)

Small patch: Change calling convention for ShmemInitHash (and fix possible bug)

From
Aleksander Alekseev
Date:
Hello

I would like to continue discussion regarding changing calling
convention for ShmemInitHash procedure:

http://www.postgresql.org/message-id/CA+TgmoZm=Uowt8a_XaSfooGwufeeLJ861NTADiCEOpyFehV8Wg@mail.gmail.com

Currently this procedure has two arguments --- init_size and max_size.
But since shared hash tables have fixed size there is little sense to
pass two arguments. In fact currently ShmemInitHash is always called
with init_size == max_size with only one exception, InitLocks procedure
(see lock.c), which I believe is actually a bug.

Patch is attached.

What do you think?

--
Best regards,
Aleksander Alekseev
http://eax.me/

Attachment
On Thu, Mar 24, 2016 at 9:50 AM, Aleksander Alekseev
<a.alekseev@postgrespro.ru> wrote:
> I would like to continue discussion regarding changing calling
> convention for ShmemInitHash procedure:
>
> http://www.postgresql.org/message-id/CA+TgmoZm=Uowt8a_XaSfooGwufeeLJ861NTADiCEOpyFehV8Wg@mail.gmail.com
>
> Currently this procedure has two arguments --- init_size and max_size.
> But since shared hash tables have fixed size there is little sense to
> pass two arguments. In fact currently ShmemInitHash is always called
> with init_size == max_size with only one exception, InitLocks procedure
> (see lock.c), which I believe is actually a bug.

No, I think we left it that way on purpose.  I don't remember the
discussion exactly, but I don't think it's hurting anything.

> Patch is attached.
>
> What do you think?

I don't think this actually buys us anything.

-- 
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company



Re: Small patch: Change calling convention for ShmemInitHash (and fix possible bug)

From
Aleksander Alekseev
Date:
> No, I think we left it that way on purpose.  I don't remember the
> discussion exactly, but I don't think it's hurting anything.

This was a part of original dynahash optimization patch. Since that
patch was about performance improvement and this concrete change is
about refactoring, not performance, we agreed to discuss it later as a
separate patch.

> I don't think this actually buys us anything.

For sure it doesn't make anything worse. Current code is just
confusing. I spent quite some time trying to figure out what is a
reason for passing two arguments before I realized - there is none.

-- 
Best regards,
Aleksander Alekseev
http://eax.me/



On Fri, Mar 25, 2016 at 9:17 PM, Robert Haas <robertmhaas@gmail.com> wrote:
> On Thu, Mar 24, 2016 at 9:50 AM, Aleksander Alekseev
> <a.alekseev@postgrespro.ru> wrote:
>> Currently this procedure has two arguments --- init_size and max_size.
>> But since shared hash tables have fixed size there is little sense to
>> pass two arguments. In fact currently ShmemInitHash is always called
>> with init_size == max_size with only one exception, InitLocks procedure
>> (see lock.c), which I believe is actually a bug.
>
> No, I think we left it that way on purpose.  I don't remember the
> discussion exactly, but I don't think it's hurting anything.

My instinct is telling me that this is useful in this shape for
plugins, and that it has been designed on purpose for that.
-- 
Michael



Michael Paquier <michael.paquier@gmail.com> writes:
> On Fri, Mar 25, 2016 at 9:17 PM, Robert Haas <robertmhaas@gmail.com> wrote:
>> On Thu, Mar 24, 2016 at 9:50 AM, Aleksander Alekseev
>> <a.alekseev@postgrespro.ru> wrote:
>>> Currently this procedure has two arguments --- init_size and max_size.
>>> But since shared hash tables have fixed size there is little sense to
>>> pass two arguments. In fact currently ShmemInitHash is always called
>>> with init_size == max_size with only one exception, InitLocks procedure
>>> (see lock.c), which I believe is actually a bug.

>> No, I think we left it that way on purpose.  I don't remember the
>> discussion exactly, but I don't think it's hurting anything.

> My instinct is telling me that this is useful in this shape for
> plugins, and that it has been designed on purpose for that.

Yes, it is that way on purpose.  The comments for ShmemInitHash
explain why:
* init_size is the number of hashtable entries to preallocate.  For a table* whose maximum size is certain, this should
beequal to max_size; that* ensures that no run-time out-of-shared-memory failures can occur.
 

The sizes of the lock hashtables are *not* certain, and in particular
we're guessing as to the ratio of LOCK entries to PROCLOCK entries.
By setting max_size larger than init_size, we leave a pool of
initially-unallocated shared memory that can be doled out to either
hashtable on demand, thus adapting to whatever the actual load is.

(Of course, it'd be even better if such memory could be given back
when the peak demand passes.  But the lack of a full-on allocator
for shared memory is not a reason to dumb down ShmemInitHash.)

In short: the error in Aleksander's argument is the assumption that
shared hashtables have fixed size.  That's simply false.
        regards, tom lane



Re: Small patch: Change calling convention for ShmemInitHash (and fix possible bug)

From
Aleksander Alekseev
Date:
> In short: the error in Aleksander's argument is the assumption that
> shared hashtables have fixed size.  That's simply false.

Well this is a bit embarrassing but I have to admit that you are right.
Dynahash code is a bit non-trivial to say the least (let me guess -
there is no point of suggesting a patch that splits it into two or
three separate implementations for each use case, right? :) and I
misunderstood how it actually works. My apologies.

-- 
Best regards,
Aleksander Alekseev
http://eax.me/