On Wed, 2005-12-07 at 22:46 -0500, Tom Lane wrote:
> Alvaro Herrera <alvherre@commandprompt.com> writes:
> > Is hashtable overhead all that large? Each table could be made
> > initially size-of-current-table/N entries. One problem is that
> > currently the memory freed from a hashtable is not put back into shmem
> > freespace, is it?
>
> Yeah; the problem is mainly that we'd have to allocate extra space to
> allow for unevenness of usage across the multiple hashtables. It's hard
> to judge how large the effect would be without testing, but I think that
> this problem would inhibit us from having dozens or hundreds of separate
> partitions.

The imbalance across partitions would be a major issue because of the
difficulty of selecting a well-distributed partitioning key. If you use
the LOCKTAG, then LockRelation requests on the most heavily hit tables
would go to the same few partitions continually. The frequency of
access per table usually drops off dramatically in rank order: look at
TPC-B (pgbench) and TPC-C. In my own experience you seldom have even as
many as 16 heavily hit tables. My guess is that partitioning would do
little more than reduce contention to ~50%, with quickly diminishing
returns for N > 4. Also, the more sharply defined your application
profile, the worse this effect will be.
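
To put a rough shape on that guess, here is a small back-of-envelope
sketch. It is not a measurement. It assumes (my assumption, not
something taken from a real workload) that access frequency falls off
as 1/rank across 16 tables, and that tables are dealt out to partitions
round-robin by rank, which spreads the hot tables apart in a way that a
real hash on the LOCKTAG does not guarantee. The function name and
parameters are purely illustrative.

```python
def busiest_share(num_tables, num_partitions, skew=1.0):
    """Fraction of all lock requests that land on the busiest partition.

    Assumption: the table of rank r receives requests in proportion to
    1 / r**skew (a Zipf-like drop-off). Tables are assigned to
    partitions round-robin by rank, an idealised stand-in for hashing.
    """
    weights = [1.0 / (rank ** skew) for rank in range(1, num_tables + 1)]
    total = sum(weights)

    # Accumulate the request weight that each partition would receive.
    load = [0.0] * num_partitions
    for index, weight in enumerate(weights):
        load[index % num_partitions] += weight

    return max(load) / total


if __name__ == "__main__":
    for n in (1, 2, 4, 8, 16):
        share = busiest_share(16, n)
        print(f"N={n:2d}  busiest partition share = {share:.2f}")
```

Under these assumptions the busiest partition can never carry less than
the hottest table's own share of the traffic, however large N gets, so
the gain from each extra partition shrinks as N grows.
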

Having said that, I think this *is* the best way forward *if* we
continue to accept the barrage of lock requests. So I've reopened the
debate on the earlier thread, "[HACKERS] Reducing relation locking
overhead", and am reviewing the thoughts and requirements there in the
hope of avoiding the need to alter the lock manager in this way.

pgbench is the right workload to expose this effect and to measure
worst-case contention, so at least any performance gains will be easy
to measure.

> A possible response is to try to improve dynahash.c to make its memory
> management more flexible, but I'd prefer not to get into that unless
> it becomes really necessary. A shared freespace pool would create a
> contention bottleneck of its own...

...but a less frequently accessed one.

Best Regards, Simon Riggs