On Thu, 2006-07-20 at 15:41 -0400, Tom Lane wrote:
> Usage of a partitioned hash table would then be like
>
> compute hashtable lookup key;
> entryhashcode = calc_hash(lookup key);
> partitionnumber = entryhashcode % NumPartitions;
> LWLockAcquire(PartitionLock[partitionnumber], ...);
> manipulate hashtable;
> LWLockRelease(PartitionLock[partitionnumber]);
>
> We could do this without changing the API of hash_search, but then we'd
> be computing the entry hashcode twice, so I'm inclined to provide an
> additional entry point that takes a precalculated hashcode.
That should be an additional win anyway, since hash_any() uses about 1%
CPU on tests I've seen - so we will hold locks slightly shorter
duration.
> Potential downsides of applying this idea to the buffer mapping table:
>
> 1. Reassigning a buffer to a new page will (usually) require two cycles
> of LWLockAcquire/Release for the two different partitions involved.
> Since this path also requires at least a read() kernel call (maybe a
> write() too), I don't think there'll be any meaningful slowdown.
> 3. Taking the freelist spinlock is new computation that wasn't there
> before. But, again, it's only needed in code paths that will also be
> doing a kernel call.
...So the additional overhead sounds acceptable, given we will save
somewhat on the hash_any()
> If we do this we should probably also handle the lmgr lock tables the
> same way (partially reverting my partition-the-LockMgrLock patch of a
> couple months ago). However, downside #3 might be a stronger objection
> for lmgr, since it can create or destroy lock objects without necessarily
> doing any I/O.
We should be in a position to test this soon.
-- Simon Riggs EnterpriseDB http://www.enterprisedb.com