Hi,
> First of all, why not merge both patches into one? They aren't too
> big anyway.
Agreed.
> I think comments should be changed, to be more informative here.
> Add a comment here too.
> Maybe you should explain this magic number 7 in the comment above?
Done.
> Then, this thread became too tangled. I think it's worth to write a
> new message with the patch, the test script, some results and brief
> overview of how does it really works. It will make following review
> much easier.
Sure.
HASHHDR represents a hash table, which can be either regular or
partitioned. A partitioned table is stored in shared memory and accessed
by multiple processes simultaneously. To prevent data corruption, the
table is split into partitions, and each process has to acquire the lock
for the corresponding partition before accessing data in it. The
partition number is determined by the lower bits of the key's hash
value. The trickiest part is that dynahash knows nothing about these
locks; all locking is done on the calling side.
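
To illustrate the calling-side protocol, here is a minimal standalone
sketch. It is not dynahash code: NUM_PARTITIONS, partition_for_hash(),
touch_partition() and the C11 atomic_flag locks are all hypothetical
stand-ins, chosen only to show that the caller picks the partition from
the low bits of the hash and takes the lock itself.

```c
#include <assert.h>
#include <stdatomic.h>
#include <stdint.h>

#define NUM_PARTITIONS 16       /* hypothetical value, must be a power of two */

static_assert((NUM_PARTITIONS & (NUM_PARTITIONS - 1)) == 0,
              "NUM_PARTITIONS must be a power of two");

/* Locks owned by the caller; the hash table code never sees them. */
static atomic_flag partition_locks[NUM_PARTITIONS];
/* Toy per-partition data standing in for the table contents. */
static int partition_hits[NUM_PARTITIONS];

/* Put every lock into the unlocked state. */
void
init_partition_locks(void)
{
    for (int i = 0; i < NUM_PARTITIONS; i++)
        atomic_flag_clear(&partition_locks[i]);
}

/* The partition number is taken from the low-order bits of the hash. */
uint32_t
partition_for_hash(uint32_t hashvalue)
{
    return hashvalue & (NUM_PARTITIONS - 1);
}

/* Caller-side protocol: pick the partition, lock, access data, unlock. */
int
touch_partition(uint32_t hashvalue)
{
    uint32_t    p = partition_for_hash(hashvalue);
    int         result;

    while (atomic_flag_test_and_set_explicit(&partition_locks[p],
                                             memory_order_acquire))
        ;                       /* spin until the lock is free */
    result = ++partition_hits[p];
    atomic_flag_clear_explicit(&partition_locks[p], memory_order_release);
    return result;
}
```

Two keys whose hashes share the same low bits end up behind the same
lock, while keys in different partitions do not block each other.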
Since shared memory is pre-allocated and can't grow, we use freeList to
allocate memory for hash table elements. We also use the nentries field
to count the current number of elements in the table. Since the table is
used by multiple processes, these fields are protected by a spinlock,
which, as it turned out, can cause lock contention and become a
bottleneck.
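
The contention point can be sketched as follows. This is a simplified
model, not the actual HASHHDR definition: the struct layout, the function
names and the atomic_flag spinlock are assumptions made for illustration.
What it shows is that every allocation and release, no matter which
partition the key belongs to, passes through the same lock.

```c
#include <assert.h>
#include <stdatomic.h>
#include <stddef.h>

typedef struct Element
{
    struct Element *next;       /* link in the free list */
} Element;

typedef struct SharedHeader
{
    atomic_flag mutex;          /* single spinlock for the whole table */
    long        nentries;       /* number of elements currently in use */
    Element    *freeList;       /* pre-allocated, unused elements */
} SharedHeader;

/* Chain a fixed pool of n elements onto the free list; it cannot grow. */
void
init_header(SharedHeader *h, Element *pool, int n)
{
    assert(n >= 0);
    atomic_flag_clear(&h->mutex);
    h->nentries = 0;
    h->freeList = NULL;
    for (int i = 0; i < n; i++)
    {
        pool[i].next = h->freeList;
        h->freeList = &pool[i];
    }
}

/* Pop an element; returns NULL once the pre-allocated pool is exhausted. */
Element *
alloc_element(SharedHeader *h)
{
    Element    *e;

    while (atomic_flag_test_and_set_explicit(&h->mutex, memory_order_acquire))
        ;                       /* all processes spin on this one lock */
    e = h->freeList;
    if (e != NULL)
    {
        h->freeList = e->next;
        h->nentries++;
    }
    atomic_flag_clear_explicit(&h->mutex, memory_order_release);
    return e;
}

/* Push an element back onto the free list. */
void
free_element(SharedHeader *h, Element *e)
{
    while (atomic_flag_test_and_set_explicit(&h->mutex, memory_order_acquire))
        ;
    e->next = h->freeList;
    h->freeList = e;
    h->nentries--;
    atomic_flag_clear_explicit(&h->mutex, memory_order_release);
}
```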
After trying a few approaches I discovered that using one spinlock per
partition works quite well. Here are the latest benchmark results:
http://www.postgresql.org/message-id/20151229184851.1bb7d1bd@fujitsu
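
A rough sketch of the "one spinlock per partition" idea, under the same
caveats: FreeListSlot, NUM_FREELIST_PARTITIONS and the function names are
hypothetical, and the round-robin distribution of the pool is just one
possible choice, not necessarily what the patch does. Each partition gets
its own lock, counter and free list, so processes working on different
partitions no longer spin on a shared lock.

```c
#include <assert.h>
#include <stdatomic.h>
#include <stddef.h>
#include <stdint.h>

#define NUM_FREELIST_PARTITIONS 4   /* hypothetical, must be a power of two */

static_assert((NUM_FREELIST_PARTITIONS & (NUM_FREELIST_PARTITIONS - 1)) == 0,
              "NUM_FREELIST_PARTITIONS must be a power of two");

typedef struct PartElement
{
    struct PartElement *next;
} PartElement;

typedef struct FreeListSlot
{
    atomic_flag  mutex;         /* protects only this slot */
    long         nentries;      /* elements in use in this partition */
    PartElement *freeList;      /* unused elements of this partition */
} FreeListSlot;

typedef struct PartitionedHeader
{
    FreeListSlot slot[NUM_FREELIST_PARTITIONS];
} PartitionedHeader;

/* Distribute the pre-allocated pool over the slots round-robin. */
void
init_partitioned(PartitionedHeader *h, PartElement *pool, int n)
{
    for (int i = 0; i < NUM_FREELIST_PARTITIONS; i++)
    {
        atomic_flag_clear(&h->slot[i].mutex);
        h->slot[i].nentries = 0;
        h->slot[i].freeList = NULL;
    }
    for (int i = 0; i < n; i++)
    {
        FreeListSlot *s = &h->slot[i % NUM_FREELIST_PARTITIONS];

        pool[i].next = s->freeList;
        s->freeList = &pool[i];
    }
}

/* Allocate from the slot selected by the low bits of the hash value. */
PartElement *
alloc_element_part(PartitionedHeader *h, uint32_t hashvalue)
{
    FreeListSlot *s = &h->slot[hashvalue & (NUM_FREELIST_PARTITIONS - 1)];
    PartElement *e;

    while (atomic_flag_test_and_set_explicit(&s->mutex, memory_order_acquire))
        ;                       /* contends only within this partition */
    e = s->freeList;
    if (e != NULL)
    {
        s->freeList = e->next;
        s->nentries++;
    }
    atomic_flag_clear_explicit(&s->mutex, memory_order_release);
    return e;                   /* NULL if this slot's list is empty */
}

/* The table-wide count is the sum of the per-partition counters. */
long
total_entries(PartitionedHeader *h)
{
    long        sum = 0;

    for (int i = 0; i < NUM_FREELIST_PARTITIONS; i++)
    {
        FreeListSlot *s = &h->slot[i];

        while (atomic_flag_test_and_set_explicit(&s->mutex,
                                                 memory_order_acquire))
            ;
        sum += s->nentries;
        atomic_flag_clear_explicit(&s->mutex, memory_order_release);
    }
    return sum;
}
```

In this simplified form a slot whose free list is empty returns NULL even
if other slots still have free elements, so a real implementation needs
some policy for that case.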
Note that the "no locks" solution can't be used because, in some corner
cases, it doesn't guarantee that all available memory will be used.
You can find a few more details and a test script in the first message
of this thread. If you have any other questions regarding this patch,
please don't hesitate to ask.
Best regards,
Aleksander